Individuality and convergence of the infant gut microbiota during the first year of life

The human gut microbiota plays a vital role in health and disease, and microbial colonization is a key process in infant development. Here, we analyze 2684 fecal specimens from 12 infants during their first year of life, providing detailed insights into the human gut colonization process. Maturation of the gut microbial community shows strong temporal structure and specific developmental stages. At 2–4 months of age, there is a period of accelerated convergence concurrent with a bloom of Bifidobacterium, a genus associated with metabolism of oligosaccharides found in breast milk. The end of this period coincides with the introduction of solid food, a reduction in the relative abundance of Bifidobacterium, and an increase in several groups of Firmicutes. Our findings highlight the dynamic nature and individuality of the gut colonization process, and the need for high-frequency sampling over an extended period when designing and interpreting infant microbiome studies.

Supplementary Figure 8 | Relative abundances of the OTUs in ID5.
Relative abundances of the 40 most abundant genera in ID5. Coloured dots under the day number correspond to the colour code displayed in Fig. 1. The red arrow indicates when the introduction of solid foods began. ID5 had periods of travel between days 79-83 and days 87-167 (sampled in 95% alcohol). The colour key is consistent in Figures S4-S15 and in Figure 4 in the main text.

Supplementary Figure 9 | Relative abundances of the OTUs in ID6.
Relative abundances of the 40 most abundant genera in ID6. Coloured dots under the day number correspond to the colour code displayed in Fig. 1. The red arrow indicates when the introduction of solid foods began. The colour key is consistent in Figures S4-S15 and in Figure 4 in the main text.

For each panel there is a highly significant relationship between the primary axis of variation and time since birth (p<<0.001 for all tests, mean R²=0.7 (range 0.38-0.85), linear regression). Dots are coloured according to the bar beneath the panels. Early and late refer to the order in which the samples were collected (see Table 1 for days of first and last sampling).

Supplementary Figure 17 | nMDS of all infants based on Bray-Curtis distances.
Non-metric multidimensional scaling plot of all twelve infants using a single model. a. The infants showed significant clustering by individual (p<0.001, explained variance R²=0.29; PERMANOVA). The colour code for each of the individual infants is indicated in the box above the panel. b. The infants showed a significant overall time trend when regressing the nMDS coordinates along the first dimension on time since birth (p<0.001, R²=0.17, linear regression). For each infant, the colour code refers to the order in which the samples were collected (see Table 1 for days of first and last sampling). The models were based on Bray-Curtis distances.
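
As an illustration of the clustering test in panel a, the following is a minimal sketch of a one-factor PERMANOVA on Bray-Curtis distances, written with the Python standard library only. It is not the authors' pipeline. The function names, the toy abundance profiles, the labels "A" and "B", the number of permutations and the random seed are all assumptions made for the example. The explained variance R² is the among-group sum of squares divided by the total sum of squares, and the p-value is obtained by permuting the individual labels.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = bray_curtis(samples[i], samples[j])
    return d


def sums_of_squares(d, labels):
    """Total and within-group sums of squares from squared distances."""
    n = len(labels)
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    return ss_total, ss_within


def permanova(d, labels, n_perm=999, seed=0):
    """Return (R2, pseudo-F, permutation p-value) for one grouping factor."""
    n, a = len(labels), len(set(labels))
    ss_total, ss_within = sums_of_squares(d, labels)
    ss_among = ss_total - ss_within
    r2 = ss_among / ss_total
    f_obs = (ss_among / (a - 1)) / (ss_within / (n - a))
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the individual labels
        _, ssw = sums_of_squares(d, shuffled)
        f_perm = ((ss_total - ssw) / (a - 1)) / (ssw / (n - a))
        if f_perm >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): four samples from each of two infants, three taxa.
toy_samples = [
    [0.70, 0.20, 0.10], [0.65, 0.25, 0.10], [0.60, 0.30, 0.10], [0.72, 0.18, 0.10],
    [0.10, 0.20, 0.70], [0.10, 0.30, 0.60], [0.15, 0.25, 0.60], [0.12, 0.18, 0.70],
]
toy_labels = ["A"] * 4 + ["B"] * 4
toy_r2, toy_f, toy_p = permanova(distance_matrix(toy_samples), toy_labels)
```

With only eight samples the smallest attainable p-value is limited by the number of distinct label permutations, so the toy example demonstrates the calculation rather than the significance levels reported above.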

Supplementary Figure 18 | nMDS of all infants based on weighted UniFrac distances.
Non-metric multidimensional scaling plot of all twelve infants using a single model. a. The infants showed significant clustering by individual (p<0.001, explained variance R²=0.27; PERMANOVA). The colour code for each of the individual infants is indicated in the box above the panel. b. The infants showed a significant overall time trend when regressing the nMDS coordinates along the first dimension on time since birth (p<0.001, R²=0.26, linear regression). The bar beneath the panel indicates the time since birth when a sample was collected. The time trend was even stronger when looking at the second dimension (p<0.001, R²=0.3). For each infant, the colour code refers to the order in which the samples were collected (see Table 1 for days of first and last sampling).
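
The time trend in panel b comes from regressing an ordination coordinate on time since birth. The sketch below shows how the slope, intercept and R² of such a simple linear regression can be computed by ordinary least squares. The function name and the example values are hypothetical and are not taken from the study data.

```python
def linear_regression_r2(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, R2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical example: days since birth and first-axis nMDS coordinates.
days_since_birth = [5, 40, 90, 150, 220, 300, 360]
nmds_axis1 = [-0.9, -0.6, -0.2, 0.1, 0.3, 0.6, 0.7]
slope, intercept, r2 = linear_regression_r2(days_since_birth, nmds_axis1)
```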

Supplementary Figure 19 | Mean pairwise contemporaneous weighted UniFrac distances between 11 infants*.
All data series were interpolated to equal length (365 pseudo-days) for this analysis. Dots show the observed values while the black line shows a fitted generalized additive model (p<<0.001, R²=0.53). The shaded band represents 95% confidence limits. Dotted vertical lines indicate the window of convergence (days 60-130). * ID8 was excluded from this analysis since sampling did not commence until day 54.
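
The interpolation and averaging steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the caption does not specify the interpolation method, so linear interpolation between each infant's first and last sampling day is assumed, and Bray-Curtis dissimilarity is swapped in for weighted UniFrac because the latter requires a phylogenetic tree. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def interpolate_series(days, profiles, n_points=365):
    """Linearly interpolate abundance profiles onto n_points equally spaced
    pseudo-days spanning the first to the last sampling day.
    `days` must be strictly increasing and contain at least two entries."""
    first, last = days[0], days[-1]
    step = (last - first) / (n_points - 1)
    out = []
    k = 0  # index of the left-hand sample of the current interval
    for p in range(n_points):
        t = first + p * step
        while k < len(days) - 2 and days[k + 1] < t:
            k += 1
        w = (t - days[k]) / (days[k + 1] - days[k])
        w = min(max(w, 0.0), 1.0)  # guard against floating-point overshoot
        out.append([(1 - w) * a + w * b
                    for a, b in zip(profiles[k], profiles[k + 1])])
    return out


def mean_pairwise_distance(series_by_infant):
    """Mean Bray-Curtis distance over all infant pairs at each pseudo-day."""
    infants = sorted(series_by_infant)
    n_points = len(series_by_infant[infants[0]])
    means = []
    for p in range(n_points):
        dists = [bray_curtis(series_by_infant[a][p], series_by_infant[b][p])
                 for a, b in combinations(infants, 2)]
        means.append(sum(dists) / len(dists))
    return means


# Toy data (hypothetical): two infants, two taxa, different sampling days.
toy = {
    "X": interpolate_series([0, 10], [[1.0, 0.0], [0.5, 0.5]], n_points=5),
    "Y": interpolate_series([5, 20], [[0.0, 1.0], [0.5, 0.5]], n_points=5),
}
toy_means = mean_pairwise_distance(toy)
```

Because every series is rescaled to the same number of pseudo-days, the distance at index p compares samples at the same relative position in each infant's sampling period rather than on the same calendar day.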

Supplementary Figure 20 | Mean contemporaneous Bray-Curtis distances.
In each panel the dots represent the mean distance between the individual indicated above and 10 other individuals over time*, based on data series interpolated to 365 points. The fitted lines are generalized additive models using 9 degrees of freedom for estimating the smooth terms. Shaded areas represent 95% confidence limits. *ID8 was excluded from this analysis since sampling did not commence until day 54.

Supplementary Figure 21 | Mean contemporaneous weighted UniFrac distances.
In each panel the dots represent the mean distance between the individual indicated above and 10 other individuals over time*, based on data series interpolated to 365 points. The fitted lines are generalized additive models using 9 degrees of freedom for estimating the smooth terms. Shaded areas represent 95% confidence limits. *ID8 was excluded from this analysis since sampling did not commence until day 54.

*The correlation coefficients and P-values come from testing the relationship between the indicated OTUs and the mean contemporaneous Bray-Curtis distances between the infants during the convergence period (see Figure 2).