Variation in Human Milk Composition Is Related to Differences in Milk and Infant Fecal Microbial Communities

Previously published data from our group and others demonstrate that human milk oligosaccharides (HMOs), as well as milk and infant fecal microbial profiles, vary by geography. However, little is known about the geographical variation of other milk-borne factors, such as lactose and protein, or about the associations among these factors and microbial community structures in milk and infant feces. Here, we characterized and contrasted concentrations of milk-borne lactose, protein, and HMOs, and examined their associations with milk and infant fecal microbiomes in samples collected at 11 geographically diverse sites. Although geographical site was strongly associated with milk and infant fecal microbiomes, both sample types assorted into a smaller number of community state types based on shared microbial profiles. Similar to HMOs, concentrations of lactose and protein also varied by geography. Concentrations of HMOs, lactose, and protein were associated with differences in the microbial community structures of milk and infant feces and in the abundance of specific taxa. Taken together, these data suggest that the composition of human milk, even when produced by relatively healthy women, differs based on geographical boundaries and that concentrations of HMOs, lactose, and protein in milk are related to variation in milk and infant fecal microbial communities.


Introduction
Human milk is a complex biological fluid that provides all the nutritional requirements that support infant growth and development. This is, in part, attributed to the fact that milk is a rich source of lactose, lipids, human milk oligosaccharides (HMOs), protein, and numerous other micronutrients [1]. Additionally, both culture-dependent and culture-independent methods have demonstrated the presence of microbiota in milk [2][3][4][5][6], with emerging data suggesting that these microbiota may play a role in seeding or supplementing the nascent infant gastrointestinal (GI) microbiome [7].
In addition to supporting infant development, milk constituents (including lactose, protein, and HMOs) both directly and indirectly modulate host-associated microbial communities. As the principal carbohydrate source in milk, lactose is generally digested in the small intestine via lactase. Undigested lactose that reaches the large intestine is readily metabolized, and occasionally preferred over glucose, by resident microbes, including Lactobacillus and Bifidobacterium, into short-chain fatty acids and other volatile compounds [8][9][10][11]. Similarly, while most proteins are completely digested in the small intestine [1], partially digested proteins reaching the large intestine may be utilized by microbes [12], although this is understudied in infants. In addition, some proteins (e.g., lactoferrin and secretory immunoglobulin A) function as host defense agents, modulating bacterial composition in the infant's GI tract by repressing growth of pathogens.
In contrast to lactose and protein, HMOs largely pass through the GI tract intact, as infants lack the enzymes to digest them [13,14]. Upon reaching the large intestine, HMOs function principally as substrates for host-associated microbiota. However, HMOs not only promote the growth of microbes that are generally considered beneficial (e.g., Bifidobacterium [15]) but also function as antimicrobials that protect against pathogens, all of which contributes to shaping the infant GI microbiome and, in turn, infant health [16][17][18][19][20][21].
Results from the INSPIRE study (a large geographically and socioculturally diverse cohort) have previously demonstrated that the profiles of milk-borne immune factors [22], HMOs [23], and maternal and infant microbiomes [24,25] vary substantially across geographical/sociocultural boundaries. As HMOs and other components of milk, including lactose and protein, are able to shape microbial abundance, we hypothesized that variation in these milk factors could be related to differences in the structure of milk and infant fecal microbial communities, as well as to the abundance of specific bacterial taxa. To test this hypothesis, we investigated relationships between and among microbial communities, and the concentrations of milk lactose, protein, and HMOs in milk and infant fecal samples collected from maternal-infant dyads in the INSPIRE study.

Study Design
The participants in this study were recruited as part of the INSPIRE study, which has been described in detail [22][23][24][25]. All study procedures were approved by the Washington State University Institutional Review Board (#13264) and at each study location. Sample collection took place between May 2014 and April 2016 and was carried out as a cross-sectional, epidemiological, multi-cohort study. Briefly, samples were collected from 11 populations, including two from Ethiopia (rural population, ETR; urban population, ETU), Kenya (KE), Ghana (GN), two from The Gambia (rural population, GBR; urban population, GBU), Peru (PE), Spain (SP), Sweden (SW), and two from the United States of America (California, USC; Washington/Idaho, USW). To be eligible to participate, women had to be nursing or pumping ≥5 times a day and be ≥18 years of age. Exclusion criteria included: (1) current indication of a breast infection or breast pain that the woman did not consider normal for lactation; (2) illness (i.e., self-reported fever, vomiting, severe cough, or diarrhea) in the last 7 days; and/or (3) antibiotic use in the previous 30 days. For inclusion, infants had to be described as healthy by their mothers, have no signs of acute illness (i.e., fever, vomiting, severe cough, diarrhea, or rapid breathing) in the previous 7 days, and have not received antibiotics in the previous 30 days.

Milk and Infant Fecal Sampling
A total of 412 milk and 406 infant fecal samples were collected as part of the INSPIRE cohort. The sampling protocols for milk and feces have been described previously [24]. Briefly, milk was collected using gloved hands by participants or research personnel, after twice cleaning the breast with prepackaged castile soap towelettes (PDI, Inc, Woodcliff Lake, NJ, USA). Milk samples were collected via electric pump (Symphony, Medela Inc., Switzerland; PE, SW, USC, USW) or hand expression (ETR, ETU, KE, GN, GBR, GBU, SP) into sterile containers. Collected milk was immediately frozen (−20 °C), except in ETR, where it was preserved in a 1:1 ratio with Milk Preservation Solution (Norgen Biotek, Thorold, ON, Canada) and frozen within 6 days. Approximately 1 g of feces was collected from diapers (Parent's Choice; Walmart, Bentonville, AR, USA) or directly from the infant's skin using a sterile, single-use scoop (Sarstedt AG & Co., Nümbrecht, Germany). Fecal samples were then placed into the accompanying sterile polypropylene container and frozen at −20 °C within 30 min of collection. For fecal samples collected in ETR, RNAlater (Ambion, Austin, TX, USA) was added to each fecal sample in a ~1:4 ratio (feces:preservative), and the samples were frozen within 6 days. Milk and fecal samples were shipped on dry ice to the University of Idaho, where they were immediately frozen at −20 °C.

DNA Extraction and 16S rRNA Gene Amplification/Sequencing
DNA was extracted from milk and infant fecal samples as previously described [24]. Extracted DNA from milk and infant fecal samples was subjected to a dual-barcoded, two-step, 30-cycle polymerase chain reaction (PCR) to amplify the V1-V3 hypervariable region of the 16S rRNA gene. In the first step, a 7-fold degenerate forward primer targeting nucleotide position 27 [26] and a reverse primer targeting nucleotide position 534 (positions numbered according to the Escherichia coli 16S rRNA gene) were used as described previously [27]. Amplicons were pooled to contain 50 ng of DNA from each sample. Amplicon pools were size selected using AMPure beads (Beckman Coulter, Indianapolis, IN, USA), quality checked on a Fragment Analyzer (Advanced Analytical Technologies, Inc., Ankeny, IA, USA), and quantified using the KAPA Biosciences Illumina library quantification kit and Applied Biosystems StepOne Plus real-time PCR system. Amplicons passing quality control for milk and feces were sequenced by sample type on separate MiSeq (Illumina, San Diego, CA, USA) sequencing runs (v3 paired-end, 300-bp protocol for 600 cycles) at the University of Idaho Institute for Bioinformatics and Evolutionary Studies Genomics Core.

16S rRNA Gene Amplicon Data Processing
Samples were processed as previously described [24], with the following modifications. The DADA2-SILVA-derived taxonomy was edited to replace "NA" classifications with the next highest classification (e.g., if an amplicon sequence variant, ASV, was unclassified at the genus level but was classified at the family level as "Prevotellaceae", it was given a genus-level classification of "Family_Prevotellaceae"). The ASV table was then filtered to remove ASVs assigned to mitochondria and chloroplasts. Furthermore, despite the use of aseptic technique [24], contaminating sequences may have inadvertently arisen in milk samples through sample collection, preparation, and reagent contamination [28,29]. Prevalence-based filtering to identify and remove these potentially confounding sequences was therefore performed using the R package decontam (v. 0.99.1) [30]. DNA extraction blanks (n = 23), generated and treated in parallel with milk samples, were used as negative controls in the filtering. As ETR milk samples were collected and stored using a preservative, these samples (n = 40) and their respective negative controls (n = 5) were processed through decontam separately from the remaining cohort samples (n = 390) and their respective negative controls (n = 18). ASV tables were used as the input for the isNotContaminant function using the default parameters (method = "prevalence", threshold = 0.5). After removing negative controls and ASVs lacking statistical support, ASV tables were merged in phyloseq, and samples with fewer than 1000 reads (and their respective paired milk or infant fecal sample) were removed. An overview of the relative abundances of the most abundant genera before and after decontam filtering is presented in Figure S1, and an R Markdown file containing code for decontam is included as a supplemental file.
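The taxonomy back-filling and read-depth filtering steps described above can be illustrated with a minimal Python sketch. It is not the R code used in the study (which relied on DADA2, decontam, and phyloseq); the rank list, function names, example taxonomy, and dictionary layout are assumptions made purely for illustration.

```python
# Illustrative sketch only: rank names, function names, and data layout are assumptions.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]


def fill_unclassified(taxonomy):
    """Replace missing (None) ranks with '<Rank>_<Name>' taken from the
    nearest higher rank that was classified."""
    filled = {}
    last_rank, last_name = None, None
    for rank in RANKS:
        name = taxonomy.get(rank)
        if name is None and last_rank is not None:
            # Carry the closest classified rank downward, keeping its rank label.
            filled[rank] = f"{last_rank}_{last_name}"
        else:
            filled[rank] = name
            if name is not None:
                last_rank, last_name = rank, name
    return filled


def dyads_passing_depth(milk_reads, fecal_reads, min_reads=1000):
    """Keep only dyads whose milk AND fecal samples both reach min_reads,
    so that dropping one sample also drops its paired sample."""
    shared = set(milk_reads) & set(fecal_reads)
    return {d for d in shared
            if milk_reads[d] >= min_reads and fecal_reads[d] >= min_reads}


# Hypothetical ASV that is unclassified at the genus level.
asv = {"Kingdom": "Bacteria", "Phylum": "Bacteroidota", "Class": "Bacteroidia",
       "Order": "Bacteroidales", "Family": "Prevotellaceae", "Genus": None}
print(fill_unclassified(asv)["Genus"])  # Family_Prevotellaceae
```

Filtering on the dyad rather than on the individual sample mirrors the paired removal described in the text, so every retained milk sample still has a matching infant fecal sample.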

Microbial Community State Type Analysis
The R package DirichletMultinomial (v. 1.20.0) [31] was used to describe variability in microbiome data and cluster samples into community state types (i.e., "microbial lactotypes" for milk samples and "enterotypes" for infant fecal samples) based on the genus-level abundance tables (filtered to remove genera present across all samples with a relative abundance of less than 0.01%). Model fit was determined based on the minimum Laplace goodness of fit.
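The two bookkeeping steps around the mixture model, filtering rare genera and choosing the number of clusters, can be sketched as follows. This is a hedged illustration, not the study's code: it assumes the 0.01% threshold applies to each genus's pooled relative abundance across all samples, the function names are hypothetical, and the Laplace values in the usage example are invented. Fitting the Dirichlet multinomial mixtures themselves is not shown.

```python
def filter_rare_genera(counts, min_rel_abundance=0.0001):
    """counts maps genus -> list of read counts (one per sample).
    Drop genera whose pooled relative abundance across all samples is
    below the threshold (0.01% expressed as a fraction is 0.0001)."""
    total = sum(sum(v) for v in counts.values())
    if total == 0:
        return {}
    return {g: v for g, v in counts.items()
            if sum(v) / total >= min_rel_abundance}


def best_cluster_count(laplace_by_k):
    """Return the number of mixture components with the smallest Laplace value."""
    return min(laplace_by_k, key=laplace_by_k.get)


# Hypothetical Laplace values for models fitted with 1 to 5 components.
example_laplace = {1: 5200.0, 2: 5050.0, 3: 5010.0, 4: 4990.0, 5: 5025.0}
print(best_cluster_count(example_laplace))  # 4
```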

Microbial Alpha and Beta Diversity
For alpha and beta diversity analyses, samples were rarefied to 95% of the minimum sample read count. The rarefied data were used to generate the Bray-Curtis dissimilarity matrix, or converted to binary counts to generate the binary Jaccard distance matrix, and used for multidimensional scaling (MDS) and non-metric multidimensional scaling (NMDS). For MDS, the vegan adonis function was used to perform permutational multivariate analysis of variance (PERMANOVA) with 999 permutations to test for differences between groups. The vegan envfit function was used to fit selected maternal and environmental factors (e.g., maternal body mass index (BMI), mode of delivery, and HMO concentrations) onto the Bray-Curtis NMDS ordination data and to determine their goodness of fit and p-values using 9999 permutations.
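The rarefaction and distance calculations described above can be illustrated with a small, dependency-free Python sketch. The study used R (vegan and phyloseq); the function names, the subsampling-without-replacement scheme, and the fixed random seed below are assumptions for illustration.

```python
import random


def rarefaction_depth(samples, fraction=0.95):
    """Depth equal to a fraction (here 95%) of the smallest sample's read count."""
    return int(fraction * min(sum(s) for s in samples))


def rarefy(counts, depth, rng):
    """Subsample a vector of per-taxon counts, without replacement, to `depth` reads."""
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    out = [0] * len(counts)
    for taxon in rng.sample(pool, depth):
        out[taxon] += 1
    return out


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0


def binary_jaccard(a, b):
    """Jaccard distance computed on presence/absence of each taxon."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    return 1 - len(present_a & present_b) / len(union) if union else 0.0


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [[50, 30, 20], [100, 100, 0]]
depth = rarefaction_depth(samples)
rarefied = [rarefy(s, depth, rng) for s in samples]
print(depth, bray_curtis(*rarefied), binary_jaccard(*rarefied))
```

Bray-Curtis is sensitive to differences in abundance, whereas the binary Jaccard distance reflects only shared membership, which is why the two metrics are reported side by side.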

Cohort Demographics
Data for 357 maternal-infant dyads were available for analyses after microbiome sequence processing and quality control. On average, maternal age was 27.4 ± 6.1 years, with milk and infant fecal samples collected an average of 64.6 ± 21.9 days postpartum. Overall, the majority (86%) of births were via vaginal delivery, and the frequency of exclusive breastfeeding at the time of sample collection was 60%. Consistent with prior reports of the INSPIRE cohort [22][23][24][25], there were myriad differences in demographics across populations. Additional selected demographics for these dyads are detailed in Table S1.
Community state type (CST, communities of similar microbial composition and abundance) analysis has been used to explore variation of the microbial communities of the feces (i.e., enterotypes) [39][40][41][42], vagina [43], and milk [3]. To identify and examine CSTs, we applied Dirichlet multinomial mixtures modelling to both milk and infant fecal microbiomes. Milk samples formed four clusters or microbial lactotypes (i.e., L1 through L4) (Figure 1A), whereas infant fecal samples formed two clusters or microbial enterotypes (i.e., E1 and E2) (Figure 1B). Both microbial lactotypes and enterotypes differed with respect to parity, maternal age, maternal BMI, and exclusive breastfeeding status (Tables S2 and S3). Lactotypes also differed by time postpartum and maternal secretor status. The distribution of populations within milk and infant fecal CSTs varied. Among lactotypes, L4 was comprised exclusively of rural Ethiopian (ETR) subjects, although not all ETR subjects belonged to the L4 cluster (Figure 1A). L1 was mainly comprised of individuals from the Americas and Europe (i.e., PE, SP, SW, USC, and USW), while L2 and L3 were both largely comprised of individuals from Africa (i.e., L2-ETU, GBR, GBU, and KE; L3-GN), although L3 contained a marginal proportion of participants from the SP cohort. Similarly, infant enterotype membership varied largely by African and non-African populations, with individuals from the Americas and Europe grouping mainly with E1 (Figure 1B). An assessment of the relationship between infant fecal enterotypes and the lactotypes of their respective mothers revealed that mothers with milk that belonged to L1 or L3 had a larger proportion of infants that belonged to E1 (67% and 63%, respectively); mothers with milk that belonged to L2 had a larger proportion of infants that belonged to E2 (60%). However, infants of mothers with milk that belonged to L4 were split between E1 (49%) and E2 (51%).
The abundance and prevalence of several genera were associated with individual milk and infant fecal CSTs based on indicator species analysis (Figure 1; FDR p < 0.1) (Tables S4 and S5). Among the most abundant genera within milk, Streptococcus, Propionibacterium, Lactobacillus, and Corynebacterium were identified as indicator taxa for L1, L2, L3, and L4, respectively. Bifidobacterium was also identified as an indicator taxon of L4. Among the most abundant genera within infant feces, Bacteroides, Escherichia/Shigella, and Clostridium sensu stricto were identified as indicator taxa of E1, whereas Streptococcus, Bifidobacterium, Lactobacillus, and Staphylococcus were identified as indicator taxa of E2.
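The "FDR p < 0.1" criterion used above refers to p-values adjusted for multiple testing. The text does not state which adjustment was applied; the sketch below assumes the widely used Benjamini-Hochberg procedure, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjustment.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw p-values for four genera
adjusted = benjamini_hochberg(raw)
indicator = [q < 0.1 for q in adjusted]
print(adjusted, indicator)
```

Under this assumption, a genus is retained as an indicator taxon when its adjusted value falls below 0.1, which tolerates a somewhat higher share of false positives than the conventional 0.05 cutoff.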
CSTs also differed from one another across multiple alpha (i.e., Shannon diversity, observed ASVs, and Pielou's evenness) and beta (i.e., Bray-Curtis and binary Jaccard) diversity metrics (Figure S3). In general, both Shannon and observed ASV metrics were highest in L4 and lowest in L3, with similar differences with respect to Pielou's evenness (Figure S3A). In contrast, there was no difference in alpha diversity metrics between infant fecal enterotypes (Figure S3B). Examination of the beta diversity of milk microbial lactotypes and infant fecal enterotypes confirmed that CSTs differed in community structure and membership (PERMANOVA, p < 0.001; Figure S3C,D).
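For readers less familiar with the three alpha diversity metrics named above, the following minimal Python sketch shows how each is computed from a single sample's count vector. The function names are hypothetical, and the use of the natural logarithm for the Shannon index is an assumption (the base is not stated in the text).

```python
import math


def observed_features(counts):
    """Richness: number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over features that are present."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S); defined here as 0 when S <= 1."""
    richness = observed_features(counts)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0


even = [10, 10, 10, 10]    # perfectly even community
skewed = [37, 1, 1, 1]     # same richness, dominated by one taxon
print(shannon(even), pielou_evenness(even))
print(shannon(skewed), pielou_evenness(skewed))
```

The two example communities have identical richness but very different Shannon and evenness values, which is why the three metrics are reported together.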
In summary, although milk and infant fecal microbiomes share numerous taxa, the microbial communities of these sample types differ. Additionally, despite the gradients of taxa abundances within these microbial communities, the microbiomes of milk and infant feces can be classified into multiple CSTs that are related to both maternal and infant factors, as well as to the presence and abundance of indicator genera.

Milk Lactose and Protein Concentrations
In milk, and similar to prior reports [1,44], average concentrations of lactose and protein were 79.2 ± 11.0 g/L and 15.0 ± 2.8 g/L, respectively. There was substantial variation in the range of concentrations for both of these milk constituents, with lactose ranging from 26.7 to 118.3 g/L (interquartile range [IQR], 72.7 to 84.3 g/L) and protein from 9.4 to 32.6 g/L (IQR, 13.2 to 16.6 g/L; Figure 2). The average concentrations of both lactose and protein differed by population (p < 0.001, both; Figure 2A). While lactose concentration differed among lactotypes (p = 0.008), protein concentration did not (p = 0.348; Figure 2B). Although the concentration of lactose did not differ between enterotypes (p = 0.553), a trend was observed for protein concentration (p = 0.051; Figure 2C).

Concentration and Composition of HMOs
The average concentration of total HMOs in milk was 12,913 ± 4039 nmol/mL (range, 4053-22,884 nmol/mL; IQR, 9735-15,847 nmol/mL). As expected and previously shown [23], the concentration of 2′FL was most associated with HMO composition profiles (Figure S4). Although multiple differences in HMO concentrations were observed among microbial lactotypes, fewer differences were observed between enterotypes (Tables S6 and S7). When microbial lactotypes and enterotypes were considered, concentrations of four HMOs differed in both (DSLNT, LSTc, LNnT, and 3′SL; FDR p < 0.1). An additional 12 HMOs differed in concentration among microbial lactotypes, and 3FL differed in concentration between enterotypes. Examination of the concentrations of HMOs grouped by shared features (e.g., HMO-bound fucose) or fucose/sialic-acid-bound HMOs within CSTs revealed that almost all groups differed among microbial lactotypes, except for type 1 and type 2 HMOs. In contrast, fewer HMO groupings differed between enterotypes, although differences were observed in the concentrations of type 2 HMOs and HMOs with internal α-2-6-sialylated linkages. Significant differences in the ratios of HMO-bound fucose to HMO-bound sialic acid as well as to total HMOs were also observed among lactotypes but not between enterotypes.

Association between Milk Composition and Milk and Infant Fecal Bacterial Beta Diversity
As microbial communities are comprised of taxa present in gradients of abundance, we examined the association of maternal/infant characteristics and the concentrations of milk-borne lactose, protein, and HMOs to the overall microbial community structure of milk and infant fecal samples using envfit. Not surprisingly, for both milk and infant fecal microbiomes, population cohort and respective CSTs were significantly associated with microbial composition (20 to 37% of the explained variance in milk and infant fecal microbiomes; Figure 3). Microbial lactotype was significantly associated with infant fecal microbiota community structure (~8%), but not the inverse. Maternal age, parity, and exclusive breastfeeding were also associated with the composition of both milk and infant fecal microbiomes, whereas time postpartum, maternal BMI, and mode of delivery were only associated with the infant fecal microbiome.
Numerous associations were also identified between the concentrations of milk-borne factors and microbial community structure (Figure 3). For instance, variation in lactose concentration was associated with the microbial community structure of milk (~4% explained variance) but not infant feces. In contrast, variation in protein concentration was associated with the structure of the infant fecal (~3%) microbiome, but not that of milk. Individually, 3FL, DSLNT, and 2′FL explained the most variance within both microbiomes (>4%, each), with an additional seven and eight HMOs also associated with milk and infant fecal microbiomes, respectively. Examination of HMOs by shared characteristics revealed that the concentration of HMO-bound fucose explained the most variance within the milk microbiome (~7%), and the ratio of HMO-bound fucose to HMO-bound sialic acid explained the most variance within the infant fecal microbiome (~12%). Thus, variation in milk and infant fecal microbiomes was related to differences in maternal and infant characteristics, as well as the concentrations of milk-borne factors.

Correlation of Milk Factors with Bacterial Taxa Abundance
A correlation analysis was performed to identify associations between and among concentrations of milk lactose, protein, HMOs, and the relative abundances of specific bacterial genera in milk and infant feces. Similar to prior observations of the milk microbiome and HMOs [19], stronger correlations were observed within a class of constituents (e.g., HMO-to-HMO and bacterium-to-bacterium) than between milk macronutrients/HMOs and bacterial taxa (Figure 4). This was mainly attributed to the grouping of HMOs with shared features, such as α-1-2-fucosylated HMOs (e.g., 2′FL and LNFP I, correlation coefficient = 0.74) or sialylated HMOs (e.g., 6′SL and LSTc, correlation coefficient = 0.65). Significant bacterium-to-bacterium correlations were also observed within and among milk and infant fecal samples. With respect to bacteria, the strongest inverse associations were observed between Bacteroides and [...]
Associations were also observed between milk constituents and both milk and infant fecal microbiota (Figure 4). For example, the relative abundances of milk and infant fecal Lactobacillus were inversely related to concentrations of fucosylated HMOs, whereas relative abundances of milk and infant fecal Veillonella were inversely correlated with concentrations of sialylated HMOs. Given the specific adaptations for HMO utilization of specific Bifidobacterium species, it is noteworthy that the abundance of infant fecal Bifidobacterium was not significantly associated with the concentration of fucosylated HMOs, except for an inverse correlation with DFLNH; however, the relative abundance of milk Bifidobacterium was inversely related to the concentration of several fucosylated HMOs, including 2′FL. Positive correlations were observed between milk and infant fecal Bifidobacterium and DSLNT, DSLNH, and other sialylated HMOs. In contrast, Streptococcus in milk (but not infant feces) was positively associated with fucosylated HMOs.
Correlations between the relative abundance of bacteria in milk and infant feces and the concentrations of protein and lactose were few and weak, although, in general, correlations between bacteria and lactose tended to be stronger than those between bacteria and protein. Taken together, although within-class associations of milk-borne factors were stronger, numerous associations between these same factors and specific milk and infant fecal microbiota were identified.
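A pairwise correlation of the kind summarized in this section can be sketched as follows. The text does not name the correlation coefficient used; this example assumes Spearman's rank correlation, which is common for skewed concentration and relative abundance data, and all function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else float("nan")


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


hmo_conc = [1200.0, 3400.0, 2100.0, 800.0, 2900.0]   # hypothetical HMO values, nmol/mL
genus_abundance = [0.30, 0.05, 0.12, 0.41, 0.08]     # hypothetical relative abundances
print(spearman(hmo_conc, genus_abundance))
```

Because the coefficient depends only on ranks, a single extreme concentration or abundance value has limited influence on the result.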

Discussion
Human milk contains a diversity of nutrients and bioactive factors known or postulated to influence maternal and infant health. Here we tested the hypothesis that variation in the concentrations of milk lactose, protein, and HMOs would be associated with differences in the community structure and abundance of milk and infant fecal microbiota. Although milk and infant fecal microbiota are known to vary by geographical location, we found that milk and infant fecal microbiota grouped into four microbial lactotypes and two enterotypes, respectively, based on similarities in community composition. Interestingly, while the CSTs within milk and infant feces contained representative genera (e.g., Lactobacillus in L3 and Bacteroides in E1), several genera were shared among all samples; for example, Staphylococcus and Veillonella were core genera in milk and infant fecal samples, respectively.
Notably, despite Staphylococcus spp. (e.g., Staphylococcus aureus) being commonly implicated as an etiological agent of mastitis [45,46], mastitis was not reported by any of the women in the current study. Indeed, Staphylococcus spp. are common constituents of milk microbial communities [3,47,48]. Differences in traits and virulence factors among strains of staphylococci [49] may help to partially explain why not all carriers of Staphylococcus spp. go on to develop mastitis. These observations also indicate that the presence of Staphylococcus spp. in milk is not, by itself, sufficient grounds for antibiotic treatment, and that more research related to the microbial etiology of mastitis is needed.
Similar to HMOs, concentrations of lactose and protein displayed a substantial amount of interindividual variation that differed among population cohorts and CSTs. Concentrations of HMOs are largely determined by genetic factors [50]. For example, individuals with a functioning FUT2-encoded fucosyltransferase (i.e., secretors) are able to synthesize α-1-2-fucosylated HMOs such as 2′-fucosyllactose (2′-FL) and lacto-N-fucopentaose I (LNFP I), whereas individuals lacking a functional FUT2 allele (i.e., non-secretors) are unable to synthesize these glycans [51]. Less is known about the underlying genetics that contribute to the concentrations of milk lactose and protein. Some data suggest that host genetics may play a role in milk lactose concentrations; e.g., lactose concentrations differ among ABH and Lewis secretor types [52], and metabolomic analyses have revealed differences in lactose concentrations in milk produced by women living in five different countries [53]. In contrast, while single nucleotide polymorphisms impacting bioactivity of some milk proteins have been identified [54], none have been strongly related to differences in milk protein concentration [55]. Regional differences in maternal nutrition may explain some of the variation in concentrations that we observed, although this is unlikely to be a significant contributor, as both milk lactose and protein levels are relatively unaltered by diet [56,57]. The full extent of the impact of host genetics on lactose and total protein concentrations remains to be determined.
The variation in protein and lactose concentrations among population cohorts may also have implications for current recommendations for dietary consumption of these nutrients. Adequate intake (AI) levels of nutrients for infants from 0 to 6 months are estimated based on an exclusive human milk diet (average concentration of the nutrients in milk coupled with milk intake volume of 0.78 L/d) by healthy, full-term infants born to healthy, well-nourished mothers [58]. For example, the milk concentrations used to establish the AIs for carbohydrates (in human milk being almost exclusively lactose) and protein are 74 g/L and 11.7 g/L, respectively [58]. In the present study, 70 percent of overall participants produced milk that met or exceeded the value used to establish the AI for carbohydrates/lactose. However, this varied by cohort, ranging from 47 percent of USW women to 100 percent of USC women. Similarly, whereas 92 percent of overall participants produced milk with protein concentrations that met or exceeded the value used to establish the AI for protein, this ranged from 66 percent of USW women to 100 percent of GBR and PE women. Although in the current study we did not assess milk intake volume of the infants, the differences in concentrations of lactose and protein across cohorts suggest that average consumption of these important macronutrients may differ by population.
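The arithmetic behind this comparison can be made explicit with a short Python sketch. It uses only figures quoted in the text (the 0.78 L/d intake volume, the 74 g/L and 11.7 g/L reference concentrations, and the cohort-wide mean concentrations of 79.2 g/L lactose and 15.0 g/L protein). Applying the reference intake volume to this cohort is an assumption, since milk intake was not measured, and the function names and the per-sample concentrations in the last example are hypothetical.

```python
REFERENCE_INTAKE_L_PER_DAY = 0.78  # milk intake volume quoted in the text [58]
REFERENCE_CONC_G_PER_L = {"lactose": 74.0, "protein": 11.7}  # values quoted in the text [58]
COHORT_MEAN_G_PER_L = {"lactose": 79.2, "protein": 15.0}     # means reported in the Results


def daily_intake_g(conc_g_per_l, volume_l_per_day=REFERENCE_INTAKE_L_PER_DAY):
    """Estimated daily nutrient intake (g/d) = concentration (g/L) x volume (L/d)."""
    return conc_g_per_l * volume_l_per_day


def percent_meeting_reference(concentrations, reference):
    """Percentage of samples whose concentration meets or exceeds the reference."""
    met = sum(1 for c in concentrations if c >= reference)
    return 100.0 * met / len(concentrations)


for nutrient in ("lactose", "protein"):
    ref = daily_intake_g(REFERENCE_CONC_G_PER_L[nutrient])
    obs = daily_intake_g(COHORT_MEAN_G_PER_L[nutrient])
    print(f"{nutrient}: reference {ref:.1f} g/d, cohort mean {obs:.1f} g/d")

# Hypothetical lactose concentrations (g/L) for four samples.
print(percent_meeting_reference([70.0, 74.0, 80.0, 90.0], 74.0))  # 75.0
```

Because intake scales linearly with concentration at a fixed volume, any between-cohort difference in mean concentration translates into a proportional difference in estimated intake under this assumption.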
The complex microbial communities present in milk and infant feces were associated with numerous maternal/infant factors as well as milk HMO, lactose, and protein concentrations. Not surprisingly, based on prior data [24], population cohort explained the most variance within both microbiomes; however, maternal age, parity, and exclusive breastfeeding were also significantly associated with variation. Interestingly, microbial lactotype was significantly associated with variation in the infant fecal microbiome, in support of the hypothesis that milk microbiota contribute to and influence infant GI microbiome composition [4,7,59].
While the proportion of secretors differed among microbial lactotypes, maternal secretor status per se was not associated with the microbial community structure of milk or infant fecal microbiota, consistent with findings from the CHILD and TwinsUK study cohorts [19,60]. Instead, microbial community structures of milk and infant fecal samples were associated with concentrations of α-1-2-fucosylated HMOs and HMO-bound fucose. This is likely related to underlying host genetics that results in a large degree of variation in the concentrations of α-1-2-fucosylated HMOs in the milk produced by secretors [23]. Although we did not analyze dietary patterns in this study, maternal diet may play a role in altered patterns of α-1-2-fucosylated HMO composition [61].
Several other HMOs (e.g., 3FL, LNFP III, LSTb, and DSLNT) were also found to be related to the microbial community structures of milk and infant feces. Interestingly, DSLNT, hypothesized to protect against necrotizing enterocolitis [62][63][64], was among the top three HMOs that explained the most variance in the structure of both milk and infant fecal microbiomes. In a correlation analysis, DSLNT was also positively associated with the abundance of Bifidobacterium in both milk and infant feces; Bifidobacterium is considered to be health-promoting during infancy [65] and has been found to positively correlate with DSLNT concentrations in milk [64]. It is notable that all the infants in this study were reported as born at term and currently healthy.
While concentrations of lactose were associated with the microbial community structure of milk, concentrations of protein were not. Conversely, while concentrations of protein were associated with the microbial community structure of infant feces, concentrations of lactose were not. These differences likely reflect differences in host factors and in the microenvironments of the mammary gland and the GI tract. For example, lactose plays a major role in controlling milk volume via maintenance of the osmolarity of milk in the mammary gland and is mostly metabolized prior to reaching the lower GI tract. Lactose concentrations were positively correlated with the abundance of Streptococcus in milk, a genus known to be capable of fermenting lactose. However, given the abundance of lactose in milk, it is unlikely to be a rate-limiting substrate for bacterial growth. Instead, lactose may modulate bacterial abundance by inducing the metabolism of other milk-borne factors, similar to HMO-induced amino acid utilization [66]; however, this remains to be examined.
Several limitations to the current study should be noted. As we only examined total protein, we were unable to examine associations among microbiota and individual proteins or classes of proteins, such as secretory immunoglobulin A and lactoferrin, both of which can directly influence microbial ecology [67,68]. Additionally, although we examined the associations between microbiota and concentrations of 19 HMOs that represent the majority of HMOs present in milk, to date over 200 HMO species (most present in low abundance) have been identified in human milk [69]. However, the 19 analyzed HMOs not only represent the majority of all HMOs by mass, they also describe the entire known chemical space of HMOs, including type 1 and 2 structures, branching, as well as all types of fucosylation and sialylation. Whether the other more complex (structure redundancy) and less abundant HMOs have an impact on milk and infant fecal microbiota is unknown and requires additional research. Furthermore, HMO concentrations were only measured in milk and not infant feces; as such, we could not examine how HMOs may have been metabolized as they passed through the infant GI tract or any relationships with resident microbiota. Finally, while we attempted to standardize and/or optimize milk and fecal collection, handling, and analysis protocols, this was not always possible due to practical considerations. For example, lack of access to reliable refrigeration required us to mix and store all ETR samples with a preservative, which may have influenced downstream analyses. However, key strengths of this work are the large and globally diverse cohort of dyads; recruitment of relatively healthy participants in each cohort; and inclusion of appropriate controls and filtering parameters applied prior to analysis.

Conclusions
Taken together, our results demonstrate that variation in human milk and infant fecal microbial communities is associated with differences in the concentrations and profiles of milk lactose, protein, and HMOs. Future work should focus on understanding how these associations develop and mature over the course of lactation and infant development.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/microorganisms9061153/s1, Figure S1: Relative abundances of the initial top 25 bacterial genera in milk before and after prevalence-based filtering; Figure S2: Milk and infant fecal microbiome beta diversities and most abundant taxa; Figure S3: Differences in the alpha and beta diversities of milk and infant fecal community state types; Figure S4: HMO composition; Table S1: Selected cohort characteristics; Table S2: Milk microbial community state type (lactotype) characteristics; Table S3: Infant fecal microbial community state type (enterotype) characteristics; Table S4: Indicator taxa of the microbial lactotypes; Table S5: Indicator taxa of the infant microbial enterotypes; Table S6: HMO concentrations and ratios among microbial lactotypes; Table S7:

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: Data and materials that support the findings of this study are available upon request from the corresponding authors.