The Vaginal Microbiota of Pregnant Women Varies with Gestational Age, Maternal Age, and Parity

ABSTRACT The composition of the vaginal microbiota is heavily influenced by pregnancy and may factor into pregnancy complications, including spontaneous preterm birth. However, results among studies have been inconsistent due, in part, to variation in sample sizes and ethnicity. Thus, an association between the vaginal microbiota and preterm labor continues to be debated. Yet, before assessing associations between the composition of the vaginal microbiota and preterm labor, a robust and in-depth characterization of the vaginal microbiota throughout pregnancy in the specific study population under investigation is required. Here, we report a large longitudinal study (n = 474 women, 1,862 vaginal samples) of a predominantly African-American cohort—a population that experiences a relatively high rate of pregnancy complications—evaluating associations between individual identity, gestational age, and other maternal characteristics with the composition of the vaginal microbiota throughout gestation resulting in term delivery. The principal factors influencing the composition of the vaginal microbiota in pregnancy are individual identity and gestational age at sampling. Other factors are maternal age, parity, obesity, and self-reported Cannabis use. The general pattern across gestation is for the vaginal microbiota to remain or transition to a state of Lactobacillus dominance. This pattern can be modified by maternal parity and obesity. Regardless, network analyses reveal dynamic associations among specific bacterial taxa within the vaginal ecosystem, which shift throughout the course of pregnancy. This study provides a robust foundational understanding of the vaginal microbiota in pregnancy and sets the stage for further investigation of this microbiota in obstetrical disease. IMPORTANCE There is debate regarding links between the vaginal microbiota and pregnancy complications, especially spontaneous preterm birth. Inconsistencies in results among studies are likely due to differences in sample sizes and cohort ethnicity. Ethnicity is a complicating factor because, although all bacterial taxa commonly inhabiting the vagina are present among all ethnicities, the frequencies of these taxa vary among ethnicities. Therefore, an in-depth characterization of the vaginal microbiota throughout pregnancy in the specific study population under investigation is required prior to evaluating associations between the vaginal microbiota and obstetrical disease. This initial investigation is a large longitudinal study of the vaginal microbiota throughout gestation resulting in a term delivery in a predominantly African-American cohort, a population that experiences disproportionally negative maternal-fetal health outcomes. It establishes the magnitude of associations between maternal characteristics, such as age, parity, body mass index, and self-reported Cannabis use, on the vaginal microbiota in pregnancy.

ABSTRACT The composition of the vaginal microbiota is heavily influenced by pregnancy and may factor into pregnancy complications, including spontaneous preterm birth. However, results among studies have been inconsistent due, in part, to variation in sample sizes and ethnicity. Thus, an association between the vaginal microbiota and preterm labor continues to be debated. Yet, before assessing associations between the composition of the vaginal microbiota and preterm labor, a robust and in-depth characterization of the vaginal microbiota throughout pregnancy in the specific study population under investigation is required. Here, we report a large longitudinal study (n 5 474 women, 1,862 vaginal samples) of a predominantly African-American cohort -a population that experiences a relatively high rate of pregnancy complicationsevaluating associations between individual identity, gestational age, and other maternal characteristics with the composition of the vaginal microbiota throughout gestation resulting in term delivery. The principal factors influencing the composition of the vaginal microbiota in pregnancy are individual identity and gestational age at sampling. Other factors are maternal age, parity, obesity, and self-reported Cannabis use. The general pattern across gestation is for the vaginal microbiota to remain or transition to a state of Lactobacillus dominance. This pattern can be modified by maternal parity and obesity. Regardless, network analyses reveal dynamic associations among specific bacterial taxa within the vaginal ecosystem, which shift throughout the course of pregnancy. This study provides a robust foundational understanding of the vaginal microbiota in pregnancy and sets the stage for further investigation of this microbiota in obstetrical disease. than that of nonpregnant women (58). These general findings have been replicated by other investigators (20,21,48). Therefore, it has been proposed that an increased stability of the vaginal microbiota and Lactobacillus dominance during pregnancy plays a protective role and reduces the likelihood of pregnancy complications, especially sPTB (47,61). However, the association between variation in the composition of the vaginal microbiota and preterm birth continues to be debated (45,(62)(63)(64). Potential explanations for the inconsistencies in results among published studies include differences in sample sizes and cohort ethnicity. Therefore, before evaluating associations between the composition of the vaginal microbiota and obstetrical disease, including sPTB, a robust and in-depth characterization of the vaginal microbiota throughout pregnancy in the specific study population under investigation is required.
This initial investigation focuses on an urban population that experiences a high frequency of pregnancy complications (65)(66)(67)(68)(69)(70)(71)(72)(73)(74)(75). It uses 16S rRNA gene amplicon sequencing to assess the trajectory of the composition of the vaginal microbiota throughout gestation ending in term delivery. Leveraging longitudinal samples from a large set of patients with well-characterized demographic and clinical data, this study establishes the magnitude of associations between maternal characteristics, such as age, parity, and ethnicity, on the vaginal microbiota. Such knowledge is important for the assessment of previous reports and for informing future analyses of the vaginal microbiota's relationship with obstetrical complications, especially sPTB. Moreover, this study provides information on a predominantly African-American population, for which available data are overall sparse despite this population experiencing disproportionally negative maternal-fetal health outcomes (65)(66)(67)(68)(69)(70)(71)(72)(73)(74)(75).

RESULTS AND DISCUSSION
The demographic characteristics of the 474 patients with term delivery included in this study (the largest cohort sampled to date) are presented in Table 1. This cohort is predominantly African-American (94.5% [448/474]) with a body mass index (BMI) above 25 kg/m 2 (65% [306/472]). The distribution of the gestational ages at which the 1,862 vaginal fluid samples were collected from these patients is depicted in Fig. 1. Each woman had 3 to 4 samples (median of 4) collected between 8 and 38 16 weeks of gestation.
Effect of patient/subject identity. The structure of the vaginal microbiota during pregnancy has been reported to vary with gestational and maternal age (47,48,58). However, the scope and strength of all factors potentially influencing the structure and fluidity of the vaginal microbiota during pregnancy remain to be elucidated. In the current study, beta diversity, or the shared diversity of the microbiota between samples, was characterized using the Jaccard (i.e., microbiota composition) and Bray-Curtis (i.e., microbiota structure) indices. The variation in vaginal microbiota composition and structure was primarily explained by patient identity (subject identity [subjectID]: composition, R 2 5 58% to 61%; structure, R 2 5 65% to 68%) and by the patient-specific variation with gestational age (interaction between subjectID and gestational age: composition, R 2 5 16% to 18%; structure, R 2 5 14% to 16%). Only a modest proportion of the variance Vaginal Microbiota across Term Pregnancy Microbiology Spectrum in the composition and structure of the vaginal microbiota was explained by maternal characteristics, such as age (0.2% to 1.4%), parity (0.3% to 1.9%), and self-reported Cannabis use (0.3% to 1.8%). Overall, these findings illustrate the large influence that patient (i.e., individual) identity has on the composition and structure of the vaginal microbiota, which is consistent with general observations of microbiotas at other body sites, such as the oral cavity, gut, and skin (76)(77)(78)(79)(80)(81). Furthermore, these results highlight the importance of robust longitudinal, as opposed to cross-sectional, studies to account for inter-and intra-individual variability in the microbiota (80,(82)(83)(84).
Effect of gestational age. Alpha diversity, which is the diversity of the microbiota within individual samples, was characterized using Chao1 (i.e., richness) and Shannon and Simpson (i.e., richness and evenness) indices. Both richness ( Fig. 2A and C) and evenness ( Fig. 2B and D) of the vaginal microbiota decreased with advancing gestational age from the first to the third trimester (P , 0.0001 for slopes differing from zero, as determined by linear mixed-effects (LME) models). This finding is consistent with previous reports of alpha diversity in cohorts of primarily African-Americans (28,48,58). By contrast, previous reports of largely Caucasian cohorts indicated that alpha diversity is generally low and consistent throughout the entirety of gestation (23,47,48,85). Nevertheless, in the current study, there was substantial heterogeneity in the rate of decrease in vaginal microbiota alpha diversity among patients; the decrease was steeper for women who had higher baseline diversity early in pregnancy (correlation between random intercepts and random slopes include Shannon 5 20.79, FIG 2 Decrease in alpha diversity of the vaginal microbiota with gestational age in women ultimately delivering at term. Graphical representation of low and high bacterial community richness (A) and evenness (B). Linear mixed-effects models illustrating decreases of bacterial community richness (C) and richness coupled with evenness (D) over the course of gestation. Each dot corresponds to one sample. The red line represents the linear fit using linear mixed-effects models. The dark blue line represents the model fit, and light blue areas define the 95% confidence intervals derived from generalized additive models with splines transformation of gestational age at sampling.

Vaginal Microbiota across Term Pregnancy
Microbiology Spectrum Simpson 5 20.67, and Chao1 5 20.67) (see, for example, Fig. 3). The decrease in alpha diversity with advancing gestational age remained significant after adjusting for potential confounding variables, including maternal age, parity, and BMI (Table 2). These alterations in the overall structure of the vaginal microbiota likely reflect physiological alterations (e.g., glycogen levels [5,86,87]) in the vaginal microenvironment across gestation that favor the predominance of a few bacterial taxa that can thrive under these conditions (e.g., Lactobacillus spp.). Indeed, it has been shown that the relative abundance of Lactobacillus in the vaginal microbiota is highest among African-American women with high free glycogen levels (54). The vaginal microbiota is consistently categorized into CSTs that are defined by a dominance or lack of Lactobacillus spp. (4,11,23,44). Using a protocol established previously for assigning CSTs to vaginal samples based on 16S rRNA gene sequence data (11), we identified seven CSTs among the 1,862 samples included in this study (Fig. 4A).  (27, 29, 36, 39-41, 44, 47, 58, 85, 88-94), further illustrating the depth of complexity of CST IV-designated communities.
In this study, CST prevalence was a function of gestational age (Fig. 4B). Except for the two least abundant CSTs (II and IV-C), for which statistical power was limited, the membership probability to any CST displayed dynamic changes with gestational age (P , 0.05 for all; Fig. 4B). While Lactobacillus-dominated CSTs I, III, and V tended to be more abundant with advancing gestational age, the abundance of the more diverse CSTs IV-A and -B declined steadily as term gestation approached (Fig. 4B). Notably, in a secondary analysis of women (n 5 309) for which samples were available from each of four discrete time points across gestation, there was a pronounced shift in CST composition with advancing gestational age; specifically, there was an increase in CSTs I and III at the end of pregnancy, derived primarily from patients with an initial CST IV-A or IV-B (Fig. 4C). These findings are in line with prior cross-sectional (21,24,29,36,39,40,48,88,89,91,93) and longitudinal (22, 23, 25-28, 41, 44, 47, 48, 58, 85, 90, 92, 94, 95) studies that included characterization of the structure and dynamics of the vaginal microbiota in term pregnancies. At a community level, pregnancy has been shown to create favorable conditions for a Lactobacillus-dominated vaginal microbiota, particularly CSTs I and III, and a shift away from the more diverse CSTs IV-A and IV-B, as gestation progresses to term (20,23,48,58). Although shifts in the vaginal microbiota occur in gravid and nongravid women, an increased prevalence of specifically Lactobacillus-dominated CSTs in pregnant women is intriguing because it could protect against ascending infection (96), which could culminate in sPTB, through the competitive exclusion of opportunistic pathogens within the vaginal microenvironment (47,97). Furthermore, Lactobacillus species produce lactic acid, which has anti-inflammatory properties (98)(99)(100). Additional research exploring the functional role of the vaginal microbiota on the host in large longitudinal cohorts is warranted to address these hypotheses.
After describing the changes in the composite measures of the vaginal microbiota alpha and beta diversity and CSTs, we utilized linear mixed-effects (LME) modeling to analyze the relationships between gestational age and maternal characteristics with the relative abundances of individual bacterial taxa denoted as amplicon sequence variants (ASVs) (see Table S1 in the supplemental material). Increased gestational age was positively correlated with 33 exclusively Lactobacillus ASVs (q , 0.1) and negatively correlated with ASVs that are typical members of vaginal CST IV (Table S1; Fig. 5). a Chao1 richness, Shannon, and Simpson diversity. b Gestational age was centered at 8 wks and then scaled by 4; therefore, the change in diversity corresponds to a 4-wk interval. c Maternal age was scaled by 5; therefore, the diversity corresponds to a 5-yr increase in maternal age.  (11). (B) Dynamics of vaginal CST prevalence as a function of gestational age among women ultimately delivering at term. The log-odds of membership for each CST were modeled using binomial linear-mixed effects models. Fixed effects in these models included gestational age (linear and quadratic terms, as needed) and maternal characteristics, while one random intercept was allowed for each subject. (C) Alluvial plot illustrating the temporal dynamics of vaginal CST prevalence and transitions among 309 women who delivered at term and contributed one sample per each of the four discrete time periods (10 to 37 weeks).

Vaginal Microbiota across Term Pregnancy Microbiology Spectrum
To supplement the LME models at the ASV level, we further implemented analysis of composition of microbiomes (ANCOM) (101) to identify ASVs changing in abundance throughout gestation. Gestational age was treated as a main fixed effect; patient identity was treated as a random effect; and maternal age, parity, Cannabis use, ethnicity, and race were included as covariates. Seventy-five ASVs were positively or negatively associated with gestational age (see Table S2 in the supplemental material, Fig. 6). As in the LME analysis, many Lactobacillus ASVs were positively associated with gestational age, while many bacteria typically associated with CST IV were negatively FIG 5 Changes in the relative abundance of amplicon sequence variants (ASVs) in vaginal 16S rRNA gene profiles across gestational age in women who ultimately delivered at term. Only the first ASV for each microbial taxon with a significant corrected P value (q , 0.05) presented in Table S1 is shown. Panels with positive correlations are ordered before those with negative correlations. Each dot within an individual panel corresponds to one sample. The red lines represent linear fits through relative abundance data using linear mixed-effects models. The blue lines and gray bands represent the model fits and 95% confidence intervals derived from generalized additive models, respectively. The green lines represent the estimates from negative binomial mixed-effects generalized additive models. associated with gestational age (Fig. 6). This pattern was especially evident when considering the 56 of these 75 ASVs that overlapped with those identified as being significantly associated with gestational age in the LME analysis (Table S2; see Fig. S1 in the supplemental material). Specifically, among these overlapping ASVs, six were positively associated with gestational age and each was classified as Lactobacillus. By contrast, the 50 ASVs that were negatively associated with gestational age, using both analytic approaches, were largely classified as Atopobium, Gardnerella, Megasphaera, Prevotella, and Sneathia, which are each CST IV-typical bacteria. The direction of these associations is intriguing given prior reports that Lactobacillus-dominated vaginal CSTs are linked with favorable reproductive outcomes (23,26,28), while conversely, CST IV-and CST IV-typical bacteria have been associated with an increased risk of sPTB (25,28,38). Notably, however, the ANCOM analyses additionally indicated that multiple ASVs classified as Ca. Lachnocurva vaginae also increased in abundance with advancing gestation (Fig. 6). Ca. Lachnocurva vaginae, previously referred to as BVAB1 (12), is an established resident bacterium of the vaginal ecosystem (102,103), and it is typically a component of CST IV. Its increase in abundance throughout pregnancy is a novel finding, and its potential clinical significance warrants further investigation.
Overall, the results were largely congruent between the LME models and ANCOM analyses, although there were some notable differences (e.g., Ca. Lachnocurva vaginae). The fundamental difference between LME and ANCOM is that ANCOM is inferring about abundances, whereas LME is inferring about relative abundances. Specifically, LME evaluates whether the abundance of a particular taxon in a unit volume of an ecosystem, relative to all other taxa, has changed between two ecosystems. On the other hand, ANCOM evaluates whether the abundance of a particular taxon, in a unit volume of an ecosystem, has changed between two ecosystems. This difference in inference between LME and ANCOM explains the differing results obtained by using these two approaches. Regardless, LME and ANCOM identified ecologically plausible variation in microbiota membership across gestational age, and a large proportion of the bacterial ASVs changing in composition and abundance across pregnancy were discovered with both approaches.
Effect of maternal parity. In addition to the changes in alpha and beta diversity observed with increasing gestational age, there was also a significant effect of parity. Specifically, parity was positively correlated with alpha diversity ( Table 2). One potential FIG 6 Amplicon sequence variants (ASVs) classified at the genus level identified as less or more abundant in the vaginal microbiota with advancing gestational age. As gestation advances, Lactobacillus, and to a lesser extent Ca. Lachnocurva, ASVs become more abundant and many members of community state type (CST) IV become less abundant.

Vaginal Microbiota across Term Pregnancy
Microbiology Spectrum explanation for the association between parity and increased alpha diversity is the marked increase in the alpha diversity of the vaginal microbiota after live birth (85,104), and this phenomenon may be cumulative across multiple pregnancies, mirroring maternal-fetal immunological memory (105)(106)(107)(108). Notably, this phenomenon cannot be explained simply by advancing maternal age since this covariate was not positively correlated with alpha diversity of the vaginal microbiota (Table 2). This finding indicates that, while there is a consistent reduction in the richness and evenness of the vaginal microbiota throughout pregnancy, at least among women who have a diverse microbiota at pregnancy onset, this effect may be mitigated by parity. With respect to beta diversity, only modest percentages of the variance in the composition and structure of the vaginal microbiota were explained by parity (0.3% to 1.9%). Nevertheless, differences in CST membership based on parity, while adjusting for gestational age, were found (Table 3), with parity (odds ratio [OR] 5 1.46 for each additional previous delivery) being associated with an increase in CST IV-B. At the ASV level, higher maternal parity was significantly associated with an increase in 58 ASVs classified as typical vaginal CST IV bacteria (e.g., Gardnerella, Megasphaera, Prevotella, and Sneathia), while lower maternal parity was exclusively correlated with 7 Lactobacillus ASVs, of which 6 were classified as L. crispatus (q , 0.1) (Table S1). This finding is consistent with a recent report of increased vaginal microbiota diversity with higher parity during subsequent gestations (109). The ecological and clinical implications of the correlation between increased parity and vaginal microbiota diversity warrant further investigation.
Effect of maternal age. Only a modest proportion of the variance in the composition and structure of the vaginal microbiota was explained by maternal age (0.2% to 1.4%). Nevertheless, differences in CST membership based on maternal age, while adjusting for gestational age, were found (Table 3), with higher maternal age (OR 5 0.64 for each additional 5 years) associated with a decrease in CST III. Similarly, at the ASV level, there were significant negative correlations between maternal age and 18 ASVs, of which 16 were classified as L. iners, while only 4 ASVs, classified as L. crispatus, were positively correlated with maternal age (Table S1). While these correlations contrast with a previous report (47), the differences in ethnic makeup and sample size between the two cohorts could account for this discrepancy. Effect of obesity. There was a significant effect of obesity, defined as a BMI greater than 28 kg/m 2 , on alpha diversity of the vaginal microbiota (Table 2). Specifically, there was a positive correlation between obesity and richness of the vaginal microbiota across gestation. This finding is in contrast with the intestinal microbiota, for which there tends to be a negative correlation between obesity and richness across gestation (110). Similar patterns are evident outside pregnancy as well. Obesity is associated with high alpha diversity of the vaginal microbiota (111) and, in general, low alpha diversity of the gut microbiota (112)(113)(114)(115). Thus, for both of these body sites, obesity is associated with levels of microbiota alpha diversity that are widely viewed as nonoptimal. As obesity is characterized by a low-grade systemic inflammatory response (116)(117)(118)(119), these data highlight potential dynamic interactions between systemic inflammation and microbiota alpha diversity throughout the human body that can influence health and disease, including pregnancy outcomes (120,121).
Bacterial taxa are highly correlated with one another during normal pregnancy. Table S3 in the supplemental material shows some of the strongest associations (LME-adjusted q of ,0.05 and absolute Spearman correlation coefficient of .0.5) between pairs of bacterial taxa during pregnancy. In this analysis, each genus-level taxon (or family, if a genus-level designation was not available) was represented by one ASV retained based on the strongest association with gestational age. A subset of these significant correlations (involving the most relatively abundant ASVs) is shown in Fig. 7. Atopobium (ASV10) and Gardnerella (ASV3) (r 5 0.74), Eggerthellaceae (ASV39) and Parvimonas (ASV21) (r 5 0.83), Dialister (ASV25) and Eggerthellaceae (ASV39) (r 5 0.83), and Sneathia (ASV11) and Parvimonas (ASV 21) (r 5 0.74) were among the most highly correlated pairs of bacterial taxa in pregnancy (Table S3). These data suggest potential synergistic relationships among these typical members of CST IV in pregnancy and potentially beyond.
Network analysis reveals further changes in microbiota structure throughout pregnancy. The results from LME modeling were followed up with network analyses throughout gestation. Network analyses of the 25 most relatively abundant ASVs revealed that Lactobacillus ASVs were consistently network hubs, defined as ASVs closest to the center of the network, across term gestation (Fig. 8A to D). It is worth mentioning that there was limited resolution to differentiate Lactobacillus spp., given that the V4 hypervariable region of the 16S rRNA gene was targeted for sequencing (133). Nevertheless, Lactobacillus ASVs were clearly split between a primary group (ASVs 2, 7, 12, 13, 15, and 20) that included a mixture of L. crispatus, L. jensenii, and L. gasseri and a secondary group (ASVs 1 and 6) comprised exclusively of L. iners (Fig. 8A to D). These positive associations were interesting, given the exclusionary nature of the CST-defining Lactobacillus spp. in the former group. In addition, this group maintained strong negative associations with Gardnerella (ASVs 3, 8, and 9), Atopobium (ASV 10), and Megasphaera (ASV 4) throughout gestation. By contrast, L. iners ASVs had very few associations with other ASVs, either positive or negative (Fig. 8A to D).
This dichotomy may be due to the species-specific ability of Lactobacillus to produce lactic acid (both L-and D-isomers) and-to a lesser extent-hydrogen peroxide, each of which can create hostile conditions for other bacteria (134,135). While L. iners can produce L-lactic acid, it lacks key genes to produce D-lactic acid (136). Conversely, Lactobacillus crispatus produces both isomers (130). Given that these two isomers of lactic acid differentially affect the biochemistry of vaginal fluid (130), they may also differentially influence the composition of the broader microbiota. Furthermore, unlike L. crispatus, L. iners lacks the ability to produce hydrogen peroxide (20,123). Hydrogen peroxide is an established antimicrobial compound, yet it may only play a minor role within the vaginal ecosystem, as its production may be limited in this typically hypoxic environment (137). Regardless, the inability to produce both inhibitory metabolites may explain the lack of strong negative associations and, therefore, the permissive nature of L. iners toward other ASVs when it predominates in the vagina (20).
We further analyzed the networks by defining clusters, formed by optimal grouping of ASVs based on strengths of association, which revealed differences in the degree of association between Lactobacillus spp. and CST IV-typical bacteria (Fig. 8A to D). In particular, CST IV bacteria were split between two groups. The first group, which contained Gardnerella (ASVs 3, 8, and 9), Atopobium (ASV 10), and Megasphaera, exhibited strong negative associations with Lactobacillus ASVs (Fig. 8A to D). The second group, which contained Sneathia, Dialister, Fastidiosipila, Ca. Lachnocurva, Parvimonas, and Atopobium (ASV 24), had only weak negative associations with Lactobacillus ASVs (Fig. 8A to D). Interestingly, bacteria within the second group formed increasingly positive associations among themselves as gestation progressed (Fig. 8A to D). These contrasting patterns among non-Lactobacillus ASVs are in concordance with results of a prior report (104) that identified strong exclusionary associations between L. crispatus and G. vaginalis but only moderate negative associations between L. crispatus and other CST IV bacteria. These clustering profiles mirror previously proposed splits within CST IV, with the green cluster comprised of Gardnerella, Atopobium, and Megasphaera representing CST IV-B and the orange cluster, formed by diverse bacteria (Atopobium, Dialister, Ca. Lachnocurva, Parvimonas, Prevotella, and Sneathia), representing CST IV-C (Fig. 8A to  D). While CST IV-B is defined by Gardnerella predominance, CST IV-C lacks a predominance of both Lactobacillus and Gardnerella. Instead, CST IV-C is formed by a multitude of diverse bacteria (11). The increase in positive associations among these particular CST IV-C members suggests that CST IV-C bacteria can coexist, leading to more species-rich and diverse vaginal microbiotas than the more exclusionary CSTs. These findings are echoed in Table S3, which reveals that the strongest associations among ASVs, as determined by LME modeling, exist among CST IV bacteria. For example, Sneathia ASV 11 and Parvimonas ASV 21 were highly correlated (r 5 0.74) and were consistently positively associated throughout gestation in the network analyses ( Fig. 8A to D).
Intriguingly, Gardnerella ASVs 18 and 23 exhibited a distinct Gardnerella-correlative phenotype from the other Gardnerella ASVs (3, 8, and 9), which were in a separate cluster throughout most of gestation (Fig. 8A, C, and D). Instead of exhibiting the strong Lactobacillus-negative associations of Gardnerella ASVs 3, 8, and 9, Gardnerella ASVs 18 and 23 were often positively correlated with other ASVs throughout gestation, including those classified as Lactobacillus (Fig. 8B to D). Notably, Gardnerella ASV G2, which was an ASV associated with sPTB in a prior study (25), shared 100% identity with G. vaginalis ASV 9, and, in our study, it displayed strong negative associations with the L. crispatus cluster (Fig. 8A to D). ASVs 18 and 23 did not match any current Gardnerella type strain with 100% identity using BLAST (see Table S4 in the supplemental material); they may represent unique Gardnerella strains. This finding is important because Gardnerella is associated with bacterial vaginosis (138) and sPTB (139), yet seemingly in a strain-dependent manner (25). Therefore, these network analyses highlight the need for strain-level resolution of the vaginal microbiota to fully understand its complex dynamics and ecology in health and disease.
Lastly, each of the four networks from different time points in gestation was compared to each other to identify global and ASV-specific network changes. Globally, natural connectivity (i.e., robustness of the network) significantly decreased and positive edge percentage (i.e., proportion of positive associations) significantly increased (Fig. 8E) as pregnancy progressed, both with strong linear trends across gestation (R 2 5 0.895 and 0.985, respectively). The increase in positive edge percentage was a combination of a decrease in negative Lactobacillus-CST IV-B associations and an increase in the strength of positive associations among CST IV-C-associated ASVs. These gestational changes observed in positive edge percentage were associated with a significant decrease in closeness (i.e., the sum of the shortest paths between a node and all other nodes) for Gardnerella ASVs 3, 8, and 9 since many of their strong negative Lactobacillus spp. associations were lost or diminished as pregnancy progressed (Fig. 8B to D). Conversely, positive associations for CST IV-associated ASVs increased in general, likely resulting from an increase in available niches within the vaginal ecosystem as term approaches. Such changes may be due to the shift toward Lactobacillus-dominated CSTs observed in Fig. 4B and C. FIG 8 Legend (Continued) below the diagonal. Cells shaded red and cells shaded blue represent statistically significant differences (after false discovery rate correction) between time periods for positive edge percentage and natural connectivity, respectively. Differences were assessed by a one-tailed test of the observed difference compared to a nonparametric permuted sampling distribution of the corresponding measure (n 5 1,000).

Vaginal Microbiota across Term Pregnancy Microbiology Spectrum
Collectively, these network analyses demonstrate the complex interactions between members of the vaginal microbiota. We were able not only to confirm classical associations of Lactobacillus species with members of other genera, but also, through longitudinal collection of vaginal samples, to identify shifts in network connectivity as gestation progressed. Furthermore, different ASVs attributed to the same genus (e.g., Gardnerella) demonstrated distinct ecological dynamics, suggesting that strain-level variation is a driver of community phenotypes commonly denoted as CSTs. Therefore, this study highlights the need for strain-level investigations utilizing metagenomic data to further characterize these shifts in vaginal microbiota ecology throughout gestation and to determine their underlying causes and consequences.
Conclusions and future directions. The composition of the vaginal microbiota is broadly consistent across populations of reproductive-age women worldwide (1-5); yet, the relative abundances of CSTs can vary within populations, including by ethnicity (2)(3)(4)(5), and within individuals over time. It is hypothesized that variation in the vaginal microbiota is contributing to obstetrical complications, especially sPTB (22,23,25,27,28). Here, we have provided a longitudinal study of a predominantly African-American population with a large sample size and extensive demographic and clinical data, which allowed for the simultaneous evaluation of the effects of a broad range of maternal characteristics on the vaginal microbiota. Indeed, this study represents the largest and most comprehensive longitudinal survey of the vaginal microbiota throughout gestation resulting in a term delivery and thereby provides foundational understanding. The focus on African-American women is a clear strength of the study because they constitute a high-risk population that experiences a relatively high rate of pregnancy complications (65)(66)(67)(68)(69)(70)(71)(72)(73)(74)(75).
In the current study, we report that the principal factors influencing the composition and structure of the vaginal microbiota in pregnancy are individual patient identity and gestational age at sampling. The pronounced effect of individual identity highlights the need for longitudinal studies to account for inter-and intra-individual variability when evaluating the strengths of potential relationships between the composition of the vaginal microbiota and obstetrical complications. Furthermore, the richness and evenness of the vaginal microbiota decreased throughout pregnancy, with the microbiota becoming increasingly predominated by Lactobacillus species with advancing gestation. Such typical changes can be partially mitigated by maternal parity and obesity. Importantly, Lactobacillus species, especially L. crispatus, are generally perceived to promote the vaginal and reproductive health of women (4,13,15,16,19,20,25,27,28,36,40,44,89,90,92,97,(125)(126)(127)(128)(129)(130); therefore, any factors that could potentially reduce the likelihood of the transition of the vaginal microbiota to a Lactobacillus-dominant community during pregnancy need to be identified. Lastly, network analyses revealed dynamic interactions among individual bacterial strains within the vaginal microbiota during pregnancy and that the number and structure of these interactions change with advancing gestation. A critical consideration moving forward will be to assess whether patterns in these strain-level interactions within the vaginal microbiota differ between women delivering at term or those who ultimately experience sPTB. Ideally, this assessment should be done by using metagenomics (140), in addition to 16S rRNA gene sequencing, so that the strain-level designation of bacterial taxa can be more readily achieved, and information on the functional and virulence potential of various strains within individual microbiotas can be gleaned (28,104). Furthermore, a key element missing from most studies is the characterization of host-immune microbiome interactions, which can be readily assessed by evaluating the immunoproteome (e.g., cytokines, chemokines, and defensins) within the vaginal ecosystem. The local immune response of women to their vaginal microbiota may be as or more variable than the composition of their microbiota. Therefore, elucidating the dynamics of host immune-microbiome interactions and their potential influence on obstetrical outcomes, especially sPTB, is a critical future direction (27, 28, 141, 142). Study design. We conducted a retrospective longitudinal cohort study to characterize variation in the vaginal microbiota across gestation in pregnancies ending in normal term delivery. A normal pregnancy was defined as a woman with no obstetrical syndromes (fetal demise, intrauterine growth restriction, preeclampsia, preterm labor, and preterm prelabor rupture of membranes), fetal anomalies, or neonates with birth weights outside the 10th to 90th percentiles; who agreed to participate in this study and provided written signed informed consent; and who delivered at term (38 to 42 weeks of gestation). Three or four samples of vaginal fluid were collected longitudinally across pregnancy from each woman under direct visualization from the posterior vaginal fornix using a Dacron swab (Medical Packaging Corp., Camarillo, CA). Vaginal swabs were stored at 280°C until DNA extraction.
DNA extraction from vaginal swabs. Genomic DNA was extracted from vaginal swabs (n 5 1,862) alongside nontemplate negative controls, addressing any potential background DNA contamination (n 5 73). All vaginal swabs were randomized across extraction runs. Extractions were conducted using a MagAttract PowerMicrobiome DNA/RNA EP extraction kit (QIAGEN, Germantown, MD), with minor modifications to the manufacturer's protocols. Briefly, swabs were transferred to clean, labeled Corning cryovials (Corning, Corning, NY) and immersed in 750 mL solution MBL preheated to 60°C. Swabs were then vortexed for 10 min. A provided empty PowerBead plate was then centrifuged for 1 min at 4,400 Â g, and vaginal swab lysates were added to corresponding wells of the PowerBead plate. Plates containing lysates were centrifuged for 1 min at 4,400 Â g. The plates were then loaded onto a TissueLyser II plate shaker (QIAGEN), firmly secured, and shaken at 17 Hz for 20 min. Plates were then removed from the shaker and immediately centrifuged at 4,400 Â g for 6 min. The supernatant was then carefully transferred 185 mL at a time to a provided collection plate. Following the transfer, 150 mL of solution IRS were added to each well, and plates were incubated at 4°C for 10 min. Plates were centrifuged for 15 min at 4,400 Â g, and the supernatant were transferred to a new collection plate. The plate was centrifuged for 2 min at 4,400 Â g, and 850 mL of the supernatants were transferred to a clean collection plate. The collection plate was loaded onto an epMotion 5075 liquid handler (Eppendorf, Enfield, CT) for further processing following the default onboard protocols. The above procedure yielded between 0.13 and 550 ng/mL purified DNA from the vaginal swabs as measured by a Qubit 3.0 fluorimeter and Qubit double-stranded DNA (dsDNA) assay kit (Life Technologies, Carlsbad, CA), following the manufacturer's protocol. The purified DNA was transferred to the provided 96-well microplates and stored at 220°C.
16S rRNA gene amplicon sequencing and bioinformatic processing. The V4 region of the 16S rRNA gene was amplified from vaginal swab DNA extracts and sequenced at the Michigan State University Research Technology Support Facility (https://rtsf.natsci.msu.edu/) using the dual-indexing sequencing strategy developed by Kozich et al. (143). The forward primer was 515F (59-GTGCCAGCMGCCGCGGTAA-39) and the reverse primer was 806R . Each PCR contained 0.5 mM of each primer, 1.0 mL template DNA, 7.5 mL of 2Â DreamTaq hot start PCR master mix (Life Technologies, Carlsbad, CA), and nuclease-free water to produce a final volume of 15 mL. Reactions were performed using the following conditions: 95°C for 3 min; followed by 30 cycles of 95°C for 45 s, 50°C for 60 s, and 72°C for 90 s; and an additional elongation at 72°C for 10 min.
16S rRNA gene amplicon sequences were clustered into amplicon sequence variants (ASVs) defined by 100% sequence similarity using DADA2 version 1.12 (144) in R version 3.6.1 (145) according to the online MiSeq protocol (https://benjjneb.github.io/dada2/tutorial.html) with minor modifications, as described previously (146). These modifications included allowing truncation lengths of 250 and 150 bases and a maximum number of expected errors of 2 and 7 bases, for forward and reverse reads, respectively. Reads were truncated at the first instance of a quality score less than or equal to 2. Any reads containing ambiguous nucleotides were removed from the data set. To increase the power for detecting rare variants, sample inference allowed for pooling of samples. Additionally, samples in the resulting sequence table were pooled prior to the removal of chimeric sequences. Sequences were then classified using the silva_nr_v132_train_set database with a minimum bootstrap value of 80%, and sequences that were derived from Archaea, chloroplast, or Eukaryota were removed. Per Holm et al. (12), ASVs classified as Shuttleworthia were reclassified manually as Ca. Lachnocurva vaginae.
The R package decontam version 1.6.0 (147) was used to identify ASVs that were likely background DNA contaminants based on their distribution among biological samples and negative controls using the "IsContaminant" method. An ASV was identified as a contaminant and subsequently removed from the data set if it had a decontam P score of #0.5, was present in at least 15% of negative controls with an overall average relative abundance of at least 1.0%, and had a greater average relative abundance in controls than that in biological samples. Based on these criteria, a total of four ASVs, constituting 0.028% of the sequences in the data set, and classified as Escherichia, Pelomonas, Pseudomonas, and Micrococcaceae, were identified as contaminants and were removed.
For assigning community state types (CSTs) to the bacterial community profiles, ASVs were first taxonomically classified using the V4_trimmed_noEuks_nr_Complete.fa reference library supplied with the speciateIT classifier code (https://github.com/ravel-lab/speciateIT) and the classify.seqs command in mothur (148) with a Vaginal Microbiota across Term Pregnancy Microbiology Spectrum bootstrap cutoff value of 80. Read counts for ASVs assigned to the same taxon were then combined, and CSTs were assigned using VALENCIA, a nearest centroid-based classifier (11). Statistical analysis. (i) Analysis of vaginal microbiota composition and structure. To determine the percentage of variance explained (R 2 ) in the composition (Jaccard index) or structure (Bray-Curtis index) of the vaginal microbiota, permutational multivariate analysis of variance (PERMANOVA) analyses (149) were performed using the interaction terms between subjectID and gestational age at sampling within the "adonis2" function in the R package vegan version 2.5-6 (150). The confidence intervals of R 2 statistics were obtained by bootstrap sampling of patients and all their associated longitudinal measurements. Confidence intervals for R 2 statistics for additional patient specific covariates (i.e., maternal age, parity, obesity, race, ethnicity, and Cannabis use) while accounting for gestational age were obtained in a separate analysis in which PERMANOVA analysis was performed on bootstrap samples of subjects. For each subject, only one random longitudinal observation was selected, hence generating cross-sectional data sets in which observations were independent, and hence, PERMANOVA could be applied. Empirical 95% confidence intervals of R 2 statistics were obtained using 1,000 bootstrap iterations.
(ii) Changes in alpha diversity with gestational age and maternal characteristics. The relative abundance for each amplicon sequence variant (ASV) was determined as the ratio of the count of each ASV divided by the total number of ASV counts in each sample. Starting with the relative abundance data, for each sample, we calculated the Shannon and Simpson diversity indices using the "diversity" function in the vegan package and the Chao1 diversity index using the "chao1" function in the fossil package. Each measure of diversity was then correlated with the continuous variable gestational age at sampling using LME models implemented in the lme4 package in R. In these models, a random intercept and a random slope with gestational age were allowed for each subject to account for the repeated and potentially correlated observations from the same subject. Gestational age values were centered at 8 weeks and then scaled by 4 to facilitate the interpretation of random intercepts and convergence of model fitting algorithms. The complexity of the gestational age dependence was assessed by comparing the model fit between a linear and quadratic trend using a likelihood ratio test for LME models. The same test was used to determine the need for subject-specific gestational age slopes. To further inspect nonlinear trends in alpha diversity as a function of gestational age, generalized additive models (GAM) for repeated observations were also fit based on spline transformation of gestational age. Such models were available from the mgcv package in R. The effects of maternal characteristics (maternal age, obesity, parity, race, ethnicity, smoking, and Cannabis use) were assessed by including them as covariates in LME models. A P value of ,0.05 was used to infer significance in these analyses.
(iii) Changes in vaginal community state types (CSTs) with gestational age and maternal characteristics. The log-odds of membership in a given community state type (CST) were modeled using binomial LME models using the glmer function in R. Fixed effects in these models included gestational age at sampling (linear and quadratic terms, as needed) and maternal characteristics. Of note, due to the sparse responses in these models (membership to a given CST), it was not feasible to test whether there were subject-specific departures in the CST membership probability trends versus gestational age (random slopes for gestational age), and yet, subject-specific shifts in membership probabilities were allowed via random intercepts in the mixed effects models.
(iv) Changes in the relative abundance of individual amplicon sequence variants (ASVs) with gestational age and maternal characteristics. The analysis of the relative abundance of each amplicon sequence variant (ASV) in association with gestational age at sampling was performed using LME models based on ASV count data while assuming a negative binomial distribution of the counts. Such models were implemented in the glmmTMB package in R and included an offset term of the total number of reads per sample so that changes in relative abundance with gestational age were being estimated as opposed to differences in absolute counts. These models included gestational age and maternal characteristics as fixed effects and random intercept and random gestational age slope for each subject. All analyses involved control of the false discovery rate at a 10% level (q , 0.1).
Additionally, we implemented analysis of composition of microbiomes (ANCOM) (101) for further differential abundance analysis of ASVs. After adding a pseudocount of one to all observed abundances, ANCOM accounts for the compositionality issue of the microbiome data by performing the additive log ratio (ALR) transformation. For each taxon, ANCOM uses all other taxa, one at a time, as the reference in forming the ALR transformation. The transformed data were treated as the response of the LME model which includes gestational age as the main fixed effect and maternal age, parity, Cannabis use, ethnicity, and race as covariates, while allowing a random intercept and a random slope for each subject. For a given taxon, the output W statistic represents the number of ALR-transformed models where the taxon is differentially abundant with regard to the main fixed effect, after adjusting for multiple testing correction for the number of ALR models corresponding to each taxon. The larger the value of W, the more likely the taxon is differentially abundant between compared sample groups.
(v) Bacterial taxa with the most highly associated relative abundances during normal pregnancy. To assess the association between pairs of ASVs, we modeled their log-transformed relative abundance data using LME models. In these models, one of the two ASVs was treated as a response variable while the other was treated as an explanatory variable. A random effect was allowed for each subject. Naive Spearman correlation coefficients were also calculated for each pair. The significance of correlations was based on an adjusted P value of ,0.05.
(vi) Associations of ASVs across term pregnancy through network analysis. The R packages NetCoMi 1.0.2 (151), SpiecEasi 1.1.2, and seqtime 0.1.1 were used in R version 4.0.3 to create correlation networks between ASVs from vaginal samples across four time points in gestation resulting in term delivery. Only one sample per subject was included for each time point to control for subjectID. Networks were generated using Spearman's correlation since the data were not normally distributed and nonparametric. Only the top 25 predominant ASVs in the entire data set were considered for each network. A topological overlap matrix generated from the network adjacency matrix was utilized as a dissimilarity measure after transforming the data through multiplicative simple replacement and performing a centered log-ratio transformation to account for the zero-inflated data and to normalize the data, respectively. Community structure was determined by implementing the fast greedy modularity optimization algorithm (152). The layout of the network for the first time period was used as the layout for all subsequent time periods. Edges displayed in the network exceeded a threshold of 0.3, and edge thickness was tied to the strength of the correlation between two given nodes. Networks of the four time periods were compared using the NetCoMi package "netCompare" function with 1,000 permutations.
Data availability. The 16S rRNA gene sequencing files for each vaginal sample have been deposited on the National Center for Biotechnology Information Sequence Read Archive (SRA) as BioProject PRJNA895490.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. SUPPLEMENTAL FILE 1, XLSX file, 0.6 MB.