concordantly: CST 1, Lactobacillus crispatus-dominant; CST 2, Lactobacillus gasseri-dominant; CST 3, Lactobacillus iners-dominant; CST 4, diverse community; and CST 5, Lactobacillus jensenii-dominant. The pregnancy-associated communities at the other body sites (stool, saliva, and tooth/gum) could not be represented by a small number of discrete CSTs.

The Dynamic Network of the Vaginal Communities During Pregnancy Reveals Strong Variation in CST Stability and Interconnectedness. Intraindividual vaginal community states were generally stable on the time scale of weeks (Fig. 3A). However, substantial interindividual variability was observed with respect to both the most prevalent CST and the frequency of inter-CST transitions. Some subjects (e.g., T11) stably maintained a single CST throughout gestation, whereas other subjects (e.g., T6) exhibited relatively frequent transitions between CSTs. Notably, the frequency of interstate transitions did not appear to be associated with either healthy term delivery or preterm delivery.

Because vaginal communities exhibited interstate transitions, we represented vaginal CST dynamics as a Markov chain. Fig. 3B presents a Markov chain generated by inferring inter-CST transition probabilities from our data (SI Appendix, Table S4). Our model indicated that the four Lactobacillus-dominated CSTs (CSTs 1, 2, 3, and 5) were more stable (had higher self-transition probabilities) than the diverse CST (4). This finding is qualitatively similar to the observations of Gajer et al. of CSTs in nonpregnant women (25); however, the Lactobacillus-dominated CSTs were more stable in our cohort (SI Appendix, SI Discussion).

The structure of the observed inter-CST transition patterns also is of interest. CST 2 (L. gasseri-dominated) had the fewest connections.
Indeed, in our cohort, CST 2 was not observed to be reachable from any other CST, and when CST 2 transitioned to another state, it transitioned only to CST 1 (L. crispatus-dominated). In contrast, CST 4 was the most interconnected and was the only state exhibiting bidirectional transitions with three other CSTs (all except CST 2).

The High-Diversity Vaginal CST Was Associated with Preterm Birth. The observed CSTs exhibited substantially different strengths of association with preterm birth (Fig. 3B and SI Appendix, Table S5). In particular, the Lactobacillus-poor CST 4 community exhibited a much stronger association with preterm delivery than did any of the Lactobacillus-dominated CSTs (1–3, 5). We next explored the relationship of preterm birth with temporal features of the CST 4 community. We found the duration and proportion of time during which a woman’s vaginal community remained in CST 4 to be associated with preterm birth. Fig. 4A demonstrates that CST 4 prevalence is correlated with an earlier gestational age at delivery (P = 1.1 × 10⁻⁴, Pearson; P = 0.015, Spearman). This correlation remained significant after correcting for the effect of white or nonwhite race (P = 2.5 × 10⁻⁴, Pearson; P = 0.046, Spearman). This association between CST 4 and preterm birth was present at every time window during gestation, suggesting that the value of CST 4 in predicting preterm birth begins early in pregnancy (Fig. 4B).

Gardnerella and Ureaplasma Abundances Stratify Preterm Risk for Women with the High-Diversity Vaginal CST. We tested CST 4 samples from the first group of subjects (n = 40) for associations between the relative abundances of individual taxa and preterm delivery. The nonindependence of samples from the same subject, combined with heterogeneity in the number and timing of samples from different subjects in our study, complicated this comparison.
Therefore, we tested under the two extreme but contrary models of sample dependence (complete sample independence and complete dependence of samples within subjects), with the recognition that these two models bound the actual solution. When CST 4 samples were treated independently, both Ureaplasma (Padj = 5 × 10⁻³⁴, Benjamini–Hochberg–corrected Wald test) and Gardnerella (Padj = 1.5 × 10⁻¹³) had strong positive associations with preterm birth. When CST 4 samples within subjects were treated completely nonindependently (by merging the CST 4 samples from each subject before testing), only the association with Gardnerella remained significant (Padj = 0.054). Although caution is warranted because of the small number of subjects, these findings suggest that in the setting of low Lactobacillus abundance, a high abundance of Gardnerella in particular may increase the risk of preterm birth. In addition, Ureaplasma deserves further investigation as a risk factor (SI Appendix, SI Discussion).

Fig. 2. Heat map of the fractional abundance of the 20 most abundant OTUs in the vaginal communities of 40 women sampled longitudinally during pregnancy. Clustering on the abundance profiles of individual samples (n = 761) using the partitioning around medoids algorithm identified five CSTs. CSTs 1, 2, 3, and 5 were characterized by dominant Lactobacillus species that typically account for >90% of the community: L. crispatus, L. gasseri, L. iners, and L. jensenii, respectively. CST 4 was significantly more diverse. Pregnancy outcomes are indicated by the bar at the top: term delivery (gray), >37 gestational weeks; preterm (maroon), <37 wk; very preterm (pink), <32 wk; marginal delivery during the 37th gestational week (white).

[Fig. 3 graphic. (A) CST (key: 1–5) plotted by gestational week for each subject, grouped as P1–P7, M1–M5, and T1–T28. (B) CST nodes 1–5 colored by fraction preterm (scale 0.0–1.0) and labeled with self-transition probabilities 0.979, 0.976, 0.875, 0.684, and 0.831.]

Fig. 3.
Dynamics of the vaginal community during pregnancy. (A) Vaginal CST time course of the 40 subjects from the first subject group. Color indicates CST as shown in the key; the black parenthesis indicates delivery. Subjects P1–P7 delivered preterm (before gestational week 37); subjects M1–M5 were considered marginal (gestational week 37); subjects T1–T28 delivered at term (>37 gestational weeks). Marginal subjects were excluded when calculating associations with preterm birth. (B) Dynamics of the vaginal communities were approximated as a Markov chain with subject-independent transition probabilities between CSTs. Arrow weights are proportional to the maximum-likelihood estimate of the week-to-week transition probabilities between states. Node sizes scale with the number of subjects in which the CST was seen. Color indicates the strength of the association with preterm birth (i.e., the proportion of the specimens from the CST that came from subjects who delivered preterm). The self-transition probabilities are shown numerically.

11062 | www.pnas.org/cgi/doi/10.1073/pnas.1502875112 DiGiulio et al
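
The Markov-chain approximation described for Fig. 3B can be illustrated with a minimal sketch. It assumes that each subject's CST assignments are available as a list with one entry per consecutive week (a hypothetical input layout; how missed weekly samples were handled is not stated here). Under that assumption, the maximum-likelihood estimate of a week-to-week transition probability is the pooled count of transitions from one CST to another, divided by the total number of transitions out of the first CST. The function name and the toy data below are illustrative only and are not the authors' code.

```python
from collections import Counter

CSTS = (1, 2, 3, 4, 5)


def estimate_transition_matrix(series_by_subject, states=CSTS):
    """Estimate first-order Markov transition probabilities by maximum likelihood.

    series_by_subject: dict mapping a subject ID to a list of CST labels,
    assumed to be one label per consecutive week (hypothetical layout).
    Counts are pooled across subjects, i.e. the transition probabilities are
    treated as subject-independent.
    Returns a dict {from_state: {to_state: probability}}; states with no
    observed outgoing transition are omitted because their row is undefined.
    """
    counts = Counter()
    for series in series_by_subject.values():
        # Pair each weekly state with the state observed the following week.
        for current, following in zip(series, series[1:]):
            counts[(current, following)] += 1

    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        if row_total == 0:
            continue  # nothing observed leaving state a
        matrix[a] = {b: counts[(a, b)] / row_total for b in states}
    return matrix


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy = {
        "A": [1, 1, 1, 1],
        "B": [4, 4, 3, 4],
        "C": [2, 2, 1, 1],
    }
    for state, row in estimate_transition_matrix(toy).items():
        print(state, row)
```

The diagonal entries of the returned matrix correspond to the self-transition probabilities used above as a measure of CST stability. A pair of states has a bidirectional connection when both off-diagonal entries between them are nonzero.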