Metatranscriptome Analysis of the Vaginal Microbiota Reveals Potential Mechanisms for Protection against Metronidazole in Bacterial Vaginosis

Bacterial vaginosis is a serious issue for women in their reproductive years. Although it can usually be cured by antibiotics, the recurrence rate is very high, and some women do not respond to antibiotic therapy. The reasons for that are not known. Therefore, we undertook a study to detect the activity of the complete microbiota in the vaginal fluid of women who responded to antibiotic therapy and compared it to the activity of the microbiota in women who did not respond. We found that one of the most important pathogens in bacterial vaginosis, Gardnerella vaginalis, has activated genes that can repair the DNA damage caused by the antibiotic in those women that do not respond to therapy. Suppressing these genes might be a possibility to improve the antibiotic therapy of bacterial vaginosis.

T he healthy vaginal microbiome is characterized by low pH and low diversity and can be categorized into community state types (CSTs) that are dominated by different Lactobacillus spp. such as L. crispatus, L. iners, L. gasseri, and, less frequently, L. jensenii or a more diverse community (1). Bacterial vaginosis (BV) is a frequent multifactorial disease of women in their reproductive years that is characterized by a shift of the Lactobacillus species-dominated bacterial community to a community of various, mostly anaerobic bacteria (2). BV is associated with a higher risk of preterm birth and of acquiring sexually transmitted infections such as HIV (3). The most common bacteria found in BV, identified by 16S rRNA gene sequencing, are Gardnerella, Atopobium, Prevotella, Bacteroides, Peptostreptococcus, Mobiluncus, Sneathia, Leptotrichia, Mycoplasma, and BV-associated bacterium 1 (BVAB1) to BVAB3 of the order Clostridiales. Recently, three CSTs dominated by Gardnerella vaginalis, Lachnospiraceae, and Sneathia sanguinegens, respectively, have been described (4). In our recent clinical study, S. amnii was identified as the best biomarker for BV (5).
The most important pathogen in BV is Gardnerella vaginalis (6). It is currently the only described species in the genus Gardnerella, but genome comparisons suggest that it can be separated into four genetically isolated subspecies (7). While they cannot be resolved by 16S rRNA gene sequencing, the universal target from the chaperonin-60 (cp-60) gene separates the species into the same four subgroups (group A, clade 4; subgroup B, clade 2; subgroup C, clade 1; subgroup D, clade 3) (6,8,9). All four subgroups of G. vaginalis can be detected in the vaginal microbiota of healthy women throughout the menstrual cycle (10). Subgroups A and C define distinct CSTs in health (11). Isolates from the four subgroups of G. vaginalis differ in their virulence as well as in their resistance against metronidazole. The sialidase activity of G. vaginalis is an important virulence factor, and it was detected in all isolates from subgroup B and in a few isolates from subgroup C but not in isolates from subgroups A and D (12). The presence of sialidase activity is used for diagnosis of BV in a commercial kit (13). Resistance against metronidazole was found in isolates from subgroups A and D, while those from subgroups B and C were highly susceptible (14).
Metronidazole is a widely applied chemotherapeutic agent used to treat infectious diseases caused by anaerobic bacteria, and it is the first-line antibiotic for treating BV (15,16). Metronidazole is a prodrug which requires enzymatic reduction within the cell, which occurs under anaerobic conditions only, to transform it into an active form (17). Activated metronidazole acts by covalently binding to DNA, disrupting its helical structure and causing single-and double-strand breaks that lead to DNA degradation and death of the pathogens (17). Resistance can therefore be mediated by lack of activation of the prodrug, or by repair of DNA damage, and has been studied in various pathogens. In Helicobacter pylori and Campylobacter spp., ferredoxin, ferredoxin/ ferredoxin-NADP reductase (FNR), and nitroreductase contribute to metronidazole resistance (17). In Bacteroides fragilis, genes responsible for DNA repair such as recA and the recA-mediated autopeptidase (Rma) gene and a gene named the nitroimidazole resistance (nim) gene encoding a nitroimidazole reductase were shown to confer resistance against metronidazole (18,19). Failure of BV treatment by metronidazole is relatively rare (5,20). It is unclear if it is caused by resistance of the BV pathogens to metronidazole and which mechanisms are acting in vivo. A recent study has demonstrated that failure of treatment of BV with metronidazole is not associated with higher loads of G. vaginalis and Atopobium vaginae (21). Isolates from G. vaginalis subgroups A and D are intrinsically resistant against metronidazole, but the underlying mechanism is unknown (14).
Until now, the majority of studies regarding the vaginal microbiota have focused on 16S rRNA gene sequencing, answering only questions on the taxonomic composition of bacterial communities and not on their functions (2). A metatranscriptome analysis comparing vaginal swabs from two women with BV to vaginal swabs from two healthy subjects showed that L. iners upregulates transcription of the cholesterol-dependent cytolysin (CDC) and of genes belonging to the clustered regularly interspaced short palindromic repeat (CRISPR) system in BV (22). No study has investigated the activity shifts of the vaginal microbiota during antibiotic treatment of BV.
We had previously analyzed the vaginal microbiota in the context of a clinical trial using 16S rRNA gene sequencing (5). Of 37 patients diagnosed with BV and included in this study, 31 were initially cured by a single oral dose of metronidazole. Six patients did not respond; i.e., they were still diagnosed with BV according to the Nugent score after antibiotic therapy. Here we asked if differences in the activity of the microbiota might be responsible for the lack of response in those six patients. We therefore analyzed their metatranscriptomes at the time of diagnosis of BV (visit 1) and after treatment with metronidazole (visit 2) and compared them to those of 8 patients that responded to treatment according to the Nugent score.
The high rate of recurrence is another crucial problem for BV treatment. The 1-year recurrence rate of BV ranges from 40% to 80% after therapy with metronidazole (23) or clindamycin cream (24). CSTs dominated by L. iners might have an increased probability to shift to a dysbiotic state (22,25,26). In the second part of our study, we therefore followed the activity of the microbiota of four of the patients that initially responded to metronidazole treatment over a period of 3 months (visits 3 to 5) and analyzed gene expression of L. crispatus and L. iners in vivo.
We show the importance of G. vaginalis for BV, which can be massively underestimated using 16S rRNA gene sequencing. The relative abundances of the four subgroups of G. vaginalis could be determined in responders and nonresponders. Transcripts potentially leading to a lack of response to metronidazole treatment were identified. CRISPR-Cas genes are suggested to be a novel mechanism of G. vaginalis to mitigate the DNA-damaging effect of metronidazole. L. iners highly expressed genes for pore-forming toxins in vivo, and the transcripts that were most highly expressed in L. crispatus in vivo encoded enzymes for D-lactate and hydrogen peroxide production.

RESULTS
Study population and overview of sequencing results. We studied the vaginal microbiome of 14 patients before, during, and after metronidazole treatment of BV using metatranscriptome sequencing (Fig. 1A). Patients were part of a clinical trial described elsewhere (5). In the first part of the study, we analyzed samples from two time points (diagnosis of BV, visit 1) and after metronidazole treatment (visit 2). Eight patients responded to treatment, and six patients did not respond to treatment with the antibiotic and thus were still BV positive according to the Nugent score at visit 2.
In the second part of our study, three additional time points were analyzed for four of the patients that initially responded to metronidazole therapy, covering a total period of 3 months. Those four patients belonged to the lactic acid arm of the clinical study. Two of them experienced recurrence, while the other two were stably non-BV after treatment according to the Nugent score (see details in Data Set S1, sheet 1, in the supplemental material). In total, we analyzed 40 vaginal fluid samples, totaling 22 with BV status and 18 without. Metatranscriptome sequencing resulted in a total of 1,879,945,342 reads. Of these, 1,377,516,082 reads (73%) were left after quality filtering and removal of rRNA (Data Set S1, sheet 2). On average, 34 million reads were analyzed per sample.
Construction of the reference genome and gene databases for taxonomic and activity profiling. Human reads comprised~11% (BV) versus~56% (non-BV) of the total putative mRNA reads based on the standard Kraken (27) database (Data Set S1, sheet 2). This suggests that the bacterial load is much lower in non-BV than in BV since the human contamination is much higher in non-BV. Using the standard Kraken database, only 41% of total putative microbial (nonhuman) mRNA reads could be assigned taxonomically (Data Set S1, sheet 2). To improve the fraction of taxonomically assignable reads, we then constructed a refined database (ref_Genome) which combined the urogenital subset of the Human Microbiome Project (HMP) (28) database (147 genomes) and all species which are not included in the urogenital subset of the HMP database but are detected by the standard Kraken database with an abundance of Ͼ1% (7 genomes). We also added S. amnii and S. sanguinegens, which had previously been shown to be highly abundant based on 16S rRNA gene sequencing (5) but were not contained in either the HMP database or the standard Kraken reference database. There are four G. vaginalis strains in the HMP database, of which one belongs to subgroup A and three belong to subgroup C. Given the importance and high intraspecies diversity of G. vaginalis, we added 5 additional G. vaginalis genomes based on the genome tree reported in the NCBI database and the completeness of the genome assembly; those five strains cover all four subgroups. We added the genomes of Gardnerella sp. strain 26-12 and Gardnerella sp. strain 30-4, which were isolated from the bladder recently (29). They were classified into G. vaginalis subgroup A based on sequence homology (29). In total, the database contained 163 bacterial genomes from 105 species (Data Set S1, sheet 3). Using the database, the rate of taxonomically classified putative microbial mRNA reads could be improved to 86% on average (Data Set S1, sheet 2). The subspecies composition of the G. vaginalis subcommunity. Species with an average relative abundance lower than 0.5% were grouped into "Others." A red dot at the top of a sample column in panel B indicates BV. The digits indicate the patient ID, while the letters a and b denote visit 1 and visit 2. After the first sampling at visit 1, the patients were treated with metronidazole. Total putative bacterial mRNA reads were mapped to the ref_Genome database using Kraken (see Materials and Methods for details). BV status was determined by Nugent score. The "BV" and "NBV" in parentheses in panel C indicate BV and non-BV, respectively.
For functional assignment, we constructed a reference gene database (ref_Gene) (Data Set S1, sheet 4). It was based on the same genomes as the ref_Genome database, except that the seven additional Gardnerella species genomes were not included because of the low quality of the annotation of coding sequences. The ref_Gene database contained 301,323 genes. To investigate the activity shifts of the communities, we mapped the cleaned metatranscriptomic reads to the ref_Gene database using Burrows-Wheeler Aligner (BWA). In total, 78% of total putative microbial mRNA reads could be mapped to the ref_Gene database. Per sample, an average of 8.9 million microbial mRNA reads could be mapped with a mapping quality (MAPQ) value of Ͼ10 (Data Set S1, sheet 2).
Shifts in the taxonomic composition of the active community following metronidazole treatment. The taxonomic composition of transcripts was determined using Kraken and the ref_Genome database. Figure 1B shows that in all communities with BV status, the most abundant species were G. vaginalis, A. vaginae, S. amnii, and Prevotella timonensis. In the posttreatment communities, the metatranscriptomes from responders (non-BV; Nugent score of Ͻ6) were dominated by L. crispatus, L. iners, and L. jensenii, representing typical CSTs of the healthy female microbiota.
On average, fewer than 14 species contributed Ͼ90% of the mapped reads in BV and 3 species accounted for Ͼ90% of the mapped reads in non-BV communities (Fig. 1C). The individual dominance plots showed the same pattern, where 10 species contributed Ͼ90% of the metatranscriptomes for most BV patients. In non-BV subjects, this number was 2 for most patients and the dominance curves were extremely steep. For comparison, in the periodontal metatranscriptome, more than 100 species were required to cover 90% of mapped reads (30). These data show that the active microbiota in BV is much less diverse than is suggested by 16S rRNA gene sequencing.
G. vaginalis was the most dominant active species in BV. To estimate the relative abundances of the four subgroups of G. vaginalis, we extracted all reads assigned to G. vaginalis from the metatranscriptomes and assigned them to the four strains representing subgroups A, B, C, and D (409-05, 00703Bmah, HMP9231, and 00703Bmash, respectively) using Kraken. For this analysis, we used samples from visit 1 where G. vaginalis reads comprised at least 20% of all reads, which included all 6 patients without response to treatment and 5 of the 8 patients that responded to treatment. Figure 1D shows that, on average, Ͼ95% of the reads could be mapped to the four subgroups and that, on average, only 7% were assigned ambiguously. In those patients that did not respond to treatment, subgroups A and D comprised 68.5% Ϯ 17.2% of all reads, while they accounted for 30.5% Ϯ 29.3% of all reads in patients that responded to treatment (Wilcoxon test P ϭ 0.0520). We observed that Gardnerella spp. previously isolated from the bladder (Gardnerella sp. strain 26-12 and Gardnerella sp. strain 30-4) (29) contributed on average 6% of all taxonomically assigned reads in BV (see Fig. S1 in the supplemental material).
Comparison of the taxonomic compositions of vaginal fluid samples between metatranscriptome and 16S rRNA gene sequencing. We compared the taxonomic compositions determined using 16S rRNA sequencing previously (5) to the taxonomic composition of the metatranscriptome determined here by Kraken with the ref_Genome database. In non-BV samples, we did not observe large differences for the four most abundant species (Data Set S1, sheet 6), while in BV samples, large differences between the two data sets were found. Figure 2 shows the top 12 most abundant taxa identified using 16S rRNA gene sequencing or metatranscriptomics. Most of the abundant species identified in the mRNA sequencing data set were also identified using 16S rRNA gene amplicon sequencing, although usually at different abundances. For example, A. vaginae comprised 13% of all reads based on 16S rRNA gene sequencing but only 11% in the metatranscriptome data set. The most pronounced difference was observed for G. vaginalis, which comprised, on average, 47% of the relative abundance in the metatranscriptome and, on average, only 5% in the 16S rRNA sequencing data. Several additional differences were found. Higher-level taxa such as Veillonellaceae (family) or Parvimonas (genus) are not listed among the top 12 taxa of the metatranscriptome, because mapping occurred to the species level, and the abundance of individual species of these higher-order taxa was too low for them to be found among the top 12 taxa (Data Set S1, sheet 6). BVAB2 is readily detected by PCR and is an important indicator of BV, but it has not yet been cultivated and so there is no genome available to map the reads against.
Global community profiling in non-BV and BV. In order to profile the function of the communities, all cleaned putative mRNA reads (Data Set S1, sheet 2) were mapped using BWA onto the ref_Gene database annotated with KEGG ortholog (KO) genes. We used principal-component analysis (PCA) to visualize the difference between the microbiota in BV and the microbiota in non-BV on the levels of taxonomy (16S rRNA gene) (Fig. 3A), taxonomic composition of expressed genes (metatranscriptome) (Fig. 3B), and functional annotation of transcripts to KEGG orthologues (KO genes) (Fig. 3C). Figure 3A shows that the non-BV communities form a tight cluster on the level of the 16S rRNA sequencing, while the BV communities vary, in accordance with the studies using amplicon sequencing of BV. L. iners, Prevotella spp., G. vaginalis, A. vaginae, and S. amnii drive the separation between non-BV and BV. On the level of the taxonomic composition of the metatranscriptomes (Fig. 3B), this pattern was reversed; samples from non-BV communities were much more heterogeneous than those from BV communities. The non-BV communities clustered into two groups dominated by L. iners and L. crispatus, respectively, whereas G. vaginalis, A. vaginae, and S. amnii were abundant in the BV communities. This reversal is even stronger on the level of KO genes (Fig. 3C). Samples from BV form a tight cluster, while those from non-BV vary widely. The pattern is opposite that found for the phylogenetic marker gene. The KO genes that contributed most to these differences in the non-BV communities were phosphofructokinase isozyme gene pfkA (31) and ribosomal protein coding genes rpsI, rpmF, and rplU (32,33). In the BV communities, the msmE, cycB, and pflD genes that encode proteins involved in carbohydrate uptake and metabolism (34,35) were stably more highly expressed.
In vivo expression of putative metronidazole resistance-associated genes in G. vaginalis. To clarify the possible contribution of genes related to metronidazole resistance in Gram-positive pathogens to the differences in the responses to treatment of the vaginal microbiota, we examined their expression (Data Set S1, sheet 9) in G. vaginalis. For this analysis, BV communities from 11 patients at visit 1 were analyzed, and the level of G. vaginalis transcripts was Ͼ20%. Six of these patients did not respond to treatment, and five responded. Although A. vaginae and S. amnii are also key players in BV, we could not analyze them here since there were too few samples dominated by them. As shown in Fig. 4A, there was no clear expression pattern for most of these genes (detailed data are provided in Data Set S1, sheet 9). The only significantly changed gene expression was that of the gene encoding ferredoxin, which was less active in G. vaginalis in nonresponding patients (fold change ϭ 1.67; Wilcoxon test P ϭ 0.00866 based on relative abundance [read count of given genes of G. vaginalis/read count of G. vaginalis percentage]). CRISPR-associated protein coding genes of G. vaginalis were strongly upregulated in vaginal fluids of patients not responding to treatment. We then performed a global analysis of differential expression (DE) of KO genes of G. vaginalis in the same communities (visit 1, 11 BV samples with Ͼ20% transcripts from G. vaginalis, including 6 patients that did not respond to treatment and 5 that responded). We observed that there were 9 KO genes highly upregulated with a false-discovery rate (FDR) of Յ0.05 (log 2 fold change of up to 9.46) in communities without response. Strikingly, among the most strongly upregulated KO genes, seven were cas genes (36) (Fig. 4B). In total, there were 8 different G. vaginalis CRISPR-associated (Cas) genes found in the genomes, namely, cas1 to cas3 and casA to casE, of which 7 (cas1 to cas3 and casA to casD) were upregulated.
There were 9 KO genes that were downregulated, but the fold change values were not as high as for the upregulated genes. The fucP (fucose permease) gene was identified as the most strongly downregulated gene, with a log 2 fold change of Ϫ4. 24.
Time course of activity profiles and recurrence. In the second part of our study, we analyzed the metatranscriptome of vaginal fluid samples from four of the patients that initially responded to therapy with metronidazole for the complete duration of the clinical trial. Two of these patients experienced recurrence of BV, and two remained stably non-BV. Five time points were analyzed, the first two of which are shown in Fig. 1A as described above. They represented acute BV at the time of diagnosis (visit 1) and after metronidazole therapy (visit 2). Here, we also show data from visits 3 to 5, which were all visits by non-BV subjects, with the exception of recurrence at visit 5 in patient 04_001 and at visit 3 in patient 06_004. Figure 5A shows the taxonomic composition of the communities. L. crispatus dominated the microbiota in one of the two patients that stably maintained a non-BV status, and L. iners dominated the microbiota in the other. The results of principal-component analysis of the activity profiles are shown in Fig. 5B. In acute BV, samples from all four patients clustered together (red circle). After the treatment, the data corresponding to samples from patient 08_006, who was stably non-BV, moved into a very dense and distinct cluster (illustrated by the arrow 1 and enclosed by a green circle). Samples from patient The expression values were calculated based on the relative abundances of reads mapped onto G. vaginalis using BWA. "NR1" (No response 1) indicates the BV samples from six patients that did not respond to metronidazole treatment; "WR1" (With response 1) represents the BV samples from four patients who afterward responded to metronidazole. The dot plot illustrates the log2 fold change (log2FC) values of the corresponding activities in comparisons between G. vaginalis from nonresponders and G. vaginalis from responders. The values in the heat map were scaled using the Z-score. In the figure key, "exp." indicates the relative expression level. (B) "NR1" samples were compared with "WR1" samples. KO genes with an FDR of Յ0.05 are colored in red or turquoise (significantly differentially regulated), while those with an FDR of Ͼ0.05 are in colored in gray. (Continued on next page) Activity Profiling of the Vaginal Microbiota 13_019, who also remained stably non-BV, moved to a different cluster after treatment, shown by arrow 2 and encircled in blue.
The activity shifts in patient 06_004, who experienced recurrence at visit 3, were especially noteworthy. After treatment, the community moved toward an activity profile distinct from all others (arrow 3). The recurrence of BV caused the community to shift back to the BV cluster (red circle, arrow 4). At visit 4, the community moved to the non-BV cluster (blue circle) and the patient became non-BV according to the Nugent score. We speculate that there was an unknown intervention after visit 3 that changed the microbiome but which was not recorded. Interestingly, the other case of recurrence (patient 04_001) had a different progression. From visit 2 to visit 4, patient 04_001 was non-BV and these samples clustered together in the "non-BV" cluster indicated by the blue circle. At visit 5, however, patient 04_001 had recurrent BV and the community shifted back again to the BV cluster.

Transcriptomics of L. iners and L. crispatus in vivo.
The stable colonization of the vaginal fluid of two patients with either L. iners or L. crispatus that responded to antibiotic therapy allowed us to profile their gene expression in vivo to gain more understanding of their different roles in the vaginal microbiota. We extracted the reads mapped on L. crispatus in patient 08_006 (time points b to e in Fig. 1A) and L. iners in patient 13_019 (time points b to e) and performed a differential expression (DE) analysis using edgeR to compare their activity profiles based on their KO genes, comparing the expression of KO genes of L. crispatus in L. crispatus-dominated samples (n ϭ 4) with the expression of KO genes of L. iners in L. iners-dominated samples (n ϭ 4). The Venn diagram in Fig. 5C shows that the two species share 569 KO genes, while 58 are unique to L. iners and 244 are unique for L. crispatus, indicating that L. crispatus possesses a far greater number of diverse functions than L. iners. The DE analysis identified 654 significantly differentially expressed KO genes, of which 393 were upregulated in L. crispatus (Data Set S1, sheet 8). Among the top 100 most differentially expressed KO genes in terms of FDR value, 64 were upregulated in L. crispatus. Remarkably, genes encoding enzymes involved in the production of H 2 O 2 (pyruvate oxidase, NADH oxidase, glycolate oxidase) (37,38) were highly expressed in L. crispatus (Data Set S1, sheet 8, KO genes colored in blue). The D-lactate dehydrogenase gene (K03778) was the most highly expressed gene in L. crispatus (log 2 counts per min [log 2 CPM] ϭ 13.3) (Data Set S1, sheet 8, colored in light green), and this gene is absent in the genome of L. iners. On the other hand, we found that inerolysin (INY) was highly expressed (log 2 CPM ϭ 9.8) in L. iners but was absent in the genome of L. crispatus (Data Set S1, sheet 8, colored in red). Interestingly, we also found the gene orthologous to the inerolysin gene known as the vaginolysin gene (K11031) (39) to be highly expressed in G. vaginalis (details in Data Set S1, sheet 7). Hemolysin C, another pore-forming toxin, was highly expressed (log 2 CPM ϭ 9.8) in L. iners but absent in L. crispatus.

DISCUSSION
The aim of this study was to identify activity patterns in the vaginal fluid microbiota in BV and after metronidazole therapy. In particular, we compared the transcriptional profiles of G. vaginalis in the vaginal microbiotas of patients who did and did not respond to metronidazole treatment. This is the first study to have investigated the activity alterations of the vaginal microbiota in patients with BV during treatment with the antibiotic metronidazole using the metatranscriptomics approach. We found several changes in gene expression in nonresponding patients that might contribute to resistance against metronidazole by either not activating the prodrug or repairing DNA damage.
G. vaginalis was the most dominant active species in BV. G. vaginalis can be divided into four phylogenetic subgroups which may in the future be described as subspecies and which differ in virulence and in susceptibility to metronidazole (6,12). We found transcripts from all four subgroups in all patients, as previously shown based on sequencing of the universal target cp-60 gene (6,9,11). Interestingly, in those patients that did not respond to treatment, Gardnerella subgroups A and D, whose members are resistant to metronidazole (14), were slightly more abundant.
Sequencing of phylogenetic marker genes such as the 16S rRNA gene or the cp-60 universal target is a fast and sensitive method to profile the microbiota composition, but it does not provide functional information and is prone to PCR bias. Moreover, DNA from dead cells might also be detected. Therefore, we compared the taxonomic composition of the transcripts with that of the 16S rRNA genes determined previously in those samples (5). We observed that G. vaginalis comprised on average 47% of all transcripts in BV, while only 5% of 16S rRNA genes were assigned to this species. This suggests that G. vaginalis is transcriptionally more active than other vaginal bacteria; moreover, the commonly used 27F primer was previously shown to underrepresent G. vaginalis (40). Other differences between the two methods are caused by the low taxonomic resolution of the 16S rRNA gene, especially of short amplicons, where a large fraction of 16S rRNA reads is assigned to higher-level taxa, e.g., genus or family. In contrast, the metatranscriptome reads are mapped to genomes and so have specieslevel resolution. Finally, transcripts can only be mapped if a genome is available. If the species in question has not yet been cultivated, as, for example, in the case of the BVAB strains, then reads cannot be assigned. In the periodontal pocket microbiota, about 50% of all reads cannot be mapped to any bacterial genome (30). In contrast, the vaginal microbiota is much less diverse and most of its representatives have been cultivated; using the improved ref_Genome database, we were able to map 86% of all reads, indicating that uncultivated taxa did not contribute very significantly to the active community in BV and after metronidazole therapy.
Low diversity in health and high diversity in BV are hallmarks of BV, and it is so striking that it has even been suggested to use diversity indices based on PCR-amplified 16S rRNA genes in addition to the clinical diagnosis based on Amsel criteria and the Nugent score (4,(41)(42)(43). Our comparison between communities in BV and after metronidazole therapy on the levels of (i) the 16S rRNA gene, (ii) the taxonomic composition of total transcripts, and (iii) functional profiling based on KO genes shows a reversal of this observation: BV communities, although highly diverse on the taxonomic level, cluster tightly together on the functional level of KO genes. In contrast, non-BV communities are similar on the taxonomic level but are highly diverse among individuals on the functional level.
In our metatranscriptome analysis we found evidence for mechanisms that hinder the activation of the metronidazole prodrug or that mitigate the damage that metronidazole inflicts on DNA and thus could be important reasons for the lack of response in some women.
We show that the ferredoxin gene of G. vaginalis was less active in those patients that did not respond to metronidazole. As an electron carrier, ferredoxin is downregulated in H. pylori bacteria grown in the presence of metronidazole (17). It is required for activation of the prodrug in H. pylori (44) and might have a similar role in G. vaginalis. Lack of response might result from lack of activation of the prodrug. Unexpectedly, the nitroimidazole resistance (nim) gene, which has been shown to mediate resistance to metronidazole in B. fragilis by transforming metronidazole to a nontoxic amino derivative (16), was not highly expressed in nonresponders. This could have been due to technical problems, since the Nim protein sequence contains only partial coding DNA sequences (CDS) (https://www.ebi.ac.uk/ena/data/view/AGN03877). Moreover, nim-negative strains of B. fragilis can tolerate high levels of metronidazole, indicating the importance of other mechanisms of resistance (16).
Remarkably, cas genes of G. vaginalis were highly upregulated in samples from patients that did not respond to metronidazole treatment. The CRISPR-Cas genes are present in about half of all Bacteria and most Archaea (45); they represent a mechanism of adaptive immunity which protects the prokaryotic cell against foreign DNA and has been developed into a universal tool for genome editing (46). The cas genes of G. vaginalis belong to the Escherichia coli subtype and were found in about half of the clinical isolates (36). Their upregulation might reflect increased phage attacks in BV. Phages have been hypothesized to be crucial for the etiology of BV by causing the collapse of Lactobacillus populations (47); accordingly, L. iners upregulates its CRISPR-Cas system in BV (22). More than 400 annotated prophage sequences were found in 39 Gardnerella strains (29). They might be induced to enter the lytic cycle by the change in pH accompanying the shift to BV. However, the viral transcripts contributed 0.1% of the total metatranscriptome in both nonresponders and responders before treatment.
The upregulation of CRISPR-Cas system genes in G. vaginalis from those patients that did not respond to treatment by metronidazole suggests that the CRISPR-Cas system might have a role in mitigating the DNA-damaging effect of metronidazole. In addition to providing adaptive immunity, CRISPR-Cas systems can have various additional functions (48), and it was previously shown that they can protect the cell against DNA-damaging agents (49). The Cas1 enzyme of E. coli (YgbT) physically and genetically interacts with the DNA repair system (RecBC, RuvB) and is recruited to DNA doublestrand breaks; moreover, YgbT is necessary for resistance of E. coli to DNA damage caused by the genotoxic antibiotic mitomycin C or UV light (49). Our findings suggest that the CRISPR-Cas system may protect the vaginal microbiota against the DNAdamaging effect of metronidazole. If experimentally confirmed, this finding might open a new path for fighting bacterial resistance against DNA-damaging agents. For example, it would be worth testing if the susceptibility to metronidazole can be modified in Gardnerella isolates and possibly other vaginal pathogens according to the expression level of cas genes. It is not known how upregulation of cas genes is regulated in the vaginal microbiota. It might be a response to phage attack; thus, by suppressing cas genes, the susceptibility to phages might be increased simultaneously with the susceptibility to metronidazole. Using CRISPR engineered phages for therapy of dysbiotic communities has been considered to be one of many options of new therapeutic strategies based on a deeper understanding of the human microbiome (50).
L. iners and L. crispatus dominate their respective CSTs in the healthy vaginal microbiota. There were many factors observed by laboratory or genomic studies (37,51,52) which suggest that more protection against dysbiosis is provided by L. crispatus than by L. iners. Here we analyzed which genes are actually highly expressed in vivo; we observed that genes encoding proteins for the production of H 2 O 2 and D-lactic acid were highly expressed in L. crispatus. H 2 O 2 inhibits BV-associated bacteria, but it has been questioned if its level in the vaginal milieu is high enough, and it was suggested that lactic acid is more protective (53). In L. iners, the genes for the pore-forming toxins inerolysin and hemolysin C were highly active, supporting the hypothesis that L. iners may play an ambiguous role in the vaginal econiche and is associated with vaginal dysbiosis (26,54).
Conclusions. This first study of the in vivo transcriptional activity of the vaginal fluid microbiota during metronidazole treatment of BV focused on possible reasons for the lack of response to antibiotic therapy in some patients. Genes related to activation of the prodrug and repairing the DNA damage caused by metronidazole were shown to be differentially expressed in responders and nonresponders. A completely new role for Cas proteins is hypothesized which warrants closer inspection and may help to develop more-efficient novel therapies to improve the treatment of BV. achieve reliable alignments. Reads that mapped with a mapping quality score (MAPQ) lower than 10 were excluded. MAPQ contains the Phred-scaled posterior probability that the mapping position is wrong (59).
KEGG ortholog (KO) gene annotation of ref_Gene database. The ref_Gene database was annotated using KEGG prokaryote protein sequences. The KEGG prokaryote protein sequence database represents a nonredundant protein data set of Bacteria and Archaea at the species level and contains about 7 million nonredundant peptide sequences grouped into 14,390 distinct KO genes. A KO gene contains several genes from different species with similar functions. DIAMOND (60), a much faster alternative to BLASTX, was applied to map the ref_Gene sequences against the KEGG prokaryote protein sequence database with its "more sensitive mode." To obtain reliable annotation, only alignments with sequence identity of Ն50 and E values of Յ1eϪ5 and query coverage of Ն70% were taken into account. By annotating the genes in the ref_Gene database to KO genes, we were able to determine the expression profile of KO genes and to investigate the activity shifts in BV based on differential expression analysis of KO genes. The cumulative dominance analysis and PCA were carried out using primer 7 (61).
Differential expression (DE) analysis. All differential expression (DE) analyses were performed using the R package edgeR (62). The Benjamini-Hochberg (BH) method was used to correct the P value of the results of DE analysis with the false-discovery rate (FDR) for multiple comparisons. Genes with a FDR of lower than 0.05 were considered significantly differentially regulated. The sample groups defined for each comparison are listed in Data Set S1, sheet 1.
Detection of putative metronidazole resistance-related genes. To detect the expression of previously reported putative metronidazole resistance genes such as recA and the recA-mediated autopeptidase (Rma), peroxiredoxin, nitroimidazole resistance protein (NIM), ferredoxin/ferredoxin-NADP reductase (FNR), nitroreductase, and ferredoxin genes, we examined the expression level of these genes for G. vaginalis in the vaginal community from patients without response to metronidazole treatment (n ϭ 6) as well as in the vaginal community from patients with response (n ϭ 4). As most of these genes do not have corresponding KO genes, we annotated the ref_Gene database based on the sequences of these genes using BLASTN. The sequences were retrieved from ENA by key words of each of "ferredoxin,Љ ЉNADPH flavin oxidoreductase,Љ Љnitroreductase,Љ Љperoxiredoxin,Љ Љpyruvate ferredoxin oxidoreductase,Љ ЉrecA,Љ and Љnitroimidazole resistance" plus ЉG. vaginalis.Љ In total, 155 unique sequences of G. vaginalis were obtained for the annotation of ref_Gene database. The identification of duplicate sequences was done by SeqKit (63).
Data availability. The sequencing data have been deposited in the European Nucleotide Archive with accession number PRJEB21446.