A Combination of Metagenomic and Cultivation Approaches Reveals Hypermutator Phenotypes within Vibrio cholerae-Infected Patients

ABSTRACT Vibrio cholerae can cause a range of symptoms, from severe diarrhea to asymptomatic infection. Previous studies using whole-genome sequencing (WGS) of multiple bacterial isolates per patient showed that V. cholerae can evolve modest genetic diversity during symptomatic infection. To further explore the extent of V. cholerae within-host diversity, we applied culture-based WGS and metagenomics to a cohort of both symptomatic and asymptomatic cholera patients from Bangladesh. While metagenomics allowed us to detect more mutations in symptomatic patients, WGS of cultured isolates was necessary to detect V. cholerae diversity in asymptomatic carriers, likely due to their low V. cholerae load. Using both metagenomics and isolate WGS, we report three lines of evidence that V. cholerae hypermutators evolve within patients. First, we identified nonsynonymous mutations in V. cholerae DNA repair genes in 5 out of 11 patient metagenomes sequenced with sufficient coverage of the V. cholerae genome and in 1 of 3 patients with isolate genomes sequenced. Second, these mutations in DNA repair genes tended to be accompanied by an excess of intrahost single nucleotide variants (iSNVs). Third, these iSNVs were enriched in transversion mutations, a known hallmark of hypermutator phenotypes. While hypermutators appeared to generate mostly selectively neutral mutations, nonmutators showed signs of convergent mutation across multiple patients, suggesting V. cholerae adaptation within hosts. Our results highlight the power and limitations of metagenomics combined with isolate sequencing to characterize within-patient diversity in acute V. cholerae infections, while providing evidence for hypermutator phenotypes within cholera patients.

IMPORTANCE Pathogen evolution within patients can impact phenotypes such as drug resistance and virulence, potentially affecting clinical outcomes. V. cholerae infection can result in life-threatening diarrheal disease or asymptomatic infection. Here, we describe whole-genome sequencing of V. cholerae isolates and culture-free metagenomic sequencing from stool of symptomatic cholera patients and asymptomatic carriers. Despite the typically short duration of cholera, we found evidence for adaptive mutations in the V. cholerae genome that occur independently and repeatedly within multiple symptomatic patients. We also identified V. cholerae hypermutator phenotypes within several patients, which appear to generate mainly neutral or deleterious mutations. Our work sets the stage for future studies of the role of hypermutators and within-patient evolution in explaining the variation from asymptomatic carriage to symptomatic cholera.


Preparing Revision Guidelines
To submit your modified manuscript, log onto the eJP submission site at https://msystems.msubmit.net/cgi-bin/main.plex. Go to Author Tasks and click the appropriate manuscript title to begin the revision process. The information that you entered when you first submitted the paper will be displayed. Please update the information as necessary. Here are a few examples of required updates that authors must address:
• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
For complete guidelines on revision requirements for your article type, please see the journal Article Types requirement at https://journals.asm.org/journal/mSystems/article-types. Submissions of a paper that does not conform to mSystems guidelines will delay acceptance of your manuscript.
The authors generally addressed the reviewers' concerns and I recommend the manuscript be published in mSystems after addressing my remaining concerns. The narrow focus of the revised abstract is a strength. I want to particularly commend the authors for their thoughtful revisions when fielding some of these concerns: the paragraph on iSNVs and co-infection will be helpful for readers.
The paper provides sufficient evidence of its major claim that V. cholerae hypermutators can evolve within hosts by sequencing samples from symptomatic patients and asymptomatic household contacts.
I do have some remaining concerns that the authors should address before publication:
-Regarding the statistics of DNA repair mutations, the statistics should be presented for all subjects, not just Patient F (who has the strongest signal of hypermutation). An example calculation would be the probability of identifying >=5 subjects with a mutation in a DNA repair pathway given the number of mutations in each subject.
-Line 165: "It would be unlikely for random sequencing errors to occur in the exact same four sites on two consecutive days by chance alone therefore these iSNVs are likely true positives." I strongly disagree with this statement. Most false positive iSNVs result from mapping error, and thus are reproducible. Interestingly, mapping errors tend to have reproducible frequency, so I'd be more willing to accept this argument if these reproducible iSNVs changed dramatically in frequency across the timepoints.
-Line 587-590: "We did not detect any iSNVs among the five isolates sequenced from patient 58.00. In contrast, the metagenomic analysis of patient N revealed seven iSNVs (Table 1), suggesting a potentially higher sensitivity for the detection of rare variants, or possibly false-positive iSNVs inferred from metageomic reading mapping compared to isolate sequencing." This could be reworded to state that when only shallow isolate WGS is possible, metagenomics is more appropriate for the detection of iSNVs.
Overall, the authors should take another pass through the manuscript for claims that were toned down in response to the first round of reviews.
Line 252: "This suggests that deleterious mutations in hypermutators could be counterbalanced by adaptive mutations that maintain growth." This statement is too strong given that variation in iRep estimates is not well understood and is probably driven by noise.
Line 279: "Together, these analyses suggest that V. cholerae hypermutators produce NS mutations that are predominantly deleterious or neutral". While this is technically true, it is likely that these hypermutators may have just as many adaptive mutations, but the ability to detect them is drowned out by additional noise.
-Line 337: "Among the other index cases, we found no iSNVs in patient 58.00". Clarify that this is for isolates only.
-mutL mutations lead to an excess of transition (not transversion!) mutations of all types. This is wrong on line 234.
-Figure S2 should have a sense of error, either through plotting of absolute number or with error bars.
-It is super confusing to use two schemes to refer to the same patient (e.g. N and 58.00). This gets particularly problematic in the pan-genome section, when I cannot compare isolate pangenomes to the iSNV data.
-Line 342: How does an isolate have mutations? What this is in reference to should be stated.

The authors adequately addressed my previous comments in this version of the manuscript. I thank the authors for their careful attention to each of the reviewers' comments.

New minor concern:
Line 192-194: "No iSNVs were observed at the same nucleotide position in different patients, suggesting that iSNVs rarely spread by homologous recombination...". I wonder whether this is sufficient evidence? Perhaps this could benefit from additional explanation?

Reviewer #2 (Comments for the Author):
The authors generally addressed the reviewers' concerns and I recommend the manuscript be published in mSystems after addressing my remaining concerns. The narrow focus of the revised abstract is a strength. I want to particularly commend the authors for their thoughtful revisions when fielding some of these concerns: the paragraph on iSNVs and co-infection will be helpful for readers. The paper provides sufficient evidence of its major claim that V. cholerae hypermutators can evolve within hosts by sequencing samples from symptomatic patients and asymptomatic household contacts. I do have some remaining concerns that the authors should address before publication:
Response: We thank the reviewer for their positive assessment, and for the previous round of reviews that significantly improved the manuscript. We address the remaining suggestions as detailed below.
-Regarding the statistics of DNA repair mutations, the statistics should be presented for all subjects, not just Patient F (who has the strongest signal of hypermutation). An example calculation would be the probability of identifying >=5 subjects with a mutation in a DNA repair pathway given the number of mutations in each subject.
Response: Thank you for this suggestion. We have added the following text to the beginning of the Results section on hypermutators: "Assuming that DNA repair genes are of average length and contain an average number of NS sites, we can estimate the one-sided binomial probability that NS mutations occur in the observed number of DNA repair genes in each of these five patients (Table 1). We calculated this probability assuming a binomial success rate of 0.0127 (obtained by dividing 51, the number of DNA repair genes (GO:0006281) by 4007, the total number of genes in the V. cholerae N16961 reference genome). By multiplying the probabilities from each patient, we obtain an overall probability of 0.0023 that we would see the observed number of DNA repair genes with NS mutations in all five patients. This number of patients with mutated DNA repair genes is therefore unlikely to have occurred by chance alone, given the observed number of mutations."

-Line 165: "It would be unlikely for random sequencing errors to occur in the exact same four sites on two consecutive days by chance alone therefore these iSNVs are likely true positives." I strongly disagree with this statement. Most false positive iSNVs result from mapping error, and thus are reproducible. Interestingly, mapping errors tend to have reproducible frequency, so I'd be more willing to accept this argument if these reproducible iSNVs changed dramatically in frequency across the timepoints.
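The binomial estimate quoted in the response on DNA repair genes can be sketched as follows. This is a minimal illustration, not the authors' actual script: only the success rate (51/4007) comes from the response, the per-patient counts are hypothetical placeholders rather than the Table 1 values, and the reading of "one-sided" as P(X >= k) is an assumption.

```python
from math import comb, prod

# Success rate quoted in the response: 51 DNA repair genes (GO:0006281)
# divided by 4007 genes in the V. cholerae N16961 reference genome.
P_REPAIR = 51 / 4007


def binomial_tail(n_genes: int, k_repair: int, p: float = P_REPAIR) -> float:
    """Return P(X >= k_repair) for X ~ Binomial(n_genes, p).

    Assumption: the one-sided probability is the upper tail, i.e. the chance
    of seeing at least k_repair DNA repair genes among n_genes mutated genes.
    """
    if not 0 <= k_repair <= n_genes:
        raise ValueError("k_repair must lie between 0 and n_genes")
    return sum(
        comb(n_genes, i) * p**i * (1 - p) ** (n_genes - i)
        for i in range(k_repair, n_genes + 1)
    )


# HYPOTHETICAL counts for illustration only (not the values in Table 1):
# patient label -> (genes with NS mutations, DNA repair genes among them).
hypothetical_patients = {
    "patient_1": (20, 1),
    "patient_2": (150, 4),
    "patient_3": (10, 1),
}

per_patient = {
    name: binomial_tail(n, k) for name, (n, k) in hypothetical_patients.items()
}
# Multiplying per-patient probabilities treats the patients as independent.
combined = prod(per_patient.values())

for name, prob in per_patient.items():
    print(f"{name}: P(X >= k) = {prob:.4g}")
print(f"product across patients = {combined:.4g}")
```

Substituting the real per-patient counts from Table 1 for the placeholder dictionary would be needed to compare against the reported overall probability of 0.0023.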

Response:
We agree that it is difficult to fully exclude the possibility of sequencing or mapping errors here, and we have adjusted the text to reflect this. As suggested, we checked the minor allele frequencies at these four positions at the two sampled time points (with coverage X):
The frequencies are comparable, except for position 13163, which has relatively low coverage. Therefore, most positions are consistent with the reviewer's hypothesis that mapping errors should occur at similar frequencies. On the other hand, a systematic mapping error would be expected to occur in other samples, not just the two from the same patient. This is not the case, as we observed no nucleotide positions with iSNVs in more than one patient. We therefore adjusted the text as follows, which we believe succinctly captures the uncertainty: "It would be unlikely for random sequencing errors to occur in the exact same four sites on two consecutive days by chance alone, therefore these iSNVs are likely either true positives or systematic (site-specific) sequencing or read mapping errors. However, systematic errors would be expected to be seen in other samples at the same nucleotide positions, which is not the case."

-Line 587-590: "We did not detect any iSNVs among the five isolates sequenced from patient 58.00. In contrast, the metagenomic analysis of patient N revealed seven iSNVs (Table 1), suggesting a potentially higher sensitivity for the detection of rare variants, or possibly false-positive iSNVs inferred from metageomic reading mapping compared to isolate sequencing." This could be reworded to state that when only shallow isolate WGS is possible, metagenomics is more appropriate for the detection of iSNVs. Overall, the authors should take another pass through the manuscript for claims that were toned down in response to the first round of reviews.

Response:
We agree with this suggestion, and have reworded this section as follows: "In contrast, the metagenomic analysis of patient N revealed seven iSNVs (Table 1), suggesting a higher sensitivity for the detection of rare variants which could be easily missed by sequencing only a few isolates. Despite a potentially higher error rate, metagenomics is more appropriate for sensitively detecting iSNVs when only shallow isolate sequencing is possible."
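The point that rare variants are easily missed with only a few isolates can be illustrated with a small sampling calculation. This sketch assumes, as a simplification not stated in the manuscript, that each sequenced isolate is an independent random draw from the within-patient population; the variant frequencies are hypothetical, and only the figure of five isolates comes from the text.

```python
def prob_variant_missed(freq: float, n_isolates: int) -> float:
    """Probability that none of n_isolates carries a variant at frequency freq.

    Assumes isolates are independent random draws from the population.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if n_isolates < 0:
        raise ValueError("n_isolates must be non-negative")
    return (1.0 - freq) ** n_isolates


N_ISOLATES = 5  # five isolates were sequenced from the patient discussed above

# HYPOTHETICAL within-patient variant frequencies, for illustration only.
for freq in (0.01, 0.05, 0.20):
    missed = prob_variant_missed(freq, N_ISOLATES)
    print(f"variant at {freq:.0%}: missed by all {N_ISOLATES} isolates "
          f"with probability {missed:.3f}")
```

Under these assumptions, a variant present in 5% of cells would go undetected in about three out of four sets of five isolates, which is consistent with the argument that deep metagenomic coverage can be more sensitive than shallow isolate sequencing.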

Line 252: "This suggests that deleterious mutations in hypermutators could be counterbalanced by adaptive mutations that maintain growth." This statement is too strong given that variation in iRep estimates is not well understood and is probably driven by noise.
Response: We agree that this statement was too speculative, and we have removed it and replaced it with the following, as suggested: "This lack of association could be due to noisy replication rate estimates from iRep, and could be revisited in larger patient cohorts."

Line 279: "Together, these analyses suggest that V. cholerae hypermutators produce NS mutations that are predominantly deleterious or neutral". While this is technically true, it is likely that these hypermutators may have just as many adaptive mutations, but the ability to detect them is drowned out by additional noise.

Response:
We agree, and have added the following sentence to clarify this point: "This does not exclude the possibility of adaptive mutations in hypermutators, but these are difficult to pinpoint against the overwhelming background of non-adaptive mutations."

-Line 337: "Among the other index cases, we found no iSNVs in patient 58.00". Clarify that this is for isolates only.

Response:
We agree and have modified this sentence as follows: "Among the other index cases, we found no iSNVs in the isolates from patient N"

-mutL mutations lead to an excess of transition (not transversion!) mutations of all types. This is wrong on line 234.
Response: Thank you for catching this error. We have now corrected it as follows: "For instance, it has been shown in other bacterial pathogens that mutations in mutT and mutL lead to strong mutator phenotypes, increasing the rate of A:T→C:G transversions and G:C→A:T transitions respectively (34), which we observed in patients (F and I) containing these mutations (Table 1, Fig. S2)."

-Figure S2 should have a sense of error, either through plotting of absolute number or with error bars.
Response: Thank you for this suggestion. We found that plotting the absolute number of iSNVs made it difficult to compare the patients, which range from 6 to 207 iSNVs in these plots. To make the panels visually comparable while also showing the absolute numbers, we have now added the number of iSNVs to the header of each panel. We believe this now makes the sampling error clear.

-It is super confusing to use two schemes to refer to the same patient (e.g. N and 58.00). This gets particularly problematic in the pan-genome section, when I cannot compare isolate pangenomes to the iSNV data.
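The transition/transversion distinction at issue in the mutT/mutL correction above can be made concrete with a short classifier. This is an illustrative sketch, not the authors' pipeline: the function names and the example list of substitutions are hypothetical. A transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); any other substitution is a transversion.

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    pair = {ref, alt}
    if ref == alt or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Both bases in the same chemical class means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def mutation_spectrum(substitutions):
    """Count transitions and transversions in an iterable of (ref, alt) pairs."""
    return Counter(classify_substitution(ref, alt) for ref, alt in substitutions)


# HYPOTHETICAL list of iSNVs as (reference base, alternate base) pairs.
hypothetical_isnvs = [("A", "C"), ("G", "A"), ("G", "T"), ("C", "T"), ("T", "G")]
spectrum = mutation_spectrum(hypothetical_isnvs)
print(dict(spectrum))
```

In these terms, the A:T→C:G change attributed to mutT defects is a transversion (A to C), while the G:C→A:T change attributed to mutL defects is a transition (G to A).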

Response:
We apologize for this confusion. Patient N (also called 58.00) was the only patient with both a metagenome and isolate genome sequences. For clarity, we now refer to this patient uniquely as Patient N in the manuscript text, Figure 3 (which illustrates the pangenome analysis), and Table S1. We believe this is now clear in the following sentence and the paragraph that follows: "The index case from household 58 (patient N) was the only sample also included in the metagenomic analysis described above, allowing a comparison between culture-dependent and -independent assessments of within-patient diversity."

-Line 342: How does an isolate have mutations? What this is in reference to should be stated.
Response: We have clarified this sentence as follows: "One isolate sampled from this contact had the highest number of mutations seen in any branch in the phylogeny (five NS mutations, all G:C→T:A transversions) relative to its ancestral branch (i.e. to the other isolates from the same person)."

-Supplementary Table 1-Why does each subject have multiple household numbers?
Response: Each row in this table is actually a specific person, but we agree that the lack of ID for some of them makes it unclear. In the updated Table S1, we have added a specific ID and the accession number of the reads for each sample.

Your manuscript has been accepted, and I am forwarding it to the ASM Journals Department for publication. For your reference, ASM Journals' address is given below. Before it can be scheduled for publication, your manuscript will be checked by the mSystems senior production editor, Ellie Ghatineh, to make sure that all elements meet the technical requirements for publication. She will contact you if anything needs to be revised before copyediting and production can begin.
Otherwise, you will be notified when your proofs are ready to be viewed.
As an open-access publication, mSystems receives no financial support from paid subscriptions and depends on authors' prompt payment of publication fees as soon as their articles are accepted.

Publication Fees:
You will be contacted separately about payment when the proofs are issued; please follow the instructions in that e-mail. Arrangements for payment must be made before your article is published. For a complete list of Publication Fees, including supplemental material costs, please visit our website.
Corresponding authors may join or renew ASM membership to obtain discounts on publication fees. Need to upgrade your membership level? Please contact Customer Service at Service@asmusa.org.
For mSystems research articles, you are welcome to submit a short author video for your recently accepted paper. Videos are normally 1 minute long and are a great opportunity for junior authors to get greater exposure. Importantly, this video will not hold up the publication of your paper, and you can submit it at any time.
Details of the video are:
· Minimum resolution of 1280 x 720
· .mov or .mp4 video format
· Provide video in the highest quality possible, but do not exceed 1080p
· Provide a still/profile picture that is 640 (w) x 720 (h) max
· Provide the script that was used
We recognize that the video files can become quite large, and so to avoid quality loss ASM suggests sending the video file via https://www.wetransfer.com/. When you have a final version of the video and the still ready to share, please send it to Ellie Ghatineh at eghatineh@asmusa.org.
Thank you for submitting your paper to mSystems.
Table S4: Accept
Table S1: Accept
Table S3: Accept