Variation in pre-therapy levels of selected Mycobacterium tuberculosis transcripts in sputum and their relationship with 2-month culture conversion [version 1; peer review: 2 approved with reservations]

Background: The abundance of transcripts arising from Mycobacterium tuberculosis (MTB) in sputum pre-chemotherapy may enhance our understanding of factors influencing treatment response. We hypothesized that differences in the prevalence of pre-existing slowly metabolizing MTB in sputum may be partially responsible for differences in the rate of sputum clearance during treatment.
Methods: Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) was used to characterize a selected limited transcription profile of MTB in sputum pre-chemotherapy and assess inter-individual variation. The difference in cycle threshold (Ct) per gene, normalized to 16S, between exponential/stationary phase culture and sputum was calculated and stratified by 2-month culture converter status.
Results: HIV-1 uninfected patients with rifampicin-susceptible tuberculosis provided sputum pre-chemotherapy; 11 patients were negative for MTB culture after two months of therapy and 8 remained culture-positive. Increased icl1 and prpD and rpsN2:rpsN1 in sputum relative to culture suggested cholesterol utilization and a low-zinc environment respectively. Increased hspX and decreased atpA and nuoG relative to exponential culture suggested a slowly metabolizing subpopulation.


Introduction
Tuberculosis (TB) causes over a million deaths annually 1 . One attribute believed to contribute to its success as a pathogen is its ability to persist as phenotypically-adapted subpopulations under antimicrobial, environmental or immune stress.
A well-defined feature of Mycobacterium tuberculosis (MTB) phenotypic adaptation involves up-regulation of the DosR regulon in response to reduced oxygen availability, or exposure to nitric oxide or carbon monoxide 2,3 . This regulon consists of 48 genes under the control of transcription factor DosR, including hspX coding for alpha-crystallin stress protein.
Previous studies have reported abundant DosR-regulated transcripts in sputum pre-chemotherapy 4-6 . The DosR regulon includes a small RNA (sRNA) transcript, MTS1338. This was reported to accumulate in infected tissues and we anticipated that its enhanced stability might make it a particularly sensitive target for reverse transcription-polymerase chain reaction (RT-PCR) 7 . It was also of interest to examine the possible influence of MTB genotype on transcriptional profiles observed in sputum. It has been reported that a mutation in strains from the Beijing family introduces a new transcriptional start site that results in constitutive expression of the dosR gene (and hence the associated regulon) independent of stress induction 8 .
MTB phenotypes with reduced replication and metabolism, particularly in the context of hypoxia, are characterized by a low level of ATP and a switch from proton-pumping NADH dehydrogenase-1 to non-proton-pumping NADH dehydrogenase-2 9,10 . We therefore included atpA, nuoG and ndh in our panel of target genes. In addition, we included genes encoding the major sigma factors sigA and sigB, together with icl1 and prpD as markers of the methylcitrate cycle that has been shown to be induced as a consequence of cholesterol catabolism during infection 11 . A second abundant sRNA, MTS2823, has been linked to the methylcitrate cycle 12 and was also included in the set of target genes.
Finally, we included two paralogous ribosomal protein genes, rpsN1 and rpsN2, on the basis of previous studies of the MTB sputum phenotype (RPL, unpublished data). MTB has four ribosomal proteins that can be expressed as alternative zinc-binding or zinc-independent forms dependent on the availability of zinc in the growth medium 13 ; rpsN1 encodes a zinc-binding S14 protein and rpsN2 the corresponding zinc-independent form.
We hypothesized that differences in the prevalence of preexisting drug-tolerant MTB in sputum may be partially responsible for differences in the rate of sputum clearance during treatment. To test this, we used qRT-PCR to measure the abundance of the selected transcripts in sputum samples from a panel of patients infected with different MTB lineages and with differing responses to subsequent antitubercular chemotherapy. To determine inter-individual variation in m/sRNA expression patterns, our protocol lacked an amplification step after the random primer based cDNA synthesis, prior to quantification by qRT-PCR 14 .

Setting
Recruitment was at Site B Ubuntu Clinic, a primary care integrated HIV/tuberculosis clinic in Khayelitsha township, Western Cape, South Africa, as part of a prospective study (University of Cape Town Human Research Ethics Committee approval 568/2012) assessing frequency and determinants of acquired drug resistance. The study was carried out from March 2013 to July 2014, with clinical follow-up until November 2015. A subset of the patients was invited to provide sputum for qRT-PCR in this nested study. All participants gave written consent.

Participants
HIV-uninfected patients with GeneXpert® MTB/RIF-confirmed rifampicin-susceptible pulmonary TB were recruited before commencement of quadruple therapy with rifampicin/isoniazid/ethambutol/pyrazinamide. Patients were excluded if they were under 18, had received treatment for tuberculosis in the previous 6 months, had a positive pregnancy test, were unable to give informed consent, declined testing for HIV or were unable to expectorate sputum.

Sample size considerations
This was an exploratory study and there was no formal sample size calculation. A sample size of 20 was selected to compare variability of expression of selected transcripts, stratified by culture converter status and MTB strain.

Sample collection
Full details of clinical variables collected and follow-up procedures have been described elsewhere 15 . In this sub-study, patients produced spontaneous sputum samples. TRIzol® LS (10296028, Invitrogen, Carlsbad, USA) was immediately added to the expectorated sputum in a 3:1 volume ratio, and the mixture was vortexed until homogeneous. Samples were transported on ice (a 30-minute journey) from the clinical site to the laboratory, where they were frozen at -20°C.

RNA extraction from sputum
1 ml aliquots were thawed and ribolyzed at 6.5 m/sec for 45 seconds, and again after 5 minutes, with a FastPrep-24 ribolyzer (MPBio, UK). 200 μl of 100% chloroform (CHCl3) was added, and the mixture was vortexed and centrifuged at 3000 rpm for 10 minutes. The liquid content was transferred to a new tube, 600 μl CHCl3/isoamyl alcohol (24:1) was added and the tube was centrifuged at 13000 rpm for 1 minute. The top aqueous layer was transferred to a new tube and mixed well with an equal volume of 100% ethanol. RNA was then extracted from the RNA-ethanol mixture using the RNA Clean and Concentrator kit (R1013, Zymo, Irvine, USA).

Axenic culture of reference M. tuberculosis
M. tuberculosis H37Rv (Systems strain) was available in-house and cultured in 7H9 media (M0178, Sigma-Aldrich, UK) containing 0.2% v/v glycerol, 0.05% v/v Tween 80 and 10% ADC in roller bottles. Exponential phase culture was harvested when the OD600 measured between 0.7 and 1 (Biochrom WPA CO8000 Cell Density Meter), and stationary phase culture was harvested 4 weeks after the OD600 reached a plateau of 1.0. 10 ml of culture was pelleted by centrifugation and washed once with phosphate buffered saline (PBS) before addition of 2 ml TRIzol® LS.

cDNA synthesis, primer design and qPCR
First strand cDNA synthesis was performed from 1 μg RNA from each sputum sample using the Superscript VILO system (11754-050, Invitrogen, Carlsbad, USA) and random hexamers (N8080127, Invitrogen). This process was also carried out for H37Rv exponential and stationary phase cultures. Serial dilutions were carried out on H37Rv cDNA until the average cycle threshold (Ct) for 16S was comparable to values obtained from sputa, to allow optimal comparability. Primers and probes complementary to 10 mRNAs (atpA, sigA, sigB, nuoG, ndh, icl1, prpD, rpsN1, rpsN2, hspX), 2 sRNAs (MTS2823 and MTS1338) and 16S ribosomal RNA were designed using the PrimerQuest Tool (Integrated DNA Technologies, UK, see Table 1). Primers and probes with potential to form hairpin structures were excluded and potential cross-reactivity was checked with the BLASTN tool. qPCR was carried out on customized TaqMan array plates pre-printed with the primers and probes described above. Reactions were run in triplicate on an ABI Prism 7000 (Applied Biosystems, UK). Conditions were 95°C/10 minutes, followed by 40 cycles of 95°C/15 seconds and 60°C/60 seconds.
Separate sputum samples, taken simultaneously at baseline, were grown in MGIT liquid cultures using the BD BACTEC MGIT 960 system at a routine diagnostic laboratory, National Health Laboratory Service (Groote Schuur Hospital). Isolates were collected and genotyped by spoligotyping according to the internationally standardized method. 500 μl aliquots of MGIT culture were heat-killed at 80°C for 1 hour and 5 μl of lysate was used for PCR. Chromosomal DNA of M. tuberculosis H37Rv and M. bovis BCG P3 were used as positive controls and water as a negative control. Primers DRa and DRb were used to amplify spacers in the direct repeat (DR) region. Biotin-labeled PCR products were then hybridized to immobilized spacer-oligos that represent spacers of known sequence.
The presence of spacers was visualized on film as black squares after incubation with streptavidin-peroxidase and electrochemical luminescence system detection 16 . Hybridisation patterns obtained were used to assign the isolates to genotype families using the SPOTCLUST SpolDB3 database 17 .
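The serial dilution step described above (diluting H37Rv cDNA until its 16S Ct is comparable to that of sputum) relies on the usual relationship between template amount and Ct. The sketch below illustrates that arithmetic under the assumption of ideal exponential amplification; the function names and the Ct values in the usage note are hypothetical and are not taken from the study.

```python
import math


def ct_shift(dilution_factor, efficiency=1.0):
    """Expected rise in Ct after diluting the template by dilution_factor.

    efficiency=1.0 assumes perfect doubling per cycle (an idealization).
    """
    if dilution_factor <= 0 or efficiency <= 0:
        raise ValueError("dilution_factor and efficiency must be positive")
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


def dilution_needed(ct_culture, ct_target, efficiency=1.0):
    """Fold dilution that would move ct_culture up to ct_target."""
    return (1.0 + efficiency) ** (ct_target - ct_culture)
```

For instance, under these assumptions a 10-fold dilution raises the Ct by roughly 3.3 cycles, and closing a hypothetical 10-cycle gap in 16S Ct between culture and sputum would take about a 1000-fold dilution.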

Data normalization
To normalize for overall bacterial load, all cycle thresholds (Ct) of the genes of interest (GOI) were divided by the Ct of the housekeeping gene 16S (Ct_GOI / Ct_16S; denoted as SPUT_GOI). For individual GOIs (atpA, sigA, sigB, nuoG, ndh, icl1, prpD, rpsN2, hspX, MTS2823 and MTS1338), the difference between exponential phase culture (Ct_GOI / Ct_16S, denoted as EXP_GOI) and SPUT_GOI was calculated for each patient, along with the median for all patients (ΔGOI_EXP). The difference in abundance of the mRNA/sRNA transcripts in sputum compared with the reference H37Rv cultures was summarized as an overall 'mean deviation' from exponential/stationary culture. The overall 'mean deviation from exponential phase culture' (MDEx) was calculated as the mean of all values of [EXP_GOI - SPUT_GOI], inclusive of 11 GOIs and inclusive of all patients (N=19). In symbols, following this definition:

\[
\mathrm{MDEx} = \frac{1}{11N} \sum_{i=1}^{N} \sum_{j=1}^{11} \left( \mathrm{EXP}_{j} - \mathrm{SPUT}_{j,i} \right)
\]

where j represents each GOI, i each patient, and N = total number of patients.
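The 'mean deviation from stationary phase culture' (MDStat) was calculated as the mean of [STAT_GOI - SPUT_GOI], inclusive of all GOIs and inclusive of all patients. In symbols, following this definition:

\[
\mathrm{MDStat} = \frac{1}{11N} \sum_{i=1}^{N} \sum_{j=1}^{11} \left( \mathrm{STAT}_{j} - \mathrm{SPUT}_{j,i} \right)
\]

where j represents each GOI, i each patient, and N = total number of patients.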
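The normalization and mean-deviation summaries described above can be sketched in a few lines. This is a minimal illustration of the written definitions (Ct ratio to 16S, then the mean of reference-minus-sputum differences over all GOIs and patients), not the authors' own analysis script; the function names, dictionary layout and Ct values are hypothetical.

```python
from statistics import mean


def normalize(ct_goi, ct_16s):
    """Ct of a gene of interest divided by the 16S Ct of the same sample."""
    return ct_goi / ct_16s


def mean_deviation(reference_ct, sputum_ct, housekeeping="16S"):
    """Mean of [REF_GOI - SPUT_GOI] over all GOIs and all patients.

    reference_ct: dict gene -> Ct for one reference culture
                  (exponential for MDEx, stationary for MDStat).
    sputum_ct:    list of dicts gene -> Ct, one dict per patient.
    """
    ref_norm = {
        gene: normalize(ct, reference_ct[housekeeping])
        for gene, ct in reference_ct.items()
        if gene != housekeeping
    }
    differences = []
    for patient in sputum_ct:
        for gene, ref_value in ref_norm.items():
            sput_value = normalize(patient[gene], patient[housekeeping])
            # Positive difference: lower Ct ratio in sputum, i.e. the
            # transcript is relatively more abundant in sputum.
            differences.append(ref_value - sput_value)
    return mean(differences)


# Hypothetical Ct values for two genes and two patients.
exponential = {"16S": 20.0, "atpA": 30.0, "hspX": 32.0}
patients = [
    {"16S": 20.0, "atpA": 32.0, "hspX": 30.0},
    {"16S": 25.0, "atpA": 35.0, "hspX": 40.0},
]
mdex = mean_deviation(exponential, patients)
```

Passing stationary-phase Ct values as `reference_ct` gives the corresponding MDStat-style summary.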
The Wilcoxon rank-sum test was used to assess the statistical significance of differences between groups. Correlation between continuous variables was assessed by the Spearman correlation coefficient. All analyses were performed using GraphPad Prism version 8 (La Jolla, CA).
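The analyses in the study were run in GraphPad Prism. To make the two rank-based statistics concrete, the sketch below computes the Spearman coefficient and the rank-sum (Mann-Whitney U) statistic using only the Python standard library. It is illustrative only: the function names are hypothetical, it does not compute p-values, and it does not reproduce Prism's implementation.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (inputs must not be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rank_sum_u(group_a, group_b):
    """Mann-Whitney U statistic for group_a, from ranks in the pooled sample."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[: len(group_a)])
    return rank_sum_a - len(group_a) * (len(group_a) + 1) / 2.0
```

In this setting, `spearman_rho` would take paired values such as 16S Ct and TTP per patient, and `rank_sum_u` would take the normalized values for converters and non-converters.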

Results
In total, 20 participants provided sputum for qRT-PCR, of whom 19 were included in the study (see underlying data 18 ). One was excluded due to high Ct values for all m/sRNAs analysed, indicative of insufficient MTB RNA. Eight out of 19 participants were non-converters who remained MTB culture-positive at 2 months, and 11 were converters who became culture-negative by 2 months. All patients were culture-negative after 5 months of therapy. Seven patients were infected with a strain from MTB Lineage 2 (Beijing), one with a Lineage 3 strain, and 11 with strains belonging to Lineage 4 (Table 2).
The bacterial load in sputum was assessed by time to positivity (TTP) in MGIT culture and by the Ct for 16S ribosomal RNA. Median TTP at baseline was 7.5 days (IQR 6-9) for non-converters who remained culture-positive at 2 months and 7 days (IQR 6-11.5) for converters who were culture-negative at 2 months. 16S Ct values were significantly correlated with TTP at baseline (R=0.70, p<0.0001).
Figure 1A illustrates primary Ct data for rpsN1 and rpsN2 transcripts. In exponential cultures of MTB H37Rv, the Ct for rpsN1 was lower than that for rpsN2 (i.e. expression of the rpsN2 transcript was repressed, consistent with growth in zinc-replete 7H9 medium 13 ), while the reverse was observed in stationary phase. Ct values were much higher in sputum, reflecting the lower number of bacteria, with the ratio of rpsN1 to rpsN2 suggestive of growth in a low-zinc environment.
Figure 1B and 1C illustrate 16S-normalized transcript abundance in sputum compared to exponential and stationary phase culture. Median levels of atpA, sigA and nuoG in sputum were slightly lower than in exponential culture but significantly higher (p<0.01) than in stationary phase. Conversely, hspX in sputum was higher than in exponential culture but lower than in stationary phase. The 16S-normalized abundance of rpsN2, icl1 and prpD transcripts in sputum was significantly higher (p<0.01) than under either phase of laboratory culture. The mean deviation of the total set of transcripts in sputum from H37Rv growth status at baseline (MDEx and MDStat) showed that the overall sputum profile resembled exponential culture more closely than stationary phase (Figure 1D).
To further define a potential reduced-metabolism phenotype in sputum, we searched for an hspX-high, atpA-low, nuoG-low signal and its variability between individuals. A strong correlation was observed between 16S-normalized abundance of atpA and nuoG (R=0.92, p<0.001). There was a non-significant trend towards reduced expression of these markers of oxidative phosphorylation in non-converters: 6/8 (75%) had below-median levels for both nuoG and atpA, compared with 4/11 (36%, p=0.1) and 5/11 (45%, p=0.2) of converters (Figure 2A). There was also a non-significant trend towards increased expression of the DosR-regulated transcript hspX, above the median of the entire group, in non-converters (5/8, 63%) compared with converters (4/11, 36%, p=0.2) (Figure 2B).
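The text does not state which test produced the p-values for these proportion comparisons. As one possible approach for small 2x2 tables of this kind, the sketch below implements a two-sided Fisher's exact test with the standard library; the function name is hypothetical, and the choice of test is an assumption rather than the authors' stated method.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, col1, total = a + b, a + c, a + b + c + d

    def weight(k):
        # Unnormalized probability of seeing k in the top-left cell.
        return comb(col1, k) * comb(total - col1, row1 - k)

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(total, row1)


# 6/8 non-converters vs 4/11 converters below the median (counts from the text).
p_value = fisher_exact_two_sided(6, 2, 4, 7)
```

For these counts the two-sided value is about 0.17. It need not match the reported p-values, which may come from a different or one-sided test.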
The 16S-normalized transcript abundance of hspX showed no significant difference (p=0.8) between the 7 Beijing and 12 non-Beijing samples included in the study ( Figure 2C), consistent with the conclusion that the DosR phenotype detected in sputum reflects an induced response to hypoxia, nitric oxide or carbon monoxide.

Discussion
This study adds to literature describing the sputum phenotype of M. tuberculosis by demonstrating the potential application of qRT-PCR to analyse differences in well-characterized patient cohorts. By avoiding the pre-amplification step characteristic of previous protocols, the present study avoids potential technical issues involving non-linear amplification and preferential enrichment of abundant transcripts.
Based on the selected set of target genes, the overall transcription profile detected for MTB in sputum more closely resembled H37Rv in exponential culture than in stationary phase culture. It has previously been proposed that MTB assumes a non-replicating phenotype in sputum 5 , but comparison of our own with published data 6,19 indicates that the sputum profile falls between replicating and non-replicating states, and that differing interpretations may partially reflect inter-laboratory variation in reference cultures. The extent to which MTB continues to replicate within the respiratory tract could have an important influence on disease transmission. Consistent with previous descriptions of the transcriptional phenotype of MTB during infection 4-6 , icl1 and prpD transcripts were significantly higher in sputum than in laboratory culture. The products of these two genes participate in the methylcitrate cycle that is required to process potentially toxic propionate generated during metabolism of cholesterol and odd branch length fatty acids. This suggests that MTB in sputum resembles intracellular MTB in utilizing cholesterol as a carbon source 20 . In contrast, abundant representation of rpsN2 transcript in sputum is characteristic of growth in a low-zinc environment, which is inverse to the zinc intoxication associated with MTB in activated macrophages 21 .
Increased abundance of hspX and decreased abundance of atpA and nuoG transcripts relative to exponential culture is consistent with the presence of a metabolically less-active MTB phenotype in sputum. While a trend towards enrichment of this sub-population in patients who required longer treatment to achieve culture conversion was observed, it did not reach statistical significance. By knowing the lineage of the infecting strain in each case, we were able to exclude the possibility that genotypic variation (for example, constitutive dosR expression in Beijing strains) was responsible for the observed variable representation of the low-metabolism phenotype.
Findings were limited by the relatively small sample size and suggest the need for expanded studies focusing on more phenotypically informative transcripts. As the sputum RNA was not pre-amplified, certain genes of interest had very high Ct values which should be interpreted with care. An important limitation relates to the use of 16S rRNA to normalize samples with respect to bacterial load. The relative stabilities of rRNA and mRNA in the respiratory tract are unknown and preferential enrichment for one or the other in non-replicating or dead bacteria could skew the data. A further limitation is comparison of transcripts from sputum to exponential and stationary phase cultures of H37Rv instead of isolates cultured from the original sputa.
In conclusion, direct qRT-PCR interrogation of sputum samples using a limited panel of MTB transcripts reveals an overall transcriptional phenotype that bears closer resemblance to exponentially growing bacteria than to stationary phase culture. Differential abundance of selected informative transcripts suggested a metabolically less-active subpopulation with a prevalence that varied between individual untreated patients. We were unable to demonstrate a statistically significant correlation between this subpopulation and subsequent response to treatment.

This study has attempted to correlate treatment response, in terms of culture conversion at the end of intensive phase treatment, with levels of gene expression related to a slowly metabolizing population of Mtb at baseline, with the hypothesis that the presence of such a population at baseline may be reflected in slower culture conversion. The study is exploratory with a limited sample size. The authors have used a TaqMan-based qPCR assay to quantify the relative abundance of specific transcripts. The choice of transcripts is well justified to address this important research question.

Data availability
The study, while showing that the Mtb in sputum at baseline overall resembles exponential Mtb in culture, also highlights the presence of transcripts that resemble slowly metabolizing Mtb. Unfortunately, the main objective of the paper, to correlate the presence of slowly metabolizing bacteria with culture conversion, could not be statistically established. Nevertheless, this reviewer has some major technical concerns that could have a bearing on the authors' interpretation.

1. The paper does not describe the limit of detection for each gene. The data show that several of the samples have a Ct value >38, which is generally considered less reliable. A Ct of 38 is generally taken as the value for detecting a single transcript by the TaqMan method, so anything above 38 essentially indicates absence of transcript. The results are especially worrying because some of the genes had Ct >38 even in the cultured H37Rv sample when it was adjusted to have a similar 16S Ct to sputum (based on the raw data submitted). In complex samples like sputum, anything above a Ct of 35 needs to be considered with caution, as stochasticity can significantly affect the qPCR outcome. In our experience working with direct samples like sputum, we observed Mtb-specific DNA/RNA to be less than 1% in such samples despite using steps to minimize human DNA/RNA contamination. Hence it is important to determine the limit of detection and prove the specificity of the primer-probe pair in such samples by comparing the true positive samples with MTB-negative sputum samples.

2. The study also lacks, or does not mention, another important control: reverse-transcriptase-negative samples to rule out the possibility of gDNA contamination.

3. In Figure 1A, the comparison has been done with direct Ct values, and trends have been deciphered with a higher amount of RNA in cultured H37Rv. The results and interpretation may change if more accurate values, normalized with respect to 16S and with comparable RNA quantity, were used.

Michelle B. Ryndak
Department of Pathology, New York University Langone Medical Center, New York, NY, USA
This is an exploratory study seeking to correlate the transcriptional profile of M. tb in patient sputum with response to treatment as measured by conversion to bacterial clearance by 2 months post-treatment. This study includes M. tb in stationary cultures as well as exponential cultures as references against sputum M. tb and presents the finding that M. tb in sputum more closely resembles exponential than stationary culture (at least among the genes selected for this study).
Additionally, a few of the metabolic adaptations of M. tb in the sputum environment were defined. The authors have been careful to document patient characteristics and clinical features of the infections as well as identify the strain classification associated with each infection. Furthermore, the study was performed without an RNA amplification step to avoid a non-linear amplification bias. Most of the results show trends rather than statistical findings; however, in an exploratory study with limited sample size, this information can be nonetheless informative. Unfortunately, the main premise of this paper, the correlation of M. tb transcriptional profile in sputum with treatment response, is not brought to fruition, most likely because the sample size is not robust enough to yield convincing results. Therefore, the study falls short in its conclusions. It will be interesting to see the results of a similar study with an increased sample size.

Some concerning technical issues:
1. There is no mention of the quality of RNA used in the study (Bioanalyzer or Nanodrop).

2. There is no mention of how DNA contamination is either assessed or addressed (DNase treatment, direct PCR on the RNA samples, parallel RT-PCR without reverse transcriptase, etc.).

3. The 1 μg of RNA obtained from a 1 ml aliquot of sputum diluted 1:3 may be a mixture of eukaryotic and prokaryotic RNA (unlike RNA obtained from the in vitro cultured M. tb) and may not be an accurate measure of the starting RNA. The serial dilution of RNA from the reference bacteria to obtain similar 16S transcripts as in the sputum RNA is a reasonable way to deal with this issue. Based on these serial dilutions, can the approximate amounts of starting bacterial RNA from sputum and cultures be derived?

Presentation and interpretation of data:
1. The use of HspX and MTS1338 as markers of a DosR-regulated phenotype: If there is a DosR-regulated phenotype, these two genes should correlate with each other up or down; however, while both are upregulated in sputum against exponential culture, HspX is down and MTS1338 is up against stationary culture. While "stationary" bacteria are known to induce the DosR regulon, sputum bacteria may not. This would not be counter to findings in other published transcriptional profiles of M. tb in patient sputum (see below). Perhaps including an additional well-characterized DosR-regulated gene may help to clarify this.

2. Based on Figure 2C, the authors conclude that strain lineage does not affect the results of the study; however, only the two DosR regulon genes are used to compare the Beijing vs non-Beijing strains and only HspX is discussed in the text. (A statement regarding the effect/lack of effect of strain lineage on differential expression of the other target genes should also be included.) Granted, the premise is based on the fact that Beijing strains have constitutive dosR expression regardless of inducing stresses. If this is the case, why do 5 of the 7 Beijing samples fall in the low abundance quadrant for both HspX and MTS1338 genes? This seems counterintuitive.

Literature: In addition to Garton et al., a more recent study of the transcriptome of M. tb in patient sputum has been published 1 . A discussion of the similarities and differences between these studies and the findings in this current publication would be useful to the reader.

Minor Points:
In Figure 1A legend, define red horizontal lines. Can the data in Figure 1A be depicted in its 16S-normalized form?
In Figure 2C, there are 13 boxes representing non-Beijing strains; however, "12" are referenced in the text.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.