Whole Genome DNA Methylation Analysis of Active Pulmonary Tuberculosis Disease Identifies Novel Epigenotypes: PARP9/miR-505/RASGRP4/GNG12 Gene Methylation and Clinical Phenotypes

We hypothesized that DNA methylation patterns may contribute to the development of active pulmonary tuberculosis (TB). Illumina’s DNA methylation 450 K assay was used to identify differentially methylated loci (DML) in a discovery cohort of 12 active pulmonary TB patients and 6 healthy subjects (HS). DNA methylation levels were validated in an independent cohort of 64 TB patients and 24 HS. Microarray analysis identified 1028 DMLs in TB patients versus HS, and 3747 DMLs in TB patients after versus before anti-TB treatment, while autophagy was the most enriched signaling pathway. In the validation cohort, PARP9 and miR505 genes were hypomethylated in the TB patients versus HS, while RASGRP4 and GNG12 genes were hypermethylated, with the former two further hypomethylated in those with delayed sputum conversion, systemic symptoms, or far advanced lesions. MRPS18B and RPTOR genes were hypomethylated in TB patients with pleural involvement. RASGRP4 gene hypermethylation and RPTOR gene down-regulation were associated with high mycobacterial burden. TB patients with WIPI2/GNG12 hypermethylation or MRPS18B/FOXO3 hypomethylation had lower one-year survival. In vitro ESAT6 and CFP10 stimuli of THP-1 cells resulted in DNA de-methylation changes of the PARP9, RASGRP4, WIPI2, and FOXO3 genes. In conclusions, aberrant DNA methylation over the PARP9/miR505/RASGRP4/GNG12 genes may contribute to the development of active pulmonary TB disease and its clinical phenotypes, while aberrant DNA methylation over the WIPI2/GNG12/MARPS18B/FOXO3 genes may constitute a determinant of long-term outcomes.


Genome-Wide DNA Methylation Data Analysis
The Methylation Module in the Illumina Genome Studio V2009.2 (San Diego, CA, USA) was used to generate the β value for each CpG locus. The β value was calculated as: (intensity of methylated probe)/(intensity of methylated probe + intensity of unmethylated probe). The β-values ranged between 0 (least methylated) and 1 (most methylated) and was then transformed into a M-value to achieve better statistical properties 3 , which was the log2 ratio of the intensity of methylated probes versus unmethylated probes using the following equation: M value = log2 (β value/(1-β value)).
A positive M-value meant higher intensity from the methylated probes than the unmethylated probes and a negative M value meant the opposite. The significance threshold in M value comparisons was p < 0.005 and a false discovery rate (q) <0.5.
To identify differential methylated CpG sites, M values of the case and control groups were analyzed with the Mann-Whitney test by Partek ® Genomics Suite ® software to obtain a p value and q value, as described previously 2 . Significantly differentially methylated CpG sites with a p value < 0.005, q value < 0.01, at least a 10% difference in their β value (large effect size), and known biological or functional relevance were selected for further validation 4 . For the differentially methylated CpG sites, their corresponding gene symbols were used for pathway analysis using MetaCore from Thomson Reuters, which uses hypergeometric tests to examine whether the genes are enriched in any known pathway. The top 10 pathways were selected based on their p values (<0.005) and q values (<0.25). All methylation datasets have been deposited in the NCBI Gene Expression Omnibus with the accession number GSE118469.

Quantitative Realtime Reverse Transcription (RT)-PCR Method
Total RNA from PBMCs was isolated by RNA Extraction RiboPureTM-Blood (Ambion), and converted to single stranded cDNA using a cDNA archive kit (Applied Biosystems) followed by the amplification of specific gene transcript by using TaqMan   probe and specific primers (supplementary Table S2). GAPDH was used as the internal control. The PCR reaction was performed at 94 ℃ for 10 min, followed by amplification (95 ℃ for 10 s, 60 ℃ for 30 s), and cooling (40 ℃ for 30 s), for 30 cycles. The PCR products were subjected to 1% agarose gel electrophoresis and photographed. Relative expression levels were calculated using the ΔΔCq method with the median value for the HS group as the calibrator. All amplification reactions were performed simultaneously.

Analysis of miRNA-505 Gene Expression
cDNA was generated from 2 µL of purified total RNA using the TaqMan Advanced miRNA cDNA Synthesis kit (Thermo Fisher Scientific, Waltham, MA, United States).
Additionally, 1 pM of the synthetic C. Elegans oligo, cel-miR-39 (Sequence: UCACCGGGUGUAAAUCAGCUUG), was added to the isolated total RNA. This sequence does not exist in humans and was used as an exogenous control. All qPCR reactions were normalized to their corresponding cel-miR-39 Ct values. Quantitative RT-PCR was performed for each sample using 2.5 µL of diluted cDNA, TaqMan Advanced miRNA Assays (cel-miR-39-3p: 478293_mir ; hsa-miR-505-5p: 478957_mir ; Thermo Fisher Scientific, Waltham, MA, United States), and Applied Biosystems™ TaqMan™ Fast Advanced Master Mix (Thermo Fisher Scientific, Waltham, MA, United States) under fast cycling conditions. All TaqMan assays quantitative RT-PCR was carried out using the ABI 7500fast Real-Time PCR System (Applied Biosystems). Real-time PCR cycling conditions consisted of 95 ℃ for 20 s, followed by 40 cycles of 95 ℃ for 3 s and 60 ℃ for 30 s. miRNA fold expression changes were determined by the 2 -ΔΔCT method.  Biotin-AAAACATCTACCAATAAAAATTTCACAT S.P: GTGATTATTGGGATTTTGG Table S5. Primer sequences used for quantitative RT-PCR of the 9 candidate genes verified in the validation cohort.