Data on whole genome sequencing of extrapulmonary tuberculosis clinical isolates from India

This article describes whole genome sequencing data from 5 extrapulmonary tuberculosis clinical isolates. Sequencing was carried out on the Illumina MiSeq platform to identify single nucleotide variations (SNVs) associated with drug resistance. A total of 214 SNVs were identified in the coding and promoter regions. Of these, 18 SNVs were found in genes known to be associated with first and second line drug resistance. The data is related to the research article “Whole genome sequencing of Mycobacterium tuberculosis isolates from extrapulmonary sites” (Sharma et al., 2017) [1].



Value of the data
This data provides insight into the genomic profiles of M. tuberculosis clinical isolates from extrapulmonary sites.
Lineage-specific SNVs identified by whole genome sequencing allow accurate strain typing and provide information on the lineage distribution of extrapulmonary tuberculosis (EPTB) isolates.
The data also provides information on SNVs associated with resistance to anti-tubercular drugs.
Since the genomic profiles of EPTB isolates remain largely unexplored, this data adds to current knowledge of the genomes of M. tuberculosis isolated from different infection sites.

Data
The data represents whole genome sequencing of 5 extrapulmonary isolates from 3 different sites. All five clinical isolates sequenced in this data set belonged to the East-African-Indian lineage (Lineage 3) (Fig. 1A). A scientific interpretation of this data set was performed by Sharma et al. [1]. Data analysis led to the identification of 15 SNVs in the coding regions of genes (Fig. 1B) that are known to confer resistance to first and second line anti-tubercular drugs (Supplementary Table 1A). Apart from known drug resistance SNVs, we also identified 199 SNVs in the promoter regions of 157 genes (Supplementary Table 1B) (Fig. 2). Three of these 157 genes are associated with drug resistance and carry promoter region SNVs in all 5 isolates (Fig. 1B).

Culturing and DNA isolation of extrapulmonary isolates
The 5 EPTB isolates were obtained from the Department of Medical Microbiology, The Postgraduate Institute of Medical Education and Research, Chandigarh, India. The isolates were cultured and maintained as described in [1]. The LJ slants were incubated at 37 °C for a maximum period of 8 weeks and were inspected daily for growth or contamination. The isolates were then tested to rule out non-tuberculous mycobacteria (NTM) or other infection and were cultured for DNA extraction as previously described [1]. DNA was extracted from the isolates cultured on the LJ slants using the cetyltrimethylammonium bromide (CTAB) protocol [2].

Library preparation and sequencing
DNA libraries were constructed and sequencing was carried out on an Illumina MiSeq instrument as described previously [1]. Sequencing was performed using a 2 × 100 paired-end (PE) configuration (Table 1).
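
As a rough illustration of what a 2 × 100 paired-end configuration implies for sequencing depth, the following minimal sketch estimates average genome coverage as total sequenced bases divided by genome length. The H37Rv reference (NC_000962.3) is 4,411,532 bp long. The function name and the read-pair count used in the example are hypothetical and are not taken from Table 1.

```python
# Length of the M. tuberculosis H37Rv reference genome (NC_000962.3), in bp.
H37RV_GENOME_LENGTH_BP = 4_411_532


def estimated_coverage(read_pairs: int,
                       read_length: int = 100,
                       genome_length: int = H37RV_GENOME_LENGTH_BP) -> float:
    """Return the average depth for paired-end reads of a fixed length.

    Each pair contributes two reads, so total bases = pairs * 2 * read_length.
    """
    total_bases = read_pairs * 2 * read_length
    return total_bases / genome_length


# Hypothetical number of read pairs, for illustration only.
example_pairs = 1_000_000
print(f"Estimated depth: {estimated_coverage(example_pairs):.1f}x")
```

This is an upper bound on usable depth, since reads that are discarded during quality filtering or that fail to map do not contribute to coverage.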

Variant calling and data analysis
Paired-end reads were quality checked using FastQC version 0.11.5. Raw reads with a Phred quality score < 20 were discarded. High quality reads were mapped to the H37Rv reference genome (NC_000962.3) using the Burrows-Wheeler Alignment Tool (BWA version 0.7.15) [3]. Variants were identified using GATK [4] and annotated using in-house Perl scripts. Phylogenetic analysis was carried out using KvarQ version 0.12.2 [5]. SNVs identified in the isolates were used to generate a phylogenetic tree with FastTree version 2.1.10 [6].
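
The quality filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not name the filtering tool or say whether the threshold was applied per base or per read, so the sketch assumes a mean per-read Phred score and the Phred+33 encoding. All function and constant names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: (header line, sequence, quality string).
FastqRecord = Tuple[str, str, str]

PHRED_OFFSET = 33        # assumption: Phred+33 ASCII encoding
MIN_MEAN_QUALITY = 20    # threshold stated in the text


def mean_phred(quality: str, offset: int = PHRED_OFFSET) -> float:
    """Decode a FASTQ quality string and return its mean Phred score."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from well-formed, four-line-per-record FASTQ text."""
    iterator = iter(lines)
    for header in iterator:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        sequence = next(iterator, "").rstrip("\n")
        next(iterator, "")  # '+' separator line, ignored
        quality = next(iterator, "").rstrip("\n")
        yield header, sequence, quality


def filter_reads(records: Iterable[FastqRecord],
                 min_mean: float = MIN_MEAN_QUALITY) -> Iterator[FastqRecord]:
    """Keep reads whose mean Phred score is at least min_mean."""
    for header, sequence, quality in records:
        if mean_phred(quality) >= min_mean:
            yield header, sequence, quality


# Toy input: one high-quality read and one low-quality read.
example = [
    "@read_high", "ACGT", "+", "IIII",
    "@read_low", "ACGT", "+", "####",
]
kept = list(filter_reads(parse_fastq(example)))
print([header for header, _, _ in kept])
```

The sketch treats each read independently. For paired-end data, both mates of a pair would need to be dropped together so that the two read files stay synchronized before mapping.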