Genetic diversity of Mycobacterium tuberculosis strains circulating in Botswana

Background Molecular typing of Mycobacterium tuberculosis (M.tb) isolates can inform Tuberculosis (TB) control programs on the relative proportion of transmission driving the TB epidemic. There is limited data on the M. tb genotypes that are circulating in Botswana. The aim of this study was to generate baseline data on the genetic diversity of M.tb isolates circulating in the country. Methods A total of 461 M.tb isolates received at the Botswana National Tuberculosis Reference Laboratory between March 2012 and October 2013 were included in this study. Drug susceptibility testing was conducted using the BD BACTEC MGIT 960 System. M.tb strains were genotyped using spoligotyping and spoligotype patterns were compared with existing patterns in the SITVIT Web database. A subset of drug resistant isolates which formed spoligo clusters (n = 65) was additionally genotyped with 12-loci MIRU. Factors associated with drug resistance and clustering were evaluated using logistic regression. Results Of the 461 isolates genotyped, 458 showed 108 distinct spoligotype patterns. The predominant M.tb lineages were Lineage 4 (81.9%), Lineage 2 (9%) and Lineage 1 (7.2%). The predominant spoligotype families within Lineage 4 were LAM (33%), S (14%), T (16%), X (16%). Three hundred and ninety-two (86%) isolates could be grouped into 44 clusters (2–46 isolates per cluster); giving a clustering rate of 76%. We identified 173 (37.8%) drug resistant isolates, 48 (10.5%) of these were multi-drug resistant. MIRU typing of the drug resistant isolates allowed grouping of 46 isolates into 14 clusters, giving a clustering rate of 49.2%. There was no association between age, sex, treatment category, region and clustering. Conclusions This study highlights the complexity of the TB epidemic in Botswana with multiple strains contributing to disease and provides baseline data on the population structure of M.tb strains in Botswana.

Introduction Tuberculosis (TB) is amongst the top 10 causes of death throughout the world [1]. In 2017, an estimated 10.4 million people fell ill with TB; an estimated 1.3 million HIV uninfected people and 374 000 HIV infected people died of the disease. Botswana is a high TB burden country with an estimated TB incidence rate of 300 per 100,000 population [2]. According to the 2014 National TB report, about 60% of TB patients in Botswana are also co-infected with HIV [3]. Despite this high TB burden, studies to identify drivers of the TB epidemic in Botswana are sparse. The genetic diversity of Mycobacterium tuberculosis (M.tb) strains in Botswana remains largely unknown.
Molecular typing of M.tb is a useful tool to better understand the epidemiology of TB [4,5]. Spacer oligonucleotide typing (spoligotyping), a genotyping method based on the amplification of the direct repeat (DR) region of the M.tb genome, is a rapid method to detect and differentiate M.tb clinical specimens into different lineages and sub-lineages [6,7]. Mycobacterial Interspersed Repetitive Units-Variable Number of Tandem Repeats (MIRU-VNTR) is another polymerase chain reaction (PCR) based genotyping method used to differentiate M.tb strains [8]. Both, spoligotyping and MIRU-VNTR typing methods are widely used and results of these assays are stored in international databases [9,10]. Another genotyping method, which has been proven useful in investigating TB transmission in the past, is the insertion sequence 6110 restriction fragment length polymorphism (IS6110-RFLP) typing [5,[11][12][13][14]. These assays enable the study of the global distribution and phylogenetic analyses of M.tb lineages across countries. M.tb lineages have been found to be distributed differently across the globe and thus reflect migration patterns and local epidemiology [15]. Tracking recent transmission chains in a TB endemic area enables the TB control programs to identify transmission hotspots and to implement effective targeted interventions [16].
Monitoring circulating M.tb strains remains a critical part of any TB control program as it can help improve our understanding of transmission dynamics in specific geographical locations [17]. Studies were previously conducted in Botswana by Lockman et al. to investigate the molecular epidemiology of M.tb. The findings of these studies in four communities and the population at large showed that there was IS6110-RFLP clustering rate of 25% and 38% respectively [18,19]. The high IS6110-RFLP clustering rates were suggestive of active transmission within communities in Botswana. However, the studies conducted by Lockman et al. did not provide any detailed information on the phylogenetic lineages of the M.tb strains that were genotyped. Importantly, these studies were carried out more than 18 years ago so there is a need to generate more recent data on the M.tb diversity in Botswana. We generated more recent baseline data on the genetic diversity and distribution of circulating M.tb strains from specimens collected throughout Botswana.

Study setting and population
This was a retrospective, cross-sectional study. The study was approved by the University of Botswana Ethics Institutional Review Board and the Ministry of Health and Wellness Human Research Development Division. Permission was granted to use de-identified patient clinical information therefore patient consent was waived [Reference No: HPDME: 13/18/1 Vol. XI (140)]. We obtained all M.tb isolates (n = 461) that were available from the Botswana National Tuberculosis Reference Laboratory (NTRL) bio-repository. NTRL is a reference laboratory that receives sputum specimens for microscopy, culture and drug susceptibility testing (DST) from all public health facilities throughout the country. M.tb isolates included in this study were obtained from pulmonary and extrapulmonary samples received at NTRL from March 2012 to October 2013. These specimens were collected from 146 regional health posts, clinics and hospitals across all the 24 health districts and later sent to NTRL for culture. At the time of the study the indications for culture were as follows: HIV positive patients with 2 negative sputum smear results, new patients who are smear positive at month 3 of treatment, all retreatment patients, patients who have received anti-tuberculosis therapy (ATT) for more than 1 month in the past, all children, patients who develop TB during or after isoniazid preventative (IPT), symptomatic individuals who are at higher risk of MDR-TB such as health care workers, MDR-TB contacts and laboratory personnel, patients with suspected cryptogenic TB as well as fluids or tissues which are suspected to be infected by M.tb [3]. The patients' clinical information was obtained from the laboratory patient management system.
The administrative health districts were grouped into 4 regions namely; North West, Central, Southern and South West. A central location was chosen for each region and its coordinates were used to plot the distribution of lineages in each region. The Botswana map was generated by using freely available R packages ggmap, maps and mapdata. The maps with pie charts were edited using Adobe Illustrator CS6. The get_map() is a smart wrapper that queries Stamen Maps by specified longitude (19.8, 29) and latitude (-18, -26.7).

Drug susceptibility testing
Drug susceptibility testing (DST) for the first line drugs; isoniazid (INH), rifampicin (RIF), ethambutol (EMB) and streptomycin (STR) was performed using the BACTEC MGIT 960 System (Becton Dickinson Diagnostic Systems, Sparks, MD, USA) following the manufacturer's instructions. Susceptibility to first line drugs was tested at the following concentrations: 0.1 μg/ mL and 0.4 μg/mL for low and high concentrations of isoniazid respectively, 1.0 μg/mL for rifampicin and streptomycin and 5.0 μg/mL for ethambutol. Isolates were categorized into groups based on the resistance profile. Resistance to only one drug was termed monoresistance, resistance to at least isoniazid (INH) and rifampicin (RIF) were termed multi-drug resistance (MDR) and resistance to more than one first line drug but not meeting MDR criteria was termed polyresistance.

Spoligotyping
Spoligotyping was performed as previously described by Kamerbeek et al. [6] on 461 isolates. A commercially available spoligotyping kit (Mapmygenome, Hyderabad India) was used for amplification, hybridization and detection of DNA. The direct repeat (DR) region of the isolates was amplified by PCR using oligonucleotides primers DRa (5'-GGTTTTGGGTCTGAC GAC-3') biotinylated at the 5' end) and DRb (5'-CCGAGAGGGGACGGAAAC-3') derived from the DR sequence for 30 cycles at a Tm of 55˚C. The characterized M.tb strains H37Rv (laboratory reference strain) and Mycobacterium bovis (M.bovis)-Bacillus Calmette-Guérin (BCG) were included as positive controls; distilled water with PCR mix was used as negative control in each run. The resulting spoligotype patterns were reported in octal and binary formats and compared to existing patterns in the international genotyping database SITVIT2 [10] available at the following address: http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE/. Spoligotype patterns were grouped as spoligotype international types (SITs) if they shared identical patterns with existing patterns in the database. Isolates that did not share their patterns with previously reported patterns were termed "unknown". M.tb lineages and families were assigned based on spoligotyping data submitted to the SITVIT2 database. A clustering rate for spoligotyping was calculated using the formula (nc−c)/n, where n c is the total number of isolates clustered by spoligotyping, c is the number of clusters and n is the total number of isolates that were genotyped. A cluster was defined as two or more isolates with identical spoligotype patterns.

MIRU typing
To enhance the discriminatory resolution to identify possible clusters of transmission we applied the standardized 12-loci MIRU-VNTR protocol by Supply et al. [20] to the subset of isolates that were resistant to either INH or RIF or to both INH and RIF and which clustered according to spoligotyping (n = 65). M.tb H37Rv DNA was used as positive control and DNA-free water as a negative control. The PCR products were fractionated by electrophoresis in 2% agarose gels in 1XTris/Borate/Ethylenediaminetetraacetic acid buffer (TBE buffer) at 65V for 16hrs to allow for clear discrimination. The number of repeats at each locus was calculated from the size of DNA fragments according to the standardized table (http://www. MIRU-VNTRplus.org). The results were recorded in a digital format where each digit represented the number of repeats at a particular locus. A clustering rate for MIRU was calculated using the formula (nc−c)/n, where n c is the total number of isolates clustered by MIRU, c is the number of clusters and n is the total number of MIRU typed isolates in the study. A cluster was defined as two or more isolates with identical MIRU patterns.

Data analysis
STATA version 14 (Stata Corp, College Station, TX, USA) was used for statistical analysis. Factors associated with drug resistance and clustering were evaluated using univariate logistic regression. A p-value of <0.05 was considered statistically significant.
We identified 108 distinct spoligotype patterns; 44 clusters (2-46 isolates per cluster) and 69 had unique spoligotype patterns (S1 Table). The clustering rate for spoligotyping was found to be 76%. Lineage 4 (Euro American) was also the most prevalent lineage within the drug resistant TB isolates (81%) ( Table 4). Sixty-five of the 173 drug resistant isolates were successfully genotyped by MIRU. We identified 33 MIRU patterns among the drug resistant isolates; 46 isolates formed 14 clusters (2-6 isolates per cluster) and 19 isolates had unique MIRU patterns. The MIRU clustering rate for a subset of drug resistant isolates was calculated to be 49.2% (S2 Table). Using univariate logistic regression analysis there was no association between age, sex, HIV status, treatment category or geographical region and clustering ( Table 4). The risk of resistance to at least one anti-TB drug was 1.4 (p = 0.067) and 1.1 (p = 0.676) times higher for females and HIV negative patients respectively, but this was not statistically significant. We also could not detect an association between sex, age, HIV status, M.tb lineage and drug resistance (Table 5)

Discussion
This study provides most recent insights on the genetic diversity of M.tb strains circulating in Botswana. The analyzed M.tb strains belong to Lineages 1-4 and 6 with a predominance of Lineage 4 strains. Within Lineage 4 strains, the predominant families were LAM (Latin American Mediterranean), S, T and X. Previous studies have shown that the TB epidemic in Africa is dominated by Lineage 4 (Euro-American) strains [21]. In Southern Africa LAM11-ZWE and Lineage 2 (East Asian) have been shown to be dominant [21].
The overall strain diversity observed in Botswana is similar to that of M.tb lineages in neighboring countries South Africa, Zambia and Zimbabwe [21][22][23]. The presence of the AFRI_1 (Lineage 6) strain in the central region of Botswana is noteworthy. This M.tb family is usually not found outside of West African countries [24][25][26]. In a recent study by Mbugi et al to assess   the genetic diversity of M.tb in Africa, countries that share borders with Botswana such as South Africa, Namibia, Zambia and Zimbabwe did not report any Lineage 6 strains. Lineage 6 strains were only reported in Cameroon, Nigeria and Sudan which is consistent with other published data [22]. The Lineage 6 M.tb strain detected in our study could have been introduced through recent human migrations between West African countries and Botswana. However, since no additional clinical or ethnicity data from the respective patient was available, we can only speculate about the origin of the strain and more investigations are required to assert this hypothesis. An interesting finding in this study is the presence of an M. bovis strain-the causative agent of bovine tuberculosis (BTB) [23,27]. The presence of only one animal strain in our dataset suggests that there is either a low burden of BTB among cattle or low transmission rates from cattle to humans. There is limited data on the prevalence of BTB in Botswana and in one study carried out in the Northern region of Botswana by Jori et al. in 2013 the prevalence of BTB was reported to be very low [27]. The impact of BTB as a zoonotic disease in Botswana remains to a greater extent unknown and warrants further studies. There were 2 (0.4%) isolates with a genotype of the reference strain H37Rv; reasons for this might be laboratory contamination of patient samples with the reference strain. This however could not be further investigated.
There is limited published data about M.tb genotype clustering rates in African countries. The clustering rate obtained in this study by spoligotyping was 76% and comparable with those previously reported by Mulenga et al. in a study at Ndola district in Zambia [13]. A previous study in Botswana by Lockman et al., obtained a population clustering rate of 38% by IS6110 RFLP [18]. This clustering rate was much lower than that obtained by spoligotyping, which is to be expected since IS6110 RFLP is a more discriminatory genotyping method. The clustering rate for a subset of drug resistant isolates was 49.2%. These results cannot be used to infer transmission since 12 loci MIRU is not discriminatory enough to identify possible transmission chains [16]. Compared to other studies conducted in Zambia and Benin showing MIRU clustering rates of 37.7% and 34% respectively, our clustering rate is higher than those obtained in both countries [13]. Further investigations on the transmission dynamics in Botswana through the use of highly discriminatory genotyping techniques such as whole genome sequencing are required to determine the proportion of clustered (i.e. transmitted) versus non-clustered (i.e. acquired/imported/reactivated) drug resistant TB cases. The presence of unclustered or unique isolates suggests that TB reactivation and/or individual cases of imported TB from neighbouring countries may be playing a role in local TB epidemiology [28]. Since the unclustered isolates are acquired or reactivated, diagnosis or treatment compliance could be a problem and may need to be investigated. According to the previous national anti-tuberculosis drug resistance survey conducted between 2003 to 2013, there has been an increase in cases of drug resistant TB in Botswana; the proportion of new patients with any resistance to INH and RIF had increased 3.1-fold since ). This however, does not reflect the drug resistance rates in the general population due to the national TB program's criteria for culturing specimens at the time of the study. The high rates of streptomycin resistance could be attributed to the fact that streptomycin was at the time of the study used in empiric retreatment regimens before drug resistance testing results are available, it is also commonly used to treat other bacterial infections [29,31]. The presence of both INH and RIF monoresistance indicates that there is an increased risk for development of MDR-TB especially in cases where patients are not properly adhering to treatment [32,33]. There were no significant associations between drug resistance and sex, age, lineage or smear result in this study. Similar findings have been reported in other countries [34]. We acknowledge that our study had some limitations. Firstly, there is a selection bias in the study population because cultures were only performed for a certain group of patients who met the national culture inclusion criteria as described in the methods section. As a result, our findings may not be generalizable to the entire population. Secondly, the use of spoligotyping and 12-loci MIRU typing does not allow to draw conclusions about recent transmission of M. tb in Botswana. 15 loci or 24 loci-MIRU-VNTR typing methods with an automated system could potentially already reduce cluster rates, but despite having frequently been used for transmission analyses, a recent study by Meehan et al. showed that MIRU-typing generally is not discriminatory enough for that purpose [16]. MIRU typing may nevertheless serve as a first step in surveillance of potential transmission hotspots in a population, but further investigations using whole genome sequencing (WGS) are required to provide additional resolution [16]. As shown by other studies, WGS is able to discriminate between TB isolates at the single nucleotide level (SNP) which can in turn identify TB cases which are most likely to linked by transmission [35,36]. This genotyping technique has improved understanding of the TB epidemic in other Southern African countries.
Thirdly, there are other variables than those analyzed that could have played a role in the dominance and distribution of M.tb lineages; these include the frequency of social gatherings, family history of TB and travel history that were not investigated in this study. Finally, we did not have treatment outcome data for the patients and could therefore not determine if there is any association between treatment outcomes and M.tb lineage. Despite these limitations, our study provides important baseline information on the circulating M.tb strains and on the types of M.tb drug resistance present in Botswana. Such data is invaluable for the planning of future studies investigating the molecular epidemiology of M.tb and its transmission dynamics.

Conclusions
Our study provides baseline data on the genetic diversity of M.tb strains circulating in Botswana which could be used for future molecular epidemiology studies and contributes to the body of knowledge regarding global M.tb genetic diversity. It also shows the utility of spoligotyping and 12 loci MIRU in robustly assigning lineages and understanding the M.tb population structure in a TB endemic setting. However, there is a need for further epidemiological studies over extended period of time using whole genome sequencing to determine the precise genetic diversity and the transmission dynamics of M.tb strains in Botswana using a sample set that is representative of the entire country.
Supporting information S1 Table. Spoligotype international types (SIT) and corresponding spoligotyping-defined families and lineages of M. tuberculosis isolates included in the study. (DOCX) S2 Table. MIRU types (MIT) and corresponding spoligotyping-defined families and lineages for some of the drug resistant M. tb isolates in the study. (DOCX)