Discriminatory Ability of Hypervariable Variable Number Tandem Repeat Loci in Population-based Analysis of Mycobacterium tuberculosis Strains, London, UK

Hypervariable loci should be included in standardized panels because they can provide consistent comparable results in multiple settings.

Globally, tuberculosis (TB) accounts for almost 2 million deaths each year (1). Although TB notification rates in the United Kingdom (13.8/100,000 in 2007) remain low, rates differ substantially by region: London (43.2/100,000) accounts for ≈40% of all TB cases registered in the United Kingdom, and ≈75% of TB patients in London were born abroad (2). Rates of drug resistance also are higher in London than in the rest of the United Kingdom: 8.6% of isolates are isoniazid resistant, and 1.2% are multidrug resistant (UK Health Protection Agency; www.hpa.org.uk).
In settings where incidence of TB is low or moderate, molecular genotyping is used to investigate suspected TB outbreaks, laboratory cross-contamination, and reactivation and (at a population level) to identify clustered cases that are not apparently linked; for the latter purpose, the highest possible level of discrimination is required (3). For these purposes, insertion sequence (IS) 6110 restriction fragment length polymorphism (RFLP) analysis, often supplemented with spoligotyping and, more recently, with variable number tandem repeat (VNTR) typing, is used routinely.
The highest levels of epidemiologic discrimination of strains of the Mycobacterium tuberculosis complex (MTBC) can be achieved by using multilocus VNTR typing, but these results depend on the number and choice of loci used, particularly for homogeneous strain groups such as the Beijing family (3-5). This approach overcomes technical difficulties associated with IS6110-RFLP and is amenable to automation that results in a high throughput (6-10). A standardized panel of 15 + 9 VNTR loci (24 loci) has been proposed (7,11), but it is unclear whether sufficient discrimination would be seen when the panel is used in populations with a substantial prevalence of homogeneous MTBC families (4,5,12). In addition, the discriminative power of VNTR loci may vary markedly among genetic families (7,13). Recent studies evaluating the discriminative power of VNTR typing have produced conflicting results that were generated by using convenience samples (small populations with low diversity or populations confined to a single geographic setting). These studies highlighted a need for larger population-based studies to identify discriminative VNTR loci and ascertain their applicability for various genetic groups.
Concerns about the stability and reproducibility of particularly useful hypervariable loci, such as 3232, 2163a, 3336, and 1982 (3-5,14), have been raised (7,15). As a result, they have been excluded from the proposed international panels for VNTR typing. For these reasons, we conducted a study to examine the stability of hypervariable loci and the parameters associated with reproducibility, to select loci suitable for prospective molecular epidemiologic studies, and to evaluate the discriminatory power of these loci at a population level in a metropolitan setting.

Bacterial Isolates
A total of 2,261 individual MTBC isolates (1 per patient) were included in this prospectively designed population study. These isolates represented 95.7% of the bacteriologically confirmed TB cases reported from the 30 London hospitals in the 12 months from April 2005 through March 2006. These isolates had been characterized by using spoligotyping, and all but 4 were assigned to 1 of 36 spoligotype families (16,17). Multiple isolates were available from 265 patients (11.7%), resulting in serial isolate sets of 2-6 isolates, which had been sampled at intervals of 3 days to 11 months (N = 632).

Hypervariable Loci Optimization
We selected 16 previously characterized MTBC isolates to cover the complete range of repeat sizes at control loci MIRU 26 and exact tandem repeat (ETR)-B and experimental hypervariable loci VNTRs 1982 and 3232 (except 0 repeats for the locus 3232). For each of the 16 extracts, four 10-µL PCRs were conducted for each of the primer mixes in duplicate. Of these 4 reactions, the first was performed as described previously with BIOTAQ polymerase (Bioline, London, UK); here and throughout, each named enzyme was used with the buffer recommended and supplied by its manufacturer. Three other sets of PCRs were conducted under different amplification conditions (1).

Method 1
Diamond DNA polymerase (Bioline) was used (9). The PCR amplification cycle was 3 min at 95°C, followed by 35 cycles of 30 s at 95°C, 30 s at 60°C, and 2 min at 72°C, and 1 final cycle of 5 min at 72°C (2).

Method 3
HotStartTaq Plus DNA polymerase (QIAGEN) was used. The PCR mixture was the same as in method 2, and the amplification cycle was the same, except that the initial 95°C activation time was reduced to 5 min.
We manually calculated the number of repeats within each PCR product by resolving 4 µL of each product on a 1.2% (wt/vol) agarose gel (Agarose LE Analytical grade; Promega, Southampton, UK) against a 2,000-bp HyperLadder II standard (Bioline). The number of repeats at each locus also was calculated by sizing in a denaturing capillary electrophoresis system using a CEQ 8000 instrument with a DNA Size Standard 600 (Beckman Coulter, High Wycombe, UK) and MapMarker D1 labeled 640-1000 (BioVentures, Inc., Murfreesboro, TN, USA) because fragments were expected to be >600 bp. Three parameter sets (Table 1) were used to analyze all fragments. The parameters examined were capillary temperature (60°C for methods 1 and 2 and 50°C for method 3), denaturation time (120 s for method 1 and 180 s for methods 2 and 3), and separation time (60 min for methods 1 and 2 and 70 min for method 3). Fragment data traces were automatically analyzed by using the scheme shown in Table 1. For locus 3232, we accounted for offset values (i.e., the difference between actual sizes of PCR fragments and apparent sizes indicated by electrophoresis) when calculating number of repeats in Table 1.
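The conversion from a sized fragment to a repeat number, including the offset correction mentioned for locus 3232, can be illustrated with a minimal Python sketch. The function name, the sign convention (offset = actual size minus apparent size), and all numeric values are assumptions chosen for illustration; they are not the locus-specific flank lengths, repeat lengths, or offsets used in this study.

```python
def repeats_from_size(apparent_bp, flank_bp, repeat_bp, offset_bp=0.0):
    """Estimate tandem-repeat copy number from a sized PCR fragment.

    apparent_bp: fragment size reported by electrophoresis
    flank_bp:    combined length of flanking sequence, primers included
                 (hypothetical value supplied by the caller)
    repeat_bp:   length of one repeat unit (hypothetical value)
    offset_bp:   actual size minus apparent size, determined empirically
    """
    actual_bp = apparent_bp + offset_bp          # correct the apparent size
    copies = (actual_bp - flank_bp) / repeat_bp  # fractional copy number
    if copies < -0.5:
        raise ValueError("fragment is shorter than the flanking sequence")
    return max(0, round(copies))                 # nearest whole repeat


# Illustrative values only: a 30-bp offset changes the call from 5 to 6.
print(repeats_from_size(470, flank_bp=200, repeat_bp=50))
print(repeats_from_size(470, flank_bp=200, repeat_bp=50, offset_bp=30))
```

The example shows why an uncorrected offset matters: once the offset approaches half a repeat unit, the rounded copy number shifts by one.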

Assessing Stability and Reproducibility of VNTR Loci
All isolates were grouped into 265 sets of serial isolates (2-6 isolates each) and typed at all 22 loci. Primer sequences for all loci were as described previously (3,9,20,21). PCR was set up by using BIOTAQ polymerase for amplifying 12 MIRU and 3 ETR loci and Diamond polymerase for the additional 7 VNTR loci. Capillary electrophoresis was performed by using the parameters described in method 1.

Optimization of Hypervariable Loci
We evaluated factors that potentially affect the reproducibility of hypervariable VNTR loci by using various PCR and capillary and manual electrophoresis separation conditions as described in the Materials and Methods. The ability to correctly amplify different VNTR loci depended on the enzyme used (Table 2); all polymerases efficiently amplified MIRU 26 and ETR-B, as indicated by the presence of PCR fragments on agarose gels and capillary electrophoresis peaks. However, locus VNTR 3232 was amplified effectively only with Bioline Diamond (15/16 strains, 93.8%). Although all polymerases except Bioline BIOTAQ were able to amplify DNA at locus VNTR 1982, longer fragments were amplified more efficiently by QIAGEN and Bioline Diamond polymerases. Therefore, Diamond polymerase was selected for the amplification of additional VNTR loci.
We assessed 3 methods for capillary electrophoresis. For each locus, apparent fragment sizes were plotted against expected fragment sizes for each method (Figure 1).
MIRU 26 fragment sizes were as expected for all allelic variants (except for the variant with 2 repeats) when BIOTAQ and Diamond polymerases were used, but sizes were larger than expected with QIAGEN polymerases. The smaller ETR-B fragments with 1 and 2 repeats all gave expected sizes with methods 1 and 2 but were less than expected with method 3 (where the capillary temperature was decreased). These results did not affect overall interpretation. For the higher number of repeats (4-6 repeats), all polymerases generated fragments that, when analyzed by using method 3, gave apparent sizes lower than expected. In some cases, this result affected the interpretation. The apparent sizes of VNTR 1982 fragments were all similar to the expected values independent of the polymerase used and the method used for capillary electrophoresis.

Serial Isolates
Amplification was performed by using BIOTAQ polymerase for 12 MIRU and 3 ETR loci and Diamond polymerase for 7 VNTR loci with the optimized parameters in method 1. Analysis was blinded. No disagreements occurred in the interpretation of VNTR repeat numbers among isolates in a set. In a proportion of isolates (N = 124), genotyping results were validated by using both capillary electrophoresis and manual electrophoresis for PCR fragment separation, and again, no discrepancies were found between VNTR loci copy numbers in strains isolated from the same patient at different time points (Figure 2).
Table notes: MIRU, mycobacterial interspersed repetitive unit; ETR, exact tandem repeat; VNTR, variable number tandem repeat. Parameter sets refer to Table 1. Numbers in parentheses represent the number of extracts whose calculated number of repeats was higher or lower than the expected value produced by the standard procedure (method 1).
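The serial-isolate comparison amounts to checking that every isolate from a given patient carries the same multilocus profile. A minimal Python sketch of that check follows; the function name, the data layout, and the toy records are assumptions for illustration and do not reproduce the study data.

```python
from collections import defaultdict


def discordant_sets(isolates):
    """Return patient IDs whose serial isolates do not share one profile.

    isolates: iterable of (patient_id, profile) pairs, where profile is a
    sequence of repeat counts, one per locus, in a fixed locus order.
    """
    profiles_by_patient = defaultdict(set)
    for patient_id, profile in isolates:
        profiles_by_patient[patient_id].add(tuple(profile))
    # A stable locus set gives exactly one distinct profile per patient.
    return sorted(pid for pid, profiles in profiles_by_patient.items()
                  if len(profiles) > 1)


# Toy data (invented): patient "P2" differs at the last locus.
toy = [
    ("P1", (2, 5, 3)), ("P1", (2, 5, 3)),
    ("P2", (4, 1, 7)), ("P2", (4, 1, 8)),
]
print(discordant_sets(toy))
```

An empty result corresponds to the outcome reported here, in which no set of serial isolates showed a discrepancy.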

Population Genotyping in Metropolitan Setting with 2 Panels of VNTR Loci
A total of 2,261 MTBC isolates circulating in London with known spoligotypes were genotyped by using a defined set of 15 loci (12 MIRU and 3 ETR); all known spoligotyping families were represented in the test population (online Technical Appendix, available from www.cdc.gov/EID/content/15/10/1609-Techapp.pdf). Complete 15-loci profiles were obtained for 2,046 strains (90.5% of all strains). Data for the remaining profiles were incomplete for ≥1 locus.
Overall PCR failure rate was 1.6%, with the highest number of failures (n = 72) at locus ETR-A and the lowest number of failures (n = 4) at locus ETR-C. When PCR failed, DNA was reextracted from original cultures, and genotyping was attempted again. If the second attempt was unsuccessful, the results for the locus were marked as missing.
Genotyping of MTBC isolates by using 15 MIRU-ETR loci yielded 1,036 unique profiles and 235 clusters containing 2-53 isolates.
The discriminative power of VNTR loci also varied among spoligotype families. The mean 15 MIRU-ETR Hunter-Gaston discriminatory index (HGDI) value for the Beijing family was low (0.163), which indicates that this family is relatively homogeneous, even within the diverse London population setting. Notably, mean 15 MIRU-ETR HGDI values for genetic families within the Euro-American lineage (T, Haarlem, S, X, Latin American-Mediterranean) were generally higher (0.307-0.378) than those for Beijing and Central Asian (CAS) (0.235). Within spoligotype families, the additional 7 VNTR loci increased variability in all cases, except for M. bovis. The highest HGDI values were seen in the Latin American-Mediterranean family with locus 2163B; in Beijing, Haarlem, and M. africanum with VNTR 3232; in East African-Indian with VNTR 2163A; in X with VNTR 1982; in T with VNTR 3336; and in CAS with VNTR 4052. Within the East African-Indian family, the hypervariable locus VNTR 3232 varied little, with 93.7% of isolates having a single copy. A small proportion of strains (Table 4) analyzed by using more discriminative loci, including VNTR 3232, 1982, 2163A, and 3336, generated PCR products that were too large for automated analysis but were resolved manually.
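For readers unfamiliar with the index, the HGDI (Hunter-Gaston discriminatory index) for a locus is the probability that two isolates drawn at random without replacement carry different alleles: D = 1 − [Σ n_j(n_j − 1)] / [N(N − 1)], where n_j is the number of isolates with allele j and N is the total. The Python sketch below is a generic implementation of that formula; the function name and the toy allele lists are assumptions, and how missing alleles were handled in this study is not stated here.

```python
from collections import Counter


def hgdi(alleles):
    """Hunter-Gaston discriminatory index for one locus (or one typing method).

    alleles: list of allele or type labels, one per isolate.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(alleles).values()
    # Probability that two isolates sampled without replacement match.
    p_same = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1.0 - p_same


# Toy examples (invented): a monomorphic locus gives 0, an all-distinct one gives 1.
print(hgdi([1, 1, 1, 1]))
print(hgdi([1, 2, 3, 4]))
print(round(hgdi([1, 1, 2, 2]), 3))
```

A mean HGDI for a family, as quoted above, would then be the average of the per-locus values computed on that family's isolates.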

Discussion
Polymorphisms in rapidly evolving repetitive sequences, such as minisatellite VNTR, are a valuable tool for prospective epidemiologic analyses and provide a high degree of discrimination in situations in which few a priori epidemiologic data are available. In this population-based study, we genotyped 2,261 individual MTBC isolates obtained from patients residing in London by using 22 VNTR-MIRU loci.
Conflicting views on the use of hypervariable loci for typing have been reported, even when loci such as VNTR 3232 have been shown to have high discriminatory power (3,5,14). Some studies have demonstrated difficulty in amplification of multiple alleles, absence of PCR amplification products, varying data interpretation, and lack of reproducibility among laboratories (7). Similar problems were found with another potentially valuable hypervariable locus, VNTR 1982 (5,7). Therefore, we believed that by identifying the conditions that provided good, reproducible discrimination, we would be able to define the optimal conditions that would enable molecular epidemiologists to use VNTR 1982 and 3232. We addressed variability and reproducibility for these 2 loci using MIRU 26 and ETR-B as controls that give stable comparable results in both agarose gel and capillary electrophoresis and have been used previously in a multilaboratory comparative study (7). In all cases, identical data were produced for MIRU 26 and ETR-B irrespective of the DNA polymerase used. Amplification of VNTR 1982 and 3232 varied with different DNA polymerases, particularly when expected fragments were long.
The differing performances of polymerases for amplifying different loci can be explained by their varying properties. BIOTAQ polymerase is a basic Taq that can be used for a wide range of templates, whereas Diamond polymerase has been modified by a point mutation at the active site of the enzyme, enabling it to read through regions of secondary structure, microsatellites, and guanine cytosine-rich templates, such as those found in the M. tuberculosis genome. The QIAGEN polymerases are chemically modified polymerases with a high specificity similar to that of Diamond polymerase; thus they showed similar capabilities in amplifying VNTR 1982 and 3232. In addition, the buffer used with the QIAGEN polymerases is designed to increase the specificity of primer binding, making these polymerases suitable for dealing with complex genomic DNA.
Conditions that affect the denaturation of PCR products, and therefore their linearity before fragment sizing by electrophoresis, would be expected to influence apparent sizes of PCR fragments and copy number enumeration. We investigated the influence of DNA denaturation time and capillary separation temperature. As expected, we found that lowering the separation rate increased the discrimination of fragments >1,000 bp.
A marked difference was observed when the capillary temperature was decreased (method 3), which was independent of the polymerase used and locus investigated and demonstrated that separation conditions are critical for the correct interpretation of the VNTR typing results. In method 3, apparent fragment sizes were smaller and offset values were markedly larger, to the point that in some cases the calculated copy number was different from that expected.
Taking all the data together, we used BIOTAQ for amplifying MIRU and ETR loci, and Diamond polymerase for amplifying the extra 7 hypervariable VNTR loci, using the separation conditions detailed in method 1. We also demonstrated the reproducibility and stability of the extra 7 VNTR loci by comparing 22 MIRU-VNTR profiles from serial isolates. The resulting profiles of serial isolates from the same patients were identical, indicating that the conditions used for fragment amplification, detection, and analysis were ideal for typing of these loci and that these loci could be used for routine genotyping.
Clustering rates seen by using 15 MIRU-ETR loci far exceeded those previously reported when IS6110 RFLP was used in a London population study (22,23). We concluded that 15-MIRU-ETR genotyping was insufficiently discriminative and was producing so-called false clustering. This view was supported by the spoligotyping results, in which 38 (16%) of the 235 15 MIRU-ETR clusters contained isolates that belonged to ≥2 spoligotype families (Table 3).
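The cross-check against spoligotyping can be sketched in Python as follows: isolates with identical VNTR profiles are grouped into clusters, and clusters whose members span several spoligotype families are flagged as likely false clusters. The function names, the `min_families` threshold parameter, and the toy records are assumptions for illustration, not the study's analysis pipeline.

```python
from collections import defaultdict


def cluster_isolates(records):
    """Group isolates with identical VNTR profiles.

    records: iterable of (isolate_id, profile, spoligo_family).
    Returns a list of clusters, each a list of (isolate_id, family) pairs
    with at least 2 members; unique profiles are left out.
    """
    groups = defaultdict(list)
    for isolate_id, profile, family in records:
        groups[tuple(profile)].append((isolate_id, family))
    return [members for members in groups.values() if len(members) >= 2]


def mixed_family_clusters(clusters, min_families=2):
    """Clusters whose members belong to at least min_families families."""
    return [c for c in clusters
            if len({family for _, family in c}) >= min_families]


# Toy data (invented): one pure cluster, one mixed cluster, one unique profile.
toy = [
    ("a", (2, 5, 3), "Beijing"), ("b", (2, 5, 3), "Beijing"),
    ("c", (4, 1, 7), "CAS"), ("d", (4, 1, 7), "Haarlem"),
    ("e", (3, 3, 3), "T"),
]
clusters = cluster_isolates(toy)
print(len(clusters), len(mixed_family_clusters(clusters)))
```

Adding more discriminative loci lengthens each profile, so mixed clusters of this kind are expected to split into smaller clusters or unique profiles.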
Applying all 22 loci gave the lowest clustering rate (22.2%) in MTBC strains obtained over 1 year from a single metropolitan setting (London), a rate almost identical to the proportion established in previous studies conducted in London in 1993 and 1995-1997 and similar to previously reported rates in population-based studies in low-to-middle TB incidence settings where RFLP and PCR-based genotyping methods were used (11,24-26). These findings suggest, from the public health viewpoint, that TB transmission in London has remained stable over the past decade. Our study provides strong evidence that PCR-based methods, especially VNTR-MIRU, can replace IS6110 RFLP typing for prospective analysis and that 12 MIRU (27) and 15 MIRU-ETR loci panels alone are insufficiently discriminating for evaluation of TB transmission.
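Clustering rates are commonly computed in one of two ways in molecular epidemiology: the "n" method (all clustered isolates divided by all isolates) and the "n-1" method (one index case subtracted per cluster, as an estimate of recent transmission). Which convention underlies the 22.2% figure is not stated in this section, so the Python sketch below shows both; the function name and the toy numbers are assumptions for illustration.

```python
def clustering_rate(cluster_sizes, n_total, method="n-1"):
    """Proportion of isolates in clusters.

    cluster_sizes: list with the number of isolates in each cluster (>= 2).
    n_total:       total number of isolates typed, clustered or not.
    method:        "n"   -> clustered isolates / total
                   "n-1" -> (clustered isolates - number of clusters) / total
    """
    clustered = sum(cluster_sizes)
    if method == "n":
        return clustered / n_total
    if method == "n-1":
        return (clustered - len(cluster_sizes)) / n_total
    raise ValueError("method must be 'n' or 'n-1'")


# Toy numbers (invented): two clusters of 3 and 2 isolates among 10 typed.
print(clustering_rate([3, 2], 10, method="n"))
print(clustering_rate([3, 2], 10, method="n-1"))
```

Because the two conventions can differ substantially when clusters are small, rates from different studies are comparable only when the same method is used.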
The recently proposed VNTR panel (3,5,7,11) provides similar degrees of discrimination (comparable to that achieved by IS6110 RFLP), although discrimination of individual VNTR loci is not equal for different MTBC genetic families (13). Inclusion of highly polymorphic VNTR loci effectively differentiates strains within highly conserved groups and is vital for prospective genotyping. Our study demonstrated that even in settings of low TB incidence and relatively low TB transmission rates, TB families, such as Beijing and CAS, remain more conserved than others, and hypervariable loci (e.g., VNTR 3232, 2163A, 4052) provide much higher discrimination than MIRU and ETR loci either alone or in combination.
Our current results agree with the preliminary results of our earlier studies about the applicability of hypervariable VNTR loci (VNTR 3232, VNTR 3336, VNTR 2163a, and VNTR 1982, in particular) and recent reports (28-30) demonstrating their effectiveness for discrimination among Beijing strains. This agreement suggests that these loci are discriminating and reproducible, especially where Beijing strains are dominant (e.g., China, Russia, Baltic countries) (28), and should be included in standardized VNTR panels. They can be used successfully at multiple laboratories with consistent results, provided the conditions for proposed reaction and PCR fragment separation are adhered to and specific DNA polymerases are used.