Identifying and Quantifying Genotypes in Polyclonal Infections due to Single Species

The combination of real-time PCR and capillary electrophoresis permits the rapid identification and quantification of pathogen genotypes.

Simultaneous infection with multiple pathogens of the same species occurs in human patients with HIV, hepatitis C, Epstein-Barr virus, dengue, tuberculosis, and malaria (1-7). However, available laboratory methods do not distinguish among pathogen genotypes in samples from individual patients. They do not permit the identification or quantitation of genotypes in samples with multiple pathogens of the same species, or the identification of size polymorphisms produced by insertions and deletions.
Conventional polymerase chain reaction (PCR) with agarose gel electrophoresis permits the identification of pathogen allotypes (strains) in human blood and tissue and an assessment of the sizes of their amplicons but does not define allotype copy number or genotype copy number. Real-time PCR permits identification and quantitation of allotypes (8,9) but does not permit the identification of genotypes within allotypes.
From the epidemiologic perspective, a molecular strategy to define the allotypes and genotypes of human pathogens and their copy numbers would permit one to study the dynamics of simultaneous infection with multiple genotypes in ways that have been impossible. For example, this knowledge could be used to identify novel genotypes (size polymorphisms) resulting from insertions and deletions at polymorphic loci.
From the bioterrorism perspective, a strategy to identify size polymorphisms (insertions and deletions) in critical regions of pathogen genomes would be of immense value. This information could be used to test for deletions in regulatory (suppressor) regions and for the insertion of new genes in regions controlled by strong promoters. Available methods do not permit the rapid identification of size polymorphisms within allotypes or the quantitation of individual pathogen genotypes.
To address this challenge, we used real-time PCR and capillary electrophoresis. Real-time PCR with allotype-specific primers permits one to define the allotypes present and their copy number (8,9). Capillary electrophoresis permits one to define individual genotypes within allotypes and genotype copy number. The combination of real-time PCR and capillary electrophoresis also permits the identification of insertions and deletions in potentially critical regions of pathogen genomes.
Before obtaining informed consent from the participants (after IRB reviews and approvals), the protocol was reviewed with the chief and elders of the village and the women's council. After those additional reviews and approvals, informed consent was obtained from persons >18 years of age and from the parents and guardians of children <17 years of age before obtaining blood samples.
DNA was isolated from filter paper blots and blood samples by using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). Control template DNA was obtained from cultured parasites (10) by using the QIAamp DNA Blood Mini Kit (Qiagen). Cloned isolates used as controls for the 4 known allotypes of the polymorphic block 2 region of merozoite surface protein 1 (msp1) were Indochina I/CDC for MAD20, Haiti 135 for K1, 7G8 for RO33, and OK/JC 5 for MAD20/RO33 hybrid allotype parasites (11-13). DNA template concentrations were estimated from standard curves by plotting the fluorescence of 5 DNA standards with concentrations of 1 µg/mL (1,000 ng/mL), 100 ng/mL, 10 ng/mL, 1 ng/mL, and 0 ng/mL (blank or negative control) vs. the log10 of template DNA concentration by using PicoGreen dye (Molecular Probes, Eugene, OR, USA) with the VersaFluor fluorometer (Bio-Rad, Hercules, CA, USA).

Primer and Probe Design for Real-time PCR
Primers and probes were designed by using Beacon Designer software, version 2.03 (Premier Biosoft International, Palo Alto, CA, USA; available from http://www.premierbiosoft.com/molecular_beacons/), in combination with manual manipulation. The primers and probes used to amplify the K1, MAD20, RO33, and hybrid (MAD20/RO33) allotypes of the block 2 region of msp1 and the internal control gene (erythrocyte-binding antigen 175, eba175) in Plasmodium falciparum are listed in Table 1, together with information on fluorophores, melting temperatures, final reactant concentrations, and the observed ranges of amplicon sizes. Unlabeled primers and fluorophore-labeled probes were obtained from Integrated DNA Technologies (Coralville, IA, USA), LUX-labeled primers from Invitrogen (Carlsbad, CA, USA), and the Cy5-labeled probe for eba175 from Biosearch Technologies (Novato, CA, USA).

Real-time PCR Amplification of Pathogen DNA
Real-time PCR was performed with the iCycler (Bio-Rad) by using the amplification conditions described below (Table 1) with a 2× multiplex-specific master mix (Qiagen) and 3-µL aliquots of template DNA. Reaction mixtures were supplemented with 2.5 U recombinant Taq polymerase (Invitrogen) and subjected to an initial denaturation at 95°C for 15 min, followed by 45 cycles of denaturation at 94°C for 30 s, annealing at 53°C for 90 s, and extension at 72°C for 90 s. Fluorescence measurements were obtained during the annealing step with TaqMan probes (K1 and MAD20/RO33 hybrid allotypes, and the eba175 internal control), and during the elongation step with LUX primers (MAD20 and RO33 allotypes). Each sample was tested in quadruplicate. Two replicates were used to define the allotypes present and their copy number with the iCycler; the other 2 replicates were removed from the iCycler during the exponential (logarithmic) stage of amplification for capillary electrophoresis to define the genotypes present and genotype copy number.

Template Specificity and Optimization of Multiplex PCR
Template specificity was tested for each primer-probe set with the 4 control template DNAs (from Indochina I, Haiti 135, 7G8, and hybrid MAD20/RO33 parasites). Amplicons of the expected sizes were obtained with matched template and primer-probe sets; no amplicons were obtained with unmatched template and primer-probe sets. Negative controls likewise yielded no amplicons. DNA extracted from specimens without parasites was used to control for primer-primer and primer-probe interactions, and other potential causes of false-positive PCR results. After establishing specificity, the reaction conditions were optimized by defining the efficiency of each primer-probe set using a series of 10-fold dilutions of each control template DNA. These efficiencies were then compared with the efficiencies obtained in the multiplex PCR, and the final primer and probe concentrations were adjusted so that the efficiencies of the multiplex PCR matched those of the individual PCRs.
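The efficiency estimate from a 10-fold dilution series can be sketched as follows. This is a minimal illustration, not the iCycler software's implementation: it assumes the commonly used relationship efficiency = 10^(-1/slope) - 1, where the slope comes from a least-squares fit of threshold cycle against log10 of relative template amount. The function names and the dilution data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(log10_template, ct_values):
    """Efficiency as a fraction (1.0 = perfect doubling per cycle).

    Assumes efficiency = 10 ** (-1 / slope) - 1, with the slope taken
    from the regression of threshold cycle on log10(template amount).
    """
    slope, _ = fit_line(log10_template, ct_values)
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical 10-fold dilution series (values invented for illustration).
log10_template = [0, -1, -2, -3, -4]
ct = [18.0, 21.32, 24.64, 27.96, 31.28]

eff = amplification_efficiency(log10_template, ct)
print(f"estimated efficiency: {eff:.1%}")
```

Running the same calculation for each primer-probe set, alone and in multiplex, gives the pairs of efficiencies that are compared when primer and probe concentrations are adjusted.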

PCR Amplification and Allotype Quantitation
Standard curves were generated by using 10-fold dilutions of template DNA (3-µL aliquots) from each of the control parasites to estimate the initial copy numbers of the 4 allotypes in each sample. The standard curves (regression lines) for each allotype, the resulting reaction efficiencies, threshold cycle (CT) values, and estimates of initial copy numbers were calculated by using the iCycler Software (Bio-Rad).

Capillary Electrophoresis and Genotype Quantitation
To estimate amplicon size (base pairs) and copy number for each genotype, 2 replicates of each sample were removed from the iCycler during the logarithmic amplification stage, as determined by real-time relative fluorescence unit (RFU) data, and the reactions were stopped with 0.5 mol/L EDTA. For each reaction, two 1-µL aliquots of the real-time PCR reaction mixture were loaded onto a DNA 500 Lab Chip (Agilent Technologies, Waldbronn, Germany) and run on the Bioanalyzer 2100 (Agilent Technologies), according to the manufacturer's instructions. With capillary electrophoresis using the DNA 500 Lab Chip, a linear relationship was shown between amplicon size and elution time (r² > 0.998, p<0.001 for amplicons from 25 bp to 400 bp; data not shown). The copy numbers for each genotype were calculated from the molarities provided by the Agilent software. These calculations are based on the observation that the concentration of each amplicon is proportional to its peak area on the electropherogram.
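The conversion from a reported molarity to a number of amplicon copies is a unit conversion through the Avogadro constant. The sketch below illustrates that arithmetic only; the function name is hypothetical, and it makes no claim about how the Agilent software derives molarity from peak area.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_from_molarity(conc_nmol_per_l, volume_ul):
    """Number of dsDNA amplicon molecules in a given volume.

    conc_nmol_per_l: amplicon concentration in nmol/L
    volume_ul:       volume considered, in microliters
    """
    moles = (conc_nmol_per_l * 1e-9) * (volume_ul * 1e-6)
    return moles * AVOGADRO


# Example: a peak reported at 10 nmol/L in a 1-µL aliquot.
copies = copies_from_molarity(10.0, 1.0)
print(f"{copies:.3e} copies")
```

Because the result scales linearly with both concentration and volume, the same function applies to any aliquot size, provided the molarity refers to the mixture from which the aliquot was taken.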

Real-time PCR To Identify Pathogen Allotypes
Real-time PCR with allotype-specific primers permits the amplification of individual allotypes in specimens from infected human subjects (first 3 columns of Table 2, and Figure 1, panel A). Based on control specimens containing only 1 allotype and on negative controls, this strategy is specific. Based on filter paper blots for specimens containing >100 parasites/µL, it is sensitive. However, real-time PCR with allotype-specific primers does not distinguish among (identify) genotypes within allotypes (Figure 1). This is because real-time PCR cannot identify size polymorphisms, whether they result from natural events such as the spontaneous addition and deletion of tripeptide repeats in malaria parasites (14) or deliberately malevolent manipulation of microorganisms in the laboratory as potential agents of bioterrorism.

Optimization of Real-time PCR
Estimates of efficiency (the degree to which replication increases the number of amplicons by the expected 2-fold increment [100% efficiency] in each cycle) indicate that the efficiency of the real-time PCR assays performed in these studies was excellent (90%-100%). In addition, efficiencies of reactions in multiplex did not differ significantly from individual reaction efficiencies or from other reaction efficiencies in multiplex.
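The definition of efficiency used here can be written out explicitly. In the following sketch, the symbols N_0 (initial copy number), N_n (copy number after n cycles), and E (efficiency expressed as a fraction) are introduced for illustration and are not part of the original methods; the model is the idealized one for the exponential phase only.

```latex
N_n = N_0\,(1 + E)^{n}, \qquad 0 \le E \le 1

\frac{N_n\big|_{E = 0.9}}{N_n\big|_{E = 1}}
  = \left(\frac{1.9}{2}\right)^{n}
  \approx 0.21 \quad \text{for } n = 30
```

Under this model, a reaction at the lower end of the reported range (90%) would yield roughly one fifth of the product of a perfectly efficient reaction after 30 cycles, which suggests why matching efficiencies across primer-probe sets matters when copy numbers are compared.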

Reproducibility of Copy Number (Threshold Cycle, CT) Estimates
Data for estimates of copy number were based on the amplification of block 2 of msp1 from P. falciparum malaria parasites (Table 2). The reproducibility of CT estimates was examined separately for exemplary 93- and 154-bp amplicons, and found to have means of 27.99 and 28.62 cycles, respectively, with standard deviations of 0.34 and 0.13 cycles (i.e., coefficients of variation [CVs] of 1.2% and 0.5% for these 2 amplicons, n = 10 for each).
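The CVs quoted above follow directly from the reported means and standard deviations (CV = standard deviation / mean × 100). A short check, using only the values stated in the text and a hypothetical helper function:

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage of the mean."""
    return 100.0 * sd / mean


# Means and standard deviations (in cycles) as reported in the text.
cv_93bp = coefficient_of_variation(27.99, 0.34)
cv_154bp = coefficient_of_variation(28.62, 0.13)

print(f"93-bp amplicon:  CV = {cv_93bp:.1f}%")   # 1.2%
print(f"154-bp amplicon: CV = {cv_154bp:.1f}%")  # 0.5%
```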

Capillary Electrophoresis To Identify Pathogen Genotypes
In contrast to real-time PCR, which identifies only allotypes, capillary electrophoresis identifies genotypes within allotypes (based on size polymorphisms) in samples from persons (Figure 1, panels B-D). Across participants (in groups of samples), this method permits one to identify the spectrum (range) of genotypes in the population (data for samples from 10 persons infected with P. falciparum are presented as an example in Figure 2 and Table 3). The reproducibility of capillary electrophoresis is sufficient to separate amplicons that differ by >5 bp. This conclusion was based on a comparison of amplicons containing 148

Reproducibility of Genotype Copy Number Estimates
Based on the electropherograms, the reproducibility of peak area measurements and estimates of genotype copy number was excellent. CVs varied from 0.13% to 0.45% for amplicon concentrations between 10 nmol/L and 80 nmol/L (n = 12 replicates at each of 4 template concentrations of a 95-bp amplicon from 10 nmol/L to 80 nmol/L, data not shown).

Real-time Fluorescence in Relation to Peak Area
The slopes of increasing fluorescence based on real-time PCR with the iCycler were indistinguishable from the increasing peak areas on the electropherogram (Figure 4, panels A and B, slopes of 0.2252 and 0.2223, p > 0.5). The similarity of these slopes (based on different parameters) indicates that increases in RFUs are directly proportional to increases in amplicon concentration (molarity). This result permits one to extrapolate from allotype copy number to genotype copy number based on peak area.
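One way to read this extrapolation is to apportion the allotype copy number among its genotypes in proportion to their peak areas. The sketch below assumes exactly that proportional rule, which is an interpretation rather than a formula given in the text; the function name, amplicon sizes, and peak areas are hypothetical.

```python
def genotype_copy_numbers(allotype_copies, peak_areas):
    """Split an allotype copy number among genotypes by peak area.

    allotype_copies: copy number of the allotype from real-time PCR
    peak_areas:      dict mapping amplicon size (bp) -> peak area
    Assumes each genotype's share equals its fraction of total peak area.
    """
    total_area = sum(peak_areas.values())
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    return {
        size_bp: allotype_copies * area / total_area
        for size_bp, area in peak_areas.items()
    }


# Hypothetical example: one allotype resolved into two genotypes.
result = genotype_copy_numbers(12000, {150: 30.0, 159: 10.0})
print(result)
```

The genotype estimates always sum to the allotype copy number, so any error in the real-time PCR estimate carries through unchanged to each genotype.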

Field Samples from Persons with Polyclonal Infections
Three of the 4 known P. falciparum allotypes have size polymorphisms within block 2 of msp1. K1, MAD20, and hybrid MAD20/RO33 allotype parasites have size polymorphisms because they contain tripeptide repeats within block 2 of msp1; RO33 does not have size polymorphisms because it does not have tripeptide repeats (15). These size polymorphisms are evident for K1 in a sample from a single person (Table 2) and for K1, MAD20, and hybrid MAD20/RO33 parasites in samples from 10 persons (Figure 2 and Table 3).

Simultaneous Infection and Detection of Genetically Modified Organisms
Studies by a number of investigators have shown that simultaneous infection with multiple pathogens (genotypes) of the same species occurs in patients with HIV, hepatitis C, Epstein-Barr virus, dengue, tuberculosis, and malaria (1-7) and have identified deletions and insertions (genotypes) due to tandem repeats in cytomegalovirus (15). Because pathogen genotypes based on insertions and deletions are common, the strategy reported here is potentially applicable to all microbial human pathogens. This complexity of infection is likely to be important in the pathogenesis and transmission of many emerging infectious diseases. For example, epidemiologically and clinically meaningful events such as severe disease and antimicrobial drug resistance are likely to be driven by competition among pathogen genotypes in vivo (by the virulence and antimicrobial susceptibility/resistance determinants of the predominant genotypes) and may also affect transmission.
In addition to block 2 of msp1 in P. falciparum, other examples of natural sequence variation detectable by using real-time PCR and capillary electrophoresis (variations >5 nucleotides/bp) include duplications and deletions in the 3′ noncoding regions (NCRs) of dengue (16) and yellow fever (17) and insertions and deletions in the env gene of HIV (18,19). For Mycobacterium tuberculosis, examples include variation in the tandem repeats within IS6110 (20), variable numbers of tandem repeats (VNTRs) (21), and genomic deletions (22). For select agents, examples include variation in VNTRs (multiple locus VNTR analysis) in Bacillus anthracis (23,24), similar differences in Yersinia pestis (25), and insertions, deletions, and variation in the inverted terminal repeat region and the coding region of the smallpox virus (26,27) (Table 4).
In addition, disease-producing agents may be modified in the laboratory to increase their virulence or to introduce antimicrobial drug resistance for bioterrorist events (28-30). However, available methods are inadequate to rapidly diagnose and quantitate simultaneous infection with multiple pathogens (genotypes) of the same species or identify insertions and deletions in critical regions of pathogen genomes. The results reported here provide a strategy to address these issues based on real-time PCR and capillary electrophoresis.

Real-time PCR To Identify and Quantitate Pathogen Allotypes
As demonstrated here and elsewhere, allotype-specific primers permit one to identify the pathogen allotypes in a specimen (1-7). In addition, real-time PCR may be used to quantitate the numbers of microorganisms in a specimen. Because the relationship between the number of cycles necessary to reach the fluorescence threshold (CT) and the log10 of initial copy number is linear, real-time PCR can be used to estimate the initial amount of template DNA (copy number) (8,9).
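The linear relationship between CT and log10 copy number can be inverted to estimate the initial copy number of an unknown sample. The following is a minimal sketch under that assumption; the function names and the standard-curve values are hypothetical and do not reproduce the iCycler software or the data in this study.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(log10_copies, ct_values)
    )
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_initial_copies(ct, slope, intercept):
    """Invert the standard curve to get copy number from an observed CT."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: 10^2 to 10^6 copies per reaction.
standards_log10 = [2, 3, 4, 5, 6]
standards_ct = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(standards_log10, standards_ct)
unknown_copies = estimate_initial_copies(28.25, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"estimated copies={unknown_copies:.0f}")
```

Estimates are only meaningful for CT values that fall within the range spanned by the standards; extrapolating beyond the highest or lowest standard assumes the linearity continues to hold.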

Capillary Electrophoresis To Identify and Quantitate Pathogen Genotypes
In contrast to real-time PCR (in which all amplicons [genotypes] are examined together in the same well once each cycle), capillary electrophoresis detects the amplicons from each genotype as they pass a fluorescence or absorbance detector. This is accomplished by separating dsDNA amplicons based on their size (base pairs) by using a charged electrical field to drive the dsDNA polyanions to the detector at the anode. Because this separation is driven by the ratio of the electrical driving force to the mass of each amplicon, the rate of movement to the anode is inversely proportional to mass (size in base pairs). Thus, smaller amplicons travel faster and have shorter retention times on the electropherogram (Figure 1, panels B-D).
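Because smaller amplicons reach the detector sooner, the migration time of an unknown peak can be converted to a size by reference to a ladder of fragments of known size. The sketch below uses linear interpolation between the two flanking ladder peaks, a simple technique chosen here for illustration; it is not a description of the Bioanalyzer software. The function name, ladder sizes, and migration times are hypothetical.

```python
from bisect import bisect_right


def size_from_migration_time(t, ladder):
    """Estimate amplicon size (bp) from migration time.

    ladder: list of (migration_time_s, size_bp) tuples sorted by time,
            with strictly increasing times.
    Interpolates linearly between the two ladder peaks that bracket t.
    """
    times = [point[0] for point in ladder]
    if not times[0] <= t <= times[-1]:
        raise ValueError("migration time is outside the ladder range")
    # Index of the ladder peak just above t (clamped to the last peak).
    i = min(bisect_right(times, t), len(ladder) - 1)
    (t0, s0), (t1, s1) = ladder[i - 1], ladder[i]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


# Hypothetical ladder covering 25 bp to 400 bp.
ladder = [
    (42.5, 25), (45.0, 50), (50.0, 100), (55.0, 150),
    (60.0, 200), (70.0, 300), (80.0, 400),
]

print(size_from_migration_time(52.0, ladder))
```

Refusing to extrapolate outside the ladder keeps size estimates within the range over which the size-time relationship has actually been calibrated.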

Detection of Artificial-size Polymorphisms
The results reported here demonstrate that capillary electrophoresis is sufficiently sensitive to detect insertions and deletions >5 bp in size. This finding means that capillary electrophoresis is more than sufficiently sensitive to detect biologically significant insertions and deletions in genetically modified organisms (23-30). Thus, it provides an open-ended strategy to test for genetically modified organisms, by testing for size polymorphisms at critically important sites in the pathogen genome, e.g., at sites related to pathogenicity (virulence) or antimicrobial resistance.
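The screening logic implied here can be sketched as a comparison of an observed amplicon size with the size expected from a reference sequence, using the >5-bp resolution reported above as the detection threshold. The function name, the default threshold parameter, and the example sizes are hypothetical; the sketch assumes that a reference size is known for the locus being tested.

```python
def classify_size_change(observed_bp, reference_bp, min_detectable_bp=5):
    """Classify an amplicon relative to its expected (reference) size.

    Differences of min_detectable_bp or less are treated as unresolved,
    reflecting the reported ability to separate amplicons only when
    they differ by more than 5 bp.
    """
    difference = observed_bp - reference_bp
    if abs(difference) <= min_detectable_bp:
        return "indistinguishable from reference"
    return "insertion" if difference > 0 else "deletion"


# Hypothetical locus with an expected amplicon of 150 bp.
for observed in (140, 153, 160):
    print(observed, classify_size_change(observed, 150))
```

A result of "indistinguishable from reference" does not exclude modification; it only indicates that no change larger than the detection threshold was observed at that locus.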

Advantages, Limitations, and Potential Pitfalls
The advantages of real-time PCR followed by capillary electrophoresis are that it can be performed without waiting days or weeks for cultures to grow and that it detects pathogens that do not grow in conventional culture media or under standard conditions (31). In addition, as noted above (14), sequence information is enormously helpful in selecting loci within the genome likely to have insertions and deletions and interpreting the results obtained. Although insertions, deletions, and single nucleotide polymorphisms (SNPs) produce detectable changes in melting curves (32,33), melting curves are qualitative rather than quantitative. In addition, melting curves alone cannot identify specific insertions, deletions, or quasispecies (SNPs) without the addition of probes for the affected target region of the genome or the use of PCR (34,35). Finally, because the strategy reported here tests for size polymorphisms, it does not require prior knowledge of the specific sequences that may have been introduced into (or deleted from) the pathogen genome to identify genetically modified organisms.
However, this strategy does have 3 limitations. First, sequences identical to (or cross-reactive with) host sequences cannot be used as targets because blood and tissue specimens are inevitably contaminated with host DNA (this issue can be resolved by searching the GenBank database). Second, the threshold of detection for genetically modified organisms is the addition (or removal) of sequences >5 bp (based on the sensitivity of capillary electrophoresis), i.e., point mutations (SNPs, quasispecies) (36-38) cannot be detected with this strategy. As a result, this method is likely to be of greater value for organisms with dsDNA genomes such as bacteria, eukaryotic parasites, and dsDNA viruses (in which quasispecies are less common because of more accurate replication) than for organisms with single negative-stranded RNA genomes (in which quasispecies are more common because their replication depends on the error-prone reverse transcriptase; e.g., HIV, hepatitis C, hepatitis B) (39,40). Third, capillary electrophoresis may need to be performed separately for each allotype to avoid confusion between amplicons of similar size from different allotypes (Figures 1-3).

Conclusions
The strategy reported here can be used for epidemiologic studies of simultaneous infection with multiple pathogens (genotypes) of the same species in emerging infectious diseases and for the rapid identification of select agents that have been genetically modified to increase their virulence or antimicrobial drug resistance.