Sequencing of 16S rRNA Gene: A Rapid Tool for Identification of Bacillus anthracis

In a bioterrorism event, a tool is needed to rapidly differentiate Bacillus anthracis from other closely related spore-forming Bacillus species. During the recent outbreak of bioterrorism-associated anthrax, we sequenced the 16S rRNA genes from these species to evaluate the potential of 16S rRNA gene sequencing as a diagnostic tool. We found eight distinct 16S types among all 107 16S rRNA gene sequences that differed from each other at 1 to 8 positions (0.06% to 0.5%). All 86 B. anthracis had an identical 16S gene sequence, designated type 6; 16S type 10 was seen in all B. thuringiensis strains; six other 16S types were found among the 10 B. cereus strains. This report describes the first demonstration of an exclusive association of a distinct 16S sequence with B. anthracis. Consequently, we were able to rapidly identify suspected isolates and to detect the B. anthracis 16S rRNA gene directly from culture-negative clinical specimens from seven patients with laboratory-confirmed anthrax.
The gram-positive, rod-shaped, and spore-forming bacterium Bacillus anthracis is the cause of the acute and often lethal disease anthrax. Phenotypic characteristics commonly used to differentiate B. anthracis from the closely related B. cereus and B. thuringiensis, such as susceptibility to β-lactam antibiotics, lack of motility, lack of hemolysis on sheep blood agar (SBA) plates, and susceptibility to γ-phage lysis, may vary among different Bacillus species strains, hampering their identification and differentiation. Phenotypically and genotypically, B. thuringiensis can be differentiated from B. cereus by the presence of the CRY crystal protein and plasmid-encoded cry genes, but if this plasmid were lost, B. thuringiensis could no longer be distinguished from B. cereus (1). The sequence of the 16S rRNA gene has been widely used as a molecular clock to estimate relationships among bacteria (phylogeny), but more recently it has also become important as a means to identify an unknown bacterium to the genus or species level. The 16S rRNA gene sequences of B. anthracis, B. cereus, and B. thuringiensis have high levels of sequence similarity (>99%) that support their close relationships shown by DNA hybridization (2-7). A limited number of 16S rRNA sequences of B. anthracis (7 sequences), B. cereus (34 sequences), and B. thuringiensis (16 sequences) have been available at GenBank. Although those sequences are of different lengths and qualities, in complementary regions they differ from each other by no more than a few nucleotides. Therefore, this minimal level of diversity seen in the 16S rRNA of B. anthracis, B. cereus, and B. thuringiensis was thought to be an obstacle for using 16S rRNA gene sequencing to identify and differentiate these three species. The bioterrorism events of October 2001 prompted us to evaluate several new molecular approaches to rapidly identify B. anthracis.
We determined the entire 16S rRNA sequences in a large number of representative strains of B. anthracis, B. cereus, and B. thuringiensis to evaluate the potential of 16S rRNA sequencing not only to rapidly identify B. anthracis in culture, but also to detect B. anthracis directly in clinical specimens.

Bacterial Strains
A total of 107 strains were included in this study. Of 86 B. anthracis isolates analyzed (Table 1), 18 were selected to represent a wide range of temporal, geographic (16 countries), and source diversity (soil, animals, or humans). Fourteen reference and standard strains, such as the Vollum, Ames, Pasteur, New Hampshire, V770, and Sterne strains, were also included. The remaining 54 strains were isolated from October to December 2001 during the bioterrorism-associated anthrax outbreak in the United States. Ten B. cereus and 11 B. thuringiensis strains were also analyzed by 16S rRNA sequencing. All strains were identified by standard microbiologic procedures and according to the Laboratory Response Network diagnostic criteria (9,10).

BIOTERRORISM-RELATED ANTHRAX
were analyzed for 16S rRNA gene amplification and products sequenced.

Polymerase Chain Reaction (PCR)
A 1,686-bp fragment of DNA, including the 1,554-bp 16S rRNA gene, was amplified from all 107 Bacillus species strains by using primers 67F and 1671R (Table 2). For clinical samples, we used the initial DNA amplicon as a template in a nested PCR with a second set of internal primers, 23F and 136R (Table 1). Both sets of primers were designed from the B. anthracis genome sequence (http://www.tigr.org). The full-length size of the B. anthracis 16S rRNA gene (1,554 bp) was determined from an alignment of the 16S rRNA genes from Escherichia coli, Neisseria gonorrhoeae (GenBank accession nos. J01859 and X07714, respectively), and the 16S rRNA gene regions of the B. anthracis genome sequence (http://www.tigr.org). Whole cell suspensions or DNA extracts were used for PCR of isolates or clinical samples, respectively. For whole cell suspensions, a single colony from an SBA plate was resuspended in 200 µL of 10 mM Tris, pH 8.0. The suspension was put in a Millipore 0.22-µm filter unit (Millipore, Bedford, MA), heated at 95°C for 20 min, centrifuged at 8,000 rpm for 2 min, and then used for PCR. Each final PCR reaction (100 µL) contained 5 U of Expand DNA polymerase (Roche, Mannheim, Germany); 2 µL of whole cell suspension or DNA; 10 mM Tris-HCl (pH 8.0); 50 mM KCl; 1.5 mM MgCl2; 200 µM (each) dATP, dCTP, dGTP, and dTTP; and 0.4 µM of each primer. Reactions were first incubated for 5 min at 95°C. Then 35 cycles were performed as follows: 15 s at 94°C, 15 s at the annealing temperature of 52°C, and 1 min 30 s at 72°C. Reactions were then incubated at 72°C for another 5 min. The annealing temperature for the nested PCR was 50°C. PCR products were purified with the Qiaquick PCR purification kit (Qiagen).
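The cycling program and reagent concentrations above lend themselves to a quick arithmetic check. The following is a minimal sketch, not part of the published protocol: the function names are hypothetical, and the time estimate counts only the programmed hold times, ignoring the thermal cycler's ramp times between steps (an assumption).

```python
# Hold times taken from the protocol text; ramp times are ignored (assumption).
INITIAL_DENATURATION_S = 5 * 60          # 5 min at 95 C
CYCLES = 35
CYCLE_STEPS_S = {
    "denature_94C": 15,
    "anneal_52C": 15,
    "extend_72C": 90,                    # 1 min 30 s
}
FINAL_EXTENSION_S = 5 * 60               # 5 min at 72 C


def total_hold_time_seconds():
    """Sum of all programmed hold times for one PCR run."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def picomoles(concentration_uM, volume_uL):
    """Amount of a reagent in pmol (micromolar times microliters gives pmol)."""
    return concentration_uM * volume_uL


if __name__ == "__main__":
    print(total_hold_time_seconds() / 60, "min of programmed hold time")
    print(picomoles(0.4, 100), "pmol of each primer per 100-uL reaction")
```

Under these assumptions each cycle holds for 120 s, so a run comprises 80 min of programmed hold time, and each 100-µL reaction contains about 40 pmol of each primer.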

16S rRNA Sequence Determination
The amplified products of approximately 1,686 bp (1,656 bp for nested PCR) were sequenced by using a modification of 16 primers as described (Table 2) (11). Sequencing was performed by using a Big Dye terminator cycle sequencing kit (Applied BioSystems, Foster City, CA). Sequencing products were purified by using Centri-Sep spin columns (Princeton Separations, Adelphia, NJ) and were resolved on an Applied BioSystems model 3100 automated DNA sequencing system (Applied BioSystems). The lengths of the sequences obtained differed for each primer but were sufficient to provide 5- to 8-fold sequence coverage. An inner fragment of 1,554 bp was obtained and analyzed by using the GCG (Wisconsin) Package, v. 10.1 (Genetics Computer Group, Madison, WI). A number was assigned to each allele of the 16S rRNA gene sequence in order of elucidation; a single base change or a mixed base (more than one nucleotide determined at a single position) was considered a new 16S type. When a novel 16S type, mixed base pairs, or any discrepancies in the alignment were obtained, the 16S rRNA gene amplification and sequencing of the entire gene or of the parts containing the problematic region were repeated.
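The type-numbering rule described above (a new number for each distinct allele, in order of elucidation, with a mixed base counting as a difference) can be sketched as follows. This is an illustrative sketch only: the function name and the toy sequences are hypothetical, and the numbers it produces would not reproduce the type numbers used in this study, which depend on the order in which the real sequences were determined.

```python
def assign_16s_types(strain_sequences):
    """Number each distinct sequence in order of first appearance.

    strain_sequences: iterable of (strain_name, aligned_sequence) pairs.
    Sequences are compared character by character, so an IUPAC mixed-base
    code such as W is treated as different from A or T and yields a new type.
    """
    type_of_sequence = {}
    assignments = {}
    for strain, sequence in strain_sequences:
        key = sequence.upper()
        if key not in type_of_sequence:
            type_of_sequence[key] = len(type_of_sequence) + 1
        assignments[strain] = type_of_sequence[key]
    return assignments


if __name__ == "__main__":
    # Toy sequences, not real 16S data.
    toy = [
        ("strain_1", "ACGTAC"),
        ("strain_2", "ACGWAC"),  # mixed base (A/T) at position 4
        ("strain_3", "ACGTAC"),  # identical to strain_1
        ("strain_4", "ACGAAC"),  # single base change
    ]
    print(assign_16s_types(toy))
```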

GenBank 16S rRNA Gene Sequences and Accession Numbers
Sixty 16S rRNA gene sequences of B. anthracis, B. cereus, and B. thuringiensis were available in GenBank. Thirty-nine of these sequences were incomplete, contained a large number of undetermined nucleotides, or were not associated with a specific strain identification, and therefore were not used in this study. The remaining 21 sequences were identified as  Table 1] and 7 from clinical specimens [GenBank accession nos. AY138359 to AY138365).

16S rRNA Gene Sequence Diversity
The 1,554-bp nucleotide sequences of the entire 16S rRNA gene from all 107 Bacillus species strains were aligned and compared. Differences were found at eight single nucleotide positions (positions 1, 2, 3, 4, 6, 9, 12, and 13), and no gaps were present. When 21 Bacillus 16S rRNA sequences from GenBank were added to the alignment, five additional positions with differences (positions 5, 7, 8, 10, and 11) were located (Table 3). The 13 positions of differences were distributed throughout the gene (Table 3). In six of these positions (positions 1, 2, 3, 4, 6, and 12), more than one nucleotide was detected (mixed nucleotides) (Table 3). These results indicated that the strain contained multiple rRNA operons with slightly different 16S rRNA gene sequences.
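Locating the positions of difference in a gap-free alignment amounts to scanning each column for more than one symbol. The sketch below illustrates the idea on toy data; the function name is hypothetical, the input is assumed to be already aligned and of equal length, and this is not the GCG procedure used in the study.

```python
def variable_positions(aligned_sequences):
    """Return 1-based alignment columns where the sequences are not all identical.

    Assumes a gap-free alignment in which every sequence has the same length.
    Mixed-base codes (for example W) count as distinct symbols.
    """
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column + 1
        for column in range(length)
        if len({seq[column] for seq in aligned_sequences}) > 1
    ]


if __name__ == "__main__":
    # Toy alignment, not real 16S data.
    print(variable_positions(["ACGTAC", "ACGWAC", "ATGTAC"]))
```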
We found eight different 16S types among the 107 16S rRNA genes from our collection of isolates (Table 3). All 86 B. anthracis had an identical sequence, 16S type 6, containing a single mixed base, a W(A/T) at position 12, not found in the other two species. 16S type 10 was seen in all 11 B. thuringiensis strains, and a single mixed base pair was identified in all strains at position 6. Six other 16S types were found among the 10 B. cereus strains. Three additional 16S types were found among the 18 GenBank sequences that we analyzed. 16S types 1, 4, and 5 correlated to B. mycoides, B. thuringiensis, and B. cereus, respectively (Table 3). Five B. anthracis sequences from GenBank were identical to the 16S type 6 found in all our 86 B. anthracis isolates, and three were identical to the 16S type 7 found in B. cereus.
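Because a mixed base such as W (A/T) defines a separate 16S type, it can help to distinguish a strict mismatch (the symbols differ) from an incompatible one (the symbols share no possible nucleotide). The following sketch makes that distinction for the three mixed-base codes reported here (R, Y, and W); the function name and toy sequences are hypothetical, and the typing rule in this study corresponds to the strict count.

```python
# Nucleotides represented by each symbol; only the codes reported in this study.
IUPAC_SETS = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"},   # purine
    "Y": {"C", "T"},   # pyrimidine
    "W": {"A", "T"},
}


def count_differences(seq_a, seq_b):
    """Return (strict, incompatible) mismatch counts for two aligned sequences.

    strict: positions where the symbols differ at all (the typing rule here).
    incompatible: positions where the symbols have no nucleotide in common.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    strict = 0
    incompatible = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b:
            strict += 1
            if not (IUPAC_SETS[a] & IUPAC_SETS[b]):
                incompatible += 1
    return strict, incompatible


if __name__ == "__main__":
    # Toy sequences: a W at one position versus a plain A at the same position.
    print(count_differences("ACGWT", "ACGAT"))
```

In the toy example the two sequences differ strictly at one position but are compatible there, which is the kind of difference that separates a sequence with a mixed base from one without it.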

16S rRNA Sequencing Directly in Clinical Specimens
We detected 16S rRNA genes in 7 (3.5%) of 198 clinical samples: all were 16S type 6 characteristic for B. anthracis. None of the seven specimens were culture positive (Table 4), although all specimens had been collected from patients with laboratory-confirmed anthrax.

Discussion
The goal of this study was to evaluate the potential of 16S rRNA sequencing to rapidly identify B. anthracis in cultures. We found that 16S rRNA genes of B. anthracis were highly conserved; only one 16S type (16S type 6) was identified in all 86 strains tested. However, not all B. anthracis 16S rRNA gene sequences in GenBank are type 6. Three of the eight B. anthracis 16S rRNA sequences are reported as type 7, a type that, in our study, we found exclusively among the B. cereus strains. The only difference between type 7 and type 6 is a mixed base pair at position 1146. The strain designations of two of these three 16S type 7 B. anthracis strains in GenBank are Ames and Sterne. We did not acquire these particular strains from the submitting laboratory, but the one Ames and two Sterne strains (obtained from different sources) in our collection were consistently type 6. A third Sterne strain 16S rRNA sequence in GenBank is also type 6.
One possible explanation for these different 16S rRNA sequencing results may be the use of different sequencing approaches, such as using cloned DNA versus genomic DNA as template. In sequencing clones, one allele may be missed if only a few clones are sequenced, not representing the total diversity. In this case, the position with the mixed base would not be detected. If both types 6 and 7 exist in B. anthracis, the difference may be due to recombination, mutation, or loss of an allele. The type 7 B. anthracis sequences in GenBank are unpublished; therefore, we do not know if the genes were cloned and, if so, how many clones were sequenced.
The complete B. anthracis genome was posted at http://www.tigr.org/tigr-scripts/ufmg/ReleaseDate.pl on May 7, 2002. The genome has 11 rRNA operons. There are 10 positions in the 16S rRNA gene where the nucleotides are not identical among the 11 rRNA operons, but the DNA sequencing software scores only one of them as a mixed base 100% of the time. This position is 1146, where five 16S rRNA genes contain Ts and six have As in a 54%:46% ratio. In this case, the base-calling software (GCG; Genetics Computer Group) always assigns a W at that position. At position 1137, there are seven Gs and four As, a 64%:36% ratio, but the position is scored as a G, the predominant base. In eight positions, a 9%:91% ratio is present. For example, at position 1047 are one T and 10 Cs. In these cases, the nucleotide is called as the predominant base by the base-calling software.
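The relationship between operon composition and the base that is called can be illustrated with a simple threshold model. This is a sketch under stated assumptions, not a description of how the base-calling software works: real base callers operate on signal intensities rather than operon counts, and the 0.40 cutoff is a hypothetical value chosen only because it lies between the two minor-allele fractions reported above (4 of 11 and 5 of 11). The function names are also hypothetical.

```python
from collections import Counter

# Two-nucleotide ambiguity codes used in this study.
TWO_BASE_CODES = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("AT"): "W",
}


def minor_fraction(operon_bases):
    """Fraction of operons carrying the second most common base (0.0 if none)."""
    ranked = Counter(operon_bases).most_common()
    if len(ranked) < 2:
        return 0.0
    return ranked[1][1] / len(operon_bases)


def call_position(operon_bases, mixed_threshold=0.40):
    """Call a mixed-base code if the minor base reaches the (hypothetical) threshold.

    Otherwise return the predominant base. Pairs without a code in
    TWO_BASE_CODES are reported as N.
    """
    ranked = Counter(operon_bases).most_common()
    major = ranked[0][0]
    if len(ranked) > 1 and minor_fraction(operon_bases) >= mixed_threshold:
        return TWO_BASE_CODES.get(frozenset((major, ranked[1][0])), "N")
    return major


if __name__ == "__main__":
    print(call_position("T" * 5 + "A" * 6))   # position 1146: five Ts, six As
    print(call_position("G" * 7 + "A" * 4))   # position 1137: seven Gs, four As
    print(call_position("T" * 1 + "C" * 10))  # position 1047: one T, ten Cs
```

With this hypothetical cutoff, the model returns W for position 1146 and the predominant base for positions 1137 and 1047, matching the calls described in the text.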
The quality of DNA sequences generated in laboratories has been greatly improved by the introduction of automated sequencing systems and DNA alignment software, but other factors, such as the purity of the DNA template and the number of overlapping nucleotide fragments in the alignment, contribute to the reliability of the final sequence. Mixed base pairs are clearly the result of sequence differences between different rRNA operons and not due to any sequencing artifacts. In this study, the length of the fragment sequences varied for each primer, but they were of sufficient length to provide 5- to 8-fold sequence coverage in both directions. This 5- to 8-fold sequence overlap simplifies identifying and clarifying positions with double signals, increasing the confidence in our final consensus sequence. The occurrence of mixed base pairs in rRNA sequences is well known and accepted (15-19). The Ribosomal Database Project Web site shows that operon heterogeneity has been documented in several different bacterial species (http://rrndb.cme.msu.edu/rrndb/rrn_table.pdf). In addition, we did not observe mixed base pairs in single-copy genes such as pagA and a variety of others. A previous study of a small set of Bacillus strains isolated from soil demonstrated the diversity of 16S rRNA genes of both B. cereus and B. thuringiensis (15). Our results confirm the diversity among B. cereus strains, although we did not find diversity among B. thuringiensis strains. The lack of diversity in our collection of B. thuringiensis strains may be associated with natural selection in the human host; 6 of 11 of our B. thuringiensis strains were isolated from humans.

Direct Amplification of 16S rRNA from Clinical Samples
Even though B. anthracis is present at high levels (up to 10^8/mL) in the blood of patients with anthrax and will readily grow on standard bacteriologic media, as for other bacteria, specimens collected after the administration of antimicrobial therapy may fail to grow B. anthracis. Laboratory confirmation for the two patients with inhalational anthrax whose specimens were analyzed (patient #10i [12], and patient #11i [13]) was achieved by isolation and identification of B. anthracis from clinical samples at the medical facility where the patients were treated. Generally, for all patients, isolates themselves were forwarded to the appropriate public health laboratory and then to the Centers for Disease Control and Prevention for confirmatory identification and molecular subtyping, but the initial clinical specimens were not sent along with the isolates. With few exceptions, clinical specimens available for analysis from these two patients and from other patients with inhalational anthrax were collected after initiation of antimicrobial therapy, resulting in few culture-positive results. For 3 of the 11 inhalational patients, laboratory confirmation was based on two of three available supportive tests, including PCR targeting two plasmid and one chromosomal target (14), immunohistochemistry, or a reactive anti-protective antigen titer (immunoglobulin G ELISA) (12,20). Laboratory confirmation for the two cutaneous cases with skin biopsies analyzed in this study was indeed achieved by these supportive laboratory tests: one case was confirmed by immunohistochemistry and a reactive anti-protective antigen titer (IgG ELISA). For the other case all three supportive laboratory tests were positive.

16S types identified in 107 strains in this study (table footnotes)
a Numbers refer to the number of positions where mismatches are found. Numbers in parentheses refer to positions in the 16S rRNA gene.
b R refers to a purine (A or G) at that position; Y refers to a pyrimidine (C or T) at that position; and W refers to an A or T at that position.
c A, C, G, and T refer to the four deoxynucleotides that DNA comprises.
d Five additional positions of differences (positions 5, 7, 8, 10, and 11) were found when GenBank sequences were used.
e The last position (position 13) on 16S types 1, 4, and 5 is missing because those GenBank sequences are shorter.

Previously, strains having <3% difference between their 16S rRNA genes were considered the same species (21). However, differences between 16S rRNA genes for some Bacillus species, such as B. anthracis, B. cereus, and B. thuringiensis, are <1% (1). Such small differences (e.g., one base between sequences or partial matches at a single nucleotide position in the 16S rRNA gene) have not been used for species differentiation. Our study clearly demonstrates that such small differences might be important for species identification. DNA-DNA hybridization and 16S rRNA sequencing studies have shown that these three Bacillus species are closely related and probably represent a single species (3,6,7). If the three were classified as a single species, 16S rRNA sequencing appears to have the potential to differentiate strains at the subspecies level.
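To put these thresholds in perspective, the differences observed in this study (1 to 8 positions in a 1,554-bp gene) can be converted to percentages and compared with the 3% and 1% figures cited above. The sketch below is a simple worked calculation; the function names are hypothetical.

```python
GENE_LENGTH_BP = 1554  # full-length B. anthracis 16S rRNA gene, from the text


def percent_difference(differing_positions, length_bp=GENE_LENGTH_BP):
    """Percentage of positions that differ between two aligned sequences."""
    return 100.0 * differing_positions / length_bp


def below_threshold(differing_positions, threshold_percent, length_bp=GENE_LENGTH_BP):
    """True if the observed difference is smaller than the given percentage."""
    return percent_difference(differing_positions, length_bp) < threshold_percent


if __name__ == "__main__":
    for n in (1, 8):
        print(n, "position(s):", round(percent_difference(n), 2), "%")
    print("8 positions below 3%:", below_threshold(8, 3.0))
    print("8 positions below 1%:", below_threshold(8, 1.0))
```

Even the largest difference observed between 16S types in this study, 8 positions, is about 0.5% of the gene, well below both figures.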
Although pXO1 and pXO2 plasmids must be detected to confirm the virulence of B. anthracis, 16S rRNA sequencing has a powerful capacity to rapidly identify B. anthracis and other species. Although further studies are needed to fully evaluate 16S sequencing as a diagnostic assay, its value as a tool for rapid initial screening in outbreak investigations has been demonstrated. The immunohistochemical (IHC), serologic, and PCR results are described in reference 14.