Abstract
A comparison of variable regions within the 16S rRNA gene is widely used to characterize relationships between bacteria and to identify the phylogenetic affiliation of unknown bacteria. In environmental studies, polymerase chain reaction amplification of 16S rRNA, followed by cloning and sequencing of numerous individual clones, is an extensively used molecular method for elucidating microbial diversity. The sequencing process typically uses a forward and reverse primer pair to produce two partial reads (~700 to 800 base pairs each) that overlap and together cover a large region of the full 16S rRNA sequence (~1.5 kb). In a typical application, this approach rapidly generates very large numbers of 16S rRNA datasets that can overwhelm manual processing efforts, leading to both delays and errors. In particular, the approach presents two computational challenges: (1) the assembly of a composite sequence from the two partial reads and (2) the subsequent identification of the organism represented by each newly sequenced clone. Herein, we describe a software package, search, trim, identify, track, and capture the uniqueness of 16S rRNAs using public and in-house database (STITCH), which offers automated sequence pair splicing and genetic identification, thus simplifying the computationally intensive analysis of large sequencing libraries. The STITCH software is freely accessible over the Internet at: http://prion.bchs.uh.edu/stitch/.
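The first challenge above, assembling a composite sequence from two overlapping partial reads, can be illustrated with a minimal sketch. This is not the STITCH algorithm, which the abstract does not specify. It assumes error-free reads and an exact-match overlap, and the names `reverse_complement`, `splice_pair`, and `min_overlap` are hypothetical. Real reads would likely need quality trimming and mismatch-tolerant alignment.

```python
from typing import Optional

# Hypothetical illustration only; not the STITCH implementation.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_pair(forward: str, reverse: str, min_overlap: int = 20) -> Optional[str]:
    """Merge a forward read with a reverse read into one composite sequence.

    The reverse read is reverse-complemented so both reads are on the same
    strand. The longest exact match between the end of the forward read and
    the start of the reoriented reverse read is then used as the splice
    point. Returns None when no overlap of at least min_overlap bases exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    # Try the longest candidate overlap first, then shrink.
    for size in range(longest, min_overlap - 1, -1):
        if forward[-size:] == rc[:size]:
            return forward + rc[size:]
    return None
```

With reads of ~700 to 800 bases covering a ~1.5 kb gene, the expected overlap is short, so the assumed `min_overlap` threshold trades off missed splices against spurious ones.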
Acknowledgment
The research described in this publication was carried out in part at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration and in part at the University of Houston under a subcontract from the Jet Propulsion Laboratory.
Electronic Supplementary Material
Below is the link to the electronic supplementary material.
ESM 1
(DOC 56 kb)
About this article
Cite this article
Zhu, D., Vaishampayan, P.A., Venkateswaran, K. et al. STITCH: Algorithm to Splice, Trim, Identify, Track, and Capture the Uniqueness of 16S rRNAs Sequence Pairs Using Public or In-house Database. Microb Ecol 61, 669–675 (2011). https://doi.org/10.1007/s00248-010-9779-2
DOI: https://doi.org/10.1007/s00248-010-9779-2