The role of replicates for error mitigation in next-generation sequencing

Robasky, Kimberly; Lewis, Nathan E.; Church, George M.

doi:10.1038/nrg3655

Opinion
Published: 10 December 2013

Nature Reviews Genetics volume 15, pages 56–62 (2014)

Abstract

Advances in next-generation sequencing (NGS) technologies have rapidly improved sequencing fidelity and substantially decreased sequencing error rates. However, given that there are billions of nucleotides in a human genome, even low experimental error rates yield many errors in variant calls. Erroneous variants can mimic true somatic and rare variants, thus requiring costly confirmatory experiments to minimize the number of false positives. Here, we discuss sources of experimental errors in NGS and how replicates can be used to abate such errors.
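The scale argument in the abstract can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the genome size of 3 × 10^9 positions and the per-position false-call rate of 1 in 10^6 are assumed round numbers, not figures from the article, and treating errors as independent across replicates is a simplifying assumption that systematic errors would violate.

```python
def expected_false_calls(genome_size: int, error_rate: float, replicates: int = 1) -> float:
    """Expected number of positions falsely called in every one of `replicates` runs.

    Assumes (for illustration) that false calls are independent across
    replicates, so the per-position probabilities multiply.
    """
    return genome_size * error_rate ** replicates


GENOME_SIZE = 3_000_000_000  # assumed: roughly the size of a haploid human genome
ERROR_RATE = 1e-6            # assumed: per-position false-call rate, chosen for round numbers

single = expected_false_calls(GENOME_SIZE, ERROR_RATE)
paired = expected_false_calls(GENOME_SIZE, ERROR_RATE, replicates=2)

print(f"Expected false calls, one run:        {single:.3g}")
print(f"Expected false calls, two replicates: {paired:.3g}")
```

Under these assumptions, requiring a call to appear in both replicates reduces the expected number of false positives by a factor equal to the inverse of the error rate. Errors that recur for systematic reasons would not be removed this way.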

**Figure 1: Sources of and tools to cope with unexpected or erroneous variants.**

**Figure 2: Platform-independent method for choosing quality score thresholds.**

**Figure 3: An example application of plotting replicate scores to assess filter efficiency.**
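One way to read the idea behind Figures 2 and 3 is that calls reproduced in both replicates are more likely to be real than calls seen in only one. The following is a minimal sketch of that idea and not the authors' method, which is described in the full text and in Supplementary information S2. For each candidate quality threshold it counts calls that pass in both replicates and calls that pass in only one. The data layout (a dict from variant key to quality score), the function name `concordance_by_threshold` and the toy scores are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical variant key: (chromosome, position, alternate allele)
Variant = Tuple[str, int, str]


def concordance_by_threshold(
    rep_a: Dict[Variant, float],
    rep_b: Dict[Variant, float],
    thresholds: Iterable[float],
) -> List[Tuple[float, int, int]]:
    """Return (threshold, concordant, discordant) counts for each threshold.

    A call is concordant if it passes the threshold in both replicates,
    and discordant if it passes in exactly one.
    """
    rows = []
    for t in thresholds:
        pass_a = {v for v, q in rep_a.items() if q >= t}
        pass_b = {v for v, q in rep_b.items() if q >= t}
        rows.append((t, len(pass_a & pass_b), len(pass_a ^ pass_b)))
    return rows


# Toy data: quality scores for calls made in two replicates of one sample.
rep_a = {("chr1", 100, "A"): 50.0, ("chr1", 200, "T"): 12.0, ("chr2", 300, "G"): 35.0}
rep_b = {("chr1", 100, "A"): 48.0, ("chr2", 300, "G"): 18.0, ("chr3", 400, "C"): 9.0}

table = concordance_by_threshold(rep_a, rep_b, [0, 10, 20, 40])
for threshold, concordant, discordant in table:
    print(threshold, concordant, discordant)
```

A threshold at which discordant calls fall away while concordant calls are largely retained is a candidate filter setting. Discordant calls can also reflect real but undersampled variants, so the counts are a guide rather than ground truth.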

Acknowledgements

The authors thank T. Gianoulis for her feedback and inspiration, and J. Dupuis, Professor of Biostatistics at Boston University, Massachusetts, USA, for her encouragement and feedback during the nascent stages of replicate analysis. They also thank W. Jones, Global Head of Genomic Bioinformatics, Quintiles, and E. Aronesty, author of the ea-utils FASTQ processing package, for critical review of the manuscript. Some of this work was supported by the US National Institutes of Health grant P50HG005550.

Author information

George M. Church
Present address: Department of Genetics, Harvard Medical School, and the Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, Massachusetts 02115, USA.
Kimberly Robasky
Present address: Program in Bioinformatics, Boston University, Massachusetts 02115, USA. Department of Genetics, Harvard Medical School, and the Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, Massachusetts 02115, USA. Present address: Expression Analysis, a Quintiles Company, Durham, North Carolina 27713, USA.
Nathan E. Lewis
Present address: Department of Genetics, Harvard Medical School, and the Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, Massachusetts 02115, USA. Department of Biology, Brigham Young University, Provo, Utah 84602, USA. Present address: Division of Pediatric Pharmacology and Drug Discovery, University of California, San Diego School of Medicine, La Jolla, California 92093, USA.
Kimberly Robasky and Nathan E. Lewis: These authors contributed equally to this work.

Corresponding author

Correspondence to Nathan E. Lewis.

Ethics declarations

Competing interests

K.R. is currently employed by Expression Analysis, a Quintiles company. G.M.C. has advisory roles in and research sponsorships from several companies that are involved in genome sequencing technology and personal genomics. For a list of G.M.C.'s tech transfer, advisory roles and funding sources, see http://arep.med.harvard.edu/gmc/tech.html.

Supplementary information

Supplementary information S1 (box)

Datasets (PDF 268 kb)

Supplementary information S2 (box)

Method for Assessing Specificity/Sensitivity with Replicates (PDF 237 kb)

Glossary

Barcodes: Known DNA sequences that are appended to the ends of DNA fragments before sequencing for the purpose of pooling samples together to reduce cost.
Base call: Identification of the nitrogenous base (A, G, C or T) that is added to the short read during sequencing.
Batch effect: The statistical bias of indeterminate cause observed in samples that are processed together with the same sample preparation, the same library preparation and the same sequencing experiment.
Homopolymer: A sequence of multiple consecutive identical nucleotides.
Insertions and deletions (indels): Variants that are created by either the insertion or the deletion of nucleotides with respect to a matching reference.
Misalignment: The alignment of a sequencing read to an incorrect location on a reference genome. This can occur when reads align equally well to multiple genomic locations owing to indels, repeats and low-complexity regions of the genome.
Multiple displacement amplification (MDA): A technique that is used for amplifying DNA sequences by synthesizing DNA from random hexamer primers.
Read clipping: Removal of adaptor and barcode sequences or of low-quality bases near read ends following sequencing.
Sequencing errors: Errors that are seen in the base call of short reads from next-generation sequencing technology.
Sequencing read depth: The number of reads that contributes to the variant call at a single location; also known as read depth, fold coverage and depth of coverage. It can also refer to the average read depth across the entire targeted sequence area.
Short reads: Short sequences of nucleotide bases and their respective quality scores that are obtained through next-generation sequencing from longer target sequences.
Somatic mosaicism: Genetic diversity among cells of a single organism.
Substitution errors: Errors that occur when one base is substituted for another during sequencing.
Variant call errors: An accumulation of misaligned reads or of reads with base call errors over a particular locus, which results in that locus being called a variant when it truly matches the reference, and vice versa.
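Two of the glossary terms, sequencing read depth and homopolymer, lend themselves to a short illustration. The sketch below uses hypothetical helper names and toy data. It computes per-position read depth from aligned read intervals and finds homopolymer runs in a sequence. The half-open, 0-based interval convention and the minimum run length of 3 are arbitrary choices for the example.

```python
from itertools import groupby
from typing import List, Tuple


def read_depth(reads: List[Tuple[int, int]], region_length: int) -> List[int]:
    """Count how many reads cover each position of a region.

    Each read is a (start, end) interval, 0-based and half-open,
    so a read (0, 5) covers positions 0 through 4.
    """
    depth = [0] * region_length
    for start, end in reads:
        # Clamp the read to the region so out-of-range reads are ignored.
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth


def homopolymer_runs(sequence: str, min_length: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each run of identical consecutive bases."""
    runs = []
    pos = 0
    for base, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((pos, base, length))
        pos += length
    return runs


depth = read_depth([(0, 5), (3, 8), (4, 10)], region_length=10)
runs = homopolymer_runs("ACGTTTTAGGGC")
mean_depth = sum(depth) / len(depth)  # average read depth across the region

print(depth)
print(runs)
```

The per-position list corresponds to the first sense of read depth in the glossary, and `mean_depth` to the second (the average across the targeted area).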

About this article

Cite this article

Robasky, K., Lewis, N. & Church, G. The role of replicates for error mitigation in next-generation sequencing. Nat Rev Genet 15, 56–62 (2014). https://doi.org/10.1038/nrg3655

Published: 10 December 2013
Issue Date: January 2014
DOI: https://doi.org/10.1038/nrg3655
