Statistical Modeling of Coverage in High-Throughput Data

Golan, David; Rosset, Saharon

doi:10.1007/978-1-62703-514-9_4

Protocol
First Online: 01 January 2013

Part of the book series: Methods in Molecular Biology (MIMB, volume 1038)

Abstract

In high-throughput sequencing experiments, the number of reads mapping to a genomic region, also known as the “coverage” or “coverage depth,” is often used as a proxy for the abundance of the underlying genomic region in the sample. The abundance, in turn, can be used for many purposes including calling SNPs, estimating the allele frequency in a pool of individuals, identifying copy number variations, and identifying differentially expressed shRNAs in shRNA-seq experiments.

In this chapter we describe the fundamentals of statistical modeling of coverage depth and discuss the problems of estimation and inference in the relevant experimental scenarios.
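As a rough illustration of what treating coverage as a count proxy can look like, the sketch below covers two of the uses named above under simple textbook assumptions. It is not taken from the chapter itself. It assumes reads are sampled independently, so that the number of reads carrying an allele is binomial and per-region coverage is approximately Poisson. The function names `allele_frequency` and `dispersion_index` are hypothetical.

```python
import math
import statistics


def allele_frequency(alt_reads, total_reads):
    """Binomial maximum-likelihood estimate of allele frequency in a pool.

    Assumes each read is an independent draw from the pool (an
    illustrative assumption, not a claim about the chapter's model).
    Returns the estimate and its Wald standard error.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    p_hat = alt_reads / total_reads
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / total_reads)
    return p_hat, std_err


def dispersion_index(counts):
    """Sample variance divided by sample mean of coverage counts.

    Under a Poisson model this ratio is close to 1; values well above 1
    suggest overdispersion, where a negative binomial model may fit better.
    """
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("mean coverage is zero")
    return statistics.variance(counts) / mean


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    p_hat, std_err = allele_frequency(alt_reads=30, total_reads=100)
    print(f"allele frequency estimate: {p_hat:.3f} (SE {std_err:.3f})")
    print(f"dispersion index: {dispersion_index([8, 12, 10, 10]):.3f}")
```

The dispersion index is only a quick diagnostic. Real coverage data often show more variance than a Poisson model allows, for instance because of GC content bias, which is one reason count regression models are commonly used for this kind of data.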

Author information

Authors and Affiliations

School of Mathematical Sciences, Tel Aviv University, Tel Aviv, Israel
David Golan & Saharon Rosset

Editor information

Editors and Affiliations

Faculty of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel
Noam Shomron

Cite this protocol

Golan, D., Rosset, S. (2013). Statistical Modeling of Coverage in High-Throughput Data. In: Shomron, N. (ed.) Deep Sequencing Data Analysis. Methods in Molecular Biology, vol 1038. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-514-9_4

DOI: https://doi.org/10.1007/978-1-62703-514-9_4
Published: 18 June 2013
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-513-2
Online ISBN: 978-1-62703-514-9
eBook Packages: Springer Protocols
