Designing Algorithms for Determining Significance of DNA Missense Changes

Gowrisankar, Sivakumar; Lebo, Matthew S.

doi:10.1007/978-1-4939-0847-9_14

Part of the book series: Methods in Molecular Biology (MIMB, volume 1168)

Abstract

The genomes of any two humans differ by <1 %. This variation underlies differences in disease susceptibility, phenotypes, and traits. When searching for causal disease mutations, protein-coding sequences are usually screened first, since variants there have the highest probability of affecting the function of a protein. Recent technological advances have brought a rise in the number of experiments conducted to study diseases ranging from monogenic disorders to complex traits. Several computational approaches have been developed to extract putative functional missense variants. In this chapter we review some of these approaches and describe a standard step-by-step procedure that can be used to classify variants for the purpose of clinical care. We also provide two examples demonstrating this approach: one for a patient with a dilated cardiomyopathy diagnosis, and the other for a patient with a condition of unknown etiology undergoing whole-genome sequencing (WGS).

Abbreviations

ESP: Exome Sequencing Project
HGMD: Human Gene Mutation Database
LOF: Loss of function
OMIM: Online Mendelian Inheritance in Man
SNV: Single nucleotide variation
VUS: Variant of unknown significance
WES: Whole-exome sequencing
WGS: Whole-genome sequencing

Author information

Authors and Affiliations

Novartis Institutes for Biomedical Research, 500 Technology Square, 840-14, Cambridge, MA, 02139, USA
Sivakumar Gowrisankar
Partners HealthCare Center for Personalized Genetic Medicine, Cambridge, MA, USA
Matthew S. Lebo
Department of Pathology, Harvard Medical School, Boston, MA, USA
Matthew S. Lebo

Corresponding author

Correspondence to Sivakumar Gowrisankar.

Editor information

Editors and Affiliations

Department of Medical Genomics, Royal Prince Alfred Hospital and Sydney Medical School, University of Sydney, Camperdown, Australia
Ronald Trent

About this protocol

Cite this protocol

Gowrisankar, S., Lebo, M.S. (2014). Designing Algorithms for Determining Significance of DNA Missense Changes. In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_14

DOI: https://doi.org/10.1007/978-1-4939-0847-9_14
Published: 09 May 2014
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0846-2
Online ISBN: 978-1-4939-0847-9
eBook Packages: Springer Protocols
