Designing Algorithms for Determining Significance of DNA Missense Changes

  • Protocol
Clinical Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1168))

Abstract

Humans differ from each other in their genomes by <1%. This variation determines differences in susceptibility to disease, phenotypes, and traits. When looking for causal disease mutations, protein-coding sequences are usually screened first, since variants in these regions have the highest probability of affecting the function of a protein. Recent technological advances have led to a rise in the number of experiments being conducted to study a variety of diseases, from monogenic disorders to complex traits. Several computational approaches have been developed to extract putative functional missense variants. In this chapter we review some of these approaches and describe a standard step-by-step procedure that can be used to classify variants for the purpose of clinical care. We also provide two examples demonstrating this approach: one for a patient with a dilated cardiomyopathy diagnosis, and the other for a patient with a disease of unknown etiology undergoing whole-genome sequencing (WGS).

Abbreviations

ESP: Exome Sequencing Project
HGMD: Human Gene Mutation Database
LOF: Loss of function
OMIM: Online Mendelian Inheritance in Man
SNV: Single nucleotide variation
VUS: Variant of unknown significance
WES: Whole-exome sequencing
WGS: Whole-genome sequencing

Author information

Corresponding author

Correspondence to Sivakumar Gowrisankar.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Gowrisankar, S., Lebo, M.S. (2014). Designing Algorithms for Determining Significance of DNA Missense Changes. In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_14

  • DOI: https://doi.org/10.1007/978-1-4939-0847-9_14

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0846-2

  • Online ISBN: 978-1-4939-0847-9

  • eBook Packages: Springer Protocols
