Abstract
Any two human genomes differ by less than 1%. This variation underlies differences in disease susceptibility, phenotypes, and traits. In the search for causal disease mutations, protein-coding sequences are usually screened first, since variants there have the highest probability of affecting the function of a protein. Recent technological advances have led to a rise in the number of experiments conducted to study a variety of diseases, from monogenic disorders to complex traits. Several computational approaches have been developed to identify putative functional missense variants. In this chapter we review some of these approaches and describe a standard step-by-step procedure that can be used to classify variants for the purpose of clinical care. We also provide two examples demonstrating this approach: one for a patient with a diagnosis of dilated cardiomyopathy, and the other for a patient whose condition has an unknown etiology and who is undergoing whole-genome sequencing (WGS).
Abbreviations
- ESP: Exome Sequencing Project
- HGMD: Human Gene Mutation Database
- LOF: Loss of function
- OMIM: Online Mendelian Inheritance in Man
- SNV: Single nucleotide variation
- VUS: Variant of unknown significance
- WES: Whole-exome sequencing
- WGS: Whole-genome sequencing
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Gowrisankar, S., Lebo, M.S. (2014). Designing Algorithms for Determining Significance of DNA Missense Changes. In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_14
DOI: https://doi.org/10.1007/978-1-4939-0847-9_14
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0846-2
Online ISBN: 978-1-4939-0847-9
eBook Packages: Springer Protocols