Abstract
In the biomedical sciences, it is essential to understand how genes interact with one another and with their environment, and how these interactions influence a particular genetic trait. Single nucleotide polymorphisms (SNPs) are variations in a DNA sequence that arise from the alteration of a single nucleotide in the genome. To be classified as a SNP rather than a rare mutation, a variant must occur in at least 1% of the population. SNPs are the most common form of genetic variation in a population. Recently, genome-wide association studies (GWAS) have revealed significant associations between SNPs and disease. Studies provide evidence that SNP-SNP interactions, rather than individual SNPs, are the key reason behind the development of complex diseases. Several models have been developed and applied to the analysis of SNP-SNP interactions. This paper presents an overview of big data analytics in biomedicine and healthcare, along with the machine learning methods that have been applied to SNP-SNP interaction analysis. These methods and models enable a more comprehensive understanding of the relationship between human physiology and disease.
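The claim that interactions can matter where individual SNPs do not can be illustrated with a minimal sketch. The data below are a hypothetical toy set, not results from the paper: disease status is constructed as the exclusive-or of carrier status at two SNPs, and a plain Pearson chi-square statistic stands in for the more elaborate machine learning methods the paper reviews. All function and variable names are assumptions made for this illustration.

```python
from itertools import product


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def contingency(keys, labels):
    """Rows are distinct genotype keys; columns count controls (0) and cases (1)."""
    levels = sorted(set(keys))
    return [
        [sum(1 for k, y in zip(keys, labels) if k == level and y == status)
         for status in (0, 1)]
        for level in levels
    ]


# Hypothetical toy cohort: 25 individuals per carrier combination (0 = non-carrier,
# 1 = carrier). Disease is assumed to occur only when exactly one SNP is carried.
snp_a, snp_b, disease = [], [], []
for a, b in product((0, 1), repeat=2):
    for _ in range(25):
        snp_a.append(a)
        snp_b.append(b)
        disease.append(a ^ b)

single_a = chi_square(contingency(snp_a, disease))  # SNP A tested alone
single_b = chi_square(contingency(snp_b, disease))  # SNP B tested alone
joint = chi_square(contingency(list(zip(snp_a, snp_b)), disease))  # pair tested jointly

print(f"SNP A alone: {single_a:.1f}")
print(f"SNP B alone: {single_b:.1f}")
print(f"SNP A x SNP B: {joint:.1f}")
```

In this constructed case each SNP taken alone shows no association with disease, while the joint genotype separates cases from controls completely. Real data are far noisier, and exhaustive pairwise testing across a genome-wide panel raises multiple-testing and computational issues that this sketch does not address.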
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ahmad, N., Jabeen, A., Raza, K. (2020). Machine Learning Based Outlook for the Analysis of SNP-SNP Interaction for Biomedical Big Data. In: Kumar, A., Paprzycki, M., Gunjan, V. (eds) ICDSMLA 2019. Lecture Notes in Electrical Engineering, vol 601. Springer, Singapore. https://doi.org/10.1007/978-981-15-1420-3_2
DOI: https://doi.org/10.1007/978-981-15-1420-3_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1419-7
Online ISBN: 978-981-15-1420-3
eBook Packages: Intelligent Technologies and Robotics (R0)