Machine Learning Based Outlook for the Analysis of SNP-SNP Interaction for Biomedical Big Data

  • Conference paper
ICDSMLA 2019

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 601)

Abstract

In the biomedical sciences, it is essential to understand how genes interact with one another and with their environment, and how these interactions influence a particular genetic trait. Single Nucleotide Polymorphisms (SNPs) are variations in a DNA sequence caused by the alteration of a single nucleotide in the genome. SNPs are the most common form of genetic variation in a population; by convention, a variant is classified as an SNP rather than a rare mutation only if it occurs in at least 1% of the population. Recently, Genome-Wide Association Studies (GWAS) have revealed significant associations between SNPs and disease. Studies provide evidence that the key reason behind complex disease development is SNP-SNP interaction, not individual SNPs. Several models have been developed and implemented for the analysis of SNP-SNP interactions. This paper presents an overview of big data analytics in the biomedical and healthcare fields, along with the different machine learning methods applied in SNP-SNP interaction analysis studies. These analysis methods and models enable a more comprehensive understanding of the relationship between human physiology and disease.
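
To make the idea of a pairwise SNP-SNP interaction analysis concrete, the following is a minimal sketch in the spirit of multifactor dimensionality reduction (MDR), one family of methods used for this kind of study. It is not the method of this paper: the function names, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration. It computes a minor allele frequency and then labels each two-SNP genotype combination as high risk when its case-to-control ratio exceeds the overall ratio.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one SNP.

    Assumes each genotype is coded as the number of alternate alleles
    carried by a diploid individual (0, 1, or 2).
    """
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def mdr_pair_accuracy(snp_a, snp_b, is_case):
    """MDR-style score for one SNP pair (hypothetical helper).

    Each genotype combination (a, b) is labelled high risk when its
    case:control ratio exceeds the overall case:control ratio. Returns the
    training accuracy of that labelling and the set of high-risk cells.
    Assumes the data contain at least one case and one control.
    """
    cases, controls = Counter(), Counter()
    for a, b, case in zip(snp_a, snp_b, is_case):
        (cases if case else controls)[(a, b)] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    threshold = n_cases / n_controls  # overall case:control ratio

    cells = set(cases) | set(controls)
    high_risk = {c for c in cells if cases[c] > threshold * controls[c]}

    # Cases in high-risk cells and controls in low-risk cells are "correct".
    correct = sum(cases[c] for c in high_risk)
    correct += sum(controls[c] for c in cells - high_risk)
    return correct / (n_cases + n_controls), high_risk


if __name__ == "__main__":
    # Toy data (invented): disease status depends on the pair, not on either SNP alone.
    snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
    snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
    is_case = [0, 1, 1, 0, 0, 1, 1, 0]
    print(minor_allele_frequency(snp_a))
    print(mdr_pair_accuracy(snp_a, snp_b, is_case))
```

In the toy data, each SNP taken alone splits cases and controls evenly, so only the joint genotype table separates them. This is the situation the abstract describes, in which the interaction carries the signal rather than the individual SNPs. A real analysis would add cross-validation and a search over many SNP pairs, both of which are omitted here.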

Author information

Corresponding author

Correspondence to Khalid Raza.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ahmad, N., Jabeen, A., Raza, K. (2020). Machine Learning Based Outlook for the Analysis of SNP-SNP Interaction for Biomedical Big Data. In: Kumar, A., Paprzycki, M., Gunjan, V. (eds) ICDSMLA 2019. Lecture Notes in Electrical Engineering, vol 601. Springer, Singapore. https://doi.org/10.1007/978-981-15-1420-3_2
