Abstract
Revealing the underlying complex architecture of human diseases has received considerable attention since the exploration of genotype-phenotype relationships in genetic epidemiology. Identifying these relationships is challenging because multiple factors may act together or independently. In previous work, a deep neural network was trained to identify two-locus interacting single nucleotide polymorphisms (SNPs) related to a complex disease. The model was assessed on all two-locus combinations under various simulated scenarios. The results showed significant improvements in predicting SNP-SNP interactions over existing conventional machine learning techniques, and the findings were confirmed on a published dataset. However, the performance of the method on higher-order interactions was unknown. The objective of this study is to validate the model for higher-order interactions in high-dimensional data. The proposed method is also extended to unsupervised learning. A number of experiments were performed on simulated datasets under the same scenarios, as well as on a real dataset, to show the performance of the extended model. On average, the results show improved performance over the previous methods. The model is further evaluated on a sporadic breast cancer dataset to identify higher-order interactions between SNPs, and the results rank the top 20 higher-order SNP interactions responsible for sporadic breast cancer.
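To give a sense of why moving from two-locus to higher-order interactions is demanding, the following minimal sketch counts and enumerates candidate SNP combinations and keeps the highest-scoring ones. It is an illustration only, not the deep learning model described in the abstract: the function names, the `score` callable, and the SNP identifiers are all hypothetical, and any real scoring function would have to be supplied by the reader.

```python
from itertools import combinations
from math import comb
import heapq


def count_candidates(n_snps, order):
    """Number of distinct SNP combinations of the given order (n choose k)."""
    return comb(n_snps, order)


def top_interactions(snp_ids, order, score, top_n=20):
    """Return the top_n combinations of `order` SNPs, ranked by `score`.

    `score` is a hypothetical callable mapping a tuple of SNP identifiers
    to a number (higher means a stronger candidate interaction). It stands
    in for whatever model or statistic is actually used.
    """
    # combinations() is lazy, so candidates are scored one at a time
    # and only the best top_n are held in memory.
    return heapq.nlargest(top_n, combinations(snp_ids, order), key=score)


# The candidate space grows quickly with the interaction order.
pairs = count_candidates(100, 2)
triples = count_candidates(100, 3)
```

With 100 SNPs, `count_candidates` gives 4950 pairs but 161700 triples, which is why exhaustive evaluation becomes expensive as the order increases.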
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Uppu, S., Krishna, A. (2016). Improving Strategy for Discovering Interacting Genetic Variants in Association Studies. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9947. Springer, Cham. https://doi.org/10.1007/978-3-319-46687-3_51
DOI: https://doi.org/10.1007/978-3-319-46687-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46686-6
Online ISBN: 978-3-319-46687-3
eBook Packages: Computer Science, Computer Science (R0)