Abstract
Among the large number of statistical methods that have been proposed to identify gene-gene interactions in case-control genome-wide association studies (GWAS), gene-based methods have recently grown in popularity because they confer advantages in both statistical power and biological interpretation. All of the existing gene-based methods jointly model the distribution of single nucleotide polymorphism (SNP) sets prior to the statistical test, which limits their power to detect sums of SNP-SNP signals. In this paper, we instead propose a gene-based method that first performs SNP-SNP interaction tests and then aggregates the resulting p-values into a test at the gene level. Our method, called AGGrEGATOr, is based on a minP procedure that tests the significance of the minimum of a set of p-values. We use simulations to assess the capacity of AGGrEGATOr to correctly control the type-I error rate. The benefits of our approach in terms of statistical power and robustness to SNP set characteristics are evaluated in a wide range of disease models by comparing it to previous methods. We also apply our method to detect gene pairs associated with rheumatoid arthritis (RA) in the GSE39428 dataset. We identify 13 potential gene-gene interactions and replicate one gene pair in the Wellcome Trust Case Control Consortium dataset at the 5% level. We further test 15 gene pairs, previously reported as being statistically associated with RA, Crohn’s disease (CD) or coronary artery disease (CAD), for replication in the Wellcome Trust Case Control Consortium dataset. We show that AGGrEGATOr is the only method able to successfully replicate seven gene pairs.
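To illustrate the general idea of a minP aggregation, the following minimal Python sketch takes a table of SNP-SNP interaction p-values for one gene pair and converts the smallest one into a single gene-pair p-value. It is not the AGGrEGATOr implementation: the abstract does not state how the null distribution of the minimum is obtained, so the sketch assumes independent tests (a Šidák-type correction), which is unlikely to hold for SNPs in linkage disequilibrium. The function name `minp_independent` and the p-values are hypothetical.

```python
def minp_independent(p_values):
    """Gene-pair p-value from the minimum SNP-SNP p-value.

    Assumption (not from the paper): the SNP-SNP tests are independent,
    so the minimum of m uniform p-values has CDF 1 - (1 - p)^m.
    """
    # Flatten the (SNPs in gene A) x (SNPs in gene B) table of p-values.
    flat = [p for row in p_values for p in row]
    if not flat:
        raise ValueError("need at least one SNP-SNP p-value")
    if any(p < 0.0 or p > 1.0 for p in flat):
        raise ValueError("p-values must lie in [0, 1]")
    p_min = min(flat)
    m = len(flat)
    # Probability that the smallest of m independent p-values is <= p_min.
    p_gene = 1.0 - (1.0 - p_min) ** m
    return p_min, p_gene


# Hypothetical p-values: 2 SNPs in gene A crossed with 3 SNPs in gene B.
snp_pair_p = [
    [0.20, 0.04, 0.51],
    [0.73, 0.01, 0.33],
]
p_min, p_gene = minp_independent(snp_pair_p)
print(f"min p = {p_min}, gene-pair p = {p_gene:.4f}")
```

With six tests and a smallest p-value of 0.01, the gene-pair p-value under this independence assumption is 1 - 0.99^6, roughly 0.0585. When the tests are correlated, the independence formula overstates the effective number of tests, which is why a procedure that models the correlation between SNP-SNP statistics would be preferable in practice.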
Acknowledgments
I acknowledge Maud Marchal for reading through the manuscript.
Supplemental Material:
The online version of this article (DOI: 10.1515/sagmb-2015-0074) offers supplementary material, available to authorized users.
©2016 by De Gruyter