Gene-gene Interaction Analyses for Atrial Fibrillation

Atrial fibrillation (AF) is a heritable disease that affects more than thirty million individuals worldwide. Extensive efforts have been devoted to the study of genetic determinants of AF. The objective of our study is to examine the effect of gene-gene interaction on AF susceptibility. We performed a large-scale association analysis of gene-gene interactions with AF in 8,173 AF cases and 65,237 AF-free referents collected from 15 studies for discovery. We examined putative interactions between genome-wide SNPs and 17 known AF-related SNPs. The top interactions were then tested for association in an independent replication cohort, which included 2,363 AF cases and 114,746 AF-free referents. One interaction, between rs7164883 at the HCN4 locus and rs4980345 at the SLC28A1 locus, was found to be significantly associated with AF in the discovery cohorts (interaction OR = 1.44, 95% CI: 1.27–1.65, P = 4.3 × 10⁻⁸). Eight additional gene-gene interactions were also marginally significant (P < 5 × 10⁻⁷). However, none of the top interactions were replicated. In summary, we did not find significant interactions that were associated with AF susceptibility. Future increases in sample size and denser genotyping might facilitate the identification of gene-gene interactions associated with AF.

Epistasis refers to the interaction of multiple genes that might pose joint genetic effects. Epistasis plays a ubiquitous role in disease predisposition, conferring an increased risk in addition to the main effects for many complex diseases, such as breast cancer 18 and coronary heart disease 19 . Gene-gene interactions play important roles in regulating various biological events and cellular behaviors 20,21 . However, it remains unclear whether gene interactions contribute to the biological basis of AF.
The most straightforward approach to identifying interactions is to perform an exhaustive search of all possible combinations of genetic variants and to test whether any of them are significantly associated with AF. However, a major problem with such a comprehensive search is the huge computational burden. Assuming one million SNPs are genotyped in a typical GWAS, a complete search of a two-marker model would require testing 5 × 10¹¹ pairs of SNPs. This number would increase exponentially for multiple-SNP models. The cost of multiple testing correction is extreme even in the one-million-marker scenario. For example, a Bonferroni correction across that many tests requires P < 1 × 10⁻¹³ for significance. As few SNP pairs will meet this threshold, false negatives are likely without massive sample sizes.
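The arithmetic behind these figures can be checked with a short sketch. It assumes unordered SNP pairs and a family-wise error rate of 0.05; the function name `pairwise_search_burden` is illustrative and not part of the original analysis.

```python
from math import comb


def pairwise_search_burden(n_snps: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return (number of unordered SNP pairs, Bonferroni per-test threshold).

    alpha is the assumed family-wise error rate (0.05 here).
    """
    n_pairs = comb(n_snps, 2)  # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs


# One million genotyped SNPs, as in the scenario described above.
n_pairs, threshold = pairwise_search_burden(1_000_000)
print(f"{n_pairs:.2e} pairs, per-test threshold {threshold:.1e}")
```

Restricting one side of each pair to a small set of known loci, as done in this study, reduces the count from n(n − 1)/2 to roughly the number of known loci times n.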
It has been suggested that at least one variant in significant gene-gene interactions tends to have a strong main effect 22 . We therefore sought to identify potential interactions between top AF susceptibility SNPs and other genome-wide variants in relation to AF by performing a meta-analysis of results from multiple studies.

Results
In total, our study included 8,173 AF cases and 65,237 AF-free referents of European ancestry from 15 studies. The clinical characteristics of the study participants are shown in Table 1.
Supplemental Figure 1 presents Q-Q plots for the interaction P-values of genome-wide SNPs with each of the AF-associated variants. The effect of population stratification was negligible, with genomic control λ ranging from 0.98 to 1.01. Table 2 shows the most significant interactions (P < 5 × 10⁻⁷) that were associated with AF susceptibility. The top 10 interactions for each AF SNP are shown in Supplemental Table 1. None of the interactions reached significance after adjusting for multiple testing (P < 5 × 10⁻⁸/17 = 2.8 × 10⁻⁹). Only one interaction, SNP rs7164883 with rs4980345, exceeded the traditional genome-wide significance threshold (P < 5 × 10⁻⁸) for association with AF (P = 4.3 × 10⁻⁸). Both interacting SNPs are located on chromosome 15, 12 Mb apart. The corresponding regional plot is shown in Fig. 1, and the forest plot of each contributing study is shown in Fig. 2. The SNP rs7164883 is located within the first intron of HCN4, and was also one of the top SNPs found to be significantly associated with AF in our previous study 10 . The SNP rs4980345 is located within the tenth intron of SLC28A1. SNP rs4980345 was not associated with AF (P = 0.78) in marginal analyses from the prior meta-analysis 10 .
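As an illustration of how a genomic control λ of this kind is commonly obtained, the following sketch divides the median observed 1-df chi-square statistic by its expected median under the null. This is an assumption about the usual definition, not a description of the software used in the study; the function name and the example P-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Estimate lambda as median(observed chi-square) / expected null median.

    Each two-sided P-value p (0 < p < 1) is converted to a 1-df chi-square
    statistic via z = inv_cdf(p / 2) and chi2 = z ** 2.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical, evenly spaced P-values standing in for a well-calibrated scan.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_control_lambda(uniform_p)
print(f"lambda = {lam:.3f}")
```

Values close to 1, such as the 0.98 to 1.01 range reported here, indicate little inflation of the test statistics.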
We also tested the interaction between rs2106261 at the ZFHX3 locus and rs2200733 at the PITX2 locus, which was recently reported to be associated with AF in a meta-analysis of three Chinese samples (OR = 5.36, P = 8.0 × 10⁻²⁴) 23 . The interaction, however, was not significant in any of the 16 studies included in the present paper, or in our meta-analysis (all with P > 0.05).
We then tried to replicate our findings in an independent cohort, UK Biobank, which included 2,363 AF cases and 114,746 AF-free referents. As shown in Table 2, none of the significant or suggestive interactions from the discovery phase were replicated (all with P > 0.05/9 = 0.0056).

Discussion
In the past decade, increasing evidence has suggested that genetic predisposition is an important contributor to AF as well as to many other cardiovascular diseases 24,25 . Because of the enormous number of association tests required, few studies have investigated the association of gene interactions with AF susceptibility. By restricting our analyses to interactions with known AF loci, we limited the multiple testing burden and sought to examine the potential mechanisms by which variants at top loci contribute to AF susceptibility. One interaction, between rs7164883 at the HCN4 locus and rs4980345 at the SLC28A1 locus, exceeded the conventional genome-wide significance threshold. Eight additional interactions were marginally significant (P < 5 × 10⁻⁷), but did not withstand multiple testing correction. However, none of the top interactions were significant in the replication phase. It is noteworthy that the ORs of the suggestive interactions in the replication cohort were very modest. The most significant interaction from the discovery cohorts, rs7164883 with rs4980345, was even in the opposite direction in the replication cohort. Given that the replication cohort has a similar genetic background to the discovery cohorts, this discrepancy indicates that these suggestive interactions are unlikely to be true AF-related interactions.
Our analyses were restricted to interactions with loci previously found to have a main effect association with AF. The underlying assumption of our approach is that interactions with significant effects tend to have observable main effects in at least one of the interacting SNPs 22 . However, it is possible that two variants without main effects have a large interaction effect. Our analysis cannot identify such interactions. A variety of other methods have been developed to account for the enormous number of interactions between variants in genetic association studies 26,27 .
One approach is to employ prior biological knowledge to limit the search space 28 . Gene interactions have been discovered through experimental assays, and these might be used to guide the search for potential variant interactions. However, it has been recognized that many known genetic interactions are enriched in well-studied pathways and may occur only under certain conditions 29 , which might introduce additional bias to the analysis. In fact, none of the top interactions identified in the present study was reported in known interaction databases 30 , suggesting that the interaction between some variants may arise through some other intermediate pathway.
We did not detect a recently reported interaction with AF by Huang and colleagues 23 . This interaction involved rs2106261 at the ZFHX3 locus and rs2200733 at the PITX2 locus. SNP rs2106261 was the most significant SNP at the ZFHX3 locus associated with AF in our earlier meta-analysis 9 . SNP rs2200733 was one of the top SNPs at the PITX2 locus, and is in complete linkage disequilibrium (r² = 1.0) with the most significant SNP rs6817105, the SNP we tested in this study. One possible explanation for the discrepancy between the findings of the two studies is the difference in allele frequency between the Asian population studied by Huang 23 and the European ancestry population we studied (18% vs 28% for rs2106261, and 45% vs 16% for rs2200733, respectively). The effect of differences in allele frequency and linkage disequilibrium could be amplified when the interaction is tested, suggesting that population stratification should be considered when comparing the results from studies based on different ethnicities.
We acknowledge several limitations of our study. All study participants in our study are of European ancestry, thus it is unclear whether our findings are relevant for populations of other ancestries. Furthermore, our analysis was restricted to two-variant interactions. However, it is possible that some interactions might involve more than two variants. Although our current study included more than 8,000 AF cases and 65,000 referents, it is possible that we did not have sufficient power to identify meaningful interactions for AF. We are currently expanding our AFGen Consortium to include additional cohorts, not only participants of European ancestry, but also participants of African ancestry and Asian ancestry. With the increasing sample size, we might be able to identify significant interactions in the future. In addition, we are currently imputing genotypes from individual studies to emerging reference panels such as the Haplotype Reference Consortium 31 , which is expected to provide better resolution to identify interacting variants. Given that our current study only tested interactions with known AF loci, we are also planning to expand our analyses to all interactions with the increasing sample size and more advanced computational methods.
In summary, we identified one genome-wide significant gene-gene interaction that was associated with AF susceptibility, suggesting that gene interactions might be involved in the development of AF. However, the finding was not replicated. Future work in functional genomics and efficient algorithms for epistasis analysis will likely facilitate the discovery of additional novel and high-order interactions that contribute to AF.

Materials and Methods
Study participants. Our
Table 2. Most significant interactions associated with AF (P < 5 × 10⁻⁷). CAF: coding allele frequency; OR: odds ratio; CI: confidence interval.
AF ascertainment. Details about AF ascertainment were described in previous publications 9,10,14 . Briefly, at each study, we combined evidence from a variety of sources to determine AF status, including electrocardiograms, Holter recordings, rhythm cards, medical records, and/or hospital discharge diagnostic codes. To achieve higher statistical power, we did not distinguish prevalent and incident AF cases, but combined them as individuals with a history of AF.
Genotyping. Genotyping was performed independently in each study, using either Affymetrix SNP arrays or Illumina SNP arrays 9 , and then imputed to ~2.5 million SNPs in the HapMap II release 22 CEU panel to obtain a comprehensive set of SNPs across the genome. Detailed information regarding genotyping platforms, quality control metrics, and imputation methods for each study has been described previously 9,10,12-14 .
Known AF-associated variants. The known AF-associated variants were selected from recent GWAS.
In the interaction model, β₁ and β₂ are the main effects for the known AF SNP and the SNP to be tested, respectively, and β_int represents the effect of the interaction between the AF SNP and the SNP to be tested. PCs represent principal components, included as necessary in each study to account for population structure. The model was also adjusted for age at DNA draw and sex, two factors that contribute significantly to AF risk. Studies with multiple study centers also adjusted for site. To account for the family correlation in FHS, we used generalized estimating equations (GEE) as implemented in the "geepack" R package. In FHS, the model was fitted with an independence working correlation structure, and each pedigree was treated as a cluster in the robust variance estimate for the effect of interest. The null hypothesis was that the interaction term β_int = 0. Each study estimated and provided β_int and a robust estimate of its standard error, SE(β_int), for each SNP interacting with each of the 17 AF-associated SNPs. Thus, we performed 17 interaction GWAS. The study-specific interaction regression parameter estimates were then meta-analyzed using METAL 32 , applying a fixed effects approach weighted by the inverse of the variance. The effect of interaction was presented as an interaction odds ratio (OR), i.e., exp(β_int). Given that we performed the genome-wide test for 17 SNPs, we defined significant interactions as those with a P-value less than 2.8 × 10⁻⁹ (= 5 × 10⁻⁸/17 SNPs tested).
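The meta-analysis step can be illustrated with a minimal sketch of inverse-variance weighted fixed-effects pooling. It assumes each study supplies an interaction coefficient on the log-odds scale (for example from a logistic model of the form logit P(AF) = β₀ + β₁·G_AF + β₂·G_test + β_int·G_AF·G_test + covariates, which is our reading of the definitions above) together with its standard error. The function is not METAL, its name is illustrative, and the example estimates are hypothetical rather than taken from the study.

```python
from math import erfc, exp, sqrt


def fixed_effect_meta(betas, ses):
    """Pool per-study log-odds estimates with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / SE_i^2
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = sqrt(1.0 / total_weight)                    # SE of the pooled estimate
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))                     # two-sided normal P-value
    return {
        "beta": beta,
        "se": se,
        "p": p,
        "or": exp(beta),                             # interaction odds ratio
        "ci95": (exp(beta - 1.96 * se), exp(beta + 1.96 * se)),
    }


# Hypothetical per-study interaction estimates and standard errors.
result = fixed_effect_meta([0.30, 0.45, 0.38], [0.12, 0.15, 0.10])
print(f"OR = {result['or']:.2f}, P = {result['p']:.2e}")
```

Studies with smaller standard errors receive proportionally more weight, so large cohorts dominate the pooled estimate.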
In the replication phase, we tested the association of significant or suggestive interactions (P < 5 × 10⁻⁷) in an independent cohort, UK Biobank. An interaction was considered replicated if it had the same direction of effect as in the discovery analysis and an association P < 0.05/N, where N was the number of tests.
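The replication criterion can be expressed as a small check. This is a sketch under the rule stated above (same sign of effect and P below 0.05/N); the function name and the example values are hypothetical.

```python
def is_replicated(beta_discovery, beta_replication, p_replication,
                  n_tests, alpha=0.05):
    """Return True if an interaction meets the replication criterion.

    Requires the same direction of effect in both phases and a replication
    P-value below the Bonferroni threshold alpha / n_tests.
    """
    same_direction = beta_discovery * beta_replication > 0
    return same_direction and p_replication < alpha / n_tests


# Nine interactions were carried forward, giving a threshold of 0.05 / 9.
print(is_replicated(0.36, 0.20, 0.001, n_tests=9))   # concordant and below threshold
print(is_replicated(0.36, -0.05, 0.001, n_tests=9))  # opposite direction
```

An interaction with a reversed direction of effect fails the check regardless of its P-value.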