Systematic Evaluation of Pleiotropy Identifies 6 Further Loci Associated With Coronary Artery Disease

Background Genome-wide association studies have so far identified 56 loci associated with risk of coronary artery disease (CAD). Many CAD loci show pleiotropy; that is, they are also associated with other diseases or traits. Objectives This study sought to systematically test if genetic variants identified for non-CAD diseases/traits also associate with CAD and to undertake a comprehensive analysis of the extent of pleiotropy of all CAD loci. Methods In discovery analyses involving 42,335 CAD cases and 78,240 control subjects we tested the association of 29,383 common (minor allele frequency >5%) single nucleotide polymorphisms available on the exome array, which included a substantial proportion of known or suspected single nucleotide polymorphisms associated with common diseases or traits as of 2011. Suggestive association signals were replicated in an additional 30,533 cases and 42,530 control subjects. To evaluate pleiotropy, we tested CAD loci for association with cardiovascular risk factors (lipid traits, blood pressure phenotypes, body mass index, diabetes, and smoking behavior), as well as with other diseases/traits through interrogation of currently available genome-wide association study catalogs. Results We identified 6 new loci associated with CAD at genome-wide significance: on 2q37 (KCNJ13-GIGYF2), 6p21 (C2), 11p15 (MRVI1-CTR9), 12q13 (LRP1), 12q24 (SCARB1), and 16q13 (CETP). Risk allele frequencies ranged from 0.15 to 0.86, and odds ratio per copy of the risk allele ranged from 1.04 to 1.09. Of 62 new and known CAD loci, 24 (38.7%) showed statistical association with a traditional cardiovascular risk factor, with some showing multiple associations, and 29 (47%) showed associations at p < 1 × 10−4 with a range of other diseases/traits. Conclusions We identified 6 loci associated with CAD at genome-wide significance. Several CAD loci show substantial pleiotropy, which may help us understand the mechanisms by which these loci affect CAD risk.

cardiovascular risk factor, particularly blood pressure and lipid traits (2). Furthermore, several loci show association with other diseases; for example, the CAD-associated variants in the chromosome 9p21 locus also associate with risk of stroke as well as abdominal aortic and intracranial aneurysms (3,4). These observations suggest that a comprehensive analysis of variants associated with other diseases and traits might not only identify additional loci associated with risk of CAD, but also provide important insights [...] on the array for identity by descent testing. The results of an analysis of rare (minor-allele frequency <5%) coding sequence ("exome") variants on this array with CAD were recently reported (5).
We identified 6 new loci associated with CAD at genome-wide significance, annotated them, and undertook a detailed examination of the extent of pleiotropy of these loci as well as the previously known CAD loci.

METHODS
The study consisted of discovery and replication phases and has been described in more detail elsewhere (5). Briefly, the discovery cohort included 42,335 cases and 78,240 control subjects from 20 individual studies (Online Table 1); the replication cohort, which was separately assembled and [...] Table 2). With the exception of participants from 2 studies in the replication cohort who were of South Asian ancestry, all participants were of European ancestry (Online Table 2).
Samples were genotyped on the Illumina HumanExome BeadChip versions 1.0 or 1.1, or the Illumina OmniExome (which includes markers from the HumanExome BeadChip) arrays followed by quality control procedures as previously described (5).
STATISTICAL ANALYSIS. In discovery samples that passed quality control procedures, we performed individual tests for association of the selected variants with CAD in each study separately, using logistic regression analysis with principal components of ancestry as covariates (5). We combined evidence across individual studies using an inverse-variance weighted fixed-effects meta-analysis. Heterogeneity was assessed by Cochran's Q statistic (6). In the discovery phase, we defined suggestive novel association as a meta-analysis p value ≤1 × 10−6.
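The pooling step described above can be sketched as follows. This is a minimal illustration, assuming each study contributes a per-allele log odds ratio and its standard error for a given variant; the function name and the returned field names are hypothetical and are not taken from the study's analysis pipeline.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-study log odds ratios for one variant
    ses:   per-study standard errors of those log odds ratios
    Returns the pooled estimate, a two-sided p value from a normal
    approximation, and Cochran's Q with its degrees of freedom.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    # Two-sided p value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))

    return {
        "beta": pooled_beta,
        "se": pooled_se,
        "odds_ratio": math.exp(pooled_beta),
        "z": z,
        "p_value": p_value,
        "q": q,
        "df": len(betas) - 1,
    }


if __name__ == "__main__":
    # Hypothetical per-study estimates, for illustration only.
    result = fixed_effects_meta([0.05, 0.08, 0.06], [0.02, 0.03, 0.025])
    print(result)
```

Under the null hypothesis of homogeneity, Q approximately follows a chi-square distribution with `df` degrees of freedom; a Q value far above `df` would argue against a fixed-effects model.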
For variants with suggestive association, we performed association analysis in the replication studies (Online Appendix). We defined significant novel associations as those nominally significant (p < 0.05) in the replication cohort and reaching genome-wide significance (p < 5 × 10−8) in the combined discovery and replication meta-analysis.
For both the novel loci and all previously reported CAD loci (1,2), we tested the association of the lead CAD-associated variant (or, if unavailable, a proxy) with traditional cardiovascular risk factors using publicly available GWAS meta-analysis datasets for systolic, diastolic, and pulse pressures (7,8); low-density lipoprotein (LDL) cholesterol level; high-density lipoprotein (HDL) cholesterol level; triglycerides level (9,10); type 2 diabetes mellitus (11); body mass index (BMI) (12); and smoking quantity (13). The maximum size of these datasets ranged from 41,150 to 339,224 individuals. For variants available on the exome array with a known genome-wide association with a risk factor, we also compared the magnitude of the reported association with the risk factor to the observed association with CAD in our analysis.
To identify any associations with other diseases or traits, we searched version 2 of the GRASP (Genome-Wide Repository of Associations between SNPs and Phenotypes) database (14) and the National Human Genome Research Institute-European Bioinformatics Institute GWAS catalog (15), and collected all associations with a p value below 1 × 10−4. For all associations, we identified the lead variant for that trait or disease and calculated pairwise LD with the lead CAD-associated variant using the SNAP web server (16).
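For readers unfamiliar with the pairwise LD measure reported throughout (r²), the sketch below shows the standard definition for 2 biallelic variants computed from haplotype and allele frequencies. It is not the SNAP implementation, which derives LD from reference panel genotypes; the function name and the example frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise linkage disequilibrium (r^2) between 2 biallelic variants.

    p_ab: frequency of the haplotype carrying allele A at the first
          variant and allele B at the second
    p_a:  frequency of allele A at the first variant
    p_b:  frequency of allele B at the second variant
    """
    for freq in (p_ab, p_a, p_b):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator <= 0.0:
        raise ValueError("both variants must be polymorphic")

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the 2 alleles were independent.
    d = p_ab - p_a * p_b
    return (d * d) / denominator


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    print(ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.32))
```

An r² of 1 means the 2 variants are perfect proxies for each other in the reference population, whereas an r² of 0 means their alleles are inherited independently.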

RESULTS
In the discovery cohort, 28 variants not located in a known CAD locus (defined as ±300 kb from the published lead SNP) showed association with CAD at a p value <1 × 10−6 (Online Table 3). No marked heterogeneity was observed, justifying the use of a fixed-effects model. We then tested these 28 variants for replication, and 6 variants showed both a nominally significant (p < 0.05) association in the replication cohort and a combined discovery and replication meta-analysis p value passing the threshold for genome-wide significance (p < 5 × 10−8) (Table 1). As is typical for GWAS findings, the risk alleles were common (allele frequencies ranging from 15% to 86%), and the risk increase per allele was modest (ranging from 4% to 9%) (Table 1).
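The selection logic in this paragraph can be summarized in a short sketch. The thresholds and the 300-kb window are taken from the text; the function names, the tuple layout for known lead SNPs, and the example positions are hypothetical, and any further criteria the authors may have applied (for example, consistency of effect direction) are not modeled.

```python
KNOWN_LOCUS_WINDOW_BP = 300_000   # +/-300 kb around a published lead SNP
SUGGESTIVE_P = 1e-6               # discovery-phase threshold
NOMINAL_P = 0.05                  # replication-phase threshold
GENOME_WIDE_P = 5e-8              # combined meta-analysis threshold


def in_known_locus(chrom, pos, known_leads):
    """True if the variant lies within the window of any known lead SNP.

    known_leads: iterable of (chromosome, base-pair position) tuples.
    """
    return any(
        chrom == lead_chrom and abs(pos - lead_pos) <= KNOWN_LOCUS_WINDOW_BP
        for lead_chrom, lead_pos in known_leads
    )


def is_suggestive_novel(p_discovery, chrom, pos, known_leads):
    """Discovery-phase filter: suggestive p value outside known loci."""
    return p_discovery <= SUGGESTIVE_P and not in_known_locus(
        chrom, pos, known_leads
    )


def is_replicated(p_replication, p_combined):
    """Replication rule: nominal replication plus genome-wide combined p."""
    return p_replication < NOMINAL_P and p_combined < GENOME_WIDE_P


def percent_risk_increase(odds_ratio):
    """Express a per-allele odds ratio as a percentage increase in odds."""
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical known lead SNP and candidate variant.
    known = [("9", 22_000_000)]
    print(is_suggestive_novel(4e-7, "12", 57_000_000, known))
    print(is_replicated(p_replication=0.01, p_combined=3e-8))
    print(percent_risk_increase(1.04))
```

The last helper makes explicit how the reported odds ratios of 1.04 to 1.09 correspond to the 4% to 9% per-allele risk increase quoted above.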
ANNOTATION OF NOVEL LOCI. Forest and regional association plots for the 6 [...] Table 4). Apart from the lead variant at the KCNJ13-GIGYF2 locus, which is a nonsynonymous SNP, none of the other loci had a variant affecting protein sequence in high LD with the lead variant.
[...] cholesteryl ester transfer protein (CETP) gene, which mediates the transfer of cholesteryl esters from HDL cholesterol to other lipoproteins and was placed on the array because of its association with plasma HDL cholesterol level (9,10). The risk (C) allele is associated with lower HDL cholesterol and modest increases in plasma LDL cholesterol and triglycerides levels (9,10). Previous studies have shown that rs1800775 is itself functional in that the C allele disrupts binding of the Sp1 transcription factor, resulting in increased promoter activity (18). This is in agreement with our annotation, which predicts rs1800775 to be more likely functional than the only other SNP in high LD, rs3816117 (Online Figure 3). Consistent with this, we also found associations between rs1800775 and CETP expression in monocytes and liver (r² of 0.77 with the best eSNP, i.e., the lead SNP for the eQTL) (Online Table 5), and previous studies have shown that the variant is also associated with plasma CETP level (19,20).
[...] Table 6), but a stronger association with plasma LDL cholesterol and triglycerides levels (Table 2). rs11057830 was included on the array because of an association of the A allele (CAD risk-associated allele) with higher levels of vitamin E (Table 3) (21). Variants in high LD with the CAD risk allele at rs11057830 have also been associated with increased lipoprotein-associated phospholipase A2 (Lp-PLA2) activity (22). Analysis of eQTL identified an association between rs11057841 (r² = 0.92 with the lead variant) and expression of SCARB1 in the intestine (Online Table 5).
Functional annotation of the locus did not identify a strong candidate causal SNP, but rs10846744 (r² = 0.94 with the lead variant) overlaps a deoxyribonuclease I hypersensitivity peak in a region bound by several transcription factors (Online Figure 3).
[...] Table 4). The risk (C) allele of the lead variant has previously been associated with reduced risk of migraine (23), and there is an association of the alternate (T) allele with reduced lung function (24).
rs3869109, another variant at the HLA locus approximately 700 kb away from the new lead variant, has been reported to be associated with CAD (27). In our discovery cohort, rs3869109 has a p value of association with CAD of 0.23.
[...] Table 5).

CENTRAL ILLUSTRATION Significant Associations of CAD Loci With Cardiovascular Risk Factors
This chord diagram depicts associations that passed Bonferroni correction (Table 2). Connections indicate that single nucleotide polymorphisms at respective loci associate with both coronary artery disease (CAD) and the respective risk factor; they do not imply that the risk factor causally explains the association with CAD. Red indicates new CAD loci. BMI = body mass index; HDL = high-density lipoprotein; LDL = low-density lipoprotein. Webb et al.

The full results are shown in Online Table 6, and the significant associations are summarized in Table 2. Several loci showed multiple associations (Table 3).
Although in most cases the CAD-associated risk allele was also associated with an increased risk (or level) of the other disease or trait, this was not always the case. Furthermore, in some loci with multiple associations, the direction of association varied between diseases (Table 3).

DISCUSSION
This large-scale meta-analysis of common variants, including many with prior evidence for association with another complex trait, resulted in the identification of 6 new CAD loci at genome-wide significance. We also showed that almost one-half of the CAD loci that have been identified to date demonstrate pleiotropy, an association with another disease or trait. The findings added to our understanding of the genetic basis of CAD and might provide clues to the mechanisms by which such loci affect CAD risk.
Our finding of a genome-wide association with CAD of a functional variant in the promoter of the CETP gene that is also associated with its expression and plasma activity (18-20) adds to previous evidence linking genetically determined increased activity of this gene with higher risk of CAD (20).
There has been a longstanding interest in CETP inhibition as a therapeutic target, primarily because of the effect on plasma HDL cholesterol level. However, several CETP inhibitors have recently failed to improve cardiovascular outcomes in large randomized clinical trials (28-30) and, in 1 case, caused harm [...] Although previous studies have shown that the CETP genetic variant we report here affects CETP activity, the precise mechanism(s) by which this variant modifies CAD risk remains uncertain.
A notable finding was the association with CAD of common variants located in the SCARB1 gene. Association of variants at the SCARB1 locus with CAD was also reported by the CARDIoGRAMplusC4D consortium, but this did not reach genome-wide significance (1). The gene encodes the canonical receptor, SR-BI, responsible for HDL cholesteryl ester uptake in hepatocytes and steroidogenic cells (33). Genetic modulation of SR-BI levels in mice is associated with marked changes in plasma HDL cholesterol (34).
Consistent with this, a rare loss-of-function variant in which leucine replaces proline 376 (P376L) in SCARB1 was recently identified through sequencing of individuals with high plasma HDL cholesterol (35).
Interestingly, despite having higher plasma HDL, 376L carriers had an increased risk of CAD, suggesting that the association of variation at this locus with CAD is not driven primarily through plasma HDL (35).
Indeed, there is only a nominal association of the lead CAD variant at this locus (rs11057830) with plasma HDL cholesterol (Online Table 6). The variant is also modestly associated with plasma LDL cholesterol and serum triglycerides (Table 2). All 3 of these lipid associations are directionally consistent with epidemiological evidence linking them to CAD risk and could, in combination, explain the association of the locus with CAD. However, the lead variant is more strongly associated with Lp-PLA2 activity and mass (Table 3), which could provide an alternative explanation for its association with CAD. Irrespective of the mechanism, our findings, when combined with those of Zanoni et al. (35), suggest that modulating SR-BI may be therapeutically beneficial.
After adjusting for multiple testing, we found that slightly more than one-third of the CAD loci showed an association with traditional cardiovascular risk factors. Although the vast majority of associations were in the direction consistent with the epidemiological association of these risk factors with CAD, as noted in the previous text with respect to loci affecting the HDL cholesterol level, this should not be interpreted as implying that these loci affect CAD risk through an effect on the specific risk factor. Indeed, for variants available on the array with a known genome-wide association with these risk factors, we found a poor correlation between the magnitude of their effect on the risk factor and their association with CAD in our dataset, except for LDL cholesterol (Online Figure 4).
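The multiple-testing adjustment mentioned above can be illustrated with a Bonferroni sketch. The number of tests actually used by the authors is not stated in this section; the figure below assumes, purely for illustration, that each of the 62 loci was tested against each of the 9 risk-factor traits listed in the Methods. The function names and example p values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, n_tests, alpha=0.05):
    """Keep only the associations whose p value passes the threshold.

    p_values: dict mapping a (locus, trait) label to its p value.
    n_tests:  total number of tests performed, including those not
              listed in p_values.
    """
    threshold = bonferroni_threshold(alpha, n_tests)
    return {label: p for label, p in p_values.items() if p < threshold}


if __name__ == "__main__":
    # Assumption for illustration: 62 loci x 9 traits = 558 tests.
    n_loci, n_traits = 62, 9
    n_tests = n_loci * n_traits
    print(bonferroni_threshold(0.05, n_tests))

    # Hypothetical association p values.
    example = {
        ("locus_A", "LDL"): 1e-12,
        ("locus_A", "BMI"): 0.003,
        ("locus_B", "HDL"): 5e-5,
    }
    print(significant_after_bonferroni(example, n_tests))
```

Bonferroni correction is conservative when the tested traits are correlated, as lipid and blood pressure traits are, so associations that narrowly miss the threshold are not evidence of absence.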
Almost one-half of the CAD loci showed a strong or suggestive association with other diseases or traits with, in many cases, the identical variant being the lead variant reported for the association with these other conditions (Table 3). Some of the associations with other traits, for example coronary calcification (3q22, 6p24, 9p21, 13q34, and 15q25) or carotid intima-media thickness (4q31 and 19q13), are not surprising, as these traits are known to be correlated with CAD. Others, such as risk of stroke (7p21 and 9p21), might reflect a shared etiology. However, the mechanism(s) behind most of the observed pleiotropy is not clear, although the findings could provide clues as to how the locus may affect CAD risk. As an example, 5 loci (12q24, 1p13, 6q25, 11q23, and 19q13) show strong associations with plasma activity and/or mass of Lp-PLA2. Lp-PLA2 is expressed in atherosclerotic plaques, where studies have suggested a role in the production of proinflammatory and proapoptotic mediators, primarily through interaction with oxidized LDL (37,38). A meta-analysis of prospective studies showed an independent and continuous relationship of plasma Lp-PLA2 with CAD risk (39). However, it should be noted that Mendelian randomization analyses have not supported a causal role of secreted Lp-PLA2 in coronary heart disease (40), and phase III trials of darapladib, an Lp-PLA2 inhibitor, have shown no benefit in patients with stable coronary heart disease (41) or acute coronary syndromes (42) when added to conventional treatments including statins.
Chronic inflammation plays a key role in the pathogenesis of both CAD and inflammatory bowel disease. It is therefore interesting to note the association of the same locus at 15q22 with CAD as well as Crohn's disease and ulcerative colitis (Table 3). Association of this locus with CAD at genome-wide significance was recently reported by the CARDIoGRAMplusC4D consortium (2), with the lead SNP (rs56062135) showing strong linkage disequilibrium (r² = 0.9) with the lead SNP (rs17293632) associated with inflammatory bowel disease. Both rs56062135 and rs17293632 lie in a region of ~30 kb within the initial introns of the SMAD family member 3 gene (SMAD3), a signal transducer in the transforming growth factor-beta pathway.
Indeed, rs17293632 was included on the exome array because of its known association with Crohn's disease and showed a significant association with CAD in our combined dataset (p = 1.78 × 10−8). Farh et al. (43) interrogated ChIP-seq data from ENCODE and found allele-specific binding of the AP-1 transcription factor to the major (C) allele in heterozygous cell lines and