Genomic Risk Prediction of Coronary Artery Disease in 480,000 Adults

Background Coronary artery disease (CAD) has substantial heritability and a polygenic architecture. However, the potential of genomic risk scores to help predict CAD outcomes has not been evaluated comprehensively, because available studies have involved limited genomic scope and limited sample sizes.
Objectives This study sought to construct a genomic risk score for CAD and to estimate its potential as a screening tool for primary prevention.
Methods Using a meta-analytic approach to combine large-scale, genome-wide, and targeted genetic association data, we developed a new genomic risk score for CAD (metaGRS) consisting of 1.7 million genetic variants. We externally tested metaGRS, both by itself and in combination with available data on conventional risk factors, in 22,242 CAD cases and 460,387 noncases from the UK Biobank.
Results The hazard ratio (HR) for CAD was 1.71 (95% confidence interval [CI]: 1.68 to 1.73) per SD increase in metaGRS, an association larger than any other externally tested genetic risk score previously published. The metaGRS stratified individuals into significantly different life course trajectories of CAD risk, with those in the top 20% of metaGRS distribution having an HR of 4.17 (95% CI: 3.97 to 4.38) compared with those in the bottom 20%. The corresponding HR was 2.83 (95% CI: 2.61 to 3.07) among individuals on lipid-lowering or antihypertensive medications. The metaGRS had a higher C-index (C = 0.623; 95% CI: 0.615 to 0.631) for incident CAD than any of 6 conventional factors (smoking, diabetes, hypertension, body mass index, self-reported high cholesterol, and family history). For men in the top 20% of metaGRS with >2 conventional factors, 10% cumulative risk of CAD was reached by 48 years of age.
Conclusions The genomic score developed and evaluated here substantially advances the concept of using genomic information to stratify individuals with different trajectories of CAD risk and highlights the potential for genomic screening in early life to complement conventional risk prediction.

As coronary artery disease (CAD) is the leading cause of morbidity and mortality worldwide, early identification of individuals who are at high risk of CAD is essential for primary prevention.
As the heritability of CAD has been estimated to be 40% to 60%, comprehensive information on genetic susceptibility could contribute importantly to CAD risk stratification (1,2).
Although family history has long been identified as a risk factor for CAD, elucidation of the genetic architecture of CAD has advanced substantially only during the past decade with the advent of genome-wide association studies. Results from these assumption-free surveys across the genome have laid foundations for developing genomic risk scores (GRS) in the estimation of an individual's underlying genomic risk (3-9). Furthermore, because GRS are based on germline DNA, they are quantifiable in early life, at or before birth. Hence, they offer the potential for early risk screening and primary prevention before other conventional risk factors become informative.
Due to several inter-related factors, however, previous GRS for CAD have been unable to provide comprehensive assessment of the potential of using genomic information in CAD risk prediction.
First, because previously published GRS have utilized only genetic variants of genome-wide significance (4,5,8) or involved genotyping arrays that focused only on pre-selected loci (3), they have not fully utilized genome-wide variation, preventing accurate estimation of the relative contribution of each genetic variant to CAD risk. Second, because previous studies mass index [BMI], diabetes, family history, and high cholesterol) on different genomic risk backgrounds, with the aim of delineating event rates across age, sex, clinical risk factors, and genomic risk score strata to identify individuals who are more likely to benefit from earlier and more intensive therapies. Finally, to assess the potential therapeutic implications of genomic risk scores, we tested the impact of blood pressure and lipid-lowering medication on the performance of the metaGRS.

METHODS
STUDY DESIGN AND PARTICIPANTS. The design of this study is shown in Online Figure 1. Details of the design of the UKB have been reported previously (15).
Participants were members of the general U.K. population between age 40 and 69 years at recruitment, identified through primary care lists, who accepted an
Genotyping of UK Biobank participants was undertaken using a custom-built genome-wide array (the UK Biobank Axiom array) of ~826,000 markers.
Genotyping was done in 2 phases. A total of 50,000 subjects were initially typed as part of the UK BiLEVE project (16). The rest of the participants were genotyped using a slightly modified array. Imputation to ~92 million markers was subsequently carried out using the Haplotype Reference Consortium (17) and UK10K/1000Genomes haplotype resource panels; however, at the time of analysis, known issues existed with the imputation using the latter panel.

RESULTS
The characteristics of the UKB subjects in the external validation set (N = 482,629) are shown in Table 1.
To investigate the potential role of the metaGRS in earlier life genetic screening, we compared the sex-stratified cumulative incidence of CAD across quintiles of the metaGRS (Figure 3). In UKB men, we observed that CAD risk in the highest metaGRS quintile began increasing exponentially shortly after age 40 years, reaching a threshold of 10% cumulative risk by 61 years of age (Figure 3). By comparison, CAD risk for men in the lowest metaGRS quintile did not
Dotted lines represent 95% CIs. GRS = genomic risk score; HR = hazard ratio; other abbreviations as in Figure 1.
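The quintile comparison described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names, the simulated data, and the use of a plain Kaplan-Meier estimator on the age scale (ignoring delayed entry and competing risks) are all assumptions made for illustration.

```python
import numpy as np

def quintile_labels(scores):
    """Label each score with its quintile: 1 (lowest 20%) to 5 (highest 20%)."""
    scores = np.asarray(scores, dtype=float)
    cuts = np.quantile(scores, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cuts, scores, side="right") + 1

def cumulative_incidence(exit_age, had_event, age_grid):
    """Kaplan-Meier estimate of 1 - S(age) at each age in age_grid.

    exit_age  : age at CAD event or at censoring
    had_event : True if follow-up ended with a CAD event
    Simplification (assumed): everyone is treated as at risk from birth.
    """
    exit_age = np.asarray(exit_age, dtype=float)
    had_event = np.asarray(had_event, dtype=bool)
    event_ages = np.unique(exit_age[had_event])
    surv = [1.0]  # survival before the first event age
    for t in event_ages:
        at_risk = np.sum(exit_age >= t)
        events = np.sum((exit_age == t) & had_event)
        surv.append(surv[-1] * (1.0 - events / at_risk))
    surv = np.asarray(surv)
    # Number of event ages <= each grid age selects the matching survival step.
    idx = np.searchsorted(event_ages, age_grid, side="right")
    return 1.0 - surv[idx]

# Simulated toy cohort (hypothetical parameters, not UK Biobank data).
rng = np.random.default_rng(0)
n = 5000
score = rng.normal(size=n)
latent_age = 40 + rng.exponential(scale=200 * np.exp(-0.5 * score))
exit_age = np.minimum(latent_age, 75.0)   # administrative censoring at 75
had_event = latent_age <= 75.0

quintile = quintile_labels(score)
for q in (1, 5):
    mask = quintile == q
    ci = cumulative_incidence(exit_age[mask], had_event[mask], [50, 60, 70])
    print(f"quintile {q}: cumulative incidence at 50/60/70 =", np.round(ci, 3))
```

In a real cohort, recruitment at ages 40 to 69 years would call for an estimator that handles left truncation, so this sketch only conveys the stratify-then-estimate structure.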

DISCUSSION
In an analysis of almost 500,000 people in a prospective nationwide cohort study, we evaluated a combined genomic risk score (metaGRS) built from summary statistics of the largest previous genome-wide association studies of CAD (Central Illustration).
We report a series of findings that substantially advance the concept of using genomic information to help stratify individuals for CAD risk in general populations, an approach that leverages the fixed nature of germline DNA over the life course to anticipate different lifelong trajectories of CAD risk.
First, our metaGRS achieved greater risk discrimination than previously published genomic risk scores based on selected SNPs (3-9). For example, we found

Central Illustration. A genomic risk score for coronary artery disease: greater association with future coronary artery disease than any single conventional risk factor; independent of yet complements conventional risk factors; provides meaningful lifetime risk estimates of coronary artery disease; quantifiable at or before birth and shows potential for risk screening in early life. The genomic score provides potential for risk screening early in life as well as complements conventional risk factors for coronary artery disease.
In translating genomic risk scores, standardization in assay and data processing will be necessary but achievable, including in imputation (e.g., reference panel and quality control) and handling of population stratification (e.g., using a population-specific GRS distribution and/or adjustment of GRS directly).
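The two stratification-handling options named above can be sketched as follows. This is a hedged illustration, not a procedure specified in this study: both function names are hypothetical, and interpreting "adjustment of GRS directly" as regressing the score on genetic principal components is an assumption.

```python
import numpy as np

def standardize_to_reference(grs, reference_grs):
    """Express scores in SD units of a population-specific reference distribution."""
    ref = np.asarray(reference_grs, dtype=float)
    return (np.asarray(grs, dtype=float) - ref.mean()) / ref.std(ddof=1)

def residualize_on_pcs(grs, pcs):
    """Remove the part of the score explained by genetic principal components.

    Assumed approach: ordinary least squares of the score on an intercept
    plus the PCs, followed by rescaling the residuals to unit SD.
    """
    grs = np.asarray(grs, dtype=float)
    pcs = np.asarray(pcs, dtype=float)
    design = np.column_stack([np.ones(len(grs)), pcs])
    coef, *_ = np.linalg.lstsq(design, grs, rcond=None)
    resid = grs - design @ coef
    return resid / resid.std(ddof=1)

# Toy example with simulated values (hypothetical, for illustration only).
rng = np.random.default_rng(1)
pcs = rng.normal(size=(1000, 4))                     # 4 simulated genetic PCs
raw = 0.8 * pcs[:, 0] + rng.normal(size=1000)        # score confounded by PC1
adjusted = residualize_on_pcs(raw, pcs)
print("corr(raw, PC1)      =", round(np.corrcoef(raw, pcs[:, 0])[0, 1], 3))
print("corr(adjusted, PC1) =", round(np.corrcoef(adjusted, pcs[:, 0])[0, 1], 3))
```

Either option puts scores on a common scale before thresholds such as "top 20%" are applied; which one is appropriate would depend on the reference data available for the target population.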
We have made the metaGRS algorithm freely available (21) (24). Fourth, current GWAS sample sizes and imputation efficiencies are also limiting in that they introduce noise into GRS estimates. Our meta-score approach here addresses this to some extent; however, future large-scale cohorts will offer more powerful genomic scores.
Last, despite the metaGRS showing substantial CAD risk discrimination in individuals already on medication, we were also unable to assess the effect of medication versus nonmedication in individuals who are at high metaGRS risk, as without blind randomization, this analysis would be susceptible to reverse causation, with those on medication likely already at higher CAD risk.

CONCLUSIONS
The genomic score developed and evaluated in the present study strengthens the concept of using genomic information to stratify individuals for CAD risk in general populations and demonstrates the potential for genomic screening in early life to complement conventional risk prediction.