Integration of Biomarker Polygenic Risk Score Improves Prediction of Coronary Heart Disease

SUMMARY
There are several established biomarkers for coronary heart disease (CHD), including blood pressure, cholesterol, and lipoproteins. It is of high interest to determine how a combined polygenic risk score (PRS) of CHD-associated biomarkers (BioPRS) can further improve genetic prediction of CHD. We developed CHDBioPRS, combining BioPRS with PRS of CHD in the UK Biobank, and tested it on FinnGen. We found that BioPRS was clearly predictive of CHD and that CHDBioPRS improved on the standard CHD PRS. The largest effect was observed with early-onset cases in FinnGen, with HRs above 2 per standard deviation of CHDBioPRS.

Coronary heart disease (CHD), a complex disease caused by a gradual build-up of fatty deposits in the arteries, is a major cause of death worldwide.
[2][3][4] These risk factors are also used in clinical risk calculators to evaluate preventive therapies and strategies. Although clinical risk scores enable identification of some individuals at high risk, [5][6][7] a large proportion of CHD cases are not detected by these scores, and the utility of clinical scores is limited for young adults 3,[8][9][10] and women. 11,12

[23][24] Because PRS are based on germline DNA, risk profiling can be conducted in early life, when the individuals with the highest PRS values are likely to benefit from early adoption of preventive strategies.
The landmark study of PRS for common diseases conducted by Khera et al 23 showed that a sizable portion of the population carries a polygenic CHD risk equivalent to known monogenic mutations conferring severalfold increased risk. This PRS, comprising more than 6 million single nucleotide polymorphisms (SNPs), was generated with the use of LDpred 19 and has proven to be effective in validation sets across multiple populations while also performing favorably compared with other PRS 25

While other existing studies have focused on combining several GWAS of CHD, 13,26,27 our focus is on the combination of effects of known CHD-associated biomarkers into a single PRS and its integration with CHDPRS.
Furthermore, because the current risk calculators do not work equally well for women as for men, it is of high importance to quantify the contribution of BioPRS within each sex. 28 Another important goal is to predict a subgroup of CHD cases with an early onset of the disease. 29

METHODS
The workflow of the study is presented in Figure 1.
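UK BIOBANK DATA. The design of UK Biobank (UKB) and the background of its participants have been reported previously. 22,30 We restricted our analyses to samples that were of self-reported European ancestry to avoid potential spurious associations driven by allele frequency differences when including individuals from different ancestry backgrounds in our GWAS.

Total cholesterol (TC) was calculated from the Friedewald formula 32 as

$$\mathrm{TC} = \mathrm{HDL} + \mathrm{LDL} + \frac{\mathrm{TRIG}}{2.2} \qquad \text{(all in mmol/L)}$$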

Lin et al
Biomarker Polygenic Risk Score Improves Prediction of CHD

FinnGen. The design of the Finnish FinnGen 34 project and participant backgrounds are presented in Table 1.

validation. 35,36 The optimal model identified 10 biomarkers, which were selected for BioPRS construction (Table 2). As shown in Supplemental Figure 1, α = 0.25 (closer to ridge regression) produced the same optimal set as α = 0.50, and α = 0.75 (closer to lasso regression) further excluded ApoA1.
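To make the role of the elastic net mixing parameter α concrete, the penalty term used by glmnet can be written as below. This is a sketch for orientation only: the symbols λ (overall penalty strength), β_j (coefficient of candidate biomarker j), and p (number of candidate biomarkers, 16 in this study) are assumed notation and do not appear in the article text.

$$P_{\alpha}(\beta) = \lambda \sum_{j=1}^{p} \left[ \frac{1-\alpha}{2}\,\beta_j^{2} + \alpha\,\lvert \beta_j \rvert \right]$$

With α = 0 only the squared (ridge) term remains, which shrinks coefficients without removing them. With α = 1 only the absolute-value (lasso) term remains, which can set coefficients exactly to zero. Intermediate values mix the two, so a larger α tends to give a sparser set of retained biomarkers.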

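BIOMARKER PRS. PRS with continuous shrinkage (CS) 21 was run on each of the GWAS summary results of the selected 10 biomarkers. To account for linkage disequilibrium (LD), we used the 1000 Genomes

where $h_0(t)$ is the baseline hazard rate, $z$ denotes the vector of covariate values (sex and the first 10 principal components of population structure), and each biomarker has coefficient $\beta_{\mathrm{biomarkerPRS}}$ that corresponds to a change in the logarithm of the hazard rate per one standard deviation of the biomarker PRS value. We used the "coef" function from the "survival" library 39 of R software to estimate the $\beta_{\mathrm{biomarkerPRS}}$ coefficients as previously recommended. 40,41

BioPRS. We combined the biomarker PRS into a score named BioPRS by standardizing (to a mean of 0 and an SD of 1) the sum of the 10 biomarker PRS after multiplying each PRS by the beta-coefficient ($\beta_i$) of the corresponding biomarker from equation 1:

$$\mathrm{BioPRS} = \operatorname{standardize}\left(\sum_{i=1}^{10} \beta_i\,\mathrm{PRS}_i\right) \qquad (2)$$

CHDPRS. We generated a PRS for CHD (named CHDPRS) by applying PRS-CS 21 to the CHD GWAS reported by Nikpay et al 13 using the European panel from the 1000 Genomes Project 37 for LD reference.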
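The BioPRS construction described above (a weighted sum of per-biomarker PRS, then standardization) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names, the three biomarker labels, the toy PRS values, and the weights are all hypothetical, and the use of the population SD for standardization is an assumption because the text does not say which SD estimator was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a list of numbers to mean 0 and SD 1 (population SD, an assumption)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant score")
    return [(v - m) / s for v in values]


def weighted_score(prs_by_biomarker, weights):
    """Standardized weighted sum of per-biomarker PRS.

    prs_by_biomarker: dict of biomarker name -> list of PRS values (one per person)
    weights: dict of biomarker name -> Cox log-hazard coefficient for that PRS
    """
    names = list(weights)
    n = len(prs_by_biomarker[names[0]])
    raw = [
        sum(weights[k] * prs_by_biomarker[k][i] for k in names)
        for i in range(n)
    ]
    return standardize(raw)


# Toy input: 3 hypothetical biomarker PRS for 4 people (made-up numbers).
toy_prs = {
    "LDL": [0.5, -1.0, 0.2, 0.3],
    "SBP": [1.0, 0.0, -0.5, -0.5],
    "HDL": [-0.2, 0.4, 0.1, -0.3],
}
toy_weights = {"LDL": 0.10, "SBP": 0.08, "HDL": -0.05}  # made-up coefficients

bio_prs = weighted_score(toy_prs, toy_weights)
```

In the study the sum runs over the 10 selected biomarkers and the weights come from the Cox model fitted in the UKB training data. The same helper would apply to any set of PRS and coefficients with matching keys.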
Our CHDPRS contained 1,087,715 SNPs. In addition, we compared our CHDPRS with "Khera PRS," which is the PRS for CHD generated with the use of LDpred by Khera et al 23 based on the same GWAS 13 that we used to generate our CHDPRS.
CHDBioPRS. CHDBioPRS was constructed from integration of BioPRS and CHDPRS. Weights of the 2 PRS (CHDPRS and BioPRS) were estimated in the UKB validation set with the use of a Cox regression model with CHD as outcome (equation 3). CHDBioPRS is the standardized sum of the CHDPRS and BioPRS multiplied by their weights from equation 3:

$$\mathrm{CHDBioPRS} = \operatorname{standardize}\left(\beta_{\mathrm{CHDPRS}}\,\mathrm{CHDPRS} + \beta_{\mathrm{BioPRS}}\,\mathrm{BioPRS}\right) \qquad (4)$$

In addition to the derivation above, a similar procedure was also carried out for men and women separately (biomarker selection using glmnet, biomarker weights in BioPRS using Cox regression, and combination of CHDPRS and BioPRS using another Cox regression).
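The Cox model used to weight the two scores plausibly takes the usual proportional hazards form shown below. This is a sketch, not the article's printed equation 3: the weight symbols, the covariate vector z with coefficient vector γ, and the inclusion of covariates at this step are all assumptions, since the text does not spell them out.

$$h(t) = h_0(t)\,\exp\!\left(\beta_{\mathrm{CHDPRS}}\,\mathrm{CHDPRS} + \beta_{\mathrm{BioPRS}}\,\mathrm{BioPRS} + \gamma^{\top} z\right)$$

If both input scores are standardized, the two fitted weights are log hazard ratios per SD and can be compared directly. Their ratio determines how much each score contributes to the combined CHDBioPRS before it is standardized again.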
SCORE2. SCORE2 42 is a prediction model for 10-year cardiovascular disease risk that uses information on age, total cholesterol, HDL, SBP, diabetes, and smoking. We calculated SCORE2 in our UKB data sets to give a comparison point for our PRS.
We note that performance of SCORE2 may be overly optimistic in UKB, because the UKB data were used in the derivation of SCORE2. Because SCORE2 requires laboratory measurements, it cannot be applied in FinnGen, where those lab measurements are not available. We constructed combined pre- CHDBioPRS by an approach similar to that described by equations (3) and (4).
EARLY ONSET. We identified all individuals with early CHD onset (<55 years of age), and to further account for sex differences, we defined early CHD onset for women as <60 years of age and early CHD onset for men as <50 years of age. 43

STATISTICAL ANALYSIS. [45][46] All scores were standardized to have a mean of 0 and a variance of 1.

2 to 7, likelihood ratio statistics are in Supplemental Table 18, and SCORE2-related statistics are in Supplemental Table 19.
PRS = polygenic risk score; other abbreviations as in Table 1.
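The early-onset definitions given in the EARLY ONSET paragraph reduce to three age cut-offs, which can be written as a small helper. The thresholds (55, 60, and 50 years) are taken from the text. The function name, the group labels, and the dictionary layout are hypothetical and only meant as a sketch.

```python
# Age thresholds in years, as stated in the text; names are hypothetical.
EARLY_ONSET_THRESHOLDS = {"all": 55, "female": 60, "male": 50}


def is_early_onset(age_at_onset, group="all"):
    """Return True if CHD onset age is strictly below the group's threshold."""
    if group not in EARLY_ONSET_THRESHOLDS:
        raise ValueError(f"unknown group: {group!r}")
    return age_at_onset < EARLY_ONSET_THRESHOLDS[group]
```

For example, an onset at 57 years counts as early under the sex-specific definition for women but not under the overall definition, while an onset at 52 years counts as early overall but not under the definition for men.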

There were clear differences in effect sizes between UKB and FinnGen. These differences could in part relate to differences in sample ascertainment procedures and genetic background. The UKB participants are known to be healthier than the general population, 48 and therefore the relative contribution of genetics to their disease risk may be larger, whereas the FinnGen participants are recruited through their contacts with the Finnish health care system. In addition, the FinnGen participants are on average 5 years older than the UKB participants, and the CHD case rate in FinnGen is nearly double that of UKB (10.2% vs 5.6%). Differences in performance of PRS are known to exist even between populations of European ancestry. 49 In our study, the biomarker GWAS effect sizes and LD information used in creating PRS were derived in UKB or from other non-Finnish European populations, which could lead to better predictive power of PRS in UKB compared with Finnish data. 50 For our CHDPRS, the effect sizes were taken from a large GWAS meta-analysis 13

(J Am Coll Cardiol Basic Trans Science 2023;8:1489-1499) © 2023 The Authors. Published by Elsevier on behalf of the American College of Cardiology Foundation. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
composed of smaller numbers of variants. Because PRS have proven to be successful for CHD prediction, it remains of high interest to systematically determine how a combined polygenic biomarker score (BioPRS) constructed with biomarkers associated with CHD can improve on the established CHDPRS. Recently, multi-PRS models, using 35 PRS from blood and urine biomarkers, have been shown to improve genetic risk prediction of common diseases such as type 2 diabetes and gout. 2 While other exist-

ABBREVIATIONS AND ACRONYMS
Apo = apolipoprotein
BMI = body mass index
CHD = coronary heart disease
CPD = cigarettes per day
CRP = C-reactive protein
GWAS = genome-wide association study
HbA1c = glycated hemoglobin
HDL = high-density lipoprotein
LD = linkage disequilibrium
LDL = low-density lipoprotein
LRT = likelihood ratio test
MI = myocardial infarction
NRI = net reclassification improvement
PRS = polygenic risk score
SBP = systolic blood pressure
SNP = single nucleotide polymorphism
TRIG = triglycerides
UKB = UK Biobank

Genome-wide association study (GWAS) was performed on UKB genotype data (UKB [self-reported British White] Training and Validation combined), separately in females and males, for 16 biomarkers. Construction and testing of CHDBioPRS: We regressed CHD on the biomarkers in the UKB training data using elastic net Cox regression and retained 10 biomarkers. BioPRS is constructed by weighting each PRS by its coefficient in a joint Cox regression model for CHD in UKB training data. CHDBioPRS is the sum of the standard CHDPRS and BioPRS, where the weights are estimated from a Cox regression model predicting CHD within UKB validation data. Independent testing was performed on the FinnGen and the UKB Test (self-reported Non-British White) cohorts. CHD = coronary heart disease; UKB = UK Biobank.
The FinnGen test cohort contained 321,302 FinnGen data freeze 7 participants. The CHD case definition in FinnGen (I9_CHD) is consistent with our UKB CHD definition except that the FinnGen definition also includes samples with angina only (I20.0) as cases. Consequently, we removed the 2,989 angina-only cases from FinnGen, in addition to removing 6,109 participants younger than 16 years old at enrollment.

BIOMARKER MODEL. Using the 16 CHD-associated biomarkers for the UKB training data as predictors and incident CHD as outcome (and excluding prevalent CHD cases), we fitted penalized Cox proportional hazards models (glmnet R package) with the elastic net penalty (α = 0.50) and 20-fold cross-validation.

vector of covariate values (sex and the first 10 principal components of population structure), and each biomarker has coefficient $\beta_{\mathrm{biomarkerPRS}}$ that corresponds to a change in the logarithm of the hazard rate per one standard deviation of the biomarker PRS value.
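The description of the baseline hazard, the covariate vector, and the per-biomarker coefficients matches the standard Cox proportional hazards form. A sketch consistent with that description is given below. The index set B of selected biomarkers, the shorthand β_b for the coefficient of the PRS of biomarker b, and the covariate coefficient vector γ are assumed notation.

$$h(t \mid \mathrm{PRS}, z) = h_0(t)\,\exp\!\left(\sum_{b \in B} \beta_b\,\mathrm{PRS}_b + \gamma^{\top} z\right)$$

Under this form, the hazard ratio for a 1-SD increase in a standardized PRS is the exponential of its coefficient:

$$\mathrm{HR}_{\text{per SD}} = e^{\beta_b}$$

As a worked example, a coefficient of $\beta_b = \ln 2 \approx 0.693$ corresponds to an HR of 2 per SD, and because the model is log-linear, a score 2 SD above the mean then corresponds to a relative hazard of $2^2 = 4$ compared with the mean.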

The authors attest they are in compliance with human studies committees and animal welfare regulations of the authors' institutions and Food and Drug Administration guidelines, including patient consent where appropriate.

Manuscript received March 9, 2023; revised manuscript received July 6, 2023, accepted July 10, 2023.

individuals from different ancestry backgrounds in our GWAS. We made use of 4 sets of UKB samples. First, our GWAS set contained 343,695 samples who were unrelated (pairwise kinship coefficients reported by UKB <0.044) and who self-reported "White

FIGURE 1 Study Design and Workflow
CHD biomarkers: We identified CHD-associated risk factors or biomarkers from UKB (Table 2, Supplemental Table 1).

TABLE 1
Sample Characteristics in UKB and FinnGen

TABLE 2
a Statin adjustment for statin users (16.2%) is done either by division (/) or addition (+) by the value given in the last column. Statin and blood pressure medicinal use was identified with the use of fields 20003, 6153, and 6177.
b Selected in optimal regularized model.
c Total cholesterol calculated from Friedewald formula of HDL + LDL + TRIG/2.2 in units of mmol/L.
BP = blood pressure; HDL = high-density lipoprotein; LDL = low-density lipoprotein; other abbreviations as in Table

Table 3
same pattern, where BioPRS itself is clearly predictive of CHD (HR estimates per SD vary from 1.42 to 1.45), CHDPRS on its own is more predictive than BioPRS (HR estimates vary from 1.62 to 1.78), and CHDBioPRS is the most predictive (HRs vary from 1.73 to 1.88). For C-index, z-scores, and AUC metrics, see Supplemental Tables 2 to 4. When the scores are applied in FinnGen

TABLE 3
HRs (With 95% CIs) From Cox Regression Model of 3 Different PRS
BioPRS is a combination of PRS of 10 CHD-related biomarkers, CHDPRS is a standard PRS for CHD, and CHDBioPRS combines BioPRS and CHDPRS. Early onset was defined as CHD before 55 years of age. C-index, AUC, and P values are in Supplemental Tables 2 to 7, likelihood ratio statistics are in Supplemental Table 18, and SCORE2-related statistics are in Supplemental Table 19.

Fimea), the national supervisory authority for welfare and health. Recruitment protocols followed the biobank protocols approved by Fimea. The Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa statement no. for the FinnGen study is HUS/990/2017.

models for coronary artery disease. Circ Genom Precis Med. 2020;13(6):e002932.
50. Kerminen S, Martin AR, Koskela J, et al. Geographic variation and bias in the polygenic scores of complex diseases and traits in Finland. Am J Hum Genet. 2019;104(6):1169-1181.
51. Saar A, Lall K, Alver M, et al. Estimating the performance of three cardiovascular disease risk scores: the Estonian Biobank cohort study. J Epidemiol Community Health. 2019;73:272-277.
52. Sedlak T, Herscovici R, Cook-Wiens G, et al. Predicted versus observed major adverse cardiac event risk in women with evidence of ischemia and no obstructive coronary artery disease: a report from WISE (Women's Ischemia Syndrome Evaluation). J Am Heart Assoc. 2020;9(7):e013234.
53. Woodward M. Cardiovascular disease and the female disadvantage. Int J Environ Res Public Health. 2019;16(7):1165.
54. Bots SH, Peters SAE, Woodward M. Sex differences in coronary heart disease and stroke mortality: a global assessment of the effect of ageing between 1980 and 2010. BMJ Global
Willer CJ, Schmidt EM, Sengupta S, et al. Discovery and refinement of loci associated with lipid levels. Nat Genet. 2013;45(11):1274-1283.
58. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51(4):584-591.
59. Mars N, Koskela JT, Ripatti P, et al. Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat Med. 2020;26(4):549-557.
60. Sirugo G, Williams SM, Tishkoff SA. The missing diversity in human genetic studies. Cell. 2018;177(1):26-31.

KEY WORDS biomarkers, coronary heart disease, genomics, GWAS, polygenic risk scores

APPENDIX For supplemental figures and tables and a list of the contributors of FinnGen, please see the online version of this paper.
CONCLUSIONS
The integration of biomarker PRS improves on the standard PRS for prediction of CHD, with the largest gain among early-onset CHD cases. This study strengthens the evidence for genome-based CHD prediction and quantifies the interplay between standard CHD PRS and PRS of biomarkers associated with CHD.