Identification of novel SNPs associated with coronary artery disease and birth weight using a pleiotropic cFDR method

Objectives: Clinical and epidemiological findings indicate an association between coronary artery disease (CAD) and low birth weight (BW). However, the mechanisms underlying this relationship are largely unknown. Here, we aimed to identify novel single-nucleotide polymorphisms (SNPs) associated with CAD, BW, and their shared pleiotropic loci, and to detect the potential causal relationship between CAD and BW. Methods: We first applied a genetic pleiotropic conditional false discovery rate (cFDR) method to two independent genome-wide association studies (GWAS) summary statistics of CAD and BW to estimate the pleiotropic enrichment between them. Then, bi-directional Mendelian randomization (MR) analyses were performed to clarify the causal association between these two traits. Results: By incorporating related traits into a conditional analysis framework, we observed the significant pleiotropic enrichment between CAD and BW. By applying the cFDR level of 0.05, 109 variants were detected for CAD, 203 for BW, and 26 pleiotropic variants for both traits. We identified 11 CAD- and/or BW-associated SNPs that showed more than three of the metabolic quantitative trait loci (metaQTL), protein QTL (pQTL), methylation QTL (meQTL), or expression QTL (eQTL) effects. The pleiotropic SNP rs10774625, located at ATXN2, showed metaQTL, pQTL, meQTL, and eQTL effects simultaneously. Using the bi-directional MR approach, we found a negative association from BW to CAD (odds ratio [OR] = 0.68, 95% confidence interval [CI]: 0.59 to 0.80, p = 1.57× 10-6). Conclusion: We identified several pleiotropic loci between CAD and BW by leveraging GWAS results of related phenotypes and identified a potential causal relationship from BW to CAD. Our findings provide novel insights into the shared biological mechanisms and overlapping genetic heritability between CAD and BW.


INTRODUCTION
Coronary artery disease (CAD) is characterized by the narrowing or obstruction of the coronary arteries, which can lead to chest pain, arrhythmia, heart failure, and even permanent heart damage [1]. In 2017, over 485 million people suffered from CAD, resulting in 17.8 million deaths [2,3], making this disease the leading cause of morbidity and mortality worldwide [4].
Numerous studies have shown that early life experiences, including low birth weight (BW), may increase the risk of cardiovascular diseases [5][6][7]. Thus, the World Health Organization has classified low BW as a risk factor for CAD later in life [8]. However, the prevalence of CAD does not decrease with higher BW accompanied by improved living conditions [9]. In addition, many randomized controlled trials designed to improve BW revealed different results [10,11], leaving the relationship between BW and CAD unclear.
By leveraging the pleiotropic effect in related traits, a conditional false discovery rate (cFDR) method was developed without additional subjects recruitment [28]. This approach is cost-effective and could improve the identification of novel genetic loci underlying missing heritability, thereby elucidating genetic mechanisms associated with multiple phenotypes [29][30][31][32]. Furthermore, Mendelian randomization (MR) is an approach to investigate the potential causality between exposure and outcome using genetic instrumental variables [33]. As genetic variants are randomly distributed among the population and are generally independent of confounders, such analysis may reduce confounding bias and eliminate potential reversed causal relationship [34].
In this study, we applied cFDR and bi-directional MR analyses to two large and independent GWAS datasets aiming to 1) identify additional novel loci and the genetic pleiotropy of CAD and BW, and 2) estimate the causality between CAD and BW. Therefore, we can improve SNP detection, and clarify the shared mechanic relationship and overlapping genetic heritability between these two traits better.

Pleiotropic enrichment estimation
We found leftward separations between each line (including the null line) in the stratified quantile-quantile (Q-Q) plots, which indicated the pleiotropy of CAD conditional on BW ( Figure 1A), as well as BW conditional on CAD ( Figure 1B). As shown in foldenrichment plots ( Figure 1C, 1D), distinct upward shifts from the baseline demonstrated a strong pleiotropic enrichment between BW and CAD. We observed the most notable pleiotropy with an enrichment fold greater than 40 in BW conditional on CAD.
Furthermore, the stratified Q-Q plots for CAD conditional on autism spectrum disorder (ASD) (Supplementary Figure 1A), and BW conditional on ASD (Supplementary Figure 1C) all showed no enrichment and vice versa (Supplementary Figure 1B,  1D), which can be the negative controls.

Potentially pleiotropic SNPs identified using conjunction cFDR (ccFDR)
We calculated the ccFDR value and constructed the conjunction Manhattan plot to explore the pleiotropic loci between CAD and BW. ( Figure 2C). Precisely 26 potentially pleiotropic loci that reached a significance threshold at ccFDR ≤ 0.05 were mapped to 13 chromosomes and annotated to 26 different genes. We validated three SNPs that were statistically significant in the original GWAS and CAD-related study, nine loci were also found to be related to other phenotypes (Supplementary Table 8). Using validation datasets, we found 17 pleiotropic SNPs for both traits, 12 of which (70.5%) were also pleiotropic loci in the original ccFDR research (Supplementary Table 9). We then detected 18 pleiotropic SNPs that showed more than one of the metaQTL, pQTL, meQTL, or eQTL effects. Particularly, rs10774625 showed all QTL effects simultaneously ( Table 2).

Causality between BW and CAD
After instrument selection, LD clumping, variant extraction, and harmonization, 52 BW-CAD SNP pairs were selected when choosing BW as exposure (Supplementary Table 10). The MR-Egger regression test result (intercept: -0.0025, 95% confidence interval

AGING
[CI]: -0.015 to 0.014, p = 0.973) suggested that there was no genetic confounding due to horizontal pleiotropy. The null-pleiotropy result was also confirmed using scatter plots and funnel plots (Supplementary Figures 2, 3). There was no apparent heterogeneity in our chosen SNPs, as evidenced by Cochran's Q test (Supplementary  Table 11). We found a negative association of BW to CAD from the inverse-variance weighted (IVW) estimates (odds ratio [OR] = 0.68, 95% CI: 0.59 to 0.80, p = 1.57× 10 -6 ), which was consistent with all other MR methods (Table 3 and Figure 3). MR leave-one-out sensitivity analysis demonstrated that there was no influence of outlying and/or pleiotropic (Supplementary Figure 4). However, in the opposite direction, we found no causal relationship from CAD to BW (Supplementary  Table 12).

Functional enrichment and protein-protein interaction analyses
We discovered significant enrichment of biological processes including "regulation of phospholipid metabolic process" (p = 1.10×10 -4 ) and "negative regulation of lipid transport" (p = 2.40×10 -4 ) for genes associated with CAD by conducting functional enrichment analysis. Moreover, genes associated with BW were enriched in gene ontology (GO) terms like "tube morphogenesis" (p = 1.20×10 -4 ) and "regulation of multicellular organismal process" (p = 3.10×10 -4 ). Interestingly, the results for pleiotropic variants showed a cluster of biological processes in insulin and kinase categories, which might contribute to body growth and the progression of CAD (Table 4).

DISCUSSION
In this study, we incorporated summary statistics from two independent GWAS datasets and discovered 109 and 203 SNPs associated with CAD and BW, respectively. By performing the ccFDR method, we further detected 26 pleiotropic loci associated with both phenotypes. Following a bi-directional MR analysis and functional annotation, we confirmed the causal relationship from BW to CAD and speculated the underlying shared genetic mechanisms between these two traits.
Notably, we identified 11 CAD-and/or BW-associated SNPs that showed more than three of the metaQTL, pQTL, meQTL, or eQTL effects. These functional loci might have a great effect on the pathogenesis of CAD and/or BW. For example, rs11172113 is located in the intron of LRP1, a member of the low-density lipoprotein receptor family, which regulates extracellular proteolytic activities [46]. LRP1 plays a pivotal role in mediating inflammation and efferocytosis [47], and mouse studies have shown that LRP1 knockout leads to diminished vessel integrity and highdensity lipoprotein secretion [48]. Another study proved that LRP1 regulates food intake and energy homeostasis by acting as a co-activator of PPARγ [49]. Moreover, the lipidomic analysis demonstrated that the metabolite C18:1 sphingomyelin, which is associated with rs11172113, was enhanced in CAD patients compared to that in the control group [50]. Another longitudinal prospective study revealed that the alteration of sphingomyelin metabolism is associated with BW percentiles [51], suggesting a potentially crucial role for this SNP in both traits.
Furthermore, we identified one pleiotropic locus, rs10774625, showing metaQTL, pQTL, eQTL, and meQTL effects simultaneously. rs10774625 is located in the intron of ATXN2. One population-based GWAS demonstrated that the ATXN2-SH3 region contributes to changes in the retinal venular caliber, an endophenotype of the microcirculation related to clinical cardiovascular diseases [52]. Animal experiments supported the role of ATXN2 in translational regulation as well as embryonic development [53]. Another ATXN2 knockdown experiment demonstrated that mice lacking ATXN2 develop dysfunction in energy metabolism and weight regulation [54,55]. It has been reported that rs10774625 is associated with the kynurenine metabolite pathway (KP) [56]. Evidence indicates that the activation of indoleamine 2,3-dioxygenase, the inducible enzyme in KP, is closely limited by endothelial cells [57], vascular smooth muscle cells [58], and dendritic cells [59], all of which play vital roles in cardiac pathophysiology [60]. Epidemiologically, it was shown that the concentration of kynurenine is associated with body weight indexes in a European cohort of more than 1000 people [61]. An immunohistochemistry study also detected that the kynurenine-to-tryptophan ratio limits the expression of inflammatory markers in the adipose tissue, which is correlated with body weight [62]. In addition, beta-2microglobulin (B2M) is associated with rs10774625, which reduces the capacity for energy conversion and restricts intrauterine growth, resulting in low BW [63], and is also implicated in the pathogenesis of CAD [64]. These facts indicated that rs10774625 (representing gene ATXN2) might be essential in linking the pathogenesis between CAD and BW.   According to the functional enrichment results, we could also hypothesize the possible shared pathogenesis mechanisms between CAD and BW. GO terms including "regulation of phospholipid metabolic process", "regulation of multicellular organismal process", and "insulin receptor binding," have important impacts on metabolic abnormalities, such as impaired fasting glucose [65], dyslipidemia, and hypertension [66], which could contribute to the increased risk for both traits.

AGING
Our study has some strengths. First, we improved the identification of potential CAD-and BW-associated SNPs and detected several pleiotropic loci in both traits. Following MR analysis, we assessed the causal effect between these two related traits. Second, we took into account ASD, which is unlikely to be correlated with CAD and BW, for a "control trait" enrichment analysis, which provided a baseline to examine pleiotropic enrichment and statistically validate the novel findings in our study. Third, evidence from metaQTL, pQTL, eQTL, and meQTL effects suggested a possible explanation for the etiology of CAD and/or BW and improved the interpretability of the results.
Additionally, our study includes some limitations. First, we were unable to link the genetic findings to clinical measures due to the lack of raw datasets for individual clinical outcomes. However, our study aimed to identify potential novel SNPs and explore the overlapping biological mechanisms between CAD and BW. We hope that our findings can be validated via functional experiments or fine-mapping studies. Second, although we confirmed the causal relationship from BW to CAD, the causalities of metabolomics, proteomics, and methylation between these two traits are unclear. Nevertheless, this problem could be solved by a followup multivariable MR study.

CONCLUSIONS
In conclusion, by applying the cFDR and bi-directional MR analyses to two strongly associated traits, we detected significant pleiotropic SNPs of potential functions for CAD and/or BW and estimated the causal relationship from BW to CAD. These findings provide a better understanding of the shared genetic mechanisms between CAD and BW, which might suggest a novel research direction for early disease prevention and subsequent treatment.

GWAS data sources
The first CAD GWAS was obtained from the Coronary Artery Disease Genome-wide Replication and Metaanalysis plus The Coronary Artery Disease Genetics (CARDIoGRAMplusC4D) Consortium. This metaanalysis of 48 multiple ancestry studies involved more than 8.6 million SNPs from 60,801 cases and 123,504 controls [18]. The first BW dataset conducted by the Early Growth Genetics (EGG) Consortium consisted of 45 multiple ancestry studies including 321,223 subjects. As the control trait, the ASD dataset, collected by the Psychiatric Genomics Consortium, contained 15,954 participants with European ancestry (7,387 ASD cases and 8,567 controls) [67]. For validation, two other CAD AGING and BW datasets were used. The validation CAD dataset, comprising 10,801 cases and 137,914 controls, was collected by the CARDIoGRAMplusC4D Consortium [17]. The validation BW dataset, including 153,781 subjects, was collected by the EGG Consortium [21]. All datasets contained the summary statistics of each locus and the conducted genomic control [17,18,21,22,67].

Data processing
First, two GWAS datasets were combined and 8,285,296 common SNPs with summary statistics remained for both CAD and BW phenotypes. Then, we performed LD-based pruning (r 2 ≤ 0.2) using HapMap III genotypes as a reference, and the SNP of the pair with longer allele frequency was retained [31,68]. After merging and pruning, 141,779 variants were prepared for further analysis.

Pleiotropic enrichment evaluation
We constructed stratified Q-Q plots to estimate the pleiotropic enrichment in two related phenotypes using the "ggplot2" R package. In this study, -log10(p) which means the nominal p-value and -log10(q) which means the empirical quantile were plotted on the Y-and Xaxes, respectively, at different significance levels (p ≤ 1, p ≤ 0.1, p ≤ 0.01, p ≤ 0.001, and p ≤ 0.0001). Under the null hypothesis, plots would fall on the line Y=X, and the enrichment of pleiotropic loci could be evaluated by the degree of leftward deviation from the null line. Additionally, we constructed fold-enrichment plots as a supplement for the Q-Q plots. Fold-enrichment and -log10(p) were plotted on the Y-and X-axes, respectively, at different significance levels (p ≤ 1, p ≤ 0.1, p ≤ 0.01, and p ≤ 0.001) for CAD and BW. Pleiotropy could be visually observed via an upward deflection from the baseline (for the group including all SNPs (p = 1)).

Calculation of cFDR and ccFDR values
The cFDR method was used to estimate the possibility that a random SNP was not associated with the primary trait, given that its strength for the conditional traits was below the threshold [28]. This was an extension of the original FDR framework, applied for the cross-trait analysis [69]. Specifically, we computed cFDR for each SNP, selecting CAD as the primary phenotype given its association with BW (CAD|BW) and vice versa (BW|CAD). To detect the pleiotropic loci for both traits, we calculated the ccFDR value, the maximum of the two cFDR values. The ccFDR value indicated that the possibility that a given SNP was false positively related to two traits (CAD and BW) simultaneously. The thresholds for cFDR and ccFDR were set at 0.05.
Detailed steps of this approach have been described by Andreassen et al. [29].

Bi-directional MR analysis
To determine the relationship between BW and CAD, we performed a bi-directional MR analysis using the "TwoSampleMR" R package [70]. First, SNPs that were genome-wide significant (p ≤ 5x10 -8 ) in the exposure GWAS dataset were selected as genetic variants. To ensure that the instruments for exposure were independent, we performed LD-based clumping (r 2 > 0.001) and only retained the SNP with a lower pvalue [68,71]. Then, we extracted summary-level statistics for each selected SNP from the outcome trait and removed the SNPs related to the outcome phenotype (p ≤ 5x10 -8 ). The summary associations of candidate genetic variants were harmonized as described previously [72]. Finally, MR was conducted using IVW, simple median, weighted median, weighted mode, maximum likelihood, and MR-Egger approaches. BW and CAD were used as exposure and outcome measures, respectively, to identify the causal direction. The datasets used in the MR analysis were the same as that in the original cFDR analysis (The first CAD and BW datasets). To investigate whether any SNP had an outlying and/or pleiotropic influence, we also performed a leave-one-out sensitivity analysis.
We used the GOEAST software to detect statistically overrepresented GO terms within the selected gene sets [74]. Meanwhile, using the STRING database, we conducted protein-protein interaction analyses to investigate the interaction and functional relationships of the identified CAD-and/or BW-related genes [75].

AUTHOR CONTRIBUTIONS
Xinrui Wu conceived the study, performed data analysis, interpretation and wrote the manuscript. Xu Lin, Qi Li, and Zun Wang were responsible for data collection and analysis. Na Zhang and Mengyuan Tian AGING contributed to the manuscript. Xiaolei Wang conducted experiments. Hongwen Deng gave constructive suggestions during the whole process. Hongzhuan Tan provided guidance in study design, organized the investigation and is the corresponding author. All authors have read and approved the final manuscript before submission.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.

Supplementary Tables
Please browse Full Text version to see the data of Supplementary Tables 1, 3