Association between de novo lipogenesis susceptibility genes and coronary artery disease

Background and aims: Coronary artery disease (CAD) is the principal cause of death in individuals with non-alcoholic fatty liver disease (NAFLD). The aim of this study was to use genetic epidemiology to study the association between de novo lipogenesis (DNL), one of the major pathways leading to NAFLD, and CAD risk. Methods and results: DNL susceptibility genes were used as instruments and selected using three approaches: 1) genes that are associated with both high serum triglycerides and low sex hormone-binding globulin, both downstream consequences of DNL (unbiased approach), 2) genes that have a known role in DNL (biased approach), and 3) genes that have been associated with serum fatty acids, used as a proxy of DNL. Gene-CAD effect estimates were retrieved from the meta-analysis of CARDIoGRAM and the UK Biobank (~76 014 cases and ~264 785 controls). Effect estimates were clustered using a fixed-effects meta-analysis. Twenty-two DNL susceptibility


Introduction
Non-alcoholic fatty liver disease (NAFLD) has become a substantial health burden that is associated with hepatic complications including end-stage liver failure and hepatocellular carcinoma [1]. Notably, NAFLD is also strongly associated with extrahepatic complications such as type 2 diabetes and coronary artery disease (CAD), the latter of which has become the leading cause of death in patients with NAFLD [2,3].
Over the past years, there has been an ongoing discussion on the causal role of NAFLD in the development of CAD [4]. The central, systemic role of the liver in metabolic processes and the need for long-term follow-up complicate conventional epidemiological and intervention studies that target NAFLD [5].
Genetic epidemiology can serve as an alternative method to infer causality. As genetic variants that predispose to or protect from an exposure of interest (such as NAFLD) are randomly distributed at conception, they can be used as an instrument to study the causal effect of the exposure on the outcome (such as CAD) [6]. We previously applied this approach and showed that genetic variants that result in NAFLD through impaired secretion of very-low-density lipoproteins (VLDL) protect from CAD [7]. However, it remains uncertain how other, more principal pathways that result in NAFLD, in particular de novo lipogenesis (DNL), i.e. the process of converting non-lipid precursors, such as glucose and fructose, into fatty acids [8], contribute to the risk of CAD.
In the present study we, therefore, aimed to use genetic epidemiology to gain more insight into the association between DNL and CAD.

Methods

Unbiased approach
Since DNL inhibits the synthesis of SHBG and stimulates VLDL production [9–12], we assumed that genetic variants that predispose to both low serum SHBG and high triglyceride levels are likely DNL susceptibility genes. The triglyceride and SHBG susceptibility genes were retrieved from genome-wide association (GWA) studies in the Global Lipids Genetics Consortium (serum triglycerides) and UK Biobank (BMI-adjusted SHBG), respectively [13,14]. Genetic variants were included if the associations with serum SHBG and triglycerides both reached genome-wide significance (p < 5 × 10⁻⁸), and the effect allele was positively associated with serum triglyceride levels and inversely associated with serum SHBG levels. Genetic variants were excluded if they were in linkage disequilibrium (r² > 0.1); in that case, the variant with the largest absolute effect estimate was retained.
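The selection rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the variant identifiers, effect sizes, and the pairwise r² lookup are hypothetical, and ranking variants by their absolute triglyceride effect during pruning is an assumption, since the text does not specify which effect estimate was used.

```python
from dataclasses import dataclass, replace

GWS = 5e-8    # genome-wide significance threshold stated in the text
R2_MAX = 0.1  # linkage disequilibrium cut-off stated in the text


@dataclass(frozen=True)
class Variant:
    rsid: str         # hypothetical identifier
    beta_tg: float    # effect of the effect allele on serum triglycerides
    p_tg: float
    beta_shbg: float  # effect of the same allele on BMI-adjusted SHBG
    p_shbg: float


def align(v):
    """Report effects for the triglyceride-raising allele (flip both signs if needed)."""
    if v.beta_tg < 0:
        return replace(v, beta_tg=-v.beta_tg, beta_shbg=-v.beta_shbg)
    return v


def passes_selection(v):
    """Genome-wide significant for both traits, triglycerides up and SHBG down."""
    return v.p_tg < GWS and v.p_shbg < GWS and v.beta_tg > 0 and v.beta_shbg < 0


def ld_prune(variants, r2):
    """Greedy pruning: visit variants by decreasing absolute effect (assumed:
    triglyceride effect) and keep one only if r2 with every kept variant is <= R2_MAX."""
    kept = []
    for v in sorted(variants, key=lambda x: abs(x.beta_tg), reverse=True):
        if all(r2.get(frozenset((v.rsid, k.rsid)), 0.0) <= R2_MAX for k in kept):
            kept.append(v)
    return kept


# Hypothetical summary statistics, for illustration only.
variants = [
    Variant("rsA", 0.05, 1e-12, -0.03, 1e-10),  # meets all criteria
    Variant("rsB", 0.02, 1e-9, -0.01, 1e-9),    # meets criteria, but in LD with rsA
    Variant("rsC", 0.04, 1e-20, 0.02, 1e-15),   # raises both traits: misaligned
    Variant("rsD", -0.03, 1e-9, 0.02, 1e-9),    # meets criteria once the allele is flipped
    Variant("rsE", 0.03, 1e-5, -0.02, 1e-12),   # triglyceride association not significant
]
r2 = {frozenset(("rsA", "rsB")): 0.6}  # hypothetical pairwise LD

candidates = [v for v in map(align, variants) if passes_selection(v)]
selected = ld_prune(candidates, r2)
print([v.rsid for v in selected])
```

With these made-up inputs, rsB is dropped in favour of rsA because the pair exceeds the r² cut-off, and rsC is dropped for misalignment of the predefined direction.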

Biased approach
In the biased approach, we screened all genome-wide significant (p < 5 × 10⁻⁸) SHBG susceptibility genes in the UK Biobank for a potential involvement in DNL (based on genecards.org). Genes, and their corresponding genetic variants, that were of interest were further explored in existing literature to verify their role in DNL. The effect allele of the genetic variant was chosen as the allele that decreases serum SHBG levels. Genetic variants were excluded if they were in linkage disequilibrium, as described previously.

Fatty acid approach
In the fatty acid approach, we selected genetic variants that have previously been identified from a GWA study (p < 5 × 10⁻⁸) for palmitic acid (16:0), stearic acid (18:0), palmitoleic acid (16:1n-7), or oleic acid (18:1n-9), used as biomarkers of DNL [15]. The effect allele was chosen as the allele that increases plasma fatty acids. If a genetic variant had effects on multiple plasma fatty acids, the effect allele was chosen as the allele that increases the concentration of the fatty acid that is most proximal in the pathway of DNL (see Supplementary Fig. 1). Genetic variants were excluded if they were in linkage disequilibrium, as described previously.

Associations with coronary artery disease
Summary-level data for the association of the selected genetic variants with CAD was retrieved from the publicly available data of the CARDIoGRAMplusC4D 1000 Genomes-based GWAS, Myocardial Infarction Genetics and CARDIoGRAM Exome chip, and UK Biobank SOFT CAD study [16]. The CARDIoGRAM dataset was originally published as a meta-analysis of ~68 studies, including individuals aged 18 years and older of primarily European descent. The UK Biobank includes individuals aged between 18 and 69 years, primarily of European descent, and living in the United Kingdom. In the current study, data from ~76 014 cases and ~264 785 controls was used. CAD was defined as a history of fatal or nonfatal myocardial infarction, percutaneous transluminal coronary angioplasty (PTCA) or coronary artery bypass grafting (CABG), chronic ischemic heart disease (IHD) or angina, based on self-reported data or hospital records. Controls were defined as those individuals who did not fulfil the criteria for CAD and who did not suffer from an aneurysm or atherosclerotic cardiovascular disease, based on hospital records.

Statistical analyses
For each approach, a fixed-effect meta-analysis was conducted to combine the CAD risk estimates conferred by all individual DNL genetic variants. The overall effect estimate should, therefore, be interpreted as the average CAD risk conferred by one DNL risk allele [7]. Higgins' I² and Cochran's Q statistic were calculated to identify heterogeneity of the effect estimates. Potential influential outliers were identified statistically using the leave-one-out method [17]. Results were considered statistically significant at p < 0.05. All analyses were conducted with the R statistical software (R Development Core Team) using the metafor package [18].
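The authors ran these analyses in R with the metafor package. As a rough illustration of what the statistics involve, the sketch below re-expresses a standard inverse-variance fixed-effect model, Cochran's Q, Higgins' I², and a leave-one-out loop in plain Python. It is not the authors' code; the function names, variant labels, log odds ratios, and standard errors are all hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def fixed_effect(betas, ses):
    """Inverse-variance fixed-effect pooling of per-variant log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    se_pooled = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # Higgins' I2: share of total variation beyond chance, floored at 0.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - Z95 * se_pooled),
        "ci_high": math.exp(pooled + Z95 * se_pooled),
        "q": q,
        "i2": i2,
    }


def leave_one_out(labels, betas, ses):
    """Re-fit the model once per variant, each time omitting that variant."""
    results = {}
    for i, label in enumerate(labels):
        results[label] = fixed_effect(betas[:i] + betas[i + 1:], ses[:i] + ses[i + 1:])
    return results


# Hypothetical per-variant log odds ratios and standard errors.
labels = ["v1", "v2", "v3", "v4"]
betas = [0.020, 0.012, 0.018, -0.015]
ses = [0.004, 0.005, 0.006, 0.005]

overall = fixed_effect(betas, ses)
loo = leave_one_out(labels, betas, ses)
print(overall)
print({label: round(fit["i2"], 1) for label, fit in loo.items()})
```

In this made-up example, omitting v4 (the only variant pointing in the opposite direction) removes most of the heterogeneity, which is the pattern a leave-one-out analysis is designed to expose.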

Results

Unbiased approach
Thirty-one genes that reached genome-wide significance (p < 5 × 10⁻⁸) for the association with both serum SHBG and triglycerides were identified. Nine genes were subsequently excluded because of linkage disequilibrium (SNX17), misalignment of the predefined direction of the association with serum triglycerides and SHBG (AKR1C4, APOC1, APOC1P1, and MET), or absence in the outcome dataset (GATAD2A, HSD17B13, MACF1, and NRBF2). Therefore, twenty-two genes that predisposed to low serum SHBG and high triglyceride levels were included in the final analysis (Supplementary Table 1). Clustering of these genetic variants resulted in a statistically significant association with CAD (OR: 1.016, 95% CI: 1.012; 1.020, I²: 72.7%, Q: 76.9) (Fig. 1). As the I² statistic indicated significant heterogeneity, the analysis was repeated after exclusion of the most influential outliers (JMJD1C, MYRF, and TRIB1). Exclusion of these genes reduced the heterogeneity, and did not affect the strength of the association (OR: 1.020, 95% CI: 1.015; 1.024, I²: 41.8%, Q: 30.9).
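The reported heterogeneity statistics are linked by the standard relation I² = max(0, (Q − df)/Q), with df equal to the number of estimates minus one. The short check below applies that relation to the values in this paragraph. It assumes that each gene contributes exactly one variant (so df = 21 before and df = 18 after exclusion of the three outliers), and the function name is hypothetical.

```python
def i_squared(q, k):
    """Higgins' I2 (in %) from Cochran's Q and the number of pooled estimates k."""
    df = k - 1
    return 100.0 * max(0.0, (q - df) / q)


# Values reported above: 22 genes with Q = 76.9, then 19 genes with Q = 30.9.
print(round(i_squared(76.9, 22), 1))
print(round(i_squared(30.9, 19), 1))
```

The first value reproduces the reported 72.7%. The second comes out near 41.7%, which agrees with the reported 41.8% to within the rounding of Q.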

Biased approach
Ten SHBG susceptibility genes were identified that are known to be involved in DNL. One gene (IRS1) was excluded as it was unavailable in the outcome dataset. The remaining nine genes (GCK, GCKR, GPAM, INSR, MLXIPL, PNPLA3, PTEN, SCAP, and TRIB1) were included in the analysis (Supplementary Table 2). Their putative role in DNL is shown in Supplementary Fig. 1. Clustering of these genetic variants resulted in a statistically significant association with CAD (OR: 1.013, 95% CI: 1.007; 1.020, I²: 74.3%, Q: 31.1) (Fig. 2).
Since the PNPLA3 major allele, which according to the selection criteria was associated with higher rates of DNL and lower serum SHBG levels (Supplementary Table 2), is also associated with a higher VLDL secretion rate and a lower intrahepatic lipid content [19,20], the analysis was repeated after exclusion of this variant. This did not affect the strength of the association (OR: 1.011, 95% CI: 1.004; 1.018, I²: 73.2%, Q: 26.1). Furthermore, we repeated the analysis after exclusion of influential outliers (GPAM, PNPLA3, PTEN, and TRIB1). The strength of the association remained materially unchanged while the heterogeneity was reduced (OR: 1.012, 95% CI: 1.003; 1.020, I²: 0.0%, Q: 2.4).

Fatty acid approach
Of the eight fatty acid susceptibility genes that were previously identified in a GWA study [15], one gene (ALG14) was excluded from the current analysis because of linkage disequilibrium. The remaining genes are presented in Supplementary Table 3. Clustering of these genetic variants did not result in a statistically significant association with CAD (OR: 1.004, 95% CI: 0.996; 1.011, I²: 74.3%, Q: 24.0) (Fig. 3). After exclusion of the influential outliers (GCKR and PKD2L1), and a consequent reduction in heterogeneity, the strength of the association increased and reached statistical significance (OR: 1.009, 95% CI: 1.000; 1.018, I²: 0.0%, Q: 3.4).
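A pooled odds ratio and its 95% confidence interval can be converted back into an approximate z-score and two-sided p-value, which helps when judging how close an estimate sits to the p < 0.05 threshold. The sketch below does this for the two fatty acid estimates above. It assumes a normal approximation on the log odds scale, and because the published values are rounded to three decimals the results are only indicative. The function name is hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def z_and_p(odds_ratio, ci_low, ci_high):
    """Approximate z-score and two-sided p-value from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


z_all, p_all = z_and_p(1.004, 0.996, 1.011)    # all fatty acid variants
z_excl, p_excl = z_and_p(1.009, 1.000, 1.018)  # after exclusion of outliers
print(round(z_all, 2), round(p_all, 3))
print(round(z_excl, 2), round(p_excl, 3))
```

Under these assumptions the estimate after exclusion of outliers lands very close to the significance threshold, consistent with a confidence interval whose lower bound is reported as 1.000.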

Discussion
The aim of the present study was to assess the association between DNL susceptibility genes and CAD. DNL susceptibility genes were identified using an unbiased and biased selection approach, as well as by using fatty acid susceptibility genes as a proxy for DNL. Clustering of these genes revealed a statistically significant association of DNL susceptibility genes, but not fatty acid susceptibility genes, with CAD.
Both experimental and observational studies have shown that an increase in DNL results in an increase in VLDL secretion as well as a reduction in serum SHBG levels [9,10,12]. We, therefore, assumed that the overlap in the triglyceride and SHBG susceptibility genes likely represents genes that also predispose to DNL. As triglycerides are a well-known risk factor for cardiovascular disease, it is perhaps unsurprising that the genes identified in this approach predispose to CAD [21–23]. Serum triglycerides are, therefore, a likely mediator in this association. Of interest, we previously showed that the direction of the association between NAFLD susceptibility genes and CAD depends on their effect on serum lipids [7], which further corroborates the mediation effect of serum lipids on CAD risk. As a first limitation of this unbiased approach, we cannot exclude that other processes may also affect both serum SHBG and triglyceride levels, in particular upstream factors of DNL such as obesity [24]. To address this, we used BMI-adjusted SHBG susceptibility genes, which meant that well-known obesity genes, such as FTO [25], were not identified. Nevertheless, despite our efforts to reduce the effect of obesity, we cannot exclude residual confounding. Second, in the unbiased approach Higgins' I² remained relatively high, even in the sensitivity analyses in which influential outliers were excluded. This indicates that some heterogeneity remained.

Figure 1
Association between de novo lipogenesis susceptibility genes, identified by an unbiased approach, and coronary artery disease (CAD). The overall effect estimate represents the average risk of CAD conferred by one de novo lipogenesis risk allele.

Figure 2
Association between de novo lipogenesis susceptibility genes, identified by a biased approach, and coronary artery disease (CAD). Overall effect estimate represents the average risk of CAD conferred by one de novo lipogenesis risk allele.

Figure 3
Association between fatty acid susceptibility genes and coronary artery disease (CAD). Overall effect estimate represents the average risk of CAD conferred by one fatty acid risk allele.

To overcome some of the limitations of the unbiased approach, we also selected genes based on their involvement in DNL. We identified nine genes that are known to regulate the process of DNL, including MLXIPL and GCKR [26,27]. Nonetheless, careful evaluation of the genes reveals the absence of genes that encode several other well-known lipogenic enzymes, such as acetyl-coenzyme A carboxylase (ACC) and fatty acid synthase (FASN) [28,29]. As these genes were not associated with serum SHBG levels, a selection criterion that we enforced to ensure that the genetic variants are associated with downstream consequences of DNL and, hence, are likely to be functional variants (or in linkage disequilibrium with a variant that is), they could not be included. Furthermore, there was a notable discrepancy between the DNL susceptibility genes identified by the biased and unbiased approach. This was somewhat surprising, as we had anticipated that all genes involved in DNL (biased approach) would also be both SHBG and triglyceride susceptibility genes (unbiased approach). This discrepancy could be the result of the very stringent p-value threshold for significance applied for the selection of both SHBG and triglyceride susceptibility genes (p < 5 × 10⁻⁸).
In the present study, we did not observe an association between fatty acid susceptibility genes and CAD, although exclusion of potential outliers did reveal a statistically significant association. Previous observational studies that used fatty acids as a proxy for DNL found inconclusive associations with cardiovascular disease [30–32]. The use of fatty acids as a proxy of DNL has, more recently, been scrutinized by stable isotope studies. DNL associated only weakly, though significantly, with palmitic acid (16:0) and stearic acid (18:0), the direct products of DNL, but not with their derivatives, such as palmitoleic acid (16:1n-7) or oleic acid (18:1n-9) [33]. The validity of these fatty acids as a serum biomarker of DNL can, therefore, be questioned.
Despite it being marked as a statistical outlier in the fatty acid approach, GCKR may be one of the most valid DNL susceptibility genes included in the present study. First, it is the only gene that was identified as a DNL susceptibility gene in all three approaches. Second, there is ample biological plausibility that variants in GCKR affect DNL. The minor allele in GCKR encodes a variant of glucokinase regulatory protein (GKRP), a liver-specific protein, which binds glucokinase less effectively [34,35]. Thereby, it increases the hepatic influx of glucose resulting in higher availability of substrate for DNL (Supplementary Fig. 1) [36]. Indeed, stable isotope studies have shown that individuals carrying the minor allele of GCKR have higher rates of DNL [27]. The statistically significant association of the GCKR minor allele with CAD in this study and our previous meta-analysis [37], therefore, further establishes a role for DNL in the pathogenesis of CAD.
The findings in this study may provide a glimpse into the long-term consequences of therapies that affect DNL. On the one hand, therapies that reduce DNL, such as ACC inhibitors, which are currently undergoing phase II trials as a potential treatment for NAFLD [38], may in the long term also have beneficial cardiovascular effects. This beneficial side-effect is desirable as cardiovascular disease is the principal cause of death in individuals with NAFLD [2,3]. As previously indicated, in this study we were unable to assess the effects of genetic variants in ACC specifically, although we did study upstream variants, including MLXIPL. On the other hand, the present findings also indicate that therapies that stimulate DNL should be avoided. Currently, compounds that augment hepatic glucose uptake, such as liver-specific glucokinase activators and disruptors of the GKRP-glucokinase complex, are under investigation as a new class of glucose-lowering medication [39,40]. These drugs have biological analogies with variants in GCK and GCKR and may, therefore, stimulate DNL and, hence, cause NAFLD and CAD [41].
This study has several strengths. Given the absence of GWA studies for DNL, we used three independent methods to identify DNL susceptibility genes, which allowed us to overcome the limitations unique to each approach and, thereby, to test the robustness of our findings. Furthermore, by retrieving gene-CAD effect estimates from the CARDIoGRAM and UK Biobank datasets, which together include more than 340 000 individuals, we had sufficient statistical power to assess the relationship between DNL susceptibility genes and CAD. Finally, as indicated, the current results can shed light on the possible long-term consequences of drug therapies that affect DNL, a finding which would otherwise require years of follow-up in conventional research.
In addition to the considerations described above, this study has several further limitations. First, a primary assumption of all Mendelian randomization studies is that the instrumental genes do not affect the outcome, other than through the exposure, i.e. there should be no horizontal pleiotropy [42]. PNPLA3 illustrates this risk of horizontal pleiotropy. The major allele of PNPLA3, which was included in the biased approach based on its association with lower serum SHBG levels and higher rates of DNL [20], is also associated with higher VLDL secretion [19,20]. The latter, which is thought to be the primary effect of PNPLA3, is also known to be a risk factor for CAD [7]. Exclusion of PNPLA3 in the current analyses did not, however, affect the strength of the associations. Likewise, pleiotropic effects may also explain why two variants (i.e. MYRF and PKD2L1) were found to be statistically significantly protective for CAD, which was in direct contrast with the average effect of DNL genes found in this study. Second, common genetic variants are known to have very small effects on any outcome trait [43]. As gene-exposure data, i.e. gene-DNL data, was unavailable, we were unable to conduct full Mendelian randomization analyses. Consequently, the current data cannot be extrapolated to quantify the effect that DNL may have on CAD. The results of the current study should, therefore, be interpreted as the average risk of CAD conferred by one DNL genetic variant, which explains the observed small effect sizes. If gene-DNL data becomes available in the future (which is unlikely given the laborious nature of quantifying DNL), the current study should be repeated to draw conclusions on the extent to which DNL contributes to the risk of CAD.
In summary, DNL susceptibility genes, but not fatty acid susceptibility genes, are associated with CAD. These findings enhance our understanding of the contribution of different pathways of intrahepatic lipid accumulation to the risk of cardiovascular disease, and suggest that augmented DNL may have negative consequences on the risk of CAD. The current findings justify further studies of the long-term consequences of therapies targeting DNL as a means to not only treat NAFLD, but also to reduce the risk of extrahepatic complications of NAFLD, such as CAD.

Funding
This work was supported by a research grant from the European Foundation for the Study of Diabetes/Sanofi.

Data accessibility
The datasets generated during and/or analysed during the current study are not publicly available but are available from the corresponding author on reasonable request.

Declaration of competing interest
The authors declare that there are no competing interests.