Heritability of 596 lipid species and genetic correlation with cardiovascular traits in the Busselton Family Heart Study

CVD is the leading cause of death worldwide, and genetic investigations into the human lipidome may provide insight into CVD risk. The aim of this study was to estimate the heritability of circulating lipid species and their genetic correlation with CVD traits. Targeted lipidomic profiling was performed on 4,492 participants from the Busselton Family Heart Study to quantify the major fatty acids of 596 lipid species from 33 classes. We estimated narrow-sense heritabilities of lipid species/classes and their genetic correlations with eight CVD traits: BMI, HDL-C, LDL-C, triglycerides, total cholesterol, waist-hip ratio, systolic blood pressure, and diastolic blood pressure. We report heritabilities and genetic correlations of new lipid species/subclasses, including acylcarnitine (AC), ubiquinone, sulfatide, and oxidized cholesteryl esters. Over 99% of lipid species were significantly heritable (h2: 0.06–0.50) and all lipid classes were significantly heritable (h2: 0.14–0.50). The monohexosylceramide and AC classes had the highest median heritabilities (h2 = 0.43). The largest genetic correlation was between clinical triglycerides and total diacylglycerol (rg = 0.88). We observed novel positive genetic correlations between clinical triglycerides and phosphatidylglycerol species (rg: 0.64–0.82), and HDL-C and alkenylphosphatidylcholine species (rg: 0.45–0.74). Overall, 51% of the 4,768 lipid species-CVD trait genetic correlations were statistically significant after correction for multiple comparisons. This is the largest lipidomic study to address the heritability of lipids and their genetic correlation with CVD traits. Future work includes identifying putative causal genetic variants for lipid species and CVD using genome-wide SNP and whole-genome sequencing data.

Clinical lipid measures such as total cholesterol, LDL-C, and HDL-C reflect the cholesterol component of the lipoprotein particles, which are complex mixtures of phospholipids, sphingolipids, free cholesterol, cholesteryl esters, and triglycerides (referred to as triacylglycerol (TG) species in the mass spectrometric measurements), together with a range of proteins (11). These lipid classes contain potentially thousands of individual molecular species that make up the human lipidome, which can now be measured using established low-cost high-throughput methods (12). Lipids are transported through plasma as lipoproteins for exchange between the liver, intestine, and peripheral tissues. Their composition and abundance are likely to reflect underlying metabolic processes influenced by the environment, diet, and genetics (11). The individual species comprising the lipidome may represent novel predictors of CVD risk, particularly if measured in longitudinal cohort studies where causality may potentially be inferred (11,13). Owing to their close proximity to an individual's metabolic state, genetic investigations into these lipid species may provide insight into CVD risk and prediction, above that already identified through the genetic analysis of the composite clinical lipid measures. This is particularly the case for lipid species that are genetically correlated with disease-related traits, as the search for pleiotropic clues can be restricted to the more informative species (i.e., those that are heritable and genetically correlated).
Associations between the circulating lipidome and CVD traits have provided insight into CVD etiology and identified novel biomarkers. Meikle et al. (14) identified 13 lipid classes and 102 lipids associated with stable coronary artery disease (CAD) compared with healthy controls. In a more recent study, nine lipid classes/subclasses and 113 lipid species from the apoA fraction, and seven classes/subclasses and 113 lipid species from plasma were associated with unstable CAD (compared with stable CAD) (15). In the Malmo Diet and Cancer study, incident cardiovascular events were marginally associated with lipid species belonging to the lysophosphatidylcholine (LPC), SM, and TG lipid classes (16). Ganna et al. (17) found significant associations between incident coronary heart disease and 32 metabolites, five of which were also associated in an independent cohort. After adjustment for traditional CVD risk factors, the addition of lipid species to a base model marginally improved the prediction of CVD events (C-statistic increased from 0.68 to 0.70) and CVD deaths (C-statistic increased from 0.74 to 0.76) (13). Similar results were seen in 5,991 individuals from a population-based study, where the inclusion of seven lipid species in a traditional risk factor model improved the C-statistic by 0.025 and 0.054 for CVD events and CVD death, respectively (18).
Numerous studies have estimated the heritability of the lipidome and its association and/or genetic correlation with CVD traits. In a study of Mexican Americans from the San Antonio Family Heart Study (19), all 319 lipid species were significantly heritable, with a median heritability of 0.37. This study also identified clusters of lipid species associated with risk of cardiovascular death and with other CVD-related risk factors, such as obesity, type 2 diabetes, and higher triglycerides. In a more recent study of 2,181 individuals, Tabassum et al. (20) estimated SNP-based heritabilities of 141 lipid species ranging between 0.10 and 0.54. Strong genetic correlations were observed between TG and diacylglycerol (DG) lipid classes and the clinical lipid measure of triglycerides (average rg = 0.88). Most recently, Demirkan et al. (21) examined the genetic correlation between 90 lipid species from TG, SM, phosphatidylcholine (PC), alkylphosphatidylcholine [PC(O)], LPC, phosphatidylethanolamine (PE), and alkylphosphatidylethanolamine [PE(O)] classes and identified genetic correlations between lipid species and clinical lipids (i.e., triglycerides, LDL-C, HDL-C), total body fat percentage, and BMI. These studies indicate that the human lipidome is heritable and highlight the genetic pleiotropy that exists between the plasma lipidome and CVD traits.
The aim of this study was to estimate the heritability of the human lipidome and the genetic correlation between lipid classes and species with CVD traits using our expanded lipidomic profile (with more specific lipid species and new lipid classes) of the Busselton Family Heart Study.

Study population
Participants (n = 4,492) were drawn from the 1994/95 survey of the original participants of the long-running epidemiological study, the Busselton Health Study, for whom genome-wide SNP data, extensive phenotype data, and blood serum were available. The Busselton Health Study is a community-based study in Western Australia that includes both related and unrelated individuals (predominantly of European ancestry), and has been described in more detail elsewhere (22-24). Informed consent was obtained from all participants and the 1994/95 health survey was approved by the University of Western Australia Human Research Ethics Committee (UWA HREC). The current study, the Busselton Family Heart Study, was approved by the UWA HREC. This study was conducted in accordance with the ethical principles of the Declaration of Helsinki.

Serum lipidomic profiling
Targeted lipidomic profiling was performed in positive-ion mode using electrospray ionization-tandem mass spectrometry to quantify the major fatty acids of 596 lipid species from 33 lipid classes, from blood serum. Positive-ion mode only was selected to minimize the time required for analysis of each sample, while still providing good coverage of the lipidome (25). Profiling was performed at the Metabolomics Laboratory, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia. Serum lipids were isolated using a single phase butanol:methanol extraction (26) and quantified by liquid chromatography-tandem mass spectrometry as previously described (25). Briefly, serum samples (10 µl) were placed into a randomized order, with blank and pooled quality control samples placed every 20 and 10 serum samples, respectively. Serum aliquots were extracted in a single-phase extraction with the addition of 100 µl of butanol:methanol (1:1), containing a mix of 18 nonphysiological or stable isotope-labeled lipid standards between 10 and 10,000 pmol each. Lipid analysis was performed by liquid chromatography electrospray ionization-tandem mass spectrometry using an Agilent 1290 HPLC coupled to an Agilent 6490 triple quadrupole mass spectrometer. The ion source was operated in positive ionization mode, with conditions: gas temperature 150°C, gas flow 17 liters/min, nozzle pressure 20 psi, sheath gas temperature 200°C, sheath gas flow 10 liters/min, capillary voltage 3,500 V, nozzle voltage 1,000 V. Liquid chromatography was performed on a Zorbax Eclipse Plus C18, 1.8 µm, 100 × 2.1 mm column (Agilent Technologies). Solvents consisted of water:acetonitrile:isopropanol containing 10 mM ammonium formate (solvent A, 50:30:20; solvent B, 1:9:90). The column was heated to 60°C and the autosampler regulated to 25°C.
Lipid extracts (1 µl) were injected and separated under gradient conditions using a flow rate of 400 µl/min: 0 min, 10% B; 2.7 min, 45% B; 2.8 min, 53% B; 9 min, 65% B; 9.1 min, 89% B; 11 min, 92% B; 11.1 min, 100% B; 11.9 min, 100% B; 12.8 min, 10% B; 12.9 min, 10% B (flow rate 600 µl/min); 13.9 min, 10% B (flow rate 600 µl/min); 14 min, 10% B (flow rate 400 µl/min); held at 10% B and 400 µl/min until the next injection at 16.2 min. The first minute and last 3 min of each analytical run were diverted to waste.
A total of 497 transitions, representing 596 lipid species, were measured using dynamic multiple reaction monitoring, where data were collected during a retention time window specific to each lipid species. Raw mass spectrometry data were analyzed using Mass Hunter Quant B08 (Agilent Technologies). Lipid concentrations were calculated by relating the area under the chromatographic peak for each lipid species to the corresponding internal standard. Correction factors were applied to adjust for differences in response factors, where these were known (25).
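The internal-standard quantification described above can be illustrated with a minimal sketch. It assumes the usual single-point calculation in which the analyte-to-standard peak area ratio is scaled by the amount of standard added and divided by the sample volume, and it assumes the correction factor is applied multiplicatively. The function name, argument names, and example numbers are hypothetical and are not taken from the study's pipeline.

```python
def lipid_concentration(analyte_area, standard_area, standard_pmol,
                        sample_volume_ul, response_factor=1.0):
    """Estimate a lipid concentration in pmol/µl (numerically equal to µmol/l).

    Assumption: single-point internal-standard quantification, with the
    response-factor correction applied as a multiplier (default 1.0 when
    no correction factor is known).
    """
    if standard_area <= 0 or sample_volume_ul <= 0:
        raise ValueError("standard area and sample volume must be positive")
    # Peak area of the lipid species relative to its internal standard.
    area_ratio = analyte_area / standard_area
    # Scale by the amount of standard spiked into the extraction.
    amount_pmol = area_ratio * standard_pmol * response_factor
    return amount_pmol / sample_volume_ul


# Hypothetical example: 10 µl of serum spiked with 100 pmol of standard.
example = lipid_concentration(2000.0, 1000.0, 100.0, 10.0)
```

With these illustrative inputs the area ratio is 2, so 200 pmol in 10 µl gives 20 pmol/µl.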

Phenotypic variables
Details of the Busselton Health Study data collection have been published previously (27). For this study, we examined eight CVD phenotypic variables: HDL-C, LDL-C, triglycerides, total cholesterol, SBP, DBP, BMI, and waist-hip ratio (WHR). Serum cholesterol and triglycerides were measured by standard enzymatic methods on a Hitachi 747 (Roche Diagnostics, Sydney, Australia) from fasting blood collected in 1994/95. HDL-C was determined on a serum supernatant after polyethylene glycol precipitation using an enzymatic cholesterol assay and LDL-C was estimated using the Friedewald formula (28). Five-minute resting SBP and DBP were used. Height and weight (used to calculate BMI) were collected from participants at the time of interview (1994/95). WHR was calculated as waist circumference (centimeters) / hip circumference (centimeters). Use of antihypertensive and lipid-lowering medications was recorded at interview (1994/95).
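The derived phenotypes above follow simple formulas, sketched here for illustration. The Friedewald estimate is LDL-C = total cholesterol − HDL-C − triglycerides/2.2 in mmol/l (triglycerides/5 in mg/dl). The units used in the study and the handling of high triglyceride values are not stated in this section, so the `units` argument and the conventional 4.5 mmol/l (400 mg/dl) validity limit below are assumptions, and all function names are hypothetical.

```python
def friedewald_ldl(total_chol, hdl, triglycerides, units="mmol/L"):
    """Estimate LDL-C with the Friedewald formula.

    Returns None when triglycerides exceed the conventional validity
    limit (an assumption here; the study's handling is not described).
    """
    if units == "mmol/L":
        divisor, tg_limit = 2.2, 4.5
    elif units == "mg/dL":
        divisor, tg_limit = 5.0, 400.0
    else:
        raise ValueError("units must be 'mmol/L' or 'mg/dL'")
    if triglycerides > tg_limit:
        return None
    # Triglycerides / divisor approximates VLDL cholesterol.
    return total_chol - hdl - triglycerides / divisor


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2 from weight and height."""
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm, hip_cm):
    """WHR as waist circumference divided by hip circumference."""
    return waist_cm / hip_cm
```

For example, a total cholesterol of 5.0, HDL-C of 1.5, and triglycerides of 1.1 mmol/l give an estimated LDL-C of 3.0 mmol/l.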

Genotype data
Genotyping was performed on the Illumina Human 610K Quad-Bead Chip (Illumina Inc., San Diego, CA) at the Centre National de Genotypage in Paris, France (n = 1,468), and on the Illumina 660 W Quad Array Bead Chip (Illumina Inc.) at the PathWest Laboratory Medicine WA (Nedlands, Western Australia, Australia) (n = 3,428). Complete linkage clustering based on pairwise identity by state distance in PLINK (29) showed no batch effects, therefore the batches were merged.

Statistical analyses
We first used the general linear mixed effects model incorporated in the Sequential Oligogenic Linkage Analysis Routines (SOLAR) (30) to estimate the narrow-sense heritabilities of lipid classes (n = 33), lipid species (n = 596), and eight CVD traits: HDL-C, LDL-C, triglycerides, total cholesterol, SBP, DBP, BMI, and WHR. SOLAR uses a variance-component method to partition observed covariance between individuals into genetic and environmental components. Heritability is defined as the variance in the trait due to additive genetic effects divided by the sum of the additive genetic effects and the random (unmeasured) environmental effects. The null hypothesis of no heritability (h2 = 0) was tested by comparing the log likelihood for the full model [with the genetic relatedness matrix (GRM)] and the reduced model (without the GRM), using likelihood ratio tests. Twice the difference in log-likelihoods of these models was distributed as a χ2 random variable with 1 degree of freedom. Genetic correlations between lipid classes/species and eight CVD traits were calculated in SOLAR. Lipid classes were defined as the sum of the lipids within each class (i.e., total lipid abundance for each class). Genetic correlations were estimated using a variance components model to partition the phenotypic correlation between the traits into the proportion of variability due to shared genetic effects (ρg) and the proportion of variability due to shared environmental effects. The null hypothesis of no genetic correlation (rg = 0) was tested by comparing the log likelihood for the full model (with the GRM) and the reduced model (without the GRM), using likelihood ratio tests.
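The quantities defined above can be written out in a short sketch. It is not SOLAR code: it only restates the heritability ratio, the likelihood ratio test against a χ2 distribution with 1 degree of freedom as described in the text, and the standard bivariate decomposition of a phenotypic correlation into genetic and environmental parts. Function names and the example log-likelihoods are hypothetical.

```python
import math


def heritability(var_additive, var_environment):
    """Narrow-sense heritability: additive variance over total variance."""
    return var_additive / (var_additive + var_environment)


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Return (statistic, p-value) for a 1-degree-of-freedom test.

    The statistic is twice the difference in log-likelihoods. For 1 df,
    the chi-squared upper tail equals erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def phenotypic_correlation(h2_x, h2_y, rho_g, rho_e):
    """Standard decomposition of the phenotypic correlation of two traits
    into shared genetic (rho_g) and shared environmental (rho_e) parts."""
    genetic_part = rho_g * math.sqrt(h2_x * h2_y)
    environmental_part = rho_e * math.sqrt((1.0 - h2_x) * (1.0 - h2_y))
    return genetic_part + environmental_part


# Hypothetical log-likelihoods for the full and reduced models.
stat, p = likelihood_ratio_test(-100.0, -101.92)
```

In this illustrative case the statistic is 3.84, which sits at roughly the 0.05 significance boundary for 1 degree of freedom.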
All heritability and genetic correlation analyses included a GRM, to exploit both known and unknown relatedness in the sample. We estimated empirical kinship probabilities between pairs of individuals from all genome-wide SNP data using Linkage Disequilibrium Adjusted Kinships (LDAK) software (31), as described previously (24), to form the GRM. Any value of kinship in the GRM less than 0.05 was set to zero to minimize potential bias from using both closely and distantly related individuals. Using this method, heritability estimates derived from SNP data have been shown to be similar to those obtained from using identity-by-descent measures from pedigrees with known pedigree structures (9).
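The kinship thresholding step can be sketched as follows. This is an illustration only, using a small made-up matrix held as nested lists; the real GRM was produced by LDAK, and the function name is hypothetical.

```python
def threshold_grm(grm, cutoff=0.05):
    """Return a copy of the GRM with every entry below `cutoff` set to zero.

    `grm` is a square matrix given as a list of rows. Diagonal entries are
    close to 1 in practice, so only weak off-diagonal kinships are zeroed.
    """
    return [[value if value >= cutoff else 0.0 for value in row]
            for row in grm]


# Made-up 3 x 3 example: one close pair (0.25) and two near-zero pairs.
toy_grm = [
    [1.00, 0.25, 0.01],
    [0.25, 1.00, -0.02],
    [0.01, -0.02, 1.00],
]
sparse_grm = threshold_grm(toy_grm)
```

Only the 0.25 kinship survives among the off-diagonal entries, so the pair it links is treated as related and the other pairs as unrelated.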
Rank-based inverse normal transformed residuals were used in all analyses. All analyses included adjustments for age, sex, age2, and their interactions, age × sex and age2 × sex. Adjustments for age and sex were included in heritability and genetic correlation analyses to estimate the additive genetic effects of lipid classes/species and CVD traits after accounting for these covariates. In addition, interaction terms were included because the relationship between outcomes (lipid classes/species and CVD traits) and age differed between men and women (age × sex interaction). We also found that the relationship between outcomes and age was not linear, hence the inclusion of age2 and the age2 × sex interaction. In our heritability and genetic correlation analyses, these interactions consistently showed P-values <0.05. For individuals taking antihypertensive medications (n = 507), SBP and DBP measures were increased by 10 mmHg and 5 mmHg, respectively. In addition, lipid measures were corrected for the use of lipid-lowering medications, by matching individuals taking lipid-lowering medication (n = 108) and individuals not taking medication on age, sex, and BMI, and calculating a multiplicative correction factor. The resulting correction factors were as follows: HDL-C (1.068), LDL-C (1.123), total cholesterol (1.059), triglycerides (1.138), and lipid species (1.234). Sensitivity analysis (results not shown) indicated no marked change in heritability or genetic correlation estimates compared with either excluding these 108 individuals or using unadjusted measures. Only individuals with genotype, lipidomic, and phenotype data were included for analysis (n = 4,492). Manipulation of data and creation of residuals were performed in R 3.5.1 (32).
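The medication adjustments and the rank-based inverse normal transformation can be sketched in a few lines. The study performed these steps in R; the Python below is an illustration only. The blood pressure offsets and correction factors are those given above, applied here by multiplying the measured value (an assumption about how the factor is used). The rank offset of 3/8 (Blom) and the averaging of tied ranks are also assumptions, since the text does not specify them, and all names are hypothetical.

```python
from statistics import NormalDist

# Multiplicative correction factors reported in the text.
LIPID_FACTORS = {
    "HDL-C": 1.068,
    "LDL-C": 1.123,
    "total cholesterol": 1.059,
    "triglycerides": 1.138,
    "lipid species": 1.234,
}


def adjust_blood_pressure(sbp, dbp, on_antihypertensives):
    """Add 10 mmHg to SBP and 5 mmHg to DBP for treated individuals."""
    if on_antihypertensives:
        return sbp + 10, dbp + 5
    return sbp, dbp


def adjust_lipid(value, measure, on_lipid_lowering):
    """Multiply a treated individual's lipid measure by its factor."""
    return value * LIPID_FACTORS[measure] if on_lipid_lowering else value


def rank_inverse_normal(values, offset=0.375):
    """Map values (e.g. covariate-adjusted residuals) to normal scores.

    Ties receive the average of their ranks; offset=3/8 is the Blom choice.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value is tied.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
            for rank in ranks]
```

The transformation depends only on ranks, so the resulting scores are symmetric around zero whatever the scale or skew of the input.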
The false discovery rate (33) was used to correct for multiple testing, with q < 0.05 considered statistically significant.

Table 1 shows the study characteristics for the full cohort (n = 4,492). The average age of the study participants was 50.8 years, with mean BMI of 26.04. All CVD traits were significantly heritable (all q < 0.05) (Table 1). The most heritable CVD trait was HDL-C (h2 = 0.59, q = 7.2 × 10^-61) and the least heritable was WHR (h2 = 0.25, q = 2.0 × 10^-15). All lipid classes were significantly heritable (q < 0.05; Table 2). Heritabilities ranged from 0.14 [oxidized cholesteryl esters (oxCEs)] to 0.50 [monohexosylceramide (HexCer)]. The estimates of these class totals and their concentration are shown in Fig. 1. The median heritability was 0.34.
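The q-values reported throughout come from false discovery rate control. A minimal sketch of how such q-values can be computed is given here, assuming the Benjamini-Hochberg step-up procedure; the text cites reference 33 without naming the procedure, so this choice is an assumption, and the function name and example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Each p-value is scaled by m / rank, then a running minimum is taken
    from the largest p-value downward so q-values are monotone in rank.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Made-up P-values for four tests.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [value < 0.05 for value in q]
```

In this made-up example all four tests remain significant at q < 0.05, with q-values of 0.02, 0.04, 0.04, and 0.02 in input order.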

Genetic correlation
Genetic correlations between the lipid classes (sum of the lipid species within each class) and CVD traits are presented in Fig. 3 (see supplemental Table S2 for further detail). Approximately 57% of genetic correlations were statistically significant (q < 0.05). The largest genetic correlation was observed between the clinical lipid triglycerides and DG (rg = 0.89, SE = 0.02; q = 3.9 × 10^-22). DG and PC(O) were genetically correlated with all CVD traits, and alkenylphosphatidylcholine [PC(P)] was genetically correlated with all CVD traits, except for DBP. LDL-C was genetically correlated with all lipid classes (rg range: 0.13-0.82), apart from PS. In fact, PS was not genetically correlated with any CVD trait. Similarly, HDL-C was genetically correlated (rg range: 0.42 to 0.69) with all lipid classes, besides PS and AC.
We also investigated the genetic correlations between individual lipid species and CVD traits, identifying where class correlations were driven by specific lipid species (Fig. 4). Overall, 51% of the 4,768 pairwise lipid species-CVD trait genetic correlations were statistically significant (q < 0.05; supplemental Table S3). All lipid classes had at least one species significantly genetically correlated with one CVD trait. Additionally, we observed large heterogeneity in genetic correlations within each class. The largest genetic correlations were between triglycerides and members of the DG class (rg range: 0.47-0.89; q < 4.1 × 10^-9). Three PS lipid species showed significant genetic correlations with LDL-C [PS(40:6), rg = 0.32], [...]

[...] supplemental Table S1. For lipids, please refer to the abbreviations used in Fig. 1.

DISCUSSION
The current study provides evidence to support the role of additive genetic effects (significant heritabilities) in lipid species concentrations and the presence of genetic pleiotropy (shared genes) between these lipid species and CVD traits. This is the largest lipidomic study to address the heritability of lipids. The fine detail of our lipidomic profiling provides the opportunity to assess heritability not only at the lipid class level, but also to assess how the fatty acid composition affects heritability and genetic correlations with CVD risk factors. These data will become an important resource as we seek to better understand the relationship between genetic and environmental risk with CVD, both of which are partially mediated through lipid metabolism. In this work, we report three important findings: first, we have shown that the human lipidome (both lipid species and lipid classes) is significantly heritable. Herein, we report the heritabilities for the largest number of lipid species to date (596 species from 33 lipid classes) in the largest sample of individuals (n = 4,492) to date. Second, we have identified lipid classes and lipid species genetically correlated with CVD traits. Third, we have replicated previous studies showing CVD traits are heritable.
The range of heritabilities observed in this study for lipid species and classes (h2: 0.06-0.50) was comparable to that of earlier studies, with estimates ranging from 0.09 to 0.60 (19,20). This supports a role for additive genetic effects on lipid species levels, rather than only the effect of dietary intake or environmental factors.
Strong positive genetic correlations between the clinical measure of triglycerides and the mass spectrometric measure of TG and DG lipid classes were similar to those identified previously (20,21). In addition, we observed a novel strong positive genetic correlation between the clinical lipid triglycerides and phosphatidylglycerol (PG) species (rg: 0.64-0.82). A novel strong positive genetic correlation was also observed between HDL-C and species within the PC(P) class. Interestingly, HDL-C showed stronger positive correlations with lipid classes containing ether (O) or vinylether (P) bonds. Genetic correlations between the less unsaturated/shorter chain species and HDL-C were stronger than those with the highly unsaturated species. For example, the median genetic correlation with HDL-C within the PC(P) class was 0.57, compared with 0.46 in the PC(O) class and 0.42 in the PC class. Similarly, the median genetic correlation with HDL-C within the alkenylphosphatidylethanolamine [PE(P)] class was 0.45, compared with 0.43 in the PE(O) class and 0.17 in the PE class. We also observed marked differences in genetic correlations between lipids in the phospholipid classes, PC and SM. For example, the median genetic correlation between LDL-C and lipid species in the SM class was 0.47 (range: 0.30-0.67), compared with 0.27 (range: 0.14-0.63) in the PC class; the significant genetic correlations were always positive. However, 96% (25/26) of significant genetic correlations between lipids in the SM class and triglycerides were negative, while only 35% (6/17) of significant genetic correlations between PC lipids and triglycerides were negative. This is interesting as structurally these lipid classes are similar, in that they both contain a phosphorylcholine head group. However, unlike the SMs that contain only saturated and monounsaturated acyl chains, the PCs contain many species of polyunsaturated acyl chains and it appears that these species may drive the positive association with triglycerides.
While genetic correlations between the human lipidome and clinical lipid concentrations may not be surprising, we also identified genetic correlations of lipid classes and individual lipid species with measures of obesity (WHR and BMI) and blood pressure. A recent study of 5,537 participants from three Dutch population-based cohorts examined 90 plasma lipids and failed to identify any significant genetic correlations with blood pressure or WHR, but did observe several associations with BMI (21). In the current study, lipid species within the DG, PC, PC(O), PC(P), SM, and TG classes showed the most consistent genetic correlations with WHR and BMI (supplemental Fig. S1), with some species from PC and SM classes genetically correlated with BMI in an earlier study (21). Compared with the previous study, the lipidomic methodology in the current study allowed greater specificity of individual lipid species. In addition, Linkage Disequilibrium Score Regression is known to be less powerful than the variance-component methods used in the current study (34). In general, species within classes showed consistent directions of genetic correlations with BMI and WHR [for example, all significant genetic correlations between sulfatide (Sulf) species and BMI and WHR were negative, while all significant genetic correlations for AC species were positive]. However, in the current study some lipid species within the SM, PC, LPC, lysoalkylphosphatidylcholine [LPC(O)], PE, and ceramide [Cer(d)] classes were positively genetically correlated with BMI/WHR, while others were negatively genetically correlated with WHR/BMI, although no lipid species within any class was positively genetically correlated with WHR and negatively genetically correlated with BMI, or vice versa (supplemental Fig. S2). These differences are interesting, as they highlight that even within lipid classes, lipid species are genetically diverse, and may not share genes consistently with both WHR and BMI (two correlated traits). These genetic correlations between lipid species and non-lipid CVD traits indicate there are potentially pleiotropic genes that affect both lipid species and non-lipid CVD traits.
Finally, we have replicated previous studies showing CVD traits are heritable, with our heritability estimates (h2: 0.25-0.59) comparable to those published previously in a larger sample of the Busselton Family Heart Study (24) and in earlier independent studies of CVD traits (4-10).

Limitations of the study
There are several potential limitations in our study. First, our study cohort comprised individuals with European ancestry and therefore these findings may not be generalizable to other ethnic populations. However, the heritability estimates observed in this study are comparable to those of an earlier study of Mexican Americans (19). Second, lipid measures used within this study were measured only once, and therefore the lipid levels analyzed represent a snapshot in time and as such may be influenced by environmental factors that may also cluster within families, such as seasonal variations, illness, diet, and other lifestyle factors, that we were unable to take into account (35). Repeated lipid measures may result in less within-person variability; however, these are not available for this cohort. Third, serum samples were collected during the 1994/95 BHS survey and kept at −80°C until processing. Although the stability of lipid metabolites has generally been shown to be robust to long-term storage, minor alterations have been previously reported (36,37). However, as samples were collected at the same time point, any changes should be consistent between the serum samples and so will likely have minimal effect on our analyses. Finally, while we have identified the possibility of pleiotropic genes associated with the human serum lipidome and CVD traits, we have not identified specific genetic variants or genes, which is a logical next step in attempting to understand the shared genetic etiology between the human lipidome and CVD traits. This will be the subject of future analyses.
In summary, we have shown that the human lipidome is heritable and genetically correlated with CVD traits. We have identified novel intermediate lipid endophenotypes that are genetically correlated with CVD traits and, due to their shared genetic etiology, genetic dissection of these lipidome endophenotypes may help to identify causal genetic variants for CVD. Future work involves the analysis of genome-wide SNP and whole-genome sequence data in this cohort.