Abstract
Investigating the cross talk of different omics layers is crucial to understand molecular pathomechanisms of metabolic diseases like obesity. Here, we present a large-scale association meta-analysis of genome-wide whole blood and peripheral blood mononuclear cell (PBMC) gene expressions profiled with Illumina HT12v4 microarrays and metabolite measurements from dried blood spots (DBS) characterized by targeted liquid chromatography tandem mass spectrometry (LC–MS/MS) in three large German cohort studies with up to 7706 samples. We found 37,295 associations comprising 72 amino acids (AA) and acylcarnitine (AC) metabolites (including ratios) and 8579 transcripts. We applied this catalogue of associations to investigate the impact of associating transcript-metabolite pairs on body mass index (BMI) as an example metabolic trait. This is achieved by conducting a comprehensive mediation analysis considering metabolites as mediators of gene expression effects and vice versa. We discovered large mediation networks comprising 27,023 potential mediation effects within 20,507 transcript-metabolite pairs. Resulting networks of highly connected (hub) transcripts and metabolites were leveraged to gain mechanistic insights into metabolic signaling pathways. In conclusion, here, we present the largest available multi-omics integration of genome-wide transcriptome data and metabolite data of amino acid and fatty acid metabolism and further leverage these findings to characterize potential mediation effects towards BMI proposing candidate mechanisms of obesity and related metabolic diseases.
Key messages
-
Thousands of associations of 72 amino acid and acylcarnitine metabolites and 8579 genes expand the knowledge of metabolome-transcriptome associations.
-
A mediation analysis of effects on body mass index revealed large mediation networks of thousands of obesity-related gene-metabolite pairs.
-
Highly connected, potentially mediating hub genes and metabolites enabled insight into obesity and related metabolic disease pathomechanisms.
Similar content being viewed by others
Introduction
Advances in high-throughput metabolomics make the blood metabolome a readily available and cheap target to characterize large studies and cohorts and to conduct large-scale multi-omics investigations [1]. Obesity is a highly heterogeneous metabolic disease that increases risk for other non-communicable diseases [2]. The link between obesity and a wide range of metabolic disorders such as insulin resistance and type 2 diabetes mellitus (T2D) has long been established [3].
Metabolic alterations have been studied as causes and symptoms in these and other pathologies [4,5,6]. For example, the influence of branched-chain amino acids (BCAAs) on metabolic disease has been the subject of numerous studies [7,8,9]. Perturbations of acylcarnitine (AC) metabolism have also been studied in connection to insulin sensitivity and T2D and other metabolic disorders [10,11,12,13]. These studies provide a strong rationale of investigating the blood metabolome in relation to metabolic traits and diseases.
In the search for the genetic basis of metabolic traits, a large catalogue of genetic association studies identifying common and rare genetic variants influencing blood metabolites is available and is continuously extended [14,15,16,17,18]. Despite high heritability of metabolic traits, additional sources of variation, e.g., via transcriptional regulation, are of interest to create a more comprehensive picture of the variation of blood metabolites in humans. Additionally, instances of transcriptional regulation of metabolites as well as metabolic regulation of gene expression were described in relation to several diseases, such as a hypertriglyceridemia inducing effect of BCAAs via expression of a transcription factor [19, 20]. Although studies on the association of human blood metabolome and transcriptome are available [20,21,22,23], sample sizes are considerably lower as compared to genetic association studies. Moreover, heterogeneity of metabolomics platforms, such as NMR and liquid chromatography tandem mass spectrometry (LC–MS/MS), reduces comparability of results between studies. In this study, we perform the first and largest genome-wide gene expression/metabolite association meta-analysis of three independent studies, all utilizing the same metabolomics and microarray platforms. As an application of the resulting comprehensive association catalogue, we analyze possible mediation effects between gene expressions, metabolites, and BMI as an example metabolic trait to identify new candidates of causal mechanisms and relationships in obesity.
Material and methods
Study characteristics and design
We performed a genome-wide gene expression–metabolite association meta-analysis of the following three studies.
LIFE-Adult
For the population-based LIFE-Adult study, 10,000 age- and sex-stratified randomly selected individuals were recruited from the city of Leipzig, Germany [24, 25]. Phenotyping focused on civilization diseases and related risk factors. Samples were collected after an overnight fasting period. Paired gene expression (n = 3173) and metabolite (n = 9646) data were available for 3145 participants.
LIFE-Heart
Patients with suspected or confirmed coronary artery disease were collected at the Heart Center Leipzig, Germany [26]. All participants underwent coronary angiography. Confirmed coronary artery disease comprised stable disease as well as acute (AMI) or historical myocardial infarction. Paired gene expression (n = 4143) and metabolite (n = 5860) parameters were available for 3626 patients (including 1271 patients with AMI). A fasting period was not required prior to blood sampling. There was no sample overlap between the LIFE-Adult and LIFE-Heart studies.
Sorbs
Participants of the Sorb study were recruited from the self-contained Sorb population, an ethnic minority of Slavic origin from Upper Lusatia in Eastern Saxony, Germany [27, 28]. Blood samples were collected after an overnight fasting period of at least 8 h. Paired gene expression (n = 988) and metabolite data (n = 935) were available for 935 participants.
All studies comply with the ethical standards of the Declaration of Helsinki and with approval by the ethics committee of the University of Leipzig (LIFE-Adult: Reg. No 263-2009-14122009, LIFE-Heart: Reg. No 276e2005, Sorbs: Reg. No: 088-2005). LIFE-Heart was registered at ClinicalTrials.gov (No NCT00497887). All participants gave their written and informed consent.
Please see Supplemental Table S1 for the respective descriptive statistics of study participants.
Metabolite measurement and pre-processing
Sample preparation
In the LIFE studies, 40 µl of EDTA whole blood was directly spotted on filter paper (WS 903 Schleicher and Schüll, Germany), dried for 3 h, and stored and − 80 °C till analysis. In the Sorb study, EDTA whole blood was not available. Therefore, 40 µl of the cell residue after plasma centrifugation was spotted on filter paper WS 903.
Mass spectrometric analysis
Punched-out 3-mm diameter blood spots (corresponding to 3 µl of blood) were extracted using methanol containing isotope-labeled internal standards and butylated in 96 well plates, as is described elsewhere [29,30,31]. Each plate contained two quality control samples for the estimation of the inter-assay coefficient of variation. This method was validated for the identification of relative concentration differences between different studies [31]. An API 2000 or API 4000 tandem mass spectrometer (SCIEX, Darmstadt, Germany) was applied for flow injection analysis. Quantitative metabolite variable data of 62 metabolites (27 amino acids, 34 acylcarnitines, and free carnitine) were derived using ChemoView 1.4.2 (Applied Biosystems, Germany). Additionally, we calculated a biologically relevant sum (total ACs) and 34 ratios of metabolites for assessment of reaction equilibria within physiological pathways involving these metabolites. Consequently, 97 metabolite features were available for analysis. See Supplemental Table S2 for an overview and the formulas of the derived ratios.
Metabolite pre-processing
Metabolite pre-processing was performed study wise. The workflow was developed and described elsewhere [32]. There, we demonstrated that applying this workflow minimized between-study heterogeneity caused, e.g., by different study designs or sampling techniques. Briefly, we removed metabolite measurements above of mean + 5 × SD of log-transformed data to account for skewness of data. Metabolite measurements at zero were temporarily excluded for this step only. A maximum of 0.3% of measurements were removed per metabolite and cohort by these criteria. Then, metabolites were inverse normally transformed. We batch-adjusted transformed quantities using an empirical Bayes method implemented in the R-function “ComBat” of the “sva” package [33, 34].
Relationship adjustment in Sorb cohort
We adjusted for relatedness among Sorb subjects by fitting a generalized linear model as implemented in the “polygenic()” R-function of the “GenABEL” package [35]. The required kinship matrix was estimated using available SNP microarray data [27, 36].
Gene expression measurement and pre-processing
Sample collection, pre-analytics, and measurement
In LIFE-Heart, RNA of 4143 patients was extracted from peripheral blood mononuclear cells (PBMCs). Details of RNA extraction can be found elsewhere [37, 38]. In the Sorb study, 988 samples of PBMCs were isolated from blood samples. The full sample preparation procedure is described elsewhere [39]. For LIFE-Adult participants, 3173 whole blood samples were collected and stored at − 80 °C in Tempus Blood RNA Tubes (Life Technologies). See the Supplemental Methods for a detailed description of the sampling and measurement process in each study.
RNA measurement was performed using Illumina HT-12 v4 Expression BeadChips (Illumina, San Diego, CA, USA) and scanned on the Illumina iScan according to the manufacturer’s specifications [37].
Data pre-processing
We pre-processed the whole blood and PBMC gene expression profiles of the three studies separately, applying the workflow implemented in the “HT12ProcessoR” R package (https://github.com/holgerman/HT12ProcessoR) that uses Bioconductor functionality [40]. The workflow is described in detail in Kirsten et al. and in the Supplemental Methods [41]. In brief, the data are log2 transformed and quantile normalized. We adjusted for technical batch effects (sentrix barcode) by applying “ComBat” [33, 34]. We removed expression probes not expressed in more than 5% of samples as well as probes significantly associating with batches after Bonferroni correction. We also removed samples with low quality based on the number of expressed genes. Next, we applied a filter based on the Mahalanobis distance of specific Illumina probes designed for quality analyses. Lastly, we removed subjects with a Euclidean distance of expression values of larger than four times the range between the 25th and 75th percentiles (interquartile range (IQR)) from the median. We mapped probes to unique genes via the “Ingenuity Pathway Analysis” database (QIAGEN Inc.). Probes that could not be mapped to a gene were removed. Sample and probe exclusions due to quality reasons are detailed in the Supplemental Methods and Supplemental Table S9. We adjusted for relatedness among Sorb subjects using the same method as for the metabolite data (see the “Metabolite measurement and pre-processing” section).
Analysis of cofactors
In a previous study, we investigated the effects of 29 clinical and lifestyle factors on whole blood metabolites in our studies (Supplemental Table S1) [32]. It revealed that the following covariates have a relevant impact: age, BMI, sex, hours fasted (not available in the Sorbs), type 2 diabetes mellitus (T2D), hematocrit, and neutrophil percentage. Regarding blood gene expression, most important cofactors comprise age, sex, and the percentages of neutrophils and monocytes (Supplemental Fig. S1) [41]. We here considered the union of identified strong cofactors of metabolites and gene expression as potential confounders which is liberal in the sense that relevant confounders must affect both omics layers. We refrained from adjusting for BMI in the association analysis in order to preserve effects on metabolites and gene expression for the subsequent mediation analysis. For the same reason, we also did not adjust for diabetes status due to its high correlation with BMI. Thus, six factors were considered as covariates in the regression analysis. Since metabolite and gene expression data of AMI patients of LIFE-Heart are clearly affected by the acute situation, we decided to analyze these patients separately, i.e., two analysis groups were defined in LIFE-Heart, namely AMI and non-AMI.
Single study association analysis
All analyses were performed on the subset of study subjects with complete gene expression, metabolite, and covariable information. We calculated associations of gene expression with metabolite levels for each study separately using multiple linear regression with the gene expression as dependent and the metabolite and covariates as independent variables. We used the functions “lmFit()” and “eBayes()” of the “limma” R package to carry out the association analysis in all studies [42].
As recommended [43, 44], multiple testing correction is performed hierarchically, i.e., we corrected p-values for multiple testing within each metabolite first (local adjustment). In a second step, smallest adjusted p-values of each metabolite are adjusted across metabolites (global adjustment). We used the Benjamini–Hochberg correction [45] for both local and global adjustment to control the false discovery rate (FDR) at 5% (R-function “p.adjust()” with “method = ”BH””). Analysis steps are outlined in Supplemental Fig. S2.
Meta-analysis
Single study association summary statistics from the four subgroups (LIFE-Adult, LIFE-Heart non-AMI, LIFE-Heart-AMI, Sorbs) were meta-analyzed by a random effects model (REM) to account for the heterogeneity of sample processing and tissues used for gene expression analysis [46]. Study heterogeneity was assessed by I2 metrics. Again, we applied hierarchical multiple testing correction to determine significance of metabolite to gene expression association meta-analysis results. Only association results available in more than one study were meta-analyzed, resulting in a total of 26,042 probes which could be mapped to 17,735 unique genes.
To assess the extent of gene expression/metabolite associations, we estimated the proportion of null p-values (η0) from the empirical distribution of p-values. The proportion of non-null p-values was calculated as η1 = 1 − η0. We estimated η0 using “fdrtool” function “pval.estimate.eta0()” with the argument “method = ”smoother”” for each metabolite using REM p-values [47, 48].
Sample sizes of studies (see Table 1) allow detection of explained variances of metabolite/gene expression relationships of 4.4%, 3.2%, 1.7%, 1.2%, and 0.5% within the Sorb study, LIFE-Heart (AMI), LIFE-Heart (non-AMI), LIFE-Adult, and the overall meta-analysis, respectively, with a power of 80% and a significance cutoff of α = 2.0 × 10−8 accounting for a multiple testing correction of approximately 2.5 million tested metabolite/gene expression pairs (PASS 2020).
In the association meta-analysis, we jointly analyzed gene expression measurements from whole blood, as well as PBMCs. To assess comparability of gene expression source tissues (whole blood in LIFE-Adult on one hand and PBMCs in LIFE-Heart and the Sorb study on the other hand), we conducted a separate meta-analysis considering only PBMC gene expression data (LIFE-Heart AMI/non-AMI and the Sorb study). We compared heterogeneity in the analyses (whole blood and PBMCs vs. PBMCs only) by comparing respective I2 estimates and respective random effects meta-analysis estimates.
Mediation analysis
We tested for mediation effects considering the association triangles of gene expression, metabolites, and BMI. At this, BMI was always considered as outcome, while gene expression and metabolites were considered as exposure and mediator, respectively, or vice versa (Supplemental Fig. S3). The effect estimate of the outcome-mediator association conditional on the exposure is called the indirect effect, whereas the effect of the outcome-exposure association conditional on the mediator is called the direct effect [49]. The sum of direct and indirect effects is considered the total effect. Conceptually, a mediation analysis is a so-called third-variable analysis, in which the effect of a variable (called mediation effect of the mediator) is analyzed in respect to the relationship of two other variables, i.e., the association between exposure and outcome. A third-variable effect can only be interpreted as a mediation effect if the underlying causal/temporal order is assumed to be correct [50]. Thus, identified relationships can only be considered as candidates for causal/mechanistic effects.
Required association statistics for calculating mediation effects (resulting from the regression analyses (1) BMI ~ metabolite + covariates, (2) BMI ~ gene expression + covariates, (3) metabolite ~ gene expression + covariates, (4) gene expression ~ metabolite + covariates, and (5) BMI ~ metabolite + gene expression + covariates) were determined by meta-analyses of the three studies (see the “Meta-analysis” section) as follows. Single study associations were calculated by linear models (R function “lm()” in the “stats” package). We adjusted for the relevant covariates as described above (see the “Analysis of cofactors” section). BMI was log transformed. Continuous covariates and all gene expression values were scaled to mean µ = 0 and standard deviation σ = 1. Metabolites were scaled as a result of the inverse normal transformation. Single study effects were again combined by random effects meta-analysis. We applied correction for multiple testing using the same hierarchical procedure with the predictor variable as the family variable and local as well as global adjustment using the Benjamini–Hochberg procedure [44].
Mediation analysis is restricted to all pairs of gene expression probes and metabolites that were significantly associated in the meta-analysis. Since interactions between mediator and exposure regarding BMI outcome can result in invalid results [51, 52], we searched for such pairs by linear model analyses: BMI ~ metabolite + gene expression + metabolite*gene expression + covariates. Interaction effects of single studies were again combined via meta-analysis. Pairs with significant interactions were removed (hierarchical FDR = 5%).
Next, exposure and outcome are required to be significantly associated (total effect, γ1) or a significant direct effect (β1) should be present (effect of exposure on outcome adjusted for the mediator) [50]. We like to remark that in the latter case, only partial mediation could be tested, i.e., whether the direct effect size of the exposure to the outcome is reduced when adjusting the outcome for the mediator. In the first scenario, also complete mediation could be tested, i.e., absence of a direct effect. Due to the complexity of human metabolic pathways, complete mediation scenarios are expected to be rare [53]. Thus, we focused on partial mediation in the present analysis.
The mediation effect is determined by the product of beta estimates of the effect of the exposure on the mediator (α) and the mediator on the outcome (β), i.e., βmediation = α × β [54,55,56]. In more detail, the product of the two z statistics of the beta estimates is calculated and compared to the distribution of the product of two standard normal distributions representing the expected distribution under the null hypothesis and used to calculate a p-value of the mediation effect. The R-function “pprodnormal()” of the “RMediation” package [57] was used for that purpose. Confidence intervals of the mediation effects were computed using the function “medci()” with Monte Carlo method as recommended [58]. Correction for multiple testing was again performed in a hierarchical way considering the exposures as families for local adjustments.
Finally, the proportion of the total effect that was mediated is calculated, i.e., PM = α × β/α × β + τ’, which is the ratio of the indirect effect of the exposure on the outcome and the total effect (βtotal = α × β + τ’, i.e., sum of indirect (αβ) and direct (τ’) effects [56, 59]). This quantity will be used to distinguish between mediation directions, i.e., the mediated proportion of the total effect (PM) will be compared for the two possible causal effect directions gene expression → metabolite (GE → M), respectively, metabolite → gene expression (M → GE), if both mediations are significant (Supplemental Fig. S3). In more detail, we classified mediations as “strongly mediated metabolite effects” (by gene expression effects) if the following conditions hold: (1) the metabolite effect on BMI was significantly mediated by the gene expression probe, (2) for the mediation effect it holds PM ≥ 0.2, and (3) for the reverse mediation it holds PM ≤ 0.2. Conversely, “strongly mediated gene expression effects” (by metabolite effects) are classified using the same conditions, but for mediations of gene expression effects by metabolite effects. With “strong bi-directional mediations,” we denote significant mediation effects and PM ≥ 0.2 for both directions. We classified mediations as “weak mediation” when they were significant but none of the above classifications are met.
To demonstrate utility and to increase confidence in our results, we explored mediated effects of known obesity-associated genes. For this purpose, we created a catalogue of 52 genes reported for association with BMI (Supplemental Table S3) [60, 61]. We screened our mediation results for these genes and selected mediations with a high mediated effect proportion (PM ≥ 0.2). To assess the functional plausibility of the mediations, we screened the literature for additional functional experiments considering respective transcript-metabolite pairs.
Results
Comparison of single study results
We calculated associations of 26,042 gene expression probes (17,735 annotated genes) and 97 metabolites (45 AAs, 44 ACs, and 8 mix quotients of ACs and AAs and the sum of ACs) in up to four data sets.
Due to the largest sample size, most significant associations were found in LIFE-Adult, while only a few associations were found in the Sorbs with the lowest sample size (Table 1).
A total of nine associations involving transcripts of four genes (ABCG1, ALAS2, HBA1/HBA2, and HBB) and five metabolites (total acylcarnitines, acetylcarnitine (C2), propionylcarnitine (C3), leucine/isoleucine, Q19:(Leu/Ile)/C3) were significant (hierarchical FDR = 5%) in all four studies, all with the same direction of effect. These were the associations of ABCG1 with leucine/isoleucine, HBA1/HBA2 with C2, ALAS2 with C2, HBA1/HBA2 with C3, ALAS2 with C3, HBB with C3, HBA1/HBA2 with total acylcarnitines, ALAS2 with total acylcarnitines, and ALAS2 with the ratio of leucine/isoleucine and C3 (see Supplemental Table S4 for an overview and Online Supplemental Table OS1 for the full summary statistics).
37,295 gene expression/metabolite associations found in meta-analysis
An overview of metabolite-transcript associations significant in meta-analysis is shown in Fig. 1. Overall, 37,295 metabolite-transcript associations involving 72 (74.2%) metabolites and 8579 (48.4%) mapped genes were significant at hierarchical FDR = 5%. A total of 25 metabolites (8 AAs, 16 ACs, and 1 mix quotient) did not show any significant associations. Most of the metabolites without significant associations also showed a high percentage of (excess) zeros in the metabolite data, implying values below the detection limit and low abundance, and with it, low power to detect associations. Across-study heterogeneity of the associations as assessed by I2 statistics was moderate (Fig. 1). Only 262 (0.7%) of significant associations showed a relevant heterogeneity I2 ≥ 0.75.
The overall fraction of non-null hypothesis (η1) was estimated from the pooled amount of all tested hypothesis resulting in η1 = 0.076, i.e., we estimated that 7.6% gene expression/metabolite pairs are associated. When estimated separately for metabolites, η1 ranged from η1 = 0 for a total of 26 metabolites (5 AAs, 6 quotients, 15 ACs) to η1 = 0.313 for octadecenoylcarnitine (C18:1), i.e., this metabolite is likely regulated by many genes. Of all metabolites, C2 exhibited the largest number of associated transcripts (n = 2187). In terms of absolute standardized effect estimate (\(\widehat{\beta }\)), the eight strongest associating metabolites all belonged to the class of acylcarnitines. Alanine showed the strongest association among amino acids. Supplemental Table S2 shows the top transcripts for each metabolite, and the full summary statistics of the meta-analysis are available at Online Supplemental Table OS2.
On the gene expression level, most associations were observed for BCL11A (BAF chromatin remodeling complex subunit BCL11A), a C2H2 type zinc-finger protein, associating with 28 metabolites (13 AAs, 13 ACs, and 2 mix quotients; Supplemental Table S5). The gene with the strongest metabolite associations was ALAS2 (5′-aminolevulinate synthase 2), whose associations with Sarc, C2, Q34, total ACs, and C3 exhibited the highest standardized effect estimates of all associations (Supplemental Fig. S4, Online Supplemental Table OS2).
The majority of significantly associating transcripts (71.2%) are associated with more than one metabolite. As such, the top 1% of transcripts (83 genes) associating with the most metabolites were responsible for ~ 4.6% of the total associations representing association hubs (Fig. 2A). When focusing on the transcripts with more than 17 unique metabolite associations, we obtain a bi-partite network including transcripts of 46 genes as displayed in Fig. 2B. A large number of these association hub transcripts cluster around free carnitine (C0), acetylcarnitine (C2), the ratio of leucine|isoleucine/propionylcarnitine (Q19:(Leu|Ile)/C3), and the sum of acylcarnitines (AC-total). Visual inspection of the network reveals further interesting associations, e.g., the association of ATP Binding Cassette Subfamily A Member 1 (ABCA1) gene expression with proline and the proline-derived ratios glutamic acid/proline (Q17:Glu/Pro) and proline/ornithine (Q25:Pro/Orn). These associations suggest a potential cholesterol lowering effect of proline by ABCA1 downregulation, previously observed for a phenylalanine-proline dipeptide [62].
Comparability of whole blood and PBMC gene expression associations
We assessed the heterogeneity of association estimates obtained by meta-analyzing studies with gene expression from different tissues (whole blood in LIFE-Adult vs. PBMCs in LIFE-Heart AMI/non-AMI and the Sorb study) since this factor could not be accounted for in single study analyses but could be of biological relevance. The heterogeneity estimates (I2) of associations of the PBMC-only analysis are significantly smaller than those of the joint tissue analyses (paired one-sided Wilcoxon signed-rank test, p < 2.2 × 10−16). Distributions are shown in Supplemental Fig. S7, panel A–C. In the PBMC-only analysis, 83.8% of all associations have a heterogeneity estimate I2 ≤ 0.5, while in the main analysis including whole blood gene expression, 78.9% of all associations exhibit a heterogeneity estimate I2 ≤ 0.5. However, effect estimates of associations significant in at least one of the two analyses correlate highly across all metabolites (Pearson’s ρ = 0.96). In view of these observations, we decided to rely on the overall meta-analysis results.
BMI mediation analysis reveals highly connected networks of transcripts and metabolites
We performed mediation analysis of effects of the potential causal chains: metabolite \(\to\) gene expression \(\to\) BMI (M \(\to\) GE \(\to\) BMI), i.e., the gene expression mediates metabolite effects on BMI, and alternatively, gene expression \(\to\) metabolite \(\to\) BMI (GE \(\to\) M \(\to\) BMI), i.e., metabolite levels mediate gene expression effects on BMI. Single study and meta-analyzed association statistics computed for testing mediation assumptions and for calculating mediation results are available as Online Supplemental Tables OS3 and OS4. In total, 33,204 GE-metabolite pairs comprising 65 metabolites and 8205 transcripts were considered in our mediation analysis, i.e., fulfilled our association requirements (see the “Material and methods” section). Of those, 676 transcript-metabolite pairs meet the requirements for testing for unidirectional mediation with GE as exposure only (GE \(\to\) M \(\to\) BMI). A total of 27,427 pairs were tested for unidirectional mediation with metabolites serving as exposure (M \(\to\) GE \(\to\) BMI) and 5101 pairs for testing for bi-directional mediation (i.e., 10,202 mediations tested). Thus, a total of 38,305 mediations were tested. Mediation summary statistics are provided at Online Supplemental Table OS5.
Correction of p-values on global and family level (the exposure is considered the family level) resulted in a total of 27,023 significant mediations (hierarchical FDR = 5%) involving 20,507 transcript-metabolite pairs. There, 375 mediations, corresponding to 349 transcript-metabolite pairs, were unidirectional with gene expression as exposures, only. Conversely, we found 15,453 transcript-metabolite pairs with unidirectional mediations with metabolites as exposure, only. Finally, 4705 transcript-metabolite pairs were bi-directional, i.e., were significant for mediations tested in both direction. An overview of available features and tested mediations is shown in Fig. 3. Notably, only eight transcripts and none of the metabolites were involved in significant mediations exclusively as exposures and not as mediators. The vast majority of 5351 transcripts involved in potential mediations were either both, mediator as well as exposure, or mediators only. Among the significant mediations on BMI, the five most frequently mediating or mediated metabolites, acetylcarnitine (C2), the sum total acylcarnitines (AC-total), 3-hydroxy-butyryl-carnitine (C4OH), octadecenoylcarnitine (C18:1), and propionylcarnitine (C3), were all metabolites involved in the oxidation of fatty acids. A summary of all mediations per metabolite is given in Supplemental Table S6. The five most frequently mediated or mediating transcripts were BCL11A, ABCG1, FERMT3, NRCAM, and CCDC50. As expected, the top mediated or mediating transcripts showed associations with multiple metabolites. In particular, this applies to BCL11A, CCDC50, and ABCG1 which are also among the top five transcripts regarding number of metabolite associations (Supplemental Fig. S4). A summary of all mediations per transcript is provided in Supplemental Table S7.
We provide a network of the 250 largest mediation effects in Supplemental Fig. S8. A web application available at https://apps.health-atlas.de/mediation-net/ can be used to display networks of mediation effects on BMI for user-specified metabolites and transcripts. The displayed network can be further filtered based on the proportion of mediated effect sizes (PM). There, nodes represent individual metabolites and transcripts, and edges represent mediations. As an example, significant mediations involving the metabolite acetylcarnitine (C2) comprised 1402 transcripts. Filtering mediations with C2 based on a cutoff of PM ≥ 20% results in a more manageable network of strong mediations comprising nine transcripts. These include plausible genes such as the cholesterol efflux transporter ABCG1 with a described role in the development of obesity, metabolic disease, and atherosclerotic lesions [64, 66].
Metabolites more frequently serve as strong mediators
To narrow down on interesting and biologically plausible mediations, we consider mediations with a high proportion of the mediated effect PM ≥ 20% and named them strong mediation effects. Among the 5352 significant mediations with gene expression as exposure, a total of 346 (6.5%) mediations were classified as “strongly mediated gene expression effects” (by metabolite effects). From the 21,671 mediations with metabolites as exposure, 57 (0.3%) were classified as “strongly mediated metabolite effects” (by gene expression effects). Mediations with high PM only occurred in one direction, i.e., even in cases of bi-directionally significant mediations, at most one of the mediation directions were classified as strong (Fig. 4B). The generally larger effects of BMI-metabolite associations were mediated frequently by gene expression effects, but to a smaller extent (M \(\to\) GE \(\to\) BMI: large βmediation, small PM, Supplemental Fig. S9). Conversely, the generally smaller effects of BMI gene expression associations were mediated to a larger extent by metabolites (GE \(\to\) M \(\to\) BMI: small βmediation, large PM, Fig. 4A, Supplemental Fig. S9). Statistics of pairs with strong mediation effect are provided at Online Supplemental Table OS5. Separating the strong mediations by their directions, the most frequent mediator genes were AHSP, ABCG1, ALAS2, HBD, and CA1. AHSP, ALAS2, and HBD are related to hemoglobin formation, and CA1 codes for an erythrocyte-specific protein. Associations and mediations of these transcripts might be primarily attributed to additional blood parameters not adjusted for in the regression models. ABCG1 codes for an important macrophage cholesterol efflux transporter, necessary for plasma HDL-C formation that has been extensively studied in connection to obesity and metabolic disease [70]. Effects of ABCG1 gene expression strongly mediate effects of various short-chain acylcarnitines (acetylcarnitine (C2), butyrylcarnitine (C4), 3-hydroxy-butyryl-carnitine (C4OH), isovalerylcarnitine (C5), hexanoylcarnitine (C6)), as well as of leucine|isoleucine (Leu|Ile) and hydroxyproline (OH-Prol).
Strong mediations can help to characterize the negative association of ABCG1 with circulating Leu|Ile (β = − 0.247, SE = 0.042, p = 2.83 × 10−09), which was also reported by Bartel et al. [20]. Both ABCG1 gene expression (β = − 0.035, SE = 0.005, p = 4.95 × 10−12) and Leu|Ile levels (β = 0.037, SE = 0.010, p = 1.28 × 10−4) are associated with BMI in our data and show strong mediation of the metabolite effect via gene expression (βmediation = 0.008, CI95% = [0.005, 0.011], p = 2.06 × 10−10, PM = 0.21). Bartel et al. discuss potential co-regulation between BCAAs and the cholesterol metabolism at the transcriptional level including ABCG1 gene expression and respective implications for obesity associations [20].
Mediations provide mechanistic insights for genetically regulated obesity genes
In this section, we focus on genes reported for genetic associations with BMI. As an example, application of our catalogue of mediation effects, we provide additional functional evidence underlying these associations. We retrieved 51 such genes from Srivastava et al. [61]. Additionally, we included one gene that was both genetically associated to metabolites (BCAA) and T2D, namely PPM1K [60]. In our meta-analysis, we detected 87 associated metabolite-transcript pairs of 27 catalogue genes with 32 metabolites. Of these associating pairs, 80 qualified for mediation analysis, i.e., at least one of the participating features showed an association with BMI. The pairs comprised 25 of the originally identified genetically regulated obesity genes. Figure 5 displays all 93 significant mediations, involving 18 transcripts and 24 metabolites (Supplemental Table S8). Of these mediations, five, involving IRF4, FANCL, and WWOX, were classified as “strongly mediated gene expression effects” (Table 2). For instance, IFR4 was mediated by propionylcarnitine (C3), sarcosine, and the ratio of sarcosine/glycine (Q27). IRF4 codes for the interferon regulatory factor 4 and has a diverse role as repressor of adipogenesis and as a negative regulator of the inflammatory response to obesity through M2 macrophage polarization [71]. Experiments with adipocyte-specific Irf4−/− knock outs in mice showed associations with increased weight gain and insulin resistance. The estimated total effect of IRF4 gene expression on BMI (βtotal = − 0.01) was mediated with a PM = 0.24 by sarcosine, with a PM = 0.26 by Q27:Sarc/Gly and with a PM = 0.25 by C3 in our data. C3 is related to BCAA metabolism and oxidation of odd-chain fatty acids, as well as the glycine metabolism, for which sarcosine is a precursor. C3 is also closely linked to insulin sensitivity playing a role in obesity-related disease [72, 73]. Additionally, WWOX has been investigated in its role in metabolic disorders such as insulin resistance [74].
Discussion
In this study, we performed a genome-wide gene expression/metabolome association meta-analysis of three cohorts with up to 7706 subjects using the same gene expression and metabolome technology across studies. We further characterized the relationships of individual transcripts and metabolites by analyzing mediation of gene expression or metabolite effects towards BMI. In our meta-analysis, we identified 37,461 significant associations including 72 metabolites and metabolic ratios and gene expression levels of 8579 genes. Previous association studies of blood gene expression and circulating metabolites were limited in sample size finding up to 1400 associations [20, 21]. Thus, we significantly extended the catalogue of expression/metabolite associations.
Furthermore, we found that 20,507 pairs of 5351 transcripts and 53 metabolites showed third-variable effects that present potential mediations towards BMI in 27,023 instances considering the two mediation pathways M → GE → BMI and GE → M → BMI. Of those mediations, 21,671 followed the first mediation pathway, while 5352 followed the second one. Four thousand seven hundred five transcript-metabolite pairs showed potential bi-directional mediation of effects. Transcripts of the genes BCL11A, ABCG1, FERMT3, NRCAM, and CCDC50 were the most frequently mediated or mediating transcripts. ATP binding cassette subfamily G member 1 (ABCG1) encodes a cholesterol efflux transporter in macrophages and monocytes. Impaired monocyte cholesterol efflux in type 2 diabetics was linked to ABCG1 gene expression [63]. The overall potential role of ABCG1 and reduced cholesterol efflux in adipogenesis and its wider relevance in cardiometabolic disease and atherosclerotic lesions has been extensively studied [64,65,66]. The kindlin-3 protein is a product of the FERMT3 gene, which, mostly expressed in hematopoietic cells, has a function in hemostasis regulation [67]. NRCAM codes for a cell adhesion molecule that was previously reported downregulated in aged CD8+ T cells and investigated for its role in immunosuppression [68, 69]. We classified 403 potential mediations as strong by a PM cutoff of 0.2. According to this definition, 346 strong mediations of gene expression effects (by metabolite effects, i.e., GE → M → BMI) and 57 strong mediations of metabolite effects (by gene expression effects, i.e., M → GE → BMI) were detected. Of note, we found no strong bi-directional mediations, i.e., strong mediations clearly favored one of the mediation directions. Thus, strong mediations represent clear candidates of unidirectional causal effects by not only increasing confidence in the effect estimates but also allowing assigning a direction of causality.
Several transcripts formed association hubs, associating with up to 28 metabolites (Fig. 2). The five most broadly associating transcripts, BCL11A (28 metabolites), CCDC50 (25 metabolites), ABCG1 (24 metabolites), PCCA (23 metabolites), and ACADVL (22 metabolites), cover a variety of functions of blood cell differentiation and composition, immune function, fatty acid oxidation, and cholesterol transport. The top association hub was the transcript of BCL11A, coding for the fetal γ-globin silencing factor protein “B-cell lymphoma/leukemia 11A.” The protein controls hemoglobin switching during maturation by repressing expression of γ-globin [75]. Elevated expression of BCL11A in human beta cells was reported to be negatively correlated to insulin secretion, contributing to an increased risk of type 2 diabetes development, making this gene a prominent target for further studies [76,77,78,79]. Induction of γ-globin synthesis was previously demonstrated by administration of short-chain fatty acids and derivatives, specifically the four-carbon butyrate and the 2-carbon acetate [80, 81]. In our data, BCL11A gene expression was negatively associated with BMI and a range of acylcarnitines (C2, C3, C4OH, and total acylcarnitines). In addition, we observed mediations of its effects on BMI by short-, medium-, and long-chain acylcarnitines as well as free carnitine (C0). This observation is consistent with the results reported by Pace et al. and Liakopoulou et al., making BCL11A a potential missing link in this mechanism, i.e., mediation of metabolite effects on BMI by BCL11A gene expression [80, 81]. We propose that increased levels of fatty acids decrease expression of BCL11A promoting γ-globin expression. The subsequently induced synthesis of fetal hemoglobin affects BMI or BMI-associated phenotypes such as type 2 diabetes and insulin secretion.
Our analysis was aimed at discovering mediations of effects towards BMI. Principally, the assumed causal order cannot be proven with statistical methods of mediation analysis but requires prior knowledge or interventional experimental designs [56]. When testing for mediation in a prescribed pathway, e.g., M \(\to\) GE \(\to\) BMI, one essentially exploits the correlation structure between the three variables M, GE, and BMI. However, this correlation structure can also be produced by alternative causal relationships, in the above case, for example, by the alternative chain BMI \(\to\) GE \(\to\) M [82]. Fairchild and McDaniel recommend to combine the statistical evidence from mediation analysis with biological information to infer the underlying causality [56]. Our results need to be considered in the light of this limitation. Indeed, in some instances, there is evidence that the causal chains analyzed here are reversed, i.e., BMI serves as exposure rather than outcome. Obesity can change the cellular composition of adipose tissue, accompanied by an increased infiltration by macrophages and other immune cells, which is linked to increased cytokine secretion, predominantly TNF-α, as well as a changed T-cell population, leading to low-grade and chronic inflammation [83, 84]. For example, among obesity genes reported in literature, IRF4 has a pronounced role in obesity-induced inflammation, supporting the BMI \(\to\) GE causality [71]. Another example of a potentially inverse causal order (BMI \(\to\) GE \(\to\) M) is mediations involving the transcript abundance of ABCG1. Previous studies showed that ABCG1 suppression in macrophages increased cytokine levels, biased macrophage polarization towards pro-inflammatory M1 macrophages, and was a cause of foam cell formation that accelerated atherosclerosis [64, 85,86,87]. While mediation results suggest that ABCG1 gene expression is a strong mediator of nine BMI-metabolite associations, the reverse causality of BMI \(\to\) GE is supported by previous research. Specifically, the impact of adipokines on macrophage polarization and, subsequently, ABCG1 gene expression differences in M1 and M2 macrophages has been previously reported [88].
An example where previous research supports the tested mediation direction involves BMP6 (coding for the bone morphogenetic protein 6) gene expression, where increased BMP6 activity induces the transcription factor PPARγ in adipocytes, leading to enhanced glucose uptake and has therefore been indicated as a drug target to treat insulin resistance [89]. We observed mediation effects of BMP6 via free carnitine (C0) and various acylcarnitines such as propionylcarnitine (C3). BMP’s documented effect on glucose uptake, and subsequently, the glycolysis rate also affects the fatty acid metabolism, making the observed mediation direction GE \(\to\) M \(\to\) BMI plausible.
Our study has several limitations. First, we focused on the link between gene expression and metabolites which is understudied in comparison to, e.g., genetic associations of gene expression or metabolite features. Our mediation results need to be considered with caution because a causality cannot be proven by this type of analysis, in particular with respect to the direction of the causal path as discussed above. Additionally, due to the complexity of omics interaction, the presence of unmeasured confounding cannot be fully excluded in our analyses. This might explain some of the bi-directional mediation effects observed. On the other hand, strong mediations appear to show a clear direction preference increasing confidence in these findings. In summary, while our analysis approach provides a necessary step to establish causal chains, further experimental approaches are required to validate these findings. As a major application of our catalogue, we considered BMI as an endpoint due to its wide availability and extensive available research although this phenotype is only a proxy to metabolic diseases. The available metabolite panel, limited to amino acids and acylcarnitines, represented only a small part of the human metabolome in one tissue with limited generalizability. However, the available metabolites were all fully characterized, and the measurement methodology is highly standardized, minimizing variability attributable to technical sources [30, 31]. We demonstrate in a previous study [32] that our metabolite panel is affected by fasting status, which differed between studies, i.e., participants of LIFE-Adult and the Sorbs study were at fasting, while participants of LIFE-Heart were not. We adjusted our analyses for fasting hours, but this information was not available for the Sorbs study contributing to the heterogeneity of results. Lastly, the usefulness of BMI as a risk factor for metabolic diseases has long been debated. More meaningful classification were proposed, e.g., by measuring perturbations of the metabolism and distinguishing between metabolically healthy and unhealthy obesity [90]. Additionally, by not adjusting for T2D in the association analysis, mediation results on BMI might comprise those attributable to T2D or other BMI-associated metabolic traits.
In conclusion, by our study, we considerably expand human circulating blood transcriptome-metabolome associations using the same phenotyping platforms in three large cohorts. We further annotated this catalogue by a mediation analysis investigating the impact of these omics layers on BMI. This analysis provides both plausible transcript-metabolite pairs whose association to BMI warrants further functional investigation and a broader impression of the interplay between metabolome and transcriptome in relation to a common phenotype such as BMI, as well.
Data availability
All single study and meta-analyzed summary statistics are publicly available online as Online Supplemental Tables OS1-5 at https://doi.org/10.5281/zenodo.7104774.
Change history
25 October 2023
Tracked changes removed in Supplementary Material 1.
References
Seldin M, Yang X, Lusis AJ (2019) Systems genetics applications in metabolism research. Nat Metab 1(11):1038–1050. https://doi.org/10.1038/s42255-019-0132-x
Sulc J, Winkler TW, Heid IM, Kutalik Z (2020) Heterogeneity in obesity: genetic basis and metabolic consequences. Curr Diab Rep 20(1):1. https://doi.org/10.1007/s11892-020-1285-4
Blüher M (2019) Obesity: global epidemiology and pathogenesis. Nat Rev Endocrinol 15(5):288–298. https://doi.org/10.1038/s41574-019-0176-8
Newgard CB (2017) Metabolomics and metabolic diseases: where do we stand? Cell Metab 25(1):43–56. https://doi.org/10.1016/j.cmet.2016.09.018
Qin Y, Meric G, Long T, Watrous J, Burgess S, Havulinna A, Ritchie SC, Brozynska M, Jousilahti P, Perola M et al (2020) Genome-wide association and Mendelian randomization analysis prioritizes bioactive metabolites with putative causal effects on common diseases. https://www.medrxiv.org/content/10.1101/2020.08.01.20166413v1. https://doi.org/10.1101/2020.08.01.20166413
Heindel JJ, Blumberg B, Cave M, Machtinger R, Mantovani A, Mendez MA, Nadal A, Palanza P, Panzica G, Sargis R et al (2017) Metabolism disrupting chemicals and metabolic disorders. Reprod Toxicol (Elmsford, N.Y.) 68:3–33. https://doi.org/10.1016/j.reprotox.2016.10.001
Newgard CB (2012) Interplay between lipids and branched-chain amino acids in development of insulin resistance. Cell Metab 15(5):606–614. https://doi.org/10.1016/j.cmet.2012.01.024
Bloomgarden Z (2018) Diabetes and branched-chain amino acids: what is the link? J Diabetes 10(5):350–352. https://doi.org/10.1111/1753-0407.12645
Arany Z, Neinast M (2018) Branched chain amino acids in metabolic disease. Curr Diab Rep 18(10):76. https://doi.org/10.1007/s11892-018-1048-7
Mihalik SJ, Goodpaster BH, Kelley DE, Chace DH, Vockley J, Toledo FGS, DeLany JP (2010) Increased levels of plasma acylcarnitines in obesity and type 2 diabetes and identification of a marker of glucolipotoxicity. Obesity (Silver Spring, Md.) 18(9):1695–1700. https://doi.org/10.1038/oby.2009.510
Adeva-Andany MM, Calvo-Castro I, Fernández-Fernández C, Donapetry-García C, Pedre-Piñeiro AM (2017) Significance of l-carnitine for human health. IUBMB Life 69(8):578–594. https://doi.org/10.1002/iub.1646
Bene J, Hadzsiev K, Melegh B (2018) Role of carnitine and its derivatives in the development and management of type 2 diabetes. Nutr Diabetes 8(1):8. https://doi.org/10.1038/s41387-018-0017-1
Bene J, Szabo A, Komlósi K, Melegh B (2020) Mass spectrometric analysis of L-carnitine and its esters: potential biomarkers of disturbances in carnitine homeostasis. Curr Mol Med 20(5):336–354. https://doi.org/10.2174/1566524019666191113120828
Illig T, Gieger C, Zhai G, Römisch-Margl W, Wang-Sattler R, Prehn C, Altmaier E, Kastenmüller G, Kato BS, Mewes H-W et al (2010) A genome-wide perspective of genetic variation in human metabolism. Nat Genet 42(2):137–141. https://doi.org/10.1038/ng.507
Shin S-Y, Fauman EB, Petersen A-K, Krumsiek J, Santos R, Huang J, Arnold M, Erte I, Forgetta V, Yang T-P et al (2014) An atlas of genetic influences on human blood metabolites. Nat Genet 46(6):543–550. https://doi.org/10.1038/ng.2982
Kastenmüller G, Raffler J, Gieger C, Suhre K (2015) Genetics of human metabolism: an update. Hum Mol Genet 24(R1):R93–R101. https://doi.org/10.1093/hmg/ddv263
Riveros-Mckay F, Oliver-Williams C, Karthikeyan S, Walter K, Kundu K, Ouwehand WH, Roberts D, Di Angelantonio E, Soranzo N, Danesh J et al (2020) The influence of rare variants in circulating metabolic biomarkers. PLoS Genet 16(3):e1008605. https://doi.org/10.1371/journal.pgen.1008605
Lotta LA, Pietzner M, Stewart ID, Wittemans LBL, Li C, Bonelli R, Raffler J, Biggs EK, Oliver-Williams C, Auyeung VPW et al (2021) A cross-platform approach identifies genetic regulators of human metabolism and health. Nat Genet 53(1):54–64. https://doi.org/10.1038/s41588-020-00751-5
Li S, Ogawa W, Emi A, Hayashi K, Senga Y, Nomura K, Hara K, Yu D, Kasuga M (2011) Role of S6K1 in regulation of SREBP1c expression in the liver. Biochem Biophys Res Commun 412(2):197–202. https://doi.org/10.1016/j.bbrc.2011.07.038
Bartel J, Krumsiek J, Schramm K, Adamski J, Gieger C, Herder C, Carstensen M, Peters A, Rathmann W, Roden M et al (2015) The human blood metabolome-transcriptome interface. PLoS Genet 11(6):e1005274. https://doi.org/10.1371/journal.pgen.1005274
Inouye M, Kettunen J, Soininen P, Silander K, Ripatti S, Kumpula LS, Hämäläinen E, Jousilahti P, Kangas AJ, Männistö S et al (2010) Metabonomic, transcriptomic, and genomic variation of a population cohort. Mol Syst Biol 6:441. https://doi.org/10.1038/msb.2010.93
Wahl S, Vogt S, Stückler F, Krumsiek J, Bartel J, Kacprowski T, Schramm K, Carstensen M, Rathmann W, Roden M et al (2015) Multi-omic signature of body weight change: results from a population-based cohort study. BMC Med 13:48. https://doi.org/10.1186/s12916-015-0282-y
Nath AP, Ritchie SC, Byars SG, Fearnley LG, Havulinna AS, Joensuu A, Kangas AJ, Soininen P, Wennerström A, Milani L et al (2017) An interaction map of circulating metabolites, immune gene networks, and their genetic regulation. Genome Biol 18(1):146. https://doi.org/10.1186/s13059-017-1279-y
Loeffler M, Engel C, Ahnert P, Alfermann D, Arelin K, Baber R, Beutner F, Binder H, Brähler E, Burkhardt R et al (2015) The LIFE-Adult-study: objectives and design of a population-based cohort study with 10,000 deeply phenotyped adults in Germany. BMC Public Health 15:691. https://doi.org/10.1186/s12889-015-1983-z
Engel C, Wirkner K, Zeynalova S, Baber R, Binder H, Ceglarek U, Enzenbach C, Fuchs M, Hagendorff A, Henger S et al (2022) Cohort profile: the LIFE-Adult-study. Int J Epidemiol. https://doi.org/10.1093/ije/dyac114
Scholz M, Henger S, Beutner F, Teren A, Baber R, Willenberg A, Ceglarek U, Pott J, Burkhardt R, Thiery J (2020) Cohort profile: The Leipzig Research Center for Civilization Diseases-Heart Study (LIFE-Heart). Int J Epidemiol 49(5):1439–1440h. https://doi.org/10.1093/ije/dyaa075
Gross A, Tönjes A, Kovacs P, Veeramah KR, Ahnert P, Roshyara NR, Gieger C, Rueckert I-M, Loeffler M, Stoneking M et al (2011) Population-genetic comparison of the Sorbian isolate population in Germany with the German KORA population using genome-wide SNP arrays. BMC Genet 12:67. https://doi.org/10.1186/1471-2156-12-67
Veeramah KR, Tönjes A, Kovacs P, Gross A, Wegmann D, Geary P, Gasperikova D, Klimes I, Scholz M, Novembre J et al (2011) Genetic variation in the Sorbs of eastern Germany in the context of broader European genetic diversity. Eur J Hum Genet : EJHG 19(9):995–1001. https://doi.org/10.1038/ejhg.2011.65
Ceglarek U, Müller P, Stach B, Bührdel P, Thiery J, Kiess W (2002) Validation of the phenylalanine/tyrosine ratio determined by tandem mass spectrometry: sensitive newborn screening for phenylketonuria. Clin Chem Lab Med 40(7):693–697. https://doi.org/10.1515/CCLM.2002.119
Ceglarek U, Leichtle A, Brügel M, Kortz L, Brauer R, Bresler K, Thiery J, Fiedler GM (2009) Challenges and developments in tandem mass spectrometry based clinical metabolomics. Mol Cell Endocrinol 301(1–2):266–271. https://doi.org/10.1016/j.mce.2008.10.013
Brauer R, Leichtle AB, Fiedler GM, Thiery J, Ceglarek U (2011) Preanalytical standardization of amino acid and acylcarnitine metabolite profiling in human blood using tandem mass spectrometry. Metabolomics 7(3):344–352. https://doi.org/10.1007/s11306-010-0256-1
Beuchel C, Becker S, Dittrich J, Kirsten H, Toenjes A, Stumvoll M, Loeffler M, Thiele H, Beutner F, Thiery J et al (2019) Clinical and lifestyle related factors influencing whole blood metabolite levels - a comparative analysis of three large cohorts. Mol Metab 29:76–85. https://doi.org/10.1016/j.molmet.2019.08.010
Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (Oxford, England) 8(1):118–127. https://doi.org/10.1093/biostatistics/kxj037
Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28(6):882–883. https://doi.org/10.1093/bioinformatics/bts034
Aulchenko YS, Ripke S, Isaacs A, van Duijn CM (2007) GenABEL: an R library for genome-wide association analysis. Bioinformatics 23(10):1294–1296. https://doi.org/10.1093/bioinformatics/btm108
Wang J (2002) An estimator for pairwise relatedness using molecular markers. Genetics 160(3):1203–1215
Holdt LM, Beutner F, Scholz M, Gielen S, Gäbel G, Bergert H, Schuler G, Thiery J, Teupser D (2010) ANRIL expression is associated with atherosclerosis risk at chromosome 9p21. Arterioscler Thromb Vasc Biol 30(3):620–627. https://doi.org/10.1161/ATVBAHA.109.196832
Burkhardt R, Kirsten H, Beutner F, Holdt LM, Gross A, Teren A, Tönjes A, Becker S, Krohn K, Kovacs P et al (2015) Integration of genome-wide SNP data and gene-expression profiles reveals six novel loci and regulatory mechanisms for amino acids and acylcarnitines in whole blood. PLoS Genet 11(9):e1005510. https://doi.org/10.1371/journal.pgen.1005510
Tönjes A, Scholz M, Breitfeld J, Marzi C, Grallert H, Gross A, Ladenvall C, Schleinitz D, Krause K, Kirsten H et al (2014) Genome wide meta-analysis highlights the role of genetic variation in RARRES2 in the regulation of circulating serum chemerin. PLoS Genet 10(12):e1004854. https://doi.org/10.1371/journal.pgen.1004854
Du P, Kibbe WA, Lin SM (2008) lumi: a pipeline for processing Illumina microarray. Bioinformatics (Oxford, England) 24(13):1547–1548. https://doi.org/10.1093/bioinformatics/btn224
Kirsten H, Al-Hasani H, Holdt L, Gross A, Beutner F, Krohn K, Horn K, Ahnert P, Burkhardt R, Reiche K et al (2015) Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci†. Hum Mol Genet 24(16):4746–4763. https://doi.org/10.1093/hmg/ddv194
Ritchie ME, Phipson B, Di Wu, Hu Y, Law CW, Shi W, Smyth GK (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43(7):e47. https://doi.org/10.1093/nar/gkv007
Peterson CB, Bogomolov M, Benjamini Y, Sabatti C (2016) Many phenotypes without many false discoveries: error controlling strategies for multitrait association studies. Genet Epidemiol 40(1):45–56. https://doi.org/10.1002/gepi.21942
Huang QQ, Ritchie SC, Brozynska M, Inouye M (2018) Power, false discovery rate and winner’s curse in eQTL studies. Nucleic Acids Res 46(22):e133–e133. https://doi.org/10.1093/nar/gky780
Yoav B, Yosef H (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B (Methodological) 57(1):289–300
Evangelou E, Ioannidis JPA (2013) Meta-analysis methods for genome-wide association studies and beyond. Nat Rev Genet 14(6):379–389. https://doi.org/10.1038/nrg3472
Strimmer K (2008) fdrtool: a versatile R package for estimating local and tail area-based false discovery rates. Bioinformatics (Oxford, England) 24(12):1461–1462. https://doi.org/10.1093/bioinformatics/btn209
Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci USA 100(16):9440–9445. https://doi.org/10.1073/pnas.1530509100
VanderWeele TJ (2016) Mediation analysis: a practitioner’s guide. Annu Rev Public Health 37:17–32. https://doi.org/10.1146/annurev-publhealth-032315-021402
MacKinnon DP, Krull JL, Lockwood CM (2000) Equivalence of the mediation, confounding and suppression effect. Prev Sci 1(4):173–181. https://doi.org/10.1023/a:1026595011371
VanderWeele TJ, Vansteelandt S (2009) Conceptual issues concerning mediation, interventions and composition. Stat Interface. https://www.researchgate.net/publication/265823964_Conceptual_issues_concerning_mediation_interventions_and_composition
Valeri L, VanderWeele TJ (2013) Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods 18(2):137–150. https://doi.org/10.1037/a0031034
James LR, Mulaik SA, Brett JM (2006) A tale of two methods. Organ Res Methods 9(2):233–244. https://doi.org/10.1177/1094428105285144
MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V (2002) A comparison of methods to test mediation and other intervening variable effects. Psychol Methods 7(1):83
MacKinnon DP, Fairchild AJ, Fritz MS (2007) Mediation analysis. Annu Rev Psychol 58:593–614. https://doi.org/10.1146/annurev.psych.58.110405.085542
Fairchild AJ, McDaniel HL (2017) Best (but oft-forgotten) practices: mediation analysis12. Am J Clin Nutr 105(6):1259–1271. https://doi.org/10.3945/ajcn.117.152546
Tofighi D, MacKinnon DP (2011) RMediation: an R package for mediation analysis confidence intervals. Behav Res Methods 43(3):692–700. https://doi.org/10.3758/s13428-011-0076-x
Preacher KJ, Selig JP (2012) Advantages of Monte Carlo confidence intervals for indirect effects. Commun Methods Meas 6(2):77–98. https://doi.org/10.1080/19312458.2012.679848
Ditlevsen S, Christensen U, Lynch J, Damsgaard MT, Keiding N (2005) The mediation proportion: a structural equation approach for estimating the proportion of exposure effect on outcome explained by an intermediate variable. Epidemiology (Cambridge, Mass.) 16(1):114–120. https://doi.org/10.1097/01.ede.0000147107.76079.07
Lotta LA, Scott RA, Sharp SJ, Burgess S, Luan J, Tillin T, Schmidt AF, Imamura F, Stewart ID, Perry JRB et al (2016) Genetic predisposition to an impaired metabolism of the branched-chain amino acids and risk of type 2 diabetes: a Mendelian randomisation analysis. PLoS Med 13(11):e1002179. https://doi.org/10.1371/journal.pmed.1002179
Srivastava A, Srivastava N, Mittal B (2016) Genetics of obesity. Indian J Clin Biochem : IJCB 31(4):361–371. https://doi.org/10.1007/s12291-015-0541-x
Banno A, Wang J, Okada K, Mori R, Mijiti M, Nagaoka S (2019) Identification of a novel cholesterol-lowering dipeptide, phenylalanine-proline (FP), and its down-regulation of intestinal ABCA1 in hypercholesterolemic rats and Caco-2 cells. Sci Rep 9(1):19416. https://doi.org/10.1038/s41598-019-56031-8
Zhou H, Tan KCB, Shiu SWM, Wong Y (2008) Determinants of leukocyte adenosine triphosphate-binding cassette transporter G1 gene expression in type 2 diabetes mellitus. Metab: Clin Exp 57(8):1135–1140. https://doi.org/10.1016/j.metabol.2008.03.020
Frisdal E, Le Goff W (2015) Adipose ABCG1: a potential therapeutic target in obesity? Adipocyte 4(4):315–318. https://doi.org/10.1080/21623945.2015.1023491
Hardy LM, Frisdal E, Le Goff W (2017) Critical role of the human ATP-binding cassette G1 transporter in cardiometabolic diseases. Int J Mol Sci 18(9). https://doi.org/10.3390/ijms18091892
Skarda L, Kowal J, Locher KP (2021) Structure of the human cholesterol transporter ABCG1. J Mol Biol 433(21):167218. https://doi.org/10.1016/j.jmb.2021.167218
Fagerholm SC, Lek HS, Morrison VL (2014) Kindlin-3 in the immune system. Am J Clin Exp Immunol 3(1):37–42
Cao J-N, Gollapudi S, Sharman EH, Jia Z, Gupta S (2010) Age-related alterations of gene expression patterns in human CD8+ T cells. Aging Cell 9(1):19–31. https://doi.org/10.1111/j.1474-9726.2009.00534.x
Ren G, Roberts AI, Shi Y (2011) Adhesion molecules: key players in Mesenchymal stem cell-mediated immunosuppression. Cell Adh Migr 5(1):20–22. https://doi.org/10.4161/cam.5.1.13491
Edgel KA, McMillen TS, Wei H, Pamir N, Houston BA, Caldwell MT (1821) Mai P-OT, Oram JF, Tang C, Leboeuf RC (2012) Obesity and weight loss result in increased adipose tissue ABCG1 expression in db/db mice. Biochem Biophys Acta 3:425–434. https://doi.org/10.1016/j.bbalip.2011.11.012
Eguchi J, Kong X, Tenta M, Wang X, Kang S, Rosen ED (2013) Interferon regulatory factor 4 regulates obesity-induced inflammation through regulation of adipose tissue macrophage polarization. Diabetes 62(10):3394–3403. https://doi.org/10.2337/db12-1327
Libert DM, Nowacki AS, Natowicz MR (2018) Metabolomic analysis of obesity, metabolic syndrome, and type 2 diabetes: amino acid and acylcarnitine levels change along a spectrum of metabolic wellness. PeerJ 6: e5410. https://doi.org/10.7717/peerj.5410
Alves A, Bassot A, Bulteau A-L, Pirola L, Morio B (2019) Glycine metabolism and its alterations in obesity and metabolic diseases. Nutrients 11(6). https://doi.org/10.3390/nu11061356
Abu-Remaileh M, Abu-Remaileh M, Akkawi R, Knani I, Udi S, Pacold ME, Tam J, Aqeilan RI (2019) WWOX somatic ablation in skeletal muscles alters glucose metabolism. Mol Metab 22:132–140. https://doi.org/10.1016/j.molmet.2019.01.010
Sankaran VG, Menne TF, Xu J, Akie TE, Lettre G, van Handel B, Mikkola HKA, Hirschhorn JN, Cantor AB, Orkin SH (2008) Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science (New York, N.Y.) 322(5909):1839–1842. https://doi.org/10.1126/science.1165409
Yu Y, Wang J, Khaled W, Burke S, Li P, Chen X, Yang W, Jenkins NA, Copeland NG, Zhang S et al (2012) Bcl11a is essential for lymphoid development and negatively regulates p53. J Exp Med 209(13):2467–2483. https://doi.org/10.1084/jem.20121846
Basak A, Sankaran VG (2016) Regulation of the fetal hemoglobin silencing factor BCL11A. Ann N Y Acad Sci 1368(1):25–30. https://doi.org/10.1111/nyas.13024
Peiris H, Park S, Louis S, Gu X, Lam JY, Asplund O, Ippolito GC, Bottino R, Groop L, Tucker H et al (2018) Discovering human diabetes-risk gene function with genetics and physiological assays. Nat Commun 9(1):3855. https://doi.org/10.1038/s41467-018-06249-3
Yin J, Xie X, Ye Y, Wang L, Che F (2019) BCL11A: a potential diagnostic biomarker and therapeutic target in human diseases. Biosci Rep 39(11). https://doi.org/10.1042/BSR20190604
Liakopoulou E, Blau CA, Li Q, Josephson B, Wolf JA, Fournarakis B, Raisys V, Dover G, Papayannopoulou T, Stamatoyannopoulos G (1995) Stimulation of fetal hemoglobin production by short chain fatty acids. Blood 86(8):3227–3235. https://doi.org/10.1182/blood.V86.8.3227.3227
Pace BS, White GL, Dover GJ, Boosalis MS, Faller DV, Perrine SP (2002) Short-chain fatty acid derivatives induce fetal globin expression and erythropoiesis in vivo. Blood 100(13):4640–4648. https://doi.org/10.1182/blood-2002-02-0353
Stelzl I (1986) Changing a causal hypothesis without changing the fit: some rules for generating equivalent path models. Multivar Behav Res 21(3):309–331. https://doi.org/10.1207/s15327906mbr2103_3
Ouchi N, Parker JL, Lugus JJ, Walsh K (2011) Adipokines in inflammation and metabolic disease. Nat Rev Immunol 11(2):85–97. https://doi.org/10.1038/nri2921
Gregor MF, Hotamisligil GS (2011) Inflammatory mechanisms in obesity. Annu Rev Immunol 29:415–445. https://doi.org/10.1146/annurev-immunol-031210-101322
Yvan-Charvet L, Ranalletta M, Wang N, Han S, Terasaka N, Li R, Welch C, Tall AR (2007) Combined deficiency of ABCA1 and ABCG1 promotes foam cell accumulation and accelerates atherosclerosis in mice. J Clin Investig 117(12):3900–3908. https://doi.org/10.1172/JCI33372
Wojcik AJ, Skaflen MD, Srinivasan S, Hedrick CC (2008) A critical role for ABCG1 in macrophage inflammation and lung homeostasis. J Immunol (Baltimore, Md. : 1950) 180(6):4273–4282. https://doi.org/10.4049/jimmunol.180.6.4273
Sag D, Cekic C, Wu R, Linden J, Hedrick CC (2015) The cholesterol transporter ABCG1 links cholesterol homeostasis and tumour immunity. Nat Commun 6(1):6354. https://doi.org/10.1038/ncomms7354
Wei H, Tarling EJ, McMillen TS, Tang C, Leboeuf RC (2015) ABCG1 regulates mouse adipose tissue macrophage cholesterol levels and ratio of M1 to M2 cells in obesity and caloric restriction. J Lipid Res 56(12):2337–2347. https://doi.org/10.1194/jlr.M063354
Schreiber I, Dörpholz G, Ott C-E, Kragesteen B, Schanze N, Lee CT, Köhrle J, Mundlos S, Ruschke K, Knaus P (2017) BMPs as new insulin sensitizers: enhanced glucose uptake in mature 3T3-L1 adipocytes via PPARγ and GLUT4 upregulation. Sci Rep 7(1):1–13. https://doi.org/10.1038/s41598-017-17595-5
Cirulli ET, Guo L, Leon Swisher C, Shah N, Huang L, Napier LA, Kirkness EF, Spector TD, Caskey CT, Thorens B et al (2019) Profound perturbation of the metabolome in obesity is associated with health risk. Cell Metab 29(2):488-500.e2. https://doi.org/10.1016/j.cmet.2018.09.022
Acknowledgements
We thank all study participants in the Sorb and LIFE studies whose personal dedication and commitment have made this project possible. We thank Kerstin Wirkner for running the LIFE study center and Sylvia Henger for data quality control.
Funding
Open Access funding enabled and organized by Projekt DEAL. This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med line of funding (project “SYMPATH,” grant number 01ZX1906B) and by a grant from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, Project number 209933838, Collaborative Research Center SFB1052 “Obesity Mechanisms,” SFB-1052/4 B11). The LIFE studies were financed by the LIFE – Leipzig Research Center for Civilization Diseases, Universität Leipzig. LIFE was funded by means of the European Union, by the European Regional Development Fund (ERDF) and by means of the Free State of Saxony within the framework of the excellence initiative.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Ethics approval
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of the Medical Faculty of the University Leipzig, Germany (Adult: Reg. No. 263-2009-14122009, Heart: Reg. No. 276-2005, Sorbs: Reg. No: 088-2005).
Consent to participate
Written informed consent including agreement to genetic analyses was obtained from all subjects involved in the studies.
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Beuchel, C., Dittrich, J., Becker, S. et al. An atlas of genome-wide gene expression and metabolite associations and possible mediation effects towards body mass index. J Mol Med 101, 1305–1321 (2023). https://doi.org/10.1007/s00109-023-02362-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00109-023-02362-z