Mendelian randomization studies of biomarkers and type 2 diabetes

Many biomarkers are associated with type 2 diabetes (T2D) risk in epidemiological observations. The aim of this study was to identify and summarize current evidence for causal effects of biomarkers on T2D. A systematic literature search in PubMed and EMBASE (until April 2015) was done to identify Mendelian randomization studies that examined potential causal effects of biomarkers on T2D. To replicate the findings of identified studies, data from two large-scale, genome-wide association studies (GWAS) were used: DIAbetes Genetics Replication And Meta-analysis (DIAGRAMv3) for T2D and the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) for glycaemic traits. GWAS summary statistics were extracted for the same genetic variants (or proxy variants), which were used in the original Mendelian randomization studies. Of the 21 biomarkers (from 28 studies), ten have been reported to be causally associated with T2D in Mendelian randomization. Most biomarkers were investigated in a single cohort study or population. Of the ten biomarkers that were identified, nominally significant associations with T2D or glycaemic traits were reached for those genetic variants related to bilirubin, pro-B-type natriuretic peptide, delta-6 desaturase and dimethylglycine based on the summary data from DIAGRAMv3 or MAGIC. Several Mendelian randomization studies investigated the nature of associations of biomarkers with T2D. However, there were only a few biomarkers that may have causal effects on T2D. Further research is needed to broadly evaluate the causal effects of multiple biomarkers on T2D and glycaemic traits using data from large-scale cohorts or GWAS including many different genetic variants.


Introduction
Over the past decade, interest in studying biological markers (biomarkers) for type 2 diabetes (T2D) has increased markedly, because multiple pathobiological processes may contribute to disease progression, which provides an opportunity to introduce preventive and therapeutic interventions for T2D (1). In clinical practice, such biomarkers (e.g., glucose and glycated haemoglobin tests) are widely used for the diagnosis of diabetes or for the monitoring of therapeutic intervention (2,3). Targeted intervention at the biomarker level would be useful where there is evidence for a causal relationship between an exposure (like a biomarker) and T2D (4).
Traditional epidemiological studies often cannot fill this evidence gap because of unmeasured confounding or reverse causality (4,5,6,7). A complementary analysis of genetic data, termed 'Mendelian randomization', has been shown to add value when inferring a causal association (4,5,6,7,8). The main assumption of Mendelian randomization is that genetic variants do not change over time and are inherited randomly (based on Mendel's laws). In other words, genetic variants used as proxy measures for exposures (e.g., biomarkers) are considered essentially free from confounding and reverse causation. Therefore, the analysis of integrated observational-genetic data is considered similar to that of randomized trials (9,10). In Mendelian randomization, if there is a causal association between a biomarker and T2D, the genetic variant(s) influencing the biomarker should also be associated with the outcome of interest (5,6,8). In the current study, evidence for causal associations between biomarkers and the risk of T2D was updated via a systematic literature search to identify Mendelian randomization studies. Next, summary data from the two largest genome-wide association studies (GWAS) for T2D or glycaemic traits (10,11) were used to examine the effect estimates for each genetic variant compared with those of the identified studies.
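The logic of using a genetic variant as a proxy for a biomarker can be illustrated with the ratio (Wald) estimator, a common single-variant approach in Mendelian randomization. The sketch below is an illustration only and is not taken from the studies reviewed here; the function name, its parameters and the input numbers are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the variant-biomarker association.

```python
import math


def wald_ratio(beta_variant_biomarker, beta_variant_outcome, se_variant_outcome):
    """Ratio (Wald) estimate of a biomarker-outcome effect from one variant.

    Returns (estimate, standard error, two-sided P value). The standard error
    is a first-order approximation: it treats the variant-biomarker
    association as if it were known without error.
    """
    if beta_variant_biomarker == 0:
        raise ValueError("the variant must be associated with the biomarker")
    estimate = beta_variant_outcome / beta_variant_biomarker
    se = se_variant_outcome / abs(beta_variant_biomarker)
    z = estimate / se
    # Two-sided P value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical inputs for illustration only (not from any study):
# per-allele effect on the biomarker, per-allele log odds ratio for T2D,
# and the standard error of that log odds ratio.
estimate, se, p_value = wald_ratio(0.5, 0.10, 0.04)
print(f"estimate={estimate:.3f}, se={se:.3f}, P={p_value:.4f}")
```

If the variant is not associated with the outcome, the ratio estimate is close to zero, which is compatible with no causal effect of the biomarker under the assumptions stated above.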

Search strategy for candidate biomarkers
PubMed and EMBASE were searched to identify Mendelian randomization studies examining the associations between biomarkers and T2D until April 2015. The overview of this systematic literature search was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) guidelines, when applicable (12). A manual search was also done for the references of included articles to identify other relevant studies. Because a review of published original studies was performed, the Declaration of Helsinki items related to 'approval of medical ethics committee' and 'permission acquisition' are not applicable to the current study.

Selection criteria
Studies were included if they formally quantified a causal association between one or more biomarkers (as main exposures) and T2D (as the main outcome); used data on biomarker-associated genetic variants; and classified/defined the exposure as a biomarker that has been objectively measured in serum, plasma or urine.

Data extraction and quality assessment
A primary plan was made to extract necessary data from the full text of the original studies or to contact the corresponding author(s) when appropriate. Fig. 1 depicts the workflow of the literature search. T2D was determined as the main outcome if one or more of the following conditions were fulfilled: a physician diagnosis of T2D as indicated by self-report or in a primary-care database; fasting plasma glucose ≥7.0 mmol/l or a random sample plasma glucose ≥11.1 mmol/l; or the initiation of glucose-lowering medication as retrieved from a pharmacy registry or hospital records (11).

Statistical analysis
All genetic variants that affect identified biomarkers were obtained from the original Mendelian randomization studies. To replicate the nature of the relationship between each biomarker and the outcome (i.e., T2D or glycaemic traits), a genetic approach using outcome-association data for the biomarker-related genetic variants was applied (5,6,8). In brief, to determine whether the same genetic variants (or a suitable proxy variant) were associated with T2D (or glycaemic traits), the corresponding summary association statistics from GWAS for T2D (DIAbetes Genetics Replication And Meta-analysis (DIAGRAMv3)) (13) and glycaemic traits (Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC)) (14) were extracted. DIAGRAMv3 is a meta-analysis of multiple GWAS with a total of 12 171 diabetes cases and 56 862 controls of European descent (13). Details regarding the use of GWAS data for Mendelian randomization have been described previously (5,6,10,11). The results of these selected single-test association analyses in the GWAS data are equivalent to those from individual-level data analysis (5,6,11).
Extracted data were tabulated on Excel spreadsheets. All statistical analyses were conducted using Excel or Stata/SE version 13.1 for Windows (http://cran.r-project.org/). A two-sided P value <0.05 was considered nominally significant.
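The look-up procedure described above can be sketched as follows. This is a minimal illustration and not the analysis code used in this study: the function names, the dictionary layout of the summary statistics, the proxy mapping and all variant identifiers and numbers are hypothetical. It assumes that each GWAS record provides an effect estimate and its standard error, from which a two-sided P value is derived.

```python
import math

NOMINAL_ALPHA = 0.05  # two-sided threshold for nominal significance


def two_sided_p(beta, se):
    """Two-sided P value for a Wald test under a standard normal distribution."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


def look_up_variants(variant_ids, gwas_summary, proxies=None):
    """Look up each variant (or its proxy) in GWAS summary statistics.

    gwas_summary maps a variant id to a (beta, standard error) pair.
    proxies maps a variant id to a proxy variant id (hypothetical layout).
    """
    proxies = proxies or {}
    rows = []
    for variant in variant_ids:
        # Prefer the original variant; fall back to its proxy if listed
        used = variant if variant in gwas_summary else proxies.get(variant)
        if used is None or used not in gwas_summary:
            rows.append({"variant": variant, "used": None, "p": None, "nominal": False})
            continue
        beta, se = gwas_summary[used]
        p = two_sided_p(beta, se)
        rows.append({"variant": variant, "used": used, "p": p, "nominal": p < NOMINAL_ALPHA})
    return rows


# Hypothetical summary statistics for illustration only
gwas_summary = {"rsA": (0.06, 0.02), "rsB_proxy": (0.01, 0.02)}
proxies = {"rsB": "rsB_proxy"}
rows = look_up_variants(["rsA", "rsB", "rsC"], gwas_summary, proxies)
for row in rows:
    print(row)
```

Variants that are absent from the summary data and have no available proxy are reported as missing rather than dropped, so that the number of variants tested per biomarker stays visible.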

Literature search and study characteristics
After scanning 812 titles and selected abstracts, 33 articles were selected for full text review (11,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46). The literature search process is shown in Fig. 1. Five studies were excluded as they did not measure a specific biomarker (n=1) or investigated a measure of insulin resistance as the main exposure (n=4). The characteristics of the 28 studies are summarized in Table 1. The studies were performed in different populations; most were cohort studies including middle-aged adults in the USA or Europe and were published between 2008 and 2015. Four studies were conducted only in Asian populations including Chinese or Taiwanese. In 12 studies, data from multiple GWAS were combined to examine the associations between genotypes and T2D. Six of these studies used data from DIAGRAMv2 (n=3) (47) or DIAGRAMv3 (n=3) (13), and the other six used at least two cohorts with genome-wide genotyping. DIAGRAMv2 was a meta-analysis of eight GWAS comprising 8130 individuals with T2D and 38 987 controls of European descent (47). All Mendelian randomization studies investigated only one biomarker as an exposure in relation to T2D. Some studies also examined the causal associations of a single biomarker with several other outcomes such as cardiovascular disease (18,24), rheumatoid arthritis (18) or osteoporosis (24). In the final sample, studies included up to 28 144 T2D cases and 76 344 T2D controls or total participants.

Biomarkers and Mendelian randomization
From 28 studies, data were retrieved on causal associations of biomarkers with T2D. In these studies, 21 unique biomarkers were identified; they were investigated once (16 biomarkers), twice (three biomarkers) or three times (two biomarkers) (Table 1). Of these, 19 biomarkers were investigated in independent studies. However, the two biomarkers with three Mendelian randomization studies were investigated in combined studies where the same sample, e.g., DIAGRAMv2 for adiponectin (19,43), or a part of a whole cohort, e.g., European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam for vitamin D (17,45), was used to make up the total cases/controls. Eleven studies used one genetic variant as a single instrumental variable in Mendelian randomization. Eight studies used at least two independent genetic variants in the same locus as instrumental variables. The remaining nine studies used multiple genetic variants in different loci or created a multi-locus genetic risk score for each biomarker.
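A multi-locus genetic risk score is commonly built as a weighted sum of allele counts. The sketch below shows this general construction under stated assumptions; it is not the scoring method of any particular study reviewed here, and the function name, variant identifiers, weights and allele counts are hypothetical.

```python
def genetic_risk_score(dosages, weights):
    """Weighted genetic risk score for one individual.

    dosages maps a variant id to the number of effect alleles (0, 1 or 2).
    weights maps a variant id to its per-allele effect on the biomarker.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing genotypes for: {sorted(missing)}")
    # Sum of per-allele weights multiplied by the allele count at each locus
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical weights and genotypes for illustration only
weights = {"rs1": 0.2, "rs2": 0.1, "rs3": 0.05}
dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
score = genetic_risk_score(dosages, weights)
print(score)
```

An unweighted score is the special case in which every weight equals one, so the score is simply the total count of effect alleles.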
Using the GWAS data from DIAGRAMv3 or MAGIC (13,14), associations of the genetic variants influencing these biomarkers with T2D and glycaemic traits were tested. Nominal significance (P<0.05) was reached for the genetic variants that affect bilirubin (P=0.03 for glucose and homeostatic model assessment (HOMA)-insulin resistance (IR)), NT proBNP (P=0.03 for T2D), delta-6 desaturase (D6D) activity (P=0.003 for T2D; P=2.7×10^-8 for glucose) and dimethylglycine (P=0.004 for glucose) in relation to T2D or glycaemic traits (Table 2). For adiponectin, a nominally significant association with T2D or glycaemic traits was observed for seven out of 19 genetic variants. All of these seven variants, except rs12637534, were non-ADIPOQ adiponectin genetic variants. In line with previous Mendelian randomization studies, the overall effect of non-ADIPOQ variants on T2D or glycaemic traits together with the null association of ADIPOQ variants is compatible with pleiotropic effects of adiponectin on T2D (42).

Discussion
This literature search of Mendelian randomization studies shows that ten out of 21 identified biomarkers were reported to be causally associated with T2D. In particular, the presence of potential causal associations (defined as nominally significant) between the biomarker-related variants and T2D and/or glycaemic traits can be confirmed for four biomarkers using the publicly available GWAS data. The inconsistency between the identified studies and the summary data from DIAGRAMv3 or MAGIC can be explained to some extent by differences in design and populations across studies, the possibility of false positive associations, selection bias, the heterogeneity in T2D, the use of varied sources to ascertain T2D cases, and differences in data quality control and pre-analysis preparations (6,11). Taken together, these findings suggest that the oxidative stress system (bilirubin or metabolites of bilirubin), the brain natriuretic peptide (BNP) hormone system, D6D activity and dimethylglycine may contribute to the development of diabetes or insulin resistance through secondary effects (i.e., pleiotropy) or direct mechanisms (11,37,42). Bilirubin, the major end product of heme catabolism, has antioxidant properties and may compensate for oxidative stress (11,48). Oxidative stress has been shown to be an important factor in the pathophysiology of diabetes (11). At the cellular level, bilirubin can be oxidized to its precursor, biliverdin, to detoxify the excess of oxidants. Biliverdin is rapidly recycled to bilirubin via the action of biliverdin reductase, generating a physiologic cytoprotective cycle in several tissues (49). The underlying mechanism of a protective role of BNP in the aetiology of T2D is unknown in humans (37). In mice, an overexpressed BNP signalling cascade can protect against diet-induced insulin resistance and obesity by promoting muscle mitochondrial biogenesis and fat oxidation (37,50).
The biological mechanisms of the relationship between D6D activity and T2D are not well understood (30). Although data from human experimental studies are scarce, observational studies have shown that D6D activity or lifestyle-induced changes in D6D activity were associated with insulin resistance (30,51,52). Because D6D catalyses the synthesis of fatty acids, one can speculate that the link between D6D activity and T2D is likely to be mediated by changes in fatty acid composition, which in turn may affect insulin signalling and receptor-binding affinities (30). Dimethylglycine is metabolized to glycine by dimethylglycine dehydrogenase (DMGDH) in mammals (32). Accordingly, a recent GWAS found that DMGDH genetic variants were strongly associated with blood-based dimethylglycine (53). Epidemiological studies have observed an inverse association between the precursor of dimethylglycine, betaine, and metabolic risk factors (54) but a positive association of elevated glycine with increased insulin sensitivity (32). In humans, cardiometabolic effects of the inhibition of DMGDH or the supplementation of dimethylglycine have not been investigated (32). However, experimental animal studies have suggested a protective role of dimethylglycine in glucose metabolism through a reduction in DMGDH function (32,55,56).
In the post-omics era, epidemiological studies basically suggest that the levels of a given biomarker differ between patients with T2D (or individuals at high risk of developing diabetes) and individuals without diabetes (2,57). If the biomarker is not causally related to the disease outcome, the process of developing diabetes may itself cause the increase or decrease in the levels of the biomarker as one of the consequences of T2D, which is called reverse causality (4,26,37). Unmeasured confounding, or confounding factors measured with error (for example, self-reported physical activity), is another explanation for the observed associations between most biomarkers and T2D. In this context, the use of genetic data (as in Mendelian randomization) can help to determine whether the association between a biomarker and T2D is causal (4,36,37). The potential role of biomarkers in the development of T2D and the trajectories of glycaemic traits remains to be further confirmed using longitudinal analysis (6,11). The latter analytical approach can provide insight into the potential value of biomarkers that indicate pathobiological signals of metabolic changes in the aetiology of T2D. T2D is influenced by the interactions of multiple genes, and a gene may have been mapped to multiple biomarkers rather than only the biomarker of interest (4,13,14). Thus, extensive knowledge of gene function, biological processes and the interaction of the genome with environmental factors is needed to better understand how genetic variation in the human genome contributes to lifelong risk of T2D (4,11,13,14,57). In this review, most studies investigated only a single biomarker-diabetes association, and statistically significant associations are reported more often. Here, publication bias should be considered. Other limitations include the lack of complementary analyses of the genetic associations for glycaemic traits and the scarcity of sets of genetic variants or large-scale GWAS for biomarkers.
Similarly, the Mendelian randomization approach using GWAS summary statistics can be extended as a secondary analysis of several datasets that have data on the biomarker-associated variants (or their proxies) for several diseases or traits. This multidisciplinary research requires contacting a large group of GWAS consortia to obtain summary association statistics for the outcomes of interest. Moreover, for the biomarkers linked to T2D, the underlying biological mechanisms remain unknown. To uncover these mechanisms, one needs to perform a complementary strand of experimental research. Before that, an in silico functional gene network (pathway) analysis can be used to speculate on the possible biological mechanisms of biomarker-disease associations (58,59). Finally, Mendelian randomization cannot completely control for the possibility of developmental compensation (called canalization), confounding and pleiotropic effects (5,6,8,11,37).
In conclusion, this is the first study that updates evidence for causal associations between biomarkers and T2D to date. Several Mendelian randomization studies investigated the nature of associations of biomarkers with T2D. Most biomarkers were investigated in a single cohort study or population. However, there were only a few biomarkers that may have causal effects on T2D. Further research is warranted to broadly evaluate the causal effects of multiple biomarkers on T2D and glycaemic traits using data from large-scale cohorts or combined GWAS including many different genetic variants. This genetic approach may advance our understanding of the causes of T2D and potentially enable us to explore novel targets for the prevention and treatment of diabetes.

*P values for each of the biomarker variants were extracted from publicly available meta-analyses of genome-wide association studies (13,14). NA, not applicable; CRP, C-reactive protein; BNP, brain natriuretic peptide; IL1-Ra, interleukin 1 receptor antagonist; Lp(a), lipoprotein(a); LTL, leukocyte telomere length; MIF, macrophage migration inhibitory factor; NT proBNP, N-terminal pro B-type natriuretic peptide; SHBG, sex hormone binding globulin; Vitamin D-BP, vitamin D binding protein.

Declaration of interest
The author declares that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

Funding
This work was supported by The Netherlands Organization for Scientific Research project (NWO) and the Medical Research Council UK (grant number MC_UU_12015/1). A Abbasi is supported by a Rubicon grant from the NWO (Project no. 825.13.004).