To date, more than 100 common genetic variants have been reliably associated with type 2 diabetes and its quantitative metabolic traits [1, 2]. The majority of these variants were discovered using hypothesis-free methods such as genome-wide association studies (GWAS) and other targeted, massively parallel genotyping technologies such as Metabochip. To justify the extraordinarily high costs of genetic association studies, which now run into billions of euros [3], population geneticists have argued that the discoveries emerging from these studies would (1) elucidate the biological basis of complex disease; (2) improve targeted prevention and treatment of disease; and (3) improve the predictive accuracy of conventional risk prediction algorithms. Whilst genetic association studies have indeed opened up many interesting leads concerning the biology of type 2 diabetes, they have done little to improve the prediction, prevention or treatment of the disease.

Findings from several prospective cohort studies [4–6] and clinical trials [7] of older adults have been published that document the predictive accuracy of type 2 diabetes genetic risk scores and compare these with conventional and complex non-genetic risk prediction algorithms. Disappointingly, these studies show that common genetic variants do very little to improve the predictive accuracy of non-genetic risk scores comprised of variables such as age, BMI, sex and diabetes family history. However, some prospective cohort studies indicate that as time to event increases, so too does the predictive accuracy of genetic risk scores, whilst the accuracy of non-genetic clinical risk algorithms decreases [4–6]. These findings are reinforced by studies in Pima Indians [8] and Finns [9], which show that type 2 diabetes and related metabolic traits have a substantially larger heritable component at younger than at older ages. Some have reasoned that if the relationship between age and genetic risk for type 2 diabetes is linear, obtaining genetic information early in life might help guide preventive interventions, long before the clinical signs and symptoms of diabetes emerge.

In this issue of Diabetologia [10], Vassy and coworkers hypothesised that genetic risk scores ascertained early in life would have greater predictive accuracy for type 2 diabetes than when modelled in older adult cohorts. The authors used data from the Coronary Artery Risk Development in Young Adults (CARDIA) [11] study to examine whether 38 reliably associated common genetic variants, derived mainly from cross-sectional adult case–control studies [1], predicted the onset of type 2 diabetes (see Fig. 1). The CARDIA study is comprised of 5,115 black and white adolescents and young adults (18–30 years at baseline) from the USA. Baseline assessments were performed from 1985 to 1986, and participants were followed, on average, for nearly 24 years, with 215 cases of diabetes detected during this time. When modelled in aggregate, each unit (allele) of the genetic risk score corresponded with a 9% (95% CI 5%, 13%) increased hazard of incident diabetes on average. There was no material increase in the C statistic when the genetic risk score was added to conventional clinical risk prediction models, and there was little gain in net reclassification (maximum net reclassification index of 0.285). The conclusion from these results is that a genetic risk score informed by existing data on common variant associations from adult cross-sectional studies does not meaningfully improve the predictive accuracy of conventional, non-genetic risk prediction algorithms for incident type 2 diabetes in early adulthood. Of note, the authors have conducted a second study in which they tested the same hypothesis in an even younger cohort (14 years old at baseline) from the Bogalusa Heart Study [12], which showed similar results, reinforcing the veracity of these null findings. 
However, before writing off the possibility that type 2 diabetes genetic risk scores derived in early life are of clinical value, we should consider whether methodological differences between the study by Vassy et al and those alluded to above might explain the disappointing results.
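To make the per-allele hazard ratio reported above more concrete, the following minimal sketch shows how an allele-count genetic risk score could be built and how a 9% per-allele increase compounds across alleles. It assumes an unweighted score (a simple sum of risk alleles) and a log-linear, Cox-style relationship between the score and the hazard; the function names and the genotype vectors are hypothetical and are not taken from the study by Vassy et al.

```python
def unweighted_grs(risk_allele_counts):
    """Sum the number of risk alleles (0, 1 or 2) carried at each SNP."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must contribute 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


def relative_hazard(score_a, score_b, per_allele_hr=1.09):
    """Hazard of person A relative to person B.

    Assumes every additional risk allele multiplies the hazard by the same
    factor (per_allele_hr), i.e. the score enters the model log-linearly.
    """
    return per_allele_hr ** (score_a - score_b)


if __name__ == "__main__":
    # Hypothetical genotypes at 38 loci for two people (illustrative only).
    person_a = [2] * 7 + [1] * 31   # 45 risk alleles
    person_b = [1] * 35 + [0] * 3   # 35 risk alleles
    score_a = unweighted_grs(person_a)
    score_b = unweighted_grs(person_b)
    print(score_a, score_b, round(relative_hazard(score_a, score_b), 2))
```

Under these assumptions, a difference of ten risk alleles corresponds to a relative hazard of about 2.4 (1.09 raised to the power of 10). A hazard ratio of this size between individuals at opposite ends of the score distribution can coexist with a negligible change in the C statistic, because most people carry scores close to the population mean.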

Fig. 1 Comparison of odds ratios (blue bars) from published type 2 diabetes genetic association studies and hazard ratios (red bars) from Vassy et al [10] for 38 established type 2 diabetes loci. (a) Nominally statistically significant effect and contrasting risk allele; (b) data only available in white participants

Could characteristics that vary between the cohorts studied here and previously explain the incompatible results?

The Vassy et al study focused on a cohort of white and black participants from the USA, born 1955–1973. The Lyssenko et al study focused on two cohorts of white participants from Sweden (n = 16,061) and Finland (n = 2,770) born in the first half of the 20th century [5]. DNA was obtained at a follow-up exam in the Swedish cohort; importantly, DNA was unavailable in those who had died between baseline and follow-up, which may have affected the observed genotype frequencies. The studies from Meigs et al [4] and de Miguel-Yanes et al [6] were conducted in the Framingham Offspring Study, a cohort of predominantly white participants born 1930–1940 in Massachusetts, USA. Differences in birth era, cultural factors and genetic substructure of the cohorts described above might influence the comparability of results, owing to gene × environment interactions or differences in population substructure. One interesting aspect of the study by Vassy et al is that although some of the individual SNP association tests are likely to be underpowered (and thus yield false-negative results), the risk alleles for other SNPs contrast those reported in published studies comprised primarily of older adults (i.e. the diabetes risk raising allele reported previously appears to lower diabetes risk in the study by Vassy et al). Some of these differences will be attributable to random variation, but others highlight the possibility of cohort-dependent effects, which may include gene × age interactions. For example, the alleles at PROX1 rs340874, HCCA2 rs2334499 and KCNJ11 rs5219 that raise diabetes risk in the published literature [1] were associated with lower diabetes risk in the Vassy et al study; because these effects were nominally statistically significant, the risk allele differences for these loci are unlikely to be due to random variation and may represent genuine cohort-specific effects.

Importantly, before concluding that gene × age interactions do indeed exist for PROX1, HCCA2 and KCNJ11, formal statistical tests of interaction would need to confirm this in one or more adequately powered and appropriately designed prospective cohort studies. Moreover, observations of gene × environment interactions in epidemiological studies provide little or no evidence of causal effects, as methodological factors such as heteroscedasticity, confounding, reverse causality, and the scale on which data are presented are frequent non-causal explanations for statistically significant findings on gene × environment interactions, even for interaction effects that have been replicated in independent cohorts. Thus, whilst observations of interactions may generate intriguing hypotheses, experimental studies are often necessary to test those hypotheses and congruent biological explanations should always be pursued.
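The point about scale can be illustrated with a small worked example. The sketch below, using entirely hypothetical risks for the four combinations of a risk genotype and an exposure (here, age group), computes an interaction contrast on the additive (risk difference) scale and on the multiplicative (risk ratio) scale. The function name and the numbers are assumptions chosen for illustration, not estimates from any study.

```python
def interaction_contrasts(r00, r10, r01, r11):
    """Return (additive, multiplicative) interaction contrasts.

    r00: risk with neither the risk genotype nor the exposure
    r10: risk with the risk genotype only
    r01: risk with the exposure only
    r11: risk with both

    additive       = r11 - r10 - r01 + r00   (0 means no additive interaction)
    multiplicative = RR11 / (RR10 * RR01)    (1 means no multiplicative interaction)
    """
    additive = r11 - r10 - r01 + r00
    multiplicative = (r11 / r00) / ((r10 / r00) * (r01 / r00))
    return additive, multiplicative


if __name__ == "__main__":
    # Hypothetical risks: genotype and exposure each double the risk.
    additive, multiplicative = interaction_contrasts(0.05, 0.10, 0.10, 0.20)
    print(f"additive contrast: {additive:.3f}")
    print(f"multiplicative contrast: {multiplicative:.3f}")
```

With these hypothetical risks the joint effect is exactly the product of the two separate effects, so there is no interaction on the multiplicative scale, yet the additive contrast is 0.05 rather than zero. The same data therefore show or do not show an interaction depending solely on the scale chosen.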

Interestingly, existing evidence suggests that two of the loci outlined above may have age-dependent effects on the risk of diabetes per se. PROX1, for example, is a co-repressor of HNF4A, and mutations in the latter gene cause MODY type 1 [13]. Similarly, mutations in KCNJ11, one of the first type 2 diabetes loci to be discovered, have long been associated with permanent neonatal diabetes [14] and very recently with MODY [15].

Should we expect the genetic risk factors for younger and older onset type 2 diabetes to be the same?

By virtue of the study design, participants with diabetes in the Vassy et al cohort were considerably younger at diagnosis (mean age 44.1 years) than those in previous prospective cohort studies and cross-sectional case–control studies of type 2 diabetes (∼65 years). Type 2 diabetes is diagnosed by excluding other known causes of chronically elevated blood glucose concentrations, which probably captures a range of underlying pathologies all lumped together under a single diagnosis; the molecular defects that cause each specific pathology may differ in frequency by diabetes age of onset. Thus, the genetic markers of these molecular defects may also differ in younger compared with older diabetic cohorts. This is important, as the genetic markers studied by Vassy et al were discovered primarily through cross-sectional case–control studies of older adults.

Should we expect genetic markers of type 2 diabetes discovered in cross-sectional studies to predict incident diabetes?

Almost without exception, cross-sectional studies have been used to discover novel loci for type 2 diabetes [1]. Whilst cross-sectional case–control studies are generally more accessible than prospective studies, making them attractive for large-scale investigations, one should not expect both study designs to yield identical results. Indeed, recent results from the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium support this notion [16]. One reason is that estimates of risk and discriminative accuracy can be considerably higher in cross-sectional than in prospective studies; in the former, for example, cases and controls are sometimes selected so as to maximise power to detect differences in genotype frequencies between the two groups, which may also inflate risk estimates (a source of spectrum bias). Another often overlooked issue is that with type 2 diabetes and other fairly prevalent diseases, odds ratios derived from case–control studies that are not appropriately matched are prone to be larger than risk estimates such as rate ratios or incidence density ratios derived from prospective cohort studies [17, 18]. Thus, for a variety of methodological reasons, the often-cited odds ratios for type 2 diabetes loci from consortia such as the Diabetes Genetics Replication and Meta-analysis Consortium (DIAGRAM) [1] should not automatically be considered analogous to hazard rate ratios derived from prospective studies.
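The divergence between odds ratios and ratio measures of risk for prevalent diseases follows from simple arithmetic. The sketch below converts a given risk ratio into the corresponding odds ratio at different baseline risks. It uses the cumulative risk ratio as a simplified stand-in for the rate ratios and incidence density ratios discussed above, and the function names and input values are hypothetical.

```python
def odds(probability):
    """Convert a probability (strictly between 0 and 1) to odds."""
    return probability / (1.0 - probability)


def odds_ratio_from_risk_ratio(baseline_risk, risk_ratio):
    """Odds ratio implied by a risk ratio at a given baseline risk.

    baseline_risk: disease risk in non-carriers of the risk allele
    risk_ratio:    risk in carriers divided by risk in non-carriers
    """
    carrier_risk = baseline_risk * risk_ratio
    if not (0.0 < baseline_risk < 1.0 and 0.0 < carrier_risk < 1.0):
        raise ValueError("risks must lie strictly between 0 and 1")
    return odds(carrier_risk) / odds(baseline_risk)


if __name__ == "__main__":
    # Same hypothetical risk ratio of 1.2, at a rare and a common baseline risk.
    for baseline in (0.01, 0.30):
        implied = odds_ratio_from_risk_ratio(baseline, 1.2)
        print(f"baseline risk {baseline:.2f}: odds ratio {implied:.3f}")
```

For a rare outcome the odds ratio stays close to the risk ratio, whereas at a baseline risk of 30% a risk ratio of 1.2 corresponds to an odds ratio of about 1.31. This is one arithmetic reason why published odds ratios need not match ratio estimates from prospective cohorts, quite apart from differences in sampling and case selection.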

Whilst the study by Vassy et al will be interpreted by many as adding another nail to the coffin of genetic risk prediction models for type 2 diabetes, it is important to understand why their results do not fit with the expectation that genetic risk models perform better at younger ages. It is also important to bear in mind that the genetic risk prediction models for type 2 diabetes examined to date have focused on common gene variants, and we simply cannot conclude at this stage whether rare variants will or will not be clinically useful for prediction. We should also consider that, if the risk alleles for specific loci truly vary by age (the risk alleles for 11 of the 38 SNPs studied in the paper by Vassy et al contrast those reported in the published literature, see Fig. 1), genetic risk algorithms derived in adulthood will be inappropriate for use in younger populations, and algorithms that are specific to this younger age group, where effect alleles are coded appropriately, will be required. The study by Vassy et al provides useful information in this regard, which may inform the development of risk prediction algorithms for future studies of young onset type 2 diabetes.