Impact of nine common type 2 diabetes risk polymorphisms in Asian Indian Sikhs: PPARG2 (Pro12Ala), IGF2BP2, TCF7L2 and FTO variants confer a significant risk

Background: Recent genome-wide association (GWA) studies have identified several unsuspected genes associated with type 2 diabetes (T2D) with previously unknown functions. In this investigation, we examined the role of the nine most significant SNPs reported in GWA studies: peroxisome proliferator-activated receptor gamma 2 (PPARG2; rs1801282); insulin-like growth factor two binding protein 2 (IGF2BP2; rs4402960); cyclin-dependent kinase 5, a regulatory subunit-associated protein 1-like 1 (CDK5; rs7754840); a zinc transporter and member of solute carrier family 30 (SLC30A8; rs13266634); a variant found near cyclin-dependent kinase inhibitor 2A (CDKN2A; rs10811661); hematopoietically expressed homeobox (HHEX; rs1111875); transcription factor-7-like 2 (TCF7L2; rs10885409); potassium inwardly rectifying channel subfamily J member 11 (KCNJ11; rs5219); and fat mass obesity-associated gene (FTO; rs9939609).
Methods: We genotyped these SNPs in a case-control sample of 918 individuals consisting of 532 T2D cases and 386 normal glucose tolerant (NGT) subjects of an Asian Sikh community from North India. We tested the association between T2D and each SNP using unconditional logistic regression before and after adjusting for age, gender, and other covariates. We also examined the impact of these variants on body mass index (BMI), waist to hip ratio (WHR), fasting insulin, and glucose and lipid levels using multiple linear regression analysis.
Results: Four of the nine SNPs revealed a significant association with T2D after adjusting for age, sex and BMI: PPARG2 (Pro12Ala) [odds ratio (OR) 0.12; 95% confidence interval (CI) (0.03–0.52); p = 0.005], IGF2BP2 [OR 1.37; 95% CI (1.04–1.82); p = 0.027], TCF7L2 [OR 1.64; 95% CI (1.20–2.24); p = 0.001] and FTO [OR 1.46; 95% CI (1.11–1.93); p = 0.007]. Multiple linear regression analysis revealed a significant association of two of the nine investigated loci with diabetes-related quantitative traits. The 'C' (risk) allele of CDK5 (rs7754840) was significantly associated with decreased HDL-cholesterol levels in both the NGT (p = 0.005) and combined (NGT and T2D) (p = 0.005) groups. The less common 'C' (risk) allele of TCF7L2 (rs10885409) was associated with increased LDL-cholesterol (p = 0.010) in NGT subjects and with increased total and LDL-cholesterol levels (p = 0.008; p = 0.003, respectively) in the combined cohort.
Conclusion: To our knowledge, this is the first study reporting the role of some recently emerged loci with T2D in a high-risk population of Asian Indian origin. Further investigations are warranted to understand the pathway-based functional implications of these important loci in T2D pathophysiology in different ethnicities.

Background
Type 2 diabetes (T2D) is a common disease characterized by insulin resistance and reduced insulin secretion. It has become a worldwide health problem, and the underlying molecular mechanisms involved in the development of diabetes remain poorly understood. Linkage and candidate gene studies have been highly successful in identifying genes for monogenic and syndromic forms of diabetes [1]. However, this approach has yielded limited results in mapping the genes for T2D [2][3][4]. More recently, genome-wide association (GWA) studies using high-density SNPs across the genome and complex genetic analyses have identified several genes with previously unknown functions [5][6][7][8][9].
In this investigation, we have examined the role of the nine most significant loci previously reported to be associated with T2D in Caucasian populations [5,6,10,11,12] in a diabetic case-control cohort of Khatri Sikhs obtained from North India. Our study is focused on the peroxisome proliferator-activated receptor gamma 2 (PPARG2; Pro12Ala; rs1801282), insulin-like growth factor two binding protein 2 (IGF2BP2; rs4402960), cyclin-dependent kinase 5, a regulatory subunit-associated protein 1-like 1 (CDK5; rs7754840), a zinc transporter and member of solute carrier family 30 (SLC30A8; rs13266634), a variant found near cyclin-dependent kinase inhibitor 2A (CDKN2A; rs10811661), hematopoietically expressed homeobox (HHEX; rs1111875), transcription factor-7-like 2 (TCF7L2; rs10885409), potassium inwardly rectifying channel subfamily J member 11 (KCNJ11; rs5219), and a recently discovered fat mass obesity-associated gene (FTO; rs9939609). The TCF7L2 gene is the most frequently replicated adult-onset T2D susceptibility gene [8]. Some of the TCF7L2 variants have been found to be strongly associated with T2D in Asian Indian populations from South [13,14] and South West India [14], and also in our North Indian cohort [15]. In this study, we have used the most significant TCF7L2 SNP associated with T2D in Sikhs for comparisons. Since the replication of positive association in multiple independently ascertained datasets from different ethnic groups is crucial for identifying the true population-specific susceptibility variants, we have examined the role of polymorphisms of these nine genes in our diabetes case-control cohort of Asian Indian Sikhs.

Human Subjects
The study subjects are part of the ongoing Sikh Diabetes Study (SDS) [16]. The focus of this study is on an endogamous community of Khatri Sikhs living in the Northern states of India, including Punjab, Haryana, Himachal Pradesh, Delhi, and Jammu & Kashmir. The DNA and serum samples of 532 T2D cases (299 male, 233 female) and 386 normal glucose tolerant (NGT) subjects (184 male, 202 female) were used in this investigation. Of the 532 cases, 324 are from family material (one index case from each family) and the remaining 208 are unrelated T2D cases from the same Sikh community. The cases were 25 years or older, with a mean age at the time of recruitment (mean ± SD) of 54.2 ± 11.0 yrs. The diagnosis of T2D was confirmed by scrutinizing medical records for symptoms, use of medication, and measuring fasting glucose levels following the guidelines of the American Diabetes Association [17]. Confirmation required a medical record indicating either (1) a fasting plasma glucose level ≥126 mg/dl (≥7.0 mmol/l) after a minimum 12-h fast or (2) a 2-h post-glucose level [2-h oral glucose tolerance test (OGTT)] ≥200 mg/dl (≥11.1 mmol/l) on more than one occasion with symptoms of diabetes. IGT was defined as a fasting plasma glucose level ≥100 mg/dl (5.6 mmol/l) but ≤126 mg/dl (7.0 mmol/l) or a 2-h OGTT ≥140 mg/dl (7.8 mmol/l) but ≤200 mg/dl (11.1 mmol/l). In the absence of medical record information, we confirmed self-reported T2D cases by performing a 2-h OGTT. The 2-h OGTTs were performed following the criteria of the World Health Organization (WHO) (75 g oral load of glucose). Body mass index (BMI) was calculated as weight (kg)/[height (m)]², and waist-hip ratio (WHR) was calculated as the ratio of abdomen or waist circumference to hip circumference.
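The anthropometric indices and glucose cut-offs described above can be illustrated with a minimal Python sketch. The function names (`bmi`, `whr`, `classify_glycemia`) and the returned labels are hypothetical and are not part of the study protocol. The classifier applies the quoted mg/dl thresholds to a single measurement only, whereas the study also required medical records, symptoms, and repeated measurements.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def whr(waist, hip):
    """Waist-hip ratio; both circumferences must use the same unit."""
    return waist / hip


def classify_glycemia(fasting_mg_dl=None, ogtt_2h_mg_dl=None):
    """Apply the mg/dl cut-offs quoted in the text to single measurements.

    Illustrative simplification: the T2D thresholds are checked first, so a
    boundary value (e.g. fasting glucose of exactly 126 mg/dl) is labeled
    T2D rather than IGT.
    """
    if fasting_mg_dl is None and ogtt_2h_mg_dl is None:
        raise ValueError("at least one glucose value is required")

    # T2D: fasting >= 126 mg/dl or 2-h OGTT >= 200 mg/dl
    if fasting_mg_dl is not None and fasting_mg_dl >= 126:
        return "T2D"
    if ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl >= 200:
        return "T2D"

    # IGT (as defined in the text): fasting >= 100 mg/dl or 2-h OGTT >= 140 mg/dl
    if fasting_mg_dl is not None and fasting_mg_dl >= 100:
        return "IGT"
    if ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl >= 140:
        return "IGT"

    return "below IGT thresholds"
```

For example, `bmi(70, 1.75)` evaluates to roughly 22.9 kg/m², and `classify_glycemia(fasting_mg_dl=105)` returns `"IGT"` under these assumptions.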
The NGT subjects who participated in this study were from the same Khatri Sikh community from which the T2D patients were recruited [16]. The majority of the subjects were recruited from the state of Punjab in Northern India. Individuals of South, East and Central Indian origin, or with type 1 diabetes or a family member with type 1 diabetes, rare forms of T2D called maturity-onset diabetes of the young (MODY), and secondary diabetes (e.g., hemochromatosis, pancreatitis) were excluded from the study. Of the 386 controls, 180 were normal spouses of diabetic patients, and the remaining 206 were unrelated controls who had no first-degree relative affected with T2D, but had other chronic illnesses such as hypertension, coronary heart disease or arthritis. The selection of controls was based on a fasting glycemia < 110 mg/dl (< 6.0 mmol/l) or a 2-h glucose < 140 mg/dl (< 7.8 mmol/l). Control subjects with HbA1c levels > 6.5% were excluded. The average age of NGT controls (mean ± SD) was 50.5 ± 13.8 yrs. Additionally, 22 subjects with IGT were excluded from analyses. In general, Sikhs do not smoke for religious and cultural reasons, and about 50% of them were lifelong vegetarians. Clinical characteristics of the SDS subjects used for this investigation are summarized in Table 1. All blood samples were obtained at the baseline visit and all participants provided written informed consent for investigations. All SDS protocols and consent documents were reviewed and approved by the University of Oklahoma and the University of Pittsburgh Institutional Review Boards as well as the Human Subject Protection Committees at the participating hospitals and institutes in India. A memorandum of understanding and material transfer agreements for sample sharing and intellectual property rights were signed between the collaborating Indian and US institutes.
The study was approved by the Indian Council of Medical Research, New Delhi and Health Ministry Screening Committee, Union Ministry of Health and Family Welfare, Government of India.

SNP Genotyping
DNA was extracted from buffy coats using QiaAmp blood kits (Qiagen, Chatsworth, CA) or by the salting-out procedure [18]. Genotyping for all investigated SNPs was performed using TaqMan Pre-Designed or TaqMan Made-to-Order SNP genotyping assays from Applied Biosystems (ABI, Foster City, USA). TaqMan genotyping reactions were performed on an ABI 7900 genetic analyzer using 2 µl of genomic DNA (10 ng/µl) following the manufacturer's instructions. Fluorescence was detected on an ABI PRISM 7900HT sequence detection system (ABI, Foster City, USA). Genotypes were scored by analyzing data on both real-time and allele discrimination assay platforms using the SDS software provided by ABI. Data quality for SNP genotyping was checked by establishing reproducibility of control DNA samples. For quality control, 30 replicate positive controls and 8 negative controls were included in each run to check concordance, and the discrepancy in concordance was < 0.2%. The genotyping success rate ranged from 96% to 99.8% for all the investigated SNPs except CDKN2A (rs10811661), for which the success rate was 88.5% due to technical difficulties.

Statistical Analysis
We evaluated Hardy-Weinberg equilibrium using a one-degree-of-freedom goodness-of-fit test (Pearson chi-square test) separately among cases and controls. The allele frequencies in T2D cases were compared to those in controls using the chi-square test or Fisher's exact probability test, where appropriate. The association between case-control status and each individual SNP, measured by the odds ratio (OR) and its corresponding 95% confidence limits, was estimated using unconditional logistic regression before and after adjusting for age, gender, and other covariates. Association analyses were performed assuming a dominant, recessive and co-dominant effect for each polymorphism using SNPassoc [19]. In all analyses, the common homozygote genotype in the control population was defined as the reference category. The likelihood ratio test was used to test the effect of each SNP at the nominal 5% significance level. Akaike's information criterion [20] was also used to select the best genetic model for each SNP. We also performed multiple linear regression analysis to examine the impact of these variants on quantitative risk variables of T2D, including fasting insulin, glucose, and lipid levels. Skewed variables for the continuous traits (e.g., triglycerides, cholesterol, LDL, glucose) were log-transformed before statistical comparisons. Significant covariates for each dependent trait were identified by using Spearman's correlation and step-wise multiple linear regression with an overall 5% level of significance.
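As an illustration of three of the steps described above, the following Python sketch shows a one-degree-of-freedom Pearson chi-square test for Hardy-Weinberg equilibrium, the coding of genotypes under dominant, recessive and co-dominant models with the common homozygote as reference, and the standard Akaike information criterion formula. It is a simplified sketch using only the standard library. It is not the SNPassoc interface used in the study, and all function names are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (AA, AB, BB).
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def code_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 minor alleles) as regression predictor(s).

    The common homozygote (0 minor alleles) is the reference category.
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "dominant":      # carriers vs. common homozygotes
        return int(minor_allele_count >= 1)
    if model == "recessive":     # rare homozygotes vs. all others
        return int(minor_allele_count == 2)
    if model == "codominant":    # two indicators: (heterozygote, rare homozygote)
        return (int(minor_allele_count == 1), int(minor_allele_count == 2))
    raise ValueError(f"unknown model: {model}")


def aic(log_likelihood, n_params):
    """Akaike information criterion; a lower value indicates a better model."""
    return 2 * n_params - 2 * log_likelihood
```

With hypothetical counts of 30, 40 and 30 for the three genotypes, `hwe_chi_square(30, 40, 30)` gives a statistic of 4.0 and a p-value of about 0.046, which would indicate a deviation from equilibrium at the 5% level.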

Results
We genotyped nine SNPs, including the six SNPs from newly emerged gene regions recently identified by multiple GWA studies [5][6][7]9,21], in a case-control cohort of Asian Indian Sikhs consisting of 918 subjects, including 532 T2D cases and 386 NGT controls matched for ethnicity and geographic origin. To explore the possible mechanisms by which these variants affect T2D susceptibility, we tested the association of these SNPs with the T2D-related quantitative phenotypes (height, weight, BMI, WHR, insulin, creatinine, total cholesterol, LDL-cholesterol, VLDL-cholesterol, HDL-cholesterol, and triglycerides) using multiple linear regression analysis in NGT subjects after adjusting for the effects of age, sex, BMI, disease status, and medication. As shown in Table 3, the 'C' (risk) allele of the CDK5 variant was significantly associated with decreased HDL-cholesterol levels in the NGT (p = 0.005) as well as the combined (NGT+T2D) sample (p = 0.005). The less common 'C' (risk) allele of the TCF7L2 polymorphism also showed a significant association with increased levels of LDL-cholesterol in the NGT sample (p = 0.010) and of total and LDL-cholesterol in the combined sample (p = 0.008; p = 0.003, respectively).

Discussion
In this investigation, we have carried out an unbiased replication of SNPs associated with T2D susceptibility in recently published multiple GWA studies [5][6][7][8] in our independent dataset of Asian Indian origin. Two of these SNPs belong to the highly replicated biological candidate genes PPARG2 (Pro12Ala; rs1801282) and KCNJ11 (Glu23Lys; rs5219), which produced unequivocal evidence of their involvement in T2D in several independent studies performed in different data sets [22,23]. Moreover, these two genes are also targets for anti-diabetic therapies. PPARG2 encodes a transcription factor involved in adipocyte differentiation and is the target of the thiazolidinedione class of drugs used for T2D [24], and KCNJ11 is the target of the sulphonylurea class of drugs extensively used for treating neonatal diabetes [25,26]. The remaining seven variants (IGF2BP2, CDK5, SLC30A8, CDKN2A, HHEX, TCF7L2 and FTO) evaluated in this study are from the recently identified gene regions that provide convincing evidence of their involvement in T2D in fine mapping and multiple GWA studies.
We have identified a significant association of PPARG2 (rs1801282; p = 0.005), IGF2BP2 (rs4402960; p = 0.027) and FTO (rs9939609; p = 0.007) variants with T2D along with the TCF7L2 variant (rs10885409; p = 0.001) in our Asian Indian population. The protective association of the less common 'Ala' allele at codon 12 of the PPARG2 gene has been consistently reproduced in multiple independent studies conducted in different populations [22,27,28,29,30,31], with a few exceptions in small studies, e.g., Oji-Cree [32] and Czech [33], where the Ala allele was shown to be associated with T2D risk. The protective association of the Ala allele was also not confirmed in a South Indian population [34]. The observed difference of association within Asian Indian populations is important in view of the extensive diversity present between Indian populations [35][36][37]. IGF2BP2 belongs to a family of three mRNA binding proteins; the human growth hormone gene and insulin like growth factor 1 and 2 genes. It is involved in pancreas development, growth and stimulation of insulin action [38,39]. We have replicated a significant association of the IGF2BP2 (rs4402960) SNP reported earlier in three different GWA studies [5,11,12], a meta-analysis of three GWA studies [7], and also in a Japanese population [40]. However, no association of this variant was observed with any quantitative trait related to glucose homeostasis or lipid metabolism.
The FTO gene, whose function is unknown, was originally cloned following the identification of the fused toe mutant mouse [41]. Recently, recombinant murine Fto has been shown to catalyze nucleic acid demethylation as a member of the Fe(II)- and 2-oxoglutarate (2OG)-dependent dioxygenases [42,43]. This locus is associated with childhood and adult obesity in several European and US populations [9,44,45]. The FTO variant (T→A) (rs9939609) has also been shown to be strongly associated with T2D risk (OR 1.27; p = 5 × 10⁻⁸) in a GWA scan performed in a UK population [9]. Our study also confirmed a significant association of the FTO (rs9939609) variant with T2D in this Asian Indian sample. However, the mechanism by which it relates to obesity and T2D in humans is still unclear. Studies have shown that the T2D risk associated with the 'A' allele of this variant (rs9939609) is strongly mediated by BMI, and the variant has been shown to potentially affect mass rather than height [9]. In contrast, in this Asian Indian sample, the T2D risk does not seem to be mediated by BMI or weight, because the risk associated with this variant did not disappear after accounting for BMI, unlike what was reported in other studies [9,40]. Ethnic differences may be responsible for this discrepancy. Moreover, BMI is not considered an accurate measure of obesity, especially in populations from South Asia where, at a given BMI, muscle mass is low and visceral and subcutaneous fat is increased [46][47][48]. Therefore, measures of BMI or waist and hip may not reveal accurate estimates of total fat.
The importance of TCF7L2 as a diabetes gene is now very well established across ethnicities, but many more genes for diabetes, especially in non-white ethnicities, still remain to be identified. It is interesting to observe that the risk alleles found to be associated with T2D in this Asian Indian population at the four significant SNPs (PPARG2, IGF2BP2, TCF7L2 and FTO) were consistent with those reported in GWA studies in Caucasians and Japanese [5,7,11,40]. However, it is still unclear how these loci contribute to T2D risk, as none of these variants was found to influence insulin or fasting glucose levels in this study. More functional studies would be needed to clearly define the role of these variants in glucose homeostasis. Perhaps the causal variants in these genes controlling insulin secretion and insulin resistance are yet to be identified. With regard to the remaining loci, our study did not confirm the previously reported association of variants (KCNJ11, CDK5, SLC30A8, CDKN2A and HHEX) when each SNP was analyzed separately. With the exception of CDK5 and TCF7L2, no other gene variant showed a significant association with any diabetes-related quantitative trait. The 'C' (risk) allele of the CDK5 variant was associated with significantly reduced HDL levels (p = 0.005) in this sample. The possibility of false positive results in our sample cannot be ruled out because of the relatively small size of our cohort. Also, while our manuscript was being reviewed, two large meta-analyses of lipid traits in subjects with coronary artery disease and T2D were published that did not report this locus as affecting lipid traits in Caucasians [49,50]. However, as we previously reported [15], five SNPs in the TCF7L2 gene (rs7901695, rs7903146, rs11196205, rs10885409, and rs12255372) revealed significantly elevated gene-dosage effects on total cholesterol, LDL cholesterol and VLDL cholesterol levels in this Khatri Sikh sample.
It is therefore possible that the observed association with quantitative lipid levels could be population-specific and may not be present in other ethnicities. Replication in a larger sample of Asian Indian origin would be required to confirm these findings.
On the other hand, the absence of a positive association with T2D at the remaining loci may suggest that the genetic variation at some loci could be restricted to a particular genetic background and environment, and that the contribution of these loci in the Indian population may be minor. Alternatively, different SNPs within the same gene could be contributing to the risk and could be population-specific. Therefore, further SNP screening and deep sequencing of these genes would be required to identify putative ethnicity-specific functional variants in these novel candidates.

Conclusion
Our study in Asian Indian Sikhs has successfully replicated the association of four of nine loci recently reported to be associated with T2D in Caucasians. To our knowledge, this is the first report describing the role of some recently emerged loci with T2D in a population of Asian Indian origin from North India. Further investigations are warranted to cross-validate these findings in other larger datasets. Functional studies would help to understand the pathway-based implications of these important loci in T2D pathophysiology in different ethnicities.