Genetic Susceptibility to Gestational Diabetes Mellitus in a Chinese Population

Introduction: New genetic variants associated with susceptibility to obesity and metabolic diseases have been discovered in recent genome-wide association (GWA) studies. The aim of this study was to investigate the association of these risk variants with gestational diabetes mellitus (GDM). Methods: We performed a case-control study including 964 unrelated pregnant women with GDM and 1,021 pregnant women with normal glucose tolerance (as controls). A total of 33 genetic variants confirmed by GWA studies for obesity and metabolic diseases were selected and genotyped. Results: We observed that FTO rs1121980 and KCNQ1 rs163182 conferred a decreased GDM risk in the dominant and additive models [additive model: OR (95% CI) = 0.79 (0.67–0.94), P = 0.007 for rs1121980; OR (95% CI) = 0.84 (0.73–0.96), P = 0.009 for rs163182], whereas MC4R rs12970134 and PROX1 rs340841 conferred an increased GDM risk in the dominant, recessive, and additive models [additive model: OR (95% CI) = 1.25 (1.07–1.46), P = 0.006 for rs12970134; OR (95% CI) = 1.22 (1.07–1.39), P = 0.002 for rs340841]. With an increasing number of risk alleles of the four significant SNPs, GDM risk increased significantly in a dose-dependent manner (P trend < 0.001). The significant positive association between the weighted genetic risk score and the risk of GDM persisted. Further functional annotation indicated that these four SNPs may fall within functional elements of human pancreatic islets. The genotype-phenotype associations indicated that these SNPs may contribute to GDM by affecting the expression levels of nearby or distant genes. Conclusion: Our study suggests that FTO rs1121980, KCNQ1 rs163182, MC4R rs12970134, and PROX1 rs340841 may be markers for susceptibility to GDM in a Chinese population.


INTRODUCTION
Gestational diabetes mellitus (GDM) is a major public health problem, affecting about 5-10% of all pregnancies (1). Globally, the prevalence of GDM has increased dramatically in the past three decades, following worldwide trends of increasing obesity and type 2 diabetes (T2D) (2). GDM has important clinical implications because of its associated adverse neonatal outcomes (3) and the increased risk of long-term complications, including obesity and impaired glucose metabolism, in both the mother and infant (4). Additionally, women with a high body mass index (BMI) and a family history of diabetes may be at increased risk of GDM (5). Given these connections, it is plausible to hypothesize that GDM shares genetic susceptibility with T2D and obesity.
Accumulating evidence suggests that genetic factors play a role in GDM (6,7). Most genetic studies of GDM have been candidate gene studies, which have identified associations between T2D risk variants and GDM, thereby confirming the genetic similarity between GDM and T2D (8,9). In addition, it is widely acknowledged that FTO (the fat mass and obesity-associated gene) is related to BMI and obesity (10). Several studies have shown that some genetic variants in the FTO gene could contribute to the risk of GDM (11). However, not all T2D- and obesity-associated loci were associated with GDM, and these associations varied by race (12). Genome-wide association (GWA) studies have so far identified a large number of single nucleotide polymorphisms (SNPs) in different genes associated with susceptibility to obesity and metabolic diseases, including genetic variants in BDNF, FTO, GCK, GCKR, KCNQ1, MC4R, PROX1, and UBE2E2, among others (13-16). In this study, we hypothesized that some of these SNPs may also influence the development of GDM. To test this hypothesis and systematically evaluate the genetic similarity, we selected 33 SNPs in multiple genes and designed a case-control study of 964 GDM cases and 1,021 controls to assess the associations of the SNPs with GDM risk. Functional annotation of the significant SNPs was also conducted.

Study Subjects and Design
This study was approved by the ethics committee of Women's Hospital of Nanjing Medical University (NFY201608), and all methods were carried out according to the relevant guidelines. The enrolment of the subjects has been described in our previous paper (17).
In brief, this case-control study was based on a study population of about 80,000 women who participated in gestational complications screening between 2012 and 2015 at Women's Hospital of Nanjing Medical University. The GDM cases and controls were randomly selected from the screening population using a random number table. For all participants, a glucose challenge test (GCT) was conducted between 24 and 28 weeks of gestation. GDM cases were defined as pregnant women with fasting blood glucose ≥5.5 mmol/L or 2 h plasma glucose ≥8.0 mmol/L following a 75 g oral glucose tolerance test (OGTT) (18). Participants diagnosed with metabolic syndrome and related diseases before pregnancy were excluded from this study. Pregnant women without diabetes were included as controls. The controls were matched to GDM cases for age and pre-pregnancy body mass index (BMI). In total, 964 GDM patients and 1,021 controls agreed to participate in the study. All participants were unrelated ethnic Han Chinese. After written informed consent was obtained, each participant was interviewed using a structured questionnaire to collect demographic information and potential risk factors, such as age, parity, pre-pregnancy height and weight, family history of diabetes, and abnormal pregnancy history.

SNPs Selection and Genotyping
Based on data from the GWAS Catalog (http://www.ebi.ac.uk/gwas/), SNPs that reached a genome-wide significant association with obesity or metabolic diseases were considered for selection. The traits of obesity and metabolic diseases included BMI, obesity, T2D, metabolic syndrome, fasting glucose-related traits, 2 h glucose challenge, and so on. SNPs in hotspot susceptibility genes reported in multiple GWAS analyses with validation, or in GWAS meta-analyses, were given greater priority. Then, SNPs with a minor allele frequency (MAF) ≥ 0.05 in Han Chinese were selected. If several SNPs were in high linkage disequilibrium (r² > 0.8), only one SNP was genotyped. As a result, 33 SNPs reported in GWA studies were selected for genotyping (Supplementary Table 1). The flow chart of SNP selection is shown in Supplementary Figure 1.
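The two filters described above (a MAF threshold followed by removal of SNPs in high linkage disequilibrium) can be sketched in a few lines of Python. This is a minimal illustration only: the SNP identifiers, MAF values, and r² value are hypothetical, and the greedy rule of keeping the earlier (higher-priority) SNP of a linked pair is an assumption, since the text states only that one SNP per high-LD group was genotyped.

```python
def select_snps(candidates, ld_r2, maf_min=0.05, r2_max=0.8):
    """Filter candidate SNPs by MAF, then drop SNPs in high LD with a kept SNP.

    candidates: list of (snp_id, maf) tuples, ordered by priority (assumption).
    ld_r2: dict mapping frozenset({snp_id_1, snp_id_2}) to pairwise r^2;
           pairs that are absent are treated as unlinked (r^2 = 0).
    """
    kept = []
    for snp_id, maf in candidates:
        if maf < maf_min:
            continue  # fails the MAF >= 0.05 criterion
        in_high_ld = any(
            ld_r2.get(frozenset((snp_id, kept_id)), 0.0) > r2_max
            for kept_id in kept
        )
        if not in_high_ld:
            kept.append(snp_id)
    return kept


# Hypothetical inputs, for illustration only (not data from this study).
candidates = [("snpA", 0.30), ("snpB", 0.28), ("snpC", 0.02), ("snpD", 0.12)]
ld = {frozenset(("snpA", "snpB")): 0.95}

print(select_snps(candidates, ld))
```

Here `snpC` is dropped for low MAF and `snpB` is dropped because it is in high LD with the already-kept `snpA`.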
Genomic DNA was extracted from leukocyte microspheres by traditional proteinase K digestion, followed by phenol-chloroform extraction and ethanol precipitation. All SNPs were genotyped using the Sequenom MassARRAY iPLEX platform (Sequenom Inc., CA). Information on the primers is shown in Supplementary Table 2. Genotyping was conducted blind to the subjects' case or control status. Two negative controls were used for quality control in each 384-well plate, and more than 10% of samples were randomly selected for repeat genotyping, with a concordance rate of 100%. The genotyping success rates of these SNPs were all above 95.0%.

In silico Analysis
To further understand the function of the significant SNPs in the pathogenesis of GDM, the ENCODE database (http://genome.ucsc.edu/) and the Roadmap Epigenomics database (http://genomebrowser.wustl.edu/) were used to explore whether the SNPs were located in functional elements. In addition, the PhenoScanner database (http://www.phenoscanner.medschl.cam.ac.uk/) was used to investigate the genotype-phenotype associations of the significant SNPs and their associated high-LD SNPs (r² > 0.8 from the 1000 Genomes Project).

Statistical Analyses
The Student's t-test and χ²-test were used to compare selected characteristics and SNP genotype frequencies between the GDM cases and controls for continuous and categorical variables, respectively. Logistic regression analyses were used to calculate the odds ratios (ORs) and their 95% confidence intervals (CIs) to estimate the relationship between genotypes and GDM risk. The crude ORs were calculated by univariate logistic regression, while the adjusted ORs were calculated by multivariate logistic regression with adjustment for age, parity, pre-pregnancy BMI, family history of diabetes, and abnormal pregnancy history. Logistic regression analyses were conducted under different genetic models (dominant, recessive, and additive). For each SNP under the additive model, a value of 0 was assigned to the wild-type homozygote, 1 to the heterozygote, and 2 to the variant homozygote. This coding is designed to reveal associations that depend additively on the number of minor alleles: relative to the wild-type homozygote, the effect of the variant homozygote on the log odds of the outcome is twice that of the heterozygote. Similarly, for each SNP under the dominant model, a value of 0 was assigned to the wild-type homozygote and 1 to the heterozygote and the variant homozygote, while under the recessive model, a value of 0 was assigned to the wild-type homozygote and the heterozygote and 1 to the variant homozygote. Based on the results of the logistic regression analyses, the SNP allele that increased the risk of GDM was defined as the risk allele. All statistical analyses were performed using Stata Version 11.1 software (Stata, College Station, TX), and P < 0.05 in a two-sided test was considered statistically significant. To reduce the error caused by multiple comparisons, SNPs with P < 0.01 were selected for further detailed analysis.
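The genotype coding under the three genetic models can be written out explicitly, as in the Python sketch below. It also computes a crude OR with a Wald 95% CI from a 2x2 table, which for a single binary covariate (for example, carrier versus non-carrier under the dominant model) matches what univariate logistic regression returns. The function names and the counts are hypothetical and serve only as illustration; the adjusted ORs reported in this study require a multivariate regression package and are not reproduced here.

```python
import math

# Covariate value for a minor-allele count of 0, 1, or 2 under each model,
# following the coding described in the text.
MODEL_CODING = {
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
    "recessive": (0, 0, 1),
}


def encode_genotype(minor_allele_count, model):
    """Return the regression covariate for a genotype under the given model."""
    return MODEL_CODING[model][minor_allele_count]


def crude_odds_ratio(case_exposed, case_unexposed,
                     control_exposed, control_unexposed):
    """Crude OR with a Wald 95% CI from a 2x2 table (all cells must be > 0)."""
    odds_ratio = (case_exposed * control_unexposed) / (
        case_unexposed * control_exposed
    )
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / case_unexposed
        + 1 / control_exposed + 1 / control_unexposed
    )
    half_width = 1.96 * se_log_or
    ci_low = math.exp(math.log(odds_ratio) - half_width)
    ci_high = math.exp(math.log(odds_ratio) + half_width)
    return odds_ratio, ci_low, ci_high


# Hypothetical counts, for illustration only (not data from this study).
odds, ci_low, ci_high = crude_odds_ratio(20, 80, 10, 90)
print(f"OR = {odds:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```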
The χ²-based Q-test was used to evaluate the heterogeneity of associations between subgroups. In addition, statistical power analysis was performed using G*Power 3.1.9.2 with an alpha level of 5%.

Subject Characteristics
The selected characteristics of the 964 GDM patients and 1,021 controls are shown in Table 1. As expected, there were no significant differences in age and pre-pregnancy BMI between the two groups (P = 0.094 and 0.685, respectively). However, there were more multiparous women, women with abnormal pregnancy histories, and women with family histories of diabetes among the GDM cases than among the controls (P < 0.05 for all comparisons).

The Associations Between Candidate SNPs and GDM Risk
The associations of the 33 candidate SNPs with GDM risk were assessed using logistic regression analyses (Table 2). In the control subjects, the observed genotype frequencies of the 33 SNPs were all in Hardy-Weinberg equilibrium (P > 0.05 for all SNPs, Supplementary Table 3). As shown in Table 3, FTO rs1121980 and KCNQ1 rs163182 conferred a decreased GDM risk in the dominant and additive models, whereas MC4R rs12970134 and PROX1 rs340841 conferred an increased GDM risk in the dominant, recessive, and additive models. We then evaluated the combined effects on GDM by summing the number of risk alleles of the four significant SNPs (rs1121980-G, rs163182-G, rs12970134-A, and rs340841-T). The "0 allele" category refers to subjects carrying rs1121980 AA, rs163182 CC, rs12970134 GG, and rs340841 CC; "1-8 alleles" means 1-8 risk alleles of the four SNPs. We observed that with an increasing number of risk alleles of the four SNPs, GDM risk increased significantly in a dose-dependent manner (P trend < 0.001) (Table 4). Compared with subjects carrying "0-3 alleles," subjects carrying "6-8 alleles" had an 84% increase in GDM risk (adjusted OR = 1.84, 95% CI = 1.37-2.47, P < 0.001) (Table 4).
The combined effects of the four SNPs on GDM occurrence were also evaluated by stratifying by age, parity, pre-pregnancy BMI, family history of diabetes, and abnormal pregnancy history. No obvious evidence of heterogeneity in the combined effects of the four SNPs on GDM risk was observed across subgroups (Supplementary Table 4). We also conducted stratified analyses of the association of each of rs1121980, rs163182, rs12970134, and rs340841 with GDM susceptibility (Supplementary Table 5). Again, the association strengths were similar across subgroups, with no evidence of heterogeneity (P > 0.05).
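The Hardy-Weinberg check in controls mentioned above can be illustrated with a χ² goodness-of-fit test on genotype counts. The sketch below is one common way to perform such a check and is an assumption on our part, since the text does not state which test was used; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Returns (chi2, p_value) for the observed genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs, exp in zip(observed, expected)
        if exp > 0  # skip empty cells when one allele is absent
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(260, 500, 240)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 is read, as in the text, as no evidence of departure from Hardy-Weinberg equilibrium.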

Genetic Risk Score Calculation
A genetic risk score (GRS) was then calculated based on the number of risk alleles (FTO rs1121980-G, KCNQ1 rs163182-G, MC4R rs12970134-A, and PROX1 rs340841-T) carried by each individual. For each SNP, the natural logarithm of the OR was multiplied by the number of risk alleles (2, 1, or 0) to generate a weighted genotypic value; these values were then summed across loci. GDM risk as a function of GRS in our population is shown in Table 5. The mean GRS for cases was significantly higher than that for controls (P < 0.001). An analysis of risk by GRS quartile indicated that individuals in the highest quartile of GRS had a 1.94-fold risk of GDM compared with individuals in the lowest quartile.
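The score described above is a sum of ln(OR) weights multiplied by risk-allele counts. A minimal Python sketch follows. As an assumption, it takes the additive-model ORs quoted in the abstract and inverts the two protective ORs (which refer to the minor allele) so that every weight refers to the risk allele; whether these are exactly the weights used in the analysis is not stated in the text. The example individual is hypothetical.

```python
import math

# Additive-model ORs per minor allele, as quoted in the abstract.
MINOR_ALLELE_OR = {
    "rs1121980": 0.79,
    "rs163182": 0.84,
    "rs12970134": 1.25,
    "rs340841": 1.22,
}

# Assumption: weights are expressed per risk allele, so an OR below 1
# (protective minor allele) is inverted before taking the logarithm.
RISK_ALLELE_WEIGHT = {
    snp: math.log(1 / odds if odds < 1 else odds)
    for snp, odds in MINOR_ALLELE_OR.items()
}


def risk_allele_count(counts):
    """Unweighted number of risk alleles (0-8) across the four SNPs."""
    return sum(counts.values())


def weighted_grs(counts):
    """Weighted GRS: sum over SNPs of ln(OR) times the risk-allele count."""
    return sum(RISK_ALLELE_WEIGHT[snp] * n for snp, n in counts.items())


# Hypothetical individual: risk-allele count (0, 1, or 2) at each SNP.
individual = {"rs1121980": 2, "rs163182": 1, "rs12970134": 0, "rs340841": 1}

print(risk_allele_count(individual))
print(round(weighted_grs(individual), 3))
```

Ranking individuals by `weighted_grs` and cutting at quartiles would give groups analogous to those compared in Table 5.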

Functional Annotation of the Significant SNPs
The potential functions of the four significant SNPs (rs1121980, rs163182, rs12970134, and rs340841) were explored in the ENCODE and Roadmap databases. These public databases showed that all four significant SNPs fell within functional elements of the related genes in human pancreatic islets, including Transcription Factor ChIP-seq Clusters, DNaseI hypersensitivity (DNaseI HS) density signal, Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE) density signal, or histone modification markers, such as H3K27ac, H3K27me3, H3K36me3, H3K4me1, H3K4me3, and H3K9me3 (Supplementary Figures 2-5). We further analyzed the genotype-phenotype correlations using the PhenoScanner database. We found that FTO rs1121980 and its correlated variants within an LD block were significantly associated with FTO expression in skeletal muscle and whole blood (Supplementary Table 6). For rs163182 and rs12970134, although no relationship with KCNQ1 or MC4R was found, the two SNPs and their correlated variants were significantly associated with the expression of multiple genes in several tissues (Supplementary Tables 7,8). In addition, PROX1 rs340841 and its correlated variants were significantly associated with PROX1-AS1 expression in pancreas and multiple other tissues (Supplementary Table 9).

Power Analysis
We performed a statistical power analysis of the available data with alpha set at 0.05. For FTO rs1121980, the statistical power ranged from 14.4 to 99.9%; for KCNQ1 rs163182, from 11.5 to 89.1%; for MC4R rs12970134, from 12.2 to 41.6%; for PROX1 rs340841, from 6.9 to 45.0%; and for the combined effects of the four risk alleles and the stratified analyses, from 7.3 to 99.9%.

DISCUSSION
Limited but increasing evidence suggests that genetic risk variants that predispose to T2D and obesity also contribute to the risk of GDM (9). In this case-control study, we investigated the association of 33 published, confirmed genetic variants for obesity or metabolic diseases with GDM in a Chinese population. Our study showed that several risk variants for obesity or metabolic diseases (FTO rs1121980, KCNQ1 rs163182, MC4R rs12970134, and PROX1 rs340841) were also associated with GDM, giving further evidence that GDM and obesity/metabolic diseases may share a similar genetic background. We observed that FTO rs1121980 and KCNQ1 rs163182 conferred a decreased GDM risk, whereas MC4R rs12970134 and PROX1 rs340841 conferred an increased GDM risk. With an increasing number of risk alleles of the four significant SNPs, GDM risk increased significantly in a dose-dependent manner. The significant positive association between the weighted genetic risk score and the risk of GDM persisted, suggesting that FTO rs1121980, KCNQ1 rs163182, MC4R rs12970134, and PROX1 rs340841 may be markers for susceptibility to GDM in a Chinese population.
The FTO gene is located on chromosome 16q12.2 and has been consistently reported to be related to obesity and BMI (10). It is highly expressed in hypothalamic nuclei and extensively expressed in the brain, playing an important role in energy balance (19). Given the connection with obesity, many studies have been conducted to identify the association between genetic variants in FTO and the risk of T2D/GDM (11). In the early GWA study and subsequent validation studies, the authors suggested that FTO rs1121980 may represent a susceptibility locus for obesity risk (20,21). Its association with T2D was also reported (22). The effects of FTO polymorphisms on T2D susceptibility may be mediated through their effect on increasing the lifetime maximum BMI before or at the time of diagnosis (23). However, in two meta-analyses on GDM, no associations between FTO polymorphisms (rs9939609, rs8050136, and rs1421085) and GDM risk were found (24,25). Our results also indicated a lack of association between FTO rs1421085 and GDM risk, but we found that FTO rs1121980 was significantly associated with decreased GDM risk (OR = 0.79, P = 0.007), which adds new evidence for the association of FTO polymorphisms with GDM.
The association of KCNQ1 rs163182 with GDM is also reported for the first time in this study. KCNQ1 is widely expressed in cardiac muscle, inner ear, kidney, lung, stomach, and intestine, and provides instructions for making potassium channels. It is also expressed in the pancreas and could affect insulin secretion and insulin sensitivity (26). The KCNQ1 gene not only plays an important role in blood glucose metabolism but also regulates other metabolic substances (27,28). It has been confirmed that the KCNQ1 gene is associated with diabetes, metabolic syndrome, and lipid parameters (29,30). In a GWA study for T2D conducted in southern China (31), the authors confirmed the association between KCNQ1 rs163182 and T2D. Our finding that KCNQ1 rs163182 conferred a decreased GDM risk is therefore not unexpected. A previous association study and meta-analyses also reported that KCNQ1 rs2074196, rs2237892, and rs2237895 were associated with the risk of GDM (32-34). Based on causal inference (35), GDM is a risk factor for T2D, and the KCNQ1 gene is associated with both T2D and GDM. Thus, the KCNQ1 gene may influence GDM occurrence.
MC4R is a G protein-coupled receptor that is expressed in the hypothalamus and implicated in the regulation of energy balance (36). MC4R has been associated with key components of appetite, food intake, nutrient absorption, thermogenesis, energy expenditure, insulin secretion, obesity, and lipid metabolism (37). It has been reported that MC4R rs6567160 is associated with postpartum weight reduction and glycemic changes among women with prior GDM (38). Moreover, a significant correlation was observed between lipid parameters and MC4R rs17782313 (39). In a previous GWA study, the authors identified rs12970134 near MC4R as associated with waist circumference and insulin resistance (40). This variant was also associated with obesity and T2D (41,42). In this study, we found that MC4R rs12970134 conferred an increased GDM risk, further supporting the genetic similarity between GDM and obesity or metabolic diseases.
PROX1 encodes the prospero homeobox 1 protein, a homeobox transcription factor involved in developmental processes of multiple organs, such as progenitor cell regulation, gene transcription regulation, and cell fate determination. The PROX1 gene also plays an important role in embryonic development (43). PROX1 has been shown to be associated with diabetes and its complications in a number of studies (43,44). Recently, a GWA study revealed that PROX1 rs340841 is a strong susceptibility locus for early-onset diabetes, with variations depending on ethnicity (45). Here, we present evidence that PROX1 rs340841 conferred an increased GDM risk in a Chinese population. It is of interest that several polymorphisms at this locus are associated with insulin levels.
In this study, GDM cases and controls were all selected from a large, population-based study for systematic screening of pregnancy complications, and the two groups were well matched on age and pre-pregnancy BMI, which may help reduce potential selection bias. However, several limitations ought to be acknowledged. Firstly, owing to practical constraints, the number of patients and controls in this study is relatively small. In subgroups especially, the statistical power to detect differences between groups may be limited. Secondly, some GDM-related phenotypes, such as fasting plasma glucose, HbA1c, 2 h plasma glucose, and insulin levels, were not obtained, so the associations of the significant SNPs with these phenotypes were not analyzed. Further quantitative studies of the GDM-related phenotypes are needed. Thirdly, some reported GDM risk factors, such as weight gain during pregnancy, exercise, and diet, were not adjusted for in the statistical analyses because the related data were lacking. Fourthly, most of the reported associations did not reach significance after correction for multiple comparisons, which is another drawback of this study. Therefore, confirmation in additional patients in further studies is warranted.

CONCLUSIONS
Taken together, our study demonstrated for the first time that FTO rs1121980, KCNQ1 rs163182, MC4R rs12970134, and PROX1 rs340841 were associated with the risk of GDM in a Chinese population. Further studies conducted in different populations with functional assays are needed to validate our findings and to investigate the ability of these genetic variants to predict the development of metabolic diseases later in life in those with GDM.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Files.

ETHICS STATEMENT
All subjects gave written informed consent. This study was approved by the ethics committee of Women's Hospital of Nanjing Medical University (NFY201608), and all methods were carried out from August 2016 until December 2018, in accordance with the 1964 Principles of the Helsinki Declaration and its later amendments.

AUTHOR CONTRIBUTIONS
MC conceived and designed the idea, did data collection, wrote and drafted the manuscript. LZ, TC, and AS did data collection. KX did literature review. ZL and JX reviewed the manuscript. ZC performed the data analysis. JW and CJ designed and contributed to the reviewing of the final manuscript. All authors approved the final format of the submitted manuscript.