The identification of gene signature and critical pathway associated with childhood-onset type 2 diabetes

In general, type 2 diabetes (T2D) usually occurs in middle-aged and elderly people. However, the incidence of childhood-onset T2D has increased all across the globe. Therefore, it is very important to determine the molecular and genetic mechanisms of childhood-onset T2D. In this study, the dataset GSE9006 was downloaded from the GEO (Gene Expression Omnibus database); it includes 24 healthy children, 43 children with newly diagnosed Type 1 diabetes (T1D), and 12 children with newly diagnosed T2D. These data were used for differentially expressed genes (DGEs) analysis and weighted co-expression network analysis (WGCNA). We identified 192 up-regulated genes and 329 down-regulated genes by performing DEGs analysis. By performing WGGNA, we found that blue module (539 genes) was highly correlated to cyan module (97 genes). Gene ontology (GO) and pathway enrichment analyses were performed to figure out the functions and related pathways of genes, which were identified in the results of DEGs and WGCNA. Genes with conspicuous logFC and in the high correlated modules were input into GeneMANIA, which is a plugin of Cytoscape application. Thus, we constructed the protein-protein interaction (PPI) network (92 nodes and 254 pairs). Eventually, we analyzed the transcription factors and references related to genes with conspicuous logFC or high-degree genes, which were present in both the modules of WGCNA and PPI network. Current research shows that EGR1 and NAMPT can be used as marker genes for childhood-onset T2D. Gestational diabetes and chronic inflammation are risk factors that lead to the development of childhood-onset T2D.


INTRODUCTION
Diabetes mellitus is a chronically progressive, metabolic disease that affects multiple organs over a period of time. In a diabetic patient, blood glucose levels are high as the body either fails to produce insulin or cells are not sensitive enough to insulin. To ensure normal blood sugar levels, most diabetic patients have to take lifelong medications that affect the quality of life. Ingelfinger & Jarcho, (2017) reported that the incidence of diabetes has increased phenomenally all over the world. It is frightening to contemplate that the prevalence of diabetes had increased by 30.6% from 2005 to (Charlson et al., 2016. The onset of diabetes is associated with many factors, including other chronic diseases , social conditions (Whitworth, Mclean & Smith, 2017), and even global warming (Blauw et al., 2017). In summary, diabetes is a chronic disease of complex etiology and poses a threat to global health.
Based on pathogenesis, diabetes can be mainly classified into three types: gestational diabetes, type 1 diabetes (T1D), and type 2 diabetes (T2D). In general, T1D is known as insulin-dependent diabetes and often occurs in children and adolescents. In patients with T1D, beta cells of islets undergo cell-mediated autoimmune destruction. As a result, the pancreas of T1D patients cannot synthesize and secrete insulin on its own. Furthermore, T2D is known as non-insulin-dependent diabetes mellitus, which is the most common type of diabetes. Approximately 90% of diabetic patients are diagnosed with T2D. It often occurs in patients above the age of 35. In these patients, insulin secretion decreases either due to the lack of insulin receptors or impaired insulin receptors. Gestational diabetes is defined as a condition in which a normal woman develops diabetes during pregnancy. Gestational diabetes usually occurs in the middle or late stages of pregnancy. Some studies monitored women with gestational diabetes for six months after delivery. They found that 38 to 100% of these lactating mothers developed T2DM within six months of delivery (Kim, Newton & Knopp, 2002). This suggests that there is an intrinsic link between gestational diabetes and T2D. In addition to these three types, there are some rare types of diabetes, including maturity onset diabetes of the young (MODY), maternally inherited diabetes and steroid diabetes. MODY and maternally inherited diabetes are primary and hereditary diseases. The genetic patterns of them are autosomal dominant and maternal inheritance, respectively. Steroid diabetes is a secondary metabolic disorder caused by excessive glucocorticoids in the body.
With the development of precision medicine, researchers can now pay more attention to patients with childhood-onset T2D. Wu et al. (2017) noted that although T2D is common in people above the age of 35, the incidence of T2D has continued to rise in children. Both children and adolescents have physiological conditions that are different from those of adults; therefore, researchers must focus on the prevention, clinical diagnosis, and drug treatment of childhood-onset T2D. Researchers in Mexico have conducted several cross-sectional studies on children and adolescents in the area. The results indicate that T2D and metabolic syndrome-related traits were highly inherited in Mexican children and adolescents (Mirandalora et al., 2017). Joyce et al. (2017) found that statins increased the risk of childhood-onset T2D without causing dyslipidemia. Lee et al. (2018) found that the risk of developing T2D increased significantly in children and adolescents with mental disorders when they were exposed to atypical antipsychotics. Akhlaghi et al. (2016) analyzed whether the published anti-hyperglycemic drugs were safe and effective for children and adolescents. These studies have partially revealed some links of T2D in children and adolescents. Nevertheless, scientists still do not know the molecular mechanism of childhood-onset T2D. Therefore, it is necessary to identify the genes and pathways related to the pathogenesis of childhood-onset T2D. Then, we can develop special measures for the prevention and treatment of childhood-onset T2D. The microarray data (GSE9006) was used to compare the intrinsic characteristics of following three groups: healthy vs. T1D groups and the T1D vs. T2D groups. Unfortunately, childhood-onset T2D did not attract the attention of researchers. Hence, we still do not know the differences and similarities between healthy individuals and T2D patients. In this study, we identified the differentially expressed genes (DEGs) in the microarray data (GSE9006) of healthy group and T2D group. Then, we searched for genes closely related to T2D by performing weighted co-expression network analysis (WGCNA), which was further used to calculate the co-expression of genes. We performed functional enrichment analysis on DEGs and T2D genes. In addition, we constructed a protein-protein interaction (PPI) network for these genes to identify the ones that showed critical expression. Our research study focused on finding genes or pathways that are closely related to T2D and revealing the underlying molecular mechanisms. Kaizer et al. (2007) measured and uploaded the dataset GSE9006 to gene expression omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/). This dataset contained the gene expression of peripheral blood mononuclear cells (PBMCs), which were obtained from 24 healthy children, 43 children with newly diagnosed T1D, and 12 children with newly diagnosed T2D. The demographic information of volunteers was mentioned in Table 1. We selected data of healthy children and children with newly diagnosed T2D for further analysis. The dataset was based on the following platform: GPL97 [HG-U133B] Affymetrix Human Genome Array.

DEG analysis
In this study, we analyzed DEGs in healthy and T2D patients by using the limma package (Ritchie et al., 2015), which is the core component of Bioconductor and is widely used to process the microarray data. |logFC| ≥0.5 and P < 0.01 was set as the cut-off criteria.

WGCNA analysis
To determine the co-expression module and the link between the module and the phenotype, we performed WGCNA of the normalized gene expression matrix by using WGCNA package (Langfelder & Horvath, 2008). It is important to note that WGCNA package is an effective data mining tool used to identify all kinds of modules that are highly related to a phenotype, including genes, miRNA, and LncRNA. In this study, the genes were divided into 21 modules, including 20 effective modules and one ineffective module (identified as gray module). We calculated the link between each module and co-expressed genes in healthy and T2D groups.

GO and pathway enrichment analyses
Gene ontology (GO) (Ashburner et al., 2000) includes three types of data, namely, cellular components (CC), molecular functions (MF), and biological processes (BP). GO is often used to annotate genes according to a defined set of structured words. The Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa & Goto, 2000) is used to match the information of gene pathways. Reactome is a great tool to obtain gene-related pathways and interactions (Beloqui et al., 2009). GO and pathway enrichment analyses are used to help us figure out the information of genes. In this study, GO and pathway enrichment analysis was performed on DEGs and the genes highly related to T2D. P < 0.05 was the criterion used for distinguishing non-meaningful pathways. At least two genes were required to enrich each pathway.

PPI network analysis
GeneMANIA software (Montojo et al., 2010) was used to establish gene interactions and to predict gene function. The PPI network was used to analyze DEGs and genes in both the high-associated modules. We set kappa >0.4 and P < 0.05 as the criteria. Network analysis was performed by running GeneMANIA software run in Cytoscape application. The critical hubs were the nodes that were highly connected to other nodes.

Analysis of diseases and transcription factors related to critical genes
The Comparative toxicogenomics database (CTD) (Davis et al., 2015) is often used to analyze the link between genes and diseases. In this study, we determined whether the target gene was linked to diabetes. In the CTD database, we searched all the genes that were reportedly associated with T2D and T1D. Then, we matched these genes with our target genes. iRegulon (Janky et al., 2014) is a plugin for the Cystoscape application, which uses a set of pre-defined transcription factors (TFs) and direct transcriptional targets to extract information related to TFs in a group of co-expressing genes. Ultimately, the output is a set of transcription tracks and a list of genes associated with the tracks. A network was constructed with the output results. To reflect the reliability of results, the values of relevant parameters were set as follows: the maximum false discovery rate on motif similarity = 0.001; the minimum identity between orthologous genes = 0.05, and the normalized enrichment score (NES) = 5. Further analysis was performed on pairs for which NES ≥ 5.

Figure 1 Heatmaps of differentially expressed genes.
In the right rectangle, red represents a higher expression level, while blue represents a lower expression level. The middle strip indicates the grouping information. The red section is healthy group, while the black section represents the patient group. The right and bottom labels indicate grouping information and gene symbols, respectively. The color key is a histogram. The X axis is a numerical value representing the level of gene expression, and the Y axis is the number of corresponding squares. This histogram corresponds to the squares.

DEG analysis
We compared the gene expression levels of healthy children and children with newly diagnosed T2D. Figure 1 showed that 521 genes had differential expression, including 192 up-regulated genes and 329 down-regulated genes. The average logFC value of up-regulated genes was 0.787, while the average logFC value of down-regulated genes was −0.938. The terms with logFC >1.5 or logFC <-1.5 are EGR1 (

WGCNA analysis
Under the condition of soft threshold = 6, each gene was divided into 21 modules (20 valid modules and one invalid module) by cluster analysis; the correlation of these genes with T2D phenotype was calculated. Figure 2 shows the correlation results. It can be seen that the blue and cyan modules have a significant correlation with T2D, which are 0.74 (p < 0.01) and 0.69 (p < 0.01), respectively. The two modules contain a total of 636 genes; the blue module contains 539 genes, while the cyan module contains only 97 genes.

GO and pathway enrichment analyses
GO, Reactome, and KEGG pathway analyses were used to determine the up-regulated genes, the down-regulated genes, and the genes significantly correlated to T2D (blue and cyan). Table 2 displays the GO results, which indicated that the up-regulated genes were highly enriched in cytokine metabolic process, cytokine biosynthetic process, interleukin-1 beta production, positive regulation of cytokine biosynthetic process, and MyD88dependent toll-like receptor(TLR) signaling pathway. The down-regulated expression of differential genes was mainly associated with the development of lymph vessel and fat pad. In T2D-related modules, the genes either conjugated or removed small proteins to perform functions, such as RNA binding, RNA processing, and protein modification. Some genes were further enriched in organelles' lumen, nuclear region, and other items. Table 3 presents the results of KEGG and Reactome pathway analysis, indicating the involvement of up-regulated genes in TLR signaling pathway and in diseases of the immune  system. In T2D-related modules, the genes were mainly involved in vasopressin-regulated water reabsorption pathways. The down-regulated expression of differential genes was mainly enriched through chromatin-modifying enzymes, chromatin organization, and mRNA splicing.

PPI network analysis
We performed PPI network analysis on 112 genes, which were present in both the DEGs and the highly correlated modules. As shown in Fig. 3, we obtained 92 nodes and 254 pairs by setting the PPI score >0.4. An intersection was observed in genes of following types: the genes with logFC >0.6 or logFC <-0.6 in the DEGs, the genes with greater than average degree in the high correlation module, and the genes with greater than average degree in the PPI network. A total of 10 genes were obtained, all of which were up-regulated. The details were presented in Table 4. The 11 genes with the higher degree in the network are C14orf119 (logFC = 16), NAMPT (logFC = 15), NRBF2 (logFC = 14), MTO1 (logFC = 14), PIK3CG (logFC = 13), RNF146 (logFC = 13), VPS50 (logFC = 13), CHM (logFC = 12), GPD2 (logFC = 12), RAB10 (logFC = 12) and ATAD1 (logFC = 12), which have the significant influence on the whole network.

Analysis of the transcription factors of maker genes
A total of 12 genes were used for further analysis. Among them, 10 important genes were obtained from PPI analysis. The remaining two genes were DEGs with large values of logFC, namely, EGR1 (logFC = 2.00) and NAMPT (logFC = 1.32). In the CTD database, these 12 genes are reportedly linked with T2D. In addition, we analyzed and compared their reports in T1D group. The results were shown in Table 5. As shown in Fig. 4, we analyzed the transcription factor regulatory network of these genes. We found that NES ≥ 5 was for the following transcription factors: DEAF1 (NES = 7.114, degree = 6), GMEB2 (NES = 6.595, degree = 4), MAF1 (NES = 5.641, degree = 3), NKX3-2 (NES = 5.287, degree = 3), WDR83 (NES = 5.041, degree = 4).  Notes. WGCNA, weighted co-expression network analysis; logFC, log 2 fold change; PPI, protein-protein interaction network; degree, the number of related genes in a given analysis. Notes.
The number of references related of T2D or T1D is given by the comparative toxicogenomics database. T1D, type 1 diabetes; T2D, type 2 diabetes.

DISCUSSION
In humans, early growth response 1 (EGR1) is a protein-coding gene. Previous studies have reported that glucagon can transiently activate EGR1 in liver cells. To mediate glucagonregulated gluconeogenesis, hepatocytes up-regulate the expression of gluconeogenic genes. By blocking the function of EGR1 gene, we could increase glycogen content in hepatic cells, which would improve the tolerance toward pyruvate and lower fasting blood glucose. The gene EGR1 enhances insulin resistance in T2D patients with chronic hyperinsulinemia. Insulin resistance is one of the most significant causes of T2D (Shen et al., 2011). The gene EGR1 promotes the development of gestational diabetes by adversely impacting the glucagon-controlled gluconeogenesis (Zhao et al., 2016). In addition, EGR1 promotes also diabetic nephropathy, which is one of the most common complications of diabetes (Wang et al., 2015). Several findings have indicated that EGR1 promotes the inflammation within muscle and adipose (Fan et al., 2013) and also acts as an inflammatory mediator in reproductive tissues (Thakali et al., 2014). Indeed, the expression of EGR1 gene was unregulated in the placenta of obese women. A previous study has shown that the up-regulated expression of IL-8, IL-6, and TNFα was induced by EGR1 gene in BeWo cells (Saben et al., 2013). These findings indicated that EGR1 contributed to placental inflammation in obese women. Maternal adiposity must have triggered the expression of EGR1 in umbilical cord tissue (Thakali et al., 2014). Current research and previous studies have proved that EGR1 gene has significantly affected many aspects of diabetes and obesity. In multiple tissues of T2D patients, statistically significant differences were found in the expression of EGR1 gene. Nicotinamide phosphoribosyltransferase (NAMPT) is a protein-coding gene in humans. Previous studies have reported that NAMPT is an adipocytokine that exhibits proinflammatory and immunoregulatory properties (Moschen et al., 2007) and regulates beta-cell insulin secretion (Revollo et al., 2007). Furthermore, NAMPT is involved in insulin Figure 4 The analysis of transcription factor regulatory network. The pink and green nodes represent the important genes identified by previous analysis and the transcription factors that have regulatory relationships, respectively. Some important genes were not pictured in the network because they were single node, which means that no transcription factor linked to them. The different colored arrows indicate the genes regulated by different transcription factors, making the results easier to observe.
Full-size DOI: 10.7717/peerj.6343/ fig-4 resistance and chronic inflammation, which promotes the development of T2D (Ma, An & Wang, 2017;Jaganathan, Ravindran & Dhanasekaran, 2017;Motawi et al., 2014). Obesity is an important factor that leads to the development of diabetes. In obese and overweight patients with metabolic syndrome, the expression of NAMPT increases in the plasma (Filippatos et al., 2007) and simulates the effect of insulin (Fukuhara et al., 2005). However, a previous study has reported that NAMPT is related to only T2D and not obesity (Laudes et al., 2010). Some studies have reported plasma levels of visfatin, the product of NAMPT, increases in obese people (Jaleel et al., 2013), indicating that further research must be conducted to determine the relationship between NAMPT and obesity. Interestingly, the expression of NAMPT was found to be significantly higher in patients with gestational diabetes. This indicates that NAMPT is involved in the molecular mechanism of gestational diabetes (Krzyzanowska et al., 2006). Several studies have reported that Young women with gestational diabetes usually give birth to an overweight baby. This phenomenon has been attributed to intrauterine growth, which increases the prevalence of T2D in the offspring (Schaefergraf et al., 2005). Because NAMPT is also conspicuously up-regulated in children with T2D, it is necessary to further study the genetic mechanism of NAMPT in gestational diabetes patients and their offspring. These findings suggest that NAMPT is a marker gene in patients with childhood-onset T2D.
In this study, we also found that chronic inflammation was somewhat related to childhood-onset T2D. Thus, childhood-onset T2D might be caused by multiple risk factors. For the treatment and prevention of childhood-onset T2D, we need to correctly identify the contribution and sequence of risk factors. Hence, further studies must be conducted to confirm whether gestational diabetes and chronic inflammation are closely related to childhood-onset T2D.
According to KEGG, up-regulated genes are mainly enriched in the signaling pathways of TLR. The enrichment results of Reactome pathways indicated that up-regulated genes were related to the TLR Cascades, suggesting that TLRs play a pivotal role in childhoodonset T2D. Reactive oxygen is produced excessively in obese people, which leads to an imbalance in the endogenous antioxidant capacity and causes oxidative stress in adipocytes (Houstis, Rosen & Lander, 2006). Obesity also promotes the excessive production of proinflammatory adipokines, which further aggravate the chronic inflammation of adipocytes (Ouchi et al., 2011). In fact, several studies have explained the association between obesity and chronic inflammation. These studies have further reported that chronic inflammation causes insulin resistance in obese people. Insulin resistance is a precursor to T2D in adults. These studies mainly investigated the adipose tissue of adults. In this study, we found that TLR signaling pathways mainly enriched up-regulated genes in PBMCs. In fact, TLR2 and TLR4 are important cell membrane receptors that elicit innate immune responses to infection (Tack et al., 2012;Chmelar, Chung & Chavakis, 2013;Andreas et al., 2003). Previous studies have shown that TLR2/4 and JNK signaling pathways play a pivotal role in activating CD11c (+) myeloid proinflammatory cells, which further promotes inflammation and subsequent insulin resistance in cells (Nguyen et al., 2007). The TLR4 signaling pathway participates in JNK activation and instigates palmitate-induced apoptosis of INS-1β cells. With the knockout of TLR4, we blocked palmitate-induced apoptosis of INS-1 cells; however, no such phenomenon was observed with the knockdown of TLR2 (Lee et al., 2011). The inflammatory factors produced during immunization play an important role in obesity-related T2D (Iiu et al., 2013). Some immune-related diseases are complications of diabetes. For example, some diabetic patients may develop a chronic airway inflammation, which further causes asthma. The above results indicate that children may develop chronic inflammation, which further induces insulin resistance and promotes the development of T2D. In addition, we also found that several other genes, such as PIK3CG, ZNF420 were also functionally related to inflammation and immune response (Hawkins & Stephens, 2007;Tian et al., 2009). Therefore, this result further validates our findings.
In previous studies, animal models or elderly patients were investigated to elucidate the mechanism of T2D. Ours is the first study to partially elucidate the molecular mechanism of childhood-onset T2D with the help of bioinformatics. We found the marker genes (EGR1, NAMPT) and TLR signaling pathways of childhood-onset T2D. The up-regulated expression of EGR1 and NAMPT in PBMCs seems to be a gene marker of childhood-onset T2D. The results were compared with the gene expression of middle-aged and older patients with T2D. Thus, there is a genetic mechanism of NAMPT in gestational diabetes patients and their offspring. Such an offspring will have an increased risk of developing childhood-onset T2D. Children are less affected by environmental factors than adults, which allows us to speculate that genetic factors have more influence on childhood-onset T2D. Our findings suggest that NAMPT may be the key to understanding this issue. In addition, TLRs are important proteins involved in non-specific immunity and are a bridge linking specific and non-specific immunity. TLR4 recognizes not only foreign pathogens, but also endogenous substances and degradants. Given the differences in the immune system between children and adults, as well as the special circumstances in which the fetus grows in the uterus, we believe that in children with T2D, the factors that activate the TLR signaling pathway cannot be equated with those of adults. The above results only analyze the underlying mechanisms of EGR1 and NAMPT in children with T2D. We present more details as well, including genes with higher logFC and specific logFC values, genes with higher degrees and associated values, and information on transcription factors, which provide ideas for subsequent research on childhood-onset T2D. It must be emphasized that the physical condition of a child is very different from that of an adult. Therefore, future research studies must focus on elucidating the mechanisms of occurrence, prevention strategies, and treatment of children and adolescents with T2D.

CONCLUSION
EGR1 and NAMPT can be used as marker genes for childhood-onset T2D, and gestational diabetes and chronic inflammation are risk factors that lead to the development of childhood-onset T2D. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: