Comprehensive analysis of inhibitor of differentiation/DNA-binding gene family in lung cancer using bioinformatics methods

Abstract The inhibitor of differentiation/DNA-binding (ID) is a member of the helix–loop–helix (HLH) transcription factor family, and plays a role in tumorigenesis, invasiveness and angiogenesis. The aims of the present study were to investigate the expression patterns and prognostic values of individual ID family members in lung cancer, and to explore their potential functional roles. The expression levels of ID family members were assessed using the Oncomine and GEPIA databases. Furthermore, the prognostic value of ID family members was evaluated using the Kaplan–Meier plotter database. The genetic mutations of ID family members were investigated using the cBioPortal database. Moreover, enrichment analysis was performed using the STRING database and Funrich software. It was found that all the ID family members were significantly down-regulated in lung cancer. Prognostic results indicated that low mRNA expression levels of ID1 or increased mRNA expression levels of ID2/3/4 were associated with improved overall survival, first progression and post-progression survival. Additionally, genetic mutations of ID family members were identified in lung cancer, and amplification and deep deletion were the main mutation types. Furthermore, functional enrichment analysis results suggested that ID1/2/4 were significantly enriched in ‘regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism’ for biological process, ‘transcription factor activity’ for molecular function and ‘HLH domain’ for protein domain. However, ID3 was not enriched in the above functions. The aberrant expression of ID family members may affect the occurrence and prognosis of lung cancer, and may be related to cell metabolism and transcriptional regulation.


Introduction
Lung cancer is one of the most common types of malignancies worldwide, and has higher morbidity and mortality rates compared with other malignant tumors [1]. As a threat to human health, lung cancer has been the focus of the health research field worldwide [2]. Currently, the survival of patients with lung cancer relies on quick and accurate clinical diagnosis. However, most patients have intermediate or advanced lung cancer at the time of diagnosis, and thus have a poor survival rate [3]. Therefore, it is important to investigate potential biomarkers that are related to the occurrence and prognosis of lung cancer, and to identify the possible molecular mechanisms.
The inhibitor of differentiation/DNA-binding (ID) belongs to the helix-loop-helix (HLH) family of transcription factors. ID can bind to HLH transcription factors to inhibit their binding to DNA, resulting in the inhibition of cell differentiation and promotion of cell proliferation [4]. In humans, the ID family consists of four members: ID1, ID2, ID3 and ID4. Previous studies have demonstrated that ID

Funrich analysis
Functional enrichment analysis of the interacting proteins was performed using Funrich software (version 3.1.3), which is an open-access standalone functional enrichment and interaction network analysis tool [20]. The same software was used for the functional analysis of genes that interact with ID family members.
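As an illustration of what a functional enrichment test computes, the sketch below evaluates an over-representation P-value with the hypergeometric distribution. It is a minimal example that assumes a hypergeometric test is appropriate; the source does not describe the statistical settings used in Funrich, and the function name and the gene counts in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """Probability of seeing at least n_overlap annotated genes by chance.

    n_background: genes in the background set
    n_term:       background genes annotated with the term
    n_query:      genes in the query list (e.g. ID-interacting genes)
    n_overlap:    query genes annotated with the term
    """
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    # Sum the upper tail P(X >= n_overlap) of the hypergeometric distribution.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_enrichment_p(
    n_background=20000, n_term=1500, n_query=50, n_overlap=20
)
print(f"enrichment P-value: {p_value:.3g}")
```

In practice, P-values from many terms would also need a multiple-testing correction, which is omitted here for brevity.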

Transcription expression levels of ID family members in human cancer types
The present study used the Oncomine online database to compare the transcription expression levels of ID family members between human cancer and normal tissues. As shown in Figure 1, the database contained a total of 445, 457, 420 and 442 unique studies for ID1, ID2, ID3 and ID4, respectively. It was found that all ID family members were significantly down-regulated in lung cancer. Furthermore, the present study investigated the gene expression levels of ID family members in different lung cancer datasets (Table 1). In the Bhattacharjee dataset [21], ID1 was significantly decreased in different lung cancer types compared with normal samples; lung adenocarcinoma had a fold change of -11.038, small cell lung carcinoma had a fold change of -11.150 and lung carcinoma tumor had a fold change of -45.049. Similar results were identified in lung adenocarcinoma in the Beer dataset [22] [26]. For ID2, a decreased expression level was found in lung adenocarcinoma compared with normal samples in the Selamat [25], Beer [22], Bhattacharjee [21] and Su [23] datasets. Furthermore, in the Hou dataset [27], the expression level of ID2 was decreased in squamous cell lung carcinoma, with a fold change of -2.485. Moreover, a low expression level of ID3 was found in lung adenocarcinoma in the Selamat [25], Landi [24], Su [23] and Okayama [26] datasets. Finally, ID4 was significantly down-regulated in lung adenocarcinoma in the Okayama [26], Beer [22], Landi [24], Su [23], Garber [28], Stearman [29] and Hou [27] datasets, and also in squamous cell lung carcinoma (Garber [28] and Hou [27] datasets).
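The negative fold changes reported above are easier to interpret once the sign convention is made explicit. The sketch below assumes the commonly used signed convention, in which a tumor-to-normal ratio below 1 is reported as its negative reciprocal; the function name and the linear-scale input values are illustrative assumptions and are not taken from Oncomine.

```python
def signed_fold_change(mean_tumor, mean_normal):
    """Signed fold change of tumor versus normal expression.

    Inputs are assumed to be positive, linear-scale (not log) mean
    expression values. Ratios below 1 are returned as the negative
    reciprocal, so a 4-fold decrease is reported as -4.0.
    """
    ratio = mean_tumor / mean_normal
    return ratio if ratio >= 1 else -1 / ratio


# Hypothetical values: expression is 4-fold lower in tumor tissue.
print(signed_fold_change(2.0, 8.0))
```

Under this convention, a fold change of -11.038 would correspond to an approximately 11-fold lower expression level in tumor tissue than in normal tissue.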

Expression levels of ID family members in lung cancer
Using the GEPIA database, the present study compared the mRNA expression level of individual ID family members between lung cancer tissues and normal lung tissues. The present results demonstrated that the expression levels of ID family member genes were significantly decreased in lung adenocarcinoma and lung squamous cell carcinoma tissues compared with normal tissues (Figure 2). Furthermore, the present study analyzed the expression levels of ID family members in different lung cancer stages. It was found that ID4 had significantly different expression levels in the various tumor stages, whereas there was no obvious difference in expression levels of ID1, ID2 and ID3 (Figure 3).
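A comparison of expression across tumor stages can be illustrated with a one-way analysis of variance. The source does not state which test underlies the stage comparison, so one-way ANOVA is assumed here purely for illustration; the function name and the expression values are hypothetical, and only the F statistic is computed (a P-value would additionally require the F distribution).

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of value groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical log-scale expression values for three tumor stages.
stage_expression = [
    [5.1, 4.8, 5.3, 5.0],  # stage I
    [4.2, 4.5, 4.0, 4.4],  # stage II
    [3.6, 3.9, 3.5, 3.8],  # stage III
]
print(f"F = {one_way_anova_f(stage_expression):.2f}")
```

A large F statistic indicates that the between-stage differences are large relative to the within-stage variability.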

Prognosis analysis of ID family members in lung cancer
The present study systematically performed Kaplan-Meier survival analysis according to the mRNA expression of individual ID family members in lung cancer (http://www.kmplot.com/analysis/index.php?p=service&cancer=lung). It was found that the mRNA expression levels of ID1/2/3/4 had a significant effect on OS, FP and PPS in patients with lung cancer (P < 0.05; Figure 4). Therefore, patients with lung cancer with a low mRNA expression level of ID1 or high mRNA expression levels of ID2/3/4 were predicted to have longer OS, FP and PPS.
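To make the survival analysis concrete, the sketch below splits patients into high- and low-expression groups and computes a Kaplan-Meier survival curve for each group. It is a minimal illustration, not the Kaplan-Meier Plotter implementation: the median cutoff is an assumption (the source does not state the cutoff that was used), and the function names and patient data are hypothetical.

```python
from statistics import median


def split_by_median(expression):
    """Label each patient 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Hypothetical cohort: expression, follow-up in months, event indicator.
expression = [2.1, 7.4, 3.3, 8.9, 1.5, 6.2]
months = [12, 40, 18, 55, 9, 33]
died = [1, 0, 1, 1, 1, 0]

labels = split_by_median(expression)
for group in ("low", "high"):
    idx = [i for i, label in enumerate(labels) if label == group]
    curve = kaplan_meier([months[i] for i in idx], [died[i] for i in idx])
    print(group, curve)
```

Comparing the two curves formally would require a log-rank test, which is omitted here to keep the example short.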
Furthermore, the present study investigated the association of the expression levels of individual ID family members with various clinical parameters of lung cancer, including histology, clinical stages, pathological grades, AJCC stages, sex and smoking status (Table 2). For histology, ID1 mRNA expression level was significantly associated with unfavorable OS in adenocarcinoma and squamous cell carcinoma. However, ID2 and ID4 mRNA expression levels showed a favorable association with OS in adenocarcinoma. For clinical stages, ID1 mRNA expression level was    in lung cancer. However, ID4 (AJCC stage T1) was associated with favorable OS in lung cancer. Moreover, ID1 was significantly correlated with poor OS in female patients with lung cancer, while ID2/3/4 were significantly correlated with improved OS in female patients with lung cancer. In addition, it was found that ID1 was significantly correlated with poor OS in male patients with lung cancer. The present results suggested that ID1 mRNA expression level was significantly associated with poor OS in smokers and non-smokers. However, ID2 and ID4 mRNA expression levels were associated with improved OS in patients with lung cancer without a smoking history.

Mutation of ID family members in lung cancer
Genetic mutations in ID family members in different cancer types were assessed using cBioPortal. The ID family member genetic mutations were determined in 256 cancer studies, which included 77,879 samples. As shown in Figure 5A, genetic mutations in ID family members were present in different lung cancer types, including lung squamous cell carcinoma and lung adenocarcinoma, compared with other cancer types. Furthermore, ID family genetic mutation frequencies and types in lung adenocarcinoma (TCGA; Provisional; 586 total samples) and lung squamous cell carcinoma (TCGA; Provisional; 511 total samples) are shown in Figure 5B and C. In lung adenocarcinoma (TCGA;

Functional enrichment analysis of ID family members
To investigate the interactions of the ID family genes, the present study constructed PPI networks using STRING data (Figure 6). Furthermore, functional enrichment analysis of potential target genes was performed using Funrich software (Figure 7).

Discussion
The development of molecular biology technology and bioinformatics has helped to understand the molecular biological characteristics of ID family members and their role in tumorigenesis. However, to the best of our knowledge, there are few studies on the role and significance of ID family members in lung cancer. Therefore, the present study investigated the expression patterns, prognostic values and potential functions of ID family members in lung cancer using bioinformatics methods. Thus, the present results may facilitate the development of future studies to identify potential therapeutic targets in lung cancer. The present study analyzed the expression levels of ID family members in human tumors using the Oncomine database. The present results suggested that all ID family members had low expression levels in lung cancer. Furthermore, the experimental results based on the GEPIA database showed that the expression levels of ID family members were significantly decreased in lung cancer tissues, including lung adenocarcinoma and lung squamous cell carcinoma. In addition, it was also found that the expression level of ID4 was significantly different in various tumor stages. Therefore, the present results suggested that the expression levels of ID family members may be linked to the pathogenesis of lung cancer. In relation to this, Zhou et al. reported that the mRNA expression levels of ID1, ID3 and ID4 were significantly lower in breast cancer tissues compared with normal tissues [30]. However, the present results are inconsistent with other previous results [8][9][10], and these differences may be due to the small sample size and discrepancies in detection methods amongst the different studies.
To understand the relationship between ID family members and lung cancer, the present study performed a prognostic analysis using the Kaplan-Meier Plotter. It was demonstrated that increased ID2/3/4 mRNA expression levels were associated with improved OS, FP and PPS, whereas a decreased ID1 mRNA expression level was associated with improved OS, FP and PPS. Therefore, the present results suggested that a low mRNA expression level of ID1 or high mRNA expression levels of ID2/3/4 predicted improved survival in patients with lung cancer. In addition, the present study assessed the associations of the ID mRNA expression levels with distinct clinical parameters for OS, including histology, clinical stages, pathological grades, AJCC stages, sex and smoking status. A low mRNA expression level of ID1 was significantly associated with favorable OS in the following subgroups: adenocarcinoma, squamous cell carcinoma, stage 1 lung cancer, Grade II, AJCC stage N0, both sexes, and both smokers and non-smokers. However, a low mRNA expression level of ID2 was associated with unfavorable OS in the following subgroups: adenocarcinoma, stage 1 lung cancer, female patients and patients without a smoking history. Moreover, a low mRNA expression level of ID4 was found to be associated with unfavorable OS in the following categories: adenocarcinoma, stage 1 lung cancer, sex and patients without a smoking history. It was also demonstrated that ID3 mRNA expression level was significantly associated with unfavorable OS in AJCC stage T4, while ID3 showed an association with a favorable OS in female patients. Thus, the present results suggested that ID1/2/4 mRNA expression levels were correlated with pathological type, sex and smoking history in patients with lung cancer.
Currently, it is not fully understood whether IDs act as oncogenes or tumor suppressor genes in different tumor types, due to a lack of identification of genetic alterations in ID genes. Li et al. found that 26.7% of ID3 −/− mice developed lymphoma, while none of the ID3 +/+ or ID3 +/− mice had lymphoma. Therefore, a deficiency of the ID3 gene increases the possibility of γδ T-cell lymphoma [31]. However, to the best of our knowledge, there were no previous studies investigating genetic alterations of ID genes in other cancer types. Thus, the present study evaluated the mutations of ID family members in patients with lung cancer using the cBioPortal database. The present results suggested that there were mutations in ID1/2/3/4, which were predominantly amplification and deep deletion mutations.
Lung cancer may be caused not only by the differential expression of ID family members, but also by the interaction between related genes [32]. Therefore, the present study constructed PPI networks using STRING data for individual ID family members. Furthermore, the possible functions of ID-related genes were investigated using Funrich software, including biological process, molecular function and protein domain. It was found that ID1/2/4 were significantly enriched in 'Regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism', 'Transcription factor activity' and 'HLH domain'. However, ID3 was not significantly enriched in the above factors. Therefore, the present results indicated that ID family members may be involved in cell metabolism and transcription regulation.
Transcription factors are a group of sequence-specific binding proteins that can activate or inhibit transcription via transactivation or transrepression domains [33]. Previous studies have indicated that transcription factors are involved in regulating cell differentiation, proliferation and apoptosis, and play significant roles in the occurrence and development of tumors [34]. In the human genome, the C2H2 zinc-finger domain, homeodomain and HLH domain are the main types of transcription factor, and account for >80% of human transcription factors [35]. Therefore, it is important to study the role of transcription factors in lung cancer. Nucleotides are essential nutrients required to maintain the rapid proliferation of tumor cells, and nucleotide synthesis is regulated by many enzymes, genes and various metabolic pathways [36,37]. Moreover, previous studies have reported that inactivation of tumor suppressors and activation of oncogenes can promote the occurrence and development of tumors by regulating the biosynthesis of nucleotides [38]. Therefore, the mechanism of cell metabolism and transcriptional regulation of ID family genes and related genes in the pathogenesis of lung cancer may be an important focus for future research. In addition, the present study supports the initiation of future studies to investigate the mechanism of tumor occurrence and development, to facilitate the development of prevention and treatment strategies.
In conclusion, the present study investigated the role of ID family members in lung cancer, in relation to mRNA expression levels, prognostic values, genetic mutations and functional enrichment analysis. The present results suggested that ID family members showed genetic mutations and abnormal mRNA expression levels in patients with lung cancer. Furthermore, decreased ID1 or increased ID2/3/4 expression levels predicted improved survival, and the enrichment results suggested that ID family members may be involved in cell metabolism and transcription regulation. Therefore, ID family members may be used as biomarkers for the occurrence and prognosis of lung cancer. However, further study is required to assess the expression levels and molecular mechanisms of ID family members.