In Silico Transcriptomic Analysis of Wound-Healing-Associated Genes in Malignant Pleural Mesothelioma

Background and objectives: Malignant pleural mesothelioma (MPM) is a devastating malignancy with poor prognosis. Reliable biomarkers for MPM diagnosis, monitoring, and prognosis are needed. The aim of this study was to identify genes associated with wound healing processes whose expression could serve as a prognostic factor in MPM patients. Materials and Methods: We used data mining techniques and transcriptomic analysis so as to assess the differential transcriptional expression of wound-healing-associated genes in MPM. Moreover, we investigated the potential prognostic value as well as the functional enrichments of gene ontologies relative to microRNAs (miRNAs) of the significantly differentially expressed wound-healing-related genes in MPM. Results: Out of the 82 wound-healing-associated genes analyzed, 30 were found significantly deregulated in MPM. Kaplan–Meier analysis revealed that low ITGAV gene expression could serve as a prognostic factor favoring survival of MPM patients. Finally, gene ontology annotation enrichment analysis pointed to the members of the hsa-miR-143, hsa-miR-223, and the hsa-miR-29 miRNA family members as important regulators of the deregulated wound healing genes. Conclusions: 30 wound-healing-related genes were significantly deregulated in MPM, which are potential targets of hsa-miR-143, hsa-miR-223, and the hsa-miR-29 miRNA family members. Out of those genes, ITGAV gene expression was a prognostic factor of overall survival in MPM. Our results highlight the role of impaired tissue repair in MPM development and should be further validated experimentally.


Introduction
Malignant pleural mesothelioma (MPM) is a highly aggressive tumor of the pleural mesothelium, a metabolically active cell monolayer covering the lungs and the chest wall [1]. Neoplastic transformation of pleural mesothelial cells (PMCs) develops in the course of asbestos-induced chronic inflammation while genetic susceptibility factors, radiation exposure, and SV40 infection have also been indicated as co-factors in this process [2,3]. Recent evidence from animal studies and a few case series show that engineered nanoparticles, whose use is constantly expanding, can lead to a pathology similar to MPM and, thus, potentially lead to an increase of MPM incidence in the future and the same is the case with air pollution, which has been shown to lead to lung cancer [4,5].
Diagnosis of MPM is challenging mainly because the pleura is a common site for metastasis and reactive, benign changes observed in the pleural space may be confused with MPM [6]. At the same time, prognosis is poor since MPM therapeutic management depends on patient performance status, which is usually not good due to late diagnosis, and is potentially effective mostly in the epitheliod subtype [7]. For these reasons, the availability of effective bio markers is necessary for three clinical aspects of MPM: Early diagnosis, prognosis, and treatment outcome prediction [8].
Aberrant inflammation that could occur during wound healing has been linked to the pathogenesis of a variety of malignancies [9,10]. Environmental and infectious agents are considered key players during carcinogenesis development as they conduce to tissue damage and inflammatory reactions [9]. In the case of MPM, asbestos-induced chronic inflammation-mainly due to the production of reactive oxygen/nitrogen species-results in decreased tumor immunity [11]. Location and inflammatory response type impact on the overall prognosis of MPM patients [9].
PMCs hold a central role in the initiation and resolution of serosal inflammation and repair by secreting various pro-and anti-inflammatory mediators along with immunomodulatory ones [12,13]. It is worth noting that mesothelial regeneration of injured mesothelial surfaces appears diffusely across the traumatized area in contrast to epithelial-like surfaces, in which healing occurs solely from the wound edges [12]. Mesothelial injury and impaired healing can lead to the development of pleural effusions, serosal adhesions, and malignant mesothelioma [13,14]. Several studies have confirmed that following damage, mesothelial cells undergo mesothelial-to-mesenchymal transition (MMT), a process similar to epithelial-to-mesenchymal transition (EMT) [15]. It has been reported that the aggressiveness of MPM may be explained by its partial fibroblastic phenotype in the context of EMT conferring both high invasiveness and chemoresistance [16].
Although the above observations highlight the importance of normal serosal repair following injury, a thorough investigation of the exact role of wound-healing-associated genes in MPM development has not been performed [15]. Research in this direction is even more imperative nowadays due to the wide use of nanoparticles in a variety of applications since nano-sized fibrous particulates have similar properties to asbestos thus raising safety concerns for human health [17,18].
Here, we aimed at the identification of the differential expression of wound-healing-associated genes in MPM. For this purpose, we used established data mining techniques and transcriptomic analysis [19][20][21][22]. Moreover, we investigated their potential prognostic value as well as the functional enrichments of gene ontologies (GO) relative to the microRNAs (miRNAs) that regulate the significantly differentially expressed wound-healing-related genes in MPM.

Transcriptomic Analysis of Wound-Healing-Associated Genes in MPM
The Oncomine Cancer Microarray database Premium Research Edition (http://www.oncomine.org) was used to investigate the expression profile of wound-healing-associated genes in MPM compared with healthy controls. Gene expression data from the GDS1220 dataset of GEO Profiles were used in which the Affymetrix Human Genome U133A array was used assessing 12,624 genes [23]. We analyzed available data for 82 wound-healing-related genes (from the according list of genes included in the Wound Healing RT 2 Profiler PCR Arrays from Qiagen) that were assessed in the GDS1220 study (Table 1), so as to detect gene expression differences between MPM specimens and healthy controls [23]. The raw data were downloaded in Excel format from Oncomine and only the ones referring to surgically excised samples were selected, leading to the use of n = 40 MPM cases and n = 9 controls (n = 5 pleura and n = 4 lung samples). The gene expression data were log transformed, median centered per array, and the standard deviation was normalized to one per array as described previously [24]. Integrin Subunit Alpha 1 ITGA2 Integrin Subunit Alpha 2 ITGA4 Integrin Subunit Alpha 4 ITGA5 Integrin Subunit Alpha 5 ITGA6 Integrin Subunit Alpha 6 ITGB1 Integrin Subunit Beta 1 Interleukin 4

Evaluation of the Significantly Differentially Expressed Wound-Healing-Related Genes for Prognostic Relevance
We investigated the prognostic relevance of the significantly differentially expressed wound-healing-related genes by creating survival curves, using the publicly available survival data of each patient from the GDS1220 study. Patients were grouped as high expressing (above median) and low expressing (below median) per each significantly differentially expressed gene based on the median value of gene expression. To further evaluate the prognostic significance of the genes that were identified in the analysis of the GDS1220 mesothelioma study, we constructed survival curves from data derived from The Cancer Genome Atlas (TCGA). For this analysis, we used the PROGgeneV2 Prognostic Database software following the same approach used previously [25].

Functional Annotation Enrichment Gene Ontology Analysis of the miRNAs That Regulate the Significantly Differentially Expressed Wound-Healing-Related Genes in MPM
Functional annotation enrichment analysis of GO relative to Biological Functions and miRNAs was performed, assuming the statistical background of the whole genome. The complete list of the significantly differentially expressed wound-healing-related genes was introduced to the portal ToppFun, an application of the ToppGene Suite (https://toppgene.cchmc.org/). ToppFun reports functional enrichment of input gene lists based on transcriptome (gene expression), proteome (protein domains and interactions), regulome (transcription factor binding sites and miRNA), ontologies (GO, pathway), phenotype (human disease and mouse phenotype), pharmacome (drug-gene associations), and bibliome (literature co-citation) [26]. Functional enrichments are provided by the ToppFun algorithm, which employs hypergeometric distribution with multiple correction testing (false discovery rate; FDR) to determine statistical significance. Analysis was performed during February 2019.

Statistical Analysis
The results were analyzed using GraphPad Prism 8.0 (GraphPad Software, San Diego, CA) and R 3.3.2 [27]. Data distribution was tested by the Kolmogorov-Smirnov normality test. Comparisons of gene expressions were performed with the one-tailed t-test for parametric data and the Mann-Whitney U-test for nonparametric data. The Benjamini-Hochberg FDR was employed for multiple correction testing, which reports FDR (or q-value), in order to corroborate the validity of the results. The calculation of the q statistic was based on the formula Q = P value × total number of genes/P-value rank [24]. Statistical significance was set at the q < 0.05 level. Kaplan-Meier survival curves were generated using the overall survival data of the MPM patients and by grouping them into high and low gene expressions based on the median. The log-rank (Mantel-Cox) test, which gives equal weight to deaths at all time points, was used. Differences were deemed significant with a p ≤ 0.05. A multivariate regression was performed using the initial dataset of 30 differentially expressed genes where the ITGAV gene expression was chosen as the dependent variable. At first, all predictors were considered, and a model was derived. This mode had an acceptable R2 value but had to be discarded since the predictors were over-fit. Next, a backwards method was implemented that takes into consideration the Variance Inflation Factor (VIF) value of each predictor. If VIF of an independent variable exceeded the predetermined threshold value of 5, that predictor was discarded from the model. Thus, one avoids multi-collinearity in a model but may still obtain an overfitted model. To avoid this, an additional backwards step-wise regression was performed on the surviving variable so as to have a robust final model.

Prognostic Significance of ITGAV Gene Expression in MPM
Kaplan-Meier analysis of the significantly deregulated wound-healing-related genes in MPM identified the ITGAV gene expression as a predictor of overall survival. Graphical representation of the ITGAV gene expression from the GDS1220 study is shown in Figure 1A. In the GDS1220 study, MPM patients with low ITGAV expression had a median overall survival of 15.7 months versus 12 months of those that had high expression (p = 0.0263) ( Figure 1B). Analysis from the TCGA patient data derived from the PROGgeneV2 database corroborates that low ITGAV gene expression favors survival in MPM patients (p = 0.001) (Figure 2).

Prognostic Significance of ITGAV Gene Expression in MPM
Kaplan-Meier analysis of the significantly deregulated wound-healing-related genes in MPM identified the ITGAV gene expression as a predictor of overall survival. Graphical representation of the ITGAV gene expression from the GDS1220 study is shown in Figure 1A. In the GDS1220 study, MPM patients with low ITGAV expression had a median overall survival of 15.7 months versus 12 months of those that had high expression (p = 0.0263) ( Figure 1B). Analysis from the TCGA patient data derived from the PROGgeneV2 database corroborates that low ITGAV gene expression favors survival in MPM patients (p = 0.001) (Figure 2).

Statistical Modeling Reveals A Positive Correlation of ITGAV and COL5A1 Gene Expressions
Multivariate regression was performed as shown in Table 3, and the following model was obtained using backwards selection methods: The model explains the 43.54% (adjusted R 2 = 0.4354) of the variability of the response data around its mean. F-statistic and p-value indicate that the model is significant for the explanation of the variability of the response data. The coefficients of the model are positive, hence whenever the value of COL5A1 increases, the same happens to the value of ITGAV. The value of each coefficient shows the increase of the value of ITGAV if the corresponding variable is increased or decreased by one unit, while the rest remain fixed.

Enriched Gene Ontologies (GO) Relative To Regulating miRNAs of the Significantly Differentially Expressed Wound-Healing-Associated Genes In MPM
The complete set of the significantly differentially expressed wound-healing-associated genes was entered in the ToppFun software. The top five enriched GO for biological functions are presented in Table 4, and the miRNAs that regulate the differentially expressed genes are shown in Table 5.

Statistical Modeling Reveals A Positive Correlation of ITGAV and COL5A1 Gene Expressions
Multivariate regression was performed as shown in Table 3, and the following model was obtained using backwards selection methods: The model explains the 43.54% (adjusted R 2 = 0.4354) of the variability of the response data around its mean. F-statistic and p-value indicate that the model is significant for the explanation of the variability of the response data. The coefficients of the model are positive, hence whenever the value of COL5A1 increases, the same happens to the value of ITGAV. The value of each coefficient shows the increase of the value of ITGAV if the corresponding variable is increased or decreased by one unit, while the rest remain fixed.

Enriched Gene Ontologies (GO) Relative To Regulating miRNAs of the Significantly Differentially Expressed Wound-Healing-Associated Genes In MPM
The complete set of the significantly differentially expressed wound-healing-associated genes was entered in the ToppFun software. The top five enriched GO for biological functions are presented in Table 4, and the miRNAs that regulate the differentially expressed genes are shown in Table 5.

Discussion
Chronic tissue repair and insistent immune system stimulation following the accumulation of inhaled asbestos fibers in the pleural space are key oncogenic processes during MPM development [28]. In this study, we used transcriptome data mining in order to assess the differential mRNA expression of wound-healing-related genes in MPM. We analyzed available data for 82 genes, of which 30 were found significantly deregulated in MPM specimens compared with healthy tissues. Identified genes are mainly involved in inflammation (CXCL2, CXCL5, IL6, IL10), cellular adhesion (ITGA3, ITGAV, ITGB6), tissue remodeling (CTSG, F3, PLG, SERPINE1), and extracellular matrix organization (COL1A1, COL3A1, COL5A1, COL5A2, CSF3, HBEGF, MIF, TGFA, VTN), fundamental processes in the induction of EMT and cancer progression [29][30][31][32]. A recent study has suggested that asbestos induces mesothelial-to-fibroblastic transition in an inflammasome-dependent manner and this seems to apply for other pathogenic particles as well [28,33,34].
The acquisition of a mesenchymal phenotype by PMCs has been correlated with the sarcomatoid histological type of MPM, the latter being associated to worse prognosis [16,35]. It has been suggested that in MPM patients with sarcomatoid histology, radical surgery should be excluded, and therapy should aim at symptom control and preservation of quality of life [36]. Considering the importance of clear histological subtype distinction in the initiation of the appropriate therapeutic intervention, it would be of great interest to investigate the transcriptional expression of the 30 significantly deregulated genes-identified in this study-with respect to the MPM phenotype [22]. Unfortunately, data relative to the histological subtype of MPM specimens were not available in the GDS1220 mesothelioma study.
Among the 30 significantly differentially expressed wound-healing-associated genes in MPM, ITGAV was identified as a predictor of overall survival in patients suffering from this type of cancer. In the transcriptomic analysis, ITGAV was found significantly up-regulated in MPM subjects compared with healthy controls, and survival analysis revealed that low ITGAV gene expression favors the survival of MPM patients. In accordance with our results, recent studies have reported the tumor-promoting effect of ITGAV via the furtherance of the EMT process [37,38]. The gene expression level of the ITGAV integrin has also been associated with global survival in patients with colorectal cancer, both in univariate and multivariate analyses [39].
Following the above findings, our regression modelling predicted a significant positive correlation in the expression profiles of the ITGAV and COL5A1 genes. The latter encodes the a1 chain of type V collagen, a fibril-forming collagen that is mainly distributed in the lung, cornea, bone, and fetal membranes, together with type I collagen [40,41]. Interestingly, COL5A1 has been included in a 10-gene signature that is associated with poor survival both in patients with high-grade serous ovarian cancer and renal cell carcinoma [42,43].
Finally, the biological functions mediated by the differentially regulated genes as revealed by the GO functional enrichment analysis were pertinent to wound healing functions, such as cell locomotion, mobility, and migration. Furthermore, the GO annotation enrichment analysis predicted that the 30 wound-healing-related genes that were found significantly deregulated in MPM might be targets of the hsa-miR-143, hsa-miR-223, and the hsa-miR-29 miRNA family members. MicroRNAs have emerged as significant players in the biology of MPM, whilst miRNA-based therapy for the treatment of MPM has been suggested as an exciting area of research [44]. The hsa-miR-29c has been identified as an epigenetic regulator in MPM via the down-regulation of DNA methyltransferases and the up-regulation of demethylating genes [45]. In addition, increased expression of this miRNA has been linked to a more favorable prognosis in MPM [45]. On the other hand, hsa-miR-143 has been linked to MPM only once, in a study that showed that its expression is significantly reduced in MPM as compared to non-neoplastic corresponding tissue [46]. Its functional role in MPM is unknown, however in other cancers, the decreased expression of has-miR-143 has been shown to promote EMT, tumor growth, and metastasis [47,48]. Overall, these reports strengthen the findings of our in silico transcriptomic analysis implicating 30 wound-healing-associated genes in MPM pathogenicity.
However, it is imperative to investigate the transcriptional expression of these genes in the clinical setting, taking into account both MPM phenotype and history of asbestos exposure.

Conclusions
Conclusively, using data mining and transcriptomic analysis, we have identified a significant deregulation of 30 wound-healing-related genes in MPM specimens compared to healthy tissues. We have also demonstrated that these genes are possible targets of the hsa-miR-143, hsa-miR-223, and the hsa-miR-29 miRNA family members, which contribute in MPM pathobiology as evidenced by recent studies. Finally, we have identified ITGAV as a novel predictor of overall survival in MPM. Our results highlight the role of impaired tissue repair in MPM development and should be further validated experimentally.