PPARG Could Work as a Valid Therapeutic Strategy for the Treatment of Lung Squamous Cell Carcinoma

Previous studies showed that PPAR-gamma (PPARG) ligands might serve as potential therapeutic agents for nonsmall cell lung cancer (NSCLC). However, a few studies reported the specific relationship between PPARG and lung squamous cell carcinoma (LSCC). Here, we made an effort to explore the relationship between PPARG and LSCC. First, we used mega-analysis and partial mega-analysis to analyze the effects of PPARG on LSCC by using 12 independent LSCC expression datasets (285 healthy controls and 375 LSCC cases). Then, literature-based molecular pathways between PPARG and LSCC were established. After that, a gene set enrichment analysis (GSEA) was conducted to study the functionalities of PPARG and PPARG-driven triggers within the molecular pathways. Finally, another mega-analysis was constructed to test the expression changes of PPARG and its driven targets. The partial mega-analysis showed a significant downregulated expression of PPARG in LSCC (LFC = −1.08, p value = 0.00073). Twelve diagnostic markers and four prognostic markers were identified within multiple PPARG-LSCC regulatory pathways. Our results suggested that the activation of PPARG expression may inhibit the development and progression of LSCC through the regulation of LSCC upstream regulators and downstream marker genes, which were involved in tumor cell proliferation and protein polyubiquitination/ubiquitination.


Introduction
PPARG is a ligand-activated transcription factor belonging to the family of peroxisome proliferator-activated receptors (PPARs) [1], which is widely expressed in many cells and tissues in the human body [2]. PPARG has recently attracted interest as the potential therapeutic target for a variety of malignancies [3]. A number of animal models [4], cell lines [5,6], and clinical trials [NCT00923949, NCT01199068, NCT01199055] demonstrated that activation of PPARgamma impedes lung tumor progression and suggest that PPARG ligands may serve as potential therapeutic agents for nonsmall cell lung cancer (NSCLC), with the emphasis on lung adenocarcinoma [7,8]. For instance, Ni et al. 's study showed that the activation of PPARG could inhibit the prolif-eration of EGFR-TKI-resistant lung adenocarcinoma cells and lead to a better survival rate [8].
So far, only a few studies explored the relationship between PPARG and lung squamous cell carcinomas (LSCC) [9,10]. Kim et al. pointed out that the truncated splice variant of human PPAR gamma1 (hPPARG1(tr)) was strongly expressed in primary LSCC tumorous tissues, and the overexpression of hPPARG1(tr)) could increase the resistance of transfected cells to chemotherapeutic drug-and chemical-induced cell death [10]. However, to our knowledge, no study has systematically studied the role of PPARG in the pathology of LSCC.
To address this issue, we first conducted a meta-analysis to study the gene expression change of PPARG in the case of LSCC. Then, we integrated literature-based pathways and gene set enrichment analysis (GSEA) to study the potential pathways where PPARG could exert influence on the pathologic development of LSCC. Our results may help to understand the potential roles of PPARG in the case of LSCC.

Selection of LSCC Expression Data in Mega-Analysis.
For the initial selection, we searched the LSCC expression datasets on gene expression omnibus (GEO; https://www.ncbi .nlm.nih.gov/geo/) by using the keywords "lung squamous cell carcinoma." Then, we applied the following criteria to filter further: (1) The entry type used in the study was series; (2) The dataset was array expression data; (3) The studies were designed as a comparison between LSCC and healthy controls: and (4) The organism of the dataset was Homo sapiens. Finally, we considered the datasets for which the original data and the corresponding format files were downloadable. Since we calculated the expression using the original data as extracted above, we used the term "mega-analysis" instead of "meta-analysis.".

Mega-Analysis and Partial Mega-Analysis Models.
In order to identify the relation between PPARG and LSCC, we used a mega-analysis to analyze the expression levels of PPARG in the case of LSCC. In our study, results from using both the random-effects model and the fixed-effects model were compared. To determine the heterogeneity of the datasets, between-and within-study variance was calculated and compared. When the total variance Q was no bigger than the expected value of the between-study variances (df), the model sets the ISq (percentage of the within-over betweenstudy variance) to zero. In this case, the fixed-effects model, instead of the random-effects model, will be selected for the mega-analysis. All analyses were performed using Matlab (version R2017a; https://www.mathworks.com/products/ matlab.html).
Moreover, we performed a partial mega-analysis to discover the significance of a gene presented in part of the studies/datasets (e.g., 50% of total studies) but not in all datasets, where 50% top studies/datasets were employed for the mega-analysis of a gene. Here, we define the "top datasets" for a gene as these datasets that demonstrate the bigger absolute value of effect size than the rest of the datasets. It should be mentioned that the top datasets for different genes could be different.

Analysis of Influential Factors.
To estimate the possible influence of several factors (e.g., study date, country of origin, and sample size) on the gene expression in the case of MI, we conducted a multiple linear regression (MLR) analysis and reported the p values for each of these factors.

Construct PPARG-Drive Network and Gene Set
Enrichment Analysis. Based on large scale literature data mining, we constructed a diagnostic and a prognostic functional network connecting PPARG and LSCC. In the diagnostic network, we identified the genes that were contra-directionally regulated by PPARG and LSCC. To achieve this goal, we used Pathway Studio (http://www.pathwaystudio.com/) to identify PPARG➔gene relationships and LSCC➔gene relationships with polarity. Each of these relationships has supported by one or two scientific references. Then we identified the overlapped genes within these relationships to construct the PPARG-LSCC diagnostic network. To increase the reliability of the identified network, we limited the genes to these that also demonstrated consistency in terms of their gene expression alteration in the case of LSCC in the mega-analysis. For the prognostic pathway, we followed the same pressure but to identify genes that were downstream targets of PPARG and upstream regulators of LSCC. The reference information supporting the relations identified in these networks was provided in the Supplementary Materials (available here), including the type of the relationship, supporting references, and related sentences from the references where the relationship has been identified.
For these genes within the diagnostic and prognostic networks built above, a gene set enrichment analysis (GSEA) was conducted using Pathway Studio (version 12.1.0.9; http://www.pathwaystudio.com/) against Gene Ontology (GO; http://geneontology.org) and Pathway Studio pathways. The purpose of GSEA was to test the functional profile of the genes involved in the PPARG-driven networks.

Mega-Analysis Based on the Selected LSCC Expression
Datasets. There were 4,643 results shown in the GEO datasets identified by the keywords "lung squamous cell carcinoma". Then, further filters were set as our criteria. A total of 12 datasets satisfied the inclusion criteria for the mega-analysis, which are listed in Table 1. The studies were distributed in 8 different countries, and the study dates ranged from 2 to 15 years ago, including 285 healthy controls and 375 LSCC cases.
The mega-analysis and partial mega-analysis results for gene PPARG are presented in Table 2. As shown in Figure 1(a), the total variance (Q) was larger than the expected between-study variance (df ), the within-study variance percentage (ISq%) was 45.10, the between-study variances were significant, and thus a random-effects model was selected for PPARG in the mega-analysis. However, there were no significant between-study variances (Isq = 0, Q test p = 0:64), see in Figure 1(b). Thus, the fixed-effects model was selected for PPARG in partial mega-analysis. The LFC of the gene was estimated from about half (top 50%) of the selected datasets. PPARG demonstrated significantly lower expression in the case of LSCC (LFC = −1:08, p value = 0:00073).
MLR analysis showed that sample size and study age were not significant influential factors for the expression levels of PPARG among the 11 LSCC datasets (p value > 0:30). However, the population region (country) was identified as a significant factor that influences the LFC of PPARG in the case of LSCC (p value = 0:0045, Figure 1(c)). This may partially explain the differential results between the partial megaanalysis and the mega-analysis.

The LSCC Diagnostic Network Interfered by PPARG.
Multiple molecules (12 genes) have been identified through large-scale literature data mining that was contradirectionally influenced by PPARG and LSCC, as shown in Figure 2. According to previous literature reports, a total of 8 molecules (XIAP, UBE2D1, SKP2, ACKR3, MI21, HOXA10, STAT1, and PDPN) were upregulated in LSCC (the genes at the bottom of Figure 2; the arrows with ⊕, Figure 2) but negatively affected by PPARG (the arrows with ┥, Figure 2). These eight genes also presented increased expression levels in the 12 LSCC RNA expression datasets. These results support the literature data mining results and suggest these eight genes as positive markers for LSCC. The mega-analysis results for these genes were provided in the Supplementary Materials→Mega-analysis (available here). The inhibition of these genes by PPARG could exert an anti-LSCC effect during its pathological development ( Figure 2). On the other hand, four genes (MIR223, ANGPT1, CYP2A6, and FOXA2) have been suggested to get suppressed in LSCC (the genes at the top of Figure 2, the arrows with ┥) but were stimulated by PPARG ( Figure 2). These four genes also presented decreased expression levels in the mega-analysis, supporting the literature data mining results. Activation of these molecules could be due to other pathways where PPARG inhibits the progress of LA. Detailed information regarding the network presented in Figure 2 can be found in the Supplementary Materials→LSCC (available here) diagnostic network, including the type of the relationships and the supporting references.

GSEA for the Genes Involved in LSCC Diagnostic
Network. The GSEA was performed using Pathway Studio with the purpose of investigating the biological functions of the 12 genes within the LSCC diagnostic network. The GSEA was also confirmed by the mega-analysis, including eight upregulated and five suppressed genes. A total of 10 out of these 12 genes were shared among the top 10 most significantly enriched pathways (p values < 0:012, q = 0:05 for FDR), which are presented in Table 3. The full 21 pathways/gene sets enriched with p value < 0:047 were presented in the Supplementary Materials→GSEA1 (available here). Notably, enriched pathways highlighted by the GSEA approach are mainly related to the regulation of protein ubiquitination, the regulation of cell proliferation, and cytokine stimulus.
3.4. LSCC Prognostic Network Interfered by PPARG. As shown in Figure 3, a regulatory pathway connecting PPARG and LSCC was identified, heavily involved in the pathological development of LSCC. Based on the literature reports, three genes, TNF, NOS2, and ACE, could promote the pathological development of LSCC (highlighted in red, the arrows with ⊕, Figure 3). These three genes were deactivated by PPARG. However, according to the mega-analysis results, these genes were downregulated together with PPARG in the case of LSCC. Therefore, PPARG may not necessarily be needed for the deactivation of these genes to inhibit the progress of LSCC. The supporting references for each relation presented in Figure 3 were provided in the Supplementary Materi-als→LSCC_ (available here)prognostic network, which also include the type of the relationships.
In addition, an LSCC-inhibitor, STK11, has been shown to be activated by PPARG (highlighted in blue, the arrows with ┥, see Figure 3(a)). Mega-analysis showed that this gene showed slightly increased expression in the case of LSCC. Therefore, activation of PPARG may further promote the activation of STK11, which could be a blocker for the  Figure 3 were provided in the Supplementary Materials→Mega-analysis (available here).

GSEA for the Genes Involved in LSCC Prognostic Network.
GSEA results showed that the five genes (PPARG, STK11, NOS2, and TNF) were significantly enriched within 59 pathways/gene sets (p value < 0:044; q = 0:05 for FDR; see Supplementary Materials→GSEA2 (available here)). We presented the top 10 pathways (4 genes were enriched; p < 0:0077) in Table 4. The pathways were mainly related to cell metabolism and hormone level regulation, which was largely different from that of the LSCC diagnostic network. Also, notably, these five genes enrich more pathways than that of the 12 genes within the LSCC diagnostic network, indicating that these five genes were more functionally linked to each other.

Discussion
Previous studies suggested that the activation of PPARG might be associated with the inhibition of NSCLC [4][5][6]. However, most studies were focused on the cases of lung adenocarcinoma [4][5][6]. In this study, we aim to explore the possible linkage between PPARG and LSCC. First, we utilized a mega-analysis and a partial mega-analysis to analyze the potential relationship between PPARG and LSCC. Subsequently, we integrated knowledge from large-scale literature data mining and existing LSCC expression data to construct molecular networks connecting PPARG and LSCC, followed by a GAEA analysis to study the functional profile of the molecules involved in the PPARG-drive network. Our results showed that PPARG was significantly downregulated in about half of the LSCC cases, with multiple pathways  These results indicate that there are influential factors that lead to different expression levels of PPARG among different studies. MLR analysis showed that the population region was a significant factor that influenced the PPARG levels (Figure 1(c)). Specifically, PPARG demonstrated low expression of LFC = −1:60 in the dataset from France (GSE30219), while relatively high expression in the dataset from Germany (LFC = 0:27; GSE6044). Notably, one of the highest (GSE12428; LFC = 0:26) and the lowest (GSE19188; LFC = −1:91) expression levels were from the Netherlands, indicating that, besides population region, there could be other factors that influence the PPARG expression levels in LSCC patients. Further investigation showed that the GSE12428 dataset was especially studying LSCC patients who smoke, while the dataset of GSE19188 contained all patients in general. It has been shown that smoking is one of the major risk factors for the development of LSCC [11], and smokers tend to have low PPARG expression levels [12]. These studies may explain the different PPARG levels in these two Netherlands datasets. Moreover, MLR results also suggest that sample size   and study date were not significant factors for the PPARG levels. Due to lack of data, we only studied the influence of three factors on the expression levels of PPARG. Further study is needed to test the expression level of PPARG in LSCC patients and the possible influential factors such as age, gender, and complication.
GSEA analysis suggested that PPARG may influence the development of LSCC through multiple pathways (Table 3 and Table 4). Besides the regulation of cell proliferation associated with the procession of LSCC, PPARG could also regulate protein polyubiquitination and ubiquitination, which has become increasingly recognized as a controller to regulate the function and signaling of a profusion of proteins. Ubiquitination affects proteins in various cellular processes, including signal transduction, DNA repair, chromosome maintenance, transcriptional activation, cell cycle progression, cell survival, and certain immune cell functions [39]. Thus, it is not surprising that ubiquitin metabolism enzymes prominently feature either oncogenes or tumor suppressors in a variety of cancers and many pathways relevant to cancer. A previous study suggested that targeting those physiological processes may effectively abate the proliferation and facilitate the treatment of lung cancer cells [40]. In the LSCC diagnostic network (Figure 2) Figure 3: LSCC prognostic network interfered with PPARG. Genes in blue represent a decreased expression level from the mega-analysis using 12 LSCC datasets; entities in red represent an increased expression level. Entities highlighted in blue were literature implicated as the LSCC-inhibitors, while those highlighted in red were LSCC-promoters.  [13][14][15]. PPARG may also play a role in the progression of LSCC by interfering upstream regulators of LSCC, as shown in Figure 3. For instance, the knock-down of PPARG has been shown to deactivate STK11 [42], while the loss of STK11 could lead to the formation of LSCC [43]. Furthermore, the activation of PPARG could inhibit three promoters of LSCC, including NOS2 [44], ACE [45], and TNF [46]. Thus, increased expression of PPARG may inhibit the formation of LSCC.
The most significant contribution of this study was the identification of the two PPARG-driven networks (Figures 2 and 3) that partially explain the mechanism of the roles of PPARG in the etiology and development of LSCC. However, the integration of literature data-mine and mega-analysis may exclude potential genes/molecules connection PPARG and LSCC. Further study is needed to validate and consummate the networks identified here.

Conclusion
Results from this study indicated that the expression of PPARG might be suppressed in LSCC patients. Activation of PPARG expression may inhibit the development and progress of LSCC through the regulation of LSCC upstream regulators and downstream marker genes. Our results indicate the need for further study of the relationship between PPARG and LSCC.

Data Availability
The data of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
All the authors declare no conflict of interest.

Authors' Contributions
Shunbin Shi and Guiping Yu have contributed equally to this work.

Supplementary Materials
(1) GSEA1 and GAEA2 present GSEA results for the genes within Figure 2 and Figure 3, respectively. (2) Megaanalysis presents the mega-analysis results for PPARG gene and its driven genes within the Figure 2 and Figure 3.
(3) Partial_Mega-analysis presents the partial megaanalysis results for PPARG; (4) LSCC_diagnostic network presents the reference information for the network presented in Figure 2. (4) LSCC_prognostic network presents the reference information for the network presented in Figure 3. (Supplementary Materials)