PAC-5 Gene Expression Signature for Predicting Prognosis of Patients with Pancreatic Adenocarcinoma

Pancreatic adenocarcinoma (PAC) is one of the most aggressive malignancies. Intratumoural molecular heterogeneity impedes improvement of the overall survival rate. The current pathological staging system is not sufficient to accurately predict prognostic outcomes. Thus, an accurate prognostic model for patient survival and treatment decisions is needed. Using differentially expressed gene analysis between normal pancreas and PAC tissues, cancer-specific genes were identified. A prognostic gene expression model was computed by LASSO regression analysis. The PAC-5 signature (LAMA3, E2F7, IFI44, SLC12A2, and LRIG1), which had significant prognostic value in the overall dataset independently of the pathological stage, was established. We provide evidence that the PAC-5 signature further refined the selection of PAC patients who might benefit from postoperative therapies. SLC12A2 and LRIG1 interacted with proteins implicated in resistance to EGFR kinase inhibitors. DNA methylation was significantly involved in the regulation of genes in the PAC-5 signature. The PAC-5 signature provides new possibilities for improving personalised therapeutic strategies. We suggest that the PAC-5 genes might be potential drug targets for PAC.


Introduction
Pancreatic cancer is an intractable malignancy and the fourth-leading cause of cancer deaths in the United States, with 56,770 new cases and 45,750 deaths in 2019 [1]. It accounts for a small percentage of all cancer deaths (7.2%). However, it is one of the most fatal types of cancer, with a five-year survival rate of only 9%. The vast majority of pancreatic cancers (>85%) are adenocarcinomas occurring in the exocrine glands of the pancreas. Most pancreatic adenocarcinoma (PAC) patients present at an advanced stage at diagnosis. Surgery is considered the most effective treatment and the only potentially curative intervention, but only 20% of the patients are eligible for resection [2].
The American Joint Committee on Cancer (AJCC) staging system has been widely applied worldwide to provide guidelines for prognostic assessment and therapeutic decisions in PAC. The AJCC staging system is based on three components: size and/or local extent of the primary tumour (T), the involvement of regional lymph nodes (N), and metastasis (M). However, it is unable to describe tumour behaviour comprehensively. Indeed, PAC patients with the same AJCC stage may have different clinical prognoses after receiving the same treatment [3]. Thus, a validated model is needed to complement the current pathological staging.
In PAC patients, gemcitabine is still employed as the baseline agent for adjuvant chemotherapy [2]. Thereafter, a combination of gemcitabine with FOLFIRINOX [4] or albumin-bound paclitaxel (nab-paclitaxel) [5] have become first-line therapies. However, the majority of patients poorly respond to these chemotherapeutic agents, and the therapeutic failure rather accelerates drug-resistance and metastatic progression [6]. This phenomenon is supported by intratumoural molecular heterogeneity that arises at multiple stages during tumour progression [7]. Tumourigenesis of PAC involves mutual interactions of diverse factors, including gene mutations and microenvironmental conditions [8]. Furthermore, tumour heterogeneity is closely associated with therapeutic sensitivity [7,9]. It is therefore vital to understand the underlying mechanisms in order to increase the treatment efficacy and improve patient outcomes.
With the remarkable advances in bioinformatic technologies, prognostic gene expression signatures that reflect various clinicopathological and demographic factors have been developed extensively. To date, commercial gene signatures have been successfully established to predict prognosis and guide therapeutic decisions in patients with various cancers, such as head and neck [10] and breast [11] cancers. In PAC, several previous studies have attempted to define tumour subtypes for prediction of prognosis [12][13][14][15] or therapeutic benefit [16]. However, clinical applications are not yet available. Thus, it is necessary to develop a molecular classifier that accurately predicts the prognosis of the individual patient through an understanding of tumour heterogeneity in PAC. Furthermore, a good molecular classifier should minimise the harm caused by overtreatment and thus allow therapeutic benefit to be delivered safely.
Here, we established a novel molecular classifier that accurately predicted the prognosis of the PAC patients, which was closely associated with tumour-specific gene expression. The PAC-5 gene expression signature would give benefit to PAC patients by selecting patients who were suitable for adjuvant therapies. Further, we attempted to provide possibilities for improving prognostic models of PAC heterogeneity via extensive analyses.

Establishment of a Prognostic Gene Expression Signature
In order to generate a molecular classifier that distinguishes PAC patients into low- and high-risk groups, the gene expression data were examined in relation to survival information. GSE71729 was used as a training dataset. A flow chart of the procedure used to generate the gene signature is provided in Figure 1. Initially, 2654 genes were obtained by filtering on gene expression intensity. Then, differentially expressed gene (DEG) analysis between normal pancreas and tumour tissues was employed to identify cancer-related genes, by which 1149 tumour-specific genes were obtained (Table S2). These genes were used for least absolute shrinkage and selection operator (LASSO) regression analysis with overall survival (OS) as the survival endpoint. As a result, we obtained a subset of prognostic genes: LAMA3, E2F7, IFI44, SLC12A2, LRIG1, DUOXA1, and RBM1. However, DUOXA1 expression did not act as an independent prognostic biomarker to stratify the patients into distinct risk-groups, and the gene expression data for RBM1 were not available in the external validation datasets. These two genes were therefore excluded from the final gene expression signature related to OS. Thus, we established a prognostic model, termed the PAC-5 signature, comprising five genes (LAMA3, E2F7, IFI44, SLC12A2, and LRIG1, Table S3). Based on the PAC-5 gene expression patterns, the low- and high-risk groups were accurately represented by two clusters (Figure 2A, upper panel). To confirm whether the PAC-5 genes were tumour-specifically expressed, the mRNA expression levels of the five genes were compared between normal subjects and the risk-groups of the PAC patients. The expression levels of LAMA3, E2F7, IFI44, and SLC12A2 were higher in PACs than in the normal pancreases. However, LRIG1 mRNA was less expressed in PACs than in the normal pancreatic tissues.
In the analysis between the tumour tissues, mRNAs of LAMA3, E2F7, and IFI44 were more highly expressed in the tissues of the high-risk group than in those of the low-risk group, whereas the expression levels of SLC12A2 and LRIG1 were lower in the high-risk group than in the low-risk group (Figure S1A-E). Prognostic index values were calculated based on the PAC-5 signature for all patients and normal subjects. The patients were classified into low- (n = 63) and high-risk (n = 73) groups by their prognostic indices (Figure 2A, lower histogram). Prognostic index values for the high-risk group were significantly higher than those for the two other groups (Figure S1F). Moffitt et al. [13] previously suggested two distinctive subtypes for predicting prognosis of PAC patients, of which a 'basal-like subtype' was associated with poorer prognostic outcome than a 'classical subtype'. These Moffitt classification subtypes were associated with the gene expression patterns and histological cellularity in PACs. Interestingly, LAMA3 was also among the genes they used for subtype discrimination, and its over-expression was related to the 'basal-like subtype'. Thus, to further evaluate the relevance between the PAC-5 signature and the Moffitt classification, we performed an association analysis using the χ² test; the risk-groups defined by the PAC-5 signature were significantly correlated with the Moffitt classification (p = 1.12 × 10⁻³, Table S4). To further verify the survival difference between low- and high-risk groups in the training dataset, we employed Kaplan-Meier survival curve analysis. The Kaplan-Meier plot indicated a significant prognostic difference between the low- and high-risk groups, with median OS of 27.4 and 13.7 months, respectively (p = 8.37 × 10⁻⁴, Figure 2B).

Survival Analysis and Clinical Relevance of PAC-5 Signature in the Validation Datasets
Next, to further estimate the robustness of the classifier, the PAC-5 signature was validated in the combined five microarray and three RNA-seq datasets. During leave-one-out cross-validation (LOOCV), the specificity and the sensitivity for correctly predicting risk were 0.839 and 0.905, respectively, with the compound covariate predictor. The PAC-5 signature significantly classified patients into low- and high-risk groups with median OS of 30.4 and 17.7 months in the combined validation datasets (p = 1.88 × 10⁻⁷, Figure 3A), and median RFS of 17.5 and 15.5 months in the combined validation datasets (p = 0.046, Figure 3B). Kaplan-Meier plots also showed significant prognostic differences in the microarray datasets and RNA-seq datasets (p = 4.87 × 10⁻³ and p = 6.94 × 10⁻⁷ for OS, respectively, Figure S2A,B). One external dataset, GSE62452, had gene expression data of adjacent tissues paired to the data of their tumour tissues (n = 61). To further confirm that the PAC-5 genes were tumour-specifically expressed, we evaluated the mRNA expression levels of the five genes between adjacent tissues and their tumour tissues assigned to the two risk-groups of the PAC patients. Similarly to the results of the training dataset, the expression levels of LAMA3, E2F7, IFI44, and SLC12A2 were higher in PACs than in the adjacent tissues. However, LRIG1 mRNA was less expressed in PACs than in the adjacent tissues. In the analysis between the tumour tissues, mRNAs of LAMA3, E2F7, and IFI44 were more highly expressed in the tissues of the high-risk group than in those of the low-risk group, whereas the expression levels of SLC12A2 and LRIG1 were lower in the high-risk group than in the low-risk group (Figure S3A-E). Prognostic index values for the high-risk group were also significantly higher than those for the two other groups (Figure S3F).
Univariate Cox regression analysis revealed significant prognostic accuracy of the PAC-5 signature for survival time in the training dataset [hazard ratio (HR) 1.781, 95% confidence interval (CI) 1.105-2.873, p = 0.018]. Since the training dataset had no information on clinicopathological characteristics, the prognostic value of the PAC-5 signature could not be compared with prognostic covariates there. Thus, univariate and multivariate Cox regression analyses were also performed using the combined validation datasets. In the univariate analysis, pathological grade, primary tumour size, lymph node metastasis, AJCC staging, and the PAC-5 signature were significantly associated with OS, compared to their referents, except for pathological grade 4. The significant covariates and our PAC-5 signature were used in multivariate analysis, in which pathological grade (G2 and G3), primary tumour size (T2 and T3), lymph node metastasis, and the PAC-5 signature still presented significant prognostic values (Table 1).
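The reported hazard ratio, confidence interval, and p-value for the training dataset can be checked against one another. The following is a minimal sketch, assuming that the reported interval is a Wald interval symmetric on the log scale (an assumption, since the text does not state how the interval was computed); the function name is illustrative only.

```python
from math import erfc, log, sqrt

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from a hazard ratio and its 95% CI, assuming a Wald interval
    that is symmetric on the log-hazard scale (an assumption)."""
    se = (log(upper) - log(lower)) / (2 * z_crit)  # SE of log(HR)
    z = log(hr) / se                               # Wald z statistic
    p = erfc(abs(z) / sqrt(2))                     # two-sided normal p-value
    return se, z, p

# Values reported for the training dataset: HR 1.781, 95% CI 1.105-2.873
se, z, p = wald_from_ci(1.781, 1.105, 2.873)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under this assumption, the recovered p-value agrees with the reported p = 0.018 to three decimal places, which suggests that the three reported quantities are mutually consistent.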

Validation of the PAC-5 Signature in Stage I and II PAC Patients
The AJCC staging system is the most widely accepted prognostic model for PAC. However, the prognostic value of AJCC staging is limited, in that the survival rates of patients in stages IB, IIA, and IIB are reported to be identical [3]. Thus, we investigated whether the PAC-5 signature could suitably stratify patients with stage I or II tumours into the two risk-groups in the validation datasets. The combined validation datasets included patients with survival information in stage I (n = 66, 9%) and II (n = 621, 85.4%). Indeed, we observed that the AJCC staging system did not properly stratify patients with stage IA, IB, IIA and IIB for survival (Figure S4A). In particular, Kaplan-Meier survival curves for stages IB and IIA were not significantly different (p = 0.297, Figure S4B). However, the PAC-5 signature significantly stratified the stage IB or IIA patients into low- and high-risk groups (p = 0.047 for stage IB and p = 0.043 for stage IIA, respectively, Figure 4A,B). Moreover, it stratified the patients with stage IIB into two distinct prognostic risk-groups (p = 8.61 × 10⁻⁵, Figure 4C). The patients with stage IA could not be classified into different risk-groups by the PAC-5 signature (p = 0.109, Figure S4C).

Chemotherapy
Gemcitabine-based adjuvant chemotherapy is currently recommended as a standard therapy after surgery for PAC [2]. However, a substantial number of PAC patients respond poorly to the chemotherapeutic agents, which instead causes drug resistance and metastatic progression [6]. The clinicopathological information on chemotherapeutic treatment was incomplete in the GSE79668 and TCGA RNA-seq datasets (Table S5). The clinical information of the TCGA dataset only indicated the patients who received adjuvant chemotherapy (n = 100). In the GSE79668 dataset, the details of chemotherapy were archived with the drug names: Yes (n = 17) and No (n = 6). We could thus not assess which patients in each risk-group had therapeutic benefit according to the PAC-5 signature. However, the PAC-5 signature significantly stratified the patients who received chemotherapeutic drugs (n = 117) into two risk-subgroups for OS (p = 0.022, Figure 5A), but did not classify the patients for RFS (p = 0.087, Figure 5B).

Radiotherapy
Adjuvant radiotherapy has frequently been used as an integral component to treat PAC [17]. However, it is still controversial whether the patients benefit from radiotherapy [18]. Thus, in order to investigate the association of the PAC-5 signature with a response to adjuvant radiotherapy, we performed subgroup analysis. Radiotherapy itself showed therapeutic benefit for OS, but not for RFS, in the GSE79668 and TCGA RNA-seq datasets (p = 4.81 × 10⁻³ for OS and p = 0.631 for RFS, Figure S5A,B, respectively). By incorporating the PAC-5 signature into radiotherapy information, the high-risk patients were shown to obtain a benefit for OS, compared to the patients without adjuvant radiotherapy (p = 2.82 × 10⁻⁴, Figure 6A). In contrast, low-risk patients did not show a significant difference in radiotherapy effect (p = 0.832, Figure 6B). However, neither the low- nor the high-risk group benefited from radiotherapy for RFS (Figure 6C,D).

Targeted Molecular Therapy
Targeted molecular therapy has been suggested as a type of personalised medicine designed to treat cancer by inhibiting oncoproteins that drive signalling pathways in cancer [19]. In PAC treatment, erlotinib, a selective epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor (TKI), is the only targeted therapeutic agent approved by the Food and Drug Administration (FDA) [20]. Although the administered drug was incompletely named in the clinicopathological information of the TCGA dataset, the association of the PAC-5 signature with the response to targeted molecular therapy was examined in the TCGA RNA-seq validation dataset. The patients with targeted molecular therapy had therapeutic benefit for OS and RFS, compared to the patients without targeted molecular therapy (p = 8.53 × 10⁻⁴ for OS and p = 0.018 for RFS, respectively, Figure S5C,D). By incorporating the PAC-5 signature into targeted molecular therapy, the high-risk patients were shown to obtain a benefit in OS and RFS compared to patients without targeted molecular therapy (p = 1.10 × 10⁻⁸ for OS and p = 1.25 × 10⁻³ for RFS, respectively, Figure 7A,C). In contrast, the low-risk patients did not show a significant difference in the treatment outcome (p = 0.172 for OS and p = 0.832 for RFS, respectively, Figure 7B,D).

Associations of PAC-5 Signature with KRAS Status
The malignant behaviour of cancer cells is driven by mutations in oncogenes and tumour suppressor genes [21]. In PAC, KRAS is the most frequently mutated gene (in ~95% of cases) [22]. In the GSE79668 and TCGA RNA-seq datasets, KRAS mutations were observed in 82.4% of all pancreatic tumour cases. The KRAS status itself did not show a significant difference in prognostic outcomes of the patients for either OS or RFS (Figure S6A,B). The PAC-5 signature did not classify the patients with wild-type KRAS into low- and high-risk groups for OS (p = 0.218, Figure 8A), whereas the patients with mutant KRAS were significantly stratified into two distinct risk-groups (p = 9.19 × 10⁻⁴, Figure 8B). However, the PAC-5 signature still did not stratify the patients for RFS regardless of KRAS status (Figure 8C,D). To further assess whether the KRAS status influences the patients assigned to the two risk-subgroups by the PAC-5 signature, we stratified each risk-subgroup by incorporating the KRAS status. However, no risk-groups were further classified by KRAS status (Figure S7A-D).

DNA Methylation Regulating Expression of the PAC-5 Genes
DNA methylation is a critical epigenetic gene regulation mechanism in cancer [23]. To assess whether DNA methylation influenced the PAC-5 gene expression, correlations between the gene expression and the DNA methylation status at CpG sites were analysed. The threshold for the methylation value of CpG sites was set as an absolute difference |Δβ| = |β_Tumour − β_Normal| > 0.1 between the tumour and adjacent normal tissue; sites with Δβ > 0.1 were defined as hypermethylated, and sites with Δβ < −0.1 were considered hypomethylated. The associations between the proximal gene expression and DNA methylation were determined by Pearson's correlation coefficient (r). When the two criteria (r > 0.4 and p < 0.05) were satisfied, the correlation was considered significant. Significant CpG sites were obtained for two genes, LAMA3 and LRIG1, with moderate r-values (Figure 9A-C and Table S6). A DNA methylation heatmap for these genes is provided in Figure 9D. However, no considerable CpG sites were found for regulation of the other genes.
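The Δβ thresholding and the correlation step described above can be sketched as follows. This is an illustrative sketch only: the function names and the example values are hypothetical, the p-value criterion is not computed here, and whether the correlation criterion was applied to r or to its absolute value is not stated in the text.

```python
from math import sqrt

def classify_cpg(beta_tumour, beta_normal, threshold=0.1):
    """Label a CpG site from delta-beta = beta_tumour - beta_normal."""
    delta = beta_tumour - beta_normal
    if delta > threshold:
        return "hypermethylated"
    if delta < -threshold:
        return "hypomethylated"
    return "unchanged"

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical beta values and expression levels, for illustration only
print(classify_cpg(0.75, 0.5))                    # site more methylated in tumour
print(pearson_r([0.2, 0.4, 0.6, 0.8], [8.0, 6.0, 4.0, 2.0]))
```

In practice, a statistics package would normally be used to obtain the p-value for each correlation alongside r.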

Identification of Protein-Protein Interaction Network Associated with the PAC-5 Signature
Finally, to investigate how the PAC-5 genes might contribute to PAC progression, we performed PPI analysis in the NetworkAnalyst tool [24]. Three PPI networks related to the PAC-5 genes were generated, with 53 nodes representing the proteins and 51 edges representing the interactions between the proteins (Figure 10). To further annotate functions of the proteins interacting with the PAC-5 genes, we performed a KEGG pathway analysis. Importantly, two genes in the PAC-5 signature, LRIG1 and SLC12A2, potentially interacted with the genes involved in EGFR-TKI resistance (Table S7).
Figure 10. Protein-protein interaction network analysis of the PAC-5 genes. The interaction map was generated using the STRING database with experimental evidence in NetworkAnalyst 3.0. The proteins of the PAC-5 signature are red-circled, and the proteins related to the KEGG pathway term for EGFR inhibitor resistance are black-circled.

Discussion
Pancreatic adenocarcinoma (PAC) is a highly heterogeneous disease with poor clinical outcomes. The prognostic prediction for treatment and mortality after surgery is frequently limited due to tumour molecular heterogeneity. Hence, the primary challenge is to develop a precise prognostic model that provides criteria for clinical treatment decisions. To address this issue, we established a PAC-5 gene expression signature via DEG profiling of normal pancreatic and PAC tissues in publicly available datasets to identify the tumour-specific genes. We here introduced novel genes, IFI44, SLC12A2 and LRIG1, which did not overlap with other prognostic gene signatures for PAC. The robustness of the PAC-5 signature was supported by the reproducibility of a significant association between the predicted outcome and patient prognosis in external validation datasets comprising by far the largest set of gene expression profiles. Moreover, the PAC-5 signature could be a complementary prognostic adjunct to pathological staging to pave the way to personalised management strategies. Therapeutic subgroup analysis showed that the PAC-5 signature might predict which patients would benefit from adjuvant therapies such as chemotherapy, radiotherapy, and targeted molecular therapy. Furthermore, we revealed that the PAC-5 signature could give potential therapeutic benefit to the patients with mutant KRAS. The five genes were found to be involved in tumourigenic signalling pathways, such as the MAPK, PI3K-AKT, and ERBB pathways. Finally, network analyses of the PAC-5 signature provided clues for further elucidation of PAC heterogeneity and potential therapeutic target genes.
In the process of PAC-5 signature development, we initially subjected normal pancreas and tumour tissues to DEG analysis to find pancreatic cancer-specific genes. We subsequently identified seven genes (LAMA3, E2F7, IFI44, SLC12A2, LRIG1, DUOXA1, and RBM1) related to OS of PAC patients, using Cox proportional hazards analysis. Finally, we established a prognostic model with five genes that classified patients into two distinct risk-subgroups. Among the five genes in the PAC-5 signature, expression levels of LAMA3, E2F7 and IFI44 were elevated in the high-risk group, whereas SLC12A2 and LRIG1 expression was relatively lower. Interestingly, SLC12A2 expression was higher in tumour tissues than in normal pancreas tissues; however, it was more highly expressed in the low-risk group than in the high-risk one. Further observation is thus necessary to elucidate how SLC12A2 expression is regulated in PAC biology. We also observed similar results from the analysis of an external validation dataset, which had adjacent tissues paired to their PAC tissues. A supervised method was used to construct the gene signature, which was refined by LOOCV. Furthermore, a meta-analysis approach based on five microarray datasets (n = 474) and three RNA-seq datasets (n = 283) was applied to validate the prognostic significance of the gene signature in association with overall survival (30.4 months in the low-risk group and 17.7 months in the high-risk group). The PAC-5 signature and clinical parameter adjustment showed a significant association with survival in univariate analysis. Importantly, multivariate analysis demonstrated that the PAC-5 signature was the most significant variable associated with the prognosis of patients with PAC. The AJCC staging system cannot accurately predict patient survival, in that the survival times of stage IB, IIA, and IIB patients actually show no significant differences [3].
This intra-stage variance is due to tumour heterogeneity, resulting in different clinical prognosis after receiving the same treatments [6]. In our subgroup analysis, the patients with stage IIB distinctively showed poor prognosis, not in accordance with previous studies. By incorporating the PAC-5 signature, the patients with stage IB, IIA, and IIB were further stratified into significantly low-and high-risk groups. These consistent results indicate that our gene signature could be a complementary prognostic adjunct to pathological staging to pave the way to personalised management strategies.
All current treatment regimens and many clinical trials targeting specific molecular pathways have failed to improve therapeutic efficacy in PAC patients. Thus, the identification of patients who respond well to adjuvant therapy remains a major clinical concern. We demonstrated that the PAC-5 signature is closely associated with clinical outcomes of adjuvant therapies. Adjuvant chemotherapy is currently recommended as a standard therapy after resection for PAC [2]. However, it has modest clinical benefit and may not improve OS. The lack of significant chemotherapeutic response of PAC results in the inherent drug resistance of tumour cells [6]. In our analysis, the clinicopathological information was incomplete for chemotherapeutic treatment. We could thus not assess which patients in each risk-group had therapeutic benefit according to the PAC-5 signature. Nonetheless, the PAC-5 signature further classified the patients who received adjuvant chemotherapy for OS. The potential role of radiotherapy in the management of resectable tumours in adjuvant settings remains controversial [17]. Reports on its treatment efficacy in patients with PAC are conflicting [25]. In our study, subgroup analysis of patients with available data revealed that adjuvant radiotherapy was beneficial for high-risk patients to improve OS. Targeted molecular therapy is one of the primary modalities in cancer treatment, which interferes with specific molecules needed for tumourigenesis [26]. Currently, no effective targeted molecular therapies have been found for PAC. Because of a lack of information on adjuvant targeted molecular therapy, we could not comprehensively evaluate the efficacy of the PAC-5 signature to predict therapeutic outcomes. However, we found that targeted molecular therapy was associated with improved OS and RFS in the high-risk patients identified by the PAC-5 signature, but not in the low-risk group.
Hence, the PAC-5 signature results suggested a potential advantage of adjuvant therapies to patients in high-risk group, although we could not draw definite conclusions because of the small number of patients used in these analyses or incomplete information.
In most cases of PAC, oncogenic KRAS mutations, which initially drive pancreatic neoplasia, are prevalent [22]. Given the substantial evidence that mutant KRAS is critical for PAC progression, it has been extensively investigated [27]. However, no effective targeted therapies for KRAS have been established for PAC. In our analysis, KRAS status itself was not associated with prognostic outcomes. We found that the PAC-5 signature further stratified patients with mutant KRAS for OS, while the signature did not show prognostic differences for RFS. Thus, the PAC-5 signature might indicate a potential benefit from adjuvant targeted molecular therapy in patients with mutant KRAS, although we agree that the small number of patients used in these analyses does not allow a strong conclusion about its predictive power.
The majority of genes in the PAC-5 signature (LAMA3, E2F7, SLC12A2, and LRIG1) have been reported to be associated with tumour progression in various types of cancer. LAMA3 is the alpha subunit of laminin-332, which is further composed of laminin subunit β3 (LAMB3) and laminin subunit γ2 (LAMC2). Tumourigenic roles of laminin-332 are well-known in diverse cancers such as breast and colon cancers [28] and squamous cell carcinoma [29] as well as PAC [30]. E2F7, one of the E2F transcription factors, has critical roles in the regulation of cell cycle progression and the DNA-damage response [31,32]. In cancer biology, E2F7 is associated with poor survival in squamous cancers [33]. Loss of E2F7 confers resistance to poly-ADP-ribose polymerase (PARP) inhibitors in BRCA2-deficient breast cancer cells [34]. SLC12A2, a Na⁺-K⁺-2Cl⁻ cotransporter, plays a role in membrane blebbing via interactions with actin and the p38 mitogen-activated protein kinases (p38 MAPK) in malignant mesothelioma cells [35]. Pharmacological modulation of K⁺ transport increases sensitivity to apoptosis in a human malignant pleural mesothelioma cell line [36]. SLC12A2 expression is associated with glioblastoma cell invasion and aggressiveness [37]. LRIG1 participates in the aggressive progression of several tumours, in which its expression is frequently decreased [38][39][40]. More importantly, blocking the EGFR pathway with its antagonist erlotinib abrogated LRIG1 suppression-induced EMT and, subsequently, cell invasion, migration, and vasculogenic mimicry of melanoma cells under hypoxia [41]. IFI44 is one of the interferon-α-stimulated genes (ISGs), which is associated with infections by several viruses such as hepatitis C virus [42], rhinovirus [43] and human papillomavirus [44]. In addition, IFI44 inhibits cAMP-mediated signalling downstream of ERK via depletion of intracellular GTP, resulting in arrest of cell division in melanoma [45].
In breast cancer, reduced expression of IFI44 in lymphocytes exacerbates cancer-associated immune dysfunction [46]. However, the molecular functions of IFI44 in cancer cells remain to be explored.
The PPI network analysis of the PAC-5 signature indicated the possibility of drug resistance to EGFR-TKIs. EGFR-TKIs generally bind the tyrosine kinase domain of EGFR, and thus inhibit its activity. For instance, erlotinib and/or gefitinib (small molecular EGFR-TKIs) achieved significant treatment efficacy in patients with lung cancer or PAC [20,47]. Nevertheless, cancer cells gradually acquire resistance to these drugs, resulting in progression and relapse [48]. SLC12A2 and LRIG1 were shown to interact with proteins involved in EGFR kinase inhibitor resistance, such as EGFR, ERBB2, ERBB3, and c-MET. ERBBs are known to promote pancreatic cancer development [49]. Overexpression or mutation of ERBB2 is associated with resistance to EGFR-TKIs [50]. Overexpression of ERBB3 in poorly differentiated colorectal cancer cell lines led to a significant resistance to gefitinib in vitro and in vivo [51]. Furthermore, ERBB3 phosphorylation is driven by EGFR and/or ERBB2, or through amplification of the proto-oncogene c-Met [52]. Several studies have shown that the drug resistance to either TKIs or gemcitabine is developed through hyperactivation of the c-MET/HGF signalling axis [53][54][55]. In addition, although LAMA3 and E2F7 did not exhibit direct interactions with proteins involved in resistance to EGFR-TKIs, these two proteins were also connected to many proteins related to tumour progression [56,57]. Accordingly, we suggest that the PAC-5 signature can be used as a biomarker panel to estimate not only the clinical effectiveness of EGFR-TKIs but also drug resistance. In this manner, the PAC-5 genes might be utilised as valuable targets for concurrent therapy in addition to their role as prognostic markers.
The development of high-throughput technologies has allowed integrated approaches to the genetic and epigenetic patterns underlying the regulatory mechanisms of genes of interest. In our analysis of DNA methylation for gene regulation, we found that alterations in methylation influence the expression of the PAC-5 genes. DNA methylation is critical in the early formation and progression of diseases, especially cancers, and hypermethylation of the promoter and/or CpG island (CGI) of genes results in transcriptional silencing [58]. The LAMA3 loci were relatively hypermethylated at one transcriptional region in the patients of the low-risk group, which was significantly associated with mRNA expression. Consistent with our data, a previous study reported that LAMA3 promoter methylation frequency was inversely associated with increased tumour stage and tumour size in breast cancer [59]. In contrast, two different CpG regions of LRIG1 were hypermethylated in high-risk patients, which were significantly involved in the gene expression. At present, additional regulatory mechanisms for the expression of the five genes in PAC biology remain to be uncovered. Prospective studies would provide new insight into cancer-specific gene regulation for understanding the molecular heterogeneity of PAC.

Development of the Prognostic Gene Expression Signature
A prognostic gene signature was developed using the GSE71729 training dataset. First, the 19,749 genes were filtered by intensity variation: genes showing an absolute change of at least two-fold (log2 scale) in fewer than 20% of the patients were excluded. Next, differentially expressed gene (DEG) analysis [66] between normal pancreas (n = 46) and PAC tissues (n = 125) was performed to isolate tumour-specific genes. A stringent p-value (p < 0.001) and a false discovery rate (FDR) < 0.1, using a univariate permutation test (1000 permutations), were set as the cut-offs for the DEGs. LASSO regression [67] (p < 0.001) was then used to identify the OS-associated gene signature from the training dataset. After this step, genes that were not compatible with the external validation datasets or not significant in the individual survival curves were excluded. For predicting prognosis, genes from the survival signature were applied to survival risk prediction analysis. This method utilised the principal component from the training dataset and generated a prognostic index for each patient. The prognostic index (y) was computed by the following formula, where w_i and x_i are the weight and the log-transformed gene expression for the i-th gene, respectively:
$$y = \sum_{i} w_i x_i$$
The weight values of the genes were as follows: w_LAMA3 = 0.306929; w_E2F7 = 0.118701; w_IFI44 = 0.263742; w_SLC12A2 = −0.4137; and w_LRIG1 = −0.190012. The patients were divided into two risk groups according to the median prognostic index value of −0.060239. Patients were assigned to the high-risk group if their prognostic index was higher than the median value, whereas the low-risk group comprised patients whose prognostic index was equal to or less than the median value. A dendrogram of the prognostic genes was generated using the heatmap function in R with default settings for the clustering algorithm.
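The risk assignment described above can be illustrated with a short sketch. It assumes the prognostic index is the plain weighted sum of logged expression values, which is the usual form of such an index. The formula itself is not reproduced in this text, so the `offset` argument is a hypothetical placeholder for any constant term. The function names and the example patient values are illustrative and are not taken from the study.

```python
# Weights for the PAC-5 genes as reported in the text.
PAC5_WEIGHTS = {
    "LAMA3": 0.306929,
    "E2F7": 0.118701,
    "IFI44": 0.263742,
    "SLC12A2": -0.4137,
    "LRIG1": -0.190012,
}

# Median prognostic index of the training dataset, used as the cutoff.
MEDIAN_CUTOFF = -0.060239


def prognostic_index(log_expr, offset=0.0):
    """Weighted sum of logged expression values for the five genes.

    `offset` is an assumed placeholder for a constant term, if any;
    the text does not state one.
    """
    return sum(w * log_expr[gene] for gene, w in PAC5_WEIGHTS.items()) + offset


def risk_group(index, cutoff=MEDIAN_CUTOFF):
    """High risk if the index exceeds the cutoff, otherwise low risk."""
    return "high" if index > cutoff else "low"


# Hypothetical, normalised log-expression values for one patient.
patient = {"LAMA3": 1.0, "E2F7": 0.5, "IFI44": 0.2, "SLC12A2": -0.3, "LRIG1": -0.1}
group = risk_group(prognostic_index(patient))
```

With these weights, higher LAMA3, E2F7 and IFI44 expression raises the index, whereas higher SLC12A2 and LRIG1 expression lowers it.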

Validation of the Prognostic Signature
The gene signature was validated in external datasets. Gene expression data from the different datasets were normalised by subtracting the median expression value across the samples. A compound covariate predictor was used as the class prediction algorithm to further refine the model and sub-stratify the predicted outcomes [68]. Robustness was estimated by the misclassification rate determined during leave-one-out cross-validation (LOOCV).
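The median normalisation mentioned above might look like the following sketch. It assumes that centring is applied per gene, across the samples of each dataset separately, which is one reading of the description. The data layout (a dictionary mapping gene to a list of logged values) and the function name are illustrative choices.

```python
from statistics import median


def median_centre(expr):
    """Subtract each gene's median across samples from its values.

    expr: dict mapping gene symbol -> list of logged expression values,
          one value per sample in a single dataset.
    """
    centred = {}
    for gene, values in expr.items():
        m = median(values)
        centred[gene] = [v - m for v in values]
    return centred


# Hypothetical logged expression values for three samples.
dataset = {"LAMA3": [1.0, 2.0, 4.0], "LRIG1": [5.0, 3.0, 4.0]}
normalised = median_centre(dataset)
```

Centring each dataset on its own median places the expression values from different platforms on a comparable scale before a common cutoff is applied.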
Kaplan-Meier survival analyses were performed after classifying the patients into two risk groups, and Chi-square (χ2) and log-rank tests were used to compare survival between the two predicted risk subgroups. Univariate and multivariate Cox proportional hazards regression analyses were used to identify independent prognostic factors associated with survival, with the gene signature, tumour grade, and pathological characteristics as covariates.
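For readers unfamiliar with the Kaplan-Meier method, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the software used in this study, and the function name and toy data are illustrative. At each distinct event time, the survival probability is multiplied by one minus the fraction of at-risk patients who died at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times:  follow-up durations, one per patient.
    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy example: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, one such curve would be estimated for each risk group and the two curves compared with a log-rank test.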

Network and Pathway Enrichment Analysis
NetworkAnalyst 3.0 (http://www.networkanalyst.ca/; accessed July 2019) is a web-based visual analytics platform for comprehensive profiling, meta-analysis and systems-level interpretation of gene expression data [24]. NetworkAnalyst 3.0 was used to generate protein-protein interaction (PPI) networks and then to perform KEGG pathway enrichment analysis. The PPI network analysis was performed using the STRING database v11.0 (http://string-db.org/) [69] with experimental evidence. KEGG pathway enrichment analysis was conducted to annotate the pathways in which the genes of the expression signature were involved. An adjusted p < 0.05 was considered significant for all enrichment analyses.

DNA Methylation Analysis of Gene Regulation
DNA methylation profiling analysis of gene regulation was performed using TCGA DNA methylation data (Illumina Infinium HumanMethylation450 platform). The DNA methylation β value of a CpG site estimated the methylation level using the ratio of intensities between methylated and unmethylated alleles. The threshold for a methylation change at a CpG site was set as |∆β| = |β_Tumour − β_Normal| > 0.1 between the tumour and adjacent normal tissue; ∆β > 0.1 was defined as hypermethylation, while ∆β < −0.1 was defined as hypomethylation. The association between proximal gene expression and DNA methylation was measured by Pearson's correlation coefficient (r). Correlation strength was categorised as follows: 0.1 < |r| ≤ 0.4, weak correlation; 0.4 < |r| ≤ 0.7, moderate correlation; 0.7 < |r| ≤ 0.9, strong correlation.
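The thresholds above can be expressed as a short sketch. The ∆β cutoff of 0.1 and the |r| ranges are taken from the text. The labels "unchanged" and "unclassified" are assumptions introduced here for values that fall outside the stated ranges, and the function names are illustrative.

```python
from math import sqrt


def methylation_status(beta_tumour, beta_normal, threshold=0.1):
    """Classify a CpG site by delta-beta = beta_tumour - beta_normal."""
    delta = beta_tumour - beta_normal
    if delta > threshold:
        return "hypermethylated"
    if delta < -threshold:
        return "hypomethylated"
    return "unchanged"  # assumed label; not named in the text


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_strength(r):
    """Map |r| onto the categories given in the text."""
    a = abs(r)
    if 0.1 < a <= 0.4:
        return "weak"
    if 0.4 < a <= 0.7:
        return "moderate"
    if 0.7 < a <= 0.9:
        return "strong"
    return "unclassified"  # assumed label for |r| <= 0.1 or |r| > 0.9


# Hypothetical beta values and expression levels for one CpG site.
status = methylation_status(0.8, 0.5)
r = pearson_r([0.2, 0.4, 0.6, 0.8], [3.0, 2.5, 2.6, 1.0])
strength = correlation_strength(r)
```

A negative r for a promoter CpG site would be in line with the transcriptional silencing by hypermethylation discussed earlier.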

Statistical Methods
Gene expression datasets were analysed using BRB-ArrayTools Version 4.6 (http://brb.nci.nih.gov/BRB-ArrayTools/) [66]. All other statistical analyses were performed in the R language environment (http://www.r-project.org) and Statistical Package for Social Sciences (SPSS) software (version 25, SPSS Inc, Chicago, IL, USA). In all statistical analyses, a p-value of less than 0.05 was considered significant.

Conclusions
In this study, we developed a novel gene signature, the PAC-5 signature, from DEG analysis between normal pancreas and PAC tissues to identify cancer-specific genes. The gene signature accurately and robustly identifies individual PAC patients at high risk of mortality. The prognostic value of the PAC-5 signature was statistically significant in the overall datasets, independently of the pathological staging. Furthermore, we provided evidence that the PAC-5 signature might help to refine the selection of PAC patients who may benefit from adjuvant radiation or targeted molecular therapies. Hence, we propose that the five genes in our signature might be promising molecular targets for PAC treatment.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6694/11/11/1749/s1. Figure S1: Comparison of PAC-5 gene patterns between normal pancreas tissues and tumour tissues assigned to low- or high-risk groups in the training dataset, Figure S2: Kaplan-Meier survival analysis of PAC-5 signature in microarray and RNA-seq datasets, Figure S3: Comparison of PAC-5 gene patterns between normal pancreas tissues and tumour tissues assigned to low- or high-risk groups in the GSE62452 validation dataset, Figure S4: Kaplan-Meier survival analysis of PAC patients with early-stage, Figure S5: Kaplan-Meier survival analysis according to radiotherapy and targeted molecular therapy, Figure S6: Kaplan-Meier survival analysis of PAC patients with KRAS status, Figure S7: Kaplan-Meier survival analysis of PAC-5 risk-subgroups with KRAS status, Table S1: Clinical characteristics of patients in training and validation datasets, Table S2: Class comparison analysis between normal pancreas and pancreatic tumour tissues, Table S3: Annotation of the PAC-5 genes, Table S4: Association between the PAC-5 signature and Moffitt classification, Table S5: Information of chemotherapeutic drug names and treatment in validation datasets, Table S6: Methylation of the PAC-5 genes, Table S7: List of protein-protein interaction network.

Conflicts of Interest:
The authors declare no conflict of interest.