N6-methyladenosine-related long noncoding RNA is a potential biomarker for predicting pancreatic cancer prognosis Tumor

Pancreatic cancer is a common malignant tumor of the digestive system, with insidious onset, difficult early diagnosis, easy metastasis, and poor prognosis. N6-methyladenosine (m6A) and long non-coding RNA (lncRNA) play important roles in the prognostic value and immunotherapy response of pancreatic adenocarcinoma (PAAD). Therefore, it is crucial to recognize m6A-related-lncRNAs in PAAD patients. In this study, m6A-related lncRNAs were obtained by coexpression analysis. Univariate, the Least Absolute Shrinkage, and Selection Operator (LASSO) and multivariate Cox regression analyses were performed to construct m6A-related lncRNA prognostic models. Kaplan–Meier analysis, principal component analysis, feature-rich annotation, and nomogram were used to analyze the accuracy of risk models. Potential drugs targeting this model are also discussed. A prognostic model based on m6A-related lncRNAs was constructed, potential drugs targeting this m6A-related lncRNAs feature were discovered, and the relationship with immunotherapy response was studied. Finally, a nomogram was established to predict survival in PAAD patients. This m6A-based lncRNAs risk prognostic model may be promising for clinical prediction of prognosis and immunotherapy response in PAAD patients.


Introduction
Pancreatic adenocarcinoma (PAAD) is a common malignant tumor of the digestive system, with a high degree of malignancy and strong invasiveness [1,2] . Despite the continuous development of multidisciplinary comprehensive treatment, pancreatic cancer is often found at an advanced stage, and the prognosis of patients is still poor, with a median survival time of 5 -8 months. Its onset is insidious, its early diagnosis is difficult, and it is prone to metastasis. Despite the continuous new progress in the field of comprehensive treatment of pancreatic cancer, there is still little effect in improving the prognosis of patients, and the 5-year survival rate is still <9% [3,4] . Therefore, it is very important to find more effective clinical indicators for the diagnosis and treatment of pancreatic cancer patients. https://doi.org/10.36922/td.v1i2.165

Tumor Discovery
Prognostic biomarkers in pancreatic cancer N6-methyladenosine (m6A) is a dynamic methylation modification located at the N6 site of adenosine, which is the most common internal modification in eukaryotic mRNA, mediating mRNA splicing, structural switching, transport, and translation, degradation and other metabolic processes [5][6][7] . The disordered regulation of m6A methylation modification may affect the processing, degradation, and translation of mRNA, leading to the activation of oncogenes and the inactivation of tumor suppressor genes, which are closely related to the occurrence, development, and drug resistance of malignant tumors. M6a methylation modification involves the action of various modifying enzymes, which are the main factors regulating carcinogenesis and tumor progression [8] . Long non-coding RNA (lncRNA) is a general term for a class of non-coding RNAs longer than 200 nucleotides, which has almost no protein-coding function due to the lack of complete open reading frames. Promotion or inhibition of cancer development can affect the diagnosis and the treatment of tumors [9,10] . Changes in RNA can affect a variety of biological processes. Therefore, the role of m6Aregulated lncRNAs may be crucial for the proliferation and migration of cancer cells [11] . Besides, studies have reported that lncRNAs can promote pancreatic cancer cell proliferation and inhibition of apoptosis [12] .
The m6A methylation modification process is reversible and involves a variety of enzymes (adenosine methyltransferases, demethylases, and RNA-binding proteins). Knockout of METTL3 gene expression reduces mRNA m6A methylation modification and attenuates cancer cell proliferation, invasion, and migration [13] . The demethylase ALKBH5 is one of the important predictors of overall survival in pancreatic cancer, and studies have found that silencing ALKBH5 can significantly increase the proliferation, migration, and invasion of pancreatic cancer cells in vitro and in vivo, while its overexpression does the opposite [14] . The result of another study reported that the expression level of lncRNAs KCNK15-AS1 and ALKBH5 in pancreatic cancer tissues was significantly lower than those in normal tissues and after overexpression of ALKBH5 in different cell lines, the KCNK15-AS1 expression was subsequently increased, while the epithelial-mesenchymal transition in pancreatic cancer cells was inhibited [15,16] . The specific role of m6A regulators in lncRNAs remains unclear. Therefore, understanding the mechanism of m6Arelated-lncRNA in the development of PAAD may provide new ideas for the prognosis and treatment of pancreatic cancer patients.
In this study, the expression profiles of 14,056 lncRNAs and 23 m6A genes were extracted from the Cancer Genome Atlas (TCGA) dataset. M6A-related lncRNAs were identified using the limma package and BiocManager package in R studio software. A prognostic model was constructed based on m6A-related lncRNAs, which was then used to predict the overall survival of PAAD patients. Next, potential drugs targeting m6Arelated lncRNAs were identified using publicly available drug sensitivity databases. At the same time, the relationship with immunotherapy response was explored. Finally, a nomogram was built to predict survival in PAAD patients.

Data sources
RNA-seq transcriptome data of PAAD patients were obtained from the TCGA (https://cancergenome.nih. gov/) database and ID-transformed transcriptome data. Relevant clinical information was downloaded, and clinical information of 185 patients was extracted. The mutation data were downloaded and organized. Pancreatic cancer patients with no survival and incomplete data were excluded to avoid statistical error in this study.

Construction and validation of prognostic models
The entire TCGA dataset was randomized into training and testing groups. A prognostic model was constructed using the training set, and the established model was validated. Subgroups including low-risk and high-risk groups were also subsequently established based on the median risk score. Combined with the survival information of PAAD https://doi.org/10.36922/td.v1i2.165

Tumor Discovery
Prognostic biomarkers in pancreatic cancer patients in TCGA, we screened the m6A-related lncRNAs involved in model construction from 288 m6A-related lncRNAs in the TCGA dataset (P < 0.05). This study used univariate Cox regression and the Least Absolute Shrinkage and Selection Operator (LASSO). Cox regression was performed using the R package glmnet to find m6Arelated lncRNAs significantly associated with PAAD patient survival in the TCGA dataset. Multivariate Cox regression was used to analyze m6A-related lncRNAs, and finally, a m6A-related lncRNAs risk model was established. The formula for calculating the risk score is as follows: Risk score = m6A-related lncRNAs1 × coef + m6A-related lncRNAs2 × coef +… + m6A-related lncRNAsn×coef Where coef represents the coefficient, which is the coefficient between lncRNAs and survival. Risk curves for high and low risk were constructed using the pheatmap package in R studio software. ROC curves were constructed using the survival, survminer, and timeROC packages in R studio software. Then, model validation was performed on clinical subgroups to find out which clinical subgroups our model was applicable to.

Differential gene identification, functional analysis, and tumor mutational burden
Differentially expressed genes in high-risk and lowrisk groups were identified and Gene Ontology (GO) functional analysis was performed on them. The filtering criteria of high-risk and low-risk differential genes were logFCfilter > 1 and fdrFilter < 0.05. GO functional analysis was performed using the cluster profiler package in the R studio software. The analysis threshold was determined by p-value, with P < 0.05 indicating significant enrichment

Tumor Discovery
Prognostic biomarkers in pancreatic cancer of functional reviews. The ggpubr package and the limma package in R studio software were also used to analyze, whether the tumor mutation burden of the high-risk and low-risk groups was different. The survival package and survminer package in the R studio software were then used to analyze the survival of the high and low tumor mutation burden groups.

Model estimation of tumor immune microenvironment, principal component analysis, and Kaplan-Meier survival analysis
Differential immune function in high-risk and low-risk groups was screened by limma package, GSVA package, and GSEABase package in R studio software. The maftools

Tumor Discovery
Prognostic biomarkers in pancreatic cancer package in the R studio software was used to assess the mutation frequencies of the high-risk and low-risk groups in the model. Principal component analysis (PCA) which is used for efficient dimensionality reduction, model identification, and grouping of high-dimensional data of whole gene expression profiles, m6A genes, m6A-related lncRNAs, and risk models based on gene expression patterns visualization was performed. Kaplan-Meier survival analysis was then used to assess the diversity of survival between high-and low-risk groups. The R packages survminer and survival are the tools used for this research.

Analysis of prognostic models and screening of potential drugs
Multivariate and univariate Cox regression analyses were performed to test whether the prognostic model was an independent variable considering other clinical characteristics of PAAD patients (sex, age, tumor grade, and tumor stage). Analyses of immune evasion and immunotherapy were also performed to find out whether there were differences between high-and low-risk groups when receiving immunotherapy. The half maximal inhibitory concentration (IC50) of compounds obtained from the GDSC website Genomics of Drug Sensitivity in Cancer (https://www.cancerrxgene.org/) in the TCGA project of the PAAD dataset were calculated to obtain potential drugs for clinical use in PAAD treatment. IC50s of compounds obtained from the GDSC website were predicted in PAAD patients using the pRRophetic package in R studio software.

Construction and validation of the nomogram
The predictive power of nomogram and other predictors (age, gender, risk score, TNM stage, T stage, N stage, and M stage) for 1-, 3-, and 5-year survival was set. A calibration curve based on the Nomogram-predicted test was applied to illustrate the agreement between actual and model-predicted results.

Identification of m6A-associated lncRNAs in PAAD patients
The matrix expression of 23 m6A genes and 14,056 lncRNAs was extracted from the TCGA database. Two hundred and

Tumor Discovery
Prognostic biomarkers in pancreatic cancer eighty-eight lncRNAs with a coexpression relationship with m6A were identified. M6A-associated lncRNAs were defined as lncRNAs significantly associated with ≥ 1 of the 23 m6A genes (|Pearson R| > 0.4 and P < 0.001). In Figure 1A, a Sankey diagram of m6A-related lncRNAs is shown. In Figure 1B

Tumor Discovery
Prognostic biomarkers in pancreatic cancer

Construction and validation of risk models based on m6A-related lncRNAs in PAAD patients
The m6A-related prognostic lncRNAs were screened from 288 m6A-related lncRNAs in the TCGA training set using univariate Cox regression analysis. In Figure 2A, 24 m6Arelated lncRNAs in the TCGA dataset were significantly Figure 9. Six potential drugs for further analysis of PAAD patients associated with survival. LASSO-penalized Cox analysis is a common method for multiple regression analysis. The application of this method not only improves the prediction accuracy and interpretability of statistical models but also enables variable selection and regularization to be performed simultaneously, which can effectively identify the most available predictive markers and generate prognostic

Tumor Discovery
Prognostic biomarkers in pancreatic cancer indicators to predict clinical outcomes. The vertical-dashed line illustrates the first level value of logλ with the smallest segmentation error. Therefore, 9 m6A-related lncRNAs were selected for subsequent multivariate analysis. Next, multivariate Cox ratio hazard regression analysis was performed to distinguish autologous prognostic proteins. 5 m6A-related lncRNAs, which were prognostic proteins independently associated with survival in the training set, were used to construct risk models to assess prognostic risk in PAAD patients ( Figure 2B and C). PAAD patients were divided into low-risk and high-risk groups according to the median prognostic risk grade. Figure 3A shows the distribution of risk levels for the entire set; Figure 3B shows survival status and survival time; Figure 3C shows m6Arelated lncRNAs; in Figure 3D, we performed a Kaplan-Meier survival analysis, which showed that the low-risk group survived longer than the high-risk group (P < 0.001).
To test the prognostic power of this established model, the risk score for each patient in the training group and in the test group was calculated using a unified formula.  Figure 5, to verify whether patients with different clinical characteristics were suitable for the model constructed in this study. The training group and test group were, further, divided into low-risk subgroup and high-risk subgroup based on age, sex, and tumor stage. The low-risk subgroup showed significantly higher survival rate than the high-risk subgroup.

Further validation of the prognostic model through principal component analysis
PCA analysis was performed in this study to test whether 23 m6A genes, 5 m6A-related lncRNAs, and model lncRNAs could have different distributions in high-and low-risk groups based on the whole gene expression profile. Figures 6A-C show that the distributions of high-risk and low-risk groups are relatively dispersed, while Figure 6D based on the model we constructed shows that the high-and low-risk groups have different distributions, indicating that the model can distinguish between high-and low-risk groups of patients.

Tumor Discovery
Prognostic biomarkers in pancreatic cancer

Estimation of the tumor immune microenvironment and cancer immunotherapy response by a prognostic model
Immune function analysis was first performed ( Figure 7A) and the differences in immune function between high-and low-risk groups were identified. Next, to explore the underlying molecular mechanisms of the m6A-based model, we performed Gene Ontology (GO) enrichment analysis, revealing the involvement of many immune-related biological processes ( Figure 7B). An analysis of immune escape and immunotherapy was also performed to find out whether there are differences between high-risk and low-risk groups when receiving immunotherapy ( Figure 7C). Mutational data were analyzed and summarized using the maftools package in R studio software. Mutations were stratified according to variant effect predictors. Figure 7D and Figure 7E show the top 20 most frequently altered driver genes between

Tumor Discovery
Prognostic biomarkers in pancreatic cancer high-risk and low-risk subgroups. A differential analysis of tumor mutational burden was then performed and the tumor mutational burden (TMB) was then calculated from TGCA somatic mutation data. The low-risk group had lower TMB than the high-risk group ( Figure 7F). Next, a survival analysis of TMB was performed. Figure 8A shows that the survival of the low TMB group was better than that of the high TMB group, and then combined with the TMB with the patient risk score for survival analysis, Figure 8B shows that patients with low TMB and low-risk score were found to have a higher probability of survival.

Identification of potential drugs for prognostic models
To explore potential drugs for the treatment of PAAD patients with our prognostic model, we used the pRRophetic algorithm to estimate treatment response based on the halfmaximal inhibitory concentration (IC50) of each drug in the Genomics of Cancer Drug Sensitivity (GDSC) database. We screened for six drugs with significantly different estimated IC50s between the high-and low-risk groups, and the lowrisk group was found to be more sensitive to most of the potential drugs. Figure 9 shows six potential drugs that can be used for further analysis of PAAD patients.

Independent prognostic analysis of prognostic models and assessment of clinical features of PAAD
We performed univariate and multivariate Cox regression analyses to assess whether risk models for m6A-related LncRNAs had independent prognostic features of PAAD. Based on Figure 10A, first in the univariate Cox regression analysis, the HRs for the risk score and 95% confidence interval (CI) were 1.181 and 1.097−1.271, respectively (p < 0.001). Based on Figure 10B, HR was 1.162 in multivariate Cox regression analysis, 95% CI was 1.074−1.257 (P < 0.001). A concordance index analysis of the risk score was then performed and it was found that the concordance index of the risk score was consistently greater than other clinical factors over time, suggesting that the risk class could better predict the prognosis of PAAD patients ( Figure 10C). Thereafter, the area under the ROC curve (AUC) analysis of risk grades was performed ( Figure 10D and E), and the AUCs of risk score grades were also shown to be higher than those of other clinical features, indicating that the prognostic risk model constructed in this study was relatively reliable.

Construction and evaluation of prognostic nomograms
Nomograms including risk classes and clinical characteristics to predict 1-, 3-, and 5-year survival in PAAD patients were constructed ( Figure 11A). Based on the correlation plots, the observed and predicted rates of survival in PAAD patients at 1, 3, and 5 years showed good agreement ( Figure 11B).

Discussion
Pancreatic cancer is the main cause of cancer-related death worldwide and has been a serious threat to human life and health due to its insidious onset, strong invasiveness, poor prognosis, and high mortality rate [2,18,19] . Through further research, it has been found that the disorder of m6A methylation modification regulation may affect the processing, degradation, and translation of mRNA, resulting in the activation of oncogenes and the inactivation of tumor suppressor genes, and the occurrence, development, and drug resistance of malignant tumors. The occurrence of m6A is closely related, and m6A changes play a crucial role in carcinogenesis and tumor progression [8] .
M6A plays a post-transcriptional modification role in eukaryotic mRNAs and lncRNAs, such as in regulating mRNA transcription, splicing and translation, as well as affecting the structure and function of lncRNAs with extensive regulatory roles [11] . M6A regulators can modify specific lncRNAs, and lncRNAs can maintain malignancy in various tumors through transcriptional, epigenetic, and post-transcriptional levels [10,20] . The role of m6A-regulated lncRNAs may be critical for the proliferation and migration of cancer cells [11] . Studies have reported that m6A methylation modification of lncRNA can affect the occurrence and development of tumors, and m6A modification can also affect the formation of RNA-DNA triple helix, in which one lncRNA binds to this series through the Hoogsteen base pair in the main groove of double-stranded DNA. In addition, m6A may also affect the reciprocal site between lncRNA and specific DNA [21,22] . Both m6A and lncRNA are important regulators of PAAD occurrence. However, studies on their roles and biological mechanisms in PAAD progression are still relatively lacking [13,17] . In this study, an independent prognostic model based on m6A-related lncRNA was constructed, inspired by the functions of m6A and lncRNA in PAAD.
In this work, 14056 m6A-associated lncRNAs were identified from the TCGA dataset to explore the prognostic functions of m6A-associated lncRNAs. After confirming the prognostic value of m6A-related lncRNAs in the TCGA dataset, five of them were selected to construct m6A-related lncRNA prognostic models to predict the survival of PAAD patients. Model validation for clinical grouping was also performed, the risk scores for each patient in the training group and across the entire set were calculated, and principal component analysis was performed to validate the prognostic model, all of which https://doi.org/10.36922/td.v1i2.165

Tumor Discovery
Prognostic biomarkers in pancreatic cancer demonstrated the accuracy of the prognostic model. PAAD patients were divided into a low-risk group and a high-risk group according to the median prognostic risk level. It was found that the high-risk group had poorer survival than the low-risk group. The model validation results of clinical grouping showed that the model we constructed was more suitable for patients who are 65 years old and below, female patients, and PAAD patients with tumor stage I and II. Multivariate Cox regression analysis showed that the m6A-related lncRNAs prognostic model was an own risk factor for survival. ROC analysis showed that the model outperformed traditional clinical features in predicting survival in PAAD patients. In addition, a nomogram was constructed showing the agreement between the 1-, 3-, and 5-year prognostic model prediction rates for PAAD patients. In terms of the accuracy of the prognostic model based on m6A-related lncRNAs in predicting patient survival, the prediction model can provide a certain basis for subsequent research to identify new biomarkers.
TMB is the total number of somatically encoded mutations associated with the emergence of neoantigens that trigger antitumor immunity [23,24] . Studies have reported that patients with low-risk endometrial cancer have higher TMB and are more sensitive to chemotherapy than patients with high-risk scores [25] . Here, we found that TMB in the low-risk group was lower than that in the high-risk group and then performed a survival analysis of TMB, finding that the low -TMB group had better survival than the high TMB group. Risk scores were used for survival analysis and found that patients with low TMB and low-risk scores had better survival. However, our prognostic model showed that there was no significant difference between high-and low-risk groups when receiving immunotherapy in the analysis of immune escape and immunotherapy, which is probably due to limited samples. More samples can be used in future studies. The tumor microenvironment can regulate the biological properties of tumor cells such as chemotherapy resistance through metabolism and other means. Six drugs with significantly different estimated IC50s were screened out between the high-and low-risk groups. The low-risk group was found to be more sensitive to most potential drugs, the discovery of which may provide new insights into the subsequent treatment of patients with PAAD ideas. Pathological stage is a decisive factor for the diagnosis and prognosis of PAAD [26] . The current staging is not precise in providing reliable predictions and reflecting the heterogeneity of PAAD. Therefore, it is critical to explore new potential predictive markers and immunotherapeutic agents. The m6A-related lncRNA prognostic model established in this paper provides a new idea for predicting the survival of PAAD patients. However, there are some shortcomings and limitations in this study, the biological mechanisms of m6A-related lncRNAs have not been fully elucidated. In the future, the accuracy of this model will be verified with more experiments to explore the role of lncRNA and its interaction with m6A.

Conclusion
Understanding the mechanism of m6A-related lncRNA in the development of PAAD may provide new ideas for the prognosis and treatment of pancreatic cancer patients. Our study provides new clues and ways for survival prediction in PAAD patients and may help to elucidate the process and mechanism of regulation between m6A and lncRNA.