Identification of m6A‐related lncRNAs as prognostic signature within colon tumor immune microenvironment

Abstract Background The current study probed prognosis‐related potential for m6A‐related lncRNAs signatures within colon tumor immune microenvironment (TIM). Methods After downloading transcriptomic datasets for colon cancer (CC) patients from The Cancer Genome Atlas (TCGA), they were divided, in a 1:1 ratio, within training or test datasets. m6A‐related lncRNAs were then scrutinized across such dataset using Pearson correlation assessment before generating a m6A‐related lncRNAs prognosis‐related model using the training dataset. The latter was then validated with the test and the whole dataset. In addition, we compared the differences of TIM and the estimated IC50 of drug response between the high‐ and low‐risk groups. Results Overall survival (OS) resulted as linked with 11 m6A‐related lncRNAs, while within the developed prognosis‐related model, areas‐under‐curves were as follows: within training dataset, values at 3‐, 4‐, and 5‐years were 0.777, 0.819, and 0.805, accordingly, and for test one, they were 0.697, 0.682, and 0.706, respectively. Finally, the values for the whole dataset were 0.675 (3‐year), 0.682 (4‐years), and 0.679 (5‐years), accordingly. Moreover, CC cases categorized within low‐risk cohort demonstrated enhanced OS (p < .0001), lower metastasis (p = 2e‐06) and lower T stage (p = .0067), more instability for microsatellite status (p = .012), and downregulation for PD‐L1, PD‐1, CTLA‐4, LAG3, and HAVCR2 (p < .05). In addition, risk scorings were significantly linked to the degree of infiltrative intensity for CD8 and CD4 (memory resting) T‐cells, T‐regulatory (Tregs), and Mast cells triggering (p < .05). Patients with low infiltrative propensity for CD4 T‐cells also had better OS (p = .016). Moreover, six representative drugs were found to be sensitive for treating CC patients. Conclusion A robust m6A‐related prognostic model with great performances was developed before exploring the TIM characteristics and its potential therapeutic drugs, which might improve the prognosis and therapeutic efficacy.


| INTRODUCTION
In 2022, the new cases and mortality of colorectal cancer were both ranked in the top three among all cancers. 1 So far, great progress has been made in terms of its diagnosis and treatment, 2,3 with patients who have been diagnosed at an early stage now having significantly improved prognosis. 4 However, more efficient biomarkers still need to be identified, especially for predicting prognosis which remains poor in patients with tumor metastasizes. 5,6 In this context, a previous study noted that the tumor immune microenvironment (TIM) was essential for predicting prognosis as well as for evaluating therapeutic efficacy. 7 For instance, immune checkpoint inhibitors, such as cytotoxic T lymphocyte antigen 4 (CTLA4) together with programmed cell death 1 (PD-1) inhibitors, have been identified and used as antitumor agents for immunotherapy. Similarly, lymphocyte activation gene-3 (LAG3) has been considered a target of immunotherapy as its upregulation can exhaust immune cell proliferation and cytokine production. 8 Finally, HAVCR2 can encode T-cell immunoglobulin and mucin domain 3 (Tim-3), a checkpoint receptor that plays vital roles within tumor microenvironment. 9 N6-methyladenosine (m6A) modifications are important for a number of biological functions, including mRNA handling, together with exporting, translating as well as biogenesis for long-noncoding RNA (lncRNA) and microRNA that contribute to human carcinogenesis. 10 Indeed, mutations or aberrant expression of lncRNAs (ncRNAs that are >200 nt long) may exhibit oncogenic functions. 11 The m6A regulators can be divided into three types: writers, erasers and readers. Among them, "writer" is a methyltransferase complex that catalyzes m6A, "eraser" is a demethylase that removes m6A, and "reader" reads the protein to recognize m6A, and combines RNA to play roles in tumorigenesis and tumor progression. 12 Using the Cancer Genome Atlas (TCGA) database, the current study applied bioinformatics as well as statistical analyses to identify and validate the suitability for m6A-related lncRNAs to predict prognoses. A prognostic model for colon cancer (CC) was also established before assessing how it could be linked to patients' clinicopathological features. In addition, the TIM was characterized in terms of the type of tumor-infiltrating cells, the tumor mutational burden (TMB) together with CLTA4/LAG3/ PD-L1/HAVCR2/ PD-1 expression profiles. Finally, high-risk and low-risk groups were comparatively analyzed for determining how they differed in terms of potential drug resistance. These findings may help to identify novel therapeutic targets that could be effectively exploited against CC.

| Determining m6A-related lncRNAs
The limma package in R compared normal group and tumor group and retrieved differentially expressed lncRNAs (DELs), 15 having adjusted p-value < 05 together with jlog2 fold change (FC)j > 0.585, selected as thresholds for this purpose. In addition, differentially expressed m6Arelated genes were obtained using unpaired t tests ( p-value < .05).
Finally, to determine how the DELs were correlated to the m6Arelated genes, Pearson correlation assessments were conducted, whereby in this case, m6A-related lncRNAs were defined at a p-value of < .001 and a jPearson correlationj of > 0.5.

| Generation and validating for m6A-related lncRNAs prognosis-related model in CC
Expression data of 432 CC patients in TCGA-COAD cohort, along with their corresponding clinical information, were obtained and defined as the whole dataset. The latter was then randomly divided in a 1:1 ratio and assigned to either a training or a test dataset, with detailed information on each provided in Table 1. Using the training dataset, a prognostic risk model was then generated before subsequently evaluating it using the test data. Univariate Cox proportional hazards regression assessment was also conducted to identify m6Arelated lncRNAs being closely related to prognoses using the R package survival and survminer ( p < .05), with overfitting avoided through least absolute shrinkage and selection operator (LASSO) evaluation using R package glmnet. Furthermore, independent prognostic factors and their coefficients were denoted through multivariate Cox regression assessment before calculating individual case risk scoring, using: Risk scoring ¼ X Cox coefficient of lnRNA χi Â Scaled expression value of lncRNA χi: SurvivalROC in R was used for plotting receiver-operatingcharacteristic (ROC) curves together with quantifying area-undercurve (AUC) for evaluating risk scoring model's sensitivity as well as specificity. In this case, the ROC curve for which true positive and false positive showed most significant differences was selected, with the curve's turning point selected as the optimal cutoff value based on which cases were categorized into a low-risk or a high-risk group.
T A B L E 1 Detailed information of the training dataset, test dataset, and the whole dataset.

| Exploring immunotherapy-related characteristics
By using the maftools package in R on the whole dataset, the mutation profiles of the two groups could be visualized, 17 with differences in terms of TMB compared using unpaired t test. Moreover, the Wilcoxon test allowed transcriptomic expression profiles for immune checkpoints together with linked ligands across high-together with low-risk groups for comparative analyses.

| Prediction and comparison of drug sensitivity within two CC groups
IC50 for cancer drug response related to CC treatment was predicted in each sample using oncoPredict R package, 18 before selecting those drugs that had an average IC50 of less than 5. Subsequently, differences within response to these drugs within high-risk together with low-risk groups were comparatively analyzed using constructed prognostic risk score model.    Table S1). After Pearson correlation analysis, 414 m6A-related DELs (114 upregulated and 300 downregulated) were obtained, with detailed information provided in Table S2.
Moreover, both groups were compared with the Wilcoxon test to determine their expression levels for immune checkpoints together with linked ligands. In this case, dataset outcomes demonstrated that high-risk scoring cases had upregulated PD-L1 (P = 5.8e À9 ), PD-1 (P = 2.2e À6 ), CTLA-4 (P = 3.9e À8 ), HAVCR2 (P = 1.9e À7 ), and LAG3 (P = 1.6e À7 ) ( Figure 6B-F). 3.5 | The drug sensitivity in high-risk and low-risk groups depending upon generated prognostic model IC50 levels for 198 chemotherapy drugs or inhibitors were predicted in all cases within the TCGA-COAD cohort, and of these, 43 drugs were found to be sensitive (Table S3). In addition, the two groups of T A B L E 2 The m6a-related lncRNAs with significant prognostic value in colon cancer identified by the univariate Cox regression analysis.   Figure 3J), which may indicate that the high instability of microsatellite status inhibits the expression of lncRNAs. In particular, unlike 10 of these which were found to be risk factors (HR < 1) (i.e., higher expression leads to worse OS), AL139383.1 was protective (HR > 1). An intronic variant of AL139383.1 was also show its link to OS in CC cases, 21 while a previous study reported that AL513123.1 was overexpressed and, as such, was of potential prognostic value in breast cancer. 22 Similarly, MIR4713HG was experimentally shown to be significantly upregulated in oral tongue squamous cell carcinoma (OTSCC) tissues, thereby making it a potential indicator of prognosis for OTSCC patients, although MIR4713HG could act as a risk factor that accelerates cell proliferation and metastasis. 23 Previous studies suggested that AC007128.1 could also be considered of prognostic value for various cancer patients, including those with esophageal cancer, 24 esophageal squamous cell carcinoma, 25 and bladder cancer. 26 However, research is still lacking regarding the link between these lncRNAs and the prognosis of colon cancer, and as such, cohort was used as additional external datasets of the target lncRNAs from other public databases could not be acquired for validating the robustness of the model further. Furthermore, the molecular functions of these lncRNAs as well as the mechanism through which they influence colon cancer's prognosis need to be further investigated.
Overall, a robust m6A-related prognostic model was established to explore TIM characteristics and drug resistance potential. These findings could help for prognosis prediction in CC cases, including the development of potential treatments.

ACKNOWLEDGMENTS
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CONFLICT OF INTEREST STATEMENT
Authors Hezi Zhang, Ying Ba, and Shenrui Zhang were employed by Shenzhen Nucleus Gene Technology Co., Ltd. Lichao Cao declared that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

ETHICS STATEMENT
Not applicable.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in the UCSC Xena (https://xenabrowser.net/datapages/).