Identification of cuproptosis‐related subtypes and tumor microenvironment characteristics in colon cancer

Cuproptosis is a form of regulated cell death, which is characterized by the lethal accumulation of excessive copper in cells. However, its role in colon adenocarcinoma (COAD) remains elusive. Our study aimed to decipher the biological function of cuproptosis‐related genes (CRGs) in patients with COAD. The expression data were obtained from The Cancer Genome Atlas, while the Gene Expression Omnibus database was used to verify the results. A total of eight differentially expressed CRGs were selected from patients with COAD. Subsequently, patients were stratified into three subtypes with distinct clinical and biological features. In view of the huge differences in the prognosis between subtypes A and C, we selected them for further study, including the variations in clinical progressions, oncogenic pathways, and immune cell infiltration. For the sake of better evaluation, we also established a cuproptosis index (CI) to quantify the heterogeneity of CRGs expression. Enrichment analysis showed that the high‐CI group was enriched in immune activation pathways. Meanwhile, the immunosuppressive cell infiltration and immune checkpoints were elevated in the high‐CI group. The CI we constructed had its predicting function in different clinical groups. Besides, we noticed that CRGs, especially CDKN2A, were closely related to the clinical progression in patients with COAD and immune cell infiltration in tumors. Moreover, CDKN2A could become a potential novel target for immunotherapy in the setting of COAD.

study disclosed copper as a novel metal ion, critical for tumor biology. 8,9 Copper is the basis of essential biological processes, such as mitochondrial respiration, iron absorption, antioxidation/detoxification, and is an essential mineral nutrient for all living organisms. 10 Cuproptosis, as a novel cell death type that copper ions induce, is quitely uncharted in known death type, including necroptosis, apoptosis, pyroptosis, and ferroptosis. It requires the participation of mitochondrial respiration, and the adenosine triphosphate(ATP) produced by glycolysis has little effect on cuproptosis. Copper is not directly involved in the electron transport chain, but only in the tricarboxylic acid cycle. 8 The study demonstrates that excessive intracellular copper accumulation causes mitochondrial stress, specifically aggregation of mitochondrial lipoylated proteins and imbalance of ironsulfur cluster proteins, ultimately leading to copper death. Meanwhile, several studies have reported that the level of copper is higher in various malignancies compared to normal tissues, indicating its potential role in tumor initiation and progression. 11,12 Moreover, the prognostic value of CRGs has been explored in multiple studies. In skin cutaneous melanoma, the expression of LIPT1 was significantly correlated not only with patient prognosis but with immune cell infiltration. 13 In gliomas, a cuproptosis activation scoring model was constructed and had predictive value in tumor immunity, which had guiding significance for clinical treatment. 14 The aim of this study was to describe the genetic status of COAD associated with cuproptosis, and to explore the impact of cuproptosis on the prognosis of patients with COAD. We selected cuproptosisrelated genes (CRGs) from recent studies. Based on clustering of associated genes, we defined three subtypes, each corresponding to a distinct clinical and molecular signature. It provided a theoretical basis for investigating new therapies with improved selectivity, reduced side effects, and lower cellular drug resistance.

| Identification of differentially expressed CRGs
In the pre-processing of raw data, we utilized the Limma R package 23 to correct the data and process the duplicates. Wilcoxon test was used to analyze the contrast between CRG expression profiles in tumor specimens and normal tissues. Notably, a false discovery rate (FDR) <0.05 and an absolute value of log2 fold-change (L2FC) >1.5 were set as criteria when screening.

| Survival analysis
The "surv_cutpoint" command was utilized to identify the optimal cut-off value of each differentially expressed gene, to stratify patients into the high-expression group and low-expression group. The differences in overall survival (OS) time were evaluated with the log-rank test through the Kaplan-Meier (KM) method. R packages including KMsurv, 24 survival, 25 and survminer 26 were applied, and p < .05 was considered as statistically significant.

| Cluster analysis
Based on the expression of eight CRGs, we applied an unsupervised clustering algorithm to cluster the samples. To select the optimal clustering, we used the package Consensus Cluster Plus 27 to guarantee the stability of the clustering. The point with the most variable genetic value was considered as the optimal number of immune subtypes. To evaluate the reliability of clustering results, the KM method was applied to calculate the OS of cases in each subgroup, and p < .05 was considered as statistically significant. Clinical characteristics including T-category, lymph node metastasis, distant metastasis, histological stage, and age were displayed by R package heatmaps, where patients with unknown messages were deleted.

| Identification of the immune characteristics of clusters
To further explore the role of CRGs in biological pathways, we extracted clusters A and C on the basis of their great differences in clinical outcomes. Then, we performed gene set variant analysis (GSVA) utilizing the R package GSVA 28 and "c2.cp.kegg.v7.4symds" downloaded from the molecular signature database (MSigDB). To disclose the oncogenic role of target genes, Gene Ontology (GO) and enhanced Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were used in R package ClusterProfiler. 29 Singlesample Gene Set Enrichment Analysis (ssGSEA) was also performed to calculate the relative abundance of immune cells with the R package GSVA, based on the gene sets from the research of Charoentong et al. 30 A p-value <.05 and a FDR < 0.05 were defined as significant.

| Establishment of the cuproptosis index
To measure changes in CRGs in COAD, we established a scoring framework, the cuproptosis index (CI), to evaluate CRG adjustment in patients with COAD. The strategies developed by CI mainly include the following points: the differently expressed CRGs recognized from different clusters were firstly normalized among the Asian Cancer Research Group (ACRG) test, and the overlapping genes were extracted. Then, we performed a prognostic analysis of each gene in the signal, using a univariate Cox regression model. The genes which had a critical prognosis needed further investigation. Subsequently, the cuproptosis-related subtype signature was constructed, using principal component analysis (PCA). Both central components 1 and 2 were selected as signature scores. The advantage of this strategy is the well-connected (or anticorrelation) quality on the set with the largest connection block, which is the contribution from genes that do not match other groups when weighted down. At this time, we adopted a strategy similar to the gene expression grade index (GGI): where i represents the expression of CRGs.

| Correlation analyses between CI and other related biological processes
According to the correlation between CI and patient survival rate, the R package survival was used to determine the optimal cut-off value of each data set subgroup. The function surv-cutpoint was used to test all potential cut-off points several times to find the maximum grade measurement, so as to divide patients into a high-CI group and a low-CI group based on the selected maximum log rank. Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) analysis was firstly conducted to evaluate the immune scenario in the tumor microenvironment with R package estimate, 31 including the ESTIMATE, immune, and stromal scores. We also performed a correlation analysis of CI, to further reveal the relationship between CI and immune cells. In addition, we explored the correlation of CI with immune cell infiltration and common immune checkpoints (ICPs). The KM method was used to analyze the survival correlation and clinical grouping information between high-CI group and low-CI group. All statistical p values were two-sided, and p < .05 was considered as statistically significant.

| Verification of expression and promoter methylation levels
The expression level of CRGs was validated through the Human Pro- 2.9 | Prediction of drug response R package pRRophetic 34 was utilized to predict the chemotherapeutic drug response level via the concentration causing 50% inhibition (IC50) of compared groups, using the algorithm developed by Geeleher et al. 35 Lower IC50 values commonly stand for better therapeutic effect.

| Statistical analysis
In this study, R 4.1.1 was used for data processing and analysis.
p < .05 was considered as statistically significant. Independent t-tests were used to compare continuous variables that conformed to the normal distribution, while Mann-Whitney U tests were for these continuous variables with skewed distribution. The correlation between genes and OS was analyzed by KM curves using the log-rank test.

| Establishment and characteristic recognition of cuproptosis subtypes
We used the Consensus Cluster Plus R package to classify patients according to the expression levels of the eight CRGs, which finally distinguished three distinct clusters ( Figure 3A). OS analysis was performed based on these three clusters ( Figure 3B). The results showed significant statistical differences between different taxa, with group A having the most significant survival advantage. In addition, the associ-  Figure 3D). We further analyzed the genes by GO annotation. The results in Figure 3E revealed that the genes were mainly involved in RNA splicing, the ncRNA metabolic process, ribonucleoprotein complex biogenesis, the proteasomal protein catabolic process, etc. Figure 3F shows that genes were significantly abundant in herpes simplex virus 1 infection.
Additionally, we also analyzed the expression levels of immune cells and ICP programmed cell death 1 (PDCD1) in the two subtypes and found they were different ( Figure S1). Compared with cluster C, the expression level of activated CD4 + T cells was higher in cluster A (p = .011), while myeloid-derived suppressor cells (MDSC) (p = .0082), macrophage ( p = .022), and PDCD1 ( p = .0003) expression was lower.

| Construction of the CI and corresponding clinical prognostic characteristics
Due to the heterogeneity and complexity of CRG modification, we constructed the CI to evaluate the CRG modification pattern of patients. Figure 4A showed that this index was negatively correlated  shown that ICPs play an important role in tumor immunity. 36,37 Therefore, we explored the expression of ICPs in both groups ( Figure 4G).
Compared with the low-CI array, the expression of 12 ICP genes in the high-CI array was significantly higher, while the expression of 1 ICP gene was significantly lower. These results suggested that the immune surveillance function of T cells in patients with high-CI array was inhibited, which promoted the growth of tumors.
To explore whether this index is applicable to patients of different clinical groups, we used KM curves to analyze the prognostic differences between high-and low-CI groups in different clinical groups.

| Identification of the clinical features of CDKN2A
We also explored the correlation between CDKN2A expression status and clinical features. As shown in Figure 6A, high-CDKN2A expression group was significantly associated with advanced grade ( p = .039) and lymph node metastasis ( p = .00025), but not with T stage (p = .15) or distant metastasis ( p = .09). As illustrated by Figure 6B, the rise in CDKN2A expression was positively correlated with the advancement of grade in patients with COAD (p = .007). In addition, CDKN2A expression was significantly correlated with lymph node metastasis (p < .001), but not with T stage (p = .053) and distant metastasis ( p = .095). CDKN2A showed a consistent increase in gene expression and disease severity in all clinical features. Furthermore, we also verified this conclusion using the GEO database ( Figure 6C), and all data sets (GSE113513, GSE24551, and GSE41258) consistently showed that the expression levels of CDKN2A between normal and COAD tissues were significantly different ( p < .001). Meanwhile, the receiver operating characteristic (ROC) curve showed that the expression of CDKN2A had good sensitivity and specificity for predicting the prognosis of patients with COAD ( Figure 6D).

| The immunomodulatory effect and sensitivity of anti-tumor drugs for CDKN2A
The abundance of infiltrated immunosuppressive cells, including Tregs and MDSCs, was higher in the high-CDKN2A group ( Figure 6E). Consistently, the PDCD1 was higher in the high-CDKN2A expression group ( Figure 6F). Moreover, the sensitivity of anti-tumor drugs (IC50) was evaluated, indicating that the high-CDKN2A expression group was significantly related to higher sensitivity (lower IC50 value) to sunitinib ( p < .001) ( Figure 6G). The GSEA revealed that pathways, including reactome cell cycle mitotic and reactome signaling by Rho-GTPases, were enriched in the high-CDKN2A expression group ( Figure 6H). study by Tsvetkov and his colleagues showed that intracellular copper can induce a new form of regulated cell death, which is different from oxidative stress-related cell death and is called cuproptosis. 8 Therefore, we researched the expression of CRGs in COAD and established the CI to further predict the prognosis of CRGs. In addition, we analyzed CDKN2A, which may be a promising immune target.

| DISCUSSION
Scholars found that copper homeostasis disorder plays an important role in cancer. 38,39 Particularly in patients with breast, 40,41 ovarian, 42 lung, 43 prostate, 44 gastric, 45 52 Similarly, in our study, higher expression of CDKN2A is associated with a poorer tumor immune microenvironment, and may contribute to tumor progression as a result.
The value of CRGs in patients with COAD was further corroborated in related articles. In Wu's study, 53  Similarly in Xu's study, 54 they screened out the modeled genes from clustering results which were based on CRGs, and a total of 11 genes were finally generated. They used Lasso regression to construct the model and the biological and immunological characteristics between two risk groups were revealed. Our study was clinically based and focused on the prognosis of the gene for the patient in detailed clinical status, where we utilized a completely different modeling approach, which proved to have excellent prognostic value. Meanwhile, we have studied CDKN2A, a single gene, in more depth, paving the way for subsequent basic scientific research.
Admittedly, our research has its limitations. In vitro experiments are needed to validate the results. Studies on human tissue samples are indispensable. The generalizability of the CI index still needs to be evaluated and optimized with more data.

| CONCLUSION
Our study showed that CRGs played an important role in the development and prediction of COAD. In addition, we established the CI, which could be used to evaluate clinical, prognostic, and immune patterns. Furthermore, CDKN2A was correlated with the expression of ICPs, and might become a potential target for immunotherapy.

AUTHOR CONTRIBUTIONS
Zheng Liu designed this work. Hao Han performed the bioinformatics analysis and wrote the manuscript. Ye Jin and Haihao Yan performed the data review. All authors have read and approved the published version of the manuscript.