High expression of COMMD7 is an adverse prognostic factor in acute myeloid leukemia

Acute myeloid leukemia (AML) is a frequent malignancy in adults worldwide, and identifying better biomarkers remains a current challenge. COMMD7 has been reported to be associated with tumor progression in various human solid cancers but has rarely been studied in AML. Herein, RNA sequencing data from TCGA and GTEx were obtained to analyze COMMD7 expression and differentially expressed genes (DEGs). Functional enrichment analysis of COMMD7-related DEGs was performed using GO/KEGG, GSEA, immune cell infiltration analysis, and a protein-protein interaction (PPI) network. In addition, the clinical significance of COMMD7 in AML was evaluated by Kaplan-Meier analysis, Cox regression, and a prognostic nomogram model. All analyses were carried out with R packages. COMMD7 was highly expressed in various malignancies, including AML, compared with normal samples. Moreover, high expression of COMMD7 was associated with poor prognosis in 151 AML samples, as well as in subgroups including age >60, NPM1 mutation-positive, FLT3 mutation-negative, and DNMT3A mutation-negative (P < 0.05). High COMMD7 expression was an independent prognostic factor in Cox regression analysis; age and cytogenetic risk were also included in the nomogram prognostic model. Furthermore, a total of 529 DEGs were identified between the high- and low-expression groups, of which 92 genes were up-regulated and 437 genes were down-regulated. Collectively, high expression of COMMD7 is a potential biomarker for adverse outcomes in AML. The DEGs and pathways identified in this study provide a preliminary view of the molecular mechanisms underlying AML carcinogenesis and progression.


INTRODUCTION
Acute myeloid leukemia (AML) is an aggressive malignancy characterized by high heterogeneity, variable prognosis, and high mortality. Risk stratification and treatment selection currently rest mainly on cytogenetic and molecular abnormalities [1,2]. However, the underlying molecular mechanisms have not yet been fully elucidated. The development of various targeted agents has facilitated individualized treatment for AML patients, thereby improving complete remission (CR) rates and prolonging survival. Regrettably, existing targeted drugs, given as monotherapy or in combination with traditional chemotherapy, have not yet achieved the desired efficacy [3]. Thus, identifying novel biomarkers may contribute to a better understanding of the molecular basis of AML and may play an essential role in AML diagnosis, prognostic stratification, monitoring of residual leukemia, prediction of treatment response, and targeted drug development.
COMM domain-containing protein 7 (COMMD7) is a member of the COMMD family, which is defined by the presence of a conserved and unique motif termed the copper metabolism gene MURR1 (COMM) domain. The COMMD7 gene is located on chromosome 20q11.21 and has been reported to be associated with tumor progression in human solid cancers [4]. COMMD7 is overexpressed in pancreatic ductal adenocarcinoma (PDAC) cells and is associated with poor prognosis in PDAC patients. Inhibition of the COMMD7 gene in human PDAC cell lines induces antitumor effects under stress conditions, mediated in part by the ERK1/2-mediated Cyclin D1, Bcl-2, Bax, and MMP-2 signaling pathways [5]. Another study revealed that COMMD7-overexpressing hepatocellular carcinoma (HCC) cells promoted the proliferation of naïve HCC cells [6]. On the other hand, COMMD7 was identified as a novel NEMO-interacting protein involved in the termination of NF-κB signaling [7], the dysregulation of which is known to be associated with a variety of tumors [5,8]. One study demonstrated that COMMD7 played a dual regulatory role in the NF-κB signaling pathway in HCC [9]. However, to date, the expression of COMMD7 in AML and its prognostic value remain unclear. Therefore, in this study, we aimed to ascertain the relationship between the expression level of COMMD7 and the prognosis of AML in three steps. First, RNA sequencing (RNA-seq) data of AML samples from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) were acquired to analyze the expression of the core gene COMMD7. Subsequently, functional enrichment analysis of COMMD7 was performed via GO, KEGG, GSEA, immune cell infiltration analysis, and a protein-protein interaction (PPI) network. Finally, the clinical significance of COMMD7 in AML was analyzed by Kaplan-Meier analysis, Cox regression, and a nomogram prognostic model.
In this way, significantly altered genes and pathways would be screened out through gene enrichment analysis and subpathway enrichment analysis, the connection of which with COMMD7 may play pivotal roles in the occurrence of AML.

RNA-sequencing data and bioinformatics analysis
The pan-cancer RNA-seq data of TCGA and GTEx, processed uniformly by the Toil pipeline, were downloaded from UCSC XENA (https://xenabrowser.net/datapages/) [10][11][12][13]. Level 3 HTSeq-FPKM and HTSeq-Count data of the AML samples were obtained from the TCGA website (https://portal.gdc.cancer.gov/repository) for further analysis. This study was in full compliance with the published guidelines of TCGA and GTEx.

Differentially expressed gene (DEG) analysis
The DESeq2 R package was used to compare expression data (HTSeq-Count) between the low- and high-COMMD7 expression groups (cut-off value of 50%) in AML samples to identify DEGs [14]. The top 10 DEGs were displayed in a heat map.
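
The grouping step described above can be illustrated with a minimal Python sketch. It is not the DESeq2 procedure (which fits negative binomial models in R); it only shows how samples might be split at the 50% cut-off and how a log2 fold change for a single gene could be computed. The sample IDs, expression values, function names, and the pseudocount are hypothetical assumptions made for illustration.

```python
import math
from statistics import mean, median

def split_by_median(expression):
    """Split sample IDs into (high, low) groups at the median expression."""
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low

def log2_fold_change(counts, high, low, pseudocount=1.0):
    """log2 ratio of mean counts (high vs. low group) for one gene.

    The pseudocount is an illustrative assumption to avoid log(0);
    DESeq2 uses a different, model-based estimate.
    """
    mean_high = mean(counts[s] for s in high)
    mean_low = mean(counts[s] for s in low)
    return math.log2((mean_high + pseudocount) / (mean_low + pseudocount))

# Hypothetical COMMD7 expression and counts of one other gene per sample.
commd7 = {"s1": 2.0, "s2": 8.0, "s3": 3.0, "s4": 9.0}
gene_x = {"s1": 10, "s2": 70, "s3": 20, "s4": 50}

high, low = split_by_median(commd7)
lfc = log2_fold_change(gene_x, high, low)
print(high, low, round(lfc, 3))
```

In a real analysis the fold change and its adjusted P-value would come from the DESeq2 model rather than from a ratio of group means.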

Functional enrichment analysis
DEGs meeting the thresholds |logFC| > 1.5 and padj < 0.05 were used for functional enrichment analysis. Gene Ontology (GO) functional analysis, comprising cellular component (CC), molecular function (MF), and biological process (BP), as well as Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, was implemented using the clusterProfiler package in R [15].
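
As a rough illustration of the two steps above, the following Python sketch first filters genes by the stated thresholds and then computes an over-representation P-value with the hypergeometric distribution, the statistic commonly used for GO/KEGG over-representation tests. The gene names, statistics, and universe sizes are hypothetical, and the sketch is not the clusterProfiler implementation.

```python
from math import comb

def filter_degs(results, lfc_cutoff=1.5, padj_cutoff=0.05):
    """Keep genes with |logFC| > lfc_cutoff and padj < padj_cutoff."""
    return [gene for gene, (lfc, padj) in results.items()
            if abs(lfc) > lfc_cutoff and padj < padj_cutoff]

def hypergeom_pvalue(n_universe, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_universe genes, n_term of which are annotated to the term."""
    total = comb(n_universe, n_degs)
    upper = min(n_term, n_degs)
    tail = sum(comb(n_term, k) * comb(n_universe - n_term, n_degs - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Hypothetical per-gene results: gene -> (logFC, adjusted P-value).
results = {"A": (2.0, 0.01), "B": (-1.8, 0.2),
           "C": (-3.1, 0.001), "D": (0.4, 0.0001)}
degs = filter_degs(results)

# Hypothetical term: 5 of 20 background genes annotated, 2 of 4 DEGs hit it.
p = hypergeom_pvalue(n_universe=20, n_term=5, n_degs=4, n_overlap=2)
print(degs, round(p, 4))
```

In practice the P-values for all tested terms would then be adjusted for multiple testing before applying a significance threshold.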

Gene set enrichment analysis (GSEA)
The R package clusterProfiler (3.14.3) was used for GSEA to elucidate the functional and pathway differences between the high- and low-expression groups of COMMD7 [15]. The gene set was permuted 1,000 times for each analysis. An adjusted P-value < 0.05 and FDR q-value < 0.25 were considered statistically significant.
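
The idea behind the enrichment score and the permutation test can be sketched in Python as follows. This is a simplified, unweighted (Kolmogorov-Smirnov-like) running-sum statistic with gene-label permutation, assumed here for illustration only; it is not the weighted statistic or the exact permutation scheme used by clusterProfiler. Gene names and the seed are hypothetical.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap part, not all, of the list")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Share of shuffled rankings with |ES| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    genes = list(ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(genes)
        if abs(enrichment_score(genes, gene_set)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical ranking (most up-regulated first) and gene set.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es = enrichment_score(ranked, {"g1", "g2"})
p_perm = permutation_pvalue(ranked, {"g1", "g2"})
print(es, p_perm)
```

A positive score indicates that the set is concentrated at the top of the ranking, a negative score that it is concentrated at the bottom.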

Immune infiltration analysis by single-sample Gene Set Enrichment Analysis (ssGSEA)
Immune infiltration analysis of COMMD7 was conducted by ssGSEA using the GSVA package in R (3.6.3). A total of 24 types of infiltrating immune cells were obtained as previously described [16]. Spearman correlation was used to analyze the relationship between COMMD7 expression and the enrichment scores of the 24 types of immune cells. The Wilcoxon rank-sum test was used to compare the enrichment scores of the high- and low-COMMD7 expression groups.
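
For readers unfamiliar with the correlation measure, a minimal Python sketch of Spearman's rank correlation (Pearson correlation of average ranks) is given below. The expression values and enrichment scores are hypothetical, and the helper names are assumptions of this sketch rather than functions from the GSVA package.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical COMMD7 expression and one cell type's enrichment scores.
commd7_expr = [1.2, 3.4, 2.2, 5.0]
cell_score = [0.10, 0.30, 0.35, 0.90]
rho = spearman(commd7_expr, cell_score)
print(round(rho, 3))
```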

PPI network
The PPI network of DEGs was predicted using the Search Tool for the Retrieval of Interacting Genes (STRING) database [17]. An interaction score threshold of 0.4 was set as the cut-off criterion. The PPI network was mapped using Cytoscape (version 3.7.1) [18], and the most significant modules in the PPI network were identified using MCODE (version 1.6.1) [19]. Selection criteria were as follows: MCODE scores >5, degree cut-off = 2, node score cut-off = 0.2, max depth = 100, and k-score = 2. Metascape (https://metascape.org/gp/index.htm) was used to conduct the pathway and process enrichment analysis.
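
Two of the ingredients above, the interaction score cut-off and the k-core idea that underlies the k-score parameter, can be sketched in Python. This is only an illustration under stated assumptions: the edge list is hypothetical, the function names are invented for this sketch, and the full MCODE algorithm (vertex weighting, seed expansion, post-processing) is not reproduced.

```python
from collections import defaultdict

def build_graph(edges, min_score=0.4):
    """Undirected adjacency sets from (a, b, score) edges above the cut-off."""
    graph = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def k_core(graph, k=2):
    """Repeatedly drop nodes with fewer than k neighbours."""
    core = {node: set(nbrs) for node, nbrs in graph.items()}
    changed = True
    while changed:
        changed = False
        for node in [n for n, nbrs in core.items() if len(nbrs) < k]:
            for nbr in core.pop(node):
                core[nbr].discard(node)
            changed = True
    return core

# Hypothetical interactions with combined scores between 0 and 1.
edges = [("A", "B", 0.9), ("B", "C", 0.7), ("A", "C", 0.5),
         ("C", "D", 0.6), ("D", "E", 0.2)]
graph = build_graph(edges)
core = k_core(graph, k=2)
print(sorted(graph), sorted(core))
```

Here the D-E edge falls below the cut-off, and D is then pruned from the 2-core because it retains only one neighbour.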

Prognostic model generation and prediction
To individualize the prediction of overall survival (OS) and event-free survival (EFS) in AML patients, a nomogram that included prominent clinical characteristics was generated using the rms R package (version 5.1-3), together with calibration plots. The calibration curves were evaluated graphically by plotting the nomogram-predicted probabilities against the observed rates, with the 45° line representing the best predictive values. The concordance index (C-index) was used to determine the discrimination of the nomogram and was calculated by a bootstrap approach with 1,000 resamples. In addition, the C-index and receiver operating characteristic (ROC) analysis were used to compare the predictive accuracy of the nomogram and of the separate prognostic factors. All statistical tests were two-tailed, with 0.05 as the statistical significance level.
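
The discrimination measure mentioned above can be made concrete with a small Python sketch of Harrell's concordance index for right-censored data. The follow-up times, event indicators, and risk scores are hypothetical, and the sketch omits the bootstrap resampling; it is not the rms implementation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ordered correctly.

    A pair (i, j) is comparable when subject i had the event and did so
    before subject j's observed time. It is concordant when subject i
    also has the higher predicted risk; ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up (months), event flags (1 = death), and risk scores.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.2]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.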

Statistical analysis
All statistical analyses were performed and all graphs were plotted in R (3.6.2) [20]. The expression of COMMD7 in unpaired samples was analyzed by the Wilcoxon rank-sum test, and the Wilcoxon signed-rank test was used for paired samples. The Kruskal-Wallis test, Wilcoxon signed-rank test, and logistic regression analysis were used to evaluate the relationship between clinical/cytogenetic characteristics and COMMD7 expression. Cox regression analysis and the Kaplan-Meier method were used to evaluate the prognostic factors. Multivariate Cox analysis was adopted to compare the impact of COMMD7 expression on survival along with other clinical features. The median COMMD7 expression was regarded as the cut-off value. In all tests, a P value < 0.05 was considered statistically significant. Moreover, ROC analysis was performed with the pROC package to assess the effectiveness of the transcriptional expression of COMMD7 in distinguishing AML from healthy samples. The computed area under the curve (AUC) ranges from 0.5 to 1.0, indicating 50-100% discrimination ability.
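
Two of the quantities named above, the Kaplan-Meier survival estimate and the ROC AUC, can be illustrated with a short Python sketch. The data are hypothetical, the function names are assumptions of this sketch, and the AUC is computed through its rank-based (Mann-Whitney) interpretation rather than by the pROC package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

def roc_auc(positives, negatives):
    """Probability that a random positive scores above a random negative
    (ties count as one half)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical follow-up times with event flags (1 = death, 0 = censored).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])

# Hypothetical COMMD7 expression in AML (positives) and healthy samples.
auc = roc_auc([5.1, 6.3, 4.0], [3.2, 4.5])
print(curve, round(auc, 3))
```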

COMMD7 expression in pan-cancers and AML
RNA-seq data from TCGA and GTEx, processed uniformly through the Toil pipeline, were downloaded from UCSC XENA (https://xenabrowser.net/datapages/). By comparing COMMD7 expression in normal samples from the TCGA and GTEx databases with that in the corresponding tumor samples from the TCGA database, COMMD7 was found to be significantly highly expressed in 28 types of cancer (Figure 1A), including acute myeloid leukemia (LAML) (Figure 1B).

Identification of DEGs in AML samples with low- and high-expressed COMMD7
The gene expression profiles of the high- and low-expression groups (split at the median COMMD7 mRNA expression) were compared. A total of 529 DEGs from the RNA-seq HTSeq-Counts data, including 92 up-regulated and 437 down-regulated genes, were statistically significant between the COMMD7 high- and low-expressed groups (|log fold change (logFC)| > 1.5, P < 0.05) (Figure 2A). The top five up-regulated DEGs and top five down-regulated DEGs between the COMMD7 high- and low-expressed groups are illustrated in the heat map (Figure 2B).

Functional enrichment analysis of DEGs
To better understand the functional implications of the 529 DEGs between the high- and low-COMMD7 expression groups in AML, GO and KEGG functional enrichment analysis was performed with the clusterProfiler package (Supplementary Table 1, Figure 3). The enriched biological process (BP) terms included pattern specification process, regionalization, and mesenchyme development; cellular component (CC) terms included collagen-containing extracellular matrix, ion channel complex, and basement membrane; molecular function (MF) terms included receptor ligand activity, DNA-binding transcription activator activity/RNA polymerase II-specific, and extracellular matrix structural constituent. Enriched KEGG pathways included the PI3K-Akt signaling pathway, focal adhesion, and ECM-receptor interaction.
GSEA was conducted between the low- and high-COMMD7 expression datasets to gain further insight into the biological pathways involved in AML with different COMMD7 expression levels and to identify critical signaling pathways. Significant differences (FDR < 0.05, adjusted P < 0.05) were observed in the enrichment of the MSigDB Collection (C2.all.v7.0.symbols.gmt) (Supplementary Table 2 and Figure 4). Gene mutations or fusions associated with a good prognosis in AML, such as PML-RARa fusion, NPM1 mutation, AML-ETO fusion, and CBFB-MYH11 fusion, were enriched in the COMMD7 low-expression phenotype based on NES, with adjusted P value <0.05 and FDR value <0.05 (Figure ...). ... genetic variants, such as phosphorylated TP53 targets and MYC targets, were also significantly enriched in such phenotype (Figure 4K-4L).
Figure caption (partial): Normalized expression levels were shown in descending order from green to red. (B) Heat map of the 10 differentially expressed RNAs, including 5 up-regulated genes and 5 down-regulated genes. The X-axis represents the samples, while the Y-axis denotes the differentially expressed RNAs. Green and red tones represented down-regulated and up-regulated genes, respectively.

Immune infiltration analysis in AML
Spearman correlation analysis showed that the expression level of COMMD7 in the AML microenvironment was correlated with the immune cell infiltration level quantified by ssGSEA. Specifically, COMMD7 was positively associated with NK CD56bright cells and activated dendritic cells (aDCs) (Figure 5).

PPI enrichment analysis in AML
The network of COMMD7 and its potential co-expressed genes among the COMMD7-related DEGs was constructed with STRING, using a threshold of 0.4 (Supplementary Table 3). A total of 529 DEGs were screened out (|log fold change (logFC)| > 1.5, P < 0.05). The PPI network, with 238 nodes and 367 edges, was displayed by Cytoscape-MCODE (Figure 6A). The most significant module, with an MCODE score of 7.317, contained 42 nodes and 150 edges (Figure 6B). Meanwhile, Metascape-MCODE was used to identify densely connected PPI network components of COMMD7, shown in Supplementary Figure 1. The three best-scoring GO terms by p-value, serving as the functional description of the corresponding components, are shown in Supplementary Table 4.
Likewise, the forest plot illustrated the prognostic value of COMMD7 in various AML subtypes using univariate Cox regression, with a conclusion consistent with the above results ( Figure 9).

Prognostic model of COMMD7 in AML
To better predict the prognosis of AML patients, a nomogram was constructed from the Cox regression results using the rms R package (Figure 10A). Three independent prognostic variables, age, cytogenetic risk, and COMMD7 expression, were selected into the prediction model at a statistical significance level of 0.2. Based on the multivariate Cox analysis, a point scale was used to assign points to these variables: a straight line was drawn upward from each variable to the points axis, and the points assigned to each variable were rescaled to a range of 0-100. The points of all variables were then summed and recorded as the total points. The probability of survival at 1, 3, and 5 years was determined by drawing a line from the total points axis straight down to the outcome axis. For example, a vertical line drawn downward from 162 on the total points axis suggested a 1-year survival probability < 20% and 3- and 5-year survival probabilities < 10%. The calibration curve of the nomogram for OS was consistent with the observed results for all patients (Figure 10B).
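
The point-assignment logic of a nomogram can be sketched in Python under simple assumptions. The Cox coefficients, variable codings, and function name below are hypothetical and are not the fitted values from this study; the sketch only shows how each variable's contribution to the linear predictor can be rescaled so that the most influential variable spans 0-100 points.

```python
def make_point_scale(coefs, ranges):
    """Build a function mapping (variable, value) to nomogram points.

    coefs  : variable -> Cox regression coefficient (hypothetical here)
    ranges : variable -> (lowest value, highest value) of its coding
    The variable with the widest effect on the linear predictor is
    assigned 100 points at its highest-risk value.
    """
    spans = {v: abs(coefs[v]) * (hi - lo) for v, (lo, hi) in ranges.items()}
    scale = 100.0 / max(spans.values())

    def points(variable, value):
        lo, hi = ranges[variable]
        # The lowest-risk end of each variable scores zero points.
        reference = lo if coefs[variable] >= 0 else hi
        return coefs[variable] * (value - reference) * scale

    return points

# Hypothetical coefficients: age > 60 (0/1), cytogenetic risk level (0-2),
# and high COMMD7 expression (0/1).
coefs = {"age_over_60": 0.8, "cytogenetic_risk": 0.5, "commd7_high": 0.6}
ranges = {"age_over_60": (0, 1), "cytogenetic_risk": (0, 2),
          "commd7_high": (0, 1)}

points = make_point_scale(coefs, ranges)
patient = {"age_over_60": 1, "cytogenetic_risk": 2, "commd7_high": 1}
total_points = sum(points(v, x) for v, x in patient.items())
print(round(total_points, 1))
```

Reading survival probabilities from the total points additionally requires the baseline survival function of the fitted Cox model, which is not sketched here.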

DISCUSSION
COMMD7 is a member of the COMMD family, which is defined by the presence of a conserved and unique motif termed the COMM (copper metabolism gene MURR1) domain that functions as an interface for protein-protein interactions [4]. Several studies have revealed that COMMD7 is involved in the regulation of NF-κB signaling [5,8]. Dysregulation of the NF-κB pathway has been linked to various tumors. Studies in several tumors, such as hepatocellular carcinoma and pancreatic ductal adenocarcinoma, have evaluated the expression and function of COMMD7 in tumor development [5,6]. However, little is known about the expression of COMMD7 and its prognostic value in AML.
The central result of the present study was that high COMMD7 expression in AML was associated with high BM/PB blasts, intermediate-high cytogenetic risk, and poor prognosis. In GSEA, low COMMD7 expression was associated with NPM1 mutation, PML-RARa fusion, AML-ETO fusion, and CBFB-MYH11 fusion, which are favorable prognostic factors. In contrast, high COMMD7 expression was associated with the Wnt, RAS, MAPK, and Hedgehog pathways, suggesting that COMMD7 is not only a potential prognostic biomarker but also a promising therapeutic target through its effect on oncogenesis-related pathways in AML.
It is worth noting that the most clinically relevant finding was that high expression of COMMD7 was associated with poor survival. Multivariate Cox regression analysis showed that high expression of COMMD7 was another independent prognostic factor besides age (>60 years). The establishment of the nomogram prediction model further confirmed the predictive effect of COMMD7 expression on prognosis. Therefore, COMMD7 may serve as a new adverse prognostic factor in AML patients.
More importantly, high expression of COMMD7 predicted poor prognosis in the subgroup of AML patients with NPM1 mutation. NPM1 mutation occurs in about 30% of newly diagnosed AML and is to date one of the most frequent genetic alterations identified in AML. Isolated NPM1 mutations are generally considered to have a positive prognostic effect on AML [21,22]. Unfortunately, approximately 30-70% of AML patients with NPM1 mutations relapse within five years, with age and FLT3-ITD mutations reported to be influencing factors [23,24]. The relevance of cooperation between NPM1 and other mutations with different outcomes in driving AML has been reported in several other studies [21,22,25,26]. Although the pivotal elements and mechanisms are still not completely clear, we found that AML patients with NPM1 mutations and high COMMD7 expression had a poor prognosis. Further research is required to verify the effect of high COMMD7 expression on NPM1-mutated AML and to explore its underlying mechanism.
In addition, the Wnt, RAS, MAPK, and Hedgehog pathways were found to be closely related to high expression of COMMD7 in AML. Wnt signaling has been shown to be up-regulated in AML through a variety of mechanisms that are necessary for the maintenance of leukemia stem cells [27,28]. A high incidence of gene mutations in the RAS/MAPK pathway has been identified in AML. The Hedgehog pathway is related to the resistance of AML cells to drugs and radiotherapy, resulting in poor prognosis in AML patients. Here, COMMD7 was found to be associated with these pathways and may be involved in the genesis and maintenance of leukemia cells, calling for further studies to confirm our results and to explore the specific regulatory mechanisms linking COMMD7 and these pathways.
In the immune cell infiltration analysis, high expression of COMMD7 was associated with higher levels of CD56(bright) NK cells. Human NK cells account for 10-15% of circulating lymphocytes, of which CD56(bright) and CD56(dim) NK cells are the two primary subsets. CD56(bright) NK cells have been considered immature NK cells, the precursors of CD56(dim) NK cells. Compared with CD56(dim) NK cells, CD56(bright) NK cells are characterized by high cytokine production and low cytotoxic capacity [29][30][31]. Notably, NK cells play a "double-edged sword" role in tumorigenesis. Traditionally, NK cells have been considered to play an important role in immunosurveillance and thus to have antitumor effects [32]. Recently, a series of studies have shown that CD56(bright) NK cells promote tumor development [31,[33][34][35]. CD56(bright) NK cell infiltration is increased in lung cancer, colorectal cancer, and breast cancer, among others [33,34,36,37]. Promotion of tumor angiogenesis, tumor immune escape, and loss of the ability to kill tumor stem cells have been shown to be involved in the promotion of malignant progression by CD56(bright) NK cells [33,38,39]. Cytokines in the tumor microenvironment play a regulatory role in the tumor-promoting activity of CD56(bright) NK cells [39,40]. In this study, infiltration of NK CD56(bright) cells was positively correlated with COMMD7 expression, and Kaplan-Meier survival analysis showed that high COMMD7 expression was associated with poor prognosis in AML patients. AML blast cells have been reported to evade NK cell immunosurveillance by diminishing the expression of several activating receptors [29]. However, the relationship between CD56(bright) NK cells and AML has not been fully elucidated. Hence, in light of our findings and the reports above, the relationship between COMMD7 and CD56(bright) and CD56(dim) NK cells, and whether COMMD7 and CD56(bright) NK cells are involved in immune escape in AML, deserve further exploration.
Moreover, Cox analysis in the present study indicated that COMMD7 might be an independent predictor of poor prognosis in AML after adjusting for routine clinical features. Multivariate Cox regression analysis showed that age >60 years and high COMMD7 expression were independent prognostic factors for worse OS. A nomogram was constructed by combining COMMD7 with cytogenetic risk and age to obtain a more accurate prognosis prediction model. The C-index of the COMMD7-related Cox model for predicting OS was 0.754 (0.728-0.780). The calibration plot showed good agreement between the predictions of the COMMD7-related nomogram and the actual observations of 1-year, 3-year, and 5-year OS probabilities. As previously reported, older age (>60), high cytogenetic risk, high WBC count (>20 × 10^9/L), FLT3 mutation-positive status, and NPM1 mutation-negative status were independent factors predicting poor prognosis of AML [1]. According to the Cox analysis and nomogram model, COMMD7 may potentially have better predictive power than cytogenetic risk, WBC count, FLT3 mutation status, and NPM1 mutation status. From this point of view, our model may provide a personalized score for individual AML patients.
However, this study is limited by its small sample size. To ensure greater reliability and representativeness of the findings and assumptions, the sample should be expanded in future research. Clinical samples should be used to verify the prognostic predictive role of COMMD7 mRNA and protein in AML. Experimental validation should also be performed to investigate the regulatory mechanisms between COMMD7 and the genetic alterations (such as NPM1 and FLT3) and the essential pathways selected by GSEA. Laboratory work addressing these questions has been planned.

CONCLUSIONS
In summary, this study showed for the first time that COMMD7 expression is increased in AML and is related to poor prognosis. Moreover, the Wnt, RAS, MAPK, and Hedgehog pathways may be the essential pathways through which COMMD7 acts in AML. More intriguingly, high COMMD7 expression may offset the favorable prognostic effect of NPM1 mutation in AML. Further verification should be carried out to reveal the biological impact of COMMD7 in AML.

Editorial note
This corresponding author has a verified history of publications using a personal email address for correspondence.

AUTHOR CONTRIBUTIONS
HYT, KFL, and YL were responsible for the design of the study, data acquisition, and analysis, as well as drafting the manuscript. LGC, HZ, and LW participated in data acquisition, analysis, and interpretation. DYL and ZZZ participated in drafting the manuscript and troubleshooting. YL, RZP, PSZ, XHD, and KYS participated in its design and coordination. All authors read and approved the final manuscript.