Comprehensive characterization of m6A methylation and its impact on prognosis, genome instability, and tumor microenvironment in hepatocellular carcinoma

N6-methyladenosine (m6A) RNA regulation was recently reported to be important in carcinogenesis and cancer development. However, the characteristics of m6A modification and its correlations with clinical features, genome instability, tumor microenvironments (TMEs), and immunotherapy responses in hepatocellular carcinoma (HCC) have not been fully explored. We systematically analyzed the m6A regulator-based expression patterns of 486 patients with HCC from The Cancer Genome Atlas and Gene Expression Omnibus databases, and correlated these patterns with clinical outcomes, somatic mutations, TME cell infiltration, and immunotherapy responses. The m6A score was developed by principal component analysis to evaluate m6A modifications in individual patients. M6A regulators were dysregulated in HCC samples, among which 18 m6A regulators were identified as risk factors for prognosis. Three m6A regulator-based expression patterns, namely m6A clusters, were determined among HCC patients by m6A regulators with different m6A scores, somatic mutation counts, and specific TME features. Additionally, three distinct m6A regulator-associated gene-based expression patterns were also identified based on prognosis-associated genes that were differentially expressed among the three m6A clusters, showing similar properties as the m6A regulator-based expression patterns. Higher m6A scores were correlated with older age, advanced stages, lower overall survival, higher somatic mutation counts, elevated PD-L1 expression levels, and poorer responses to immune checkpoint inhibitors. The m6A score was validated as an independent and valuable prognostic factor for HCC. M6A modification is correlated with genome instability and TME in HCC. Evaluating m6A regulator-based expression patterns and the m6A score of individual tumors may help identify candidate biomarkers for prognosis prediction and immunotherapeutic strategy selection.


Background
Hepatocellular carcinoma (HCC) accounts for the most cases of primary liver cancers, which have the sixth and third highest incidence and mortality rate, respectively [1]. HCC is characterized by its high malignancy, high recurrence rate, and poor prognosis. However, Open Access *Correspondence: shukunyao@126.com 1 Peking University China-Japan Friendship School of Clinical Medicine, No. 2 Yinghua East Road, Chaoyang District, Beijing 100029, China Full list of author information is available at the end of the article its underlying molecular pathogenesis remains largely unclear.
N6-methyladenosine (m6A) RNA methylation plays a vital role in RNA splicing, export, processing, translation and decay, and is the most common type of post-transcriptional regulation of mRNAs in living organisms [2]. Previous reports identified the following three types of m6A regulators: "writers" (methyltransferases), "readers" (m6A-binding proteins), and "erasers" (demethylases) [3]. Recently, growing evidence has revealed the emerging role of m6A deregulation in liver diseases and cancer [4][5][6]. Thus, regulators of m6A modification may be diagnostic and therapeutic targets for HCC.
Genome instability is an evolving hallmark of many types of cancers and is involved in tumorigenesis and cancer progression [7]. The tumor mutation burden (TMB), which refers to the accumulation of endogenous and exogenous mutation processes in cancer cells, has been reported to play a crucial role in the biological processes of cancer based on the high frequency and wide spectrum of somatic mutations [8]. The TMB was also shown to be a useful prognostic biomarker [9], and for immunotherapy selection in some cancers [10]. Furthermore, Zhang et al. found that m6A modification scores were correlated with TMB levels in gastric cancer [11]. However, the relationship between m6A regulation and genome instability in HCC remains ambiguous.
Cancer development has been found to occur that are closely associated with alterations in tumor microenvironments (TMEs), which are aggregations of tumor cells and adjacent tumor-related nontumor cells [12]. Blockade of immune checkpoints, including programmed cell death-1 (PD-1), its ligand (PD-L1), and cytotoxic T lymphocyte associated antigen 4 (CTLA4), has revolutionized oncology therapeutics [13,14]. The association between PD-L1 expression and immune infiltration of m6A regulators has been reported in head and neck squamous cell carcinoma and gliomas [15,16]. However, the relationships of m6A modification regulators with immune infiltration of TMEs and immunotherapy in HCC require further exploration.
The aim of this study was to comprehensively identify m6A regulator-based expression patterns, quantify individual scores, and evaluate the prognostic value of m6A methylation in HCC. The relationships of m6A regulation with genome instability and TMEs were also investigated to reveal the role of m6A methylation in HCC carcinogenesis and immunotherapy. In summary, the current study expanded the understanding on the mechanism of m6A modification in the tumorigenesis and development of HCC and identified biomarker candidates for prognosis prediction and therapy selection in patients with HCC.

Data collection
The RNA profiles and clinical information of 374 HCC samples and 50 normal samples were downloaded from The Cancer Genome Atlas (TCGA) database on July 18, 2021. The fragments per kilobase of transcript per million mapped reads (FPKM) values of RNA expression data downloaded from TCGA database were transformed into transcripts per kilobase million (TPM) values. A total of 21 datasets of HCC patients were obtained from Gene Expression Omnibus (GEO) databases. After excluding 20 GEO cohorts without survival data, the normalized series matrix file and clinical information of 115 HCC patients were downloaded from the GSE76427 dataset. Next, 3 samples were deleted from TCGA cohort because they were not primary tumors of HCC. Therefore, the gene expression and clinical data of 486 patients with HCC were further analyzed. The batch effects among TCGA and GSE76427 cohorts were removed using the ComBat method in "sva" R package. The PRISMA flow chart is shown in Additional file 1: Fig. S1. The somatic mutation data of patients with HCC were obtained from TCGA database and identified using the VarScan software. The copy number variation (CNV) data of HCC were collected from the UCSC Xena database (www. xena. ucsc. edu). The immunophenoscores (IPS), a predictor of response to anti-CTLA4 and anti-PD-1 antibodies based upon the expression profiles of the representative genes of immunomodulatory, were obtained from The Cancer Immunome Atlas (https:// tcia. at). We followed the rules that are set forth by the public data set management for use of the data set. All data in this study were obtained from public databases and available to the public.

Consensus clustering of m6A regulator-based expression patterns
Based on the expression levels of m6A regulators, patients with HCC in TCGA and GSE76427 cohorts were classified into different m6A regulator-based expression patterns, namely m6A clusters, using the "Consensus-ClusterPlus" R package with the pam method, with 50 iterations and a resampling rate of 80% [21]. To determine the number of m6A clusters, we used the empirical cumulative distribution function (CDF) plots to identify the consensus distributions for each k, as well as the delta area score to display the relative growth in cluster stability. Consistent matrix (CM) plots were also illustrated based on the k-value. Moreover, based on the expression level of 25 m6A regulators, principal component analysis (PCA) was performed among the different m6A clusters. The expression levels of the 25 m6A regulators were compared among the m6A regulator-based expression patterns.

Enrichment analysis and immune cell infiltration of m6A clusters
To explore the different biological processes between m6A clusters, the "c2.cp.kegg.v7.4.symbols" file was downloaded from the MsigDB database for Gene set variation analysis (GSVA). An adjusted P < 0.05 was considered to indicate statistically significant results. The "limma" R package was applied to identify differentially expressed and m6A-related genes using the lmFit and eBayes functions with the significant cutoff value at adjusted P < 0.05. Differentially expressed genes (DEGs) between the different m6A regulator-based expression patterns were first identified, and then the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was conducted based upon the DEGs by using the "cluster-Profiler" R package [22,23]. Furthermore, the immune infiltration characteristics were analyzed and compared among different m6A clusters. The relative abundances of cell infiltration in the TMEs of HCCs were quantified using the enrichment score for each immune category of single-sample gene set enrichment analysis (ssGSEA). The "GSVA" R package was utilized for GSVA and ssGSEA analyses.

Identification of m6A gene clusters
To further analyze the values of the DEGs of the m6A clusters, univariate Cox regression analysis was performed to retrieve the prognosis-related DEGs (P < 0.0001). HCC patients in TCGA and GEO cohorts were separated into different m6A regulator-associated gene-based expression patterns, namely m6A gene clusters, which were defined by the consensus clustering according to the expression of prognostic DEGs. PCA of m6A gene clusters was performed based on expression of these prognosis-related DEGs. Survival analysis was also conducted among gene clusters. Moreover, the expression levels of the 25 m6A regulators in the different gene expression-based clusters were then investigated.

Correlation of m6A score with clinical factors, genome stability, and immune characteristics
To quantify the m6A modification score, PCA was conducted to develop a set of m6A scoring systems using a method similar to that used by Zhang et al. [11]. Briefly, the sum of all principal components 1 and 2 of each prognostic DEG related to m6A regulators was considered as the m6A score. Patients with HCC were classified into the low or high m6A score subgroup according to the cutoff point determined using "survminer" R package. The differences in m6A scores among the m6A clusters and m6A gene clusters were compared. Furthermore, the value of m6A score in predicting prognosis was examined by survival analysis of patients with HCC and of patients in subgroups stratified by clinicopathological features. The correlations of the m6A score with the TMB, immune cell infiltration and IPS were explored in detail to reveal the roles of m6A regulators.

Statistical analysis
R version 4.0.3 was utilized for data analysis. For quantitative data, t-test and Wilcoxon test were used to compare two groups, whereas one-way analysis of variance and Kruskal-Wallis test were used to compare differences among three or more groups for parametric and nonparametric data, respectively. Kaplan-Meier curves were used for survival analyses using log-rank test. The cutoff values of m6A regulator expression and the m6A score in survival analyses were calculated using the survcutpoint function in "survminer" R package to evaluate all potential cutoff points repeatedly to determine the maximum rank statistics. Cox regression analyses were performed to identify independent prognostic factors from the m6A regulators and m6A-related prognostic DEGs. Time-dependent receiver operating characteristic (ROC) curves were constructed to evaluate the accuracy of the m6A score for prognosis prediction using the 1-and 3-year areas under the curves (AUCs). Waterfall maps were used to present the mutational landscape of patients with HCC in TCGA cohort using the "Maftools" R package. Statistical significance was set at P < 0.05.

Mutational genomic landscape of m6A regulators in HCC
A total of 364 patients with HCC were enrolled from TCGA cohort to depict the landscape of genetic variation in m6A regulators, among which 45 samples (12.36%) showed mutations in m6A regulators (Fig. 1a). The CNV alteration of HCC is depicted in Fig. 1b. Some m6A regulators displayed increased copy numbers, whereas others showed decreased CNV frequencies.
The chromosome locations of the CNV alterations of m6A regulators are illustrated in Fig. 1c. Furthermore, the expression levels of m6A regulators were compared between normal and HCC tissues in TCGA cohort, indicating that except for ZC3H13, the other 24 m6A regulators were upregulated in the HCC samples in TCGA cohort (Fig. 1d).

Prognostic value of m6A regulators in HCC
As METTL16 and VIRMA were not available in the GSE76427 cohort, only 23 m6A regulators were analyzed. A total of 18 m6A regulators, namely, METTL3, WTAP, RBM15, RBM15B, CBLL1, ZCCHC4, YTHDC1, YTHDF1/2/3, LRPPRC, HNRNPA2B1, HNRNPC, IGF2BP1/2/3, ELAVL1, and FTO, were found to be associated with overall survival (OS) in HCC patients from TCGA and GSE76427 cohorts using Kaplan-Meier curves and log-rank tests (Additional file 1: Fig. S2). A crosstalk network was generated by R software to present the widespread correlations in expression levels across the m6A regulators in patients with HCC (Fig. 2a). In summary, m6A regulators could have played a vital role in predicting clinical outcomes in patients with HCC, and some regulators showed potential as prognostic biomarkers.

Consensus clustering for HCC patients by m6A regulators
Using consensus clustering, the patients with HCC from TCGA and GSE76427 cohorts were assigned to three m6A regulator-based expression patterns, namely m6A cluster A (n = 188), cluster B (n = 110), and cluster C (n = 187), based on the expression of m6A regulators. Both the CDF plot and delta area score showed the highest stability with k = 3 (Additional file 1: Fig. S3a, b). The CM plot also showed the high consistency at k = 3 (Additional file 1: Fig. S3c). The results of PCA showed distinguishable differences in the transcriptional profiles among the three m6A clusters (Fig. 2b). The three m6A clusters showed significantly different OS, with cluster A and C exhibiting good prognoses and cluster B demonstrating the worst clinical outcome (Fig. 2c). The relationships of the m6A clusters with clinicopathological features and expression of m6A regulators are illustrated in Fig. 2d. Furthermore, the expression of the m6A regulators among the three m6A clusters showed significant differences (Fig. 2e). Specifically, cluster B was characterized by significant upregulation of all m6A regulators.

Enrichment analysis and TMEs of m6A clusters
GSVA analyses suggested that among the three m6A clusters, cluster B included carcinogenesis-related enrichment pathways such as cell cycle, RNA degradation and nucleotide excision repair (Additional file 1: Fig.  S4a, S4b). A total of 3500 genes were found to be differentially expressed among the three m6A clusters (Fig. 3a). KEGG enrichment analysis suggested that these DEGs were enriched in carcinogenesis and RNA modification pathways, including cell cycle, mismatch repair, RNA degradation, and nucleotide excision repair (Fig. 3b). The cell infiltration of the TMEs in m6A clusters was further investigated. Cluster B showed the highest infiltration of activated CD4 + T cells, immature dendritic cells, natural killer T cells, and type 2 T helper cells, with the lowest infiltration of eosinophils, neutrophils, type 1 T helper cells, and type 17 T helper cells (Fig. 3c). Thus, the immunosuppressive TMEs in cluster B may have contributed to worse clinical outcomes, indicating that the distinct TME characteristics in varying m6A clusters might play a vital role in the tumorigenesis and prognosis of HCC.

Consensus clustering of the m6A gene cluster
Using the univariate Cox regression model, 307 DEGs were extracted as significantly associated with prognosis (Additional file 1: Table S1). Based on the expression of these prognostic DEGs, patients with HCC were classified into three m6A regulator-associated gene-based expression patterns: m6A gene cluster A (n = 179), gene cluster B (n = 110), and gene cluster C (n = 196) (Additional file 1: Fig. S3d-f ). Distinguishable differences in the transcriptional profiles of prognostic DEGs were revealed among the three m6A gene clusters (Fig. 4a). Consistent with the results for the three m6A clusters mentioned above, the OS of the three gene clusters were significantly different, with cluster B exhibiting the worst prognosis (Fig. 4b). The relationships of the m6A gene cluster with age, gender, stage, prognosis, m6A cluster, and expression levels of DEGs are illustrated in Fig. 4c. Furthermore, the expression levels of the 23 m6A regulators differed among the three gene clusters, with gene cluster B exhibiting the highest expression of most m6A regulators (Fig. 4d). In summary, the m6A-related DEGs may be crucial factors affecting the clinical outcomes of patients with HCC.

Generation of m6A score and the prognosis prediction
To further explore the value of the DEGs, an m6A score was generated to evaluate the m6A modification. The m6A score was highest in cluster B and lowest in cluster A (Fig. 5a). Similar distribution was observed in the m6A gene clusters (Fig. 5b). Older patients (age > 65 years), patients with early-stage disease and patients survived at the clinical endpoint had lower m6A scores compared to their control pairs (Fig. 5c-e). Using a cutoff value of 10.46 determined by the "survminer" R package, the patients were divided into high and low m6A score groups. Patients with low m6A scores tended to have better prognoses (Fig. 5f ). The OS of the low m6A score group was longer than the high m6A score group of HCC patients, suggesting that the m6A score can be successfully applied to all patients with HCC as a prognostic indicator (Fig. 5g-l).
The age, gender, stage and m6A score were examined using Cox regression analysis in HCC patients. Univariate Cox analysis showed that the stage and m6A score (both P < 0.001) were associated with OS of HCC patients (Fig. 5m). The stage and m6A score (both P < 0.001) were also correlated with OS in multivariate Cox analysis (Fig. 5n). Therefore, the m6A score was validated as an independent prognostic factor for patients with HCC. The 1-and 3-year AUC of the m6A score were 0.743 and 0.678 respectively, indicating good performance of the m6A score in predicting prognoses (Fig. 5o). An alluvial diagram was utilized to illustrate the attribute changes among m6A clusters, m6A gene clusters, m6A scores, and clinical outcomes (Fig. 5p).

Correlation of m6A score with genomic instability
The somatic mutation count of the high m6A score subtype was higher than that of the low score subtype (Fig. 6a). Patients with high TMBs tended to have poorer prognoses (Fig. 6b). The m6A score was combined with the TMB to predict the prognoses of patients with HCC (Fig. 6c). Specifically, patients with an elevated TMB and m6A scores showed the worst prognoses. Waterfall maps of low and high m6A score groups revealed the distribution of somatic mutations in the TCGA cohort (Fig. 6d, e). The top three mutated genes in the low m6A score group were CTNNB1 (28%), TTN (23%) and TP53 (21%), and the top three mutated genes were TP53 (53%), TTN (26%) and CTNNB1 (17%) in the high m6A score group. Based on these results, m6A modification may be correlated with genomic stability, but the mechanism requires further exploration.

Relationship of m6A score with TME and immunotherapy
The m6A scores were positively correlated with activated CD4 + T cells, activated dendritic cells, immature B cells, immature dendritic cells, myeloid-derived suppressor cells (MDSCs), natural killer T cells, regulatory T cells, T follicular helper cells and type 2 T helper cells, and negatively correlated with eosinophils, monocytes, neutrophils, and type 1 T helper cells (Fig. 7a). Patients with high m6A scores were characterized by increased expression levels of PD-L1 (Fig. 7b). As low m6A scores were associated with a higher IPS in the four subgroups, patients with low m6A scores showed better responses to anti-PD-1 therapy and anti-CTLA4 therapy (Fig. 7c-f ).
Overall, m6A modification might be involved in immune cell infiltration of the TME and act as a promising crucial factor in the immunotherapy response.

Discussion
Mounting evidence suggests that m6A methylation is a prevalent RNA internal modification with an essential role in HCC carcinogenesis, progression, and treatment outcomes [5,6]. The correlation between m6A regulation and the TME has been investigated in some cancer types such as gastric cancer [11] and lung adenocarcinoma  [17]. In this study, we identified three m6A regulatorbased expression patterns and three m6A gene clusters and developed an m6A score to quantify m6A modification. The relationships of m6A regulation with clinical outcomes, somatic mutations, cell infiltration of the TME, and immunotherapy responses were systematically investigated to examine the value of m6A modification in HCC development. Distinct genetic alterations and significantly upregulated expression of m6A regulators were observed in patients with HCC compared to in normal pairs in this study, which is consistent with results obtained in other cancer types, such as gastric cancer, colorectal cancer, and pancreatic cancer [11,19,24]. The m6A regulators were rarely mutated in HCC, whereas gain and loss alterations in CNVs were prevalent but relatively equal among the m6A regulators. These results indicated that somatic mutations and CNVs may partially explain the expression differences in m6A regulators between HCC and normal samples, which requires further exploration.
In our study, 18 m6A regulators were identified as prognostic indicators for HCC. High expression of these m6A regulators was correlated with worse OS in patients with HCC. Consistent with our findings, METTL3 [25], WTAP [6], ZCCHC4 [26], YTHDF2 [5], LRPPRC [27], IGF2BP1 [28], IGF2BP2 [29], and FTO [30] have been reported as potential prognostic indicators involved in diverse pathophysiological processes in HCC. Moreover, higher m6A scores correlated with older age, advanced stage, higher somatic mutation counts, and poorer OS in all age, gender, and stage subgroups with good performance in our study. Meanwhile, m6A scores were found to be useful for diagnostic and prognostic prediction in gastric cancer [11], colon cancer [19], and HCC [31]. In summary, m6A regulators and the m6A score may be promising biomarkers for evaluating clinicopathological features and predicting clinical outcomes in HCC.
Our findings also revealed a relationship between m6A regulation and the TME in HCC. Patients with higher m6A scores showed elevated PD-L1 expression. Accumulating evidence has shown that m6A regulation is Fig. 6 Correlation of m6A score with tumor mutation burden (TMB). a TMB levels in different m6A score groups. b Survival analysis of TMB in patients with hepatocellular carcinoma. c Survival analysis of patient subgroups stratified according to m6A score and TMB. Waterfall maps of somatic mutations for patients with low m6A scores (d) and high m6A scores (e). H: high; L: low correlated with PD-L1 expression and TMEs in gastric cancer [11] and colorectal cancer [19]. In addition, the m6A regulator-related risk scores were previously found to be correlated with CTLA4 and PD-L1 in breast cancer [32]. Analysis of the value of m6A modification in immunotherapeutic clinical outcomes of HCC suggested that patients with low m6A scores might benefit from immunotherapies targeting CTLA4/PD-1 inhibitors.
Thus, m6A regulators might affect PD-L1 expression and immune cell infiltration in patients with HCC.
In our study, TMB could act as a potential prognostic factor for HCC, which is consistent with many other studies [33,34]. Our findings revealed that combination of m6A score and TMB could predict OS of HCC. Moreover, TP53 was the most common mutation (53%) in patients with high m6A scores, whereas CTNNB1 was the most common mutation (28%) in patients with  [35], and HCC patients harboring TP53 mutations tended to have poor prognosis along with hypoxia-induced HCC stemness [36]. Patients with HCC carrying TP53 neoantigens also showed longer OS and higher cytotoxic lymphocyte infiltration [37]. Inhibiting the expression of CTNNB1 may increase the stemness features of HCC [38]. Overall, the correlation and interaction of m6A modifications with genome instability might influence tumorigenesis and prognosis of HCC.
In this study, we provided new insight into the roles of m6A regulator-based expression patterns in HCC. In clinical practice, the m6A score may be useful for evaluating m6A methylation, predicting clinical outcome, and assessing corresponding TME cell infiltration characterization, TMB and immunotherapeutic response in individual patients with HCC, which might contribute to the identification of prognostic biomarkers and selection of immunotherapies. However, there were some limitations to our study. First, the sample size of patients may affect the development of the m6A score. Large-scale datasets and clinical samples should be utilized to validate the prognostic value of the m6A score and m6A regulators found in this study. Moreover, molecular experiments are necessary to confirm the specific biological pathways involved in m6A modification during HCC carcinogenesis and progression. Finally, considering the limitations in the therapeutic response data, the efficiency of m6A modification to predict prognosis and clinical benefit in immunotherapy requires verification in the future.

Conclusions
In summary, we identified m6A regulator-based expression patterns and m6A score, and investigated their prognostic value in HCC patients. Furthermore, the correlation of m6A regulation with somatic mutations, TMEs, and immunotherapy responses were explored. Our findings provide insights into the evaluation of m6A regulation as a potential biomarker for prognostic prediction and guidance of immunotherapeutic strategies for HCC.
Additional file 1. Fig. S1. PRISMA flow diagram for data collection. Fig.  S2. Survival analyses for m6A regulators associated with overall survival of patients with HCC. Fig. S3. Unsupervised consensus clustering analysis of patients with HCC. Fig. S4. GSVA analyses of the three m6A regulatorbased expression patterns. Table S1. Prognostic m6A-related genes among three m6A regulator-based expression patterns.