A novel tumor doubling time-related immune gene signature for prognosis prediction in hepatocellular carcinoma

Hepatocellular carcinoma (HCC) has become a global health issue of wide concern due to its high prevalence and poor therapeutic efficacy. Both tumor doubling time (TDT) and immune status are closely related to the prognosis of HCC patients. However, the association between TDT-related genes (TDTRGs) and immune-related genes (IRGs) and the value of their combination in predicting the prognosis of HCC patients remains unclear. The current study aimed to discover reliable biomarkers for anticipating the future prognosis of HCC patients based on the relationship between TDTRGs and IRGs. Tumor doubling time-related genes (TDTRGs) were acquired from GSE54236 by using Pearson correlation test and immune-related genes (IRGs) were available from ImmPort. Prognostic TDTRGs and IRGs in TCGA-LIHC dataset were determined to create a prognostic model by the LASSO-Cox regression and stepwise Cox regression analysis. International Cancer Genome Consortium (ICGC) and another cohort of individual clinical samples acted as external validations. Additionally, significant impacts of the signature on HCC immune microenvironment and reaction to immune checkpoint inhibitors were observed. Among the 68 overlapping genes identified as TDTRG and IRG, a total of 29 genes had significant prognostic relevance and were further selected by performing a LASSO-Cox regression model based on the minimum value of λ. Subsequently, a prognostic three-gene signature including HECT domain and ankyrin repeat containing E3 ubiquitin protein ligase 1 (HACE1), C-type lectin domain family 1 member B (CLEC1B), and Collectin sub-family member 12 (COLEC12) was finally identified by stepwise Cox proportional modeling. The signature exhibited superior accuracy in forecasting the survival outcomes of HCC patients in TCGA, ICGC and the independent clinical cohorts. Patients in high-risk subgroup had significantly increased levels of immune checkpoint molecules including PD-L1, CD276, CTLA4, CXCR4, IL1A, PD-L2, TGFB1, OX40 and CD137, and are therefore more sensitive to immune checkpoint inhibitors (ICIs) treatment. Finally, we first found that overexpression of CLEC1B inhibited the proliferation and migration ability of HuH7 cells. In summary, the prognostic signature based on TDTRGs and IRGs could effectively help clinicians classify HCC patients for prognosis prediction and individualized immunotherapies.

poor therapeutic efficacy [1]. The incidence and mortality of HCC have emerged rapidly on the background of increased alcohol abuse, cirrhosis, aflatoxin exposure, diabetes, metabolic syndrome, and obesity. With the rapid developments in immunotherapy, HCC patients could access potential treatment strategies including Sorafenib, Lenvatinib, and Atezolizumab [2], while not all the patients could benefit from these immune-based therapies due to the complex immune status of HCC [3][4][5]. Tumor doubling time (TDT), a classic biomarker reflecting tumor growth patterns, has some implications for general management of tumor patients including anticancer treatment reactivity and outcome, tumor histologic type prediction, and best surveillance intervals determination [6][7][8][9]. When gradual and predictable tumor growth was determined based on TDT, patients with indolent or aggressive tumors could be identified at an early stage and receive follow-up treatment to improve final survival, which may reduce the mental and financial burden caused by overdiagnosis or missed diagnosis [9][10][11]. What's more, TDT was highly associated with tumor vascular invasion and recurrence [12] and could be regarded as a reliable predictor of sorafenib's therapeutic effect [7], suggesting that TDT might have a significant relationship with the cancer immune status. Furthermore, immunotherapy is becoming the new standard of treatment for advanced stages, considering its increasing use worldwide to treat HCC patients, although the inadequate immune response has been a growing concern [13]. However, the association between TDT-related genes (TDTRGs) and immune-related genes (IRGs) and the value of their combination in predicting the prognosis of HCC patients remains unclear. In the present study, a novel prognostic signature based on currently available TDTRGs and IRGs could efficiently classify HCC patients for prognosis and individualized immunotherapies.

Public datasets and generation of TDTRGs and IRGs
The mRNA expression data and clinical characteristics of HCC patients from three publicly available datasets including TCGA-LIHC, GSE54236, and ICGC (LIRI-JP) were incorporated into the present study. Genes with a cut-off criterion of adjust P value < 0.01 and Pearson correlation coefficient > 0.3 in GSE54236 were considered TDTRGs. IRGs were available from the Immunology Databases and Analysis Portal (ImmPort) database (https:// www. immpo rt. org/ home) [14].

Prognostic risk score model construction and functional analysis
The univariable Cox regressions were first performed to calculate the association between overlapped genes of TDTRGs and IRGs and survival outcomes in TCGA-LIHC cohort. Then LASSO-Cox regression method and stepwise Cox regression analysis were performed to evaluate the above prognosis-related genes and establish a prognostic signature. A risk score was finally established based on the basis of linearly combining the formula below with the mRNA expression level multiplied by the multivariate Cox regression coefficient (β) model. Risk score = (β mRNA1 × mRNA1) + (β mRNA2 × mRNA2) + … + (β mRNAn × mRNAn). We stratified patients in TCGA dataset into two subgroups due to the optimal risk score threshold. The predictive power and independence of the prognostic signature in TCGA were assessed by ROC analysis, Kaplan-Meier survival analysis, and Cox proportional hazards regression analysis. Gene set enrichment analysis (GSEA) between the two subgroups was performed to identify the significantly alerted Hallmarks with FDR < 0.05. Annotated gene set h.all.v7.2.symbol. gmt (Hallmarks) was chosen as the reference gene set.

Clinical specimens and quantitative real-time PCR (qRT-PCR) analysis
Fresh frozen tumor tissues from previously collected HCC patients were selected as an independent validation cohort [15]. qRT-PCR was used to detect the mRNA levels of genes in the model [16]. After the relative mRNAs expression levels were normalized to β-ACTIN and log 2 transformed, patients were stratified into two subgroups according to the above formula. Primer sequences are showed in Additional file 3: Table S1.

Immune status calculation and immune infiltrates analysis
The immune status of each sample was assessed by applying the ESTIMATE algorithm [17] to the TCGA cohort and calculating immune and stromal scores, and the association between risk scores and immune, stromal scores were analyzed by Pearson correlation analysis. To explore the impacts of the prognostic model on immunotherapies, we calculated the relationship among risk score and 15 potentially available targeted immune checkpoint genes in TCGA-LIHC, including CCL2, CD274, CD276, CD4, CTLA4, CXCR4, IL1A, IL6, LAG3, PDL1, PDL2, TGFB1, OX40, CD137 and CDX40L [18]. Furthermore, to assess the potential association between prognostic signature and tumor-infiltrating immune cells (TIICs) in the HCC microenvironment, the TCGA database was used to measure the abundance ratios of 22 types of TIICs through CIBERSORT [19] (http:// ciber sort. stanf ord. edu/). Finally, the predictive ability of significantly changed TIICs was assessed by Kaplan-Meier survival analysis.

Genetic alterations and TMB analysis
The mutation and CNA data of 361 HCC patients were downloaded from TCGA to analyze the difference of genetic alterations between the high-and low-risk score subgroups with R package "maftools", and the tumor mutation burden (TMB) of each patient was subsequently assessed.

Cell proliferation and migration assay
The viability of cells was assessed through the cell counting kit-8 (CCK-8, Sangon Biotech, Shanghai, China) in accordance with the manufacturer's instructions. For the CCK-8 assay, in short, cells were seeded in 96-well cell culture clusters at a density of 1 × 10 5 cells per well and cultured for 1 day, 2 days, and 3 days, respectively. After culturing, 10 μL CCK-8 solution was added into each well, and then the absorbance was detected at a wavelength of 450 nm within 4 h with a microplate reader. For the wound-healing assay, cells were seeded in 6-well plates and cultured to approximately 80% confluence in serum-free medium and then cell monolayers were scratched with a sterile pipette tip. Cells were cultured in DMEM medium with 10% FBS for the next 24 h after removal of cell debris by PBS washing. The area of the wound width was measured after photographing the wound width of the cell monolayers. Three independent duplicates needed to ensure the accuracy in this assay and the wound closure rate was calculated as [1-(wound area / original wound area)] from photographs. For the transwell assay, after transwell filters were coated with Matrigel, cultured cells were resuspended in 200 μL serum-free DMEM at a density of 1 × 104 cells per mL, and plated into the transwell inserts while the wells were filled with 500 μL DMEM supplemented with 10% FBS. After incubated at 37℃ for 48 h, cells attached to the downside of the transwell filters were stained with 0.1% crystal violet in PBS for 15 min and counted under microscopy at 200 × magnification.

Statistical analysis
Categorical data were compared with Pearson chi-square test or Fisher exact test whenever appropriate, and quantitative variables were analyzed using independentsamples t test. ROC curve analysis and Kaplan-Meier survival analysis were performed to assess the prediction performance of survival outcomes with R software (Version 4.0.3). Cox proportional model was performed to analyze the relationship between prognostic signature and survival outcomes, together with other clinical features. Clinical characteristics of HCC patients in TCGA, ICGC and clinical validation cohorts were showed in Additional file 3: Table S2. Results were considered statistically significant when P value < 0.05.

Identification of overlapped genes in TDTRGs and IRGs
With the cut-off criterion of adjust P value < 0.01 and Pearson correlation coefficient > 0.3, 1539 genes in GSE54236 were considered TDTRGs. Then 68 genes were identified as overlapped genes of TDTRGs and IRGs (Fig. 1A). The expression levels of the 68 overlapping genes in normal and tumor tissues in the TCGA-LIHC dataset were shown in Fig. 1B. Results of Go and KEGG analysis indicated that these overlapped genes were majorly associated with immune response and inflammation (Fig. 1C).

Establishment of a prognosis-related signature in TCGA
As calculated by univariable Cox regression, 29 of the 68 overlapped genes had significant prognostic relevance ( Fig. 2A) and were further analyzed. The correlations between these prognostic genes were shown in Fig. 2B. Then the most valuable prognostic genes among the 29 genes above were selected by performing the LASSO-Cox regression model based on the minimum value of λ (Fig. 2C) and a prognostic three-gene signature was finally identified via a stepwise Cox proportional model. Risk score = (0.2984276 × HACE1) + ( 0.1782599 × COLEC12) -(0.2154380 × CLEC1B). Risk scores for HCC patients were calculated with the above formula, and patients were stratified into high-or lowrisk subgroups with an optimal risk score threshold (Fig. 2D). The association between risk score and clinical characteristics including age, gender, grade, stage, vascular invasion, value of AFP, cirrhosis, and tumor status were evaluated. The results revealed that higher risk scores were linked to advanced TNM stage, later grade, later T stage, and recurrence (Additional file 1: Figure S1). Kaplan-Meier survival analysis revealed that patients with higher risk scores were significantly relevant to poorer survival outcomes (Fig. 2E). In addition, further stratified survival analysis was applied for different clinical characteristics, and the results demonstrated that this prognostic model could further differentiate patients with different clinical characteristics including age, vascular invasion, grade, recurrence, TNM stage, gender, and AFP value (Additional file 2: Figure S2). Finally, ROC analysis revealed that this signature had a good prognostic performance with AUCs at 1-, 2-, 3-year of 0.780, 0.668, 0.692 (Fig. 2F). Finally, the relationship between risk score and TDT was analyzed, and we found that HCC patients in the high-risk group had significantly shorter TDT (Fig. 2G).

Verification of the signature in ICGC cohort and clinical cohort
To validate the signature, the ICGC dataset and a clinical cohort were applied as validation cohorts. Risk scores of patients were calculated with the same formula, and patients were stratified into high-or low-risk subgroups in the ICGC cohort (Fig. 3A) and the clinical cohort (Fig. 3D). Kaplan-Meier survival analysis revealed that patients with higher risk scores were prominently relevant to poorer OS rates in the ICGC cohort (Fig. 3B), as well as in the clinical cohort (Fig. 3E). ROC analysis revealed that this signature had a good prognostic performance with AUCs at 1-, 2-, 3-year of 0.641, 0.618, 0.639 in the ICGC cohort (Fig. 3C) and 0.775, 0.638, 0.705 in the clinical cohort (Fig. 3F), respectively.

Establishment of a nomogram model in TCGA
To investigate the coefficient prediction efficiency of this signature, a nomogram model was established in the TCGA dataset, and the result revealed that the nomogram with a C-index of 0.713 could help us provide a quantitative method for predicting the 1-, 2-, 3-year survival rate accurately (Fig. 4A). The overlap between the forecasted and actual probabilities of 1-, 2-, 3-year survival rate in the calibration curves indicated good agreement ( Fig. 4B-D).

Functional analysis and immune status calculation
To investigate the essential molecular mechanisms within this three-gene signature, Gene Set Enrichment Analysis (GSEA) was performed to identify significantly alerted biological processes and pathways. Annotated gene set h.all.v7.2.symbol.gmt (Hallmarks) was chosen as the reference gene set. Five Hallmarks with FDR < 0.05 were enriched in this signature, including adipogenesis, fatty acid metabolism, oxidative phosphorylation, TNFA signaling via NFKB, and xenobiotic metabolism (Fig. 5A).
According to the results of the ESTIMATE algorithm, risk scores was significantly associated with immune scores, as well as stromal scores (Fig. 5B), and patients in the low-risk subgroup had higher immune scores and stromal scores when compared with patients in the highrisk subgroup (Fig. 5C), indicating that this signature was closely related to tumor immune status. In the following, the expression levels of 15 potentially targetable immune checkpoint genes were compared between the two subgroups in the TCGA database, and results showed that patients in the high-risk subgroup had significantly increased PD-L1, CD276, CTLA4, CXCR4, IL1A, PD-L2, TGFB1, OX40 and CD137 (Fig. 5D), indicating that immune checkpoint inhibitors (ICIs) treatment were more effective for patients in high-risk subgroup.

Immune infiltrates analysis
Based on the CIBERSORT algorithm, the differences and correlations of 22 types of TIICs in two subgroups in TCGA were assessed by Wilcoxon signed-rank test and Pearson correlation analysis, respectively. Difference analysis demonstrated that HCC patients in the low-risk score subgroup had modestly increased ratios of plasma cells, CD4 memory resting T cells, resting NK cells, and monocytes, while patients in high-risk score subgroup had significantly elevated ratios of follicular helper T cells (Fig. 6A). Pearson correlation analysis indicated that six types of TIICs were significantly associated with the risk scores, including plasma cells, CD 8 T cells, resting NK cells, follicular helper T cells, resting dendritic cells, and monocytes (Fig. 6B). Furthermore, plasma cells, resting NK cells, follicular helper T cells, and monocytes were considered as overlapping TIICs (Fig. 6C). Among the four overlapped TIICs, unfortunately, Kaplan-Meier survival analysis revealed that only plasma cells were prominently relevant to poor survival outcomes in HCC patients (Fig. 6D).

Genetic alterations and TMB analysis
The results of genetic alterations analysis indicated that the top 10 most significantly mutated genes were TP53, TTN, CTNNB1, MUC16, ALB, PCLO, APOB, MUC4, RYR2, and ABCA13 in the TCGA cohort (Fig. 7A). In addition, the mutation rates of the above genes were remarkably different in the two subgroups (Fig. 7B). Both HACE1 and CLEC1B had a mutation in 1% of HCC samples, while less than 1% mutation rate was found for COLEC12 in HCC patients (Fig. 7C). Subsequently, the TMB of each patient was assessed. However, no significant difference in TMB was found between the two subgroups (Fig. 7D).

Expression levels of genes in model
Differential mRNA expression of the three genes between tumor and normal tissues was explored in Gene Expression Profiling Interactive Analysis (GEPIA) [20] (http:// gepia. cancer-pku. cn/) and we found that only CLEC1B differed significantly and therefore was focused on further analysis (Fig. 8A). Differences in mRNA expression of the three genes in HCC cell lines were explored in Cancer Cell Line Encyclopedia (CCLE) [21] (https:// porta ls. broad insti tute. org/ ccle) and all of these genes were significantly different across HCC cell lines (Fig. 8B). Finally, the protein expression levels of CLEC1B between tumor and normal tissues were explored in Human Protein Atlas (HPA, www. prote inatl as. org), and a significant difference was detected (Fig. 8C).

Inhibition of overexpressed CLEC1B on cell proliferation and migration ability
After generation of stable overexpression (OE) and negative control (OE-NC) cell lines, CLEC1B expression in HuH7 cells was confirmed by qRT-PCR (Fig. 9A).
To investigate the influence of CLEC1B on the malignant growth and migration potential, cell proliferation assay, wound-healing assay, and transwell assay were performed. For cell proliferation ability, the results of CCK-8 showed that the proliferation rate of up-regulated CLEC1B was significantly lower than controls in HuH7 cells (Fig. 9B), indicating that CLEC1B inhibited the proliferation ability of HuH7 cells. For migration ability, wound-healing assay showed that CLEC1B overexpression remarkably inhibited migration ability in HuH7 cells (Fig. 9C), as did the result of transwell assay (Fig. 9D). All the results above indicated that CLEC1B might inhibit the proliferative and migration potential of HCC cells.

Discussion
Tumor doubling time could effectively reflect the growth pattern of tumors, which was crucial for arranging times for tumor detection and chemotherapy interventions [7] and was valuable for predicting the trend of tumor metastasis [22]. Previous studies showed that the tumor growth patterns differed significantly between indolent and invasive tumors [6,23], indicating that we can initially determine the indolent or aggressive biology based on TDT. Besides, even in the same tumor, such as HCC, an aggressive malignancy with a very short five-year survival time, the tumor growth patterns varied significantly due to the HBV or HCV virus infection, non-viral liver disease, and individual variability. On the other hand, immunotherapy for HCC patients within the tumor immune microenvironment (TIME) has been greatly  developed in recent years. Some advanced HCC patients could benefit from these immune checkpoint inhibitors therapies and exhibit favorable survival outcomes [24][25][26][27]. Therefore, it is necessary to study prediction models based on TDT and TIME for more accurate HCC monitoring and treatment in clinical practice.
In the present study, tumor doubling time-related genes (TDTRGs) were acquired from GSE54236 by using the Pearson correlation test and immune-related genes (IRGs) were available from ImmPort. Prognostic TDTRGs and IRGs obtained from univariable Cox regression analysis in TCGA-LIHC dataset were determined to create a prognostic model by the LASSO-Cox regression and stepwise Cox regression analysis. Three genes (HACE1, CLEC1B, and COLEC12) were involved in the model. International Cancer Genome Consortium (ICGC) and another cohort of individual clinical samples acted as external validations demonstrated that the signature exhibited superior accuracy in forecasting the survival outcomes and TDT in HCC patients. More importantly, this three-gene signature was an independent risk factor for HCC patients when other clinical factors in the three cohorts were considered. Additionally, significant impacts of the signature on the HCC immune microenvironment and reaction to immune checkpoint inhibitors were observed. Finally, we found overexpressed CLEC1B could inhibit the proliferation and migration of Huh7 cells, which is consistent with the prevision research [28]. In conclusion, the prognostic signature based on TDTRGs and IRGs could efficiently classify HCC patients for prognosis prediction and individualized immunotherapies, and CLEC1B might be an immunotherapy target in the future.
HECT domain and ankyrin repeat containing E3 ubiquitin protein ligase 1 (HACE1) is an E3 ubiquitin ligase located on chromosome 6q21 that plays a crucial role in tumor biology and is closely associated with survival outcomes. HACE1 could act as a tumor suppressor in various human malignancies, including lung cancer [29], HCC [30], breast cancer [31], osteosarcoma [32], colorectal cancer [33], gastric cancer [34], Leukemia [35] and Wilms tumors [36]. For example, in lung cancer, HACE1 deletion could promote KRas G12D -driven lung cancer progression by modulating the tumorigenic activation of RAC-family GTPases [29]; HACE1 could accelerate autophagic flux to inhibit tumor growth by ubiquitinating the autophagy receptor and could serve as an autophagy-related target for immunotherapeutic  [37]; HACE1 reduced the accumulation of HIF1α during cellular hypoxia by decreasing the stability of the protein, thus achieving the purpose of inhibiting tumor growth. In HCC, the down-regulated HACE1 was closely related to poor survival outcomes, while overexpressed or demethylated HACE1 inhibited proliferation and migration ability of HCC cells [30,38]. All above indicated that HACE1 might be a potential therapeutic target in HCC patients. C-type lectin domain family 1 member B (CLEC1B), a tumor platelet-related molecule secreted by activated platelets in the peri-tumor area, could affect thrombus formation and hematogenous metastasis of tumors through interactions with podoplanin [39]. Blocking the interaction might be a promising therapeutic strategy for preventing pulmonary metastases from osteosarcoma [40]. Cordycepin treatment effectively reduced the proliferation and migration ability of gastric cancer cells by upregulating CLEC1B [41]. Recently, CLEC1B was confirmed to be remarkably downregulated and related to tumor hemorrhage in HCC, indicating that CLEC1B could be served as a potential target for PD-L1/PD1 immunotherapy [28,42]. Therefore, to further determine the antitumor effect of CLEC1B, we explored the expression level of CLEC1B in HCC cell lines and a generalized low expression was found. Subsequently, we overexpressed CLEC1B in HuH7 cells and observed a significant inhibition of cell proliferation and migration ability. All these suggested that CLEC1B might be a promising molecule for antitumor immunotherapy Fig. 9 Overexpressed CLEC1B inhibited cell proliferation and migration in Huh7 cells. A Lentivirus vector encoding the full-length human CLEC1B DNA sequence was used to manipulate CLEC1B expression. B CCK-8 assay, C Wound-healing assay, and D transwell assay were used to detect the effect of overexpressed CLEC1B on cell proliferation and migration ability. *p < 0.05, **p < 0.01, ***p < 0.001 in HCC patients. As for COLEC12, it was increased in osteosarcoma and remarkably associated with poor survival outcomes [43], while there are few relevant studies in HCC and further research is needed.
In previous studies, a large number of signatures have been identified for predicting survival outcomes in HCC patients [44][45][46][47][48][49]. Compared with these signatures, the novel three-gene signature in our current study has some new features: first, the signature constructed based on TDT and TIME is more accurate for HCC monitoring and treatment in clinical practice; second, the signature is validated by qRT-PCR analysis in a small size clinical cohort, ensuring its clinical relevance. Finally, the threegene signature contains fewer genes and is easier to implement clinically. Undeniably, our current study has some restrictions. The great diversity of HCC and the mechanism of recurrence after treatment may reduce the performance of the signature. Moreover, the small sample size limits the validation of the model, and future multicenter randomized controlled studies are required to evaluate this signature. In addition, the specific mechanisms of the three genes in HCC, especially for CLEC1B, are still not well understood and more in vivo and in vitro experiments are needed in the future.
In summary, a prognostic three-gene signature based on TDTRGs and IRGs was constructed in our current study, which could effectively help clinicians classify HCC patients for prognosis prediction and individualized immunotherapies.