Identification of RHOBTB2 aberration as an independent prognostic indicator in acute myeloid leukemia

Rho-related BTB domain (RhoBTB) proteins belong to the Rho guanosine triphosphatase (GTPase) family. Accumulating evidence supports their putative role in carcinogenesis. However, their expression pattern and potential role in acute myeloid leukemia (AML) remain unclear. We profiled RHOBTB mRNA expression via the Gene Expression Profiling Interactive Analysis 2 (GEPIA2) database. Survival analysis was conducted with GEPIA2 and UALCAN. Univariate and multivariate Cox regression analyses were performed to validate RHOBTB genes as independent prognostic indicators in the LAML cohort from The Cancer Genome Atlas (TCGA). Data regarding expression in different subtypes and relationships with common disease-related genes were retrieved from UALCAN. Co-expressed genes were screened out and subsequently subjected to functional enrichment analysis. We observed aberrant transcription levels of RHOBTB genes in AML patients. RHOBTB2 was identified as a prognostic candidate for overall survival (OS), independent of prognosis-related clinical factors and genetic abnormalities. Moreover, RHOBTB2 expression was increased in non-acute promyelocytic leukemia (APL) subtypes and in patients without FLT3 mutation or PML/RAR fusion, and it correlated positively with the expression of FLT3, FHL1, and RUNXs. Co-expressed genes of RHOBTB2 were enriched in functional pathways in AML. Our findings suggest that RHOBTB2 might be a novel biomarker and independent prognostic indicator in AML and provide insights into the leukemogenesis and molecular network of AML.


INTRODUCTION
Acute myeloid leukemia (AML) is the most common form of acute leukemia in adults, and the incidence and mortality risks increase with age [1,2]. As a heterogeneous disease of the blood system, AML is characterized by differentiation arrest and malignant clonal expansion of myeloid lineage blasts. Many oncogene-activating mutations and cytogenetic abnormalities in AML, such as those involving core-binding factor (CBF), retinoic acid receptor-α (RAR-α), FLT3, RAS, p53, WNT, nucleophosmin (NPM1), and CEBPA double mutations, are associated with high-risk clinical characteristics and adverse prognosis [3][4][5]. The complex genetic background substantially impacts risk stratification, treatment responses, and prognosis prediction. Hence, there is an urgent need to identify independent biomarkers for the diagnosis, treatment, and prognosis of patients with AML.
The RhoBTB subfamily, represented in mammals by three isoforms, RhoBTB1, RhoBTB2, and RhoBTB3, was recognized in the lower eukaryote Dictyostelium discoideum, and thus became the de novo addition to

Transcriptional levels of RHOBTB genes in patients with AML
The RHOBTB genes comprise three members, RHOBTB1, RHOBTB2, and RHOBTB3, in mammalian cells. We compared the transcriptional levels of RHOBTB genes in the bone marrow of AML patients (TCGA-LAML, n = 173) with those in normal samples (GTEx, n = 70) through Gene Expression Profiling Interactive Analysis 2 (GEPIA2) (Figure 1). Gene expression analysis using box plots indicated that the transcriptional levels of RHOBTB1 and RHOBTB3 were decreased (P < 0.05, Figure 1A and 1C), while that of RHOBTB2 was significantly increased (P < 0.05, Figure 1B). Consistently, three datasets indicated increased RHOBTB2 expression; six datasets and five datasets showed reduced RHOBTB1 and RHOBTB3 expression, respectively, in leukemia compared to normal samples in the ONCOMINE database (Supplementary Figure 1). Downregulation of RHOBTB1 and RHOBTB3 has been reported in various types of tumors, and this pattern was confirmed in AML. RHOBTB2 showed notably divergent expression patterns between AML and other tumor types, which led us to further explore the underlying clinical significance.

Prognostic values of RHOBTB genes in patients with AML
Intriguingly, a high expression level of RHOBTB2 was associated with poor overall survival (OS) (hazard ratio (HR) (high) = 2.9; P = 0.00041, Figure 2B) while a low expression level of RHOBTB3 was associated with poor OS for patients with AML (HR (high) = 0.44; P = 0.0045, Figure 2C) in GEPIA2. However, there was no difference for RHOBTB1 (HR (high) = 1.1; P = 0.85, Figure 2A). The prognostic evaluation capacity of RHOBTB2 and RHOBTB3 was validated with the UALCAN database (Supplementary Figure 2) and Kaplan-Meier (KM) survival analysis (log-rank P-value = 0.000239 for RHOBTB2, log-rank P-value = 0.00024 for RHOBTB3) in the TCGA-LAML cohort (n = 151) obtained from https://portal.gdc.cancer.gov/ in January 2020 (Figure 3A-3B). RHOBTB2 overexpression was recognized as a risk factor for OS with HR = 1.672 (95% confidence interval (CI), 1.285-2.176) (Figure 3A) via Cox proportional hazards analysis, while high expression of RHOBTB3 as a protective factor with HR = 0.444 (95% CI, 0.288-0.685) (Figure 3B). The median survival time of the RHOBTB2 high-expression group was 0.8 years and that of the low-expression group was 2.3 years. In comparison, the median survival times of the RHOBTB3 high- and low-expression groups were 2.3 years and 0.7 years, respectively (Figure 3A-3B). Time-dependent receiver operating characteristic (ROC) analysis of RHOBTB2 and RHOBTB3 was performed to compare each gene's predictive accuracy. RHOBTB2 had a larger area under the curve (AUC) than RHOBTB3, especially for 3- and 5-year survival (3-year AUC = 0.732 vs. 0.673, 5-year AUC = 0.802 vs. 0.720) (Figure 3C-3D). Therefore, the results above suggest that both RHOBTB2 and RHOBTB3 may be potential prognostic factors for patients with leukemia, and RHOBTB2 showed better prognostic performance.

Figure 2. [...] Dotted lines indicate the 95% CI. Gene expression levels were dichotomized, generating a high expression group (solid red line) and a low expression group (solid blue line), based on the median expression level of each gene as the cut-off value. OS, overall survival. HR, hazard ratio. CI, confidence interval. ** P < 0.01. *** P < 0.001. ns, not significant.

Figure 3. [...] and RHOBTB3 (B) in AML patients. P-values and hazard ratios (HRs) with 95% confidence intervals (95% CIs) were generated along with log-rank tests and univariate Cox proportional hazards regression. Dotted lines indicate the 95% CI. The survival probability of a total of 140 patients from the LAML cohort was computed after case-wise deletion. Patients were grouped by a dichotomization method based on the median expression level of each gene. Solid red lines represent the high expression groups while solid blue lines represent the low expression groups. Red and blue arrows indicate the median survival time of the two groups, respectively. (C-D) Time-dependent ROC analysis was performed for the 1-, 3-, 5-year time points to determine the predictive accuracy of RHOBTB2 and RHOBTB3. AUC values represent the prediction ability in 1-, 3-, 5-year OS. ROC, receiver operating characteristic. AUC, area under the curve. (E) Multivariate Cox regression analyses of RHOBTB2, RHOBTB3, and clinical features in the TCGA-LAML cohort. The forest plots were generated with the P-values, HR, and 95% CI of each variable through 'forestplot' R package. (F) Univariate and multivariate Cox regression analyses of RHOBTB2, RHOBTB3, and three other potential prognosis-related genes (FHL1, HOPX, and FAM124B). A P-value < 0.05 was considered statistically significant. Asterisks represent levels of significance (* P < 0.05, ** P < 0.01, and *** P < 0.001).

Identification of RHOBTB2 as an independent prognostic indicator in AML
We performed univariate and multivariate Cox regression analyses to determine whether RHOBTB2 and RHOBTB3 are robust AML OS-related genes that can be used for prognosis prediction.
Multiple clinical factors, such as age, WBC count, blast cell percentage, and cytogenetic abnormalities, impact the prognosis of AML. Some individual genes, including four-and-a-half LIM domain 1 (FHL1), HOPX, and FAM124B, have recently been identified as candidate prognostic factors through a genome-wide Cox regression screening project [17]. Thus, RHOBTB2 and RHOBTB3, combined with risk factors including age (≥60 years old), WBC count (≥30 × 10⁹/L), blast cell percentage, cytogenetic abnormalities, therapeutic agent targets (FLT3, DNMT3A, and TP53 mutations, etc.), and de novo prognostic indicators (FHL1, HOPX, and FAM124B) (Supplementary Table 1), were used for multivariate Cox regression analysis of the TCGA-LAML cohort (n = 151). As shown in Figure 3E, the forest plots indicated that high RHOBTB2 expression, but not high RHOBTB3 expression, was strongly predictive of poor outcome in AML patients (HR = 1.581; 95% CI, 1.102-2.270; P = 0.012), independent of clinical features including age, WBC count, blast cell percentage, and gene mutation status. Compared to RHOBTB3 and the three potential prognostic indicators (FHL1, HOPX, and FAM124B), only RHOBTB2 displayed prognostic value (P-value = 0.003, vs. P-value = 0.057 for RHOBTB3, P-value = 0.896 for FHL1, P-value = 0.087 for FAM124B, and P-value = 0.184 for HOPX) and a higher HR (HR = 1.685; 95% CI, 1.188-2.388) in the Cox model (Figure 3F, right panel), even though all five genes were statistically significant in univariate Cox regression analysis (Figure 3F, left panel). The results above indicate that RHOBTB2 can be used as an independent and effective predictor of OS for AML patients.
We further clarified the expression profile of RHOBTB2 in different subtypes of AML using UALCAN. The box plots showed that RHOBTB2 was remarkably increased in FAB subtypes M0, M1, M2, M4, and M5 compared to that in M3 (also known as acute promyelocytic leukemia (APL)), which has a higher degree of differentiation (P < 0.001 for M0 vs. M3, M1 vs. M3, M2 vs. M3, M4 vs. M3, and M5 vs. M3, Figure 4A). FAB subtypes M6 and M7 were not considered because of the small sample size for each. This pattern suggests that less differentiated progenitor cells have higher RHOBTB2 expression levels than their more differentiated progeny.
Next, the correlation between RHOBTB2 expression and clinical features, such as age, gender, FLT3 mutation, PML-RAR fusion, and RAS activation status in AML patients, was analyzed through UALCAN (Figure 4B-4F). Box plots showed that the expression level of RHOBTB2 was higher in the 61-80-year-old group than in the 21-40-year-old group (P = 0.015, Figure 4B) and higher in the male group than in the female group (P < 0.001, Figure 4C). We did not analyze the 80-100-year-old group, as it had only five samples. In addition, RHOBTB2 expression was increased in AML patients without FLT3 mutation (P = 0.024, Figure 4D) and patients without PML-RAR fusion (P = 0.0038, Figure 4E). The presence of FLT3 mutations in AML enabled the recent approval of targeted drugs that can help patients achieve prolonged remission. PML-RAR is a fusion gene that is associated with the specific subtype of leukemia APL.
share the same expression pattern with RHOBTB2 in bone marrow samples. High expression of FHL1 and RUNX1-3 is related to poor prognosis. NPM1, the high expression of which predicts a superior prognosis, was negatively correlated with RHOBTB2, but the correlation was not significant.

Functional enrichment analyses of RHOBTB2 and co-expressed genes in AML
To further explore the potential function and molecular pathways of the RHOBTB2 gene in AML, we used the LinkedOmics database [23] to identify genes co-expressed with RHOBTB2 in data from 173 TCGA patients. A total of 8,198 genes were significantly associated with RHOBTB2, which may reflect a considerable impact of RHOBTB2 on AML pathogenesis. Of these, the 2,309 genes that were positively related to RHOBTB2 are displayed as red dots, whereas the 2,028 genes that were negatively associated with RHOBTB2 are represented by green dots in the volcano plot (P < 0.01 and FDR < 0.01, Figure 5A). The top 20 significant genes positively and negatively associated with RHOBTB2 are presented in Table 1. RHOBTB2 and its top 200 co-expressed genes were subjected to gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses to identify enriched categories and signaling pathways in the TCGA-LAML cohort. In terms of biological processes, the bubble diagrams showed that these genes were enriched in cell morphogenesis involved in differentiation, small GTPase mediated signal transduction, and positive regulation of catalytic activity (http://amigo.geneontology.org/amigo; GO:0000902, GO:0000904, GO:0007264, GO:0043085, etc.) (Figure 5B). These categories may be related to cell division, cell cycle, proliferation, and differentiation of blast cells in AML. In terms of cellular components, Chang et al. found that RhoBTB2 is distributed in a vesicular pattern [24], and, consistent with this, the co-expressed genes were enriched in phagocytic vesicle (GO:0045335), endocytic vesicle (GO:0030139), and cytoplasmic vesicle membrane (GO:0030659) (Figure 5C).
In terms of molecular functions, the genes were enriched in RHOBTB2-related functions such as Rho GTPase binding (GO:0017048), Rac GTPase binding (GO:0048365), and GTPase activator activity (GO:0005096) (Figure 5D). The top 10 pathways related to RHOBTB2 and its co-expressed genes were defined by KEGG analysis. These pathways, such as natural killer cell-mediated cytotoxicity (https://www.kegg.jp/kegg/, hsa04650), phagosome (hsa04145), leukocyte transendothelial migration (hsa04670), and apoptosis (hsa04210), have been implicated in vital pathological processes including cell fate determination and migration of leukocytes (Figure 5E). Given our results and the architecture, localization, and biological functions of RHOBTB2 postulated in previous research, we hypothesized that overexpression of RHOBTB2 in leukemic blast cells could regulate cell differentiation, cell cycle, proliferation, apoptosis, migration, and vesicle transport.

GEPIA2
GEPIA2 (http://gepia2.cancer-pku.cn/, Beijing, China) is an interactive web-based tool for analyzing cancer-related RNA sequencing data provided by the TCGA and GTEx projects [25]. General gene expression profiles, survival analysis, and correlation analysis were conducted through the "Expression Analysis" module with the TCGA-LAML cohort (n = 173) and normal tissues (n = 70), the data of which are available in the "dataset sources" panel. Student's t-test was used for expression analysis. The survival results were displayed as Kaplan-Meier curves with HRs and P-values from a log-rank test. A P-value < 0.05 was considered statistically significant.
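To make the survival statistics concrete, the following is a minimal pure-Python sketch of a Kaplan-Meier estimate, the median survival time read from it, and a two-group log-rank test. It is an illustration only: the function names (`kaplan_meier`, `median_survival`, `logrank_test`) and the follow-up values are hypothetical, and nothing here reproduces GEPIA2's internal implementation or the TCGA-LAML data.

```python
import math
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for death and 0 for censoring. Exact fractions are used
    so that comparisons against 0.5 are not affected by rounding.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= Fraction(n_at_risk - deaths, n_at_risk)
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the estimate falls to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 d.f., P-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 d.f.
    return chi2, p_value


# Hypothetical follow-up times in years (1 = death, 0 = censored).
high_t, high_e = [0.3, 0.5, 0.8, 1.0, 1.5, 2.0], [1, 1, 1, 1, 0, 1]
low_t, low_e = [1.2, 2.3, 2.5, 3.0, 4.0, 5.0], [1, 1, 0, 1, 0, 0]

print("median (high):", median_survival(kaplan_meier(high_t, high_e)))
print("median (low):", median_survival(kaplan_meier(low_t, low_e)))
print("log-rank (chi2, P):", logrank_test(high_t, high_e, low_t, low_e))
```

The P-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)), which avoids any dependency outside the standard library.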

Univariate and multivariate Cox regression analyses
Univariate and multivariate Cox regression analyses were performed to identify candidate prognostic genes in the LAML cohort from TCGA. The data (available through https://portal.gdc.cancer.gov/) were updated in Jan 2020, and the cohort contains 151 AML patients with high-throughput sequencing (RNA-Seq) data and detailed clinical information [26].
A forest plot with the P-value, HR and 95% CI of each variable was built through "survival", "survminer" and "forestplot" R packages in RStudio 4.0.3. Gene expression levels were dichotomized based on the median expression level in the cohort as the cutoff value.
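The median dichotomization and the hazard ratio with its 95% CI can be sketched as follows. This is a simplified, assumption-laden illustration rather than the analysis pipeline used here: it fits a univariate Cox model with a single binary covariate by Newton-Raphson on the partial likelihood (Breslow handling of ties), whereas the study used the "survival" R package and multivariate models. The helper names and all input values are hypothetical.

```python
import math
from statistics import median


def dichotomize_by_median(expression):
    """Label samples 1 (high) if above the cohort median, else 0 (low)."""
    cutoff = median(expression)
    return [1 if value > cutoff else 0 for value in expression]


def cox_univariate(times, events, x, iterations=25):
    """Fit a one-covariate Cox model; return (HR, CI lower, CI upper).

    Newton-Raphson on the partial log-likelihood. The risk set for each
    event contains every subject whose follow-up time is >= the event time.
    """
    beta = 0.0
    information = 0.0
    for _ in range(iterations):
        score = 0.0
        information = 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored subjects contribute only to risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            score += x_i - mean
            information += s2 / s0 - mean * mean
        step = score / information
        beta += step
        if abs(step) < 1e-10:
            break
    se = 1.0 / math.sqrt(information)  # standard error of the log hazard ratio
    return (math.exp(beta),
            math.exp(beta - 1.96 * se),
            math.exp(beta + 1.96 * se))


# Hypothetical expression values and follow-up (years; 1 = death, 0 = censored).
expression = [9.1, 8.7, 8.9, 9.4, 8.2, 9.0, 6.1, 5.8, 7.0, 6.5, 5.2, 6.9]
times = [0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 1.2, 2.3, 2.5, 3.0, 4.0, 5.0]
events = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0]

group = dichotomize_by_median(expression)
hr, ci_low, ci_high = cox_univariate(times, events, group)
print(f"HR (high vs. low) = {hr:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With so few toy samples the interval is very wide; the point of the sketch is only to show how a dichotomized covariate yields an HR for the high-expression group relative to the low-expression group.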

LinkedOmics
LinkedOmics (http://www.linkedomics.org) provides a unique portal to analyze cancer multi-omics data and clinical data for 32 cancer types and 11,158 patients from the TCGA project [23]. Genes associated with RHOBTB2 were identified in the TCGA-LAML cohort (n = 173) and are presented in volcano plots. Pearson's correlation test was used to evaluate the statistical relationship.
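For reference, Pearson's correlation coefficient and the t statistic on which its significance test is based can be computed as in the sketch below. The function names and the expression vectors are hypothetical and serve only to illustrate the formula; LinkedOmics performs this computation internally.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n samples (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical expression values of two genes across six samples.
gene_a = [5.1, 6.3, 7.8, 6.9, 8.4, 9.0]
gene_b = [2.0, 2.9, 3.5, 3.1, 4.2, 4.4]

r = pearson_r(gene_a, gene_b)
print("r =", round(r, 3), "t =", round(t_statistic(r, len(gene_a)), 3))
```

A positive r places a gene on the right side of the volcano plot and a negative r on the left; the P-value follows from comparing the t statistic against a t distribution with n - 2 degrees of freedom.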

Figure 5. Co-expressed genes of RHOBTB2 (LinkedOmics) and functional enrichment analyses in AML (WebGestalt). (A) Genes positively and negatively correlated with RHOBTB2 in AML were indicated by the volcano plot. Red dots in the right sector represent positively correlated genes, while green dots in the left sector are negatively correlated genes. A total of 8,198 genes with significant associations were defined (P < 0.05), among which there were 2,309 positively associated genes and 2,028 negatively associated genes when FDR < 0.01 and P-value < 0.01 were used as thresholds. Pearson's test was used to identify the correlations in the TCGA-LAML cohort (n = 173). Bubble plots display the functional enrichment results of GO analysis in terms of biological processes (B), cellular components (C), molecular functions (D) and KEGG signaling pathways (E). The top 10 functional categories and pathways were annotated with color gradient bubbles of different sizes. A (-log10) P-value > 1.3 (P-value < 0.05) was considered statistically significant. FDR, false discovery rate.

UALCAN
UALCAN (Birmingham, AL, USA, http://ualcan.path.uab.edu) serves as a platform for validating specific genes and screening tumor candidate biomarkers [27]. RHOBTB gene expression in AML subgroups based on various clinicopathologic features and survival outcomes was investigated via "Expression Analysis" and "Survival Analysis" modules, respectively. The processed RNA-sequencing data and survival profiles of the AML cohort (n = 163) were obtained using TCGA assembler (http://www.compgenome.org/TCGA-Assembler/). A P-value of 0.05 was used as the threshold for significance.

GO and KEGG pathway enrichment analyses
The WebGestalt database (http://www.webgestalt.org/option.php), which derives biological insights from gene lists, was used to perform GO and KEGG pathway enrichment analyses for RHOBTB2 and the top 200 co-expressed genes. The built-in reference human protein-coding genome was selected as the background parameter. Bubble plots with (-log10) P-values, FDRs, and enrichment ratios were generated through the "ggplot2" and "dplyr" R packages. A (-log10) P-value > 1.3 was considered to indicate enrichment of a meaningful pathway.
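The relationship between the (-log10) P-value threshold and the conventional P < 0.05 cut-off, together with a false discovery rate adjustment, can be illustrated with a short sketch. It assumes that the reported FDR corresponds to a Benjamini-Hochberg adjustment, which the text does not state explicitly; the function names and P-values are hypothetical.

```python
import math


def neg_log10(p):
    """Transform a P-value to the (-log10) scale used on the bubble plots."""
    return -math.log10(p)


def is_enriched(p, threshold=1.3):
    """Apply the (-log10) P-value > 1.3 rule (approximately P < 0.05)."""
    return neg_log10(p) > threshold


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment P-values for four pathways.
p_values = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(p_values, benjamini_hochberg(p_values)):
    print(f"P = {p}, -log10(P) = {neg_log10(p):.2f}, "
          f"FDR = {q:.3f}, enriched = {is_enriched(p)}")
```

Because -log10(0.05) is approximately 1.301, a threshold of exactly 1.3 is marginally more permissive than P < 0.05; the difference matters only for P-values within a fraction of a percent of the cut-off.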

DISCUSSION
As an atypical subfamily of Rho GTPases, RhoBTB proteins possess a distinctive domain architecture. Studies have implicated them in the regulation of cell growth through cell cycle control and apoptosis, vesicle trafficking, and organization of the actin filament system. Increasing evidence has also implicated the RHOBTB genes in tumorigenesis. However, the expression profile of RHOBTB genes in AML and whether they can affect myeloid leukemogenesis, pathogenesis, and prognosis remain obscure. In this study, we performed bioinformatics analyses to explore the expression profile and prognostic value of RHOBTB genes in AML and enhance the accuracy of prognosis prediction.
We found aberrant RHOBTB gene expression in human AML samples through the ONCOMINE and GEPIA2 databases. RHOBTB1 and RHOBTB3 were decreased significantly in AML samples. In contrast, the transcriptional level of RHOBTB2 was dramatically increased in AML compared to normal samples, unlike the pattern found in other tumors. All three RHOBTB genes have notable differences in tissue expression levels in humans [6]. The status of RHOBTB genes in various tumors remains to be further uncovered. Although RHOBTB2 is frequently deleted in various carcinomas, including breast, lung, and stomach carcinomas, many tumor cells still retain RHOBTB2 expression [28]. Blast cells make up a high proportion (20%-100%) of the cells in bone marrow samples from AML patients. These myeloid progenitor cells are derived from hematopoietic stem cells (alias leukemia stem cells (LSCs)), which are different from the stem cells of solid tumors. Based on the findings above, we hypothesized that there is no overt relationship between the mRNA expression patterns of the three RHOBTB genes and protein architecture. It is therefore plausible that only RHOBTB2 showed expression patterns in AML patients that differ from those in solid tumor types.
The prognostic value of RHOBTB genes in patients with AML was assessed in several databases and by Cox regression. Survival analysis suggested that high RHOBTB2 expression and low RHOBTB3 expression are associated with adverse OS in AML. The ROC analysis indicated that RHOBTB2 had a larger AUC than RHOBTB3 and had a better prognostic value. Subsequently, to determine whether the prognostic efficacy of RHOBTB2 and RHOBTB3 is independent of other clinical factors, we performed multivariate Cox regression analyses in the TCGA-LAML cohort. Several disease-related factors and gene mutations, such as age, WBC count, blast cell percentage, TP53 mutations, FHL1, HOPX, and FAM124B, were confirmed to have significant and general prognostic value in previous studies [17]. We entered RHOBTB2 and RHOBTB3 with all of these prognostic variables into the multivariate analyses. High RHOBTB2 expression was identified as an independent indicator for unfavorable OS.
We examined the relationship between RHOBTB2 expression and clinical features and genetic alterations of AML patients to validate whether it could be used as a tool for risk stratification. RHOBTB2 expression was higher in the 61-80-year-old group, which is consistent with the worse 5-year OS of elderly AML patients. The RHOBTB2 expression level was upregulated in the non-APL FAB subtypes, AML patients without FLT3 mutation, and patients without PML-RAR fusion, although it showed no difference between patients with and without RAS activation. Correlation analysis with the GEPIA2 database indicated that RHOBTB2 expression was positively associated with the expression of FLT3, FHL1, and RUNX1-3. Patients with FLT3 mutation have a lower complete remission rate and poorer prognosis [29]. FHL1 is a powerful prognostic factor for determining OS, event-free survival, and relapse-free survival [17]. The three RUNX family members are lineage-specific master regulators and play an essential role in hematopoiesis [30]. These data reinforce the role of RHOBTB2 as a prognostic indicator for specific AML subtypes.
We also explored the potential of RHOBTB2 co-expressed genes as biomarkers for AML through the LinkedOmics database. GO and KEGG analyses indicated that these co-expressed genes were enriched in multiple functional categories and pathways that may contribute to regulating the cell cycle, apoptosis, differentiation, migration, and vesicle transport in AML. Although there is little research concerning the role of RHOBTB2 in AML, in future studies we will aim to determine the possible mechanism based on the comprehensive analysis of the expression patterns and functional enrichment in AML. RhoBTB2 can function in cell cycle and apoptosis through ubiquitination and degradation of cancer-related proteins, so we hypothesize that RhoBTB2 plays an intricate and differential role in tumorigenesis depending on target genes with multiple pathways. RHOBTB2 (DBC2) downregulates cyclin D1 (CCND1); however, other leukemogenesis pathways, such as c-myc and Wnt-1, might be induced [28]. Freeman et al. found that RHOBTB2 overexpression, as a target of E2F1 during mitosis, facilitates cell cycle progression and propagation for a short time [31]. E2F1 may promote the transcription of RHOBTB2 during mitosis, which affects the cell cycle and boosts the proliferation of AML cells. RHOBTB2 is expressed in fetal tissues and may control developmental processes [32], thus possibly exerting an influence on morphogenesis, localization, and differentiation of leukemic blast cells. The potential role of RhoBTB2 in vesicle transport has been addressed by Chang et al. [24]. We hypothesized that RhoBTB2 might partly mediate the membrane trafficking and distribution of chemotherapeutic drugs and thus contribute to the poor outcomes of AML treatment, but the in-depth mechanism requires more laboratory work. The published reports above have introduced some intriguing hypotheses that provide the basis for further explorations.
In conclusion, the current study was the first to thoroughly identify the aberrant expression and prognostic value of RHOBTB family members in AML. RHOBTB2 was increased in high-risk subgroups of patients with leukemia and thus could serve as a potential biomarker. Our results illustrate that the overexpression of RHOBTB2 is an independent indicator for predicting the adverse outcome of AML and might play an essential role in leukemogenesis. Further research based on this discovery will aid the understanding of the comprehensive gene network of leukemogenesis and improve the accuracy of leukemia survival and prognostic prediction.

AUTHOR CONTRIBUTIONS
Peng Liu and Qinghai Ma contributed to the data acquisition and analysis; Xiaoning Zhang, Hanxiang Chen, and Li Zhang prepared the figures and wrote the manuscript draft; Peng Liu and Xiaoning Zhang contributed to the study design. The final manuscript was reviewed by Xiaoning Zhang and approved by all the listed authors.