Promoting Prognostic Model Application: A Review Based on Gliomas

Malignant neoplasms are characterized by poor therapeutic efficacy, high recurrence rates, and extensive metastasis, leading to short survival. Previous methods for grouping prognostic risks are based on anatomic, clinical, and pathological features, which have lower discriminative capability than genetic signatures. Advances in sequencing techniques and machine learning have promoted the development of prognostic models based on genetic panels, especially RNA-panel models. Gliomas harbor the most malignant features and the poorest survival among all tumors. Currently, numerous glioma prognostic models have been reported. We systematically reviewed all 138 machine-learning-based genetic models and proposed novel criteria for assessing their quality. In addition, the biological and clinical significance of some highly overlapping glioma markers in these models was discussed. This study identified markers with strong prognostic potential and 27 high-quality models. In conclusion, we comprehensively reviewed 138 prognostic models based on glioma genetic panels and presented novel criteria for the development and assessment of clinically important prognostic models. This will help guide genetic models in cancers from laboratory-based research to clinical application and improve the prognostic management of glioma patients.


Introduction
Malignant tumors are characterized by therapeutic resistance, frequent recurrence, and distant metastasis, which make treatment by either surgical resection or adjunctive therapies difficult and lead to poor prognosis. For better clinical management, many prognostic models have been proposed to analyze survival [1]. Previous models with unformulated predictors stratified patients into relative risk groups. However, they did not provide quantitative results or absolute risk stratification. Machine learning algorithms can identify critical patterns in large and complex data with high accuracy [2]. Common algorithms applied in cancer prediction include weighted gene coexpression network analysis (WGCNA), the L1-penalized least absolute shrinkage and selection operator (LASSO), the Cox proportional hazards (PH) model [3], neural networks [2], and elastic net regression [4]. Based on these machine learning algorithms, risk scores and, further, pictorial nomograms are constructed to address this issue [5,6]. Risk scores are calculated from a spectrum of parameters to predict the risk of clinical outcomes. Samples come from local patient cases or online data. Establishing a model requires two main steps: development and validation [5]. First, predictors are selected and risks are calculated; thereafter, performance is estimated to assess the predictive quality. Second, the model is validated internally and externally (on an independent data set), and performance is estimated again [7].
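As an illustration of how a formulated risk score of this kind is typically computed, the following minimal Python sketch sums coefficient-weighted expression values and splits patients at the median score. The gene names, coefficients, expression values, and the median cutoff are hypothetical assumptions chosen for illustration; they are not taken from any reviewed model.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the panel genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def stratify_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption; studies may use other cutoffs.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical 3-gene panel with made-up regression coefficients.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}

# Hypothetical normalized expression values for three patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 3.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 8.0, "GENE_C": 0.5},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify_by_median(scores)
print(scores)
print(groups)
```

Because the score is an explicit formula, the same coefficients can be applied unchanged to an independent validation cohort, which is what external validation requires.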
Gliomas are common intracranial tumors causing the highest mortality rates among all cancer types [8]. Routine treatments for gliomas include surgical excision, radiotherapy, and chemotherapy. The World Health Organization (WHO) grouped gliomas into four grades based on their histological features: WHO grades I, II, III, and IV [9]. Low-grade gliomas (LGGs) comprise WHO grades I and II and show a relatively good prognosis. High-grade gliomas (HGGs) consist of WHO grades III and IV and have worse survival outcomes [9]. Histological classification provides an understanding of glioma behavior. However, molecular groups differentiate prognostic groups more accurately [10] (Figure 1). The 2016 Central Nervous System WHO classification, therefore, provided molecular features including IDH-mutant/wildtype astrocytoma and glioblastoma, IDH-mutant/wildtype and 1p/19q-codeleted/noncodeleted oligodendroglioma, H3K27M-mutant/wildtype diffuse midline glioma, and RELA fusion-positive/negative ependymoma to precisely classify gliomas [11]. These and other markers such as CDKN2A and EGFR were combined to develop prognostic models for glioma patients [12]. The poor prognosis of gliomas has motivated the development of clinically useful and effective models to assess survival risks for subsequent therapeutic strategies. This review systematically summarized and compared all 138 risk score models for gliomas with multivariable markers (Figure 2). It also presented the clinical significance of some frequently reported predictors, thus guiding advanced prediction models for glioma.
This will be conducive to translating laboratory-based models into clinically available tools and to improving clinical prognosis management.

Rules for Evaluating Model Quality and Exclusion Criteria
The TRIPOD statement standardized the reporting of prediction models. It proposes a checklist of 22 items required in the development and validation process [13]. However, it does not judge model quality. Thus, we screened the 138 models and proposed novel criteria that classify them into different quality groups (Figure 3); the criteria are listed in Table 1. After quality division, the high- and medium-quality models were discussed, and the low-quality models are not discussed here (Table S1).
Performance estimation, validation, and events per variable (EPV) are vital factors in assessing model quality. Performance includes discrimination and calibration. Discrimination refers to the ability to differentiate patients who experience the event from those who do not, and it is the most essential quality result [14]. Calibration compares estimated event rates with observed rates [15]. A model is less useful when discrimination alone is taken as its performance, and most studies lack calibration results [16]. For discrimination, we defined an area under the curve (AUC) or c-index ≥ 0.80 as high accuracy, ≥ 0.70 as acceptable accuracy, and < 0.70 as low accuracy. The repeatability and transportability of a model should be verified through internal and external validation before clinical application [15]. By rule of thumb, the EPV should be at least 10, and when many low-prevalence predictors are included, it should be up to 20 [17]; otherwise, the model is considered overfitted.
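To make the discrimination measure concrete, the sketch below computes a simplified Harrell-style concordance index on survival data and maps the result to the accuracy thresholds defined above. It is an illustrative pure-Python version under stated assumptions: tied survival times are simply skipped, and the follow-up times, event indicators, and risk scores are invented toy values.

```python
def concordance_index(times, events, scores):
    """Simplified Harrell-style c-index.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before time j. The pair is concordant when the
    subject who failed earlier has the higher risk score; score ties count 0.5.
    Pairs with tied times are ignored (a simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def accuracy_label(value):
    """Thresholds used in this review: >= 0.80 high, >= 0.70 acceptable, else low."""
    if value >= 0.80:
        return "high"
    if value >= 0.70:
        return "acceptable"
    return "low"

# Toy cohort (hypothetical): follow-up in months, 1 = death observed, 0 = censored.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.5, 0.1]

c_index = concordance_index(times, events, scores)
print(c_index, accuracy_label(c_index))
```

In this toy cohort, 6 of the 8 comparable pairs are ordered correctly, giving a c-index of 0.75, which falls in the acceptable band.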
Other aspects, including the number of variables, missing values, outcome definition, reference genome, and annotation updates, also contribute to model assessment. Excessive variables increase detection costs and decrease practicability.
Because most models contain fewer than 10 predictors and no firm conclusion has been reached, we defined fewer than 10 predictors as appropriate. Missing data can lead to bias when not handled properly. An outcome defined with a specific follow-up time is clearer and more informative than overall survival (OS). Reference genome and annotation updates can reverse individual risk predictions because the measured expression of multiple genes changes [18]. The inconsistent cutoff values between training and validation sets, the screening method, and the threshold for predictors are complex questions that remain unresolved; they were not reviewed.
We excluded studies if (1) a single genetic predictor was employed with or without other types of predictors, (2) the model predicts glioma diagnosis at the time of screening, and (3) the model is inadequately presented without a regression equation or risk score.
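The grouping rules can be expressed as a small decision function. The sketch below follows the criteria as stated in this review (low quality if performance estimation is missing, if there is no validation set, or if EPV < 10; medium quality if the validation set's performance is not reported; otherwise high quality). The record fields are hypothetical names, and reading "no internal and external validation set" as "neither internal nor external validation" is an assumption.

```python
from dataclasses import dataclass

@dataclass
class ModelReport:
    """Hypothetical summary of one published model (field names are illustrative)."""
    n_events: int
    n_predictors: int
    has_performance_estimate: bool
    has_internal_validation: bool
    has_external_validation: bool
    validation_performance_reported: bool

def events_per_variable(report):
    """EPV = number of outcome events divided by number of candidate predictors."""
    if report.n_predictors <= 0:
        raise ValueError("a model needs at least one predictor")
    return report.n_events / report.n_predictors

def quality_group(report, min_epv=10.0):
    """Assign 'low', 'medium', or 'high' following the review's stated rules."""
    # Assumption: a model fails this check only if it has neither kind of validation.
    no_validation = not (report.has_internal_validation
                         or report.has_external_validation)
    if (not report.has_performance_estimate
            or no_validation
            or events_per_variable(report) < min_epv):
        return "low"
    if not report.validation_performance_reported:
        return "medium"
    return "high"

# Hypothetical examples.
fully_reported = ModelReport(120, 6, True, True, True, True)
no_validation_auc = ModelReport(120, 6, True, False, True, False)
overfitted = ModelReport(40, 8, True, True, True, True)

print(quality_group(fully_reported))     # EPV = 20
print(quality_group(no_validation_auc))  # validation performance missing
print(quality_group(overfitted))         # EPV = 5
```

The `min_epv` parameter is left adjustable because the text notes that an EPV of up to 20 may be needed when many low-prevalence predictors are included.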

RNA Models
3.1. mRNAs. Messenger RNA (mRNA) plays a critical role in the central dogma, controlling protein synthesis and determining cellular biology and behavior [19]. Many biological processes involving mRNA in cancers, such as mRNA splicing, methylation, and interference, can modulate mRNA levels and alter cancer properties; thus, mRNA modulation has long been a therapeutic target. The levels of mRNAs in cancers, including gliomas, represent the expression of genes that are connected to prognosis [11]. Among glioma prognostic models, the largest proportion (78 models) is constructed from mRNA sequencing data (Figure 2), of which 18 models are in the high-quality group.

3.1.1. LGGs. Low-grade gliomas (LGGs) refer to WHO grade I and II gliomas [9]. Twenty-one mRNA models for LGGs have been reported. Among them, ten are in the high-quality group and nine are in the medium-quality group.
In the high-quality group, nine models [20][21][22][23][24][25][26][27][28] and one model [29] show high and acceptable accuracy, respectively. Common advantages and additional highlights of the high-quality group are shown. Four models [22][23][24][25] report the AUC for multiple outcome events (1-, 3-, and 5-year survival), which is more informative than the many models that predict OS only. Two models were validated both internally and externally and were assessed as highly accurate [21,22]. Su's model [25] is highly accurate, except for 5-year survival in the training set (AUC of 0.711). For Zeng's model [24], despite a lower AUC in the external validation set, the nomogram performed better as judged by c-index and calibration curve.

HGGs.
High-grade gliomas (HGGs) comprise WHO grade III gliomas and glioblastoma (GBM). A total of 44 HGG models have been constructed, but few of the studies are satisfactory.
The high-quality and medium-quality groups contain 7 and 11 models, respectively. The high-quality models were designed for GBM [39][40][41][42][43][44][45]. The 4-gene [44] and 3-gene [43] models associated with autophagy were both validated in two independent datasets. The predictive discrimination of the two risk scores varies across survival years and datasets, whereas their nomograms, which integrate risk scores with other common factors, stabilized the accuracy above 0.72. The superior predictive accuracy of nomograms over risk scores can also be observed in a study by Wang, in which the nomogram is highly accurate (from 0.77 to 0.85) compared with the AUC of the risk score (from 0.67 to 0.79). Of note, nomogram performance decreased when estimated by c-index [39]. However, in a study by Zhu, the risk score outperformed the nomogram.
The AUC of the risk score is 0.781 and 0.771 for 2- and 3-year survival in the discovery cohort [41], whereas the c-indices of the nomogram are less than 0.70 [41]. Nomograms benefit from combining risk scores with other predictors, but whether their accuracy increases depends on the quality of those factors. Moreover, calibration curves were plotted and verified reliability in four studies [39,41,43,44]. Additionally, the 2329 samples from multiple cohorts in Zhu's study constitute the largest sample size among existing models, improving its repeatability and transportability [41].

Other Classified Gliomas Clusters.
Three and ten models, out of the 28 models reviewed, are grouped into the high-quality and medium-quality groups, respectively.
The three high-quality models are highly accurate [57-59]. In particular, Wang's model for diffuse glioma exhibits excellent discrimination, with AUC from 0.874 to 0.950 [58], but it was only internally validated.
Nine medium-quality models show high accuracy. In conclusion, most studies were designed for diffuse gliomas, and our criteria identified 3 high-quality models. In addition, many models show high predictive accuracy in the training set. However, without accuracy results for validation sets, the discrimination obtained could not be confirmed.

LncRNAs.
Long noncoding RNAs (lncRNAs) are transcripts more than 200 nt in length. LncRNAs lack significant protein-coding capacity, but their regulatory functions range widely from gene expression to protein translation. In gliomas, lncRNAs function in cancer phenotypes including stemness, drug resistance, blood-tumor barrier permeability, angiogenesis, and motility [70]. LncRNA is the second major hotspot in model research, after mRNA (Figure 2). Seventeen lncRNA-signature models on different gliomas and one model on diffuse intrinsic pontine glioma have been reported. Three are high-quality models, and five are medium-quality models.
All three high-quality models exhibit high accuracy [71][72][73]. Chen's model [73] showed a slightly decreased AUC (0.722) on external validation, whereas the other two models show similarly high accuracy in the training, internal validation, and entire sets, with AUC from 0.84 to 0.91 [71,72]. The highest AUC value in the medium-quality group was obtained by Wang's model (0.942) for anaplastic glioma [74].
This is followed by Kiran's UVA8 model (8-lncRNA signature) [75], which tested acceptably for 5-year survival. The AUC values for the other two models were 0.68 and 0.70, respectively [76,77]. All five medium-quality models were externally validated, but three lacked internal validation [74,77,78]. Moreover, Kiran's study compared the UVA8 model with other predictors and models [75]. The accuracy of UVA8 [75] is higher than that of other clinical features or IDH status, and it outperformed 5 published signatures in the training dataset by c-index [75]. However, because the 6 models were designed for different classes of gliomas and different prognostic events and were validated on various datasets, this positive result in Kiran's study is inevitably questionable due to incomparability.
Internal validation, which is vital for addressing the stability of predictor selection and the quality of predictions before clinical application [79], is absent in the 9 low-quality models (Table S1). Cross-validation or bootstrapping methods should be employed to achieve complete internal validation.
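The two internal validation schemes named above differ mainly in how the development cohort is resampled. The following sketch shows only the resampling step, using the Python standard library; the function names are hypothetical, and fitting the model and estimating performance on each split are left out.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k folds; return a list of (train, test) pairs.

    Each sample appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Draw n_rounds resamples of size n_samples with replacement."""
    rng = random.Random(seed)
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_rounds)]

# Hypothetical cohort of 10 patients.
splits = k_fold_indices(10, 5, seed=1)
boots = bootstrap_indices(10, 3, seed=1)
for train, test in splits:
    print("train:", sorted(train), "test:", sorted(test))
```

In either scheme, predictor selection should be repeated inside every resample; otherwise the stability of the selected predictors is not actually tested.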
3.3. miRNAs. MicroRNAs (miRNAs) are a class of noncoding RNAs that bind to complementary target mRNAs, resulting in mRNA translational inhibition or degradation. In gliomas, miRNAs are involved in various tumor-associated activities, including immune response, hypoxia, tumor plasticity, and resistance to therapy through multigene targets [80], indicating that miRNA-based models are a promising strategy for glioma prognosis.
Fourteen studies on miRNA signatures, one on LGG [81] and the others on HGG, were reported. Three models were characterized in the high-quality group [81][82][83], whereas the other models were classified in the low-quality group. The three high-quality models have respective advantages. The 5-miRNA model by Cheng et al. [83] adopted complete internal and external validations, and the AUC values increased from 0.649 and 0.756 to 0.847 and 0.909, respectively, after integrating age and chemotherapy, although only 19 samples were submitted to external validation. Chen's model [82] was internally and externally validated and achieved similar AUC for disease-free survival and OS in three datasets. In addition, Qian's study [81] established a nomogram that predicted 1-, 2-, 3-, and 5-year survival rates, which is informative for prognostic management. However, its drawbacks include the absence of internal validation and low accuracy in the training set (c-index = 0.68).

[Figure: flowchart of model screening and quality grouping. Search in pubmed.gov and webofknowledge.com using the following key terms: glioma, risk score, nomogram, genetic, prognosis. Models were excluded if they employed only a single predictor, predicted glioma diagnosis at the time of screening, or were presented without a regression equation or risk score. Finally, 138 models were included and divided into three groups. A model was characterized in the low-quality group if there was no performance estimation, no internal and external validation set, or EPV < 10, and classified into the medium-quality group if the performance of the validation set was absent. Eventually, 27 models were divided into the high-quality group.]

Journal of Oncology

Methylation Models
Cytosine-phosphate-guanine (CpG) islands are clusters of CpG sites located at or near the transcription start regions of genes and gene promoters. CpG island methylation is the most common type of epigenetic change in cancer. The CpG island methylator phenotype, which comprises several CpG islands, is a topic of interest in cancer epigenetics [84]. The glioma CpG island methylator phenotype is associated with glioma tumorigenesis and is an independent biomarker stratifying gliomas into epigenetic subtypes [85]. Currently, 8 models have been reported, and none of them is a high-quality model; 3 of the models are classified into the medium-quality group [86][87][88].
Two of the medium-quality models have acceptable discrimination, from 0.71 to 0.77, but were not internally validated [86,87]. Yin's 6-CpG risk score [86] achieved a higher AUC value (0.734) for patients receiving all treatments when integrated with the CpG island methylator phenotype. Moreover, a higher AUC (0.771) was achieved in combination with MGMT status for those receiving radiation therapy/temozolomide. In addition, the prediction accuracy of the 6-CpG signature (87%) was validated via a support vector machine model.

Other Multimolecular Models
Two protein-signature models based on reverse phase protein arrays were constructed. Stetson's 13-protein risk model [89] applied the c-index to estimate the model's accuracy in both training and validation sets for GBM (0.63 and 0.60, respectively) and IDH-wildtype LGG (0.82 and 0.70, respectively), but its shortcoming is the low EPV (less than 10). Patil and Mahalingam developed another 4-protein model, but without external validation and performance estimation [90]. Both models are of low quality. Three mixed models of different classes of molecular signatures were presented for GBM prognosis. The mixed model of mRNA and lncRNA is of high quality [91], and the other two are medium-quality models [92,93]. The three models were fully validated both internally and externally. They were estimated using the receiver operating characteristic curve or c-index; however, the validation sets lacked estimation, and the c-index in the training set of Etcheverry's study was low (0.68) [92].

Biofunction and Clinical Significance of Frequently Reported Molecules
Molecular signatures reviewed consist of mutated genes, noncoding RNAs, and proteins. Currently, star markers including IDH, MGMT, 1p/19q, H3K27M, TERT, and ATRX are known to exhibit significant prognostic value. IDH mutation with 1p/19q codeletion and MGMT promoter methylation are favorable prognostic factors. H3K27M mutation and ATRX alteration are associated with higher risk, whereas TERT has a dichotomous prognostic effect [94]. In addition, other molecular signatures collected from the 138 published models were reported to contribute to prognostic risk estimation. The prognostic value of most molecular biomarkers has not been validated. Therefore, we analyzed the 138 models to select the most frequently overlapping biomarkers (Table 2) with known research evidence to determine their biofunctions and potential prognostic value in gliomas. The predictors that appeared more than twice in the 138 model studies are listed in Table 2. Predictors that overlapped fewer than three times were not reviewed.
IGFBP2 plays critical roles in glioma progression and is correlated with poor prognosis [95]. In addition, IGFBP2 downregulation was reported specifically in IDH-mutant gliomas [96]. EGFR (epidermal growth factor receptor) and integrin β can also interact with IGFBP2 to promote tumor progression [95]. The IGFBP2 gene, therefore, presents prognostic value and is a potential immunotherapeutic target for GBM in future clinical trials [95,96].

HDAC Families and CD44.
Histone deacetylases (HDACs) are a large family of enzymes that mainly exert a repressive influence on transcription [97]. They block gene transcription by inhibiting histone acetylation and compact the DNA/histone complex. In gliomas, HDAC functions to bridge the xCT-CD44 complex with malignant glioma cells and various tumor zones [98]. Currently, HDAC inhibitors exhibit unfavorable therapeutic efficacy in glioma patients. Researchers have reported a more beneficial strategy that uses HDAC inhibitors in combined therapy [99]. Moreover, many clinical trials (Table 3) are ongoing to test their application prospects in treating diffuse intrinsic pontine glioma and HGG. Currently, HDACs are characterized as risk factors; however, their predictive ability has not been verified. In-depth explorations of their therapeutic efficacy and prognostic value are required.
CD44 was identified in GBM and in brain metastases [100]. It is a biomarker of the mesenchymal GBM subtype, which has the most aggressive growth pattern [101]. In addition, GBM progression was inhibited by inhibiting CD44 expression. This indicates a role for CD44 in tumor progression [102].

MDK.
MDK encodes a heparin-binding growth factor and has been extensively studied for its multiple functions in various tissues. MDK contributes to numerous tumor-related activities in glioma, and its overexpression is associated with poor prognosis [103]. However, no clinical trial has been performed to evaluate its therapeutic potential.
6.4. GPNMB. GPNMB encodes a type 1 transmembrane protein expressed mainly on the surface of cancer cells [104]. GPNMB promotes tumor progression through immune-microenvironment plasticity. It also enhances Wnt/β-catenin signaling pathway activation and interaction with Na+/K+-ATPase subunits [105][106][107]. In addition, abnormally high GPNMB expression has been reported to be associated with unfavorable survival outcomes in GBM [108]. These mechanisms support GPNMB as a risk factor for glioma patients.

EGFR.
It has been reported that EGFR signaling pathways are activated in the majority of GBM cells [109]. EGFR gene aberrations include rearrangement, amplification, and mutation. The EGFR variant III (EGFRvIII) is a common mutation product in GBM [110]. EGFRvIII contributes to many tumor biological features [111], indicating a poor outcome. In addition, the association of wild-type EGFR with tumor cell invasion and angiogenesis has been demonstrated in several in vivo and in vitro experiments [112]. Thus, EGFR and EGFRvIII have been identified as popular therapeutic targets for treating malignant glioma patients. However, current treatments that target EGFR, including small-molecule drugs and biologic antibodies, have failed in clinical trials [112], although some trials are still ongoing (Table 3).
6.6. VEGFA. VEGFA, also known as VEGF, is a growth factor that promotes tumor angiogenesis and vascular permeability and regulates immune cell, fibroblast, and microenvironment formation [113]. In glioma, VEGF acts as a regulatory growth factor secreted by glioma stem cells to promote the tumor vasculature [114]. Anti-VEGF-A antibodies, such as bevacizumab, have been applied as a novel antiangiogenic therapy for malignant gliomas [114]. However, this therapy has not achieved considerable progress in treating gliomas. A report by Eskilsson et al. [112] indicates that the failure may be attributed to enhanced invasive properties of tumor cells arising through different mechanisms, for example, RTK signaling. Many trials using bevacizumab with other strategies, for example, EGFR inhibitors, showed improved efficacy (Table 3); some trials are underway. Since several VEGFA studies on gliomas exist, VEGFA is expected to work as an accurate and effective predictor of glioma prognosis, despite its current poor performance in antiangiogenic therapy.
6.7. miR-221/miR-222. miR-221 and miR-222 are two closely related miRNAs located on the X chromosome. They have similar sequence, structure, and biofunction and are upregulated in various human cancers, including glioma. Knocking down miR-221/miR-222 expression blocks cell cycle transition, suppresses tumor cell growth, and increases sensitivity to radiotherapy [115]. miR-221/miR-222 is also associated with tumor cell apoptosis through targeting of an apoptosis-related gene [116]. Recent studies revealed its connection with glioma histology and patient prognosis [117]: either miR-221 or miR-222 expression was associated with a higher WHO grade, thereby indicating a poor prognosis.

Other Significant Predictors.
Apart from the above-discussed vital parameters, other factors play significant roles in glioma malignant properties and prognosis assessment. These predictors include MYC, KCNJ10, CHI3L1, STAT1, and FZD7. MYC amplification has been identified in many classes of cancers, and it has acted as a qualified prognostic biomarker [118]. KCNJ10 encodes the inwardly rectifying potassium channel protein Kir4.1, which is expressed exclusively in glial cells of the central nervous system. Kir4.1 establishes a hyperpolarized resting membrane potential and prevents glioma cell proliferation [119]. CHI3L1 overexpression is associated with poor survival [120], causing tumor invasion, migration, angiogenesis, and resistance to temozolomide therapy [121]. Aberrant STAT1 activation may cause oncogenesis [122]. Limited information on FZD7 activity in gliomas exists; however, the other four predictors have attracted extensive research.

Discussion
For model variables, previous studies focused on pathological, anatomic, and clinical predictors, but recently genetic data have been widely recommended for enhancing predictive ability [123,124]. Formulated genetic predictors have the advantage of calculating absolute risks from marker expression levels, obtained by sequencing analysis, and coefficients, thus providing stronger information than relative risk tools. However, they are still not recommended for clinical application because of limitations such as the lack of gold-standard markers, difficulties in data collection, the complexity of analysis, and low adherence to complete and transparent reporting [125]. Our review confirmed a similar problem in gliomas. We found that only 27 models (20%) are classified into the high-quality group (Table 4), 31 models (22%) are in the medium-quality group, and 80 models (58%) are classified in the low-quality group; none can be clinically applied according to our criteria. Such criteria were urgently required because no method for assessing model quality currently exists [125].
Problems in models consist of methodological deficiencies and the absence of clinical confirmation. The former is mainly attributed to low adherence to guidelines such as the TRIPOD statement [13], leading to a series of deficiencies, such as the lack of performance assessment and validation, that were also observed in other cancers [123,126,127]. As a good example, a prostate cancer model was constructed in transparent and clear detail [128]: its construction details were listed in a table, which makes it easier for users to check model reliability and helps avoid methodological negligence during model establishment. For validation, an age-related risk score [129] was analyzed in 2953 cases from 10 datasets; the large sample size and high AUCs verified its robustness. To improve model methodology, complete and transparent reporting should be strictly ensured. In addition, Zhang et al. highlighted that reference genome and annotation updates cause inconsistent gene expression levels, leading to discordant individual risk grouping. Using an up-to-date reference genome, genes that are stable across annotation releases (with consistent length and overlap), and gene pairs is helpful [18]. For clinical application, while decision curve analysis [130] compares the net benefit of models with that of traditional approaches, the best method for testing clinical significance is the prospective clinical trial [123,125,126]. Health economic impact evaluations should also be considered [131], but we found that no study has reported the cost of predictor detection. The economic cost and effectiveness of models, for example the cost-effectiveness ratio, are critical for medical decision-makers; they also contribute to the optimization of risk thresholds. However, since few studies have applied these methods, we emphasize increasing awareness of clinical considerations once a high-quality model is established.
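The net benefit compared in decision curve analysis is commonly computed as TP/n minus (FP/n) weighted by the odds of the risk threshold, pt/(1 - pt). The sketch below applies that standard formula to invented data; the predicted risks, outcomes, and the 0.5 threshold are hypothetical, and a full decision curve would repeat the calculation over a range of thresholds.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit at one risk threshold: TP/n - (FP/n) * pt / (1 - pt).

    Patients with predicted risk >= threshold are treated as high risk.
    outcomes: 1 if the event occurred, 0 otherwise.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for r, y in zip(predicted_risks, outcomes)
                   if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(predicted_risks, outcomes)
                    if r >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: classify every patient as high risk."""
    return net_benefit([1.0] * len(outcomes), outcomes, threshold)

# Hypothetical predicted risks and observed outcomes for 8 patients.
risks = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.7, 0.4]
outcomes = [1, 1, 0, 0, 0, 0, 1, 1]

nb_model = net_benefit(risks, outcomes, threshold=0.5)
nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
print(nb_model, nb_all)  # the "treat none" strategy has net benefit 0 by definition
```

A model adds clinical value at a given threshold only if its net benefit exceeds both reference strategies (treat all and treat none), which is the comparison a decision curve displays.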
In conclusion, we comprehensively reviewed all 138 machine learning genetic models in gliomas and proposed novel criteria that will foster the development and assessment of clinically important models, not only for neurosurgeons but also for researchers in other cancers. Given the current situation, considerable effort should be devoted to standardizing model quality through adherence to complete and transparent reporting and to promoting model generalization through prospective clinical trials; economic effectiveness should be the next issue addressed. Despite the various difficulties, future genetic models will lead prognostic management, and novel gene-pair-based models deserve development.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.

Authors' Contributions
Conception and design were done by Quan Cheng and Zhixiong Liu; administrative support was given by Quan Cheng and Zhixiong Liu; provision of study materials or patients was done by Xisong Liang and Zeyu Wang; collection and assembly of data were carried out by Ziyu Dai and Hao Zhang; data analysis and interpretation were performed by Xisong Liang and Zeyu Wang; manuscript writing was done by all authors. All authors approved the final version of the manuscript.