Kaplan-Meier plotter data analysis model in early prognosis of pancreatic cancer

Intelligent data analysis methods give cancer researchers helpful tools for predicting the prognosis of patients with specific diseases. Yet very little is known about the features that these models use. In this study, we present a new Kaplan-Meier plotter model with a better combination of input features for the early prognosis of pancreatic cancer. The new model integrates gender, race, and follow-up threshold to better verify genes of interest as prognostic markers for predicting cancer at early stages. The model is assessed by examining the role of the oncogene Rab1A in the early prediction of pancreatic cancer on standard clinical datasets from The Human Protein Atlas. Our results show that overexpression of the oncogene Rab1A in pancreatic cancer plays a vital role in its early prognosis (p<0.05). The results of the proposed model were also verified using an independent dataset deposited in The Human Protein Atlas. Altogether, the experimental results highlight the potential role of Rab1A in cancer prognosis.


Introduction
Pancreatic cancer has the fourth highest mortality rate among cancers [1]. In the United States, 56770 patients are newly diagnosed with pancreatic cancer and 45750 die of it each year [2]. The survival rate remains low (2-9%) despite improvements in treatment, but it can be improved when the disease is detected in its early stages. Early prognosis of pancreatic tumors is therefore essential for successful treatment. In the last decade, many researchers have improved the performance of techniques used in clinical research to support the early prognosis of a given cancer type. These techniques include clinical image analysis, machine learning, and miRNA expression analysis [3][4][5][6][7][8][9].
Machine learning is one such intelligent data analysis method. It uses algorithms to analyze complex medical datasets and has several applications in medicine, such as medical prognosis and research. In recent years, machine learning has shown significant improvement in predicting pancreatic cancer compared with other techniques that dominate the medical literature [10][11][12][13][14][15][16]. Despite its good performance and accuracy in medical fields, machine learning has not been accepted in practice for several reasons. One reason may be the growing abundance of tools available to physicians, which further increases the complexity of their work. Scientists have also begun to identify and predict diseases through clinical image analysis of magnified images of lesion regions. Recently, clinical image analysis methods have exhibited higher accuracy in cancer prediction than the unaided eye [17]. However, clinical images are associated with several limitations and artifacts that lower accuracy, so scientists tend to use other automated methods to analyze patient data. Some scientists have therefore started to combine gene expression and machine learning to predict patient prognosis, in techniques called mRNA profiling. mRNA profiling uses mRNA to diagnose cancer patients and to help predict their outcomes. mRNA could also be used as a biomarker or therapeutic target. However, these techniques require a suitable level of validation before they can be considered for use in medical diagnosis. In this work, we improve a model for pancreatic cancer prediction. To this end, the presented model is assessed with a dataset deposited in The Cancer Genome Atlas (TCGA) database, shown in Table 1. The contributions of this work can be outlined as follows:
-The presented analysis model was assessed with pancreatic tumor RNA-seq datasets.
-The proposed model was investigated for the impact of a new set of extended features on pancreatic cancer prediction (prognosis).
-The performance of the model was assessed on TCGA datasets and verified against The Human Protein Atlas.

New Kaplan-Meier analysis model for pancreatic cancer early prediction
The Kaplan-Meier method is one of the best estimators for evaluating treatments and predicting their effects on cancer survival probability [18]. The classical Kaplan-Meier model consists of fifteen features. These features include: gene symbol or multiple gene symbols; cutoff values; the survival type used to compute median survival (relapse-free survival or overall survival) when the cohorts reach median survival; follow-up threshold (the period of time, in months, after patients receive treatment); quality control (removing redundant samples, checking the proportional hazards assumption, and performing multivariate analysis); cancer stage (from one to four); cancer grade (from one to four); race (white and African American); gender; analysis based on cellular content (basophils, B cells, T cells, eosinophils, and macrophages); and cancer type. However, the classical Kaplan-Meier model can only detect diseases in later stages. In most diseases, especially cancer, patients who are not diagnosed and treated at an early stage will not recover fully. To improve the detection ability of the Kaplan-Meier model for a given disease type, we extended the model by integrating new restriction features along with the basic features. This extension restricts the analysis by follow-up threshold, race, and gender. The new set of features is important in enhancing the detection output of the model. The proposed model was evaluated with a dataset of 176 cases deposited in the TCGA portal.
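The product-limit estimate that underlies the method described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the implementation used by the Kaplan-Meier plotter: the function names `kaplan_meier` and `median_survival` and the toy follow-up data are hypothetical and do not come from the TCGA cohort.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times  : follow-up time of each patient (e.g. in months)
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Hypothetical toy data: six patients, two of them censored.
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)  # 12
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate can use incomplete follow-up data.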

Assessment of the new Kaplan-Meier model results against Human Protein Atlas databases
Pancreatic cancer is the fourth leading cause of cancer death worldwide. Detecting pancreatic cancer in its early stages plays an important role in improving cancer treatment, and finding a new prognostic biomarker for it is a crucial and challenging task. Therefore, we evaluated the results of the new Kaplan-Meier model on the standard clinical datasets from the Human Protein Atlas (https://www.proteinatlas.org). The data were downloaded from this website as the file proteinatlas.xml.gz and used to analyze the mRNA expression of Rab1A in 176 pancreatic cancer patients [19]. 84 patients with high Rab1A were alive and 92 patients were dead. On the other hand, all alive patients had low Rab1A expression. The log-rank test was used to evaluate the significance of the correlation between Rab1A mRNA expression and patient survival. The results were considered significant when the p value was less than 0.05.
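As an illustration of the log-rank comparison mentioned above, the following Python sketch implements the standard two-group test and applies the 0.05 significance threshold. It is a hedged example under our own assumptions: the function name `logrank_test` and the two toy groups are hypothetical, and the numbers are not taken from the Human Protein Atlas data.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p) with 1 degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)           # deaths at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical toy groups (months of follow-up; 1 = death observed).
high_times, high_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_times, low_events = [20, 22, 24, 26, 28], [1, 1, 1, 1, 1]
chi2, p = logrank_test(high_times, high_events, low_times, low_events)
significant = p < 0.05
```

The test compares the deaths observed in one group with the number expected if both groups shared the same survival curve, so a small p value indicates that the high- and low-expression curves differ.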

Amplification of Rab1A in pancreatic cancer
Many previous studies have reported that the Rab1A protein is overexpressed in different cancers [20][21][22][23][24][25][26][27]. Yet its role in pancreatic cancer is still obscure. Hence, the aim of our study is to examine the mRNA level of Rab1A in 176 pancreatic tumor samples and 248 non-cancerous pancreatic samples.

Potential role of the new Kaplan-Meier model in early pancreatic cancer prognosis
In recent years, scientists have tried to develop new methods to help predict a given cancer type before symptoms develop. However, an accurate prognosis of a tumor is considered a very difficult task for medical practitioners. As a result, many analysis models and technologies have been developed for this purpose. Kaplan-Meier analysis has become a very popular tool among medical researchers because it can predict possible cancer outcomes from very complex datasets. Using the basic features of the classical Kaplan-Meier model, namely gene symbol (Rab1A), follow-up threshold (240 months), overall survival (OS), and cancer type (pancreatic cancer), we were able to predict the pancreatic tumor in later stages. Unexpectedly, the analysis revealed that Rab1A is amplified in pancreatic cancer patients. Moreover, patients with a low expression level of Rab1A had better survival than patients with a high expression level, whose survival was poor (p=0.0024) (Figure 1A). These results are consistent with previously published papers reporting that Rab1A is overexpressed in many cancers [20][21][22][23][24][25][26][27]. Although the classical input features of the model can predict the overall outcomes of pancreatic cancer patients, early prediction remains out of reach. We therefore propose in this study a new model with new restriction features (female gender, white race, and a follow-up threshold of 180 months) as input to improve the prognosis of human pancreatic cancer. Remarkably, the new analysis model improves the detection ability in pancreatic cancer (p<0.05). The results are shown in Figure 1B. Our results demonstrate that high expression of Rab1A is linked to poor prognosis in stage 2 pancreatic tumors of white female patients. The results are shown in Figure 1C.
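The restriction features described above can be pictured as a cohort filter applied before the survival analysis. The Python sketch below is illustrative only: the `Patient` fields, the helper names, the median split, and the toy records are assumptions of ours, and we assume that the follow-up threshold acts by censoring patients at the threshold, which the text does not state explicitly.

```python
from dataclasses import dataclass, replace
from statistics import median


@dataclass(frozen=True)
class Patient:
    gender: str
    race: str
    stage: int
    months: float      # follow-up time in months
    died: bool         # True if death was observed
    expression: float  # hypothetical Rab1A mRNA level


def restrict(patients, gender=None, race=None, stage=None):
    """Keep only patients matching every restriction that is set."""
    def keep(p):
        return ((gender is None or p.gender == gender)
                and (race is None or p.race == race)
                and (stage is None or p.stage == stage))
    return [p for p in patients if keep(p)]


def apply_follow_up_threshold(patients, threshold_months):
    """Assumed behavior: censor anyone followed beyond the threshold."""
    return [replace(p, months=threshold_months, died=False)
            if p.months > threshold_months else p
            for p in patients]


def split_by_median(patients):
    """Split into high and low expression groups at the median (assumed cutoff)."""
    cutoff = median(p.expression for p in patients)
    high = [p for p in patients if p.expression > cutoff]
    low = [p for p in patients if p.expression <= cutoff]
    return high, low


# Hypothetical toy cohort, not TCGA data.
cohort = [
    Patient("female", "white", 2, 30.0, True, 8.1),
    Patient("female", "white", 2, 200.0, True, 3.2),
    Patient("male", "white", 2, 15.0, True, 7.5),
    Patient("female", "black", 1, 60.0, False, 4.0),
    Patient("female", "white", 2, 90.0, False, 6.9),
]
subset = restrict(cohort, gender="female", race="white", stage=2)
subset = apply_follow_up_threshold(subset, 180)
high, low = split_by_median(subset)
```

Each restriction shrinks the cohort, so the gain in specificity comes at the cost of fewer patients per group and therefore less statistical power.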
Together, our results show that the new model has a better prognostic ability for the disease at early stages than the classical Kaplan-Meier analysis model. We further examined the impact of the new model on pancreatic cancer at an earlier stage (stage 1). The results are shown in Figure 1D. However, the results were not significant for predicting the disease at stage 1 (p > 0.05). Altogether, these observations suggest that our improved analysis model could help in the early prediction of many other cancers.
Pancreatic cancer is one of the most aggressive cancer types, which makes finding the right treatment option for pancreatic cancer patients a hard task for doctors. Therefore, we confirmed the results of the Kaplan-Meier model on the standard datasets deposited in The Human Protein Atlas [19]. Our analysis of the genomic datasets reveals that pancreatic cancer patients with overexpression of Rab1A have poorer survival, while patients with low expression of Rab1A have better survival (p = 0.048) (Figure 2). Therefore, the Rab1A level in pancreatic cancer patients can provide predictive value for patient outcomes. These observations resemble the results obtained earlier in our study and in other studies [20][21][22][23][24][25][26][27]. The model results strongly support that amplification of Rab1A is linked to a high risk of pancreatic cancer-related death. The Human Protein Atlas database was used to validate our model in pancreatic cancer patients, and the log-rank test was used to calculate the p value.

New Kaplan-Meier analysis model supports Rab1A role in oncogenesis
Several studies in the field of oncology suggest that Rab1A plays a critical role in many tumors [20][21][22][23][24][25][26][27]. Here, the overexpression of Rab1A in pancreatic tumors was examined. The analysis revealed that the Rab1A mRNA level was higher in pancreatic tumors than in noncancerous pancreatic samples. The results are depicted in Figures 3A and 3B. In general, our results show overexpression of Rab1A in pancreatic cancer, similar to other studies [20][21][22][23][24][25][26][27]. Therefore, we believe that our results may elucidate several findings related to pancreatic cancer development.

Conclusion
A new and improved Kaplan-Meier model was presented in this work. The model was assessed for predicting first- and second-stage pancreatic tumors in white female patients. The new Kaplan-Meier model exhibits a higher sensitivity for detecting the cancer in the second stage than the classical one. Furthermore, the results were verified against a standard dataset from the Human Protein Atlas for pancreatic tumor prediction tasks [19]. Interestingly, our Kaplan-Meier results resembled the results from the standard datasets. Overall, our results further emphasize the relevance of Rab1A in the development of malignancies, although its role in cancer development is still not clear. More studies addressing Rab1A regulation in pancreatic tumors are therefore urgently required.