EDLM: Ensemble Deep Learning Model to Detect Mutation for the Early Detection of Cholangiocarcinoma

Cholangiocarcinoma, one of the worst forms of cancer that may affect people, is currently among the leading causes of mortality and disability globally. When cholangiocarcinoma develops, the DNA of the bile duct cells is altered. Cholangiocarcinoma claims the lives of about 7000 individuals annually. Women die less often than men, and Asians have the greatest fatality rate. African Americans (45%) saw the greatest increase in cholangiocarcinoma mortality between 2021 and 2022, followed by Asians (22%) and Whites (20%). Moreover, 60–70% of cholangiocarcinoma patients present with local infiltration or distant metastases, which makes them unable to receive a curative surgical procedure; across the board, the median survival time is less than a year. Many researchers work hard to detect cholangiocarcinoma, but only after the appearance of symptoms, which is late detection. If cholangiocarcinoma progression is detected at an earlier stage, it will help doctors and patients in treatment. Therefore, an ensemble deep learning model (EDLM), which consists of three deep learning algorithms—long short-term memory (LSTM), gated recurrent units (GRUs), and bi-directional LSTM (BLSTM)—is developed for the early identification of cholangiocarcinoma. Several tests are presented, such as a 10-fold cross-validation test (10-FCVT), an independent set test (IST), and a self-consistency test (SCT). Several statistical measures are used to evaluate the proposed model, such as accuracy (Acc), sensitivity (Sn), specificity (Sp), and Matthew's correlation coefficient (MCC). There are 672 mutations in 45 distinct cholangiocarcinoma genes among the 516 human samples included in the proposed study. The IST has the highest Acc at 98%, outperforming all other validation approaches.


Introduction
With the continuous expansion of medical technology, the era of "big data" has arrived. Artificial intelligence (AI) and many AI technologies are being utilized in the medical services industry to unlock the unlimited potential of big data [1]. Cholangiocarcinoma is one of the deadliest forms of cancer in people and is now a leading cause of death and disability in the world [2]. Cholangiocarcinoma develops when the DNA of bile duct cells is altered. The DNA of a cell carries the instructions that direct the cell's activities. Due to these modifications, cells grow uncontrolled and aggregate into masses known as tumors, which can infiltrate and damage healthy body parts [3].
The tumor suppressor gene TP53 is mostly responsible for the alterations in cholangiocarcinoma. Additionally, bile duct cancer may be influenced by the genes KRAS, HER2, and ALK. Some of the genetic modifications that lead to bile duct cancer may be influenced by inflammation [3].
The development of cholangiocarcinoma is illustrated in Figure 1. Cholangiocarcinoma is divided into three main categories. Extrahepatic cholangiocarcinoma is a disease of the extrahepatic bile ducts [4]; the cancer may spread to the liver or the small intestine. Cholangiocarcinoma that begins outside the liver, in the region where the bile ducts and main blood vessels join the liver, is a subclass of extrahepatic cholangiocarcinoma [5]. The second category is intrahepatic cholangiocarcinoma, a cancer of the bile ducts within the liver, and the third is gallbladder cholangiocarcinoma, which starts in the gallbladder. Cholangiocarcinoma kills about 7000 people a year. Women die less often than men, and Asians have the highest death rate. Between 2021 and 2022, African Americans had the largest increase in cholangiocarcinoma deaths (45%), followed by Asians (22%) and Whites (20%) [6]. A total of 60–70% of patients with cholangiocarcinoma are diagnosed with local infiltration or distant metastases, thus losing the possibility of a curative surgical intervention. The median survival time is less than a year across the board [7].
Medical imaging, which includes computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US), is the most effective and common non-invasive diagnostic method for cholangiocarcinoma identification [8]. Cholangiocarcinoma diagnosis could be aided by artificial intelligence (AI). Given the rarity of the condition, the heterogeneity of the tumor's anatomic location and risk factors, and the importance of AI in cholangiocarcinoma detection [9], several machine learning (ML) AI methods, such as logistic regression, support vector machines (SVMs), artificial neural networks (ANNs), and convolutional neural networks (CNNs), have been used to identify cholangiocarcinoma [10]. This study uses a machine learning framework to pinpoint cholangiocarcinoma.
A rare kind of cancer called cholangiocarcinoma starts in the bile ducts. The small intestine receives bile (a digestive liquid) from the liver and gallbladder through bile ducts, which are tiny tubes. Most cholangiocarcinoma cases are discovered after the disease has progressed outside the bile ducts [11]. Treatment is tough, and the chances of recovery are often slim. The exact etiology of cholangiocarcinoma is not known; risk factors suggest that diseases that cause chronic (permanent) irritation of the bile ducts may contribute to the growth of this cancer [12]. DNA changes resulting from constant damage, such as inflammation, can alter some cells' growth, division, and behavior. The pie chart in Figure 2 shows that more than 70% of liver tumors are hepatocellular carcinomas (HCCs), while 8% of liver cancers are cholangiocarcinomas (CCAs), the second most common primary malignancy [13].
The ratio of primary tumors is shown in Figure 2. Most patients with cholangiocarcinoma are over 65 years old. Effective treatment can be difficult because the disease is often not diagnosed until it is at an advanced stage. Depending on where the cancer is and how it develops, affected people can live months to years after diagnosis [13]. In 2009, Logeswaran used a popular ANN, a multilayer perceptron, to differentiate images with and without CCA [15]: the Acc of the test distinguishing healthy images from tumor images was 94%, and the Acc of the multi-disease test was 88%. To assess the Sp of several serum indicators to enhance CCA diagnosis, Pattanapairoj et al. constructed a classification model in 2015 using both C4.5 (a technique used to build classification models for decision trees in logical form) and an ANN, with an AUC of 0.961 [16]. Shao et al. created an ANN model in 2018, which is crucial for patients with inoperable CCA when choosing a course of therapy, with an AUC of 0.964 [17].
Another study by Peng et al., in 2019, offered a novel method of precision treatment for CCA patients, using radiographic signatures of 128 CCA patients based on US scans to noninvasively characterize the biological activity of CCA, with an AUC of 0.930 [18], as shown in Table 1. Additionally, Yang et al. investigated the MRI radiomics model's diagnostic efficacy using random forests in 2020, reporting an AUC of 0.90 [19].

Figure 2. Relative incidence of primary liver tumors: hepatocellular carcinoma, cholangiocarcinoma, other malignancies, and benign tumors.
Table 1. Related studies and their best AUC values.

Study                      | Model(s)      | AUC
Logeswaran [15]            | MLP           | 0.960
Pattanapairoj et al. [16]  | C4.5, ANN     | 0.961
Shao et al. [17]           | BP-ANN        | 0.9648
Peng et al. [18]           | LASSO, SVM    | 0.930
Yang et al. [19]           | Random Forest | 0.90

The AUC values were the best results in the above studies. ANN: artificial neural network; MLP: multi-layer perceptron; C4.5: an algorithm used to construct a decision tree classification model in logical form; BP-ANN: back-propagation artificial neural network; LASSO: least absolute shrinkage and selection operator; SVM: support vector machine [13].
Artificial intelligence can automatically offer a quantitative and impartial evaluation of a tumor by recognizing intricate patterns in image data [20]. The present state of the art, however, has many limitations. There is not yet a clear benchmark dataset of cholangiocarcinoma mutations and the corresponding sequences. Additionally, existing assessment techniques lack the necessary rigor, so there appears to be ample room to improve the models' accuracy [21]. The latest and most generalized dataset, as described in the Data Acquisition Framework section, was collected for this study while keeping these requirements in mind. Three deep learning RNN algorithms were utilized in this study: LSTM, GRU, and BLSTM.
Many machine and deep learning algorithms have been utilized for cancer detection, primarily focusing on image-based approaches that identify cancer after symptoms have already appeared. However, early detection is crucial to improve treatment outcomes. This study aims to achieve early detection by identifying cancer through mutation detection in gene sequences. The dataset used in this study is not publicly available, as it was formulated from multiple renowned databases. To enable efficient mutation detection, extensive feature extraction techniques are employed. Moreover, the training process is enhanced by ensembling multiple deep learning algorithms. The performance of the proposed models is evaluated using various testing techniques to ensure their effectiveness.

Materials and Methods
This section gives a full explanation of the dataset collection, feature extraction, and classification strategies. The proposed methodology consists of dataset curation, extensive feature extraction, an EDLM for accurate classification, testing, and evaluation. The proposed EDLM consists of three deep learning models: LSTM, GRU, and BLSTM. The whole process is explained with the help of Figure 3.


Data Acquisition Framework
The dataset is the main part of this study. A complete data collection framework was built to train, test, and evaluate the EDLM. Data collection is the process of obtaining reliable and exact information for a study and of demonstrating that the information is gathered from a credible source [22].
In human genes, two types of mutations occur: driver mutations and passenger mutations. A driver mutation is the type of mutation that causes cancer: cells grow abnormally because of driver mutations [23]. The dataset was developed so that it contains both normal and mutated sequences. The normal gene sequences were gathered from asia.ensembl.org (accessed on 13 November 2022) [24] using web scraping code (WSC) developed in Python. CD-HIT was used to reduce sequence redundancy and improve performance. These gene sequences are available, but a generalized dataset of mutated sequences is not. Therefore, mutation information was obtained from IntOGen.org [25] using WSC written in Python. The mutation information contains the address of each element in the normal gene sequence, with the nucleotides both before and after mutation. A Python program named Generate Mutated Sequences (GMS) was therefore written, which incorporated these changes into the normal gene sequences and built the mutated sequences. All the normal gene sequences of all genes were combined in one file, and all the mutated gene sequences of all genes were combined in another file. Thus, the final dataset was formulated to contain both normal and mutated gene sequences.
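The GMS step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (0-based position, reference nucleotide, alternate nucleotide) is an assumption about how the IntOGen mutation information might be represented.

```python
# Hypothetical sketch of the GMS (Generate Mutated Sequences) step.
# The (position, ref, alt) record layout is an illustrative assumption,
# not the actual IntOGen schema.

def apply_mutations(normal_seq, mutations):
    """Return a mutated copy of normal_seq.

    mutations: list of (position, ref, alt) tuples with 0-based positions.
    The reference nucleotide is checked before each substitution.
    """
    seq = list(normal_seq)
    for pos, ref, alt in mutations:
        if seq[pos] != ref:
            raise ValueError(f"reference mismatch at {pos}: {seq[pos]} != {ref}")
        seq[pos] = alt
    return "".join(seq)

normal = "ATGCGTAC"
mutated = apply_mutations(normal, [(2, "G", "T"), (5, "T", "C")])
print(mutated)  # ATTCGCAC
```

Checking the reference nucleotide before substituting guards against stale coordinates when the normal sequences and the mutation records come from different databases.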
The 516 human samples included in the proposed study contain 672 mutations related to 45 different cholangiocarcinoma genes, which are listed in Table 2. Figure 4 shows a word cloud, generated with Python, in which the size of each entry represents the frequency and significance of each nucleotide across all gene sequences connected to cholangiocarcinoma.


Feature Extraction
Feature extraction effectively reduces the amount of data that must be processed while accurately and comprehensively defining the initial data set by combining variables with selections and/or characteristics. It enhances the performance and Acc of training models [26].
Feature extraction techniques are used to extract key characteristics from the raw data source. The process of collecting data in numerous phases to extract critical characteristics required in model training is known as feature extraction [27]. This is the most crucial phase in the preparation of machine learning and deep learning algorithms. Attribute extraction finds data patterns, which are then employed in data training and testing procedures.
This study calculates statistical moments such as Hahn, raw, and central moments. These feature extraction approaches are used to extract key features from data from mutant gene sequences and normal gene sequences [27,28]. All feature extraction techniques are listed in Figure 5.


Hahn Moments Calculation
Hahn moments are used to compute statistical parameters and are an important concept in pattern recognition. They compute the mean and variance for the data collection and are calculated from the Hahn polynomial. These moments are invariant to the size and placement of the data. They are significant because they are sensitive to biological sequence information and can extract hidden properties from gene sequences [29].
Hahn moments require two-dimensional data. As a result, each genomic sequence is transformed into a two-dimensional matrix A of size x × x (Equation (1)). The matrix A represents the gene sequence, and its values are used to compute the Hahn moments.
Each element of A represents a genomic sequence residue. The computation of statistical moments of the third order can be found in [30]. Hahn moments are orthogonal since they operate on a square matrix. For the benchmark dataset, the Hahn polynomial can be calculated using Equations (2) and (3), where a and b are predefined positive integer constants, G is the size of the data array, and x ∈ [0, G − 1] is any member of the square matrix, with the moment's order also given. The polynomial's shape can be altered using these adjustable parameters [31]. The Pochhammer symbol is (a)_k = a(a + 1) ··· (a + k − 1) = Γ(a + k)/Γ(a). For each gene sequence, 10 raw, 10 central, and 10 Hahn moments, up to the 3rd order, are calculated and united into the miscellaneous super feature vectors.
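Two building blocks of this computation can be sketched directly: the Pochhammer symbol used in the Hahn polynomial, and the mapping of a 1D sequence into a square matrix. The integer encoding of nucleotides and the zero-padding of the tail are illustrative assumptions, not the paper's exact scheme.

```python
import math
import numpy as np

def pochhammer(a, k):
    """Rising factorial (a)_k = a(a+1)...(a+k-1) = Gamma(a+k)/Gamma(a)."""
    result = 1
    for i in range(k):
        result *= a + i
    return result

def sequence_to_matrix(seq):
    """Map a 1D genomic sequence into the smallest square matrix that holds
    it, padding the tail with zeros (an illustrative choice). The integer
    encoding of nucleotides is also an assumption."""
    code = {"A": 1, "C": 2, "G": 3, "T": 4}
    values = [code.get(ch, 0) for ch in seq]
    x = math.ceil(math.sqrt(len(values)))
    padded = values + [0] * (x * x - len(values))
    return np.array(padded).reshape(x, x)

print(pochhammer(3, 2))                       # 12  (3 * 4)
print(sequence_to_matrix("ATGCGTACC").shape)  # (3, 3)
```

The square matrix produced here is the input A on which the Hahn, raw, and central moments are subsequently computed.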

Central Moments Calculation
Utilizing the mean and variance, the central moments are used to extract key features. A central moment is the moment of a probability distribution about the mean of the random variable [32]. The following equation illustrates the generic formula for calculating the central moments for the cholangiocarcinoma dataset; the distinctive qualities up to the 3rd order are designated C00, C01, C10, C11, C02, C12, C21, C30, and C03.
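A minimal sketch of a 2D central moment follows, using the conventional definition C_pq = Σ_i Σ_j (i − ī)^p (j − j̄)^q A[i, j], where the centroid (ī, j̄) comes from the raw moments; the exact index convention in [32] may differ.

```python
import numpy as np

def central_moment(A, p, q):
    """Central moment C_pq = sum over (i, j) of (i - ibar)^p (j - jbar)^q * A[i, j],
    where (ibar, jbar) is the centroid computed from raw moments.
    The 0-based index convention is an assumption."""
    A = np.asarray(A, dtype=float)
    i, j = np.indices(A.shape)
    m00 = A.sum()
    ibar, jbar = (i * A).sum() / m00, (j * A).sum() / m00
    return ((i - ibar) ** p * (j - jbar) ** q * A).sum()

A = [[1, 2], [3, 4]]
print(central_moment(A, 0, 0))  # 10.0 (equals the total mass)
print(central_moment(A, 1, 0))  # 0.0  (first central moments vanish by construction)
```

That C10 and C01 are exactly zero is a general property of central moments, since they are measured about the centroid itself.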

Raw Moments Calculation
The raw moments are used for statistical computations. Imputation is the process of maintaining facts by substituting the best available replacement values for missing data values in a data collection. The following equation shows the raw moments for 2D data of order x + y [33]. Raw moments are computed up to order three, which provides detailed information on important sequence elements, including R00, R01, R10, R02, R20, R03, and R30. The position of each gene in the cholangiocarcinoma gene sequence is determined using the position relative incidence matrix (PRIM) [31]. The following equation displays the PRIM-generated matrix of dimension 30 × 30, whose element P_i→j denotes the relative position incidence of the ith element with respect to the jth element.
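The raw-moment definition above can be written compactly. As with the central moments, this is a sketch under the conventional definition R_pq = Σ_i Σ_j i^p j^q A[i, j]; the 0-based index convention is an assumption.

```python
import numpy as np

def raw_moment(A, p, q):
    """Raw moment R_pq = sum over (i, j) of i**p * j**q * A[i, j] for 2D data.
    Index convention (0-based; i = row, j = column) is an assumption."""
    A = np.asarray(A, dtype=float)
    i, j = np.indices(A.shape)
    return (i ** p * j ** q * A).sum()

A = [[1, 2], [3, 4]]
print(raw_moment(A, 0, 0))  # 10.0 (sum of all elements)
print(raw_moment(A, 1, 1))  # 4.0  (only the i=1, j=1 term survives: 1*1*4)
```

R00 equals the total mass of the matrix, so the centroid used by the central moments is simply (R10/R00, R01/R00).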
The RPRIM operates similarly to PRIM but in the other direction [31]. The following equation describes, in detail, how to calculate RPRIM for the cholangiocarcinoma dataset:

Accumulative Absolute Position Incidence Vector (AAPIV)
A frequency matrix provides information about the frequency of a nucleotide in a sequence of genes. AAPIV describes the different nucleotide configurations that occur in gene sequences and is used to compare the nucleotide content of cancer gene sequences against each other. The following equation illustrates its calculation for the cholangiocarcinoma gene sequences [33], where n is the total number of nucleotides in the gene sequence and β_i represents the cumulative position incidence of the ith nucleotide.
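One common reading of AAPIV is that, for each nucleotide, its entry is the sum of all positions at which that nucleotide occurs. A minimal sketch under that assumption (1-based positions, fixed A/C/G/T order) follows; the paper's exact convention may differ.

```python
def aapiv(seq, alphabet="ACGT"):
    """Accumulative absolute position incidence vector: for each nucleotide,
    the sum of the (1-based) positions at which it occurs in the sequence.
    The 1-based convention and alphabet order are assumptions."""
    totals = {ch: 0 for ch in alphabet}
    for pos, ch in enumerate(seq, start=1):
        if ch in totals:
            totals[ch] += pos
    return [totals[ch] for ch in alphabet]

print(aapiv("ATGCGT"))  # [1, 4, 8, 8]
```

For "ATGCGT", A occurs at position 1, C at 4, G at 3 and 5 (sum 8), and T at 2 and 6 (sum 8), hence the vector [1, 4, 8, 8].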

Frequency Vector Calculation
A dataset consists of tens of thousands of data records, each with a unique set of properties. A frequency matrix represents the composition of nucleotides that come together to produce a gene sequence. The frequency vector can be determined using the following equation, in which f1 to fn denote the frequency of each nucleotide in the cholangiocarcinoma gene sequence.
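The frequency vector is simply a count of each nucleotide in a fixed alphabet order, which can be sketched as:

```python
from collections import Counter

def frequency_vector(seq, alphabet="ACGT"):
    """Frequency vector f_1..f_n: the count of each nucleotide in the
    gene sequence, reported in a fixed alphabet order."""
    counts = Counter(seq)
    return [counts.get(ch, 0) for ch in alphabet]

print(frequency_vector("ATGCGTAC"))  # [2, 2, 2, 2]
```

Unlike AAPIV, the frequency vector ignores where a nucleotide occurs and records only how often it occurs.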

Proposed Deep Learning Algorithms
In this study, an ensemble deep learning model (EDLM) is proposed, which consists of three deep learning models (LSTM, GRU, and BLSTM), for the early diagnosis of cholangiocarcinoma. Several malignancies, including cholangiocarcinoma, are recognized, detected, predicted, and diagnosed using the proposed EDLM. An input layer, an output layer, a pooling layer, a dense layer, and a dropout layer are just a few of the layers that make up a deep neural network model, with fully connected layers placed on top of everything else [34]. Each layer receives input from the layer before it and analyzes the features. Algorithms with intrinsic learning characteristics inside these layers can educate themselves using several learning techniques.
LSTM, GRU, and BLSTM are the three types of deep learning RNN algorithms [35] used in this study. For the detection of cholangiocarcinoma, these algorithms are evaluated with three assessment methods: an SCT, an IST, and a 10-FCVT.

Long Short-Term Memory (LSTM)
LSTM networks, a kind of recurrent neural network, can discover order dependency in sequence prediction challenges [36]. The efficiency of a plain RNN declines as the sequence length grows; LSTMs, in essence, are capable of long-term information storage. The input length is fixed at 64, and a 128-neuron LSTM layer is added [37]. The dense layer links the feedback from all the layers and transmits it to the output layer. A dropout rate of 20% is applied to prevent the model from overfitting; to combat overfitting, this model features two dropout layers. Stochastic Gradient Descent (SGD) is used as the optimizer in the LSTM layer, and a sigmoid function is used as the activation function, the details of which are available in [38]. These details are also shown in Figure 6. The input gate determines which input value is used to change the memory: the sigmoid function determines whether 0 or 1 values are allowed [39], and the tanh function assigns a weight to the provided data, defining its value on a scale from −1 to 1.
The forget gate identifies the details in the block that should be erased. It is determined by a sigmoid function, which examines the previous state (h_t−1) and the content input (x_t) and, for each number in the cell state c_t−1, returns a value between 0 (omit this) and 1 (keep this) [39].
The block's input and memory are both used to calculate the output. The sigmoid function determines whether 0 or 1 values are acceptable, while the tanh function defines which values between 0 and 1 can pass [40]. Additionally, the tanh function weights the provided values by determining their significance on a scale from −1 to 1, and this is multiplied by the sigmoid output.
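The gate mechanics described above can be condensed into a single NumPy time step. This is a didactic sketch of the standard LSTM cell equations, not the trained Keras model: the weights are random, and the stacked parameter layout is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the parameters for the forget (f),
    input (i), and output (o) gates and the candidate (g)."""
    z = W @ x + U @ h_prev + b             # (4*hidden,) pre-activations
    n = h_prev.size
    f = sigmoid(z[0 * n:1 * n])            # forget gate: what to erase
    i = sigmoid(z[1 * n:2 * n])            # input gate: what to write
    o = sigmoid(z[2 * n:3 * n])            # output gate: what to expose
    g = np.tanh(z[3 * n:4 * n])            # candidate values in [-1, 1]
    c = f * c_prev + i * g                 # new cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c

n_in, n_hid = 4, 3
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```

Because h is the tanh of the cell state scaled by a sigmoid in (0, 1), every component of the hidden state stays within (−1, 1), which is what keeps the recurrence numerically stable over long sequences.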

Gated Recurrent Unit (GRU)
The second deep learning technique used in the suggested study is the GRU. A GRU accomplishes tasks comparable to an LSTM but has fewer gates; since it uses fewer gates and parameters, the GRU outperforms the LSTM in terms of outcomes [40]. The reset gate and the update gate are the only gates used by the GRU in the cell: the update gate controls how much past data are used, whereas the reset gate controls how much past data are disregarded [41]. The GRU cell structure utilized to identify cholangiocarcinoma is shown in Figure 7, and the following equations show the working process of the GRU. In the proposed model, the input is transformed into a vector with a fixed word length of 64 by a single embedding layer. A GRU layer with 256 neurons and a fundamental RNN layer with 128 neurons make up the second layer. Two dropout layers with a rate of 30% are added to avoid overfitting, and a dense layer of 10 neurons is introduced at the end. Stochastic Gradient Descent (SGD) is used as the optimizer in the GRU layer, as explained in [42]. The sigmoid function is used as the activation function, and sparse categorical cross-entropy (SCCE) is used to reduce the loss experienced during training.
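The two-gate working process can likewise be sketched as one NumPy step. Note that this uses one common GRU convention, h = (1 − z)·h_prev + z·h̃; some formulations swap z and (1 − z), and the paper's exact equations may follow either. The random weights are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step with the update and reset gates described above."""
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate: how much past to keep
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate: how much past to forget
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde          # blend old state and candidate

rng = np.random.default_rng(1)
n_in, n_hid = 4, 3
mats = [rng.normal(size=(n_hid, n_in)) if k % 2 == 0 else rng.normal(size=(n_hid, n_hid))
        for k in range(6)]                          # Wz, Uz, Wr, Ur, Wh, Uh
h = gru_step(rng.normal(size=n_in), np.zeros(n_hid), *mats)
print(h.shape)  # (3,)
```

Compared with the LSTM step, there is no separate cell state and one fewer gate, which is where the parameter savings mentioned above come from.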

Bi-Directional LSTM (BLSTM)
The BLSTM extends the regular LSTM. The model incorporates two parallel LSTM layers that produce a forward and a backward loop, as shown in Figure 8 [42]. The network produces predictions by using past and future information from the forward and backward sequences: in this scenario, the present information depends on previous information and is also related to future information [43]. The red and green arrows in Figure 8, and the equations below, reflect the forward and backward passes, respectively.
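The forward/backward arrangement can be sketched generically: run a recurrent cell over the sequence in both directions and concatenate the two hidden states at each time step. A simple tanh RNN cell stands in for the LSTM here to keep the sketch short; the wrapper is the part that illustrates bidirectionality.

```python
import numpy as np

def bidirectional(step, xs, h0):
    """Run a recurrent step function forward and backward over a sequence
    and concatenate the two hidden states at each time step, as a BLSTM does."""
    def scan(seq):
        h, out = h0, []
        for x in seq:
            h = step(x, h)
            out.append(h)
        return out
    fwd = scan(xs)              # past -> future pass
    bwd = scan(xs[::-1])[::-1]  # future -> past pass, re-aligned to the input order
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(2)
W = rng.normal(size=(3, 4))
U = rng.normal(size=(3, 3))
step = lambda x, h: np.tanh(W @ x + U @ h)  # simple RNN cell standing in for LSTM
xs = [rng.normal(size=4) for _ in range(5)]
outs = bidirectional(step, xs, np.zeros(3))
print(len(outs), outs[0].shape)  # 5 (6,)
```

Each output is twice the hidden size because the state at time t now summarizes both the prefix and the suffix of the sequence around t.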

Ensemble Deep Learning Models (EDLM)
Ensemble learning is a process in which multiple diverse models are created to predict an outcome, either by using a variety of modeling algorithms or a variety of training data sets. The ensemble then combines each base model's forecast, yielding a single final prediction for the unseen data. The purpose of using an EDLM is to reduce the generalization error of prediction: when the base models are diverse and independent of one another, the prediction error of the ensemble is reduced. The approach draws on the wisdom of crowds when making predictions. Although an ensemble model contains multiple base models, it acts and functions as a single model [46].
Using this ensemble learning strategy, this study combines the individual deep learning models LSTM, GRU, and BLSTM. The extensively feature-extracted dataset is split into three groups: a training set, a validation set, and a test set, where V denotes the validation set and T the test set [47]. Each of the LSTM, GRU, and BLSTM deep learning models receives the training set as input. A grid search optimization approach is also used to obtain the scan ranges and the optimal values for the proposed ensemble learning model's parameters. As shown in Figure 9, the EDLM is created from the individual deep learning models, named training model 1, training model 2, and training model 3, which represent the LSTM, GRU, and BLSTM, respectively. Similar studies have also been conducted using machine and deep learning techniques [47][48][49][50][51][52][53][54][55][56][57][58][59][60][61]. The BLSTM cell structure used in the identification of cholangiocarcinoma is shown in Figure 8 [44,45].
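One way the three base models' outputs can be combined is by averaging their predicted probabilities and thresholding the result. This is an illustrative assumption: the paper does not specify its exact combination rule, and the probabilities below are made-up example values, not results from the study.

```python
import numpy as np

def ensemble_predict(probabilities, threshold=0.5):
    """Average the base models' per-sample probabilities and apply a
    decision threshold. `probabilities` is a list of arrays, one per
    base model (here standing in for LSTM, GRU, and BLSTM outputs)."""
    avg = np.mean(probabilities, axis=0)
    return (avg >= threshold).astype(int), avg

# Hypothetical per-sample "mutated" probabilities from the three base models.
lstm_p  = np.array([0.92, 0.30, 0.55])
gru_p   = np.array([0.88, 0.20, 0.40])
blstm_p = np.array([0.95, 0.25, 0.60])

labels, avg = ensemble_predict([lstm_p, gru_p, blstm_p])
print(labels)  # [1 0 1]
```

Averaging (soft voting) lets a confident base model outvote an uncertain one, which is one reason ensembles of diverse, independent models tend to reduce generalization error.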

Each training model is assessed using the validation and test sets. Finally, as shown in the Results section, the EDLM produces superior final results.
Each individual deep learning model is assigned a weight to build the ensemble learning prediction, ŷ = Σ (i = 1 to n) w_i p_i. Here, n denotes the number of individual deep learning models, w_i the weight given to each model, p_i each model's prediction, and ŷ the ensemble output [62].
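The weighted combination just described can be sketched as follows. The weights and the per-model probabilities are hypothetical values chosen only to make the example runnable:

```python
import numpy as np

def ensemble_predict(predictions, weights):
    """Weighted average of base-model class probabilities.

    predictions: (n_models, n_samples) array of mutation probabilities.
    weights: one weight per base model; normalized here to sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(predictions)

# Hypothetical probabilities from LSTM, GRU, and BLSTM for four samples.
p_lstm  = [0.90, 0.20, 0.60, 0.10]
p_gru   = [0.80, 0.30, 0.70, 0.20]
p_blstm = [0.95, 0.10, 0.80, 0.05]

# Hypothetical weights; here BLSTM is trusted twice as much as the others.
probs = ensemble_predict([p_lstm, p_gru, p_blstm], weights=[1, 1, 2])
labels = (probs >= 0.5).astype(int)
print(labels)   # final mutation / no-mutation calls
```

The single array of labels is the ensemble's final prediction, illustrating how multiple base models still act as one model.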
The mathematical formulas used to determine the results of the algorithms are listed below. The formulas for sensitivity, specificity, accuracy, and Matthew's correlation coefficient (MCC), in that order, are:

Sn = TP / (TP + FN)
Sp = TN / (TN + FP)
Acc = (TP + TN) / (TP + TN + FP + FN)
MCC = (TP × TN − FP × FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))

In the equations above, Sn refers to the ability to correctly detect cholangiocarcinoma, while Sp refers to the ability to correctly identify the absence of cholangiocarcinoma. All individuals in TP + FN have the specified condition [63], while TN + FP are the subjects without it. The total number of participants with positive outcomes is denoted by TP + FP, whereas those with negative results are denoted by TN + FN [64,65].
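These four measures can be computed directly from the confusion-matrix counts. The counts used below are hypothetical, chosen only for illustration:

```python
import math

def evaluate(tp, tn, fp, fn):
    """Return Acc, Sn, Sp, and MCC from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sn = tp / (tp + fn)                  # sensitivity: true positive rate
    sp = tn / (tn + fp)                  # specificity: true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, sn, sp, mcc

# Hypothetical counts for illustration only.
acc, sn, sp, mcc = evaluate(tp=48, tn=50, fp=2, fn=0)
print(round(acc, 2), round(sn, 2), round(sp, 2), round(mcc, 2))
```

Note the guard on the denominator: MCC is conventionally set to 0 when any marginal count is zero, which would otherwise cause a division by zero.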

Results
The cholangiocarcinoma dataset is pre-processed to extract the key aspects of the balanced data. Extensive feature extraction techniques are applied to the retrieved data, and the proposed EDLM deep learning methods are then employed. IST, SCT, and 10-FCVT were used to gauge the effectiveness of the deep learning algorithms. The outcomes of the various validation procedures are as follows.

Self-Consistency Test (SCT)
SCT is an iterative testing process that stops when the results are satisfactory. In SCT, the complete dataset (100% of the collected information) is used for both training and testing. Very minimal loss occurs in BLSTM, and the SCT showed that LSTM, GRU, and BLSTM all achieved very high Acc. The ROC curve using SCT is shown in Figure 10.

An outcome between 0.99 and 1.0 on the ROC curve should be regarded as excellent. The decision boundary results using SCT are shown in Figure 11.

Independent Set Test (IST)
The proposed study was also validated via IST, which serves as the primary performance measure for the proposed model. The values from the confusion matrix are used to determine the model's Acc. In this case, the algorithm is trained on 80% of the dataset and tested on the remaining 20%. The ROC curve of EDLM in IST is shown in Figure 12.
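The 80/20 hold-out used by the IST can be sketched as follows; the helper name and the fixed seed are illustrative, not taken from the paper:

```python
import random

def independent_split(samples, test_fraction=0.2, seed=42):
    """Shuffle once, then hold out a fixed fraction as an independent test set."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)          # reproducible shuffle
    cut = int(len(samples) * (1 - test_fraction))
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test

data = list(range(516))            # 516 samples, as in the study's dataset
train, test = independent_split(data)
print(len(train), len(test))       # 412 train, 104 test
```

Because the test portion is held out before any training, the resulting Acc is measured on data the model has never seen, which is what makes the IST the primary performance measure.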

The decision boundary results using IST are shown in Figure 13.

10-Fold Cross-Validation (10-FCVT)
The data are uniformly subsampled into 10 groups for the 10-FCVT. The training data are divided into 10 folds; each fold in turn is treated as the validation set while the remaining nine folds are used for training, which allows the model's hyperparameters and architecture to be chosen. This process is repeated 10 times, and the average value is calculated. Figure 14 shows the ROC curve of EDLM in 10-FCVT.
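The fold construction described above can be sketched as follows, using the study's 516 samples as the count (the helper names are illustrative):

```python
def k_folds(n_samples, k=10):
    """Partition sample indices into k nearly equal, disjoint folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        folds.append(indices[start:start + size])
        start += size
    return folds

def cross_validate(n_samples, k=10):
    """Yield (train_indices, validation_indices) for each of the k rounds."""
    folds = k_folds(n_samples, k)
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

# With 516 samples and k = 10, folds hold 52 or 51 samples each.
sizes = [len(val) for _, val in cross_validate(516)]
print(sizes)
```

Each sample appears in the validation set exactly once across the 10 rounds, so averaging the 10 scores uses every sample for both training and validation.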
A decision boundary visualization of each fold obtained through EDLM in 10-FCVT is shown in Figure 15.


Results Comparison
The results of the EDLM are contrasted with those of its constituent algorithms, LSTM, GRU, and BLSTM, in Table 3. Several measurements are used for the comparison, which covers all three types of tests: SCT, IST, and 10-FCVT. Table 3 demonstrates how the proposed EDLM improves on the identification precision of the stand-alone deep learning techniques. As Table 3 shows, BLSTM performs very well in SCT and IST, while EDLM performs very well in 10-FCVT, which is the best representation of the whole dataset. The statistical tools used to evaluate the models are Acc, Sn, Sp, and MCC. The Acc, Sn, Sp, and MCC of BLSTM in SCT are 99%, 100%, 98%, and 0.98, respectively; of BLSTM in IST, 98%, 100%, 96%, and 0.95, respectively; and of EDLM in 10-FCVT, 92%, 94%, 93%, and 0.86, respectively. Table 4 provides a comprehensive overview of the performance comparison between the previous approaches and the proposed models. The BLSTM model stands out with its exceptional AUC value of 0.99, indicating its superior predictive capability. Additionally, the proposed algorithms as a whole exhibit better accuracy, at 98%, than that achieved by previous methods, further emphasizing their potential for enhancing predictive modeling tasks. These findings highlight the importance of continued research and development in the field, as advancements in machine learning algorithms have the potential to revolutionize various domains.

Study                        Model           Area Under the Curve
Matake et al. [14]           ANN             0.961
Logeswaran [15]              MLP             0.960
Pattanapairoj et al. [16]    C4.5, ANN       0.961
Shao et al. [17]             BP-ANN          0.9648
Peng et al. [18]             LASSO, SVM      0.930
Yang et al. [19]             Random Forest   0.90

The proposed EDLM can also be applied to other types of cancerous datasets. The results produced by the proposed model on a prostate cancer dataset are shown in Table 5. The proposed model thus appears to be very effective for the detection of mutations to detect cancer. Its validation on other types of cancer datasets demonstrates the generalizability of the proposed model.

Discussion
Cholangiocarcinoma (CCA), one of the deadliest types of cancer in humans, is currently the leading cause of death and disability worldwide. In this study, an EDLM composed of three deep learning models (LSTM, GRU, and BLSTM) is proposed. The proposed system is a viable in silico strategy for tracking down mutations in cholangiocarcinoma and, compared with the present state of the art, is a computationally intelligent predictor. A complete data collection framework is developed in Python to scrape the data from well-known databases and build mutated gene sequences. An extensive feature extraction framework extracts useful features from the gene sequences and prepares the dataset for training and testing. The EDLM, based on ensemble learning of LSTM, GRU, and BLSTM, learns the hidden features of the prepared dataset and identifies mutations in cholangiocarcinoma for early detection. Multiple testing techniques (SCT, IST, and 10-FCVT) are used to test the proposed model, and multiple statistical tools (Acc, Sn, Sp, and MCC) are used to evaluate the proposed EDLM, LSTM, GRU, and BLSTM. The performance of EDLM in terms of the ROC curve in SCT, IST, and 10-FCVT is shown in Figures 10, 12 and 14, respectively, and in terms of the decision boundary in Figures 11, 13 and 15, respectively. The evaluation results are shown in Table 3: BLSTM shows the best performance in both SCT and IST, while EDLM shows the best performance in 10-FCVT.

Conclusions
Cholangiocarcinoma (CCA), one of the deadliest types of cancer in humans, is currently the leading cause of death and disability worldwide. This study presents the best results to date for early cholangiocarcinoma diagnosis using EDLM. The proposed EDLM consists of three different deep learning algorithms (LSTM, GRU, and BLSTM) that are used to discover mutations in cholangiocarcinoma. All algorithms achieve above 95% Acc, as shown in Table 3, with an AUC value of 0.99, making this the most precise diagnosis of cholangiocarcinoma to date. Table 3 shows the results of the IST, SCT, and 10-FCVT in terms of Acc, Sn, Sp, and MCC; at present, these are the best techniques for early cholangiocarcinoma diagnosis. Future studies will build on this effort to identify other diseases, and ensemble deep learning models of other types will also be developed.