Cardiac Stroke Prediction Framework using Hybrid Optimization Algorithm under DNN

Abstract-Heart weakness and restricted blood flow into the cavities can cause a range of strokes, from mild to severe. Heart strokes are primarily caused by fat deposited on the artery walls. This deposit reduces blood intake and internally causes a pseudo vacuum of air bubbles, leading to a stroke that can be identified with high-end instrumentation. In this article, a detailed evaluation is carried out with a Hybrid Optimization Algorithm (HOA). In the proposed technique, the data are preprocessed using a label encoder and the missing values of the dataset are filled. The Whale Optimization Algorithm (WOA) and the Crow Search Algorithm (CSA) extract interconnected patterns and learning features with the support of a dedicated Deep Neural Network (DNN). The proposed HOA extracts features, and the results demonstrate an accuracy of 97.34%.


I. INTRODUCTION
A cardiac stroke occurs when the blood flow to the heart is blocked. A buildup of fat, saturated fat, and other substances in the blood vessels that supply the heart, known as plaque, is the most prevalent cause of obstruction. Plaque can break and generate a clot, blocking the blood flow. Some portions of the heart muscle can be injured or damaged if the blood flow is disrupted. Common symptoms of a cardiac stroke include pressure, nausea, heartburn, fatigue, and shortness of breath. Not all patients have the same symptoms or symptom intensity. Some people experience modest pain, while others experience severe agony [1]. Some persons have no signs or symptoms. For others, abrupt cardiac arrest may be the first symptom. The larger the number of symptoms a patient experiences, the more likely it is that the patient is suffering a heart attack. Although some heart attacks occur unexpectedly, many patients have warning signs and symptoms, such as recurrent chest pain that is provoked by exercise and eased by rest.
According to the World Heart Federation, by 2030 there will be more than 23 million heart disease-related deaths annually [2]. Heart-related diseases are the most common cause of death in the US according to the American Heart Association. In the US, over 605,000 new congestive heart failures and 200,000 recurrent congestive heart failures take place each year [3]. Predicting a stroke is commonly a cardiologist's task. These experts rely on medical information and on images from Computerized Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET) to validate the blockages. Still, due to the size of the population and the insufficient number of experts, the validation process is challenging and produces a high number of false-positive predictions. Cardiac strokes can therefore be validated with high-performance computational techniques that support experts in reaching a more exact prediction. The computational approach improves the prediction-to-validation ratio, as it deals with the processing of sensitive medical information [4,25]. Deep Neural Networks (DNNs) provide a reliable process for data mapping and digitalization-based learning. The learned datasets are pre-processed and synchronized in a timely manner to form reliable training datasets. These datasets can be used for primary filtering and for providing accurate decision support. The inclusion of deep learning datasets improves the prediction rate and the dependability ratio.
II. LITERATURE REVIEW
Cardiac strokes are also known as heart attacks, heart jamming, vascular blockages, etc. Predicting a heart stroke is a challenging endeavor. Fuzzy C-Means (FCM), an unsupervised classification technique, was used in [5] to predict early heart attacks from patient medical information. Data mining techniques were used to preprocess the information in the patient record, and an FCM classifier was used to classify the attributes. The effectiveness of the classifier was evaluated using data from 270 patients, and it had an accuracy of 92%. Medical professionals can use a cardiac stroke prediction system to forecast the heart disease status of patients based on their clinical data [6]. Heart disease was classified using an artificial neural network algorithm with 80% accuracy [6]. The algorithm considers 13 factors to assess the chance of developing heart disease. Classification algorithms such as Decision Trees, Naive Bayes, and Neural Networks were tested on a heart disease database in [7], with 100%, 99.62%, and 90.74% accuracy for Neural Networks, Decision Trees, and Naive Bayes, respectively. A Neural Network was merged into a Coactive Neuro-Fuzzy Inference System (CANFIS) in [8]. The presence of the disease was diagnosed with a fuzzy logic qualitative technique combined with a genetic algorithm. The CANFIS performance was assessed in terms of training results, with a mean square error of 0.000842. The system for predicting heart disease in [9] used ensemble deep learning and feature fusion methods. The information gain methodology reduced the computational burden and improved system performance by removing inappropriate and unnecessary components, reaching 98.5% accuracy.
The enhanced deep learning-aided convolutional neural network in [10] improved cardiac stroke prediction. It combines a multi-layer perceptron model with learning techniques and was executed on an Internet of Medical Things platform. It achieved a precision as high as 99.1%. The Hybrid Random Forest with a Linear Model (HRFLM) in [11] aimed to uncover relevant features using machine learning approaches, enhancing the accuracy of cardiac stroke estimation with various combinations of data and reaching an accuracy of 88.7%. To distinguish between heart disease patients and healthy people, the researchers in [12] employed a deep learning system based on Multiple Kernel Learning (MKL) with an Adaptive Neuro-Fuzzy Inference System (ANFIS). The ANFIS classifier uses the MKL approach to separate heart disease patients from healthy subjects. The system offers 98% sensitivity, 99% specificity, and 0.01% mean square error. The hybrid system in [13] employs a global optimization genetic method to initialize the neural network weights. The learning is faster, more consistent, and more accurate, with an accuracy of 89% in predicting the risk of heart disease.

III. METHODOLOGY
The dataset for the proposed system is taken from the open Cleveland database repository and covers 270 patients [14]. This multimodal dataset needed to be pre-processed before it could be used in the deep learning model. The missing values in the dataset were filled in during the pre-processing step. Using the Label Encoder approach, the data were converted to a numerical representation. A resampling technique was used to reduce data imbalances, and the standard scaler methodology was used to normalize the data to values ranging from 0 to 1. In the proposed system, we used 13 attributes for feature extraction: age, gender, chest pain type, blood pressure, cholesterol, fasting blood sugar, rest ECG, thal (3 = normal; 6 = fixed defect; 7 = reversible defect), exercise-induced angina, oldpeak (ST depression induced by exercise relative to rest), slope of the peak exercise ST segment, maximum heart rate, and number of major vessels colored by fluoroscopy. This normalized dataset contains both significant and insignificant features. The dataset was subjected to the Whale Optimization Algorithm (WOA) for feature selection. The structural threshold extraction is done using the Crow Search Algorithm (CSA). These bio-inspired meta-heuristic algorithms assisted in selecting the most important attributes from the dataset, so that the resulting predictions were as accurate as possible. The choice of hyperparameters affects the prediction results given by a Deep Neural Network (DNN) model, so the grid search approach is used to optimize the hyperparameters. The proposed methodology aims to provide a reliable and self-significant system for predicting strokes in early stages. The proposed method is termed the Hybrid Optimization Algorithm (HOA), with reference to pattern segmentation based on attributes and WOA. HOA applies the schemes of multiple information systems to validate and reconfigure the incoming datasets.
The architecture of HOA is presented in Figure 1.
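The pre-processing steps described above can be sketched in a few lines. This is only an illustrative, standard-library stand-in for the label encoding, mean filling, and 0-to-1 normalization; the helper names and the toy columns are hypothetical, and min-max scaling is assumed here because it is the rescaling that yields the stated 0-to-1 range.

```python
from statistics import mean


def label_encode(values):
    """Map each distinct category to an integer code (codes follow sorted order)."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values]


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy columns, not taken from the Cleveland dataset.
gender = ["male", "female", "male", "female"]
cholesterol = [233.0, None, 250.0, 204.0]

gender_codes = label_encode(gender)
cholesterol_filled = fill_missing_with_mean(cholesterol)
cholesterol_scaled = min_max_scale(cholesterol_filled)
```

In practice a library implementation of these steps would normally be preferred; the sketch only makes the order of operations explicit.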

A. Whale Optimization Algorithm
WOA has been reported to perform better than several recent meta-heuristic methods [15]. In WOA, a population of baleen whales searches for food in a multi-dimensional search space. The locations of the individuals represent the different decision variables. The distance between an individual whale and the food corresponds to the objective cost value. Three operational processes determine the time-dependent position of an individual whale: (1) shrinking encirclement of the prey, (2) bubble-net attacking, and (3) exploration for prey.

1) Encircling of the Prey
A baleen whale identifies the location of the prey and encircles it. The position of the optimum design in the search space is not known in advance. Hence, WOA treats the current best solution as the target prey, i.e., the candidate closest to the optimum position. Once the best agent is identified, the remaining agents update their positions relative to the best agent [16]. This behavior is depicted by:

$$\vec{F} = \left| \vec{D} \cdot \vec{X}^{*}(x) - \vec{X}(x) \right| \tag{1}$$

$$\vec{X}(x+1) = \vec{X}^{*}(x) - \vec{S} \cdot \vec{F} \tag{2}$$

where $x$ indicates the current iteration, $\vec{S}$ and $\vec{D}$ are coefficient vectors, $\vec{X}^{*}$ is the prey position vector, and $\vec{X}$ represents the position vector of a whale. The vectors $\vec{S}$ and $\vec{D}$ are computed as per the following equation:

$$\vec{S} = 2\vec{s} \cdot \vec{r}_{1} - \vec{s}, \qquad \vec{D} = 2\vec{r}_{2} \tag{3}$$

The value of $\vec{s}$ is linearly reduced from 2 to 0 over the iterations, and $\vec{r}_{1}$ and $\vec{r}_{2}$ are random vectors in [0, 1].
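As an illustration of the encircling step, the sketch below assumes the usual WOA form of the update: the new position is the prey position minus S times the distance F = |D·X* − X|, with S = 2·s·r1 − s and D = 2·r2 drawn per dimension. The function name `encircle_step` and the toy positions are hypothetical.

```python
import random


def encircle_step(whale, best, s, rng):
    """One encircling update of `whale` toward the best position `best`.

    s is the control value that decreases from 2 to 0 over the iterations.
    """
    new_position = []
    for x_i, best_i in zip(whale, best):
        r1, r2 = rng.random(), rng.random()
        S = 2.0 * s * r1 - s          # coefficient in [-s, s]
        D = 2.0 * r2                  # coefficient in [0, 2]
        F = abs(D * best_i - x_i)     # distance to the (scaled) prey position
        new_position.append(best_i - S * F)
    return new_position


rng = random.Random(0)
best = [0.0, 0.0]
whale = [4.0, -3.0]

# With s = 0 the coefficient S vanishes and the whale lands on the prey.
landed = encircle_step(whale, best, s=0.0, rng=rng)
# With s = 1 we have |S| <= 1, so the whale cannot move farther from a prey at the origin.
moved = encircle_step(whale, best, s=1.0, rng=rng)
```

The two calls show why reducing s tightens the encirclement: the smaller s is, the closer the updated position stays to the best agent.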

2) Bubble-net Attacking Method
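Two approaches are used to represent the bubble-net behavior of the baleen whales mathematically [17]. These approaches are:

a) Shrinking Encircling Mechanism
This behavior is obtained by linearly decreasing the value of $\vec{s}$, which also narrows the fluctuation range of the coefficient vector $\vec{S}$.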

b) Spiral Updating Position
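This approach first computes the distance between the whale located at (A, B) and the prey located at (A*, B*). The helix-shaped movement is then represented using the spiral equation (4):

$$\vec{X}(x+1) = \vec{F}' \cdot e^{ya} \cdot \cos(2\pi a) + \vec{X}^{*}(x) \tag{4}$$

where $\vec{F}' = \left| \vec{X}^{*}(x) - \vec{X}(x) \right|$ indicates the distance between the whale and the best solution obtained so far, $y$ is a constant describing the logarithmic spiral shape, and $a$ is a random value in [-1, 1].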
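The spiral move can be sketched as follows, assuming the usual logarithmic-spiral form in which the distance to the best solution is scaled by e^(y·a)·cos(2πa) and added to the best position. The function name `spiral_step` and the sample values are hypothetical.

```python
import math
import random


def spiral_step(whale, best, y, rng):
    """Move `whale` along a logarithmic spiral around `best`.

    y is the spiral-shape constant; a is drawn uniformly from [-1, 1].
    """
    a = rng.uniform(-1.0, 1.0)
    factor = math.exp(y * a) * math.cos(2.0 * math.pi * a)
    return [abs(b - w) * factor + b for w, b in zip(whale, best)]


rng = random.Random(1)
# A whale already sitting on the prey stays there, because the distance is zero.
unchanged = spiral_step([1.0, 2.0], [1.0, 2.0], y=1.0, rng=rng)
# Otherwise the displacement from the prey is bounded by distance * e^|y|.
new_position = spiral_step([4.0, -3.0], [0.0, 0.0], y=1.0, rng=rng)
```

Because the cosine term changes sign with a, the whale can end up on either side of the prey, which is what produces the helix-shaped approach.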

3) The Search Process for the Prey
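The vector $\vec{S}$ is also varied to search for prey, in which case the whales search randomly according to each other's positions [18]. The mathematical model is:

$$\vec{F} = \left| \vec{D} \cdot \vec{X}_{rand} - \vec{X} \right| \tag{5}$$

$$\vec{X}(x+1) = \vec{X}_{rand} - \vec{S} \cdot \vec{F} \tag{6}$$

where $\vec{X}_{rand}$ is the position vector of a whale chosen at random from the current population.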
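Putting the three moves together, a minimal WOA loop might look like the sketch below. Several choices are assumptions borrowed from the commonly used WOA formulation rather than details given in this paper: a 50/50 choice between the spiral move and the encircling/search move, switching to a random whale when |S| ≥ 1, one scalar S and D per whale, and a toy sphere objective. All function and parameter names are hypothetical.

```python
import math
import random


def sphere(position):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in position)


def woa_minimize(objective, dim=3, n_whales=10, n_iter=50, y=1.0, seed=0):
    rng = random.Random(seed)
    whales = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    initial_best_cost = objective(best)
    for it in range(n_iter):
        s = 2.0 * (1.0 - it / n_iter)        # decreases linearly from 2 toward 0
        for i, whale in enumerate(whales):
            S = 2.0 * s * rng.random() - s   # one scalar coefficient per whale
            D = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |S| >= 1: explore around a random whale; otherwise encircle the best.
                target = rng.choice(whales) if abs(S) >= 1.0 else best
                new = [t - S * abs(D * t - w) for w, t in zip(whale, target)]
            else:
                # Spiral (bubble-net) move around the best position.
                a = rng.uniform(-1.0, 1.0)
                k = math.exp(y * a) * math.cos(2.0 * math.pi * a)
                new = [abs(b - w) * k + b for w, b in zip(whale, best)]
            whales[i] = new
            if objective(new) < objective(best):
                best = new[:]                # keep the best position found so far
    return best, initial_best_cost


best_position, initial_cost = woa_minimize(sphere)
final_cost = sphere(best_position)
```

Since the best position is replaced only when a strictly better one is found, `final_cost` can never exceed `initial_cost`; how close it gets to the optimum depends on the population size and the iteration budget.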

B. Crow Search Algorithm
The CSA is a population-based method built on crow behavior and social interaction. Crows are smart birds that live in clusters and have a large brain compared to their body size. They can hide food and memorize its location, and they can retrieve the stashed food even after several months [19]. The CSA imitates this behavior of hiding and recovering food. The flock consists of N individuals (crows), each of which is m-dimensional, with m being the dimension of the problem. The position $X_{x,y}$ of crow x at a specific iteration y is described as:

$$X_{x,y} = \left[ X_{x,y}^{1}, X_{x,y}^{2}, \ldots, X_{x,y}^{m} \right] \tag{7}$$

where $x = 1, 2, \ldots, N$, $y = 1, 2, \ldots, y_{max}$, and $y_{max}$ is the maximum number of iterations. Every crow remembers the location $N_{x,y}$ where it has hidden its food up to the present iteration.
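The position and memory bookkeeping described above can be illustrated with a small CSA loop. The update rule used here (follow another crow's remembered location unless that crow is "aware", in which case jump to a random position) and the parameters `flight_length` and `awareness_prob` come from the commonly used CSA formulation and are assumptions, since they are not specified in this section. The sphere objective is a toy stand-in.

```python
import random


def sphere(position):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in position)


def csa_minimize(objective, dim=2, n_crows=8, n_iter=40, flight_length=2.0,
                 awareness_prob=0.1, bounds=(-5.0, 5.0), seed=1):
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_crows)]
    memory = [p[:] for p in positions]       # each crow's remembered hiding place
    initial_best_cost = min(objective(m) for m in memory)
    for _ in range(n_iter):
        for x in range(n_crows):
            followed = rng.randrange(n_crows)
            if rng.random() >= awareness_prob:
                # The followed crow is unaware: move toward its remembered location.
                r = rng.random()
                candidate = [p + r * flight_length * (m - p)
                             for p, m in zip(positions[x], memory[followed])]
            else:
                # The followed crow is aware: it fools the follower to a random place.
                candidate = [rng.uniform(lo, hi) for _ in range(dim)]
            if all(lo <= v <= hi for v in candidate):   # accept feasible moves only
                positions[x] = candidate
                if objective(candidate) < objective(memory[x]):
                    memory[x] = candidate[:]
    best = min(memory, key=objective)
    return best, initial_best_cost


best_position, initial_cost = csa_minimize(sphere)
final_cost = sphere(best_position)
```

Each crow's memory is overwritten only by a better position, so the best remembered cost is non-increasing over the iterations.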

C. Deep Neural Network
A DNN is a complex neural network with more than two layers. These networks employ sophisticated mathematical models to process data. A neural network mimics the activity of the human brain [20]. The network is designed to simulate pattern recognition techniques, with multiple layers of network connections simulating input processing. The basic framework of a DNN consists of an input layer, an output layer, and hidden layers in between. Each layer performs its own sorting and ranking, resulting in a feature hierarchy [21]. This advanced neural network aids in processing unstructured and unlabeled data, where artificial intelligence is utilized to classify and organize data outside of standard input-output protocols. The use of DNNs in designing health care applications based on Internet of Things technology is studied in [27]. The system is supported by trained datasets for mapping and operation management. These datasets are regarded as reliable sources of information for prediction and validation. According to regional information scheduling, the processing paradigm depends on repeated updating and synchronization of the trained datasets. The DNN framework supports the validation process under HOA. The primary layer of the DNN handles input alignment and the filling of missing parameters. The middle layers of the DNN build a series of interconnected information systems that provide reliable decision support. The overall system operates under HOA for improved performance estimation and processing.
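The layered structure described above (input layer, hidden layers, output layer) can be illustrated with a plain-Python forward pass. The layer sizes 13-8-4-1, the ReLU and sigmoid activations, and the random weights are assumptions chosen for illustration; they are not the architecture or the trained parameters of the proposed model.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Hidden layers use ReLU; the single output unit uses a sigmoid."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(hidden, weights, biases)[0])


rng = random.Random(42)
layer_sizes = [13, 8, 4, 1]   # 13 input attributes, two hidden layers, one output
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
patient = [rng.random() for _ in range(13)]   # hypothetical normalized attributes
probability = forward(patient, layers)
predicted_class = int(probability >= 0.5)
```

With untrained weights the output is meaningless as a prediction; the sketch only shows how each layer transforms the output of the previous one.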
IV. EXPERIMENTAL SETUP AND RESULTS
This section discusses the experimental findings of the proposed work. Google Colab was used to implement the proposed task, and Python 3.7 was used to train and evaluate the proposed model on the cardiac stroke dataset. Missing values were replaced with the attribute mean in the pre-processed, balanced dataset. The histogram of the dataset, shown in Figure 2, gives the frequency of occurrence of the attribute values within continuous, predetermined intervals. The x axis represents the range of values, with a different scale for each attribute, and the y axis represents the frequency of the values of each attribute. A heat map is a two-dimensional data visualization in which colors represent values, which makes the data easier to understand. Complex datasets can be understood with more sophisticated heat maps. A heat map is informative when the quantity of interest lies at the intersection of a row and a column. The data are more concentrated in the categorical attributes than in the others. Figure 3 uses a heat map to depict the correlation between the dataset attributes. Both axes represent the attributes of the dataset, and the value in each cell represents the correlation strength.
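The values behind such a correlation heat map can be computed as sketched below. The Pearson coefficient is assumed as the correlation measure, and the three toy columns are hypothetical values, not rows of the dataset in [14].

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation between attributes a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values for three attributes.
columns = {
    "age": [40.0, 50.0, 60.0, 70.0],
    "max_heart_rate": [180.0, 170.0, 160.0, 150.0],
    "cholesterol": [200.0, 240.0, 210.0, 260.0],
}
matrix = correlation_matrix(columns)
```

Each cell of the heat map corresponds to one entry `matrix[a][b]`; the diagonal is 1 and the matrix is symmetric.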
WOA and CSA are used to extract interconnected patterns and features. Deep learning algorithms can work only with numerical data, so the Label Encoder scheme is used in this study to turn non-numerical data into numerical data. The next stage is to extract features using the hybrid optimization technique. Extracting the essential features is critical for training a deep learning model. By selecting the attributes that positively impact classification, feature selection minimizes training time and improves performance.
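One common way to connect a continuous optimizer such as WOA or CSA to feature selection is to threshold each coordinate of a candidate position into a keep/drop decision. The sketch below assumes that encoding, a 0.5 threshold, and a small per-feature penalty in the fitness; none of these are stated in the paper, and the attribute identifiers are hypothetical spellings of the 13 attributes.

```python
FEATURE_NAMES = [
    "age", "gender", "chest_pain_type", "blood_pressure", "cholesterol",
    "fasting_blood_sugar", "rest_ecg", "thal", "exercise_angina", "oldpeak",
    "st_slope", "max_heart_rate", "major_vessels",
]


def position_to_mask(position, threshold=0.5):
    """Turn a continuous position in [0, 1]^13 into a boolean feature mask."""
    return [value > threshold for value in position]


def selected_names(mask, names=FEATURE_NAMES):
    """Names of the attributes kept by the mask."""
    return [name for name, keep in zip(names, mask) if keep]


def fitness(mask, score, penalty=0.01):
    """Wrapper-style fitness (assumption): model score minus a cost per kept feature."""
    return score - penalty * sum(mask)


# Hypothetical candidate position produced by an optimizer.
position = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1, 0.3, 0.6, 0.55, 0.95, 0.45, 0.85, 0.05]
mask = position_to_mask(position)
kept = selected_names(mask)
candidate_fitness = fitness(mask, score=0.9)
```

The penalty term expresses the trade-off noted above: among subsets with similar scores, the smaller one trains faster and is preferred.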
In this study, a hybrid optimization strategy was employed because it has a rapid convergence rate and avoids getting stuck in local minima, which aids in selecting the best features. 70% of the dataset was used to train the model, and the proposed model was tested and validated with the remaining 30%. The accuracy of HOA is significantly higher than that of previous approaches and techniques for predicting heart strokes and failures. The proposed HOA achieved an accuracy of 97.34%, surpassing techniques such as WOA, the Ant-Colony Optimization (ACO) algorithm, and DNNs, as presented in Figure 4. The method also computed the stroke prediction rate and the validation rate with regard to secure computing.
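The 70/30 split and the accuracy metric can be sketched as follows. The shuffling seed, the helper names, and the toy labels are hypothetical; the sketch does not reproduce the reported result.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


rows = list(range(270))                 # stand-in for 270 patient records
train_rows, test_rows = train_test_split(rows)
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

For 270 records this split gives 189 training and 81 test records, and the toy labels yield an accuracy of 0.75.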

V. CONCLUSIONS AND FUTURE WORK
A deep learning-based framework helps to classify the heart stroke dataset and provides valuable insights. The dataset used is the publicly available stroke dataset in [14]. The data were pre-processed by filling in the missing values and were converted to numerical format with the Label Encoder technique. The transformed data were resampled to treat the imbalances and were normalized using the standard scaler technique. The proposed HOA is based on the schemes of WOA, CSA, and DNN, combined in an observed ecosystem of information validation. The proposed HOA exhibited a higher prediction rate than previous techniques and approaches. The trained dataset of HOA is iteratively updated so that the data threshold acts as a self-learning framework that supports the neural network model.