Ensemble Feed-Forward Neural Network and Support Vector Machine for Prediction of Multiclass Malaria Infection

Globally, recent research has focused on developing appropriate and robust algorithms to support a healthcare system that is versatile and accurate. Existing malaria models are plagued by a low rate of convergence, overfitting, limited generalization due to their restriction to binary case prediction, and proneness to local minimum errors in finding reliable testing output, owing to the complexity of features in the feature space, which is black box in nature. This study adopted a stacking method of heterogeneous Artificial Neural Network (ANN) and Support Vector Machine (SVM) algorithms to predict multiclass, symptomatic, and climatic malaria infection. ANN produced 48.33 percent accuracy, 60.61 percent sensitivity, and 45.58 percent specificity. SVM with a Gaussian kernel function gave better results of 85.60 percent accuracy, 84.06 percent sensitivity, and 86.09 percent specificity. Consequently, to improve prediction performance, a stacking method was introduced to ensemble SVM with ANN. The proposed ensemble malaria model was tuned on different thresholds. At a threshold value of 0.60, the ensemble model gave an optimum accuracy of 99.86 percent, sensitivity of 100 percent, specificity of 98.68 percent, and mean square error of 0.14. The experimental results indicated that stacked multiple classifiers produced better results than a single model. This research demonstrated the efficiency of a heterogeneous stacking ensemble model in capturing the effects of climatic variations on multiclass malaria infection classification. Furthermore, the model reduced the complexity, overfitting, low rate of convergence, and proneness to local minimum error of multiclass malaria infection prediction in comparison to previous related models.


INTRODUCTION
One major health problem among humans, especially in the tropical region, is malaria infection. People are diagnosed with malaria at least three times per year. According to World Health Organization statistical records, most malaria cases emanated from the African region. About 93 percent, or 213,000,000, malaria cases were reported there in 2018 with 405,000 deaths, as compared to 3.4 percent of malaria cases in the Southeast Asia region and 2.1 percent in the Eastern Mediterranean region. Globally, six countries, namely Uganda, Democratic Republic of the Congo, Nigeria, Côte d'Ivoire, Niger, and Mozambique, have prevalent cases of malaria (Teboh-ewungkem & Ngwa, 2020; Thornton, 2020; WHO, 2019). Malaria is a parasitic infection transmitted by a vector known as the Anopheles mosquito. The vector carries a parasite of the Plasmodium species, which invades the red blood cells and infects the liver (Mueller et al., 2009; Vaughan & Kappe, 2017).
Recent research has focused on the dynamics and complexities of malaria parasite transmission. Research on the risk of asymptomatic and symptomatic influences on malaria infection is still valid today (Bannister & Mitchell, 2003; Depinay et al., 2004). Existing literature examined the binary nature of malaria cases; however, the non-linear system involves multiclass cases. Malaria parasites can be clinically diagnosed in counts of low, mild, or high. Occasionally, different diagnosed cases may even overlap, and medical problems need knowledge-intensive program analysis and generalization. The domain's complex networks of problems need to be ensembled to devise an individualized solution (Randolph, 2008). Consequently, big data on malaria cases are being recorded yearly, and there are difficulties in analyzing them and making inferences to reduce their complex nature (Keeling & Rohani, 2011). On several occasions, the system is plagued with problems of local minimum response and overfitting resulting from the enormous number of parameters to fix. Hence, the call for machine learning models arises to find the knowledge-intensive mechanism and break the complexity of data interpretation to solve the medical problems at hand (Hegazy et al., 2013). Machine learning methods address the problems of predicting values, classifying patterns, filtering data, structuring data, and extracting valuable features from data when faced with many irrelevant or noisy features. They also extract associations among data components, model the data and generate systems that are less error-prone, as well as integrate the system with different sensors using classification and inferences (Maina et al., 2017; Namdev et al., 2015).
Feature extraction obtains the best features and suitable information to handle a given task in solving a problem (Mizher et al., 2019). It is a transformational approach that maps the input feature space into a few subspaces that retain an accurate feature description. In machine learning, feature extraction is active in removing outliers and redundant data, thus improving learning accuracy and reducing complexity. Over the decades, dimensionality reduction has been a challenging issue in handling feature extraction and feature selection to obtain a robust model. Feature selection and feature extraction algorithms have been proposed to enrich the classification of groups of patterns, signals, and features to make inferences about a particular problem in a specific domain of interest and enhance prediction accuracy (Khalid et al., 2014). The classifiers involved are the Convolutional Neural Network (CNN) (Roy et al., 2018; Su, 2020; Triwijoyo, 2017), Multiple Linear Regression (Priambodo & Ahmad, 2018), Decision Tree (Ibrahim et al., 2016), Feed Forward Neural Network (FNN) (Priambodo & Ahmad, 2018), Back Propagation Neural Network (BPNN) (Hairuddin et al., 2020), k-Nearest Neighbor (k-NN) (Gupta & Mittal, 2018), Support Vector Machine (SVM) (Chaudhari & Agrawal, 2015), and Bayes (Ganesan et al., 2010).
Ensemble learning is a machine learning paradigm that employs multiple learning methods to solve machine learning problems such as missing features, feature selection, error correlation, confidence estimation, and class-imbalanced data in diverse real-world applications (Oza & Russell, 2000; Yang et al., 2016; Zhou, 2009). It constructs and combines a set of hypotheses for the training data by employing multiple algorithms and bridging the gaps between their weaknesses and strengths (Moayedi & Jahed Armaghani, 2018). Ensemble learning exhibits high prediction and classification performance as compared to single learning models (Kwon & Kwak, 2019). The key idea to improve performance is to modify the training datasets, build classifiers on these n training sets, and combine them into the final decision rule. Ensemble learning methods comprise bagging, boosting, and stacking. Their generalization is attractive as compared to a single learning model (Brown, 2010; Samat et al., 2014).
Yearly, there are high malaria incidence cases that affect both young and old citizens as compared to other infections, and several difficulties have arisen in predicting its occurrence and analyzing its possible threats. The need to develop an alternative fast healthcare solution that employs the unique features of several models, which complement the weaknesses and strengths of one another, is of great importance. Medical personnel, patients, and other stakeholders will have greater opportunities to perform malaria severity predictions. The rest of this paper is organized as follows: Section II presents a review of the performances and limitations of related previous models, Section III introduces the methodology and materials used for the Support Vector Machine and Artificial Neural Network (SVM_ANN) Model, Section IV gives the Ensemble Stacked SVM_ANN Model result, and Section V discusses the model result, summary of the model, strength, limitation of the study, and future direction. Finally, Section VI sums up the paper with concluding remarks.

RELATED WORKS
Conventional microscopy is sometimes inefficient for diagnosing infections, and overlapping results make computation difficult. Classification algorithms, namely Neural Network, SVM, and Naïve Bayes, have been employed with Discrete Wavelet Transform (DWT) and Gray-Level Co-Occurrence Matrix (GLCM) for feature extraction, and SVM has proven to be outstanding. Several studies justified the effectiveness of SVM in handling binary class problems; in contrast, this study only incorporates a classifier to check the level of parasitemia in red blood cells (RBCs), which is multiclass in nature (Chaudhari & Agrawal, 2015).
Various machine learning algorithms, such as k-Nearest Neighbor (k-NN), Linear Discriminant Classification (LDC), and Logistic Regression (LR), were combined to gain physical features that can differentiate among cells easily and thus increase diagnostic capability. From the results, it was discovered that for late trophozoites, LDC gave the highest accuracy of 99.7 percent in comparison to the nearest-neighbor classifier (NNC) with 99.5 percent accuracy and LR with 99.1 percent accuracy in detecting the schizont stages and differentiating them from uninfected RBCs. Furthermore, for early detection of trophozoites, LDC gave the best accuracy of 98 percent and specificity of 99.8 percent, but was weak in sensitivity, at 45.0 percent to 66.8 percent. The major challenge in the research is that early trophozoites are often mistaken for late trophozoites. Therefore, better detection algorithms are needed to back up expert analysis and the Giemsa staining experiment to be conducted (Park et al., 2016).
Historically, ANN has been widely used as a machine learning tool for the prediction of diseases, especially cancer and malaria (Arulampalam & Bouzerdoum, 2003). More recently, SVM has been shown to work better than ANN for binary classification problems (Zacarias & Bostrom, 2013). For simple representation, Decision Tree is widely used but usually involves large training sets. In a real-life system like the healthcare system, an accurate prediction model is needed. A previous study focused on coupling SVM with the Firefly Algorithm (FFA) to detect malaria cases. In the study, FFA was employed to choose appropriate parameters for SVM. The proposed method was applied to the areas of Jodhpur and Bikaner in India, where malaria transmission was unstable. The result of the study indicated that SVM-FFA worked better in comparison to SVM, ANN, the Auto-Regressive Moving Average method, and other existing models (Ch et al., 2014).
Existing literature considered the behavioral features of Counts of Chromatin Dots (NCD), Infected Red Blood Cells Size (RBCS), Location of Chromatin (LC), Structure of Parasite (SP), RBCT, and Counts of Parasite/RBC (CRBC). No consideration was given to the symptomatic characteristics of infected red blood cells (Di Ruberto et al., 2000). Several studies revealed that climatic variations also had a great influence on malaria incidence (Zhou et al., 2004). The literature also disclosed that most existing models were not subjected to several performance measures to ascertain their level of effectiveness and robustness. This study developed a model that addressed the multiclass nature of parasitemia in thick red blood cells and was measured with several performance metrics: sensitivity, specificity, accuracy, and mean square error (Barros et al., 2010).
An approach to handle multitask multiclass SVM on the basis of regularization functional minimization was proposed. Multiclass problems having a quadratic objective function were cast as a constrained optimization problem to learn directly from the data. Two different kinds of learning took place: label-compatible and label-incompatible multitask learning. Choosing appropriate kernels helps to extend the linear multitask learning approach to non-linear cases (Mohammed et al., 2020). In several experiments compared against other multitask learning models, this approach proved to be good at solving multitask multiclass problems (Ji & Sun, 2013). Accurate identification of malaria parasite mitochondrial proteins helps to find appropriate drugs to combat the infection, and a sequence-based approach was adopted for the detection of mitochondrial proteins in the malaria parasite. Beforehand, to discretely formulate the protein sequences, the adjoining dipeptide composition was extended to the g-gap dipeptide composition. Its optimal features were selected with an incremental feature selection approach and analyzed with Analysis of Variance (ANOVA). The result of the evaluation indicated 97.1 percent accuracy, with 101 optimal 5-gap dipeptides. This method was shown to be better than existing methods (Ding & Li, 2015).
A co-infection predictive symptom-based system was developed for malaria and typhoid infections. The research aim was to develop a computer-based system that would help in medical diagnosis, especially in areas that lacked facilities and medical experts. SVM was employed for the co-infection classification of 20 patient malaria cases that were collected as data samples. The proposed system was 80 percent accurate in classifying malaria and 60 percent accurate in classifying typhoid. An accuracy of 90 percent was attained for typhoid and malaria co-infection, although the dataset was relatively small. The limitation of the study was that several performance metrics still needed to be employed to ascertain the correctness of the system. Moreover, the effects of global thresholding and climatic conditions were not considered (Aminu et al., 2016).
A study was carried out on the simultaneous effects of temperature and rainfall on the dynamics of the mosquito population and malaria incidence cases. The result revealed that temperature was a stronger determinant of malaria outbreaks in a vulnerable population (Parham & Michael, 2010). The statistical modeling tools of Long-Short-Term Memory (LSTM), Auto-Regressive Integrated Moving Average (ARIMA), Back Propagation-ANN (BP-ANN), as well as Seasonal and Trend Decomposition using Loess and ARIMA (STL+ARIMA) were previously adopted by researchers between 2011 and 2017 to predict the influence of climatic variations on malaria infection. In a previous study, a stacking architecture was proposed to combine different algorithms. Gradient Boosting Regression Tree (GBRT) was employed to combine four algorithms, and the model prediction was improved by the stacking structure. The performance metrics of mean absolute deviation (MAD), root mean square error (RMSE), and mean absolute scaled error (MASE) were employed to test the model's predictive power. Initially, the RMSE values of the existing four models were 13.176, 14.543, 9.571, and 7.208; the MASE values were 0.469, 0.472, 0.296, and 0.2666; and the MAD values were 6.403, 7.658, 5.871, and 5.691. The results indicated that the MAD, RMSE, and MASE values of GBRT decreased to 4.625, 6.810, and 0.224, respectively (Wang et al., 2019).
A study was conducted on the analysis of hematological predictors of malaria infection in the Ashanti region of Ghana with the Logistic Regression model. The study revealed that the skills needed for microscopic examination of peripheral blood film were often lacking among laboratory scientists. A binary logistic model was fitted, and it identified age, hemoglobin, platelet, and lymphocyte as the most significant asymptomatic predictors. The result from the study indicated 77.4 percent sensitivity and 75.7 percent specificity with a positive predictive value (PPV) and negative predictive value (NPV) of 52.72 percent and 90.51 percent, respectively (Paintsil et al., 2019). Another study was conducted in the southern lowland areas of Ethiopia from July to September 2016. The study focused on investigating malaria severity in several regions of the study sites. 90 villages were randomly selected from five villages. A statistical significance of P value ≤ 0.05 was applied as a benchmark. The results of the study indicated that 2/5 of the independent clusters had higher risks. Over ¼ of febrile cases were confirmed positive; the causative agent of over 2/4 of the positive cases was Plasmodium falciparum, and that of the remaining ¼ was Plasmodium vivax. The study concluded that sufficient malaria intervention programs should be conducted in areas with such critical conditions (Esayas et al., 2020).
A survey of asymptomatic malaria and mosquito vectors was conducted in the China-Laos border region to investigate the epidemic trend of malaria infection. Nested polymerase chain reaction (PCR) and microscopy examinations were conducted on blood samples of 354 local residents aged 1 to 72 years at Sankang village in 2016. Furthermore, 2,430 adult mosquitoes were trapped in Muang Khua district from June to August of the same year. The results of the mosquito surveillance indicated that Culex and Anopheles were the predominant vectors. The predominant species among seven groups of Anopheles was Anopheles sinensis, thus indicating that the China-Laos border had the largest malaria epidemic condition (Zhang et al., 2020).
In recent decades, researchers have sought robust and efficient machine learning methods to arrive at definite conclusions from unreadable, ambiguous data. Ensemble methods emerged and gained significant attention in the scientific community. Machine learning ensemble methods combine multiple learning algorithms to attain better predictive performance than could be obtained from single base learning algorithms. Combining multiple learning models has been theoretically and experimentally shown to provide significantly better performance than their single base learners. In the literature, ensemble learning algorithms constitute a dominant and state-of-the-art approach for obtaining maximum performance. Ensemble methods have been applied to a variety of real-life problems ranging from face and emotion recognition through text classification and medical diagnosis to financial forecasting. Future research may exploit ensemble learning to improve prediction accuracy and machine learning readability and to enhance model reliability (Pintelas & Livieris, 2020).
An ensemble machine learning model was proposed for the prediction of artemisinin resistance in malaria, owing to its exponential increase in many areas of Sub-Saharan Africa and Southeast Asia, and in Cambodia in the late 2000s. Recent research is exploring the underlying mechanisms behind artemisinin resistance by transforming isolate data and handling tens of thousands of variables with machine learning models. The Scikit-learn package with Gradient Boosting, Random Forest, Decision Tree, Lasso Lars, Elastic Net, Light Gradient Boosting Machine (LightGBM), Stochastic Gradient Descent, and Extreme Random Tree was employed with various scaling methods, including Principal Component Analysis, Min/Max Scaler, Wrapper, Maximum Absolute Scaler, Robust Scaler, Sparse Normalizer, Truncated Singular Value Decomposition Wrapper, and Standard Scale Wrapper. A recent study aimed at accurately predicting the artemisinin resistance levels of Plasmodium falciparum isolates, as quantified by the IC50, and also at predicting the clearance rate of malaria parasite isolates from their in vitro transcriptional profiles. After 498 individual models were trained, two ensemble methods (voting and stacking) were built with the Caruana ensemble selection algorithm as the model selection method. The result of the study indicated that the voting ensemble model was the best model, with the lowest normalized RMSE of 0.1228 and a mean absolute percentage error (MAPE) of 24.27 percent. This implied that the voting ensemble model accurately predicted IC50 in malaria isolates (Ford & Janies, 2019).
A performance evaluation of deep neural ensembles for malaria parasite detection in thin-blood smear images was conducted in 2019, motivated by the burden of disease diagnosis and the adverse effect of inter- and intra-observer variability, mainly in large-scale screening under resource-constrained settings of microscopic thick/thin-film blood examination. The Convolutional Neural Network (CNN) is a deep learning algorithm with an architecture for image recognition, but it is plagued with high variance and sometimes overfits due to its sensitivity to training data fluctuations. The study aimed to improve robustness and generalization and reduce model variance by employing ensemble algorithms to detect parasitized cells in thin-blood smear images. Various cross validations were conducted to prevent data leakage into the validation set and reduce generalization error. Then, the models were evaluated with accuracy, mean squared error (MSE), area under the receiver operating characteristic (ROC) curve (AUC), F-score, precision, and Matthews correlation coefficient (MCC). The result of the study indicated that the ensemble model constructed with VGG-19 and SqueezeNet performed better than the state-of-the-art models on several performance metrics (Rajaraman et al., 2019).
An ensemble framework for classification of malaria disease was proposed due to the challenge of having prevalence of data with noninfected cases as compared to infected cases. Consequently, the major concern was to develop a model-based decision support system that could handle unbalanced datasets relatively well and give accurate prediction. To overcome the aforementioned problem, ensemble methods of boosting, bagging, and voting algorithms that could handle minority samples were proposed. In the study, a comparative analysis on accurately classifying imbalanced and balanced malaria disease datasets with AdaBoost, Random Forest, Multilayer Perceptron (MLP), and Linear Discriminant Analysis (LDA) classifiers was conducted. The experimental result indicated that the Random Forest algorithm showed outstanding performance for the classification of imbalanced malaria disease (Sajana & Narasingarao, 2018b).
A comparative study on imbalanced malaria disease diagnosis using ensemble machine learning algorithms was conducted because malaria infection was prevalent majorly in non-urban areas. In the study, a skewed distribution of data was collected with five positive cases and 160 negative cases from a private clinic where 87 were neonatal patients. To balance the dataset, the Synthetic Minority Oversampling Technique (SMOTE) algorithm was employed. Afterward, various classifiers such as Decision Tree using C4.5, Naive Bayesian, and Radial Basis Function (RBF) Network carried out the classification. A comparative study on the research indicated that RBF Network had the highest classification accuracy of 98.9 percent, Bayesian had 94.7 percent, and Decision Tree 92.7 percent (Sajana & Narasingarao, 2018a).
Despite the great predictive ability of ensemble learning, some vital issues remain unaddressed. Important ones include which factors affect the accuracy of an ensemble, to what extent they do so, and the challenges of evaluating the relationships among the features of the domain of interest. The factors to be studied include the accuracy of the individual models, the diversity among the individual models in an ensemble, the decision-making strategy, and the number of members used for constructing an ensemble. Conceptual and theoretical analyses of these factors and the possible relationships between them have been presented, together with experiments on benchmark datasets and some typical results (Wang, 2008).
This study explores the dynamics of existing ensemble malaria models and their drawbacks. Existing models focused on the morphological (asymptomatic) factors of malaria incidence cases and on modeling binary cases of malaria incidence, and they suffered from overfitting problems and proneness to local minimum error. Consequently, in this study, consideration is given to the effects of symptomatic factors and climatic variation factors on malaria incidence cases, and to the modeling of multiclass cases of malaria incidence, which is a vital context. A feature selection method incorporated with an algorithm under unique kernels to produce optimal features for prediction is introduced.

METHODOLOGY
The demand for intelligent, knowledge-based systems that go beyond intuition is vital to medical practitioners (Djam et al., 2011; Oguntimilehin & Abiola, 2015). Prediction applies mathematical, statistical, and machine learning models (Zinszer et al., 2015). This study aimed at multiclass symptomatic and climatic-based malaria infection prediction. Sampled malaria patient laboratory test results, obtained with Giemsa staining observed under a microscope, and the corresponding monthly climatic readings served as input variables to the model. The observed features were preprocessed with the Min_Max, Divide by Maximum, and Standardization approaches. The Divide by Maximum approach outperformed the other preprocessing methods. The choice of machine learning algorithm to solve a problem always depends on the size, quality, and nature of the data (Djam et al., 2011). After a critical review and consideration of the strengths and weaknesses of the most commonly adopted machine learning techniques for ensemble learning, SVM and ANN were chosen, as depicted in Figure 1 (Abisoye & Jimoh, 2017).
Initially, ANN was adopted to train on the malaria features, and a global search was conducted for the optimum threshold that produced good results. Nevertheless, after several tests, it produced inaccurate results. Then, SVM, whose performance depends on the choice of appropriate kernel functions, was employed. The preprocessed data was analyzed in a Microsoft Excel worksheet and simulated with libSVM in MATLAB 2015a. Given a large number of features, SVM with the One_Versus_All (OvA) algorithm was employed to handle multiclass problems and extract the instances that exerted the highest predictive weight while maintaining their unique class values. These optimum feature instances lay on the hyperplane and served as the support vectors; however, SVM also did not give accurate and expected results. Then, an ANN classifier stacked with the SVM optimal features was proposed to classify the features relatively well into their respective groups in the feature space.

Population Sample and Sampling Procedure
Some non-negative variables were introduced to solve non-linear problems. ξi ≥ 0 is a non-negative variable introduced to the constraints in Equation 1, which is then modified to Equation 2, where y is the target variable, w is the exerted weights on the network, xi is the input feature in the input space X, and b is the bias variable. The Lagrangian theory is employed, and the Lagrangian is minimized with respect to w, b, and ξ, and maximized with respect to α and β.
Thus, the dual problem is given by Equation 3, with α as the Lagrange multiplier. The Lagrangian M is minimized with respect to w, b, and ξ, whereby C stands for a regularization constant to reduce overfitting.
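For reference, the textbook soft-margin SVM derivation that uses the same symbols (w, b, ξ, α, β, C, and the Lagrangian M) can be sketched as follows. This is the standard form, given under the assumption that the study followed the usual derivation; the index n for the number of training cases is introduced here for illustration, and the numbering is not meant to match Equations 1 to 3.

```latex
% Hard-margin constraint and its relaxation with slack variables \xi_i
y_i (w \cdot x_i + b) \ge 1
\quad\longrightarrow\quad
y_i (w \cdot x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0

% Primal problem: C trades margin width against slack (regularization)
\min_{w,\, b,\, \xi} \;\; \tfrac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i

% Lagrangian with multipliers \alpha_i \ge 0 and \beta_i \ge 0
M(w, b, \xi, \alpha, \beta) =
  \tfrac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
  - \sum_{i=1}^{n} \alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 + \xi_i \bigr]
  - \sum_{i=1}^{n} \beta_i \xi_i

% Stationarity with respect to w, b, and \xi
w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad
\sum_{i=1}^{n} \alpha_i y_i = 0, \qquad
C - \alpha_i - \beta_i = 0

% Dual problem obtained by substituting the stationarity conditions
\max_{\alpha} \;\; \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0
```

Replacing the inner product with a kernel function such as the Gaussian kernel gives the non-linear case.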

Data Description
In this study, a real malaria incidence dataset was obtained from General  The documented laboratory test results, known as malaria parasite count (MP count), with symptoms from January 2012 to December 2015 served as input variables. A total of 1,200 malaria cases were documented and analyzed for the training, testing, and validation phases.

Normalization and Multiclass Encoding
To normalize the malaria features, the unitary method and feature scaling in Equations 12 and 13 were adopted to standardize the ranges of the data features. Missing data features, such as no rainfall due to seasonal changes, were assigned zero. Table 2 presents the binary encoding of the threat classes of malaria features. OPT 1 is the encoded qualitative measure for target Y, with class 0 depicting insignificant malaria parasite count cases, class 1 depicting significant and low malaria parasite count cases, and class 2 depicting significant and high malaria parasite count cases.
a. Unitary Method (Equation 12), where x′ denotes the normalized value and x the original value
b. Feature Scaling (Equation 13)
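The three preprocessing approaches named in the methodology (Divide by Maximum, Min_Max, and Standardization) can be sketched as follows. This is a minimal illustration, assuming that the unitary method corresponds to dividing by the column maximum and that feature scaling corresponds to min-max scaling; the function names and the rainfall readings are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev

def divide_by_max(values):
    """Divide each value by the column maximum (assumed form of the unitary method)."""
    peak = max(values)
    # An all-zero column (for example, months with no rainfall) stays at zero.
    return [v / peak if peak else 0.0 for v in values]

def min_max(values):
    """Rescale values to the range [0, 1] (assumed form of feature scaling)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def standardize(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Hypothetical monthly rainfall readings; zero encodes "no rainfall".
rainfall_mm = [0.0, 50.0, 100.0, 200.0]
print(divide_by_max(rainfall_mm))
print(min_max(rainfall_mm))
print(standardize(rainfall_mm))
```

Dividing by the maximum keeps zero-valued entries at zero, which fits the convention of assigning zero to missing readings.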

SVM_ANN Network Topology
To avoid saturation during training, weights were initialized and randomized to small random values. A network training function, scaled conjugate gradient 'trainscg', was employed to update weight and bias values. In this study, an 8-10-1 network structure topology worked best for prediction of malaria infection. The network topology is shown in Figure 2.
Journal of ICT, 21, No. 1 (January) 2022, pp: 117-148

Figure 2
8-10-1 SVM_ANN Network Topology.

Figure 3 depicts the self-organizing map (SOM) training visualization of the weights that connected each input to each of the neurons. SOM training identified each neuron associated with a weight vector and moved it to become the center of a cluster of input vectors. Darker colors revealed larger weights. Two inputs were highly correlated if their connections were very similar.

Figure 3
SOM Training Weight Vector.
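The 8-10-1 topology and the small random initial weights can be illustrated with a short forward-pass sketch. It assumes a tanh hidden layer, a linear output unit, and an initialization range of ±0.1, none of which are stated in the study, and it does not reproduce scaled conjugate gradient training; all names are hypothetical.

```python
import math
import random

def init_layer(n_in, n_out, rng, scale=0.1):
    """Draw weights and biases uniformly from [-scale, scale] (assumed range)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return weights, biases

def forward(x, hidden, output):
    """Propagate one input vector through the hidden layer and the output unit."""
    w_h, b_h = hidden
    w_o, b_o = output
    # Small initial weights keep tanh close to its linear region (no saturation).
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w_o, b_o)]

rng = random.Random(0)             # fixed seed so the sketch is repeatable
hidden = init_layer(8, 10, rng)    # 8 inputs -> 10 hidden neurons
output = init_layer(10, 1, rng)    # 10 hidden neurons -> 1 output
x = [0.2] * 8                      # hypothetical normalized feature vector
y = forward(x, hidden, output)
print(len(hidden[0]), len(hidden[0][0]), len(y))
```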

Optimal Feature Extraction
Optimal features in the feature space were handled by the wrapper method and SVM. The extracted features portrayed the highest predictive power and still maintained their group's distinguished characteristics. Therefore, three significant climatic variation factors and five predominant malaria symptomatic features were tuned. The model was thresholded to obtain the exact range over which accurate results would be produced. The SVM algorithm incorporated with OvA was thresholded as shown in Algorithm 1.
Given a set of malaria infection cases, a set of targets, and a training set as the input, the SVM model would learn from the training set following the supervised classification procedure. SVM is primarily built to solve binary class problems, but it can be extended to solve multiclass problems by introducing the One_Versus_All (OvA) algorithm, which singles out each class of the target in turn and compares it with all the other classes.
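The OvA decision procedure can be sketched as follows. The sketch shows only the two OvA steps: recoding the three-class target into one binary problem per class, and assigning a case to the class whose binary scorer is most confident. The scorers below are hypothetical stand-ins for trained SVM decision functions, and the single feature is assumed to be a normalized parasite count.

```python
def one_vs_all_targets(labels, positive_class):
    """Recode multiclass labels as +1 (the chosen class) versus -1 (all others)."""
    return [1 if label == positive_class else -1 for label in labels]

def predict_ova(scorers, x):
    """Return the class whose binary decision function gives the largest score."""
    scores = {cls: score(x) for cls, score in scorers.items()}
    return max(scores, key=scores.get)

# Hypothetical decision functions, one per class, standing in for trained SVMs.
scorers = {
    0: lambda x: 0.25 - x[0],              # insignificant parasite count
    1: lambda x: 0.20 - abs(x[0] - 0.45),  # significant and low
    2: lambda x: x[0] - 0.65,              # significant and high
}

labels = [0, 1, 2, 1]
print(one_vs_all_targets(labels, positive_class=1))
print([predict_ova(scorers, [p]) for p in (0.1, 0.5, 0.9)])
```

Taking the largest score, rather than the first positive one, resolves cases where several binary classifiers claim the same instance or none does.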

Ensemble Methods
Ensemble machine learning involves three methods: bagging, boosting, and stacking. In bagging, multiple classifiers of the same kind are aggregated by the voting technique. Boosting resembles bagging, but each new model is affected by the previous model's result.
Stacking also involves the aggregation of multiple base learning models to produce a meta model. The base models are trained on the complete training set, while the meta model is trained on the output of the base models. Stacking employs stacked generalization, a more sophisticated version of cross validation. The difference between stacking and boosting is that tuning of the parameters takes place at both the base level and the meta level in stacking, while it only takes place at the base level in boosting. In this study, a heterogeneous ensemble stacking of SVM with ANN was proposed as shown in Algorithm 2.
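The two-level structure of stacking can be sketched in a few lines. This is an illustration only: the base learners are simple nearest-centroid classifiers and the meta learner is a lookup table, used as stand-ins for the SVM and ANN of the study, and the six training cases are hypothetical. Following the description above, the base models are fitted on the complete training set and the meta model on their outputs.

```python
from collections import Counter, defaultdict

class CentroidOnFeature:
    """Stand-in base learner: nearest class centroid on a single feature."""
    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row[self.feature])
        self.centroids = {lab: sum(v) / len(v) for lab, v in groups.items()}
        return self

    def predict(self, X):
        f = self.feature
        return [min(self.centroids, key=lambda lab: abs(row[f] - self.centroids[lab]))
                for row in X]

class LookupMeta:
    """Stand-in meta learner: majority label for each tuple of base predictions."""
    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, label in zip(Z, y):
            votes[z][label] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, Z):
        return [self.table.get(z, self.default) for z in Z]

def base_outputs(bases, X):
    """One tuple of base-model predictions per case."""
    return list(zip(*(b.predict(X) for b in bases)))

def fit_stack(bases, meta, X, y):
    for b in bases:
        b.fit(X, y)                          # base level: complete training set
    meta.fit(base_outputs(bases, X), y)      # meta level: outputs of the base models

def predict_stack(bases, meta, X):
    return meta.predict(base_outputs(bases, X))

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

# Hypothetical cases: (symptom score, climate score) with classes 0, 1, 2.
X = [(0.1, 0.1), (0.2, 0.2), (0.8, 0.1), (0.9, 0.2), (0.8, 0.9), (0.9, 0.8)]
y = [0, 0, 1, 1, 2, 2]
bases = [CentroidOnFeature(0), CentroidOnFeature(1)]
meta = LookupMeta()
fit_stack(bases, meta, X, y)
print([accuracy(b.predict(X), y) for b in bases], accuracy(predict_stack(bases, meta, X), y))
```

In this toy setting, neither base learner separates all three classes on its own, while the meta learner recovers them from the pair of base outputs.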

Ensemble Stacked SVM_ANN Stacking Algorithm and Adaptive Thresholding
With the Adaptive Thresholding algorithm in Algorithm 2, the malaria ensemble model was thresholded to find the threshold value of the vector density that would give the best result and yield a robust and reliable model. To search for the optimal threshold in the training phase, threshold frequencies ranging from 0.1 to 1.5 were tested. In the testing phase, a trade-off search over the false positive rate, false negative rate, specificity, sensitivity, and accuracy was also conducted to find the optimal threshold.
Given a set of malaria infection cases, a set of targets, and a training set as the input, the ANN model learns from the SVM result by following the ensemble stack procedure.
When the classifiers were tuned independently, neither produced a good result. Therefore, this study resorted to the ensemble stacking approach. The SVM algorithm in Algorithm 2 was adopted.
Journal of ICT, 21, No. 1 (January) 2022, pp: 117-148
The result of stacking the SVM algorithm with ANN is shown in the Ensemble Stacked SVM_ANN Stacking Algorithm and Adaptive Thresholding (Algorithm 2). In the training phase, an Adaptive Thresholding algorithm was embedded with classification to obtain configurations that were accurate, robust, and reliable. When tested repeatedly at threshold frequencies between 0.2800 and 0.7350 with a step size of 0.005, the Ensemble Stacked SVM_ANN Model produced a good result.
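The threshold search over the range 0.2800 to 0.7350 with a step of 0.005 can be sketched as a simple grid sweep. The sketch below is a binary simplification under stated assumptions: the scores, targets, and function names are hypothetical, and accuracy is used as the only selection criterion, whereas the study also weighed sensitivity, specificity, and error rates.

```python
from typing import List, Sequence


def threshold_grid(start: float = 0.2800, stop: float = 0.7350, step: float = 0.005) -> List[float]:
    """Evenly spaced thresholds; built from an integer count to avoid float drift."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 4) for i in range(n)]


def accuracy_at(threshold: float, scores: Sequence[float], targets: Sequence[int]) -> float:
    """Accuracy when every score at or above the threshold is labelled positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def best_threshold(scores: Sequence[float], targets: Sequence[int], grid: Sequence[float]) -> float:
    """Return the first threshold in the grid that maximizes accuracy."""
    return max(grid, key=lambda th: accuracy_at(th, scores, targets))


# Hypothetical model scores and true labels.
scores = [0.10, 0.35, 0.40, 0.62, 0.70, 0.90]
targets = [0, 0, 0, 1, 1, 1]

grid = threshold_grid()
best = best_threshold(scores, targets, grid)
print(len(grid), best, accuracy_at(best, scores, targets))
```

Because several thresholds can tie on accuracy, a practical search would break ties with a second criterion such as sensitivity.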

ANALYSIS AND RESULT
The complexity of multiclass symptomatic and climatic-based malaria features was handled by the One_Versus_All (OvA) algorithm. In this study, SVM employed linear, Gaussian, and polynomial kernel functions to compare their performance. A total of 1,200 malaria cases were used for training, testing, and validation according to stratified sampling, split 840:180:180 respectively. The model was evaluated with these performance metrics:
a. Accuracy: Accuracy is the performance metric that measures the proportion of all predictions that are correct.
b. Sensitivity: Sensitivity is the performance metric that measures how well the infected cases are identified.
c. Specificity: Specificity is the performance metric that measures how well the non-infected cases are distinguished from the infected cases.
f. Mean Square Error (MSE): MSE is the statistical performance metric used to obtain efficient estimators. It is widely adopted by researchers.
g. Number of Support Vectors: The features closest to the hyperplane are called support vectors, and they exert the greatest influence on the hyperplane. A good model often has a large number of support vectors. The optimal separating hyperplane is determined by the support vectors from each class, which satisfy the constraints y_c = -1, y_s = 1, and α_s, α_c > 0, with α_s, α_c ∈ α_i and targets y_i ∈ [0,1].
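The metrics listed above can be computed from confusion-matrix counts. The sketch below assumes the standard definitions (the study's own equations are not reproduced here), and the counts and function names are hypothetical.

```python
from typing import Dict, Sequence


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard confusion-matrix metrics (assumed definitions).

    tp/fn count infected cases predicted correctly/incorrectly;
    tn/fp count non-infected cases predicted correctly/incorrectly.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }


def mean_square_error(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical counts for one class treated as "infected" versus the rest.
metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
print(metrics)
print(mean_square_error([1, 0, 1, 1], [1, 0, 0, 1]))
```

In a multiclass setting these counts would be taken per class in a one-versus-all fashion and then reported per class or averaged.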

SVM Result
SVM training, testing, and validation phases were conducted on 1,200 malaria cases in the ratio 840:180:180, respectively. The corresponding results are shown in the graphs of SVM_0, SVM_1, and SVM_2 malaria cases as depicted in Figures 4, 5, and 6, respectively.
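The 840:180:180 division corresponds to a 70:15:15 split of the 1,200 cases. A minimal sketch of a stratified split with those proportions follows; the balanced three-class labels, the random seed, and the function name are assumptions, since the per-class counts are not given here.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int],
    fractions: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Split indices into train/test/validation while keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test, validation = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_test = round(len(indices) * fractions[1])
        train += indices[:n_train]
        test += indices[n_train:n_train + n_test]
        validation += indices[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# Hypothetical labels: 1,200 cases spread evenly over three classes.
labels = [i % 3 for i in range(1200)]
train, test, validation = stratified_split(labels)
print(len(train), len(test), len(validation))
```

Splitting within each class separately is what keeps the class mix the same across the three subsets.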

ANN Result
After continuous tuning of the proposed ANN malaria model, an optimum accuracy of 52.31 percent with a standard deviation of 1.4076 was attained at a threshold value of 0.55. Figure 7 depicts the best threshold values and performance for the ANN malaria training model.

Figure 7
Optimal ANN Malaria Training Model.
The model's results suggest that, as a rule of thumb, the standard deviation should not exceed a benchmark of 1.50 if the best accuracy is to be obtained. Figures 8(a) and 8(b) show the ANN malaria validation model performance, threshold, and continuity.

Figure 8(a)
ANN Malaria Validation (1).
Figure 8(b)
ANN Malaria Validation (2).
However, the test model produced accuracies of 48.33 percent, 47.22 percent, and 46.11 percent for Class ANN_0, ANN_1, and ANN_2, respectively, for threshold values between 0.2800 and 0.7350, as shown in Figures 8(a) and 8(b). Finally, Table 3 depicts the validation stage, where the best accuracy of 46.222 percent with the least standard deviation of 1.3833 was obtained at a threshold value of 0.600.

Table 3
Optimal ANN Malaria Validation.

Ensemble Stacked SVM_ANN Malaria Model
The study found that the models separated non-infected cases more reliably than infected cases. An enhanced model that can separate both kinds of cases accurately was therefore needed, and this study adopted the Ensemble Stacked SVM_ANN Malaria Model.
From Table 4, at the threshold value of 0.60, the SVM_ANN Malaria Model produced a good result, with a True Positive Rate of 100 percent and a True Negative Rate of 99.60 percent. The overall accuracy of the ensemble model was 99.86 percent, with an error rate of 0.14 percent. The ensemble multiclass symptomatic and climatic-based model produced good results in comparison to the existing models.

Comparison of ANN, SVM, and Ensemble Stacked SVM_ANN
Initially, the ANN and SVM models were tuned separately, but they gave low performance: accuracy (Acc) of 48.33 percent and 85.60 percent, sensitivity (Ss) of 60.61 percent and 84.06 percent, and specificity (Sp) of 45.58 percent and 86.49 percent, respectively. Then, the SVM result was stacked with ANN to produce an Ensemble Stacked SVM_ANN Model. Linear, Gaussian, and polynomial kernel functions were employed in the model as depicted in Figures 4, 5, and 6, and the Gaussian function gave the optimum result. Figure 9 shows the results of the Ensemble Stacked SVM_ANN Model, built from SVM_2 with a 308 × 8 matrix of support vectors and the Gaussian function, in comparison to ANN and SVM.

Figure 9
Comparison of ANN, SVM, and Ensemble Stacked SVM_ANN.
The Ensemble Stacked SVM_ANN Model was evaluated with threshold metrics and probability metrics. Based on the results, the Ensemble Stacked SVM_ANN Model gave higher prediction values, which indicates greater robustness and reliability.
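For reference, the three kernel families compared in this section can be written in their common textbook forms. The sketch below is illustrative only: the hyperparameter values (degree, coef0, gamma) are assumptions and are not the settings used in the study.

```python
import math
from typing import Sequence


def linear_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x: Sequence[float], z: Sequence[float],
                      degree: int = 3, coef0: float = 1.0) -> float:
    """(x . z + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, z) + coef0) ** degree


def gaussian_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 0.5) -> float:
    """Radial basis function: exp(-gamma * ||x - z||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))
print(polynomial_kernel(x, z, degree=2))
print(gaussian_kernel(x, z))
```

The Gaussian kernel depends only on the distance between two points, so it equals 1 for identical inputs and decays towards 0 as they move apart, which lets the SVM fit non-linear class boundaries.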

DISCUSSION
This study developed an ensemble stacked machine learning algorithm of ANN and SVM, incorporated with the One_Versus_All algorithm, to solve multiclass problems while taking the effects of climatic variations into account. The model incorporated the extraction of optimal features and global thresholding techniques to avoid overfitting and proneness to local minimum error. The study focused on symptomatic features and effects of climatic variations, whereas the asymptomatic characteristics of malaria cases were not considered. The research revealed that SVM reduced the implementation complexity seen in ANN by employing kernel functions that map into high-dimensional spaces. SVM also addressed the multiclass limitation of most existing machine learning algorithms by introducing the One_Versus_All algorithm, which also helped with overfitting problems. This research also explored the predictive power of adaptive thresholding in ANN. Furthermore, it bridged the gap between the models by combining their strengths to offset their weaknesses.

In this study, SVM incorporated with OvA was highly effective in reducing big data complexity by selecting the best feature subset instances needed for prediction. The model was tuned with large and small datasets and handled them well regardless of their sizes, but it required a predefined function to optimize well. When SVM was subjected to different kernel functions, SVM_2 with the radial basis (Gaussian) function produced the optimal result, with the largest set of support vectors (308 × 8) and 85.60 percent Acc, 84.06 percent Ss, and 86.49 percent Sp. Therefore, the support vectors were serially ensembled into ANN. The Ensemble Stacked SVM_ANN Model then generated optimum results of 1.35 percent FPR, 100 percent TPR, 0 percent FNR, and 98.65 percent TNR, as depicted in Figure 9.

CONCLUSION
In the proposed Ensemble Stacked SVM_ANN Malaria Model, the 0.600 threshold value indicated that 60 percent of the female Anopheles mosquitoes responsible for malaria transmission in the stipulated time survived under the influences of temperature, rainfall, and relative humidity. The model was able to handle symptomatic and climatic-based multiclass malaria infection with a feed-forward accuracy of 98.91 percent, a variation rate of 1.38 among the data, and a back-propagation error rate of 0.14 for the vector population. The study showed that ANN could handle multiclass problems and thresholding, but not well enough, which resulted in overfitting. Furthermore, it had a slow rate of convergence and struggled to find a reliable testing output. SVM compensated for the weaknesses of ANN by reducing the model complexity and generalization error. The Ensemble Stacked SVM_ANN Model generated its best experimental result of 98.91 percent (Acc), 100 percent (Ss), 98.68 percent (Sp), 0.14 (MSE), and 99.86 (CR) at an optimum threshold of 0.60. This ensemble stacked multiclass symptomatic and climatic-based model showed better performance than the other existing models. Future research can focus on using other ensemble learning approaches with an appropriate normalization method to improve the model performance.