Heuristic Scale to Estimate Premature Malaria Parasites: Scope in Microscopic Blood Smear Images

Objective: Malaria is an epidemic disease, and early detection of malaria in patients using the current manual procedures is unreliable, as the diagnosis depends heavily on the experience of the professionals involved. To overcome these challenges, this paper proposes a computer-aided model to support early-stage malaria detection through machine-learning analysis of microscopic blood smear images. Methods/Statistical Analysis: Many computer-aided models have been proposed and adapted for diagnosis, including machine learning, image processing and neural network based solutions, and they have provided further insight into the process. However, gaps in accuracy still persist. The proposed model, a Heuristic Scale to Estimate Premature Malaria Parasites Scope (SEMPS), processes the microscopic blood smear images in multiple stages. Findings: The proposed model is compared with the benchmark models SVM and Bayesian classifiers, and its efficiency is evident from the results. The proposed model detects malaria more effectively and accurately in the test cases, with higher accuracy than the two benchmark models chosen for comparative analysis. Improvements: The computational complexity of SEMPS is shown to be linear, whereas the majority of the benchmark models are found to be NP-hard.


Introduction
Malaria is an endemic disease that is most prevalent in Africa and Asia, and the number of reported malaria-related deaths is on the rise. One of the key challenges in diagnosing malaria is detection at the premature stage, and there is a significant need for early diagnosis to ensure better treatment and to reduce resulting impacts such as threat to life. According to the literature on malaria 1 , malaria parasites (species of parasitic protozoa) are categorized into four types. In hot and humid weather conditions, the parasite Plasmodium vivax is found in significant numbers 4 . To ensure that suitable drugs are administered to patients, early detection of such parasites in human blood at the premature level is profoundly important. WHO guidelines for medical practitioners 5,6 recommend that suspected cases of malaria be referred for microscopic diagnostic tests of blood smears, which can help in identifying and treating the disease at its early stages.
Microscopic diagnostic tests support both differentiating the parasite type and quantifying the parasites present, so that disease severity can be analyzed. Rapid diagnostic tests are the alternative method, but one of their key limitations is that they cannot feasibly detect the disease at premature stages.
Minimal cost and scalability have led to microscopic diagnostic tests being preferred and adopted. The key strategy in microscopic diagnostics relies on thick and thin blood smears prepared from the blood samples collected from suspected cases. Thick smears are analyzed to identify the presence of parasites, while thin smears are used to determine the type of parasite 6 . Thick smears can be more useful than thin smears in identifying malaria at the premature stage.
Microscopic diagnostic tests provide inputs on the parasite type and the quantity observed in the blood smears. The results of the tests have to be confirmed by authorized professionals with substantial domain knowledge, and this dependence results in many inaccurate detections or missed diagnoses. Adopting computer-aided solutions to address these limitations and to overcome the problem of early diagnosis could be effective.
For instance, machine learning could be very significant for developing a computer-aided solution. Features obtained from microscopic images of erythrocytes and other normal blood cells could be vital in differentiating the premature state of the disease.
For microscopic images, digital image processing solutions can be vital 7-18 , and edge-based segmentation in particular could be an effective model 19 for analysis. However, edge-based segmentation of microscopic images of erythrocytes has some limitations: limited identification of parasites due to contrast issues, lack of clarity at the edges resulting from color similarities, irregular edges, and noise intensity in the parasite-affected regions. Considering these limitations, the proposed model is a new machine learning strategy based on CUCKOO Search 20 , an evolutionary computation model, which uses morphological features and benchmark textures as its basis. The proposed solution addresses the limitations of edge-based segmentation.
Over the last decade, considerable research and development on computer-aided malaria diagnosis has emerged, drawing attention to microscopic image analysis. Many benchmark contributions have been proposed in earlier studies. Some are based on supervised learning methods [21][22][23] , decision support systems 24 , digital image analysis [25][26][27][28] , and pattern recognition 29,30 . Other solutions adopt artificial neural networks 31,32 .
With the rising adoption of cross modeling, some significant models have also been considered, such as segmentation based on histogram equalization, which is used to classify overlapping infected cells 33 . Many unsupervised models are also found in the literature [34][35][36][37] . Machine learning models 38 and content-based image retrieval models 39 have also been proposed in earlier studies, and parasite estimation using segmented digitized blood smears is among the significant contributions reviewed in the literature.
The critical constraints observed in the benchmark models include dull contrast, similar intensities in both affected and normal areas of blood smear images during microscopic image segmentation, and edge formation issues, all of which have a major impact. Also, apart from the model devised in 38 , all the other models depend on signatures of affected images to identify the parasite scope in a new image. With trivial variations, such dependency could lead to more false alarms. Such constraints could be addressed effectively by machine learning models, but these models in turn require a considerable volume of microscopic images for training and should use optimal features.
Around 94 features are used in the machine learning method proposed in 38 , where SVM 40 and Bayesian classifiers 41 are used to train and test the model. Among the varied models, optimal feature selection was achieved with one-way ANOVA 42 . Though the detection accuracy of the model reaches 84%, there is still significant instability across different feature counts. The process complexity in terms of resource utilization is also not linear across different feature counts. The other key constraint concerns feature extraction, which is carried out by segmenting the image with the Marker Controlled Watershed algorithm 14 . The algorithm in 14 uses the gradients that arise during segmentation for estimation, so the resulting feature values may not be optimal.
Considering these factors, an evolutionary computation based machine learning system is adopted to overcome the limits identified in 38 . The proposed CUCKOO search model is applied over the edge-based segmentation model to extract features from the chosen blood sample images, as edge-based segmentation is one of the effective methods for image-based detection of parasite type 37 .
In the remaining sections of this paper, the proposed model is described in Section 2. The experimental results are presented in Section 3 along with the performance analysis, and the conclusion is given in Section 4.

Heuristic Scale to Estimate Premature Malaria Parasites Scope
The scale is defined through a hierarchy of stages: acquiring the blood smear images as input, preprocessing the images, extracting benchmark features, identifying the optimal features, and then applying Cuckoo search to the optimal features in order to define the heuristic scales that characterize normal blood smears and parasite-prone blood smears at the premature level.

Conversion to Grayscale
Using colorimetric gray conversion 43 , Luma coding grayscale conversion 44 , and green channel gray conversion 44 , the RGB (three-channel) microscopic image is converted to a single channel. PCA-based grayscale conversion 45 is also considered, and among all the conversion methods, PCA-based grayscale provided the best quality images of microscopic blood smears [46][47][48] . Hence PCA is adopted for converting RGB microscopic images to grayscale.
Using the linear least-squares model, the PCA-based model produces the maximum contrast in the grayscale image. The RGB color coordinates are used to assess the primary axis of the RBC color, and PCA regression yields the best-fit regression line, which minimizes the distance between the points and the axis line, including points from parasite-affected cell regions, in the regression space. The regression line, together with the input RGB image and the output grayscale image, is illustrated in Figure 1.
The angles between the regression axis and the R, G and B axes are obtained first, and their cosine values are then used to transform the pixels to grayscale values. The regression weights are obtained from Equation (1), in which the minimum is taken over the weights x, y and z applied to R, G and B. The pixel count of the image is denoted by $|Pix|$, and $r_i$, $g_i$, $b_i$ denote the red, green and blue values of the $i$-th pixel.
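The PCA-based conversion described above can be illustrated with a minimal sketch. It assumes that the grayscale value of a pixel is its projection onto the first principal axis of the RGB point cloud, rescaled to [0, 1]; the function name `pca_grayscale` and the sign convention are illustrative choices and are not taken from the paper.

```python
import numpy as np

def pca_grayscale(rgb):
    """Project RGB pixels onto their first principal axis (assumed reading of the method)."""
    pixels = rgb.reshape(-1, 3).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # 3x3 covariance matrix of the colour channels
    cov = centered.T @ centered / max(len(pixels) - 1, 1)
    # eigh returns eigenvalues in ascending order, so the last vector is the main axis
    _, eigvecs = np.linalg.eigh(cov)
    axis = eigvecs[:, -1]
    # Assumed sign convention: brighter colours map to larger gray values
    if axis.sum() < 0:
        axis = -axis
    gray = centered @ axis
    span = gray.max() - gray.min()
    if span == 0:
        return np.zeros(rgb.shape[:2])
    # Rescale the projections to the range [0, 1]
    return ((gray - gray.min()) / span).reshape(rgb.shape[:2])
```

The components of the unit vector `axis` are the cosines of the angles between the principal axis and the R, G and B axes, which is one way to read the weights x, y and z mentioned in the text.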

Contrast Correction in Illumination
The poor illumination of microscopic images can be attributed to a varied range of conditions. For identifying infected erythrocytes, however, any dullness in contrast is a major constraint. Hence, improving the contrast level is a key step. The input grayscale image and the contrast-corrected image produced by the gamma equalization process with a gamma threshold of 0.5 are shown in Figure 2.
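As a hedged illustration of the gamma step, the sketch below applies the common power-law form, output = max · (input / max)^gamma, with gamma = 0.5 as in the text. The exact equation used in the paper is not reproduced here, so the power-law form, the function name `gamma_correct` and the 8-bit intensity range are assumptions.

```python
import numpy as np

def gamma_correct(gray, gamma=0.5, max_value=255.0):
    """Power-law contrast correction (assumed form); gamma < 1 brightens dark regions."""
    normalized = np.clip(np.asarray(gray, dtype=np.float64) / max_value, 0.0, 1.0)
    return max_value * normalized ** gamma
```

With gamma = 0.5, a pixel at one quarter of the maximum intensity is mapped to one half of the maximum, which lifts the dark regions where infected erythrocytes appear.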

Noise Tumbling
Salt-and-pepper noise and other superimposed patterns commonly affect microscopic images of blood smears 50 . A median filter is used to remove such noise 51 , and it is combined with a Gaussian filter 50 to reduce the superimposed patterns in the blood smear images, as shown in Figure 3.
• Identifying spectral peaks of pattern noise: A special filtering process is applied to remove Moiré pattern noise using the Fourier amplitude spectrum of the given image (Equation (3)). Such noise is visible as bright spots in the amplitude spectrum, similar to impulses. Impulsive noise can be detected reliably with a median filter [52][53][54] , and hence this approach is adopted for detecting and filtering the noise. The following two steps are used to detect spectral peaks in the Fourier amplitude spectrum.
• Defining low-frequency area: The initial image is transformed with a 2D wavelet transform. If the relevant functions are discrete, the scaling and wavelet functions are split into two phases, so the wavelet transform is first applied to each axis separately. As a result, the image (a 2D signal) is partitioned into the sub-bands LL, HL, LH and HH 55 .
The sub-bands HL and LH capture the signal variation along the X-axis and Y-axis of the decomposed image. To improve coding efficiency, the most bits are spent on the low-frequency bands and minimal or zero bits on the high-frequency bands.
• Localizing a spectral peak: The $(a, b)$-th spectral coefficient $L_{ab}$ is considered a peak if it satisfies the condition in Equation (4), which compares $L_{ab}$ with the local median of the window $L^{mn}_{ab}$ of size $m \times n$ around it, using a pre-defined threshold value denoted by $T$.
The number of peaks identified is inversely related to the threshold value $T$.
• Gaussian filtering: Once a spectral peak is detected, Gaussian filtering is used to correct the amplitude of the spectral coefficient of interest and of the surrounding spectrum coefficients. Here $L^{mn}_{ab}$ denotes the set of amplitude spectrum coefficients in the window of size $m \times n$ surrounding the $(a, b)$-th spectral coefficient $L_{ab}$. The Gaussian filtering process (Equation (5)) yields the corrected set of amplitude spectrum coefficients $L^{mn}_{ab}$, and the Gaussian filter used is denoted by $G^{mn}$. A typical Gaussian filter can also be applied to a Gaussian surface that covers two connected peaks. When the spectrum contains many pairs of noise peaks, the processing overhead of Gaussian filtering can be substantially high.
It is evident from the above process that the median and Gaussian filters serve different purposes but are both effective for noise filtering. The median filter restricts the size of the region around the noise peaks, and the Gaussian filter then operates over the region defined by the median filter.
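The two-stage idea above (a median-based peak detector followed by Gaussian correction around each peak) can be sketched as follows. Several points are assumptions rather than the paper's method: a coefficient is flagged as a peak when its amplitude exceeds $T$ times the local median of its $m \times n$ window, the low-frequency area is protected by a simple radius around the spectrum centre instead of wavelet sub-bands, and the correction is an inverted Gaussian gain. All function and parameter names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_median(a, m=5, n=5):
    """Median of the m x n window around every coefficient (m and n odd)."""
    padded = np.pad(a, ((m // 2, m // 2), (n // 2, n // 2)), mode="reflect")
    return np.median(sliding_window_view(padded, (m, n)), axis=(-2, -1))

def find_spectral_peaks(image, m=5, n=5, T=10.0, low_freq_radius=3):
    """Flag coefficients whose amplitude exceeds T times the local median (assumed criterion)."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    amplitude = np.abs(spectrum)
    peaks = amplitude > T * (local_median(amplitude, m, n) + 1e-12)
    # Protect the low-frequency area around the centre (assumed radius-based stand-in)
    rows, cols = image.shape
    yy, xx = np.ogrid[:rows, :cols]
    near_centre = (yy - rows // 2) ** 2 + (xx - cols // 2) ** 2 <= low_freq_radius ** 2
    peaks[near_centre] = False
    return spectrum, peaks

def suppress_peaks(spectrum, peaks, sigma=1.5):
    """Attenuate the amplitude around each peak with an inverted Gaussian gain."""
    rows, cols = spectrum.shape
    yy, xx = np.ogrid[:rows, :cols]
    gain = np.ones((rows, cols))
    for a, b in zip(*np.nonzero(peaks)):
        gain *= 1.0 - np.exp(-((yy - a) ** 2 + (xx - b) ** 2) / (2.0 * sigma ** 2))
    # A real-valued gain rescales the amplitude and leaves the phase unchanged;
    # the real part is kept because the gain mask may be slightly asymmetric
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * gain)))
```

Raising `T` flags fewer coefficients, which matches the inverse relation between the threshold and the number of peaks noted in the text. The loop over peaks also shows why the overhead grows when the spectrum contains many noise peaks.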

Edge Detection
The initial and key objective is to improve the visibility of the erythrocyte borders, which is done with Canny-based filters 56 that preserve continuous edges effectively. To do this, the median filter is first used to smooth the contours in the noise-free target image of size q × q pixels. Since the infected area of an erythrocyte appears as a dark area, the edges and borders of the darkest regions are highlighted, as shown in Figure 4.

Segmentation by K-Means Clustering to Identify Erythrocyte
The grayscale microscopic blood smear image delivered by the preprocessing stage is used as input for segmentation with the K-Means algorithm. The value of K is 2, since each pixel falls into either the infected area or the normal region of the erythrocyte (Figure 5).
The simple clustering technique of K-Means 57 is adopted for clustering the microscopic image datasets considered. The dataset U in a $d_i$-dimensional space is clustered into k clusters; in the selected context the value of k is 2. Initially, normal and infected erythrocytes are used to create prototypes, each of which represents its respective cluster.
Each entry of the dataset U is then moved to the cluster with the nearest prototype. The prototype of each cluster is then recomputed; if any cluster prototype differs from its earlier value, the entries are clustered again according to the new prototypes. If no change in the prototypes is observed, the clusters and their respective entries are finalized.
The objective function, which estimates the squared error, is used in K-means clustering to measure the nearness of each entry of the dataset U to the respective cluster prototypes (Equation (6)). The results obtained with this objective function denote the distance between the data points and their respective clusters. The key steps of the K-Means process are:
1. For the K clusters, K data points are chosen as the initial centroids (the value of K is taken as 2).
2. The distances between the K centroids and the data points are computed.
3. Each data point is moved to the cluster whose centroid is at minimal distance; this step is applied to all the data points.
4. The optimal centroid of each cluster is recomputed.
5. Steps 2, 3 and 4 are repeated until no changes are observed in any centroid.
The running time of K-means and the optimality of the resulting clusters depend on the centroids chosen initially.
Clustering of the given image by k-means proceeds as follows. The number of clusters is set to 2, since the pixels of infected erythrocytes differ in intensity from those of normal erythrocytes.
As the input is a grayscale image, the pixels are differentiated by their intensity.
Also, of the 2 clusters formed from the pixel levels, one may or may not correspond to the darkest area. To handle this, the initial centroid of cluster 1 is a pixel selected at random from the darkest area of the image, and the initial centroid of cluster 2 is a pixel selected at random from another part of the image.
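The K-means procedure above, with K = 2 on pixel intensities, can be sketched as below. For reproducibility the sketch seeds the two centroids with the darkest and the brightest pixel values, which is a deterministic stand-in (an assumption) for the random picks described in the text. The squared-error objective is computed in its standard form, the sum of squared distances between each pixel and its cluster centroid. The name `kmeans_two_clusters` is hypothetical.

```python
import numpy as np

def kmeans_two_clusters(gray, max_iter=100):
    """Cluster pixel intensities into a dark (0) and a bright (1) cluster."""
    values = np.asarray(gray, dtype=np.float64).ravel()
    # Assumed seeding: darkest and brightest intensity instead of random pixels
    centroids = np.array([values.min(), values.max()])

    def assign(c):
        # Label 1 when a pixel is strictly closer to the second centroid
        return (np.abs(values - c[1]) < np.abs(values - c[0])).astype(int)

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = centroids.copy()
        for k in (0, 1):
            members = values[labels == k]
            if members.size:
                updated[k] = members.mean()
        if np.allclose(updated, centroids):
            break  # no centroid changed, so the clusters are final
        centroids = updated
    labels = assign(centroids)
    # Squared-error objective: sum of squared distances to the assigned centroid
    sse = float(((values - centroids[labels]) ** 2).sum())
    return labels.reshape(np.shape(gray)), centroids, sse
```

Under this seeding, cluster 0 collects the darker pixels, which by the description in the Edge Detection section correspond to the infected areas.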

Connected Component Analysis
The emphasis in this stage is on removing noise from the images produced by the K-means process. For the connected component analysis, a morphological method based on erosion is adopted. The holes observed in the resulting images are also filled to achieve optimal segmentation.
i. Morphological Operation: The binary images of infected erythrocytes produced by the clustering process contain excess regions around the infected erythrocytes, and these excess regions have to be eliminated. This is done with the morphological binary erosion operation 14 . The structuring element $s(m, n)$ is used to erode the binary image $b(m, n)$, which yields the resultant binary image $r(m, n)$. The erosion is carried out as given in Equation (7).
To identify the optimal size of the structuring element (STREL), tests were conducted with square STRELs of size $3 \times 3$, $5 \times 5$ and $7 \times 7$, and the $3 \times 3$ square STREL was found to be optimal.
ii. Filling holes observed in clustered erythrocyte areas: The images resulting from K-means depict erythrocytes that have holes, which is a common obstacle to segmentation accuracy for infected erythrocytes 58 . Hence the holes have to be filled as follows.
Let $b(m, n)$ be the binary parasite image output by the K-means process; the hole-filled version of $b(m, n)$ is obtained as given in Equation (8).
The result of the connected component analysis applied to the resultant image of K-means is shown in Figure 6.
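The two morphological steps above can be sketched with NumPy alone. The erosion uses a square structuring element (3 × 3 by default, the size reported as optimal). The hole filling is implemented here as a flood fill of the background from the image border with 4-connectivity, which is an assumed stand-in because Equation (8) is not reproduced in the text. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode(binary, size=3):
    """Binary erosion with a size x size square structuring element."""
    b = np.asarray(binary, dtype=bool)
    padded = np.pad(b, size // 2, mode="constant", constant_values=False)
    # A pixel survives only if every pixel under the structuring element is set
    return sliding_window_view(padded, (size, size)).all(axis=(-2, -1))

def fill_holes(binary):
    """Fill background regions that are not connected to the image border."""
    b = np.asarray(binary, dtype=bool)
    # Background mask surrounded by a one-pixel frame of background
    background = np.pad(~b, 1, mode="constant", constant_values=True)
    reached = np.zeros_like(background)
    reached[0, :] = reached[-1, :] = True
    reached[:, 0] = reached[:, -1] = True
    while True:
        grown = reached.copy()
        grown[1:, :] |= reached[:-1, :]
        grown[:-1, :] |= reached[1:, :]
        grown[:, 1:] |= reached[:, :-1]
        grown[:, :-1] |= reached[:, 1:]
        grown &= background  # the fill may only spread through background pixels
        if np.array_equal(grown, reached):
            break
        reached = grown
    # Everything the fill could not reach is foreground or an enclosed hole
    return ~reached[1:-1, 1:-1]
```

A larger structuring element removes more of the excess region but also shrinks the erythrocyte itself, which is consistent with the smallest tested size being preferred.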

Features Extraction
Texture and morphological features for contextual differentiation in grayscale images are widely discussed in the literature of the domain 7-18 , including features that have not previously been considered for distinguishing normal and diseased erythrocytes.
• Entropy: Entropy indicates the level of uncertainty and is important for differentiating infected and normal erythrocytes, and the model relies on the entropies explored in the literature 15,16 . To measure the entropies, factors such as the histogram of the region of interest have to be taken into account. Also, in terms of the entropies that have to be evaluated, there [...]
Vol 10 (8) | February 2017 | www.indjst.org
[...] infected erythrocyte in terms of surface coarseness, the grayscale image is treated as a third dimension over the chosen 2-dimensional image, and the resulting variation indicates the coarseness or texture deviation in the infected erythrocyte. The approach adopted for identifying the variations along the third dimension is the "modified differential box counting with sequential algorithm" 9 . [...] is converted into a binary pattern (0 or 1) based on the $Gp_c$.
• Morphological features: Some of the features proposed in 14 , including invariant moments 7,60 , are considered as morphometric features that are useful for recognizing anomalous erythrocytes. This can be attributed to the variation in shape and size between infected and normal erythrocytes.
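As a small illustration of the histogram-based entropy feature mentioned in this section, the sketch below computes the Shannon entropy of the gray-level histogram of a region of interest. The paper refers to several entropies from the literature without fixing one here, so the choice of Shannon entropy, the 8-bit intensity range and the name `shannon_entropy` are assumptions.

```python
import numpy as np

def shannon_entropy(region, bins=256):
    """Shannon entropy (in bits) of the gray-level histogram of a region."""
    hist, _ = np.histogram(np.asarray(region).ravel(), bins=bins, range=(0, 256))
    # Drop empty bins so that log2 is only evaluated on positive probabilities
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

A uniform region has zero entropy, while a region split evenly between two gray levels has an entropy of one bit, so textured (possibly infected) regions score higher than flat ones.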

Features Selection
Let E and F be sets of records, where each record holds all the features of a normal erythrocyte (in E) or an infected erythrocyte (in F). For each attribute, the Hamming distance between the unique values of that attribute in E and those of its counterpart in F is computed. The attributes whose Hamming distance exceeds the threshold hdt are selected, giving the sets of optimal attributes $E_a$ of size n and $F_a$ of size m from E and F respectively.
i. Assessing Hamming distance: The Hamming distance measures the difference between the unique values of the same attribute in the records labeled true and false. It is one of the key measures used in coding theory for assessing the difference between elements. Here it is applied to measure the distance between the unique values observed for an attribute in the record sets labeled true and false.
• GLCM features: There are 19 GLCM features in total, related to information measures, variance, entropy and energy 29,30 , which are substantial for exploring texture information 8 . The GLCM matrix indicates the different gray shades found in the image and is used to compute the metrics defined. [...] $I(x, y)$, which is used for defining all 11 texture features 8,10,11 .
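To make the GLCM texture features mentioned above more concrete, the following sketch builds a normalized, symmetric gray-level co-occurrence matrix for one pixel offset and derives three standard measures (energy, entropy and contrast). It covers only a small subset of the 19 features cited in the text. The quantization to 8 levels, the 8-bit input range, the non-negative offsets and the function names are assumptions made for illustration.

```python
import numpy as np

def glcm(gray, levels=8, dx=1, dy=0):
    """Normalized symmetric co-occurrence matrix for the offset (dy, dx), with dx, dy >= 0."""
    g = np.asarray(gray, dtype=np.float64)
    # Quantize 8-bit intensities into `levels` gray shades (assumed input range 0..255)
    q = np.clip((g / 256.0 * levels).astype(int), 0, levels - 1)
    rows, cols = q.shape
    src = q[0:rows - dy, 0:cols - dx]
    dst = q[dy:rows, dx:cols]
    counts = np.zeros((levels, levels))
    # Count how often shade i occurs next to shade j at the given offset
    np.add.at(counts, (src.ravel(), dst.ravel()), 1)
    counts = counts + counts.T  # make the matrix symmetric
    return counts / counts.sum()

def glcm_features(p):
    """Energy, entropy and contrast of a normalized co-occurrence matrix."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "energy": float((p ** 2).sum()),
        "entropy": float(-(nonzero * np.log2(nonzero)).sum()),
        "contrast": float((p * (i - j) ** 2).sum()),
    }
```

Contrast is zero when neighbouring pixels always share the same shade and grows with abrupt intensity changes, which is the kind of texture deviation the feature set is meant to capture.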

Gray Level Run Length Matrix
• Fractal dimension: Using the fractal dimension model, the surface coarseness of an image can be determined 59 .

Experimental Study and Results Analysis

The Dataset
A dataset was developed from samples gathered from various diagnostic sources, following the statistical guidelines for contributors to medical journals 61 . The results of the experiments are presented in the performance graphs and tables. The prediction accuracy is stable for Hamming distance thresholds less than or equal to 0.50 on the training and test images (Figure 7). In terms of overhead and resource utilization, the completion time of SEMPS grows linearly with the number of input records (Figure 8). The memory usage is also linear in the number of records inspected (Figure 9).
The heuristic scale defined by CUCKOO search is optimal in terms of prediction accuracy for the count of optimal features: the estimated prediction accuracy is 87%, which is higher than the SVM-based prediction accuracy (83.555%) 38 and the Naïve Bayes prediction accuracy (84%) 38 .

Conclusion
The proposed model, Heuristic Scale to Estimate Premature Malaria Parasites Scope (SEMPS), with its multi-stage processing of microscopic blood smear images, is more effective than the earlier models considered. Image processing and heuristic scale evaluation are carried out in multiple phases to ensure that the accuracy and the outcome of the process are more effective. The key factors considered in the process are segmentation, feature extraction, optimal feature selection, and evolutionary computation, which is applied on the basis of the heuristic scale definition.
The critical constraints observed in the earlier benchmark models are dull contrast, similar intensities in both affected and normal areas of blood smear images during microscopic image segmentation, and edge formation issues, all of which have a major impact.
With the proposed CUCKOO search model, evaluating the process using image comparisons and optimal feature utilization has produced effective outcomes. Although the process involves several stages, and benchmark methods such as SVM and the Bayesian model were considered in the experimental study, the results indicate that the proposed SEMPS model is very useful for premature-stage detection of malaria from microscopic images. The study also points to future directions, such as identifying correlations among features and their impact on the scale definition, and using a genetic algorithm to identify the optimal features.