An automatic feature selection and classification framework for analyzing ultrasound kidney images using dragonfly algorithm and random forest classifier

In medical imaging, the automatic diagnosis of kidney carcinoma remains difficult because it is not easy for physicians to detect. Pre-processing is the first stage of identification: it enhances image quality and removes noise and unwanted components from the background of the kidney image, and is therefore essential and significant for the proposed algorithm. The objective of this analysis is to recognize and classify kidney disturbances in ultrasound scans by providing a number of substantial content-description parameters. The ultrasound images are pre-processed to preserve the pixels of interest before feature extraction. A series of quantitative features was computed for each image, and principal component analysis was applied to reduce the number of features and produce a set of wavelet-based multi-scale features. The dragonfly algorithm (DFA) was then executed to select the optimal features. In the proposed work, a random decision forest classifier is designed and trained on the selected features, and the RF classifier uses these optimal characteristics to classify the e-health information. The proposed technique is implemented in the MATLAB/Simulink environment, and the experimental results show that it achieves a peak accuracy of 95.6%, outperforming existing techniques such as GWO-FFBN.


INTRODUCTION
Medical image recognition is a severe problem, since any mistake can threaten the correct diagnosis. Furthermore, adequate information from clinical trials must be gathered to ensure the statistical significance of the research. Therefore, scientific groups in a variety of medical areas work on digital databases that contain clinical analyses and laboratory outcomes, which are helpful for medical research, epidemiological research, quality control, etc. [1][2][3]. For complicated geometric challenges in medical image processing, segmentation, feature extraction, 3D modelling and registration of medical information, efficient algorithms are needed. A typical geometric issue is fitting an adequate surface that connects a number of outline data points [4,5]. In clinical diagnosis, ultrasound images play a significant part: ultrasound imaging is the initial diagnostic stage for kidney disease therapy, and in particular renal size, form and location are analysed for data on kidney function [6][7][8]. Renal stone infection is a dangerous disease: initially it can cause kidney stones and then progressively harm the body, but this is not known to most individuals [9][10][11]. This study proposes an automatic system for analyzing ultrasound kidney images and identifying them (covering cysts, stones, cross-parenchyma and tumours) as normal or anomalous [12,13]. An ultrasound kidney scan is a non-invasive diagnostic test that produces images used to assess the size, shape and location of the kidneys [14,15]. Ultrasound is also used to assess blood flow to the kidneys [16,17], and uses a transducer that sends out ultrasound waves at a frequency too high to be heard [18,19]. Datasets are obtained for this purpose from publicly available websites (http://www.ultrasoundimages.com, http://www.sonoworld.com) [20]. The input ultrasonic image is initially pre-processed to determine the exact region for further analysis.
Three image pre-processing techniques are applied to the ultrasound images: ROI extraction, bicubic interpolation and background elimination [13]. Several features are then derived from first-order grey-level statistics, second-order statistics with modified angles, higher-order algebraic moment invariants, Gaussian differential characteristics and newly defined power-spectral parameters. In addition, some features have been removed for comparison purposes.
Unbalanced data reduces the efficiency of dimensionality-reduction and feature-extraction systems. When presented with unbalanced datasets, few techniques offer feature-extraction tools that favour the common class, such as PCA [4]. The unsupervised PCA seeks to capture the directions of greatest total scatter, and both PCA and LPP feature extraction are retained for dimension reduction [21][22][23][24][25]. Dimensionality reduction is therefore an important step in developing machine learning algorithms. Since the number of features is large, specialized feature-selection methods can be adopted; among feature-selection methods, multivariate methods are efficient at deleting unnecessary and redundant features. The dragonfly algorithm [26] is used to select the optimal features. After feature selection, the final stage of the system is the classifier: a random decision forest (RDF) [14] is designed and trained on the optimal features chosen by the DFA.
Mirjalili in 2015 [26] proposed the dragonfly algorithm (DA) based on the two different types of swarming behaviours of dragonflies in nature. The DA balances the phases of exploration and exploitation by imitating the natural swarm interaction of dragonflies for navigation, food search and enemy avoidance. Such behaviour is called dynamic or static swarming: dynamic swarming refers to the migrating phase, and static swarming denotes the hunting phase. In static swarming, the minimum possible number of dragonflies forms a small group and flies in all directions, whereas in dynamic swarming a significant number of dragonflies forms a big group and then flies in one direction. Correspondingly, the DA has two basic phases, exploration and exploitation, in the same way as any other meta-heuristic algorithm.
The main contributions of the paper are as follows. The input ultrasonic image is pre-processed first to determine the exact area for further analysis. Three image pre-processing techniques are introduced: ROI extraction, bicubic interpolation and background removal for ultrasound images [25]. Subsequently, several features are extracted separately for each image [27], derived from first-order grey-level statistics, second-order statistics with modified angles, higher-order algebraic moment invariants, Gaussian differential characteristics and newly defined power-spectral parameters. In addition, some features have been removed for comparison purposes.
The rest of this study is structured as follows: Section 2 describes related works pertaining to the use of identification of ultrasound kidneys. Section 3 illustrates the proposed methodology on an automatic system for analysing and identifying the ultrasound kidneys (like cysts, stones, cross-parenchyma and tumours) as normal or anomalous. Section 4 demonstrates the experimental results. Section 5 concludes the study.

RELATED WORK: A BRIEF REVIEW
Various studies already exist in the literature on the classification of lung cancer, the classification of liver diseases and the segmentation of renal tumours, approached from various angles. A part of this work is reviewed in this section. Makaju et al. [28] developed a CT image scheme for lung cancer identification, using watershed segmentation to identify the cancer nodule in the lung CT test image and an SVM to classify the nodule as malignant or benign. The model achieved a detection accuracy of 92% and a classification accuracy of 86.6%. Overall, the suggested scheme improved on the best existing models, but it did not classify the cancer into stages 1, 2, 3 and 4.
Qiang Zheng et al. [23] introduced a new graph-cut-based technique for ultrasound (US) kidney image segmentation, incorporating original image intensity data and texture maps obtained using Gabor filters. They constructed a graph of image pixels close to the kidney boundary, rather than of the whole image, to manage large appearance variations within renal images and to increase computational efficiency. For bilateral kidneys, this technique produced successful segmentation results. Qiang Zheng et al. also implemented a transfer-learning technique for extracting features from ultrasound kidney images to enhance childhood diagnosis: specifically, a pre-trained deep learning network was introduced to produce 3-channel feature maps computed from the US imagery, comprising the original images, gradient features and distance-transform characteristics.
Jun Xie et al. [24] suggested a new segmentation technique, based on a new texture model and a prior shape, to segment the kidneys in ultrasound images. The texture model was built using the expectation-maximisation technique to estimate the parameters of a collection of half-plane Gaussian mixtures. In this way, it correctly and effectively segments the kidney in US images.
Polycystic kidney disease in medical ultrasound images was evaluated in 2018 by Akkasaligar and Biradar [25], who launched a new strategy based on morphological operations. Segmentation was carried out using gradient vector force (GVF) snakes. The suggested technique focuses on classifying the medical ultrasound kidney image into cystic and polycystic types.

PROPOSED METHODOLOGY
Figure 1 shows the overall workflow of automatic diagnosis and classification of ultrasound kidney images. A random forest classifier is trained for this purpose. Ultrasound kidney images are given as input to the pre-processing stage, which contains ROI segmentation, bicubic interpolation and noise elimination. The pre-processed image is then passed to the feature extraction stage, which applies the discrete Fourier transform, Gabor filters and multi-scale Gaussian differential features; finally, features are selected with the help of the dragonfly algorithm. Based on the training, the random forest classifier is used for testing. A detailed discussion is given in the following sections.

Image pre-processing
Image pre-processing is a key stage of detection: it removes noise and improves image quality, and it must be implemented to restrict the search for anomalies to the relevant context. Image pre-processing refers to operations on images at the lowest level of abstraction, whose purpose is to suppress undesired distortions or enhance those image features that are significant for further processing; it does not enlarge the image information content. The scanned images are enhanced by piecewise linear transformation, maximizing image quality by eliminating unwanted data. This phase primarily enhances image quality by removing unrelated and excess parts from the background of the picture before further treatment. A good choice of pre-processing methods can improve the system's precision significantly: the accuracy of a visual test can be substantially improved by image pre-processing, and several filter operations that strengthen or suppress certain image details make subsequent evaluation simpler or quicker.

Region of interest segmentation
The first stage of image processing is to ascertain the region of interest (ROI). It improves the speed and precision of the classification system by selecting only the kidney and eliminating unnecessary information such as patient data and scan annotations. This research generates an automatically specified rectangular ROI that can be used as a pre-processing phase for any further segmentation technique, because it removes the redundant background while keeping the lesion and neighbouring tissue intact. The ROI size is chosen as 256 × 256 pixels, appropriate for both longitudinal and cross-sectional kidney images. Seed-point selection for generating the ROI is quite difficult in renal ultrasound images, for both longitudinal and cross-sectional views. Therefore, an automated ROI method is proposed to support the complete segmentation of renal ultrasound images. The texture is analysed by computing the local entropy of the image, followed by threshold selection, morphological operations, a material window, seed-point determination and ROI generation. This process was performed with different speckle-noise-reduction methods and different threshold values for multiple kidney ultrasound images. In the texture analysis, the renal sinus, at the centre of the kidney, appears lighter than other portions of the image and is the most prevalent area found in kidney images.
Here, 100 healthy and 100 diseased images with different sizes and spatial distributions were randomly chosen as the training database. A pixel within an image is referred to by the tuple (x, y, v), where x and y are the pixel coordinates and v is the pixel value. An image I (432 × 432 pixels) is then described as a set of pixels, I = {(x, y, v) : 1 ≤ x ≤ 432, 1 ≤ y ≤ 432}.
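The local-entropy texture step above can be sketched as follows. This is a minimal toy illustration, not the authors' implementation: `local_entropy` and `roi_bounding_box` are hypothetical names, the window size and threshold are illustrative, and the morphological and seed-point steps are omitted.

```python
from math import log2

def local_entropy(img, y, x, half=1):
    """Shannon entropy of the grey levels in a square window centred at (y, x)."""
    vals = [img[j][i]
            for j in range(max(0, y - half), min(len(img), y + half + 1))
            for i in range(max(0, x - half), min(len(img[0]), x + half + 1))]
    n = len(vals)
    return -sum((vals.count(v) / n) * log2(vals.count(v) / n) for v in set(vals))

def roi_bounding_box(img, threshold, half=1):
    """Bounding box of the pixels whose local entropy exceeds the threshold."""
    hits = [(y, x) for y in range(len(img)) for x in range(len(img[0]))
            if local_entropy(img, y, x, half) > threshold]
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys), max(xs)
```

On a uniform background, only windows that overlap a textured patch have non-zero entropy, so the bounding box localizes the textured region.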

Bicubic interpolation
Bicubic interpolation approximates the luminance surface internally by a bicubic polynomial; usually the 16 neighbouring points are utilized for interpolation. Bicubic interpolation needs 16 pixels and 16 floating-point operations and is compatible with window processing, so spatially and temporally parallel methods may be used in its execution. Image interpolation can be viewed as fitting a continuous function through the discrete samples of a digital image: because the interpolation function is continuous, a new pixel value can be found at any required place. Interpolation, in other words, reconstructs the pixel values that are missed when the interpolation function is sampled. Interpolation of equally spaced data is mathematically represented as

g(x) = Σ_i Q_i R(x − a_i),

where R is the interpolation kernel, weighted by the coefficient Q_i allocated to the data sample a_i; R is a symmetric function. Bicubic interpolation is the two-dimensional form of cubic interpolation; Lagrange polynomials, cubic splines, etc. can be used for bicubic interpolation. The cubic interpolation equation of a third-degree spline provides an efficient approximation to the optimal sinc function. The resulting interpolated picture is smoother than with nearest-neighbour (NN) or bilinear interpolation. To compute a grey pixel value, bicubic interpolation uses a weighted sum of 16 pixels. The cubic interpolation kernel comes from the generic cubic spline interpolation equation by enforcing continuity and smoothness constraints. The kernel of cubic interpolation is

R(x) = (a + 2)|x|³ − (a + 3)|x|² + 1 for |x| ≤ 1; R(x) = a|x|³ − 5a|x|² + 8a|x| − 4a for 1 < |x| < 2; R(x) = 0 otherwise,

with a commonly set to −0.5.
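The cubic convolution kernel can be sketched directly. This is a minimal sketch of the standard Keys kernel (the usual bicubic weight function), assuming a = −0.5; `cubic_kernel` is an illustrative name.

```python
def cubic_kernel(x, a=-0.5):
    """Keys' cubic convolution kernel: the per-sample weight used by
    bicubic interpolation, supported on |x| < 2."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0
```

For a 2-D image the kernel is applied separably in each direction, which is where the 4 × 4 = 16-pixel neighbourhood comes from.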

Noise elimination
Noise can lead to incorrect segmentation outcomes, so it is essential to suppress noise before segmentation. Noise removal is the process of recovering a clean picture from a degraded one, and degraded pictures can be restored in various ways. Different flaws such as imperfect capture, poor focusing and distortion can cause the picture to degrade, often introducing noise or blur. Because of this corruption, it is important to understand the noise in a picture in order to pick the most suitable denoising algorithm. Salt-and-pepper noise, speckle noise and Gaussian noise are commonly encountered in such pictures. Noise can be separated from the pictures with different methods such as the mean filter, Gaussian filter, median filter and Wiener filter. A single denoising algorithm that is efficient for all kinds of noisy images is very hard to obtain; the fundamental characteristics of a good denoising method are noise suppression and edge preservation. Noise develops during capture or image transfer, and means that pixels in the image portray intensity values that differ from the actual pixel values of the scene. Noise removal is the process of eliminating or diminishing this noise in the image.
The mean filter is a linear filter that uses the average over the neighbourhood, while the median filter is a non-linear filter that replaces each pixel intensity with the median of the pixel intensities in the chosen neighbourhood of the filter. Based on a statistical technique, the Wiener filter removes the added noise while maintaining edges and simultaneously reversing the blur. The Gaussian filter is renowned for blurring and decreasing noise through filtering, performing a convolution that can be formulated in both the spatial and the frequency domain.
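The median filter described above can be sketched in a few lines. This is a minimal illustration, assuming a fixed 3 × 3 window and leaving border pixels unchanged; `median_filter3` is an illustrative name.

```python
def median_filter3(img):
    """3x3 median filter over a grey image given as a list of lists.
    Border pixels are copied through unchanged."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # the middle of the 9 sorted values
    return out
```

Because the output is an order statistic rather than an average, an isolated salt-and-pepper outlier is removed without smearing edges, which is why the median filter is preferred for impulse noise.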

Gaussian filter
Gaussian filtering is performed with odd-sized masks. Each coefficient of the mask has a value calculated from the Gaussian distribution function. The one-dimensional Gaussian function (centred at a = μ) is continuous:

G(a) = (1 / (σ√(2π))) exp(−(a − μ)² / (2σ²))  (3)

where a is the filtered point position, μ is the mean and σ is the standard deviation. The two-dimensional Gaussian function (centred at (μ_a, μ_b)) is

G(a, b) = (1 / (2πσ²)) exp(−((a − μ_a)² + (b − μ_b)²) / (2σ²))  (4)

where (a, b) is the filtered point location in 2D. Since the filtering is performed with a mask, the weights are indexed by row and column numbers as the coefficients a and b in equations (3) and (4).
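Building the discrete mask from equation (4) can be sketched as follows. This is a minimal sketch assuming a zero-centred, normalized mask (the constant prefactor cancels in the normalization); `gaussian_kernel` is an illustrative name.

```python
from math import exp

def gaussian_kernel(size, sigma):
    """Odd-sized 2-D Gaussian mask, normalized so its weights sum to 1."""
    half = size // 2
    raw = [[exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(v for row in raw for v in row)
    return [[v / total for v in row] for row in raw]
```

Normalizing the weights keeps the overall image brightness unchanged when the mask is convolved with the image.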

Wiener filter
The Wiener filter is another filter used in this work for removing noise. It can be derived by minimizing the squared error of equation (5),

e² = Σ (g − h)²,  (5)

where g represents a pixel of the original image and h the corresponding pixel of the image after filtering; the sum is taken over all pixels of g and h (assumed to be the same size). This sum is a metric of the proximity between g and h, and filters designed according to this minimum-square principle are termed Wiener filters.

Feature extraction
Manual segmentation takes place under the supervision of a doctor who is well trained on normal and anomalous images. The features needed to assess kidney function are derived from the segmented regions. Feature extraction is a process of reducing dimensionality, thereby reducing the initial set of raw data into manageable groups for processing.

First order grey level statistics
Let g(p, q) be a discrete picture and S_T the total pixel count in the kidney region ℘. Average (R1), dispersion (R2), variance (R3), average energy (R4), skewness (R5), median (R6), kurtosis (R7) and mode (R8) are calculated as the first-order grey-level statistics.
If the intensity values v(e, f) of g(p, q) are arranged in ascending order, the median is the centre value; for even N, the median is predicted by averaging the two centre scores. If hist(a_k) is the picture histogram, which gives the number of pixels of grey level a_k, then the mode is the grey level at which hist(a_k) is maximal.
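The statistics R1–R8 can be sketched over the ROI pixels. This is a minimal illustration (population moments, standardized skewness and kurtosis); `first_order_stats` is an illustrative name, not the paper's implementation.

```python
from math import sqrt

def first_order_stats(pixels):
    """First-order grey-level statistics of the pixels inside the kidney ROI."""
    n = len(pixels)
    mean = sum(pixels) / n                                   # R1
    var = sum((p - mean) ** 2 for p in pixels) / n           # R3
    sd = sqrt(var)                                           # R2 (dispersion)
    energy = sum(p * p for p in pixels) / n                  # R4
    skew = sum((p - mean) ** 3 for p in pixels) / (n * sd ** 3) if sd else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * sd ** 4) if sd else 0.0
    srt = sorted(pixels)
    median = srt[n // 2] if n % 2 else (srt[n // 2 - 1] + srt[n // 2]) / 2
    mode = max(set(pixels), key=pixels.count)                # peak of histogram
    return {"mean": mean, "variance": var, "dispersion": sd, "energy": energy,
            "skewness": skew, "kurtosis": kurt, "median": median, "mode": mode}
```

In practice these would be computed only over the pixels of the segmented region ℘, not the full image.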

Second order grey level statistics
Texture characteristics play a significant part in describing regions of images. There are two basic methods for describing texture: statistical and structural. For statistical texture description the spatial grey-level dependence method (SGLDM) is commonly used; this technique can discriminate between all recognized, visually separable texture pairs. The second-order statistical characteristics are calculated in a two-stage process. First, the co-occurrence matrices with elements B_ef(k, l) are computed: each (k, l)-th matrix entry is the probability of moving from a pixel of grey level k to a pixel of grey level l, at angles of 0°, 30°, 60°, 90°, 120° and 150°. The texture characteristics are then predicted from the co-occurrence matrices. There are 14 texture measures that could be extracted from B_ef(k, l); the following five structural characteristics are calculated here: energy (EGY), entropy (EPY), correlation (C), inertia (I) and homogeneity (HTY).
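The two-stage process can be sketched for a single offset. This is a minimal illustration, assuming a one-directional (non-symmetric) co-occurrence count and showing only two of the five measures; `glcm`, `energy` and `entropy` are illustrative names.

```python
from math import log2

def glcm(img, dx, dy, levels):
    """Normalized grey-level co-occurrence matrix for the offset (dx, dy)."""
    P = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    n = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                P[img[y][x]][img[y2][x2]] += 1  # count the grey-level pair
                n += 1
    return [[p / n for p in row] for row in P]

def energy(P):
    """Angular second moment: sum of squared co-occurrence probabilities."""
    return sum(p * p for row in P for p in row)

def entropy(P):
    """Shannon entropy of the co-occurrence distribution."""
    return -sum(p * log2(p) for row in P for p in row if p > 0)
```

The angles 0°, 30°, …, 150° correspond to different (dx, dy) offsets, and each angle yields its own matrix and feature values.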

Correlation
The correlation between ultrasound features and the actual histopathological condition of kidney tissues, together with diagnosis and follow-up of canine chronic kidney disease, is examined: the correlation among each ultrasonography abnormality, the RUS, and the degeneration and inflammation scores is tested with a linear mixed model that treats the patient as a random factor, and the correlation equation is given in equation (16).
Homogeneity was measured via entropy. Entropy, as a measure of the information content (or inverse randomness) of data, is well known in information theory. The signal-intensity entropy is computed from the histogram that represents the distribution of intensities present in the cartilage box. The histogram H may be described as H(i) = n_i, where n_i (i = 0, 1, …, L − 1) denotes the number of pixels with signal intensity i and L denotes the number of distinct grey levels in the cartilage. The histogram is normalized by the total number of intensities N, so that it indicates the probability distribution of signal intensities and is thus invariant with cartilage volume; the homogeneity equation is given in equation (18).
The correlated features are extracted based on the mean and standard deviation values of the ultrasound image. Usually current weight, albumin level, entropy, correlation, variance, homogeneity and coarseness are among the correlated features used in detecting kidney carcinoma. The current weight of the correlated features during feature extraction in ultrasound kidney carcinoma is 0.97. The albumin level of the kidney lies in 0.89 ± 0.05. The entropy feature based on the energy is 0.989. The correlation feature based on grey-level statistics is 0.9 ± 0.04. The variance based on grey-level statistics is 0.93 ± 0.03. The complexity of the correlated features is 0.89 ± 0.03. The coarseness of the correlated features based on the ultrasound kidney image is 0.96 ± 0.02.

Higher order algebraic moment invariants
A moment invariant is a rotation-, translation- and scale-(RTS-)invariant property of a connected region. It describes an easily calculated collection of regional properties that can be used to identify shapes, parts and classes; this is the classic method for producing algebraic invariants. If g(p, q) is an image, the algebraic moments are given by

m_xy = Σ_p Σ_q p^x q^y g(p, q).  (19)

For each image, a fixed number of lower-order moments are computed using (x + y ≤ 3). The moment invariants are generally defined over the normalized central moments η_xy, given by

η_xy = μ_xy / μ_00^(1 + (x + y)/2),

where μ_xy are the central moments taken about the centroid. The set of seven RST-invariant characteristics Φ = [φ1, φ2, …, φ7] is calculated for the renal pictures from the normalized central moments with equation (20); each equation provides one invariant value.
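The first of the seven invariants, φ1 = η20 + η02, can be sketched end to end. This is a minimal illustration of the moment pipeline (raw moments → centroid → central moments → normalization); `hu_phi1` is an illustrative name, and only translation invariance is exercised here.

```python
def hu_phi1(img):
    """First Hu moment invariant, phi1 = eta20 + eta02, of a grey image."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    xc, yc = m10 / m00, m01 / m00          # centroid
    mu20 = mu02 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            mu20 += (x - xc) ** 2 * v       # central moments about the centroid
            mu02 += (y - yc) ** 2 * v
    # eta_xy = mu_xy / mu00^(1 + (x+y)/2); for order 2 the exponent is 2
    return (mu20 + mu02) / m00 ** 2
```

Because the central moments are taken about the centroid and normalized by μ00, the value is unchanged when the shape is translated within the image.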

Multi-scale Gaussian differential features
The distinguishing characteristic is the vector of spatial derivatives. The first two orders of spatial derivatives in the m and n directions are utilized as the characteristic vector for a given picture g(p, q). This vector provides useful statistical data on the local intensity surface pattern of the image: the first derivatives constitute the intensity gradient (edges), and the second derivatives the curvature. Stability is a primary focus in the derivative calculation; the derivatives are robust if the picture distribution is relatively stable. Therefore, Gaussian derivative filters are calculated instead of using just finite differences: the multi-scale differential characteristics emerge from convolution with the first and second derivatives of the Gaussian filter mask.
The first derivatives of the image g(p, q) in the e and f directions are given by convolution with the corresponding Gaussian derivative kernels, and the second derivatives are obtained analogously as A(e, f).
Several distinctive features can be formulated from these derivatives by fixing σ = 4, and certain descriptors and techniques are created from them. These characteristics are chosen mainly for their tolerance to rotation and lighting. Two principal curvature characteristics are calculated, namely the isophote curvature (IS) and the flowline curvature (FW).

Power spectral features
A 2-D FT algorithm that generates the power spectrum L(r, s) is applied to the pre-processed picture, and the squared intensities are measured. The technique is used to discover the energy proportions for the ranking of renal abnormalities, which depend on the angular rate of the radial cut-off (Ω_pc). The power spectral parameters Ls1, Ls2 and BJ1, BJ2, BJ3 and BJ4 are defined as follows from the projected power spectrum,
where U_n/b implies the angular radial cut-off frequency (n = 1), L_Y-mean denotes the mean of L(r, s) over the NR, MRD and CC images (the global mean), and L_Y-total implies the total power of L(r, s) over the NR, MRD and CC images (the global total power).

Discrete Fourier transform
The Fourier transform represents the picture in the frequency domain as sine and cosine components. The discrete Fourier transform (DFT) of a 2-D picture of size N × N is given by

F(k, l) = Σ_{a=0}^{N−1} Σ_{b=0}^{N−1} f(a, b) e^{−i2π(ka/N + lb/N)},

where f(a, b) is the image value in the spatial domain and F(k, l) is the equivalent value in the frequency domain at coordinates k and l.
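The DFT formula can be sketched directly. This is a naive O(N⁴) illustration for clarity only (an FFT would be used in practice); `dft2` is an illustrative name.

```python
import cmath

def dft2(f):
    """Naive 2-D DFT of an N x N image given as a list of lists."""
    N = len(f)
    return [[sum(f[a][b] * cmath.exp(-2j * cmath.pi * (k * a + l * b) / N)
                 for a in range(N) for b in range(N))
             for l in range(N)]
            for k in range(N)]
```

The DC coefficient F(0, 0) equals the sum of all pixel values, which is a quick sanity check for any DFT implementation.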

Gabor filter
The characteristics chosen from the pictures are used to define a category. In this work, Gabor features are obtained to classify the renal pictures. Feature extraction with Gabor filtering links numerous frequencies (scales) with distinct orientations, and the wavelet-decomposed pictures are used to evaluate the Gabor characteristics. The Gabor filter is described as

x(u, v, θ, λ) = exp(−(u′² / (2σ_u²) + v′² / (2σ_v²))) cos(2πu′/λ),

where σ_u and σ_v are the standard deviations in the u and v directions, f = 1/λ is the central frequency with wavelength λ, θ is the angle of orientation, and (u′, v′) are the coordinates rotated by θ. The principal characteristics of the form and structure of the kidney can be projected with frequency and direction. The characteristics are extracted at six distinct orientations and four distinct wavelengths. The parameters σ_u1 and σ_v1 are summarized in equations (43) and (44). From the Gabor filter x(u, v, θ, λ), the Gabor features can be extracted: measured at six orientations and four distinct central frequencies, they produce 24 characteristics for each decomposition of the kidney pictures.
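Generating one such filter mask can be sketched as follows. This is a minimal sketch of the real (even) part of a standard Gabor kernel, not the paper's exact parameterization; `gabor_kernel` and its arguments are illustrative.

```python
from math import exp, cos, sin, pi

def gabor_kernel(size, sigma_u, sigma_v, wavelength, theta):
    """Real part of a Gabor filter mask of odd size, orientation theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * cos(theta) + y * sin(theta)    # rotate into filter frame
            v = -x * sin(theta) + y * cos(theta)
            envelope = exp(-0.5 * (u * u / sigma_u**2 + v * v / sigma_v**2))
            row.append(envelope * cos(2 * pi * u / wavelength))
        kernel.append(row)
    return kernel
```

A filter bank is then built by varying `theta` over six orientations and `wavelength` over four scales, and each filtered response yields one feature, giving the 24 features per decomposition mentioned above.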

Connection based feature subset selection (CFS)
CFS evaluates a feature subset under the assumption that a good subset contains features highly correlated with the class yet uncorrelated with each other. The numerical evaluation of a subset gives this hypothesis operational significance as follows:

Merit = b · t̄_oj / √(b + b(b − 1) t̄_oo),  (46)

where t̄_oj implies the mean correlation between the features and the class variable, b indicates the feature quantity, and t̄_oo implies the mean inter-correlation among the features.

Scale invariant feature transform (SIFT)
The scale-invariant feature transform (SIFT) provides an impressive description of characteristics at particular image points, such as edges, blobs and corners. It provides a collection of characteristics that avoids the problems of other techniques, and it operates well when images of a comparable scene are captured at gradually varying scales and from different poses. The vector of each component is invariant to scaling, translation and rotation. The SIFT pipeline proceeds in refinement stages. Scale-space extrema are detected by finding the extrema of the difference-of-Gaussian convolution. The keypoints are then refined by discarding weak edge responses and low-contrast points: large edge responses that run along borders between neighbours, as well as smaller peaks, are driven out. For each keypoint, the gradient magnitude and orientation are determined: the gradient magnitude m(x, y) may be ascertained with equation (47), and the orientation of the image pixels with equation (48).
Classifying images as normal or abnormal over all of the obtained characteristics is a tedious process; consequently, the features must be reduced to improve classification precision.

Feature selection with dragonfly algorithm
The appropriate features are chosen using the dragonfly optimization algorithm (DFA) after the feature extraction process, since a substantial number of features influences the quality of the classification system and increases the computational difficulty. The DFA is based on the attributes and intellectual ability of dragonflies, in particular their distinctive and superior swarming behaviour. Dragonflies swarm either for hunting or for migration: the hunting, or static, swarm behaviour describes a small group of dragonflies that change their moves locally and rapidly, whereas the migratory, or dynamic, swarm behaviour is associated with a huge number of flies flying along lengthy distances in one direction. The static and dynamic swarms correspond to the DFA's exploration and exploitation abilities. Dragonfly behaviour is based upon the principles of separation, alignment, cohesion, distraction from enemies and attraction to food. The step-by-step feature selection process is shown below. Separation (S_k): separation relates to the static collision avoidance of the dragonfly from other dragonflies in the vicinity,

S_k = − Σ_{x=1}^{N} (W − W_x),  (49)
where W implies the location of the current individual, W_x portrays the position of the x-th neighbouring individual, and N refers to the number of neighbouring individuals. Alignment (A_k): alignment shows how the dragonfly matches its velocity with the velocities of other dragonflies in the vicinity,

A_k = (Σ_{x=1}^{N} V_x) / N,  (50)
where V_x portrays the velocity of the x-th neighbouring individual. Cohesion (C_k): cohesion relates to the dragonflies' inclination towards the centre of mass of the neighbourhood,

C_k = (Σ_{x=1}^{N} W_x) / N − W,  (51)
where W refers to the location of the current individual, N refers to the number of neighbours, and W_x portrays the x-th neighbouring individual. Attraction to the food source (FS_k): attraction demonstrates the dragonflies' tendency to move towards the source of food,

FS_k = W⁺ − W,  (52)
where W is the position of the current individual and W⁺ is the position of the food source. Distraction from the enemy (DE_k): distraction demonstrates the inclination of dragonflies to avoid a competing foe,

DE_k = W⁻ + W,  (53)
where W refers to the location of the current individual and W⁻ portrays the position of the enemy. The food source and position vectors of the DFA are updated depending on the fittest individual, while the enemy's fitness and position are taken from the worst dragonfly. This helps the DFA converge towards more promising regions of the search space and avoid non-promising regions. Two vectors update the positions of the dragonflies: the velocity vector (ΔW) and the position vector. The velocity vector portrays the direction of movement of the dragonflies and is quantified as equation (54),

ΔW_{t+1} = (s S_k + a A_k + c C_k + f FS_k + d DE_k) + w ΔW_t,  (54)
where s, a, c, f, d and w portray the weighting factors of the respective components. The position vector of the individuals is quantified as equation (55),

W_{t+1} = W_t + ΔW_{t+1},  (55)

where t refers to the current iteration. To ensure the convergence of the dragonflies, their weights must be adapted to traverse from exploration to exploitation of the search space. It is also assumed that dragonflies see more neighbours as the optimization progresses, so the neighbourhood area is increased and, at the last stage of optimization, the swarm becomes one group that converges to the global optimum. The food source and the enemy are selected from the best and worst solutions discovered so far; this leads to convergence towards promising regions of the search space and divergence from non-promising ones. When no neighbouring solutions exist, the dragonfly positions are modified with a Lévy random walk,

W_{t+1} = W_t + Lévy(d) × W_t,  (56)

where t refers to the current iteration and d implies the dimension of the position vectors. The Lévy flight is quantified as

Lévy(d) = 0.01 × (h_1 × σ) / |h_2|^{1/β},  (57)

where h_1, h_2 refer to two random numbers in [0, 1], β is a constant, and σ is quantified as

σ = [Γ(1 + β) sin(πβ/2) / (Γ((1 + β)/2) β 2^{(β−1)/2})]^{1/β},  (58)

where Γ(g) = (g − 1)!. The dragonflies are initialized randomly to either 0 or 1: if a dragonfly's k-th position is 0, the k-th attribute is not selected for the classification process; if it is 1, the classification process takes the k-th attribute. The optimization cycle stops when the maximum number of iterations is completed or when the best solution is discovered. Once the optimum number of features has been obtained, a feature vector is established to constitute the selected features, which are then fed to the classification system. A random decision forest classifier is used for classification, classifying the given image into two classes, namely normal and tumour.
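One update step combining equations (49)–(55) can be sketched in one dimension. This is a minimal sketch, not the paper's implementation: all individuals are treated as neighbours, positions are scalars rather than binary feature masks, and `da_step` and the weight values are illustrative.

```python
def da_step(positions, velocities, food, enemy,
            w=0.9, s=0.1, a=0.1, c=0.7, f=1.0, d=1.0):
    """One dragonfly-algorithm update over all individuals (1-D positions)."""
    n = len(positions)
    new_pos, new_vel = [], []
    for i, W in enumerate(positions):
        others = [positions[j] for j in range(n) if j != i]
        S = -sum(W - Wx for Wx in others)                               # eq. (49)
        A = sum(velocities[j] for j in range(n) if j != i) / (n - 1)    # eq. (50)
        C = sum(others) / (n - 1) - W                                   # eq. (51)
        F = food - W                                                    # eq. (52)
        E = enemy + W                                                   # eq. (53)
        dW = s * S + a * A + c * C + f * F + d * E + w * velocities[i]  # eq. (54)
        new_vel.append(dW)
        new_pos.append(W + dW)                                          # eq. (55)
    return new_pos, new_vel
```

For feature selection, the continuous position would additionally be squashed through a transfer function and thresholded to a 0/1 feature mask at each iteration.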

Random decision forest classifier
Random forest (RF), proposed by Leo Breiman, is a fast, accurate and noise-tolerant classification system. Each tree in the forest is grown from an independently drawn random vector and has the same distribution as the other trees in the forest. RF comprises a large number of decision trees, where each decision tree chooses its splitting features from a bootstrap training set S_i, with i indexing the tree. As the number of trees in the forest becomes large, the generalization error converges to a limiting value. Random forest is one of the most reliable learning algorithms available and creates a highly accurate classifier for many data sets. A random forest is a collection of unpruned decision trees and is often used when the training data set is large and the number of input variables is substantial. It is an ensemble classifier comprising numerous decision trees that outputs the class label receiving the most votes among the individual trees. Figure 2 shows the sample framework of RF. The key step of the proposed system is to distinguish correctly between normal and cancerous kidney images. The input to the classification stage is the feature vector from the previous step together with the label vector (where 0 = normal and 1 = cancer). Each tree is constructed according to the following steps: 1. Let T be the number of training cases and let V be the number of variables in the classifier.
2. A number v of input variables is used to determine the decision at each tree node, where v should be much smaller than V. The v input variables are selected based on the Gini criterion.
3. Form the training set for this tree by selecting T times with replacement (i.e. take a bootstrap sample) from all T available training cases.
4. For each node of the tree, randomly choose v variables on which to base the decision at that node.

Every tree is grown completely and not pruned.
Here T is the number of instances in the data set, V is the number of attributes, and v attributes are randomly selected from the V attributes when searching for the best split.
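The Gini criterion used in step 2 prefers the split that minimizes node impurity. A minimal sketch of the impurity measure (the function name is illustrative):

```python
def gini_impurity(labels):
    """Gini impurity of a label set: 1 - sum over classes of p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# A pure node has impurity 0; an even two-class split gives 0.5
assert gini_impurity([0, 0, 0, 0]) == 0.0
assert gini_impurity([0, 0, 1, 1]) == 0.5
```

At each node, the candidate split among the v randomly chosen attributes with the lowest weighted child impurity is taken.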
Output: a set of decision trees. To predict a new sample, it is passed down each tree and assigned the label of the terminal node in which it ends. In classification tasks, RF collects a class vote from every tree and the final classification is made by majority vote; this process is repeated across all trees in the ensemble, and the aggregated vote of all trees is taken as the random forest's prediction.
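The construction above (bootstrap sampling, per-node random feature subsets, unpruned trees, majority voting) can be sketched with scikit-learn. The synthetic feature matrix below merely stands in for the selected ultrasound features, and the labels follow the paper's 0 = normal / 1 = cancer convention:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic stand-in for the selected feature vectors:
# 100 samples, 8 features, labels 0 = normal, 1 = cancer
X = rng.normal(size=(100, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# bootstrap=True mirrors step 3 (bootstrap sample per tree);
# max_features="sqrt" mirrors step 2/4 (v << V features per node);
# scikit-learn grows each tree fully unpruned by default.
clf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                             bootstrap=True, random_state=0)
clf.fit(X, y)

# The prediction is the aggregated vote over all trees
pred = clf.predict(X[:5])
```

This is only an illustration of the classifier family; the paper's own implementation is in MATLAB/Simulink.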

Performance metrics
To compare with the existing ANN [29], KNN [30] and decision tree [32] methods, the performance of the proposed medical kidney cancer analysis system is evaluated using metrics such as sensitivity, accuracy, specificity, precision, recall, F-measure, NPV and MCC. These measurements are calculated from the true negatives (T_N), true positives (T_P), false negatives (F_N) and false positives (F_P). The evaluation measurements are defined in the following sections.

Accuracy
Accuracy measures the overall usefulness/effectiveness of the classification system. It is calculated as

Accuracy = (T_P + T_N) / (T_P + T_N + F_P + F_N).

Specificity
It measures the ability of the classifier to identify negative class patterns. It is calculated as

Specificity = T_N / (T_N + F_P).

Sensitivity
It measures the ability of the classifier to identify positive class patterns. It is calculated as

Sensitivity = T_P / (T_P + F_N).

F-measure
The F-measure combines recall and precision to assess classification accuracy. It is defined as

F-measure = 2 × (Precision × Recall) / (Precision + Recall).

Mathews correlation coefficient
In binary classification problems, the Mathews correlation coefficient (MCC) acts as a measure of classification quality, with values ranging from −1 to 1: here −1 indicates complete misclassification, 1 indicates perfect prediction of the class labels, and 0 corresponds to random prediction. The MCC formula is

MCC = (T_P × T_N − F_P × F_N) / √((T_P + F_P)(T_P + F_N)(T_N + F_P)(T_N + F_N)).

The error rate is the number of false predictions divided by the total number of samples in the data set.
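The evaluation measures above all derive from the four confusion counts, which a short helper makes concrete (the function name and the sample counts are illustrative, not results from the paper):

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation measures from the confusion counts."""
    total = tp + tn + fp + fn
    acc  = (tp + tn) / total                   # accuracy
    sens = tp / (tp + fn)                      # sensitivity / recall
    spec = tn / (tn + fp)                      # specificity
    prec = tp / (tp + fp)                      # precision
    f1   = 2 * prec * sens / (prec + sens)     # F-measure
    npv  = tn / (tn + fn)                      # negative predictive value
    mcc  = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    err  = (fp + fn) / total                   # error rate
    return {"accuracy": acc, "sensitivity": sens, "specificity": spec,
            "precision": prec, "f_measure": f1, "npv": npv,
            "mcc": mcc, "error_rate": err}

# Illustrative counts only (not the paper's experimental results)
m = classification_metrics(tp=45, tn=40, fp=5, fn=10)
```

Note that accuracy and error rate are complementary, and MCC always falls in [−1, 1].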
The negative predictive value, false positive rate and false negative rate are defined analogously:

NPV = T_N / (T_N + F_N),  FPR = F_P / (F_P + T_N),  FNR = F_N / (F_N + T_P).

Comparison of proposed and existing system

Table 1 portrays the quality comparison of the proposed and existing methods.

Dataset description
Kidney images are used for the diagnosis method. The database is compiled from publicly accessible online sources and comprises 100 images in total: 40 normal images, 30 tumour images and 30 stone images. 80% of the images are used for training and 20% for testing. Figures 3 and 4 show the proposed results for the automatic diagnosis and classification of ultrasound kidney images, for which the random forest classifier is trained. For testing, the acquired ultrasound kidney image is given as the input to the pre-processing stage, shown in the image acquisition block of Figures 3 and 4. The pre-processing section comprises ROI segmentation, noise elimination by bicubic interpolation, and correlation and homogeneity features, whose output results are shown in Figures 3 and 4. The pre-processed image is then passed to the feature extraction section, which comprises the discrete Fourier transform, Gabor filter and multi-Gaussian differential features; the feature extraction outputs are given in Figures 3 and 4. Finally, features are selected with the help of the dragonfly algorithm, with the selection results also shown in Figures 3 and 4. In Figure 3, the selected features show no abnormality, so the RF classifier labels the image as normal; in Figure 4, the selected features show abnormality, so the RF classifier labels the image as abnormal.
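The 80/20 split of the 100-image database can be sketched as follows; the label strings are illustrative placeholders for the stated class counts:

```python
import random

random.seed(0)

# 100 images: 40 normal, 30 tumour, 30 stone (labels are illustrative)
labels = ["normal"] * 40 + ["tumour"] * 30 + ["stone"] * 30
indices = list(range(len(labels)))
random.shuffle(indices)

split = int(0.8 * len(indices))      # 80% for training, 20% for testing
train_idx, test_idx = indices[:split], indices[split:]

assert len(train_idx) == 80 and len(test_idx) == 20
```

Shuffling before splitting keeps the class mix of the two subsets roughly representative of the whole database.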

Performance analysis
In Figures 5 and 6, the specificity, sensitivity, precision, accuracy, recall and F-measure of the proposed random forest classifier are greater than those of ANN, KNN and decision tree. Specificity and accuracy are comparatively higher than the other metrics, while sensitivity, precision, recall and F-measure show similar performance for the random forest classifier.
In Figures 7 and 8, the NPV and MCC values of the proposed random forest classifier are close to 1, showing higher performance compared to ANN [29], KNN [30] and decision tree [32]. For FPR, FNR, FRR and error rate, the proposed random forest classifier's values are close to 0, again showing better performance than ANN, KNN and decision tree.
In Figure 9(a), performance measurements such as specificity, sensitivity and accuracy are compared for the random forest classifier, KNN, ANN and decision tree. In sensitivity, the random forest classifier is 43.12%, 25.43% and 32.22% greater than KNN, ANN and decision tree, respectively. In specificity, it is 16.06%, 8.2% and 9.28% greater, and in accuracy 24.72%, 11.43% and 10.16% greater than KNN, ANN and decision tree, respectively. In Figure 9(b), for precision, recall and F-measure, the proposed random forest classifier shows the same percentages of 43.12%, 25.43% and 32.22% over KNN, ANN and decision tree, respectively.
In Figure 10(a), the proposed DFA is compared with GWO and PSO over the iterations. PSO shows steady performance up to 35 iterations, then its fitness increases from iterations 36 to 50 to an average of 0.65, rises gradually to iteration 69, and remains consistent from iterations 69 to 95. GWO increases gradually over iterations 0 to 10, stays consistent up to iteration 27, and then improves in fitness above the average range; the proposed DFA, however, operates steadily up to 66 iterations and then rises steeply from 0.71 to 0.95 by iteration 100. In Figure 10(b), the three algorithms are compared again. PSO shows steady performance up to 30 iterations and then improves in fitness around an average level from iterations 70 to 90. GWO shows consistent performance up to 20 iterations and then improves above the average range, whereas the proposed DFA operates steadily up to 60 iterations and then steps up from 0.71 to 0.95 at iteration 70, remaining constant until iteration 100. The overall fitness performance is best with DFA.
In Figure 11, the performance is calculated on the selected features. In Figure 11(a), MCC, FRR and error rate are computed: the MCC value is close to 1, and the FRR and error rate values are close to 0, indicating high efficiency. In Figure 11(b), NPV, FNR and FPR are computed: the NPV value is close to 1, and the FNR and FPR values are close to 0, again indicating high efficiency. In Figure 11(c), sensitivity, specificity and accuracy all perform efficiently.
In Figure 12, at execution time 0.05 ns, the speed of the proposed random forest classifier is 16.6%, 30% and 70.13% higher than the existing ANN, KNN and decision tree classifiers, respectively. At 0.15 ns it is 25.21%, 10.52% and 15.13% higher; at 0.25 ns, 15.38%, 63.63% and 60% higher; at 0.35 ns, 33.33%, 33.33% and 42.85% higher; and at 0.45 ns, 9.75%, 28.57% and 36.36% higher than the existing ANN, KNN and decision tree classifiers, respectively.

CONCLUSION
In this study, an automated system has been created to recognize and classify kidney diseases. The mechanism comprises five main elements: ROI segmentation, image pre-processing, feature extraction, feature selection and classification. The dominant feature range was used to develop and implement an automated technique to detect and classify US renal images. Owing to the effective pre-processing operation, the quality features that define the kidney were viable. Based on these numerical characteristic values, it is possible to create a universal reference for kidney types, analyse the image to assess pathologies and make a comparative research choice. Experimental outcomes show that the location of the kidney can be detected automatically and the true renal ROI generated with up to 89% accuracy. The true ROI generation could be used as a pre-processing step for any other segmentation technique.
The pre-processing of CT images for kidney tumour identification was also carried out in this study. Because CT images inherently contain noise, noise removal was essential for better segmentation outcomes; Gaussian and Wiener filters were applied to the CT images for this purpose. The predominant contribution is the classification of medical data by a novel RF classifier with the best possible features retrieved from the dragonfly algorithm.