Discrete Wavelet Transmission and Modified PSO with ACO Based Feed Forward Neural Network Model for Brain Tumour Detection

Abstract: In recent years, computer-aided diagnosis (CAD) has developed rapidly. Many traditional machine learning algorithms have been proposed for identifying pathological brains from magnetic resonance (MR) images. The existing algorithms have drawbacks with respect to accuracy, efficiency, and limited learning processes. To address these issues, we propose a pathological brain tumour detection method that uses the Wiener filter to improve the image contrast, the 2D discrete wavelet transform (2D-DWT) to extract the features, probabilistic principal component analysis (PPCA) and linear discriminant analysis (LDA) to normalize and reduce the features, and a feed-forward neural network (FNN) with modified particle swarm optimization (MPSO) and ant colony optimization (ACO) to improve the accuracy and stability and to overcome fitting issues in the classification of brain MR images. The proposed method achieves better results than other existing algorithms.

An automated system is required to classify MRI images quickly and accurately. CAD is one such automated system that can be used to perform the diagnostic process. Many traditional algorithms have been proposed for the pathological brain detection system (PBDS), but they remain at an early stage because of the difficulty of implementing feature extraction and image classification. The main goal of this paper is to improve the accuracy of the PBDS when classifying MR images. Section 2 reviews related work on the pre-processing, feature extraction, and classification of brain images. Section 3 presents the methodology of the proposed PBDS model. Section 4 provides an experimental analysis of the proposed model under different parameters. Finally, Section 5 concludes the research work.

Related work
In recent years, many researchers have proposed image classification mechanisms for brain imaging. Hebli et al. [Hebli and Gupta (2017)] proposed a three-level sub-band approximation of the 2D-DWT for feature extraction in MR images. They used DAUB-4 decomposition filters with a support vector machine (SVM) and self-organizing maps (SOM) to classify the images. This method achieved 98% accuracy in MR image classification. The major drawback of the SOM classifier is that it requires more computation time to classify the images. El-Dahshan et al. [El-Dahshan, Hosny and Salem (2010)] developed the improved DWT (IDWT) algorithm for extracting features of brain MRIs. This method uses the k-nearest neighbor and feed-forward ANN algorithms to differentiate between normal and abnormal images. Its accuracy is almost 97% in identifying normal images and 98% in identifying abnormal images. The major problem with the IDWT algorithm is that the features of MR images are not compared with an online database. Zhang et al. [Zhang, Wang and Wu (2010)] proposed hybrid algorithms for the classification of brain images. All of these algorithms used the three-level sub-band 2D-DWT for feature extraction, but with different image classifiers, such as feed-forward ANN [Shakeel, Tobely, Al-Feel et al. (2019)], the adaptive chaotic PSO algorithm [Saba, Mohamed, El-Affendi et al. (2020)], the scaled chaotic ABC algorithm [Zhang, Wu and Wang (2011)], kernel SVM [Zhang and Wu (2012)], and KSVM+PSO [Zhang, Wang, Ji et al. (2013)]. Das et al. [Das, Chowdhury and Kundu (2013)] developed the Ripplet transform for extracting features from MR images. PCA was employed for feature reduction and LS-SVM was used to classify diseased and non-diseased brains from MR images.
The Ripplet method achieved high classification accuracy when it was applied to larger datasets, but it involves a complex procedure when it is operated with online datasets. Saritha et al. [Saritha, Joseph and Mathew (2013)] proposed a feature extraction method using the wavelet entropy SWP. This method computes the entropy of the DAUB-4 wavelet and uses the PNN to classify brain images. The results of the feature extraction method are effective in comparison with other algorithms [Mudukshiwale, Amit and Patil (2019); Armin, Sharif, Yasmin et al. (2018); Sharma, Purohit and Mukherjee (2018); Othman, Abdullah and Kamal (2011); Padlia and Sharma (2019)].

Pre-processing phase
The Wiener filter is a powerful tool for pre-processing MR images. It is used to decrease signal noise by replacing the impulse filter [Naimi, Adamou and Mitiche (2015)]. Wiener filtering provides a trade-off between noise smoothing and inverse filtering: it smooths the noise while inverting the image blurring. It uses a stochastic framework to apply a linear approximation to the original brain MR image. Eq. (1) shows the Wiener filter used in the pre-processing stage, expressed in the Fourier domain.

$$F(f_1, f_2) = \frac{B^{*}(f_1, f_2)\, S_x(f_1, f_2)}{|B(f_1, f_2)|^{2}\, S_x(f_1, f_2) + S_n(f_1, f_2)} \tag{1}$$

where $B(f_1, f_2)$ represents the blurring filter, $S_x(f_1, f_2)$ represents the power spectrum of the original brain MR image, and $S_n(f_1, f_2)$ represents the power spectrum of the additive noise.
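As an illustration of the frequency-domain Wiener filter described above, the following is a minimal NumPy sketch, not the authors' MATLAB implementation. It assumes a known blur kernel and a constant noise-to-signal ratio Sn/Sx; the function name `wiener_filter`, the 3×3 box kernel, and the ratio `1e-6` are hypothetical choices made only for the example.

```python
import numpy as np

def wiener_filter(blurred, blur_kernel, noise_to_signal):
    """Restore an image with F = conj(B) / (|B|^2 + Sn/Sx) in the Fourier domain."""
    # Zero-pad the kernel to the image size and transform both to frequency space.
    B = np.fft.fft2(blur_kernel, s=blurred.shape)
    Y = np.fft.fft2(blurred)
    # Dividing the numerator and denominator of the Wiener filter by Sx gives
    # this form; Sn/Sx is assumed constant here (an assumption of this sketch).
    F = np.conj(B) / (np.abs(B) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(F * Y))

# Synthetic example: blur a random image circularly, then restore it.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
kernel = np.ones((3, 3)) / 9.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel, s=image.shape)))
restored = wiener_filter(blurred, kernel, noise_to_signal=1e-6)

error_before = np.mean((blurred - image) ** 2)
error_after = np.mean((restored - image) ** 2)
```

A larger noise-to-signal value makes the filter smooth more and invert less, which is the trade-off the text refers to.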

Feature extraction phase
In the proposed model, the discrete wavelet transform (DWT) is used as the feature extractor. For each training input MR image, we apply the DWT to extract features at dyadic scales and positions. The basic elements of the DWT are as follows. Let $x(t)$ be a square-integrable function. The continuous wavelet transform of $x(t)$ relative to a real-valued wavelet $\psi(t)$ is given as

$$W_\psi(\alpha, \beta) = \int_{-\infty}^{\infty} x(t)\, \psi_{\alpha,\beta}(t)\, dt, \qquad \psi_{\alpha,\beta}(t) = \frac{1}{\sqrt{\alpha}}\, \psi\!\left(\frac{t - \beta}{\alpha}\right) \tag{2}$$

where $W_\psi(\alpha, \beta)$ represents the wavelet transform, $\alpha$ represents the dilation factor, and $\beta$ represents the translation parameter. Eq. (3) shows the discrete variation of Eq. (2), which is obtained by restricting $\alpha$ and $\beta$ to a discrete lattice ($\alpha = 2^{j}$ and $\beta = 2^{j} k$).

$$a_{j,k}(n) = DS\left[\sum_{n} x(n)\, G_j^{*}(n - 2^{j} k)\right], \qquad d_{j,k}(n) = DS\left[\sum_{n} x(n)\, H_j^{*}(n - 2^{j} k)\right] \tag{3}$$

where $a_{j,k}(n)$ and $d_{j,k}(n)$ represent the coefficients of the approximation and detail components, respectively, and $DS$ denotes downsampling. The low-pass filter is represented by $G(n)$ and the high-pass filter by $H(n)$. The wavelet scale factor is represented by $j$ and the wavelet translation factor by $k$. Figs. 1 and 2 show the 2D-DWT as it is applied in each dimension to the training input image. Here, we have taken a sample pathological brain image and applied a three-level wavelet decomposition. The sub-band LL1 ($a_j$) holds the approximation components and is further decomposed by the 2D-DWT. The LH1 ($d_j^{h}$), HL1 ($d_j^{v}$), and HH1 ($d_j^{d}$) sub-bands hold the detail components in the horizontal, vertical, and diagonal directions, respectively. Different types of wavelet transforms have gained popularity in wavelet analysis, among which the Haar wavelet has been used regularly in various applications [El-Dahshan, Hosny and Salem (2010)]. The Haar wavelet performs well in noisy conditions and is both orthogonal and symmetric. It extracts the basic components present in the image with high performance. In the proposed work, the approximation coefficients of the level-3 Haar wavelet decomposition of the image are computed. These coefficients are used as the feature vector for the image. Algorithm 1 shows the feature extraction procedure for brain MR images.
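The level-3 Haar feature extraction described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the paper's Algorithm 1: it assumes an orthonormal Haar filter pair and image sides divisible by 8, and the function names `haar_dwt2` and `haar_features` are hypothetical. The assignment of sub-band labels to directions follows one common convention and is also an assumption.

```python
import numpy as np

def haar_dwt2(image):
    """One level of the orthonormal 2D Haar DWT; returns (LL, LH, HL, HH)."""
    a = image[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation
    lh = (a + b - c - d) / 2.0  # difference between rows (horizontal edges)
    hl = (a - b + c - d) / 2.0  # difference between columns (vertical edges)
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_features(image, levels=3):
    """Return the flattened approximation coefficients after `levels` decompositions."""
    approx = np.asarray(image, dtype=float)
    for _level in range(levels):
        approx = haar_dwt2(approx)[0]  # keep only LL and decompose it again
    return approx.ravel()

# A 256x256 image yields a 32x32 approximation band, i.e. 1024 features.
image = np.ones((256, 256))
features = haar_features(image)
```

Each level halves both image dimensions, so three levels map 256×256 pixels to 32×32 = 1024 approximation coefficients.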

Feature normalization and reduction
The features computed by Algorithm 1 have high dimensionality and require significant storage and computational power. Therefore, a feature reduction technique is required to reduce the dimensionality and extract the candidate features. PPCA reduces high-dimensional features to low-dimensional features by relating u, a P-dimensional observation vector, to v, a k-dimensional unobserved (latent) vector that is normalized to zero mean and unit variance. Algorithm 2 shows the normalization with PPCA. According to Algorithm 1, a normalized feature matrix (FM) of size L×Mi and a reduced FM of size L×R are obtained after applying the PPCA. The reduced FM is smaller than the normalized FM. PPCA ignores the class labels of the data, so the reduction is unsupervised. To make use of the class labels, LDA is introduced. LDA is a supervised approach that separates the classes by maximizing the differences between classes relative to the similarity within each class. Conventional LDA is not suitable for problems with high-dimensional features and small sample sizes. In these scenarios, the within-class scatter matrix (SW) becomes singular. To overcome this limitation of LDA, PPCA+LDA is used in the proposed model: the P-dimensional data is first reduced to k dimensions using PPCA, and the k-dimensional data is then reduced using LDA.
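The PPCA followed by LDA reduction can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's Algorithm 2: it uses the closed-form maximum-likelihood PPCA solution, a two-class Fisher discriminant, and synthetic data. The function names and the sizes (40 samples, 50 features, k = 5) are hypothetical.

```python
import numpy as np

def ppca_fit(X, k):
    """Closed-form maximum-likelihood PPCA: returns mean, loading matrix W, noise variance."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[k:].mean()                   # average of the discarded eigenvalues
    W = eigvecs[:, :k] * np.sqrt(np.maximum(eigvals[:k] - sigma2, 0.0))
    return mean, W, sigma2

def ppca_transform(X, mean, W, sigma2):
    """Posterior mean of the k-dimensional latent vector v for each observation u."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mean).T).T

def lda_direction(Z, y):
    """Two-class Fisher discriminant direction: Sw^{-1} (m1 - m0)."""
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)  # within-class scatter
    return np.linalg.solve(Sw, m1 - m0)

# Synthetic data: fewer samples (40) than features (50), so Sw would be
# singular in the original space but is invertible after PPCA (k = 5).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 50)) + 2.0 * y[:, None]

mean, W, sigma2 = ppca_fit(X, k=5)
Z = ppca_transform(X, mean, W, sigma2)
scores = Z @ lda_direction(Z, y)
```

Running LDA on the k-dimensional PPCA output rather than on the raw features is what avoids the singular within-class scatter matrix when samples are scarce.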

Classification using the FNN and MPSOACO
In this section, we discuss the preliminaries of feed-forward neural networks (FNNs), MPSO, and ACO, after which we describe the proposed FNN and MPSOACO algorithm in detail.

Feed-forward neural network (FNN)
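Since 2000, the FNN has been a well-known pattern recognition classifier and has been widely used by many researchers. The training dataset is given as input to the FNN, which is trained in batch mode [Zhang, Wang, Ji et al. (2013)]. The configuration of the network is $H_{ip} \times H_{hl} \times H_{op}$: a two-layer neural network with input layer $H_{ip}$, hidden layer $H_{hl}$, and output layer $H_{op}$ that identifies whether the brain is normal or pathological. Let $\omega_1$ and $\omega_2$ be the weight matrices between $H_{ip}$ and $H_{hl}$ and between $H_{hl}$ and $H_{op}$, respectively. The following steps are then used to update the weight values when training on the dataset [Monochehri and Kolahan (2014)].
Step 1: The outputs of the hidden layer are computed as

$$A_j = f_{hl}\left(\sum_{i=1}^{H_{ip}} \omega_1(i, j)\, c_i\right), \quad j = 1, \dots, H_{hl} \tag{4}$$

where $c_i$ represents the $i$-th input value in the network, $A_j$ represents the hidden layer output, and $f_{hl}$ represents the $H_{hl}$ activation function, which is the sigmoid function shown in Eq. (5).

$$f_{hl}(x) = \frac{1}{1 + e^{-x}} \tag{5}$$

Step 2: The outputs of the output layer are computed as

$$O_k = f_{op}\left(\sum_{j=1}^{H_{hl}} \omega_2(j, k)\, A_j\right), \quad k = 1, \dots, H_{op} \tag{6}$$

where $O_k$ is the $k$-th network output, $f_{op}$ denotes the $H_{op}$ activation function, and the values of the weights are assigned randomly.
Step 3: The error for each sample is expressed as the mean squared error between the outputs and the target values

$$E_l = \frac{1}{H_{op}} \sum_{k=1}^{H_{op}} \left(O_k - T_k\right)^{2}, \quad l = 1, \dots, H_s \tag{7}$$

where $T_k$ denotes the $k$-th authentic (target) value and $H_s$ denotes the number of samples.
Step 4: The fitness function for the $H_s$ samples is given as

$$F(\omega) = \sum_{l=1}^{H_s} E_l \tag{8}$$

where $\omega$ denotes the vector of $(\omega_1, \omega_2)$.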
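The four steps above (hidden layer output, network output, per-sample error, and summed fitness) can be sketched as a single fitness function of the flattened weight vector, which is the quantity an optimizer such as MPSOACO would minimize. This is a minimal NumPy sketch under assumptions: the output activation is taken to be a sigmoid and the per-sample error a mean squared error, neither of which the text fixes, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fnn_forward(inputs, w1, w2):
    """Forward pass: inputs (samples x n_in), w1 (n_in x n_hidden), w2 (n_hidden x n_out)."""
    hidden = sigmoid(inputs @ w1)   # Step 1: hidden layer outputs
    return sigmoid(hidden @ w2)     # Step 2: network outputs (sigmoid assumed)

def fitness(weights, inputs, targets, n_in, n_hidden, n_out):
    """Steps 3 and 4: sum over samples of the per-sample mean squared error."""
    split = n_in * n_hidden
    w1 = weights[:split].reshape(n_in, n_hidden)
    w2 = weights[split:].reshape(n_hidden, n_out)
    outputs = fnn_forward(inputs, w1, w2)
    per_sample_error = np.mean((outputs - targets) ** 2, axis=1)
    return float(per_sample_error.sum())

# Toy example: 6 samples, 3 inputs, 4 hidden units, 1 output (normal = 0, pathological = 1).
n_in, n_hidden, n_out = 3, 4, 1
inputs = np.arange(18, dtype=float).reshape(6, 3)
targets = np.array([[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]])
zero_weights = np.zeros(n_in * n_hidden + n_hidden * n_out)
value = fitness(zero_weights, inputs, targets, n_in, n_hidden, n_out)
```

With all weights at zero every output is 0.5, so each sample contributes an error of 0.25 and the fitness over six samples is 1.5; training seeks weights that drive this value toward zero.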

Modified particle swarm optimization (MPSO)
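PSO is an efficient optimization algorithm in which a group of particles searches the solution space through an iterative update procedure. To find the optimal solution, every particle moves in the direction of its own previous best position, the local best ($P_{best}$), and the best position found by the group, the global best ($g_{best}$) [Zhang, Wang, Ji et al. (2014)].

$$P_{best}(i, v) = \arg\min_{k = 1, \dots, v} f\left(P_i(k)\right), \quad i \in \{1, \dots, TP\} \tag{9}$$

$$g_{best}(v) = \arg\min_{\substack{i = 1, \dots, TP \\ k = 1, \dots, v}} f\left(P_i(k)\right) \tag{10}$$

where $TP$ represents the total number of particles in the swarm, $v$ denotes the current iteration, $i$ denotes the particle index, $f$ represents the fitness function, and $P$ denotes the particle position. Eq. (11) is used to update the position $P$ and velocity $V$ of the particles.

$$V_i(v+1) = V_i(v) + a_1 r_1 \left(P_{best}(i, v) - P_i(v)\right) + a_2 r_2 \left(g_{best}(v) - P_i(v)\right), \qquad P_i(v+1) = P_i(v) + V_i(v+1) \tag{11}$$

where $a_1$ and $a_2$ are the acceleration coefficients, and $r_1$ and $r_2$ are random variables that lie between 0 and 1. The proposed modified PSO (MPSO) is obtained by adding the inertia $\omega$ to Eq. (11):

$$V_i(v+1) = \omega V_i(v) + a_1 r_1 \left(P_{best}(i, v) - P_i(v)\right) + a_2 r_2 \left(g_{best}(v) - P_i(v)\right) \tag{12}$$

where the inertia $\omega$ is the weight factor that balances the local search and the global search.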
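The inertia-weighted velocity and position update can be sketched as a small NumPy optimizer. This is an illustrative sketch, not the authors' implementation: the parameter values (inertia 0.7, acceleration coefficients 1.5, 20 particles, 200 iterations), the initial range, and the sphere test function are assumptions chosen for the example, and `mpso_minimize` is a hypothetical name.

```python
import numpy as np

def mpso_minimize(f, dim, n_particles=20, n_iter=200, inertia=0.7, a1=1.5, a2=1.5, seed=0):
    """Minimize f with PSO using an inertia weight on the previous velocity."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                  # best position of each particle
    pbest_val = np.array([f(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]    # best position of the swarm
    for _iteration in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Inertia term plus attraction toward the personal and global bests.
        vel = inertia * vel + a1 * r1 * (pbest - pos) + a2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([f(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, float(gbest_val)

# Example: minimize the 5-dimensional sphere function, whose minimum is 0 at the origin.
def sphere(p):
    return float(np.sum(p ** 2))

best_pos, best_val = mpso_minimize(sphere, dim=5)
```

In the proposed classifier, the function being minimized would be the FNN fitness over the network weights rather than the sphere function used here.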

Ant colony optimization mechanism (ACO)
The ACO algorithm proposed by Khorram et al. [Khorram and Yazdi (2019)] has proven to be effective in solving many optimization problems. In the ACO algorithm, ants simultaneously search for paths to the food. The ants choose the paths based on the quantity of pheromone left by the ants in the various paths. The probability of selecting a path depends on the amount of pheromone that has accumulated on a particular path. Eq. (14) shows the probability computation for selecting a path.
where $\chi_{ij}$ represents the pheromone intensity of the $i$-th ant on the $j$-th pathway, $K$ is used to identify whether the $j$-th path needs to be selected or not, and $T_{ij}$ is the probability that the $i$-th ant selects the $j$-th path based on its pheromone intensity. The fitness function for the feature subset generated when the ant reaches the food is evaluated using Eq. (15).
where $AC$ denotes the feature subset accuracy, $n$ denotes the number of ants in the feature subset, and $\lambda$ denotes the weight factor. After finishing one cycle, the pheromone values of all paths are updated. Eq. (16) shows the pheromone update mechanism.
where $\Delta\chi_{ij}$ is the incremental value of the pheromone update and $\rho$ denotes the evaporation rate of the pheromone trail. $\Delta\chi_{ij}$ is further explained as follows: in Eq. (17), $S$ denotes the set of paths and $b$ denotes the control parameter used to regulate the pheromone quantity.
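The path selection and pheromone update described above can be illustrated with a minimal NumPy sketch. It assumes the common ACO forms, namely a selection probability proportional to the pheromone over the allowed paths and an evaporation step followed by a deposit proportional to the fitness; these may differ in detail from Eqs. (14) to (17). The function names and the values of `rho` and `b` are hypothetical.

```python
import numpy as np

def selection_probabilities(pheromone, allowed):
    """Probability of choosing each path, proportional to pheromone on allowed paths."""
    weights = np.where(allowed, pheromone, 0.0)
    return weights / weights.sum()

def update_pheromone(pheromone, visited, fitness, rho=0.1, b=1.0):
    """Evaporate a fraction rho of the pheromone, then deposit b * fitness on visited paths."""
    deposit = np.where(visited, b * fitness, 0.0)
    return (1.0 - rho) * pheromone + deposit

# Three candidate paths; the third one is not allowed for this ant.
pheromone = np.array([1.0, 2.0, 1.0])
allowed = np.array([True, True, False])
probs = selection_probabilities(pheromone, allowed)

# The ant travelled the first path and its feature subset scored a fitness of 0.5.
visited = np.array([True, False, False])
updated = update_pheromone(pheromone, visited, fitness=0.5)
```

Paths that lead to fitter feature subsets accumulate pheromone and are chosen more often in later cycles, while evaporation gradually removes the influence of old choices.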

Proposed method
The proposed method is composed of four modules: pre-processing, feature extraction, feature normalization and reduction, and classification. Fig. 3 shows the block diagram of the proposed pathological brain detection model (PBDM). Pre-processing is performed using the Wiener filter, feature extraction is conducted using the 2D-DWT, feature normalization and reduction are performed by PPCA and LDA, respectively, and classification is performed using the FNN and MPSOACO. Algorithm 3 gives the detailed steps of the proposed PBD model.
  Apply three-level 2D-DWT and create a set of feature vectors with dimension P.
end for
for j in 1 to N do
  Execute PPCA and LDA transformation for obtaining wavelet coefficients.
end for
Perform cross-validation on the generated dataset and generate the training data, validation data, and testing data.
Train the FNN using the MPSOACO algorithm and select the optimal weights at the input layer and hidden layer.
Calculate the output layer weights using the optimal weights at the input layer and hidden layer.
Measure the performance of the classifier on the testing dataset.

Online learning:
The user submits the query image to the system.
Apply the Wiener filter to the image to perform pre-processing.
Apply three-level 2D-DWT and create a set of feature vectors with dimension P.
Obtain a reduced feature set by multiplying the wavelet coefficient by the feature vector coefficients.
Input the reduced feature set to the FNN classifier that is trained by MPSOACO and determine whether the image is normal or pathological.
End

Experimental analyses
The proposed model was simulated using MATLAB 9.5 (R2018b) on a PC with a 3.7 GHz processor, 12 GB of RAM, and the Windows 10 operating system. This section presents the performance evaluation of the proposed model and compares it with other existing systems.

Feature illustration and reduction results
As an initial step, we applied the Wiener filter to improve the contrast of the images. Then, we applied the three-level 2D-DWT (Algorithm 1) to divide the image into 10 sub-bands, as shown in Fig. 3. This mechanism produces 32×32=1024 feature coefficients.
The top-left corner of the three-level 2D-DWT image contains the approximation coefficients. The images are of size 256×256=65536 pixels, which is large. We applied PPCA+LDA to Dataset-66, Dataset-160, and Dataset-255. Each image in the datasets is rearranged as a row vector, and the row vectors form a two-dimensional matrix. Algorithm 2 shows the normalization mechanism using PPCA+LDA. The three-level 2D-DWT reduces the number of features from 65536 to 1024. Fig. 6 shows the cumulative variance with respect to the number of principal components (features). With the threshold fixed at 0.95, PCA requires 13 features, whereas PCA+LDA and PPCA+LDA require only 3 features. Therefore, PPCA+LDA is selected as a suitable mechanism for identifying the significant components.
Figure 5: Sample ground truth brain MR images
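The selection of the number of components from a cumulative variance threshold, as used above with a threshold of 0.95, can be sketched as follows. This is a minimal NumPy illustration; the eigenvalues are made-up numbers chosen for the example, not values from the paper's datasets, and `components_for_threshold` is a hypothetical name.

```python
import numpy as np

def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative variance ratio reaches threshold."""
    values = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(values / values.sum())
    # Index of the first cumulative ratio >= threshold, converted to a count.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, len(values))

# Illustrative spectrum: cumulative ratios are 0.60, 0.90, 0.96, 0.99, 1.00.
eigenvalues = [6.0, 3.0, 0.6, 0.3, 0.1]
n_components = components_for_threshold(eigenvalues, threshold=0.95)
```

For this illustrative spectrum the first two components explain 90% of the variance and the third brings the total to 96%, so three components are kept.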

Performance evaluation of the FNN-MPSOACO classifier
The performance of the FNN-MPSOACO classifier was tested with different numbers of features. The classification accuracy of the proposed method was also compared with that of the existing KNN, BPNN, SVM, and extreme learning machine (ELM) classifiers. The proposed classifier achieved an accuracy of 100% in and 98.95% in Dataset-255, as illustrated in Fig. 7.
The classification accuracy of the proposed method is compared with the existing methods such as DWT+SVM [Mehrotra, Ansari, and Agrawal (2020)

Computing time analysis
Tab. 4 shows the analysis of the computing time for each step in the DWT+PPCA+LDA+FNN+MPSOACO method. We considered offline learning and online prediction to measure the computing time of the proposed method. Offline learning is the process of finding a pathological brain with an available dataset, whereas online prediction uses real-time data. For the offline mechanism, the Wiener filter takes 0.252 s, DWT takes 0.685 s, PPCA takes 0.243 s, LDA takes 0.312 s, and FNN-MPSOACO takes 202.745 s. For the online mechanism, the Wiener filter takes 0.002 s, DWT takes 0.003 s, PC takes 0.003 s, and prediction takes 0.001 s.

Conclusion
This paper proposed the DWT+PPCA+LDA+FNN-MPSOACO image classification model for identifying pathological brains in MR images. The proposed model achieved the highest classification accuracy of 99.72% compared with other existing algorithms.
As an initial step, the proposed method uses the Wiener filter and the DWT to extract the features from the brain MR images. PPCA+LDA is used to perform the normalization and feature reduction. The FNN-MPSOACO algorithm is used to classify the MR images into normal and pathological brains. In the future, the proposed method could be extended to evaluate other images, such as CT scans, PET, and MRSI, and the proposed algorithm could be improved by incorporating deep learning.