A framework for efficient brain tumor classification using MRI images

A brain tumor is an abnormal growth of brain cells inside the head that reduces the patient's chance of survival if it is not diagnosed at an early stage. Brain tumors vary in size, type, and shape, and different patients require distinct therapies. Manual diagnosis of brain tumors is inefficient, error-prone, and time-consuming. It is also a strenuous task that depends on the radiologist's experience and proficiency. Therefore, a modern, efficient, automated computer-assisted diagnosis (CAD) system that addresses these problems with high accuracy is needed. Aiming to enhance performance and minimise human effort, in this manuscript the brain MRI image is first pre-processed to improve its visual quality, and the number of sample images is increased to avoid over-fitting in the network. Second, the tumor proposals or locations are obtained using an agglomerative clustering-based method. Third, the image proposals and the enhanced input image are passed to the backbone architecture for feature extraction. Fourth, high-quality image proposals or locations are obtained from a refinement network, and the others are discarded. Next, these refined proposals are aligned to the same size and, finally, passed to the head network to perform the desired classification task. The proposed method is a potent tumor grading tool assessed on a publicly available brain tumor dataset. Extensive


Introduction
With rapid economic development, people's living standards are continuously improving. Most importantly, healthcare systems are being strengthened, and health awareness campaigns in societies are gradually increasing. Over recent years, the role of technology in healthcare has expanded rapidly, and health information technology has been promoted [1][2][3][4][5]. This technological advancement has shaped the future of healthcare and improved public health. For example, computer vision, with image processing as its key component, has been successfully applied in medical imaging to improve patient care. From a clinical perspective, technologies such as image enhancement, image segmentation, object detection and image classification have attracted considerable attention and are mainly applied to diagnose diseases and set up early treatment plans [6][7][8][9]. For example, enhancing the visual quality of medical images improves disease diagnosis [10][11][12]. Segmentation of medical images helps to extract the region of interest, such as body organs or tissues, while performing detection and classification tasks, e.g., brain tumor detection [13][14][15]. Detection helps to find objects of interest, such as a brain tumor, and determine their location, which supports patients' treatment [16][17][18]. Another essential task is the classification of medical images. It allows doctors or radiologists to classify objects such as brain tumors into different categories, i.e., Meningioma, Glioma, and Pituitary, to improve disease diagnosis [19][20][21][22]. Early diagnosis of any disease plays an important role in a patient's treatment planning. Medical imaging tasks, especially enhancement, correct segmentation of tissues or body regions, and their early detection and classification, can provide effective help for the clinical diagnosis of different diseases.
Hence, accurate disease diagnosis and treatment planning depend on improved enhancement, segmentation, detection and classification methods. However, performing these tasks on medical images remains a real challenge for researchers because of low contrast, noise, and other imaging ambiguities.
Medical imaging technologies such as magnetic resonance imaging (MRI), computerized tomography (CT), ultrasound imaging (UI) and X-rays have been successfully adopted to view, analyze, diagnose, monitor and treat diseases in the human body. These technologies help medical practitioners obtain more information about different areas of the human body. This information helps them study and treat a particular disease or injury and assess the effect of existing medical treatment. Among these technologies, radiologists prefer MRI because it provides rich microscopic chemical and physical information about human anatomy at the molecular level. Compared to other technologies, MRI is more useful in disease detection and classification because of its high resolution. MRI uses a strong magnetic field and non-ionizing radiation to view various organs and tissues. MRI scanning produces three-dimensional (3D) images of the body's soft tissues, such as organs and muscles, which cannot be visualized in images obtained using X-rays. Therefore, in medical applications, MRI images are typically used for brain tumor classification. The brain is one of the essential organs and controls multiple complex functions in the human body. MRI has been successfully employed to identify a variety of brain-related diseases, particularly tumors. Early identification of tumors from brain MRI images has recently gained significant importance and is considered a lifesaver for brain tumor patients. Detecting a tumor is crucial, but it is equally important to know its type in order to increase the patient survival rate and suggest proper treatment. Brain MRI images can be classified as normal or abnormal, or by tumor type.
Over the years, numerous methods have been proposed for efficient brain tumor classification using very-high-resolution brain MRI images with reasonable contrast. Existing studies offer insights into the analysis of brain MRI imagery. Kaplan et al. [23] proposed a method to classify three types of tumors, namely Meningioma, Glioma, and Pituitary tumors, using modified local binary pattern (LBP) feature extraction from brain MRI images. This method achieves good accuracy compared to others and could be a good choice for radiologists as long as enhanced-quality images are used.
In [24], the authors proposed a model that classifies tumor presence into known and unknown classes using a convolutional network. Their model fuses the MRI sequences using the discrete wavelet transform (DWT) to obtain a better-quality image for the classification task. The method handles unwanted noise in the images; however, its performance may degrade when low-contrast images are used. A brain tumor classification model based on a deep neural network was introduced in [25]. The model adopted pre-training within a generative adversarial network (GAN) for improved classification. Extensive results show that the model achieved good performance on contrast-enhanced images and can reliably determine tumor types. However, its efficiency could decrease with low-quality images. Besides, Hassan et al. [26] developed a model to tell whether a brain MRI image contains a tumor. For this purpose, they built a CNN-based model using data augmentation, which achieved good performance with transfer learning. The model shows promising results, but its performance might not be optimal, particularly for noisy and low-contrast images. These findings indicate that most methods have used high-quality MRI images with standard contrast to achieve robust classification performance.
MRI images can be of low quality, noisy, and contain undesired artifacts, leading to inaccurate diagnosis results. Classifying abnormal tissues effectively from low-contrast, poor-quality MRI images therefore remains a challenging task for researchers. Such images are a major obstacle to early diagnosis because of their poor visibility, and poor visual representation reduces the effective treatment options for tumors or any other illness. This creates a strong need to enhance MRI images before classifying them. For this purpose, numerous approaches have been developed to improve the quality of MRI images. The enhanced images can then be used for the most common medical image processing tasks, such as segmentation, detection, and classification, to diagnose diseases with improved performance. Monika et al. [27] proposed a model to improve the contrast of brain MRI images for tumor detection. The authors claimed that their model achieved a better visual representation and assumed that their enhanced results would be sufficient for tumor detection. However, they did not perform experiments to prove it, and the model is computationally expensive.
Similarly, in [28], the authors presented a model for contrast enhancement of brain MRI images based on dynamic histogram equalization. The model achieved good accuracy in improving image quality. However, its efficacy was not checked for other medical image analysis tasks, and it is computationally expensive. Subramani et al. [29] introduced a model for improving the visual quality of MRI images using exposure-based contrast limited bi-histogram equalization. Their model preserved the images well, yielding better contrast and brightness while removing noise. Its major limitation is computational cost. The authors in [30] presented a hybrid model for brain MRI image enhancement and classification. The model yields better accuracy. However, it only works for smaller datasets, has a high computational cost, and needs to be retrained whenever images are added to the dataset. Upendra et al. [31] proposed a model for improving the visual quality of brain MRI images using particle swarm-optimized and texture-based histogram equalization techniques. The model achieved a significant improvement in visual quality. The authors claimed that it would be sufficient for brain tumor detection and segmentation, but this has not yet been verified. In summary, several state-of-the-art methods have been developed to address low image quality, each with its own advantages and disadvantages. Few of them are reported to achieve robust performance in improving contrast and texture details while reducing noise. Besides, these methods tend to over-enhance the texture details of the images.
Based on the existing literature and to the best of the authors' knowledge, brain tumor classification from poor-quality MRI images is still in its infancy in terms of experimental results and needs to be addressed properly. To this end, this research work introduces a new technique that classifies tumors using brain MRI images. Hence, this research effort is an advancement over prior methods. Moreover, the proposed method classifies the tumors in brain MRI images efficiently and accurately, which distinguishes this work from earlier research.
The main contributions of this research are summarized as follows:
- A new automated method is proposed that can replace conventional invasive brain tumor classification and enhance the overall classification accuracy.
- An efficient strategy is employed to enhance the low visual quality of MRI images.
- A data augmentation technique is used to achieve high classification accuracy on a small dataset, and the impact of over-fitting on classification performance is studied.
- An efficient and simpler object (tumor) localization method is developed. It obtains the initial locations by computing multiple hierarchical segmentations using superpixels and then ranks the locations according to a region score, defined as the number of contours wholly enclosed in the located region. Only the top object locations are passed to the next task.
- A deep neural network (EfficientNet) is employed for rich feature extraction.
- A comparison of the proposed method with existing state-of-the-art approaches for brain tumor classification is presented.
The proposed method achieved an excellent classification accuracy compared to traditional methods. The rest of the paper is organized as follows: Section 2 reviews the related background research, Section 3 describes the proposed technique, Section 4 discusses the extensive results and their analysis, and Section 5 offers the summary and limitations along with future work.

Related works
Recently, a substantial amount of effort has been put into developing techniques for classifying brain tumors, including Meningioma, Glioma, and Pituitary tumors. The classification of brain tumors is one of the most important and challenging tasks in medical image processing. MRI is one of the most promising imaging techniques for brain tumor classification and has many noteworthy advantages in the medical field. Most notably, an automated classifier gives radiologists a second opinion, which helps them determine the intensity, diameter, position, and type of the tumor easily and quickly. Furthermore, early and accurate detection and classification of tumors helps in treatment planning. Earlier research has invested considerable effort in image-based brain tumor classification. However, most of the reported methods use high-quality MRI images with suitable contrast to classify tumors robustly. Therefore, one of the most significant aims of this research work is to classify tumors from low-quality MRI images. Existing approaches that classify brain tumors using a similar brain tumor dataset are reviewed as follows. Cheng et al. [32] proposed a model for brain tumor classification based on an augmented tumor region. The model takes the augmented region, obtained through image dilation, as the region of interest (ROI) to determine the tumor type. Their model splits the augmented region into sub-regions using an adaptive spatial division technique to compensate for the loss of spatial information. Three feature extraction methods, i.e., bag-of-words (BOW), intensity histogram, and grey level co-occurrence matrix (GLCM), are adopted to evaluate the model's performance.
The experimental results show that the model surpassed prior approaches when using augmented regions as the ROI. The model yields an accuracy of 91.28%, which shows its effectiveness and robustness. Similarly, aiming to improve classification accuracy, Ismael et al. [33] proposed a model based on statistical features combined with a back-propagation neural network for brain tumor classification. The method obtains the ROI, i.e., the tumor segments, using segmentation techniques or by manual identification based on radiologist suggestions. The model combines the two-dimensional (2D) DWT and Gabor filter methods to obtain high-quality statistical features that boost classification performance. Moreover, the impact of the selected features was tested using a back-propagation neural network. Their model achieves an accuracy of 91.9%, which validates its robustness and effectiveness for brain tumor classification. Furthermore, Tahir et al. [34] presented a model that improves classification performance through a combined pre-processing pipeline. Unlike previous approaches, their model groups different pre-processing techniques into three categories: edge detection, noise removal and contrast enhancement. Different combinations of these techniques are then formed using different image sets and passed to the classification pipeline. The model achieves an accuracy of 86% and shows that combining pre-processing techniques improves classification accuracy compared to using a single pre-processing technique.
Furthermore, Paul et al. [35] offered a deep learning-based model for brain tumor classification. Their model used a convolutional neural network (CNN) to enhance classification accuracy and obtained a 5-fold cross-validation accuracy of 90.26% on brain tumor imaging. Besides, their results suggested that reducing the image size can improve training performance and help doctors in the patient's treatment process. Moreover, Afshar et al. [36] built a capsule network (CapsNet) model for efficient brain tumor classification. The proposed network improves classification accuracy by utilizing the spatial relations between the tumor and its surrounding tissues, which previous CNN-based classification models fail to capture. The model accesses the tissues surrounding the tumor and takes them as an extra input. This model outperforms other competitors [34,35,40,41] and yields 86.56 and 72.13% accuracy with and without segmentation, respectively. Besides, [37] introduced a modified CapsNet for brain tumor classification that also overcomes CNN shortcomings. Unlike a CNN, their model handles input transformations such as rotation and affine transformation robustly and does not require a large amount of training data. This model achieved a classification accuracy of 90.89% and outperformed other competitors. For the same purpose, Zhou et al. [38] sought to improve classification accuracy with a holistic approach. The method used a dense convolutional neural network (DenseNet) to extract features from axial slices and classified them using a recurrent neural network (RNN) to determine tumor categories. Their method works well without manual or automatic segmentation of regions. The high accuracy of 92.13% demonstrates the efficacy of their model.
Likewise, a CNN-based model for brain tumor classification was developed in [39]. This method extracts features using a CNN and then classifies the images using a kernel extreme learning machine (KELM) network. This joint CNN and KELM mechanism achieves a promising accuracy of 93.68% compared to other conventional machine learning classifiers such as the radial basis function neural network (RBFNN), k-nearest neighbour (KNN), support vector machine (SVM), and so on. Furthermore, Abiwinanda et al. [40] built a CNN model for brain tumor classification. They designed seven different CNN variants without segmentation. The second variant achieves the highest training and testing accuracies of 98.51 and 84.19%, respectively, compared to previous models. Another multi-class model for brain tumor classification based on a deep neural network was presented in [25]. The model extracts features and learns the structure of images by pre-training a neural network as the discriminator in a generative adversarial network (GAN) with data augmentation techniques. The augmentation techniques prevent overtraining of the network. To distinguish the tumor classes, the network's fully connected layers are replaced, and the model is trained to work as a classifier. The model was evaluated using 5-fold cross-validation and achieved accuracies of 95.6 and 93.01% on random and introduced splits, respectively. Moreover, Guo et al. [41] generalized the CNN to the graph domain. They proposed a graph CNN model for Alzheimer's disease prediction using positron emission tomography (PET). Their model is computationally inexpensive and produces robust results on the Alzheimer's disease neuroimaging initiative (ADNI) dataset compared to other state-of-the-art models.
The model achieved 93% accuracy for the two-class classification problem and 77% for the three-class classification problem. Furthermore, to improve tumor classification efficiency, another deep CNN with various layers was reported for brain tumor classification in [42]. The efficacy of the model was checked on three datasets. It achieved convincing performance and required less pre-processing than prior approaches. Similarly, Deepak et al. [43] improved three-class brain tumor classification accuracy using transfer learning. Their model recorded a classification accuracy of 98% and surpassed other existing approaches using a small number of training examples. The authors also reported an analysis of misclassifications.
Based on the aforementioned findings, the authors believe that the proposed method will offer additional insight into brain tumor classification performance on brain MRI images. The proposed work differs from current research in both its approach and its results.

Proposed methodology
In this manuscript, an efficient model for medical image analysis is developed for disease diagnosis, particularly brain tumor detection and classification, using low-quality MRI images. Radiologists are likely to benefit from this applied research study: it gives them a second opinion that supports them in identifying the intensity, diameter, position, and type of a tumor. Early diagnosis of brain tumors allows experts to set up better treatment plans and achieve healthier results for the patient. The framework of the proposed brain tumor classifier can be seen in Figure 1. All three MRI views per patient are fed independently to the network.

Enhancement
In medical image analysis, a disease such as a brain tumor needs to be diagnosed by doctors so that treatment can begin early. However, poor visibility in images is a significant obstacle to efficient diagnosis. Poor visibility originates with the medical equipment: the images produced are complex and reach doctors with varying visual appearance, e.g., high or low intensities and non-uniform, underexposed, overexposed or noisy regions. Poor image quality significantly lowers the performance of the diagnosis process. This creates a clear need to improve the contrast and preserve the brightness of low-quality MRI images before performing subsequent tasks such as detection, classification, and segmentation. Extracting hidden and useful structural information from images with poor visual quality is quite challenging. Numerous contrast improvement algorithms have been developed and applied in machine learning-based tasks, because better visualization enables machine learning algorithms to extract more valuable features from images. In this paper, we provide a clearer input image to our network to achieve improved classification performance. To optimize the contrast of poor-quality MRI images, an optimal contrast enhancement strategy is developed. In addition, a non-stretching mechanism is adopted to further boost the textural information in the MRI images. The details are as follows:

Optimal contrast
To improve the visibility of poor-contrast MRI images, an optimal contrast strategy is utilized. According to this strategy, a modified image is obtained from a reference image and the original image. Both the original image $I_0$ and the reference image $R_0$ contain useful details, and the purpose is to achieve the desired balance between $I_0$ and $R_0$. To achieve this balance, the model in Eq (1) is used.
The minimization form of Eq (1) can be written as in Eq (2). In Eq (2), the first two terms express the modified image as a weighted average of $I_0$ and $R_0$. Besides, a weighted gradient strategy is applied to the objective function to prevent unnatural effects. This avoids over-enhancement by smoothing the enhanced image to prevent abrupt changes at edges without reducing the overall contrast. Therefore, a penalty term for the smoothing process is added. All terms in Eq (2) are quadratic, so the problem has a closed form that can be solved directly. To speed up this operation, the Fast Fourier Transform (FFT) is used: the enhanced image $E$ in Eq (3) is evaluated by finding its Fourier coefficients and then applying an inverse transformation.
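Because Eq (1) to Eq (3) are not reproduced in the text above, the following is only a minimal sketch of how such a closed-form FFT solution might look. It assumes a quadratic objective of the form ||E - I_0||^2 + lam * ||E - R_0||^2 + gamma * ||grad E||^2 with periodic boundary conditions; the function name `optimal_contrast`, the weights `lam` and `gamma`, and the forward-difference gradient are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np


def optimal_contrast(i0, r0, lam=1.0, gamma=0.5):
    """Closed-form minimiser of an ASSUMED objective (hypothetical weights):
        ||E - I0||^2 + lam * ||E - R0||^2 + gamma * ||grad E||^2
    solved in the Fourier domain under periodic boundary conditions.
    Images must be 2-D arrays of identical shape, at least 2x2.
    """
    i0 = np.asarray(i0, dtype=np.float64)
    r0 = np.asarray(r0, dtype=np.float64)
    h, w = i0.shape

    # Forward-difference kernels embedded in full-size arrays.
    dx = np.zeros((h, w))
    dx[0, 0], dx[0, 1] = -1.0, 1.0
    dy = np.zeros((h, w))
    dy[0, 0], dy[1, 0] = -1.0, 1.0

    # |F(dx)|^2 + |F(dy)|^2 is the Fourier symbol of the smoothing penalty.
    grad_power = np.abs(np.fft.fft2(dx)) ** 2 + np.abs(np.fft.fft2(dy)) ** 2

    # Normal equations: (1 + lam + gamma * D^T D) E = I0 + lam * R0.
    numerator = np.fft.fft2(i0) + lam * np.fft.fft2(r0)
    denominator = (1.0 + lam) + gamma * grad_power
    return np.real(np.fft.ifft2(numerator / denominator))
```

With `gamma = 0` the result reduces to the weighted average of the two images, which matches the role the text assigns to the first two terms; increasing `gamma` smooths the output more strongly.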

Proposals generation
Once the image visibility is improved, the next significant step is to generate a small set of high-quality, class-independent image proposals (regions or locations) where the tumor is positioned in the image. A small set of proposals can significantly improve object classification performance. However, previous approaches were still inadequate for generating few, high-quality proposals. For this reason, we initially segment the enhanced brain MRI images to obtain the set of initial regions using the methods presented in [44][45][46]. Segmentation improves detection performance, and because regions contain richer information than individual pixels, a region-based approach is preferred.
First, the similarities among neighbouring regions are calculated, and the most similar neighbours are grouped into one region. Then the similarities between the newly combined region and its neighbours are calculated, and the most similar regions are again combined. This grouping is repeated until all regions are merged into a single region covering the whole image. We aim to obtain as many candidate proposals as possible at this stage, and the regions obtained during grouping are our proposals. Once the proposals are obtained, the next task is to score and rank them. To this end, the structured edge detector is used to obtain the edges from the original image. This edge detector is relatively fast and delivers good accuracy compared to its competitors. The edges are then connected based on their orientation similarity with neighbouring edges. Finally, 8-connected adjacent edges are combined into edge groups until the sum of their orientation differences exceeds pi/2.
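The iterative grouping described above can be sketched as a greedy agglomerative merge. This is a minimal illustration under stated assumptions: regions are represented by a hypothetical dictionary with a mean intensity, a pixel count and a bounding box, and similarity is taken to be the negative difference in mean intensity; the similarity measures actually used in [44][45][46] may differ.

```python
def similarity(r1, r2):
    # ASSUMED measure: closer mean intensities give higher similarity.
    return -abs(r1["mean"] - r2["mean"])


def merge(r1, r2):
    # Combine two regions: size-weighted mean and enclosing bounding box.
    size = r1["size"] + r2["size"]
    mean = (r1["mean"] * r1["size"] + r2["mean"] * r2["size"]) / size
    box = (min(r1["box"][0], r2["box"][0]), min(r1["box"][1], r2["box"][1]),
           max(r1["box"][2], r2["box"][2]), max(r1["box"][3], r2["box"][3]))
    return {"mean": mean, "size": size, "box": box}


def group_regions(regions, neighbours):
    """regions: {id: {"mean", "size", "box"}}; neighbours: iterable of (id, id).
    Returns the bounding boxes of all initial and merged regions (proposals).
    """
    regions = {k: dict(v) for k, v in regions.items()}
    pairs = {frozenset(p) for p in neighbours}
    proposals = [r["box"] for r in regions.values()]
    next_id = max(regions) + 1
    while pairs:
        # Pick the most similar pair of neighbouring regions.
        best = max(pairs, key=lambda p: similarity(*(regions[i] for i in p)))
        a, b = tuple(best)
        regions[next_id] = merge(regions.pop(a), regions.pop(b))
        proposals.append(regions[next_id]["box"])
        # Re-wire neighbours of a and b to point at the merged region.
        rewired = set()
        for p in pairs:
            q = frozenset(next_id if i in (a, b) else i for i in p)
            if len(q) == 2:
                rewired.add(q)
        pairs = rewired
        next_id += 1
    return proposals
```

Every intermediate merge contributes one proposal, so a segmentation with n connected initial regions yields 2n - 1 candidate boxes before scoring and ranking.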
Next, the affinities between neighbouring edge groups are computed from their mean positions and orientations. For computational efficiency, only affinities above a threshold of 0.05 are stored, and the rest are discarded. Based on these edge groups and their affinities, we compute the score for our proposals. A continuous value $w_b(S_i)$ is calculated for each edge group as given in Eq (5), where $t$ denotes an ordered path of edge groups of length $T$ that starts at $t_1 \in S_b$ and ends at $t_T = S_i$; the value $w_b(S_i)$ is set to 1 if no such path exists, and the affinity is taken between two consecutive edge groups on the path. Based on the values computed using Eq (5), the score function is expressed in Eq (6), where $w_b$ is the width of the box, $h_b$ is the height of the box and $k$ is a bias value for larger boxes.
Finally, the obtained proposals are ranked according to the score computed using Eq (6) and given, together with the input image, as input to our network for further refinement and for the detection and classification tasks. The proposals obtained can be seen in Figure 3.
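Since Eq (5) and Eq (6) are not reproduced in the text, the sketch below assumes they follow the well-known Edge Boxes formulation: the weight of an edge group is one minus the largest product of affinities along any path from the box border to that group, and the box score is the weighted sum of edge-group magnitudes normalised by the box perimeter raised to a bias exponent. The function names, the per-group `magnitudes` input and the default `kappa=1.5` are illustrative assumptions.

```python
import math


def group_weight(path_affinities):
    """ASSUMED form of Eq (5).
    path_affinities: list of paths, each a list of affinities between
    consecutive edge groups on a path from the box border to this group.
    Returns 1.0 when no path exists, as stated in the text.
    """
    if not path_affinities:
        return 1.0
    return 1.0 - max(math.prod(path) for path in path_affinities)


def box_score(weights, magnitudes, width, height, kappa=1.5):
    """ASSUMED form of Eq (6): weighted edge strength inside the box,
    normalised by the perimeter with bias exponent kappa for larger boxes."""
    total = sum(w * m for w, m in zip(weights, magnitudes))
    return total / (2.0 * (width + height) ** kappa)


def rank_proposals(proposals, top_k=None):
    """Sort proposals (dicts with a 'score' key) from best to worst."""
    ranked = sorted(proposals, key=lambda p: p["score"], reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

An edge group that is strongly connected to the box boundary receives a weight close to 0 and contributes little to the score, so boxes that wholly enclose contours rank higher.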

Proposals refinement
The previous step yields a small set of high-quality, class-independent proposals (tumor locations) together with their scores. These proposals can be refined further to achieve high-quality detection and classification performance. An example of refined proposals is shown in Figure 3(b). Many systems require only a few top-quality proposals. For this reason, a proposal refinement system is employed to refine the proposals obtained from the previous stage, and the refined proposals are then passed to the detection and classification tasks. We have designed our overall approach so that the proposal refinement, detection and classification parts of the system share convolutional features to achieve robust performance.
Proposals whose overlap with a ground truth box is at least 0.7 are considered positive samples. In contrast, proposals whose overlap with the ground truth lies in the interval [0.1, 0.5] are considered negative samples. For training the detection model, the top 1000 proposals were used. The model was tested on only the 10 top-quality proposals per image, whereas previous approaches required a huge number of proposals to test their model efficiency.
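The labelling rule above can be illustrated with a short sketch. It assumes that "overlap" means intersection over union (IoU) and that proposals falling outside both ranges are ignored during training; both points, along with the function names, are assumptions rather than details stated in the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposal(box, gt_boxes, pos_thr=0.7, neg_range=(0.1, 0.5)):
    """Return 1 (positive), 0 (negative) or None (ASSUMED: ignored)."""
    best = max((iou(box, g) for g in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if neg_range[0] <= best <= neg_range[1]:
        return 0
    return None
```

For example, a proposal shifted by half its width relative to the ground truth box has an IoU of 1/3 and is labelled negative, while one shifted by a fifth of its width has an IoU of 2/3 and falls in neither range.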

Backbone architecture
EfficientNets [47,48] are a family of backbone architectures introduced by a Google team in 2019. At the time of their introduction, they achieved higher classification accuracy and efficiency than previous networks. The network can be scaled up from EfficientNet-B0 to EfficientNet-B7 to obtain higher accuracy. EfficientNet-B7 reached 84.3 and 91.7% accuracy on the ImageNet and CIFAR-100 datasets, respectively. The network also improves performance significantly while reducing the number of parameters. In this manuscript, we adopt EfficientNet-B0 as our backbone network, which is scaled up to EfficientNet-B7 using a compound scaling mechanism. This network requires less computation and battery usage than its competitors.
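The compound scaling mechanism mentioned above can be sketched as follows. The base coefficients 1.2, 1.1 and 1.15 for depth, width and resolution are those reported in the EfficientNet paper; the function name is illustrative, and the mapping from a compound coefficient `phi` to a specific released variant is an assumption, since the published B1 to B7 configurations may use rounded values.

```python
# Base coefficients reported for EfficientNet (depth, width, resolution).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Multipliers applied to the B0 baseline for compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,        # more layers per stage
        "width": BETA ** phi,         # more channels per layer
        "resolution": GAMMA ** phi,   # larger input images
    }


# The coefficients satisfy ALPHA * BETA**2 * GAMMA**2 ~= 2, so each unit
# increase of phi roughly doubles the FLOPs of the network.
flops_factor = ALPHA * BETA ** 2 * GAMMA ** 2
```

Scaling all three dimensions together, rather than depth or width alone, is the design choice that lets the larger variants gain accuracy without a disproportionate increase in parameters.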
The baseline network EfficientNet-B0 comprises one convolutional layer, seven mobile inverted bottleneck (MBConv) blocks [49], one average pooling layer, and one fully connected layer. The main building block of EfficientNets is MBConv, to which a squeeze-and-excitation block is added along with swish activation. Each MBConv block has a different setting. The first MBConv block uses a single layer with a kernel of size 3×3 and 16 output channels. The second MBConv block has two layers, each with a kernel of size 3×3 and 24 output channels. The third MBConv block also has two layers, but each has a kernel of size 5×5 and 40 output channels. The fourth MBConv block has three layers with a kernel of size 3×3 and 80 output channels. The fifth MBConv block also has three layers, but each has a kernel of size 5×5 and 112 output channels. The sixth MBConv block has four layers with a kernel of size 5×5 and 192 output channels. The last MBConv block has a single layer with a kernel of size 3×3 and 320 output channels. In this paper, the EfficientNet is modified after the last MBConv block: a branch is added to refine the tumor locations (proposals) and to perform the detection and classification tasks. The model receives the proposals obtained from the first stage and the corresponding natural image as input. The input image traverses the network from the first layer to the fifteenth layer.
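The block settings listed above can be captured in a small configuration table, which makes the layer counts and channel widths easy to check. This is only a restatement of the values in the text; the `Stage` type and the helper name are illustrative.

```python
from collections import namedtuple

# One entry per MBConv block: number of layers, kernel size, output channels.
Stage = namedtuple("Stage", "layers kernel out_channels")

MBCONV_STAGES = [
    Stage(1, 3, 16),
    Stage(2, 3, 24),
    Stage(2, 5, 40),
    Stage(3, 3, 80),
    Stage(3, 5, 112),
    Stage(4, 5, 192),
    Stage(1, 3, 320),
]


def total_mbconv_layers(stages):
    """Total number of MBConv layers across all blocks."""
    return sum(s.layers for s in stages)
```

The final entry shows that the feature map entering the added refinement branch has 320 channels.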
Besides, after the last MBConv block, the multi-tasking network for refinement, detection, and classification of a tumor is introduced. The two convolutional layers with kernel sizes of 33  and 55  are added to lessen the number of channels from the previous layer from 320 to 128, and this was the starting point of our proposal refinement. Next, a rectified linear unit (RELU) layer is supplied. The ROI pooling layer is added to perform down-sampling of each initial box region to achieve a fixed feature map of size, i.e., 55  . The down-sampling divided into the input feature map into various grids of equal width and height. Next, the maximum pooling will be performed on each grid. Subsequently, another fully connected layer followed by a RELU layer will be added to only 1024 neurons. In the end, a ranking branch is a fully connected layer to recalculate the proposals score (objectness) that will be added. This ranking layer will have two output neurons, which will symbolize the likelihoods of an object's existence, in this case, tumor. Furthermore, another branch of box regression, a fully connected layer, is added to get the locations offsets of initial proposals and predict the box regression values. Moreover, during the network training process, a binary class label is assigned to initial proposals to check whether it is an object (tumor) or not. The loss function is defined as, where p represents the value computed using SoftMax based on the two outputs of a fully connected layer, and u means the label of the current box. Furthermore, the coordinate offsets will be learned using the box regression layer. The parameterization of the coordinates will be performed, describes as follows, where x and y are the coordinates of box centre, h and w represent the height and width of the candidate box, x is the predicted box, in x indicates the input box and * x is the ground truth box, and similarly, same definitions existed for y , h and w . 
Thus, v denotes the regression target and t the predicted tuple. The box regression loss is weighted by a balance parameter λ, which is set to 1 in our experiments.
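The refinement losses described above can be illustrated with a short Python sketch. It assumes the parameterization and losses commonly used in Fast R-CNN-style detectors: centre offsets normalized by the input box size, log-scale width and height ratios, a log loss for the ranking branch, and a smooth L1 regression loss applied only to object boxes. These specific forms, and the function names `encode_box`, `smooth_l1`, and `refinement_loss`, are assumptions made for illustration rather than the exact formulation used in this work.

```python
import math


def encode_box(box, ref):
    """Offsets of `box` relative to `ref`; both are (x_centre, y_centre, w, h).

    Assumed parameterization: centre shifts scaled by the reference size,
    and log ratios for width and height.
    """
    x, y, w, h = box
    x_in, y_in, w_in, h_in = ref
    return (
        (x - x_in) / w_in,
        (y - y_in) / h_in,
        math.log(w / w_in),
        math.log(h / h_in),
    )


def smooth_l1(d):
    """Smooth L1 penalty (assumed regression loss)."""
    d = abs(d)
    return 0.5 * d * d if d < 1.0 else d - 0.5


def refinement_loss(p, u, t, v, lam=1.0):
    """Ranking loss plus lam times the box regression loss.

    p: SoftMax probabilities over (background, tumor)
    u: binary label of the box (1 = tumor, 0 = background)
    t: predicted offsets, v: regression targets (both 4-tuples)
    lam: balance parameter, 1.0 as stated in the text
    """
    cls_loss = -math.log(p[u])
    # The regression term only contributes for object (tumor) boxes.
    reg_loss = sum(smooth_l1(ti - vi) for ti, vi in zip(t, v)) if u == 1 else 0.0
    return cls_loss + lam * reg_loss
```

In this sketch, t would be obtained as `encode_box(predicted_box, input_box)` and v as `encode_box(ground_truth_box, input_box)`, so that a perfect prediction gives a regression loss of zero.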

Proposals alignment
The fully connected layers of the network require fixed-size input to perform the subsequent tasks. This is one of the main problems in object detection, because the generated proposals differ in size and shape. Therefore, all generated proposals must be converted to a fixed size, and ROI pooling is adopted for this purpose. Once the refined proposals (locations) are obtained, a proposal alignment layer that performs ROI pooling is added to produce a fixed-size feature map of 7 × 7. The output size of ROI pooling does not depend on the size of the proposals or of the input feature map; it depends only on the number of sections into which each proposal is divided. The main advantages of ROI pooling are its computational speed and the fact that the same input feature map can be used for all generated proposals. This also significantly improves the overall detection accuracy [50]. Next, these proposals are passed to the last part of the proposed network to perform the desired tasks.
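The following Python sketch illustrates why the output size of ROI pooling is independent of the proposal size. It is a simplified, single-channel version written with plain lists; the function name `roi_max_pool` and the bin-boundary rounding (floor for the start, ceiling for the end of each bin) are assumptions for illustration, not the implementation used in the proposed network.

```python
def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one region of a 2D feature map to an out_size x out_size grid.

    feature_map: list of rows (single channel).
    roi: (row_start, col_start, row_end, col_end), end indices exclusive.
    """
    r0, c0, r1, c1 = roi
    height, width = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        # Floor for the bin start and ceiling for the bin end, so every bin
        # covers at least one cell even when the region is smaller than the grid.
        rs = r0 + (i * height) // out_size
        re = r0 + -(-((i + 1) * height) // out_size)
        row = []
        for j in range(out_size):
            cs = c0 + (j * width) // out_size
            ce = c0 + -(-((j + 1) * width) // out_size)
            row.append(
                max(feature_map[r][c] for r in range(rs, re) for c in range(cs, ce))
            )
        pooled.append(row)
    return pooled
```

Whatever the height and width of the region, the result is always a 7 × 7 grid (or `out_size` × `out_size` in general), which is what allows proposals of different sizes to feed the same fully connected layers.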

Detection and classification
For the detection task, a new mean pooling layer is added to reduce the feature map's dimension to 1. Then, a fully connected layer of size 1 × 1 × 4 is added to generate the final output. For the classification task, the feature map obtained from the alignment layer is of low resolution; to increase its resolution, three de-convolutional layers of size 3 × 3 are added, followed by one 1 × 1 convolutional layer to produce the output. Next, the sigmoid function is applied to this output to obtain three probability maps. The three de-convolutional layers increase the resolution of the proposals obtained in the previous stage, and the resulting feature maps are transferred to the classification network to boost classification performance. The classification layer receives a feature map of size 1 × 1 × 1152. These 1152 feature channels combine 1024 feature channels from the backbone with 128 feature channels from the output of the de-convolutions. Combining feature channels from two sources significantly boosts classification performance, as shown in Figure 1. The SoftMax activation function is used to compute the probability p of each output class u.
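As a small illustration of the classification head, the sketch below concatenates the two feature sources into a 1152-dimensional vector and applies the standard SoftMax function, p_u = exp(z_u) / Σ_k exp(z_k), to a vector of class scores z. The function names `fuse_features` and `softmax` are hypothetical, and the weights of the actual classification layer are not modelled here.

```python
import math


def fuse_features(backbone_features, deconv_features):
    """Concatenate backbone and de-convolution feature channels."""
    return list(backbone_features) + list(deconv_features)


def softmax(scores):
    """Numerically stable SoftMax over a list of class scores."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, fusing 1024 backbone channels with 128 de-convolution channels gives the 1152 channels mentioned in the text, and `softmax` applied to three class scores returns three probabilities that sum to one.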

Results and evaluation
Dataset: The efficiency of the proposed model has been evaluated on the public brain tumor dataset presented by Cheng et al. [32], which can be accessed at https://figshare.com/articles/brain_tumor_dataset/1512427. The dataset was collected from 233 patients during 2005-2010 at two state-owned hospitals in Guangzhou and Tianjin, China. It contains 3064 T1-weighted contrast-enhanced brain MRI images of 512 × 512 pixels each, with a pixel size (voxel spacing) of 0.49 mm × 0.49 mm. The dataset covers three kinds of tumors, i.e., Pituitary, Meningioma, and Glioma, imaged in three planes: axial, coronal, and sagittal. It contains 930, 708, and 1426 instances of Pituitary, Meningioma, and Glioma tumors, respectively. The dataset is provided in MATLAB (.mat) format and includes a complete description of each image: tumor mask, tumor class label, tumor border, and patient ID. Generally, input images are enhanced and restored before further processing [42] [51][52][53][54][55][56]. In the present case, the images are obtained from different imaging modalities and include artifacts and false intensity levels. Therefore, it is necessary to clean the MRI images and enhance their contrast. The key objective is to improve the dynamic range of grey values in the images to achieve better visual quality. The dataset details are presented in Table 1, and sample pre-processed images containing the three types of brain tumors are shown in Figure 4.
Pre-processing and data augmentation: This stage enhances the input data for the next task. In this paper, an input brain MRI image is provided to our network to perform the desired classification task. The pixel intensities of the input image are processed using convolutional kernels, and the performance of these kernels largely depends on the intensity values in the MRI images.
Because pixel intensity values vary within and across subjects, they have no fixed meaning. Besides, these intensity values are sensitive to the acquisition conditions. Many techniques, particularly deep neural networks, therefore require the pixel intensity values of the input brain MRI image to be normalized before any operation in order to boost network performance. Normalization brings the intensity values of the input MRI images into the same range, which promotes robust and stable convergence of the network. Therefore, in this article, the input MRI images are pre-processed using min-max normalization: the intensity values are scaled to the range [0, 1], which helps the network's training process significantly. Another step is contrast enhancement, because MRI images are collected under distinct environments, conditions, and modalities. False intensity levels and artifacts are inevitable in such images and reduce their visual quality. Therefore, the image contrast is optimized and the visual quality is improved; details can be seen in Section 3.1. Figure 5 shows the qualitative enhancement results for different images. Furthermore, to reduce over-fitting during network training, data augmentation techniques are adopted to increase the number of dataset samples. Different variations of the images are obtained using rotation and flipping. To enlarge the training dataset, images were rotated by 90°, 180°, and 270° during the training process. In addition, images were mirrored using a flipping technique in the horizontal and vertical directions. Consequently, in our case, the dataset is augmented to three times its original size, resulting in 9192 sample images. The result of data augmentation is shown in Figure 6.
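The normalization and augmentation steps described above can be sketched in a few lines of Python. The sketch operates on a 2D image stored as a list of rows; the function names are hypothetical, and the text does not specify which subset of the rotated and flipped variants yields the reported threefold increase, so `augment` simply returns all five variants.

```python
def min_max_normalize(image):
    """Scale pixel intensities of a 2D image (list of rows) to [0, 1]."""
    flat = [value for row in image for value in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for constant images
    return [[(value - lo) / span for value in row] for row in image]


def rotate90(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def augment(image):
    """Return the rotated and flipped variants of one image."""
    r90 = rotate90(image)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    return {
        "rot90": r90,
        "rot180": r180,
        "rot270": r270,
        "flip_h": flip_horizontal(image),
        "flip_v": flip_vertical(image),
    }
```

Because normalization uses the minimum and maximum of each image, every image is mapped to the same [0, 1] range regardless of its original intensity scale.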
Hyper-parameters: In the pre-processing step, the input images are normalized and enhanced to achieve improved image quality, and the training process is improved using data augmentation techniques. To obtain reliable and comparable results, 5-fold cross-validation is used. The dataset is split into 70% for training and 30% for validation. The number of training iterations is set to 10, and the results are averaged to improve the overall accuracy of the classification results. Furthermore, to ensure the optimal choice of hyper-parameters for the final model assessment, various experiments have been performed on the training data; the results can be seen in Tables 2-4. The proposed model achieved high accuracy with the Adagrad optimizer when the learning rate was set to 0.003, the number of epochs to 20, the dropout to 0.5, and the batch size to 16. We have evaluated the performance of the proposed model using the 5-fold cross-validation protocol introduced by Cheng et al. [32]. This procedure is more reliable for obtaining valid and discriminative classification results. The results can be seen in Tables 6 and 7. The proposed model significantly reduced over-fitting and converged faster. Moreover, it achieves acceptable retrieval accuracy at a low computational cost. The model is simple, easy to implement in practice, and can support radiologists in decision making for classification tasks. The proposed method is a powerful architecture that is generic to the brain tumor classification task.
Experimental results: To assess the performance of the proposed system, a confusion matrix is generated from the model's correct and incorrect predictions. Table 5 shows the confusion matrix obtained during the experiments. The proposed model classified 3004 cases correctly and 60 cases incorrectly, achieving an overall accuracy of 98.04%.
Furthermore, Glioma achieved the highest proportion of correct predictions. This result can be attributed to the larger training dataset obtained using the different augmentation techniques. The balance in the dataset significantly improved the classification outcomes. Moreover, based on this confusion matrix, the classifier's performance for each tumor type has been evaluated in terms of accuracy, sensitivity (recall), specificity, and f1-score. The model also attained high precision and f1-score values, which makes our method well suited to classifying brain tumors from MRI images. This is due to the model's efficient performance in classifying tumors in the sample images. The proposed method achieved high specificity values for all classes, which implies that it accurately identifies the sample images without the specific disease. Compared to other methods, our method achieved improved grading efficiency and performance. The efficiency of the model is improved by increasing the number of sample images, which also addresses over-fitting. Moreover, the proposed method requires neither manual segmentation nor prior knowledge about the types of features to be extracted, both of which lessen a network's generalization capability [36]. Based on the results, we believe that our model attains decent generalization capability and remains stable. Moreover, the proposed technique can also be generalized to other applications such as breast tumor classification.
Table 7 compares the proposed method with numerous existing approaches to the 3-class brain tumor classification problem evaluated on the same dataset. The table reports classification performance using the accuracy metric commonly adopted by all previous methods.
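The per-class metrics discussed above follow directly from a confusion matrix. The Python sketch below uses the usual one-vs-rest definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), precision = TP / (TP + FP), and f1-score as the harmonic mean of precision and sensitivity). The function names are hypothetical and the 3 × 3 matrix `toy_cm` is a made-up example, not the values of Table 5; only the counts 3004 and 60 are taken from the text.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics for each class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class has at least one true and one predicted sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[r][k] for r in range(n)) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)
        results.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": sensitivity,
            "specificity": tn / (tn + fp),
            "precision": precision,
            "f1": 2 * precision * sensitivity / (precision + sensitivity),
        })
    return results


def overall_accuracy(correct, total):
    """Fraction of correctly classified samples."""
    return correct / total


# Illustrative 3-class matrix (NOT the values reported in Table 5).
toy_cm = [[8, 1, 1], [0, 9, 1], [2, 0, 8]]
toy_metrics = per_class_metrics(toy_cm)

# Overall accuracy from the counts reported in the text: 3004 correct, 60 incorrect.
reported_accuracy = round(100 * overall_accuracy(3004, 3004 + 60), 2)
```

With 3004 correct predictions out of 3064 images, `reported_accuracy` evaluates to 98.04, which agrees with the overall accuracy stated in the text.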
To achieve robust performance, the proposed model is assessed with different parameter settings, as shown in Tables 2-4. Compared to the other methods, the proposed method reported the highest accuracy of 98.04% after only 20 epochs and without manual segmentation. This accuracy indicates the proposed method's efficiency for deep learning-based feature extraction and classification of brain tumors. Moreover, the proposed method not only surpassed the others in terms of accuracy but also achieved remarkable performance on all quality criteria. The ROC curves of the proposed model are shown in Figure 7. They show that the proposed model produces excellent results, i.e., 0.9938, 0.9925, and 0.9855 for the Glioma, Pituitary, and Meningioma classes, respectively. Glioma achieved the highest true positive rate of the three classes. Besides, the effectiveness of the proposed method's detection results can be seen in the average accuracy curve in Figure 8, which shows that the method, which detects the tumor by drawing a bounding box, is stable.
The suitable choice of hyper-parameters and architecture in the experiments increased the performance, making it superior to the others. Brain tumor classification is a challenging problem. Numerous factors may affect the classification task, such as tumor shape, orientation, and size, low contrast, noise in MRI images, and limited training samples. These factors can lead to misclassification and over-fitting, which reduce classification accuracy. Compared to previous approaches, the proposed method addresses these issues significantly and achieves acceptable accuracy. Localizing the tumor, enhancing contrast, and augmenting the data before classification improve the classification accuracy, which distinguishes the proposed method from the others. Thus, the model achieved robust classification results, reached its highest performance very quickly, and reduced over-fitting significantly. The network's training and validation process can be seen in Figure 9. The accuracy and loss curves show that the model behaves consistently during training and validation.

Conclusions
Brain tumor classification is one of the most significant areas of research in the medical sciences. A number of approaches have been presented for classifying three types of tumors. These methods achieved acceptable classification accuracy. However, the problem of tumor classification is still open and needs to be addressed properly, and the classification accuracy can be further increased with an efficient framework. In this manuscript, an efficient brain tumor classification model has been proposed that can effectively classify tumors, i.e., Meningioma, Glioma, and Pituitary, from brain MRI images. The main objective of this research work is to design a brain tumor classification model that achieves high classification accuracy with low complexity. The proposed model first enhances the visual quality of the image using optimal contrast and nonlinear strategies. Second, the tumor locations are obtained using segmentation and clustering techniques. These locations are then scored and provided to EfficientNet along with the corresponding input image for feature extraction. Third, these locations are further refined to increase detection performance. Next, they are aligned and processed to determine the category and location of the tumor. The classification accuracy is increased by transferring features from the detection layers to the classification layers. Furthermore, to prevent over-fitting in the network, data augmentation techniques are adopted. The proposed model is evaluated on a publicly available FigShare dataset for brain tumor classification. The experiments yielded robust results compared with other similar approaches. The proposed method achieves an overall classification accuracy of 98.04%.
Furthermore, the proposed model achieved accuracy of 98.17%, 98.66%, and 99.24%, sensitivity (recall) of 96.89%, 97.82%, and 99.24%, and specificity of 98.55%, 99.38%, and 99.25% for the Meningioma, Glioma, and Pituitary classes, respectively. The performance of the proposed model is superior to that reported in the state-of-the-art literature using the same dataset. Hence, the results show that the proposed idea works well for brain tumor classification. Once implemented, the classification process classifies brain tumors with high accuracy and can help save lives.
Since the classification efficiency of the proposed model is proportional to the number of training images, a small image dataset will affect its performance. Conversely, the proposed method achieves high classification efficiency on a large image dataset but is computationally expensive. The authors believe that the computational cost of the proposed model can be reduced and that the model can be generalized to diverse clinical applications, such as breast tumor classification and liver lesion classification, with numerous medical imaging modalities such as CT, PET, and X-ray. The tumor could be localized using a weakly supervised technique to improve localization accuracy. Moreover, to enhance the performance of the proposed model, multi-channel classifiers will be added, and a callback function that uses minimum loss and maximum accuracy as a stopping criterion will be introduced to decide on the number of epochs.