Glioma Grade Classification Using CNNs and Segmentation With an Adaptive Approach Using Histogram Features in Brain MRIs

Artificial intelligence (AI) applications have become popular due to their advantages in solving health problems with high accuracy and confidence. One such application is the diagnosis of brain tumors or anomalies. This paper presents two new approaches for brain tumor grade classification and segmentation. Convolutional neural network (CNN) models were used as the first approach to classify High-Grade Glioma (HGG) and Low-Grade Glioma (LGG) tumors, achieving 99.85% accuracy, 99.85% F1 and 99.92% AUC scores. A new pipeline consisting of normalization, modality fusion and a CNN model was also proposed and developed for the HGG-LGG classification task. For the segmentation task, a novel algorithm based on histograms, thresholding and morphological filtering with feature fusion was proposed and developed. An average Dice Similarity (DS) of 70.58% was achieved for complete tumor segmentation. Experimental results have shown that the proposed algorithm measures the complete tumor region 15% more accurately than fixed thresholding. Segmentation results also suggest that the algorithm can be used as a feature extraction process on brain MR images of different sizes. It is expected that the extracted center of gravity features can be further used in AI algorithms for better segmentation, including the T1 and T1CE modalities.


I. INTRODUCTION
Early diagnosis and initiation of treatment are critical in being or staying healthy. Automatic analysis of brain Magnetic Resonance Images (MRIs) using Computer Aided Diagnosis (CAD) systems has been one of the most trending research areas in the last decade [1], [2]. Any abnormality in the brain can pose a high health risk and can lead to death. Recent improvements in artificial intelligence (AI) encourage the application of these methods in the health care area, which requires critical decision-making capabilities. AI solutions are naturally well-suited for such complicated, multi-parameter problems. It is of great importance to design and implement computer assisted, robust, efficient and automated systems to reduce the workload of professionals, speed up internal hospital systems, and obtain and report fast results.
Analysis of abnormalities observed in brain tissue is vital for early diagnosis and treatment. Radiologists study, detect, analyze and report any abnormality based on brain images. Neurosurgeons, neurologists and oncology specialists then make decisions and implement treatment in the light of the reported information. The grade of a tumor plays a highly important role in diagnosing or planning any treatment. Automating the extraction of this information greatly helps a neurosurgeon reach a decision on the diagnosis.
For monitoring the brain tissue, tumor or abnormality, methods such as MRI, Positron Emission Tomography (PET), Computed Tomography (CT) and PET-CT are widely used in hospitals. MRI is the most common choice among medical doctors, given that it is noninvasive and does not expose the patient to radiation or require toxic substances. Thus, classification and segmentation are performed using brain MR images in this research. Routine MRI protocols used to evaluate brain masses are T1-weighted, contrast-enhanced T1 (T1CE) and T2-weighted images, followed by Fluid Attenuated Inversion Recovery (FLAIR).
Tumors can be benign or malignant with varying characteristics. Malignant tumors are classified according to their degree of differentiation (similarity to the tissue from which they originate), and one of their distinguishing features is that they can spread (metastasize) from their region to another region. Brain tumors cause tissue death, compression of surrounding tissue, impaired blood flow, bleeding and edema. This can result in headache, loss of consciousness, loss of vision, numbness, and stroke.
There are several other types of diseases such as ischemic stroke (infarct), hemorrhage, neurodegenerative (Prion, Alzheimer, Parkinson) and demyelinating diseases (Multiple Sclerosis -MS) that threaten human life [3], [4], [5], [6]. These types of diseases are rarely encountered in daily life compared to tumors. It can be said that solving problems related to tumors is critical.
In the literature, the automatic diagnosis problem is dealt with under three subproblems: detection, classification and segmentation [15], [23]. Detection is a more generic paradigm that covers classification and segmentation, but it mainly focuses on the presence or absence of the tumor. Classification refers to the subclassification of tumor types within a given context, such as glioma, meningioma and pituitary, or the classification of the grade of the tumor [9], [10], which can be High-Grade Glioma (HGG) or Low-Grade Glioma (LGG), as in this study alongside segmentation.
The problem of locating and delineating the tumor area in a given image is tackled as segmentation. It is the process of segmenting the tumor area in terms of pixels, which is a more precise measurement than the bounding boxes used in many object recognition tasks. One could say that segmenting the given image into tumor and nontumor (healthy) parts means classifying each pixel as tumor or nontumor; it depends on the perspective. Segmentation branches into segmenting the tumor core, enhancing/non-enhancing tumor and edema regions, which, when combined, form the complete tumor area. Similarly, detecting the presence of a tumor is related to the binary classification of the given image as tumorous or healthy. Fig. 1 summarizes the major solvable problems using brain tumor MRIs and addresses the scope of this study. Looking at the literature, convolutional neural networks (CNNs) show superior success in tumor classification on benchmark datasets [7], [8], [9], [10], as in many object recognition problems. In addition, datasets in the medical imaging field are quite small due to issues such as inconvenience, ethics, privacy, and especially the heavy workload of manually labeling the segmentation masks. Thus, the advantage of transfer learning, which has proven to be very effective on small datasets, can be exploited here.
For segmentation, many Deep Learning (DL) approaches have been proposed, but the downside is that these approaches are data dependent and any change in the data characteristics can affect the models dramatically, so a more universal approach has been developed in this study. Since there are contrast differences in the images produced by different MR devices, small contrast changes in the data can significantly affect the result when DL is applied, because in such models, low level features obtained from small details in the image dramatically affect the results in the successive deeper layers. One of the most effective solutions is to train the models on a wide variety of data by creating a new dataset for each MR device, which overcomplicates the problem. Histogram features, thresholding and morphological operations have been widely used as an alternative to DL approaches, as in [11], [12], [13], [14], and [15], and have acquired comparable success. Hence, a novel approach for brain tumor segmentation based on adaptive thresholding is proposed in this study. Five state-of-the-art neural networks have been developed and tested for the HGG-LGG binary classification problem with the use of pretrained weights. The BraTS 2020 training dataset, which consists of MR slices with various contrast levels, has been used. For segmentation, an adaptive algorithm based on histograms, morphological operations and feature fusion has been proposed.
The motivation of this study mainly comes from the Turkish Brain Project. In the project, the aim was to handle and analyze brain MR images including various anomalies and tumors. The work presented here contributes to the project as preliminary research regarding the grading and segmentation of glioma tumors. Studies specific to glioma tumors in the literature were examined and approaches were developed to be used in the project. The major contributions of the proposed work are given as follows:
• to design, modify and implement state-of-the-art CNN models, which are generally used for object detection tasks, for the HGG-LGG classification problem,
• to obtain higher HGG-LGG classification accuracy than various machine and DL approaches proposed in the literature, supported by F1 and Area Under Curve (AUC) scores,
• to use and demonstrate the advantage of multimodal MR images with the fusion of the T1CE, T2 and FLAIR modalities by concatenation for the classification task,
• to develop a novel adaptive threshold determination approach for MRI binarization to decompose the complete tumor area more accurately in T2 and FLAIR MR images,
• to develop an alternative, efficient and simpler method to segment the complete tumor area with the combination of adaptive thresholding and feature fusion of T2 and FLAIR images, followed by morphological operations,
• to evaluate the segmentation algorithm with the Dice Similarity (DS) metric for the complete tumor,
• to evaluate the predicted centers of gravity of the complete tumor using the automatically segmented regions, and
• to discuss and evaluate the results obtained with both the glioma grading approach and the segmentation algorithm for use in future work.
This paper is organized as follows. Section II gives a brief summary of related works. Section III presents the material and methods used in this study. Section IV introduces the experimental results and discussion. The work is finally concluded in Section V.

II. RELATED WORK
A. GLIOMA GRADE CLASSIFICATION
One of the most challenging aspects of applying AI solutions in the health care field is that errors cannot be tolerated because the results are used in the decision-making processes that directly affect human health. Researchers worldwide are working on developing methods and conducting thorough evaluations to build more accurate and robust CAD systems. These approaches mainly include DL, classical Machine Learning (ML) (SVM, clustering), histogram based analysis, and hybrid approaches.
In the case of glioma tumor grading, many ML approaches, including those that involve feature extraction and selection stages and those that are purely based on DL, have been proposed. Since classical ML methods usually have simpler structures than DL approaches, feature extraction and selection phases are critical to obtain accurate classification results. Cho and Park [16] used a system consisting of feature extraction, selection, and finally classification with logistic regression. To represent tumors, radiomic features such as histogram, shape, and GLCM were extracted and further selected using L1-norm regularization (LASSO) to identify important attributes of tumors. After this extensive feature engineering step, a simple logistic regression algorithm was sufficient to achieve high classification accuracy.
In addition to logistic regression, support vector machines (SVMs) are widely used for classification after a feature preparation process in the literature [17], [18], [19], [20]. Polly et al. [17] proposed a pipeline that first segments the tumor and then performs HGG-LGG classification on the segmented tumor regions. The discrete wavelet transform (DWT) was used for feature extraction, followed by principal component analysis (PCA) for dimensionality reduction, and the data were finally binarized using Otsu binarization. The preprocessed data were then fed into the k-means algorithm and the segmentation results were obtained. By eliminating the unrelated parts of the brain, the SVM classifier obtained better classification results. Similarly, Ain et al. [19] proposed a pipeline that segments the tumor using k-means and classifies it as HGG or LGG using SVMs. The difference was the use of median filters and a Canny edge detector for feature extraction; after k-means clustering for segmentation, GLCM features were extracted and fed into the SVM classifier. Bi et al. [18] also proposed a novel feature extraction pipeline and further classified tumors into HGG or LGG using SVMs. They exploited the advantage of deep morphological and physiological features. After information gain was computed on these features, LASSO feature selection was applied, similar to [16]. Dequidt et al. [20] proposed a feature extraction pipeline similar to [17], which consists of shape, intensity, texture, and GLCM features. In addition, the inverse difference moment was used for texture analysis. It appears that the preprocessing pipeline proposed in [17] results in better accuracy than [19]. It can be deduced that the method proposed in [17] is the most successful among similar SVM classification approaches that include various feature engineering steps.
Apart from classical ML approaches, DL methods have also shown very effective success in the classification problem. DL has many advantages over ML thanks to its ability to extract and select features automatically within the natural model structure. In the case of DL, complicated feature extraction and selection pipelines are no longer needed. Ge et al. [7] developed a multistream CNN which takes the input as 3 different branches (T1, T1C and FLAIR) and then fuses them within the neural network structure by aggregation. The problem was solved with high accuracy by a compact CNN itself, without any feature preparation process. Banarjee et al. [8] constructed a novel CNN named VolumeNet, which was trained on polyhedral volumetric slices. State-of-the-art VGG and ResNet CNN models were used for comparison and the experimental results showed that the proposed model performs better in terms of accuracy. Unlike [8], Kularni and Stundari [9] utilized and tested existing state-of-the-art neural networks (AlexNet, GoogLeNet, ResNet18 and ResNet50) using transfer learning with the Adam optimizer. Experimental results showed that ResNet18 performed best among the different models. Another approach for HGG-LGG classification using CNNs was proposed by Singh et al. [10], but with a small modification to the conventional CNN structure. Instead of the convolution blocks, Gabor filters were used to learn smaller features and reduce the computational complexity by lowering the number of parameters required for training. Similar to [17] and [19], Soleymanifard and Hamghalam [21] proposed a scheme which segments the tumor area with clustering and then classifies it with a neural network. Fuzzy c-means clustering was used for soft clustering and the extraction of tumorous parts of the brain. Then the local binary pattern (LBP) algorithm and GLCM were used for feature extraction on the segmented regions.
Finally, classification was performed using a neural network based classifier. Among different proposed DL approaches, [9] and [10] provided the most successful results on the BraTS dataset. A brief summary of remarkable studies for HGG-LGG classification tasks is shown in Table 2, including the comparison with the proposed method in this study. For some studies using multiple datasets, the best results are listed.

B. BRAIN TUMOR SEGMENTATION
Similar DL approaches have also been successfully applied to segmentation tasks by modifying classical CNN architectures into an encoder-decoder structure. Pereira et al. [22] proposed an approach which consists of intensity normalization, data augmentation and CNN classification steps for glioma tumor segmentation. By using 3 × 3 kernels, deeper networks were constructed while avoiding overfitting problems. Unlike traditional CNN approaches, the use of intensity normalization before the training phase had a positive overall impact on the segmentation performance. The proposed method took 1st place in the BraTS 2013 challenge and 2nd place in BraTS 2015. Similarly, Iqbal et al. [23] developed a pipeline which includes preprocessing steps with, in addition, bias correction and mean subtraction, followed by deep CNNs named SkipNet, IntNet and SENet. The SENet model, which has skip connections and squeeze-and-excitation blocks, performed best among them and obtained higher performance than similar methods. Diaz-Pernas et al. [24] developed a novel multiscale CNN which processes the input matrices in 3 different branches within the network for the segmentation of 3 different tumor types. Elastic transformation for data augmentation and pixel standardization methods were used before the training phase. The proposed methods outperformed previous studies on the same dataset. Ranjbarzadeh et al. [25] proposed a pipeline using binarization, followed by a feature fusion operation using a matrix dot product and a Cascade CNN with a unique Distance-Wise Attention (DWA) mechanism. Experimental results showed that the proposed method achieved competitive results. Among the DL approaches for the segmentation task reviewed in this study, the approach used in [25] obtained the best results for complete, core and enhancing tumor segmentation.
Beyond DL, it is also possible to obtain high performance segmentation results with less complex methods, such as the use of gray level histograms and gray intensity based features combined with morphological operations. Rexilius et al. [11] exploited the advantage of multispectral histograms by processing them with global affine and nonrigid image registration. Segmentation was performed using a probabilistic intensity model. It provides an alternative solution which is fast and less complex compared to DL models. Murthy et al. [12] also showed that a simple scheme consisting of histogram equalization, thresholding and morphological operations can acquire reasonable segmentation results. Singh and Ansari [13] proposed a more dedicated approach which adopts the use of SVM classifiers after a histogram normalization operation. Before the classification process, median, adaptive and averaging filters were used for noise removal. After the SVM classification process, the k-means clustering method was used for segmentation, followed by morphological operations, and high segmentation accuracy was acquired on real life data. Akter et al. [14] developed an uncommon approach which divides the input image into hemispheres and selects the one which includes the tumorous tissue. Intensity histograms were used to detect the tumorous hemisphere. After the selection of the region of interest, thresholding was applied, followed by a median filter. Connectivity checking was applied to the obtained output and the biggest object was selected as the tumor. Experiments showed that the proposed method achieves satisfactory segmentation results even though the volume of the dataset is very small. Abbasi and Tajeripour [15] used LBP histograms from Three Orthogonal Planes (LBP-TOP) and the Histogram of Oriented Gradients (HOG-TOP) as tumor representing features, then finally segmented the tumor regions using Random Forest (RF).
Like in [23], bias correction was used before the training phase, followed by histogram matching. When compared to other similar approaches, proposed method showed superior performance on the same dataset. A brief summary of remarkable studies for brain tumor segmentation tasks is shown in Table 3, including the comparison with the proposed method in this study. For some studies using multiple datasets, the best results are listed.

III. MATERIAL AND METHOD
A. BraTS 2020 DATASET
The Multimodal Brain Tumor Segmentation Challenge (BraTS) dataset [26], [27], [28] is widely used in the field of medical image analysis for brain tumor classification and segmentation tasks. It contains MR images of patients with HGG and LGG in multiple planes and modalities, and includes MRIs with a variety of contrast levels acquired from 19 different institutions. The BraTS 2020 training dataset was used for the classification task in this study; it includes the axial slices of the T1CE, T2, and FLAIR MR images of 369 cases. First, all of the LGG samples were extracted using the segmentation labels. Since the HGG samples are dominant in the dataset, undersampling was applied to the HGG samples at the slice level to balance the dataset, creating a dataset containing equal numbers of HGG and LGG samples. For the segmentation task, the axial slices of the T2 and FLAIR images containing the largest tumor parts were used and underwent binarization, feature fusion, and morphological operations.
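The slice-level undersampling described above can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: the array names and the random selection strategy are assumptions.

```python
import numpy as np

def undersample_to_balance(hgg_slices, lgg_slices, seed=0):
    """Randomly keep as many majority-class slices as there are
    minority-class slices, so both classes end up equal in size.

    hgg_slices, lgg_slices: arrays of shape (n_slices, H, W).
    """
    rng = np.random.default_rng(seed)
    n = min(len(hgg_slices), len(lgg_slices))
    keep = rng.choice(len(hgg_slices), size=n, replace=False)
    return hgg_slices[keep], lgg_slices[:n]

# toy example: 10 HGG slices vs. 4 LGG slices -> 4 of each remain
hgg = np.zeros((10, 240, 240))
lgg = np.ones((4, 240, 240))
balanced_hgg, balanced_lgg = undersample_to_balance(hgg, lgg)
```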

B. CLASSIFICATION
Early diagnosis and treatment of glioma tumors, which can pose a high risk to health, are extremely important. Differentiating the more dangerous HGG tumors from LGG tumors is critical because if the tumor is LGG, surgery may not even be necessary. In order to make a clear distinction between HGG and LGG under standard hospital conditions, a brain tissue sample has to be taken and analyzed pathologically. Solving this classification problem automatically using only brain MRIs can eliminate unnecessary surgical risks. CNN classifiers are widely used for various image classification problems, including brain tumor grading. The main advantage of CNNs is that the features are extracted and selected automatically thanks to the network architecture, activations and weight optimization. Hence, complex pipelines for feature extraction and selection followed by a separate classification algorithm are not needed. Their compact yet complex structure enables solving complicated classification tasks, which comes with a computational complexity tradeoff compared to classical ML approaches such as SVM, k-means and logistic regression.
Considering the advantages of CNNs, we used Inception V3 [29], DenseNet201 [30], Xception [31], MobileNet V2 [32] and EfficientNet V2S [33] backbones with the pretrained ImageNet weights for the binary classification, inspired by [9], using TensorFlow with Keras [34]. The last fully connected layers were discarded and additional layers were added on a problem basis. All layers are initialized with the ImageNet weights and all trainings have been performed without freezing any layers.
A flatten layer is added on top, followed by a 128-neuron dense layer. Rectified Linear Unit (ReLU) activation is used for the first dense layer. A dropout layer with a 0.2 dropout rate is used to avoid overfitting. Finally, a 1-neuron dense layer with the sigmoid activation is added as the last layer. Fig. 2 shows the general architecture of the models developed in this study. The RMSprop optimizer is used with a learning rate of 0.0001 on batches of size 4 to 8, depending on the model complexity and memory capabilities. In the preprocessing step, each gray level image is normalized by dividing each pixel by the maximum gray value and multiplying by 255. After normalization, the 3 modalities are fused by concatenating them into the RGB channels to maximize the tumor details, as shown in Fig. 3. This way, different tumor features revealed by different modalities are combined in a single sample, deficiencies of the individual modalities are eliminated and a more complete view of the brain tumor is highlighted. Instead of fusing the modalities within the model as in [7], the fusion is performed before the training phase, which lowers the complexity of the model. Also, the T1CE modality is used instead of T1 because of its known advantages in the medical field. In the early stages of this study, experiments were performed using only the T2 slices of the cases, without the fusion step. Experiments have shown that the fusion process had a positive impact on the overall model performance because of the richness of the features it grants.
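The normalization and channel-wise fusion steps described above can be sketched in NumPy as follows; the function names are illustrative, and the exact implementation in the paper may differ:

```python
import numpy as np

def normalize(img):
    """Scale a gray-level slice to the 0-255 range by dividing each
    pixel by the slice's maximum gray value, then multiplying by 255."""
    m = img.max()
    return (img / m * 255.0) if m > 0 else img.astype(float)

def fuse_modalities(t1ce, t2, flair):
    """Concatenate the three normalized modalities into the R, G and B
    channels of a single 3-channel sample."""
    return np.stack([normalize(t1ce), normalize(t2), normalize(flair)],
                    axis=-1)

# toy 240x240 slices with different raw intensity ranges
t1ce = np.random.randint(0, 1200, (240, 240))
t2 = np.random.randint(0, 900, (240, 240))
flair = np.random.randint(0, 1500, (240, 240))
rgb = fuse_modalities(t1ce, t2, flair)
```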
After preprocessing, the dataset is divided into 10 equal parts in order to perform 10-fold cross validation. Each model is trained for 15 epochs on each fold. To obtain a binary classification result between 0 and 1, the sigmoid function is applied to the output of the last fully connected layer (x), as in (1):

sigmoid(x) = 1 / (1 + e^(-x))    (1)

The final classification result is then obtained by checking whether the resultant value is greater or less than 0.5. Fig. 4 represents the sigmoid output of the neural network and Fig. 5 demonstrates the overall pipeline of the proposed classification approach.
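A minimal sketch of the sigmoid decision rule is given below. Which grade maps to the positive output is an assumption here, since the paper only specifies the 0.5 cut-off:

```python
import math

def sigmoid(x):
    """Squash the raw output of the last layer into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, threshold=0.5):
    """Assign a grade by comparing sigmoid(x) to 0.5.

    Mapping outputs above 0.5 to HGG is an illustrative assumption.
    """
    return "HGG" if sigmoid(x) > threshold else "LGG"
```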

C. SEGMENTATION
Today, automatic segmentation of brain anomalies in MR images provides great advantages in the diagnosis of possible diseases, treatment planning and treatment follow-up. Among these anomalies, tumors are among the abnormal conditions with the highest priority, along with hemorrhage and stroke. The automatic segmentation of glioma tumors covered in this study can assist radiologists and oncologists in their diagnosis and treatment processes. In the literature, various studies using histogram features, thresholding and morphological operations have shown reasonable success, offering an alternative, simpler and more efficient way compared to DL based approaches [11], [12], [13], [14], [15]. Thus, a novel adaptive thresholding approach is developed in this study, taking advantage of feature fusion and morphological operations inspired by the literature.
In the proposed segmentation approach, the T2 and FLAIR modalities are used as they visualize the outer boundaries of tumors more sharply and show tumor regions with higher contrast. When the dataset is statistically examined, it is seen that the tumor regions usually appear as peaks (local maxima) around gray level 145 on the x-axis of the gray level histograms of the T2 and FLAIR modalities. While this hillock is obvious in some samples, it is not clearly seen in others (Fig. 6). The local maximum, which refers to the tumor region, is usually seen after the global maximum region formed by the rest of the brain. However, a constant such as 145 is not valid for every sample. The aim is to determine how far after the global maximum the tumor region appears on the x-axis, that is, how much the global maximum has to be shifted to determine the threshold. For this, first of all, the gray level averages of all brain regions of each image, including tumors, are calculated. Then the smallest (avg_min) and the largest (avg_max) averages are selected. Here, the idea is to approximate these averages to the gray level values in the tumor regions. For this, avg_min and avg_max are multiplied by expansion coefficients (EC). EC is assumed/selected as 30 (EC_max) for avg_max and 50 (EC_min) for avg_min to compensate for the gray level difference. These EC values are hyperparameters that can be further explored in future work in the context of optimization. A constant c is obtained by multiplying the average gray levels by the ECs, as in

c = avg_max × EC_max + avg_min × EC_min.
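The computation of the constant c can be written directly from the formula above; the default EC values follow the paper, while the function name is illustrative:

```python
def expansion_constant(avg_min, avg_max, ec_min=50, ec_max=30):
    """c = avg_max * EC_max + avg_min * EC_min, with the EC values
    selected in the paper as defaults."""
    return avg_max * ec_max + avg_min * ec_min

c = expansion_constant(avg_min=2.0, avg_max=4.0)  # 4*30 + 2*50 = 220
```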
The average gray levels of each T2 (avg_T2) and FLAIR (avg_FLAIR) pair at the same height level in the brain are calculated and their mean is taken, as in

avg_T2,FLAIR = (avg_T2 + avg_FLAIR) / 2.

To reduce the expanded output, the obtained constant c is divided by avg_T2,FLAIR. The adaptive shift value S for each sample is thus calculated as in

S = c / avg_T2,FLAIR.

Finally, by adding the shift value S to the gray levels of the global maxima of the histograms (max_T2 and max_FLAIR), the adaptive threshold values threshold_T2 and threshold_FLAIR are obtained for each sample, as in

threshold_T2 = max_T2 + S,
threshold_FLAIR = max_FLAIR + S.

After binarizing the T2 and FLAIR images using their unique threshold values, a logical ''AND'' operation is performed between the binarized T2 and FLAIR matrices. This fusion operation is inspired by [25], but instead of a dot product or a logical ''OR'' operator, a logical ''AND'' operator is used here. The motivation for using the ''AND'' operator is to eliminate the remaining salt and pepper noise and to combine the tumor regions that appear in both matrices. A 2-step morphological erosion is applied to eliminate the remaining salt and pepper noise in the resultant matrix, and a 5-step dilation with a cross-shaped filter is applied to re-grow the shrunken tumor area (Fig. 7). These step sizes are also hyperparameters that can be adjusted with optimization algorithms in future work. Examples of the outputs obtained after the morphological operations are shown in Fig. 8. Fig. 9 illustrates the complete pipeline of the proposed segmentation algorithm.
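Putting these steps together, a minimal NumPy sketch of the adaptive thresholding, ''AND'' fusion and morphology might look as follows. Several details are simplifying assumptions: background (zero) pixels are excluded from the histogram and the averages, the per-height-level averaging is collapsed into a single per-slice mean, and the edge wrap-around of np.roll is ignored since the brain does not touch the image border.

```python
import numpy as np

def cross_dilate(mask):
    """One dilation step with a 3x3 cross-shaped structuring element."""
    return (mask
            | np.roll(mask, 1, 0) | np.roll(mask, -1, 0)
            | np.roll(mask, 1, 1) | np.roll(mask, -1, 1))

def cross_erode(mask):
    """One erosion step with the same cross-shaped element."""
    return (mask
            & np.roll(mask, 1, 0) & np.roll(mask, -1, 0)
            & np.roll(mask, 1, 1) & np.roll(mask, -1, 1))

def global_hist_peak(img):
    """Gray level of the global maximum of the nonzero-pixel histogram."""
    hist = np.bincount(img[img > 0].ravel(), minlength=256)
    return int(hist.argmax())

def segment(t2, flair, c):
    """Adaptive-threshold segmentation sketch following the paper's steps."""
    avg_pair = (t2[t2 > 0].mean() + flair[flair > 0].mean()) / 2.0
    s = c / avg_pair                                # adaptive shift S
    bin_t2 = t2 > (global_hist_peak(t2) + s)        # threshold_T2
    bin_flair = flair > (global_hist_peak(flair) + s)
    fused = bin_t2 & bin_flair                      # logical AND fusion
    for _ in range(2):                              # 2-step erosion
        fused = cross_erode(fused)
    for _ in range(5):                              # 5-step dilation
        fused = cross_dilate(fused)
    return fused
```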

IV. EXPERIMENTAL RESULTS AND DISCUSSION
A. CNN CLASSIFICATION RESULTS
With the HGG-LGG classification approach performed on the BraTS dataset, 99.85% accuracy and F1 and 99.92% AUC scores are obtained with MobileNet V2; satisfactory success is achieved for its real life applicability as a CAD or Decision Support System (DSS). 10-fold cross-validation results are listed in Table 1. To demonstrate the model performance over epochs, the 6th fold accuracy, F1 and AUC curves of the MobileNet V2 model are shown in Fig. 10. Accuracy, mean precision, mean recall and F1 scores are calculated in order to evaluate the models, as in

accuracy = (TP + TN) / (TP + TN + FP + FN),
precision = TP / (TP + FP),
recall = TP / (TP + FN),
F1 = 2 × precision × recall / (precision + recall),

where TP is true positives, TN is true negatives, FP is false positives and FN is false negatives. The best model in terms of accuracy, F1 and AUC scores, MobileNet V2, is trained for 7.6 epochs on average (Table 1) to obtain the best results. With the lowest number of training parameters (12.7 × 10^6) and complexity, MobileNet V2 performed remarkably well when compared to more complex models. AUC is calculated using 200 different threshold values distributed between 0 and 1. Precision and recall are calculated for each class separately and their mean is taken; F1 scores are then calculated using the mean precision and recall. Compared to the literature (Table 2), the proposed pipeline and the transfer learning models acquired higher performance than both classical ML and similar DL approaches. The feature fusion step and the network construction strategy had a dramatically positive impact on the classification accuracy. The proposed approach performs slightly better than studies using similar datasets [9], [10], but even small improvements in classification accuracy are extremely important, as potential errors in the health care field can be catastrophic.
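The evaluation metrics above follow directly from the confusion-matrix counts. The sketch below computes them per class; the paper's macro-averaging (taking the mean precision and recall over the two classes before computing F1) would simply apply these formulas to each class and average:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```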
To demonstrate the effectiveness of the fusion process, classification is also performed using only the T2 MRI slices. Using the same setup and network structure, including normalization, a 75-25% training-validation split is used with the MobileNet V2 model. After training for 15 epochs, the most successful results are obtained at the 12th epoch, with 93.66% accuracy, 93.89% F1 and 94.71% AUC scores on the validation set, which are approximately 6% lower than those of the models using the fused inputs. This clearly shows that the fusion process used in this study had a positive impact on the classification task across various metrics. It can be concluded that the fusion process and the model structure used in the proposed approach are the main reasons for the more successful results compared to the methods proposed in the literature.

B. SEGMENTATION RESULTS
The Dice Similarity (DS) metric is used for the evaluation of the segmentation algorithm. The DS metric is more convenient and naturally fits the task of measuring the similarity between the ground truth segmentation and the automatic segmentation result in the case of medical imaging. Classical classification evaluation metrics such as accuracy and recall might not be sufficient in most situations: no matter how precise the automatically selected tumor region is, other parts of the brain count as true negatives, which always results in a high accuracy regardless of the true positives. In addition, DS is used more often than the Intersection over Union (IoU) metric for segmentation tasks and when comparing irregular shapes, such as tumors. Thus, DS is an effective evaluation metric for this case. The calculation of DS is similar to that of the F-score. It gives equal priority to the intersection and union. The DS between the ground truth mask A and the automatically predicted segmentation mask B is calculated as in

DS = 2|A ∩ B| / (|A| + |B|).

It produces the output 1 if and only if the two segmentation masks exactly match. Any missing or extra parts of the predicted mask are penalized and result in a lower DS. As a result of applying the proposed segmentation to all 369 cases, an average of 70.58% DS is obtained for the complete tumor segmentation. While 136 samples are below the average, the majority of 233 samples are above the average, as shown in Fig. 11. Compared to similar approaches using histogram features, thresholding and morphological operations, the proposed approach obtained reasonable complete tumor DS in a simpler and more efficient way (Table 3). Some relatively successful and unsuccessful output examples of the algorithm are illustrated in Fig. 12 in the first two and last two rows, respectively.
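The DS formula above can be computed for binary masks in a few lines; returning 1.0 for two empty masks is a common convention and an assumption here:

```python
import numpy as np

def dice_similarity(a, b):
    """DS = 2|A ∩ B| / (|A| + |B|) for boolean segmentation masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    # convention: two empty masks are treated as a perfect match
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0
```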
When the proposed method is compared with similar approaches in the literature [11], [12], [13], [14], [15], it is seen that reasonable results can be obtained with a less complex algorithm. The proposed method is less complex in terms of computational complexity, memory complexity and explainability. For instance, the approach in [12], which is the closest to the proposed algorithm in terms of computational complexity, has equal or greater computational complexity since it uses the more sophisticated histogram equalization method. Similarly, [11], [13], [14], and [15] use histogram-based operations together with SVM, k-means, LBP and similar methods, which generally increase complexity. Despite its lower complexity, the proposed approach achieves a reasonable DS.
The DS obtained may not be sufficient for daily clinical use in hospitals compared to most DL approaches. Its strength, however, is that it is more robust to variation in contrast levels between MR images taken with devices at different settings, as it has no complex parameters like NNs, which can cause overfitting, and it uses an adaptive threshold mechanism on a per-sample basis. In addition, the centers of gravity of the automatic segmentation masks are calculated, and the average Euclidean Distance (ED) to the ground-truth centers is 8.12 pixels in 240 × 240 pixel MR images. The hit rate of the centers of gravity falling inside the tumor region is 91.85%. Similar to the DS results, the majority of 264 samples are below the average ED while 105 samples are above it, as shown in Fig. 13. Some relatively successful and unsuccessful gravity-center estimation results are listed in Fig. 14. The algorithm proposed for segmentation can therefore be used as a data preprocessing or center-of-gravity estimation step in future work. When the same steps were applied to T2 and FLAIR images with various fixed threshold values between 0.5 and 1.0 instead of the adaptive approach, much less successful results were obtained; the adaptive approach yielded an approximately 15% increase in DS.
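The center-of-gravity features and their evaluation can be sketched as follows (a minimal illustration; `hit` assumes the rounded center coordinates index the ground-truth mask, which is one plausible reading of the hit-rate definition):

```python
import numpy as np

def center_of_gravity(mask):
    """Center of gravity (mean row, mean col) of a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # empty mask: no center to report
    return ys.mean(), xs.mean()

def euclidean_distance(p, q):
    """Euclidean Distance between two (row, col) points in pixels."""
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))

def hit(mask, center):
    """Whether an estimated center falls inside the tumor region."""
    r, c = int(round(center[0])), int(round(center[1]))
    return bool(mask[r, c])
```

Averaging `euclidean_distance` between predicted and ground-truth centers over all cases gives the 8.12-pixel figure reported above, and the fraction of cases where `hit` is true gives the 91.85% hit rate.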

V. CONCLUSION
As is known, automatic brain tumor grade classification and segmentation play important roles in improving CAD systems and Decision Support Systems (DSSs) that help professionals solve complicated problems. In this study, two approaches were proposed for two critical problems in brain tumor diagnosis: glioma tumor grade classification and complete tumor segmentation.
The classification of HGG and LGG was performed using CNN models and transfer learning with a fine-tuning approach, resulting in performance superior to previous studies, with 99.85% accuracy and F1, and 99.92% AUC scores. The use of multimodal MR images had a positive impact on both the classification and segmentation results.
Transfer learning is an effective solution for glioma grade classification with CNNs, and feature fusion is highly recommended for both classification and segmentation tasks. The proposed segmentation algorithm performed slightly worse than the DL-based and other histogram-based approaches for complete tumor segmentation, with 70.58% DS, but offers an alternative, simpler and less complex solution. In future studies, segmentation performance can be improved by using various sophisticated optimization algorithms to select the expansion coefficients as well as the morphological operation step sizes.
The proposed adaptive thresholding mechanism improved complete tumor segmentation by approximately 15% compared to fixed thresholding. The segmentation algorithm can also be used as a feature extraction step to estimate the complete tumor gravity center, with an average Euclidean Distance error of 8.12 pixels in 240 × 240 pixel MR images. The extracted center-of-gravity features can be used in ML algorithms or fused with DL models for better segmentation Dice Similarities. In the future, more precise segmentation results may be obtained by including the T1 and T1CE modalities in the adaptive threshold determination and feature fusion steps.