Deep CNNs for glioma grading on conventional MRIs: Performance analysis, challenges, and future directions

Abstract: The increasing global incidence of glioma tumors has raised significant healthcare concerns due to their high mortality rates. Traditionally, tumor diagnosis relies on visual analysis of medical imaging and invasive biopsies for precise grading. As an alternative, computer-assisted methods, particularly deep convolutional neural networks (DCNNs), have gained traction. This paper reviews advances in DCNNs for glioma grading using brain magnetic resonance images (MRIs) published from 2015 to 2023. The study evaluates various DCNN architectures and their performance; hybrid and ensemble-based DCNNs achieved accuracies of up to 98.91%. However, challenges persist in the form of limited datasets, lack of external validation, and variations in grading formulations across the literature. Addressing these challenges by expanding datasets, conducting external validation, and standardizing grading formulations can enhance the performance and reliability of DCNNs in glioma grading, thereby advancing brain tumor classification and extending its applications to other neurological disorders.


Introduction
The brain is an incredibly complex organ, and its function relies on the well-coordinated activity of diverse cell types. It is composed of two main types of cells: neurons and glial cells. Neurons, also known as nerve cells, transmit electrical and chemical signals in the brain, enabling functions such as thinking, feeling, and movement. Glial cells, or neuroglia, support and modulate the activity of neurons. There are several types of glial cells, including astrocytes, oligodendrocytes, ependymal cells, and microglia, each with specific functions. Astrocytes regulate neurotransmission, help form the blood-brain barrier, and support and nourish nerve cells. Oligodendrocytes produce myelin, which insulates nerve fibers for rapid signal transmission. Ependymal cells line the ventricles of the brain and the central canal of the spinal cord. They contribute significantly to the homeostasis of the central nervous system (CNS) by regulating fluid balance, providing structural support, and participating in neurogenesis. Microglia act as the primary form of immune defense in the CNS, protecting the brain and spinal cord from infection and injury. The roles of glial cells are diverse and essential for maintaining brain homeostasis, supporting neuronal function, and regulating the brain's response to injury and disease. Neurons and glial cells work together to ensure the proper functioning of the brain, and understanding the functions of both cell types is crucial for comprehending brain function and the impact of various brain disorders.
Glioma is a primary brain cancer that originates in the glial cells of the brain. Approximately one-third of CNS cancers are gliomas. Gliomas are categorized according to their subgroups and a numerical grading system. According to the American Cancer Society, the three subtypes of gliomas are astrocytomas, oligodendrogliomas, and ependymomas [1]. The grade of a tumor relates to the microscopic appearance of the cancer cells of these subtypes. The 2021 World Health Organization (WHO) classification of cancers of the CNS categorizes glioma tumors into four grades depending on the degree of malignancy and aggressiveness [2,3]. Grade I tumors grow slowly and are sometimes entirely resectable with surgery, whereas grade IV tumors are aggressive, rapidly growing, and challenging to treat. The most frequent primary tumors were astrocytomas (38.7%), with high grade gliomas (HGGs) (59.5%) making up the majority [4]. The clinical course for a particular patient depends on the tumor location, potential symptoms, and the viability of alternative treatment techniques. Hence, early detection of tumor cells is crucial for treating patients.
With cutting-edge diagnostic and therapeutic tools, physicians can now effectively diagnose patients and administer treatment without endangering their health. One of the most reliable approaches to accomplishing this goal is medical imaging. Using imaging technology, doctors can look for anomalies in a patient's bones and tissues without cutting them open. Patients with brain tumors benefit significantly from imaging methods such as X-ray, magnetic resonance imaging (MRI), ultrasound, magnetic resonance spectroscopy, and computed tomography [5]. MRI is one of the most preferred noninvasive neuroimaging techniques for diagnosing brain tumors, as it provides high-contrast images, especially of soft tissues [6].
One of the distinguishing features of modern healthcare operations is that, as the number of patients increases, they generate enormous amounts of data on various interconnected procedures. Medical imaging generates by far the most data of any aspect of healthcare, and this production is growing at an exponential rate. As a result, the data volume often exceeds the capacity of traditional analysis. This is a critical problem to address because proper data interpretation is one of the fundamental building blocks for complex systems like medical imaging. A second problem is that human interpretation is prone to inaccuracy for several reasons, such as stress, insufficient background knowledge, and insufficient experience. Therefore, the solution that makes the most sense is to employ artificial intelligence (AI). Applications that use machine learning (ML) and deep learning (DL) may perform data analysis substantially more accurately and quickly, making it much simpler for medical professionals to handle information and carefully evaluate test results [7]. In medical image analysis, the deep convolutional neural network (DCNN) has recently gained the most popularity. A CNN can automate and optimize image analysis procedures, supporting a wide range of classification and segmentation algorithms that extract as many relevant details as required from the data. DL allows images to be fed directly to CNNs, and important features can be learned automatically. Simple features within images are learned at shallow layers, while deeper layers near the output layer learn more complex, high-order features [8]. The quality and quantity of the annotated dataset substantially influence the performance of DL algorithms. However, annotating a large number of medical images is problematic since annotation is time-consuming and requires specialist knowledge [9]. When the training dataset is limited, transfer learning (TL) is a promising approach. It adapts a network that has previously been trained on a vast labeled dataset from another domain. Applying the learned information to the target dataset speeds up network convergence while reducing computational costs during training [10]. Although DL algorithms can analyze medical images with high accuracy, they have yet to replace the role of a human specialist due to various challenges, such as a lack of sufficient training data, data imbalance, and a lack of connection between clinicians and researchers.
This structured review assesses recent advances in the automatic identification and classification of glioma tumors using a DL framework. We look at recent advances in DCNN techniques for glioma tumor classification, current research limitations, and future research directions in this field. The review aims to provide researchers with up-to-date information on brain MRI classification, including the advantages and disadvantages of existing DL techniques and algorithms. Figure 1 depicts the conceptual framework presented in this review paper.
In total, 7029 records were retrieved through the search process. After a comprehensive assessment, 921 full-text articles were examined. Among these, 829 articles were deemed irrelevant and excluded, resulting in 92 studies that were considered for further analysis. The entire study screening and selection process is visually represented in Figure 2. For data collection, key data elements such as study purpose, methodology, model performance, and risk of bias were extracted and summarized for each of the 108 included studies. To ensure comprehensive coverage of relevant literature, a variety of online scientific research repositories were consulted, including IEEE Xplore, Medline, Google Scholar, ScienceDirect, and ResearchGate. The search was restricted to articles published between 2015 and 2023. The search strategy combined domain-specific and methodological search terms, totaling 180 distinct combinations.
The structure of the review is as follows: Sections 2 and 3 contain detailed information regarding the glioma tumor grades, MR imaging, and available imaging databases for tumor classification. In Section 4, we delve into the DL paradigm in imaging and discuss the evolution of techniques utilized by DCNN architectures and in medical imaging. Moving to Section 5, we outline the fundamental stages of DCNN approaches for classifying glioma tumors and present an overview of pertinent primary studies, datasets, and computational methods utilized for developing glioma classification models, along with their respective performance evaluations. Section 6 is dedicated to discussing the implementation challenges associated with the studied architecture. Finally, Sections 7 and 8 encompass the limitations of this study and our concluding remarks. In Section 9, we offer recommendations for enhancing future research in this domain.

Glioma grading
Glioma is an umbrella term for primary brain tumors that are categorized based on their putative cell of origin. The WHO classification is the international standard for glioma diagnosis [3]. According to histology criteria [11], glioma tumors are classified into four categories based on the degree of aggressiveness. The histological features that contribute to each glioma grade include cellularity (cell number), mitotic activity (cell division rate), pleomorphism (cell size/shape variation), necrosis (dead tissue presence), vascularity (blood vessel density), and endothelial proliferation (increased blood vessel growth). Knowing the type of glioma before surgery or other therapies is crucial for clinical planning and decision-making. Figure 3 describes glioma grades and their characteristics according to WHO [3]. The 5-year survival statistics for each glioma grade are detailed in Table 1.
Grade I gliomas are slow-growing astrocytomas made up of pilocytic cells that do not spread to other organs of the body. They exhibit minimal mitotic activity and lack necrosis. These tumors are the least dangerous because they grow slowly, have clear borders, and carry the best chance of survival. They can therefore be removed surgically and cured with a low chance of recurrence [12]. Most of the time, these gliomas are found in children and young adults. Grade I gliomas are also called low grade gliomas (LGGs).
Grade II gliomas are relatively slow-growing and more common in adults. These gliomas tend to spread into nearby healthy tissue and have poorly defined edges, which makes them hard to remove completely with surgery. They show increased cellular atypia (abnormalities) and mitotic activity compared to grade I, with rare focal necrosis permissible. Depending on the location and size, chemotherapy and radiation can be used as treatments [13]. Since their outlook is better than that of grades III-IV, they are included in the LGG group.
Grade III gliomas are malignant. They display marked cellular atypia, frequent mitoses, and widespread necrosis, indicating their malignant nature. They are also called anaplastic gliomas; the word "anaplastic" describes glioma brain tumor cells that divide quickly. Some cases of astrocytoma or oligodendroglioma transform into this aggressive form. They are harder to treat than LGGs [14] and belong to the HGG group. Grade III tumors tend to spread rapidly and are likely to become grade IV tumors.
Grade IV gliomas are the most malignant glioma tumors and have the lowest survival rate. They exhibit extreme cellular atypia, brisk mitoses, extensive necrosis, and microvascular proliferation (new blood vessels), hinting at their invasive potential. Primary glioblastomas grow quickly, while LGGs can turn into secondary glioblastomas. They occur most often in older people and rarely in children [15].
The 2021 WHO classification emphasizes a layered approach that integrates molecular markers such as isocitrate dehydrogenase (IDH) mutation and 1p/19q codeletion alongside traditional histopathological features for a more accurate diagnosis and prognosis [3].

MRI
The use of imaging technology is essential for treating intracranial tumors. In recent years, numerous medical imaging tools have been developed to aid clinicians in diagnosing the character and location of the disease. MRI has become the benchmark for diagnosing and monitoring brain malignancies, and its uses continue to expand [22]. Improved neuro-oncological imaging not only enhances the detection of various lesions in the CNS but also permits the formulation of a more nuanced treatment approach. Both structural and functional MRI have been found to correlate significantly with disease stage and prognosis in cancer patients. In recent years, MRI has received a lot of interest and appreciation because it is noninvasive and provides the finest contrast in cellular structure [23][24][25]. An MR scanner can capture many images of the subject under investigation from multiple viewpoints, with varying contrast and physical properties; this is known as multiple modality imaging [26]. Brain malignancies are often diagnosed using four MRI sequences: T1W (T1 weighted), T2W (T2 weighted), T1Wc (T1 weighted post contrast), and FLAIR (fluid-attenuated inversion recovery). Figure 4 shows sample MR sequences from the BraTS (brain tumor segmentation) challenge 2018 dataset. These sequences provide complementary information about the morphology and physiology of gliomas and enable a comprehensive assessment of the tumor, highlighting features like vascularity, edema, and infiltration patterns. In most segmentation approaches, T2 MRI is utilized. Due to the complicated structure and anatomy of the human brain, the radiologist uses all four MRI sequences to diagnose and classify the type of brain tumor. T1W scans can differentiate between healthy and diseased tissues. T2W scans delineate edematous areas. T1Wc images are used to locate the tumor boundary. FLAIR imaging can differentiate between edematous and cerebrospinal fluid-filled regions. The differences between the images produced by different MR modalities can be used to establish contrast between edema, neoplastic tissue, necrotic tissue, and the unaffected brain, thus delineating the tumor border. In addition, functional MRI (fMRI) and diffusion tensor imaging (DTI) are sometimes used to evaluate the alterations in brain function and connectivity induced by gliomas. These techniques provide insights into neuronal activity and the integrity of white matter, which are crucial for identifying eloquent brain areas and assessing the response to treatment.
When compared to other imaging methods, such as computed tomography (CT) and positron emission tomography (PET), MRI offers superior soft tissue contrast, which enables the precise delineation of gliomas and their surrounding structures. This high contrast resolution is instrumental in facilitating accurate tumor localization and characterization, which are critical for diagnosis and treatment planning. Moreover, the noninvasive nature of MRI and the absence of ionizing radiation make it safe for repeated imaging, a feature that is particularly advantageous for pediatric and vulnerable populations. The use of gadolinium contrast in MRI can highlight areas with a disrupted blood-brain barrier, often indicative of tumor presence and activity. This further enhances the diagnostic accuracy and helps assess tumor aggressiveness. Although modalities such as CT and PET have their strengths, such as speed and affordability for initial evaluation or bone involvement and the ability to assess tumor metabolism, respectively, they primarily excel in gross anatomical visualization. MRI offers exceptional detail, diverse imaging sequences, functional insights, and safety. This makes MRI an invaluable tool for comprehensive diagnosis, treatment planning, and monitoring of gliomas [27]. Table 2 summarizes the sources of the MRI datasets utilized in this review.
Table 2. An overview of commonly utilized publicly available datasets for brain tumor analysis.

Deep learning paradigm
While there has been progress in glioma treatment, it is far from sufficient. Before initiating therapy for gliomas, it is critical to determine the tumor stage accurately. The complex and diverse nature of gliomas, characterized by their multidimensional and heterogeneous features, necessitates the development of advanced, automated systems for accurate diagnosis. This urgent need stems from the inherent risks associated with traditional surgical methods like biopsies, especially for tumors located in critical brain regions. Automated systems, such as computer-aided diagnosis (CAD) and AI algorithms, offer promising solutions by enhancing the precision of both tumor localization and classification. They can assist in glioma detection, grading, segmentation, and even knowledge discovery, leveraging extracted features to predict tumor characteristics. This provides invaluable insights to clinicians, guiding treatment decisions and optimizing patient outcomes. Furthermore, automation streamlines the diagnostic process, reducing the burden on healthcare professionals and potentially expediting treatment initiation for glioma patients.
In the past decade, ML has seen substantial expansion in its applications to neuro-oncology, with the diagnosis of glioma tumors using MRI emerging as a prominent focus of interest. Several authors have used traditional ML approaches, which entail a sequence of steps beginning with preprocessing, continuing with feature extraction and feature selection, and concluding with a classification algorithm that produces a result [41]. Several approaches have been used to extract the features, including the discrete wavelet transform, gray level co-occurrence matrix, histogram of oriented gradients, genetic algorithm, and Zernike moments. Particle swarm optimization and principal component analysis have been used by several authors to select which features to include. The most extensively utilized classifier was the support vector machine (SVM), which multiple authors adopted. Other authors used random forest, AdaBoost, instance-based k-nearest neighbors with log and Gaussian weight kernels, extreme learning machine, and sequential minimal optimization as classification strategies [42].
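As an illustration of the handcrafted feature-extraction step in such traditional pipelines, the following minimal sketch computes a gray level co-occurrence matrix and two texture descriptors (contrast and energy) for a toy image patch. The function names, the 4-level toy patch, and the single horizontal offset are assumptions made for this example and are not taken from any of the reviewed studies; in practice the resulting feature vector would be passed on to feature selection and a classifier such as an SVM.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dy, dx)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def normalize(matrix):
    """Turn raw co-occurrence counts into joint probabilities."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def contrast(p):
    """Weighted sum of squared gray-level differences (local variation)."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared probabilities (texture uniformity)."""
    return sum(value * value for row in p for value in row)


# Toy 4x4 patch quantized to 4 gray levels (hypothetical data).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

counts = glcm(patch, levels=4)
probabilities = normalize(counts)
features = [contrast(probabilities), energy(probabilities)]
print(features)
```

The key property of this approach is that the descriptors are fixed before training: the classifier can only weight them, which is the limitation of handcrafted features discussed in this section.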
However, the quality of the classification process in ML studies largely depends on manually created features discovered by feature extraction techniques, which is a time-consuming and error-prone process. These manually created features have limitations: they cannot be changed during model training, and it is uncertain whether they are the most effective attributes for classification. Additionally, these features require rigorous validation and often exhibit limited generalizability, struggling to adapt to new patient populations or imaging protocols. This significantly hinders their applicability across diverse datasets and clinical scenarios [43,44]. Moreover, traditional ML architectures often encounter difficulties in integrating and effectively leveraging multimodality data such as MRI, PET, and genetic information, due to the complex relationships between these modalities. These challenges are particularly pronounced in glioma classification. Glioma datasets often vary in imaging modalities, acquisition parameters, and tumor phenotypes, making it challenging for manually engineered features to adapt to such variability. Consequently, the performance of traditional ML models relying on manually created features may degrade when applied to new datasets or clinical scenarios. DL, with its ability to automatically learn features directly from data, offers a promising solution to these challenges. By eliminating the need for manual feature engineering, DL models can capture more subtle and complex patterns in the data, potentially leading to improved glioma classification performance. Furthermore, DL models can be designed to effectively integrate multimodal data, thereby fully exploiting the complementary information provided by each modality.
DL is a subfield of ML in which feature selection from images and classification are carried out concurrently by a single algorithm, and learning does not require human participation during the training process. Feature extraction is accomplished by a multilayer, nonlinear processing architecture. As we proceed deeper into the network, data abstraction is aided by the fact that the output of each layer serves as the input to the next layer [45]. CNNs are increasingly used in image processing problems because of their prominence as a DL technique, and their ability to discern patterns has made them especially popular in this field. A CNN generally has three types of layers stacked on top of one another. The convolutional layer is responsible for extracting features from images. It delivers visual knowledge of the dataset images to the network through learnable kernels. Each kernel is convolved across the spatial dimensions of the input to produce a feature map as output. The pooling layer reduces the dimensionality of the extracted features in order to reduce the number of parameters and the computational complexity of the model. The final stage flattens the 2D feature maps of the preceding layers into a 1D vector and passes it through one or more fully connected layers. An optimization algorithm modifies the network weights during training, using the loss to update the network's filters and weights. At the output layer, an activation function (typically softmax) normalizes the outputs so that they sum to one [46,47].
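The layer types described above can be traced on a toy example. The sketch below is a forward pass only, written in plain Python for readability; the 5x5 input, the hand-picked edge-detecting kernel, and the fully connected weights are all hypothetical values chosen for illustration (in a real CNN they would be learned from the loss), and softmax is assumed as the output activation.

```python
import math


def conv2d(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Set negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


def flatten(feature_map):
    """Convert a 2D feature map into a 1D vector."""
    return [v for row in feature_map for v in row]


def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Normalize scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]          # responds to left-to-right increases

feature_map = relu(conv2d(image, kernel))   # 4x4 feature map
pooled = max_pool(feature_map)              # 2x2 after pooling
vector = flatten(pooled)                    # length-4 vector

# Hypothetical weights for a two-class output layer.
weights = [[0.5, 0.0, 0.5, 0.0], [-0.5, 0.0, -0.5, 0.0]]
biases = [0.0, 0.0]
probs = softmax(dense(vector, weights, biases))
print(pooled, probs)
```

Note that, as in most DL libraries, the "convolution" here is implemented as cross-correlation (the kernel is not flipped), which makes no practical difference when the kernel is learned.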
The evolution of DCNNs began in 1989 with the introduction of LeNet [48]. At the time, CNNs were limited to digit recognition tasks and could not readily be applied to other image analysis problems. From 1996 to 2000, various developments in CNN architecture were introduced to make it scalable to large multi-class problems. CNN-based applications became popular following AlexNet's remarkable performance on the ImageNet dataset in 2012 [49]. Significant advancements have been made since then. Zeiler and Fergus [50] introduced a layer-by-layer visualization of CNNs to enhance comprehension of feature extraction stages, which shifted the paradigm toward feature extraction at low spatial resolution in DL architecture, as accomplished in VGG [51]. VGG stands for Visual Geometry Group, which is part of the Department of Engineering Science at the University of Oxford. The Google DL group pioneered the concept of split, transform, and merge with a connecting block known as the inception block in GoogLeNet. These blocks introduced branching inside a layer, allowing for the abstraction of features at several spatial scales [52]. The idea of skip connections, proposed in the residual network ResNet [53] for DCNN training, rose to prominence in 2015. Following that, most succeeding networks embraced this concept, including Inception-ResNet, Wide ResNet, and others [54]. A new network architecture called ResNeXt [55] was developed for image classification, focusing on increasing cardinality as a key factor for improving accuracy, and outperformed its ResNet counterpart on various datasets. MobileNet, designed for efficient mobile and embedded vision applications, brought a new level of model efficiency and portability to the field [56]. The neural architecture search (NAS) approach led to the creation of NASNet, which automates the design of CNN architectures and has produced competitive models for various tasks [57]. EfficientNet, proposed by Tan and Le in 2019, demonstrated remarkable efficiency-accuracy trade-offs by scaling model width, depth, and resolution simultaneously. It has become a popular choice for resource-constrained applications [58]. The evolution of DL architectures continued with the emergence of NFNet (normalizer-free ResNets), which built upon the success of Squeeze-and-Excitation blocks and the ReLU (rectified linear unit) activation function to achieve both computational efficiency and state-of-the-art results in computer vision [59]. TResNet, inspired by the efficient combination of depthwise separable convolutions and spatial pyramid pooling, offers competitive performance on various computer vision tasks, including image classification [60]. The landscape of DL and CNNs has continued to evolve with the introduction of various novel architectures. Table 3 lists an overview of popular CNN architectures for image analysis.
TL is currently the most widely utilized DL methodology. Training a CNN from scratch requires many labeled training samples and substantially more time and computational resources than adapting an already trained CNN. Fine-tuning and freezing are the two main approaches [63] used in TL. Fine-tuning initializes the network with the weights and biases of a pretrained CNN and continues training them on the target dataset. In the freezing approach, the pretrained CNN layers are regarded as a fixed feature extractor. The convolutional layer weights and biases are fixed in this case, but the fully connected layers are fine-tuned on the target dataset. Frozen layers can be any subset of convolutional or fully connected layers; however, the shallower convolutional layers are usually frozen.
If the training dataset is too small, overfitting may occur during training [64]. As a result, numerous studies [65,66] address this issue by slicing 3D MRI volumes into 2D slices, increasing the sample size of the original dataset and reducing the class imbalance issue. Additionally, applying geometric transformations such as rotation, scaling, mirroring, translation, and cropping [67] is another efficient technique for expanding the quantity and diversity of training data. This is known as data augmentation. Overfitting also occurs when the learning capacity of a network is so large that it learns spurious characteristics rather than real patterns. This occurs when there is an abundance of information to learn. A validation dataset can be used throughout the training process to detect overfitting and to obtain a stable estimate of how the tumor classification system will perform on new, previously unseen data in clinical practice.
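The slicing and augmentation strategies described above can be sketched as follows. This is a minimal illustration on a tiny nested-list "volume", assuming an indexing order of [slice][row][column]; the helper names and the choice of three transformations (mirroring, 90-degree rotation, and a one-pixel translation) are assumptions for this example rather than the protocol of any cited study.

```python
def axial_slices(volume):
    """Split a 3D volume indexed [slice][row][col] into 2D slices."""
    return [[row[:] for row in plane] for plane in volume]


def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(column) for column in zip(*image[::-1])]


def translate(image, shift, fill=0):
    """Shift the image right by `shift` columns, padding with `fill`."""
    width = len(image[0])
    return [[fill] * shift + row[:width - shift] for row in image]


def augment(image):
    """Return the original slice plus three transformed copies."""
    return [image, mirror(image), rotate90(image), translate(image, 1)]


# Toy volume with two 2x2 slices (hypothetical intensities).
volume = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

samples = [copy for plane in axial_slices(volume) for copy in augment(plane)]
print(len(samples))
```

One design consideration when slicing volumes this way is that all slices and augmented copies from the same patient should stay in the same data split; otherwise near-identical images can appear in both training and evaluation sets and inflate the reported performance.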
Similar to TL, ensemble algorithm-based architectures [68] have gained prominence in the realm of DL due to their ability to enhance model performance and robustness. Ensembles combine the predictions of several individual models, often using techniques like bagging, boosting, or stacking [69,70]. Bagging trains models on distinct data subsets, reducing overfitting risk. Boosting iteratively emphasizes weak learners, constructing a robust ensemble. Stacking combines diverse models' predictions via a meta-learner for intricate decision-making. These ensembles elevate DL model accuracy and generalization, especially in complex or data-scarce scenarios. Nonetheless, they demand additional computational resources and meticulous tuning. As with TL, selecting the best ensemble strategy hinges on the specific task, available resources, and managing overfitting using validation data. Figure 5 illustrates a chronological timeline depicting the utilization of different techniques by DCNN architectures and in medical imaging.
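The core idea of combining the predictions of several models can be shown with simple voting, which is deliberately simpler than the bagging, boosting, and stacking schemes named above (those additionally prescribe how the individual models are trained or how a meta-learner is fitted). In this sketch the three probability vectors are hypothetical outputs of three already trained classifiers for the classes (LGG, HGG).

```python
from collections import Counter


def soft_vote(prob_lists):
    """Average the class probabilities predicted by each model."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[k] for p in prob_lists) / n_models for k in range(n_classes)]


def hard_vote(prob_lists):
    """Let each model vote for its top class and return the majority."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in prob_lists]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical outputs of three models for one scan: [P(LGG), P(HGG)].
predictions = [
    [0.60, 0.40],
    [0.45, 0.55],
    [0.20, 0.80],
]

averaged = soft_vote(predictions)
soft_label = max(range(len(averaged)), key=averaged.__getitem__)
hard_label = hard_vote(predictions)
print(averaged, soft_label, hard_label)
```

Soft voting takes each model's confidence into account, whereas hard voting only counts top-class decisions; the two can disagree when one model is very confident and the others are marginal.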

Performance metrics
Performance metrics are quantitative measures that provide objective evidence of how well a particular model performs. The metrics most commonly used for classification in the studies reviewed here are outlined in Table 4, along with their respective functionalities.
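For a binary grading task, the most common of these metrics can be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions, which may differ in notation from Table 4; treating HGG as the positive class (label 1) and the example label vectors are assumptions made purely for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Avoid division by zero when a class is absent."""
    return numerator / denominator if denominator else 0.0


def grading_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity, precision, and F1."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = safe_div(tp, tp + fn)          # recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical labels: 1 = HGG (positive), 0 = LGG (negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

m = grading_metrics(y_true, y_pred)
print(m)
```

Reporting sensitivity and specificity alongside accuracy matters for glioma datasets, where class imbalance between grades can make accuracy alone misleading.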

DL in glioma grading
The application of DCNNs to the classification of gliomas is an area of current investigation in the field of imaging science. Given enough high-quality data, a CNN can learn radiologic properties and their relative relevance to create a predictive model that effectively categorizes an image [73]. The flowchart in Figure 6 provides an overview of a brain tumor diagnostic system employing a generic DCNN. The process begins with data collection, which involves gathering MRI scans of the brain. These scans are typically obtained from various sources, including hospitals, research institutions, and public datasets. Following data acquisition, the dataset is split into training and testing sets to facilitate model development and evaluation. Preprocessing steps are then employed to enhance the quality and utility of the MRI images. This includes normalization to ensure consistent intensity values across images and augmentation techniques to expand the dataset and improve model robustness. Additionally, preprocessing may involve cropping to focus on relevant brain regions and bias correction to mitigate inconsistencies in image acquisition. The subsequent stage involves model training, where different DCNN architectures are considered based on the specific task and data characteristics. Hyperparameter optimization is conducted to tune settings such as learning rate, batch size, and number of epochs, aiming to maximize the model's ability to accurately classify brain tumors while minimizing errors and biases. This iterative process typically employs techniques like grid search or random search. Once the model is trained, it undergoes evaluation on the testing set using various performance metrics, including accuracy (Ac), specificity (Sp), sensitivity (Sn)/recall, precision (Pr), F1 score (F1), and area under the curve (AUC). These metrics provide insights into overall performance and the model's capability to correctly classify brain tumors across different classes. Throughout the process of configuring the model hyperparameters, the validation set gives an objective evaluation of a classification model fit on the training dataset.
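The data-splitting and hyperparameter-search steps of this workflow can be sketched as follows. The split fractions, the grid values, and in particular `toy_validation_score` are hypothetical: the scoring function is only a stand-in for training a DCNN with the given settings and measuring its accuracy on the validation set, which is too heavy to reproduce here.

```python
import itertools
import math
import random


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test, preserving class ratios."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


def grid_search(grid, score_fn):
    """Evaluate every hyperparameter combination and keep the best one."""
    best_params, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    """Placeholder for 'train the model, return validation accuracy'."""
    lr_penalty = 0.1 * abs(math.log10(params["learning_rate"]) + 3)
    batch_penalty = abs(params["batch_size"] - 16) / 100
    return 1.0 - lr_penalty - batch_penalty


# Hypothetical imbalanced dataset: 20 LGG and 40 HGG cases.
labels = ["LGG"] * 20 + ["HGG"] * 40
train_idx, val_idx, test_idx = stratified_split(labels)

grid = {"learning_rate": [1e-2, 1e-3, 1e-4], "batch_size": [8, 16, 32]}
best_params, best_score = grid_search(grid, toy_validation_score)
print(len(train_idx), len(val_idx), len(test_idx), best_params)
```

The design point is that hyperparameters are selected using the validation indices only, so the test indices remain untouched until the final evaluation.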
Recent advancements in DL have significantly advanced the field of medical imaging, particularly in the areas of segmentation and classification. Researchers have been dedicated to improving the accuracy and efficiency of DL models for medical image segmentation. For example, Rehman et al. [74] introduced BU-Net, a modified U-Net architecture for brain tumor segmentation, which leverages residual extended skip and wide context to extract diverse features and enhance the valid receptive field. The researchers also employed a custom loss function to extract contextual information, resulting in improved segmentation performance. Addressing the challenge of information loss in deeper layers, Rehman et al. [75] proposed BrainSeg-Net, an encoder-decoder model that strategically shares pertinent details from shallow layers with deeper ones, enhancing tumor identification. Additionally, Rehman et al. [76] introduced a novel encoder-decoder architecture, RAAGR2-Net, which utilizes residual spatial pyramid pooling and attention gate modules to capture rich feature representations and retain local information, particularly in fine segmentation. Another study by Lin et al. [77] explored the integration of EfficientNetV2 as an encoder in combination with U-Net for brain tumor segmentation, significantly enhancing the model's performance. Furthermore, DL models have been successfully applied to tasks such as supraspinatus extraction from MRI, with Wang et al. [78] demonstrating high segmentation accuracy. Additionally, Yin et al. [79] proposed a double-branch flat bottom U-Net for efficient medical image segmentation, which achieved outstanding performance in the challenging task of pancreatic segmentation. These recent studies highlight the potential of DL models for medical image segmentation and underscore the importance of developing efficient and accurate models for clinical applications. Much of the ongoing research is confined to brain segmentation, with only a limited amount of work done on tumor grading. Therefore, there is considerable potential to explore DL approaches for brain tumor grade estimation. In this section, we discuss some of the existing DL-based glioma grading methods.
Recent studies have found that utilizing DCNNs to predict tumor grade and long-term survival is highly successful. Banerjee et al. [80] investigated the feasibility of using DL-based techniques to grade gliomas from MRIs. They used VGGNet and ResNet architectures to assess the appropriateness of transfer learning, achieving accuracies of 84% and 90%, respectively. The study by Muneer et al. [81] contrasts the glioma classification performance of two systems, WNDCHRM (weighted neighbor distance using compound hierarchy of algorithms representing morphology) and the VGG-19 DCNN. It was observed that the VGG-19 CNN achieved higher accuracy than WNDCHRM. Ge et al. [82] proposed a multistream CNN and fusion network for glioma classification. T1Wc, T2W, and FLAIR images were extracted from the BraTS 2017 dataset, and each was fed to its own CNN. The features extracted by each stream were then combined. They were able to achieve a precision of 90.87% by combining the three inputs. Individually, the T1Wc images were the best at distinguishing between HGGs and LGGs. In another study, Yang et al. [83] investigated the ability of AlexNet and GoogLeNet to distinguish between LGGs and HGGs. They compared the accuracy of these two CNNs when trained from scratch versus pretrained and fine-tuned, using T1Wc images from glioma patients. According to the results, pretrained CNNs outperformed CNNs trained from scratch, with GoogLeNet outperforming AlexNet. Gutta et al. [84] built a DCNN model and compared it to ML models trained solely on traditional radiomic data. With an accuracy of 87%, the proposed DCNN model significantly outperformed the ML models.
Lu et al. [85] classified gliomas using the ResNet model. Pyramid dilated convolution was added to ResNet to increase classification performance. The proposed method achieved 80.1% accuracy; however, it can only interpret 2D MRI, and manual labeling of the training set was required. Mzoughi et al. [86] proposed a fully automatic 3D CNN architecture with a T1Wc sequence to distinguish between LGGs and HGGs. The accuracy of this 3D-CNN model was 96.49%. Zhuge et al. [87] used conventional MRI to compare 3DConvNet and 2D Mask R-CNN (region-based CNNs) for glioma classification. The results showed that the 3DConvNet outperformed the 2D Mask R-CNN, with a test accuracy of 97.1% versus 96.3%. Khawaldeh et al. [88] utilized a modified version of AlexNet. The 12-layer ConvNet model proposed in that study comprises convolutional, subsampling, dense, and fully connected layers. The overall accuracy achieved by this model was 91.16% on FLAIR MRIs. Chenjie et al. [89] proposed an MRI-based multimodality glioma classification system. To make use of unlabeled data, the authors used deep semi-supervised learning. Generative adversarial networks generated synthetic MRIs to mitigate overfitting in the intermediate dataset.
Using a CNN, the suggested system achieved a test accuracy of 86.53% on the TCGA (the cancer genome atlas program) dataset and 90.70% on the BraTS dataset. Liang et al. [90] proposed the more advanced DenseNet to predict IDH mutations. Their approach was also used to grade gliomas, with a 91.4% accuracy. As a result, its potential application can be extended to additional multimodal radiogenomics challenges. Some recent applications of DCNN-based methods for automated glioma grading research are summarized in Table 5.
Figure 7 presents a comparative evaluation of the efficacy of various DCNN architectures in the task of glioma grading. The graph highlights the high level of accuracy attained by several DCNNs using TL, underlining their effectiveness in grading gliomas. Moreover, EfficientNetB0 [100] also proves to be a robust performer with an accuracy rate of 98.8%, signifying its capability in managing this intricate task. The ensemble algorithm, which employs majority voting (MajVot), demonstrates remarkable accuracy [97,103]. This superior performance can be ascribed to the ensemble approach's capacity to utilize the strengths of multiple models, thereby augmenting the overall predictive ability. The custom-built DCNN [102] demonstrates the highest accuracy of 98.91% in the task of glioma grading.
These findings indicate that DL models, particularly those that incorporate ensemble techniques and custom architectures, exhibit substantial potential in improving the accuracy and reliability of glioma grading. This enhancement is pivotal as it directly influences clinical decision-making and patient care. By delivering more precise grading, these models can support clinicians in formulating more effective treatment strategies, ultimately leading to better patient outcomes. However, it is important to note that while these DCNNs show promising results, further validation and testing on broader datasets are necessary to confirm their effectiveness and generalizability in real-world clinical settings.
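To illustrate the majority-voting idea behind the ensemble results discussed above, the following minimal sketch combines the hard label predictions of several classifiers. It is an assumption-laden illustration, not the procedure of [97,103]: the function name, the tie-breaking rule, and the three example prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by majority vote.

    predictions_per_model: list of equal-length lists, one list per model.
    Ties go to the label encountered first among the tied labels, which is
    an arbitrary choice made for this sketch.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models on four scans.
model_a = ["HGG", "LGG", "HGG", "LGG"]
model_b = ["HGG", "HGG", "HGG", "LGG"]
model_c = ["LGG", "LGG", "HGG", "HGG"]
ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

Using an odd number of models avoids ties in the two-class case; an alternative design averages the predicted class probabilities instead of counting hard votes.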

Computing environment
The utilization of CNN classifiers in complex image classification tasks such as grading brain tumors offers distinctive advantages. CNNs autonomously extract relevant features, eliminating the need for separate feature extraction and classification steps. Despite their compact architecture, CNNs excel in intricate classification, although they entail higher computing complexity compared to traditional methods like SVM or logistic regression. In the realm of high-level programming environments, Python and MATLAB are prominent choices for DL implementation due to their user-friendliness. Two primary approaches for evaluating ML models are development-based and production-based. Python, particularly when used with platforms like Google Colab, has an advantage over MATLAB due to its faster training times, made possible by accessible GPUs (graphics processing units) and cloud-based storage. However, the longer training times in MATLAB can be offset by a powerful workstation. The performance of glioma grading algorithms is significantly influenced by computing power. Adequate memory is essential for loading and preprocessing large medical imaging datasets and for holding intermediate activations and gradients during training. GPUs, which perform essential matrix operations in parallel, speed up the training process. Workstations equipped with substantial memory and multiple GPUs can hasten the training, tuning, and evaluation processes. Factors such as model complexity, dataset size, hardware, batch size, and optimization techniques all influence training time. Despite their parallel processing capabilities, GPUs often necessitate manual synchronization in frameworks like OpenMP (open multi-processing) and CUDA (compute unified device architecture). Nevertheless, the use of GPUs for parallel algorithms holds potential for efficient big data processing, especially in high-performance computing applications such as cancer research and AI [105]. In terms of GPU vs CPU (central processing unit) performance, it has been observed that GPUs are generally faster than CPUs. However, for smaller networks with only two hidden layers, CPUs can be faster than GPUs if there are fewer than 1000 neurons in each hidden layer. This highlights the importance of considering the specific requirements and characteristics of the model when choosing between GPU and CPU for prediction.
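To give a rough sense of the scale involved in the small-network case mentioned above, the following sketch counts the parameters of a fully connected network with two hidden layers and estimates the memory its parameters occupy during training. The layer sizes, the 32-bit value size, and the assumption of three stored copies per parameter (weight, gradient, one optimizer state) are all illustrative assumptions; activations and the image data itself are not included.

```python
def dense_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network.

    layer_sizes: e.g. [inputs, hidden1, hidden2, outputs].
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def training_memory_bytes(n_params, bytes_per_value=4, copies=3):
    """Rough parameter memory during training.

    copies=3 assumes one copy each for weights, gradients, and a single
    optimizer state (e.g. momentum). Activations are NOT included.
    """
    return n_params * bytes_per_value * copies


# Hypothetical network: 4096 inputs, two hidden layers of 1000 neurons, 2 classes.
sizes = [4096, 1000, 1000, 2]
n_params = dense_param_count(sizes)
mem_bytes = training_memory_bytes(n_params)
print(n_params, "parameters,", round(mem_bytes / 2**20, 1), "MiB")
```

A network of this size fits comfortably in ordinary CPU memory, which is consistent with the observation that GPUs mainly pay off for larger models.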

Implementation challenges
Brain tumors remain a popular research topic in medical image processing. Advanced techniques for classifying gliomas into HGGs and LGGs are constantly evolving. For such problems, DL has emerged as a critical research tool for improving the performance of standard ML approaches. DL facilitates multiple levels of representation and abstraction, thereby providing more comprehensive information about MRIs and their attributes [98]. This research focused on DCNN-based glioma classification architectures. Table 5 summarizes the findings of several studies that show that DCNN-based architectures can handle a wide range of glioma classification tasks effectively and efficiently. It demonstrates that TL using DCNN models such as ResNet, VGG, and GoogLeNet outperforms other models developed from scratch. However, some challenges must be resolved before DL can be used in oncology, as shown in Figure 8.
The lack of an objective dataset was one of the most common issues identified in this study. DCNNs are based on supervised learning techniques, requiring a large volume of labeled data to learn properly. With small datasets, a DL algorithm may report inflated accuracy because its millions of parameters overfit a single specialized training population [83]. This issue is critical given the scarcity of curated datasets, particularly in radiography research. In addition to collecting large, heterogeneous datasets, various methods have been developed to solve this issue, such as the addition of feature dropout, L2 regularization, and batch normalization [106,107]. This review also revealed a striking lack of precision among certain researchers when defining the dataset, tumor type, and the accuracy, sensitivity, and specificity performance measures of the algorithm. In addition, research based on meticulously managed datasets, such as BraTS or TCIA (the cancer imaging archive), demonstrated algorithms trained without external validation that might not produce reproducible findings in clinical practice despite their consistently high accuracy. Most publications did not perform validation, which is the most significant problem with ML and DL that should be considered. In some cases, only cross-validation was done. Validation is essential in accordance with the standards for constructing and reporting ML/DL prediction models in biomedical research [108].
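For reference, the cross-validation mentioned above can be sketched as follows. This is a minimal illustration with hypothetical function and parameter names; it only produces index splits within a single dataset. External validation differs in that the evaluation data come from a separate cohort (for example, another institution) that plays no part in any of these folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(train_idx), sorted(val_idx)


# Hypothetical dataset of 10 cases split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in splits:
    print("train:", train_idx, "validate:", val_idx)
```

Every case is used for validation exactly once, so the averaged score uses all available data, but it still reflects only the population the dataset was drawn from.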
Although this analysis highlighted several contributions that independently concentrate on the three primary phases of tumor identification, it did not identify any diagnostic approaches that encompass all the phases. The absence of a comprehensive diagnostic system in a single package presents two issues: the lack of a fully automated procedure and the lack of integration between the three processes. The development of a complete and automated system should facilitate the process of diagnosing brain tumors for physicians and radiologists, as well as translating research-based diagnostic algorithms into clinical practice. Additionally, uniform criteria for glioma grade should be utilized when developing DL models. It is interesting to note that there were discrepancies between the definitions of LGGs and HGGs, with some research identifying Grade III gliomas as HGGs and others as LGGs. The lack of a standard classification strategy may hinder performance on independent datasets, given that the images used for segmentation, feature extraction, and model training/testing are labelled as LGGs or HGGs based on nonuniform criteria. As glioma grade influences clinical therapy, it is vital that the outputs of LGG and HGG algorithms reflect a universal definition congruent with current WHO standards. Another key observation regarding DL architectures is that, in the current context, GPU-based systems with large memory are essential, since DL models require large amounts of data [37] and involve millions to trillions of parameters [71]. Also, to enable practical deployment of well-trained DL models, addressing their extensive memory and computational demands is essential. Particularly in data-intensive domains like healthcare and environmental science, these requirements limit their usage in resource-constrained settings, hindering healthcare applications due to escalating data complexity. Solutions like FPGA (field-programmable gate array) and GPU hardware accelerators have emerged, while techniques like parameter pruning, knowledge distillation, compact filters, and low-rank factorization offer model compression strategies to mitigate computational challenges [109]. Table 6 summarizes the challenges encountered in DL-based research related to glioma grading and their influence on algorithm performance.
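The effect of the nonuniform LGG/HGG definitions described above can be made concrete with a small sketch. The two mappings below encode the two conventions reported in the literature (Grade III counted as HGG or as LGG); the cohort of WHO grades is hypothetical and not drawn from any dataset.

```python
# Two labelling conventions: one treats WHO Grade III as high grade,
# the other treats it as low grade.
GRADE_III_AS_HGG = {1: "LGG", 2: "LGG", 3: "HGG", 4: "HGG"}
GRADE_III_AS_LGG = {1: "LGG", 2: "LGG", 3: "LGG", 4: "HGG"}


def label_disagreement(who_grades, scheme_a, scheme_b):
    """Fraction of cases that receive different labels under two schemes."""
    if not who_grades:
        return 0.0
    differing = sum(1 for g in who_grades if scheme_a[g] != scheme_b[g])
    return differing / len(who_grades)


# Hypothetical cohort of WHO grades.
cohort = [2, 2, 3, 3, 3, 4, 4, 4, 4, 1]
rate = label_disagreement(cohort, GRADE_III_AS_HGG, GRADE_III_AS_LGG)
print(f"{rate:.0%} of cases change label between conventions")
```

In this example, every Grade III case flips label, so a model trained under one convention would be scored as wrong on those cases when tested against a dataset labelled under the other, regardless of how well it recognizes the underlying imaging features.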

Limitations
The study presented here offers valuable insights into the performance of DL algorithms in terms of glioma grading. However, it is important to acknowledge certain limitations that might affect the generalizability and robustness of the findings. This study's limitations include the possibility of missing recent and unpublished works due to the timing and criteria of the search, potentially affecting the comprehensiveness of the findings. Additionally, the focus on accuracy as the primary performance metric resulted in the exclusion of studies lacking accuracy results, limiting the overall assessment. Furthermore, the presence of heterogeneity, inconsistent definitions, evolving standards, high variability across studies, and the absence of confidence intervals in the reviewed literature hindered the aggregation of results, introducing uncertainties in the study's conclusions. These challenges are compounded by the sensitivity of DCNNs to subtle variations in medical images, stemming from factors such as patient anatomy, acquisition conditions, and disease presentation. While the human eye can adapt to such nuances effortlessly, DCNNs may struggle, resulting in misdiagnoses or missed diagnoses. This highlights the necessity for evaluation metrics that can capture the model's ability to manage these complexities. The articles reviewed in our study primarily concentrate on conventional evaluation metrics such as accuracy, sensitivity, and specificity. These metrics provide a comprehensive evaluation of model performance across the entire dataset. However, they may not adequately highlight local discrepancies, potentially leading to misleading interpretations of model performance. A recent survey [110] reiterates these concerns, emphasizing the crucial role of model uncertainty and interpretability in building confidence in medical diagnoses. To overcome the limitations of traditional evaluation metrics, future research should also concentrate on local discrepancy analysis using localization metrics like intersection over union or dice similarity coefficient. These metrics measure the spatial overlap between predicted and actual regions. The incorporation of region-based evaluation techniques, such as precision-recall curves or localization error analysis, can offer a more nuanced understanding of model performance. Techniques like gradient-weighted class activation mapping (Grad-CAM) or attention mechanisms can provide visual explanations of the model's decision-making process, assisting in comprehending the model's behavior and identifying potential regions prone to errors. Additionally, task-specific metrics, customized for specific clinical tasks, can offer more pertinent insights than generic metrics, steering development towards clinically relevant applications.
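The two overlap metrics named above have standard definitions: for a predicted region P and a reference region T, IoU = |P ∩ T| / |P ∪ T| and Dice = 2|P ∩ T| / (|P| + |T|). The sketch below computes both for binary masks stored as nested lists; the function names, the convention for two empty masks, and the example masks are assumptions made for illustration.

```python
def overlap_counts(pred, truth):
    """Return (intersection, |pred|, |truth|) for two binary masks."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(p and t for p, t in zip(flat_pred, flat_truth))
    return inter, sum(flat_pred), sum(flat_truth)


def iou(pred, truth):
    """Intersection over union; two empty masks count as a perfect match."""
    inter, n_pred, n_truth = overlap_counts(pred, truth)
    union = n_pred + n_truth - inter
    return inter / union if union else 1.0


def dice(pred, truth):
    """Dice similarity coefficient; two empty masks count as a perfect match."""
    inter, n_pred, n_truth = overlap_counts(pred, truth)
    total = n_pred + n_truth
    return 2 * inter / total if total else 1.0


# Hypothetical 3x3 masks: 1 marks tumor pixels.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
truth = [[0, 0, 1],
         [0, 1, 1],
         [0, 0, 1]]
print("IoU:", iou(pred, truth), "Dice:", dice(pred, truth))
```

For the same pair of masks, Dice is never lower than IoU, so the two values should not be compared across studies that report different metrics.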
While our current work leverages traditional CNNs, we acknowledge the potential of fuzzy logic to address the uncertainty challenges that arise in glioma diagnosis, an area that has been scarcely explored. Medical images inherently contain ambiguity due to imaging artifacts, partial volume effects, and interobserver variability that can lead to misclassifications. Traditional performance metrics like accuracy may not adequately capture these nuances. Fuzzy logic, capable of handling ambiguity and incorporating expert knowledge, offers a promising alternative. A recent study [111] presents an intriguing application of fuzzy logic to address uncertainties in evaluating external loads on steel structures. This method, based on divergence computations, achieves better classification and reduces ambiguity compared to traditional approaches. Similarly, rather than regarding ambiguous features as binary certainties, a DCNN empowered by fuzzy logic could interpret them as possibilities with varying degrees of truth. This could lead to more nuanced and robust classifications, particularly in cases with subtle variations or overlapping tumor regions. We believe exploring this integration holds immense potential for advancing glioma diagnosis.

Conclusions
The ability of medical specialists to categorize brain tumor scans quickly and accurately has never been more crucial. Recently, CNNs have accomplished astonishing achievements in categorizing brain tumors such as gliomas. This study examined the most recent DCNN-based glioma classification architectures, datasets, and the efficacy of each suggested model for brain MRIs over the period from 2015 to 2023. Table 5 shows a compilation of pertinent data, applied approaches, DL networks, and their performance. The research findings highlight the potential of DCNN architectures, particularly hybrid and ensemble DCNNs, which have achieved accuracy levels as high as 98.91%. These results underscore the considerable potential of advanced DL models in augmenting the accuracy and reliability of glioma grading. However, despite the undeniable successes of DCNNs, challenges remain in incorporating them into clinical practice. The study also found that preprocessing and segmentation were not always used in the surveyed articles before categorization. No single system can perform all the functions automatically and with high precision.
While there is ongoing work to enhance the utility of DL in tumor identification and classification, the need for standardized databases for these purposes remains evident. The varied use of databases and benchmarks by many researchers underscores the need for standardization. Additionally, the black-box nature of DCNNs has constrained their application beyond research contexts. DL holds great promise for the future of brain tumor research. By focusing on the right strategies, these studies could transition from research labs to clinical settings. These methods could also be applied to the classification of other brain disorders, including Alzheimer's disease, Parkinson's disease, stroke, and autism. We hope that our review will guide researchers toward potential future directions for efficient grading techniques.

Future directions
Addressing the implementation challenges of DCNNs in radiography research requires a strategic approach. To mitigate the impact of limited datasets, collaborative efforts should focus on creating objective, diverse datasets, potentially incorporating data augmentation techniques. Establishing standardized definitions and evaluation metrics across tumor types would enhance algorithm assessment consistency. The development of a unified diagnostic framework spanning tumor identification phases holds promise for increased automation and integration. Overcoming the hurdle of inconsistent glioma grading could involve adopting universally accepted grading criteria. Additionally, to ensure broader applicability, it is essential to explore hardware-efficient solutions, such as model compression techniques, thereby ensuring accessibility to necessary resources. External validation is also crucial for real-world utility; thus, incorporating rigorous external validation protocols in research design would enhance clinical relevance. As research progresses, accounting for these future directions will refine the robustness and practicality of DCNN implementation in radiography, ultimately benefiting patient care and diagnosis quality.
We propose the following novel directions for shaping forthcoming models:
• For robust clinical applicability, future investigations must embrace expansive multicenter datasets, gauging model efficacy across diverse populations independently.
• Elevating CNN performance hinges on meticulous selection of hyperparameters, underscoring their pivotal role. In future designs, adept optimization techniques must be employed to navigate this critical aspect.
• Standardizing imaging methodologies [112] is still a crucial problem to solve, since even the best CNNs may prove ineffective when tested on real-life data. This involves ensuring consistency across institutions and modalities, accounting for real-world variability, and enhancing robustness to achieve effective performance on diverse clinical data.
• Incorporating explainability in AI models is essential for improving the trust and understanding of AI software. Future models should aim to provide clear, understandable reasoning for their predictions and decisions. This will not only enhance user trust but also facilitate troubleshooting and refinement of the models.
• In a dynamic landscape, the WHO has refined glioma classification, transitioning from conventional histopathology to molecular insights in 2016, further accentuated in 2021 by the emphasis of cIMPACT-NOW (the consortium to inform molecular and practical approaches to CNS tumor taxonomy) on molecular markers. This evolving scenario introduces flux in defining LGGs and HGGs, impeding inter-comparison of ML/DL models anchored on differing grading criteria. To enhance accuracy and coherence, future research should converge on glioma grading standards alongside molecular subtypes, assuring enduring and accurate prognostications.
• Developing precise data augmentation methods to expand and diversify training datasets for improved model performance.
• Investigating the integration of divergence-based fuzzy logic into existing DCNN architectures for glioma grading to improve classification robustness and address inherent image uncertainty.
• Beyond these technical aspects, it is also important to address the clinical issues regarding the adoption of DCNNs for tumor grading. It is important to consider factors such as the interpretability of the model's predictions, the integration of the model into existing workflows, and the training and support provided to healthcare professionals using the technology. Additionally, ethical considerations, such as patient consent and data privacy, must also be addressed. These factors are all critical for the successful adoption of AI technologies in clinical settings.
These novel directions, coupled with the previously outlined strategies, underscore the evolving landscape of radiography research and hold significant potential for advancing both diagnostic accuracy and patient care.
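As a concrete illustration of the data augmentation direction listed above, the following minimal sketch generates simple geometric variants of a 2D image slice stored as a list of rows. The function names and the tiny example slice are hypothetical, and whether flips or rotations are appropriate for a given MRI task is itself an assumption that should be checked against the anatomy and labels involved.

```python
def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the original slice plus two simple geometric variants."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


# Hypothetical 2x2 intensity slice.
slice_2d = [[1, 2],
            [3, 4]]
variants = augment(slice_2d)
for v in variants:
    print(v)
```

Each variant keeps the label of the original slice, so the training set grows without new annotation effort; augmentation should be applied to the training split only, so that the test set remains untouched.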

Figure 1. The flow of the conceptual framework of this research paper.

Figure 2. Overview of screening and selection process.

Figure 3. Glioma grades and their characteristics according to WHO [3].

Figure 5. A timeline illustrating the development of techniques utilized by DCNN architectures and in medical imaging.

Figure 6. A flowchart of a generic DCNN-based brain tumor diagnosis system.

Figure 7. Comparative analysis of glioma grading accuracy across different DCNNs.

Figure 8. Major challenges in adopting DCNN-based classification algorithms for glioma grading in clinical settings.

Table 1. Classification of gliomas into WHO grades, types, characteristics, and 5-year survival rate [16,

Table 3. A summary of CNN architectures, their contribution, and limitations in image analysis.
*Architectural depth classifications range from shallow (typically 1 to 10 layers), moderate (around 10 to 100 layers), very deep (often exceeding 100 layers), to variable, allowing significant depth variations beyond predefined ranges.

Table 6. Challenges in DL-based research on glioma grading and impact on algorithm performance.
lack in their ability of interoperability and automation. Very few architectures are available till date that are fully automated and can adapt to model changes.
Challenge: Uniform criteria for glioma grade. Description: Inconsistent classification of LGGs and HGGs across datasets due to non-standardized definitions. Impact: Model might fail on independent dataset due to grade inconsistency.