SGPNet: A Three-Dimensional Multitask Residual Framework for Segmentation and IDH Genotype Prediction of Gliomas

Key Laboratory of Symbol Computation and Knowledge Engineering, Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
School of Artificial Intelligence, Jilin University, Changchun 130012, China
Department of Radiology, The First Hospital of Jilin University, Changchun 130012, China
Department of Obstetrics, The First Hospital of Jilin University, Changchun 130012, China


Introduction
Glioma is the most common type of malignant brain tumor in adults, accounting for approximately 80% of them, and it is divided into four grades, I to IV, according to the World Health Organization (WHO) [1]. Although gliomas are frequent, their histology and molecular etiology vary even within a single pathology class [2]; hence, recognizing the molecular status is crucial for precision medicine. Isocitrate dehydrogenase (IDH) is a general term for IDH1 and IDH2, and previous studies have shown that the IDH genotype (wild-type or mutant) has a significant impact on the diagnosis, treatment, and prognosis of glioma patients [3][4][5][6]. However, identifying the IDH genotype by biopsy is an invasive and costly procedure that requires a sample of cells from the patient's lesion, whereas radiographic medical imaging provides a noninvasive platform for sampling both inter- and intralesion heterogeneity of gliomas. Previous research has demonstrated a strong correlation between phenotypes (extracted from medical images) and genotypes (extracted from gene expression files), and predicting genotypes from phenotypes has become a fast-developing research field [7].
At present, high-performance models have been constructed to predict the genotypes of glioma patients from medical images. For this task, an effective approach is based on radiomics and machine learning algorithms [8,9]. Radiomics is a method in which experienced radiologists extract lesion-related features from medical images using professional software and data-characterization algorithms [10]. After this processing, the high-dimensional image data are well represented by low-dimensional radiomics features, which allow researchers to build IDH prediction models more easily. Although radiomics feature-based models perform well on genotype prediction, they still have some limitations. For example, extracting radiomics features depends on the radiologists' judgment and is therefore a subjective procedure, and it is also affected by the hardware and software environment. Different radiologists using different software and algorithms may produce slightly different descriptions of the details of the lesion. Besides, all raw images must be processed before the prediction phase, and the low-dimensional features restrict the models in further investigations. Overall, the models' generalization ability and reproducibility are limited by the high dependency on manual intervention.
Based on the above observations, researchers introduced deep learning (DL) algorithms into genotype prediction tasks. DL, as a subclass of machine learning (ML), shows a more powerful learning ability. The annotated data are required only for the training phase, and the well-trained models can receive raw images as input for various tasks.
The raw images preserve all the information about the lesions and the organism, which allows the models to complete more complex tasks. Chang et al. developed a residual convolutional neural network (CNN) using magnetic resonance (MR) sequence images [11]. However, in Chang's work, the MR sequence images are manually selected from whole 3D brain MR images. To handle the 3D brain MR images directly, Liang et al. developed a 3D DenseNet for IDH genotype prediction from MR images of low-grade gliomas (LGG, grades II and III) and high-grade gliomas (grade IV, known as glioblastoma multiforme, GBM) and achieved an accuracy of 84.6% on the validation dataset [12]. DL algorithms also perform well on automatic segmentation tasks, and previous studies have established many high-performance models to segment lesion areas from medical images [13,14]. Soltaninejad et al. combined DL and ML algorithms to build superpixel-based and supervoxel-based models for brain tumor segmentation and detection [15]. However, these models cannot predict gene mutation statuses, which are also important for the treatment of glioma patients. The attention mechanism has also been introduced to improve the performance of segmentation. Although the attention mechanism shows potential for medical image tasks, it significantly increases the computational complexity of models, especially for 3D MR images. As a result, attention-based models need more cases and are more difficult to train well. Liu et al. developed a multitask model that covers the segmentation of brainstem gliomas and the prediction of H3 K27M mutation [16]. The phenotypes of MR images and the IDH genotype are both important criteria for glioma patients to receive proper medical treatment; however, a multitask framework for the segmentation of glioma lesion areas and the prediction of IDH genotype is still lacking. The brain MR images contain the details of normal tissues and lesion areas.
Both normal tissues and lesion areas may affect the performance of genotype prediction. However, previous studies focused only on constructing black-box models for genotype prediction because they were constrained by the single-task model structure, which limits their reliability as computer-aided tools for diagnosis and treatment. Owing to the multitask architecture of SGPNet, we set up controlled experiments to examine the influence of lesion areas on IDH genotype prediction by setting different groups of learning targets.
In this paper, we focus on a multitask CNN model (SGPNet) to address the challenges of the automatic segmentation of low-grade glioma (LGG) and glioblastoma multiforme (GBM) tumor volumes and the prediction of IDH mutation from MR images. Four modalities of MR images, namely T1, T1Gd, T2, and T2-FLAIR, are preprocessed and then fed into SGPNet. Our model consists of a single backbone with two output blocks, one for segmentation and one for IDH status. To train such a multitask model effectively, we apply a multiloss function to our network and different learning rates to the different blocks.
The experimental results indicate that our model reduces the classification error rate by 26.6% compared with previous models on datasets drawn from the Multimodal Brain Tumor Segmentation Challenge (BRATS) and The Cancer Genome Atlas (TCGA) glioma databases. In addition, we further study how the features of lesion areas influence the performance of IDH genotype prediction. We believe that these experiments demonstrate that lesion information is important for IDH genotype prediction and increase the reliability of that prediction.

Gene Profiles and Medical Images Dataset.
In this paper, we used two datasets, from The Cancer Genome Atlas (TCGA) and the Brain Tumor Segmentation Challenge (BRATS) 2020 databases, to conduct our experiments. The genotype-related dataset is TCGA [17], which provides various gene data types, including gene expression profiling, copy number variation profiling, and so on. More specifically, the TCGA dataset provides four methods to identify gene mutation status in parallel: MuSE [18], MuTect2 [19], SomaticSniper [20], and VarScan2 [21]. We considered a gene to be mutated when more than one of these methods indicated that the gene is mutated.
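The voting rule described above can be sketched in a few lines. This is only an illustration: the function name `is_mutated`, the dictionary layout, and the `min_votes` parameter are hypothetical and not taken from the TCGA tooling; "more than one" caller is read here as at least two of the four.

```python
from typing import Mapping

# The four somatic variant callers named in the text.
CALLERS = ("MuSE", "MuTect2", "SomaticSniper", "VarScan2")


def is_mutated(calls: Mapping[str, bool], min_votes: int = 2) -> bool:
    """Return True when at least `min_votes` callers report a mutation.

    "More than one" caller in the text corresponds to min_votes=2.
    Callers missing from `calls` are counted as not reporting a mutation.
    """
    votes = sum(bool(calls.get(name, False)) for name in CALLERS)
    return votes >= min_votes


# Hypothetical example: two of the four callers agree.
example = {"MuSE": True, "MuTect2": True, "SomaticSniper": False, "VarScan2": False}
print(is_mutated(example))
```

A single positive call is treated as insufficient evidence under this reading, which trades some sensitivity for fewer false mutation labels.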
The BRATS 2020 dataset [22][23][24] provides multimodal brain MR images of LGG and GBM patients, including T1, T1Gd, T2, and T2-FLAIR volumes. One of the sources of the BRATS dataset is The Cancer Imaging Archive (TCIA) dataset [25], which allows us to cross-reference MR images and gene expression profile data according to the project ID in both datasets. The subtypes of the segmentation labels include the necrotic and nonenhancing tumor core (NCR and NET), the peritumoral edema (ED), the enhancing tumor (ET), and the background. In this paper, considering the scale of the datasets and our research content, we merge the NCR and NET, ED, and ET labels into a single lesion label, which makes the evaluation of our experimental results more concise. In total, data from 121 cross-referenced patients were collected from the above datasets, comprising 56 mutant cases and 65 wild-type cases.

Data Processing.
The original MR images have been manually annotated by clinical experts; each entity consists of four modality volumes (T1, T1Gd, T2, and T2-FLAIR) and the ground-truth segmentation labels, and all those images have the same shape of 155 * 240 * 240 voxels. The data preprocessing procedure has the following steps. (1) Every image is cropped to remove the black background. (2) The cropped image is then resized to the unified shape of 144 * 144 * 144 voxels, and all images except the segmentation labels are normalized to zero mean and unit standard deviation. (3) The four modalities are concatenated as four input channels. Figure 1 shows the above steps of the preprocessing procedure. Considering the limited size of the dataset, we also apply data augmentation: shift and flip operations are each randomly chosen with a fifty percent chance in each training step.
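The three preprocessing steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the pipeline used in the paper: the interpolation method is not specified in the text, so nearest-neighbour resampling is assumed, the function names are hypothetical, and the demonstration uses a tiny synthetic volume instead of the 144 * 144 * 144 target.

```python
import numpy as np


def crop_to_foreground(volumes, labels):
    """Step 1: crop all modalities and the label map to the nonzero bounding box.

    `volumes` has shape (4, D, H, W); `labels` has shape (D, H, W).
    Assumes at least one nonzero voxel exists.
    """
    mask = np.any(volumes != 0, axis=0)  # foreground in any modality
    idx = np.nonzero(mask)
    box = tuple(slice(int(i.min()), int(i.max()) + 1) for i in idx)
    return volumes[(slice(None),) + box], labels[box]


def resize_nearest(volume, shape):
    """Step 2a: nearest-neighbour resize of a 3D array (assumed interpolation)."""
    index = [np.minimum((np.arange(n) * s / n).astype(int), s - 1)
             for n, s in zip(shape, volume.shape)]
    return volume[np.ix_(*index)]


def preprocess(volumes, labels, shape=(144, 144, 144)):
    """Crop, resize, z-score each modality, and stack them as channels."""
    volumes, labels = crop_to_foreground(volumes, labels)
    channels = []
    for v in volumes:
        v = resize_nearest(v.astype(np.float32), shape)
        channels.append((v - v.mean()) / (v.std() + 1e-8))  # step 2b
    return np.stack(channels), resize_nearest(labels, shape)  # step 3


# Tiny synthetic example: a 4 x 4 x 4 "brain" inside a black 6 x 8 x 8 volume.
rng = np.random.default_rng(0)
vols = np.zeros((4, 6, 8, 8), dtype=np.float32)
vols[:, 1:5, 2:6, 2:6] = rng.random((4, 4, 4, 4)) + 1.0
segs = np.zeros((6, 8, 8), dtype=np.int64)
segs[2:4, 3:5, 3:5] = 1
x, y = preprocess(vols, segs, shape=(4, 4, 4))
print(x.shape, y.shape)
```

Nearest-neighbour resampling keeps the label map integer-valued; a smoother interpolation for the image channels would be an equally reasonable choice.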

Model Architecture.
In the segmentation task, using the low-level details of the input image has proved to be important when the size of the dataset is limited; as a result, U-Net has achieved high performance and has been widely applied to medical image segmentation [26][27][28]. Besides, degradation is a common problem when the network architecture is deep [29]. Inspired by this research, we modify the hyperparameters of 3D U-Net and introduce skip-connections into our model. The basic shape of our framework follows the standard U-Net, which contains two paths called the contracting path (left side) and the expansive path (right side). Five pairs of blocks are employed in the two paths, where the output of each block in the contracting path is concatenated as part of the input of the corresponding block in the expansive path. These connections create a quick pathway for information between high-level and low-level feature maps, which facilitates backward gradient propagation and supplements high-level semantic features with finer details [30]. Besides, these connections allow the output blocks to extract multilevel features for different tasks from the backbone of SGPNet.
Our proposed network consists of one backbone and two output blocks, as illustrated in Figure 2. More specifically, Conv (in, out, kernel size, stride) represents the 3D convolution layer; the items in the four-tuple (in, out, kernel size, stride) represent the input channels, output channels, kernel size, and stride of the convolutional layer, respectively. IN represents the instance normalization layer, which is designed to remove the instance-specific contrast information from the input image [31], and Up is the up-sampling layer. FC represents the fully connected layer for the prediction of IDH genotype. LR is the leaky rectified linear unit (LeakyReLU) activation function:

$$\mathrm{LR}(x) = \begin{cases} x, & x \ge 0, \\ \alpha x, & x < 0, \end{cases} \tag{1}$$

where $\alpha$ is a small positive slope applied to negative inputs. The segmentation task and the IDH genotype prediction task share most of the weights in the backbone. In general, our network is an end-to-end model, which receives four channels of MR images as input and outputs the segmentation labels and the predicted IDH mutation status.
In the contracting path, we replace the max-pooling layer of the standard U-Net with one 3 * 3 * 3 convolution with a stride of 2, which down-samples the input and doubles the number of output channels, followed by two repeated 3 * 3 * 3 convolutions with a stride of 1. LR and IN are also added after each convolution layer. The blue dotted line represents the skip-connection; it adds the output of the first convolution layer to the output of the last convolution layer in each block. In the expansive path, the input of each block is the concatenation of the output of the previous block and the corresponding feature map from the paired contracting path. The first 3 * 3 * 3 convolution integrates the information of the concatenated input, followed by a 1 * 1 * 1 convolution that halves the number of input channels. The up-sampling layer follows these two convolutions and uses the nearest neighbor interpolation algorithm to double the width and height of the input features, followed by a 3 * 3 * 3 convolution that further halves the number of input channels. The two output blocks also adopt the idea of skip-connection. For the segmentation and IDH genotype output blocks, the input comes from blocks at three different levels in the expansive path of the backbone.
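To make the down-sampling arithmetic of the contracting path concrete, the sketch below traces the spatial size and channel count through five blocks. Several values are assumptions, since the text does not state them: a padding of 1 for the 3 * 3 * 3 convolutions, a hypothetical base width of 16 channels, and a first block that keeps the input resolution while the remaining four each start with the stride-2 convolution.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def contracting_path(size=144, base_channels=16, blocks=5):
    """Trace (channels, spatial size) after each contracting block.

    Assumption: block 1 keeps the resolution; every later block begins with
    a stride-2 convolution that halves the size and doubles the channels,
    followed by two stride-1 convolutions that preserve both.
    """
    trace = [(base_channels, conv_out(size))]
    channels, current = base_channels, size
    for _ in range(blocks - 1):
        current = conv_out(current, stride=2)  # down-sampling convolution
        channels *= 2                          # channel doubling
        current = conv_out(conv_out(current))  # two stride-1 convolutions
        trace.append((channels, current))
    return trace


for level, (channels, size) in enumerate(contracting_path(), start=1):
    print(f"block {level}: {channels} channels, {size}^3 voxels")
```

Under these assumptions, the 144-voxel input edge shrinks to 72, 36, 18, and 9, which is one reason a cube of 144 is a convenient input size: it can be halved four times without rounding, so the up-sampled feature maps match the contracting-path maps they are concatenated with.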

Evaluation Metrics.
In this section, we use four metrics, namely specificity (SP), sensitivity (SN), accuracy (ACC), and area under the receiver operating characteristic curve (AUC), to assess SGPNet on the IDH status prediction task, and the dice similarity coefficient (DSC) to assess it on the segmentation task. SP measures the proportion of negatives that are correctly predicted, as in equation (2), and SN measures the true positive rate, as in equation (3). ACC is the fraction of the total samples that are identified correctly, as in equation (4). AUC is the probability that a randomly selected positive example is ranked above a randomly selected negative one. DSC scores how closely the predicted segmentation labels match the annotated ground-truth segmentation labels, as in equation (5).

$$\mathrm{SP} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \tag{2}$$

$$\mathrm{SN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{3}$$

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}, \tag{4}$$

$$\mathrm{DSC} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}. \tag{5}$$

Four quantities are used to calculate the above items: true positive (TP) is the number of correctly predicted positive samples; likewise, true negative (TN) is the number of correctly predicted negative samples. False positive (FP) is the number of samples incorrectly predicted as positive, and false negative (FN) is the number of samples incorrectly predicted as negative.
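The metrics described above follow directly from the four counts. The sketch below uses the standard definitions; the function names and the example counts are illustrative only and are not results from the paper. AUC is computed from its ranking interpretation, with ties counted as one half (a common convention that the text does not specify).

```python
def classification_metrics(tp, tn, fp, fn):
    """Specificity, sensitivity, and accuracy from confusion-matrix counts."""
    return {
        "SP": tn / (tn + fp),
        "SN": tp / (tp + fn),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
    }


def dice_coefficient(tp, fp, fn):
    """Dice similarity coefficient from voxel-level counts."""
    return 2 * tp / (2 * tp + fp + fn)


def auc(pos_scores, neg_scores):
    """Probability that a random positive is scored above a random negative."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up counts and scores, for illustration only.
print(classification_metrics(tp=8, tn=9, fp=1, fn=2))
print(dice_coefficient(tp=8, fp=1, fn=2))
print(auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a cohort of this size; a rank-based formula would be preferable for large datasets.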

Implementation Details.
Considering the evaluation metrics, cross-entropy and dice loss are the objective functions of our network. In the task of gene mutation prediction, the IDH status is encoded into two labels (wild-type and mutation). The binary cross-entropy (BCE) loss function $L_1$ is used to measure the discrepancy between the predicted labels and the ground-truth labels, and it is defined as follows:

$$L_1 = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right],$$

where $\hat{y}$ represents the model's predicted class probability and $y$ represents the ground-truth label. The dice loss function $L_2$ measures the spatial overlap between the predicted segmentation labels and the manually annotated labels, and it is defined as follows:

$$L_2 = 1 - \mathrm{DSC},$$

where DSC is the dice similarity coefficient between the predicted and the ground-truth segmentation labels. The ground-truth segmentation labels contain more information than the IDH mutation status, so it may not be ideal to weigh the segmentation error equally with the classification error. To integrate the above loss functions, we define the total loss $L$ as a combination of $L_1$ and $L_2$ weighted by a parameter $k$ that balances the segmentation error and the classification error. To balance the dice loss and the classification loss dynamically, $k$ is defined as $k = L_2 / (L_1 + L_2)$.

We set different learning rates for different parts of our network. In particular, the learning rate is set to 0.0001 for the backbone and the segmentation output block, and to 0.00005 for the IDH status prediction block. Moreover, we adopt learning rate scheduling with cosine annealing during the training phase. The weights of our network are optimized by the Adam [32] method with a minibatch size of two.
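The loss design and learning rate schedule can be illustrated with a small pure-Python sketch. Two points are assumptions rather than statements from the text: the total loss is taken to be the convex combination k * L1 + (1 - k) * L2, and the cosine schedule decays to a hypothetical minimum rate of zero over a hypothetical number of steps. The helper names and the smoothing constant `eps` are likewise illustrative.

```python
import math


def bce_loss(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def dice_loss(pred, target, eps=1e-7):
    """1 - Dice overlap for flattened (soft or binary) masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def total_loss(l1, l2):
    """Assumed form: L = k * L1 + (1 - k) * L2 with k = L2 / (L1 + L2)."""
    k = l2 / (l1 + l2)  # undefined when both losses are exactly zero
    return k * l1 + (1.0 - k) * l2


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr (min_lr is assumed)."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * step / total_steps))


l1 = bce_loss([1, 0], [0.8, 0.3])
l2 = dice_loss([1, 1, 0, 0], [1, 0, 0, 0])
print(total_loss(l1, l2))
print(cosine_lr(0, 100, 1e-4), cosine_lr(50, 100, 1e-4))
```

Under the assumed combination, the total loss reduces to 2 * L1 * L2 / (L1 + L2), so the smaller of the two losses receives the larger weight. In an automatic differentiation framework, whether k is treated as a constant or differentiated through is a further design choice that the text does not settle.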

Experiments and Results
In this section, we present a series of experiments to demonstrate the performance of the proposed multitask model; we test SGPNet on the BRATS and TCGA datasets and compare SGPNet with three existing models. Furthermore, we discuss the impact of lesion information on the IDH status prediction task. Overall, 121 glioma cases are involved, including 56 mutant cases and 65 wild-type cases. The reproducibility of the results is verified by fivefold cross-validation, and the final results are the average over the folds.
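The fivefold protocol over the 121 cases can be sketched as follows. The shuffling seed, the plain (non-stratified) split, and the function names are assumptions made for illustration; the text does not say how cases were assigned to folds.

```python
import random


def k_fold_indices(n_cases, k=5, seed=0):
    """Shuffle case indices and deal them into k disjoint validation folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mean(values):
    """Average of per-fold scores, as used for the final reported result."""
    return sum(values) / len(values)


folds = k_fold_indices(121)
print([len(fold) for fold in folds])  # [25, 24, 24, 24, 24]

for i, val_idx in enumerate(folds):
    val_set = set(val_idx)
    train_idx = [j for j in range(121) if j not in val_set]
    # Train on train_idx and evaluate on val_idx here; collect one score per fold.
    print(f"fold {i}: {len(train_idx)} training cases, {len(val_idx)} validation cases")
```

With 56 mutant and 65 wild-type cases, a stratified split that preserves the class ratio in every fold would be a natural refinement.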

Multitask Model for Segmentation and IDH Genotype Prediction.
To evaluate the performance of our proposed model, we compare SGPNet with three different models. The ACC, SN, SP, and AUC metrics are used to quantitatively evaluate the prediction of IDH genotype, and the DSC metric is used to evaluate the segmentation task. Table 1 shows the ACC, SN, SP, AUC, and DSC of all models on the IDH genotype prediction task and the segmentation task. Figure 3 illustrates the qualitative segmentation results of lesion areas obtained with SGPNet, which demonstrate that SGPNet can determine the boundary of the lesion accurately. Unlike single-task segmentation and classification models, SGPNet not only segments the glioma lesions but also predicts the IDH genotype from the brain MR images. The positive predictive value (PPV) and negative predictive value (NPV) of SGPNet reach 0.894 and 0.908, respectively. Moreover, these experimental results show that our proposed model reduces the classification error rate by 26.6% compared with previous models and performs well on glioma lesion segmentation.

The Comparisons with Different Groups of Learning Targets.
The brain MR images contain the details of normal tissues and lesion areas, and both may influence genotype prediction. The multitask model structure allows us to set different groups of learning targets to investigate whether the information of lesion areas or that of the whole-brain MR images is more likely to influence the genotype prediction, which might increase the reliability of the model as a computer-aided tool for diagnosis and treatment. In this section, we carry out three controlled experiments to analyse the relationship between genotypes and phenotypes by training SGPNet with different groups of learning targets: (1) SGPNet is trained with the IDH genotype only; (2) SGPNet is trained with the ground-truth segmentation labels and the IDH genotype; and (3) SGPNet is trained with a randomly generated tensor as segmentation labels and the IDH genotype. Table 2 shows the performance of IDH genotype prediction across the three controlled experiments. Figure 4 compares the ROC curves of the different experiments.
The total loss function reduces to the single-task objective function $L_1$ when SGPNet is trained with IDH genotype labels only. In that case, SGPNet acts as a classifier of IDH genotype, and its performance is worse than that of Liang et al. and Chang et al. [11,12]. One important reason is that Liang et al. and Chang et al. crop the lesion areas as the models' input, while our model receives whole-brain MR images as input, which makes it more difficult for the model to extract useful features given the limited information in the IDH genotype labels. When the ground-truth segmentation labels are added as learning targets, the performance of the model improves significantly. However, the first experiment uses the single-task objective function $L_1$, while the second experiment uses the multitask objective function $L$. To further examine the influence of the objective function, we set up the third experiment, which uses randomly generated segmentation labels as learning targets. In this setting, the segmentation output block learns wrong features of the lesion areas while the IDH status output block can still learn the features of the whole MR images; as a result, the performance of the model drops significantly. Comparing these experimental results, we infer that the ground-truth segmentation labels improve the performance of IDH genotype prediction and that lesion information is more important for predicting the IDH genotype.

Discussion
Developing an automatic segmentation method for 3D glioma lesions is a challenging task, considering the wide variability in tumor size, form, and intensity. Furthermore, the mutation status of IDH can be used as a qualified biomarker for selecting diagnostic and therapeutic approaches for glioma patients. Previous studies have focused on the prediction of genotypes from medical images [8,9,11,12]; however, these single-task models are limited in their practicality and scalability, and a multitask model for segmentation and IDH genotype prediction of gliomas has been lacking. Besides, no previous research has compared the influence of image features from whole MR images with that of features from lesion areas on the prediction of IDH genotype.
SGPNet is an end-to-end framework designed to address the challenges of segmentation and IDH genotype prediction of gliomas. In Section 3.1, the experimental results indicate a significant improvement in the performance of IDH genotype prediction: the prediction error rate is reduced by 26.6% compared with the models of Liang et al. and Chang et al. [11,12]. Owing to the multitask model architecture, in Section 3.2, we further examine whether the information of glioma lesions or that of whole MR images is more likely to affect the prediction of IDH genotype by setting different groups of learning targets.
The experimental results indicate that providing the ground-truth segmentation labels as learning targets improves the performance of IDH genotype prediction compared with the other experiments. Overall, we infer that the information of lesion areas is more important for IDH genotype prediction, which increases the reliability of the model as a computer-aided tool for diagnosis and treatment.
In clinical practice, the diagnosis of glioma is usually made by experts based on the various MR images and gene mutation statuses. The different modalities of MR images can reflect different characteristics of the lesions. For example, T1 provides anatomical information, and T2 is sensitive to the edema area and reflects the morphological information of tumors [33].
SGPNet can integrate multimodal MR images to predict the boundary of lesion areas and the IDH genotype of the patients, which can reduce doctors' workload and help them choose the proper treatment for the patients.
SGPNet is feasible for segmentation and genotype prediction because the backbone of our framework is designed to learn the intrinsic information of patients' lesions. Meanwhile, the framework of SGPNet can be used to segment other tissue lesions or predict other genotypes when it is well trained on the corresponding dataset. SGPNet can also be applied to multicenter and larger-scale multisequence MR image datasets because the backbone of our model is generic for MR images collected from different institutions, equipment, and modalities. Moreover, increasing the scale of the training datasets can improve the generalization ability of SGPNet. Generating probability density distributions for different tissue types is also an effective approach to reduce environmental noise and improve generalization ability [34]. Therefore, the design of an automatic multitask model for gliomas has considerable clinical value. In the future, we will further develop our framework and apply SGPNet to more types of diseases and genes.

Conclusion
In this paper, we present a novel multitask 3D framework named SGPNet for the automatic segmentation of glioma lesions and the prediction of IDH mutation status from MR images. Our framework employs a backbone for learning the intrinsic MR image information and two output blocks for the segmentation and IDH genotype prediction of gliomas. The experimental results indicate that our architecture achieves better IDH genotype prediction performance on the public TCGA and BRATS 2020 datasets than previous studies and achieves a good result on the segmentation task. Furthermore, we compare the influence of the image features of whole MR images and of lesion areas on the prediction of genotype, and the experimental results indicate that the information of patients' lesions is more significant for the prediction of IDH genotype. In summary, the accurate segmentation of glioma lesion regions and prediction of IDH mutation status will improve therapeutic criteria and assist doctors in diagnosis and treatment.

Data Availability
The MRI data used to support the findings of this study have been deposited in the BRATS repository (http://braintumorsegmentation.org/), and the gene profiles data used to support the findings of this study have been deposited in the TCGA-GBM and TCGA-LGG repositories (https://portal.gdc.cancer.gov/projects/TCGA-GBM and https://portal.gdc.cancer.gov/projects/TCGA-LGG).

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.