Fast Brain MRI Segmentation Using a Volumetric Deep Learning Approach

Functional and structural MRI studies benefit from good segmentation of grey and white matter, for example to allow for cortex-based alignment. Automatic segmentation tools apply (multi-)atlas-based segmentation strategies that often lack accuracy on difficult-to-segment brain structures and take several hours of processing. Moreover, these algorithms depend on aligning scans and atlases. Alternatively, to avoid this last step, many methods nowadays deploy solutions based on Convolutional Neural Networks (CNNs), in which the test volume is partitioned into 2D or 3D patches that are processed independently. This entails a loss of global contextual information, thereby negatively impacting the final accuracy of the segmented structures. To fully exploit global spatial information, we introduce a CNN-based segmentation algorithm that processes the whole MRI volume at once and produces an accurate result in only a few seconds starting from a single MRI sequence (T1w). Training and testing are performed on 947 out-of-the-scanner MRI volumes acquired using a standard 1 mm isotropic MPRAGE sequence (3T). Results are evaluated using the Dice Similarity Coefficient and the Hausdorff Distance. The comparison with the state of the art shows that our method outperforms any other current CNN-based solution.


Introduction
When dealing with brain MRI, segmentation plays a role of fundamental importance. It is an essential step for many MRI analyses, for example in the diagnosis of many pathologies, and it is an early step in functional MRI (fMRI) study pipelines. To reduce the human time needed for manual segmentation, different fully automated pipelines have been developed (Despotović, Goossens, & Philips, 2015). The vast majority of these tools apply an atlas-based (or multi-atlas-based) segmentation strategy (Cabezas, Oliver, Lladó, Freixenet, & Bach Cuadra, 2011), in which a target volume is registered with one or several templates built from manual annotations. However, due to the high inter-subject brain variability, these procedures often lack segmentation accuracy on brain structure or tissue boundaries (Klauschen, Goldman, Barra, Meyer-Lindenberg, & Lundervold, 2009; Klein et al., 2017; Lerch et al., 2017; Wenger et al., 2014). In addition, those approaches are time-consuming and computationally intensive.
Recently, Deep Learning (DL) brought substantial advancements in the fields of computer vision (Voulodimos, Doulamis, Doulamis, & Protopapadakis, 2018) and medical image analysis (Shen, Wu, & Suk, 2017; Litjens et al., 2017). Numerous DL-based algorithms that match, or even outperform, atlas-based segmentation have been proposed (Akkus, Galimzianova, Hoogi, Rubin, & Erickson, 2017). However, the common strategy they adopt is to partition the volume into 2D (Roy, Conjeti, Navab, & Wachinger, 2018) or 3D patches (Fedorov et al., 2016; Rajchl, Pawlowski, Rueckert, Matthews, & Glocker, 2018; Dolz et al., 2018; Wachinger, Reuter, & Klein, 2018), process them separately, and aggregate the results to obtain the whole brain segmentation. While this paradigm simplifies the problem from a technical point of view, it introduces important limitations into the analysis: because each patch is segmented independently of the others, patch-based methods mostly exploit local spatial information while ignoring "global" cues, such as the absolute and relative position of different brain structures.
Several recent works discuss the potential improvements of removing the partitioning of the volume (McClure et al., 2018; Wachinger et al., 2018). Such a volumetric approach has already been applied to MRI segmentation of the prostate (Milletari, Navab, & Ahmadi, 2016), heart atrium (Savioli, Montana, & Lamata, 2018), and proximal femur (Deniz et al., 2018), but not yet in the context of brain segmentation, where it could prove particularly useful given the complex geometry and the variety of structures characterising the brain anatomy. In this work, we investigate this hypothesis by introducing the first DL-based full-volume approach to MRI brain segmentation and comparing its performance with state-of-the-art patch-based approaches.

Model Architecture
Aiming to make the best use of the global spatial information contained in MRI data, we design a deep convolutional neural network able to tackle the brain segmentation problem in a volumetric manner, which we will refer to as "fully-volumetric". This is accomplished by exploiting an end-to-end encoding-decoding structure, where only convolutional blocks are used (Figure 1). Inspired by Ronneberger, Fischer, and Brox (2015) and Çiçek, Abdulkadir, Lienkamp, Brox, and Ronneberger (2016), we propose a deep encoder-decoder architecture with six 3D convolutional blocks, arranged in increasing number on three layers. Since a whole volume is considered as an input, the feature maps extracted by such convolutional blocks are not limited to patches but span the entire volume. This allows each block to capture the content of the whole brain MRI, greatly increasing the amount of context provided to each subsequent block, thus supporting the learning of both local and global spatial features. Moreover, instead of max-pooling, strided convolutions are used for dimensionality reduction, allowing the network to learn the optimal downsampling strategy from the extracted features. Finally, skip connections are combined by tensor summation (instead of concatenation) to improve the quality of the segmented volume while significantly limiting the number of parameters (Quan, Hildebrand, & Jeong, 2016).
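The effect of the two design choices above (strided convolutions for downsampling, summation instead of concatenation in the skip connections) can be illustrated with simple arithmetic. The following is a minimal sketch, not the architecture of the paper: the input size (256 voxels per axis), the kernel size, stride and padding, the channel width, and the reading of "three layers" as three resolution levels are all assumptions chosen for illustration.

```python
def strided_conv_out(size, kernel=3, stride=2, padding=1):
    """Output length along one axis of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d_params(c_in, c_out, kernel=3):
    """Weights plus biases of a 3D convolution with a cubic kernel."""
    return c_in * c_out * kernel ** 3 + c_out


# Hypothetical input: a single-channel volume of 256 voxels per axis.
shape = (256, 256, 256)
# Assumption: two strided convolutions connect three resolution levels.
for level in range(2):
    shape = tuple(strided_conv_out(s) for s in shape)
    print("after downsampling", level + 1, shape)

# Hypothetical channel width at one decoder level.
channels = 32
# Summation keeps `channels` feature maps after the skip connection, while
# concatenation doubles them, so the next convolution needs twice the weights.
params_sum = conv3d_params(channels, channels)
params_concat = conv3d_params(2 * channels, channels)
print("sum:", params_sum, "concatenation:", params_concat)
```

Under these assumptions, the convolution that follows a summed skip connection needs roughly half the parameters of the one that follows a concatenated skip connection, which is consistent with the parameter saving described above.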

Training and Testing Data
We evaluate our model's performance on 947 out-of-the-scanner volumes (i.e., reconstructed DICOM images) collected over more than 10 years by the Centre for Cognitive Neuroimaging (CCNi, Institute of Neuroscience and Psychology, University of Glasgow, UK) using a T1-weighted (T1w) 1 mm isotropic MPRAGE protocol. We use 900 of these volumes for training, 11 for validation, and the remaining 36 as the test set. Manually annotating such a large database would prove exceptionally time-consuming. However, several works report that labels obtained using automated pipelines can be exploited to train models that perform as well as (Rajchl et al., 2018), or even better than (Roy et al., 2018), the automated pipeline itself. For this reason, we train our model on automatic segmentations obtained by FreeSurfer (Fischl, 2012). Focusing on the requirements of most real-case scenarios, we relabel the data following the seven classes considered in the MICCAI MRBrainS13 (Mendrik et al., 2015) and MRBrainS18 challenges, i.e., grey matter, basal ganglia, white matter, cerebrospinal fluid, ventricles, cerebellum and brainstem. Each training pair is therefore composed of the unprocessed MRI scan (a set of raw images turned into a neck-cropped NIfTI volume using dcm2niix (Li, Morgan, Ashburner, Smith, & Rorden, 2016)) and the result of the FreeSurfer cortical reconstruction process recon-all after the aforementioned relabelling.
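The relabelling step can be pictured as a lookup from FreeSurfer label ids to the seven target classes. The sketch below is illustrative only: the text does not give the actual mapping, so the target class indices, the subset of FreeSurfer aseg ids listed, the assignment of caudate, putamen and pallidum to "basal ganglia", and the choice to send every unlisted id to background are all assumptions.

```python
# Hypothetical indices for the seven classes named in the text; 0 is background.
TARGET = {
    "background": 0, "grey matter": 1, "basal ganglia": 2, "white matter": 3,
    "cerebrospinal fluid": 4, "ventricles": 5, "cerebellum": 6, "brainstem": 7,
}

# Illustrative, incomplete subset of FreeSurfer aseg label ids.
# The grouping into target classes is an assumption, not the paper's mapping.
ASEG_TO_TARGET = {
    2: 3, 41: 3,                # left/right cerebral white matter
    3: 1, 42: 1,                # left/right cerebral cortex
    4: 5, 43: 5,                # left/right lateral ventricle
    11: 2, 50: 2,               # left/right caudate
    12: 2, 51: 2,               # left/right putamen
    13: 2, 52: 2,               # left/right pallidum
    7: 6, 8: 6, 46: 6, 47: 6,   # cerebellar white matter and cortex
    16: 7,                      # brain stem
    24: 4,                      # CSF
}


def relabel(labels):
    """Map a flat sequence of aseg ids to target classes (unlisted ids -> 0)."""
    return [ASEG_TO_TARGET.get(value, TARGET["background"]) for value in labels]


print(relabel([2, 42, 16, 999]))
```

A real pipeline would apply the same lookup to every voxel of the segmentation volume and would need a complete mapping covering all labels produced by recon-all.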

Results
We compare the proposed method with other CNN-based solutions: the well-known 2D-patch-based U-Net (Ronneberger et al., 2015), its 3D variant (Çiçek et al., 2016), and the state-of-the-art architecture QuickNAT (Roy et al., 2018), which leverages the aggregation of three slightly modified U-Net architectures (trained on coronal, sagittal, and axial MRI slices, respectively).
The learning effectiveness of the models is quantitatively evaluated using the Dice Coefficient and the 95th percentile Hausdorff Distance (Figure 2), with the FreeSurfer segmentation as a reference. The comparison shows that our model, despite having far fewer parameters, outperforms the others in terms of both metrics, especially for structures such as basal ganglia and ventricles, whose segmentation becomes harder if conducted on patches (due to their similarity to other structures or tissues). Moreover, we evaluate our model's generalisation capability by comparing the obtained segmentation against the FreeSurfer ground truth our method is trained on. In most cases (some of which are reported in Figure 3), and in the absence of systematic errors in the training ground truth, our model qualitatively outperforms FreeSurfer atlas-based segmentation.
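For reference, the two metrics can be sketched on small sets of voxel coordinates. This is a minimal illustration under stated assumptions rather than the evaluation code used for Figure 2: segmentations are represented as sets of (x, y, z) tuples for a single class, distances are computed over all voxels instead of surface voxels only, the percentile uses the nearest-rank rule, and the symmetric distance is taken as the maximum of the two directed 95th percentiles. With 1 mm isotropic voxels, distances in voxel units equal millimetres.

```python
import math


def dice(a, b):
    """Dice Similarity Coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def percentile_nearest_rank(values, q):
    """q-th percentile of a non-empty list using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def directed_distances(a, b):
    """For every voxel in a, the Euclidean distance to its nearest voxel in b."""
    return [min(math.dist(p, r) for r in b) for p in a]


def hausdorff95(a, b):
    """Symmetric 95th percentile Hausdorff distance (both sets non-empty)."""
    return max(
        percentile_nearest_rank(directed_distances(a, b), 95),
        percentile_nearest_rank(directed_distances(b, a), 95),
    )


# Toy example: a prediction and a reference that share two of three voxels.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ref = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}
print(dice(pred, ref))         # 2 * 2 / (3 + 3)
print(hausdorff95(pred, ref))  # each unmatched voxel is 1 voxel from the other set
```

The brute-force nearest-neighbour search is quadratic in the number of voxels, so it is only suitable for toy inputs; evaluation on full volumes would typically rely on distance transforms or spatial indexing.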

Conclusion
In this work, we trained and tested a CNN tackling the brain MRI segmentation problem in a fully-volumetric fashion. Using the Dice Similarity Coefficient and the 95th percentile Hausdorff Distance, we have shown that this model outperforms the state of the art in terms of learning effectiveness. Moreover, we showed our model's strong generalisation capability by qualitatively assessing its superior performance with respect to the data it was trained on (FreeSurfer atlas-based segmentation).