An Automatic Segmentation of T2-FLAIR Multiple Sclerosis Lesions

Multiple sclerosis diagnosis and patient follow-up can be helped by an evaluation of the lesion load in MRI sequences. A lot of automatic methods to segment these lesions are available in the literature. The MICCAI workshop Multiple Sclerosis (MS) lesion segmentation Challenge 08 allows to test and compare these algorithms. This paper presents a method designed to detect hyperintense signal area on T2-FLAIR sequence and its results on the Challenge test data. The proposed algorithm uses only three conventional MRI sequences: T1, T2 and T2-FLAIR. First, images are cropped, spatially unbiased and skull-stripped. A segmentation of the brain into its different compartments is performed on the T1 and the T2 sequences. From these segmentations, a threshold for the T2-FLAIR sequence is automatically computed. Then postprocessing operations select the most plausible lesions in the obtained hyperintense signals. Global result on the test data (80/100) is close to the inter-expert variability (90/100).

Multiple Sclerosis (MS) is a chronic, inflammatory, demyelinating disease of the Central Nervous System (CNS). In people affected by MS, patches of damage called lesions appear in seemingly random areas of the CNS. An MRI exam is required to establish MS diagnosis using McDonald criteria [12]. In addition they are often used in patients follow-up and clinical studies [10]. MRI analysis uses currently Barkhof/Tintore criteria [1,21] which include lesions number, location enhancement and are taking into account spinal cord lesions [15].
These lesions can appear as a hyperintense signal or as a hypointense signal depending on its properties and on the used MRI sequence. Lesions are hyperintense signals in T2 and proton density sequences. Active lesions are a piece of evidence of blood-brain barrier leakage and are the only lesion subtype in hyperintense signals in the T1 sequence with Gadolinium. Necrotic lesions are hypointense signal in T1 sequence. Except for necrotic lesions, T2-FLAIR sequence allows a better lesion-healthy tissue differentiation but bony artefacts and flow artefacts are present in the image.
A binary segmentation of the lesions can help to the MS diagnosis and patient follow-up. Manual lesion segmentation is a fastidious task and depends on intra and inter-expert variabilities. For this reason, a lot of automatic lesion segmentation algorithms have been developed in the past 20 years [20]. The Multiple Sclerosis (MS) lesion segmentation Challenge 08 1 offers the possibility to compare these methods. First, a set of train data (with manual segmentation of expert) is available to optimize the different methods. Then, lesion segmentation has to be performed on a set of test data. Results are then compared with manual segmentation by an expert.
The method which is proposed in this article segments automatically the hyperintense signal in the T2-FLAIR sequence. First, the method is described. Then results on the test data of the workshop are given and discussed.

Method
The method proposed in this article segments T2-FLAIR lesions from three MRI sequences (T1, T2 and T2-FLAIR) and is divided in different steps. First, preprocessing steps allow to normalize images and to focus on a region of interest. Secondly, a classification of the brain is performed thanks to an expectationmaximization algorithm [4] applied on the T1 and T2 sequences. These steps are similar to [7]. In a third step, information given by the obtained segmentations and morphological operations allow to extract lesions.

Preprocessing
MRI sequences of the MS lesion segmentation Challenge 08 are already co-registered. This means that a same voxel in the different sequences represents the same location in the brain. However, different preprocessing steps have to be performed before segmenting the images (Figure 1).

Image cropping
This step aims to decrease the number of voxels belonging to the background. It also improves the computation speed of following processes. It is performed using the MNI 2 atlas [9,8] and an affine registration algorithm [14]. First, the average T2 sequence of the atlas is registered on the T2 sequence. Then, the obtained transformation is applied on all the images of the atlas. The obtained information allow to crop the MRI sequences focussing on a region of interest.

Skull-stripping
This step extracts the intracranial space from the image. Many methods such as [17,18,19,23] are described in the literature. Our method is described in [7]. A first expectation-maximization algorithm is performed on the couple of sequences T1 and T2 and leads to a first segmentation of the brain. Morphological operations (detection of the largest connected component, holes filling ...) allow to get the brain mask.

Intensity Normalization
The aim of this step is to correct the fact that two voxels with the same biological composition may not have the same intensity. This difference in intensity is called bias and is caused by RF acquisition field inhomogeneities [13] or biological tissues bias reflecting that the intensity of a same biological structure has a variability around a mean value [16]. In our case, we estimate and correct these inhomogeneities with the Expectation/Conditional Maximization algorithm proposed in [16].

Segmentation of the brain
To segment the brain, the algorithm presented in [5] is applied on the T1 and T2 sequences. This algorithm uses the principle of the EM algorithm [4] to maximize the log-likelihood between the MRI data and a gaussian model of ten classes: white matter (WM), grey matter (GM), cerebro-spinal fluid (CSF), six GM/CSF partial volume classes (with different proportions), and an outlier class (additional class that corresponds mainly to the vessels). First, the probability of belonging to the different classes of each voxel is initialized thanks to the a priori information of the MNI registered atlas [9,8]. Second two steps are iterated: • In the maximization step, the parameters (mean µ k , covariance matrix Σ k ) of each class, k, are computed from the voxels intensities and their probabilities of belonging to the different classes.
• In the expectation step, the probability of belonging to the different classes of each voxel is updated depending on the classes parameters.
Finally, outliers that do not follow the intensity gaussien model are detected thanks to the computation of the Mahalanobis distance (Equation 1) between the intensity vector, v, of each voxel and the mean vector of each class.
If this distance is greater than a threshold the voxel is considered as an outlier. The probability segmentations given by this algorithm are then binarize. Each voxel belongs to the class with the highest probability. At the end of this step 11 binary images (GM, WM, CSF, 6 partial volume classes, outliers corresponding to vessels, others outliers) are obtained ( Figure 2).

Lesion extraction from the T2-FLAIR
Except for necrotic lesions, lesions are hyperintense signals on the T2-FLAIR. The following steps use this property to segment the lesions.

Segmentation from T2-FLAIR sequence
The application of the binary segmentations of the brain (Section 1.2) on the T2-FLAIR sequence gives the properties (mean, µ, standard deviation, σ) of healthy compartments on this sequence. As lesions are hyperintense signals on the T2-FLAIR sequence, a sensitive threshold, T, which gives a preliminary segmentation of the lesions can be compute automatically from the properties of the GM class (Equation 2).
The application of this threshold on the T2-FLAIR sequence can help us to "detect" lesions (most of the lesion have at least a voxel with an intensity higher than the threshold). However lesion voxel intensities are inhomogeneous and the "delineation" of the lesion is not simple even if a voxel of this lesion is known. For this reason, we enhance the contrast in the T2-FLAIR sequence before applying the threshold T (Figure 3, a  et b). This is realized with the algorithm 1. The application of T on the T2-FLAIR sequence with enhanced contrast gives a candidate lesion segmentation, S 1 .

Refinement using classification results
S 1 contains lesions but also others hyperintense signals like bony artefacts and flow artefacts. To eliminate the voxels corresponding to these false positives, a region of interest is defined. Like in [11,22], we are looking for lesions in the "supposed WM". This "supposed brain compartment" correspond to the WM that we should observed if there was no lesion in the sequence. A segmentation of this class can be approximated by the mask of WM (given by the EM algorithm) in which cavities (holes in the segmentation) have been filled. This is realized thanks to morphological operations. The application of this mask on S 1 gives a second preliminary lesion segmentation, S 2 .

Algorithm 1 Enhancing contrast algorithm.
Require: T2-FLAIR sequence, ima. Dima = a grey level dilation of ima Eima = a grey level erosion of ima Cima = an empty image of the same size of ima Return Cima, the T2-FLAIR sequence with enhanced contrast.
According to [6], lesions may be classified as outliers, GM/CSF partial volumes or "pure" GM in the segmentations given by the EM algorithm. Consequently, they are not included in the WM mask or in the "pure" CSF mask. Voxels belonging to one of these masks are removed from S 2 .
Finally, the holes that are present in the segmentation are filled thanks to morphological operations to improve lesion delineation (Figure 3, c).

Limitation of the region of interest
We used the property that lesions are hyperintense signals in the T2-FLAIR sequence. In fact, this is not correct for sub-tentorial lesions. In this case, lesion intensities are close to healthy tissues (GM, WM) intensities. Consequently, sub-tentorial lesions are not included in S 1 and S 2 in most of the cases. To avoid false positive in this region, all voxel of S 2 included in the sub-tentorial region are removed. The mask of the sus-tentorial region is perform with a locally affine registration [3] of an atlas [2] on the data, followed by morphological operations (Figure 4).

Results
Results of the method have been sent to the the Challenge managers. A comparison between the automatic segmentations and manual segmentation performed by two experts has been performed. Different criteria have been used to compare the different segmentations. This is a short description of them : • The volume difference captures the absolute percent volume difference to the expert rater segmentation.
• The average distance captures the symmetric average surface distance to the expert rater segmentation.
• The true positive rate corresponds to the number of lesions in the automatic segmentation that overlaps with a lesion in the expert segmentation divided by the number of overall lesions in the expert segmentation.
• The false positive rate is the number of lesions in the automatic segmentation that DON'T overlap with any lesion in the expert segmentation divided by the number of overall lesions in the automatic segmentation.
• The sensitivity is the ratio of true positives to the sum of true positives and false negatives.
• The specificity is the ratio of true negatives to the sum of true negatives and false positives.
• The positive predictive value is the ratio of true positives to the sum of true positives and false positives.
For the true positive rate and the false positive rate criteria, the unity is the lesion (a connected region of the segmentation). For sensitivity, specificity and positive predictive value, the unity is the voxel. Table 1 gives the results of our method on the test data of the challenge. Figure 5 present the Total score of each patient repartition with a box-and-whisker plot. Figure 6 gives an example of execution time for the patient CHB test1 Case01.  Step time (min) Atlas based image cropping 24 Skull-stripping 2 Intensity Normalization 30 Segmentation of the brain 16 Segmentation from T2-FLAIR 3 Refinement using classification results 3 Limitation of the region of interest 12 Total 96 (1h30) Figure 6: Example of execution time on CHB test1 Case01.

Discussion and future work
The analysis of individual criteria is complex because of their interdependencies. For example, the true positive rate give the number of lesion correctly detected but its value does not take into consideration the volume of the missed lesions. For this reason, our discussion is only based on the global scores.
The comparison of UNC rater and CHB rater is given to be around 90/100. Our method with a global result of 80/100 is near this inter-expert variability. The box-and-whisker plot ( Figure 2) shows that the method scores are between 66 and 91 except for one outlier. Indeed, our method result is only 46 for the patient UNC test1 Case09. This can be explained by the fact that this patient seems to have necrotic lesions, a lesion subtype which is not detected by the proposed method. Indeed, necrotic lesions are hypointense signals in T1 but are not visible in T2-FLAIR. In addition, no lesion is observed on the T2-FLAIR and the WM appear "dirty". Consequently, the proposed method detects a lot of false positives. This is also observed for UNC test1 Case04, CHB test1 Case06 and CHB test1 Case12. In these three cases, the score of our method is under 80.
Moreover, our method has a score under 80 for three other data sets: UNC test1 Case06, UNC test1 Case10 and CHB test1 Case03. Without the segmentation of reference, the best explanation seems to be that these data correspond to patients with few lesions. Therefore, any error of segmentation decreases dramatically the score. This assertion has to be further validated with a comparison between the automatic segmentation and the segmentation of reference.
STAPLE results show that our method is more specific than sensitive. This show an undersegmentation of the lesions. This effect can also be observed in Figure 3.
The proposed method needs around 1 hour 30 minutes to segment the lesion of the Challenge data ( Figure  6). This is often less than manual segmentation. Moreover, the method algorithm has not been optimized. method will also be included in SepINRIA 3 (a software allowing analysis of MS MRI).