
1 Introduction

The segmentation of medical images, including Magnetic Resonance (MR) brain images, has been an important research topic for the past three decades. Its main areas of application are diagnosis, computer-integrated surgery, and treatment planning [1]. The distinct advantage of an efficient and fully automated evaluation of MR brain images is that it reduces human error during the examination.

MR brain image segmentation generally aims to divide the brain into three tissue classes, namely white matter (WM), grey matter (GM) and cerebrospinal fluid (CSF), for functional visualization in the diagnosis of diseases such as cancer and neurological disorders [2]. To make the original image easier to analyze, the segmentation process transforms the original image representation into another, more meaningful format [3]. Fundamentally, image segmentation defines the boundaries between the brain tissues and assigns a label to every pixel in the image. The assigned labels simplify the categorization and grouping of brain tissues, since pixels that share the same label also share a particular computed characteristic such as texture, intensity, shape, or color [3].

The development of image segmentation methods based on clustering has been the focus of many research studies in recent years, as clustering has been found to support effective segmentation. k-means, Fuzzy C-means (FCM), the Gaussian Mixture Model (GMM), and the Self-Organizing Map (SOM) are some of the popular clustering techniques. Each of these methods has its own advantages and limitations [4]. For example, even though the k-means approach is fast and simple to use, it is highly influenced by outlier points and is sensitive to cluster initialization [5]. FCM, on the other hand, attains relatively better outcomes for overlapping classes and is less sensitive to initialization, but it has a high computational cost [6]. Clearly, it is challenging to find a single clustering approach that combines all the strengths of the existing approaches.

This paper presents a fully automatic segmentation approach for MR images that integrates two or more clustering methods with a neural network model in order to retain the salient features of those clustering methods. First, a superpixel algorithm is applied to divide the image into objects. Then, two clustering methods are applied to produce initial segmentation results. Finally, a neural network model is trained by combining the outputs of the clustering algorithms with pixel intensity features.

The rest of this paper is organized as follows. The related work is described in Sect. 2. Section 3 presents details of the proposed method. Section 4 presents the experimental results, and the conclusion is given in Sect. 5.

2 Related Work

In recent years, many methods have been introduced in the area of medical image segmentation, including fully automatic methodologies for brain image segmentation. For instance, Kalaiselvi and Somasundaram [7] applied FCM to brain tissue segmentation, aiming to reduce the computational cost by using image histogram information to initialize the centroid values. The main drawback of that method is its iterative nature, which makes the whole segmentation process computationally complex [8].

In [9], an automatic brain segmentation method using k-means clustering was introduced to detect the shape and extent of tumors. A median filter was applied to remove artifacts and sharpen the edges of the image. Then, k-means clustering was applied with randomly initialized centroid values. To predict and calculate the tumor region in the image, a binary mask was applied to identify the class with high contrast values.

Ortiz et al. [10] proposed a fully unsupervised and automated method to segment MR brain images using Self-Organising Maps (SOMs) and Genetic Algorithms (GAs). The method combined five steps: image acquisition and preprocessing, feature extraction, feature selection using genetic algorithm, voxel classification using SOM, and sharp map clustering.

To combine the advantages of two or more of the existing clustering methods, we develop an automatic segmentation technique that integrates the clustering methods using a neural network model. The next section presents our proposed method.

3 The Proposed Method

The proposed method consists of two main phases: training and testing. The steps of each phase are described below.

3.1 Training Phase

  1. Pre-processing step: This step mainly aims to improve the MR segmentation of the under-represented class, i.e., the cerebrospinal fluid (CSF) class. It comprises two operations: image scaling and image resizing.

    ■ Image scaling: The contrast of the image is improved by scaling the pixel intensity distribution to cover the range 0 to 255. Equation 1 is applied to generate the new image.

    $$\begin{aligned} \text {Scaled pixel} = a + (\text {pixel} - c) \times \frac{b - a}{d - c} \end{aligned}$$
    (1)

    where a and b are the target minimum and maximum gray levels respectively, while c and d are the minimum and maximum gray levels of the original image.

    ■ Image resizing: The dimensions of the original and ground truth images are expanded from (\(256\times 256\)) to (\(512\times 512\)) by duplicating each pixel into a block of size (\(2\times 2\)). In contrast to WM and GM, CSF segmentation accuracy is poor in the majority of existing algorithms, because the CSF class occupies only a small area, with an average ratio of 2.10 in the brain image. There is therefore a risk that the CSF class is merged with other classes when the superpixel algorithm is applied; enlarging the image reduces this risk.
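
    A minimal Python sketch of this pre-processing step is given below, assuming the slice is held in a NumPy array of 8-bit gray levels; the function names and the use of NumPy are illustrative choices, not part of the original implementation.

```python
import numpy as np

def scale_intensity(img, a=0, b=255):
    """Stretch the pixel intensity distribution to [a, b] (Eq. 1)."""
    c, d = float(img.min()), float(img.max())   # original min/max gray levels
    return a + (img.astype(np.float64) - c) * ((b - a) / (d - c))

def duplicate_pixels(img):
    """Expand a 256x256 slice to 512x512 by copying every pixel into a
    2x2 block, so that the small CSF regions survive superpixelling."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
```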

  2. Pre-segmentation:

    ■ Simple Linear Iterative Clustering (SLIC) superpixels [11]: The superpixel algorithm is applied to over-partition the scaled image into objects. The first step is to define the required number of superpixels (k). The initial superpixel cluster centers are \(C_j = [I_j, x_j, y_j]^T\) with \(j=\lbrace 1,2,\cdots ,k\rbrace \), where \(I_j\) is the pixel intensity and \((x_j,y_j)\) are the coordinates of the center. The initial cluster centers are sampled on a regular grid spaced S pixels apart. To obtain superpixels of approximately equal size, the grid spacing is set to \(S = \sqrt{\frac{N}{k}}\), where N is the total number of image pixels. The centers are then moved to the lowest-gradient position in a \(3\times 3\) neighborhood to avoid centering a superpixel on an edge or on a noisy pixel. Pixels are grouped according to intensity similarity, and after each iteration the pixels assigned to a cluster are used to update its center value. This process is repeated until the superpixel centers converge.

    ■ Background removal: The over-partitioned image is divided into background and object regions based on Eq. 2.

    $$\begin{aligned} g_{(x,y)} = \begin{cases} 1 &{} \text {if } f_{(x,y)} > Th\\ 0 &{} \text {otherwise} \end{cases} \end{aligned}$$
    (2)

    where \(g_{(x,y)}\) is the segmented binary image, \(f_{(x,y)}\) is the original pixel value and Th is the threshold value, which is set to zero.
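
    The pre-segmentation step can be sketched as follows. The scikit-image implementation of SLIC (assuming scikit-image ≥ 0.19, where `channel_axis=None` selects grayscale mode) stands in for the algorithm of [11]; the number of superpixels is a placeholder value, and the majority-vote rule used to discard background superpixels is our own reading of Eq. 2 with Th = 0.

```python
import numpy as np
from skimage.segmentation import slic

def presegment(img, n_superpixels=400, th=0):
    """Over-partition the scaled image into superpixels and keep only
    the superpixels that lie on the brain (foreground) region."""
    labels = slic(img, n_segments=n_superpixels, channel_axis=None)
    foreground = img > th                      # Eq. 2 with Th = 0
    object_ids = [l for l in np.unique(labels)
                  if foreground[labels == l].mean() > 0.5]
    return labels, object_ids
```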

  3. Feature extraction: For each object, pixel intensity features are extracted and fed simultaneously to the two clustering techniques. Because the over-segmented objects differ in size, a fixed-length descriptor is needed; therefore, the five most frequent pixel intensities are extracted for every object.
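
    A short sketch of this feature extraction step, under the assumption that the scaled intensities lie in the range 0 to 255:

```python
import numpy as np

def object_features(img, labels, object_ids, n_peaks=5):
    """Return, for every object, the five most frequent pixel
    intensities as a fixed-length feature vector."""
    feats = []
    for obj in object_ids:
        values = np.rint(img[labels == obj]).astype(np.int64)
        counts = np.bincount(values, minlength=256)
        feats.append(np.argsort(counts)[::-1][:n_peaks])  # top-5 intensities
    return np.array(feats)                                # shape (n_objects, 5)
```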

  4. Clustering techniques: The image is partitioned into groups by the base clustering techniques. The extracted features and the number of classes are the inputs to each clustering method. Note that there are three classes in the MR brain image, excluding the background. k-means and the Self-Organizing Map (SOM) are chosen to produce the segmented images. A brief description of each of these algorithms is given below.

    ◦ k-means clustering: k-means is one of the most popular unsupervised clustering techniques. It separates the input data into groups based on their distance from each other, and it is simple and fast to apply to images with a large number of data points [12]. The k-means method is described in Algorithm 1; it minimizes the objective function in Eq. 3 and updates each cluster center using Eq. 4.

    $$\begin{aligned} J = \sum _{i=1}^{n}\sum _{j=1}^{k} \Vert x_i - c_j\Vert ^ 2 \end{aligned}$$
    (3)

    where n is the number of data points \(\left( x_{1}, x_{2},\cdots , x_{n} \right) \), and k is the number of cluster centers.

    $$\begin{aligned} c_j = \frac{1}{m_j} \sum _{x \in c_j} x \end{aligned}$$
    (4)

    where \(c_j\) is the \(j^{th}\) cluster center and \(m_j\) is the number of data points (x) assigned to the \(j^{th}\) cluster.

    Algorithm 1. The k-means clustering algorithm.
    ◦ Self-Organizing Map (SOM): SOM [13] is an unsupervised learning neural network. It has a feed-forward structure and, unlike other types of neural network, does not require the target output to be specified, because it divides the data points into groups by learning from the data itself instead of constructing a rule set. The map contains two layers: an input layer and an output (or competitive) layer. In the input layer, each node corresponds to a single input datum, while the output layer is organized as a two-dimensional grid of competitive neurons. Each input node is connected to every output node by an adjustable weight vector, which is updated in each iteration.

    The winner neuron, also known as the Best Matching Unit (BMU), is determined at each iteration by selecting the minimum Euclidean distance between the input data and the weight vectors. SOM also utilizes a neighborhood function, so when a node wins a competition, its neighbors are also updated. Let \(X = \lbrace x_1, x_2,\cdots , x_N \rbrace \) be the input data of size \(N\times 1\) and \(w_{ij}\) the weight vector connecting the input node \(x_i\) to output node j. The winner neuron c is computed as \(c = \arg \min _j \Vert x_i - w_{ij}\Vert \). The winner neuron and its neighbors are updated using Eq. 5.

    $$\begin{aligned} w_{ij} (t+1) = w_{ij}(t) + H_{ci} \Big [x_i(t) - w_{ij}(t) \Big ] \end{aligned}$$
    (5)

    where \(w_{ij}(t)\) and \(w_{ij}(t+1)\) are the old and the new adjusted weights for the node \(x_i\) respectively, t is the iteration number of the training process, \(x_i(t)\) is the input data at iteration t, and \(H_{ci}\) is the neighborhood function for the winner neuron c, which is calculated using Eq. 6.

    $$\begin{aligned} H_{ci} = \alpha \exp \left( -\frac{\Vert r_c - r_i\Vert ^2}{2\sigma (t)^2}\right) \end{aligned}$$
    (6)

    where \(\alpha \) is the learning rate, \(r_i\) and \(r_c\) are the positions of node i and of the winner node c in the topological map (output space), respectively, \(\Vert r_c - r_i\Vert \) is the distance between node i and the winning neuron, and \(\sigma (t)\) is the neighborhood radius (search distance) at iteration t.
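
    The two base clusterings can be sketched on the object feature matrix as follows. The scikit-learn `KMeans` estimator stands in for Algorithm 1 (Eqs. 3-4), and the small NumPy SOM below implements the updates of Eqs. 5 and 6 on a one-dimensional map with one neuron per class and a fixed learning rate; both are illustrative simplifications rather than the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_labels(features, n_classes=3, seed=0):
    """Base clustering 1: k-means (Algorithm 1, Eqs. 3-4)."""
    return KMeans(n_clusters=n_classes, n_init=10,
                  random_state=seed).fit_predict(features)

def som_labels(features, n_classes=3, n_iter=1000, alpha=0.5, sigma=1.0, seed=0):
    """Base clustering 2: a 1-D SOM trained with Eqs. 5-6; every object
    is labelled by the index of its best matching unit (BMU)."""
    rng = np.random.default_rng(seed)
    n = len(features)
    w = features[rng.choice(n, size=n_classes, replace=False)].astype(float)
    pos = np.arange(n_classes, dtype=float)                # neuron positions r_i
    for t in range(n_iter):
        x = features[rng.integers(n)].astype(float)
        c = int(np.argmin(np.linalg.norm(x - w, axis=1)))  # BMU (winner neuron)
        h = alpha * np.exp(-(pos - pos[c]) ** 2 / (2.0 * sigma ** 2))  # Eq. 6
        w += h[:, None] * (x - w)                          # Eq. 5
    return np.argmin(np.linalg.norm(features[:, None, :].astype(float) - w[None],
                                    axis=2), axis=1)
```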

  5. Back Propagation Neural Network (BPNN): BPNN is the simplest technique for the supervised training of multi-layered neural networks. It adjusts the weight values by estimating the non-linear relationship between the input and the output. The back-propagation technique generalizes the Least Mean Square (LMS) algorithm by modifying the network weights so as to minimize the mean squared error between the network outputs and the target outputs. Two sets of inputs are used to train the supervised network: the extracted features and the outputs of the two clustering techniques. The BPNN is also provided with the target outputs (ground truth). We used the first five images of the dataset for training. Thus, a model for brain tissue segmentation that combines the clustering techniques is learned from the training data.
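
A hedged sketch of this combiner network, using scikit-learn's `MLPClassifier` as a stand-in for the BPNN (its solver is a stochastic-gradient variant of back-propagation, and the hidden-layer size here is an arbitrary placeholder, not a value reported in the paper):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_combiner(features, km, som, true_labels):
    """Train the network on the extracted features plus the outputs of
    the two base clusterings; targets are the ground-truth tissue
    labels (WM, GM, CSF) of each training object."""
    X = np.column_stack([features, km, som])
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    return net.fit(X, true_labels)
```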

3.2 Testing Phase

The testing phase combines the two clustering methods and the NN to achieve better segmentation, particularly for the CSF region. It predicts the final segmentation result by fusing the outputs of the two clustering methods. This fusion is implemented by the trained NN model, which receives two sets of inputs, i.e., the extracted features and the outputs of the two clustering techniques, and predicts the class label of each superpixel.

The implementation of the testing process is explained in the following steps:

  1. Pre-processing: The test image is scaled to enhance its contrast. Then, the image dimensions are doubled to help in classifying the CSF region.

  2. Pre-segmentation: The pre-processed image is divided into objects using the SLIC superpixel algorithm. Then, the background is excluded using a global threshold value.

  3. Feature extraction: For each object, we extract the five most frequent pixel intensities.

  4. Clustering techniques: k-means and SOM are applied individually to divide the brain image into groups. The extracted features and the number of classes are the inputs to each clustering algorithm.

  5. Neural network: The trained NN predicts the segmented image from the extracted features and the outputs of the two clustering methods.
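
Putting the steps above together, the testing phase might look like the following sketch, which reuses the helper functions from the training-phase sketches (all of them illustrative assumptions, not the authors' code) and assumes the network was trained with integer tissue labels 0, 1, 2.

```python
import numpy as np

def segment_image(test_img, net, n_superpixels=400):
    """Testing phase: pre-process, pre-segment, extract features, run the
    two base clusterings, then let the trained network assign a tissue
    label to every superpixel."""
    img = duplicate_pixels(scale_intensity(test_img))       # step 1
    labels, objects = presegment(img, n_superpixels)        # step 2
    feats = object_features(img, labels, objects)           # step 3
    X = np.column_stack([feats,
                         kmeans_labels(feats),
                         som_labels(feats)])                 # step 4
    pred = net.predict(X)                                    # step 5
    out = np.zeros_like(labels)
    for obj, cls in zip(objects, pred):
        out[labels == obj] = cls + 1                         # 0 is background
    return out
```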

4 Experimental Results

The performance of the proposed method is evaluated using the Internet Brain Segmentation Repository (IBSR), which is made available by the Center for Morphometric Analysis, Massachusetts General Hospital (http://www.cma.mgh.harvard.edu/ibsr). The IBSR dataset contains a three dimensional T1-weighted MRI brain data set for 20 normal subjects and the corresponding manual segmentation, performed by a trained expert [14]. We utilize this manual segmentation of WM, GM, and CSF as the ground truth to evaluate our results.

We used the Jaccard Similarity (JS) metric to evaluate the spatial overlap between the ground truth and segmented images. It is computed as the ratio between the intersection and the union of the segmented and ground truth images, as defined in Eq. 7.

$$\begin{aligned} JS = \frac{\mid S_1 \cap S_2\mid }{\mid S_1 \cup S_2\mid } \end{aligned}$$
(7)

where \(S_1\) is the ground truth image and \(S_2\) is the segmented image.

Table 1. Jaccard similarity values of k-means, SOM and our proposed method using slice number 20.

Table 1 presents the results of Jaccard Similarity using k-means, SOM and our proposed method. It is observed that the proposed method achieved a higher degree of similarity for all three classes (WM, GM and CSF) compared to the other two clustering methods.

We also used the Dice Similarity Coefficient (DSC), a statistical validation metric that evaluates the accuracy of segmentation methods. It is defined in Eq. 8.

$$\begin{aligned} DSC = \frac{2 \mid S_1 \cap S_2\mid }{\mid S_1\mid + \mid S_2\mid } \end{aligned}$$
(8)
Table 2. Dice similarity coefficient values of k-means, SOM and our proposed method using slice number 20.

Table 2 shows the Dice Similarity Coefficient values obtained using the two base clustering methods and our proposed method. The proposed method also achieved better segmentation than the other methods under this measure.

The outcome of the segmentation method is also evaluated using the Root Mean Square Error (RMSE), a statistical metric that measures the difference between the ground truth and segmented images. It is computed using Eq. 9.

$$\begin{aligned} RMSE =\sqrt{\frac{\sum _{i=1}^{W}\sum _{j=1}^{H} \big (S_2(i,j) - S_1(i,j)\big )^2}{W\times H}} \end{aligned}$$
(9)

where W and H are the width and height of the ground truth (\(S_1\)) and segmented (\(S_2\)) images.
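
For completeness, a small sketch of the three evaluation metrics is given below, computed per tissue class on binary masks (our assumption about how the values in Tables 1-3 were obtained):

```python
import numpy as np

def jaccard(s1, s2):
    """Eq. 7: |S1 ∩ S2| / |S1 ∪ S2| for two binary masks."""
    s1, s2 = s1.astype(bool), s2.astype(bool)
    return np.logical_and(s1, s2).sum() / np.logical_or(s1, s2).sum()

def dice(s1, s2):
    """Eq. 8: 2|S1 ∩ S2| / (|S1| + |S2|)."""
    s1, s2 = s1.astype(bool), s2.astype(bool)
    return 2.0 * np.logical_and(s1, s2).sum() / (s1.sum() + s2.sum())

def rmse(s1, s2):
    """Eq. 9: root mean square error between ground truth and segmentation."""
    diff = s2.astype(np.float64) - s1.astype(np.float64)
    return np.sqrt(np.mean(diff ** 2))
```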

Table 3 shows the experimental results of the proposed method compared to the k-means and SOM. It can be observed from Table 3 that the proposed method obtained better results compared to the other methods in terms of RMSE values.

Table 3. Root mean square error of k-means, SOM and our proposed method using slice number 20.

Figure 1(a) and (b) show the ground truth and segmented images, respectively. The resulting image exhibits a high degree of similarity with the manually segmented image.

Fig. 1. Subject 202-3, slice 20: (a) ground truth and (b) proposed method.

5 Conclusions

This paper introduced a segmentation method for MR brain images based on clustering techniques and a neural network. The method comprises two stages: training and testing. Both stages start with a pre-processing step to improve the image contrast, as the brain structures are not characterized by unique intensities in MR images. The brain image is then partitioned into objects using the SLIC superpixel algorithm. The training stage trains a NN model that is fed with features extracted from the frequencies of pixel intensities together with the outputs of the two base clustering methods. Our method achieved better results than the base clustering methods according to the three measures of JS, DSC and RMSE.