Classification of Hippocampal Region using Extreme Learning Machine

Important brain structures such as the hippocampus are usually segmented manually by doctors. Hybrid approaches that combine machine learning with neuroimaging techniques have shown promising results in segmenting subcortical structures, and Extreme Learning Machine (ELM) has been reported to be a strong machine learning technique for such tasks. This study investigates the use of ELM to segment the hippocampus under various hidden node configurations, and compares the use of the full image against a region of interest (ROI). Bag of features is used as the feature extractor: the hippocampus is segmented from the MRI to obtain its visual words, which ELM then uses to learn its features. The results show that, with a suitable number of hidden nodes, ELM can achieve up to 100% accuracy for both the full image and the ROI in hippocampal segmentation.


Introduction
Biomedical imaging modalities have been of great help to doctors and the medical expert community. Medical imaging can be defined as a modality that delivers information about the volume beneath the skin [1]. Modalities such as Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET) and radiography, among many others, have been introduced to detect stroke, cancer, Alzheimer's disease and epilepsy. However, a problem arises when doctors need to make a further diagnosis, as these modalities only provide raw images to diagnose from [2]. This can be seen in cases where doctors delineate the hippocampus manually, which is time-consuming and can lead to errors [2,3].
Computer Aided Diagnosis (CAD) was introduced to address this problem. It can be described as the use of machine learning and image processing to process and interpret images in order to assist doctors and experts [4]. Segmenting subcortical structures is both a difficult and an important task in brain analysis, because this is where the hippocampus is located [5]. Many studies have shown that the hippocampus of a patient affected by epilepsy or Alzheimer's disease tends to be smaller than that of a healthy person [6]. Atlas-based segmentation, statistical models and machine learning are the result of research carried out in recent years to segment the hippocampus [6][7][8]. Machine learning techniques such as Support Vector Machine (SVM) [5], radial basis [9] and Artificial Neural Network (ANN) [10] have been widely used to learn the structure of the hippocampus. Although these techniques have been shown to achieve good results, ELM is regarded as superior from a machine learning point of view because it offers a unified framework in which the generalized decision learned from training data can be used both for binary classification and for predicting a scalar value in regression problems [11]. Therefore, this study investigates the use of ELM to segment the hippocampus under various hidden node configurations, using normal control subjects from the ADNI dataset as its data. Furthermore, this study addresses the use of the full image and of a region of interest (ROI) with ELM. Bag of features [12] is used as the feature extractor: the hippocampus is segmented from the MRI in order to obtain its visual words, which are then fed into ELM to learn its features.

Related Works
Recently, a great deal of research has been carried out on segmenting subcortical structures. Techniques such as atlas selection, statistical models and machine learning, which can be seen in Figure 1, are some of the outcomes of this extensive research [5][6][7].

Atlas Selection, Statistical Model and Machine Learning
One technique that is popular among researchers is atlas-based segmentation, which registers an atlas image to the subject image. The result of the registration is then used to map the coordinates of the structure of interest from the atlas image to the subject image [7]. Image registration is at the core of a wide variety of medical applications, including visualisation, image-guided surgery and voxel-based morphometry [6,7], which allows atlas-based segmentation to benefit from methodological advances driven by a wide range of application areas. However, its performance depends on the accuracy of the image registration and on the anatomical differences between the atlas and the target image [5].
Another technique is the statistical model. The idea is that a priori shape information is used as the learning set, so that the model can learn the variation in it and limit the search space to only the acceptable instances defined by the trained model [6]. This technique is widely used among researchers because expert knowledge can be captured in the form of training examples [13].
Machine learning has rapidly become a favourite method among researchers [8,14,15]. It is used not only to segment subcortical structures but also to detect tumours [6]. It can also handle large amounts of data, which matters when segmenting or classifying brain anatomy, as a lot of raw data is needed to obtain better results [8]. However, machine learning has limitations that can hinder its accuracy: it needs a lot of data to be accurate [6], and many experiments need to be run to find its optimal parameters.

SVM and neural network
The idea of SVM is that it separates the data in the feature space by looking for support vectors. Its good generalization ability is one of the reasons for the growing interest in applying SVM in neuroimaging [14]. It is also capable of classifying non-linearly separable data [6]. Nevertheless, SVM has several disadvantages that may degrade classification performance: it is difficult to determine its optimal parameters and to understand the structure of the algorithm.
In addition, SVM takes a long time to train compared to other machine learning approaches [6]. SVM is also not a sparse model, since the number of support vectors tends to grow with the size of the training set [9].
ANN, by contrast, is a data processing technique that can be characterised by its architecture (the pattern of connections between the neurons), its training or learning algorithm (the method of determining the weights on the connections) and its activation function [16]. Its performance depends strongly on the structure of the training set: if the training set is not well designed, classification results may suffer and overfitting may occur [6]. Nevertheless, researchers still opt for ANN because it is capable of both classification and regression and is tolerant of noise, as its training keeps improving the network until it finds suitable parameters. In addition, ANN can handle and classify more than one output [6].

Employed Techniques
This section briefly explains the techniques that have been employed in this study.

Spatial normalization
Spatial normalization is a step that transforms a subject's brain by establishing a one-to-one correspondence between brains, matching each to a standard brain form called a "template" [12]. The technique was developed mainly to simplify inter-subject comparisons by placing all subjects into a standardized stereotactic template space [17]. Spatial normalization is an important process in this research because human brains show large inter-individual variability; spatial normalization reduces this variability and eases comparison between subjects.
This permits an exploratory approach that looks for group effects across the entire brain, or a hypothesis-driven approach in which common ROIs are used across all subjects, avoiding the need to trace ROIs for each subject [18].

Bag of Feature
Bag of Feature (BoF) is an approach that represents the whole image or an ROI as a histogram of the occurrences of quantized visual features, also known as the "visual signature" of the image [12]. It is also one of the most popular methods for content-based visual information retrieval (CBVIR), especially in the medical field. One of the advantages of BoF is that it represents a direct identification of the features instead of their quantization, so this approach is able to classify whether an image represents hippocampus or non-hippocampus [12].

ELM
ELM is a supervised learning technique based on single-hidden layer feedforward neural networks (SLFN) [19]. The main idea of ELM is that the hidden node parameters do not need to be tuned, as they can be assigned random values. Unlike conventional neural networks, ELM does not need much parameter selection, which makes it appropriate for neuroimaging, and it trains quickly with good generalisation performance [20].
ELM can be formulated as:

$$f(x) = \sum_{i=1}^{L} \beta_i \, G(a_i, b_i, x) \qquad (1)$$

In the formula above, $a_i$ are the input weights and $b_i$ is the bias of the $i$-th hidden node; together they are the learning parameters of the $i$-th hidden node. $x$ is the input vector with $d$ dimensions and $\beta_i$ is the output weight from the $i$-th hidden node. $G(a_i, b_i, x)$ is the output of the $i$-th hidden node with respect to the input $x$, where $G$ is the activation function. The activation function can be a nonlinear continuous function such as the sigmoid or the Gaussian, among many others.
During the training phase, the hidden node and output node parameters must be determined. According to ELM theory, the hidden node parameters $a_i$ and $b_i$ are assigned random values regardless of the data and remain fixed afterwards. Consequently, $\beta_i$ is the only parameter that needs to be determined from the training data. Given training data $(x_j, t_j)$, $j = 1, \ldots, N$, where $x_j \in \mathbb{R}^d$ and $t_j \in \{-1, +1\}$, the training error is minimized in the least-squares sense using the cost function in Equation (2):

$$E = \sum_{j=1}^{N} \left( \sum_{i=1}^{L} \beta_i \, G(a_i, b_i, x_j) - t_j \right)^2 \qquad (2)$$

Equation (2) can be further simplified as shown in Equation (3):

$$E = \lVert H\beta - T \rVert^2 \qquad (3)$$

where $H = [h(x_1)^T, \ldots, h(x_N)^T]^T$ is the $N \times L$ hidden layer output matrix of the SLFN, $\beta = [\beta_1, \ldots, \beta_L]^T$ and $T = [t_1, \ldots, t_N]^T$. The $i$-th column of $H$ is the output of the $i$-th hidden node with respect to the inputs $x_1, x_2, \ldots, x_N$. $h(x)$ is defined as $h(x) = [G(a_1, b_1, x), \ldots, G(a_L, b_L, x)]$ and is called the hidden layer feature mapping. The $j$-th row of $H$ is the hidden layer feature mapping with respect to the input $x_j$. In this case, solving the linear system $H\beta = T$ for $\beta$ is equivalent to training the network. If the number of training samples equals the number of hidden nodes, $N = L$, then $H$ is a square matrix and, when it is invertible, $\beta$ can be found by calculating the inverse of $H$, so that zero training error is obtained. If $L < N$, then $H$ is not a square matrix and a solution can be found using the Moore-Penrose generalized inverse $H^{\dagger}$ of $H$, as given in Equation (4):

$$\hat{\beta} = H^{\dagger} T \qquad (4)$$

In binary classification problems, the decision function for ELM with one output node can be written in vector form from Equation (1) and is given in Equation (5):

$$f(x) = \operatorname{sign}\left( h(x)\, \hat{\beta} \right) \qquad (5)$$

where $\hat{\beta}$ is the estimated output weight vector in Equation (4) and $h(x)$ is a vector that maps the $d$-dimensional input space to the $L$-dimensional hidden layer feature space.
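The training procedure described in this section can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the sigmoid activation, the uniform range [-1, 1] for the random hidden parameters, the function names `elm_train` and `elm_predict`, and the synthetic two-class data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, t, n_hidden, rng):
    """Train a single-output ELM. X: (N, d) inputs, t: (N,) labels in {-1, +1}."""
    d = X.shape[1]
    # Hidden node parameters a_i, b_i are drawn at random and never updated
    # (the uniform range [-1, 1] is an assumption of this sketch).
    A = rng.uniform(-1.0, 1.0, size=(d, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ A + b)            # hidden layer output matrix, shape (N, L)
    beta = np.linalg.pinv(H) @ t      # least-squares output weights via Moore-Penrose inverse
    return A, b, beta

def elm_predict(X, A, b, beta):
    """Binary decision: sign of the hidden layer feature mapping times beta."""
    return np.sign(sigmoid(X @ A + b) @ beta)

rng = np.random.default_rng(0)
# Synthetic, well-separated two-class data (NOT the ADNI features).
X = np.vstack([rng.normal(-2.0, 0.5, size=(40, 5)),
               rng.normal(2.0, 0.5, size=(40, 5))])
t = np.concatenate([-np.ones(40), np.ones(40)])

A, b, beta = elm_train(X, t, n_hidden=20, rng=rng)
train_acc = float(np.mean(elm_predict(X, A, b, beta) == t))
print(f"training accuracy: {train_acc:.2f}")
```

Only `beta` is fitted; changing `n_hidden` is the single knob that the experiments in this study vary.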

Proposed Methodology
The objective of this study is to investigate the use of ELM to segment the hippocampus under various hidden node configurations, and to investigate the use of the full image and of a region of interest (ROI). The overall process is shown in the flowchart in Figure 2.

Fig. 2. Experiment methodology
The process starts with acquiring the dataset from the ADNI database, which can be downloaded directly from the adni.loni.usc.edu website. A total of 68 MRI scans of Normal Control (NC) subjects from the ADNI1: Annual 2 Year 1.5T dataset were downloaded from http://ida.loni.usc.edu; details can be seen in Table 1. Next comes the pre-processing step, in which all scans undergo spatial normalisation, which registers each MRI image to the template image. The normalised images are then fed into BoF to obtain their visual words, which are later used as the training and testing features for the ELM.

Spatial Normalisation
The first step is to correct the origin of the raw MRI data. The main purpose is to reposition the crosshair so that the subject's structural MRI can be normalised more accurately to the template's structural MRI, which in this case is in MNI152 space.

The next step is to realign and reslice. Fundamentally, realignment registers all images of a subject and generates parameter files that can be used later to correct for head motion. Realigning a subject's images matches them by applying a transformation to each scan. The realign function only allows rigid-body motion: translations in the X, Y and Z directions, and rotations. After realignment, the next step is reslicing. Reslice is a function that refines the images as needed, since an MRI scan may not have the same slice thickness as others, and thereby corrects for motion.
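To make the rigid-body motion concrete, the sketch below builds a 4x4 homogeneous matrix from three translations and three rotations and applies it to a point. The function name `rigid_matrix` and the rotation order (about X, then Y, then Z) are assumptions of this illustration, not a description of the realignment software used here.

```python
import numpy as np

def rigid_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 rigid-body transform: rotations (radians) about X, Y, Z, then translation.

    The composition order Rz @ Ry @ Rx is an assumption of this sketch.
    """
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    M = np.eye(4)
    M[:3, :3] = Rz @ Ry @ Rx
    M[:3, 3] = [tx, ty, tz]
    return M

# Rotate 90 degrees about Z, then translate by (1, -2, 0.5).
M = rigid_matrix(1.0, -2.0, 0.5, 0.0, 0.0, np.pi / 2)
point = np.array([1.0, 0.0, 0.0, 1.0])   # homogeneous coordinates
moved = M @ point
print(np.round(moved[:3], 3))
```

Because the matrix contains only rotation and translation, distances between points are preserved, which is what distinguishes realignment from the later normalisation step.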
The last step of spatial normalisation is to normalise the subject's MRI to the template's MRI, which places the subject's MRI in standardised MNI space. This step determines the transformation that minimises the difference between the two scans by minimising the sum of squares of the intensity differences. The result of the normalised MRI can be seen in Figure 3.

Fig. 3. Spatial normalisation result
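The sum-of-squares criterion used in the normalisation step can be illustrated with a toy example. The sketch below searches integer 2D translations only, on a synthetic image, which is a strong simplification of a real normalisation to MNI space; the function names `ssd` and `best_shift` are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared intensity differences between two images of equal shape."""
    diff = a.astype(float) - b.astype(float)
    return float(np.sum(diff ** 2))

def best_shift(subject, template, max_shift=3):
    """Exhaustive search over integer (dy, dx) shifts that minimises the SSD.

    Uses circular shifts (np.roll), which is a simplification for this toy case.
    """
    best, best_cost = (0, 0), ssd(subject, template)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(subject, shift=(dy, dx), axis=(0, 1))
            cost = ssd(shifted, template)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost

rng = np.random.default_rng(0)
template = rng.random((32, 32))                            # synthetic "template"
subject = np.roll(template, shift=(2, -1), axis=(0, 1))    # misaligned copy

shift, cost = best_shift(subject, template)
print(shift, cost)
```

The search recovers the shift that undoes the misalignment, at which point the cost drops to zero; real images never match exactly, so in practice the minimum is simply the smallest residual.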

Feature extraction using BoF
The next step is to segment hippocampus and non-hippocampus regions so that the segmented images can be fed into BoF to obtain their visual words. Extracting the visual words involves several steps: extracting the features, learning a "visual vocabulary", quantizing the features using the visual vocabulary, and finally representing each image by the frequencies of its visual words, as shown in Figure 4.
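The four BoF steps listed above can be sketched as follows. This is an illustration under stated assumptions: raw pixel patches stand in for the local descriptors, the vocabulary is learned with a plain k-means loop, and the images are random arrays. None of these choices, nor the function names, are taken from this study.

```python
import numpy as np

def extract_patches(image, size=4, step=4):
    """Step 1: flattened size x size patches as simple local descriptors (an assumption)."""
    h, w = image.shape
    patches = [image[y:y + size, x:x + size].ravel()
               for y in range(0, h - size + 1, step)
               for x in range(0, w - size + 1, step)]
    return np.array(patches, dtype=float)

def assign(descriptors, centers):
    """Step 3: quantize each descriptor to the index of its nearest visual word."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def learn_vocabulary(descriptors, k, rng, n_iter=20):
    """Step 2: learn k visual words with a plain k-means loop."""
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:          # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bof_histogram(image, centers):
    """Step 4: represent an image by the normalised frequencies of its visual words."""
    words = assign(extract_patches(image), centers)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
images = [rng.random((16, 16)) for _ in range(6)]     # synthetic stand-ins for image crops
all_desc = np.vstack([extract_patches(im) for im in images])
vocab = learn_vocabulary(all_desc, k=8, rng=rng)
hist = bof_histogram(images[0], vocab)
print(hist)
```

The resulting histogram has one entry per visual word and is the kind of fixed-length feature vector that a classifier such as ELM can take as input.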

Experiment, Result, and Discussion
This section discusses the results of the experiments conducted for hippocampal classification on 1) the full MRI image and 2) the ROI of the MRI image. The 68 NC MRI scans were pre-processed using spatial normalisation. From the pre-processed MRI, 2-4 hippocampus images were extracted from the sagittal, axial and coronal views, resulting in 140 extracted images of hippocampus and non-hippocampus. All of them were fed into bag of features to extract their feature vectors. Both experiments use 60% of the extracted images as the training set and the remaining 40% as the testing set. The first experiment tests different hidden neuron configurations on the full MRI image. The investigated numbers of hidden nodes are 10, 30, 50, 100, 300, 500, 800, 1000, 1200 and 1500, with 30 sets of experiments in which the images are ordered using a pseudorandom method. The results of the first experiment are tabulated in Table 4. They show that ELM can clearly distinguish between hippocampus and non-hippocampus images, with both training and testing accuracy reaching as high as 100%. This can also be seen in the bar chart of the average training and testing accuracy. In general, the more hidden neurons are applied, the higher the accuracy. In this experiment, 1000 hidden neurons is the optimum setting, because the average testing accuracy decreases slightly when the number of hidden neurons is increased to 1200, although it rises again at 1500.
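The data split and the size of the sweep described above can be checked with a short sketch. It assumes that the 30 sets of experiments are repeated for every hidden node setting and that the pseudorandom ordering is a seeded permutation; both are interpretations of the text, and `split_indices` is a hypothetical helper.

```python
import numpy as np

def split_indices(n_samples, train_fraction, seed):
    """Pseudorandomly order the samples and split them into train and test indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

hidden_nodes = [10, 30, 50, 100, 300, 500, 800, 1000, 1200, 1500]
n_runs = 30          # assumed: 30 pseudorandom orderings per hidden node setting

train_idx, test_idx = split_indices(140, 0.6, seed=0)
n_fits = len(hidden_nodes) * n_runs

print(len(train_idx), len(test_idx), n_fits)
```

With 140 extracted images, a 60/40 split gives 84 training and 56 testing images, and under the assumption above each experiment involves 300 ELM fits whose accuracies are then averaged per hidden node setting.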

In the second experiment, the segmented images of both hippocampus and non-hippocampus are classified using the ROI of the MRI image, with a parameter setup similar to that given in Table 3. Experiment 2 uses the same configuration as Experiment 1, which can be seen in Table 1 and Table 2. The results of the second experiment are tabulated in Table 5. They show that Experiment 2 gives better results than Experiment 1, because its average testing accuracy is higher. The performance of Experiment 1 and Experiment 2 is compared in Figure 5. The optimum number of hidden neurons for Experiment 2 is 500, because beyond that the testing accuracy remains the same as the number of hidden neurons increases.

However, it is worth mentioning that ELM could not give a good classification result when the feature values of hippocampus and non-hippocampus images are almost the same. This can be seen in the results of Experiment 2 for 500 to 1500 hidden neurons, where one of the test images is misclassified. One likely reason is that its visual words are almost the same as those of non-hippocampus images; Figure 6 compares the visual words of the misclassified image with non-hippocampus visual words.
Aside from that, this study shows that ELM can be applied to neuroimaging problems, especially the segmentation of subcortical brain structures. Several hidden node configurations for ELM have also been investigated. From the results of Experiments 1 and 2, this study deduces that ELM can readily distinguish between hippocampus and non-hippocampus images, since the results of both experiments reach as high as 100%. The experiments also suggest the hypothesis that accuracy is influenced by the number of hidden neurons applied. This indicates the good generalisation performance of ELM, as only the number of hidden neurons needs to be adjusted to increase its accuracy. In future work, ELM could be extended to structured-ELM so that it can be applied to more complicated problems, especially hippocampal segmentation.

Table 2 shows how the experiments were set up, while the parameters for both experiments can be seen in Table 3.
Table 2. Distribution of the dataset for training and testing

Table 5. Experiment 2 result (same configuration as Experiment 1; see Table 1 and Table 2)