Discriminating brain activated area and predicting the stimuli performed using artificial neural network


In this work, a Multilayer Perceptron (MLP) implementation applied to functional Magnetic Resonance Imaging (fMRI) data is used to infer the stimuli performed. Sets of brain activation images were generated by visual, auditory and finger tapping paradigms in 54 healthy volunteers. These images were used to train the MLP network in a leave-one-out manner, so that the paradigm a subject performed is predicted from images not yet seen by the network. The aim of this paper is to explore the influence of the number of Principal Components (PCs) on the performance of the MLP in classifying fMRI paradigms. The classifier's performance was evaluated in terms of sensitivity, specificity, prediction accuracy and the area A_z under the receiver operating characteristic (ROC) curve. From the ROC analysis, values of A_z up to 1 were obtained with 60 PCs in discriminating the visual paradigm from the auditory paradigm.

Introduction
Functional magnetic resonance imaging (fMRI) is a non-invasive imaging technique that can be effectively used to map different sensory, motor and cognitive functions to specific regions in the brain. It provides an open window onto the brain at work, giving relevant insight into the neural basis of brain processes (HARDOON, 2005). By recording changes in cerebral blood flow as a subject performs a mental task, fMRI shows which brain regions activate when a subject makes movements, hears or smells something, sees someone, thinks and so forth (HARDOON, 2005). Several researchers consider fMRI data to be extremely rich in signal information but poorly characterized in terms of signal and noise structure (ROBINSON, 2004). Over the last few decades, fMRI research has benefited from advances in interrelated fields such as machine learning, data mining, and statistics, which enhance the ability to extract and characterize subtle features in data sets from a wide variety of scientific fields (ROBINSON, 2004 et al, 2007).
Hemoglobin exists in two forms, deoxyhemoglobin and oxyhemoglobin. The first one is paramagnetic, so it is attracted by a magnetic field. The second one is diamagnetic, that is, it is slightly repelled by a magnetic field and does not retain magnetic properties once the external field is removed (GIACOMANTONE, 2005; ERCC, 2007). One example of contrast imaging is the Blood Oxygen Level Dependent effect, whose acronym is BOLD, in which the presence of oxyhemoglobin in a tissue produces a difference in susceptibility between the tissue and the neighboring area. That is, regions with a high concentration of oxyhemoglobin (the tissue) appear brighter in the image than regions with a low concentration (the neighboring area) (AMARO and BARKER, 2006). The temporal evolution of the BOLD effect is shown in figure 1.

Paradigm in fMRI
According to AMARO and BARKER (2006), a paradigm in fMRI is the construction, temporal organization and behavioral predictions of the cognitive tasks performed by a subject during an fMRI experiment. Typical examples of fMRI paradigms are the visual, auditory and finger tapping paradigms.

fMRI scan
An fMRI scan measures the BOLD response at every point, or voxel (volume element), of a three-dimensional image. A simple fMRI scan can collect three-dimensional images (BOLD images) of the whole brain, with approximately 10,000 to 15,000 voxels, every 1-3 s (MITCHELL et al, 2004; AMARO and BARKER, 2006). These BOLD images result from a series of cognitive tasks (the paradigm) performed by a subject inside the scanner (AMARO and BARKER, 2006). They show changes in the brightness level of certain cerebral areas, proportional to the underlying activity and associated with the BOLD effect. The areas in which the brightness changes in response to a specific paradigm can be identified using statistical analyses or pattern recognition techniques (AMARO and BARKER, 2006).

Pattern classification
Here, we summarize only the concepts of MLP-based classification that are essential for describing its application to fMRI.
A full description of the MLP can be found in HAYKIN (1999). An MLP is a kind of Artificial Neural Network (ANN), assembled from a group of processing units (neurons) that are interconnected with varying synaptic weights. MLPs can be applied to many areas within biology and neuroscience (HAYKIN, 1999; PETERS et al, 2001), including fMRI data (MCKEOWN, 1998; MISAKI and MIYAUCHI, 2006). The popularity of the MLP is primarily a result of its apparent ability to make decisions and draw conclusions when dealing with complex problems defined in a "noisy environment", when the information used in the learning process is not sufficient to conduct the training, or when the network has to adapt its behavior to the nature of the information used in the training (HAYKIN, 1999). In neuroimaging, the MLP has been applied to data classification and pattern recognition in order to facilitate the diagnosis of pathological anomalies (diseases) and to investigate functional activities of the brain.

MLP Architecture
The type of MLP used in our studies has three layers of neurons with adjustable synaptic weights and biases. The first and the third are the input and output layers, respectively, and between them there is a layer of hidden neurons. Each input neuron is connected to each hidden neuron by a synaptic weight. Similarly, each hidden neuron is connected to each output neuron by another group of synaptic weights (PETERS et al, 2001).
The neuron model includes:
• Input signals which, weighted by the corresponding synaptic weights, are summed with the other input signals in a linear combination;
• An activation function that limits the amplitude of the output signal.
The activation function, ϕ(.), defines the neuron output in terms of the activity level at its input and gives the MLP its nonlinear character. Examples of activation functions are given in HAYKIN (1999). The output is the value of the activation function evaluated at the linear combination of the input signals. An external threshold θ_k, that is, an offset from the normal output, can also be present.
In figure 2, the sequences x_1, x_2, …, x_p and w_k1, w_k2, …, w_kp are the input signals and synaptic weights, respectively.
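The neuron model described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the logistic activation is only one common choice, the sign convention for the threshold θ_k (subtracted from the weighted sum) is an assumption, and the function names and example weights are hypothetical.

```python
import math

def logistic(v):
    """Logistic sigmoid: one common activation function (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(x, w, theta=0.0, phi=logistic):
    """Output of neuron k: phi(sum_j w_kj * x_j - theta_k).

    Subtracting theta is an assumed sign convention for the threshold.
    """
    v = sum(w_j * x_j for w_j, x_j in zip(w, x)) - theta
    return phi(v)

def mlp_forward(x, hidden_w, hidden_theta, out_w, out_theta):
    """Forward pass through one hidden layer and one output layer."""
    hidden = [neuron_output(x, w, t) for w, t in zip(hidden_w, hidden_theta)]
    return [neuron_output(hidden, w, t) for w, t in zip(out_w, out_theta)]

# Hypothetical example: 2 inputs, 3 hidden neurons, 1 output neuron.
x = [1.0, 2.0]
hidden_w = [[0.5, -0.25], [0.1, 0.2], [-0.3, 0.4]]
hidden_theta = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2]]
out_theta = [0.05]
y = mlp_forward(x, hidden_w, hidden_theta, out_w, out_theta)
```

Because the logistic function is bounded, every neuron output lies strictly between 0 and 1, which is the amplitude-limiting role of ϕ(.) mentioned above.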

Training method

Dimensionality reduction
It is hard to classify high-dimensional fMRI volumes into visual, auditory and finger tapping (left and right) paradigms. Each of the 54 brain activation images is 256x78 pixels and is converted into a feature vector of length 19968. A dimensionality reduction is therefore needed to decrease the computational effort normally required to discriminate data of this kind.
The PCA formulation was used as a dimensionality reduction technique. The resulting compressed image is the one whose feature vector has had as many of the less significant components removed as possible, so that only the leading principal components remain (SMITH, 2002). The image compression rate can therefore be quantified by the number of PCs chosen: the fewer the PCs, the more compressed the final image. In our studies, compressed images with 10 to 60 PCs were obtained.
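As an illustration of this step, the following sketch projects feature vectors of length 19968 onto their leading principal components using a singular value decomposition. It is a sketch under stated assumptions, not the procedure used in this work: the data are random stand-ins, the number of rows (80) is arbitrary, and `pca_reduce` is a hypothetical helper name.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the leading principal components.

    Returns the scores, the principal directions (one per row) and the mean.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Economy SVD: the rows of Vt are principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, Vt.shape[0])  # cannot keep more PCs than rows of X
    components = Vt[:k]
    scores = Xc @ components.T
    return scores, components, mean

# Synthetic stand-in data: each row plays the role of one flattened 256x78 image.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 256 * 78))

scores_10, components_10, _ = pca_reduce(X, 10)  # strongest compression
scores_60, components_60, _ = pca_reduce(X, 60)  # weakest compression
```

Each image is then represented by 10 to 60 numbers instead of 19968, and these reduced vectors are what a classifier would receive as input.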

Pattern recognition
The pattern recognition step can be organized in two sessions:
• The training session;
• The test session.
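The two sessions can be combined in a leave-one-out loop, where each image in turn is held out for testing and the remaining images are used for training. The sketch below shows only the structure of that loop. To keep it short, a nearest-centroid rule stands in for the MLP, so it is explicitly not the classifier used in this work; the data are synthetic and all function names are hypothetical.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (NOT the MLP): choose the class with the closest mean."""
    labels = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
    distances = np.linalg.norm(centroids - x, axis=1)
    return labels[np.argmin(distances)]

def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Train on all samples but one, test on the held-out sample, and repeat."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i          # training session: all but sample i
        hits += int(predict(X[keep], y[keep], X[i]) == y[i])  # test session
    return hits / len(X)

# Synthetic two-class data with well separated class means.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(10, 5)),
               rng.normal(5.0, 1.0, size=(10, 5))])
y = np.array([0] * 10 + [1] * 10)
accuracy = leave_one_out_accuracy(X, y)
```

Any classifier with the same `predict(X_train, y_train, x)` signature, including a trained MLP, could be passed in place of the stand-in.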

The ROC curve
In this section, the classifier performance is evaluated in terms of the area A_z under the ROC curve (METZ, 1986; WOODS and BOWYER, 1997). For a specific number of PCs, one ROC curve plots the ability of the MLP to separate the visual paradigm from the auditory paradigm (figures 4 or 6), and another plots the discrimination between right finger tapping and left finger tapping (figures 5 or 7).
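To make the quantity concrete, the sketch below builds an empirical ROC curve by sweeping a decision threshold over classifier scores and integrates it with the trapezoidal rule. This is an assumption made for illustration: the cited literature may estimate A_z by fitting a smooth curve instead, and the scores, labels and function names here are hypothetical.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs from a threshold sweep.

    labels use 1 for the positive class and 0 for the negative class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / neg, tp / pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores: label 1 could stand for visual, label 0 for auditory.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
a_z = area_under_curve(roc_points(scores, labels))

# Perfectly separated scores give an area of 1.
a_z_perfect = area_under_curve(roc_points(scores, [1, 1, 1, 0, 0, 0]))
```

An area of 1 means every positive sample scores higher than every negative sample, while 0.5 corresponds to chance-level discrimination.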

Classifier performance in terms of sensitivity and specificity
The results in Table 2 are similar to those found with the visual and auditory paradigms. In any case, performance improves as the number of PCs increases from 10 to 60, that is, as the image compression rate decreases.
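For reference, sensitivity, specificity and prediction accuracy follow directly from the counts of correct and incorrect predictions. The sketch below uses made-up labels, so the numbers are not results from this work, and the choice of which paradigm counts as the positive class is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a two-class problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tp, fn, tn, fp

def performance(y_true, y_pred, positive=1):
    """Sensitivity, specificity and prediction accuracy."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn)   # fraction of positives correctly identified
    specificity = tn / (tn + fp)   # fraction of negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Made-up example: 1 is the positive class, 0 the negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sensitivity, specificity, accuracy = performance(y_true, y_pred)
```

In this made-up example one positive and one negative sample are misclassified, so all three measures equal 0.75.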
To summarize, the novelty of this work was to demonstrate that it is possible to use a neural network implementation to infer the tasks performed by subjects. Our approach is based on statistical parametric maps (translated into feature vectors), the PCA formulation, and the separation of the maps into groups of auditory, visual, left finger tapping and right finger tapping paradigms.