Abstract
In this paper, we consider the problem of multinomial classification of magnetoencephalography (MEG) data. The proposed method participated in the MEG mind reading competition of the ICANN’11 conference, where the goal was to train a classifier for predicting the movie the test person was shown. Our approach was the best among ten submissions, reaching an accuracy of 68 % in this five-category problem. The method is based on a regularized logistic regression model, whose efficient feature selection is critical for cases with more measurements than samples. Moreover, special attention is paid to the estimation of the generalization error in order to avoid overfitting to the training data. Here, in addition to describing our competition entry in detail, we report selected additional experiments, which question the usefulness of complex feature extraction procedures and the basic frequency decomposition of the MEG signal for this application.
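As a point of orientation, a regularized multinomial logistic regression model of the kind described above can be written as follows. This is a sketch under stated assumptions: the abstract does not specify the penalty, so the elastic-net form of Friedman et al. and Zou and Hastie (both in the reference list) is assumed here, and the symbols K, N, \lambda and \alpha are introduced for illustration only.

```latex
% Class posterior for K categories (K = 5 in the competition task)
p_k(\mathbf{x}) =
  \frac{\exp\left(\beta_{k0} + \boldsymbol{\beta}_k^{T}\mathbf{x}\right)}
       {\sum_{j=1}^{K} \exp\left(\beta_{j0} + \boldsymbol{\beta}_j^{T}\mathbf{x}\right)},
  \qquad k = 1, \dots, K

% Penalized log-likelihood over N training samples (x_i, y_i).
% Assumed elastic-net penalty: \lambda >= 0 sets the strength,
% \alpha in [0, 1] mixes the L1 and squared L2 terms.
\max_{\{\beta_{k0},\, \boldsymbol{\beta}_k\}_{k=1}^{K}}
  \; \frac{1}{N} \sum_{i=1}^{N} \log p_{y_i}(\mathbf{x}_i)
  \; - \; \lambda \sum_{k=1}^{K}
  \left( \alpha \lVert \boldsymbol{\beta}_k \rVert_1
       + \frac{1 - \alpha}{2} \lVert \boldsymbol{\beta}_k \rVert_2^2 \right)
```

With \alpha close to 1, the L1 term drives many coefficients exactly to zero, so fitting the classifier and selecting features happen in one step, which is what makes such a model usable when the number of measurements exceeds the number of samples.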
Notes
The data can be downloaded from http://www.cis.hut.fi/icann2011/meg/measurements.html.
Note that the challenge report [25] erroneously states that the frequency features are the envelopes of the frequency bands. However, the data consist of the plain frequency bands; see the erratum at http://www.cis.hut.fi/icann2011/meg/megicann_erratum.pdf.
In the subsequent sections we refer to the first-day data as training data, the 25 training samples from the second day as validation data and the remaining 25 samples from the second day as test data. The 653 originally unlabeled test samples from the second day are called secret test data.
Note that this is relevant even though the stimuli were presented without audio: language processing is not limited to the processing of spoken language [33].
While short clips from movie categories 1, 2 and 3 (see Sect. 2.1) were shown by the organizers in an intermingled fashion, the “storyline” movies (categories 4 and 5) were each presented in one continuous block at the end of the experiment [25]. Therefore, the acquired signals in categories 1, 2 and 3 might differ from the signals in categories 4 and 5 for purely ‘chronological’ reasons, e.g., decreasing vigilance.
The term filter (see Guyon and Elisseeff [11]) here refers to the application of a feature selection method that is independent of the classifier.
References
Anderson, J., Blair, V.: Penalized maximum likelihood estimation in logistic regression and discrimination. Biometrika 69, 123–136 (1982)
Besserve, M., Jerbi, K., Laurent, F., Baillet, S., Martinerie, J., Garnero, L.: Classification methods for ongoing EEG and MEG signals. Biol. Res. 40(4), 415–437 (2007)
Blankertz, B., Müller, K.R., Krusienski, D.J., Schalk, G., Wolpaw, J.R., Schlögl, A., Pfurtscheller, G., Millán, J.d.R., Schröder, M., Birbaumer, N.: The BCI competition III: validating alternative approaches to actual BCI problems. IEEE Trans. Neural Syst. Rehabil. Eng. 14(2), 153–159 (2006)
Blankertz, B., Tangermann, M., Vidaurre, C., Fazli, S., Sannelli, C., Haufe, S., Maeder, C., Ramsey, L., Sturm, I., Curio, G., Müller, K.R.: The Berlin brain-computer interface: non-medical uses of BCI technology. Front. Neurosci. 4, 198 (2010)
Carroll, M.K., Cecchi, G.A., Rish, I., Garg, R., Rao, A.R.: Prediction and interpretation of distributed neural activity with sparse models. Neuroimage 44(1), 112–122 (2009)
Chan, A.M., Halgren, E., Marinkovic, K., Cash, S.S.: Decoding word and category-specific spatiotemporal representations from MEG and EEG. Neuroimage 54(4), 3028–3039 (2011)
Debuse, J.C., Rayward-Smith, V.J.: Feature subset selection within a simulated annealing data mining algorithm. J. Intell. Inf. Syst. 9, 57–81 (1997)
Dougherty, E.R., Sima, C., Hua, J., Hanczar, B., Braga-Neto, U.M.: Performance of error estimators for classification. Curr. Bioinf. 5(1), 53–67 (2010)
Friedman, J.H., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
Grosenick, L., Greer, S., Knutson, B.: Interpretable classifiers for fMRI improve prediction of purchases. IEEE Trans. Neural Syst. Rehabil. Eng. 16(6), 539–548 (2008)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Hand, D.J.: Classifier technology and the illusion of progress. Stat. Sci. 21(1), 1–14 (2006). http://www.jstor.org/stable/27645729
Hanke, M., Halchenko, Y.O., Sederberg, P.B., Olivetti, E., Fründ, I., Rieger, J.W., Herrmann, C.S., Haxby, J.V., Hanson, S.J., Pollmann, S.: PyMVPA: a unifying approach to the analysis of neuroscientific data. Front. Neuroinf. 3, 3 (2009)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer Series in Statistics. Springer (2009)
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J., Pietrini, P.: Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293(5539), 2425–2430 (2001)
Haynes, J.D.: Multivariate decoding and brain reading: introduction to the special issue. NeuroImage 56(2), 385–386 (2011)
Haynes, J.D., Rees, G.: Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat. Neurosci. 8(5), 686–691 (2005)
Holte, R.C.: Elaboration on two points raised in “classifier technology and the illusion of progress”. Stat. Sci. 21(1), 24–26 (2006). http://www.jstor.org/stable/27645732
Huttunen, H., Kauppi, J.P., Tohka, J.: Regularized logistic regression for mind reading with parallel validation. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 20–24 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf
Huttunen, H., Manninen, T., Tohka, J.: MEG mind reading: Strategies for feature selection. In: Proceedings of the Federated Computer Science Event 2012, pp. 42–49 (2012). http://www.cs.helsinki.fi/u/starkoma/ytp/YTP-Proceedings-2012.pdf
Jylänki, P., Riihimäki, J., Vehtari, A.: Multi-class Gaussian process classification of single trial MEG based on frequency specific latent features extracted with binary linear classifiers. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 31–34 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf
Kamitani, Y., Tong, F.: Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8(5), 679–685 (2005)
Kauppi, J.P., Huttunen, H., Korkala, H., Jääskeläinen, I.P., Sams, M., Tohka, J.: Face prediction from fMRI data during movie stimulus: strategies for feature selection. In: Proceedings of ICANN 2011. Lecture Notes in Computer Science, Vol. 6792, pp. 189–196. Springer (2011)
Kippenhan, J.S., Barker, W.W., Pascal, S., Nagel, J., Duara, R.: Evaluation of a neural-network classifier for PET scans of normal and Alzheimer’s disease subjects. J. Nucl. Med. 33(8), 1459–1467 (1992)
Klami, A., Ramkumar, P., Virtanen, S., Parkkonen, L., Hari, R., Kaski, S.: ICANN/PASCAL2 Challenge: MEG Mind-Reading—Overview and Results (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf
Kleinbaum, D., Klein, M.: Logistic Regression. Statistics for Biology and Health. Springer, New York (2010)
Lautrup, B., Hansen, L., Law, I., Mørch, N., Svarer, C., Strother, S.: Massive weight sharing: a cure for extremely ill-posed problems. In: Supercomputing in Brain Research: From Tomography to Neural Networks, pp. 137–148 (1994)
Lilliefors, H.W.: On the Kolmogorov–Smirnov test for normality with mean and variance unknown. J. Am. Stat. Assoc. 62(318), 399–402 (1967)
Lotte, F., Congedo, M., Lécuyer, A., Lamarche, F., Arnaldi, B.: A review of classification algorithms for EEG-based brain–computer interfaces. J. Neural Eng. 4(2), R1 (2007)
Mar, R.: The neuropsychology of narrative: story comprehension, story production and their interrelation. Neuropsychologia 42(10), 1414–1434 (2004)
Mørch, N., Hansen, L.K., Strother, S.C., Svarer, C., Rottenberg, D.A., Lautrup, B., Savoy, R., Paulson, O.B.: Nonlinear versus linear models in functional neuroimaging: learning curves and generalization crossover. In: Proceedings of the 15th International Conference on Information Processing in Medical Imaging. Lecture Notes in Computer Science, vol. 1230, pp. 259–270 (1997)
Naselaris, T., Kay, K.N., Nishimoto, S., Gallant, J.L.: Encoding and decoding in fMRI. NeuroImage 56(2), 400–410 (2011)
Nickels, L.: The hypothesis testing approach to the assessment of language. In: Stemmer, B., Whitaker, H. (eds.) The Handbook of Neuroscience of Language. Academic Press (2008)
Olsson, C.J., Jonsson, B., Larsson, A., Nyberg, L.: Motor representations and practice affect brain systems underlying imagery: an fMRI study of internal imagery in novices and active high jumpers. Open Neuroimaging J. 2, 5–13 (2008)
O’Toole, A.J., Jiang, F., Abdi, H., Pénard, N., Dunlop, J.P., Parent, M.A.: Theoretical, statistical, and practical perspectives on pattern-based classification approaches to the analysis of functional neuroimaging data. J. Cogn. Neurosci. 19(11), 1735–1752 (2007)
Pereira, F., Botvinick, M.: Information mapping with pattern classifiers: a comparative study. Neuroimage 56(2), 476–496 (2011). doi: 10.1016/j.neuroimage.2010.05.026
Pereira, F., Mitchell, T., Botvinick, M.: Machine learning classifiers and fMRI: a tutorial overview. NeuroImage 45(Suppl 1), S199–S209 (2009)
Pfurtscheller, G., Lopes da Silva, F.H.: Event-related EEG/MEG synchronization and desynchronization: basic principles. Clin. Neurophysiol. 110(11), 1842–1857 (1999)
Poldrack, R.A., Halchenko, Y.O., Hanson, S.J.: Decoding the large-scale structure of brain function by classifying mental states across individuals. Psychol. Sci. 20(11), 1364–1372 (2009)
Pudil, P., Novovičová, J., Kittler, J.: Floating search methods in feature selection. Pattern Recogn. Lett. 15(11), 1119–1125 (1994)
Rasmussen, P.M., Hansen, L.K., Madsen, K.H., Churchill, N.W., Strother, S.C.: Model sparsity and brain pattern interpretation of classification models in neuroimaging. Pattern Recognit. 45(6), 2085–2100 (2012)
Rasmussen, P.M., Madsen, K.H., Lund, T.E., Hansen, L.K.: Visualization of nonlinear kernel models in neuroimaging by sensitivity maps. NeuroImage 55(3), 1120–1131 (2011)
Rieger, J.W., Reichert, C., Gegenfurtner, K.R., Noesselt, T., Braun, C., Heinze, H.J., Kruse, R., Hinrichs, H.: Predicting the recognition of natural scenes from single trial MEG recordings of brain activity. Neuroimage 42(3), 1056–1068 (2008)
Santana, R., Bielza, C., Larranaga, P.: An ensemble of classifiers approach with multiple sources of information. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 25–30 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf
Stam, C.: Use of magnetoencephalography (MEG) to study functional brain networks in neurodegenerative disorders. J. Neurol. Sci. 289(1–2), 128–134 (2010)
Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G., Nolte, G., Pfurtscheller, G., Preissl, H., Schalk, G., Schlögl, A., Vidaurre, C., Waldert, S., Blankertz, B.: Review of the BCI competition IV. Front. Neurosci. 6(55), 1–31 (2012)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
Tomioka, R., Müller, K.R.: A regularized discriminative framework for EEG analysis with application to brain-computer interface. NeuroImage 49(1), 415–432 (2010)
van De Ville, D., Lee, S.W.: Brain decoding: opportunities and challenges for pattern recognition. Pattern Recognit. Spec. Issue Brain Decod. 45(6), 2033–2034 (2012)
van Gerven, M., Hesse, C., Jensen, O., Heskes, T.: Interpreting single trial data using groupwise regularisation. Neuroimage 46, 665–676 (2009)
Waldert, S., Preissl, H., Demandt, E., Braun, C., Birbaumer, N., Aertsen, A., Mehring, C.: Hand movement direction decoded from MEG and EEG. J. Neurosci. 28(4), 1000–1008 (2008)
Webb, A.: Statistical Pattern Recognition, 2nd edn. John Wiley & Sons, Chichester, England (2002)
Zhdanov, A., Hendler, T., Ungerleider, L., Intrator, N.: Inferring functional brain states using temporal evolution of regularized classifiers. Comput. Intell. Neurosci. 2007 (2007)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67(2), 301–320 (2005)
Acknowledgments
The research was funded by the Academy of Finland, grant no. 130275. We also thank Professor R. Hari (Brain Research Unit, Low Temperature Laboratory, Aalto University School of Science, Finland) for her valuable remarks concerning our study.
Cite this article
Huttunen, H., Manninen, T., Kauppi, JP. et al. Mind reading with regularized multinomial logistic regression. Machine Vision and Applications 24, 1311–1325 (2013). https://doi.org/10.1007/s00138-012-0464-y
DOI: https://doi.org/10.1007/s00138-012-0464-y