Mind reading with regularized multinomial logistic regression

  • Original Paper
Machine Vision and Applications

Abstract

In this paper, we consider the problem of multinomial classification of magnetoencephalography (MEG) data. The proposed method was entered in the MEG mind reading competition of the ICANN’11 conference, where the goal was to train a classifier that predicts which movie the test person was shown. Our approach was the best among ten submissions, reaching a classification accuracy of 68 % in this five-category problem. The method is based on a regularized logistic regression model, whose efficient feature selection is critical for cases with more measurements than samples. Moreover, special attention is paid to the estimation of the generalization error in order to avoid overfitting to the training data. Here, in addition to describing our competition entry in detail, we report selected additional experiments, which question the usefulness of complex feature extraction procedures and of the basic frequency decomposition of the MEG signal for this application.
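As a rough illustration of why a sparsity-inducing penalty helps when there are more measurements than samples, the following minimal sketch fits an L1-regularized multinomial logistic regression model on synthetic data. It is not the competition entry: the footnotes point to the glmnet package, whereas this sketch swaps in plain proximal gradient descent with soft thresholding, and the function names, the penalty weight lam, the step size and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink towards zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_multinomial(X, y, n_classes, lam=0.1, step=0.1, n_iter=500):
    """Proximal gradient descent (hypothetical helper, not glmnet).

    Minimizes mean multinomial negative log-likelihood + lam * ||W||_1.
    The intercept b is left unpenalized.
    """
    n, p = X.shape
    Y = np.eye(n_classes)[y]           # one-hot labels, shape (n, n_classes)
    W = np.zeros((p, n_classes))
    b = np.zeros(n_classes)
    for _ in range(n_iter):
        P = softmax(X @ W + b)         # class probabilities
        G = X.T @ (P - Y) / n          # gradient of the smooth part w.r.t. W
        W = soft_threshold(W - step * G, step * lam)
        b -= step * (P - Y).mean(axis=0)
    return W, b

# Synthetic data (assumption): 60 samples, 200 features, 3 classes,
# where only features 0, 1 and 2 carry class information.
rng = np.random.default_rng(0)
n, p, k = 60, 200, 3
X = rng.standard_normal((n, p))
y = rng.integers(0, k, size=n)
X[np.arange(n), y] += 3.0              # feature c is shifted for class c

W, b = fit_l1_multinomial(X, y, k)
selected = np.flatnonzero(np.abs(W).sum(axis=1) > 0)   # features with nonzero weight
train_acc = (np.argmax(X @ W + b, axis=1) == y).mean()
print(f"{selected.size} of {p} features selected, training accuracy {train_acc:.2f}")
```

The training accuracy printed here is optimistic by construction; an honest estimate of the generalization error requires data that played no part in fitting or in choosing lam.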

Notes

  1. http://www.bbci.de/competition/iv/index.html.

  2. The data can be downloaded from http://www.cis.hut.fi/icann2011/meg/measurements.html.

  3. Note that the challenge report [25] erroneously states that the frequency features are the envelopes of the frequency bands. However, the data consist of the plain frequency bands; see the erratum at http://www.cis.hut.fi/icann2011/meg/megicann_erratum.pdf.

  4. http://www.cs.tut.fi/~hehu/mindreading.html.

  5. http://www-stat.stanford.edu/~tibs/glmnet-matlab.

  6. In the subsequent sections we refer to the first-day data as training data, the 25 training samples from the second day as validation data and the remaining 25 samples from the second day as test data. The 653 originally unlabeled test samples from the second day are called secret test data.

  7. Note that this is relevant although the stimuli were presented without audio: language processing is not limited to the processing of spoken language [33].

  8. While short clips from movie categories 1, 2 and 3 (see Sect. 2.1) were shown by the organizers in an intermingled fashion, the “storyline” movies (categories 4 and 5) were each presented in one continuous block at the end of the experiment [25]. Therefore, the acquired signals in categories 1, 2 and 3 might differ from the signals in categories 4 and 5 purely for ‘chronological’ reasons, e.g., decreasing vigilance.

  9. The term filter (see Guyon and Elisseeff [11]) here refers to the application of a feature selection method that is independent of the classifier.
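To make the notion of a filter in note 9 concrete, here is a minimal sketch that ranks features by a score computed without reference to any classifier and keeps the top k. The one-way ANOVA F-statistic used as the score, the function names and the synthetic data are assumptions chosen for illustration; the note does not state which filter criterion was applied.

```python
import numpy as np

def anova_f_scores(X, y):
    """One-way ANOVA F-statistic per feature (between-class / within-class variance)."""
    classes = np.unique(y)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    between = np.zeros(p)
    within = np.zeros(p)
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - grand_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = n - len(classes)
    return (between / df_between) / (within / df_within)

def select_top_k(X, y, k):
    """Filter step (hypothetical helper): indices of the k highest-scoring features."""
    scores = anova_f_scores(X, y)
    return np.sort(np.argsort(scores)[::-1][:k])

# Synthetic data (assumption): 5 classes with 10 samples each, 100 features,
# of which only features 7 and 42 depend on the class label.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(5), 10)
X = rng.standard_normal((50, 100))
X[:, 7] += y                      # class mean grows with the label
X[:, 42] -= 4.0 * (y == 3)        # only class 3 is shifted

top = select_top_k(X, y, k=2)
print(top)
```

Because the scores use the labels, the filter should be computed on the training data only and then applied unchanged to the validation and test data; otherwise the error estimate is biased.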

References

  1. Anderson, J., Blair, V.: Penalized maximum likelihood estimation in logistic regression and discrimination. Biometrika 69, 123–136 (1982)

  2. Besserve, M., Jerbi, K., Laurent, F., Baillet, S., Martinerie, J., Garnero, L.: Classification methods for ongoing EEG and MEG signals. Biol. Res. 40(4), 415–437 (2007)

  3. Blankertz, B., Müller, K.R., Krusienski, D.J., Schalk, G., Wolpaw, J.R., Schlögl, A., Pfurtscheller, G., Millán, J. del R., Schröder, M., Birbaumer, N.: The BCI competition III: validating alternative approaches to actual BCI problems. IEEE Trans. Neural Syst. Rehabil. Eng. 14(2), 153–159 (2006)

  4. Blankertz, B., Tangermann, M., Vidaurre, C., Fazli, S., Sannelli, C., Haufe, S., Maeder, C., Ramsey, L., Sturm, I., Curio, G., Müller, K.R.: The Berlin brain-computer interface: non-medical uses of BCI technology. Front. Neurosci. 4, 198 (2010)

  5. Carroll, M.K., Cecchi, G.A., Rish, I., Garg, R., Rao, A.R.: Prediction and interpretation of distributed neural activity with sparse models. Neuroimage 44(1), 112–122 (2009)

  6. Chan, A.M., Halgren, E., Marinkovic, K., Cash, S.S.: Decoding word and category-specific spatiotemporal representations from MEG and EEG. Neuroimage 54(4), 3028–3039 (2011)

  7. Debuse, J.C., Rayward-Smith, V.J.: Feature subset selection within a simulated annealing data mining algorithm. J. Intell. Inf. Syst. 9, 57–81 (1997)

  8. Dougherty, E.R., Sima, C., Hua, J., Hanczar, B., Braga-Neto, U.M.: Performance of error estimators for classification. Curr. Bioinf. 5(1), 53–67 (2010)

  9. Friedman, J.H., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)

  10. Grosenick, L., Greer, S., Knutson, B.: Interpretable classifiers for fMRI improve prediction of purchases. IEEE Trans. Neural Syst. Rehabil. Eng. 16(6), 539–548 (2008)

  11. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)

  12. Hand, D.J.: Classifier technology and the illusion of progress. Stat. Sci. 21(1), 1–14 (2006). http://www.jstor.org/stable/27645729

  13. Hanke, M., Halchenko, Y.O., Sederberg, P.B., Olivetti, E., Fründ, I., Rieger, J.W., Herrmann, C.S., Haxby, J.V., Hanson, S.J., Pollmann, S.: PyMVPA: a unifying approach to the analysis of neuroscientific data. Front. Neuroinf. 3, 3 (2009)

  14. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer Series in Statistics. Springer (2009)

  15. Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J., Pietrini, P.: Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293(5539), 2425–2430 (2001)

  16. Haynes, J.D.: Multivariate decoding and brain reading: introduction to the special issue. NeuroImage 56(2), 385–386 (2011)

  17. Haynes, J.D., Rees, G.: Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat. Neurosci. 8(5), 686–691 (2005)

  18. Holte, R.C.: Elaboration on two points raised in “classifier technology and the illusion of progress”. Stat. Sci. 21(1), 24–26 (2006). http://www.jstor.org/stable/27645732

  19. Huttunen, H., Kauppi, J.P., Tohka, J.: Regularized logistic regression for mind reading with parallel validation. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 20–24 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf

  20. Huttunen, H., Manninen, T., Tohka, J.: MEG mind reading: Strategies for feature selection. In: Proceedings of the Federated Computer Science Event 2012, pp. 42–49 (2012). http://www.cs.helsinki.fi/u/starkoma/ytp/YTP-Proceedings-2012.pdf

  21. Jylänki, P., Riihimäki, J., Vehtari, A.: Multi-class Gaussian process classification of single trial MEG based on frequency specific latent features extracted with binary linear classifiers. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 31–34 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf

  22. Kamitani, Y., Tong, F.: Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8(5), 679–685 (2005)

  23. Kauppi, J.P., Huttunen, H., Korkala, H., Jääskeläinen, I.P., Sams, M., Tohka, J.: Face prediction from fMRI data during movie stimulus: strategies for feature selection. In: Proceedings of ICANN 2011. Lecture Notes in Computer Science, Vol. 6792, pp. 189–196. Springer (2011)

  24. Kippenhan, J.S., Barker, W.W., Pascal, S., Nagel, J., Duara, R.: Evaluation of a neural-network classifier for PET scans of normal and Alzheimer’s disease subjects. J. Nucl. Med. 33(8), 1459–1467 (1992)

  25. Klami, A., Ramkumar, P., Virtanen, S., Parkkonen, L., Hari, R., Kaski, S.: ICANN/PASCAL2 Challenge: MEG Mind-Reading—Overview and Results (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf

  26. Kleinbaum, D., Klein, M.: Logistic Regression. Statistics for Biology and Health. Springer, New York (2010)

  27. Lautrup, B., Hansen, L., Law, I., Mørch, N., Svarer, C., Strother, S.: Massive weight sharing: a cure for extremely ill-posed problems. In: Supercomputing in Brain Research: From Tomography to Neural Networks, pp. 137–148 (1994)

  28. Lilliefors, H.W.: On the Kolmogorov–Smirnov test for normality with mean and variance unknown. J. Am. Stat. Assoc. 62(318), 399–402 (1967)

  29. Lotte, F., Congedo, M., Lécuyer, A., Lamarche, F., Arnaldi, B.: A review of classification algorithms for EEG-based brain–computer interfaces. J. Neural Eng. 4(2), R1 (2007)

  30. Mar, R.: The neuropsychology of narrative: story comprehension, story production and their interrelation. Neuropsychologia 42(10), 1414–1434 (2004)

  31. Mørch, N., Hansen, L.K., Strother, S.C., Svarer, C., Rottenberg, D.A., Lautrup, B., Savoy, R., Paulson, O.B.: Nonlinear versus linear models in functional neuroimaging: learning curves and generalization crossover. In: Proceedings of the 15th International Conference on Information Processing in Medical Imaging. Lecture Notes in Computer Science, vol. 1230, pp. 259–270 (1997)

  32. Naselaris, T., Kay, K.N., Nishimoto, S., Gallant, J.L.: Encoding and decoding in fMRI. NeuroImage 56(2), 400–410 (2011)

  33. Nickels, L.: The hypothesis testing approach to the assessment of language. In: Stemmer, B., Whitaker, H. (eds.) The Handbook of Neuroscience of Language. Academic Press (2008)

  34. Olsson, C.J., Jonsson, B., Larsson, A., Nyberg, L.: Motor representations and practice affect brain systems underlying imagery: an fMRI study of internal imagery in novices and active high jumpers. Open Neuroimaging J. 2, 5–13 (2008)

  35. O’Toole, A.J., Jiang, F., Abdi, H., Pénard, N., Dunlop, J.P., Parent, M.A.: Theoretical, statistical, and practical perspectives on pattern-based classification approaches to the analysis of functional neuroimaging data. J. Cogn. Neurosci. 19(11), 1735–1752 (2007)

  36. Pereira, F., Botvinick, M.: Information mapping with pattern classifiers: a comparative study. Neuroimage 56(2), 476–496 (2011). doi: 10.1016/j.neuroimage.2010.05.026

  37. Pereira, F., Mitchell, T., Botvinick, M.: Machine learning classifiers and fMRI: a tutorial overview. NeuroImage 45(Suppl 1), S199–S209 (2009)

  38. Pfurtscheller, G., Lopes da Silva, F.H.: Event-related EEG/MEG synchronization and desynchronization: basic principles. Clin. Neurophysiol. 110(11), 1842–1857 (1999)

  39. Poldrack, R.A., Halchenko, Y.O., Hanson, S.J.: Decoding the large-scale structure of brain function by classifying mental states across individuals. Psychol. Sci. 20(11), 1364–1372 (2009)

  40. Pudil, P., Novovičová, J., Kittler, J.: Floating search methods in feature selection. Pattern Recogn. Lett. 15(11), 1119–1125 (1994)

  41. Rasmussen, P.M., Hansen, L.K., Madsen, K.H., Churchill, N.W., Strother, S.C.: Model sparsity and brain pattern interpretation of classification models in neuroimaging. Pattern Recognit. 45(6), 2085–2100 (2012)

  42. Rasmussen, P.M., Madsen, K.H., Lund, T.E., Hansen, L.K.: Visualization of nonlinear kernel models in neuroimaging by sensitivity maps. NeuroImage 55(3), 1120–1131 (2011)

  43. Rieger, J.W., Reichert, C., Gegenfurtner, K.R., Noesselt, T., Braun, C., Heinze, H.J., Kruse, R., Hinrichs, H.: Predicting the recognition of natural scenes from single trial MEG recordings of brain activity. Neuroimage 42(3), 1056–1068 (2008)

  44. Santana, R., Bielza, C., Larranaga, P.: An ensemble of classifiers approach with multiple sources of information. In: Proceedings of ICANN/PASCAL2 Challenge: MEG Mind-Reading, pp. 25–30 (2011). http://www.cis.hut.fi/icann2011/meg/megicann_proceedings.pdf

  45. Stam, C.: Use of magnetoencephalography (MEG) to study functional brain networks in neurodegenerative disorders. J. Neurol. Sci. 289(1–2), 128–134 (2010)

  46. Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G., Nolte, G., Pfurtscheller, G., Preissl, H., Schalk, G., Schlögl, A., Vidaurre, C., Waldert, S., Blankertz, B.: Review of the BCI competition IV. Front. Neurosci. 6(55), 1–31 (2012)

  47. Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)

  48. Tomioka, R., Müller, K.R.: A regularized discriminative framework for EEG analysis with application to brain-computer interface. NeuroImage 49(1), 415–432 (2010)

  49. van De Ville, D., Lee, S.W.: Brain decoding: opportunities and challenges for pattern recognition. Pattern Recognit. Spec. Issue Brain Decod. 45(6), 2033–2034 (2012)

  50. van Gerven, M., Hesse, C., Jensen, O., Heskes, T.: Interpreting single trial data using groupwise regularisation. Neuroimage 46, 665–676 (2009)

  51. Waldert, S., Preissl, H., Demandt, E., Braun, C., Birbaumer, N., Aertsen, A., Mehring, C.: Hand movement direction decoded from MEG and EEG. J. Neurosci. 28(4), 1000–1008 (2008)

  52. Webb, A.: Statistical Pattern Recognition, 2nd edn. John Wiley & Sons, Chichester, England (2002)

  53. Zhdanov, A., Hendler, T., Ungerleider, L., Intrator, N.: Inferring functional brain states using temporal evolution of regularized classifiers. Comput. Intell. Neurosci. 2007 (2007)

  54. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67(2), 301–320 (2005)

Acknowledgments

The research was funded by the Academy of Finland, grant no. 130275. We also want to thank Professor R. Hari (Brain Research Unit, Low Temperature Laboratory, Aalto University School of Science, Finland) for her valuable remarks concerning our study.

Author information

Corresponding author

Correspondence to Heikki Huttunen.

About this article

Cite this article

Huttunen, H., Manninen, T., Kauppi, JP. et al. Mind reading with regularized multinomial logistic regression. Machine Vision and Applications 24, 1311–1325 (2013). https://doi.org/10.1007/s00138-012-0464-y

  • DOI: https://doi.org/10.1007/s00138-012-0464-y
