Model Complexity and Selection

Advanced Data Analysis in Neuroscience

Part of the book series: Bernstein Series in Computational Neuroscience ((BSCN))

Abstract

Chapter 2 introduced the bias-variance tradeoff, along with approaches that regulate model complexity through some parameter λ. But how should λ be chosen? This points to a fundamental issue in statistical model fitting and parameter estimation: we usually have only a comparatively small sample from a much larger population, yet we want to make statements about the population as a whole. If we choose a sufficiently flexible model, e.g., a local or spline regression model with many parameters, we can always achieve a perfect fit to the training data, as already seen in Chap. 2 (see Fig. 2.5). The problem is that such a fit may say little about the true underlying population, because we may have mainly fitted noise: we have overfit the data, and consequently the model will generalize poorly to new observations not used for fitting. As a side note, what matters here is not only the nominal number of parameters but also the functional form or flexibility of the model and any constraints placed on the parameters. For instance, a (globally) linear model cannot accurately capture a nonlinear functional relationship, regardless of how many parameters it has. Or, as noted before, in basis expansions and kernel approaches the effective number of parameters may be much smaller than the nominal number, because the variables are constrained by their functional relationships. This chapter, especially the following discussion and Sects. 4.1–4.4, largely follows the exposition in Hastie et al. (2009; see also the brief discussion in Bishop, 2006, from a slightly different angle).
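The overfitting problem described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the "population" curve sin(πx), the noise level, the polynomial basis, the ridge-type penalty standing in for the complexity parameter λ, and the use of K-fold cross-validation as one common way of choosing λ. The helper names (`make_data`, `design`, `fit_ridge`, `mse`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def make_data(n, noise=0.3):
    # Hypothetical "population": y = sin(pi * x) + Gaussian noise, x in [-1, 1].
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + noise * rng.standard_normal(n)
    return x, y

def design(x, degree):
    # Polynomial basis expansion with columns 1, x, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    # Minimize ||y - X b||^2 + lam * sum_{j>=1} b_j^2 (intercept unpenalized),
    # written as an augmented least-squares problem and solved with lstsq.
    p = X.shape[1]
    penalty = np.sqrt(lam) * np.eye(p)
    penalty[0, 0] = 0.0
    A = np.vstack([X, penalty])
    b = np.concatenate([y, np.zeros(p)])
    return np.linalg.lstsq(A, b, rcond=None)[0]

def mse(x, y, beta, degree):
    # Mean squared prediction error of the fitted polynomial on (x, y).
    return float(np.mean((y - design(x, degree) @ beta) ** 2))

degree = 9

# Part 1: a flexible model fits a small training sample (almost) perfectly
# but predicts new observations from the same population poorly.
x_small, y_small = make_data(10)
x_new, y_new = make_data(1000)
beta_flex = fit_ridge(design(x_small, degree), y_small, lam=0.0)
train_mse = mse(x_small, y_small, beta_flex, degree)
test_mse = mse(x_new, y_new, beta_flex, degree)
print(f"training MSE: {train_mse:.2e}   MSE on new data: {test_mse:.2e}")

# Part 2: choose lambda by K-fold cross-validation on a training sample.
x_tr, y_tr = make_data(40)
k = 5
folds = np.array_split(rng.permutation(len(x_tr)), k)  # same folds for all lambdas
lams = [0.0, 1e-6, 1e-4, 1e-2, 1.0, 100.0]
cv_errors = []
for lam in lams:
    fold_errs = []
    for held_out in folds:
        keep = np.ones(len(x_tr), dtype=bool)
        keep[held_out] = False  # fit on all points outside the held-out fold
        beta = fit_ridge(design(x_tr[keep], degree), y_tr[keep], lam)
        fold_errs.append(mse(x_tr[held_out], y_tr[held_out], beta, degree))
    cv_errors.append(float(np.mean(fold_errs)))

best_lam = lams[int(np.argmin(cv_errors))]
print("CV error per lambda:", dict(zip(lams, cv_errors)))
print("lambda with lowest CV error:", best_lam)
```

With 10 training points and 10 polynomial coefficients the unpenalized fit interpolates the sample, so the training error is close to zero while the error on new data cannot fall below the noise variance. The second part only shows the mechanics of scoring each candidate λ on held-out data; the grid, the number of folds, and the sample sizes are arbitrary choices.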

References

  • Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Proceedings of the Second International Symposium on Information Theory, Budapest, pp. 267–281 (1973)

  • Allefeld, C., Haynes, J.D.: Searchlight-based multi-voxel pattern analysis of fMRI by cross-validated MANOVA. Neuroimage. 89, 345–357 (2014)

  • Balaguer-Ballester, E., Lapish, C.C., Seamans, J.K., Durstewitz, D.: Attractor dynamics of cortical populations during memory-guided decision-making. PLoS Comput. Biol. 7, e1002057 (2011)

  • Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)

  • Brusco, M.J., Stanley, D.: Exact and approximate algorithms for variable selection in linear discriminant analysis. Comput. Stat. Data Anal. 55, 123–131 (2011)

  • Demanuele, C., Bähner, F., Plichta, M.M., Kirsch, P., Tost, H., Meyer-Lindenberg, A., Durstewitz, D.: A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series. Front. Human Neurosci. 9, 537 (2015a)

  • Demanuele, C., Kirsch, P., Esslinger, C., Zink, M., Meyer-Lindenberg, A., Durstewitz, D.: Area-specific information processing in prefrontal cortex during a probabilistic inference task: a multivariate fMRI BOLD time series analysis. PLoS One. 10, e0135424 (2015b)

  • Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)

  • Durstewitz, D., Vittoz, N.M., Floresco, S.B., Seamans, J.K.: Abrupt transitions between prefrontal neural ensemble states accompany behavioral transitions during rule learning. Neuron. 66, 438–448 (2010)

  • Efron, B.: Estimating the error rate of a prediction rule: some improvements on cross-validation. J. Am. Stat. Assoc. 78, 316–331 (1983)

  • Efron, B., Tibshirani, R.: Improvements on cross-validation: the .632+ bootstrap method. J. Am. Stat. Assoc. 92, 548–560 (1997)

  • Fahrmeir, L., Tutz, G.: Multivariate Statistical Modelling Based on Generalized Linear Models. Springer, New York (2010)

  • Ferraty, F., van Keilegom, I., Vieu, P.: On the validity of the bootstrap in non-parametric functional regression. Scand. J. Stat. 37, 286–306 (2010a)

  • Ferraty, F., Hall, P., Vieu, P.: Most-predictive design points for functional data predictors. Biometrika. 97(4), 807–824 (2010b)

  • Friedman, J.H.: On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining Knowl. Discov. 1, 55–77 (1997)

  • Friston, K.J., Harrison, L., Penny, W.: Dynamic causal modelling. Neuroimage. 19, 1273–1302 (2003)

  • Garg, G., Prasad, G., Coyle, D.: Gaussian Mixture Model-based noise reduction in resting state fMRI data. J. Neurosci. Methods. 215(1), 71–77 (2013)

  • Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)

  • Kass, R.E., Raftery, A.E.: Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995)

  • Kaufman, L., Rousseeuw, P.J.: Finding groups in data. Wiley, New York (1990)

  • Khamassi, M., Quilodran, R., Enel, P., Dominey, P.F., Procyk, E.: Behavioral regulation and the modulation of information coding in the lateral prefrontal and cingulate cortex. Cereb. Cortex. 25(9), 3197–3218 (2014)

  • Knuth, K.H., Habeck, M., Malakar, N.K., Mubeen, A.M., Placek, B.: Bayesian evidence and model selection. Dig. Signal Process. 47, 50–67 (2015)

  • Lapish, C.C., Durstewitz, D., Chandler, L.J., Seamans, J.K.: Successful choice behavior is associated with distinct and coherent network states in anterior cingulate cortex. Proc. Natl. Acad. Sci. U S A. 105, 11963–11968 (2008)

  • Penny, W.D.: Comparing dynamic causal models using AIC, BIC and free energy. Neuroimage. 59, 319–330 (2012)

  • Penny, W.D., Mattout, J., Trujillo-Barreto, N.: Chapter 35: Bayesian model selection and averaging. In: Friston, K., Ashburner, J., Kiebel, S., Nichols, T., Penny, W. (eds.) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier, London (2006)

  • Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)

  • Stephan, K.E., Penny, W.D., Daunizeau, J., Moran, R.J., Friston, K.J.: Bayesian model selection for group studies. Neuroimage. 46, 1004–1017 (2009)

  • Stone, M.: Cross-Validatory Choice and Assessment of Statistical Predictions. J. R. Stat. Soc. Ser. B. 36, 111–147 (1974)

  • Vincent, T., Badillo, S., Risser, L., Chaari, L., Bakhous, C., Forbes, F., Ciuciu, P.: Flexible multivariate hemodynamics fMRI data analyses and simulations with PyHRF. Front. Neurosci. 8, 67 (2014)

  • Watanabe, T.: Disease prediction based on functional connectomes using a scalable and spatially-informed support vector machine. Neuroimage. 96, 183–202 (2014)

  • Witten, D.M., Tibshirani, R.: Covariance-regularized regression and classification for high dimensional problems. J. R. Stat. Soc. Ser. B (Statistical Methodology). 71, 615–636 (2009)

  • Witten, D.M., Tibshirani, R.: Penalized classification using Fisher’s linear discriminant. J. R. Stat. Soc. Ser. B. 73, 753–772 (2011a)

  • Young, G., Householder, A.S.: Discussion of a set of points in terms of their mutual distances. Psychometrika. 3, 19–22 (1938)

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Durstewitz, D. (2017). Model Complexity and Selection. In: Advanced Data Analysis in Neuroscience. Bernstein Series in Computational Neuroscience. Springer, Cham. https://doi.org/10.1007/978-3-319-59976-2_4
