Abstract
This chapter concerns the construction, assessment, and selection of models in general, and of biochemical models in particular. We first discuss standard approaches to model construction and to the (automated) generation of candidate models. We then present the most commonly used methods for model assessment, together with their underlying concepts and ideas, focusing on the information theoretic and Bayesian approaches to model selection. Information theoretic methods include the Akaike information criterion and the more recent deviance information criterion. Bayesian approaches include the computation of posterior ratios for relative model probabilities from Bayes factors, as well as the approximate Bayesian information criterion. We also briefly discuss other methods, such as cross-validation and bootstrapping techniques, and the theoretically appealing approach of minimum description length. We sketch how the most important results can be derived, emphasize the distinctions between the methods, and discuss how model inference methods are employed in practice. We conclude that there is no generally applicable method for model assessment: a suitable choice depends on the specific inference problem and, to some extent, on the subjective preferences of the modeler.
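As a rough illustration of the criteria named above, the following minimal sketch computes the Akaike and Bayesian information criteria in their standard textbook forms, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, and converts AIC scores into Akaike weights. The candidate models, their log-likelihoods, parameter counts, and the sample size are hypothetical values chosen for illustration only; they are not taken from the chapter.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


def akaike_weights(scores):
    """Relative support for each model, normalized to sum to one."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical candidates: (name, maximized log-likelihood, number of parameters)
candidates = [("M1", -120.0, 3), ("M2", -118.5, 5), ("M3", -117.9, 8)]
n_obs = 50  # hypothetical number of data points

aic_scores = [aic(ll, k) for _, ll, k in candidates]
bic_scores = [bic(ll, k, n_obs) for _, ll, k in candidates]
weights = akaike_weights(aic_scores)

for (name, _, _), a, b, w in zip(candidates, aic_scores, bic_scores, weights):
    print(f"{name}: AIC={a:.1f}  BIC={b:.1f}  Akaike weight={w:.3f}")
```

Both criteria reward fit through the log-likelihood and penalize the number of parameters; BIC's penalty grows with the sample size, so the two can rank the same candidates differently, which is one reason the choice of criterion depends on the inference problem.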
Acknowledgments
We acknowledge funding from the Swiss Initiative for Systems Biology SystemsX.ch (project YeastX) evaluated by the Swiss National Science Foundation.
Conflict of Interest
The authors declare that they have no conflict of interest.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Sunnåker, M., Stelling, J. (2016). Model Extension and Model Selection. In: Geris, L., Gomez-Cabrero, D. (eds) Uncertainty in Biology. Studies in Mechanobiology, Tissue Engineering and Biomaterials, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-319-21296-8_9
DOI: https://doi.org/10.1007/978-3-319-21296-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21295-1
Online ISBN: 978-3-319-21296-8
eBook Packages: Engineering, Engineering (R0)