Abstract
Challenging research problems in various fields have driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. By comparison, the selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing proposals were developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review of function selection, focusing on penalized likelihood and Bayesian concepts and relating the various approaches to each other in a unified framework. In an empirical comparison that also includes boosting, we evaluate several methods on simulated and real data, thereby providing some guidance on their performance in practice.
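The penalized likelihood approaches reviewed in the article remove whole nonlinear functions from an additive predictor by penalizing each function's basis coefficients as a group. The following sketch, which is not part of the article, illustrates that idea in the spirit of the sparse backfitting / group soft-thresholding schemes surveyed here (e.g. the group lasso of Yuan and Lin 2006 or the SpAM estimator of Ravikumar et al. 2009). It is written in Python rather than R for self-containment; the polynomial basis, the penalty level `lam`, and the simulated data are illustrative assumptions, not the article's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 5

# Simulated data: only the first two predictors enter the true model.
X = rng.uniform(-1, 1, size=(n, p))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.3, n)

def basis(x, degree=4):
    """Centered polynomial basis, orthonormalized so that B.T @ B / n = I."""
    B = np.column_stack([x ** d for d in range(1, degree + 1)])
    B -= B.mean(axis=0)
    Q, _ = np.linalg.qr(B)
    return Q * np.sqrt(len(x))

groups = [basis(X[:, j]) for j in range(p)]
coefs = [np.zeros(g.shape[1]) for g in groups]
lam = 0.1   # illustrative penalty level, not tuned

# Blockwise coordinate descent: each function's coefficient vector is
# soft-thresholded as a group, so entire components are set exactly to
# zero -- this is function selection, not coefficient-wise selection.
for _ in range(50):
    for j in range(p):
        partial = y - sum(groups[k] @ coefs[k] for k in range(p) if k != j)
        s = groups[j].T @ partial / n        # least-squares fit of group j
        norm = np.linalg.norm(s)
        coefs[j] = np.zeros_like(s) if norm <= lam else (1 - lam / norm) * s

selected = [np.linalg.norm(b) > 0 for b in coefs]
print(selected)
```

Because each group's basis is orthonormalized, the per-group update has the closed form shown above (shrink the least-squares fit toward zero, or drop the whole function when its norm falls below the threshold); with a general basis, the block update would itself require an inner iteration.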
References
Avalos, M., Grandvalet, Y., Ambroise, C.: Parsimonious additive models. Comput. Stat. Data Anal. 51, 2851–2870 (2007)
Belitz, C., Lang, S.: Simultaneous selection of variables and smoothing parameters in structured additive regression models. Comput. Stat. Data Anal. 53, 61–81 (2008)
Belitz, C., Brezger, A., Kneib, T., Lang, S., Umlauf, N.: BayesX: Software for Bayesian inference in structured additive regression models (2012). http://www.bayesx.org. Version 2.1
Bühlmann, P., Hothorn, T.: Boosting algorithms: Regularization, prediction and model fitting. Stat. Sci. 22, 477–505 (2007)
Bühlmann, P., Yu, B.: Boosting with the \(l_2\) loss: regression and classification. J. Am. Stat. Assoc. 98, 324–339 (2003)
Cottet, R., Kohn, R.J., Nott, D.J.: Variable selection and model averaging in semiparametric overdispersed generalized linear models. J. Am. Stat. Assoc. 103, 661–671 (2008)
Eaton, J. W., Bateman, D., Hauberg, S.: GNU Octave Manual Version 3. Network Theory Limited (2008)
Eilers, P.H.C., Marx, B.D.: Flexible smoothing using B-splines and penalized likelihood. Stat. Sci. 11, 89–121 (1996)
Eugster, M.A., Hothorn, T. (Authors), Frick, H., Kondofersky, I., Kuehnle, O.S., Lindenlaub, C., Pfundstein, G., Speidel, M., Spindler, M., Straub, A., Wickler, F., Zink, K. (Contributors): hgam: High-dimensional additive modelling (2010). R package version 0.1-0
Fahrmeir, L., Kneib, T.: Bayesian Smoothing and Regression for Longitudinal, Spatial and Event History Data. Oxford Statistical Science Series 36. Oxford University Press, Oxford (2011)
Fahrmeir, L., Kneib, T., Konrath, S.: Bayesian regularization in structured additive regression: a unifying perspective on shrinkage, smoothing and predictor selection. Stat. Comput. 20, 203–219 (2010)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Frank, A., Asuncion, A.: UCI machine learning repository (2010). http://archive.ics.uci.edu/ml
George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889 (1993)
George, E.I., McCulloch, R.E.: Approaches for Bayesian variable selection. Statistica Sinica 7, 339–374 (1997)
Griffin, J.E., Brown, P.J.: Alternative prior distributions for variable selection with very many more variables than observations. Technical Report UKC/IMS/05/08, IMS, University of Kent (2005)
Gu, C.: Smoothing Spline ANOVA Models. Springer, Berlin (2002)
Hothorn, T., Bühlmann, P., Kneib, T., Schmid, M., Hofner, B.: mboost: Model-based boosting (2012). R package version 2.1-1
Huang, J., Horowitz, J.L., Wei, F.: Variable selection in nonparametric additive models. Ann. Stat. 38, 2282–2313 (2010)
Ishwaran, H., Rao, J.S.: Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Stat. 33(2), 730–773 (2005)
Kneib, T., Hothorn, T., Tutz, G.: Variable selection and model choice in geoadditive regression models. Biometrics 65, 626–634 (2009)
Kneib, T., Konrath, S., Fahrmeir, L.: High-dimensional structured additive regression models: Bayesian regularisation, smoothing and predictive performance. Appl. Stat. 60, 51–70 (2011)
Konrath, S., Kneib, T., Fahrmeir, L.: Bayesian smoothing, shrinkage and variable selection in hazard regression. In: Becker, C., Fried, R., Kuhnt, S. (eds.) Robustness and Complex Data Structures. Festschrift in Honour of Ursula Gather (2013)
Leng, C., Zhang, H.H.: Model selection in nonparametric hazard regression. Nonparametr. Stat. 18, 417–429 (2006)
Lin, Y., Zhang, H.H.: Component selection and smoothing in multivariate nonparametric regression. Ann. Stat. 34, 2272–2297 (2006)
Marra, G., Wood, S.: Practical variable selection for generalized additive models. Comput. Stat. Data Anal. 55, 2372–2387 (2011)
MATLAB. MATLAB version 7.10.0 (R2010a). The MathWorks Inc., Natick, Massachusetts (2010)
Meier, L.: grplasso: Fitting user specified models with Group Lasso penalty (2009). R package version 0.4-2
Meier, L., van de Geer, S., Bühlmann, P.: The group Lasso for logistic regression. J. R. Stat. Soc. Ser. B 70, 53–71 (2008)
Meier, L., van de Geer, S., Bühlmann, P.: High-dimensional additive modeling. Ann. Stat. 37, 3779–3821 (2009)
O’Hara, R.B., Sillanpää, M.J.: A review of Bayesian variable selection methods: what, how, and which? Bayesian Anal. 4, 85–118 (2009)
Panagiotelis, A., Smith, M.: Bayesian identification, selection and estimation of semiparametric functions in high-dimensional additive models. J. Econom. 143, 291–316 (2008)
Park, T., Casella, G.: The Bayesian lasso. J. Am. Stat. Assoc. 103, 681–686 (2008)
Polson, N.G., Scott, J.G.: Local shrinkage rules, Lévy processes and regularized regression. J. R. Stat. Soc. Ser. B 74(2), 287–311 (2012)
R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2011). http://www.R-project.org/
Radchenko, P., James, G.M.: Variable selection using adaptive nonlinear interaction structures in high dimensions. J. Am. Stat. Assoc. 105, 1–13 (2010)
Ravikumar, P., Liu, H., Lafferty, J., Wasserman, L.: Sparse additive models. J. R. Stat. Soc. Ser. B 71, 1009–1030 (2009)
Reich, B.J., Storlie, C.B., Bondell, H.D.: Variable selection in Bayesian smoothing spline ANOVA models: application to deterministic computer codes. Technometrics 51(2), 110–120 (2009)
Rue, H., Held, L.: Gaussian Markov Random Fields. Chapman & Hall/CRC (2005)
Sabanés Bové, D.: hypergsplines: Bayesian model selection with penalised splines and hyper-g prior (2012). R package version 0.0-32
Sabanés Bové, D., Held, L., Kauermann, G.: Mixtures of g-priors for generalised additive model selection with penalised splines. Technical report, University of Zurich and Bielefeld University (2011). http://arxiv.org/abs/1108.3520
Scheipl, F.: Bayesian regularization and model choice in structured additive regression. PhD thesis, Ludwig-Maximilians-Universität München (2011a)
Scheipl, F.: spikeSlabGAM: Bayesian variable selection, model choice and regularization for generalized additive mixed models in R. J. Stat. Softw. 43(14), 1–24 (2011b). http://www.jstatsoft.org/v43/i14
Scheipl, F., Fahrmeir, L., Kneib, T.: Spike-and-slab priors for function selection in structured additive regression models. J. Am. Stat. Assoc. 107(500), 1518–1532 (2012). http://arxiv.org/abs/1105.5250
Smith, M., Kohn, R.: Nonparametric regression using Bayesian variable selection. J. Econom. 75, 317–344 (1996)
Storlie, C., Bondell, H., Reich, B., Zhang, H.H.: Surface estimation, variable selection, and the nonparametric oracle property. Statistica Sinica 21(2), 679–705 (2011)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
Tutz, G., Binder, H.: Generalized additive modelling with implicit variable selection by likelihood based boosting. Biometrics 62, 961–971 (2006)
Umlauf, N., Kneib, T., Lang, S.: R2BayesX: Estimate structured additive regression models with BayesX (2012). R package version 0.1-1
Wahba, G.: Spline Models for Observational Data. SIAM (1990)
Wang, L., Chen, G., Li, H.: Group SCAD regression analysis for microarray time course gene expression data. Bioinformatics 23, 1486–1494 (2007)
Wood, S.: mgcv: GAMs with GCV/AIC/REML smoothness estimation and GAMMs by PQL (2012). R package version 1.7-18
Wood, S., Kohn, R., Shively, T., Jiang, W.: Model selection in spline nonparametric regression. J. R. Stat. Soc. Ser. B 64, 119–139 (2002)
Xue, L.: Consistent variable selection in additive models. Statistica Sinica 19, 1281–1296 (2009)
Yau, P., Kohn, R., Wood, S.: Bayesian variable selection and model averaging in high-dimensional multinomial nonparametric regression. J. Comput. Graph. Stat. 12, 23–54 (2003)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B 68, 49–67 (2006)
Zhang, H.H., Cheng, G., Liu, Y.: Linear or nonlinear? Automatic structure discovery for partially linear models. J. Am. Stat. Assoc. 106(495), 1099–1112 (2011)
Zhang, H.H., Lin, Y.: Component selection and smoothing for nonparametric regression in exponential families. Statistica Sinica 16, 1021–1041 (2006)
Zou, H.: The adaptive Lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Acknowledgments
Financial support from the German Science Foundation, grants FA 128/5-1, FA 128/5-2 is gratefully acknowledged. We thank M. Avalos, H. Liu, and L. Xue for providing software implementing their methods upon request and D. Sabanés Bové for his generous assistance with the application of hypergsplines.
Cite this article
Scheipl, F., Kneib, T. & Fahrmeir, L. Penalized likelihood and Bayesian function selection in regression models. AStA Adv Stat Anal 97, 349–385 (2013). https://doi.org/10.1007/s10182-013-0211-3