Abstract
Conformal predictors, introduced by Vovk et al. (Algorithmic Learning in a Random World, Springer, New York, 2005), serve to build prediction intervals by exploiting a notion of conformity of the new data point with previously observed data. We propose a novel method for constructing prediction intervals for the response variable in multivariate linear models. The main emphasis is on sparse linear models, where only a few of the covariates have significant influence on the response variable even if the total number of covariates is very large. Our approach is based on combining the principle of conformal prediction with the ℓ1 penalized least squares estimator (LASSO). The resulting confidence set depends on a parameter ε>0 and has a coverage probability larger than or equal to 1−ε. The numerical experiments reported in the paper show that the resulting confidence sets are short. Furthermore, as a by-product of the proposed approach, we provide a data-driven procedure for choosing the LASSO penalty. The selection power of the method is illustrated on simulated and real data.
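The combination described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm (which follows the full, transductive conformal construction of Vovk et al.); it is a simplified split-conformal variant, in which a LASSO fit on one half of the data supplies conformity scores (absolute residuals) on the other half, and the (1−ε)-quantile of those scores yields an interval with marginal coverage at least 1−ε under exchangeability. The coordinate-descent LASSO solver and all function names here are illustrative assumptions, not the authors' code.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal coordinate-descent LASSO for (1/2)||y - X b||^2 + lam ||b||_1.

    Illustrative only: assumes no all-zero columns and does not check convergence.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding coordinate j, then soft-threshold.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def split_conformal_interval(X, y, x_new, lam=1.0, eps=0.1, seed=0):
    """Split-conformal prediction interval with LASSO as the point predictor.

    Returns (lo, hi); under exchangeability of the data points, the interval
    contains the new response with probability at least 1 - eps.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    train, calib = idx[: n // 2], idx[n // 2:]
    beta = lasso_cd(X[train], y[train], lam)
    # Conformity scores: absolute residuals on the calibration half.
    scores = np.sort(np.abs(y[calib] - X[calib] @ beta))
    # Conformal quantile rank: ceil((m + 1)(1 - eps)), capped at m.
    m = len(calib)
    k = min(int(np.ceil((m + 1) * (1 - eps))), m)
    q = scores[k - 1]
    pred = float(x_new @ beta)
    return pred - q, pred + q
```

Note that ε plays the same role as in the abstract: shrinking ε widens the interval in exchange for higher guaranteed coverage. The paper's full method additionally exploits the piecewise-linear LASSO path to avoid data splitting.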
References
Bühlmann, P., Hothorn, T.: Twin boosting: improved feature selection and prediction. Stat. Comput. (2010, this issue)
Bunea, F., Tsybakov, A., Wegkamp, M.: Sparsity oracle inequalities for the Lasso. Electron. J. Stat. 1, 169–194 (2007)
Casella, G., Berger, R.L.: Statistical Inference. Duxbury, N. Scituate (2001)
Chen, S.S., Donoho, D.L.: Atomic decomposition by basis pursuit. Technical Report (1995)
Chesneau, Ch., Hebiri, M.: Some theoretical results on the grouped variables Lasso. Math. Methods Stat. 17, 317–326 (2008)
Dalalyan, A., Tsybakov, A.: Aggregation by exponential weighting and sharp oracle inequalities. In: Learning Theory. Lecture Notes in Comput. Sci., vol. 4539, pp. 97–111. Springer, Berlin (2007)
Dalalyan, A., Tsybakov, A.: Aggregation by exponential weighting, sharp pac-bayesian bounds and sparsity. Mach. Learn. 72, 39–61 (2008)
Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression—with discussion. Ann. Stat. 32, 407–499 (2004)
Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Stat. 1, 302–332 (2007)
Garrigues, P., El Ghaoui, L.: An homotopy algorithm for the lasso with online observations. In: Neural Information Processing Systems (NIPS), vol. 21, pp. 489–496. MIT Press, Cambridge (2008)
Györfi, L., Kohler, M., Krzyzak, A., Walk, H.: A Distribution-Free Theory of Nonparametric Regression. Springer Series in Statistics. Springer, New York (2002)
Hebiri, M.: Regularization with the smooth-lasso procedure. Technical Report (2008)
Huang, C., Cheang, G.L.H., Barron, A.: Risk of penalized least squares, greedy selection and l1 penalization for flexible function libraries. Preprint (2008)
Kim, S.J., Koh, K., Lustig, M., Boyd, S., Gorinevsky, D.: An interior-point method for large-scale l1-regularized least squares. IEEE J. Sel. Top. Signal Process. 1, 606–617 (2007)
Knight, K., Fu, W.: Asymptotics for lasso-type estimators. Ann. Stat. 28, 1356–1378 (2000)
Langford, J., Li, L., Zhang, T.: Sparse online learning via truncated gradient. J. Mach. Learn. Res. 10, 777–801 (2009)
Meinshausen, N., Bühlmann, P.: High dimensional graphs and variable selection with the lasso. Ann. Stat. 34, 1436–1462 (2006)
Osborne, M., Presnell, B., Turlach, B.: On the LASSO and its dual. J. Comput. Graph. Stat. 9, 319–337 (2000a)
Osborne, M.R., Presnell, B., Turlach, B.A.: A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20, 389–403 (2000b)
Park, M.Y., Hastie, T.: L1-regularization path algorithm for generalized linear models. J. R. Stat. Soc., Ser. B, Stat. Methodol. 69, 659–677 (2007)
Rosset, S., Zhu, J.: Piecewise linear regularized solution paths. Ann. Stat. 35, 1012–1030 (2007)
Santosa, F., Symes, W.W.: Linear inversion of band-limited reflection seismograms. SIAM J. Sci. Stat. Comput. 7, 1307–1330 (1986)
Shalev-Shwartz, S., Tewari, A.: Stochastic methods for ℓ1 regularized loss minimization. In: Proceedings of the 26th International Conference on Machine Learning. Omnipress, Montreal (2009)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B 58, 267–288 (1996)
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67, 91–108 (2005)
Vapnik, V.: Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control. Wiley, New York (1998)
Vovk, V.: Asymptotic optimality of transductive confidence machine. In: Algorithmic Learning Theory. Lecture Notes in Comput. Sci., vol. 2533, pp. 336–350. Springer, Berlin (2002a)
Vovk, V.: On-line confidence machines are well-calibrated. In: Proceedings of the Forty-Third Annual Symposium on Foundations of Computer Science, pp. 187–196. IEEE Computer Society, Los Alamitos (2002b)
Vovk, V., Gammerman, A., Saunders, C.: Machine-learning applications of algorithmic randomness. In: Proceedings of the 16th International Conference on Machine Learning, pp. 444–453. ICML (1999)
Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer, New York (2005)
Vovk, V., Nouretdinov, I., Gammerman, A.: On-line predictive linear regression. Technical Report (2007)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc., Ser. B, Stat. Methodol. 68, 49–67 (2006)
Zhao, P., Yu, B.: On model selection consistency of Lasso. J. Mach. Learn. Res. 7, 2541–2563 (2006)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67, 301–320 (2005)
Zou, H., Hastie, T., Tibshirani, R.: On the “Degrees of Freedom” of the lasso. Ann. Stat. 35, 2173–2192 (2007). URL citeseer.ist.psu.edu/766780.html
Hebiri, M. Sparse conformal predictors. Stat Comput 20, 253–266 (2010). https://doi.org/10.1007/s11222-009-9167-2