
Estimator Selection: a New Method with Applications to Kernel Density Estimation

Published in Sankhya A.

Abstract

Estimator selection has become a crucial issue in nonparametric estimation. Two widely used approaches are penalized empirical risk minimization (such as penalized log-likelihood estimation) and pairwise comparison (such as Lepski’s method). Our aim in this paper is twofold. First, we discuss the calibration issue for estimator selection methods: we review some known results, with emphasis on the concept of minimal penalty, which is helpful for designing data-driven selection criteria. Second, we present a new method for bandwidth selection within the framework of kernel density estimation, which is in some sense intermediate between the two main methods mentioned above. We provide theoretical results leading to a fully data-driven selection strategy.
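To make the empirical-risk-minimization paradigm concrete in the kernel density setting, the sketch below selects a bandwidth by minimizing the classical least-squares cross-validation criterion. This is an illustrative toy example, not the new method introduced in the paper; the Gaussian kernel, the candidate grid, and the simulated sample are all assumptions made for the sake of the demonstration.

```python
import numpy as np

def gauss(u, h):
    # Gaussian kernel with bandwidth h, evaluated at u
    return np.exp(-u**2 / (2 * h**2)) / (h * np.sqrt(2 * np.pi))

def lscv_score(x, h):
    """Least-squares cross-validation criterion for bandwidth h.

    Estimates the integrated squared error (up to a constant) of the
    kernel density estimator: integral of fhat^2 minus twice the
    leave-one-out estimate of the integral of fhat * f.
    """
    n = len(x)
    d = x[:, None] - x[None, :]              # pairwise differences
    # Integral of fhat^2: a convolution of two Gaussians with
    # bandwidth h is a Gaussian with bandwidth h * sqrt(2).
    int_f2 = gauss(d, h * np.sqrt(2)).sum() / n**2
    # Leave-one-out term: exclude the diagonal (j == i).
    k = gauss(d, h)
    loo = (k.sum() - np.trace(k)) / (n * (n - 1))
    return int_f2 - 2 * loo

rng = np.random.default_rng(0)
x = rng.normal(size=200)                     # toy sample
grid = np.linspace(0.05, 1.0, 40)            # candidate bandwidths
h_hat = grid[np.argmin([lscv_score(x, h) for h in grid])]
```

The selected `h_hat` minimizes an unbiased estimate of the risk over the candidate grid; pairwise-comparison methods such as Lepski's instead compare the estimators associated with different bandwidths directly.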


References

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Proceedings 2nd International Symposium on Information Theory (P. N. Petrov and F. Csaki, eds.). Akademia Kiado, Budapest, pp. 267–281.

  • Arlot, S. and Bach, F. (2009). Data-driven calibration of linear estimators with minimal penalties. In Advances in Neural Information Processing Systems (Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams and A. Culotta, eds.), Vol. 22, pp. 46–54.

  • Arlot, S. and Massart, P. (2009). Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10, 245–279.

  • Bahadur, R.R. (1958). Examples of inconsistency of maximum likelihood estimates. Sankhya Ser. A 20, 207–210.

  • Barron, A.R. and Cover, T.M. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37, 1034–1054.

  • Barron, A.R., Birgé, L. and Massart, P. (1999). Risk bounds for model selection via penalization. Probab. Th. Rel. Fields 113, 301–415.

  • Baudry, J.-P., Maugis, C. and Michel, B. (2011). Slope heuristics: overview and implementation. Stat. Comput., 1–16.

  • Bertin, K., Lacour, C. and Rivoirard, V. (2016). Adaptive pointwise estimation of conditional density function. Ann. Inst. Henri Poincaré Probab. Stat. 52, 939–980.

  • Bertin, K., Le Pennec, E. and Rivoirard, V. (2011). Adaptive Dantzig density estimation. Ann. Inst. Henri Poincaré Probab. Stat. 47, 43–74.

  • Bickel, P.J., Ritov, Y. and Tsybakov, A.B. (2009). Simultaneous analysis of Lasso and Dantzig selector. Ann. Stat. 37, 1705–1732.

  • Birgé, L. and Massart, P. (1993). Rates of convergence for minimum contrast estimators. Probab. Th. Relat. Fields 97, 113–150.

  • Birgé, L. and Massart, P. (1998). Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4, 329–375.

  • Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. 3, 203–268.

  • Birgé, L. and Massart, P. (2007). Minimal penalties for Gaussian model selection. Probab. Th. Rel. Fields 138, 33–73.

  • Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities. Oxford University Press.

  • Daniel, C. and Wood, F.S. (1971). Fitting Equations to Data. Wiley, New York.

  • Devroye, L. and Lugosi, G. (2001). Combinatorial Methods in Density Estimation. Springer Series in Statistics. Springer, New York.

  • Donoho, D.L. and Johnstone, I.M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425–455.

  • Donoho, D.L. and Johnstone, I.M. (1994). Ideal denoising in an orthonormal basis chosen from a library of bases. C. R. Acad. Sc. Paris Sér. I Math. 319, 1317–1322.

  • Donoho, D.L., Johnstone, I.M., Kerkyacharian, G. and Picard, D. (1995). Wavelet shrinkage: Asymptopia? J. R. Statist. Soc. B 57, 301–369.

  • Donoho, D.L., Johnstone, I.M., Kerkyacharian, G. and Picard, D. (1996). Density estimation by wavelet thresholding. Ann. Statist. 24, 508–539.

  • Doumic, M., Hoffmann, M., Reynaud-Bouret, P. and Rivoirard, V. (2012). Nonparametric estimation of the division rate of a size-structured population. SIAM J. Numer. Anal. 50, 925–950.

  • Efroimovitch, S.Yu. and Pinsker, M.S. (1984). Learning algorithm for nonparametric filtering. Automat. Remote Control 11, 1434–1440. Translated from Avtomatika i Telemekhanika 11, 58–65.

  • Goldenshluger, A. and Lepski, O. (2008). Universal pointwise selection rule in multivariate function estimation. Bernoulli 14, 1150–1190.

  • Goldenshluger, A. and Lepski, O. (2009). Structural adaptation via \(L_p\)-norm oracle inequalities. Probab. Theory Related Fields 143, 41–71.

  • Goldenshluger, A. and Lepski, O. (2011). Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. Ann. Statist. 39, 1608–1632.

  • Goldenshluger, A. and Lepski, O. (2013). General selection rule from a family of linear estimators. Theory Probab. Appl. 57, 209–226.

  • Goldenshluger, A. and Lepski, O. (2014). On adaptive minimax density estimation on \(\mathbb {R}^{d}\). Probab. Theory Related Fields 159, 479–543.

  • Kerkyacharian, G., Lepski, O. and Picard, D. (2008). Nonlinear estimation in anisotropic multi-index denoising. Sparse case. Theory Probab. Appl. 52, 58–77.

  • Lacour, C. and Massart, P. (2016). Minimal penalty for Goldenshluger–Lepski method. <hal-01121989v2>. To appear in Stoch. Proc. Appl.

  • Lebarbier, E. (2005). Detecting multiple change points in the mean of Gaussian process by model selection. Signal Process. 85, 717–736.

  • Lepskii, O.V. (1990). On a problem of adaptive estimation in Gaussian white noise. Theory Probab. Appl. 36, 454–466.

  • Lepskii, O.V. (1991). Asymptotically minimax adaptive estimation I: Upper bounds. Optimally adaptive estimates. Theory Probab. Appl. 36, 682–697.

  • Lepskii, O.V. (2013). Upper functions for positive random functionals. II. Application to the empirical processes theory, Part 1. Math. Methods Statist. 22, 83–99.

  • Lerasle, M. (2012). Optimal model selection in density estimation. Ann. Inst. Henri Poincaré Probab. Stat. 48, 884–908.

  • Lerasle, M., Magalhães, N. and Reynaud-Bouret, P. (2015). Optimal kernel selection for density estimation. To appear in High Dimensional Probabilities VII: The Cargèse Volume.

  • Lerasle, M. and Takahashi, D.Y. (2016). Sharp oracle inequalities and slope heuristic for specification probabilities estimation in general random fields. Bernoulli 22, 1.

  • Mallows, C.L. (1973). Some comments on \(C_p\). Technometrics 15, 661–675.

  • Massart, P. (2007). Concentration Inequalities and Model Selection. École d’été de Probabilités de Saint-Flour 2003. Lecture Notes in Mathematics 1896. Springer, Berlin/Heidelberg.

  • Nikol’skii, S.M. (1977). Priblizhenie funktsii mnogikh peremennykh i teoremy vlozheniya [Approximation of Functions of Several Variables and Imbedding Theorems]. Second edition, revised and supplemented. Nauka, Moscow.

  • Pinsker, M.S. (1980). Optimal filtration of square-integrable signals in Gaussian noise. Probl. Inf. Transm. 16, 120–133.

  • Reynaud-Bouret, P., Rivoirard, V. and Tuleau-Malot, C. (2011). Adaptive density estimation: a curse of support? J. Statist. Plann. Inference 141, 115–139.

  • Rigollet, P. (2006). Adaptive density estimation using the blockwise Stein method. Bernoulli 12, 351–370.

  • Saumard, A. (2013). Optimal model selection in heteroscedastic regression using piecewise polynomial functions. Electron. J. Stat. 7, 1184–1223.

  • Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6, 461–464.

  • Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Monographs on Statistics and Applied Probability. Chapman & Hall, London.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. B 58, 267–288.


Author information

Corresponding author

Correspondence to Pascal Massart.


About this article


Cite this article

Lacour, C., Massart, P. & Rivoirard, V. Estimator Selection: a New Method with Applications to Kernel Density Estimation. Sankhya A 79, 298–335 (2017). https://doi.org/10.1007/s13171-017-0107-5

