A review on consistency and robustness properties of support vector machines for heavy-tailed distributions

Regular Article · Advances in Data Analysis and Classification

Abstract

Support vector machines (SVMs) belong to the class of modern statistical machine learning techniques and can be described as M-estimators with a Hilbert norm regularization term for functions. SVMs are consistent and robust for classification and regression purposes if based on a Lipschitz continuous loss and a bounded continuous kernel with a dense reproducing kernel Hilbert space. For regression, one of the conditions used is that the output variable Y has a finite first absolute moment. This assumption, however, excludes heavy-tailed distributions. Recently, the applicability of SVMs was enlarged to these distributions by considering shifted loss functions. In this review paper, we briefly describe the approach of SVMs based on shifted loss functions and list some properties of such SVMs. Then, we prove that SVMs based on a bounded continuous kernel and on a convex and Lipschitz continuous, but not necessarily differentiable, shifted loss function have a bounded Bouligand influence function for all distributions, even for heavy-tailed distributions including extreme value distributions and Cauchy distributions. SVMs are thus robust in this sense. Our result covers the important loss functions \(\epsilon\)-insensitive for regression and pinball for quantile regression, which were not covered by earlier results on the influence function. We demonstrate the usefulness of SVMs even for heavy-tailed distributions by applying SVMs to a simulated data set with Cauchy errors and to a data set of large fire insurance claims of Copenhagen Re.
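The key device of the shifted-loss approach can be illustrated numerically. The minimal sketch below (plain Python; all function names are our own, not from the paper) compares the pinball loss \(L(y,t)\) used for quantile regression with its shifted version \(L^{\star}(y,t) = L(y,t) - L(y,0)\) on standard Cauchy data. Because the Cauchy distribution has no finite first absolute moment, the risk of \(L\) itself is infinite; the shifted loss, however, satisfies \(|L^{\star}(y,t)| \le \max(\tau, 1-\tau)\,|t|\) pointwise by Lipschitz continuity, so its risk is finite for every distribution.

```python
import math
import random

def pinball(y, t, tau=0.5):
    """Pinball (check) loss for the tau-quantile: tau*(y-t) if y >= t, else (tau-1)*(y-t)."""
    r = y - t
    return tau * r if r >= 0 else (tau - 1) * r

def shifted_pinball(y, t, tau=0.5):
    """Shifted loss L*(y,t) = L(y,t) - L(y,0); bounded by max(tau, 1-tau)*|t| for every y."""
    return pinball(y, t, tau) - pinball(y, 0.0, tau)

random.seed(0)
# Standard Cauchy samples via the inverse CDF; E|Y| does not exist for this distribution.
ys = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(100_000)]

t, tau = 2.0, 0.5
# Sample mean of L: dominated by extreme outliers, no finite population counterpart.
mean_plain = sum(pinball(y, t, tau) for y in ys) / len(ys)
# Sample mean of L*: every summand lies in [-tau*|t|, tau*|t|] = [-1, 1], so it is stable.
mean_shifted = sum(shifted_pinball(y, t, tau) for y in ys) / len(ys)

print(f"sample mean of L : {mean_plain:.3f}")
print(f"sample mean of L*: {mean_shifted:.3f}")
```

The deterministic bound on the shifted summands, not the particular sample, is the point: \(L^{\star}\) turns an ill-defined risk into a well-defined one without changing the minimizer.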




Corresponding author

Correspondence to Arnout Van Messem.

Cite this article

Van Messem, A., Christmann, A. A review on consistency and robustness properties of support vector machines for heavy-tailed distributions. Adv Data Anal Classif 4, 199–220 (2010). https://doi.org/10.1007/s11634-010-0067-2
