Abstract
Support vector machines (SVMs) belong to the class of modern statistical machine learning techniques and can be described as M-estimators with a Hilbert norm regularization term for functions. SVMs are consistent and robust for classification and regression purposes if based on a Lipschitz continuous loss and a bounded continuous kernel with a dense reproducing kernel Hilbert space. For regression, one of the conditions used is that the output variable Y has a finite first absolute moment. This assumption, however, excludes heavy-tailed distributions. Recently, the applicability of SVMs was enlarged to these distributions by considering shifted loss functions. In this review paper, we briefly describe the approach of SVMs based on shifted loss functions and list some properties of such SVMs. Then, we prove that SVMs based on a bounded continuous kernel and on a convex and Lipschitz continuous, but not necessarily differentiable, shifted loss function have a bounded Bouligand influence function for all distributions, even for heavy-tailed distributions including extreme value distributions and Cauchy distributions. SVMs are thus robust in this sense. Our result covers the important loss functions \(\epsilon\)-insensitive for regression and pinball for quantile regression, which were not covered by earlier results on the influence function. We demonstrate the usefulness of SVMs even for heavy-tailed distributions by applying SVMs to a simulated data set with Cauchy errors and to a data set of large fire insurance claims of Copenhagen Re.
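The role of the shifted loss can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the usual definition of the shifted loss, \(L^*(y,t) = L(y,t) - L(y,0)\), and uses the pinball loss with a Cauchy-distributed output. The function names `pinball` and `shifted`, the sample size, the prediction `t` and the quantile level `tau` are all illustrative choices. Because the pinball loss is Lipschitz continuous in its prediction argument, the shifted loss is bounded by the Lipschitz constant times \(|t|\), however extreme the observation is, whereas the unshifted loss grows with \(|y|\).

```python
import numpy as np

def pinball(y, t, tau=0.5):
    """Pinball loss at quantile level tau for observation y and prediction t."""
    r = y - t
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

def shifted(loss, y, t, **kw):
    """Shifted loss L*(y, t) = L(y, t) - L(y, 0) (assumed definition)."""
    return loss(y, t, **kw) - loss(y, 0.0, **kw)

rng = np.random.default_rng(0)
y = rng.standard_cauchy(100_000)  # heavy tails: the first absolute moment is infinite
t = 2.0                           # an arbitrary candidate prediction (illustrative)
tau = 0.9                         # quantile level (illustrative)

plain = pinball(y, t, tau)
star = shifted(pinball, y, t, tau=tau)

# Lipschitz constant of the pinball loss with respect to the prediction t
lipschitz = max(tau, 1.0 - tau)
bound = lipschitz * abs(t)

print("largest unshifted loss:", plain.max())
print("largest |shifted loss|:", np.abs(star).max())
print("Lipschitz bound       :", bound)
```

The unshifted losses are driven by the most extreme observations in the sample, while every shifted loss value stays within the bound. This is why the risk based on the shifted loss is well defined without a moment condition on Y.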
Cite this article
Van Messem, A., Christmann, A. A review on consistency and robustness properties of support vector machines for heavy-tailed distributions. Adv Data Anal Classif 4, 199–220 (2010). https://doi.org/10.1007/s11634-010-0067-2
DOI: https://doi.org/10.1007/s11634-010-0067-2
Keywords
- Regularized empirical risk minimization
- Support vector machines
- Consistency
- Robustness
- Bouligand influence function
- Heavy tails