
Approximation with polynomial kernels and SVM classifiers


Abstract

This paper presents an error analysis for classification algorithms generated by regularization schemes with polynomial kernels. Explicit convergence rates are provided for support vector machine (SVM) soft margin classifiers. The misclassification error can be bounded by the sum of the sample error and the regularization error. The main difficulty in studying algorithms with polynomial kernels is the regularization error, which depends heavily on the degrees of the kernel polynomials. We overcome this difficulty by bounding the reproducing kernel Hilbert space norm of Durrmeyer operators and by estimating the rate of approximation by Durrmeyer operators in a weighted L¹ space (the weight being a probability distribution). Our analysis shows that the regularization parameter should decrease exponentially fast with the sample size, a special feature of polynomial kernels.
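For orientation, here is a sketch of the regularization scheme behind this analysis, written in notation standard in the error-analysis literature; it is an illustrative reconstruction rather than an excerpt from the paper, whose precise definitions may differ. Given a sample z = {(x_i, y_i)}_{i=1}^m with labels y_i ∈ {−1, 1} and a polynomial kernel K, the SVM soft margin classifier is sgn(f_z), where

\[
f_{\mathbf z} = \arg\min_{f \in \mathcal{H}_K} \; \frac{1}{m} \sum_{i=1}^{m} \bigl(1 - y_i f(x_i)\bigr)_{+} + \lambda \|f\|_K^2 ,
\]

with (t)_+ = max{t, 0} the hinge loss, H_K the reproducing kernel Hilbert space of K, and λ > 0 the regularization parameter. In one common formulation, the excess misclassification error of sgn(f_z) is bounded by a sample error term plus the regularization error

\[
\mathcal{D}(\lambda) = \inf_{f \in \mathcal{H}_K} \bigl\{ \mathcal{E}(f) - \mathcal{E}(f_c) + \lambda \|f\|_K^2 \bigr\},
\]

where E(f) is the expected hinge loss and f_c a minimizer of it; it is this D(λ) that the abstract singles out as the main difficulty for polynomial kernels.

The Durrmeyer operators used to control D(λ) can be illustrated, in the simplest univariate and unweighted case (the paper itself works with weighted versions, the weight being a probability distribution), by the classical Bernstein-Durrmeyer operator of degree n on [0, 1]:

\[
M_n(f)(x) = (n+1) \sum_{k=0}^{n} p_{n,k}(x) \int_0^1 p_{n,k}(t) f(t)\, dt, \qquad p_{n,k}(x) = \binom{n}{k} x^k (1 - x)^{n-k}.
\]

Since M_n(f) is a polynomial of degree at most n, it belongs to the hypothesis space induced by a degree-n polynomial kernel, so bounding its reproducing kernel Hilbert space norm and its rate of approximation to f in the weighted L¹ norm yields an estimate for the regularization error, as described in the abstract.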




Additional information

Communicated by Y. Xu

Dedicated to Charlie Micchelli on the occasion of his 60th birthday

Mathematics subject classifications (2000)

68T05, 62J02.

The first author, Ding-Xuan Zhou, is partially supported by the Research Grants Council of Hong Kong (Project No. CityU 103704).



Cite this article

Zhou, DX., Jetter, K. Approximation with polynomial kernels and SVM classifiers. Adv Comput Math 25, 323–344 (2006). https://doi.org/10.1007/s10444-004-7206-2
