DOI: 10.1145/2487575.2487591 · KDD Conference Proceedings · research-article

Fast and scalable polynomial kernels via explicit feature maps

Published: 11 August 2013

ABSTRACT

Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel machines. While previous random feature mappings run in O(ndD) time for n training samples in d-dimensional space and D random feature maps, we propose a novel randomized tensor product technique, called Tensor Sketching, for approximating any polynomial kernel in O(n(d + D log D)) time. We also introduce both absolute and relative error bounds for our approximation, guaranteeing the reliability of the estimation algorithm. Empirically, Tensor Sketching achieves higher accuracy and often runs orders of magnitude faster than the state-of-the-art approach on large-scale real-world datasets.
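The idea behind the O(n(d + D log D)) bound can be sketched as follows: each input vector is Count Sketched once per polynomial degree with independent hash functions, and the sketches are combined by circular convolution, computed as a component-wise product in the FFT domain. A minimal NumPy illustration of this construction (not the authors' reference implementation: fully random hash tables stand in for the pairwise-independent hash functions analyzed in the paper, and the dimensions d, D, degree p, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, p = 50, 4096, 2  # input dim, sketch dim, polynomial degree (assumed values)

# p independent Count Sketch hash/sign tables (fully random here for simplicity)
h = rng.integers(0, D, size=(p, d))
s = rng.choice([-1.0, 1.0], size=(p, d))

def count_sketch(x, i):
    """Count Sketch of x with the i-th hash/sign pair: O(d) time."""
    c = np.zeros(D)
    np.add.at(c, h[i], s[i] * x)  # scatter-add handles colliding indices
    return c

def tensor_sketch(x):
    """Degree-p tensor sketch: convolve the p Count Sketches via the FFT,
    giving O(d + D log D) time per vector instead of O(d^p)."""
    fft_prod = np.ones(D, dtype=complex)
    for i in range(p):
        fft_prod *= np.fft.fft(count_sketch(x, i))
    return np.real(np.fft.ifft(fft_prod))

# the inner product of two sketches estimates the polynomial kernel (x . y)^p
x = rng.standard_normal(d); x /= np.linalg.norm(x)
y = rng.standard_normal(d); y /= np.linalg.norm(y)
approx = tensor_sketch(x) @ tensor_sketch(y)
exact = (x @ y) ** p
```

With unit-norm inputs and D = 4096, the estimate typically lands within a few hundredths of the exact kernel value; the sketches can then be fed directly to a linear learner such as LIBLINEAR.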


Published in

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2013
1534 pages
ISBN: 9781450321747
DOI: 10.1145/2487575

Copyright © 2013 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '13 paper acceptance rate: 125 of 726 submissions (17%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).
