Large-Scale Support Vector Learning with Structural Kernels

Severyn, Aliaksei; Moschitti, Alessandro

doi:10.1007/978-3-642-15939-8_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6323)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

In this paper, we present an extensive study of the cutting-plane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In particular, we carry out comprehensive experiments on two natural language tasks: predicate argument extraction and question answering. Our results show that (i) CPA applied to train a non-linear model with different tree kernels fully matches the accuracy of the conventional SVM algorithm while being ten times faster; and (ii) by using smaller sampling sizes to approximate subgradients in CPA, we can trade off accuracy for speed, yet the optimal parameters and kernels found remain optimal for the exact SVM. These results open numerous research perspectives, e.g. in natural language processing, as they show that complex structural kernels can be used efficiently in real-world applications. For example, for the first time, we could carry out extensive tests of several tree kernels on millions of training instances. As a direct benefit, we could experiment with a variant of the partial tree kernel, which we also propose in this paper.
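The accuracy-for-speed trade-off in point (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common one-slack setting in which a cut is the average of y_i * phi(x_i) over the examples that violate the margin, and it approximates that average from a random sample. The names `build_cut`, `cut_dot`, `exact_cut` and `sampled_cut` are hypothetical, and a linear kernel on toy vectors stands in for a tree kernel.

```python
import random

def linear_kernel(a, b):
    """Stand-in kernel (assumption); a tree kernel would compare two parse trees."""
    return sum(x * y for x, y in zip(a, b))

def build_cut(xs, ys, score, indices):
    """Keep the margin violators (y * f(x) < 1) among `indices`.

    Returns the violating indices and the normalizer, so that the cut is
    g = (1 / normalizer) * sum_i y_i * phi(x_i) over the violators.
    """
    active = [i for i in indices if ys[i] * score(xs[i]) < 1.0]
    return active, len(indices)

def cut_dot(xs, ys, cut, x_new, kernel):
    """Inner product <g, phi(x_new)>, computed through kernel calls only."""
    active, normalizer = cut
    return sum(ys[i] * kernel(xs[i], x_new) for i in active) / normalizer

def exact_cut(xs, ys, score):
    """Cut built from every training example."""
    return build_cut(xs, ys, score, list(range(len(xs))))

def sampled_cut(xs, ys, score, sample_size, rng):
    """Cut built from a uniform sample drawn without replacement."""
    return build_cut(xs, ys, score, rng.sample(range(len(xs)), sample_size))

# Toy data: four points, two per class; the current model scores everything 0.
xs = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
ys = [1, 1, -1, -1]
zero_model = lambda x: 0.0

full = exact_cut(xs, ys, zero_model)
approx = sampled_cut(xs, ys, zero_model, 2, random.Random(0))
```

In this sketch, evaluating the sampled cut against a new example costs `sample_size` kernel calls instead of one per training example, which is where the speed-up would come from when each kernel call compares two trees.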

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Trento, Via Sommarive 14, 38100, POVO, (TN), Italy
Aliaksei Severyn & Alessandro Moschitti

Editor information

Editors and Affiliations

Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, Avenida de los Castros, s/n, 39071, Santander, Spain
José Luis Balcázar
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Francesco Bonchi
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Aristides Gionis
TAO, CNRS-INRIA-LRI, Université Paris-Sud, 91405, Orsay, France
Michèle Sebag

Cite this paper

Severyn, A., Moschitti, A. (2010). Large-Scale Support Vector Learning with Structural Kernels. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_15

DOI: https://doi.org/10.1007/978-3-642-15939-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8
eBook Packages: Computer Science, Computer Science (R0)