Skip to main content
Log in

A Generalization of Haussler's Convolution Kernel — Mapping Kernel and Its Application to Tree Kernels

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Haussler's convolution kernel provides an effective framework for engineering positive semidefinite kernels, and has a wide range of applications. On the other hand, the mapping kernel that we introduce in this paper is its natural generalization, and will enlarge the range of application significantly. Our main theorem with respect to positive semidefiniteness of the mapping kernel (1) implies Haussler's theorem as a corollary, (2) exhibits an easy-to-check necessary and sufficient condition for mapping kernels to be positive semidefinite, and (3) formalizes the mapping kernel so that significant flexibility is provided in engineering new kernels. As an evidence of the effectiveness of our results, we present a framework to engineer tree kernels. The tree is a data structure widely used in many applications, and tree kernels provide an effective method to analyze tree-type data. Thus, not only is the framework important as an example but also as a practical research tool. The description of the framework accompanies a survey of the tree kernels in the literature, where we see that 18 out of the 19 surveyed tree kernels of different types are instances of the mapping kernel, and examples of novel interesting tree kernels.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Haussler D. Convolution kernels on discrete structures. UCSC-CRL 99-10, Dept. of Computer Science, University of California at Santa Cruz, 1999.

  2. Lodhi H, Shawe-Taylor J, Cristianini N, Watkins C J C H. Text classification using string kernels. In Advances in Neural Information Processing Systems 13 (NIPS 2000), Denver, USA, MIT Press, 2001, pp.563–569.

  3. Collins M, Duffy N. Convolution kernels for natural language. In Advances in Neural Information Processing Systems 14 (NIPS 2001), Vancouver, MIT Press, Canada, Dec. 3-8, 2001, pp.625–632.

  4. Moschitti A. Kernel methods, syntax and semantics for relational text categorization. In Proc. ACM 17th Conference on Information and Knowledge Management, Napa Valley, USA, Oct. 26-30, 2008, pp.253–262.

  5. Leslie C S, Eskin E, Stafford Noble W. The spectrum kernel: A string kernel for SVM protein classification. In Proc. Pacific Symposium on Biocomputing, Lihue, Hawaii, Jan. 3-7, 2002, pp.566–575.

  6. Schölkopf C, Simard P, Smola A J, Vapnik V. Prior knowledge in support vector kernels. In Advances in Neural Information Processing System (NIPS 1998), Denver, USA, MIT Press, 1998, pp.640–646.

  7. Zien A, Rätsch G, Mika S, Schölkopf B, Lengauer T, Müller K R. Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics, 2000, 16(9): 799–807.

    Article  Google Scholar 

  8. Shin K. Position-aware string kernels with weighted shifts and a general framework to apply string kernels to other structured data. In Proc. The 8th International Conference on Intelligent Data Engineering and Automated Learning, Birmingham, UK, Dec. 16-19, 2007, pp.316–325.

  9. Shin K, Kuboyama T. A generalization of Haussler's convolution kernel|Mapping kernel. In Proc. ICML 2008, Helsinki, Finland, June 5-9, 2008, pp.944–951.

  10. Leslie C, Eskin E, Cohen A, Weston J, Noble W S. Mismatch string kernels for discriminative protein classification. Bioinformatics, 2004, 20(4): 467–476.

    Article  Google Scholar 

  11. Kuboyama T, Shin K, Kashima H. Flexible tree kernels based on counting the number of tree mappings. In Proc. Machine Learning with Graphs, 2006, pp.61–72.

  12. Tai K C. The tree-to-tree correction problem. JACM, July 1979, 26(3): 422–433.

    Article  MATH  MathSciNet  Google Scholar 

  13. Zhang K. Algorithms for the constrained editing distance between ordered labeled trees and related problems. Pattern Recognition, March 1995, 28(3): 463–474.

    Article  Google Scholar 

  14. Lu C L, Su Z Y, Tang G Y. A new measure of edit distance between labeled trees. In Proc. the 7th Annual Int. Conf. Computing and Combinatories, Guilin, China, Aug. 20-23, 2001, pp.338–348.

  15. Hizukuri Y, Yamanishi Y, Nakamura O, Yagi F, Goto S, Kanehisa M. Extraction of leukemia specific glycan motifs in humans by computational glycomics. Carbohydrate Research, 2005, 340(14): 2270–2278.

    Article  Google Scholar 

  16. Berg C, Christensen J P R, Ressel R. Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions. Springer, 1984.

  17. Shin K, Kuboyama T. Polynomial summaries of positive semidefinite kernels. In Proc. ALT 2007, Sendai, Japan, Oct. 1-4, 2007, pp.313–327.

  18. Vert J P. A tree kernel to analyze phylogenetic profiles. Bioinformatics, 2002, 18(Suppl. 1): 276–284.

    Google Scholar 

  19. Moschitti A, Pighin D, Basili R. Tree kernel engineering in semantic role labeling systems. In Proc. EACL 2006, Trento, Italy, April. 3-7, 2006, pp.49–56.

  20. Zhou G D, Zhang M, Ji D H, Zhu Q M. Tree kernel-based relation extraction with context-sensitive structure parse tree information. In Proc. the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural language Learning, Prague, Czech, Jun. 28-30, 2007, pp.723–726.

  21. Moschitti A, Pighin D, Basili R. Tree kernels for semantic role labeling. Computational Linguistics Journal, Special Issue on Semantic Role Labeling, 2008, 34(2): 193–224.

    MathSciNet  Google Scholar 

  22. Moschitti A, Zanzotto F M. Fast and effective kernels for relational learning from texts. In Proc. The 24th Annual International Conference on Machine Learning, Corvalis, USA, Jun. 20-24, 2007, pp.649–656.

  23. Vishwanathan S V N, Smola A J. Fast kernels for string and tree matching. In Advances Neural Information Processing Systems'5 (NIPS 2002), Vancouver, Canada, MIT Press, Dec. 9-14, 2002, pp.569–576.

  24. Kashima H, Koyanagi T. Kernels for semi-structured data. In Proc. the 9th International Conference on Machine Learning (ICML 2002), Sydney, Australia, Jul. 8-12, 2002, pp.291–298.

  25. Zelenko D, Aone C, Richardella A. Kernel methods for relation extraction. Journal of Machine Learning Research, 2003, 3(Special Issue): 1083–1106.

    Article  MATH  MathSciNet  Google Scholar 

  26. Culotta A, Sorensen J. Dependency tree kernels for relation extraction. In Proc. ACL 2004, Barcelona, Spain, Jul. 21-26, 2004, pp.423–429.

  27. Moschitti A. Efficient convolution kernels for dependency and constituent syntactic trees. In Proc. the 17th European Conference on Machine Learning, Berlin, Germany, Sept. 18-22, 2006, pp.318–329.

  28. Kuboyama T, Hirata K, Kashima H, Aoki-Kinoshita K F, Yasuda H. A spectrum tree kernel. Transactions of JSAI, 2007, 22(2): 140–147.

    Google Scholar 

  29. Yamanishi Y, Bach F, Vert J P. Glycan classification with tree kernels. Bioinformatics, 2007, 23(10): 1211–1216.

    Article  Google Scholar 

  30. Shin K, Kuboyama T. Kernels based on distributions of agreement subtrees. In Proc. The 21st Australian Joint Conference on AI (AI08), Auckland, New Zealand, Dec. 3-5, 2008, pp.236–246.

  31. Hashimoto K, Goto S, Kawano S, Aoki-Kinoshita K F, Ueda N. Kegg as a glycome informatics resource. Glycobiology, 2006, 16: 63R–70R.

    Google Scholar 

  32. Doubet S, Albersheim P. Carbzbank. Glycobiology, 1992, 2(6): 505.

    Article  Google Scholar 

  33. Chang C C, Lin C J. Libsvm: A library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm/, 2001.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Kilho Shin.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Shin, K., Kuboyama, T. A Generalization of Haussler's Convolution Kernel — Mapping Kernel and Its Application to Tree Kernels. J. Comput. Sci. Technol. 25, 1040–1054 (2010). https://doi.org/10.1007/s11390-010-9386-1

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-010-9386-1

Keywords

Navigation