A Concept Lattice-Based Kernel for SVM Text Classification

  • Conference paper
Formal Concept Analysis (ICFCA 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5548)

Abstract

Standard Support Vector Machine (SVM) text classification relies on a bag-of-words kernel to express the similarity between documents. We show that a document lattice can be used to define a valid kernel function that takes into account the relations between different terms. This kernel is based on the notion of conceptual proximity between pairs of terms, as encoded in the document lattice. We describe a method for performing SVM text classification with a concept lattice-based kernel; it consists of text pre-processing, feature selection, lattice construction, computation of pairwise term similarity and of the kernel matrix, and SVM classification in the transformed feature space. We tested the accuracy of the proposed method on the 20NewsGroup database: the results show an improvement over the standard SVM when very little training data is available.
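The difference between a bag-of-words kernel and a kernel that accounts for relations between terms can be illustrated with a minimal sketch. The abstract does not state the lattice-based proximity measure, so the code below substitutes a simple stand-in: the Jaccard overlap between the sets of documents containing each term (the extents a lattice would be built from). This is an assumption for illustration only, not the measure used in the paper, and all function names are hypothetical. Since Jaccard similarity between sets is positive semi-definite, this particular stand-in also yields a valid kernel.

```python
def term_extents(docs):
    """Map each term to the set of indices of the documents containing it."""
    extents = {}
    for i, doc in enumerate(docs):
        for term in set(doc):
            extents.setdefault(term, set()).add(i)
    return extents


def jaccard(a, b):
    """Jaccard overlap of two sets (ASSUMED stand-in for conceptual proximity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_similarity(extents):
    """Pairwise term similarity, keyed by (term, term)."""
    terms = sorted(extents)
    return {(s, t): jaccard(extents[s], extents[t]) for s in terms for t in terms}


def bow(doc):
    """Bag-of-words term counts for a tokenized document."""
    counts = {}
    for term in doc:
        counts[term] = counts.get(term, 0) + 1
    return counts


def bow_kernel(d1, d2):
    """Standard bag-of-words kernel: dot product of term-count vectors."""
    v1, v2 = bow(d1), bow(d2)
    return sum(c * v2.get(t, 0) for t, c in v1.items())


def semantic_kernel(d1, d2, sim):
    """k(d1, d2) = sum over term pairs of count * similarity * count."""
    v1, v2 = bow(d1), bow(d2)
    return sum(c1 * sim.get((s, t), 0.0) * c2
               for s, c1 in v1.items() for t, c2 in v2.items())


# Toy corpus of tokenized documents (illustrative data, not from the paper).
docs = [["svm", "kernel"], ["kernel", "lattice"],
        ["lattice", "concept"], ["svm", "margin"]]
sim = term_similarity(term_extents(docs))

print(bow_kernel(docs[0], docs[2]))            # no shared terms
print(semantic_kernel(docs[0], docs[2], sim))  # related terms still contribute
```

In this toy corpus the first and third documents share no terms, so the bag-of-words kernel treats them as unrelated, while the term-similarity kernel gives them a positive score because "kernel" and "lattice" co-occur in another document. A kernel matrix built this way could then be passed to any SVM implementation that accepts precomputed kernels.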

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carpineto, C., Michini, C., Nicolussi, R. (2009). A Concept Lattice-Based Kernel for SVM Text Classification. In: Ferré, S., Rudolph, S. (eds) Formal Concept Analysis. ICFCA 2009. Lecture Notes in Computer Science, vol 5548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01815-2_18

  • DOI: https://doi.org/10.1007/978-3-642-01815-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01814-5

  • Online ISBN: 978-3-642-01815-2

  • eBook Packages: Computer Science (R0)
