
A Sparse L2-Regularized Support Vector Machines for Large-Scale Natural Language Learning

  • Conference paper
Information Retrieval Technology (AIRS 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6458)


Abstract

Linear support vector machines (SVMs) have become one of the most prominent classification algorithms for many natural language learning problems, such as sequential labeling tasks. Although the L2-regularized SVM yields slightly better accuracy than the L1-SVM, it produces many near-zero but nonzero feature weights. In this paper, we present a cutting-weight algorithm that guides the optimization of the L2-SVM toward a sparse solution. To verify the proposed method, we conduct experiments on three well-known sequential labeling tasks and one dependency parsing task. The results show that our method achieves a feature-parameter reduction rate of at least 400% compared with the original L2-SVM, with almost no change in accuracy or training time. In terms of run-time efficiency, our method is at least 20% faster than the original L2-regularized SVM on all tasks.
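The abstract describes guiding the L2-SVM optimizer toward a sparse weight vector by cutting near-zero feature weights. The exact cutting-weight procedure is not reproduced on this page, so the sketch below is only a minimal illustration of the underlying idea, assuming a simple post-training magnitude threshold on a LIBLINEAR-style L2-regularized linear SVM; the synthetic dataset, the threshold value, and the post-hoc (rather than in-optimizer) application are all illustrative assumptions, not the authors' algorithm.

```python
# Minimal sketch: prune near-zero weights of a trained linear L2-SVM.
# NOTE: the paper applies its cutting-weight step during optimization;
# this post-hoc thresholding and the threshold value are assumptions
# made purely for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic data standing in for high-dimensional NLP feature vectors.
X, y = make_classification(n_samples=2000, n_features=500,
                           n_informative=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# L2-regularized linear SVM with squared hinge loss (LIBLINEAR-style).
clf = LinearSVC(penalty="l2", loss="squared_hinge", C=1.0, max_iter=5000)
clf.fit(X_tr, y_tr)

w = clf.coef_.ravel()
threshold = 1e-2                                  # hypothetical cutting threshold
w_cut = np.where(np.abs(w) < threshold, 0.0, w)   # zero out near-zero weights

def accuracy(weights):
    scores = X_te @ weights + clf.intercept_
    return np.mean((scores > 0).astype(int) == y_te)

print(f"nonzero weights: {np.count_nonzero(w)} -> {np.count_nonzero(w_cut)}")
print(f"test accuracy:   {accuracy(w):.4f} -> {accuracy(w_cut):.4f}")
```

A sparser weight vector of this kind reduces both model size and the number of multiplications per prediction, which is the source of the run-time savings the abstract reports.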





Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, YC., Lee, YS., Yang, JC., Yen, SJ. (2010). A Sparse L2-Regularized Support Vector Machines for Large-Scale Natural Language Learning. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_33

  • DOI: https://doi.org/10.1007/978-3-642-17187-1_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17186-4

  • Online ISBN: 978-3-642-17187-1

  • eBook Packages: Computer Science; Computer Science (R0)
