An Evaluation of Part of Speech Tagging on Written Second Language Spanish

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Abstract

With the increase in the number and size of computer learner corpora in the field of Second Language Acquisition, there is a growing need to automatically analyze the language produced by learners. However, the computational tools developed for natural language processing are generally not considered appropriate because they are designed to process native language. This paper analyzes the reliability of two part-of-speech taggers on second language Spanish and investigates the most frequent tagger errors and the impact of learner errors on the performance of the taggers.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Valverde Ibañez, M.P. (2011). An Evaluation of Part of Speech Tagging on Written Second Language Spanish. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_17

  • DOI: https://doi.org/10.1007/978-3-642-19400-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19399-6

  • Online ISBN: 978-3-642-19400-9

  • eBook Packages: Computer Science, Computer Science (R0)
