Incorporating Subject Areas into the Apertium Machine Translation System

  • Chapter
Computational Linguistics

Part of the book series: Studies in Computational Intelligence ((SCI,volume 458))

Abstract

The Universitat Oberta de Catalunya (Open University of Catalonia, UOC) is a public university based in Barcelona. The UOC is characterised by three main factors: (a) it is a virtual university based on an e-learning model, (b) it is located in a strongly bilingual Spanish-Catalan region, and (c) its students come from around the world, which makes linguistic and cultural diversity a crucial factor.

Within this context, it is essential to meet the UOC’s linguistic needs while taking its particular characteristics into account. One of the tools created to this end is an adaptation of Apertium, a free/open-source rule-based machine translation platform. The adapted system, available at http://apertium.uoc.edu/, is customised to the translation needs of the institution in order to offer the best possible service to its user community.

In order to continue adapting and adding value to the existing tool for generalisable large-scale applications, the UOC’s translation system has recently implemented a semantic filter based on subject fields, aimed at improving translation quality and at better fitting the university’s needs. The paper explains all the steps of this adaptive process and demonstrates the resulting tool: (a) the choice of the subject fields according to the university’s studies, (b) the design and implementation of the dictionaries used to extract the information required to filter and disambiguate homonymous and polysemous terms, including source code in the dictionaries, and (c) the design and implementation of the corresponding web interface.
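
To make the idea of a subject-field filter concrete, the following is a minimal sketch of how a subject field could steer the choice between competing translations of an ambiguous term. It is not the UOC implementation and does not use Apertium's dictionary format: the `BILINGUAL` table, its entries, the subject-field names, and the functions `select_translation` and `translate_words` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical bilingual entries (Spanish -> English), NOT taken from the
# UOC dictionaries: source lemma -> {subject field -> translation}.
# The "default" key holds the translation used when no subject field applies.
BILINGUAL: Dict[str, Dict[str, str]] = {
    "banco": {"default": "bench", "economics": "bank"},
    "red": {"default": "net", "computing": "network"},
    "ratón": {"default": "mouse"},
}


def select_translation(lemma: str, subject: Optional[str] = None) -> str:
    """Pick a translation for `lemma`, preferring the given subject field."""
    senses = BILINGUAL.get(lemma)
    if senses is None:
        # Unknown word: flag it with a leading asterisk
        # (a convention assumed here for illustration).
        return "*" + lemma
    if subject is not None and subject in senses:
        return senses[subject]
    # No subject given, or no sense registered for it: use the default.
    return senses["default"]


def translate_words(words: List[str], subject: Optional[str] = None) -> List[str]:
    """Translate a list of lemmas word by word under one subject field."""
    return [select_translation(word, subject) for word in words]


if __name__ == "__main__":
    print(translate_words(["banco", "red"]))               # general text
    print(translate_words(["banco", "red"], "economics"))  # economics text
```

In this sketch a term falls back to its default translation whenever the selected subject field has no dedicated sense, so choosing a subject field can only refine the output and never leaves a known word untranslated.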

Author information

Corresponding author

Correspondence to Jordi Duran.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Duran, J., Villarejo, L., Farrús, M., Ortiz, S., Ramírez, G. (2013). Incorporating Subject Areas into the Apertium Machine Translation System. In: Przepiórkowski, A., Piasecki, M., Jassem, K., Fuglewicz, P. (eds) Computational Linguistics. Studies in Computational Intelligence, vol 458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34399-5_15

  • DOI: https://doi.org/10.1007/978-3-642-34399-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34398-8

  • Online ISBN: 978-3-642-34399-5

  • eBook Packages: Engineering, Engineering (R0)
