A linguistically motivated taxonomy for Machine Translation error analysis

Machine Translation

Abstract

A detailed error analysis is a fundamental step in every natural language processing task, as diagnosing what went wrong provides cues for deciding which research directions to follow. In this paper we focus on error analysis in Machine Translation (MT). We significantly extend previous error taxonomies so that translation errors associated with the specificities of Romance languages can be accommodated. Furthermore, based on the proposed taxonomy, we carry out an extensive analysis of the errors generated by four different systems: two mainstream online translation systems, Google Translate (statistical) and Systran (hybrid machine translation), and two in-house MT systems, in three scenarios representing different challenges in translation from English to European Portuguese. Additionally, we comment on how distinct error types impact translation quality differently.
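
Such a taxonomy-based analysis ultimately reduces to counting annotated errors per category for each system and scenario. The following is a minimal, hypothetical sketch of such a tally; it is not the authors' tooling, and the annotation tuples and category labels below are invented for illustration only.

```python
# Hypothetical sketch: tallying annotated translation errors per MT system.
# The annotation pairs and category names are invented; they follow the spirit
# of the proposed taxonomy, not its exact labels.
from collections import Counter

annotations = [
    ("Google Translate", "lexis: omission"),
    ("Google Translate", "grammar: agreement"),
    ("Systran", "grammar: contraction"),
    ("Systran", "lexis: mistranslation"),
    ("In-house SMT", "orthography: capitalization"),
]

# Count error categories separately for each system.
counts = {}
for system, category in annotations:
    counts.setdefault(system, Counter())[category] += 1

for system, tally in sorted(counts.items()):
    print(system, dict(tally))
```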


Notes

  1. http://translate.google.com.

  2. http://www.systranet.com/translate.

  3. http://www.statmt.org/moses.

  4.  http://www.ted.com/.

  5. http://upmagazine-tap.com/.

  6. http://metanet4u.l2f.inesc-id.pt/.

  7. While not presenting a taxonomy per se, the work of Naskar et al. (2011) is worth noting, as it presents DELiC4MT, a tool for diagnostic evaluation of translation errors, available at http://www.computing.dcu.ie/~atoral/delic4mt, which focuses mainly on linguistic checkpoints for different part-of-speech categories.

  8. http://reverso.softissimo.com/en/reverso-promt-pro.

  9. http://amedida.ibit.org/comprendium.php.

  10. http://www.freetranslation.com.

  11. http://corpus.leeds.ac.uk/mellange/about_mellange.html.

  12. http://langsoft.cz/translatorA.html.

  13. http://ufal.mff.cuni.cz/tectomt.

  14. http://terra.cl.uzh.ch/terra-corpus-collection.html.

  15. https://wiki.ufal.ms.mff.cuni.cz/user:zeman:addicter.

  16. http://www.dfki.de/~mapo02/hjerson.

  17. http://www.issco.unige.ch:8080/cocoon/femti/st-home.html.

  18. Although in some languages capitalization errors can be more frequent, as for instance in German, where all nouns are written with an initial capital letter. This can be a problem for foreign learners whose mother tongue does not have this particularity.

  19. We should not confuse Omission errors of a function word with Misselection errors (contraction). In the first case, in the phrase na (em + a) Índia (in India), the article a is missing, so we have an Omission error. If, instead, we had a contraction problem, the sentence would read em a Índia, where both the preposition and the article were correctly selected but were not contracted as they should have been (em + a = na); a minimal code sketch illustrating this distinction is given after these notes.

  20. Translated on 22/10/2014.

  21. Tokenized using the default Moses tokenizer.

  22. http://www.wagsoft.com/CorpusTool.

  23. For instance, the following sentence is in the TAP corpus: If it’s Saturday, there’s a play at Teatro da Trindade called Havia um Menino que era Pessoa, where theatre-goers can discover the verses the poet wrote for his nephews and nieces.

  24. SVO is a sentence structure in which the subject comes first, the verb second, and the object third; languages may be classified according to the dominant order of these elements. SVO is one of the most common orders among the world’s languages.

  25. Version 1.4.
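
The distinction drawn in note 19 between a missing article (Omission) and an uncontracted preposition–article pair (Misselection, contraction) can be illustrated with a minimal sketch. This is not the annotation scheme's implementation; the rule table and function below are hypothetical and cover only the em + a = na case from the note.

```python
# Hypothetical sketch of the distinction in note 19: Omission of an article vs.
# a Misselection (contraction) error, using the single rule em + a = na.
CONTRACTIONS = {("em", "a"): "na"}  # Portuguese preposition + article -> contracted form

def diagnose(tokens, prep="em", art="a"):
    """Return a coarse error label for the tokens around an expected contraction."""
    contracted = CONTRACTIONS[(prep, art)]
    if contracted in tokens:
        return "correct"                      # e.g. "na Índia"
    if prep in tokens and art in tokens:
        return "misselection (contraction)"   # e.g. "em a Índia": both words present, not contracted
    if prep in tokens:
        return "omission (article)"           # e.g. "em Índia": the article a is missing
    return "other"

for hyp in (["na", "Índia"], ["em", "a", "Índia"], ["em", "Índia"]):
    print(" ".join(hyp), "->", diagnose(hyp))
```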

References

  • Batista F, Mamede N, Trancoso I (2007) A lightweight on-the-fly capitalization system for automatic speech recognition. In: Proceedings of the recent advances in natural language processing (RANLP’07), Borovets, Bulgaria

  • Bojar O (2011) Analysing error types in English-Czech machine translation. Prague Bull Math Linguist 95:63–76

  • Bojar O, Mareček D, Novák V, Popel M, Ptáček J, Rouš J, Žabokrtský Z (2009) English-Czech MT in 2008. In: Proceedings of the fourth workshop on statistical machine translation, Athens, Greece, pp 125–129

  • Callison-Burch C, Fordyce C, Koehn P, Monz C, Schroeder J (2007) (Meta-) evaluation of machine translation. In: Proceedings of the second workshop on statistical machine translation, Prague, Czech Republic, pp 136–158

  • Castagnoli S, Ciobanu D, Kunz K, Volanschi A, Kubler N (2007) Designing a learner translator corpus for training purposes. In: TALC7, proceedings of the 7th teaching and language corpora conference, Paris, France

  • Chiang D (2007) Hierarchical phrase-based translation. Comput Linguist 33(2):201–228

  • Condon SL, Parvaz D, Aberdeen JS, Doran C, Freeman A, Awad M (2010) Evaluation of machine translation errors in English and Iraqi Arabic. In: Proceedings of the seventh international conference on language resources and evaluation, Valletta, Malta, pp 159–168

  • Corder SP (1967) The significance of learner’s errors. Int Rev Appl Linguist 5(4):161–169

  • Costa A, Luís T, Coheur L (2014) Translation errors from English to Portuguese: an annotated corpus. In: Proceedings of the ninth international conference on language resources and evaluation (LREC’14), Reykjavik, Iceland, pp 1231–1234

  • Costa A, Luís T, Ribeiro J, Mendes AC, Coheur L (2012) An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT. In: Proceedings of the eighth international conference on language resources and evaluation (LREC’12), Istanbul, Turkey, pp 2172–2176

  • Cunha C, Cintra L (1987) Nova Gramática do Português Contemporâneo. Edições Sá da Costa, Lisboa

  • De Saussure F (1916) Cours de linguistique générale. Payot, Paris

  • Denkowski M, Lavie A (2014) Meteor universal: language specific translation evaluation for any target language. In: WMT 2014: proceedings of the ninth workshop on statistical machine translation, Baltimore, Maryland USA, pp 376–380

  • Dulay H, Burt MK, Krashen SD (1982) Language two. Oxford University Press, Oxford

  • Elliott D, Hartley A, Atwell E (2004) A fluency error categorization scheme to guide automated machine translation evaluation. In: Frederking RE, Taylor KB (eds) Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, vol 3265., Lecture Notes in Computer Science. Springer, Berlin, pp 64–73

  • Fishel M, Bojar O, Popović M (2012) Terra: a collection of translation error-annotated corpora. In: Proceedings of the eighth international conference on language resources and evaluation (LREC’12), Istanbul, Turkey, pp 7–14

  • James C (1998) Errors in language learning and use. Exploring error analysis, applied linguistics and language study. Routledge, New York

  • Keenan EL, Stabler EP (2010) Language variation and linguistic invariants. Lingua 120(12):2680–2685

  • Kirchhoff K, Rambow O, Habash N, Diab M (2007) Semi-automatic error analysis for large-scale statistical machine translation. In: MT Summit XI. Proceedings, Copenhagen, Denmark, pp 289–296

  • Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: MT Summit X, conference proceedings: the tenth machine translation summit, Phuket, Thailand, pp 79–86

  • Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Prague, Czech Republic, Association for Computational Linguistics, pp 177–180

  • Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174

  • Ling W, Luís T, Graça J, Coheur L, Trancoso I (2010) Towards a general and extensible phrase-extraction algorithm. In: IWSLT ’10: international workshop on spoken language translation, Paris, France, pp 313–320

  • Li X, Roth D (2002) Learning question classifiers. In: Proceedings of the 19th international conference on computational linguistics (COLING), Taipei, Taiwan, pp 556–562

  • Llitjós AF, Carbonell JG, Lavie A (2005) A framework for interactive and automatic refinement of transfer-based machine translation. In: 10th EAMT conference “Practical applications of machine translation”, Budapest, Hungary, pp 87–96

  • Naskar SK, Toral A, Gaspari F, Way A (2011) Framework for diagnostic evaluation of MT based on linguistic checkpoints. In: Proceedings of machine translation summit XIII, Xiamen, China, pp 529–536

  • Niessen S, Och FJ, Leusch G, Ney H (2000) An evaluation tool for machine translation: fast evaluation for MT research. In: LREC-2000: second international conference on language resources and evaluation. Proceedings, Athens, Greece, pp 39–45

  • Och FJ (2003) Minimum error rate training in statistical machine translation. In: ACL-2003: 41st annual meeting of the association for computational linguistics, Sapporo, Japan, pp 160–167

  • Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29:19–51

  • Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: ACL-2002: 40th annual meeting of the association for computational linguistics, Philadelphia, pp 311–318

  • Popović M, de Gispert A, Gupta D, Lambert P, Ney H, Marino JB, Federico M, Banchs R (2006) Morpho-syntactic information for automatic error analysis of statistical machine translation output. In: HLT-NAACL 2006: proceedings of the workshop on statistical machine translation, New York, NY, USA, pp 1–6

  • Popović M, Ney H (2006) Error analysis of verb inflections in Spanish translation output. In: TC-STAR workshop on speech-to-speech translation, Barcelona, Spain, pp 99–103

  • Popović M, Ney H (2011) Towards automatic error analysis of machine translation output. Comput Linguist 37(4):657–688

  • Richards J (1974) Error analysis. Perspectives on second language acquisition. Longman, London

  • Secară A (2005) Translation evaluation—a state of the art survey. eCoLoRe/MeLLANGE Workshop. Leeds, UK, pp 39–44

  • Vilar D, Xu J, D’Haro LF, Ney H (2006) Error analysis of machine translation output. In: LREC-2006: fifth international conference on language resources and evaluation. Proceedings, Genoa, Italy, pp 697–702

  • Zeman D, Fishel M, Berka J, Bojar O (2011) Addicter: what is wrong with my translations? Prague Bull Math Linguist 96:79–88

Acknowledgments

This work was partially supported by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project UID/CEC/50021/2013. Ângela Costa, Wang Ling and Rui Correia are supported by PhD fellowships from FCT (SFRH/BD/85737/2012, SFRH/BD/51157/2010, SFRH/BD/51156/2010).

Author information

Correspondence to Ângela Costa.

About this article

Cite this article

Costa, Â., Ling, W., Luís, T. et al. A linguistically motivated taxonomy for Machine Translation error analysis. Machine Translation 29, 127–161 (2015). https://doi.org/10.1007/s10590-015-9169-0
