Abstract
A detailed error analysis is a fundamental step in every natural language processing task: diagnosing what went wrong provides cues for deciding which research directions to follow. In this paper we focus on error analysis in Machine Translation (MT). We significantly extend previous error taxonomies so that translation errors associated with the specificities of Romance languages can be accommodated. Furthermore, based on the proposed taxonomy, we carry out an extensive analysis of the errors generated by four different systems: two mainstream online translation systems, Google Translate (statistical) and Systran (hybrid machine translation), and two in-house MT systems, in three scenarios representing different challenges in translation from English to European Portuguese. Additionally, we comment on how distinct error types impact translation quality differently.
Notes
While not presenting a taxonomy per se, the work of Naskar et al. (2011) is worth noting, as it presents DELiC4MT, a tool for diagnostic evaluation of translation errors, available at http://www.computing.dcu.ie/~atoral/delic4mt, focusing mainly on linguistic checkpoints for different part-of-speech categories.
Although in some languages they can be more frequent, as for instance in German, where all nouns are capitalized; this can be a problem for foreign learners whose mother tongue does not share this particularity.
We should not confuse Omission errors of a function word with Misselection (contraction) errors. In the phrase na Índia (in India), where na = em + a, if the article a is missing we have an Omission error. By contrast, if there were a contraction problem, the output would be em a Índia: both preposition and article were correctly selected, but they were not contracted as they should have been (em + a = na).
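The contraction rule above (em + a = na) is mechanical enough to sketch in code. The following is a minimal, hypothetical illustration, not part of the annotation tooling described in the paper; the contraction table is a small excerpt, and the function can only flag uncontracted preposition-article pairs (a Misselection cue), not a missing article, which would require comparison against a reference translation.

```python
# Hypothetical sketch: flagging uncontracted Portuguese
# preposition + article pairs in MT output.
# Excerpt of the contraction table; the full inventory is larger.
CONTRACTIONS = {
    ("em", "a"): "na",   # em + a  -> na
    ("em", "o"): "no",   # em + o  -> no
    ("de", "a"): "da",   # de + a  -> da
    ("de", "o"): "do",   # de + o  -> do
}

def flag_uncontracted(tokens):
    """Return (index, expected_form) for each adjacent token pair
    that should have been contracted (a Misselection cue)."""
    errors = []
    for i in range(len(tokens) - 1):
        pair = (tokens[i].lower(), tokens[i + 1].lower())
        if pair in CONTRACTIONS:
            errors.append((i, CONTRACTIONS[pair]))
    return errors

# "em a Índia" should have been "na Índia": a contraction error.
print(flag_uncontracted("Ele mora em a Índia".split()))  # [(2, 'na')]
```

An Omission error, by contrast, leaves no such adjacent pair in the output, which is why the two error types need distinct diagnostics.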
Translated on 22/10/2014.
Tokenized using the default Moses tokenizer.
For instance, the following sentence is in the TAP corpus: If it’s Saturday, there’s a play at Teatro da Trindade called Havia um Menino que era Pessoa, where theatre-goers can discover the verses the poet wrote for his nephews and nieces.
SVO is a sentence structure in which the subject comes first, the verb second, and the object third; languages may be classified according to the dominant order of these elements. SVO is one of the most common orders among the world's languages.
version 1.4
References
Batista F, Mamede N, Trancoso I (2007) A lightweight on-the-fly capitalization system for automatic speech recognition. In: Proceedings of the recent advances in natural language processing (RANLP’07), Borovets, Bulgaria
Bojar O (2011) Analysing error types in English-Czech machine translation. Prague Bull Math Linguist 95:63–76
Bojar O, Mareček D, Novák V, Popel M, Ptáček J, Rouš J, Žabokrtský Z (2009) English-Czech MT in 2008. In: Proceedings of the fourth workshop on statistical machine translation, Athens, Greece, pp 125–129
Callison-Burch C, Fordyce C, Koehn P, Monz C, Schroeder J (2007) (Meta-) evaluation of machine translation. In: Proceedings of the second workshop on statistical machine translation, Prague, Czech Republic, pp 136–158
Castagnoli S, Ciobanu D, Kunz K, Volanschi A, Kübler N (2007) Designing a learner translator corpus for training purposes. In: TALC7, proceedings of the 7th teaching and language corpora conference, Paris, France
Chiang D (2007) Hierarchical phrase-based translation. Comput Linguist 33(2):201–228
Condon SL, Parvaz D, Aberdeen JS, Doran C, Freeman A, Awad M (2010) Evaluation of machine translation errors in English and Iraqi Arabic. In: Proceedings of the seventh international conference on language resources and evaluation, Valletta, Malta, pp 159–168
Corder SP (1967) The significance of learner’s errors. Int Rev Appl Linguist 5(4):161–169
Costa A, Luís T, Coheur L (2014) Translation errors from English to Portuguese: an annotated corpus. In: Proceedings of the ninth international conference on language resources and evaluation (LREC’14), Reykjavik, Iceland, pp 1231–1234
Costa A, Luís T, Ribeiro J, Mendes AC, Coheur L (2012) An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT. In: Proceedings of the eighth international conference on language resources and evaluation (LREC’12), Istanbul, Turkey, pp 2172–2176
Cunha C, Cintra L (1987) Nova Gramática do Português Contemporâneo. Edições Sá da Costa, Lisboa
De Saussure F (1916) Cours de linguistique générale. Payot, Paris
Denkowski M, Lavie A (2014) Meteor universal: language specific translation evaluation for any target language. In: WMT 2014: proceedings of the ninth workshop on statistical machine translation, Baltimore, MD, USA, pp 376–380
Dulay H, Burt MK, Krashen SD (1982) Language two. Oxford University Press, Oxford
Elliott D, Hartley A, Atwell E (2004) A fluency error categorization scheme to guide automated machine translation evaluation. In: Frederking RE, Taylor KB (eds) Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, vol 3265., Lecture Notes in Computer Science. Springer, Berlin, pp 64–73
Fishel M, Bojar O, Popović M (2012) Terra: a collection of translation error-annotated corpora. In: Proceedings of the eighth international conference on language resources and evaluation (LREC’12), Istanbul, Turkey, pp 7–14
James C (1998) Errors in language learning and use. Exploring error analysis, applied linguistics and language study. Routledge, New York
Keenan EL, Stabler EP (2010) Language variation and linguistic invariants. Lingua 120(12):2680–2685
Kirchhoff K, Rambow O, Habash N, Diab M (2007) Semi-automatic error analysis for large-scale statistical machine translation. In: MT Summit XI. Proceedings, Copenhagen, Denmark, pp 289–296
Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: MT Summit X, conference proceedings: the tenth machine translation summit, Phuket, Thailand, pp 79–86
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Prague, Czech Republic, Association for Computational Linguistics, pp 177–180
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
Ling W, Luís T, Graça J, Coheur L, Trancoso I (2010) Towards a general and extensible phrase-extraction algorithm. In: IWSLT ’10: international workshop on spoken language translation, Paris, France, pp 313–320
Li X, Roth D (2002) Learning question classifiers. In: Proceedings of the 19th international conference on computational linguistics (COLING), Taipei, Taiwan, pp 556–562
Llitjós AF, Carbonell JG, Lavie A (2005) A framework for interactive and automatic refinement of transfer-based machine translation. In: 10th EAMT conference “Practical applications of machine translation”, Budapest, Hungary, pp 87–96
Naskar SK, Toral A, Gaspari F, Way A (2011) Framework for diagnostic evaluation of MT based on linguistic checkpoints. In: Proceedings of machine translation summit XIII, Xiamen, China, pp 529–536
Niessen S, Och FJ, Leusch G, Ney H (2000) An evaluation tool for machine translation: fast evaluation for MT research. In: LREC-2000: second international conference on language resources and evaluation. Proceedings, Athens, Greece, pp 39–45
Och FJ (2003) Minimum error rate training in statistical machine translation. In: ACL-2003: 41st annual meeting of the association for computational linguistics, Sapporo, Japan, pp 160–167
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29:19–51
Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: ACL-2002: 40th annual meeting of the association for computational linguistics, Philadelphia, pp 311–318
Popović M, de Gispert A, Gupta D, Lambert P, Ney H, Marino JB, Federico M, Banchs R (2006) Morpho-syntactic information for automatic error analysis of statistical machine translation output. In: HLT-NAACL 2006: proceedings of the workshop on statistical machine translation, New York, NY, USA, pp 1–6
Popović M, Ney H (2006) Error analysis of verb inflections in Spanish translation output. In: TC-STAR workshop on speech-to-speech translation, Barcelona, Spain, pp 99–103
Popović M, Ney H (2011) Towards automatic error analysis of machine translation output. Comput Linguist 37(4):657–688
Richards J (1974) Error analysis. Perspectives on second language acquisition. Longman, London
Secară A (2005) Translation evaluation—a state of the art survey. eCoLoRe/MeLLANGE Workshop. Leeds, UK, pp 39–44
Vilar D, Xu J, D’Haro LF, Ney H (2006) Error analysis of machine translation output. In: LREC-2006: fifth international conference on language resources and evaluation. Proceedings, Genoa, Italy, pp 697–702
Zeman D, Fishel M, Berka J, Bojar O (2011) Addicter: what is wrong with my translations? Prague Bull Math Linguist 96:79–88
Acknowledgments
This work was partially supported by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project UID/CEC/50021/2013. Ângela Costa, Wang Ling and Rui Correia are supported by PhD fellowships from FCT (SFRH/BD/85737/2012, SFRH/BD/51157/2010, SFRH/BD/51156/2010).
Cite this article
Costa, Â., Ling, W., Luís, T. et al. A linguistically motivated taxonomy for Machine Translation error analysis. Machine Translation 29, 127–161 (2015). https://doi.org/10.1007/s10590-015-9169-0