Abstract
Given the rich body of technical developments and the relatively long history of industrial use of Machine Translation (MT), it is astonishing how little attention the topic of MT quality has received so far. In this paper, we present three ways of performing MT quality evaluation drawn from our own research: (1) using TQ-AutoTest, a framework for semi-automatic testing and comparison of different translation engines; (2) applying the Multidimensional Quality Metrics (MQM) for analytical markup of translation errors; and (3) performing task-based user testing. We put these three approaches in perspective, as they serve different needs and different stakeholders' interests in translation quality assessment. This paper deals primarily with the translation of text. Still, we hope that the methods, insights, and observations we report transfer to broader applications of translation in the field of Media Accessibility.
Notes
Deliverables 2.2, 2.4, 2.8, and 2.11 providing details about Pilot 0 through Pilot 3 can be found at http://qtleap.eu/reports/.
Acknowledgements
Most of the work reported in this paper was developed in close cooperation with colleagues first and foremost from the consortia of the QTLaunchPad, QT21, and QTLeap projects. Special thanks (in alphabetical order) to Renlong Ai, Antonio Branco, Rosa Del Gaudio, Silvia Hansen-Schirra, Kim Harris, Wang He, Martin Popel, and Hans Uszkoreit. A big thanks also to the anonymous reviewers for their very helpful comments.
Cite this article
Burchardt, A., Lommel, A. & Macketanz, V. A new deal for translation quality. Univ Access Inf Soc 20, 701–715 (2021). https://doi.org/10.1007/s10209-020-00736-5