Exploiting Parts of Speech in Bangla-To-English Machine Translation Evaluation

Datta, Goutam; Joshi, Nisheeth; Gupta, Kusum

doi:10.1007/978-981-99-0601-7_5

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1011)

Included in the following conference series:

The International Conference on Recent Innovations in Computing

Abstract

Machine translation (MT) automatically converts text from one language into another. One of the major challenges in MT is evaluating system performance. Many automatic evaluation metrics are available, but their results are sometimes unreliable. In this paper, we address this issue by considering another evaluation strategy, syntactic evaluation, for Bangla-to-English translation. We discuss the problems of the automatic evaluation metric BLEU and how syntactic evaluation can help achieve higher accuracy. Our syntactic evaluation uses parts of speech (POS) when computing evaluation scores. We compare syntactic, human, and automatic evaluation on the low-resource English–Bangla language pair. A correlation analysis indicates that the syntactic evaluation score correlates more closely with the human evaluation score than the standard BLEU score does.
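The idea of scoring on parts of speech rather than surface words can be illustrated with a minimal sketch. The abstract does not give the paper's exact scoring formula, so the code below is an assumption: it applies BLEU-style clipped n-gram precision to POS tag sequences. The function names, the example sentences, and the hand-assigned Penn Treebank tags are all hypothetical and serve only as illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token (or tag) sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate: Sequence[str],
                      reference: Sequence[str],
                      n: int) -> float:
    """BLEU-style modified n-gram precision against a single reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference (clipping).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical example: the candidate differs lexically from the
# reference but has the same syntactic shape.
ref_words = ["he", "is", "reading", "a", "book"]
cand_words = ["he", "is", "studying", "the", "novel"]

# Hand-assigned Penn Treebank tags (an assumption; a real pipeline
# would obtain these from a POS tagger).
ref_tags = ["PRP", "VBZ", "VBG", "DT", "NN"]
cand_tags = ["PRP", "VBZ", "VBG", "DT", "NN"]

word_p1 = clipped_precision(cand_words, ref_words, 1)
pos_p1 = clipped_precision(cand_tags, ref_tags, 1)
pos_p2 = clipped_precision(cand_tags, ref_tags, 2)
```

In this toy case the word-level unigram precision is low because only "he" and "is" match, while the POS-level precision is perfect because the tag sequences coincide. That gap is the kind of signal a POS-based score might capture and a purely lexical score would miss; whether it tracks human judgments more closely is the empirical question the paper addresses.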

Author information

Authors and Affiliations

School of Mathematical and Computer Science, Banasthali Vidyapeeth, Banasthali, Rajasthan, India
Goutam Datta, Nisheeth Joshi & Kusum Gupta
Informatics, School of Computer Science and Engineering, University of Petroleum and Energy Studies, Dehradun, India
Goutam Datta

Authors

Goutam Datta
Nisheeth Joshi
Kusum Gupta

Corresponding author

Correspondence to Goutam Datta.

Editor information

Editors and Affiliations

Central University of Jammu, Jammu, Jammu and Kashmir, India
Yashwant Singh
Department of Media and Educational Informatics, Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary
Chaman Verma
Department of Media and Educational Informatics, Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary
Illés Zoltán
Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, Haryana, India
Jitender Kumar Chhabra
KIET Group of Institutions, Ghaziabad, Uttar Pradesh, India
Pradeep Kumar Singh

Cite this paper

Datta, G., Joshi, N., Gupta, K. (2023). Exploiting Parts of Speech in Bangla-To-English Machine Translation Evaluation. In: Singh, Y., Verma, C., Zoltán, I., Chhabra, J.K., Singh, P.K. (eds) Proceedings of International Conference on Recent Innovations in Computing. ICRIC 2022. Lecture Notes in Electrical Engineering, vol 1011. Springer, Singapore. https://doi.org/10.1007/978-981-99-0601-7_5

DOI: https://doi.org/10.1007/978-981-99-0601-7_5
Published: 17 May 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0600-0
Online ISBN: 978-981-99-0601-7
eBook Packages: Computer Science, Computer Science (R0)
