research-article

Machine Translation Testing via Syntactic Tree Pruning

Authors:
Quanjun Zhang (ORCID: 0000-0002-2495-3805), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Juan Zhai (ORCID: 0000-0001-5017-8016), Manning College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, USA
Chunrong Fang (ORCID: 0000-0002-9930-7111), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Jiawei Liu (ORCID: 0000-0002-4930-9637), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Weisong Sun (ORCID: 0000-0001-9236-8264), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Haichuan Hu (ORCID: 0009-0002-3007-488X), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Qingyu Wang (ORCID: 0009-0003-1693-4166), State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 5, Article No. 125, pp. 1–39. https://doi.org/10.1145/3640329

Published: 04 June 2024

Abstract

Machine translation systems are widely used in daily life, making it easier and more convenient. However, erroneous translations can have severe consequences, such as financial losses, so the accuracy and reliability of these systems must be improved. Testing machine translation systems is challenging because the underlying neural models are complex and intractable. To tackle these challenges, we propose syntactic tree pruning (STP), a novel metamorphic testing approach for validating machine translation systems. Our key insight is that a pruned sentence should preserve the crucial semantics of the original sentence. Specifically, STP (1) proposes a core semantics-preserving pruning strategy based on basic sentence structures and dependency relations at the level of the syntactic tree representation, (2) generates source sentence pairs based on the metamorphic relation, and (3) uses a bag-of-words model to report suspicious issues whose translations break the consistency property. We evaluate STP on two state-of-the-art machine translation systems (Google Translate and Bing Microsoft Translator) with 1,200 source sentences as inputs. The results show that STP finds 5,073 unique erroneous translations in Google Translate and 5,100 unique erroneous translations in Bing Microsoft Translator (400% more than state-of-the-art techniques), with 64.5% and 65.4% precision, respectively. The reported erroneous translations vary in type, and more than 90% of them are not found by state-of-the-art techniques. In total, 9,393 erroneous translations are unique to STP, which is 711.9% more than state-of-the-art techniques. Moreover, STP is effective at detecting translation errors for the original sentences, reaching a recall of 74.0% and improving on state-of-the-art techniques by 55.1% on average.
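The consistency check in step (3) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, and the threshold value are all assumptions made for illustration. The idea shown is only that words in the translation of a pruned sentence should also appear in the translation of the original sentence, and that a pair is flagged when too many of them do not.

```python
from collections import Counter


def bag_of_words(text):
    """Lower-cased whitespace tokens with counts.

    Whitespace splitting is a simplifying assumption; languages without
    whitespace word boundaries would need a proper tokenizer.
    """
    return Counter(text.lower().split())


def bow_distance(original_translation, pruned_translation):
    """Count tokens of the pruned translation absent from the original one."""
    # Counter subtraction keeps only positive counts, i.e. the surplus tokens.
    surplus = bag_of_words(pruned_translation) - bag_of_words(original_translation)
    return sum(surplus.values())


def is_suspicious(original_translation, pruned_translation, threshold=0):
    """Flag the pair when the distance exceeds a (hypothetical) threshold."""
    return bow_distance(original_translation, pruned_translation) > threshold


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    print(is_suspicious(original, "the fox jumps over the dog"))
    print(is_suspicious(original, "the fox sleeps under the dog"))
```

In this sketch the first pair is consistent, because every word of the pruned translation occurs in the original translation, while the second pair is flagged because two words have no counterpart. A practical tool would need to tune the threshold and the tokenization for the target language.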

Index Terms

Software and its engineering
  Software creation and management
    Software verification and validation
      Software defect analysis
        Software testing and debugging

Published in
ACM Transactions on Software Engineering and Methodology Volume 33, Issue 5
June 2024
952 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3618079
Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 4 June 2024
- Online AM: 10 January 2024
- Accepted: 31 December 2023
- Revised: 26 October 2023
- Received: 31 August 2022
Author Tags
Software testing
machine translation
metamorphic testing