One Type Context Is Not Enough: Global Context-aware Neural Machine Translation

Published: 12 November 2022

Abstract

Effectively modeling global context has been a critical challenge for document-level neural machine translation (NMT). Both preceding context and global context have been carefully explored within the sequence-to-sequence (seq2seq) framework. However, previous studies generally map the global context into a single vector, which is insufficient to represent the entire document well, since it largely ignores the hierarchy between sentences and the words within them. In this article, we propose to model the global context of the source language at both the sentence level and the word level. Specifically, at the sentence level we extract the global context that is useful for the current sentence, while at the word level we compute global context with respect to the words within the current sentence. On this basis, the two kinds of global context can be appropriately fused before being incorporated into the state-of-the-art seq2seq model, i.e., the Transformer. Detailed experimentation on various document-level translation tasks shows that global context at both the sentence level and the word level significantly improves translation performance. More encouragingly, the two kinds of global context are complementary, leading to further improvement when both are used.
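To make the two-granularity idea concrete, below is a minimal PyTorch sketch of one plausible reading of the fusion step: a sentence-level query and the word-level states each attend over encoded document states, and a sigmoid gate mixes the resulting context vectors. All module names, tensor shapes, and the gating mechanism are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch (assumed, not the paper's exact model): gated fusion of
# sentence-level and word-level global context for document-level NMT.
import torch
import torch.nn as nn


class GlobalContextFusion(nn.Module):
    """Fuse sentence-level and word-level global context with a learned gate."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        # Sentence level: one query summarizing the current sentence
        # attends over the encoded document.
        self.sent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Word level: every word of the current sentence attends over
        # the encoded document individually.
        self.word_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, sent_repr, word_repr, doc_repr):
        # sent_repr: (batch, 1, d_model)        current-sentence summary
        # word_repr: (batch, src_len, d_model)  current-sentence word states
        # doc_repr:  (batch, doc_len, d_model)  encoded document states
        sent_ctx, _ = self.sent_attn(sent_repr, doc_repr, doc_repr)
        word_ctx, _ = self.word_attn(word_repr, doc_repr, doc_repr)
        # Broadcast the sentence-level context to every source position,
        # then let a sigmoid gate decide how to mix the two granularities.
        sent_ctx = sent_ctx.expand_as(word_ctx)
        g = torch.sigmoid(self.gate(torch.cat([sent_ctx, word_ctx], dim=-1)))
        return g * sent_ctx + (1.0 - g) * word_ctx


# Toy usage with random tensors standing in for encoder outputs.
fusion = GlobalContextFusion(d_model=512)
sent = torch.randn(2, 1, 512)
words = torch.randn(2, 20, 512)
doc = torch.randn(2, 100, 512)
fused = fusion(sent, words, doc)  # (2, 20, 512), ready to combine with encoder states
```

The per-position sigmoid gate is one common fusion choice for combining complementary context signals; the paper's actual fusion and integration into the Transformer encoder may differ.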



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 6
  November 2022
  372 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3568970


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 November 2022
      • Online AM: 31 March 2022
      • Accepted: 12 March 2022
      • Revised: 7 March 2022
      • Received: 8 August 2021


      Qualifiers

      • research-article
      • Refereed
