Abstract
How to effectively model global context has been a critical challenge for document-level neural machine translation (NMT). Both the preceding context and the global context have been carefully explored within the sequence-to-sequence (seq2seq) framework. However, previous studies generally map the global context into a single vector, which is insufficient to represent the entire document since it largely ignores the hierarchy between the sentences and the words within them. In this article, we propose to model the source-language global context at both the sentence level and the word level. Specifically, at the sentence level we extract global context that is useful for the current sentence, while at the word level we compute global context against the individual words within the current sentence. On this basis, the two kinds of global context can be appropriately fused before being incorporated into the state-of-the-art seq2seq model, i.e., Transformer. Detailed experimentation on various document-level translation tasks shows that global context at both the sentence level and the word level significantly improves translation performance. More encouragingly, the two kinds of global context are complementary, leading to further improvement when both are used.
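The two levels of context described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the mean-pooled sentence embeddings, single-head scaled dot-product attention, and sigmoid-gated fusion below are illustrative assumptions standing in for the paper's learned components, and all shapes and parameter names (`sent_ctx`, `word_ctx`, `Wg`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention (Vaswani et al., 2017).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

d = 8
rng = np.random.default_rng(0)
cur = rng.normal(size=(5, d))            # current sentence: 5 word states
ctx_words = rng.normal(size=(3, 6, d))   # rest of document: 3 sentences x 6 words

# Sentence level: each current word attends over sentence summaries
# (here crudely obtained by mean-pooling each context sentence).
sent_repr = ctx_words.mean(axis=1)               # (3, d)
sent_ctx = attention(cur, sent_repr, sent_repr)  # (5, d)

# Word level: each current word attends over all context words directly.
flat = ctx_words.reshape(-1, d)                  # (18, d)
word_ctx = attention(cur, flat, flat)            # (5, d)

# Gated fusion of the two kinds of global context before they are fed
# into the translation model.
Wg = rng.normal(size=(2 * d, d))                 # hypothetical gate weights
g = 1.0 / (1.0 + np.exp(-(np.concatenate([sent_ctx, word_ctx], axis=-1) @ Wg)))
fused = g * sent_ctx + (1 - g) * word_ctx        # (5, d)
print(fused.shape)
```

The gate lets each word decide, dimension by dimension, how much sentence-level versus word-level context to keep, which is one simple way the two complementary signals could be combined.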
One Type Context Is Not Enough: Global Context-aware Neural Machine Translation