Knowledge enhanced zero-resource machine translation using image-pivoting

Abstract

Zero-resource machine translation refers to training translation models without any parallel corpora, a setting that can be addressed with the help of extra information such as images. However, ambiguity in the text, together with irrelevant content in the images, can lead to translation errors on key words. To alleviate the image-text alignment deviation caused by word ambiguity, we introduce knowledge entities as an extra modality for the source language, enhancing the representations of the source text and clarifying its semantics. Specifically, we use additional multi-modal information, including images and knowledge entities, as an auxiliary hint for the source text within a Transformer-based zero-resource translation framework. We also address the structural difference between the training and inference stages, handling the case where no visual information is available at inference time. The proposed method achieves state-of-the-art BLEU scores for zero-resource machine translation with the image as the pivot.
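To make the idea concrete, the following minimal PyTorch sketch shows one way knowledge entities could serve as an auxiliary hint for the source text; the module names, dimensions, and the single cross-attention fusion step are our own assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class KnowledgeEnhancedEncoder(nn.Module):
    """Hypothetical sketch: encode source tokens and fuse them with pretrained
    knowledge-entity embeddings (e.g., TransE vectors) via one cross-attention step."""

    def __init__(self, vocab_size, d_model=512, n_heads=8, n_layers=6, entity_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(layer, n_layers)
        self.entity_proj = nn.Linear(entity_dim, d_model)  # map entity vectors into model space
        self.fusion_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, src_tokens, entity_vecs):
        # src_tokens: (batch, src_len) token ids
        # entity_vecs: (batch, n_entities, entity_dim) knowledge-entity embeddings
        text = self.text_encoder(self.embed(src_tokens))    # (batch, src_len, d_model)
        ents = self.entity_proj(entity_vecs)                 # (batch, n_entities, d_model)
        # source tokens query the entities; a residual connection keeps the text signal
        hint, _ = self.fusion_attn(query=text, key=ents, value=ents)
        return self.norm(text + hint)

# Example usage (random inputs, illustrative only):
# enc = KnowledgeEnhancedEncoder(vocab_size=10000)
# out = enc(torch.randint(0, 10000, (2, 20)), torch.randn(2, 5, 100))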

Notes

  1. https://github.com/multi30k/dataset/blob/master/scripts/feature-extractor

  2. The source code is publicly available at https://github.com/OverFlow001/knwl-mmt.

  3. https://www.nltk.org/

  4. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl

References

  1. Ahmadnia B, Dorr BJ (2019) Augmenting neural machine translation through round-trip training approach. Open Computer Science 9(1):268–278

  2. Ahmadnia B, Serrano J, Haffari G (2017) Persian-Spanish low-resource statistical machine translation through English as pivot language. In: Proceedings of the international conference recent advances in natural language processing, RANLP 2017, pp 24–30

  3. Ahmadnia B, Dorr BJ, Kordjamshidi P (2020) Knowledge graphs effectiveness in neural machine translation improvement. Computer Science 21:299–318

  4. Bahdanau D, Cho KH, Bengio Y (2014) Neural machine translation by jointly learning to align and translate, pp 1–15. arXiv:1409.0473

  5. Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Advances in neural information processing systems, vol 26, pp 2787–2795

  6. Caglayan O, Madhyastha P, Specia L, Barrault L (2019) Probing the need for visual context in multimodal machine translation. In: Proceedings of the conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1, pp 4159–4170

  7. Chen S, Jin Q, Fu J (2019) From words to sentences: a progressive learning approach for zero-resource machine translation with visual pivots. In: International joint conference on artificial intelligence, pp 4932–4938

  8. Chen Y, Liu Y, Li VO (2018) Zero-resource neural machine translation with multi-agent communication game. In: 32nd AAAI conference on artificial intelligence, pp 5086–5093

  9. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the conference on empirical methods in natural language processing, pp 1724–1734

  10. Chowdhury KD, Hasanuzzaman M, Liu Q (2018) Multimodal neural machine translation for low-resource language pairs using synthetic data. In: Proceedings of the workshop on deep learning approaches for low-resource NLP, pp 33–42

  11. Elliott D, Frank S, Sima’an K, Specia L (2016) Multi30K: multilingual English-German image descriptions, pp 1–5. arXiv:1605.00459

  12. Grönroos SA, Huet B, Kurimo M, Laaksonen J, Merialdo B, Pham P, Sjöberg M, Sulubacak U, Tiedemann J, Troncy R, Vázquez R (2019) The MeMAD submission to the WMT18 multimodal translation task, pp 1–9. arXiv:1808.10802

  13. Grubinger M, Clough P, Müller H, Deselaers T (2006) The IAPR TC-12 benchmark: a new evaluation resource for visual information systems. In: OntoImage, workshop on language resources for content-based image retrieval during LREC, pp 13–23

  14. Guarasci R, Silvestri S, De Pietro G, Fujita H, Esposito M (2021) Assessing BERT’s ability to learn Italian syntax: a study on null-subject and agreement phenomena. Journal of Ambient Intelligence and Humanized Computing:1–15

  15. Guarasci R, Silvestri S, De Pietro G, Fujita H, Esposito M (2022) BERT syntactic transfer: a computational experiment on Italian, French and English languages. Computer Speech & Language 71:101261

  16. Han X, Cao S, Lv X, Lin Y, Liu Z, Sun M, Li J (2018) OpenKE: an open toolkit for knowledge embedding. In: Proceedings of the conference on empirical methods in natural language processing: system demonstrations, pp 139–144

  17. He D, Xia Y, Qin T, Wang L, Yu N, Liu TY, Ma WY (2016) Dual learning for machine translation. Advances in Neural Information Processing Systems 29

  18. Huang P, Sun S, Yang H (2020) Image-assisted transformer in zero-resource multi-modal translation. In: International conference on acoustics, speech and signal processing, pp 7548–7552

  19. Huang PY, Hu J, Chang X, Hauptmann A (2020) Unsupervised multimodal neural machine translation with pseudo visual pivoting. In: Association for computational linguistics (ACL), pp 8226–8237. https://doi.org/10.18653/v1/2020.acl-main.731

  20. Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R et al (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the association for computational linguistics companion volume, pp 177–180

  21. Lee J, Cho K, Weston J, Kiela D (2017) Emergent translation in multi-agent communication, pp 1–18. arXiv:1710.06922

  22. Lu Y, Zhang J, Zong C (2019) Exploiting knowledge graph in neural machine translation. In: Communications in computer and information science, vol 954, pp 27–38

  23. Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):39–41

  24. Moussallem D, Ngomo ACN, Buitelaar P, Arcan M (2019) Utilizing knowledge graphs for neural machine translation augmentation. In: Proceedings of the 10th international conference on knowledge capture, vol 19, pp 139–146

  25. Nakayama H, Nishida N (2017) Zero-resource machine translation by multimodal encoder-decoder network with multimedia pivot. Mach Transl 31(1-2):49–64

  26. Papineni K, Roukos S, Ward T, Zhu WJ (2001) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311–318

  27. Pota M, Ventura M, Fujita H, Esposito M (2021) Multilingual evaluation of pre-processing for BERT-based sentiment analysis of tweets. Expert Syst Appl 181:115119

  28. Sennrich R, Haddow B, Birch A (2016) Improving neural machine translation models with monolingual data. In: 54th annual meeting of the association for computational linguistics, vol 1, pp 86–96

  29. Sennrich R, Haddow B, Birch A (2016) Neural machine translation of rare words with subword units. In: 54th annual meeting of the association for computational linguistics, vol 3, pp 1715–1725

  30. Shi C, Liu S, Ren S, Feng S, Li M, Zhou M, Sun X, Wang H (2016) Knowledge-based semantic embedding for machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 2245–2254

  31. Song K, Tan X, Qin T, Lu J, Liu TY (2019) MASS: masked sequence to sequence pre-training for language generation. In: 36th international conference on machine learning, pp 10384–10394

  32. Su Y, Fan K, Bach N, Kuo CC, Huang F (2019) Unsupervised multi-modal neural machine translation. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 10474–10483

  33. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5999–6009

  34. Young P, Lai A, Hodosh M, Hockenmaier J (2014) From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics 2:67–78

Acknowledgements

This work was supported by the Shanghai Municipal Project 20511100900, the NSFC Projects 62076096 and 62006078, Shanghai Knowledge Service Platform Project ZF1213, STCSM Project 22ZR1421700 and the Fundamental Research Funds for the Central Universities.

Our work is an extension of the paper ‘Image-Assisted Transformer in Zero-Resource Multi-Modal Translation’. On this basis, we introduce knowledge entities to enhance the representations of the source text and clarify its semantics. We also introduce an imitation loss and module dropout to bridge the structural difference between the training and inference stages.
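As a rough, hypothetical illustration of these two mechanisms (not the paper's exact formulation), the sketch below randomly drops the image branch during training and, when it does so, applies an MSE imitation loss that pushes the text-only representation toward the multi-modal one; the fusion function, drop probability, and loss form are assumptions.

import torch
import torch.nn.functional as F

def fuse_with_module_dropout(text_repr, image_repr, fuse, p_drop=0.3, training=True):
    """text_repr: (B, L, D) encoded source text; image_repr: visual features;
    fuse: any function combining the two into a (B, L, D) multi-modal representation."""
    fused = fuse(text_repr, image_repr)
    if training and torch.rand(1).item() < p_drop:
        # module dropout: pretend the image is unavailable, as at inference time,
        # and push the text-only representation toward the multi-modal one
        imitation_loss = F.mse_loss(text_repr, fused.detach())
        return text_repr, imitation_loss
    return fused, text_repr.new_zeros(())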

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shiliang Sun.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, P., Zhao, J., Sun, S. et al. Knowledge enhanced zero-resource machine translation using image-pivoting. Appl Intell 53, 7484–7496 (2023). https://doi.org/10.1007/s10489-022-03997-0
