Knowledge enhanced zero-resource machine translation using image-pivoting

Abstract

Zero-resource machine translation refers to training translation models without any parallel corpora, a setting that can be addressed with the help of extra information such as images. However, ambiguity in the text, together with irrelevant content in the images, can lead to translation errors on key words. To alleviate the image-text alignment deviation caused by word ambiguity, we introduce knowledge entities as an extra modality for the source language, enhancing the representations of the source text and clarifying its semantics. Specifically, we use additional multi-modal information, including images and knowledge entities, as an auxiliary hint for the source text within a Transformer-based zero-resource translation framework. We also address the structural difference between the training and inference stages, handling the case where no visual information is available at inference time. The proposed method achieves state-of-the-art BLEU scores for zero-resource machine translation with the image as the pivot.
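To make the idea concrete, the following minimal PyTorch sketch shows one way knowledge entities could serve as an auxiliary hint for the source text; the module names, dimensions, and the single cross-attention fusion step are our own assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class KnowledgeEnhancedEncoder(nn.Module):
    """Hypothetical sketch: encode source tokens and fuse them with pretrained
    knowledge-entity embeddings (e.g., TransE vectors) via one cross-attention step."""

    def __init__(self, vocab_size, d_model=512, n_heads=8, n_layers=6, entity_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(layer, n_layers)
        self.entity_proj = nn.Linear(entity_dim, d_model)  # map entity vectors into model space
        self.fusion_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, src_tokens, entity_vecs):
        # src_tokens: (batch, src_len) token ids
        # entity_vecs: (batch, n_entities, entity_dim) knowledge-entity embeddings
        text = self.text_encoder(self.embed(src_tokens))    # (batch, src_len, d_model)
        ents = self.entity_proj(entity_vecs)                 # (batch, n_entities, d_model)
        # source tokens query the entities; a residual connection keeps the text signal
        hint, _ = self.fusion_attn(query=text, key=ents, value=ents)
        return self.norm(text + hint)

# Example usage (random inputs, illustrative only):
# enc = KnowledgeEnhancedEncoder(vocab_size=10000)
# out = enc(torch.randint(0, 10000, (2, 20)), torch.randn(2, 5, 100))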

Notes

  1. https://github.com/multi30k/dataset/blob/master/scripts/feature-extractor

  2. The source code is publicly available at https://github.com/OverFlow001/knwl-mmt.

  3. https://www.nltk.org/

  4. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl

References

  1. Ahmadnia B, Dorr BJ (2019) Augmenting neural machine translation through round-trip training approach. Open Computer Science 9(1):268–278

  2. Ahmadnia B, Serrano J, Haffari G (2017) Persian-Spanish low-resource statistical machine translation through English as pivot language. In: Proceedings of the international conference recent advances in natural language processing, RANLP 2017, pp 24–30

  3. Ahmadnia B, Dorr BJ, Kordjamshidi P (2020) Knowledge graphs effectiveness in neural machine translation improvement. Computer Science 21:299–318

  4. Bahdanau D, Cho KH, Bengio Y (2014) Neural machine translation by jointly learning to align and translate, pp 1–15. arXiv:1409.0473

  5. Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Advances in neural information processing systems, vol 26, pp 2787–2795

  6. Caglayan O, Madhyastha P, Specia L, Barrault L (2019) Probing the need for visual context in multimodal machine translation. In: Proceedings of the conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1, pp 4159–4170

  7. Chen S, Jin Q, Fu J (2019) From words to sentences: a progressive learning approach for zero-resource machine translation with visual pivots. In: International joint conference on artificial intelligence, pp 4932–4938

  8. Chen Y, Liu Y, Li VO (2018) Zero-resource neural machine translation with multi-agent communication game. In: 32nd AAAI conference on artificial intelligence, pp 5086–5093

  9. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the conference on empirical methods in natural language processing, pp 1724–1734

  10. Chowdhury KD, Hasanuzzaman M, Liu Q (2018) Multimodal neural machine translation for low-resource language pairs using synthetic data. In: Proceedings of the workshop on deep learning approaches for low-resource NLP, pp 33–42

  11. Elliott D, Frank S, Sima’an K, Specia L (2016) Multi30K: multilingual English-German image descriptions, pp 1–5. arXiv:1605.00459

  12. Grönroos SA, Huet B, Kurimo M, Laaksonen J, Merialdo B, Pham P, Sjöberg M, Sulubacak U, Tiedemann J, Troncy R, Vázquez R (2019) The MeMAD submission to the WMT18 multimodal translation task, pp 1–9. arXiv:1808.10802

  13. Grubinger M, Clough P, Müller H, Deselaers T (2006) The IAPR TC-12 benchmark: a new evaluation resource for visual information systems. In: OntoImage, workshop on language resources for content-based image retrieval during LREC, pp 13–23

  14. Guarasci R, Silvestri S, De Pietro G, Fujita H, Esposito M (2021) Assessing BERT’s ability to learn Italian syntax: a study on null-subject and agreement phenomena. Journal of Ambient Intelligence and Humanized Computing:1–15

  15. Guarasci R, Silvestri S, De Pietro G, Fujita H, Esposito M (2022) BERT syntactic transfer: a computational experiment on Italian, French and English languages. Computer Speech & Language 71:101261

  16. Han X, Cao S, Lv X, Lin Y, Liu Z, Sun M, Li J (2018) OpenKE: an open toolkit for knowledge embedding. In: Proceedings of the conference on empirical methods in natural language processing: system demonstrations, pp 139–144

  17. He D, Xia Y, Qin T, Wang L, Yu N, Liu TY, Ma WY (2016) Dual learning for machine translation. Advances in Neural Information Processing Systems 29

  18. Huang P, Sun S, Yang H (2020) Image-assisted transformer in zero-resource multi-modal translation. In: International conference on acoustics, speech and signal processing, pp 7548–7552

  19. Huang PY, Hu J, Chang X, Hauptmann A (2020) Unsupervised multimodal neural machine translation with pseudo visual pivoting. In: Association for computational linguistics (ACL), pp 8226–8237. https://doi.org/10.18653/v1/2020.acl-main.731

  20. Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R et al (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the association for computational linguistics companion volume, pp 177–180

  21. Lee J, Cho K, Weston J, Kiela D (2017) Emergent translation in multi-agent communication, pp 1–18. arXiv:1710.06922

  22. Lu Y, Zhang J, Zong C (2019) Exploiting knowledge graph in neural machine translation. In: Communications in computer and information science, vol 954, pp 27–38

  23. Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):39–41

  24. Moussallem D, Ngomo ACN, Buitelaar P, Arcan M (2019) Utilizing knowledge graphs for neural machine translation augmentation. In: Proceedings of the 10th international conference on knowledge capture, vol 19, pp 139–146

  25. Nakayama H, Nishida N (2017) Zero-resource machine translation by multimodal encoder-decoder network with multimedia pivot. Mach Transl 31(1-2):49–64

  26. Papineni K, Roukos S, Ward T, Zhu WJ (2001) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311–318

  27. Pota M, Ventura M, Fujita H, Esposito M (2021) Multilingual evaluation of pre-processing for BERT-based sentiment analysis of tweets. Expert Syst Appl 181:115119

  28. Sennrich R, Haddow B, Birch A (2016) Improving neural machine translation models with monolingual data. In: 54th annual meeting of the association for computational linguistics, vol 1, pp 86–96

  29. Sennrich R, Haddow B, Birch A (2016) Neural machine translation of rare words with subword units. In: 54th annual meeting of the association for computational linguistics, vol 3, pp 1715–1725

  30. Shi C, Liu S, Ren S, Feng S, Li M, Zhou M, Sun X, Wang H (2016) Knowledge-based semantic embedding for machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 2245–2254

  31. Song K, Tan X, Qin T, Lu J, Liu TY (2019) MASS: masked sequence to sequence pre-training for language generation. In: 36th international conference on machine learning, pp 10384–10394

  32. Su Y, Fan K, Bach N, Kuo CC, Huang F (2019) Unsupervised multi-modal neural machine translation. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 10474–10483

  33. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5999–6009

  34. Young P, Lai A, Hodosh M, Hockenmaier J (2014) From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics 2:67–78

Acknowledgements

This work was supported by the Shanghai Municipal Project 20511100900, the NSFC Projects 62076096 and 62006078, Shanghai Knowledge Service Platform Project ZF1213, STCSM Project 22ZR1421700 and the Fundamental Research Funds for the Central Universities.

Our work is an extension of the paper ‘Image-Assisted Transformer in Zero-Resource Multi-Modal Translation’. On this basis, we introduce knowledge entities to enhance the representations of the source text and clarify its semantics. We also introduce an imitation loss and module dropout to bridge the structural difference between the training and inference stages.
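As a rough, hypothetical illustration of these two mechanisms (not the paper's exact formulation), the sketch below randomly drops the image branch during training and, when it does so, applies an MSE imitation loss that pushes the text-only representation toward the multi-modal one; the fusion function, drop probability, and loss form are assumptions.

import torch
import torch.nn.functional as F

def fuse_with_module_dropout(text_repr, image_repr, fuse, p_drop=0.3, training=True):
    """text_repr: (B, L, D) encoded source text; image_repr: visual features;
    fuse: any function combining the two into a (B, L, D) multi-modal representation."""
    fused = fuse(text_repr, image_repr)
    if training and torch.rand(1).item() < p_drop:
        # module dropout: pretend the image is unavailable, as at inference time,
        # and push the text-only representation toward the multi-modal one
        imitation_loss = F.mse_loss(text_repr, fused.detach())
        return text_repr, imitation_loss
    return fused, text_repr.new_zeros(())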

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shiliang Sun.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, P., Zhao, J., Sun, S. et al. Knowledge enhanced zero-resource machine translation using image-pivoting. Appl Intell 53, 7484–7496 (2023). https://doi.org/10.1007/s10489-022-03997-0
