Abstract
In unsupervised representation learning, self-supervised learning has made great progress through its combination with contrastive learning. The core idea of contrastive self-supervised learning is to pull positive sample pairs closer together while pushing negative sample pairs apart. However, a drawback of such methods, MoCo among them, is that some positive samples can be misclassified as negative samples, which degrades the learning ability of the model. To address this problem, we propose two momentum contrast in triplet (TMCT) for unsupervised visual representation learning. The method maps the obtained representations to another space and uses the samples from a third network as the final learning target of the model. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed method. TMCT achieves a classification accuracy of 84.50% on CIFAR10, which is 2.47% higher than SimCLR.
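The MoCo-style mechanism the abstract builds on can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' TMCT implementation: it shows a momentum-updated (EMA) key encoder alongside the InfoNCE objective that pulls a query toward its positive key and away from negative keys. The linear encoders, dimensions, and hyperparameters (`tau`, `m`) are illustrative assumptions.

```python
import numpy as np

def momentum_update(w_q, w_k, m=0.999):
    """EMA update of the key encoder from the query encoder (MoCo-style):
    the key encoder evolves slowly, keeping its outputs consistent."""
    return m * w_k + (1.0 - m) * w_q

def info_nce(q, k_pos, k_negs, tau=0.07):
    """InfoNCE loss for a single query: maximize similarity with the
    positive key, minimize it with the negative keys."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    k_negs = k_negs / np.linalg.norm(k_negs, axis=1, keepdims=True)
    # logits: positive pair first, then all negatives, scaled by temperature
    logits = np.concatenate(([q @ k_pos], k_negs @ q)) / tau
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

rng = np.random.default_rng(0)
w_q = rng.standard_normal((8, 16))   # query encoder (here: a linear map)
w_k = w_q.copy()                     # key encoder starts as a copy

x, x_aug = rng.standard_normal(16), rng.standard_normal(16)  # two "views"
negs = rng.standard_normal((4, 16))                          # other samples

loss = info_nce(w_q @ x, w_k @ x_aug, (w_k @ negs.T).T)
w_k = momentum_update(w_q, w_k)      # slow-moving key encoder
print(float(loss) > 0.0)
```

The failure mode the abstract targets is visible here: any sample placed in `negs` is treated as a negative even if it is semantically close to `x`, which is the misclassification of positives that TMCT is designed to mitigate.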
Data Availability
The data are openly available in public repositories. The datasets analysed during the current study are available as follows: CIFAR10 at http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz, CIFAR100 at http://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz, and TinyImageNet at https://tiny-imagenet.herokuapp.com.
References
Bromley J, Guyon I, LeCun Y, Sackinger E, Shah R (1993) Signature verification using a “siamese” time delay neural network. In: Advances in neural information processing systems. pp 737–744
Cai T, Frankle J, Schwab DJ, Morcos AS (2021) Are all negatives created equal in contrastive instance discrimination? arXiv:2010.06682
Caron M, Bojanowski P, Joulin A, Douze M (2018) Deep clustering for unsupervised learning of visual features. In: European conference on computer vision. pp 1–30
Caron M, Misra I, Goyal P, Bojanowski P, Joulin A (2020) Unsupervised learning of visual features by contrasting cluster assignments. In: Advances in neural information processing systems. pp 1–23
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning. pp 1597–1607
Chen X, Fan H, Girshick R, He K (2020) Improved baselines with momentum contrastive learning. arXiv:2003.04297
Chen X, He K (2021) Exploring simple siamese representation learning. In: IEEE conference on computer vision and pattern recognition. pp 15745–15753
Doersch C, Gupta A, Efros AA (2015) Unsupervised visual representation learning by context prediction. In: IEEE international conference on computer vision. pp 1422–1430
Gidaris S, Singh P, Komodakis N (2018) Unsupervised representation learning by predicting image rotations. In: International conference on learning representations. pp 1–16
Grill J-B, Strub F, Altché F, Tallec C et al (2020) Bootstrap your own latent: A new approach to self-supervised learning. In: Advances in neural information processing systems. pp 21271–21284
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: IEEE conference on computer vision and pattern recognition. pp 9726–9735
Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: International conference on machine learning. pp 1–8
Jaiswal A, Babu AR, Zadeh MZ, Banerjee D, Makedon F (2021) A survey on contrastive self-supervised learning. arXiv:2011.00362
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. pp 1–60
Noroozi M, Favaro P (2016) Unsupervised learning of visual representations by solving jigsaw puzzles. In: European conference on computer vision, pp 69–84
Pathak D, Krahenbuhl P, Donahue J, Darrel T, Efros AA (2016) Context encoders: Feature learning by inpainting. In: IEEE conference on computer vision and pattern recognition. pp 2536–2544
Pouransari H, Ghili S (2015) Tiny imagenet visual recognition challenge
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition. pp 770–778
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision 115(3):211–252
Tian Y, Sun C, Poole B, Krishnan D, Schmid C, Isola P (2020) What makes for good views for contrastive learning. In: Advances in neural information processing systems. pp 1–24
Wang X, Zhang R, Shen C, Li L (2021) Dense contrastive learning for self-supervised visual pre-training. In: IEEE conference on computer vision and pattern recognition. pp 3024–3033
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: IEEE conference on computer vision and pattern recognition. pp 3733–3742
Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision. pp 649–666
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Number 619006098).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Long, X., Du, H. & Li, Y. Two momentum contrast in triplet for unsupervised visual representation learning. Multimed Tools Appl 83, 10467–10480 (2024). https://doi.org/10.1007/s11042-023-15998-3