Abstract
Similarity metric learning models the general semantic similarities and distances between objects and classes of objects (e.g. persons) in order to recognise them. Different strategies and models based on Deep Learning exist, and they generally consist of learning a non-linear projection into a lower-dimensional vector space where the semantic similarity between instances can be easily measured with a standard distance. As opposed to supervised learning, one does not train the model to predict the class labels, and the actual labels may not even be used, or may not be known in advance. Instead, machine learning-based similarity metric learning approaches operate in a weakly supervised way. That is, the training target (loss) is defined on the relationship between several instances, i.e. similar or different pairs, triplets or tuples. The learnt distance can then be applied, for example, to two new, unseen examples of unknown classes in order to determine whether they belong to the same class or whether they are similar. There exist numerous applications for metric learning, such as face or speaker verification, image retrieval, human activity recognition or person re-identification in images. In this chapter, an overview of the principal methods and models used for similarity metric learning with neural networks is given, describing the most common architectures, loss functions and training algorithms.
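To make the idea of a loss defined on pairs or triplets more concrete, the following is a minimal sketch in plain Python and not the chapter's own implementation. It assumes the embeddings have already been produced by some trained projection, uses the Euclidean distance as the "standard distance", and shows one common form each of a pairwise (contrastive) loss and a triplet loss. The function names, the margin and threshold values, and the 2-D example vectors are all illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Standard Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, similar, margin=1.0):
    """Pairwise loss (one common form, assumed here): similar pairs are
    pulled together, dissimilar pairs are pushed at least `margin` apart."""
    d = euclidean(u, v)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss (one common form, assumed here): the anchor should be
    closer to the positive than to the negative by at least `margin`."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def same_class(u, v, threshold=1.0):
    """Verification on two unseen examples: threshold the learnt distance.
    The threshold value is a hypothetical choice for illustration."""
    return euclidean(u, v) < threshold


# Hypothetical 2-D embeddings, as if output by an already trained network.
a = [0.0, 0.0]
p = [0.3, 0.4]  # close to a
n = [3.0, 4.0]  # far from a

print(contrastive_loss(a, p, similar=True))
print(contrastive_loss(a, n, similar=False))
print(triplet_loss(a, p, n))
print(same_class(a, p), same_class(a, n))
```

Note that no class labels appear anywhere in these losses; only the relationship (similar or different) between instances is used, which is what makes the training weakly supervised.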
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Duffner, S., Garcia, C., Idrissi, K., Baskurt, A. (2021). Similarity Metric Learning. In: Benois-Pineau, J., Zemmari, A. (eds) Multi-faceted Deep Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-74478-6_5
DOI: https://doi.org/10.1007/978-3-030-74478-6_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-74477-9
Online ISBN: 978-3-030-74478-6
eBook Packages: Computer Science, Computer Science (R0)