Similarity Metric Learning

Multi-faceted Deep Learning

Abstract

Similarity metric learning models the general semantic similarities and distances between objects and classes of objects (e.g. persons) in order to recognise them. Different strategies and models based on Deep Learning exist; they generally consist of learning a non-linear projection into a lower-dimensional vector space where the semantic similarity between instances can be easily measured with a standard distance. As opposed to supervised learning, one does not train the model to predict the class labels, and the actual labels may not even be used, or may not be known in advance. Instead, machine learning-based similarity metric learning approaches operate in a weakly supervised way. That is, the training target (loss) is defined on the relationship between several instances, i.e. similar or different pairs, triplets or tuples. The learnt distance can then be applied, for example, to two new, unseen examples of unknown classes in order to determine whether they belong to the same class or whether they are similar. Metric learning has numerous applications, such as face or speaker verification, image retrieval, human activity recognition and person re-identification in images. This chapter gives an overview of the principal methods and models used for similarity metric learning with neural networks, describing the most common architectures, loss functions and training algorithms.
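To make the idea of a loss defined on pairs or triplets concrete, the following is a minimal sketch in plain Python. The abstract does not name a specific loss, so the margin-based contrastive (pairwise) and triplet forms shown here are assumptions chosen as commonly used examples. The function names, the margin and the decision threshold are hypothetical, and the vectors are assumed to be embeddings already produced by some learned projection.

```python
import math


def euclidean(u, v):
    """Standard Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same, margin=1.0):
    """Pairwise loss (illustrative form, margin is an assumed hyperparameter).

    Similar pairs are penalised by their squared distance; dissimilar
    pairs are penalised only while they are closer than `margin`.
    """
    d = euclidean(u, v)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss (illustrative form).

    Zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def same_class(u, v, threshold=0.75):
    """Verification on unseen examples: compare the distance to an
    assumed threshold instead of predicting a class label."""
    return euclidean(u, v) < threshold


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
positive = [0.3, 0.4]       # close to the anchor
easy_negative = [3.0, 4.0]  # already far away
hard_negative = [0.6, 0.8]  # closer than the margin requires

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
print(same_class(anchor, positive), same_class(anchor, easy_negative))
```

Note that neither loss refers to class labels directly: each only needs to know whether two instances are similar or different, which is the weak supervision described above. In a real system the embeddings would come from a neural network and the loss would be minimised by gradient descent.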

Author information

Corresponding author

Correspondence to Stefan Duffner.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Duffner, S., Garcia, C., Idrissi, K., Baskurt, A. (2021). Similarity Metric Learning. In: Benois-Pineau, J., Zemmari, A. (eds) Multi-faceted Deep Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-74478-6_5

  • DOI: https://doi.org/10.1007/978-3-030-74478-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-74477-9

  • Online ISBN: 978-3-030-74478-6

  • eBook Packages: Computer Science (R0)
