
Transferable Deep Metric Learning for Clustering

  • Conference paper
In: Advances in Intelligent Data Analysis XXI (IDA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13876)

Abstract

Clustering in high-dimensional spaces is a difficult task: under the curse of dimensionality, the usual distance metrics may no longer be appropriate. The choice of metric is therefore crucial, and it depends strongly on the characteristics of the dataset. However, a single metric could be used to correctly perform clustering on multiple datasets from different domains. We propose to do so, providing a framework for learning a transferable metric. We show that we can learn a metric on a labelled dataset, then apply it to cluster a different dataset, using an embedding space that characterises a desired clustering in the generic sense. We learn and test such metrics on several datasets of varying complexity (synthetic, MNIST, SVHN, Omniglot) and achieve results competitive with the state of the art, while using only a small number of labelled training datasets and shallow networks.
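The paper learns a deep embedding; as a minimal, hypothetical sketch of the transfer idea only (not the authors' method), the following NumPy example learns a simple diagonal metric on a labelled source dataset, by weighting each dimension with a Fisher-style ratio of between-class to within-class variance, then reuses that metric to cluster a different, unlabelled target dataset with k-means. All dataset shapes and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source domain (labelled): class separation lives in dimension 0;
# dimension 1 is pure noise.
n = 200
y_src = rng.integers(0, 2, n)
X_src = np.column_stack([
    y_src * 4.0 + rng.normal(0, 0.5, n),   # informative dimension
    rng.normal(0, 5.0, n),                 # noisy dimension
])

# "Learn" a diagonal metric: weight each dimension by a Fisher-style
# ratio of between-class to within-class variance.
mu0, mu1 = X_src[y_src == 0].mean(0), X_src[y_src == 1].mean(0)
within = 0.5 * (X_src[y_src == 0].var(0) + X_src[y_src == 1].var(0))
w = (mu1 - mu0) ** 2 / within  # large where classes separate well

# Target domain: a different dataset with the same kind of structure,
# clustered without labels.
m = 200
y_tgt = rng.integers(0, 2, m)
X_tgt = np.column_stack([
    y_tgt * 4.0 - 2.0 + rng.normal(0, 0.5, m),
    rng.normal(0, 5.0, m),
])

def kmeans(X, k=2, iters=50, seed=1):
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

def accuracy(labels, y):
    # Cluster indices are arbitrary: take the better of the two matchings.
    a = (labels == y).mean()
    return max(a, 1 - a)

plain = accuracy(kmeans(X_tgt), y_tgt)                     # raw metric
transferred = accuracy(kmeans(X_tgt * np.sqrt(w)), y_tgt)  # learned metric
print(f"plain k-means: {plain:.2f}, transferred metric: {transferred:.2f}")
```

With the raw metric, the noisy dimension dominates the distances and k-means splits along it; rescaling by the metric learned on the source dataset recovers the intended clustering on the target dataset.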


Notes

  1. As a reminder, let T and U be two topological spaces. A function \(f:T\rightarrow U\) is continuous, in the open-set definition, if for every \(t\in T\) and every open set \(u\) containing \(f(t)\), there exists a neighbourhood \(v\) of \(t\) such that \(f(v)\subset u\).
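Equivalently (a standard fact of point-set topology, not specific to this paper), continuity can be stated in terms of preimages:

```latex
f:T\rightarrow U \text{ is continuous}
\iff
f^{-1}(u) \text{ is open in } T \text{ for every open set } u \subseteq U.
```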


Acknowledgements

We gratefully acknowledge Orianne Debeaupuis for making the figure. We also acknowledge computing support from NVIDIA. This work was supported by funds from the French Program “Investissements d’Avenir”.

Author information

Corresponding author: Mohamed Alami Chehboune.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Alami Chehboune, M., Kaddah, R., Read, J. (2023). Transferable Deep Metric Learning for Clustering. In: Crémilleux, B., Hess, S., Nijssen, S. (eds) Advances in Intelligent Data Analysis XXI. IDA 2023. Lecture Notes in Computer Science, vol 13876. Springer, Cham. https://doi.org/10.1007/978-3-031-30047-9_2


  • DOI: https://doi.org/10.1007/978-3-031-30047-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30046-2

  • Online ISBN: 978-3-031-30047-9

  • eBook Packages: Computer Science (R0)
