Abstract
Clustering in high-dimensional spaces is a difficult task: under the curse of dimensionality, the usual distance metrics may no longer be appropriate. The choice of metric is therefore crucial, and it depends heavily on the characteristics of the dataset. Nevertheless, a single metric could be used to correctly cluster multiple datasets from different domains. We propose to do exactly this, providing a framework for learning a transferable metric. We show that a metric can be learned on a labelled dataset and then applied to cluster a different dataset, using an embedding space that characterises a desired clustering in a generic sense. We learn and test such metrics on several datasets of varying complexity (synthetic, MNIST, SVHN, Omniglot) and achieve results competitive with the state of the art while using only a small number of labelled training datasets and shallow networks.
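The pipeline described above (fit an embedding on a labelled source dataset, then cluster an unlabelled target dataset in that embedding space) can be illustrated with a minimal NumPy sketch. This is not the paper's method: the learned shallow network is replaced here by a simple per-dimension discriminative scaling, and the clustering step is a plain 2-means with a farthest-point initialisation; all names and data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Source (labelled) dataset: two Gaussian classes in 2-D ---
Xa = rng.normal([0.0, 0.0], 0.3, size=(50, 2))
Xb = rng.normal([3.0, 3.0], 0.3, size=(50, 2))
X_src = np.vstack([Xa, Xb])
y_src = np.array([0] * 50 + [1] * 50)

# "Learn" a metric on the labelled data: a diagonal scaling that
# stretches each dimension in proportion to its between-class
# separation (a crude stand-in for the embedding network).
mu0 = X_src[y_src == 0].mean(axis=0)
mu1 = X_src[y_src == 1].mean(axis=0)
w = np.abs(mu1 - mu0) / X_src.std(axis=0)

def embed(X):
    """Apply the learned (diagonal) metric as an embedding."""
    return X * w

# --- Target dataset from a different "domain": shifted clusters ---
Xc = rng.normal([10.0, -1.0], 0.3, size=(40, 2))
Xd = rng.normal([13.0, 2.0], 0.3, size=(40, 2))
X_tgt = np.vstack([Xc, Xd])

# Cluster the target set in the embedding space with 2-means
# (Lloyd's algorithm, farthest-point initialisation).
Z = embed(X_tgt)
c0 = Z[0]
c1 = Z[np.argmax(np.linalg.norm(Z - c0, axis=1))]
centroids = np.stack([c0, c1])
for _ in range(20):
    d = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    centroids = np.array([Z[labels == k].mean(axis=0) for k in range(2)])

# Each true group of 40 target points should receive a single label.
print(np.bincount(labels))
```

The transfer happens in `embed`: it was fitted only on the labelled source data, yet the target clusters are recovered because the embedding preserves the kind of separation the source labels encode.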
Notes
- 1.
As a reminder, let \(T\) and \(U\) be two topological spaces. A function \(f: T \to U\) is continuous, in the open-set definition, if for every \(t\in T\) and every open set \(u\) containing \(f(t)\), there exists a neighbourhood \(v\) of \(t\) such that \(f(v)\subset u\).
Acknowledgements
We gratefully acknowledge Orianne Debeaupuis for making the figure. We also acknowledge computing support from NVIDIA. This work was supported by funds from the French Program “Investissements d’Avenir”.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Alami Chehboune, M., Kaddah, R., Read, J. (2023). Transferable Deep Metric Learning for Clustering. In: Crémilleux, B., Hess, S., Nijssen, S. (eds) Advances in Intelligent Data Analysis XXI. IDA 2023. Lecture Notes in Computer Science, vol 13876. Springer, Cham. https://doi.org/10.1007/978-3-031-30047-9_2
Print ISBN: 978-3-031-30046-2
Online ISBN: 978-3-031-30047-9