
Evaluating the Performance of FedCLUS Algorithm Using FedCI: A New Federated Cluster Validity Metric

  • Original Research
  • Published in SN Computer Science

Abstract

Federated learning is a recent trend in machine learning for building a collaborative model from distributed data while preserving privacy. The existing literature focuses on supervised federated learning algorithms that require labeled data, whereas only a few solutions have been proposed for identifying patterns in distributed unlabeled data using federated clustering methods. Moreover, the problem of measuring the goodness of clusters remains open, because existing cluster validity indices cannot be applied in federated learning, where the entire data is never available at one site. To fill this research gap, this paper proposes a new metric, FedCI, for measuring the performance of federated clustering methods. The rationale for FedCI is discussed, and the metric is validated by comparing it with the DB index and the Silhouette score; the behavior of FedCI is found to be consistent with these existing metrics. Further, FedCI is applied to FedCLUS, a recently proposed federated clustering method. FedCLUS has distinctive characteristics: it identifies arbitrarily shaped clusters; it can merge, split, and discard clusters reported by data owners; and it is communication-cost effective. The performance of FedCLUS is compared with centralized DBSCAN using FedCI on various datasets, and the results indicate that FedCLUS performs close to the centralized DBSCAN algorithm. FedCI is expected to guide the search for better clusters in federated settings.
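The abstract's validation strategy compares FedCI against centralized cluster validity indices. FedCI itself is defined in the full text; as a hedged illustration of the centralized baselines it is compared with, the sketch below runs DBSCAN on synthetic data and scores the result with the DB index and Silhouette score. The dataset, `eps`, and `min_samples` values are illustrative assumptions, not taken from the paper.

```python
# Sketch of the centralized baselines (DB index, Silhouette score) that
# FedCI is validated against. Uses standard scikit-learn APIs; the data
# and DBSCAN parameters are illustrative only.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Synthetic data standing in for the benchmark datasets used in the paper.
X, _ = make_blobs(n_samples=500, centers=3, cluster_std=0.6, random_state=42)

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# Both indices need at least two clusters; DBSCAN labels noise as -1,
# so noise points are excluded before scoring.
mask = labels != -1
db = davies_bouldin_score(X[mask], labels[mask])   # lower is better
sil = silhouette_score(X[mask], labels[mask])      # higher is better, in [-1, 1]
print(f"DB index: {db:.3f}, Silhouette: {sil:.3f}")
```

In a federated setting these scores cannot be computed as written, since no party holds the full matrix `X`; this is precisely the gap FedCI addresses.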

Figures 1–5 and Algorithms 1–2 appear in the full text (access required).


Author information

Corresponding author: Shachi Sharma.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Soft Computing in Engineering Applications” guest edited by Kanubhai K. Patel.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sharma, S., Gupta, S. Evaluating the Performance of FedCLUS Algorithm Using FedCI: A New Federated Cluster Validity Metric. SN COMPUT. SCI. 5, 332 (2024). https://doi.org/10.1007/s42979-024-02663-1

