Parallelization of the self-organized maps algorithm for federated learning on distributed sources

Kholod, Ivan; Rukavitsyn, Andrey; Paznikov, Alexey; Gorlatch, Sergei

doi:10.1007/s11227-020-03509-2

Published: 25 November 2020

Volume 77, pages 6197–6213 (2021)

The Journal of Supercomputing

Ivan Kholod ORCID: orcid.org/0000-0002-7255-5035¹,
Andrey Rukavitsyn¹,
Alexey Paznikov¹ &
Sergei Gorlatch²

Abstract

This paper describes a formally based approach to parallelizing the Kohonen algorithm, which is used for federated learning in a special kind of neural network, the Self-Organizing Map (SOM). Our approach enables executing the parallel version of the algorithm on distributed data sources, taking into account how the data are distributed across the nodes. In contrast to traditional approaches, we distinguish two kinds of data distribution, horizontal and vertical. For both, our approach avoids gathering the data in a single storage and instead moves the computations closer to the data source nodes. This reduces the execution time of the algorithm, the network traffic, and the risk of unauthorized access to the data during transmission. Our experimental evaluation demonstrates the advantages of the approach.
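
To make the idea of moving computation to the data nodes concrete, the following is a minimal sketch and not the authors' implementation. It assumes the batch variant of the Kohonen update, a one-dimensional map with a Gaussian neighborhood, and a horizontal distribution (each node holds complete records for a subset of the samples). The function names `bmu`, `local_sums`, and `federated_epoch` are hypothetical and chosen only for illustration. Each node returns per-neuron sums rather than raw records, and the coordinator merges them.

```python
import math

def bmu(weights, x):
    """Index of the best-matching unit: the neuron whose weights are closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def local_sums(weights, data, sigma):
    """Runs on one data node (assumed horizontal partition).

    Returns, per neuron, the neighborhood-weighted sum of samples (num)
    and the sum of neighborhood values (den). No raw sample leaves the node.
    """
    n, dim = len(weights), len(weights[0])
    num = [[0.0] * dim for _ in range(n)]
    den = [0.0] * n
    for x in data:
        c = bmu(weights, x)
        for j in range(n):
            # Gaussian neighborhood on a 1-D map: neurons are indexed along a line.
            h = math.exp(-((j - c) ** 2) / (2.0 * sigma ** 2))
            den[j] += h
            for d in range(dim):
                num[j][d] += h * x[d]
    return num, den

def federated_epoch(weights, partitions, sigma):
    """Runs on the coordinator: merge the partial sums and update the weights."""
    n, dim = len(weights), len(weights[0])
    num = [[0.0] * dim for _ in range(n)]
    den = [0.0] * n
    for data in partitions:  # in a real deployment, each call would run on its own node
        p_num, p_den = local_sums(weights, data, sigma)
        for j in range(n):
            den[j] += p_den[j]
            for d in range(dim):
                num[j][d] += p_num[j][d]
    # Batch update: each neuron moves to the weighted mean of the samples.
    return [[num[j][d] / den[j] for d in range(dim)] if den[j] > 0.0
            else list(weights[j]) for j in range(n)]

if __name__ == "__main__":
    w0 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]   # three neurons, two attributes
    node_a = [[0.1, 0.2], [0.9, 1.1]]           # samples held by the first node
    node_b = [[2.1, 1.9], [1.8, 2.2]]           # samples held by the second node
    print(federated_epoch(w0, [node_a, node_b], sigma=1.0))
```

Because the batch update only needs sums over samples, merging the per-node sums gives the same weights as one pass over the pooled data, up to floating-point rounding. Only `n * (dim + 1)` numbers per node cross the network in each epoch. The vertical case, where nodes hold different attributes of the same samples, is not covered by this sketch.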

Acknowledgements

We are grateful to the anonymous reviewers, whose very helpful comments allowed us to significantly improve this paper. This work was supported by the German Ministry of Education and Research (BMBF) in the framework of project HPC2SE at the University of Muenster.

Author information

Authors and Affiliations

Saint Petersburg Electrotechnical University "LETI", Saint Petersburg, Russia
Ivan Kholod, Andrey Rukavitsyn & Alexey Paznikov
University of Muenster, Muenster, Germany
Sergei Gorlatch

Corresponding author

Correspondence to Ivan Kholod.

About this article

Cite this article

Kholod, I., Rukavitsyn, A., Paznikov, A. et al. Parallelization of the self-organized maps algorithm for federated learning on distributed sources. J Supercomput 77, 6197–6213 (2021). https://doi.org/10.1007/s11227-020-03509-2

Accepted: 02 November 2020
Published: 25 November 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11227-020-03509-2
