Towards a Metadata Management System for Provenance, Reproducibility and Accountability in Federated Machine Learning

Peregrina, José A.; Ortiz, Guadalupe; Zirpins, Christian

doi:10.1007/978-3-031-23298-5_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1617)

Included in the following conference series:

European Conference on Service-Oriented and Cloud Computing

Abstract

Applying Data Governance (DG) to Federated Machine Learning (FML) could help produce better Machine Learning models. Nevertheless, such an application is still almost nonexistent in the literature. As part of a proposal for applying DG to FML, we first present a metadata approach for FML that provides accountability and assists with the continuous improvement of models in the federation. Our proposal includes a metadata model for tracing the operations of participants and collecting all information regarding the definition of goals and the configuration of FML training processes. Additionally, we outline a metadata management system as part of a broader DG architecture. Finally, we show some use cases of metadata management.
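To make the idea of a metadata model more concrete, the following is a minimal illustrative sketch in Python. It is not the model proposed in the paper: the class names (`TrainingConfig`, `Operation`, `MetadataStore`), their fields, and the `trace` query are all hypothetical, chosen only to show how goals, configuration, and participant operations might be recorded together so that a participant's actions can later be traced.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical record of the goal and configuration of one FML training process."""
    process_id: str
    goal: str
    rounds: int
    hyperparameters: dict


@dataclass(frozen=True)
class Operation:
    """Hypothetical trace entry: one action performed by one participant."""
    process_id: str
    participant: str
    action: str
    round_number: int
    timestamp: str


@dataclass
class MetadataStore:
    """Hypothetical in-memory store; a real system would persist these records."""
    configs: dict = field(default_factory=dict)
    operations: list = field(default_factory=list)

    def register_process(self, config: TrainingConfig) -> None:
        # Keep the agreed goal and configuration under the process identifier.
        self.configs[config.process_id] = config

    def record(self, process_id: str, participant: str,
               action: str, round_number: int) -> Operation:
        # Refuse operations that refer to a training process nobody registered.
        if process_id not in self.configs:
            raise KeyError(f"unknown training process: {process_id}")
        op = Operation(process_id, participant, action, round_number,
                       datetime.now(timezone.utc).isoformat())
        self.operations.append(op)
        return op

    def trace(self, participant: str) -> list:
        # Return, in recording order, every operation logged for one participant.
        return [op for op in self.operations if op.participant == participant]


store = MetadataStore()
store.register_process(
    TrainingConfig("p1", "illustrative goal", rounds=3, hyperparameters={"lr": 0.1})
)
store.record("p1", "participant-a", "local_update", 1)
store.record("p1", "participant-b", "local_update", 1)
store.record("p1", "participant-a", "local_update", 2)
```

Calling `store.trace("participant-a")` then returns the two operations recorded for that participant, each linked through `process_id` to the configuration under which it ran. Keeping configuration and operations in separate records is one possible design choice for this kind of lookup, not something the abstract prescribes.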

Acknowledgment

Funded by the German Federal Ministry of Education and Research. Project name: KIWI, RefNr: 16KIS1142K.

Author information

Authors and Affiliations

Computer Science and Engineering Department, University of Cádiz, Av. Universidad de Cádiz, 10, 11519, Puerto Real, Cádiz, Spain
José A. Peregrina & Guadalupe Ortiz
Faculty of Computer Science and Business Information Systems, Karlsruhe University of Applied Sciences, Moltkestr. 30, 76133, Karlsruhe, Germany
José A. Peregrina & Christian Zirpins

Authors

José A. Peregrina
Guadalupe Ortiz
Christian Zirpins

Corresponding author

Correspondence to José A. Peregrina.

Editor information

Editors and Affiliations

Karlsruhe University of Applied Sciences, Karlsruhe, Germany
Christian Zirpins
University of Cádiz, Cádiz, Spain
Guadalupe Ortiz
Karlsruhe University of Applied Sciences, Karlsruhe, Germany
Zoltan Nochta
Karlsruhe University of Applied Sciences, Karlsruhe, Germany
Oliver Waldhorst
University of Pisa, Pisa, Italy
Jacopo Soldani
University of Messina, Messina, Italy
Massimo Villari
TU/e – JADS, Eindhoven, The Netherlands
Damian Tamburri

About this paper

Cite this paper

Peregrina, J.A., Ortiz, G., Zirpins, C. (2022). Towards a Metadata Management System for Provenance, Reproducibility and Accountability in Federated Machine Learning. In: Zirpins, C., et al. Advances in Service-Oriented and Cloud Computing. ESOCC 2022. Communications in Computer and Information Science, vol 1617. Springer, Cham. https://doi.org/10.1007/978-3-031-23298-5_1

DOI: https://doi.org/10.1007/978-3-031-23298-5_1
Published: 01 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23297-8
Online ISBN: 978-3-031-23298-5
eBook Packages: Computer Science, Computer Science (R0)