Towards a Metadata Management System for Provenance, Reproducibility and Accountability in Federated Machine Learning

  • Conference paper
Advances in Service-Oriented and Cloud Computing (ESOCC 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1617)

Abstract

Applying Data Governance (DG) to Federated Machine Learning (FML) could provide a way to produce better Machine Learning models. Nevertheless, such an application is still almost nonexistent in the literature. As part of a proposal for applying DG to FML, we first present a metadata approach for FML that provides accountability and assists with the continuous improvement of models in the federation. Our proposal includes a metadata model for tracing the operations of participants and collecting all information regarding the definition of goals and the configuration of FML training processes. Additionally, we outline a metadata management system as part of a broader DG architecture. Finally, we show some use cases of metadata management.
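
To make the idea of tracing participant operations concrete, the following is a minimal sketch and not the metadata model defined in the paper. It assumes a simplified, provenance-style record (who performed which operation, what it used, what it generated) together with a training configuration. All class, field, and identifier names (`TrainingConfig`, `OperationRecord`, `MetadataStore`, `client_a`, `global_v1`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrainingConfig:
    """Hypothetical goal and configuration of one federated training process."""
    goal: str
    rounds: int
    hyperparameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class OperationRecord:
    """One traced operation: which participant did what, in which round."""
    participant: str       # agent responsible for the operation
    operation: str         # e.g. "local_training" or "aggregation"
    round_number: int
    used: List[str]        # identifiers of inputs (models, datasets)
    generated: List[str]   # identifiers of outputs


class MetadataStore:
    """In-memory stand-in for a metadata management system (assumption)."""

    def __init__(self, config: TrainingConfig) -> None:
        self.config = config
        self.records: List[OperationRecord] = []

    def log(self, record: OperationRecord) -> None:
        self.records.append(record)

    def lineage(self, artifact: str) -> List[OperationRecord]:
        """Walk backwards from an artifact to every operation that led to it."""
        trail: List[OperationRecord] = []
        pending = [artifact]
        seen = set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            for rec in self.records:
                if current in rec.generated and rec not in trail:
                    trail.append(rec)
                    pending.extend(rec.used)
        return trail


# Usage: one round with two clients and an aggregating server.
store = MetadataStore(TrainingConfig(goal="accuracy >= 0.9", rounds=1))
store.log(OperationRecord("client_a", "local_training", 1,
                          ["global_v0", "data_a"], ["update_a"]))
store.log(OperationRecord("client_b", "local_training", 1,
                          ["global_v0", "data_b"], ["update_b"]))
store.log(OperationRecord("server", "aggregation", 1,
                          ["update_a", "update_b"], ["global_v1"]))

# Accountability query: who contributed to the new global model?
responsible = sorted({r.participant for r in store.lineage("global_v1")})
print(responsible)
```

Keeping `used` and `generated` on each record is what lets an accountability query walk back from a model to the participants behind it. A production system would more likely build on an established provenance vocabulary than on ad hoc classes like these.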

Notes

  1. https://www.tensorflow.org/tfx/guide/mlmd

  2. https://www.w3.org/TR/prov-overview/

  3. https://prov.readthedocs.io/en/latest/

Acknowledgment

Funded by the German Federal Ministry of Education and Research. Project name: KIWI, RefNr: 16KIS1142K.

Corresponding author

Correspondence to José A. Peregrina.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Peregrina, J.A., Ortiz, G., Zirpins, C. (2022). Towards a Metadata Management System for Provenance, Reproducibility and Accountability in Federated Machine Learning. In: Zirpins, C., et al. Advances in Service-Oriented and Cloud Computing. ESOCC 2022. Communications in Computer and Information Science, vol 1617. Springer, Cham. https://doi.org/10.1007/978-3-031-23298-5_1

  • DOI: https://doi.org/10.1007/978-3-031-23298-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23297-8

  • Online ISBN: 978-3-031-23298-5

  • eBook Packages: Computer Science, Computer Science (R0)
