Abstract
In fully distributed machine learning, privacy and security are important issues. These issues are often addressed with secure multiparty computation (MPC). However, in our application domain, known MPC algorithms are either not scalable or not robust enough. We propose a lightweight protocol to quickly and securely compute the sum of the inputs of a subset of participants, assuming a semi-honest adversary. During the computation the participants learn no individual values. We apply this protocol to efficiently calculate the sum of gradients as part of a fully distributed mini-batch stochastic gradient descent algorithm. The protocol achieves scalability and robustness by exploiting the fact that in this application domain a “quick and dirty” sum computation is acceptable. In other words, speed and robustness take precedence over precision. We analyze the protocol both theoretically and experimentally, using churn statistics from a real smartphone trace. We derive a sufficient condition for preventing the leakage of an individual value, and we demonstrate that the overhead of the protocol is feasible.
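To make the role of the sum concrete, the following is a minimal sketch of a single mini-batch gradient step in which the model update sees only the sum of the participants' gradients. It is not the protocol proposed in the paper: the pairwise cancelling masks are a generic textbook stand-in chosen for illustration, and the function names, the squared-error loss, and the learning rate are all assumptions. The sketch also ignores churn and the approximate ("quick and dirty") nature of the sum that the abstract mentions.

```python
import random


def local_gradient(w, x, y):
    """Gradient of the squared error 0.5 * (w.x - y)^2 for one local example."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]


def masked_shares(values, rng):
    """Hide each participant's vector behind pairwise masks that cancel in the sum.

    Illustrative stand-in only (hypothetical helper), not the paper's protocol.
    Floating-point masks give no real cryptographic guarantee.
    """
    n, dim = len(values), len(values[0])
    shares = [list(v) for v in values]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(dim):
                m = rng.uniform(-1e3, 1e3)
                shares[i][k] += m  # participant i adds the mask
                shares[j][k] -= m  # participant j subtracts the same mask
    return shares


def minibatch_step(w, batch, lr, rng):
    """One mini-batch update that only ever looks at the summed gradient."""
    grads = [local_gradient(w, x, y) for x, y in batch]
    shares = masked_shares(grads, rng)
    total = [sum(col) for col in zip(*shares)]  # masks cancel here
    return [wi - lr * g / len(batch) for wi, g in zip(w, total)]
```

Here each element of `batch` plays the role of one participant's private example. The update rule needs only `total`, which is why a protocol that reveals the sum but no individual gradient is sufficient for this kind of learning.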
Copyright information
© 2015 IFIP International Federation for Information Processing
About this paper
Cite this paper
Danner, G., Jelasity, M. (2015). Fully Distributed Privacy Preserving Mini-batch Gradient Descent Learning. In: Bessani, A., Bouchenak, S. (eds) Distributed Applications and Interoperable Systems. DAIS 2015. Lecture Notes in Computer Science, vol 9038. Springer, Cham. https://doi.org/10.1007/978-3-319-19129-4_3
DOI: https://doi.org/10.1007/978-3-319-19129-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19128-7
Online ISBN: 978-3-319-19129-4
eBook Packages: Computer Science (R0)