Abstract
A recent paper [1] proposes a general model for distributed learning that bounds the communication required for learning classifiers with ε error on linearly separable data adversarially distributed across nodes. In this work, we develop key improvements and extensions to this basic model. Our first result is a two-party multiplicative-weight-update based protocol that uses O(d² log(1/ε)) words of communication to classify distributed data in arbitrary dimension d, ε-optimally. This extends to classification over k nodes with O(kd² log(1/ε)) words of communication. Our proposed protocol is simple to implement and is considerably more efficient than the baselines it is compared against, as demonstrated by our empirical results.
In addition, we show how to solve fixed-dimensional and high-dimensional linear programming with small communication in a distributed setting where constraints may be distributed across nodes. Our techniques make use of a novel connection to multipass streaming, as well as a more general adaptation of the multiplicative-weight-update framework to the distributed setting.
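The abstract does not spell out the protocol itself; as a rough, single-machine illustration of the multiplicative-weight-update machinery it builds on, the sketch below maintains a distribution over the training points, fits a weak hyperplane to the weighted data each round, and multiplicatively up-weights the points that hyperplane misclassifies. The function name mwu_linear_classifier, the centroid-difference weak learner, and the parameters rounds and eta are illustrative assumptions, not the paper's actual two-party protocol, which additionally partitions the data across nodes and bounds the words exchanged per round.

```python
import numpy as np

def mwu_linear_classifier(X, y, rounds=50, eta=0.5):
    """Minimal multiplicative-weights sketch (illustrative, not the paper's protocol).

    Maintains a distribution w over the n training points; each round a weak
    hyperplane is fit to the weighted data and points it misclassifies are
    up-weighted multiplicatively. Returns the average of the round hypotheses."""
    n, d = X.shape
    w = np.ones(n) / n                      # distribution over data points
    hypotheses = []
    for _ in range(rounds):
        # Weak "oracle": weighted difference of the two class centroids.
        pos = w * (y > 0)
        neg = w * (y < 0)
        h = X.T @ pos / max(pos.sum(), 1e-12) - X.T @ neg / max(neg.sum(), 1e-12)
        hypotheses.append(h)
        # Multiplicative update: boost the weight of misclassified points.
        miss = np.sign(X @ h) != y
        w = w * np.where(miss, 1.0 + eta, 1.0)
        w /= w.sum()
    return np.mean(hypotheses, axis=0)

# Toy usage on linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_h = rng.normal(size=5)
y = np.sign(X @ true_h)
h = mwu_linear_classifier(X, y)
print("training accuracy:", np.mean(np.sign(X @ h) == y))
```

A distributed version would interleave such weight updates with rounds of communication between the parties; the sketch above only shows the local multiplicative-update step.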
References
Daumé III, H., Phillips, J., Saha, A., Venkatasubramanian, S.: Protocols for learning classifiers on distributed data. In: AISTATS (2012)
Bekkerman, R., Bilenko, M., Langford, J. (eds.): Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011)
McDonald, R., Hall, K., Mann, G.: Distributed training strategies for the structured perceptron. In: NAACL HLT (2010)
Mann, G., McDonald, R., Mohri, M., Silberman, N., Walker, D.: Efficient large-scale distributed training of conditional maximum entropy models. In: NIPS (2009)
Collins, M.: Discriminative training methods for Hidden Markov Models: theory and experiments with perceptron algorithms. In: EMNLP (2002)
Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning 36(1-2) (1999)
Dekel, O., Gilad-Bachrach, R., Shamir, O., Xiao, L.: Optimal distributed online prediction using mini-batches. arXiv:1012.1367 (2010)
Chu, C.T., Kim, S.K., Lin, Y.A., Yu, Y., Bradski, G., Ng, A.Y., Olukotun, K.: Map-reduce for machine learning on multicore. In: NIPS (2007)
Teo, C.H., Vishwanathan, S.V.N., Smola, A.J., Le, Q.V.: Bundle methods for regularized risk minimization. J. Mach. Learn. Res. 11, 311–365 (2010)
Zinkevich, M., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS (2010)
Servedio, R.A., Long, P.: Algorithms and hardness results for parallel large margin learning. In: NIPS (2011)
Balcan, M.F., Blum, A., Fine, S., Mansour, Y.: Distributed learning, communication complexity and privacy. In: COLT 2012, arXiv:1204.3514 (to appear, June 2012)
Daumé III, H., Phillips, J.M., Saha, A., Venkatasubramanian, S.: Efficient protocols for distributed classification and optimization. arXiv:1204.3523 (2012)
Anthony, M., Bartlett, P.L.: Neural Network Learning: Theoretical Foundations. Cambridge University Press (2009)
Cormode, G., Muthukrishnan, S., Yi, K.: Algorithms for distributed functional monitoring. In: SODA (2008)
Cormode, G., Muthukrishnan, S., Yi, K., Zhang, Q.: Optimal sampling from distributed streams. In: PODS (2010)
Matoušek, J.: Approximations and optimal geometric divide-and-conquer. In: STOC (1991)
Chazelle, B.: The Discrepancy Method. Cambridge University Press (2000)
Matoušek, J.: Geometric Discrepancy. Springer (1999)
Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM TIST 2(3) (2011)
Arora, S., Hazan, E., Kale, S.: Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In: FOCS (2005)
Meka, R., Jain, P., Caramanis, C., Dhillon, I.S.: Rank minimization via online learning. In: ICML (2008)
Muthukrishnan, S.: Data streams: algorithms and applications. Foundations and Trends in Theoretical Computer Science. Now Publishers (2005)
Chan, T.M., Chen, E.Y.: Multi-pass geometric algorithms. Disc. & Comp. Geom. 37(1), 79–102 (2007)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Daumé, H., Phillips, J.M., Saha, A., Venkatasubramanian, S. (2012). Efficient Protocols for Distributed Classification and Optimization. In: Bshouty, N.H., Stoltz, G., Vayatis, N., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2012. Lecture Notes in Computer Science, vol. 7568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34106-9_15
DOI: https://doi.org/10.1007/978-3-642-34106-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34105-2
Online ISBN: 978-3-642-34106-9