Abstract
A recent paper [1] proposes a general model for distributed learning that bounds the communication required for learning classifiers with ε error on linearly separable data adversarially distributed across nodes. In this work, we develop key improvements and extensions to this basic model. Our first result is a two-party multiplicative-weight-update based protocol that uses O(d² log(1/ε)) words of communication to classify distributed data in arbitrary dimension d, ε-optimally. This extends to classification over k nodes with O(kd² log(1/ε)) words of communication. Our proposed protocol is simple to implement and is considerably more efficient than the baselines it is compared against, as demonstrated by our empirical results.
In addition, we show how to solve fixed-dimensional and high-dimensional linear programming with small communication in a distributed setting where constraints may be distributed across nodes. Our techniques make use of a novel connection to multipass streaming, as well as a more general adaptation of the multiplicative-weight-update framework to the distributed setting.
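The abstract does not spell out the protocol itself; as a rough, single-machine illustration of the multiplicative-weight-update machinery it builds on, the sketch below maintains a distribution over the training points, fits a weak hyperplane to the weighted data each round, and multiplicatively up-weights the points that hyperplane misclassifies. The function name mwu_linear_classifier, the centroid-difference weak learner, and the parameters rounds and eta are illustrative assumptions, not the paper's actual two-party protocol, which additionally partitions the data across nodes and bounds the words exchanged per round.

```python
import numpy as np

def mwu_linear_classifier(X, y, rounds=50, eta=0.5):
    """Minimal multiplicative-weights sketch (illustrative, not the paper's protocol).

    Maintains a distribution w over the n training points; each round a weak
    hyperplane is fit to the weighted data and points it misclassifies are
    up-weighted multiplicatively. Returns the average of the round hypotheses."""
    n, d = X.shape
    w = np.ones(n) / n                      # distribution over data points
    hypotheses = []
    for _ in range(rounds):
        # Weak "oracle": weighted difference of the two class centroids.
        pos = w * (y > 0)
        neg = w * (y < 0)
        h = X.T @ pos / max(pos.sum(), 1e-12) - X.T @ neg / max(neg.sum(), 1e-12)
        hypotheses.append(h)
        # Multiplicative update: boost the weight of misclassified points.
        miss = np.sign(X @ h) != y
        w = w * np.where(miss, 1.0 + eta, 1.0)
        w /= w.sum()
    return np.mean(hypotheses, axis=0)

# Toy usage on linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_h = rng.normal(size=5)
y = np.sign(X @ true_h)
h = mwu_linear_classifier(X, y)
print("training accuracy:", np.mean(np.sign(X @ h) == y))
```

A distributed version would interleave such weight updates with rounds of communication between the parties; the sketch above only shows the local multiplicative-update step.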
References
Daumé III, H., Phillips, J., Saha, A., Venkatasubramanian, S.: Protocols for learning classifiers on distributed data. In: AISTATS (2012)
Bekkerman, R., Bilenko, M., Langford, J. (eds.): Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011)
McDonald, R., Hall, K., Mann, G.: Distributed training strategies for the structured perceptron. In: NAACL HLT (2010)
Mann, G., McDonald, R., Mohri, M., Silberman, N., Walker, D.: Efficient large-scale distributed training of conditional maximum entropy models. In: NIPS (2009)
Collins, M.: Discriminative training methods for Hidden Markov Models: theory and experiments with perceptron algorithms. In: EMNLP (2002)
Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning 36(1-2) (1999)
Dekel, O., Gilad-Bachrach, R., Shamir, O., Xiao, L.: Optimal distributed online prediction using mini-batches. arXiv:1012.1367 (2010)
Chu, C.T., Kim, S.K., Lin, Y.A., Yu, Y., Bradski, G., Ng, A.Y., Olukotun, K.: Map-reduce for machine learning on multicore. In: NIPS (2007)
Teo, C.H., Vishwanathan, S.V.N., Smola, A.J., Le, Q.V.: Bundle methods for regularized risk minimization. J. Mach. Learn. Res. 11, 311–365 (2010)
Zinkevich, M., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS (2010)
Servedio, R.A., Long, P.: Algorithms and hardness results for parallel large margin learning. In: NIPS (2011)
Balcan, M.F., Blum, A., Fine, S., Mansour, Y.: Distributed learning, communication complexity and privacy. In: COLT 2012, arXiv:1204.3514 (to appear, June 2012)
Daumé III, H., Phillips, J.M., Saha, A., Venkatasubramanian, S.: Efficient protocols for distributed classification and optimization. arXiv:1204.3523 (2012)
Anthony, M., Bartlett, P.L.: Neural Network Learning: Theoretical Foundations. Cambridge University Press (2009)
Cormode, G., Muthukrishnan, S., Yi, K.: Algorithms for distributed functional monitoring. In: SODA (2008)
Cormode, G., Muthukrishnan, S., Yi, K., Zhang, Q.: Optimal sampling from distributed streams. In: PODS (2010)
Matoušek, J.: Approximations and optimal geometric divide-and-conquer. In: STOC (1991)
Chazelle, B.: The Discrepancy Method. Cambridge University Press (2000)
Matoušek, J.: Geometric Discrepancy. Springer (1999)
Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM TIST 2(3) (2011)
Arora, S., Hazan, E., Kale, S.: Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In: FOCS (2005)
Meka, R., Jain, P., Caramanis, C., Dhillon, I.S.: Rank minimization via online learning. In: ICML (2008)
Muthukrishnan, S.: Data streams: algorithms and applications. Foundations and Trends in Theoretical Computer Science. Now Publishers (2005)
Chan, T.M., Chen, E.Y.: Multi-pass geometric algorithms. Disc. & Comp. Geom. 37(1), 79–102 (2007)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Daumé, H., Phillips, J.M., Saha, A., Venkatasubramanian, S. (2012). Efficient Protocols for Distributed Classification and Optimization. In: Bshouty, N.H., Stoltz, G., Vayatis, N., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2012. Lecture Notes in Computer Science, vol. 7568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34106-9_15
DOI: https://doi.org/10.1007/978-3-642-34106-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34105-2
Online ISBN: 978-3-642-34106-9