Efficient Coflow Scheduling Without Prior Knowledge

Authors:
Mosharaf Chowdhury, UC Berkeley, Berkeley, CA, USA
Ion Stoica, UC Berkeley, Berkeley, CA, USA

SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, August 2015, Pages 393–406. https://doi.org/10.1145/2785956.2787480

Published: 17 August 2015

ABSTRACT

Inter-coflow scheduling improves application-level communication performance in data-parallel clusters. However, existing efficient schedulers require a priori coflow information and ignore cluster dynamics like pipelining, task failures, and speculative executions, which limit their applicability. Schedulers without prior knowledge compromise on performance to avoid head-of-line blocking. In this paper, we present Aalo, which strikes a balance and efficiently schedules coflows without prior knowledge.

Aalo employs Discretized Coflow-Aware Least-Attained Service (D-CLAS) to separate coflows into a small number of priority queues based on how much they have already sent across the cluster. By performing prioritization across queues and by scheduling coflows in FIFO order within each queue, Aalo's non-clairvoyant scheduler reduces coflow completion times while guaranteeing starvation freedom. EC2 deployments and trace-driven simulations show that communication stages complete 1.93X faster on average and 3.59X faster at the 95th percentile using Aalo in comparison to per-flow mechanisms. Aalo's performance is comparable to that of solutions using prior knowledge, and Aalo outperforms them in the presence of cluster dynamics.
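The queueing idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the number of queues, the first threshold, the multiplicative spacing between thresholds, and the names `Coflow`, `queue_index`, and `schedule_order` are all assumptions made for illustration. The sketch also uses strict priority across queues, which by itself does not provide the starvation freedom the abstract mentions.

```python
from dataclasses import dataclass

# Hypothetical parameters, chosen only for illustration.
NUM_QUEUES = 4                       # a "small number" of priority queues
FIRST_THRESHOLD = 10 * 1024 * 1024   # bytes sent before leaving queue 0
GROWTH = 10                          # each later threshold is 10x the previous


@dataclass
class Coflow:
    name: str
    arrival: int         # arrival sequence number, used for FIFO order
    bytes_sent: int = 0  # total bytes already sent across the cluster


def queue_index(bytes_sent: int) -> int:
    """Return the queue (0 = highest priority) for a given amount sent."""
    threshold = FIRST_THRESHOLD
    for q in range(NUM_QUEUES - 1):
        if bytes_sent < threshold:
            return q
        threshold *= GROWTH
    return NUM_QUEUES - 1  # the last queue has no upper bound


def schedule_order(coflows):
    """Order coflows by queue first, then FIFO (arrival) within a queue."""
    return sorted(coflows, key=lambda c: (queue_index(c.bytes_sent), c.arrival))


coflows = [
    Coflow("A", arrival=0, bytes_sent=500 * 1024 * 1024),  # has sent a lot
    Coflow("B", arrival=1, bytes_sent=1024),
    Coflow("C", arrival=2, bytes_sent=2048),
]
print([c.name for c in schedule_order(coflows)])  # prints ['B', 'C', 'A']
```

Coflow A arrived first but has already sent far more than B and C, so it sits in a lower-priority queue. B and C share the top queue and are served in arrival order. No coflow size is needed in advance, only the amount sent so far.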

Index Terms

1. Networks
  1. Network services
    1. Cloud computing

Published in
SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication
August 2015, 684 pages
ISBN: 9781450335423
DOI: 10.1145/2785956
General Chairs: Steve Uhlig (Queen Mary University of London, UK), Olaf Maennel (Tallinn U. of Technology, Estonia)
Program Chairs: Brad Karp (University College London, UK), Jitendra Padhye (Microsoft, USA)

ACM SIGCOMM Computer Communication Review, Volume 45, Issue 4 (SIGCOMM'15)
October 2015, 659 pages
ISSN: 0146-4833
DOI: 10.1145/2829988
Editors: Konstantina Papagiannaki (Telefonica Research, Barcelona, Spain), Katerina Argyraki (EPFL, Switzerland), Hitesh Ballani (Microsoft Research Cambridge, UK), Fabián Bustamante (Northwestern University, USA), Joseph Camp (SMU, USA), Augustin Chaintreau (Columbia University, USA), Phillipa Gill (Stony Brook University, USA), Marco Mellia (Politecnico di Torino, Italy), Bhaskaran Raman (IIT Bombay, India), Joel Sommers (Colgate University, USA), Aline Carneiro Viana (INRIA, France)

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
coflow
data-intensive applications
datacenter networks
Qualifiers
- research-article
Acceptance Rates
SIGCOMM '15 paper acceptance rate: 40 of 242 submissions (17%)
Overall acceptance rate: 554 of 3,547 submissions (16%)