ABSTRACT
Massively Parallel Computation (MPC) is an emerging model which distills core aspects of distributed and parallel computation. It was developed as a tool to solve (typically graph) problems in systems where the input is distributed over many machines with limited space. Recent work has focused on the regime in which machines have sublinear space (in n, the number of nodes in the input graph), with randomized algorithms presented for the fundamental problems of Maximal Matching and Maximal Independent Set. There are, however, no prior corresponding deterministic algorithms.
A major challenge in the sublinear space setting is that the local space of each machine may be too small to store all the edges incident to a single node. To overcome this barrier we introduce a new graph sparsification technique that deterministically computes a low-degree subgraph with additional desired properties: degrees in the subgraph are sufficiently small that nodes' neighborhoods can be stored on single machines, and solving the problem on the subgraph provides significant global progress towards solving the problem for the original input graph.
Using this framework to derandomize the well-known randomized algorithm of Luby [SICOMP'86], we obtain O(log Δ + log log n)-round deterministic MPC algorithms for solving the fundamental problems of Maximal Matching and Maximal Independent Set with O(n^ε) space on each machine for any constant ε > 0. Based on the recent work of Ghaffari et al. [FOCS'18], this additive O(log log n) term is conditionally essential. These algorithms can also be shown to run in O(log Δ) rounds in the closely related model of CONGESTED CLIQUE, improving upon the state-of-the-art bound of O(log^2 Δ) rounds by Censor-Hillel et al. [DISC'17].
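For orientation, the randomized starting point can be illustrated with a minimal sequential sketch of a Luby-style Maximal Independent Set computation, in the common random-priority variant. This is not the deterministic MPC algorithm of the paper: it uses true randomness, runs on a single machine, and ignores all space constraints. The function names `luby_mis` and `is_maximal_independent_set` and the adjacency-dictionary input format are assumptions made for illustration only.

```python
import random


def luby_mis(adj, seed=0):
    """Luby-style randomized MIS (random-priority variant), simulated sequentially.

    adj: dict mapping each node to the set of its neighbours
         (undirected graph, no self-loops, node ids mutually comparable).
    Returns the independent set found and the number of rounds used.
    """
    rng = random.Random(seed)
    alive = set(adj)  # nodes that are still undecided
    mis = set()
    rounds = 0
    while alive:
        rounds += 1
        # Every undecided node draws a random priority; the node id breaks ties.
        prio = {v: (rng.random(), v) for v in sorted(alive)}
        # A node joins the set if it beats all of its undecided neighbours.
        joined = {
            v for v in alive
            if all(prio[v] < prio[u] for u in adj[v] if u in alive)
        }
        mis |= joined
        # Nodes that joined, and all their neighbours, are now decided.
        removed = set(joined)
        for v in joined:
            removed |= adj[v]
        alive -= removed
    return mis, rounds


def is_maximal_independent_set(adj, s):
    """Check that s is independent and that no node can be added to it."""
    independent = all(u not in s for v in s for u in adj[v])
    maximal = all(v in s or any(u in s for u in adj[v]) for v in adj)
    return independent and maximal


if __name__ == "__main__":
    # A triangle 0-1-2 with a path 2-3-4 attached (hypothetical example graph).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
    result, used_rounds = luby_mis(graph, seed=1)
    print(sorted(result), used_rounds)
    print(is_maximal_independent_set(graph, result))
```

In each round the undecided node with the globally smallest priority always joins, so the loop terminates. Broadly, derandomizing such a procedure means replacing the random draws with deterministically chosen values that still guarantee comparable progress per round.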
- Noga Alon, László Babai, and Alon Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7(4):567--583, 1986.
- Alexandr Andoni, Zhao Song, Clifford Stein, Zhengyu Wang, and Peilin Zhong. Parallel graph connectivity in log diameter rounds. In Proc. 59th FOCS, pages 674--685, 2018.
- Alexandr Andoni, Clifford Stein, and Peilin Zhong. Parallel approximate undirected shortest paths via low hop emulators. In Proc. 52nd STOC, 2020.
- Sepehr Assadi, MohammadHossein Bateni, Aaron Bernstein, Vahab Mirrokni, and Cliff Stein. Coresets meet EDCS: Algorithms for matching and vertex cover on massive graphs. In Proc. 30th SODA, pages 1616--1635, 2019.
- Sepehr Assadi, Xiaorui Sun, and Omri Weinstein. Massively parallel algorithms for finding well-connected components in sparse graphs. In Proc. 37th PODC, pages 461--470, 2019.
- Soheil Behnezhad, Sebastian Brandt, Mahsa Derakhshan, Manuela Fischer, MohammadTaghi Hajiaghayi, Richard M. Karp, and Jara Uitto. Massively parallel computation of matching and MIS in sparse graphs. In Proc. 37th PODC, pages 481--490, 2019. A preliminary version of a merge of CoRR abs/1807.06701 and CoRR abs/1807.05374.
- Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Semi-MapReduce meets Congested Clique. CoRR abs/1802.10297, 2018.
- Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Łącki, and Vahab S. Mirrokni. Near-optimal massively parallel graph connectivity. In Proc. 60th FOCS, pages 1615--1636, 2019.
- Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Łącki, Vahab S. Mirrokni, and Warren Schudy. Massively parallel computation via remote memory access. In Proc. 31st SPAA, pages 59--68, 2019.
- Soheil Behnezhad, MohammadTaghi Hajiaghayi, and David G. Harris. Exponentially faster massively parallel maximal matching. In Proc. 60th FOCS, pages 1637--1649, 2019.
- Mihir Bellare and John Rompel. Randomness-efficient oblivious sampling. In Proc. 35th FOCS, pages 276--287, 1994.
- Keren Censor-Hillel, Merav Parter, and Gregory Schwartzman. Derandomizing local distributed algorithms under bandwidth restrictions. In Proc. 31st DISC, pages 11:1--11:16, 2017.
- Yi-Jun Chang, Manuela Fischer, Mohsen Ghaffari, Jara Uitto, and Yufan Zheng. The complexity of (Δ+1) coloring in congested clique, massively parallel computation, and centralized local computation. In Proc. 38th PODC, pages 471--480, 2019.
- Artur Czumaj, Peter Davies, and Merav Parter. Simple, deterministic, constant-round coloring in the congested clique. In Proc. 39th PODC, 2020.
- Artur Czumaj, Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski. Round compression for parallel matching algorithms. In Proc. 50th STOC, pages 471--484, 2018.
- Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In Proc. 6th OSDI, pages 10--10, 2004.
- Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1):107--113, January 2008.
- Mohsen Ghaffari. An improved distributed algorithm for maximal independent set. In Proc. 27th SODA, pages 270--277, 2016.
- Mohsen Ghaffari, Themis Gouleakis, Christian Konrad, Slobodan Mitrović, and Ronitt Rubinfeld. Improved massively parallel computation algorithms for MIS, matching, and vertex cover. In Proc. 36th PODC, pages 129--138, 2018.
- Mohsen Ghaffari, Christoph Grunau, and Ce Jin. Improved MPC algorithms for MIS, matching, and coloring on trees and beyond. CoRR abs/2002.09610, February 2020.
- Mohsen Ghaffari, Fabian Kuhn, and Jara Uitto. Conditional hardness results for massively parallel computation from distributed lower bounds. In Proc. 60th FOCS, pages 1650--1663, 2019.
- Mohsen Ghaffari and Jara Uitto. Sparsifying distributed algorithms with ramifications in massively parallel computation and centralized local computation. In Proc. 30th SODA, pages 1636--1653, 2019.
- Mark K. Goldberg and Thomas H. Spencer. A new parallel algorithm for the maximal independent set problem. SICOMP, 18(2):419--427, 1989.
- M. Goodrich. Communication-efficient parallel sorting. SIAM Journal on Computing, 29(2):416--432, 1999.
- Michael T. Goodrich, Nodari Sitchinava, and Qin Zhang. Sorting, searching, and simulation in the MapReduce framework. In Proc. 22nd ISAAC, pages 374--383, 2011.
- Yijie Han. A fast derandomization scheme and its applications. SICOMP, 25(1):52--82, 1996.
- David G. Harris. Deterministic parallel algorithms for bilinear objective functions. Algorithmica, 81(3):1288--1318, 2019.
- Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: Distributed data-parallel programs from sequential building blocks. SIGOPS Operating Systems Review, 41(3):59--72, March 2007.
- Amos Israeli and Alon Itai. A fast and simple randomized parallel algorithm for maximal matching. Information Processing Letters, 22(2):77--80, 1986.
- Howard J. Karloff, Siddharth Suri, and Sergei Vassilvitskii. A model of computation for MapReduce. In Proc. 21st SODA, pages 938--948, 2010.
- Richard M. Karp and Avi Wigderson. A fast parallel algorithm for the maximal independent set problem. JACM, 32(4):762--773, 1985.
- Fabian Kuhn. Weak graph colorings: Distributed algorithms and applications. In Proc. 21st SPAA, pages 138--144, 2009.
- Silvio Lattanzi, Benjamin Moseley, Siddharth Suri, and Sergei Vassilvitskii. Filtering: A method for solving graph problems in MapReduce. In Proc. 23rd SPAA, pages 85--94, 2011.
- Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski. Walking randomly, massively, and efficiently. In Proc. 52nd STOC, 2020.
- Christoph Lenzen. Optimal deterministic routing and sorting on the congested clique. In Proc. 32nd PODC, pages 42--50, 2013.
- Nathan Linial. Locality in distributed graph algorithms. SICOMP, 21(1):193--201, February 1992.
- Zvi Lotker, Elan Pavlov, Boaz Patt-Shamir, and David Peleg. MST construction in O(log log n) communication rounds. In Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, pages 94--100, 2003.
- Michael Luby. A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 15(4):1036--1053, 1986.
- Tim Roughgarden, Sergei Vassilvitskii, and Joshua R. Wang. Shuffles and circuits (on lower bounds for modern parallel computation). JACM, 65(6):41:1--41:24, November 2018.
- Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1--3):1--336, 2012.
- Eric Vigoda. Lecture notes for randomized algorithms: Luby's alg. for maximal independent sets using pairwise independence. https://www.cc.gatech.edu/~vigoda/RandAlgs/MIS.pdf, 2006.
- Tom White. Hadoop: The Definitive Guide. O'Reilly Media, Inc., 2012.
- Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. Spark: Cluster computing with working sets. In Proc. 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2010.