Abstract
Since Google introduced Pregel, the first distributed graph computing system, many similar systems have been proposed. Distributed graph computing systems have become a standard platform for large-scale graph analysis. Compared with earlier graph processing libraries, these systems offer better scalability, usability, and flexibility. In this chapter, we briefly review the basic concepts of distributed graph computing systems, including their architecture, execution flow, programming abstractions, and computation models (e.g., vertex-centric, edge-centric, and subgraph-centric). In this book, we concentrate on the vertex-centric computation model and describe two popular programming abstractions: the vertex programming abstraction and the gather–apply–scatter (GAS) programming abstraction. Finally, we introduce the workload-aware cost model, which classifies the factors influencing performance into two types: workload source and workload distribution. The model helps estimate the workload of a distributed graph computing system and guides the optimization of systems and algorithms.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Shao, Y., Cui, B., Chen, L. (2020). Graph Computing Systems for Large-Scale Graph Analysis. In: Large-scale Graph Analysis: System, Algorithm and Optimization. Big Data Management. Springer, Singapore. https://doi.org/10.1007/978-981-15-3928-2_2
DOI: https://doi.org/10.1007/978-981-15-3928-2_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3927-5
Online ISBN: 978-981-15-3928-2
eBook Packages: Computer Science, Computer Science (R0)