Graph Partitioning: Formulations and Applications to Big Data

  • Living reference work entry
Encyclopedia of Big Data Technologies

Definitions

Given an input graph \(G = (V, E)\) and an integer \(k \geq 2\), the graph partitioning problem is to divide \(V\) into \(k\) disjoint blocks of vertices \(V_1, V_2, \ldots, V_k\) such that \(\bigcup_{1\leq i\leq k} V_i = V\), while simultaneously optimizing an objective function and maintaining balance: \(|V_i|\leq (1+\epsilon )\left \lceil |V| / k\right \rceil \) for some \(\epsilon \geq 0\).
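The two constraints in this definition can be checked mechanically. The following is a minimal sketch, assuming blocks are given as Python sets of vertex identifiers; the function names `is_partition` and `is_balanced` are hypothetical and chosen only for illustration.

```python
from math import ceil


def is_partition(blocks, vertices):
    """Return True if the blocks are pairwise disjoint and their union is V."""
    seen = set()
    for block in blocks:
        if seen & set(block):  # a vertex already appeared in an earlier block
            return False
        seen |= set(block)
    return seen == set(vertices)


def is_balanced(blocks, num_vertices, epsilon):
    """Return True if |V_i| <= (1 + epsilon) * ceil(|V| / k) for every block."""
    k = len(blocks)
    max_block_size = (1 + epsilon) * ceil(num_vertices / k)
    return all(len(block) <= max_block_size for block in blocks)


# Illustrative data (an assumption, not from the entry): 7 vertices, k = 3.
vertices = range(7)
even_blocks = [{0, 1, 2}, {3, 4}, {5, 6}]
skewed_blocks = [{0, 1, 2, 3}, {4, 5}, {6}]

print(is_partition(even_blocks, vertices))    # True
print(is_balanced(even_blocks, 7, 0.0))       # True: largest block has 3 <= 3
print(is_balanced(skewed_blocks, 7, 0.0))     # False: largest block has 4 > 3
print(is_balanced(skewed_blocks, 7, 0.5))     # True: 4 <= 1.5 * 3 = 4.5
```

With \(\epsilon = 0\) the bound is \(\lceil 7/3 \rceil = 3\), so the skewed partition is rejected; loosening to \(\epsilon = 0.5\) raises the bound to 4.5 and admits it.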

Overview

Subdividing a problem into manageable pieces is a critical task in effectively parallelizing computation and even accelerating sequential computation. A key method that has received a lot of attention for doing so is graph partitioning. The simplest and most common form of graph partitioning asks for the vertex set to be partitioned into k roughly equal-sized blocks while minimizing the number of edges between the blocks (called cut edges). Even this most basic variant is NP-hard (Hyafil and Rivest 1973). Graph partitioning has many application areas, including VLSI (Karypis et al. 1999), scientific computing (Langguth et al. 2015...
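The edge-cut objective described above can be evaluated directly from an edge list. This is a minimal sketch, assuming an undirected graph stored as a list of vertex pairs and a partition stored as a dictionary from vertex to block index; the name `edge_cut` and the example graph are hypothetical.

```python
def edge_cut(edges, block_of):
    """Count the edges whose two endpoints lie in different blocks."""
    return sum(1 for u, v in edges if block_of[u] != block_of[v])


# Illustrative data (an assumption, not from the entry): a cycle on 6 vertices.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]

# Two balanced 2-way partitions of the same cycle.
contiguous = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
alternating = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

print(edge_cut(cycle_edges, contiguous))   # 2: only (2, 3) and (5, 0) are cut
print(edge_cut(cycle_edges, alternating))  # 6: every edge is cut
```

Both partitions satisfy the balance constraint with \(\epsilon = 0\), yet their cuts differ by a factor of three, which is why the objective has to be optimized in addition to maintaining balance.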

References

  • Akhremtsev Y, Sanders P, Schulz C (2015) (Semi-)external algorithms for graph partitioning and clustering. In: Proceeding of 17th workshop on algorithm engineering and experiments (ALENEX 2015). SIAM, pp 33–43

  • Alpert CJ, Kahng AB, Yao SZ (1999) Spectral partitioning with multiple eigenvectors. Discret Appl Math 90(1):3–26

  • Andreev K, Räcke H (2006) Balanced graph partitioning. Theory Comput Syst 39(6):929–939

  • Arz J, Sanders P, Stegmaier J, Mikut R (2017) 3D cell nuclei segmentation with balanced graph partitioning. CoRR abs/1702.05413

  • Aydin K, Bateni M, Mirrokni V (2016) Distributed balanced partitioning via linear embedding. In: Proceeding of the ninth ACM international conference on web search and data mining. ACM, pp 387–396

  • Bichot C, Siarry P (eds) (2011) Graph partitioning. Wiley, London

  • Bourse F, Lelarge M, Vojnovic M (2014) Balanced graph edge partition. In: Proceeding 20th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’14. ACM, pp 1456–1465

  • Buluç A, Madduri K (2012) Graph partitioning for scalable distributed graph computations. In: Proceeding of 10th DIMACS implementation challenge, contemporary mathematics. AMS, pp 83–102

  • Buluç A, Meyerhenke H, Safro I, Sanders P, Schulz C (2016) Recent advances in graph partitioning. In: Kliemann L, Sanders P (eds) Algorithm engineering. Springer, Cham, pp 117–158

  • Fiduccia CM, Mattheyses RM (1982) A linear-time heuristic for improving network partitions. In: Proceedings of the 19th conference on design automation, pp 175–181

  • Fietz J, Krause M, Schulz C, Sanders P, Heuveline V (2012) Optimized hybrid parallel lattice Boltzmann fluid flow simulations on complex geometries. In: Proceeding of Euro-Par 2012 parallel processing. LNCS, vol 7484. Springer, pp 818–829

  • Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) PowerGraph: distributed graph-parallel computation on natural graphs. In: Presented as part of the 10th USENIX symposium on operating systems design and implementation (OSDI 12), USENIX, pp 17–30

  • Hendrickson B, Kolda TG (2000) Graph partitioning models for parallel computing. Parallel Comput 26(12):1519–1534

  • Hendrickson B, Leland R (1995) A multilevel algorithm for partitioning graphs. In: Proceeding of the ACM/IEEE conference on supercomputing’95. ACM

  • Hyafil L, Rivest R (1973) Graph partitioning and constructing optimal decision trees are polynomial complete problems. Technical report 33, IRIA – Laboratoire de Recherche en Informatique et Automatique

  • Jammula N, Chockalingam SP, Aluru S (2017) Distributed memory partitioning of high-throughput sequencing datasets for enabling parallel genomics analyses. In: Proceeding of 8th ACM international conference on bioinformatics, computational biology, and health informatics, ACM-BCB’17. ACM, pp 417–424

  • Karypis G, Aggarwal R, Kumar V, Shekhar S (1999) Multilevel hypergraph partitioning: applications in VLSI domain. IEEE Trans Very Large Scale Integr (VLSI) Syst 7(1):69–79

  • Kim J, Hwang I, Kim YH, Moon BR (2011) Genetic approaches for graph partitioning: a survey. In: 13th genetic and evolutionary computation (GECCO). ACM, pp 473–480

  • Lamm S, Sanders P, Schulz C, Strash D, Werneck RF (2017) Finding near-optimal independent sets at scale. J Heuristics 23(4):207–229

  • Lang K, Rao S (2004) A flow-based method for improving the expansion or conductance of graph cuts. In: Proceedings of the 10th international integer programming and combinatorial optimization conference. LNCS, vol 3064. Springer, pp 383–400

  • Langguth J, Sourouri M, Lines GT, Baden SB, Cai X (2015) Scalable heterogeneous CPU-GPU computations for unstructured tetrahedral meshes. IEEE Micro 35(4):6–15

  • Li L, Geda R, Hayes AB, Chen Y, Chaudhari P, Zhang EZ, Szegedy M (2017) A simple yet effective balanced edge partition model for parallel computing. Proc ACM Meas Anal Comput Syst 1(1):14:1–14:21

  • McCune RR, Weninger T, Madey G (2015) Thinking like a vertex: a survey of vertex-centric frameworks for large-scale distributed graph processing. ACM Comput Surv 48(2):25:1–25:39

  • Meyerhenke H, Sauerwald T (2012) Beyond good partition shapes: an analysis of diffusive graph partitioning. Algorithmica 64(3):329–361

  • Meyerhenke H, Sanders P, Schulz C (2017) Parallel graph partitioning for complex networks. IEEE Trans Parallel Distrib Syst 28:2625–2638

  • Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106

  • Rahimian F, Payberah AH, Girdzijauskas S, Haridi S (2014) Distributed vertex-cut partitioning. In: IFIP international conference on distributed applications and interoperable systems. Springer, pp 186–200

  • Sanders P, Schulz C (2011) Engineering multilevel graph partitioning algorithms. In: Proceedings of the 19th European symposium on algorithms. LNCS, vol 6942. Springer, pp 469–480

  • Sanders P, Schulz C (2013) High quality graph partitioning. In: Proceedings of the 10th DIMACS implementation challenge – graph clustering and graph partitioning. AMS, pp 1–17

  • Schloegel K, Karypis G, Kumar V (2003) Graph partitioning for high-performance scientific simulations. In: Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L, White A (eds) Sourcebook of parallel computing. Morgan Kaufmann Publishers, San Francisco, pp 491–541

  • Shalita A, Karrer B, Kabiljo I, Sharma A, Presta A, Adcock A, Kllapi H, Stumm M (2016) Social hash: an assignment framework for optimizing distributed systems operations on social networks. In: Argyraki KJ, Isaacs R (eds) 13th USENIX symposium on networked systems design and implementation, NSDI. USENIX Association, pp 455–468

  • Slota GM, Rajamanickam S, Devine K, Madduri K (2017) Partitioning trillion-edge graphs in minutes. In: Proceedings of the 31st IEEE international parallel and distributed processing symposium (IPDPS 2017), pp 646–655

  • Tomer R, Khairy K, Amat F, Keller PJ (2012) Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy. Nat Methods 9(7):755–763

  • Tran DA, Nguyen K, Pham C (2012) S-clone: socially-aware data replication for social networks. Comput Netw 56(7):2001–2013

  • Ugander J, Backstrom L (2013) Balanced label propagation for partitioning massive graphs. In: Proceedings of the sixth ACM international conference on web search and data mining, WSDM’13. ACM, pp 507–516

  • Zhou M, Sahni O, Devine KD, Shephard MS, Jansen KE (2010) Controlling unstructured mesh partitions for massively parallel simulations. SIAM J Sci Comput 32(6):3201–3227

Corresponding author

Correspondence to Christian Schulz.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this entry

Cite this entry

Schulz, C., Strash, D. (2018). Graph Partitioning: Formulations and Applications to Big Data. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_312-2

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_312-2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Graph Partitioning: Formulations and Applications to Big Data
    Published: 20 March 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_312-2

  2. Original

    Graph Partition
    Published: 12 February 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_312-1