
Efficient parallel graph trimming by arc-consistency

The Journal of Supercomputing

Abstract

Given a large data graph, trimming techniques can reduce the search space by removing vertices without outgoing edges. One application is to speed up the parallel decomposition of graphs into strongly connected components (SCC decomposition), a fundamental step in graph analysis. We observe that graph trimming is essentially an arc-consistency problem, and that AC-3, AC-4, and AC-6 are the arc-consistency algorithms most relevant to it. Existing parallel graph trimming methods require worst-case \(\mathcal{O}(nm)\) time and worst-case \(\mathcal{O}(n)\) space for graphs with \(n\) vertices and \(m\) edges. We call these methods parallel AC-3-based, as they closely resemble the AC-3 algorithm. In this work, we propose AC-4-based and AC-6-based trimming methods. AC-4-based trimming improves the worst-case time to \(\mathcal{O}(n+m)\) but requires worst-case \(\mathcal{O}(n+m)\) space; AC-6-based trimming has the same worst-case time of \(\mathcal{O}(n+m)\) while improving the worst-case space to \(\mathcal{O}(n)\). We parallelize the AC-4-based and AC-6-based algorithms for shared-memory multi-core machines, designing them to minimize synchronization overhead. For these algorithms, we also prove correctness and analyze time complexity in the work-depth model. In experiments, we compare the three parallel trimming algorithms over a variety of real and synthetic graphs on a multi-core machine, where each core corresponds to a worker. With 16 workers, measured by the maximum number of edges traversed per worker, AC-3-based trimming traverses up to 58.3 and 36.5 times more edges than AC-6-based and AC-4-based trimming, respectively. AC-6-based trimming thus traverses far fewer edges than the other methods, which is especially meaningful for implicit graphs. In terms of practical running time, AC-6-based trimming achieves high speedups on graphs with a large proportion of trimmable vertices.
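The three trimming styles contrasted in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' parallel implementation: the function names `trim_ac3`, `trim_ac4`, and `trim_ac6`, the adjacency-list input format, and the bookkeeping arrays are assumptions made for illustration only. Each function returns a list of booleans marking the vertices that survive trimming.

```python
from collections import deque


def trim_ac3(adj):
    """AC-3 style (illustrative): repeat full passes until nothing changes.

    Each pass may rescan every edge, which gives O(nm) worst-case time
    with only O(n) extra space.
    """
    n = len(adj)
    live = [True] * n
    changed = True
    while changed:
        changed = False
        for v in range(n):
            # v survives only if at least one successor is still live.
            if live[v] and not any(live[w] for w in adj[v]):
                live[v] = False
                changed = True
    return live


def trim_ac4(adj):
    """AC-4 style (illustrative): per-vertex counters of live successors.

    The reverse adjacency lists cost O(n + m) space, but every edge is
    handled a constant number of times, so time is O(n + m).
    """
    n = len(adj)
    rev = [[] for _ in range(n)]
    for v in range(n):
        for w in adj[v]:
            rev[w].append(v)
    count = [len(adj[v]) for v in range(n)]
    live = [count[v] > 0 for v in range(n)]
    queue = deque(v for v in range(n) if count[v] == 0)
    while queue:
        w = queue.popleft()
        for v in rev[w]:
            count[v] -= 1
            if count[v] == 0:
                live[v] = False
                queue.append(v)
    return live


def trim_ac6(adj):
    """AC-6 style (illustrative): each vertex records one live successor.

    pos[v] only moves forward through adj[v], so each edge is examined at
    most once (O(n + m) time). Each live vertex sits in exactly one
    'supported' list, so the extra space is O(n).
    """
    n = len(adj)
    live = [True] * n
    pos = [0] * n                       # next index of adj[v] to examine
    supported = [[] for _ in range(n)]  # supported[w]: vertices relying on w
    queue = deque()

    def find_support(v):
        while pos[v] < len(adj[v]):
            w = adj[v][pos[v]]
            pos[v] += 1
            if live[w]:
                supported[w].append(v)
                return
        live[v] = False                 # no live successor remains
        queue.append(v)

    for v in range(n):
        find_support(v)
    while queue:
        w = queue.popleft()
        for v in supported[w]:
            if live[v]:
                find_support(v)         # resume the scan after the old support
        supported[w] = []
    return live


# Edges: 0->1, 1->2, 2->1, 3->4, 5->3, 5->0; vertex 4 has no outgoing edge.
example = [[1], [2], [1], [4], [], [3, 0]]
print(trim_ac6(example))  # [True, True, True, False, False, True]
```

In the example, vertex 4 is trimmed first, which removes the only successor of vertex 3; vertex 5 survives because its second successor, vertex 0, leads into the cycle between vertices 1 and 2. All three functions return the same result and differ only in how much work and memory they spend reaching it.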

Notes

  1. All our implementations, benchmarks, and results are available at https://github.com/Itisben/graph-trimming.git.

  2. https://snap.stanford.edu.

  3. http://networkrepository.com.

Acknowledgements

We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Corresponding author

Correspondence to Bin Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Guo, B., Sekerinski, E. Efficient parallel graph trimming by arc-consistency. J Supercomput 78, 15269–15313 (2022). https://doi.org/10.1007/s11227-022-04457-9

  • DOI: https://doi.org/10.1007/s11227-022-04457-9
