I/O efficient ECC graph decomposition via graph reduction

  • Regular Paper
The VLDB Journal

Abstract

Computing the k-edge connected components (k-\(\mathsf{ECC}\)s) of a graph G for a specific k is a fundamental graph problem that has been investigated recently. In this paper, we study the problem of \(\mathsf{ECC}\) decomposition, which computes the k-\(\mathsf{ECC}\)s of a graph G for all possible k values. \(\mathsf{ECC}\) decomposition can be applied in a variety of applications such as graph-topology analysis, community detection, Steiner Component Search, and graph visualization. A straightforward solution for \(\mathsf{ECC}\) decomposition is to apply an existing k-\(\mathsf{ECC}\) computation algorithm once for each k value. However, this solution does not scale to large graphs, for two reasons. First, all existing k-\(\mathsf{ECC}\) computation algorithms are highly memory intensive due to the complex data structures they use. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is considered independently. We address these challenges by studying I/O efficient \(\mathsf{ECC}\) decomposition via graph reduction. We introduce two graph reduction operators that reduce the size of the graph loaded in memory while preserving the connectivity information of the set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, \(\mathsf{Bottom}\)-\(\mathsf{Up}\), \(\mathsf{Top}\)-\(\mathsf{Down}\), and \(\mathsf{Hybrid}\), that explore the k values in different orders to reduce redundant computation across different k values. We analyze the I/O and memory costs of all proposed algorithms. In addition, we extend our algorithm to build an efficient index for Steiner Component Search, and we show that this index can be used to perform Steiner Component Search in optimal I/Os when only the node information of the graph is allowed to be loaded in memory. In our experiments, we evaluate our algorithms on seven large real-world datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.
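To make the "straightforward solution" mentioned in the abstract concrete, the following is a minimal in-memory sketch, not the paper's I/O efficient algorithms. It assumes a simple undirected graph stored as symmetric adjacency sets, reports only components with at least two nodes, and substitutes a textbook cut-based splitting (repeated unit-capacity max-flow) for the existing k-\(\mathsf{ECC}\) algorithms the abstract refers to. All function names (`build_adj`, `k_eccs`, `naive_ecc_decomposition`) are hypothetical.

```python
from collections import deque


def build_adj(edges):
    """Build symmetric adjacency sets from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def _limited_flow(adj, s, t, k):
    """Push up to k units of flow from s to t (unit capacity per edge).

    Returns (flow, side). If flow < k, `side` is the set of nodes on the
    source side of a minimum s-t cut; otherwise `side` is None.
    """
    cap = {u: {v: 1 for v in adj[u]} for u in adj}
    flow = 0
    while flow < k:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)  # fewer than k edge-disjoint paths
        v = t
        while parent[v] is not None:  # augment along the path found
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1
    return flow, None


def k_eccs(adj, k):
    """Return the k-ECCs with at least two nodes, as frozensets of nodes."""
    result, stack = [], [set(adj)]
    while stack:
        nodes = stack.pop()
        if len(nodes) < 2:
            continue
        # Induced subgraph on the current candidate node set.
        sub = {u: [v for v in adj[u] if v in nodes] for u in nodes}
        s = next(iter(nodes))
        for t in nodes:
            if t == s:
                continue
            flow, side = _limited_flow(sub, s, t, k)
            if flow < k:  # a cut with < k edges exists: split and retry
                stack.append(side)
                stack.append(nodes - side)
                break
        else:
            result.append(frozenset(nodes))  # every s-t pair has >= k paths
    return result


def naive_ecc_decomposition(adj):
    """Run k_eccs independently for k = 1, 2, ... until nothing remains."""
    decomposition, k = {}, 1
    while True:
        comps = k_eccs(adj, k)
        if not comps:
            return decomposition
        decomposition[k] = comps
        k += 1


if __name__ == "__main__":
    # A 4-clique {0,1,2,3} and a triangle {4,5,6} joined by the bridge 3-4.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (3, 4), (4, 5), (4, 6), (5, 6)]
    for k, comps in naive_ecc_decomposition(build_adj(edges)).items():
        print(k, sorted(sorted(c) for c in comps))
```

On this toy graph the whole graph is the single 1-\(\mathsf{ECC}\), the clique and the triangle are the two 2-\(\mathsf{ECC}\)s, and only the clique survives as a 3-\(\mathsf{ECC}\). Each k value restarts from the full graph held in memory, which illustrates the two costs the abstract identifies: memory consumption and redundant work across k values.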

Notes

  1. http://newsroom.fb.com/company-info.

  2. http://law.di.unimi.it/datasets.php.

Acknowledgements

Lu Qin is supported by ARC DE140100999 and ARC DP160101513. Xuemin Lin is supported by NSFC 61232006, ARC DP140103578 and ARC DP150102728. Lijun Chang is supported by ARC DE150100563 and ARC DP160101513. Wenjie Zhang is supported by ARC DP150103071 and ARC DP150102728.

Author information

Corresponding author

Correspondence to Lu Qin.

About this article

Cite this article

Yuan, L., Qin, L., Lin, X. et al. I/O efficient ECC graph decomposition via graph reduction. The VLDB Journal 26, 275–300 (2017). https://doi.org/10.1007/s00778-016-0451-4

  • DOI: https://doi.org/10.1007/s00778-016-0451-4
