Computing weight constraint reachability in large networks

The VLDB Journal

Abstract

Reachability is a fundamental problem on the large-scale networks that now arise in many application domains, such as social, communication, biological, and road networks, and it has been studied extensively. However, little existing work addresses reachability under realistic constraints on graphs with real-valued edge or node weights. Such weights are common in real-world networks: for example, the bandwidth of a link in a communication network, the reliability of an interaction between two proteins in a protein-protein interaction (PPI) network, and the handling capacity of a warehouse or storage point in a distribution network. In this paper, we formalize a new and important reachability query on weighted undirected graphs, the weight constraint reachability (WCR) query, which asks: is there a path between nodes \(a\) and \(b\) on which every real-valued edge (or node) weight satisfies a range constraint? We discover an interesting property of WCR and, based on it, design a novel edge-based index structure that answers the WCR query in \(O(1)\) time. Furthermore, we consider the case in which the index cannot fit entirely in memory, which can be very common for emerging massive networks. We propose an I/O-efficient index that provides constant I/O query time (precisely four I/Os) with an \(O(|V|\log |V|)\) disk-based index size. Extensive experimental studies on both real and synthetic datasets demonstrate the efficiency and scalability of our solutions in answering the WCR query.
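
To make the query semantics concrete, the following is a minimal index-free baseline, not the index proposed in the paper: a breadth-first search that only follows edges whose weight lies in the range \([x, y]\). The names `build_adj` and `wcr_query`, the adjacency-list layout, and the toy graph are assumptions introduced for illustration. Each query costs time linear in the graph size, which is the cost the paper's \(O(1)\) index is designed to avoid.

```python
from collections import deque


def build_adj(edges):
    """Build an undirected adjacency list from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def wcr_query(adj, a, b, lo, hi):
    """Return True if some a-b path uses only edges with lo <= weight <= hi."""
    if a == b:
        return True
    seen = {a}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v, w in adj.get(u, ()):
            # Skip edges that violate the range constraint.
            if lo <= w <= hi and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                queue.append(v)
    return False


# Hypothetical toy graph: two a-b routes with different weight profiles.
edges = [("a", "c", 5.0), ("c", "b", 2.0), ("a", "d", 3.0), ("d", "b", 3.5)]
adj = build_adj(edges)

print(wcr_query(adj, "a", "b", 3.0, 4.0))  # route a-d-b satisfies [3, 4]
print(wcr_query(adj, "a", "b", 4.0, 6.0))  # no route stays inside [4, 6]
```

The first call succeeds through `a-d-b` (weights 3.0 and 3.5), while the second fails because every route contains an edge below 4.0.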

Notes

  1. The \(O(|\Sigma ||V|)\) space complexity is for handling the general bounded interval constraint \([x, y],\,x, y\in \mathbb R \). For the half-bounded constraint \(\ge \!x\) or \(\le \!y\), the complexity is \(O(|V|)\).

  2. http://socialnetworks.mpi-sws.org/data-wosn2009.html.

  3. http://www.dis.uniroma1.it/~challenge9/download.shtml.
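
As an illustration of why linear space can suffice for the half-bounded constraint \(\ge \!x\) mentioned in Note 1, the sketch below uses a standard fact about maximum spanning forests rather than the paper's own index: two nodes are connected using only edges of weight \(\ge \!x\) exactly when they are connected in a maximum spanning forest restricted to edges of weight \(\ge \!x\). The forest has fewer than \(|V|\) edges. The function names and the toy graph are assumptions, and the query here walks the forest in \(O(|V|)\) time instead of the constant time reported in the abstract.

```python
def max_spanning_forest(nodes, edges):
    """Kruskal's algorithm on edges sorted by decreasing weight."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = {v: [] for v in nodes}
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest[u].append((v, w))
            forest[v].append((u, w))
    return forest


def reachable_at_least(forest, a, b, x):
    """Return True if a and b are joined by forest edges of weight >= x."""
    if a == b:
        return True
    seen = {a}
    stack = [a]
    while stack:
        u = stack.pop()
        for v, w in forest[u]:
            if w >= x and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                stack.append(v)
    return False


# Hypothetical toy graph with four nodes and four weighted edges.
nodes = ["a", "b", "c", "d"]
edges = [("a", "c", 5.0), ("c", "b", 2.0), ("a", "d", 3.0), ("d", "b", 3.5)]
forest = max_spanning_forest(nodes, edges)

print(reachable_at_least(forest, "a", "b", 3.0))
print(reachable_at_least(forest, "a", "b", 3.2))
```

On this graph the forest keeps the edges of weight 5.0, 3.5, and 3.0 and drops the edge of weight 2.0, so the stored structure has three edges for four nodes.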

Acknowledgments

This work is supported by the Hong Kong Research Grants Council (RGC) General Research Fund (GRF) Project No. CUHK 411211, 411310 and 419109, and by NSF through grants IIS-0905215, IIS-0914934, CNS-1115234, DBI-0960443 and OISE-1129076, Google Mobile 2014 Program, and KAU grant.

Author information

Corresponding author

Correspondence to Hong Cheng.

About this article

Cite this article

Qiao, M., Cheng, H., Qin, L. et al. Computing weight constraint reachability in large networks. The VLDB Journal 22, 275–294 (2013). https://doi.org/10.1007/s00778-012-0288-4

  • DOI: https://doi.org/10.1007/s00778-012-0288-4
