Abstract
We present a sampling algorithm for estimating the local clustering coefficient of each vertex of a graph. Let G be a graph with n vertices, m edges, and maximum degree \(\varDelta \). Given G and fixed constants \(0< \varepsilon , \delta , p < 1\), the algorithm outputs, for every vertex v of G, an estimate of the local clustering coefficient of v that is within \(\varepsilon \) error with probability at least \(1 - \delta \), provided that the (exact) local clustering coefficient of v is not “too small.” We use VC dimension theory to bound the number of edges that the algorithm needs to sample. We show that the algorithm runs in time \(\mathcal {O}(\varDelta \lg \varDelta + m)\). We also show that the running time can drop to sublinear if we restrict G to certain well-known graph classes. In particular, for planar graphs the algorithm runs in time \(\mathcal {O}(\varDelta )\). For bounded-degree graphs the running time is \(\mathcal {O}(1)\) if a bound on \(\varDelta \) is given as part of the input, and \(\mathcal {O}(n)\) otherwise.
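To make the idea of edge sampling concrete, the following is a minimal Python sketch, not the algorithm analysed in the paper. It assumes a simple undirected graph given as a list of distinct edges, and it takes the number of samples as a plain parameter (here called num_samples) rather than deriving it from \(\varepsilon \), \(\delta \), and a VC dimension bound. The function names are hypothetical. The estimator relies on the fact that the number of triangles containing v equals the number of edges whose two endpoints are both neighbours of v, so each sampled edge credits every common neighbour of its endpoints.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Adjacency sets for a simple undirected graph (assumed: no loops, no duplicate edges)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def exact_local_clustering(adj):
    """Exact local clustering coefficient: links among neighbours / possible neighbour pairs."""
    result = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            result[v] = 0.0  # convention: vertices of degree < 2 get coefficient 0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        result[v] = links / (d * (d - 1) / 2)
    return result


def estimate_local_clustering(edges, num_samples, rng=None):
    """Estimate every local clustering coefficient from uniformly sampled edges.

    Illustrative sketch only: num_samples is chosen by the caller, not by a
    sample-complexity bound. Estimates are unbiased but may exceed 1.
    """
    rng = rng or random.Random()
    adj = build_adjacency(edges)
    m = len(edges)
    hits = dict.fromkeys(adj, 0)
    for _ in range(num_samples):
        a, b = edges[rng.randrange(m)]  # uniform sample with replacement
        for v in adj[a] & adj[b]:       # v closes a triangle with edge (a, b)
            hits[v] += 1
    estimates = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            estimates[v] = 0.0
            continue
        triangles = (m / num_samples) * hits[v]  # scaled-up triangle count at v
        estimates[v] = triangles / (d * (d - 1) / 2)
    return estimates
```

Calling estimate_local_clustering(edges, 20000, random.Random(0)) on a small graph and comparing against exact_local_clustering(build_adjacency(edges)) gives a quick sanity check of how the error shrinks as the number of samples grows.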
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
de Lima, A.M., da Silva, M.V.G., Vignatti, A.L. (2022). Estimating the Clustering Coefficient Using Sample Complexity Analysis. In: Castañeda, A., Rodríguez-Henríquez, F. (eds) LATIN 2022: Theoretical Informatics. LATIN 2022. Lecture Notes in Computer Science, vol 13568. Springer, Cham. https://doi.org/10.1007/978-3-031-20624-5_20
DOI: https://doi.org/10.1007/978-3-031-20624-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20623-8
Online ISBN: 978-3-031-20624-5
eBook Packages: Computer Science, Computer Science (R0)