ABSTRACT
This paper investigates the network completion problem, in which only a small sample of a network (e.g., a completely or partially observed subgraph of a social graph) is observed and the goal is to infer the unobserved part of the network. We assume that, besides the observed subgraph, side information about the nodes, such as the pairwise similarity between them, is also provided. In the original network completion problem, standard methods such as matrix completion are inapplicable due to the non-uniform sampling of the observed links; we show that by effectively exploiting the side information, it is nevertheless possible to accurately predict the unobserved links. In contrast to existing matrix completion methods with side information, such as shared subspace learning and matrix completion with transduction, the proposed algorithm decouples completion from transduction in order to exploit the similarity information effectively. This crucial difference greatly boosts performance when appropriate similarity information is used. The recovery error of the proposed algorithm is analyzed theoretically in terms of the richness of the similarity information and the size of the observed submatrix. To the best of our knowledge, this is the first algorithm with provable guarantees that addresses network completion with node similarity. Experiments on synthetic networks and on real networks from Facebook and Google+ show that the proposed two-stage method accurately reconstructs the network and outperforms other methods.
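The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what "decoupling completion from transduction" could look like, not the authors' implementation. It assumes that (a) the completion stage has already produced a fully filled m x m block `O` for the observed nodes, (b) the observed nodes are indexed first, and (c) the full adjacency matrix lies in the span of the top eigenvectors of the similarity matrix `S`. The function name `transduce_from_submatrix` and all variable names are hypothetical.

```python
import numpy as np

def transduce_from_submatrix(O, S, rank):
    """Extend a completed m x m block O to all n nodes (assumed sketch).

    O    : completed adjacency block for the first m nodes
    S    : n x n symmetric similarity matrix over all nodes
    rank : number of leading eigenvectors of S to use
    """
    m = O.shape[0]
    # Leading eigenvectors of S form the subspace shared with the network.
    eigvals, eigvecs = np.linalg.eigh(S)
    top = np.argsort(eigvals)[::-1][:rank]
    U_s = eigvecs[:, top]            # n x rank
    U_hat = U_s[:m, :]               # rows belonging to the observed nodes
    # Least-squares fit of the small core matrix Z in O ~ U_hat Z U_hat^T.
    P = np.linalg.pinv(U_hat)        # rank x m
    Z = P @ O @ P.T
    # Transduction: apply the same core to every node's subspace coordinates.
    return U_s @ Z @ U_s.T

# Synthetic, noise-free check under the assumptions stated above.
rng = np.random.default_rng(0)
n, m, r = 60, 20, 3
U, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal basis
core = rng.standard_normal((r, r))
A = U @ (core + core.T) @ U.T                      # low-rank symmetric "network"
S = U @ np.diag([3.0, 2.0, 1.0]) @ U.T             # similarity sharing A's subspace
O = A[:m, :m]                                      # stage-1 output (assumed given)
A_hat = transduce_from_submatrix(O, S, r)
max_err = np.max(np.abs(A_hat - A))
```

In this idealized setting the reconstruction matches `A` up to floating-point error; with noisy similarity information or an imperfectly completed block, the error would depend on how well the two subspaces align, which is the kind of dependence the abstract's recovery analysis refers to.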
Network Completion with Node Similarity: A Matrix Completion Approach with Provable Guarantees