Abstract
The effectiveness of current Multi-view Clustering (MVC) methods depends heavily on the assumption that the data are complete. In practice, data collection and transmission often violate this assumption, giving rise to the Partially Data-missing Problem (PDP). A common remedy is to impute the missing values first and then apply an MVC method, but inaccurate imputation can degrade clustering performance. To address this issue, we introduce an imputation-free framework based on matrix correction, organized as a novel two-stage strategy termed 'correction-clustering'. In the first stage, we correct the distance matrices estimated from incomplete data and compute affinity matrices from them. In the second stage, we feed these affinity matrices into affinity-based MVC methods. This approach avoids the uncertainty introduced by inaccurate imputation and thereby improves clustering performance. Comprehensive experiments show that our method outperforms traditional imputation-based techniques across various levels of missing data.
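The two-stage idea can be illustrated with a minimal NumPy sketch. It is not the correction algorithm of the paper: it assumes squared Euclidean distances estimated from co-observed features, swaps in a simple classical-MDS-style eigenvalue clipping as the correction step, and fuses views by plain averaging of Gaussian affinities. All function names here are hypothetical.

```python
import numpy as np

def partial_sq_distances(X):
    """Estimate pairwise squared Euclidean distances from data containing NaNs.

    Each pair uses only the features observed in both samples; the partial
    sum is rescaled by d / (number of shared features).
    """
    mask = ~np.isnan(X)                       # True where a value is observed
    Xz = np.where(mask, X, 0.0)               # zero-fill so missing entries drop out
    M = mask.astype(float)
    shared = M @ M.T                          # co-observed feature count per pair
    # Sum over shared features of (x_ik - x_jk)^2, expanded term by term.
    sq = (Xz ** 2) @ M.T + M @ (Xz ** 2).T - 2.0 * Xz @ Xz.T
    d = X.shape[1]
    with np.errstate(divide="ignore", invalid="ignore"):
        D = np.where(shared > 0, sq * d / shared, np.nan)
    # Assumption: pairs with no shared feature fall back to the mean estimate.
    D = np.where(np.isnan(D), np.nanmean(D), D)
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)

def correct_sq_distances(D):
    """Stand-in correction: make D a valid squared Euclidean distance matrix.

    The Gram matrix implied by D is computed by double centering, its negative
    eigenvalues are clipped to zero, and distances are rebuilt from the result.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ D @ J
    B = (B + B.T) / 2.0                       # remove rounding asymmetry
    w, V = np.linalg.eigh(B)
    B_psd = (V * np.clip(w, 0.0, None)) @ V.T
    g = np.diag(B_psd)
    D_corr = g[:, None] + g[None, :] - 2.0 * B_psd
    np.fill_diagonal(D_corr, 0.0)
    return np.maximum(D_corr, 0.0)

def gaussian_affinity(D):
    """Gaussian affinity with bandwidth set to the median off-diagonal distance."""
    off_diag = D[~np.eye(D.shape[0], dtype=bool)]
    sigma2 = np.median(off_diag)
    if sigma2 <= 0:
        sigma2 = 1.0
    return np.exp(-D / (2.0 * sigma2))

def fused_affinity(views):
    """Average the corrected per-view affinities (a simple fusion choice)."""
    return np.mean(
        [gaussian_affinity(correct_sq_distances(partial_sq_distances(X)))
         for X in views],
        axis=0,
    )
```

The fused matrix could then be handed to any affinity-based clustering routine, for example spectral clustering with a precomputed affinity. No missing entry is ever filled in: incompleteness is handled entirely at the level of the distance matrix.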
Acknowledgements
We thank the anonymous reviewers for their helpful feedback, which greatly improved this paper. The work of Fangchen Yu and Wenye Li was supported in part by Guangdong Basic and Applied Basic Research Foundation (2021A1515011825), Guangdong Introducing Innovative and Entrepreneurial Teams Fund (2017ZT07X152), Shenzhen Science and Technology Program (CUHKSZWDZC0004), and Shenzhen Research Institute of Big Data Scholarship Program. The work of Yuqi Ma was supported in part by CUHKSZ-SRIBD Joint PhD Program. The work of Jianfeng Mao was supported in part by National Natural Science Foundation of China under grant U1733102, in part by the Guangdong Provincial Key Laboratory of Big Data Computing, The Chinese University of Hong Kong, Shenzhen under grant B10120210117, and in part by CUHK-Shenzhen under grant PF.01.000404.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, F., Shi, Z., Ma, Y., Mao, J., Li, W. (2024). From Incompleteness to Unity: A Framework for Multi-view Clustering with Missing Values. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1965. Springer, Singapore. https://doi.org/10.1007/978-981-99-8145-8_9
DOI: https://doi.org/10.1007/978-981-99-8145-8_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8144-1
Online ISBN: 978-981-99-8145-8
eBook Packages: Computer Science, Computer Science (R0)