Abstract
Density peak clustering (DPC) is a recently developed density-based clustering algorithm that achieves competitive performance in a non-iterative manner. DPC effectively handles clusters with a single density peak (single center): under DPC’s hypothesis, one and only one data point is chosen as the center of each cluster. However, DPC may fail to identify clusters with multiple density peaks (multi-centers), and it may miss natural clusters whose centers have relatively low local density. To address these limitations, we propose a novel hierarchical clustering algorithm named multi-center density peak clustering (McDPC). Firstly, based on the widely adopted hypothesis that potential cluster centers are relatively far away from each other, McDPC obtains the centers of the initial micro-clusters (named representative data points), i.e., the data points whose minimum distance to any higher-density data point is relatively large. Secondly, the representative data points are autonomously categorized into different density levels. Finally, McDPC processes the micro-clusters at each level and, if necessary, merges the micro-clusters at a specific level into one cluster to identify multi-center clusters. To evaluate the effectiveness of the proposed McDPC algorithm, we conduct experiments on both synthetic and real-world datasets and benchmark McDPC against other state-of-the-art clustering algorithms. We also apply McDPC to image segmentation and facial recognition to further demonstrate its capability in real-world applications. The experimental results show that our method achieves promising performance.
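The two quantities that the abstract relies on, local density and the minimum distance to a higher-density data point, can be illustrated with a minimal Python sketch of the standard DPC measures. This is not the McDPC procedure itself: the function name `dpc_rho_delta`, the cutoff parameter `d_c`, and the choice of a simple cutoff-kernel density are assumptions made here for illustration only.

```python
import numpy as np


def dpc_rho_delta(points, d_c):
    """Return (rho, delta) per point for a hypothetical cutoff distance d_c.

    rho[i]   : number of other points closer than d_c (cutoff-kernel density,
               an assumed choice of density estimate).
    delta[i] : minimum distance from point i to any point of strictly higher
               density; for a point with no denser neighbour, the maximum
               distance to any other point is used instead.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Count neighbours within d_c; subtract 1 to exclude the point itself.
    rho = (dist < d_c).sum(axis=1) - 1

    n = len(points)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]  # points that are strictly denser than i
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # Global density maximum (ties all land here in this sketch).
            delta[i] = dist[i].max()
    return rho, delta
```

Points that combine a large `delta` with a comparatively high `rho` are the candidates the abstract calls representative data points; how McDPC then groups them into density levels and merges micro-clusters is described in the paper and is not reproduced in this sketch.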
Acknowledgements
This research is supported by the National Natural Science Foundation of China (61772227, 61572227), the Science and Technology Development Foundation of Jilin Province (20180201045GX) and the Social Science Foundation of Education Department of Jilin Province (JJKH20181315SK). This research is also supported, in part, by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-GC-2019-003), the Singapore Ministry of Health under its National Innovation Challenge on Active and Confident Ageing (NIC Project No. MOH/NIC/COG04/2017), and the Joint NTU-WeBank Research Centre on FinTech, Nanyang Technological University, Singapore.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Wang, Y., Wang, D., Zhang, X. et al. McDPC: multi-center density peak clustering. Neural Comput & Applic 32, 13465–13478 (2020). https://doi.org/10.1007/s00521-020-04754-5