McDPC: multi-center density peak clustering

  • Original Article
  • Neural Computing and Applications

Abstract

Density peak clustering (DPC) is a recently developed density-based clustering algorithm that achieves competitive performance in a non-iterative manner. DPC effectively handles clusters with a single density peak (single center): under DPC’s hypothesis, one and only one data point is chosen as the center of each cluster. However, DPC may fail to identify clusters with multiple density peaks (multi-centers), and it may miss natural clusters whose centers have relatively low local density. To address these limitations, we propose a novel clustering algorithm based on a hierarchical approach, named multi-center density peak clustering (McDPC). Firstly, based on the widely adopted hypothesis that potential cluster centers are relatively far away from each other, McDPC obtains the centers of the initial micro-clusters (named representative data points), i.e., the data points whose minimum distance to any higher-density data point is relatively large. Secondly, the representative data points are autonomously categorized into different density levels. Finally, McDPC processes the micro-clusters at each level and, if necessary, merges the micro-clusters at a specific level into one cluster to identify multi-center clusters. To evaluate the effectiveness of the proposed McDPC algorithm, we conduct experiments on both synthetic and real-world datasets and benchmark McDPC against other state-of-the-art clustering algorithms. We also apply McDPC to image segmentation and facial recognition to further demonstrate its capability in real-world applications. The experimental results show that our method achieves promising performance.
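The first step described in the abstract builds on the two per-point quantities used by standard DPC: a local density and the minimum distance to any point of higher density. The following is a minimal sketch of those quantities only, assuming a cutoff-kernel density and Euclidean distance. The function names and the parameters `d_c` and `delta_threshold` are illustrative assumptions rather than names taken from the article, and the level categorization and merging steps of McDPC are not shown because the abstract does not specify them.

```python
import numpy as np


def dpc_rho_delta(points, d_c):
    """Compute the standard DPC quantities for every point.

    rho[i]   : number of other points closer than the cutoff d_c (assumed kernel).
    delta[i] : minimum distance from point i to any point with higher rho;
               for a point of maximal rho, the maximum distance to any point.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Cutoff-kernel density; subtract 1 so a point does not count itself.
    rho = (dist < d_c).sum(axis=1) - 1

    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: use the largest distance by convention.
            delta[i] = dist[i].max()
    return rho, delta


def representative_points(delta, delta_threshold):
    """Indices of points whose delta exceeds a hypothetical threshold.

    This illustrates "relatively large minimum distance" in the simplest
    possible way; how the article chooses the threshold is not stated here.
    """
    return np.flatnonzero(np.asarray(delta) > delta_threshold)
```

For example, calling `dpc_rho_delta(points, d_c=0.12)` on two well-separated groups of points and then `representative_points(delta, 1.0)` should return one index per group, because only the densest point of each group is far from every denser point.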

Notes

  1. URL: https://github.com/mlyizhang/Multi-center-DPC.git.

  2. URL: http://archive.ics.uci.edu/ml/index.php.

  3. URL: http://cs.uef.fi/sipu/images/.

  4. URL: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/.

  5. URL: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.

Acknowledgements

This research is supported by the National Natural Science Foundation of China (61772227, 61572227), the Science and Technology Development Foundation of Jilin Province (20180201045GX) and the Social Science Foundation of Education Department of Jilin Province (JJKH20181315SK). This research is also supported, in part, by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-GC-2019-003), the Singapore Ministry of Health under its National Innovation Challenge on Active and Confident Ageing (NIC Project No. MOH/NIC/COG04/2017), and the Joint NTU-WeBank Research Centre on FinTech, Nanyang Technological University, Singapore.

Author information

Corresponding author

Correspondence to You Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Wang, D., Zhang, X. et al. McDPC: multi-center density peak clustering. Neural Comput & Applic 32, 13465–13478 (2020). https://doi.org/10.1007/s00521-020-04754-5

  • DOI: https://doi.org/10.1007/s00521-020-04754-5
