Chameleon based on clustering feature tree and its application in customer segmentation

Li, Jinfeng; Wang, Kanliang; Xu, Lida

doi:10.1007/s10479-008-0368-4

Published: 12 June 2008

Annals of Operations Research, Volume 168, pages 225–245 (2009)

Jinfeng Li¹, Kanliang Wang¹ & Lida Xu¹,²,³

Abstract

Clustering analysis plays an important role in the field of data mining, and hierarchical clustering has become one of the most widely used clustering techniques. However, most hierarchical clustering algorithms cannot deliver high execution efficiency and high clustering accuracy at the same time. After analyzing the advantages and disadvantages of hierarchical algorithms, the paper puts forward a two-stage clustering algorithm, named Chameleon Based on Clustering Feature Tree (CBCFT), which hybridizes the clustering feature tree of the BIRCH algorithm with the CHAMELEON algorithm. By calculating the time complexity of CBCFT, the paper argues that its running time increases linearly with the number of data points. Experiments on a sample data set demonstrate that CBCFT is able to identify clusters with large variance in size and shape and is robust to outliers. Moreover, the results of CBCFT are similar to those of CHAMELEON, but CBCFT overcomes CHAMELEON's low execution efficiency. Although the execution time of CBCFT is longer than that of BIRCH, its clustering results are much more satisfactory than those of BIRCH. Finally, through a case study of customer segmentation at the Hubei branch of Chinese Petroleum Corp., the paper demonstrates that the clustering result is meaningful and useful.
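
The abstract does not spell out how CBCFT builds its first stage, so the following is only a minimal sketch of the standard clustering feature (CF) summary that BIRCH's tree is built from: a triple of point count, per-dimension linear sum, and sum of squared norms. The class name `CF` and its method names are hypothetical and are not taken from the paper. The sketch shows why such summaries are cheap to maintain: two CFs merge by simple addition, and a centroid and radius can be recovered without revisiting the raw points.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class CF:
    """Clustering feature (N, LS, SS) of a set of points (hypothetical helper)."""
    n: int        # number of points summarized
    ls: tuple     # linear sum of the points, one entry per dimension
    ss: float     # sum of squared Euclidean norms of the points

    @staticmethod
    def of_point(p):
        # A single point is summarized by itself.
        return CF(1, tuple(p), sum(x * x for x in p))

    def merge(self, other):
        # CF additivity: the summary of a union is the component-wise sum.
        return CF(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.ls, other.ls)),
            self.ss + other.ss,
        )

    def centroid(self):
        return tuple(x / self.n for x in self.ls)

    def radius(self):
        # Root-mean-square distance of the points from the centroid:
        # sqrt(SS/N - ||LS/N||^2), clamped at 0 against rounding error.
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))


# Usage: summarize two points and inspect the merged sub-cluster.
a = CF.of_point((0.0, 0.0))
b = CF.of_point((2.0, 0.0))
ab = a.merge(b)
print(ab.n, ab.centroid(), ab.radius())
```

In a two-stage scheme of the kind the abstract describes, compact summaries like these could stand in for the raw points when a more expensive algorithm such as CHAMELEON is run; how CBCFT actually connects the two stages is described in the full paper, not here.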

Author information

Authors and Affiliations

The School of Management, Xi’an Jiaotong University, Xi’an, 710049, China
Jinfeng Li, Kanliang Wang & Lida Xu
Department of Information Technology and Decision Science, Old Dominion University, Norfolk, VA, 23529, USA
Lida Xu
College of Economics and Management, Beijing Jiaotong University, Beijing, 100044, China
Lida Xu

Corresponding author

Correspondence to Kanliang Wang.

Additional information

This research was partially supported by the National Natural Science Foundation of China (grants #70372049 and #70121001).

About this article

Cite this article

Li, J., Wang, K. & Xu, L. Chameleon based on clustering feature tree and its application in customer segmentation. Ann Oper Res 168, 225–245 (2009). https://doi.org/10.1007/s10479-008-0368-4

Issue Date: April 2009
