An Extension of Multiple Correspondence Analysis for Identifying Heterogeneous Subgroups of Respondents

Psychometrika

Abstract

An extension of multiple correspondence analysis is proposed that takes into account cluster-level heterogeneity in respondents’ preferences/choices. The method involves combining multiple correspondence analysis and k-means in a unified framework. The former is used for uncovering a low-dimensional space of multivariate categorical variables while the latter is used for identifying relatively homogeneous clusters of respondents. The proposed method offers an integrated graphical display that provides information on cluster-based structures inherent in multivariate categorical data as well as the interdependencies among the data. An empirical application is presented which demonstrates the usefulness of the proposed method and how it compares to several extant approaches.
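To make the two ingredients concrete, the following is a minimal sketch of the simpler sequential ("tandem") variant: multiple correspondence analysis is run first, and k-means is then applied to the resulting low-dimensional respondent scores. This is not the unified method proposed in the article, which combines both steps in a single framework; it only illustrates the building blocks. All function names, the farthest-point initialization, and the toy data are assumptions made for illustration.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot encode each column of an (n, p) array of category codes."""
    blocks = []
    for j in range(data.shape[1]):
        cats = np.unique(data[:, j])
        blocks.append((data[:, [j]] == cats[None, :]).astype(float))
    return np.hstack(blocks)

def mca_row_scores(Z, n_dims=2):
    """Respondent principal coordinates from correspondence analysis of Z."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]

def kmeans(X, k, n_iter=100):
    """Lloyd-style k-means with a deterministic farthest-point start
    (an illustrative choice, not taken from the article)."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - ctr) ** 2).sum(axis=1) for ctr in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def tandem_mca_kmeans(data, n_clusters, n_dims=2):
    """Step 1: MCA scores. Step 2: k-means on those scores."""
    scores = mca_row_scores(indicator_matrix(data), n_dims)
    return kmeans(scores, n_clusters)

# Hypothetical toy data: 6 respondents answering 3 binary items.
toy = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
])
labels, centers = tandem_mca_kmeans(toy, n_clusters=2, n_dims=2)
print(labels)
```

In the tandem variant the low-dimensional space is chosen without regard to the cluster structure, so the retained dimensions need not be the ones that best separate the clusters. A unified criterion, as described in the abstract, estimates the space and the cluster memberships together.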

Author information

Corresponding author

Correspondence to Heungsun Hwang.

Additional information

The work reported in this paper was supported by Grant 290439 and Grant A6394 from the Natural Sciences and Engineering Research Council of Canada to the first and third authors, respectively. We wish to thank Ulf Böckenholt, Paul Green, and Marc Tomiuk for their insightful comments on an earlier version of this paper. We also wish to thank Byunghwa Yang for generously providing us with his data.

About this article

Cite this article

Hwang, H., Dillon, W.R. & Takane, Y. An Extension of Multiple Correspondence Analysis for Identifying Heterogeneous Subgroups of Respondents. Psychometrika 71, 161–171 (2006). https://doi.org/10.1007/s11336-004-1173-x

  • DOI: https://doi.org/10.1007/s11336-004-1173-x
