Abstract
An extension of multiple correspondence analysis is proposed that takes into account cluster-level heterogeneity in respondents’ preferences/choices. The method involves combining multiple correspondence analysis and k-means in a unified framework. The former is used for uncovering a low-dimensional space of multivariate categorical variables while the latter is used for identifying relatively homogeneous clusters of respondents. The proposed method offers an integrated graphical display that provides information on cluster-based structures inherent in multivariate categorical data as well as the interdependencies among the data. An empirical application is presented which demonstrates the usefulness of the proposed method and how it compares to several extant approaches.
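To make the two ingredients concrete, the sketch below runs them one after the other on toy data: multiple correspondence analysis of an indicator matrix, then k-means on the respondent scores. This sequential ("tandem") pipeline is an assumption made purely for illustration. It is not the proposed method, which combines the two in a unified framework whose criterion is not stated in this abstract. The function names, the toy responses, and the choice of two dimensions and two clusters are all hypothetical.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot code each categorical column and stack the blocks side by side."""
    blocks = []
    for j in range(data.shape[1]):
        categories = np.unique(data[:, j])
        blocks.append((data[:, [j]] == categories).astype(float))
    return np.hstack(blocks)

def mca_scores(Z, n_dims=2):
    """Respondent principal coordinates from correspondence analysis of Z."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-type k-means; may stop in a local optimum."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy data: 8 respondents, 4 binary categorical items.
answers = np.array([
    [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0],
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],
])
Z = indicator_matrix(answers)            # 8 x 8 indicator matrix
scores = mca_scores(Z, n_dims=2)         # low-dimensional respondent scores
labels, centers = kmeans(scores, k=2)    # cluster memberships in that space
print(labels)
```

Because the dimension reduction here ignores the cluster structure, the retained dimensions need not be the ones that best separate the clusters. A joint approach of the kind described in the abstract is meant to address that limitation.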
Additional information
The work reported in this paper was supported by Grant 290439 and Grant A6394 from the Natural Sciences and Engineering Research Council of Canada to the first and third authors, respectively. We wish to thank Ulf Böckenholt, Paul Green, and Marc Tomiuk for their insightful comments on an earlier version of this paper. We also wish to thank Byunghwa Yang for generously providing us with his data.
Cite this article
Hwang, H., Dillon, W.R. & Takane, Y. An Extension of Multiple Correspondence Analysis for Identifying Heterogeneous Subgroups of Respondents. Psychometrika 71, 161–171 (2006). https://doi.org/10.1007/s11336-004-1173-x