Abstract
A clustering that consists of a nested set of clusters may be represented graphically by a tree. In contrast, a clustering that includes non-nested overlapping clusters (sometimes termed a “nonhierarchical” clustering) cannot be represented by a tree. Graphical representations of such non-nested overlapping clusterings are usually complex and difficult to interpret. Carroll and Pruzansky (1975, 1980) suggested representing non-nested clusterings with multiple ultrametric or additive trees. Corter and Tversky (1986) introduced the extended tree (EXTREE) model, which represents a non-nested structure as a tree plus overlapping clusters that are represented by marked segments in the tree. We show here that the problem of finding a nested (i.e., tree-structured) set of clusters in an overlapping clustering can be reformulated as the problem of finding a clique in a graph. Thus, clique-finding algorithms can be used to identify sets of clusters in the solution that can be represented by trees. This formulation provides a means of automatically constructing a multiple tree or extended tree representation of any non-nested clustering. The method, called “clustrees”, is applied to several non-nested overlapping clusterings derived using the MAPCLUS program (Arabie and Carroll 1980).
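The clique reformulation can be illustrated with a minimal sketch. This is not the authors' clustrees implementation. It assumes one natural reading of the abstract: each cluster is a vertex, and two vertices are joined when their clusters are either disjoint or nested, so that any clique is a set of clusters a single tree can represent. The function names, the example clusters, and the use of a basic Bron-Kerbosch enumeration without pivoting are all illustrative assumptions.

```python
from itertools import combinations


def compatible(a, b):
    """Assumed edge rule: two clusters fit in one tree if disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def compatibility_graph(clusters):
    """Adjacency sets over cluster indices (hypothetical helper)."""
    adj = {i: set() for i in range(len(clusters))}
    for i, j in combinations(range(len(clusters)), 2):
        if compatible(clusters[i], clusters[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices
        if not p and not x:
            found.append(sorted(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(found)


# Made-up overlapping clustering: clusters 0 and 2 overlap without nesting.
clusters = [
    {"a", "b"},
    {"a", "b", "c"},
    {"b", "c"},
    {"d", "e"},
    {"a", "b", "c", "d", "e"},
]
adj = compatibility_graph(clusters)
trees = maximal_cliques(adj)
print(trees)
```

In this toy example the only incompatible pair is clusters 0 and 2, so each maximal clique keeps one of them along with the other three clusters. Each clique is a candidate tree, and a multiple-tree representation would cover all clusters with as few such cliques as possible.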
References
ARABIE, P., and CARROLL, J. D. (1980), “MAPCLUS: A Mathematical Programming Approach to Fitting the ADCLUS Model,” Psychometrika, 45, 211–235.
ARABIE, P., CARROLL, J. D., and DESARBO, W. S. (1987), Three-Way Scaling and Clustering, Newbury Park: Sage.
BALAS, E., and YU, C. S. (1986), “Finding the Maximum Clique in an Arbitrary Graph,” SIAM Journal on Computing, 15, 1054–1068.
BRON, C., and KERBOSCH, J. (1973), “Finding All Cliques of an Undirected Graph,” Communications of the Association for Computing Machinery, 16, 575–577.
CARROLL, J. D. (1976), “Spatial, Non-Spatial, and Hybrid Models for Scaling,” Psychometrika, 41, 439–463.
CARROLL, J. D., and ARABIE, P. (1983), “INDCLUS: An Individual Differences Generalization of the ADCLUS Model and the MAPCLUS Algorithm,” Psychometrika, 48, 157–169.
CARROLL, J. D., and CHANG, J-J. (1973), “A Method for Fitting a Class of Hierarchical Tree Structure Models to Dissimilarities Data and its Application to Some ‘Body Parts’ Data of Miller’s,” Proceedings of the 81st Annual Convention of the American Psychological Association, 8, 1097–1098.
CARROLL, J. D., and PRUZANSKY, S. (1975), “Fitting of Hierarchical Tree Structure (HTS) Models, Mixtures of HTS Models, and Hybrid Models, via Mathematical Programming and Alternating Least Squares,” U.S.-Japan Seminar on Theory, Methods and Applications of Multidimensional Scaling and Related Techniques, San Diego.
CARROLL, J. D., and PRUZANSKY, S. (1980), “Discrete and Hybrid Scaling Models,” in Similarity and Choice, Eds., E.D. Lanterman and H. Feger, Bern: Hans Huber, 108–139.
CORTER, J. E., and CARROLL, J. D. (1993), “Automatic Fitting of Complementary Clusters in MAPCLUS,” Unpublished manuscript, Columbia University.
CORTER, J. E., and TVERSKY, A. (1986), “Extended Similarity Trees,” Psychometrika, 51, 429–451.
CUNNINGHAM, J. P. (1978), “Free Trees and Bidirectional Trees as Representations of Psychological Distance,” Journal of Mathematical Psychology, 17, 165–188.
DAS, S. R., SHENG, C. L., CHEN, Z., and LIN, T. (1979), “Magnitude Ordering of Degree Complements of Certain Node Pairs in an Undirected Graph and an Algorithm to Find a Class of Maximal Subgraphs,” Computers and Electrical Engineering, 6, 139–151.
GERHARDS, L., and LINDENBERG, W. (1979), “Clique Detection for Nondirectional Graphs: Two New Algorithms,” Computing, 21, 295–322.
GODEHARDT, E. (1988), Graphs as Structural Models: The Application of Graphs and Multigraphs in Cluster Analysis, Braunschweig: Vieweg.
GOLUMBIC, M. C. (1980), Algorithmic Graph Theory and Perfect Graphs, New York: Academic.
JOHNSTON, H. C. (1976), “Cliques of a Graph: Variations on the Bron-Kerbosch Algorithm,” International Journal of Computer and Information Science, 5, 109–238.
PARDALOS, P. M., and PHILLIPS, A. T. (1990), “A Global Optimization Approach for Solving the Maximum Clique Problem,” International Journal of Computer Mathematics, 33, 209–216.
ROSENBERG, S., and KIM, M. P. (1975), “The Method of Sorting as a Data-Gathering Procedure in Multivariate Research,” Multivariate Behavioral Research, 10, 489–502.
SATTATH, S., and TVERSKY, A. (1977), “Additive Similarity Trees,” Psychometrika, 42, 319–345.
SATTATH, S., and TVERSKY, A. (1987), “On the Relation Between Common and Distinctive Feature Models,” Psychological Review, 94, 16–22.
SHEPARD, R. N., and ARABIE, P. (1979), “Additive Clustering: Representation of Similarities as Combinations of Discrete Overlapping Properties,” Psychological Review, 86, 87–123.
SMITH, E. E., RIPS, L. J., SHOBEN, E. J., ROSCH, E., and MERVIS, C. B. (1975), Unpublished data.
SNEATH, P. H., and SOKAL, R. R. (1973), Numerical Taxonomy, San Francisco: Freeman.
TSUKIYAMA, S., IDE, M., ARIYOSHI, H., and SHIRAKAWA, I. (1977), “A New Algorithm for Generating All the Maximal Independent Sets,” SIAM Journal on Computing, 6, 505–517.
TVERSKY, A. (1977), “Features of Similarity,” Psychological Review, 84, 327–352.
Cite this article
Carroll, J.D., Corter, J.E. A graph-theoretic method for organizing overlapping clusters into trees, multiple trees, or extended trees. Journal of Classification 12, 283–313 (1995). https://doi.org/10.1007/BF03040859