Abstract
This exploratory study examines whether configurations coupled to an output variable can be found in data. It focuses on modal configurations, which are the profiles of best fit for clusters, and on their average cluster scores for an output variable. A multistage procedure, described in the paper, was applied to a crime dataset to identify the modal configurations for a sample of US cities and towns and their links to the incidence of violent crime. Three coupled configurations were found. The one with the highest rate of violent crime was indicative of an African American Configuration, followed by one indicative of a High Divorce Configuration and one indicative of an Economic Hardship Configuration. The results indicate that the multistage procedure is a feasible way to find modal configurations and their couplings in data. The advantages of this approach are discussed and future directions for the research are outlined.
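The idea of coupling a cluster profile to an output variable can be illustrated with a minimal sketch. It is not the authors' multistage procedure, which the abstract does not spell out. It assumes cluster labels are already available from some earlier stage, uses the per-feature mean as a stand-in for the "profile of best fit", and ranks clusters by their mean outcome. The function names `couple_configurations` and `rank_by_outcome` and the toy numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def couple_configurations(profiles, labels, outcome):
    """Summarise each cluster by a representative profile and its mean outcome.

    profiles: list of equal-length feature vectors, one per case (e.g. a city)
    labels:   cluster label per case (assumed to come from an earlier stage)
    outcome:  output variable per case (e.g. a violent crime rate)
    """
    members_by_cluster = defaultdict(list)
    for row, label, y in zip(profiles, labels, outcome):
        members_by_cluster[label].append((row, y))

    summary = {}
    for label, members in members_by_cluster.items():
        vectors = [row for row, _ in members]
        # Assumption: the per-feature mean stands in for the modal configuration.
        modal_profile = [mean(column) for column in zip(*vectors)]
        summary[label] = {
            "modal_profile": modal_profile,
            "mean_outcome": mean(y for _, y in members),
            "size": len(members),
        }
    return summary


def rank_by_outcome(summary):
    """Return cluster labels ordered from highest to lowest mean outcome."""
    return sorted(summary, key=lambda k: summary[k]["mean_outcome"], reverse=True)


# Made-up toy data: four cases, two features, two clusters.
toy_profiles = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
toy_labels = ["A", "A", "B", "B"]
toy_outcome = [10.0, 20.0, 1.0, 3.0]

toy_summary = couple_configurations(toy_profiles, toy_labels, toy_outcome)
toy_ranking = rank_by_outcome(toy_summary)
```

In this reading, a configuration counts as "coupled" to the output variable when its cluster's mean outcome stands apart from the others. Any test of whether the difference is meaningful is left out of the sketch.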
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Graco, W., Koesmarno, H. (2013). Configurations and Couplings: An Exploratory Study. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2013. Lecture Notes in Computer Science, vol 7987. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39736-3_21
DOI: https://doi.org/10.1007/978-3-642-39736-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39735-6
Online ISBN: 978-3-642-39736-3
eBook Packages: Computer Science, Computer Science (R0)