Using cluster analysis for data mining in educational technology research

  • Research Article
  • Educational Technology Research and Development

Abstract

Cluster analysis is a family of statistical methods with considerable potential for analyzing the vast amounts of web server-log data generated as students learn from hyperlinked information resources. In this methodological paper we introduce cluster analysis for educational technology researchers and illustrate its use through two examples of mining click-stream server-log data that reflect student use of online learning environments. Cluster analysis can help researchers develop learner profiles grounded in activity during a learning session, such as the sequence in which tasks and information are accessed or the time spent on a given activity or resource. The examples illustrate a hierarchical clustering method (Ward’s clustering) and a non-hierarchical clustering method (k-Means clustering), each used to analyze characteristics of learning behavior while learners engage in a problem-solving activity in an online learning environment. The article concludes with a discussion of the advantages and limitations of cluster analysis as a data mining technique in educational technology research.
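To make the two methods named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' procedure or data: the two features per learner (minutes on task, resource pages viewed), the six sample rows, and the function names `ward_clusters` and `k_means` are all assumptions invented for illustration. The k-means routine uses the common batch (Lloyd-style) iteration, and the features are assumed to be on comparable scales already, so no standardization is applied.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                ca = centroid([points[p] for p in a])
                cb = centroid([points[p] for p in b])
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(a) * len(b) / (len(a) + len(b)) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters

def k_means(points, k, seed=0, max_iter=100):
    """Batch (Lloyd-style) k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct rows as starting centers
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = centroid(members)
    return labels

# Hypothetical learner profiles: (minutes on task, resource pages viewed).
learners = [
    (2.0, 3.0), (2.5, 2.0), (3.0, 3.5),        # brief sessions, few resources
    (11.0, 12.0), (12.5, 10.5), (11.5, 13.0),  # long sessions, many resources
]

ward = ward_clusters(learners, 2)
labels = k_means(learners, 2)
print(ward)
print(labels)
```

On data this well separated, both methods should recover the same two groups. The practical difference is that the hierarchical method builds its groups by successive merges and can be cut at any number of clusters afterward, while k-means needs the number of clusters up front and its label numbering depends on the random starting centers.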


Author information

Corresponding author

Correspondence to Pavlo D. Antonenko.

About this article

Cite this article

Antonenko, P.D., Toy, S. & Niederhauser, D.S. Using cluster analysis for data mining in educational technology research. Education Tech Research Dev 60, 383–398 (2012). https://doi.org/10.1007/s11423-012-9235-8
