DOI: 10.1145/2702123.2702509
research-article

ModelTracker: Redesigning Performance Analysis Tools for Machine Learning

Published: 18 April 2015

ABSTRACT

Model building in machine learning is an iterative process. The performance analysis and debugging step typically involves a disruptive cognitive switch from model building to error analysis, discouraging an informed approach to model building. We present ModelTracker, an interactive visualization that subsumes information contained in numerous traditional summary statistics and graphs while displaying example-level performance and enabling direct error examination and debugging. Usage analysis from machine learning practitioners building real models with ModelTracker over six months shows ModelTracker is used often and throughout model building. A controlled experiment focusing on ModelTracker's debugging capabilities shows participants prefer ModelTracker over traditional tools without a loss in model performance.

      Published in

      CHI '15: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems
      April 2015
      4290 pages
      ISBN: 9781450331456
      DOI: 10.1145/2702123

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      CHI '15 Paper Acceptance Rate: 486 of 2,120 submissions, 23%
      Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
