
ML-ModelExplorer: An Explorative Model-Agnostic Approach to Evaluate and Compare Multi-class Classifiers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12279)

Abstract

A major challenge during the development of Machine Learning systems is the large number of models that result from testing different model types, parameters, or feature subsets. The common approach of selecting the best model using one overall metric does not necessarily find the most suitable model for a given application, since it ignores the different effects of class confusions. Expert knowledge is key to evaluating, understanding, and comparing model candidates and hence to controlling the training process. This paper addresses the research question of how experts can be supported in evaluating, selecting, and reasoning about Machine Learning models. It proposes ML-ModelExplorer, an explorative, interactive, and model-agnostic approach utilising confusion matrices. It enables Machine Learning and domain experts to conduct a thorough and efficient evaluation of multiple models by taking overall metrics, per-class errors, and individual class confusions into account. The approach is evaluated in a user study, and a real-world case study from football (soccer) data analytics is presented.
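To illustrate why a single overall metric can hide relevant differences, the following minimal sketch derives the three views named in the abstract (overall accuracy, per-class error, and the largest individual class confusion) from plain confusion matrices. It is not the ML-ModelExplorer implementation; the function `summarize`, the model names, and the matrix values are hypothetical and chosen only for illustration. Rows are assumed to hold the true class, columns the predicted class, and every class is assumed to have at least one sample.

```python
import numpy as np


def summarize(cm):
    """Return (accuracy, per-class error, largest confusion) for one model.

    cm: square confusion matrix, rows = true class, columns = predicted class.
    Assumes every class has at least one sample (no empty rows).
    """
    cm = np.asarray(cm, dtype=float)
    accuracy = np.trace(cm) / cm.sum()
    # Share of each true class that was assigned to any wrong class.
    per_class_error = 1.0 - np.diag(cm) / cm.sum(axis=1)
    # Zero the diagonal so that only misclassifications remain.
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)
    true_cls, pred_cls = np.unravel_index(np.argmax(off_diag), off_diag.shape)
    largest = (int(true_cls), int(pred_cls), int(off_diag[true_cls, pred_cls]))
    return accuracy, per_class_error, largest


# Hypothetical confusion matrices of two models on the same 3-class test set.
models = {
    "model_a": [[50, 0, 0], [0, 40, 10], [0, 10, 40]],
    "model_b": [[45, 5, 0], [3, 44, 3], [2, 2, 46]],
}

results = {name: summarize(cm) for name, cm in models.items()}
for name, (acc, err, conf) in results.items():
    print(name, round(acc, 3), np.round(err, 2), conf)
```

In this made-up example, model_b has the higher overall accuracy, yet model_a never misclassifies class 0. If errors on class 0 are the costly ones in the application, the model with the lower accuracy may be the more suitable choice, which is the kind of trade-off the per-class and per-confusion views are meant to expose.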

ML-ModelExplorer and a tutorial video are available online for use with your own data sets: www.ml-and-vis.org/mex

Notes

  1. Domain experts are assumed to have a basic understanding of classification problems, i.e. they understand class errors and class confusions.

  2. ML-ModelExplorer online: www.ml-and-vis.org/mex.

  3. ML-ModelExplorer video: https://youtu.be/IO7IWTUxK_Y.

Author information

Corresponding author

Correspondence to Andreas Theissler.

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Theissler, A., Vollert, S., Benz, P., Meerhoff, L.A., Fernandes, M. (2020). ML-ModelExplorer: An Explorative Model-Agnostic Approach to Evaluate and Compare Multi-class Classifiers. In: Holzinger, A., Kieseberg, P., Tjoa, A., Weippl, E. (eds) Machine Learning and Knowledge Extraction. CD-MAKE 2020. Lecture Notes in Computer Science, vol 12279. Springer, Cham. https://doi.org/10.1007/978-3-030-57321-8_16

  • DOI: https://doi.org/10.1007/978-3-030-57321-8_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57320-1

  • Online ISBN: 978-3-030-57321-8

  • eBook Packages: Computer Science, Computer Science (R0)
