Benchmarking Concept Drift Detectors for Online Machine Learning

  • Conference paper
Model and Data Engineering (MEDI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13761)

Abstract

Concept drift detection is an essential step in maintaining the accuracy of online machine learning. The main task is to detect changes in the data distribution that might change the decision boundaries of a classification algorithm. Upon drift detection, the classification algorithm may reset its model or concurrently grow a new learning model. Over the past fifteen years, several drift detection methods have been proposed, and most of them have been implemented within the Massive Online Analysis (MOA) framework. A couple of studies have compared these drift detectors. However, such studies have focused merely on detection accuracy, and most of them use synthetic data sets only. Additionally, they do not consider drift detectors that are not integrated into MOA. Furthermore, none of the studies have considered other metrics such as resource consumption and runtime characteristics, which are of utmost importance from an operational point of view.
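To make the detection task concrete, the following is a minimal sketch of a detector that monitors a classifier's stream of prediction errors. It uses a simplified rule in the style of the Drift Detection Method (DDM), chosen here purely for illustration; the abstract does not say which detectors are evaluated, and this is not MOA code. The class name `SimpleDDM`, the helper `detect_drifts`, and the parameters `min_samples` and `drift_level` are assumptions of this sketch.

```python
import math


class SimpleDDM:
    """Simplified DDM-style detector over 0/1 prediction errors (illustrative only)."""

    def __init__(self, min_samples=30, drift_level=3.0):
        self.min_samples = min_samples  # assumed warm-up length before testing
        self.drift_level = drift_level  # assumed multiplier on the minimum std. dev.
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the lowest observed p + s
        self.s_min = math.inf   # std. dev. at the lowest observed p + s

    def update(self, error):
        """Feed one error indicator (1 = misclassified); return True on drift."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n < self.min_samples:
            return False
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # mirrors "the classifier may reset its model"
            return True
        return False


def detect_drifts(errors):
    """Return the stream positions at which the detector signals drift."""
    detector = SimpleDDM()
    return [i for i, e in enumerate(errors) if detector.update(e)]
```

In an online learner, a `True` return value would be the point at which the model is reset or a replacement model starts to be trained.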

In this paper, we fill this gap by evaluating the performance of sixteen drift detection methods using three metrics: accuracy, runtime, and memory usage. To guarantee a fair comparison, all methods are evaluated within MOA. Fourteen of the algorithms are already implemented in MOA; we integrate the remaining two (ADWIN++ and SDDM) into MOA.
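The three metrics can be illustrated with a small harness. This is a sketch under stated assumptions, not the paper's setup: the paper works inside MOA, whereas the code below swaps in Python's `time.perf_counter` and `tracemalloc`. The two toy detectors, the synthetic stream, and every name (`WindowMeanDetector`, `PageHinkleyDetector`, `make_stream`, `benchmark`) are hypothetical, and accuracy is reduced to false alarms and detection delay for brevity.

```python
import time
import tracemalloc
from collections import deque


class WindowMeanDetector:
    """Hypothetical detector: compares a recent window's mean to a reference window."""

    def __init__(self, window=50, threshold=0.2):
        self.window, self.threshold = window, threshold
        self.reference = []
        self.recent = deque(maxlen=window)

    def update(self, value):
        if len(self.reference) < self.window:
            self.reference.append(value)
            return False
        self.recent.append(value)
        if len(self.recent) < self.window:
            return False
        gap = abs(sum(self.recent) - sum(self.reference)) / self.window
        if gap > self.threshold:
            self.reference = list(self.recent)  # adopt the new concept
            self.recent.clear()
            return True
        return False


class PageHinkleyDetector:
    """Page-Hinkley test for an increase in the mean (parameter values assumed)."""

    def __init__(self, delta=0.005, lam=10.0):
        self.delta, self.lam = delta, lam
        self.reset()

    def reset(self):
        self.n, self.mean, self.cum, self.cum_min = 0, 0.0, 0.0, 0.0

    def update(self, value):
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cum += value - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        if self.cum - self.cum_min > self.lam:
            self.reset()
            return True
        return False


def make_stream(n=2000, drift_at=1000):
    """Deterministic 0/1 error stream: 10% errors before drift_at, 50% after."""
    return [1 if i % (10 if i < drift_at else 2) == 0 else 0 for i in range(n)]


def run(detector, stream):
    return [i for i, x in enumerate(stream) if detector.update(x)]


def benchmark(factory, stream, drift_at):
    # Pass 1: wall-clock runtime, measured without memory-tracing overhead.
    start = time.perf_counter()
    detections = run(factory(), stream)
    seconds = time.perf_counter() - start

    # Pass 2: peak traced Python allocations, using a fresh detector.
    tracemalloc.start()
    run(factory(), stream)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    hits = [i for i in detections if i >= drift_at]
    return {
        "false_alarms": sum(1 for i in detections if i < drift_at),
        "delay": hits[0] - drift_at if hits else None,
        "seconds": seconds,
        "peak_bytes": peak_bytes,
    }


if __name__ == "__main__":
    stream = make_stream()
    for factory in (WindowMeanDetector, PageHinkleyDetector):
        print(factory.__name__, benchmark(factory, stream, drift_at=1000))
```

Timing and memory are measured in separate passes because memory tracing slows execution. Running every detector through the same harness on the same stream is the point of the fairness argument above: differences then reflect the detectors rather than the surrounding framework.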

Notes

  1. https://github.com/mahmoudmahgoub/moa

  2. https://github.com/openjdk/jmh

  3. https://visualvm.github.io/

Acknowledgments

The work of Ahmed Awad is funded by the European Regional Development Funds (Mobilitas Plus Programme grant MOBTT75).

Author information

Corresponding author

Correspondence to Ahmed Awad.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mahgoub, M., Moharram, H., Elkafrawy, P., Awad, A. (2023). Benchmarking Concept Drift Detectors for Online Machine Learning. In: Fournier-Viger, P., Hassan, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2022. Lecture Notes in Computer Science, vol 13761. Springer, Cham. https://doi.org/10.1007/978-3-031-21595-7_4

  • DOI: https://doi.org/10.1007/978-3-031-21595-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21594-0

  • Online ISBN: 978-3-031-21595-7

  • eBook Packages: Computer Science, Computer Science (R0)
