Adaptive Fast XGBoost for Regression

  • Conference paper
Intelligent Systems (BRACIS 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13653)

Abstract

The increasing generation of data by devices, people and systems gives rise to the need to process non-stationary data streams, which change continuously over time. Compared with data stream classification, data stream regression has received little study. This work proposes AFXGBReg-D, an Adaptive Fast regression algorithm that uses XGBoost and active concept drift detectors. AFXGBReg uses an alternate model training strategy to achieve lean models adapted to concept drift, combined with a set of drift detector algorithms: ADWIN, KSWIN and DMM. We compared two AFXGBReg variants with other regressors and data stream regressors in simulations on synthetic datasets with different kinds of concept drift. The results show that the AFXGBReg models achieve an MSE similar to that of ARFReg, and statistical tests indicate that these models perform better than the other regressors evaluated. AFXGBReg is also 33 times faster than ARFReg, so it keeps the same MSE level at a much lower run time. A further improvement is its faster recovery from concept drift, with a smaller MSE peak.
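The general idea of pairing a regressor with an active drift detector can be sketched in a few lines. The following is a minimal illustration only and is not the authors' implementation: `MeanShiftDetector` is a hypothetical, simplified stand-in for detectors such as ADWIN or KSWIN, `LineRegressor` is a one-dimensional least-squares fit standing in for XGBoost, and the window sizes, threshold and synthetic stream are all assumptions chosen for the example. It also assumes that, after a drift is flagged, the stale model keeps predicting until a replacement trained only on post-drift data is ready.

```python
import random
from collections import deque


class MeanShiftDetector:
    """Toy detector (assumption, not ADWIN/KSWIN): flags drift when the mean
    absolute error of the newer half of a window exceeds that of the older
    half by a fixed ratio."""

    def __init__(self, window=60, ratio=3.0):
        self.errors = deque(maxlen=window)
        self.ratio = ratio

    def update(self, error):
        self.errors.append(abs(error))
        if len(self.errors) < self.errors.maxlen:
            return False
        errs = list(self.errors)
        half = len(errs) // 2
        old = sum(errs[:half]) / half
        new = sum(errs[half:]) / (len(errs) - half)
        if new > self.ratio * max(old, 1e-9):
            self.errors.clear()  # start a fresh window after a detection
            return True
        return False


class LineRegressor:
    """One-dimensional least-squares line; a stand-in for the real learner."""

    def fit(self, xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.slope = sxy / sxx if sxx else 0.0
        self.intercept = my - self.slope * mx
        return self

    def predict(self, x):
        return self.slope * x + self.intercept


def run_stream(n=1000, drift_at=500, batch=20, seed=0):
    """Test-then-train loop over a synthetic stream with one abrupt drift."""
    rng = random.Random(seed)
    detector = MeanShiftDetector()
    buffer = deque(maxlen=200)  # sliding window of recent training examples
    model, drifts, sq_errors = None, [], []
    for t in range(n):
        x = rng.uniform(0.0, 1.0)
        slope, intercept = (2.0, 0.0) if t < drift_at else (-3.0, 5.0)
        y = slope * x + intercept + rng.gauss(0.0, 0.05)
        if model is not None:
            err = y - model.predict(x)  # test first ...
            sq_errors.append(err * err)
            if detector.update(err):
                drifts.append(t)
                buffer.clear()  # discard examples from the old concept
        buffer.append((x, y))  # ... then train
        if (t + 1) % batch == 0 and len(buffer) >= 2:
            xs, ys = zip(*buffer)
            model = LineRegressor().fit(xs, ys)  # replaces the previous model
    return drifts, sum(sq_errors) / len(sq_errors)
```

Calling `run_stream()` returns the stream positions at which drift was flagged and the overall MSE measured in test-then-train fashion. Clearing the buffer on detection is what keeps the replacement model from mixing examples of the old and new concepts.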

Acknowledgments

We would like to especially thank FAPESC (Fundação de Amparo à Pesquisa e Inovação do Estado de Santa Catarina) for partially funding this research work.

Author information

Corresponding author

Correspondence to Fabiano Baldo.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

de Souza, F.M., Grando, J., Baldo, F. (2022). Adaptive Fast XGBoost for Regression. In: Xavier-Junior, J.C., Rios, R.A. (eds) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol 13653. Springer, Cham. https://doi.org/10.1007/978-3-031-21686-2_7

  • DOI: https://doi.org/10.1007/978-3-031-21686-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21685-5

  • Online ISBN: 978-3-031-21686-2

  • eBook Packages: Computer Science, Computer Science (R0)
