Learning from evolving data streams through ensembles of random patches

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

Ensemble methods are an effective way to solve supervised learning problems, and they are prevalent in learning from evolving data streams. One of the main reasons for this popularity is the possibility of incorporating concept drift detection and recovery strategies in conjunction with the ensemble algorithm. On top of that, successful ensemble strategies, such as bagging and random forest, can be easily adapted to a streaming setting. In this work, we analyse a novel ensemble method designed specifically to cope with evolving data streams: the streaming random patches (SRP) algorithm. SRP combines random subspaces and online bagging to achieve competitive predictive performance in comparison with other methods. We significantly extend previous theoretical insights and empirical results illustrating different aspects of SRP. In particular, we explain how the widely adopted incremental Hoeffding trees are not, in fact, unstable learners, unlike their batch counterparts, and how this fact significantly influences ensemble design and performance. We compare SRP against state-of-the-art ensemble variants for streaming data on a multitude of datasets. The results show that SRP produces high predictive performance on both real and synthetic datasets. We also show that ensembles of random subspaces can be an efficient and accurate alternative to SRP and leveraging bagging as the number of base learners increases. Besides, we analyse diversity over time and average tree depth, which provides insights into the differences between local subspace randomization (as in random forest) and global subspace randomization (as in random subspaces). Finally, we analyse the behaviour of SRP when using Naive Bayes as its base learner instead of Hoeffding trees.
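
The abstract describes SRP as the combination of random subspaces (each base learner trains on a random subset of the features, drawn globally rather than per tree node) with online bagging (each base learner sees each arriving instance a Poisson-distributed number of times). The following minimal Python sketch illustrates only that combination; it is not the authors' MOA implementation, it omits SRP's per-learner drift detection and recovery, and the class and function names, the Poisson parameter lam, and the base learner's partial_fit/predict interface are assumptions made for illustration.

    import numpy as np
    from collections import Counter

    class RandomPatchMember:
        """One ensemble member trained on a fixed random subset of features."""

        def __init__(self, base_learner, n_features, subspace_size, rng):
            self.learner = base_learner
            # Global subspace randomization: the feature subset is drawn once,
            # at construction, and kept for the lifetime of this member.
            self.features = rng.choice(n_features, size=subspace_size, replace=False)

        def partial_fit(self, x, y, weight):
            x_sub = x[self.features]                 # project onto this member's patch
            for _ in range(weight):
                self.learner.partial_fit(x_sub, y)   # assumed incremental-learner API

        def predict(self, x):
            return self.learner.predict(x[self.features])

    def train_on_instance(ensemble, x, y, rng, lam=6.0):
        # Online bagging: each member sees the instance k ~ Poisson(lam) times.
        for member in ensemble:
            k = rng.poisson(lam)
            if k > 0:
                member.partial_fit(x, y, k)

    def predict_on_instance(ensemble, x):
        # Majority vote over the members' per-subspace predictions.
        votes = Counter(member.predict(x) for member in ensemble)
        return votes.most_common(1)[0][0]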


Notes

  1. The implementation and instructions are available at https://github.com/hmgomes/StreamingRandomPatches.

  2. A formal definition of concept drift can be found in [48].

  3. The implication goes only one way: algorithmic stability is sufficient, but not necessary, for learning.

  4. \(\Lambda \) could comprise values of multiple types; for example, the integer ensemble size M and the real-valued weights \(w_j\) could both be hyperparameters.

  5. GP and c were originally denoted \(n_{\min }\) and \(\delta \) by Domingos and Hulten [15]; however, we keep these acronyms, as used in the Massive Online Analysis (MOA) framework, to facilitate reproducibility (a sketch of the underlying Hoeffding bound is given after these notes).

  6. Results for AGR(A) and AGR(G) with \(k=50\%\) and \(k=60\%\) are identical, since \(0.5\times 9=4.5\) and \(0.6\times 9=5.4\) both round to 5 features when rounded to the nearest integer.

  7. In DWM [30], we can only set the maximum number of base learners, since DWM dynamically changes the ensemble size during execution.

  8. The results in Figs. 10 and 11 exclude the SPAM CPU time and RAM-hours measurements for all algorithms, since BAG and LB did not finish executing on that dataset.
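
To make the parameters in note 5 concrete: GP is the grace period \(n_{\min }\) (how many instances a leaf accumulates between split attempts) and c is the split confidence \(\delta \) used in the Hoeffding bound of Domingos and Hulten [15]. With probability \(1-\delta \), the true mean of a quantity with range R lies within \(\epsilon = \sqrt{R^2 \ln (1/\delta ) / (2n)}\) of its sample mean after n observations, and a leaf splits when the merit gap between its two best attributes exceeds \(\epsilon \). The sketch below illustrates this; the function names are illustrative assumptions, not MOA's actual API.

    import math

    def hoeffding_bound(value_range, delta, n):
        # epsilon = sqrt(R^2 * ln(1/delta) / (2n)); with probability 1 - delta the
        # true mean of a quantity with range R is within epsilon of the sample mean.
        return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

    def should_split(best_merit, second_best_merit, value_range, delta, n_seen, grace_period):
        # Split attempts are only made every grace_period (GP) instances;
        # delta plays the role of the confidence parameter c.
        if n_seen % grace_period != 0:
            return False
        eps = hoeffding_bound(value_range, delta, n_seen)
        return (best_merit - second_best_merit) > eps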

References

  1. Abdulsalam H, Skillicorn DB, Martin P (2008) Classifying evolving data streams using dynamic streaming random forests. In: International conference on database and expert systems applications. Springer, pp 643–651

  2. Bifet A, Frank E, Holmes G, Pfahringer B (2012) Ensembles of restricted Hoeffding trees. ACM TIST 3(2):30:1–30:20. https://doi.org/10.1145/2089094.2089106

  3. Bifet A, Gavaldà R (2007) Learning from time-changing data with adaptive windowing. In: SIAM international conference on data mining (SDM)

  4. Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res 11:1601–1604

  5. Bifet A, Holmes G, Pfahringer B (2010) Leveraging bagging for evolving data streams. In: PKDD, pp 135–150

  6. Bousquet O, Elisseeff A (2002) Stability and generalization. J Mach Learn Res 2:499–526

  7. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140. https://doi.org/10.1023/A:1018054314350

  8. Breiman L (1999) Pasting small votes for classification in large databases and on-line. Mach Learn 36(1–2):85–103

  9. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  10. Brown G, Wyatt J, Harris R, Yao X (2005) Diversity creation methods: a survey and categorisation. J Inf Fusion 6:5–20

  11. Brzezinski D, Stefanowski J (2014) Combining block-based and online methods in learning ensembles from concept drifting data streams. Inf Sci 265:50–67. https://doi.org/10.1016/j.ins.2013.12.011

  12. Chen ST, Lin HT, Lu CJ (2012) An online boosting algorithm with theoretical justifications. In: Proceedings of the international conference on machine learning (ICML)

  13. Da Xu L, He W, Li S (2014) Internet of things in industries: a survey. IEEE Trans Ind Inform 10(4):2233–2243

  14. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  15. Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM SIGKDD, pp 71–80

  16. Domingos PM (2000) A unified bias-variance decomposition for zero-one and squared loss. AAAI 2000:564–569

  17. Freund Y, Schapire RE et al (1996) Experiments with a new boosting algorithm. ICML 96:148–156

  18. Gama J, Zliobaite I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37. https://doi.org/10.1145/2523813

  19. Gomes HM, Barddal JP, Enembreck F, Bifet A (2017) A survey on ensemble learning for data stream classification. ACM Comput Surv 50(2):23:1–23:36. https://doi.org/10.1145/3054925

  20. Gomes HM, Barddal JP, Ferreira LEB, Bifet A (2018) Adaptive random forests for data stream regression. In: ESANN

  21. Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfharinger B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 6:1–27. https://doi.org/10.1007/s10994-017-5642-8

  22. Gomes HM, Montiel J, Mastelini SM, Pfahringer B, Bifet A (2020) On ensemble techniques for data stream regression. In: 2020 International joint conference on neural networks (IJCNN). IEEE, pp 1–8

  23. Gomes HM, Read J, Bifet A (2019) Streaming random patches for evolving data stream classification. In: IEEE international conference on data mining. IEEE

  24. Gomes HM, Read J, Bifet A, Barddal JP, Gama J (2019) Machine learning for streaming data: state of the art, challenges, and opportunities. ACM SIGKDD Explor Newsl 21(2):6–22

  25. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer series in statistics. Springer, New York

  26. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844

  27. Hoens TR, Chawla NV, Polikar R (2011) Heuristic updatable weighted random subspaces for non-stationary environments. In: 2011 IEEE 11th international conference on data mining (ICDM). IEEE, pp 241–250

  28. Holmes G, Kirkby R, Pfahringer B (2005) Stress-testing Hoeffding trees. Knowl Discov Databases PKDD 2005:495–502. https://doi.org/10.1007/11564126_50

  29. Ikonomovska E, Gama J, Džeroski S (2011) Learning model trees from evolving data streams. Data Min Knowl Discov 23(1):128–168

  30. Kolter JZ, Maloof MA (2007) Dynamic weighted majority: an ensemble method for drifting concepts. J Mach Learn Res 8:2755–2790

  31. Kuncheva LI (2003) That elusive diversity in classifier ensembles. In: Iberian conference on pattern recognition and image analysis. Springer, pp 1126–1138

  32. Kuncheva LI, Rodríguez JJ, Plumpton CO, Linden DE, Johnston SJ (2010) Random subspace ensembles for FMRI classification. IEEE Trans Med Imaging 29(2):531–542

  33. Kutin S, Niyogi P (2002) Almost-everywhere algorithmic stability and generalization error. In: Proceedings of the eighteenth conference on uncertainty in artificial intelligence. Morgan Kaufmann, pp 275–282

  34. Kutin S, Niyogi P (2002) Almost-everywhere algorithmic stability and generalization error. Tech. Rep. TR-2002-03, University of Chicago

  35. Lim N, Durrant RJ (2017) Linear dimensionality reduction in linear time: Johnson-Lindenstrauss-type guarantees for random subspace. arXiv:1705.06408

  36. Lim N, Durrant RJ (2020) A diversity-aware model for majority vote ensemble accuracy. In: International conference on artificial intelligence and statistics. PMLR, pp 4078–4087

  37. Lin Y, Jeon Y (2006) Random forests and adaptive nearest neighbors. J Am Stat Assoc 101(474):578–590

  38. Littlestone N, Warmuth MK (1994) The weighted majority algorithm. Inf Comput 108(2):212–261

  39. Liu Y, Yao X (1999) Ensemble learning via negative correlation. Neural Netw 12:1399–1404

  40. Louppe G, Geurts P (2012) Ensembles on random patches. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 346–361

  41. Minku LL, White AP, Yao X (2010) The impact of diversity on online ensemble learning in the presence of concept drift. IEEE Trans Knowl Data Eng 22(5):730–742

  42. Oza N, Russell S (2001) Online bagging and boosting. In: Artificial intelligence and statistics 2001, pp 105–112. Morgan Kaufmann

  43. Panov P, Džeroski S (2007) Combining bagging and random subspaces to create better ensembles. In: International symposium on intelligent data analysis. Springer, pp 118–129

  44. Plumpton CO, Kuncheva LI, Oosterhof NN, Johnston SJ (2012) Naive random subspace ensemble with linear classifiers for real-time classification of FMRI data. Pattern Recognit 45(6):2101–2108

  45. Servedio RA (2003) Smooth boosting and learning with malicious noise. J Mach Learn Res 4:633–648

  46. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  47. Stapenhurst RJ (2012) Diversity, margins and non-stationary learning. Ph.D. thesis, University of Manchester, UK

  48. Webb GI, Hyde R, Cao H, Nguyen HL, Petitjean F (2016) Characterizing concept drift. Data Min Knowl Discov 30(4):964–994

  49. Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101. https://doi.org/10.1023/A:1018046501280

  50. Žliobaite I (2010) Change with delayed labeling: when is it detectable? In: 2010 IEEE international conference on data mining workshops (ICDMW). IEEE, pp 843–850

Author information

Corresponding author

Correspondence to Heitor Murilo Gomes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gomes, H.M., Read, J., Bifet, A. et al. Learning from evolving data streams through ensembles of random patches. Knowl Inf Syst 63, 1597–1625 (2021). https://doi.org/10.1007/s10115-021-01579-z
