Recurrent concepts in data streams classification

Gama, João; Kosina, Petr

doi:10.1007/s10115-013-0654-6

Regular Paper
Published: 31 May 2013

Volume 40, pages 489–507 (2014)

Knowledge and Information Systems

João Gama^1,2 &
Petr Kosina^1,3

Abstract

This work addresses the problem of mining data streams generated in dynamic environments where the distribution underlying the observations may change over time. We present a system that monitors the evolution of the learning process. The system is able to self-diagnose degradations of this process, using change detection mechanisms, and self-repair the decision models. The system uses meta-learning techniques that characterize the domain of applicability of previously learned models. The meta-learner can detect recurrence of contexts, using unlabeled examples, and take pro-active actions by activating previously learned models. The experimental evaluation on three text mining problems demonstrates the main advantages of the proposed system: it provides information about the recurrence of concepts and rapidly adapts decision models when drift occurs.

Notes

The default value is 250 examples.

Acknowledgments

This work is part-funded by the ERDF (European Regional Development Fund) through the COMPETE Programme (Operational Programme for Competitiveness) and by Portuguese funds through the FCT (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-022701. The authors acknowledge the financial support of the project Knowledge Discovery from Ubiquitous Data Streams (PTDC/EIA/098355/2008), funded by FCT. Petr Kosina acknowledges the support of Masaryk University, Faculty of Informatics.

Author information

Authors and Affiliations

LIAAD-INESC TEC, Porto, Portugal
João Gama & Petr Kosina
FEP-University of Porto, Porto, Portugal
João Gama
FI Masaryk University, Brno, Czech Republic
Petr Kosina

Corresponding author

Correspondence to João Gama.

About this article

Cite this article

Gama, J., Kosina, P. Recurrent concepts in data streams classification. Knowl Inf Syst 40, 489–507 (2014). https://doi.org/10.1007/s10115-013-0654-6

Received: 28 October 2011
Revised: 05 February 2013
Accepted: 27 April 2013
Published: 31 May 2013
Issue Date: September 2014
DOI: https://doi.org/10.1007/s10115-013-0654-6
