Abstract
Modern technology has allowed real-time data collection in a variety of domains, ranging from environmental monitoring to healthcare. Consequently, there is a growing need for algorithms capable of performing inferential tasks in an online manner, continuously revising their estimates to reflect the current status of the underlying process. In particular, we are interested in constructing online and temporally adaptive classifiers capable of handling the possibly drifting decision boundaries arising in streaming environments. We first make a quadratic approximation to the log-likelihood that yields a recursive algorithm for fitting logistic regression online. We then suggest a novel way of equipping this framework with self-tuning forgetting factors. The resulting scheme is capable of tracking changes in the underlying probability distribution, adapting the decision boundary appropriately and hence maintaining high classification accuracy in dynamic or unstable environments. We demonstrate the scheme’s effectiveness in both real and simulated streaming environments.
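The abstract describes a recursive, Newton-style update for logistic regression combined with a forgetting factor that down-weights old observations. The following is a minimal sketch of that general idea only: it uses a fixed forgetting factor `lam` rather than the paper's self-tuning mechanism, and the class name, parameters, and initialization constant `delta` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class OnlineLogistic:
    """Recursive logistic regression with exponential forgetting (sketch).

    A quadratic approximation to the log-likelihood at the current estimate
    gives a weighted recursive-least-squares step; the forgetting factor
    lam in (0, 1] discounts past data so the decision boundary can track
    drift. lam is fixed here; the paper tunes it adaptively.
    """

    def __init__(self, dim, lam=0.99, delta=10.0):
        self.w = np.zeros(dim)        # coefficient estimate
        self.P = delta * np.eye(dim)  # inverse-information matrix (assumed init)
        self.lam = lam                # forgetting factor

    def predict_proba(self, x):
        return 1.0 / (1.0 + np.exp(-self.w @ x))

    def update(self, x, y):
        """One recursive step for observation (x, y), with y in {0, 1}."""
        p = self.predict_proba(x)
        s = max(p * (1.0 - p), 1e-6)  # curvature of the local quadratic model
        Px = self.P @ x
        # Gain and Sherman-Morrison rank-one update of P, with forgetting
        k = Px / (self.lam / s + x @ Px)
        self.P = (self.P - np.outer(k, Px)) / self.lam
        # Newton-style correction toward the new observation
        self.w = self.w + k * (y - p) / s
```

Feeding the stream one observation at a time via `update` keeps the estimate current; smaller `lam` shortens the effective memory and so tracks faster drift at the cost of noisier estimates.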
Cite this article
Anagnostopoulos, C., Tasoulis, D.K., Adams, N.M. et al. Temporally adaptive estimation of logistic classifiers on data streams. Adv Data Anal Classif 3, 243–261 (2009). https://doi.org/10.1007/s11634-009-0051-x