10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

Evolving Stream Classification using Change Detection

  • @INPROCEEDINGS{10.4108/icst.collaboratecom.2014.257769,
        author={Ahmad Mustafa and Ahsanul Haque and Latifur Khan and Michael Baron and Bhavani Thuraisingham},
        title={Evolving Stream Classification using Change Detection},
        proceedings={10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing},
        publisher={IEEE},
        proceedings_a={COLLABORATECOM},
        year={2014},
        month={11},
        keywords={change detection, stream mining, concept drift},
        doi={10.4108/icst.collaboratecom.2014.257769}
    }
    
Ahmad Mustafa1,*, Ahsanul Haque1, Latifur Khan1, Michael Baron1, Bhavani Thuraisingham1
  • 1: The University of Texas at Dallas
*Contact email: amm106220@utdallas.edu

Abstract

Classifying instances in an evolving data stream is challenging because of the stream's properties, e.g., infinite length, concept drift, and concept evolution. Most existing approaches to stream classification divide the stream into fixed-size chunks so that the data fit in memory, and process these chunks one after another. However, a fixed chunk size may fail to capture concept drift immediately. We instead determine the chunk size dynamically by applying change point detection (CPD) techniques to the stream data. Because the distribution families before and after a change point are generally unknown, non-parametric CPD algorithms are used. We propose a multi-dimensional non-parametric CPD technique that determines chunk boundaries over data streams dynamically, which leads to better models for classifying instances of evolving data streams. Experimental results show that our approach detects change points and classifies instances of evolving data streams with high accuracy compared with baseline approaches.
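
The idea of cutting chunks at detected change points, rather than at fixed sizes, can be illustrated with a minimal sketch. This is not the CPD technique proposed in the paper: it assumes a simple stand-in detector that compares a reference window at the start of the current chunk with the most recent window, using a per-dimension two-sample Kolmogorov-Smirnov statistic. The function names, the window size of 50, and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(fa - fb))
    return gap


def dynamic_chunks(stream, window=50, threshold=0.8):
    """Yield variable-size chunks, cutting where the recent window drifts
    away from the reference window (illustrative stand-in detector)."""
    chunk = []
    for point in stream:
        chunk.append(point)
        if len(chunk) < 2 * window:
            continue  # not enough data for two disjoint windows yet
        reference, recent = chunk[:window], chunk[-window:]
        # Multi-dimensional handling here is simply the max over dimensions.
        score = max(
            ks_statistic([p[d] for p in reference], [p[d] for p in recent])
            for d in range(len(point))
        )
        if score > threshold:
            yield chunk[:-window]      # close the chunk before the drifted window
            chunk = chunk[-window:]    # the drifted window seeds the next chunk
    if chunk:
        yield chunk


# Synthetic 2-D stream with one abrupt drift at position 300 (assumed data).
random.seed(0)
stream = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
stream += [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(300)]

chunks = list(dynamic_chunks(stream))
print([len(c) for c in chunks])
```

In this sketch the boundary is placed one window before the point of detection, so it only approximates the true change point; a fixed-size chunking scheme would instead cut at multiples of the chunk size regardless of where the drift occurs.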