Summary
Sensor networks that monitor environmental, traffic and weather conditions, among others, generate data streams in large quantities and at rapid rates. A significant challenge in sensor networks is analysing the vast amounts of data that are rapidly generated and transmitted through sensing. Because wired communication is infeasible in such environments, this data is currently communicated for analysis through satellite channels, which are exorbitantly expensive. To address this issue, we propose a strategy for on-board mining of data streams in a resource-constrained environment. We have developed a novel approach that dynamically adapts the data-stream mining process on the basis of available memory resources. This adaptation is algorithm-independent and enables data-stream mining algorithms to cope with high data rates in light of finite computational resources. We have also developed lightweight data-stream mining algorithms that incorporate our adaptive mining approach for resource-constrained environments.
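The idea of adapting a mining process to available memory can be illustrated with a minimal sketch. The code below is not the chapter's algorithm; it is a hypothetical one-pass clustering of scalar readings in which `max_clusters` stands in for the memory budget and the distance `threshold` controls how coarse the output is. All names and parameters (`AdaptiveStreamClusterer`, `growth`, and so on) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveStreamClusterer:
    """One-pass 1-D clustering whose granularity follows a memory budget.

    Hypothetical illustration only; not the method described in the chapter.
    """
    max_clusters: int        # assumed stand-in for the available memory
    threshold: float = 1.0   # max distance at which a point joins a cluster
    growth: float = 2.0      # factor applied to the threshold under pressure
    centers: list = field(default_factory=list)
    counts: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # These conditions guarantee that the coarsening loop terminates.
        if self.max_clusters < 1 or self.threshold <= 0 or self.growth <= 1:
            raise ValueError("need max_clusters >= 1, threshold > 0, growth > 1")

    def update(self, x: float) -> None:
        """Absorb x into the nearest cluster, or open a new cluster."""
        if self.centers:
            i = min(range(len(self.centers)),
                    key=lambda j: abs(self.centers[j] - x))
            if abs(self.centers[i] - x) <= self.threshold:
                self.counts[i] += 1
                # Incremental mean keeps only a summary, not the raw points.
                self.centers[i] += (x - self.centers[i]) / self.counts[i]
                return
        self.centers.append(x)
        self.counts.append(1)
        # Memory pressure: coarsen the output until it fits the budget again.
        while len(self.centers) > self.max_clusters:
            self.threshold *= self.growth
            self._merge_close_clusters()

    def _merge_close_clusters(self) -> None:
        """Merge clusters whose centers lie within the current threshold."""
        merged_centers, merged_counts = [], []
        for c, n in sorted(zip(self.centers, self.counts)):
            if merged_centers and c - merged_centers[-1] <= self.threshold:
                total = merged_counts[-1] + n
                merged_centers[-1] = (
                    merged_centers[-1] * merged_counts[-1] + c * n
                ) / total
                merged_counts[-1] = total
            else:
                merged_centers.append(c)
                merged_counts.append(n)
        self.centers, self.counts = merged_centers, merged_counts
```

Feeding readings one at a time through `update` keeps at most `max_clusters` summaries in memory; when the budget is exceeded, the sketch trades accuracy for space by raising the threshold and merging nearby clusters rather than dropping incoming data.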
Copyright information
© 2005 Dr Sanghamitra Bandyopadhyay
About this chapter
Cite this chapter
Gaber, M.M., Krishnaswamy, S., Zaslavsky, A. (2005). On-board Mining of Data Streams in Sensor Networks. In: Advanced Methods for Knowledge Discovery from Complex Data. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-284-5_12
DOI: https://doi.org/10.1007/1-84628-284-5_12
Publisher Name: Springer, London
Print ISBN: 978-1-85233-989-0
Online ISBN: 978-1-84628-284-3
eBook Packages: Computer Science, Computer Science (R0)