On-board Mining of Data Streams in Sensor Networks

Gaber, Mohamed Medhat; Krishnaswamy, Shonali; Zaslavsky, Arkady

doi:10.1007/1-84628-284-5_12

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

1005 Accesses
19 Citations

Summary

Sensor networks, which typically monitor environmental, traffic and weather conditions among others, generate data streams in large quantities and at rapid rates. A significant challenge in sensor networks is analysing the vast amounts of data that are rapidly generated and transmitted through sensing. Because wired communication is infeasible in the environments outlined above, this data is currently communicated for analysis through satellite channels, which is exorbitantly expensive. To address this issue, we propose a strategy for on-board mining of data streams in a resource-constrained environment. We have developed a novel approach that dynamically adapts the data-stream mining process on the basis of available memory resources. This adaptation is algorithm-independent and enables data-stream mining algorithms to cope with high data rates in the light of finite computational resources. We have also developed lightweight data-stream mining algorithms that incorporate our adaptive mining approach for resource-constrained environments.
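The summary gives no algorithmic detail, so the following is only an illustrative sketch of the general idea of adapting a mining process to a memory budget. It is not the authors' algorithm. It assumes a one-pass, one-dimensional clusterer in which the number of stored clusters stands in for memory use; the class name `AdaptiveClusterer` and the parameters `max_clusters`, `radius` and `growth` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveClusterer:
    """One-pass 1-D clustering whose radius grows when the budget is exceeded."""
    max_clusters: int      # hypothetical stand-in for the available memory
    radius: float = 1.0    # hypothetical initial distance threshold
    growth: float = 2.0    # hypothetical widening factor when over budget
    centers: list = field(default_factory=list)
    counts: list = field(default_factory=list)

    def __post_init__(self):
        # These checks guarantee that the coarsening loop in update() terminates.
        if self.max_clusters < 1 or self.radius <= 0 or self.growth <= 1:
            raise ValueError("need max_clusters >= 1, radius > 0, growth > 1")

    def _absorb(self, x, weight=1):
        # Merge x into the nearest center if it lies within radius,
        # otherwise open a new cluster for it.
        if self.centers:
            i = min(range(len(self.centers)),
                    key=lambda j: abs(self.centers[j] - x))
            if abs(self.centers[i] - x) <= self.radius:
                total = self.counts[i] + weight
                self.centers[i] += (x - self.centers[i]) * weight / total
                self.counts[i] = total
                return
        self.centers.append(x)
        self.counts.append(weight)

    def update(self, x):
        """Consume one stream item, coarsening the output if over budget."""
        self._absorb(x)
        while len(self.centers) > self.max_clusters:
            # Over budget: widen the radius and re-merge the stored clusters.
            self.radius *= self.growth
            old = list(zip(self.centers, self.counts))
            self.centers, self.counts = [], []
            for center, count in old:
                self._absorb(center, count)


clusterer = AdaptiveClusterer(max_clusters=3)
for reading in [0.0, 0.5, 10.0, 10.4, 20.0, 30.0]:
    clusterer.update(reading)
print(len(clusterer.centers), clusterer.radius)
```

In this sketch the adaptation lives entirely in `update`, outside the merge rule, which loosely mirrors the summary's claim that the adaptation is independent of the mining algorithm: a tighter budget yields fewer, coarser clusters rather than dropped input.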

About this chapter

Cite this chapter

Gaber, M.M., Krishnaswamy, S., Zaslavsky, A. (2005). On-board Mining of Data Streams in Sensor Networks. In: Advanced Methods for Knowledge Discovery from Complex Data. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-284-5_12

DOI: https://doi.org/10.1007/1-84628-284-5_12
Publisher Name: Springer, London
Print ISBN: 978-1-85233-989-0
Online ISBN: 978-1-84628-284-3
eBook Packages: Computer Science, Computer Science (R0)
