Summary

Sensor networks that monitor environmental, traffic and weather conditions, among others, generate data streams in large quantities and at rapid rates. A significant challenge in sensor networks is the analysis of the vast amounts of data that are rapidly generated and transmitted through sensing. Because wired communication is infeasible in such environments, this data is currently communicated for analysis over satellite channels, and satellite communication is exorbitantly expensive. To address this issue, we propose a strategy for on-board mining of data streams in a resource-constrained environment. We have developed a novel approach that dynamically adapts the data-stream mining process on the basis of available memory resources. This adaptation is algorithm-independent and enables data-stream mining algorithms to cope with high data rates in light of finite computational resources. We have also developed lightweight data-stream mining algorithms that incorporate our adaptive mining approach for resource-constrained environments.
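The idea of adapting a mining process to available memory can be illustrated with a minimal sketch. It is not the chapter's algorithm: the class name `AdaptiveClusterer`, the use of a cluster-count budget (`max_clusters`) as a stand-in for available memory, and the `growth` factor are all assumptions made for illustration. The sketch clusters scalar readings in one pass and coarsens its output by enlarging a distance threshold whenever the budget is exceeded.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveClusterer:
    """One-pass clustering of scalar readings under a fixed memory budget.

    Hypothetical illustration only; names and parameters are assumptions.
    """

    max_clusters: int      # assumed stand-in for the available memory
    threshold: float       # max distance for a reading to join a cluster
    growth: float = 2.0    # assumed factor applied when the budget is exceeded
    centers: list = field(default_factory=list)  # [mean, count] pairs

    def __post_init__(self) -> None:
        # These conditions guarantee that the coarsening loop terminates.
        if self.max_clusters < 1 or self.threshold <= 0 or self.growth <= 1:
            raise ValueError("need max_clusters >= 1, threshold > 0, growth > 1")

    def add(self, x: float) -> None:
        # Find the nearest existing cluster, if there is one.
        nearest = min(self.centers, key=lambda c: abs(c[0] - x), default=None)
        if nearest is not None and abs(nearest[0] - x) <= self.threshold:
            # Fold the reading into the running mean; raw data is not stored.
            nearest[1] += 1
            nearest[0] += (x - nearest[0]) / nearest[1]
            return
        self.centers.append([x, 1])
        # Budget exceeded: coarsen the output until it fits again.
        while len(self.centers) > self.max_clusters:
            self.threshold *= self.growth
            self._merge()

    def _merge(self) -> None:
        """Merge neighbouring centers that lie within the current threshold."""
        merged = []
        for mean, count in sorted(self.centers):
            if merged and abs(merged[-1][0] - mean) <= self.threshold:
                prev = merged[-1]
                total = prev[1] + count
                prev[0] = (prev[0] * prev[1] + mean * count) / total
                prev[1] = total
            else:
                merged.append([mean, count])
        self.centers = merged


clusterer = AdaptiveClusterer(max_clusters=3, threshold=1.0)
for reading in [0.0, 0.5, 10.0, 10.4, 20.0, 30.0, 40.0]:
    clusterer.add(reading)
```

In this sketch the trade-off is explicit: a larger threshold yields fewer, coarser clusters, so the summary always fits the budget at the cost of resolution. Because only means and counts are kept, memory use does not grow with the length of the stream.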

Copyright information

© 2005 Dr Sanghamitra Bandyopadhyay

About this chapter

Cite this chapter

Gaber, M.M., Krishnaswamy, S., Zaslavsky, A. (2005). On-board Mining of Data Streams in Sensor Networks. In: Advanced Methods for Knowledge Discovery from Complex Data. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-284-5_12

  • DOI: https://doi.org/10.1007/1-84628-284-5_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-989-0

  • Online ISBN: 978-1-84628-284-3

  • eBook Packages: Computer Science, Computer Science (R0)
