A Data Stream Prediction Strategy for Elastic Stream Computing Systems

  • Conference paper
Broadband Communications, Networks, and Systems (BROADNETS 2021)

Abstract

In a distributed stream processing system, elastic resource provisioning and scheduling is a main factor that affects system performance and limits system applications. However, on a data stream computing platform, resource allocation is often suboptimal because the data stream rate fluctuates widely, which creates a performance bottleneck for the cluster. In this paper, we propose a data stream prediction strategy (Dp-Stream) for elastic stream computing systems to mitigate this resource allocation issue. First, we establish a back propagation (BP) neural network prediction model based on a genetic simulated annealing algorithm to predict the trend of the cluster's data stream rate in the next time window; second, based on time latency, an estimation model adjusts the resources allocated to the critical operations on the critical path of the Directed Acyclic Graph (DAG); and finally, the resource communication cost is optimized. We evaluate the prediction accuracy and system latency of the proposed scheduling strategy in Storm. The experimental results demonstrate the feasibility and effectiveness of the proposed strategy.
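To make the second step more concrete, the following is a minimal sketch of how a critical path and its most expensive operation might be located in a topology DAG. It is not the paper's estimation model: the operator names, the per-operator latency figures, and the helper functions `critical_path` and `critical_operator` are all hypothetical, and the sketch assumes that the latency of a path is simply the sum of its operators' latencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(latency, edges):
    """Return (total_latency, path) for the highest-latency path in a DAG.

    latency: dict mapping operator name -> estimated latency (hypothetical, in ms)
    edges:   list of (upstream, downstream) operator pairs
    """
    # Map every operator to the set of operators that feed it.
    preds = {node: set() for node in latency}
    for up, down in edges:
        preds[down].add(up)

    best = {}    # operator -> largest accumulated latency of any path ending there
    parent = {}  # operator -> predecessor on that path (None for a source)
    # static_order() yields each operator after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        prev = max(preds[node], key=lambda p: best[p], default=None)
        best[node] = latency[node] + (best[prev] if prev is not None else 0.0)
        parent[node] = prev

    # Walk back from the operator where the most expensive path ends.
    node = max(best, key=best.get)
    total = best[node]
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return total, path


def critical_operator(latency, path):
    """Return the operator on the path with the largest individual latency."""
    return max(path, key=lambda name: latency[name])


# Hypothetical five-operator topology with assumed latencies.
latency = {"spout": 2.0, "parse": 5.0, "filter": 1.0, "count": 8.0, "sink": 1.5}
edges = [("spout", "parse"), ("spout", "filter"),
         ("parse", "count"), ("filter", "count"), ("count", "sink")]

total, path = critical_path(latency, edges)
print(total, path, critical_operator(latency, path))
```

In this toy topology the path through `parse` dominates the path through `filter`, so an elastic scheduler following the idea in the abstract would consider giving additional resources to the slowest operation on that path first.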

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61972364 and the Fundamental Research Funds for the Central Universities under Grant No. 2652021001. This work is also supported by Melbourne-Chindia Cloud Computing (MC3) Research Network.

Author information

Corresponding author

Correspondence to Dawei Sun.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Zhang, H., Sun, D., Sajjanhar, A., Buyya, R. (2022). A Data Stream Prediction Strategy for Elastic Stream Computing Systems. In: Xiang, W., Han, F., Phan, T.K. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 413. Springer, Cham. https://doi.org/10.1007/978-3-030-93479-8_9

  • DOI: https://doi.org/10.1007/978-3-030-93479-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93478-1

  • Online ISBN: 978-3-030-93479-8

  • eBook Packages: Computer Science, Computer Science (R0)
