Abstract
In a distributed stream processing system, elastic resource provisioning and scheduling is a main factor that affects system performance and limits the range of applications. On data stream computing platforms, however, resource allocation is often suboptimal because the data stream rate fluctuates widely, which creates a performance bottleneck for the cluster. In this paper, we propose a data stream prediction strategy (Dp-Stream) for elastic stream computing systems to mitigate this resource allocation issue. First, we establish a back propagation (BP) neural network prediction model, based on a genetic simulated annealing algorithm, to predict the trend of the data stream rate in the cluster's next time window. Second, guided by latency, the estimation model adjusts the resources allocated to the critical operations on the critical path of the directed acyclic graph (DAG). Finally, the resource communication cost is optimized. We evaluate the prediction accuracy and system latency of the proposed scheduling strategy in Storm. The experimental results demonstrate the feasibility and effectiveness of the proposed strategy.
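As an illustration of the second step, the following is a minimal sketch of how a critical path and its most expensive operation could be located in an operator DAG. It is not the Dp-Stream implementation: the abstract does not say how the critical path is computed, so the longest-path rule, the operator names, and the latency values below are all assumptions made for this example.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path(latency, edges):
    """Return (total_latency, path) for the highest-latency path in a DAG.

    latency: dict mapping operator name -> estimated latency (assumed ms)
    edges:   list of (upstream, downstream) operator pairs
    """
    # Map each operator to the set of its direct upstream operators.
    preds = {op: set() for op in latency}
    for up, down in edges:
        preds[down].add(up)

    best = {}    # op -> largest accumulated latency of any path ending at op
    parent = {}  # op -> predecessor on that path (None for source operators)
    # static_order() yields every operator after all of its predecessors.
    for op in TopologicalSorter(preds).static_order():
        prev = max(preds[op], key=lambda p: best[p], default=None)
        best[op] = latency[op] + (best[prev] if prev is not None else 0.0)
        parent[op] = prev

    # Walk back from the operator where the accumulated latency peaks.
    node = max(best, key=best.get)
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return best[path[-1]], path


# Hypothetical topology and latencies, for illustration only.
latency = {"spout": 2.0, "parse": 5.0, "filter": 1.0, "join": 4.0, "sink": 1.0}
edges = [
    ("spout", "parse"),
    ("spout", "filter"),
    ("parse", "join"),
    ("filter", "join"),
    ("join", "sink"),
]

total, path = critical_path(latency, edges)
# The operator with the largest latency on the critical path is the first
# candidate for receiving additional resources in this sketch.
bottleneck = max(path, key=latency.get)
print(total, path, bottleneck)
```

In this sketch, an elastic scheduler would raise the parallelism of `bottleneck` and then recompute the path, since the critical path can shift once an operator is scaled.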
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 61972364 and the Fundamental Research Funds for the Central Universities under Grant No. 2652021001. This work is also supported by the Melbourne-Chindia Cloud Computing (MC3) Research Network.
Copyright information
© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Zhang, H., Sun, D., Sajjanhar, A., Buyya, R. (2022). A Data Stream Prediction Strategy for Elastic Stream Computing Systems. In: Xiang, W., Han, F., Phan, T.K. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 413. Springer, Cham. https://doi.org/10.1007/978-3-030-93479-8_9
DOI: https://doi.org/10.1007/978-3-030-93479-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93478-1
Online ISBN: 978-3-030-93479-8
eBook Packages: Computer Science, Computer Science (R0)