Abstract
Accurate prediction of online live streaming traffic is crucial both for optimizing network resource allocation to improve viewer experience and for assessing the factors that affect audience retention and the overall sustainability of streaming platforms. This study comprehensively evaluates machine learning methods for online live streaming traffic prediction using extensive hourly traffic data. The dataset comprises 1,385,444,808 live streaming entries from 30,690,841 unique streamers on the Douyu platform, spanning December 2020 to April 2023. The methods are compared under various experimental settings. Among the ten methods considered, the Bidirectional Long Short-Term Memory, Extra Tree (ET), and Random Forest models perform consistently and robustly. In particular, the ET model predicts daily viewer counts with outstanding accuracy and precision when pertinent features are incorporated. For large-scale, long-term live streaming data, machine learning approaches outperform traditional time series forecasting methods. Our analysis also underscores the importance of incorporating streamer count for improving the accuracy of network traffic prediction. Interestingly, hourly features have limited impact, and in certain scenarios their inclusion may even reduce the models' predictive performance.
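As a rough illustration of how hourly traffic entries might be rolled up into daily rows that pair a viewer count with a streamer count, consider the sketch below. It is not the authors' pipeline: the record layout `(timestamp, streamer_id, viewers)`, the function name `build_daily_rows`, and the definition of daily viewers as a plain sum of hourly counts are all assumptions made for this example, since the abstract does not specify the data schema or the feature definitions.

```python
from collections import defaultdict
from datetime import datetime


def build_daily_rows(hourly_records):
    """Aggregate hourly records into one row per calendar day.

    hourly_records: iterable of (timestamp, streamer_id, viewers) tuples.
    This layout is hypothetical; the real Douyu schema is not given.
    """
    viewers_by_day = defaultdict(int)
    streamers_by_day = defaultdict(set)

    for ts, streamer_id, viewers in hourly_records:
        day = ts.date()
        # Assumption: daily viewers are the sum of the hourly viewer counts.
        viewers_by_day[day] += viewers
        # Each streamer is counted once per day, however many hours they stream.
        streamers_by_day[day].add(streamer_id)

    rows = []
    for day in sorted(viewers_by_day):
        rows.append({
            "date": day,
            "streamer_count": len(streamers_by_day[day]),  # candidate feature
            "daily_viewers": viewers_by_day[day],          # prediction target
        })
    return rows
```

Rows of this shape could then be passed to any regression model; which additional features count as "pertinent" is left open here, as the abstract does not list them.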
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, H., Guo, S., Lai, S., Lu, X. (2024). Comparison of Prediction Methods on Large-Scale and Long-Term Online Live Streaming Data. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2023. Communications in Computer and Information Science, vol 2017. Springer, Singapore. https://doi.org/10.1007/978-981-97-0837-6_3
DOI: https://doi.org/10.1007/978-981-97-0837-6_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0836-9
Online ISBN: 978-981-97-0837-6
eBook Packages: Computer Science, Computer Science (R0)