Using gameplay videos for detecting issues in video games

Guglielmi, Emanuela; Scalabrino, Simone; Bavota, Gabriele; Oliveto, Rocco

doi:10.1007/s10664-023-10365-0

Published: 07 October 2023

Volume 28, article number 136, (2023)

Empirical Software Engineering

Emanuela Guglielmi ORCID: orcid.org/0000-0002-5443-1303¹,
Simone Scalabrino¹,
Gabriele Bavota² &
…
Rocco Oliveto¹

Abstract

Context

The game industry has grown steadily in recent years. Every day, millions of people play video games, not only as a hobby, but also in professional competitions (e.g., e-sports or speed-running) or to make a living by entertaining others (e.g., streamers). Streamers produce a large number of gameplay videos every day, in which they also comment live on what they experience. But no software and, thus, no video game is perfect: streamers may encounter several problems (such as bugs, glitches, or performance issues) while they play. It is also unlikely that they explicitly report such issues to developers. These problems may negatively impact the user’s gaming experience and, in turn, harm the reputation of the game and of its producer.

Objective

In this paper, we propose and empirically evaluate GELID, an approach for automatically extracting relevant information from gameplay videos by (i) identifying video segments in which streamers experienced anomalies; (ii) categorizing them based on their type (e.g., logic or presentation); and clustering them based on (iii) the context in which they appear (e.g., level or game area) and (iv) the specific issue type (e.g., game crashes).
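To make the data flow between the four steps concrete, the following is a minimal sketch in Python. It is not the GELID implementation: the keyword lists, the `Segment` fields, the category names, and every function name are hypothetical, and simple keyword matching and grouping stand in for the segmentation, classification, and clustering techniques that the paper actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical hint words; the abstract does not specify GELID's real features.
ANOMALY_HINTS = ("bug", "glitch", "crash", "lag", "stuck")
CATEGORY_HINTS = {
    "logic": ("crash", "stuck", "bug"),
    "presentation": ("glitch", "texture", "flicker"),
    "performance": ("lag", "fps"),
}


@dataclass
class Segment:
    video_id: str
    start_s: int
    text: str      # streamer commentary during the segment
    context: str   # e.g., level or game area (assumed to be known here)
    category: str = "unknown"


def issue_key(segment):
    """Return the first anomaly hint found in the commentary, or None."""
    text = segment.text.lower()
    return next((h for h in ANOMALY_HINTS if h in text), None)


def find_anomalous_segments(segments):
    """Step (i): keep segments whose commentary hints at an anomaly."""
    return [s for s in segments if issue_key(s) is not None]


def categorize(segment):
    """Step (ii): assign the first category whose hint words match."""
    text = segment.text.lower()
    for category, hints in CATEGORY_HINTS.items():
        if any(h in text for h in hints):
            return category
    return "unknown"


def group_by(segments, key):
    """Steps (iii) and (iv): plain grouping as a stand-in for clustering."""
    groups = defaultdict(list)
    for s in segments:
        groups[key(s)].append(s)
    return dict(groups)


def run_pipeline(segments):
    """Return a nested mapping: context -> specific issue -> segments."""
    anomalous = find_anomalous_segments(segments)
    for s in anomalous:
        s.category = categorize(s)
    by_context = group_by(anomalous, lambda s: s.context)
    return {ctx: group_by(segs, issue_key) for ctx, segs in by_context.items()}
```

Calling `run_pipeline` on a list of `Segment` objects yields, for each game context, the anomalous segments grouped by the specific issue they mention, with each segment also carrying its assigned category.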

Method

We manually defined a training set for step 2 of GELID (categorization) and a test set for validating the four components of GELID in isolation. In total, we manually segmented, labeled, and clustered 170 videos related to 3 video games, defining a dataset containing 604 segments.

Results

While in steps 1 (segmentation) and 4 (specific issue clustering) GELID achieves satisfactory results, it shows limitations on step 3 (game context clustering) and, above all, step 2 (categorization).

Data Availability

All the datasets produced and the scripts implemented to obtain the results reported in this paper (including our implementation of GELID) are available in our replication package (Guglielmi et al. 2023).

Notes

https://twitch.tv
https://youtu.be/ybvXzSLy9Ew?t=1448
https://steamcommunity.com/
https://developers.google.com/youtube/v3
http://www.cs.waikato.ac.nz/ml/weka/
33 and 48 in the training set, 31 and 4 in the test set for performance and balance, respectively.
https://youtu.be/1duizy5DSOg?t=1540
https://youtu.be/eDQIdqDC-sc?t=239
Note that the results differ from the ones reported in Table 6 because we did not run any preprocessing step here.

References

Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) Optics: Ordering points to identify the clustering structure. ACM Sigmod Rec 28(2):49–60
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc Ser B (Methodol) 57(1):289–300
Bradley AP (1997) The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recogn 30(7):1145–1159
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357
Chen N, Lin J, Hoi SC, Xiao X, Zhang B (2014) Ar-miner: mining informative reviews for developers from mobile app marketplace. In: Proceedings of the 36th international conference on software engineering, pp 767–778
Choudhury S, Bhowal A (2015) Comparative analysis of machine learning algorithms along with classifiers for network intrusion detection. In: 2015 International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials (ICSTM), IEEE, pp 89–95
Cliff N (1993) Dominance statistics: Ordinal analyses to answer ordinal questions. Psychol Bull 114(3):494
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Measur 20(1):37–46
Ester M, Kriegel HP, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: kdd, vol 96, pp 226–231
Flach PA (2016) Roc analysis. In: Encyclopedia of machine learning and data mining, Springer, pp 1–8
Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40
Gnanambal S, Thangaraj M, Meenatchi V, Gayathri V (2018) Classification algorithms with attribute selection: an evaluation study using weka. Int J Adv Netw Appl 9(6):3640–3644
Guglielmi E, Scalabrino S, Bavota G, Oliveto R (2022) Towards using gameplay videos for detecting issues in video games. arXiv preprint arXiv:2204.04182
Guglielmi E, Scalabrino S, Bavota G, Oliveto R (2023) Replication package of "using gameplay videos for detecting issues in video games". https://figshare.com/s/3de4d6958a57073dfa1b
Guzdial M, Shah S, Riedl M (2018) Towards automated let’s play commentary. arXiv preprint arXiv:1809.09424
Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B (1998) Support vector machines. IEEE Intell Syst Their Appl 13(4):18–28
Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, IEEE, vol 1, pp 278–282
Jones SE (2008) The meaning of video games: Gaming and textual strategies. Routledge
Karvelis P, Gavrilis D, Georgoulas G, Stylios C (2018) Topic recommendation using doc2vec. In: 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, pp 1–6
Lewis C, Whitehead J, Wardrip-Fruin N (2010) What went wrong: a taxonomy of video game bugs. In: Proceedings of the fifth international conference on the foundations of digital games, pp 108–115
Li C, Gandhi S, Harrison B (2019) End-to-end let’s play commentary generation using multi-modal video representations. In: Proceedings of the 14th International Conference on the Foundations of Digital Games, pp 1–7
Lin D, Bezemer CP, Hassan AE (2017) Studying the urgent updates of popular games on the steam platform. Empir Softw Eng 22:2095–2126
Lin D, Bezemer CP, Hassan AE (2019) Identifying gameplay videos that exhibit bugs in computer games. Empir Softw Eng 24(6):4006–4033
MacFarland TW, Yates JM (2016) Mann–Whitney U test. In: Introduction to nonparametric statistics for the biological sciences using R, pp 103–132
Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics pp 50–60
Miller GA (1995) Wordnet: a lexical database for english. Commun ACM 38(11):39–41
Murphy-Hill E, Zimmermann T, Nagappan N (2014) Cowboys, ankle sprains, and keepers of quality: How is video game development different from software development? In: Proceedings of the 36th International Conference on Software Engineering, pp 1–11
Ozkok FO, Celik M (2017) A new approach to determine eps parameter of dbscan algorithm. Int J Intell Syst Appl Eng 5(4):247–251
Python (2023a) Opencv. https://opencv.org, [Online]
Python (2023b) spacy. https://spacy.io/, [Online]
Python (2023c) Video-kf. https://pypi.org/project/video-kf/, [Online]
Ramchoun H, Ghanou Y, Ettaouil M, Janati Idrissi MA (2016) Multilayer perceptron: Architecture optimization and training. Int J Interact Multimed Artif Intell 4(1):26–30
Rong X (2014) word2vec parameter learning explained. arXiv preprint arXiv:1411.2738
Santos RE, Magalhães CV, Capretz LF, Correia-Neto JS, da Silva FQ, Saher A (2018) Computer games are serious business and so is their quality: particularities of software testing in game development from the perspective of practitioners. In: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp 1–10
Scalabrino S, Bavota G, Russo B, Di Penta M, Oliveto R (2017) Listening to the crowd for the release planning of mobile apps. IEEE Trans Software Eng 45(1):68–86
Shah S, Guzdial M, Riedl MO (2019) Automated let’s play commentary. arXiv preprint arXiv:1909.02195
Souček T, Lokoč J (2020) Transnet v2: An effective deep network architecture for fast shot transition detection
Steam (2023a) Conan exiles. https://store.steampowered.com/app/440900/Conan_Exiles/
Steam (2023b) Dayz. https://store.steampowered.com/app/221100/DayZ/
Steam (2023c) New world. https://store.steampowered.com/app/1063730/New_World/
Tang S, Feng L, Kuang Z, Chen Y, Zhang W (2018) Fast video shot transition localization with deep structured models. In: Asian Conference on Computer Vision, Springer, pp 577–592
Tian Y, Lo D, Lawall J (2014) Sewordsim: Software-specific word similarity database. In: Companion Proceedings of the 36th International Conference on Software Engineering, pp 568–571
Toy EJ, Kummaragunta JV, Yoo JS (2018) Large-scale cross-country analysis of steam popularity. In: 2018 International Conference on Computational Science and Computational Intelligence (CSCI), IEEE, pp 1054–1058
Truelove A, de Almeida ES, Ahmed I (2021) We’ll fix it in post: What do bug fixes in video game update notes tell us? In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), IEEE, pp 736–747
Wan T, Jun H, Zhang H, Pan W, Hua H (2015) Kappa coefficient: a popular measure of rater agreement. Shanghai Arch Psychiatry 27(1):62
Wang Z, Bovik A, Sheikh H, Simoncelli E (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612. https://doi.org/10.1109/TIP.2003.819861
Wen Z, Tzerpos V (2004) An effectiveness measure for software clustering algorithms. In: Proceedings. 12th IEEE International Workshop on Program Comprehension, 2004., IEEE, pp 194–203
Zhang Y, Jin R, Zhou ZH (2010) Understanding bag-of-words model: a statistical framework. Int J Mach Learn Cybern 1(1):43–52

Author information

Authors and Affiliations

STAKE Lab, University of Molise, C.da Fonte Lappone, 86090, Pesche, IS, Italy
Emanuela Guglielmi, Simone Scalabrino & Rocco Oliveto
Università Della Svizzera Italiana, Via Giuseppe Buffi 13, 6900, Lugano, Switzerland
Gabriele Bavota

Corresponding author

Correspondence to Emanuela Guglielmi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Jin Guo, Raula Gaikovina Kula

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Registered Reports

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guglielmi, E., Scalabrino, S., Bavota, G. et al. Using gameplay videos for detecting issues in video games. Empir Software Eng 28, 136 (2023). https://doi.org/10.1007/s10664-023-10365-0

Accepted: 27 June 2023
Published: 07 October 2023
DOI: https://doi.org/10.1007/s10664-023-10365-0
