ABSTRACT
Surveys in game-related research frequently rely on free-form text input. This offers flexibility but also produces dirty data, including spelling errors, missing series titles, and unofficial yet popular abbreviations entered by users. Resolving these anomalies manually is impractical and resource-intensive. To address this issue, a fuzzy string matching-based game mapping system was designed and evaluated using 1,096 user-entered game titles. GMap-R, a real-time autocomplete system that aids users in entering game titles, was also created and evaluated using 150 game titles provided by 30 participants, each of whom listed their five favorite games twice. With GMap-R, the correct mapping percentage increased to 98.67%. These preliminary evaluations indicate that the proposed strategy can significantly improve the cleansing and entry of free-form game titles. In turn, this helps to conserve resources when collecting unsupervised data through online studies.
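To make the idea concrete, the following is a minimal sketch of how a free-form entry might be mapped to a canonical game title with fuzzy string matching, plus a simple suggestion helper in the spirit of an autocomplete field. It is not the system described above: the catalogue, the abbreviation table, the function names `map_title` and `suggest`, and the 0.6 similarity cutoff are all assumptions chosen for illustration, and the matching relies on Python's standard `difflib` module.

```python
from difflib import get_close_matches

# Hypothetical catalogue of canonical titles; a real system would
# query a game database instead of a hard-coded list.
CANONICAL_TITLES = [
    "Counter-Strike: Global Offensive",
    "The Legend of Zelda: Breath of the Wild",
    "The Witcher 3: Wild Hunt",
    "Minecraft",
]

# Hypothetical lookup table for unofficial yet popular abbreviations.
ABBREVIATIONS = {
    "csgo": "Counter-Strike: Global Offensive",
    "botw": "The Legend of Zelda: Breath of the Wild",
}


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def map_title(raw, cutoff=0.6):
    """Return the best canonical title for a raw entry, or None."""
    key = normalize(raw)
    if not key:
        return None
    # Abbreviations are resolved first, ignoring spaces and colons.
    compact = key.replace(" ", "").replace(":", "")
    if compact in ABBREVIATIONS:
        return ABBREVIATIONS[compact]
    # Otherwise pick the most similar title above the assumed cutoff.
    lookup = {normalize(title): title for title in CANONICAL_TITLES}
    matches = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def suggest(partial, limit=5):
    """Return up to `limit` titles that contain the partial entry."""
    key = normalize(partial)
    if not key:
        return []
    hits = [t for t in CANONICAL_TITLES if key in normalize(t)]
    return hits[:limit]


print(map_title("Mincraft"))  # misspelled entry
print(map_title("CS:GO"))     # unofficial abbreviation
print(suggest("wit"))         # partial entry while typing
```

A character-level similarity cutoff like this one tends to miss entries that omit most of a long title, which is one reason a production system would likely combine several matching strategies rather than rely on a single ratio.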
Index Terms
- GMap: Supporting Free-Form Text Entry of Game Titles in Games User Research