DOI: 10.1145/3540250.3549174
research-article

Hierarchical Bayesian multi-kernel learning for integrated classification and summarization of app reviews

Published: 09 November 2022

ABSTRACT

App stores enable users to share their experiences directly with developers in the form of app reviews. Recent studies have shown that user feedback is a valuable source of information for requirements extraction, which encourages app developers to leverage reviews for app update and maintenance purposes. Follow-up studies proposed automated techniques to help developers filter the large volume of noisy reviews received daily and/or summarize their content. However, all previous studies approached app review classification and summarization as separate tasks, which complicated the process and introduced unnecessary overhead. Moreover, none of those approaches explored the hierarchical relationships that exist between the labels of app reviews as a way to build a more accurate model. In this work, we propose Hierarchical Multi-Kernel Relevance Vector Machines (HMK-RVM), a Bayesian multi-kernel technique that integrates app review classification and summarization in a unified model. It also provides insights into the learned patterns and the underlying data, which makes the model easier to interpret. We evaluated the proposed approach on two real-world datasets and showed that, in addition to providing these insights, the model produces results equal to or better than the state of the art.

Published in

ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2022
1822 pages
ISBN: 9781450394130
DOI: 10.1145/3540250

Copyright © 2022 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 112 of 543 submissions, 21%