ABSTRACT
App stores enable users to share their experiences directly with developers in the form of app reviews. Recent studies have shown that this user feedback is a valuable source of information for requirements extraction, which encourages app developers to leverage reviews for app updates and maintenance. Follow-up studies proposed automated techniques to help developers filter the large daily volume of noisy reviews, summarize their content, or both. However, all previous studies approached app review classification and summarization as separate tasks, which complicated the process and introduced unnecessary overhead. Moreover, none of those approaches explored the hierarchical relationships that exist between the labels of app reviews as a means of building a more accurate model. In this work, we propose Hierarchical Multi-Kernel Relevance Vector Machines (HMK-RVM), a Bayesian multi-kernel technique that integrates app review classification and summarization in a unified model. It also provides insights into the learned patterns and the underlying data, which makes the model easier to interpret. We evaluated the proposed approach on two real-world datasets and showed that, in addition to the gained insights, the model produces results that are equal to or better than the state of the art.
Index Terms
- Hierarchical Bayesian multi-kernel learning for integrated classification and summarization of app reviews