Review Summary Generation in Online Systems: Frameworks for Supervised and Unsupervised Scenarios

Published: 13 May 2021

Abstract

In online systems such as e-commerce platforms, many users rely on reviews or comments written by previous consumers when making decisions, but they have limited time to read a large number of reviews. A review summary that covers all the important features of the user-generated reviews is therefore desirable. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be done with text summarization, which has two main types: extractive and abstractive approaches. Both can handle supervised and unsupervised scenarios, but extractive approaches may generate redundant and incoherent summaries, while abstractive approaches avoid redundancy but can usually handle only short sequences. Moreover, both may neglect sentiment information. To address these issues, we propose comprehensive Review Summary Generation frameworks for the supervised and unsupervised scenarios. We design two preprocessing models, re-ranking and selecting, that identify the important sentences while preserving the users’ sentiment in the original reviews. These sentences are then used to generate review summaries with text summarization methods. Experimental results on seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets from Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.
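
As a rough illustration of the preprocessing idea in the abstract (rank review sentences by importance, then keep a subset that preserves the sentiment mix of the original reviews), the following Python sketch may help. It is not the authors' re-ranking or selecting model: the document-frequency salience score, the proportional sentiment quota, the tiny word lexicon, and the function names `rerank` and `select` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; a real system would
# use a proper sentiment resource or a trained classifier.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(tokens):
    """Return +1, -1, or 0 from lexicon hits (ties count as neutral)."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def rerank(sentences):
    """Order sentences by average document frequency of their distinct words.

    Assumption: words shared by many review sentences signal importance.
    """
    tokenized = [set(tokenize(s)) for s in sentences]
    doc_freq = Counter(t for toks in tokenized for t in toks)

    def salience(toks):
        return sum(doc_freq[t] for t in toks) / (len(toks) or 1)

    # sorted() is stable, so equally salient sentences keep their input order.
    order = sorted(range(len(sentences)), key=lambda i: -salience(tokenized[i]))
    return [sentences[i] for i in order]


def select(sentences, k):
    """Pick up to k top-ranked sentences under a per-polarity quota.

    Each polarity present in the input gets a share of k proportional to
    its frequency (at least one slot), so minority opinions are not lost.
    """
    ranked = rerank(sentences)
    polarities = [polarity(tokenize(s)) for s in ranked]
    counts = Counter(polarities)
    quota = {p: max(1, round(k * c / len(ranked))) for p, c in counts.items()}

    chosen = []
    for sentence, p in zip(ranked, polarities):
        if len(chosen) == k:
            break
        if quota[p] > 0:
            chosen.append(sentence)
            quota[p] -= 1
    return chosen


reviews = [
    "The battery is great.",
    "Great battery and great screen.",
    "The screen is bad.",
    "Shipping was slow.",
]
for sentence in select(reviews, 2):
    print(sentence)
```

The selected sentences could then be passed to any extractive or abstractive summarizer; the quota step is one simple way to keep both positive and negative opinions in the input to that summarizer.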

      • Published in

        ACM Transactions on the Web, Volume 15, Issue 3
        August 2021
        162 pages
        ISSN: 1559-1131
        EISSN: 1559-114X
        DOI: 10.1145/3462273
        Copyright © 2021 Association for Computing Machinery.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 13 May 2021
        • Accepted: 1 January 2021
        • Revised: 1 November 2020
        • Received: 1 May 2019

        Qualifiers

        • research-article
        • Refereed
