Abstract
In online systems such as e-commerce platforms, many users rely on reviews or comments written by previous consumers when making decisions, but they have limited time to read a large number of reviews. A review summary that covers all the important features in user-generated reviews is therefore desirable. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be done with text summarization, which has two main types of approach: extractive and abstractive. Both can handle supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter avoids redundancy but can usually handle only short sequences. Moreover, both approaches may neglect sentiment information. To address these issues, we propose comprehensive Review Summary Generation frameworks for the supervised and unsupervised scenarios. We design two different preprocessing models, re-ranking and selecting, to identify the important sentences while preserving the users’ sentiment in the original reviews. These sentences can then be used to generate review summaries with text summarization methods. Experimental results on seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets from Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.
Index Terms
- Review Summary Generation in Online Systems: Frameworks for Supervised and Unsupervised Scenarios