GL-GCN: Global and Local Dependency Guided Graph Convolutional Networks for aspect-based sentiment classification

https://doi.org/10.1016/j.eswa.2021.115712

Highlights

  • Exploit the syntactic dependency structure to mine sentence local structure information.

  • Construct a word-document graph to explore global word dependency information.

  • Propose a novel architecture to encode both global and local structure signals.

  • Conduct extensive experiments to verify the effectiveness of the proposed approach.

Abstract

Aspect-based sentiment classification, which aims at identifying the sentiment polarity of a sentence towards a specified aspect, has become a crucial task in sentiment analysis. Existing methods achieve satisfactory results, but they mainly focus on exploiting the local structure information of a given sentence, such as locality, sequentiality or syntactic dependency constraints within the sentence. Recently, research that utilizes global dependency information has attracted increasing interest and has significantly boosted the performance of text classification. In this paper, we introduce both global and local structure information into the task of aspect-based sentiment classification and propose a novel approach, Global and Local Dependency Guided Graph Convolutional Networks (GL-GCN). In particular, we exploit the syntactic dependency structure as well as sentence sequential information (e.g., the output of a BiLSTM) to mine the local structure information of a sentence. In addition, we construct a word-document graph over the entire corpus to reveal the global dependency information between words. An attention mechanism is then leveraged to effectively fuse the global and local dependency structure signals. Extensive experiments on five benchmark datasets, evaluated in terms of both Accuracy and F1-Score, show that the proposed framework outperforms state-of-the-art methods for aspect-based sentiment classification. The model is implemented in PyTorch and trained on a GeForce RTX 2080 Ti GPU.

Introduction

With the rapid growth of online review sites such as Amazon, Yelp and IMDB, aspect-based sentiment classification has become a crucial topic in recent years. The main challenge of aspect-based sentiment classification is to identify the sentiment polarity (e.g., positive, neutral, or negative) towards a specified target when a sentence contains multiple targets. Fig. 1 shows an example with multiple sentiment polarities, where the sentiment polarity towards the food is positive, while that towards the atmosphere is negative.

In recent years, a number of efforts have been made towards effectively modeling the semantic relatedness between context words and the aspects within a sentence. Liu and Zhang (2017) and Wang, Huang, Zhao, and Zhu (2016) propose to utilize attention mechanisms (Bahdanau, Cho, & Bengio, 2015) together with Recurrent Neural Networks (RNNs) (Bengio et al., 2003, Hochreiter and Schmidhuber, 1997) for aspect-based sentiment classification. These models assign a positive weight to each context word, which reflects the importance of that word for determining the sentiment polarity of the specified target.

Fan et al. (2018) find that the sentiment of an aspect is usually determined by key phrases rather than individual words. Based on this observation, Li, Bing, Lam, and Shi (2018) and Xue and Li (2018) propose to employ Convolutional Neural Networks (CNNs) (Lecun, Bottou, Bengio, & Haffner, 1998) to capture multi-word phrases via convolution operations over word sequences. As CNNs and RNNs prioritize locality and sequentiality (Battaglia, Hamrick, Bapst, et al., 2018), these models can effectively capture semantic and syntactic information in local consecutive word sequences. However, they lack a mechanism to account for long-range word dependencies and may therefore pick up irrelevant clues when determining aspect sentiment. Note that "long-range" is used here in a relative sense: although CNNs can model word dependencies within a sentence, they only capture the local phrase-level dependencies detected by their filters (e.g., a sliding window). In recent years, the dependency tree has received considerable attention because it can capture the relationship between two words that are distant in the sequence but syntactically related. Many works therefore leverage the dependency tree to address this issue. For example, Zhang, Li, and Song (2019) build a Graph Convolutional Network (GCN) over the dependency tree of a sentence and exploit syntactic information to bridge long-range word dependencies.
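To make the idea of a GCN over a dependency tree concrete, the following is a minimal sketch in plain Python rather than the implementation of any cited work. The example sentence, the hand-written dependency edges, the degree normalization and the identity feature and weight matrices are all assumptions chosen for illustration; a real system would obtain the edges from a parser and learn the weights.

```python
def normalized_adjacency(num_nodes, edges):
    """Undirected adjacency with self-loops, each row divided by its degree."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop keeps the word's own features
    for head, dependent in edges:
        adj[head][dependent] = 1.0
        adj[dependent][head] = 1.0  # treat the tree as undirected (assumption)
    return [[value / sum(row) for value in row] for row in adj]


def matmul(a, b):
    """Plain-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(adj @ features @ weight)."""
    hidden = matmul(matmul(adj, features), weight)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical sentence and hand-written (head, dependent) edges, not parser output.
tokens = ["the", "food", "that", "we", "ordered", "was", "great"]
dependency_edges = [(1, 0), (6, 1), (1, 4), (4, 2), (4, 3), (6, 5)]

# One-hot features and an identity weight make the propagation easy to read.
identity = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(7)]
adj = normalized_adjacency(len(tokens), dependency_edges)
out = gcn_layer(adj, identity, identity)

print(out[1][6])  # 0.25: "great" reaches "food" in one hop
print(out[1][3])  # 0.0: "we" is not a tree neighbor of "food"
```

In this toy sentence "food" and "great" are five positions apart in the word sequence, yet a single graph convolution already mixes their features because they share a dependency edge.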

Although the aforementioned works are promising and achieve satisfactory results, they mainly rely on local structure dependency information, such as locality, sequentiality or syntactic dependency constraints within a sentence, while global structure dependency information is largely ignored. Specifically, existing methods lack an explicit model of the global dependency signal, which is latent in the entire corpus and reveals the global relationships between words.

Recent research (Peng et al., 2018, Yao et al., 2019) has shown that exploiting global structure dependency information can significantly improve the performance of text classification. Peng et al. (2018) convert a document into a word co-occurrence graph and then apply graph convolution operations to it. Yao et al. (2019) build a single text graph for an entire corpus, in which the nodes are words and documents. An edge between two word nodes is based on word co-occurrence, and an edge between a word node and a document node is weighted by TF-IDF. A GCN is then used to capture high-order neighborhood information.
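A corpus-level word-document graph of the kind described above could be assembled roughly as follows. This is a sketch under stated assumptions, not the construction used by the cited works or by GL-GCN: the whitespace tokenizer, the window size, the use of positive pointwise mutual information (PMI) as the co-occurrence weight, and the function name `build_text_graph` are all illustrative choices.

```python
import math
from collections import Counter
from itertools import combinations


def build_text_graph(docs, window=3):
    """Return weighted edges {(node_a, node_b): weight} of a word-document graph."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    edges = {}

    # Word-document edges, weighted by TF-IDF (zero-weight edges are skipped).
    doc_freq = Counter(word for toks in tokenized for word in set(toks))
    for idx, toks in enumerate(tokenized):
        for word, count in Counter(toks).items():
            tf = count / len(toks)
            idf = math.log(n_docs / doc_freq[word])
            if idf > 0:
                edges[(word, f"doc_{idx}")] = tf * idf

    # Word-word edges from sliding-window co-occurrence.
    # Assumption: keep only pairs with positive PMI.
    windows = []
    for toks in tokenized:
        if len(toks) <= window:
            windows.append(toks)
        else:
            windows.extend(toks[i:i + window] for i in range(len(toks) - window + 1))
    word_count = Counter(word for win in windows for word in set(win))
    pair_count = Counter(
        pair for win in windows for pair in combinations(sorted(set(win)), 2)
    )
    total = len(windows)
    for (a, b), count in pair_count.items():
        pmi = math.log(count * total / (word_count[a] * word_count[b]))
        if pmi > 0:
            edges[(a, b)] = pmi
    return edges


# Tiny hypothetical corpus.
graph = build_text_graph(["great food", "terrible service", "great service"])
for edge, weight in sorted(graph.items()):
    print(edge, round(weight, 3))
```

The resulting edge dictionary can be turned into an adjacency matrix whose rows and columns index both words and documents, which is the input a GCN needs to propagate information across the whole corpus.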

In this paper, we propose Global and Local Dependency Guided Graph Convolutional Networks (GL-GCN), for aspect-based sentiment classification. In particular, we leverage two kinds of GCNs to learn different dependency structure information: (1) One GCN is leveraged to capture global dependency structure information via exploring the entire corpus; (2) Another GCN is utilized to model local dependency structure information given in each sentence. The framework of the proposed model is shown in Fig. 2.

We conduct extensive experiments on five datasets, i.e., TWITTER, LAPTOP, REST14, REST15, and REST16. All datasets are publicly available and have been widely used for aspect-based sentiment classification. Experimental results demonstrate that the proposed GL-GCN approach can effectively model both global and local dependency structure information, and consistently outperforms the state-of-the-art baseline methods by a large margin.
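For readers unfamiliar with the two evaluation measures, the sketch below computes Accuracy and an F1-Score for a three-class sentiment task. The macro averaging over classes is an assumption made here for illustration, since this excerpt does not state which F1 variant is used; the labels and predictions are made up.

```python
def accuracy_and_macro_f1(gold, pred, labels):
    """Return (accuracy, macro-averaged F1) for parallel label lists."""
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    # Macro averaging (assumption): every class counts equally.
    return accuracy, sum(f1_scores) / len(labels)


# Made-up gold labels and predictions.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
acc, macro_f1 = accuracy_and_macro_f1(gold, pred, ["positive", "neutral", "negative"])
print(acc, round(macro_f1, 4))  # 0.75 0.7778
```

Macro averaging weights every class equally, which matters when one polarity, often neutral, is much rarer than the others.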

In summary, this work makes the following main contributions:

  • We propose a novel aspect-based sentiment classification approach that exploits both global and local dependency structure signals to better address the issue of long-range multi-word dependencies.

  • We propose a novel architecture, which consists of two kinds of GCNs, to effectively encode both global and local structure signals. Moreover, a gating mechanism is leveraged to adaptively fuse these two kinds of information.

  • We conduct extensive experiments on five datasets, demonstrating that GL-GCN improves the embedding quality by incorporating global structure information and thereby achieves state-of-the-art performance.
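The excerpt does not spell out how the global and local signals are fused, so the following is only a sketch of one common pattern, not the paper's formulation. It scores each representation with a scoring vector (`w_global` and `w_local`, both hypothetical and learned in practice) and mixes the two with softmax weights. With exactly two inputs, this attention-style weighting is equivalent to a sigmoid gate on the score difference.

```python
import math


def fuse(h_global, h_local, w_global, w_local):
    """Mix two vectors with softmax weights over their scalar scores."""
    score_g = sum(w * h for w, h in zip(w_global, h_global))
    score_l = sum(w * h for w, h in zip(w_local, h_local))
    top = max(score_g, score_l)  # subtract the max for numerical stability
    exp_g = math.exp(score_g - top)
    exp_l = math.exp(score_l - top)
    alpha = exp_g / (exp_g + exp_l)  # weight placed on the global vector
    fused = [alpha * g + (1.0 - alpha) * l for g, l in zip(h_global, h_local)]
    return fused, alpha


# Hypothetical 2-dimensional word representations and scoring vectors.
h_global, h_local = [1.0, 0.0], [0.0, 1.0]
fused, alpha = fuse(h_global, h_local, [2.0, 0.0], [0.0, 0.0])
print(round(alpha, 4), [round(v, 4) for v in fused])
```

Because `alpha` depends on the input vectors themselves, the balance between global and local information can differ from word to word, which is what "adaptively" refers to.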

The rest of this paper is organized as follows. Section 2 presents the related work. Section 3 introduces the proposed approach GL-GCN to aspect-based sentiment classification. Section 4 presents extensive experiments to evaluate the effectiveness of our approach, and discusses the effectiveness of involving global dependency structure information. Section 5 concludes the paper.

Related work

In this section, we briefly review the related work in the following two categories: Graph Convolutional Networks and Aspect-based Sentiment Classification.

Our approach

The main novelty of our proposed approach GL-GCN is that it exploits both global and local dependency structure signals to better address the issue of long-range multi-word dependencies. In this section, we present the details of GL-GCN. In particular, we first formulate the problem of aspect-based sentiment classification and introduce graph convolutional networks, and then present the framework of our GL-GCN model together with its global dependency module and local dependency module. At

Experiments

In this section, we compare our method with a range of competitive baselines on five real-world datasets.

Conclusion

In this paper, we investigate the aspect-based sentiment classification problem and propose a novel model, Global and Local Dependency Guided Graph Convolutional Networks (GL-GCN), to address it. Based on a text graph built on the entire corpus, we apply a graph convolutional network to mine global semantic dependency relations between words. Further, a dependency tree built on each sentence is leveraged to extract local syntactic dependency relations between words. Extensive experiments are conducted on five

CRediT authorship contribution statement

Xiaofei Zhu: Conceptualization, Methodology, Writing – original draft, Writing – review & editing, Supervision, Funding acquisition. Ling Zhu: Software, Investigation, Writing – original draft. Jiafeng Guo: Supervision, Project administration. Shangsong Liang: Writing – review & editing. Stefan Dietze: Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China [grant number 61722211]; the Federal Ministry of Education and Research, Germany [grant number 01LE1806A]; the Beijing Academy of Artificial Intelligence, China [grant number BAAI2019ZD0306]; and the Technology Innovation and Application Development of Chongqing, China [grant number cstc2020jscx-dxwtBX0014].

References (36)

  • Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In...
  • Battaglia, P. W., et al. (2018). Relational inductive biases, deep learning, and graph networks.
  • Bengio, Y., et al. (2003). A neural probabilistic language model. Journal of Machine Learning Research.
  • Bordes, A., Usunier, N., García-Durán, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling...
  • Bruna, J., Zaremba, W., Szlam, A., & LeCun, Y. (2014). Spectral networks and locally connected networks on graphs. In...
  • Chen, P., Sun, Z., Bing, L., & Yang, W. (2017). Recurrent attention network on memory for aspect sentiment analysis. In...
  • Cui, Y., Chen, Z., Wei, S., Wang, S., Liu, T., & Hu, G. (2017). Attention-over-attention neural networks for reading...
  • Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., & Xu, K. (2014). Adaptive recursive neural network for target-dependent...
  • Fan, C., Gao, Q., Du, J., Gui, L., Xu, R., & Wong, K.-F. (2018). Convolution-based memory network for aspect-based...
  • Gong, P., et al. (2019). Neighborhood adaptive graph convolutional network for node classification. IEEE Access.
  • Hochreiter, S., et al. (1997). Long short-term memory. Neural Computation.
  • Huang, B., et al. Aspect level sentiment classification with attention-over-attention neural networks.
  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In Proceedings of...
  • Kiritchenko, S., Zhu, X., Cherry, C., & Mohammad, S. (2014). NRC-Canada-2014: Detecting aspects and sentiment in...
  • LeCun, Y., et al. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE.
  • Levie, R., et al. (2019). CayleyNets: Graph convolutional neural networks with complex rational spectral filters. IEEE Transactions on Signal Processing.
  • Li, X., Bing, L., Lam, W., & Shi, B. (2018). Transformation networks for target-oriented sentiment classification. In...
  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph...
Prof. Dr. Xiaofei Zhu is a full professor at the College of Computer Science and Engineering, Chongqing University of Technology. He received his Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences (ICT-CAS) in 2012. He then spent four years as a Postdoctoral Research Fellow at the L3S Research Center, Leibniz University Hannover. His research interests include web search, data mining and machine learning, and he has published more than 30 papers in international conferences and journals, including top venues such as SIGIR, WWW, CIKM and TKDE. He won the Best Paper Award of CIKM (2011). He serves as an area chair, program committee member and editorial board member for numerous international conferences and journals, including SIGIR, AAAI and CIKM.

Ling Zhu is a Master's candidate at the College of Computer Science and Engineering, Chongqing University of Technology. She received her B.S. degree in Computer Science from Neijiang Normal University in 2014. Her main research interests are machine learning, text mining and sentiment analysis.

Jiafeng Guo is currently a Professor at the Institute of Computing Technology, Chinese Academy of Sciences. He received his Ph.D. in Computer Software and Theory from the University of Chinese Academy of Sciences, Beijing, China, in 2009. He has worked on a number of topics related to web search and data mining, including query representation and understanding, learning to rank, and text modeling. His current research focuses on representation learning and neural models for information retrieval and filtering. He has published more than 80 papers in top conferences and journals such as SIGIR, WWW, CIKM, IJCAI, and TKDE. His work on information retrieval has received the Best Paper Award at ACM CIKM (2011), the Best Student Paper Award at ACM SIGIR (2012) and the Best Full Paper Runner-up Award at ACM CIKM (2017). Moreover, he has served as a PC member for prestigious conferences including SIGIR, WWW, KDD, WSDM, and ACL, and as an associate editor of TOIS.

Shangsong Liang is an associate professor at the School of Data and Computer Science, Sun Yat-sen University. He received a Ph.D. degree from the University of Amsterdam in 2014. His research interests lie in the fields of Information Retrieval, Data Mining, Artificial Intelligence and Deep Learning. He has published over 50 peer-reviewed papers, most of which are in top-tier venues such as SIGIR, KDD, WSDM, AAAI, CIKM, ECIR, IEEE TKDE, ACM TOIS and Information Processing and Management. He has received various awards and honors, such as the SIGIR 2017 Outstanding Reviewer Award and an Outstanding Contribution award for instructing a Data Mining course from the International Petroleum Engineers, Kingdom of Saudi Arabia Section.

Prof. Dr. Stefan Dietze is a professor at the University of Düsseldorf and the scientific director of Knowledge Technologies for the Social Sciences (WTS), GESIS. He received his Ph.D. (Dr. rer. nat. in Applied Computer Science) from the Institute for Computer Science of the University of Potsdam, Germany, in 2004. His professional career led him to the Fraunhofer Institute for Software and Systems Engineering (ISST) in Berlin. He then spent five years as a Postdoctoral Research Fellow at the Knowledge Media Institute (KMi) of The Open University in Milton Keynes, UK. Before joining GESIS, he led a research group at the L3S Research Center at Leibniz University Hannover. His research interests include semantic web technologies, web intelligence, semantic search, and information retrieval. He has published numerous papers in prestigious journals and international conferences, including SIGIR, WWW, CIKM, ESWC and CHIIR. He has been general, poster and track chair, as well as program committee and editorial board member, of numerous international conferences and journals.