
Demystifying Graph Neural Network Explanations

  • Conference paper
  • In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

Graph neural networks (GNNs) are quickly becoming the standard approach for learning on graph-structured data across several domains, but they lack transparency in their decision-making. Several perturbation-based approaches have been developed to provide insights into the decision-making process of GNNs. As this is an early research area, the methods and data used to evaluate the generated explanations lack maturity. We explore these existing approaches and identify common pitfalls in three main areas: (1) the synthetic data generation process, (2) evaluation metrics, and (3) the final presentation of the explanation. For this purpose, we perform an empirical study to explore these pitfalls along with their unintended consequences and propose remedies to mitigate their effects.


Notes

  1. Please refer to Appendix A for more details on GNNs and explainer methods.

  2. A 3-layer vanilla Graph Convolutional Network is used to carry out the experiments.

References

  1. Arrieta, A.B., et al.: Explainable Artificial Intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fus. 58, 82–115 (2020)


  2. Huang, Q., et al.: GraphLIME: local interpretable model explanations for graph neural networks. arXiv preprint arXiv:2001.06216 (2020)

  3. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR (2017)


  4. Lecue, F.: On the role of knowledge graphs in explainable AI. Semant. Web 11(1), 41–51 (2020)


  5. Lucic, A., et al.: CF-GNNExplainer: counterfactual explanations for graph neural networks. arXiv preprint arXiv:2102.03322 (2021)

  6. Molnar, C., Casalicchio, G., Bischl, B.: Interpretable machine learning – a brief history, state-of-the-art and challenges. In: Koprinska, I., et al. (eds.) ECML PKDD 2020. CCIS, vol. 1323, pp. 417–431. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-65965-3_28


  7. Robnik-Šikonja, M., Bohanec, M.: Perturbation-based explanations of prediction models. In: Zhou, J., Chen, F. (eds.) Human and Machine Learning. HIS, pp. 159–175. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-90403-0_9


  8. Saito, T., Rehmsmeier, M.: The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLOS ONE 10(3), e0118432 (2015)


  9. Xu, K., et al.: How powerful are graph neural networks? In: ICLR (2018)


  10. Ying, R., et al.: GNNExplainer: generating explanations for graph neural networks. In: Advances in Neural Information Processing Systems 32, p. 9240 (2019)


  11. Yuan, H., et al.: XGNN: towards model-level explanations of graph neural networks. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2020)


  12. Zhou, J., et al.: Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020)


  13. Funke, T., Khosla, M., Anand, A.: Hard masking for explaining graph neural networks (2020)


  14. Yuan, H., et al.: Explainability in graph neural networks: a taxonomic survey. arXiv preprint arXiv:2012.15445 (2020)

  15. Luo, D., et al.: Parameterized explainer for graph neural network. In: Advances in Neural Information Processing Systems (2020)


  16. Anonymous: Causal screening to interpret graph neural networks. Submitted to International Conference on Learning Representations (2021, under review). https://openreview.net/forum?id=nzKv5vxZfge

  17. Yuan, H., Yu, H., Wang, J., Li, K., Ji, S.: On explainability of graph neural networks via subgraph explorations. arXiv preprint arXiv:2102.05152 (2021)


Author information

Corresponding author

Correspondence to Anna Himmelhuber.

A Background on GNNs and Perturbation-Based Explainer Methods

For a GNN, the goal is to learn a function of features on a graph \(G=(V, E)\) with nodes \(V\) and edges \(E\). The input comprises a feature vector \(x_i\) for every node \(i\), summarized in a feature matrix \(X \in \mathbb {R}^{n \times d_{in}}\), and a representative description of the link structure in the form of an adjacency matrix \(A\). The output of a convolutional layer is a node-level latent representation matrix \(Z \in \mathbb {R}^{n \times d_{out}}\), where \(d_{out}\) is the number of output latent dimensions per node. Every convolutional layer can therefore be written as a non-linear function:

$$\begin{aligned} H^{(l+1)} = f(H^{(l)}, A), \end{aligned}$$

with \(H^{(0)} = X\) and \(H^{(L)} = Z\), \(L\) being the number of stacked layers. The vanilla GNN model employed here uses the propagation rule [3]:

$$\begin{aligned} f(H^{(l)}, A) = \sigma (\hat{D}^{-\frac{1}{2}}\hat{A}\hat{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}), \end{aligned}$$

with \(\hat{A} = A + I\), \(I\) being the identity matrix. \(\hat{D}\) is the diagonal node degree matrix of \(\hat{A}\), \(W^{(l)}\) is the weight matrix of the \(l\)-th neural network layer, and \(\sigma \) is a non-linear activation function. Taking the latent node representations \(Z\) of the last layer, we define the logits of node \(v_i\) for a node classification task as follows:

$$ \hat{y}_i = \text {softmax}(z_i W^{\top }_{c}), $$

where \(W_c \in \mathbb {R}^{d_{out} \times k}\) projects the node representations into the \(k\)-dimensional classification space.
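
To make the propagation rule concrete, the following Python sketch implements a single convolutional layer of this form and the softmax classification head with NumPy. The toy graph, the weight shapes, and the helper names (`gcn_layer`, `softmax`) are illustrative assumptions, not the implementation used in the experiments.

```python
import numpy as np

def gcn_layer(H, A, W, activation=np.tanh):
    """One layer of the propagation rule: H^{(l+1)} = sigma(D^{-1/2} A_hat D^{-1/2} H^{(l)} W^{(l)})."""
    A_hat = A + np.eye(A.shape[0])              # A_hat = A + I (add self-loops)
    d_hat = A_hat.sum(axis=1)                   # node degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))  # D_hat^{-1/2}
    return activation(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy example: 4 nodes, 3 input features, 2 latent dimensions, 2 classes.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W0 = rng.normal(size=(3, 2))    # layer weights W^{(0)}
Wc = rng.normal(size=(2, 2))    # classification weights W_c
Z = gcn_layer(X, A, W0)         # node-level latent representations Z
y_hat = softmax(Z @ Wc)         # per-node class probabilities
```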

GNNExplainer: The GNNExplainer takes a trained GNN and its prediction(s), and it returns an explanation in the form of a small subgraph of the input graph together with a small subset of node features that are most influential for the prediction. For their selection, the mutual information between the GNN prediction and the distribution of possible subgraph structures is maximized through optimizing the conditional entropy.
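
A minimal sketch of this idea for a single node, assuming a node-classification GNN exposed as `model(x, adj)` that returns per-node log-probabilities: a soft edge mask is optimized so that the masked graph preserves the original prediction while staying sparse. The interface, the hyperparameters, and the omission of the feature mask and entropy regularizer are simplifications for illustration.

```python
import torch

def explain_node(model, x, adj, node_idx, epochs=200, lr=0.01, sparsity=0.005):
    """Learn a soft edge mask whose masked graph preserves the original prediction."""
    model.eval()
    with torch.no_grad():
        target = model(x, adj)[node_idx].argmax()          # prediction to be explained
    mask_logits = torch.zeros_like(adj, requires_grad=True)
    opt = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(epochs):
        mask = torch.sigmoid(mask_logits)                  # soft mask in (0, 1) per entry
        log_probs = model(x, adj * mask)[node_idx]         # prediction on the masked graph
        loss = -log_probs[target] + sparsity * mask.sum()  # fidelity term + sparsity penalty
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (torch.sigmoid(mask_logits) * adj).detach()     # edge-importance scores
```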

CF-GNNExplainer: CF-GNNExplainer works by perturbing input data at the instance level; the instances are nodes in the graph, since the method focuses on node classification. It iteratively removes edges from the original adjacency matrix based on matrix sparsification techniques, keeping track of the perturbations that lead to a change in prediction, and returning the perturbation with the smallest change w.r.t. the number of edges, after adding different edges to the subgraph.
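
The counterfactual objective can be illustrated with a deliberately simple stand-in: a brute-force search over small sets of edge deletions that flips the model's prediction. CF-GNNExplainer itself learns the perturbation with a differentiable sparsification mask rather than enumerating deletions; the `model(x, adj)` interface and the deletion budget below are assumptions.

```python
import itertools
import torch

def counterfactual_edges(model, x, adj, node_idx, max_deletions=2):
    """Search for a small set of edge deletions that flips the prediction for node_idx."""
    with torch.no_grad():
        original = model(x, adj)[node_idx].argmax()
        rows, cols = torch.nonzero(adj, as_tuple=True)
        edges = [(int(i), int(j)) for i, j in zip(rows, cols) if i < j]  # undirected edges
        for k in range(1, max_deletions + 1):
            for subset in itertools.combinations(edges, k):              # try deleting k edges
                adj_cf = adj.clone()
                for i, j in subset:
                    adj_cf[i, j] = adj_cf[j, i] = 0.0
                if model(x, adj_cf)[node_idx].argmax() != original:
                    return subset                                        # smallest flip found
    return None                                                          # no counterfactual within budget
```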

ZORRO: ZORRO employs discrete masks to identify important input nodes and node features through a greedy algorithm, where nodes or node features are selected step by step. The goodness of the explanation is measured by the expected deviation from the prediction of the underlying model. A subgraph of the node’s computational graph and its set of features are relevant for a classification decision if the expected classifier score remains nearly the same when randomizing the remaining features.
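
A sketch of the greedy selection loop, assuming the same `model(x, adj)` interface: nodes are added one at a time, each time keeping the candidate whose inclusion best preserves the original prediction when the features of all unselected nodes are randomized. The Gaussian noise baseline, the fidelity threshold `tau`, and the restriction to node selection (ZORRO also selects features) are illustrative simplifications.

```python
import torch

def fidelity(model, x, adj, node_idx, selected, target, samples=10):
    """Expected probability of the original class when unselected nodes' features are randomized."""
    scores = []
    for _ in range(samples):
        x_pert = torch.randn_like(x)            # randomize all node features ...
        idx = list(selected)
        if idx:
            x_pert[idx] = x[idx]                # ... except those of the selected nodes
        scores.append(model(x_pert, adj)[node_idx].softmax(-1)[target])
    return torch.stack(scores).mean().item()

def greedy_node_mask(model, x, adj, node_idx, tau=0.9):
    """Greedily add nodes until the explanation reaches fidelity tau."""
    with torch.no_grad():
        target = model(x, adj)[node_idx].argmax()
        selected, candidates = set(), set(range(x.shape[0]))
        while candidates and fidelity(model, x, adj, node_idx, selected, target) < tau:
            best = max(candidates,
                       key=lambda v: fidelity(model, x, adj, node_idx, selected | {v}, target))
            selected.add(best)
            candidates.remove(best)
    return selected
```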

PGExplainer: The PGExplainer learns approximated discrete masks for edges to explain the predictions. Given an input graph, it first obtains an embedding for each edge by concatenating the embeddings of its endpoint nodes. A predictor then uses the edge embeddings to predict the probability of each edge being selected, similar to an importance score. The approximated discrete masks are then sampled via the reparameterization trick. Finally, the objective function maximizes the mutual information between the original predictions and the new predictions.
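
A sketch of the mask predictor, assuming node embeddings `node_emb` taken from the trained GNN and an edge list `edge_index` of shape `[2, num_edges]`; the MLP sizes and the binary-concrete sampling temperature are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EdgeMaskPredictor(nn.Module):
    """MLP that scores every edge from the concatenation of its endpoint embeddings."""
    def __init__(self, emb_dim, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * emb_dim, hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, node_emb, edge_index, temperature=1.0):
        src, dst = edge_index                                    # endpoints of each edge
        edge_emb = torch.cat([node_emb[src], node_emb[dst]], dim=-1)
        logits = self.mlp(edge_emb).squeeze(-1)                  # one importance logit per edge
        # Binary-concrete / Gumbel reparameterization: approximately discrete, differentiable mask
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
        gumbel = torch.log(u) - torch.log(1 - u)
        return torch.sigmoid((logits + gumbel) / temperature)
```

Training (not shown) would feed the mask-weighted graph back through the GNN and minimize the cross-entropy between the original and the new predictions, plus sparsity and entropy regularizers on the mask, which corresponds to the mutual-information objective described above.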


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Himmelhuber, A., Joblin, M., Ringsquandl, M., Runkler, T. (2021). Demystifying Graph Neural Network Explanations. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_6


  • DOI: https://doi.org/10.1007/978-3-030-93736-2_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93735-5

  • Online ISBN: 978-3-030-93736-2

