Global explanation supervision for Graph Neural Networks

With the increasing popularity of Graph Neural Networks (GNNs) for predictive tasks on graph structured data, research on their explainability is becoming more critical and achieving significant progress. Although many methods are proposed to explain the predictions of GNNs, their focus is mainly on “how to generate explanations.” However, other important research questions like “whether the GNN explanations are inaccurate,” “what if the explanations are inaccurate,” and “how to adjust the model to generate more accurate explanations” have gained little attention. Our previous GNN Explanation Supervision (GNES) framework demonstrated effectiveness on improving the reasonability of the local explanation while still keep or even improve the backbone GNNs model performance. In many applications instead of per sample explanations, we need to find global explanations which are reasonable and faithful to the domain data. Simply learning to explain GNNs locally is not an optimal solution to a global understanding of the model. To improve the explainability power of the GNES framework, we propose the Global GNN Explanation Supervision (GGNES) technique which uses a basic trained GNN and a global extension of the loss function used in the GNES framework. This GNN creates local explanations which are fed to a Global Logic-based GNN Explainer, an existing technique that can learn the global Explanation in terms of a logic formula. These two frameworks are then trained iteratively to generate reasonable global explanations. Extensive experiments demonstrate the effectiveness of the proposed model on improving the global explanations while keeping the performance similar or even increase the model prediction power.


Introduction
As Deep Neural Networks (DNNs) are widely deployed in sensitive application areas, recent years have seen an explosion of research in understanding how DNNs work under the hood (e.g., explainable AI, or XAI; Adadi and Berrada, 2018;Arrieta et al., 2020) and more importantly, how to improve DNNs using human knowledge (Hong et al., 2020).In particular, Graph Neural Networks (GNNs) have been increasingly grabbed attention in several research fields, including computer vision (Fukui et al., 2019;Pope et al., 2019), natural language processing (Annervaz et al., 2018), medical domain (De Haan et al., 2009), and beyond.Such trend is attributed to the practical implication of graph data-many realworld data, such as social networks (Fan et al., 2019), chemical molecules (Scarselli et al., 2008), and financial data (Matsunaga et al., 2019), are represented as graphs.
However, similar to other DNNs' architectures, GNNs also offer only limited transparency, imposing significant challenges in observing when GNNs make successful/unsuccessful predictions (Hong et al., 2020;Wu et al., 2021).Some real world examples for adjusting model explanation to improve Graph Neural Networks (GNNs) can be seen in Figure 1.This issue motivates a surge of recent research on GNN explanation techniques, including gradients-based methods, where the gradients are used to indicate the importance of different input features (Baldassarre and Azizpour, 2019;Pope et al., 2019); perturbation-based methods, where an additional optimization step is typically used to find the important input that influences the model output the most with input perturbations (Ying et al., 2019;Luo et al., 2020;Schlichtkrull et al., 2020); response-based methods, where the output response signal is backpropagated as an importance score layer by layer until the input space (Baldassarre and Azizpour, 2019;Pope et al., 2019;Schnake et al., 2020); surrogate-based methods, where the explanation obtained from an interpretable surrogate model that is trained to fit the original prediction is used to explain the original model (Huang et al., 2020;Vu and Thai, 2020;Zhang et al., 2020); and global explanation methods, where graph patterns are generated to maximize the predicted probability for a certain class and use such graph patterns to explain the class (Yuan et al., 2020a).Unlike local explanation models which explain the model prediction per input sample, global explanation techniques aim at providing the general insights and high-level understanding of the predictions of a deep graph model.Specifically, they investigate what input graph patterns can lead to a certain GNN behavior or maximize a certain prediction.This is essential in many real-world critical applications and can substantially increase human trust in GNNs' prediction ability.As an example, consider classifying graph molecules as either having a mutagenic effect or not (Azzolin et al., 2022).The mutagenicity of a molecule is correlated with the presence of electron-attracting elements conjugated with nitro groups (e.g., NO 2 ).Accordingly, designing an explanation model that can provide a global understanding of the GNN classification is an urgent need.This could be achieved by designing an explainer that manages to recover all the existing well-known NO 2 motifs as an indicator of mutagenicity.Additional examples include gender (male vs. female) or age (young vs. old) classification of human subjects based on structural or functional connectivity matrices, obtained through magnetic resonance imaging of the corresponding subjects.In this case, rather than a per sample explanation, we need a per class explanation in form of high-level, generic insight on differences in the input connectivity matrices of these subjects.
Despite the recent fast progress on GNN explanation techniques, the existing research body focuses on "how to generate GNN explanations" instead of "whether the GNN explanations are inaccurate, " "what if the explanations are inaccurate, " and "how to adjust the model to generate more accurate explanations."Answering the above questions is highly beneficial to the model developers and the users of GNN explanation techniques but is also extremely difficult due to several challenges: 1) Lack of an automatic learning framework for identifying and adjusting unreasonable explanations on GNNs.Although there are plenty of existing works on GNN explanations, they are not able to ensure the correctness of explanations, not able to identify the incorrect explanations, nor able to adjust the unreasonable explanations.The technique that can enable this has not been well-explored yet and is technically challenging due to the additional involvement of another backpropagation originated from explanation error.2) Difficulty in aligning the node and edge explanations.Existing GNN explanation works usually focus on either node and edge explanation, while the interplay and consistency between the explanations of nodes and edges are extremely challenging to maintain and jointly adjusted.3) Difficulty in jointly improving model performance and explainability with limited explanation supervision.Due to the high cost for human annotation, it can be impractical to assume the full accessibility to the human explanation label during model training.Thus, designing an effective framework that can best leverage a partially labeled dataset is on-demand yet challenging.4) Lack of a learning framework that can employ global explanation of a GNN model to improve its performance and global explainability through global explanation supervision.In many applications, we have access to the ground-truth explanations annotated by domain experts that can demonstrate the behavior of the data as a whole, and hence, we are motivated to employ that as the supervision signal to improve performance and global-level explainability.Designing a learning framework that utilizes this type of information is an interesting line of research which has yet remained unexplored.
To address the above challenges, beyond merely finding a solution to produce global GNN explanations, this study focuses on a global GNN explanation supervision framework for correcting the unreasonable explanations and learning how to explain GNNs from a global aspect correctly.Although the previously proposed Graph Neural Network Explanation Supervision (GNES) framework (Gao et al., 2021) has proved effective on improving the reasonability of the model explanation per local samples, while still keep or even improve the backbone GNN model performance, it still lacks the ability to guide the global model explanation generation.In many real-world decision-critical application, the ability to explain the reason for each class prediction through a single robust overview of the model is a critical requirement.To address the inefficacy of the existing GNES model in improving the global explainability through global explanation supervision, in this study, we extend the GNES model by proposing the Global GNN Explanation Supervision (GGNES), whose effectiveness is similar to the GNES model but can improve the GNN model global explanation generation (and potentially its prediction) through guiding the global explanations generated while training the model.The major contributions of this study are summarized as follows: (1) Develop a generic framework for training GNNs while improving the reasonability and faithfulness of the global explanations generated for the model.We propose the GGNES model built upon concept-based explainability and our previously proposed GNES model.GGNES enables learning reasonable and faithful global explanation, in terms of logic formulas, while training a GNN model.These formulas are constructed from a combination of learned graphical concepts which are derived from local explanations.(2) Develop the formulation that can take the

FIGURE
Cases for adjusting model explanation to improve Graph Neural Networks (GNNs).Scene graph (left three): from the left, an input image, explanation before adjustment ( -a, inaccurate), and explanation after the adjustment ( -b, accurate).Note that the model explanation has been shifted from puppy eyes and back, rods, and an artificial tree to curtains, a clock, and a rug.Molecular formula (right three): from the left, an input formula, explanation before the adjustment ( -a, inaccurate), and explanation after the adjustment ( -b, accurate).Reactivity for this molecule is mostly a ected by benzene ring sub-components in the overall molecular structure.-b highlights the main benzene rings of the molecule more e ectively than -a.

Related work
In this section, we first introduce our previously proposed GNES framework.Then, we note that our work draws inspiration from the research fields of graph neural network explanations that provide the model generated explanations, and explanation supervision on DNNs which enables the design of pipelines for the human-in-the-loop adjustment on the DNNs based on their explanations.

. Our previously proposed GNES framework
In our previous study (Gao et al., 2021), we proposed a framework that learns how to jointly optimize both model prediction and model explanation by enforcing both whole graph regularization and weak supervision on model explanations.For the graph regularization, we proposed a unified explanation formulation for both node-level and edge-level explanations by enforcing the consistency between them.The node-and edgelevel explanation techniques we proposed are also generic and rigorously demonstrated to cover several existing major explainers as special cases.However, in some applications, the ground truth explanations demonstrate the behavior of the data as a whole instead of each individual sample.Accordingly, we need a learning framework that utilizes this type of information through global explanation supervision and hence improves both model prediction and global model explanation.

FIGURE
Proposed GNN Explanation Supervision (GNES) framework that jointly optimized the GNN models based on ( ) a prediction loss, ( ) an explanation loss on the human annotation and model explanation, and ( ) a graph regularization loss to inject high-level principles of the graph-structured explanation.Notice that we only assume limited accessibility to the human annotation for only a small set of samples ( % in our experiments).

. Graph Neural Networks explanations
Most of the existing GNN explanation methods are instancelevel methods, where the methods explain the models by identifying important input features for its prediction (Yuan et al., 2020b).The first category is gradients-based methods, where the gradients are used to indicate the importance of different input features.Existing methods are SA (Baldassarre and Azizpour, 2019), Guided BP (Baldassarre and Azizpour, 2019), CAM (Pope et al., 2019), andGradCAM (Pope et al., 2019).In Etemadyrad et al. (2022), the authors propose a novel post hoc explanation technique to find the subgraphs in input that majorly influence one or more subgraphs in the output domain by using gradient information and solving a classical community detection objective (De Domenico et al., 2015).The second category is perturbation-based methods, where an additional optimization step is typically used to find the important input that influences the model output the most with input perturbations.Existing methods are GNNExplainer (Ying et al., 2019), PGExplainer (Luo et al., 2020), and GraphMask (Schlichtkrull et al., 2020).The third category is the responsebased method, where the output response signal is backpropagated as an importance score layer by layer until the input space.Existing methods in this category include LRP (Baldassarre andAzizpour, 2019), Excitation BP (Pope et al., 2019), and GNN-LRP (Schnake et al., 2020).The last category is surrogate-based methods, where the explanation obtained from an interpretable surrogate model that is trained to fit the original prediction is used to explain the original model.The surrogate methods include GraphLime (Huang et al., 2020), RelEx (Zhang et al., 2020), and PGM-Explainer (Vu and Thai, 2020).In addition to instancelevel explanation methods, very recently, the global explanation of the GNN model has also been explored by XGNN (Yuan et al., 2020a).Please see Yuan et al. (2020b) for a survey of explainability in Graph Neural Networks.Even though there are plenty of existing explanation methods for GNNs, most of the methods above can not be applied to explanation supervision mechanism as the goal is to apply supervision on the generated explanation such that the backbone GNN model itself can be finetuned accordingly to generate better explanations as well as keep or even improve the model performance.To enable this finetuning process over the explanation, the explanation itself needs to be differentiable to the backbone GNN model's parameters.In other words, only the explanation that is directly calculated from the computational pipeline (such as gradients-based and responsebased methods) can be used to apply this additional explanation supervision to fine-tune the backbone GNN models explanation.The perturbation-based and surrogate-based methods all require additional optimization steps to obtain the explanation and thus are unable to be end-to-end trained with the explanation supervision on the backbone GNNs.

. Explanation supervision on DNNs
The potential of using explanation-methods devised for understanding which sub-parts in an instance are important for making a prediction-in improving DNNs has been studied in many domains across different applications.In fact, explanation supervision has been widely studied on image data by the computer vision community (Das et al., 2017;Linsley et al., 2018;Qiao et al., 2018;Mitsuhara et al., 2019;Zhang et al., 2019;Chen et al., 2020;Patro et al., 2020).Linsley et al. (2018) have demonstrated that the benefit of using stronger supervisory signals by teaching networks where to attend, which looks similar to the proposed approach.Moreover, Mitsuhara et al. (2019) have proposed a post-hoc finetuning strategy where an end-user is asked to manually edit the model's explanation to interactively adjust its output.Such edited explanations are then used as ground-truth explanations (from humans) to further fine-tune the model.In addition, several works in the Visual Question Answering (VQA) domain have proposed to use explanation supervision to obtain improved explanation on both the text data and the image data (Das et al., 2017;Qiao et al., 2018;Zhang et al., 2019;Patro et al., 2020).In addition to image data, the explanation supervision has also been studied on other data types, such as texts (Ross et al., 2017;Jacovi and Goldberg, 2020), attributed data (Visotsky et al., 2019), and more.Gao et al. (2024) provide a systematic survey on Explanation-Guided Learning (EGL), a line of research that focuses on leveraging additional supervision signals or prior knowledge obtained from human explanations into machine learning models' reasoning process.According to Gao et al. (2024), EGL methods provide either global (Weinberger et al., 2020;Erion et al., 2021) or local guidance (Gao et al., 2022a,b;Shi et al., 2023)) by injecting prior knowledge or adding supervision signals to improve the model's global (or local) explanation.In Erion et al. (2021), the authors introduce attribution priors to optimize for higher-level properties of explanations, such as smoothness and sparsity.Lee et al. (2022) illustrate how to upgrade a deep model to its self explainable version that can predict and explain with logic rules learned with widely-used deep learning modules.Gupta et al. (2024) introduce Concept Distillation to create richer concepts using a pre-trained teacher model.They demonstrate how conceptsensitive training can improve model interpretability, reduce biases, and induce prior knowledge.Sha et al. (2023) propose a rational extraction technique built based on an adversarial approach that calibrates the information between a guider, a typical neural model that does the prediction, and a selector-predictor model that additionally produces a rationale for the guider' prediction.Shi et al. (2023) develop the ENGAGE framework as a local guidance EGL, built upon Explanation Guidance Data Augmentation, which leverages explanation to inform graph augmentation, and uses contrastive learning for training representations to preserve the key parts in graphs while removing uninformative artifacts.
However, to our best knowledge, explanation supervision on graph-structured data with graph neural networks through learning logic-based concepts has not been explored before, and we are the first to propose a framework to handle this open research problem.

Model
In this section, we introduce our proposed Global Explanation Supervision framework for GNNs.First, we briefly summarize the explanation regularizations (i.e., explanation consistency and sparsity) proposed by Gao et al. (2021) and how these components enhance the quality of model explanations in a global level.Then, we will introduce the proposed Global node-level, in addition to the Global edge-level explanation supervision definition and formulation.
Formal definition of the problem: Let G = (X, A) denotes an attributed graph with N nodes be defined with its node attributes X ∈ R N×d in and its adjacency matrix A ∈ R N×N (weighted or binary), where d in denotes the dimension of input feature.Let y be the class label for graph G.The general goal for a GNN model is to learn the mapping function f for each graph G to its corresponding label y, Following Kipf and Welling (2016) and similar to Gao et al. (2021), we employ the basic definition of Graph Convolutional Networks (GCN; Kipf and Welling, 2016), for an attributed graph G = (X, A) with y as the class label for graph G, where a graph convolutional layer can be defined as Equation (1).
where F (l) denotes the activations at layer l, and F (0) = X; Ã = A + I N is the adjacency matrix with added self connections where I N ∈ R N×N is the identity matrix; D is the degree matrix of Ã, where ./fdata. .  .

GGNES framework
The goal here would be to design a framework that can generate global explanations which are closer to the human annotations through global explanation supervision.The prediction performance is expected to stay the same or possibly also improve.The global explanation supervision is possible via defining the learning objective of the proposed framework as a joint optimization.As shown in Equation ( 2) and following the framework in Figure (2), the objective function is a combination of model prediction loss (e.g., the cross-entropy loss), the global explanation loss (which is a function of the absolute or squared difference between class level human and model explanations), and global model explanation regularizations (graph regularizations that follow high-level graph-structured rules to the explanation).These three terms are computed per class and combined thereafter to form the global explanation supervision framework.Concretely, we employ the objective function as where M c ∈ R N×1 and E c ∈ R N×N denote the model-generated node-level and edge-level explanations of class c using a given explanation method.And M ′ c , E ′ c are the corresponding groundtruth explanations of class c, marked by the human annotators.The human annotations are provided globally for all samples and are unique per class, but equal for the samples of each class.These are used as additional guidance to make the explanation supervision possible.L Pred,c is the typical prediction loss (such as the cross-entropy loss) on the training set.The proposed explanation loss L Att,c measures the discrepancies between model and human explanations globally both on node level and edge level, as Equation ( 3) where α n and α e are the scale factors for balancing global nodelevel and global edge-level loss; the function dist(x, y) measures the mean element-wise distance between the inputs x and y, a common choose can be absolute difference or squared difference.However, in practice, in many applications, it is not feasible to obtain the human explanations for the whole dataset.As a remedy, we only apply the global explanation loss to the classes that have the ground-truth labels for the human explanations and apply the high-level graph rules to regulate the model explanation for each class even if the human annotation is unavailable (Gao et al., 2021).Specifically, we employ the global explanation consistency, in addition to the global sparsity regularization.The former can regulate the global node and edge explanation simultaneously so that the model is more likely to generate a globally consistent and smooth explanation over nodes and edges.The global sparsity regularization is designed to regulate the model to only focus on a few important nodes and edges for the explanations.Thus, we propose Equation ( 4) for global graph regularizations to obtain more reasonable model explanations: where β is the scaling factor for the global explanation consistency between node and edge explanations, γ is the scaling factor for the sparsity constraints on both node and edge explanations.These regularizations are described in more detail below: . .

Global explanation consistency regularization
The global node explanation and edge explanation are not independent, but rather highly correlated with each other.One natural assumption about the global node explanation smoothness is that the adjacent nodes should share similar importance.However, this assumption can be too strong and sometimes lead to over-smoothing of the node explanation and tend to yield indistinguishable patterns for the explanation.In addition, it ignores the connection between the node and edge explanations, which can be a crucial factor for the explanation model to generate a global consistent explanation.
Here, we propose to take one step further regarding the smoothness assumption about the explanation by considering both node and edge explanations and making them more consistent with each other.Concretely, instead of treating all pairs of adjacent nodes equally important when enforcing the smoothness constraint, we propose to weight them by the corresponding edge importance such that the explanation consistency is better enforced on those nodes and edges that are deemed important.Mathematically, the global explanation consistency can be measured by Equation ( 5) follows: given a pair of nodes i and j that is adjacent (i.e., A i,j = 1), if the edge that connects the two nodes is important (i.e., E i,j is high), then the nodes it connects also tend to be consistent.

. . Sparsity regularization
As sparsity is a common practice for the model explanation, we apply the ℓ 1 norm to regulate both the node-level and the edge-level explanations, as Equation ( 6) Overall, the benefits of applying the proposed regularization terms are 3-fold.First, the regularization terms do not rely on the specific human labels on the explanation, which can be very limited and hard to acquire in practice.Thus, they can be very crucial in the scenarios where the explanation labels are scarce.Second, since the explanation for the node and edge can be highly relevant, the proposed explanation consistency regularization can be critical for enforcing the model to generate more reasonable and consistent results that better align with the human explanation.Lastly, our overall framework is very flexible such that the regularization terms are not affected by changing the specification of the node and edge explanation formulation in Equations (7, 12), respectively, making the proposed framework easily applicable to give explanation and apply explanation supervision on any downstream applications with little to no overhead.
The regularization term in Equation ( 2) is employed to first regulate the node and edge explanation and make them consistent and smooth through considering the dependence of node and edge explanations.Additionally, to lead the model to generate more realistic explanations, the sparsity regularization is also applied which can regulate the model to only focus on a few important nodes and edges for the explanations.

. Global node explanation formulation for global explanation supervision
In many applications, the ground-truth explanations (on synthetic data) or the domain knowledge (on real-world data) provides node-level explanation of the data in a global manner rather than per sample/instance.In this case, we need to provide a single robust overview of the model predictions.Accordingly, we aim to propose a framework that can both generate the global node explanation by capturing the behavior of the GNN model as a whole (rather than providing instance-specific explanations which could be noisy or not faithful to the model predictions) and also employ it as a supervision signal to further improve the global node-level explanations generated by the model.
To this end, we employ the gradient and the response/activation information which are also the main components for local node explanation supervision as described in Gao et al. (2021).We then aggregate this information over all instances so we can produce a model-generated global explanation that remains differentiable to the backbone GNN model's parameters.This makes the global explanation supervision feasible as the model parameters can be affected and tuned during training.Mathematically, given the output y i c on class c and sample i, the global explanation for node n at layer l can be computed as follows: where n represents the gradient of the features of node n at layer l given class c and sample i, Z is the total number of samples, and F (l) n denotes the node activation at layer l.The function in Equation ( 7) can generate any simple to more complicated computations over the input gradients and the activation.Two simple examples are shown in Equations (8,9), where the gradients are employed to generate simple gradient-based local explanation for each sample, which are then aggregated using the min or max function to form the final global explanation: n,c = min( ReLU( (9) The other form of aggregation is to average over the local explanations to get the global-level explanation: More complicated technique, described as concept-based global explainer in Azzolin et al. (2022), with some variations, can be used and formulated as below: where P i is the i − th learned prototype which is initialized randomly from a uniform distribution and learned through training the GLGExplainer framework described in Azzolin et al. (2022) and m is the total number of prototypes which is a hyperparameter and tuned separately for each dataset. is also a learnable Boolean function that generates a logical combination of the learned prototypes following (Azzolin et al., 2022).In this setting, Equation ( 11) is a logic formula constructed using graphical concepts derived from local explanations.Concepts can be described as intermediate, high-level and semantically meaningful units of information commonly used by humans to The results are obtained from five individual runs for every setting.The best results for each task are highlighted with boldface font, and the second bests are underlined.
explain their decisions.More details for GLGExplainer are given in Azzolin et al. (2022).
The training process based on Equation (11) consists of three steps.First, a basic GCN is trained by optimizing only the first term in Equation (3).Second, the local explanations generated by this trained GNN are fed as inputs to the GLGExplainer which can construct the logic formula of Equation (11).Last, the original GCN is re-trained through the full loss function in Equation (3).For the third or last step, we only employ the logic formula from step 2 and discard the prototypes generated.Instead we randomly initialize the values of prototypes from a uniform distribution.Accordingly, the GCN and GLGExplainer are trained iteratively until the value of prototypes would converge.Note that all the parameters of GLGExplainer in step three are exactly equal to those in step 2, except for the prototypes that remain learnable and are updated at each iteration.
For all the functions in Equations (8-11), the results are computed and included in the Experiments section with further discussions.
. Global edge explanation formulation for global explanation supervision While several works have studied global node-level explanation topic, little to no work has explored the global edge-level explanation and its applications.However, in many scenarios, the latter can be more crucial and meaningful than the former as the domain knowledge or human annotations describe the relationship between nodes rather than the nodes in particular.
Similar to the global node explanation supervision, we need to propose a unified edge-level explanation formulation which generates explanations that are differentiable to the backbone model's parameters.Taking the gradient of each edge in the input adjacency matrix, as well as the response/activation of the pairs of nodes that are associated with that edge, and using the chain rule, we can define suitable model generated explanations for each instance.Concretely, given the output y i c on class c and sample i, the global edge explanation between node n and node m at layer l can be computed as the aggregation of all edge explanations for single instances.More precisely, this is a function of the edge gradients for all samples, in addition to node activations: where represents the gradient of the edge that connects nodes n and m at layer l given class c and sample i; F (l)  n and F (l) m denote the activation of node n and node m at layer l, respectively, and N is the total number of instances.Similar to previous formulation in Equation ( 7), can combine the local explanations of all samples, providing a global explanation for the overall behavior of the GNN.A simple example is the min or max value among all the gradient-based  13) where Z is the total number of samples, and a similar formulation can be used to find the max value of the local explanations.
Averaging over the local explanations can also be another aggregator to generate the global .explanationand can be shown by Equation ( 14) Similar to Equation ( 11), the global edge explanation can also be represented as a learnable logic combination of concepts.As long as the GNN model can generate local explanations that are a subgraph of the input data, these can be fed into the GLGExplainer in Azzolin et al. (2022) which can learn the formula, and parameters in Equation ( 11) and generate the global explanation per class.
These various functions for are investigated in detail in the Experiment section.

Experiments
We test our Global GNN Explanation Supervision framework on the datasets extracted from two publicly available sources including HCP (Human Connectome Project) and the ABIDE (Autism Brain Imaging Data Exchange) database.These datasets, in addition to the implementation details, evaluation metrics, and comparison methods are described in turn below.

. Datasets . . Magnetic resonance imaging data
The (structural, diffusion, and functional) MRI data were extracted from the Human Connectome Project website (https://db.humanconnectome.org/),specifically, the 1,200 Subjects Release, February 2017 (Van Essen et al., 2013), which provided (MRI) data from 1,200 young adult (ages 22-35) subjects.Here, two tasks are defined as binary classification of a given subject as Female vs. Male, in addition to .The age and gender labels were provided as additional meta features.For the ground-truth explanations of each class, we refer to Gong et al. (2009), which has investigated age and sex effects on the anatomical connectivity patterns of 95 normal subjects ranging in age from 19 to 85 years.Accordingly, cortical regions which show significant effect for young, old, male, or female subjects were separately identified for each group for Automated Anatomical Labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002).To use these as annotations for HCP dataset, these regions were then mapped to Desikan-Killiany (DK) atlas (Desikan et al., 2006), by finding the closest node (Euclidean distance) in DK to each identified node in AAL atlas.The resulting DK nodes are provided in Supplementary material for each class under study.
The raw MRI data were then preprocessed using the HCP pipeline (WU-Minn, 2017).For the diffusion MRI, this was followed by the BEDPOSTX (Bayesian Estimation of Diffusion Parameters Obtained using Sampling Techniques, modeling crossing X fibers) algorithm in the FMRIB Software Library (Jenkinson et al., 2012, FSL), which models white matter fiber orientations and crossing fibers for probabilistic tractography.The resting state blood-oxygen-level-dependent functional MRI (r-fMRI) time series data were acquired from participants, in four runs of ∼15 min for each participant, including two runs on two different days (Day 1 and Day 2).These measurements were collected with the subject supine and still, with eyes open, to track physiological changes in the brain (i.e., changes in blood flow and oxygen levels) that occur in resting state, when an explicit task is not being performed (Biswal, 2012;Buckner et al., 2013).
Extracting SC and FC: To construct the SC matrix for each subject, we ran Probtrackx in FSL with 68 regions of interest (ROIs) obtained from the the DK atlas.For the remaining parameter setting in Probtractx, we followed the recommendations of the tutorial (in St.Louis, 2020) provided by HCP.Finally, the resulting SC matrices were normalized by dividing the respective row sum from each non-zero value.
Three steps were followed to extract the functional connectivity from the r-fMRI time series data, for each day: 1. Concatenate the time series for the two runs together; 2. For each of the 68 ROIs defined by the Desikan-Killiany atlas, average all the time series to create a single ROI time series; and 3. obtain the functional connectivities by either (a) Computing the pairwise ROI time series' Pearson correlations using FSLNets (of Heidelberg Department of Neuroradiology, 2014) with the full correlation option, thus generating Dataset 1; or following similar three steps as mentioned for Dataset 1, except that we Concatenate the time series for the two runs performed in day 2 together, thus generating Dataset 2. In this study, we followed with the experiments only using Dataset 1 due to the high amount of computation and resources required for each Dataset.

. . ABIDE dataset
We analyzed r-fMRI in the Autism Brain Imaging Data Exchange (ABIDE; Di Martino et al., 2014).It compiles a dataset of 1,112 r-fMRI participants by gathering data from 16 The results are obtained from five individual runs for every setting.The best results for each task are highlighted with boldface font, and the second bests are underlined.
international imaging sites that have aggregated and are openly sharing neuroimaging data from 539 individuals suffering from ASD and 573 typical controls (TCs).The task is to classify a subject as either belonging to ASD or the control group, based on their r-fMRI data.Since there was no prior coordination between sites, the scan and diagnostic/assessment protocols vary across sites.Accordingly, we rely on a publicly available preprocessed version of this dataset provided by the Preprocessed Connectome Project (PCP) initiative.PCP preprocessed the data using four different pipelines, all of which implemented fairly similar steps, but varied in the algorithms used for each step and the parameters.We specifically used the data processed with the Configurable Pipeline for the Analysis of Connectomes, C-PAC (Craddock et al., 2013), which provides further minimally preprocessed data through the python package, cpac.C-PAC comes pre-packaged with a default pipeline, as well as a growing library of preconfigured pipelines.These pipelines could be edited or built from scratch, using the provided pipeline builder.For our experiments, we used the default processing pipeline.For more details, please see Craddock et al. (2013) on how we extracted time series for the Harvard-Oxford atlas.We finally used the same steps as HCP dataset, to compute the functional connectivity matrices.Additionally, we used the biomarkers extracted by Kunda et al. (2020), as the ground-truth labels for explanation supervision.These include the top five most contributing FC edges for ASD and TC classification, respectively (10 overall connections), built using The results are obtained from five individual runs for every setting.The best results for each task are highlighted with boldface font, and the second bests are underlined.
the Harvard-Oxford (HO) brain atlas (Jenkinson et al., 2012) as a point of reference. .

Implementation details
Following the previous work on the explanation supervision for GNNs, we used a 3 layer GCN as our backbone GNN model.The hidden dimension size for the three graph convolutional layers is tuned separately for each dataset/task.We used 2 for Gender prediction and 3 for age prediction tasks.For the ASD classification task, we found 3 to best classify the dataset.These hidden layers are followed by a global average pooling (GAP) layer, and a softmax classifier.Models were trained for 200, 300, and 260 epochs using the ADAM optimizer (Kingma and Ba, 2014), respectively, with a learning rate of 0.001 in all three cases.For the remaining details of implementation and parameters, we followed all the settings in Gao et al. (2021), unless otherwise specified.
For the GLGExplainer, we prepared the input using the simple gradient-based local explainer in the backbone GNN.The number of prototypes was set to 4 and 2 for the HCP dataset and the ABIDE data, respectively.This explainer was trained using all the remaining settings and parameters including the optimizer, learning rate, batch size, focusing parameter, and auxiliary loss coefficients, in addition to the E-LEN, from the original proposed model (Azzolin et al., 2022).

. . Evaluation metrics
We evaluate the effectiveness of the proposed GGNES model in terms of prediction performance as well as in terms of global explainability.Specifically, for model performance assessment, we use accuracy (ACC) and Area Under the Curve (AUC) scores to measure the prediction power of the GNNs on the prediction tasks for all the datasets.In addition, we leverage the human/domainlabeled explanation on the test set to quantitatively assess the goodness of the model explanation.Specifically, for both node-level and edge-level global explanations, we treat the human explanation as the gold standard and compute the distance between human and global model explanation via Mean Square Error (MSE) and Mean Absolute Error (MAE).Additionally, we evaluate our model on: (i) FIDELITY, which represents the accuracy of the E-LEN in matching the predictions of the GNN model to explain; (ii) ACCURACY, which represents the accuracy of the formulas in matching the ground-truth labels of the graphs; (iii) CONCEPT PURITY, which is computed for every cluster independently and measures how good the embedding is at clustering the local explanations (Azzolin et al., 2022), and is computed through Equation ( 15) where C i corresponds to the cluster having p i as the learned prototype, and count_most_frequent_label(C i ) returns the number of local explanations annotated with the most present label in cluster C i .The Concept Purity results are reported by computing the mean and the standard deviation across all clusters.For a more detailed description of these metrics, see Azzolin et al. (2022). .

. Comparison methods
Since there is no existing work on global explanation supervision on GNNs, we demonstrate the effectiveness of our model by comparing the evaluation metrics in the following scenarios: • No explanation supervision technique is used.
• min, max, or Average functions are used to generate global explanation and perform explanation supervision.• Concept-based global explanation is constructed and further used for supervision.

. Experimental results
Table 1 presents the raw formulas extracted by the Entropy Layer.Those formulas can be further described in a more humanunderstandable format after finding the representative elements of each cluster as shown in Figures 3-5, which correspond to HCP structural, HCP functional, and ABIDE datasets, respectively.Each of these Figures contains a number of sub-Figures that show the learned prototypes described in Section 3.2.Specifically, for each prototype p j , the local explanation Ḡ such that Ḡ = argmax Ḡ′ ∈D d(p j , h( Ḡ′ ) is reported.Here, D is a list of local explanations obtained after the binarization step in GLGExplainer.For details on this step, in addition to the definition for distance function d(), please see Azzolin et al. (2022).The nodes in Figures 3,  4 refer to DK atlas and are labeled with numbers for better readability, while the nodes in Figure 5 correspond to HO atlas.See Supplementary material for the label names corresponding to the labels we used in Figures 3, 4. For a list of HO atlas labels used in Figure 5, see Atlas (2023).

. . Performance
Table 2 shows the model performance and model-generated explanation quality for the three described datasets.The results are obtained from 5 individual runs for every setting.The best results for each dataset are highlighted with boldface font, and the second bests are underlined.For the HCP datasets, for both Age and Gender prediction tasks, the human annotations contain only node-level explanations, but for the ABIDE dataset we have both ground-truth (domain-labeled) node-level and edge-level explanations available for all samples.In general, our proposed Global Explanation Supervision model variations outperformed the non-explanations supervision GNN model in terms of both prediction power as well as explainability on all three datasets.More specifically, the performance results for different variations suggested that global explanation supervison can have positive effects in all scenarios on both prediction power, in addition to the explanation correctness.The most complicated model (i.e., the concept-based supervision model) achieved the best performance, out-performing baseline GNN by 1-6% and 1-3% on AUC and ACC scores, respectively.In addition, in terms of explainability, there is significant improvement in both node and edge-level explanations, when comparing the backbone GNN and the concept-based supervision models.In particular, we observed between 19-57% increase in node MSE and 14-48% in node MAE, and more than 33% improvement for edge MSE and MAE explanations.
These results demonstrate the general effectiveness of the proposed framework both on largely correcting the modelgenerated global explanation, in addition to improving the model performance and prediction power.In addition, among different variations used for global explanation generation, we observe constant superiority of the more sophisticated concept-based technique compared to the others, while no clear excellence of Avg, max, or min methods when comparing one to the other was remarkable.
To further evaluate the extracted global explanation formulas presented in Table 1, we computed Fidelity, Accuracy, and Concept Purity over the test set.The results are reported in Table 3 for the three datasets.As it can be seen, on average, the clusters are quite homogeneous, which means the model has learned a good mapping from the local explanations to the concepts space.Also the concept purity is at its lowest for HCP structural dataset while has the highest value for the same set.The accuracy results demonstrate that the formula in Table 1 can correctly match the behavior of the model in most samples.Additionally, it is important to note that by looking at the fidelity results, it is clear that the explainer is generating an explanation for the ground-truth labeling of the while capturing the underlying predictive behavior of the GNN it is supposed to explain.

. . E ect of choice of local explainer and backbone GNN model
To evaluate the proposed model more comprehensively, we repeated experiments for the model performance and modelgenerated explanation quality for all datasets for two additional local explanation techniques, Guided BP and GradCAM, and one other backbone GNN model, DGCNN (Zhang et al., 2018).The results are shown in Tables 4, 5.As these results show, we continue to see superiority of our proposed Global Explanation Supervision model variations compared to non-explanation supervised scenarios.The concept-based supervision model again achieved the best performance, out-performing baseline GNN, for the two backbone GNNs and all three local explanation techniques.Additionally, we observe significant improvement in both node, and edge-level MSE and MAE, when using the concept-based supervision models.These improvements, both in explanation quality and model performance, for the concept-based technique largely exceed other simple aggregation methods (e.g., Averaging) in almost all settings as well.
. .Qualitative analysis: case studies Here, we provide some case studies of the input data and the model explanation derived from gradient based explanation technique and binarized following (Azzolin et al., 2022).We report some random examples for each dataset, with their extracted explanation in bold, as illustrated in Figure 6.

Conclusion
In this study, we address an existing challenge for explainability in GNNs, by proposing the Global GNN Explanation Supervision (GGNES) technique which uses a basic trained GNN and a global extension of the loss function used in the GNES framework.This GNN creates local explanations which are fed to a Global Logic-based GNN Explainer, an existing technique that can learn the global Explanation in terms of a logic formula.These two frameworks are then trained iteratively to generate reasonable global explanations.Extensive experiments demonstrate the effectiveness of the proposed model on improving the global explanations while keeping the performance similar or even increase the model prediction power.
model-generated global node (or edge) level explanation of a GNN, and use that as additional supervision to train the GNN model.The explanations generated by the GNN model remain differentiable to the backbone model's parameters.This makes the global explanation supervision feasible as the model parameters can be affected and tuned during training.(3) Conduct comprehensive experiments to evaluate the effectiveness of the proposed model.Extensive experiments on three real-world datasets demonstrate that the proposed model improved the backbone GNN model both in terms of prediction power and global explainability across different application domains.In addition, qualitative analyzes, including case studies, are provided to demonstrate the effectiveness of the proposed framework.

FIGURE
FIGURE Representative element of each learned concept for HCP Structural dataset.(A) P .(B) P .(C) P .(D) P .

FIGURE
FIGURE Representative element of each learned concept for HCP Functional dataset.(A) P .(B) P .(C) P .(D) P .

FIGURE
FIGURERepresentative element of each learned concept for ABIDE dataset.(A) P .(B) P .(C) P .

FIGURE
FIGURE Examples of input graphs with their explanations in bold as extracted by Gradient-based edge explanation technique, for HCP structural connectivity dataset.(A) Subject -Female-Axial view.(B) Subject -Female-Sagittal view.(C) Subject -Female-Axial view.(D) Subject -Female-Sagittal view.(E) Subject -Male-Axial view.(F) Subject -Male-Sagittal view.
TABLE Raw formulas as extracted by the Entropy Layer.Dii = j Ãij ; The trainable weight matrix for layer l is denoted as is the element-wise non-linear activation function.Additionally, a similar design as in Pope et al. (2019) is employed to this backbone GNN model in which using several layers of graph convolutional layers followed by a global average pooling (GAP) layer over the graph nodes can address any concerns when working with variable input graph size.
TABLE Performance and model-generated explanation evaluation among the proposed models and the baseline on two HCP, in addition to one ABIDE graph classification tasks.

TABLE Fidelity ,
accuracy, and concept purity computed over test sets for all datasets.
TABLE Performance and model-generated explanation evaluation for two additional local explainers with GCN as the backbone model.
TABLE Performance and model-generated explanation evaluation for all three local explainers with DGCNN as the backbone model.