Time-sequential graph adversarial learning for brain modularity community detection

Abstract: Brain community detection is an efficient method to represent the communities of brain networks. However, the time-varying functions of the brain and its intricate community structure make detection challenging. In this paper, a time-sequential graph adversarial learning (TGAL) framework is proposed to detect brain communities and characterize their structure from brain networks. In the framework, a novel time-sequential graph neural network is designed as an encoder to extract efficient graph representations through a spatio-temporal attention mechanism. Since it is difficult to capture the community structure directly, a measurable modularity loss is used to optimize the model by maximizing the modularity of the communities. In addition, the framework employs an adversarial scheme to guide representation learning. The effectiveness of our model is shown through experiments on real-world brain network datasets, and its performance in brain community detection demonstrates the advantage of the proposed framework.


Introduction
Neurological diseases are among the most common diseases, affecting a large number of patients worldwide every year and imposing a huge burden on the healthcare industry. Pharmacological magnetic resonance imaging (phMRI) is a technology that allows researchers to obtain noninvasive functional brain images of drug-induced variations in blood flow dynamics. As a derivative of functional magnetic resonance imaging (fMRI), phMRI can significantly advance research on neurological diseases in terms of pharmacokinetic and pharmacodynamic properties, provided that the time course and neurological response to a specific pharmacological stimulus can be analyzed for specific diseases and medications. As a result, phMRI analysis has a wide range of applications and a significant impact in the medical field. For example, phMRI analysis of addictive drugs can be used to study their addictive mechanisms in order to develop therapies for treating addiction [1]. The work in this paper proposes a technique for medical research in this field that processes and analyzes MRI data.
With the increasing number of studies contributing to the exploration and advancement of neuroscience, researchers are extracting and processing brain information in ever greater detail [2], producing large amounts of complicated biological data from large-scale brain neural systems. Most of this data takes the form of network data [3], representing the connections among components of various large-scale neurobiological systems. In neuroscience, these data often span a variety of scales [4] (neurons, circuits, systems, and brains) or include different types of data (e.g., structural networks that express the anatomical connections of nerves, and functional networks that represent the connections of distributed brain regions associated with neural activity). Brain networks [5] are composed of anatomical structures that segment different brain regions and connect them through functional networks that reveal complex patterns of neuronal communication. Thanks to advances in imaging technology and medical image processing methods [6,7], such complex neural signal patterns can be studied with fMRI [8] and the aforementioned phMRI, yielding neural response activity thought to be associated with various behavioral and cognitive functions as well as brain diseases [9].
Complex networks are characterized by their modular structure [10]. This implies that the nodes of a network may be divided into modules or communities that are inwardly dense and outwardly sparse. The brain follows the principle of modular organization [11], and the most widely encountered and biologically meaningful aspect of brain networks is their organization into distinct network communities or modules: clusters that are tightly connected internally but less connected to each other. Modules have neuroscientific significance because their boundaries distinguish functionally relevant neuronal elements [12], define hubs [13] and crucial bridges that connect communities, channel and impede the flow of neural impulses and messages, and constrain the unrestricted spread of instabilities.
Because brain networks are large and their connectivity patterns are complex, it is often impossible to identify modules simply by visually inspecting the network. Community identification methods have been used to significant effect in solving such problems [14]. Subgraphs, network modules, and communities have all been explored in depth in network topologies and have been widely used in network neuroscience [15]. Network structure identification, also known as community detection, is the division of the nodes in a network into groups, where nodes within a community are densely connected and nodes in different communities are sparsely connected. Mining the network structure helps reveal and understand the organizational principles and operational roles of complex network systems. In addition, the development of new methods for mapping the structural and functional connectivity of the brain has led to the preparation of complete network maps of neuronal circuits and systems [16,17]. The structure of these brain networks can be examined and analyzed using various graph-theoretic methods. Thus, methods for finding modules or network clusters in brain networks are especially applicable, revealing tightly coupled core building blocks or substructures that often correspond to specific functional components [18]. With the recent intersection of neurobiological imaging and network science, new applications and analytical methods are being developed to analyze real-world biological networks.
Over the past few decades, data-driven models have received a great deal of attention in a broad range of fields; for example, they are used for image processing [19], disease prediction [20,21] and trajectory detection [22]. Combined with machine learning approaches, these models have been very successful in medical image computing [23] for constructing pattern recognition frameworks, achieving a high level of precision at a low computational cost. Building on this foundation, deep learning is presently regarded as one of the most critical advances in machine learning overall. Deep learning is a novel technique for processing high-dimensional biological network data and learning low-dimensional graph representations of brain network structures [24][25][26]. Exemplary approaches include those based on generative adversarial techniques [27][28][29] and graph neural networks [30]. Generative adversarial networks (GANs) [31], which can be seen as variational-inference [32] based generative models, are frequently trained without supervision, and the generated data (in theory) has the same distribution as the actual data, allowing for robust analysis of complicated data. Notably, GANs are widely used in medical image analysis [35,36]. Through pooling and convolution, the convolutional neural network (CNN) [33] reduces the dimensionality of medical imaging data, allowing it to discover patterns in biomedical research effectively. The graph convolutional network (GCN) [34] is built to extract community characteristics, as it directly analyzes network-structured data and inherits the capabilities of CNNs. However, processing dynamic network data to produce temporal graph representations for community discovery remains hard for existing approaches [37], particularly on small network datasets [38].
To resolve the problems mentioned above, we developed a novel time-sequential graph adversarial learning (TGAL) framework that partitions brain regions in time-varying brain networks, detects communities of similar brain regions and their evolution over time steps, and uses generative adversarial techniques to improve robustness when dealing with small network datasets. The main contributions of our work can be summarized as follows: 1) A time-sequential graph attention encoder is designed to learn more efficient graph representations, and the resulting graph embeddings are clustered to detect more accurate dynamic communities. 2) An adversarial training scheme is proposed to assist the representation learning of brain networks. It alleviates some of the issues created by training with few data samples. 3) Given the classification performance of our framework and its neuroscientific support, the communities discovered in biological and medical imaging experiments may be employed as biomarkers.
The remainder of this paper is organized as follows. In Section 2, a graph learning framework with an adversarial strategy is established, and the formulas and composition of each module are detailed. In Section 3, a neurobiological network dataset is constructed and described, a dynamic community detection experiment is designed to assess the node clustering capability in brain regions, and the results are visualized; a graph representation learning experiment is then carried out to verify the representation learning capability of the framework. In Section 4, the pros and cons of our framework and future directions for improvement are discussed. Finally, we summarize the work of this paper in Section 5.
Figure 1. The architecture of the proposed time-sequential graph adversarial learning (TGAL) framework for detecting brain communities. It contains three essential components: 1) a construction module for brain networks, 2) a temporal graph auto-encoder that combines a decoder and an encoder for temporal graph attention, and 3) an adversarial regularizer consisting of a discriminator and a generator.

Methods
The architecture of TGAL is illustrated in Figure 1. First, the raw functional brain images are fed to the brain network construction module, which builds network data with nodes and connections from the regional time series and a brain atlas. Second, the network data is fed to an encoder and, at the same time, given as positive samples to a discriminator in the adversarial learning module, which generates augmented data. The encoder then outputs graph embeddings, which are both decoded to reconstruct the network and used to calculate the loss function for community detection. This completes the forward computation and yields the detected brain communities; finally, backpropagation is carried out to update the model parameters.
Brain network construction is described in the dataset preparation section below, and the notation is explained in Table 1. In the auto-encoder, the encoder (E) adopts temporal graph attention networks to transform the time series of brain regions ($\{X^t\}_{t=1}^{T}$) and the brain functional connections ($A$) into the embeddings ($\{Z^t\}_{t=1}^{T}$). Additionally, in the adversarial regularizer, the encoder, which also acts as the generator (G), and the discriminator (D) engage in a min-max adversarial game in order to learn more suitable embeddings. The community assignment matrix is optimized using the measurable soft modularity loss in order to detect communities. Consequently, the encoder is trained with two objectives: a typical auto-encoder reconstruction loss and a measurable modularity loss for community detection.
Table 1. Explanation of notations.
Notation    Description
X           attributes of brain regions (nodes)
A           adjacency (connectivity) matrix of brain networks
H           hidden representations of region (node) attributes
Z           embeddings of region (node) attributes
φ           parameters of the discriminator
ψ           parameters of the generator
E           expectation
i, j        the i-th and j-th brain regions

Temporal graph autoencoder
The purpose of the temporal graph autoencoder is to encode the properties of the dynamic brain network in a low-dimensional latent space. The encoder is built from two distinct network blocks: the topological attention block and the temporal attention block. Each block is a stack of multiple layers of the respective type. Both use self-attention to obtain an effective time-sequential graph representation of each region from its neighbors and its historical context.

Topological attention layer
The initial input for this layer is a set of brain network attributes $x_i \in \mathbb{R}^d$, $\forall i \in V$, where $d$ is the dimension of the time series. The output is a set of brain region representations $h_i \in \mathbb{R}^f$, $\forall i \in V$, where $f$ is the dimension of the captured topological properties.
Similar to graph attention networks (GAT) [39], our topological attention layer attends to the immediate neighbors of brain region $i$ by calculating attention weights from the input brain region representations. Here $N_i = \{ j \in V : (i, j) \in E \}$ is the set of neighbors of region $i$ that are linked to it by the functional connectivity $A$; $W \in \mathbb{R}^{d \times f}$ is a weight transformation matrix applied to each region representation; $\sigma(\cdot)$ is the sigmoid activation function and $\|$ is the concatenation operation. The learned coefficients $\alpha_{ij}$, computed by a softmax over the neighbors, indicate the significance of brain region $i$ to region $j$. Note that a topological attention layer operates on the brain region representations at a single time step, so multiple topological attention layers can process the entire time sequence in parallel.
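The topological attention layer described above can be illustrated with a minimal NumPy sketch. It follows the standard GAT formulation rather than the exact equations of this framework, so several details are assumptions: the attention vector `a`, the LeakyReLU applied to the raw scores (as in GAT [39]), and the self-loops added so that every region has at least one neighbor. The function name `topological_attention` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def topological_attention(X, A, W, a):
    """GAT-style attention over functional neighbours at one time step.

    X: (N, d) region attributes, A: (N, N) connectivity, W: (d, f) weights,
    a: (2f,) attention vector (hypothetical; not defined in the text).
    Returns H: (N, f) region representations and alpha: (N, N) coefficients.
    """
    XW = X @ W                                   # project every region: (N, f)
    f = XW.shape[1]
    # a^T [W x_i || W x_j] splits into a source part and a neighbour part
    scores = leaky_relu((XW @ a[:f])[:, None] + (XW @ a[f:])[None, :])
    # keep only functional neighbours; self-loops are an added assumption
    mask = (A > 0) | np.eye(len(A), dtype=bool)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    alpha = weights / weights.sum(axis=1, keepdims=True)   # softmax over N_i
    H = sigmoid(alpha @ XW)                      # aggregate, then sigmoid
    return H, alpha

rng = np.random.default_rng(0)
N, d, f = 6, 8, 4
X = rng.normal(size=(N, d))
A = np.triu(rng.random((N, N)) > 0.5, k=1).astype(float)
A = A + A.T                                      # symmetric, zero diagonal
H, alpha = topological_attention(X, A, rng.normal(size=(d, f)),
                                 rng.normal(size=2 * f))
print(H.shape, alpha.sum(axis=1))
```

Each row of `alpha` sums to one and is zero outside the neighborhood, which is the property the softmax over $N_i$ is meant to provide.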

Temporal attention layer
Capturing the constantly changing patterns of brain networks in a global way is critical for dynamic community detection. When obtaining the attributes at the current time step, it is essential to consider the global temporal context. The central question is how to capture the temporal variations in the organization of brain networks over several time steps. Temporal attention layers address this problem using scaled dot-product attention [40], in which the queries, keys, and values are all derived from the input brain region representations.
We define $H_s = \{h_s^1, h_s^2, \ldots, h_s^T\}$, the representation sequence of a brain region $s$ at consecutive time steps, as the input, where $T$ is the number of time steps. The output of the layer is $Z_s = \{z_s^1, z_s^2, \ldots, z_s^T\}$, a new representation sequence for region $s$ at the different time steps.
Using $h_s^t$ as the query, the temporal attention layer attends over its historical representations, querying the temporal context of the neighborhood around region $s$. Temporal self-attention thus allows the discovery of relationships between the time-varying representations of a brain region across several time steps. Formally, the temporal attention layer is computed by a query-key dot-product attention operation, where $\beta_s \in \mathbb{R}^{T \times T}$ is the resulting attention coefficient matrix, and $W_q \in \mathbb{R}^{d \times f}$, $W_k \in \mathbb{R}^{d \times f}$ and $W_v \in \mathbb{R}^{d \times f}$ are linear projection matrices that transform the representations into a particular space. The two attention blocks are applied in sequence to obtain the final temporal representation, i.e., the output embeddings $Z$, which the decoder uses to reconstruct the brain network topology.
Here $\hat{A}$ is the reconstructed brain functional connectivity and $\sigma(\cdot)$ is again the sigmoid function. The classic reconstruction loss takes the form of a cross entropy between $A$ and $\hat{A}$.
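The temporal attention, decoding, and reconstruction steps described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the exact formulation of the framework: the causal mask (each time step attends only to itself and earlier steps), the inner-product decoder $\sigma(ZZ^\top)$ common to graph autoencoders, and the mean binary cross entropy are all assumed, and the function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(H_s, W_q, W_k, W_v, causal=True):
    """Scaled dot-product self-attention over the T time steps of one region.

    H_s: (T, d) representation sequence; W_q, W_k, W_v: (d, f) projections.
    Returns Z_s: (T, f) and the attention matrix beta: (T, T).
    """
    Q, K, V = H_s @ W_q, H_s @ W_k, H_s @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[1])
    if causal:  # assumption: a step may only look at its own history
        T = H_s.shape[0]
        future = np.triu(np.ones((T, T), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    beta = softmax(scores, axis=1)
    return beta @ V, beta

def decode(Z):
    """Inner-product decoder (assumed form): A_hat = sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

def reconstruction_loss(A, A_hat, eps=1e-9):
    """Mean binary cross entropy between A and its reconstruction."""
    A_hat = np.clip(A_hat, eps, 1.0 - eps)
    return float(-np.mean(A * np.log(A_hat) + (1 - A) * np.log(1 - A_hat)))

rng = np.random.default_rng(0)
T, d, f = 6, 4, 3
H_s = rng.normal(size=(T, d))
Z_s, beta = temporal_attention(H_s, *(rng.normal(size=(d, f)) for _ in range(3)))

Z = rng.normal(size=(5, f))                  # embeddings of 5 toy regions
A = np.triu(rng.random((5, 5)) > 0.5, k=1).astype(float)
A = A + A.T
A_hat = decode(Z)
loss = reconstruction_loss(A, A_hat)
print(Z_s.shape, beta.shape, loss)
```

With the causal mask, `beta` is lower triangular, so the representation at time step $t$ depends only on steps up to $t$; passing `causal=False` gives attention over the full sequence instead.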

Adversarial learning
In this adversarial model, the main objective is to enforce the brain network embeddings Z to match the prior distribution. Naive regularizers push the learned embeddings to conform to a Gaussian distribution rather than capture semantic diversity [41]. As a result, conventional network embedding techniques cannot effectively profit from adversarial learning. Generative data augmentation is needed to explore the underlying features of the data and offset the negative impact of small sample sizes. Therefore, we derive the prior distribution of communities by counting the different kinds of modules in the functional brain network that have been confirmed by neuroscience. The discriminator of the adversarial model is a three-layer fully connected network that identifies whether a latent code is generated from the prior distribution $p_z$ or from brain network data $(X, A)$ in the real-world dataset. During the minimax competition between the encoder and the discriminator in the training phase, the regularizer gradually improves the embeddings by generating augmented data as input to the auto-encoder.
The adversarial model defines a loss $L_G$ for the encoder (generator) and a loss $L_D$ for the discriminator. In these losses, $z$ is a latent code sampled from the prior distribution $p_z$ of empirically confirmed brain communities, and $D_\varphi(\cdot)$ and $G_\psi(\cdot)$ are the above-mentioned discriminator and encoder.
Formally, the objective of this adversarial learning model is a min-max criterion in which the discriminator and the encoder are optimized against each other.
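As an illustration of the adversarial objective, the sketch below uses the standard GAN losses with a non-saturating generator term; the exact expressions used by the framework may differ. The three-layer discriminator follows the description above, while the ReLU activations, the layer sizes, and the Gaussian stand-ins for samples from $p_z$ and for encoder outputs are assumptions made only so the example runs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator(z, params):
    """Three fully connected layers; returns P(z was drawn from the prior)."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h = np.maximum(z @ W1 + b1, 0.0)             # ReLU (assumed)
    h = np.maximum(h @ W2 + b2, 0.0)
    return sigmoid(h @ W3 + b3).ravel()

def discriminator_loss(d_prior, d_embed, eps=1e-9):
    """L_D: prior samples should score 1, encoder embeddings should score 0."""
    return float(-np.mean(np.log(d_prior + eps))
                 - np.mean(np.log(1.0 - d_embed + eps)))

def generator_loss(d_embed, eps=1e-9):
    """L_G: the encoder is rewarded when its embeddings pass as prior samples."""
    return float(-np.mean(np.log(d_embed + eps)))

rng = np.random.default_rng(0)
dim, hidden = 4, 16
params = [(rng.normal(scale=0.1, size=shape), np.zeros(shape[1]))
          for shape in [(dim, hidden), (hidden, hidden), (hidden, 1)]]
z_prior = rng.normal(size=(32, dim))   # stand-in for samples from p_z
z_embed = rng.normal(size=(32, dim))   # stand-in for encoder outputs G(X, A)
L_D = discriminator_loss(discriminator(z_prior, params),
                         discriminator(z_embed, params))
L_G = generator_loss(discriminator(z_embed, params))
print(L_D, L_G)
```

In training, the two losses would be minimized alternately: `L_D` with respect to the discriminator parameters φ and `L_G` with respect to the encoder parameters ψ.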

Measurable modularity loss
Modularity maximization is a community discovery technique commonly used to detect brain modules. Conceptually, a partition is regarded as high quality (and so has a higher Q score [24]) if the communities it forms are internally denser than would be expected by chance. Thus, the partition that attains the maximum value of Q is considered a good estimate of the community structure of a brain network. This intuition may be expressed as follows:
$$Q = \frac{1}{2m} \sum_{i,j} \left[ a_{ij} - c_{ij} \right] \delta(\omega_i, \omega_j)$$
Here $a_{ij}$ indicates the number of functional connections between regions $i$ and $j$; $c_{ij} = \frac{k_i k_j}{2m}$ denotes the expected number of connections under a null model, where $k_i = \sum_j A_{ij}$ is the degree of region $i$ and $2m = \sum_{ij} A_{ij}$ is the total number of connections in the brain network; $\delta(\omega_i, \omega_j) = 1$ if $\omega_i = \omega_j$, meaning that region $i$ and region $j$ are in the same community, and 0 otherwise.
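A small worked example may help make the modularity score concrete. The sketch below computes Q from the definitions above for a toy graph of two triangles joined by a single edge; the graph and the function name `modularity` are illustrative only.

```python
import numpy as np

def modularity(A, labels):
    """Q = (1 / 2m) * sum_ij (A_ij - k_i k_j / 2m) * [labels_i == labels_j]."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                    # degree of each region
    two_m = A.sum()                      # total connection weight, 2m
    expected = np.outer(k, k) / two_m    # null-model term c_ij
    same = labels[:, None] == labels[None, :]
    return float(((A - expected) * same).sum() / two_m)

# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

print(modularity(A, [0, 0, 0, 1, 1, 1]))   # 5/14, about 0.357
print(modularity(A, [0, 0, 0, 0, 0, 0]))   # 0.0: one community, no structure
```

Splitting the graph along the bridge gives Q = 5/14, whereas placing every node in one community gives Q = 0, which matches the intuition that a good partition is denser inside communities than chance would predict.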
Inspired by [42], we develop a differentiable objective for optimizing the community assignment matrix $P = \mathrm{softmax}(Z) \in \mathbb{R}^{N \times C}$, a matrix of probabilities that each brain region belongs to each community. The measurable modularity loss employed by our framework is defined in terms of the modularity matrix $B = A - \frac{dd^\top}{2m}$, where $d$ is the vector of region degrees, $C$ is the number of communities and $N$ is the number of regions in the brain networks. The regularization term in this loss ensures that the model can identify communities of the predicted size.
Thus, the total loss used to optimize the encoder during training, in order to obtain better embeddings, is the sum of the three loss terms above: the reconstruction loss, the adversarial loss of the encoder, and the measurable modularity loss.
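A differentiable modularity objective of this kind can be sketched in NumPy as follows. The form shown, a soft modularity term $-\frac{1}{2m}\mathrm{Tr}(P^\top B P)$ plus a collapse regularizer $\frac{\sqrt{C}}{N}\lVert \sum_i P_i \rVert_F - 1$, is an assumption based on common practice for soft modularity losses and may not match the framework's exact expression; `reg_weight` and the function name are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_modularity_loss(Z, A, reg_weight=1.0):
    """Negative soft modularity plus a collapse regularizer (assumed form).

    Z: (N, C) embeddings/logits, A: (N, N) connectivity matrix.
    """
    P = softmax(Z, axis=1)               # soft community assignments
    N, C = P.shape
    d = A.sum(axis=1)                    # degree vector
    two_m = d.sum()
    # Tr(P^T B P) with B = A - d d^T / 2m, without forming B explicitly
    dP = d @ P
    trace = np.trace(P.T @ A @ P) - (dP @ dP) / two_m
    modularity_term = -trace / two_m
    # zero for perfectly balanced communities, sqrt(C) - 1 when all collapse
    collapse_reg = np.sqrt(C) / N * np.linalg.norm(P.sum(axis=0)) - 1.0
    return float(modularity_term + reg_weight * collapse_reg)

# Two triangles joined by one edge, as a toy brain network.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

balanced = 50.0 * np.eye(2)[[0, 0, 0, 1, 1, 1]]    # near-hard, correct split
collapsed = 50.0 * np.eye(2)[[0, 0, 0, 0, 0, 0]]   # everything in one community
print(soft_modularity_loss(balanced, A), soft_modularity_loss(collapsed, A))
```

For near-hard assignments the first term reduces to the negative of the modularity Q, and the regularizer penalizes the trivial solution in which all regions fall into a single community.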

Experiments and results
In this part, we assess the performance of TGAL in terms of both dynamic community detection and graph representation learning. Dynamic community detection is measured with metrics from graph theory and machine learning, and graph representation learning is evaluated with binary classification metrics.

Materials and datasets preparation
The neurobiological experiment dataset contains two groups of different sizes: raw functional MRI images of 8 rats without drug injection and 16 rats injected with the drug (nicotine), each with 800 time series. The injection was administered midway through the fMRI scan, and the duration of the injection itself was ignored. The injected drug stimulates the nervous system of the rat brain and disturbs the brain network.
• Brain networks construction: By preprocessing long-term functional MRI scans of the experimental rats, we produced the dynamic brain network dataset needed for the experiment. Preprocessing was first performed using the Statistical Parametric Mapping 8 (SPM8) tool in MATLAB. To account for head motion, the functional signals were realigned and unwarped, and the average motion-corrected image was coregistered with the high-resolution anatomical T2 image. The functional data were then smoothed using an isotropic 3 mm full-width at half-maximum (FWHM) Gaussian kernel. Based on the Wistar rat brain atlas, 150 functional brain regions were defined. We used magnitude-squared coherence to evaluate the spectral correlation between regional time series, resulting in a 150 × 150 functional connectivity matrix for each time step, whose elements give the strength of functional connectivity between all pairs of regions. Therefore, in the time-varying dynamic brain network, the BOLD signal sequence from fMRI is used as the nodal attribute of each brain region, the adjacency matrix is given by the preprocessed functional connectivity, and the entire time period is divided equally into 6 time steps.
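The construction of a time-varying connectivity matrix from regional time series can be sketched as below. This is an illustrative NumPy version under several assumptions: a Welch-style estimate with a Hann window, 50% overlap, and a segment length of 32 samples; averaging the coherence over frequency bins to obtain one value per region pair; and toy random signals in place of BOLD data. The function names are hypothetical, and a library routine such as `scipy.signal.coherence` could be used for the per-pair estimate instead.

```python
import numpy as np

def mean_msc(x, y, nperseg=32):
    """Magnitude-squared coherence of x and y, averaged over frequency bins."""
    step = nperseg // 2                          # 50% overlap (assumed)
    window = np.hanning(nperseg)
    starts = range(0, len(x) - nperseg + 1, step)

    def spectra(sig):
        segs = np.array([sig[s:s + nperseg] for s in starts])
        segs = segs - segs.mean(axis=1, keepdims=True)   # remove segment mean
        return np.fft.rfft(segs * window, axis=1)

    Fx, Fy = spectra(x), spectra(y)
    Pxx = (np.abs(Fx) ** 2).mean(axis=0)
    Pyy = (np.abs(Fy) ** 2).mean(axis=0)
    Pxy = (Fx * np.conj(Fy)).mean(axis=0)
    coh = np.abs(Pxy) ** 2 / (Pxx * Pyy + 1e-12)
    return float(coh[1:].mean())                 # skip the DC bin

def dynamic_connectivity(ts, n_steps=6, nperseg=32):
    """ts: (R, T) regional series -> (n_steps, R, R) coherence matrices."""
    R = ts.shape[0]
    out = np.ones((n_steps, R, R))               # self-coherence is 1
    for t, chunk in enumerate(np.array_split(ts, n_steps, axis=1)):
        for i in range(R):
            for j in range(i + 1, R):
                out[t, i, j] = out[t, j, i] = mean_msc(chunk[i], chunk[j], nperseg)
    return out

rng = np.random.default_rng(0)
ts = rng.normal(size=(4, 800))                   # 4 toy regions, 800 samples
ts[1] = ts[0] + 0.1 * rng.normal(size=800)       # region 1 closely follows region 0
A_t = dynamic_connectivity(ts)
print(A_t.shape)
```

For the dataset described above, `ts` would have 150 rows, giving six 150 × 150 matrices per animal; the toy example uses four regions only to keep the pairwise loop short.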

Implementation details
TGAL was implemented in Python with PyTorch as the backend. Two NVIDIA GeForce RTX 3080 Ti GPUs were used to accelerate the training of the networks. During training, the number of epochs was set to 1000 and the learning rate to 0.001. Adam was used as the optimizer with a weight decay of 0.01 to reduce overfitting. The encoder was built from two topological and two temporal attention layers. We performed each experiment 10 times and report the mean result. We set the regularization value to 0.5 and the number of communities to 10 for all datasets and methods.
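For reference, the reported settings can be collected in a single configuration object. The sketch below is only one possible way to organize them; the class and field names are hypothetical, while the values are those stated above and in the dataset description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TGALConfig:
    """Training settings reported in the text (field names are hypothetical)."""
    epochs: int = 1000
    learning_rate: float = 1e-3
    weight_decay: float = 0.01        # passed to the Adam optimizer
    n_topological_layers: int = 2
    n_temporal_layers: int = 2
    regularization: float = 0.5
    n_communities: int = 10
    n_runs: int = 10                  # each experiment is repeated and averaged
    n_regions: int = 150
    n_time_steps: int = 6

config = TGALConfig()
# With PyTorch, these values would typically be used as
# torch.optim.Adam(model.parameters(), lr=config.learning_rate,
#                  weight_decay=config.weight_decay)
print(config)
```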

Baseline
Our approach was compared against the following two baselines:
• GAE [44]: currently the most common autoencoder-based unsupervised framework for graph data, in which the encoder is composed of a two-layer graph convolutional network to leverage topological information.
• ARGA [45]: an adversarially regularized autoencoder method that employs a graph autoencoder to learn the representations and regularizes the latent codes by forcing them to match a prior distribution; unlike our method, it uses a simple Gaussian distribution as the prior.

Metrics
For graph-level metrics, we report the average community conductance C and the modularity Q. For ground-truth label correlation analysis, we report the normalized mutual information (NMI) between the community assignments and the labels, and the pairwise F1 score between all node pairs and their associated community pairs.
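Two of these metrics can be made concrete with a short NumPy sketch. The definitions used here are assumptions, since the text does not spell them out: the conductance of a community is taken as the weight of edges leaving it divided by the smaller of its volume and the volume of its complement, and the pairwise F1 score treats every pair of nodes as a binary decision ("same community" or not). The function names are hypothetical.

```python
import numpy as np

def average_conductance(A, labels):
    """Mean over communities of cut(S) / min(vol(S), vol(complement of S))."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    total = A.sum()
    values = []
    for c in np.unique(labels):
        inside = labels == c
        vol = A[inside].sum()                    # total degree inside S
        cut = A[inside][:, ~inside].sum()        # edges leaving S
        denom = min(vol, total - vol)
        values.append(cut / denom if denom > 0 else 0.0)
    return float(np.mean(values))

def pairwise_f1(pred, true):
    """F1 over all node pairs, where a positive is 'same community'."""
    pred, true = np.asarray(pred), np.asarray(true)
    upper = np.triu_indices(len(pred), k=1)
    same_pred = (pred[:, None] == pred[None, :])[upper]
    same_true = (true[:, None] == true[None, :])[upper]
    tp = np.sum(same_pred & same_true)
    fp = np.sum(same_pred & ~same_true)
    fn = np.sum(~same_pred & same_true)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

# Two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

print(average_conductance(A, [0, 0, 0, 1, 1, 1]))          # 1/7
print(pairwise_f1([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])) # 1.0
```

Lower conductance indicates better separated communities, and the pairwise F1 score is invariant to how the community labels are numbered, as the second call shows.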

Ablation study
As indicated in Table 2, we conducted an ablation study on community detection to evaluate the effectiveness of the proposed TGAL framework in terms of three aspects: the encoder composition, the adversarial learning module, and the loss function. Three significant outcomes were observed:
1) In the comparison of graph-level metrics, the k-means-based technique performed well for community conductance, while the modularity loss-based method performed better for community modularity. This is because the two methods have fundamentally different optimization goals; the modularity loss is designed to maximize modularity.
2) The strategy with the adversarial regularizer performs well on average; this demonstrates that adversarial learning serves as an adjunct to graph representation learning.
3) Our technique performs worse when the proposed encoder is replaced with a two-layer graph convolutional encoder; this suggests that the proposed encoder learns better embeddings by virtue of its powerful attention mechanism [43].
Table 2. Community detection performance on the rat brain networks dataset by graph conductance C, modularity Q, NMI, and pairwise F1 measure. GCN is the graph convolutional neural network and TG is the abbreviation of our proposed Temporal Graph encoder. AL indicates whether the Adversarial Learning module is implemented.
Figure 2 depicts the results of our dynamic community experiment. It illustrates the changes in the spatial distribution of the three major brain communities detected by our method over successive time steps. The community detection algorithm clusters the nodes representing brain regions into modules, and the three communities with the largest number of nodes are tracked as they evolve across time steps. Each time step corresponds to one of the equal periods described in the dataset section above. As can be observed from the figure, there was no significant change in the distribution of brain communities during time steps 1-3, and the spatial location of the brain communities remained largely stable during this period. There was, however, a significant change during time steps 4-6: the spatial distribution of the three main brain communities changed to some extent in each of the periods after time step 3. This is because the drug was injected after time step 3, which changed the characteristics and topology of the rats' brain networks. Thus, the experimental results are consistent with neuroscientific facts.

Graph representation learning performance
We divided the data from drug-injected and non-drug-injected rats into two groups and verified whether the models learned efficient graph representations by comparing binary classification performance with state-of-the-art graph representation learning models.
Figure 2. In each time step, the three most dominant brain communities are represented by the red, yellow and blue spheres, and the spatial location of each sphere represents the location of the corresponding region in the brain. The cartoon between time step 3 and time step 4 represents the drug administration.

Comparison methods
• DGI [46]: highlights the importance of combining cluster and representation learning. We learn unsupervised graph representations with DGI, and both DGI and our method run an SVM on the final representations as the classifier.

Metrics
The performance of binary classification is evaluated using six quantitative metrics: 1) accuracy (ACC), 2) area under the receiver operating characteristic curve (AUC), 3) precision (PRE), 4) recall (REC), 5) receiver operating characteristic (ROC), and 6) F1 score. Our technique is evaluated using leave-one-out cross-validation (LOOCV), since we only have a small quantity of data. In each fold, one of the N subjects is held out for testing, and the remaining N − 1 subjects are used for training only. The hyperparameters of each technique are tuned to their optimal values via a greedy search algorithm.
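The leave-one-out protocol and the threshold-based metrics can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier is swapped in for the SVM used in the experiments, and the subject-level representations are synthetic; both are assumptions, as are the function names. AUC and ROC are omitted because they require continuous decision scores rather than hard predictions.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    acc = float(np.mean(y_pred == y_true))
    pre = float(tp / (tp + fp)) if tp + fp else 0.0
    rec = float(tp / (tp + fn)) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    return {"ACC": acc, "PRE": pre, "REC": rec, "F1": f1}

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (not the SVM of the paper): closest class mean."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    return int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

def loocv(X, y):
    """Hold out each subject once, train on the remaining N - 1 subjects."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        preds[i] = nearest_centroid_predict(X[train], y[train], X[i])
    return binary_metrics(y, preds)

rng = np.random.default_rng(0)
# Synthetic, well-separated representations: 8 control and 16 injected subjects.
X = np.vstack([rng.normal(0.0, 1.0, size=(8, 4)),
               rng.normal(10.0, 1.0, size=(16, 4))])
y = np.array([0] * 8 + [1] * 16)
results = loocv(X, y)
print(results)
```

Because every subject is tested exactly once, the reported metrics are computed over all N held-out predictions rather than averaged over folds.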

Classification results
As shown in Figure 3, our method achieves better overall classification performance. This suggests that the representation obtained by our method in the unsupervised learning process is stronger. Figures 4 and 5 compare the binary classification accuracy and precision of our model with those of the DGI method at different time steps. Again, the overall superiority of our method can be seen. It can also be observed that the classification metrics at time steps 4-6 are generally better than those at time steps 1-3, which may be because classification becomes easier once the alterations in the dynamic brain network start to appear after drug injection.

Discussion
Among the factors that affect the performance of our framework in the experiments, one of the most important is the presence of noise in the data, which is a common problem in current biomedical experiments.
In the data processing part of this experiment, we carefully considered the effect of noisy data on the experimental results. The acquired fMRI data are screened, and unqualified data are excluded from the subsequent experiments. Since the scanned subjects move their heads slightly during data acquisition, the acquired images are noisy. Head movement correction is used to make the processed images overlap with the target images as much as possible, so that the same voxel corresponds to the same position in the brain at each moment. In this way, the interference of noise in the experiment is reduced. Several preprocessing steps, such as normalization, aim to minimize errors due to data acquisition and physiological properties. However, it is still not possible to completely exclude the effect of noise in the data.
Therefore, our deep learning framework also incorporates a mechanism that resists noise to some extent: generating augmented data through adversarial learning allows the model to learn more about the distribution of the original data and extract more robust features, thereby weakening the negative impact of noisy samples. From the results of the experimental section above, we can observe that the generative adversarial strategy improves the performance of our framework. However, in general, we still do not obtain ideal detection results and classification accuracy, due to the negative impact of noisy samples.
In summary, the current method still has some limitations: it can only reduce the impact of noise as much as possible, and the noise problem cannot be completely eliminated. How to better eliminate the impact of noisy data is the direction of our future research.

Conclusions
In this research, we develop a new framework, termed time-sequential graph adversarial learning (TGAL), which incorporates community detection into a graph representation learning process guided by an adversarial regularizer. The TGAL framework uses a temporal graph attention encoder to incorporate topological input properties, temporal contextual representations, and latent factors. It also employs adversarial training with a neuroscientific prior to regularize the embedding space. On dynamic brain network datasets, our strategy surpassed several unsupervised deep embedding and community identification methods. Using the resulting graph representations for classification yielded better results than the comparison method, indicating that the learned graph representations give more accurate graph embeddings in the latent space. The ablation study and discussion examined, among other concerns, the contributions of the encoder and the adversarial regularizer. Finally, our framework can be applied as a technology for medically relevant research, and the detected dynamic brain communities altered by drugs can serve as potential biological markers to facilitate biomedical research on the pharmacodynamic and pharmacokinetic properties of drugs for neurological diseases and to support the development of drug therapies for brain diseases.