Deep Learning Techniques for Community Detection in Social Networks

Graph embedding is an effective and efficient way to convert graph data into a low-dimensional space. In recent years, deep learning has been applied to graph embedding and has shown outstanding performance. The adjacency matrix is often used as the storage data structure of a graph; however, it carries insufficient spatial proximity information. Therefore, this study proposes a deep community detection method that includes (1) a matrix reconstruction method, (2) a spatial feature extraction method and (3) a community detection method. The original adjacency matrix of a social network is reconstructed based on the opinion leader and its nearer neighbors to obtain a spatial proximity matrix. The spatial proximity matrix exposes subspaces of the graph, which helps a convolutional neural network extract spatial localization easily and quickly. The spatial eigenvector of the reconstructed adjacency matrix is extracted by an auto-encoder based on a convolutional neural network to improve modularity. In experiments, four open datasets of practical social networks were selected to evaluate the proposed method, and the experimental results show that the proposed deep community detection method obtained modularity higher than or comparable to that of other deep learning methods. Therefore, the proposed deep community detection method can effectively detect high-quality communities in social networks.


I. INTRODUCTION
Graphs are an important data representation of complex networks. Effective community detection gives users a deeper understanding of networks and benefits many applications, such as node classification, node recommendation and link prediction. However, the large scale and high dimensionality of networks cause community detection methods to suffer from high computation and space costs [1], [2]. Graph embedding is an effective and efficient way to convert graph data into a low-dimensional space in which the network structural information and network properties are maximally preserved [3]-[5].
Deep learning (DL) has shown outstanding performance in a wide variety of research fields, such as social networks and graph embedding. DL-based graph embedding applies DL models to networks. At present, DL-based graph embedding can be divided into two categories based on whether random walks are adopted to sample paths from a graph [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Chun-Wei Tsai.
(1) DL-based graph embedding with random walk: a graph is represented as a set of random walk paths sampled from it. Deep learning methods are then applied to the sampled paths for graph embedding, which preserves the graph properties carried by the paths [4], [5]. (2) DL-based graph embedding without random walk: these methods apply deep models directly to a whole graph (or to a proximity matrix of a whole graph) [6]-[9].
A graph is usually represented by an adjacency matrix, which stores the direct connections between nodes. The adjacency matrix does not store the proximity of indirect connections between nodes, nor can it express the spatial proximity between nodes, so it carries insufficient spatial proximity information. To solve this problem and improve the accuracy of feature extraction, some auto-encoder (AE) based graph embedding studies [9]-[15] optimized the input vector by applying different preprocessing steps to the adjacency matrix. In [13], the sparse adjacency matrix is preprocessed so that the resulting similarity matrix reflects not only the similarity between connected nodes in the network topology but also the similarity between unconnected nodes. Moreover, because these auto-encoder networks are built with a fully-connected strategy, noise may be introduced when unrelated nodes exist in the social network. Because the convolutional neural network (CNN) is a powerful tool for extracting spatial localization, CNNs have also been applied to graph embedding [16]-[23], which requires simulating the convolution operation on the graph. The problem, therefore, is how to improve the adjacency matrix so that it stores the spatial proximity between nodes.
Therefore, analogous to image-based convolutional networks that operate on locally connected regions of the input, this study presents an approach for reconstructing the adjacency matrix so that it stores the spatial proximity between nodes. With spatial proximity, nodes that are spatially close to each other are also stored close to each other in the matrix.
Considering that AE-based and CNN-based graph embedding have both proved robust and effective, this study combines AE and CNN to improve the quality of feature extraction. This study therefore proposes a deep community detection method that includes (1) a matrix reconstruction method, (2) an AE and CNN-based spatial feature extraction method and (3) a community detection method (shown in Figure 1). Because the original adjacency matrix contains limited cyberspace information for presenting the structure of the network topology, the proposed matrix reconstruction method, based on a novel cyberspace structure reconstruction strategy, finds the opinion leader in the social network and then finds the nearer neighbors in order to reconstruct the adjacency matrix as a spatial proximity matrix. Furthermore, this study proposes a spatial feature extraction method based on an auto-encoder and a convolutional neural network to extract the spatial features of the reconstructed adjacency matrix. Finally, the extracted spatial features are fed into the proposed community detection method to cluster the nodes of the social network into k groups.
The contributions of this study are summarized as follows. (1) A matrix reconstruction method based on a novel cyberspace structure reconstruction strategy is proposed to obtain the spatial proximity matrices of social networks, which expose subspaces of the graph and help a convolutional neural network extract spatial localization easily and quickly. (2) A spatial feature extraction method based on an auto-encoder and a convolutional neural network is proposed to learn the spatial eigenvector and effectively extract the spatial features of social networks. (3) The matrix reconstruction method and the spatial feature extraction method are applied to analyze the topology of social networks in order to improve social community detection. The remainder of the paper is organized as follows. Section II reviews the literature on deep neural network based network embedding methods. Section III presents the proposed deep community detection method, which analyzes the structure of social networks to extract their features and detect social communities. The proposed methods are evaluated, and practical experimental results are analyzed and discussed, in Section IV. The conclusions and future research directions are summarized in Section V.
VOLUME 8, 2020

II. RELATED WORK
As noted in the introduction, deep neural network based network embedding methods fall into two categories: (1) deep learning models with random walk and (2) deep learning models without random walk [3].
Deep neural network based network embedding methods with random walk are applied to sampled paths, so that the graph embedding preserves the graph properties carried by the paths. The best-known deep learning model with random walk is DeepWalk [4]. The success of DeepWalk motivated many subsequent studies that apply deep learning models; for example, Word2Vec [5] follows the idea of DeepWalk but changes the settings of the random walk sampling method.
Deep neural network based network embedding without random walk applies deep models to a whole graph. Two popular deep learning models without random walk used in network embedding are AE and CNN.

A. AE-BASED NETWORK EMBEDDING METHODS
AE is an effective algorithm for compressing the dimensionality of data. AE-based network embedding algorithms often change the settings of either the input vectors [9]-[15] or the loss function [9], [15]. Structural deep network embedding (SDNE) [9] constructed a semi-supervised deep model, which improved the input vectors by jointly exploiting first-order and second-order proximity to preserve the network structure. The model also added first-order proximity to the loss function as supervisory information to capture the local network structure. Deep network representations with adversarially regularized autoencoders (NetRA) [15] improved the loss function by jointly minimizing the AE loss and a locality-preserving loss. The model learned smoothly regularized vertex representations that capture the network structure well by jointly considering locality-preserving and global reconstruction constraints.

B. CNN-BASED NETWORK EMBEDDING METHODS
CNN and its variants have been widely adopted in network embedding. CNN-based network embedding methods use CNN models designed for Euclidean domains [16]-[18] and for non-Euclidean domains [19]-[21]. Xu et al. re-formatted the adjacency matrix of a complex network topology into an image and designed a CNN model to extract and classify the relevant features [16]. PATCHY-SAN [18] selected a fixed-length node sequence from a graph and then assembled, normalized and labelled the nodes' neighborhoods to learn a neighborhood representation with a CNN model. Hanocka et al. [19] utilized MeshCNN for a direct analysis of 3D shapes to leverage their intrinsic geodesic connections.
CNN-based graph embedding can also be divided into spectrum-based convolution [23] and space-based convolution [24], [25] according to how the convolution operation is simulated on the graph [22]. Defferrard et al. [23] designed fast k-localized (k means the shortest distance between kernel and neighbors) convolutional filters on graphs and presented a formulation of CNNs in the context of spectral graph theory. Dai et al. [24] proposed Stochastic Steady-state Embedding (SSE). The model took the sum of neighbor nodes as a latent expression of node degree. It updated each node's latent random representation in an asynchronous network, recursively estimating the node's latent representation and updating the parameters on batch-sampled data. Sperli [25] introduced a CNN-based community detection method for sparse matrices, in which the convolutional layer and the max-pooling layer processed the nonzero values in the adjacency matrix to reduce the dimension of the data. Inspired by the AE-based and CNN-based graph embedding methods mentioned above, this paper puts forward a CNN+AE based deep community detection method, which reconstructs the adjacency matrix with spatial proximity, to extract higher-quality spatial features with fewer dimensions.

III. DEEP COMMUNITY DETECTION METHOD
This section presents the concepts of the proposed matrix reconstruction method, spatial feature extraction method and community detection method.

A. MATRIX RECONSTRUCTION METHOD
Because the original order of nodes in an adjacency matrix is arbitrary, a matrix reconstruction method is proposed to reorder the nodes in the adjacency matrix so that highly correlated neighbor nodes are placed next to each other. After the adjacency matrix is reconstructed, spatial patterns among nodes can therefore emerge, which supports spatial feature extraction and social community detection. The proposed matrix reconstruction method consists of three sub-methods: (1) an opinion leader election method, (2) a neighbor node selection method and (3) a reconstruction method, described in the following subsections.

1) OPINION LEADER ELECTION METHOD
The opinion leader, who is an important member of a group (i.e., an important node in an adjacency matrix), may affect the attitudes of other members. In practical social networks, several members of a community may follow and befriend the opinion leader. Therefore, the proposed opinion leader election method analyzes the influence of each node to find the most important node (i.e., the most influential opinion leader) in the adjacency matrix, which serves as the initial node for matrix reconstruction.
In this study, an adjacency matrix $A$ is constructed according to the edges between each pair of nodes. $V = \{v_1, v_2, \ldots, v_n\}$ denotes the node set, and $E$ denotes the edge set. If an edge between node $v_i$ and node $v_j$ exists, the value of $a_{i,j}$ is 1; otherwise, the value of $a_{i,j}$ is 0 (shown in Equation (1)).

$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{bmatrix} \tag{1}$$

A transition probability matrix $C$ (shown in Equation (2)) can be constructed according to the adjacency matrix $A$. When a node connects to many nodes, the weight of each of its edges is lower; when a node connects to fewer nodes, the weight of each of its edges is higher. For instance, if node $v_i$ connects only to node $v_j$, node $v_j$ is an important node for node $v_i$, and the transition probability $c_{i,j}$ is 1.
In the initial stage, the weight of each node is 1, and a node weight matrix $S$ is constructed as in Equation (3), where $s_i = 1$ in the initial stage. The limiting weight matrix $S^*$ is calculated based on the transition probability matrix $C$ (shown in Equation (4)). Finally, the node i_leader with the highest weight (i.e., the opinion leader) is found according to Equation (5).
The pseudocode of the proposed opinion leader election method is presented in Algorithm 1.
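The election procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes that Equation (2) normalizes each row of $A$ by the node degree (which matches the example where $c_{i,j} = 1$ when $v_i$ connects only to $v_j$) and that one iteration computes the new weight of $v_i$ as the sum over $j$ of $s_j c_{j,i}$, as in Algorithm 1. The function name `elect_opinion_leader` and the parameter `max_iter` are hypothetical.

```python
import numpy as np

def elect_opinion_leader(A, max_iter=100):
    """Return (leader index, final node weights) for adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    # Assumed form of the transition matrix: c[i, j] = a[i, j] / degree(i).
    # Rows of isolated nodes stay all-zero to avoid division by zero.
    C = np.divide(A, degree, out=np.zeros_like(A), where=degree > 0)
    s = np.ones(A.shape[0])          # every node starts with weight 1
    for _ in range(max_iter):
        # new weight of v_i = sum_j (weight of v_j) * (edge weight v_j -> v_i)
        s = s @ C
    return int(np.argmax(s)), s

# Example: a triangle (nodes 0, 1, 2) with a pendant node 3 attached to node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
leader, weights = elect_opinion_leader(A, max_iter=200)
```

Under these assumptions the plain iteration need not converge on every graph (on a bipartite graph, for example, the weights alternate between two states), so the choice of the iteration count matters in practice.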

2) NEIGHBOR NODE SELECTION METHOD
After finding the opinion leader (i.e., node i_leader), a neighbor node selection method is proposed to find the neighbor node j_neighbor that is most relevant to node i_leader.
The node i_leader is denoted as node $v_i$, and a Euclidean distance $r(i, j)$ is applied to measure the distance between node $v_i$ and node $v_j$ based on Equation (6). The node j_neighbor with the shortest distance (i.e., the nearest neighbor) is found according to Equation (7).

Algorithm 1 Opinion Leader Election Method
Inputs: an adjacency matrix $A_{n \times n}$ with n nodes and the number of iterations $I_{max}$
Outputs: the node ID of the opinion leader
(1) Construct the adjacency matrix $A_{n \times n}$.
(2) for i = 1 to n
(3)   for j = 1 to n
(4)     Determine the weight of the edge from node $v_i$ to node $v_j$ based on the number of connections of node $v_i$.
(5)   end for
(6) end for
(7) for i = 1 to n
(8)   Set the weight of node $v_i$ in the initial stage as 1.
(9) end for
(10) while (the number of iterations is less than $I_{max}$)
(11)   for i = 1 to n
(12)     Set the weight of node $v_i$ in the next iteration as 0.
(13)     for j = 1 to n
(14)       Calculate the weight of node $v_j$ multiplied by the weight of the edge from node $v_j$ to node $v_i$ as the value of temp.
(15)       Add the value of temp to the weight of node $v_i$ in the next iteration.
(16)     end for
(17)   end for
(18) end while
(19) Return the ID of the node with the highest weight.

3) RECONSTRUCTION METHOD
The order of nodes in the adjacency matrix $A$ can be determined based on the proposed opinion leader election method and neighbor node selection method. The opinion leader node is elected as the first node, and the nearest neighbor of the opinion leader node is elected as the second node; then the nearest neighbor of the second node is elected by the neighbor node selection method in accordance with Equations (6) and (7), and so on. The adjacency matrix $A$ can be reconstructed as matrix $X$ (shown in Equation (8)) by Algorithm 2.
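The greedy reordering described above can be sketched as follows. This is only an illustration under stated assumptions: the Euclidean distance $r(i, j)$ is assumed to be taken between rows $i$ and $j$ of $A$, each node is assumed to be visited exactly once, ties are broken by the lowest node index, and the reconstructed matrix $X$ is assumed to be $A$ with the same permutation applied to rows and columns. The function name `reconstruct_adjacency` is hypothetical.

```python
import numpy as np

def reconstruct_adjacency(A, leader):
    """Reorder A greedily: leader first, then repeatedly the nearest unvisited node."""
    A = np.asarray(A, dtype=float)
    remaining = set(range(A.shape[0])) - {leader}
    order = [leader]
    current = leader
    while remaining:
        candidates = sorted(remaining)  # sorted so that ties pick the lowest index
        # Assumed distance: Euclidean distance between adjacency rows.
        dists = np.linalg.norm(A[candidates] - A[current], axis=1)
        current = candidates[int(np.argmin(dists))]
        order.append(current)
        remaining.remove(current)
    # Apply the same permutation to rows and columns (an assumption).
    X = A[np.ix_(order, order)]
    return X, order

# Example: two triangles {0, 2, 4} and {1, 3, 5} joined by the edge 0-1.
A = np.array([[0, 1, 1, 0, 1, 0],
              [1, 0, 0, 1, 0, 1],
              [1, 0, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 0]])
X, order = reconstruct_adjacency(A, leader=0)
```

In this small example the interleaved node numbering is undone: the members of each triangle end up adjacent in the new order, so the two communities appear as two diagonal blocks of $X$.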

B. SPATIAL FEATURE EXTRACTION METHOD
Because the convolutional neural network is an excellent deep learning model for spatial analysis, this study proposes an auto-encoder based on a convolutional neural network to automatically extract the spatial features of the reconstructed adjacency matrix of a social network. In the proposed auto-encoder, the number of neurons in the input layer is the same as the number of neurons in the output layer. The spatial features of social networks are extracted by the convolutional layer, so this study adopts the values of the neurons after the convolutional computation as spatial feature vectors.

1) THE DESCRIPTION OF A SIMPLE CASE
In this case, the input layer and the output layer each have four neurons (i.e., four nodes in a social network). The size of the filter in the convolutional layer (i.e., the first hidden layer) is 1 × 3, so two neurons are trained in the hidden layer. The structure of the auto-encoder based on a convolutional neural network is shown in Figure 2. Each parameter in the neural network is described as follows.
• The weight set of the filter in the convolutional layer is $\{\alpha_1, \alpha_2, \alpha_3\}$.
• The adjustment variables of the two neurons in the convolutional layer are $b_{1,1}$ and $b_{1,2}$.
• The loss function uses the mean squared error in Equation (15) to optimize the neural network.

Algorithm 2 Reconstruction Method
Inputs: an adjacency matrix $A_{n \times n}$ with n nodes
Outputs: the reconstructed adjacency matrix $X_{n \times n}$
(1) Construct the adjacency matrix $A_{n \times n}$.
(2) Create the list $L_n$.
(3) Create the list $O_n$.
(4) for i = 1 to n
(5)   Push node $v_i$ into the list $L_n$.
(6) end for
(7) Set the node i_leader from the opinion leader election method as node $v_i$.
(8) while (the length of the list $L_n$ is larger than 0)
(9)   Pull node $v_i$ from the list $L_n$.
(10)  Push node $v_i$ into the list $O_n$.
(11)  for each node $v_j$ remaining in the list $L_n$
(12)    Calculate the Euclidean distance between node $v_i$ and node $v_j$.
(13)  end for
(14)  Find node j_neighbor by the neighbor node selection method and set it as node $v_i$.
(15) end while
(16) Construct the adjacency matrix $X_{n \times n}$ according to the list $O_n$.
(17) Return the reconstructed adjacency matrix $X_{n \times n}$.

The gradient descent method is adopted to optimize the auto-encoder based on a convolutional neural network. The weights in the neural network are updated by Equations (16), (17), (18), (19), (20) and (21). When the training process is complete, the reconstructed adjacency matrix is fed into the trained neural network, and the values of the neurons in the convolutional layer (i.e., $\{h_1, h_2\}$) are retrieved as the extracted spatial features.
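To make the simple case concrete, the sketch below trains a tiny auto-encoder with one shared 1 × 3 filter on four-dimensional records and then reads out the two hidden values per record. Several details are assumptions made only for illustration: sigmoid activations, a fully-connected output layer with weights `W` and biases `b2`, stride 1 without padding (so the hidden layer has n − 3 + 1 neurons), full-batch updates, and the learning rate. The names `train_autoencoder`, `encode` and `windows` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def windows(x, width):
    """Stack the sliding windows of a 1-D record: shape (q, width)."""
    q = len(x) - width + 1
    return np.stack([x[k:k + width] for k in range(q)])

def encode(X, alpha, b1):
    """Hidden-layer values (the spatial features) for every record in X."""
    X = np.asarray(X, dtype=float)
    return np.stack([sigmoid(windows(x, len(alpha)) @ alpha + b1) for x in X])

def train_autoencoder(X, width=3, epochs=300, lr=0.5, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_records, n = X.shape
    q = n - width + 1                           # hidden neurons (stride 1, no padding)
    alpha = rng.normal(scale=0.1, size=width)   # shared filter weights
    b1 = np.zeros(q)                            # hidden-layer biases
    W = rng.normal(scale=0.1, size=(n, q))      # assumed dense decoder weights
    b2 = np.zeros(n)
    losses = []
    for _ in range(epochs):
        g_alpha, g_b1 = np.zeros_like(alpha), np.zeros_like(b1)
        g_W, g_b2 = np.zeros_like(W), np.zeros_like(b2)
        total = 0.0
        for x in X:
            win = windows(x, width)
            h = sigmoid(win @ alpha + b1)       # convolutional layer
            y = sigmoid(W @ h + b2)             # reconstruction of x
            total += np.mean((y - x) ** 2)      # mean squared error
            dz2 = 2.0 * (y - x) / n * y * (1.0 - y)
            dz1 = (W.T @ dz2) * h * (1.0 - h)
            g_W += np.outer(dz2, h)
            g_b2 += dz2
            g_alpha += win.T @ dz1              # filter gradient sums over positions
            g_b1 += dz1
        losses.append(total / n_records)
        # Gradient descent step on the averaged gradients.
        alpha -= lr * g_alpha / n_records
        b1 -= lr * g_b1 / n_records
        W -= lr * g_W / n_records
        b2 -= lr * g_b2 / n_records
    return (alpha, b1, W, b2), losses

# Four records of length four, so each record yields two hidden values.
X = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
(alpha, b1, W, b2), losses = train_autoencoder(X)
H = encode(X, alpha, b1)    # shape (4, 2): one feature vector per node
```

Because the filter is shared across positions, its gradient accumulates contributions from every window, which is the main difference from a fully-connected auto-encoder of the same size.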

2) THE DESCRIPTION OF A GENERAL CASE
In this case, the reconstructed adjacency matrix is split into n records, and the dimension of each record is 1 × n. The input layer and the output layer each have n neurons (i.e., n nodes in a social network). There are q neurons in the convolutional layer (i.e., the first hidden layer). The structure of the auto-encoder based on a convolutional neural network for the general case is shown in Figure 3. The loss function is the mean squared error, and the gradient descent method is adopted to minimize it. In the testing and running stages, the spatial features are extracted from the values of the neurons in the hidden layer (i.e., $H$ shown in Equation (22)).

C. COMMUNITY DETECTION METHOD
The n records from the reconstructed adjacency matrix can be clustered into k groups by the K-means algorithm to detect communities in a social network. Each record is fed into the trained neural network, and the spatial features of each record are extracted by the proposed spatial feature extraction method. Therefore, after spatial feature extraction, the dimension of each record is 1 × q (e.g., the vector of the j-th record shown in Equation (23)), and these records are the input of the K-means algorithm.
The proposed community detection method includes three steps as follows.
Step (1). From the n records, k records are randomly selected as the k cluster centers. The vector of the i-th cluster center is expressed as Equation (24).
Step (2). The Euclidean distance $\varphi(i, j)$ is applied to measure the distance between the j-th record and the i-th cluster center based on Equation (25). The n records are assigned to the k groups according to the distance between each record and the cluster centers, and the center of each cluster is recalculated based on the records in the cluster.
Step (3). If no cluster center has changed, the clustering process is finished. Otherwise, Step (2) is performed again with the updated cluster centers.
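The three steps above correspond to the standard K-means loop, which can be sketched as follows. The function name `k_means`, the iteration cap `max_iter`, the fixed random seed and the handling of empty clusters (the old center is kept) are assumptions added for this illustration.

```python
import numpy as np

def k_means(H, k, max_iter=100, seed=0):
    """Cluster the rows of H (one feature vector per node) into k groups."""
    H = np.asarray(H, dtype=float)
    rng = np.random.default_rng(seed)
    # Step (1): pick k distinct records as the initial cluster centers.
    centers = H[rng.choice(len(H), size=k, replace=False)]
    for _ in range(max_iter):
        # Step (2): assign each record to its nearest center (Euclidean distance) ...
        dist = np.linalg.norm(H[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # ... and recompute each center from the records assigned to it.
        new_centers = centers.copy()
        for i in range(k):
            members = H[labels == i]
            if len(members) > 0:        # keep the old center if a cluster is empty
                new_centers[i] = members.mean(axis=0)
        # Step (3): stop once no center has changed.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated groups of feature vectors.
H = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = k_means(H, k=2)
```

Because the initial centers are random, different seeds can give different label numbering (and, on harder data, different partitions), so several restarts are commonly used.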

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS
In experiments, four open datasets of practical social networks were adopted to evaluate the performance of the proposed deep community detection method.

A. EXPERIMENTAL ENVIRONMENTS
This subsection presents datasets, an evaluation factor and experimental designs.

1) DATASETS
Four open datasets of practical social networks, namely (1) network19, (2) karate, (3) dolphins and (4) football, were collected for the evaluation of the proposed method. Table 1 describes the datasets and gives the value of k used for the K-means algorithm.

2) EVALUATION FACTOR
The modularity (shown in Equation (26)) is a popular factor for evaluating the performance of community detection methods.
Each parameter in Equation (26) is defined and described as follows.
• The parameter $A$ denotes an adjacency matrix.
• The parameter $m$ denotes the number of edges.
• The parameter $k_i$ denotes the degree of the i-th node.
• If the i-th node and the j-th node are clustered into the same group, the value of $\delta(C_i, C_j)$ is 1; otherwise, it is 0.
• The value of $Q$ (i.e., modularity) lies between −1 and 1. The closer the value of $Q$ is to 1, the better the performance of the community detection method.
The NMI (shown in Equation (27)) is another popular factor for evaluating the performance of community detection methods.
Each parameter in Equation (27) is defined and described as follows.
• The parameter $N$ denotes the community matrix. The rows of the matrix $N$ correspond to the standard community results, and the columns correspond to the community detection results obtained by the algorithm.
• The parameter $N_{i\cdot}$ denotes the sum of the values in the i-th row of the community matrix.
• The parameter $N_{\cdot j}$ denotes the sum of the values in the j-th column of the community matrix.
• The parameter $C_A$ denotes the standard community results.
• The parameter $C_B$ denotes the community results obtained by the algorithm.
• When the algorithm obtains a community division that is more consistent with the standard division, the value of NMI is closer to 1; otherwise, the value of NMI is closer to 0.
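Both evaluation factors can be computed directly from the quantities listed above. The sketch below assumes that Equation (26) is the standard Newman modularity, $Q = \frac{1}{2m} \sum_{i,j} \left[ A_{i,j} - \frac{k_i k_j}{2m} \right] \delta(C_i, C_j)$, and that Equation (27) is the usual NMI built from the community matrix $N$, i.e., $-2 \sum_{i,j} N_{i,j} \log \frac{N_{i,j} N}{N_{i\cdot} N_{\cdot j}}$ divided by $\sum_i N_{i\cdot} \log \frac{N_{i\cdot}}{N} + \sum_j N_{\cdot j} \log \frac{N_{\cdot j}}{N}$. The function names and the convention of returning 1 when both partitions consist of a single community are assumptions.

```python
import numpy as np

def modularity(A, labels):
    """Newman modularity of a partition of an undirected graph."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                               # node degrees
    two_m = k.sum()                                 # 2m: each edge counted twice
    same = labels[:, None] == labels[None, :]       # delta(C_i, C_j)
    return float(((A - np.outer(k, k) / two_m) * same).sum() / two_m)

def nmi(true_labels, pred_labels):
    """Normalized mutual information between two partitions."""
    t = np.unique(np.asarray(true_labels), return_inverse=True)[1]
    p = np.unique(np.asarray(pred_labels), return_inverse=True)[1]
    N = np.zeros((t.max() + 1, p.max() + 1))
    np.add.at(N, (t, p), 1)                         # community (confusion) matrix
    n = N.sum()
    Ni, Nj = N.sum(axis=1), N.sum(axis=0)           # row sums and column sums
    nz = N > 0                                      # skip empty cells (0 * log 0 = 0)
    num = -2.0 * (N[nz] * np.log(N[nz] * n / np.outer(Ni, Nj)[nz])).sum()
    den = (Ni * np.log(Ni / n)).sum() + (Nj * np.log(Nj / n)).sum()
    # Convention (an assumption): two single-community partitions agree fully.
    return float(num / den) if den != 0 else 1.0

# Example: two disjoint edges, 0-1 and 2-3.
A = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
q_split = modularity(A, [0, 0, 1, 1])       # each edge is its own community
q_single = modularity(A, [0, 0, 0, 0])      # everything in one community
score = nmi([0, 0, 1, 1], [1, 1, 0, 0])     # same partition with renamed labels
```

Note that NMI needs a reference partition, whereas modularity depends only on the graph and the detected partition.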

3) EXPERIMENTAL DESIGN
For the evaluation of the proposed method, four cases were designed to compare the modularity of different method combinations as follows.

Case (1): The auto-encoder method was adopted to extract the features of the original adjacency matrix of a social network for community detection. The label of Case (1) is 'AE'.
Case (2): The auto-encoder method was adopted to extract the features of the reconstructed adjacency matrix of a social network for community detection. The label of Case (2) is 'RM+AE'.
Case (3): The auto-encoder method based on a convolutional neural network was adopted to extract the features of the original adjacency matrix of a social network for community detection. The label of Case (3) is 'AE+CNN'.
Case (4): The auto-encoder method based on a convolutional neural network was adopted to extract the features of the reconstructed adjacency matrix of a social network for community detection. The label of Case (4) is 'RM+AE+CNN'.
In the experiments, four cases and four datasets were considered to evaluate the performance of the proposed deep community detection method. The structure of the neural network for each dataset is shown in Table 2. Furthermore, node2vec [3], SDNE [9] and NetRA [15] are taken as baselines for comparison.
For the dataset 'network19', Figure 4(a) shows the original adjacency matrix, and Figure 4(b) shows the reconstructed adjacency matrix. The 7th node in the original adjacency matrix is elected as the opinion leader. The boundaries of some communities in the original adjacency matrix overlap. After the proposed matrix reconstruction method is performed, the nodes in the same community are closer together, and the boundaries of communities are easier to detect in the reconstructed adjacency matrix. Therefore, the reconstructed adjacency matrix is suitable for spatial feature extraction and community detection.
For the dataset 'karate', Figure 5(a) shows the original adjacency matrix, and Figure 5(b) shows the reconstructed adjacency matrix. The 34th node in the original adjacency matrix is elected as the opinion leader. The boundaries of communities in the original adjacency matrix are difficult to detect. Although the reconstructed adjacency matrix yields only a slight improvement for community detection in this case, it contains fewer isolated points.
For the dataset 'dolphins', Figure 6(a) shows the original adjacency matrix, and Figure 6(b) shows the reconstructed adjacency matrix. The 15th node in the original adjacency matrix is elected as the opinion leader. The edges in the original adjacency matrix are scattered, which makes it difficult to detect social communities. After the proposed matrix reconstruction method is performed, the boundaries of communities are clearer, and there are fewer isolated points. The cyberspace structure of communities is therefore clearer for spatial feature extraction.
For the dataset 'football', Figure 7(a) shows the original adjacency matrix, and Figure 7(b) shows the reconstructed adjacency matrix. The 2nd node in the original adjacency matrix is elected as the opinion leader. The cyberspace structure of communities in the original adjacency matrix is difficult to detect owing to the scattered distribution of edges. After the proposed matrix reconstruction method is performed, the 12 communities in the reconstructed adjacency matrix are easier to detect, so the proposed method helps to extract the spatial features of communities in this case.
In accordance with the experimental results in Figures 4, 5, 6 and 7, after reconstruction the nodes in the same community are closer together, and the boundaries of communities are clearer. Therefore, the reconstructed adjacency matrices are suitable for spatial feature extraction and community detection.

C. EXPERIMENTAL RESULTS OF SPATIAL FEATURE EXTRACTION METHOD AND COMMUNITY DETECTION METHOD
This subsection uses modularity to evaluate the proposed spatial feature extraction and community detection methods, and the experimental results of each case in Subsection IV.A.3 are given in Table 3.
For the evaluation of the matrix reconstruction method, the modularity in Case (2) (i.e., RM+AE) is higher than the modularity in Case (1) (i.e., AE) for each dataset, and the modularity in Case (4) (RM+AE+CNN) is higher than the modularity in Case (3) (AE+CNN) for each dataset. Therefore, the proposed matrix reconstruction method helps to reconstruct the cyberspace structure of communities for community detection.
For the evaluation of the spatial feature extraction method, the modularity in Case (3) (i.e., AE+CNN) is higher than the modularity in Case (1) (i.e., AE) for each dataset, and the modularity in Case (4) (RM+AE+CNN) is higher than the modularity in Case (2) (RM+AE) for each dataset. Therefore, the proposed spatial feature extraction method based on a convolutional neural network can extract the spatial features of social networks and thereby improve community detection.

D. COMPARISON RESULTS WITH BASELINES
The comparison results are shown in Tables 4 and 5. The modularity in Case (4) (RM+AE+CNN) is higher than the modularity of node2vec and NetRA for each dataset. For network19 and dolphins, the modularity in Case (4) (RM+AE+CNN) is higher than the modularity of SDNE+Kmeans. For karate, the modularity in Case (4) (RM+AE+CNN) and the modularity of SDNE+Kmeans are equal. For football, the modularity in Case (4) (RM+AE+CNN) is slightly lower than the modularity of SDNE+Kmeans in Table 4. Furthermore, the NMI in Case (4) (RM+AE+CNN) is the best on three datasets in Table 5.

V. CONCLUSIONS AND FUTURE WORK
An AE and CNN-based deep community detection method for social networks is proposed in this paper. A matrix reconstruction method based on a novel cyberspace structure reconstruction strategy is proposed to acquire spatial proximity matrices and generate reconstructed adjacency matrices. The reconstructed matrix adds spatial proximity to the traditional adjacency matrix, exhibits clear subspace characteristics, and makes it easy and quick for convolutional operations to extract the spatial localization of the graph. Meanwhile, the AE+CNN-based model is constructed to learn a spatial eigenvector that automatically extracts the spatial features of the graph. The spatial eigenvector provides the basis and support for analysis and applications on the network graph. The combined model of AE and CNN is applied to community detection, and the experimental results show that the proposed deep community detection method can effectively detect high-quality communities for each practical dataset, especially for graphs with an obscure subspace structure. The AE+CNN based community detection method can also serve as a foundation for dynamic network community detection.
Although the combined model of AE and CNN is a meaningful exploration of graph embedding models, the number of neurons in the input and output layers is fixed once the deep learning model is constructed. In dynamic community detection, the relationships among nodes may change over time, so time-series models are required to extract spatio-temporal features for arbitrary social networks.