Community detection by laser network dynamics

We introduce methods for detecting communities in networks using the dynamics of a laser network. These methods utilize phase synchronization of the coupled laser network. In these methods, lasers within the same community share the identical phases with small phase errors by making use of in-phase and out-of-phase couplings. Numerical simulation shows that our methods can identify the community structure of the networks with artificially generated communities.


Introduction
Community structure is one of the fundamental properties of complex networks in which nodes are densely connected only within the same clusters [1]. Many types of networks, including functional brain networks to social networks, have been used to study the community structure [2][3][4]. The detection of the community structure is useful for analyzing a network and understanding the functions of a network [5,6]. Although community detection is one of the most well-known topics in network science, and many algorithms have been proposed, there is no universal definition for community detection that applies to either all types of networks or to a variety of purposes.
One of the approaches to community detection is the use of a dynamical process on a network such as synchronization of coupled oscillators [7][8][9], dynamics of a spin system [10][11][12][13][14], and diffusion processes [15][16][17]. In the case of coupled oscillators, the oscillators especially in densely connected clusters can be quickly synchronized before a fully synchronized state, which can be utilized to identify communities [7]. In another case, the Potts spin model is often used for community detection by minimizing the Hamiltonian that corresponds to the cost function of the problem [10]. A method that utilizes the Boltzmann distribution at a finite temperature to find a consensus among many partitions has also been proposed [18].
In recent years, several attempts have been made to construct various physical systems in order to solve computationally difficult problems using their own dynamical processes. There are several examples of developing a physical system to simulate the Ising spin model in order to solve combinatorial optimization problems, such as those found in superconductor-based quantum annealing machines [19], CMOS-based classical annealing machines [20], and time-division-multiplexing optical parametric oscillator networks [21][22][23]. A laser network constructed using optical fibers where laser pulses interact with each other via optical delay lines has also been proposed [24,25]. This laser network is intended for emulating XY spin models as a Boltzmann sampler with a tunable effective temperature. Since the laser network is a coupled oscillator system whose dynamics is similar to the Kuramoto model, it can also be used for an community detection for arbitrary network topology.
In this paper, we propose community detection methods using the dynamics of a laser network. In some existing methods [7][8][9], the transient dynamics of a coupled oscillator network toward synchronization or desynchronization have been used for a community detection method. Generally, these methods do not give the timing to be used for identifying communities. Our methods, instead, use a steady state for community Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence.
Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. detection by introducing out-of-phase coupling that corresponds to the negative coupling constant. The out-ofphase coupling prevents the laser network from reaching a fully synchronized state and arrives at a partially synchronized state with community information as a final state. We tested our methods using numerical simulation of the dynamics of lasers on an artificial network called the Lancichinetti-Fortunato-Radicchi (LFR) benchmark graph [26].

Laser model
We consider the following complex Langevin equation, which describes the dynamics of a laser network [24,27,28]: where A j is the complex amplitude of the jth laser field, c g is the cavity photon decay rate, p is the normalized pump rate, n 0 is the saturation photon number, and jk kj x x = is the coupling constant between the jth and kth lasers. The first term of the right-hand side represents the net gain. The third term represents the Langevin noise source, where dW is a complex-valued Wiener process that possesses the properties of W W t d d d where W d j q and dW nj are independent real-valued Wiener processes. Equation (2) shows that the phase equation is equivalent to that of Kuramoto oscillators if all lasers have the same amplitudes.

Community detection
Community detection is the problem of finding densely connected groups of nodes; however, nodes between different groups are sparsely connected. To detect such communities using our laser network, we utilize the phase correlations of the pair of lasers, which is represented as where the bracket is an ensemble average over several runs: where T is the number of runs and x n ( ) is the sample of x for the nth run. We calculate correlations only for the pairs of lasers that are connected in the original graph G V E , , where V is the set of the nodes and E is the set of the edges of the graph. Then, correlations ij r are converted to binary matrix D ij { } by the threshold parameter T as We identify the communities as connected components of D ij { }. If we construct the coupling term in equation (1) directly from the adjacency matrix of G V E , ( ), and the coupling strength is sufficiently larger than the noise term, all lasers will be synchronized at the same phase. Since this situation is not suitable for our purpose, we propose two different methods for constructing a coupling matrix from a given network as follows.

Homogenous out-of-phase coupling
In the first approach, we consider homogenous out-of-phase couplings for community detection. For a giving graph G V E , ( ), we construct a laser network such that where ξ and γ are the positive constants that determine the coupling strengths and the ratio between the in-phase and out-of-phase coupling strength, respectively. As shown in figure 1, the nodes are coupled in-phase if they are connected in the G V E , ( )and out-of-phase, otherwise. Since the out-of-phase coupling separates the phases of lasers that are not connected in the G V E , ( )we expect that only densely connected groups of nodes to oscillate in phases with close phase values.
In this approach, the important parameter is γ because it tunes the global out-of-phase coupling effect, which in turn strongly depends on the edge density D m N N ), where m and N are a number of edges and a number of nodes in the G V E , ( ), respectively. Therefore, the value of γ needs to be determined, depending on the D of the G V E , ( ). To this end, we investigate the order parameter r for a random network with the same edge density as the G V E , ( )and the coupling protocol as described above. The order parameter re if of phases is defined as where r shows how much oscillators are synchronized globally, and ψ is the mean value for all phases. Figure 2(a) shows the numerical simulation result of the γ dependence of r with various values of edge density D in the Barabási-Albert (BA) model with number of nodes N=1000 [29]. The BA model is a random network model with power-law degree distribution. The order parameter r becomes 1 in the region where in-phase couplings are dominant, and r decreases as γ increases. At a fixed γ, order parameter r simply increases as edge density D increases, since the number of in-phase couplings increases. Operationally, we choose the optimum value of γ at which order parameter r becomes 0.5 in a random network with the same edge density as the G V E , ( ). Since we are interested in the parameter region where lasers can easily detect the community structure, we expect the phase correlation of lasers to become sensitive to the local change of edge density from a random connection at this critical parameter. We plotted this γ dependence on edge density D in figure 2(b). The relationship between γ and D was approximated by a linear fit using parameters a and b:  where we assumed the system size N dependence as a finite size correction. The best fit was achieved using a=1.0 and b 4.8 = -. Using this relation, we determine the value of γ depending on D of the G V E , ( ).

Modularity matrix
In the second approach, we construct a laser network with a coupling matrix on the basis of modularity [30] which is often used as a cost function for community detection [31][32][33][34]. The modularity Q with resolution parameter mod g [11] is represented as ), i s is the community to which node i belongs, and k i is the degree of node i. We use modularity matrix B ij for the coupling matrix of the laser network as B ij ij Here, the out-of-phase coupling strength is determined by the expected number of edges between the nodes in the configuration model of the G V E , ( ). The configuration model is a random connectivity model of the G V E , ( ) where edges are randomly rewired, and the degree of each node is preserved. The modularity matrix is based on the assumption that two connected nodes are likely to be in the same community if the probability that the nodes are accidentally connected by chance in the configuration model is low. The global strength of the out-of-phase coupling can be tuned by resolution parameter mod g , which determines the size of the community that is preferred by modularity Q. The issue of determining mod g remains controversial, and tuning mod g without any bias for the particular size of the community is difficult [35]. In this paper, we simply choose 1 mod g = , which is the same parameter value as that of the original definition of modularity [30].

Network model and evaluation
For our benchmark, we chose the LFR benchmark graph, which has a built-in community structure and is often used for benchmarking a community detection algorithm [26]. The LFR benchmark graph has power-law community size and degree distributions with exponents α and β, respectively. The average values of the distributions can be determined by the minimum and maximum values of degree k and community size s as k min , k max , s min , and s max , respectively. Another important parameter of the LFR benchmark graph is the mixing parameter μ, which tunes the fraction of edges of each node that is connected to the nodes in other communities. We chose the same parameter of the LFR model as [36] for the undirected graph.
We evaluated the performance of our methods on the basis of normalized mutual information (NMI) [37]; the NMI measures the similarity between two partitions of the same graph on the basis of Shannon's mutual information. NMI is often used for evaluating community detection algorithms by comparing the partition embedded in the benchmark graph with the partition obtained using algorithms [36]. However, NMI is known to be biased for a large number of divisions because of the finite size effect [38]. In particular, the partition in which the all nodes belong to the different groups represents a high NMI value compared to the case where all nodes belong to the same group, although both of those partitions are trivial cases for community detection. Relative NMI (rNMI) has been proposed to consider this finite size effect [38]. The rNMI is defined as where A is the partition embedded in the graph, B is the partition obtained by the algorithm, C is a random partition with the same group size distribution as partition B, and A C NMI , á ñ ( ) is the expected value of A C NMI , ( )over different realizations of random partition C. The ratio of relative normalized mutual information (rrNMI), which is normalized to 1 for the same partitions, is defined as [39] A B A B A A rrNMI , rNMI , rNMI , .
We use rrNMI as a measure of community detection accuracy for our methods and calculate the A C NMI , á ñ ( ) as an ensemble average of 100 random realizations of C.

Numerical simulation result
The validity of our methods was confirmed via numerical simulation. The physical parameters in the equation (1)  . When the mixing parameter is 0.5 m < , the distributions of correlations of nodes in the same community and the different community are separated from each other, and the correct communities can be found by setting a threshold value between the two distributions. In contrast, two distributions overlap when 0.5 m > , and we cannot distinguish whether the two nodes belong to the same community or not. We then plotted the rrNMI as a function of mixing parameter μ for different community sizes. S and B in figure 4 represent graphs with a small and big community size, where s 10 min = and s 50 max = for a small community size S, and s 20 min = and s 100 max = for a big community size B. We compared the rrNMI evaluation of our two methods with two well-known algorithms, the Louvain method [32] and the Infomap [40]. The Louvain method is based on a greedy maximization of the modularity and is performed by the iteration of a local maximization of the modularity and the aggregation of nodes belonging to the same community. The method is well known as the one of the fastest algorithm for community detection. The Infomap is based on the optimization of a map equation, which is the minimum description length of a random walk on a graph. Both algorithms performed well in the literature [36] in which various algorithms were compared by NMI using the LFR benchmark graph. In the study, the Infomap was the best algorithm in terms of the NMI, and the Potts spin model approach [13] performed similarly to the Infomap. Since the Louvain method is the fastest algorithm and the Infomap is the best in terms of the NMI, we chose these two algorithms to compare with our methods. The fraction of edges in the same community becomes small for large μ. As a result, rrNMI tends to decrease for all algorithms as the value of μ becomes large. Since the region where rrNMI equals to 1 means that the algorithm completely find embedded communities, the performance of the algorithm can be evaluated at the point where rrNMI begins decreasing from 1. In this sense, the performance of our two methods shows comparable performance to the previous methods. Although Infomap performs better than our method, our method is better than the Louvain method for the large-size and big community graphs.
Finally, we investigated the relaxation time of laser networks to the final state. Since only the phase information was used at the final state for detecting communities, it is crucial for the computational time of our method. We estimate the relaxation time using the link order parameter r link [41], which is described as This order parameter represents the degree of synchronization in phases that are connected in the G V E , ( ). In figure 5, we plotted the relaxation times in terms of the convergence time of r link for the LFR benchmark graph. In every case, the relaxation times increase as mixing parameter μ increases. This means that a strong community structure in the G V E , ( )makes a system converge quickly. The relaxation times in the modularity-based method are less than those in the homogenous method. A possible explanation of this tendency is as follows. In the modularity-based method, the out-of-phase coupling strength between the pairs of low-degree nodes, which are a majority on scale-free graphs, is weaker than that in the homogenous method. Therefore, the frustration effect in the modularity-based method is weaker than that in the homogenous method; this phenomenon makes the systems converge quickly.
The relaxation time shown in figure 5 is normalized by the lifetime of a photon in a cavity 1 . The photon decay rate of the cavity c g depends on the Q value of a cavity, and the estimated value of c g of the experimental setup in the literature [24] is 50 MHz. Therefore, the order of the relaxation time of the system is estimated to be 10 100 s m - . It should be noted that the relaxation times do not depend on the system size N under the same value of μ with the same degree and community size distribution, except for the graphs with weak community structure ( 0.7 m > ) in the homogenous method. The solid black circles represent the result for the graph with S and N=1000, the dashed red squares for B and N=1000, the solid green triangles for S and N=5000, the dashed rhombus blue for B and N=5000, the solid cyan pentagon for S and N=10 000, the dashed magenta hexagon for B and N=10 000, the solid yellow cross for S and N=50 000 and the dashed brown star for B and N=50 000.

Conclusion and outlook
We proposed two methods of community detection by using a laser network with an out-of-phase coupling scheme and numerically confirmed that the performance of our methods is comparable to the existing algorithms in terms of rrNMI evaluation. We also proposed a method to tune the parameter γ of the homogenous out-of-phase coupling depending on the edge density of an the original graph. In the case of modularity-based out-of-phase coupling, the relaxation time of the laser network is less than that of the homogenous out-of-phase coupling although both schemes perform similarly for the graphs with strong community structures. With regard to the perspective of computation within a real physical system, a scheme for constructing an arbitrary network topology is required to apply these methods. So far, 1D-ring network topology with all-optical connections has been implemented for a laser network [24,25]. A promising way to implement an arbitrary network topology with FPGA in an optical parametric oscillator system has recently been proposed [22,23]; this method is also applicable to a laser network. The evaluation of relaxation time suggests that the computation time for community detection using such an experimental system is estimated to be 1 10 ms -for 100 runs of an ensemble average calculation. In the Louvain method, known as the fastest algorithm, the computational time for a graph with number of nodes N=70 000 and number of edges M=351 000 is reported to be 1 s, which linearly scales up by M for sparse graphs [32]. Our methods also require post-processes for the calculation of phase correlations of node pairs, which are connected in the original graph, and the identification of connected components of a binarized correlation matrix, whose computation time also scales up by M. With regard to the time for running dynamics, however, relaxation time does not depend on system size for the LFR benchmark graphs with the same mixing parameter and could be further reduced using methods such as the physical implementation of a replica exchange Monte Carlo algorithm [42], which could also be implemented in the laser network by the control of effective temperature [25].