Asymmetric leader–laggard cluster synchronization for collective decision-making with laser network



1 Introduction
Photonic accelerators [1] have been gaining attention in recent years, and a variety of implementations and applications have now been explored [2][3][4][5][6][7][8][9]. These advancements can be attributed to a growing awareness of the saturating speed of performance improvements in conventional computational systems [10], despite the soaring demands for information processing in an extensive range of applications, especially in machine learning. Reinforcement learning [11] is a subfield of machine learning that involves optimizing computer outputs or actions to maximize the reward function. Its applications are now essential to our daily lives, ranging from self-driving vehicles [12] and targeted advertising [13] to wireless networking [14], and there is now a strong demand for computational acceleration. Specifically, what we focus on here is decision-making. Environments in which agents make decisions can be uncertain or ever-changing, depending on the problem.
The multi-armed bandit (MAB) problem [15] is a fundamental problem setting of decision-making, in which a player repeatedly selects from multiple slot machines with unknown hit probabilities, aiming to maximize the total reward. An efficient strategy for the problem requires balancing two conflicting operations: exploration, in which a player tries to identify the highest-paying slot machine by trying various options, and exploitation, in which a player selects only the estimated best slot machine. The competitive multi-armed bandit (CMAB) problem [16] extends the MAB problem to a multi-player setting: when two or more players simultaneously select the same slot machine, the reward is equally divided among them, resulting in a loss of opportunities for benefiting from other slot machines. Therefore, in the CMAB problem, avoiding selection collisions among players is important to maximize total team rewards, in addition to exploration and exploitation. Recent studies have explored employing the distinct properties of photonic phenomena to address the MAB problem [17][18][19][20][21][22][23]. Collective decision-making with a laser network, proposed in [24], stands out as a promising method to tackle the CMAB problem in terms of its experimental viability and scalability. It leverages two chaotic behaviors of multiple optically connected lasers: the spontaneous exchange of the leader-laggard relationship and zero-lag synchronization.
A leader-laggard relationship signifies the similarities observed in the temporal waveforms of optical intensity between two mutually coupled lasers [25], where one of the lasers (referred to as the 'leader') oscillates so as to precede the other (referred to as the 'laggard') with an offset given by the coupling delay time between the two. A previous study [26] revealed that the relation switches spontaneously during low-frequency fluctuation (LFF) dynamics [27]. The LFF dynamics are characterized by quasi-periodic fluctuations on a MHz time scale superimposed on chaotic oscillations on a GHz time scale in the optical intensity, observed under conditions of intense optical coupling and low pump current. In decision-making systems based on the exchange of leader-laggard roles in the LFF dynamics [21,24,28], physical semiconductor lasers have a one-to-one relationship with virtual slot machines in the MAB problem, and a strategy is adopted to select the slot machine corresponding to the leader laser at a given moment. Consequently, a player alternately selects multiple slot machines, achieving exploration in the context of the MAB problem.
In zero-lag synchronization [29][30][31], conversely, a set of lasers in a network oscillates in synchrony without any delay. This non-trivial synchronous phenomenon has been demonstrated both theoretically and experimentally. One approach to predicting the formation of zero-lag synchronization in a laser network involves calculating powers of its adjacency matrix [30]. An unweighted adjacency matrix represents how information is transmitted from one node to another in the network, allowing for a qualitative assessment of whether each laser is zero-lag synchronized to the others. Exploiting this matrix-based inference, the previous work [24] introduced a four-laser network that exhibits zero-lag synchronization, forming two clusters of two lasers each, to address the CMAB problem with two players and two slot machines. The zero-lag synchronization enables each player to have information about the others' slot machine selections without directly measuring the optical outputs of the opponents' lasers at a remote location and to avoid selection collisions, thus attaining cooperative decision-making in the CMAB problem.
However, the previous study dealt only with the CMAB problem in which the number of players and the number of slot machines are the same, and each player selects slot machines in equal proportions. More general configurations should be implemented for practical applications where there can exist more options than players or where players want to take specific options more frequently than the alternatives. In addition, the previous work did not fully explore the networks, and there might be other possible networks [32] that are effective for a collective decision-making system.
In this study, we first examine other candidate laser networks that exhibit behaviors equivalent to those of the previously proposed network and are capable of solving the CMAB problem with a configuration of two players and two slot machines. The synchronous states in the possible networks are evaluated by performing a stability analysis. The need for this arises because the argument grounded in an unweighted adjacency matrix relied solely on conceptual observations and empirical rules without quantitative synchronization assessment. Subsequently, we demonstrate asymmetric slot machine selections by players both in numerical simulations and experiments, extending the collective decision-making system with a laser network to a broader and more practical range of problems. Our study reinforces the potential and feasibility of decision-making based on laser chaos and photonic accelerators.
2 Network configuration: requirements and stability analysis for verification

2.1 Candidate networks for decision-making
First, we reintroduce a decision-making system [24] for addressing the competitive multi-armed bandit (CMAB) problem in a 2-player, 2-slot-machine situation. Figure 1 illustrates the concept of the conflict-avoiding decision-making with a laser network. Lasers 1A and 1B are allocated to Player 1, and Lasers 2A and 2B are allocated to Player 2. Lasers 1A (2A) and 1B (2B) respectively correspond to Slot A and B selected by Player 1 (2). In this laser network, there exists a leader-laggard relationship between Lasers 1A and 1B and between Lasers 2A and 2B, where the oscillation of one of the lasers leads that of the other, and the leader spontaneously switches in the LFF regime [26]. Meanwhile, Lasers 1A and 2B and Lasers 2A and 1B are, respectively, in zero-lag synchronization, where two lasers' oscillations synchronize without delay. Each player selects a slot machine represented by the leader laser among the lasers assigned to them. When Laser 1A is the leader, Player 1 selects Slot A. Simultaneously, Laser 2B, which is zero-lag synchronized with Laser 1A, becomes the leader, and Player 2 selects Slot B. The same holds when Laser 1B (2A) is the leader. Therefore, players autonomously avoid conflicts between them without explicitly knowing the other's slot machine selection or the behaviors of the lasers, and they achieve cooperative decision-making. We characterize the synchronization behavior of the laser network for this problem as follows. i) The four lasers are separated into two clusters of two lasers each. ii) Lasers 1A and 2B (2A and 1B) are in zero-lag synchronization. iii) Lasers 1A and 1B (2A and 2B) are not in zero-lag synchronization. Therefore, we can utilize networks that exhibit such synchronization for the collective decision-making system. This type of synchronization is called cluster synchronization in the literature [33,34], but we focus on more specific cases where the number of nodes in a cluster is the same. One necessary condition for such cluster synchronization is that the total optical injection into each laser in one cluster is uniform.

Figure 1: Schematic illustration of decision-making system to solve the competitive multi-armed bandit (CMAB) problem with two players and two slot machines proposed in reference [24].
We define cluster 1 to consist of Lasers 1A and 2B and cluster 2 to consist of Lasers 2A and 1B. Here, we also need to discuss the following constraints on the laser networks. First, the coupling strength from a laser in cluster 1 to one in cluster 2 is uniform, and we denote this value as $\kappa_1$. Similarly, we represent the coupling strength from a laser in cluster 2 to one in cluster 1 as $\kappa_2$. Second, the coupling delay time of light $\tau$ is identical for all couplings. $\tau$ becomes a typical time scale in the dynamics of the leader-laggard relationship [26]. The effective frequency of decision-making is governed by $\tau$ [21,24], and our concern is establishing a uniform strategy for all players. Third, there is no connection between lasers within the same cluster. These assumptions are reasonable to avoid unnecessary complexity in our discussion.
Taking into account the factors described so far, possible networks are a complete bipartite graph shown in Fig. 2 (a), and ones obtained by cutting some of this network's paths while maintaining the cluster synchronization. There are in total six candidate networks obtained by removing some links of the complete bipartite network, when we identify networks that coincide under color exchange, rotation, or reflection symmetry, as illustrated in Fig. 2 (b). Note that the network (IV) is identical to the one already proposed in the literature. Our model of semiconductor lasers in the networks (I)-(VII) is described by the Lang-Kobayashi equations [35], referred to in what follows as Eqs. (1) and (2). In all seven networks shown in Fig. 2, identical synchronous solutions corresponding to the cluster synchronization exist, in which Lasers 1A and 2B behave identically and Lasers 2A and 1B behave identically. Global synchronization, in which all four lasers behave identically and which is a specific case of the cluster synchronization, can also exist in the seven networks with some additional assumptions, e.g., $\omega_1 = \omega_2$. If all four lasers are completely synchronized, players cannot decide which slot machine to select because neither laser becomes a leader. This can be a problem for decision-making with the laser network. Therefore, the stability of the solutions for both cluster synchronization and global synchronization should be discussed.
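
For orientation, a commonly used form of the Lang-Kobayashi equations for delay-coupled lasers is sketched below. It is written to agree with the phase terms $\theta_1(t)$ and $\theta_2(t)$ used in Section 2.2, but it is an assumption and not a reproduction of Eqs. (1) and (2): the gain-saturation form and the symbols $\alpha$, $G_N$, $N_0$, $\varepsilon$, $\tau_p$, $\tau_s$, $J$, and $\kappa_{l,k}$ (coupling strength from Laser $l$ to Laser $k$, zero where no link exists) come from the standard model, not from the source.

$$
\frac{dE_k(t)}{dt} = \frac{1 + i\alpha}{2}\left[\frac{G_N\,(N_k(t) - N_0)}{1 + \varepsilon |E_k(t)|^2} - \frac{1}{\tau_p}\right] E_k(t) + \sum_{l} \kappa_{l,k}\, E_l(t - \tau)\, \exp\!\big(i\,[(\omega_l - \omega_k)\,t - \omega_l \tau]\big),
$$

$$
\frac{dN_k(t)}{dt} = J - \frac{N_k(t)}{\tau_s} - \frac{G_N\,(N_k(t) - N_0)}{1 + \varepsilon |E_k(t)|^2}\, |E_k(t)|^2 .
$$

In this form, the condition that the total optical injection into each laser of a cluster is uniform corresponds to the sum over $l$ being the same for both lasers of that cluster.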

2.2 Stability analysis of synchronous solutions
Although we have confirmed in the previous section that cluster synchronous solutions in the laser networks exist, their stability is not guaranteed. Also, there should be no global synchronization so that the decision-making strategy remains valid. First, we evaluate the stability of the desired cluster synchronization in the laser networks (I)-(VII) by numerically calculating conditional Lyapunov exponents of the solutions of the cluster synchronization. For networks that exhibit stable cluster synchronization, we further compute conditional Lyapunov exponents of global synchronization to examine the instability of the undesired global synchronization. Conditional Lyapunov exponents characterize the synchronous behavior between multiple dynamical systems and are applicable to chaotic systems [36][37][38]. The exponents quantify how small displacements between trajectories along a specific direction expand or decay on average over time. A negative maximum conditional Lyapunov exponent indicates asymptotic stability of the synchronous solution, resulting in complete synchronization between the systems. On the other hand, a positive value suggests that the solution is unstable, so that identical synchronization is not observed when the systems start from different initial conditions.
In this section, we assume that $\omega_{1A} = \omega_{2B} = \omega_1$ and $\omega_{2A} = \omega_{1B} = \omega_2$ for the existence of the cluster synchronous solutions. Therefore, the optical phase difference terms are limited to two types: $\theta_1(t) \equiv (\omega_2 - \omega_1)t - \omega_2\tau$ and $\theta_2(t) \equiv (\omega_1 - \omega_2)t - \omega_1\tau$. Additionally, we configure $\kappa_1$ and $\kappa_2$ for each network as follows: for (I), $\kappa_1 = \kappa_2 = \kappa/2$; for (II) and (III), $\kappa_1 = \kappa/2$ and $\kappa_2 = \kappa$; and for (IV), (V), (VI), and (VII), $\kappa_1 = \kappa_2 = \kappa$. We focus on equal total coupling strength of the injected light into one laser for all networks, so that we can consider the synchronization under dynamics equivalent among lasers. First, we discuss conditional Lyapunov exponents for the cluster synchronization. Here we introduce variables spanning the synchronized manifolds, $E_{S1}$, $N_{S1}$, $E_{S2}$, and $N_{S2}$, and variables spanning the anti-synchronized manifolds, $E_{AS1}$, $N_{AS1}$, $E_{AS2}$, and $N_{AS2}$, focusing on the stability of the synchronization between Lasers 1A and 2B, and that between Lasers 2A and 1B. If the cluster synchronization is asymptotically stable, for instance, $E_{1A}$ and $E_{2B}$ converge to $E_{S1}$, $E_{2A}$ and $E_{1B}$ to $E_{S2}$, and $E_{AS1}$ and $E_{AS2}$ to 0. On the contrary, if the synchronization is unstable, $E_{AS1}$ and $E_{AS2}$ expand exponentially over time. The same holds for the carrier densities.
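
One natural choice of such variables, consistent with the convergence statements above, is the half-sum and half-difference of the fields within each cluster. The normalization by 2 and the sign of the differences are assumptions made here for illustration; the source definitions may differ in these conventions.

$$
E_{S1} = \frac{E_{1A} + E_{2B}}{2}, \quad E_{AS1} = \frac{E_{1A} - E_{2B}}{2}, \quad E_{S2} = \frac{E_{2A} + E_{1B}}{2}, \quad E_{AS2} = \frac{E_{2A} - E_{1B}}{2},
$$

with $N_{S1}$, $N_{AS1}$, $N_{S2}$, and $N_{AS2}$ defined in the same way from the carrier densities. Under this choice, $E_{1A} = E_{S1} + E_{AS1}$ and $E_{2B} = E_{S1} - E_{AS1}$, so $E_{AS1} \to 0$ implies that both fields approach $E_{S1}$.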
We can obtain differential equations of the variables for synchronized manifolds and those for anti-synchronized manifolds with Eqs. (1) and (2). Regarding the equations of the variables for synchronized manifolds, we assume $E_{AS1}(t) = E_{AS2}(t) = 0$ and $N_{AS1}(t) = N_{AS2}(t) = 0$ to compute complete synchronous trajectories. Derived equations are shown below (a detailed derivation is provided in the supplementary material).
These equations are the same among the networks. As for the equations for anti-synchronized manifolds, on the other hand, we treat the variables $E_{AS1}$, $N_{AS1}$, $E_{AS2}$, and $N_{AS2}$ as tiny values and linearize the equations in terms of these variables to evaluate Lyapunov exponents. The linearized equations are described as follows.
We set the parameters of the Lang-Kobayashi equations to typical values used in references [21,24,28], as shown in Table 1. The time step $h$ used to calculate conditional Lyapunov exponents is set to 5 ps. We computed the time evolution of the delay-history vector, using Eqs. (4)-(9) and employing the fourth-order Runge-Kutta method, over the calculation period of $T = 1000$ ns after waiting for a sufficiently long transient of 10 000 ns. The norm of the delay-history vector grew or shrank over time approximately as $|\Delta(t)| = e^{Lt}|\Delta(0)|$, where we define $L$ as the maximal conditional Lyapunov exponent of the cluster synchronization. Technically, we renormalized the delay-history vector to its initial norm every 10 steps, i.e., 50 ps, to prevent numerical overflow or underflow, and evaluated the time-averaged value of $L$. Figure 3 (a) illustrates the maximum Lyapunov exponents of the cluster synchronization in the networks, computed with $\kappa$ varied from 0 ns$^{-1}$ to 30 ns$^{-1}$ in increments of 1 ns$^{-1}$. The exponents for the networks (I), (II), (III), (IV), and (V) are negative in the region where $\kappa \geq 8$ ns$^{-1}$, represented by the red and blue curves, which suggests asymptotic stability of the cluster synchronization. For the networks (VI) and (VII), on the contrary, the exponents are positive for any value of $\kappa$, indicating that the cluster synchronization is unstable in the two networks. Therefore, the networks (I), (II), (III), (IV), and (V) remain candidates for the proposed cooperative decision-making system, while (VI) and (VII) do not.
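
The renormalization procedure described above can be illustrated with a minimal sketch. It is not the computation used in this work: the delay-history vector is replaced by an ordinary two-component vector, and the linearized laser equations are replaced by a hypothetical toy system `toy_linear_system` whose largest exponent is known to be -1. The function names, step size, and run length are illustrative assumptions; only the structure (fourth-order Runge-Kutta stepping, renormalization every 10 steps, time averaging of the logarithmic growth) follows the text.

```python
import math


def rk4_step(f, x, h):
    """Advance x by one step of size h with the fourth-order Runge-Kutta method."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def max_exponent(f, delta0, h, n_steps, renorm_every=10):
    """Estimate L in |delta(t)| ~ exp(L t) |delta(0)| with periodic renormalization."""
    delta = list(delta0)
    ref = norm(delta)          # norm to which the vector is reset
    log_growth = 0.0
    n_renorm = 0
    for step in range(1, n_steps + 1):
        delta = rk4_step(f, delta, h)
        if step % renorm_every == 0:
            current = norm(delta)
            log_growth += math.log(current / ref)
            delta = [v * ref / current for v in delta]  # avoid overflow/underflow
            n_renorm += 1
    elapsed = n_renorm * renorm_every * h
    return log_growth / elapsed  # time-averaged exponent


def toy_linear_system(x):
    """Hypothetical stand-in for the linearized equations: exponents -1 and -3."""
    return [-1.0 * x[0], -3.0 * x[1]]


estimate = max_exponent(toy_linear_system, [1.0, 1.0], h=0.005, n_steps=20000)
print(estimate)
```

For the toy system the estimate approaches -1, the slower of the two decay rates, which mirrors how a negative maximal exponent signals that the anti-synchronized variables decay.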
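Next, we discuss conditional Lyapunov exponents for the global synchronization. We introduce variables spanning synchronized manifolds,

$$E_{S'} = (E_{1A} + E_{1B} + E_{2A} + E_{2B})/4, \quad N_{S'} = (N_{1A} + N_{1B} + N_{2A} + N_{2B})/4,$$

and others spanning anti-synchronized manifolds,

$$E_{AS1'} = (E_{1A} + E_{1B} - E_{2A} - E_{2B})/4, \quad N_{AS1'} = (N_{1A} + N_{1B} - N_{2A} - N_{2B})/4,$$
$$E_{AS2'} = (E_{1A} - E_{1B} + E_{2A} - E_{2B})/4, \quad N_{AS2'} = (N_{1A} - N_{1B} + N_{2A} - N_{2B})/4,$$
$$E_{AS3'} = (E_{1A} - E_{1B} - E_{2A} + E_{2B})/4, \quad N_{AS3'} = (N_{1A} - N_{1B} - N_{2A} + N_{2B})/4.$$

Similarly to the case of the cluster synchronization, we obtain differential equations of the newly introduced variables using Eqs. (1) and (2), and then derive equations for complete synchronous trajectories and linearized equations to calculate Lyapunov exponents for the networks (I), (II), (III), (IV), and (V), under the corresponding assumptions. The equations are provided in the supplementary material. Notably, the delay terms in the linearized equations depend on each network. Parameters and other computational conditions are the same as the ones used for the exponents of the cluster synchronization. Figure 3 (b) shows the maximum Lyapunov exponents of the global synchronization in the networks. The exponents for all five networks are positive for any $\kappa$, which demonstrates that the undesired global synchronization is unstable for the networks. Therefore, the networks (I), (II), (III), (IV), and (V) are suitable for the decision-making system with sufficiently strong coupling strength.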
3 Asymmetric preferences in cooperative decision-making

3.1 Numerical simulation
Considering specific problem settings, the symmetric strategy of slot machine selection is not enough, and the need to introduce asymmetric preferences of players arises. For example, when two players select among three slot machines, one of which is much lower-paying than the others, players are encouraged to select only the two good ones while avoiding conflict between their choices. Another instance arises when we want one of the players to select a particular slot machine more frequently so that they get more (less) reward than the others. Therefore, the capability to manipulate slot machine selection ratios, while maintaining conflict avoidance, is essential for the CMAB problem solver implementation. Our interest lies in investigating how to control the proportions of slot machine selection through a fundamental modification of the lasers' dynamics. In the proposed method, manipulating the balance of slot machine selection ratios corresponds to changing the probabilities of individual lasers leading the others. In this context, we introduce two functions to quantify the leader-laggard relations between lasers, as previously discussed in the literature [21,24,26]. One of these functions is the cross-correlation function $\hat{C}_{k,l}(s)$, the normalized correlation between the intensities of Lasers $k$ and $l$ at time shift $s$, in which $\bar{I}_k$ and $\sigma_k$ represent the average and the standard deviation of the laser intensity $I_k$ over the period $T = 10\,000$ ns. The value $\hat{C}_{k,l}$ evaluates a global trend of synchronization between Lasers $k$ and $l$. The other is the short-term cross-correlation (STCC) function, evaluated over a short window and denoted $C_{1A}(t)$, $C_{1B}(t)$, $C_{2A}(t)$, and $C_{2B}(t)$, in which $\bar{I}'_k$ and $\sigma'_k$ denote the average and the standard deviation of the laser intensity $I_k$ over the period $\tau = 5$ ns. $\bar{I}'_{k,\tau}$ and $\sigma'_{k,\tau}$ have similar meanings but are calculated over an interval shifted by time $\tau$ to the left. $C_{1A}(t)$ indicates the cross-correlation value at time $t$ under the assumption that Laser 1A is a laggard and Laser 1B is a leader. Similarly, $C_{1B}$ supposes that Laser 1B is a laggard and Laser 1A is a leader. If $C_{1A} < C_{1B}$, then Laser 1A is regarded as a leader, and Player 1 selects Slot A at that time. Conversely, if $C_{1A} > C_{1B}$, Laser 1B is regarded as a leader, and Player 1 selects Slot B. In this way, the values $C_{1A}$ and $C_{1B}$ identify the local leader-laggard relationship between Lasers 1A and 1B, while $C_{2A}$ and $C_{2B}$ perform the same function for Lasers 2A and 2B. With the STCC functions, we can quantify the leader probability for Laser 1A as $L_{1A} = T_{1A} / T_{\mathrm{valid}}$, where $T_{1A}$ represents the duration during which $C_{1A} < C_{1B}$ holds, and $T_{\mathrm{valid}}$ denotes the total period for calculating STCC. $L_{1B} = T_{1B} / T_{\mathrm{valid}}$, $L_{2A} = T_{2A} / T_{\mathrm{valid}}$, and $L_{2B} = T_{2B} / T_{\mathrm{valid}}$ are introduced in the same way.
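
The STCC comparison and the leader probability can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code of this work: the STCC is taken to be a Pearson correlation between the assumed laggard over a window of length $\tau$ and the assumed leader over the same window shifted earlier by $\tau$; the helper names (`pearson`, `stcc`, `leader_probability`), the sampling (50 samples per $\tau$), and the synthetic white-noise intensities are all hypothetical.

```python
import math
import random


def pearson(a, b):
    """Normalized correlation of two equally long sample windows."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def stcc(laggard, leader, t, window, delay):
    """Correlate the assumed laggard on [t-window, t) with the assumed
    leader on the same window shifted earlier by `delay` samples."""
    return pearson(laggard[t - window:t], leader[t - window - delay:t - delay])


def leader_probability(i_a, i_b, window, delay, stride):
    """Fraction of sampled times at which C_A < C_B, i.e. Laser A is the leader."""
    times = range(window + delay, len(i_a) + 1, stride)
    a_leads = 0
    for t in times:
        c_a = stcc(i_a, i_b, t, window, delay)  # hypothesis: A is the laggard
        c_b = stcc(i_b, i_a, t, window, delay)  # hypothesis: B is the laggard
        a_leads += c_a < c_b
    return a_leads / len(times)


# Synthetic check: Laser B repeats Laser A with a lag of `delay` samples,
# so Laser A should be identified as the leader at every sampled time.
rng = random.Random(0)
delay = window = 50
n = 2000
base = [rng.random() for _ in range(n + delay)]
i_a = base[delay:]   # leads
i_b = base[:n]       # i_b[k] == i_a[k - delay] for k >= delay
prob_a = leader_probability(i_a, i_b, window, delay, stride=10)
print(prob_a)
```

In this synthetic case the hypothesis "B is the laggard" yields a correlation of essentially 1 at every sampled time, while the opposite hypothesis compares unrelated noise segments, so the smaller value always belongs to $C_{A}$ and Laser A is always selected as the leader.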
Previous studies revealed that the leader probabilities of lasers, as determined by STCC, can be controlled by adjusting the balance of coupling strength in situations involving two mutually coupled lasers [21], as well as configurations with three or more lasers in unidirectional ring setups [28]. Building on this understanding, we anticipate that leader probabilities in the proposed laser network can also be manipulated by altering the ratios of optical injection. In this numerical simulation, we choose the network (IV) based on its ability to achieve zero-lag synchronization in the subsequent experiment. As mentioned in Section 2.1, we assume that the coupling strength from Laser 1A to Laser 2A is identical to that from Laser 1A to Laser 1B, denoted as $\kappa_1$. Similarly, the coupling strength from Laser 2A to Laser 1A should be equal to that from Laser 2A to Laser 2B, denoted as $\kappa_2$. In the numerical simulation, $\kappa_1$ and $\kappa_2$ are varied according to the detuning of coupling strength $\Delta\kappa \equiv \kappa_1 - \kappa_2$. With the configurations described so far, we generate the temporal waveforms of laser intensity $I_k(t)$ ($k$ = 1A, 1B, 2A, 2B). Subsequently, we calculate the low-pass-filtered intensity (the cutoff frequency is 60 MHz), cross-correlation values between Lasers 1A and 2B and those between Lasers 2A and 1B, and short-term cross-correlation values. We systematically vary $\Delta\kappa$ from $-20$ ns$^{-1}$ to 20 ns$^{-1}$, in increments of 1 ns$^{-1}$. Apart from coupling strength, the parameters applied in these simulations for the Lang-Kobayashi equations are consistent with those used in Section 2.2, as detailed in Table 1. As representative instances, the results for $\Delta\kappa = 0$ ns$^{-1}$ ($\kappa_1 = \kappa_2 = 30$ ns$^{-1}$), $\Delta\kappa = 5$ ns$^{-1}$ ($\kappa_1 = 30$ ns$^{-1}$, $\kappa_2 = 25$ ns$^{-1}$), and $\Delta\kappa = 10$ ns$^{-1}$ ($\kappa_1 = 30$ ns$^{-1}$, $\kappa_2 = 20$ ns$^{-1}$) are illustrated in Fig. 4.
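
One schedule for $\kappa_1$ and $\kappa_2$ that reproduces the three representative cases is sketched below. The rule is inferred, not quoted: the larger coupling strength is held at 30 ns$^{-1}$ and the other is reduced by $|\Delta\kappa|$. The branch for negative $\Delta\kappa$ in particular is an assumption obtained by mirroring the positive branch, and `coupling_pair` and `KAPPA_MAX` are hypothetical names.

```python
KAPPA_MAX = 30.0  # ns^-1, taken from the representative cases in the text


def coupling_pair(delta_kappa, kappa_max=KAPPA_MAX):
    """Return (kappa1, kappa2) with kappa1 - kappa2 == delta_kappa.

    Assumed rule: the stronger coupling stays at kappa_max and the weaker
    one is lowered by |delta_kappa|.
    """
    kappa1 = kappa_max + min(delta_kappa, 0.0)
    kappa2 = kappa_max - max(delta_kappa, 0.0)
    return kappa1, kappa2


# Sweep used in the simulation: -20 to 20 ns^-1 in steps of 1 ns^-1.
sweep = [coupling_pair(float(d)) for d in range(-20, 21)]
weakest = min(min(pair) for pair in sweep)
print(coupling_pair(5.0), weakest)
```

Under this assumed rule the weakest coupling over the sweep is 10 ns$^{-1}$, which stays above the 8 ns$^{-1}$ level at which the cluster synchronization was found to be stable in Section 2.2.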
Figure 4 (a), (b), and (c) exhibit the temporal laser intensity waveforms for $\Delta\kappa = 0$ ns$^{-1}$, 5 ns$^{-1}$, and 10 ns$^{-1}$. Figure 4 (d), (e), and (f) present the outcomes after applying a low-pass filter to (a), (b), and (c), respectively. In Fig. 4 (d), we observe sudden dropouts followed by gradual intensity recoveries, typical phenomena in LFF dynamics. In Fig. 4 (e), the dropouts are less conspicuous than in Fig. 4 (d), and their intervals become irregular. The waveform becomes even more chaotic in Fig. 4 (f). Meanwhile, zero-lag synchronization between Lasers 1A and 2B, as well as that between Lasers 2A and 1B, persists, judging from the synchronous waveforms in Figs. 4 (a)-(f) and from the cross-correlation functions $\hat{C}_{1A,2B}$ and $\hat{C}_{2A,1B}$, which have a peak at 0 ns with a value of exactly 1.0 for any $\Delta\kappa$ configuration.
Figure 5 (a), (b), and (c) illustrate the STCC waveforms for $\Delta\kappa = 0$ ns$^{-1}$, 5 ns$^{-1}$, and 10 ns$^{-1}$, respectively. In Fig. 5 (a), frequent switching between $C_{1A}$ and $C_{1B}$, and that between $C_{2B}$ and $C_{2A}$, are observed, with their timing being precisely synchronized. In Fig. 5 (b), the periods during which $C_{1B} > C_{1A}$ ($C_{2A} > C_{2B}$) appear to be slightly longer than those during which $C_{1A} > C_{1B}$ ($C_{2B} > C_{2A}$). In Fig. 5 (c), the spontaneous exchanges between $C_{1A}$ and $C_{1B}$, and those between $C_{2B}$ and $C_{2A}$, are no longer prominent. The leader probabilities $L_{1A}$, $L_{1B}$, $L_{2A}$, and $L_{2B}$, calculated with the STCC waveforms, are shown in the lower left of Figs. 5 (a), (b), and (c). To compute the probabilities, we compare $C_{1A}$ and $C_{1B}$, and $C_{2A}$ and $C_{2B}$, every 1 ns, and $T_{\mathrm{valid}}$ is 10 000 ns for (a), (b), and (c).
We repeat the computation 40 times while changing the initial states of the lasers randomly and take an average for each $\Delta\kappa$ value. The leader probabilities of each laser are shown in Fig. 6. With a positive value of $\Delta\kappa$, $L_{1A}$ is greater than $L_{1B}$, and similarly, $L_{2B}$ is greater than $L_{2A}$. As $\Delta\kappa$ reaches 20 ns$^{-1}$, $L_{1A}$ and $L_{2B}$ nearly converge toward 1. Conversely, for a negative value of $\Delta\kappa$, $L_{1B}$ ($L_{2A}$) is higher than $L_{1A}$ ($L_{2B}$), and as $\Delta\kappa$ approaches $-20$ ns$^{-1}$, $L_{1A}$ and $L_{2B}$ almost converge toward 0.
Here, we define the collision rate (CR) as the fraction of time during which the two players select the same slot machine simultaneously. With these configurations, $\mathrm{CR} = (T_{\mathrm{bothA}} + T_{\mathrm{bothB}}) / T_{\mathrm{valid}}$, where $T_{\mathrm{bothA}}$ represents the periods during which both $C_{1A} < C_{1B}$ and $C_{2A} < C_{2B}$ hold, and $T_{\mathrm{bothB}}$ represents the periods during which both inequalities are reversed. In the numerical simulation, CR is 0 for any $\Delta\kappa$, indicating that the two players always select different slot machines. This is based on the fact that the cluster synchronization is stable in the range of $-20$ ns$^{-1}$ $\leq \Delta\kappa \leq$ 20 ns$^{-1}$, i.e., $\kappa_i \geq 8$ ns$^{-1}$ holds for both $i = 1, 2$, following from the discussion in Section 2.2. Consequently, we demonstrate asymmetric preferences as well as conflict avoidance in the 2 × 2 CMAB problem by changing the coupling strengths of the laser network.
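
The collision rate defined above can be computed from sampled STCC values as in the following sketch. The function name `collision_rate`, the treatment of ties (an equal pair is counted as a Slot B selection), and the four-sample input are illustrative assumptions.

```python
def collision_rate(c1a, c1b, c2a, c2b):
    """CR = (T_bothA + T_bothB) / T_valid from equally sampled STCC series."""
    both_a = both_b = valid = 0
    for a1, b1, a2, b2 in zip(c1a, c1b, c2a, c2b):
        p1_picks_a = a1 < b1   # Laser 1A leads, Player 1 selects Slot A
        p2_picks_a = a2 < b2   # Laser 2A leads, Player 2 selects Slot A
        both_a += p1_picks_a and p2_picks_a
        both_b += (not p1_picks_a) and (not p2_picks_a)
        valid += 1
    return (both_a + both_b) / valid


# Four hypothetical samples: no collision, no collision, both A, both B.
c1a = [0.1, 0.9, 0.1, 0.9]
c1b = [0.9, 0.1, 0.9, 0.1]
c2a = [0.9, 0.1, 0.1, 0.9]
c2b = [0.1, 0.9, 0.9, 0.1]
cr = collision_rate(c1a, c1b, c2a, c2b)
print(cr)
```

With perfect zero-lag synchronization between Lasers 1A and 2B and between Lasers 2A and 1B, the two comparisons always point to different slots, which is why the simulated CR is 0.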

3.2 Experiment
To validate our numerical simulations, we aim to observe a set of intensity waveforms that yield STCC values corresponding to different leader probabilities, including an equal situation, while achieving low conflict. Similarly to the simulations, we adopt the network (IV). The experimental setups are illustrated in Fig. 7, and the equipment is described in detail in Table 4. We use four distributed-feedback (DFB) semiconductor lasers without isolators, enabling optical injection. Two lasers, referred to as Lasers 1A and 2A, are mutually coupled through separate unidirectional optical paths established using optical circulators and isolators. Then, half of the light from Lasers 1A and 2A is individually injected into the other lasers, referred to as Lasers 1B and 2B, respectively. Injection current thresholds for Lasers 1A, 1B, 2A, and 2B are 11.0 mA, 11.8 mA, 11.6 mA, and 11.7 mA, respectively. Injection currents of Lasers 1A, 1B, 2A, and 2B are set to 12.1 mA, 12.4 mA, 12.7 mA, and 12.9 mA, respectively, corresponding to approximately 1.1 times the thresholds. The injection current distribution enables the LFF dynamics and ensures an equivalent optical power level for each laser when uncoupled. We adjust the temperature to achieve the peak wavelength of 1547.0 nm in the optical spectrum for the lasers, and set the detuning of solitary optical frequencies to 0 Hz for all lasers.
Here, we introduce $\kappa_{k,l}$, which represents the optical amplification rate of the light from Laser $k$ to Laser $l$ (when $\kappa_{k,l} = 1$, the light from Laser $k$ is transmitted to Laser $l$ without any amplification or attenuation, and when $\kappa_{k,l} = 0$, no light passes through). In the numerical simulation, we have assumed that all four lasers have identical properties and that $\kappa_{1A,2A} = \kappa_{1A,1B} = \kappa_1$ and $\kappa_{2A,1A} = \kappa_{2A,2B} = \kappa_2$ should be satisfied for cluster synchronization. In contrast, in the experimental setups, each laser has distinct features, which makes it necessary to scale the coupling strength $\kappa_{k,l}$. We insert electronic variable optical attenuators into each path and tune $\kappa_{k,l}$ by changing the applied voltage to the attenuators. We adjust the coupling strength to maximize the zero-lag synchronization precision for each target value of leader probabilities. Part of the [...] 3, indicate that we experimentally achieve zero-lag synchronization between Lasers 1A and 2B, and between Lasers 2A and 1B, for the five setups. In our experimental configurations, perfect zero-lag synchronization characterized by $\hat{C}_{1A,2B}(0) = \hat{C}_{2A,1B}(0) = 1$ is not observed, due to mismatches in laser internal properties, physical discrepancies between the coupling delay time of fibers, and thermal instabilities in the environment.
Figure 10 shows STCC waveforms and the corresponding leader probabilities and collision rate (CR). Similarly to the literature [24], we extract dropout parts of STCC for the calculation of the probabilities and CR because switches of leaders between lasers are stable only during the dropouts in our experimental configurations, as shown by the yellow and purple lines representing $C_{2A}$ and $C_{2B}$, respectively, in Fig. 10 (f). We define a dropout as starting when all the STCC values $C_{1A}$, $C_{1B}$, $C_{2A}$, and $C_{2B}$ fall below 0.45 and ending when one of them rises above 0.9. The area enclosed by a dashed rectangle in Fig. 10 (f) gives an example of the part extracted from the STCC waveform for setup [C]. We perform comparisons between $C_{1A}$ and $C_{1B}$, and between $C_{2A}$ and $C_{2B}$, every 10 ns, and $T_{\mathrm{valid}}$ is [A] 1690 ns, [B] 1380 ns, [C] 1250 ns, [D] 1240 ns, and [E] 1610 ns, respectively. As described in Fig. 10, the ratio of the leader probabilities of Laser 1A (2B) to Laser 1B (2A) ranges from approximately 30%-70% in [A], through 50%-50% in [C], to 70%-30% in [E], which partly shows the experimental controllability of the leader probabilities. An attempt to achieve further asymmetric situations brought about desynchronization in our experiment, explained by the unstable cluster synchronization due to the reduced coupling strength, as discussed in Section 2.2.
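
The dropout criterion can be read as a two-threshold (hysteresis) rule, sketched below. Only the thresholds 0.45 and 0.9 come from the text; the function name `dropout_mask`, the choice to exclude the sample at which a dropout ends, and the five-sample input are assumptions for illustration.

```python
def dropout_mask(stcc_rows, low=0.45, high=0.9):
    """Mark samples that belong to a dropout.

    stcc_rows: equally long sequences (C_1A, C_1B, C_2A, C_2B).
    A dropout starts when all values are below `low` and ends when any
    value rises above `high` (the ending sample is not counted).
    """
    mask = []
    inside = False
    for values in zip(*stcc_rows):
        if not inside and all(v < low for v in values):
            inside = True
        elif inside and any(v > high for v in values):
            inside = False
        mask.append(inside)
    return mask


# Hypothetical five-sample series.
c1a = [1.0, 0.4, 0.5, 0.95, 0.4]
c1b = [1.0, 0.3, 0.6, 0.2, 0.3]
c2a = [1.0, 0.2, 0.7, 0.2, 0.2]
c2b = [1.0, 0.1, 0.8, 0.2, 0.1]
mask = dropout_mask([c1a, c1b, c2a, c2b])
print(mask)
```

Only the samples flagged in the mask would then enter the leader-probability and CR calculations, so $T_{\mathrm{valid}}$ corresponds to the total flagged duration.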
Last but not least, the collision rate (CR) is approximately 0.15 and exhibits a minor dependency on the settings of coupling strength in the region. If two players independently and randomly select two slot machines evenly at 50%, CR is about 0.5. If Player 1 selects Slots A and B at a ratio of 30% : 70%, and Player 2 selects A and B at a ratio of 70% : 30%, isolated decision-making results in a CR of approximately 0.42. Our result is substantially lower than that value; therefore, we consider that we have experimentally demonstrated modifying the slot selections of players while maintaining low conflict.
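
The two baseline values follow from the probability that two independent players pick the same slot. A short check, with the hypothetical helper `independent_collision_rate` taking each player's probability of selecting Slot A:

```python
def independent_collision_rate(p1_a, p2_a):
    """Probability that two independent players select the same slot."""
    return p1_a * p2_a + (1.0 - p1_a) * (1.0 - p2_a)


even = independent_collision_rate(0.5, 0.5)    # both players at 50% : 50%
skewed = independent_collision_rate(0.3, 0.7)  # 30% : 70% versus 70% : 30%
print(round(even, 2), round(skewed, 2))
```

For the skewed case, both players choose Slot A with probability 0.3 × 0.7 = 0.21 and Slot B with probability 0.7 × 0.3 = 0.21, giving 0.42.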

4 Conclusion
We examined crucial aspects of a cooperative decision-making system based on chaotic lasers in a network configuration to solve the competitive multi-armed bandit (CMAB) problem. The discussion, validated with quantitative evaluation of synchronization, revealed that, in total, five networks derived from the bipartite graph exhibit cluster synchronization indispensable for the proposed collective decision-making. Note that the stability analysis provided insights into the minimal coupling strength for cluster synchronization, which was not discussed in the argument based on an unweighted matrix [30]. Our discussion also revealed some implications of essential network structures for cluster synchronization: first, a network, obtained by removing some paths of the multipartite graph, should include a loop with lasers corresponding to the number of options (= slot machines). Second, a network should still be (unilaterally) connected. Among the seven candidates illustrated in Fig. 2, the network (VI) does not meet the latter condition, and (VII) does not meet the former. Their verification will be required in scaled problem settings, e.g., the CMAB problem with two players and three slot machines.
Furthermore, we extended the decision-making function to include an exploitation mechanism in the CMAB problem by demonstrating, both in simulations and experiments, the controllability of the leader probabilities of lasers, which correspond to slot machine selection proportions in the suggested system. Specifically, in the experiments, we achieved five different leader ratios ranging from 30%-70% to 70%-30% while attaining a collision rate of approximately 0.15, demonstrating coordination of the decision-making compared to asynchronous and independent selection. On the other hand, the decision-making in practical problem configurations, for instance, a scenario where two players narrow down to two high-reward slot machines among three, remains to be implemented. This proof of principle for fulfilling players' asymmetric preferences, supported by the stability analysis of networks for cooperative decision-making, opens the door to the application of laser networks and other photonic-based systems for machine learning.

Figure 2: Network configurations that can be used for cooperative decision-making. (a) Complete bipartite network of four lasers. (b) Six candidate networks that meet constraints and have solutions of desired cluster synchronization.

Figure 3: Maximum Lyapunov exponents numerically calculated with coupling strength $\kappa$ = 0 ns$^{-1}$ to 30 ns$^{-1}$. (a) The exponents of the cluster synchronization. (b) The exponents of the global synchronization.


Figure 6: Leader probabilities of the four lasers changing with the difference of the coupling strength $\Delta\kappa$.

Figure 8: Experimental results: temporal laser intensity waveforms, measured with different coupling strength settings. (a) to (e) correspond to [A] to [E], respectively.

Figure 9: Experimental results: low-pass-filtered intensity, with (a) to (e) corresponding to different coupling strength settings [A] to [E], respectively.