Introduction

In recent years, the effort to understand in a comprehensive manner the physics and intricate dynamics of many-body and many-state (multimode) interacting systems has increased. This interest is driven by the realization that complex many-state configurations can lead to unparalleled opportunities ranging from biology and life sciences to chemistry, physics, and quantum information1,2,3,4,5,6. In photonics, similar prospects can also be pursued in connection with complex multimode nonlinear (MMN) photonic setups. For instance, multimode fibers are nowadays intensely investigated as a means for developing high -power optical sources and ultra-high bandwidth communication networks, an approach that was largely abandoned in the late eighties because of its inherent complexity (see ref. 7 and references therein for a concise review). While these emerging technologies could prove revolutionary, they inevitably pose a number of fundamental challenges that need to be addressed. These issues primarily stem from the extreme complexity of multimode systems that, in most cases, is further exacerbated by nonlinearities. This, in turn, leads to multi-wave mixing processes that induce nonlinear photon-photon collisions by which the many modes can exchange energy through a multitude of possible pathways, often numbering in the trillions even in the presence of one hundred modes or so7,8,9,10.

This microscopic complexity, introduced by either many-body or many-mode energy exchange processes, acts as the binding link among the photonics testbed and the other seemingly different platforms that host many-body/multimode interacting systems. As opposed to these other frameworks, complex nonlinear multimode photonic arrangements provide versatile experimental platforms; some of which are directly related to recent developments in multiplexing/demultiplexing schemes11,12,13,14,15,16,17,18,19. These allowed us to experimentally probe the whole modal occupation statistics in perfect (i.e. ordered) MMN fibers20 and demonstrate the convergence of an initial excitation towards a thermal Rayleigh-Jeans (RJ) distribution as predicted by the theory19,20,21,22,23,24,25,26. In fact, the controllable nature of photonic platforms has been successfully utilized during the last thirty years to investigate and demonstrate a plethora of phenomena, ranging from non-Hermitian systems with dynamical symmetries27,28, topological phases29,30, phase transitions31,32, linear and non-linear Anderson localization effects33,34,35,36,37,38,39 and more.

Motivated by these new experimental opportunities in photonics, we develop here a theory of thermalization of out-of- equilibrium optical states towards their thermal Rayleigh-Jeans (RJ) distribution, see Fig. 1. We show that disorder and nonlinear mode-mode interactions can lead to unconventional thermalization scenarios. For excitations with high optical temperatures T > T*, the relaxation rates Γ obey a universal one-parameter scaling theory that is independent of the microscopic details (statistical properties of the disorder) of the underlying system. The scaling parameter reflects the number of independent segments \({{{{{{{{\mathcal{N}}}}}}}}}_{{{{{{{{\rm{eff}}}}}}}}}\), each being of the size of the localization length that describes the spatial extension of the linear supermodes in the presence of disorder. As \({{{{{{{{\mathcal{N}}}}}}}}}_{{{{{{{{\rm{eff}}}}}}}}}\) increases due to increasing randomness, the relaxation rate goes to zero, hindering the thermalization process. At T < T*, we discovered a crossover from linear to quadratic temperature dependence of Γ as the system transitions to a strong disorder regime. The origin of this crossover is traced to a re-organization of the low energy modes due to the formation of Urbach tails in the density of states. We expect that the predictions presented below will be soon tested in experimental photonic platforms like nonlinear mesh photonic lattices40 and further utilized for the design of MMN fibers with potential applications to communication networks.

Fig. 1: Relaxation to an equilibrium state.
figure 1

Schematic array of coupled (a) waveguides; (b) microresonators. The initial mode distribution nα (blue) evolves toward a Rayleigh-Jeans (RJ) thermalized state (red). c The relative target-mode power \(\bar{{I}_{\alpha }}\), indicated with a black (red) line, describes an exponential relaxation of an out-of-equilibrium state towards a RJ distribution in case of random waveguide/resonator arrays with disorder strength W = 0.5 (W = 2) (see Eq. (9) in Methods). Parameters: a = 1, T = 100, χ = 0.01, N = 500.

Results

Theoretical framework

We consider MMN arrays of coupled waveguides (Fig. 1a) or coupled nonlinear resonators (Fig. 1b). Their beam dynamics is described using a time-dependent coupled mode theory41,42,43,44

$${{{{{{{\rm{i}}}}}}}}\frac{d{\psi }_{l}}{dt}=-\mathop{\sum}\limits_{j\ne l}{J}_{lj}{\psi }_{j}+{\omega }_{l}{\psi }_{l}+\chi | {\psi }_{l}{| }^{2}{\psi }_{l},\quad \quad l=1,\ldots ,N,$$
(1)

where ψl is the complex field amplitude at site l (e.g. waveguide or resonator) of the network, and \({J}_{lj}={J}_{jl}^{* }=J{\delta }_{j,l\pm 1}\) describes the evanescent coupling coefficient between sites l and j = l ± 1. The optical on-site potential (e.g. propagation constant or resonant frequency) at site l is randomly chosen from a uniform distribution of width W, i.e. \({\omega }_{l}\in [-\frac{W}{2},\frac{W}{2}]\). Finally, the last term describes nonlinear effects, e.g. associated with a Kerr-type nonlinearity with coefficient χ. In the case of coupled resonators arrays, t describes time, while in the case of single-mode multicore (or multimode) fibers or waveguides t describes the paraxial propagation distance z.

Equation (1) can be represented in the eigenmodes \(\left({f}_{\alpha }(l),\alpha =1,\ldots ,N\right)\) of the corresponding linear problem (supermode basis) as

$${{{{{{{\rm{i}}}}}}}}\frac{d{C}_{\alpha }}{dt}={\varepsilon }_{\alpha }{C}_{\alpha }+\chi \mathop{\sum}\limits_{\beta \gamma \delta }{Q}_{\alpha \beta \gamma \delta }{C}_{\beta }^{* }{C}_{\gamma }{C}_{\delta }.$$
(2)

Here, Cα(t)’s are the field expansion coefficients, i.e., ψl(t) = ∑αCα(t)fα(l) and ε1 < ε2 <  < εN are the eigenfrequencies associated with the supermodes fα, α = 1,  , N. The mixing coefficient \({Q}_{\alpha \beta \gamma \delta }={\sum }_{l}{f}_{\alpha }^{* }(l){f}_{\beta }^{* }(l){f}_{\gamma }(l){f}_{\delta }(l)\) describes the interactions associated with the nonlinear mixing between supermodes.

Equations ((1),(2)) conserve the internal energy \({{{{{{{\mathcal{H}}}}}}}}\left(\{{\psi }_{l}(t)\}\right)={{{{{{{{\mathcal{H}}}}}}}}}_{{{{{{{{\rm{NL}}}}}}}}}+{{{{{{{{\mathcal{H}}}}}}}}}_{L}=0.5\chi {\sum }_{l}| {\psi }_{l}{| }^{4}+\big[{\sum }_{l}| {\psi }_{l}{| }^{2}{\omega }_{l}-{\sum }_{lj}{J}_{lj}{\psi }_{l}^{* }{\psi }_{j}\big]\, \equiv E\). In realistic cases χ is small, and therefore \({{{{{{{\mathcal{H}}}}}}}}\left(\{{\psi }_{l}(t)\}\right)\approx {{{{{{{{\mathcal{H}}}}}}}}}_{L}={\sum }_{\alpha }{\varepsilon }_{\alpha }{I}_{\alpha }\), where \({I}_{\alpha }\equiv {\left\vert {C}_{\alpha }\right\vert }^{2}\) is the optical power of a supermode α. In addition to E, the total optical power of the beam \({{{{{{{\mathcal{N}}}}}}}}\left(\{{\psi }_{l}(t)\}\right)={\sum }_{l}| {\psi }_{l}(t){| }^{2}={\sum }_{\alpha }{I}_{\alpha }\equiv A\) is also a constant of motion. Since both A and E are extensive quantities, we also define the modal energy density h = E/N and the modal power density a = A/N.

Assuming thermal equilibrium and weak nonlinearity, the thermal state is described by the Rayleigh-Jeans (RJ) distribution23,24,25 \({n}_{\alpha }\equiv \left\langle {I}_{\alpha }(t)\right\rangle =\frac{T}{{\varepsilon }_{\alpha }-\mu }\), where 〈   〉 indicates thermal averaging and nα is the optical power of the supermode α. The optical temperature T and chemical potential μ, are determined by the initial beam’s total internal energy and optical power.

Relaxation rates

Under the assumption of weak nonlinearities, phase randomization of the field amplitudes and amplitude randomization8,9,10, the slowly varying mode intensities follow the kinetic equation (KE) (for details in the derivation of the KE, see the Supplementary Note 1, which cites refs. 8,9,45,46)

$$\frac{d{I}_{\alpha }}{dt}={\chi }^{2}\mathop{\sum }\limits_{\beta \gamma \delta }^{{\prime} }{V}_{\alpha \beta \gamma \delta }{I}_{\alpha }{I}_{\beta }{I}_{\gamma }{I}_{\delta }\left(\frac{1}{{I}_{\alpha }}+\frac{1}{{I}_{\beta }}-\frac{1}{{I}_{\gamma }}-\frac{1}{{I}_{\delta }}\right),$$
(3)

where the summation \({\sum }^{{\prime} }\) excludes the secular terms and Vαβγδ = 4πQαβγδ2δ(εα + εβ − εγ − εδ) reflects four-wave mixing processes. When the ωl’s are random, the supermodes are exponentially localized leading to a suppression of the coupling Vαβγδ between them10. Nevertheless, Eq. (3) is still valid when the localization length of the supermodes is larger than the lattice spacing46.

Next, we consider a small variation Iα → Iα + δIα of the modal power of the mode α from its RJ steady state solution, while all other modes remain at equilibrium, i.e., δIβ,γ,δ = 0. After a substitution into Eq. (3) and a subsequent approximate linearization, we arrive at a rate equation47

$$\frac{d\delta {I}_{\alpha }}{dt}\approx -\frac{\delta {I}_{\alpha }}{{\tau }_{\alpha }};\quad \frac{1}{{\tau }_{\alpha }}\equiv {\Gamma }_{\alpha }=\frac{{\chi }^{2}}{{n}_{\alpha }}\mathop{\sum }\limits_{\beta \gamma \delta }^{{\prime} }{V}_{\alpha \beta \gamma \delta }{n}_{\beta }{n}_{\gamma }{n}_{\delta },$$
(4)

which describes an exponential relaxation process of the α − th mode towards its RJ thermal state. We point out that, according to the KE predictions, there are families of initial preparations which do not relax to the initial RJ distribution48. Their identification and the more general problem of the applicability of the KE approach to describe their thermalization process generated by the CMT Eq. (1) is a subject that deserves a separate study. The current study focuses on beam preparations as described in the Methods section.

The above rate equation, Eq. (4), has been derived using the following approximations: (a) We have excluded off-diagonal contributions in the relaxation process which are associated with variations δIβ, δIγ and δIδ where β, γ, δ ≠ α. Such exclusion is equivalent to the assumption that only the α − th mode has been perturbed away from its RJ thermal state, while all other modes remain at their steady-state i.e. δIβ = δIγ = δIδ = 0. In the thermodynamic limit N → , this constraint can be further relaxed to δIβα ≠ 0 with δIαδIβα. Notice that the neglected off-diagonal terms prevent the conservation of the power and the energy (A, E), as opposed to the Eq. (1) of the main text. (b) We have also excluded diagonal terms of the KE associated with (α, β) = (γ, δ) or (α, β) = (δ, γ) which appear rarely in the summation and their contribution is negligible with respect to the remaining term appearing in Eq. (4). These approximations allowed us to come up with a relatively simple rate equation Eq. (4) which has been used for the theoretical analysis of the relaxation rates Γα. Using this equation, we have calculated Γα by computing the supermodes of the linear Hamiltonian \({{{{{{{{\mathcal{H}}}}}}}}}_{L}\) (see Methods). We have further confirmed Eq. (4) by evaluating the relaxation rates of the modal power Iα(t) towards its equilibrium value nα via a direct numerical evaluation of the beam dynamics generated by Eq. (1). The simulations utilize an ensemble of initial conditions that maintain the (T, μ) values constant (see Methods). A representative example of a relaxation dynamics is shown in Fig. 1c.

Relaxation rates for periodic MMN circuits

First, we investigate the relaxation rates \({\Gamma }_{\alpha }^{(0)}\) for W = 0 (see Supplementary Note 2, which cites refs. 46,47). In the limit N → , the sums in Eq. (4) can turn into integrals over the wavenumbers after substituting \({\varepsilon }_{\alpha }=2J\cos ({k}_{\alpha })\) where the wavevector kα [ − π, π]. From Eq. (4)

$${\Gamma }_{\alpha }^{(0)} \sim {\chi }^{2}\frac{a}{J}\left\{\begin{array}{cc}a;&T\to \infty \\ \frac{T}{J};&T\to 0,\end{array}\right.$$
(5)

for a typical mode near the center of the band46,47.

Equation (5) captures all qualitative features of our dynamical simulations from low up to moderate temperatures (light blue and brown stars in Fig. 2(a)). The transition temperature T* ~ J from a linear T-dependent \({\Gamma }_{\alpha }^{(0)}\) to a temperature-independent domain can be evaluated by extrapolating the linear relation in Eq. (5) up to moderate temperatures T and identifying the intersection point with the high-T relaxation rate in Eq. (5) (for alternative derivation see Supplementary Note 2).

Fig. 2: Temperature dependence of the relaxation rates.
figure 2

a Relaxation rates Γ, rescaled with Γ(T → ), versus T/T*. The filled symbols result from the kinetic equation (KE), Eq. (4), and the open symbols from dynamical simulations. Various symbols correspond to different disorder strengths W = 0, 0.01, 0.1, 0.25, 0.5, 0.75 (blue) and W = 1, 1.5, 2, 3 (red), for two representative modes with ε ≈ 0, J. The results of dynamical simulations for W = 0 and ε ≈ 0(J) are indicated with cyan (brown) stars. The black dashed (solid) line indicates a quadratic (linear) temperature dependence. The high-temperature regime is T-independent (dashed-dotted line). The extraction of Γ’s from the dynamical simulations have utilized the beam dynamics generated by Eq. (1) using an ensemble of initial preparations that maintain the (T, μ)-values constant (see Methods). b Relaxation rates Γ versus temperature evaluated from the KE approach (lines) and dynamical simulations (symbols) for two different disorder strengths W = 0.01 (blue line and circles) and W = 2 (red line and triangles). The black solid (dashed) line has slope 1 (2) and is drawn to guide the eye. c Transition temperature T* versus W (rescaled with the coupling constant J). We evaluate T* from dynamical simulations for two different frequencies εα = 0 (filled circles) and εα = J (open circles) and from the KE. The dashed-dotted line indicates a linear behavior of T* with W. Parameters: a = 1, χ = 0.01. In dynamical simulations: system size N = 500.

Relaxation rates for disordered networks

Next we investigate the effects of disorder W ≠ 0 on the relaxation rates Γα. We first calculate Γα versus T (for fixed a) by evaluating numerically Eq. (4). Two representative cases, corresponding to weak (W = 0.01) and moderate (W = 2) disorder strengths are shown in Fig. 2b. These results are compared with the Γα’s extracted from dynamical simulations. For high temperatures, the relaxation rates are insensitive to temperature changes, while for moderate/low temperatures, their value depends on T. Although this behavior is reminiscent of the W = 0 case, the disorder introduces some distinct features: (a) for T > T* the value of Γα depends on the disorder strength W; (b) the same applies also for the transition temperature T* (see Fig. 2c for various disorder strengths); and (c) in the low/moderate-temperature domain, we find a crossover from a linear to a quadratic temperature dependence of Γ(T) when disorder exceeds a critical value W* ≈ J.

While the linear relation with T (occurring for W < J) can be understood from the analysis of the W = 0 case, the quadratic dependence of Γα on the temperature (for WJ) is different. It turns out that, in the moderate disordered range W ≈ W*, it can be captured by the evaluation of the KE, see Eq. (4) which gives that

$$\Gamma \sim \left\{\begin{array}{cc}\Gamma (T\to \infty )&;\quad T > {T}^{* } \sim aW\\ {T}^{2}&;\quad T < {T}^{* } \sim aW\end{array}\right.$$
(6)

in agreement with the dynamical simulations.

The following qualitative explanation sheds some light on the T2 behavior. In the low-temperature regime, the power is mainly concentrated on the lower modes. In this domain, the density of states (DoS) develops exponentially decaying tails (Urbach tails) (see Supplementary Notes 5 and 6, which cites refs. 47,49), i.e., \(\rho (\varepsilon ) \sim \exp \left(-C/\sqrt{\left\vert \varepsilon -{\varepsilon }_{b}\right\vert }\right)\) where C is a dimensional constant and εb = − (W/2 + 2J) is the true band-edge49,50,51,52. The exponential decay of the DoS in the low frequency domain εbεα − 2J allows us to ignore the contribution of these εα-modes in Eq. (4) when the sums are turned into frequency integrals and consider only the bulk modes near the center of the band. The occupation number for the latter is \({n}_{\alpha }=\frac{T}{{\varepsilon }_{\alpha }-\mu }=\frac{T}{{\varepsilon }_{\alpha }-{\varepsilon }_{1}}+{{{{{{{\mathcal{O}}}}}}}}({\varepsilon }_{1}-\mu )\) where (ε1 − μ) ~ T → 0 (see Supplementary Note 5). Substitution of this expression into Eq. (4), leads to Γα ~ T2 in the low temperature and strong disorder domain.

At the high-temperature regime, the modal power is uniformly distributed among the various modes, i.e., nα ≈ − T/μ ≈ a. A substitution of the nα’s into Eq. (4) leads to a temperature-independent Γα, see Eq. (6). Of course, the asymptotic value Γ(T → ) depends on W, which affects the localization properties of the eigenmodes \({f}_{\alpha }(l) \sim 1/\sqrt{{\xi }_{\alpha }(W)}\exp \left(-| {l}_{0}-l| /{\xi }_{\alpha }(W)\right)\) and subsequently the value of the coefficients Qαβγδ. The so-called localization length ξα(W) of the α-th supermode describes its exponential decay away from a localization center l0 due to the presence of the disorder W.

Finally, the dependence of the transition temperature T* on disorder is understood using similar arguments like the ones that have been used in the case of periodic structures (see Supplementary Note 2). The main observation is that for moderate/strong disorder and low-temperatures \(| \mu | \approx | {\varepsilon }_{1}| \sim W+{{{{{{{\mathcal{O}}}}}}}}(T/A)\) which applies up to TT*. On the other hand, approaching T* from the high-temperature regime allows us to approximate nα ≈ − T*/μ = a. From this expression, we find that T* ~ − aμ, and by substituting the value of μ ~ − W, we eventually recover the expression in Eq. (6). A summary of the extracted T* for both weak (W < W* = J) and strong (W > W*) disorder is shown in Fig. 2c. In the former limit we observe that indeed T* ≈ J while in the latter T* ~ W.

One-parameter scaling of the relaxation rates for high energy states

We will postulate a one-parameter scaling theory that allows us to predict the relaxation rates Γα as a function of the parameters (A, E, χ, W, N) that control the thermalization process in 1D photonic networks. Inspired by previous studies on (linear) Anderson localization24,53,54,55,56,57, we postulate the scaling ansatz

$$p\equiv \frac{{\overline{\Gamma }}_{\alpha }}{{\Gamma }_{\alpha }^{(0)}}=f(\lambda );\quad \lambda =\sqrt{\chi /J}{\xi }_{\alpha }(W)a$$
(7)

where \(\overline{\cdots }\) denotes an averaging over disorder realizations and over states within a small frequency window and \({\Gamma }_{\alpha }^{(0)}\) is the relaxation rate of an underlying periodic system (i.e., W = 0). The specific microscopic characteristics of the disorder (statistical properties, etc.) are captured by the localization length. For box-distribution of ωn [ − W/2, W/2] (and W ≤ 3), the localization length is approximated as33\({\xi }_{\alpha }(W)\approx 24(4{J}^{2}-{\varepsilon }_{\alpha }^{2})/{W}^{2}\).

The results from our dynamical simulations (Fig. 3) for various N, W, A, E, χ-values, confirm the validity of Eq. (7). An interpolating law is

$$f(\lambda )=\exp (-C/\sqrt{\lambda })=\left\{\begin{array}{cc}0;&\lambda \to 0\\ 1;&\lambda \to \infty \end{array}\right.$$
(8)

where C ≈ 3 is a best fitting constant. In Fig. 3, we also show the results of the KE, Eq. (3), which reproduce nicely the numerical results for λ > 10−1. For smaller λ-values (e.g., strong disorder W≥3), the KE is not applicable, resulting in deviations from the exact numerics. The failure of Eq. (3) (and subsequently of Eq. (4)) in this λ-range is associated with the requirement that Vαβγδ must couple many modes. This is possible only when the localization lengths of the various modes participating in the summation are much larger than the lattice spacing, otherwise Vαβγδ is exponentially suppressed due to strong Anderson localization effects (see Supplementary Note 3).

Fig. 3: One-parameter scaling of the relaxation rates.
figure 3

The relaxation rates are extracted from dynamical simulations and normalized to their asymptotic value \({\Gamma }_{\alpha }^{(0)}=\Gamma (W\to 0)\) for two representative eigenmodes with ε ≈ 0 (filled symbols) and ε ≈ J (open symbols). The scaling parameter is \(\lambda =\sqrt{\chi /J}\xi (W)a\), see Eq. (7). The black dashed line indicates the scaling function Eq. (8). The KE, Eq. (4), (black stars) describes accurately the relaxation rates extracted from the dynamical simulations failing only for strong disorder W ≥ 3 (λ ≤ 10−1). We also test our scaling theory for multiple nonlinearity (MNL) strengths χ = 0.005, 0.01, 0.015, 0.02 for J = 0.5, 1, 2, 3 (orange stars). (Inset) The unscaled relaxation rates versus disorder W for different values of J, χ, a, and system sizes N = 64, 120, 250, 500 (various symbols). The extraction of Γ’s from the dynamical simulations have utilized the beam dynamics generated by Eq. (1) using an ensemble of initial preparations that maintain the (T, μ)-values constant (see Methods).

The two limiting cases in Eq. (8) reflect the prevailing physical mechanism that controls the relaxation process. The first limit, \(\lambda \sim \frac{\sqrt{\chi }}{{W}^{2}}\to 0\), corresponds to strong disorder (WJ), where the Anderson localization effects dominate. In this case, the effective coupling coefficient that controls the coupling between the various sites is weak, J/W 1. Thus, the appropriate basis is the Wannier basis10, and one can apply the optical thermodynamics methodology focusing on the individual resonator39. For weak nonlinearities, the RJ is the zero-th order term with corrections39 \(\sim {{{{{{{\mathcal{O}}}}}}}}\left(T\chi /({\varepsilon }_{\alpha }-\mu )\right)\). For the W values that we have used, these corrections are negligible and the thermal state is described nicely by the RJ distribution (see Supplementary Note 4).

In the extreme case of zero effective coupling, the supermodes are confined to a single site (Wannier states), and the system of Eq. (1) develops Deff = 2 × N local integrals of motion \({{{{{{{{\mathcal{H}}}}}}}}}_{n}=({\epsilon }_{n}+\frac{\chi }{2}| {\psi }_{n}{| }^{2})| {\psi }_{n}{| }^{2}={E}_{n}\) and \({{{{{{{{\mathcal{N}}}}}}}}}_{n}=| {\psi }_{n}{| }^{2}={A}_{n}\) that form a complete set, preventing the system from thermalizing, i.e., Γα = 0, α = 1,  , N as indicated by Eqs. (7),(8)). As the effective coupling increases (W decreases leading to an increasing ξ ~ 1/W2), the modes are segregated into \({N}_{{{{{{{{\rm{eff}}}}}}}}}=N/\min \{\xi ,N\}\) clusters, which are weakly coupled to one another. Thus, the system can be treated as a Neff-collection of quasi -isolated clusters with Deff = 2 × Neff emergent local (quasi-)integrals of motion corresponding to an equal number of local (quasi-)conservation laws, which slow down the relaxation process. Further decrease of W, leads to an increase of the effective coupling until eventually, at W = 0, the modes of the system fuse into one supercluster, i.e., Neff = 1 while Deff = 2. The latter limit corresponds to λ → , where the scaling function becomes unity, see Eq. (8). It is interesting to point out that Neff resembles the scaling parameter that controls the conductance in the standard theory of Anderson localization. It also appears explicitly in Eq. (7) when the power density is expressed as a = A/N. Specifically, we get that \(\lambda =\sqrt{\chi /J}A/{N}_{{{{{{{{\rm{eff}}}}}}}}}\).

The transition to a disorder-hindered thermalization regime at λ* ≈ 1 (see Fig. 3) can also be understood using an energetic argument, which provides an alternative insight for the emergence of the scaling parameter λ. First, we realize that the formation of localization boxes involve clusters of ξ interacting modes, which are apart by a frequency spacing Δξ ~ J/ξ. The nonlinear interactions between the modes is energetically costly for the system (see Supplementary Note 3), which increases its total energy by an amount Eξ ~ hξ = χa2ξ. As long as Eξ ≤ Δξ, the modes will resist the interaction pressure and the mode-segregation will persist. Conversely, the localization boxes will be destroyed and each mode will be able to interact with a sea of N other modes. The critical condition in terms of the scaling parameter is λ2 ≡ Eξξ = 1, which highlights the importance of λ and allows us to identify the transition point as λ* ~ 1.

Discussion

It is important to distinguish our study with some of the existing literature58,59,60 on thermalization processes in the presence of dynamical disorder. In these cases the randomness (either in the coupling between the modes or in the propagation constants) varies along the paraxial propagation in MMN fibers and the internal energy (Hamiltonian \({{{{{{{\mathcal{H}}}}}}}}\)) is not conserved during the beam dynamics (as opposed to our case). Such dynamical disorder has been used, for example, in the mesoscopic physics framework for modeling the effects of noise and the associated dephasing mechanism61. In our case, however, the disorder is static (i.e., remains constant along the paraxial propagation). In this respect, our framework aligns with some mean-field studies that are addressing the interplay of disorder and thermalization in bosonic gases (see, for example, the review62 and references therein). These works are mainly focusing on the consequences of nonlinear interactions in the weak localization phenomena (e.g. coherent back/ forward-scattering) occurring when a beam of well-defined direction propagates in a two-dimensional random potential with speckle statistics. They have concluded (opposite to us!) that long-time thermalization always occurs irrespective of the disorder strength, albeit the transient dynamics leading to the thermal state can be non-trivial62. The same conclusion about thermalization was recently reached for an array of coupled nonlinear oscillators with random masses63. These systems have one conserved quantity (the total energy), and therefore reach equipartition (not RJ as in our case), the strength of disorder is finite, and all modes form a connected network. In contrast, in our case, the coupling between modes can be exponentially suppressed in the strong disorder limit, where thermalization is inhibited. In the opposite limit of weak disorder, our proposed scaling law Eqs. (7),(8) agrees (as expected) with the results of ref. 63and predicts a relaxation rate Γ ~ 1/χ2.

The localization-hindered thermalization that is predicted from our study in case of strong disorder, aligns with the ones derived in the framework of quantum many-body interacting systems where the phenomenon of Many-Body Localization (MBL) in one -dimensional disordered systems has been recently established as a distinct dynamical phase of matter6,64. This phenomenon constitutes a paradigm of a broad family of systems that do not abide to ergodicity (thus resisting the conventional thermalization scenarios64,65 offering an alternative mechanism for protecting types of order that are forbidden in equilibrium) and has implications for the design of quantum computing protocols. If indeed our results constitute a classical re-incarnation of the quantum MBL then our study represents a paradigm shift to the case of classical many-mode interacting systems. In this case, the scaling theory developed here, together with the transition from linear to nonlinear temperature dependence, of the relaxation rates at low optical temperatures, will be bringing a new light to the challenging quantum counterpart problem. This is a subject of further investigation that goes beyond the scope of the current contribution.

Conclusion

We have addressed the thermalization process of photonic MMN circuits in the presence of static disorder. At low/moderate optical temperatures, the relaxation rates Γα demonstrate a crossover from a linear to a quadratic temperature dependence when the disorder increases beyond a critical value that describes the evanescent coupling between the resonators (waveguides). At high-temperatures, Γα is temperature-independent and follows a universal one-parameter scaling law, which describes the exponential slow-down of relaxation as the disorder (and optical power density) is increasing (decreasing).

These predictions can be tested experimentally using fiber optics platforms where transverse Anderson localization was observed66,67, or nonlinear mesh photonic lattices40. Our theory can guide further the understanding of the thermalization processes of many-body/multimode (bosonic) systems. It will be interesting to extend the analysis of relaxation rates beyond their mean/typical value and analyze the properties of the whole distribution.

Methods

Evaluation of the relaxation rates using Eq. (4)

We have directly evaluated Eq. (4) by a diagonalization of the Hamiltonian matrices J that describe the connectivity of the network, see Eq. (1) of the main text. In these calculations, we have used system sizes of N = 500 and power density a = 1, and we have considered various disorder strengths W. The δ function appearing in Eq. (4) is numerically implemented by considering modes α, β, γ, δ such that εα + εβ − εγ − εδw with a tolerance w. This procedure is analogous to a frequency δ function given by a box of width 2w and height 1/(2w). We have checked that a value of w = 0.01 guarantees a convergence of the calculation of the relaxation rates for W ≤ 3.

Our analysis involved the processing of more than 2000 relaxation rates that have been collected by considering a number M ≈ 100 of different disordered realizations. For a better statistical analysis, we have also considered the relaxation rates within a small frequency window [ − 0.1J + εα, 0.1J + εα]. Due to the strong fluctuations in the evaluated Γα, we have calculated the statistical mode of their logarithm and then reverted to the processed relaxation rates. Finally, we have confirmed that finite size effects in the evaluated Γα’s for N = 500 are negligible by comparing them in some selected cases with the results extracted from systems of size N = 1000.

Evaluation of relaxation rates from beam dynamics

We have simulated the beam dynamics that is generated by Eq. (1) of the main text using an 8-th order three-part split symplectic integrator algorithm which ensures the conservation of the total norm A and energy E24,47,68,69. Typically, in our simulations, we have propagated an initial excitation for times t 104 of inverse coupling constants. During the propagation, we have monitored the accuracy of our simulations by making sure that both \((E,{{{{{{{\mathcal{N}}}}}}}})\) are conserved up to \({{{{{{{\mathcal{O}}}}}}}}(1{0}^{-8})\).

We have generated an ensemble of initial preparations with supermode occupations Iβ = T/(εβ − μ) given by a RJ distribution. The ensemble was characterized by the values of the optical temperature T and chemical potential μ that were used to define the RJ distribution. Then, the modal power Iα of a targeted α-eigenmode was perturbed by an amount δIα. In all our simulations, we have omitted the analysis of the relaxation rate for supermodes that are close to the band-edge. Such supermodes have an anomalously small localization length and the applicability of the KE is questionable.

In order to maintain the ensemble (T, μ) values (up to \({{{{{{{\mathcal{O}}}}}}}}(1/N)\)), we have also perturbed by a random amount δβ ΔIα the modal power of nearby β-eigenmodes. Both the targeted eigenmode and the nearby eigemodes were chosen to be inside a small frequency window εβ − ε/J≤0.1 around a targeted frequency ε in order to guarantee a statistically similar relaxation process. For a better statistical processing, the perturbed α − mode was swept among the modes inside the small frequency window. The ensemble of total preparations was typically involving M ≈ 5000 initial conditions, which were created by randomizing the phases of the complex amplitudes Cβ, thus keeping fixed both E and \({{{{{{{\mathcal{N}}}}}}}}\). The evolution of these out-of-equilibrium states was monitored in time and the modal power Iα(t) was computed and used to extract the relaxation rates Γα towards the thermal equilibrium value nα. To this end, we have evaluated the relative target-mode power Iα(t) which takes the form

$$\overline{{I}_{\alpha }(t)}\equiv \frac{1}{M}\mathop{\sum }\limits_{i=1}^{M}{I}_{i}(t)-{n}_{\alpha }\propto \exp (-{\Gamma }_{\alpha }t).$$
(9)

In Fig. 1c, we show two characteristic examples of the modal-power dynamics \(\overline{{I}_{\alpha }(t)}\) of a mode near the band-center for two disorder photonic circuits with the same nonlinearity strength χ, modal power density a and optical temperature T = 100 and two different disordered strengths corresponding to W = 0.5 (black) and W = 2 (red). While in both cases the relaxation process is exponential, in the latter case the thermalization process is slower.