Quantum ergodicity in the SYK model

We present a replica path integral approach describing the quantum chaotic dynamics of the SYK model at large time scales. The theory leads to the identification of non-ergodic collective modes which relax and eventually give way to an ergodic long time regime (describable by random matrix theory). These modes, which play a role conceptually similar to the diffusion modes of dirty metals, carry quantum numbers which we identify as the generators of the Clifford algebra: each of the $2^N$ different products that can be formed from $N$ Majorana operators defines one effective mode. The competition between a decay rate quickly growing in the order of the product and a density of modes exponentially growing in the same parameter explains the characteristics of the system's approach to the ergodic long time regime. We probe this dynamics through various spectral correlation functions and obtain favorable agreement with existing numerical data.


Introduction
The Sachdev-Ye-Kitaev (SYK) model [1,2] has become a paradigm of hard quantum chaos in strongly interacting quantum matter. Standing in the tradition of a general class of random interaction models [3,4,5,6], it is a system of N (Majorana) fermions, χ i with i = 1, . . . , N, subject to a four-fermion interaction with Gaussian distributed random matrix elements J i jkl of zero mean and a variance given by |J i jkl | 2 = 6J 2 /N 3 . The model is known [2] to show hard many body quantum chaos at all time scales. For 'semiclassically short' times, chaos manifests itself in exponentially decaying correlations, as described by out of time correlation functions [7,8,9,10,11,12]. In the complementary regime of ultra-long times, of the order of the inverse of the many body level spacing, chaos is diagnosed via quantum level repulsion otherwise found for random matrix theory (RMT) ensembles [13]. However, a question that has not really been answered so far is at what time or energy scales the system actually becomes ergodic. Relatedly, the nature of the system's effectively irreversible dynamics prior to entering the asymptotic ergodic long time regimes remains unclear. To motivate the question on a simpler example, the dynamics of a diffusive d-dimensional metal of linear extension L is chaotic at all time scales (exceeding the elastic scattering time.) However a crossover to ergodic long time dynamics takes place only at times t > t erg = L 2 /D exceeding the classical diffusion time through the system, where D is the diffusion constant. That time scale is called the ergodic time, or, in the specific context of dirty metals, the Thouless time. At time scales shorter than t erg the dynamics of the system is governed by diffusion modes relaxing in time. Technically, these are eigenmodes of the diffusion operator, and they are labeled by a set of ('momentum') quantum numbers q = n2π/L, where n = (n 1 , . . . , n d ) is a vector of integers, and D|q| 2 defines the decay rates. The inverse of the lowest non-vanishing of these scales, D(2π/L) 2 defines the Thouless energy.
In this paper, we address analogous questions for the SYK model: what is its ergodic time, and what is the nature of the relaxation modes prevailing at shorter scale? Can these modes be classified by effective 'quantum numbers', and if so, what is the density of these modes? Finally, what are the observable consequences in spectral correlation functions? We will provide answers to these questions and test their validity by comparison to existing numerical data. Specifically, there are two numerical analyses providing test criteria for our approach: in Ref. [14] the spectral number variance, Σ 2 ( ), i.e. the statistical variation in the number of many body levels contained in an energy window of width E has been obtained for systems of fermion number up to N = 34. For energies beyond an N-dependent time scale (which was difficult to estimate quantitatively on the basis of the available data but conjectured to be an algebraic power of the band-width) deviations from the results of random matrix ensemble number variances were seen. These deviations signal the breakdown of ergodicity and their quantitative computation is one of the objectives of our analysis. In Ref. [15] the spectral form factor, K(τ), i.e. the Fourier transform of the energy dependent spectral two-point correlation function, R 2 ( ) (for the concrete definition of these functions, see the next section), was computed for systems of different size. While the long time profile showed a ramp structure characteristic for RMT ensembles, universal deviations were observed for shorter times (see also [16,17,18,19] for related studies). The quantitative analytic reproduction of the non-ergodic contributions to the number variance and the form factor, and the demonstration that they originate in the same set of relaxation modes sets a stringent test for the validity of our analysis.
In this paper, we will approach the SYK model from a perspective different from that of previous analyses. The idea is to consider its Hamiltonian as a random first quantized operator (a random matrix) acting in the 2 N/2 -dimensional Hilbert space of the system. This matrix is sparse in that it contains only algebraically many independent matrix elements, compared to a rank increasing exponentially in N. The comparatively low entropy contained in this structure is responsible for the phenomenological deviations from maximum entropy random matrix Hamiltonians defined through a full set of i.i.d. distributed matrix elements. Methodologically, the advantage gained from the first quantized perspective is that powerful field theoretical methods developed for random single particle problems become applicable to the present system. Conceptually, this approach provides insight into the question how many-body quantum chaos seeded into a large Hilbert-Fock space via the 'few' interaction matrix elements works its way through an exponentially large phase volume to eventually stabilize an ergodic phase.
We will start in the next section with a brief review of spectral correlations in random quantum single particle systems. In view of numerous analogies this will be instructive and introduce the appropriate language for our later discussion of the SYK problem. In the second part of section 2 we summarize our main results and compare to earlier numerical studies. In section 3 we introduce the field theoretical framework for the quantitative analysis and in section 4 formulate a mean field analysis. In sections 5 and 6 we discuss the types of fluctuations relevant for the description of the ergodic sector and the relaxation modes, respectively. Section 7 contains the 2 Figure 1: Left: Semiclassical representation of the spectral two-point function through Green function amplitudes. Inset top: microscopic structure of scattering vertex in coordinate (left) and momentum (right) representation. Inset center: abbreviated representation of momentum conserving two particle mode. Inset bottom: spectral correlation function as one loop diagram involving two modes. At low energies higher order loop processes gain importance. Further discussion, see text.
technically most involved part of the program, the demonstration of the absence of non-linear corrections to the mean field results. (In view of the high dimensionality of the present problem and a correspondingly large 'phase volume' of fluctuations, this step is essential. However, readers primarily interested in results may skip this part.) We conclude in section 8. Technical parts of the discussion are relegated to several appendices.

Qualitative discussion and summary of results
Consider a stochastic quantum system described by a statistical ensemble of Hamiltonians H. Its spectral fluctuations at a characteristic energy E are described by the two point function is the average level spacing at E, ρ the density of states (DoS) and . . . c is a cumulative ('connected') average over randomness. Prominent quantities derived from the two-point function include the Fourier transform, or spec- ) is the number of levels contained in a strip of width . In view of the relation ρ( ) = − 1 π Im tr(G + ( )), where G ± ( ) = ( ± − H) −1 is the resolvent, the full information on all these quantities is contained in the two-point function where the (generally weak) dependence of C on the center energy is suppressed and we noted that connected averages between Green functions of the same causality vanish, G ± G ± c = 0. 3 A semiclassical cartoon of the situation is shown in Fig. 1, where the black lines are Green function propagators, and the dots represent the common starting and end state, x ± , in tr(G ± ) = x ± G ± (x ± , x ± ). The ring shaped structure indicates that the two propagators must remain piecewise close to each other to remain statistically correlated. The top inset illustrates the situation for the case of a Gaussian potential V(y)V(y ) ∼ ξ(y − y ) with finite range correlation function ξ. The scattering processes off fluctuations V(y) are indicated by dashed lines, and a line connecting two propagator amplitudes represents the average over a product of two of these. In this case, the extent of ξ sets the tolerance for deviations between the Feynman amplitudes. The effective two-particle mode emerging in this way (inset middle) defines a quantum stochastic process which, after averaging over the randomness, is governed by an effective master equation is the probability of pair propagation between two points x and x in time t and O a local operator whose specifics depend on the context. For example, in the particular case of an extended medium with Gaussian randomness, O = D∂ 2 x would be the diffusion operator. To leading semiclassical order the correlation function then assumes the form involving the temporal Fourier transforms of the mode propagators.
The solution of the propagator master equations crucially depends on the symmetries and conservation laws of the underlying scattering processes. For example, the averaging over a single particle random potential effectively restores translational invariance meaning that the difference in momenta, q, between the participating states is conserved (inset top). The conserved momenta play a role of effective quantum numbers of the mode propagators, and at the same time are Fourier conjugate to the coordinate difference x − x in Π(x, x , t). Indeed, the diffusion operator D∂ 2 x is diagonal in a momentum representation and the frequency representation of the mode equation has the solution Π(q, ω) = −(iω − Dq 2 ) −1 . With this result, the spectral two-point function assumes the role of a sum over relaxation modes [20], For frequencies larger than the Thouless energy, ω > E C ≡ t −1 erg , this expression is dominated by modes of non-vanishing momentum, q. This is the non-ergodic regime affected by the diffusive kinematics of the modes, their dimensionality-dependent density of states, etc. For smaller frequencies, ω < E C , the sum is dominated by the momentum zero mode, q = 0 (unless, the sum over modes yields an UV divergent result, which in the diffusive context would happen for d ≥ 4. We will return to the discussion of this situation below.) The zero mode contribution to the correlation function R 2 (ω) = − 1 2 ∆ πω 2 is fully universal in that it depends only on the dimensionless ratio of energy difference and single particle level spacing. For frequencies larger than the single particle level spacing, ∆ ω E C , this expression agrees with the RMT result (we assume absence of time reversal here, such that the relevant ensemble is the Gaussian Unitary Ensemble, GUE) We finally note that the IR divergence in the zero mode contribution ∼ ω −2 at small values ω < ∆ is cut by the emergence of 'nonlinearities' in the theory. The semiclassical precursor of these processes are higher order loop diagrams, as indicated in the bottom inset, right. However, the 4 non-perturbative nature of the non-linearity (i.e. the impossibility to capture them by diagrammatic resummation) is indicated by the non-analyticity of the RMT sin-function in 1/ω, i.e. in the propagator amplitude of the semiclassical zero-mode. If one is ambitious to describe spectral statistics for all frequency values the semiclassical formulation needs to be integrated into a field theoretical framework. In this way one finds that [21] Eq. (4) generalizes to In this expression, the perturbatively singular contribution of the ergodic mode ∼ ω −2 is regularized and absorbed in the RMT-contribution R 2,RMT . We now discuss how the general concepts introduced above carry over to the case of the twobody Majorana scattering operator Eq. (1). First note that the Hamiltonian conserves fermion parity and commutes with the parity operator P ≡ N/2−1 i=1 (iχ 2i−1 χ 2i ), where we consider even values of N for definiteness. For definiteness, we will focus on systems of state number N = 2, 6, 10, 14, . . . with N mod 8 = 2 or 6, which fall into the unitary symmetry class. 1 The Hamiltonian acts within Hilbert space sectors of definite parity, and we will consider the D ≡ 2 N/2−1 dimensional Hilbert space V of even occupation number throughout. This space is generated by the action of an even number of Majorana operators on the vacuum.
In this setting, the role of the position states, |y , is taken by states |n ≡ |n 0 , . . . , n N/2 where n i = 0, 1 is the occupation number of the fermion c i ≡ 1 2 (γ 2i−1 + iγ 2i ) and |n| ≡ i n i is even. Throughout, we will label products of Majorana operators X µ ≡ χ µ 1 χ µ 2 . . . χ µ l , l even, acting in V by the container symbol µ = (µ 1 , . . . , µ l ), where µ 1 < µ 2 < · · · < µ l defines an ordered string of numbers 1 ≤ µ i ≤ N. For convenience, we add to this set the unit operator X 0 ≡ I. Note that the operators X µ commute or anti-commute amongst themselves, X µ X ν = s(µ, ν)X ν X µ , where the sign factor s(µ, ν) = ±1 will play a very important role throughout. Finally, we reserve the symbol X a , a = (a 1 , a 2 , a 3 , a 4 ) for the n ≡ N 4 operators of norm l = 4 featuring in the interaction Hamiltonian.
In this language, the propagators n + |( ± −Ĥ) −1 |n + can be considered as sums over closed loop scattering paths during which states n scatter to states n via matrix elements n |H|n = a J a n |X a |n (Fig. 2, left.) A first essential difference to the previously discussed case is that the two amplitudes correlated by a scattering sequence in Fock space can be far apart. The reason is that even for very different states |n , |m , the product of matrix elements n |J a X a |n m|J a X a |m ∼ n |X a |n m|X a |m may be non-vanishing. This is to be compared to the case of single particle scattering where the amplitudes forming the particle-hole scattering channel were close to each other on scales ∼ ξ. The 'deconfined' nature of the scattering channel can be seen as a consequence of the sparsity of the random matrix H.
We now consider the mode evolution under this type of scattering dynamics. At any instance of time, the state of the composite mode is encoded in the amplitudes Π(n, m|n 0 , m 0 , t) ≡ Π(n, m), where the initial configuration, (n 0 , m 0 ), and the time argument, t, are suppressed in the second representation for better readability. The state of the mode after a scattering event off the operators X a is given by Π(n , m ) = n,m n |X a |n m|X a |m Π(n, m). One may write this in an index-free notation as Π → Π ≡ X a ΠX a , where Π ≡ n,m Π(n, m) |n m| is considered as a matrix in V or, equivalently, as an element of the tensor product of the Hilbert space and its dual V ⊗ V * . In view of this Fock space non-locality of these objects it is all the more important to identify 'mode quantum numbers' conserved by the scattering channels.
As in the single particle problem, progress is made by representing the particle-hole dyads |n m| ∈ V ⊗ V * in a basis different from the original one, i.e. by identifying the SYK-analog of a 'momentum representation'. To this end let us consider V ⊗ V * as the D 2 dimensional space of Hilbert space matrices. Above, we have introduced a specific set of elements of this space, namely the operators X µ . The action of all possible products of Majorana operators on the vacuum generates all of V, which is to say that the X µ form a complete set and that any other matrix A ∈ V ⊗ V * can be expanded as Here, the identification of the expansion coefficients follows from tr(X µ X † ν ) = Dδ µ,ν , i.e. the fact that the trace over non-vanishing monomials of Majorana operators vanishes, and only X µ X † µ = 1 has a finite trace D. The sum extends over one half of the 2 N−1 even parity operators X µ : operators X µ and X ν ≡ X µ P are identified because P acts as the identity operator in V. For example, for N = 4, the even operators χ 1 χ 2 and χ 1 χ 2 (χ 1 χ 2 χ 3 χ 4 ) ∝ χ 3 χ 4 are identical in the even subspace. This leaves 2 N−2 = D 2 independent operators which form a basis of V ⊗ V * .
Notice the similarity of the expansion Eq. (5) expansion with a Fourier transform. Indeed, we will observe that the states µ assume a role very similar to the momentum states of the single particle theory. Specifically, the transformation to µ-states may be applied to represent the modes as Π = 1 The key observation now is that the scattered mode, Π , has the expansion coefficients, π ν = 1 This construction shows that the scattering sequence conserves the ν-state. Individual scattering events merely generate a sign factor s(a, ν). We conclude that the labels µ play a role analogous to the conserved momenta, q, of the single particle problem.
In the next section we will show that the dispersion of these modes is determined by the relation Π(ν, ω) = (iω − (|ν|)) −1 , where the (k) depends only on the state norm k ≡ |ν|. exact expression.) Here, is the mean level spacing at the band center. Since ∆D is of the order of the many body band width, generic modes are heavy and essentially non-dispersive. However, the gap of the lightest massive mode is much smaller and it sets the inverse of the time scale at which the longest lived structured modes, k = 2, have relaxed. In this regard it plays a role analogous to the Thouless energy of a disordered single particle system. However, in view of the exponentially large density of states N k of modes with gap (k), care must be exercised in transferring results from single particle spectral statistics to the present context. Specifically, we observe that, unlike with the single particle system, corrections to RMT spectral statistics are observable at frequencies much lower than γ.
To better understand these structures, we consider the effect of non-ergodic modes on various spectral correlation functions. According to the general principle discussed above, we expect the two-point correlation function to assume the form which in the present context is a sum over an exponentially large number of short lived modes. (In the main part of the paper, this expression will be derived from a replica field integral formalism.) To obtain a crude frequency-dependent criterion for the influence of the massive modes, we ask when their total contribution becomes comparable to that of the universal zero mode. This leads to the estimate where we observe that the sum over non-vanishing modes is 'UV-dominated' by the exponentially large number of generic modes with their structureless dispersion. According to this estimate, the contribution of non-universal modes masks the universal contribution for frequency values exceeding Only for frequencies ω ω erg will universal spectral statistics be observed. Note that ω erg is much smaller than the lowest relaxation gap. As an aside, we mention that the spectral two-point function must satisfy the sum rule ωR 2 (ω) = 0 (the integral over fluctuations in the density of states around its mean equals zero.) While this rule appears to be violated by the near frequencyindependent background of the generic modes, we note that Eq. (8) holds only for frequencies ω ∆D much smaller than the band width. As shown later, modifications effectively restoring the sum rule take place at larger frequencies.
In order to compare the above prediction with earlier numerical work, we consider the number variance around some value E in the bulk of the spectrum, Here, the first term is derived by integration of the RMT-contribution to the spectral two point function over energy. The second term is obtained from the massive mode contribution, noting that for the exponential majority of them the dispersion (ω compared to (k)) is negligibly weak. The result is in semi-quantitative agreement with the numerical data shown in Ref. [14]: The deviations from the RMT limit show a convex upturn as a function of energy, and they are stronger for smaller N. An eyesight inspection of the data suggests that the deviations of the N = 22 fluctuations exceed those of N = 34 by a factor of about 7. This is not far off the above result which would predict a ratio 34 4 /22 4 5.7. For a more structured comparison we now turn to the discussion of the spectral form factor. The straightforward Fourier transform of the two-point function leads to Here, K RMT (τ) = τΘ(1 − τ) + Θ(τ − 1) is the RMT form factor obtained by Fourier transformation of the non-perturbative zero mode contribution R 2,RMT . The sum represents the contribution of non-ergodic modes. It is multiplied by a factor τ safeguarding the limit K(τ → 0) → 0 required by unitarity (i.e. by the sum rule dωR 2 (ω) = 0 describing the constancy of the total number of levels.) However, we emphasize that the result (11) is based on an effective low energy theory which looses validity for energies ω ∼ D∆ of the order of the band width, corresponding to 8 dimensionless times τ τ UV ∼ D −1 . This means that the quantitative form by which K(τ) of Eq. (11) approaches zero for times shorter than τ UV must not be taken seriously.
The left panel of Fig. 4 shows the form factor for the four values N = 22, 26, 30 and 34, respectively, compared to that of an RMT ensemble. At an N-dependent time, τ erg , the form factor exhibits a pronounced minimum and for larger times approaches the RMT result. A straightforward variational computation shows that the minimum is located at τ erg ∼ ln(N)ND −1 . This is only by a factor of ∼ N ln(N) larger than short UV time cutoff, τ UV (for N = 22 and 34 the time span between the two scales is indicated by the horizontal bars in the left panel of Fig. 4). At the same time, τ erg is the time below which deviations off RMT behavior become visible in the form factor. That this time is not in straightforward inverse relation to the energy ω erg above which deviations off RMT behavior become strong in the spectral two-point function has to do with the fact that in either case the deviations are caused by a very large number of very short lived modes. Where these modes give a largely structure-less contribution to R 2 (ω), their fast relaxation in time means that they are not felt at times larger than τ erg in the form factor K(τ); RMT correlations are better visible in K(τ) than in R 2 (ω).
The profiles shown in Fig. 4 superficially resemble the 'dip-ramp-structures' discussed in Ref. [15]. That reference considered correlations in a quantum partition sum Z(z) ≡ tr(exp(−zH)) generalized to complex 'temperatures', z. Specifically, it considered the time-dependent correlation function g c (t, β) ≡ Z(β+it)Z(β−it) c / Z(β)Z(β) , where β is physical temperature. For finite β, this function, likewise termed 'form factor' in Ref. [15], is different from the scaled spectral form factor K(t) ≡ K(τ)| τ=2πt∆ . In particular, K(t) has the limiting behavior K(τ → 0) → 0, whereas g c (t) asymptotes to a finite value. For finite β, the deviations between K(t) and the thermal correlation function g ( t, β) essentially originate in their different short time asymptotics. However, for the case β = 0 the functions coincide, and a direct comparison to the numerical data shown in Fig. 12 of Ref. [15] is possible. To ease this comparison, the left panel of Fig. 4 shows the form factor K(τ)∆ as a function of dimensionless time t8/J for four system sizes, N = 22, 26, 30, 34, of unitary symmetry included in the numerical analysis. A comparison of the curves indicates that the analytical calculation reproduces the essential features of the N-dependent dip-ramp profile seen in the numerical data. There are quantitative deviations by numerical factors of O(10%) of the absolute values of minima and ramp positions. However, by and large the comparison looks favorable and we conjecture that the dip-profile is caused by the non-ergodic modes discussed above.
In the next section we will discuss how the modes generating the spectral correlations of the system emerge as effective low energy degrees of freedom of a replica field theory.

Replica field theory
The functions R 2 (ω), Σ(E) and K(τ) discussed above are all obtained from the correlation function, C(ω), Eq.(2). In this section we derive a replica generating functional, Z(h), from which C is obtained by differentiation. Compared to other approaches, the principal difference is that we view the problem from a first rather then second quantized perspective. In this way of thinking, the Hamiltonian H is considered as a large sparse random matrix acting in a Ddimensional Hilbert space and its resolvents G ± (E) = (E − H) −1 are obtained from a Gaussian integral (rather than a field integral) Here, ψ = {ψ r n } is a R · D-component vector of Grassmann variables carrying indices n in Hilbert space and r ∈ {1, . . . , R} in replica space and we use the shorthand ∂ h + ≡ ∂ h + | h + =0 . We will suppress these indices when possible, e.g.,ψ(E + h + + iδ − H)ψ = r ψ r (E + h + + iδ − H)ψ r . The above relation follows from the fact that the Gaussian integration over a Grassmann field yields the determinant of the corresponding matrix kernel, i.e.
Multiplying this with the analogous relation for G − , we obtain where ψ = {ψ r,s n } is now a 2 · R · D dimensional field comprising an index s = ± distinguishing between advanced and retarded Green functions,ẑ = + ( ω 2 + iδ)τ 3 +ĥ is a 2 × 2 matrix in advanced/retarded space comprising energy arguments and sourcesĥ ≡ diag(h + , h − ), and τ i are Pauli matrices acting in the same space.
We may now perform the Gaussian average over randomness to obtain where in the second step we have rearranged the quartic term from a scalar product in ψ to a dyadic product, and Tr ≡ tr V tr is a trace over both, Hilbert space, V, and the 2R-dimensional internal space of the ψ-state. This way of rewriting the nonlinearity is advantageous because the dyads ψ + nψ − n X a are the precursor building blocks of the two-particle modes shown in Fig. 2. Following standard procedures, we decouple the quartic term with a Hubbard-Stratonovich transformation and integrate over Grassmann fields to obtain a Tr(X a A a ) 2 +Tr ln(ẑ+ γ n a A a ) .
Here, A a = {A rr ,ss a,nn } are n = N 4 matrices of dimension 2DR containing complex commuting variables and the energy scale γ is defined as Implicit to the definition of the transformed Z(h) is a constraint on the integration variables that guarantees the existence of the integral. At this point it becomes advantageous to switch to the µ-representation of operators: we define A a = 1 D 1/2 µ a a,µ X µ where the coefficients are 2R-dimensional matrices a µ = {a rr ,ss µ } in the internal indices. Using the relations discussed in the previous section we then obtain Tr(A a X a A a X a ) = 1 D µ,ν tr(a a,µ a a,ν )tr V (X µ X a X ν X a ) = µ tr(a a,µ a a,µ )s(µ)s(a, µ).
We here defined where s(µ) = s(|µ|) is a sign factor depending only on the number of Majoranas contained in µ.
where we used that quartic products of Majoranas square to unity, X 2 a = 1. The advantage of the new representation is that the trace of the Gaussian weight has collapsed to one over the internal indices. Notice the diagonality of the weight in the Hilbert space indices, µ, which is based on a construction identical to that demonstrating the µ-conservation of the two-particle scattering vertex. Indeed, the Hubbard-Stratonovich matrices A a,nn ∼ ψ nψn X a represent bilinears of particle amplitudes decorated by scattering vertices, and a a,µ are these bilinears in the µ-representation.
We now observe that the non-linearity tr ln of the integral couples only to the homogeneous configurationĀ ≡ 1 n a A a . This suggests a shift, A a →Ā + A a , where a constraint a A a = 0 for the shifted variables is understood. The same change of variables is applied to the µ-variables, a a,µ →ā µ + a a,µ , with a a a,µ = 0. Using this representation, the functional integral becomes a tr(ā µ +a a,µ )) 2 s(µ)s(a,µ)+ 1 n a,µ s(µ)tr(λ µ a a,µ )+Tr ln(ẑ+γĀ) , where λ µ are Lagrange multiplier matrices implementing the constraint. The integrations over a a,µ are now Gaussian and can be carried out in closed form. As a result of a straightforward procedure detailed in Appendix A, we obtain the functional integral Z(h) = Da exp(−S [a]) where we omitted the overbar,ā → a,Ā → A for notational brevity and defined S (µ) ≡ 1 n a s(a, µ).
Notice that while the Gaussian weight ∼ n −1 ∼ N −4 of the matrices a a,µ was 'light', that of the modes a µ is heavier. For generic µ, |µ| = O(N), the number of operators X a commuting/anticommuting with X µ is roughly equal implying that S (µ) ∼ √ n n ∼ N −2 and the Gaussian weight scales as ∼ N 2 .

Stationary phase analysis
In this section we subject the effective action to a stationary phase analysis. The legitimacy of the procedure will be checked self-consistently at a later stage. A variation of the action Eq. (20) over a µ yields the stationary phase equation For energies, E, in the center of the band, this equation is solved by a Hilbert space homogeneous ansatz, a µ ≡ D 1/2ŷ δ µ,1 . For this configuration, S (µ) = S 1 = 1, and the equation reduces tô This is a quadratic equation and it is solved bŷ where we noted that the sign of the square root is determined by the imaginary part of Im(ẑ) = δτ 3 . Substitution of this solution leads to the mean field action If we differentiate once w.r.t. sources and set ω = 0, we obtain the mean field estimate for the average density of states, This formula states that (i) the average level spacing in the band center is given by (ii) the characteristic many body band width is given by (implying that ∆ ∼ Γ/D), and (iii) at the level of the above mean field approximation, the density of states in the bulk of the band is given by a semicircular profile. It is well known [22], that the last statement is approximate. The density of states even in the bulk of the band is better approximated by a Gaussian, and in the tails approaches a square root dependence. The above solution of the self consistent Born type equation (22) can be made more accurate by the combinatorial methods of Ref. [22]. Close to the band edges, corrections become strong and a full solution of the problem [11] leads to the many body density of states computed by different methods in Refs. [15,23]. However, in the present context we are primarily interested in the correlations of the DoS at nearby energies and the weak dependence of the average DoS on the center energy is of secondary importance. For this purpose the semicircular estimate (23) is good enough.

Fluctuations (RMT)
We now turn to the discussion of fluctuations around the mean field and their ramification in spectral statistics. To begin with, note that in the limit ω → 0 the starting functional Eq. (12) is invariant under transformations ψ → T ψ,ψ →ψT −1 homogeneous in Hilbert space, T = {T rr ,ss }. The action thus possesses a G ≡ GL(2R, 2R) replica symmetry, weakly broken by ω. The mean field solution (spontaneously) breaks this symmetry down to H ≡ GL(R, R) × GL(R, R), i.e. the transformations commuting with τ 3 . As a result of this symmetry breaking a coset space G/H of Goldstone modes appears. In the context of single particle physics, these Goldstone mode fluctuations are the degrees of freedom of the nonlinear sigma model approach to disordered systems. Their appearance is made explicit by noting that the mean field equation (22) possesses the continuous manifold of solutions The fluctuations T are soft modes of the theory and must be integrated over. Substituting the fluctuation configurations into the action, noting the invariance of the Gaussian action and expanding to first order in ω/Γ 1 we obtain where in the last step we used that 1 γŷ − iπ D∆ τ 3 + const., and the constant (vanishing in the replica limit may be ignored. The expression on the left is the action of the zero-dimensional nonlinear σ-model of disordered systems. This theory is in quantitative correspondence with RMT. Specifically, the integration over T leads to the RMT spectral correlation function and the Fourier transform of this expression yields the RMT form factor K RMT (τ) in Eq. (4). Here, the δ-function contribution describes the auto-correlations of individual levels. It contributes to sum rules based on the correlation function, R 2 , but is otherwise inessential. For the sake of completeness, we outline the derivation of this result in Appendix B.

Fluctuations (massive)
The asymptotic form of Eq. (28) equals the k = 0 contribution to Eq. (8). Higher order contributions are proportional to the propagators (iω − (k)) −1 and must therefore be due to 'massive' fluctuations. In the following we compute the quadratic action of these modes, and in this way identify the weights (k) determining their mass. To this end, we generalize the set of integration variables to a µ = T (D 1/2ŷ + y µ )T −1 , where T ∝ 1 V are the Hilbert space singlet Goldstone modes, and y µ are fluctuation matrices. For µ 0 these modes carry structure in Hilbert space (for completeness we note that the singlet mode y 0 is diagonal in advanced/retarded, because off-diagonal fluctuations are already accounted for by T .) The substitution of this ansatz into the action (20) leads to where we have used X 2 µ = s(µ) and neglected the coupling between the y µ and the Q-fluctuations (In regimes, where the cumulative contribution of the y µ is sizeable, ω ∆ fluctuations of Q are small, and Q τ 3 is a reasonable approximation). The structure of the action suggests a decomposition in contributions off-diagonal and diagonal in advanced/retarded space, respectively. The diagonal fluctuations, v µ , describe correlations between Green functions of identical causality and do not couple to spectral correlation functions. This statement extends beyond the Gaussian order considered presently (see section 7) and we will therefore neglect these fluctuations throughout.
A quick calculation, detailed in Appendix C leads to where ∆h ≡ h + − h − , and we defined the 'propagators' Π −1 µ ≡ S (µ) −1 − 1 − iω/γ. Since w 0 = 0, the sum over µ starts with configurations containing at least two Majoranas, the lowest order non-trivial even parity configuration. Now it is a good time to analyze the factors S (µ) = 1 n a s(a, µ) giving these modes their weight. Consider a configuration µ containing k = 2l operators χ i , i.e. |µ| = k. There are where the last approximation is valid for small k. For configurations with generic k = O(N/4), |S |µ| | ∼ N −2 , which follows from the fact that in this case, S (µ) sums over ∼ N 4 quasi-random sign factors. Defining the propagator assumes the form Π µ = γ( (|µ|)−iω) −1 . In view of the positivity of this expression, the integration over the off-diagonal matrices w µ can be made convergent if we define With this parameterization, the action assumes the form We may now perform the Gaussian integration over the R 2 independent complex variables parameterizing each b µ to obtain the correlation function as 14 The substitution of this expression into Eq. (12) leads to and from this result we obtain the spectral correlation function (8). Eq. (37) is the main technical result of this paper.

Fluctuations (massive, nonlinear)
Eq. (37) was obtained by quadratic expansion of the nonlinear tr ln in the action. However, in view of the exponentially large number of modes, one may wonder what happens if the expansion is pushed to higher orders, and nonlinear couplings between different modes enter the play. In this section we show that, perhaps surprisingly, the cumulative effect of these couplings is weak and the result above remains unaltered. While this is an essential step of the programs it is also technically the most involved and readers willing to trust us on this point are invited to jump to the conclusions.
We study the question whether or not nonlinear terms strongly modify the theory in the band center, = 0, and neglecting the small energy difference, ω. In this case, we may setŷ = iτ 3 , which simplifies the calculation but otherwise is inessential. The action then assumes the form, where Π µ = S (µ) −1 − 1 is the zero-frequency propagator. To further simplify our life, we assume y µ = w µ , i.e. we neglect the fluctuations diagonal in advanced/retarded space. (One can convince oneself, that the two types of fluctuations do not mix in the contributions of leading order in D to the perturbation expansion.) The action then assumes the form where we noted that only even powers of advanced/retarded off-diagonal matrices survive the trace.
We consider the exponentiated action expanded in y µ and perform the Gaussian integration over m−2 matrices b µ in terms of total order O(b m ). The result can be interpreted as a contribution to a 'Hartree-Fock' renormalized quadratic action, and the question is whether large renormalization contributions are generated in this way. If not, the stability of the quadratic theory has been shown, at least perturbatively.
The expansion of the action leads to expressions of the structure . . . tr(b µ 1b µ 2 . . . ) tr(b ν 1b ν 2 . . . ) . . . , i.e. products of traces of b-matrices. The subsequent integrals can be done by a matrix-variant of Wick's theorem. It is straightforward to verify that for fixed matrices X, Y in replica space, we have the contraction rules, tr(Xb µ Yb ν ) = Π µ s(µ)δ µν tr(X)tr(Y), tr(Xb µ )tr(Yb ν ) = Π µ s(µ)δ µν tr(XY), where the angular brackets denote the Gaussian integral over the quadratic action. One may represent these rules graphically, as in Fig. 5, where the n-corner polygons denote traces of n matrices b µ i and indices i are used as an abbreviation for µ i . For example, the first upper half of the right panel states that the contraction ofb µ 4 and b µ 5 in tr Before turning to the concrete evaluation of individual contributions to the expansion, one should estimate their relevance, i.e. the powers in D that are to be expected. To this end, consider a term of nth order in b, distributed over l traces of n j th order, n = l j=1 n j . We then have n sums over µ at the bare level. Each trace has its own µ-conservation which brings the number down to n − l. Now perform n/2 − 1 contractions over all but two b's. Each contraction removes one free 16 summation, and we are down to n 2 − l + 1 free summations. One of the sums is used for the uncontracted index entering the quadratic action, which means that we are to expect a contribution of order D n−2l (each sum over µ has D 2 terms, which is the dimension of the matrix Hilbert space V ⊗ V * ). This power is reduced by the pre-factor D −n/2 weighing n expansions of fluctuation matrices as in Eq. (38). Finally, the l traces of the starting expression contribute a factor D l . We conclude that, pending other constraints, a maximum power of D n 2 −l is to be expected. This estimate indicates that at nth order of perturbation theory, contributions of highest order come from terms where all n fluctuation matrices enter the same trace, i.e. from first order expansions of the exponentiated action in traces of maximal order. If the full power D n 2 −1 would result from the contraction of these traces, the perturbation theory would blow up. However, as we shall see, the commutation relations between X µ -operators lead to a further reduction and bring the terms down to a small contribution of O(1).
Turning to the concrete evaluation of single traces, tr(b µ 1 . . .b µ n ), consider the example of a 10th order trace graphically represented in the bottom panel of Fig. 5, where the open circles represent the two uncontracted b-matrices. A first thing to notice is that contractions with parallel contraction lines are vacuum contributions and vanish. This is seen from tr(Xb µbν Yb νbµ ) → tr(Y)tr(Xb µbµ ) → tr(X)tr(Y)tr(1) . We thus focus on the 'maximally crossed' structures shown in the right panel. (These should not be confused with the maximally crossed diagrams of weak localization theory. In the present context, each wavy line represents an 'SYK-diffuson' and not a single impurity line as in localization theory.) It is straightforward to verify that the maximally crossed contraction of a trace of order n = 2(2k + 1) generates the contribution where µ ≡ µ 1 ,...,µ 2k+1 is a sum over all index configurations. Each operator X l appears twice under the trace and we commute them through the other X-operators to annihilate them as X 2 l = s l . This leads to the appearance of a factor so that we obtain At this point, and excluding µ 1 , we are summing over D 4k terms and may expect a contribution of total order D −2k · D 4k = D n 2 −1 in accordance with the estimate above. However, this estimate turns out to be way too high as it ignores an almost complete sign cancellation due to the presence of the propagators weighing the sum.
In Appendix D we show that the summation over µ can be carried out in closed form and that it converts the sum into the sign factor The key feature of this expression is that (i) the summation over 2k indices µ did not lead to a contribution of O((D 2 ) 2k ) but only O(D 2k ). In combination with the factor D −2k up front, this makes for a D-independent scaling, as for the 'bare' term of quadratic order. However, (ii) the summation also leads to a sum over 2k quartic configurations a F(a) ≡ i, j,k,l F(χ i χ j χ k χ l ). Each sum is over n = N 4 terms, and the total sum extends over a complicated, and effectively random sign factor. This means that the typical value of the sum will be O( 4k ln(N) ). We therefore conclude that higher order terms in the perturbation expansion suffer exponential suppression and can be neglected.

Summary and discussion
In this paper we explored the approach to quantum ergodicity in the long time dynamics of the SYK model. It turned out that the irreversible relaxation to the ergodic limit is governed by a large set of collective modes, each labeled by an element of the 2 N /4 dimensional restriction of the Clifford algebra to the even parity sector. We reasoned that these modes play a role conceptually analogous to diffusion modes in a dirty single particle medium. The main differences were (a) an density of states growing exponentially in the norm of the mode index, i.e. the number of Majorana-generators contained in it, and (b) a relaxation rate quickly increasing in the same index. The competition of these tendencies led to distinct signatures in spectral correlations. Specifically, we observed that the large number of modes generates a largely structureless contribution to the spectral two-point function which masks the ergodic RMT profile already a low energies exceeding the many body level spacing only by factors polynomial (and not exponential) in the state number, N. A different perspective could be obtained from the time dependent spectral form factor which filtered out the contribution of the modes of highest longevity, and showed a strong enhancement over the RMT form factor at short times. Both the results obtained for the spectral two point function and for the form factor agree favorably with previous numerical work in a comparison that does not involve adjustable parameters. In particular, the characteristic 'dip-ramp' structure observed in the form factor resembles results previously obtained for the OTO correlation functions of the system (if only at β → 0, the only limit accessible to our analysis.) Finally, one may wonder how the collective modes discussed here compare to the conformal Goldstone mode fluctuations addressed in other works. It can be reasoned that these two types of excitations focus on different sectors of the system's phase space. For example, the effective Liouville quantum mechanics [24] describing the conformal symmetry breaking in the system ignores replica off-diagonal fluctuations (fluctuations off-diagonal in replica space and/or replica symmetry breaking do not play a role), while they are essential in the present context. As a direct consequence, the RMT 'propagator', ω −1 , governing the zero mode action in the present theory is nowhere in sight in the Liouville theory. In other words, the latter cannot describe the ergodic limit of quantum chaos. Conversely, the time-dependent conformal symmetry is not included in the present framework, which singles out two fixed frequencies from the beginning. Still it would be desirable to understand the connection between the two classes of fluctuations at a deeper level and this is a subject of further work.
where ∆z = z + − z − = ω + + (h + − h − ). Note that the convergence of the B-integral is safeguarded by the imaginary increment in ω. The Gaussian fluctuation over the R 2 matrix elements of B then leads to the result where we noted that the saddle point action S [τ 3 ] ∝ R vanishes in the replica limit. From here, the correlation function C(ω) is obtained by differentiation w.r.t. h ± . Taking the replica limit, we note that the only surviving contribution reads and from here the spectral two point function, R 2 (ω) = ∆ 2 2π 2 Re(C 0 (ω)) is obtained as R 2 (ω) = − ∆ 2 2π 2 Re 1 ω +2 . This is identical to the large energy, ω > ∆ approximation to the RMT spectral correlation function and to the k = 0 contribution to the correlation function Eq. (8).
For completeness, we quickly outline how the full non-perturbative (in the parameter (ω/∆) −1 ) RMT correlation function is obtained from the theory [25,21]. Contributions beyond C 0 emerge from non-causal solutions to the stationary phase equation [Q, τ 3 ] = 0. The equation is solved by any configurationQ = τ 3 ⊗ S , where S rr = δ rr s r an arbitrary matrix of sign factors s r = ±1. However, a straightforward computation of the corresponding fluctuation determinants shows that the majority of these configurations leads to fluctuation determinants vanishing in the replica limit. The only survivor contributions are matrices containing a sign flip in only one replica channel, sayQτ 3 ⊗ (−2P 1 + 1), where P 1 projects on the first replica. The source-free (the sources are needed in the fluctuation determinants around the saddle point to obtain contributions non-vanishing in the replica limit) action of these configurations, S [Q] = iπ 2∆ tr(Qτ 3ω ) = iπω + 2∆ tr(−2P 1 + 1) = iπω + 2∆ tr(−2P 1 + 1) = iπω + ∆ (−2 + R) R→0 −→ − i2πω + ∆ no longer vanishes in the replica limit.
A straightforward analysis of fluctuations around these points shows that the fluctuation determinant remains the same, up to a global minus sign. The two point function is thus obtained as We here discuss the derivation of Eq. (31) is derived from the precursor Eq. (29). The anticommutativity of the off-diagonal fluctuations, w µ with τ 3 implies that where we definedẑ * ≡ τ 1ẑ τ 1 = − ( ω 2 + iδ)τ 3 + diag(h − , h + ), i.e. the matrixẑ with diagonal matrix elements exchanged. Compared to the scales, , γ the arguments, ω, h ± contained inẑ are weak, and so it makes sense to expand theẑ-dependent factors to first order in these quantities. The expansion of the product containing the √ -factors yields (. . . ) (. . . ) 1 + i γ 1 − 2γ 2 −1/2 (ω + (h + − h − )), and we obtain Finally, using that close to the band center 1 − For simplicity, we assume that the propagator is dominated by the zeroth order expansion in 1 < |S (µ)| −1 , Π(µ) = (S (µ) −1 − 1) −1 S (µ). One can convince oneself that this is a conservative estimate and that higher order terms lower the result. Below we will show that if 2l out of 2k + 1 summations over µ-indices are performed, the sum assumes the form P k = D −2(k−l) For l = k, no µ-summation is left, X 2k = 1, and the remaining terms reduce to Eq. (42). The formula above is proven by induction. For l = 0, we have the starting expression (D.2). Let us then assume that the expression holds for a value 2l, and do two more µ-summations to progress to 2l + 2. We first sum over µ 2k+1−2l ≡μ. The dependence on this index sits in the highest factor S (μ)s(μ, ν 2k−2l ) contributing to X 2l and in the sign factor s(b 2k−2l+1 , ν 2k+1−2l ) = s(b 2k−2l+1 ,μν 2k−2l ) = s(b 2k−2l+1 , ν 2k−2l )s(b 2k−2l+1 ,μ). Isolating these terms and using S (μ) = to conclude that the sum over sign factors collapses one more µ-sum, i.e. the sum over µ 2k−2l .