Off-Diagonal Observable Elements from Random Matrix Theory: Distributions, Fluctuations, and Eigenstate Thermalization

We derive the Eigenstate Thermalization Hypothesis (ETH) from a random matrix Hamiltonian by extending the model introduced by J. M. Deutsch [Phys. Rev. A 43, 2046 (1991)]. We approximate the coupling between a subsystem and a many-body environment by means of a random Gaussian matrix. We show that a common assumption in the analysis of quantum chaotic systems, namely the treatment of eigenstates as independent random vectors, leads to inconsistent results. However, a consistent approach to the ETH can be developed by introducing an interaction between random wave-functions that arises as a result of the orthonormality condition. This approach leads to a consistent form for off-diagonal matrix elements of observables. From there we obtain the scaling of time-averaged fluctuations with system size for which we calculate an analytic form in terms of the Inverse Participation Ratio. The analytic results are compared to exact diagonalizations of a quantum spin chain for different physical observables in multiple parameter regimes.


Introduction
The emergence of statistical physics from unitary quantum dynamics has been debated since the early days of quantum theory [1]. It is by now widely accepted that generic non-integrable quantum systems undergo a process known as quantum thermalization, which implies that an initially out-of-equilibrium state of an isolated quantum system will approach thermal equilibrium after some typical relaxation time. The underlying mechanism behind quantum thermalization is still a subject of debate [2][3][4][5][6][7][8][9][10][11][12][13][14]. One of the most successful approaches to this long-standing problem is the eigenstate thermalization hypothesis (ETH) [15][16][17]. According to this conjecture, the many-body eigenstates of a non-integrable Hamiltonian yield the same expectation values of local observables as those calculated with a microcanonical ensemble. Below we will give a more detailed presentation of this conjecture, which can be formulated as an ansatz for the matrix elements of observables in the eigenbasis of a many-body Hamiltonian. To visualize qualitatively the physics behind the ETH, we can consider a quantum lattice system with interactions coupling different sites. If we express a many-body eigenstate in a local basis, we expect that interactions lead to a highly entangled state distributed over the lattice [18,19]. The ETH assumes that the resulting linear superposition has similar properties to a microcanonical ensemble. Note that this mechanism for thermalization is purely quantum mechanical since the existence of quantum correlations and entanglement are essential ingredients.
The validity of the ETH has been confirmed for a wide range of non-integrable systems by means of exact diagonalizations [20][21][22][23][24][25][26]. Still, there are some aspects of quantum thermalization and the ETH that are not completely clear. The conjecture can be qualitatively justified by using the theoretical framework of quantum chaos, however, it has not yet been fully derived mathematically from first principles. A possible direction to address the validity of the ETH is to try to derive it from a more basic or fundamental assumption or set of assumptions. In particular, we know that many-body eigenstates of large systems can be often described by random matrix theory (RMT). The original work by Deutsch [16] actually used a random matrix Hamiltonian as a toy model to show the emergence of quantum thermalization in isolated quantum systems. In Deutsch's approach a non-ergodic system is perturbed by a Gaussian random matrix, which results in an approximate description of many-body eigenstates by random wave-functions with uncorrelated random coefficients. This theoretical framework does not by itself prove the occurrence of thermalization in particular many-body systems, however, it proves certain aspects of the process as long as reasonable assumptions on the underlying system are fulfilled.
The quantum thermalization process has two fundamental aspects. Firstly, it involves the equivalence between time-averages of expectation values of observables and microcanonical averages. Secondly, it also involves the equilibration of an initially excited state into a thermal state, that is, we expect that time-fluctuations around thermal averages will be small. Furthermore, those fluctuations should decrease with system size such that statistical mechanics is recovered in the thermodynamic limit. Equilibration is governed by the off-diagonal matrix elements of an observable in the basis of eigenstates of the Hamiltonian. The ETH as formulated by Srednicki [15] includes a condition for off-diagonal matrix elements, which ensures equilibration. Furthermore, it has been proved that the random wave-function model can be used to qualitatively reproduce the ETH result for time-fluctuations [27]. However, random wave-function models usually work under the assumption of statistically independent random coefficients. This condition limits the validity of this approach, as we show in the next section.
A deeper understanding of time-fluctuations in the quantum thermalization process is actually crucial to describe current experiments with microscopic systems. Physical realizations of isolated quantum systems, where the emergence of statistical physics can be investigated, have been made possible only recently due to advances in quantum simulators with atomic and solid-state systems [28][29][30]. These include ultracold atoms [31][32][33], trapped ions [34,35], and superconducting qubits [36]. Identifying quantum thermalization would ideally involve a comparison between observed time-averages and microcanonical averages, however computing the latter is a challenge in complex many-body systems. An alternative path to test theoretical ideas such as the ETH is to check predictions made on the time-fluctuations of observables, such as the scaling with system size or interaction strength. For that aim, a deeper understanding of the physics and assumptions underlying the ETH would be required, to obtain quantitative predictions that can be used to identify ergodic phases in experiments.
In this work we present a derivation of the ETH in a random matrix model that yields an approximate description of a quantum non-integrable system under some reasonable assumptions. We build on the theoretical model introduced by Deutsch [16], and extend it to the calculation of off-diagonal matrix elements of observables. We show that correlations induced by orthonormality between random wave-functions must be taken into account to obtain a consistent derivation of the ETH from RMT. Our work cannot be considered as a proof of the validity of the ETH, however, it shows that the conjecture can be fully obtained from a description in terms of random wave-functions. Our theory can be used to quantify time-fluctuations after a quantum quench, and to predict the scaling of fluctuations with system size, thus yielding predictions that can be compared with experimental results and used to identify ergodic regimes in quantum many-body systems.
This article is structured as follows. In section 2 we introduce the ETH ansatz and discuss the limitations of a model of independent random wave-functions to describe the behaviour of off-diagonal matrix elements of observables. In section 3 we introduce Deutsch's random matrix model consisting of a diagonal Hamiltonian perturbed by a Gaussian random matrix. We extend the original model to account for interactions between random wave-functions arising from the orthonormality condition. In section 4 we calculate the correlation functions between random wave-functions. In section 5 we use those correlation functions to calculate the offdiagonal matrix elements of an operator, and show that they take the same form predicted by the ETH. In sections 6 and 7 we present a numerical confirmation of our analytical results. Finally, in section 8 we show that our model provides us with a good description of the time-fluctuations in a non-integrable quantum spin chain. We finish with our Conclusions in section 9, where we discuss the range of applicability of our results and their implications.

ETH and the limitation of the independent random wave-function ansatz
In this section we introduce the ETH and the random wave-function ansatz. We will show that a description of many-body wave-functions based on independent random variables does not lead to a consistent description of off-diagonal matrix elements of typical observables.
To focus our discussion, consider a system described by a non-integrable Hamiltonian, H, with eigenvectors and eigenenergies y ñ m | and E μ , respectively, such that y 0 . The equilibration of a closed quantum system into a thermal state implies that (assuming non-degenerate energy levels), Assuming that probabilities m | | a 2 take non-vanishing values close toĒ, the ETH ensures that the second term in equation (1) is equivalent to a microcanonical average.
To understand the relation between the ETH and a random wave-function ansatz, let us assume that the observable O is a local operator in a quantum lattice model defined on a subsystem S. The rest of the lattice forms a bath, B, and we write the total Hamiltonian like H=H S +H B +H SB , where H SB is the interaction term. Now we define H 0 =H S +H B , and the non-interacting energy eigenbasis, To simplify the notation in what follows we will assume that variables with indices μ, ν refer to eigenenergies or eigenstates of the interacting Hamiltonian, whereas indices α, β refer to H 0 .
The random wave-function ansatz consists of the assumption that 1 . The average á ñ  V is taken over realizations of the random wave-function (this will be more clearly defined in the next section). We assume that the function Λ(μ, α) is smooth, has a maximum when E μ =E α , and vanishes when E μ −E α ? Γ, with Γ being a typical energy width. A perturbative calculation, in which H SB was approximated by a random matrix, carried out by Deutsch [37] leads to a random wave-function model with a Lorentzian, where ω 0 is the average spacing between energy levels and we assume for now that both Γ and ω 0 are independent of α, μ. Outside a perturbative regime, however, numerical calculations on non-integrable models have shown that wave-functions have a Gaussian shape [38][39][40][41][42]. Diagonal matrix elements in the interacting basis can be approximated under the assumption of selfaveraging, . Equation (6) implies that the coupling induced by H SB leads to the smoothing of the distribution of diagonal matrix elements in the interacting basis and provides us with a justification for the ETH for diagonal elements of observables (2) within the random wave-function model [3,16], since we can make the identification, which yields a smooth function as long as the sum runs over a sufficiently large number of states. We also expect that in the thermodynamic limit the average á ñ ( ) O t does not deviate too much from its mean-value (equilibration aspect of quantum thermalization). The averaged time-fluctuations over an infinite integration time are given by [43], under the assumption of non-degenerate energy gaps. Based on quantum chaos theory, Srednicki [15] introduced the ETH ansatz for the off-diagonal matrix elements O μ ν, In this expression D(E) is the density of states (the original expression by Srednicki included an equivalent normalization by using the microcanonical entropy instead), E=(E μ +E ν )/2 and ω=E ν −E μ . R μν is a set of random variables with zero average and unit variance. f O (E, ω) is a continuous function of E and ω, which we expect to be centred around ω=0, and take negligible values if the difference between energies ω is larger than a typical energy width.
A natural question is whether a random wave-function model can be used to justify the ETH ansatz for offdiagonal matrix elements as well. As the off diagonal elements of a typical observable average to zero, it is convenient instead to analyse the squared modulus. To simplify the discussion we restrict our evaluation for now to those observables that are diagonal in the basis of H 0 , A common assumption that is made here [16,27] is to treat the coefficients c μ (α) as uncorrelated random numbers, the only surviving terms of this sum will then be å å a a m a n a = » L L mn m n a m n aa a where we have defined a microcanonical average around E μ , aa O 2 . Now, the sum of off-diagonal elements may also be obtained without use of the random wave-function ansatz as where we have only assumed a self-averaging condition. Thus, comparing equations (14) and (13) we can observe that we obtain an inconsistency. We are thus lead to conclude that there are indeed correlations between the coefficients, and that equation (

Model for generic non-integrable quantum systems
We now present the random matrix model from which we will base our analysis, consisting of a non-interacting diagonal part, and interactions modelled by a random matrix. Explicitly, the Hamiltonian in question is given by where the diagonal matrix elements, f α =α ω 0 , are energies equally spaced by ω 0 , and we choose energy units such that ω 0 =1/N, with N the total number of levels. The perturbation term is a real random Gaussian Hermitian matrix, h, which follows the probability distribution µ - In equation (20) we use the shorthand notation, c, to represent the matrix of c μ (α)s. A is a normalization constant, and we perform the integral over all independent entries of random Hamiltonian matrix elements, h α,β . Further, we have used h exp exp 2 2 2 , from which we can see that, for the random matrix selected from the Gaussian orthogonal ensemble (GOE), the width of the distribution diagonal elements is twice that of the off-diagonal elements. The first delta-function in P(c) imposes an orthonormalization constraint whereas the last delta-function restricts the values of the c μ (α) to those of eigenstates of the Hamiltonian (17); the Hermiticity of H implies that the latter need only run over μ>ν. Working with the exact probability distribution P(c) is obviously very difficult. Studies of quantum chaotic systems [44] indicate, however, that probability amplitudes behave as Gaussian distributed random variables, suggesting we may treat the c μ (α)s as belonging to a Gaussian distribution with some width depending on μ, α. However, as we saw in section II above, we must account for orthogonality in order to obtain a consistent result for off-diagonal matrix elements of observables. We thus look for an approximate probability distribution of the c μ (α)ʼs of the form In equation (21) we assume an approximation in terms of independent Gaussian variables, however, we keep the orthonormality constraint to account for correlations. To find the functions Λ(μ, α) that lead to an optimal description of the problem we have to minimize the free energy, where we have written ò c d as shorthand for an integral over all elements, ò ò The calculation of the distributions Λ(μ, α) which fulfil this condition is performed (using a differing target probability distribution p(c, Λ)) in [37]. We repeat this calculation in appendix A for clarity. We obtain , differing by a factor of 2 from [37] (this is corroborated below with a numerical calculation). Also required for the calculation of the correlation function (19) is the partition function of our approximate probability distribution, which is also obtained in appendix A (equation (A24)): In equation (24) the first product is the contribution from the free Gaussian term in p(c, Λ), whereas the second product is a result of the orthonormality condition.

Calculation of correlation functions
We can see from equation (24) that the final form of the partition function describing the full system is a product of all eigenvector interactions occurring in pairs. We are interested now in the calculation of the correlation function (19) involving a pair of random wave-functions, c μ (α) and c ν (α). For that we define the generating function We will calculate correlation functions by differentiation of G μ,ν with respect to the auxiliary fields ξ μ,α , ξ ν,α , described for all α by x x m n   , . This approach involves an implicit approximation, namely, we are assuming that correlations involving two random wave-functions can be computed by singling out the contribution of those wave-functions to the partition function and factoring out the rest. This approximation is well justified since it accounts for the effect of the orthonormality between μ and ν, which will determine the form of the correlation function.
Equation (25) may be evaluated as a N 2 -dimensional Gaussian integral after we express the delta-function in its Fourier form, We write our generating function in the form , is the generating function for the calculation of the correlation functions. Equation (27) may then be calculated exactly, as the N 2 -dimensional integral over x is now in Gaussian form, and is given by and å x n a x m a lx x = -L + L a a m a n a m a n a - We then have, which we write as x m a n a l m a n a = L + L -L L + L L m a n a m a n a m a n a Now, we can rewrite the integrand in equation (32) as Then, as for small x, in the high N limit we have  å å l m a n a l m a n a l m a n a Thus, we obtain for the generating function  å  ò x x p m a n a l m a n a x x l The generation function can be checked to yield the correct x = Taking the product over all pairs of eigenvectors μ, ν of the 2-eigenvector partition function of equation (37) we recover the interacting part of the partition function of the previous section, equation (24). We can proceed now and simplify the generating function by simplifying equation (33) in the limit Γ/ω 0 ? 1. For this, we first notice that, due to the Gaussian term in equation (36), the integration variable λ is restricted to take values such that, Since the term m a n a On the other hand, in equation (33), we find in the denominator the term λ 2 Λ(μ, α) Λ(ν, α). Since the product Λ(μ, α) Λ(ν, α) takes values of the order of w G -( ) Using this approximation and carrying out the integration over λ we arrive at the following form for the generating function, where we have ignored the non-interacting factors, which are irrelevant for the calculation of the correlation functions. Equation (40) is the basis of a self-consistent description of matrix elements in terms of random wavefunctions. We apply our result for the correlation function of interest (see equation (18)), After calculating the derivatives of our simplified generating function (40) we obtain, å å a b a b m a n b d d m a n a m b n b d d m n m a n a m a n a d d m n In the last equation, the second and third terms in the right-hand side arise solely due to the interactions between random wave-functions that are induced by the orthonormality condition.
For an observable that is diagonal in the basis of H 0 we only need to consider the values α=β and a b ¢ = ¢.
The relevant correlation function is then of the simpler form Equation (43) is one of the most important results of this work. Note that the first term in the rhs of this equation is the contribution one obtains by ignoring the interaction between random wave-functions, whereas the second term arises solely due to those interactions. It is thus necessary to understand whether the corrections induced by interactions are relevant or, on the contrary, can be neglected to leading order (as assumed in many previous works). For this we first notice that where the ratio ω 0 /Γ = 1, since it corresponds to the inverse number of states in the energy window defined by Γ. We find the following scaling O m a n a w n 0 3 We could feel tempted to simply ignore the correlation term in equation (43), since it is of higher order in the small parameter ω 0 /Γ. Neglecting the correlation term is a valid approximation in the case α=β, since we find that the leading term contribution is given by equation (45). On the contrary, for non-diagonal terms (a b ¹ ), the lowest order contribution is given by equation (46) and it is of However, there are of order Γ/ω 0 more non-diagonal than diagonal terms Whenever we use the correlation function equation (43) to calculate the expectation value of an observable, we will need to sum over indices α, β. Thus we expect that the , which is thus comparable to the contribution from the diagonal terms We conclude that both terms in the rhs of equation (43) are equally relevant.
The reasoning above also explains discrepancies that one may find when, for example, verifying the orthonormality sum rule with equation (43). Explicitly, orthnormality implies that, can be ignored, since the leading contribution to the diagonal term is Γ(μ, α), The attentive reader may find a contradiction in neglecting terms that are one order lower in ω 0 /Γ in equation (48), while keeping the second term in the rhs of equation (43). However, we recall that in the latter case, we have to sum over a large number of low-order non-diagonal corrections, and thus both equations (45) and (46) may lead to contributions of the same order when calculating matrix elements of observables.
We also stress here that whilst the derivation of the Lorentzian form of Λ(μ, α) is perturbative, and thus only accurate for small couplings, our result of equation (42) is more general and relies only on the condition that the wave-function is spread over many non-interacting states. For example, a system with a Gaussian form Λ(μ, α) could be described by the approximate distribution (21), and yet lead to the same form for the random wavefunction correlations.

Calculation of off-diagonal matrix elements
We can now use the functional form for Now, assuming self-averaging, we can replace a Then, using our expression for the correlation function, equation (42)    Again, we find that a non-negligible contribution arises from the random wave-function correlations. To further approximate this expression we define the average where m m n + ≔ ( ) 2, which one may observe is essentially a microcanonical average centred on the energy m E . A further self-averaging approximation allows this microcanonical average to be removed from the summation.
å å å m a n a m a n a Equation (53) is one of the most important results of this work. Note that the result is now free from the pathology that we found when approximating many-body wave-functions by independent random numbers in equation (11). Our final expression has a similar form, however correlations induce a second term that appears as a result of the orthonormality condition. Finally we note that the overall dependence of m n | | O , 2 on the energies E μ , E ν agrees with the ETH ansatz for off-diagonal matrix elements in equation (9).
We then take the continuum limit, substituting   Whilst the second term in equation (54) is analytically obtainable, we may observe that this term is w µ 0 2 , and thus within our approximation is correctly ignored. We then see, as the convolution of two Lorentzian functions of widths Γ 1 and Γ 2 is simply a Lorentzian of width Γ 1 +Γ 2 , that the functional form for a diagonal observable is . We see here that, to first order in ω 0 , the off diagonal elements of a generic observable that is diagonal in H 0 are described by a Lorentzian of width 2Γ. For more general observables one simply uses the known structure in the non-interacting basis, as we will see below. This result corroborates the relation between the variances of diagonal and off-diagonal elements obtained in [9], and observed numerically in [38,45], showing that they differ by a factor of two. One can see that the width of the distribution of diagonal elements is the same as that of the wave-function, Γ, from equation (6).
Returning to our original argument indicating the failure of the random wave-function ansatz, we may double check the consistency of the above RMT approach by repeating the calculation of å n mn m n ¹ | | O 2 using equation (55). This is obtained by replacing as expected. Thus the RMT approach, including correlations due to orthogonality, leads to a correct normalization of the matrix elements of observables. We note here that the result from the RMT approach tells us more about the source of this scaling than we obtained from our previous discussion. Equation (16) tells us that the sum over all off diagonal eigenstates contributes this scaling factor, but gives us no information about the contribution of any individual eigenstate. We can see from the RMT result of equation (55)

Comparison to numerical random matrix model
To check the results above we first compare them to a numerical random matrix model by diagonalizing equation (17) and calculating the off-diagonal distribution for the matrix elements of example observables. We choose our observables, O odd and O sym , to be defined such that in the non-interacting basis f ñ a {| }all offdiagonal elements are zero, and the diagonal elements are given by here, though the RMT method developed above can easily account for non-diagonal observables, as we will see below for a spin-chain system. To obtain the observable distributions we find the average distribution over many realizations of the Hamiltonian, equation (17), which is essentially the mathematical procedure to find the probability distribution in equation (20). Examples of the overlap of the RMT prediction are shown in figure 1. Here we see a very good agreement between the analytic predictions of equation (23) (figure 1(a)) and equation (55) (figures 1(b), (c)) and the exact numerical results.
As shown in figure 1, the scaling of each observable O odd and O sym are different, and we can see here that the analytic prediction of an observable dependent rescaling is true to the numerics. We note that for couplings of ⪆ g 0.2 our analytic treatment is no longer a good approximation. This corresponds to the bulk eigenstates having significant value at the edges of the spectrum, and thus our assumptions made obtaining a functional form for Λ(μ, α) (appendix A) are not good for such coupling strengths.
Before making comparison to realistic systems, we comment here on an essential ingredient to the derivation of our analytic results: the self-averaging procedure. This property of random matrices is commonly assumed [9,46], and whilst not rigorously proven, has been an invaluable tool in the descriptive power of RMT-indeed, RMT has seen much success in describing interacting spin systems [12], and atomic and nuclear physics [47,48]. Further, the analysis of random matrices based on the above assumptions already makes up much of the basis of our understanding of the ETH [9], which has seen repeated numerical verifications in non-integrable models. One can write the essential assumption as, for example such that the observable matrix elements are taken to be equal to their ensemble average. Note that O αβ are not averaged quantities, as they do not depend on the random perturbation. Our work is by no means a rigorous proof of this property, however the success of the analytic results when compared to the exact numerics may be seen as further evidence of the self-averaging property of random matrices.

Comparison to exact diagonalization of spin-chain
We now perform a comparison of the theory from random matrices outlined above to a more physical system. We choose a 1D spin chain, with a Hamiltonian of the form We reiterate here that the Lorentzian functional form for the wave-function distribution Λ(μ, α) is obtained in the perturbative regime (see appendix A), and thus we only expect good agreement with our RMT result when the interaction Hamiltonian H SB is small. However, the theory developed above for correlation functions, and thus the application to observable distributions, is more general. Previous numerical studies have shown that in the high coupling limit one observes a Gaussian wave-function distribution [38][39][40][41][42], and thus for high coupling strengths we do not expect a good overlap with the developed RMT results, however the basic phenomenology should remain unchanged. For our model, the system Hamiltonian H S is simply given by a spin in perpendicular fields, B x and B z , where N S labels the position of the system in the chain, between 1 and N.
Thus we have H 0 =H S +H B , and H I =H SB . For the analysis below we compare various limits of this system, to show where our assumptions made above do and do not hold. Each limit is non-integrable, and expected to thermalize.
We focus here on two cases: a homogeneous chain, and the case of a weakly coupled impurity. It is the latter for which we expect the RMT description to work best, as it is here that the assumption that the density of states does not change over the coupling width is valid. It is this assumption that allows us to treat the interaction Hamiltonian as a full random matrix in equation (17). Should the density of states significantly change over the relevant coupling width, then a random matrix with some bandwidth would be required.
Initially, for the impurity case, we set = = = =  calculate the off-diagonal matrix elements of system observables for varying system sizes from N=8 to N=13. We set the system position to be N S =5 throughout. To test the RMT prediction for the observable and wave-function distributions we calculate these distributions directly using exact diagonalization and perform a fit to the distribution to find the observed width Γ Fit . This is then compared to the expected width from a random matrix framework, Γ RM , which we discuss below. To perform the fit we first smooth the ED result by applying a Lorentzian mask over each point such that, for smoothed eigenstates we have . This function is related to the strength function introduced in quantum chaos theory [12]. Similarly, for an observable O We perform a three variable (central energy, peak width Γ, and peak height) fit to a Lorentzian to find the Γ Fit . The values for Γ Fit can then be compared to Γ=Γ(W 0 ) found from the interaction Hamiltonian using the method outlined below.

Computation of RMT width
For comparison of our RMT description to the ED calculation, we must be able to calculate an estimate for Γ from the random matrix perspective. This can be obtained from the Hamiltonian, as for the random matrix we have p w G = g N RM 2 0 , and g N , which may be found by the average value of the random interaction Hamiltonian. Relating this to a physical system must be done with some care, however, as the average value should not be taken over the entire Hamiltonian, but over some energy width W, as discussed below. We can write Γ RM , for a random matrix, as where D(E)=1/ω 0 is the density of states. In this form we can see more easily the relation to a real Hamiltonian. However we must treat the above expression carefully, as the association  { } † g N H H N Tr I I 2 2 must be made with proper consideration of the physical relationship between the interaction Hamiltonian and a random matrix. To reiterate, the physical grounds for using a random interaction Hamiltonian here rely on the fact that for generic non-integrable systems the interaction Hamiltonian, when expressed in the basis of eigenstates of the non-interacting Hamiltonian, resembles a banded random matrix with some width W BW . We can use a full random matrix for the low coupling limit as the density of states, which dictates the bandwidth, does not change much over the width of the coupling energy Γ. Thus, there are two caveats to be considered in implementing equation (66): H I must be expressed in the basis of H 0 , and the trace must be taken over a finite width W 0 <W BW . We thus define the trace over an energy width W, . The question is, then, which is the physically relevant value, G( ) W 0 , of the possible values of Γ(W)? We know that Γ must satisfy Γ(W 0 ) = W BW , as otherwise our assumption that the density of states does not change over the width Γ is invalid. Furthermore we must have Γ(W 0 ) = W 0 such that all states within the coupling energy Γ ≔ Γ(W 0 ) are counted. Thus we have the condition Γ(W 0 ) = W 0 = W BW . We should expect to see a plateau in the function Γ(W), giving the width over which the interaction Hamiltonian is effectively described by a random matrix. As W grows we should then expect to see Γ(W) decay for W>W BW , as the long range interaction terms vanish. It is the value of Γ(W) on the plateau that is the physically relevant point, as assuming the interaction strength is weak enough, the structure of long-range interactions should not matter.
We can see from figure 2 that this description is a good approximation for the spin chain, however the estimation of Γ from this method is a likely source of error for the system sizes available, as the plateau region is not exactly flat as one would expect from a true random matrix. For larger sizes, one expects the initial structure of the Hamiltonian to be more 'washed out' by the change to the non-interacting basis. We can also see from figure 2 that as the interaction strength J I increases the line Γ(W)=W will extend further into the plateau region, as the average value of the interaction Hamiltonian elements in this region increases. The random matrix approximation becomes invalid in the limit where the line Γ(W)=W extends past the plateau region, as it is in this case that the density of states begins to change significantly over the width Γ (hence the condition Γ(W 0 ) = W 0 ).

Impurity
We begin by analysing the simple case where = = For the case of the s ( ) x N S observable we must instead use information we have about the structure of the observable in the non-interacting basis to obtain a functional form for the observable distribution from the RMT formalism above. The useful observation here is that for s ( ) is necessary for correct normalization. We can thus see that for the s ( ) x N S observable we expect two peaks in the distribution of off-diagonal matrix elements, each of width Γ, separated by a width ( ) B 4 z S . Shown in figure 3 is a comparison between the ED numerical calculation and the RMT prediction. We compare the value for Γ Fit obtained from the fit to the smoothed distribution and the value found for Γ(W 0 ) to obtain a relative error, shown in figure 4, which we observe to decrease on average with system size for varying interaction strengths J I . Whilst the range in relative error here is high, this is largely due to the difficulty in estimating Γ for the available system sizes, and the fit to a Lorentzian distribution is very good. Furthermore, one would expect a high error for such short spin-chains, as the RMT result is valid in the thermodynamic limit, and requires the wave-function to be spread out over many states.

Homogeneous chain
The inclusion of a finite ( ) B x S adds a level of complexity to the problem, as neither relevant observable s ( ) z N S or s ( ) x N S is diagonal in the non-interacting basis. A similar approach to that shown in above for s ( ) E E E , 2 2, with the central peak of twice the height. We can see in figure 5(a) that we obtain a good agreement for the weak coupling case. As we approach the fully homogeneous case in figure 5(b), however, we observe the RMT prediction no longer holds. We can see from figure 6 that the Γ(W)=W line extends to the end of the plateau region, and thus the requirements for assuming a full random matrix perturbation are not fulfilled-the change in density of states also contributes to the distribution of the wave-functions. Thus in this limit we no longer expect the wave-function distribution to be a Lorentzian, nor do we expect the method outlined above to be a good indication of the distribution width. Furthermore we note that the for high couplings there are also added technical challenges for the systems available to our study, as the interaction Hamiltonian structure is not sufficiently randomized by the transformation to the non-interacting basis. We note that most of this structure occurs at the edges of the spectrum, and thus one can simply take the trace over the central half of the energies, as indicated in figure 6(b). This is justified for the bulk states we are analysing.

Finite size scaling of long time fluctuations
Off-diagonal elements of observables dominate the behaviour of their long-time fluctuations [24,43,49]. where the w ( ) O 0 2 term is due to the subtraction of the μ=ν part. A further parameter that is of interest [35] to the finite size scaling of closed quantum systems is the IPR, defined as 4 . This can also be obtained in a similar manner using the RMT result where the factor of 3 in the denominator comes from the ratio of the second and fourth order moments of Gaussian variables. From these we obtain We can see from figure 7 that the proportionality is indeed correct, which has previously been shown to be a consequence of the ETH in [35]. Similar results have also been previously observed in [2,4,5], which obtain bounds on the late time fluctuations in terms if the IPR. Our work implies that in those systems which can be well described by a random matrix ansatz, the IPR determines not only an upper bound, but also the scale of the time-fluctuations. Similar dependencies have also been observed numerically in [50]. One also observes in figure 7 that the numerical prefactor, expected to be 1/6 for small couplings, as D =  [38][39][40][41][42]51], observing wave-functions of non-integrable systems to be Gaussian for large coupling strengths, one may repeat a similar calculation to that leading to equation (74), however with Λ(μ, α) replaced by a Gaussian. We then obtain a prefactor of -( ) 3 2 1 . We define  r gives the value of the prefactor directly. We indeed observe a growth of this prefactor to~-( ) 3 2 1 , the value expected by applying a Gaussian distributed wave-function to the RMT approach above in both the impurity and homogeneous cases. We note that for low couplings the fact that á ñ a a r does not tend exactly to the expected value from RMT is not surprising, as this is where we are most limited by the Hilbert space sizes available to our study, and thus there is a high associated error in this limit. Similar phenomena are observed for a numerical random matrix model, where the high coupling limit is obtained by the replacement Λ(μ, α)=Λ=1/N, the scaling of fluctuations for this case is analysed in appendix B.

Discussion
We have analytically studied a random matrix Hamiltonian, equation (17), made up of a linear ensemble of states with random interactions, and expanded on previous work [16,37] to find a functional form for generic observables, as well as clarifying many of the approximations made to obtain the wave-function distribution (appendix A). The form obtained for matrix elements of observables is in agreement with the ETH. We also predict that there is a linear relation between the time-fluctuations of an observable and the IPR. This relation may be relevant to detect quantum ergodicity by measuring the time-fluctuations in an experiment, if we  understand quantum ergodicity as the participation of many Hamiltonian eigenstates in the initial state, which is implied by small IPR values. Thus, measuring an exponential decrease of the time-fluctuations with system size would yield evidence that the IPR itself is exponentially decreasing with system size, which could be used as a smoking gun of quantum ergodicity.
We have assumed that an approximate description of the quantum dynamics of a subsystem in a many-body system can be achieved by an interaction term given by a structureless random Gaussian matrix. This approximation implies that the typical energy bandwidth of the coupling term, W BW , is considered infinite compared to the coupling strength, W BW ? Γ α . Our results are thus immediately applicable to the stuation of an impurity weakly coupled to a many-body bath, since in this case Γ α depends on a different interaction strength (J I in the spin chain example above) than the energy bandwidth, W BW , and thus Γ α can be made arbitrarily small. Our numerical calculations confirm that in this weak coupling limit many-body wave-functions are well approximated by Lorentzian-shaped random wave-functions.
The weak coupling approximation may fail if, for example, we consider a subsystem in a homogeneous system where the coupling strength is not necessarily small compared to the bandwidth of the coupling term. Actually, in a homogeneous system we expect that W BW ≈Γ α since both energy scales are governed by the same interactions. For example, in the spin chain considered in the last section, both W BW and Γ α are determined by the spin-spin interactions in the bulk J B . In this case, we have observed numerically that the random wavefunctions envelope is not necessarily a Lorentzian, but rather a Gaussian function. However, a valid random wave-function relying in the approximate distribution (21) is still possible by considering that Λ(μ, α) are now normalized Gaussian functions. Most of the discussion in the subsequent sections remains intact, including expression (53) for the non-diagonal matrix elements of an observables. The only effect from the strong coupling condition is a different line-shape of envelope functions defined in equations (64), (65), and a different prefactor in the scaling of the time-fluctuations as a function of the IPR, as shown in our numerical calculations (see figures 8 and 9).  where we have expressed the independent widths of the off-diagonal and diagonal element distributions as g 1 and g 2 , respectively. This further differs from that used in [37] by appropriate symmetrization of the random interaction Hamiltonian. This may be rewritten as We note here that this leaves us with the same integral as would be obtained if we had enforced orthogonality of only two eigenvectors at a time, as in [37], up to a factor of two. Now, we observe  , that we can obtain the best possible approximation p(c, Λ) by obtaining the functional form of Λ that fulfils as well as any constraints on Λ we may require. This is the problem solved in [37], though using a different target distribution p(c, Λ). The free energy integral of equation (22) can be split into two parts, which we heuristically label the 'energy', E, and 'entropy', S, with F=E−S, we have To calculate the full average in equation (A17), we need to calculate the average of all the even powers, where the factor -- ( )!! ≔ · · ( ) n n 2 1 1 3 5 2 1 , arises after the counting of all possible combinations of pairs. In this approximation we are neglecting those terms where indices are not contracted by pairs, however, those terms have at least one less summation over one of the indices α j , and thus they are supressed by a factor

A27
Now, we can see that the first term is also constant, as the sum over α cancels that over a¢, thus this term does not contribute, and we obtain We note that in [37] the first term is here labelled entropy, and the second term is labelled a repulsion energy. The final calculation required for evaluation of the free energy is the energy part, equation (A11). As Z P does not depend on Λ(μ, α), we can ignore this part (as we require F only for it is derivative with respect to Λ(μ, α)). We can also rewrite the delta-function factor as U The key observation here is that as the only non-zero terms in equation (A31) are those with even powers of c μ (α) any terms that have correlations between the 'bias' factor a = å å from the Hamiltonian and the orthogonality factor are either excluded by the fact that m n ¹ or reduced by the need for a a a = ¢  , . Thus the dominant cause of correlations, leading to non-zero terms in the average, are from correlations within each factor, and not between. This leads to the approximation, which is equivalent to that made in the partition function evaluation above, may be ignored, as they are proportional to m a L( ) , 3 2 and Λ(μ, α) 5/2 , respectively, and are thus small. We may therefore approximate this as  Taking the continuum limit and noting that ò w m a n a