2D CFT partition functions at late times

We consider the late time behavior of the analytically continued partition function Z(β + it)Z(β − it) in holographic 2d CFTs. This is a probe of information loss in such theories and in their holographic duals. We show that each Virasoro character decays in time, and so information is not restored at the level of individual characters. We identify a universal decaying contribution at late times, and conjecture that it describes the behavior of generic chaotic 2d CFTs out to times that are exponentially large in the central charge. It was recently suggested that at sufficiently late times one expects a crossover to random matrix behavior. We estimate an upper bound on the crossover time, which suggests that the decay is followed by a parametrically long period of late time growth. Finally, we discuss gravitationally-motivated integrable theories and show how information is restored at late times by a series of characters. This hints at a possible bulk mechanism, where information is restored by an infinite sum over non-perturbative saddles.


JHEP08(2017)075

Introduction and summary
Quantum black holes have finite entropy and a discrete spectrum of states. The details of this spectrum are inaccessible in the semi-classical approximation: the density of states one obtains from the Bekenstein-Hawking entropy is a smooth function of the energy. In this work we address the question of how the discrete spectrum arises in 2d conformal field theories and their holographic duals. Maldacena suggested that one may address this question by studying the late time behavior of correlation functions [1], which is a sharp probe of the discrete energy levels in the spectrum. For unitary systems with discrete spectra, connected thermal correlators of the form $\langle O(t)O(0)\rangle$ (where $O$ is a Hermitian operator) tend to decay exponentially until times of order the entropy $S$, and then proceed to oscillate erratically about zero with an RMS amplitude of order $e^{-S}$. [fn 1] On the other hand, correlation functions computed in a classical black hole background tend to decay exponentially forever. This decay is often referred to as 'information loss'.
Holography may be a useful setting for studying the question of how a discrete black hole spectrum arises. From the boundary field theory point of view, the fact that the spectrum is discrete is trivial if we place the theory on a compact spatial manifold. Similarly, the qualitative features of the late time behavior follow easily from mild assumptions about the spectrum (such as the fact that the theory is chaotic). The challenge is then to describe this behavior in 'bulk language', using objects that are natural from a gravity point of view. In this work we focus on another quantity that is also sensitive to information loss at late times.

Spectral form factor and information loss
Consider the thermal partition function Z(β), and let us analytically continue β → β + it. The parameter t should be thought of as real time. Let $E_n$ be the discrete energy levels, each with degeneracy $N_n$, and consider the following quantity.
$$ g(\beta, t) \equiv |Z(\beta + it)|^2 = \sum_{n,m} N_n N_m\, e^{-\beta(E_n+E_m)+it(E_n-E_m)} . \tag{1.1} $$
If we formally set β = 0 then (1.1) becomes a well-studied quantity in Quantum Chaos called the spectral form factor (for reviews, see [5,6]). We will use the same name to refer to g(β, t) at any β. In the context of black hole physics this quantity was first discussed in [7], and was recently studied in the context of information loss in the Sachdev-Ye-Kitaev model [8,9] in [10]. See also [4] for a related discussion. At late times the double sum in (1.1) essentially localizes onto terms with $E_n = E_m$. As we review in section 2, the time average of g(β, t) obeys the bound
$$ \bar g(\beta) \equiv \lim_{t_o\to\infty} \frac{1}{t_o}\int_0^{t_o} g(\beta, t)\,dt \;\ge\; Z(2\beta) . \tag{1.2} $$
The bound is saturated when the spectrum has no degeneracies. The non-zero time average reflects a weighted counting of the discrete energy levels in the spectrum. The quantity on the right-hand side is of order $e^{S}$. On the other hand, suppose we have a bulk theory with a black hole background and focus on the BTZ black hole for simplicity. We approximate the exact partition function by the BTZ black hole partition function $Z(\beta) = \exp\left(\frac{\pi^2 c}{3\beta}\right)$, which is the dominant contribution for temperatures above the Hawking-Page transition. We then find a spectral form factor that decays to 1 at late times. If we also include the 1-loop determinant we find that the spectral form factor decays to zero, representing no discrete states in the corresponding spectrum. [fn 2] We see that the spectral form factor, just like the correlation function, is sensitive to information loss. See [11,12] for related discussions.
[fn 1] By the notation $e^{-S}$ we mean that the quantity scales as $e^{-n_{\rm dof}}$, where $n_{\rm dof}$ is the number of degrees of freedom. We will be interested in 2d CFTs with large central charge c, for which $n_{\rm dof} \sim c$. We note that at very late times of order $e^{e^{S}}$ we expect recurrences, which do not play a role in this work. See [2][3][4] for a discussion of recurrences in the context of information loss.
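The bound (1.2) can be illustrated numerically. The following is a minimal sketch, assuming a small hypothetical spectrum (the levels and degeneracies below are illustrative and not taken from any CFT); it estimates the time average of g(β, t) by a midpoint rule and compares it with the diagonal sum over $E_n = E_m$ and with Z(2β).

```python
import cmath
import math

def partition_function(levels, beta, t=0.0):
    """Z(beta + i t) = sum_n N_n exp(-(beta + i t) E_n) for (E_n, N_n) pairs."""
    return sum(N * cmath.exp(-(beta + 1j * t) * E) for E, N in levels)

def form_factor(levels, beta, t):
    """g(beta, t) = |Z(beta + i t)|^2, as in eq. (1.1)."""
    return abs(partition_function(levels, beta, t)) ** 2

def time_average(levels, beta, t_max, steps):
    """Midpoint-rule estimate of (1/t_max) * integral_0^t_max g(beta, t) dt."""
    dt = t_max / steps
    total = sum(form_factor(levels, beta, (j + 0.5) * dt) for j in range(steps))
    return total / steps

# Hypothetical toy spectrum: incommensurate levels, one of them doubly degenerate.
levels = [(0.0, 1), (1.0, 1), (math.sqrt(2.0), 2), (math.pi, 1)]
beta = 0.5

g_bar = time_average(levels, beta, t_max=2000.0, steps=100000)
# Only terms with E_n = E_m survive the long-time average.
diagonal = sum(N ** 2 * math.exp(-2.0 * beta * E) for E, N in levels)
# Right-hand side of the bound (1.2).
bound = partition_function(levels, 2.0 * beta).real

print(g_bar, diagonal, bound)
```

Because one level is degenerate in this toy example, the average sits strictly above Z(2β); removing the degeneracy would saturate the bound.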
It was suggested in [1] that one may improve the situation by adding subleading bulk saddle points such as thermal AdS$_3$. It is easy to check that including the thermal AdS$_3$ contribution indeed raises the time average, but this contribution is not sufficient for the time average to obey the bound (1.2) at high temperature. Indeed, we will see that no finite number of subleading saddles is enough to obey the bound (1.2) at high temperature.
For 2d conformal field theories, the question of information loss in the thermal two point function was studied in [13] and for collapsing black holes in [14]. Recently, the authors of [15-18] considered the four point function of two heavy operators $O_H$, with $\Delta_H \sim c$, and two light operators $O_L$ on the cylinder, $\langle O_H| O_L(\phi, it)\, O_L(0) |O_H\rangle$. This is a microcanonical version of the calculation described above. In the large c limit, corresponding to the classical black hole limit in the bulk, one finds that the correlation function reproduces the thermal two-point function on a line, with temperature set by the heavy operator's dimension, and thus decays in time. In [17,18] it was speculated that perhaps the late time decay is avoided (and information is restored) within each Virasoro block in an OPE expansion of this four-point function. This question is difficult to answer because the relevant Virasoro blocks are not known exactly. [fn 3] We are able to answer this question in our context by considering instead the spectral form factor, which has a decomposition in terms of Virasoro characters, analogous to the Virasoro blocks that show up in the OPE expansion of the heavy-heavy-light-light correlator. [fn 4] The Virasoro characters have known closed-form expressions, and each relevant Virasoro character decays at late times. We conclude that in chaotic 2d theories information is not restored kinematically in general, namely as a consequence of Virasoro symmetry, but rather dynamically, due to an interplay between infinitely many characters. (Integrable theories will be discussed separately. For such theories information loss still occurs at the level of Virasoro characters, but is explicitly restored in the characters of the extended chiral algebra.) The authors of [20,21] studied the discrete spectrum of chaotic 2d CFTs by working directly with the thermal partition function. Our conclusion agrees with their results.
The authors of [20] considered a modular invariant partition function that is made up of the vacuum character plus its modular images (appropriately regulated). They found that the corresponding density of states is essentially smooth and captures almost none of the discrete states. Here we advertise that if one is interested only in whether or not the spectrum contains discrete states (rather than in the detailed properties of these states), it is enough to check whether the time-averaged spectral form factor $\bar g(\beta)$ vanishes, a potentially simpler computation.
This discussion of information loss has been phrased in terms of the boundary Virasoro characters, but also has a natural bulk interpretation. The character which dominates at high temperature corresponds to the bulk BTZ saddle. The O(1/c) correction to the character corresponds to a one-loop determinant in the bulk. Therefore, a resolution of information loss phrased in terms of Virasoro characters would probably shed light on how information is restored in the bulk.
[fn 2] That the BTZ partition function decays to 1 (if we do not include the 1-loop determinant) is related to the fact that the inverse Laplace transform of $e^{1/\beta}$ is given by $E^{-1/2} I_1(2E^{1/2}) + \delta(E)$, which includes a single discrete state.
[fn 3] See [19] for recent developments.
[fn 4] The torus partition function can be written as a correlator involving 4 heavy twist operators. The Virasoro characters are the blocks that appear in an OPE expansion of this correlator.

Late times and random matrix theory
The late time behavior of the spectral form factor is only sensitive to the structure of small energy differences. We generally expect that if we probe any chaotic system at sufficiently small energy differences, then the Hamiltonian can be approximated by a random matrix chosen from a suitable Gaussian ensemble. The authors of [10] made the observation that the late time behavior of chaotic theories should therefore be described by random matrix theory (see [5,6] for a review of RMT). This was verified for the Sachdev-Ye-Kitaev model [8,9] in [22]. We thus now turn to RMT as a guide for what to expect for the late time behavior of the spectral form factor. Figure 1 shows the spectral form factor for random matrices selected from the Gaussian Unitary Ensemble (GUE). We will discuss this curve in more detail below. For now we merely point out that (i) the shape of the curve before its minimum (the dip) is dominated by the coarse-grained shape of the spectrum (in the case of Gaussian random matrices this is Wigner's semicircle law), and that (ii) after the dip time the curve starts probing the discrete energy levels. In particular, the period of linear growth is related to the spectral rigidity of random matrix energy levels (essentially the fact that energy levels repel).
In [10] it was conjectured that the existence of a dip time, followed by a period of linear growth, is a generic feature of chaotic systems, including black holes. The value of the dip time is non-universal and depends on the detailed properties of the theory, including the coarse-grained shape of the spectrum that determines the early time decay. Here we test this conjecture in the context of 2d CFTs, and find evidence that the approximate dip time is robust for chaotic CFTs dual to gravity.
In RMT (and in the SYK model) the spectral form factor is defined by averaging over an ensemble of Hamiltonians. This averaging leads to a smooth curve at late times. In trying to apply the conjecture to an ordinary quantum field theory, one has to confront the fact that there is only one Hamiltonian. As a result, the late time behavior is expected to be erratic. Figure 2 shows the spectral form factor computed from a single GUE matrix. Beyond the dip time the fluctuations become large and the features of figure 1 are barely visible (see also [23]). However, as explained in [10], one can replace ensemble averaging by time averaging over a parametrically small window (in the limit of a large Hilbert space dimension), restoring the late time features. It is therefore meaningful to discuss the random matrix theory ramp and plateau at late times even in an ordinary quantum field theory.
In this work we estimate the dip time at which a generic 2d CFT crosses over into the RMT regime. This is done by estimating the shape of the early decay of the curve using modular invariance, and assuming that at late times we have the linear growth predicted by RMT. Our estimate relies on identifying the dominant contribution from a single Virasoro character at each point in time. We estimate that the spectral form factor decays at late times in an erratic way, with an envelope that decays as a power law in time. We will see that this implies a parametrically long period of linear growth following the dip time. We expect the same to be true of black holes in AdS$_3$.

Summary of results
Here is a brief summary of the key points of this paper.
1. We consider the spectral form factor $|Z(\beta + it)|^2$ as a probe of the spectrum. At early times it diagnoses the mean density of states, while at late times its behavior serves as a useful diagnostic of the discreteness of the spectrum [10]. As in the case of two-point functions, a decay at late times indicates that we are not probing the discrete states of the spectrum, and signals information loss.
2. In 2d CFTs the partition function has an expansion in terms of the Virasoro characters. Each character decays at late times, and therefore Virasoro symmetry is not enough in general to restore information.
3. We identify a universal contribution to the early time behavior of the spectral form factor, which follows from Virasoro symmetry and modular invariance. It includes sharp peaks at times t = 2πn for integer n, where the height of the peaks decays as a power law in time. We conjecture that this contribution dominates the early time behavior in generic 2d CFTs.
4. In chaotic theories we expect the late time behavior ($t \gtrsim e^{c}$) to be described by random matrix theory. In particular, we expect there to be a characteristic time scale (the 'dip time') beyond which the RMT description is valid. Based on our (uncontrolled) analysis of the early time behavior, we conjecture that the dip time scales as $e^{c}$. Beyond the dip time we expect there to be a period of linear growth (with large fluctuations) that is parametrically long at large c and high temperature.
5. For certain integrable models, or BPS sub-sectors of generic models, we identify a precise infinite set of bulk saddles which restore the information naively lost in the leading thermodynamic approximation.
The rest of the paper is organized as follows. In section 2 we discuss the spectral form factor and information loss in 2d CFTs. In section 3 we review the Virasoro character expansion and the modular properties of the torus partition function, and provide simple estimates of its decay before the dip time. Then, in section 4 we give an improved estimate of the decay by identifying the dominant character at any rational time. We conclude that these contributions are not sufficient to avoid information loss. In section 4.4 we estimate the dip time, beyond which we expect the system to have an effective random matrix theory description. In section 5 we discuss integrable theories. We show that for certain integrable theories, or BPS sectors of generic theories, information is restored by identifying the dominant saddle point at each particular time using modular invariance. Appendix A gives a short review of black holes in AdS 3 .

Spectral form factor
In this section we define the spectral form factor and discuss its properties in relation to information loss. Consider a unitary quantum field theory with a holographic dual. Place the theory on a compact manifold so that it has a discrete spectrum. The spectrum consists of energy levels $E_n$, each with degeneracy $N_n$. The density of states is given by
$$ \rho(E) = \sum_n N_n\, \delta(E - E_n) . \tag{2.1} $$
The thermal partition function at inverse temperature β is
$$ Z(\beta) = \sum_n N_n\, e^{-\beta E_n} . \tag{2.2} $$
We assume for simplicity that this function is finite for any β > 0 (this is always true for 2d CFTs). Let us generalize the partition function and define
$$ Z(\beta, t) \equiv \sum_n N_n\, e^{-(\beta + it) E_n} . \tag{2.3} $$
One can obtain this function by analytically continuing Z(β), taking β → β + it. The parameter t is conveniently thought of as real time. We then define the spectral form factor by
$$ g(\beta, t) \equiv |Z(\beta, t)|^2 = \sum_{n,m} N_n N_m\, e^{-\beta(E_n+E_m)+it(E_n-E_m)} . \tag{2.4} $$
This is an important quantity in the study of random matrix theory [5,6].
In this work we will study the late time behavior of g(β, t). In a general chaotic theory this behavior is complicated, as it involves a sum over many oscillating terms with different frequencies ($E_n - E_m$). Things simplify if we only consider the long-time average, where only terms with $E_n = E_m$ contribute:
$$ \bar g(\beta) \equiv \lim_{t_o\to\infty} \frac{1}{t_o}\int_0^{t_o} g(\beta, t)\,dt = \sum_n N_n^2\, e^{-2\beta E_n} . \tag{2.5} $$
We see that, on average, g(β, t) approaches a non-zero value at late times. In (2.5) we implicitly assumed that there is a minimal level spacing in the spectrum. The long-time average obeys the bound
$$ \bar g(\beta) \ge Z(2\beta) . \tag{2.6} $$
The bound is saturated when the spectrum has no degeneracies. In this case the late-time average of g, namely Z(2β), is exponentially smaller than the initial value $Z^2(\beta)$. Indeed, in general we have For a CFT in d spacetime dimensions the right-hand side is equal to exp − 2
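As a worked illustration of how small the late-time average is relative to the initial value, the ratio $Z(2\beta)/Z(\beta)^2$ can be estimated under one assumption that is not taken from the text above: that at high temperature a CFT in $d$ spacetime dimensions on spatial volume $V$ has $\log Z(\beta) \approx a\,V\,\beta^{1-d}$, with $a$ a theory-dependent constant.

```latex
\begin{align}
\frac{Z(2\beta)}{Z(\beta)^2}
  &= \exp\!\big[\log Z(2\beta) - 2\log Z(\beta)\big] \\
  &\approx \exp\!\big[(2^{1-d} - 2)\, a V \beta^{1-d}\big]
   = \exp\!\big[-2\,(1 - 2^{-d})\,\log Z(\beta)\big] .
\end{align}
```

For d = 2 with $\log Z(\beta) = 8\pi^2 k/\beta$ this gives $\exp(-12\pi^2 k/\beta)$, which agrees with the direct evaluation $e^{4\pi^2 k/\beta}/e^{16\pi^2 k/\beta}$.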

Information loss
We now consider the long-time average $\bar g(\beta)$ in the context of the AdS$_3$/CFT$_2$ duality. Consider a 2d CFT on a circle of length L = 2π that has a holographic bulk dual, and assume as before that the theory has a discrete spectrum. At high temperature the thermal state of the theory is dual to a BTZ black hole. Its partition function is given approximately by $Z(\beta) = \exp(8\pi^2 k/\beta)$, where k = c/24 and c is the central charge of the field theory. This is an approximation to the full partition function of the quantum theory. The BTZ contribution to the spectral form factor can be computed by continuing β → β + it, and it decays at late times as
$$ g(\beta, t) = \exp\left(\frac{16\pi^2 k\,\beta}{\beta^2 + t^2}\right) \;\to\; 1 . $$
In taking the late time limit we will always keep β (the real part) fixed. In this approximation we find that the time average is 1, violating the bound (2.6). This is a form of information loss. The BTZ contribution to the partition function is given by the modular image of the vacuum state. As we will see below, no finite number of additional primary operators is sufficient to avoid information loss.
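The decay of the BTZ contribution can be checked with a few lines of arithmetic. The sketch below assumes only the approximate partition function $Z(\beta) = \exp(8\pi^2 k/\beta)$ quoted above; the values of β and k are illustrative. In this approximation the form factor drops below Z(2β) already at $t = \sqrt{3}\,\beta$, independently of k.

```python
import math

def g_btz(beta, t, k):
    """|exp(8 pi^2 k / (beta + i t))|^2 = exp(16 pi^2 k beta / (beta^2 + t^2))."""
    return math.exp(16.0 * math.pi ** 2 * k * beta / (beta ** 2 + t ** 2))

def bound_btz(beta, k):
    """Z(2 beta) in the same approximation: exp(4 pi^2 k / beta)."""
    return math.exp(4.0 * math.pi ** 2 * k / beta)

beta, k = 1.0, 2.0                  # illustrative values, not from the paper
t_cross = math.sqrt(3.0) * beta     # time at which g_btz equals Z(2 beta)

early = g_btz(beta, 0.0, k)         # equals Z(beta)^2
late = g_btz(beta, 1.0e4, k)        # approaches 1 at late times
ratio_at_crossing = g_btz(beta, t_cross, k) / bound_btz(beta, k)

print(early, late, ratio_at_crossing)
```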
Let us think clearly about what this means. Given an approximate partition function Z(β) we can compute the corresponding density of states ρ(E) by an inverse Laplace transform. For the BTZ black hole this is well approximated at high energies by the Cardy formula $\rho_{\rm cardy}(E) = e^{4\pi\sqrt{2kE}}$, which is an approximation to the density of states in the dual field theory. The important difference between this and the exact density of states (2.1) of the quantum theory is that the Cardy density is a smooth and finite function of the energy (see [11] for a related discussion in the context of large N gauge theories). Indeed, given a partition function of the form $Z(\beta) = \int dE\, \rho_s(E)\, e^{-\beta E}$, where $\rho_s$ is a smooth and finite function, it is easy to see that the time-averaged spectral form factor violates the bound regardless of the details of $\rho_s$. We see that the late time behavior of the spectral form factor directly probes the discreteness of the spectrum of the theory. In particular, the time-averaged $\bar g(\beta)$ counts discrete states in the spectrum (weighted by a Boltzmann factor and by degeneracy). Information loss occurs when we approximate the density of states by a smooth function that does not capture the individual energy levels. This type of information loss occurs in classical black holes in arbitrary dimension. Equivalently, it occurs in the dual field theory when we use the thermodynamic approximation to the partition function.
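A one-line example makes the statement about smooth densities concrete. The following sketch assumes the simplest possible smooth density, a constant $\rho_s(E) = \rho_0$ for $E \ge 0$ (a hypothetical choice made only for illustration):

```latex
\begin{align}
Z(\beta) &= \int_0^\infty dE\, \rho_0\, e^{-\beta E} = \frac{\rho_0}{\beta},
\qquad
g(\beta,t) = \left|\frac{\rho_0}{\beta + it}\right|^2 = \frac{\rho_0^2}{\beta^2 + t^2}, \\
\bar g(\beta) &= \lim_{t_o\to\infty} \frac{1}{t_o}\int_0^{t_o} \frac{\rho_0^2\, dt}{\beta^2+t^2}
 = \lim_{t_o\to\infty} \frac{\rho_0^2}{\beta\, t_o}\arctan\!\frac{t_o}{\beta} = 0
 \;<\; Z(2\beta) = \frac{\rho_0}{2\beta} .
\end{align}
```

The time average vanishes while Z(2β) stays positive, so the bound (2.6) fails. For a general integrable smooth density the same conclusion follows from the Riemann-Lebesgue lemma, since Z(β + it) is then a Fourier transform of $\rho_s(E)\,e^{-\beta E}$ and tends to zero at large t.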

2d CFTs
In this section we discuss in more detail the torus partition function and spectral form factor in 2d CFTs, focusing on theories with large central charge. We discuss possible corrections to the leading answer (including certain non-perturbative corrections) and show that they are not sufficient to restore information in the spectral form factor.
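Consider the partition function of a 2d CFT on a torus with parameter $\tau = \frac{i\beta}{2\pi} + \frac{\mu}{2\pi}$. From now on we set the chemical potential µ = 0. The partition function can be written as a sum over all states,
$$ Z(\tau, \bar\tau) = \sum_{h,\bar h} N_{h,\bar h}\; q^{h-k}\, \bar q^{\,\bar h-k} . \tag{3.1} $$
Here $q(\tau) \equiv \exp(2\pi i\tau)$, $N_{h,\bar h}$ is the degeneracy of the state with conformal weights $(h,\bar h)$, and we took the central charges to be $c_L = c_R = c = 24k$ for convenience. All states have $h,\bar h \ge 0$. The full partition function can also be written as a sum over Virasoro characters,
$$ Z(\tau, \bar\tau) = \chi_0(\tau)\,\bar\chi_0(\bar\tau) + \sum_{h,\bar h} n_{h,\bar h}\; \chi_h(\tau)\,\bar\chi_{\bar h}(\bar\tau) . \tag{3.2} $$
Here we use the notation $\bar f(z) = \overline{f(\bar z)}$. Each term captures the contribution from a Virasoro primary with dimensions $(h,\bar h)$ and its descendants, and we have isolated the vacuum contribution from the sum. Each character appears with degeneracy $n_{h,\bar h}$. The characters are given by
$$ \chi_0(\tau) = \frac{q^{-k+1/24}\,(1-q)}{\eta(\tau)} , \qquad \chi_h(\tau) = \frac{q^{h-k+1/24}}{\eta(\tau)} , \tag{3.3} $$
where η(τ) is the Dedekind eta function. These expressions are exact even at finite c.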
We assume the theory is modular invariant, which means
$$ Z(\gamma(\tau), \gamma(\bar\tau)) = Z(\tau, \bar\tau) \quad \text{for all } \gamma \in SL(2,\mathbb{Z}) . \tag{3.4} $$
We can write the partition function as a sum over states after performing any SL(2, Z) transformation γ. We will refer to the γ-image of a particular character as the contribution of that character in the γ frame.
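To obtain the high-temperature approximation to the partition function we can write the sum over characters in the S frame,
$$ Z(\tau, \bar\tau) = \chi_0(-1/\tau)\,\bar\chi_0(-1/\bar\tau) + \sum_{h,\bar h} n_{h,\bar h}\; \chi_h(-1/\tau)\,\bar\chi_{\bar h}(-1/\bar\tau) . \tag{3.5} $$
The first term, which is the vacuum character contribution in the S frame, is the dominant contribution at high temperatures (when β < 2π) [24]. It is given by
$$ \chi_0(-1/\tau)\,\bar\chi_0(-1/\bar\tau) = \left| \frac{\tilde q^{\,-k+1/24}\,(1-\tilde q)}{\sqrt{-i\tau}\;\eta(\tau)} \right|^2 , \qquad \tilde q \equiv e^{-2\pi i/\tau} = e^{-4\pi^2/\beta} . \tag{3.6} $$
In writing this we used the fact that $\eta(-1/\tau) = \sqrt{-i\tau}\,\eta(\tau)$. The leading part of (3.6) at large c comes from the vacuum state itself. It also has an O(1/c) correction coming from the sum over the vacuum's descendants.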
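The ingredients above are easy to evaluate numerically. The sketch below assumes the standard product formula for the Dedekind eta function and the vacuum character $q^{-k+1/24}(1-q)/\eta(\tau)$ with c = 24k; the function names and the sample values of τ, β and k are illustrative. It checks the S-transformation of η at a generic point and confirms that at β = 1 the S-frame vacuum contribution is extremely close to the leading BTZ value $e^{8\pi^2 k/\beta}$.

```python
import cmath
import math

def eta(tau, tol=1e-18, max_terms=100000):
    """Dedekind eta: q^(1/24) * prod_{n>=1} (1 - q^n), q = exp(2 pi i tau), Im(tau) > 0."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1.0 + 0.0j, q
    for _ in range(max_terms):
        prod *= 1.0 - qn
        qn *= q                      # running power q^n, avoids repeated exponentiation
        if abs(qn) < tol:
            break
    return cmath.exp(2j * math.pi * tau / 24.0) * prod

def chi_vac(tau, k):
    """Vacuum character q^(-k + 1/24) (1 - q) / eta(tau), with c = 24 k."""
    q = cmath.exp(2j * math.pi * tau)
    return cmath.exp(2j * math.pi * tau * (-k + 1.0 / 24.0)) * (1.0 - q) / eta(tau)

# Check eta(-1/tau) = sqrt(-i tau) eta(tau) at a generic point in the upper half plane.
tau = 0.3 + 0.8j
lhs = eta(-1.0 / tau)
rhs = cmath.sqrt(-1j * tau) * eta(tau)

# S-frame vacuum contribution at real temperature versus the leading BTZ value.
beta, k = 1.0, 2.0                   # illustrative values
z_vac_S = abs(chi_vac(2j * math.pi / beta, k)) ** 2
leading = math.exp(8.0 * math.pi ** 2 * k / beta)

print(abs(lhs - rhs), z_vac_S / leading)
```

At real temperature the descendant corrections are tiny because $\tilde q = e^{-4\pi^2/\beta}$ is very small; they only become important after continuation to late times.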

Much of this structure is echoed on the gravitational side. The asymptotic symmetry algebra of pure gravity in AdS$_3$ is the Virasoro algebra, with central charge $c = \frac{3\ell}{2G}$ [25]. [fn 6] The contribution of thermal AdS$_3$ to the partition function can be evaluated exactly, and is given by the vacuum character contribution
$$ Z_{\rm vac}(\tau, \bar\tau) = \chi_0(\tau)\,\bar\chi_0(\bar\tau) . \tag{3.7} $$
In the bulk, the leading contribution comes from evaluating the action of the classical gravity solution, while the O(1/c) correction is due to a 1-loop determinant. There are no higher order corrections, so this result is 1-loop exact in bulk language. The contribution of the BTZ black hole geometry is given by the vacuum character in the S frame, eq. (3.6). Here, again, the leading large c contribution comes from the classical (black hole) solution, and there is an O(1/c) correction due to a 1-loop determinant. More generally, as we review in appendix A, at fixed temperature and chemical potential there are an infinite number of classical bulk solutions that are related by SL(2, Z) transformations [26]. [fn 7] The solution that corresponds to γ ∈ SL(2, Z) makes a contribution to the partition function equal to $\chi_0(\gamma(\tau))\,\bar\chi_0(\gamma(\bar\tau))$. For a general theory, there will be many additional contributions to the partition function that correspond to states involving matter fields.

Analytic continuation to real time
Equation (3.5) is a useful starting point for the analytic continuation β → β + it to real time because (i) at t = 0 the vacuum character contribution provides a good approximation, and (ii) this dominant contribution has a clear bulk interpretation as the BTZ black hole. This contribution remains dominant at sufficiently early times. We now discuss the various pieces of eq. (3.5) after analytic continuation to late times. We will find that the contribution coming from each individual character decays to zero at late times, violating the bound (2.6).
We start by focusing on the contribution of the vacuum character (3.6), which is equal to the BTZ black hole contribution and is the dominant contribution at high temperature. We analytically continue β, taking
$$ \tau = \frac{i(\beta + it)}{2\pi} , \qquad \bar\tau = -\frac{i(\beta + it)}{2\pi} . \tag{3.8} $$
Notice that after analytic continuation $\bar\tau$ is not the complex conjugate of τ. After a time of order a few βs we find that the vacuum character contribution to the spectral form factor decays, up to β-dependent prefactors, as [fn 8]
$$ g_{\rm BTZ}(\beta, t) \propto \frac{1}{t^6}\, \exp\left(\frac{16\pi^2 k\,\beta}{t^2}\right) . \tag{3.9} $$
[fn 6] Here $\ell$ is the AdS length, and G is the 3d Newton constant.
[fn 7] The family of solutions is labeled by elements of $\Gamma_\infty\backslash SL(2;\mathbb{Z})$, where we quotient by τ → τ + 1 on the left.
[fn 8] The continued eta function η(τ) oscillates in time, never giving a substantial contribution to (3.9).

The leading, vacuum state contribution decays exponentially to an O(1) amplitude at times $t \sim \sqrt{k}$. The subleading contribution coming from the descendants is then responsible for the $1/t^6$ power law decay down to zero. Curiously, including additional states (the vacuum's descendants) in the S-frame makes the violation of the bound (2.6) worse.
Next, the contribution to (3.5) coming from each non-vacuum character can be written as
$$ \chi_h(-1/\tau)\,\bar\chi_{\bar h}(-1/\bar\tau) = \frac{\tilde q^{\,h-k+1/24}\;\bar{\tilde q}^{\,\bar h-k+1/24}}{\eta(-1/\tau)\,\bar\eta(-1/\bar\tau)} , \qquad \tilde q \equiv q(-1/\tau) , $$
and decays as a power law at late times, regardless of the conformal dimensions. (In writing these equations we assumed for simplicity that neither h nor $\bar h$ is equal to zero, i.e. we are excluding additional conserved currents.) We arrive at the following conclusion: including a finite number of characters in the S-frame does not bring us closer to obeying the bound (2.6).
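The $1/t^6$ decay of the S-frame vacuum character can be verified numerically. The sketch below makes two assumptions beyond the text: it uses the convention $\bar q = e^{-2\pi i\bar\tau}$, in which the continued antiholomorphic vacuum factor equals the holomorphic one so that the form factor is $|\chi_0(-1/\tau)|^4$, and it samples the times t = 2πn, where $|\eta(\tau)|$ is constant and the log-log slope is clean. Function names and parameter values are illustrative.

```python
import cmath
import math

def eta(tau, tol=1e-18, max_terms=100000):
    """Dedekind eta: q^(1/24) * prod_{n>=1} (1 - q^n), q = exp(2 pi i tau), Im(tau) > 0."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1.0 + 0.0j, q
    for _ in range(max_terms):
        prod *= 1.0 - qn
        qn *= q
        if abs(qn) < tol:
            break
    return cmath.exp(2j * math.pi * tau / 24.0) * prod

def chi_vac_S(tau, k):
    """chi_0(-1/tau), using eta(-1/tau) = sqrt(-i tau) eta(tau) for fast convergence."""
    x = -2j * math.pi / tau                      # 2 pi i * (-1/tau)
    q_tilde = cmath.exp(x)
    return (cmath.exp(x * (-k + 1.0 / 24.0)) * (1.0 - q_tilde)
            / (cmath.sqrt(-1j * tau) * eta(tau)))

def g_vac_S(beta, t, k):
    """S-frame vacuum contribution to the form factor after beta -> beta + i t."""
    tau = 1j * (beta + 1j * t) / (2.0 * math.pi)
    return abs(chi_vac_S(tau, k)) ** 4

beta, k = 1.0, 2.0                               # illustrative values
t1, t2 = 2.0 * math.pi * 200, 2.0 * math.pi * 400
g1, g2 = g_vac_S(beta, t1, k), g_vac_S(beta, t2, k)
slope = math.log(g2 / g1) / math.log(t2 / t1)    # should be close to -6

print(g1, g2, slope)
```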

Universal late time decay
In this section we will attempt to understand universal properties of the late time partition function in AdS$_3$/CFT$_2$. For gravity in weakly curved AdS$_3$ the partition function undergoes a phase transition between the dominant low temperature saddle, thermal AdS$_3$, and the high temperature saddle, the BTZ black hole [27]. The partition function in these two regimes is given approximately by
$$ Z(\beta) \approx \begin{cases} e^{2k\beta} , & \beta > 2\pi , \\ e^{8\pi^2 k/\beta} , & \beta < 2\pi . \end{cases} $$
This phase structure is replicated in sufficiently sparse, large c CFTs [24]. As long as the number of states grows sub-exponentially, the partition function is dominated by the vacuum state at low temperatures, and by the vacuum state in the S frame at high temperatures. As discussed above, starting with the dominant high temperature contribution and continuing β analytically to real time does not reproduce the correct late time behavior. The spectral form factor satisfies the bound (2.6), while the thermal partition function corresponding to the BTZ black hole leads to a decaying spectral form factor (3.9). This contribution decays exponentially to an O(1) amplitude at times of order $\sqrt{k}$. As we will show, this decay significantly underestimates the correct late time behavior of the partition function.
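The location of the transition follows from comparing the two leading exponents. A minimal sketch, assuming only the vacuum-state estimates $\log Z \approx 2k\beta$ (thermal AdS$_3$) and $\log Z \approx 8\pi^2 k/\beta$ (BTZ); the helper names are hypothetical:

```python
import math

def log_z_thermal_ads(beta, k):
    """Leading exponent of the vacuum state in the original frame."""
    return 2.0 * k * beta

def log_z_btz(beta, k):
    """Leading exponent of the vacuum state in the S frame."""
    return 8.0 * math.pi ** 2 * k / beta

def dominant_saddle(beta, k):
    """Name of the saddle with the larger leading exponent."""
    return "BTZ" if log_z_btz(beta, k) > log_z_thermal_ads(beta, k) else "thermal AdS3"

k = 2.0                                   # illustrative value; the crossover does not depend on k
beta_hp = 2.0 * math.pi                   # self-dual point of beta -> 4 pi^2 / beta
gap_at_hp = log_z_btz(beta_hp, k) - log_z_thermal_ads(beta_hp, k)

print(dominant_saddle(1.0, k), dominant_saddle(10.0, k), gap_at_hp)
```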
In section 4.1 we identify a universal contribution to the partition function which decays significantly slower than (3.9). Then, in section 4.2 we estimate corrections to the universal decay using Cardy's formula, and find that they are negligible in this approximation (though with important caveats). We also show that the free compact boson exhibits the universal early time behavior identified in section 4.1. In section 4.3 we give a refined version of the universal contribution to the partition function for all times and temperatures. This, together with the late time plateau for the spectral form factor, lends evidence to a universal picture for the time dependence of the partition function that we lay out in section 4.4.

Universal contribution
The partition function (3.5) expanded in the S frame is dominated by the vacuum character at t = 0. This suggests a strategy for approximating the partition function at later times: at any given time, identify the appropriate modular transformation such that the image of the vacuum character in this frame is larger than in any other frame.
Consider the partition function at times $t_n \equiv 2\pi n$, with corresponding modular parameters
$$ \tau_n = \frac{i\beta}{2\pi} - n , \qquad \bar\tau_n = -\frac{i\beta}{2\pi} + n . $$
To study the partition function at these discrete times, it is convenient to perform a time-dependent modular transformation $\gamma_n(\tau) \equiv -1/(\tau + n)$.
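This transformation removes all of the holomorphic time dependence. It maximizes the contribution from the vacuum character among all modular transformations. Explicitly, the vacuum character in the $\gamma_n$ frame is given by
$$ Z_{\rm vac}(\gamma_n(\tau_n), \gamma_n(\bar\tau_n)) = \chi_0\!\left(\frac{2\pi i}{\beta}\right) \bar\chi_0\!\left(-\frac{2\pi i}{\beta + 4\pi i n}\right) . $$
It decays at late times (large n), up to β-dependent prefactors, as
$$ Z_{\rm vac}(\gamma_n(\tau_n), \gamma_n(\bar\tau_n)) \propto e^{4\pi^2 k/\beta}\; t_n^{-3/2} . \tag{4.6} $$
Notice that the vacuum state itself decays in this frame to the exponentially large value $e^{4\pi^2 k/\beta}$, which is much larger than the asymptotic value of the vacuum state in the S frame.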
The power law decay is due entirely to the O(1/c) piece of the vacuum character. The corresponding contribution to the spectral form factor behaves as
$$ g_n(\beta, t_n) \equiv \left| Z_{\rm vac}(\gamma_n(\tau_n), \gamma_n(\bar\tau_n)) \right|^2 \propto e^{8\pi^2 k/\beta}\; t_n^{-3} . \tag{4.7} $$
Figure 3. The spectral form factors corresponding to the BTZ black hole contribution $g_{\rm BTZ}(\beta, t)$ (blue) and corresponding to the dominant image of the vacuum $g_n(\beta, t)$ (red). Here, for $t \neq 2\pi n$, we interpolate by taking n = integer part(t/2π). This accounts for the discontinuities in the red, dashed line. The peaks of this contribution are attained at discrete times $t_n$ (purple dots). Going to the late dominant frame does not avoid late time decay, violating the late time bound (2.6) (black, dotted). Inset: the dominant contribution at $t_n$ (purple) with a fit to a $t^{-3}$ power law (black).
Figure 3 shows $g_n(\beta, t)$ compared with the late time bound (2.6) and the decay from the vacuum character in the S frame. Notice that the amplitude of the power law decay in (4.7) is in fact greater than the value of the late time bound (2.6), $Z(2\beta) \approx \exp\left(\frac{4\pi^2 k}{\beta}\right)$. Next, let us consider the contribution of a non-vacuum character $Z_{h,\bar h}(\tau, \bar\tau) = \chi_h(\tau)\,\bar\chi_{\bar h}(\bar\tau)$. We assume that the state is 'light', namely that the conformal weights $h, \bar h$ are fixed as we take k large. We will also assume for simplicity that there are no extra currents, i.e. $h, \bar h$ are both strictly positive. At time $t_n$ the $\gamma_n$ frame again maximizes the contribution of the character among all SL(2, Z) frames. At late times this contribution to the partition function decays as a power law, proportional to $t_n^{-1/2}$. The vacuum character (4.6) decays faster because it has an additional (1 − q) factor that decays as $1/t_n$. The matter character contribution to the spectral form factor behaves at late times as
$$ \left| Z_{h,\bar h}(\gamma_n(\tau_n), \gamma_n(\bar\tau_n)) \right|^2 \propto t_n^{-1} . \tag{4.9} $$
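The behavior of the vacuum image in the $\gamma_n$ frame can be checked numerically. The sketch below assumes the convention $\bar q = e^{-2\pi i\bar\tau}$, under which the antiholomorphic factor at $\gamma_n(\bar\tau_n)$ equals the holomorphic vacuum character evaluated at $2\pi i/(\beta + 4\pi i n)$; function names and parameter values are illustrative. It estimates the log-log slope of $g_n$ at large n and compares the amplitude with the late time bound $Z(2\beta) \approx e^{4\pi^2 k/\beta}$.

```python
import cmath
import math

def eta(tau, tol=1e-18, max_terms=100000):
    """Dedekind eta: q^(1/24) * prod_{n>=1} (1 - q^n), q = exp(2 pi i tau), Im(tau) > 0."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1.0 + 0.0j, q
    for _ in range(max_terms):
        prod *= 1.0 - qn
        qn *= q
        if abs(qn) < tol:
            break
    return cmath.exp(2j * math.pi * tau / 24.0) * prod

def chi_vac_S(tau, k):
    """chi_0(-1/tau), using eta(-1/tau) = sqrt(-i tau) eta(tau)."""
    x = -2j * math.pi / tau
    return (cmath.exp(x * (-k + 1.0 / 24.0)) * (1.0 - cmath.exp(x))
            / (cmath.sqrt(-1j * tau) * eta(tau)))

def g_n(beta, n, k):
    """|Z_vac|^2 in the gamma_n frame at t_n = 2 pi n."""
    hol = chi_vac_S(1j * beta / (2.0 * math.pi), k)            # chi_0(2 pi i / beta), time independent
    tau_shifted = 1j * (beta + 4j * math.pi * n) / (2.0 * math.pi)
    anti = chi_vac_S(tau_shifted, k)                            # chi_0(2 pi i / (beta + 4 pi i n))
    return abs(hol * anti) ** 2

beta, k = 1.0, 2.0                                              # illustrative values
g200, g400 = g_n(beta, 200, k), g_n(beta, 400, k)
slope = math.log(g400 / g200) / math.log(2.0)                   # should be close to -3
bound = math.exp(4.0 * math.pi ** 2 * k / beta)                 # Z(2 beta) in the BTZ approximation

print(slope, g400 / bound)
```

For these parameters the peaks are still far above the bound at n = 400, in line with the observation that the amplitude of the power law exceeds Z(2β); the decay nevertheless continues, so the bound is eventually violated.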

In section 4.3 we will generalize these considerations and find a universal contribution to the spectral form factor for arbitrary rational times. The result will be bounded above by (4.9) if we replace $t_n$ by a rational time. We conjecture that the universal contributions from the vacuum (4.7) together with the contribution from the light states (4.9) correctly describe the spectral form factor for generic non-integrable CFTs up to the dip time. For a putative CFT that is dual to pure gravity there are no light matter fields, and we conjecture that the correct description is given by (4.7). We provide an argument for this in the next subsection. As discussed in the introduction, beyond the dip time we expect another universal contribution, one due to random matrix theory, to become dominant and lead to a ramp and a plateau.
This universal contribution we have identified, (4.9), has a nice connection with classical bulk saddles. As we review in appendix A, for each n there is a black hole solution in the bulk, with the contribution $Z_{\rm vac}(\gamma_n(\tau), \gamma_n(\bar\tau))$ to the gravitational partition function. We can thus identify the universal decay of the spectral form factor with the contribution of these black hole solutions.

Dominance of the universal contribution
In the previous subsection we identified a universal contribution to the partition function. It is natural to ask for what class of theories (if any) this contribution correctly describes the early time behavior of the partition function. In this section we show that this is the case for at least one theory. We then argue that the universal contribution is the dominant one at early times in a large class of theories.
First, consider the theory of a free compact scalar with internal radius r. [fn 9] The partition function (with $\tau = \frac{i\beta}{2\pi}$) is given by eq. (4.10). The spectral form factor has a period that is determined by the radius r. By choosing r appropriately (either very large or very small) one can have a long period, exposing the universal behavior (4.9) at early times. The result is shown in figure 4. [fn 10] We now argue that the universal contribution identified in the previous subsection provides a good approximation to the partition function before the dip time even in generic 2d CFTs, namely before the universal contribution due to random matrix theory becomes dominant. The argument has important caveats that will be discussed below.
[fn 9] We are thankful to Alexandre Belin for useful discussions.
[fn 10] Note that, due to the small central charge, c = 1, the behavior of the initial decay is somewhat different from the universal large k behavior identified previously. As in that case, the heights of the peaks of the partition function exhibit an initial power law decay. In this case, however, there is no exponential decay. Nevertheless, the behavior is reproduced by the modular images of the free bosonic vacuum character. The difference stems from the fact that the free boson vacuum character looks different from the c > 1 Virasoro vacuum: $Z^{c=1}_{\rm vac}(\gamma_n(\tau_n), \gamma_n(\bar\tau_n)) = 1/\big(\eta(\gamma_n(\tau_n))\,\bar\eta(\gamma_n(\bar\tau_n))\big)$, which has a pure power law decay at large n.
Figure 4. The spectral form factor g(t) of a compact scalar as a function of t, with c = 1, β = 0.5, r = 0.005, displaying the behavior (4.9) of the universal contribution at times t = 2πn. The subleading peaks at times t = 2πn + π can also be explained using universal properties such as modular invariance, as will be discussed in section 4.3. The dashed lines show the 1/t envelope of the leading and subleading peaks.
Focusing again on the discrete times $t_n = 2\pi n$, the full partition function can be written as a sum over states in the $\gamma_n$ frame, eq. (4.11). The factor in front on the right-hand side of (4.11) is equal to the vacuum state contribution in the $\gamma_n$ frame. This is the amplitude of the universal contribution (4.6). Our goal is to argue that the sum (4.11) is well approximated by the universal contribution (4.6) until the dip time. We begin by explaining why the sum over the heavy states gives a subdominant contribution to the partition function, and then why the light states and descendants reproduce the amplitude and power law decay of (4.6). In the correction to the leading amplitude in (4.11), we separate the sum over all states into a sum $f_L$ over 'light' states and a sum $f_H$ over 'heavy' states. Let us discuss these two sums separately.
Heavy states. We consider first the sum over heavy states, which we can write as in (4.13).

Here $\hat h \equiv h - k$, and $\rho(\hat h, \hat{\bar h})$ is the density of heavy states. This density of states can be approximated by the Cardy density $\rho_c$ [28], which is defined by the equation (4.14). The integral on the right of (4.14) is exactly the integral that appears on the right-hand side of (4.13) if we approximate the full density of states $\rho$ by the Cardy density $\rho_c$, and replace $\tau = 4\pi^2 i/\beta$ and $\bar\tau = -4\pi^2 i/(\beta + 2 i t_n)$. Therefore, in the Cardy approximation $f_H$ can be evaluated directly. In the large k, high temperature limit we see that $f_H \ll 1$ at arbitrarily late times, and so the contribution from the heavy states cannot significantly change the amplitude in (4.11).
It is instructive to verify that this suppression of heavy states does not rely on detailed properties of the Cardy distribution. In the solution to (4.14), the last line is an expansion to leading order in large $\hat h$. It is easy to check that this leading piece (including the $\hat h^{-3/4}$ factor) also leads to a suppressed contribution from the heavy states.
We now mention an important caveat regarding the argument above, which relies on the assumption that the density of states is well approximated by the Cardy density for heavy states. One can apply the same argument to the S image of the vacuum at time $t = 2\pi n$ (instead of to the $\gamma_n$ image), and again conclude that the corrections to the image of the vacuum state due to heavy states are negligible. But the S image of the vacuum simply decays in time and does not exhibit the peaks seen in the $\gamma_n$ frame, leading to a contradiction. Perhaps the simplest resolution of this problem is that there are subleading corrections to the Cardy density that reproduce the peaks when working in the S frame. We showed that the detailed properties of the Cardy density are not important for the argument to work, and so such corrections should take a special form. We hope to return to this question in future work.
Light states. The contribution from light states is more subtle. To constrain the contribution of the light states, we would like to appeal to sparsity. In other words, we would like to consider theories without too many light states. However, we always have, at the very least, Virasoro descendants of the vacuum. As the light state contribution has no suppression, it is difficult to argue that the light states give an O(1) contribution at late times. Indeed, if this were the case, it would contradict the power law decay of our universal contribution (4.9). To address this fact, and to give teeth to the assumption of sparsity, we turn our attention to the expansion of the partition function in terms of characters rather than states.
Light and heavy characters. The universal contribution (4.9) contains an amplitude and a subleading power-law decay, which comes from summing over descendants. The descendants include heavy states which contribute to the Cardy relation (4.14). To show the dominance of the full contribution (4.9) (including the power law decay) we re-expand the partition function in characters instead of in states, in the $\gamma_n$ frame. We define $\sigma_n \equiv \gamma_n(\tau_n)$ and $\bar\sigma_n \equiv \gamma_n(\bar\tau_n)$ to reduce clutter.
Here $\rho_\chi(\hat h, \hat{\bar h})$ denotes the density of characters with conformal dimensions $(h, \bar h)$, and we took out factors of $q^k$ as in (4.13). As before, $\hat h \equiv h - k$. The term $\chi_0(\sigma_n)\chi_0(\bar\sigma_n)$ is the universal vacuum contribution (4.7). The sum on the second line is the contribution from light characters. The primaries we are describing as light here consist of any state with either $h$ or $\bar h$ smaller than k. These are referred to as censored primaries in [21]. One way to justify limiting the number of such states is that those with either $h \ll \bar h$ or $\bar h \ll h$ are close to conserved currents, and we expect there to be few such states in a typical chaotic CFT. More generally, we would like to consider CFTs that are dual to gravitational theories without too much matter. For us, sparseness means simply that the contribution from these light primaries is well approximated by the vacuum character, with at most an order one number of additional light primaries.¹¹
Finally, on the last line we have the contribution of the heavy characters, which we claim is negligible in the Cardy approximation. We can approximate the density of the heavy characters by a Cardy density $\rho_\chi \approx \rho_{\chi,c}$, which is defined by the equation (4.20). As in the case of heavy states, the integral on the right-hand side is the same integral that appears in (4.19), and the same argument implies that this contribution will be negligible.
It is worth briefly connecting this argument back to the case of the free boson and discussing what role a sparse light spectrum played in getting the early time universal behavior seen in figure 4. This behavior manifests itself when the radius of the boson is taken to be very large, or very small, providing a long enough period to see the power law decay. For simplicity, let us focus on the case $r \ll 1$. In this case the winding modes, $m \neq 0$ in (4.10), are parametrically heavy. The contribution of the light modes then exhibits the same power law decay as the vacuum. Here we can view the $r \to 0$ limit as producing a sparser light spectrum by decoupling the winding modes. If instead we take $r \sim 1$, so that there is no separation between momentum and winding modes, there is no early time window for which the partition function decays.
The arguments above seem to imply a decaying spectral form factor at arbitrarily late times, but we know that they must fail at some point in order for the lower bound (2.6) on the plateau height to be satisfied. In particular, the assumption that the density of characters is well approximated by the Cardy density becomes invalid at sufficiently late times. The left-hand side of (4.20) includes only the vacuum state. In the full theory the left-hand side includes other states, whose contribution becomes important at late times.
¹¹ Note, this is more strict than what is sometimes imposed (see [24] for instance), and requires a separation of scales between the AdS length and the string scale in the bulk.
In this work we assume that at late times the only important physical effects are the universal decay before the dip time, and the random matrix theory behavior of a ramp + plateau beyond it. This is equivalent to assuming that the density of characters ρ χ is well approximated by the Cardy density until the dip time.

Rational times and hot saddles
So far we focused on the discrete times $t_n = 2\pi n$. The story at generic times is slightly more elaborate. We begin by considering the times $t_{n+1/2} = 2\pi(n + 1/2)$, $n \in \mathbb{Z}$, and the corresponding modular parameters $\tau_{n+1/2}$ and $\bar\tau_{n+1/2}$. There are now two modular transformations of the vacuum that vie for dominance at high temperatures: $\gamma_n$ and $\gamma_{2,2n+1}$, where we define $\gamma_{c,d}(\tau) \equiv (a\tau + b)/(c\tau + d)$ (where a, b are uniquely determined from c, d). Indeed, we have our previous choice, $\gamma_n$, and we have the competing modular frame, $\gamma_{2,2n+1}$.
Figure 5. Here, we show the upper half-plane tiled by fundamental domains of SL(2; Z). As we increase the temperature, which corresponds to lowering the red line, we cross more and more fundamental domains.
At late times (large n) we compare the two contributions. For sufficiently high temperature, $\beta < \pi/\sqrt{3}$, the second contribution is larger and gives the dominant contribution, while for $\pi/\sqrt{3} < \beta < 2\pi$ the first contribution dominates. More generally, for any rational time, $t_{n/m} = 2\pi n/m$, there exists an inverse temperature, $\beta_{m,n}$, such that for $\beta < \beta_{m,n}$, the vacuum in the modular frame $\gamma_{m,n}$ gives a bigger contribution than the vacuum in any other frame.
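The competition between the two frames can be checked numerically with a minimal sketch. It assumes (this is not spelled out in the surviving text) that at late times each vacuum image is controlled by its leading exponential, so that the size of the contribution of a frame with bottom row (c, d) is governed by Im γ(τ) = Im τ / |cτ + d|², with τ = i(β + it)/2π. The function names are hypothetical.

```python
import math

def vacuum_exponent(c, d, tau):
    """Im(gamma(tau)) for a modular frame with bottom row (c, d).

    Assumption: the vacuum image is dominated by exp(2*pi*k*Im(gamma(tau))),
    with descendants and the anti-holomorphic factor dropped.
    """
    return tau.imag / abs(c * tau + d) ** 2

def compare_frames(beta, n):
    """Exponents of gamma_n and gamma_{2,2n+1} at t = 2*pi*(n + 1/2)."""
    t = 2 * math.pi * (n + 0.5)
    # tau = i(beta + i t)/(2 pi)  =>  Re(tau) = -t/(2 pi), Im(tau) = beta/(2 pi)
    tau = complex(-t / (2 * math.pi), beta / (2 * math.pi))
    first = vacuum_exponent(1, n, tau)            # frame gamma_n
    second = vacuum_exponent(2, 2 * n + 1, tau)   # frame gamma_{2,2n+1}
    return first, second

# The two exponents cross at beta = pi/sqrt(3), independently of n.
beta_c = math.pi / math.sqrt(3)
print(compare_frames(0.9 * beta_c, n=7))  # second entry is larger
print(compare_frames(1.1 * beta_c, n=7))  # first entry is larger
```

Under these assumptions the crossover temperature follows from comparing y/(1/4 + y²) with 1/(4y) for y = β/2π.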
We can understand this from the $\Gamma_\infty \backslash SL(2;\mathbb{Z})$ tiling of the upper half plane, see figure 5. As we increase temperature, we decrease Im(τ), and intersect more and more fundamental domains. Each such fundamental domain corresponds to a different modular image of the vacuum dominating. At a given temperature, we can refine our identification of the universal contribution to the partition function, $Z_*(\beta, t_{n/m}) \equiv \chi_0(\gamma_*(\tau_{n/m}))\,\chi_0(\gamma_*(\bar\tau_{n/m}))$. (4.26) Here $\gamma_*$ is the modular transformation that maximizes the vacuum character contribution at given temperature and time.¹² At high temperatures, (4.26) gives a complicated contribution to the partition function. See figure 6 for an example. At late times, it is easy to check that taking the decaying result, $e^{8\pi^2 k/\beta}/t_n^3$, for the spectral form factor, and replacing $t_n$ by an arbitrary time $t_{n/m}$, leads to a result that is always greater than or equal to $|Z_*|^2$.
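The choice of dominant frame can be illustrated with a small brute-force search. This is only a sketch under the same leading-exponential assumption as above: the frame is selected by maximizing Im γ(τ) over coprime pairs (c, d), with a hypothetical cutoff `c_max` on c. It is not the exact argmax over full characters described in the text.

```python
from math import gcd, pi

def dominant_frame(beta, t, c_max=30):
    """Return the coprime pair (c, d) maximizing Im(tau)/|c*tau + d|^2.

    Assumptions: tau = i(beta + i t)/(2 pi); only the leading exponential
    of the vacuum image is compared; c is cut off at c_max (hypothetical).
    """
    tau = complex(-t / (2 * pi), beta / (2 * pi))
    best_pair, best_im = (0, 1), tau.imag      # identity frame
    for c in range(1, c_max + 1):
        d0 = round(-c * tau.real)              # d that nearly cancels Re(c*tau)
        # a window of width 2c always contains integers coprime to c
        for d in range(d0 - c, d0 + c + 1):
            if gcd(c, abs(d)) != 1:
                continue
            im = tau.imag / abs(c * tau + d) ** 2
            if im > best_im:
                best_pair, best_im = (c, d), im
    return best_pair

print(dominant_frame(0.05, 2 * pi * 2 / 3))  # rational time 2*pi*(2/3)
print(dominant_frame(0.5, 2 * pi * 5))       # integer time 2*pi*5
```

At high temperature and rational time 2πn/m the search picks c = m, in line with the statement that the (m, n) frame dominates for β below some β_{m,n}.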
¹² Explicitly, given $\tau_{n/m}, \bar\tau_{n/m}$ it is defined by $\gamma_* \equiv \mathrm{argmax}_{\gamma_{m,n}} |\chi_0(\gamma_{m,n}(\tau_{n/m}))\,\chi_0(\gamma_{m,n}(\bar\tau_{n/m}))|^2$.
Figure 6. Here, in the top line, we display the behavior of our universal contribution, $g_*(\beta, t)$, at various temperatures. On the bottom line, for comparison, we display the spectral form factor for a sample modular invariant function, $\psi_2(\tau)$, defined in section 5. As we increase temperature both are controlled by more and more saddles.

Dip time estimate
In this section we derive an upper bound on the time at which the spectral form factor of a generic chaotic CFT is expected to cross over to random matrix theory behavior. We call this the dip time. The derivation assumes that the universal contribution computed in previous sections correctly describes the late time behavior of the spectral form factor up to exponentially late times, right up to the dip time $t_d$. The universal contribution, which we shall call the slope, is bounded from above by (4.28), where s = 3 for the vacuum character in the $\gamma_n$ frame, and s = 1 for non-vacuum characters (where both $h, \bar h$ are non-zero). While the result (4.28) was derived for the discrete times $t_n = 2\pi n$, as we saw in section 4.3 it provides an upper bound on the universal contribution, and that will suffice for the purpose of deriving a bound.¹³
The decaying contribution (4.28) cannot be the full answer for a theory with a discrete spectrum at arbitrarily late times, because it violates the bound (2.6). Going to late times in the spectral form factor is equivalent to probing small energy differences in the spectrum. We expect the properties of the spectrum at small energy differences (and therefore the behavior of the spectral form factor at very late times) to be governed by random matrix theory [10]. As described in section 1.2, random matrix theory gives another universal contribution. While this contribution is expected to have large fluctuations, on average its behavior is relatively simple. Roughly speaking, it grows linearly in time until the plateau time $t_p$, beyond which it levels off at its asymptotic value, which we shall denote $g_p$.
In this section we estimate the dip time $t_d$, which is the crossover time from the universal decay of (4.28) to the random matrix theory behavior. We find that the ratio $t_p/t_d$ is exponentially large in k, which implies that there is a long period during which we expect the spectral form factor to grow linearly (on average) in a generic theory.
¹³ Notice that the universal contribution at non-integer times is exponentially smaller in k than (4.28). Therefore, in practice we expect the random matrix theory contribution (the ramp) to 'peek through' at non-integer times even before our estimate of the dip time. We thank Steve Shenker for pointing this out.
To get the late time behavior of the ramp and the plateau, recall that the thermodynamic partition function is given by the BTZ black hole partition function, $Z(\beta) = e^{8\pi^2 k/\beta}$. The plateau height $g_p$ is bounded below by $Z(2\beta)$ (it can be pushed higher by degeneracies, which we ignore for now).
The plateau time $t_p$ can be approximated by counting the available states at $2\beta$.¹⁴ The ramp grows linearly in time, and should reach the plateau height at the plateau time.
This fixes the spectral form factor on the ramp, $g_{\rm ramp}(t)$. The dip time $t_d$ is defined by $g_{\rm slope}(t_d) = g_{\rm ramp}(t_d)$. For both the vacuum and matter contributions it is parametrically smaller than the plateau time; see (4.33).
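The logic of this estimate can be traced with a short sketch that tracks only exponents, in units of π²k/β. Every input below is an assumption reconstructed from the surrounding prose rather than a quoted equation: the slope is taken as $e^{8\pi^2 k/\beta}/t^s$ (the decaying result quoted in section 4.3), the plateau height as $Z(2\beta) = e^{4\pi^2 k/\beta}$, and the plateau time as $e^{S(2\beta)}$ with $S = (1 - \beta\partial_\beta)\log Z$, so that $\log t_p = 8\pi^2 k/\beta$. Order-one prefactors are dropped throughout.

```python
from fractions import Fraction

def log_dip_and_plateau_time(s):
    """Return (log t_d, log t_p) in units of pi^2 k / beta.

    Assumptions (not quoted from the text, prefactors dropped):
      g_slope(t) ~ exp(8u) / t**s        with u = pi^2 k / beta
      g_p        ~ Z(2 beta) = exp(4u)
      t_p        ~ exp(S(2 beta)) = exp(8u)
      g_ramp(t)  = g_p * t / t_p         (linear ramp reaching g_p at t_p)
    """
    log_slope_amplitude = Fraction(8)
    log_plateau_height = Fraction(4)
    log_tp = Fraction(8)
    # Solve exp(8u)/t^s = exp(4u) * t / exp(8u) for log t.
    log_td = (log_slope_amplitude - log_plateau_height + log_tp) / (s + 1)
    return log_td, log_tp

for s in (3, 1):  # vacuum (s = 3) and non-vacuum (s = 1) characters
    log_td, log_tp = log_dip_and_plateau_time(s)
    print(s, log_td, log_tp, log_tp - log_td)
```

Within this sketch the difference log t_p − log t_d is positive and proportional to k for both values of s, which is the sense in which the ramp lasts for an exponentially long time.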

Fine spectral probe
As we have seen, discreteness of the spectrum in the original SL(2; Z) frame is a necessary and sufficient condition for the partition function not to decay at late times. However, modular invariance means that we should be able to present the partition function as a sum over states in any SL(2; Z) frame.
In other frames, discreteness of the spectrum is not sufficient to guarantee the correct late time behavior. For instance, a discrete set of states in the BTZ frame may certainly decay. Thus, the late time behavior probes slightly different features of the spectrum when viewed in each frame. Of course, if we have a modular invariant spectrum these are all equivalent, but if one doesn't know a priori that a given spectrum is modular invariant, the late time behavior in other frames provides a detailed probe. To demonstrate this phenomenon, consider the time dependence depicted in figure 7, where we compare the exact partition function to the behavior of an approximate partition function built out of a discrete spectrum with exponentially small modifications to the degeneracies. For long enough times, these two putative partition functions diverge despite the similarity in their spectra. In this way, the time dependence in different frames probes detailed aspects of the CFT spectrum.
¹⁴ The factor of 2 comes from the two terms in the exponent $e^{-\beta(E_n + E_m)}$ that appears in the sum over energy states.

Information restoration in integrable theories
So far we have discussed information loss in chaotic CFTs. In section 4 we have identified a decaying universal contribution to the spectral form factor, and commented on the expected late time behavior from random matrix theory. In integrable theories we can say significantly more about the time dependence of the spectral form factor.¹⁵ Such theories are not chaotic and are not described by random matrix theories at small energy differences. Therefore, their spectral form factors do not exhibit a dip, ramp, and plateau at late times. Nevertheless, such theories do exhibit information loss at the level of individual Virasoro characters: each Virasoro character still decays to zero at late times. It is interesting to ask how information is restored in these simpler cases.
In this section we will answer this question for chiral CFTs. The existence of chiral CFTs with large central charge that are dual to some form of semiclassical gravity is somewhat speculative [29][30][31][32][33][34][35][36][37]. Here we will work under the assumption that such theories do exist, and that they have a sensible bulk interpretation (though the calculation itself will be done purely in field theory).
We will identify a set of modular transformations whose vacuum images are sufficient to restore information. In generic non-chiral theories, the same set of transformations is responsible for the universal late time decay discussed in section 4. In chiral theories, these transformations are enough to avoid the late time decay.

While we will focus on chiral theories, we note that much of what we say here also applies to holomorphic objects in general non-chiral theories, such as the elliptic genus which counts BPS states in theories with N = (1, 1) supersymmetry.
We now turn to a brief review of the properties of chiral CFTs. In two spacetime dimensions, the vector representation of the Lorentz group is reducible into left-moving and right-moving representations. Chiral conformal field theories are theories of purely left-moving degrees of freedom in Lorentzian signature, or purely holomorphic fields in Euclidean signature. The symmetry algebra of these theories contains a single left-moving copy of the Virasoro algebra, and correspondingly a chiral CFT is labeled by a single central charge c. Operators are labeled by a single conformal dimension, h = ∆ = J, where J is the spin.
The torus partition functions of chiral CFTs can be written in a similar fashion to those of a generic 2d CFT.
We will again focus on the case of modular invariant theories. As above, we will focus on sparse theories with $N_h \lesssim e^{2\pi h}$, for which the thermal partition function undergoes a sharp phase transition in temperature.
At high temperature the BTZ contribution dominates, with $\tau = i\beta/2\pi$ as before.¹⁶ We again analytically continue $\beta \to \beta + it$, with the modular parameter given by (5.5).
¹⁶ We are calling this the 'BTZ partition function' because it is dual to the contribution from the BTZ configuration in chiral gravity. See appendix A for details.

We consider the spectral form factor $g(\beta, t) = |Z(\beta + it)|^2$. Just as in the non-chiral case, the BTZ contribution decays to zero at late times. We see that we have a phenomenon of information loss even in chiral theories.
It is now easy to see how information is restored. The partition function is manifestly 2π-periodic in time as a result of modular invariance, $Z(\tau) = Z(\tau + 1)$. At time $t_n = 2\pi n$, $n \in \mathbb{Z}$, the partition function is dominated by the modular image $\chi_0(\gamma_n(\tau_n))$ of the BTZ contribution. This image is simply equal to $\chi_0(-1/\tau_0)$ due to the periodicity. As advertised, the modular transformation at time $t_n$ is the same one that gives the universal late time decay discussed in section 4.

Saddle point expansion
Our next goal is to describe, in bulk language, the mechanism by which information is restored. The modular-invariant partition function includes contributions from SL(2, Z) images of the vacuum character. They are dual to a family of black holes in the bulk. In this section we will explain that the partition function can be written as a sum over these saddle point contributions. This description of the partition function is evocative of a bulk path integral. In the next section we will discuss how information is restored in this saddle point expansion, and what this may teach us about the bulk.
As mentioned above, meromorphic modular invariant functions are entirely fixed by their poles and their constant term. For a chiral CFT, this means that the full partition function is fixed by the light spectrum: those states with h ≤ k. Here the generating function for the light states is denoted by $Z_L$. The way in which the spectrum of heavy states is fixed is relatively simple, and goes back to the work of Rademacher [38,39].¹⁷ We would like to complete $Z_L(\tau)$ into a fully modular invariant function. One way to do this is to sum over the modular group, SL(2; Z). One generator, $\tau \to \tau + 1$, acts trivially on q, so we only actually need to sum over $\Gamma_\infty \backslash SL(2;\mathbb{Z})$.
Here, the sum runs over the elements $\gamma$, which can be parameterized by the pair (c, d) satisfying gcd(c, d) = 1. The sum is primed to indicate that a regularization is needed. There is some freedom in how to regularize, but choices that preserve modular invariance can differ by at most an additive constant.¹⁸ The sum takes on a particularly attractive meaning when thought of in the context of large k CFTs dual to large radius gravity.
It is tempting to identify this sum with the sum over bulk geometries. In this description the first and second terms correspond to the vacuum and BTZ black hole respectively, and the remaining terms correspond to the subleading geometries $M_{c,d}$ and their appropriate generalization for gravitational theories with matter. As we review in appendix A, this can be made precise in the context of chiral gravity.

Late time behavior in saddle point expansion
Equipped with our expression of the partition function as an infinite sum over saddles, (5.8), we can gain more insight into how the thermal partition function avoids late time decay. Initially, at high temperatures, the partition function is well approximated by the BTZ contribution.
This contribution, however, quickly begins to underestimate the partition function. Focusing on times $t \approx t_n = 2\pi n$ and taking n > 0, the dominance of the BTZ saddle is eclipsed by the appropriate saddle, labeled by (c, d) = (1, n).
For each integer n the given saddle goes from subdominant to dominant and then exponentially decays again. Only by summing over this infinite class of saddles do we get a partition function that exhibits the appropriate, non-decaying behavior, see figure 8. For non-integer times, we again have the spaghetti-like behavior of section 4.3. For each time $t_{n/m} = 2\pi n/m$ there is a phase transition such that for all $\beta < \beta_{m,n}$ we are dominated by the (m, n) saddle. In this way, reproducing the correct late time behavior at all temperatures depends crucially on including the appropriate set of saddles.
¹⁸ One simple way to regularize is to promote $Z(\tau)$ from a modular invariant function to a modular form of weight w, $Z_w(\gamma(\tau)) = (c\tau + d)^w Z_w(\tau)$, with $Z_w(\tau) = \sum_{\gamma \in \Gamma_\infty \backslash SL(2;\mathbb{Z})} Z_L(\gamma(\tau))/(c\tau + d)^w$. The partition function $Z(\tau)$ is then defined by analytic continuation.
Figure 8. The spectral form factor $g(\beta, t)$ (dashed-dotted), and the contribution of six individual saddles $g_n(\beta, t)$, n = 0, . . . , 5 (solid lines). Each individual saddle $g_n$ is dominant around $t = t_n$ and exponentially sub-dominant at other times.
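The handover between saddles can be made concrete with a minimal sketch. It assumes that each saddle (c, d) = (1, n) contributes only its leading exponential, $|q_\gamma^{-k}|^2$ with $q_\gamma = e^{2\pi i \gamma(\tau)}$ and $\tau = i(\beta + it)/2\pi$, which gives $\log g_n = 8\pi^2 k \beta/(\beta^2 + (t - 2\pi n)^2)$. Descendant contributions and the regularization of the sum are ignored, and the function name is hypothetical.

```python
import math

def log_g_saddle(n, beta, t, k=1.0):
    """Leading exponent of the (c, d) = (1, n) saddle in g = |Z|^2.

    Assumption: only |q^{-k}|^2 of the vacuum image is kept, so that
    log g_n = 2 * 2*pi*k * Im(-1/(tau + n)) with tau = i(beta + i t)/(2 pi).
    """
    dt = t - 2 * math.pi * n
    return 8 * math.pi ** 2 * k * beta / (beta ** 2 + dt ** 2)

beta = 0.5
for t in (0.0, 2 * math.pi * 3 + 0.1, 2 * math.pi * 5):
    # Which of the saddles n = 0..5 is largest at this time?
    best = max(range(6), key=lambda n: log_g_saddle(n, beta, t))
    print(round(t, 3), best, log_g_saddle(0, beta, t))
```

In this approximation the n = 0 (BTZ) exponent falls off like 1/t², while the saddle nearest to t/2π always restores the t = 0 value at integer times.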

Discretizing the spectrum
Throughout this paper we have emphasized the connection between the late time behavior of the spectral form factor and the discrete nature of the spectrum. In this section we review how the naively smooth spectral density is rendered discrete by the SL(2; Z) saddle point expansion. Including a large but finite number of saddles in the expansion yields a smooth density of states with sharp peaks around the locations of the underlying states, while including all saddles leads to a fully discrete density of states (cf. eq. (5.22)).
To be concrete, we will study weight w modular forms $\psi_{n;w}$, with polar part consisting of a single pole of weight n, $\psi_{n;w}(\tau) \equiv q^{-n} + O(q)$. (5.13) They have the following property under modular transformation: $\psi_{n;w}(\gamma(\tau)) = (c\tau + d)^w \psi_{n;w}(\tau)$. (5.14) To make contact with the previous discussion, the functions $\psi_{n;0}$ can be used as a basis for constructing a partition function. Strictly speaking, the manipulations we present are only valid for w > 1, but we may think of introducing w as a regulator.¹⁹ The final results can be analytically continued to w = 0. They match careful computations performed in the w < 1 regime with a subtraction based scheme [38,39]. For w = 0 the only holomorphic modular function is a constant, and so any scheme that preserves modular invariance is guaranteed to reproduce the same modular function, up to a constant. This constant may be important for understanding whether theories of pure 3d gravity exist [33,36], but will not affect our discussion here.
¹⁹ A special case of this is the differential regularization advocated in [20,41].

Given any T invariant function, $f(\tau + 1) = f(\tau)$, we may write $F_w(\tau) = \sum_{\gamma \in \Gamma_\infty \backslash SL(2;\mathbb{Z})} (c\tau + d)^{-w} f(\gamma(\tau))$. (5.15) To see how $F_w(\tau)$ transforms, we apply an element of SL(2; Z). The one subtlety in the above argument is working with the cosets $\Gamma_\infty \backslash SL(2;\mathbb{Z})$ rather than the full group, but as f is T invariant, and {c, d} do not change when acting with T on the left, we are free to work in the coset space.
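The covariance of a sum of the form (5.15) can be checked numerically in the simplest case. The sketch below assumes the trivial seed f = 1 and an even weight w = 6, chosen so that the sum converges absolutely and the overall sign of (c, d) is irrelevant; the coset is represented by coprime pairs with c > 0 together with (0, 1). The cutoff and function name are hypothetical.

```python
from math import gcd

def poincare_sum(tau, w=6, cutoff=40):
    """Truncated sum over coprime (c, d), up to sign, of (c*tau + d)**(-w).

    Assumptions: seed f = 1, even weight w > 2, square cutoff |c|, |d| <= cutoff.
    """
    total = 1.0 + 0j                     # the (c, d) = (0, 1) term
    for c in range(1, cutoff + 1):
        for d in range(-cutoff, cutoff + 1):
            if gcd(c, abs(d)) == 1:
                total += (c * tau + d) ** (-w)
    return total

tau = 0.3 + 1.1j
w = 6
lhs = poincare_sum(-1 / tau, w)          # act with S: tau -> -1/tau, (c, d) = (1, 0)
rhs = tau ** w * poincare_sum(tau, w)    # expected weight-w covariance
print(abs(lhs - rhs) / abs(rhs))
print(abs(poincare_sum(tau + 1, w) - poincare_sum(tau, w)))  # T invariance, up to truncation
```

The S check works term by term because S simply permutes the pairs (c, d) inside the square cutoff, which mirrors the relabeling of coset representatives used in the argument above.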
We are interested in the special case relevant for $\psi_{n;w}$. In terms of a real inverse temperature, we can write $\psi_{n;w}(\beta) = e^{\beta n} + \int_0^\infty d\Delta\, \rho^{(n;w)}(\Delta)\, e^{-\beta\Delta}$, (5.18) and perform an inverse Laplace transform to read off the density of states. To write the term involving the density of states explicitly, it is useful to organize the sum over $\Gamma_\infty \backslash SL(2;\mathbb{Z})$ as a double sum, first over $\Gamma_\infty \backslash SL(2;\mathbb{Z}) / \Gamma_\infty$, and then over the right action by T. Then, by using an identity for the sum over T, one arrives at the density of states (cf. eq. (5.22)). The delta function in the last line is exactly the discreteness of the spectrum we were after. Notice that including a finite number of saddles, by placing a cutoff on the sum, leads to a smooth density of states that becomes progressively sharper around the discrete states as we increase the cutoff. Put differently, by including an increasing number of saddles in the expansion we can witness the discreteness of the spectrum emerge out of the smooth density.
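The sharpening with the cutoff can be illustrated with a minimal sketch. It assumes that the identity in question is a Poisson-type resummation, $\sum_\ell e^{2\pi i \ell \Delta} = \sum_m \delta(\Delta - m)$, with $\ell$ a hypothetical label for the power of T; the actual density of states carries additional smooth factors that are omitted here.

```python
import math

def truncated_comb(delta, cutoff):
    """sum_{l=-cutoff}^{cutoff} exp(2*pi*i*l*delta), which is real by symmetry.

    Assumption: this stands in for the sum over right actions by T,
    truncated at |l| <= cutoff; all smooth prefactors are dropped.
    """
    return 1.0 + 2.0 * sum(
        math.cos(2 * math.pi * l * delta) for l in range(1, cutoff + 1)
    )

for cutoff in (2, 10, 50):
    on_state = truncated_comb(1.0, cutoff)     # at an integer energy
    off_state = truncated_comb(0.5, cutoff)    # halfway between two integers
    print(cutoff, on_state, off_state)
```

The value at integer Δ grows linearly with the cutoff while the value between integers stays of order one, so the truncated sum approaches a comb of delta functions as more terms are included.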

Discussion
In this paper we have examined the time dependence of the partition function in two-dimensional conformal field theories. We identified a universal contribution which decays slowly in time. By appealing to the late time behavior of random matrix theory we were able to conjecture a dip time, where we expect the crossover to RMT to set in. In integrable models, in particular chiral conformal field theories, we were able to identify an infinite set of saddle point contributions to the partition function, corresponding to black holes in the bulk, which serve to restore information for all time. All of these discussions, however, leave open many avenues of future inquiry. One important question is when the corrections to the Cardy formula (4.20) describing the density of characters become important enough to affect the late time behavior. In theories with sufficiently sparse spectra, we expect such corrections to be responsible for the late time transition to random matrix theory behavior. They may also affect the universal decay worked out in section 4 before the dip time. A possible starting point for investigating these questions is to include non-vacuum states on the left-hand side of (4.20).
An important assumption we used was sparsity of the light spectrum in gravitational theories. An obvious question is how the notion of sparsity imposed here connects to other such criteria one may wish to impose for a conformal field theory dual to gravity. Examples include those coming from requiring a Hawking-Page phase transition, appropriate behavior of Rényi entropies, saturation of Lyapunov bounds, or from demanding a bulk point singularity [24, 42-50].
We have mentioned that the discussion of information loss in integrable theories can in principle be applied to counts of BPS states in generic supersymmetric theories. It would be interesting to study this in detail. It would be especially interesting if one can leverage information about how the BPS spectrum solves its information paradox to make statements about the full supersymmetric theory.
As we investigate the analytically continued partition function at higher and higher temperature, its time dependence becomes very featured, see figure 6. For our universal contribution, as well as for chiral CFTs, there are spikes that occur at regular, rational times. An ambitious question is whether there is an experimental observable (perhaps considering a two point function rather than a partition function) which might be able to detect these rational spikes for experimentally realizable 1+1d systems.