Quantum kinetic perturbation theory for near-integrable spin chains with weak long-range interactions

For a transverse-field Ising chain with weak long-range interactions we develop a perturbative scheme, based on quantum kinetic equations, around the integrable nearest-neighbour model. We introduce, discuss, and benchmark several truncations of the time evolution equations up to eighth order in the Jordan-Wigner fermionic operators. The resulting set of differential equations can be solved for lattices with $O(10^2)$ sites and facilitates the computation of spin expectation values and correlation functions to high accuracy, at least for moderate timescales. We use this scheme to study the relaxation dynamics of the model, involving prethermalisation and thermalisation. The techniques developed here can be generalised to other spin models with weak integrability-breaking terms.


Introduction
Equilibration and thermalisation are topics that link nonequilibrium physics to equilibrium physics, and they play a fundamental role for the validity and success of thermodynamics. These topics have a long history and have been studied in a variety of settings, including classical mechanics versus quantum mechanics, closed systems versus open systems, and others. Renewed interest in equilibration and thermalisation in isolated quantum systems was to a large extend triggered by experimental progress in preparing and manipulating assemblies of cold atoms that are extremely well isolated from their surroundings; see [1,2] for reviews. Nearintegrable systems, consisting of a dominant integrable part plus a small integrability-breaking perturbation, have been studied early on in some of these experiments, including the celebrated quantum Newton's cradle by Kinoshita et al [3]. The integrability-breaking perturbation ensures thermalisation to a microcanonical equilibrium, and the relaxation dynamics towards equilibrium in a near-integrable system takes place in two stages on widely separated timescales [4][5][6][7]: a fast decay, termed prethermalisation, to a long-lasting nonequilibrium state that is characterised by a so-called generalised Gibbs ensemble (GGE) [8,9]; and a second step, in which relaxation to thermal equilibrium, as described by the ordinary Gibbs ensemble, occurs on a much longer timescale, once the integrability-breaking perturbation becomes relevant.
Accurate and reliable calculations of these phenomena are challenging, at least when going beyond the small system sizes of O (10) where exact diagonalisation (ED) is feasible. Perturbative techniques around the integrable limit suggest themselves for the problem at hand, and various types of such techniques have been employed in the context of prethermalisation, including a flow-equation methods [10], self-consistent mean-field techniques [11], self-consistent time-dependent spin-wave theory [12], and quantum kinetic theory [13][14][15][16]. The notion of quantum kinetic theory subsumes a number of approximate methods based on identifying certain classes of operators (usually those of higher degree in the normal-ordered ladder operators; see section 3 for more precise statements) as negligible, and deriving a reduced set of equations of motion for the remaining operators only [17]. In the abovementioned [13][14][15][16] quantum kinetic theories are developed for studying bosons or fermions in one spatial dimension. (in units of h -≡1). We assume periodic conditions   º + N 1 1 , so that  is translationally invariant. Additionally, the Hamiltonian is invariant under the  2 symmetry x→−x. To account for the periodic boundary conditions, we define the distance between lattice sites l and l+m as the shortest connection around the circle, . The long-range interactions in (3) decay like a power law d(m) −α with the distance, where α is some non-negative exponent. To enforce that this term contains exclusively interactions beyond nearest neighbours, the sum over m extends over - 2 only. The magnetic field strength is denoted by h, and J x and J z are pair coupling constants. The integrable part  int of the Hamiltonian is known to be exactly solvable by a Jordan-Wigner transformation, followed by a Fourier and a Bogoliubov transformation [18,19] (see appendix A.1). By means of this procedure, the integrable part (2) of the Hamiltonian can be brought into the quadratic form The notion of long-range interactions is not unanimously defined. In some communities only exponents α smaller than the spatial dimension of the system are called long-range. Our terminology includes these cases, but is less restrictive. 6 In case of a magnetic field reversal, such as the one we will use in appendix C.2, this formula should be modified by replacing ò k →−ò k . It is crucial for what follows to also express the perturbation  pert in this preferred fermionic basis in which  int is quadratic and diagonal. This guarantees that, when making approximations by neglecting high-order terms, the error will be small. A proper definition of the notion of high-order operators is given in section 2.2. The main steps of transforming  pert into the fermionic quasi-particle basis h h † , of the full Hamiltonian. Here we have employed the vector notation = ¼ ( ) k k k , , 1 4 , and å k indicates a summation over all momenta k q (q=1, K, 4) in the Brillouin zone. The coefficients ¼ A B , , I V are defined in appendix A.2.2, and they contain contributions from the coupling constants in the original Hamiltonian, as well as combinatorial contributions that arise from normal-ordering. Normal-ordering leads to significantly more complicated expressions here, but it will be crucial for identifying negligible terms in the approximation scheme of section 2.2.  0 in (6) denotes a term of degree zero in the fermionic operators, i.e. proportional to the identity. This term is irrelevant for the dynamics, but will be important for the definition of initial conditions. All terms in (6) are momentum conserving due to the translational invariance of the spin model, and of even degree in the fermionic operators because of the  2 symmetry.
The Hamiltonian (6) is of quartic degree in the fermionic operators η, h † . This is different from the conventional long-range Ising model in a transverse field [20][21][22][23] with long-range couplings between the x-components of the spin operators is used, which leads to fermionic terms of arbitrarily high degree. The somewhat less conventional Hamiltonian (1)-(3) we chose is a convenient model for studying approximation methods for the dynamics of the spin chain: all deviations from the exact dynamics are expected to be genuine effects of the approximations made in the timeevolution equations, as no approximations have to be made on the level of the Hamiltonian.

Equations of motion
The time-evolution equation of an operator  in the Heisenberg picture is given by the von Neumann equation To construct a quantum kinetic theory for the model (1)-(3), we require time-evolution equations of normal-ordered products of fermionic operators. For example, for  h h = † k k , a straightforward but tedious calculation yields , etc. The right-hand side of equation (7) generally involves time-evolved operators distinct from . Therefore, to solve the equation of motion (8), similar equations of motion have to be derived for the operators occurring on the right-hand side. In general, this will lead to a system of coupled differential equations whose number scales exponentially with the system size N. This is a problem of a complexity comparable to that of solving the von Neumann equation for the density operator in the Schrödinger picture, which is intractable already for moderate system sizes in most cases.
Our aim is to find a smaller differential system that is suitable for approximating the dynamics generated by the Hamiltonian (1), while being numerically tractable for larger system sizes. For this purpose, it will be convenient to classify operators according to their degree and their p-particle number.
the number of annihilation operators in , and by  Î c the number of creation operators. If all annihilation operators are to the right of all creation operators, the operator  is said to be normal-ordered. The degree of such a normal-ordered product is then defined as = + a c deg , and we call the integer p=max (a, c) the p-particle number of .
We define the class C p deg as the unique set of normal-ordered products of fermionic operators with a given degree and a given p-particle number, e.g.
where Br denotes the Brillouin zone, and the rightmost expression is a slightly abusive shorthand notation. We furthermore define superclasses of, respectively, fixed p-particle number and degree , whose number of elements is exponentially large in the system size N, then spans the vector space of all fermionic operators acting on Fock space. To reduce the size of the system of coupled differential equations generated by (7), we introduce a truncation  T F as the union of, in general, several classes C p deg . For example 2 corresponds to a truncation at the quadratic level, neglecting all terms of degree larger than two in the differential system. A first requirement on T to be a useful truncation is that it gives access to the observable(s) of interest. For instance, the spin component  l z , expressed in terms of the fermionic operators η k and h † k , is a linear combination of the identity and of some quadratic operators (see equation (55)). For simulating the dynamics of  l z , the truncation must therefore contain at least the terms in (11), even though that selection may not be sufficient to obtain a good approximation. Other choices of observables may require larger truncations. Additionally, for the sake of numerical efficiency, we want T to contain only the most relevant terms to describe the dynamics, at least for the time-window and observable of interest, and for the level of precision required. In that sense, kinetic theory can be seen as a perturbation theory which aims at structuring the set of all operators into a hierarchy, and then truncates that hierarchy at a chosen level.
Such truncation schemes, as is evident from the definitions of the degree and the p-particle number, are based on the concept of normal ordering, and at least some kind of ordering is required for a consistent classification of operators and the establishment of a hierarchy. Even if both, the Hamiltonian and the observable are given in normal-ordered form, the commutator in (7) will usually create non-normally ordered terms in the system of coupled differential equations, which have to be normal-ordered before a truncation can be performed. For the applications considered in the present paper, the number of coupled differential equations typically scales like N 2 or N 3 with the system size N, and is therefore very large for system sizes of tens or even hundreds of spins. Hence, normal-ordering by hand is an arduous task. To avoid this, we have developed an algorithm, which we call the linear kinetic equations (LKEs) code, that takes care of the following tasks.
(i) Symbolic calculation, for unspecified indices k 1 , k 2 , K, of the normal ordering of the commutators between all types of elements in T. Technically this is equivalent to the derivation of Wick's second theorem [24,25].
(ii) Use the results of (i) to derive, from equation (7), the differential system D for all operators  Î T.
(iii) For a given initial density operator ρ, define X 0 as the vector composed of all   r á ñ = ( ) Tr . Numerically solve the coupled linear differential equations = X DX with initial condition X(0)=X 0 .
(iv) From X(t), calculate the expectation value of the spin observable of interest, e.g.  á ñ( ) t l z .
Our LKE code is different from other kinetic equations techniques, like the one developed in [15,16] for Hubbard-type lattice models, in that it does not require the conditions of Wick's first theorem to hold. Moreover, our approach gives direct access to correlation functions.

Initial states
In principle, the LKE code described above is not restricted to specific initial states, but specific choices may simplify the problem by reducing the size of the differential system D. In particular, spatially homogeneous initial states, which are invariant under discrete lattice translations, are a convenient choice, because they simplify the fermionic representation of observables like  l z (see appendix A.3). This symmetry, as well as other ones, can be used to reduce the size of the differential system of kinetic equations, an issue that is discussed in detail in appendix B. In principle, and for convenience, one could choose a homogeneous initial state that has a simple form in the fermionic basis. More relevant for physical applications, however, are initial states that have a simple form in the spin basis, as in this case it is more likely that such a state can be prepared experimentally.
A homogeneous initial state with a particularly simple form in the spin basis is a fully z-polarised state Since the time evolution is calculated in the η-basis, the initial state needs to be transformed into that basis as well. Fortunately, fully polarised spin states have a convenient expression in the fermionic language. For instance, one can show that the fully down z-polarised spin state is transformed into the Bogoliubov basis according to where ñ |0 denotes the Bogoliubov vacuum and 2 . u k and v k , defined in appendix A.1, are the coefficients of the Bogoliubov transformation that diagonalises the integrable part of the Hamiltonian, and W n is a normalisation constant defined by We then define down truncated polarised states as 1 . For small sizes and/or large magnetic field amplitudes | | h , any of these states is a good approximation of the 'proper' polarised state (12). However, for large systems or small magnetic fields, the LKE code is expected to perform well only for initial states (15) with small n, an effect that will become clearer in the context of the p-particle structure introduced in section 3.1 and further discussed in appendix D.1.
The symmetry properties of the truncated polarised states y ñ | n can be used to further reduce the size of the differential system of kinetic equations. Firstly, these states belong to the even sector of the Fock space, and the time evolution under the Hamiltonian (6) preserves this evenness. Secondly, fermions created by the operator  n in (13) always come in pairs with opposite momenta k i and −k i , and one can show that a differential system restricted to products of operators that take into account this pair structure is sufficient to describe not only the initial state, but also the time evolution of a truncated polarised state. Similar to the classes of operators defined in equations (9) and (10), we denote byC p deg the set of products of fermionic operators with a certain degree and p-particle number, with the additional constraints of satisfying momentum conservation, belonging to the even sector of the Fock space, and taking into account the pair structure of  n . A detailed account of these symmetries and a definition of the symmetry-reduced classesC p deg is given in appendix B. Lastly, it is worth noting that the truncated polarised states (15) do not satisfy the conditions of Wick's first theorem. This is an interesting observation because of the fact that, different from other quantum kinetic equations that can be found in the literature, our LKE code does not rely on the validity of Wick's theorem. For instance, for the state y ñ | 1 one can show that

Truncations and hierarchies
Our aim is to establish a hierarchy between operators according to their relevance for the time evolution, and then truncate that hierarchy at a certain level in order to reduce the size of the differential system of kinetic equations and render it numerically more manageable. For instance, in a weakly-interacting classical kinetic theory, one would first select the ballistic terms, for which the particles are noninteracting. If higher accuracy is needed one would include two-particle scattering terms, and so on. In this section we adapt this intuitive classical picture to the fermionic Hamiltonian (6) and comment on the role of initial conditions for selecting a suitable truncation scheme. In section 3.3 we assess the quality of the approximations by benchmarking the results from different truncation schemes against exact results. There is no rigorous theory that demands that hierarchies and truncation schemes be based on normal ordering, but on the more intuitive level one can reason as follows. Consider two fermionic states The same is not true without normal-ordering, i.e. for a nonnormal-ordered product  of fermionic operators, the matrix element  x x á ¢ñ | | can be nonzero regardless of the degree of . This implies that, when disregarding non-normal-ordered operators of, say, deg=4, one is neglecting information not only about three and more fermions, but also about single fermions and pairs of fermions. This would contradict the intuitive, classical idea of a truncation scheme that we invoked at the beginning of this section. For a more detailed discussion of the reasoning behind normal ordering, see chapter 4.1 of [25].

Definitions of truncations
Truncations based on the degree of operators. Based on the discussion in the preceding paragraph, it is natural to base a hierarchy of fermionic operators on their degree. Correspondingly, a truncation scheme is defined such that it contains only normal-ordered products of fermionic operators up to a certain degree (and which additionally meet the symmetry requirements discussed in appendix B). The cardinality ofT deg , and therefore the number of variables in = X DX , scales like N deg/2 with the number of sites N. For instance, at the quartic level, we have where the symbol; is used to easily distinguish between the classes. Truncations based on the p-particle number. Modifying the idea leading to the hierarchy (18), one can order the operators according to the integer

 h h h h hh h h h h h h h h h h hh h hhh hhhh
which is precisely the p-particle number introduced in section 2.2. The corresponding truncation is defined as which, for example, yields The two truncationsT p andT deg are equivalent for Hamiltonians which obey fermion number conservation (assuming that deg=2p), but they differ in cases where, like in our fermionic Hamiltonian (6), terms like ηη or h h † † create or destroy pairs of fermions. The cardinality ofT p scales like N p with the system size N.

 h h h h hh h h hh h h h h h hhh h h h hhh h h h h hhhh h h h h hh h h hhhh h h h h hhhh
Truncation based on degree and p-particle number. We can combine the ordering principles of the previous paragraphs in different ways. We introduce the truncation adding to the terms in -T deg 2 only those of a specific degree and p-particle number. For example, for deg=6 and p=3 we have

 h h h h hh h h h h h h h h h h hh h hhh hhhh h h h hhh
The number of normal-ordered products inT 6 3 scales like N 3 with the number N of spins.

Domain of validity of the truncations
The truncations introduced above are expected to yield good approximations of the dynamics for sufficiently short times. Longer times can be reached by tuning  and/or ρ in a way such that expectation values of higherdegree fermionic operators are small. The main parameter for tuning the spin Hamiltonian (1)-(3) is the longrange variable α. The larger α, the closer the fermionic version (6) of the Hamiltonian is to the noninteracting integrable case. The smaller the interactions are, the longer it takes to build up correlations between fermions, and hence higher-degree fermionic operators remain close to their initial values for a longer time.
Another requirement for a truncation to yield a good approximation is that the initial state ρ is uncorrelated or at most weakly correlated in the fermionic basis. Moreover, when using a truncation based on the p-particle hierarchy of section 3.1, the approximation works particularly well for initial states with a small fermion density. By means of the particle-hole transformation of appendix C.2 the validity can be extended to initial states having either a small fermion density or a small hole density is the (Bogoliubov) fermion density operator. Note that, as observed in appendix D.1, for sufficiently small magnetic field amplitudes the fully-polarised state  ñ  | is expected to violate condition (25). However, for a given system size N, one can choose n sufficiently small such that the truncated polarised state y ñ | n falls into the range of validity of the truncation. Similarly, the validity of the condition (25) can be enforced by increasing, at fixed n, the system size N.

Benchmarking
In this section we assess the performance of the LKE code when using the truncations introduced in section 3.1. We compare to exact diagonalization (ED) results [26] for system sizes up to N=20. As a measure for the accuracy, we use an indicator proportional to the time-integrated Euclidean distance between the LKE expectation value and the ED expectation value Based on the results for various truncation schemes shown in figure 1, we make the following observations: for sufficiently short times, the accuracies of the truncations follow the hierarchy where the symbol  a b means that the truncation a is less accurate than b, and  _ means that, based on the data shown in figure 1, we conjecture that the related truncations are equivalent in the large α limit. We note that the intuitive idea of a hierarchy based on the p-particle number and the degree is confirmed 7 , but that 'shortcuts' Figure 1. Comparison of the performance of the LKE code for several truncation schemes, based on the accuracy quantifier (27) as a function of time t, for α=3 (top) and α=5 (bottom), and for system sizes N=10 (left) and N=20 (right). Some of the truncation schemes show very similar accuracies, which indicates that irrelevant classes of fermionic operators are contained in some of them. All data are for fully-polarised initial states  ñ  | (or y ñ | ⌊ ⌋ N 2 in the fermionic language), and for parameter values J x =J z =h=−1 in the Hamiltonian. 7 A more detailed benchmarking, which we do not show here, reveals that 4on the one hand, and -T T deg 2 deg for   2 deg 6on the other hand, providing evidence of both, a degree hierarchy and a p-particle hierarchy. However, after the quartic level, such schemes are coarse, and it is the purpose of figure 1 to propose intermediate levels of approximation. seem to exist, i.e. lower-order truncations that achieve more or less the same level of accuracy. For instance,T 4 andT 6 3 give results of essentially the same accuracy, although the first truncation contains a significantly smaller number of operators, and is therefore numerically favourable. We found these observations to hold for all system sizes  N 30 for which we had ED results available for comparison, and we do not see any reason why the observed patterns should not remain valid for larger systems with otherwise similar parameter values.
As a rule of thumb, on a regular desktop computer we can deal with system sizes ∼10 3 when using a truncation scheme for which the corresponding differential system in the LKE code scales linearly with N; system sizes of order ∼10 2 when the scaling is quadratic in N; and sizes of order ∼40 in the case of cubic scaling. Quadratic truncations, while scaling linearly with N, cannot capture effects beyond integrability, and hence are not suitable for our purposes. In the following we use the compromiseT 4 , which scales quadratically in N 8 , as our default truncation for the applications discussed in section 4.
It is not easy to come up with a clear statement about the times and parameter values for which our quantum kinetic theory is valid. We can, however, get at least a rough idea by plotting the accuracy quantifier (27) for various system sizes. The left plot in figure 2, which is for the initial state y ñ | 1 , indicates that the accuracy  Dá ñl z T 4 remains excellent up to a time t≈N, which for the parameters used equals the traversal time t  | | N J x trav discussed in more detail in section 4.1. Moreover, the accuracy clearly improves with increasing system size N, which is a very reassuring feature when moving beyond the small N available for benchmarking and on to larger = ( ) N O 10 2 in the applications of section 4. We believe that this increasing accuracy is caused, at least to some extend, by the choice of the initial state y ñ | 1 , whose fermionic density is bounded from above by a decreasing function of N, see appendix D.1. The right plot of figure 2 shows the same kind of size-scaling analysis for the initial state  ñ  | , which supports similar conclusions as for y ñ | 1 , even though the scaling with system size is less pronounced.
Because of the perturbative character of our quantum kinetic theory, the truncated time-evolution is expected to become less accurate for smaller values of α. This expectation is in agreement with the data shown in figure 3, which further confirms the improved accuracy with increasing N over a range of α-values. Overall the plot illustrates three limits in which the LKE code is applicable: sufficiently short times t, sufficiently large α, and/or sufficiently large system sizes N. Furthermore, choosing an initial state with a low fermionic density is another (and particularly powerful) way to improve the performance of the code, as illustrated in figure 2 and detailed in appendix D.1.
All benchmarking has been done for single-site observables  l z . We expect correlation operators, like   k z l z , to be more strongly affected by errors, but we have not verified this numerically.

Prethermalisation and thermalisation in the long-range Ising chain
A nonintegrable isolated quantum system of large but finite size is expected to thermalise in a probabilistic sense, meaning that, at sufficiently late times, the expectation value  á ñ( ) t of a physically reasonable observable  is very close to its thermal equilibrium expectation value  á ñ th for most t [27,28]. Fluctuations around  (27) for the initial states y ñ | 1 (left) and  ñ  | (right). Parameter values are α=4 and h=−0.51, and we use the truncationT 4 . We conjecture that, at least for y ñ | 1 , our quantum kinetic theory gives highaccuracy results at least until the traversal time t  | | N J x trav , and that the accuracy improves further with increasing the system size N. 8 Another promising quartic choice that scales quadratically with N is È È˜˜⪯T C C T equilibrium are present, but their size is suppressed for large system sizes N; and while large deviations from equilibrium may occur, they are extremely rare.
In the transverse-field Ising chain with long-range interactions (1)-(3), the integrability of  int is broken by the presence of  pert which, for the large α-values we are considering, is a weak perturbation. The relaxation to equilibrium of weakly nonintegrable systems, consisting of an integrable part plus a small nonintegrable perturbation, has been studied extensively in the literature (see [5] and references therein). For such systems, an out-of-equilibrium initial state typically approaches equilibrium in two stages [7]: on a rather short timescale, a long-lasting prethermalised nonequilibrium state is reached. This state is described by a so-called GGE [8,9], which, in addition to conservation of energy, takes into account also all the other conserved local charges of the integrable part of the Hamiltonian. Proper thermal equilibrium, as described by the ordinary Gibbs ensemble, is expected to be approached only much later, once the integrability-breaking perturbation becomes relevant.
We expect similar behaviour in the transverse-field Ising chain with long-range interactions, but with the difference that  pert in (3) contains integrable as well as nonintegrable contributions, as is evident from the presence of quadratic as well as quartic terms in the fermionic Hamiltonian (6). Both types of contributions are of small magnitude, controlled by a combination of the parameters J z and α. Notwithstanding the similar magnitudes of the integrable and nonintegrable contributions in  pert , the two terms will have different effects on the equilibration of the system. The integrable portion of  pert will contribute a small shift to the GGE that is reached in the initial relaxation step due to  int . The nonintegrable portion of  pert is generically expected to effect proper thermalisation to a Gibbs state on a timescale proportional to the squared inverse of the magnitude of the nonintegrable term [5].
In the following we make use of the LKE code with a suitable truncation scheme in order to probe the relaxation dynamics of the transverse-field Ising chain with long-range interactions. As our local observable of interest we choose  l z , the z-component of the spin at site l. This observable has the advantage of being of a simple form not only in the spin framework, but also in the fermionic language, where it is a quadratic operator (55). as a threshold value, below which we declare our numerical results as sufficiently accurate. To illustrate the choice of this threshold value we plot the time evolution of (c) the expectation value of  á ñ l z and (b) its accuracy quantifier for α=2.5 and N=20. In all three plots, the dotted line, respectively the orange dot, correspond to the same parameter values and accuracy thresholds.

Quadratic fermionic Hamiltonian
To distinguish effects of nonintegrability from those of the integrable model, we define the Hamiltonian consisting of only the quadratic terms in the fermionic Hamiltonian (6).  2 differs from  int for finite α, owing to the fact that  pert contains not only quartic, but also quadratic contributions. For any quadratic Hamiltonian, the truncation schemeT 2 yields exact results 9 , and the use of such a low-order truncation scheme will allow us to deal with fairly large system sizes of  2 in the following. Using that scheme in the LKE code, we show in figure 4 exact results for the time evolution of  á ñ l z generated by the quadratic Hamiltonian  2 for various values of the long-range parameter α. The most striking feature in this plot are the drastic changes, occurring periodically in the time evolution with a period of approximately 1200. These features have been termed traversals in [29], and they can be understood as a finite-size effect: the dynamics is controlled by pairs of quasiparticles travelling across the chain in opposite directions, and for  int the maximum velocity of quasiparticle propagation is known [30]. Because of the periodic boundary conditions, the effect of returning quasiparticles that have travelled the full length of the circle will be felt after a time t  | | N J x trav and multiples thereof. For a < ¥ this timescale changes only slightly. As figure 4 illustrates, the traversals spoil the relaxation behaviour that is visible up to t;1200. In this way, finite system sizes limit the timescales that can be assessed. For the quest of observing equilibration, which occurs on the slowest relevant timescale of a system, this poses a challenge.
From now on we will focus exclusively on times t up to τ trav . In that time window we observe in figure 4 a rapid rise of  á ñ l z from approximately −0.335 (see (101) and the discussion in appendix D.2.2) to an α-dependent value around −0.3. We estimate the corresponding timescale to be which yields a value τ ; 1 for the parameters of figure 4, in agreement with the results shown in the plot. After that fast initial rise, a prethermalisation plateau is reached. We expect, but have not explicitly confirmed, that the attained long-time values agree with the GGE equilibrium values of  2 for the initial state used.

Beyond integrability
In this section we go beyond integrability by considering the full nonquadratic fermionic Hamiltonian (6), which is equivalent to the long-range spin model (1)- (3). In this case the quadratic truncation schemeT 2 is not sufficient anymore, and we opt instead for usingT 4 as a compromise between accuracy and numerical efficiency (see section 3.3).T 4 scales quadratically with the system size, which restricts the system sizes we can deal with on a regular desktop computer to N=120. With moderate effort and using less than 30 Gb of RAM on a cluster we obtained data for N=180, but the system sizes can certainly be pushed well beyond 200 with more effort.
Because of the traversals discussed in section 4.1, the system sizes of N=180 will limit the timescales that can faithfully be observed to τ trav ;180. l z under the dynamics generated by  2 for system size N=1200 with parameter values J x =J z =−1 and h=−0.51. The colours represent different values of the long-range parameter α, as indicated in the legend. Time evolution starts from a truncated polarised initial state y ñ | 1 and is calculated with the LKE code using the truncationT 2 , which yields exact results for the quadratic Hamiltonian considered. The main features that can be observed are a rapid initial relaxation on a timescale of the order 1, followed by a prethermalisation plateau, which can be observed until finite-size traversals obfuscate the relaxation at around t ; 1200; see main text for details. 9 This is a consequence of the fact that  Îd i S p a n T The thermal equilibrium values shown in figure 5 are calculated according to Tr e is the partition function. Since we are considering an isolated system where energy is conserved, the inverse temperature β is fixed through the initial state y ñ , an exact expression is known, see (96). ED results for ν th for several small system sizes are then extrapolated to the system sizes of interest, as illustrated in figure 6. For a magnetic field h=−1, we find an inverse temperature β;−2.8, more or less independent of the value of α. For h=−0.51, β ranges from ;−4.7 to ;−5.9 for α between 4 and 8. Smaller α are not considered, as the validity of the approximations in our quantum kinetic theory become questionable in that case.
Negative inverse temperatures β are known to occur in equilibrium systems with (upper and lower) bounded energy spectra if the entropy decreases as a function of energy in the high-energy region [31]. According to (65), in our model the transition from positive to negative β takes place at the energy ν th =0. From equation (65) we furthermore find n , and it is reasonable to assume that b n b  ( ) th is monotonous 11 , which is consistent with the numerical results of figure 5. According to (98) the initial states we use correspond to positive energy densities, which, by virtue of the above monotonicity argument, imply negative temperatures. In appendix C.1 we propose a method that allows us to tune the energy density of the this question has been addressed in [22]. Unlike in that case, our perturbation (3) couples spin components in the magnetic field direction, and for that reason we expect that the location of the critical point remains largely unaffected. For the parameter values we use, we hence expect that h=−0.51 is in the ferromagnetic phase for all α, which seems to be confirmed by numerical results. 11 For instance, the condition is sufficient to obtain d β ν th (β)<0. Moreover, from (45) we know that the eigenvalues of  int are all strictly negative as long as h<0, which is always the case in this section. Perturbation theory therefore proves the strict positivity of the spectrum of the full Hamiltonian (6) for a sufficiently large (but finite) value of α. initial state, and hence the effective temperature, while still using truncated polarised states and staying in the regime where our quantum kinetic theory remains valid.

Spreading of correlations
The linear quantum kinetic theory we developed in section 2.2 does not make use of a Wick factorisation of correlations. This not only allows us to deal with correlated initial states, as discussed at the end of section 2.3, but also provides access to a subset of fermionic correlation functions. At the level of theT 4 -truncation (19) that we use, we obtain all the fermionic correlations functions necessary for calculating the spin-spin correlations   á ñ for h=−1 and h=−0.51, starting from the initial state y ñ | 1 . This state has nonvanishing correlations in the spin basis, which, as is visible from the short-time behaviour in figure 7, are smaller for h=−1, and larger for , obtained by ED, is plotted for parameter values α=4, h=−1, and different system sizes N. The dotted line represents the energy density ν 1 of the initial state y ñ | 1 , calculated according to equation (96). For each value of N we read off the β-value that corresponds to n 1 (black crosses). In the right plot, these finite-size values of β are plotted as a function of system-size N (crosses connected by a line). Using a power-law fit, we extrapolate to larger sizes (crosses for 1/N<1/24, not connected by a line), obtaining for example β(N=180);−2.8183. Since we have β(N=24)=−2.8120 for the largest system size for which we had ED data, we estimate the precision of the extrapolation to be of order 10  in the space-time plane (m, t) for magnetic fields h=−1 (left) and h=−0.51 (right). The initial state is y ñ | 1 , and the time evolution is under the nonintegrable Hamiltonian (6) with parameters N=120, α=4 and coupling constants J x =J z =−1. Correlations smaller than 10 −5 are irrelevant for what we want to illustrate in this plot, and we therefore rescaled the colour bar to this threshold, i.e. the plots actually show . h=−0.51. This h-dependence is a consequence of the fact that tuning h not only modifies the Hamiltonian (6), but also changes the initial state (13), which affects the magnitude of the correlations (59). As discussed in appendix D.1, y ñ | 1 becomes a good approximation of  ñ  | in the limit of large negative magnetic fields, which is consistent with our observation of weaker initial correlations for a magnetic field of larger magnitude.
In figure 7 we show  á ñ m z as a function of time t and distance m between lattice sites. We observe a rapid decay, on a timescale of the order one, of the nonlocal (i.e. m-independent) initial correlations, followed by a 'lightcone'-like spreading of correlations in space and time [16,[32][33][34][35]. In the presence of long-range interactions, a variety of analytical, numerical, as well as experimental results indicate that, at least for sufficiently small values of α, the linear shape of the cone gets replaced by a curved shape [36][37][38][39][40][41]. For α=4 as used in figure 7 a curved shape is not visible, and it has in fact been conjectured that correlations spread strictly linearly for α larger than some critical value [42].
The spatial decay of the correlations, i.e. the m-dependence of  á ñ m z at a given time t, appears significantly sharper in the left plot (h=−1) of figure 7 compared to the right plot (h=−0.5), a feature that is particularly striking at small | | m . This observation is consistent with the expectation that the correlation length diverges in the vicinity of the quantum critical point, which for α=4 is expected to be close to the quantum critical point h c =−1/2 of  int . Strictly speaking this argument is valid only for the groundstate and in equilibrium, but it is reasonable to expect signatures of quantum critically to persist at small Bogoliubov particle densities  á ñ. The initial state y ñ | 1 has indeed a small  á ñ (90) and, since the integrability breaking is weak, this remains true for fairly long times t>0. Furthermore, while global equilibrium has not yet been reached, regions that are some distance away from the edges of the lightcone seem to have equilibrated at least locally, with a correlation length that is presumably similar to that of the global equilibrium. Assuming all this heuristic reasoning to be valid, we interpret the qualitative differences between the two plots in figure 7 as consequences of the distance of the magnetic field values h=−1, respectively h=−0.51, from the quantum phase transition.

Conclusions
We have constructed quantum kinetic equations for describing the nonequilibrium dynamics of a transversefield Ising chain with a weak integrability-breaking perturbation. The computational method we developed makes use of the Jordan-Wigner fermionic representation of the transverse-field Ising model and takes into account the integrability-breaking perturbation up to a certain degree in the time-evolution equations of operators. Which operators to include and which operators to neglect in the time-evolution equations is a crucial issue and strongly affects the accuracy of the approximation. In section 3 we have introduced, discussed, and benchmarked several truncation schemes, all of which are based on the normal-ordering of products of fermionic operators. Based on the numerical benchmarking, we found the quartic truncation schemeT 4 to be numerically efficient and at the same time adequate for studying effects beyond integrability. Truncation schemes involving sixth order terms can reduce errors in time-evolved expectation values by almost an order of magnitude, but become very costly in computation time. Using the truncation schemeT 4 , which scales quadratically in the system size N, we reached sizes of up to N=120 on a desktop computer and up to N=180 on a cluster, but with more effort and/or high-performance computing facilities this value can certainly be pushed quite a bit further. The model we have studied is the integrable transverse-field Ising chain with nearest-neighbour interactions (2), with an added integrability-breaking long-range perturbation (3). The perturbation can be made small by choosing either the coupling coefficient J z in (3) to be small, or the long-range exponent α to be large. The latter case, which we focus on in this paper, is, to the best of our knowledge, the first perturbative technique that uses 1/α as a small parameter. The perturbation (3) we chose for this study is an unconventional one: unlike in the conventional transverse-field Ising chain with long-range interactions where both, nearest-neighbour and longrange couplings are orthogonal to the field direction, we chose a model where the long-range coupling  pert is aligned with the magnetic field (in the z-direction in our choice of reference frame). This choice avoids so-called Jordan-Wigner strings in the fermionic representation of operators, which simplifies calculations. Perturbations that can be written entirely in terms of  l z operators can be treated with the methods introduced in this paper, including also three-spin interactions like    l m x translated into the fermionic picture will consist of fermionic operators not only on sites l and l+m, but also on all sites in between. For short interaction ranges m, like in next-nearest neighbour interactions, these terms can certainly be dealt with, but for larger m it is not clear to us whether calculating a normal-ordered fermionic Hamiltonian in this case is analytically (or computationally) feasible.
Research on systems with long-range interactions usually focusses on one of the following two cases: (i) systems where the long-range exponent α is smaller than the spatial dimension of the system. In this case a number of unconventional thermodynamic and dynamic properties are known to occur, including thermal phase transitions in one-dimensional models [43], nonequivalence of statistical ensembles [44][45][46], and others. Our quantum kinetic theory applies to this regime when J z in (3) is sufficiently small, but we have not studied this case in detail in the present work. (ii) Values of the long-range exponent α that are relevant for recent experiments with ultracold atoms. For spin-1/2 models, these are in particular a = 3 (magnetic atoms, polar molecules, Rydberg atoms) and α=6 (Rydberg atoms); see [47] for an overview. The latter case should certainly fall into the range of validity of our quantum kinetic perturbation theory when using 1/α as a small parameter.
An important feature of our theory is that we do not assume the conditions of Wick's first theorem to hold, i.e. unlike in some related work [15], we do not reduce expectation values of quartic fermionic terms into products of expectation values of quadratic terms This comes with some advantages and some disadvantages. A disadvantage is that we need to solve a substantially larger set of coupled differential equations, which leads to restrictions on the accessible system sizes. This is attenuated to some extend by the fact that we deal with ordinary linear differential equations, whereas application of Wick's theorem leads to nonlinearities. Moreover, since quartic terms are not broken up into quadratic ones, we have access to the corresponding quantum correlation functions. In the language of spin models, this gives us access not only to spin expectation values, but also to spin-spin correlation functions, as shown in figure 7.
As an application of our perturbative scheme we studied the influence of the small parameter 1/α on prethermalisation and thermalisation in the weakly long-range transverse-field Ising chain. Finite-size effects restrict the accessible timescales to t;N, which in turn implies a limitation on the relaxation phenomena one can observe. Relaxation due to the integrable part  2 occurs on a timescale of O(1) and is easily observed. Thermalisation to a Gibbs state, induced by the nonintegrable part of  pert , takes place on a slower timescale. While we were not able to observe the full approach to thermal equilibrium in time, we do see that the presence of nonintegrable terms pushes the spin expectation values closer to their thermal values. This effect is more pronounced closer to the quantum critical point of the model, but we do not have a satisfactory explanation for this observation.

Appendix A. Transformation into the diagonal basis of  int
As discussed in section 2.1, we want to express the Hamiltonian as well as observables of interest as normalordered products of the fermionic operators that diagonalise the integrable part of the Hamiltonian. The reasoning behind this strategy is that high-order terms in those normal-ordered operator products are expected to be less relevant for the dynamics, and the kinetic equations derived in this paper are obtained by neglecting certain classes of normal-ordered fermionic operators. In appendix A.1 we briefly recapitulate the standard result of diagonalising the transverse-field Ising chain with nearest-neighbour interactions by means of a Jordan-Wigner transformation, followed by a Fourier and a Bogoliubov transformation. In appendix A.2 we express the long-range contribution  pert in terms of the Bogoliubov fermions of appendix A.1, and in the appendices A.3 and A.4 we do the same for spin components  l z and spin-spin correlations, which will be our observables of interest.

A.1. Integrable part
The integrable part  int of our Hamiltonian (1)-(3) describes a one-dimensional Ising chain in a transverse magnetic field with nearest-neighbour interactions. In this section we review the standard procedure of mapping this part of the Hamiltonian to noninteracting fermions by means of Jordan-Wigner, Fourier, and Bogoliubov transformations; see [19,30,48] for more detailed accounts. . We will restrict our attention to initial states from the even parity sector, and parity conservation will preserve that restriction for all later times. For convenience, within that sector we shift the Brillouin zone to be as symmetric as possible around zero by choosing the integers Î -- 2 1 in the above definition of the momenta. To prepare for the Bogoliubov transformation to follow, we express  int as a sum of matrix products (39) can be diagonalized by means of a unitary transformation.
A.1.3. Bogoliubov transformation. We introduce the change of basis , there exists a real number x k such that = = ( ) ( ) u x v x cos , sin This dispersion relation differs from the one in (5), and also from what is given in most papers and textbooks, by the factor of ( ) a sgn k [49]. This variant turns out to be useful in appendix C.2, where we define a particle-hole mapping to reach the high-temperature regime, a transformation that, at least in the ferromagnetic phase, is equivalent (in the active viewpoint of symmetries) to a reversal of the magnetic field h.

A.2. Perturbation
In this section we express the long-range perturbation ( where  where we have defined When normal-ordering the terms in the second sum of (49), further quadratic terms will emerge, which can be merged with the quadratic terms in  int . The full Hamiltonian can finally be written as is not a necessary condition to fulfil = A A I III . In addition, (consequence of the statistics), and   , are not momentum conserving. For (discrete) translationally invariant initial states, however, we show in appendix B that such nonmomentum-conserving terms have zero expectation values at all times. Therefore, the expectation value of (54) at time t simplifies to This is an important simplification for the LKE code, as it allows us to restrict the set of operators considered in the code to momentum-conserving products of Bogoliubov fermions.

A.4. Transformation of correlation functions
In the spin picture we define the connected xx-correlation function as 5 6 x l x l x l Because of the  2 symmetry of Hamiltonian and initial state we are using, we have  á ñ = 0 l x at all times and hence    á ñ = á ñ + x l x l x 1 1 . By performing Jordan-Wigner, Fourier, and Bogoliubov transformations and assuming a translationally invariant state, this correlation function turns out to be quadratic when expressed in terms of the Bogoliubov fermions lie entirely in the even parity sector of the Fock space, all odd-degree normal-ordered products of fermionic operators strictly do not contribute and can be excluded from the differential system of kinetic equations 13 .
(iii) Fermions created by the operator  n in (13) always come in pairs with opposite momenta k i and −k i . Through the definition of the truncated polarised states y ñ | n in equation (15) To exploit in the kinetic equations the symmetries (i)-(iii), we define, based on the classes C p deg of normalordered products of fermionic operators defined in section 2.2, the symmetry-restricted classes . Hence, a differential system that contains only products of operators from the setF is sufficient to describe not only the initial state, but also the time evolution of a truncated polarised state. The symmetry restrictions reduce the number of operators to be considered in the LKE code significantly, for example from N 4 to N 2 when going from C p 4 toC Appendix C. Dynamics and thermodynamics in the high-temperature limit In section 4.2 we employed ED of small, finite systems for calculating thermal expectation values, and extrapolated these values to large system sizes N. In this section we discuss how, without resorting to ED nor to conventional diagrammatic perturbation theory at equilibrium [16], thermal expectation values in the hightemperature limit can be obtained for large system sizes. We calculate, up to second order in the inverse temperature β, analytical expressions for the thermal values of both the energy density  á ñ N and the spin observable á ñ S l z . The truncated down-polarised states y ñ | n that were introduced in (15), and for which our LKE code was tailored, do not usually fall into the regime where b  1, and the same holds true for their up-polarised counterparts, which we denote by c ñ | n . However, by considering a suitable superposition of y ñ | n and c ñ | n the energy density can be tuned into the high-temperature regime. Under suitable conditions one can then argue that the dynamics of y ñ | n and c ñ | n decouples. Making use of a particle-hole transformation, the LKE code can be used for the calculation of the decoupled time evolution of c ñ | n , and the outcome can be compared to the thermal values obtained from a high-temperature expansion.
C.1. Thermal equilibrium results in the high-temperature limit For sufficiently high temperatures, the Boltzmann factor can be expanded up to second order in β, Based on this expansion, and making use of  = ( ) Tr 0, one can derive, up to the same order in β, the thermal expectation value of the energy density [50] 13 Many odd-degree normal-ordered products are also nonmomentum-conserving, and have already been eliminated in (i), e.g. all h k for which p Î { } k 0, . More generally, the set S of operators of degree  + j N 2 1 2 that can be written as normal-ordered products of a momentum-conserving product of degree 2j times η π has also been eliminated before, and S scales like 2j−1. 14 This follows from a similar property of the  ( ) S p a n T 2 for all   Î, with the partial zeta-function ζ N as defined in (48), which, for α>1/2, converges in the large-N limit. At next order, after a long calculation we find å å z a = --  (67) can be evaluated without too much effort for very large system sizes. For the purpose of benchmarking the LKE code, we are interested in α substantially larger than 1/2. For such values we observe that K 1 and K 2 converge quickly with increasing N, being essentially indistinguishable from their infinite-system limit already for system sizes ∼10 2 . In that same regime of α-values we also find that  K K where the notation å ( ) k q with = ¼ + ( ) ( ) k q k k , , q 1 2 1 denotes a summation over all < < +  k k q that has been used in the literature [30], as (86) naturally lends itself to the definition of truncated polarised states, as discussed in the next paragraph.
Truncated polarised states: properties and limitations. The nth truncated polarised state y ñ | n , as defined in equation (15), is obtained by truncating the first sum in (86) at the index s=n. Such a truncation preserves the reflection symmetry in momentum space, and therefore allows us to make use of the symmetry restrictions discussed in appendix B. Also, the numerical computation of the vector y ñ | n scales like N n , whereas that of the fully polarised state scales like⌊ ⌋! N 2 . Finally, truncated polarised states come with the additional advantage of allowing for a systematic tuning of the particle density such that, by making n sufficiently small, the validity of the LKE code can be ensured. This can be useful for small magnetic fields h, where h h áå ñ  † N 2 k k k in general does not hold for a (nontruncated) polarised state, as illustrated in figure D1.
A straightforward calculation shows that the expectation value of the fermionic particle density (26) with respect to any truncated polarised state is given by  (25) for the validity of the p-particle truncation for a truncated polarised initial state y ñ | n immediately follows for  p n. Moreover, since y ñ =  ñ  | | n for = ⌊ ⌋ n N 2 , there must exist a smallest-possible ν for which, for a given system size N,   á ñ á ñ y  n  to a desired level of accuracy. It seems reasonable to assume that y ñ n | is then a good approximation of the corresponding fully-polarised vector.
We chose to present in section 4 LKE results only for y ñ | 1 , where (25) holds by construction, independently of the choice of h/J x . y ñ | 1 is not necessarily a good approximation of the fully polarised state, as for instance in figure 5 for parameter values N=180 and h=−0.51. While also in this case y ñ | 1 is a perfectly legitimate choice as an initial state in the LKE code, it does not have a simple (approximate) representation in the spin picture, and the physical relevance of such an initial state is unclear.
In the remainder of this section we complete the above reasoning by providing a justification of the asymptotic property (89). We start by showing that L 1 grows linearly with N asymptotically in the large-N limit. From the definition (87) of L s , together with the expressions of the Bogoliubov coefficients u k and v k derived in appendix A.1.3, it follows that and k = h J 2 x is the order parameter. For κ>1, each term in the sum of (91) is positive and smaller than one, which implies  ( ) L N N 1 . To also prove a lower bound on L 1 , we define, for a fixed  p <  0 4 , the interval is approximated by a truncated polarised state y ñ | n with n large enough such that, for a given h,   á ñ á ñ y y - n n 1 to a precision of 10 −3 . For magnetic fields h close to −1/2 + , a y ñ | n with a fairly large n-value is required to reach the desired level of accuracy, which puts an h-dependent limit on the system sizes for which we were able to calculate  á ñ. From the plot we observe that (i) the criterion (25) for the validity of the approximations made in the LKE code does not hold for h close to −1/2; and (ii) at fixed magnetic field,  á ñ tends to a nonzero value when N goes to the large sizes limit (N∼10 2 ). Note that, since the value of n is not kept constant along each of the lines in the plot, such a nonzero limit is not in contradiction to (90). From this limiting behaviour in combination with equation (90) one can infer that, in order to approximate  á ñ to a certain precision, n has to increase linearly with N, which becomes computationally impractical for larger system sizes. Note that this analysis only concerns initial states; since  does not commute with , its expectation value changes with time, which may (and it practice does) lead to a violation of the criterion (25) at later times.