Many body physics and the capacity of quantum channels with memory

In most studies of the capacity of quantum channels, it is assumed that the errors in each use of the channel are independent. However, recent work has begun to investigate the effects of memory or correlations in the error, and has led to suggestions that there can be interesting non-analytic behaviour in the capacity of such channels. In a previous paper we pursued this issue by connecting the study of channel capacities under correlated error to the study of critical behaviour in many-body physics. This connection enables the use of techniques from many-body physics to either completely solve or understand qualitatively a number of interesting models of correlated error with analogous behaviour to associated many-body systems. However, in order for this approach to work rigorously, there are a number of technical properties that need to be established for the lattice systems being considered. In this article we discuss these properties in detail, and establish them for some classes of many-body system.


Introduction
One of the most important problems of quantum information theory is to try to determine the channel capacity of noisy quantum channels. In a typical scenario, Alice would like to send Bob information over many uses of a noisy quantum communication link. As the channel is noisy, this cannot usually be done perfectly, and so they must use some form of block encoding to combat errors. The channel capacity is defined as the optimal rate at which information may be transferred with vanishing error in the limit of a large number of channel uses. There are a variety of different capacities, depending upon whether Alice and Bob are interested in transmitting classical or quantum information, and whether they have extra resources such as prior entanglement. In this paper, we will be concerned mostly with the capacity for sending quantum information, and so whenever we write the term 'channel capacity', we will implicitly be referring to the quantum channel capacity.
In most work on these problems, it has usually been assumed that the noisy channel acts independently and identically for each channel use. In this situation, the transformation $E_n$ corresponding to $n$ uses of the channel may be written as an $n$-fold tensor product of the single-use channel $E_1$:
$$E_n = E_1^{\otimes n}. \qquad (1)$$
However, in real physical situations there may be correlations in the noise that acts between successive uses, an interesting example being the decoherence of photons in optical fibres under the action of varying birefringence, which can be correlated due to mechanical motion or slow temperature fluctuations [1]. In such situations, one cannot describe the action of the channel in a simple tensor product form:
$$E_n \neq E_1^{\otimes n}. \qquad (2)$$
In this setting, one must really describe the action of the channel by a family of quantum operations corresponding to each number of uses of the channel $n = 1, 2, \ldots, \infty$:
$$\{E_n;\ n = 1, 2, \ldots, \infty\}. \qquad (3)$$
We will call any such family of operations a memory channel or a correlated channel⁴. Defining the notion of channel capacity for such a correlated channel is not always straightforward. In principle, a family of channels such as equation (3) may not have any sensible limiting behaviour as $n \to \infty$⁵. However, in this paper, we will not need to discuss this issue in detail, as we will only consider fairly regular channels that have a (unique) well-defined notion of channel capacity.
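In the case of uncorrelated errors, it has recently been shown [4] that the quantum channel capacity of an uncorrelated quantum channel is given by:
$$Q(E_1) = \lim_{n\to\infty} \frac{I(E_1^{\otimes n})}{n}, \qquad (4)$$
where $I(\xi)$ is the so-called coherent information of the quantum channel $\xi$:
$$I(\xi) := \sup_{\rho}\, \left[ S(\xi(\rho)) - S\big((I \otimes \xi)(|\psi\rangle\langle\psi|)\big) \right], \qquad (5)$$
where $S$ denotes the von Neumann entropy, $\rho$ is a state, and $|\psi\rangle\langle\psi|$ is a purification of $\rho$.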
Given that equation (4) is the quantum channel capacity for memoryless channels, it is natural to hope that the corresponding expression:
$$\lim_{n\to\infty} \frac{I(E_n)}{n} \qquad (6)$$
will represent the quantum channel capacity in the case of correlated errors. However, this will not always be the case, not least because this limit does not always exist ([2], and see footnote 5). Nevertheless, in this paper we will not only assume that this limit exists, we will also initially work under the assumption that it represents the true quantum channel capacity. We will later discuss this assumption in some detail. A similar situation occurs for the classical capacity of correlated quantum channels, where formulae (4) and (6) can be replaced with similar expressions involving the Holevo quantity instead of the coherent information. Most prior work on calculating the capacities of correlated quantum channels has focused on the capacity for classical information. Numerical and mathematical experiments involving a small number of channel uses suggest that in a variety of interesting cases the classical capacity of correlated channels can display interesting non-analytic behaviour. For instance, the sequence of papers [5]-[7] investigates a certain family of correlated channels parameterized by a memory factor $\mu \in [0, 1]$ which measures the degree of correlations. The results of [5]-[7] demonstrate that when the correlated channel is refreshed after every two uses (i.e. consider $E_2 \otimes E_2 \otimes E_2 \otimes \cdots$, rather than the full correlated channel $\{E_n\}$), then there is a certain transition value $\mu = \mu_0$ at which the channel capacity displays a definite kink, and above this threshold the optimal encoding states suddenly change from product to highly entangled. Similar phenomena have subsequently been observed in a variety of other cases [8,9].
Despite these interesting observations, it is still an open question whether the sharp kinks in the capacity of these models still persist if the full correlated channel $\{E_n\}$ is considered as $n \to \infty$, or whether this behaviour is just an artefact of the truncation of the channel at low $n$. The main difficulty in deciding such questions is that even under the assumption that equations such as (6) (or its analogue for classical information, the regularized Holevo bound) represent the true quantum capacity of a given correlated channel $\{E_n\}$, in most cases such variational expressions are extremely difficult to compute. It is, however, interesting to note that the non-analytic behaviour observed in the channel capacity of correlated channels is somewhat reminiscent of the non-analyticity of physical observables that define a (quantum) phase transition in strongly interacting (quantum) many-body systems, where in contrast true phase transitions usually only occur in the $n \to \infty$ limit.
Motivated by this heuristic similarity, in a previous paper [10], we connected the study of channels with memory to the study of many-body physics. One advantage of this approach is that it allows the construction of a variety of interesting examples of channels for which equation (6) can either be understood qualitatively or even calculated exactly using the techniques of many-body physics. One would otherwise usually expect regularized equations such as (6) to either be quite trivial or completely intractable. This is perhaps the most important consequence of this line of attack: by relating correlated channels directly to many-body physics, we obtain a good method for displaying models of channels with memory that tread the interesting line between 'solvability' and 'non-solvability', in analogy with the many such statistical physics models that have been proposed over the years. It is quite possible that the insights of universality, scaling and renormalization that have been so successful in many-body theory may provide valuable intuition for the study of channels with correlated error.
Another advantage of this approach is its connection to physically realistic models of correlated error. One can imagine that in many real forms of quantum memory, such as optical lattices, any correlated errors might originate from interaction with a correlated environment and thus be strongly related to models of statistical physics. This provides further physical motivation to examine the properties of correlated channels with a many-body flavour.
The connection to many-body physics also naturally leads one to consider channels with structure in two or more spatial dimensions. In such situations, it is no longer appropriate to think of correlations as 'memory', as the correlations arise not through a single time dimension, but perhaps through spatial proximity in more than one dimension. In order to define a capacity in such multidimensional situations, one would have to decide how to quantify the size of the channel. Natural options could include the total number of particles in the system, or perhaps the size of one linear dimension. Although we will not explicitly discuss multi-dimensional examples in this work, such situations might have interesting connections to the study of error tolerance in computational devices.
Figure 1. Each particle that Alice sends to Bob interacts with a separate environmental particle from a many-body system.
This paper is structured as follows. In order to make the paper self-contained, in the sections preceding section 7 we present, including all missing detail, the results of [10]. In section 6, we discuss in detail some sufficient conditions that many-body systems must satisfy in order to lead to capacity results according to the approach that we adopt. The arguments that lead to the development of these conditions were sketched in [10]; here we provide the full argument. In sections 8 and 9, we prove that these conditions hold for finitely correlated states and formulate a Fannes-type inequality to show the same result for harmonic chains. In the remaining sections, we discuss generalizations of our approach and present conclusions.

Many-body correlated channels
In this section, we recap the approach taken in [10] to construct correlated error models with links to many-body physics. The starting point is to suppose, as usual, that Alice transmits a sequence of particles to Bob (the 'system' particles), and that each particle interacts via a unitary $U$ with its own environmental particle. So far, this is exactly the same setting as uncorrelated noise. However, although each system particle has its own separate environment, one can introduce memory effects by asserting that the environment particles are in the thermal/ground state of a many-body Hamiltonian, such that the interaction terms lead to correlations in the environmental state (see figure 1). Unlike the uncorrelated case, this means that there will be correlations in the noise on different system particles. At this point, it is important to discuss some of the subtleties involved in the way that the 'many-body' system was defined in [10]. In basic approaches to many-body physics, it is usual to consider a system with a finite number of particles, obtain thermal states and ground states, and then take a limit as the number of particles is taken to infinity. In more mathematical statistical physics literature [11], however, it is usual to consider genuinely infinite systems from the start. This involves a number of technical implications, including a very different approach to the concept of a state, which can no longer be expressed in terms of basic density matrices. The two approaches are not necessarily equivalent and may lead to different results. To avoid such technicalities in this work, we will follow the former approach, and for each number of uses of the channel $n$, we will consider a many-body system of size $n$. As a family of channels for each $n$ this is a mathematically well-defined object, and it is a reasonable question to ask what the resulting channel capacity is.
In later sections of the paper, we will also assume periodic boundary conditions to enable us to analyse whether equation (16) is a valid quantum capacity or not. Again, although this seems like an unnatural assertion, it is mathematically well-defined, and in many systems the boundary conditions are believed to make a vanishingly small difference which disappears in the large n limit.
Of course, even with these simplifications not all many-body systems can be solved exactly, or even understood qualitatively. Moreover, even if the many-body system can be well understood, the computation of the limit (6) may still be difficult, and may depend strongly upon the choice of the unitary $U$ describing the interaction of each system particle with its associated environmental particle. In order to provide concrete examples, one must hence make a judicious choice of $U$ in order to make analytical progress. As in [10], we choose $U$ to be of the form of a controlled-unitary interaction, where the environmental particles act as controls. In fact, for ease of explanation we will also initially restrict the system and environment particles to be two-level spins, and the interaction $U$ to be a controlled-phase ('CPHASE') gate, which in the computational basis for 2-qubits is defined as
$$\mathrm{CPHASE} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \qquad (7)$$
Later, we will discuss how higher-level analogues of the CPHASE enable similar connections to many-body theories with constituent particles with a higher number of levels. The reason we make these choices for the controlled unitary interactions is that explicit formulae may be derived for the capacity in terms of relatively simple entropic expressions which are especially amenable to analysis.
The restriction to controlled-unitary interactions also enables us to consider environment particles that are classical. For instance, in the case of classical environment two-level spins, the 'CPHASE' interaction will be taken to mean that the system qubit undergoes a Pauli-Z rotation when the environment spin is up, otherwise it is left alone. It turns out that by considering classical environments it is possible to make more direct connections between the channel capacity of our models and concepts from statistical physics.
So let us proceed in trying to understand the capacity in cases in which the system particles are all two-level systems, with a CPHASE interaction. It is helpful to write the resulting channels in a more explicit form. Let us consider a quantum environment first. Let $|0\rangle$ denote spin-down, and $|1\rangle$ denote spin-up. Let us suppose that the environment consists of $N$ spins (eventually we will be interested in the limit $N \to \infty$) initially in a state:
$$\rho_{\mathrm{env}} = \sum_{x,y} \rho_{x,y}\, |x_1 x_2 \ldots x_N\rangle\langle y_1 y_2 \ldots y_N|, \qquad (8)$$
where the sum is taken over all $N$-bit strings $x$, $y$, and $x_j$/$y_j$ denote the $j$th bit of strings $x$/$y$, respectively. We can also describe a classical environment in the same way, simply by restricting the input environment state $\rho$ to be diagonal in the computational basis; the CPHASE interaction will in this case leave the environment unchanged, and will affect the system qubits as if the controls are entirely classical.

If the environment is in the state (8), and the system qubits are initially in the state $\sigma$, then the channel acting upon the system qubits is given by:
$$E_N(\sigma) = \sum_{x} \rho_{x,x}\, Z_1^{x_1} Z_2^{x_2} \cdots Z_N^{x_N}\, \sigma\, Z_1^{x_1} Z_2^{x_2} \cdots Z_N^{x_N}, \qquad (9)$$
where $Z_i$ denotes the Pauli-Z operator acting upon qubit $i$. Hence, regardless of whether the environment is considered quantum or classical, the channel that we have described is a probabilistic application of Z-rotations on various qubits. Although we will consider qubit $i$ to be transmitted earlier in time than any other qubit $j$ with $i < j$, there is no need for us to actually impose such a time ordering: because all the CPHASE interactions commute with each other, such time ordering is irrelevant⁶. We will be interested in computing equation (6) for such many-body correlated channels. In the next section, we will show that the channel capacity of this channel is given by a simple function of the entropy of the diagonal elements in the spin-up/down basis of the environmental state, i.e.
$$-\sum_{x} \rho_{x,x} \log_2 \rho_{x,x}. \qquad (10)$$
In the case of a classical environment, this is just the actual entropy of the spin-chain. This observation is very useful, as it allows us to apply all the formalism of many-body physics to the problem, also enabling us to use that intuition to observe a number of interesting effects. In the quantum case this function does not correspond to a conventional thermodynamic property. However, we will discuss examples where it is still amenable to a great deal of analysis using many-body methods.
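The claim that the CPHASE coupling amounts to a probabilistic application of Z can be checked numerically in the smallest case. The following is a minimal sketch, assuming a single system qubit coupled to a single environment qubit; the function names and the particular environment state are hypothetical choices made for illustration, not taken from the paper. It applies the CPHASE, traces out the environment, and compares the result with $\rho_{0,0}\,\sigma + \rho_{1,1}\, Z\sigma Z$.

```python
import itertools

def cphase_channel(rho_env, sigma):
    """Apply CPHASE between one environment qubit and one system qubit,
    then trace out the environment. Indices are (env bit, system bit)."""
    out = [[0j, 0j], [0j, 0j]]
    for e, s, e2, s2 in itertools.product(range(2), repeat=4):
        # CPHASE is diagonal: it contributes a phase -1 only on |11>.
        phase = (-1) ** (e * s) * (-1) ** (e2 * s2)
        joint = rho_env[e][e2] * sigma[s][s2] * phase
        if e == e2:  # partial trace over the environment qubit
            out[s][s2] += joint
    return out

def z_mixture(rho_env, sigma):
    """p0 * sigma + p1 * Z sigma Z, with p_x the diagonal of rho_env."""
    p0, p1 = rho_env[0][0], rho_env[1][1]
    z = [1, -1]
    return [[(p0 + p1 * z[s] * z[s2]) * sigma[s][s2] for s2 in range(2)]
            for s in range(2)]

# Hypothetical environment state with coherences, and system state |+><+|.
rho_env = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
sigma = [[0.5, 0.5], [0.5, 0.5]]

lhs = cphase_channel(rho_env, sigma)
rhs = z_mixture(rho_env, sigma)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest deviation:", max_diff)
```

The off-diagonal elements of the environment state drop out of the result, which is why only the diagonal entropy of the environment enters the capacity.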

A formula for the coherent information of our models
In order to calculate the regularized coherent information (6) for our many-body correlated channels, we will utilize the close relationship between the quantum channel capacity and the entanglement measure known as the distillable entanglement [12]. This connection utilizes a well-known mapping between quantum operations and quantum states. Given any quantum operation $E$ acting upon a $d$-level quantum system, one may form the quantum state:
$$J(E) = (I \otimes E)(|+\rangle\langle +|), \qquad (11)$$
where $|+\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |ii\rangle$ is the canonical maximally entangled state of two $d$-level systems. The state $J(E)$ is sometimes referred to as the Choi-Jamiolkowski state (CJ) of the operation $E$ [13]. It can be shown that the mapping from $E$ to $J(E)$ is invertible, and hence the state $J(E)$ gives a one-to-one representation of a quantum operation. We will show that for the kinds of correlated error channel that we have described above in equation (9), the quantum channel capacity $Q(E)$ of the channel equals $D(J(E))$, the distillable entanglement of the state $J(E)$.
To make the presentation more transparent, we will make the argument for the CJ state of a particular single qubit channel, as it is straightforward to generalize the argument to the entire family of memory channels described above. Hence, let us consider the following single qubit 'dephasing' channel:
$$E(\rho) = p\,\rho + (1 - p)\, Z \rho Z, \qquad (12)$$
where $p$ is a probability, and $Z$ is the Pauli-Z operator. The CJ representation of this channel is:
$$J(E) = p\, |+\rangle\langle +| + (1 - p)\, (I \otimes Z) |+\rangle\langle +| (I \otimes Z), \qquad (13)$$
where $|+\rangle$ is chosen as in equation (11). The argument relies upon the fact that the channel (12) possesses some useful symmetry. This symmetry leads to the property that having one use of the channel is both mathematically and physically equivalent to having one copy of $J(E)$. Suppose that you have one use of $E$; you can easily create $J(E)$. However, it turns out that with one copy of $J(E)$ you can also implement one use of $E$. Hence, both the operation and the CJ state are physically equivalent resources. The argument works as follows. Suppose that you have $J(E)$ and you want to implement one action of $E$ upon an input state $\rho$. This can be achieved by teleporting $\rho$ through your copy of $J(E)$. This will leave you with the state $E(\sigma_i \rho \sigma_i^\dagger)$, with the Pauli operator $\sigma_i$ depending upon the outcome of the Bell measurement that does the teleportation. However, the channel (12) commutes with all Pauli rotations. So we can 'undo' the effect of the Pauli by applying the inverse of $\sigma_i$, which for Paulis is just $\sigma_i$ itself. Hence, we have:
$$\sigma_i\, E(\sigma_i \rho \sigma_i^\dagger)\, \sigma_i^\dagger = E(\rho).$$
Hence, by teleporting into $J(E)$ and undoing the Pauli at the end we can implement one use of the operation.
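The symmetry used in this argument, namely that the dephasing channel $E(\rho) = p\rho + (1-p) Z\rho Z$ commutes with conjugation by every Pauli operator, is easy to confirm numerically. The sketch below is an illustration only: the helper names, the test state and the value of $p$ are hypothetical. For each Pauli $\sigma_i$ it checks that $\sigma_i E(\sigma_i \rho \sigma_i) \sigma_i$ reproduces $E(\rho)$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def conjugate_by(rho, u):
    """Return u rho u; each Pauli is Hermitian and its own inverse."""
    return matmul(matmul(u, rho), u)

def dephase(rho, p):
    """Dephasing channel E(rho) = p rho + (1 - p) Z rho Z."""
    zrz = conjugate_by(rho, Z)
    return [[p * rho[i][j] + (1 - p) * zrz[i][j] for j in range(2)]
            for i in range(2)]

# Hypothetical input state and dephasing parameter.
rho = [[0.6, 0.1 + 0.3j], [0.1 - 0.3j, 0.4]]
p = 0.8

target = dephase(rho, p)
worst = 0.0
for pauli in (I2, X, Y, Z):
    # Teleportation leaves E(sigma_i rho sigma_i); undo the Pauli afterwards.
    corrected = conjugate_by(dephase(conjugate_by(rho, pauli), p), pauli)
    worst = max(worst, max(abs(corrected[i][j] - target[i][j])
                           for i in range(2) for j in range(2)))
print("largest deviation over the four Paulis:", worst)
```

The same check fails for channels without this covariance, which is why the equivalence between one use of the channel and one copy of its CJ state is special to mixtures of Pauli rotations.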
This observation allows us to relate the channel capacity of the channel to the distillable entanglement of the CJ state. The proof proceeds in two steps, and follows well-known ideas taken from [12]. The aim is to show that the one-way distillable entanglement of J (E) is equivalent to Q(E), so that previous results on D(J (E)) may be applied.
1. Proof that $Q(E) \leq$ one-way distillation: (i) Alice prepares many perfect EPR pairs and encodes one-half according to the code that achieves the quantum capacity $Q(E)$. (ii) She teleports the encoded qubits through the copies of $J(E)$, telling Bob the outcome so that he can undo the effect of the Paulis. (iii) This effectively transports all encoded qubits to Bob, at the same time acting on them with $E$. (iv) Bob does the decoding of the optimal code, thereby sharing perfect EPR pairs with Alice, at the rate determined by $Q(E)$. As this is a specific one-way distillation protocol, this means that $Q \leq D$.
2. Proof that $Q(E) \geq$ one-way distillation: (i) Alice prepares many perfect EPR pairs and sends one-half of each pair through many uses of the channel $E$. (ii) She and Bob do one-way distillation of the resulting pairs (this involves only forward classical communication from Alice to Bob). (iii) Thereby they share the perfect EPR pairs, at the rate determined by $D(J(E))$, the one-way distillable entanglement. (iv) They can use these EPR pairs to teleport qubits from Alice to Bob. As this is a specific quantum communication protocol, this means that $Q \geq D$.
These arguments can easily be extended to apply to any channel that is a mixture of Pauli rotations on many qubits, hence including the memory channel models that we have described above. Fortunately, the CJ state of our channel is a so-called maximally correlated state, for which the distillable entanglement is known to be equivalent to the Hashing bound:
$$D(J(E)) = S(\mathrm{tr}_B\{J(E)\}) - S(J(E)),$$
where $S$ is the von Neumann entropy and $\mathrm{tr}_B$ is the partial trace over one party's subsystem. Note that for such channels $E$ this expression is equivalent to the single copy coherent information, which is hence additive for product channels $E^{\otimes n}$. In our case we are interested in the regularized value of this quantity for correlated channels, i.e.:
$$\lim_{n\to\infty} \frac{1}{n} \left[ S(\mathrm{tr}_B\{J(E_n)\}) - S(J(E_n)) \right],$$
which can be computed quite easily as:
$$\lim_{n\to\infty} \frac{I(E_n)}{n} = 1 - \lim_{n\to\infty} \frac{S(\mathrm{Diag}(\rho_{\mathrm{env}}))}{n}, \qquad (16)$$
where $\mathrm{Diag}(\rho_{\mathrm{env}})$ is the state obtained by eliminating all off-diagonal elements of the state of the environment (in the computational basis). Hence, the computation of the quantum channel capacity of our channel $\{E_n\}$ reduces to the computation of the regularized diagonal entropy in the limit of an infinite spin-chain. Although in most cases this quantity is unlikely to be computable analytically, it is amenable to a great deal of analysis using the techniques of many-body theory. It is also interesting to note the intuitive connection between expression (16) and work on environment assisted capacities: in the case of random unitary channels, where the unitaries are mutually orthogonal, the diagonal entropy in expression (16) has a natural interpretation as the amount of classical information that needs to be recovered from the environment in order to correct the errors [14,15]. Although the above analysis has been conducted for two-level particles, it can be extended to situations involving $d$-level systems. In the $d$-level case, one can replace CPHASE with a controlled shift operation of the form:
$$U = \sum_{k=0}^{d-1} |k\rangle\langle k| \otimes Z(k), \qquad (17)$$
where the $Z(k) = \sum_{j} \exp(i 2\pi k j/d)\, |j\rangle\langle j|$ are the versions of the qubit phase gate generalized to $d$-level systems, and the first part of the tensor product acts on the environment.
With this interaction all the previous analysis goes through, and the $d$-level version of equation (16):
$$\lim_{n\to\infty} \frac{I(E_n)}{n} = \log_2 d - \lim_{n\to\infty} \frac{S(\mathrm{Diag}(\rho_{\mathrm{env}}))}{n} \qquad (18)$$
gives the regularized coherent information, where $\mathrm{Diag}(\rho)$ refers to the diagonal elements in the $d$-level computational basis. It is important to consider the generalization to $d$-level systems because the thermodynamic properties of many-body systems do not always extend straightforwardly to systems with a higher number of levels. For instance, one possible generalization of the Ising model to $d$-level systems is the Potts model, which leads to some very interesting and non-trivial mathematical structure [16], and in the quantum Heisenberg model the presence of a ground state gap depends on whether the spins in the chain are integral or half-integral [17]. The simplicity of equation (16) enables one to immediately write down many noise models for which the regularized coherent information can be calculated and also represents the quantum channel capacity of the correlated channel. In particular, let us suppose that the environment consists of classical systems described by a classical Markov chain (those readers not familiar with the Markov chain terminology required here are directed to chapter 5 of [18] for a very readable introduction). If the state at each 'site' $s$ in the environment represents the instantaneous state of a Markov chain at time $s$, then the regularized entropy in equation (18) is given by the entropy rate of the Markov chain [18], provided that the Markov process is both irreducible⁷ and possesses a unique stationary (equilibrium) state. Let the transition matrix $M$ of the Markov chain be defined such that $p_i(s + 1) = \sum_j M_{ij}\, p_j(s)$, let $v_i$ be the $i$th element of the stationary probability distribution, and let $H_i$ be the entropy of column $i$ in the Markov chain transition matrix.
With these conventions the entropy rate is given by:
$$H = \sum_i v_i H_i.$$
In these cases the correlated channels fit quite neatly into the class of models proposed in [2,19], and moreover these channels will be forgetful [2]. As proven in [2], for forgetful channels the regularized coherent information is equal to the quantum capacity (see [20] for an independent coding argument which also works for Markov chain channels implementing generalized Pauli rotations). Hence, for these models equation (18) represents the true quantum channel capacity, and so we may write explicitly:
$$Q(\{E_n\}) = \log_2 d - \sum_i v_i H_i.$$
When unique, the stationary distribution of a Markov chain is given by the unique maximal right eigenvector (of eigenvalue 1) of the transition matrix. Related results have been obtained independently in [20,21].
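As an illustration of the Markov chain case, the following sketch evaluates $\log_2 d - \sum_i v_i H_i$ for a small column-stochastic transition matrix. It is a minimal example under stated assumptions: the function names and the example matrices are hypothetical, and the stationary distribution is found by simple power iteration, which assumes the chain is aperiodic as well as irreducible.

```python
import math

def stationary(M, iters=2000):
    """Power iteration for the stationary distribution of a column-stochastic
    matrix M, with the convention p_i(s+1) = sum_j M[i][j] * p_j(s)."""
    d = len(M)
    v = [1.0 / d] * d
    for _ in range(iters):
        v = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
    return v

def column_entropy(M, i):
    """Shannon entropy (in bits) of column i of M."""
    return -sum(M[k][i] * math.log2(M[k][i])
                for k in range(len(M)) if M[k][i] > 0)

def markov_capacity(M):
    """log2(d) minus the entropy rate sum_i v_i H_i."""
    d = len(M)
    v = stationary(M)
    rate = sum(v[i] * column_entropy(M, i) for i in range(d))
    return math.log2(d) - rate

# Hypothetical two-state environment: each site copies its predecessor
# with probability 0.9 and flips with probability 0.1.
M_sticky = [[0.9, 0.1],
            [0.1, 0.9]]
# Hypothetical asymmetric chain, used to exercise the stationary distribution.
M_biased = [[0.8, 0.3],
            [0.2, 0.7]]

capacity_sticky = markov_capacity(M_sticky)
v_biased = stationary(M_biased)
print("capacity (sticky chain):", capacity_sticky)
print("stationary distribution (biased chain):", v_biased)
```

For the symmetric chain the entropy rate is just the binary entropy of the flip probability, so stronger correlations between neighbouring environment sites give a larger capacity.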

Environment that is a classical system
In the case of a classical environment, the second term of equation (16) is precisely the entropy of the environment, and so it can easily be computed in terms of the partition function. The partition function of the classical system is defined as:
$$Z = \sum_i \exp(-\beta E_i),$$
where the $E_i$ are the energies of the various possible configurations, and $\beta = 1/(k_B T)$, with $T$ the temperature and $k_B$ Boltzmann's constant. The entropy (in nats) of the system is given by the following expression:
$$S = \left(1 - \beta \frac{\partial}{\partial \beta}\right) \ln Z.$$
This means that in the case of a classical environment our channel capacity becomes
$$Q(\{E_n\}) = 1 - \log_2(e) \lim_{N\to\infty} \frac{1}{N} \left(1 - \beta \frac{\partial}{\partial \beta}\right) \ln Z,$$
where the $\log_2(e)$ converts us back from nats to bits. This expression means that we can use all the machinery from classical statistical mechanics to compute the channel capacity.
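For a small classical environment the partition function can be evaluated by brute force, which gives a concrete feel for the expression $1 - \log_2(e)\, S/N$ at finite $N$. The sketch below is illustrative only: it assumes a periodic Ising ring with the convention $E = -J\sum_i s_i s_{i+1} - h \sum_i s_i$, and the function names and parameter values are hypothetical. It uses the identity $(1 - \beta\,\partial/\partial\beta) \ln Z = \ln Z + \beta \langle E \rangle$.

```python
import itertools
import math

def ising_energies(n, J=1.0, h=0.0):
    """Energies of all 2^n configurations of a periodic Ising ring
    (assumed convention: E = -J sum s_i s_{i+1} - h sum s_i)."""
    energies = []
    for spins in itertools.product((-1, 1), repeat=n):
        bond = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        energies.append(-J * bond - h * sum(spins))
    return energies

def entropy_nats(energies, beta):
    """S = ln Z + beta <E>, which equals (1 - beta d/dbeta) ln Z."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    mean_energy = sum(w * e for w, e in zip(weights, energies)) / Z
    return math.log(Z) + beta * mean_energy

def capacity_estimate(n, beta, J=1.0, h=0.0):
    """Finite-n value of 1 - log2(e) * S / n."""
    s = entropy_nats(ising_energies(n, J, h), beta)
    return 1.0 - math.log2(math.e) * s / n

hot = capacity_estimate(8, beta=0.0)   # infinite temperature
cold = capacity_estimate(8, beta=5.0)  # low temperature
print("hot:", hot, "cold:", cold)
```

At infinite temperature the environment is maximally random and the estimate vanishes, while at low temperature only the two ordered configurations contribute, so the estimate approaches $1 - 1/N$.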

In particular, any spin-chain models from classical physics that can be solved exactly will lead to channels with memory that can be 'solved exactly' (provided that one can show that the regularized coherent information is indeed the capacity, a problem that we shall discuss in later sections). The most famous example of an 'exactly solvable' classical spin-chain model is the Ising model. We will discuss the classical Ising model in detail in the next section, as it will also be relevant to a certain class of quantum spin-chains.
However, there are also many classical spin-chain models that cannot always be solved exactly, but which can be connected to a wide variety of physically relevant models with interesting behaviour. As just one example, consider modifying the Ising spin-chain model to allow exponentially decaying interactions between non-adjacent spins. The resulting model can be related to a quantum double-well system, and is also known to exhibit a phase transition⁸. This means that the corresponding correlated channels will also exhibit similar behaviour, provided of course that the limit equation (6) truly represents the quantum channel capacity for the models.
In this paper, we will not give detailed discussion of any further models involving a classical environment (other than the classical one-dimensional (1D) Ising chain, which we will discuss in the next section). As our expression (16) is simply the entropy of the classical environment, the interested reader may simply refer to the many interesting classical models (both solvable and almost solvable) that are well documented in the literature. Of course, to make the analysis rigorous one would need to show that expression (6) is the formula for the quantum capacity in these cases. However, we conjecture that for most sensible models this should be true. In the final section of the paper, we will present an analysis that demonstrates this for a family of 1D models.

Quantum environments
Unfortunately, expression (16) does not correspond to a standard thermodynamic function of the environment state when the environment is modelled as a quantum system. It represents the entropy of the state that results when the environment is decohered by a dephasing operation on every qubit. Although this quantity is not typically considered by condensed matter physicists, there is some hope that it will be amenable to analysis using the techniques of many-body theory.
In this paper, we will make a small step towards justifying this hope by analytically considering a class of quantum environments inspired by recent work on so-called finitely correlated or matrix product states [22].
We will leave attempts to analytically study more complicated models to another occasion, although in figure 2, we present some numerical evidence that the quantum 1D Ising model displays a sharp change in capacity at the transition point.

Quantum capacity for finitely correlated environments described by rank-1 matrices
Finitely correlated or matrix product states are a special class of efficiently describable quantum states that have provided many useful insights into the nature of complex quantum systems [22]. In a recent paper [23] it has been demonstrated that a variety of interesting Hamiltonians can be constructed with exact matrix product ground states, such that the Hamiltonians in question undergo non-standard forms of quantum 'phase transition'.
As matrix product states are relatively simple to describe, one might hope that for such ground states the computation of equation (16) may be particularly tractable. In this section, we will see that for matrix product states involving rank-1 matrices the analysis is particularly simple, and may be reduced to the solution of a classical 1D Ising model.
Let us consider a 1D matrix product state, where each particle is a two-level quantum system, $|0\rangle$, $|1\rangle$. Let us assume that the matrices associated to each level are independent of the site label, and are given by $Q_0$ for level $|0\rangle$ and $Q_1$ for level $|1\rangle$. Hence, the total unnormalized state can be written as:
$$|\psi\rangle = \sum_{i_1, \ldots, i_N \in \{0,1\}} \mathrm{tr}\{Q_{i_1} Q_{i_2} \cdots Q_{i_N}\}\, |i_1 i_2 \ldots i_N\rangle.$$
From the form of expression (16) we see that we are only interested in the weights of the diagonal elements in the computational basis, or equivalently the state that results from dephasing each qubit. It is easy to see that this unnormalized state will be given by:
$$\sum_{i_1, \ldots, i_N \in \{0,1\}} \mathrm{tr}\{(Q_{i_1} \otimes Q_{i_1}^*)(Q_{i_2} \otimes Q_{i_2}^*) \cdots (Q_{i_N} \otimes Q_{i_N}^*)\}\, |i_1 \ldots i_N\rangle\langle i_1 \ldots i_N|.$$
In this expression if we relabel the matrices $A = Q_0 \otimes Q_0^*$ and $B = Q_1 \otimes Q_1^*$ then the probability of getting various outcomes when measuring the environment in the computational basis will be given by traces of all possible products of the $A$s and $B$s. For instance, the probability of getting 01100..., when measuring the environment in the computational basis will be given by:
$$p(01100\ldots) = \frac{\mathrm{tr}\{ABBAA \cdots\}}{C(N)},$$
where $N$ is the number of qubits in the environment, and $C(N)$ is a normalisation factor given by:
$$C(N) = \mathrm{tr}\{(A + B)^N\}.$$
$C(N)$ can be computed by diagonalization. In the rest of this section, we will be interested in cases where $A$ and $B$ are both square rank-1 matrices. Some of the example Hamiltonians discussed in [23] have ground states with this property, and in fact, some special cases of the noise models presented in [5,6,8,9] can also be expressed in the form of matrix product environments with rank-1 matrices (although in general those models require more than two matrices as they require environmental spins with more than two levels). We will show that in such situations the diagonal entropy in the computational basis is equivalent to the entropy of a related classical Ising chain.
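The outcome probabilities of a dephased matrix product environment can be tabulated directly for small $N$. The sketch below is a hedged illustration: the matrices `Q0` and `Q1` are arbitrary hypothetical real matrices (so that $Q^* = Q$), not taken from any model in the paper, and all function names are invented for this example. It forms $A = Q_0 \otimes Q_0^*$ and $B = Q_1 \otimes Q_1^*$, computes each probability as a trace of a product divided by the normalisation $C(N)$, and evaluates the diagonal entropy.

```python
import itertools
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def trace_of_product(mats):
    prod = mats[0]
    for m in mats[1:]:
        prod = matmul(prod, m)
    return sum(prod[i][i] for i in range(len(prod)))

def outcome_distribution(A, B, N):
    """p(x) = tr{M_x1 ... M_xN} / C(N), with M_0 = A and M_1 = B."""
    weights = {bits: trace_of_product([B if b else A for b in bits])
               for bits in itertools.product((0, 1), repeat=N)}
    C = sum(weights.values())  # equals tr{(A + B)^N} by linearity of the trace
    return {bits: w / C for bits, w in weights.items()}, C

def diagonal_entropy_bits(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical real matrices defining the matrix product state.
Q0 = [[1.0, 0.3], [0.2, 0.5]]
Q1 = [[0.4, 0.0], [0.1, 0.9]]
A, B = kron(Q0, Q0), kron(Q1, Q1)

dist, C = outcome_distribution(A, B, 4)
entropy = diagonal_entropy_bits(dist)
print("C(4) =", C, " diagonal entropy (bits) =", entropy)
```

Enumerating all $2^N$ strings is only feasible for small $N$; the point of the rank-1 reduction discussed next is to avoid this enumeration.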
The first thing to note is that rank-1 matrices are almost idempotent. In fact, if $A$ and $B$ are both rank-1 matrices, then we have that:
$$A^2 = aA, \qquad B^2 = bB,$$
where $a$ is the only nonzero eigenvalue of $A$, and $b$ is the only nonzero eigenvalue of $B$. Note that because of the form of $A$ and $B$ as the tensor product of a matrix and its complex conjugate, these eigenvalues $a$, $b$ must be non-negative. We can define the normalized matrices:
$$\tilde{A} = A/a, \qquad \tilde{B} = B/b.$$
These normalized matrices are idempotent. To see how this can help, consider a particular string, say,
$$\mathrm{tr}\{AABBBAA\}.$$
If we substitute $\tilde{A}$ and $\tilde{B}$ into this expression, and use the idempotency, then the strings of consecutive $A$s and $B$s will collapse to just one $\tilde{A}$ or $\tilde{B}$, with total factors of $a^4$ and $b^3$ inserted outside the trace:
$$\mathrm{tr}\{AABBBAA\} = a^4 b^3\, \mathrm{tr}\{\tilde{A}\tilde{B}\tilde{A}\} = a^4 b^3\, \mathrm{tr}\{\tilde{A}\tilde{B}\}.$$
It is easy to see that this form is quite general: the probability of getting a particular string will collapse to a simple expression. If there are $l$ occurrences of $A$ and $n - l$ occurrences of $B$ in the string (of length $n = N$), and $K$ counts the number of boundaries between blocks of $A$s and blocks of $B$s (counted cyclically, since the trace is cyclic), then the probability of the string becomes:
$$p = \frac{a^l\, b^{n-l}\, \mathrm{tr}\{(\tilde{A}\tilde{B})^{K/2}\}}{C(N)}.$$
Noting that $\tilde{A}\tilde{B}$ will also be a rank-1 matrix, let us use the letter $c$ to refer to its only nonzero eigenvalue. Hence, the probability becomes:
$$p = \frac{a^l\, b^{n-l}\, c^{K/2}}{C(N)}.$$
This expression tells us quite a lot. Firstly, for any given channel described by rank-1 MPS states, the only parameters that matter are $a$, $b$ and $c$. So we need not work with the actual matrices defining our state; we only need to work with matrices of our choice that have the same parameters $a$, $b$ and $c$. In the following, we will assert that $c$ is non-negative. This is guaranteed because of the following argument: it holds that $c = \mathrm{tr}\{\tilde{A}\tilde{B}\}$, because $\tilde{A}\tilde{B}$ is rank-1, but because $\tilde{A}\tilde{B} = Q_0 Q_1 \otimes Q_0^* Q_1^*/(ab)$, where $a$ and $b$ are non-negative, this means that $c$ must be non-negative.
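The collapse of string weights for rank-1 matrices can be verified exhaustively for short strings. The following sketch is an illustration under stated assumptions: the rank-1 matrices are hypothetical outer products rather than matrices of the form $Q \otimes Q^*$ (the identity being tested only needs rank 1 and a nonzero trace), and the function names are invented. It checks that $\mathrm{tr}\{\cdots\}$ for every string of length 6 equals $a^l b^{n-l} c^{K/2}$, with $K$ the number of block boundaries counted around the ring.

```python
import itertools

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def outer(u, v):
    """Rank-1 matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]

def string_weight(A, B, bits):
    """tr{M_x1 ... M_xn} with M_0 = A and M_1 = B."""
    prod = B if bits[0] else A
    for b in bits[1:]:
        prod = matmul(prod, B if b else A)
    return trace(prod)

def collapsed_weight(a, b, c, bits):
    """a^l b^(n-l) c^(K/2), with K the cyclic count of block boundaries."""
    n = len(bits)
    l = bits.count(0)  # number of occurrences of A
    K = sum(bits[i] != bits[(i + 1) % n] for i in range(n))
    return a ** l * b ** (n - l) * c ** (K // 2)

# Hypothetical rank-1 matrices.
A = outer([1.0, 0.5], [0.6, 0.4])
B = outer([0.3, 1.0], [0.2, 0.9])
a = trace(A)  # the only nonzero eigenvalue of a rank-1 matrix is its trace
b = trace(B)
c = trace(matmul(A, B)) / (a * b)  # nonzero eigenvalue of the normalized product

worst = max(abs(string_weight(A, B, bits) - collapsed_weight(a, b, c, bits))
            for bits in itertools.product((0, 1), repeat=6))
print("a, b, c =", a, b, c, " largest deviation:", worst)
```

Because only $a$, $b$ and $c$ enter, any other pair of rank-1 matrices with the same three parameters gives the same distribution over strings.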
So let us just go ahead and pick a convenient pair of matrices, equation (32), which clearly have nonzero eigenvalues a and b, respectively. So what about the eigenvalue of $\tilde{A}\tilde{B}$? For this choice of matrices a direct computation shows that it equals c. Hence, we find that the matrices that we have chosen have the correct values of a, b and c, as required. Now we notice that the matrices that we have chosen in equation (32) can be rewritten in terms of three new parameters J, D and M. It turns out that the parameters J and D will represent coupling constants and M will represent a magnetic field. To see this, let us insert the new parameters into the choice of A and B in equation (32). The matrices in such a rank-1 MPS are essentially the top row and bottom row of a transfer matrix. Comparing these matrices to the classical Ising transfer matrix, we see that they correspond to a classical Ising Hamiltonian with parameters J, M and D (where for convenience we now follow the usual physics convention that $s_i \in \{-1, +1\}$). The D is just a constant shift in spectrum, so we can simply consider the Ising chain with the Hamiltonian that contains only the J and M terms.

[Figure 3: plot for the model of equation (41). The symmetry in this plot is to be expected as the channel is invariant under the replacement $g \to -g$. However, near the 'phase transition' point $g = 0$, the gradient diverges.]
The partition function for such a chain of N particles depends upon the transfer matrix for this (rescaled!) Hamiltonian, equation (39). From the partition function, we can calculate the entropy, and hence the capacity of our channel. The resulting formula depends only upon $\lambda_1$, the maximal eigenvalue of the transfer matrix (39). Using these equations and equation (35), one can perform the (tedious) manipulation required to derive a formula for the regularized coherent information in terms of the coefficients a, b and c. Although we do not present the formula that is obtained, figure 3 shows the result for the model Hamiltonian presented in [23], for which the ground state is known to be a matrix product state. This model system has a non-standard 'phase transition' at $g = 0$, at which some correlation functions are continuous but non-differentiable, while the ground state energy is actually analytic [23]. As discussed in the caption of figure 3, this behaviour is mirrored in the channel capacity.
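As an illustration of how the entropy can be obtained from a transfer matrix, the sketch below uses one possible symmetric transfer matrix, $T = \begin{pmatrix} a & \sqrt{abc} \\ \sqrt{abc} & b \end{pmatrix}$, chosen here only because $\mathrm{tr}\{T^N\}$ reproduces the string weights $a^l b^{N-l} c^{K/2}$; it is an assumption of this example and is not claimed to be the matrix of equation (39). Raising each entry to a power β plays the role of an inverse temperature, and the entropy (in nats) is $\ln Z - \partial_\beta \ln Z$ at $\beta = 1$. The function names and parameter values are hypothetical.

```python
import math
from itertools import product

def eigenvalues(a, b, c, beta=1.0):
    """Eigenvalues of T(beta), the assumed transfer matrix with entries
    a^beta, b^beta on the diagonal and (abc)^(beta/2) off the diagonal."""
    p, q, x = a**beta, b**beta, (a * b * c) ** (beta / 2)
    mean = (p + q) / 2
    root = math.sqrt(((p - q) / 2) ** 2 + x * x)
    return mean + root, mean - root

def log_Z(a, b, c, N, beta=1.0):
    """ln tr{T(beta)^N} for a periodic chain of N sites."""
    lam1, lam2 = eigenvalues(a, b, c, beta)
    return math.log(lam1**N + lam2**N)

def entropy_transfer(a, b, c, N, h=1e-5):
    """Shannon entropy in nats: ln Z - d(ln Z)/d(beta) at beta = 1,
    with the derivative taken by a central finite difference."""
    dlogZ = (log_Z(a, b, c, N, 1 + h) - log_Z(a, b, c, N, 1 - h)) / (2 * h)
    return log_Z(a, b, c, N) - dlogZ

def entropy_brute_force(a, b, c, N):
    """Shannon entropy in nats of p(string) proportional to a^l b^(N-l) c^(K/2)."""
    weights = []
    for bits in product((0, 1), repeat=N):
        l = bits.count(0)
        K = sum(bits[i] != bits[(i + 1) % N] for i in range(N))
        weights.append(a**l * b**(N - l) * c**(K // 2))
    Z = sum(weights)
    return -sum(w / Z * math.log(w / Z) for w in weights if w > 0)

# Illustrative parameter values only.
S_transfer = entropy_transfer(0.7, 0.3, 0.4, N=10)
S_direct = entropy_brute_force(0.7, 0.3, 0.4, N=10)
```

For large N the second eigenvalue becomes negligible, so the entropy per site is governed by the maximal eigenvalue alone, in line with the role of $\lambda_1$ in the text.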

Conditions under which the regularized coherent information represents the true capacity
In this section, we will explore the conditions under which the regularized coherent information of equation (6), restated here as equation (42), correctly represents the true quantum capacity of our correlated channels, assuming of course that the limit exists. In the course of the discussion, we will also need to consider under what conditions the regularized Holevo bound, $\lim_{n\to\infty}\chi(\mathcal{E}_n)/n$ (equation (43)), represents the capacity of the channel for classical information. The Holevo bound $\chi(\mathcal{E})$ for a quantum channel $\mathcal{E}$ is defined as [24]:

$$\chi(\mathcal{E}) = \sup_{\{p_i,\rho_i\}} \left[ S\Big(\mathcal{E}\Big(\sum_i p_i \rho_i\Big)\Big) - \sum_i p_i\, S(\mathcal{E}(\rho_i)) \right],$$

where the supremum is taken over all probabilistic ensembles of states $\{p_i, \rho_i\}$, and S as usual represents the von Neumann entropy. As pointed out in [2,25], showing that equations (42) and (43) are upper bounds to the quantum/classical capacity of a correlated channel is straightforward: one can use exactly the same arguments used in the memoryless case [4], [26]-[28]. Showing that equations (42) and (43) also give lower bounds to the relevant capacities is not as simple, and may not be true for some many-body environments. However, it turns out that if the correlations in the many-body system fall off sufficiently strongly, then the channel will be reasonably well behaved and equation (42) is the true capacity. In this section, we will make this statement quantitative. We will closely follow the approach taken in [2] in the analysis of so-called forgetful channels. Some of the subtleties involved in the analysis are explained in more detail in section 6 of that paper. The conditions that we obtain are independent of the unitary which governs the interaction between each system particle and its corresponding environment, and so are applicable more widely than the dephasing interaction considered here.

A qualitative description of the argument
In this subsection, we present an intuitive sketch of the argument that we will follow. Imagine that the correlated channel is partitioned into large blocks that we shall call live qubits, separated by small blocks that we shall call spacer qubits. The idea is to throw away the spacer qubits, inserting into them only some standard state, and to only use the live qubits to encode information (see figure 4). If we are to follow this procedure, then we will not be interested in the full channel, but only in its effect upon the live qubits. Let us use the phrase live channel to describe the resulting channel, i.e. the reduced channel that acts on the live qubits only. If the correlations in the many-body system decay sufficiently strongly, then by throwing away just a few spacer qubits we will find that the live channel closely approximates (in a sense to be discussed later) a memoryless channel. Let us call this memoryless channel the product channel. One can imagine trying to use the codes that achieve the capacity of the product channel, without any further modifications, as codes for the live channel. It turns out that under the 'right conditions' these codes are not only good codes for the live channel, but their achievable rates approach equation (42). The goal of the next subsection will be to explore exactly what these 'right conditions' are.
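The live/spacer blocking can be made concrete with a small sketch. The helper below is purely illustrative (the function name and the block sizes are not from the original text): it lists which sites of a chain belong to live blocks and which to spacer blocks, assuming each section consists of a live block followed by a spacer block.

```python
def partition_chain(l, s, v):
    """Split a chain of N = v * (l + s) sites into v sections, each made of
    a live block of l sites followed by a spacer block of s sites.
    Returns two lists of index ranges: (live_blocks, spacer_blocks)."""
    live, spacer = [], []
    for k in range(v):
        start = k * (l + s)
        live.append(range(start, start + l))
        spacer.append(range(start + l, start + l + s))
    return live, spacer

# Illustrative sizes: live blocks of 8 sites, spacers of 2 sites, 3 sections.
live, spacer = partition_chain(l=8, s=2, v=3)
fraction_used = 8 / (8 + 2)  # fraction of sites that carry information
```

Only the sites in `live` are used for encoding; the sites in `spacer` are filled with a standard state, so the fraction of the chain that is sacrificed is $s/(l+s)$.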
The quantitative arguments follow the method used in [2], where three steps are required to show that equation (42) is an achievable rate:
[A] First, we must show that product codes for the transmission of classical information are good codes for the live channel.
[B] Then we must show that these good codes allow the regularized Holevo quantity to be an achievable rate. This is done by showing that the product channel Holevo quantity (which can be achieved by product codes) essentially converges to the regularized Holevo quantity for the whole channel.
[C] Then we must argue that these arguments for the transmission of classical information can be 'coherentified' (in the manner of [4]) to a good quantum code attaining equation (42).
In the next subsection, we go through this process in detail to derive sufficient conditions to demonstrate the validity of equation (42) for our many-body channels.

Derivation of the conditions
In this subsection, we will go through steps [A], [B] and [C] in turn.

Step [A].
We will assume that the many-body systems in question satisfy periodic boundary conditions and are translationally invariant (this means that the corresponding correlated channel $\{\mathcal{E}_n\}$ does not quite fit into the definition of causality proposed by [2]; however, it allows us to avoid the technicalities required to analyse a genuinely infinite many-body system). Let us consider a specific length of chain N, split into $v = N/(l+s)$ sections, each consisting of one live block of length l and one spacer block of length $s := \delta l \ll l$. In the following the sizes N, l will generally be taken to be large enough that the statements we use hold. The live channel is defined as follows: the state A that Alice inputs to the live channel interacts, via the unitary U, with the reduced state of the environment on the live blocks $L_1, L_2, \ldots, L_v$ from sections $1, \ldots, v$, and the trace is then taken over the environment. Due to translational invariance the reduced state of the environment corresponding to each given live block will be the same, and so let us denote this state by $\rho^{l}_{N}$. With this notation, the product channel is defined in the same way, but with the environment of the live blocks replaced by the product state $(\rho^{l}_{N})^{\otimes v}$. Note that both the live and product channels have a dependence upon both the live block length l and the total number of spins N.

Let us first consider using the product and live channels to send classical information. By definition, if a given rate R is achievable for the product channel, then for every error tolerance $\epsilon > 0$ there is an integer $n_0$ such that for $n > n_0$ channel uses there exist a set of $\nu = 2^{nlR}$ codeword nl-qubit states $\{\rho_1, \ldots, \rho_\nu\}$ and a corresponding decoding measurement $\{M_1, \ldots, M_\nu\}$ such that each codeword $\rho_i$ is decoded correctly by $M_i$ with probability at least $1 - \epsilon$. If the same codebook and decoding measurements are used without alteration for the live channel, then the error (48) can be written as the error for the product channel plus the difference between the outcome probabilities for the live and product channels. As the addition of Alice's state A, the unitary interaction U, and the POVM element $M_i$ can all be viewed as one new POVM element acting only on the environment, this difference can be bounded [24] using $|\mathrm{tr}\{M(\rho - \sigma)\}| \le \|\rho - \sigma\|_1$. Hence, the error (48) in using the product code for the live channel can be bounded by $\epsilon$ plus the trace-norm distance between the reduced state of the environment on the live blocks and the product state $(\rho^{l}_{N})^{\otimes v}$.

Assume that the rightmost term in this bound (the trace-norm distance between the two environment states) is bounded as in equation (49), for positive constants C, E and F. This assertion will be demonstrated for some special cases in section 8. Then this would mean that the error becomes bounded by the sum of $\epsilon$ and the right-hand side of (49). The $\epsilon$ part of this error depends upon the number of blocks v. One potential problem that we immediately face is that to decrease $\epsilon$ we need to increase v; however, increasing v inevitably increases the last error term. It is hence not a priori clear that both error terms can be made to decrease simultaneously. However, it can be shown [2,29] that if we pick $v = l^5$, $s = \delta l$ and $\delta > 0$ then both error components can be made to vanish as l increases, while still operating at the achievable rates of the product channels (in fact, the number of sections v could be given any polynomial or subexponential dependence on l provided that asymptotically $v(l) > l^5$). So we see that provided condition (49) can be demonstrated for the many-body systems that we consider, the product code works well for the live channel, as long as a large enough live block size is used (however small the fraction of spacer qubits $\delta$). Hence, equation (49) is the first of our sufficient conditions. In section 8, we demonstrate that condition (49) (which is identical to equation (63) later in the paper) holds for some interesting classes of many-body system, including matrix product states.

To show that the Holevo capacity of the product channel (which is just a product of the reduced channel on a single live block) is essentially the regularized capacity, we need to show that the reduced channel on a single live block is essentially independent of the total length of the chain. Hence, we need to show that the reduced state of l contiguous environment spins is approximately the same regardless of whether the chain is (a) much longer than l, or (b) slightly longer than l.
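To see why a polynomial number of sections is compatible with an exponentially decaying correlation term, the sketch below evaluates a bound of the assumed form $C\, v\, l^{E} \exp(-F s)$ with $v = l^5$ and $s = \delta l$. Both this functional form and the numerical values of C, E, F and δ are assumptions made only for this illustration; they are not taken from equation (49).

```python
import math

def assumed_bound(l, C=1.0, E=2.0, F=0.5, delta=0.1):
    """Illustrative bound C * v * l**E * exp(-F * s), with v = l**5 sections
    and spacer length s = delta * l. The form and constants are assumptions."""
    v = l**5
    s = delta * l
    return C * v * l**E * math.exp(-F * s)

# The polynomial prefactor dominates for moderate l, but the exponential
# factor wins once l is large enough.
values = {l: assumed_bound(l) for l in (100, 1000, 2000, 4000)}
```

With these illustrative constants the bound first grows with l and only later decays, which is why the statements in the text are made for sufficiently large live blocks.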

Step [B].
Now that we know that the product code is also suitable for the live channel, it is necessary to check that the regularized Holevo bound (i.e. the regularized Holevo bound for the full channel without throwing spins away) is actually an achievable rate for the live/spacer blocking code that has been used. In order to make this analysis it will be convenient to define a little more notation. For a total chain of length n, as before, let $\mathcal{E}_n$ denote the noisy channel. For a contiguous subset of $j \le n$ of the spins that Alice sends, let $\mathcal{E}^{j}_{n}$ denote the effect of the channel only upon those spins. Due to translational invariance the location of the spins is irrelevant, as long as they form a contiguous block.
A given product channel with live block length l and a total number of spins $N = v(l+s) = l^6(1+\delta)$ has an associated Holevo quantity (written with a • acting merely as a place holder for the inputs to the channel). Our goal is to show that for large enough l this expression is close to the regularized Holevo bound, equation (43) (see figure 5). It is not too difficult to derive conditions under which this will be the case. Suppose that we have a spin-chain of total length $l + \Delta(l)$, where $\Delta(l) \ll l$. In fact we will only be considering functions $\Delta(l) > 0$ such that $\lim_{l\to\infty} \Delta(l)/l = 0$. The subadditivity and the Araki-Lieb inequalities for the entropy ([24], section 11.3.4), i.e.

$$S(A) + S(B) \ge S(AB) \ge |S(A) - S(B)|$$

can be inserted straightforwardly into the Holevo bound to show that the Holevo quantities for the full chain of $l + \Delta$ spins and for a subset of l of those spins differ by an amount at most proportional to $\Delta \log(d)$, where d is the dimension of each communication spin (see also [2]). This follows from the fact that the Holevo bound is the difference of two entropic terms, each of which can change by at most $\log(d)$ for each d-level particle that is traced out. Dividing through by l gives equation (54), which tells us that the Holevo quantity for a subset of l spins is very close to the Holevo quantity for a full chain of $l + \Delta$ spins, as long as $\Delta/l$ is small. Our goal now is to show that if the subset of l spins is drawn from a much longer chain of length $N = l^6(1+\delta)$, then the subset still has essentially the same value for the Holevo quantity, and so the regularized Holevo quantity represents the capacity of the product channel. Intuition suggests that if the correlations decay fast enough, then it should be the case that for $N = l^6(1+\delta)$ we should have approximately $\mathcal{E}^{l}_{l+\Delta} \sim \mathcal{E}^{l}_{N}$, as a given region should not 'feel' how long the chain is. Now suppose that we define

$$P(l, \Delta(l)) := \left\| \rho^{l}_{l+\Delta(l)} - \rho^{l}_{N} \right\|_1 .$$

Then for a given input state $\omega$ on the live block in question the output states will differ by at most:

$$\left\| \mathcal{E}^{l}_{l+\Delta}(\omega) - \mathcal{E}^{l}_{N}(\omega) \right\|_1 \le P(l, \Delta(l)) .$$

Hence, Fannes inequality [30] (of which a version suitable for our purposes is $|S(X) - S(Y)| \le \|X - Y\|_1 \log(d) + \log(e)/e$) can be used to bound the difference in the two Holevo functions $\chi(\mathcal{E}^{l}_{l+\Delta})$, $\chi(\mathcal{E}^{l}_{N})$ in terms of $P(l, \Delta(l))$. Putting this bound together with equation (54) and taking the limit of large l shows that the correction terms vanish under the following conditions. So, as long as we can pick a function $\Delta(l)$ such that $\lim_{l\to\infty} \Delta(l)/l = 0$, and such that the norm distance $P(l, \Delta(l))$ vanishes with increasing l, then we know that the regularized Holevo quantity is the correct capacity.
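The version of Fannes inequality quoted above can be illustrated in the simplest setting of two commuting (diagonal) density matrices, for which the von Neumann entropy reduces to the Shannon entropy of the eigenvalues and the trace norm to the $\ell_1$ distance. This restriction, the function names and the example distributions are choices made for this sketch only; entropies are in bits.

```python
import math

def shannon_entropy(p):
    """Entropy in bits of a probability vector (eigenvalues of a diagonal state)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def l1_distance(p, q):
    """Trace norm ||X - Y||_1 for two diagonal states with spectra p and q."""
    return sum(abs(x - y) for x, y in zip(p, q))

def fannes_bound(p, q):
    """Right-hand side ||X - Y||_1 log(d) + log(e)/e, with d = len(p)."""
    d = len(p)
    return l1_distance(p, q) * math.log2(d) + math.log2(math.e) / math.e

# Illustrative spectra of two states on a d = 4 dimensional system.
p = [0.5, 0.25, 0.125, 0.125]
q = [0.25, 0.25, 0.25, 0.25]
entropy_gap = abs(shannon_entropy(p) - shannon_entropy(q))
bound = fannes_bound(p, q)
```

In the argument of the text the relevant dimension is that of a whole block of l spins, so the logarithm contributes a factor proportional to l, which is why the bound is divided by l before the limit is taken.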

Step [C].
Now that we have understood the conditions under which the regularized Holevo bound represents the capacity for the transmission of classical information, we need to try to undertake the same analysis for quantum information. As was also exploited in [2], the way that Devetak's work [4] proves that the regularized coherent information equals the quantum channel capacity of memoryless channels is to first prove a capacity formula for the transmission of private (secret) classical information, and then to make the private coding scheme coherent. This 'coherentification' procedure applies directly to correlated channels, and so to argue that the regularized coherent information (42) is also achievable for channels with correlated noise, it is sufficient to show that the private information codes that work for the product channel are also suitable for the live channel. So now suppose that a malicious eavesdropper, Eve, is in charge of the environment of our correlated channel. We need to prove that the information that she can access is still limited when product private codes are used for the live channel. The output that Eve obtains is the state of the environment after its interaction with Alice's input, where the environment state $\rho$ must be extended to a state $\tilde{\rho}$ that gives a closed system (i.e. $\tilde{\rho}$ is a pure state), the entire environment of which is assumed to be totally under Eve's control. In the case of the product channel, the privacy condition means that for all $\epsilon > 0$ there is a $v_0$ such that for all $v > v_0$ there exists some standard state $\theta$ such that Eve's output is within $\epsilon$ of $\theta$ in trace norm for all inputs A from the privacy code (readers familiar with [2,4] will note that in those works an extra randomization index was included as a label in the code states; however, in our context this is unimportant and so we omit it for ease of notation). Applying the same code to the live channel and using the triangle inequality, Eve's output differs from $\theta$ by at most $\epsilon$ plus a term that represents the norm difference between the purifications of the two different possible environmental states.

We are free to pick the purifications that give the greatest overlap between the two environment states. Although this may seem like a contradictory step, as we should allow Eve to have control over the environment, it is in fact valid because the product code is by assertion private for all possible extensions of the product channel. The coherentification procedure leads to the distribution of maximally entangled states which are automatically uncorrelated from the environment, whatever purification Eve decided to use. This last term hence becomes $2\sqrt{1 - F^2}$, where F is the Uhlmann fidelity [24] between the two environment states (using the fact that for two pure states the overlap and the trace distance are related by $\| |\phi\rangle\langle\phi| - |\psi\rangle\langle\psi| \|_1 = 2\sqrt{1 - |\langle\psi|\phi\rangle|^2}$, see Nielsen and Chuang [24, p 415, equation (9.99)], noting that the factor of 2 comes in from a different convention for the trace norm). Hence, using the well-known relationship between the Uhlmann fidelity and the trace norm of two states x and y, which gives $2\sqrt{1 - F(x,y)^2} \le 2\sqrt{\|x - y\|_1}$, we find that Eve's output for the live channel is within $\epsilon$ plus twice the square root of the trace-norm distance between the two environment states of the standard state $\theta$. Putting the norm bound (49) (which we have not yet justified) into this expression gives a quantity which is small enough for the assignment $v = l^5$, $s = \delta l$, as long as l is large enough.

Summary of sufficient conditions.
All of this analysis means that in order to argue that the regularized coherent information and the regularized Holevo bound are the true quantum or classical capacities, the following two conditions taken together are sufficient:
1. To show that the product codes are also good for the partitioned memory channel, the norm bound of equation (63) (identical to condition (49)) must hold for some positive constants C, E and F, where $N = l^6(1+\delta)$, $s = \delta l$.
2. To show that the regularized coherent information is the appropriate rate for these codes we need to show that

$$\lim_{l\to\infty} \left\| \rho^{l}_{l+\Delta(l)} - \rho^{l}_{l^6(1+\delta)} \right\|_1 = \lim_{l\to\infty} P(l, \Delta(l)) = 0 \qquad (64)$$

for some function $\Delta(l)$ such that $\lim_{l\to\infty} \Delta(l)/l = 0$. In fact, if equation (63) holds, in this condition we could replace $\rho^{l}_{l^6(1+\delta)}$ with $\rho^{l}_{vl(1+\delta)}$, where the number of sections v is any function of l with a sub-exponential dependence (e.g. a polynomial) that is asymptotically larger than $l^5$.
To demonstrate that these conditions hold for the most general types of many-body system is a non-trivial task. However, in a number of interesting cases it is possible to prove that these conditions hold. In the remaining sections, we demonstrate that these conditions hold for finitely correlated/matrix product states, as well as for a class of 1D bosonic system whose ground states may be determined exactly.

Proof of property equation (63) for various states
In this section, we provide proofs for the validity of equation (63) for a variety of quantum states. These include matrix-product states for which we have discussed explicit memory channels in this paper. In fact, the proofs that we present for matrix product states are essentially contained in previous works such as [22]. We also demonstrate analogous results for the ground state of quasi-free bosonic systems as such systems may provide interesting examples for future work. In addition to the results we present here and in the next section, M Hastings has demonstrated that conditions (63) and (64) hold for certain interesting classes of fermionic system [31].

Matrix product or finitely correlated states
The proof that we present here is essentially one part of the proof of proposition 3.1 in [22]. Our presentation of the argument benefits from the arguments presented in appendix A of [32] and the review article [33].
An important tool in the argument is the use of the Jordan canonical form [34]. As some readers may be unfamiliar with this technique, we briefly review it here. If a square matrix M has complex eigenvalues $\{\lambda_\alpha\}$, then it can be shown that a basis may be found in which the operator can be expressed as the following direct sum:

$$M = \bigoplus_\alpha \left( \lambda_\alpha I_\alpha + N_\alpha \right), \qquad (65)$$

where each $I_\alpha$ is an identity sub-block with an appropriate dimension, and each $N_\alpha$ is a nilpotent matrix, meaning that for each $N_\alpha$ there is some positive integer k such that $N_\alpha^k = 0$. Moreover, each nilpotent matrix $N_\alpha$ itself may be written as a block-diagonal matrix, where each sub-block is either a zero matrix, or is all zero except for 1s positioned on the super-diagonal. In other words, each sub-block of a given $N_\alpha$ is either zero or is of the form:

$$\begin{pmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{pmatrix}.$$

The decomposition (65) is the Jordan canonical form of M. In our case the matrix M will be constructed from a completely positive map that can be associated to the matrix product states that we consider. One consequence of this, for reasons that we discuss later, is that we will ultimately only be interested in operators M whose eigenvalues satisfy $1 = \lambda_1 = |\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \ldots$. For a related reason, we will also only be interested in matrices M for which there is a unique eigenvector corresponding to $\lambda_1$, and also for which the sequence of integer powers $M^r$, $r = 1, \ldots, \infty$ is bounded. For matrices obeying these extra conditions, we may exploit the Jordan normal form in the following way. Pick the smallest integer k such that $N_\alpha^{k+1} = 0$ for all $\alpha$. Then, for $r \ge k$, $M^r$ can be written as follows:

$$M^r = \bigoplus_\alpha \sum_{m=0}^{k} \binom{r}{m} \lambda_\alpha^{r-m} N_\alpha^m .$$

If r is large, then all blocks corresponding to $\alpha \ne 1$ will become small because of the $\lambda_\alpha^{r-m}$ term, and so the only sizeable contribution to $M^r$ will come from the block corresponding to $\alpha = 1$, i.e. the sub-block:

$$\sum_{m=0}^{k} \binom{r}{m} N_1^m . \qquad (68)$$

Now we have asserted that the sequence of operators $M^r$ is bounded. However, it is not too difficult to show that for $r = 1, \ldots, \infty$ the sequence of operators (68) becomes unbounded if $N_1$ is nonzero. This means that if the sequence of operators $M^r$ is bounded, we are forced to conclude that $N_1 = 0$, and hence, as M has a unique maximal eigenvector, this means that $I_1$ is an identity matrix of dimension $1 \times 1$, i.e. $I_1 = 1$.
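Putting all this together means that a square matrix M with a unique maximal eigenvalue 1, such that the sequence $M^r$ is bounded, may be decomposed as:

$$M = 1 \oplus \bigoplus_{\alpha \ge 2} \left( \lambda_\alpha I_\alpha + N_\alpha \right).$$

This means that $M^r$ can be written in the form:

$$M^r = 1 \oplus \left[ \bigoplus_{\alpha \ge 2} \sum_{m=0}^{k} \binom{r}{m} \lambda_\alpha^{r-m} N_\alpha^m \right].$$

For our purposes it will be convenient to pull out a factor $r^k$ from the term in square brackets:

$$M^r = 1 \oplus r^k \left[ \bigoplus_{\alpha \ge 2} \sum_{m=0}^{k} \frac{1}{r^k}\binom{r}{m} \lambda_\alpha^{r-m} N_\alpha^m \right]. \qquad (71)$$

This has the advantage of making the operator in square brackets bounded even as $r \to \infty$. This form for $M^r$ will be extremely useful to us. We will apply it to a completely positive map that can be associated to any matrix product state. Using this, we will show the decay of correlations required.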
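The behaviour of $M^r$ can be illustrated with a small numerical sketch. The matrix below is a hypothetical example, not one derived from any matrix product state: it has a unique eigenvalue 1 and a single $2\times 2$ Jordan block with eigenvalue $\lambda_2 = 0.5$, so that $k = 1$. The deviation of $M^r$ from its limit, divided by $r^k |\lambda_2|^r$, should then stay bounded.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(M, r):
    """Compute M**r by repeated multiplication (r >= 0)."""
    dim = len(M)
    P = [[float(i == j) for j in range(dim)] for i in range(dim)]
    for _ in range(r):
        P = matmul(P, M)
    return P

def deviation(M, r, limit):
    """Largest absolute entry of M**r - limit."""
    P = matrix_power(M, r)
    return max(abs(P[i][j] - limit[i][j])
               for i in range(len(M)) for j in range(len(M)))

lam = 0.5   # second-largest eigenvalue (illustrative value)
k = 1       # the nilpotent part N of the Jordan block satisfies N**2 = 0
M = [[1.0, 0.0, 0.0],
     [0.0, lam, 1.0],
     [0.0, 0.0, lam]]
limit = [[1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]

# Ratio of the deviation to r**k * lam**r for several powers r.
ratios = [deviation(M, r, limit) / (r**k * lam**r) for r in (5, 20, 60)]
```

For this example the off-diagonal entry of the Jordan block in $M^r$ is $r\lambda_2^{r-1}$, so the ratio is constant; in general one only expects it to remain bounded.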
The relationship between matrix product states and CP maps is described in detail in [22,33]. Any matrix product state can be generated by repeatedly acting on a fictitious ancilla particle using an appropriately constructed CP map. Suppose that we have a matrix product state of N particles j ∈ {1, . . . , N }, each associated with a Hilbert space H j . Consider also a fictitious 'generator' ancilla system on a finite dimensional space H gen . It can be shown that the state of the N particles in the matrix product state can be defined as the state that results from an appropriate CP map T : B(H gen ) → B(H gen ) ⊗ B(H j ) which generates each particle j ∈ {1, . . . , N } in sequence. The generating ancilla is then traced out to give the matrix product state of the N particles. Related to the map T is the completely positive map Q, which is the restriction of the map T to the generator ancilla as both input and output. The map Q essentially represents the transfer matrix of the MPS-for a review of how to construct T for matrix product states, see [33].
The starting state of the fictitious generator ancilla is usually taken as a fixed point of Q, in order that the MPS be translationally invariant. Away from a phase transition point, the CP map Q has a unique fixed point of eigenvalue 1, with all other eigenvalues of absolute value strictly less than 1. Let this fixed point of Q be the state $\sigma$. Furthermore, as Q is a CP map, it is clear that the sequence of maps $Q^r$ is bounded. Hence, as Q acts as a finite dimensional linear operator taking the ancilla space to itself, we can also think of it as a square matrix and apply equation (71) to represent powers $Q^r$ of the map. Let us use this form to compute the action of $Q^r$ on an input density matrix $\omega$ of the fictitious ancilla. As any density matrix is taken to a density matrix by a CP map, we may apply (71) to give that the output of $Q^r$ must have the following form:

$$Q^r(\omega) = \sigma + r^k \lambda_2^r\, \Theta_r ,$$

where in the second term $\Theta_r$ is a sequence of operators whose norm can be bounded, and the $r^k \lambda_2^r$ term (which governs the size of the deviation from the final fixed point $\sigma$) arises as a consequence of equation (71). This equation essentially states that the deviation of $Q^r(\omega)$ from $\sigma$ falls off as fast as $r^k \lambda_2^r$. Although the explicit form of $\Theta_r$ depends upon the input state, a bound on the norm of $\Theta_r$ can easily be constructed that is independent of $\omega$. This means that $\lim_{r\to\infty} Q^r = \Omega$, where we define $\Omega$ as the (idempotent) channel that discards the input ancilla state and creates a copy of $\sigma$ in its place. For finite r we may write:

$$Q^r = \Omega + r^k \lambda_2^r\, \Theta , \qquad (73)$$

where $\Theta$ now represents operations of bounded norm acting on states of the ancilla (we have dropped the potential r-dependence of $\Theta$ to keep notation uncluttered, as it is unimportant). Our goal in the remainder of this subsection will be to apply this deviation estimate to show that equation (63) holds for matrix product systems. This can be done in two steps.

In the first step, we show that for two large blocks of length L separated by a distance d (eventually L will become the length of the live blocks l, and d will become the spacer distance $\delta l$), the reduced state can be approximated by a product. The second step will use the triangle inequality to go from this result to the full condition (63).
The first step proceeds as follows. For convenience we will consider a chain of total length $2n + 2L + d$, for which the state of the whole chain can be written:

$$\mathrm{tr}_{\rm anc}\left\{ T^{\,n+L+d+L+n}(\sigma) \right\}. \qquad (74)$$

If we take the limit as $n \to \infty$, the reduced state $\rho_{AB}$ of the two large blocks A and B, each of length L, and the individual reduced states $\rho_A$ and $\rho_B$ of each block, can be written in terms of the maps T and Q acting on $\sigma$; in $\rho_{AB}$ the two blocks are connected by the map $Q^d$. Now from equation (73), we know that up to a correction $d^k \lambda_2^d$, the channel $Q^d$ becomes equivalent to $\Omega$. Hence, we find that $\rho_{AB}$ and $\rho_A \otimes \rho_B$ deviate as follows:

$$\left\| \rho_{AB} - \rho_A \otimes \rho_B \right\|_1 \le \mathrm{const} \times d^k |\lambda_2|^d ,$$

where the constant is independent of L. Now for our situation L is simply the size of each block l, and the spacing between the blocks is $s = \delta l$. Hence, for two live blocks separated by one spacer block this bound becomes:

$$\left\| \rho_{AB} - \rho_A \otimes \rho_B \right\|_1 \le \mathrm{const} \times (\delta l)^k |\lambda_2|^{\delta l} .$$

To go from this result for two live blocks to equation (63), one simply notes that the above argumentation can also be applied to blocks of unequal size, and then applies the triangle inequality repeatedly to obtain a bound of a similar structure to (63), with only a polynomial overhead in l.

Bosonic systems
Here, we consider chains of harmonic oscillators whose Hamiltonian can be written in the form

$$\hat{H} = \frac{1}{2}\left( \hat{p}^T \hat{p} + \hat{x}^T V \hat{x} \right), \qquad (79)$$

where $\hbar = 1$, we arrange the canonically conjugate position and momentum operators in vector form $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_n)$ and $\hat{p} = (\hat{p}_1, \ldots, \hat{p}_n)$, and we have introduced the so-called potential matrix V [35]. The potential matrix encodes the interaction pattern of the harmonic oscillators in the chain. From now on we assume that V is a k-banded matrix, i.e. $V_{i,j} = 0$ for $|i - j| > k/2$. Physically this implies that the interaction strengths vanish strictly beyond the (k/2)th neighbour. An important quantity in this context is the symplectic matrix $\sigma$, which is defined by $\sigma_{jk} = [\hat{R}_j, \hat{R}_k]$, where we denote $\hat{R} = (\hat{x}_1, \ldots, \hat{x}_n, \hat{p}_1, \ldots, \hat{p}_n)$.
The ground state of the Hamiltonian equation (79) is then a Gaussian state [36,37] in the sense that its characteristic function $\chi_\rho(z) = \mathrm{tr}[\hat{\rho}\hat{W}_z]$, where $\hat{W}_z = e^{i z^T \sigma \hat{R}}$ is the Weyl operator, is Gaussian, i.e. the exponential of a quadratic form in z that is fixed by the second moments $\gamma_{j,k} = 2\mathrm{Re}\,\mathrm{tr}[\hat{R}_j \hat{R}_k \hat{\rho}]$ and the displacement $D = \sigma\, \mathrm{tr}[\hat{R}\hat{\rho}]$. The density operator may then be recovered from the characteristic function. For the ground state the first moments vanish due to the reflection symmetry of the Hamiltonian. Therefore, the ground state is fully characterized by the covariance matrix $\gamma$, which is defined as $\gamma_{j,k} = 2\mathrm{Re}\,\mathrm{tr}[\hat{R}_j \hat{R}_k \hat{\rho}]$, where we have explicitly used the fact that the first moments vanish. An explicit computation reveals that the covariance matrix of the ground state of Hamiltonian equation (79) is given by $\gamma = V^{-1/2} \oplus V^{1/2}$ [35]. For the following proof of equation (63), we will bound the trace norm by the quantum relative entropy using the inequality of [38].
The entropy of a Gaussian state is determined by the symplectic eigenvalues $\{\mu_j\}$ of $\gamma$, which are simply the standard eigenvalues of the matrix $i\gamma\sigma$. We then find [37]

$$S(\rho) = \sum_j f(\mu_j), \qquad (82)$$

where

$$f(x) = \frac{x+1}{2}\log_2\frac{x+1}{2} - \frac{x-1}{2}\log_2\frac{x-1}{2} .$$

In the following proof, we will need to compute reduced density matrices. On the level of covariance matrices this is particularly easy, as the covariance matrix of a sub-system A is obtained simply by removing all entries referring to operators in the complement of A.
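The entropy formula for Gaussian states is simple to evaluate once the symplectic eigenvalues are known. The sketch below assumes the standard form $S = \sum_j f(\mu_j)$ with $f(x) = \frac{x+1}{2}\log_2\frac{x+1}{2} - \frac{x-1}{2}\log_2\frac{x-1}{2}$ and the convention $\mu_j \ge 1$ (with $\mu_j = 1$ for a pure mode); the function names and the sample eigenvalues are illustrative only.

```python
import math

def f(mu):
    """Entropy contribution (in bits) of one mode with symplectic eigenvalue mu >= 1."""
    if mu < 1.0 - 1e-9:
        raise ValueError("symplectic eigenvalues must satisfy mu >= 1")
    mu = max(mu, 1.0)  # absorb tiny rounding errors below 1
    plus = (mu + 1.0) / 2.0
    minus = (mu - 1.0) / 2.0
    # The term x*log2(x) is taken to be 0 at x = 0 (its limiting value).
    minus_term = minus * math.log2(minus) if minus > 0 else 0.0
    return plus * math.log2(plus) - minus_term

def gaussian_entropy(symplectic_eigenvalues):
    """Von Neumann entropy (bits) of a Gaussian state from its symplectic spectrum."""
    return sum(f(mu) for mu in symplectic_eigenvalues)

# Illustrative spectra: a pure two-mode state, and a state with one mixed mode.
S_pure = gaussian_entropy([1.0, 1.0])
S_mixed = gaussian_entropy([1.0, 3.0])
```

A mode with $\mu = 3$ corresponds to a thermal state with mean occupation number one, which provides a simple consistency check on the formula.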
Before we proceed to the proof of property equation (63) we first derive a useful lemma that extends Fannes inequality to Gaussian states. Fannes showed [30] that for d-dimensional systems and $\epsilon = \mathrm{tr}|\rho - \sigma| \le 1/e$, we find $|S(\rho) - S(\sigma)| \le \epsilon \log d - \epsilon \log \epsilon$. Obviously, in this form the theorem cannot be extended to infinite dimensional continuous variable systems, as this would imply $d \to \infty$, which renders the upper bound trivial. For Gaussian states, however, it is possible to derive a more useful Fannes-type inequality.

Lemma 1 (Bosons). Given two N -mode Gaussian statesρ i characterized by covariance matrices γ i with symplectic eigenvalues {µ
Here $B = 0.176\,230\,08\ldots$ is the non-zero solution of $(k+2)\log_2(k+2) + k\log_2 k = 2$, so that $f(1+k) - f(1) \le -k\log_2 k$ for $0 < k \le B$, and for all $x > 1$ and $k > 0$ we find $\frac{d}{dx}\left[ f(x+k) - f(x) + k\log_2 k \right] \le 0$. Thus, we have $0 \le f(x+k) - f(x) \le -k\log_2 k$ for all $x \ge 1$ and $k \le B$. Inserting this into the entropy formula equation (82), we then find the first inequality in lemma 1. The second inequality is obtained from the fact that the entropy of any probability distribution with N nonzero probabilities is bounded by $\log_2 N$. This completes the proof.
It is worth noting that an analogous theorem may also be proven for the fermionic case (see footnote 9).

Theorem 1.
In an infinite chain of harmonic oscillators in its ground state we pick two blocks, each consisting of L contiguous harmonic oscillators. The two blocks are separated from each other by d harmonic oscillators. Then we find that for some polynomial C(L) and constant α independent of d.
Proof. We will proceed using lemma 1 to bound the entropy difference $S(\rho_{AB} \| \rho_A \otimes \rho_B) = S(\rho_A \otimes \rho_B) - S(\rho_{AB})$. To this end we need to bound the difference in symplectic eigenvalues of the covariance matrices corresponding to $\rho_A \otimes \rho_B$ and $\rho_{AB}$. Property 1 then yields the desired result.
We denote with γ ground the ground state of the complete system and write the covariance matrix of the two blocks of harmonic oscillators (both of length L) in the (x 1 , p 1 , x 2 , p 2 , . . . ) ordering as Given that the potential matrix V is banded we know from [39]- [41] that the entries of γ ground decrease exponentially in the distance d from the main diagonal. Therefore, the entries of AB are exponentially decreasing with distance from the lower left corner whose entry is of the order C 1 e −αd . We employ theorem 8.3.9 of [42] which states that 9 Indeed we find Proof. Remember that the fermionic symplectic eigenvalues |µ for every unitarily invariant norm and where S (T ) diagonalize A(B) and cond(S) = S · S −1 is the condition number. Given that the matrix i σ can be diagonalized by a matrix of the form U −1/2 we find By the pinching inequality for Hermitean matrices [42] C( A) ≺ A we find cond( A ⊕ B ) and cond( ) cond(γ ground ). For the trace norm we then find Then AB 1 2L AB 2 and σ 1 = 4L yield for constants α and C 2 independent of L. Inserting this into lemma 1 finishes the proof.
As with matrix product states, application of the triangle inequality then yields equation (63).

covariance matrices $\gamma_1$ and $\gamma_2$ converge to each other in the limit $l \to \infty$. In the following we will choose, for our convenience, L sufficiently large to ensure that $l + \Delta(l) \le l^6(1+\delta)$.
Given a k-banded potential matrix V, let us choose a number $r = \Delta(l)/k$. Then $V^r$ is $\Delta(l)$-banded. Denote with F the composition of first applying an analytic matrix function to a covariance matrix and subsequently picking the sub-block describing the reduced state of a contiguous block of L harmonic oscillators. Analogously, denote with $p_r$ the composition of first applying the r-th matrix power followed by picking a sub-block as before.
Then we conclude that $p_r$ takes the same value for the two chains, due to the $k$-bandedness of $V$. Furthermore, by Bernstein's theorem (see footnote 10 for a short introduction) we then find

[…]

Because $\chi > 1$ (see footnote 10) this tends to zero with (L) → ∞. Choosing $F(A) = A^{1/2}$ and $F(A) = A^{-1/2}$ then allows us to conclude that the difference of the covariance matrices $\gamma_1$ and $\gamma_2$ is bounded by an exponentially decreasing function in (L).
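The two ingredients of this step, locality of matrix powers of a banded matrix and geometric convergence of polynomial approximations to an analytic function, can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the construction used in the proof: the tridiagonal test chain, the chain lengths, and the helpers `chain_potential`, `bandwidth` and `central_block` are hypothetical, and Chebyshev interpolation is used as a stand-in for the best polynomial approximation (it is not the optimal polynomial, but its error decays at a comparable geometric rate).

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

def chain_potential(n, c=0.3):
    """Hypothetical 1-banded potential matrix with spectrum in [1-2c, 1+2c]."""
    return np.eye(n) - c * (np.eye(n, k=1) + np.eye(n, k=-1))

def bandwidth(M, tol=1e-14):
    """Largest |i - j| for which the entry M[i, j] is non-negligible."""
    rows, cols = np.nonzero(np.abs(M) > tol)
    return int(np.abs(rows - cols).max())

def central_block(M, L):
    """Sub-block belonging to L contiguous sites in the middle of the chain."""
    start = (M.shape[0] - L) // 2
    return M[start:start + L, start:start + L]

# 1) The r-th power of a k-banded matrix is (r*k)-banded (here k = 1).
r, L = 8, 10
V_short, V_long = chain_potential(40), chain_potential(60)
band_of_power = bandwidth(np.linalg.matrix_power(V_short, r))

# 2) A block that is at least r*k sites away from both chain ends cannot
#    feel the ends, so the two chains give the same sub-block of V^r.
same_block = np.allclose(
    central_block(np.linalg.matrix_power(V_short, r), L),
    central_block(np.linalg.matrix_power(V_long, r), L),
)

# 3) Polynomial approximation of F(x) = x^{1/2} on the spectral interval:
#    the maximal error shrinks geometrically with the degree.
a, b = 0.4, 1.6
grid = np.linspace(a, b, 2001)
degrees = [2, 4, 8, 16]
errors = []
for deg in degrees:
    p = Chebyshev.interpolate(np.sqrt, deg, domain=[a, b])
    errors.append(float(np.max(np.abs(np.sqrt(grid) - p(grid)))))

print("bandwidth of V^r:", band_of_power)
print("sub-blocks agree:", same_block)
for deg, err in zip(degrees, errors):
    print(f"degree {deg:2d}: max error {err:.2e}")
```

Combining the two observations gives the structure of the argument: a degree-$r$ polynomial in $V$ has identical sub-blocks for both chains, and the remaining difference is controlled by the polynomial approximation error of $F$.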
To continue, we proceed in two steps. First, we show that the above property implies the weak convergence of the two reduced density matrices. Then, we use this to show that this is already enough to imply the trace norm convergence.

10 Bernstein's theorem concerns the approximation of functions by polynomials [43]. Let $P_r$ be the set of polynomials of degree $r$ or less with real coefficients. For a continuous function $F$ on the interval $[-1, 1]$ the best approximation error is defined as

$$E_r(F) = \inf_{p \in P_r} \|F - p\|_\infty, \quad \text{where} \quad \|F - p\|_\infty = \max_{-1 \le x \le 1} |F(x) - p(x)|.$$

Now assume that $F$ is analytic in an ellipse $E_\chi$ with foci $-1$ and $1$ and with half axes $\alpha > 1$ and $\beta > 0$, and set $\chi = \alpha + \beta$. Then we have
Theorem (Bernstein). Let the function $F$ be analytic in the interior of $E_\chi$ with $\chi > 1$ and continuous on $E_\chi$. In addition suppose that $F(z)$ is real for real $z$. Then

$$E_r(F) \le \frac{2 M(\chi)}{\chi^r (\chi - 1)}, \quad \text{where} \quad M(\chi) = \max_{z \in E_\chi} |F(z)|.$$

It is straightforward to adapt the theorem to other intervals and we will thus apply this theorem for all intervals.

Lemma 3. Given two Gaussian states $\rho^{(1)}_L$ and $\rho^{(2)}_L$ as above with vanishing displacement and covariance matrices $\gamma^{(1)}_L$ and $\gamma^{(2)}_L$ such that $\lim_{L\to\infty} \|\gamma^{(1)}_L - \gamma^{(2)}_L\| = 0$ […]

Proof. Given that the Hamiltonian of the harmonic chain is gapped, we find a bound on $\gamma^{(i)}_L$ in terms of some constant $c < 1$ independent of $L$ […]. Then choose $\|\gamma^{(1)}_L - \gamma^{(2)}_L\|$ sufficiently small […], and use $|1 - e^{-x}| \le 2|x|$ for $|x| \le 1$, $|1 - e^{-x}| \le e^{|x|}$ for all $x$, and $X_L$ as above. We find

[…]

where the last line follows from upper bounds on the error function. Note that the first term on the right-hand side is proportional to $\mathrm{tr}\,\rho_L^2$, which is bounded by a constant independent of $L$ because $-\log_2 \mathrm{tr}\,\rho_L^2 \le S(\rho_L)$ and the harmonic chain Hamiltonian obeys an entropy-area law [35]. Thus for sufficiently small $\varepsilon$ the right-hand side becomes arbitrarily small. This concludes the proof of lemma 3.

Now we need to prove that weak convergence implies trace-norm convergence for harmonic chains. The following proof will use in an essential way the fact that the ground states of bosonic Hamiltonians that are quadratic in the canonical operators obey an area law [35,39].
By the triangle inequality we have

$$\|\rho^{(1)}_L - \rho^{(2)}_L\|_1 \le \|\rho^{(1)}_L - P\rho^{(1)}_L P\|_1 + \|P\rho^{(1)}_L P - P\rho^{(2)}_L P\|_1 + \|P\rho^{(2)}_L P - \rho^{(2)}_L\|_1$$

for some $P$ that is yet to be determined. We now would like to establish the existence of a spectral projection $P$ of finite rank such that $\|\rho^{(i)}_L - P\rho^{(i)}_L P\|_1 < \varepsilon$. In other words, we aim to project onto the subspace made up of the eigenvectors corresponding to the $k_m$ largest eigenvalues of $\rho^{(i)}_L$. We argue that such a projection $P^{(i)}_L$ exists for each $\rho^{(i)}_L$. One may then project onto the span of the subspaces determined by $P^{(1)}_L$ and $P^{(2)}_L$, which defines $P_L$. What we need is that $k_m$ is bounded independently of $L$. To see this, it is important to note that the ground state of $H$ satisfies an area law, i.e. in the 1D setting there is a constant $C$ such that $S(\rho^{(i)}_L) \le C$ for all $L$. Let us denote by $\{\lambda^\downarrow_k\}_{k=0,\ldots,\infty}$ the decreasingly ordered eigenvalues of $\rho^{(1)}_L$. Note that for all $k \ge 1$ we have $\lambda^\downarrow_k \le 1/k$ by $\mathrm{tr}\,\rho^{(i)}_L = 1$. Thus we find

$$C \ge S(\rho^{(1)}_L) = -\sum_{k} \lambda^\downarrow_k \log \lambda^\downarrow_k \ge \sum_{k \ge k_m} \lambda^\downarrow_k \log k \ge \log k_m \sum_{k \ge k_m} \lambda^\downarrow_k.$$

Therefore, we find for the choice $k_m \ge e^{C/\varepsilon}$ that $\sum_{k=k_m}^{\infty} \lambda^\downarrow_k \le \varepsilon$ for any choice of $L$. Thus $P_L$ can be chosen to be a rank $k_m$ projector. Thus $P_L$ is bounded in the trace norm, but the subspace onto which it projects will generally depend on $L$. Note further that with the above $P_L$ the weak convergence $\lim_{L\to\infty} \mathrm{tr}[(\rho^{(1)}_L - \rho^{(2)}_L) X_L] = 0$ implies that for sufficiently large $L$ we have $\|P_L \rho^{(1)}_L P_L - P_L \rho^{(2)}_L P_L\|_1 < \varepsilon$. Thus, we find that for any $\varepsilon > 0$ and sufficiently large $L$ we have $\|\rho^{(1)}_L - \rho^{(2)}_L\|_1 \le 3\varepsilon$, thus establishing the required trace norm convergence.
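The entropy argument that bounds the rank $k_m$ can be sanity-checked on a concrete spectrum. The sketch below is illustrative only: it assumes natural logarithms, indexes the eigenvalues from $k = 1$, and uses a truncated geometric spectrum (of the kind a single thermal mode would have) as a hypothetical example. The function names are not taken from the paper. It checks that the weight outside the $k_m$ largest eigenvalues is at most $S/\log k_m$, so that $k_m \ge e^{C/\varepsilon}$ with $C \ge S$ leaves a tail of at most $\varepsilon$.

```python
import numpy as np

def entropy(p):
    """Shannon / von Neumann entropy of a spectrum, in nats."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def tail_weight(p_sorted, k_m):
    """Sum of eigenvalues lambda_k with k >= k_m (eigenvalues indexed from 1)."""
    return float(p_sorted[k_m - 1:].sum())

def rank_for_tail(C, eps):
    """Smallest integer k_m with k_m >= exp(C / eps)."""
    return int(np.ceil(np.exp(C / eps)))

# Hypothetical example spectrum: geometric weights, sorted decreasingly.
q, N = 0.5, 400
spectrum = q ** np.arange(N)
spectrum = spectrum / spectrum.sum()

S = entropy(spectrum)  # plays the role of the area-law constant C
for eps in (1.0, 0.5, 0.25):
    k_m = rank_for_tail(S, eps)
    print(f"eps={eps:4.2f}  k_m={k_m:4d}  tail={tail_weight(spectrum, k_m):.3e}")
```

The bound is far from tight for rapidly decaying spectra such as this one; what matters for the proof is only that $k_m$ depends on $C$ and $\varepsilon$ but not on $L$.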

Generalizations to other interactions
It is natural to ask whether the approach that we have adopted can enable progress to be made for unitary interactions other than controlled-phase gates (or their higher dimensional analogues). Some generalizations are immediate. For instance, given any channels that are probabilistic applications of unitaries, where the unitaries are controlled on different classical or quantum basis states of the environment, expression (16) can easily be shown to be an explicit lower bound to the regularized coherent information. Hence, if the correlations in the environment state decay sufficiently quickly, expression (16) will also be a lower bound to the channel capacity. In a similar manner it is likely that any channel whose capacity can be bounded by such simple entropic expressions will benefit from similar insights.

Discussion and conclusions
We have considered models of correlated error inspired by many-body physics, with the aim of demonstrating behaviour in the capacity that parallels similar behaviour in the associated many-body systems. In this context, a number of interesting questions remain that require further investigation. The first of these questions regards our initial motivation: to find models of correlated error that display interesting non-analytic behaviour. However, non-analytic behaviour in many-body systems arises only in the thermodynamic limit, and so our results unfortunately do not really explain why the non-analyticities that have been observed in papers such as [5]-[7] occur for finite truncations of the channel. Furthermore, the quest for 'genuine' non-analyticity is actually open to some debate: by redefining the parameters defining the channel, it is always possible to remove any non-analytic behaviour. However, we hope that our work may help to shed light on non-analytic behaviour for physically relevant parameter choices such as magnetic fields and inter-particle couplings 11. In realistic models of correlated error it is such forms of parametrization that will probably be most important.
It will also be interesting to see how far the approach adopted here can be extended to other possible system-environment interactions. The channels that we have investigated above are all of a very specific kind: as random unitary channels, they do not permit quantum information to be transmitted from one system particle to another via the environment. More general channels with memory will have this property, and so it will be interesting to understand what effects this qualitative difference can make.
Another open question is whether the conditions (63) and (64) can be established for wider families of many-body systems. In addition to the systems for which we have demonstrated these conditions, recent work by Hastings [31] demonstrates that they hold for the ground states of many fermionic systems too. His approach raises interesting questions concerning topological invariants which may have further significance for the problems considered in this paper.
Finally, it is important to note that the connections made in [10] and this work are actually quite natural: entropies and correlations have a significant role in statistical physics, and so quantum channel capacities with correlated error should have some connection to many-body physics. However, it would be nice to know if there is a deeper link, perhaps through a more direct connection between coding theory and the physics of systems such as spin chains.