Quantum reading capacity

The readout of a classical memory can be modelled as a problem of quantum channel discrimination, where a decoder retrieves information by distinguishing the different quantum channels encoded in each cell of the memory (Pirandola 2011 Phys. Rev. Lett. 106 090504). In the case of optical memories, such as CDs and DVDs, this discrimination involves lossy bosonic channels and can be remarkably boosted by the use of nonclassical light (quantum reading). Here we generalize these concepts by extending the model of memory from single-cell to multi-cell encoding. In general, information is stored in a block of cells by using a channel-codeword, i.e. a sequence of channels chosen according to a classical code. Correspondingly, the readout of data is realized by a process of ‘parallel’ channel discrimination, where the entire block of cells is probed simultaneously and decoded via an optimal collective measurement. In the limit of a large block we define the quantum reading capacity of the memory, quantifying the maximum number of readable bits per cell. This notion of capacity is nontrivial when we suitably constrain the physical resources of the decoder. For optical memories (encoding bosonic channels), such a constraint is energetic and corresponds to fixing the mean total number of photons per cell. In this case, we are able to prove a separation between the quantum reading capacity and the maximum information rate achievable by classical transmitters, i.e. arbitrary classical mixtures of coherent states. In fact, we can easily construct nonclassical transmitters that are able to outperform any classical transmitter, thus showing that the advantages of quantum reading persist in the optimal multi-cell scenario.


Introduction
One of the central problems in the field of quantum information is the statistical discrimination of quantum states [1][2][3]. This is a fundamental issue in many protocols, including those of quantum communication [4] and quantum cryptography [5,6]. A similar problem is the statistical discrimination of quantum channels, also called 'quantum channel discrimination' (QCD) [7][8][9][10][11]. In its basic formulation, QCD involves a discrete ensemble of quantum channels which are associated with some a priori probabilities. A channel is randomly extracted from the ensemble and given to a party who tries to identify it by using input states and output measurements. The optimal performance is quantified by a minimum error probability, which is generally nonzero in the presence of constraints (e.g. for a fixed number of queries or restricted space of the input states). In general, this is a double-optimization problem whose optimal choices are unknown, a feature that makes its exploration nontrivial. Moreover, QCD may also involve continuous ensembles. A special case is 'quantum channel estimation' where the ensemble is indexed by a continuous parameter with flat distribution and the goal is to estimate this parameter with minimal uncertainty (see e.g. [12] and references therein).
Besides its difficult theoretical resolution, QCD is also interesting for its potential practical implementations. For instance, it is at the basis of the decoding procedure of two-way quantum cryptography and of the readout of classical digital memories, i.e. quantum reading [20]. Taking the limit of a large block of cells leads to the notion of the reading capacity of the classical memory, which corresponds to the maximum readable information per cell. If we do not impose constraints, this capacity exactly equals the amount of information stored in each cell of the memory. However, this is no longer the case when we introduce physical constraints on the resources accessible to the reading device, as typically happens in realistic implementations.
In the case of optical memories, which involve the discrimination of bosonic channels, the energy constraint is the most fundamental [4]. Thus, the quantum reading capacity is properly formulated for fixed input energy. This means that we fix the mean total number of photons irradiated over each cell of the memory. The computation of this capacity would be very important in the low-energy regime, which is the most interesting for its potential implications. Despite its calculation being extremely difficult, we are able to provide lower bounds for the most basic optical memories, i.e. the ones based on the binary encoding of lossy channels. For these memories we are able to derive a simple lower bound which quantifies the maximum information readable by classical transmitters. We call this bound the 'classical reading capacity' of the memory and it represents an extension to the multi-cell scenario of the 'classical discrimination bound' introduced in [20]. Remarkably, the optimal classical transmitter which irradiates n mean photons per cell can be realized by using a single coherent state with the same mean number of photons. Thanks to this result, we can easily investigate whether a particular nonclassical transmitter is able to outperform any classical transmitter. This is indeed what we find in the regime of a few photons. Thus, in the low-energy regime, we can prove a separation between the quantum reading capacity and the classical reading capacity, which is equivalent to the statement that the advantages of quantum reading persist into the optimal multi-cell scenario.
This paper is organized as follows. In sections 2 and 3, we review some of the key points of [20] and its supplementary materials, which are preliminary for the new results of sections 4-7. In particular, in section 2, we review the basic notions regarding the memory model with single-cell encoding. Then, in section 3, we discuss the simplest example of optical memory and its quantum reading. Once we have reviewed these notions, we introduce the model with multi-cell encoding in section 4. In section 5, we take the limit of large block size and define the quantum reading capacity of the memory, both unconstrained and constrained. In particular, we specialize the constrained capacity to the case of optical memories (bosonic channels). In section 6, we compute the lower bound relative to classical transmitters, i.e. the classical reading capacity. In section 7, we prove that this bound is separated, by showing simple examples of nonclassical transmitters which outperform classical ones in the regime of a few photons. Finally, section 8 presents our conclusions.

Basic model of memory: single-cell encoding
In an abstract sense, a classical digital memory can be modelled as a one-dimensional (1D) array of cells (the generalization to two or more dimensions is straightforward). The writing of information by some device or encoder, which we just call 'Alice' for simplicity, can be modelled as a process of channel encoding [20]. This means that Alice has a classical random variable X = {x, p_x} with k values x = 0, . . . , k − 1 distributed according to a probability distribution p_x. Each value x is then associated with a quantum channel φ_x via the one-to-one correspondence x → φ_x, thus defining an ensemble of quantum channels Φ = {φ_x, p_x}.

Figure 1. Basic process of storage and readout. A memory cell can be characterized by an ensemble of quantum channels Φ = {φ_x, p_x}. Alice picks a quantum channel φ_x (with probability p_x) and stores it in a target cell. In order to read the information, Bob exploits a transmitter and a receiver. In the simplest scenario, this corresponds to inputting a suitable quantum state ρ and measuring the output ρ_x = φ_x(ρ) by a suitable detector. The detector gives the correct answer x up to some error probability P_err. Multi-copy probing: since the cell encodes the quantum channel in a stable way, we can probe the cell many times. More generally, this means that Bob can input a multipartite state ρ(s) ∈ D(H^{⊗s}) which describes s quantum systems. As a consequence, the output will be ρ_x(s) = φ_x^{⊗s}[ρ(s)], whose global detection gives x up to some error probability. Optical memory: the encoded channel φ_x is a bosonic channel (in particular, single-mode). In this case, Bob uses an input state ρ(s, n) describing s bosonic modes that irradiate n mean photons over the cell.

Mathematically speaking, each channel of the ensemble is a completely positive trace-preserving (CPT) map acting on the state space D(H) of some chosen quantum system (Hilbert space H). Furthermore, the various channels are assumed to be different from each other. This means that, for any pair φ_x and φ_x′ with x ≠ x′, there is at least one state ρ ∈ D(H) such that F[φ_x(ρ), φ_x′(ρ)] < 1, where F is the quantum fidelity. Thus, in order to write information, Alice randomly picks a quantum channel φ_x from the ensemble and stores it in a target cell. This operation is repeated independently and identically for all the cells of the memory, so that we can characterize both the cell and the memory by specifying the ensemble Φ (see figure 1). The readout of information corresponds to the inverse process, which is channel decoding or discrimination. The written memory is passed to a decoder, which we call 'Bob', who queries the cells of the memory one by one. To retrieve information from a target cell, Bob exploits a transmitter and a receiver. In the simplest case, this means that Bob inputs a suitable quantum state ρ and measures the corresponding output state ρ_x = φ_x(ρ) modified by the specific quantum channel stored in that cell (see figure 1). Note that, given some input state ρ, the ensemble of possible output states {φ_x(ρ), p_x} is generally made of non-orthogonal states, which therefore cannot be perfectly distinguished by a quantum measurement. In other words, the discrimination cannot be perfect and the quantum detection will output the correct value x up to some error probability P_err. It is clear that the main goal for Bob is to optimize the input state and output measurement in order to retrieve the maximal information from the cell.

Multi-copy probing and optical memories
In a classical digital memory, information is stored quasi-permanently. This means that the association between a single cell and the channel encoding must be stable. As a result, Bob can probe the cell many times by using an input state living in a larger state space. Given some quantum channel φ_x stored in the cell, Bob can input a multipartite state ρ(s) ∈ D(H^{⊗s}) with integer s ≥ 1, i.e. describing s quantum systems. As a result, the output state will be ρ_x(s) = φ_x^{⊗s}[ρ(s)]. This state is detected by a quantum measurement applied to the whole set of s quantum systems (see figure 1). Physically, if we consider the process in the time domain, ρ(s) describes the global state of s systems which are sequentially transmitted through the cell. In other words, the number s can also be regarded as a dimensionless readout time [20]. Intuitively, the optimal P_err is expected to be a decaying function of s, so that it is always possible to retrieve all the information in the limit s → ∞. This suggests that the readout problem is nontrivial only if we impose constraints on the physical resources used to probe the memory. In the case of discrete variables (finite-dimensional Hilbert space) the constraint can be stated in terms of a fixed or maximum readout time s. More fundamental constraints come into play when we consider an optical memory, which can be defined as a classical memory encoding an ensemble of bosonic channels. In particular, these channels can be assumed to be single-mode. Since the underlying Hilbert space is infinite-dimensional in the bosonic setting, one has unbounded operators such as the energy. Clearly, if we allow the energy to go to infinity, the perfect discrimination of (different) bosonic channels is always possible. As a result, the readout of optical memories has to be modelled as a channel discrimination problem where we fix the input energy. The simplest nontrivial energy constraint corresponds to fixing the mean total number of photons n irradiated over each memory cell [20].
Thus, for fixed n, Bob's aim is to optimize the input (i.e. the number of bosonic systems s and their state ρ) and the output measurement. In the following we explicitly formalize this constrained problem.
Let us consider an optical memory with the cell Φ = {φ_x, p_x}, where each element φ_x is a single-mode bosonic channel. Then, we denote by ρ(s, n) a multimode bosonic state ρ ∈ D(H^{⊗s}) with mean total energy Tr(ρ n̂) = n, where n̂ is the total number operator over H^{⊗s}. In other words, this state describes s bosonic systems which irradiate a total of n mean photons over the target cell (also see figure 1). We refer to the pair (s, n) as the signal profile. In the bosonic setting the parameter s can be interpreted not only as the number of temporal modes (therefore readout time) but equivalently as the number of frequency modes, thus quantifying the 'bandwidth' of the signal [20]. Now, for a given input ρ = ρ(s, n) to the cell Φ, we have the output state ρ_x(s, n) = φ_x^{⊗s}[ρ(s, n)]. This output is subject to a quantum measurement over the s modes, which is generally described by a positive operator-valued measure (POVM) M = {Π_x} having k detection operators Π_x ≥ 0 which sum up to the identity, Σ_x Π_x = I. This measurement gives the correct answer x up to an error probability P[Φ|ρ(s, n), M] = 1 − Σ_x p_x Tr[Π_x ρ_x(s, n)], i.e. the error probability in the readout of the cell Φ given an input state ρ(s, n) and an output measurement M. We are now interested in minimizing this quantity over both input and output. As a first step, we fix the signal profile (s, n) and consider the minimization over input states and output measurements. This leads to the quantity P(Φ|s, n) := min_{ρ(s,n),M} P[Φ|ρ(s, n), M], which is the minimum error probability achievable for a fixed signal profile (s, n). Note that there are some cases where the optimal output POVM is known. For instance, if the output states ρ_x(s, n) are pure and form a geometrically uniform set [26, 27], then the optimal detection is the square-root measurement [1]. As a final step, we keep the energy n fixed and minimize over s, thus defining the minimum error probability at a fixed energy per cell, i.e. P(Φ|n) := inf_s P(Φ|s, n).
Thus, given a memory with cell Φ, the determination of P(Φ|n) provides the 'optimal' readout of the cell at fixed energy n. It is worth stressing that the minimization over the number of signals s is not trivial, due to the constraint that we impose on the mean total energy (if, instead of such a restriction, one imposes a bound on the mean energy per signal, then the infimum is always achieved in the asymptotic limit s → ∞). Also note that we have enclosed the word 'optimal' in quotes, since the optimality of P(Φ|n) is still partial, i.e. it does not include all possible readout strategies. In fact, as we discuss in the following subsection, Bob can also consider the help of ancillary systems while keeping fixed the mean total number of photons n irradiated over the single cell.
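The measurement side of this optimization can be made concrete. The sketch below is our own generic numerical example, not taken from this paper: it implements the square-root measurement mentioned above for a set of equiprobable pure states and evaluates the resulting error probability. For two equiprobable pure states with overlap γ, it attains the Helstrom minimum (1 − √(1 − γ²))/2.

```python
import numpy as np

def srm_error(states, probs):
    """Error probability of the square-root measurement (SRM).

    states: list of normalized complex vectors |psi_x>
    probs:  a priori probabilities p_x
    SRM operators: Pi_x = rho_bar^{-1/2} p_x |psi_x><psi_x| rho_bar^{-1/2}.
    """
    d = len(states[0])
    rho_bar = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))
    w, V = np.linalg.eigh(rho_bar)
    # inverse square root of rho_bar on its support
    inv_sqrt = sum(np.outer(V[:, i], V[:, i].conj()) / np.sqrt(w[i])
                   for i in range(d) if w[i] > 1e-12)
    # success probability: sum_x p_x Tr[Pi_x |psi_x><psi_x|]
    p_succ = sum(p**2 * abs(s.conj() @ inv_sqrt @ s)**2
                 for p, s in zip(probs, states))
    return 1 - p_succ

# two equiprobable pure states with overlap gamma = cos(2*theta)
theta = 0.3
psi0 = np.array([np.cos(theta),  np.sin(theta)], dtype=complex)
psi1 = np.array([np.cos(theta), -np.sin(theta)], dtype=complex)
p_srm = srm_error([psi0, psi1], [0.5, 0.5])
gamma = abs(psi0.conj() @ psi1)
p_helstrom = (1 - np.sqrt(1 - gamma**2)) / 2  # Helstrom minimum for the pair
```

For this symmetric binary pair the SRM coincides with the optimal (Helstrom) measurement, which is the fact exploited later for coherent-state outputs.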

Assisted readout of optical memories
The optimality of the quantity P(Φ|n) defined above holds only in the 'unassisted case', where all the input modes are sent through the target cell. More generally, Bob can exploit an interferometric-like setup by introducing an ancillary 'reference' system which bypasses the cell and assists the output measurement, as depicted in figure 2. In the 'assisted case' we consider an input state ρ ∈ D(H_S^{⊗s} ⊗ H_R^{⊗r}) which describes s signal modes (Hilbert space H_S^{⊗s}) plus a reference bosonic system with r modes (Hilbert space H_R^{⊗r}). As before, the minimal energy constraint corresponds to fixing the mean total number of photons irradiated over the target cell, i.e. n = Tr(ρ n̂_S), where n̂_S is the total number operator acting over H_S^{⊗s}. We denote by ρ = ρ(s, r, n) such a state, where we make explicit the number of signal modes s, the number of reference modes r and the mean total number of photons n irradiated over the cell. Following the language of [20], we also refer to ρ(s, r, n) as a transmitter with s signals, r references and signalling n photons. Now, given a transmitter ρ(s, r, n) at the input of a target cell Φ = {φ_x, p_x}, we have the output state ρ_x(s, r, n) = (φ_x^{⊗s} ⊗ I^{⊗r})[ρ(s, r, n)], where the channel φ_x acts on each signal mode, while the identity I acts on each reference mode. This state is then measured by a POVM M = {Π_x}, where Π_x acts on the whole state space D(H_S^{⊗s} ⊗ H_R^{⊗r}). The error probability P[Φ|ρ(s, r, n), M] has the same form as in the unassisted case, where now both state and measurement are dilated by the reference system. Thus, given a memory with cell Φ, the minimum error probability at a fixed signal energy n is given by P(Φ|n) := inf_{s,r} min_{ρ(s,r,n),M} P[Φ|ρ(s, r, n), M], where the minimization includes the reference system too. In general, we always consider the assisted scheme and the corresponding error probability. This clearly represents a superior strategy, allowing for the possibility of entanglement between signal and reference systems.
Clearly, the unassisted strategy is recovered by setting r = 0 and ρ(s, 0, n) = ρ(s, n).

The simplest case: optical memory with binary cells
In general, the solution of this assisted minimization is extremely difficult. In order to investigate the problem, the simplest possible scenario corresponds to an optical memory whose cell encodes two bosonic channels (binary cell) [20]. The situation is particularly advantageous when the channels are pure-loss channels and they are chosen with the same probability. This yields the binary channel ensemble Φ̄ = {φ_0, φ_1},

Figure 3. Optical memory with the binary cell Φ̄ = {κ_0, κ_1}, which is read in reflection. A bit of information is stored in the reflectivity κ_u of the cell medium (u = 0, 1). The encoded bit is read by using a transmitter (with s signals and r references), which irradiates n mean photons over the cell. The output is detected by a dichotomic measurement which provides the value of the bit up to some error probability.
where p_0 = p_1 = 1/2, and φ_u represents a pure-loss channel with transmissivity 0 ≤ κ_u ≤ 1. In the Heisenberg picture, the action of φ_u on each signal mode is given by the map â_S → √κ_u â_S + √(1 − κ_u) â_E, where â_S is the annihilation operator of the signal mode and â_E is that of an environmental vacuum mode. For simplicity we can also denote this ensemble by Φ̄ = {κ_0, κ_1}. When the optical memory is read in reflection (which is usually the case), the two parameters κ_0 and κ_1 represent the two possible reflectivities of the cell (so that unit reflectivity corresponds to perfect 'transmission' of the signal from the transmitter to the receiver). See figure 3 for a schematic representation. Given a transmitter ρ(s, r, n) at the input of the binary cell Φ̄, we have two equiprobable outputs, ρ_0(s, r, n) and ρ_1(s, r, n). In this case the optimal measurement is dichotomic, M = {Π_0, I − Π_0}, where Π_0 is the projection onto the positive part of the Helstrom matrix ρ_0(s, r, n) − ρ_1(s, r, n) [1]. As a result, the error probability for reading the binary cell Φ̄ using the transmitter ρ(s, r, n) is given by P[Φ̄|ρ(s, r, n)] = {1 − D[ρ_0(s, r, n), ρ_1(s, r, n)]}/2, where D is the trace distance [1]. This expression has to be optimized over the input only, so that we may write P(Φ̄|n) := inf_{s,r} min_{ρ(s,r,n)} P[Φ̄|ρ(s, r, n)], which is the minimum error probability at fixed signal energy. This quantity allows one to compute the maximum information per cell at fixed signal energy, which is given by I(Φ̄|n) = 1 − H[P(Φ̄|n)], where H(p) := −p log₂ p − (1 − p) log₂(1 − p) is the binary formula for the Shannon entropy. For a binary cell we clearly have 0 ≤ I(Φ̄|n) ≤ 1. Even in this simple binary case, the exact optimization is very difficult. However, we can provide remarkable lower bounds if we restrict the minimization to some suitable class of transmitters. An important class is that of classical transmitters, since they encompass all the optical resources used for the readout of optical memories in today's storage technology. Furthermore, this class can be easily characterized. Given a transmitter ρ(s, r, n), we can write its Glauber-Sudarshan representation [28, 29]

ρ = ∫ d^{2s}α d^{2r}β P(α, β) |α⟩⟨α| ⊗ |β⟩⟨β|,

where α = (α_1, . . . , α_s)^T and β = (β_1, . . . , β_r)^T are vectors of complex amplitudes, |α⟩ and |β⟩ are multimode coherent states and the P-function P(α, β) is a quasi-distribution, i.e. normalized to one but generally nonpositive [28, 29]. In terms of the P-function, the signal energy constraint reads ∫ d^{2s}α d^{2r}β P(α, β) ‖α‖² = n. Now we say that ρ(s, r, n) is classical (nonclassical) if the P-function is positive, P ≥ 0 (nonpositive, P ≱ 0). Thus, if the transmitter is classical, denoted by ρ_c(s, r, n), then it can be represented as a probabilistic mixture of coherent states. The simplest examples of classical transmitters are the coherent-state transmitters, which we denote by ρ_coh(s, r, n). These are defined by singular (delta-like) P-functions, so that they have the simple form ρ_coh(s, r, n) = |α⟩⟨α| ⊗ |β⟩⟨β| with ‖α‖² = n. Examples of nonclassical transmitters are constructed using squeezed states, entangled states and number states [25]. As shown in [20], by restricting the optimization to classical transmitters, we can compute the bound

P_c(Φ̄|n) := inf_{s,r} min_{ρ_c(s,r,n)} P[Φ̄|ρ_c(s, r, n)] = {1 − √(1 − e^{−n(√κ_0 − √κ_1)²})}/2.

This is the optimal performance of readout by means of classical transmitters. It is important to note that this bound can be reached by a coherent-state transmitter ρ_coh(1, 0, n) = |√n⟩_S⟨√n|, i.e. a single-mode coherent state with mean number of photons equal to n. This achievability is very easy to show. In fact, given the input state |√n⟩_S, we have two possible coherent states |√(κ_0 n)⟩_S and |√(κ_1 n)⟩_S at the output of the cell. Since these are pure states, it is known that they can be discriminated up to an error probability [1] P = (1 − √(1 − γ²))/2, where γ = |⟨√(κ_0 n)|√(κ_1 n)⟩| is their overlap. In the case at hand, we have γ = exp[−n(√κ_0 − √κ_1)²/2], which proves that the bound P_c(Φ̄|n) can be reached by a single-mode coherent state. The error probability P_c(Φ̄|n) or, equivalently, the mutual information I_c(Φ̄|n) := 1 − H[P_c(Φ̄|n)], is called the 'classical discrimination bound'. Alternative (and better) bounds can be derived by resorting to nonclassical transmitters.
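The classical discrimination bound is straightforward to evaluate numerically. The following sketch (with illustrative values of n, κ_0 and κ_1 chosen by us) computes the coherent-state overlap, the resulting Helstrom error probability and the corresponding readable information per cell:

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def classical_bound(n, k0, k1):
    """Classical discrimination bound for a binary lossy cell {k0, k1}.

    Achieved by the single-mode coherent state |sqrt(n)>: the outputs
    |sqrt(k0 n)> and |sqrt(k1 n)> have overlap gamma, and the Helstrom
    error for equiprobable pure states is (1 - sqrt(1 - gamma^2)) / 2.
    """
    gamma = np.exp(-n * (np.sqrt(k0) - np.sqrt(k1))**2 / 2)
    return (1 - np.sqrt(1 - gamma**2)) / 2

n, k0, k1 = 5.0, 0.6, 0.95           # illustrative energy and reflectivities
p_c = classical_bound(n, k0, k1)
info_c = 1 - binary_entropy(p_c)     # bits per cell, classical transmitters
```

At n = 0 the bound gives the random-guess value 1/2, and it decays monotonically as the photon budget grows.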
As a prototype of a nonclassical transmitter we consider an Einstein-Podolsky-Rosen (EPR) transmitter [14], which is composed of s pairs of signals and references, entangled via two-mode squeezing. This transmitter has the form ρ_epr(s, s, n) = (|ξ⟩⟨ξ|)^{⊗s}, where |ξ⟩⟨ξ| is a two-mode squeezed vacuum (TMSV) state, entangling one signal mode S with one reference mode R. In the number-ket representation, we have [25] |ξ⟩ = (cosh ξ)^{−1} Σ_k (tanh ξ)^k |k⟩_S|k⟩_R, where the squeezing parameter ξ quantifies the signal-reference entanglement and gives the energy of a single-mode signal as sinh² ξ mean photons. Since this transmitter involves s copies of this state, we have to impose the constraint s sinh² ξ = n in order to have an average of n total photons irradiated over the cell. Given an EPR transmitter ρ_epr(s, s, n) at the input of the binary cell Φ̄, we have an error probability P[Φ̄|ρ_epr(s, s, n)]. By optimizing over the number of copies s, we define the bound I_epr(Φ̄|n) := 1 − H[P_epr(Φ̄|n)], where P_epr(Φ̄|n) := inf_s P[Φ̄|ρ_epr(s, s, n)]. This bound represents the maximum information which can be read from the binary cell Φ̄ by using an EPR transmitter which irradiates n mean photons over the cell. The error probability P_epr(Φ̄|n) can be estimated using the quantum Bhattacharyya bound and its Gaussian formula [30]. After some algebra, one obtains an explicit upper bound B on this error probability, whose expression is given in [20].
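The energy constraint on the EPR transmitter can be checked directly from the number-ket expansion: the Schmidt probabilities of the TMSV are geometric, p_k = tanh^{2k} ξ / cosh² ξ, with mean photon number sinh² ξ per signal mode, so the constraint fixes ξ = arcsinh √(n/s). A small consistency check (our own sketch):

```python
import numpy as np

def tmsv_signal_photons(n, s, cutoff=200):
    """Mean photons per signal mode of a TMSV obeying s sinh^2(xi) = n.

    Schmidt probabilities: p_k = tanh(xi)^{2k} / cosh(xi)^2.
    Returns (normalization, mean photon number per mode).
    """
    xi = np.arcsinh(np.sqrt(n / s))      # squeezing from the constraint
    k = np.arange(cutoff)
    p_k = np.tanh(xi)**(2 * k) / np.cosh(xi)**2
    return p_k.sum(), (k * p_k).sum()

norm, mean_per_mode = tmsv_signal_photons(n=1.0, s=2)
# the s copies irradiate s * mean_per_mode = n mean photons over the cell
```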

Quantum versus classical reading
Because of the potential implications for information technology, it is important to compare the performances of classical and nonclassical transmitters. The basic question to ask is the following [20]: for fixed signal energy n irradiated over a binary cell Φ̄, can we find some EPR transmitter able to outperform any classical transmitter? In other words, this is equivalent to showing that P_epr(Φ̄|n) < P_c(Φ̄|n), and a sufficient condition corresponds to proving that B < P_c(Φ̄|n). Thus, by comparing the classical discrimination bound with the Bhattacharyya bound B, one finds a threshold energy n_th such that, for signal energies n > n_th, it is always possible to beat classical transmitters by using an EPR transmitter [20]. For high reflectivity κ_1 ≃ 1 and κ_0 < κ_1, the threshold energy n_th can be very low. In the case of 'ideal memories', defined by κ_0 < κ_1 = 1, the Bhattacharyya bound can be improved, and the threshold energy becomes n_th = 1/2 [20]. Thus, for optical memories with high reflectivities and signal energies n > 1/2, there always exists a nonclassical transmitter able to beat any classical transmitter. In the few-photon regime, roughly given by 1/2 < n < 10², the advantages of quantum reading can be numerically remarkable, up to one bit per cell. The implications have been thoroughly discussed in [20] and its supplementary materials. It is important to say that these advantages are also preserved if thermal noise is added to the basic model. This noise can describe the effect of stray photons hitting the memory from the background, as well as other decoherence processes occurring in the reading device. Formally, this corresponds to extending the problem from the discrimination of pure-loss channels to the discrimination of more general Gaussian channels. A supplementary analysis of quantum reading has also shown that its advantages persist if we consider more advanced designs of memories where information is written on and read from a block of cells (multi-cell/block encoding).
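To see the few-photon limitation of classical transmitters quantitatively, one can specialize the classical discrimination bound to an ideal memory (κ_1 = 1), where the coherent-state overlap reduces to γ = exp[−n(1 − √κ_0)²/2]. The sketch below (κ_0 chosen by us for illustration) shows that the classically readable information approaches one bit per cell only as n grows:

```python
import numpy as np

def h2(p):
    """Binary Shannon entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*np.log2(p) - (1-p)*np.log2(1-p)

def classical_info_ideal(n, k0):
    """Bits per cell readable by classical transmitters, ideal memory (k1 = 1)."""
    gamma = np.exp(-n * (1 - np.sqrt(k0))**2 / 2)
    p_c = (1 - np.sqrt(1 - gamma**2)) / 2
    return 1 - h2(p_c)

# classical information grows with the photon budget but reaches one bit
# only asymptotically; the residual gap at few photons is the window
# in which nonclassical transmitters can provide an advantage
rates = {n: classical_info_ideal(n, k0=0.5) for n in (1, 5, 50)}
```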
Block encoding allows Alice to introduce error-correcting codes which make Bob's readout flawless at the cost of some redundancy overhead. By resorting to the Hamming bound and the Gilbert-Varshamov bound, one can show that EPR transmitters enable the low-energy flawless readout of classical memories up to a negligible error correction overhead, contrary to what happens by employing classical transmitters (see the supplementary materials of [20]). In the following section, we extend the block encoding to the most general scenario, i.e. to arbitrary classical memories. Then, by increasing the size of the block (section 5), we will introduce the notion of quantum reading capacity of a classical memory.

General model of memory: multi-cell encoding
The writing of a memory is based on channel encoding, which generally may involve a block of m cells. A first simple kind of block encoding is just based on independent and identical extractions. As usual, Alice encodes a k-ary variable X = {x, p_x} into an ensemble of quantum channels Φ = {φ_x, p_x}. Then, she performs m independent extractions from X, generating an m-letter sequence x = (x_1, . . . , x_m) with probability p_x = p_{x_1} · · · p_{x_m}. This classical sequence identifies a corresponding 'channel sequence' φ_x = φ_{x_1} ⊗ · · · ⊗ φ_{x_m}, which is stored in the block of m cells.
In a more general approach, Alice adopts a classical code. This means that Alice disposes of a set of m-letter codewords {x_0, . . . , x_{l−1}}, where each codeword x_i is chosen with some probability p_{x_i} and identifies a corresponding 'channel codeword' φ_{x_i}, i.e. the tensor product of the channels associated with the letters of x_i. Thus, in general, Alice encodes information in a block of m cells by storing a channel codeword, which is randomly chosen from the ensemble Φ_m = {φ_{x_i}, p_{x_i}}. The most general strategy of readout can be described as a problem of 'parallel discrimination of quantum channels', where Bob probes the entire block in a parallel fashion and detects the output via a collective quantum measurement. In order to query the block, Bob uses s signal systems per cell, besides r other supplemental reference systems for the benefit of the output measurement. The whole set of ms + r systems is described by an arbitrary multipartite state ρ (see figure 4). At the output of the block, Bob has the state ρ_{x_i} = (φ_{x_i} ⊗ I)(ρ), where the identity acts on the reference systems, while φ_{x_i} acts on the signal systems. This state is detected by a collective quantum measurement, i.e. a general POVM with l detection operators, with outcome i corresponding to codeword x_i. The correct result is provided up to an error probability P_err. Clearly, the main goal for Bob is to optimize both input state and output measurement in order to minimize P_err, thus retrieving the maximal information from the block. One can easily check that, without constraints, Bob is always able to retrieve all the information H_max. An important advantage of the block encoding model is that the readout of data can be made flawless even if we consider constraints for the decoder, e.g. if we fix the properties of the transmitter. This is possible by increasing the error correction overhead in the memory. In fact, Alice can always use a suitably large block (with a suitable number m of cells) and an optimal code (encoding H_max bits) such that the error probability P_err becomes negligible, i.e. reasonably close to zero.
In this case, Bob is able to retrieve all the information from the block, corresponding to an average of R = m^{−1} H_max bits per cell. In the following section, we study the asymptotic value of the rate R in the limit of large block size, corresponding to m → ∞. Clearly, the readout process is too difficult to treat if we consider arbitrary multipartite transmitters, i.e. transmitters generally entangled among different cells. Thus, in order to tackle the problem, we restrict the readout to transmitters which are in a tensor-product form. In this case, we can adopt the Holevo bound to quantify the readable information, i.e. the asymptotic rate R. The maximization of this rate over the transmitters enables us to define the quantum reading capacity of the memory.

Figure 4. To read the data, Bob uses a suitable transmitter and receiver, solving a problem of parallel channel discrimination. The transmitter is an arbitrary multipartite state ρ which probes the entire block by inputting s systems per cell, plus sending an additional r systems directly to the receiver. The output state ρ_{x_i} is detected by an optimal collective measurement which provides the correct answer x_i up to some error probability P_err. In the case of unconstrained readout, P_err goes to zero and Bob retrieves all the information H_max from the block. If the readout is constrained, then P_err is generally nonzero. However, Bob can still retrieve all the information if Alice increases the size of the block while keeping H_max constant.

Limit of a large block size
Digital memories typically store a large amount of data. This means that an average memory can be made of many encoding blocks or, alternatively, of a small number of them but large in size. The most general scenario corresponds to describing the memory as a single large block of cells, where Alice stores data by encoding a very long channel codeword φ_{x_i} chosen with some probability p_{x_i}. Considering the whole memory as a large encoding block allows us to re-introduce the single-cell description. In fact, in the limit of m → ∞, each cell can be described (on average) by a marginal ensemble of quantum channels Φ = {φ_x, p_x} encoding a corresponding marginal variable X = {x, p_x}. Thus, independently of the actual classical code used to store information, the description of a large classical memory can always be reduced to its 'marginal cell', corresponding to a marginal ensemble of channels Φ.
Clearly, this is far from a complete characterization of the memory, since the mapping 'memory → marginal cell' is highly non-injective. For instance, any marginal ensemble Φ = {φ_x, p_x} can be realized by two completely different channel codewords: one is the sequence φ_{x_1} ⊗ φ_{x_2} ⊗ · · ·, where each φ_{x_j} is independently and identically extracted from Φ; the other is the repetition codeword φ_x ⊗ φ_x ⊗ · · ·, which is given by infinite repetitions of the same channel φ_x extracted once from Φ. In general, for a fixed marginal cell Φ, there are infinitely many block encodings giving that marginal. However, no matter what encoding is used, it is the marginal cell which determines the maximum amount of information that is readable from the memory. As we discuss below, this maximal amount is achievable by using an optimal block encoding whose codewords are based on the typical sequences.

Figure 5. Limit of large block size. A memory can be described as a large (approximately infinite) encoding block, where each cell encodes a marginal ensemble Φ = {φ_x, p_x}. In order to read the memory, Bob uses an ∞-copy transmitter ρ(s, r)^{⊗∞} = ρ(s, r) ⊗ ρ(s, r) ⊗ · · ·, where each copy ρ(s, r) probes a different cell using s signals, generally coupled with r other references. All the output systems from the block are collectively detected by an optimal quantum measurement which reconstructs the asymptotic channel codeword.
Let us restrict the readout of the memory to transmitters which are in tensor-product form with respect to different cells. This means that Bob inputs an ∞-copy state ρ(s, r)^{⊗∞}, where the single copy ρ(s, r) ∈ D(H_S^{⊗s} ⊗ H_R^{⊗r}) describes s signal systems sent through a target cell plus an additional r reference systems. Given the ∞-copy transmitter ρ(s, r)^{⊗∞} at the input of a memory with marginal cell Φ = {φ_x, p_x}, the output is still in a tensor-product form (see figure 5 for a schematic). The average output of each cell is described by a marginal ensemble of states E = {ρ_x(s, r), p_x}, where ρ_x(s, r) = (φ_x^{⊗s} ⊗ I^{⊗r})[ρ(s, r)]. We can also use the notation E = Φ|ρ(s, r), indicating that the output marginal ensemble is generated by applying the marginal cell Φ to the input transmitter ρ(s, r). By carrying out an optimal collective measurement on all the output systems, the maximum information per cell that can be retrieved is given by the Holevo bound

χ(E) = S(Σ_x p_x ρ_x(s, r)) − Σ_x p_x S(ρ_x(s, r)),

where S(ρ) := −Tr(ρ log₂ ρ) is the von Neumann entropy. This quantity can also be written as the conditional Holevo information χ[Φ|ρ(s, r)], underlining the fact that it depends on the marginal cell Φ once the input transmitter is fixed.
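Given the output ensemble as density matrices, the Holevo quantity is easy to evaluate. The following generic sketch (our own, not tied to a specific bosonic model) computes χ from eigenvalues and checks it against the known value for two equiprobable pure states with overlap γ, namely the binary entropy of (1 + γ)/2:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]                      # convention: 0 log 0 = 0
    return float(-(w * np.log2(w)).sum())

def holevo(probs, rhos):
    """chi = S(sum_x p_x rho_x) - sum_x p_x S(rho_x)."""
    rho_bar = sum(p * r for p, r in zip(probs, rhos))
    return von_neumann_entropy(rho_bar) - sum(
        p * von_neumann_entropy(r) for p, r in zip(probs, rhos))

# check: two equiprobable pure states with real overlap gamma
gamma = 0.4
psi0 = np.array([1.0, 0.0])
psi1 = np.array([gamma, np.sqrt(1 - gamma**2)])
chi = holevo([0.5, 0.5], [np.outer(psi0, psi0), np.outer(psi1, psi1)])
# expected: binary entropy of (1 + gamma)/2, since the average state
# of two equiprobable pure states has eigenvalues (1 +/- gamma)/2
```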
It is important to note that the achievability of χ is assured by the Holevo–Schumacher–Westmoreland (HSW) theorem [32, 33]. According to the HSW theorem, it is possible to construct an asymptotic collective measurement which perfectly discriminates among different sequences of output states ρ_{x_1}(s, r) ⊗ ρ_{x_2}(s, r) ⊗ · · ·, as long as these sequences form a code of 2^{mχ} typical codewords ρ_{x^i} (with m very large). This optimal code has marginal E. Because of equation (43), we can always identify a corresponding set of channel codewords φ_{x^i} generating the typical codewords ρ_{x^i} and having marginal ℰ. As a result, for a given marginal, there always exists an optimal block encoding {φ_{x^i}, p_{x^i}} which enables the receiver to give the correct answer x^i with asymptotically zero error while decoding an average of χ bits per cell.
Thus, given a memory with marginal cell ℰ which is read by a transmitter ρ(s, r), the conditional quantity χ[ℰ|ρ(s, r)] provides the maximum information per cell which can be reliably retrieved from the memory. This rate can always be achieved by using an optimal block code (with marginal ℰ) based on the typical sequences. In the following sections, when we define the various reading capacities for a memory with marginal cell ℰ, it is understood that these capacities are achievable by a suitable choice of the block encoding.

Quantum reading capacity
As we have discussed above, the quantity χ[ℰ|ρ(s, r)] represents the maximum information per cell which can be read from a memory with marginal cell ℰ if we use the transmitter ρ(s, r). This readout is asymptotic (i.e. involves a large block) and reliable (i.e. without errors). Now the crucial task is the optimization of χ[ℰ|ρ(s, r)] over the transmitters. As a first step, we can consider the readout capacity for a fixed number of input systems s and r, i.e.

C(ℰ|s, r) := max_{ρ(s,r)} χ[ℰ|ρ(s, r)].   (46)
By optimizing equation (46) over the number of input systems, we can define the unconstrained quantum reading capacity of the memory

C(ℰ) := sup_{s,r} C(ℰ|s, r).

This is the maximum information per cell which can be read from a memory with marginal cell ℰ.¹⁰ Since it is unconstrained, this capacity can be greatly simplified and easily computed. First of all, the maximization can be reduced to pure transmitters ψ(s, r), as a simple consequence of the convexity of the Holevo information [31] (see appendix A for more details). Further, the use of reference systems can be avoided. In other words, it is sufficient to consider the unassisted capacity, where we maximize over ψ(s, 0) = ψ(s). Finally, the supremum is achieved in the limit s → +∞, i.e. we can write

C(ℰ) = lim_{s→+∞} max_{ψ(s)} χ[ℰ|ψ(s)].

This quantity is the maximal possible, since it equals the maximum amount of information which can be stored in the marginal cell of the memory, given by the Shannon entropy of the marginal variable X = {x, p_x}. In other words, we have

C(ℰ) = H(X) := −Σ_x p_x log₂ p_x.

The proof is very easy (see appendix B for details). The notion of quantum reading capacity is nontrivial only in the presence of physical constraints. This happens in the bosonic setting, where optical memories are read by fixing the input signal energy. Thus, consider an optical memory with marginal cell ℰ = {φ_x, p_x}, where φ_x represents a single-mode bosonic channel. As the transmitter, consider the ∞-copy state ρ(s, r, n)^⊗∞, where ρ(s, r, n) ∈ D(H_S^⊗s ⊗ H_R^⊗r) describes s signal modes, irradiating n mean photons on a target cell, plus r additional reference modes bypassing the cell. At the output we have an infinite tensor product of states of the form

ρ_x(s, r, n) = (φ_x^⊗s ⊗ I_R)[ρ(s, r, n)],

which are detected by an optimal collective measurement. In this way, Bob is able to retrieve an average of χ[ℰ|ρ(s, r, n)] bits per cell. Now, we must optimize this quantity over the input transmitters while keeping the signal energy n fixed.
This constrained optimization leads to the definition of the quantum reading capacity of the optical memory

C(ℰ|n) := sup_{s,r} max_{ρ(s,r,n)} χ[ℰ|ρ(s, r, n)].   (52)

This capacity represents the maximum information per cell which can be read from an optical memory with marginal cell ℰ by irradiating n mean photons per cell. The computation of equation (52) is not easy. In fact, we are only able to provide lower bounds by restricting the class of transmitters involved in the maximization. We do not even know whether the optimal transmitters are pure or mixed.
Let us consider a set (or 'class') P of pure transmitters ψ(s, r, n) characterized by some general property which does not depend on s, r and n (for instance, they could be constructed using states of a particular kind, such as coherent states). Then we can always construct a mixed-state transmitter

ρ(s, r, n) = ∫ dy p_y ψ_y(s, r, n),   (53)

where p_y ≥ 0, ∫ dy p_y = 1 and ψ_y(s, r, n) ∈ P. Clearly, the set of mixed-state transmitters identifies a larger class A which includes P. Now, we can define a lower bound to C(ℰ|n) by optimizing over the class A, i.e.

C_A(ℰ|n) := sup_{s,r} max_{ρ(s,r,n)∈A} χ[ℰ|ρ(s, r, n)] ≤ C(ℰ|n).
Similarly, we can consider the further lower bound

C_P(ℰ|n) := sup_{s,r} max_{ψ(s,r,n)∈P} χ[ℰ|ψ(s, r, n)] ≤ C_A(ℰ|n).   (55)

Here we first ask: is there some class P that allows one to saturate equation (55), i.e. C_P(ℰ|n) = C_A(ℰ|n)? Then, is it possible to extend this class to all the pure transmitters, so that C_P(ℰ|n) = C(ℰ|n)? Unfortunately, we are not able to answer the second question, so the issue of the purity of the optimal transmitters remains unsolved. However, we are able to find classes for which C_P(ℰ|n) = C_A(ℰ|n). For this task, a sufficient criterion is the concavity of C_P(ℰ|n) in n.

Lemma 1. If C_P(ℰ|n) is concave in n, then we have C_P(ℰ|n) = C_A(ℰ|n).
Proof. Let us consider the transmitter of equation (53), whose signal energy (the mean number of photons) can be written as n = ∫ dy p_y n_y, where n_y = ⟨ψ_y|n̂|ψ_y⟩.
Given this transmitter at the input of a marginal cell ℰ, we can bound the conditional Holevo information as

χ[ℰ|ρ(s, r, n)] ≤ ∫ dy p_y χ[ℰ|ψ_y(s, r, n_y)]   (57)
≤ ∫ dy p_y C_P(ℰ|n_y)   (58)
≤ C_P(ℰ| ∫ dy p_y n_y)   (59)
= C_P(ℰ|n),   (60)

where we have used the convexity of χ in the first inequality (57), the definition of C_P(ℰ|n) in the second inequality (58) and its concavity in the last inequality (59). It is clear that equations (57)–(60) hold for every ρ(s, r, n) ∈ A and every s and r. As a result, we may write

sup_{s,r} max_{ρ(s,r,n)∈A} χ[ℰ|ρ(s, r, n)] ≤ C_P(ℰ|n),

which, combined with equation (55), proves the lemma. □
In the following section, we show that an important class P for which C_P(ℰ|n) is concave is that of coherent-state transmitters. This means that C_P(ℰ|n) = C_A(ℰ|n), where A is the class of classical transmitters (constructed by convex combination via the P-function). Thanks to this result, we can compute an analytical bound for the readout performance of all the classical transmitters, which we call the 'classical reading capacity'. This capacity represents the multi-cell generalization of the classical discrimination bound of section 3 and provides a simple lower bound to the quantum reading capacity. In section 6, we compute its analytical form for the most basic optical memories. Then, as we show in section 7, this classical bound can be easily outperformed by nonclassical transmitters, thus proving its separation from the quantum reading capacity.

Figure 6. Here the parameter u labels the unknown reflectivity κ_u of the cell medium (u = 0, 1). The multi-cell readout is realized by using an ∞-copy transmitter ρ(s, r, n)^⊗∞ which irradiates n mean photons per cell (each cell is probed by s signals coupled with r references). The total output is detected by an optimal collective measurement.

Classical reading capacity
Let us consider a multi-cell generalization of the binary model described in section 3. In the single-cell model of section 3, information was written in each cell in an independent fashion, by encoding one of two possible pure-loss channels, φ_0 and φ_1 (binary cell). Here we consider the multi-cell version, where Alice stores a channel codeword in the whole optical memory, regarded as a large block. In particular, the block encoding is such that the marginal cell is described by a binary ensemble

ℰ̃ = {φ_0, p; φ_1, 1 − p},

where 0 ≤ p ≤ 1 and φ_u is a pure-loss channel with transmissivity κ_u (u = 0, 1). Alternatively, we can use the notation ℰ̃ = {κ_0, p; κ_1, 1 − p}. Given this kind of memory, information is read by using an ∞-copy transmitter ρ(s, r, n)^⊗∞ (irradiating n mean photons per cell) and an optimal collective measurement. In particular, when the memory is read in reflection (as is typical of optical discs), the two parameters κ_0 and κ_1 represent two reflectivities and the scenario is the one depicted in figure 6. Now let us restrict the readout of this memory to classical transmitters. This means considering the ∞-copy input ρ_c(s, r, n)^⊗∞, where the single copy ρ_c(s, r, n) is an arbitrary classical state with s signal modes, r reference modes and n mean photons. The average information which can be read from each cell is provided by the Holevo quantity χ[ℰ̃|ρ_c(s, r, n)]. By optimizing over classical transmitters we get the lower bound

C_c(ℰ̃|n) := sup_{s,r} max_{ρ_c(s,r,n)} χ[ℰ̃|ρ_c(s, r, n)] ≤ C(ℰ̃|n),

which defines the classical reading capacity of the optical memory ℰ̃. This capacity represents the multi-cell version of the classical discrimination bound of section 3. As before, we can provide a simple analytical result.

Theorem 1.
Let us consider an optical memory with binary marginal cell ℰ̃ = {κ_0, p; κ_1, 1 − p} which is read by an arbitrary classical transmitter ρ_c(s, r, n) signalling n mean photons. Then, the maximum information per cell which can be retrieved is asymptotically equal to

C_c(ℰ̃|n) = H(ξ),   (66)

where H is the binary Shannon entropy and

ξ = (1/2)[1 + √(1 − 4p(1 − p)(1 − e^{−n(√κ_0 − √κ_1)²}))].   (67)

In particular, the bound C_c(ℰ̃|n) can be reached by using a coherent-state transmitter ρ_coh(1, 0, n) = |√n⟩_S⟨√n|, i.e. a single-mode coherent state with n mean photons.
Proof. Consider the class P = coh of coherent-state transmitters ρ_coh(s, r, n). By convex combination we may construct the class A = c of the classical transmitters ρ_c(s, r, n). The first step of the proof is the computation of C_coh(ℰ̃|n), i.e. the readout capacity restricted to coherent-state transmitters. We first prove that C_coh(ℰ̃|n) = χ[ℰ̃|ρ_coh(1, 0, n)], i.e. the optimal coherent-state transmitter is the single-mode coherent state |√n⟩_S⟨√n|. Then, we analytically compute χ[ℰ̃|ρ_coh(1, 0, n)]. Since this quantity turns out to be concave in n, we can use lemma 1 to demonstrate that C_coh(ℰ̃|n) = C_c(ℰ̃|n), thus achieving the result of the theorem.
Given a coherent-state transmitter as the input to the cell ℰ̃, the output is

ρ_u = (φ_u^⊗s ⊗ I_R)[ρ_coh(s, r, n)],   (68)

which is still a multimode coherent state. This is a simple consequence of the fact that φ_0 and φ_1 are pure-loss channels. Since we are computing the Holevo information of the output ensemble, we have the freedom to apply a (u-independent) unitary transformation over ρ_u. By using a suitable sequence of beam splitters and phase-shifters we can always transform ρ_u into the state

|√(κ_u n)⟩_S⟨√(κ_u n)| ⊗ (|0⟩⟨0|)^{⊗(r+s−1)}.

Then, since the Holevo information does not change under the addition of systems, we can trace out the r + s − 1 vacua and just consider the single-mode output state |√(κ_u n)⟩_S⟨√(κ_u n)|. This is equivalent to considering the single-mode coherent-state transmitter

ρ_coh(1, 0, n) = |√n⟩_S⟨√n|   (72)

at the input of the pure-loss channel φ_u. For fixed marginal cell ℰ̃ and fixed input energy n, the reduction from the multimode input of equation (68) to the single-mode output of equation (72) is always possible, independently of the actual number of systems s and r and the specific form of the coherent-state transmitter ρ_coh(s, r, n) (e.g. how the states are ordered and the energy is distributed). Thus, we can write

C_coh(ℰ̃|n) = χ[ℰ̃|ρ_coh(1, 0, n)].

In other words, the optimal coherent-state transmitter is the single-mode coherent state |√n⟩_S⟨√n|. The next step is the analytical computation of χ[ℰ̃|ρ_coh(1, 0, n)]. This corresponds to computing the Holevo information of the output ensemble {|√(κ_0 n)⟩_S, p; |√(κ_1 n)⟩_S, 1 − p}, which is just the von Neumann entropy of the average output state

ρ̄ = p|√(κ_0 n)⟩_S⟨√(κ_0 n)| + (1 − p)|√(κ_1 n)⟩_S⟨√(κ_1 n)|.

The computation of this output entropy is straightforward (see appendix C for details). After simple algebra, we obtain

χ[ℰ̃|ρ_coh(1, 0, n)] = H(ξ),

where H is the binary formula of the Shannon entropy and ξ = ξ(κ_0, κ_1, p, n) is given in equation (67). One can easily check that H(ξ) is a concave function of n, for any κ_0, κ_1 and p.
Since C_coh(ℰ̃|n) = H(ξ) is concave in the energy n, we can apply lemma 1 by setting P = coh and A = c. Thus we obtain C_c(ℰ̃|n) = C_coh(ℰ̃|n) = H(ξ), which is the result of equation (66). It is clear that the optimal classical transmitter coincides with the optimal coherent-state transmitter ρ_coh(1, 0, n). □
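Theorem 1 gives a closed formula that is easy to evaluate numerically. The sketch below (function names are ours) computes C_c(ℰ̃|n) = H(ξ) from equations (66)-(67) and checks two limiting behaviours: with no photons nothing can be read, while for large n the capacity saturates at H(p), the entropy actually stored in the cell.

```python
from math import exp, sqrt, log2

def binary_entropy(x):
    """H(x) = -x log2 x - (1-x) log2 (1-x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def classical_reading_capacity(k0, k1, p, n):
    """C_c = H(xi), eqs (66)-(67): optimal classical readout, i.e. a
    single-mode coherent state with n mean photons per cell."""
    overlap2 = exp(-n * (sqrt(k0) - sqrt(k1)) ** 2)   # |<a0|a1>|^2
    xi = 0.5 * (1 + sqrt(1 - 4 * p * (1 - p) * (1 - overlap2)))
    return binary_entropy(xi)

# Illustrative binary cell: reflectivities 0.2 / 0.8, equiprobable, n = 1.
cc = classical_reading_capacity(0.2, 0.8, 0.5, n=1.0)
```

At n = 0 the two output states coincide (overlap 1) and C_c vanishes; as n grows the coherent outputs become orthogonal and C_c → H(p).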
It is interesting to compare the single-cell and multi-cell classical discrimination bounds, in order to estimate the gain provided by the parallel readout of cells. For this sake, let us consider an optical memory whose binary marginal cell ℰ̃ stores the maximal data, i.e. one bit of information. This corresponds to the limit situation p = 1/2, for which we use the notation ℰ̄ = {κ_0, κ_1}. This case is the direct generalization of the single-cell model analyzed in section 3. Then, we compare the maximum information achievable when using classical transmitters in the multi-cell model, i.e. the classical reading capacity C_c(ℰ̄|n), with the maximum information that is readable by a nonclassical transmitter ρ_nc(s, r, n). The maximum number of readable bits per cell is given by the conditional Holevo information χ[ℰ̄|ρ_nc(s, r, n)]. Now we ask: is this quantity larger than the classical reading capacity C_c(ℰ̄|n)?
A first design of nonclassical transmitter is the EPR transmitter ρ_epr(s, s, n) = (|ξ⟩⟨ξ|)^⊗s, which was first discussed in section 3. In order to beat classical transmitters, it is sufficient to consider ρ_epr(1, 1, n) = |ξ⟩⟨ξ|, i.e. a single TMSV state per cell. This means that we have one signal mode S, irradiating n mean photons over a target cell, which is entangled with one reference mode R. To quantify the advantage we consider the information gain

G := χ[ℰ̄|ρ_epr(1, 1, n)] − C_c(ℰ̄|n)

and check its positivity. If G > 0, then the EPR transmitter ρ_epr(1, 1, n) beats all the classical transmitters, retrieving G bits per cell more than any classical strategy. As shown in figure 8, we have G > 0 in the regime of few photons and high reflectivities (i.e. κ_0 or κ_1 close to 1). This is the typical regime where the quantum reading of optical memories is advantageous, as also found in the single-cell scenario [20]. As is evident from figure 8, the best situation corresponds to having one of the two reflectivities equal to 1, i.e. to an 'ideal memory' ℰ̄ = {κ_0 < κ_1, κ_1 = 1}. Given such a memory, we explicitly compare the information read by an EPR transmitter, χ_epr = χ[ℰ̄|ρ_epr(1, 1, n)], with the classical reading capacity C_c(ℰ̄|n). The comparison is made at low signal energy n. As shown in figure 9 for the numerical value n = 1, the EPR transmitter is always able to beat the classical bound.
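The gain G can be estimated numerically. The sketch below is our own Fock-basis calculation (not code from the paper): it builds a TMSV state, applies the standard Kraus decomposition of the pure-loss channel to the signal mode, computes χ_epr for an ideal memory, and compares it with the classical bound H(ξ). Truncation dimension and parameter values (κ_0 = 0.25, κ_1 = 1, n = 1) are illustrative.

```python
import numpy as np
from math import comb, sqrt, exp, log2

N = 20  # Fock-space truncation per mode (adequate for ~1 mean photon)

def tmsv(nbar, N):
    """Two-mode squeezed vacuum with nbar mean signal photons."""
    lam = sqrt(nbar / (1.0 + nbar))
    c = sqrt(1.0 - lam ** 2) * lam ** np.arange(N)
    psi = np.zeros((N, N))
    np.fill_diagonal(psi, c)          # sum_n c_n |n>_S |n>_R
    return psi.reshape(-1)            # vector in H_S (x) H_R

def loss_on_signal(rho, kappa, N):
    """Pure-loss channel (transmissivity kappa) on the signal mode."""
    out = np.zeros_like(rho)
    I = np.eye(N)
    for k in range(N):                # Kraus operators K_k (k photons lost)
        K = np.zeros((N, N))
        for m in range(k, N):
            K[m - k, m] = sqrt(comb(m, k) * kappa ** (m - k) * (1 - kappa) ** k)
        Kf = np.kron(K, I)
        out = out + Kf @ rho @ Kf.T
    return out

def vn_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

nbar, k0, k1, p = 1.0, 0.25, 1.0, 0.5          # ideal memory, n = 1
psi = tmsv(nbar, N)
rho_in = np.outer(psi, psi)
rho0 = loss_on_signal(rho_in, k0, N)
rho1 = loss_on_signal(rho_in, k1, N)           # kappa = 1: state stays pure
chi_epr = vn_entropy(p * rho0 + (1 - p) * rho1) \
          - p * vn_entropy(rho0) - (1 - p) * vn_entropy(rho1)

# Classical reading capacity H(xi) from eqs (66)-(67).
overlap2 = exp(-nbar * (sqrt(k0) - sqrt(k1)) ** 2)
xi = 0.5 * (1 + sqrt(1 - 4 * p * (1 - p) * (1 - overlap2)))
c_classical = -xi * log2(xi) - (1 - xi) * log2(1 - xi)
gain = chi_epr - c_classical
```

With these parameters the gain comes out positive, consistent with the claim that the EPR transmitter beats the classical bound for ideal memories at n = 1.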
It is important to note that we can construct other simple examples of nonclassical transmitters that outperform the classical reading capacity. An alternative nonclassical transmitter can be taken again of the form ρ_nc(1, 1, n) and corresponds to a NOON state [34, 35], where the signal and reference are again entangled. A further example is of the form ρ_nc(1, 0, n), i.e. not involving the reference mode: this is the Fock state |n⟩_S⟨n|. As shown in figure 9, these transmitters can beat not only the classical reading capacity but also the EPR transmitter |ξ⟩⟨ξ| for low values of κ_0. Recently, these kinds of transmitters were also studied in [21] within the context of quantum reading with single-cell readout.
It is interesting to compare the performances of all these transmitters in the low-energy readout of optical memories with very close reflectivities. This is shown in figure 10 for κ_1 − κ_0 = 0.01 and n = 1. The EPR transmitter |ξ⟩⟨ξ| is optimal almost everywhere, while the classical bound beats the other nonclassical transmitters for low values of κ_1 (and κ_0). This is also compatible with the optimality of the TMSV state for the problem of estimating the unknown loss parameter of a bosonic channel [36]. As is evident from figure 10, a larger separation from the classical bound occurs at high reflectivities, i.e. κ_1 close to 1.
It is also interesting to see what happens in the regime of low reflectivity, by considering a binary marginal cell ℰ̄ = {κ_0, κ_1} with κ_0 = 0. For this comparison we introduce another nonclassical transmitter of the form ρ_nc(1, 0, n). This is the squeezed coherent state |α, ξ⟩ = D(α)S(ξ)|0⟩, where D(α) is the displacement operator and S(ξ) is the squeezing operator [25]. The squeezed coherent state is chosen with the squeezing along the same direction as the displacement. Without loss of generality, we take this direction to be the axis of the position quadrature. This means choosing two real parameters, α and ξ, which are optimized under the photon-number constraint α² + sinh²ξ = n. As shown in figure 11, the presence of squeezing is sufficient to outperform the classical reading capacity in the regime of low reflectivity. However, better performances can be achieved by the Fock state for high values of κ_1.
From the previous analysis it is evident that, in the regime of low photon number (down to one photon per cell), we can easily find nonclassical transmitters able to beat any classical transmitter, i.e. the classical reading capacity. This is particularly evident at high reflectivities (κ_1 or κ_0 close to 1). Thus, for the most basic optical memories, the classical and quantum reading capacities are separated at low energies. In other words, the advantages of quantum reading fully extend from the single-cell to the optimal multi-cell scenario. At this point a series of important considerations are in order. First of all, note that we have only considered nonclassical transmitters irradiating one signal mode (entangled or not with a single reference mode), i.e. transmitters of the kind ρ_nc(1, 0, n) or ρ_nc(1, 1, n). The reason is that these transmitters are already sufficient to beat the classical bound. However, better performances could be reached by optimizing over the number of signals and references. In the case of EPR transmitters, we expect that ρ_epr(2, 2, n), which is composed of two TMSV states signalling n/2 mean photons each, is able to outperform ρ_epr(1, 1, n), i.e. a single TMSV state signalling n mean photons. This is shown in figure 12 for the case of an ideal memory and n = 1 mean photons. This advantage could further improve for EPR transmitters ρ_epr(s, s, n) with higher values of s. For this reason, in order to reach the quantum reading capacity, it is necessary to optimize over an arbitrary number of signal and reference modes, as foreseen by the general definition of equation (52). Another important consideration concerns the practical realization of quantum reading: in order to be experimentally feasible, the detection scheme should be as simple as possible.
For this reason, it is interesting to compare the classical reading capacity (which refers to the general multi-cell readout) with the performances of EPR transmitters in the single-cell scenario, where each cell is encoded and detected independently of all the others. Thus, we consider an ideal memory ℰ̄ = {κ_0 < κ_1, κ_1 = 1} which is irradiated by a few mean photons per cell (in particular, we consider the numerical value n = 5). Given this memory, we compare the optimal performance C_c(ℰ̄|n) of classical transmitters assuming the multi-cell readout (optimal block encoding and collective measurement) with the performance of EPR transmitters ρ_epr(s, s, n) assuming the single-cell readout (single-cell encoding and individual measurements). The latter quantity is given by the mutual information

I[ℰ̄|ρ_epr(s, s, n)] = 1 − H(P[ℰ̄|ρ_epr(s, s, n)]),

where H is the binary formula for the Shannon entropy and P[ℰ̄|ρ_epr(s, s, n)] is the error probability of the single-cell readout. One can compute an upper bound on P[ℰ̄|ρ_epr(s, s, n)], which provides a lower bound for the mutual information,

Q(ℰ̄|s, n) ≤ I[ℰ̄|ρ_epr(s, s, n)].   (83)

Thus, Q(ℰ̄|s, n) provides the minimum number of bits per cell that are read by an EPR transmitter ρ_epr(s, s, n). For fixed signal energy n, it is easy to check that this quantity is increasing in s. This means that for any integer s we have

Q(ℰ̄|1, n) ≤ Q(ℰ̄|s, n) ≤ Q(ℰ̄|∞, n),

where Q(ℰ̄|1, n) corresponds to a single energetic TMSV state ρ_epr(1, 1, n) and Q(ℰ̄|∞, n) corresponds to ρ_epr(∞, ∞, n), i.e. infinite copies of TMSV states with vanishing energy. The quantity Q(ℰ̄|∞, n) is computed by taking the limit s → ∞ in equation (83); in this limit, the bounding error probability tends to θ, with θ given in equation (34). In the left panel of figure 13 we explicitly compare the two extremal values Q(ℰ̄|1, n) and Q(ℰ̄|∞, n) with the classical reading capacity C_c(ℰ̄|n). As we can see, the single-cell quantum reading is able to beat the asymptotic multi-cell classical reading. Finally, it is interesting to check whether single-cell quantum reading remains a superior readout strategy even under a stronger energy constraint, for instance if we fix the mean total number of photons in both the signal and reference modes for each copy of the transmitter. Note that this approach has also been considered in [37] for the analysis of loss detection in bosonic channels. While this stronger energy constraint does not make any difference to the classical reading capacity (since the optimal classical transmitter involves signal modes only), it clearly affects the EPR transmitters, where the mean total energy of the TMSV states is split exactly in two between signal and reference modes. Imposing this stronger energy constraint corresponds to comparing C_c(ℰ̄|n) with Q(ℰ̄|1, n/2) and Q(ℰ̄|∞, n/2). As shown by the right panel of figure 13, the single-cell quantum reading is still able to beat the asymptotic multi-cell classical reading.

Figure 13. Left panel: comparison of the extremal values Q(ℰ̄|1, n) and Q(ℰ̄|∞, n) with the classical reading capacity C_c(ℰ̄|n). Right panel: comparison of C_c(ℰ̄|n) with Q(ℰ̄|1, n/2) and Q(ℰ̄|∞, n/2). Despite assuming a stronger energy constraint involving the mean total number of photons in both signal and reference modes, the single-cell quantum reading is still able to outperform asymptotic multi-cell classical reading.
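The conversion from an error-probability bound into a bits-per-cell rate can be sketched as follows (a generic helper in our own notation): any upper bound P̃ ≤ 1/2 on the single-cell error probability certifies the lower bound Q = 1 − H(P̃) on the mutual information, and a tighter error bound certifies a higher rate.

```python
from math import log2

def binary_entropy(x):
    """H(x) with the conventions H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def bits_per_cell(p_err_bound):
    """Q = 1 - H(P~): guaranteed readable bits per cell, given an
    upper bound P~ <= 1/2 on the single-cell error probability."""
    return 1.0 - binary_entropy(p_err_bound)

# A smaller error bound certifies a higher readout rate.
rates = [bits_per_cell(p) for p in (0.25, 0.10, 0.01)]
```

This is why a bound that decreases with s immediately implies that Q(ℰ̄|s, n) is increasing in s.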

Conclusion and discussion
In this paper we have extended the model of quantum reading to the optimal and asymptotic multi-cell scenario. Here the classical memory is modelled as a large block of cells where information is stored by encoding a suitable channel codeword (channel encoding). This information is then retrieved by probing the whole memory in a parallel fashion and detecting the output via an optimal collective measurement (channel discrimination). In this general scenario, we define the quantum reading capacity of the memory.
It is important to note that this notion of capacity also applies to more practical scenarios where the memory is composed of a number > 1 of large encoding blocks which are independently read by the decoder. In fact, as long as each block is read in parallel by a tensor-product transmitter and a collective measurement, and the marginal cells of the different blocks are identical, our derivation can be repeated as before. In the case where the (large) blocks have different marginal cells ℰ_1, ℰ_2, …, the quantum reading capacity of the memory is given by a weighted sum of C(ℰ_1), C(ℰ_2), …. In other words, if the ith block occurs with probability p_i, then the memory can be described by the set {p_1, ℰ_1; p_2, ℰ_2; …} and its quantum reading capacity is given by p_1 C(ℰ_1) + p_2 C(ℰ_2) + · · ·.
We have then discussed how the quantum reading capacity becomes a nontrivial quantity to compute once physical constraints are imposed on the decoder. In the case of optical memories, where data encoding is realized by bosonic channels, the main constraint of physical interest is energetic. This leads to defining the quantum reading capacity of an optical memory as the maximum number of bits per cell which can be read by irradiating n mean photons per cell.
Despite the difficulty of a general calculation of this capacity, we are able to provide nontrivial lower bounds in the case of optical memories with binary cells. The first lower bound, which we call the classical reading capacity, represents the maximum number of bits per cell which can be read by classical transmitters. This bound has a simple analytical formula and can be achieved using a single-mode coherent state transmitter. Besides this result, we have also computed other bounds by considering particular kinds of nonclassical transmitters, including the ones constructed with TMSV states (EPR transmitters), NOON states and Fock states. We have shown that, in the regime of a few photons and high reflectivities, these nonclassical transmitters are able to outperform any classical transmitter, thus showing a separation between the classical and the quantum reading capacities. It is remarkable that using a single-mode or two-mode transmitter per cell is already sufficient to beat any classical strategy. Furthermore, we have shown that the classical reading capacity can be outperformed even if we restrict the EPR transmitters to single-cell readout and adopt a stronger energy constraint where the energy of the reference modes is also taken into account.
In conclusion, our study considers the optimal multi-cell encoding for classical memories where we fully extend the advantages of quantum reading, i.e. the readout by nonclassical transmitters. These advantages are particularly evident in the regime of a few photons with nontrivial consequences for the technology of data storage.
For the sake of simplicity, let us assume that this state ψ is the same for all the channels, i.e. equation (B.1) holds for any x ≠ x′. Then, by exploiting the multiplicativity of the fidelity under tensor products, we get

F(ρ_x(s), ρ_{x′}(s)) = F(ρ_x, ρ_{x′})^s

for any x ≠ x′. Now, since this quantity goes to zero for s → +∞, the multi-copy output states ρ_x(s) become asymptotically orthogonal. This implies that χ[ℰ|ψ(s)] = χ({ρ_x(s), p_x}) → H(X). In the general case, the input can be partitioned into blocks of sizes s_0, …, s_{k−1}, where s_0 + · · · + s_{k−1} = s. It is easy to check that the output states again become asymptotically orthogonal.
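The asymptotic argument can be checked numerically for a pair of pure states: with single-copy overlap t < 1, the s-copy overlap t^s vanishes, and the Holevo information of the binary ensemble, H([1 + √(1 − 4p(1−p)(1 − t^{2s}))]/2), tends to the Shannon entropy H(p). The snippet below is our illustration with arbitrary values of p and t.

```python
from math import sqrt, log2

def H(x):
    """Binary Shannon entropy with H(0) = H(1) = 0."""
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)

def chi_two_pure(c2, p):
    """Holevo information of {psi_0, p; psi_1, 1-p}, squared overlap c2."""
    return H(0.5 * (1 + sqrt(1 - 4 * p * (1 - p) * (1 - c2))))

p, t = 0.3, 0.9                     # illustrative prior and single-copy overlap
chis = [chi_two_pure(t ** (2 * s), p) for s in (1, 10, 100, 1000)]
shannon = H(p)                      # asymptotic value H(X)
```

As s grows, χ increases monotonically and converges to H(X), exactly as the appendix argues.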
In the specific case where the two coherent states have amplitudes

α_0 = √(κ_0 n),  α_1 = √(κ_1 n),   (C.8)

their overlap is given by

|⟨α_0|α_1⟩|² = e^{−(α_0 − α_1)²} = e^{−n(√κ_0 − √κ_1)²}.

By replacing this quantity in equation (C.7), we obtain the expression of equation (67). Thus we have computed the output entropy of a single-mode coherent-state transmitter |√n⟩⟨√n| at the input of a binary cell with transmissivities κ_0 and κ_1. Since the von Neumann entropy is invariant under unitaries, this is also the output entropy of an arbitrary coherent-state transmitter which irradiates n mean photons over the binary cell.
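The overlap formula can be verified directly from the Fock expansion |α⟩ = e^{−α²/2} Σ_m (α^m/√m!)|m⟩ for real α, which gives ⟨α_0|α_1⟩ = e^{−(α_0²+α_1²)/2} e^{α_0 α_1} = e^{−(α_0−α_1)²/2}. A quick numerical check (our own, with illustrative parameters):

```python
from math import exp, sqrt, factorial

k0, k1, n = 0.3, 0.9, 2.0            # illustrative reflectivities and energy
a0, a1 = sqrt(k0 * n), sqrt(k1 * n)  # amplitudes of equation (C.8)

# Overlap from the Fock expansion, truncated at 60 photons.
series = sum((a0 * a1) ** m / factorial(m) for m in range(60))
overlap2_fock = (exp(-(a0 ** 2 + a1 ** 2) / 2) * series) ** 2

# Closed form used in the text: |<a0|a1>|^2 = exp(-n (sqrt(k0)-sqrt(k1))^2).
overlap2_exact = exp(-n * (sqrt(k0) - sqrt(k1)) ** 2)
```

The truncated series agrees with the closed form to machine precision, confirming the substitution that leads from equation (C.7) to equation (67).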