Semidefinite programming hierarchies for constrained bilinear optimization

Berta, Mario; Borderi, Francesco; Fawzi, Omar; Scholz, Volkher B.

doi:10.1007/s10107-021-01650-1

Full Length Paper
Series A
Open access
Published: 15 April 2021

Volume 194, pages 781–829, (2022)
Mathematical Programming


Abstract

We give asymptotically converging semidefinite programming hierarchies of outer bounds on bilinear programs of the form ${\mathrm {Tr}}\big [H(D\otimes E)\big ]$, maximized with respect to semidefinite constraints on D and E. Applied to the problem of approximate error correction in quantum information theory, this gives hierarchies of efficiently computable outer bounds on the success probability of approximate quantum error correction codes in any dimension. The first level of our hierarchies corresponds to a previously studied relaxation (Leung and Matthews in IEEE Trans Inf Theory 61(8):4486, 2015) and positive partial transpose constraints can be added to give a sufficient criterion for the exact convergence at a given level of the hierarchy. To quantify the worst case convergence speed of our sum-of-squares hierarchies, we derive novel quantum de Finetti theorems that allow imposing linear constraints on the approximating state. In particular, we give finite de Finetti theorems for quantum channels, quantifying closeness to the convex hull of product channels as well as closeness to local operations and classical forward communication assisted channels. As a special case this constitutes a finite version of Fuchs-Schack-Scudo’s asymptotic de Finetti theorem for quantum channels. Finally, our proof methods answer a question of Brandão and Harrow (Proceedings of the forty-fourth annual ACM symposium on theory of computing, STOC’12, p 307, 2012) by improving the approximation factor of de Finetti theorems with no symmetry from $O(d^{k/2})$ to ${\mathrm {poly}}(d,k)$, where d denotes local dimension and k the number of copies.

1 Introduction

In this paper, we study constrained bilinear optimization problems of the form

$$\begin{aligned} Q=\max&\quad {\mathrm {Tr}}\big [H (D\otimes E)\big ]\nonumber \\ s.t.&\quad D \in {\mathcal {P}}_D=\Pi _{A \rightarrow D}({\mathcal {S}}_A^+ \cap {\mathcal {A}}_{A})\nonumber \\&\quad E \in {\mathcal {P}}_E=\Pi _{B \rightarrow E}({\mathcal {S}}_B^+ \cap {\mathcal {A}}_{B}), \end{aligned}$$

(1)

where H denotes a matrix in ${\mathcal {S}}_D\otimes {\mathcal {S}}_E$ for ${\mathcal {S}}_D={\mathbb {C}}^{d_D\times d_D}$ and $\otimes $ the Kronecker tensor product, and ${\mathcal {P}}_{D}$ and ${\mathcal {P}}_{E}$ are positive semidefinite representable sets such that:

$\Pi _{A \rightarrow D}:{\mathcal {S}}_A\rightarrow {\mathcal {S}}_D$ and $\Pi _{B \rightarrow E}:{\mathcal {S}}_B\rightarrow {\mathcal {S}}_E$ are linear maps
${\mathcal {S}}_A^+$ and ${\mathcal {S}}_B^+$ are the sets of positive semidefinite unit trace matrices in ${\mathcal {S}}_A$ and ${\mathcal {S}}_B$, respectively
${\mathcal {A}}_{A}$ and ${\mathcal {A}}_B$ are affine subspaces of ${\mathcal {S}}_A$ and ${\mathcal {S}}_B$, respectively.

Our main motivation to study problems of the form (1) comes from quantum information theory—or more specifically the problem of approximate quantum error correction. We present this application and its motivation in detail in Sect. 4, but continue here with the general mathematical setting.

To discuss our approach, we first rewrite (1) by defining $G_{AB}= (\Pi _{A \rightarrow D}^{\dagger } \otimes \Pi _{B \rightarrow E}^{\dagger })(H)$, where $\Pi ^{\dagger }$ denotes the adjoint map of $\Pi $ in the Hilbert-Schmidt inner product. This leads to the form

$$\begin{aligned} Q=\max&\quad {\mathrm {Tr}}\big [G_{AB} (W_{A} \otimes W_{B})\big ]\nonumber \\ s.t.&\quad W_{A} \succeq 0,\quad W_{B} \succeq 0,\quad {\mathrm {Tr}}[W_A] = {\mathrm {Tr}}[W_B] = 1\nonumber \\&\quad \Lambda _{A\rightarrow C_A}\left( W_{A}\right) =X_{C_A},\quad \Gamma _{B\rightarrow C_B}\left( W_{B}\right) =Y_{C_B}, \end{aligned}$$

(2)

where $G_{AB}\in {\mathcal {S}}_A\otimes {\mathcal {S}}_B$, $\Lambda _{A\rightarrow C_A}:{\mathcal {S}}_A \rightarrow {\mathcal {S}}_{C_A}$ and $\Gamma _{B\rightarrow C_B}:{\mathcal {S}}_B \rightarrow {\mathcal {S}}_{C_B}$ denote linear maps, and $X_{C_A}\in {\mathcal {S}}_{C_A}$ and $Y_{C_B}\in {\mathcal {S}}_{C_B}$ are the matrices defining ${\mathcal {A}}_{A}$ and ${\mathcal {A}}_B$ as the affine subspaces associated with the kernels of the linear maps $\Lambda _{A\rightarrow C_A}$ and $\Gamma _{B\rightarrow C_B}$, respectively. Now, by the linearity of the objective function we can equivalently optimise over the convex hull of feasible points

$$\begin{aligned} Q=\max&\quad {\mathrm {Tr}}\left[ G_{AB}\left( \sum _{i\in I}p_iW_A^i \otimes W_B^i\right) \right] \nonumber \\ s.t.&\quad p_i\ge 0,\quad W_A^i \succeq 0,\quad W_B^i \succeq 0,\quad {\mathrm {Tr}}\left[ W_A^i\right] = {\mathrm {Tr}}\left[ W_B^i\right] = 1\quad \forall i\in I,\nonumber \\&\quad \sum _{i\in I}p_i=1,\,\, \Lambda _{A\rightarrow C_A}\left( W_A^i\right) =X_{C_A},\quad \Gamma _{B\rightarrow C_B}\left( W_B^i\right) =Y_{C_B}\quad \forall i\in I. \end{aligned}$$

(3)

That is, in the language of quantum information theory we are maximizing over a subset of the so-called separable quantum states—where the latter is defined on $A\otimes B$ as

$$\begin{aligned} \text {Sep}(A:B)&=\Bigg \{\sum _{i\in I}p_iW_A^i \otimes W_B^i: W_A^i \succeq 0, W_B^i \succeq 0, {\mathrm {Tr}}\left[ W_A^i\right] \\&={\mathrm {Tr}}\left[ W_B^i\right] = 1,p_i\ge 0,\sum _{i\in I}p_i=1\Bigg \}. \end{aligned}$$

Recall that matrices $W_A\in {\mathcal {S}}^+_A$ are called quantum states on system A—and similarly for bipartite states on $A\otimes B$.
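The passage from (1) to (2) via the adjoint maps can be checked numerically. The following is a minimal sketch, assuming (for illustration only) that $\Pi _{A \rightarrow D}$ and $\Pi _{B \rightarrow E}$ are given in the Kraus-type form $\Pi (W)=\sum _k K_k W K_k^{\dagger }$, which is a special case of the general linear maps considered here; the names `apply_map` and `apply_adjoint_pair` and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    """Random density matrix: positive semidefinite with unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    w = g @ g.conj().T
    return w / np.trace(w).real

def apply_map(kraus, w):
    """Pi(W) = sum_k K_k W K_k^dagger (assumed Kraus-type form)."""
    return sum(k @ w @ k.conj().T for k in kraus)

def apply_adjoint_pair(kraus_a, kraus_b, h):
    """(Pi_A^dagger kron Pi_B^dagger)(H) = sum_{k,l} (K_k kron L_l)^dagger H (K_k kron L_l)."""
    total = 0
    for k in kraus_a:
        for l in kraus_b:
            kl = np.kron(k, l)
            total = total + kl.conj().T @ h @ kl
    return total

d_A, d_B, d_D, d_E = 3, 2, 2, 2
kraus_a = [rng.normal(size=(d_D, d_A)) for _ in range(2)]  # Pi_{A->D}
kraus_b = [rng.normal(size=(d_E, d_B)) for _ in range(2)]  # Pi_{B->E}

h = rng.normal(size=(d_D * d_E, d_D * d_E))
H = h + h.T                                   # symmetric objective matrix

W_A, W_B = random_state(d_A), random_state(d_B)
D, E = apply_map(kraus_a, W_A), apply_map(kraus_b, W_B)

value_form_1 = np.trace(H @ np.kron(D, E))            # objective of (1)
G = apply_adjoint_pair(kraus_a, kraus_b, H)           # G_AB of (2)
value_form_2 = np.trace(G @ np.kron(W_A, W_B))        # objective of (2)
```

Both objective values agree up to floating-point error, which is exactly the defining property of the adjoint in the Hilbert-Schmidt inner product.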

Now, approximating the set of separable states within the set of bipartite states is a ubiquitous but hard problem in quantum information theory (see, e.g., [4]). Nevertheless, as realized in [25] the set of separable states can be approximated by the sum-of-squares hierarchies of Lasserre [52] and Parrilo [60]. This led to the semidefinite programming hierarchy of Doherty-Parrilo-Spedalieri (DPS), which is extensively employed to characterize quantum correlations in quantum information theory [24]. The underlying idea of the DPS hierarchy is that separable states $W_{AB}=\sum _i p_i W_A^i \otimes W_B^i$ on $A\otimes B$, where $\{p_i\}_{i\in I}$ is a probability distribution, are n-extendible to $W_{AB_1^n}=\sum _i p_iW_A^i \otimes (W_B^i)^{\otimes n}$ on $A\otimes B^{\otimes n}$ for any n, such that we have for any permutation $\pi $ that^{Footnote 1}

$$\begin{aligned} W_{AB_1^n}=\left( {\mathcal {I}}_A\otimes {\mathcal {U}}_{B_1^n}^\pi \right) (W_{AB_1^n}) \end{aligned}$$

with ${\mathcal {I}}_A$ the identity map on ${\mathcal {S}}_A$ and ${\mathcal {U}}_{B_1^n}^\pi $ the unitary map that permutes the systems $B_1^n$ according to $\pi \in {\mathfrak {S}}_n$—the symmetric group of n elements. The state $W_{AB_1^n}=\sum _i p_iW_A^i \otimes (W_B^i)^{\otimes n}$ is an extension of $W_{AB}$, meaning that we again get back the original state $W_{AB}$ when throwing away all additional systems $B_2^n$: ${\mathrm {Tr}}_{B_2^n}(W_{AB_1^n})=W_{AB}$.^{Footnote 2} Due to the monogamy of quantum correlations, however, general states do not have this property [20, 40]. In fact, finite quantum de Finetti theorems quantify, with upper bounds, the distance of n-extendible states to separable states [17], with convergence in the limit $n\rightarrow \infty $ [69]. More precisely, [17, Theorem II.7] gives that for states $W_{AB}$ n-extendible to $W_{AB_1^n}$, there exists a probability distribution $\{p_i\}_{i\in I}$ and states $W^i_A,W_B^i$ on A and B, respectively, such that

$$\begin{aligned} \left\| W_{AB}-\sum _{i\in I}p_iW_A^i\otimes W_B^i\right\| _1\le \frac{2d_B^2}{n}, \end{aligned}$$

(4)

where $\Vert X\Vert _1={\mathrm {Tr}}\left[ |X|\right] $ denotes the Schatten one-norm and $d_B$ the dimension of B. Crucially, n-extendability has a semidefinite representation and this then immediately gives efficient semidefinite approximations of the set $\text {Sep}(A:B)$ for any fixed n.
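The two defining properties of the extension (permutation invariance on the B-systems and the correct marginal) can be verified directly for $n=2$. This is an illustrative sketch with randomly generated states; the helper names and the chosen dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(d):
    """Random density matrix: positive semidefinite with unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    w = g @ g.conj().T
    return w / np.trace(w).real

def partial_trace_last(w, d_keep, d_drop):
    """Trace out the last tensor factor of a matrix on C^{d_keep} kron C^{d_drop}."""
    w = w.reshape(d_keep, d_drop, d_keep, d_drop)
    return np.einsum('ixjx->ij', w)

d_A, d_B = 2, 2
p = np.array([0.5, 0.3, 0.2])
states_A = [random_state(d_A) for _ in p]
states_B = [random_state(d_B) for _ in p]

# Separable state and its 2-extension sum_i p_i W_A^i kron (W_B^i)^{kron 2}.
W_AB = sum(pi * np.kron(a, b) for pi, a, b in zip(p, states_A, states_B))
W_ABB = sum(pi * np.kron(a, np.kron(b, b)) for pi, a, b in zip(p, states_A, states_B))

# Swap operator on B_1 B_2: maps |y, x> to |x, y>.
swap = np.zeros((d_B * d_B, d_B * d_B))
for x in range(d_B):
    for y in range(d_B):
        swap[x * d_B + y, y * d_B + x] = 1.0
U = np.kron(np.eye(d_A), swap)

permuted = U @ W_ABB @ U.conj().T                       # (I_A kron U^pi)(W)
reduced = partial_trace_last(W_ABB, d_A * d_B, d_B)     # Tr_{B_2}(W)
```

Here `permuted` coincides with `W_ABB` and `reduced` coincides with `W_AB`, so the separable state is 2-extendible; the same construction works for any n.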

For our setting, however, we are interested more generally in characterizing bipartite states that are separable, but subject to linear constraints on the $W_A^i,W_B^i$ as well. As such, the approach we use to generate convergent semidefinite programming hierarchies for the constrained bilinear optimizations (3) is based on deriving finite de Finetti representation theorems with additional linear constraints. This leads to our main finding, the semidefinite programs

$$\begin{aligned} {\mathrm {SDP}}_n=\max&\quad {\mathrm {Tr}}\big [G_{AB} W_{AB}\big ]\nonumber \\ s.t.&\quad W_{AB_1^n}\succeq 0, {\mathrm {Tr}}(W_{AB_1^n}) = 1, \;W_{AB_1^n}=\left( {\mathcal {I}}_A\otimes {\mathcal {U}}_{B_1^n}^\pi \right) \left( W_{AB_1^n}\right) \;\forall \pi \in {\mathfrak {S}}_n\nonumber \\&\quad \left( \Lambda _{A\rightarrow C_A}\otimes {\mathcal {I}}_{B_1^n}\right) \left( W_{AB_1^n}\right) =X_{C_A}\otimes W_{B_1^n},\nonumber \\&\quad \left( {\mathcal {I}}_{B_1^{n-1}}\otimes \Gamma _{B_n\rightarrow C_B}\right) \left( W_{B_1^n}\right) =W_{B_1^{n-1}}\otimes Y_{C_B} \end{aligned}$$

(5)

form a sequence of upper bounds on Q with the property

$$\begin{aligned} 0 \le {\mathrm {SDP}}_n - Q \le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}\quad \text {implying}\quad Q=\lim _{n\rightarrow \infty }{\mathrm {SDP}}_n, \end{aligned}$$

where $d=\max \{d_A,d_B\}$ and ${\mathrm {poly}}(d)$ denotes a term at most polynomial in d. Notice that the state $W_{AB}$ appearing in the objective function of (5), is the reduced state of $W_{AB_1^n}$ on $A\otimes B_1$, i.e., ${\mathrm {Tr}}_{B_2^n}(W_{AB_1^n})=W_{AB}$.
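The inequality $Q\le {\mathrm {SDP}}_n$ rests on the observation that every feasible pair $(W_A,W_B)$ of (2) gives a feasible point $W_A\otimes W_B^{\otimes n}$ of (5) with the same objective value. The sketch below checks the two linear constraints of (5) for $n=2$, assuming for illustration that $\Lambda $ and $\Gamma $ are Kraus-type maps and that $X_{C_A}$ and $Y_{C_B}$ are defined as their images; all helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_state(d):
    """Random density matrix: positive semidefinite with unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    w = g @ g.conj().T
    return w / np.trace(w).real

def apply_map(kraus, w):
    """Linear map in the assumed form W -> sum_k K_k W K_k^dagger."""
    return sum(k @ w @ k.conj().T for k in kraus)

def apply_on_first(kraus, w, d_rest):
    """Apply the map to the first tensor factor, identity on the rest."""
    eye = np.eye(d_rest)
    return sum(np.kron(k, eye) @ w @ np.kron(k, eye).conj().T for k in kraus)

def apply_on_last(kraus, w, d_rest):
    """Apply the map to the last tensor factor, identity on the rest."""
    eye = np.eye(d_rest)
    return sum(np.kron(eye, k) @ w @ np.kron(eye, k).conj().T for k in kraus)

d_A, d_B, d_C = 2, 2, 3
lam = [rng.normal(size=(d_C, d_A)) for _ in range(2)]   # Lambda_{A -> C_A}
gam = [rng.normal(size=(d_C, d_B)) for _ in range(2)]   # Gamma_{B -> C_B}

W_A, W_B = random_state(d_A), random_state(d_B)
X = apply_map(lam, W_A)          # chosen so that (W_A, W_B) is feasible for (2)
Y = apply_map(gam, W_B)

W_BB = np.kron(W_B, W_B)         # marginal on B_1 B_2
W_ABB = np.kron(W_A, W_BB)       # candidate point for (5) with n = 2

lhs_A, rhs_A = apply_on_first(lam, W_ABB, d_B * d_B), np.kron(X, W_BB)
lhs_B, rhs_B = apply_on_last(gam, W_BB, d_B), np.kron(W_B, Y)
```

Both constraint residuals vanish up to rounding, and permutation invariance holds trivially for an i.i.d. product, so the product point is feasible for (5).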

The remainder of our manuscript is structured as follows. In Sect. 2 we give quantum de Finetti theorems with linear constraints and in Sect. 3 we present how these lead to an outer hierarchy of converging SDP relaxations for constrained bilinear optimization of the form (1). In Sect. 4, we then discuss as a special case de Finetti theorems for quantum channels (Sect. 4.3), which we utilise for our main application about approximate quantum error correction (Sect. 4.4). We end with some conclusions in Sect. 5. Some arguments and extended material are deferred to appendices, which includes some basic numerical studies in Appendix B.

We should mention that in recent work, optimization problems similar to (1), termed jointly constrained semidefinite bilinear programming, were studied in [41], where it was pointed out that they appear in various forms throughout quantum information theory. We note that the approach in [41] is based on non-commutative extensions of the classical branch-and-bound algorithm from [1] and is complementary to ours. We should also distinguish the setting (2) studied here from our previous work on quantum bilinear optimization [9], where we were interested in bilinear optimizations of the form

$$\begin{aligned} \max&\quad \sum _{\alpha ,\beta }G_{\alpha ,\beta }\langle \psi |E_\alpha D_\beta |\psi \rangle \nonumber \\ s.t.&\quad |\psi \rangle \in {\mathcal {H}}:\,\text {Hilbert space}\nonumber \\&\quad E_\alpha ,D_\beta \, \text {Hermitian with}\, [E_\alpha ,D_\beta ]=0, \end{aligned}$$

(6)

where $E_{\alpha }$ and $D_\beta $ are operators acting on ${\mathcal {H}}$ subject to polynomial constraints given by the set of conditions $\{[E_{\alpha }, D_{\beta }]=0\}_{\alpha ,\beta }$ expressed by commutators, i.e., $[E_{\alpha }, D_{\beta }]=E_{\alpha }D_{\beta }-D_{\beta }E_{\alpha }$. Note that in this latter setting (6) the dimension of the underlying Hilbert space is unbounded and optimized over as well [62]. In contrast, for our optimisation (2) the dimension of the variables is fixed in advance. As such, the scope of applications of our current work is different.

2 De Finetti theorems with linear constraints

2.1 Notation

In the following, we introduce some notation that is standard in quantum information theory. A $d_A$-dimensional quantum system (or in short system) is given by an inner product space ${\mathbb {C}}^{d_A}$ and denoted by A. Quantum states (or in short states) on A are matrices^{Footnote 3}

$$\begin{aligned} W_A\in {\mathcal {S}}_A:={\mathbb {C}}^{d_A\times d_A}\,\text { with } \,{\mathrm {Tr}}[W_A]=1\,\text { and}\, W_A\succeq 0, \end{aligned}$$

where $\succeq $ denotes the operator (Loewner) order. Quantum states of rank one are called pure and can be written as $W_A=\vert \psi \rangle \langle \psi \vert _A$, where $\vert \psi \rangle _A\in {\mathbb {C}}^{d_A}$ and $\vert \psi \rangle \langle \psi \vert _A\in {\mathcal {S}}_A$ denotes the rank-one projector on the vector $\vert \psi \rangle _A$.

A bipartite system $AB:=A\otimes B$ is given by an inner product space ${\mathbb {C}}^{d_A}\otimes {\mathbb {C}}^{d_B}$, where $\otimes $ denotes the Kronecker tensor product. Correspondingly, states on AB are matrices $W_{AB}\in {\mathcal {S}}_A\otimes {\mathcal {S}}_B$ with ${\mathrm {Tr}}[W_{AB}]=1$ and $W_{AB}\succeq 0$. Separable states are states on AB that are in the convex hull of product states $W_A\otimes W_B$, with $W_A$ and $W_B$ states on A and B, respectively. The maximally entangled state $\Phi _{AB}:=\vert \Phi \rangle \langle \Phi \vert _{AB}$ on AB for $d:=d_A=d_B$ is not separable and defined via

$$\begin{aligned} \vert \Phi \rangle _{AB}:=\frac{1}{\sqrt{d}}\sum _{x=1}^d\vert x\rangle _A\otimes \vert x\rangle _B \,\text {for some orthonormal basis}\, \{\vert x\rangle \}_{x=1}^d\, \text {of}\, {\mathbb {C}}^d. \end{aligned}$$

The swap operator $F_{AB}$ on $A\otimes B$ exchanges the two quantum systems, i.e., $F_{AB}(W_A\otimes W_B)=W_B\otimes W_A$ for every state $W_A$ and $W_B$ on A and B, respectively. Classical-quantum states are bipartite states that can be written in the form

$$\begin{aligned} W_{XB}=\sum _{x=1}^{d_X}p_x\vert x\rangle \langle x\vert _X\otimes W_B^x \end{aligned}$$

for a probability distribution $\{p_x\}_{x=1}^{d_X}$, an orthonormal basis $\{\vert x\rangle _X\}_{x=1}^{d_X}$ of ${\mathbb {C}}^{d_X}$, and states $W_B^x$ on B for $x=1,\dots ,d_X$. We refer to X as the classical part of the bipartite classical-quantum system XB.

Quantum channels (or in short channels) are linear maps ${\mathcal {N}}_{A\rightarrow B}:{\mathcal {S}}_A\rightarrow {\mathcal {S}}_B$ that are trace preserving and completely positive^{Footnote 4} (cp). In particular, they map states from the input system A to states on the output system B. We often abbreviate bipartite channels ${\mathcal {I}}_A\otimes {\mathcal {N}}_B$ that act trivially on the A-system as ${\mathcal {N}}_B$, where ${\mathcal {I}}_A$ denotes the identity channel on ${\mathcal {S}}_A$. The partial trace ${\mathrm {Tr}}_B[\cdot ]$ is a channel from AB to A defined via

$$\begin{aligned} {\mathrm {Tr}}_B[\cdot ]:=\sum _{x=1}^{d_B}\big (1_A\otimes \langle x\vert _B\big )(\cdot )\big (1_A\otimes \vert x\rangle _B\big ), \end{aligned}$$

where $1_A$ denotes the identity matrix on A, and $\{\vert x\rangle \}_{x=1}^{d_B}$ an orthonormal basis of ${\mathbb {C}}^{d_B}$. For bipartite states $W_{AB}$, we write $W_A={\mathrm {Tr}}_B[W_{AB}]$ for the reduced state on A. Quantum measurements (or in short measurements) are a special case of channels that can be written in the form

$$\begin{aligned} {\mathcal {M}}_{A\rightarrow I}(\cdot ):=\sum _{i\in I}{\mathrm {Tr}}\left[ M_A^i(\cdot )\right] \vert i\rangle \langle i\vert _I \end{aligned}$$

with an orthonormal basis $\{\vert i\rangle _I\}_{i\in I}$ and $M_A^i\succeq 0$ $\forall i\in I$ with $\sum _{i\in I}M_A^i=1_A$.
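As an illustration of the partial trace formula above, the following sketch implements ${\mathrm {Tr}}_B$ term by term and applies it to the maximally entangled state, whose reduced state is maximally mixed. The function name `partial_trace_B` and the dimension are choices made for this example.

```python
import numpy as np

def partial_trace_B(w_ab, d_a, d_b):
    """Tr_B[W] = sum_x (1_A kron <x|_B) W (1_A kron |x>_B)."""
    out = np.zeros((d_a, d_a), dtype=complex)
    for x in range(d_b):
        ket = np.zeros((d_b, 1))
        ket[x, 0] = 1.0
        v = np.kron(np.eye(d_a), ket)      # 1_A kron |x>, shape (d_a*d_b, d_a)
        out += v.conj().T @ w_ab @ v
    return out

d = 3
# Maximally entangled vector (1/sqrt(d)) sum_x |x> kron |x>.
phi = np.zeros(d * d)
for x in range(d):
    phi[x * d + x] = 1.0 / np.sqrt(d)
Phi = np.outer(phi, phi)                   # rank-one projector Phi_AB

reduced = partial_trace_B(Phi, d, d)       # equals 1_A / d

# On a product state the partial trace returns the first factor.
W_A = np.diag([0.5, 0.3, 0.2])
W_B = np.diag([0.1, 0.1, 0.8])
product_reduced = partial_trace_B(np.kron(W_A, W_B), d, d)
```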

The Choi-Jamiołkowski isomorphism relates channels with states. For a channel ${\mathcal {N}}_{A\rightarrow B}$, its Choi state is given by

$$\begin{aligned} J^{{\mathcal {N}}}_{BA'}:=({\mathcal {N}}_{A\rightarrow B}\otimes {\mathcal {I}}_{A'})(\Phi _{AA'}), \end{aligned}$$

(7)

where $d_{A'}:=d_A$. Note that $J^{{\mathcal {N}}}_{A'}=\frac{1_{A'}}{d_{A'}}$. Vice versa, for a bipartite state $W_{A'B}$ with $W_{A'}=\frac{1_{A'}}{d_{A'}}$, its Choi channel is given as

$$\begin{aligned} {\mathcal {N}}^W_{A\rightarrow B}:Z_A\mapsto d_A\cdot {\mathrm {Tr}}_{A'}\left[ W_{A'B}(Z_{A'}^T\otimes 1_B)\right] , \end{aligned}$$

(8)

where the transpose T is taken with respect to the orthonormal basis of the maximally entangled state in (7).
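The correspondence between (7) and (8) can be checked numerically. The sketch below builds the Choi state of a randomly generated channel, stored with the tensor factors ordered as $A'B$ to match (8), and confirms that the Choi channel reproduces the original map. The random-channel construction via a QR decomposition and all function names are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_channel_kraus(d_in, d_out, n_kraus):
    """Kraus operators with sum_k K_k^dagger K_k = 1 (requires n_kraus*d_out >= d_in)."""
    g = rng.normal(size=(n_kraus * d_out, d_in)) + 1j * rng.normal(size=(n_kraus * d_out, d_in))
    q, _ = np.linalg.qr(g)                  # q has orthonormal columns
    return [q[k * d_out:(k + 1) * d_out, :] for k in range(n_kraus)]

def channel(kraus, z):
    """N(Z) = sum_k K_k Z K_k^dagger."""
    return sum(k @ z @ k.conj().T for k in kraus)

def choi_state(kraus, d_in, d_out):
    """W_{A'B} = (1/d) sum_{x,y} |x><y|_{A'} kron N(|x><y|)."""
    w = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for x in range(d_in):
        for y in range(d_in):
            e_xy = np.zeros((d_in, d_in))
            e_xy[x, y] = 1.0
            w += np.kron(e_xy, channel(kraus, e_xy)) / d_in
    return w

def choi_channel(w, z, d_in, d_out):
    """Equation (8): Z -> d_A * Tr_{A'}[ W_{A'B} (Z^T kron 1_B) ]."""
    m = w @ np.kron(z.T, np.eye(d_out))
    m = m.reshape(d_in, d_out, d_in, d_out)
    return d_in * np.einsum('xixj->ij', m)  # trace over the A' indices

d_in, d_out = 2, 3
kraus = random_channel_kraus(d_in, d_out, n_kraus=2)
W = choi_state(kraus, d_in, d_out)

# Marginal on A' is maximally mixed because the channel is trace preserving.
marginal = np.einsum('ixjx->ij', W.reshape(d_in, d_out, d_in, d_out))

Z = rng.normal(size=(d_in, d_in)) + 1j * rng.normal(size=(d_in, d_in))
direct = channel(kraus, Z)
via_choi = choi_channel(W, Z, d_in, d_out)
```

Note that the transpose in `choi_channel` is a plain transpose (no complex conjugation), taken in the computational basis used to define the maximally entangled state.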

The distance between states is quantified by the metric induced from the Schatten one-norm $\Vert X\Vert _1:={\mathrm {Tr}}\left[ |X|\right] $. The distance between channels is quantified by the metric induced from the Diamond norm

$$\begin{aligned} \Vert {\mathcal {N}}_{A\rightarrow B}\Vert _{\Diamond }:=\sup _{\Vert X\Vert _1\le 1}\Vert ({\mathcal {N}}_{A\rightarrow B}\otimes {\mathcal {I}}_A)(X_{AA})\Vert _1. \end{aligned}$$

A multipartite state $W_{AB_1^n}$ on $AB_1^n\equiv AB_1\cdots B_n$ is called symmetric with respect to A if

$$\begin{aligned} ({\mathcal {I}}_A\otimes {\mathcal {U}}_{B_1^n}^\pi )(W_{AB_1^n})=W_{AB_1^n}\quad \forall \pi \in {\mathfrak {S}}_n, \end{aligned}$$

where ${\mathfrak {S}}_n$ denotes the symmetric group of n elements and

$$\begin{aligned} {\mathcal {U}}_{B_1^n}^\pi (W_{B_1}\otimes \cdots \otimes W_{B_n}):=W_{B_{\pi ^{-1}(1)}}\otimes \cdots \otimes W_{B_{\pi ^{-1}(n)}}. \end{aligned}$$

A bipartite state $W_{AB}$ is called n-extendable if there exists a multipartite extension $W_{AB_1^n}$, i.e., ${\mathrm {Tr}}_{B_2^n}(W_{AB_1^n})=W_{AB}$, that is symmetric with respect to A.

2.2 Previous work

General de Finetti representation theorems state that if a multipartite state on $AB_1^n$ is symmetric with respect to A, then the reduced state on the first k systems $AB_1^k$ is close to a separable mixture of independent and identical states for k sufficiently smaller than n. De Finetti [21] first proved for the classical case with A trivial, i.e. $A={\mathbb {C}}$, that if $n = \infty $ and k is finite, then the statement holds exactly. Quantitative finite versions for the classical case were later proven and the state-of-the-art bounds can be found in [22]. In the quantum setting, early works considered the $n=\infty $ setting including [28, 42, 61, 64, 69] in the mathematical physics community and [15] in the quantum information theory community. The first finite quantum de Finetti representation theorem was proved in [47]. The state-of-the-art bounds from [17, 46] show that for multipartite states $W_{AB_1^n}$ symmetric with respect to A, we have that

$$\begin{aligned} \left\| W_{AB_1^k}-\sum _{i\in I}p_iW_A^i\otimes \left( W_B^i\right) ^{\otimes k}\right\| _1\le \frac{2kd_B^2}{n}, \end{aligned}$$

for a probability distribution $\{p_i\}_{i\in I}$ and states $W^i_A,W_B^i$ on A and B, respectively. Note that the special case $k=1$ exactly recovers (4).

2.3 Proof methods

In the following, we provide a brief sketch of our proof ideas. For simplicity we restrict to $k=1$, which is the relevant case for (3). Namely, we start with a multipartite state $W_{AB_1^n}$ symmetric with respect to A and the goal is to identify constraints such that $W_{AB_1}$ is approximated by a mixture of states of the form

$$\begin{aligned} W^i_A \otimes W^i_B\,\text { with }\,\Lambda _{A\rightarrow C_A}\left( W_A^i\right) =X_{C_A}\,\text {and}\,\Gamma _{B\rightarrow C_B}\left( W_B^i\right) =Y_{C_B}. \end{aligned}$$

The standard approach for proving de Finetti theorems [17] proceeds by measuring the systems $B_1^n$ with the uniform measurement on the symmetric subspace given by $\left\{ \vert \psi \rangle \langle \psi \vert _B^{\otimes n} \right\} _{\psi }$. In this case, the candidate mixture of product states is given by

$$\begin{aligned} \int p\left( \psi \right) d \vert \psi \rangle W_{A | \psi } \otimes \vert \psi \rangle \langle \psi \vert _B\,, \end{aligned}$$

where the integral is computed with respect to the Haar measure, $p(\psi ) d\vert \psi \rangle $ denotes the probability of outcome $\psi $, and $W_{A|\psi }$ the state on A conditioned on obtaining outcome $\psi $ in the measurement. The problem with this candidate is that, in this mixture, there will in general be many terms where

$$\begin{aligned} \vert \psi \rangle \langle \psi \vert _B\text { is such that }\Gamma _{B\rightarrow C_B}\left( \vert \psi \rangle \langle \psi \vert _B\right) \ne Y_{C_B}. \end{aligned}$$

One could try to modify the measurement so that we only get $\vert \psi \rangle \langle \psi \vert _B$ that satisfy the desired constraint, but this seems difficult. Instead, we use an alternative approach, where the candidate mixture of product states is chosen differently [12, 47]. Namely, starting from $W_{AB_1^n}$ a well-chosen measurement on the systems $B_2^n$ with measurement outcomes $z_2^n$ leads to the candidate mixture of product states

$$\begin{aligned} \underset{z_2^n}{{\mathbf {E}}}\left\{ W_{A|z_2^n} \otimes W_{B|z_2^n} \right\} \,. \end{aligned}$$

The advantage of this candidate state is that by enforcing the right constraints on the global state $W_{AB_1^n}$, namely the ones in (5), we can ensure that $\Lambda _{A\rightarrow C_A}(W_{A|z_2^n})=X_{C_A}$ and $\Gamma _{B\rightarrow C_B}(W_{B|z_2^n})=Y_{C_B}$. Note that in order for this strategy to work properly, we need the chosen measurement to be informationally complete (that is, allowing one to estimate the expectation value of arbitrary observables) and to have a small distortion, in the sense that the loss in distinguishability resulting from applying the measurement is small.

2.4 Information-theoretic tools

The starting point for our proof technique is the use of the chain rule of the conditional mutual information, first used in this context in [11] and further exploited in [12]. More precisely, we will use the quantum relative entropy defined as

$$\begin{aligned} D(\rho \Vert \sigma ):= {\left\{ \begin{array}{ll} \mathrm{Tr}(\rho \log \rho ) - \mathrm{Tr}(\rho \log \sigma ) &{} \text{ if } support(\rho )\subseteq support(\sigma ) \\ \infty &{} \text{ otherwise } \end{array}\right. }, \end{aligned}$$

where $\rho $ and $\sigma $ are quantum states and the logarithm is taken to base two. Recall that the support of an operator is defined as the orthogonal complement of its kernel. The following lemma, which can be found in [12], says that if some classical systems $Z_1^n$ are symmetric with respect to A, then conditioning on $Z_1^m$ for some value of m breaks the correlations between A and $Z_{m+1}$. Before stating the lemma, we introduce notation that will be used throughout the section. For a state $\rho _{A Z}$ with a classical Z-system, we write

$$\begin{aligned} \rho _{A|z}:= \frac{{\mathrm {Tr}}_{Z}\Big [\rho _{AZ} \left( 1_{A} \otimes \vert z\rangle \langle z\vert \right) \Big ]}{{\mathrm {Tr}}\Big [\rho _{AZ} \left( 1_{A} \otimes \vert z\rangle \langle z\vert \right) \Big ]}. \end{aligned}$$

We simply write $\underset{z_{1}^{m}}{{\mathbf {E}}}\left\{ \cdot \right\} $ for the expectation over the choices of $z_1^m$ and the distribution will be clear from the context.

Lemma 2.1

[12] Let $\rho _{A Z_1^n}$ be a classical-quantum state with the $Z_1^n$-systems classical and ${\mathcal {U}}^\pi _{Z_1^n}(\rho _{A Z_1^n})=\rho _{A Z_1^n}$ for all $\pi \in {\mathfrak {S}}_n$. Then, there exists $0\le m < n$ such that

$$\begin{aligned} \underset{z_{1}^{m}}{{\mathbf {E}}}\left\{ D(\rho _{AZ_{m+1}|z_1^m} \Vert \rho _{A|z_1^m} \otimes \rho _{Z_{m+1}|z_1^m}) \right\}&\le \frac{\log d_A}{n} \end{aligned}$$

as well as

$$\begin{aligned} \underset{z_{1}^{m}}{{\mathbf {E}}}\left\{ \Vert \rho _{AZ_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{Z_{m+1}|z_1^m} \Vert _1^2 \right\}&\le \frac{(2 \ln 2) \log d_A}{n}, \end{aligned}$$

where $\ln (\cdot )$ denotes the natural logarithm.

Proof

For the quantum mutual information we have $I\left( A:Z_1^n\right) _{\rho }:=D(\rho _{AZ_1^n}\Vert \rho _A\otimes \rho _{Z_1^n}) \le \log d_A$ as well as (see, e.g., [57, Chapter 11])

$$\begin{aligned} I\left( A:Z_1^n\right) _{\rho }=\sum _{m=0}^{n-1} I(A : Z_{m+1} | Z_1^m)_{\rho } \end{aligned}$$

for the quantum conditional mutual information $I(A : Z_{m+1} | Z_1^m)_\rho :=I(A:Z_1^{m+1})_\rho -I(A:Z_1^m)_\rho $. As a result, there exists an $m \in \{0, \cdots , n-1\}$ such that $I(A : Z_{m+1} | Z_1^m)_{\rho } \le \frac{\log d_A}{n}$, which implies

$$\begin{aligned} \underset{z_1^m}{{\mathbf {E}}}\left\{ I(A:Z_{m+1})_{\rho _{AZ_{m+1}|z_1^m}} \right\} \le \frac{\log d_A}{n}, \end{aligned}$$

where we used $I(A : Z_{m+1} | Z_1^m)_{\rho }=\underset{z_1^m}{{\mathbf {E}}}\left\{ I(A:Z_{m+1})_{\rho _{AZ_{m+1}|z_1^m}} \right\} $, which holds since the conditioning systems are classical. The second statement then follows directly from Pinsker’s inequality $D(\rho \Vert \sigma )\ge \frac{1}{2\ln 2}\left\| \rho -\sigma \right\| _1^2$ [78, Theorem 5.38]. $\square $
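The averaging argument in this proof can be illustrated in a fully classical toy case, where A is also taken to be classical (an assumption made only to keep the example short). The sketch builds a permutation-symmetric distribution on $AZ_1Z_2Z_3$ as a mixture of i.i.d. distributions, computes the conditional mutual informations both as telescoping differences and as averages over $z_1^m$, and checks that the smallest term is at most $\log (d_A)/n$. All names and numerical values are hypothetical.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 1e-15]
    return float(-(p * np.log2(p)).sum())

def mutual_information(p_joint, axes_x, axes_y):
    """I(X:Y) = H(X) + H(Y) - H(XY) for a joint distribution array."""
    all_axes = set(range(p_joint.ndim))
    def marg(keep):
        drop = tuple(sorted(all_axes - set(keep)))
        return p_joint.sum(axis=drop) if drop else p_joint
    joint_axes = list(axes_x) + list(axes_y)
    return (entropy(marg(axes_x).ravel()) + entropy(marg(axes_y).ravel())
            - entropy(marg(joint_axes).ravel()))

def conditional_mi_by_averaging(p, m):
    """E_{z_1^m} of I(A:Z_{m+1}) evaluated on the state conditioned on z_1^m."""
    p_m = p.sum(axis=tuple(range(m + 2, p.ndim))) if m + 2 < p.ndim else p
    value = 0.0
    for z in itertools.product(range(p.shape[1]), repeat=m):
        block = p_m[(slice(None),) + z + (slice(None),)]   # indexed by (a, z_{m+1})
        weight = block.sum()
        if weight > 0:
            value += weight * mutual_information(block / weight, [0], [1])
    return value

n, d_A = 3, 2
q = np.array([0.5, 0.5])                       # mixing weights
pA = np.array([[0.9, 0.1], [0.2, 0.8]])        # distributions of A per component
pZ = np.array([[0.7, 0.3], [0.1, 0.9]])        # distributions of each Z per component
p = np.einsum('i,ia,ix,iy,iz->axyz', q, pA, pZ, pZ, pZ)

total = mutual_information(p, [0], [1, 2, 3])  # I(A : Z_1^n)
mi = [0.0] + [mutual_information(p, [0], list(range(1, m + 1))) for m in range(1, n + 1)]
cond = [mi[m + 1] - mi[m] for m in range(n)]   # I(A : Z_{m+1} | Z_1^m)
averaged = [conditional_mi_by_averaging(p, m) for m in range(n)]
```

Since the n conditional terms sum to at most $\log d_A$, at least one of them is at most $\log (d_A)/n$, which is the pigeonhole step used above.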

To prove the de Finetti theorem, we will crucially make use of so-called informationally complete measurements for which the loss in distinguishability, or distortion, can be bounded.

Lemma 2.2

[12, Lemma 14] There exists a product measurement ${\mathcal {M}}_A\otimes {\mathcal {M}}_B$ with finitely many outcomes such that for any Hermitian and traceless matrix $\xi _{AB}$ on $A\otimes B$, we have

$$\begin{aligned} \Vert ({\mathcal {M}}_{A} \otimes {\mathcal {M}}_{B})(\xi _{AB})\Vert _1 \ge \frac{1}{18\sqrt{d_A d_B}}\Vert \xi _{AB}\Vert _{1}. \end{aligned}$$

This [12, Lemma 14] follows from the methods in [51]. More generally, we define the minimal distortion for the bipartite system $A \otimes B$ as

$$\begin{aligned} f(A, B):=\inf _{{\mathcal {M}}_{A}, {\mathcal {M}}_{B}}\max _{\begin{array}{c} \xi _{AB}^{\dagger } = \xi _{AB}\\ \xi _A = 0, \xi _B = 0 \end{array}} \frac{\Vert \xi _{AB} \Vert _1}{\Vert ({\mathcal {M}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB})\Vert _1}, \end{aligned}$$

(9)

where the infimum is over all product measurements on AB. In this notation, Lemma 2.2 shows that

$$\begin{aligned} f(A, B) \le 18 \sqrt{d_A d_B}. \end{aligned}$$

Note that in the definition of f(A, B) we restricted the maximization to matrices satisfying $\xi _{A} = 0$ and $\xi _{B} = 0$ because this is sufficient for us.

A drawback of Lemma 2.2 is that the distortion depends on the dimension $d_A$. More generally, we define the minimal distortion with side information for a system B as

$$\begin{aligned} f(B|\cdot ):=\inf _{{\mathcal {M}}_B} \sup _{\begin{array}{c} \xi _{AB}^\dagger =\xi _{AB}\\ \xi _{A} = 0, \xi _B = 0 \end{array}} \frac{\Vert \xi _{AB} \Vert _1}{\Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB})\Vert _1}, \end{aligned}$$

(10)

where the infimum is over all measurements on B and the supremum is over all finite-dimensional systems A. In Lemma D.1 we give an elementary proof that

$$\begin{aligned} f(B|\cdot ) \le d_B^2 (d_B+1) \end{aligned}$$

using state two-designs and properties of weighted non-commutative $L_p$-spaces. In fact, after completion of our work we realised that methods from operator space theory even give the stronger bound

$$\begin{aligned} f(B|\cdot ) \le \sqrt{18d_B^3}, \end{aligned}$$

which is discussed in [13, Equation 66]. We leave it as an open question to determine the exact dimensional dependence of the minimal distortion with side information.

2.5 Main technical result

Combining the tools from the previous subsection we find the following de Finetti theorem with linear constraints.

Theorem 2.3

Let $\rho _{AB_1^n}$ be a quantum state, $\Lambda _{A\rightarrow C_A},\Gamma _{B\rightarrow C_B}$ linear maps, and $X_{C_A},Y_{C_B}$ matrices such that

$$\begin{aligned} {\mathcal {U}}_{B_1^n}^\pi (\rho _{AB_1^n})&=\rho _{AB_1^n}\;\forall \pi \in {\mathfrak {S}}_n\qquad \text {symmetric with respect to }A\\ \Lambda _{A\rightarrow C_A}(\rho _{AB_1^n})&=X_{C_A}\otimes \rho _{B_1^n}\qquad \quad \;\;\text {linear constraint on }A\\ \Gamma _{B_n\rightarrow C_B}(\rho _{B_1^n})&=\rho _{B_1^{n-1}}\otimes Y_{C_B}\qquad \quad \text {linear constraint on }B. \end{aligned}$$

Then, we have that

$$\begin{aligned} \left\| \rho _{AB}-\sum _{i\in I}p_i\sigma ^i_{A}\otimes \omega ^i_B\right\| _1\le&\;\min \big \{ f(A,B) , f(B|\cdot ) \big \}\sqrt{ \frac{(2 \ln 2) \log \left( d_A\right) }{n}} \end{aligned}$$

with $\{p_i\}_{i\in I}$ a probability distribution, $\rho _{AB}=\mathrm{Tr}_{B_2^n}\left[ \rho _{AB_1^n}\right] $, and quantum states $\sigma ^i_A,\omega _B^i$ such that for $i\in I$:

$$\begin{aligned} \Lambda _{A\rightarrow C_A}\left( \sigma ^i_A\right) =X_{C_A}\quad \text {and}\quad \Gamma _{B\rightarrow C_B}\left( \omega ^i_B\right) =Y_{C_B}. \end{aligned}$$

As stated in Sect. 2.4, we can, e.g., take $f(A,B)\le 18\sqrt{d_A d_B}$ or $f(B|\cdot )\le \sqrt{18d_B^3}$.
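To get a feeling for the scaling in Theorem 2.3, the right-hand side can be evaluated with the two explicit distortion bounds quoted above. The following sketch is only an arithmetic illustration; the function names are hypothetical, and the bound is informative only once it drops below the trivial value 2 for the trace distance of two states.

```python
import math

def de_finetti_bound(d_A, d_B, n):
    """min{18 sqrt(d_A d_B), sqrt(18 d_B^3)} * sqrt((2 ln 2) log2(d_A) / n)."""
    f_AB = 18.0 * math.sqrt(d_A * d_B)       # bound on f(A,B) from Lemma 2.2
    f_B = math.sqrt(18.0 * d_B ** 3)         # bound on f(B|.) quoted from [13]
    return min(f_AB, f_B) * math.sqrt(2.0 * math.log(2.0) * math.log2(d_A) / n)

def levels_needed(d_A, d_B, eps):
    """Smallest n for which the bound above is at most eps."""
    c = de_finetti_bound(d_A, d_B, 1)
    return math.ceil((c / eps) ** 2)

bound_qubits_100 = de_finetti_bound(2, 2, 100)
bound_qubits_400 = de_finetti_bound(2, 2, 400)
n_for_tenth = levels_needed(2, 2, 0.1)
```

Quadrupling n halves the bound, reflecting the $1/\sqrt{n}$ rate. The bound via $f(B|\cdot )$ is the smaller of the two whenever $d_B^2<18\,d_A$.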

Proof

Let ${\mathcal {M}}_{B}$ be a measurement of the B system and call the outcome system Z. Consider the state $\rho _{AZ_1^n}$ obtained by measuring all the B systems with ${\mathcal {M}}_{B}$. This distribution is symmetric with respect to A and so we can apply Lemma 2.1. We find that there exists an $m \in \{0, \cdots , n-1\}$ such that

$$\begin{aligned} \underset{z_1^m}{{\mathbf {E}}}\left\{ \Vert \rho _{AZ_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{Z_{m+1}|z_1^m} \Vert _1^2 \right\} \le \frac{(2 \ln 2) \log d_{A}}{n}. \end{aligned}$$

Note that for any $z_1^m$ we have $\rho _{AZ_{m+1}|z_1^m} = ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_{B})(\rho _{AB_{m+1}|z_1^m})$ and correspondingly $\rho _{Z_{m+1}|z_1^m} = {\mathcal {M}}_{B}(\rho _{B_{m+1}|z_1^m})$. Now, choosing the measurement ${\mathcal {M}}_{B}$ as in Lemma D.1 and achieving $f(B|\cdot )$ in (10), we get that $\Vert \xi _{AB} \Vert _1^2 \le f(B|\cdot )^2 \Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_{B})(\xi _{AB}) \Vert _1^2$, where $\xi _{AB} = \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m}$ is Hermitian with $\xi _A=0$ and $\xi _B=0$, as required in (10). As a result, we have

$$\begin{aligned} \underset{z_1^m}{{\mathbf {E}}}\left\{ \Vert \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \Vert _1^2 \right\} \le f(B|\cdot )^2\frac{(2 \ln 2) \log d_{A}}{n}. \end{aligned}$$

But note we can also choose measurements ${\mathcal {M}}_{A}$ and ${\mathcal {M}}_{B}$ achieving f(A, B) in (9). In this case,

$$\begin{aligned}&\Vert \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \Vert _1^2 \\&\le f(A,B)^2\Vert ({\mathcal {M}}_A \otimes {\mathcal {M}}_{B})(\rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m}) \Vert _1^2 \\&\le f(A,B)^2 \Vert ({\mathcal {I}}_A \otimes {\mathcal {M}}_{B})(\rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m}) \Vert _1^2 \\&= f(A,B)^2 \Vert \rho _{AZ_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{Z_{m+1}|z_1^m} \Vert _1^2, \end{aligned}$$

where we used the fact that the trace norm cannot increase when applying the quantum channel ${\mathcal {M}}_{A}$ [78, Theorem 3.39]. As a result, we get

$$\begin{aligned} \underset{z_1^m}{{\mathbf {E}}}\left\{ \Vert \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \Vert _1^2 \right\} \le f(A,B)^2 \frac{(2 \ln 2) \log d_{A}}{n}. \end{aligned}$$

Now, using the convexity of the square function, we get

$$\begin{aligned}&\underset{z_{1}^m}{{\mathbf {E}}}\left\{ \Vert \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \Vert _1 \right\} \\&\quad \le \sqrt{\underset{z_{1}^m}{{\mathbf {E}}}\left\{ \Vert \rho _{AB_{m+1}|z_1^m} - \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \Vert ^2_1 \right\} } \\&\quad \le \min \big \{f(A, B), f(B|\cdot )\big \}\sqrt{\frac{(2 \ln 2) \log d_{A}}{n}}. \end{aligned}$$

Then, using the convexity of the norm and the fact that $\underset{z_1^m}{{\mathbf {E}}}\left\{ \rho _{AB_{m+1}|z_1^m} \right\} = \rho _{AB_{m+1}}$, we obtain

$$\begin{aligned} \left\| \rho _{AB_{m+1}} - \underset{z_1^m}{{\mathbf {E}}}\left\{ \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \right\} \right\| _1&\le \min \big \{f(A,B), f(B|\cdot )\big \}\sqrt{\frac{(2 \ln 2) \log d_{A}}{n}}. \end{aligned}$$

The state $\underset{z_1^m}{{\mathbf {E}}}\left\{ \rho _{A|z_1^m} \otimes \rho _{B_{m+1}|z_1^m} \right\} $ corresponds to our candidate mixture of product states. It now remains to show that all the states in the mixture satisfy the linear constraints. Indeed we have for any $z_1^m$, writing $M_{z}$ for matrices of the measurement ${\mathcal {M}}_{B}$,

$$\begin{aligned} \Lambda _{A \rightarrow C_{A}}(\rho _{A|z_1^m})&= \frac{\mathrm{Tr}_{B_1^m}\Big [(1_{A}\otimes M_{z_1} \otimes \cdots \otimes M_{z_m}) \Lambda _{A \rightarrow C_{A}}(\rho _{A B_1^m})\Big ]}{\mathrm{Tr}\Big [(1_{A}\otimes M_{z_1} \otimes \cdots \otimes M_{z_m})\rho _{A B_1^m}\Big ]} \\&= \frac{\mathrm{Tr}_{B_1^m}\Big [(1_{A}\otimes M_{z_1} \otimes \cdots \otimes M_{z_m}) (X_{C_A} \otimes \rho _{B_1^m}) \Big ]}{\mathrm{Tr}\Big [(1_{A}\otimes M_{z_1} \otimes \cdots \otimes M_{z_m})\rho _{A B_1^m}\Big ]} \\&= X_{C_A}, \end{aligned}$$

and similarly

$$\begin{aligned} \Gamma _{B \rightarrow C_{B}}(\rho _{B_{m+1}|z_1^m})&= \frac{\mathrm{Tr}_{B_1^m}\Big [(M_{z_1} \otimes \cdots \otimes M_{z_m} \otimes 1_{C_{B}} ) \Gamma _{B_{m+1} \rightarrow C_{B}}(\rho _{B_1^{m+1}})\Big ]}{\mathrm{Tr}\Big [(M_{z_1} \otimes \cdots \otimes M_{z_m} \otimes 1_{B_{m+1}} )\rho _{B_1^{m+1}}\Big ]} \\&= \frac{\mathrm{Tr}_{B_1^m}\Big [(M_{z_1} \otimes \cdots \otimes M_{z_m} \otimes 1_{B_{m+1}} ) (\rho _{B_1 \cdots B_{m}} \otimes Y_{C_B})\Big ]}{\mathrm{Tr}\Big [(M_{z_1} \otimes \cdots \otimes M_{z_m} \otimes 1_{B_{m+1}} )\rho _{B_1^{m+1}}\Big ]} \\&= Y_{C_B}. \end{aligned}$$

$\square $

This can then be extended to a full quantum de Finetti theorem for any reduced state $\rho _{AB_1^{k}}$ with $0<k<n$.

Theorem 2.4

For the same setting as in Theorem 2.3, we have for $0<k<n$ that

$$\begin{aligned} \left\| \rho _{AB_1^{k} }-\sum _{i\in I}p_i\sigma ^i_{A}\otimes \left( \omega ^i_B\right) ^{\otimes k}\right\| _1\le kf(B| \cdot )\sqrt{ (2 \ln 2) \frac{ \log d_{A} + (k - 1) \log d_B }{n-k+1}}. \end{aligned}$$
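To get a feel for the scaling in Theorem 2.4, the right-hand side can be evaluated numerically. The following is a minimal sketch and not part of the original analysis: the function names are hypothetical, the logarithm is assumed to be base 2 (consistent with the factor $2 \ln 2$ from Pinsker's inequality), and the value of $f(B|\cdot )$ has to be supplied by the caller.

```python
import math


def de_finetti_bound(n: int, k: int, d_A: int, d_B: int, f_B: float) -> float:
    """Right-hand side of Theorem 2.4 (logarithms assumed to be base 2)."""
    if not 0 < k < n:
        raise ValueError("Theorem 2.4 requires 0 < k < n")
    log_dim = math.log2(d_A) + (k - 1) * math.log2(d_B)
    return k * f_B * math.sqrt(2 * math.log(2) * log_dim / (n - k + 1))


def level_needed(eps: float, k: int, d_A: int, d_B: int, f_B: float) -> int:
    """Smallest n > k for which the bound above is at most eps."""
    log_dim = math.log2(d_A) + (k - 1) * math.log2(d_B)
    # Solve k * f_B * sqrt(2 ln2 * log_dim / (n - k + 1)) <= eps for n - k + 1.
    m = 2 * math.log(2) * log_dim * (k * f_B / eps) ** 2
    return max(k + 1, math.ceil(m) + k - 1)
```

For qubits with $k=2$ and the illustrative value $f(B|\cdot )=12$, calling `level_needed(0.1, 2, 2, 2, 12.0)` returns a level on the order of $10^5$, which illustrates how slowly the worst-case bound decays.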

Proof

Note that for the state $\rho _{AB_1^{k-1} B_{k}^n}$, the systems $B_{k}^n$ are symmetric with respect to $AB_1^{k-1}$. As such, we can apply the same argument used in the proof of Theorem 2.3, but this time starting from the decomposition $I\left( AB_1^{k-1}:Z_k^n\right) _{\rho }=\sum _{m=k-1}^{n-1} I(AB_1^{k-1} : Z_{m+1} | Z_k^m)_{\rho }$, leading to

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=k}^n \underset{z_{k+1}^m}{{\mathbf {E}}}\left\{ \Vert \rho _{A B_1^k | z_{k+1}^m} - \rho _{A B_1^{k-1}|z_{k+1}^{m}} \otimes \rho _{B_k | z_{k+1}^m} \Vert _1 \right\} \\&\le f(B|\cdot )\sqrt{\frac{(2 \ln 2) \log (d_{A} d_B^{k-1})}{n-k+1}}. \end{aligned}$$

Similarly, for any $i \in \{1,\dots ,k\}$, we have

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=k}^n \underset{z_{k+1}^m}{{\mathbf {E}}}\left\{ \Vert \rho _{A B_1^i | z_{k+1}^m} - \rho _{A B_1^{i-1}|z_{k+1}^{m}} \otimes \rho _{B_{i} | z_{k+1}^m} \Vert _1 \right\} \nonumber \\&\le f(B| \cdot )\sqrt{\frac{(2 \ln 2) \log (d_{A} d_B^{i-1})}{n-k+1}}. \end{aligned}$$

(11)

Now, using the triangle inequality $k-1$ times, we get for any $m \in \{k, \dots , n\}$ and any $z_{k+1}^m$ that

$$\begin{aligned}&\left\| \rho _{A B_1^k| z_{k+1}^m} - \rho _{A |z_{k+1}^{m}} \otimes \rho _{B_1|z_{k+1}^m} \otimes \cdots \otimes \rho _{B_k |z_{k+1}^{m}}\right\| _1 \\&\quad \le \sum _{i=1}^k \Big \Vert \rho _{A B_1^i| z_{k+1}^m} \otimes \rho _{B_{i+1}|z_{k+1}^{m}} \otimes \cdots \otimes \\&\qquad \quad \rho _{B_k|z_{k+1}^{m}}- \rho _{A B_1^{i-1}|z_{k+1}^{m}} \otimes \rho _{B_{i} | z_{k+1}^m} \otimes \rho _{B_{i+1}|z_{k+1}^{m}} \otimes \cdots \otimes \rho _{B_k|z_{k+1}^{m}}\Big \Vert _1 \\&\quad = \sum _{i=1}^k\left\| \rho _{A B_1^i | z_{k+1}^m} - \rho _{A B_1^{i-1}|z_{k+1}^{m}} \otimes \rho _{B_{i} | z_{k+1}^m}\right\| _1. \end{aligned}$$

Taking the average over m and $z_{k+1}^m$ and using (11), we get

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=k}^n \underset{z_{k+1}^m}{{\mathbf {E}}}\left\{ \left\| \rho _{A B_1^k| z_{k+1}^m} - \rho _{A |z_{k+1}^{m}} \otimes \rho _{B_1|z_{k+1}^m} \otimes \cdots \otimes \rho _{B_k |z_{k+1}^{m}}\right\| _1 \right\} \\&\quad \le k f(B|\cdot )\sqrt{\frac{(2 \ln 2) \log (d_{A} d_B^{k-1})}{n-k+1}}. \end{aligned}$$

As a result, there is an m for which the corresponding term of the average over m satisfies this bound. Then, as before, we use the convexity of the norm to put the expectation inside, obtaining an m such that

$$\begin{aligned}&\left\| \rho _{A B_1^k} - \underset{z_{k+1}^m}{{\mathbf {E}}}\left\{ \rho _{A |z_{k+1}^{m}} \otimes \rho _{B_1|z_{k+1}^m} \otimes \cdots \otimes \rho _{B_k |z_{k+1}^{m}} \right\} \right\| _1\\&\quad \le kf(B|\cdot )\sqrt{ (2 \ln 2) \frac{ \log d_{A} + (k - 1) \log d_B }{n-k+1}}. \end{aligned}$$

To conclude, it suffices to observe that by symmetry $\rho _{B_i|z_{k+1}^m} = \rho _{B_{1}|z_{k+1}^m}$ for all $i \in \{1,\dots ,k\}$ and the linear constraints are satisfied by the same calculation as in the proof of Theorem 2.3. $\square $

2.6 De Finetti theorems without symmetries

These results can again be strengthened to a form studied in [12], where $\rho _{AB_1^n}$ is not assumed to be symmetric but rather the systems that are kept are chosen at random. More precisely, we improve the so-called de Finetti theorem without symmetries of [12, Section 3] by reducing the dependence from $d_B^{k/2}$ to polynomial in both $d_B$ and k, thereby solving one of the problems [12, Section 9] had left open.

Theorem 2.5

Let $\rho _{B_1^n}$ be a quantum state with the systems $B_i$ all having dimension $d_B$. Furthermore, let the entries of $\vec {i}=(i_1, \dots , i_k)$, $\vec {j}=(j_1, \dots , j_{n-k})$ be a random permutation of $\{1, \dots , n\}$, and assume we measure the systems $j_1,\dots ,j_{n-k}$ each using the measurement ${\mathcal {M}}_B$, getting the classical systems $Z_{j_1}, \dots , Z_{j_{n-k}}$. Then, there exists $m \in \{0, \dots , n - k\}$ such that

$$\begin{aligned}&\underset{\vec {i}, \vec {j}, z_{j_1}, \dots , z_{j_m} }{{\mathbf {E}}}\left\{ \left\| \rho _{B_{\vec {i}}|z_{j_1} \cdots z_{j_m}} - \rho _{B_{i_1}|z_{j_1} \cdots z_{j_m}} \otimes \cdots \otimes \rho _{B_{i_k}|z_{j_1} \cdots z_{j_m}} \right\| _1 \right\} \\&\quad \le kf(B|\cdot ) \sqrt{ (2 \ln 2) \frac{(k-1)\log d_B }{n-k+1}} \\&\quad \le \frac{3k^{3/2}d_B^3 \log d_B}{\sqrt{n-k+1}}, \end{aligned}$$

where $f(B|\cdot )$ is defined in (10).
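The second inequality in Theorem 2.5 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes base-2 logarithms, replaces $f(B|\cdot )$ by the value $d_B^2(d_B+1)$ that the proof quotes for the measurement of Lemma D.1, and scans an arbitrary small grid of parameters. The function names are hypothetical.

```python
import math


def measured_bound(n: int, k: int, d_B: int, f_B: float) -> float:
    """First upper bound in Theorem 2.5 (logarithm assumed base 2)."""
    return k * f_B * math.sqrt(2 * math.log(2) * (k - 1) * math.log2(d_B) / (n - k + 1))


def polynomial_bound(n: int, k: int, d_B: int) -> float:
    """Second, fully explicit upper bound in Theorem 2.5."""
    return 3 * k**1.5 * d_B**3 * math.log2(d_B) / math.sqrt(n - k + 1)


# Scan a small, arbitrary grid and collect any parameter triple where the
# explicit bound would fall below the first one.
violations = [
    (n, k, d)
    for d in range(2, 8)
    for k in range(1, 6)
    for n in range(k + 1, k + 20)
    if measured_bound(n, k, d, d**2 * (d + 1)) > polynomial_bound(n, k, d)
]
```

On this grid the list `violations` stays empty, as expected from the theorem. The check is a plausibility test, not a proof.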

To compare with the usual de Finetti theorems with symmetry, one can take the expectation inside the trace norm (by convexity), which can then be understood as enforcing the permutation invariance of the state.

Proof

For fixed $\vec {i}, \vec {j}$, $m \in \{0, \dots , n-k\}$, and $z_{j_1} \cdots z_{j_m}$, we have using the triangle inequality $k-1$ times,

$$\begin{aligned}&\Vert \rho _{B_{i_1} \cdots B_{i_k} | z_{j_1} \cdots z_{j_m} } - \rho _{B_{i_1}|z_{j_1} \cdots z_{j_m} } \otimes \cdots \otimes \rho _{B_{i_k} | z_{j_1} \cdots z_{j_m} } \Vert _1 \nonumber \\&\quad \le \sum _{t=1}^k \Big \Vert \rho _{B_{i_1 \cdots i_t} | z_{j_1} \cdots z_{j_m} } \otimes \rho _{B_{i_{t+1}}| z_{j_1} \cdots z_{j_m} } \otimes \cdots \otimes \rho _{B_{i_k}| z_{j_1} \cdots z_{j_m} }\nonumber \\&\qquad \quad - \rho _{B_{i_1 \cdots i_{t-1}} | z_{j_1} \cdots z_{j_m} } \otimes \rho _{B_{i_{t}}| z_{j_1} \cdots z_{j_m}} \otimes \cdots \otimes \rho _{B_{i_k}| z_{j_1} \cdots z_{j_m}} \Big \Vert _1 \nonumber \\&\quad = \sum _{t=1}^k \left\| \rho _{B_{i_1 \cdots i_t} | z_{j_1} \cdots z_{j_m} } - \rho _{B_{i_1 \cdots i_{t-1}} | z_{j_1} \cdots z_{j_m} } \otimes \rho _{B_{i_{t}} | z_{j_1} \cdots z_{j_m} }\right\| _1. \end{aligned}$$

(12)

Now, consider a fixed t and fixed values for $i_1, \dots , i_{t-1}$, and assume we additionally measure the system $B_{i_t}$ using the measurement ${\mathcal {M}}_B$, getting the classical system $Z_{i_t}$. Then, for fixed $i_1, \dots , i_{t-1}$, the resulting distributions on $(i_t,j_1)$ and $(j_1,i_t)$ are identical, and the same holds for $(i_t,j_1,j_2)$ and $(j_2,j_1,i_t)$, and so on. Hence, we find by elementary entropic identities that

$$\begin{aligned}&\underset{i_t, \vec {j}}{{\mathbf {E}}}\left\{ I(B_{i_1} \cdots B_{i_{t-1}} : Z_{i_t} Z_{j_1} \cdots Z_{j_{n-k}})_{\rho } \right\} \\&\quad =\underset{i_t, \vec {j}}{{\mathbf {E}}}\left\{ I(B_{i_1} \cdots B_{i_{t-1}} : Z_{i_t})_\rho +\sum _{m=1}^{n-k} I(B_{i_1} \cdots B_{i_{t-1}} : Z_{j_m}| Z_{j_1} \cdots Z_{j_{m-1}}Z_{i_t})_{\rho } \right\} \\&\quad = \sum _{m=0}^{n-k} \underset{i_t, \vec {j}}{{\mathbf {E}}}\left\{ I(B_{i_1} \cdots B_{i_{t-1}} : Z_{i_t}| Z_{j_1} \cdots Z_{j_m})_{\rho } \right\} \\&\quad = \sum _{m=0}^{n-k} \underset{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m} }{{\mathbf {E}}}\left\{ I(B_{i_1} \cdots B_{i_{t-1}} : Z_{i_t})_{\rho _{|z_{j_1} \cdots z_{j_m}}} \right\} . \end{aligned}$$
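The entropic identity used here is the chain rule for mutual information. As a hedged illustration, the sketch below checks the simplest instance, $I(X:Z_1Z_2) = I(X:Z_1) + I(X:Z_2|Z_1)$, on a random classical distribution of three bits. Treating all systems as classical is a simplifying assumption made only for this toy example (in the text the $B$ systems are quantum), and all names are hypothetical.

```python
import math
import random
from itertools import product


def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def marginal(p, keep):
    """Marginal of a joint distribution on the coordinates listed in `keep`."""
    out = {}
    for outcome, q in p.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + q
    return out


def mutual_information(p, a, b, c=()):
    """I(A:B|C) = H(AC) + H(BC) - H(ABC) - H(C); a, b, c are index tuples."""
    h = lambda idx: entropy(marginal(p, idx))
    return h(a + c) + h(b + c) - h(a + b + c) - h(c)


# Random joint distribution of three bits (X, Z1, Z2).
random.seed(0)
weights = {xyz: random.random() for xyz in product((0, 1), repeat=3)}
total = sum(weights.values())
p = {xyz: w / total for xyz, w in weights.items()}

lhs = mutual_information(p, (0,), (1, 2))                      # I(X : Z1 Z2)
rhs = mutual_information(p, (0,), (1,)) + mutual_information(p, (0,), (2,), (1,))
```

Here `lhs` and `rhs` agree up to floating-point error. Iterating the same identity over $n-k$ measured systems gives the sum over $m$ used in the proof.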

Note on the other hand that we have $I(B_{i_1} \cdots B_{i_{t-1}} : Z_{i_t} Z_{j_1} \cdots Z_{j_{n-k}} ) \le \log (d_{B}^{t-1})$ and thus we get, using Pinsker’s inequality,

$$\begin{aligned}&\frac{1}{n-k\!+\!1} \sum _{m=0}^{n-k} {\mathop {{\mathbf {E}}}\limits _{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m}}} \{ \Vert \rho _{B_{i_1 \cdots i_{t-1}} Z_{i_t} |z_{j_1} \!\cdots \! z_{j_m}} \!-\! \rho _{B_{i_1 \cdots i_{t-1}} \!|z_{j_1}\! \cdots \! z_{j_m}} \!\otimes \! \rho _{Z_{i_{t}}|z_{j_1} \!\cdots \! z_{j_m}} \Vert _1^2\}\\&\quad \!\le \! (2 \ln 2) \frac{\log d_B^{t-1}}{n-k+1}. \end{aligned}$$
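Pinsker's inequality, in the form used above, reads $\Vert P_{XY} - P_X \otimes P_Y \Vert _1^2 \le (2 \ln 2)\, I(X:Y)$ when the mutual information is measured in bits. The following sketch checks this on random classical joint distributions. It is a toy illustration with hypothetical names, and the classical restriction is an assumption of the example, not of the proof.

```python
import math
import random


def pinsker_sides(p):
    """Return (||P_XY - P_X x P_Y||_1^2, (2 ln 2) I(X:Y)) for a joint table p[x][y]."""
    px = [sum(row) for row in p]
    py = [sum(col) for col in zip(*p)]
    cells = [(x, y) for x in range(len(px)) for y in range(len(py))]
    l1 = sum(abs(p[x][y] - px[x] * py[y]) for x, y in cells)
    mi = sum(
        p[x][y] * math.log2(p[x][y] / (px[x] * py[y]))
        for x, y in cells
        if p[x][y] > 0
    )
    return l1**2, 2 * math.log(2) * mi


def random_joint(dx, dy, rng):
    """Random joint distribution on a dx-by-dy alphabet."""
    w = [[rng.random() for _ in range(dy)] for _ in range(dx)]
    s = sum(map(sum, w))
    return [[v / s for v in row] for row in w]


rng = random.Random(1)
checks = [pinsker_sides(random_joint(3, 4, rng)) for _ in range(100)]
```

For two perfectly correlated uniform bits the left-hand side equals 1 and the right-hand side equals $2 \ln 2 \approx 1.39$, so the inequality is not tight in that case.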

Observe that $\rho _{B_{i_1 \cdots i_{t-1}} Z_{i_t} |z_{j_1} \cdots z_{j_m}} = {\mathcal {M}}_{B_{i_t}}(\rho _{B_{i_1 \cdots i_{t-1}} B_{i_t} |z_{j_1} \cdots z_{j_m}})$ and using a measurement ${\mathcal {M}}_{B}$ achieving $f(B|\cdot )$ in (10) (or using the measurement in Lemma D.1, in which case we should replace $f(B|\cdot )$ by $d_B^2(d_B+1)$ in the following equations), we get that

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=0}^{n-k} {\mathop {{\mathbf {E}}}\limits _{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m}}} \{\Vert \rho _{B_{i_1 \cdots i_{t-1}} B_{i_t} | z_{j_1} \cdots z_{j_m} }\\&\quad - \rho _{B_{i_1 \cdots i_{t-1}} | z_{j_1} \cdots z_{j_m}} \otimes \rho _{B_{i_{t}}|z_{j_1} \cdots z_{j_m} } \Vert _1^2 \} \le (2 \ln 2) f(B|\cdot )^2 \frac{\log d_B^{t-1}}{n-k+1}. \end{aligned}$$

This implies, using the convexity of the square function, that

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=0}^{n-k} {\mathop {{\mathbf {E}}}\limits _{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m}}} \{\Vert \rho _{B_{i_1 \cdots i_{t-1}} B_{i_t} | z_{j_1} \cdots z_{j_m} }\\&\quad - \rho _{B_{i_1 \cdots i_{t-1}} | z_{j_1} \cdots z_{j_m}} \otimes \rho _{B_{i_{t}}|z_{j_1} \cdots z_{j_m} } \Vert _1\}\le f(B|\cdot )\sqrt{(2 \ln 2) \frac{\log d_B^{t-1}}{n-k+1}}, \end{aligned}$$

and we get, continuing on (12), that

$$\begin{aligned}&\frac{1}{n-k+1} \sum _{m=0}^{n-k} {\mathop {{\mathbf {E}}}\limits _{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m}}} \{\Vert \rho _{B_{i_1} \cdots B_{i_k} | z_{j_1} \cdots z_{j_m} } \\&\quad - \rho _{B_{i_1}|z_{j_1} \cdots z_{j_m} } \otimes \cdots \otimes \rho _{B_{i_k} | z_{j_1} \cdots z_{j_m} } \Vert _1\}\\&\quad \le \sum _{t=1}^k \frac{1}{n-k+1} \sum _{m=0}^{n-k} {\mathop {{\mathbf {E}}}\limits _{i_t, \vec {j}, z_{j_1}, \dots , z_{j_m}}} \{\Vert \rho _{B_{i_1 \cdots i_t} | z_{j_1} \cdots z_{j_m} } \\&\quad - \rho _{B_{i_1 \cdots i_{t-1}} | z_{j_1} \cdots z_{j_m} } \otimes \rho _{B_{i_{t}} | z_{j_1} \cdots z_{j_m}} \Vert _1\}\le kf(B|\cdot ) \sqrt{(2 \ln 2) \frac{\log d_B^{k-1}}{n-k+1}}. \end{aligned}$$

$\square $

3 Constrained bilinear optimization

As stated in (2), the constrained bilinear optimization problem we are interested in takes the form

$$\begin{aligned} Q:=\max&\quad {\mathrm {Tr}}\big [G_{AB} (W_{A} \otimes W_{B})\big ]\\ s.t.&\quad W_{A} \succeq 0, W_{B} \succeq 0, {\mathrm {Tr}}(W_{A}) = {\mathrm {Tr}}(W_{B}) = 1\\&\quad \Lambda _{A\rightarrow C_A}\left( W_{A}\right) =X_{C_A}, \, \Gamma _{B\rightarrow C_B}\left( W_{B}\right) =Y_{C_B}. \end{aligned}$$

Lower bounds on the optimal value can, e.g., be derived by means of seesaw methods [48] (see [79] for an example in quantum information theory). These then often converge in practice and sometimes even provably reach a local maximum. What was missing, however, is a general method to give an approximation guarantee to the global maximum.
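To make the seesaw idea concrete, here is a minimal sketch for a simplified variant of the problem. It assumes that $G_{AB}$ is Hermitian and drops the linear constraints involving $\Lambda $ and $\Gamma $, so that each half-step reduces to a top-eigenvector computation instead of an SDP. It is not the specific algorithm of [48] or [79], and the function name is hypothetical.

```python
import numpy as np


def seesaw_lower_bound(G, d_A, d_B, iters=50, seed=0):
    """Alternating maximization of Tr[G (W_A x W_B)] over density matrices.

    Simplifying assumption: the linear constraints of the problem are dropped,
    so each half-step is solved by a top eigenvector instead of an SDP.
    """
    rng = np.random.default_rng(seed)
    G4 = G.reshape(d_A, d_B, d_A, d_B)  # indices (a, b, a', b')
    v = rng.normal(size=d_B) + 1j * rng.normal(size=d_B)
    W_B = np.outer(v, v.conj()) / np.vdot(v, v).real
    values = []
    for _ in range(iters):
        # Optimize W_A for fixed W_B: effective operator Tr_B[G (1 x W_B)].
        G_A = np.einsum("abcd,db->ac", G4, W_B)
        u = np.linalg.eigh(G_A)[1][:, -1]
        W_A = np.outer(u, u.conj())
        # Optimize W_B for fixed W_A: effective operator Tr_A[G (W_A x 1)].
        G_B = np.einsum("abcd,ca->bd", G4, W_A)
        w, V = np.linalg.eigh(G_B)
        W_B = np.outer(V[:, -1], V[:, -1].conj())
        values.append(w[-1])  # current objective value
    return values, W_A, W_B
```

The recorded values are non-decreasing and never exceed the largest eigenvalue of $G_{AB}$, but nothing in the iteration certifies how close the final value is to the global maximum.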

Our de Finetti theorem with linear constraints (Theorem 2.3) gives an SDP hierarchy of outer bounds that provides exactly such a guarantee.

Theorem 3.1

For the SDPs

$$\begin{aligned} {\mathrm {SDP}}_n :=\max&\quad {\mathrm {Tr}}\big [G_{AB} W_{AB_1}\big ]\\ s.t.&\quad W_{AB_1^n}\succeq 0, {\mathrm {Tr}}(W_{AB_1^n}) = 1, \;W_{AB_1^n}={\mathcal {U}}_{B_1^n}^\pi \left( W_{AB_1^n}\right) \;\forall \pi \in {\mathfrak {S}}_n\\&\quad \Lambda _{A\rightarrow C_A}\left( W_{AB_1^n}\right) =X_{C_A}\otimes W_{B_1^n},\;\Gamma _{B_n\rightarrow C_B}\left( W_{B_1^n}\right) =W_{B_1^{n-1}}\otimes Y_{C_B} \end{aligned}$$

and Q defined as above, we have for $d:=\max \{d_A,d_B\}$ that

$$\begin{aligned} 0 \le {\mathrm {SDP}}_n - Q \le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}\quad \text {implying}\quad Q=\lim _{n\rightarrow \infty }{\mathrm {SDP}}_n. \end{aligned}$$

Proof

We have by construction $0 \le {\mathrm {SDP}}_n - Q $ and the remaining inequality arises from

$$\begin{aligned} {\mathrm {Tr}}\left[ G_{AB} W_{AB_1}\right]&={\mathrm {Tr}}\left[ G_{AB} (W_{A} \otimes W_{B})\right] +{\mathrm {Tr}}\left[ G_{AB}\left( W_{AB_1} - W_{A} \otimes W_{B}\right) \right] \\&\le {\mathrm {Tr}}\left[ G_{AB} (W_{A} \otimes W_{B})\right] +\Vert G_{AB} \Vert _\infty \cdot \Vert W_{AB_1} - W_{A} \otimes W_{B} \Vert _1 \\&\le {\mathrm {Tr}}\left[ G_{AB} (W_{A} \otimes W_{B})\right] +\frac{{\mathrm {poly}}(d)}{\sqrt{n}}, \end{aligned}$$

where we used Hölder’s inequality and the de Finetti argument as in Theorem 2.3.

$\square $
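The Hölder step in the proof, $|{\mathrm {Tr}}[G X]| \le \Vert G \Vert _\infty \cdot \Vert X \Vert _1$, can be illustrated numerically. In the sketch below the random matrices are arbitrary stand-ins for $G_{AB}$, $W_{AB_1}$ and $W_A \otimes W_B$, chosen only for illustration, and the helper names are hypothetical.

```python
import numpy as np


def random_state(d, rng):
    """Random density matrix (an arbitrary sampling choice for illustration)."""
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T
    return rho / np.trace(rho).real


def hoelder_sides(G, X):
    """Return (|Tr[G X]|, ||G||_inf * ||X||_1)."""
    lhs = abs(np.trace(G @ X))
    # ord=2 is the largest singular value, ord='nuc' the sum of singular values.
    rhs = np.linalg.norm(G, 2) * np.linalg.norm(X, "nuc")
    return lhs, rhs


rng = np.random.default_rng(0)
H = rng.normal(size=(4, 4))
G = (H + H.T) / 2                                    # Hermitian objective matrix
W_AB = random_state(4, rng)                          # stand-in for W_{AB_1}
W_prod = np.kron(random_state(2, rng), random_state(2, rng))
lhs, rhs = hoelder_sides(G, W_AB - W_prod)
```

Since the trace norm of a difference of two states is at most 2, the error term in the proof is controlled entirely by the de Finetti bound on $\Vert W_{AB_1} - W_A \otimes W_B \Vert _1$.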

The bounds from Theorem 2.3 give worst-case convergence guarantees that are slow: to ensure that the approximation error is small, we need at least the level $n={\mathrm {poly}}(d)$. However, note that constrained bilinear optimization contains as a special case the best separable state problem and so we cannot expect much better bounds on the convergence speed in general. We refer to [36] and the references therein for a detailed discussion about the computational complexity of the best separable state problem.

We can add positive partial transpose (PPT) constraints^{Footnote 5}

$$\begin{aligned} W_{AB_1^n}^{T_A}\succeq 0,\;W_{AB_1^n}^{T_{B_1}}\succeq 0,\;W_{AB_1^n}^{T_{B_1^2}}\succeq 0,\dots ,\;W_{AB_1^n}^{T_{B_1^{n-1}}}\succeq 0 \end{aligned}$$

to ${\mathrm {SDP}}_n$ and we denote the resulting relaxations by ${\mathrm {SDP}}_{n,{\mathrm {PPT}}}$. It is important to point out that any separable state is also a PPT state, and hence we still have a valid relaxation to the problem (2). It is an interesting question to study if these constraints can lead to a faster convergence speed, cf. the discussions in [27, 56]. Based on the PPT constraints, we can give a sufficient condition when already

$$\begin{aligned} {\mathrm {SDP}}_{n,{\mathrm {PPT}}}=Q\text { for some finite }n. \end{aligned}$$

The condition, known as the rank loop condition, is based on [56], which in turn builds on [39].

Lemma 3.2

[39, 56] Let $W_{AB_1^n}={\mathcal {U}}_{B_1^n}^\pi \left( W_{AB_1^n}\right) $ for all $\pi \in {\mathfrak {S}}_n$ and fixed $0\le k\le n$ such that $W_{AB_1^n}^{T_{B_{k+1}^n}}\succeq 0$. Then, $W_{AB_1}$ is separable if

$$\begin{aligned} {{\mathrm {rank}}}(W_{AB_1^n}) \le \max \left\{ {\mathrm {rank}}\left( W_{AB_1^k}\right) ,\,{\mathrm {rank}}\left( W_{B_{k+1}^n}\right) \right\} . \end{aligned}$$
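The hypotheses of Lemma 3.2 are straightforward to test on a candidate solution. The sketch below is a hedged illustration: the helper names are hypothetical, subsystem 0 is taken to be $A$ and subsystems $1,\dots ,n$ to be $B_1,\dots ,B_n$, and permutation invariance of the input is assumed rather than checked.

```python
import numpy as np


def partial_trace(rho, dims, keep):
    """Reduced state on the subsystems in `keep` (sorted indices into dims)."""
    t = rho.reshape(tuple(dims) * 2)
    # Trace out unwanted subsystems from the last to the first so that the
    # positions of the remaining axes stay predictable.
    for i in sorted(set(range(len(dims))) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)


def partial_transpose(rho, dims, systems):
    """Transpose the subsystems listed in `systems`."""
    n = len(dims)
    t = rho.reshape(tuple(dims) * 2)
    for i in systems:
        t = np.swapaxes(t, i, i + n)
    D = int(np.prod(dims))
    return t.reshape(D, D)


def rank_loop_certificate(W, d_A, d_B, n, k, tol=1e-9):
    """Check the PPT hypothesis and the rank inequality of Lemma 3.2."""
    dims = [d_A] + [d_B] * n
    head, tail = list(range(k + 1)), list(range(k + 1, n + 1))
    ppt = np.linalg.eigvalsh(partial_transpose(W, dims, tail)).min() >= -tol
    rank = lambda M: np.linalg.matrix_rank(M, tol=tol, hermitian=True)
    loop = rank(W) <= max(rank(partial_trace(W, dims, head)),
                          rank(partial_trace(W, dims, tail)))
    return bool(ppt and loop)


e0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
product_state = np.kron(np.outer(e0, e0), np.kron(np.outer(plus, plus), np.outer(plus, plus)))
mixed_state = np.eye(8) / 8
```

With $n=2$ and $k=1$, the pure product state passes the check, while the maximally mixed state does not, although it is separable. This reflects that the rank loop condition is sufficient but not necessary.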

Finally, note that instead of extending the B-systems we could equally well extend the A-systems to get another hierarchy. In the next section we directly study our main setting of interest—approximate quantum error correction—and refrain from further analysing the general case.

4 Approximate quantum error correction

4.1 Motivation

In order to introduce the problem, we describe its relevance and applications in quantum information theory. First, we introduce the theoretical setting, then we apply the results of the previous sections, thus obtaining specific convergent hierarchies. Corresponding numerical tests can be found in Appendix B.

Given a noisy classical channel $N_{X\rightarrow Y}$, a central quantity of interest in error correction is the maximum success probability p(N, M) for transmitting a uniform M-dimensional message under the noise model $N_{X\rightarrow Y}$. This is a bilinear maximization problem, which is in general NP-hard to approximate up to a sufficiently small constant factor [5]. Nevertheless, there are efficient methods for constructing feasible coding schemes approximating p(N, M) from below as well as an efficiently computable linear programming relaxation ${\mathrm {lp}}(N,M)$ (sometimes called meta converse [37, 63]) giving upper bounds on p(N, M).^{Footnote 6} In fact, it was shown in [5] that p(N, M) and ${\mathrm {lp}}(N,M)$ cannot be very far from each other

$$\begin{aligned} p(N,M)\le {\mathrm {lp}}(N,M)\le \frac{1}{1-\frac{1}{e}}\cdot p(N,M). \end{aligned}$$

Furthermore, the meta-converse has many appealing analytic properties, such as, e.g., the ability to evaluate it efficiently in the limit of many independent repetitions $N_{X\rightarrow Y}^{\times n}$, leading to very precise asymptotic bounds on the capacity of noisy classical channels [5].

The analogous quantum problem is to determine the maximum fidelity $F({\mathcal {N}},M)$, a quantity that will be formally defined later (Definition 4.1), for transmitting one part of a maximally entangled state of dimension M over a noisy quantum channel ${\mathcal {N}}_{A\rightarrow B}$. As in the classical case, this is a bilinear optimization problem, only now with matrix-valued variables. In order to approximate $F({\mathcal {N}},M)$, an efficiently computable semidefinite programming relaxation ${\mathrm {SDP}}({\mathcal {N}},M)$ was given in [53].^{Footnote 7} However, contrary to the classical case the gap between ${\mathrm {SDP}}({\mathcal {N}},M)$ and $F({\mathcal {N}},M)$ is not understood. On the other hand, the tools introduced in Sect. 2 will exactly be used to generate a converging hierarchy of efficiently computable semidefinite programming relaxations, allowing us to quantify the gap between these new relaxations and $F({\mathcal {N}},M)$.

Moreover, the relaxation ${\mathrm {SDP}}({\mathcal {N}},M)$ is lacking most of the analytic properties of its classical analogue ${\mathrm {lp}}(N,M)$. In fact, in quantum communication theory so-called non-additivity problems caused by quantum correlations make it notoriously hard to compute asymptotic limits in the first place [23]. Hence, we propose to use methods from optimization theory to directly study the maximum fidelity $F({\mathcal {N}},M)$ in order to quantify the ability of a quantum channel to transmit quantum information. The goal is then to identify a quantum version of the meta converse for approximating $F({\mathcal {N}},M)$, having similar properties as the classical meta converse ${\mathrm {lp}}(N,M)$ for approximating p(N, M). This approach can also be justified by the fact that most of the quantum devices that will be available in the near future are likely to be noisy and small in size. As such, efficient algorithms approximating $F({\mathcal {N}},M)$ for reasonable error models ${\mathcal {N}}$ and dimension M are more relevant in such settings than computing the asymptotic limit of the rate achievable for multiple copies of a given noise model.

Numerical lower bound methods for $F({\mathcal {N}},M)$ are available through iterative seesaw methods that lead to efficiently computable semidefinite programs [31, 32, 43, 49, 65, 66, 70]. These algorithms often converge in practice and sometimes even provably reach a local maximum. What was previously missing, however, is a general method to give an approximation guarantee to the global maximum. Here, the techniques as developed in Sect. 3 exactly lead to a converging hierarchy of efficiently computable semidefinite programming relaxations on the maximum fidelity $F({\mathcal {N}},M)$. As such, this can be seen as a tool for benchmarking existing quantum error correction codes and to understand in what direction to look for improved codes.

We note that references [45, 72, 74, 75] gave refined relaxations on the size of a maximally entangled state that can be sent over a noisy quantum channel for fixed fidelity $1-\varepsilon $. These approaches are complementary to our work and contrary to our findings they do not lead to a converging hierarchy of efficiently computable bounds.

4.2 Setting

The mathematical setting of approximate quantum error correction we study is as follows.

Definition 4.1

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M \in {\mathbb {N}}$. The channel fidelity for message dimension M is defined as

$$\begin{aligned} F({\mathcal {N}},M):=\max&\quad F\Big (\Phi _{{\bar{B}}R},\big (\left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\Big )\\ s.t.&\quad {\mathcal {D}}_{B\rightarrow {\bar{B}}},{\mathcal {E}}_{A\rightarrow {\bar{A}}}\;\text {quantum channels}, \end{aligned}$$

where $F(\rho ,\sigma ):=\left\| \sqrt{\rho }\sqrt{\sigma }\right\| _1^2$ denotes the fidelity, $\Phi _{ A R}$ denotes the maximally entangled state on AR, and we have $M=d_A=d_{{\bar{B}}}=d_R$.
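The fidelity in Definition 4.1 can be evaluated directly from its definition. The sketch below is a minimal illustration with hypothetical helper names: it computes the matrix square roots by eigendecomposition and the trace norm as a nuclear norm.

```python
import numpy as np


def psd_sqrt(rho):
    """Square root of a positive semidefinite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(rho)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T


def fidelity(rho, sigma):
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1^2."""
    return np.linalg.norm(psd_sqrt(rho) @ psd_sqrt(sigma), "nuc") ** 2


def max_entangled(d):
    """Density matrix of the maximally entangled state sum_i |ii> / sqrt(d)."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)


phi = max_entangled(2)
sigma = np.eye(4) / 4
```

When one argument is pure, as for $\Phi $ here, the fidelity reduces to the overlap ${\mathrm {Tr}}[\Phi \sigma ]$. For the maximally mixed two-qubit state this gives $1/4$.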

In information-theoretic language, the channel fidelity corresponds to an average error criterion for preserving uniformly distributed information. Alternatively, we might also aim for a worst-case error criterion; we discuss this in Appendix C.

By the Choi-Jamiołkowski isomorphism the channel fidelity is conveniently rewritten as a bilinear optimization.

Lemma 4.2

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, the channel fidelity can be written as

$$\begin{aligned} F({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] \\ s.t.&\quad E_{A{\bar{A}}}\succeq 0,\;D_{B{\bar{B}}}\succeq 0,\;E_A=\frac{1_A}{d_A},\;D_B=\frac{1_B}{d_B}, \end{aligned}$$

where $J^{\mathcal {N}}_{B{\bar{A}}}:=({\mathcal {N}}_{{\bar{A}} \rightarrow B}\otimes {\mathcal {I}}_{{\bar{A}}})(\Phi _{{\bar{A}}{\bar{A}}})$ denotes the Choi state of ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$.

The advantage of this notation is that all A-systems are with the sender (termed Alice) and all B-systems are with the receiver (termed Bob), which is consistent with [53].
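As a sanity check of the bilinear form in Lemma 4.2, the objective can be evaluated for fixed $E_{A{\bar{A}}}$ and $D_{B{\bar{B}}}$. The sketch below is an illustration with hypothetical function names. It assumes that every matrix is stored in the subsystem order of its own label (for example, $J^{\mathcal {N}}_{{\bar{A}}B}$ with ${\bar{A}}$ as first and $B$ as second tensor factor) and that all Choi states are normalized to unit trace.

```python
import numpy as np


def max_entangled(d):
    """Density matrix of the maximally entangled state sum_i |ii> / sqrt(d)."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)


def lemma_4_2_objective(J, E, D, d_A, d_Abar, d_B, d_Bbar):
    """d_Abar * d_B * Tr[(J_{Abar B} x Phi_{A Bbar}) (E_{A Abar} x D_{B Bbar})]."""
    if d_A != d_Bbar:
        raise ValueError("Lemma 4.2 requires d_A == d_Bbar (both equal M)")
    Phi = max_entangled(d_A).reshape(d_A, d_Bbar, d_A, d_Bbar)
    J4 = J.reshape(d_Abar, d_B, d_Abar, d_B)
    E4 = E.reshape(d_A, d_Abar, d_A, d_Abar)
    D4 = D.reshape(d_B, d_Bbar, d_B, d_Bbar)
    # Letters: a=A, x=Abar, b=B, y=Bbar are row indices; c, z, d, w are the
    # matching column indices. The contraction implements Tr[X Y].
    val = np.einsum("xbzd,aycw,czax,dwby->", J4, Phi, E4, D4)
    return (d_Abar * d_B * val).real


phi = max_entangled(2)
identity_channel = lemma_4_2_objective(phi, phi, phi, 2, 2, 2, 2)
depolarizing = lemma_4_2_objective(np.eye(4) / 4, phi, phi, 2, 2, 2, 2)
```

For the qubit identity channel with trivial encoder and decoder (all three Choi states equal to $\Phi $), the objective evaluates to 1. For the completely depolarizing channel it evaluates to $1/M^2 = 1/4$.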

Proof

By using the adjoint map with respect to the Hilbert-Schmidt inner product, applying the Choi-Jamiołkowski isomorphism as given in (7) and (8) multiple times, and noting that the purity of $\Phi _{{\bar{B}}R}$ allows us to use the simplified expression for the fidelity when one of the two arguments is pure [80, Section 9.2], we can write the objective function from Definition 4.1 as

$$\begin{aligned}&F\Big (\Phi _{{\bar{B}}R},\big (\left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\Big )\\&\quad =\mathrm{Tr}\left[ \Phi _{{\bar{B}}R}\big (\left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\right] \\&\quad = \mathrm{Tr}\left[ J^\mathcal {D^\dagger }_{BR} \left( {\mathcal {N}}_{{\bar{A}}\rightarrow B} \otimes {\mathcal {I}}_R\right) \left( J^{\mathcal {E}}_{{\bar{A}} R}\right) \right] . \end{aligned}$$

Taking advantage of $d_A=d_{{\bar{B}}}=d_R$, we relabel the systems and we proceed as follows

$$\begin{aligned}&F\Big (\Phi _{{\bar{B}}R},\big (\left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\Big )\\&\quad = \mathrm{Tr}\left[ J^\mathcal {D^\dagger }_{BR} \left( {\mathcal {N}}_{{\bar{A}}\rightarrow B} \otimes {\mathcal {I}}_R\right) \left( J^{\mathcal {E}}_{{\bar{A}} R}\right) \right] \\&\quad = \mathrm{Tr}\left[ J^\mathcal {D^\dagger }_{B{\bar{B}}} \left( {\mathcal {N}}_{{\bar{A}}\rightarrow B} \otimes {\mathcal {I}}_{A \rightarrow {\bar{B}}} \right) \left( J^{\mathcal {E}}_{{\bar{A}} A}\right) \right] \\&\quad = d_A d_{{\bar{A}}} \cdot \mathrm{Tr}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B} \otimes \Phi _{A{\bar{B}}} \right) \left( \left( J^{\mathcal {E}}_{A{\bar{A}}}\right) ^T \otimes J^\mathcal {D^\dagger }_{B{\bar{B}}} \right) \right] \\&\quad = d_{{\bar{A}}} d_B \cdot \mathrm{Tr}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B} \otimes \Phi _{A{\bar{B}}} \right) \left( \left( J^{\mathcal {E}}_{A{\bar{A}}} \right) ^T \otimes \frac{d_A}{d_B}\cdot J^\mathcal {D^\dagger }_{B{\bar{B}}} \right) \right] , \end{aligned}$$

where the transpose is taken with respect to the canonical basis, and the dimensional factors come from the connection between the Hilbert-Schmidt inner product and the maximally entangled state [81, Example 1.2]. Due to the basic properties of the Choi-Jamiołkowski isomorphism it is immediate to see that $(J^{\mathcal {E}}_{A{\bar{A}}})^T$ can be identified with the $E_{A{\bar{A}}}$ of Lemma 4.2. In addition, we have $\frac{d_A}{d_B}\cdot J^\mathcal {D^\dagger }_{B{\bar{B}}}\succeq 0$, and tracing out the ${\bar{B}}$ system as well as using $d_A = d_{{\bar{B}}}$ we get

$$\begin{aligned} \frac{d_A}{d_B}\cdot J^\mathcal {D^\dagger }_{B} = \frac{d_A}{d_B}\cdot {\mathcal {D}}^\dagger \left( \frac{1_{{\bar{B}}}}{d_{{\bar{B}}}}\right) = \frac{d_A}{d_B}\cdot \frac{1}{d_{{\bar{B}}}}\cdot 1_{B} = \frac{1_B}{d_B}. \end{aligned}$$

Thus, we can identify $\frac{d_A}{d_B}\cdot J^\mathcal {D^\dagger }_{B{\bar{B}}}$ with the $D_{B{\bar{B}}}$ of Lemma 4.2. $\square $

The following simple dimension bounds hold for the channel fidelity.

Lemma 4.3

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} 0\le F({\mathcal {N}},M)\le \min \left\{ 1,\left( \frac{d_{{\bar{A}}}}{M}\right) ^2,\frac{d_B}{M}\right\} . \end{aligned}$$

The proof can be found in Appendix E. By the linearity of the objective function we can furthermore rewrite the channel fidelity as

$$\begin{aligned} F({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( \sum _{i\in I}p_iE_{A{\bar{A}}}^i\otimes D_{B{\bar{B}}}^i\right) \right] \\ s.t.&\quad p_i\ge 0\;\forall i\in I,\;\sum _{i\in I}p_i=1\\&\quad E_{A{\bar{A}}}^i\succeq 0,\;D_{B{\bar{B}}}^i\succeq 0,\;E^i_A=\frac{1_A}{d_A},\;D^i_B=\frac{1_B}{d_B}\;\forall i\in I. \end{aligned}$$

4.3 De Finetti theorems for quantum channels

Recall that a quantum channel is just a completely positive, trace preserving map between two spaces of quantum states. Here, we establish a sufficient criterion under which permutation invariance of a quantum channel implies that it can be well approximated by a mixture of product quantum channels.

Theorem 4.4

Let $\rho _{A{\bar{A}}(B{\bar{B}})_1^n}$ be a quantum state with

$$\begin{aligned} \rho _{A{\bar{A}}(B{\bar{B}})_1^n}&={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi (\rho _{A{\bar{A}}(B{\bar{B}})_1^n})\;\forall \pi \in {\mathfrak {S}}_n \end{aligned}$$

(13)

$$\begin{aligned} \rho _{A(B{\bar{B}})_1^n}&=\frac{1_A}{d_A}\otimes \rho _{(B{\bar{B}})_1^n} \end{aligned}$$

(14)

$$\begin{aligned} \rho _{(B{\bar{B}})_1^{n-1}B_n}&=\rho _{(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B}. \end{aligned}$$

(15)

Then, we have for $0<k<n$ that

$$\begin{aligned}&\left\| \rho _{A{\bar{A}}(B{\bar{B}})_1^{k}}-\sum _{i\in I}p_i\sigma _{A{\bar{A}}}^i\otimes \left( \omega _{B{\bar{B}}}^i\right) ^{\otimes k}\right\| _1 \\&\quad \le kf(B \bar{B} |\cdot )\sqrt{ (2 \ln 2) \frac{ \log (d_{A} d_{\bar{A}}) + (k - 1) \log (d_B d_{\bar{B}}) }{n-k+1}} \end{aligned}$$

with $\{p_i\}_{i\in I}$ a probability distribution, and $\sigma _{A{\bar{A}}}^i,\omega _{B{\bar{B}}}^i\succeq 0$ such that $\sigma _A^i=\frac{1_A}{d_A}$ and $\omega _B^i=\frac{1_B}{d_B}$ for $i\in I$.

Proof

We simply apply Theorem 2.4 for the linear maps $\Lambda _{A\bar{A} \rightarrow A} = \mathrm{Tr}_{\bar{A}}$ and $\Gamma _{B\bar{B} \rightarrow B} = \mathrm{Tr}_{\bar{B}}$. $\square $

We emphasize that, in the representation we obtain in this theorem, $\rho _{A{\bar{A}}(B{\bar{B}})_1^{k}}$ is close to a mixture of products of Choi states of completely positive and trace-preserving maps. We note that applying standard de Finetti theorems for quantum states would only show that $\rho _{A{\bar{A}}(B{\bar{B}})_1^{k}}$ is close to a mixture of product states, or in other words, Choi states of completely positive maps that are in general not even trace-non-increasing. This is not sufficient for our applications, and the constraints (14) and (15) are needed in our proofs to achieve this stronger statement. We discuss this in more detail by means of the following examples.

Example 4.5

For ${\bar{A}}{\bar{B}}$ trivial and $k=1$ Theorem 4.4 says that $\rho _{AB}$ is close to the product state $\frac{1_{AB}}{d_Ad_B}$, as this is the only valid state satisfying the linear constraints. However, having only the permutation invariance condition (13) without the other two conditions in Theorem 4.4, this conclusion does not hold. In fact, choose $\rho _{AB_1^n}$ to be maximally classically correlated between all systems $A;B_1;B_2^n$

$$\begin{aligned} \rho _{AB_1^n} = \frac{1}{d} \sum _{i} \vert i\rangle \langle i\vert ^{\otimes n+1}. \end{aligned}$$

Then, the systems $B_1^n$ are symmetric with respect to A and even more, the state is supported on the symmetric subspace $(1_{A} \otimes P^{\mathrm {sym}}_{B_1^n})(\rho _{A B_1^n}) = \rho _{A B_1^n}$. However, of course $\rho _{AB_1}$ is not close to the state $\frac{1_{AB_1}}{d_Ad_B}$.
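The gap in Example 4.5 can be made quantitative. Since both $\rho _{AB_1}$ and the maximally mixed state are diagonal in the computational basis, their trace-norm distance is an $\ell _1$ distance of diagonals. The sketch below (hypothetical names, all systems of equal dimension $d$ as in the example) evaluates it.

```python
def correlated_marginal_diag(d):
    """Diagonal of rho_{A B_1} = (1/d) sum_i |ii><ii| in the basis |a, b>."""
    return [1.0 / d if a == b else 0.0 for a in range(d) for b in range(d)]


def distance_to_maximally_mixed(d):
    """Trace-norm distance between rho_{A B_1} and 1/d^2 (both diagonal)."""
    return sum(abs(p - 1.0 / d**2) for p in correlated_marginal_diag(d))
```

The distance equals $2(1-1/d)$, that is, 1 for qubits, and it approaches the maximal value 2 as $d$ grows, independently of $n$.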

Example 4.6

The following example shows that imposing the constraint $\rho _{AB_1} = \frac{1_{AB_1}}{d_Ad_B}$ is not enough either. Let $A,{\bar{A}}, B, {\bar{B}}$ all be of dimension $d \ge 2$. Then, define for any $n \ge 1$

$$\begin{aligned} \rho _{AB_1^n \bar{A} \bar{B}_1^n} = \frac{1}{d^2} \sum _{i,j} \vert j\rangle \langle j\vert _{A} \otimes \vert i\rangle \langle i\vert _{{\bar{A}}} \otimes \vert i\rangle \langle i\vert ^{\otimes n}_{B} \otimes \vert i\rangle \langle i\vert ^{\otimes n}_{{\bar{B}}}. \end{aligned}$$

Then, the state is invariant under permutations of the $B\bar{B}$ systems and $\rho _{AB_1} = \frac{1_{AB_1}}{d^2}$. However, the reduced state $\rho _{A\bar{A} B_1 \bar{B}_1}$ is not close to states of the form

$$\begin{aligned} \sum _{\ell } p_{\ell } \sigma ^{\ell }_{A\bar{A}} \otimes \omega ^{\ell }_{B \bar{B}}\quad \text {with}\quad \sigma ^{\ell }_{A} =\frac{1_A}{d}, \omega ^{\ell }_{B} = \frac{1_B}{d}. \end{aligned}$$

To see this, consider the projector $\Pi _{\bar{A} B} = \sum _{i} \vert i\rangle \langle i\vert _{\bar{A}}\otimes \vert i\rangle \langle i\vert _{B}$. Then, we get

$$\begin{aligned} {\mathrm {Tr}}(\Pi _{\bar{A} B} \rho _{A\bar{A} B\bar{B}}) = 1\quad \text {but}\quad {\mathrm {Tr}}(\Pi _{\bar{A} B} \sigma ^{\ell }_{A\bar{A}} \otimes \omega ^{\ell }_{B\bar{B}}) = {\mathrm {Tr}}\Bigg (\Pi _{\bar{A} B} \sigma ^{\ell }_{\bar{A}} \otimes \frac{1_B}{d}\Bigg ) = \frac{1}{d}. \end{aligned}$$
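The two overlaps in Example 4.6 can be reproduced with a small classical computation. The sketch below is a restricted illustration with hypothetical names: it only samples product states that are diagonal in the computational basis, whereas the argument in the text covers arbitrary $\sigma ^{\ell }_{A\bar{A}} \otimes \omega ^{\ell }_{B\bar{B}}$ satisfying the marginal constraints.

```python
import random
from itertools import product


def overlap_with_projector(weights):
    """Tr(Pi rho) for a state diagonal in the basis |a, abar, b, bbar>,
    where Pi projects onto abar == b."""
    return sum(p for (a, abar, b, bbar), p in weights.items() if abar == b)


def example_state(d):
    """Diagonal of the reduced state rho_{A Abar B Bbar} from Example 4.6."""
    return {(j, i, i, i): 1.0 / d**2 for i in range(d) for j in range(d)}


def constrained_product(d, rng):
    """Diagonal product state whose A and B marginals are maximally mixed."""
    def block():
        # Uniform first system, random conditional distribution on the second.
        out = {}
        for x in range(d):
            w = [rng.random() for _ in range(d)]
            s = sum(w)
            for y in range(d):
                out[(x, y)] = w[y] / (s * d)
        return out

    sigma, omega = block(), block()
    return {
        (a, abar, b, bbar): sigma[(a, abar)] * omega[(b, bbar)]
        for a, abar, b, bbar in product(range(d), repeat=4)
    }
```

The example state has overlap 1 with $\Pi _{\bar{A} B}$, while every sampled constrained product state has overlap $1/d$, so the trace-norm distance between them is at least $2(1-1/d)$.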

By the Choi-Jamiołkowski isomorphism and relating the trace norm distance of Choi states to the diamond norm distance of the quantum channels [73, Lemma 7], we can alternatively state the bounds from Theorem 4.4 directly in terms of the quantum channels.

Corollary 4.7

Let ${\mathcal {N}}_{AB_1^{n}\rightarrow {\bar{A}}{\bar{B}}_1^{n}}$ be a quantum channel such that

$$\begin{aligned} {\mathcal {U}}_{{\bar{B}}_1^n}^\pi \left( {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}(\cdot )\right)&={\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( {\mathcal {U}}_{B_1^n}^\pi (\cdot )\right) \;\forall \pi \in {\mathfrak {S}}_n \end{aligned}$$

(16)

$$\begin{aligned} {\mathrm {Tr}}_{{\bar{B}}_n}\Big [{\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}(\cdot )\Big ]&= {\mathrm {Tr}}_{{\bar{B}}_n}\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( {\mathrm {Tr}}_{B_n}[\cdot ] \otimes \frac{1_{B_n}}{d_B} \right) \right] \end{aligned}$$

(17)

$$\begin{aligned} {\mathrm {Tr}}_{{\bar{A}}}\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}(\cdot )\right]&={\mathrm {Tr}}_{{\bar{A}}}\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( \frac{1_A}{d_A}\otimes {\mathrm {Tr}}_A\left[ \cdot \right] \right) \right] . \end{aligned}$$

(18)

Then, we have for $0<k<n$ with

$$\begin{aligned} {\mathcal {N}}_{AB_1^{k}\rightarrow {\bar{A}}{\bar{B}}_1^{k}}(X_{A B_1^k}):= {\mathrm {Tr}}_{\bar{B}_{k+1}^n}\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( X_{A B_1^k} \otimes \frac{1_{B_{k+1}^n}}{d_B^{n-k}}\right) \right] \end{aligned}$$

(19)

that

$$\begin{aligned}&\left\| {\mathcal {N}}_{AB_1^k \rightarrow {\bar{A}}{\bar{B}}_1^k}-\sum _{i\in I}p_i{\mathcal {E}}_{A\rightarrow {\bar{A}}}^i\otimes \left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}^i\right) ^{\otimes k}\right\| _\Diamond \\&\quad \le \, d_Ad_B^k\cdot kf(B \bar{B}|\cdot ) \times \sqrt{ (2 \ln 2) \frac{ \log (d_{A} d_{\bar{A}}) + (k - 1) \log (d_B d_{\bar{B}}) }{n-k+1}} \end{aligned}$$

with $\{p_i\}_{i\in I}$ a probability distribution and ${\mathcal {D}}_{B\rightarrow {\bar{B}}}^i,{\mathcal {E}}_{A\rightarrow {\bar{A}}}^i$ quantum channels for $i\in I$.

In (19) we chose a specific extension of $X_{AB_1^k}$ to define ${\mathcal {N}}_{AB_1^{k}\rightarrow {\bar{A}}{\bar{B}}_1^{k}}(X_{A B_1^k})$, namely $X_{AB_1^k} \otimes \frac{1_{B_{k+1}^n}}{d_B^{n-k}}$. This is still well-defined as the conditions (16) and (17) we require of ${\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}$ actually say that the choice of extension does not matter. That is, we have for any $X_{A B_1^n}$ that

$$\begin{aligned} {\mathrm {Tr}}_{\bar{B}_{k+1}^n} \left[ {\mathcal {N}}_{AB_1^{n}\rightarrow {\bar{A}}{\bar{B}}_1^{n}}(X_{A B_1^n}) \right]&= {\mathrm {Tr}}_{\bar{B}_{k+1}^{n-1} }\left[ {\mathrm {Tr}}_{\bar{B}_{n} }\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( X_{A B_1^{n-1}} \otimes \frac{1_{B_n}}{d_B}\right) \right] \right] \\&= {\mathrm {Tr}}_{\bar{B}_{k+1}^{n} }\left[ {\mathcal {N}}_{AB_1^n\rightarrow {\bar{A}}{\bar{B}}_1^n}\left( X_{A B_1^{k}} \otimes \frac{1_{B_{k+1}^n}}{d_B^{n-k}}\right) \right] \\&= {\mathcal {N}}_{AB_1^k\rightarrow {\bar{A}}{\bar{B}}_1^k}\left( X_{A B_1^{k}}\right) \,, \end{aligned}$$

where we used (17) for the first equality as well as (16) and (17) multiple times for the second equality.
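As a sanity check of this argument, consider the simplest case of trivial $A{\bar{A}}$, $n=2$, $k=1$ and a tensor power channel ${\mathcal {N}}={\mathcal {D}}\otimes {\mathcal {D}}$, for which (16) holds automatically. The sketch below assumes NumPy and uses a depolarizing channel with parameter $p=0.3$ as a purely illustrative choice of ${\mathcal {D}}$. It checks on a random input that the left-hand side of (17), the right-hand side of (17), and the reduced channel defined as in (19) all agree.

```python
import numpy as np

d, p = 2, 0.3
rng = np.random.default_rng(0)

def ptrace_second(x):
    """Trace out the second factor of an operator on C^d (x) C^d."""
    return np.einsum('ajbj->ab', x.reshape(d, d, d, d))

def ptrace_first(x):
    """Trace out the first factor of an operator on C^d (x) C^d."""
    return np.einsum('jajb->ab', x.reshape(d, d, d, d))

def depolarize(x):
    """Single-system channel D(X) = (1-p) X + p Tr(X) 1/d."""
    return (1 - p) * x + p * np.trace(x) * np.eye(d) / d

def depolarize_both(x):
    """N = D (x) D acting on operators on C^d (x) C^d."""
    # Apply D to the first factor, then to the second factor.
    y = (1 - p) * x + p * np.kron(np.eye(d) / d, ptrace_first(x))
    return (1 - p) * y + p * np.kron(ptrace_second(y), np.eye(d) / d)

X = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))

lhs = ptrace_second(depolarize_both(X))             # Tr_{Bbar_2}[N(X)]
padded = np.kron(ptrace_second(X), np.eye(d) / d)   # Tr_{B_2}[X] (x) 1/d
mid = ptrace_second(depolarize_both(padded))        # right-hand side of (17)
rhs = depolarize(ptrace_second(X))                  # reduced channel on Tr_{B_2}[X]
print(np.allclose(lhs, mid), np.allclose(lhs, rhs))
```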

In the following we state several comments about de Finetti theorems for quantum channels:

We emphasize that the de Finetti reductions—called post-selection technique [18]—for quantum channels proved in [14, 29] are different from what we need in our work (also see [3, 38] for classical versions). In particular, unlike de Finetti theorems, de Finetti reductions provide an operator inequality upper bound to a symmetric quantum state in the form of an integral superposition of product states.
In contrast to the bound for Choi states (Theorem 4.4), the diamond norm bound in Corollary 4.7 does not have a polynomial dependence in $d_B$ and k. We leave it as an open question to give a de Finetti theorem for quantum channels in terms of the diamond norm distance with a dimension dependence polynomial in $d_B$ and k. (For our purposes we only need the $k=1$ bound, in terms of the Choi states.)
In the case $k=1$, the conditions of the above theorem can be seen as approximations for the convex hull of product quantum channels, just as extendible states provide an approximation for the set of separable states.^{Footnote 8} We note that in SDP hierarchies for the quantum separability problem the permutation invariance can be replaced by the stronger condition of Bose symmetry, that is, the state in question is supported on the symmetric subspace. The reason is that every separable state can without loss of generality be decomposed into a convex combination of pure product states. However, in our setting, we cannot assume that we have a mixture of products of pure channels, and so we keep the more general notion of permutation invariance.
In the following, we never directly make use of Corollary 4.7 but rather state it for connecting to the previous literature. In particular, when choosing $A{\bar{A}}$ trivial as a special case we find a finite version of the asymptotic de Finetti theorem for quantum channels from [33, 34].^{Footnote 9} We emphasize that our derived conditions then become a finite version of the notion of exchangeable sequences of quantum channels of [34] defined as a sequence of channels $\{{\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}\}$ satisfying for all n that
$$\begin{aligned} {\mathcal {U}}_{{\bar{B}}_1^n}^\pi \left( {\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}(\cdot )\right)&={\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}\left( {\mathcal {U}}_{B_1^n}^\pi (\cdot )\right) \;\forall \pi \in {\mathfrak {S}}_n\\ {\mathcal {N}}_{B_1^{n-1}\rightarrow {\bar{B}}_1^{n-1}}\Big ({\mathrm {Tr}}_{B_n}\left[ \cdot \right] \Big )&={\mathrm {Tr}}_{{\bar{B}}_n}\Big [{\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}(\cdot )\Big ]. \end{aligned}$$
They show that under these conditions, for any k, the channel ${\mathcal {N}}_{B_1^k \rightarrow \bar{B}_1^k}$ is in the convex hull of tensor power channels. In Corollary 4.7, we start with a channel^{Footnote 10} ${\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}$ and quantify the closeness of the reduced channels ${\mathcal {N}}_{B_1^k\rightarrow {\bar{B}}_1^k}$ to convex combinations of tensor power channels $\sum _ip_i\left( {\mathcal {D}}_{B\rightarrow {\bar{B}}}^i\right) ^{\otimes k}$.
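The difference between permutation invariance and Bose symmetry mentioned in the comments above can be seen on a two-qubit example. The sketch below assumes NumPy. The maximally mixed state is our illustrative choice: it is invariant under the swap of the two systems, yet it is not supported on the symmetric subspace.

```python
import numpy as np

d = 2
# Swap operator F|i,j> = |j,i> on C^d (x) C^d.
F = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        F[j * d + i, i * d + j] = 1.0

P_sym = (np.eye(d * d) + F) / 2    # projector onto the symmetric subspace
rho = np.eye(d * d) / d ** 2       # maximally mixed (product) state

permutation_invariant = np.allclose(F @ rho @ F.T, rho)
bose_symmetric = np.allclose(P_sym @ rho @ P_sym, rho)
print(permutation_invariant, bose_symmetric)
```

A constraint of the form $P^{\mathrm {sym}}WP^{\mathrm {sym}}=W$ would therefore exclude this state, whereas the permutation invariance constraint does not.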

Channels that are written as mixtures of channels of the form ${\mathcal {E}}_{A \rightarrow {\bar{A}}} \otimes {\mathcal {D}}_{B \rightarrow {\bar{B}}}$ where ${\mathcal {E}}_{A \rightarrow {\bar{A}}}$ and ${\mathcal {D}}_{B \rightarrow {\bar{B}}}$ are channels can straightforwardly be implemented between two parties having access to shared randomness but no communication. There is a natural relaxation to this set of channels, often called LOCC(1) channels, corresponding to channels that can be implemented with additional classical communication from A to B. Mathematically, these are channels of the form

$$\begin{aligned} \sum _{i \in I}{\mathcal {E}}^i_{A \rightarrow {\bar{A}}} \otimes {\mathcal {D}}^i_{B \rightarrow {\bar{B}}}, \end{aligned}$$

where ${\mathcal {D}}^i_{B\rightarrow {\bar{B}}}$ are channels and ${\mathcal {E}}^i_{A\rightarrow {\bar{A}}}$ are completely positive but not necessarily trace-preserving. We discuss this variation of approximate quantum error correction in Appendix A.
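A small instance of a channel of this form may help fix ideas. In the sketch below, which assumes NumPy and is our own illustration rather than a construction from the paper, the sender measures A in the computational basis, so that ${\mathcal {E}}^i(X)=\vert i\rangle \langle i\vert X\vert i\rangle \langle i\vert $ is completely positive but not trace-preserving, and the receiver applies the i-th power of a cyclic shift as ${\mathcal {D}}^i$. The code checks that the sum is trace-preserving even though the individual branches are not.

```python
import numpy as np

d = 2
shift = np.roll(np.eye(d), 1, axis=0)   # S|k> = |k+1 mod d>

# Kraus operators |i><i| (x) S^i of the branches E^i (x) D^i.
kraus = []
for i in range(d):
    proj = np.zeros((d, d))
    proj[i, i] = 1.0
    kraus.append(np.kron(proj, np.linalg.matrix_power(shift, i)))

# Trace preservation of the sum: sum_i K_i^dagger K_i = identity.
completeness = sum(K.conj().T @ K for K in kraus)

def apply_channel(x):
    """Apply sum_i E^i (x) D^i to an operator on A (x) B."""
    return sum(K @ x @ K.conj().T for K in kraus)

plus = np.ones(d) / np.sqrt(d)
zero = np.eye(d)[0]
rho_in = np.kron(np.outer(plus, plus), np.outer(zero, zero))
rho_out = apply_channel(rho_in)

# Weight of a single branch on this input (strictly less than one).
branch_weight = np.trace(kraus[0] @ rho_in @ kraus[0].conj().T)
print(np.allclose(completeness, np.eye(d * d)), branch_weight)
```

On the input $\vert +\rangle \langle +\vert \otimes \vert 0\rangle \langle 0\vert $ the output is the classically correlated state $\frac{1}{d}\sum _i\vert i\rangle \langle i\vert \otimes \vert i\rangle \langle i\vert $, which illustrates the classical forward communication from A to B.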

4.4 Hierarchy of outer bounds

Following the de Finetti theorem for quantum channels as given in Theorem 4.4, the n-th level of the SDP hierarchy for the quantum channel fidelity becomes

$$\begin{aligned} {\mathrm {SDP}}_n({\mathcal {N}},M):=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) W_{A{\bar{A}}B_1{\bar{B}}_1}\right] \\ s.t.&\quad W_{A{\bar{A}}(B{\bar{B}})_1^n}\succeq 0,\;{\mathrm {Tr}}\left[ W_{A{\bar{A}}(B{\bar{B}})_1^n}\right] =1\\&\quad W_{A{\bar{A}}(B{\bar{B}})_1^n}={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi \left( W_{A{\bar{A}}(B{\bar{B}})_1^n}\right) \;\forall \pi \in {\mathfrak {S}}_n\\&\quad W_{A(B{\bar{B}})_1^n}=\frac{1_A}{d_A}\otimes W_{(B{\bar{B}})_1^n},\;W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}B_n}\\&\quad =W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B}. \end{aligned}$$

Here, we identified $B_1\equiv B$, and hence the n-th level of the hierarchy corresponds to taking $n-1$ extensions. Note that instead of stating the last condition for the final block $B_n$ we could have equivalently stated it for any block $B_j$ with $j=1,\ldots ,n$ (by the permutation invariance). Iteratively, the condition then also holds on all pairs of blocks, and so on. Moreover, by including the A-systems we slightly strengthened the last condition compared to the minimal condition on the B-systems needed for Theorem 4.4, namely

$$\begin{aligned} W_{(B{\bar{B}})_1^{n-1}B_n}=W_{(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B}. \end{aligned}$$

We then immediately have asymptotic convergence.

Theorem 4.8

Let ${\mathcal {N}}$ be a quantum channel and $n,M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} 0 \le {\mathrm {SDP}}_n({\mathcal {N}},M) - F({\mathcal {N}},M) \le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}\quad \text {implying}\quad F({\mathcal {N}},M)=\lim _{n\rightarrow \infty }{\mathrm {SDP}}_n({\mathcal {N}},M), \end{aligned}$$

where $d:=\max \{d_A,d_{{\bar{A}}},d_B,d_{{\bar{B}}}\}$.

Proof

By construction $0 \le {\mathrm {SDP}}_n({\mathcal {N}},M) - F({\mathcal {N}},M)$ and the remaining inequality arises from

$$\begin{aligned}&d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) W_{A{\bar{A}} B {\bar{B}}}\right] \\&\quad = d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] \\&\qquad + d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( W_{A{\bar{A}} B {\bar{B}}} - E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] \\&\quad \le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] \\&\qquad + d_{{\bar{A}}}d_B \cdot \Vert J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}} \Vert _\infty \cdot \Vert W_{A{\bar{A}} B {\bar{B}}} - E_{A{\bar{A}}}\otimes D_{B{\bar{B}}} \Vert _1 \\&\quad \le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] +\frac{{\mathrm {poly}}(d)}{\sqrt{n}}, \end{aligned}$$

where we used Hölder’s inequality and the quantum de Finetti theorem (Theorem 2.3).

$\square $
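The Hölder step used in the proof, $\vert {\mathrm {Tr}}[HX]\vert \le \Vert H\Vert _\infty \cdot \Vert X\Vert _1$, can be illustrated numerically. The sketch below assumes NumPy. The random Hermitian matrices are stand-ins for $J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}$ and for the difference $W_{A{\bar{A}}B{\bar{B}}}-E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}$, not the actual operators.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4

def random_hermitian(n):
    """Random Hermitian matrix with Gaussian entries."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (g + g.conj().T) / 2

H = random_hermitian(dim)   # placeholder for the objective operator
X = random_hermitian(dim)   # placeholder for the approximation error

lhs = abs(np.trace(H @ X))
op_norm = np.linalg.norm(H, 2)           # largest singular value
trace_norm = np.linalg.norm(X, 'nuc')    # sum of singular values
print(lhs, op_norm * trace_norm)
```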

We note that the worst case convergence guarantee is slow, as to ensure that the approximation error becomes small, we need at least the level $n={\mathrm {poly}}(d)$.
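To make this scaling explicit, suppose purely for illustration that the ${\mathrm {poly}}(d)$ factor in Theorem 4.8 takes the form $c\,d^{a}$ for some constants $c,a>0$ (these are hypothetical placeholders and are not determined here). Guaranteeing an approximation error of at most $\varepsilon $ from the bound then requires

$$\begin{aligned} \frac{c\,d^{a}}{\sqrt{n}}\le \varepsilon \quad \Longleftrightarrow \quad n\ge \frac{c^{2}\,d^{2a}}{\varepsilon ^{2}}, \end{aligned}$$

so for fixed accuracy the required level grows polynomially in d, with exponent $2a$ under this assumption.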

Remark 4.9

Instead of extending the B-systems we could alternatively extend the A-systems, which leads to the (non-equivalent) hierarchy

$$\begin{aligned} \overline{{\mathrm {SDP}}}_n({\mathcal {N}},M):=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) W_{A_1{\bar{A}}_1B{\bar{B}}}\right] \\ s.t.&\quad W_{(A{\bar{A}})_1^nB{\bar{B}}}\succeq 0,\;{\mathrm {Tr}}\left[ W_{(A{\bar{A}})_1^nB{\bar{B}}}\right] =1\\&\quad W_{(A{\bar{A}})_1^nB{\bar{B}}}={\mathcal {U}}_{(A{\bar{A}})_1^n}^\pi \left( W_{(A{\bar{A}})_1^nB{\bar{B}}}\right) \;\forall \pi \in {\mathfrak {S}}_n\\&\quad W_{(A{\bar{A}})_1^nB}=W_{(A{\bar{A}})_1^n}\otimes \frac{1_B}{d_B},\;W_{(A{\bar{A}})_1^{n-1}A_nB{\bar{B}}}\\&\quad =\frac{1_{A_n}}{d_A}\otimes W_{(A{\bar{A}})_1^{n-1}B{\bar{B}}}. \end{aligned}$$

For the first level we have $\overline{{\mathrm {SDP}}}_1({\mathcal {N}},M)={\mathrm {SDP}}_1({\mathcal {N}},M)$ by inspection, but for the higher levels it depends on the input-output dimensions $d_{{\bar{A}}},d_B$ which hierarchy is potentially more powerful.

The relaxations ${\mathrm {SDP}}_n({\mathcal {N}},M)$ behave naturally with respect to the first two bounds of Lemma 4.3.

Lemma 4.10

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $n,M \ge 1$. Then, we have

$$\begin{aligned} 0\le {\mathrm {SDP}}_n({\mathcal {N}},M)\le \min \left\{ 1,\left( \frac{d_{{\bar{A}}}}{M}\right) ^2\right\} . \end{aligned}$$

The proof can be found in Appendix E. We can again add all the PPT constraints and denote the resulting relaxations by ${\mathrm {SDP}}_{n,{\mathrm {PPT}}}({\mathcal {N}},M)$. In the following we study more closely these levels ${\mathrm {SDP}}_{n,{\mathrm {PPT}}}({\mathcal {N}},M)$, which are our tightest outer bound relaxations on the channel fidelity.

4.5 Low level relaxations

We find

$$\begin{aligned} {\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) W_{A{\bar{A}}B{\bar{B}}}\right] \\ s.t.&\quad W_{A{\bar{A}}B{\bar{B}}}\succeq 0,\;W_{A{\bar{A}}B{\bar{B}}}^{T_{B{\bar{B}}}}\succeq 0,\;{\mathrm {Tr}}\left[ W_{A{\bar{A}}B{\bar{B}}}\right] =1\\&\quad W_{AB{\bar{B}}}=\frac{1_A}{d_A}\otimes W_{B{\bar{B}}},\;W_{A{\bar{A}}B}=W_{A{\bar{A}}}\otimes \frac{1_B}{d_B}, \end{aligned}$$

which is the SDP outer bound found in [53, Section IV], up to their a priori stronger condition

$$\begin{aligned} W_{AB}=\frac{1_{AB}}{d_Ad_B}\text { instead of our } {\mathrm {Tr}}\left[ W_{A{\bar{A}}B{\bar{B}}}\right] =1. \end{aligned}$$

However, as implicitly shown in [53, Theorem 3] these two conditions actually become equivalent because of the structure of the objective function. Operationally ${\mathrm {SDP}}_1({\mathcal {N}},M)$ corresponds to the non-signalling assisted channel fidelity, whereas ${\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)$ adds the PPT-preserving constraint—as discussed in [53, Corollary 4]. Moreover, in the objective function the symmetry^{Footnote 11}

$$\begin{aligned} \int \left( {\overline{U}}_A\otimes U_{{\bar{B}}}\right) (\cdot )\left( {\overline{U}}_A\otimes U_{{\bar{B}}}\right) ^\dagger \,\mathrm {d}U \end{aligned}$$

can be used to achieve a dimension reduction of $M^2$ leading to [53, Theorem 3]

$$\begin{aligned} {\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ J^{\mathcal {N}}_{{\bar{A}}B}Y_{{\bar{A}}B}\right] \nonumber \\ s.t.&\quad \rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}\succeq Y_{{\bar{A}}B}\succeq 0,\;{\mathrm {Tr}}[\rho _{{\bar{A}}}]=1\nonumber \\&\quad M^2 \cdot Y_B=\frac{1_B}{d_B},\;\rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}\succeq M\cdot Y_{{\bar{A}}B}^{T_B}\nonumber \\ {}&\quad \succeq -\rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}. \end{aligned}$$

(20)
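The symmetry behind this reduction rests on the fact that a maximally entangled state is invariant under conjugation by ${\overline{U}}\otimes U$. The sketch below assumes NumPy and checks this for a single random unitary obtained from a QR decomposition. It is not the full Haar integral, and the normalization of $\Phi $ chosen here is our assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3

# A random unitary from the QR decomposition of a complex Gaussian matrix.
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(g)

# Normalized maximally entangled state sum_i |i,i> / sqrt(d).
phi = np.eye(d).reshape(d * d) / np.sqrt(d)
Phi = np.outer(phi, phi.conj())

V = np.kron(U.conj(), U)               # conj(U) (x) U
Phi_twirled = V @ Phi @ V.conj().T
print(np.allclose(Phi_twirled, Phi))
```

Since the objective only sees W through its overlap with $\Phi _{A{\bar{B}}}$, averaging W over this group action leaves the objective value unchanged, which is what makes a restriction to invariant operators possible.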

The level $n=2$ reads as

$$\begin{aligned} {\mathrm {SDP}}_{2,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) W_{A{\bar{A}}B_1{\bar{B}}_1}\right] \nonumber \\ s.t.&\quad W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\succeq 0,\;{\mathrm {Tr}}\left[ W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\right] =1\nonumber \\&\quad {\mathcal {U}}_{B_1B_2{\bar{B}}_1{\bar{B}}_2}^\pi \left( W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\right) =W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\;\forall \pi \in \Pi _2\nonumber \\&\quad W_{AB_1B_2{\bar{B}}_1{\bar{B}}_2}=\frac{1_A}{d_A}\otimes W_{B_1B_2{\bar{B}}_1{\bar{B}}_2},\;W_{A{\bar{A}}B_1B_2{\bar{B}}_1}\nonumber \\ {}&\quad =W_{A{\bar{A}}B_1{\bar{B}}_1}\otimes \frac{1_{B_2}}{d_B}\nonumber \\&\quad W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}^{T_{A{\bar{A}}}}\succeq 0,\;W_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}^{T_{B_2{\bar{B}}_2}}\succeq 0. \end{aligned}$$

(21)

Numerical evaluations of (20) and (21) can be found in Appendix B.
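The constraint structure of the first level can be made concrete without solving any SDP. The sketch below assumes NumPy, and the helper functions are ours; it is not the code used for Appendix B. It implements partial trace and partial transpose by index reshuffling and checks that a product point $W=E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}$, here with both factors chosen as maximally entangled states for illustration, satisfies the constraints of ${\mathrm {SDP}}_{1,{\mathrm {PPT}}}$. The objective is not evaluated, since that would require a concrete $J^{\mathcal {N}}$.

```python
import numpy as np

def partial_trace(x, dims, keep):
    """Reduce x to the subsystems listed in keep (increasing order)."""
    n = len(dims)
    letters = 'abcdefghijklmnopqrstuvwxyz'
    row, col = list(letters[:n]), list(letters[n:2 * n])
    for s in range(n):
        if s not in keep:
            col[s] = row[s]              # repeated index means traced out
    out = [row[s] for s in keep] + [col[s] for s in keep]
    spec = ''.join(row + col) + '->' + ''.join(out)
    k = int(np.prod([dims[s] for s in keep]))
    return np.einsum(spec, x.reshape(dims + dims)).reshape(k, k)

def partial_transpose(x, dims, systems):
    """Transpose x on the listed subsystems."""
    n = len(dims)
    perm = list(range(2 * n))
    for s in systems:
        perm[s], perm[n + s] = perm[n + s], perm[s]
    total = int(np.prod(dims))
    return x.reshape(dims + dims).transpose(perm).reshape(total, total)

d = 2
dims = [d, d, d, d]                          # order: A, Abar, B, Bbar
phi = np.eye(d).reshape(d * d) / np.sqrt(d)
Phi = np.outer(phi, phi)                     # normalized maximally entangled state
W = np.kron(Phi, Phi)                        # product point E (x) D

tol = 1e-10
checks = {
    'positive': np.linalg.eigvalsh(W).min() >= -tol,
    'ppt': np.linalg.eigvalsh(partial_transpose(W, dims, [2, 3])).min() >= -tol,
    'unit_trace': abs(np.trace(W) - 1) <= tol,
    'A_marginal': np.allclose(
        partial_trace(W, dims, [0, 2, 3]),
        np.kron(np.eye(d) / d, partial_trace(W, dims, [2, 3]))),
    'B_marginal': np.allclose(
        partial_trace(W, dims, [0, 1, 2]),
        np.kron(partial_trace(W, dims, [0, 1]), np.eye(d) / d)),
}
print(checks)
```

The same helpers extend to the level $n=2$ constraints in (21) by using six subsystems and adding the swap invariance on the $B{\bar{B}}$ blocks.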

5 Conclusion

We have shown that quantum de Finetti theorems which impose linear constraints on the approximating state lead to converging SDP hierarchies for constrained bilinear optimization. As our main application, this gave efficiently computable outer bounds on the optimal quantum channel fidelity in approximate quantum error correction. In Appendix B, we provide some numerical evidence that the resulting bounds are sometimes tight for low dimensional error models, but it would be desirable to do more extensive numerical studies for practically relevant settings. For example, it would be interesting to apply the techniques from [67] to automatically detect the symmetries in the problem in order to significantly improve the performance. One could also explore other operational settings in quantum information theory that are described in terms of jointly constrained semidefinite bilinear or multilinear programs (cf. the related work [41]).

On the mathematical side, it remains unclear if the linear constraint conditions in our quantum de Finetti theorem (Theorem 2.3) are minimal or could be further simplified. Recall that, for the linear constraint on system B, we had the condition

$$\begin{aligned} \Gamma _{B_k\rightarrow C_B}(\rho _{B_1^k})&=\rho _{B_1^{k-1}}\otimes Y_{C_B}. \end{aligned}$$

As in Example 4.6, it is simple to see that only requiring $\Gamma _{B_k \rightarrow C_B}(\rho _{B_k}) = Y_{C_B}$ is not sufficient. However, the following weaker condition might be sufficient

$$\begin{aligned} \Gamma _{B\rightarrow C_B}^{\otimes k}(\rho _{B_1^k})&=Y_{C_B}^{\otimes k}. \end{aligned}$$

We leave this as an open question (also see the related works on de Finetti reductions [14, 29]). Another mathematical question is to determine the optimal dimension dependence of the minimal distortion with side information (see Lemma D.1). It would also be interesting to improve Corollary 4.7 and give de Finetti theorems for quantum channels directly in terms of the diamond norm distance with a dimension dependence polynomial in $d_B$ and k. Finally, there are variants of quantum de Finetti theorems which provably lead to (exponentially) faster convergence for certain settings of the quantum separability problem [11, 13, 17], and the consequences for approximate quantum error correction remain to be explored.

Notes

Here and henceforth we use the notation $B_i^j$ to denote the systems $B_i\otimes \cdots \otimes B_j$, which should be interpreted as empty if $i>j$.
We refer to Sect. 2.1 for the formal definition of the so-called partial trace map ${\mathrm {Tr}}_{B_2^n}[\cdot ]$.
Here and henceforth we use the symbol $:=$ to mean equal by definition.
A linear map ${\mathcal {N}}_{A\rightarrow B}:{\mathcal {S}}_A\rightarrow {\mathcal {S}}_B$ is said to be completely positive if ${\mathcal {N}}_{A\rightarrow B}\otimes {\mathcal {I}}_C$ is a positive map for every quantum system C, where ${\mathcal {I}}_C$ denotes the identity map on ${\mathcal {S}}_C$.
The partial transpose of a matrix $W_{AB}$ is defined for a fixed product basis as $\langle ij|W_{AB}^{T_A}|kl\rangle :=\langle kj|W_{AB}|il\rangle $.
Operationally, ${\mathrm {lp}}(N,M)$ corresponds to the non-signalling assisted maximum success probability [55].
Operationally, ${\mathrm {SDP}}(N,M)$ corresponds to the positive partial transpose preserving, non-signalling assisted maximum fidelity [53].
The class of channels we consider here is more restricted than general separable channels, which usually refers to a mixture of product completely positive and not necessarily trace-preserving maps.
We also refer to [59] for previous related work and [19] for a classical version. Moreover, following [45], conditions related to our (16)–(18) give rise to extendible channels in the resource theory of unextendibility.
This is equivalent to being given a finite sequence ${\mathcal {N}}_{B_1^k\rightarrow {\bar{B}}_1^k}$ for $k \in \{1, \dots , n\}$ satisfying the exchangeability condition, as the reduced channels are then completely determined by ${\mathcal {N}}_{B_1^n\rightarrow {\bar{B}}_1^n}$.
Here, $\overline{U}_A$ denotes the complex conjugate of $U_A$ with respect to some standard basis.
The term LOCC(1) stands for local operations and one-way classical communication from sender to receiver [16].
All our code is available at https://github.com/FrancescoBorderi/Quantum-SDPs.
We noticed that SDPT3 generally returns solutions of lower rank than MOSEK.
By a continuity argument $\sigma _A$ can be assumed to have full rank.
Alternatively, the upper bound of one can directly be deduced operationally from [53, Theorem 3], where ${\mathrm {SDP}}_1({\mathcal {N}},M)$ was identified as the non-signalling assisted channel fidelity.

References

Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math. Oper. Res. 8(2), 273 (1983)
ApS, M.: The MOSEK optimization toolbox for MATLAB manual. Version 8.1 (2017)
Arnon-Friedman, R., Renner, R.: De Finetti reductions for correlations. J. Math. Phys. 56(5), 052203 (2015)
Barak, B., Brandao, F.G.S.L., Harrow, A.W., Kelner, J., Steurer, D., Zhou, Y.: Hypercontractivity, sum-of-squares proofs, and their applications. In: Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing, STOC’12, p. 307 (2012)
Barman, S., Fawzi, O.: Algorithmic aspects of optimal channel coding. IEEE Trans. Inf. Theory 64(2), 1038 (2018)
Beigi, S.: Sandwiched Rényi divergence satisfies data processing inequality. J. Math. Phys. 54(12), 122202 (2013)
Bennett, C.H., DiVincenzo, D.P., Smolin, J.A., Wootters, W.K.: Mixed-state entanglement and quantum error correction. Phys. Rev. A 54(5), 3824 (1996)
Berta, M., Christandl, M., Renner, R.: The quantum reverse Shannon theorem based on one-shot information theory. Commun. Math. Phys. 306(3), 579 (2011)
Berta, M., Fawzi, O., Scholz, V.B.: Quantum bilinear optimization. SIAM J. Optim. 26(3), 1529 (2016)
Bhatia, R.: Matrix Analysis. Graduate Texts in Mathematics. Springer, Berlin (1997)
Brandao, F.G.S.L., Christandl, M., Yard, J.: Faithful squashed entanglement. Commun. Math. Phys. 306(3), 805 (2011)
Brandao, F.G.S.L., Harrow, A.W.: Product-state approximations to quantum ground states. Commun. Math. Phys. 342(1), 47 (2016)
Brandao, F.G.S.L., Harrow, A.W.: Quantum de Finetti theorems under local measurements with applications. Commun. Math. Phys. 353(2), 469 (2017)
Brandao, F.G.S.L., Harrow, A.W., Oppenheim, J., Strelchuk, S.: Quantum conditional mutual information, reconstructed states, and state redistribution. Phys. Rev. Lett. 115(5), 050501 (2015)
Caves, C.M., Fuchs, C.A., Schack, R.: Unknown quantum states: the quantum de Finetti representation. J. Math. Phys. 43(9), 4537 (2002)
Chitambar, E., Leung, D., Mančinska, L., Ozols, M., Winter, A.: Everything you always wanted to know about LOCC (but were afraid to ask). Commun. Math. Phys. 328(1), 303–326 (2014)
Christandl, M., König, R., Mitchison, G., Renner, R.: One-and-a-half quantum de Finetti theorems. Commun. Math. Phys. 273(2), 473 (2007)
Christandl, M., König, R., Renner, R.: Postselection technique for quantum channels with applications to quantum cryptography. Phys. Rev. Lett. 102(2), 020504 (2009)
Christandl, M., Toner, B.: Finite de Finetti theorem for conditional probability distributions describing physical theories. J. Math. Phys. 50(4), 042104 (2009)
Coffman, V., Kundu, J., Wootters, W.K.: Distributed entanglement. Phys. Rev. A 61, 052306 (2000)
de Finetti, B.: La prévision: ses lois logiques, ses sources subjectives. Ann. Inst. Henri Poincaré 7(1), 1 (1937)
Diaconis, P., Freedman, D.: Finite exchangeable sequences. Ann. Probab. 8(4), 745 (1980)
DiVincenzo, D., Shor, P., Smolin, J.: Quantum-channel capacity of very noisy channels. Phys. Rev. A 57(2), 830 (1998)
Doherty, A.C., Parrilo, P.A., Spedalieri, F.M.: Distinguishing separable and entangled states. Phys. Rev. Lett. 88(18), 187904 (2002)
Doherty, A.C., Parrilo, P.A., Spedalieri, F.M.: Complete family of separability criteria. Phys. Rev. A 69(2), 022308 (2004)
Duan, R., Winter, A.: No-signalling-assisted zero-error capacity of quantum channels and an information theoretic interpretation of the Lovász number. IEEE Trans. Inf. Theory 62(2), 891 (2016)
Fang, K., Fawzi, H.: The sum-of-squares hierarchy on the sphere and applications in quantum information theory. Math. Program. 2020, 1–30 (2020)
Fannes, M., Lewis, J.T., Verbeure, A.: Symmetric states of composite systems. Lett. Math. Phys. 15(3), 255 (1988)
Fawzi, O., Renner, R.: Quantum conditional mutual information and approximate Markov chains. Commun. Math. Phys. 340(2), 575 (2015)
Fazel, M., Hindi, H., Boyd, S.P.: Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In: Proceedings of the 2003 American Control Conference, 2003, vol. 3, pp. 2156–2162 (2003)
Fletcher, A.S.: Channel-Adapted Quantum Error Correction. PhD thesis, Massachusetts Institute of Technology (2007)
Fletcher, A.S., Shor, P.W., Win, M.Z.: Optimum quantum error recovery using semidefinite programming. Phys. Rev. A 75(1), 012338 (2007)
Fuchs, C.A., Schack, R.: Unknown Quantum States and Operations, a Bayesian View, p. 147. Springer Berlin Heidelberg, Berlin (2004)
Fuchs, C.A., Schack, R., Scudo, P.F.: De Finetti representation theorem for quantum-process tomography. Phys. Rev. A 69(6), 062305 (2004)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming (2008)
Harrow, A.W., Natarajan, A., Wu, X.: Limitations of semidefinite programs for separable states and entangled games. Commun. Math. Phys. 366, 423–468 (2019)
Hayashi, M.: Information spectrum approach to second-order coding rate in channel coding. IEEE Trans. Inf. Theory 55(11), 4947 (2009)
Hayashi, M., Tomamichel, M.: Correlation detection and an operational interpretation of the Rényi mutual information. J. Math. Phys. 57(10), 102201 (2016)
Horodecki, P., Lewenstein, M., Vidal, G., Cirac, I.: Operational criterion and constructive checks for the separability of low-rank density matrices. Phys. Rev. A 62(3), 032310 (2000)
Horodecki, R., Horodecki, P., Horodecki, M., Horodecki, K.: Quantum entanglement. Rev. Mod. Phys. 81, 865–942 (2009)
Huber, S., Koenig, R., Tomamichel, M.: Jointly constrained semidefinite bilinear programming with an application to Dobrushin curves. IEEE Trans. Inf. Theory Early Access (2019)
Hudson, R.L., Moody, G.R.: Locally normal symmetric states and an analogue of de Finetti’s theorem. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 33(4), 343 (1976)
Johnson, P.D., Romero, J., Olson, J., Cao, Y., Aspuru-Guzik, A.: QVECTOR: an algorithm for device-tailored quantum error correction. arXiv:1711.02249 (2017)
Johnston, N.: QETLAB: A MATLAB toolbox for quantum entanglement, version 0.9 (2016)
Kaur, E., Das, S., Wilde, M.M., Winter, A.: Extendibility limits the performance of quantum processors. Phys. Rev. Lett. 123, 070502 (2019)
Koenig, R., Mitchison, G.: A most compendious and facile quantum de Finetti theorem. J. Math. Phys. 50(1), 012105 (2009)
Koenig, R., Renner, R.: A de Finetti representation for finite symmetric quantum states. J. Math. Phys. 46(12), 122108 (2005)
Konno, H.: A cutting plane algorithm for solving bilinear programs. Math. Program. 11(1), 14 (1976)
Kosut, R.L., Lidar, D.A.: Quantum error correction via convex optimization. Quantum Inf. Process. 8(5), 443 (2009)
Kretschmann, D., Werner, R.F.: Tema con variazioni: quantum channel capacity. New J. Phys. 6(1), 26 (2004)
Lancien, C., Winter, A.: Distinguishing multi-partite states by local measurements. Commun. Math. Phys. 323(2), 555 (2013)
Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796 (2000)
Leung, D., Matthews, W.: On the power of PPT-preserving and non-signalling codes. IEEE Trans. Inf. Theory 61(8), 4486 (2015)
Leung, D.W., Nielsen, M.A., Chuang, I.L., Yamamoto, Y.: Approximate quantum error correction can lead to better codes. Phys. Rev. A 56(4), 2567 (1997)
Matthews, W.: A linear program for the finite block length converse of Polyanskiy-Poor-Verdú via nonsignaling codes. IEEE Trans. Inf. Theory 58(12), 7036 (2012)
Navascués, M., Owari, M., Plenio, M.B.: Power of symmetric extensions for entanglement detection. Phys. Rev. A 80(5), 052306 (2009)
Nielsen, M.A., Chuang, I.: Quantum Computation and Quantum Information (2000)
Olkiewicz, R., Zegarlinski, B.: Hypercontractivity in noncommutative $L_p$-spaces. J. Funct. Anal. 161(1), 246 (1999)
Pankowski, L., Brandao, F.G.S.L., Horodecki, M., Smith, G.: Entanglement distillation by extendible maps. Quantum Inf. Comput. 13(9–10), 751–770 (2013)
Parrilo, P.A.: Semidefinite programming relaxations for semialgebraic problems. Math. Program. 96(2), 293 (2003)
Petz, D.: A de Finetti-type theorem with m-dependent states. Probab. Theory Relat. Fields 85(1), 65 (1990)
Pironio, S., Navascués, M., Acín, A.: Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM J. Optim. 20, 2157–2180 (2010)
Polyanskiy, Y., Poor, H.V., Verdú, S.: Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 56(5), 2307 (2010)
Raggio, G.A., Werner, R.F.: Quantum statistical mechanics of general mean field systems. Helv. Phys. Acta 62, 980 (1989)
Reimpell, M.: Quantum Information and Convex Optimization. PhD thesis, TU Braunschweig (2008)
Reimpell, M., Werner, R.F.: Iterative optimization of quantum error correcting codes. Phys. Rev. Lett. 94(8), 080501 (2005)
Rosset, D.: Symdpoly: symmetry-adapted moment relaxations for noncommutative polynomial optimization. arXiv preprint arXiv:1808.09598 (2018)
Scott, A.J.: Optimizing quantum process tomography with unitary 2-designs. J. Phys. A: Math. Theor. 41(5), 055308 (2008)
Størmer, E.: Symmetric states of infinite tensor products of C*-algebras. J. Funct. Anal. 3(1), 48 (1969)
Taghavi, S., Kosut, R.L., Lidar, D.A.: Channel-optimized quantum error correction. IEEE Trans. Inf. Theory 56(3), 1461 (2010)
Toh, K.-C., Todd, M.J., Tütüncü, R.H.: On the Implementation and Usage of SDPT3—A Matlab Software Package for Semidefinite-Quadratic-Linear Programming, Version 4.0, pp. 715–754. Springer US, Boston, MA (2012)
Tomamichel, M., Berta, M., Renes, J.M.: Quantum coding with finite resources. Nat. Commun. 7, 11419 (2016)
Wallman, J.J., Flammia, S.T.: Randomized benchmarking with confidence. New J. Phys. 16(10), 103032 (2014)
Wang, X., Duan, R.: A semidefinite programming upper bound of quantum capacity. In: Proceedings IEEE ISIT 2016, p. 1690 (2016)
Wang, X., Fang, K., Duan, R.: Semidefinite programming converse bounds for quantum communication. IEEE Trans. Inf. Theory 65(4), 2581–2592 (2018)
Wang, X., Xie, W., Duan, R.: Semidefinite programming strong converse bounds for classical capacity. IEEE Trans. Inf. Theory 64, 640–653 (2017)
Watrous, J.: Semidefinite programs for completely bounded norms. Theory Comput. 5(11), 217–238 (2009)
Watrous, J.: The Theory of Quantum Information, 1st edn. Cambridge University Press, Cambridge (2018)
Werner, R.F., Wolf, M.M.: Bell inequalities and entanglement. Quantum Inf. Comput. 1(3), 1 (2001)
Wilde, M.M.: Quantum Information Theory. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9781139525343
Wolf, M.: Quantum channels and operations: Guided tour. Lecture notes https://wwwm5.ma.tum.de/foswiki/pub/M5/Allgemeines/MichaelWolf/QChannelLecture.pdf (July 2012)

Acknowledgements

We thank Fernando Brandão, Matthias Christandl and Robert König for useful discussions, and Nengkun Yu for pointing out a mistake in a previous version of Corollary 4.7 about channel de Finetti theorems in terms of the diamond norm distance.

Author information

Authors and Affiliations

Department of Computing, Imperial College London, London, UK
Mario Berta & Francesco Borderi
Université de Lyon, ENS de Lyon, CNRS, UCBL, LIP, 69342, Lyon Cedex 07, France
Omar Fawzi
Department of Physics, Ghent University, Ghent, Belgium
Volkher B. Scholz

Corresponding author

Correspondence to Francesco Borderi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Part of this work has been presented at ISIT 2019 under the title “Quantum Coding via Semidefinite Programming”.

Appendices

Appendix A: Classically-assisted approximate quantum error correction

A.1. Setting

It is often useful to add classical forward communication assistance to the problem of quantum error correction. The corresponding assisted channel fidelity is defined as follows.

Definition A.1

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. The LOCC(1)-assisted channel fidelity for message dimension M is defined as^{Footnote 12}

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M):=\max&\quad F\Big (\Phi _{{\bar{B}}R},\sum _{i\in I}\big (\left( {\mathcal {D}}^i_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}^i_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\Big )\\ s.t.&\quad \sum _{i\in I}{\mathcal {E}}^i_{A\rightarrow {\bar{A}}}\;\text {quantum channel with } {\mathcal {E}}^i_{A\rightarrow {\bar{A}}}\text { cp for }i\in I\\&\quad {\mathcal {D}}^i_{B\rightarrow {\bar{B}}}\;\text {quantum channel }\forall i\in I, \end{aligned}$$

where $\Phi _{AR}$ denotes the maximally entangled state on AR, cp is the abbreviation for completely positive, and we have $M=d_A=d_{{\bar{B}}}=d_R$.

By the Choi-Jamiołkowski isomorphism this can again be rewritten as a bilinear optimization.

Lemma A.2

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, the LOCC(1)-assisted channel fidelity can be written as

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( \sum _{i\in I}E_{A{\bar{A}}}^i\otimes D_{B{\bar{B}}}^i\right) \right] \\ s.t.&\quad E_{A{\bar{A}}}^i\succeq 0,\;D_{B{\bar{B}}}^i\succeq 0\quad \forall i\in I\\&\quad \sum _{i\in I}E_A^i=\frac{1_A}{d_A},\;D^i_B=\frac{1_B}{d_B}\quad \forall i\in I. \end{aligned}$$

The proof is analogous to that of Lemma 4.2 for plain quantum error correction: the objective function $F\Big (\Phi _{{\bar{B}}R},\sum _{i\in I}\big (\left( {\mathcal {D}}^i_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}^i_{A\rightarrow {\bar{A}}}\right) \otimes {\mathcal {I}}_R\big )(\Phi _{AR})\Big )$ is rewritten using the Choi-Jamiołkowski isomorphism. The quantity $F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)$ is closely connected to the channel fidelity $F({\mathcal {N}},M)$.

Lemma A.3

Let ${\mathcal {N}}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)\ge F({\mathcal {N}},M)\ge \left( F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)\right) ^2. \end{aligned}$$

Asymptotically this corresponds to the well-known statement that forward classical communication assistance does not increase the capacity.

Proof

The first inequality is trivial because the addition of a forward classical communication channel cannot decrease the channel fidelity. The fact that $\left( F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)\right) ^2$ gives a lower bound on $F({\mathcal {N}},M)$ can be seen from [50, Proposition 4.5]. Consider an arbitrary coding scheme for the quantum channel ${\mathcal {N}}$ assisted with a forward classical communication channel and call ${\mathcal {F}}_{\mathrm {LOCC(1)}}$ the channel fidelity obtained using that scheme. We then want to show that it is always possible to find a coding scheme for the quantum channel ${\mathcal {N}}$ alone allowing us to achieve a channel fidelity ${\mathcal {F}} \ge {\mathcal {F}}_{\mathrm {LOCC(1)}}^2$. Say we are able to send, through the forward classical communication channel, a symbol in the set $\{1,\dots ,S\}$ with $S\in {\mathbb {N}}$. An arbitrary coding scheme for the assisted quantum channel can be modelled by a collection of instruments $\{{\mathcal {E}}^s_{A \rightarrow {\bar{A}}}\}_{s\in \{1,\dots ,S\}}$, i.e., trace-nonincreasing cp maps summing up to a channel, and channels $\{{\mathcal {D}}^s_{B \rightarrow {\bar{B}}}\}_{s\in \{1,\dots ,S\}}$. It is then easy to show that there must exist a symbol ${\tilde{s}}$ such that the fidelity of the map ${\mathcal {D}}^{{\tilde{s}}} \circ {\mathcal {N}} \circ \frac{{\mathcal {E}}^{{\tilde{s}}}}{e^{{\tilde{s}}}}$ is lower bounded by ${\mathcal {F}}_{\mathrm {LOCC(1)}}$, where the factor $e^{{\tilde{s}}}$ is chosen such that the completely positive map $\frac{{\mathcal {E}}^{{\tilde{s}}}}{e^{{\tilde{s}}}}$ becomes trace preserving with respect to the maximally mixed state $\frac{1_A}{d_A}$, as done in [50, Proposition 5.1]. 
Using the polar decomposition it is possible to find an isometric encoder ${\mathcal {V}}^{{\tilde{s}}} $ such that the channel fidelity ${\mathcal {F}}$ obtained using the coding scheme with encoder ${\mathcal {V}}^{{\tilde{s}}} $ and decoder ${\mathcal {D}}^{{\tilde{s}}} $ is lower bounded by the squared fidelity of the map ${\mathcal {D}}^{{\tilde{s}}} \circ {\mathcal {N}} \circ \frac{{\mathcal {E}}^{{\tilde{s}}}}{e^{{\tilde{s}}}}$. This implies ${\mathcal {F}} \ge {\mathcal {F}}_{\mathrm {LOCC(1)}} ^2$. $\square $

We have the following dimension bounds for the LOCC(1)-assisted setting. Notice that this result readily implies Lemma 4.3.

Lemma A.4

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} 0\le F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)\le \min \left\{ 1,\left( \frac{d_{{\bar{A}}}}{M}\right) ^2,\frac{d_B}{M}\right\} . \end{aligned}$$

Proof

The lower bound is trivial. For the upper bounds, as in the proof of Lemma 4.3, we mainly use that for any sub-normalized bipartite quantum state $\rho _{XY}$ we have $d_X\cdot 1_X\otimes \rho _Y\succeq \rho _{XY}$. Now, for the first upper bound note that $\frac{d_{{\bar{B}}}}{d_B}\cdot 1_{B{\bar{B}}}=d_{{\bar{B}}}\cdot 1_{{\bar{B}}}\otimes D_B^i\succeq D^i_{B{\bar{B}}}$ for all $i\in I$, and hence we get for the objective function (with $d_A=d_{{\bar{B}}}=M$)

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)&\le d_{{\bar{A}}}d_{{\bar{B}}}\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) ^{1/2}\left( \sum _{i\in I}E_{A{\bar{A}}}^i\otimes 1_{B{\bar{B}}}\right) \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) ^{1/2}\right] \\&=d_{{\bar{A}}}d_{{\bar{B}}}\cdot {\mathrm {Tr}}\left[ \left( \frac{1_{{\bar{A}}}}{d_{{\bar{A}}}}\otimes \frac{1_A}{d_A}\right) \sum _{i\in I}E_{A{\bar{A}}}^i\right] ={\mathrm {Tr}}\left[ \sum _{i\in I}E_{A{\bar{A}}}^i\right] =1. \end{aligned}$$

For the second upper bound, note that from $E_{A{\bar{A}}}^i\succeq 0,\;D_{B{\bar{B}}}^i\succeq 0$ we get

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)\le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( \sum _{i\in I}E_{A{\bar{A}}}^i\otimes \sum _{j\in I}D_{B{\bar{B}}}^j\right) \right] . \end{aligned}$$

Now, we employ that $d_{{\bar{A}}}\cdot E^i_A\otimes 1_{{\bar{A}}}\succeq E^i_{A{\bar{A}}}$ giving $\frac{d_{{\bar{A}}}}{d_A}\cdot 1_{A{\bar{A}}}\succeq \sum _{i\in I}E^i_{A{\bar{A}}}$, which in turn leads to

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)&\le \frac{d_{{\bar{A}}}^2d_B}{d_A}\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( 1_{A{\bar{A}}}\otimes \sum _{j\in I}D_{B{\bar{B}}}^j\right) \right] \\&=\frac{d_{{\bar{A}}}^2d_B}{d_A}\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_B\otimes \frac{1_{{\bar{B}}}}{d_{{\bar{B}}}}\right) \sum _{j\in I}D_{B{\bar{B}}}^j\right] \\&=\frac{d_{{\bar{A}}}^2d_B}{d_A^2d_{{\bar{B}}}}\cdot {\mathrm {Tr}}\left[ J^{\mathcal {N}}_B\sum _{j\in I}D_B^j\right] =\frac{d_{{\bar{A}}}^2d_B}{d_A^2d_{{\bar{B}}}}\cdot {\mathrm {Tr}}\left[ J^{\mathcal {N}}_Bd_A\frac{1_B}{d_B}\right] =\frac{d_{{\bar{A}}}^2}{d_Ad_{{\bar{B}}}}. \end{aligned}$$

For the third upper bound, note that $1_{B{\bar{B}}}\succeq D^i_{B{\bar{B}}}$ and thus

$$\begin{aligned} F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)&\le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \left( \sum _{i\in I}E_{A{\bar{A}}}^i\otimes 1_{B{\bar{B}}}\right) \right] \\&=d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( \frac{1_{{\bar{A}}}}{d_{{\bar{A}}}}\otimes \frac{1_A}{d_A}\right) \sum _{i\in I}E_{A{\bar{A}}}^i\right] =\frac{d_B}{d_A}\cdot {\mathrm {Tr}}\left[ \sum _{i\in I}E_{A{\bar{A}}}^i\right] =\frac{d_B}{d_A}. \end{aligned}$$

$\square $
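As a quick sanity check, the right-hand side of Lemma A.4 can be evaluated exactly for given dimensions. The following is a minimal sketch; the function name `locc1_fidelity_upper_bound` and its argument names are hypothetical and only serve the illustration.

```python
from fractions import Fraction


def locc1_fidelity_upper_bound(d_abar: int, d_b: int, m: int) -> Fraction:
    """Right-hand side of Lemma A.4: min{1, (d_Abar / M)^2, d_B / M}.

    d_abar: input dimension of the channel N (system A-bar)
    d_b:    output dimension of the channel N (system B)
    m:      message dimension M = d_A = d_Bbar
    """
    if min(d_abar, d_b, m) < 1:
        raise ValueError("all dimensions must be positive integers")
    return min(Fraction(1), Fraction(d_abar, m) ** 2, Fraction(d_b, m))


# A qubit channel cannot carry a four-dimensional message with fidelity above 1/4.
print(locc1_fidelity_upper_bound(d_abar=2, d_b=2, m=4))
```

For $M\le \min \{d_{{\bar{A}}},d_B\}$ the bound is trivial (equal to one), so it only carries information when the message is larger than the channel input or output.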

A.2. Hierarchy of outer bounds

By removing one of the two conditions in Theorem 4.4, we get the following approximation for the set of LOCC(1) channels—stated in terms of the corresponding Choi states.

Proposition A.5

Let $\rho _{A{\bar{A}}(B{\bar{B}})_1^n}$ be a quantum state with

$$\begin{aligned} \rho _{A{\bar{A}}(B{\bar{B}})_1^n}={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi (\rho _{A{\bar{A}}(B{\bar{B}})_1^n})\;\forall \pi \in {\mathfrak {S}}_n,\quad \rho _A=\frac{1_A}{d_A},\quad \rho _{(B{\bar{B}})_1^{n-1}B_n}=\rho _{(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B}. \end{aligned}$$

Then, we have for $0<k<n$ that $\left\| \rho _{A{\bar{A}}(B{\bar{B}})_1^{k}}-\sum _{i\in I}\sigma _{A{\bar{A}}}^i\otimes \left( \omega _{B{\bar{B}}}^i\right) ^{\otimes k}\right\| _1$ is upper bounded by the same term as in Theorem 4.4, where $\omega _{B{\bar{B}}}^i\succeq 0$ with $\omega _B^i=\frac{1_B}{d_B}$ and $\sigma _{A{\bar{A}}}^i\succeq 0$ with $\sum _{i\in I}\sigma _A^i=\frac{1_A}{d_A}$.

The n-th level of the SDP hierarchy then becomes

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M):=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) Z_{A{\bar{A}}B_1{\bar{B}}_1}\right] \\ s.t.&\quad Z_{A{\bar{A}}(B{\bar{B}})_1^n}\succeq 0,\;{\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi \left( Z_{A{\bar{A}}(B{\bar{B}})_1^n}\right) =Z_{A{\bar{A}}(B{\bar{B}})_1^n}\quad \forall \pi \in {\mathfrak {S}}_n\\&\quad Z_{AB_1^n}=\frac{1_{AB_1^n}}{d_Ad_B^n},\;Z_{A{\bar{A}}\left( B{\bar{B}}\right) _1^{n-1}B_n}= Z_{A{\bar{A}}\left( B{\bar{B}}\right) _1^{n-1}}\otimes \frac{1_{B_n}}{d_B}. \end{aligned}$$

By inspection, the only difference between ${\mathrm {SDP}}_n({\mathcal {N}},M)$ and ${\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M)$ is the weakened second to last condition. The asymptotic convergence follows immediately from Proposition A.5.

Theorem A.6

Let ${\mathcal {N}}$ be a quantum channel and $n,M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{n+1}({\mathcal {N}},M)\le {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M)\quad \text {and}\quad F^{\mathrm {LOCC(1)}}({\mathcal {N}},M)=\lim _{n\rightarrow \infty }{\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M). \end{aligned}$$

Note that for ${\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M)$ we slightly strengthened the last two conditions by including some more A- and B-systems in the conditions compared to the minimal conditions

$$\begin{aligned} Z_A=\frac{1_A}{d_A}\quad \text {and}\quad Z_{\left( B{\bar{B}}\right) _1^{n-1}B_n}=\frac{1_{B_n}}{d_B}\otimes Z_{\left( B{\bar{B}}\right) _1^{n-1}} \end{aligned}$$

needed for Proposition A.5. By an iterative argument the last condition implies in particular that

$$\begin{aligned} Z_{A{\bar{A}}B_1^n{\bar{B}}_1}=\frac{1_{B_2^n}}{d_B^{n-1}}\otimes Z_{A{\bar{A}}B_1{\bar{B}}_1}, \end{aligned}$$

which together with the other three conditions in ${\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M)$ then corresponds to the notion of extendible channels from [45, Definition 5] (also see [26] for similar conditions). We note, however, that when relaxing the conditions to n-extendible channels our proofs for the asymptotic convergence of the resulting outer bounds do not apply.

The SDP relaxations again behave naturally in the sense that they are upper bounded by one.

Lemma A.7

Let ${\mathcal {N}}$ be a quantum channel and $n,M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} 0\le {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_n({\mathcal {N}},M)\le 1. \end{aligned}$$

Proof

The lower bound is trivial. For the upper bound, by the monotonicity in n (Theorem A.6) it is enough to restrict to $n=1$. As in the proof of Lemma A.4, we make use of $\frac{d_{{\bar{B}}}}{d_B}\cdot Z_{A{\bar{A}}}\otimes 1_{B_1{\bar{B}}_1}\succeq Z_{A{\bar{A}}B_1{\bar{B}}_1}$. This again gives

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_1({\mathcal {N}},M)\le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) \frac{d_{{\bar{B}}}}{d_B}\cdot Z_{A{\bar{A}}}\otimes 1_{B_1{\bar{B}}_1}\right] =1. \end{aligned}$$

$\square $

We can again add PPT constraints and we denote the resulting relaxations by ${\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{n,{\mathrm {PPT}}}({\mathcal {N}},M)$. In the following we study more closely these levels ${\mathrm {SDP}}_{n,{\mathrm {PPT}}}^{\mathrm {LOCC(1)}}({\mathcal {N}},M)$, which are our tightest outer bound relaxations on the LOCC(1)-assisted channel fidelity. We find

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B}\otimes \Phi _{A{\bar{B}}}\right) Z_{A{\bar{A}}B{\bar{B}}}\right] \\ s.t.&\quad Z_{A{\bar{A}}B{\bar{B}}}\succeq 0,\;Z_{A{\bar{A}}B{\bar{B}}}^{T_{B{\bar{B}}}}\succeq 0\\&\quad Z_{AB}=\frac{1_{AB}}{d_Ad_B},\;Z_{A{\bar{A}}B}=Z_{A{\bar{A}}}\otimes \frac{1_B}{d_B}. \end{aligned}$$

This is exactly the SDP outer bound found in [53, Section IV], which simplifies to

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ J^{\mathcal {N}}_{{\bar{A}}B}X_{{\bar{A}}B}\right] \\ s.t.&\quad \rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}\succeq X_{{\bar{A}}B}\succeq 0,\;{\mathrm {Tr}}[\rho _{{\bar{A}}}]=1\\&\quad \rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}\succeq M\cdot X_{{\bar{A}}B}^{T_B}\succeq -\rho _{{\bar{A}}}\otimes \frac{1_B}{d_B}. \end{aligned}$$

By inspection, this corresponds to ${\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}},M)$ but with one missing constraint, namely $M^2X_B=\frac{1_B}{d_B}$. For $n=2$ we get

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{2,{\mathrm {PPT}}}({\mathcal {N}},M)=\max&\quad d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) Z_{A{\bar{A}}B_1{\bar{B}}_1}\right] \\ s.t.&\quad Z_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\succeq 0,\;Z_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}^{T_{A{\bar{A}}}}\succeq 0,\;Z_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}^{T_{B_2{\bar{B}}_2}}\succeq 0\\&\quad {\mathcal {U}}_{B_1B_2{\bar{B}}_1{\bar{B}}_2}^\pi \left( Z_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\right) =Z_{A{\bar{A}}B_1B_2{\bar{B}}_1{\bar{B}}_2}\quad \forall \pi \in \Pi _2\\&\quad Z_{AB_1B_2}=\frac{1_{AB_1B_2}}{d_Ad_B^2},\;Z_{A{\bar{A}}B_1B_2{\bar{B}}_1}=Z_{A{\bar{A}}B_1{\bar{B}}_1}\otimes \frac{1_{B_2}}{d_B}, \end{aligned}$$

and we recover the exact same conditions as for the notion of extendible channels [45, Definition 5].

Appendix B: Numerical examples

B.1. Methods

In the following we present the proof-of-concept numerics we implemented to test the low levels of our hierarchy for the application of approximate quantum error correction. The experiments were done in MATLAB using the QETLAB library [44], CVX [35], MOSEK [2], and SDPT3 [71].^{Footnote 13} As discussed in Lemma 3.2, the authors of [56] gave a rank loop condition to certify that a certain level of the hierarchy already gives the optimal value. We restate the condition here in the exact form needed for approximate quantum error correction.

Lemma B.1

Let $W_{A{\bar{A}}(B{\bar{B}})_1^n}={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi \left( W_{A{\bar{A}}(B{\bar{B}})_1^n}\right) $ for all $\pi \in {\mathfrak {S}}_n$ and fixed $0\le k\le n$ such that $W_{A{\bar{A}}(B{\bar{B}})_1^n}^{T_{(B{\bar{B}})_{k+1}^n}}\succeq 0$. If we have

$$\begin{aligned} {\mathrm {rank}}\left( W_{A{\bar{A}}(B{\bar{B}})_1^n}\right) \le \max \left\{ {\mathrm {rank}}\left( W_{A{\bar{A}} (B{\bar{B}})_1^k}\right) ,\,{\mathrm {rank}}\left( W_{ (B{\bar{B}})_{k+1}^n}\right) \right\} , \end{aligned}$$

then $W_{A{\bar{A}}B{\bar{B}}}$ is separable with respect to the partition $A {\bar{A}} | B {\bar{B}}$.

Using Lemma B.1 it is in principle possible to, e.g., certify the optimality of the first level using the second level of our hierarchy. Moreover, if the criterion is fulfilled it also allows us to extract the actual encoder and decoder of the optimal quantum error correction code. However, in order to facilitate the search for solutions having rank loops we need to look for low rank solutions $W_{A{\bar{A}}(B{\bar{B}})_1^n}$. It is not possible to directly write a rank condition into our semidefinite programs because rank constraints are not convex. In addition, SDP solvers typically give high rank solutions since they tend to look for solutions in the interior of the convex set.^{Footnote 14} Nevertheless, a possible strategy is to find a solution $W_{A{\bar{A}}(B{\bar{B}})_1^n}$ and then employ a heuristic to minimize the rank while keeping the hierarchy constraints. The heuristic we found the most effective for our purposes was the log-det method described in [30]. The idea is to minimize the first-order Taylor series expansion of

$$\begin{aligned} \log \det \left( W_{A{\bar{A}}(B{\bar{B}})_1^n} +\delta \cdot 1\right) , \end{aligned}$$

which is used as a smooth surrogate for ${\mathrm {rank}}\big (W_{A{\bar{A}}(B{\bar{B}})_1^n}\big )$, where $\delta >0$ is a small regularization constant. The procedure is iterative: we start from $W_0 = 1_{A{\bar{A}}(B{\bar{B}})_1^n}$, then compute $W_1$ by minimizing the log-det objective function linearized around $W_0$, and so on. In particular, the choice $W_0 = 1_{A{\bar{A}}(B{\bar{B}})_1^n}$ connects the method to the trace heuristic, which is known to be an effective heuristic for rank reduction. We stop after a certain number l of iterations and obtain a solution $W_l$ that hopefully has lower rank than the original ${\mathrm {rank}}\big (W_{A{\bar{A}}(B{\bar{B}})_1^n}\big )$.
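For concreteness, the linearization behind each step can be written out as follows. This is the standard form of the log-det heuristic; the iteration index $t$ and the symbol ${\mathcal {C}}_n$, standing for the feasible set of the n-th level together with whatever optimality constraint one chooses to keep, are notation introduced here for illustration only.

$$\begin{aligned} \log \det \left( W+\delta \cdot 1\right)&\approx \log \det \left( W_t+\delta \cdot 1\right) +{\mathrm {Tr}}\left[ \left( W_t+\delta \cdot 1\right) ^{-1}\left( W-W_t\right) \right] ,\\ W_{t+1}&\in \mathop {\mathrm {arg\,min}}\limits _{W\in {\mathcal {C}}_n}\;{\mathrm {Tr}}\left[ \left( W_t+\delta \cdot 1\right) ^{-1}W\right] . \end{aligned}$$

Each step is therefore again a semidefinite program with a linear objective. For $W_0=1$ the weight matrix is $(1+\delta )^{-1}\cdot 1$, so the first step minimizes ${\mathrm {Tr}}[W]$ up to a constant factor, which is the trace heuristic.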

B.2. Qubit Channels

We computed SDP relaxations in the plain coding setting for all the most common qubit channels: depolarizing, amplitude damping, bit flip, phase flip, bit-phase flip, Werner-Holevo and generalized Werner-Holevo channel. We found the upper bounds

$$\begin{aligned}&{\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}}_2,2) = {\mathrm {SDP}}_{2,{\mathrm {PPT}}}({\mathcal {N}}_2,2) = {\mathrm {SDP}}_{3,{\mathrm {PPT}}}({\mathcal {N}}_2,2)={\mathrm {SDP}}_1({\mathcal {N}}_2,2)\\&\quad = {\mathrm {SDP}}_2({\mathcal {N}}_2,2) = {\mathrm {SDP}}_3({\mathcal {N}}_2,2), \end{aligned}$$

where the subscript in ${\mathcal {N}}_2$ refers to the two-dimensional input and output of the channel. These identities also remain true for random qubit channels and one might then conjecture that for qubit channels indeed already ${\mathrm {SDP}}_1({\mathcal {N}}_2,2)$ captures $F({\mathcal {N}},2)$.

For the qubit depolarizing channel the trivial coding scheme is known to be optimal and we retrieve this result using the rank loop condition of the second level based on the log-det method. Similarly, for the qubit bit flip channel with parameter $p=0.1$ we find a rank-one state solution of the second level using again the log-det method, implying that the rank loop condition holds. In this case the solution is not just the state associated with the trivial coding scheme via the Choi isomorphism but the resulting encoder/decoder pair with optimal fidelity 0.9 is given by the unitary channels with Kraus matrices $U_E= - \vert 1\rangle \langle 0\vert + \vert 0\rangle \langle 1\vert $ and $U_D= \vert 0\rangle \langle 0\vert - \vert 1\rangle \langle 1\vert $, respectively. Note that the trivial coding scheme is largely suboptimal for a qubit bit flip channel with $p=0.1$, as the corresponding fidelity is 0.1.
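The two fidelities quoted for the bit flip channel can be reproduced with a few lines of standard-library Python. The sketch below assumes the parametrization in which the channel acts as the identity with probability p and applies the Pauli X flip with probability 1-p (this is inferred from the stated trivial fidelity of 0.1 and is not spelled out above), and it uses the standard expression $\sum _k |{\mathrm {Tr}}\,K_k|^2/d^2$ for the fidelity of a channel with Kraus operators $K_k$ with respect to the maximally entangled state. All function names are hypothetical.

```python
# 2x2 matrices as nested lists, entry a[i][j] = <i|A|j>; standard library only.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def channel_fidelity(kraus):
    """Overlap with the maximally entangled state: sum_k |Tr K_k|^2 / d^2, d = 2."""
    return sum(abs(trace(k)) ** 2 for k in kraus) / 4


def bit_flip_kraus(p):
    # Assumed convention: identity with probability p, X with probability 1 - p.
    s0, s1 = p ** 0.5, (1 - p) ** 0.5
    return [[[s0, 0], [0, s0]], [[0, s1], [s1, 0]]]


def coded_kraus(p, enc, dec):
    # Kraus operators of decoder o channel o encoder for unitary enc and dec.
    return [matmul(dec, matmul(k, enc)) for k in bit_flip_kraus(p)]


U_E = [[0, 1], [-1, 0]]   # -|1><0| + |0><1|
U_D = [[1, 0], [0, -1]]   # |0><0| - |1><1|

p = 0.1
trivial = channel_fidelity(bit_flip_kraus(p))          # identity encoder and decoder
coded = channel_fidelity(coded_kraus(p, U_E, U_D))     # encoder U_E, decoder U_D
print(trivial, coded)
```

Under this convention the product $U_D X U_E$ is proportional to the identity, which is why the coded scheme picks up the full weight 1-p of the flip branch.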

B.3. Qutrit Channels

We computed SDP relaxations in the plain coding setting for the following qutrit channels: depolarizing, Werner-Holevo and generalized Werner-Holevo channel. We found the upper bounds ${\mathrm {SDP}}_{1,{\mathrm {PPT}}}({\mathcal {N}}_3,2) = {\mathrm {SDP}}_{2,{\mathrm {PPT}}}({\mathcal {N}}_3,2)$ and this identity also remains true for random qutrit channels. Removing the PPT conditions, however, we found qutrit channels ${\mathcal {N}}_3$ such that ${\mathrm {SDP}}_2({\mathcal {N}}_3,2) < {\mathrm {SDP}}_1({\mathcal {N}}_3,2)$.

B.4. Depolarizing channel

The depolarizing channel for $p \in [0,4/3]$ is given as

$$\begin{aligned} Dep_d:\rho _{{\bar{A}}} \mapsto p\cdot \mathrm{Tr}[\rho _{{\bar{A}}}]\frac{1_B}{d_B} + (1-p)\cdot \rho _{B}, \end{aligned}$$

where d denotes the dimension of the input and output. Notice that even though often the channel is only studied for $p \in [0,1]$ where we can interpret p as a depolarizing probability, the above expression also represents a channel for $p \in (1,4/3]$ (as, e.g., discussed in [66, Chapter 3]). We find that
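The admissible range of p can be checked from the Choi state of the channel, which for the map above is $p\cdot \frac{1}{d^2}+(1-p)\cdot \Phi $ and hence has only two distinct eigenvalues. The following sketch computes them exactly; the function names are hypothetical, and identifying the trivial coding scheme with the identity encoder and decoder is an assumption made for this illustration.

```python
from fractions import Fraction


def depolarizing_choi_eigenvalues(p, d):
    """Distinct eigenvalues of J = p * 1/d^2 + (1 - p) * Phi for Dep_d.

    Returns (eigenvalue on the maximally entangled vector,
             eigenvalue on its orthogonal complement, multiplicity d^2 - 1).
    """
    p = Fraction(p)
    off = p / d ** 2
    on = off + (1 - p)
    return on, off


def is_channel(p, d):
    # The map is trace preserving for every p; it is completely positive
    # exactly when the Choi state is positive semidefinite.
    return all(ev >= 0 for ev in depolarizing_choi_eigenvalues(p, d))


def trivial_fidelity(p, d):
    # Overlap of the Choi state with Phi (identity encoder and decoder assumed).
    return depolarizing_choi_eigenvalues(p, d)[0]


print(is_channel(Fraction(4, 3), 2), trivial_fidelity(Fraction(1, 10), 2))
```

In this normalization the Choi state stays positive semidefinite up to $p=d^2/(d^2-1)$, which is 4/3 for qubits and 9/8 for qutrits, and the qubit trivial fidelity is $1-\frac{3p}{4}$.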

$$\begin{aligned} {\mathrm {SDP}}_{1,{\mathrm {PPT}}}(Dep_2,2) = {\mathrm {SDP}}_{2,{\mathrm {PPT}}}(Dep_2,2)={\mathrm {SDP}}_{1,{\mathrm {PPT}}}(Dep_3,2) = {\mathrm {SDP}}_{2,{\mathrm {PPT}}}(Dep_3,2). \end{aligned}$$

However, in Section B.3 we found that in general removing the PPT conditions allows us to see a difference for the first two levels. This behaviour is not shown by the qutrit depolarizing channel, probably due to its highly symmetrical structure. We computed the upper bound for LOCC(1) coding (see Appendix A) and found for $p \in (0,0.8)$ (Fig. 1) that

$$\begin{aligned} {\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{2,{\mathrm {PPT}}}(Dep_2,2)&={\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{1,{\mathrm {PPT}}}(Dep_2,2), \;\text {while}\;{\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{2,{\mathrm {PPT}}}(Dep_3,2)\\&<{\mathrm {SDP}}^{\mathrm {LOCC(1)}}_{1,{\mathrm {PPT}}}(Dep_3,2). \end{aligned}$$

We compared, for the plain coding setting, the $n=1$ level for five repetitions of the qubit depolarizing channel with the fidelity of the trivial coding scheme, as well as the 5 qubit stabilizer code from [7]. In particular, following [75] we exploited the symmetries of the qubit depolarizing channel to get the linear program

$$\begin{aligned} {\mathrm {SDP}}_{1,{\mathrm {PPT}}}\left( Dep_2^{\otimes N},2\right) =\max&\quad \sum _{i=0}^N \left( {\begin{array}{c}N\\ i\end{array}}\right) \left( 1-\frac{3p}{4}\right) ^i {\left( \frac{3p}{4}\right) }^{N-i} m_i \\ s.t.&\quad 0 \le m_i \le 1 \quad i\in \{0,\dots ,N\}\\&\quad -\frac{1}{2} \le \sum _{i = 0}^N x_{i,k} m_i \le \frac{1}{2} \quad k\in \{0,\dots ,N\}\\&\quad \sum _{i=0}^{N}\left( {\begin{array}{c}N\\ i\end{array}}\right) 3^{N-i} m_i = 2^{2N-2}. \end{aligned}$$

where $x_{i,k}=\frac{1}{d^N}\sum _{r= \max \{0,i+k-N\}}^{\min \{i,k\}} \left( {\begin{array}{c}k\\ r\end{array}}\right) \left( {\begin{array}{c}N-k\\ i-r\end{array}}\right) (-1)^{i-r} (d-1)^{k-r} (d+1)^{N-k+r-i}$ with $i,k\in \{0,\dots ,N\}$. Notice that the number of variables is an affine function of N. The results are reported in Fig. 2. Comparing these with Figure 3.7 in [66, Chapter 3], it seems that the first level of the hierarchy matches their lower bounds in the region $p \in [1,4/3]$. Notice the intersection of the five qubit code and the trivial coding scheme in the region $p\in (0.1,0.2)$ and the singular behaviour in the region $p\in (0.6,0.7)$. We have also examined five, ten, fifteen, twenty and twenty five repetitions of the qubit depolarizing channel, again using the above linear program. The results are shown in Fig. 3. Notice that the singular behaviour noted in Fig. 2 is now even more accentuated when increasing the number of repetitions.
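The data of the linear program above can be generated exactly with the standard library. The sketch below only builds the coefficients (objective, the matrix $x_{i,k}$, and the equality constraint) and does not solve the program; the function names `x_coeff` and `lp_data` are hypothetical.

```python
from fractions import Fraction
from math import comb


def x_coeff(i, k, n, d=2):
    """Coefficient x_{i,k} from the formula above, as an exact fraction."""
    lo, hi = max(0, i + k - n), min(i, k)
    total = sum(
        comb(k, r) * comb(n - k, i - r) * (-1) ** (i - r)
        * (d - 1) ** (k - r) * (d + 1) ** (n - k + r - i)
        for r in range(lo, hi + 1)
    )
    return Fraction(total, d ** n)


def lp_data(n, p):
    """Coefficients of the LP for n uses of the qubit depolarizing channel."""
    p = Fraction(p)
    good, bad = 1 - 3 * p / 4, 3 * p / 4
    # Objective: sum_i binom(n, i) (1 - 3p/4)^i (3p/4)^(n - i) m_i
    objective = [comb(n, i) * good ** i * bad ** (n - i) for i in range(n + 1)]
    # Row k encodes -1/2 <= sum_i x_{i,k} m_i <= 1/2
    x = [[x_coeff(i, k, n) for i in range(n + 1)] for k in range(n + 1)]
    # Equality: sum_i binom(n, i) 3^(n - i) m_i = 2^(2n - 2)
    eq_lhs = [comb(n, i) * 3 ** (n - i) for i in range(n + 1)]
    eq_rhs = Fraction(4) ** (n - 1)
    return objective, x, eq_lhs, eq_rhs


objective, x, eq_lhs, eq_rhs = lp_data(5, Fraction(1, 5))
print(len(objective), sum(objective), eq_rhs)
```

The program has N+1 variables and the objective weights sum to one by the binomial theorem. The resulting arrays can be passed, after conversion to floating point, to any LP solver.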

B.5. Amplitude damping channel

The qubit amplitude damping channel with damping probability $\gamma \in [0,1]$ is given as

$$\begin{aligned}&Amp_{\gamma }:\rho _{{\bar{A}}} \rightarrow E^0_B \rho _{B}{E^0_B}^\dagger + E^1_B \rho _{B}{E^1_B}^\dagger ,\;\text {where}\;E^0_B = \vert 0\rangle \langle 0\vert + \sqrt{1-\gamma }\vert 1\rangle \langle 1\vert ,\;E^1_B\\&= \sqrt{\gamma }\vert 0\rangle \langle 1\vert . \end{aligned}$$

We compared the results given by one, two, three, and four repetitions of the channel for the level $n=1$. The bounds are shown in Fig. 4, compared with the fidelity of the trivial coding scheme, and the 4 qubit code from [54]. Notice the overlap between the first level of the hierarchy and the trivial coding scheme for the one-shot setting. Comparing these results with Figure 3.12 in [66, Chapter 3] we see that there is a gap between their lower bounds (that significantly improve on the trivial coding scheme) and our upper bounds.
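The Kraus operators above and the one-shot trivial coding fidelity can be checked numerically. The sketch below assumes that the trivial coding scheme means identity encoder and decoder, in which case the fidelity with the maximally entangled state is $\sum _k |{\mathrm {Tr}}\,E^k|^2/4=\big (1+\sqrt{1-\gamma }\big )^2/4$; the function names are hypothetical.

```python
from math import sqrt, isclose


def amplitude_damping_kraus(gamma):
    """Kraus matrices E^0, E^1 with entry a[i][j] = <i|E|j> (real entries)."""
    e0 = [[1.0, 0.0], [0.0, sqrt(1.0 - gamma)]]   # |0><0| + sqrt(1-gamma)|1><1|
    e1 = [[0.0, sqrt(gamma)], [0.0, 0.0]]         # sqrt(gamma)|0><1|
    return e0, e1


def completeness(kraus):
    """Return sum_k K_k^T K_k, which equals the identity for a channel."""
    return [[sum(k[r][i] * k[r][j] for k in kraus for r in range(2))
             for j in range(2)] for i in range(2)]


def trivial_fidelity(gamma):
    """sum_k (Tr K_k)^2 / 4 for identity encoder and decoder (assumed)."""
    return sum((k[0][0] + k[1][1]) ** 2
               for k in amplitude_damping_kraus(gamma)) / 4


print(completeness(amplitude_damping_kraus(0.36)), trivial_fidelity(0.36))
```

Only $E^0$ has nonzero trace, so the trivial fidelity decreases from 1 at $\gamma =0$ to 1/4 at $\gamma =1$.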

Appendix C: Worst case error criteria

C.1. Setting

So far we have used the channel fidelity from Definition 4.1 as the measure to study approximate quantum error correction, which corresponds to the average error case. In this appendix, we consider the diamond norm to study the worst case error and we find a program for which the hierarchy can be used to generate, in this case, lower bounds. We prove that the sequence of semidefinite relaxations does in fact converge to the exact value of the original optimization program.

Definition C.1

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M \in {\mathbb {N}}$, with $M=d_{A}=d_{{\bar{B}}}$. The channel distance is defined as

$$\begin{aligned} \Delta ({\mathcal {N}},M):=\min&\;\frac{1}{2}\left\| {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}-{\mathcal {I}}_{A\rightarrow {\bar{B}}}\right\| _{\Diamond } \\ \mathrm {s.t.}&\;{\mathcal {D}}_{B\rightarrow {\bar{B}}},{\mathcal {E}}_{A\rightarrow {\bar{A}}}\;\text {quantum channels.} \end{aligned}$$

The following lemma writes the channel distance as given in Definition C.1 in terms of the Choi matrices of the encoder ${\mathcal {E}}_{A\rightarrow {\bar{A}}}$ and decoder ${\mathcal {D}}_{B\rightarrow {\bar{B}}}$, respectively.

Lemma C.2

Let ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$ be a quantum channel and $M\in {\mathbb {N}}$. Then, we have that

$$\begin{aligned} \Delta ({\mathcal {N}},M)=\min&\;\lambda \\ \mathrm {s.t.}&\;E_{A{\bar{A}}}\succeq 0,\,E_A=\frac{1_A}{d_A},\;D_{B{\bar{B}}}\succeq 0,\,D_B=\frac{1_B}{d_B}\\&\;Z_{A{\bar{B}}}\succeq 0,\;\frac{\lambda }{d_A}\cdot 1_A\succeq Z_A\\&\; Z_{A{\bar{B}}}+\Phi _{A{\bar{B}}}\succeq d_{{\bar{A}}}d_{B}\cdot {\mathrm {Tr}}_{{\bar{A}} B}\left[ \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}\right) \right] , \end{aligned}$$

where $J^{{\mathcal {N}}}_{{\bar{A}} B}$ denotes the Choi matrix of ${\mathcal {N}}_{{\bar{A}}\rightarrow B}$.

Proof

Following [77], the channel distance $\Delta ({\mathcal {N}},M)$ can be written as

$$\begin{aligned} \Delta ({\mathcal {N}},M)=\min&\;\left\| Z_A \right\| _{\infty }\\ \mathrm {s.t.}&\;{\mathcal {D}}_{B\rightarrow {\bar{B}}},{\mathcal {E}}_{A\rightarrow {\bar{A}}}\;\text {quantum channels} \\&\;Z_{A{\bar{B}}}\succeq 0,\;Z_{A{\bar{B}}}\succeq d_A \cdot J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}-{\mathcal {I}}}_{ A {\bar{B}}}. \end{aligned}$$

We simplify

$$\begin{aligned} J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}-{\mathcal {I}}}_{ A {\bar{B}}}&= J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}}_{ A {\bar{B}}} - J^{{\mathcal {I}}}_{ A {\bar{B}}} = J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}}_{ A {\bar{B}}} - \Phi _{ A {\bar{B}}}, \end{aligned}$$

write for the infinity norm $\left\| Z_A \right\| _{\infty }=\min \left\{ \lambda \in {\mathbb {R}}: \lambda \cdot 1_A \succeq Z_A\right\} $, and relabel $\frac{Z_{A{\bar{B}}}}{d_A}$ as $Z_{A{\bar{B}}}$, leading to

$$\begin{aligned} \Delta ({\mathcal {N}},M)=\min&\;\lambda \nonumber \\ \mathrm {s.t.}&\;{\mathcal {D}}_{B\rightarrow {\bar{B}}},{\mathcal {E}}_{A\rightarrow {\bar{A}}}\;\text {quantum channels}\nonumber \\&\;Z_{A{\bar{B}}}\succeq 0, \frac{\lambda }{d_A} \cdot 1_A \succeq Z_A \nonumber \\&\; Z_{A{\bar{B}}} + \Phi _{ A {\bar{B}}} \succeq J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}}_{A{\bar{B}}}. \end{aligned}$$

(22)

Following [53] and in particular [76, Equation 7], we have the Choi state

$$\begin{aligned} J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}}_{ A {\bar{B}}}=d_{{\bar{A}}}d_{B}\cdot {\mathrm {Tr}}_{{\bar{A}} B}\left[ \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) \left( J^{{\mathcal {E}}}_{ A {\bar{A}}}\otimes J^{{\mathcal {D}}}_{ B {\bar{B}}}\right) \right] \end{aligned}$$

and writing $J^{{\mathcal {E}}}_{ A {\bar{A}}} = E_{A{\bar{A}}}$ as well as $J^{{\mathcal {D}}}_{ B {\bar{B}}} = D_{B{\bar{B}}} $ concludes the proof. $\square $

C.2. Hierarchy of lower bounds

Similarly as in Sect. 4.4, we define a hierarchy of semidefinite programs labelled by an index n. Our framework directly applies as the structure of the optimization problem derived in Lemma C.2 involves the tensor product $E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}$. The n-th level of the SDP hierarchy then generates the lower bounds ${\mathrm {SDP}}^\Delta _n({\mathcal {N}},M)$ for the distance $\Delta ({\mathcal {N}},M)$ as

$$\begin{aligned} {\mathrm {SDP}}^\Delta _n({\mathcal {N}},M):=\min&\;\lambda \\ \mathrm {s.t.}&\; W_{A{\bar{A}}(B{\bar{B}})_1^n}\succeq 0,\;{\mathrm {Tr}}\left[ W_{A{\bar{A}}(B{\bar{B}})_1^n}\right] =1\\&\;W_{A{\bar{A}}(B{\bar{B}})_1^n}={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi \left( W_{A{\bar{A}}(B{\bar{B}})_1^n}\right) \;\forall \pi \in {\mathfrak {S}}_n\\&\;W_{A(B{\bar{B}})_1^n}=\frac{1_A}{d_A}\otimes W_{(B{\bar{B}})_1^n},\;W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}B_n}\\ {}&\;=W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B}\\&\; Z_{A{\bar{B}}}\succeq 0,\;\frac{\lambda }{d_A}\cdot 1_A\succeq Z_A\\&\; Z_{A{\bar{B}}}+\Phi _{A{\bar{B}}}\succeq d_{{\bar{A}}}d_{B}\cdot {\mathrm {Tr}}_{{\bar{A}} B}\left[ \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) W_{A{\bar{A}}B{\bar{B}}}\right] . \end{aligned}$$

We can also add PPT constraints and denote the resulting relaxations by ${\mathrm {SDP}}^\Delta _{n,{\mathrm {PPT}}}({\mathcal {N}},M)$. The following theorem states the convergence of the hierarchy.

Theorem C.3

Let ${\mathcal {N}}$ be a quantum channel and $n,M\in {\mathbb {N}}$. Then, we have

$$\begin{aligned} 0\le \Delta ({\mathcal {N}},M)-{\mathrm {SDP}}^\Delta _{n}({\mathcal {N}},M)\le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}\quad \text {implying}\quad \Delta ({\mathcal {N}},M)=\lim _{n\rightarrow \infty }{\mathrm {SDP}}^\Delta _{n}({\mathcal {N}},M), \end{aligned}$$

where $d=\max \{d_A,d_{{\bar{A}}},d_B,d_{{\bar{B}}}\}$.

Proof

The bound $0\le \Delta ({\mathcal {N}},M)-{\mathrm {SDP}}^\Delta _{n}({\mathcal {N}},M)$ holds by construction, so it remains to prove the upper bound. First, note that, again applying (22), we can write

$$\begin{aligned} {\mathrm {SDP}}^\Delta _n({\mathcal {N}},M)=\min&\;\frac{1}{2}\left\| {\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}-{\mathcal {I}}_{A\rightarrow {\bar{B}}}\right\| _{\Diamond } \\ \mathrm {s.t.}&\; W_{A{\bar{A}}(B{\bar{B}})_1^n}\succeq 0,\;{\mathrm {Tr}}\left[ W_{A{\bar{A}}(B{\bar{B}})_1^n}\right] =1\\&\;W_{A{\bar{A}}(B{\bar{B}})_1^n}={\mathcal {U}}_{(B{\bar{B}})_1^n}^\pi \left( W_{A{\bar{A}}(B{\bar{B}})_1^n}\right) \;\forall \pi \in {\mathfrak {S}}_n\\&\;W_{A(B{\bar{B}})_1^n}=\frac{1_A}{d_A}\otimes W_{(B{\bar{B}})_1^n},\;W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}B_n}\\ {}&\;=W_{A{\bar{A}}(B{\bar{B}})_1^{n-1}}\otimes \frac{1_{B_n}}{d_B} \end{aligned}$$

with the quantum channel ${\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}$ defined via its Choi state

$$\begin{aligned} J^{{\mathcal {W}}({\mathcal {N}})}_{ A {\bar{B}}}:=d_{{\bar{A}}}d_{B}\cdot {\mathrm {Tr}}_{{\bar{A}} B}\left[ \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) W_{A{\bar{A}} B {\bar{B}}}\right] . \end{aligned}$$

Second, using the de Finetti Theorem 2.3 we get that for every feasible Choi state $W_{A{\bar{A}}{(B{\bar{B}})}_1^n}$ in ${\mathrm {SDP}}^\Delta _n({\mathcal {N}},M)$, there exists a feasible Choi state $E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}$ in $\Delta ({\mathcal {N}},M)$ from Lemma C.2, such that

$$\begin{aligned} \left\| E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}-W_{A{\bar{A}} B {\bar{B}}}\right\| _1\le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}. \end{aligned}$$

Third, employing the triangle inequality for the diamond norm we have

$$\begin{aligned}&\left\| {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}-{\mathcal {I}}_{A\rightarrow {\bar{B}}}\right\| _{\Diamond }-\left\| {\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}-{\mathcal {I}}_{A\rightarrow {\bar{B}}}\right\| _{\Diamond }\nonumber \\&\quad \le \left\| {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}-{\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}\right\| _{\Diamond }. \end{aligned}$$

(23)

Fourth, relating the trace norm distance of Choi states to the diamond norm distance of quantum channels [73, Lemma 7], we have

$$\begin{aligned} \left\| {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}-{\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}\right\| _{\Diamond } \le d_A\cdot \left\| J^{{\mathcal {D}}\circ {\mathcal {N}}\circ {\mathcal {E}}}_{ A {\bar{B}}}-J^{{\mathcal {W}}({\mathcal {N}})}_{ A {\bar{B}}}\right\| _{1} \end{aligned}$$

and, by the monotonicity of the trace norm under partial trace together with Hölder’s inequality, this bounds the right-hand side of (23) as

$$\begin{aligned}&\left\| {\mathcal {D}}_{B\rightarrow {\bar{B}}}\circ {\mathcal {N}}_{{\bar{A}}\rightarrow B}\circ {\mathcal {E}}_{A\rightarrow {\bar{A}}}-{\mathcal {W}}({\mathcal {N}})_{A\rightarrow {\bar{B}}}\right\| _{\Diamond }\\&\quad \le d_Ad_{{\bar{A}}}d_{B}\cdot \left\| {\mathrm {Tr}}_{{\bar{A}} B}\left[ \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) \left( E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}-W_{A{\bar{A}} B {\bar{B}}}\right) \right] \right\| _{1}\\&\quad \le d_Ad_{{\bar{A}}}d_{B}\cdot \left\| \left( 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right) (E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}-W_{A{\bar{A}} B {\bar{B}}})\right\| _{1} \\&\quad \le d_Ad_{{\bar{A}}}d_{B}\cdot \left\| 1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}\right\| _{\infty } \left\| E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}- W_{A{\bar{A}} B {\bar{B}}}\right\| _{1} \\&\quad \le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}\quad \text {with }d=\max \{d_A,d_{{\bar{A}}},d_B,d_{{\bar{B}}}\}. \end{aligned}$$
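The two norm inequalities used in the middle of this chain, monotonicity of the trace norm under partial trace and Hölder’s inequality $\Vert XY\Vert _1\le \Vert X\Vert _\infty \Vert Y\Vert _1$, can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: the random Hermitian matrices `x` and `y` are only stand-ins for $1_{A} \otimes J^{{\mathcal {N}}}_{{\bar{A}} B}\otimes 1_{{\bar{B}}}$ and $E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}-W_{A{\bar{A}} B {\bar{B}}}$, the toy bipartition (kept system, traced-out system) replaces $A{\bar{B}}$ and ${\bar{A}}B$, and all function names are hypothetical.

```python
import numpy as np


def trace_norm(x):
    """Schatten 1-norm: sum of the singular values."""
    return np.linalg.svd(x, compute_uv=False).sum()


def operator_norm(x):
    """Schatten infinity-norm: largest singular value."""
    return np.linalg.svd(x, compute_uv=False).max()


def trace_out_second(x, d_keep, d_out):
    """Partial trace over the second tensor factor of an operator on C^d_keep (x) C^d_out."""
    return np.einsum("ikjk->ij", x.reshape(d_keep, d_out, d_keep, d_out))


def random_hermitian(dim, rng):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (g + g.conj().T) / 2


def check_chain(d_keep=3, d_out=4, seed=0):
    """Return the three quantities of the chain ||Tr_2[XY]||_1 <= ||XY||_1 <= ||X||_inf ||Y||_1."""
    rng = np.random.default_rng(seed)
    dim = d_keep * d_out
    x = random_hermitian(dim, rng)  # stand-in for the fixed operator built from J^N
    y = random_hermitian(dim, rng)  # stand-in for the difference E (x) D - W
    lhs = trace_norm(trace_out_second(x @ y, d_keep, d_out))
    mid = trace_norm(x @ y)
    rhs = operator_norm(x) * trace_norm(y)
    return lhs, mid, rhs


if __name__ == "__main__":
    print(check_chain())
```

The three returned numbers should be non-decreasing for every seed, which mirrors the second and third inequality of the display above up to the dimension prefactor.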

Finally, optimising in (23) over all feasible Choi states $W_{A{\bar{A}}{(B{\bar{B}})}_1^n}$ and then optimising over all feasible Choi states $E_{A{\bar{A}}}\otimes D_{B{\bar{B}}}$, we get the claimed upper bound

$$\begin{aligned} \Delta ({\mathcal {N}},M)-{\mathrm {SDP}}^\Delta _{n}({\mathcal {N}},M)\le \frac{{\mathrm {poly}}(d)}{\sqrt{n}}. \end{aligned}$$

$\square $

Numerically, we have found that for the qubit depolarizing channel the first level of our hierarchy already gives the exact optimal value

$$\begin{aligned} \Delta (Dep_2,2)={\mathrm {SDP}}^\Delta _{1,{\mathrm {PPT}}}(Dep_2,2), \end{aligned}$$

which coincides with $1-F(Dep_2,2)$. That is, for the qubit depolarizing channel the average and worst case error criteria become the same.

Appendix D: Distortion with side information

The following lemma shows that if the A system is not measured, then the loss in distinguishability after applying a measurement on the B system can be bounded independently of $d_A$.

Lemma D.1

Consider a state two-design on B, i.e., a set of rank-one projectors $\{P_z\}_{z \in \{1, \dots , t\}}$ such that $\frac{1}{t} \sum _{z=1}^t P_z \otimes P_z = \frac{2 P^{\mathrm {sym}}}{d_B (d_B+1)}$, where $P^{\mathrm {sym}}$ denotes the projector onto the symmetric subspace of $B \otimes B$. Let ${\mathcal {M}}_B$ be the measurement defined as

$$\begin{aligned} {\mathcal {M}}_B(X) = \sum _{z} \frac{d_B}{t} \cdot {\mathrm {Tr}}\big [P_z X\big ] |z\rangle \langle z|, \end{aligned}$$

and $\xi _{AB}$ be a Hermitian matrix on $A\otimes B$. Then, we have that

$$\begin{aligned} \Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \Vert _1\ge \frac{1}{d_B^2 (d_B+1)} \Vert \xi _{AB} \Vert _{1}. \end{aligned}$$

We note that the existence of such two-designs is known for any dimension, see e.g., [68, Corollary 5.3] for unitary two-designs and applying these unitaries to any fixed state leads to a state two-design.
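As an illustration of Lemma D.1, the following minimal sketch checks the two-design condition and the claimed inequality numerically for $d_B=2$. It assumes one particular design, namely the six eigenstates of the Pauli operators, which is a choice made here for illustration and is not taken from the text; the function names and the random test matrices are likewise hypothetical.

```python
import numpy as np


def projector(vec):
    """Rank-one projector onto the normalised vector vec."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())


# Assumed design: eigenstates of the Pauli Z, X and Y operators (unnormalised).
SIX_STATES = [(1, 0), (0, 1), (1, 1), (1, -1), (1, 1j), (1, -1j)]
DESIGN = [projector(v) for v in SIX_STATES]


def swap_operator(d):
    """Swap operator F on C^d (x) C^d."""
    f = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            f[i * d + j, j * d + i] = 1.0
    return f


def design_defect(design):
    """Largest entrywise deviation of (1/t) sum_z P_z (x) P_z from 2 P_sym / (d (d + 1))."""
    d = design[0].shape[0]
    average = sum(np.kron(p, p) for p in design) / len(design)
    target = (np.eye(d * d) + swap_operator(d)) / (d * (d + 1))
    return np.abs(average - target).max()


def trace_norm(x):
    return np.linalg.svd(x, compute_uv=False).sum()


def measured_trace_norm(xi, d_a, design):
    """Trace norm of (I_A (x) M_B)(xi); the output is block diagonal in the outcome z."""
    d_b, t = design[0].shape[0], len(design)
    xi4 = xi.reshape(d_a, d_b, d_a, d_b)
    total = 0.0
    for p in design:
        block = np.einsum("ij,ajbi->ab", p, xi4)  # Tr_B[(1_A (x) P_z) xi]
        total += d_b / t * trace_norm(block)
    return total


def random_hermitian(dim, rng):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (g + g.conj().T) / 2


if __name__ == "__main__":
    xi = random_hermitian(3 * 2, np.random.default_rng(0))
    print(design_defect(DESIGN), measured_trace_norm(xi, 3, DESIGN), trace_norm(xi) / 12)
```

For $d_B=2$ the lemma predicts that the measured trace norm is at least $\Vert \xi _{AB}\Vert _1/12$, independently of $d_A$; since ${\mathcal {M}}_B$ is a measurement, it is also at most $\Vert \xi _{AB}\Vert _1$.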

Proof

For any full rank quantum state $\sigma _{A}$, we have by a Hölder type inequality for $\sigma $-weighted Schatten norms that (see, e.g., [58] or [6])

$$\begin{aligned} \Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \Vert _1 \ge \frac{\left\| \sigma _{A}^{-1/4} ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \sigma _{A}^{-1/4} \right\| ^2_{2}}{\left\| \sigma _{A}^{-1/2} ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \sigma _{A}^{-1/2} \right\| _{\infty }}. \end{aligned}$$

For example, the above inequality can be obtained using [6, Corollary 3] with the operator $\sigma _{A}^{-1/2} ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \sigma _{A}^{-1/2}$, weight $\sigma $, and $p_\theta = 2$, $\theta = 1/2$, $p_0 = 1$, $p_1 = \infty $. We note that this particular Hölder type inequality for $\sigma $-weighted norms is elementary and follows easily from the usual Hölder inequality, but one way of potentially improving the dimension dependence in Lemma D.1 might be to use another Hölder inequality, in particular the (1, 4) inequality.

Henceforth, we abbreviate $d\equiv d_B$. To further bound the numerator, letting ${\tilde{\xi }}_{AB}:=\sigma _{A}^{-1/4} \xi _{AB} \sigma _{A}^{-1/4}$ we get

$$\begin{aligned} \Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)({\tilde{\xi }}_{AB}) \Vert ^2_{2}&= \left\| \sum _{z} \frac{d}{t} |z\rangle \langle z| \otimes {\mathrm {Tr}}_{B}\left[ (1_{A} \otimes P_z) {\tilde{\xi }}_{AB}\right] \right\| _2^2 \\&= \sum _{z} \frac{d^2}{t^2} {\mathrm {Tr}}\left[ {\mathrm {Tr}}_{B}\left[ (1_{A} \otimes P_z) {\tilde{\xi }}_{AB}\right] \otimes {\mathrm {Tr}}_{\bar{B}}\left[ (1_{\bar{A}} \otimes P_z) {\tilde{\xi }}_{\bar{A}\bar{B}}\right] ^{\dagger } F_{A\bar{A}} \right] \\&= \sum _{z} \frac{d^2}{t^2} {\mathrm {Tr}}\left[ \left( \left( (1_{A} \otimes P_z) {\tilde{\xi }}_{AB}\right) \otimes \left( (1_{\bar{A}} \otimes P_z) {\tilde{\xi }}_{\bar{A}\bar{B}}\right) ^{\dagger }\right) \left( F_{A\bar{A}} \otimes 1_{B\bar{B}}\right) \right] \\&= \frac{d^2}{t^2} {\mathrm {Tr}}\left[ \left( {\tilde{\xi }}_{AB} \otimes {\tilde{\xi }}_{\bar{A}\bar{B}}^{\dagger }\right) \left( \sum _{z} (1_{A\bar{A}} \otimes P_z \otimes P_{z})\right) \left( F_{A\bar{A}} \otimes 1_{B \bar{B}}\right) \right] \\&= \frac{1}{t} \frac{d^2}{d(d+1)} {\mathrm {Tr}}\left[ \left( {\tilde{\xi }}_{AB} \otimes {\tilde{\xi }}_{\bar{A}\bar{B}}^{\dagger }\right) (1_{A\bar{A}} \otimes (1_{B\bar{B}} + F_{B\bar{B}}))\left( F_{A\bar{A}} \otimes 1_{B \bar{B}}\right) \right] \\&= \frac{1}{t} \frac{d^2}{d(d+1)} \Big (\underbrace{{\mathrm {Tr}}\left[ {\tilde{\xi }}_{A} {\tilde{\xi }}_{A}^{\dagger }\right] }_{\ge 0} + \underbrace{{\mathrm {Tr}}\left[ {\tilde{\xi }}_{AB} {\tilde{\xi }}_{AB}^{\dagger }\right] }_{=\left\| {\tilde{\xi }}_{AB}\right\| _2^2} \Big ) \\&\ge \frac{1}{t}\frac{d^2}{d^2 (d+1)} \Vert \xi _{AB} \Vert ^2_{1}, \end{aligned}$$

where F denotes the swap operator (as defined in Sect. 2.1) and in the last step we used the Hölder inequality (see, e.g., [10])

$$\begin{aligned} \left\| \xi _{AB}\right\| _1&=\left\| \sigma ^{1/4}\sigma ^{-1/4}\xi _{AB}\sigma ^{-1/4}\sigma ^{1/4} \right\| _1\le \left\| \sigma ^{1/4}\otimes 1_B\right\| _4\left\| \sigma _A^{-1/4}\xi _{AB}\sigma _A^{-1/4} \right\| _2\left\| \sigma ^{1/4}\otimes 1_B\right\| _4\\&\qquad \le \sqrt{d}\left\| {\tilde{\xi }}_{AB}\right\| _2. \end{aligned}$$
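The chain of equalities for the numerator can be checked numerically. The sketch below compares the directly computed squared 2-norm of the measured operator with the closed form $\frac{1}{t}\frac{d^2}{d(d+1)}\big ({\mathrm {Tr}}[{\tilde{\xi }}_{A}{\tilde{\xi }}_{A}^{\dagger }]+{\mathrm {Tr}}[{\tilde{\xi }}_{AB}{\tilde{\xi }}_{AB}^{\dagger }]\big )$ for $d=2$. It assumes the six Pauli eigenstates as the two-design, which is an illustrative choice not taken from the text, and it uses a random Hermitian matrix in place of ${\tilde{\xi }}_{AB}$; all names are hypothetical.

```python
import numpy as np


def projector(vec):
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())


# Assumed qubit two-design: eigenstates of the Pauli Z, X and Y operators.
DESIGN = [projector(v) for v in [(1, 0), (0, 1), (1, 1), (1, -1), (1, 1j), (1, -1j)]]


def measured_two_norm_sq(xi, d_a, design):
    """Squared Schatten 2-norm of (I_A (x) M_B)(xi), computed block by block."""
    d_b, t = design[0].shape[0], len(design)
    xi4 = xi.reshape(d_a, d_b, d_a, d_b)
    total = 0.0
    for p in design:
        block = np.einsum("ij,ajbi->ab", p, xi4)  # Tr_B[(1_A (x) P_z) xi]
        total += (d_b / t) ** 2 * np.linalg.norm(block) ** 2  # Frobenius norm
    return total


def closed_form(xi, d_a, d_b, t):
    """(1/t) d^2 / (d (d + 1)) (Tr[xi_A xi_A^dagger] + Tr[xi xi^dagger]) with d = d_b."""
    xi_a = np.einsum("ikjk->ij", xi.reshape(d_a, d_b, d_a, d_b))
    return d_b / (t * (d_b + 1)) * (np.linalg.norm(xi_a) ** 2 + np.linalg.norm(xi) ** 2)


def random_hermitian(dim, rng):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (g + g.conj().T) / 2


if __name__ == "__main__":
    xi = random_hermitian(3 * 2, np.random.default_rng(1))
    print(measured_two_norm_sq(xi, 3, DESIGN), closed_form(xi, 3, 2, len(DESIGN)))
```

Both printed values should agree up to floating point error, since the identity only uses the two-design property of the projectors.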

For further bounding the denominator we write

$$\begin{aligned} \left\| \sigma _{A}^{-1/2} ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \sigma _{A}^{-1/2} \right\| _\infty&= \max _{z} \frac{d}{t} \left\| {\mathrm {Tr}}_{B}\left[ (1_{A} \otimes P_z) \sigma _{A}^{-1/2} \xi _{AB} \sigma _{A}^{-1/2}\right] \right\| _\infty \\&\le \frac{d}{t} \max _{\vert \phi \rangle _{A}, \vert \psi \rangle _{B}} \langle \phi \vert _{A} \otimes \langle \psi \vert _{B} \sigma _{A}^{-1/2} \xi _{AB} \sigma _{A}^{-1/2} \vert \phi \rangle _{A} \otimes \vert \psi \rangle _{B} \\&\le \frac{d}{t} \left\| \sigma _{A}^{-1/2} \xi _{AB} \sigma _{A}^{-1/2} \right\| _\infty , \end{aligned}$$

where we used the fact that $P_z$ is a rank 1 projector. Now, observe that for any $\xi _{AB}$, there exists a $\sigma _A$ of unit trace such that

$$\begin{aligned} \frac{\sqrt{\xi _{AB} \xi _{AB}^{\dagger }}}{\Vert \xi _{AB} \Vert _{1}} \preceq d\cdot \sigma _{A} \otimes 1_{B}. \end{aligned}$$

This just follows from, e.g., [8, Lemma B.6], where it is shown that we can in fact choose^{Footnote 15}

$$\begin{aligned} \sigma _{A} = \Vert \xi _{AB} \Vert _{1}^{-1}\cdot {\mathrm {Tr}}_{B}\left[ \sqrt{\xi _{AB} \xi _{AB}^{\dagger }}\right] . \end{aligned}$$

As a result, we have

$$\begin{aligned} \sqrt{\xi _{AB} \xi _{AB}^{\dagger }} \preceq d\Vert \xi _{AB} \Vert _{1}\cdot \sigma _{A} \otimes 1_{B}. \end{aligned}$$

As $\xi _{AB}$ is Hermitian, we can decompose it into its positive and negative parts, $\xi _{AB} = P - Q$ with P and Q positive semidefinite and $PQ = 0$. Then $\sqrt{\xi _{AB} \xi _{AB}^{\dagger }} = P + Q$ and so $-\sqrt{\xi _{AB} \xi _{AB}^{\dagger }} \preceq \xi _{AB} \preceq \sqrt{\xi _{AB} \xi _{AB}^{\dagger }}$. Thus, we get

$$\begin{aligned} - d_{B} \Vert \xi _{AB} \Vert _1\cdot \sigma _{A} \otimes 1_{B} \preceq \xi _{AB} \preceq d_{B} \Vert \xi _{AB} \Vert _1 \cdot \sigma _{A} \otimes 1_{B}, \end{aligned}$$

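and we find $\Vert \sigma _{A}^{-1/2} \xi _{AB} \sigma _{A}^{-1/2} \Vert _{\infty } \le d\Vert \xi _{AB} \Vert _1$. Inserting the lower bound $\frac{1}{t(d+1)}\Vert \xi _{AB} \Vert _1^2$ on the numerator and the resulting upper bound $\frac{d^2}{t}\Vert \xi _{AB} \Vert _1$ on the denominator into the Hölder type inequality gives $\Vert ({\mathcal {I}}_{A} \otimes {\mathcal {M}}_B)(\xi _{AB}) \Vert _1\ge \frac{1}{d^2 (d+1)} \Vert \xi _{AB} \Vert _{1}$, which concludes the proof. $\square $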
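The choice $\sigma _{A} = \Vert \xi _{AB} \Vert _{1}^{-1}\cdot {\mathrm {Tr}}_{B}\big [\sqrt{\xi _{AB} \xi _{AB}^{\dagger }}\big ]$ and the two operator bounds derived from it can be illustrated numerically. The sketch below is only a sanity check on random Hermitian inputs, for which $\sigma _A$ is generically full rank; the helper names and the dimensions $d_A=3$, $d_B=2$ are assumptions made for the example.

```python
import numpy as np


def abs_operator(xi):
    """|xi| = sqrt(xi xi^dagger), equal to P + Q for Hermitian xi = P - Q."""
    w, v = np.linalg.eigh(xi)
    return (v * np.abs(w)) @ v.conj().T


def sigma_from(xi, d_a, d_b):
    """sigma_A = Tr_B[|xi|] / ||xi||_1, a unit-trace positive semidefinite operator on A."""
    mod = abs_operator(xi)
    reduced = np.einsum("ikjk->ij", mod.reshape(d_a, d_b, d_a, d_b))
    return reduced / np.trace(mod).real


def inverse_sqrt(sigma):
    """sigma^(-1/2) for a full-rank positive definite sigma."""
    w, v = np.linalg.eigh(sigma)
    return (v / np.sqrt(w)) @ v.conj().T


def check_bounds(xi, d_a, d_b):
    """Return (min eigenvalue of d_B ||xi||_1 sigma_A (x) 1_B - |xi|, weighted norm, d_B ||xi||_1)."""
    mod = abs_operator(xi)
    one_norm = np.trace(mod).real
    sigma = sigma_from(xi, d_a, d_b)
    gap = d_b * one_norm * np.kron(sigma, np.eye(d_b)) - mod
    min_gap = np.linalg.eigvalsh(gap).min()
    s = np.kron(inverse_sqrt(sigma), np.eye(d_b))
    weighted = np.linalg.norm(s @ xi @ s, 2)  # largest singular value
    return min_gap, weighted, d_b * one_norm


def random_hermitian(dim, rng):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (g + g.conj().T) / 2


if __name__ == "__main__":
    xi = random_hermitian(3 * 2, np.random.default_rng(2))
    print(check_bounds(xi, 3, 2))
```

The first returned value should be non-negative up to rounding, and the second should not exceed the third, matching the operator inequality and the bound on the denominator.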

Appendix E: Missing proofs

In the following we give the proofs omitted in the main discussion.

Proof

The lower bound is trivial and the upper bounds follow directly from the more general statements about the optimal fidelity under additional classical communication assistance as given in Lemma A.4. $\square $

Proof

The lower bound is trivial. By the monotonicity in n (Theorem 4.8), it is enough to restrict to $n=1$ for the upper bounds.^{Footnote 16} As in the proof of Lemma A.4 we mostly use that for any sub-normalized bipartite quantum state $\rho _{XY}$ we have $d_X\cdot 1_X\otimes \rho _Y\succeq \rho _{XY}$. For the first upper bound we find $\frac{d_{{\bar{B}}}}{d_B}\cdot W_{A{\bar{A}}}\otimes 1_{B_1{\bar{B}}_1}\succeq W_{A{\bar{A}}B_1{\bar{B}}_1}$, which gives for the objective function

$$\begin{aligned} {\mathrm {SDP}}_1({\mathcal {N}},M)&\le d_{{\bar{A}}}d_B\cdot {\mathrm {Tr}}\left[ \left( J^{\mathcal {N}}_{{\bar{A}}B_1}\otimes \Phi _{A{\bar{B}}_1}\right) \left( \frac{d_{{\bar{B}}}}{d_B}\cdot W_{A{\bar{A}}}\otimes 1_{B_1{\bar{B}}_1}\right) \right] \\&=d_{{\bar{A}}}d_{{\bar{B}}}\cdot {\mathrm {Tr}}\left[ \left( \frac{1_A}{d_A}\otimes \frac{1_{{\bar{A}}}}{d_{{\bar{A}}}}\right) W_{A{\bar{A}}}\right] ={\mathrm {Tr}}\left[ W_{A{\bar{A}}}\right] =1. \end{aligned}$$

For the second upper bound we find, in the same way as for the first, $\frac{d_{{\bar{A}}}}{d_A}\cdot 1_{A{\bar{A}}}\otimes W_{B_1{\bar{B}}_1}\succeq W_{A{\bar{A}}B_1{\bar{B}}_1}$, which then leads to the claim by the same argument as for the second upper bound in Lemma 4.3. $\square $

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Berta, M., Borderi, F., Fawzi, O. et al. Semidefinite programming hierarchies for constrained bilinear optimization. Math. Program. 194, 781–829 (2022). https://doi.org/10.1007/s10107-021-01650-1

Received: 26 March 2020
Accepted: 23 March 2021
Published: 15 April 2021
Issue Date: July 2022
DOI: https://doi.org/10.1007/s10107-021-01650-1
