Negative Quasi-Probability as a Resource for Quantum Computation

A central problem in quantum information is to determine the minimal physical resources that are required for quantum computational speedup and, in particular, for fault-tolerant quantum computation. We establish a remarkable connection between the potential for quantum speed-up and the onset of negative values in a distinguished quasi-probability representation, a discrete analog of the Wigner function for quantum systems of odd dimension. This connection allows us to resolve an open question on the existence of bound states for magic-state distillation: we prove that there exist mixed states outside the convex hull of stabilizer states that cannot be distilled to non-stabilizer target states using stabilizer operations. We also provide an efficient simulation protocol for Clifford circuits that extends to a large class of mixed states, including bound universal states.

While it is widely believed that quantum computers can solve certain problems with exponentially fewer resources than their classical counterparts, the scope of the physical resources of the underlying quantum systems that enable universal quantum computation is not well understood. For example, for the standard circuit model of quantum computation, Vidal has shown that high-entanglement is necessary for an exponential speed-up [43]; however, it is also known that access to high-entanglement is not sufficient [23]. Moreover, in alternative models of quantum computation such as DQC1 [31], algorithms that may be performed on highly-mixed input states appear to be more powerful than classical computation even though there appears to be a negligible amount of entanglement in the underlying quantum system [12]. This suggests that large amounts of entanglement, purity or even coherence may not be necessary resources for quantum-computational speed-up. One of the central open problems of quantum information is to understand which sets of quantum resources are jointly necessary and sufficient to enable an exponential speed-up over classical computation. Any solution to this important problem may point to more practical experimental means of achieving the benefits of quantum computation.
The question of whether a restricted subset of quantum theory is still sufficient for a given task is meaningful when there is a specific context that divides the full set of possible quantum operations into two classes: the restricted subset of operations that are accessible or easy to implement and the remainder that are not. In such a context it is then natural to consider the difficult operations as resources and ask how much, if any, of these resources are required. For example, a common paradigm in quantum communication is that of two or more spatially separated parties for which local quantum operations and classical communication define a restricted set of operations that are accessible or "free resources", whereas joint quantum operations are not free; in this context entanglement is the natural resource for quantum communication. Here we are interested not in quantum communication but in the power of quantum computation, and in particular the practically relevant case of fault-tolerant quantum computation. Transversal unitary gates, i.e., gates that do not spread errors within each code block, play a critical role in fault-tolerant quantum computation. Recent theoretical work has shown that a set of quantum gates which is both universal and transversal, and hence fault-tolerant [10,14,47], does not exist. That is, any scheme for fault tolerant quantum computation divides quantum operations into two classes: those with a fault-tolerant implementation (these are the "free resources") and the remainder (these are not free but are required to achieve universality). For a fixed fault tolerant scheme the critical question is: what are necessary and sufficient physical resources to promote fault-tolerant computation to universal quantum computation?
Most of the best known fault tolerant schemes are built around the well-known stabilizer formalism [24], in which a distinguished set of preparations, measurements, and unitary transformations (the "stabilizer operations") have a fault tolerant implementation. Stabilizer operations also arise naturally in some physical systems with topological order [13,32,34]. As described above, the transversal set of stabilizer operations does not give a universal gate set and must be supplemented with some additional (non-stabilizer) resource. A celebrated scheme for overcoming this limitation is the magic state model of quantum computation devised by Bravyi and Kitaev [3], where the additional resource is a set of ancilla systems prepared in some (generally noisy) non-stabilizer quantum state. Hence, in this important paradigm, the question of which physical resources are required for universal fault-tolerant quantum computation reduces to the following: which non-stabilizer states are necessary and sufficient to promote stabilizer computation to universal quantum computation?
In this paper we identify a non-trivial closed, convex subset of the space of quantum states which we prove is incapable of producing universal fault-tolerant quantum computation. In particular, we show that this convex subset strictly contains the convex hull of stabilizer states, and thereby prove that there exists a class of bound universal states, i.e. states that can not be prepared from convex combinations of stabilizer states and yet are not useful for quantum computation. Thus our proof of the existence of bound universal states resolves in the negative the open problem raised by Bravyi and Kitaev [3] of whether all non-stabilizer states promote stabilizer computation to universal quantum computation. Furthermore, we devise an efficient simulation algorithm for the subset of quantum theory that consists of operations from the stabilizer formalism acting on inputs from our non-universal region, which includes mixed states both inside and outside the convex hull of stabilizer states. This simulation scheme is an extension of the celebrated Gottesman-Knill theorem [1,23] to a broader class of input states and should be of independent interest.
arXiv:1201.1256v4 [quant-ph] 20 Aug 2012
Our theoretical method for proving these results is to construct a classical, local hidden variable model for the subtheory of quantum theory that consists of the stabilizer formalism and then determine the scope of additional quantum resources that are also described by this model. Indeed our local hidden variable model is a distinguished quasi-probability representation with non-negative elements. For a d-dimensional quantum system there are many possible ways to represent arbitrary quantum states as quasi-probability distributions over a phase space of $d^2$ points and projective measurements as conditional quasi-probability distributions over the same space (see [19,20] for further details). Perhaps unsurprisingly, it has been shown that the full quantum theory can not be represented with non-negative elements in any such representation [16,19,20,40]. However, one might expect that a subtheory of quantum theory that is inadequate for quantum speed-up might be represented non-negatively, i.e. as a true probability theory, in some natural choice of quasi-probability representation. For the context described above, we seek a quasi-probability representation reflecting our natural operational restriction. In particular, we require that stabilizer states and projective measurements onto stabilizer states have non-negative representation and that unitary stabilizer operations (i.e., Clifford transformations) correspond to stochastic processes. Conveniently, for quantum systems with odd Hilbert space dimension such a representation is already known to exist: this is the discrete Wigner function picked out by Gross [25,26] from the broad class defined by Gibbons et al [22]. In such a representation it is natural to examine whether the resources that are necessary or sufficient for quantum speed-up correspond to those that are not represented by non-negative elements of the representation.
With this insight in hand the results of this paper may now be stated more carefully:
Classically efficient simulation of positive Wigner functions: The set of fault tolerant quantum logic gates in the stabilizer formalism are known as the Clifford gates. Our first contribution is an explicit simulation protocol for quantum circuits composed of Clifford gates acting on input states with positive discrete Wigner representation. We also allow arbitrary product measurements with positive discrete Wigner representation. This simulation is efficient (linear) in the number of input registers to the quantum circuit. This simulation scheme is an extension of the celebrated Gottesman-Knill theorem and should be of independent interest.
Negativity is necessary for magic state distillation: This simulation protocol implies that states outside the stabilizer formalism with positive discrete Wigner function (bound universal states) are not useful for magic state distillation. The second contribution of this paper is to give a direct proof of this fact exploiting only the observation that negative discrete Wigner representation can not be created by stabilizer operations. This proof has a more general range of applicability than the efficient simulation scheme and also makes clear the conceptual importance of negative quasi-probability as a resource for stabilizer computation.
Geometry of positive Wigner functions: The set of quantum states with positive discrete Wigner function strictly contains the set of (convex combinations of) stabilizer states. To prove this fact we determine the geometry of the region of quantum state space with positive discrete Wigner representation. Concretely, we show that the facets of the classical probability simplex defining the discrete Wigner function are also facets of the polytope with the (pure) stabilizer states as its vertices. Since there are many more facets of the stabilizer polytope than of the simplex this suffices to show the existence of non-stabilizer states with positive representation.
The paper concludes with a discussion of this work and some avenues for future exploration.

Previous Work
The Gottesman-Knill theorem provides an efficient classical simulation protocol for circuits of Clifford unitaries acting on stabilizer states. This result deals with pure qubit stabilizer state inputs and simulates the evolution of the full quantum state. The simulation scheme of the present paper deals with odd dimensional systems, makes no distinction between mixed state and pure state input, and allows the simulation of a large class of non-stabilizer states. However, our scheme constructs a classical circuit with the same outcome probabilities as the quantum circuit and does not recover the evolution of the full quantum state.
A number of papers have addressed the question of which ancilla states enable universal quantum computation for the magic state model in qubit systems [6-8, 35-38, 42]. The most directly comparable result is the demonstration by Campbell and Browne [8] that for any protocol on the input $\rho^{\otimes n}$ there exists a $\rho$ outside the convex hull of stabilizer states that maps to a convex combination of stabilizers. As n grows these states are known to exist only within some arbitrarily small distance of the convex hull of stabilizer states. By contrast, the present result implies the existence of states a fixed distance from the hull which are not distillable by any protocol.
The present result is complementary to previous work connecting negativity in discrete Wigner function type representations to quantum computational speedup [11,21,42]. In particular, van Dam and Howard [41] have used techniques of this type to derive a bound on the amount of depolarizing noise a state can withstand before entering the stabilizer polytope. Their work deals only with prime dimensional systems, and in this case it turns out that the noise threshold they derive is the same as the amount of noise required for their "maximally robust" state to enter the region of positive states.

A. The Stabilizer Formalism
Known schemes for fault tolerant quantum computation allow for only a limited set of operations to be implemented directly on the encoded quantum information. For most known fault tolerance schemes this restricted set is the stabilizer operations consisting of preparation and measurement in the computational basis and a restricted set of unitary operations. This restricted set is the Clifford group, and we now review the important parts of its structure for qudit systems [27]. The primitive object about which the stabilizer formalism is built is the Heisenberg-Weyl group, a generalization of the qubit Pauli group to odd dimensional systems. We will define the Heisenberg-Weyl group in terms of generalized X and Z operators. In odd prime dimension d these are given by their action on computational basis states,
$$X|j\rangle = |j + 1 \bmod d\rangle, \qquad Z|j\rangle = \omega^{j}|j\rangle,$$
where $\omega = \exp(2\pi i/d)$ is a dth primitive root of unity. The Heisenberg-Weyl operators are the $d^2$ operators generated by X and Z, which have a group structure if we include phases. A general element of the group is given as
$$T_{(a_1,a_2)} = \omega^{-a_1 a_2/2}\, Z^{a_1} X^{a_2}, \qquad (a_1,a_2) \in \mathbb{Z}_d \times \mathbb{Z}_d,$$
where 1/2 in the exponent denotes the multiplicative inverse of 2 in $\mathbb{Z}_d$. Here $\mathbb{Z}_d$ is the finite field of d elements, and the space $\mathbb{Z}_d \times \mathbb{Z}_d$ will form the "phase space" underpinning the discrete Wigner function defined below.
If the dimension of $H_j$ is $d_j$ the vector $(j_1, j_2)$ is an element of $\mathbb{Z}_{d_j} \times \mathbb{Z}_{d_j}$ and the vector (a_1, …
The Clifford operators, $C_d$, are the set of unitary operators that map Heisenberg-Weyl operators to Heisenberg-Weyl operators under conjugation:
$$U \in C_d \iff \forall u\ \exists\, \phi, u' : \quad U T_u U^\dagger = e^{i\phi}\, T_{u'}.$$
This is the normalizer of the Heisenberg-Weyl group. This group has many interesting features. For instance, operations of the Clifford group on computational basis input states can be efficiently simulated even though they create large amounts of entanglement.
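The defining relations of this subsection can be checked numerically in a small case. The following is a minimal sketch for a single qutrit (d = 3), assuming the standard conventions X|j> = |j+1 mod d> and Z|j> = ω^j|j>. The helper names (`matmul`, `dagger`, `close`, `conjugate_by`) and the choice of the discrete Fourier transform as the example Clifford unitary are illustrative choices, not taken from the text.

```python
import cmath

d = 3                                   # odd prime dimension (a qutrit)
omega = cmath.exp(2j * cmath.pi / d)    # primitive d-th root of unity

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)] for i in range(d)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(d)] for i in range(d)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(d) for j in range(d))

# X|j> = |j+1 mod d> and Z|j> = omega^j |j>, written as matrices <i|.|j>.
X = [[1.0 if i == (j + 1) % d else 0.0 for j in range(d)] for i in range(d)]
Z = [[omega ** i if i == j else 0.0 for j in range(d)] for i in range(d)]

# Discrete Fourier transform, used here as an example Clifford unitary (assumed example).
U = [[omega ** (i * j) / d ** 0.5 for j in range(d)] for i in range(d)]

def conjugate_by(U, A):
    return matmul(matmul(U, A), dagger(U))

X_inv = dagger(X)

commutation_ok = close(matmul(Z, X), scale(omega, matmul(X, Z)))  # ZX = omega XZ
maps_X_to_Z = close(conjugate_by(U, X), Z)                        # U X U^dagger = Z
maps_Z_to_Xinv = close(conjugate_by(U, Z), X_inv)                 # U Z U^dagger = X^{-1}
print(commutation_ok, maps_X_to_Z, maps_Z_to_Xinv)
```

Under these conventions the Fourier gate sends the generators X and Z to Heisenberg-Weyl operators again, which is the Clifford condition restricted to the generators.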
For this paper it will be necessary to understand the Clifford group in terms of its representation over finite fields. Because it enormously simplifies the presentation we will restrict ourselves to working in the case where the system is composed of n subsystems with a common Hilbert space dimension p, a prime. However, many of the important results carry over for arbitrary odd dimensional systems [25]. The main result that we want is that a Clifford operation $U_{F,a} \in C_{p^n}$ can be specified by a pair (F, a). That is, any Clifford unitary is uniquely specified by its action on the Pauli group and this is given by the induced action on the label $u \in (\mathbb{Z}_p \times \mathbb{Z}_p)^n$. The matrix F is a 2n × 2n symplectic matrix with entries in $\mathbb{Z}_p$ and $a \in (\mathbb{Z}_p \times \mathbb{Z}_p)^n$. Clifford operations generally factor as
$$U_{F,a} = T_a U_F,$$
where $F \in \mathrm{Sp}(2n, \mathbb{Z}_p)$, the group of symplectic matrices of size 2n with entries in $\mathbb{Z}_p$. A matrix is symplectic if it preserves the symplectic product, which may be defined in a natural way on finite fields. This structure is not important for the present paper so we do not cover it in detail. What is important is that this representation of the Clifford group consists of a 2n × 2n matrix and a vector of length 2n, so its size is polynomial in n and it can be easily tracked by a classical computer; this is at the heart of simulation results about the Clifford group.
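To make the finite-field description concrete, the sketch below checks symplecticity and applies the phase-space update u → Fu + a using integer arithmetic modulo p. It assumes the ordering u = (u_11, u_12, ..., u_n1, u_n2) and the sign convention [u, v] = Σ_i (u_i1 v_i2 − u_i2 v_i1) for the symplectic product; the text does not fix either, and the function names are hypothetical.

```python
def symplectic_product(u, v, p):
    """[u, v] = sum_i (u_i1 * v_i2 - u_i2 * v_i1) mod p (assumed sign convention)."""
    return sum(u[i] * v[i + 1] - u[i + 1] * v[i] for i in range(0, len(u), 2)) % p

def apply_matrix(F, u, p):
    return [sum(F[i][j] * u[j] for j in range(len(u))) % p for i in range(len(F))]

def is_symplectic(F, p):
    """Check [Fe, Ff] = [e, f] on all pairs of basis vectors (enough by bilinearity)."""
    m = len(F)
    basis = [[1 if i == j else 0 for i in range(m)] for j in range(m)]
    return all(
        symplectic_product(apply_matrix(F, e, p), apply_matrix(F, f, p), p)
        == symplectic_product(e, f, p)
        for e in basis for f in basis
    )

def clifford_update(F, a, u, p):
    """Phase-space action u -> F u + a (mod p) associated with a Clifford labeled by (F, a)."""
    return [(x + y) % p for x, y in zip(apply_matrix(F, u, p), a)]

p = 3
F = [[0, 1], [2, 0]]          # single qutrit, determinant 1 mod 3
not_F = [[2, 0], [0, 1]]      # determinant 2 mod 3, so not symplectic
print(is_symplectic(F, p), is_symplectic(not_F, p))
print(clifford_update(F, [1, 0], [1, 2], p))
```

Only the 2n × 2n matrix and the length-2n vector are stored, which is what makes tracking a Clifford circuit classically cheap.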
Finally, we define stabilizer states to be any state that can be prepared by applying a Clifford unitary to a computational basis state. These states are important because they are the only pure states that can be prepared using our restricted operation set.

B. Magic State Distillation
It is possible to implement stabilizer operations fault tolerantly, but these operations do not suffice for universal quantum computation. To promote stabilizer computation to universal quantum computation some additional resource is required. This additional resource will be subject to large amounts of noise, so the question becomes: which non-stabilizer resources can be used to promote stabilizer computation and how can this be done? The first of these questions is the subject of this paper. The second question finds a particularly elegant solution in the form of magic state distillation [2,3,9].
Magic state distillation protocols aim to consume a large number of copies of a non-stabilizer qudit input state $\rho_{\mathrm{in}}$ to produce a single non-stabilizer qudit output state $\rho_{\mathrm{out}}$ with higher fidelity to some non-stabilizer pure state. This output state is then consumed to implement some non-Clifford unitary gate. These protocols have the following structure:
• Prepare a number of copies of the input state, $\rho_{\mathrm{in}}^{\otimes n}$.
• Perform some Clifford operation on $\rho_{\mathrm{in}}^{\otimes n}$.
• Measure Heisenberg-Weyl observables on the last n − 1 registers and post select on the outcome.
When these protocols succeed the first register will be the output state $\rho_{\mathrm{out}}$. Typically these protocols work iteratively, repeatedly consuming $\rho_{\mathrm{in}}^{\otimes n}$ until n copies of $\rho_{\mathrm{out}}$ have been produced and then using $\rho_{\mathrm{out}}^{\otimes n}$ as the input to the protocol to produce a further improved output state, and so on. The protocols we deal with here encompass but do not require this iterative structure.
It is not clear if for all input states $\rho_{\mathrm{in}}$ there exists a protocol to produce an output state with arbitrarily high fidelity to some non-stabilizer input state. We call states for which such a protocol exists "distillable", and one of our results is to show that not all non-stabilizer states are distillable.

C. Quasi-Probability Representations
There is a long history of studying negativity in quasi-probability representations. The most notable examples come from quantum optics where the Wigner function [30] and the Q and P functions [33] play prominent roles. However, such approaches typically suffer in significance due to the problem of non-uniqueness of the choice of representation. While a quantum state may correspond to a negative-valued quasi-probability function in one choice of quasi-probability representation, in another choice that same state can be positive, and hence a valid classical probability density. In Reference [20], two of us proved that for any choice of quasi-probability representation in which both quantum states and measurements are represented, at least some of the states and measurements must take on negative values. However, this result still leaves open the possibility that certain subsets of quantum states and measurements may be represented positively, leading to a classical probability model for the corresponding subset of quantum operations [45]. When the restricted subtheory is prescribed by an operational restriction, this question takes on a precise and relevant meaning. Indeed, such an approach has been considered already by Schack and Caves, who constructed classical probability models for few-qubit NMR experiments [39] and thereby stimulated an important discussion of what kinds of resources might be required for universal quantum computation.
Our approach is to exploit the freedom in choice of quasi-probability representations [18] in order to align the positive subtheory with the operational restriction defining the error-free resources in the magic-state model. We seek a quasi-probability representation for which stabilizer states and projective measurements onto stabilizer states have positive representation and for which stabilizer transformations correspond to stochastic processes. This first step is easy given that such a representation already exists; this distinguished representation is the discrete Wigner function picked out by Gross [25,26] from the broad class defined by Gibbons et al [22].

D. The discrete Wigner Function
Our approach is to represent Clifford operations as stochastic processes over a discrete phase space. Intuitively, if the dynamics of the quantum system admit a representation as a classical statistical process then it should not be sufficient for universal quantum computation. To that end, we look for a quasi-probability representation for quantum theory where stabilizer resources are represented positively. This is the discrete Wigner function.
The discrete Wigner representation of a state $\rho \in L(\mathbb{C}^{p^n})$ is a quasi-probability distribution over $\mathbb{Z}_d \times \mathbb{Z}_d$, which can be thought of as a d by d grid. This grid is the discrete analogue of the phase space of classical mechanics. The map taking quantum states to quasi-probability distributions on discrete phase space is uniquely specified by a set of phase space point operators $\{A_u\}$ (defined below). For each point u in the discrete phase space there is a corresponding operator $A_u$ and the value of the discrete Wigner representation of $\rho$ at this point is given as
$$W_\rho(u) = \frac{1}{d}\,\mathrm{Tr}(A_u \rho),$$
with d the dimension of the Hilbert space. A quantum measurement with POVM $\{E_k\}$ is represented by assigning conditional (quasi-)probability functions over the phase space to each measurement outcome,
$$W_{E_k}(u) = \mathrm{Tr}(A_u E_k).$$
In the case where $W_{E_k}(u) \geq 0\ \forall u$, this can be interpreted classically as the probability of getting outcome k given that the system is actually at point u, $W_{E_k}(u) = \Pr(\text{outcome } k \mid \text{location } u)$. If both $W_\rho(u)$ and $W_{E_k}(u)$ are positive then the law of total probability gives the probability of getting outcome k from a measurement of state $\rho$,
$$\Pr(k) = \sum_u W_\rho(u)\, W_{E_k}(u).$$
In fact this prediction reproduces the Born rule even when $W_\rho(u)$ or $W_{E_k}(u)$ take on negative values. We say a state $\rho$ has positive representation if $W_\rho(u) \geq 0\ \forall u \in \mathbb{Z}_d^n \times \mathbb{Z}_d^n$ and negative representation otherwise. We will say a measurement with POVM $M = \{E_k\}$ has positive representation if $W_{E_k}(u) \geq 0\ \forall u \in \mathbb{Z}_d^n \times \mathbb{Z}_d^n,\ \forall E_k \in M$ and negative representation otherwise.
The phase space point operators are defined in terms of the Heisenberg-Weyl operators as
$$A_0 = \frac{1}{d}\sum_u T_u, \qquad A_u = T_u A_0 T_u^\dagger.$$
These operators are Hermitian so the discrete Wigner representation is real-valued. There are $d^2$ such operators for d-dimensional Hilbert space; they are informationally complete and orthogonal in the sense that $\mathrm{Tr}(A_u A_v) = d\,\delta(u, v)$.
2. Clifford operators have the action $U_{F,a} A_u U_{F,a}^\dagger = A_{Fu+a}$. This means that
$$W_{U_{F,a}\,\rho\,U_{F,a}^\dagger}(Fu + a) = W_\rho(u),$$
so that Clifford transformations map to permutations of the underlying phase space and, in particular, Clifford operations preserve positive representation.
3. For $\rho = \sum_u p_u A_u$ and $\sigma = \sum_u q_u A_u$ the trace inner product is $\mathrm{Tr}(\rho\sigma) = d \sum_u p_u q_u$;
4. The phase space point operators in dimension $d^n$ are tensor products of n copies of the d dimension phase space point operators.
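The construction above can be illustrated for a single qutrit. The following sketch assumes the conventions T_(a1,a2) = ω^(−a1 a2/2) Z^a1 X^a2 (with 1/2 the inverse of 2 modulo d), A_0 = (1/d) Σ_u T_u, A_u = T_u A_0 T_u†, and W_ρ(u) = Tr(A_u ρ)/d. The helper names and the two example states are choices made here for illustration only.

```python
import cmath
from itertools import product

d = 3
omega = cmath.exp(2j * cmath.pi / d)
half = (d + 1) // 2                      # multiplicative inverse of 2 modulo odd d
points = list(product(range(d), repeat=2))

def mat(f):
    return [[complex(f(i, j)) for j in range(d)] for i in range(d)]

def mul(A, B):
    return mat(lambda i, j: sum(A[i][k] * B[k][j] for k in range(d)))

def dagger(A):
    return mat(lambda i, j: A[j][i].conjugate())

def trace(A):
    return sum(A[i][i] for i in range(d))

def T(a1, a2):
    # Assumed convention: T = omega^(-a1*a2/2) Z^a1 X^a2, with Z^a1 X^a2 |j> = omega^(a1*(j+a2)) |j+a2>.
    phase = omega ** ((-a1 * a2 * half) % d)
    return mat(lambda i, j: phase * omega ** (a1 * i) if i == (j + a2) % d else 0)

A0 = mat(lambda i, j: sum(T(a1, a2)[i][j] for a1, a2 in points) / d)

def A(u):
    Tu = T(*u)
    return mul(mul(Tu, A0), dagger(Tu))

def wigner(rho):
    # W_rho(u) = Tr(A_u rho) / d at every phase-space point u.
    return {u: (trace(mul(A(u), rho)) / d).real for u in points}

def pure_state(psi):
    norm = sum(abs(c) ** 2 for c in psi)
    return mat(lambda i, j: psi[i] * complex(psi[j]).conjugate() / norm)

W_zero = wigner(pure_state([1, 0, 0]))     # stabilizer state |0>
W_odd = wigner(pure_state([0, 1, -1]))     # (|1> - |2>)/sqrt(2), an example non-stabilizer state

print(min(W_zero.values()), min(W_odd.values()))
```

With these conventions the basis state |0> is represented by a uniform distribution on the line of points (a1, 0), while the antisymmetric state takes a negative value at the origin, so it is not positively represented.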

II. NEGATIVE DISCRETE WIGNER REPRESENTATION IS NECESSARY FOR COMPUTATIONAL SPEEDUP
We now establish that any quantum computation consisting of stabilizer operations acting on product input states with positive representation can not produce an exponential computational speed-up. To this end we give an explicit efficient classical simulation protocol for such circuits. Like the Gottesman-Knill protocol our scheme allows for the simulation of pure state stabilizer inputs to circuits composed of Clifford transformations and stabilizer measurements. However, our simulation scheme extends the Gottesman-Knill result in several ways. First, it applies to systems of qudits rather than qubits. Second, it applies to mixed state inputs. Third, and most remarkably, it applies to some non-stabilizer resources, namely those with positive discrete Wigner representation.
Any particular run of a quantum algorithm on n registers will produce a string k of n measurement outcomes. These outcomes occur at random and we assign the random variable $K_{\mathrm{quant}}$ to be the algorithm output. The algorithm can then be considered as a way of sampling outcomes according to the distribution $\Pr(K_{\mathrm{quant}} = k)$. To simulate a quantum algorithm it suffices to give a simulating algorithm which samples from the distribution $\Pr(K_{\mathrm{quant}} = k)$, which is what we do here. Notice that this form of simulation does not allow us to actually infer the distribution of outcomes, but it does suffice for many important tasks (for example, estimating the expected outcome).
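The algorithms we treat here take the following form (see Figure 1 for an example):

Algorithm Class 1: Family of Simulable Quantum Algorithms
Algorithms in this class sample strings of measurement outcomes k according to the distribution $\Pr(K_{\mathrm{quant}} = k)$ determined by the Born rule.
1. Prepare an n qudit product input state $\rho_1 \otimes \cdots \otimes \rho_n$, where each $\rho_i$ has positive representation.
2. Until all registers have been measured:
(a) Apply a Clifford unitary gate $U_F$, labeled by the symplectic transformation $F \in \mathrm{Sp}(2n, d)$. Notice that there is no loss of generality in considering only symplectic Clifford transformations as the Heisenberg-Weyl component can be rolled into the measurement.
(b) Measure the last unmeasured register using a measurement M with corresponding POVM $\{E_k\}$ that has positive representation, and record the outcome. Subsequent steps may be conditioned on this outcome.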
The essential idea for the simulation is to take seriously the hidden variable model the restrictions allow us. In the discrete Wigner picture the system begins at point u in the discrete phase space, which is unknown but definite and fixed. The effect of $U_F$ is to move the system from the point u to the point Fu, and measurement amounts to checking some region of the phase space to see if it contains the system. Since the vector u and matrix F are size 2n with entries from $\mathbb{Z}_d$ it is computationally efficient to classically store and update the system's location. Of course, a (positively represented) quantum state corresponds to a probability density over the space so we must treat this a little more carefully. The simulation protocol is:

Algorithm Class 2: Classical Simulation Algorithm
Algorithms in this class sample strings of measurement outcomes k according to the distribution $\Pr(K_{\mathrm{class}} = k)$.
1. Sample $u \in \mathbb{F}_d^{2n}$ according to the distribution $W_{\rho_1 \otimes \cdots \otimes \rho_n}(u) = W_{\rho_1}(u_1)\, W_{\rho_2}(u_2) \cdots W_{\rho_n}(u_n)$.
2. Repeat until all registers have been measured:
(a) If the unitary $U_F$ is applied then update $u \to Fu$.
(b) If the measurement M with corresponding POVM $\{E_k\}$ is made on the last register of the quantum circuit then report outcome k with probability $W_{E_k}(u_m)$, where $u_m$ is the ontic position of the last qudit system, defined by $u = u_1 \oplus u_2 \oplus \cdots \oplus u_m$. If the quantum algorithm conditions further steps on the outcome of measurement on this register then condition further steps of the simulation on measurement outcome k.
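The classical sampling procedure can be sketched in a few lines. The code below is a simplified, non-adaptive illustration and not the authors' implementation: it assumes each measurement has d outcomes, that the circuit is given as a fixed list of steps, and that the basis state |j> is represented by a uniform distribution on the points (a1, j), which holds for one common choice of conventions. The names `simulate`, `basis_state`, and `basis_measurement` are hypothetical.

```python
import random

def sample_point(W, rng):
    """Draw one phase-space point from a non-negative Wigner function {point: probability}."""
    pts = list(W)
    return rng.choices(pts, weights=[W[q] for q in pts])[0]

def simulate(wigners, steps, d, rng):
    """Return the list of outcomes in the order measured (last register first)."""
    # Step 1: sample u = u_1 (+) ... (+) u_n from the product distribution.
    u = [c for W in wigners for c in sample_point(W, rng)]
    m = len(wigners)                      # registers 1..m are still unmeasured
    outcomes = []
    for kind, arg in steps:
        if kind == "gate":                # step 2(a): u -> F u (mod d)
            u = [sum(row[j] * u[j] for j in range(len(u))) % d for row in arg]
        elif kind == "measure":           # step 2(b): measure register m
            u_m = (u[2 * m - 2], u[2 * m - 1])
            probs = [arg(k, u_m) for k in range(d)]   # W_{E_k}(u_m) for each outcome k
            outcomes.append(rng.choices(range(d), weights=probs)[0])
            m -= 1
    return outcomes

d = 3

def basis_state(j):
    # Assumed convention: the Wigner function of |j> is uniform on the line {(a1, j)}.
    return {(a1, a2): (1 / d if a2 == j else 0.0) for a1 in range(d) for a2 in range(d)}

def basis_measurement(k, point):
    # Computational-basis effect under the same convention: 1 if a2 == k, else 0.
    return 1.0 if point[1] == k else 0.0

rng = random.Random(0)
no_gate = simulate([basis_state(0), basis_state(1)],
                   [("measure", basis_measurement), ("measure", basis_measurement)], d, rng)
shear = [[1, 0], [1, 1]]                  # symplectic for one qutrit (determinant 1)
after_gate = simulate([basis_state(0)], [("gate", shear), ("measure", basis_measurement)], d, rng)
print(no_gate, after_gate)
```

Without gates the two basis-state inputs reproduce their labels deterministically, while the shear spreads the definite coordinate over all d values, so the single-qutrit outcome becomes uniformly random. Adaptive circuits would require choosing later steps from earlier outcomes, which this sketch omits.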
Our claim is that the classical algorithm in Algorithm Class 2 efficiently simulates the corresponding quantum algorithm in Algorithm Class 1. More precisely,
Theorem 1. An n qudit quantum algorithm belonging to Algorithm Class 1 is simulable by the corresponding 2n dit classical algorithm in Algorithm Class 2 in the sense that the distribution of outcomes k is the same for both algorithms, $\Pr(K_{\mathrm{class}} = k) = \Pr(K_{\mathrm{quant}} = k)$.
Proof. The input to the classical circuit is a 2n dit string and the transformations are all matrices of size 2n with entries in Z d so the 2n dit portion of the claim is obvious.
To show that this protocol genuinely simulates the circuit it suffices to show any string of measurement outcomes $k = (k_1 k_2 \ldots k_n)$ occurs with the same probability for both the original circuit and the simulation. Let us first consider the probability distribution $\Pr(k_n)$ of the outcomes of the first measurement. In the quantum circuit the preparation $\rho_1 \otimes \cdots \otimes \rho_n$ is passed to the (possibly identity) gate $U_F$ and measurement $M_n$ with corresponding POVM $\{E_{k_n}\}$ is applied to the nth register. The probability of getting outcome $k_n$ is then
$$\Pr\nolimits_{\mathrm{quant}}(k_n) = \mathrm{Tr}\!\left[U_F (\rho_1 \otimes \cdots \otimes \rho_n) U_F^\dagger \,(I \otimes \cdots \otimes I \otimes E_{k_n})\right] = \sum_v W_{\rho_1 \otimes \cdots \otimes \rho_n}(v)\, W_{I \otimes \cdots \otimes I \otimes E_{k_n}}(Fv),$$
where we have recast the inner product into the discrete Wigner form for convenience of comparison. We must now establish that the classical circuit has the same distribution. Classically, if the system is initially at point v on the discrete phase space then the probability of getting outcome $k_n$ from the simulation circuit is given by
$$\Pr\nolimits_{\mathrm{class}}(k_n \mid v \text{ sampled initially}) = \Pr\nolimits_{\mathrm{class}}(k_n \mid Fv \text{ final location}) = W_{I \otimes \cdots \otimes I \otimes E_{k_n}}(Fv).$$
This just says that the system is moved from point v to point Fv and the probability of outcome $k_n$ is the probability we see the system when we look at the region of phase space measured by $E_{k_n}$, which is $W_{I \otimes \cdots \otimes I \otimes E_{k_n}}(Fv)$ by definition. The total probability of outcome $k_n$ is then
$$\Pr\nolimits_{\mathrm{class}}(k_n) = \sum_v \Pr\nolimits_{\mathrm{class}}(k_n \mid v \text{ sampled initially})\,\Pr(v \text{ sampled initially}) = \sum_v W_{I \otimes \cdots \otimes I \otimes E_{k_n}}(Fv)\, W_{\rho_1 \otimes \cdots \otimes \rho_n}(v).$$
Comparing the distribution of measurement outcomes on the last register for the quantum circuit with the simulated distribution of measurement outcomes on the last register, we see they are the same. If the quantum algorithm is independent of the measurement outcomes then simply applying the above argument to each register would suffice to complete the proof. However, in general adaptive schemes are possible, such as the algorithm illustrated in Figure 1 where the final gate applied depends on the outcome of the measurement on the third qudit. Using the assumption that the registers are measured from last to first we can factor the distribution of outcome strings as
$$\Pr(k) = \Pr(k_1 \mid k_2 \ldots k_n)\,\Pr(k_2 \mid k_3 \ldots k_n) \cdots \Pr(k_{n-1} \mid k_n)\,\Pr(k_n).$$
Since the simulation conditions on measurement outcome in exactly the same way as the original quantum algorithm a simple inductive argument shows that the distribution of outcomes must be the same for the quantum algorithm and its classical simulator.
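The equality of the quantum and classical outcome distributions can be checked exactly in a small case. The sketch below does so for one qutrit, with the Fourier gate as the Clifford unitary and a mixture of two stabilizer states as the positively represented input. It assumes the conventions T_(a1,a2) = ω^(−a1 a2/2) Z^a1 X^a2 and A_u = T_u A_0 T_u†, and under those conventions takes F = [[0, 1], [−1, 0]] as the symplectic label of the Fourier gate; all names are illustrative.

```python
import cmath
from itertools import product

d = 3
omega = cmath.exp(2j * cmath.pi / d)
half = (d + 1) // 2                       # inverse of 2 modulo odd d
points = list(product(range(d), repeat=2))

def mat(f):
    return [[complex(f(i, j)) for j in range(d)] for i in range(d)]

def mul(A, B):
    return mat(lambda i, j: sum(A[i][k] * B[k][j] for k in range(d)))

def dagger(A):
    return mat(lambda i, j: A[j][i].conjugate())

def trace(A):
    return sum(A[i][i] for i in range(d))

def T(a1, a2):
    # Assumed convention: T = omega^(-a1*a2/2) Z^a1 X^a2.
    phase = omega ** ((-a1 * a2 * half) % d)
    return mat(lambda i, j: phase * omega ** (a1 * i) if i == (j + a2) % d else 0)

A0 = mat(lambda i, j: sum(T(a1, a2)[i][j] for a1, a2 in points) / d)

def A(u):
    return mul(mul(T(*u), A0), dagger(T(*u)))

def projector(psi):
    norm = sum(abs(c) ** 2 for c in psi)
    return mat(lambda i, j: psi[i] * complex(psi[j]).conjugate() / norm)

# Positively represented input: a mixture of the stabilizer states |+> and |0>.
plus, zero = projector([1, 1, 1]), projector([1, 0, 0])
rho = mat(lambda i, j: 0.7 * plus[i][j] + 0.3 * zero[i][j])

U = mat(lambda i, j: omega ** (i * j) / d ** 0.5)   # Fourier gate (a Clifford unitary)
F = [[0, 1], [d - 1, 0]]                            # its assumed phase-space action u -> F u

def apply_F(u):
    return tuple(sum(F[i][j] * u[j] for j in range(2)) % d for i in range(2))

effects = [projector([1 if j == k else 0 for j in range(d)]) for k in range(d)]

# Quantum side: Born rule for measuring U rho U^dagger in the computational basis.
born = [trace(mul(mul(mul(U, rho), dagger(U)), E)).real for E in effects]

# Classical side: sum_v W_rho(v) W_E(F v), with W_rho = Tr(A rho)/d and W_E = Tr(A E).
classical = [
    sum((trace(mul(A(v), rho)) / d).real * trace(mul(A(apply_F(v)), E)).real for v in points)
    for E in effects
]
print(born, classical)
```

The two lists agree up to floating-point error, which is the single-register instance of the theorem. The check compares exact probabilities rather than sampled frequencies, so it involves no statistical error.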
Corollary 2. Quantum algorithms belonging to Algorithm Class 1 offer no super linear advantage over classical computation.
Proof. We have seen that if it is computationally efficient (linear in the number of qudits) to sample from the classical distributions corresponding to the input state and the measurements then such quantum circuits are efficiently simulable. Since we have assumed separability of the input and measurements, and the discrete Wigner function factors, this efficient sampling is guaranteed.
A couple of remarks are in order. We have restricted ourselves to separable inputs and measurements, but this is not strictly necessary for efficient simulation. Any positively represented preparation or measurement can be accommodated provided it is possible to classically efficiently sample from the corresponding distribution. Since it is exponentially difficult to even write down general quantum states this is a rather strong restriction.
Also, notice that our simulation protocol only samples from the output distribution of a circuit whereas the Gottesman-Knill protocol gives the full quantum state output in the case where the input is a pure stabilizer state. It may appear that the present protocol is weaker in this respect. However, the discrete Wigner functions of pure stabilizer states are uniformly valued lines on the phase space [25] and these are fully specified by only two points. If we are promised the input state to a Clifford circuit is a stabilizer state then we can sample two distinct points from the corresponding distribution and determine where the circuit maps them. These two output points then suffice to fix the line corresponding to the output stabilizer state.
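The two-point reconstruction can be illustrated for a single qudit of prime dimension, where a line is the set of points u + t(v − u) with t in Z_d. The sketch below assumes that the input stabilizer state is supported on the line of points (a1, 0), which is true of |0> for one common choice of conventions; the function names are hypothetical.

```python
def line_through(u, v, d):
    """All points u + t (v - u), t in Z_d, on the single-qudit phase space Z_d x Z_d."""
    return {((u[0] + t * (v[0] - u[0])) % d, (u[1] + t * (v[1] - u[1])) % d) for t in range(d)}

def apply_F(F, u, d):
    return tuple(sum(F[i][j] * u[j] for j in range(2)) % d for i in range(2))

d = 3
F = [[0, 1], [2, 0]]                                # a symplectic matrix over Z_3
input_line = {(a1, 0) for a1 in range(d)}           # assumed support of the input state
u, v = (0, 0), (1, 0)                               # two distinct points sampled from it

# Track only the two sampled points through the circuit, then rebuild the line.
output_line = line_through(apply_F(F, u, d), apply_F(F, v, d), d)

# Reference: push every point of the input line through F.
full_image = {apply_F(F, w, d) for w in input_line}
print(output_line == full_image)
```

Because F is linear and invertible it maps lines to lines, so two distinct image points determine the output support. In practice one would resample until the two sampled points differ.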
Finally, we note that, in the context of magic state distillation for example, one may increase the size of the input register conditional on measurement outcomes. This can be accounted for in the simulation protocol above by simply increasing the size of the phase space accordingly and sampling from the new additional positive Wigner functions.

III. MAGIC STATE DISTILLATION
The main significance of the simulation result just established is that for noisy preparation and measurement it is possible to extend the efficient simulation of quantum circuits beyond the purview of the stabilizer formalism. This result is of major theoretical importance, but it also has practical significance. In particular, the simulation scheme addresses the magic state model, which supplements error free stabilizer resources with high fidelity additional gates produced through the consumption of non-stabilizer ancillas. Recall that the backbone of this process is a distillation protocol that uses stabilizer resources applied to a large number of ancilla input states to produce a few highly pure non-stabilizer states. An immediate corollary of our simulation protocol is that states with positive discrete Wigner representation are not useful for computational speedup in the magic state model. Since this includes a large class of states outside the stabilizer formalism this offers a resolution to the long standing open problem of whether all non-stabilizer states promote stabilizer computation to universal quantum computation [4].
The class of algorithms in the previous section encompasses a large variety of magic state distillation protocols, but it is still conceptually unclear why states with positive discrete Wigner representation are not useful for magic state distillation. This is especially true since we typically think of the outcome of a magic state distillation routine as a quantum state, rather than a string of measurement outcomes as in the simulation protocol. If we did keep track of the full input distribution then we would be able to use it to reconstruct the quantum state output of a distillation procedure; unfortunately this is impossible to do efficiently, but a little thought shows it is not actually necessary. We do not need to know the final quantum state; we only need to know that it is not helpful for doing quantum computation. The simulation protocol makes it clear that if the quantum state that is put into the circuit has a Wigner function that is a genuine probability distribution then the quantum state that is output (which we measure) must also correspond to a genuine probability distribution. Inspired by this observation, and in view of the great importance of magic state protocols, we devote this section to a direct proof that negative discrete Wigner representation of the ancilla resource states is necessary for such states to be distillable (using stabilizer resources) to a non-stabilizer state of arbitrary purity.
The essential insight of the proof used here is that negative discrete Wigner representation is a resource that cannot be created using stabilizer operations; if the input state to a distillation protocol has no negativity in its discrete Wigner representation then the output will not have any either. This resource character is one of the major insights of the present work. Beyond its conceptual value, the alternative proof presented below also closes several loopholes and covers alternative models not addressed by the simulation protocol, which we discuss at the end of the section.
Conventional magic state protocols perform a Clifford unitary $U$ on the input state $\rho_{\mathrm{in}}^{\otimes n} \in L(\mathbb{C}^d \otimes (\mathbb{C}^d)^{\otimes (n-1)})$ and make a computational basis measurement on the final $n-1$ qudits, post-selecting on the $|0\rangle$ outcome. This outputs the state
$$\rho_{\mathrm{out}} = \frac{\mathrm{Tr}_{2,\dots,n}\left[\left(\mathbb{I} \otimes |0\rangle\langle 0|^{\otimes (n-1)}\right) U \rho_{\mathrm{in}}^{\otimes n} U^{\dagger}\right]}{\mathrm{Tr}\left[\left(\mathbb{I} \otimes |0\rangle\langle 0|^{\otimes (n-1)}\right) U \rho_{\mathrm{in}}^{\otimes n} U^{\dagger}\right]},$$
where the partial trace knocks out the ancilla systems and the normalization in the denominator just guarantees $\mathrm{Tr}(\rho_{\mathrm{out}}) = 1$. We examine significantly more general protocols: instead of requiring $n$ copies of a qudit input state we allow an arbitrary positively represented input state, in place of a Clifford unitary we allow any completely positive map which preserves the set of positively represented states, and in place of the computational basis measurement we allow any positively represented projective measurement. Since positive representation is convex this suffices to eliminate classical randomness as a potential loophole. It also rules out entangled stabilizer measurements as a loophole, since these are still positively represented. For convenience of presentation we define
$$F(\rho) = \min_{u} \mathrm{Tr}(A_u \rho),$$
which, up to a positive normalization factor, is the most negative value of the quasi-probability representation of $\rho$ over the phase space. If the input state to a distillation routine is positively represented (i.e. $F(\rho) \geq 0$) then its output is also positively represented:
Theorem 3. Let $\rho_{\mathrm{in}}$ be a density operator on an $n$-qudit Hilbert space such that $F(\rho_{\mathrm{in}}) \geq 0$. Let $\Lambda$ be a (completely positive) map on this space for which $F(\rho) \geq 0 \implies F(\Lambda(\rho)) \geq 0$. Let $P$ be a positively represented projector on the last $n-1$ qudits. If $\rho_{\mathrm{out}}$ is produced by acting on $\rho_{\mathrm{in}}$ with $\Lambda$ and post-selecting a measurement on the last $n-1$ qudits on outcome $P$, then $F(\rho_{\mathrm{out}}) \geq 0$.
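Proof. Since we can use Heisenberg-Weyl operations to cycle between phase point operators, without loss of generality we may take the most negative point of $\rho_{\mathrm{out}}$ to be the origin, so that
$$F(\rho_{\mathrm{out}}) = \mathrm{Tr}(A_0 \rho_{\mathrm{out}}) = \frac{\mathrm{Tr}\left[(A_0 \otimes P)\,\Lambda(\rho_{\mathrm{in}})\right]}{\mathrm{Tr}\left[(\mathbb{I} \otimes P)\,\Lambda(\rho_{\mathrm{in}})\right]}.$$
The denominator is always positive, so the sign of $F(\rho_{\mathrm{out}})$ is the sign of $\mathrm{Tr}\left[(A_0 \otimes P)\,\Lambda(\rho_{\mathrm{in}})\right]$. By assumption $F(\rho_{\mathrm{in}})$ is non-negative, and thus so is $F(\Lambda(\rho_{\mathrm{in}}))$. We write $P = \sum_v z_v A_v$ where, since $P$ is positively represented, $z_v \geq 0$ by the discrete Hudson's theorem. This gives us that
$$\mathrm{Tr}\left[(A_0 \otimes P)\,\Lambda(\rho_{\mathrm{in}})\right] = \sum_v z_v \,\mathrm{Tr}\left[(A_0 \otimes A_v)\,\Lambda(\rho_{\mathrm{in}})\right].$$
The non-negativity of $F(\Lambda(\rho_{\mathrm{in}}))$ implies, by definition, that $\mathrm{Tr}\left[(A_0 \otimes A_v)\,\Lambda(\rho_{\mathrm{in}})\right] \geq 0$ for every $v$, so it must be the case that $\mathrm{Tr}\left[(A_0 \otimes P)\,\Lambda(\rho_{\mathrm{in}})\right] \geq 0$, and this implies $F(\rho_{\mathrm{out}}) \geq 0$.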
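As a numerical sanity check of Theorem 3 in the smallest case, the following sketch runs one post-selected step on two qutrits. Several choices are assumptions made for illustration only: the quantity F is taken to be the minimum of Tr(A_u rho) over phase points, the phase point operators are built as displaced parity operators, the Clifford map is the SUM gate |j>|k> -> |j>|j+k>, and the input is a hypothetical noisy state that is positively represented but not a stabilizer state.

```python
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)            # X|k> = |k+1 mod d>
Z = np.diag(w ** np.arange(d))               # Z|k> = w^k |k>
PARITY = np.eye(d)[:, (-np.arange(d)) % d]   # A_0 (assumed): |k> -> |-k mod d>

def phase_point_operators():
    """All d^2 single-qutrit operators A_u = D_u A_0 D_u^dagger."""
    ops = []
    for a in range(d):
        for b in range(d):
            D = np.linalg.matrix_power(Z, a) @ np.linalg.matrix_power(X, b)
            ops.append(D @ PARITY @ D.conj().T)
    return ops

def F(rho):
    """Most negative phase-point value, F(rho) = min_u Tr(A_u rho)."""
    return min(np.trace(A @ rho).real for A in phase_point_operators())

# Hypothetical input: a negatively represented pure state mixed with enough
# white noise that every phase-point value becomes non-negative.
psi = np.array([0, 1, -1]) / np.sqrt(2)
rho = 0.2 * np.outer(psi, psi.conj()) + 0.8 * np.eye(d) / d

# SUM gate on two qutrits: |j>|k> -> |j>|j+k mod d>.
U = np.zeros((d * d, d * d))
for j in range(d):
    for k in range(d):
        U[j * d + (j + k) % d, j * d + k] = 1

# Apply the gate to rho (x) rho and post-select the second qutrit on |0>.
sigma = U @ np.kron(rho, rho) @ U.conj().T
proj = np.kron(np.eye(d), np.diag([1.0, 0.0, 0.0]))
unnormalised = (proj @ sigma).reshape(d, d, d, d)
reduced = np.einsum('ikjk->ij', unnormalised)   # partial trace over qutrit 2
rho_out = reduced / np.trace(reduced)

print("F(input)  =", F(rho))
print("F(output) =", F(rho_out))
```

Both printed values should be non-negative, in line with the theorem; the check covers only this one gate and input and is no substitute for the proof.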
This proof has a few technical merits over the simulation result of the previous section. Namely, it does not require the input state to the distillation protocol to be separable or otherwise efficiently sampleable, and the update dynamics are not limited to only Clifford unitary maps. Indeed it is not even necessary to have a description of the input state or the transformation; all that is required is the promise that $\Lambda(\rho_{\mathrm{in}})$ has positive discrete Wigner representation. Since many input states and channels do not admit any efficient representation, this means that there are resources that are provably useless for distillation even when an efficient simulation of the corresponding distillation process would be impossible.

IV. THE GEOMETRY OF POSITIVELY REPRESENTED STATES
The Gottesman-Knill theorem already establishes that the action of (qubit) Clifford unitary operations on stabilizer states is efficiently classically simulatable. Since the only pure states with positive discrete Wigner representation are stabilizer states it is natural to wonder if every positively represented state is a mixture of stabilizer states. As we have already alluded to, remarkably this is false. To establish this we will clarify the geometry of the region of state space which has positive representation and show that it strictly contains the set of mixtures of stabilizer states. Combined with the results of the previous sections this establishes our simulation protocol as an extension of Gottesman-Knill and proves the existence of bound states for magic state distillation, states which are not convex combinations of stabilizer states but which are nevertheless not distillable using perfect Clifford operations.
The set of convex combinations of stabilizer states is a convex polytope with the stabilizer states as vertices. Any polytope can be defined either in terms of its vertices or as a list of half space inequalities called facets. Intuitively, these correspond to the faces of three-dimensional polyhedra. We show that in power of prime dimension each of the $d^2$ phase space point operators defines a facet of the stabilizer polytope. These are only a proper subset of the facets of the stabilizer polytope, implying the existence of states with positive representation which are not convex combinations of stabilizer states. See Figure 2 for a cartoon capturing the intuition for this result.
The stabilizer polytope may be thought of as a bounded convex polytope living in $\mathbb{R}^{d^2-1}$, the space of $d$-dimensional mixed quantum states. A minimal half space description for a polytope in $\mathbb{R}^D$ is a finite set of bounding inequalities called facets $\{F_i, f_i\}$ with $F_i \in \mathbb{R}^D$ and $f_i \in \mathbb{R}$. A point $X \in \mathbb{R}^D$ is in the polytope if and only if $X \cdot F_i \leq f_i$ for all $i$. In the usual quantum state space the vectors $X$ of interest are density matrices, the inner product is the trace inner product, and facets may be defined as $\{\hat{A}_i, a_i\}$ where the $\hat{A}_i$ are Hermitian matrices and $\rho$ is in the polytope if and only if $\mathrm{Tr}(\rho \hat{A}_i) \leq a_i$ for all $i$.
The objective is to show that $\{-A_u, 0\}$ are facets of the polytope defined by the stabilizer state vertices.
It is possible to explicitly compute a facet description for a polytope given the vertex description, but the complexity of this computation scales polynomially in the number of vertices. Since the number of stabilizer states grows superexponentially with the number of qudits [25], the conversion is generally impractical. The analytic proof given here circumvents this issue. We also remark that the work of Cormick et al. [11] implies that the phase space point operators considered here are facets for the case of prime dimension.
Proof. To establish that a half space inequality for a polytope in $\mathbb{R}^D$ is a facet there are two requirements: every vertex must satisfy the inequality, and there must be a set of vertices saturating the inequality which span a space of dimension $D$ [48].
The requirement that all vertices satisfy the half space inequality is $\mathrm{Tr}(A_u S) \geq 0$ for every stabilizer state $S$, and this follows from the discrete Hudson's theorem.
We consider the stabilizer polytope as an object in $\mathbb{R}^{d^2-1}$ and look for a set of $d^2-1$ linearly independent vertices which satisfy $\mathrm{Tr}(A_u S) = 0$. Since we are restricting to power of prime dimension we may choose a complete set of mutually unbiased bases, comprising $d(d+1)$ states, from the full set of stabilizer states. Suppose more than $d+1$ states $V_i$ from this set satisfy $\mathrm{Tr}(V_i A_u) > 0$. Then a counting argument shows that there must be two distinct states $V_0, V_1$ belonging to the same orthonormal basis which satisfy this criterion. But then $\mathrm{Tr}(V_0 V_1) > 0$, because the two non-negative Wigner functions overlap at $u$, which contradicts the orthonormality. Thus at least $d(d+1) - (d+1) = d^2 - 1$ states in the mutually unbiased bases satisfy $\mathrm{Tr}(A_u V_i) = 0$. These are the required set of $d^2-1$ linearly independent vertices.
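The counting step can be checked numerically for a single qutrit. The sketch below assumes that the eigenbases of Z, X, XZ and XZ^2 form the complete set of d + 1 = 4 mutually unbiased stabilizer bases and that the phase point operator at the origin is the parity operator; both are conventions chosen for this illustration, and the variable names are hypothetical.

```python
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)            # X|k> = |k+1 mod d>
Z = np.diag(w ** np.arange(d))               # Z|k> = w^k |k>
A_0 = np.eye(d)[:, (-np.arange(d)) % d]      # assumed phase point operator at the origin

def mub_projectors():
    """Projectors onto the eigenvectors of Z, X, XZ, XZ^2 (12 states for d = 3)."""
    projs = []
    for op in (Z, X, X @ Z, X @ Z @ Z):
        _, vecs = np.linalg.eig(op)
        for k in range(d):
            v = vecs[:, k]
            projs.append(np.outer(v, v.conj()))
    return projs

states = mub_projectors()
overlaps = np.array([np.trace(A_0 @ V).real for V in states])

# Vertices strictly inside the half space versus vertices on the hyperplane.
n_positive = int(np.sum(overlaps > 0.5))
saturating = [V for V, t in zip(states, overlaps) if abs(t) < 1e-9]
n_saturating = len(saturating)

# Linear independence of the saturating vertices, viewed as real vectors.
vectors = np.array(
    [np.concatenate([V.real.ravel(), V.imag.ravel()]) for V in saturating]
)
rank = np.linalg.matrix_rank(vectors)

print(n_positive, n_saturating, rank)
```

For d = 3 one expects d + 1 = 4 vertices with strictly positive overlap and d^2 - 1 = 8 linearly independent vertices saturating the inequality, matching the count in the proof.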
The phase space point operators considered here give only a proper subset of the defining half space inequalities for the stabilizer polytope. This means that there are states that may not be written as a convex combination of stabilizer states which nevertheless satisfy $\mathrm{Tr}(A_u \rho) \geq 0$ for all phase space point operators. That is, there are positive states which are not in the convex hull of stabilizer states. These are bound states for magic state distillation. An explicit example of such a state for the qutrit is given in [25]. These regions can be visualized by taking two and three dimensional slices of the qutrit state space. Such slices are depicted in Figures 3 and 4.
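The d^2 inequalities Tr(A_u rho) >= 0 are straightforward to test numerically. The sketch below does so for a hypothetical one-parameter family of qutrit states, a negatively represented pure state mixed with white noise, again assuming displaced parity operators as the phase point operators. It checks only these d^2 half space inequalities; it does not decide whether a state lies inside the stabilizer polytope, so it does not by itself exhibit a bound state.

```python
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)            # X|k> = |k+1 mod d>
Z = np.diag(w ** np.arange(d))               # Z|k> = w^k |k>
PARITY = np.eye(d)[:, (-np.arange(d)) % d]   # assumed A_0: |k> -> |-k mod d>

def min_phase_point_value(rho):
    """Smallest value of Tr(A_u rho) over all d^2 phase points u."""
    values = []
    for a in range(d):
        for b in range(d):
            D = np.linalg.matrix_power(Z, a) @ np.linalg.matrix_power(X, b)
            A_u = D @ PARITY @ D.conj().T
            values.append(np.trace(A_u @ rho).real)
    return min(values)

# Hypothetical test family: rho(p) = p |psi><psi| + (1 - p) I/d.
psi = np.array([0, 1, -1]) / np.sqrt(2)

def noisy_state(p):
    return p * np.outer(psi, psi.conj()) + (1 - p) * np.eye(d) / d

for p in (0.0, 0.25, 0.3, 1.0):
    print(p, min_phase_point_value(noisy_state(p)))
```

For this particular family the minimum works out to (1 - 4p)/3, so the states satisfy all d^2 inequalities exactly when p <= 1/4.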

V. DISCUSSION AND CONCLUSIONS
We have shown that for systems of odd power of prime dimension a necessary condition for computational speedup using Clifford unitaries is negativity of the discrete Wigner representation of the inputs. This result is immediately relevant in the context of magic state distillation, where it shows that a necessary condition for distillability is negative representation of the ancilla preparation. We have also shown that the phase space point operators defining the discrete Wigner function correspond to a privileged set of facets of the stabilizer polytope. Together the two results imply the existence of non-stabilizer resources which do not promote Clifford computation to universal quantum computation; and in particular this establishes the existence of bound states for magic state distillation, or bound universal states.
We motivated the development of negative discrete Wigner representation by analogy to entanglement theory, with Clifford operations playing the role of local operations and classical communication and stabilizer states playing the role of separable states. It is known that there are slightly entangled mixed states that cannot be consumed by distillation routines to produce highly entangled states [28]. The non-stabilizer but positively represented quantum states are exactly analogous to these bound entangled states. Similarly, it is known that for pure states large amounts of entanglement are required for quantum computational speedup [44], but for mixed states this is still an open question. However, for negative discrete Wigner representation there is no relevant distinction between mixed states and pure states. Moreover, although it is not yet known whether every negatively represented state is distillable, we conjecture this to be the case.
The discrete Wigner function considered in this work is defined by analogy with the more familiar continuous variable Wigner function. It is natural to wonder if the results of this work extend to the infinite dimensional case. In the continuous case a pure state has positive Wigner representation if and only if it is a Gaussian state [29] (like stabilizer states in the discrete case) and a unitary evolution acts as a symplectic flow on the phase space if and only if it corresponds to a quadratic Hamiltonian [46] (like Clifford gates in the discrete case). A result of Bröcker and Werner [5] shows that there are mixed states that cannot be represented as probabilistic combinations of Gaussian states but that nevertheless have positive Wigner function. The natural question is then: is it possible to efficiently simulate quantum systems with quadratic Hamiltonian dynamics acting on non-Gaussian mixed states if these states have positive representation? The answer is yes, as shown by an explicit efficient simulation protocol for such systems [17]. This establishes that two of the main results of the present paper (efficient simulation and non-stabilizer mixed states with positive discrete Wigner representation) extend to the continuous case. It is interesting to ask if our third result, that negative discrete Wigner representation is a necessary resource for distillation, also has an analogue. That is, does the continuous variable case admit something analogous to the magic state model which allows noisy negatively represented states to be consumed in order to promote linear optics to full universal quantum computational power?
[Figure caption, beginning truncated] The remaining three values are allowed to vary and carve out regions depicted in the graphs. In each case, due to symmetries, there are $\binom{9}{6} = 84$ such slices which are identical (up to a relabeling of the axes). Note that the slice on the right does not cut through the stabilizer polytope but does contain a region of bound states. Also note that this slice contains one of the 9 states with a maximal negativity of 1/3 while the slice on the left, and those equivalent up to permutations, are the only ones which feature the maximally mixed state $(X, Y, Z) = 1/9$. See also Figure 4 for a 2-dimensional slice of the figure on the left.
Of course there remains a final detail that we have not yet addressed. The discrete Wigner function underpinning our analysis is only defined for odd dimensional systems; is it possible to find a similar construction for qubits? The discrete Wigner function used here has two crucial properties: Clifford operations are stochastic transformations on the underlying phase space and this phase space is separable. For qubit systems there is no known analogue. Indeed it is easy to see that a quasi-probability representation defined by any proper subset of the facets of the qubit stabilizer polytope in a fashion analogous to what has been done here will assign positive representation to some subset of the magic states. The construction of the discrete Wigner function, and the Clifford group, relies critically on the mathematics of finite fields, and it is well known that fields of characteristic 2 behave fundamentally differently than fields of any other characteristic. This fact is reflected in the theory of error correction where somewhat different protocols are required for dealing with bits and qubits than are used for dits and qudits. In the case of error correction, although qubits require more involved mathematics, the conceptual underpinnings are the same irrespective of the underlying dimensionality of the dits. It is not unreasonable to hope that a similar result will hold for qubits and that an appropriately modified mathematical strategy will preserve the conceptual insights and related technical results obtained in the qudit case. If this turns out not to be the case then understanding exactly why the model fails for qubits will undoubtedly provide deep insights into the workings of quantum theory and quantum computation.
The most interesting outstanding question raised by this work is whether the ability to prepare any state with negative discrete Wigner representation is sufficient to promote Clifford computation to universal quantum computation. In prime dimension the discrete Wigner construction is the unique choice of quasi-probability representation covariant under the action of Clifford operations [25], where the law of total probability is required to hold. On this basis we conjecture that the condition here is sufficient. From the work of Campbell, Anwar and Browne [9] it is already known that access to any non-stabilizer pure state (or equivalently any negatively represented pure state) suffices. If this conjecture is true then it implies an equivalence of two previously unrelated concepts of non-classicality, namely, quantum computational speedup and negative quasi-probability representation.