Quantum error-correcting codes and their geometries

This is an expository article aiming to introduce the reader to the underlying mathematics and geometry of quantum error correction. Information stored on quantum particles is subject to noise and interference from the environment. Quantum error-correcting codes allow the negation of these effects in order to successfully restore the original quantum information. We briefly describe the necessary quantum mechanical background to be able to understand how quantum error-correction works. We go on to construct quantum codes: firstly qubit stabilizer codes, then qubit non-stabilizer codes, and finally codes with a higher local dimension. We will delve into the geometry of these codes. This allows one to deduce the parameters of the code efficiently, deduce the inequivalence between codes that have the same parameters, and presents a useful tool in deducing the feasibility of certain parameters. We also include sections on quantum maximum distance separable codes and the quantum MacWilliams identities.

We have used various sources in the preparation of this article, principally Gottesman [6,7], Glynn et al. [5] and Ketkar et al. [12]. The most original parts of these notes are Section 4 and Section 6. Section 5 is based on Ketkar et al. [12] but massaged so that it appears as a straightforward generalisation of the qubit case of Section 2. Although the main results of Section 3 are from Glynn et al. [5], in a deviation from their approach we have chosen to prove these results without using the $\mathbb{F}_4$ trick, which we do not consider until later in Section 5.5. The interested reader is referred to the books by Sakurai [16] and Nielsen & Chuang [13] for standard treatments of quantum mechanics and quantum information theory, to the book by Haroche & Raimond [9] for a thorough treatment of current experiments in quantum mechanics, and to the book by Aaronson [1] for further connections to mathematics, computer science, physics, and philosophy. For those uninitiated in quantum mechanics or quantum computing, we strongly recommend the delightful mnemonic essay on quantum computing by Matuschak and Nielsen at https://quantum.country/qcvc.
1 Quantum codes

Introduction
A qubit is a two-state or two-level quantum-mechanical system. For example, the intrinsic angular momentum (spin) of an electron is such a system. It can only take two values when measured in an arbitrary spatial direction, say by measuring the electron's deflection when passing through an inhomogeneous magnetic field. The two corresponding spin states are commonly referred to as the "spin up" and "spin down" states with respect to that direction. Another example is the polarization of light.
Here the two states can be taken to be vertically and horizontally polarized light; another choice is light that is left circularly and right circularly polarized. In general, a continuum of different photon polarizations is possible. Yet only two distinct states are observed when, e.g., putting beamsplitters or polarization filters in the path of a light beam.
This raises the question: why are only ever two discrete values corresponding to two discrete states observed, if electrons and photons can take on a continuum of possible spin-directions or polarizations? The answer lies with what measurements on quantum systems reveal. It turns out that for a two-state quantum-mechanical system, any individual measurement can only ever reveal the answer to a binary question. In other words, the measurement indicates in which of two mutually exclusive states the qubit can be found after the measurement. Thus while qubits can take on a continuum of states and a continuum of measurements can be performed, only two-valued results can ever be obtained. Hence the notion of a qubit as a quantum bit. We will not dwell on the strangeness of quantum mechanics further; the interested reader is referred to discussions of the Stern-Gerlach and double-slit experiments such as those found in the books by Sakurai [16] and Haroche & Raimond [9].
In mathematical terms a qubit is represented by a unit vector in $\mathbb{C}^2$. The spin up and spin down states (or any other choice of a pair of physically completely distinguishable states) are represented by an orthonormal basis $|0\rangle$ and $|1\rangle$. The notation $|0\rangle$ is a shorthand for the vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle$ stands for $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. The two kets $|0\rangle$ and $|1\rangle$ are also known as the computational basis vectors.
Consider now the state
$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$
While $|\psi\rangle \in \mathbb{C}^2$ represents a physically unique state, it is, upon measurement in the spin-up/spin-down direction, found in either of these two directions with equal probability. Sometimes this situation is referred to as the system being "in two states simultaneously". A more accurate description is that the system is "in superposition of spin-up and spin-down", or in other words, the system is correctly described as a linear combination of spin-up and spin-down.
In general, the state of a qubit can be written as $|\alpha\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle$ with $\alpha_0, \alpha_1 \in \mathbb{C}$. As usual, $\bar{z}$ is the complex conjugate of the complex number $z$. When measured, the qubit is with probability $\alpha_0\bar{\alpha}_0$ found in state $|0\rangle$ ("spin-up") and with probability $\alpha_1\bar{\alpha}_1$ found in state $|1\rangle$ ("spin down"). Since the sum of these two probabilities must be one, we have that for a qubit
$$\alpha_0\bar{\alpha}_0 + \alpha_1\bar{\alpha}_1 = 1. \qquad (2)$$
The "ket" notation $|\alpha\rangle$ is used for a column vector, whilst the "bra" notation $\langle\alpha|$ is used for a row vector whose coordinates are the complex conjugates of the coordinates of $|\alpha\rangle$. Thus, the "bra" $\langle\alpha|$ is a linear form. The inner product or "bra-ket" on $\mathbb{C}^2$ is defined as $\langle\alpha|\beta\rangle = \bar{\alpha}_0\beta_0 + \bar{\alpha}_1\beta_1$.
The normalisation condition in Eq. (2) then reads as $\langle\alpha|\alpha\rangle = 1$, and qubits are represented by complex vectors in $\mathbb{C}^2$ of unit length.
The matrix is an example of a unitary transformation since Note that {|0 , |1 } is an orthonormal basis, so In matrix terms, the trace is equal to the sum of the elements on the principal diagonal.
The Pauli matrices,
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
are unitary linear transformations of $\mathbb{C}^2$ which form a basis for the space of $2 \times 2$ matrices. In general, any error affecting a single qubit, including those which are not unitary, can be written as a linear combination of the Pauli matrices. We sometimes denote $\sigma_0, \sigma_x, \sigma_y, \sigma_z$ simply as $I, X, Y, Z$ respectively. Note that the Pauli matrices are both unitary and Hermitian. They are also mutually orthogonal under the Hilbert-Schmidt inner product $\langle A, B\rangle = \mathrm{tr}(A^\dagger B)$. A measurement or observable is represented by a Hermitian operator. For example, the spin-up/spin-down measurement $\hat{\sigma}_z$ is represented by the Pauli matrix $\sigma_z$.
The outcome of an individual measurement can only take two values. These correspond to the eigenvalues of $\sigma_z$, which are $+1$ and $-1$. After the measurement, the state is then found in the corresponding eigenstate: in $|0\rangle$ if the outcome $+1$ was obtained, and in $|1\rangle$ if the outcome $-1$ was obtained. These occur with probabilities $|\langle 0|\alpha\rangle|^2 = \alpha_0\bar{\alpha}_0$ and $|\langle 1|\alpha\rangle|^2 = \alpha_1\bar{\alpha}_1$ respectively.
The above treatment can be generalised. Denote by $\hat{A}$ an observable which is represented by a Hermitian matrix $A$. Let $m_i$ and $|m_i\rangle$ be its eigenvalues and corresponding eigenvectors. Measuring an observable $\hat{A}$ on a quantum state $|\alpha\rangle$ yields the value $m_i$ with probability $p_i = |\langle\alpha|m_i\rangle|^2$. The state is found in the corresponding eigenstate afterwards.
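The measurement rule $p_i = |\langle\alpha|m_i\rangle|^2$ can be tried out numerically. The following is a minimal sketch in plain Python, assuming as an illustrative state the equal superposition $(|0\rangle + |1\rangle)/\sqrt{2}$; the helper names `inner` and `outcome_probabilities` are hypothetical and chosen only for this example.

```python
import math

def inner(bra, ket):
    """Bra-ket <bra|ket>: the coordinates of the first argument are conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def outcome_probabilities(state, eigenvectors):
    """Return p_i = |<state|m_i>|^2 for each eigenvector |m_i> of the observable."""
    return [abs(inner(state, m)) ** 2 for m in eigenvectors]

s = 1 / math.sqrt(2)
psi = [complex(s), complex(s)]  # assumed example state (|0> + |1>)/sqrt(2)

# Orthonormal eigenvectors of sigma_z (eigenvalues +1, -1).
z_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
# Orthonormal eigenvectors of sigma_x (eigenvalues +1, -1).
x_basis = [[complex(s), complex(s)], [complex(s), complex(-s)]]

p_z = outcome_probabilities(psi, z_basis)  # both outcomes equally likely
p_x = outcome_probabilities(psi, x_basis)  # psi is itself an eigenvector of sigma_x
```

In both cases the probabilities sum to one, as the normalisation of the state requires.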
The description of multiple quantum systems takes place in the tensor product space of the individual Hilbert spaces. Thus a system of $n$ qubits is described in the $n$-fold tensor product space of the one-qubit spaces. One arrives at the $2^n$-dimensional Hilbert space $(\mathbb{C}^2)^{\otimes n} = \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$ ($n$ times).
A density matrix is used to describe a classical probability distribution (also called a statistical mixture or statistical ensemble) over quantum states. Suppose that some source emits the quantum state $|\phi_i\rangle$ with probability $p_i$. One requires that $p_i \geq 0$ and $\sum_i p_i = 1$. From the discussion in the previous section, it is clear that the measurement of an observable $\hat{A}$ must yield an expectation value of
$$\langle\hat{A}\rangle = \sum_i p_i \langle\phi_i|A|\phi_i\rangle.$$
By linearity, this can be rewritten as
$$\langle\hat{A}\rangle = \mathrm{tr}\Big(A \sum_i p_i |\phi_i\rangle\langle\phi_i|\Big).$$
Indeed the operator
$$\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$$
captures all there is to know about a quantum system and $\rho$ is known as the density matrix describing it.
For a complex matrix $\rho$ to represent a quantum state, one requires $\rho = \rho^\dagger$, $\langle\psi|\rho|\psi\rangle \geq 0$ for all $|\psi\rangle$ (positive-semidefinite) and $\mathrm{tr}(\rho) = 1$. Comparing with classical probability theory, this corresponds to a real-valued, non-negative, and normalized probability distribution. The density matrix formalism can indeed be seen as a generalization of classical probability theory, and quantum mechanics can be taken to be the study of the cone formed by complex positive-semidefinite matrices, and transformations thereof. This is analogous to the probability simplex encountered in classical probability theory.
Now we can state what we left out in the preceding discussion about measurements: consider the case when some eigenvalues of the measurement operator $A = \sum_i m_i |m_i\rangle\langle m_i|$ are equal, i.e. the spectrum of $A$ is degenerate. What is the probability of obtaining outcome $m_j$ and what is the post-measurement state? Let $P_j$ be the projector onto the eigenspace with eigenvalue $m_j$ of $A$. Then a measurement yields outcome $m_j$ with probability $p_j = \mathrm{tr}(P_j\rho)$ and the density operator immediately after the measurement reads
$$\frac{P_j \rho P_j}{\mathrm{tr}(P_j\rho)}.$$
The time evolution of an isolated qubit is given by a unitary operator in $SU(2)$.
On a closed quantum system of $n$ qubits, the time evolution is given by unitary operators on $\mathcal{H}_{\mathrm{system}} = (\mathbb{C}^2)^{\otimes n}$. In the case of a quantum system interacting with its environment, such unitaries can also act on a larger system $\mathcal{H}_{\mathrm{system}} \otimes \mathcal{H}_{\mathrm{environment}}$. A unitary on such a larger system can on $\mathcal{H}_{\mathrm{system}}$ be represented in the (non-unique) operator-sum decomposition as
$$|\psi\rangle\langle\psi| \mapsto \sum_i E_i |\psi\rangle\langle\psi| E_i^\dagger, \qquad \sum_i E_i^\dagger E_i = 1.$$
Throughout, $1$ will denote the identity map.
More generally, this reads for a density matrix as
$$\rho \mapsto \sum_i E_i \rho E_i^\dagger.$$
The above map is also known as a quantum channel or completely positive map and represents the most general form of physical change a quantum state can undergo.
In the case of a classical (conventional) bit, an error is represented by the bit-flip $0 \leftrightarrow 1$. For qubits, we regard any non-identity unitary transformation or non-identity quantum channel as an error. We can decompose any unitary or quantum channel in terms of a matrix basis.
A good choice is the Pauli group: it is generated by all possible tensor products of the 4 Pauli matrices, together with phases $\pm 1$ or $\pm i$. Observe that $\sigma_x$, $\sigma_z$ and $\sigma_y$ anti-commute. That is, $\sigma_x\sigma_z = -\sigma_z\sigma_x$, $\sigma_x\sigma_y = -\sigma_y\sigma_x$ and $\sigma_y\sigma_z = -\sigma_z\sigma_y$. Thus, the Pauli group $P_n$ is a non-abelian group consisting of the $4^n$ tensor products of $\sigma_0$, $\sigma_x$, $\sigma_z$ and $\sigma_y$, which together with the four phases is a group of size $4^{n+1}$.
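The anti-commutation relations and the group size $4^{n+1}$ can be confirmed by brute force for small $n$. The sketch below uses only plain Python tuples for matrices; the helper names (`matmul`, `kron`, `pauli_group`, and so on) are hypothetical, and the check is carried out for $n = 2$ as an illustration rather than as a proof.

```python
from itertools import product

# Single-qubit Pauli matrices as nested tuples (hashable, so they fit in sets).
I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))
PAULIS = (I2, X, Y, Z)
PHASES = (1, -1, 1j, -1j)

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def scale(c, a):
    """Scalar multiple c * a."""
    return tuple(tuple(c * x for x in row) for row in a)

def kron(a, b):
    """Tensor (Kronecker) product of two square matrices."""
    return tuple(tuple(a[i][j] * b[k][l]
                       for j in range(len(a)) for l in range(len(b)))
                 for i in range(len(a)) for k in range(len(b)))

def anticommute(a, b):
    return matmul(a, b) == scale(-1, matmul(b, a))

def pauli_group(n):
    """All phase multiples of n-fold tensor products of Pauli matrices."""
    elements = set()
    for factors in product(PAULIS, repeat=n):
        m = factors[0]
        for f in factors[1:]:
            m = kron(m, f)
        for c in PHASES:
            elements.add(scale(c, m))
    return elements

def is_closed(elements):
    """Check closure under matrix multiplication."""
    return all(matmul(a, b) in elements for a in elements for b in elements)

P2 = pauli_group(2)  # expected size 4**(2 + 1)
```

Closure under multiplication together with finiteness is what makes this set of matrices a group.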
A quantum error-correcting code is a linear subspace $Q$ of $(\mathbb{C}^2)^{\otimes n}$ into which a number of logical qubits can be encoded such that all errors of a certain type can be detected and/or corrected. The question we ask is thus: given a noisy channel $\mathcal{E}$, does there exist a recovery channel $\mathcal{R}$, such that every density matrix $\rho$, for which the image of $\rho$ is contained in $Q$, can be recovered? In other words, for all density matrices $\rho$ with spectral decomposition $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$, where $|\phi_i\rangle \in Q$, we require that $\mathcal{R}(\mathcal{E}(\rho)) = \rho$.

A 1-qubit error-correcting quantum code
A classical code is a subset of A n , where A is a finite set called the alphabet and n is the length of the code. The repetition code is the simplest type of code in which each element a ∈ A is encoded as (a, a, . . . , a), an n-tuple of a's.
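However, we could try the following repetition-type code
$$\alpha_0|0\rangle + \alpha_1|1\rangle \mapsto \alpha_0|000\rangle + \alpha_1|111\rangle.$$
Above and from now on, we simplify notation $|0\rangle \otimes |0\rangle$ as $|00\rangle$, etc.
Suppose now a "bit-flip" $\sigma_x$ happens on the second position. This gives $\alpha_0|010\rangle + \alpha_1|101\rangle$. One can correct such an error by majority decision, $|010\rangle \mapsto |000\rangle$ and $|101\rangle \mapsto |111\rangle$. One needs a measurement that indicates exactly where the bit-flip has occurred. This can be done, as will be explained in Example 2.8.
However, we cannot correct a single $\sigma_z$ error, as $\alpha_0|000\rangle - \alpha_1|111\rangle$ is also a possible state of our code.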
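The behaviour of the repetition-type code under bit-flips and phase errors can be sketched in a few lines. The example below assumes the three-qubit encoding $\alpha_0|000\rangle + \alpha_1|111\rangle$ and represents a state as a dictionary from basis labels to amplitudes; the function names are hypothetical. The majority vote on basis labels stands in for the syndrome measurement that would be used in practice.

```python
def encode(a0, a1):
    """Assumed three-qubit repetition encoding a0|000> + a1|111>."""
    return {"000": a0, "111": a1}

def apply_x(state, pos):
    """Bit-flip (sigma_x) on the qubit at 0-indexed position pos."""
    flipped = {}
    for bits, amp in state.items():
        b = list(bits)
        b[pos] = "1" if b[pos] == "0" else "0"
        flipped["".join(b)] = amp
    return flipped

def apply_z(state, pos):
    """Phase error (sigma_z): negate amplitudes of labels with a 1 at pos."""
    return {bits: (-amp if bits[pos] == "1" else amp)
            for bits, amp in state.items()}

def majority_correct(state):
    """Map each basis label to the codeword given by its majority bit."""
    corrected = {}
    for bits, amp in state.items():
        maj = "1" if bits.count("1") >= 2 else "0"
        corrected[maj * 3] = corrected.get(maj * 3, 0) + amp
    return corrected

original = encode(0.6, 0.8)

# A single bit-flip on the second qubit is undone by the majority decision.
recovered = majority_correct(apply_x(original, 1))

# A single phase error yields another valid code state, so it goes unnoticed.
after_phase_error = majority_correct(apply_z(original, 0))
```

Here `recovered` agrees with `original`, whereas `after_phase_error` carries a flipped sign on the $|111\rangle$ amplitude, which is exactly the failure described above.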
Shor [17] was the first to introduce a quantum code which can correct any single-qubit error. He circumvented this apparent problem by introducing a majority decision on the signs to correct a σ z error.

Example 1.2 (Shor code)
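The coding space for the Shor code is $(\mathbb{C}^2)^{\otimes 9}$ and a qubit is encoded as
$$|0\rangle \mapsto \tfrac{1}{2\sqrt{2}}(|000\rangle + |111\rangle)(|000\rangle + |111\rangle)(|000\rangle + |111\rangle)$$
and
$$|1\rangle \mapsto \tfrac{1}{2\sqrt{2}}(|000\rangle - |111\rangle)(|000\rangle - |111\rangle)(|000\rangle - |111\rangle).$$
Hence, by linearity,
$$\alpha_0|0\rangle + \alpha_1|1\rangle \mapsto \tfrac{\alpha_0}{2\sqrt{2}}(|000\rangle + |111\rangle)^{\otimes 3} + \tfrac{\alpha_1}{2\sqrt{2}}(|000\rangle - |111\rangle)^{\otimes 3}.$$
Suppose that we have a $\sigma_x$ error (bit-flip) occurring on the 4-th bit. Then the $\alpha_0$ term would change to
$$\tfrac{\alpha_0}{2\sqrt{2}}(|000\rangle + |111\rangle)(|100\rangle + |011\rangle)(|000\rangle + |111\rangle),$$
which we would detect and correct by taking the majority decision as with the classical error-correcting code, so we decode $|100\rangle \mapsto |000\rangle$ and $|011\rangle \mapsto |111\rangle$. Now suppose we have a $\sigma_z$ error (phase error) occurring on the 7-th bit. Then the $\alpha_0$ term would be
$$\tfrac{\alpha_0}{2\sqrt{2}}(|000\rangle + |111\rangle)(|000\rangle + |111\rangle)(|000\rangle - |111\rangle),$$
which we would detect and correct by taking the majority decision on the signs.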
Since σ y = iσ x σ z , we can also correct σ y errors since the two decisions we made above are independent of each other. Note that the scalar i does not play a role in the decoding.

The orthogonal projection onto a subspace
Let $Q$ be a subspace of $(\mathbb{C}^2)^{\otimes n}$ and let $Q^\perp$ be its orthogonal subspace with respect to the standard inner product defined on $(\mathbb{C}^2)^{\otimes n} \cong \mathbb{C}^{2^n}$. Any vector $|\psi\rangle$ can be written (uniquely) as the sum of a vector $P|\psi\rangle \in Q$ and a vector $P^\perp|\psi\rangle \in Q^\perp$. The map $|\psi\rangle \mapsto P|\psi\rangle$ is a linear map, called the orthogonal projection onto $Q$.
Proof. For any j k, Furthermore, Clearly, by definition, P 2 = P . By Lemma 1.3, P is Hermitian, since it is the sum of Hermitian operators. The following lemma implies that this is enough to characterise P .

Lemma 1.4
If P is a linear Hermitian operator for which P 2 = P and whose image is Q then P is the orthogonal projection onto Q.
The eigenspace with eigenvalue 0 is im(P ) ⊥ . Thus, P is the orthogonal projection onto im(P ).

Error-detection and correction
For the reliable transmission of an (unknown) quantum system over a noisy channel, we are now faced with three major challenges.
1. Measurement disturbance. As explained in Section 1.1, measurements induce an "update" of the state that is measured. Thus, when obtaining error syndromes in order to understand what error has occurred, the underlying quantum state may be altered.
2. Continuous set of errors. The set of errors is continuous and not discrete. How can we distinguish and correct for an error set this large?
3. No-cloning. Unknown quantum states cannot be copied. Thus an approach of adding redundancy as done for a classical repetition code is bound to fail.
How can these challenges be overcome? First, the syndrome measurements are chosen such that they stabilise the set of quantum states that constitute the code. In this way, all code states remain unchanged when extracting the syndromes, while erroneous states are changed in a reversible fashion. Second, the linearity of quantum mechanics implies that when some discrete set of errors can be corrected, then so can all errors which lie in their span. We shall not give a proof of this here, but one can be found in [6, Theorem 2] and [4]. Lastly, the encoded quantum information is distributed amongst many systems and thus "hidden" from any noisy channel. In this way the state does not have to be copied and no redundancy is added. This not only gives rise to the Knill-Laflamme conditions on error correction below, but also provides an information-theoretic interpretation of quantum error-correction.
In quantum error-correction one is faced with the following task. Let $\mathcal{N}$ be a quantum channel. Given the channel $\mathcal{N}$, for what set of states $Q$ does there exist a recovery channel $\mathcal{R}$ such that $\mathcal{R} \circ \mathcal{N}(\rho) = \rho$ for all density matrices $\rho$ whose image is contained in $Q$? It turns out that the sets of correctable states form subspaces. The following theorem gives a necessary and sufficient condition for a recovery channel to exist.

Error weights
We define the weight $\mathrm{wt}(M)$ of an operator $M$ in the Pauli group $P_n$ to be the number of tensor factors which are not equal to $\sigma_0$. For example, $\sigma_x \otimes \sigma_0 \otimes \sigma_z \otimes \sigma_y$ has weight three.
In classical codes the distance between any two elements of $A^n$ is the number of coordinates in which they differ. If the minimum distance of a code $C$ is at least $2t + 1$ then $C$ is a $t$-error correcting code (i.e. we can correct errors if up to $t$ coordinates of a codeword change). In quantum codes the same holds: if a quantum code can detect all errors of weight less than $2t + 1$ then it is a $t$-error correcting code.
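The weight of a Pauli operator and the relation between detection and correction are simple to express in code. The sketch below writes an element of $P_n$ (ignoring its phase) as a string over the letters I, X, Y, Z, following the shorthand $\sigma_0 = I$, $\sigma_x = X$, $\sigma_y = Y$, $\sigma_z = Z$; the function names are hypothetical.

```python
def weight(pauli_string):
    """Number of tensor factors different from the identity."""
    return sum(1 for factor in pauli_string if factor != "I")

def correctable_errors(d):
    """Largest t with 2t + 1 <= d, i.e. the number of errors a code
    of minimum distance d can correct."""
    return (d - 1) // 2

w = weight("XIZY")          # three non-identity factors
t = correctable_errors(3)   # a distance-3 code corrects one error
```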

Definition and examples
Most quantum codes presently known are stabilizer codes, and their usefulness lies partially in the fact that their connection with classical codes allows for them to be described in an efficient way. Here, we will mainly deal with stabilizer codes, although we will also see examples of quantum codes in Section 4 which are not stabilizer codes.
A qubit stabilizer code $Q(S)$ is the joint eigenspace with eigenvalue 1 of the elements of an abelian subgroup $S$ of $P_n$ not containing $-1$. The subgroup $S$ is also known as the stabilizer.
We will often define $S$ as being generated by a set of $n - k$ commuting independent generators $M_1, \ldots, M_{n-k}$ of $P_n$. By independent, we mean that $M_1, \ldots, M_{n-k}$ generate $S$, while any proper subset does not. The $M_i$ are called generators.
It is important to note that we require $-1 \notin S$, since otherwise $Q(S) = \{0\}$. We also assume that there is no coordinate in which every element of $S$ has a $\sigma_0$, as we could simply delete this coordinate and this would not affect the error-correcting capabilities of the code.
Note that the phase of any element in $S$ is $\pm 1$, since if $\pm iM \in S$ for some tensor product $M$ of Pauli matrices, then $(\pm iM)^2 = -M^2 = -1 \in S$, which, as mentioned above, implies that $Q(S) = \{0\}$.
Example 2.1 Suppose n = 2 and S is generated by a single Pauli operator M = σ x ⊗ σ z .
We note that the dimension of Q(S) is 2.
We often use the short-hand notation σ 0 = I, σ x = X, etc.., so in the previous example we might write M = XZ.
In the shorthand notation we would write that S is defined by Observe that M i M j = M j M i for all i and j ∈ {1, 2, 3}. For example This can be checked quickly by verifying that different Pauli matrices {σ x , σ y , σ z } coincide in the same position in M i and M j (i = j) an even number of times.
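The counting rule for commutation translates directly into code. The sketch below writes Pauli operators as strings over I, X, Y, Z (phases ignored, since they do not affect commutation); the function name `commute` and the sample strings are hypothetical illustrations.

```python
def commute(m1, m2):
    """Two Pauli strings commute exactly when the number of positions holding
    two different non-identity Pauli matrices is even."""
    clashes = sum(1 for a, b in zip(m1, m2)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

two_clashes = commute("IXZ", "IYX")   # differ at two positions: commute
one_clash = commute("XI", "ZI")       # differ at one position: anti-commute
```

Each clashing position contributes a factor of $-1$ when the order of the product is swapped, which is why only the parity of the count matters.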
To find a basis for the stabilizer code, suppose that is in the code space, i.e. that α is in the +1-eigenspace of all M i . Since We have that |α is in the +1-eigenspaceM 1 = Im(I + M 1 ) of M 1 if and only if α j00 = α j10 and α j01 = −α j11 . Similarly, Thus, |α is in the +1-eigenspaceM 2 if and only if iα j00 = α j11 and α j01 = −iα j10 . Thus,

Finally,
is the one-dimensional subspace spanned by In fact, we seldom actually calculate a basis as for Q(S) as it is not necessary in practise. We have only calculated this previous example so one gets a feel of how laborious this is even for small parameters. From a practical point of view it is enough to know the orthogonal projection P for the subspace Q.

The dimension and minimum distance of a stabilizer code
Let $S$ be an abelian subgroup of $P_n$. Let $Q(S)$ be the subspace defined as the joint eigenspace of eigenvalue 1 of the elements of $S$. Let $P = P(S)$ be the orthogonal projection onto the subspace $Q(S)$.
Let $P = \frac{1}{|S|}\sum_{E \in S} E$. Since $E^\dagger = E$ for all $E \in S$, we have that $P^\dagger = P$. Moreover,
$$P^2 = \frac{1}{|S|^2}\sum_{E, E' \in S} E E' = \frac{1}{|S|}\sum_{E \in S} E = P.$$
By Lemma 1.4, $P = P(S)$.
Theorem 2.4 The stabilizer code $Q(S)$ which is the joint $+1$-eigenspace of an abelian subgroup $S$ generated by $n - k$ independent elements has dimension $2^k$.
Proof. By Lemma 2.3, the orthogonal projection onto $Q(S)$ is
$$P = \frac{1}{|S|}\sum_{M \in S} M.$$
The image of $P$ is its eigenspace of eigenvalue one and also $Q(S)$.
The operator $P$ is Hermitian and thus diagonalisable. Since $P^2 = P$ its eigenvalues are 0 and 1. The trace of $P$ is equal to the sum of its eigenvalues, which in the case of $P$ is the dimension of the eigenspace of eigenvalue one. Therefore, the dimension of $Q(S)$ is equal to the trace of $P(S)$.
It only remains to note that $\mathrm{tr}(M) = 0$ for all $M \in P_n$ which are not scalar multiples of the identity, while $\mathrm{tr}(1) = 2^n$. Since $|S| = 2^{n-k}$, we have $\dim Q = 2^n/|S| = 2^k$.
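The trace argument can be turned into a small computation: generate the group from its generators, keeping track of phases, and sum the traces. The sketch below stores a group element as a pair (power of $i$, Pauli string); all function names are hypothetical. The five-qubit generators in the usage lines are the commonly quoted cyclic choice and are an assumption, since they need not coincide with the generators used elsewhere in this article.

```python
def mul1(a, b):
    """Product of two single-qubit Paulis as (power of i, letter)."""
    if a == "I":
        return 0, b
    if b == "I":
        return 0, a
    if a == b:
        return 0, "I"
    c = ({"X", "Y", "Z"} - {a, b}).pop()
    # XY = iZ, YZ = iX, ZX = iY; the reversed products pick up -i = i^3.
    cyclic = ("XYZ".index(a) + 1) % 3 == "XYZ".index(b)
    return (1 if cyclic else 3), c

def multiply(p, q):
    """Product of two phased Pauli strings (k, s), meaning i^k * s."""
    (k1, s1), (k2, s2) = p, q
    k, out = k1 + k2, []
    for a, b in zip(s1, s2):
        dk, c = mul1(a, b)
        k += dk
        out.append(c)
    return (k % 4, "".join(out))

def generate_group(generators):
    """Closure of the generators (all taken with phase +1) under products."""
    n = len(generators[0])
    identity = (0, "I" * n)
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for m in generators:
            h = multiply(g, (0, m))
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def code_dimension(generators):
    """tr(P) with P = (1/|S|) * sum of the elements of S."""
    n = len(generators[0])
    group = generate_group(generators)
    phases = [1, 1j, -1, -1j]
    # Only scalar multiples of the identity have non-zero trace.
    trace_sum = sum(phases[k] * 2 ** n for k, s in group if s == "I" * n)
    return (trace_sum / len(group)).real

dim_example = code_dimension(["XZ"])   # n = 2, one generator
dim_five = code_dimension(["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"])  # assumed generators
dim_bad = code_dimension(["X", "Z"])   # anti-commuting: the group contains -1
```

The last call illustrates why $-1 \notin S$ is required: the traces of $1$ and $-1$ cancel and the computed dimension is zero.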
Having ascertained the dimension of a stabilizer code, we go on to determine its minimum distance.
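Let $\mathrm{Centraliser}(S)$ denote the set of elements of $P_n$ that commute with all elements of $S$, i.e. the centraliser of $S$ in the group $P_n$.
Lemma 2.5 An error $E \in P_n$ is undetectable by $Q(S)$ if and only if $E \in \mathrm{Centraliser}(S) \setminus S$.
Proof. We proceed by contradiction.
(⇒) Suppose that $E$ is undetectable but that $E \notin \mathrm{Centraliser}(S) \setminus S$.
If $E \in S$ then $E$ fixes every vector of $Q(S)$, so $\langle\psi|E|\phi\rangle = \langle\psi|\phi\rangle = 0$ for orthogonal $|\psi\rangle, |\phi\rangle \in Q(S)$. We may therefore assume that $E \notin \mathrm{Centraliser}(S)$. Since any two elements of $P_n$ either commute or anti-commute, $E \notin \mathrm{Centraliser}(S)$ implies there is an $M \in S$ such that $ME = -EM$. Take any $|\psi\rangle, |\phi\rangle \in Q(S)$ with $\langle\psi|\phi\rangle = 0$. Then
$$\langle\psi|E|\phi\rangle = \langle\psi|EM|\phi\rangle = -\langle\psi|ME|\phi\rangle = -\langle\psi|E|\phi\rangle,$$
which implies $\langle\psi|E|\phi\rangle = 0$. Hence, by Theorem 1.5, $E$ is detectable, a contradiction.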
holds for all M ∈ S, which implies that E |ψ ∈ Q.
The subgroup generated by $S$ and $\lambda_E^{-1}E$ defines a smaller stabilizer code, so there is a $|\psi\rangle \in Q$ such that $\lambda_E^{-1}E|\psi\rangle \neq |\psi\rangle$, contradicting the above. Hence, $E$ is not detectable.
In the case that $k = 0$, we have that $Q(S)$ is a 1-dimensional subspace, so it cannot be used to store quantum information and all errors are correctable according to the definition. However, we do not rule out considering such codes since for any proper subgroup $S'$ of $S$, the code $Q(S')$ will be of interest. Since the elements of $S \setminus S'$ will be in $\mathrm{Centraliser}(S') \setminus S'$, Theorem 2.6 indicates that it makes sense to define the minimum distance of $Q(S)$ to be equal to the minimum weight of the non-identity elements of $S$. These codes are called self-dual, for reasons that will become clear (see Theorem 2.12).
Theorem 2.6 If $k \geq 1$ then the minimum distance of the $2^k$-dimensional stabilizer code $Q(S)$ with stabilizer group $S$ is equal to the minimum weight of the errors in $\mathrm{Centraliser}(S) \setminus S$.
Proof. By Lemma 2.5, $Q(S)$ can detect all errors which are not elements of $\mathrm{Centraliser}(S) \setminus S$. In particular, it can also detect all errors of weight less than the minimum weight of an error in $\mathrm{Centraliser}(S) \setminus S$.
If there are elements of $S$ whose weight is less than the minimum distance of $Q(S)$ then the code is called impure. If this is not the case then the code is called pure. We use the shorthand notation $((n, K, d))$ to denote a quantum code of $(\mathbb{C}^2)^{\otimes n}$ of dimension $K$ and minimum distance $d$. The notation $[[n, k, d]]$ denotes a quantum code of dimension $2^k$. If it is a stabilizer code $Q(S)$ then $d$ is equal to the minimum weight of the elements in $\mathrm{Centraliser}(S) \setminus S$.
We now rewrite the Shor code from Example 1.2 as a stabilizer code.
Example 2.7 ([[9,1,3]] code) Let $S$ be the subgroup generated by the following elements of $P_9$.
In shorthand notation this would be written in the following way. Suppose that E is an error of weight at most 2. We want to prove that E ∈ S or E does not commute with some M i .
We proceed with a case-by-case analysis.
If E has weight one and a single X or Y then it does not commute with one of M 1 , . . . , M 6 . If E has weight one and a single Z then it does not commute with one of M 7 , M 8 .
If $E$ has weight two and both non-identity factors are $X$ then, without loss of generality, suppose there is an $X$ in the first system. Then $E$ must have an $X$ or $Y$ in the second system so that it commutes with $M_1$. But then it must also have an $X$ or $Y$ in the third system so that it commutes with $M_2$, contradicting the fact that it has weight two.
We leave the case-by-case analysis as an exercise but conclude that the only errors of weight two which commute with all the M i are precisely those which are in S, i.e.
We will prove that the minimum distance of this code is 3 in a very simple manner once we have determined its geometry.
An important observation here is that the Shor code is impure since S contains errors of weight 2, whereas the minimum distance is 3.
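The case-by-case analysis can be cross-checked by brute force. The sketch below assumes the commonly used generator set for the Shor code (six $ZZ$-type generators within the three blocks and two $X$-type generators spanning pairs of blocks); this choice, and all function names, are assumptions made for illustration and may differ from the generators $M_1, \ldots, M_8$ of the example by a relabelling. Phases are discarded throughout, as they affect neither weights nor commutation.

```python
from itertools import combinations, product

# Assumed standard generators of the Shor code, written as Pauli strings.
GENERATORS = [
    "ZZIIIIIII", "IZZIIIIII", "IIIZZIIII", "IIIIZZIII",
    "IIIIIIZZI", "IIIIIIIZZ", "XXXXXXIII", "IIIXXXXXX",
]

def commute(m1, m2):
    """Commutation test: even number of positions with different Paulis."""
    clashes = sum(1 for a, b in zip(m1, m2)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

def multiply(m1, m2):
    """Product of two Pauli strings with the phase discarded."""
    out = []
    for a, b in zip(m1, m2):
        if a == "I":
            out.append(b)
        elif b == "I":
            out.append(a)
        elif a == b:
            out.append("I")
        else:
            out.append(({"X", "Y", "Z"} - {a, b}).pop())
    return "".join(out)

def span(generators):
    """All products of the generators (valid since they commute)."""
    group = {"I" * len(generators[0])}
    for g in generators:
        group |= {multiply(g, h) for h in group}
    return group

def errors_of_weight(n, w):
    """All Pauli strings of length n and weight exactly w."""
    for positions in combinations(range(n), w):
        for letters in product("XYZ", repeat=w):
            e = ["I"] * n
            for p, c in zip(positions, letters):
                e[p] = c
            yield "".join(e)

def min_distance(generators, max_weight):
    """Least weight of an element of Centraliser(S) outside S, if <= max_weight."""
    n = len(generators[0])
    stabilizer = span(generators)
    for w in range(1, max_weight + 1):
        for e in errors_of_weight(n, w):
            if e not in stabilizer and all(commute(e, g) for g in generators):
                return w
    return None

S = span(GENERATORS)
d = min_distance(GENERATORS, 3)
# Non-identity stabilizer elements of weight below 3 witness impurity.
low_weight_stabilizers = sorted(
    e for e in S if 0 < sum(c != "I" for c in e) < 3)
```

Under these assumptions the search finds no element of the centraliser outside $S$ of weight one or two, while $S$ itself contains weight-two elements, in line with the impurity of the code noted above.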
We can store the same amount of information on fewer qubits with the following code.
Example 2.8 ([[5,1,3]] code) Let $S$ be the subgroup generated by the following elements of $P_5$.
This matrix makes the task of checking that M i M j = M j M i fairly quick. We will prove that the minimum distance is 3 by considering its geometry in Example 3.15.
Let us see how we can use this example to correct errors of weight one. We perform the measurements $\hat{M}_i$ on $E|\phi\rangle$. Each returns a value $\pm 1$ (the eigenvalues of $M_i$). This gives us a "syndrome", a 4-tuple of signs for each error $E$. These are given in the following tables.
Since each syndrome is distinct, we can use this look-up table to identify the error and correct it. An important observation here is that when we perform the measurement $\hat{M}_i$, only the sign of the state can possibly change. Since $E|\phi\rangle$ is an eigenvector of $M_i$, after measuring we will be in the state $\pm E|\phi\rangle$. Thus, we can perform the measurements $\hat{M}_i$ consecutively, for $i = 1, \ldots, n - k$.
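A syndrome look-up table of this kind can be generated mechanically: the $i$-th sign is $+1$ if the error commutes with $M_i$ and $-1$ otherwise. The sketch below assumes the commonly quoted cyclic generators of the five-qubit code, which may differ from the generators of the example above; the names `syndrome` and `lookup` are hypothetical.

```python
# Assumed generators of the five-qubit code (the usual cyclic choice).
GENERATORS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]

def commute(m1, m2):
    """Commutation test: even number of positions with different Paulis."""
    clashes = sum(1 for a, b in zip(m1, m2)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

def syndrome(error):
    """4-tuple of signs: +1 where the error commutes with M_i, else -1."""
    return tuple(1 if commute(error, g) else -1 for g in GENERATORS)

def single_qubit_errors(n):
    """All Pauli strings of weight one on n qubits."""
    for position in range(n):
        for letter in "XYZ":
            yield "I" * position + letter + "I" * (n - position - 1)

# Map each syndrome to the weight-one error that produces it.
lookup = {syndrome(e): e for e in single_qubit_errors(5)}
```

Because the fifteen weight-one errors give fifteen different non-trivial syndromes under these assumptions, `lookup` identifies the error uniquely and applying the same Pauli operator again undoes it.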

Qubit stabilizer codes as binary linear codes
In this section we introduce a connection between qubit stabilizer codes and classical binary linear codes. We will go on to exploit this connection to construct qubit quantum codes and then to realise a more general connection between stabilizer codes and classical codes.
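Let $\mathbb{F}_q$ denote the finite field with $q$ elements. Consider the map $\tau$ defined by the following table.

τ :   $\sigma_0 \mapsto (0 \mid 0)$
      $\sigma_x \mapsto (1 \mid 0)$
      $\sigma_z \mapsto (0 \mid 1)$
      $\sigma_y \mapsto (1 \mid 1)$

We extend the map $\tau$ to $P_n$ by applying $\tau$ to an element of $P_n$ coordinatewise, where the image of the $j$-th position of $M$ is the $j$-th and $(j + n)$-th coordinate in $\tau(M)$. For example, $\tau(\sigma_x \otimes \sigma_y \otimes \sigma_z) = (1\,1\,0 \mid 0\,1\,1)$. We draw the line between the $n$-th and $(n + 1)$-st coordinate for the sake of readability. We ignore the phase, so $\tau(\lambda M) = \tau(M)$ for all $\lambda \in \{\pm 1, \pm i\}$. Effectively, this defines the domain of the map $\tau$ as $P_n/\{\pm 1, \pm i\}$.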
Proof. Observe that the multiplicative structure up to a phase factor (for example we ignore the $i$ in $\sigma_y = i\sigma_x\sigma_z$) is isomorphic to the additive structure of $\mathbb{F}_2^2$.
We have established a bijection between the elements of $P_n/\{\pm 1, \pm i\}$ and $\mathbb{F}_2^{2n}$. The above lemma implies that a subgroup $S$ of $P_n$ is in bijective correspondence with a subspace of $\mathbb{F}_2^{2n}$. We now wish to ascertain what property this subspace has if $S$ is a subgroup generated by commuting elements of $P_n$.
To this end, we define an alternating form for $u, w \in \mathbb{F}_2^{2n}$,
$$(u, w)_a = \sum_{j=1}^{n} \left(u_j w_{j+n} - w_j u_{j+n}\right).$$
The symplectic weight of a vector $v \in \mathbb{F}_2^{2n}$ is defined as
$$\left|\{ j \in \{1, \ldots, n\} : (v_j, v_{j+n}) \neq (0, 0) \}\right|.$$

Lemma 2.11
The weight of $M \in P_n$ is equal to the symplectic weight of $\tau(M)$.
Proof. We have that $n - \mathrm{wt}(M)$ is equal to the number of $\sigma_0$'s in $M$, which is equal to $n$ minus the symplectic weight of $\tau(M)$.
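The map $\tau$, the alternating form and the symplectic weight are easy to experiment with. The sketch below assumes the convention $\sigma_x \mapsto (1 \mid 0)$, $\sigma_z \mapsto (0 \mid 1)$, $\sigma_y \mapsto (1 \mid 1)$, with Pauli operators written as strings over I, X, Y, Z; the function names are hypothetical. Over $\mathbb{F}_2$ the minus sign in the alternating form may be replaced by a plus sign.

```python
# Assumed convention: the first bit records an X component, the second a Z component.
TAU = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}

def tau(pauli_string):
    """Image in F_2^{2n}: position j contributes coordinates j and j + n."""
    left = [TAU[c][0] for c in pauli_string]
    right = [TAU[c][1] for c in pauli_string]
    return tuple(left + right)

def alt_form(u, w):
    """Alternating form (u, w)_a over F_2 (subtraction equals addition mod 2)."""
    n = len(u) // 2
    return sum(u[j] * w[j + n] + w[j] * u[j + n] for j in range(n)) % 2

def symplectic_weight(v):
    """Number of positions j with (v_j, v_{j+n}) different from (0, 0)."""
    n = len(v) // 2
    return sum(1 for j in range(n) if (v[j], v[j + n]) != (0, 0))

v = tau("XYZ")                       # (1, 1, 0, 0, 1, 1)
wt = symplectic_weight(tau("XIZY"))  # equals the weight of the operator
commuting = alt_form(tau("XZ"), tau("ZX"))        # 0: the operators commute
anticommuting = alt_form(tau("XI"), tau("ZI"))    # 1: they anti-commute
```

The form vanishes on a pair of vectors exactly when the corresponding Pauli operators commute, which is the property exploited in what follows.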
For a subspace $C \leq \mathbb{F}_2^{2n}$, we define $C^{\perp_a}$ as
$$C^{\perp_a} = \{ u \in \mathbb{F}_2^{2n} : (u, w)_a = 0 \text{ for all } w \in C \}.$$
Theorem 2.12 $S$ is a subgroup of $P_n$ generated by $n - k$ independent mutually commuting elements if and only if $C = \tau(S)$ is an $(n - k)$-dimensional subspace of $\mathbb{F}_2^{2n}$ for which $C \leq C^{\perp_a}$. If $k \neq 0$ then the minimum distance of $Q(S)$ is equal to the minimum symplectic weight of the elements of $C^{\perp_a} \setminus C$. If $k = 0$ then the minimum distance of $Q(S)$ is equal to the minimum symplectic weight of the non-zero elements of $C = C^{\perp_a}$.
Proof. The fact that $C = \tau(S)$ is contained in $C^{\perp_a}$ follows from Lemma 2.9 and Lemma 2.10.
By Theorem 2.6, for $k \neq 0$, the minimum distance is equal to the minimum weight of the images of the elements of $\mathrm{Centraliser}(S)$ under $\tau$ which are not elements of the image of $S$. Since $C = \tau(S)$ and $C^{\perp_a} = \tau(\mathrm{Centraliser}(S))$, the theorem follows for $k \neq 0$.
For $k = 0$, by definition, the minimum distance is equal to the minimum weight of the images of the non-identity elements of $S$ under $\tau$, which are the non-zero elements of $C$.
We can construct a generator matrix $G(S)$ for $C = \tau(S)$ by taking the $(n - k) \times 2n$ matrix whose $i$-th row is $\tau(M_i)$. The following table makes for a useful reference.

$P_n$ : the Pauli group, given by $n$-fold tensor products of Pauli matrices $\sigma_0, \sigma_x, \sigma_y, \sigma_z$ with phases $\{\pm i, \pm 1\}$.
$M_1, \ldots, M_{n-k}$ : the generators, a set of independent elements of $P_n$ that generate $S$.
$S$ : the stabilizer, an abelian subgroup of $P_n$.
$Q(S)$ : the quantum code obtained as the intersection of the eigenspaces of eigenvalue 1 of the operators in $S$.

Let $S$ be the subgroup of $P_5$ generated by the following pairwise commuting elements.
The matrix G(S) for this code is One can check directly that (u, v) a = 0 for any two rows u, v of G(S). Alternatively, it is enough to observe that A is symmetric and that We will prove in Example 3.15 that the minimum distance of Q(S) is 3.
Observe that any n×n The difficulty lies in choosing A so that the symplectic weight of the code generated by G (and hence d) is large.
3 The geometry of additive, linear and stabilizer codes

Additive and linear codes over a finite field
We recall that a code of length n is a subset C of A n , where A is a finite set called the alphabet. An element of C is called a codeword.
The distance between any two elements of A n is the number of coordinates in which they differ. The minimum distance of C is the minimum distance between any two codewords of C.
Suppose A is a finite abelian group with identity element 0. If u + v ∈ C for all u, v ∈ C then we say that C is additive.
The weight of an element (codeword) u of an additive code is the number of non-zero coordinates that it has.
Lemma 3.1 If $C$ is an additive code over an alphabet which is a finite abelian group then the minimum distance $d$ of $C$ is equal to the minimum non-zero weight $w$.
Proof. Suppose that $u$ is a codeword of minimum weight $w$. Then since $0 \in C$, we have $w \geq d$.
Suppose that $u$ and $v$ are two codewords which differ in exactly $d$ coordinates. Then $u - v$ is a codeword in $C$ of weight $d$ and so $d \geq w$.
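Lemma 3.1 can be checked numerically on a small example. The sketch below builds an additive code over the cyclic group $\mathbb{Z}_4$, an alphabet that is an abelian group but not a field; the two generating vectors are arbitrary and chosen only for illustration, and the function names are hypothetical.

```python
def weight(u):
    """Number of non-zero coordinates."""
    return sum(1 for x in u if x != 0)

def distance(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

def additive_closure(generators, m):
    """Smallest subset of (Z_m)^n containing the generators and closed under addition."""
    n = len(generators[0])
    zero = tuple([0] * n)
    code, frontier = {zero}, [zero]
    while frontier:
        u = frontier.pop()
        for g in generators:
            v = tuple((a + b) % m for a, b in zip(u, g))
            if v not in code:
                code.add(v)
                frontier.append(v)
    return code

# Arbitrary illustrative generators over Z_4.
code = additive_closure([(1, 1, 2, 0), (0, 2, 1, 3)], 4)

min_weight = min(weight(u) for u in code if any(u))
min_dist = min(distance(u, v) for u in code for v in code if u != v)
```

The two quantities agree, as the lemma predicts, because the difference of two codewords is again a codeword.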
Suppose that $A = \mathbb{F}_q$, the finite field with $q = p^h$ elements, $p$ prime. If $C$ is additive then $\lambda u \in C$ for all $\lambda \in \mathbb{F}_p$, so $C$ is a subspace over $\mathbb{F}_p$. If $C$ has the additional property that $\lambda u \in C$ for all $\lambda$ in $\mathbb{F}_q$ then we say $C$ is linear. A linear code of length $n$ is a subspace of $\mathbb{F}_q^n$. We use the notation $(n, K, d)_q$ code to denote a code over an alphabet of size $q$ of length $n$, size $K$ and minimum distance $d$.
The notation $[n, k, d]_q$ code denotes a $k$-dimensional linear code over $\mathbb{F}_q$ of length $n$ and minimum distance $d$.

The geometry of linear codes
We will begin our geometrical study of codes by considering linear codes over F q .
Let $G$ be a $k \times n$ matrix. We recall that when $a^t$ is a row vector in $\mathbb{F}_q^k$, the expression $a^t G$ yields a linear combination of the rows of $G$. Likewise, when $b$ is a column vector in $\mathbb{F}_q^n$, the expression $Gb$ yields a linear combination of the columns of $G$. Let $C$ be a $k$-dimensional linear code over $\mathbb{F}_q$ of length $n$; in other words, $C$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$. We describe $C$ by a $k \times n$ matrix $G$ whose row space is $C$, i.e. the rows of $G$ are a basis for $C$. Thus, for each $u \in C$, there is an $a \in \mathbb{F}_q^k$ such that $u = a^t G$. In other words, the generator matrix $G$ acts as a linear encoding matrix for the message $a$, yielding the codeword $u$ ready to be sent over a noisy channel.
The geometry of $C$ is seen by considering the set of columns of the generator matrix $G$. Let $\mathcal{X}$ be the set of columns of $G$, so $\mathcal{X}$ is a (possibly multi-)set of $n$ vectors of $\mathbb{F}_q^k$. The codeword $u = a^t G$ has a zero in its $i$-th coordinate if and only if $a_1 z_1 + \cdots + a_k z_k = 0$, where $z = (z_1, \ldots, z_k)$ is the $i$-th column of $G$. This property is unaffected if we replace $z$ by a non-zero scalar multiple of $z$, so it is natural to consider $\mathcal{X}$ as a (possibly multi-)set of $n$ points of $\mathrm{PG}(k - 1, q)$.
The projective space $\mathrm{PG}(k - 1, q)$ is obtained from the vector space $\mathbb{F}_q^k$ by identifying the vectors which are scalar multiples of each other. In this way, the points of $\mathrm{PG}(k - 1, q)$ are the one-dimensional subspaces of $\mathbb{F}_q^k$ and, more generally, the $(i - 1)$-dimensional subspaces of $\mathrm{PG}(k - 1, q)$ are the $i$-dimensional subspaces of $\mathbb{F}_q^k$. The lines, planes and hyperplanes of $\mathrm{PG}(k - 1, q)$ are the 1-dimensional, 2-dimensional and co-dimension 1 subspaces, respectively. Note that in $\mathrm{PG}(k - 1, q)$ familiar geometric properties hold. For example, two points are joined by a line; the intersection of two planes in a three-dimensional subspace is a line. If a point $x$ is contained in a subspace $\pi$ we say that $x$ is incident with $\pi$. If two subspaces $\pi_1$ and $\pi_2$ have an empty intersection (i.e. their corresponding subspaces in $\mathbb{F}_q^k$ intersect in the zero vector), then we say that they are skew.
A set of points x 1 , . . . , x r of a projective space are independent if they span an (r − 1)-dimensional (projective) subspace. If they are not independent then they are dependent.
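The number of $r$-tuples of linearly independent vectors of $\mathbb{F}_q^k$ is
$$(q^k - 1)(q^k - q) \cdots (q^k - q^{r-1}).$$
Hence, the number of $r$-dimensional subspaces of $\mathbb{F}_q^k$ is
$$\frac{(q^k - 1)(q^k - q) \cdots (q^k - q^{r-1})}{(q^r - 1)(q^r - q) \cdots (q^r - q^{r-1})}.$$
Thus, the number of points of $\mathrm{PG}(k - 1, q)$ is $(q^k - 1)/(q - 1)$. There is a natural duality between the points of $\mathrm{PG}(k - 1, q)$ and the hyperplanes of $\mathrm{PG}(k - 1, q)$. A point $(a_1, \ldots, a_k)$ is mapped to the hyperplane defined as the kernel of the linear form $a_1 X_1 + \cdots + a_k X_k$. Thus, the number of hyperplanes of $\mathrm{PG}(k - 1, q)$ is also $(q^k - 1)/(q - 1)$, which can be checked directly by calculating the number of $(k - 1)$-dimensional subspaces of $\mathbb{F}_q^k$ from the formula above. The number of points in $\mathrm{PG}(k - 1, 2)$ is $2^k - 1$ and the number of lines of $\mathrm{PG}(k - 1, 2)$ is $(2^k - 1)(2^{k-1} - 1)/3$. Thus, the lemma holds, taking into account the dimension shift when considering the projective space.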
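The counting formulas can be evaluated and cross-checked for small parameters. The sketch below computes the number of $r$-dimensional subspaces of $\mathbb{F}_q^k$ as the quotient of the two products of the form $(q^m - 1)(q^m - q)\cdots(q^m - q^{r-1})$, and compares the number of points with a brute-force count when $q$ is prime; the function names are hypothetical.

```python
from itertools import product

def num_subspaces(k, r, q):
    """Number of r-dimensional subspaces of F_q^k: ordered independent
    r-tuples in F_q^k divided by ordered bases of a fixed r-dimensional space."""
    numerator, denominator = 1, 1
    for i in range(r):
        numerator *= q ** k - q ** i
        denominator *= q ** r - q ** i
    return numerator // denominator

def count_points_brute_force(k, p):
    """Points of PG(k-1, p), p prime: non-zero vectors normalised so that
    the first non-zero coordinate equals 1."""
    count = 0
    for v in product(range(p), repeat=k):
        nonzero = [x for x in v if x != 0]
        if nonzero and nonzero[0] == 1:
            count += 1
    return count

points_fano = num_subspaces(3, 1, 2)    # points of PG(2, 2)
lines_fano = num_subspaces(3, 2, 2)     # lines of PG(2, 2)
lines_pg32 = num_subspaces(4, 2, 2)     # lines of PG(3, 2)
```

For the plane $\mathrm{PG}(2, 2)$ the numbers of points and lines coincide, an instance of the duality between points and hyperplanes.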
The following theorem explains what the minimum distance d of a linear code implies for the set of points X . Theorem 3.3 An [n, k, d] linear code over F q is equivalent to a (possibly multi-)set of points X in PG(k − 1, q) in which every hyperplane of PG(k − 1, q) contains at most n − d points of X and some hyperplane contains exactly n − d points of X .
Proof. Let $G$ be a $k \times n$ matrix whose row space is an [n, k, d] linear code $C$. Let $X$ be the set of columns of $G$ viewed as points of PG(k − 1, q).
Recall that the codeword $u = a^t G$ has a zero in its $i$-th coordinate if and only if $a \cdot x = a_1 x_1 + \cdots + a_k x_k = 0$, where $x$ is the $i$-th column of $G$. The kernel of the linear form $a_1 X_1 + \cdots + a_k X_k$ defines a hyperplane $\pi_a$ of PG(k − 1, q). The codeword $u = a^t G$ has weight $w$ if and only if $u$ has exactly $n - w$ zero coordinates. This is the case if and only if $\pi_a$ is incident with $n - w$ points of $X$.
By Lemma 3.1, the minimum distance of a linear code is equal to its minimum weight. Hence, the maximum number of points of $X$ on a hyperplane of PG(k − 1, q) is $n - d$.
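As an illustration of Theorem 3.3, the sketch below takes a generator matrix of the binary [7, 4, 3] Hamming code (an example chosen here, not taken from the text) and computes both the minimum weight of the code and the maximum number of columns on a hyperplane of PG(3, 2). All variable names are hypothetical.

```python
from itertools import product

# Systematic generator matrix of the binary [7, 4, 3] Hamming code (illustrative choice).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
k, n = len(G), len(G[0])
columns = [tuple(G[i][j] for i in range(k)) for j in range(n)]  # the points of X

def dot(a, x):
    """Evaluate the linear form a . X at the point x, over F_2."""
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

min_weight = n
max_on_hyperplane = 0
for a in product(range(2), repeat=k):
    if not any(a):
        continue  # a = 0 gives the zero codeword, not a hyperplane
    codeword = [sum(a[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
    min_weight = min(min_weight, sum(codeword))
    on_hyperplane = sum(1 for x in columns if dot(a, x) == 0)
    max_on_hyperplane = max(max_on_hyperplane, on_hyperplane)

print(min_weight, max_on_hyperplane)
```

The two quantities satisfy `max_on_hyperplane == n - min_weight`, as the theorem predicts.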

The geometry of additive codes
An additive code $C$ over $\mathbb{F}_q$ is linear over $\mathbb{F}_p$, where $q = p^h$ for some prime $p$. Therefore, $|C| = p^r$ for some $r$. The following theorem is the additive version of Theorem 3.3; the set of points $X$ is replaced by a set of subspaces.
Theorem 3.4 An $(n, p^r, d)$ additive code over $\mathbb{F}_q$ with $q = p^h$ is equivalent to a (possibly multi-)set $X$ of (h − 1)-dimensional subspaces in PG(r − 1, p) in which every hyperplane of PG(r − 1, p) contains at most $n - d$ subspaces of $X$ and some hyperplane contains exactly $n - d$ subspaces of $X$.
Proof. Let $G$ be an $r \times n$ matrix whose rows form a basis for $C$ over $\mathbb{F}_p$. As in the case of linear codes, we consider the (possibly multi-)set $X$ of columns of $G$. However, we should not consider the elements of $X$ as points of PG(r − 1, q), since we obtain $C$ from $G$ by taking the row span over $\mathbb{F}_p$ and not over $\mathbb{F}_q$. Thus, we consider the elements of $X$ as subspaces of PG(r − 1, p). Suppose that $e \in \mathbb{F}_q$ is such that $\{1, e, e^2, \ldots, e^{h-1}\}$ is a basis for $\mathbb{F}_q$ over $\mathbb{F}_p$. Then, up to scalar factor, we can write $x \in X$ as $x = x_0 + e x_1 + \cdots + e^{h-1} x_{h-1}$, where $x_0, \ldots, x_{h-1} \in \mathbb{F}_p^r$. We associate $x$ with the subspace spanned by $x_0, \ldots, x_{h-1}$ in PG(r − 1, p), which we denote by $\ell_x$. The subspace $\ell_x$ has dimension at most $h - 1$.
Suppose that $x$ is the $i$-th column of $G$, so $x \in X$. The non-zero codeword $u = a^t G$, where $a \in \mathbb{F}_p^r$, has $i$-th coordinate $a \cdot x = \sum_{j=0}^{h-1} e^j (a \cdot x_j)$. Since $\{1, e, \ldots, e^{h-1}\}$ is a basis for $\mathbb{F}_q$ over $\mathbb{F}_p$, this coordinate is zero if and only if the hyperplane of PG(r − 1, p), which is the kernel of the linear form $a_1 X_1 + \cdots + a_r X_r$, contains $\ell_x$. The theorem now follows as in the proof of Theorem 3.3.

Observe that a linear code over $\mathbb{F}_q$ necessarily has size $q^k$, so if we wish to obtain an additive code with the same parameters as a linear code, then $r = kh$ for some $k$.

The geometry of qubit quantum codes
For the moment, we restrict to the case q = 2 and consider the geometrical consequences of Theorem 2.12, which describes the connection between stabilizer codes and binary linear codes.
A qubit stabilizer code Q(S) is equivalent to a binary linear code C = τ (S) of length 2n which is contained in its alternating dual C ⊥a . According to Theorem 2.12, the minimum distance of Q(S) is the minimum symplectic weight of C ⊥a \C.
Consider once again the Shor code from Example 1.2.
Since there are two columns which are linearly dependent, there are elements of C ⊥a of symplectic weight two; these are images under τ of Pauli operators of Centraliser(S) of weight two.
To see this, recall that the alternating form is defined as $(u, w)_a = \sum_{j=1}^{n} (u_j w_{n+j} - u_{n+j} w_j)$, so the dependency of the first two columns implies that (0, 0, 0, 0, 0, 0, 0, 0, 0 | 1, 1, 0, 0, 0, 0, 0, 0, 0) is an element of $C^{\perp_a}$. However, this element is an element of $C$, since it is the first row of the matrix. Recall that the minimum distance is equal to the minimum symplectic weight of $C^{\perp_a} \setminus C$. Therefore, although $C^{\perp_a}$ contains elements of symplectic weight 2, the minimum symplectic weight of $C^{\perp_a} \setminus C$ is in fact 3. We will prove this in Example 3.9.
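The claim can be verified by brute force. The sketch below is illustrative only: the list `SHOR` is an assumed choice of stabilizer generators for the Shor code, picked so that the first row of the resulting matrix is the vector displayed above and so that the columns give the lines described in Example 3.9; the function names are hypothetical. It enumerates Pauli strings by increasing weight and returns the first weight at which an element of $C^{\perp_a}$ (optionally outside $C$) appears.

```python
from itertools import combinations, product

# Assumed stabilizer generators of the Shor code (the rows of G are their images under tau).
SHOR = [
    "ZZIIIIIII", "IZZIIIIII",
    "IIIZZIIII", "IIIIZZIII",
    "IIIIIIZZI", "IIIIIIIZZ",
    "XXXXXXIII", "IIIXXXXXX",
]

def tau(pauli):
    """Map a Pauli string to (x|z) in F_2^{2n}: X -> (1|0), Z -> (0|1), Y -> (1|1)."""
    x = [1 if s in "XY" else 0 for s in pauli]
    z = [1 if s in "ZY" else 0 for s in pauli]
    return tuple(x + z)

def form(u, w):
    """Alternating form sum_j (u_j w_{n+j} - u_{n+j} w_j), evaluated over F_2."""
    n = len(u) // 2
    return sum(u[j] * w[n + j] + u[n + j] * w[j] for j in range(n)) % 2

def row_space(G):
    """All F_2-linear combinations of the rows of G."""
    return {
        tuple(sum(c * row[i] for c, row in zip(coeffs, G)) % 2 for i in range(len(G[0])))
        for coeffs in product(range(2), repeat=len(G))
    }

def min_symplectic_weight(stabilizers, exclude_code):
    """Minimum symplectic weight of the dual code, optionally ignoring elements of C."""
    n = len(stabilizers[0])
    G = [tau(s) for s in stabilizers]
    C = row_space(G)
    for weight in range(1, n + 1):
        for positions in combinations(range(n), weight):
            for letters in product("XYZ", repeat=weight):
                pauli = ["I"] * n
                for pos, letter in zip(positions, letters):
                    pauli[pos] = letter
                v = tau(pauli)
                in_dual = all(form(g, v) == 0 for g in G)
                if in_dual and not (exclude_code and v in C):
                    return weight
    return None

weight_in_dual = min_symplectic_weight(SHOR, exclude_code=False)
distance = min_symplectic_weight(SHOR, exclude_code=True)
print(weight_in_dual, distance)
```

Enumerating by increasing weight avoids listing all of $\mathbb{F}_2^{18}$ and stops as soon as the minimum is reached.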
Given a subgroup S, generated by n − k commuting elements M 1 , . . . , M n−k of P n , we obtain a set X of n lines or possibly points in PG(n − k − 1, 2) in the following way. For each i ∈ {1, . . . , n}, we get a line (or a point) by considering the span of the i-th and (i + n)-th column of the generator matrix G(S). Vice versa, given a set of n lines in PG(n − k − 1, 2), we construct a (n − k) × 2n matrix, from which we obtain M 1 , . . . , M n−k by applying τ −1 to the rows of the matrix.
On first sight it may seem that there is a certain amount of freedom when we reconstruct the code from a given quantum set of lines. Each line is incident with three points and we can choose which pair of points on the line to use to construct the i-th and the (i + n)-th column of G. This choice is equivalent to invoking a permutation of {σ x , σ y , σ z } on the i-th position of each of the M 1 , . . . , M n−k . This does not affect the property that these elements pairwise commute, so we define all quantum codes that can be obtained from each other in this way to be equivalent.
For example, in Example 2.14, invoking the permutation σ which takes X → Z → Y → X on the M i in the first, second and fourth positions gives from Example 2.14, we see that the set of lines X remains unchanged.
There is also a choice of the scalar factor of $M$ when we apply $\tau^{-1}$ to a row of the matrix $G$. We will always assume this factor to be 1. However, changing the sign of some of the generators of a subgroup $S$ can be useful, as we shall see in Section 4. We would like to give a geometrical interpretation of the fact that the code $C = \tau(S)$ is contained in $C^{\perp_a}$.
Recall that we say two subspaces of PG(k − 1, q) are skew if they do not intersect.
Theorem 3.7 The following are equivalent.
1. There is a [[n, k, d]] stabilizer code Q(S), where S is a subgroup generated by n − k independent commuting elements of P n and whose centraliser contains no element of weight one.
2. There is a set of $n$ lines $X$ spanning PG(n − k − 1, 2) with the property that every co-dimension 2 subspace is skew to an even number of the lines of $X$.

Proof. (1 ⇒ 2) Let $X$ be the set of $n$ lines obtained for $i = 1, \ldots, n$ as the span of the $i$-th and $(i + n)$-th column of $G(S)$.
Let $u, w \in C$, so $u = (a_1, \ldots, a_{n-k})G$ and $w = (b_1, \ldots, b_{n-k})G$ for some $a = (a_1, \ldots, a_{n-k}) \in \mathbb{F}_2^{n-k}$ and $b = (b_1, \ldots, b_{n-k}) \in \mathbb{F}_2^{n-k}$. One has $C \subseteq C^{\perp_a}$ if and only if $(u, w)_a = \sum_{j=1}^{n} (u_j w_{n+j} - u_{n+j} w_j) = 0$ for all $u, w \in C$.
We want to deduce the geometrical meaning of (u, w) a = 0.
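Consider a single term in the sum first. Let $x$ and $y$ be the $j$-th and the $(n + j)$-th column of $G$ respectively. Then $u_j w_{n+j} - u_{n+j} w_j = (a \cdot x)(b \cdot y) - (a \cdot y)(b \cdot x)$. The right-hand side is zero if and only if the matrix $\begin{pmatrix} a \cdot x & a \cdot y \\ b \cdot x & b \cdot y \end{pmatrix}$ has zero determinant, i.e. it has rank at most 1. Equivalently, there exist $\lambda, \mu \in \mathbb{F}_2$, not both zero, such that $\lambda (a \cdot x) + \mu (a \cdot y) = 0$ and $\lambda (b \cdot x) + \mu (b \cdot y) = 0$.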
Recall that we define π a as the hyperplane which is the kernel of the linear form a · X = a 1 X 1 + · · · + a n−k X n−k .
We can thus rewrite the above conditions as the requirement that the point $\lambda x + \mu y$ is contained in both $\pi_a$ and $\pi_b$. In other words, there is a point on the line $\ell_j$, spanned by $x$ and $y$, which is incident with the intersection of the two hyperplanes $\pi_a$ and $\pi_b$.
Returning to the condition $(u, w)_a = 0$, we must therefore get an even number of ones in the sum $\sum_{j=1}^{n} (u_j w_{n+j} - u_{n+j} w_j)$.
Each line of $X$ that is skew to $\pi_a \cap \pi_b = \ker(a \cdot X) \cap \ker(b \cdot X)$ contributes a one; for any given $a$ and $b$ there must in total be an even number of such lines.
We note that every co-dimension 2 subspace of PG(n − k − 1, 2) can be realised in this way (as the intersection of some a · X = 0 and b · X = 0). This proves the forward implication.
Figure 1: A point $\lambda x + \mu y$ on the intersection of the hyperplanes $\pi_a$ and $\pi_b$. (The figure also shows a line that is skew to the co-dimension 2 subspace $\pi_a \cap \pi_b$.)
(1 ⇐ 2) Let X be a set of lines spanning PG(n − k − 1, 2) with the property that every co-dimension 2 subspace of PG(n − k − 1, 2) is skew to an even number of lines of X . Let G be the matrix whose i-th and (i + n)-th column are points which span the i-th line of X . Let C be the code generated by G. Since X spans PG(n − k − 1, 2), the code C is (n − k)-dimensional. As we proved in the forward implication, the property that every co-dimension 2 subspace is skew to an even number of lines of X implies that for any two codewords u and v of C, (u, v) a = 0 holds. By Lemma 2.10, the image under τ −1 of C is an abelian subgroup S of P n and by Lemma 2.13, it is generated by n − k pairwise commuting elements of P n .
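The equivalence in Theorem 3.7 can be tested on a small example. The sketch below is not from the original text: it uses the well-known generators XZZXI, IXZZX, XIXZZ, ZXIXZ of the five-qubit code as an illustrative input, and all function names are hypothetical. A line spanned by $x$ and $y$ is skew to $\pi_a \cap \pi_b$ exactly when the determinant $(a \cdot x)(b \cdot y) - (a \cdot y)(b \cdot x)$ is non-zero; pairs $(a, b)$ that do not define a co-dimension 2 subspace give determinant zero and so do not affect the count.

```python
from itertools import product

def tau(pauli):
    """Map a Pauli string to (x|z) in F_2^{2n}."""
    x = [1 if s in "XY" else 0 for s in pauli]
    z = [1 if s in "ZY" else 0 for s in pauli]
    return tuple(x + z)

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

def lines_of(G):
    """Pair the j-th and (n+j)-th columns of G; each pair spans one line of X."""
    m, n = len(G), len(G[0]) // 2
    def column(j):
        return tuple(G[i][j] for i in range(m))
    return [(column(j), column(n + j)) for j in range(n)]

def is_quantum_set_of_lines(lines, m):
    """Check that every pi_a ∩ pi_b is skew to an even number of the lines."""
    vectors = list(product(range(2), repeat=m))
    for a in vectors:
        for b in vectors:
            skew = sum((dot(a, x) * dot(b, y) + dot(a, y) * dot(b, x)) % 2
                       for x, y in lines)
            if skew % 2:
                return False
    return True

def is_self_orthogonal(G):
    """Check C ⊆ C^{⊥a} on the generators, which suffices by bilinearity."""
    n = len(G[0]) // 2
    return all(
        sum(u[j] * w[n + j] + u[n + j] * w[j] for j in range(n)) % 2 == 0
        for u in G for w in G
    )

G_five = [tau(s) for s in ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]]
G_bad = [tau(s) for s in ["X", "Z"]]  # two anti-commuting operators on one qubit

print(is_quantum_set_of_lines(lines_of(G_five), 4), is_self_orthogonal(G_five))
print(is_quantum_set_of_lines(lines_of(G_bad), 2), is_self_orthogonal(G_bad))
```

In both examples the geometric test and the algebraic test agree, which is the content of the theorem.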
Let X be a set of lines and let Θ(X ) be the space spanned by the lines of X .
We say that X is a quantum set of lines if it has the property that every co-dimension 2 subspace of Θ(X ) is skew to an even number of lines of X . To deduce the minimum distance of the corresponding stabilizer code, we introduce the parameter d(X ).
Recall that r points are independent if they span an (r − 1)-dimensional subspace; they are dependent otherwise.
Consider first the case in which $\dim \Theta(X) \neq |X| - 1$. By Theorem 3.7, $X$ will give a quantum [[n, k, d]] code with $k \neq 0$. We define the parameter $d(X)$ as the minimum number of dependent points that can be found on distinct lines of $X$, not including the dependencies for which there is a hyperplane of $\Theta(X)$ which both a) contains all the lines of $X$ which do not contain the dependent points, and b) contains all the dependent points. Thus, $d(X) = r$, where $r$ is minimal such that there exists a set of dependent points $\{x_1, \ldots, x_r\}$, where each $x_i$ is incident with a line $\ell_i \in X$ and the lines $\ell_1, \ldots, \ell_r$ are distinct, but for which there is no hyperplane containing the lines $X \setminus \{\ell_1, \ldots, \ell_r\}$ and the points $\{x_1, \ldots, x_r\}$.
In the case in which $\dim \Theta(X) = |X| - 1$, Theorem 3.7 implies that $X$ will give a quantum [[n, k, d]] code with $k = 0$. We define the parameter $d(X)$ as the minimum $d$ for which there is a hyperplane of $\Theta(X)$ containing $|X| - d$ lines of $X$. Equivalently, it is the minimum number of dependent points that can be found on distinct lines of $X$. This definition and the equivalence will be justified in the proof of Theorem 3.8.
From now on we assume that the centraliser of the stabilizer $S$ contains no elements of weight one. By Lemma 3.6, this assumption guarantees that there is a quantum set of lines associated with the stabilizer code. As mentioned before, this is equivalent to assuming that the minimum distance is at least 2 in the case of pure codes.

Proof. We only have to prove the part about the minimum distance since Theorem 3.7 covers the rest. As in the proof of Theorem 3.7, let $G = G(S)$ be the $(n - k) \times 2n$ generator matrix with entries from $\mathbb{F}_2$ whose row space forms the code $C$. Define a set of lines $X = \{\ell_j \mid j = 1, \ldots, n\}$, where $\ell_j$ is the line that corresponds to the span of the $j$-th and $(j + n)$-th column of $G$.
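Consider the case $k \neq 0$. By Theorem 2.12, the parameter $d$ is the minimum symplectic weight of $C^{\perp_a} \setminus C$.
Suppose now that $v \in C^{\perp_a}$ has symplectic weight $w$ and let $W$ denote the set of positions that contribute to the weight, $W = \{ j \in \{1, \ldots, n\} \mid (v_j, v_{n+j}) \neq (0, 0) \}$. Clearly, $|W| = w$.
Denote by $x_j$ the $j$-th column of $G$. Since $v \in C^{\perp_a}$, we have $\sum_{j \in W} (v_{n+j} x_j - v_j x_{n+j}) = 0$.
Each summand corresponds to some point of $\ell_j$. Thus, there are $w = |W|$ points on distinct lines $\{\ell_j \mid j \in W\}$ which are dependent.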
However, since the minimum distance d is the minimum symplectic weight of C ⊥a \ C, we have to disregard this dependency if v ∈ C.
A vector $v$ is in $C$ if and only if $v = aG$ for some $a \in \mathbb{F}_2^{n-k}$.
First, consider those positions $j$ of $v$ that do not contribute to its symplectic weight, that is, $j \notin W$. For each $j \notin W$, one has that $v_j = a \cdot x_j = 0$ and $v_{n+j} = a \cdot x_{n+j} = 0$ if and only if the line $\ell_j$ is contained in the hyperplane $\pi_a$ described by $a \cdot X = 0$. So the lines of $\{\ell_j \mid j \in \{1, \ldots, n\} \setminus W\}$ are contained in $\pi_a$.
Second, consider those positions $j$ of $v$ that contribute to its symplectic weight, $j \in W$. Then $a \cdot (v_{n+j} x_j - v_j x_{n+j}) = v_{n+j} v_j - v_j v_{n+j} = 0$, since $v_j = a \cdot x_j$ and $v_{n+j} = a \cdot x_{n+j}$. Hence, the dependent points are also contained in the hyperplane $a \cdot X = 0$.
This exactly coincides with our definition of d(X ). Now, consider the case k = 0. By Theorem 2.12, the parameter d is the minimum non-zero symplectic weight of C.
Let v ∈ C be of minimum non-zero symplectic weight. Since v ∈ C, v = aG for some a ∈ F n−k 2 . Thus, v j = a · x j for all j = 1, . . . , 2n.
Let $W$ denote the set of positions that contribute to the symplectic weight of $v$, i.e. $W = \{ j \in \{1, \ldots, n\} \mid (v_j, v_{n+j}) \neq (0, 0) \}$.
Then, for $j \notin W$, $a \cdot x_j = a \cdot x_{n+j} = 0$, which is equivalent to the line $\ell_j \in X$ being contained in the hyperplane $a \cdot X = 0$. Therefore, there is a hyperplane of $\Theta(X)$ containing $|X| - d$ lines of $X$, which coincides with our definition of $d(X)$ in this case.
Alternatively, since $C = C^{\perp_a}$, the parameter $d$ is the minimum non-zero symplectic weight of $C^{\perp_a}$. As in the case $k \neq 0$, a vector $v = (v_1, \ldots, v_{2n}) \in C^{\perp_a}$ of symplectic weight $d$ will give a dependency of $d$ points on distinct lines of $X$, which coincides with our alternative definition of $d(X)$ in this case.
Let $G = G(S)$ be the $(n - k) \times 2n$ generator matrix for a code $C$, whose $i$-th and $(i + n)$-th columns span the $i$-th line of $X$. Let $S = \tau^{-1}(C)$ and let $Q(S)$ be the stabilizer code. By Theorem 3.7 and the fact that $\Theta(X)$ = PG(n − k − 1, 2), $Q(S)$ is an [[n, k, d]] stabilizer code for some $d$. The fact that $d = d(X)$ follows from the same arguments as in the forward implication, observing that if $a \cdot (v_{n+j} x_j - v_j x_{n+j}) = 0$ then $v_j = a \cdot x_j$ and $v_{n+j} = a \cdot x_{n+j}$, assuming $(a \cdot x_j, a \cdot x_{n+j}) \neq (0, 0)$. This is precisely the assumption that $\ell_j$ is not contained in the hyperplane $\pi_a$.

Example 3.9 (Shor code) As we saw in Example 3.5, the Shor code has a generator matrix whose associated quantum set of lines is drawn in Figure 2. Here, $\langle e_i, e_j \rangle$ denotes the line spanned by the points $e_i$ and $e_j$.
Note that the point $e_7$ is on the two lines $\langle e_1, e_7 \rangle$ and $\langle e_1 + e_2, e_7 \rangle$, and thus $e_7$ is "dependent with itself". So at first sight it seems that $d(X) = 2$. However, the remaining seven lines span a six-dimensional subspace, since the two planes $\langle e_3, e_4, e_7 + e_8 \rangle$ and $\langle e_5, e_6, e_8 \rangle$ span a five-dimensional subspace, while the line $\langle e_2, e_7 \rangle$ extends this to a six-dimensional subspace that also contains the point $e_7$ (i.e. contains all dependent points). Following Theorem 3.8, we do not count this dependency and conclude that $d(X) \geq 3$. The dependency of $e_7$ with itself implies that the Shor code is impure. The dependent points $\{e_1, e_2, e_1 + e_2\}$ imply that $d(X) = 3$. Although the six lines not containing these points are contained in a hyperplane, there is no hyperplane containing the six lines and the dependent points, thus we do not disregard this dependency. Thus, we see that condition b) is essential in the definition of $d(X)$.
Let us generalize one feature of the Shor code further: a planar pencil of lines in a projective space is a set of lines which are all contained in some plane and are all the lines incident with a point in that plane. As illustrated in Figure 2, the Shor code is the union of three planar pencils. Observe that a planar pencil of lines is itself a quantum set of lines. Our aim is to show that a quantum set of lines is nothing more than the union modulo two of planar pencils of lines. We first prove a few lemmas. Proof. Let X and Y be two quantum sets of lines. Recall that Θ(X ), Θ(Y), and Θ(X ∪ Y) are the spaces spanned by X , Y, and both sets of lines respectively. A co-dimension 2 subspace π intersects Θ(X ) in either a co-dimension 2 subspace, in a hyperplane, or in Θ(X ). In the first case it is skew to an even number of the lines of X ; in the latter two cases it is skew to none (which is even).
Let $X'$ be the subset of $X$ of lines skew to $\pi$. Likewise, let $Y'$ be the subset of $Y$ of lines skew to $\pi$. Then $\pi$ is skew to $|X'| + |Y'| - 2|X' \cap Y'|$ lines of the union modulo two of $X$ and $Y$.
Since both $|X'|$ and $|Y'|$ are even, every co-dimension 2 subspace is skew to an even number of lines of the union modulo two of $X$ and $Y$. This proves the lemma.
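The lemma can be illustrated with a small computation in PG(3, 2). In the sketch below, which is not part of the original text, a line is stored as the set of its three points, a planar pencil is built from a point and two further points of its plane, and the union modulo two is the symmetric difference of the two sets of lines. The helper names are hypothetical. The test runs over all pairs $(a, b)$ in the ambient space; as noted in the proof above, a co-dimension 2 subspace that meets $\Theta(X)$ in a hyperplane or in all of $\Theta(X)$ is skew to no line, so this agrees with the definition.

```python
from itertools import product

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

def line(x, y):
    """The line of PG(m-1, 2) through x and y, as the set of its three points."""
    return frozenset({x, y, add(x, y)})

def planar_pencil(p, u, v):
    """The three lines through p in the plane spanned by p, u and v."""
    return {line(p, u), line(p, v), line(p, add(u, v))}

def is_quantum(lines, m):
    """Every intersection pi_a ∩ pi_b must be skew to an even number of the lines."""
    vectors = list(product(range(2), repeat=m))
    for a in vectors:
        for b in vectors:
            skew = sum(
                1 for l in lines
                if not any(dot(a, z) == 0 and dot(b, z) == 0 for z in l)
            )
            if skew % 2:
                return False
    return True

e1, e2, e3, e4 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
pencil_a = planar_pencil(e1, e2, e3)
pencil_b = planar_pencil(e1, e3, e4)   # shares the line <e1, e3> with pencil_a
union_mod_two = pencil_a ^ pencil_b    # symmetric difference: four concurrent lines
two_lines = {line(e1, e2), line(e1, e3)}

print(len(union_mod_two), is_quantum(union_mod_two, 4), is_quantum(two_lines, 4))
```

The four lines of `union_mod_two` are concurrent at `e1` and any three of them span PG(3, 2), so they form a 3-sputnik in the sense defined below.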
An r-sputnik is a set of (r + 1) concurrent lines (they are all incident with some point) in an r-dimensional subspace π with the property that any r of them span π. In Figure 3 a 3-sputnik is illustrated.
Our aim will be to prove that a quantum set of lines is the union modulo two of planar pencils of lines. Firstly we will prove that this claim is true for an r-sputnik.

Proof. Let $X$ be an r-sputnik and take any two lines $\ell, \ell' \in X$. The $r - 1$ lines of $X \setminus \{\ell, \ell'\}$ span an (r − 1)-dimensional subspace which intersects the plane spanned by $\ell$ and $\ell'$ in a line $\ell''$. The line $\ell''$ is the third line in the planar pencil of lines spanned by $\ell$ and $\ell'$. Thus, adding (modulo 2) this pencil of lines to $X$ we get an (r − 1)-sputnik. Now continue adding planar pencils of lines in this way until we get a 2-sputnik. Since a 2-sputnik is a planar pencil of lines, it is a quantum set of lines. We can then reverse the process, adding planar pencils of lines to recover the r-sputnik which, by Lemma 3.10, is also a quantum set of lines.
Lemma 3.12 Let X be a quantum set of lines. There is a set D of dependent points such that each point of D is incident with a different line of X .
Proof. Let $\pi = \Theta(X)$ be the subspace spanned by the lines of $X$ and let $\ell \in X$. Let $\pi' = \Theta(X \setminus \{\ell\})$ be the subspace spanned by the lines of $X \setminus \{\ell\}$. The subspace $\pi'$ is either a co-dimension 2 subspace of $\pi$, a hyperplane of $\pi$, or $\pi$ itself. The first case is ruled out since $X$ is a quantum set of lines and, by definition, any co-dimension 2 subspace is skew to an even number of lines of $X$. Therefore, there is a point $x$ of $\ell$ incident with $\pi'$. Any point of $\pi'$ is the sum of points incident with the lines of $X \setminus \{\ell\}$. Thus, we obtain a set of dependent points each incident with a line of $X$. If in this set there are two points $y$ and $z$ incident with the same line $\ell'$ of $X$, then we can replace $y$ and $z$ by the third point of $\ell'$, that is, by $\ell' \setminus \{y, z\}$. Hence, we obtain a set of dependent points each incident with a distinct line of $X$.

Lemma 3.13 A quantum set of three lines is a planar pencil of lines.
Proof. Suppose that the quantum set of three lines $X = \{\ell_1, \ell_2, \ell_3\}$ spans PG(4, 2) or PG(5, 2) respectively. Then there is a point $x \in \ell_2$ such that the co-dimension 2 subspace spanned by $\ell_1$ and $x$ (resp. $\ell_1$ and $\ell_2$) is skew to $\ell_3$. This contradicts the definition of a quantum set of lines.
Suppose that the quantum set of three lines $X = \{\ell_1, \ell_2, \ell_3\}$ spans PG(3, 2). If $\ell_1$ and $\ell_2$ intersect then the co-dimension 2 subspace $\ell_1$ (and also $\ell_2$) must also intersect $\ell_3$. Since they span PG(3, 2) the three lines must be concurrent (and not co-planar). Taking the union modulo 2 of $X$ and the planar pencil of lines spanned by $\ell_2$ and $\ell_3$ we obtain, by Lemma 3.10, a quantum set of two lines, which does not exist. Thus we have three pairwise skew lines $\ell_1, \ell_2, \ell_3$ with the property that any line incident with two of them is incident with the third. This implies there are nine lines which are all incident with exactly one point of each of $\ell_1, \ell_2, \ell_3$. This is impossible, since each of the three points of $\ell_3$ lies on exactly one line meeting both $\ell_1$ and $\ell_2$. Therefore, the quantum set of three lines spans a PG(2, 2). A co-dimension 2 subspace is just a point, so a quantum set of lines must be incident with every point of the plane. Hence, $X$ is a planar pencil of lines.
The following theorem is due to Glynn, Gulliver, Maks and Gupta [5]. It is important to note that if the qubit stabilizer code has minimum distance 2 then it is possible that the quantum set of lines X contains repeated lines. This occurs, for example, in the [[5, 2, 2]] code.
Theorem 3.14 A qubit stabilizer code with minimum distance at least three is equivalent to a quantum set of lines which is generated by the union modulo two of planar pencils of lines.
Proof. Let $X$ be a quantum set of lines. We will prove that there is an r-sputnik $X'$ such that the union modulo 2 of $X$, $X'$ and some planar pencils of lines is a quantum set of $|X| - 1$ lines. Since, by Lemma 3.11, $X'$ is the union modulo 2 of planar pencils of lines, this implies that, by iteration, we can take the union modulo 2 of $X$ and some planar pencils of lines and obtain a quantum set of three lines, by Lemma 3.10. By Lemma 3.13, this set of three lines is a planar pencil of lines and we are done.
By Lemma 3.12, there is a set $x_1, \ldots, x_{r+1}$ of minimally dependent points incident with the lines $\ell_1, \ldots, \ell_{r+1}$ of $X$, respectively. Let $x \in \ell_{r+1} \setminus \{x_{r+1}\}$. Let $\ell'_j$ be the line spanned by the points $x$ and $x_j$, for $j = 1, \ldots, r$. Let $X'$ be the r-sputnik $X' = \{\ell'_1, \ldots, \ell'_r, \ell_{r+1}\}$. Let $L_j$ be the planar pencil of lines spanned by $\ell_j$ and $\ell'_j$. In Figure 6, $r = 5$, the lines $\ell_j$ are the thick lines, the $\ell'_j$ are the medium thickness lines and the thin lines are the third line in the planar pencil of lines spanned by $\ell_j$ and $\ell'_j$.
By Lemma 3.10, the union modulo two of $L_1, \ldots, L_r$, $X$ and $X'$ is a quantum set of lines and, on inspection, it is a set of $|X| - 1$ lines.

Figure 6: The thick lines are in $X$, the medium-thick lines are in $X'$ and the thin lines make up the planar pencils at each point $x_1, \ldots, x_r$.

PG(3, 2). Then, since any two of the thick lines are pairwise skew, we have that the minimum distance is 3.

Research Problem 1 The parameters [[14, 3, 5]] are the smallest for which it is unknown whether there exists a qubit stabilizer code or not [8]. To construct such a code one should look for a union modulo two of planar pencils of lines that gives 14 lines in PG(10, 2), such that for any four points, on four of the 14 lines, that lie on a common plane, the remaining 10 lines are contained in a hyperplane which also contains those four dependent points.
Theorem 3.14 can also be used to rule out the existence of quantum codes with certain parameter sets. For example, were a [[4, 0, 3]] stabilizer code to exist then $X$ would be a set of four skew lines in PG(3, 2) with the property that any line is skew to an even number of lines of $X$. However, the lines of $X$ themselves are skew to the other three lines of $X$, which is an odd number. A more interesting exercise is to prove that a [[7, 0, 4]] code does not exist. To prove this, show that there are at least five three-dimensional subspaces which intersect all of the 7 lines of PG(6, 2) in the quantum set of lines and prove that these pairwise intersect in a point.

Direct sum of stabilizer codes
As discussed in the previous sections, a stabilizer code is defined as the common (+1)-eigenspace of an independent set of pairwise commuting Pauli operators $M_1, \ldots, M_{n-k}$; these are the generators of the code. In other words, these codes are completely characterized by an abelian subgroup $S = \langle M_1, \ldots, M_{n-k} \rangle \subset P_n$. The aim of this section is to construct quantum codes that are the direct sum of stabilizer codes. Technically speaking, any subspace can be regarded as a quantum code, and naturally we want to make sure to obtain a large minimum distance when taking this direct sum of subspaces. Thus, we seek some additional structure amongst them. While each individual subspace will again be defined by a set of generators $M_1, \ldots, M_{n-k}$, we will now not simply take the joint eigenspace with eigenvalue 1 as our code space.
We have already observed that, to avoid constructing a trivial code, one requires that the stabilizer does not contain a non-trivial multiple of the identity; in particular, $-1 \notin S$. This implies that each generator can only have an overall phase of +1 or −1 and they are of the form $M_j = \pm \sigma_1 \otimes \cdots \otimes \sigma_n$ for some $\sigma_1, \ldots, \sigma_n \in P_1$. Now observe that when $M_1, \ldots, M_{n-k}$ commute, then so do $\pm M_1, \ldots, \pm M_{n-k}$.
Thus for all $t = (t_1, \ldots, t_{n-k}) \in \{0, 1\}^{n-k}$, one can define a corresponding stabilizer code $Q(S_t)$ as the joint (+1)-eigenspace of $(-1)^{t_1} M_1, \ldots, (-1)^{t_{n-k}} M_{n-k}$.

For distinct $t$ and $t' \in T$, there is a $j$ such that $t_j \neq t'_j$. Without loss of generality, suppose that $t_j = 1$. For all $|v\rangle \in Q(S_t)$ and $|w\rangle \in Q(S_{t'})$, one has $\langle v | w \rangle = \langle v | M_j w \rangle = \langle M_j v | w \rangle = -\langle v | w \rangle$, and hence $\langle v | w \rangle = 0$. Consequently, $Q(S_t)$ and $Q(S_{t'})$ are orthogonal.
For any $T \subseteq \{0, 1\}^{n-k}$, we define a direct sum stabilizer code (confusingly also known as a union stabilizer code) as $Q(S_T) = \bigoplus_{t \in T} Q(S_t)$. To be able to determine the minimum distance of this quantum code, we first determine the errors which are not detectable.
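In the following lemma, the set of coordinates in which two $(n - k)$-tuples $t$ and $t'$ differ is denoted by $\mathrm{supp}(t + t') = \{ j \mid t_j \neq t'_j \}$.

Lemma 4.1 Suppose $Q(S_T)$ is unable to detect an error $E$. Then there is a pair $t, t' \in T$ such that $E$ commutes with $M_j$ for all $j \in \{1, \ldots, n - k\} \setminus \mathrm{supp}(t + t')$.
Proof. Suppose there is no such pair. Then, for all $t, t' \in T$, there is a $j \in \{1, \ldots, n - k\} \setminus \mathrm{supp}(t + t')$ for which $E$ anti-commutes with $M_j$.
For any $u \in Q(S_t)$ and $u' \in Q(S_{t'})$, we have either $\langle u | E | u' \rangle = \langle u | E M_j | u' \rangle = -\langle u | M_j E | u' \rangle = -\langle u | E | u' \rangle$, in the case that both $u$ and $u'$ are eigenvectors of $M_j$ with eigenvalue 1, or $\langle u | E | u' \rangle = -\langle u | E M_j | u' \rangle = \langle u | M_j E | u' \rangle = -\langle u | E | u' \rangle$, in the case that both $u$ and $u'$ are eigenvectors of $M_j$ with eigenvalue −1. In both cases $\langle u | E | u' \rangle = 0$.
Similarly, for any $u, u' \in Q(S_t)$, since $E$ anti-commutes with $M_j$, $\langle u | E | u' \rangle = 0$.
Suppose that $B_t$ is an orthogonal basis for $Q(S_t)$. Since $Q(S_T)$ is a direct sum of orthogonal subspaces, $B_T = \bigcup_{t \in T} B_t$ is an orthogonal basis for $Q(S_T)$.
Suppose that $w, w' \in Q(S_T)$ and that $\langle w | w' \rangle = 0$. Writing out $w = \sum_{u \in B_T} \lambda_u u$ and $w' = \sum_{u' \in B_T} \mu_{u'} u'$ with respect to the basis $B_T$ we have $\langle w | E | w' \rangle = \sum_{u, u' \in B_T} \overline{\lambda_u} \mu_{u'} \langle u | E | u' \rangle = 0$. This implies that $E$ is detectable, a contradiction. This ends the proof.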
Thus, according to Lemma 4.1, we only need concern ourselves with the errors which are in Centraliser($S_{t+t'}$) for some $t, t' \in T$.
This motivates the definition $d_T = \min \{ d_{t+t'} \mid t, t' \in T \}$ (4), where $d_{t+t'}$ is the minimum distance of $Q(S_{t+t'})$.

Theorem 4.2
The subspace $Q(S_T)$ is an $((n, |T| 2^k, d_T))$ quantum code.
Proof. If $E$ is undetectable then it is an element of Centraliser($S_{t+t'}$) for some $t, t' \in T$. Thus, the minimum distance of $Q(S_T)$ will be the minimum of the minimum distances of $Q(S_{t+t'})$.

The Rains, Hardin, Shor, Sloane non-additive quantum code
This code first appeared in [15], although the geometric observation given here appears to be new. Observe that deleting any two rows of this matrix we obtain a 3 × 10 matrix whose 5 pairs of columns define a quantum set of lines in PG(2, 2). This quantum set of lines defines a stabilizer code whose minimum distance is 2. Therefore, if we set $T = \{0, e_1, \ldots, e_5\}$, where $e_j$ denotes the $j$-th standard basis vector, then, by Theorem 4.2, $Q(S_T)$ is a ((5, 6, 2)) quantum code.

The geometry of direct sum stabilizer codes
Suppose that we restrict our choice of elements of T to singleton subsets and the empty set, as in Example 4.3. Let X be the quantum set of lines of PG(n − k − 1, 2) associated with the [[n, k, d]] quantum stabilizer code Q(S), where S is the subgroup generated by M 1 , . . . , M n−k . Let P = {e 1 , . . . , e r } be a set of linearly independent points of PG(n−k −1, 2), chosen so that the projection from any two points e i , e j ∈ P of the lines of X is a set of lines of PG(n − k − 3, 2). If this projection is a set of lines then it is necessarily a quantum set of lines, which we denote by X ij .
If we choose a basis so that $e_j \in P$ is the $j$-th element in the basis then the projection from $e_i$ and $e_j$ gives a stabilizer code generated by the elements $M_l$ with $l \in \{1, \ldots, n - k\} \setminus \{i, j\}$. The parameter $d(X_{ij})$ is then the minimum distance of the stabilizer code $Q(S_{e_i + e_j})$. Thus, the definition in (4) can be expressed in terms of the parameters $d(X_{ij})$. Hence, we have a purely geometric way to construct direct sum stabilizer codes with parameters $((n, (r + 1) 2^k, d_T))$, for some $r \leq n - k$.
Research Problem 2 Find quantum sets of lines X for which there are points with the property that the projection of the lines of X from any pair is onto a quantum set of lines X with relatively large d(X ). It should be possible to make direct sum stabilizer codes with good parameters from this geometrical construction. It would be of great interest if one could construct codes with parameters for which stabilizer codes could feasibly exist but none are known to exist.

The higher-dimensional Pauli group
When a quantum system has D levels we speak of a quDit. In this section, we will consider quantum codes over such larger subsystems. Consequently, these codes are subspaces of the Hilbert space (C D ) ⊗n .
We will consider $(\mathbb{C}^q)^{\otimes n}$, where $q = p^h$ is a power of a prime $p$. The restriction to prime powers allows us to use the structure of the finite field in the construction of these codes.
In the case when D is not a prime power, one can use the ring Z/DZ, but then most of the constructions that we will consider here will not work.
We label the coordinates of C q with elements of F q , where F q denotes the finite field with q elements. In this way, a basis for the space of endomorphisms of C q can be indexed by the elements of F q × F q .
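For each $a \in \mathbb{F}_q$, we define a $q \times q$ matrix $X(a)$ to be the matrix of the linear map which permutes the coordinates of $\mathbb{C}^q$ by adding $a$ to the index.
In other words, with basis $\{ |x\rangle \mid x \in \mathbb{F}_q \}$ of $\mathbb{C}^q$, $X(a)|x\rangle = |x + a\rangle$.

For each $b \in \mathbb{F}_q$, we define a $q \times q$ matrix $Z(b)$ to be the diagonal matrix whose $i$-th diagonal entry is $\omega^{\mathrm{tr}(ib)}$. Here, $\omega = e^{2\pi i/p}$ is a primitive $p$-th root of unity and tr is the trace map from $\mathbb{F}_q$ to its prime subfield $\mathbb{F}_p$, $\mathrm{tr}(a) = a + a^p + \cdots + a^{p^{h-1}}$.

As in the previous case, if we take say $q = 3$ then $Z(b) = \mathrm{diag}(1, \omega^{b}, \omega^{2b})$, where $\omega$ is a primitive complex third root of unity. Recall that the rows and columns of the matrix are indexed by elements of $\mathbb{F}_q$, so $i \in \mathbb{F}_q$. Thus, $Z(b)|x\rangle = \omega^{\mathrm{tr}(bx)} |x\rangle$.

For $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$ in $\mathbb{F}_q^n$ we write $X(a) = X(a_1) \otimes \cdots \otimes X(a_n)$ and $Z(b) = Z(b_1) \otimes \cdots \otimes Z(b_n)$. We define the Pauli group for $q$ odd as $P_n = \{ \omega^c X(a) Z(b) \mid a, b \in \mathbb{F}_q^n, \ c \in \mathbb{F}_p \}$ and for $q$ even, that is when $p = 2$, as $P_n = \{ i^c X(a) Z(b) \mid a, b \in \mathbb{F}_q^n, \ c \in \mathbb{Z}/4\mathbb{Z} \}$. The reason that we accommodate this slightly larger group for $q$ even is due to Lemma 5.2 below. One can check that this definition coincides with our definition of the Pauli group for $q = 2$.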
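The size of $P_n$ is $p q^{2n}$ for $q$ odd and $4 q^{2n}$ for $q$ even.
The following lemma implies that non-identity elements of the Pauli group have order $p$, for $q$ odd. Note that for $q$ even this is not the case; there are elements of order four. However, we extend the Pauli group as above (defining $\sigma_y = i \sigma_x \sigma_z$) and in this way we introduce more elements of order two. We do this so that we have more options for $M_i$ in our set of pairwise commuting operators which will generate the abelian subgroup $S$.

Lemma 5.2 For all $a, b \in \mathbb{F}_q^n$ and $r \in \mathbb{N}$, $(X(a)Z(b))^r = \omega^{\binom{r}{2} \mathrm{tr}(a \cdot b)} X(ra) Z(rb)$.

Proof. By induction on $r$, we have $(X(a)Z(b))^{r} = \omega^{\binom{r-1}{2} \mathrm{tr}(a \cdot b)} X((r-1)a) Z((r-1)b) X(a) Z(b)$. By Lemma 5.1, this is equal to $\omega^{\binom{r-1}{2} \mathrm{tr}(a \cdot b) + (r-1) \mathrm{tr}(a \cdot b)} X(ra) Z(rb) = \omega^{\binom{r}{2} \mathrm{tr}(a \cdot b)} X(ra) Z(rb)$.

As in the case of qubit codes, we will again be looking to construct stabilizer codes and for this reason it will be of interest to know when elements $M, N \in P_n$ commute or not. For this reason the following lemma is fundamental.

Lemma 5.3 For all $a, b, a', b' \in \mathbb{F}_q^n$, $X(a)Z(b) X(a')Z(b') = \omega^{\mathrm{tr}(b \cdot a' - b' \cdot a)} X(a')Z(b') X(a)Z(b)$.

Proof. $X(a)$ and $X(a')$ commute, likewise $Z(b)$ and $Z(b')$, so the lemma follows from Lemma 5.1.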
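These relations are easy to check numerically in the simplest odd case. The sketch below is illustrative and makes two assumptions that go beyond the text: it restricts to $n = 1$ and $q = 3$ (a prime, so the trace map is the identity), and it takes the basic commutation relation to be $Z(b)X(a) = \omega^{ab} X(a)Z(b)$, which follows from the definitions $X(a)|x\rangle = |x + a\rangle$ and $Z(b)|x\rangle = \omega^{bx}|x\rangle$. The helper names are hypothetical.

```python
import cmath
from math import comb

q = 3  # a prime, so F_q is the integers modulo q and tr is the identity map
omega = cmath.exp(2j * cmath.pi / q)

def X(a):
    """Permutation matrix sending |x> to |x + a>."""
    return [[1 if row == (col + a) % q else 0 for col in range(q)] for row in range(q)]

def Z(b):
    """Diagonal matrix with entries omega^(b x), x in F_q."""
    return [[omega ** ((b * row) % q) if row == col else 0 for col in range(q)]
            for row in range(q)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(q)) for j in range(q)] for i in range(q)]

def scale(c, A):
    return [[c * entry for entry in row] for row in A]

def close(A, B):
    return all(cmath.isclose(A[i][j], B[i][j], abs_tol=1e-9)
               for i in range(q) for j in range(q))

def power(A, r):
    result = X(0)  # X(0) is the identity matrix
    for _ in range(r):
        result = matmul(result, A)
    return result

# Z(b) X(a) = omega^(ab) X(a) Z(b) for all a, b.
commutation_holds = all(
    close(matmul(Z(b), X(a)), scale(omega ** ((a * b) % q), matmul(X(a), Z(b))))
    for a in range(q) for b in range(q)
)

# (X(1) Z(1))^r = omega^(binom(r, 2)) X(r) Z(r) for small r.
M = matmul(X(1), Z(1))
power_formula_holds = all(
    close(power(M, r), scale(omega ** (comb(r, 2) % q), matmul(X(r % q), Z(r % q))))
    for r in range(6)
)

# For q odd, a non-identity element has order p = 3.
order_is_three = close(power(M, 3), X(0)) and not close(M, X(0)) and not close(power(M, 2), X(0))
print(commutation_holds, power_formula_holds, order_is_three)
```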

Error detection and correction
As in the case of qubit codes it suffices to consider errors from the group $P_n$ of Pauli-errors, which are unitary operators of the form $c X(a) Z(b)$, where $a, b \in \mathbb{F}_q^n$ and $c$ is a scalar.

Let $Q$ be a quantum error-correcting code of $(\mathbb{C}^q)^{\otimes n}$, i.e. a subspace of $(\mathbb{C}^q)^{\otimes n}$.
Then again, as in the case of qubit codes, $Q$ detects an error $E \in P_n$ if for all $|\phi\rangle, |\psi\rangle \in Q$ we have that $\langle \phi | E | \psi \rangle = c_E \langle \phi | \psi \rangle$ for some constant $c_E$ which depends only on $E$. In particular, $\langle \phi | E | \psi \rangle = 0$ whenever $\langle \phi | \psi \rangle = 0$.
A quantum code $Q$ has minimum distance $d$ if one can detect Pauli-errors with up to $d - 1$ non-identity matrices and correct Pauli-errors with up to $\lfloor (d-1)/2 \rfloor$ non-identity matrices.
We say that a quantum code of $(\mathbb{C}^q)^{\otimes n}$ of dimension $K$ and minimum distance $d$ is a $((n, K, d))_q$ code. If the code has dimension $K = q^k$ then we say that the code is an $[[n, k, d]]_q$ code. Note that some authors reserve the latter notation $[[n, k, d]]_q$ for stabilizer codes only.

Stabilizer codes
A stabilizer code is the intersection of the eigenspaces with eigenvalue one of the elements of an abelian subgroup $S$ of $P_n$. As before, we denote the code by $Q(S)$. We insist that $\lambda 1 \notin S$ whenever $\lambda \neq 1$, since otherwise $Q(S)$ is trivial.
As in the qubit case, a stabilizer code $Q(S)$ with stabilizer $S$ can detect all Pauli-errors that are scalar multiples of elements in $S$ or that do not commute with some element of $S$. We denote by Centraliser($S$) the elements of $P_n$ that commute with all elements of $S$. A non-detectable Pauli-error must be in Centraliser($S$).
Commuting elements are characterised as follows. Two elements $c X(a) Z(b)$ and $c' X(a') Z(b')$ of $P_n$ commute if and only if $\mathrm{tr}(a \cdot b' - a' \cdot b)$ is zero.
As in the case for qubit codes, we introduce the map τ which maps elements of P n to F 2n q by τ (X(a)Z(b)) = (a|b).
For elements $u, w \in \mathbb{F}_q^{2n}$, the trace symplectic form is $(u, w)_a = \sum_{j=1}^{n} \mathrm{tr}(u_j w_{j+n} - w_j u_{j+n})$.
Then with $u = (a|b)$ and $w = (a'|b')$, the quantity $\mathrm{tr}(a \cdot b' - a' \cdot b)$ is the trace symplectic form (5).

Stabiliser codes as additive codes over F q
Let $\tau$ be the map that maps $c X(a) Z(b)$ to $(a|b) \in \mathbb{F}_q^{2n}$. The group $S$ is mapped to an additive code $C = \tau(S)$. The symplectic weight of $(a|b) \in \mathbb{F}_q^{2n}$ is the number of $i \in \{1, \ldots, n\}$ such that $(a_i, b_i) \neq (0, 0)$. Thus, an element $c X(a) Z(b)$ of weight $w$ is mapped to a vector of symplectic weight $w$.
The elements of Centraliser($S$) are mapped to the dual code of $C$, namely $C^{\perp_a} = \{ w \in \mathbb{F}_q^{2n} \mid (u, w)_a = 0 \text{ for all } u \in C \}$. Here the dual $\perp_a$ is taken with respect to the trace symplectic form (6).
We have the following important theorem.
Theorem 5.4 An $((n, K, d))_q$ stabilizer code exists if and only if there exists an additive code $C \leqslant \mathbb{F}_q^{2n}$ of size $|C| = q^n/K$ such that $C \leqslant C^{\perp_a}$. If $K \neq 1$ then $d$ is the minimum symplectic weight of an element of $C^{\perp_a} \setminus C$, otherwise $d$ is the minimum non-zero symplectic weight of an element of $C^{\perp_a} = C$.
Proof. Let $S$ be an abelian subgroup of $\mathcal{P}_n$ not containing non-trivial multiples of the identity. Let $Q(S)$ be the corresponding $((n, K, d))_q$ stabilizer code and let
$$P = \frac{1}{|S|} \sum_{E \in S} E$$
be the orthogonal projection onto $Q(S)$. Since $P$ is Hermitian and $P^2 = P$, the dimension of its image $Q(S)$ is equal to the trace of $P$. Since $\mathrm{tr}(M) = 0$ for all $M \in \mathcal{P}_n$ that are not scalar multiples of $\mathbb{1}$, and $\mathrm{tr}(\mathbb{1}) = q^n$, one has $\mathrm{tr}(P) = q^n/|S|$ and so $|S| = q^n/K$, since $\dim Q(S) = K$.
We note that $C = \tau(S)$ is an additive code since $S$ is an abelian subgroup, and that it has size $|C| = |S| = q^n/K$. Since $\tau(\mathrm{Centraliser}(S)) = C^{\perp_a}$, we have $C \leqslant C^{\perp_a}$.
For $K \neq 1$, the minimum symplectic weight of any element of $C^{\perp_a} \setminus C$ is $d$, since the minimum distance of $Q(S)$ is the minimum weight of the Pauli operators in $\mathrm{Centraliser}(S) \setminus S$. As in the qubit case, if $K = 1$ then we define the minimum distance of $Q(S)$ to be the minimum non-zero weight of the Pauli operators in $\mathrm{Centraliser}(S) = S$, which is equal to the minimum non-zero symplectic weight of any element of $C^{\perp_a} = C$.
The backwards implication is similar. Let $S = \tau^{-1}(C)$ and define the stabilizer code to be $Q(S)$. Then the dimension follows as above. If $K \neq 1$ then the minimum distance of $Q(S)$ corresponds as above to the minimum symplectic weight of an element of $C^{\perp_a} \setminus C$, since $\mathrm{Centraliser}(S)$ is equal to $\tau^{-1}(C^{\perp_a})$ up to a scalar factor. If $K = 1$ then the minimum distance of $Q(S)$ corresponds to the minimum non-zero symplectic weight of the elements of $C^{\perp_a} = C$.
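Theorem 5.4 turns the computation of the parameters into a finite search, which can be sketched by brute force for small codes. The sketch below assumes $q = p$ prime (so that additive means $\mathbb{F}_p$-linear and the trace is the identity) and uses the well-known five-qubit code with generators $XZZXI$ and its cyclic shifts as test input; this example is brought in from outside this section, and the function names are illustrative.

```python
from itertools import product

def symplectic_form(u, w, p):
    """Form (6) for prime q = p, where the trace map is the identity."""
    n = len(u) // 2
    return sum(u[j] * w[j + n] - w[j] * u[j + n] for j in range(n)) % p

def symplectic_weight(v):
    """Number of positions i with (v_i, v_{i+n}) != (0, 0)."""
    n = len(v) // 2
    return sum(1 for i in range(n) if (v[i], v[i + n]) != (0, 0))

def span(gens, p):
    """All F_p-linear combinations of the generators."""
    length = len(gens[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % p for i in range(length))
        for coeffs in product(range(p), repeat=len(gens))
    }

def stabilizer_parameters(gens, p):
    """Return (n, K, d) as in Theorem 5.4, by exhaustive search."""
    n = len(gens[0]) // 2
    # C must be contained in its symplectic dual.
    assert all(symplectic_form(g, h, p) == 0 for g in gens for h in gens)
    C = span(gens, p)
    dual = [v for v in product(range(p), repeat=2 * n)
            if all(symplectic_form(v, g, p) == 0 for g in gens)]
    K = p ** n // len(C)
    if K != 1:
        candidates = [v for v in dual if v not in C]
    else:
        candidates = [v for v in dual if any(v)]
    return n, K, min(symplectic_weight(v) for v in candidates)

# tau-images (a|b) of the generators of the five-qubit code.
five_qubit = [
    (1, 0, 0, 1, 0, 0, 1, 1, 0, 0),  # X Z Z X I
    (0, 1, 0, 0, 1, 0, 0, 1, 1, 0),  # I X Z Z X
    (1, 0, 1, 0, 0, 0, 0, 0, 1, 1),  # X I X Z Z
    (0, 1, 0, 1, 0, 1, 0, 0, 0, 1),  # Z X I X Z
]
print(stabilizer_parameters(five_qubit, 2))
```

This should print `(5, 2, 3)`, that is, a $((5, 2, 3))_2$ code. The search is exponential in $n$ and is only meant to illustrate the statement of the theorem.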

Constructions
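The following theorem is known as the Calderbank-Shor-Steane construction. The $\perp$ refers to the standard inner product on $\mathbb{F}_q^n$ given by
$$u \cdot v = \sum_{i=1}^{n} u_i v_i.$$
Theorem 5.5 Suppose there are linear codes $C_1$ and $C_2$ with parameters $[n, k_1, d_1]_q$ and $[n, k_2, d_2]_q$, with the property that $C_1^{\perp} \leqslant C_2$. Then there exists an $[[n, k_1 + k_2 - n, d]]_q$ stabilizer code, where $d$ is the minimum weight of the elements in $(C_1 \setminus C_2^{\perp}) \cup (C_2 \setminus C_1^{\perp})$ if $k_1 + k_2 \neq n$, and the minimum non-zero weight of the elements in $C_1 \cup C_2$ if $k_1 + k_2 = n$.
Proof. Let $C = C_1^{\perp} \times C_2^{\perp}$. Then $C$ is a linear code over $\mathbb{F}_q$ and for all $v = (v_1|v_2)$ and $w = (w_1|w_2)$ in $C$,
$$(v, w)_a = \mathrm{tr}(v_1 \cdot w_2 - w_1 \cdot v_2) = 0.$$
In the above the first term vanishes since $v_1 \in C_1^{\perp} \leqslant C_2$ and $w_2 \in C_2^{\perp}$. Likewise, the second term vanishes since $v_2 \in C_2^{\perp}$ and $w_1 \in C_1^{\perp} \leqslant C_2$.
Hence, $C \leqslant C^{\perp_a}$ and Theorem 5.4 applies.
To determine the minimum distance first note that $C^{\perp_a} \geqslant C_2 \times C_1$. The dimension of $C_2 \times C_1$ is $k_1 + k_2$ and the dimension of $C^{\perp_a}$ is $2n - (n - k_1) - (n - k_2) = k_1 + k_2$, so
$$C^{\perp_a} = C_2 \times C_1.$$
Thus, by Theorem 5.4, if $k_1 + k_2 \neq n$ then the minimum distance of the stabilizer code $\tau^{-1}(C)$ is the minimum weight of the elements in $(C_1 \setminus C_2^{\perp}) \cup (C_2 \setminus C_1^{\perp})$. If $k_1 + k_2 = n$ then the minimum distance of the stabilizer code $\tau^{-1}(C)$ is the minimum non-zero symplectic weight of the elements in $C_2 \times C_1 = C_1^{\perp} \times C_2^{\perp}$, which is equal to the minimum non-zero weight of the elements in $C_1 \cup C_2$.
Example 5.6 The ternary extended Golay code $C_1$ is a $[12, 6, 6]_3$ code for which $C_1 = C_1^{\perp}$. Taking $C_2 = C_1$ in Theorem 5.5 gives a $[[12, 0, 6]]_3$ stabilizer code. The 12 Pauli operators generating the stabilizer group $S$ are $X(u)$ and $Z(u)$, where $u$ runs over the six rows of a generator matrix of $C_1$.
The next construction is called the $\mathbb{F}_{q^2}$ trick (for qubit codes this is the $\mathbb{F}_4$ trick). It's not really a trick at all but it is a quick and effective way to construct quantum codes. These codes are a very special type of stabilizer code in which we impose more structure on the additive code $C$.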
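For any two vectors $u, v$ in $\mathbb{F}_{q^2}^n$, we define the Hermitian form
$$(u, v)_h = \sum_{i=1}^{n} u_i v_i^q \qquad (7)$$
and for a $\mathbb{F}_{q^2}$-linear code $E$ we define
$$E^{\perp_h} = \{u \in \mathbb{F}_{q^2}^n \mid (u, v)_h = 0 \text{ for all } v \in E\}.$$
Theorem 5.7 If there is a $[n, n-k, d]_{q^2}$ linear code $D$ such that $D^{\perp_h} \leqslant D$ then there exists an $[[n, n-2k, \geqslant d]]_q$ stabilizer code.
Proof. The code $D^{\perp_h}$ is a $[n, k, d']_{q^2}$ code for some $d'$.
Fix a basis $\{e, e^q\}$ for $\mathbb{F}_{q^2}$ over $\mathbb{F}_q$, where $e^{2q} \neq e^2$.
Let $\theta$ be the map from $\mathbb{F}_{q^2}^n$ to $\mathbb{F}_q^{2n}$ defined by
$$\theta((a_1 e + b_1 e^q, \ldots, a_n e + b_n e^q)) = (a_1, \ldots, a_n | b_1, \ldots, b_n).$$
Let $C = \theta(D^{\perp_h})$, a $2k$-dimensional linear code over $\mathbb{F}_q$ of length $2n$.
For $u \in D^{\perp_h}$ and $u' \in D$, write $u_i = a_i e + b_i e^q$ and $u'_i = a'_i e + b'_i e^q$. Then
$$\sum_{i=1}^{n} u_i (u'_i)^q = 0.$$
This implies
$$\sum_{i=1}^{n} (a_i e + b_i e^q)(a'_i e^q + b'_i e) = 0.$$
Applying the $x \mapsto x^q$ map, we get
$$\sum_{i=1}^{n} (a_i e^q + b_i e)(a'_i e + b'_i e^q) = 0.$$
Subtracting the last two equations,
$$(e^2 - e^{2q}) \sum_{i=1}^{n} (a_i b'_i - b_i a'_i) = 0.$$
Hence, $(\theta(u), \theta(u'))_a = 0$, and so $\theta(D) \leqslant C^{\perp_a}$. Since $|D| = |C^{\perp_a}| = q^{2(n-k)}$, we have that $\theta(D) = C^{\perp_a}$.
Moreover, $C = \theta(D^{\perp_h})$ and $D^{\perp_h} \leqslant D$, so $C \leqslant C^{\perp_a}$. The symplectic weight of $\theta(u)$ is equal to the weight of $u$, so the minimum symplectic weight of $C^{\perp_a} = \theta(D)$ is $d$. The theorem follows from Theorem 5.4.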
We will use the construction of Theorem 5.7 to obtain quantum MDS codes in the next section.

Research Problem 3
If $k$ is small enough one can multiply the columns of a generator matrix for $D^{\perp_h}$ with non-zero scalars to obtain an equivalent code for which $D^{\perp_h} \leqslant D$ holds. It would be interesting to calculate the combinatorial threshold for codes when this can always be done and then deduce properties of codes which surpass this threshold.

The geometry of quqit codes
In the case $q = p^h$, Theorem 5.4 implies that the existence of a $((n, q^n/p^r, d))_q$ stabilizer code $Q(S)$ is equivalent to the existence of an additive code $C \leqslant C^{\perp_a}$ of length $2n$, such that $C$ is generated by $r$ vectors of $\mathbb{F}_q^{2n}$ that are linearly independent over $\mathbb{F}_p$. Thus, the code $C$ is generated over $\mathbb{F}_p$ by the rows of an $r \times 2n$ matrix $G(S)$, whose columns are vectors in $\mathbb{F}_q^r$. We have seen in Section 3.3 that when $h > 1$, we should consider those columns as subspaces of $\mathrm{PG}(r-1, p)$ and not as points of $\mathrm{PG}(r-1, q)$. Let $x_i$ be the $i$-th column of the matrix $G(S)$ and let $e$ be an element of $\mathbb{F}_q$ with the property that $\{1, e, e^2, \ldots, e^{h-1}\}$ is a basis for $\mathbb{F}_q$ over $\mathbb{F}_p$.
Then there are vectors $x_{i,j} \in \mathbb{F}_p^r$ such that
$$x_i = \sum_{j=0}^{h-1} e^j x_{i,j}. \qquad (8)$$
Let $\ell_i$ be the subspace
$$\ell_i = \langle x_{i,0}, \ldots, x_{i,h-1}, x_{i+n,0}, \ldots, x_{i+n,h-1} \rangle$$
as a subspace of $\mathrm{PG}(r-1, p)$.
The following lemma can be considered as a generalisation of Lemma 3.6.
Thus, by Lemma 5.8, the geometry of the stabilizer code $Q(S)$ for which the minimum non-zero weight of $\mathrm{Centraliser}(S)$ is at least two, is given by a set $\mathcal{X}$ of $(2h-1)$-dimensional subspaces of $\mathrm{PG}(r-1, p)$ of size $n$. The following lemma allows us to deduce the minimum distance of $Q(S)$, at least in the case that $Q(S)$ is pure.
Proof. Suppose that there is an element in $\mathrm{Centraliser}(S)$ of weight $w$. Then the image under $\tau$ of this element is a vector $v \in C^{\perp_a}$ with symplectic weight $w$. Let $D \subseteq \{1, \ldots, n\}$ be the symplectic support of $v$. As before, let $x_i$ be the $i$-th column of the matrix $G(S)$ and define $x_{i,j}$ as in (8). Since $v \in C^{\perp_a}$,
$$\sum_{i \in D} \mathrm{tr}(v_i x_{i+n} - v_{i+n} x_i) = 0,$$
where the trace is applied coordinatewise. The summand is a point of the subspace $\ell_i$ and there are $|D| = w$ such points. This proves the backwards implication.
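Suppose there are $w$ dependent points incident with distinct subspaces of $\mathcal{X}$. Then there is a subset $D \subseteq \{1, \ldots, n\}$ of size $w$ and $\lambda_{i,j}, \lambda_{i+n,j} \in \mathbb{F}_p$, such that
$$\sum_{i \in D} \sum_{j=0}^{h-1} (\lambda_{i,j} x_{i,j} + \lambda_{i+n,j} x_{i+n,j}) = 0.$$
Recall that
$$x_i = \sum_{j=0}^{h-1} e^j x_{i,j}.$$
Since $\ell_i$ is a $(2h-1)$-dimensional subspace, the points $x_i, x_i^p, \ldots, x_i^{p^{h-1}}$ are $h$ linearly independent points, which implies there are $\mu_{j,s} \in \mathbb{F}_q$ such that
$$x_{i,j} = \sum_{s=0}^{h-1} \mu_{j,s} x_i^{p^s}.$$
Since $x_{i,j} \in \mathbb{F}_p^r$, we have that $\mu_{j,s} = \mu_j^{p^s}$, for some $\mu_j$, so that $x_{i,j} = \mathrm{tr}(\mu_j x_i)$. Substituting in the above gives
$$\sum_{i \in D} \mathrm{tr}(v_i x_{i+n} - v_{i+n} x_i) = 0, \quad \text{where } v_i = \sum_{j=0}^{h-1} \lambda_{i+n,j} \mu_j \text{ and } v_{i+n} = -\sum_{j=0}^{h-1} \lambda_{i,j} \mu_j.$$
The property that defines $\mathcal{X}$ as a quantum set of lines for $p = 2$ does not carry over to the case $p \geqslant 3$. This is because we can scale any column of $G$ by an element of $\mathbb{F}_q \setminus \{0, 1\}$ and not alter the set of lines $\mathcal{X}$. This will alter the value of $(u, v)_a$, so the geometric interpretation of $C \leqslant C^{\perp_a}$ will not be so clean as in the qubit case.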
Moreover, it is difficult to deduce the pureness of the code directly from the geometry. To see this, suppose that $v \in C^{\perp_a}$ has symplectic support $D$ and for simplicity's sake assume that $q$ is prime. Then
$$\sum_{i \in D} (v_i x_{i+n} - v_{i+n} x_i) = 0.$$
Now, $v \in C$ if and only if there is an $a \in \mathbb{F}_p^r$ such that $v_i = a \cdot x_i$ for all $i$. This implies that the lines not incident with the dependent points are once again contained in a hyperplane, but we cannot deduce that the points of the dependencies are contained in the hyperplane $a \cdot X = 0$. Indeed, the fact that the point $v_i x_{i+n} - v_{i+n} x_i$ is contained in this hyperplane only implies that
$$(a \cdot x_i, a \cdot x_{i+n}) = \lambda_i (v_i, v_{i+n})$$
for some non-zero scalar $\lambda_i \in \mathbb{F}_q$. Since this $\lambda_i$ depends on $i$, we cannot deduce that $v_i = a \cdot x_i$ for all $i = 1, \ldots, 2n$.
However, this also means that when $p \geqslant 3$ we have some flexibility in choosing a basis for $\ell_i$ and this choice will affect whether $C \leqslant C^{\perp_a}$.
Consider the set of $n$ $(2h-1)$-dimensional subspaces of $\mathrm{PG}(4h-1, p)$ associated with a pure $[[n, n-4, 3]]_q$ stabilizer code. By Lemma 5.9, these subspaces are pairwise skew. In geometrical language this is called a partial spread. To construct such a code, according to Theorem 5.7, it suffices to construct a $[n, n-2, 3]_{q^2}$ linear code $D$ for which $D^{\perp_h} \leqslant D$. For such a code, $D^{\perp_h}$ has a generator matrix
$$\begin{pmatrix} x_1 & x_2 & \ldots & x_n \\ y_1 & y_2 & \ldots & y_n \end{pmatrix},$$
for which
$$\sum_{i=1}^{n} x_i^{q+1} = \sum_{i=1}^{n} y_i^{q+1} = \sum_{i=1}^{n} x_i y_i^q = 0. \qquad (9)$$
For any $n \leqslant q^2 + 1$ such a matrix can be found by scaling the first three columns so that the equations in (9) are satisfied.
Research Problem 4 The Glynn et al [5] manuscript developed the geometry of qubit stabilizer codes, introducing the concept of a quantum set of lines. This led them to prove Theorem 3.14, which gives a beautiful geometric classification of qubit stabilizer codes. Here, we have generalised the concept of quantum set of lines to non-qubit stabilizer codes. Although we have seen that the existence of non-identity non-zero scalars means we cannot hope for such a clean geometric classification, one can certainly expect some geometric classification for larger q.
6 Quantum MDS codes

Stabiliser MDS codes
Let $C$ be a code of length $n$ and minimum distance $d$ over an alphabet of size $q$. If we consider any $n - (d-1)$ coordinates then any two codewords must be different on these coordinates (if not the distance between them is at most $d-1$), so there are at most $q^{n-d+1}$ codewords in the code. This is the Singleton bound $|C| \leqslant q^{n-d+1}$.
A code which attains the Singleton bound is called a maximum distance separable code or simply an MDS code.
Recall that if $C$ is an additive code over $\mathbb{F}_q$, where $q = p^h$ for some prime $p$, then $C$ is linear over $\mathbb{F}_p$ and so necessarily $|C| = p^r$ for some $r$, see Section 3.3. Thus, if $C$ is also an MDS code then $h$ divides $r$ and $|C| = q^k$, where $k = n - d + 1$.
Theorem 5.4 states that an $[[n, k, d]]_q$ stabilizer code exists if and only if there exists an additive code $C \leqslant \mathbb{F}_q^{2n}$ of size $|C| = q^{n-k}$ such that $C \leqslant C^{\perp_a}$ and the minimum symplectic weight of an element of $C^{\perp_a} \setminus C$ is $d$. Considering $C^{\perp_a}$ as a code over the alphabet $\mathbb{F}_q \times \mathbb{F}_q$, then $C^{\perp_a}$ has minimum weight $d$, so $|C^{\perp_a}| \leqslant q^{2n-2d+2}$.
Since $|C| = q^{n-k}$ we have that $|C^{\perp_a}| = q^{n+k}$, which implies that for a $[[n, k, d]]_q$ stabilizer code to exist, we must have the condition $k \leqslant n - 2(d-1)$.
Compare this with the Singleton bound above, $k \leqslant n - (d-1)$, for codes of size $q^k$.
What is perhaps surprising is that this bound holds for all $[[n, k, d]]_q$ quantum codes. The quantum Singleton bound states that $n \geqslant k + 2(d-1)$.
Consequently, codes reaching equality are called quantum maximum distance separable codes or QMDS codes for short. We will prove this bound in Section 6.3.
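As a small illustration, the quantum Singleton bound and the QMDS condition can be written as two predicates on the parameters $n$, $k$, $d$. This is only a sketch with illustrative function names; it checks the necessary condition and says nothing about whether a code with the given parameters actually exists.

```python
def satisfies_quantum_singleton(n, k, d):
    """Necessary condition n >= k + 2(d - 1) for an [[n, k, d]]_q code."""
    return n >= k + 2 * (d - 1)

def is_qmds(n, k, d):
    """Parameters meeting the quantum Singleton bound with equality."""
    return n == k + 2 * (d - 1)

# [[5, 1, 3]] and [[6, 0, 4]] meet the bound; [[7, 1, 3]] satisfies it strictly.
for params in [(5, 1, 3), (6, 0, 4), (7, 1, 3), (4, 2, 3)]:
    print(params, satisfies_quantum_singleton(*params), is_qmds(*params))
```

Note that the bound does not involve the local dimension $q$, in contrast to the length bounds for MDS codes.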

Reed-Solomon codes
The classical example of an MDS code is the following linear code over $\mathbb{F}_q$. Denote by $\{a_1, \ldots, a_q\}$ the elements of $\mathbb{F}_q$. The Reed-Solomon code is
$$C = \{(f(a_1), \ldots, f(a_q), f_{k-1}) \mid f \in \mathbb{F}_q[X], \ \deg f \leqslant k-1\},$$
where $f_{k-1}$ denotes the coefficient of $X^{k-1}$ in $f(X)$. The code has length $n = q + 1$. If $k \leqslant q$ then each polynomial $f$ defines a different codeword, so the dimension of $C$ is $k$. A non-zero codeword has weight at least $n - k + 1$, since a polynomial of degree at most $k - 1$ has at most $k - 1$ zeros. Lemma 3.1 then implies that the minimum distance is $d = n - k + 1$ and so the code is MDS.
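The MDS property of the Reed-Solomon code can be confirmed by exhaustive enumeration for small parameters. The sketch below assumes $q = p$ prime, so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo $p$; the function names are illustrative.

```python
from itertools import product

def reed_solomon(p, k):
    """Codewords (f(a_1), ..., f(a_p), f_{k-1}) over F_p for deg f <= k - 1."""
    words = []
    for coeffs in product(range(p), repeat=k):  # coeffs[i] is the coefficient of X^i
        evals = [sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p
                 for a in range(p)]
        words.append(tuple(evals + [coeffs[k - 1]]))
    return words

def minimum_weight(words):
    """Minimum weight of a non-zero codeword (equals d for a linear code)."""
    return min(sum(x != 0 for x in w) for w in words if any(w))

p, k = 5, 3
words = reed_solomon(p, k)
n = p + 1
print(len(set(words)), minimum_weight(words), n - k + 1)
```

For $p = 5$ and $k = 3$ this should report $125$ distinct codewords and minimum weight $4 = n - k + 1$.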
We can use Theorem 5.7 to construct quantum stabilizer codes from Reed-Solomon codes over $\mathbb{F}_{q^2}$, but only if we can scale the coordinates of $C$ so that $C \leqslant C^{\perp_h}$. Then $D = C^{\perp_h}$ is a $[n, n-k, k+1]_{q^2}$ linear MDS code with the property that $D^{\perp_h} \leqslant D$. Observe that replacing the $i$-th coordinate $f(a_i)$ by $\lambda_i f(a_i)$, with $\lambda_i$ a non-zero scalar, does not alter the parameters of the code. Such a code is then called a generalised Reed-Solomon code. This can only be done for $k \leqslant q$, in which case we obtain a $[[q^2 + 1, q^2 + 1 - 2k, k + 1]]_q$ stabilizer code. For the case $k = q$, one can check that the Reed-Solomon code is contained in its Hermitian dual, so there is no need to scale in this case.

Quantum Singleton bound
To prove the quantum Singleton bound we will need some technical tools.
1. Bloch decomposition. Let $\{e_i\}$ be a basis for the space of complex $D \times D$ matrices such that $\mathrm{tr}(e_i^\dagger e_j) = D \delta_{ij}$. For qubits, take for example the Pauli matrices. Every one-quDit density matrix can then be expanded as
$$\rho = \frac{1}{D} \sum_i \mathrm{tr}(e_i^\dagger \rho) \, e_i,$$
where we recall that the trace of a matrix is given by the sum of its diagonal elements, $\mathrm{tr}(M) = \sum_i m_{ii}$ for any square matrix $M = (m_{ij})$.
Consider now an $n$-partite system in the space $(\mathbb{C}^D)^{\otimes n}$. Denote by $\{E_\alpha\}$, with a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$, the matrix basis formed by tensor products of the $e_i$'s,
$$E_\alpha = e_{\alpha_1} \otimes e_{\alpha_2} \otimes \cdots \otimes e_{\alpha_n}.$$
For tensor products, such as say $E \otimes F$, one has $\mathrm{tr}(E \otimes F) = \mathrm{tr}(E) \cdot \mathrm{tr}(F)$. In other words, the trace of a tensor product factorizes. Consequently $\mathrm{tr}(E_\alpha^\dagger E_\beta) = D^n \delta_{\alpha\beta}$, and the matrix basis formed by $\{E_\alpha\}$ is orthogonal.
Denote by $\mathrm{wt}(E_\alpha)$ the number of non-identity terms in the tensor decomposition, and by $\mathrm{supp}(E_\alpha)$ the collection of sites on which the non-identity terms act. Naturally, $\mathrm{wt}(E_\alpha) = |\mathrm{supp}(E_\alpha)|$.
We can expand an $n$-partite state as
$$\rho = \frac{1}{D^n} \sum_{E} \mathrm{tr}(E^\dagger \rho) \, E.$$
Here and from now on we omit the index $\alpha$ for readability. This is the Bloch decomposition of $\rho$.
The function $\mathrm{tr}_j$ is called the partial trace and its action can be understood as that of removing the $j$-th tensor component.
The partial trace does not depend on the basis. Its coordinate-free definition is the following: Let $V$ and $W$ be two vector spaces and denote by $I_W$ the identity matrix on $W$. The partial trace $\mathrm{tr}_W$ is the unique operator, which for all $M$ acting on $V \otimes W$ and $N$ acting on $V$ satisfies
$$\mathrm{tr}(M \cdot (N \otimes I_W)) = \mathrm{tr}(\mathrm{tr}_W(M) \cdot N).$$
Considering the Hilbert-Schmidt inner product $\langle M, N \rangle = \mathrm{tr}(M^\dagger N)$, the partial trace can be seen as the adjoint to the map $N \mapsto N \otimes I_W$. Note that partial traces over different subsystems commute, $\mathrm{tr}_j \mathrm{tr}_i = \mathrm{tr}_i \mathrm{tr}_j$, and one has that
$$\mathrm{tr}(M_1 \otimes M_2 \otimes \cdots \otimes M_n) = \mathrm{tr}(M_1) \, \mathrm{tr}(M_2) \cdots \mathrm{tr}(M_n).$$
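The coordinate-free definition can be made concrete for matrices stored as nested lists. The following sketch assumes the ordering of basis vectors of $V \otimes W$ in which the index of $W$ varies fastest, and checks the defining property on a two-qubit Bell state; the names and the test matrix $N$ are illustrative choices.

```python
def partial_trace_W(M, dim_V, dim_W):
    """Trace out the second factor W of an operator on V (x) W."""
    return [[sum(M[i * dim_W + k][j * dim_W + k] for k in range(dim_W))
             for j in range(dim_V)] for i in range(dim_V)]

def kron(A, B):
    """Kronecker product of two square matrices."""
    a, b = len(A), len(B)
    return [[A[i // b][j // b] * B[i % b][j % b] for j in range(a * b)]
            for i in range(a * b)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# Density matrix of the Bell state (|00> + |11>) / sqrt(2).
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
identity = [[1, 0], [0, 1]]
N = [[1, 2], [3, 4]]  # arbitrary operator on V, used only for the check

reduced = partial_trace_W(bell, 2, 2)
lhs = trace(matmul(bell, kron(N, identity)))  # tr(M (N (x) I_W))
rhs = trace(matmul(reduced, N))               # tr(tr_W(M) N)
print(reduced, lhs, rhs)
```

The reduced state of the Bell state is the maximally mixed state, and both sides of the defining identity agree.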

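3. Purification.
A density matrix $\rho$ on $\mathcal{H}_A$ can always be diagonalized as
$$\rho = \sum_i \lambda_i |\lambda_i\rangle\langle\lambda_i|_A,$$
where $\{|\lambda_i\rangle_A\}$ is its set of eigenvectors and $\{\lambda_i\}$ is its set of corresponding eigenvalues. Let $\mathcal{H}_B$ be a second system with orthonormal basis $\{|i\rangle_B\}$ and consider the pure state
$$|\phi\rangle = \sum_i \sqrt{\lambda_i} \, |\lambda_i\rangle_A \otimes |i\rangle_B \in \mathcal{H}_A \otimes \mathcal{H}_B.$$
It can be checked that $\mathrm{tr}_B(|\phi\rangle\langle\phi|) = \rho$ and the state $|\phi\rangle$ is known as a purification of $\rho$.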
4. Von Neumann entropy. Consider a classical probability distribution represented by a set of probabilities $p_i \geq 0$ with $\sum_i p_i = 1$. Its Shannon entropy is
$$H = -\sum_i p_i \log p_i.$$
We can introduce a similar quantity for quantum states. Given a density matrix $\rho$, its von Neumann entropy is defined as
$$S(\rho) = -\mathrm{tr}(\rho \log \rho).$$
Such matrix functions of hermitian operators can be evaluated on their eigenvalues $\{\lambda_i\}$. Then the von Neumann entropy evaluates as
$$S(\rho) = -\sum_i \lambda_i \log \lambda_i.$$
Let us now write $S_A = S(\mathrm{tr}_B[\rho_{AB}])$ and so on. For a state $\rho$ on $\mathcal{H}_A$ with purification $|\phi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$, we have that $S_A = S_B$.
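Since the von Neumann entropy only depends on the eigenvalues, it can be evaluated with the same function as the Shannon entropy. The sketch below assumes logarithms to base 2 (the choice of base is a convention and only rescales the entropy) and takes the spectrum of the state as input rather than diagonalising a matrix.

```python
from math import log2

def entropy(eigenvalues):
    """-sum of l * log2(l) over the non-zero eigenvalues l."""
    return -sum(l * log2(l) for l in eigenvalues if l > 0)

# A two-qubit Bell state is pure, so its spectrum is (1, 0, 0, 0);
# each one-qubit marginal is maximally mixed with spectrum (1/2, 1/2).
S_AB = entropy([1, 0, 0, 0])
S_A = entropy([0.5, 0.5])
S_B = entropy([0.5, 0.5])
print(S_AB, S_A, S_B)
```

Here $S_A = S_B = 1$, as expected for the two marginals of a pure state, while $S_{AB} = 0$.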
The von Neumann entropy satisfies subadditivity and strong subadditivity,
$$S_{AB} \leq S_A + S_B, \qquad S_{ABC} + S_B \leq S_{AB} + S_{BC}.$$
We are now in position to prove the quantum Singleton bound.
Proof. The distance must be bounded by 2(d − 1) < n, as otherwise n − (d − 1) < (d − 1) and we could recover the encoded state from two disjoint subsystems, violating the no-cloning theorem.
Let $\Pi_Q = \sum_{i=1}^{q^k} |v_i\rangle\langle v_i|$ be the projector onto the code space. A purification with a reference system $R$ leads to
$$|\phi\rangle = \frac{1}{\sqrt{q^k}} \sum_{i=1}^{q^k} |i\rangle_R \otimes |v_i\rangle,$$
where $\{|i\rangle_R\}$ is any orthonormal basis for $R$. Let us partition the $n$ particles of the code into the three subsystems $A$, $B$, $C$, such that $|A| = |B| = d - 1$ and $|C| = n - 2(d-1)$. Then $S_R = \log q^k$. As the code has distance $d$, any subsystem of size strictly smaller than $d$ cannot reveal anything about the reference system $R$: indeed the condition $\rho_{RA} = \rho_R \otimes \rho_A$ is known to be a necessary and sufficient condition for the subsystem $A$ to be correctable [13]; this is also equivalent to $S_{RA} = S_R + S_A$. With the subadditivity of the von Neumann entropy this leads to
$$S_R + S_A = S_{RA} = S_{BC} \leq S_B + S_C, \qquad S_R + S_B = S_{RB} = S_{AC} \leq S_A + S_C,$$
where we used that the entropies of complementary subsystems are equal for a pure state. The combination of the above two inequalities yields
$$S_R \leq S_C \leq \log q^{n - 2(d-1)},$$
and so $k \leq n - 2(d-1)$.
(b) For every subset $S \subset \{1, \ldots, n\}$ with $|S| \leq \frac{n+k}{2}$, we have that $\mathrm{tr}_{S^c}(P) \propto \mathbb{1}$, where $P$ is the orthogonal projection onto the quantum MDS code.
Let us discuss these properties: a) states that QMDS codes form families of codes where n + k is constant. Within each family, only the member with the highest distance has to be determined, as its descendants can be obtained by a partial trace: tracing out over a single particle, one has n → n − 1, k → k + 1, d → d − 1. This works because QMDS codes are pure codes, that is, all their (d − 1)-party marginals are maximally mixed. For general quantum codes, this method of making new codes from old is not necessarily possible.
Property (b) states that for all pure states $|v\rangle$ in the code, the marginals of size less than $d$ are maximally mixed. This implies that every vector in the code space shows maximal bipartite entanglement across any bipartition of $d - 1$ vs. $n - d + 1$ parties. Thus QMDS codes form subspaces that show high bipartite entanglement.
This should be compared to the "trivial" upper bound for MDS codes. If there is a $(n, q^k, n-k+1)_q$ MDS code then $n \leqslant q + k - 1$.
The MDS conjecture states that if $4 \leqslant k \leqslant q$ and there is a $(n, q^k, n-k+1)_q$ MDS code then $n \leqslant q + 1$.
This is known to hold for linear codes if $q$ is a prime, see [3].
Let $G$ be a $k \times n$ generator matrix for $C$ and let $\mathcal{X}$ be the set or multi-set of columns of $G$, viewed as points of $\mathrm{PG}(k-1, q)$. In Section 3.2, we saw that a non-zero codeword $u = aG$ corresponds to a hyperplane $\pi_a$ of $\mathrm{PG}(k-1, q)$ and that $\pi_a = \pi_{\lambda a}$ for any non-zero $\lambda \in \mathbb{F}_q$. The number of points of $\mathcal{X}$ incident with the hyperplane $\pi_a$ is $n$ minus the weight of the codeword $u$. Thus, for $i \neq 0$, there are $A_i/(q-1)$ hyperplanes which are incident with $n - i$ points of $\mathcal{X}$.

MacWilliams identity for quantum codes
As for classical codes, weight enumerators can be defined for quantum codes, which again are useful to deduce the error-correcting properties of codes and to obtain bounds on their existence.
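Let $Q$ be a quantum code and let $P$ be the orthogonal projection onto $Q$. The weights of the primary and secondary Shor-Laflamme enumerators are
$$A_j = \sum_{\mathrm{wt}(E) = j} \mathrm{tr}(EP) \, \mathrm{tr}(E^\dagger P), \qquad B_j = \sum_{\mathrm{wt}(E) = j} \mathrm{tr}(E P E^\dagger P),$$
where the sums run over the Pauli operators $E = X(a)Z(b)$ of weight $j$. For a stabilizer code $Q(S)$, the projection is $P = \frac{1}{|S|} \sum_{M \in S} M$. Hence, if $E \notin S$, $\mathrm{tr}(EP) \, \mathrm{tr}(E^\dagger P) = 0$ and if $E \in S$ then $\mathrm{tr}(EP) \, \mathrm{tr}(E^\dagger P) = q^{2n}/|S|^2$.
Thus, $A_j$ is $q^{2n}/|S|^2$ times the number of elements in the stabilizer subgroup $S$ that have weight $j$.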
We leave the result for B j as an exercise.
The geometrical interpretation of $A_j$ for stabilizer codes is as follows. Suppose that $\mathcal{X}$ is a quantum set of lines in $\mathrm{PG}(n-k-1, q)$. Then $A_j$ is $(q-1)$ times the number of hyperplanes containing $n - j$ lines of $\mathcal{X}$.
Before proving the quantum MacWilliams identity, consider the following example. Let $D$ be the hexacode, a linear $[6, 3, 4]_4$ code over $\mathbb{F}_4 = \{0, 1, e, e^2\}$, where $e^2 = e + 1$. One can prove that the minimum distance is 4 by checking that all $3 \times 3$ submatrices of its generator matrix are non-singular. By verifying that the hermitian inner product (7) between any two rows is zero, one quickly concludes that $D = D^{\perp_h}$. Theorem 5.7 implies that we can construct a $[[6, 0, 4]]_2$ stabilizer code $Q(S)$ from $D$. By writing out the entries in the matrix over $\mathbb{F}_2$ and considering the $\mathbb{F}_2$ span we obtain the matrix $G(S)$ for this quantum code. Thus, the generators of the stabilizer subgroup can be read off from the rows of $G(S)$.
By Lemma 5.9, the quantum set of six lines $\mathcal{X}$ we get from the matrix $G(S)$ has the property that any three lines of $\mathcal{X}$ span the whole space $\mathrm{PG}(5, 2)$. Therefore, any two span a three-dimensional subspace which is contained in three hyperplanes which contain no further line of $\mathcal{X}$. Thus, there are 45 hyperplanes which contain exactly two lines of $\mathcal{X}$. Let $\ell$ be a line of $\mathcal{X}$. There are 15 hyperplanes containing $\ell$, so counting pairs $(\ell, \pi)$ where $\ell \in \mathcal{X}$ and $\pi$ is a hyperplane containing $\ell$, we conclude that any hyperplane containing a line of $\mathcal{X}$ contains two lines of $\mathcal{X}$.
Thus, we can work out the weight distribution. For codes with $k = 0$ (that is, pure states), both weight distributions coincide; this can be checked from the definition. From before, we have that $A_j$ is $(q-1)$ times the number of hyperplanes containing $n - j$ lines of $\mathcal{X}$. Thus, we have proved that the weight distribution for the quantum hexacode is $(A_0, \ldots, A_6) = (1, 0, 0, 0, 45, 0, 18)$.
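The weight distribution $(1, 0, 0, 0, 45, 0, 18)$ can also be recovered by enumerating the 64 codewords of a hexacode directly. The sketch below encodes $\mathbb{F}_4 = \{0, 1, e, e^2\}$ as the integers $0, 1, 2, 3$ with $e^2 = e + 1$, and uses one standard generator matrix of the hexacode; this particular matrix is an assumption made for illustration and need not coincide with the matrix of the example.

```python
from itertools import product

# F_4 = {0, 1, e, e^2} encoded as 0, 1, 2, 3; addition is bitwise XOR.
LOG = {1: 0, 2: 1, 3: 2}
EXP = [1, 2, 3]

def add(a, b):
    return a ^ b

def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]

def hermitian(u, v):
    """Hermitian form (7) with q = 2: sum of u_i * v_i^2."""
    total = 0
    for x, y in zip(u, v):
        total = add(total, mul(x, mul(y, y)))
    return total

# Assumed generator matrix (1 0 0 1 e^2 e / 0 1 0 1 e e^2 / 0 0 1 1 1 1).
G = [
    (1, 0, 0, 1, 3, 2),
    (0, 1, 0, 1, 2, 3),
    (0, 0, 1, 1, 1, 1),
]

def codewords(G):
    """All F_4-linear combinations of the rows of G."""
    n = len(G[0])
    for coeffs in product(range(4), repeat=len(G)):
        word = [0] * n
        for c, row in zip(coeffs, G):
            word = [add(w, mul(c, r)) for w, r in zip(word, row)]
        yield tuple(word)

self_orthogonal = all(hermitian(r, s) == 0 for r in G for s in G)
weights = [0] * 7
for w in codewords(G):
    weights[sum(x != 0 for x in w)] += 1
print(self_orthogonal, weights)
```

Because the map $\theta$ preserves weights and $|S| = q^n$ when $K = 1$, the number of codewords of weight $j$ equals $A_j$ for the corresponding stabilizer code.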
Research Problem 9 For stabilizer codes, $A_j$ and $B_j$ count the number of terms in the stabilizer $S$ and its normaliser $N(S)$ respectively; there is no such combinatorial interpretation for general quantum codes. Although $A_j$ can be interpreted as the Hilbert-Schmidt norms of the $j$-body correlations that appear in the code, we would like to determine what object $B_j$ is counting for non-stabilizer codes.
We return to the proof of the quantum MacWilliams identity.
Proof. [Quantum MacWilliams identity] We will only state a proof sketch; the rather tedious combinatorial details can be found in [11,14].
Let $S$ be a collection of subsystems and denote by $\mathrm{tr}_S$ the partial trace over the systems in $S$. Denote by $S^c$ the complement of $S$ in $\{1, \ldots, n\}$. Consider now how the partial trace $\mathrm{tr}_S$ followed by a "padding" with the identity acts on an operator $P$:
$$\mathrm{tr}_S(P) \otimes \mathbb{1}_S = \mathrm{tr}_S\Big(\frac{1}{q^n} \sum_{E} \mathrm{tr}(E^\dagger P) E\Big) \otimes \mathbb{1}_S = \frac{1}{q^{n-|S|}} \sum_{\mathrm{supp}(E) \subseteq S^c} \mathrm{tr}(E^\dagger P) E. \qquad (10)$$
It can be shown (c.f. Appendix A in Ref. [11]) that this can also be written as
$$\mathrm{tr}_S(P) \otimes \mathbb{1}_S = q^{|S|} \int_{U \in U(q^n), \ \mathrm{supp}(U) \subseteq S} U P U^\dagger \, dU = \frac{1}{q^{|S|}} \sum_{\mathrm{supp}(E) \subseteq S} E P E^\dagger, \qquad (11)$$
where the integration is over the unitarily invariant Haar measure of unitary matrices that act trivially on the subsystem $S^c$. The second equality follows from the fact that any complete orthonormal matrix basis $\{E_\alpha\}$ containing the identity forms a unitary 1-design (see footnote 5).
The quantum MacWilliams identity now essentially follows from equating Eqs. (10) and (11), summing over all subsystems of size $|S| = m$, multiplying by $P$, and taking the trace. This yields terms of the form $\mathrm{tr}(E^\dagger P) \, \mathrm{tr}(EP)$ and $\mathrm{tr}(E^\dagger P E P)$, corresponding to the two types of weights $A_j$ and $B_j$. Using generating functions, in other words the weight enumerator polynomials $A(x, y)$ and $B(x, y)$, and Krawtchouk polynomials, this yields the MacWilliams identity
$$q^n B(x, y) = A(x + (q^2 - 1)y, \ x - y).$$
This ends the proof sketch.
Footnote 5: $t$-designs replace the integration over some compact group by a finite sum. A unitary $t$-design is a set of unitaries $U_i$, $i = 1, \ldots, K$, acting on $\mathbb{C}^D$, such that
$$\int_{U(D)} P_{t,t}(U) \, dU = \frac{1}{K} \sum_{i=1}^{K} P_{t,t}(U_i)$$
holds for every homogeneous polynomial $P_{t,t}$ that has degree $t$ in the matrix elements of $U$ and degree $t$ in the matrix elements of $U^*$.
The enumerators and their weights have a couple of interesting properties. Let $K = \dim(\mathrm{im}\,P)$.
a) The weights $A_j$ and $B_j$ are invariant under the local choice of basis and are so-called local unitary invariants (LU-invariants). That is, $A_j(P) = A_j(P')$ and $B_j(P) = B_j(P')$, where $P' = (U_1 \otimes \cdots \otimes U_n) P (U_1^\dagger \otimes \cdots \otimes U_n^\dagger)$ and $U_1, \ldots, U_n$ are unitary $q \times q$ matrices.
b) $A_0 = \dim(P)$ and $K B_j \geq A_j \geq 0$.
c) A projection operator $P$ with $K = \dim(\mathrm{im}(P))$ is a code of distance $d$, if and only if it satisfies $K B_j = A_j$ for $0 \leq j < d$.
d) One can check that for codes with K = 1, the enumerator polynomial is invariant under the quantum MacWilliams transform, and one has B(x, y) = A(x, y). When such a code is of stabilizer type, it corresponds to a classical self-dual code.
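Property d) can be checked on a concrete weight distribution by expanding the quantum MacWilliams identity $q^n B(x, y) = A(x + (q^2 - 1)y, x - y)$ coefficient by coefficient. The sketch below assumes the convention $A(x, y) = \sum_j A_j x^{n-j} y^j$ and takes as input the distribution $(1, 0, 0, 0, 45, 0, 18)$ of the quantum hexacode with $q = 2$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def macwilliams_transform(A, q):
    """Coefficients B_0..B_n with q^n B(x, y) = A(x + (q^2 - 1) y, x - y)."""
    n = len(A) - 1
    B = [0] * (n + 1)  # B[m] collects the coefficient of x^(n-m) y^m
    for j, A_j in enumerate(A):
        for s in range(n - j + 1):   # power of y taken from (x + (q^2 - 1) y)^(n - j)
            c1 = comb(n - j, s) * (q * q - 1) ** s
            for t in range(j + 1):   # power of y taken from (x - y)^j
                c2 = comb(j, t) * (-1) ** t
                B[s + t] += A_j * c1 * c2
    return [Fraction(b, q ** n) for b in B]

A = [1, 0, 0, 0, 45, 0, 18]
B = macwilliams_transform(A, 2)
print(B == A)
```

For this input the transform returns the same distribution, in agreement with $B(x, y) = A(x, y)$ for codes with $K = 1$.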
Some comments are in order. The weights must be LU-invariant: the properties of the code should not depend on the way one sets up the local coordinate system for each spin particle. The last two properties are useful to obtain weights of hypothetical codes and to apply the machinery of linear programming bounds [2]. That is, one sets up a system of linear equalities and inequalities in the variables $A_0, \ldots, A_n$ making use of a), b), and the quantum MacWilliams identity. We refer to the tables by M. Grassl [8] for more existence results.