Developing the Deutsch-Hayden approach to quantum mechanics

The formalism of Deutsch and Hayden is a useful tool for describing quantum mechanics explicitly as local and unitary, and therefore quantum information theory as concerning a "flow" of information between systems. In this paper we show that these physical descriptions of flow are unique, and develop the approach further to include the measurement interaction and mixed states. We then give an analysis of entanglement swapping in this approach, showing that it does not in fact contain non-local effects or some form of superluminal signalling.


I. INTRODUCTION
The Deutsch-Hayden approach [1] was introduced in order to describe the flow of information through quantum mechanical systems. The standard formalism is not suitable in this situation as it has non-local states; that is, a single state can describe systems which are physically far apart. Any information stored in that state can then appear to 'jump' from one system to the other without an explicit physical communication happening. Deutsch and Hayden demonstrated that this does not happen in their approach, and that only local interactions change the description of a system. Furthermore, unlike in the standard formalism, there is no notion of "collapse" in the Deutsch-Hayden approach. All evolution is unitary, even under the action of measurement. This approach is therefore very useful if we wish a formalism of quantum mechanics that is explicitly local and unitary. One particular use of such a formalism is in information theory. As Deutsch and Hayden showed in their analysis of teleportation, if we have no notion of collapse then 'bits' and 'classical communication' are treated by the formalism in exactly the same way as qubits and communication via quantum channels. This is very useful as we no longer need to swap between different types of entities in the middle of a protocol.
Deutsch and Hayden demonstrated information flow in some situations, for example in teleportation. However, there were some situations that they did not address, and some issues that need to be resolved. Firstly, the demonstration of information flow depends on the exact form of the description of the systems. If we are to use this, then we need to know how unique those forms are: if another form may equally well be used, where does that leave the analysis? Then there are issues of measurement. The measurement interaction itself is unitary, but eventually, in order to extract predictions from the theory, we are going to need something that will stand in for collapse: a notion of what the description of one system is relative to a description of the other. Finally, Deutsch and Hayden dealt only with pure state systems, so in order to use the formalism universally we are going to need a way of describing mixed state systems within the approach.
These, then, are the issues that this paper will address. We will first give an introduction to Deutsch and Hayden's work. We will then look at the uniqueness of the descriptions given. Then we will deal with the measurement interaction and relative states. After this, we will look in detail at how mixed state systems may be incorporated into the formalism. We will then use the developed formalism to analyse information flow in a protocol that could not be described fully previously: that of entanglement swapping.

II. THE DEUTSCH-HAYDEN APPROACH
The Deutsch-Hayden approach is based on the formalism first introduced by Gottesman [2] in the context of stabiliser theory, and is a Heisenberg-type representation of quantum mechanics. Instead of the states of systems evolving under a Hamiltonian and operators being fixed, there is a fixed universal state (by convention usually $|0\rangle$), and an operator $A(t)$ evolves over time with equations of motion
$$\frac{dA(t)}{dt} = i\,[H, A(t)]$$
(in units with $\hbar = 1$), where $H$ is the Hamiltonian. We can also write the time-dependent operator as $A(t) = U^\dagger A(0) U$, where $U$ is the unitary evolution operator. The Deutsch-Hayden representation is based on the Hilbert space where these time-dependent operators are vectors rather than second rank tensors: the space of Hilbert-Schmidt operators [3]. This Hilbert-Schmidt space has an inner product ($\mathrm{Tr}(A^\dagger B)$, where $A$ and $B$ are operators) and a norm ($\sqrt{\mathrm{Tr}(A^2)}$), and for an $N$-dimensional system an $N^2$-dimensional space is required.
As an example, let us consider a single qubit. Such a system is 2-dimensional and will therefore need a set of 4 basis operators. One such set that would be useful is that of the 2-dimensional Pauli operators, $\{\mathbb{1}, \boldsymbol{\sigma}\}$. Now, the $\mathbb{1}$ component of any operator can never change (it will evolve as $U^\dagger \mathbb{1} U = \mathbb{1}$), and in order for the operator to be normalised, the $\mathbb{1}$ component must always be $1/N^2$. Thus a qubit can be characterised by giving the components in the $\sigma_x$, $\sigma_y$, $\sigma_z$ directions only. This is, of course, the Bloch sphere [4, p174].
In the general case, we choose a set of basis operators written here as $\Gamma_i$. Then any operator can be written in terms of the basis (here at time $t = 0$):
$$A(0) = \sum_i a_i\, \Gamma_i(0),$$
where $a_i \in \mathbb{R}$ is the inner product $\mathrm{Tr}(A(0)\Gamma_i(0))$ (we assume that all the Hilbert-Schmidt operators with which we work are self-adjoint; that is, they may be the operators corresponding to observables). If we now evolve $A(0)$ to time $t$ we have
$$A(t) = U^\dagger A(0) U = \sum_i a_i\, U^\dagger \Gamma_i(0) U = \sum_i a_i\, \Gamma_i(t).$$
Thus we see that in order to find the time-evolved state of an operator, all we need do is find the time-evolved basis operators and then reconstruct the operator using the original coefficients. A complete characterisation of the time-evolved system can therefore be gained by following the time evolution of the basis operators.
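The expansion and reconstruction described above can be checked numerically for a single qubit. The following is a minimal sketch, assuming NumPy and a normalised Pauli basis $\Gamma_i = \sigma_i/\sqrt{2}$; the operator `A0`, the rotation angle `theta` and all function names are illustrative choices, not taken from the paper.

```python
import numpy as np

# Normalised Pauli basis Gamma_i = sigma_i / sqrt(2), so that Tr(Gamma_i Gamma_j) = delta_ij.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [s / np.sqrt(2) for s in (I2, X, Y, Z)]


def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^dagger b)."""
    return np.trace(a.conj().T @ b)


def heisenberg(op, u):
    """Heisenberg-picture evolution op -> u^dagger op u."""
    return u.conj().T @ op @ u


# An arbitrary self-adjoint operator A(0) (values chosen for illustration only).
A0 = np.array([[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]])
coeffs = np.array([hs_inner(g, A0) for g in basis])

# An arbitrary unitary: a rotation by theta about the x axis.
theta = 0.8
U = np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * X

# Evolve only the basis operators, then rebuild A(t) with the original coefficients.
basis_t = [heisenberg(g, U) for g in basis]
A_t_from_basis = sum(c * g for c, g in zip(coeffs, basis_t))
A_t_direct = heisenberg(A0, U)

print("coefficients real:", np.allclose(coeffs.imag, 0.0))
print("reconstruction matches direct evolution:", np.allclose(A_t_from_basis, A_t_direct))
```

The coefficients are computed once at $t = 0$ and never updated; only the basis operators carry the time dependence.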
However, this is not usually very practical: as $N$ increases, the task of following the evolution of $N^2$ basis operators quickly becomes unwieldy. It was Gottesman [2] who realised that the number of operators to be tracked can be reduced further. Operators can always be written as combinations of other operators, either additively or multiplicatively. It is obvious that for additive groups of operators the evolution operator preserves the group structure (i.e. $U^\dagger (A + B) U = U^\dagger A U + U^\dagger B U$). However, it is also true of the multiplicative groups, where $U$ forms a group homomorphism between $\{X\}$ and $\{U^\dagger X U\}$:
$$U^\dagger (AB)\, U = (U^\dagger A U)(U^\dagger B U).$$
We can therefore see that, rather than following the complete set of basis operators, all we need follow is a generating set of the group. The time-evolved generating set will preserve the group structure of the complete set, enabling that to be constructed out of it. The generating set that Gottesman (whom Deutsch and Hayden follow) chooses is the Pauli group. This gives us the representation of an $n$-qubit array, each qubit of which is defined by a set of operators that will generate all the operators pertaining to that qubit. We note that this representation does in fact have a further redundancy within itself, as the 2-dimensional Pauli operators themselves form a multiplicative group (e.g. $\sigma_y = i\sigma_x \sigma_z$).
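As an illustration of tracking a generating set only, the sketch below evolves just four generators of a two-qubit system and rebuilds all sixteen evolved basis operators from them. It assumes NumPy; the example evolution `U` (a Hadamard followed by a CNOT) and the variable names are illustrative assumptions.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]
I4 = np.eye(4, dtype=complex)

Hd = (X + Z) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
U = CNOT @ np.kron(Hd, I2)  # an example two-qubit evolution


def heisenberg(op):
    """Heisenberg-picture evolution op -> U^dagger op U."""
    return U.conj().T @ op @ U


# Track only the four generators q_1x, q_1z, q_2x, q_2z.
q1x, q1z = heisenberg(np.kron(X, I2)), heisenberg(np.kron(Z, I2))
q2x, q2z = heisenberg(np.kron(I2, X)), heisenberg(np.kron(I2, Z))

# Rebuild the full descriptor sets using sigma_y = i sigma_x sigma_z.
q1 = [I4, q1x, 1j * q1x @ q1z, q1z]
q2 = [I4, q2x, 1j * q2x @ q2z, q2z]

# Compare every product q_1i q_2j with the directly evolved sigma_i (x) sigma_j.
max_error = max(
    np.abs(q1[i] @ q2[j] - heisenberg(np.kron(paulis[i], paulis[j]))).max()
    for i in range(4)
    for j in range(4)
)
print("largest deviation:", max_error)
```

Because conjugation by `U` respects products, the sixteen evolved operators never need to be stored separately.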
When using this formalism, not only do we need to time-evolve the descriptors, $q_a$, we also need to use time-evolved forms of transformation operators. These forms can be written in terms of their actions on the descriptors over a time-step. For example, the Hadamard gate is written in its general form as
$$H: \quad (q_x,\, q_y,\, q_z) \to (q_z,\, -q_y,\, q_x)$$
and the CNOT gate (with qubit 1 as control and 2 as target)
$$\mathrm{CNOT}: \quad q_1 \to (q_{1x} q_{2x},\; q_{1y} q_{2x},\; q_{1z}), \qquad q_2 \to (q_{2x},\; q_{1z} q_{2y},\; q_{1z} q_{2z}).$$
The average values of the operators $z_\pm$ are That is, the average values of these operators are the diagonal elements of the density operator in the computational basis. The use of these operators scales very easily as we add more systems. We saw above how the $q_{ai}$ for each system always describe separate subspaces for the individual system. We can therefore simply combine the $z_\pm$ for each system to get overall probabilities. For example, if we have two systems then $p(01) = z_{1+} \otimes z_{2-}$, which gives us one of the four diagonal elements of the 2-qubit density matrix, which are
Another operator which is particularly useful to us is the density operator. It is one of the most useful ways of going between the Schrödinger and Heisenberg pictures, as in both cases we are dealing with the time-evolved operator, $|\psi(t)\rangle\langle\psi(t)|$. Therefore, unlike ordinary operators, there is no difference in the form of the density operator between the two pictures. The density operator is a proper vector in Hilbert-Schmidt space, although its evolution is different from other operators:
$$\rho(t) = U \rho(0)\, U^\dagger.$$
The general form of a density operator in terms of the descriptors is where $q^{(n)}$ is the $n$-dimensional set of descriptors for the system, and $\{P_n\}$ is the $n$-dimensional Pauli group. For a single qubit system, $P_i = \sigma_i$ (where $i$ runs over four indices and $\sigma_0 = \mathbb{1}$). For two qubits, $P_i = \sigma_i \otimes \sigma_j$, and the $q_i(t)$ are therefore $q_{ij} = q_{1i} \otimes q_{2j}$ (again, $q_0 = \mathbb{1}$). That is, The two-qubit density operator can also be written
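The way a gate acts on descriptors over one time-step can be verified by direct conjugation. The sketch below assumes NumPy and uses the standard conjugation relations for the Hadamard and CNOT gates (Hadamard: $(q_x, q_y, q_z) \to (q_z, -q_y, q_x)$; CNOT with qubit 1 as control: $q_1 \to (q_{1x}q_{2x}, q_{1y}q_{2x}, q_{1z})$, $q_2 \to (q_{2x}, q_{1z}q_{2y}, q_{1z}q_{2z})$). The earlier evolution `V` is an arbitrary, hypothetical choice made only so that the descriptors at time $t$ are non-trivial.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Hd = (X + Z) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def rot(axis, angle):
    """Single-qubit rotation exp(-i * angle * axis / 2) for a Pauli axis."""
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * axis


def conj(op, u):
    """Heisenberg-picture evolution op -> u^dagger op u."""
    return u.conj().T @ op @ u


# Hypothetical earlier evolution V, giving the descriptors at time t.
V = CNOT @ np.kron(rot(X, 0.3), rot(Y, 1.1))
q1 = [conj(np.kron(s, I2), V) for s in (X, Y, Z)]
q2 = [conj(np.kron(I2, s), V) for s in (X, Y, Z)]

# Hadamard on qubit 1 at the next step: direct evolution versus the descriptor rule.
G = np.kron(Hd, I2)
direct_h = [conj(np.kron(s, I2), G @ V) for s in (X, Y, Z)]
rule_h = [q1[2], -q1[1], q1[0]]

# CNOT (control 1, target 2) at the next step: direct evolution versus the descriptor rule.
direct_c1 = [conj(np.kron(s, I2), CNOT @ V) for s in (X, Y, Z)]
direct_c2 = [conj(np.kron(I2, s), CNOT @ V) for s in (X, Y, Z)]
rule_c1 = [q1[0] @ q2[0], q1[1] @ q2[0], q1[2]]
rule_c2 = [q2[0], q1[2] @ q2[1], q1[2] @ q2[2]]

hadamard_ok = all(np.allclose(d, r) for d, r in zip(direct_h, rule_h))
cnot_ok = all(np.allclose(d, r) for d, r in zip(direct_c1 + direct_c2, rule_c1 + rule_c2))
print("Hadamard rule holds:", hadamard_ok)
print("CNOT rule holds:", cnot_ok)
```

The rules are functions of the current descriptors only, which is why they can be applied step by step without reference to the earlier history `V`.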

III. DIRECT CONSTRUCTION OF DEUTSCH-HAYDEN OPERATORS
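Using Deutsch and Hayden's method we can construct a density matrix from a set of $q_{ai}$ gained by a certain preparation procedure. For example, consider a circuit of two qubits, with a Hadamard gate operating on the first and then a CNOT with qubit 1 as control and 2 as target. The qubits start at time $t = 0$ in the zero state
$$q_1 = (\sigma_x \otimes \mathbb{1},\; \sigma_y \otimes \mathbb{1},\; \sigma_z \otimes \mathbb{1}), \qquad q_2 = (\mathbb{1} \otimes \sigma_x,\; \mathbb{1} \otimes \sigma_y,\; \mathbb{1} \otimes \sigma_z).$$
The Hadamard gate takes them to
$$q_1 = (\sigma_z \otimes \mathbb{1},\; -\sigma_y \otimes \mathbb{1},\; \sigma_x \otimes \mathbb{1}), \qquad q_2 = (\mathbb{1} \otimes \sigma_x,\; \mathbb{1} \otimes \sigma_y,\; \mathbb{1} \otimes \sigma_z)$$
and then the CNOT leaves them as
$$q_1 = (\sigma_z \otimes \sigma_x,\; -\sigma_y \otimes \sigma_x,\; \sigma_x \otimes \mathbb{1}), \qquad q_2 = (\mathbb{1} \otimes \sigma_x,\; \sigma_x \otimes \sigma_y,\; \sigma_x \otimes \sigma_z). \qquad (1)$$
The density matrix is therefore
$$\rho = \tfrac{1}{4}\left(\mathbb{1} \otimes \mathbb{1} + \sigma_x \otimes \sigma_x - \sigma_y \otimes \sigma_y + \sigma_z \otimes \sigma_z\right),$$
which is the density matrix of the state $|00\rangle + |11\rangle$. A reasonable question to ask at this point would be: if we do not know (or do not care about) the preparation procedure, can we then do the reverse operation, and construct a set of $q_{ai}$ directly from a given density operator?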
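The preparation procedure above can be followed numerically. The sketch below assumes NumPy, evolves the descriptors through the Hadamard and CNOT, takes averages in the fixed state $|00\rangle$, and rebuilds the density matrix as $\rho = \frac{1}{4}\sum_{ij} \langle q_{1i} q_{2j} \rangle\, \sigma_i \otimes \sigma_j$. The variable and function names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]
Hd = (X + Z) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

# Circuit: Hadamard on qubit 1, then CNOT with qubit 1 as control.
U = CNOT @ np.kron(Hd, I2)


def evolve(op):
    """Heisenberg-picture evolution through the whole circuit."""
    return U.conj().T @ op @ U


# Descriptors at the end of the circuit; index 0 is the identity by convention.
q1 = [evolve(np.kron(s, I2)) for s in paulis]
q2 = [evolve(np.kron(I2, s)) for s in paulis]

# Averages are taken in the fixed state |00>.
ket00 = np.array([1, 0, 0, 0], dtype=complex)


def average(op):
    return np.vdot(ket00, op @ ket00)


# Rebuild the density matrix from the averages of the products q_1i q_2j.
rho = sum(
    average(q1[i] @ q2[j]) * np.kron(paulis[i], paulis[j])
    for i in range(4)
    for j in range(4)
) / 4

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
print("matches (|00> + |11>)/sqrt(2):", np.allclose(rho, rho_bell))
```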
Furthermore, how unique would such a construction be? In our example of $|00\rangle + |11\rangle$, we can immediately see that the set (1) is not unique, as we could swap the qubits in the preparation procedure. Performing $H_2$ then $\mathrm{CNOT}_{2 \to 1}$ would give us
$$q_1 = (\sigma_x \otimes \mathbb{1},\; \sigma_y \otimes \sigma_x,\; \sigma_z \otimes \sigma_x), \qquad q_2 = (\sigma_x \otimes \sigma_z,\; -\sigma_x \otimes \sigma_y,\; \mathbb{1} \otimes \sigma_x). \qquad (2)$$
Under what circumstances in general do different preparation procedures give rise to different sets of $q_{ai}$? Is this the only way that different sets of $q_{ai}$ corresponding to the same Schrödinger state can be constructed? Essentially, how much can we trust what a given analysis of information flow tells us about the dependencies in a descriptor?
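The non-uniqueness can be seen directly by comparing the two preparation procedures. The following sketch, assuming NumPy, checks that both circuits produce the same Schrödinger state and the same averages $\langle q_{1i} q_{2j} \rangle$, while the descriptor operators themselves differ. The matrix `CNOT_21` (control 2, target 1) and the helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]
Hd = (X + Z) / np.sqrt(2)
CNOT_12 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
CNOT_21 = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]], dtype=complex)
ket00 = np.array([1, 0, 0, 0], dtype=complex)

U_a = CNOT_12 @ np.kron(Hd, I2)  # H on qubit 1, then CNOT 1 -> 2
U_b = CNOT_21 @ np.kron(I2, Hd)  # H on qubit 2, then CNOT 2 -> 1


def descriptors(u):
    """Return the descriptor sets (q1, q2), identity included at index 0."""
    q1 = [u.conj().T @ np.kron(s, I2) @ u for s in paulis]
    q2 = [u.conj().T @ np.kron(I2, s) @ u for s in paulis]
    return q1, q2


def averages(u):
    """Matrix of averages of q_1i q_2j taken in |00>."""
    q1, q2 = descriptors(u)
    return np.array(
        [[np.vdot(ket00, q1[i] @ q2[j] @ ket00) for j in range(4)] for i in range(4)]
    )


same_state = np.allclose(U_a @ ket00, U_b @ ket00)
same_averages = np.allclose(averages(U_a), averages(U_b))
q1_a, _ = descriptors(U_a)
q1_b, _ = descriptors(U_b)
same_descriptors = all(np.allclose(a, b) for a, b in zip(q1_a, q1_b))

print("same state:", same_state)
print("same averages:", same_averages)
print("same descriptors:", same_descriptors)
```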
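We start by constructing a set of $q_{ai}$ from a density matrix. The density matrix is dependent on the average values of the $q_{ai}$, so we must start by finding these. We can find the values of $\langle q_{ai}(t) \rangle$ by finding the components of $\rho$ in the $q_i(0)$ directions, $\mathrm{Tr}(\rho q_i)$. Such an operation can be simplified by noting that, writing $\rho = \sum_n \rho_{1n} \otimes \rho_{2n}$, we have
$$\mathrm{Tr}\big(\rho\, \sigma_i \otimes \sigma_j\big) = \sum_n \mathrm{Tr}(\rho_{1n} \sigma_i)\, \mathrm{Tr}(\rho_{2n} \sigma_j).$$
We therefore need only to look at the components in the sum where both traces are nonzero. The nonzero traces are:
$$\mathrm{Tr}(|0\rangle\langle 0|\, \mathbb{1}) = \mathrm{Tr}(|1\rangle\langle 1|\, \mathbb{1}) = 1$$
$$\mathrm{Tr}(|0\rangle\langle 0|\, \sigma_z) = -\mathrm{Tr}(|1\rangle\langle 1|\, \sigma_z) = 1$$
$$\mathrm{Tr}(|0\rangle\langle 1|\, \sigma_x) = \mathrm{Tr}(|1\rangle\langle 0|\, \sigma_x) = 1$$
$$\mathrm{Tr}(|0\rangle\langle 1|\, \sigma_y) = -\mathrm{Tr}(|1\rangle\langle 0|\, \sigma_y) = i$$
For example, consider the state $|00\rangle + |11\rangle$. The density operator is
$$\rho = \tfrac{1}{2}\big(|00\rangle\langle 00| + |00\rangle\langle 11| + |11\rangle\langle 00| + |11\rangle\langle 11|\big).$$
The nonzero components from this will be $\mathbb{1} \otimes \mathbb{1}$, $\sigma_x \otimes \sigma_x$, $\sigma_y \otimes \sigma_y$ and $\sigma_z \otimes \sigma_z$, all of which have magnitude 1 (with the traces listed above, the $\sigma_y \otimes \sigma_y$ component comes out as $-1$ and the others as $+1$). The density matrix can therefore be written
$$\rho = \tfrac{1}{4}\left(\mathbb{1} \otimes \mathbb{1} + \sigma_x \otimes \sigma_x - \sigma_y \otimes \sigma_y + \sigma_z \otimes \sigma_z\right).$$
It is instructive now to look at the simplest possible set of $q_{ai}$ that is consistent with these conditions (and also that $q_{iy} = q_{ix} q_{iz}$): This, however, is not a well-formed set of $q_{ai}$ as it does not form a basis in the Hilbert-Schmidt space of the two systems. A proper basis must have the following properties (cf [5, pp3ff]) We can see that the set (3) fails to meet the first of these criteria: it gives only 8 linearly independent operators, rather than $4^2 = 16$. Contrast this with the $q_{1i} q_{2j}$ operators obtained from (1), which comprise the entire set of Dirac operators, which are known to be linearly independent and span the entire space.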
If we now consider other $q_{ai}$ which could give a well-formed basis, we can use these criteria to note also that $q_1$ and $q_2$ cannot be identical: these co-ordinatise two subspaces of the whole space, and if the subspaces are isomorphic then the product of their bases will not give a basis for the whole space. Furthermore, no two $q_{ai}$ can be identical, and neither can any two $q_{1i} q_{2j}$. In the first case, the resultant product operator would be $\mathbb{1} \otimes \mathbb{1}$, which is already given by $q_{10} q_{20}$ ($q_0 = \mathbb{1}$ is always understood), and hence in both instances there would be fewer than 16 linearly independent operators.
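Whether a candidate set of descriptors is well-formed can be tested by counting the linearly independent products $q_{1i} q_{2j}$. The sketch below assumes NumPy and compares the descriptors produced by the Hadamard-then-CNOT circuit with a deliberately degenerate, hypothetical set in which $q_1$ and $q_2$ are identical; this degenerate set is an illustration only and is not the set labelled (3) in the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]
Hd = (X + Z) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def product_rank(q1, q2):
    """Number of linearly independent operators among the products q_1i q_2j."""
    rows = [(a @ b).reshape(-1) for a in q1 for b in q2]
    return np.linalg.matrix_rank(np.array(rows))


# Descriptors obtained from the circuit (identity included at index 0).
U = CNOT @ np.kron(Hd, I2)
q1_circuit = [U.conj().T @ np.kron(s, I2) @ U for s in paulis]
q2_circuit = [U.conj().T @ np.kron(I2, s) @ U for s in paulis]

# Hypothetical degenerate choice: both systems are given the same operators.
q_same = [np.kron(s, I2) for s in paulis]

rank_circuit = product_rank(q1_circuit, q2_circuit)
rank_degenerate = product_rank(q_same, q_same)
print("circuit descriptors:", rank_circuit, "independent products")
print("identical q1 and q2:", rank_degenerate, "independent products")
```

A well-formed set must give 16 independent products for two qubits; the degenerate choice falls short because every product acts trivially on the second qubit.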
Let us see now how (3) may be altered to make it well-formed. The easiest way of making the combined operators span the system is to introduce $\sigma_y$ instead of one $\sigma_x$ (preserving the average values) and $\sigma_z$ for a $\mathbb{1}$: Only two of these sets of operators span the entire space: We now have three different sets of $q_{ai}$ corresponding to the same Schrödinger state: (1), (2) and (4). The interesting question now is to what extent such sets are unique. In the present case we have found three possible sets; are there any more that would represent the same state? In general, how many distinct sets would we be able to find?
In order to answer this question we will first look at how many unknowns there are in the system (that is, how many unknown quantities are needed fully to determine a set of $q_{ai}$), and then consider how many constraints there are on their values. We start by writing a general form of a $q_i$ on the $a$th system:
$$q_{1i} = \sum_{k,l} a^{(i)}_{kl}\, \sigma_k \otimes \sigma_l, \qquad q_{2j} = \sum_{m,n} b^{(j)}_{mn}\, \sigma_m \otimes \sigma_n.$$
We know that $q_{10} = q_{20} = \mathbb{1}$, so the general forms for $q_{1i} \otimes q_{2j}$ are ($n, m \in \{x, y, z\}$) We can see from this that if we can find the $a^{(i)}_{kl}$ and $b^{(j)}_{mn}$ then we will have fully described the set of operators. $a_{kl}$ for a given $i$ and $b_{mn}$ for a given $j$ each contain 16 unknowns ($k, l, m, n \in \{0, x, y, z\}$). Now we know that $q_{ay} = q_{ax} q_{az}$, so for each system we have $2(16) = 32$ unknowns. Therefore, for our system here we will need 64 numbers fully to determine a set of $q_{ai}$.
So how many constraints do we have for the system? First, we have the conditions that no two $q_{ai}$ are the same, and that they each have norm 1: where $i, j \in \{0, x, z\}$. Now, $q_{a0}$ is fixed at $\mathbb{1} \otimes \mathbb{1}$, so in each of the above equations the case $n = m = 0$ tells us nothing new. Therefore we have $3(9 - 1) = 24$ constraint equations here.
Next we have the constraint that each $q_{ai}$, $i \neq 0$, must be traceless. To see why, consider two Hilbert-Schmidt spaces, one co-ordinatised by the Dirac basis $\{\sigma_i \otimes \sigma_j\}$, and the other by $\{q_{1i} \otimes q_{2j}\}$. There exists a linear transformation between the two, given by $T$. Because both spaces are of Hermitian operators, we require $T$ to be unitary to preserve hermiticity. Therefore, the elements of $\{q_{1i} \otimes q_{2j}\}$ are a unitary transformation of the elements of $\{\sigma_i \otimes \sigma_j\}$. The trace operation is invariant under such transformation, so the elements $q_{1i} \otimes q_{2j}$ must have the same values for their trace as $\sigma_i \otimes \sigma_j$, that is, $\delta_{i0}\delta_{j0}$. This gives us the four constraint equations $a^{(n)}$
Another set of constraints comes from the criterion that the elements of each $q_a$ are linearly independent. That is, that there exist no constants $c_i$, not all zero, for which $\sum_i c_i q_{1i} = 0$ (and likewise for $q_{2j}$). Substituting (5) We know that the Dirac operators are mutually linearly independent, so each $(k, l)$ term in the above sum must equal zero independently: As $k, l \in \{0, x, y, z\}$ this gives us 16 equations for each system. However, we already know that $a$
It is interesting to note that up to this point, the constraints on the values of $q_{ai}$ have come exclusively from the structure of the Hilbert-Schmidt space: these are constraints on any physical system. We now move on to constrain our operators to a particular physical state. These constraints are the elements of the density matrix: This gives us nine equations, but only eight constraints as we know $q_{10} q_{20}$ already (1). This gives us a total of $24 + 4 + 28 + 8 = 64$ constraint equations for 64 unknown variables. Thus we see that, in general, the structure of the $q_{ai}$ is fully determined by the physical system. That is, in general a given state defined by a density matrix has a unique representation in terms of Deutsch-Hayden operators.
The caveat "in general" evidently applies, as we have been dealing with a situation where the representation is not unique. Why is this the case for our above example $|00\rangle + |11\rangle$? The reason is that we have a physical symmetry between the two systems, and this is picked up in the mathematics. All of the 'structural' constraints are, of necessity, symmetric between the systems and between basis states within each system (or else they would be making a physical statement). The physical symmetries are all contained within the structure of the density matrix. We can see from (6) that any symmetries of the density matrix will give rise to different sets of $q_{ai}$ corresponding to those symmetries. That is, if there is a symmetry of the density operator pertaining to the element $\rho_{ij}$, then the same symmetry applied to $q_{1i} q_{2j}$ will give us a physical state of affairs indistinguishable from the original. (6) tells us that such symmetries are the only means of generating separate sets of $q_{ai}$.
Let us look at the density matrix in terms of the $\{\sigma_i \otimes \sigma_j\}$: The symmetries that will give us different sets of $q_{ai}$ are the symmetries of $a_{ij}$: It is important to note at this point that we cannot have any symmetry that violates the constraints that we have imposed on the system above: so, for example, we cannot swap $q_{1x}$ with another $q_{ai}$ whose product with $q_{1z}$ is nonzero. Furthermore, $q_{1i}$ cannot be swapped with $q_{1i} q_{2j}$ unless $j = 0$: otherwise it would force $q_{2j} = \mathbb{1} \otimes \mathbb{1}$, which would give fewer than 16 independent operators. In other words, we are restricted to the physical symmetry transformations of the system, since non-physical transformations violate the constraints we have already laid on the system. In terms of symmetries of the density matrix, we are therefore restricted to swapping rows and columns. The symmetries of the density matrix are as follows ($\{\}$ denotes a single transformation). Firstly, swapping single rows and the same numbered columns (i.e. reflections in the diagonal): Then swapping two rows and two columns at a time: Any combinations of the above will also be symmetries of the matrix.
We can immediately discard symmetries of the form $\mathbb{1} \otimes \mathbb{1} \leftrightarrow \mathbb{1} \otimes \sigma_z$: these will merely change the position of $\mathbb{1} \otimes \mathbb{1}$ in a set of $q_{ai}$, but we always have it as $q_{a0}$ by convention. We can also discard several as $q_y$ is not independent of $q_x$ and $q_z$: for example, swapping $r_2 \leftrightarrow r_3$ and $c_2 \leftrightarrow c_3$ and swapping $r_2 \leftrightarrow r_4$ and $c_2 \leftrightarrow c_4$ imply the same thing. We are therefore left with the following symmetries on a set of $q_{ai}$: We also have the combinations of the above (neglecting duplicates, and also leaving out the third in any given set of $q_i$ as it is implied by the first two): These are all the symmetries present: if we take further combinations of (13-16) with (7-12) we get a symmetry already given. Therefore, the identity transformation $q_{ai} \leftrightarrow q_{ai}$ plus the transformations (7)-(16) form the group of relevant symmetries of the density matrix as applied to the $q_{ai}$ operators.
We can now use this group to generate the set of sets of $q_{ai}$ that will correspond to the state $|00\rangle + |11\rangle$. Now, because the only way to generate a different set of $q_{ai}$ is by one of these symmetry transformations, each element in the set of sets of $q_{ai}$ will have orbit 1 when acted on by the group of symmetries (we denote this by $G_\rho$); that is, any element can be reached from any other using one of the group elements. For example, (2) is generated from (1) by applying (10), and (4) by applying (8). We can therefore use (1) to generate the entire set of allowable sets of $q_{ai}$: There is one final question that needs to be answered about these alternative sets of $q_{ai}$: do they give the correct results when they are subjected to further evolution of the system? Do all these sets give the same average values after arbitrary unitary evolution?
It is fairly straightforward to prove that this is the case. Consider two sets of operators sharing the same average values, $q_{ai}$ and $q'_{ai}$. $q_{ai}$ has been constructed from a circuit and so is known to give the correct evolution, Because $q_{ai}$ and $q'_{ai}$ share the same average values, we can write Let $U$ be the transformation taking place between $t$ and $t_1$: We showed above how any basis can be written as $T^\dagger\, \sigma_n \otimes \sigma_m\, T$: In other words, for arbitrary evolution, the evolved density operator can be written in terms of the evolved $q'$ operators, and We have therefore shown that in the Deutsch-Hayden picture a given Schrödinger state is represented by the $G_\rho$-set of sets of $q_{ai}$ which is acted on by the group $G_\rho$ of symmetries of the density matrix $\rho$ of the state. Each element of the $G_\rho$-set has orbit 1, and for all practical purposes any element may be used in place of any other. We therefore see that in an analysis of information flow, the possible descriptors for the systems under consideration differ only by the physical symmetries of the systems. That is, the description of information flow and dependencies will not differ in physically significant ways between different sets of descriptors for the same situation. Any set that we choose will give us the correct description.

IV. THE MEASUREMENT INTERACTION
We now turn to measurement-type interactions in terms of the Deutsch-Hayden picture. The simplest form of a measurement interaction is full decoherence, where the system is measured by an ancillary system of the same size to which we do not have access. The simplest model of this interaction is that there is a CNOT gate with the system as the control and the ancilla as target. For example, consider a one-qubit system and a one-qubit ancilla. Without loss of generality, we can take the state of the ancilla before interaction to be $|0\rangle \Rightarrow q_a = \mathbb{1} \otimes \boldsymbol{\sigma}$. The state of the system before is given by $q_s(t)$. The state of the system afterwards corresponds to That is, the only nonzero $\langle q_{si} \rangle$ is $\langle q_{sz} \rangle$. Now, the diagonal elements of the density matrix are $\mathbb{1} \pm \sigma_z$, so what we have is the standard action of decoherence, where the off-diagonal elements of the density matrix written in the decoherence basis go to zero and the diagonal elements are unchanged.
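The one-qubit decoherence model above can be simulated directly. The sketch below assumes NumPy; the system is prepared in an arbitrary state through a hypothetical unitary `V` (angles `theta` and `phi` are illustrative), the ancilla starts in $|0\rangle$, and the averages of the system descriptors are compared before and after the CNOT.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
ket00 = np.array([1, 0, 0, 0], dtype=complex)

# Hypothetical preparation of the system qubit with a generic Bloch vector.
theta, phi = 1.0, 0.6
Ry = np.array(
    [[np.cos(theta / 2), -np.sin(theta / 2)], [np.sin(theta / 2), np.cos(theta / 2)]],
    dtype=complex,
)
Rz = np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])
V = Rz @ Ry


def system_averages(u):
    """Averages of the system descriptors (q_sx, q_sy, q_sz) after evolution u."""
    return np.array(
        [np.vdot(ket00, u.conj().T @ np.kron(s, I2) @ u @ ket00).real for s in (X, Y, Z)]
    )


before = system_averages(np.kron(V, I2))
after = system_averages(CNOT @ np.kron(V, I2))

# Reduced density matrix of the system after the interaction, built from the averages.
rho_after = (I2 + after[0] * X + after[1] * Y + after[2] * Z) / 2

print("averages before:", before)
print("averages after: ", after)
print("off-diagonal element after:", rho_after[0, 1])
```

Only the $z$ average survives the interaction, so the reduced density matrix keeps its diagonal and loses its off-diagonal elements.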
We can extend this to any size of system if we have an ancilla at least as large as the system being decohered. Then each qubit of the system will perform a CNOT with a given qubit of the ancilla, leaving only the $q_{nz}$ as nonzero. In any size system, the diagonal elements are given by the coefficients of combinations of $\mathbb{1}$ and $\sigma_z$'s, that is, combinations of $\mathbb{1}$ and $q_z$ only. Thus again we have the off-diagonal elements becoming zero and the diagonal elements unchanged.
Moving away from decoherence, let us now suppose that we have access to the ancilla. In our two-system example, the operators for this ancilla will be We can see from this that only the information contained within the $q_{sz}(t)$ operator is picked up by the ancilla. This is because, were we at a later time to have access only to the ancilla, the information about the first system that could be extracted is $q_{sz}(t)$ only. Again, this is what we want: the $q_{sz}(t)$ components give the probabilities of outcomes of measurement in the measurement basis (as they give the diagonal elements of the density matrix). This is all the information about a state that can be gained by measuring it.
We have seen what happens when we perform a measurement in the $|0\rangle, |1\rangle$ basis. What if we wish to measure in a different basis? In such a case, the system would still only have a nonzero $q_z$ component (and the ancilla only pick up information from $q_z$), but after the $q_{ai}$ had been rotated in the change of basis. For example, suppose we wished to measure the system in the basis $|0\rangle \pm |1\rangle$. To change to this basis from $|0\rangle, |1\rangle$ we perform a Hadamard transformation. The system operators are therefore $(\sigma_z, -\sigma_y, \sigma_x)$. Performing a CNOT with the ancilla then gives us What we have in effect done here is change the meaning of $\boldsymbol{\sigma}$ as the basis changes, and compensate by changing the coefficients. $\boldsymbol{\sigma}$ becomes with respect to the new basis rather than $|0\rangle, |1\rangle$, and the coefficients will necessarily change.
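What the ancilla picks up in each measurement basis can be checked numerically. The sketch below assumes NumPy and a hypothetical system preparation `V`; it computes the averages of the ancilla descriptors after a CNOT, once in the computational basis and once after a Hadamard on the system.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Hd = (X + Z) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
ket00 = np.array([1, 0, 0, 0], dtype=complex)

# Hypothetical system preparation with Bloch vector
# (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta)).
theta, phi = 1.0, 0.6
Ry = np.array(
    [[np.cos(theta / 2), -np.sin(theta / 2)], [np.sin(theta / 2), np.cos(theta / 2)]],
    dtype=complex,
)
Rz = np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])
V = Rz @ Ry


def averages(u, on_ancilla):
    """Averages of the descriptors of the system or of the ancilla after evolution u."""
    ops = [np.kron(I2, s) if on_ancilla else np.kron(s, I2) for s in (X, Y, Z)]
    return np.array([np.vdot(ket00, u.conj().T @ op @ u @ ket00).real for op in ops])


bloch = averages(np.kron(V, I2), on_ancilla=False)  # system before measurement

# Measurement in the |0>, |1> basis: CNOT only.
ancilla_computational = averages(CNOT @ np.kron(V, I2), on_ancilla=True)
# Measurement in the |0> +/- |1> basis: Hadamard on the system, then CNOT.
ancilla_rotated = averages(CNOT @ np.kron(Hd @ V, I2), on_ancilla=True)

print("system before:          ", bloch)
print("ancilla, computational: ", ancilla_computational)
print("ancilla, rotated basis: ", ancilla_rotated)
```

In both cases the ancilla only acquires a $z$ average; the Hadamard simply changes which component of the system sits in the $q_z$ slot when the CNOT is applied.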
In modelling the measurement of the system as a CNOT gate with an ancilla as the target, we have been dealing solely with projective measurements. For completeness, we would like to look at the case of generalised measurement, where the states of the measurement basis are not necessarily mutually orthogonal.
A POVM measurement can always be modelled by adding a second system to the one being measured, performing a projective measurement on both systems and then tracing out the second system. Now, we have seen that the density operator in any basis can always be written as When the second system is traced out this becomes All the interesting information is therefore contained in $a = q_{1z}$. Take the single-qubit system we measured above. We add a second system, choosing it to be in the $|0\rangle$ state. Now we measure the joint system in the Bell basis. In terms of the $q_{ai}$ this means performing a Hadamard then a CNOT gate. At the end of this procedure the operators for the two systems are: To complete the measurement, we must add a two-qubit ancilla and perform CNOTs between the two systems and the ancilla. We saw above that this will make all the averages for the two qubits individually zero except $q_z$. Therefore, in the generalised as in the projective measurement, what can be gained at measurement is knowledge of $q_{sz}$.
In what way, then, is the generalised measurement more powerful than a projective measurement? In terms of the Deutsch-Hayden operators, the difference comes in the choice of the basis for measurement. There is a much greater range of operations that can be carried out on two systems rather than one, and so there is more opportunity to 'move', as it were, different information into the $q_z$ 'slot' from where it can be picked up by the measurement.

V. RELATIVE STATES
We have seen what happens to the state of a system which is measured, and where the ancilla which performed the measurement is subsequently traced out. We now look at what happens when the ancilla is not ignored, but rather one state of it is singled out. In such a case, what is the relative state of the system that was measured, as given by the Deutsch-Hayden descriptors? The above results will, of course, be the special case where the relative state of the ancilla is completely unknown, that is, $\mathbb{1}$.
As is well known, if we have two systems with joint density matrix $\rho_{12}$ and a state of the second system $|\beta\rangle\langle\beta|$ (this can be one of the elements of a POVM), then the state of the first system is given by the partial trace over the second system, $\rho_1 = \mathrm{Tr}_2(\rho_{12}\, |\beta\rangle\langle\beta|)$, where $|\beta\rangle\langle\beta|$ acts on the second system only. Writing the density matrix as (17) we have We want to write this in the form $\sum_i a'_i \sigma_i$, which gives us In terms of the $q_{ai}$ we therefore have This can be achieved if the $q'_i$ for the first system relative to the state $|\beta\rangle\langle\beta|$ of the second system are This expression is useful if we have the relative Schrödinger state. We can also frame this in terms of relative Deutsch-Hayden operators. If $q'_{2i}$ are the operators for the second system when it is in the state $|\beta\rangle$, then we know that $\langle\beta|\sigma_n|\beta\rangle = \langle q'_{2n} \rangle$ (i.e. the coefficient of $\sigma_n$ in the expansion of the density operator). The operators for system 1, $q'_{1i}$, relative to the operators $q'_{2i}$ of the second system, are (unprimed operators are for the original state of the systems) We check this for the case where the second system is simply discarded, and its density matrix is therefore $\mathbb{1}$.
In that case we have (remembering that n runs from 1 to 3) which is as we found above.
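The partial-trace expression for relative states can be illustrated at the level of density matrices. The sketch below assumes NumPy and leaves the relative states unnormalised; the function names `partial_trace_second` and `relative_state` are illustrative. It uses the state $(|00\rangle + |11\rangle)/\sqrt{2}$ and includes the special case in which the second system is discarded (relative operator $\mathbb{1}$).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)


def partial_trace_second(op):
    """Trace out the second qubit of a two-qubit operator."""
    return np.einsum("abcb->ac", op.reshape(2, 2, 2, 2))


def relative_state(rho12, sigma2):
    """Unnormalised state of system 1 relative to the operator sigma2 on system 2."""
    return partial_trace_second(rho12 @ np.kron(I2, sigma2))


bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho12 = np.outer(bell, bell.conj())

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
proj0 = np.outer(ket0, ket0.conj())
proj_plus = np.outer(ket_plus, ket_plus.conj())

rel_to_0 = relative_state(rho12, proj0)
rel_to_plus = relative_state(rho12, proj_plus)
rel_to_identity = relative_state(rho12, I2)  # second system simply discarded

print(rel_to_0)
print(rel_to_plus)
print(rel_to_identity)
```

Each relative state carries a weight equal to the probability of the corresponding outcome (here 1/2); dividing by its trace gives the normalised state.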
We can now use the expression for relative states to look in further detail at the measurement interaction. Let us consider here a specific example, where the system to be measured is in the state $|0\rangle + |1\rangle$, with descriptors Now we add the ancilla and perform the CNOT operation: Now we can look at the states of the first system relative to those of the ancilla. We choose the relative states of the ancilla to be in the computational basis, and use (18) to give us the relative state of the system. The relative operators of the system are therefore where the primed are relative to $|0\rangle\langle 0|$ and the double-primed to $|1\rangle\langle 1|$. We can see from these $q_{ai}$ that the states of the systems in both cases are eigenstates of one of the $z_\pm$ operators (recall that these operators give the probabilities for 0 and 1 in the computational basis). An interesting question at this point is to ask what the corresponding operators for the ancilla are. That is, what are the complete sets of primed and double-primed operators?
To answer this question we must pay close attention to the physical situation of the measurement. We have in a rather cavalier fashion introduced the states $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$ of the ancilla, which is part of the joint system. Where do these states come from? What we always mean in a situation such as this is that the system under consideration (here, the ancilla) has been measured in a basis of which the relevant state is a part, and then one basis state has been 'picked out' over the others. Physically, it is picked out by being the state which is relative to some other state that we are interested in. At some point, then, in a tractable analysis we will have to simply stop and pick out, by fiat, the state relative to which we wish to find other states. At this point we construct the operators for this 'ultimate' state from the density operator that we wish it to have, and then find everything else relative to it. Where this operation takes place is entirely a matter of convenience. What must be remembered, though, is that at this cut-off point we then lose the record of the interactions of that system with other systems. For example, we could not in the current situation choose to construct the operators for the ancilla from the ground up. That would give Comparing this with (19), we see that we have lost the information about the first system that was contained in the ancilla.
In this situation, what we need to do is push the ultimate state back one system further, and have the ancilla measured by a third system. If we start that system in the state $|0\rangle$ and perform a CNOT with the ancilla as control and the third system as target, then we have The operators of the ancilla relative to the states $|0\rangle$ and $|1\rangle$ of the third system are, respectively, $q^+_2$ and $q^-_2$: These then are the operators for the ancilla to be in the states $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$, relative to which the first system states are (20) and (21).
If we look at (20), (21) and (22) we notice something interesting: the sum of the relative $q_{ai}$ is (neglecting a factor of 2) the original $q_{ai}$ for the system. This is, in fact, the case for any size system where the states relative to it are a complete POVM: the sum of the $q_{ai}$ relative to each of the states of the POVM is the $q_{ai}$ for the system before measurement. The proof is straightforward. Using Physically, then, we have performed a POVM measurement on the ancilla system, and are now looking at states of the first system with respect to elements of the POVM.
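The statement that the relative states sum to the original state for a complete POVM can be checked at the level of density matrices. The sketch below assumes NumPy, uses unnormalised relative states, and takes a three-element "trine" POVM on the ancilla as an illustrative choice that does not appear in the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Hd = (X + np.array([[1, 0], [0, -1]], dtype=complex)) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def partial_trace_second(op):
    """Trace out the second qubit of a two-qubit operator."""
    return np.einsum("abcb->ac", op.reshape(2, 2, 2, 2))


# System in |0> + |1>, ancilla in |0>, followed by the measuring CNOT.
ket00 = np.array([1, 0, 0, 0], dtype=complex)
psi = CNOT @ np.kron(Hd, I2) @ ket00
rho12 = np.outer(psi, psi.conj())

# Illustrative complete POVM on the ancilla: three real states 120 degrees apart
# on the Bloch sphere, each weighted by 2/3.
angles = [0.0, np.pi / 3, 2 * np.pi / 3]
kets = [np.array([np.cos(a), np.sin(a)], dtype=complex) for a in angles]
povm = [(2 / 3) * np.outer(k, k.conj()) for k in kets]

# Unnormalised states of the system relative to each POVM element.
relative = [partial_trace_second(rho12 @ np.kron(I2, e)) for e in povm]
total = sum(relative)
reduced = partial_trace_second(rho12)

print("POVM complete:", np.allclose(sum(povm), I2))
print("relative states sum to reduced state:", np.allclose(total, reduced))
```

With normalised relative states the sum would instead pick up the outcome probabilities as weights, which is the origin of the overall factor mentioned in the text.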

VI. MIXED STATES
So far we have been considering only the representation of pure states in the Deutsch-Hayden picture. We shall see that one of the advantages of this picture is that the representation of mixed states does not differ in kind from the pure-state representations. We will also see that the Deutsch-Hayden representation of mixed states is transparent to their physical origins.
The Deutsch-Hayden operators $q_{ai}$ form a basis in the Hilbert-Schmidt space. This is not, however, an arbitrary basis: the elements of the individual $q_a$ combine to form the elements of the basis. We have, in fact, a product basis on the Hilbert space. We can therefore consider the space of the system as a product Hilbert space, of factor spaces both of which (or each of which for more than two systems) are co-ordinatised by $q_a$. These factor spaces will, of course, evolve over time: the product does not remain fixed. However, at each time $t$ the space of the system may be written as the product $H_1(t) \otimes H_2(t)$, where $\{q_{1i}(t)\}$ forms a basis on $H_1$ and $\{q_{2j}(t)\}$ forms one on $H_2$.
In order to see to what these factor spaces correspond, consider the situation at time $t = 0$. In this case $q_1 = \sigma \otimes \mathbb{1}$ and $q_2 = \mathbb{1} \otimes \sigma$, and in the Schrödinger representation we have the state $|00\rangle$. The spaces co-ordinatised by the $q_a$ are therefore transparently the spaces corresponding to operators of the two systems. Let us look at, for example, an operator that operates only on the first system: $\hat{A} \otimes \mathbb{1}$. This can be written Therefore at a given subsequent time $t$ we have That is, anything that can be said about the first system alone is contained within the $q_{1i}$ alone, with no reference to other $q_{ai}$. The space co-ordinatised by $q_{1i}(t)$ at any given time is therefore the space of operators on the first system alone. Thus, at any given time, the overall space of the system is a product space of the individual system spaces. Unlike in a standard Hilbert space representation, this is the case always, regardless of whether the systems are entangled or not.
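The expansion that this argument relies on can be sketched as follows. This is a standard Pauli decomposition rather than a reproduction of the displayed equations; the coefficients $a_\mu$ and the convention $\sigma_0 \equiv \mathbb{1}$ are notation assumed here for illustration.

```latex
\begin{align}
\hat{A} &= \sum_{\mu=0}^{3} a_\mu \sigma_\mu ,
  \qquad a_\mu = \tfrac{1}{2}\,\mathrm{Tr}(\sigma_\mu \hat{A}),
  \qquad \sigma_0 \equiv \mathbb{1}, \\
U^\dagger(t)\,(\hat{A} \otimes \mathbb{1})\,U(t)
  &= \sum_{\mu=0}^{3} a_\mu\, U^\dagger(t)\,(\sigma_\mu \otimes \mathbb{1})\,U(t)
   = a_0\,\mathbb{1} + \sum_{i \in \{x,y,z\}} a_i\, q_{1i}(t).
\end{align}
```

Only the $q_{1i}(t)$ appear on the right-hand side, which is the sense in which the first system is described without reference to any other $q_{ai}$.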
In none of the foregoing have we had to say whether the $q_{ai}$ correspond to pure or to mixed states. It might be questioned whether mixed states can indeed be accommodated in this picture: after all, we have constructed the $q_{ai}$ by evolving them unitarily from a fixed state, and we know that unitary evolution alone cannot produce mixed states from pure.
The key to representing mixed states is the fact that all mixed states can be considered as pure states where one or more systems have been traced out [4, pp. 110ff]. The Deutsch-Hayden operators for a mixed state on a system $a$ will be the $q_{ai}$ which have been evolved as part of a larger, pure system. For example, we have found the $q_{ai}$ corresponding to $|00\rangle + |11\rangle$ on two systems. If we trace out, for example, the second system then we are left with the maximally mixed state $|0\rangle\langle 0| + |1\rangle\langle 1|$. This is represented in Deutsch-Hayden terms by We can see that one of the most immediate differences between pure and mixed states is the dimension of the non-trivial part of the subspace for the system. A pure state of the first system would be represented by operators of the form $\sigma_i \otimes \mathbb{1}$; for example, $|0\rangle + |1\rangle$ is given by by contrast, the subspace of a mixed-state system is non-trivial in more dimensions, corresponding to the systems which have been traced over to get from the pure to the mixed state. Note that in the Deutsch-Hayden picture there is no such thing as tracing out a system: whether or not we are looking at the second system makes no difference to the way in which operators on the first system are constructed.
It is a necessary and sufficient condition of a state being mixed that its $q_{ai}$ cannot be written non-trivially on the same size subspace that a pure state of the same system would require. If this were possible, then the operators in the non-trivial part of the subspace could be written for some unitary operator $U$. The average values of the $q$ could then be written Note that $\rho$ in the final line is pure, which gives the contradiction: these are the average values of a basis set in some pure state, which cannot give the same outcomes as a mixed state. We will therefore always need a larger subspace to describe a mixed system than a pure one.
As well as distinguishing pure and mixed states in this way, we can also frame the standard condition $\mathrm{Tr}\rho^2 < 1$ in terms of the $q_{ai}$ (here for 2 systems): The condition for a mixed state is therefore The Schmidt decomposition condition for pure states can also be framed in terms of the $q_{ai}$. The Schmidt decomposition of a pure state density matrix is Writing this in terms of $\sigma$-matrices, that is The coefficients are therefore (in terms of the $q_{ai}$) These follow the usual Schmidt rule $a^2 + b^2 + c^2 + d^2 = 1$, which when expanded in these terms gives (24).
As well as distinguishing mixed states when we are presented with them, we can also construct the $q_{ai}$ for mixed states in the same way that we constructed them for pure states. We can either construct the purified state using a circuit, or we can construct them directly from the density operator. Let us look at both of these in turn for the simple state $|0\rangle\langle 0| + |1\rangle\langle 1|$ given by (23).
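The purity condition $\mathrm{Tr}\rho^2 < 1$ can be checked numerically from Pauli averages alone. The sketch below is not the expression (24); it uses the standard two-qubit identity $\mathrm{Tr}\rho^2 = \frac{1}{4}\sum_{\mu\nu} \langle \sigma_\mu \otimes \sigma_\nu \rangle^2$, and assumes that at time $t$ these averages coincide with the averages of the corresponding products of descriptors. All function names are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Single-qubit Pauli matrices; index 0 is the identity.
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def trace_of_product(a, b):
    """Tr(a b), computed without forming the full matrix product."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def purity_from_paulis(rho):
    """Tr(rho^2) for two qubits from the 16 averages <sigma_mu x sigma_nu>."""
    total = 0.0
    for mu, nu in product(range(4), repeat=2):
        avg = trace_of_product(rho, kron(PAULIS[mu], PAULIS[nu]))
        total += abs(avg) ** 2
    return total / 4

def projector(psi):
    """Density matrix |psi><psi| of a state vector."""
    return [[x * complex(y).conjugate() for y in psi] for x in psi]

s = 2 ** -0.5
bell = projector([s, 0, 0, s])  # pure state (|00> + |11>)/sqrt(2)
# Classical mixture (|00><00| + |11><11|)/2, chosen here as a mixed example.
mixed = [[0.5 if i == j and i in (0, 3) else 0 for j in range(4)]
         for i in range(4)]

purity_bell = purity_from_paulis(bell)
purity_mixed = purity_from_paulis(mixed)
direct_mixed = trace_of_product(mixed, mixed).real  # Tr(rho^2) computed directly
```

For the pure state the sum of squared averages saturates the bound, while for the mixture it falls below it, in agreement with the directly computed trace.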
Firstly we will try the purification method. This is straightforward [4, pp. 110ff]: take the mixed density operator in its diagonal basis and then add another system with the same basis, to make a pure state whose Schmidt decomposition for the original system gives the basis in which the density operator is diagonal. In the case of $|0\rangle\langle 0| + |1\rangle\langle 1|$ we add a second system with basis $\{|0\rangle, |1\rangle\}$. Possible purifications are $|00\rangle + |11\rangle$ and $|01\rangle + |10\rangle$. We now find the corresponding $q_{ai}$ for the whole system, and then take only the $q_{ai}$ for the original system. In the first case we would have (23). The second can be found by flipping the second bit in the state $|00\rangle + |11\rangle$, which has the effect on the $q_{ai}$ of being acted on by $\mathbb{1} \otimes \sigma_x$. As it happens, this gives us back the same $q_{ai}$, (23).
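That both purifications describe the same state of the original system can be checked in the Schrödinger picture. The following is a minimal sketch using an explicit partial trace (an operation that, as noted above, has no counterpart in the Deutsch-Hayden picture itself); the function name is a hypothetical helper.

```python
def reduced_first_qubit(psi):
    """Partial trace over qubit 2 of a two-qubit pure state.

    psi lists the amplitudes of |00>, |01>, |10>, |11> in that order.
    """
    return [[sum(psi[2 * a + k] * complex(psi[2 * b + k]).conjugate()
                 for k in range(2))
             for b in range(2)]
            for a in range(2)]

s = 2 ** -0.5
phi_plus = [s, 0, 0, s]  # (|00> + |11>)/sqrt(2)
psi_plus = [0, s, s, 0]  # (|01> + |10>)/sqrt(2), second bit flipped

rho_a = reduced_first_qubit(phi_plus)
rho_b = reduced_first_qubit(psi_plus)
```

Both reduced matrices are proportional to the identity, so the choice of purification does not affect the description of the first system.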
Let us now try direct construction. We have the density operator for the state, $\rho = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, which tells us We know that we will not be able to express the $q_i$ as combinations of $\sigma$ (in this case as well this is obvious), so we will add an extra two dimensions to the subspace under consideration and try combinations of $\sigma_i \otimes \sigma_j$. We have quite a lot more freedom in this construction than in that of the pure states, as we do not care what the operators for the ancillary system are. All we need to make sure is that the operators for the system we do care about co-ordinatise a well-formed subspace of the space for the whole system. It is easiest to start off with $\sigma$ in the first position, and then add $\sigma_x$ where needed, giving We verify that this is indeed a representation of $|0\rangle\langle 0| + |1\rangle\langle 1|$ as they are the $q_1$ operators from the set $q^{(6)}_{ai}$ in the previous section.
This representation of mixed states is a very physical one. The Deutsch-Hayden picture does not recognise any form of evolution other than unitary evolution, and this is reflected in the fact that even when a set of $q_{ai}$ for a mixed state is directly constructed as above, its structure shows that it is in fact part of a larger pure state. The size of the mixed-state operators can be found from the purification method used above. An extra system is needed to purify each system that is mixed, so the Deutsch-Hayden operators will in general be twice as long for mixed states as for pure states of the same sized system. That is, a mixed state of $N$ qubits will in general be part of a pure state of dimension $2.2^{N}$, and the Deutsch-Hayden descriptors will have length $2N$.

VII. REDUCED DESCRIPTORS
We have seen that an n-dimensional mixed system needs to be written on a larger space than a corresponding pure system. We will now look in detail at the ability to write a descriptor on a reduced space, in particular with reference to entanglement between systems.
Let us take a three qubit system, the overall state of which is pure, and then neglect the third qubit (that is, we are only interested in the descriptors for the first two systems). If the first two systems together are not entangled with the third then their descriptors will satisfy Writing this out in full, with $U$ giving the evolution of all three systems gives us Now in general $a$, $b$ and $c$ are sums, As this component is never zero, $t^{(a,b,c)}_0 \neq 0$. Therefore the trace of $a$, $b$ and $c$ will always be 1 (neglecting normalisation), and hence $\mathrm{Tr}(a \otimes b \otimes \mathbb{1})$ is a simple, constant numerical factor which can be neglected here, and can be incorporated into the normalisation of the descriptors in the end. We therefore have Now let us define $U_{12}$ as the elements of the matrix $U$ that act only on $H_{12}$. We then have What then are the expressions $U_{12}^\dagger\, \sigma \otimes \sigma\, U_{12}$ exactly? They are the same as the descriptors $U^\dagger\, \sigma \otimes \sigma \otimes \mathbb{1}\, U$ where only the action on $H_{12}$ is taken into account. We can therefore see that they are the "simply reduced" $q_1 q_2$; that is, the descriptors with only their $H_{12}$ components. For example, if $q_1 = \sigma_x \otimes \sigma_z \otimes \sigma_y$ then the simply reduced form on $H_{12}$ is $\sigma_x \otimes \sigma_z$.
At this point we need to introduce some terminology. We will say that: A descriptor $q$ is said to be represented by an expression $n$ when it is the case that $\langle q \rangle = \langle n \rangle$. We saw examples of representation previously, where many different sets of $q$'s described the same density matrix; each set was represented by the others. However, representation is not restricted to such examples on the same Hilbert space: in general the dimensionality of $n$ and $q$ differ.
Looking at (25) we can see that this is a case of representation. We have shown that $q_1 q_2 \in H_{123}$ can be represented by $U_{12}^\dagger\, \sigma \otimes \sigma\, U_{12} \in H_{12}$. This is therefore the key to representing pure states. When looking at a system $s$, only the elements of the descriptor on $H_s$ need be used. When considering only that system, the components on other spaces make no difference to the average values, and might as well be replaced by $\mathbb{1}$.
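The sense in which the simply reduced descriptors represent the full ones can be illustrated numerically. The sketch below assumes a factorised evolution $U = U_{12} \otimes U_3$, with $U_{12}|00\rangle$ a Bell state and $U_3|0\rangle$ an arbitrary single-qubit state (both choices are illustrative), and compares the averages of $\sigma_i \otimes \sigma_j \otimes \mathbb{1}$ on $H_{123}$ with those of $\sigma_i \otimes \sigma_j$ on $H_{12}$. The helper functions are hypothetical.

```python
from itertools import product
from math import cos, sin

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def kron_vec(u, v):
    """Tensor product of two state vectors (u's qubits come first)."""
    return [x * y for x in u for y in v]

def apply_local(psi, ops):
    """Apply one 2x2 operator per qubit (ops[0] acts on qubit 1) to psi."""
    n = len(ops)
    out = [0j] * len(psi)
    for col, amp in enumerate(psi):
        if amp == 0:
            continue
        bits_in = [(col >> (n - 1 - q)) & 1 for q in range(n)]
        for bits_out in product((0, 1), repeat=n):
            coeff = amp
            for q in range(n):
                coeff *= ops[q][bits_out[q]][bits_in[q]]
            row = sum(b << (n - 1 - q) for q, b in enumerate(bits_out))
            out[row] += coeff
    return out

def average(psi, labels):
    """<psi| sigma_labels |psi> for a string of Pauli labels such as 'XZI'."""
    phi = apply_local(psi, [PAULIS[c] for c in labels])
    return sum(complex(a).conjugate() * b for a, b in zip(psi, phi)).real

s = 2 ** -0.5
psi12 = [s, 0, 0, s]            # U12|00>: qubits 1 and 2 in a Bell state
psi3 = [cos(0.3), sin(0.3)]     # U3|0>: qubit 3 evolved on its own
psi123 = kron_vec(psi12, psi3)  # qubits 1,2 unentangled with qubit 3

# Averages on the full space H123 and on the reduced space H12.
full = {a + b: average(psi123, a + b + "I") for a in "XYZ" for b in "XYZ"}
reduced = {a + b: average(psi12, a + b) for a in "XYZ" for b in "XYZ"}
```

The two dictionaries agree entry by entry, so in this unentangled case nothing is lost by dropping the components on $H_3$.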
As a consequence, when looking at the past evolution of the system we may also replace the action of $U$ on $H_3$ by $\mathbb{1}$.
(It becomes necessary here to introduce some notation. There is the potential for confusion between descriptors with different numbers of components. When the context does not make clear on which space they have support, we will write for example $[q_1]_{12}$ for the descriptor $q_1$ with components only on $H_{12}$, that is, $U_{12}^\dagger\, \sigma \otimes \mathbb{1}\, U_{12}$.) Another way of seeing this is by considering the following way of expressing the descriptors for the first two systems: So what happens when there is no entanglement with qubit 3 is that the expression $\mathrm{Tr}(U_3^\dagger |0\rangle\langle 0| U_3)$ becomes irrelevant; it is not zero when $[q_1 q_2]_{12}$ is not zero.
We have seen what happens when there is no entanglement between the two systems. What changes then if the first two qubits are in a mixed state, that is, if they are entangled with the third qubit? If we restrict ourselves to simply that system, what can we say about the qubits and their descriptors? The most obvious point is that we will no longer be able to describe them on $H_{12}$ using the simply-reduced descriptors. The information contained in $H_3$ becomes relevant to describing the past evolution of the two-qubit system; as shown in Chapter 2, we require $H_{123}$ on which to write them.
We now have a problem. As can be seen from all the foregoing work on the Deutsch-Hayden approach there is no operation in the formalism that corresponds to the action of tracing out a system, which is the usual way of getting a mixed state representation. Subsystems and their descriptors have been considered in their physical context, as part of the wider system, and their descriptors written on the space of that wider system. The question is: given that there is no physical operation corresponding to tracing out a system, what happens when we have no knowledge of anything other than the system under consideration? The solution is prompted by the foregoing paragraph: that ignoring the action of the third system means ignoring $H_3$. So if we want to restrict our attention simply to the system of qubits 1 and 2 then we can only have access to $H_{12}$ on which to write our descriptors. Hence the problem.
This may appear to be labouring the point somewhat as it is obvious what the solution to this is going to be. It is, however, instructive to show exactly how that comes about in the Deutsch-Hayden approach, and why it is needed: why we cannot just deal with the descriptors as we have them on $H_{123}$.
The solution is that we have found a set of operators on $H_{123}$ which we know can be simply reduced to $H_{12}$. If we can write our 'mixed' descriptors in terms of these 'pure' descriptors then we have a way of representing the descriptors associated with mixed states on $H_{12}$. There are two ways in which we could do this. Either a combination of the descriptors associated with pure states could represent the descriptor we are concerned with, or else the pure descriptors could form a complete basis on $H_{123}$ for all possible $U^\dagger\, \sigma \otimes \sigma \otimes \mathbb{1}\, U$.
We know from standard quantum mechanics that the first is going to be the case: this is the standard way of writing mixed states as a mixture of pure ones. We can show this in the present case. Let us look again at (26). We can re-write this as That is, we can represent $q_1 q_2$ on $H_{12}$ by the expression $n \otimes m$. Now, $n \otimes m$ can be expressed in terms of a complete basis $\{\lambda_i\} \in H_{12}$: We know one candidate for $\{\lambda_i\}$: the set of all possible descriptors on $H_{12}$, namely $U^\dagger\, \sigma \otimes \sigma\, U$. We can therefore write where and where the $q_{ni}$ (given by the $U_i$) are those descriptors on $H_{123}$ that describe qubits 1 and 2, unentangled with qubit 3.
What about the second possibility, that any given $q_1 q_2 \in H_{123}$ can be written as a mixture of the pure $q_{1i} q_{2i}$, rather than simply being represented by them? This would require us to be able to write where again the $U_i$ give qubits 1 and 2 unentangled with qubit 3. We have written $\lambda_i$ here rather than $w_i$ as they do not have the same form: where $U_{123}$ gives the first two qubits in a mixed state. We can test whether this is the correct form of $q_1 q_2$ by considering the physical predictions that it would give. We know that the correct values for the averages of these descriptors are given by (28). Writing $q_1 q_2$ as in (29) would give as the average values Comparing this with (27) we see that these can only be identical if $\lambda_i = w_i$. Suppose that this is the case. We would then have Now let us write This gives us which would require that for all $i$ such that $\mathrm{Tr}(U_i^\dagger\, \sigma \otimes \sigma\, U_i\, U_{12}^\dagger\, \sigma \otimes \sigma\, U_{12}) \neq 0$, which is a contradiction, as the left hand side is a constant value for all $i$ whereas the right hand side is a variable. It is therefore not the case that $q_1 q_2$ itself can be written as a mixture of descriptors for pure states, but it is the case that it can be represented by such.
This is a very interesting situation. What this shows is that in the Deutsch-Hayden approach the term "mixture" is fundamentally incorrect. The descriptors for such systems can be represented by a mixture of descriptors for pure states, but they are not themselves identical with a mixture, even when looking at the pure descriptors spanning the original Hilbert space. All of this is very different from a standard density-matrix formalism, where the pure state density matrices co-ordinatise the space of possible density matrices and are hence 'special' in a way that the mixed density matrices are not. Indeed, the mixed density matrices are given as different types of entities from the pure state ones (a fact emphasised by the standard Schrödinger notation, which cannot deal with them).
In terms of the Deutsch-Hayden approach, mixed and pure states are both described by sets of descriptors on a Hilbert space of certain dimensions, this corresponding to the number of systems involved in the evolution of the system under consideration. The difference between pure and mixed comes when we wish to reduce the Hilbert space on which we write our descriptors, that is, when we wish to ignore certain subsystems in the operation analogous to tracing out systems. Only then does any notational difference arise, some systems needing a sum of descriptors to describe them, some able to use the reduced descriptors. However, they are still both on a par with each other: the descriptors corresponding to pure states are not 'special' in any fundamental way, and they do not co-ordinatise the space of all descriptors. Each is as fundamental as the other, without the definition of one being dependent on the other.

VIII. ENTANGLEMENT SWAPPING
We will now look at the flow of information in an entanglement swapping situation. In the Schrödinger representation, the particular example that we will use is the following [6]. We have four qubits in the overall state $|\Phi\rangle$: If we then measure qubits 2 and 3 in the Bell basis, we can see that qubits 1 and 4 become entangled, and the measurement destroys the entanglement within the pairs (1,2) and (3,4): it has "swapped" to the pair (1,4). The locality issues of this operation are well known: the qubit pairs (1,4) and (2,3) could be sent to opposite ends of the galaxy after the original entangling operations, and yet a subsequent operation on (2,3) will entangle (1,4). Even worse, qubits 1 and 4 could be separated before the measurement on (2,3), yet after it they are in a maximally entangled state, which can for example be used to teleport between them.
We start our analysis with the four qubits in the state $|\Phi\rangle$: We now rotate qubits 2 and 3 to the Bell basis by performing a Bell gate, giving Now we perform the measurement on the pair (2,3) in the Bell basis by introducing two further qubits and performing two CNOT operations between (3,5) and (2,6), leaving us with the complete set of descriptors If we now look at these descriptors, we can see some very interesting things. First, let us look at the density matrices for the pairs (2,3) and (1,4). For both pairs, $\rho_{ab} = \mathbb{1} \otimes \mathbb{1}$ and $\rho_{a,b} = \mathbb{1}$, from which we can see that the pairs are not entangled. Next, we note that neither pair is in a pure state, as we cannot reduce their descriptors simply to the elements on the Hilbert spaces for those systems. Furthermore, if we look at the pairs (1,2) and (3,4) which were originally entangled, we see that they too have $\rho_{ab} = \mathbb{1} \otimes \mathbb{1}$, $\rho_{a,b} = \mathbb{1}$, so are no longer entangled. The pairs that are entangled at this point are (3,5) and (2,6), which have become so through the measurement interaction.
The exact form of these descriptors is also interesting. We know that, because of the locality of interactions in the Deutsch-Hayden representation, the only way in which a descriptor can have a non-trivial dependence on $H_a$ is by interacting either directly with system $a$, or with something that has interacted with it. We can therefore trace, not only dependencies on specific quantities as Deutsch and Hayden did, but also non-trivial elements of a descriptor on a certain Hilbert space associated with another system. If we look at the descriptors for qubits 2 and 3 we can see that they have dependencies on many spaces, and we can trace where they have come from. Qubit 2 has non-trivial elements on $H_{1,2,3,6}$. The dependence on $H_1$ comes from the original Bell state that the pair (1,2) shared. The $H_3$ element comes from the Bell gate interaction with qubit 3, and the $H_6$ element from the CNOT operation with qubit 6. Similarly we can trace the dependencies of qubit 3, which are on $H_{2,3,4,5}$. We can also see that qubits 1 and 4 remain as they were at the beginning: they have not interacted any further in the protocol, and this is reflected in the components of their descriptors. Finally we note that the Hilbert space dependencies of qubits 5 and 6 are the same as those of the qubits that they interacted with in the CNOT operations.
In order to fully 'swap' the entanglement, we are going to need to look at the various descriptors relative to the four binary numbers stored in qubits 5 and 6. By construction, the pair (2,3) will be in one of the four Bell states, but what about the pair (1,4)? First, we need the four-system version of our expression (18) for relative descriptors. This is easily shown to be Now we use the fact that where $\{|\beta\rangle\langle\beta|\}$ is the set of states of (3,4) relative to which we are finding the descriptors for (1,2). If $\{|\beta\rangle\langle\beta|\} = \{|00\rangle\langle 00|, |01\rangle\langle 01|, |10\rangle\langle 10|, |11\rangle\langle 11|\}$ then we have

$$q'_{12ij} = q_{12ij}\,\bigl(1 \pm (q_{3z} + q_{4z}) + q_{3z} q_{4z}\bigr) \quad \text{(upper sign for 00, lower sign for 11)},$$
$$q'_{12ij} = q_{12ij}\,\bigl(1 \pm (q_{3z} - q_{4z}) - q_{3z} q_{4z}\bigr) \quad \text{(upper sign for 01, lower sign for 10)}.$$

If we look at the descriptors for qubits 1 and 4, and the $\sigma_z$-components of qubits 5 and 6, we see that the only nonzero components of the relative descriptors are $q_{1x} q_{4x} q_{5z}$, $q_{1y} q_{4y} q_{5z} q_{6z}$ and $q_{1z} q_{4z} q_{6z}$, with their appropriate signs. We can therefore represent the relative descriptors in this case by

$$q_{1x} \longrightarrow (+ + - -)\, q_{1x} q_{5z}, \qquad q_{4z} \longrightarrow (+ - + -)\, q_{4z} q_{6z},$$

which gives us the full set of descriptors relative to the set of measurements on (5,6) {00, 01, 10, 11}: If we now look at the reduced form of these descriptors on $H_{1,4}$ only, we see that the simply reduced forms retain all the same average values. We may therefore write the descriptors as which are the descriptors corresponding to the four (pure) Bell states. The first thing to note when analysing the flow of dependencies is that finding the relative descriptors requires the components $q_{5z}$ and $q_{6z}$. Therefore, the 'swapping' of entanglement can only happen when these components are transmitted to qubits 1 and 4. That is, each qubit needs two bits of classical communication sent to it before the two qubits can become entangled.
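The sign structure of the brackets in the relative descriptors above follows from the standard expansion of the computational-basis projectors in terms of $\sigma_z$. As a sketch, with overall factors of $\frac{1}{4}$ written explicitly (they are absorbed into the normalisation in the expressions above):

```latex
\begin{align}
|00\rangle\langle 00| &= \tfrac14\bigl(\mathbb{1}\otimes\mathbb{1}
   + (\sigma_z\otimes\mathbb{1} + \mathbb{1}\otimes\sigma_z) + \sigma_z\otimes\sigma_z\bigr), \\
|11\rangle\langle 11| &= \tfrac14\bigl(\mathbb{1}\otimes\mathbb{1}
   - (\sigma_z\otimes\mathbb{1} + \mathbb{1}\otimes\sigma_z) + \sigma_z\otimes\sigma_z\bigr), \\
|01\rangle\langle 01| &= \tfrac14\bigl(\mathbb{1}\otimes\mathbb{1}
   + (\sigma_z\otimes\mathbb{1} - \mathbb{1}\otimes\sigma_z) - \sigma_z\otimes\sigma_z\bigr), \\
|10\rangle\langle 10| &= \tfrac14\bigl(\mathbb{1}\otimes\mathbb{1}
   - (\sigma_z\otimes\mathbb{1} - \mathbb{1}\otimes\sigma_z) - \sigma_z\otimes\sigma_z\bigr).
\end{align}
```

Replacing $\sigma_z \otimes \mathbb{1}$ and $\mathbb{1} \otimes \sigma_z$ by the evolved $z$-components of the two reference systems reproduces the four brackets, with the upper and lower signs assigned as stated.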
We therefore have the transmission of the dependencies of $q_{5z}$ and $q_{6z}$ to qubits 1 and 4, and it is only once this has happened that we see the entanglement being swapped. Qubits 5 and 6 have dependencies on all of qubits 1 to 4, and these are transmitted to qubits 1 and 4, possibly separately. The dependencies on (2,3) work out to be the correlation between states of (2,3) and (1,4), and those on (1,4) give rise to the entanglement between qubits 1 and 4.
Thus we see that there is nothing non-local about entanglement swapping. There is no instantaneous "transmission" of entanglement to the qubit pair (1,4) because of the measurement on the pair (2,3), and no superluminal signalling between qubits 1 and 4 to tell them that they are now entangled with each other. The entanglement dependencies (and those which give the correlation between the two pairs, a facet of entanglement swapping that is often overlooked) are transmitted to qubits 1 and 4 separately through qubits 5 and 6. As it is only their $\sigma_z$ components which are used, we can describe them as two bits of data being transmitted over a classical communication channel, giving the result of the measurement of the pair (2,3) in the Bell basis. This, then, is a completely local description of entanglement swapping.
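The two central observations of this section can be cross-checked in the Schrödinger picture. The sketch below is not the descriptor calculation: it assumes that both initial pairs are in the state $(|00\rangle + |11\rangle)/\sqrt{2}$, which may differ from the $|\Phi\rangle$ used above, and it models the Bell measurement by projection rather than by CNOTs onto qubits 5 and 6. The function names are hypothetical helpers.

```python
from itertools import product

s = 2 ** -0.5
# Two-qubit Bell states as {(bit, bit): amplitude}; all amplitudes are real,
# so no complex conjugation is needed below.
BELL = {
    "Phi+": {(0, 0): s, (1, 1): s},
    "Phi-": {(0, 0): s, (1, 1): -s},
    "Psi+": {(0, 1): s, (1, 0): s},
    "Psi-": {(0, 1): s, (1, 0): -s},
}

def initial_state():
    """Amplitudes of |Phi+>_12 |Phi+>_34, keyed by bits (q1, q2, q3, q4)."""
    phi = BELL["Phi+"]
    return {(a, b, c, d): phi.get((a, b), 0.0) * phi.get((c, d), 0.0)
            for a, b, c, d in product((0, 1), repeat=4)}

def project_23(state, bell):
    """Unnormalised state of qubits (1,4) given Bell outcome `bell` on (2,3)."""
    return {(a, d): sum(bell.get((b, c), 0.0) * state[(a, b, c, d)]
                        for b, c in product((0, 1), repeat=2))
            for a, d in product((0, 1), repeat=2)}

def reduced_14(state):
    """Density matrix of qubits (1,4) when qubits (2,3) are ignored."""
    keys = list(product((0, 1), repeat=2))
    return [[sum(state[(a, b, c, d)] * state[(a2, b, c, d2)]
                 for b, c in product((0, 1), repeat=2))
             for (a2, d2) in keys]
            for (a, d) in keys]

state = initial_state()
results = {}
for name, bell in BELL.items():
    chi = project_23(state, bell)
    prob = sum(x * x for x in chi.values())  # probability of this outcome
    # Overlap of the normalised (1,4) state with the same Bell state.
    overlap = sum(bell.get(k, 0.0) * x for k, x in chi.items()) / prob ** 0.5
    results[name] = (prob, overlap)

rho_14 = reduced_14(state)  # state of (1,4) without the measurement record
```

Without the two-bit measurement record the pair (1,4) is maximally mixed, and only relative to a particular outcome does it appear in a Bell state, which matches the role played by $q_{5z}$ and $q_{6z}$ in the relative descriptors.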

IX. CONCLUSIONS
In this paper we have developed the Deutsch-Hayden formalism to cover areas that it could not originally, specifically dealing with measurement and mixed states. We have also seen that, when considering the picture of information flow given to us by the descriptors, we can be confident that this is a unique picture, not dependent on the exact form chosen for those descriptors. Finally we have considered the entanglement swapping protocol, and seen that the usual description of it as creating entanglement non-locally is incorrect, and that in fact the entanglement is transmitted entirely locally. We have therefore seen how the Deutsch-Hayden approach is transparent to the notions of locality and unitarity in quantum mechanics, in a way that the standard formalism is not.