Matchgate quantum computing and non-local process analysis

In the circuit model, quantum computers rely on the availability of a universal quantum gate set. A particularly intriguing example is a set consisting only of two-qubit gates: matchgates, along with SWAP (the exchange of two qubits). In this paper, we show a simple decomposition of arbitrary matchgates into better-known elementary gates, and implement a matchgate in a linear-optics experiment using single photons. The gate performance was fully characterized via quantum process tomography. Moreover, we represent the resulting reconstructed quantum process in a novel way, as a fidelity map in the space of all possible nonlocal two-qubit unitaries. We propose the nonlocal distance, which is independent of local imperfections like uncorrelated noise or uncompensated local rotations, as a new diagnostic process measure for the nonlocal properties of the implemented gate.

In quantum computation, an essential requirement of the circuit model is a universal gate set, which enables the approximation of any given unitary process to arbitrary precision [1]. The advantages and disadvantages of various gate sets are still being actively explored; for example, one set may be more natural than others for interpreting a certain problem, or, in a given physical architecture, one set may require far fewer resources than another. The best-known gate-set class is that of any entangling 2-qubit gate in combination with arbitrary single-qubit unitaries [2]: most famously the 2-qubit cnot gate in conjunction with the 1-qubit Hadamard, h, and phase, t, gates. Circuits constructed solely of cnot and h gates can be simulated efficiently with a classical computer [1]. However, the addition of the t gate (itself also efficiently simulatable) enables universal quantum computing, which is generally believed not to be efficiently simulatable. Another important gate-set class uses 3-qubit entanglers, such as the Toffoli gate (which has only recently been demonstrated in linear optics [3] and ion traps [4]), along with h.
In this paper we demonstrate a new gate-set class based only on 2-qubit gates, specifically the matchgate [5], which can be entangling, and the swap gate, which is strictly non-entangling [6,7]. Matchgates, originally introduced in graph theory [5], are 2-qubit unitaries of the form

$$g_{ab} = \begin{pmatrix} a_{11} & 0 & 0 & a_{12} \\ 0 & b_{11} & b_{12} & 0 \\ 0 & b_{21} & b_{22} & 0 \\ a_{21} & 0 & 0 & a_{22} \end{pmatrix},$$

where $a$ and $b$, the 1-qubit unitaries

$$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad b = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix},$$

are members of $U(2)$ with $\det(a)=\det(b)$, and act on the even and the odd 2-qubit parity subspaces respectively [24]. Matchgates include a rich variety of entangling gates, among them maximally entangling gates, as well as many classes of local gates. If matchgates act only between nearest-neighbour qubits, then the resulting circuit can be efficiently simulated classically [5]. In this context matchgates relate to systems of non-interacting fermions [6]. Moreover, they are connected to 1D quantum Ising models: explicit matchgate circuits for simulations of such strongly-correlated quantum systems have been constructed in [8]. However, if matchgates are also allowed to operate between next-nearest neighbours, a seemingly trivial change achieved via swap gates, then the circuit can perform universal quantum computation [7]. Clearly, the resulting universal gate set is entirely different from the two mentioned above.
Here we show how to realise arbitrary matchgates using circuit elements already demonstrated in a range of physical architectures; we go on to demonstrate and measure matchgate operation in a linear optics photonic system, quantifying its performance with a new method for experimental analysis of two-qubit gates. Fig. 1a) shows the decomposition of an arbitrary matchgate, $g_{ab}$, into cnot, controlled-unitary and single-qubit gates. Recall that, depending on the parity of the input state into the matchgate, either $a$ or $b$ acts on the two qubits. The first step of our general decomposition is to encode the parity of the two-qubit state onto one of the qubits. The first cnot gate turns the bottom qubit into state $|0\rangle$ for even-parity inputs ($|00\rangle \to |00\rangle$ and $|11\rangle \to |10\rangle$) and into $|1\rangle$ for odd-parity inputs ($|01\rangle \to |01\rangle$ and $|10\rangle \to |11\rangle$). This qubit then acts as the control for the controlled unitary, cu [25], where $u=ba^{-1}$. If the bottom qubit is in state $|0\rangle$ (parity=0), $a$ will act on the top qubit; if it is in state $|1\rangle$ (parity=1), cu will undo $a$ before it performs $b$. The final cnot gate returns the qubits from the parity encoding to the original basis.
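The decomposition described above can be checked numerically. The following is a minimal sketch, assuming the basis ordering $|\text{top}\,\text{bottom}\rangle$ and NumPy's `kron` convention; the helper names `matchgate`, `random_unitary` and `controlled_u` are illustrative and not part of the original work.

```python
import numpy as np

def matchgate(a, b):
    """Build g_ab: a acts on span{|00>, |11>}, b on span{|01>, |10>}."""
    g = np.zeros((4, 4), dtype=complex)
    g[np.ix_([0, 3], [0, 3])] = a  # even-parity block
    g[np.ix_([1, 2], [1, 2])] = b  # odd-parity block
    return g

def random_unitary(rng):
    """Random 2x2 unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, _ = np.linalg.qr(z)
    return q

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)  # |0><0|
P1 = np.diag([0, 1]).astype(complex)  # |1><1|

# Assumed qubit ordering: np.kron(top, bottom), basis |top bottom>.
CNOT = np.kron(P0, I2) + np.kron(P1, X)  # control: top, target: bottom

def controlled_u(u):
    """Controlled-u with the bottom qubit as control and the top qubit as target."""
    return np.kron(I2, P0) + np.kron(u, P1)

rng = np.random.default_rng(0)
a = random_unitary(rng)
b = random_unitary(rng)
b = b * np.sqrt(np.linalg.det(a) / np.linalg.det(b))  # rephase so det(a) = det(b)

# Circuit read right to left: cnot, a on the top qubit, cu with u = b a^-1, cnot.
u = b @ a.conj().T
circuit = CNOT @ controlled_u(u) @ np.kron(a, I2) @ CNOT

decomposition_ok = np.allclose(circuit, matchgate(a, b))
print("circuit reproduces g_ab:", decomposition_ok)
```

The check uses randomly drawn unitaries, so it illustrates the identity rather than proving it; the parity argument in the text is what establishes it in general.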
Note that, alternatively, any 2-qubit operation could be implemented with three cnot gates and 8 single-qubit unitaries [9], or two "B"-gates (yet to be experimentally demonstrated) and 6 single-qubit unitaries [10]. Our decomposition, which offers a starting point for further simplifications, in contrast requires fewer gates and allows one to recast matchgate circuits (e.g. those in [8]) into circuits consisting of more familiar quantum gates: the cnot [11,12], the more general cu [3], and single-qubit rotations, all of which have been individually implemented in various architectures.

FIG. 1: a) Decomposition of an arbitrary matchgate $g_{ab}$ into cnot, controlled-unitary and single-qubit gates. b) Simplified decomposition of symmetric matchgates $g_{aa}$ as described in the text. c) The same gate after flipping the cnots with h gates and decomposing $hah$ into x- and z-rotations. d) Further simplification shows that $g_{aa}$, up to a global phase, can be implemented using a single cz$_\theta$ and local unitaries $v_1=hx_\alpha$ and $v_2=z_{-\theta/2}x_\beta h$, where $\alpha$, $\beta$, and $\theta$ are related to $a$ via $hah=x_\alpha z_{-\theta/2} x_\beta$. For $g_{hh}$, $\theta=\pi$, and cz$_\theta$ → cz.
We now show how to build the matchgates required for a universal gate set. The universality proof in [7] relies on showing how matchgates and swap can be applied to implement another universal set: h, t and cz. As outlined in the additional online material [26], each logical qubit has to be encoded in two (or four) physical qubits. The required matchgate set is then $g_{xx}$, $g_{tt}$ and $g_{hh}$. These gates are all symmetric, i.e. $a=b$, and the circuit of Fig. 1a) is greatly simplified because $u=aa^{-1}=I$: the controlled unitary turns into a "controlled identity". The resulting circuit diagram, shown in Fig. 1b), still requires two 2-qubit gates. It does however resemble the general construction of a cu gate [13], i.e. it can be replaced by a single cu and 4 single-qubit unitaries in the following way. We flip the cnots upside down by adding 4 Hadamards (the flipped cnot equals (h⊗h)×cnot×(h⊗h)) and rewrite the resulting central unitary $hah$, up to a global phase, as $x_\alpha z_{-\theta/2} x_\beta$, Fig. 1c). The rotations $x_\alpha$ and $x_\beta$ commute with their respective cnots, which allows us to express the two-qubit part of the operation as a single controlled-phase gate, cz$_\theta$, Fig. 1d). This simplified circuit for $g_{aa}$ can now be directly implemented in bosonic systems using the technique of shortcuts through higher-dimensional Hilbert spaces [3].
Turning back to the universal matchgate set, we find that $g_{xx}=x\otimes x$ and $g_{tt}=t\otimes I$; both operations are local and can be done trivially in a photonic architecture with waveplates or interferometers. Similarly, swap is a straightforward procedure in optics, either in free space or in integrated circuits. The one non-trivial gate to be demonstrated is

$$g_{hh} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 1 & 0 & 0 & -1 \end{pmatrix},$$

which is a nonlocal, maximally entangling gate. For this gate the decomposition yields $\theta=\pi$, and therefore cz$_\theta$ in Fig. 1d) is the well-known cz gate. Experimentally, we can therefore implement $g_{hh}$ using polarization-encoded single photons, partially polarizing beam splitters and coincidence detection [14,15,16]. The experimental setup is explained in Fig. 2.
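As a quick consistency check of the statements above, the sketch below builds the three matchgates of the universal set from the general matchgate form and tests that $g_{xx}$ and $g_{tt}$ are tensor products while $g_{hh}$ maps $|00\rangle$ to a maximally entangled state. The helper `matchgate` and the use of Schmidt coefficients as the entanglement test are illustrative choices, and $t$ is assumed to be $\mathrm{diag}(1, e^{i\pi/4})$.

```python
import numpy as np

def matchgate(a, b):
    """Build g_ab: a acts on span{|00>, |11>}, b on span{|01>, |10>}."""
    g = np.zeros((4, 4), dtype=complex)
    g[np.ix_([0, 3], [0, 3])] = a  # even-parity block
    g[np.ix_([1, 2], [1, 2])] = b  # odd-parity block
    return g

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])  # assumed convention for the t gate

g_xx = matchgate(X, X)
g_tt = matchgate(T, T)
g_hh = matchgate(H, H)

# Both of these are local (tensor-product) operations.
xx_is_local = np.allclose(g_xx, np.kron(X, X))
tt_is_local = np.allclose(g_tt, np.kron(T, I2))

# g_hh acting on |00>: the Schmidt coefficients of the output state are the
# singular values of its 2x2 coefficient matrix.
psi = g_hh @ np.array([1, 0, 0, 0], dtype=complex)
schmidt = np.linalg.svd(psi.reshape(2, 2), compute_uv=False)
maximally_entangled = np.allclose(schmidt, [1 / np.sqrt(2), 1 / np.sqrt(2)])

print(xx_is_local, tt_is_local, maximally_entangled)
```

Two equal Schmidt coefficients indicate a Bell-type state, which is the sense in which $g_{hh}$ is maximally entangling here.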
FIG. 2: Experimental scheme. Orthogonally polarized photon pairs are created in a nonlinear ppKTP crystal which is pumped by a 410 nm laser diode, using focussing parameters from [17]. The photons are split at a polarizing beamsplitter (PBS), collected, and guided to the circuit with single-mode optical fibers. Their input polarizations are set with a PBS, one quarter- and one half-wave plate (QWP, HWP) (Prep.). They are then superposed on a partially polarizing beamsplitter (PPBS) which transmits 2/3 of horizontally polarized, $|H\rangle$, and perfectly reflects vertically polarized, $|V\rangle$, light. Loss elements (PPBS′) correct the respective amplitudes, and the unknown phase shift at the central PPBS is compensated by a combination of a QWP, a HWP and another QWP in one output port of the PPBS. If compensated correctly, only the input term $|HH\rangle$ picks up a phase shift of π due to non-classical interference, which realizes a bit-flipped cz when measured in coincidence. The photons are then jointly analyzed with a QWP, a HWP and a PBS (Tomo.) before they are detected by two single-photon detectors. The rotations for the 4 single-qubit unitaries required for $g_{hh}$ in Fig. 1d) are incorporated into the preparation and measurement waveplate settings.
We characterized our gate using full quantum process tomography [18], preparing the 16 combinations of the states $\{|H\rangle, |V\rangle, |D\rangle=(|H\rangle+|V\rangle)/\sqrt{2}, |R\rangle=(|H\rangle+i|V\rangle)/\sqrt{2}\}$ at the two inputs and projecting onto an overcomplete set of 36 measurements at the outputs. The photons were then detected in coincidence. The resulting process matrix $\chi_{exp}$, reconstructed via maximum-likelihood estimation, is shown in Fig. 3. The process fidelity with $\chi_{ideal}$ is 92.3 ± 0.2%, where the error was calculated assuming Poissonian count statistics. We attribute the remaining errors to non-ideal waveplates, imperfections in the spatial and temporal mode overlap, and imperfect splitting ratios of the PPBSs. We did not apply any numerical local rotations to optimize this result, in contrast to, e.g., [3].
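Why 16 input states suffice can be illustrated with a short rank check: the 16 product states built from $\{|H\rangle, |V\rangle, |D\rangle, |R\rangle\}$ span the full 16-dimensional space of two-qubit operators. In the sketch below the 36 measurements are assumed to be all pairs of the six projectors $\{H, V, D, A, R, L\}$; the text only states that the set is overcomplete, so this particular choice is an assumption.

```python
import numpy as np
from itertools import product

H = np.array([1, 0], dtype=complex)
V = np.array([0, 1], dtype=complex)
D = (H + V) / np.sqrt(2)
A = (H - V) / np.sqrt(2)       # assumed: anti-diagonal polarization
R = (H + 1j * V) / np.sqrt(2)
L = (H - 1j * V) / np.sqrt(2)  # assumed: opposite circular polarization

def projector(psi):
    """Density matrix |psi><psi| of a pure single-qubit state."""
    return np.outer(psi, psi.conj())

def two_qubit_projectors(states):
    """All tensor products of single-qubit projectors from the given list."""
    return [np.kron(projector(p), projector(q))
            for p, q in product(states, repeat=2)]

def operator_rank(operators):
    """Dimension of the operator space spanned by a list of 4x4 matrices."""
    return int(np.linalg.matrix_rank(np.array([op.reshape(-1) for op in operators])))

input_states = two_qubit_projectors([H, V, D, R])        # 16 preparations
measurements = two_qubit_projectors([H, V, D, A, R, L])  # 36 projections (assumed set)

rank_inputs = operator_rank(input_states)
rank_measurements = operator_rank(measurements)
print(len(input_states), rank_inputs, len(measurements), rank_measurements)
```

A rank equal to 16 for both sets means the preparations and projections are informationally complete for two qubits; the 36 projectors are linearly dependent, which is the sense of "overcomplete".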
The question remains: how well have we actually implemented a matchgate? A single process fidelity, such as that calculated above, reveals the overlap between the experimental process and the corresponding ideal target process. However, it does not yield any information about the way in which the process is not ideal: is it just mixed due to random noise, or are we in fact implementing a different unitary process than the one intended? In particular, we are interested in the nonlocal properties of the quantum process, as they define its entangling power, and errors in them cannot be corrected with local operations.
Interestingly, as shown in [19], out of the 15 real parameters which define a unitary 2-qubit operator $U \in SU(4)$, only three actually describe the nonlocal part of the unitary, the remaining 12 relating to local transformations:

$$U = (u_1 \otimes v_1)\, \exp\!\Big[\frac{i}{2}\big(c_1\, \sigma_x \otimes \sigma_x + c_2\, \sigma_y \otimes \sigma_y + c_3\, \sigma_z \otimes \sigma_z\big)\Big]\, (u_2 \otimes v_2). \tag{4}$$

This decomposition allows a very intuitive geometrical representation of local equivalence classes. The three nonlocal parameters $c_1$, $c_2$ and $c_3$ can be used to construct a 3-dimensional space of nonlocal gates. Symmetries reduce this space to a tetrahedron, shown in Fig. 4a), the so-called Weyl chamber [20]. Each point in this chamber represents all locally equivalent gates with unique nonlocal properties [27]; the cnot gate, cz, and therefore also the $g_{hh}$ gate, are all locally equivalent and located at $[\pi/2, 0, 0]$. Using a method from [20], one can directly check whether two gates are locally equivalent, which confirms the results obtained from our matchgate decomposition in Fig. 1. In order to illustrate our experimental process in the Weyl chamber and to find the nonlocal unitary gate closest to it, we numerically translated 6201 evenly spaced ideal 2-qubit operators defined by (4) into their process representation and then calculated their maximal nonlocal process fidelity,

$$F_{nl}(c_1, c_2, c_3) = \max_{u_1, v_1, u_2, v_2} F\big(\chi_{exp},\, \chi(c_1 \ldots c_3, u_1 \ldots v_2)\big), \tag{5}$$

to $\chi_{exp}$ by numerical optimization over the local transformations $u_1$, $v_1$ and $u_2$, $v_2$. The result is a three-dimensional process fidelity map, shown in Fig. 4b) and c) for the ideal and experimental processes respectively. After optimization over the local unitaries, we find a maximum process fidelity for our experimental gate of 94.7 ± 0.3% (increased from 92.3%, Fig. 3) at $[c_1, c_2, c_3] \sim [\pi/2, 0, 0]$. At first it may seem that $F_{nl}$, Eq. (5), is not a very sharp measure, but in fact the volume of nonlocal gates with ≥90% fidelity to $\chi_{ideal}$ is just 11.6% [28]. For $\chi_{exp}$, this volume shrinks to 4.85%, due to decoherence.
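The local-equivalence statement above can be illustrated numerically. The sketch below uses the standard two-qubit local invariants computed in the magic basis (often called Makhlin invariants); whether this matches the exact procedure of [20] is an assumption, as are the helper names and the convention $\exp[\tfrac{i}{2}\sum_j c_j \sigma_j\otimes\sigma_j]$ for the canonical gate.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Magic-basis change of basis (one common convention, assumed here).
Q = np.array([[1, 0, 0, 1j],
              [0, 1j, 1, 0],
              [0, 1j, -1, 0],
              [1, 0, 0, -1j]], dtype=complex) / np.sqrt(2)

def local_invariants(u):
    """Return (G1, G2); gates sharing these values are locally equivalent."""
    ub = Q.conj().T @ u @ Q
    m = ub.T @ ub  # plain transpose, no complex conjugation
    det = np.linalg.det(u)
    tr = np.trace(m)
    return np.array([tr**2 / (16 * det),
                     (tr**2 - np.trace(m @ m)) / (4 * det)])

def canonical_gate(c1, c2, c3):
    """exp[(i/2) sum_j c_j s_j (x) s_j], as a product of commuting factors."""
    gate = np.eye(4, dtype=complex)
    for c, s in zip((c1, c2, c3), (X, Y, Z)):
        gate = gate @ (np.cos(c / 2) * np.eye(4) + 1j * np.sin(c / 2) * np.kron(s, s))
    return gate

cnot = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
cz = np.diag([1, 1, 1, -1]).astype(complex)
g_hh = np.array([[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, -1, 0], [1, 0, 0, -1]],
                dtype=complex) / np.sqrt(2)

reference = local_invariants(canonical_gate(np.pi / 2, 0, 0))
same_class = {name: bool(np.allclose(local_invariants(g), reference))
              for name, g in [("cnot", cnot), ("cz", cz), ("g_hh", g_hh)]}
identity_differs = not np.allclose(local_invariants(np.eye(4)), reference)
print(same_class, identity_differs)
```

Comparing invariants only answers whether two gates are locally equivalent; it does not by itself return the local unitaries $u_1$, $v_1$, $u_2$, $v_2$ that connect them.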
We can now define a new process distance measure, the nonlocal distance, $\Delta_{nl} \equiv \sqrt{\Delta c_1^2 + \Delta c_2^2 + \Delta c_3^2}$, where $\Delta_{nl} \in [0, \pi]$ and $\Delta c_i$ is the difference between the coordinates of the target unitary gate and the coordinates of the maximum overall fidelity $F_{nl}(c_1, c_2, c_3)_{max}$ obtained from the optimization, Eq. (5) [29]. According to [21], $\Delta_{nl}$ meets all distance measure criteria for pure processes and can be seen as a diagnostic measure. Our experimental gate is located at a distance $\Delta_{nl} \sim 0$ from the ideal cz. The uncertainty in this value is dominated by the numerical optimization. Upon closer inspection, this does not come as a surprise; any uncorrelated noise process such as depolarization or dephasing would show up as mixture, i.e. a uniform decrease of $F_{nl}$ over the whole nonlocal space. Imperfections in optical components such as the PPBS central to our gate similarly lead to mixing because the underlying 2-qubit operations for reflectivity values around $\eta_{ideal}=1/3$ are non-unitary [22]. A similar argument holds for other imperfections, such as temporal or spatial mode mismatch [30]. In other words, in contrast to the cu gates demonstrated in [3], our optical setup does not have any nonlocal unitary degrees of freedom.
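A minimal sketch of the nonlocal distance, assuming the Euclidean form $\Delta_{nl} = \sqrt{\Delta c_1^2 + \Delta c_2^2 + \Delta c_3^2}$ in Weyl-chamber coordinates. The function name and the second example point are hypothetical and are not measured values.

```python
import numpy as np

def nonlocal_distance(c_target, c_best):
    """Euclidean distance between two points [c1, c2, c3] in the Weyl chamber."""
    delta = np.asarray(c_target, dtype=float) - np.asarray(c_best, dtype=float)
    return float(np.sqrt(np.sum(delta**2)))

target = [np.pi / 2, 0.0, 0.0]  # cnot / cz / g_hh equivalence class

# Best-fit point coinciding with the target, as reported for the experiment.
d_experiment = nonlocal_distance(target, [np.pi / 2, 0.0, 0.0])

# Hypothetical best-fit point, shifted by a small nonlocal unitary error.
d_example = nonlocal_distance(target, [np.pi / 2 - 0.1, 0.05, 0.0])

# Largest possible value: opposite ends of the [c1, 0, 0] edge.
d_max = nonlocal_distance([0.0, 0.0, 0.0], [np.pi, 0.0, 0.0])

print(d_experiment, d_example, d_max)
```

Because the coordinates are taken after optimizing over local unitaries, purely local errors and uniform mixing leave this quantity unchanged, which is what makes it useful as a diagnostic.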
We conclude that the nonlocal properties of our gate could not be improved by any unitary corrections. We therefore attribute the remaining reduction in fidelity to mixing, which is supported by the measured, non-ideal process purity of 89.8 ± 0.4%. This method, the projection of a quantum process into the nonlocal, unitary 2-qubit operator subspace, is likely to have a number of interesting applications in quantum information processing, e.g. process tomography, a topic which is rapidly increasing in importance as quantum computer architectures evolve. Suggested lines for further research are, for example, how different noise processes influence nonlocal properties of a quantum gate and nonlocal process discrimination [23].

FIG. 4: a) [...] [20], i.e. gates that can turn separable states into Bell states. The (thick blue) line $[\gamma, 0, 0]$ contains all $g_{aa}$ gates, which are equivalent to cu [3]. $g_{hh}$ is located at its midpoint $[\pi/2, 0, 0]$, and is therefore the only symmetric matchgate which is maximally entangling. b-c) Point-by-point locally optimized process fidelity map, Eq. (5), of all ideal nonlocal gates with b) the ideal target gate and c) the experimentally reconstructed $g_{hh}$ process matrix. The maximum fidelity is 94.7 ± 0.3% at $\sim[\pi/2, 0, 0]$.
In conclusion, we have implemented a matchgate circuit which allows universal computation when combined with the simple two-qubit swap. We characterized our gate using quantum process tomography and illustrated the fidelity overlap of the experimental process with all possible nonlocal gates in the Weyl chamber.
We wish to thank B. P. Lanyon and N. K. Langford for valuable input. We acknowledge financial support from the Australian Research Council Discovery and Federation Fellow programs and an IARPA-funded U.S. Army Research Office contract. SR acknowledges financial support by the FWF project CoQuS No. W1210-N16. AMS acknowledges support by the Natural Sciences and Engineering Research Council of Canada, Quantum Works and the Canadian Institute for Advanced Research.