Architectural design for a topological cluster state quantum computer

The development of a large scale quantum computer is a highly sought after goal of fundamental research and consequently a highly non-trivial problem. Scalability in quantum information processing is not just a problem of qubit manufacturing and control but it crucially depends on the ability to adapt advanced techniques in quantum information theory, such as error correction, to the experimental restrictions of assembling qubit arrays into the millions. In this paper we introduce a feasible architectural design for large scale quantum computation in optical systems. We combine the recent developments in topological cluster state computation with the photonic module, a simple chip based device which can be used as a fundamental building block for a large scale computer. The integration of the topological cluster model with this comparatively simple operational element addresses many significant issues in scalable computing and leads to a promising modular architecture with complete integration of active error correction exhibiting high fault-tolerant thresholds.


I. INTRODUCTION
The scientific effort to construct a large scale device capable of Quantum Information Processing (QIP) has advanced significantly over the past decade [1,2,3,4,5,6,7,8,9,10]. However, the design of a viable processing architecture capable of housing and manipulating the millions of physical qubits necessary for scalable QIP is hampered by an architectural design gap that exists between advanced techniques in error correction and fault-tolerant computing and the comparatively small scale devices under consideration by experimentalists. While some progress in scalable system design has been made [11,12,13,14,15,16], the development of a truly large scale quantum architecture, able to implement programmable QIP with extensive error correction, hinges on the ability to adapt well established computational and error correction models to the experimental operating conditions and fabrication restrictions of physical systems, well beyond 1000 physical qubits.
The recent introduction by Raussendorf, Harrington and Goyal of Topological Cluster state Quantum Computation (TCQC) [17,18,19] marks an important milestone in the advance of theoretical techniques for quantum information processing (QIP). This model of QIP differs significantly from both the circuit model of computation and the traditional cluster state model [20,21] in that the entanglement resource defines a topological "backdrop" where information qubits can be defined in very non-local ways. By disconnecting the physical qubits (used to construct the lattice) from the information qubits (non-local correlations within the lattice), information can be topologically protected, which is inherently robust against local perturbations caused by errors on the physical system. This exploitation of topological protection utilizing qubits, rather than exotic anyonic systems, which are particularly difficult to experimentally create and manipulate [22], leads to a computation model that exhibits high fault-tolerant thresholds where problematic error channels such as qubit loss are naturally corrected.
In terms of architecturally implementing topological coding models, two general techniques can be used, depending on the physical system under consideration: surface codes [23,24,25], appropriate for matter based systems, and the 3D TCQC model [17,18], which is far more useful in optics based architectures due to the high mobility and comparatively low cost of photonic qubits. This paper introduces a computational architecture for the 3D topological model, utilizing photonic qubits. By making use of the TCQC model, error correction protocols are automatically incorporated as a property of the computational scheme and photon loss (arguably the dominant error channel) becomes correctable without the incorporation of additional coding schemes or protocols [16,26,27,28].
As error correction within the TCQC model is predicated on the presence of a large cluster lattice before logical qubits are defined, standard probabilistic techniques for preparing entangled photons are generally not appropriate without employing extensive photon routing and/or storage. Therefore, a deterministic photon/photon entangling gate is desired to drastically simplify any large scale implementation of this model. Such operations could for instance include the C-Phase and parity gates [29,30,31], and may be realized via direct photon-photon interactions, cavity QED techniques [32,33] or an indirect bus-mediated quantum nondemolition-like interaction [31,34]. As an architectural building block, we focus on the latter with the recently introduced idea of the photonic module and chip [34,35]. This small scale, chip based quantum device is an illustrative example of a technology with all the essential components to act as an architectural building block for the TCQC model. This device has the flexibility to entangle an arbitrary number of photons with no dynamic change in its operation, and the manner in which photons flow in and out of the unit makes it an ideal component to realize a modular structure for large scale TCQC. In recent years, experimental work in both cavity/photon coupling [38,39,40] and chip based single photon technologies [41,42,43] has advanced significantly. While the high fidelity construction of the photonic chip is still a daunting task, the continued effort in these areas allows for optimism that such a device is an experimental possibility.
The following will detail the lattice preparation network, utilizing the photonic chip illustrated in Fig. 1 as the basic building block of the computer. We detail the physical layout of the network, the optical switching sequence required and several techniques that are used to optimize the preparation of the cluster. We complete the discussion with a resource analysis, examining the number of fabricated devices required to perform a large scale quantum algorithm.
As this paper is focused on the possible construction of a large scale architecture, we utilize the photonic chip as a building block for this architectural design and we direct readers to Refs. [34,35,36,37] which examine the micro-analytics of the photonic chip in more depth. Additionally, we will also assume (for the sake of conceptual simplicity) that appropriate, high fidelity, on demand single photon sources and detectors are also feasible on the same developmental time frame as a photonic chip. However, as the photonic module is essentially a non-demolition photon detector, sources and detectors can be constructed directly with the chip [49] allowing us to relax these assumptions, if required.
FIG. 1: Schematic design for the photonic chip. The chip is a 3-in 3-out integrated device containing one photonic module [34], classical single photon routing and two optical waveplates allowing for the optional application of single photon Hadamard gates to specific photons. Not shown are Hadamard plates potentially required on the input or output port of the chip.

II. THE FLOWING COMPUTER DESIGN
The general structure of the optical TCQC is illustrated in Fig. 2. The high mobility of single photons and the nature of the TCQC model allow for an essentially "flowing" model of computation. Initially, unentangled photons enter the preparation network from the rear, flow through a static network consisting of layers of photonic chips arranged in 4 stages and exit the preparation network where they can be immediately consumed for computation. Fig. 3 shows a smaller region of the computer where only one (of four) stages of the preparation network is illustrated for clarity, while Fig. 4 illustrates a single unit cell of the cluster state which repeats and extends in all three dimensions.
The cluster is not a Body Centered Cubic (BCC) lattice. The green links, representing entanglement bonds specified by the stabilizer generators of Eq. 1, ensure that each photon is connected to four neighbors, not six. In the flowing network, photons along the z-axis (for a given (x, y) co-ordinate) are carried by common optical waveguides. As the lattice is not fully connected, there are two groups of waveguides: one set runs at a fixed repetition rate along the z-axis (the red photons shown above) and the other operates at half that rate (black photons). We denote these waveguides as full and half rate lines respectively.
In the TCQC model, computation proceeds by measuring, sequentially, cross sectional layers (x−y plane) of the lattice, which acts to simulate time through the computation. Qubit information is defined within the lattice via the creation of holes (known as defects). As the cluster is consumed along the z-axis, defects are deformed via measurement such that they can be braided around each other, enacting CNOT gates between qubits [18,19].
Each unit cell of the cluster does not have a Body Centered Cubic (BCC) structure: each photon is connected to only four neighbors, not six. Therefore, in a flowing network design, where each photon pulse along the z-axis is carried by common optical waveguides, there are two sets of photon pulse repetition rates. One set (waveguides carrying red photons in Fig. 4) runs at a fixed repetition rate; we denote these as full rate lines. The other set (waveguides carrying black photons in Fig. 4) runs at half that rate; we denote these as half rate lines. If the lattice is examined along the z-axis, these full and half rate lines form a checkerboard pattern.
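The checkerboard of full and half rate lines can be illustrated with a short sketch. The coordinate convention used here is an assumption and is not taken from the text: photons are placed at integer sites (x, y, z) that have exactly one or two odd coordinates, which is one common way to index this lattice. The function names are hypothetical.

```python
def photon_sites_on_line(x, y, depth):
    """Return the z positions occupied by photons in the waveguide at (x, y).

    Assumed convention (not from the paper): a site holds a photon when
    exactly one or two of its three coordinates are odd.
    """
    return [z for z in range(depth)
            if sum(c % 2 for c in (x, y, z)) in (1, 2)]


def line_rate(x, y):
    """Classify a waveguide as 'full' (x + y odd) or 'half' (x + y even)."""
    return "full" if (x + y) % 2 == 1 else "half"


if __name__ == "__main__":
    # Print the pattern seen when looking along the z-axis (F = full, H = half).
    for y in range(4):
        print(" ".join(line_rate(x, y)[0].upper() for x in range(4)))
```

Under this assumed indexing, a full rate line carries a photon at every z position while a half rate line carries one at every second position, and the two kinds of line alternate in both x and y.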
The global cluster is uniquely defined utilizing the stabilizer formalism [44]. The lattice is the unique state defined as the simultaneous +1 eigenstate of the operators
$$K_{x,y,z} = X_{x,y,z} \bigotimes_{b \in n(x,y,z)} Z_b, \qquad (1)$$
where X and Z are the 2 × 2 Pauli operators, (x, y, z) are the co-ordinates of each of the N photons in the 3D lattice, n(x, y, z) is the set of 4 qubits linked to the node (x, y, z) and the N − 5 identity operators are implied [Fig. 4]. For photons that do not have four nearest neighbors (i.e. at edges of the lattice), the associated stabilizer retains this form but excludes the operator(s) associated with the missing neighbor(s).
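As a minimal sketch of these stabilizers, the code below builds each operator as a map from lattice site to Pauli label and checks that every pair commutes. The site indexing is an assumption made for illustration (integer sites with one or two odd coordinates, bonded to sites one step away), and all function names are hypothetical.

```python
from itertools import combinations, product


def lattice_sites(size):
    """All photon sites with coordinates in 0..size (assumed convention)."""
    return [s for s in product(range(size + 1), repeat=3)
            if sum(c % 2 for c in s) in (1, 2)]


def neighbours(site, size):
    """Sites bonded to `site`; at most four, fewer at the lattice boundary."""
    odd_count = sum(c % 2 for c in site)
    bonded = []
    for axis in range(3):
        is_odd = site[axis] % 2 == 1
        # Two-odd sites bond along their odd axes,
        # one-odd sites bond along their even axes.
        if is_odd != (odd_count == 2):
            continue
        for step in (-1, 1):
            other = list(site)
            other[axis] += step
            if 0 <= other[axis] <= size:
                bonded.append(tuple(other))
    return bonded


def stabilizer(site, size):
    """X on the site, Z on each bonded neighbour, identity elsewhere."""
    pauli = {nb: "Z" for nb in neighbours(site, size)}
    pauli[site] = "X"
    return pauli


def commute(a, b):
    """Two Pauli products commute when they clash on an even number of sites."""
    clashes = sum(1 for q in a if q in b and a[q] != b[q])
    return clashes % 2 == 0


if __name__ == "__main__":
    size = 4  # two unit cells per side under the assumed indexing
    ops = [stabilizer(s, size) for s in lattice_sites(size)]
    print(all(commute(a, b) for a, b in combinations(ops, 2)))
```

Boundary sites simply have fewer Z factors, matching the rule that a stabilizer drops the operators of missing neighbors.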
The optical network to prepare such a state requires the deterministic projection of a group of unentangled photons into the entangled state defined via the stabilizers. Fig. 1 illustrates the basic structure of the photonic chip, introduced in Refs. [34,35], which we utilize to perform these deterministic projections. The photonic module, which lies at the heart of each chip, is designed to project an arbitrary N photon state into a ±1 eigenstate of the operator $X^{\otimes N}$ [34], where N is the number of photons sent through the module between initialization and measurement of the atomic system.
In order to perform a parity check of the operator ZZXZZ, single photon Hadamard operations are applied to every photon before and after passage through a photonic chip. For every group of five photons sent through between initialization and measurement, the central photon in the stabilizer operator will be routed through a second set of Hadamard gates before and after passing through the module. Note, as each chip is linked in series, the Hadamard rotations on each input/output are not required for all stages of the preparation network as they cancel.
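The effect of the Hadamard gates can be sketched with simple Pauli bookkeeping: conjugating by a Hadamard swaps X and Z, and two Hadamards in a row cancel. The helper below is a hypothetical illustration that tracks operator labels only and ignores signs.

```python
# Conjugation by a Hadamard gate swaps X and Z and leaves the identity alone.
HADAMARD_MAP = {"X": "Z", "Z": "X", "I": "I"}


def measured_operator(module_operator, hadamard_counts):
    """Operator effectively measured on the photons.

    module_operator: Pauli string measured by the module itself, e.g. "XXXXX".
    hadamard_counts: for each photon, the number of Hadamard gates it passes
    before the module (the same number is assumed to follow the module).
    """
    result = []
    for pauli, count in zip(module_operator, hadamard_counts):
        if count % 2 == 1:  # an odd number of Hadamards swaps X and Z
            pauli = HADAMARD_MAP[pauli]
        result.append(pauli)
    return "".join(result)


if __name__ == "__main__":
    # Four outer photons see one Hadamard; the central photon sees two.
    print(measured_operator("XXXXX", [1, 1, 2, 1, 1]))
```

With one Hadamard on the four outer photons and two on the central photon, the module's five photon X parity check acts as a ZZXZZ measurement.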

Photon stream initialization
In total there are four stages required to prepare the 3D lattice. The four stages are partitioned into two groups which have identical layout and switching patterns. These two groups stabilize the lattice with respect to the stabilizer operators along the y − z planes of the cluster and the x − z planes respectively. If an arbitrary input state is utilized to prepare the cluster, two additional stages are required to stabilize the cluster with respect to each operator associated with the x − y plane of each unit cell. However, we can eliminate the need to perform these parity checks by carefully choosing the initial product state that is fed into the preparation network.
Each stabilizer operator, ZZXZZ, in the x − y plane has the X operator centered on the photons in half rate lines (i.e. the black photons in Fig. 3). Each photon in these lines is prepared in the +1 eigenstate of X, $|+\rangle = (|H\rangle + |V\rangle)/\sqrt{2}$, while photons in full rate lines are initialized in the +1 eigenstate of Z, $|H\rangle$ (we assume a polarization basis for computation). This initialization ensures that the stabilizer set for the photon stream (before entering the preparation network) is described by the stabilizers
$$X_i, \; i \in h, \qquad Z_j, \; j \in f, \qquad (2)$$
where h and f are the sets containing photons in the half and full rate optical waveguides respectively. As any product of stabilizers is also a stabilizer of the system, all of the stabilizers in the x − y planes of the cluster are automatically satisfied. It can be easily checked that the only element in the group generated by Eq. 2 that commutes with the stabilizer projections associated with the y − z and x − z planes of the cluster is the ZZXZZ term associated with each stabilizer along the x − y plane.
Fig. 2). We illustrate only two stages of the network since the preparation network in the x − z plane is identical to the y − z plane. Each stage requires a staggered arrangement of photonic chips and in Fig. 5, after a specific stage, we have detailed the stabilizer that has been measured, where the central photon corresponds to the X term in the measured operator. The temporal staggering of the photonic state, for each optical line, is also detailed, where the spacing interval, T, is bounded below by the minimum interaction time required for the operation of the photonic module and bounded above by the coherence time of the atomic systems in each chip [34,35].
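The claim that the chosen input state already satisfies the x − y plane stabilizers can be checked with a small sketch. It treats one group of five photons as a product state, with |H⟩ and |V⟩ represented by the vectors (1, 0) and (0, 1); this representation and the helper names are assumptions made for illustration.

```python
import math

H_STATE = (1.0, 0.0)                                   # |H>, +1 eigenstate of Z
PLUS_STATE = (1 / math.sqrt(2), 1 / math.sqrt(2))      # |+>, +1 eigenstate of X
PAULI = {
    "I": ((1, 0), (0, 1)),
    "X": ((0, 1), (1, 0)),
    "Z": ((1, 0), (0, -1)),
}


def single_eigenvalue(pauli, state, tol=1e-12):
    """Return +1 or -1 if `state` is an eigenvector of `pauli`, else None."""
    m = PAULI[pauli]
    out = (m[0][0] * state[0] + m[0][1] * state[1],
           m[1][0] * state[0] + m[1][1] * state[1])
    for sign in (1, -1):
        if all(abs(o - sign * s) < tol for o, s in zip(out, state)):
            return sign
    return None


def product_eigenvalue(operator, states):
    """Eigenvalue of a Pauli string on a product state, or None if it is
    not an eigenstate."""
    value = 1
    for pauli, state in zip(operator, states):
        ev = single_eigenvalue(pauli, state)
        if ev is None:
            return None
        value *= ev
    return value


if __name__ == "__main__":
    # Central photon on a half rate line, four neighbours on full rate lines.
    photons = [H_STATE, H_STATE, PLUS_STATE, H_STATE, H_STATE]
    print(product_eigenvalue("ZZXZZ", photons))
```

The product state is a +1 eigenstate of ZZXZZ, so no parity check is needed for that operator, whereas an operator with X on a full rate photon is not stabilized by the input.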

A. Temporal Asynchrony
Before detailing the switching sequence for these stages we have to address a slight complication that arises when creating the 3D lattice. In general, each photon is involved in five separate parity checks. By utilizing a specific input state we can remove the need to measure the stabilizers associated with the x − y planes of each unit cell. This implies that we are measuring four operators for each photon present in half rate lines (we no longer perform a parity check associated with the x − y plane stabilizers for these photons) and three operators for each photon in full rate lines (each of these photons is involved in two parity checks associated with x − y plane stabilizers). As each photon suffers a temporal delay of T by passing through a photonic module, there is a temporal discrepancy between the photons in the half rate and full rate lines which, if not compensated, leads to two photons being temporally synchronous in later parity checks. The flexibility of the photonic chips allows us to solve the delay problem in a convenient way.
FIG. 5: Chip orientation for stages one and two of the preparation network, utilized to measure the stabilizer operators along the y − z plane of the lattice. Photons flow from left to right through each chip. The temporal staggering of each photon is required such that only one photon passes through each photonic module at any given time step. The fundamental temporal staggering is the atom/photon interaction time T, however each row has to be temporally offset to ensure proper temporal ordering for future stages. Each stage measures a specific stabilizer, illustrated with a cross after the specific stage. The operator that is measured is ZZXZZ, where the X operator is associated with the photon at the center of the cross (insert). Extending this network to more and more cells requires extending each column vertically. The network for stages three and four, required to stabilize the system with respect to the x − z plane, is identical, with the switching sequence offset by 2T.
If a delay were not required, a given module would have a window of 3T where it is idle between successive parity checks. Normally this window would be utilized for measurement and re-initialization of the atomic system. For a given stage of the preparation network, every fourth photon in the full rate lines is not involved in any parity check (see Fig. 5) and must therefore be delayed. Hence we partition this 3T window into three steps. Immediately after the last photon from the previous parity check exits the chip, the atomic system is measured (now utilizing a temporal window of T rather than 3T ). As the delayed photon is the next one to enter the chip, we simply do not initialise the module in the required atomic superposition state.
As detailed in Ref. [35], the atom/photon interaction required for the module requires initializing a 3-level atomic system into a superposition of its two ground states, $(|1\rangle + |2\rangle)/\sqrt{2}$, with a resonant RF field. The presence of a photon in the cavity mode (which is detuned from the $|1\rangle \leftrightarrow |3\rangle$ transition) induces a phase shift on the state $|1\rangle$ which oscillates the system between the states $(\pm|1\rangle + |2\rangle)/\sqrt{2}$. If we instead keep the atomic system in the $|1\rangle$ (or $|2\rangle$) state, the presence of a photon will have no effect on the atomic system. Delaying the required photon by T therefore requires operating the module as usual, but for this first step we do not initialize the atomic system. This stage again takes time T and we refer to it as the "holding" stage. In the next temporal window, T, the atomic system is re-initialized in the appropriate superposition state and the next 5 photon parity check proceeds as normal. This delay trick maintains the temporal staggering of the photons without the need for additional technology and ensures that every photonic chip is active for all time steps (no idle time for any module).
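A minimal sketch of why the holding stage leaves the atom untouched: the photon is modeled, as an idealized assumption, as a sign flip on the amplitude of |1⟩, with the atomic state written as a pair of real amplitudes (a1, a2) for |1⟩ and |2⟩. The function names are hypothetical.

```python
import math

SUPERPOSITION = (1 / math.sqrt(2), 1 / math.sqrt(2))   # (|1> + |2>)/sqrt(2)
GROUND_1 = (1.0, 0.0)                                   # |1>, uninitialized atom


def photon_pass(atom):
    """Idealized interaction: a photon flips the sign of the |1> amplitude."""
    a1, a2 = atom
    return (-a1, a2)


def overlap(a, b):
    """Magnitude of the overlap of two states with real amplitudes."""
    return abs(a[0] * b[0] + a[1] * b[1])


if __name__ == "__main__":
    # Initialized atom: one photon maps the state to an orthogonal one.
    print(overlap(photon_pass(SUPERPOSITION), SUPERPOSITION))
    # Holding stage: the atom only acquires a global phase.
    print(overlap(photon_pass(GROUND_1), GROUND_1))
```

An overlap of zero means the atom records the photon's passage, while an overlap of one means the state is physically unchanged, so the photon is delayed by T without taking part in a parity check.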
B. Switching sequence
Illustrated in Fig. 5 and Tab. I is the network and switching sequence for the stabilizers associated with the y − z planes of the unit cell. The stabilizers associated with the x − z planes are measured using precisely the same network and switching sequence (when viewing the lattice along the y-axis). The difference in the switching sequence is a global offset of 2T to account for the delay from stages one and two. By examining Fig. 5 there is the potential for photon collisions for chip layers two and four. However, whenever two photons enter a chip simultaneously, one interacts with the module while the other is required to bypass the unit. Hence the notation used in Tab. I, {.} U,B , refers to switching the central port {.} to the module while switching the photon in the (U)pper or (B)ottom port to a bypass line.

IV. RESOURCE REQUIREMENTS
While we have only illustrated the network for a 5 × 5 continuously generated lattice, the patterning of photonic chips and the switching sequence for each unit extends in two dimensions, allowing for the continuous generation of an arbitrarily large 3D lattice in a modular way [Fig. 2]. The total number of photonic chips required for the preparation of a large cross section of the lattice is easily calculated. In general, for an N × N cross section of cells, $4N^2 + 4N$ photonic chips are required to continually generate the lattice.
We are able to choose the optimal clock cycle of the computer, T , as the fundamental atom/photon interaction time within each photonic module. As shown in Tab. I, the switching sequence for the preparation network allows for a temporal window of T for the measurement of the atomic system, within each device. As estimated in [34,36], depending on the system used, this rate can be approximately 10ns to 1µs. If we choose T to be the optimal operating rate of the photonic module, then there is the potential that atomic measurement in the preparation network is too slow. This can be overcome by the availability of more photonic chips. If more chips are available then we construct multiple copies of the preparation network with optical switches placed between each preparation stage. While the atomic systems are being measured from one round of parity checks the next set of incoming photons are switched to a different group of chips.
For a given ratio of the atomic measurement time to the atom/photon interaction time, $T_{\rm atom}/T_{\rm module} \geq 1$, photons are routed to multiple copies of the preparation network. Therefore the number of photonic modules/chips will increase by a factor of $\Gamma = T_{\rm atom}/T_{\rm module}$ to compensate for slow measurement. Resource estimates are consequently related to the total 2D cross section of the lattice and Γ. Given that the number of photonic chips required in the preparation network, for a continuously generated N × N cross section of unit cells, is $4N^2 + 4N$, the number of fabricated photonic chips required when atomic measurement is slow will be approximately $\Gamma(4N^2 + 4N)$.
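The scaling above is simple enough to tabulate directly. The sketch below evaluates Γ(4N² + 4N); the function name is hypothetical, and in practice Γ would presumably be rounded up to a whole number of network copies, which is not done here.

```python
def chips_required(n_cells, gamma=1.0):
    """Approximate number of photonic chips for an N x N cross section.

    n_cells: N, the number of unit cells along each side of the cross section.
    gamma:   ratio of atomic measurement time to atom/photon interaction
             time (at least 1).
    """
    if gamma < 1:
        raise ValueError("gamma is defined for ratios of at least 1")
    return gamma * (4 * n_cells ** 2 + 4 * n_cells)


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, chips_required(n), chips_required(n, gamma=2.5))
```

The chip count grows with the area of the cross section and linearly with Γ, and is independent of how long the computation runs.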
To put these resource costs in context we can make a quick estimate of the resources required to build a quantum computer capable of solving interesting problems. Let us choose a logical error rate per time step of the lattice of $10^{-16}$ to be our target error rate, where a logical time step is the creation and measurement of a single layer of the cluster. This gives an error rate per logical non-Clifford $R_z(\pi/8)$ rotation of approximately $O(10^{-11})$ [Appendix B]. This error rate would be sufficiently low to enable the factoring of integers several thousand binary digits long using Shor's algorithm [45,46,47]. Let us assume that increasing the separation between defects and the circumference of defects by two cells reduces the logical error rate per time step by a factor of 100 [Appendix A]. Given that the current threshold error rate of the 3D topological cluster state scheme is $6.7 \times 10^{-3}$ [48], albeit in the absence of loss, this assumption is equivalent to assuming qubits are affected by uncorrelated Z errors with a probability between $10^{-4}$ and $10^{-5}$ per time step. Correlated errors, which can be produced by the photonic chip when preparing the cluster, affect qubits on two separate lattice structures defined by the cluster (the primal and dual lattices), which are corrected independently and can be treated as such [53]. Complete failure of a photonic chip is heralded, as the eigenvalue conditions for cells prepared by a faulty chip are not satisfied. Therefore the measurement of a small region of the x − y plane of the cluster (along the z-axis) will identify errors in cluster cells at a much higher rate than the rest of the computer, indicating chip failure.
FIG. 6: Large scale lattice structure required for long term computation via topological cluster states. Here we illustrate a basic concept of the lattice structure that is required for lengthy operations of the topological optical computer (when examining the lattice along the z-axis). Based on the estimates in the main text, each logical qubit (defined via a pair of defects in the lattice) requires a 40 × 20 array of cells. The layout for each logical qubit is shown on the left, with the larger lattice, comprising thousands (if not millions) of logical qubits, shown on the right. Each logical qubit requires just over 3000 fabricated photonic chips. Assuming single qubit error rates in the $10^{-4}$ to $10^{-5}$ range, this lattice would be sufficient to permit approximately $10^{11}$ logical non-Clifford $R_z(\pi/8)$ operations.
Given the above assumptions, the desired logical error rate per time step could be achieved with defects measuring 4 × 4 cells in cross-section and separated by 16 cells. Fig. 6 shows a section of a semi-infinite lattice of sufficiently well error corrected logical qubits. Each logical qubit occupies a cross-sectional region of 40 × 20 cells. To prepare such a lattice (for an arbitrary number of time steps and setting Γ = 1) we require the fabrication of 3320 integrated photonic chips.
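The figure of 3320 chips can be reproduced with a short sketch. The text gives the chip count only for square cross sections; the rectangular formula used here, 4wh + 2(w + h), is an assumption chosen because it reduces to 4N² + 4N for a square and yields 3320 for a 40 × 20 region. The function name is hypothetical.

```python
def chips_for_rectangle(width, height):
    """Chip count for a width x height cross section of unit cells.

    Assumed generalization of the square-lattice count 4N^2 + 4N:
    4 * width * height + 2 * (width + height).
    """
    return 4 * width * height + 2 * (width + height)


if __name__ == "__main__":
    # One logical qubit occupies a 40 x 20 region of cells (Gamma = 1).
    print(chips_for_rectangle(40, 20))
```

Because the count depends only on the cross section, the number of chips per logical qubit is fixed regardless of the number of time steps in the computation.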
This preliminary estimate illustrates that the resource requirements of this architecture are promising. The high fidelity construction of approximately $3 \times 10^3$ photonic modules, per logical qubit, to prepare a lattice that has sufficient topological protection to perform on the order of $10^{11}$ non-Clifford logical operations is arguably a less significant challenge than the high fidelity manufacturing of other proposed quantum computer architectures. Other proposed systems not only require comparable (if not more) physical qubits to achieve the same error protection, but depending on the system they will also require interconnected quantum bus systems for qubit transport, very non-trivial classical control structures and most likely the fabrication of the entire computer at once, with little flexibility to expand the size of the computer as more resources become available.
In Appendix B it is estimated that each $R_z(\pi/8)$ rotation requires approximately 250 logical qubits (required for ancilla state distillation in order to enact non-Clifford gates). It should be noted that the logical qubit requirements for non-Clifford gates are identical for all computational models employing state injection, state distillation and teleportation as a method for fault-tolerant universal computation.

V. CONCLUSIONS
We have presented a detailed architecture for topological cluster state computation in optical systems. While this architectural design is not appropriate for all systems, it does contain several key elements, easing the conceptual design of a large scale computer. For example:
1. The utilization of a computational model fundamentally constructed from error correction, rather than implementing error correction codes on top of an otherwise independent computational model.
2. The modular construction of the architecture, where architectural expansion is achieved via the addition of comparatively simple elements in a regular and known manner.
3. Utilization of a computational model exhibiting high fault-tolerant thresholds and correcting otherwise pathological error channels (such as qubit loss).
4. Utilization of a measurement based computational model. As the preparation network is designed simply to prepare the quantum resource, programming such a device is a problem of software, not hardware.
By constructing this architecture with the photonic module and photonic chip as the primary operational elements, this design addresses many of the significant hurdles limiting the practical scalability of optical quantum computation. These include inherent engineering problems associated with probabilistic photon/photon interactions and the apparent intractability of designing a large scale, programmable system which can be scaled to millions of qubits.
The experimental feasibility of constructing this type of computer is promising. High fidelity coupling of single photons to colour centers in cavities is a significant area of research in the quantum computing community, and the nano engineering required to produce a high fidelity photonic module is arguably much simpler than a full scale atom/cavity based quantum computer, which would require the fabrication of all data qubits and interconnected transport buses simultaneously. An additional benefit is the continuous nature of the lattice preparation. The required number of photonic modules and photonic chips depends only on the 2D cross sectional size of the lattice. This is important. Unlike other computational systems, this model does not require us to penalize individual photonic qubits within our physical resource analysis. Working under the assumption of appropriate on-demand sources, resource costs become a function of the total number of photonic chips (and single photon sources), rather than the total number of actual qubits required for a large 3D cluster. This highlights the advantages of having photons as disposable computational resources and, we hope, recasts the question of quantum resources into how many active quantum components are required to construct a large scale computer.

VI. ACKNOWLEDGMENTS
The authors thank C.-H. Su, Z. W. E. Evans and C. D. Hill for helpful discussions. The authors acknowledge the support of MEXT, JST, the EU project QAP, the Australian Research Council (Centre of Excellence Scheme and Fellowships DP0880466, DP0770715), and the Australian Government, the US National Security Agency, Army Research Office under Contract No. W911NF-05-1-0284.

APPENDIX A: ERROR SCALING OF THE TOPOLOGICAL LATTICE
This appendix briefly reviews how to use the three-dimensional cluster state that is prepared by the network. In particular we outline techniques for error correction and universal computation and explain how logical failure rates are suppressed and how resource requirements are calculated.
To extract the error syndrome, one measures each face qubit in every unit cell (see Fig. 4) in the X basis. In the absence of Z errors on these qubits, the parity of these six measurements will be even. A single Z error on one face qubit, or an error during measurement, will flip the parity that is recorded from even to odd. When a chain of more than one error occurs, only cells at the end points of the chain will record odd parity. Note that no information about the path of any error chains is obtained during error correction and that error chains can terminate on boundaries of the cluster.
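The statement that only the end points of an error chain record odd parity can be illustrated with a deliberately simplified sketch. It considers a single row of cells in which neighboring cells share one face qubit and assumes the other four faces of every cell are error free; this one-dimensional reduction and the function names are hypothetical.

```python
def cell_parities(n_cells, z_error_faces):
    """Parity (0 even, 1 odd) recorded by each cell in a row of cells.

    Face k is shared by cells k - 1 and k, so cell k owns faces k and k + 1.
    Faces 0 and n_cells lie on the boundary of the row.
    """
    errors = set(z_error_faces)
    return [((k in errors) + ((k + 1) in errors)) % 2 for k in range(n_cells)]


def odd_cells(n_cells, z_error_faces):
    """Indices of the cells that record odd parity."""
    parities = cell_parities(n_cells, z_error_faces)
    return [k for k, parity in enumerate(parities) if parity]


if __name__ == "__main__":
    print(odd_cells(6, [3]))        # a single error flips two adjacent cells
    print(odd_cells(6, [2, 3, 4]))  # a chain flips only its two end cells
    print(odd_cells(6, [0, 1]))     # a chain ending on the boundary flips one cell
```

Cells in the interior of a chain see two errors and return to even parity, which is why the syndrome reveals end points but not the path taken between them.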
Identifying the minimal set of Z errors that is consistent with the syndrome proceeds by representing each odd parity unit cell as a node in a completely connected, weighted graph. The weight of the edge between any two nodes is the minimum number of errors that could connect the two cells which they represent. For each node in the graph a partner node is added which represents a boundary where an error chain could have terminated. The weight of the edge connecting each node to its partner node is the minimum number of errors that could connect the cell to the boundary. Partner nodes also form a fully connected graph with the weight between any two nodes equal to zero. The entire graph is then solved for the minimum weight pairing of the nodes. Each pair in the solution represents a Z error or chain of Z errors which can be corrected for with classical post-processing.
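The graph construction described above can be sketched for a handful of syndrome cells. This is not a production decoder: it uses an exhaustive search in place of an efficient minimum weight matching algorithm, takes the Manhattan distance between cells as the edge weight, and assumes boundaries only at the two ends of the x-axis. All names and these modeling choices are assumptions made for illustration.

```python
import math


def manhattan(a, b):
    """Minimum number of errors assumed to connect two cells."""
    return sum(abs(i - j) for i, j in zip(a, b))


def boundary_distance(cell, size):
    """Assumed boundaries just outside x = 0 and x = size - 1."""
    x = cell[0]
    return min(x + 1, size - x)


def edge_weight(u, v, cells, size):
    """Weight between two graph nodes, each a ('cell' | 'boundary', index) pair."""
    (kind_u, i), (kind_v, j) = u, v
    if kind_u == "cell" and kind_v == "cell":
        return manhattan(cells[i], cells[j])
    if kind_u == "boundary" and kind_v == "boundary":
        return 0                      # partner nodes pair up at no cost
    if i == j:
        return boundary_distance(cells[i], size)
    return math.inf                   # a cell only connects to its own partner


def min_weight_matching(nodes, weight):
    """Exhaustive minimum weight perfect matching (small inputs only)."""
    if not nodes:
        return 0, []
    first, rest = nodes[0], nodes[1:]
    best = (math.inf, [])
    for k, partner in enumerate(rest):
        w = weight(first, partner)
        if w == math.inf:
            continue
        sub_w, sub_pairs = min_weight_matching(rest[:k] + rest[k + 1:], weight)
        if w + sub_w < best[0]:
            best = (w + sub_w, [(first, partner)] + sub_pairs)
    return best


def decode(cells, size):
    """Pair odd parity cells with each other or with their boundary partners."""
    count = len(cells)
    nodes = ([("cell", i) for i in range(count)]
             + [("boundary", i) for i in range(count)])
    return min_weight_matching(nodes, lambda u, v: edge_weight(u, v, cells, size))


if __name__ == "__main__":
    total, pairs = decode([(0, 5, 5), (4, 5, 5), (5, 5, 5)], size=10)
    print(total, pairs)
```

In this example the cell next to the boundary is matched to its partner node and the two adjacent cells are matched to each other, for a total of two inferred errors. The exhaustive search scales factorially, so a polynomial time matching algorithm would be needed for lattices of realistic size.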
We note that only Z error correction is required to correct Z, X, measurement, initialization errors and photon loss. The measurements required for syndrome extraction are made in the X basis and so are not affected by X errors. Errors during preparation of the cluster state are equivalent to Z errors on the prepared state. Photon loss could be treated as a random measurement result, equivalent to a measurement error or alternatively by calculating the parity of a closed surface of face qubits enclosing the lost photon or photons. This latter method is preferable given recent results establishing a high tolerance to heralded loss events [50].
A single logical qubit is associated with a pair of defects in the lattice. A defect is a connected volume of the lattice in which all qubits have been measured in the Z basis. Two types of defects can be made: primal and dual. Primal defects are regions where one or more unit cells (see Fig. 4) have been measured. Dual defects are defined identically in the dual space of the lattice, where the center of a dual unit cell corresponds to a vertex of a primal unit cell. Valid operations on a single logical qubit include joining the pair of defects via a chain of physical Z operations and encircling a single defect via a chain of physical Z operations. Technically, these logical operations have additional X operators associated with them to ensure anticommutation [19]; however, these do not affect X basis measurements and thus can be ignored in the current discussion. Note that logical X and Z operations can be realized by simply tracing their effect through the circuit until logical measurement and then appropriately adjusting the measurement outcome, as opposed to performing gated operations directly on the physical qubits within the three-dimensional lattice. A controlled-not interaction between a dual and primal qubit can be effected by braiding one of the dual defects around one of the primal defects. Braiding is performed by changing the shape of the region of measurements defining the location of a defect. Logical CNOT, combined with initialization, readout, state injection, and state distillation, allows universal quantum computation as detailed in Ref. [19].
Protection against logical errors is achieved by increasing the separation between defects and the circumference of defects. A logical error occurs when the error correction routine causes the application of a set of corrections that make or complete a chain of physical operations forming an erroneous logical operation. Just considering error chains joining defects, it is clear that the number of physical errors required to make such a chain increases linearly with the defect separation. The probability of forming such a chain therefore decreases exponentially with separation. As the distance of a code can be defined as the weight of its minimal weight logical operator, increasing the separation between defects by two cells increases the code distance by two. Increasing the circumference of defects has an analogous effect on the probability of logical errors in the conjugate basis.
The efficacy of error correction depends on the amount by which the physical error rate, $p$, is below some threshold error rate. As the distance of a code is increased from $d$ to $d+2$, the encoded error rate is transformed from $n(d)\,p^{(d+1)/2}$ to $n(d+2)\,p^{(d+1)/2+1}$, where higher order terms are neglected and $n(d)$ and $n(d+2)$ are constants related to the code and the circuits required to extract the syndrome. Increasing the distance of the code will lower the encoded error rate only if $p < n(d)/n(d+2)$. If $p$ is a factor of $x$ below this threshold, $p = n(d)/[x\,n(d+2)]$, then the encoded error rate is reduced by a factor $n(d)\,p^{(d+1)/2} / [n(d+2)\,p^{(d+1)/2+1}] = n(d)/[p\,n(d+2)] = x$. In general, provided that the quantity $n(d)/n(d+2)$ is constant for all $d$, increasing the distance of a code by two reduces the encoded error rate by a factor equal to the ratio of the threshold error rate to $p$. In our analysis of topological error correction we assume that $n(d)/n(d+2)$ is constant for all $d$.
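The scaling argument above can be checked numerically. The following is a minimal sketch under stated assumptions: the constants `n_d` and `n_d2` are hypothetical values chosen only for illustration (they are not taken from Ref. [18]), the helper names are ours, and exact rational arithmetic is used so that the ratio is not blurred by rounding.

```python
from fractions import Fraction


def encoded_error_rate(n, p, d):
    """Leading-order encoded error rate n(d) * p**((d + 1) / 2) for odd distance d."""
    if d % 2 == 0:
        raise ValueError("this sketch assumes an odd code distance")
    return n * p ** ((d + 1) // 2)


def reduction_factor(n_d, n_d2, p, d):
    """Ratio of the encoded error rates at distance d and at distance d + 2."""
    return encoded_error_rate(n_d, p, d) / encoded_error_rate(n_d2, p, d + 2)


# Hypothetical constants, chosen only for illustration.
n_d, n_d2 = Fraction(1), Fraction(3)
threshold = n_d / n_d2   # the threshold n(d)/n(d+2), assumed constant in d
x = 100                  # p lies a factor x below the threshold
p = threshold / x

for d in (3, 5, 7):
    # Each ratio reduces to n(d) / (p * n(d+2)), independent of d.
    print(d, reduction_factor(n_d, n_d2, p, d))
```

Under these assumptions every ratio equals $x$, independent of $d$, which is the statement made in the text.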
The threshold for the topological error correction code described above is $6.7 \times 10^{-3}$, where qubit loss is neglected and all other errors are assumed to be equally likely to occur [18]. If we assume that the physical error rate is two orders of magnitude below this threshold, equal to $6.7 \times 10^{-5}$, then increasing the separation of defects and the circumference of defects by two cells reduces the encoded error rate by a factor of 100. If our target encoded error rate, per time step, is $10^{-16}$, we require a minimum code distance of 17, which corresponds to a defect separation of 16 unit cells, or alternatively to the ability to correct error chains up to 8 errors long. Similarly, the perimeter of each defect should also be 16 cells, and hence we require defects that are $4 \times 4$ cells in cross-section. This directly leads to the resource estimates per logical qubit given in the body of the paper.
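The distance estimate can be reproduced with a short calculation. This is a sketch only: it assumes that the encoded error rate is of order one at $d = 1$ and falls by the suppression factor for every increase of two in the code distance. That normalization is our assumption, chosen because it is consistent with the figures quoted above, and the function name is hypothetical. Integer arithmetic is used so that the boundary case is not affected by floating-point rounding.

```python
def minimum_code_distance(suppression_per_step, inverse_target_rate):
    """Smallest odd d with suppression_per_step ** ((d - 1) // 2) >= inverse_target_rate.

    Assumes (for illustration) an encoded error rate of order one at d = 1 that
    drops by `suppression_per_step` each time d increases by two.
    """
    d = 1
    while suppression_per_step ** ((d - 1) // 2) < inverse_target_rate:
        d += 2
    return d


# Factor of 100 per step of two, target encoded error rate of 10**-16.
d = minimum_code_distance(100, 10**16)
separation = d - 1                  # defect separation in unit cells
correctable_chain = (d - 1) // 2    # longest correctable error chain
print(d, separation, correctable_chain)  # 17 16 8
```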

APPENDIX B: ERROR RATE ON NON-CLIFFORD LOGICAL OPERATIONS
In the main body of the text, two logical error rates are presented. The first is the effective logical error rate per single layer of the cluster along the direction of simulated time [Appendix A] and the second is the effective error rate per non-Clifford rotation $R_z(\pi/8)$. We present these two values separately as the optimization of the measurement sequence for applying non-Clifford gates is still incomplete. This calculation instead provides a rough order of magnitude estimate of the effective non-Clifford error rate given the topological protection afforded by the lattice specified in the body of the paper.
Within the topological model, only a subset of universal gates can be applied directly to the lattice. These include initialization and measurement in the X and Z bases, single qubit X and Z operations (although these are realized by tracing their effect through the circuit until logical measurement and then appropriately adjusting the results), and braided CNOT gates. The universal set is completed through state injection, magic state distillation, and teleportation protocols [18,51].
As non-Clifford rotations require the implementation of state distillation protocols, the effective logical gate rate is then dependent on the volume of cluster used to inject and distill a sufficiently high fidelity ancilla state for use in the teleportation protocol.
In order to estimate the failure rate of a non-Clifford $R_z(\pi/8)$ operation, we examine the volume required to distill high fidelity ancilla states from low fidelity injected sources and to perform the required teleportation circuits to implement the $R_z(\pi/8)$ gate. Tab. II from Ref. [18] specifies the volume of the 3D cluster that is required to implement five specific gate operations within the TCQC model, namely the CNOT, state distillation circuits for the singular qubit states, and teleportation circuits to implement the single qubit gates. The volume estimates are given in terms of a scaled logical cell, where defects are now defined such that they have a sufficiently large circumference and separation to suppress the probability of logical failure, as Fig. 7 (from Ref. [18]) illustrates. The cluster volumes quoted in Tab. II give the total number of logical cells required to implement specific gates. Fig. 7 shows defects of the same dimensions as in Fig. 6, but translated along the $x$-$y$ plane of the lattice. A re-scaled logical cell is now a volume of $\lambda^3 = 20^3$ with a cylindrical defect of volume $d^2\lambda = 16 \times 20$ passing through its center.
[...] [52] that are currently unpublished. Our calculations will assume the original volume estimates from Ref. [18], overestimating the total volume consumed for the gate $R_z(\pi/8)$.
To implement the non-Clifford gate, $T$, injected low fidelity states [Eq. B1] need to be purified using magic state distillation [51] before teleportation circuits can be used to enact the gate. Shown in Fig. 8 is the quantum circuit required to perform a single qubit $T$ gate on an arbitrary state $|\psi\rangle$ given an appropriate ancilla $|A\rangle$. The measurement result of the ancilla qubit after the CNOT determines whether the gate $T$ or $T^\dagger$ is applied. If the gate $T^\dagger$ is applied, then the further application of a single qubit $P$ gate transforms the rotation from $T^\dagger$ to $PT^\dagger = T$. As single qubit $P$ gates also require distilled ancillae and teleportation protocols, and the teleported gate $T$ is applied with a probability of 0.5, every two $T$ gates within the quantum circuit will, on average, require the application of one $P$ gate. Hence not only do we need to distill one $|A\rangle$ state, but we also need to distill "half" a $|Y\rangle$ state.
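The correction step can be verified directly. The sketch below assumes the common convention, up to a global phase, $T = \mathrm{diag}(1, e^{i\pi/4})$ and $P = \mathrm{diag}(1, i)$; this convention and all helper names are our assumptions rather than definitions taken from the text. Since both gates are diagonal, each is stored as a pair of diagonal entries.

```python
import cmath

# Diagonal single-qubit gates stored as (entry on |0>, entry on |1>).
# Assumed convention, up to global phase: T = diag(1, e^{i pi/4}), P = diag(1, i).
T = (1, cmath.exp(1j * cmath.pi / 4))
P = (1, 1j)


def dagger(gate):
    """Hermitian conjugate of a diagonal gate."""
    return tuple(complex(g).conjugate() for g in gate)


def compose(a, b):
    """Product a * b of two diagonal gates."""
    return tuple(x * y for x, y in zip(a, b))


def close(a, b, tol=1e-12):
    """Entrywise comparison of two diagonal gates."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


def expected_p_gates(num_t_gates, p_correction=0.5):
    """Average number of P corrections when each T gate needs one with probability 0.5."""
    return num_t_gates * p_correction


print(close(compose(P, dagger(T)), T))  # True: P T^dagger equals T
print(expected_p_gates(2))              # 1.0: one P gate per two T gates on average
```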
Assuming that the residual error on all injected qubits within the lattice is equal to our assumed operational error rate of the computer, $p = 6.7 \times 10^{-5}$, two concatenated levels of state distillation are required. To leading order in $p$, the recursion relations and success probabilities ($P^{A,Y}$) for $|A\rangle$ and $|Y\rangle$ state distillation, in the limit of negligible topological error, are given by [...] for concatenation level $l$ [53]. For $p^A_0 = 6.7 \times 10^{-5}$ and $p^Y_0 = 6.7 \times 10^{-5}$, the residual error after two rounds of distillation is $p$ [...]. In total, $15/(1 - P^A_1) + 1/(1 - P^A_2) \approx 16$ distillation circuits are required for two levels of $|A\rangle$ state distillation and $7/(1 - P^Y_1) + 1/(1 - P^Y_2) \approx 8$ are required for two levels of $|Y\rangle$ state distillation. Run in parallel, this requires a total volume of $16V_A + (1/2) \times 8V_Y = 5856$ logical cells, where the factor of 1/2 accounts for the probabilistic implementation of the $P$ gate. The teleportation circuit is then applied, requiring a logical volume of $V_{1,z} + (1/2)V_{1,x} = 4$ cells. Given that the scaling factor for the computer detailed in Fig. 6 is $\lambda = 20$, the total number of elementary cells, in the direction of simulated time, is $\Omega = \lambda(5856 + 4) \approx 1.1 \times 10^5$ and the failure rate of the logical $T$ gate is approximately [...]. This estimate for the logical gate assumes no specific optimization of non-Clifford group gates and represents a conservative estimate of the expected logical gate fidelity. In addition to the error rate of the logical gate, a single application of the $T$ gate consumes, on average, $15^2 + 7^2/2 \approx 250$ logical qubits in the lattice. However, the qubit resources at the logical level are equivalent for all computational models employing state distillation and teleportation of $P$ and $T$ gates to achieve universality.
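The bookkeeping in this estimate can be laid out explicitly. The sketch below uses only figures quoted in the text (a distillation volume of 5856 logical cells, a teleportation volume of 4 cells, and a scale factor of 20). The function names are hypothetical, and reading 15 and 7 as the number of input states consumed per distillation round for $|A\rangle$ and $|Y\rangle$ is our interpretation of the expression $15^2 + 7^2/2$.

```python
def elementary_cells(scale_factor, distillation_cells, teleportation_cells):
    """Elementary cells along simulated time: scale factor times total logical cells."""
    return scale_factor * (distillation_cells + teleportation_cells)


def raw_states_per_t_gate(levels=2, a_inputs=15, y_inputs=7, y_fraction=0.5):
    """Average number of injected states consumed by one T gate.

    Assumes `levels` rounds of distillation, `a_inputs` inputs per |A> round,
    `y_inputs` inputs per |Y> round, and a P correction (hence a |Y> state)
    needed for a fraction `y_fraction` of the T gates.
    """
    return a_inputs ** levels + y_fraction * y_inputs ** levels


print(elementary_cells(20, 5856, 4))  # 117200
print(raw_states_per_t_gate())        # 249.5
```

Both results agree in order of magnitude with the rounded values quoted above, of order $10^5$ elementary cells and roughly 250 logical qubits per $T$ gate.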