Towards Scalable Bosonic Quantum Error Correction

We review some of the recent efforts in devising and engineering bosonic qubits for superconducting devices, with emphasis on the Gottesman-Kitaev-Preskill (GKP) qubit. We present some new results on decoding repeated GKP error correction using finitely-squeezed GKP ancilla qubits, exhibiting differences with previously studied stochastic error models. We discuss circuit-QED ways to realize CZ gates between GKP qubits and we discuss different scenario's for using GKP and regular qubits as building blocks in a scalable superconducting surface code architecture.


Introduction
There has been a recent surge in interest in bosonic error correction, both from the experimental as well as from the theoretical side. By bosonic quantum error correction we mean the representation of a qubit as a two-dimensional subspace of an oscillator, a means of performing some error correction on this qubit, as well as a suite of techniques to perform universal computation on the qubit.
We review some of these recent developments and older proposals, with an eye towards integration of the ideas into a scalable (code) architecture. To be concrete, we concentrate on superconducting devices as physical realizations, due to the excellent control and engineerability of strong non-linearities. Due to the commonality of the quantum optics language, some of our discussion applies more generally to other physical systems realizing oscillators, such as optical modes or mechanical oscillators. Our paper does not aim to be comprehensive in reviewing all possible bosonic codes, but rather seeks to identify some promising approaches and future work to be undertaken, in particular emphasizing scalable bosonic GKP error correction.
A first condition to even consider encoding a qubit into an oscillator is that a high-Q oscillator is available ‡. Examples of such high-Q oscillators are microwave cavity modes, of 3D or co-planar resonators in a frequency range f = 3 − 10 GHz where singlephoton life-times can be up to τ = 1/κ = 1 − 10 ms [1,2]. Thus, without additional couplings and drives, the native noise model of such microwave cavities is simply photon loss, governed by the cavity decay rate κ.
In order to prepare and manipulate an encoded qubit as prescribed by some code, one induces additional errors which the chosen code should, ideally, also be able to correct. It is thus important to pick a code in which computational manipulations and error corrections are relatively simple and the chosen code can also handle the errors which occur in these processes. As is well known, no finite code can correct all errors, and hence the goal of bosonic quantum error correction is simply to provide a logical qubit which can be used as a building block in a further coding scheme. The repetition or surface codes are then the simplest examples of such further encoding steps, using then multiple oscillators.
Even though we can embrace the surface code as the simplest scalable 2D coding scheme [3], variants on who is playing the role of data and ancilla qubit and what additional error correction on these qubits takes place, are important in actually getting the very demanding engineering it, to pan out. If we have learned anything over the past 20 years of Hamiltonian engineering it is that partially-coherent dynamics can be implemented in many quantum systems, while very few to none may allow for the high-precision control and scalability needed for quantum error correction.
Overall, the challenges of efficiently using a bosonic qubit encoding are in (1) keeping the harmonicity of the oscillators as high as possible while temporally coupling ‡ The Q of the oscillator captures the number of oscillations until it is fully damped and is given by Q = ω κ with ω = 2πf the angular frequency of the oscillator and κ the oscillator decay rate.
to this mode, with high on/off ratio, to create and manipulate non-classical code states, (2) finding a photon number regime in which approximations of engineered Hamiltonians are accurate while the error correcting properties and benefits of the encoding are valid. When using bosonic qubits as basic qubits in a code architecture, it may be advantageous to choose data qubits differently than ancilla qubits and we will give some examples of such choices. The simplest encoding of a single qubit into a bosonic mode can be done using Fock states: the vacuum state represents |0 and a single-photon state represents |1 . For superconducting devices one can view the difference between bosonic encoding versus the regular transmon qubit encoding [4] as an interchange between the roles played by the anharmonic and the harmonic oscillator. Using transmon qubits to store information, resonators are used as couplers and for read-out. Using bosonic qubits to store information, anharmonic oscillators are used for state preparation and couplers generating effective nonlinearities to realize gates. In this review we will refer to systems in which the lowest two energy eigenstates (in the absence of couplers) are used as regular qubits: this definition covers a Fock encoding as well as a transmon or a fluxonium qubit.
In order to avoid notation clutter, we denote logical gates on the GKP codewords without the standard overline notation in Section 2.3.2, i.e. CNOT and Z instead of CNOT and Z.

Early Birds & Cats and Their Generalizations
The first bosonic codes were formulated in [8] and designed to protect against photon loss. Of particular interest is a two-mode code with codewords with |k denoting a Fock state with k photons. If either |0 or |1 (or both) were hit by the loss of a single photon on any one of the two modes, we can readily see that the resulting states would still be orthogonal. This orthogonality is a prerequisite for being able to correct the photon loss error, but it is however not a sufficient condition. To examine the error correction capability of a (bosonic) code, one asks whether a set of dominant errors satisfies the quantum error correction (QEC) conditions [9] of the code: if this holds (approximately) then there is an (approximate) recovery operation undoing these dominant errors. For a set of errors E = {E 1 , . . . , E k } acting on the encoding of a single qubit, the quantum error conditions are as follows. ∀i, j, we require To examplify the use of these conditions, let us first look at the single-mode version of the code in Eq. (3): This code was introduced in [10] as the smallest member of a family of so-called binomial codes, hence its name kitten or 'baby-binomial' code. This code and its logical gates has been implemented using a superconducting microwave cavity mode as an oscillator in Ref. [11], but the life-time of the encoded qubit was comparable to that of a Fock state encoding. One can easily check that for the error set E = {I, √ γa}, the QEC conditions in Eqs. (4), (5) for this code are met. However, these errors are only an approximation of the real noise. A photon loss channel with photon decay rate κ lasting for time t with γ ≡ κt 1, can be modeled by a superoperator N γ with Kraus operators E 0 = I − γ 2 a † a + O(γ 2 ) ≈ e −γa † a/2 and E 1 = √ γa, or For the Kraus operators E 0 and E 1 the QEC conditions in Eqs. (4), (5) are not quite met. In particular, we have as |1 is an eigenstate of a † a, while |0 is not. This means that upon the detection of no photon loss (corresponding to E 0 ) the code states undergo an irreversible distortion.
The two-mode version of this code, Eq. (3), improves on this distortion issue as the quantum error correction conditions for the two mode code are met for the error set E = { √ γa, √ γb, exp(− γ 2 (n a +n b )) ≈ I − γ 2 (n a +n b )}. These error operators can be viewed as the three Kraus operators of a process in which there is either photon loss on mode a, photon loss on mode b, or no photon loss on either modes. For the states in Eq. (3) we have no distortion upon not detecting a photon from either modes as |0 and |1 are both eigenstates of exp(− γ 2 (n a +n b )) with eigenvalue exp(−2γ). As far as we know, this two-mode code is still awaiting experimental realization.
By allowing ourselves code states with higher average photon number, we can correct for more loss errors, as well as gain and dephasing errors. More precisely, Ref. [10] has introduced families of binomial and so-called cat codes which correct against the set of errors E = {I, a, . . . , a L , a † , . . . , a † G , a † a, . . . , (a † a) D } for arbitrary L, G and D and N = max(L, G, 2D). For example, the idea behind the binomial codes can be understood as follows. Using the Holstein-Primakoff transformation a † a → J z + J, the binomial code words |0 and |1 can be seen as spin-eigenstates of J x = ±J with 2J = N + 1. Dephasing errors (a † a) m , m = 1, . . . , D thus lead to a change in J x by at most D ≤ J , hence keeping codewords orthogonal. At the same time, protection against photon loss and gain is achieved by using a subspace of sufficiently separated Fock states stabilized by the operator Π S = e i2πa † a/(S+1) with S = L + G. For S = 1 this gives the photon parity operator Π S=1 ≡ Π photon = e iπa † a : the even-photon codewords in Eq. (6) are clearly +1 eigenstates of this photon parity operator.
On the other hand, for large enough α, bit-flips, α ↔ −α, can be expected to occur at a much lower error rate as they correspond to a large change of the state in phase space. Particularly interesting is the engineering of Hamiltonians or dissipative processes which have these code states |±α as degenerate fixed-points, so that there is a 'macrosopic' energy barrier to transition between them, leading to a bit flip rate exponentially small in |α| 2 . This design can lead to a qubit for which the noise is biased as phase-flip errors are more prominent than bit-flip errors. We will discuss this noise-biased qubit in more detail in Section 2.2.
The next-level cat encoding was introduced in [13,14], and is sometimes referred to as the 4-legged cat code since its codewords have four blobs in phase space: Using the standard identity α| β = e −(|α| 2 +|β| 2 )/2 e α * β one can verify the orthogonality of these two states. As for the kitten code, we can verify that both states are +1 eigenstates of the photon parity operator Π photon = e iπa † a using that Π photon |α = |−α .
The photon parity thus functions as a check operator, taking eigenvalue +1 on the code space and measuring it is a natural way to perform error correction. The states |0 and |1 , both having even photon parity, are however distinguished by their photon parity modulo 4, expressed as the ±1 eigenvalues of the operator Π 1/2 photon = exp(iπa † a/2). To measure the photon parity operator via an ancilla qubit, a cavity mode-qubit dispersive interaction −χZa † a/2 can be used [5]. In circuit-QED the interaction Za † a comes about naturally as the effective interaction between, say, a cavity mode and a linearly-coupled, off-resonant, transmon qubit mode [4]. The measurement of Π photon then proceeds by preparing the ancilla qubit in |+ , letting the interaction take place for time t = π/χ and subsequently measuring the qubit in the |± basis. Using a transmon qubit and cavity mode, Ref. [15] has shown that tracking photon parity by repeated measurements of Π photon makes for a logical qubit which has a longer life-time than a Fock qubit without error correction in the same cavity mode. This result has essentially been the first demonstration of quantum error correction lengthening the life-time as compared to that of native qubits (transmon and/or Fock encoding) in the hardware.
Before we discuss further generalizations, let us examine the quantum error correction conditions, Eq. (4), (5)  = 1| a † a |1 . Besides the uninteresting case of taking α very large (so that all |±α , |±iα are orthogonal), this last condition is exactly met at sweet spots given by the equation tan α 2 = − tanh α 2 . The smallest sweet-spot at |α| 2 = 2.34 lies close to the number of photonsn = 2 of the cat code used in the experiment [15].
There are several error channels which impact the performance of the cat code using repeated photon parity measurements. First of all, the code cannot fully correct against the photon loss channel as it cannot correct the distortion Kraus operator E 0 = I − γa † a/2. Secondly, two photon-loss events ∝ a 2 implement a logical bitflip |0 ↔ |1 . Thirdly, photon loss in combination with the inevitable Kerr nonlinearity ∼ (a † a) 2 on the cavity mode causes incorrectable dephasing: the Kerr interaction makes the cavity rotation speed depend on the number of photons in the cavity, but this number becomes indeterminate in the presence of photon loss. Last but not least, transmon qubit decay during the qubit controlled-a † a interaction, is a serious source of feedback error. For example, when the qubit decays half-way through the interaction, |1 → |0 , it applies only half the rotation on the cavity mode. The result is that the eigenvalues of Π 1/2 photon = Z are measured via the qubit measurement, collapsing the logical state. This last feedback error problem is an important issue for any bosonic qubit, and it has been a central theme in the theory of fault-tolerant computing in general [16]. A disadvantage of the theoretical schemes for fault-tolerant quantum error correction is that they typically require additional hardware resources, such as logical ancilla qubits or (verified) multi-qubit GHZ states. Instead, we may seek hardware-efficient mitigation of the feedback error problem. As an example, Ref. [17] has addressed the feedback error due to transmon relaxation by drive-engineering the dispersive coupling Hamiltonian to equal −χ(|2 2| + |1 1|)a † a and starting the ancilla transmon qubit in the state 1 √ 2 (|0 + |2 ). Transmon qubit decay from 2 → 1 then commutes with the transmoncavity interaction and does not cause errors on the cavity mode. The decay does, -as in the normal case-, affect the reliability of the transmon qubit measurement outcome. All-in all, this has led to an overall factor 5 in improvement of the life-time of the encoded cat qubit [17].
Another way of minimizing feedback errors on a bosonic code is to use a biasednoise ancilla qubit (Section 2.2) as an ancilla qubit. As proposed in [18], the goal is then to let the strong-noise error channel affect the ancilla qubit measurement, while the low-noise (bit-flip) channel on the ancilla feeds back low-noise to the bosonic code.
Single-mode cat codes with higher-photon numbers can be formulated and form a class of codes [10,19]. Ref. [20] studied the performance of binomial, cat and GKP codes (see Section 2.3), against photon loss, assuming optimal noisefree recovery as permitted by the quantum error correcting conditions. Ref. [21] has formulated a general framework of rotation-symmetric codes of which the binomial and cat codes are subclasses: the unifying theme is rotation symmetry of the code states in phase space captured by invariance under the operator Π S . Another interesting class of bosonic codes uses a 3-wave mixing χ (2) -interaction, Eq. (32), as the central element for defining the code and correcting photon loss [22]. Various classes of multi-mode codes against photon loss exist, see for example [23,24,20] and references therein.
A challenge in using a bosonic qubit is that some computational manipulations can be more involved than for a regular qubit. For example, on a regular qubit such as a transmon qubit, rotations by an angle θ around axes X or Y are easily accomplished by temporarily supplying microwave radiation. On a bosonic qubit, these simple singlequbit gates can be non-trivial. An advocated solution in Ref. [25] is to always use a dual-rail (dr) encoding of a bosonic qubit with |0 dr = |01 and |1 dr = |10 , where |0 and |1 are the states of (an arbitrary) bosonic qubit itself. Having mapped the Blochsphere of a qubit onto a two-mode state space, the exponential mode-SWAP operator exp(iθ SWAP a,b ) becomes a universal gate to do single-qubit and two-qubit gates, and has been realized in [26]. Here the linear SWAP a,b transformation interchanges two modes a and b, i.e. its action on quadrature operators for the modes is given by q a ↔ q b and p a ↔ p b . If we envision using a bosonic qubit as a building block qubit in a stabilizer code, it is however not necessary to perform any gate, but rather we can focus on performing CNOT or CZ, Hadamard (H) and T gates possibly using ancilla qubits, see e.g. [3].

Noise-Biased Cat Qubit
A method to set up a dissipative process which stabilizes the coherent states |±α was devised in [13]. The idea is to engineer the Lindblad equation (in a frame rotating at the mode frequency):ρ with H sq = Ea † 2 + E * a 2 where E = i|E| with |E| proportional to a pump strength. To understand the fixed points of this evolution, for which L(ρ) = 0, we can write the Lindblad equation asρ We can then use, with K = κ 2ph 2 : This immediately implies that the states |±α = ± |E|/K are fixed points of the Lindblad evolution, as M α |±α = 0, and the last term in Eq. (11) is canceled by the constant −|E| 2 K which remains from the first term. Hence, any linear combination of the states |α = ± |E|/K is a fixed point of the dynamics.
When the pump inducing the squeezing Hamiltonian H sq is off, E = 0, we can observe that the Fock states |0 = lim α→0 |C + α and |1 = lim α→0 |C − α are fixed points, distinguished by their photon parity. Thus when E is gradually increased, we can smoothly change from a Fock encoding into the cat |C ± α encoding. Photon loss at rate κ, which can be modeled by introducing an additional term κD(a)(ρ) in Eq. (10), causes phase-flip errors, i.e. flipping between the states |C ± α , but does not interfere with the stabilization itself as |±α are eigenstates of a so that D(a)(|±α ±α|) = 0. One can add a drive term H drive = (t)a † + * (t)a to the Lindblad equation and observe that the annihilation operator a will generate rotations around the Z-axis (periodically interchanging |C ± α ). At the same time, a † in principle leads to a departure from the qubit subspace spanned by |±α corresponding to leakage. However, due the ∼ |α| 2 gap of the Lindbladian, such departure from the eigenvalue-0 manifold is exponentially suppressed and the effect of the driving term can be analyzed by projecting it onto the stabilized subspace. In this subspace it then induces Rabi oscillations around an axis which is exponentially-closely aligned with the Z-axis, with Rabi frequency Ω ∝ | ||α|, experimentally demonstrated in [27] §. A measurement in the X-basis can be accomplished by measuring the photon parity through a coupled transmon qubit. § In Refs. [13], [28] and some other papers a different convention is used, namely |C + α (resp. |C − α ) is the Z eigenstate |0 (resp. |1 ), thus interchanging what is called Z and X here.
The (pumped) squeezing interaction and the required two-photon dissipative process have first been experimentally realized in [29]. This was achieved by coupling a 3D storage cavity (at frequency ω a ) via a bridging transmon to a lossy cavity (at different frequency ω b ) and applying a two-tone drive on the lossy cavity so as to set up a process to convert two storage photons to one lossy cavity photon which is subsequently lost (the κ 2ph D(a 2 ) process). The lossy cavity is driven at pump frequency ω p = 2ω a − ω b as well as close to its own frequency ω b , generating, through the transmon nonlinearity, an effective degenerate parametric oscillator with resonant terms of the form An alternative, non-dissipative, route towards a noise-biased qubit was first proposed in [30]. Instead of invoking dissipation, the idea is to engineer a Hamiltonian which has |±α as degenerate eigenstates, using a Kerr nonlinearity and squeezing. The two-photon dissipation is then considered an optional add-on which helps in mitigating leakage, i.e. a departure from the subspace spanned by |±α . The target Hamiltonian (in the rotating frame of the cavity mode) is The spectrum of H has eigenvalues running from |E| 2 K downwards as the first term in H is negative-semi-definite. Omitting the factor |E| 2 K , the highest eigenstates are the states |±α with degenerate zero eigenvalues. We can observe the similarity and difference with Eq. (12): here we consider a Hermitian matrix and the phase of the pump amplitude E is variable and determines the phase of the coherent states which are the zero energy eigenstates. Thus, by adiabatically changing the phase of E we move to different zero energy eigenstates, allowing us to transform α → −α and hence realize a X gate on |C ± α . For the stability of the encoded space it is important to understand the spectrum of H and the gap below these degenerate zero eigenstates, see the analysis in [30,18]. To understand this, assume that the phase ϕ = 0 for simplicity. We can displace the Hamiltonian by D(±α) with α = |E|/K. For large |E|/K, one can approximate D(±α)HD(±α) ≈ −4K|α| 2 a † a, a harmonic oscillator Hamiltonian. This shows that for large α, the spectrum has approximately gap 4K|α| 2 and the first excited states below |±α are roughly equal to D(±α) |1 . The so-called 'Cassinian' Hamiltonian in Eq. (13) was first studied in [31]: the surfaces of constant classical energy are described by Cassinian ovals in p and q with the focii of the ovals at q = ± |E|/K. As a quantum system the spectrum is that of an inverted double-well ('double-oscillator') with the well maxima at zero energy for the states |±α . We can consider the effect of driving and several dissipative processes for the Hamiltonian in Eq. (13). For example, when one includes photon loss κD(ρ) in the Lindblad equation and the pump amplitude is sufficiently large, i.e. 16|E| 2 > κ 2 [32,33,30], the fixed point of the Lindblad equation is the state p |α α| + (1 − p) |−α −α| with modifiedα, |α| 2 < |E| K . In this regime the system neatly represent the dissipative storage of a classical bit.
The effect of other sources of noise such as dephasing κ deph D(a † a)(ρ) (see also Eq. 28), photon gain κn therm D(a † )(ρ) due to the coupling with a finite temperature heat bath, as well coupling with baths with other spectral densities are discussed in detail in [13,30,18,34].
Ref. [28] has implemented the Kerr-cat Hamiltonian in Eq. (13) and the corresponding qubit in the resonant mode of a so-called SNAIL element (see Section 3.3), coupled to a read-out cavity mode. The fourth-order nonlinearity of the SNAIL element gives the wanted −Ka † a 2 term, while one can drive the mode at twice its frequency so as to use the third-order SNAIL term ∝ a † 3 + a 3 to turn on squeezing. The experiment generated cat states with |α| 2 ≈ 2.5 with a dephasing life-time of 3 µs, and an enhanced decay life-time of 105 µs, and a π/2 rotation around the Z-axis obtained by driving took 24 ns. The ability to convert the noise-biased qubit to a Fock encoding by turning off the squeezing drive allows to measure Pauli X via a standard dispersive measurement [28]. One can also measure a noise-biased qubit in the X-basis by dispersively coupling (−χa † aZ/2) it to an ancilla qubit to map the photon parity onto the state of the ancilla qubit which is subsequently measured. To realize a (nondestructive) Pauli Z measurement, distinguishing ±α, Ref. [28] had applied, besides the squeezing drive, a drive at the difference frequency of the SNAIL mode and the read-out cavity mode (b) to get a resonant beam-splitting interaction ∝ a † b+ab † . The upshot is that the coherent states |±α are mapped to corresponding coherent states in the cavity mode which are heterodyne-measured when leaking out of the cavity.
Given that the noise-biased cat qubit is designed to have a low bit-flip error rate, it can function as an ancilla control qubit in the error correction circuit for another code [18] inducing low feedback noise. Assume we have a code which is an eigenspace of a stabilizer S = e iA and S is to be measured using the noise-biased cat qubit to detect or correct errors. This requires an interaction of the noise-biased cat qubit and the code of the form H int ∝ (a + a † ) ⊗ A since a + a † ≈ Z on the noise-biased cat code space (besides some leakage), allowing for a qubit controlled-S operation. For example, for the cat code, S = e iπb † b = Π photon , requiring a tunable photon-pressure coupling between the two modes of the form H int ∝ (a + a † )b † b. For the GKP code, see Section 2.3, S is a displacement so that H int can be chosen to be a tunable beam-splitting interaction of the form a † b + ab † .
It has been argued that, if the noise-bias of this qubit is sufficiently strong, only a classical repetition code [35] might suffice to correct for the dominant phase-flip (Z) errors due to photon loss. Crucial in this idea is that the CNOT gate which is needed to measure the XX checks of this code preserves the noise-bias, that is, Z errors during the gate do not propagate to become X errors after the gate. For the Kerr-cat qubit a noise-bias preserving CNOT gate has been proposed in [34]. A similar idea is to use this Kerr-cat qubit as a basic qubit in a surface code architecture in which the XXXX and YYYY checks are measured [34,36]. In this modified form of surface code one gains much more information about Z errors. It has been shown that when the probability for phase-flip errors and measurement errors is a factor 100 more than that of bit-flip errors within a phenomenological error model, the threshold against Z errors can be as high as 5% [36]. It is an open question whether such high bias will be feasible in practice as experiments for doing the CZ gate and the noise-bias preserving CNOT gate on these noise-biased qubits are still to come.

The GKP Qubit
The (square) Gottesman-Kitaev-Preskill (GKP) qubit introduced in Ref. [37] is defined through two commuting displacement operators, acting as translations in phase space, i.e. S q = exp(i2 √ πq) and S p = exp(−i2 √ πp) . The ideal GKP code is the space invariant under these two phase-space translations. As a result, any wave function in q (resp. p) in this space has support on q = k √ π (resp. p = l √ π) for integers k, l ∈ Z. The logical operators of the qubit are Z = exp(i √ πq) and X = exp ). This choice makes the wave function in q of |0 a sum of delta functions at values of q which are even multiples of √ π, while |1 has uniform support on values of q which are odd multiples of √ π. The ideal code meets the quantum error correction conditions for a continuous set of 'at most half-logical' displacements E = {e iup , e ivq : |u|, |v| ≤ √ π/2}, since any products of these shifts maps a |0 onto a state orthogonal to both |1 and |0 (and vice-versa). The set of correctable displacements forms a square Wigner-Seitz or Voronoi cell (containing only one lattice point such that all points in the cell are closer to this point than to another lattice point) in the code lattice generated by the logical phase-space translations.
Naturally, an asymmetric version of the GKP code which corrects more shift errors inq than shift errors inp can also be defined. However, when there is no hardware-based noise asymmetry betweenp andq this does not seem immediately useful.
In principle, and in theory, to perform quantum error correction the eigenvalues (phases) of the unitary operators S p and S q are to be measured. Performing such measurements projects the continuum of errors onto (superpositions of) possible displacements, and we perform error correction by choosing a displacement of minimal amplitude which resets these eigenvalues to +1, corresponding to the code space. In Section 2.4 we will analyze GKP quantum error correction using encoded GKP ancilla qubits, see Fig. 7. The advantage of this form of error correction is that it does not suffer from feedback errors induced by a poor ancilla qubit (instead, it suffers feedback errors from a GKP ancilla qubit) and the information gained through measuring the GKP ancilla states is analog rather than binary. The disadvantage is that one needs to prepare GKP ancilla states themselves first.
For this latter task one can perform some form of phase estimation to measure the eigenvalues of the unitary operators S p and S q . Since the eigenvalues take continuous values, one only ever realizes an approximate estimation of these phases. Phase estimation can readily be executed by coupling the GKP mode repeatedly to a single ancilla qubit via controlled-displacement gates as was proposed and discussed in great detail in Ref. [38], focusing on a circuit-QED implementation. The idea behind this is simple. To measure the eigenvalue of a unitary operator U such as the displacements S p or S q , one can use ancilla qubits applying qubit controlled-U k gates for k = 1, 2 . . .. For example, when k = 1, the circuit on the left in Fig. 1 has outcome probabilities P(±) = 1 2 (1 ± Re(U ) , while the circuit on the right has probabilities P(±) = 1 2 (1 ∓ Im(U ) . Figure 1: Single round of phase estimation with U |ψ n = e iθn |ψ n where the probability for ancilla qubit to be measured in the state |± equals P(±) = n |α n | 2 1 2 (1 ± cos(θ n )) (left) and P(±) = n |α n | 2 1 2 (1 ∓ sin(θ n )) (right). In the applications here U is a displacement S p or S q .
In phase-estimation schemes higher powers k > 1 of U k are often used, but applying U k , a displacement of strength ∼ k, increases the number of photons in the state by ∼ k 2 and does not provide a good approximation of an approximate GKP state [38]. Instead of repeating the phase estimation to collect bits of phase and then do a final corrective displacement, it is experimentally simpler to opt for immediate feedback on the code state based on each new bit obtained in a round of phase estimation. This is the route taken in the experimental realization of the GKP code in [39], where a small conditional displacement on the GKP qubit is executed depending on the ancilla qubit measurement outcome. In fact, using such immediate feedback the state of the ancilla qubit does not even need to be measured, as the feedback can be done depending on the qubit state itself, followed by an approximate disentangling step [40] or alternatively a qubit reset step (to remove entropy build-up).
In addition, in Ref. [39] only the right circuit in Fig. 1 measuring Im(U ) is used (instead of measuring both Re(U ) and Im(U )). If the state to be measured is (approximately) symmetrically centered around the vacuum so that its wavefunction is symmetric under q → −q and p → −p, we have dp|ψ(p)| 2 Im(S p ) = 0 and dq|ψ(q)| 2 Im(S q ) = 0. This implies that P(±) = 1 2 (1 ± Im(U ) ) = 1 2 , suggesting that the measurement outcome ± can gain a maximal amount of information by weakly projecting onto sin(θ) ≷ 0, and subsequently shifting the state to the point θ = 0. These feedback shifts are realized in [39] by small displacements. Note that if the input state has eigenvalue phase θ close to 0, then Re(U ) is close to 1, implying that not much is learned by doing the measurement with outcomes P(±) = 1 2 (1 ± Re(U ) ). We remark that the length of the displacement of the logical Y is √ 2 larger than that of X and Z. This implies some asymmetry in error correction. Namely, if we correct by measuring S p and S q , shifts such as exp(iup + vq) with u 2 ≤ π/4 and v 2 ≤ π/4 can be corrected which, as displacements, are a factor √ 2 larger than correctable displacements in purep andq directions. Given a noise model which is rotationally-symmetric in phase space, this does not seem to be an optimal choice. It also implies that logical Y eigenstates which can flip due to large displacements in purep andq directions can have shorter lifetimes [39].
A 'hexagonal' GKP qubit has also been defined in [37] by choosing two phase-space lattice translations which are not orthogonal such that all three logical operators X, Y and Z have the same length as phase-space translation vectors. For this choice we take as stabilizers exp(iξ( √ 3q −p)/2) and exp(iξp) with ξ = 2 2π/ √ 3, generating a hexagonal lattice in phase space. Again the logical operators are half-stabilizers, forming the vectors generating a hexagonal lattice. The correctable displacements now form a hexagonal Wigner-Seitz cell. This cell is larger in volume than the square Wigner-Seitz cell in the square GKP lattice. If we assume that displacement errors occur according to a stochastic Gaussian model as in Eq. (24), it implies that the hexagonal code can correct a larger volume of errors.
If we were to choose stabilizers S q = exp(i √ 2πq) and S p = exp(−i √ 2πp), there would be no additional commuting displacement operators, implying that the +1 eigenspace S p and S q is one-dimensional. This eigenstate, also called the sensor state |ψ sensor in [41], is a uniform sum of delta function at q = k √ 2π with k ∈ Z (and similarly a uniform sum of delta functions at p = l √ 2π with l ∈ Z). The sensor state is interesting in allowing one to simultaneously estimate the complex and real part of the amplitude α of a displacement D(α), by performing phase-estimation for S q and S p on D(α) |ψ sensor [41].
We will uniquely focus on the square GKP code in the remainder of this review, although most points apply with small variation to the hexagonal code.

Approximate GKP States
Any physical GKP code state will occupy a finite volume in phase space and will have a finite number of photons. In principle, an infinite number of approximations to the perfect GKP code states exist, but some are more useful than other's and here we will mention four. Ref. [37] introduced a form of approximate GKP state obtained by applying a Gaussian superposition of displacements, characterized by a 'squeezing' parameter ∆ > 0 to a perfect state: For this model wavefunction it holds thatn ≈ 1 2∆ 2 − 1 2 [37,38]. One can perform the Gaussian phase-space integral in Eq. (14) and, -neglecting contributions O(∆ 4 ) 1, see e.g. [42]-, one gets a different approximation using an operator D: The envelope operator D has approximately the same effect as the 'no loss' Kraus operator of a photon loss channel N γ , Eq. (7), with γ = 2∆ 2 . Another approximation, Figure 2: Wigner function of the state F |0 at ∆ = 0.3, and the reduced probability distributions over q and p in black. Unlike the Eand D-approximation, the Fapproximation has a clear asymmetry with respect to p and q. Since the Wigner function has a grid-like periodic structure in phase space, the GKP states are also referred to as grid states.
valid for small ∆ is The state F |0 can be interpreted as the result of preparing a squeezed state 1 π 1/4 dq exp(−q 2 /2∆ 2 ) |q to which one applies a Gaussian-enveloped coherent sum over stabilizer translations, enacting ρ → k,l∈Z e −2∆ 2 π(k 2 +l 2 ) S k p ρS −l p . The result is a state which is both an approximate eigenstate of S q (and Z) due to squeezing, as well as an approximate eigenstate of the translation S p . Note that unlike E and D, approximation F has an asymmetry in p and q. The three approximations D, E, F have been discussed and shown to fit a standard form in [43]. In addition, the normalization of these approximate forms can be computed and expressed in terms of Theta functions, see e.g. Appendix A for the D-approximation.
In Eq. (B.2) we will see a fourth, von-Mises or reverse-Villain, approximation using a cosine function to represent the periodicity in the wave-function comb. This reverse-Villain approximation has been used in [44] and [45]. All these approximate states E |0 and E |1 (or D |0 and F |0 etc.) are +1 eigenstates of the photon parity operator e iπa † a as they are invariant under q → −q and p → −p, implying that they only have support on even photon number states. In Appendix A we show how to get exact Fock state amplitudes for the approximation D |0 , -which for this purpose has the simplest form-, and this turns out to involve n-the order derivatives of theta functions. We show in Appendix A that the photon number distribution of these GKP states, as well as the sensor state, is following a thermal distribution [20] (see Fig. A1 and A2), with interesting oscillations on top.
One can propose various measures of state quality or fidelity besides the characterization of the state in terms of ∆. For example, when we measureq to infer Z on a state, all outcomes in which q is closer to an even multiple of √ π are interpreted as outcome Z = 1 and vice-versa. For a state dq ψ(q) |q , the probability for this outcome is then (18) If we apply this to the form E |0 , the error probability P(Z = −1) < 2∆ π exp(−π/4∆ 2 ). Since a perfect (homodyne) measurement ofq is practically not possible, P(Z = −1) only provides a lower bound on the logical error probability of an approximate state |0 . We can also examine the expectation value for Z on the approximate form F |0 (for simplicity) which equals and similarly 1|F † ZF|1 1|F † F|1 ≈ −e −π∆ 2 /4 , showing that the expectation decays exponentially in ∆ 2 towards 0. In the approximation in Eq. (19) we have assumed that ∆ is small enough so that the peaks at different k do not overlap, giving an easy expression for the probability distribution over q of the approximate GKP state. We further discuss the logical Z or X measurement of a GKP qubit in Section 3.2.
It has become common to describe the quality of a GKP state in terms of an amount of squeezing expressed in dB. For a regular squeezed state (squeezed along q) one has variances Var(q) = ∆ 2 2 , Var(p) = 1 2∆ 2 as the vacuum (or coherent state) has Var(q) = Var(p) = 1 2 with ∆ = e −|ξ| < 1. The convention which is used in the literature for denoting the dB of squeezing of an approximate GKP state is #dB = −10 log 10 ∆ 2 , see e.g. [42].
We can view a GKP state as being 'squeezed' in both p and q and interpret this squeezing as the extent in which the state is an eigenstate of a unitary operator such as S p or S q . Since a quantum state may not fit one of the standard GKP approximations, a measure of the effective squeezing is useful in expressing the quality of the state. Since we are interested in modular values ofq andp, it is appropriate to use the Holevo phase variance (or the variance of periodic variables such as phases used in circular statistics) to express this squeezing, i.e. one can define [41,46]: Note that this measure does not express a logical error rate, e.g. the completely mixed state inside the perfect code space has ∆ p = ∆ q = 0.

Logical Gates
An appealing feature of the GKP code is that all logical Clifford transformations are Gaussian quantum operations, realizable by optical elements [37,42] which enact linear transformations on the operatorsp andq in the Heisenberg picture. Important gates such as the CNOT and S gate do however involve twomode, respectively single-mode squeezing: the experimental realization of such squeezing transformations is typical through pumped optical non-linearities. Such elements are relatively straightforward to obtain for optical fields which travel through nonlinear χ (2) or χ (3) materials, while for superconducting devices these elements are engineered through the use of Josephson junctions. In contrast, passive linear optical elements, -beam-splitters and phase-shifters in optics language-, are readily available in circuit-QED by linear capacitive or inductive circuit couplings. In Section 3 we will discuss the engineered non-linearities in superconducting hardware which can be activated by microwave drives or activated by flux-drives, while here we discuss the logical gates for the GKP code at a formal level.
As unitary displacement operators, Z and X are not self-inverse, i.e. X = X † . On a perfect, completely shift-invariant code state X acts identically to X † , but on a finitephoton number state, see e.g. the wave function in Fig. 2, it does not: a shift to the left or right moves the envelope away from the center. The Hadamard gate has Heisenberg actionp → −q andq →p so that Hadamard gate corresponds to a phase-space rotation by an angle π/2, i.e. we can choose Had ≡ exp(i π 2 a † a), and note again that Had = Had −1 . A Had gate could be done by a quarter-cycle waiting in the self-evolution of the oscillator (so comes for free).
A disadvantage of using such quarter-cycle waiting Hadamard gate in a GKP surface code architecture is discussed in Section 4. The alternative is to use singlequbit rotations around the logical X, Y or Z axes to compose a Hadamard gate.
For the GKP code these rotations around logical axes, R P (φ) ≡ exp(−iφP/2) with logical Pauli P = X, Y, Z are not natural as the logical Pauli, which is a displacement, sits in the exponent. Note also that this gate R P (φ) is only unitary when acting on a subspace for which P 2 = I. However, one can perform R P (φ), using a controlleddisplacement coupling with a regular qubit and a regular qubit rotation, as shown in Fig. 3, and realized in [47,39]. This circuit applies R P (φ) ≡ exp(−iφP/2) on the space of states for which P 2 = I but we can examine its effect more generally. Imagine applying the circuit in Fig. 3 with P = Y and φ = π/2. Upon outcome ±, the Kraus operator action on the GKP qubit equals A + = cos(φ/2)I − i sin(φ/2)P resp. A − = −i sin(φ/2)I + cos(φ/2)P . On the perfect code subspace where P 2 = I, A + acts as a unitary and equals R P (φ), while A − can be converted to R P (φ) by the additional π-rotation P . However, on a finitely-squeezed GKP state, these Kraus operators are not unitary and their action leads to the envelope of the GKP state to be no longer centered around the vacuum. However, one can apply a displacement P −1/2 [39] to approximately re-center the GKP state.
A single-qubit gate such as the T = R Z (π/4) gate can be done in this manner as well. The S gate with action S † XS = −Y and S † ZS = Z can be realized by the transformationq →q,p →p −q corresponding to S = exp(−iq 2 /2) ¶ The S gate can thus be implemented by means of pump-activated squeezing, see Section 3, or by using an ancilla qubit as in the circuit in Fig. 3. Alternative methods for performing a T gate via magic state preparation or using a cubic phase gate V γ = exp(iγq 3 ) exist [37]. For example, one can create a +1 eigenstate of the Hadamard gate Had = exp(i π 2 a † a) by starting with a vacuum state, which is already a +1 eigenstate of Had, and measuring S p and S q without photon-number changing feedback [48].
When using GKP qubits as basic qubits in a surface code, see Section 4, we note that T and S gates are not needed for error correction: their only use is to prepare magic GKP ancilla qubits to be grown into the surface code-encoded magic states using GKP CZ and CNOT gates or parity check measurements, see e.g. [49] and references therein.
The CNOT gate can be realized by the Heisenberg actionq c →q c ,p c →p c −p t , q t →q c +q t andp t →p t . This gate is also called the SUM gate in [37] and SUM(g) with g = 1 in [42]. We see that CNOT = exp(−ip tqc ) by using Eq. (2) with v =p t and u =q c . The inverse CNOT has actionq c →q c ,p c →p c +p t ,q t →q t −q c andp t →p t .
We define the action of the CZ gate as Had t CNOT Had † t where Had t is a Hadamard gate on the target mode. That is, it enacts the transformationq t →q t ,p t →p t −q c , q c →q c ,p c →p c −q t , or CZ = exp(−iq tqc ). If either oscillator is a state where q is an even multiple of √ π, then CZ acts as exp(−iπ2k) = 1. If both oscillators are in a state where q is an odd multiple of √ π, then CZ acts as exp(−iπ(2n + 1)(2k + 1)) = −1 for n, k ∈ Z. Sections 3.3 and 3.4 will discuss how the GKP CZ gate between two GKP modes can be executed using a 3-wave or 4-wave mixing element. There is however another circuit to perform a CNOT gate which uses a sequence of beam-splitters and some single-mode squeezing [38,42] which can be more useful in some circumstances, see Fig. 4. For the CNOT gate the mode transformation on control (c) and target (t) mode equals We can use that [q 2 ,p] = 2iq and higher-order commutators are zero, leading to exp(iq 2 By the Bloch-Messiah decomposition [50] the singular value decompositions are A = U D A V † and B = U D B V T with unitary matrices U and V . For the CNOT gate the singular values are degenerate: 2 ) and D B = diag( 1 2 , 1 2 ), implying that the beam-splitting transformations U and V are not unique. Ref. [50] notes that taking 50:50 beamsplitters with with θ = 1 2 sin −1 (2/ √ 5) can be chosen (while [42] makes a different choice). We see that the single-mode squeezing represented by the diagonal matrices corresponds to a squeezer Sq(ξ) with ξ = − cosh −1 ( It is clear that logical gates are not unique as physical operations as they only have to perform the right action on the code space. Ref. [42] has discussed how logical gates propagate or amplify errors on the approximate GKP code states. Keeping the (average) number of photons in an approximate GKP state low by centering the state symmetrically around the vacuum, emerges as a good overall strategy to minimize the propagation of errors and the effect of the inaccurate action of gates.

Noise on a GKP Qubit
A simple numerically convenient noise channel, playing the role of depolarizing channel for an oscillator, is the independent Gaussian displacement channel N (ρ) with standard deviation σ 0 : Here ρ is a single-mode density matrix and P σ 0 (x) the Gaussian probability density function with mean zero and variance σ 2 0 , i.e. P σ 0 (x) = (2πσ 2 0 ) −1/2 e −x 2 /2σ 2 0 . This channel does not naturally correspond to physical sources of noise, but (1) one can convert photon loss via amplification to this channel [20], (2) one can 'displacement twirl' noise so that the effective channel is that of probabilistic mixture of displacements [51]. The exact displacement twirl is not a physical operation as it requires large displacements, so this type of modeling should be considered less justified than in the qubit Pauli case when we use a depolarizing noise model through a Pauli twirling approximation.
It is thus of interest to study how realistic noise affects the approximate GKP states beyond this toy model. We will explore the question of stochastic Gaussian displacement noise versus coherent finite-squeezing error during quantum error correction in the next Section 2.4. In this section we describe the interesting effect of photon loss on a GKP qubit using Wigner function dynamics [39], and mention some literature discussing other sources of noise.
An oscillator state undergoing photon loss at rate κ can be described, in a rotating frame at its resonant frequency, using a Lindblad equationρ = κD(a)(ρ) using the density matrix ρ. Here we assume that the thermal environment which induces this photon loss is at zero temperature, hence there are no photon gain processes. Alternatively, and conveniently, one describes this dynamics through differential equations using phase-space probability distributions such as the Wigner function. The Wigner function for the photon loss dynamics can be shown to obey a two-dimensional Fokker-Planck equation, see [7,39,52] This Fokker-Planck equation describes a process of diffusion, -a spread in the variance of the variables p and q to the vacuum noise variance equal to 1/2-, and drift, i.e. the mean values of p and q flow towards 0. Instead of considering the Wigner function dynamics, we can integrate over, say, p and consider the corresponding Fokker-Planck equation for the probability distribution P (q, t) = R dp W (q, p, t), which has the solution: In Fig. 5 we plot the effect of photon loss of a normalized state F |0 with ∆ = 0.3 for κt = 0.1, 0.5 and 1. We can consider the expectation of a stabilizer or logical Z over time, i.e. we consider Tr e iαq ρ(t) = dqP (q, t)e iαq with α = √ π or 2 √ π. Using Gaussian integration and Eq. (25) this gives Even though the state has partial support on regions where q is closer to an odd multiple of √ π, Z 0 shown on the right, is nonnegative at all times due to the large wave function peak centered at 0. Figure 6: The probability distribution P 1 (q) of the state F |1 at ∆ = 0.3 undergoing photon loss. We observe that Z 1 moves from a negative initial value to a final positive value as the state moves to the vacuum state.
with α(t) = αe −κt/2 . On the right-hand-side, we see an exponential decrease as well as a direct dependence on the expection value of a displacement operator with exponentially shrinking shift on the initial state. When the initial state ρ(0) is invariant under q → −q, we can replace Tr e iα(t)q ρ(0) by Tr cos(α(t)q)ρ(0). Thus when symmetrically centering the state in phase-space the phases of the stabilizer or logical Z never become complex. In addition, when the initial state is an approximate logical |0 such as F |0 , the expectation value Z(t) ≥ 0 at all times as shown for a few points in Fig. 5 on the right. This is interesting as it shows that |0 'never looks more like a |1 than a |0 ' under photon loss. The state F |1 whose decay is plotted in Fig. 6 starts at Z(t = 0) < 0 and eventually, for large enough t, Z(t) > 0 as the final state is the vacuum centered around q = 0. This asymmetry in its effect on |0 versus |1 is reminiscent of a logical amplitude-damping channel. Now assume that the initial state is displaced away from its centered location by, say, a stabilizer shift S m p which does not affect its initial eigenvalue for Z. Using Eq. (26) we get shows that the expection value can now become complex, but is not faster decaying in its absolute value. When m is large, we see that the additional phase changes rapidly in time, so that the expectation can rapidly change from positive to negative. However, if we know m and κ and it is the only source of noise, this phase change can be treated as a systematic error. Note that if we had applied an arbitrary but known displacement e iup on the initial state, the effect would have been similar.
Going beyond photon loss, other sources of inaccuracy and error could also readily be described using dynamics of the Wigner function. A Lindblad equation dynamics of an n-mode system for which the Hamiltonian is quadratic in creation and annihilation operators (beam-splitting, squeezing etc.) or linear (driving terms ∼ a + a † enacting displacements) while the dissipator models photon loss or photon gain, can be mapped to a Fokker-Planck equation of a general solvable form with constant 2n × 2n matrices A and D. This general behavior follows from the fact that every term in a Lindblad equation which is linear in a or a † (e.g. aρ), gives rise to a first-order derivative in the differential equation for the Wigner function (plus a term which is linear inp andq) [52,7], so that terms quadratic in a and a † (e.g. a † aρ) gives second-order derivatives. The Gaussian Green's function for Eq. (2.3.3) can be readily given, basically forming a multi-dimensional analog of Eq. (25), see [7]. All these Gaussian processes keep an initially nonnegative Wigner function nonnegative and hence simulatable by stochastic means.
On the other hand, nonlinear elements such as a self-Kerr nonlinearity −Ka † 2 a 2 lead to third-order derivatives in the differential equation for the Wigner function, as well as terms in which A is not constant (corresponding to a so-called nonlinear Fokker-Plank equation): the upshot is that the Wigner function can become negative and non-classical during the dynamics and attempts at classical stochastic simulation will suffer from the sign problem. As an example, Ref. [53] discusses Wigner function dynamics for a single oscillator in the presence of a self-Kerr nonlinearity and dissipation.
Dephasing, meaning the application of a rotation e iθa † a with unknown θ is a possible error mechanism as it rotates the quadraturesp andq into each other. Dephasing can come about, for example, from an interplay of photon loss and a Kerr nonlinearity, or a fluctuating mode frequency. In a simple stochastic model the angle θ is drawn from a distribution P(θ) with mean θ = 0 and some moments θ k . For small higher-order moments θ k 1 for k > 2, we can expand This is a dephasing channel which corresponds to the dynamics of a Lindblad equatioṅ ρ = κ deph D(a † a)ρ for a short time with κ deph t = θ 2 1. The fixed point of this equation is any mixture of Fock states |n n|; when the initial state is n c n |n the channel maps it onto n |c n | 2 |n n|. In Appendix A we evaluate the photon number distribution of such fully-dephased D |0 and D |1 . We prove that the photon number distribution is asymptotically thermal, independent of the logical state. Hence complete dephasing seems to wash out much of distinction between the two logical GKP states.
Ref. [20] has discussed the detrimental effect of a Kerr nonlinearity on a variety of single-mode bosonic codes. Numerical simulations of several sources of inaccuries on GKP state preparation using an ancilla qubit were also discussed in e.g. [41,47,54,39,42] using Lindblad equation dynamics.

Repeated GKP Error Correction and Decoding: Finite Squeezing
In this Section we examine the effect of (coherent) finite-squeezing errors on repeated GKP error correction using GKP ancilla's. This is follow-up work from Ref. [45] in which a similar problem was examined using a stochastic Gaussian displacement error model, Eq. (24), applied to GKP ancilla and data qubits as a proxy for finite-squeezing errors. The goal of this Section is to understand whether there are crucial differences between finite-squeezing coherent errors and the Gaussian displacement error model and try to develop a dedicated, computationally-efficient, decoder with good performance.
The dynamics to be analyzed is the repeated execution of the quantum circuit in Fig. 7 on a single GKP input state F(∆) |ψ for m = 1, . . . , M cycles. We remark that a variant of such 'Steane error correction' exists: in [55] the authors observed that applying a beam-splitter between GKP ancilla and GKP data qubit followed by squeezing on the GKP data qubit, can also perform error correction. Ref. [56] has analyzed the repeated execution of this variant of error correction in more detail.
A clear difference between a stochastic error model and the finite-squeezing model is that in the former entropy build-up is possible, while in the latter the state conditioned on the measurement outcomes in Fig. 7 is pure at all times. One can invoke displacement twirling as a method to convert a coherent noise model in which one applies a superposition of displacements to a stochastic mixture of displacements. For example, displacement twirling a finitely-squeezed state E |ψ with some ∆ gives a perfect state |ψ subject to the Gaussian Displacement Channel with ∆ 2 = 2σ 2 0 [45]. After such stochastification of the noise on a GKP ancilla, one can then represent the feedback error (a shift in one of the quadratures) induced by the ancilla in the circuit in Fig. 7 effectively as an incoming stochastic shift error on the data qubit. The stochastic shift of the other quadrature of the ancilla then causes a measurement error of the same strength. On this basis Ref. [45] stochastically modeled finite-squeezing errors as incoming stochastic displacement errors on the GKP data qubit and measurement errors. Another difference in the models is that in the finite-squeezing error model the measurement outcomes carry non-modular information about the measured quadrature. This can be exploited to recenter the state by choosing a corrective displacement immediately after a single round of error correction, while such corrective displacements would have no effect in the stochastic model. Figure 7: A single round of fault-tolerant GKP error correction for both logical X and Z errors. Each measurement is a perfect homodyne measurement ofq orp with outcomes q and p respectively. The finitely-squeezed ancilla states are modeled as approximate GKP states, using a slightly-different small-∆ approximation than E, D, F in Eqs. (14), (15) and Eq. (17), given in Eq. (B.2) and denoted as F V . We show that the optional corrective displacement (dashed box) keeps the state at low average photon number n in Fig. 9.  Fig. 7 (without corrective displacements). We see an enhancement in ∆ q (resp. ∆ p ) during Z-error correction which lowers ∆ p (resp. X-error correction which lowers ∆ q ) due to the feedback error induced by the finitelysqueezed GKP ancilla.
We will represent the GKP wave-function in the q-basis with ψ m (q), the wavefunction after m rounds of error correction. We will sometimes omit the normalization of states when these normalizations play no role. We will analyze the time evolution without the corrective displacement in the dashed box in Fig. 9. A single round of quantum error correction shown in Fig. 7 with measurement outcomes p m , q m Here, n f eedback denotes the photon number evolution where each EC-circuit as in Fig. 7 is followed by a corrective displacement D( −q+ip √ 2 ). The data is obtained by averaging over N = 10 4 trajectories. One observes that the average photon number stays low, corresponding to a centering of the state. Note the large outcome-dependent photon number fluctuations when we do not apply the corrective displacement.
gives ψ m (q) = dq G(q ← q |q m , p m )ψ m−1 (q ) with Green's function To understand this Green's function, observe that in the limit ∆ → 0, the wavefunction ψ + (q) has uniform support on q = k √ π, with k ∈ Z, so that the outgoing wave function is supported solely on q = q m + k √ π, hence the code state sits in the perfect code space with a known shift on top. However, before this, the interaction with the imperfect |0 ancilla for the Z-error correction applies a convolution to the incoming wavefunction. If the ancilla is perfect (∆ → 0), this convolution amounts to applying superpositions of stabilizer shifts S k p with 2k √ π = q − q , each with a phase which depends on p m , on the incoming wave-function. If we assume that all wavefunctions are of the form F |0 , i.e. sums of Gaussians, the convolution leads again to a sum of Gaussians and can be exactly evaluated, that is, one has What we observe is that the convolution broadens the peaks and they acquire phases which depend on the location of the peaks and the outcome p m . The convolution step, which corrects shifts in p, thus introduces a feedback error in the form of peak broadening for the q variable.
In Fig. 8 we plot the effective squeezing parameters ∆ p , ∆ q of the state, Eq. (20), after rounds of Z and X-error correction.
Represented as a state evolution, the stabilizer measurements of S p and S q with outcomes q m , p m effectively map an incoming state |φ to ψ + (p + p m ) |φ and ψ + (q − q m ) |φ . The finite envelope of the approximate GKP states has the effect that the outgoing states are dominantly supported aroundp = −p m ,q = q m . We understand this gain of non-modular information in one quadrature as a reflection of the loss of modular information (i.e. in terms of the increase in ∆ q/p ) in its conjugate. In Fig. 9 we observe, that a displacement about α = −qm+ipm √ 2 indeed contributes to a recentering of the state.
What is noteworthy about the error correcting dynamics in Eq. (29) and Eq. (30) is that the support of the outgoing wavefunction lies within the support of the incoming wavefunction (ψ 0 (q)) plus the support of the ancilla wavefunction (here also ψ 0 (q)) since Supp(f g) ⊂ Supp(f ) + Supp(g) for a convolution of two functions f and g. The Xerror correction step multiplies the convoluted wavefunction by ψ + (q −q m ) which cannot extend its support. Within its support the outgoing wavefunction can have changed amplitudes, depending on the outcomes p m and q m . These arguments are relevant as the GKP state F |0 has support which is concentrated around even multiples of √ π overlapping by an amount exponentially suppressed in ∆ 2 with the support of F |1 . When one uses code states ψ 0 (q) and ψ 1 (q) whose supports have negligible overlap, -which is the case for F |0 and F |1 for sufficiently small ∆-, it implies that, no matter what the measurement outcomes, 0 will largely remain a 0 and 1 will largely remain a 1 in the error correction rounds.
It implies that the picture of stochastic error feedback by using a finite-squeezed |+ ancilla is inadequate as in this picture the support of the wavefunction gets shifted around by such feedback error, while here we observe that instead amplitudes get changed within the support. We see some of this behavior in the performance of the maximum likelihood decoder versus a passive decoder in Fig. 10: a passive decoder which decides that 0 stays a 0 does surprisingly well, which may be understandable if we consider that the support of the ψ 0 (q) wavefunction can never grow by QEC (of course, in principle, the wave functions have support everywhere albeit exponentiallysuppressed with 1/∆ 2 ).
We have formulated a classical 'forward' decoder, see details in Appendix B, and compared its performance with an optimal, density matrix decoding method (maximum likelihood decoding) as well as a so-called passive decoder, see the numerical results without active displacement feedback in Fig. 10. The passive decoder functions as an important sanity check: this decoder throws away all the measurement data, including the final measurement and simply always decides that the outcome is 0 when the input state to the rounds of error correction is an approximate GKP 0 state. Since we don't necessarily know the input state, we note that this decoder is of little practical value. By comparing the performance of this decoder with the MLD decoder we learn to what extent the quantum error correction circuits are preserving quantum information irrespective of the measurement outcomes and to what extent the measurement outcomes provide the proper logical information for correction. This decoder is similar to the passive decoder in the stochastic model of [45] in which none of the error information in each round is used and only the last perfect measurement determine whether the state is identified as 0 or 1. However, in [45] this passive decoder clearly performed worse for these values for ∆. We compare the decoder performance in Fig. 11 to a 'memoryless' decoder using the measurement of the current QEC round for immediate logical feedback, see Appendix  Figure 11: Logical error rate P for decoding the circuit with active feedback as in Fig. 7 We observe an overall improvement in the logical error rate as compared to the performance of the circuits without immediate feedback.  [45]. We compare the logical error rates using two different mappings from ∆ to σ 0 , neither is entirely satisfactory.
B, clearly showing worse performance with this strategy. In Fig. 11 we show the better performance of the MLD and the (feedback-adapted) forward decoder using active feedback which minimizes photon number. We compare their performance with a 'parity' decoder which similarly applies the corrective displacement, but then applies a final logical correction (or not) when the sum of all applied shifts is closer to an odd (or even) multiple of √ π. We observe that for small ∆ and number of EC rounds M ≤ 3, the passive decoder performs comparably to the MLD decoder, consistent with the intuition given earlier in this section. At larger M it performs worse at ∆ = 0.3. Figures 10 and 11 show the average logical error rates obtained from N = 5 · 10 4 samples for all decoders. We note that there are large fluctuations of O(10 −2 ) − O(10 −1 ) in the data which make it hard to draw very definitive conclusions. The stochastic simulations in Ref. [45] used double, namely N = 10 5 trajectories, but the fluctuations in the logical error rate were notably lower.
Last, to study the difference between coherent and stochastic errors, we plot P stoch MLD (σ 0 ) and P MLD (∆) using the identification ∆ 2 = 2σ 2 0 and ∆ 2 = σ 2 0 as a function of M and ∆, see Fig. 12. The simulation for the MLD decoder based on a stochastic error model is implemented following [45]. We observe, that the conversion ∆ 2 = 2σ 2 0 underestimates the logical error, while it is overestimated for ∆ 2 = σ 2 0 . Again the large fluctuations in the logical error rates make it hard to draw very strong conclusions. The simulation and data are accessible at https://github.com/JonCYeh/GKP_EC_Sim.

Circuit-QED Realizations of GKP Qubit Components
In this Section we review and discuss schemes for state preparation, logical gates and quantum error correction for GKP qubits in circuit-QED. In circuit-QED a natural candidate for a bosonic GKP encoding is a resonant mode of a 3D microwave cavity, having low loss rate. Multiple GKP qubits are then stored in multiple low-loss 3D cavities: an engineering platform for multiple coupled cavities,-multi-layer microwaveintegrated quantum circuits (MMIQC) [57]-, is under development.
The coupling between cavity modes of different cavities can be mediated by dipolar 'antenna' couplings between the electric field of the cavity mode and that of an inserted planar chip in the cavity wall hosting a coupler mode. The idea is then to activate twomode gates such as the CZ gate by applying microwave drives or flux-modulation outside of the 3D cavities to drive the coupler mode. As a simple circuit example one can take the electric circuit in Fig. 13 in which the two LC oscillators correspond to the cavity modes: two superconducting islands, each protruding into one cavity and coupling to the electric field of the resonant mode, are connected by a Josephson junction (so-called bridge configuration): such setup can generate the Φ 4 -interaction for a CZ gate as will be described in Section 3.4, while more involved circuits could be used to engineer an effective Φ 3 -interaction for the same purpose, see Section 3.3.

Coupling with Regular Qubits
To prepare a logical GKP state or to realize single-qubit gates, one can employ an interaction with a regular qubit in which the state of regular qubit controls the application of a displacement on the GKP mode as in Fig. 1. If the regular qubit is realized by an anharmonic oscillator such as a transmon qubit, this then requires the engineering of a tunable qubit controlled-displacement interaction of the form b † bq θ witĥ q θ = cos(θ)q + sin(θ)p so that b † b acts as Pauli Z in the regular qubit subspace. Figure 13: Electric circuit representing a typical capacitive coupling between two LC oscillators, modeling cavity modes, and a (possibly flux-tunable) transmon qubit.
As mentioned earlier, a common interaction between an off-resonantly coupled transmon qubit and cavity mode is the dispersive or cross-Kerr interaction of the form −χa † ab † b → −χa † aZ/2. This interaction realizes a qubit controlled-rotation which can be converted, in principle, to a qubit controlled-displacement, using additional displacements and qubit-flips as follows. Since e iθa † aZ D(α)e −iθa † aZ = D(αe iθZ ), choosing θ = π/2 gives the displacement D(α) when Z = 1 and the displacement D(−α) when Z = −1. Thus this sequence of gates does what is needed. To realize e iθa † aZ , one can simply conjugate the interaction e −iθa † aZ by π-bit-flips on the qubit, so that we can do all 3 gates in the decomposition of D(αe iθZ ).
Note that the strength of the controlled-displacement α only depends on the strength of the uncontrolled displacement (which can easily be made very strong in O(10) ns). If the entire controlled-displacement is to be done in, say 50 ns, it requires two rotations each with time t = π/χ = 20 ns, or χ 2π = 25 Mhz. This realization thus requires making χ tunable, i.e. when the transmon qubit is to be measured or prepared, it is important that the cross-Kerr interaction be 'off', as it induces rotations on the GKP grid state which dephase the state in the Fock basis. However, there is a limit to the on-off ratio of χ obtained by flux-tuning of the transmon qubit, in particular if χ is flux-tuned to be stronger, then the resonator becomes more anharmonic as well, see [58] and Eq. (36).
Instead of needing a tunable interaction, Ref. [39] realized a qubit controlleddisplacement in 1.2 µs, using a very weak dispersive coupling χ 2π = 28 kHz between transmon qubit and cavity mode. This weak coupling obviates the need for a tunable interaction, but would make for a very slow gate when using the method described in the previous paragraph. The idea in [39] is to realize a qubit controlled-displacement by temporally displacing the cavity mode to states with |β| 2 = 320 photons, so that even a small qubit-induced cavity rotation can have a large effect. The scheme is best understood by using a displacement frame, see Appendix C.1, on a cavity-driven dispersive shift Hamiltonian H = − χ 2 a † aZ + iE(t)a † − iE * (t)a. The displacement frame shift shows that the dynamics is due to an effective Hamiltoniañ where β(t) = t 0 dt E(t ). The effect of the last term in this Hamiltonian after a time T (taking β = β * for simplicity) is the qubit controlled-displacement In order to cancel the qubit controlled-rotation (first term) the qubit state is flipped midway in the interval T , requiring that the cavity displacement direction is also inverted midway, i.e. β → −β. We see that in this realization the applied displacement power and the dispersive shift χ together determine the strength of the qubit controlled-displacement.
We can ask how to improve on the execution of the qubit controlled-displacement gate and the subsequent qubit measurement, where improvement means a faster as well as more reliable execution. As for the realization in [39] one may worry that the large displacements of the state in phase-space during the execution of the gate lead to errors on the GKP state. Even though the cavity has a long lifetime (single-photon life-time T = 245 µs in [39]), the logical operator of a displaced GKP state takes on a complex oscillatory value in time due to photon loss, see Eq. (27). It is desirable to shorten the duration of the transmon qubit measurement (700 ns in [39]), but it is hard to make the dispersive read-out of the qubit via a read-out resonator very fast. For example, the measurement pulse followed by active read-out resonator depletion is O(600) ns in [59] and O(250) ns (including resonator occupancy) in [60]. Replacing the qubit measurement by feedback and disentangling [40] requiring other controlleddisplacements can only lead to a shorter overall preparation time if the duration of such controlled-displacement can be shortened from what was achieved in [39]. Note that the replacement of measurement by coherent interactions could also be done for the GKP qubit rotation in Fig. 3.
Instead of a transmon qubit as ancilla, one may consider a different qubit such as fluxonium [61], again dispersively coupled to the 3D cavity mode. Advantages of a fluxonium qubit are its long coherence and larger anharmonicity leading to lower leakage [62], equally fast-single qubit gate operations (O(10) ns) as well as potentially very fast and powerful qubit measurement, see e.g. the GrAl-based fluxonium qubit in [63,64,65]. In addition, flux-tuning fluxonium may give a strongly-tunable dispersive shift χ [61,66], without the unwanted side-effect of strengthening the cavity anharmonicity.
Another proposal is to use a noise-biased cat qubit to measure the stabilizer displacements of a GKP qubit [18] using a tunable beam-splitter interaction between the two cavity modes of the form H = g(t)ab † + g * (t)a † b, as argued in Section 2.2. To use the interaction, we thus imagine first preparing the cat qubit in |C + α (by starting in the vacuum state |0 and turning on the pump), then activate the tunable beam-splitter, and measure the noise-biased cat in the |C ± α basis or employ feedback and disentangling via qubit controlled-displacements [40].

Logical GKP Measurement
How does one determine whether a GKP state is |0 or |1 , that is, realize a logical Zmeasurement? Such logical measurement may be completely destructive, but is desired to have high-fidelity, hence be fault-tolerant in its implementation, meaning that the outcome is insensitive to imperfections in the state. Even though one can measure a logical displacement, i.e. X, Z or Y , using a single ancilla qubit as was done in Refs. [47,39], such measurement has an intrinsic probability of error on an approximate GKP state. For example, measuring Z on |0 does not give outcome 1, since the state is not a perfect eigenstate of Z but obeys Eq. (19). If we assume that this measurement circuit is otherwise perfect and is applied to F |b with b = 0, 1, the probability for the ancilla qubit to be measured as ± equals P(±|b) = 1 4) ), using Eq. (19). The upshot is that the ancilla measurement is flipped with symmetric error probability q = 1 2 (1 − e −π∆ 2 /4) ) which goes to 0 when ∆ → 0. At, say, ∆ = 0.3, this readout error probability q is about 3.4% and much larger than the probability for an incorrect Z-outcome through the ideal homodyne measurement given in Eq. (18). Some repetition of the controlled-displacement circuit with the ancilla qubit and taking a majority vote of the answers could bring down the error probability q at the price of more time and possibly additional feedback error.
A target for future work could be to achieve an improved logical GKP qubit measurement by releasing the GKP state from a superconducting cavity via a switchrelease mechanism [67] (taking O(1) µs in time in [67]) into a transmission line and then enact phase-sensitive amplification (e.g. squeezing) so as to measure one quadrature, sayq, with no further added noise. After calibration of the measurement using ∆squeezed displaced states and their targeted measurement outcomes, the measurement could proceed by determining whether the amplified signal corresponds to aq which is closer to an even ( → outcome 0) or odd multiple (→ outcome 1) of √ π. Photon loss in this process may be expected to be a dominant source of noise. To get an estimate of the error rate in the presence of photon loss, we can compute Eq. (18) for a state at ∆ = 0.3 undergoing photon loss as in Eq. 25 with κt = 0.1 (so that a coherent state loses 1 − exp(−κt) ≈ 10% of its intensity), giving P(Z = 1|F |0 , κt = 0.1) = 99.5%. For κt = 0.5 this measurement success probability is already down to 80%.

GKP CZ Gate via Three-Wave Mixing
In this Section we describe how one could realize the CZ interaction between two GKP modes via a 3-wave mixing element which is activated by applying a (strong) microwave pump tone to a coupler mode, see e.g. [68]. An example of pure 3-wave mixing used for broadband parametric amplification is the Josephson-ring modulator circuit [69,70].
An example of the use of parametrically-activated 3-wave mixing is the experiment in [71]: flux-modulation through a coupling Josephson junction (instead of microwave driving) is used to activate a a † 2 b coupling between a logical (co-planar microwave) resonator whose state is to be manipulated and an ancilla (co-planar microwave) resonator.
For simplicity, we here assume that the following non-degenerate 3-wave mixing Hamiltonian is available: Here a, b are annihilation operators for two GKP oscillators while c is the annhilation of the pump oscillator. We assume that all frequencies ω a , ω b and ω c are sufficiently detuned, so that the χ (2) interaction between the modes will approximately time-average away in the rotating wave approximation (RWA) in the absence of any active driving, see the discussion in Appendix C. Moving to the rotating frame of the GKP oscillators we havẽ Since there are two time-dependencies involved, we can make all χ (2) -interactions resonant by driving the pump mode c with a two-tone drive, namely at ω p = ω a + ω b and ω p = ω a − ω b . Both pump tones will need to be of equal amplitude to get equal contributions from beam-splitting (a † b + ab † ) as well as two-mode squeezing (ab + a † b † ). Assuming that the pump mode is a (fairly) harmonic mode which can be strongly driven, we replace the operator c by its classical time-dependent expectation value c(t) = E e −i(ωa+ω b )t + e −i(ωa−ω b )t with, say, E ∈ R. Making a rotating-waveapproximation, Appendix C, gives the generating interaction of the CZ gate between modes a and b: Changing the phase of the pump tone (E) allows one to realize CZ −1 . If we want to do a GKP CNOT gate via two-tone pump, we cannot start with the interaction in Eq. (32), but a Hadamard or single-qubit R Y (π/2) would be required to convertq →p. Instead of applying two simultaneous pump tones to get a CZ (and with extra rotations, a CNOT), we could also decompose the CNOT circuit as in Fig. 4, i.e. a sequence of beam-splitters and single-mode squeezers. When we drive the pump mode at the difference frequency of the modes, ω p = ω a − ω b in the nondegenerate 3-wave mixing Hamiltonian in Eq. (32), we realize a beam-splitter interaction as the pump photon assists in converting one mode-a photon to a mode-b photon.
Single-mode squeezing can be activated by using a degenerate version of the 3-wave mixing elementq aqbqc in Eq. (32) with a Hamiltonian proportional toq 2 aq c (or similarlŷ q 2 bq c ). By applying a pump tone at frequency ω p = 2ω a , one activates a squeezing Hamiltonian H sq on mode a: we down-convert one pump photon into two mode-a photons and vice-versa.
For superconducting devices the only native non-linear circuit element that we have at our disposal are Josephson junctions which, -in their simplest use, without externally applied fluxes-, realize a U (Φ) = −E J cos(2πΦ/Φ 0 ) potential interaction. Here the flux variable Φ can be expanded as a linear combination of theq of bosonic modes which participate in the junction + . Usually, if E J is large ( E C ), we expand this cosine potential around its potential minimum Φ = 0, obtaining only interactions which are symmetric under Φ → −Φ such as Φ 4 (while absorbing the Φ 2 -terms in the quadratic part of the Hamiltonian).
It is clear from the discussion above that it would be desirable to engineer a Φ 3interaction where 2πΦ/Φ 0 = αq a + βq b + γq c . When the three modes have sufficiently different frequencies, we can observe that all terms in this Φ 3 -interaction, except those proportional to a † a, b † b or c † c, average out in time, hence the interaction is 'off' in the absence of active driving of one of the modes, not inducing any nonlinearity on the modes in this off-state. At the same time, by choosing the pump mode drive frequencies appropriately, we can activate, with the same interaction element, either a squeezer for mode a, a squeezer for mode b, a beam-splitter between modes a and b or/and a two-mode squeezer between modes a and b.
Besides the Josephson ring modulator, another 3-wave mixing element, called a SNAIL, has been proposed in [74]: it uses a superconducting (SQUID-like) loop containing an asymmetric array of a few Josephson junctions and external flux is applied through the loop. The effective potential induced by this SNAIL is of the where Φ is the flux variable expanded around its potential minimum, determined by the external flux Φ ext .
A recent paper [75] discusses the circuit-QED engineering required to realize a universal set of gates for continuous-variable computation using GKP states. By fluxmodulating the SNAIL, one can activate some of the terms in the 3-wave mixing Hamiltonian in Eq. (32), mimicking the effect of microwave driving of the pump mode. The authors in [75] then use this activation to show for example how to realize an interactionq 3 , required to enact the cubic phase gate V γ = e iγq 3 .
Another use of a Φ 3 -coupling for GKP state preparation has been proposed in [46]. In this paper the aim is to produce a tunable opto-mechanical coupling of the form b † bq GKP between a GKP mode and (harmonic) ancilla mode (b) which is initially prepared in a coherent state. Such coupling can be used to prepare the GKP mode into a logical state starting from a vacuum state, similar as the preparation via regular qubits discussed in Section 3.1. The idea here is that the frequency of the ancilla oscillator + A side comment: in circuit-QED we cannot passively get interactions where some cosine potential depends on theqs of some subset of modes and another cosine potential depends on thep of a subset of modes, since all the variables in the circuit Lagrangian which enter the potential energy, -such as the Josephson junction cosine potential energy-, commute when promoted to quantum operators, i.e. they will never be conjugated variables. The upshot of this is that it is very hard (see an attempt at [72]) to entirely passively Hamiltonian engineer, say, the toric code checks on a collection of bosonic modes as the essential property of such stabilizer checks is that they either act asq andp on a single mode. An exception would be the simultaneous use of Josephson junctions elements (cos(2πΦ/Φ 0 )) and so-called phase-slip elements (cos(πQ/e)) for conjugate variables flux Φ and charge Q. Ref. [73] shows that one can passively engineer an effective GKP Hamiltonian using a gyrator element ΦQ.
is shifted depending on the value forq of the GKP mode, leading to aq-dependent rotation of the coherent state of the ancilla mode. When the interaction time is chosen so that all q = k √ π for k ∈ Z lead to the same rotation of the coherent state, measuring the coherent amplitude realizes an approximate modular measurement ofq, resolving the value of q mod √ π. Such modular measurement ofq is equivalent to measuring the eigenvalue phases of S q = exp(2i √ πq). A possible advantage of this method over the coupling with regular qubits is that one gets more information per ancilla mode measurement than 1 bit. In this proposal an externally-applied flux is modulated around a value for which there is an effective third-order Φ 3 -coupling between the two oscillators while the Φ 4 -coupling vanishes at this flux setting. Choosing the GKP oscillator at much lower frequency (∼ 0.5Ghz) than the ancilla oscillator (∼ 10Ghz) creates an asymmetry so that a term like b † bq GKP dominates in the Φ 3 -interaction and the term is made resonant via flux modulation.

Use of Four-Wave Mixing?
We comment on the use of a Φ 4 -interaction for realizing the GKP CZ gate. The setup we have in mind is modeled by the electric circuit in Fig. 13. Applying circuitquantization to this circuit leads to a Hamiltonian with three active modes. Due to the coupling between the LC oscillators, each described as a single mode, some hybridization will happen between the bare cavity modes and the transmon coupler mode, and so we will associate annihilation operators a,b and c with these dressed modes. Due to this hybridization the three dressed modes with annihilation operators a, b and c will partake in the Josephson junction. This means that for the flux-variable operatorΦ across the Josephson-junction branch, we can write 2πΦ/Φ 0 = αq a + βq b + γq c with dimensionless α, β, γ modeling the participation of the effective modes in the Josephson junction [58]. Expanding the cosine potential up to fourth-order, and diagonalizing the linear interactions of the Hamiltonian (quadratic in creation and annihilation operators) thus gives rise to three dressed eigenmodes at frequenciesω a ,ω b andω c , and we have the Hamiltonian: As in the discussion on 3-wave mixing we assume that all frequenciesω a ,ω b ,ω c are sufficiently different (detuned). If there is no active driving (or flux-modulation), a full RWA approximation, whose accuracy depends on the amount of detuning, will leave only energy-conserving self-Kerr and cross-Kerr terms. In other words, in the off-state, the Hamiltonian is approximately where χ ii = 2 √ χ ii χ i i [58].

24
(and similarly for ω b and ω c ) due to rewriting the excitation-conserving terms inq 4 a as a quadratic term ∝ a † a and the self-Kerr term ∝ (a † a) 2 and χ aa = E J α 4 12 . Here we clearly see the advantage of a pure three-wave mixing element over a four-wave mixing element: in the off-state the fourwave mixing element induces unwanted Kerr and cross-Kerr anharmoniticies on the GKP storage modes a and b.
In the off-state, mode c is (ideally) in its vacuum state, hence the cross-Kerr interaction with this mode does not contribute. However, if this mode were driven these corrections are relevant and they induce additional cavity rotations. Let us now indeed discuss the effect of applying a drive on mode c. For this, we expand the fourth-order term in Eq. (35) which becomes We can apply a two-tone drive on mode c at frequency ωa+ω b 2 + h.c) in Eq. (37) and going to the rotating frame of all modes a and b, we find that a term likeq aqbq 2 c leads to a time-independent resonant term proportional toq aqb . This can be seen as follows. First, note that the signal q c (t) 2 only contains frequencies ω a , ω b , ω a + ω b and ω a − ω b , all of equal strength. The frequency ω a + ω b matches twomode squeezing (ab + h.c), while ω a − ω b matches beamsplitting (ab † + h.c). Besides this, we throw out all time-dependent terms (RWA). In particular we have • Terms without anyq c are only leading to self-Kerr and cross-Kerr for modes a and b. • Terms with a singleq c orq 3 c are not frequency-matched to become time-independent (as ω a and ω b are sufficiently different).
• Termsq 4 c leads to self-Kerr for mode c. • Terms withq 2 c lead to cross-Kerr between modes c and a or c and b. Note that from a term such asq 2 aq 2 c there is a contribution proportional to E 2 a † a (and similarlyq 2 bq 2 c We could also realize the CNOT gate using the beam-splitter and squeezer sequence in Fig. 4, i.e. we chose a single-tone pump at ω p = (ω a − ω b )/2 for the beam-splitter to let two pump photons assist in converting one mode-a photon to one mode-b photon. Single-mode squeezing can be realized by using the interactionq 2 aq 2 c in Eq. (37), i.e. we should take ω p = ω a so that two pump photons are converted into two mode-a photons. However, note that this also make the unwanted interaction b † b(a † c + ac † ) (which comes from theq 2 bq aqc term) resonant, which makes this scheme unattractive. An important parameter measuring the quality of the CZ gate via this Φ 4interaction is the relative strength of the unwanted Kerr and cross-Kerr terms versus the strength of the two-tone pump-activated wanted interactionsq aqb . In part, this relies on the error contributions due to the rotating wave approximation which should be better quantified theoretically (Appendix C). Another contributing factor is the relative strength of the participation parameters α, β, γ and the pump strength, namely, without error contribution from the RWA, one has Hence, a large γ 2 /αβ is desirable but a large γ also makes mode c more anharmonic as χ cc ∝ γ 4 and this again severley restricts the pump power E. These conflicting constraints may make this scheme less suitable in practice.

Prospects for a GKP-Surface Code Architecture
In this section we would like to provide a perspective of what it would take to build a surface code architecture based on GKP qubits, point out the challenges in this approach, as well as contrast it with existing efforts to engineer a similar architecture using transmon qubits [76], see Section 4.1.
We can partially use the results in Ref. [45] as a starting point for such GKP-surface code architecture. In this code architecture, there are two layers of protection. On the one hand, each GKP qubit is either stabilized or error corrected individually, reducing a continuous set of (displacement) errors to a mostly discrete set of GKP qubit Pauli errors. On the other hand, the surface code layer is there to suppress the logical error rate of a GKP qubit to values which decrease exponentially with the side length of the surface code lattice.
In Ref. [45] GKP error correction takes place with ancilla GKP modes using the circuits in Fig. 7. Note that these circuits can also be implemented via CZ gates, but will then require Hadamard or R Y (π/2) rotations on the GKP data mode. Interspersed with this GKP error correction, parity checks of the surface code, shown in Fig. 15, are to be measured in QEC cycles. These circuits are similar as for a regular surface code, except that the underlying qubits are GKP qubits encoded in oscillators, see Fig. 14 for a Z-check. We use the fact that GKP logical operators are not selfinverse as displacements, -and as displacements they obey [X 1 X 2 , Z −1 1 Z 2 ] = 0-, to measure checks which mutually commute on the entire oscillator space * . In this Figure the measurement of the GKP ancilla is shown as the release and amplification of the cavity state, followed by a quadrature measurement, as discussed in Section 3.2. Such measurement would give useful analog information, but the usefulness of this analog information is challenged by losing photons in the step, nor has it been experimentally realized. * One expects worse error behavior when one measures non-commuting checks outside the ideal code space; for example, this also happens when regular qubits leak [77]. In addition, the maximum likelihood decoding analysis in [45] relies on this choice to map onto compact-QED model with proper lattice versions of rotations and divergences.
We call this GKP-surface code architecture All-GKP-Ancilla. This set-up would require each GKP qubit oscillator to have CZ capability with 5 other GKP oscillators, namely 4 GKP ancilla oscillators for the surface code and 1 GKP ancilla oscillator for its own error correction.
Ref. [45] used a model of Gaussian stochastic displacement noise, Eq. (24), as an effective, numerically-simulatable, error model for this architecture. The noise channel acts in the different locations: (1) on each GKP qubit prior to GKP error correction and a round of surface-code parity check measurements, (2) prior to the homodyne measurement in Fig. 7 and (3) prior to the homodyne measurement of the ancilla GKP in the surface code check. Taking the standard deviations from these Gaussian channels to be equal, a threshold standard deviation σ 0c ≈ 0.243 for the toric code was found. Note that this model includes all sources of errors, including finite squeezing and feedback errors, albeit stochastically. Using the conversion ∆ 2 = 2σ 2 0 , this gives a threshold of ∆ = 0.34 or 9.3 dB, but the data in Fig. 12 show this conversion is somewhat too optimistic: using squeezed states with ∆ gives an error rate which is somewhat worse than a stochastic model with σ 0 = ∆/ √ 2, so a worst-case threshold estimate using ∆ 2 = σ 2 0 would be 12.3 dB. Ref. [78] considered a variation on this stochastic noise model, -explictly including error feedback-, and applied this noise to a concatenation of the GKP qubit with the surface code. Both [45] and [78] used minimum weight matching decoders to find thresholds. Ref. [45] identified the defects and the distance function between them following the associated compact-QED model closely in order to approach exact minimum-weight decoding. Different from this, Ref. [78] identified the positions where the surface code check outcomes change as defects, but altered the distance function between these defects based on GKP error information.
Release and Amp.p Reprep Figure 14: A Z-check parity measurement for the surface-GKP code in Fig. 15, on oscillators labelled NE-SW (Northeast to Southwest of ancilla qubit) using CZ and CZ −1 gates as defined in Section 2.3.2. |+ is an approximate +1 eigenstate of X and the GKP stabilizers. Release and amplification (Amp) followed by measurement of the quadraturep is a way to do a logical X measurement and Reprep is a unit standing for the repreparation of the GKP ancilla state.
We add two more observations about this scheme. First, when we use stabilizer error correction, such as surface code error correction, on bosonic codes, we need to implement parity check operators which sometimes act like a logical X on a bosonic qubit, and sometimes like a logical Z. For GKP qubits this translates into the ability to perform CZ gates as well as CNOT gates. For standard (transmon) qubits, the switch between CZ and CNOT is easily achieved by applying a layer of Hadamard gates between a parity X-cycle and a parity Z-cycle. For a GKP qubit encoded into an oscillator with frequency f , such Hadamard gate seems simple: it constitutes waiting for time t = 1/(4f ). But since all data qubits have to undergo this Hadamard gate, it implies that the resonant frequencies of the data qubit oscillators (resonant modes of identical 3D cavities) should all be identical, which seems like a narrow target to aim at (although the difference between a simulation-based predicted 3D cavity frequency and the measured frequency can be less than 0.1% [1]).
As alternative to the Hadamard gate one can use R Y (π/2) gate and R Y (−π/2) gate, using a regular qubit as in Fig. 3, to toggle back and forth between Z and X error corrrection, but it costs a lot more hassle and time than doing a R Y (π/2) on a transmon qubit in O(10) ns. A second observation is that the use of parametricallydriven 3-wave or 4-wave mixing as discussed in Sections 3.3 and 3.4 could allow for the simultaneous execution of the CZ and CZ −1 gates needed to do a surface-code parity check measurement, as the activation of the CZ or (CZ −1 ) gate only requires the application of a pump tone to the coupler between each data oscillator and ancilla oscillator (4 couplers in total). The coupling strength of these CZ couplers may not be equally strong, hence the duration of these four pump drives can vary, but an advantage of only driving the coupler mode (instead of the GKP mode) is to enable the simultaneous execution of these commuting gates. Another way of looking at the simultaneously-executed parity check is to observe that a green Z-check in Fig. 15 on oscillator NE, SE, NW, SW corresponds tô An interaction Hamiltonian H G = −q AĜ applied for time t has the effect that p A →p A −Ĝt, using Eq. (2) (while for all data oscillators i = NE, SE, NW, SW participating in the check, we havep i →p i ± tq A ). Taking t = 1, we see that by measuring the ancilla quadraturep A , we measureĜ modulo even multiples of √ π (as |+ has sharp peaks at p A being even multiples of √ π). Thus, if one of the oscillators undergoes a √ π shift in q, the measurement will detect this. Besides the Steane error correction in Fig. 7, one can also imagine a more hardwareefficient form of GKP error correction via stabilization using a regular qubit as discussed in Section 3.1, regularly interspersed with parity check measurements for the surface code which, for example, do use a GKP ancilla. The advantage here is that one does not need to prepare and couple the ancillary GKP qubit as in Fig. 7 (which again requires a regular qubit). In particular, (ancilla) GKP state preparation is time-consuming (60 µs in [39]) due to requiring slow controlled-displacement gates and slow qubit measurement, and during this process photon loss is affecting the GKP qubit. At the same time, we keep the GKP ancilla for the possibly-less frequent surface code QEC cycle in order to get still analog error information. We refer to this intermediate scheme as Only-SurfaceCode-GKP-Ancilla.
Another choice is to use regular qubits to extract both GKP and surface code error information, see the circuits in Fig. 16. We refer to this scheme as All-Regular-Qubit-Ancilla. An advantage of this scheme is that no Hadamards or R Y (π/2) rotations are needed on GKP modes and tunable controlled-displacement gates are used throughout. Figure 16: A single round of error correction for an X-check (left) and a Z-check (right) on the GKP data oscillators using regular qubit ancillas. The interactions between ancilla qubit and GKP oscillators are all tunable qubit controlled-displacement interactions.
The All-Regular-Qubit-Ancilla architecture might however be less tolerant towards errors: it might be hard to get below threshold for the surface code, when all error information is obtained through qubits, giving 1 bit of information at the time. We can provide arguments for this by using a simple error model in which we assume that GKP error correction generates an effective phenomenological error model in each surface code QEC cycle and we assume that the surface code QEC cycle is otherwise perfect. We model the effect of GKP error correction as stabilizing an approximate GKP qubit of the form, say, F |b at some ∆, besides having a logical error b → ¬b on top with probability p. Effectively then, the approximate GKP code states coming into a perfect surface-code parity check circuit as in Fig. 16 will flip the regular qubit ancilla with some effective error probability q which depends on ∆. We thus map our error model onto a known (phenonomenological) surface code error model in which there is an incoming error with probability p in each QEC round and a measurement error with probability q(∆). For this model, Fig. 3 in Ref. [79] shows the numerically-found below-threshold region and for q = p the threshold is optimally 3.3% [80]. Ref. [79] does not investigate the below-threshold region and its shape for low p, but it certainly lies within the belowthreshold region for the repetition code which is conjectured to have a below-threshold region given by H 2 (p) + H 2 (q) ≤ 1 with H 2 (p) the binary entropy [81].
The frequency of doing the surface code error correction could also be adapted to the logical decay rate of the stabilized GKP state, e.g. 275 µs in [39], so that the logical qubit error rate between surface code QEC cycles is at least less than 3.3%. It is an open question how to analyze the noise threshold for the All-Regular-Qubit-Ancilla architecture for a more elaborate error model.
In this All-Regular-Qubit-Ancilla architecture the workhorse is the controlleddisplacement gate with the regular qubit and the regular qubit preparation and measurement. The desiderata for these regular qubits are clearly (1) ability to enact a fast and accurate tunable controlled-displacement with a 3D cavity mode, (2) low leakage to higher excited states, (3) long T 1 and T 2 , beyond 100 µs, and (4) fast measurement below O(100) ns, (5) fast preparation of |+ and single-qubit gates (O(10) ns). At first sight, this seems like a wishlist for any good qubit, however it is not necessary to have a high-quality two-qubit gate between these qubits, which is a nontrivial component for the surface code with transmon qubits. Furthermore, the frequency of the 3D cavity GKP modes can be taken to be far different than those of the ancillary qubits and their coupled read-out resonators, possibly leading to easier frequency control and less frequency crowding than in an architecture with only one type of device qubit such as the surface code with transmon qubits [82,76].

Comparison: Fock Qubit Surface Code and Transmon Qubit Surface Code
Given that we imagine using high-Q 3D cavities for qubit storage, we can ask how to compare a GKP encoding with a simple Fock encoding in a surface code architecture, omitting any additional error correction. CZ gates between a 3D-cavity encoded Fock qubit (mode a) and an ancilla transmon qubit can be realized by a dispersive coupling −χa † aZ/2, allowing for the execution of the Z-check measurement. Similar as for the controlled-displacement in the GKP encoding, tunability of this interaction, for example, by using an intermediate frequency-tunable resonator to vary the coupling strength, is important. This type of parity check, using 1 transmon qubit to read out 4 Fock qubits, is the reverse of using one bosonic mode to read out the parity of four coupled transmon qubits as realized in Ref. [83]. For the X-check measurement one requires a CNOT gate with transmon qubit as target, which can be realized by performing a CZ followed and preceded by Hadamard or R Y (π/2). Again, similar as in the GKP encoding, these simple single-qubit gates require the use of an ancillary qubit, but arbitrary cavity manipulations through such coupled transmon qubit have been demonstrated in [84], albeit of rather long, O(1) µs, duration, and having some, inevitable, leakage towards the state |2 or higher.
An engineering effort for making a surface code architecture using 2D transmon qubits is underway at e.g. Google, IBM, TU Delft and ETH Zürich. Besides using an optimized decoder [85], the crucial numbers which determine whether such architecture will be 'below threshold' are the quality, leakage [77], time-duration and cross-talk of the two-qubit gate and the duration (and cross-talk) of the qubit measurement versus the dephasing and relaxation time of the qubits. Flux-tunable transmon qubits have recently achieved very good numbers for their two-qubit gates: Ref. [86] reports on a 99.1% CZ fidelity of 40ns duration and low leakage 0.1%, while Google's supremacy experiments [87] have shown the performance of ISWAP-like two-qubit gates on a 54qubit Sycamore chip with an average error rate of 0.62% and duration O(10) ns. It seems hard to push the transmon performance numbers, including measurement duration beyond their current values. It is hard to push the T 2 and T 1 times of a flux-tunable transmon qubit beyond 100 µs due to flux noise, while an enhanced T 1 would also lead to an enhanced duration of leakage. Frequency crowding and limits on highly-accurate frequency targeting, in particular for non-flux tunable transmons, leading to spurious cross-talk couplings is another challenge in realizing the surface code.
We thus believe that there is plenty of room and, in fact, necessity for developing an alternative surface code architecture in which a data qubit, such as a Fock or GKP qubit, is encoded in a very harmonic mode of high-Q (3D) cavity, while transmon qubits or their next-generation versions such as fluxonium or noise-biased cat qubits, are used as ancilla qubits. If a pump-activated CZ or controlled-displacement gate has high on/off ratio, one expects that spurious couplings between the 3D cavity data modes, due to common coupling to the ancilla qubits, would be well suppressed.

Acknowledgements
We

A. Fock State Representation of GKP Grid States
In this Appendix we examine the Fock coefficients of approximate GKP states, using the D-approximation, and the sensor grid state [41]. We study the asymptotic behavior of these Fock coefficients, showing that the photon number distribution trends along a geometric or thermal distribution.
The Theta function with rational parameters (a, b) ∈ Q 2 , adopting the notations of [88,43], is given by where (z, τ ) ∈ C and Im(τ ) > 0 ensuring absolute convergence of the series. Some common shorthands are the following The multidimensional generalization with rational vectors ( a, b) ∈ (Q m ) 2 is given by where z ∈ C m is a complex vector, Ω ∈ C m×m is a complex matrix and Im (Ω) is positive definite which ensures the absolute convergence of the series. First, it is important to note that one can properly normalize the approximate GKP state |j approx , using the D-approximation defined in Eq. (15), for the two logical states j = 0, 1: and 0 and 1 are the all-zeros and all-ones vectors respectively. Note that there is a bijection between ∆ ∈ (0, ∞) and u ∈ [0, 1) so we write either N (∆, j) or N (u, j) depending on convenience.
In order to obtain this expression one can use the position representation of Fock states in terms of Hermite functions Ψ n (or Hermite polynomials H n ), and the so-called Mehler's Hermite polynomial formula: It is possible to rewrite Eq. (A.8) in the form given in Ref. [43] N (u, j) = 1 This expression can be recovered using 15) and the modular transformation We can turn to the Fock coefficients c n (j) = n|j approx = 1 As the parity of the Hermite function Ψ n is that of the parity of n, the coefficients c n (j) vanish for odd n. We can use that for q ≥ 0, see [89], This implies that we have This gives a somewhat concise expression although it is not directly useful. In order to numerically evaluate the coefficients for example, Eq. (A.17) is more convenient as the Hermite functions, Ψ n (q), have support essentially within [−2 √ n, 2 √ n] and are easily computed recursively using This is used together with Eq. A.13 to plot the coefficients in Fig. A1.  The case of the Fock representation of the so-called sensor state [41] which is an approximate eigenstate of e i √ 2πq and e i √ 2πp is very similar. For the perfect sensor state |ψ ideal the wavefunction in q is a sum over δ-functions at integer multiples of √ 2π, and similarly the wavefunction in p is a sum over δ-functions at integer multiples of √ 2π. The approximate state is |ψ approx = where u and Ω are the same as defined in Eqs. (A.9) and (A.10). This can be obtained in a similar way and we can also write N (u) instead of N (∆) when convenient. The sensor state |ψ approx is an eigenstate of exp(iπa † a/2), hence the photon number is 0 mod 4 for this state [41]. We now have c n = n| ψ approx with c 2n+1 = 0 and Ref. [89] derives an expression analogous to Eq. (A. 19) for the sum on the right-handside: 128π 4 , with Γ() the Euler gamma function, and {d(n)} ∞ n=0 is a particular integer sequence studied in [89] which is directly related to the derivatives of θ 3 (x) at We show the Fock coefficients for the sensor state in Fig. A2. Note that the sign of the integers d(n) is the same as the sign of c 4n through Eqs. A. 23   We want now to derive the asymptotic behavior of the coefficients. This can be done by expressing the normalization of the states by an equality which has to hold for all ∆ ∈ (0, ∞) or equivalently all u ∈ [0, 1). In the case of the sensor state, n∈N |c 4n | 2 = 1 implies: which is of the form: where the constant A and the sequence a 4n , both independent of u, are defined by Eq. (A.27). Similarly for the approximate GKP states we have: which is of the same form: for some other constant B(j) and sequence b 2n (j), also both independent of u, defined by Eq. (A.29). Seen as complex functions of u ∈ C, N (u, j) and N (u) are both analytic and have a convergence radius of 1. Their behavior for u → 1 can be obtained (see also [43]) using Eq. (A.13) and the fact that This gives We then apply a transfer theorem, see [90], to deduce the asymptotic behavior of the a 4n and b 2n (j) sequences. More precisely they both converge to some finite value .
(A. 34) In turn this gives the asymptotic behavior of the Fock coefficients: 35) In both cases the coefficients are asymptotically equivalent to a geometric or thermal distribution which is usually parametrized by the average photon number n as follows p thermal n = 1 n + 1 n n + 1 n . (A.36) We can therefore deduce the average photon number of the equivalent asymptotic thermal distribution to the approximate GKP and sensor states using (A.37) These thermal distributions for the different ∆s considered are also shown in Figs. A1 and A2. One can see a good agreement of the general trend although oscillations of the order of the probabilities themselves persist. Note also that for small ∆, n ∼ 1 2∆ 2 , consistent with approximate expressions derived in other literature (see main text).

B. Decoders For Repeated GKP Error Correction With Finite Squeezing
For computational, simulation efficiency (as well as our formulation of a classical decoder) we use the approximation F V for the GKP ancillas. This form of the states can be viewed as applying a reverse Villain approximation to the approximate F-states in Eq. (17). The reverse Villain approximation, which is tight for a ≥ 4π 2 , reads n∈Z e −an 2 +bn ≈ e − a 2π 2 e b 2 4a + a 2π 2 cos ( π a b) . (B.1) Using Eq. (B.1) and ∆ 2 ∆ 4 +1 ≈ ∆ 2 , we then have Using these ancillas, the Green's function for M rounds of error correction (without active feedback between rounds), with outcomes denoted as M -dimensional vectors p and q, can be written as with one-dimensional 'action' S[ q| q, p]: We can readily interpret the first line in the last equation as a kinetic energy term T , generating dynamics in the position variable, while the second line is a potential energy term −U , pinning the position variable to the measured values q m . We note that in the potential energy, the cos 2 √ π(q m − q m ) term is dominant when ∆ is small and hence we can omit the wide parabolic potential proportional to ∆ 2 . Similarly, for the kinetic energy, when ∆ is small, expanding cos √ π(q m − q m−1 ) gives heavy-mass terms with mass ∼ 1 ∆ 2 and the light-mass quadratic term ∼ ∆ 2 contributes little. We see that aside from the imaginary term on the last line, the action in this approximation is the same as in the Hamiltonian developed for stochastic noise in the reverse Villain approximation in Ref. [45] identifying σ 2 0 = ∆ 2 with σ 0 the standard deviation in the Gaussian displacement model in Eq. (24). Note however that here the dynamics occurs at the level of wave functions, whereas the description in Ref. [45] took place at the level of the probabilities (for shift errors). The interesting difference lies in the pure imaginary term making the action S complex. If the path integral is approximated by taking a single 'classical' path which minimized Re(S) then this phase factor only gives an additional phase to this path. However, the pure imaginary term contributes a phase to each path so that the total sum of paths q in → q out can be different, due ito interference, than the case in which these phases are absent. Note that these terms comes from the Z-error correction step with outcome p m which puts a feedback error on the data oscillator.
We formulate a decoder which will provide an approximate tracking of the dynamics of the wave function in q and p of the encoded state which is classical. This approximation can be viewed, to some extent, as making a classical approximation, i.e. selecting a single optimal classical path of a quantum path integral. We believe that similar ideas could be applied to simplified tracking of the Wigner function for the purpose of decoding: this may be of interest when we want to study the effects of a fuller noise model, which includes, say, photon loss on the cavity mode during ancilla GKP state preparation and photon loss on the ancilla prior to measurement.
The use of a classical approximation to this path-integral is to determine the outgoing wave function without fully calculating the entire evolution by executing the One can also define a semi-classical approximation by expanding around the classical path and performing the Gaussian integral, but we have not numerically explored these variations on decoding.
M − 1-dimensional integral. In this classical approximation we would have ψ out (q out ) ≈ dq in exp(−S[ q opt | q, p])ψ in (q in ). (B.5) We imagine doing a final measurement ofq on the outgoing wave function ψ out and will be interested in evaluating Eq. (18). For the classical approximation we have where we have explicitly indicated how the optimal path depends on the initial position q in and the final position q. We note that the action in Eq. where N is a normalization to make P class () a probability. Since this normalization does not play a role in the use of this expression in decoding, we do not need to determine it. To evaluate Eq. (B.7), one can generate a q uniformly at random in the interval I b and q in is generated according to |ψ in | 2 . Given q and q in , if we can evaluate the classical path q opt between these points and hence compute the weight e −2Re(S[ qopt(q←q in )| q, p]) corresponding to this path, then we can stochastically estimate Eq. (B.7). However, as was observed in Ref. [45] this classical-path approach is not computationally simple as the dynamics of the q-variable can be chaotic due to it taking place in a random potential induced by the measurement outcomes q, p. Hence, instead of sampling q in and the endpoint q, we will sample q in from |ψ in (q in )| 2 and then apply a forward minimization technique, similar as in Ref. [45], on the function Re(S)[.] to find an approximately optimal path q opt given q in , leading to a final value for q out . We then determine whether q out lies in I b and repeat to gather statistics. In fact, we observe numerically that this strategy shows the same performance as fixing an initial q in = argmax q |ψ in (q)| 2 to which we apply the forward minimization. The probability to land in the I b interval is denoted as P forward (b|ψ in , q, p).
To adapt this forward decoder to the corrective displacement in each round as shown in Fig. 7, the action in Eq. (B.4) is modified by substituting q m → q m + q m for each round (and the addition of a term iq m p m which is irrelevant for this decoder).
The memoryless decoder presented in Fig. 10 in the main text is implemented by decomposing each measurement outcome q m = l q m √ π + n q m 2 √ π + e q m and p m = l p m √ π + n p m 2 √ π + e p m with l q/p m ∈ Z 2 (yes/no logical shift) n q/p m ∈ Z (number of stabilizer shifts), |e q/p m | < √ π 2 (minimal shift error) and applying a corrective shift of α m = −δq,m+iδp,m √ 2 with δ q/p,m = n q/p m 2 √ π + e q/p m after the QEC round. Note that this is not the same correction as in Fig. 7. This correction is motivated by a stochastic-shift error model and ensures that we keep the photon number low by applying an appropiate number of stabilizer displacements while correcting the perceived error.
Let us now further discuss how we test and evaluate our decoders.

Decoding of Repeated GKP Error Correction
The experimentalist implementing the repeated rounds of error correction learns the value of p, q and the outcome of the final perfect homodyne measurement ofq, but in general she should make decoding decisions without knowing the input state † †. However, she can play the game in which she assumes that the input wave state is either ψ 0 in (q) (or ψ 1 in (q)) and then determine whether the dynamics, -given her measurement data q, p-, would lead to a read-out of Z = 1 or Z = −1. Naturally this may not a deterministic process, so she can calculate the probabilities P(0|ψ 0 in , q, p) (Z = 1) and P(1|ψ 0 in , q, p) (Z = −1). This defines a maximumlikelihood full-density matrix decoder. When using this decoder, the experimentalist evaluates the (normalized) Green's function in Eq. (B.3), given her measurement data. She then flips the final experimentally measured logical outcome bit (modeled as a perfect homodyne measurement ofq) whenever P(1|ψ 0 in , q, p) > P(0|ψ 0 in , q, p). As logical error rate estimate for this decoder we take P MLD = d q d p P( q, p) min P(0|ψ 0 in , q, p), P(1|ψ 0 in , q, p) .

(B.8)
This estimate assumes that the error rate is (roughly) the same when the input wavefunction is ψ 1 in (q) and that a superposition of such inputs behaves similarly, without logical interference.
For systems of many modes, this decoding method will not typically be efficient (even for a single mode it comes down to a sizable computational effort), hence the goal of decoding is to infer errors without tracking the entire wavefunction. As we argued before, it turns out to be most computationally efficient to use a forward strategy and estimate P forward (0|ψ 0 in , q, p). With this probability in hand, let f ( q, p) = argmin b (P forward (b|ψ 0 in , q, p)) be the indicator bit whether to flip.The logical error estimate that we consider is then P forward = d q d p P( q, p) δ f ( q, p),0 P(1|ψ 0 in , q, p) + δ f ( q, p),1 P(0|ψ 0 in , q, p) . (B.9) Again, in using only this estimate, it is assumed (and not a priori given) that the evolution of the input state ψ 1 in (q) or an arbitrary superposition will have the same logical error rate, and that assuming a different input state would not lead the decoder † † The reason is that this state may be part of a complicated quantum computation and we assume that we cannot simulate this quantum computation.
to different conclusions. The logical error probability for the passive decoder which does not use any syndrome measurement data, but declares that output state is the same as input, is defined as P passive = d q d p P( q, p)P(1|ψ 0 in , q, p). (B.10) Thus, similar as in Eq. (B.10), the logical error is determined by the probability to get outcome 1 in the final q measurement. We have numerically estimated the logical errors rates of these three decoders, given in Eqs. (B.8, B.9, B.10), as a function of M for ∆ = 0.3 and ∆ = 0.4, see Fig. 10 in the main text. The logical error rates for adapted versions of these decoders including corrective displacements after each round are displayed in Fig. 11.

C. Hamiltonian Engineering Via Rotating Wave Approximations
The goal of this Appendix is to discuss the underpinnings and the 'beyond' of the commonly-invoked rotating wave approximation of a Hamiltonian of the form H = H 0 + V with First, in the absence of applying time-dependent drives, the physical basis for an RWA approximation of a Φ k -term can be motivated in at least two ways. One is to move to a rotating frame in which the Hamiltonian remains time-independent, but is a form amenable to Schrieffer-Wolff degenerate perturbation theory so that terms which either (1) contain an unequal number of creation and annihilation operators, and/or (2) contain an equal number of creation and annihilation operators of modes which are sufficiently far detuned, are seen as perturbations to detuned Fock energy levels. For example, in Eq. (C.1), to argue about the perturbative effect of terms which contain an unequal number of creation and annihilation operators, we observe that they act as off-diagonal elements in the Fock basis of H 0 , changing the number of total excitations. Hence if view H 0 as block-diagonal with blocks formed by a given total number of excitations in either mode a, b or c, separated by a gap min(ω a , ω b , ω c ), then the effect of such excitation-number changing terms can by examined by Schrieffer-Wolff perturbation theory. In lowest-order perturbation theory, one projects V onto these blocks labeled by the number of excitations, so that off-diagonal terms have no effect, Kerr and cross-Kerr terms remain, as well as excitation-number preserving terms (e.g. a † b † c 2 ). Given the GHz frequencies of the ω i s versus the relative strength of Josephson-induced coupling, keeping things to this lowest-order is a good approximation and we replace V by V eff which omit these terms which do not preserve the number of excitations. This strictly speaking is the rotating-wave-approximation. As a next approximation, to handle terms which do not preserve excitation number in any of the particular modes, we go to the rotating frame at ω c for all three modes so thatH 0 = ∆ ac (a † a + 1 2 ) + ∆ bc (b † b + 1 2 ) with ∆ ic = ω i − ω c andṼ eff = V eff . We imagine thatṼ eff is expanded in terms which are products of creation and annihilation operators of the modes. The HamiltonianH 0 has energy eigenspaces |x a ⊗ |y b ⊗ |ψ c with Fock states x, y = 0, 1, . . ., each of which has the degeneracy of the oscillator (c) space, and each space is at least min(|∆ ba |, |∆ ca |, |∆ bc |) away from another space. The perturbationṼ eff has both diagonal parts with respect to these eigenspaces (e.g. Kerr and cross-Kerr) as well as off-diagonal terms which map between the spaces. In lowest-order perturbation theory, one again projectsṼ eff onto these eigenspaces, so that off-diagonal terms have no effect and Kerr and cross-Kerr terms remain. This is then the full (RWA) approximation. To go beyond this, the effect of the off-diagonal terms can be estimated in second or higher-order perturbation theory to obtain an effective Hamiltonian H eff which is diagonal in the Fock basis using Schrieffer-Wolff perturbation theory (see e.g. [91,92,93,94]). The spectrum of H eff approximates that of H when we are in the perturbative regime with min(∆ ij ) ||V eff ||/2. In second and higherorder perturbation theory, i.e. beyond RWA, U H eff U † provides an approximation for H where the unitary U is the perturbatively expanded Schrieffer-Wolff transformation which provides a correction to the Fock eigenbasis. To get an approximation to the original Hamiltonian, one would finally rotate back to the original frame.
An alternative analysis could be based on the Magnus expansion: to apply this, we move to a rotating frame for each mode at its own frequency for the Hamiltonian in Eq. (C.1) such that the previously mentioned off-diagonal terms become rapidly time-dependent: as a consequence their effect averages out over sufficiently long times. To observe the dynamics in this rotating frame, we consider the Schrödinger equation for the state |ψ(t) = U † 0 (t) |ψ(t) with U 0 = exp(−iH 0 t), while |ψ(t) obeys the Schrödinger equation with H. The state |ψ(t) will evolve according to the rotatingframe (or interaction-frame) HamiltonianH = U † 0 HU 0 + i dU † 0 dt U 0 . For example, when For time-dependent Hamiltonians, the Magnus expansion [95] or Magnus-Taylor expansion [96] then forms a convenient representation of the effective dynamics. while for k ≥ 2 one gets increasingly higher-order commutators [95]. For the Hamiltonian in Eq. (C.1), clearly terms diagonal in the Fock basis, are time-independent and will be present in H (1) (T ). As an example of a rotating term, consider a simple timedependent HamiltonianṼ (t) = A exp(i∆t) + h.c where A is some product of creation and annihilation of some modes and ∆ is a detuning. For this Hamiltonian, the strength of this lowest-order term decays inversely with ∆, i.e.
We thus see that both methods give us perturbative expansions whose validity depends on the strength of the perturbative parameter.
Let us now discuss the case when we actively drive one of the modes, say mode c. As before one can apply a RWA making dropping terms which do not preserve total excitation number to obtain an effective V eff . If we assume that the only effect of the drive term is to create a coherent state with c(t) = Ee −iωpt in oscillator c, we can remove ω c c † c from H and replace c by c(t) everywhere. Given this time-dependence it then seems more convenient to use a Magnus expansion to analyze the effect of the higher-order effects of the non-resonant terms. For this we move to a rotating frame for the oscillators a and b at their own frequency, such that depending on the choice of the drive frequency ω p some terms become time-independent. For example, a term like a † bc is time-independent for the choice ω p = ω a − ω b .
A more thorough quantitative analysis of the error induced by the RWA approximation through perturbative or Magnus expansions would be desirable, as it plays into the accuracy of the CZ gate and is influenced by the number of photons in a GKP mode (as the latter influences the strength of the perturbation).

C.1. Time-Dependent Displacement Frame
Imagine a Hamiltonian (in a rotating frame of the mode a) of the form H = H 1 (a, a † ) + iE(t)a † − iE * (t)a where E(t) is some time-dependent envelope of the drive and H 1 (a, a † ) has some functional form on a and a † . We consider the time-evolution of the vector |ψ(t) = U † 0 (0, t) |ψ(t) with U 0 (0, t) = T exp(−i t 0 dt [iE(t )a † − iE * (t )a]) which evolves with the HamiltonianH(t) = U † 0 HU 0 + i ∂U † 0 ∂t U 0 . In words, we consider the evolution in the time-dependent displacement frame given by the time-dependent drive. When after some final time T , the total frame evolution is U 0 (0, T ) = I, the timeindependent HamiltonianH(t) will describe the time-evolution of the actual Schrödinger state |ψ(t) over the entire period of time T . We can write U 0 (0, t) = D(β(t)) where β(t) = t 0 dt E(t ), so that we evaluateH(t) = H 1 (a + β, a † + β * ) + iE(t)β * − iE * (t)β and the last term can be omitted as it give rise to an irrevelant phase.