Quantum proportional-integral (PI) control

Feedback control is an essential component of many modern technologies and provides a key capability for emergent quantum technologies. We extend existing approaches of direct feedback control, in which the controller applies a function directly proportional to the output signal (P feedback), to strategies in which the feedback is determined by an integrated output signal (I feedback), and to strategies in which the feedback consists of a combination of P and I terms. The latter quantum PI feedback constitutes the analog of the widely used proportional-integral feedback of classical control. All of these strategies are experimentally feasible and require no complex state estimation. We apply the resulting formalism to two canonical quantum feedback control problems, namely, stabilization of a harmonic oscillator under thermal noise, and generation of an entangled state of two remote qubits under conditions of arbitrary measurement efficiency. These two problems allow analysis of the relative benefits of P, I, and PI feedback control. We find that P feedback shows the best performance for harmonic oscillator state stabilization when actuation of both position and momentum feedback is possible. When only actuation of position is available, I feedback consistently shows the best performance, although a feedback delay is shown to improve the performance of a P strategy here. In contrast, for two-qubit remote entanglement generation we find that when the measurement efficiency is less than one, the best strategy can be a combined PI strategy.

Feedback control is particularly important for applications such as error correction, cooling, and stabilization of quantum systems. Feedback becomes most interesting when the control signals can be applied to a quantum system at timescales that are comparable to the timescale of the measurement. In this case, one must model the effects of intrinsic time evolution, measurement (including quantum backaction) and feedback control all at the same time, which results in interesting and complex dynamics. This typically leads to a description in terms of a continuous-in-time stochastic dynamical equation for the density matrix of the quantum system. The simplest type of feedback, in which the feedback operation is directly proportional to the measurement signal at the same time, leads to Markovian evolution of the system [26,27]. This proportional feedback (often termed 'direct' feedback) has been applied in theoretical analysis of many problems including state stabilization and cooling [28][29][30], quantum error correction [31][32][33], state purification [34,35] and generation of entangled states [36][37][38][39], and has also been experimentally demonstrated [40][41][42][43].
Recent work has extended quantum feedback control beyond proportional feedback to implementations based on estimation of the quantum state [44][45][46], implementations using stochastic noise sources [47], and to implementations using the most general form of feedback that does not include a time-delayed proportional term [48]. In the latter framework, referred to as Proportional and Quantum State Estimation (PaQS) feedback, the feedback operator can equivalently be expressed as a sum of independent deterministic and stochastic contributions. This approach has also been extended to multiple measurement and feedback operators [39]. In several instances, locally optimal feedback laws have been derived [6,7,48-52], with global optimality being shown in a smaller number of cases [50][51][52].
As is the case for complex classical systems, the implementation of advanced, and particularly of optimal, feedback control solutions can be challenging, due to instrumentation and computation demands. Therefore, it is important to also develop heuristic control solutions in the quantum domain. In this paper we adapt one of the most widely used classical control heuristics, proportional-integral (PI) feedback control [53], to the quantum domain. In the classical domain both P and PI feedback are subsets of proportional-integral-derivative (PID) control, which includes options for modulating the feedback signal with both integrals and derivatives of the measurement signal, in addition to simple multiples of it. In classical PID control, the feedback signal is proportional to the function

u(t) = α_p e(t) + α_i ∫_0^t e(s) ds + α_d de(t)/dt,  (1)

where e(t) is an error signal that is usually derived from the measurement at time t, and α_p, α_i, α_d are real coefficients that dictate the relative weights of the proportional, integral and derivative information, respectively, in forming the control law at any time. These weights are usually tuned empirically to achieve good control performance, since their optimal values cannot be computed a priori except for very simple systems. Intuitively, the integral portion is used to compensate for unused parts of the measurement signal at earlier times: integration can increase the signal-to-noise ratio, can decrease the amount of time it takes to reach the steady state, and can decrease overshoot of the desired set point. The third component of Eq. (1), derivative control, can increase the stability of a result by suppressing slow deviations away from the desired target; here the derivative attempts to anticipate the direction of change in the error. While PID control is not known to be optimal in any general setting, it has proven to be a very useful framework for formulating heuristic control laws in practice [53].
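As a concrete point of reference, the classical law of Eq. (1) can be sketched as a discrete-time update. This is an illustrative sketch only; the function and coefficient names are our own and not taken from any specific control library.

```python
import numpy as np

def pid_control(error, dt, alpha_p=1.0, alpha_i=0.1, alpha_d=0.0):
    """Discrete-time PID law: u_n = a_p e_n + a_i * sum_k e_k dt + a_d (e_n - e_{n-1}) / dt."""
    u = np.empty_like(np.asarray(error, dtype=float))
    integral = 0.0
    prev_e = error[0]          # so the derivative term starts at zero
    for n, e in enumerate(error):
        integral += e * dt                      # running integral of the error
        derivative = (e - prev_e) / dt          # backward-difference derivative
        u[n] = alpha_p * e + alpha_i * integral + alpha_d * derivative
        prev_e = e
    return u
```

For a constant error the proportional part stays fixed while the integral part grows linearly, which is the windup behavior that motivates the finite-memory kernels used later in the text.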
In this paper we address the extension of the first two components of PID control to the quantum domain, formulating a quantum PI feedback law and analyzing the relative benefits of quantum PI, I, and P feedback in two canonical problems for quantum control, namely state stabilization of the harmonic oscillator in an external environment, and generation of entanglement between remote qubits using local Hamiltonians and non-local measurements. In contrast to some earlier studies of these systems [6,37,51], our feedback implementations for these problems do not require any state estimation and only rely on simple integrals of the measured signal. We allow for a time delay in the implementation of P feedback, as originally proposed by Wiseman [27].
A time delay between obtaining the measurement signal and implementing a feedback operation reflects common experimental constraints and is often regarded as being detrimental to proportional feedback [54,55]. However we shall see that in the case of state stabilization of the harmonic oscillator, a time delay introduces additional flexibility of feedback that can be beneficial when the feedback control operations are restricted. We also examine the robustness of P feedback with respect to uncertainties in the time delay, in particular, to increases in the time delay beyond the ideal values for each protocol.
In general, our findings for these two classes of implementations show that adding an integral component to quantum feedback control can be useful in some but not all settings. This is different from the classical setting where adding an integral component to feedback control is almost universally beneficial [53]. The different behavior of quantum systems can be rationalized by recalling a key difference between quantum and classical settings, which is the unavoidable presence of stochastic measurement noise in quantum systems. In classical systems measurement noise can be minimized and even sometimes eliminated. However for quantum systems, any information gain from a measurement necessarily comes at the cost of added noise on the system. The proportional component of feedback can be very effective at minimizing the impact of this added noise. In special cases, including the harmonic oscillator state stabilization with both position and momentum controls [6] and entanglement generation for two qubits with unit efficiency measurements [51], P feedback can be used to cancel the measurement noise. However, when this is not possible or when there are additional noise sources, we find that I feedback, or a combination of I and P feedback, can be more effective than P feedback.
We note that rigorous analysis of a quantum version of full PID control within the input-output analysis of controlled quantum stochastic evolutions has been recently presented by Gough [56,57].
In the current study of practical implementations, we do not investigate the full PID control in the quantum setting because the singular nature of the quantum measurement record makes the derivative terms ill-behaved and thus not useful for practical control implementations without further modifications.
The remainder of the paper is organized as follows. Sec. II introduces notation and presents the general formalism.

[FIG. 1 caption fragment: These two signals are then additively combined and then used to condition actuation of the quantum system by an operator F.]

II. FORMALISM
In this section, we will develop the formalism for a quantum system under continuous-in-time measurement (e.g., homodyne detection) and PI feedback control. Fig. 1 shows a block diagram of the feedback system that we aim to model. We define ρ to be the state of the system, H the intrinsic Hamiltonian, c the variable-strength measurement operator, and η the measurement efficiency. We will set ℏ = 1 throughout the paper.
The dynamics of the system conditioned on the measurement record, but without feedback control, is described by the following Itô stochastic master equation (SME) [58]:

dρ = −i[H, ρ] dt + D[c]ρ dt + √η H[c]ρ dW(t),  (2)

where dW(t) are Wiener increments (Gaussian-distributed random variables with mean zero and autocorrelation E{dW(s)dW(t)} = δ(t − s)dt). The superoperators D and H in this equation are

D[c]ρ = cρc† − (c†cρ + ρc†c)/2,
H[c]ρ = cρ + ρc† − tr[(c + c†)ρ] ρ.

The corresponding measurement current can be written as [58]

j(t) = √η tr[(c + c†)ρ(t)] + ξ(t),  (3)

where ξ(t) ≡ dW/dt is a white noise process. To emphasize the link between the measurement current and the conditional state evolution, the last term in Eq. (2) is sometimes written as √η H[c]ρ (j(t) − √η tr[(c + c†)ρ]) dt. Before adding the feedback, we first define the error signal, by analogy with classical PID control, as

e(t) = j(t) − g(t),  (4)

where g(t) is the setpoint or goal. This is often the desired value of the observable ⟨c + c†⟩(t) but could also be another target function. g(t) is assumed to be a smoothly varying or constant function. Then the PI feedback operator in the quantum setting takes the form

H_fb(t) = [α_p(t) e(t − τ_P) + α_i(t) J(t)] F,  (5)

with some Hermitian operator F. Here α_p(t) and α_i(t) are time-dependent proportional and integral coefficients, respectively. This differs from classical PID control, where the control coefficients are time-independent. Here, we will allow for time-dependence that is deterministic and independent of the measurement current, although in the following we will drop the time index on these coefficients for conciseness unless we wish to emphasize the time-dependence. We have also included the freedom of having a time delay τ_P > 0 in the proportional component. While this is often viewed as an experimental constraint on the implementation of quantum feedback control protocols that is detrimental to performance [54,55], we shall see below that for the harmonic oscillator state stabilization problem it can be used constructively to improve performance (subsection III B).
J(t) is the integrated error signal,

J(t) = ∫_{t−τ_I}^{t} ds w(t, s) e(s),  (6)

where w is a smooth integration kernel that can be used to vary the contribution of the measurement current at past times, and τ_I is the integration time. We shall assume the kernels are L² integrable and normalize them such that ∫_{t−τ_I}^{t} ds w(t, s) = 1. Time-homogeneous kernels depend only on the time separation, w(t, s) → w(t − s). Typically, w(t, s) decays with t − s and puts decreasing weight on measurement results from further in the past.
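For a discretized measurement record, the windowed integral defining J(t) can be evaluated directly. The following sketch uses a truncated exponential kernel; the kernel choice, decay time, and all names are illustrative assumptions rather than a prescription from the text.

```python
import numpy as np

def integrated_error(e_record, dt, tau_i, tau_c):
    """Discrete J(t): weighted sum of the error record over the last tau_i,
    with kernel w(u) = exp(-u / tau_c), normalized so that sum(w) * dt = 1."""
    n_win = int(round(tau_i / dt))
    window = np.asarray(e_record[-n_win:], dtype=float)  # most recent samples, oldest first
    lags = dt * np.arange(len(window))[::-1]             # lag of each sample behind time t
    w = np.exp(-lags / tau_c)
    w /= w.sum() * dt                                    # enforce the normalization of the text
    return float(np.sum(w * window) * dt)
```

With this normalization a constant error record e(s) = e0 gives J = e0, so the integral channel reproduces the setpoint offset without rescaling it.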
The action of this PI feedback only is captured by the following dynamics of the system density matrix ρ(t):

ρ̇(t) = −i [α_p(t) e(t − τ_P) + α_i(t) J(t)] [F, ρ(t)].  (7)

We now combine Eqs. (2) and (7) to derive the SME for evolution under measurements and the PI feedback, using the general formalism developed in Ref. [27] and its extension to smoothed feedback signals in Refs. [27,59]. For convenience we define the commutator superoperator F× as

F× ρ ≡ −i[F, ρ].  (8)

The time-evolved state after an infinitesimal time dt is given by

ρ(t + dt) = e^{K dt} [ρ(t) + dρ_m(t)],  K ≡ [α_p e(t − τ_P) + α_i J(t)] F×,  (9)

where dρ_m(t) denotes the measurement-conditioned increment on the right-hand side of Eq. (2). Note that this form ensures causality, since the feedback acts after the evolution due to measurement. The infinitesimal evolution equation is then obtained by expanding the exponential e^{K dt} in a Taylor series up to order dt. The first and second order terms in this expansion are:

K ρ dt = α_p [√η tr[(c + c†)ρ(t − τ_P)] − g(t − τ_P)] F× ρ dt + α_p F× ρ dW(t − τ_P) + α_i J(t) F× ρ dt,
(1/2) K² ρ dt² = (α_p²/2) (F×)² ρ dt,  (10)

where to write these expressions we have expanded e(t) = j(t) − g(t), used the definitions of j(t) (Eq. (3)) and J(t) (Eq. (6)), and applied the Itô rule dW(s)dW(t) = δ(t − s)dt; the terms in K² involving J(t) are of higher order than dt and have been dropped.
Therefore, discarding all terms of higher order than dt, the evolution for the system conditioned on the measurement and subsequently acted upon by the PI feedback control is

ρ(t + dt) = ρ(t) + dρ_m(t) + K [ρ(t) + dρ_m(t)] dt + (α_p²/2) (F×)² ρ(t) dt.  (11)

Multiplying this expression out and again discarding all terms smaller than O(dt), we find that the evolution for feedback with delay in the P component, τ_P > 0, is given by

dρ = −i[H, ρ] dt + D[c]ρ dt + √η H[c]ρ dW(t)
  + α_p [√η tr[(c + c†)ρ(t − τ_P)] − g(t − τ_P)] F× ρ dt + α_p F× ρ dW(t − τ_P)
  + (α_p²/2) (F×)² ρ dt + α_i J(t) F× ρ dt.  (12)

For the zero time delay case, we go back to Eq. (11), set τ_P = 0 and again multiply the expression out and discard terms smaller than O(dt) to get [26,60]

dρ = −i[H, ρ] dt + D[c]ρ dt + √η H[c]ρ dW(t)
  + √η α_p F× (cρ + ρc†) dt − α_p g(t) F× ρ dt + α_p F× ρ dW(t)
  + (α_p²/2) (F×)² ρ dt + α_i J(t) F× ρ dt.  (13)

Note that in general it is not possible to obtain Eq. (13) by setting τ_P = 0 in Eq. (12). With zero time delay, the correlation between the feedback noise and measurement noise creates an order dt cross term, √η α_p F× (cρ + ρc†) dt, which has no counterpart at finite delay. The two SMEs in Eqs. (12) and (13) represent the evolution of the quantum system conditioned on a continuous measurement record, together with PI feedback based on that record. Examining the terms proportional to α_i, it is evident that the integral feedback component just adds a generator of time-dependent unitary evolution to the system dynamics. This is in contrast to proportional feedback, which in addition to adding coherent evolution terms, also adds a dissipative evolution term and for τ_P = 0 also modifies the stochastic evolution term (the term proportional to dW in Eq. (11)). This reflects the difference that in proportional feedback, the delta-correlated noise is directly fed back at each time instant, whereas in integral feedback, the feedback action is conditioned on a smoothed, tempered signal and thus is able to generate a conventional (time-dependent) Hamiltonian term. Note that while the latter is not necessarily smoothly varying in time, its increments are O(dt). We emphasize that these SMEs model feedback that requires no state estimation (usually a computationally expensive task), and thus are more suitable for application to experimental implementations.
However, P feedback with τ P = 0 will always be an approximation since any measurement and feedback loop will have finite delay. The τ P = 0 limit is a good approximation if the delay is small compared to the intrinsic system evolution time scales.
In this work, we simulate the above stochastic differential equations (SDEs) describing evolution under PI feedback with a generalized Euler-Maruyama method. In the usual Euler-Maruyama method [61], one generates a Wiener noise increment dW(t) for each time step [t, t + dt] and then updates the state according to the stochastic differential equation. In our generalized Euler-Maruyama method, for each time t we keep a record of the noise up to a time τ = max(τ_I, τ_P) in the past, i.e., dW(t), dW(t − dt), ..., dW(t − τ). Then dW(t − τ_P) is accessible and J(t) can be calculated at each time t. The state is then updated according to the SME Eq. (11) as usual. We normalize the density matrix at each time step to compensate for numerical round-off errors.
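The delayed-noise bookkeeping described above can be sketched as follows. Here a scalar toy SDE stands in for the vectorized density-matrix update, and the drift, the feedback coefficients, the flat integration kernel, and the function name are all illustrative assumptions.

```python
import numpy as np

def simulate(T=1.0, dt=1e-3, tau_p=0.05, tau_i=0.2, seed=0):
    """Toy generalized Euler-Maruyama step: a ring buffer of past Wiener
    increments makes dW(t - tau_p) and a smoothed past-noise signal
    (standing in for J(t)) available at every step."""
    rng = np.random.default_rng(seed)
    n_p, n_i = round(tau_p / dt), round(tau_i / dt)
    buf = np.zeros(max(n_p, n_i) + 1)   # buf[k] holds dW(t - k*dt)
    x = 1.0                             # scalar stand-in for the state
    for _ in range(int(T / dt)):
        dW = rng.normal(0.0, np.sqrt(dt))
        buf = np.roll(buf, 1)
        buf[0] = dW
        dW_delayed = buf[n_p]                 # dW(t - tau_p)
        J = buf[1:n_i + 1].sum() / tau_i      # flat-kernel smoothing of past noise
        # drift + measurement noise + delayed P term + I term (illustrative weights)
        x += -x * dt + 0.1 * dW - 0.5 * dW_delayed + 0.2 * J * dt
    return x
```

In the actual simulations the scalar update is replaced by the full SME increment of Eq. (11), followed by renormalization of the density matrix.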

III. HARMONIC OSCILLATOR STATE STABILIZATION
State stabilization of a quantum harmonic oscillator is a canonical quantum feedback control problem that has been studied for several decades [6,29,49,62,63]. This problem has many practical applications, including the cooling and manipulation of trapped cold ions [64] or atoms [65], and cooling of nanoscale [66] or even macroscopic [67,68] mechanical systems. Purely proportional feedback control schemes have been developed for this problem [6,29,49,62]. In the following, we investigate whether adding integral control adds any benefit in terms of control accuracy.
The system is a quantum harmonic oscillator with mass m and angular frequency ω. We apply a continuous measurement of the oscillator position x with strength k (i.e., c = √k x in the notation of Sec. II) and efficiency η. The SME describing the system under measurement is [49]

dρ = −i[H0, ρ] dt + k D[x]ρ dt + √(kη) H[x]ρ dW(t) + γ(N + 1) D[a]ρ dt + γN D[a†]ρ dt,  (14)

where H0 = p²/(2m) + mω²x²/2, p is the oscillator momentum operator and a is the annihilation operator. The terms proportional to γ describe damping and excitation due to coupling to a bosonic thermal bath with mean occupation N. The associated measurement signal, rescaled from Eq. (3) by 1/√(kη) so that its deterministic part directly estimates 2⟨x⟩, is

j(t) = 2⟨x⟩(t) + ξ(t)/√(kη).  (15)

We shall consider two types of feedback for this system. First, we consider linear feedback in both x and p, in which case we have two feedback operators:

F1 = x,  F2 = p.  (16)

We will attach (time-dependent) proportional coefficients (α_p1, α_p2) and integral coefficients (α_i1, α_i2) to each of these feedback operators. The total feedback operator is then

H_fb(t) = [α_p1(t) e(t − τ_P) + α_i1(t) J(t)] F1 + [α_p2(t) e(t − τ_P) + α_i2(t) J(t)] F2.  (17)

Applying F1 is usually considerably easier than F2, since the former corresponds to applying a force on the oscillator. Therefore, we will also consider the setting where only F1 is available, in which case we have only the coefficients α_p1, α_i1. Given the simplicity of the harmonic system, it is possible to set up analytic candidate control laws that are specified in terms of choices for the coefficients α_p1, α_p2, α_i1, α_i2, and to then assess whether they are consistent with P, I or PI feedback. We shall do this below.
For τ_P > 0, the evolution of the system with PI feedback control is obtained from Eq. (12) with the replacements α_p F× → α_p1 F1× + α_p2 F2× and α_i F× → α_i1 F1× + α_i2 F2×, together with the measurement and thermal terms of Eq. (14). For τ_P = 0, the evolution of the system with PI feedback control is obtained from Eq. (13) with the corresponding substitutions. In both cases g(t) is a goal that we define below, with e(t) the corresponding error signal. The proportional feedback component is the same as in Ref. [6], except that in this work we also consider a time delay τ_P > 0 in the feedback loop.
In the simplest setting where the system starts in a Gaussian state, the state remains Gaussian when evolved according to the above measurement and feedback dynamics, since all operators acting on the density matrix are linear or quadratic in x, p [6,49]. A Gaussian state is completely determined by its first moments (⟨x⟩, ⟨p⟩) and second moments (V_x, V_p, C_xp), where V_x = ⟨x²⟩ − ⟨x⟩², V_p = ⟨p²⟩ − ⟨p⟩² and C_xp = ⟨xp + px⟩/2 − ⟨x⟩⟨p⟩. The evolution of the second moments under the above measurement and thermal damping is independent of the feedback; the second moments evolve deterministically, independent of the measurement noise ξ(t) [6]. The equations of motion for the second moments are given in Appendix A. We will assume in the following that these equations are solved in advance and therefore that V_x(t), V_p(t) and C_xp(t) are known functions of time. In all of the examples treated in this section, we shall take the initial state to be a coherent state with V_x(0) = V_p(0) = 0.5 and C_xp(0) = 0.
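In simulations, such a Gaussian state can be tracked through its mean vector and covariance matrix alone. A minimal sketch of this bookkeeping (our own helper, in units with ℏ = m = ω = 1 so that the coherent-state variances are 0.5, as in the text):

```python
import numpy as np

def coherent_state(x0, p0):
    """Gaussian-state bookkeeping: first moments (<x>, <p>) and symmetric
    covariance [[V_x, C_xp], [C_xp, V_p]]. A coherent state carries the
    vacuum covariance V_x = V_p = 0.5, C_xp = 0 (hbar = m = omega = 1)."""
    mean = np.array([float(x0), float(p0)])
    cov = np.array([[0.5, 0.0],
                    [0.0, 0.5]])
    return mean, cov

# A pure Gaussian state saturates the uncertainty bound det(cov) = 1/4 (hbar = 1).
```

The first moments then evolve stochastically under measurement and feedback, while the covariance is propagated by the deterministic equations of Appendix A.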

The evolution of the first moments is given by d⟨x⟩ = tr[x dρ(t)] and d⟨p⟩ = tr[p dρ(t)], with τ_P ≥ 0. In the limit of zero time delay, the equations of motion for the first moments are the same as above with τ_P = 0. (This reduction for the evolution of the first moments of the quadratures is a special case since, as noted above, taking τ_P = 0 in Eq. (12) does not yield Eq. (13).)
Our overall control goal is state stabilization, where the aim is to center the state at an arbitrary stationary (time-independent) value of the two quadrature means in the rotating frame of the oscillator, notated (X_g, P_g). In the laboratory frame this control goal is specified by the mean quadrature values (x_g(t), p_g(t)), which are related to (X_g, P_g) by the transformation

x_g(t) = X_g cos(ωt) + (P_g/mω) sin(ωt),
p_g(t) = −mωX_g sin(ωt) + P_g cos(ωt).
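The lab-frame target trajectory can be generated directly from the rotating-frame setpoint. In this sketch the p_g(t) line is the transformation quoted above; the x_g(t) line is its standard harmonic-motion counterpart, reconstructed here as an assumption.

```python
import numpy as np

def lab_frame_goal(t, X_g, P_g, m=1.0, omega=1.0):
    """Map the rotating-frame setpoint (X_g, P_g) to lab-frame targets
    x_g(t), p_g(t) for an oscillator of mass m and frequency omega."""
    x_g = X_g * np.cos(omega * t) + (P_g / (m * omega)) * np.sin(omega * t)
    p_g = -m * omega * X_g * np.sin(omega * t) + P_g * np.cos(omega * t)
    return x_g, p_g
```

At t = 0 the lab-frame and rotating-frame targets coincide, and a quarter period later the roles of the two quadratures are exchanged, which is the property exploited by the delayed P feedback of subsection III B.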
We note that the oscillator cooling problem [6] can be viewed as a special case of this state stabilization with the control goal (X g = 0, P g = 0).
The evolution of the first-order moments in the rotating frame follows directly from this transformation. For later convenience we define the deviations from the target mean values in the rotating frame by X̃(t) = ⟨X⟩(t) − X_g and P̃(t) = ⟨P⟩(t) − P_g, and collect these deviations in a vector Z̃(t) = (X̃(t), P̃(t))ᵀ. We must choose an error signal, e(t), that is based on this control goal and on the measurement signal that we have access to. According to the description above, there are two components to the target state in this problem, one for each quadrature of the oscillator, i.e., X_g and P_g. However, since our measurements are made in the laboratory frame and we measure only the x-quadrature, from now on we shall specify the goal function to be g(t) = 2x_g(t), so that the error signal is e(t) = j(t) − 2x_g(t). Finally, we note that in this work we shall restrict ourselves to the regime of weak measurement and damping, k, γ ≪ mω², where the measurement extracts some information about the system at each timestep but does not completely distort the harmonic evolution. Similarly, the system is under-damped by the thermal bath. In this limit, it is valid to still define the characteristic period of the oscillator as T = 2π/ω.
In the following subsections we consider first the case of x measurement with feedback controls in both x and p (section III A) and then the case of x measurement with feedback control only in x (section III B).

A. x and p Control
We now analyze the case of x measurement with feedback controls in both x and p.

Proportional feedback
We first consider proportional feedback only, i.e., α_i1 = α_i2 = 0 in Eq. (19). We shall show that the quadrature expectations of any state can be driven to the target values (X_g, P_g) by setting α_p1(t) = 2kηC_xp(t), α_p2(t) = −2kηV_x(t) and τ_P = 0. However, in order to compensate for the thermal damping, we also need to add a term γ(x_g(t)p + p_g(t)x) to the Hamiltonian H0 (note that this is not a feedback term, since it does not depend on the measurement record). With these settings, Eq. (19) yields the equations of motion for the quadrature means, Eqs. (22). We now transform Eqs. (22) into the rotating frame and obtain two coupled equations of motion for the deviations X̃ and P̃, Eqs. (23). We now see that our choice of proportional feedback coefficients α_p1(t) and α_p2(t) has allowed the feedback to completely cancel all measurement noise contributions (captured by the dW terms), resulting in deterministic equations for the evolution of the mean values ⟨x⟩(t) and ⟨p⟩(t). The fact that such cancellation is possible was already noted in the early studies of feedback cooling of quantum oscillators [6]. In addition, as we shall prove explicitly below, these coefficients make use of the thermal and measurement-induced dissipation to steer the system to the target quadrature mean values. The form of Eqs. (23) suggests that this proportional control law yields exponential convergence to the goal quadrature values. To understand why this particular control law works and to prove the exponential nature of the convergence to the target state, we begin by noting that the coefficients in the system of differential equations in Eqs. (23) display fast oscillations through the cos(ωt) and sin(ωt) terms, while the changes in the other time-dependent terms, V_x(t), V_p(t) and C_xp(t), are small over the timescale of these oscillations. Therefore we may approximate this evolution by another system with new coefficients defined by time-averaging the coefficients in Eqs. (23) over one oscillator period T, treating all time-varying quantities other than cos(ωt) and sin(ωt) as constants. We refer to this approximation as period-averaging, but note that it is equivalent to the rotating wave approximation, since it amounts to dropping fast rotating terms in the evolution operator in the rotating frame. In Appendix B we show that this is a very good approximation in the regime k, γ ≪ mω². The period-averaged dynamics for the above system is linear in Z̃(t), and the deviation from the target mean values at time t decays according to the eigenvalues of the period-averaged coefficient matrix, for which the real parts are negative for all t. Hence, this is a stable system that converges exponentially towards the Z̃ = 0 fixed point. We may view Z̃(t) as a vector Lyapunov function guaranteeing the stability of the final state [69].
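The period-averaging step replaces the fast coefficients by their means over one period, ⟨cos²⟩ = ⟨sin²⟩ = 1/2 and ⟨sin cos⟩ = 0. A quick numerical confirmation (illustrative only, with an arbitrary choice of ω):

```python
import numpy as np

omega = 3.0
T = 2 * np.pi / omega
# Uniform samples over exactly one period; the endpoint is excluded so that
# t = 0 and t = T are not double-counted in the average.
t = np.linspace(0.0, T, 100000, endpoint=False)
avg_cos2 = np.mean(np.cos(omega * t) ** 2)                  # -> 1/2
avg_sin2 = np.mean(np.sin(omega * t) ** 2)                  # -> 1/2
avg_cross = np.mean(np.sin(omega * t) * np.cos(omega * t))  # -> 0
```

These are the constants that replace the oscillating coefficients in the period-averaged equations of motion.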
This shows that for this choice of proportional feedback parameters one can completely cancel the measurement noise and obtain a deterministic system that exponentially stabilizes an arbitrary initial state.
The P feedback strategy developed above requires τ_P = 0, a condition that is experimentally challenging to achieve due to the finite bandwidth of any feedback control loop. Therefore, we have also tested the performance of the feedback law when τ_P > 0, in order to compare the resulting stabilization performance with that of the I feedback strategy that we discuss in the next subsection.

Integral feedback
Now we examine the dynamics obtained by setting α_p1 = α_p2 = 0 in Eq. (19), which corresponds to applying only integral control. The measurement current j(t) provides a noisy estimate of the oscillator position, so it is necessary to filter it in order to obtain a smoothed estimate of the error signal e(t). We use an exponential filter with memory τ_I, w(t − s) ∝ e^{−(t−s)/τ_I}, normalized as in Sec. II. Our choices for the coefficients α_i1 and α_i2 in the presence of such an integral filter are motivated by the same considerations as in the P feedback case above, namely to cancel as much of the measurement noise as possible. While it is not possible to do this exactly with I feedback, we show below that the choice α_i1(t) = 2kηC_xp(t) and α_i2(t) = −2kηV_x(t) does provide exponential convergence of the quadratures to their target values on average. As in the proportional feedback case, we also add a term γ(x_g(t)p + p_g(t)x) to the Hamiltonian H0 to compensate for thermal damping.

[Fig. 2 caption fragment: The green and purple lines show the behavior of the ensemble average over 1000 trajectories, X_a(t) = E[⟨X⟩](t) and P_a/mω = E[⟨P⟩/mω](t). The maximum standard deviation of the trajectories ⟨X⟩(t) and ⟨P⟩(t) increases with τ_I, saturating at 0.7610 at long times.]
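For the exponential kernel, the smoothed error signal admits a cheap online update that avoids storing the full record: the recursion dJ = (e(t) − J) dt/τ_I realizes J(t) = (1/τ_I) ∫ e^{−(t−s)/τ_I} e(s) ds. This is a sketch with illustrative names, and the normalization convention is our assumption.

```python
import numpy as np

def exp_filter(e_record, dt, tau_i, J0=0.0):
    """Online exponentially weighted integral: J += (e - J) * dt / tau_i.
    Returns the filtered signal J(t) sampled at each step of the record."""
    J = J0
    out = np.empty(len(e_record))
    for n, e in enumerate(e_record):
        J += (e - J) * dt / tau_i   # discrete form of dJ = (e - J) dt / tau_i
        out[n] = J
    return out
```

For a constant input the filter output converges to that constant on the timescale τ_I, which is the sense in which J(t) is a smoothed, tempered version of the raw current.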
The evolution of the first moments d⟨x⟩ and d⟨p⟩ follows from Eq. (19) with these coefficients. Converting to the rotating frame and writing equations of motion for the deviations X̃ and P̃ yields Eqs. (27). A typical evolution (trajectory), started from the same initial state as for the P feedback above, is shown in Fig. 2(b). We now see random fluctuations in the evolution of the quadrature expectations because the measurement noise has not been exactly cancelled by the I feedback. Indeed, this is now not possible, since the measurement noise term dW(t) is arbitrarily varying while the integral feedback term is not. Consequently, single trajectories will fluctuate around the target values, preventing perfect state stabilization of individual evolutions. However, the average values of the quadratures (marked by the solid lines labeled X_a and P_a/mω in Fig. 2(b)) do converge exponentially to the goal values. In Fig. 2(b) and in subsequent figures where we show stochastic trajectories, we will state the "maximum standard deviation" at steady state for these trajectories.
The standard deviations of X (t) and P (t) (calculated over multiple trajectories) are the same but time-dependent and oscillatory at long times. However, this standard deviation is within a narrow range and thus we quote the maximum value over a time window in the steady state region (which is defined as when E X (t) and E P (t) reach constant values).
To analyze this behavior and prove the exponential convergence of the average over trajectories, we again write Eqs. (27) in matrix form, Eq. (28). The solution to this system can be formally written, Eq. (29), as the sum of an exponentially decaying homogeneous term, a term containing the integral of the smoothed signal J(t) against rapidly oscillating sinusoids, and a stochastic integral over dW. Note that, as before, the second-order moments evolve more slowly than cos(ωt), sin(ωt). Furthermore, since J(t) is a smoothed measurement current, it also evolves slowly on the timescale of an oscillator period T. Thus, we may neglect the second term, since the integral over the rapidly oscillating sinusoidal terms will average to zero for t ≫ T. We cannot make the same argument for the third term, since dW(t) does not have finite variation over any interval. This third term is in fact what causes fluctuations of individual quadrature trajectories around their setpoint values in Fig. 2(b). However, since c(τ)e^{−γ(t−τ)} is a non-anticipating function (alternatively, an adapted process that depends only on current and prior times, independent of the future Wiener increments), we may conclude that the third term vanishes when averaged over many trajectories [70]. This leaves only the first, exponentially decaying term for the average quadrature values, and is therefore the reason for the exponential convergence of the ensemble average to the target values. This analysis also shows that the rate of convergence is slower for I feedback than for P feedback, for which there is an additional contribution of −2kηV_x(t) to the convergence rate, see Eq. (24).
This first analysis of control of state stabilization for the harmonic oscillator has shown that when access to both x and p control is given, the performance of purely proportional feedback with zero time delay is not improved by adding integral feedback. Indeed, both P and I feedback strategies converge exponentially to the target state when an ensemble average over I feedback trajectories is taken. This shows that state estimation [6] is not necessary to drive a harmonic oscillator to an arbitrary quantum state in the presence of thermal noise. However, when comparing the P and I strategies, it is evident that the P feedback is advantageous for two reasons. The first is that with zero time delay there exists a proportional feedback law that can perfectly cancel the measurement noise perturbations to the system for each individual trajectory, whereas this can only be approximately canceled under an integral feedback strategy for an individual trajectory, resulting in fluctuations about the target mean quadrature values for any given trajectory. The second is a faster convergence for P feedback. Given the superior performance of P feedback over I feedback in this setting, we conclude that it is not advantageous to consider a more general PI feedback protocol when P feedback with zero time delay is possible.

B. x Control

We now consider the setting where feedback actuation is available in x only. In this case we set α_p2 = α_i2 = 0, and therefore have a single feedback operator, F1 = x.

Proportional control
As before, we first consider proportional control alone, i.e., α_i1 is also set to zero. Our feedback operator is x, and thus the feedback applies a force. Ideally we want this force to be proportional to −(⟨p⟩(t) − p_g(t)) in order to cancel the measurement noise. However, since we are measuring only the position, we do not have direct access to the momentum observable. This is manifest in the dynamical equations in Eq. (19) by the fact that the only deterministic term involving α_p1 is the term −α_p1(2⟨x⟩(t − τ_P) − g(t − τ_P))dt in the equation for d⟨p⟩(t). This term does not appear to be useful for controlling the oscillator momentum, because it contains information about ⟨x⟩ rather than ⟨p⟩. Indeed, we find that the trajectories for the evolution of the mean values do not show convergent behavior when implementing proportional x feedback with τ_P = 0. Noting that for a harmonic oscillator the average position and momentum have a T/4 relative delay (see also [49]), in the weak measurement and damping limit (k, γ ≪ mω²) we can take a delayed signal term ⟨x⟩(t − T/4) to be a good approximation to the scaled oscillator momentum −⟨p⟩(t)/(mω). This allows formulation of a good control law based on delayed proportional feedback with τ_P = T/4.
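The T/4-delay substitution rests on the identity x(t − T/4) = −p(t)/(mω) for free harmonic motion. A quick numerical check, with arbitrary illustrative amplitude and phase:

```python
import numpy as np

# For free harmonic motion x(t) = A cos(wt + phi), delaying by a quarter
# period gives x(t - T/4) = A cos(wt + phi - pi/2) = A sin(wt + phi),
# which equals -p(t)/(m*w) since p(t) = -m*w*A*sin(wt + phi).
m, w = 1.0, 2.0
T = 2 * np.pi / w
t = np.linspace(0.0, 4 * T, 2000)
A, phi = 1.3, 0.4
x = A * np.cos(w * t + phi)
p = -m * w * A * np.sin(w * t + phi)
x_delayed = A * np.cos(w * (t - T / 4) + phi)
assert np.allclose(x_delayed, -p / (m * w))
```

In the controlled system this identity holds only approximately, since the weak measurement and damping slowly deform the free harmonic motion; that is the sense in which the regime k, γ ≪ mω² is required.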
One can then follow the same line of reasoning outlined above in Section III A to tune the strength and offset of the feedback coefficient in order to achieve noise cancellation. Specifically, we set α_p1 = −2kηV_x mω with τ_P = T/4. We similarly add a term γp_g(t)x to H0 in order to compensate for thermal damping. Note that full compensation of the effects of thermal damping would require adding the term γ(x_g(t)p + p_g(t)x); however, consistent with the assumption in this subsection that there is no direct control over the oscillator momentum, we add only the term γp_g(t)x. The resulting dynamical equations for the mean quadratures are given in Eqs. (30); before period-averaging, the deterministic part of the X̃ equation takes the form

dX̃(t) ≈ 4kηV_x(t) [−mωX̃(t) sin(ωt) + P̃(t) cos(ωt)] sin(ωt)/(mω) dt − γX̃(t) dt,

and in the second line of each of Eqs. (30) we have applied the period-averaging approximation to the deterministic terms and regrouped the stochastic terms.
The inability to actuate the oscillator momentum in this situation introduces two negative features into these equations relative to Eqs. (23), for which both x and p control are available.
The first is that we cannot perfectly cancel the measurement noise, resulting in the presence of stochastic terms in Eqs. (30). The second is that we cannot simply compensate for the thermal damping of the oscillator momentum by adding a term γ x_g(t) p to H_0. This leads to the −(γ/2) X_g dt and −(γ/2) P_g dt terms in the period-averaged equations above. The first point is not a serious hindrance to stabilization, because in the weak measurement limit the effect of the noise is small and leads primarily to fluctuations around the target values. However, the second point is more serious, since the inability to suppress thermal damping means that the system will be driven to a state that is different from the target state. In fact, as we show below, the system is driven to a steady state with ensemble average quadrature values (αX_g, βP_g), where the ensemble average is taken over many measurement realizations. To see this, note that the formal solution of the linear period-averaged equations takes the form

X̃(t) = e^{at} X̃(0) + (b/a)(e^{at} − 1) + ∫_0^t e^{a(t−τ)} c(τ) dW(τ) + ∫_0^t e^{a(t−τ)} d(τ) dW(τ), (31)

with a = −[2kηV_x^ss + γ] and b = −(γ/2)X_g (treating V_x at its steady state value V_x^ss). The first term decays exponentially to zero. The second term provides a deterministic offset from zero at long times, which is exactly what leads to the α, β scaling factors in the steady state. The third and fourth terms generate fluctuations on all trajectories. However, since e^{a(t−τ)} c(τ) and e^{a(t−τ)} d(τ) are non-anticipating functions (they are independent of the Wiener process), both of these terms average to zero when the expectation value over different measurement realizations is taken. Therefore, we can solve for the ensemble average steady state by dropping the stochastic terms and evaluating the t → ∞ value of Eq. (31) (or equivalently dropping the stochastic terms from Eqs. (30b) and (30d) and solving for the steady state). Doing this yields α = β ≈ (2kηV_x^ss + γ/2)/(2kηV_x^ss + γ), where V_x^ss is the steady state value of this second moment. We note that this expression for α and β is approximate, because we have solved for the steady state from Eq.
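The scale factor α can be recovered numerically from the deterministic part of the period-averaged dynamics. The sketch below assumes the linear form quoted above for the deviation equation (our reading of the text) and uses illustrative parameter values.

```python
import numpy as np

# Sketch: recover the steady-state scale factor alpha numerically.  We assume
# (our reading of the period-averaged equations, with illustrative parameters)
# that the deviation Xdev = X - X_g obeys
#   dXdev/dt = -(2*k*eta*Vx + gamma) * Xdev - (gamma / 2) * X_g .
k, eta, Vx, gamma = 1.0, 0.8, 0.5, 0.3
X_g = 2.0

a = 2.0 * k * eta * Vx + gamma
Xdev, dt = 0.0, 1e-3
for _ in range(200_000):          # Euler integration to the fixed point
    Xdev += (-a * Xdev - 0.5 * gamma * X_g) * dt

alpha = (2.0 * k * eta * Vx + gamma / 2) / (2.0 * k * eta * Vx + gamma)
print(Xdev + X_g, alpha * X_g)    # the two values coincide
```

Requesting the rescaled target X_g/α then makes the realized steady state equal to the true target, which is the compensation trick described in the text.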
(31), which was derived under the period-averaged evolution, and we also assumed that ⟨x⟩(t − T/4) ≈ −⟨p⟩(t)/(mω) in formulating our control law. However, both of these approximations are very well justified in the γ, k ≪ mω limit, so that the corresponding expressions provide excellent estimates of the average steady state for the trajectory. Knowing the value of α (= β), we can then compensate for the thermal damping by setting X_g = X_g^true/α and P_g = P_g^true/α, where X_g^true, P_g^true are the true target values of the quadrature means.¹ Note that this implies a similar rescaling of the laboratory frame target values, i.e., x_g = x_g^true/α, p_g = p_g^true/α. With this compensation trick solving the thermal damping issue for this constrained control setting, we can obtain very good stabilization of individual trajectories to the desired target values, with relatively small fluctuations about them, as shown in Fig. 3(a). This figure shows a typical trajectory under this proportional control law, incorporating the above scaling of the target X_g and P_g values. It is important to note that we are simulating the dynamics here without any of the approximations used in the above analysis; i.e., we simulate the rotating frame equivalent of Eqs. (29a) and (29b) with the time-delayed feedback current and without invoking the period-averaging approximation. Fig. 3(a) shows that the time-delayed signal does indeed provide a good estimate of the oscillator momentum, evidently resulting in partial though not complete suppression of measurement noise, as well as exponential convergence of the quadrature means to their goal values.
Thus despite the reduced number of control degrees of freedom, one can nevertheless still achieve exponential convergence of the quantum expectations to their target values using P feedback, with zero bias from the target values and relatively small standard deviation (see Fig. 3(a)).
As in the case of x and p actuation, this P feedback strategy requires a precise value for the feedback loop time delay τ_P. Here the desired value of τ_P is non-zero, and is thus experimentally less demanding to realize than the ideal P feedback strategy with x and p actuation, for which τ_P = 0 (Sec. III A 1). However, it might still be challenging to engineer a feedback loop with a precise value of delay τ_P = T/4. To assess the robustness of the strategy with respect to uncertainties in τ_P, we also analyzed the stabilization performance of this P feedback strategy for larger time delays, i.e., τ_P = T/4 + ε. Results for several values of ε are shown in Appendix C, where it is seen that the stabilization performance degrades for all ε > 0. The fluctuations of individual trajectories of the quadrature expectations increase with ε, and there is also a bias in the long-time values of these expectations; i.e., the ensemble average values E[X(t)] and E[P(t)] do not converge to the target values. This error in convergence is appreciable even for offsets as small as ε = 0.05T and increases with ε.

¹ We note that using such steady state compensation can be an alternative to the strategy of introducing deterministic terms in the Hamiltonian to cancel thermal damping effects, i.e., to introducing one or both of the terms γ(x_g(t)p + p_g(t)x).
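The sensitivity to the delay offset ε can be illustrated directly with the ideal oscillator signals used above; the sketch below (illustrative parameters) shows the momentum-estimation error growing monotonically with ε.

```python
import numpy as np

# Sketch: sensitivity of the quarter-period-delay momentum estimator to an
# offset eps in the loop delay, tau_P = T/4 + eps, using ideal (noiseless)
# oscillator signals with illustrative parameters.
omega = 2.0 * np.pi
T = 2.0 * np.pi / omega
A = 1.0
t = np.linspace(2.0, 4.0, 4000)

def estimator_error(eps):
    est = A * np.cos(omega * (t - T / 4 - eps))   # delayed position signal
    tgt = A * np.sin(omega * t)                   # -p(t)/(m*omega) for m = 1
    return np.max(np.abs(est - tgt))

errs = [estimator_error(f * T) for f in (0.0, 0.01, 0.05, 0.10)]
print(errs)   # the estimation error grows monotonically with the offset
```

Even before measurement noise and damping are included, the estimator error scales as 2 sin(ωε/2), so an offset of ε = 0.05T already misestimates the momentum by roughly 30% of the oscillation amplitude.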

Integral control
We now study the case of integral feedback when only one feedback operator is available, again choosing F_1 = x. On setting α_p2 = α_i2 = α_p1 = 0 in Eq. (19), it is apparent that the only control handle on the system now comes from the −α_i1 J(t) term. As we learned above, the key to stabilizing the system with F_1 = x alone is to construct an estimator of the oscillator momentum.
For P feedback we used a time delay to achieve this. Here we will construct an estimator with the integral filter.
Following Doherty et al. [49], we first modulate the measurement signal with cos(ωt) and sin(ωt) to form integral estimates J_X(t) and J_P(t) of the oscillator quadrature deviations in the rotating frame. Using Eqs. (20), these integrals of the measurement record can be combined to yield an estimator J(t) of the error between ⟨p⟩(t) and p_g(t). We choose α_i1(t) = 4kηV_x(t) to achieve measurement noise cancellation and convergence to the target state. Transforming the resulting evolution of the quadrature means into the rotating frame, the deviations evolve as

dX̃(t) = [−γX̃(t) − γX_g cos²(ωt) − γ(P_g/(mω)) sin(ωt) cos(ωt) + (4kηV_x(t)/(mω)) J(t) sin(ωt)] dt + stochastic terms
≈ −[2kηV_x(t) + γ] X̃(t) dt − (γ/2) X_g dt + stochastic terms,

with an analogous equation for dP̃(t), where in the second line of each equation we have used the period-averaging approximation together with the approximations J_X(t) ≈ X̃(t) and J_P(t) ≈ P̃(t).
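The integral filter used for these estimators is an exponential filter (as stated for the two-qubit example below). A minimal discrete-time sketch, applied to a noisy error signal with a constant true value (all parameters illustrative):

```python
import numpy as np

# Sketch of the exponential integral filter used for I feedback,
#   J(t) = (1/tau_I) * Integral_0^t exp(-(t-s)/tau_I) e(s) ds,
# discretized as a first-order recursion and applied to a noisy error
# signal (constant true value 1 plus white measurement noise; illustrative).
rng = np.random.default_rng(0)
dt, tau_I, n = 1e-3, 0.5, 100_000
e = 1.0 + rng.standard_normal(n) / np.sqrt(dt)   # raw error signal e(t)

J = np.empty(n)
J[0] = 0.0
for i in range(1, n):
    J[i] = J[i - 1] + (dt / tau_I) * (e[i] - J[i - 1])   # exponential average

tail = J[n // 2:]                 # discard the initial transient
print(tail.mean(), tail.var())    # mean near 1; variance far below that of e
```

The filter preserves the mean of the signal while reducing the white-noise variance by roughly a factor dt/(2τ_I), which is the smoothing exploited by the I feedback strategy.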
The resulting equations, Eqs. (35), have the same form as Eqs. (30), including exactly the same deterministic terms. Therefore, as in that case, we know that the ensemble average steady state for this evolution will not be the target values (X_g, P_g), but rather (αX_g, βP_g), with α = β ≈ (2kηV_x^ss + γ/2)/(2kηV_x^ss + γ). As in that case, we can compensate for these scale factors by adjusting the target values. Once this compensation is made, the system converges exponentially towards the target values with fluctuations. This is evidenced by the simulations shown in Fig. 3(b), which show relatively small fluctuations similar to those for P feedback, with standard deviation 0.1676 ± 0.002 about the target values at long times.
Both the P and I feedback trajectories shown in Fig. 3 show stochastic noise. Since the feedback in the integral strategy is conditioned on a tempered version of the noise instead of on the instantaneous noise, we can expect that this smoothing of the noise should give the integral strategy a relative advantage over the purely proportional strategy here. While the noise does appear smaller in the I trajectory (compare Fig. 3(b) with 3(a)), it is difficult to ascertain the effect of this on the overall performance of the control strategy by examining single trajectories. To enable a quantitative comparison between the performance of the two control strategies in this situation, we therefore define an average error metric that quantifies the deviation from the control goals when averaged over all measurement trajectories. We estimate this error by simulating a large ensemble of trajectories with P or I feedback control. In Fig. 4 we plot the long-time value of this average error, i.e., its value once it reaches a constant, as a function of the measurement efficiency η. This plot shows that I feedback consistently gives a smaller error, and thus performs better than P feedback, over essentially the full range of measurement efficiency η.
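Such an ensemble error estimate can be sketched numerically as follows. As a stand-in for the period-averaged closed-loop dynamics, we model each quadrature deviation as an Ornstein-Uhlenbeck process; the rate and noise values are illustrative, not the paper's parameters.

```python
import numpy as np

# Sketch of a trajectory-averaged error estimate.  Each quadrature deviation
# is modeled as an Ornstein-Uhlenbeck process (illustrative rate and noise
# values) and we estimate  err = E[(X - X_g)^2 + (P - P_g)^2]  at long times.
rng = np.random.default_rng(1)
dt, n_steps, n_traj = 1e-3, 20_000, 200
rate, sigma = 1.0, 0.2            # relaxation rate, residual noise strength

X = np.zeros(n_traj)              # deviations X - X_g over the ensemble
P = np.zeros(n_traj)              # deviations P - P_g
for _ in range(n_steps):          # Euler-Maruyama integration
    dW = rng.standard_normal((2, n_traj)) * np.sqrt(dt)
    X += -rate * X * dt + sigma * dW[0]
    P += -rate * P * dt + sigma * dW[1]

err = np.mean(X**2 + P**2)
print(err, sigma**2 / rate)       # empirical estimate vs stationary OU value
```

In this linear stand-in the long-time error is set by the ratio of the residual noise power to the closed-loop relaxation rate, which is why the smoother I feedback signal yields a smaller error at fixed convergence rate.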
In summary, when we only have access to the F_1 = x control operator, we do not have sufficient control degrees of freedom to follow the strategy of both cancelling the noise and engineering convergence to the target values, as was possible for P feedback in Sec. III A. However, we have seen that by forming momentum estimators (via a time delay in the P feedback case, and via integral approximations of the quadratures in the I feedback case), we can still achieve effective control, with exponential convergence as before. We find that with this approach both P and I feedback achieve similar control accuracy, with I feedback performing slightly better on average and the difference increasing with greater measurement efficiency η. Moreover, both of these P and I feedback strategies show the same rate of convergence to the target quadrature mean values, as is evident from the fact that (within the period-averaging approximation) Eqs. (30) and (35) have the same deterministic terms. However, neither of these strategies guarantees convergence of individual trajectories. We also note that the P feedback strategy is very sensitive to the exact value of the time delay, for which the ideal value is τ_P = T/4; deviations from this ideal value result in inadequate stabilization performance, with failure to reach the target state even on average.
Given the similar performance of P and I feedback in this scenario and the lack of robustness of P feedback to variations in the time delay, one might not expect a significant benefit to combining the two to construct a PI feedback strategy. To assess this, we write α̃_p1(t) = (1 − θ)α_p1(t) and α̃_i1(t) = θα_i1(t), where α_p1(t) and α_i1(t) are the values determined above and θ ∈ [0, 1] is a mixing ratio quantifying the combination of the two strategies. In Fig. 4 we plot the long-time control error for θ = 0.8 (the long-time control error is minimal, and almost the same, for any value of θ in the interval [0.8, 1]) and note that indeed there is little statistically significant benefit to combining P and I feedback in this scenario.

IV. TWO-QUBIT ENTANGLEMENT GENERATION
In this section, we compare the performance of P feedback, I feedback, and PI feedback for the task of generating an entangled two-qubit state with a local Hamiltonian and a non-local measurement. This non-trivial state generation task was first addressed by measurement-based control with post-selection [71][72][73], then by P feedback and discrete feedback [37,51], and most recently by PAQS control [48]. For perfect measurement efficiency, η = 1, the proportional feedback strategy with time-dependent α_p(t) was shown in Ref. [51] to be globally optimal amongst all protocols that have a constant measurement rate. In this case, just as for the harmonic oscillator under x and p feedback (Sec. III A), the measurement noise can be exactly canceled and the evolution converges deterministically to the target state. In the following, we consider the case where the measurement efficiency is not unity, together with the simplified setting where the feedback coefficients α_p and α_i are assumed to be time-independent. In this experimentally relevant setting, P feedback is not known to be globally optimal. Furthermore, unlike the situation for harmonic oscillator stabilization, the two-qubit system under measurement and feedback is not linear and is therefore representative of a more general class of quantum systems. This non-linearity makes analytical arguments for optimal feedback laws difficult, and we therefore resort to a numerical study.
However, we ask the same question as before, namely: is it advantageous to combine P and I feedback?
Consider two qubits subject to an intrinsic Hamiltonian H = h_1 σ_z1 + h_2 σ_z2 and subject to negligible decoherence. In the following we assume h_1 = h_2 = h. We measure the half-parity of the qubits [72], which allows a non-local implementation between remote qubits [71]. The relevant measurement operator c is proportional to the collective operator L_z = ½(σ_z1 + σ_z2), with k the measurement strength, and the associated measurement current is j(t) (cf. Eq. (3)). The control goal is to stabilize the system in an entangled state, when starting from the simple product state |↑⟩ ⊗ |↑⟩ ≡ |↑↑⟩. Given the exchange symmetry of the intrinsic Hamiltonian and the measurement operator (we will be careful to also maintain this symmetry with the feedback operator below), and since the initial state is exchange symmetric, the system remains in the symmetric triplet subspace of the two qubits throughout the evolution. This subspace is spanned by the states |T_{−1}⟩ = |↓↓⟩, |T_0⟩ = (1/√2)(|↓↑⟩ + |↑↓⟩), and |T_1⟩ = |↑↑⟩. Our goal is to evolve to, and stabilize the system in, the entangled state |T_0⟩. As in Ref. [37], we use the intuition of rotating the system in the symmetric subspace and choose a local feedback operator F = L_x = ½(σ_x1 + σ_x2); applying an L_x rotation can bring |T_{±1}⟩ closer to |T_0⟩.
Since the control goal in this case is to prepare the state |T_0⟩, and the deterministic part of the measurement signal in this state, ⟨T_0|L_z|T_0⟩, is zero, we may set the goal to g(t) = ⟨T_0|L_z|T_0⟩ = 0 ∀t. Hence our error signal is e(t) = j(t).
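These symmetry statements are easy to check numerically. The following sketch builds the collective operators and verifies that the target state produces no deterministic measurement signal, while the feedback operator L_x couples |T_{±1}⟩ to |T_0⟩:

```python
import numpy as np

# Sketch: verify the symmetric-subspace structure used above.  We build the
# collective operators L_z and L_x for two qubits and check that the target
# state |T0> gives no deterministic measurement signal, <T0|L_z|T0> = 0,
# while the feedback operator L_x couples |T+1> and |T-1> to |T0>.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

Lx = 0.5 * (np.kron(sx, I2) + np.kron(I2, sx))
Lz = 0.5 * (np.kron(sz, I2) + np.kron(I2, sz))

up, dn = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
T1, Tm1 = np.kron(up, up), np.kron(dn, dn)
T0 = (np.kron(dn, up) + np.kron(up, dn)) / np.sqrt(2)

print(np.vdot(T0, Lz @ T0).real)   # 0: zero deterministic signal at the target
print(abs(np.vdot(T0, Lx @ T1)))   # 1/sqrt(2): L_x rotates |T1> toward |T0>
print(abs(np.vdot(T0, Lx @ Tm1)))  # 1/sqrt(2): likewise for |T-1>
```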
The stochastic master equations that describe the evolution of the two-qubit system for τ_P = 0 and for τ_P > 0 involve the proportional and integral feedback coefficients α_p and α_i, and, as mentioned above, we have set the goal g(t) = 0. As before, we employ an exponential filter for the integral feedback. To assess the relative performance of the feedback strategies, we examine the steady state average populations of the three triplet states as well as the average concurrence measure of entanglement. Given a two-qubit density operator ρ, the populations of the triplet states are given by T_i = ⟨T_i|ρ|T_i⟩, and the concurrence is C(ρ) = max(0, λ_1 − λ_2 − λ_3 − λ_4), where λ_1, ..., λ_4 are the (non-negative) eigenvalues, in decreasing order, of the Hermitian matrix R = √(√ρ ρ̃ √ρ), with ρ̃ = (σ_y ⊗ σ_y)ρ*(σ_y ⊗ σ_y) the spin-flipped state of ρ. We expect that there is little benefit in introducing a time delay in the proportional feedback in this example, since there is no information in prior measurement currents that is germane to the control goal. Indeed, this expectation is borne out by these figures: the performance of the time-delayed feedback is worse than that without a time delay, τ_P = 0. Fig. 5(c) shows the performance under I feedback. The value of the integration time τ_I can be numerically optimized to yield maximum concurrence; the plot in Fig. 5(c) uses τ_I = 3, a near-optimal value for the concurrence.
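The concurrence defined above can be evaluated directly; a minimal sketch, using the equivalent form in which the λ_i are the square roots of the eigenvalues of ρρ̃:

```python
import numpy as np

# Sketch of the concurrence computation quoted above (Wootters' formula).
# We use the equivalent form in which the lambda_i are the square roots of
# the eigenvalues of rho * rho_tilde, with rho_tilde the spin-flipped state.
sy = np.array([[0, -1j], [1j, 0]])
YY = np.kron(sy, sy)

def concurrence(rho):
    rho_tilde = YY @ rho.conj() @ YY
    ev = np.linalg.eigvals(rho @ rho_tilde)
    lam = np.sort(np.sqrt(np.abs(ev.real)))[::-1]    # decreasing order
    return max(0.0, lam[0] - lam[1] - lam[2] - lam[3])

up, dn = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
T0 = (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2)
rho_T0 = np.outer(T0, T0.conj())                     # target entangled state
rho_prod = np.diag([1.0, 0, 0, 0]).astype(complex)   # product state |up,up>

print(concurrence(rho_T0))     # ~1 for the maximally entangled |T0>
print(concurrence(rho_prod))   # 0 for a product state
```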
Comparing Fig. 5(c) with Figs. 5(a) and 5(b), it is evident that in the case of inefficient measurements, η < 1, an I feedback strategy is able to produce a significantly higher steady state average concurrence and target T_0 population than a P feedback strategy. Finally, in Fig. 5(d) we show the average behavior for a specific combination of P and I feedback, i.e., of PI feedback, with α_p = 0.03 and α_i = 0.17. This combined PI feedback strategy performs slightly better than the pure I feedback strategy, thus outperforming both the P and I strategies (the long-time value of the concurrence in Fig. 5(c) is ∼0.7196 ± 0.0028 and in Fig. 5(d) is ∼0.7289 ± 0.0028). We have plotted here the results of just one choice of α_p and α_i that combines P and I feedback; this particular choice was made to show that PI feedback can outperform P and I feedback, based on a more general analysis of mixing the two types of feedback that we detail below. Note that the total feedback strength has been kept constant across all the settings shown in Fig. 5. Consequently, as we analyze in detail below, the ensemble average of the target triplet population, E[T_0], will be larger for the integral control strategy than for the proportional control strategy. To understand this in more quantitative terms, we give the evolution of these triplet populations and the associated off-diagonal elements of the density matrix in the triplet subspace under general PI feedback in Appendix D. For the case of I feedback, i.e., α_p = 0, α_i > 0, the evolution reduces to Eqs. (43); we suppress the time index of T_i and T_{i,j} there for notational conciseness. We cannot take the ensemble average (to obtain the average evolution) by simply dropping the stochastic terms in this case, because J(t), T_i, and T_{i,j} are correlated by virtue of their mutual dependence on past Wiener increments. Moreover, due to their nonlinearity, we cannot solve these equations directly.
However, we can use the following argument to show that Eq. (43) has an (unstable) steady state at T_0 = 1. Suppose that at some time T_0 reaches 1, so that T_1 = T_{−1} = 0 (and thus ⟨L_z⟩ = 0).
Then the coherences T_{1,−1}, T_{0,1}, T_{0,−1} will be approximately zero also (since all populations other than T_0 are zero). As a result, in the above equations dT_{−1} = dT_0 = dT_1 = dT_{1,−1} ≈ 0, and the only coherences that evolve are given by dT_{0,1} = −dT_{0,−1} = i(α_i/√2)J(t)dt. These coherences are generated by a non-zero J(t), and then go on to generate non-zero populations in the undesired states T_{−1} and T_1. This perturbation away from the desired state is weak because of two factors: (i) J(t) can be made small when T_0 = 1, since the deterministic part of j(t) is zero and the averaging integral damps the fluctuations dW(t) over the period τ_I; and (ii) the coherences are damped at a rate k/2, and therefore even when coherences are generated by non-zero J(t), they can be quickly damped by the measurement-induced dephasing before they generate non-zero populations in the undesired states.
It is clear that the integration time τ_I is an important parameter for the integral control strategy. Optimization of this parameter involves a trade-off between smoothing and time delay in the feedback action as τ_I increases. Specifically, a longer integration time τ_I will improve the concurrence due to the reduced fluctuations, but because the signal is averaged over a longer time window, it takes longer for deviations away from the target value to affect the averaged value, resulting in an effective time delay in the feedback action. To illustrate the resulting trade-off between short and long integration time choices, Fig. 7(a) plots the steady state average concurrence as a function of the filter integration time τ_I for I feedback. Note that the τ_I = 0 reference value corresponds to the proportional feedback strategy with no delay. The generic behavior shown here is found for any value of the feedback strength α_i; i.e., for all α_i values the concurrence shows a maximum at a non-zero optimal filter integration time. This optimal value of τ_I decreases as the control parameter α_i increases (not shown). We also find that the system takes increasingly longer to reach steady state as the feedback strength α_i goes to zero, or as τ_I gets larger.
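The smoothing-versus-lag trade-off can be illustrated with the exponential filter itself. The sketch below (illustrative signal and parameters) shows the residual noise decreasing and the signal-tracking error growing as τ_I increases.

```python
import numpy as np

# Sketch of the smoothing/delay trade-off in the exponential filter: a longer
# integration time tau_I suppresses the measurement noise but phase-lags the
# underlying signal.  Signal and parameters are illustrative.
rng = np.random.default_rng(2)
dt, n = 1e-3, 200_000
t = np.arange(n) * dt
sig = np.sin(0.5 * t)                                    # slowly varying "error"
noisy = sig + 0.5 * rng.standard_normal(n) / np.sqrt(dt) # plus white noise

def filt(e, tau_I):
    J = np.empty_like(e)
    J[0] = 0.0
    for i in range(1, len(e)):
        J[i] = J[i - 1] + (dt / tau_I) * (e[i] - J[i - 1])
    return J

noises, lags = [], []
for tau_I in (0.1, 1.0, 5.0):
    J_noisy, J_sig = filt(noisy, tau_I), filt(sig, tau_I)
    noises.append(np.var((J_noisy - J_sig)[n // 2:]))         # residual noise
    lags.append(np.sqrt(np.mean((J_sig - sig)[n // 2:]**2)))  # tracking error
print(noises)   # decreases with tau_I
print(lags)     # increases with tau_I
```

The optimal τ_I balances these two effects, which is the origin of the maximum in Fig. 7(a).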
Finally, we explore in more detail the possibility of full PI feedback, i.e., combining proportional and integral feedback, for the problem of entangled state generation with inefficient measurements in this two-qubit system. In Fig. 5(d) we already showed that there was a small benefit to combining both strategies for a particular set of coefficients. To study the performance of the combined strategy more systematically, we write α_p = (1 − θ)f_PI and α_i = θf_PI, where f_PI is the total feedback strength and θ ∈ [0, 1] is a mixing ratio quantifying the combination of the two strategies.
In Fig. 7(b) we now plot the steady state average concurrence versus this strategy mixing ratio θ for PI feedback, while keeping the total feedback strength f_PI constant. The plot shows the existence of an optimal mixing ratio θ_o located between ∼0.7 and ∼0.9, i.e., the optimal strategy is to have mostly integral control with some admixture of proportional control. The precise value of this optimal mixing ratio depends on the total feedback strength f_PI. However, as shown in Fig. 7(c), θ_o is quite robust to variations in efficiency. Note that the maximum concurrence obtained by this PI feedback strategy for perfect efficiency, η = 1, is less than that obtained using the globally optimal P feedback strategy with time-dependent proportionality constant α_p(t) [37,51].
These results show that the advantage of PI control relative to pure I or pure P control increases as the total feedback strength parameter f_PI increases. This can be seen by comparing the difference in steady state average concurrence between P, I, and PI with optimal θ_o for f_PI = 0.2 (red line) and f_PI = 0.3 (blue line) in Fig. 7(b). Finally, we note that the optimal mixing ratio also depends on the system Hamiltonian, in particular on the value of h: for larger values of h, the optimal mixing parameter θ_o → 1 and the optimal feedback strategy becomes pure I feedback. For comparison, we show the concurrence versus θ curves for the larger value h = 0.5 in Fig. 8; in this case I feedback is superior to any mixture of P and I feedback.

V. DISCUSSION AND CONCLUSIONS
We have presented and implemented a formalism for modeling proportional and integral (PI) feedback control in quantum systems in which, as in classical PI feedback control, the feedback can be tuned from a purely proportional strategy (P feedback, including the possibility of delay) to a purely integral strategy (I feedback), with a combined strategy at any point in between (PI feedback). In this approach both the proportional and integral feedback components are defined in terms of the measurement outcomes only, i.e., no dependence on knowledge of the quantum state is assumed. Consequently, we did not seek globally optimal protocols, but rather the best performance within the options of P, I, and PI feedback, given the ability to feed back quantum operations based only on the measurement record. For a given implementation we first compared the performance of separate P feedback and I feedback control strategies, with and without the presence of time delay in the former, and then carried out a PI feedback strategy, following an assessment of whether or not this might be beneficial.
We implemented this quantum PI feedback approach in this work for two canonical problems, namely stabilization of a harmonic oscillator to arbitrary target values of its quadrature expectations when subject to thermal noise, and entanglement generation of remote qubits by non-local measurements with local feedback operations.
In the case of the harmonic oscillator, as in previous work on cooling of a harmonic oscillator [6], we studied two settings of feedback control based on measurement of the position degree of freedom x, which is generally easier to measure than the momentum p. In the first setting it is possible to actuate both the x and p degrees of freedom of the oscillator, while in the second it is possible to actuate only x, i.e., to apply a force. The first setting allows formulation of a P feedback strategy that can perfectly cancel the measurement backaction noise entering the system, resulting in a deterministic evolution of the average state [6]. In the second setting, with control only over the x degree of freedom, complete cancellation of the measurement noise can no longer be achieved, even with a P feedback strategy.
However, by using a time delay in P feedback and integral filters in I feedback to obtain estimates of the time-dependent oscillator momentum, we found that it is nevertheless still possible to formulate good feedback control laws that achieve exponential convergence of quadrature expectation values on average, with relatively small measurement-noise-induced fluctuations of individual trajectories around their target values. In this case, we consistently found a small advantage of I feedback over P feedback for all efficiencies η, with the former also showing smaller fluctuations around the goal.
This advantage stems from the fact that I feedback derives a smoother estimate of the oscillator momentum through use of an integral filter, and thus allows us to engineer a system with more controlled and smaller fluctuations around the target quadrature mean values.
Thus for the harmonic oscillator state stabilization, we find the best performance with a pure P strategy when both x and p controls are available, and the best performance with a pure I feedback strategy when only x control is available. We found little significant advantage in formulating a general mixed PI feedback strategy for the harmonic oscillator state stabilization. Although we make no claims about the optimality of any of these feedback control strategies for the harmonic oscillator, a significant feature of our analysis is the proof that all of them lead to exponential convergence of the expectation values of the oscillator quadratures to their goal values. We emphasize that this convergence analysis has been restricted to the parameter regime where a period-averaging (i.e., rotating wave) approximation is valid. It is possible that this landscape of PI feedback performance could change outside this regime, which is a potential topic for further study.
Our second case study was the generation of entanglement by measurement of collective operators of two non-interacting qubits, combined with local feedback operations, for arbitrary measurement efficiency η ≤ 1 and time-independent proportionality constant α p . Unlike the situation for η = 1 and more general time-dependent P feedback [37], our more restricted -but experimentally relevant -case is unable to completely cancel measurement noise, regardless of the value of η. Here we found that an I feedback strategy can improve on P feedback and achieve superior performance, essentially because an I strategy is able to formulate a smoothed estimate of the error signal by means of the integral filter. This situation is reminiscent of PI feedback control in classical systems [53] and this case provides strong motivation for the formulation of a general PI feedback law that combines the P and I feedback strategies. We numerically determined an optimal mixing ratio between P and I feedback for this problem of remote entanglement generation, showed that this optimal value can depend on the overall feedback strength and system Hamiltonian, and demonstrated that PI feedback can be beneficial over both the I and P feedback strategies in some cases.
We also examined the robustness of the P feedback strategies to imperfect time delay, investigating the effects of larger values of τ_P than specified by the ideal control law. We found that the harmonic oscillator state stabilization example with both x and p actuation available is the most robust to finite time delays, with the quadrature expectations at long times having zero bias from their target values (i.e., E[X(t → ∞)] = X_g and E[P(t → ∞)] = P_g), but with fluctuations about the target values that increase with the time delay. Meanwhile, both the harmonic oscillator with only x actuation and the two-qubit remote entanglement example are very sensitive to deviations of τ_P away from the ideal specified value, with performance degrading rapidly as the deviation increases. In these latter cases, the I feedback strategy will therefore be preferred when the perfect time delay condition cannot be met.
These case studies reveal a key difference between the benefits of PI feedback in the quantum and classical domains. In the quantum case, there is an unavoidable correlation between the noise experienced by the system and the noise in the measurement signal (this is evident in the fact that the same stochastic increment dW is present in Eqs. (2) and (3)). This is not always the case in classical systems, where the "process noise" that the system experiences is often independent of the measurement noise. This difference means that P feedback strategies can play a unique and potentially more powerful role in the quantum domain than they typically do in the classical domain.
In particular, in some circumstances, depending on the feedback actuation degrees of freedom, a P feedback strategy can perfectly cancel the measurement noise that the system experiences, while an I feedback strategy can only approximately cancel the measurement noise. However, in cases where this perfect cancellation is not possible, whether this is due to time delay or other constraints on the feedback action, we saw that I feedback can outperform P feedback, because it provides a smoothed version of the measurement noise. This beneficial value of I feedback is similar to that seen in classical PI feedback control.
Several possibilities for extending this work are immediately evident. Firstly, formulating optimal forms of PI feedback in the quantum domain would be beneficial, even for paradigmatic systems that are analytically tractable, like the harmonic oscillator example treated here. The results in the current work indicate that such optimality studies would be particularly useful for feedback control in situations with inefficient measurements (see, e.g., [52]). Secondly, the development of heuristic methods for tuning the optimal proportions of P and I feedback for any system, analogous to those that exist for classical PI feedback control [53], is an interesting direction. Here, it would be of interest to determine the optimal strategies under constraints of finite measurement and feedback bandwidth, in contrast to the infinite-bandwidth controls implicitly assumed in this work, but still without state estimation. Exploration of robust methods to address the implementation of differential control terms, to allow implementation of quantum PID control, would also be valuable. Finally, our demonstration of the beneficial effects of integral control strategies for generation of entangled states of qubits under inefficient measurements, which is within the range of current capabilities [71], indicates good prospects for experimental demonstration of quantum PI feedback in the near future.

Appendix C: Robustness of P feedback to time delay

For the harmonic oscillator state stabilization example presented in the main text, we derived effective P feedback strategies in the case of x and p actuation, and of x actuation only. In the former case, the P feedback strategy required zero time delay, τ_P = 0, while in the latter, formulating a momentum estimate required a time delay of τ_P = T/4.
Given that any real feedback loop will have some time delay, and that sometimes it is difficult to make this delay small compared to the natural timescales of the system being controlled, we study the impact of larger-than-desired time delays on the P feedback strategies in this Appendix, to examine their robustness with respect to variations in τ P .

x and p Control
In the case where x and p actuation is available, Fig. 2(a) of the main text shows that the ideal P feedback strategy with τ_P = 0 achieves deterministic and exponential convergence of the quadrature expectations to their target values. In Fig. 11 we show the behavior of the quadrature expectations for finite delay times, τ_P > 0. The trajectories are very different from the case of τ_P = 0, showing increasing noise as τ_P increases. This is expected, since with a finite time delay we no longer exactly cancel the measurement-induced fluctuations. While the ensemble average of the trajectories still converges to the target state (left panels of Fig. 11), albeit at a slower rate than for τ_P = 0, individual trajectories fluctuate around the target quadrature values. Thus with τ_P > 0 the long-time behavior of the quadrature expectations has zero bias from the target values (i.e., E[X(t → ∞)] = X_g and E[P(t → ∞)] = P_g), but non-zero variance.
The zero-bias property of the quadrature expectations at long times can be proved rigorously. We return to the equations of motion for the quadratures in the presence of finite time delay, Eq. (21), transform to the rotating frame and then, consistent with averaging over an ensemble of trajectories, drop the stochastic terms to obtain coupled deterministic equations for the deviations X̃ and P̃. This is all done while retaining a finite value of τ_P. Following the notation of the main text, we then arrive at Ż(t) = A Z(t), where Z(t) = [X̃(t) − X_g, P̃(t) − P_g]^T and A is the delay-dependent drift matrix.