Stochastic thermodynamics based on incomplete information: Generalized Jarzynski equality with measurement errors with or without feedback

In the derivation of fluctuation relations, and in stochastic thermodynamics in general, it is tacitly assumed that we can measure the system perfectly, i.e., without measurement errors. We here demonstrate for a driven system immersed in a single heat bath, for which the classic Jarzynski equality $\langle e^{-\beta(W-\Delta F)}\rangle = 1$ holds, how to relax this assumption. Based on a general measurement model akin to Bayesian inference we derive a general expression for the fluctuation relation of the measured work and we study the case of an overdamped Brownian particle and of a two-level system in particular. We then generalize our results further and incorporate feedback in our description. We show and argue that, if measurement errors are fully taken into account by the agent who controls and observes the system, the standard Jarzynski-Sagawa-Ueda relation should be formulated differently. We again explicitly demonstrate this for an overdamped Brownian particle and a two-level system where the fluctuation relation of the measured work differs significantly from the efficacy parameter introduced by Sagawa and Ueda. Instead, the generalized fluctuation relation under feedback control, $\langle e^{-\beta(W-\Delta F)-I}\rangle = 1$, holds only for a superobserver having perfect access to both the system and detector degrees of freedom, independently of whether or not the detector yields a noisy measurement record and whether or not we perform feedback.


Introduction
During the last two decades we have seen enormous progress in the understanding and description of the thermodynamic behaviour of small-scale systems, which are strongly fluctuating and arbitrarily far from equilibrium. This includes, e.g., a consistent thermodynamic description at the single-trajectory level and the discovery of so-called fluctuation relations which, in a certain sense, promote the status of the second law of thermodynamics from an inequality to an equality. A number of excellent review articles and monographs from different perspectives can be found in Refs. [1][2][3][4][5][6][7].
A tacit assumption underlying this framework, which is usually never discussed in any detail, is that we must be able to measure the stochastic trajectory z(t) of a system perfectly, i.e., without measurement errors, in order to establish the framework of stochastic thermodynamics and to derive fluctuation relations. In practice, however, we know that this is an experimental challenge for very small systems and, taking this thought further, it might be the major obstacle in finding a fully satisfactory generalization of stochastic thermodynamics to quantum systems.
Extending the framework of stochastic thermodynamics to the case of incomplete or only partially available information has only recently attracted interest [8][9][10][11][12][13]. In our context, the results of García-García et al. [13], who have also derived a modified Jarzynski equality for faulty measurements, are of particular importance. Our results are indeed in agreement with their theory, though our point of view and derivation differ from theirs, as we will discuss further in the main text below.
In addition, we also go one step further and include feedback based on faulty measurement results in our theory. In fact, the state of knowledge of the observer is of crucial importance in control theory and determines how effectively feedback control can be applied. However, if the experimentalist is forced to perform feedback based on faulty measurement results, it seems logical that she also uses the same (faulty) detector to infer other statistical properties of the system. Thus, we argue that, in order to extend stochastic thermodynamics to the case of feedback control with measurement errors, it is crucial to take these measurement errors consistently into account also during the times where no feedback is performed but where we still need to measure the system. This has important consequences, as we will examine below.
Outline: The article starts with a derivation of the standard Jarzynski equality (JE) based on a stochastic path integral method in order to establish the mathematical tools we will need in the following. The rest of the article is then divided into two main parts: Sec. 3 treats the case without feedback control and Sec. 4 the case with feedback control. In both cases we derive a general expression for the measured Jarzynski equality (MJE) of the measured work distribution for arbitrary measurement errors [Eqs. (20) and (35)]. In general, however, these might be extremely difficult to compute. Therefore, we present analytical results (underpinned by numerical simulations) for the two paradigmatic cases of an overdamped Brownian particle (OBP) in a harmonic potential and a two-level system (TLS). Throughout, we try to physically motivate our results and shift most lengthy computations to the appendix. Furthermore, we comment on the use of mutual information in the JE in Sec. 5. Finally, in Sec. 6 we discuss our findings and point out possible future applications.
2. Derivation of the Jarzynski equality for a driven system in a heat bath

Consider a system described by a Hamiltonian $H_{\lambda(t)}(z)$. Here, z might denote the position and momentum of a particle (i.e., $z = (x, p)$) or the discrete state of a system (such as spin up or down, $z \in \{\uparrow, \downarrow\}$). The results derived below are independent of this consideration and we will use the notation of a continuous variable z most of the time. Next, suppose the system is in contact with a thermal bath at inverse temperature β and initially at t = 0 in equilibrium with it, i.e., $p_{t=0}(z) = e^{-\beta H_{\lambda(0)}(z)}/Z_0$ with $Z_0 = \int dz\, e^{-\beta H_{\lambda(0)}(z)}$. Then, we change the Hamiltonian from t = 0 to $t = t_f$ as described by an arbitrary but fixed protocol λ(t). Consequently, the work performed on the system along each trajectory z(t) = z becomes a stochastic quantity whose fluctuations are bounded by the following relation, which is also known as Jarzynski's equality (JE) [14,15]: $\langle e^{-\beta(W-\Delta F)}\rangle_z = 1$.
Here, $\langle\ldots\rangle_z$ denotes an average over all possible system trajectories z and $\Delta F = -\beta^{-1}(\ln Z_f - \ln Z_0)$ denotes the change in equilibrium free energy. Eq. (2) can be derived in different ways and we will use stochastic path integrals and the method of time-reversed trajectories below.
In the formalism of stochastic path integrals the average of a trajectory-dependent quantity X[z] can be expressed as [16] $\langle X[z]\rangle_z = \int \mathcal{D}[z]\,\mathcal{P}[z]\,X[z]$, where $\mathcal{D}[z]$ denotes a measure in the space of trajectories z and $\mathcal{P}[z]$ the probability density (with respect to this measure) of choosing a trajectory z. We now divide the time interval $[0, t_f]$ into N time steps of duration $\delta t = t_f/N$. A particular trajectory z is then approximated by its coordinates $z_k \equiv z(t_k)$ at times $t_k = k\,\delta t$. Note that the limit $N \to \infty$ keeping $t_f$ fixed is implied. The work along the trajectory is the discretized version of Eq. (1), $W[z] = \sum_{k=1}^{N} [H_{\lambda_k}(z_{k-1}) - H_{\lambda_{k-1}}(z_{k-1})]$, where $\lambda_k$ denotes the value of the external control parameter at time $t_k$. Furthermore, $\int dz$ denotes an integral over a continuous variable (e.g., for an OBP) or a discrete sum (e.g., for a TLS). The probability density for a particular path is given by $\mathcal{P}[z] = p_{\lambda_0}(z_0) \prod_{k=1}^{N} p_{\lambda_k}(z_{k-1} \to z_k)$. Here, $p_{\lambda_0}(z_0)$ is the initial equilibrium distribution and $p_{\lambda_k}(z_{k-1} \to z_k)$ denotes the transition probability from $z_{k-1}$ to $z_k$ in time δt where the driving protocol has the value $\lambda_k$. This factorization implicitly assumes Markovian system dynamics.
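The discretized work above can be sketched in a few lines; the harmonic Hamiltonian and the trajectory below are illustrative choices of ours, and the convention is that the protocol update $\lambda_{k-1} \to \lambda_k$ is applied while the state is held fixed at $z_{k-1}$:

```python
def H(z, lam):
    """Illustrative harmonic Hamiltonian with protocol-dependent stiffness."""
    return 0.5 * lam * z**2

def work(z_traj, lam_traj):
    """W[z] = sum_k [H_{lam_k}(z_{k-1}) - H_{lam_{k-1}}(z_{k-1})]:
    each protocol update acts while the state is held at z_{k-1}."""
    return sum(
        H(z_traj[k - 1], lam_traj[k]) - H(z_traj[k - 1], lam_traj[k - 1])
        for k in range(1, len(lam_traj))
    )

# For a single protocol jump (a 'quench') the sum reduces to
# H_new - H_old evaluated at the pre-quench state:
print(work([0.5, 0.5], [1.0, 2.0]))   # -> 0.125
```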
Of particular importance now will be the notion of a time-reversed path, denoted by $z^\dagger$ and defined by $z_k^\dagger \equiv z_{N-k}^*$. Here $z_k^*$ indicates the time-reversal of $z_k$; e.g., if $z_k = (x_k, p_k)$ for a particle with position $x_k$ and momentum $p_k$, then $z_k^* = (x_k, -p_k)$. The probability density for such a path is defined in analogy to $\mathcal{P}[z]$, but with the time-reversed driving protocol. As usual in stochastic thermodynamics, we assume microreversibility (or local detailed balance) [17][18][19], $p_{\lambda_k}(z_{k-1} \to z_k) = p_{\lambda_k}(z_k^* \to z_{k-1}^*)\, e^{-\beta\,\delta q_k}$, where $\delta q_k(z_{k-1} \to z_k) \equiv \delta q_k$ is the heat absorbed by the system during the time interval $[t_{k-1}, t_k]$. Due to normalization, we can relate the forward and backward path probabilities, where we use $\mathcal{D}[z^\dagger] = \mathcal{D}[z]$ and introduce the heat $q[z] \equiv \delta q_1 + \cdots + \delta q_N$ absorbed along the full trajectory z. The system is initially in equilibrium (in the forward as well as in the backward process), and the initial Hamiltonian of the backward process coincides with the final Hamiltonian of the forward process.‡ By the first law of thermodynamics the energy difference between initial and final state along the trajectory is $\Delta e(z_0, z_f) = q[z] + W[z]$. Then, from Eq. (10) for $N \to \infty$ (keeping $t_f$ fixed) the original JE follows immediately: $\langle e^{-\beta(W-\Delta F)}\rangle_z = 1$. To be precise, and to emphasize that the statistical average $\langle\ldots\rangle_z$ is taken over the system trajectories, we explicitly use a subscript z. This will change in the following.
‡ Note that in the presence of a magnetic field (or any other odd variable in the Hamiltonian) the sign of the field also changes under time-reversal.
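The JE can be checked numerically in a few lines for the simplest case of an instantaneous quench, where the work along a trajectory reduces to the energy difference of the two Hamiltonians at the initial, equilibrium-sampled position. A minimal sketch (β = 1; all parameter values are illustrative):

```python
import numpy as np

# Numerical check of the JE for an instantaneous quench of a 1D harmonic
# potential H_f(z) = f z^2 / 2: the stiffness jumps f0 -> f1, so the work
# along a trajectory is the energy difference at the initial position.

rng = np.random.default_rng(0)
beta, f0, f1, n = 1.0, 1.0, 2.0, 2_000_000

z = rng.normal(0.0, 1.0 / np.sqrt(beta * f0), size=n)  # equilibrium sample
W = 0.5 * (f1 - f0) * z**2                             # quench work
dF = 0.5 * np.log(f1 / f0) / beta                      # dF = ln(f1/f0)/(2*beta)

lhs = np.exp(-beta * (W - dF)).mean()
print(lhs)   # -> close to 1, as the JE demands
```

Note that the average is dominated by rare low-work trajectories, which is why plain sampling becomes inefficient for harder protocols and more sophisticated schemes (such as the weighted ensemble method used later in the article) are needed.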

Measured Jarzynski equality without feedback
Suppose now we measure the system coordinate z continuously with measurement outcome y, which in general can involve measurement errors, and suppose the true system dynamics are inaccessible or hidden. Then the original JE, evaluated with the accessible measurement data, is in general not equal to unity, but depends on the difference between the true and measured work distribution. More specifically, we introduce the conditional probability $p_m(y|z)$ to obtain measurement outcome y given a particular state z of the system. The probability distribution of measurement outcomes y after a measurement is then $p_m'(y) = \int dz\, p_m(y|z)\, p(z)$. Given a particular measurement outcome y, the state of the system after the measurement is given by Bayes' rule and reads $p'(z|y) = p_m(y|z)\, p(z)/p_m'(y)$. The case of a perfect measurement, as usually considered in stochastic thermodynamics, is described by $p_m(y|z) = \delta_{y,z}$ (where $\delta_{y,z}$ denotes the Kronecker delta for a discrete state space or the Dirac distribution for a continuous system). It is then actually redundant to explicitly distinguish between the state of the system and the measurement result because $p_m'(y) = p(z = y)$ and $p'(z|y) = \delta_{y,z}$ (the final state is pure and coincides with the measurement result).
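For a concrete illustration of these two relations, consider a Gaussian prior together with Gaussian measurement noise (the model used for the OBP later on); the posterior is then again Gaussian, and the perfect-measurement limit collapses it onto the outcome y. A minimal sketch (the function name and parameters are our own):

```python
def posterior_gaussian(mu0, var0, y, var_m):
    """Bayes' rule p'(z|y) for a Gaussian prior p(z) = N(mu0, var0) and
    Gaussian measurement noise p_m(y|z) = N(z, var_m). The posterior is
    again Gaussian; return its mean and variance (precisions add)."""
    var_post = 1.0 / (1.0 / var0 + 1.0 / var_m)
    mu_post = var_post * (mu0 / var0 + y / var_m)
    return mu_post, var_post

# In the perfect-measurement limit var_m -> 0 the posterior collapses
# onto the measurement outcome y, as stated above.
mu, var = posterior_gaussian(0.0, 1.0, y=0.7, var_m=1e-12)
print(mu, var)   # -> mu ~ 0.7, var ~ 1e-12
```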

General case
In order to incorporate the measurements on the system, we expand the phase space to the phase space of measured and true trajectories (see Fig. 1). A stochastic path in this extended space is denoted by (z, y) and the probability of choosing such a path is simply denoted by $\mathbb{P}[z, y]$. The trajectory z of the system is the projection of the whole trajectory onto the z-subspace and the probability distribution of this true stochastic path is given by $\mathcal{P}[z] = \int \mathcal{D}[y]\, \mathbb{P}[z, y]$. Dividing the time interval $[0, t_f]$ again into N time steps, the probability density of a path in the space of system and measured trajectories is factorized as $\mathbb{P}[z, y] = \mathcal{P}[z] \prod_k p_m(y_k|z_k)$. Our main assumptions are thus that the evolution of the system is independent of the measurement process and that the outcome of a measurement $y_k$ only depends on the state of the system $z_k$ at time $t_k$. This can be seen as a Markov assumption for the measurement apparatus, i.e., the previous measurement result $y_{k-1}$ influences neither the system evolution nor the next measurement result. The conditional probability $p_m(y_k|z_k)$ quantifies the uncertainty of the measurement (see Eqs. (13) and (14)).
The measured work $W_m[y]$ along a measurement trajectory y is defined as in Eqs. (1) and (5) by interchanging z with y and is in general different from the true work $W = W[z]$. Even on average it might be that $\langle W_m[y]\rangle_y \neq \langle W[z]\rangle_z$. Nevertheless, we assume that the Hamiltonian of the system is known to us and unchanged by the measurement; the only mistake is in the measurement outcome y (see Ref. [13] for the case of different Hamiltonians).
From an experimental point of view it only makes sense to consider the distribution of measured work, and we may write the average of the exponential of measured work and free energy difference ΔF over the measured trajectories, where in the last step we use the factorization (16). The assumption of microreversibility (see Eq. (9)) then allows us to rewrite Eq. (17) in terms of the backward process. Here, $\Delta e^\dagger(z_0, z_f)$ and $\delta q^\dagger[z^\dagger]$ are the energy difference and the exchange of heat with the reservoir along the system's backward trajectory, respectively. The first law also holds for the backward paths of the system, and assuming time-reversal symmetry of the measurement, $p_m(y_i|z_i) = p_m(y_i^*|z_i^*)$, we can further simplify Eq. (18), where we use that the measured work is antisymmetric under time reversal, $W_m^\dagger[y^\dagger] = -W_m[y]$, which directly follows from the corresponding property of the true work. Thus, one finally arrives at the following expression for the MJE: $\langle e^{-\beta(W_m - \Delta F)}\rangle_y = \langle e^{-\beta(W^\dagger - W_m^\dagger)}\rangle^\dagger$, where the average on the right hand side is taken over the trajectories of the backward process. This expression results from a formal manipulation and is at this point still explicitly dependent on the (backward) trajectories $z^\dagger$ of the system; it is therefore of limited practical use. Later on we will see how to overcome this difficulty for various examples where we use Eq. (20) as our formal starting point. Note that, depending on the probability distribution $p(W^\dagger, W_m^\dagger)$, an expansion in terms of the moments of the distribution could also be attempted.
As an important limiting case we immediately see that for a perfect measurement, $p_m(y_k|z_k) = \delta_{y_k,z_k}$, the measured work coincides with the work of the system, $W_m[y] = W[z]$, and the right hand side becomes unity, recovering the original JE (see Eq. (2)). Moreover, the right hand side of Eq. (20) may also be equal to one if there is a certain symmetry in the driven system. Finally, let us comment on recent work by García-García et al. [13], who also derive a modified JE including measurement errors, which is equivalent to our result, Eq. (20). However, their point of view as well as the derivation differ from the present approach. García-García et al. introduce the error $E[z, y] = W[z] - W_m[y]$ between system and measured work and derive a fluctuation theorem for the joint distribution of the measured work and this error [13], from which one can immediately derive Eq. (20). Thus, whereas all measurement errors in Ref. [13] are incorporated at the level of the final work distribution $p'(W_m, E)$, we start with a particular measurement model for the state of the system expressed in terms of $p_m(y_k|z_k)$. This is closer to a microscopic modeling of the situation because any measurement model $p_m(y_k|z_k)$ for the system will also yield a certain work distribution $p'(W_m, E)$, whereas for a given work distribution $p'(W_m, E)$ there might be many different measurement models (and even different systems) which yield the same $p'(W_m, E)$. Thus, our findings show a completely different path to derive fluctuation theorems in the presence of measurement errors. Whether our approach or the one of Ref. [13] is superior might depend strongly on the specific situation and the system under study.
In the following sections we examine two paradigmatic systems for which the right hand side of Eq. (20) can be evaluated analytically, namely an overdamped Brownian particle (OBP) in a harmonic potential in Sec. 3.2 and a two-level system (TLS) in Sec. 3.3.

Overdamped Brownian motion
We consider the overdamped dynamics of a particle in a harmonic potential in one dimension such that the Hamiltonian of the system is given by the potential energy alone, $H_{\lambda(t)}(z) = \frac{f_{\lambda(t)}}{2}(z - \mu_{\lambda(t)})^2$. The stiffness $f_{\lambda(t)}$ as well as the center of the potential $\mu_{\lambda(t)}$ can be altered in time by an external driving protocol λ(t). To simulate the system dynamics we use the Langevin equation $\dot z(t) = -\beta D\,\partial_z H_{\lambda(t)}(z) + \sqrt{2D}\,\xi(t)$ with diffusion constant D, which is related to the friction constant γ by the Einstein relation $D = (\beta\gamma)^{-1}$, and Gaussian white noise ξ(t).
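A minimal Euler-Maruyama discretization of this Langevin dynamics for a static harmonic potential can be sketched as follows; the parameter values are illustrative, and as a sanity check we only verify relaxation to the equilibrium variance $1/(\beta f)$:

```python
import numpy as np

# Euler-Maruyama sketch of the overdamped Langevin dynamics in the
# harmonic potential H = f (z - mu)^2 / 2 with diffusion constant
# D = 1/(beta*gamma). All parameter values are illustrative.

rng = np.random.default_rng(1)
beta, D, f, mu = 1.0, 1.0, 1.0, 0.0
dt, n_steps, n_traj = 1e-3, 5000, 20_000

z = np.zeros(n_traj)                       # start all particles at the origin
for _ in range(n_steps):
    drift = -beta * D * f * (z - mu)       # deterministic force term
    z += drift * dt + np.sqrt(2 * D * dt) * rng.normal(size=n_traj)

# After several relaxation times 1/(beta*D*f) the ensemble variance
# approaches the equilibrium value 1/(beta*f).
print(z.var())   # -> close to 1.0 for these parameters
```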
We specify our measurement model by assuming that the measured position of the particle $y_i$ is normally distributed around the real position $z_i$ with a standard deviation of $\sigma_m$, i.e., $p_m(y_i|z_i) = (2\pi\sigma_m^2)^{-1/2}\,e^{-(y_i - z_i)^2/(2\sigma_m^2)}$, such that, for $\sigma_m \to 0$, the conditional probability becomes a Dirac distribution and the measured coordinate coincides with the true coordinate of the particle. Such a Gaussian measurement model might be a good approximation for a noisy measurement without systematic error (i.e., we have $\langle y\rangle = \langle z\rangle$) and greatly simplifies analytical calculations. Note that the Langevin equation (23) now merely presents a convenient numerical tool. From the point of view of the observer, it has no objective reality unless $\sigma_m = 0$. The correct state of knowledge of the observer would instead be described by a stochastic Fokker-Planck equation [20,21].
Continuous driving protocol. Evaluating the general expression, Eq. (20), for a continuous and piecewise differentiable (c.p.d.) driving protocol λ(t) yields (see Appendix A.1 for the derivation) $\langle e^{-\beta(W_m - \Delta F)}\rangle_y = e^{-\beta\sigma_m^2\,\Delta f}$, where $\Delta f \equiv f_{\lambda(t_f)} - f_{\lambda(0)}$. The right hand side of the above equation equals unity for $\sigma_m = 0$, corresponding to the original JE. Similarly, if we vary the width of the potential periodically such that $f_{\lambda(0)} = f_{\lambda(t_f)}$, then the original JE is also recovered. However, this attribute is, as far as we know, specific to the model of the overdamped Brownian particle with c.p.d. driving protocol; in general the right hand side will be different from one. Interestingly, shifting the center $\mu_{\lambda(t)}$ of the potential has no effect at all on the MJE. Furthermore, if we define an effective free energy, $\Delta\widetilde F \equiv \Delta F + \sigma_m^2\,\Delta f$, where the second term may be interpreted as an additional contribution due to the uncertainty of the measurements, a JE of the form $\langle e^{-\beta(W_m - \Delta\widetilde F)}\rangle_y = 1$ holds.
Instantaneous change of driving protocol ("quench"). We also derive in Appendix A.2 an analytic expression, Eq. (26), for the MJE for an instantaneous change of the system Hamiltonian at a time $t_m$ (also called a 'quench'). We consider here that the position and the width of the parabola are altered at the same time and are constant before and after $t_m$; the result depends on the difference of the center of the parabola before and after $t_m$. Numerics. In order to verify our findings we performed Brownian dynamics (BD) simulations and used the weighted ensemble path sampling algorithm [22], which shifts the computational resources towards the sampling of rare trajectories, which have the largest impact on the JE. It has been shown that this method is statistically exact for a broad class of Markovian stochastic processes [23]. Please note that we set β ≡ 1 for all simulations in this paper.
As a simple example we change both parameters of the potential continuously and linearly in time. We choose $f_{\lambda(t)} = f_{\lambda(0)} + \alpha t$ and $\mu_{\lambda(t)} = \mu_{\lambda(0)} + \alpha' t$. For this driving scheme we find very good agreement between the BD simulation and the analytic expression, Eq. (25), which is presented in Fig. 2 (left). Furthermore, we compare Eq. (26) with simulation results where initially the system is in equilibrium in the pre-quench potential. In Fig. 2 (right) we show the results of the BD simulation (marks) as well as the analytic expression (line) for different values of $\sigma_m$, verifying our findings also for a quench.
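To illustrate how such predictions can be cross-checked, the following Monte Carlo sketch evaluates the MJE for a stiffness quench (center fixed at 0) with the Gaussian measurement model; for $\sigma_m = 0$ it recovers the original JE, while a faulty detector produces a systematic deviation. All parameter values are illustrative, and this plain sampling is not the weighted-ensemble scheme used for the figures:

```python
import numpy as np

# Measured JE for a quench f0 -> f1 of the harmonic potential: the
# particle is in equilibrium at t_m and the position is read out through
# the Gaussian measurement model y = z + N(0, sigma_m^2). beta = 1.

rng = np.random.default_rng(2)
beta, f0, f1, n = 1.0, 1.0, 2.0, 2_000_000
dF = 0.5 * np.log(f1 / f0) / beta          # free-energy change of the quench

def measured_je(sigma_m):
    z = rng.normal(0.0, 1.0 / np.sqrt(beta * f0), size=n)   # true position
    y = z + sigma_m * rng.normal(size=n)                    # noisy readout
    W_m = 0.5 * (f1 - f0) * y**2                            # measured work
    return np.exp(-beta * (W_m - dF)).mean()

print(measured_je(0.0))   # perfect detector: recovers the original JE, ~1
print(measured_je(0.5))   # faulty detector: systematic deviation from 1
```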

Two-level system
Consider a driven system consisting of two energy levels, a ground state with energy $\varepsilon_{\lambda(t)}(g)$ and an excited state with energy $\varepsilon_{\lambda(t)}(e)$, coupled to a heat bath with inverse temperature β. The system evolves according to a Markovian master equation (ME) whose transition rates obey detailed balance with respect to the instantaneous energies. Here, we denote the energy gap between excited and ground state by $\omega_{\lambda(t)} \equiv \varepsilon_{\lambda(t)}(e) - \varepsilon_{\lambda(t)}(g)$, and $p_{g/e}(t)$ denotes the probability to find the system in the ground/excited state. We measure the state of the system continuously with $(1-\eta)$ being the probability of measuring the state of the system correctly and consequently η of measuring it wrongly, i.e., we set $p_m(y_k|z_k) = (1-\eta)\,\delta_{y_k,z_k} + \eta\,(1-\delta_{y_k,z_k})$. Continuous driving protocol. The MJE of the TLS, where the external control parameter λ(t) is c.p.d., can be well approximated by Eq. (28) (see Appendix A.3 for the derivation), where $p^\dagger_{g/e}(t)$ denotes the probability that the system is in the ground/excited state in the backward process at time t and $\omega_{\lambda^\dagger(t)}$ denotes the energy gap of the TLS in the backward process. We remark that for a c.p.d. protocol with nondifferentiable points at $0 < t_1 < \ldots < t_K < t_f$ we have to split the integral in Eq. (28) at the respective points. Eq. (28) is exact up to first order in η. For higher orders (say $\eta^k$) we have to assume that $P[z_{i_1}, \ldots, z_{i_k}] \approx p(z_{i_1}) \cdots p(z_{i_k})$, which seems to be remarkably well justified (see our numerical results below). In fact, though this factorization strictly holds only for slow driving, orders $\eta^k$ for $k \gg 1$ become negligible since $\eta \in [0, 1]$, justifying our approximation. Furthermore, it is important to note that for the evaluation of the right hand side of Eq. (28) we only need to solve for the average evolution of the system (as dictated by the master equation); it is not necessary to have access to higher-order statistics.
Instantaneous change of driving protocol ("quench"). For a quench we assume that at $t_m$ with $0 < t_m < t_f$ the energy levels are shifted instantaneously and are held constant before and after. Then, the MJE is given by Eq. (29) (see Appendix A.4 for the derivation), where $\Delta\omega^\dagger \equiv \omega_{\lambda^\dagger(t_f)} - \omega_{\lambda^\dagger(0)}$ and $\omega_{\lambda^\dagger(t)}$ is defined as before. Note that both relations for the TLS (Eqs. (28) and (29)) give the original JE for a perfect measurement (η = 0). Numerics. To test these expressions, we performed Monte Carlo (MC) simulations for different values of $\eta \in [0, 0.3]$ for two driving schemes. First, the driving scheme varies the energy levels continuously and linearly in time, i.e., $\omega_{\lambda(t)} = \omega_0 + \alpha t$. In Fig. 3 (left) we plot the left hand side of Eq. (28) from MC simulations (marks) and the right hand side from numerical integration of the associated ME of the backward protocol (line). As one can see, the approximation of the MJE, Eq. (28), is in very good agreement with the simulation results for small values of η. Note that a value of η = 0.3 corresponds to a very large error of the conditional probability $p_m(y_k|z_k)$ because for a value of η = 0.5 the measurement becomes identical to inferring the system state by a fair coin toss. We also test Eq. (29), where we change the driving protocol instantaneously, i.e., $\omega_{\lambda(t)} = \omega_0 + \alpha' \Theta(t - t_m)$. Here, we find perfect agreement between simulation (marks) and numerical integration (line), which is shown in Fig. 3 (right).
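The quench case for the TLS is simple enough to check with a few lines of plain Monte Carlo. The sketch below (our own illustrative parameters, not the simulation behind the figures) shifts only the excited level and applies the faulty readout channel defined above; for η = 0 the original JE is recovered, while η > 0 produces a systematic deviation:

```python
import numpy as np

# Measured JE for a quench of the TLS: at t_m the system is in
# equilibrium, the excited level jumps omega0 -> omega1 (ground level
# fixed at zero), and the readout is wrong with probability eta. beta = 1.

rng = np.random.default_rng(3)
beta, omega0, omega1, n = 1.0, 1.0, 2.0, 2_000_000

Z0 = 1 + np.exp(-beta * omega0)
Z1 = 1 + np.exp(-beta * omega1)
dF = -np.log(Z1 / Z0) / beta                    # free-energy change

def measured_je(eta):
    p_e = np.exp(-beta * omega0) / Z0           # equilibrium excited population
    z = rng.random(n) < p_e                     # True = excited state
    y = z ^ (rng.random(n) < eta)               # flip readout with prob. eta
    W_m = np.where(y, omega1 - omega0, 0.0)     # measured quench work
    return np.exp(-beta * (W_m - dF)).mean()

print(measured_je(0.0))   # perfect detector: recovers the JE, ~1
print(measured_je(0.2))   # eta = 0.2: systematic deviation from 1
```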

Measured Jarzynski equality with feedback
Feedback describes the situation in which the state of the system is measured and the evolution of the system is manipulated by applying an external control scheme depending on the measurement outcome. The change of the JE and other fluctuation theorems under feedback has recently attracted a lot of attention, in theory [24][25][26][27][28][29][30][31][32][33] as well as in experiments [34,35]. A prominent and the first example of a generalized JE incorporating feedback, based on a single measurement performed on a stochastic thermodynamic system at a time $t_m$ with measurement outcome $y_m$, is the relation derived by Sagawa and Ueda [24]: $\langle e^{-\beta(W-\Delta F)}\rangle_z = \gamma$. The so-called efficacy parameter γ, which determines "how efficiently we use the obtained information with feedback control" [24], depends on the probability $p_{\lambda^\dagger(y_m)}(y_m^*)$ of obtaining the time-reversed outcome $y_m^*$ in the backward process: $\gamma = \int dy_m\, p_{\lambda^\dagger(y_m)}(y_m^*)$. Note that in the backward process we use the time-reversed driving protocols $\lambda^\dagger(t, y_m)$ according to the measurement statistics of $y_m$ obtained in the forward process. In particular, there is no feedback control in the backward process. Now, in the derivation of Eq. (30), the particular measurement yielding outcome $y_m$ (on which the feedback control is based) is allowed to have measurement errors. However, the left hand side of Eq. (30) is still evaluated along the true system trajectories and thus presupposes an error-free detector.

General case
Let us suppose we measure our system as we did without feedback control, but at one instance in time, denoted $t_m$ with $0 < t_m < t_f$, the protocol is changed according to the measurement outcome $y_m$: the protocol is fixed before $t_m$, i.e., $\lambda = \lambda(t)$ for $t \in [0, t_m]$, and depends on $y_m$ after $t_m$, i.e., $\lambda = \lambda(t, y_m)$ for $t \in (t_m, t_f]$. The work applied to the system, which now depends on $y_m$, is given by Eq. (32), and the same equation holds for the measured work $W_m[y|y_m]$ by interchanging z with y (keeping $y_m$). The probability of a path in phase space (z, y) under feedback control is denoted by $\mathbb{P}_{\lambda(y_m)}[z, y]$ and we again assume that it factorizes into the probability density of the system trajectory $\mathcal{P}_{\lambda(y_m)}[z]$, which now explicitly depends on $y_m$, and the conditional probabilities $p_m(y_k|z_k)$ [Eq. (33)]. Note that the difference in free energy now also depends on the measurement outcome, i.e., $\Delta F = \Delta F(y_m)$, because the Hamiltonian of the system at time $t_f$ depends on $y_m$. Using again the condition of microreversibility (see Eq. (9)) and assuming time-reversal symmetry of the conditional probabilities, $p_m(y_i|z_i) = p_m(y_i^*|z_i^*)$, we obtain Eq. (34), from which we immediately arrive at the MJE in the presence of feedback control, $\langle e^{-\beta(W_m - \Delta F)}\rangle_y = \langle e^{-\beta(W^\dagger - W_m^\dagger)}\rangle^\dagger$, which looks remarkably similar to Eq. (20). Here, $W_m^\dagger[y^\dagger|y_m]$ and $W^\dagger[z^\dagger|y_m]$ are the measured and true work, respectively, in the backward process applying the time-reversed protocol $\lambda^\dagger(t, y_m)$ according to the measurement outcome $y_m$ of the forward process. We stress that we do not perform any feedback in the backward process, equivalently to [24]. Analogously to the efficacy parameter γ (see Eqs. (30) and (31)) we call the right hand side of Eq. (35) the measured efficacy parameter, because the JE is evaluated using the measured trajectories. Note the subtle distinction between Eqs. (30) and (35). Eq. (30) starts with $\langle e^{-\beta(W-\Delta F)}\rangle_z$, which experimentally requires an error-free detector to evaluate.
We instead start with $\langle e^{-\beta(W-\Delta F)}\rangle_y$, which can be directly evaluated also with a faulty detector. Our final theoretical result (36) then indeed depends on $z^\dagger$. However, based on this definition we show below how to overcome this difficulty for various examples. Furthermore, note that a complementary analytical analysis confirming our results has been reported in Ref. [?] for the example of the Szilard engine.
In the limiting case of a perfect measurement, $p_m(y_k|z_k) = \delta_{y_k,z_k}$, Eq. (35) simplifies considerably: due to the normalization of the conditional probabilities, the integrals over all $y_k^*$ with k < m are equal to unity and the measured efficacy reduces to the efficacy γ of Eq. (31). Only in this case are the efficacy γ and the measured efficacy $\gamma_m$ the same, as it should be. However, for a measurement outcome $y_m$ including errors, γ deviates from $\gamma_m$. The interpretation and physical significance of the difference between γ and $\gamma_m$ can be explained as follows: consider two observers, Alice and Bob. Suppose that Alice measures the state of the system with a faulty detector whereas Bob measures the system with a perfect detector. Furthermore, suppose that only Alice performs the feedback control based on her measurement result at time $t_m$. Then, if Alice evaluates the JE of the work done on the system along her measured trajectories, she will observe the result $\gamma_m$. In contrast, Bob, given the correct system trajectories and knowledge about the feedback action of Alice and her faulty detector, is able to verify the standard Sagawa-Ueda relation with the efficacy parameter γ.

Overdamped Brownian motion
As an explicit example for which we can evaluate the right hand side of Eq. (35) analytically, we look again at an OBP in a harmonic potential (see Sec. 3.2) and assume that the center of the potential is initially at $\mu_{\lambda(0)} = 0$ with stiffness $f_{\lambda(0)}$. Both parameters are changed instantaneously at time $t_m$ if the measured position at that time is $y_m > 0$: the center is shifted to $\mu_{\lambda(t_f)}$ and the stiffness to $f_{\lambda(t_f)}$. Otherwise, for $y_m < 0$, the potential remains unchanged. For this specific example Eq. (35) can be evaluated explicitly, yielding Eq. (39) (see Appendix A.5), where $\kappa \equiv 2\beta f_{\lambda(0)}\sigma_m^2$. For the special case of only altering $f_{\lambda(t)}$ and keeping the position of the parabola fixed, i.e., $\mu_{\lambda(t_f)} = \mu_{\lambda(0)}$, Eq. (39) reduces to Eq. (40). On the other hand, if the stiffness is held constant, $f_{\lambda(t_f)} = f_{\lambda(0)} = f$, but the parabola is shifted, we obtain Eq. (41). We have verified Eqs. (39)-(41) by performing BD simulations for various driving schemes (not shown here) and will discuss the paradigmatic model of an "information ratchet" [24] in more detail in the next paragraph, also showing numerical results.
Information ratchet. The Brownian particle is initially in thermal equilibrium in the harmonic potential with center $\mu_0$. We then measure the position of the particle $y_m$ at time $t_m$ and perform the following feedback scheme: if $y_m \geq \mu_0 + L$, with L > 0 a constant, we shift the center of the potential to $\mu_{t>t_m} = \mu_0 + 2L$; if $y_m < \mu_0 + L$ we do nothing. We then replace $\mu_0 \to \mu_0 + 2L$ and start over again after some transient relaxation time. By repeatedly performing this feedback protocol, we can actually move the average position of the particle to the right, ideally without performing work. Here, ΔF = 0 holds throughout the whole process. Furthermore, one can also extract work from the system by this feedback control if the particle is transported against a potential gradient as, e.g., in the experiment [34]. For a single step of the ratchet, where we put $\mu_0 = 0$ for simplicity, the measured efficacy with feedback control is given by Eq. (42). The derivation follows the same steps as in Appendix A.5, but the integral over $y_m^*$ is split at L instead of 0. Eq. (42) differs from the efficacy parameter γ of the original information ratchet [24], Eq. (43). In Fig. 4 (left) we plot the solutions of the two expressions as a function of the standard deviation $\sigma_m$ of the measurement. The two coincide for the case of a perfect measurement. However, for finite values of $\sigma_m$ the efficacy γ of the feedback control (dashed line) is lower than for a perfect measurement: if the measurement has an error, then the potential will be shifted even though the real position of the particle may not be greater than L. Then we may actually apply work to the system instead of extracting it, and the average value of extracted work is lower for noisier measurements.
If we look at the work we measure using the same apparatus as we used to measure $y_m$ (line), we see that with increasing measurement error $\sigma_m$ the measured efficacy $\gamma_m$ also increases, in strong contrast to γ. Since the measured work is given in terms of the measured position $y_m$ of the particle, we always apply the "correct" feedback scheme from the observer's point of view. Thus, we (the observer) always think that we extract work. This can also be seen in the distribution of measured (purple) and system (blue) work in Fig. 4 (right), where the probability of measured work is only non-zero for $W_m < 0$. To support this claim even further, we can calculate the average measured and system work by integration of Eq. (32) over z and $y_m$, where the integral is nonzero only if $y_m > L$. Their difference is given by Eq. (44), where $\kappa = 2\beta f \sigma_m^2$. Thus, on average, the measured extracted work (note that in our convention work is positive if it is done on the system) will be greater than the true extracted work and even increases with $\sigma_m$. For a larger value of $\sigma_m$ the probability distribution $p_m'(y_m)$ (see Eq. (13)) of the measured position $y_m$ is broader (i.e., has a larger variance) than p(z), but still has the same mean value as p(z). Then, measurement outcomes with $y_m > L$ are more frequent and $\gamma_m$ increases.
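A single ratchet step is simple enough to simulate directly. The sketch below (illustrative parameters of ours, $\mu_0 = 0$) verifies the qualitative claims above: the measured work is non-positive by construction, while the true work can be positive whenever the detector overestimated the position:

```python
import numpy as np

# One step of the information ratchet (mu_0 = 0): sample the particle
# from equilibrium in H = f z^2/2, measure y = z + N(0, sigma_m^2), and
# shift the potential center to 2L whenever y >= L. For the instantaneous
# shift the work is W = f/2 [(z - 2L)^2 - z^2] = 2 f L (L - z), and the
# measured work is the same expression with z replaced by y. beta = 1.

rng = np.random.default_rng(4)
beta, f, L, sigma_m, n = 1.0, 1.0, 0.5, 0.5, 1_000_000

z = rng.normal(0.0, 1.0 / np.sqrt(beta * f), size=n)   # true position
y = z + sigma_m * rng.normal(size=n)                   # noisy readout
shift = y >= L                                         # feedback condition

W   = np.where(shift, 2 * f * L * (L - z), 0.0)        # true work
W_m = np.where(shift, 2 * f * L * (L - y), 0.0)        # measured work

# The observer only ever records extraction: W_m <= 0 by construction,
# since the shift is applied only when y >= L. The true work, however,
# can be positive when the measurement overestimated the position.
print(W_m.max() <= 0.0, (W > 0).any(), W_m.mean() < W.mean())
```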

Two-level system
Similarly to the derivation of the MJE of the TLS without feedback, we find with feedback, for a c.p.d. but at this point unspecified driving protocol, an approximate modification of the original JE, Eq. (45) (see Appendix A.6 for details). Here, $p_{z,\lambda^\dagger(y_m)}(t)$ is the probability for the system to be in state z (ground or excited) at time t in the backward process with the backward protocol according to the measurement outcome $y_m$ (ground or excited state) of the forward process. We again note that we do not apply feedback in the backward process and that Eq. (45) is valid under exactly the same conditions as discussed below Eq. (28). Furthermore, $\omega_{\lambda^\dagger(t,y_m)}$ is the energy gap as defined in Sec. 3.3 with the time-reversed protocol according to the outcome of the forward process. For a c.p.d. protocol with nondifferentiable points the integral in Eq. (45) is again split into parts at the respective points. For most driving protocols with feedback that we have considered numerically (not shown here), Eq. (45) is a very good approximation.
For a driving protocol that is not continuous in time, we find a different expression. Here we assume, as in the case without feedback, that the protocol is constant before and after t_m and that a quench is performed at time t_m. We then find for the MJE (see also Appendix A.6) an expression in which p_{z_m,λ†(z̄_m)}(t_m) denotes the probability for the system to be in state z_m at time t_m in the backward process, with the backward protocol chosen according to the measurement outcome y_m = z̄_m. Here we introduced the complementary state z̄_m to z_m (i.e., if z_m = g then z̄_m = e and vice versa). Furthermore, Δω_{λ†(z̄_m)} ≡ ω_{λ†(t_f,z̄_m)} − ω_{λ†(0,z̄_m)} and ω_{λ†(t,z̄_m)} = ε_{λ†(t,z̄_m)}(e) − ε_{λ†(t,z̄_m)}(g).
We now discuss an example of a protocol with a quench in detail.
Conditional swap. As a specific example for which we can extract work from a single heat bath by measuring the state of the TLS at time t_m, we discuss a feedback operation that we call a conditional swap: if at time t_m the measured state y_m of the TLS is the excited state, we interchange the two energy levels, such that we extract work ω = ε_e − ε_g if the system state z_m is the excited one and instead perform work ω on the system if z_m = g. If y_m = g we do nothing. We compare our findings (see Eq. (46)) for this conditional swap to the corresponding expression for the efficacy parameter γ, which is given for this specific example below. Note that in the model of the conditional swap p_{g,λ†(g)}(t_m) = p_{e,λ†(e)}(t_m) and p_{e,λ†(g)}(t_m) = p_{g,λ†(e)}(t_m). We show the difference between γ (dashed) and γ_m (line) for different values of η in Fig. 5 (left). As one can see, for a perfect measurement the two coincide. However, if η is greater than zero, they differ. The explanation is very similar to that for the information ratchet discussed in Sec. 4.2: if the measurement y_m involves errors, the two levels are sometimes interchanged even though the system is in the ground state, resulting in work applied to the system instead of work extracted from it. Looking at the work distribution of the system (see Fig. 5, right top), one sees that for η > 0 less work is extracted, whereas the probability of applying work to the system increases with the measurement error (note that in our convention work is negative if it is done by the system). The efficacy parameter is then lower than without measurement error. On the other hand, if we look at the measured work (see Fig. 5, right bottom), which is calculated from the measured state of the system, we only ever measure positive work extraction from the system when performing the conditional swap.
Furthermore, if p_e(t_m) < 1/2 (as in our case), the probability of measuring the excited state is always larger than the actual probability of the system to be in the excited state. Therefore, the probability of extracting work from the system, and hence γ_m, increases with larger values of η.
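The bookkeeping mismatch between measured and true work under the conditional swap can be illustrated numerically. The sketch below is a simplified two-outcome model (not the paper's full master-equation dynamics): the TLS is excited with probability p_e at t_m, the detector reports the wrong state with probability η, and the swap is applied whenever y_m = e:

```python
def swap_work_averages(p_e, eta, omega=1.0):
    """Average true and measured work for the conditional swap.
    Convention: work is positive when it is done on the system."""
    p_g = 1.0 - p_e
    p_y_e = (1.0 - eta) * p_e + eta * p_g   # probability of measuring 'excited'
    # True work: -omega if (y=e, z=e) -> extraction; +omega if (y=e, z=g)
    W_true = -omega * (1.0 - eta) * p_e + omega * eta * p_g
    # Measured work: the observer books -omega whenever y = e
    W_meas = -omega * p_y_e
    return W_true, W_meas
```

One finds (−W_meas) − (−W_true) = 2 ω η p_g ≥ 0: the observer always overestimates the extracted work, and the overestimate grows with η, in line with the behaviour of γ_m discussed above.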
Thus, by adding the stochastic mutual information I(z_m, y_m) ≡ ln [p(y_m, z_m)/(p(y_m)p(z_m))] to the exponent, we can make the right-hand side of the "Jarzynski-Sagawa-Ueda relation" equal to unity again. This result provides a nice interpretation: it tells us that the amount of work we can extract from the system is bounded by ⟨I(y_m, z_m)⟩_{z_m,y_m}, which can be viewed as the amount of correlations established during the measurement.
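For a binary measurement channel, the stochastic mutual information and its average can be computed directly. A minimal sketch, assuming the flip-error model used throughout (the detector reports the wrong state with probability η):

```python
import math

def avg_mutual_information(p_e, eta):
    """<I(z_m, y_m)> for a two-state system measured through a
    binary symmetric channel with error probability eta."""
    p = {'g': 1.0 - p_e, 'e': p_e}
    p_yz = lambda y, z: (1.0 - eta) if y == z else eta
    p_y = {y: sum(p_yz(y, z) * p[z] for z in 'ge') for y in 'ge'}
    avg = 0.0
    for z in 'ge':
        for y in 'ge':
            joint = p_yz(y, z) * p[z]
            if joint > 0.0:
                # stochastic mutual information of this (z, y) pair
                avg += joint * math.log(joint / (p_y[y] * p[z]))
    return avg
```

The average is non-negative and vanishes for η = 1/2 (a completely uninformative detector), consistent with its interpretation as the correlations established by the measurement.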
Unfortunately, in the case of measurement errors, validating Eq. (49) requires being able to observe the system perfectly during the time in which it is not controlled. But this again raises the question of how this might be achieved, because it means that the experimentalist's detector is faulty only prior to the feedback step and correct otherwise. Eq. (49) could therefore be viewed as an "objective" fluctuation theorem, which a second "superobserver" with perfect access to both the system and detector degrees of freedom would observe. In contrast, the MJE we have considered so far could be called a "subjective" fluctuation theorem, which is based on the knowledge of the observer only.
In fact, we will now show that taking the full stochastic mutual information between the system and detector into account, defined as I(z, y) ≡ ln [P[z, y]/(P[z]P[y])], yields a fluctuation theorem of the form ⟨e^{−β(W[z|y]−ΔF(y))−I(z,y)}⟩_{z,y} = 1 (51), which holds without and with measurement errors and without and with feedback, even if the feedback is performed continuously, i.e., at every time step δt. However, the latter relation may be invalid for some error-free feedback control processes in which absolute irreversibility is inherent [?]. We remark that the validity of Eq. (51) without feedback but with measurement errors was already noted in Ref. [?]. Here, we used that the JE ∫D[z] P[z] e^{−β(W[z|y]−ΔF(y))} = 1 holds for every fixed measurement record y and (consequently, in the case of feedback) for any control protocol λ(t, y). Thus, the mutual information appears to be a universal quantity for establishing fluctuation theorems in which not only the system but also the detector has to be taken into account, although it does not possess an obvious thermodynamic interpretation in the case without feedback. Unfortunately, finding a (non-trivial) quantity G = G[y] such that the MJE can be corrected, i.e., such that ⟨e^{−β(W−ΔF)−G}⟩_y = 1, remains an open problem at the moment.
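The superobserver relation Eq. (51) can be verified exactly for the conditional-swap example, where the trajectory reduces to the single state z at t_m. The sketch below assumes the flip-error model with probability η; since the swap only relabels the two levels, ΔF = 0 for both feedback branches. It evaluates the exact sum over all (z, y) pairs:

```python
import math

def superobserver_average(beta, eps_g, eps_e, eta):
    """Exact evaluation of <exp(-beta*W - I(z, y))> for the conditional swap.
    Delta F = 0 here because the swap leaves the spectrum unchanged."""
    Z = math.exp(-beta * eps_g) + math.exp(-beta * eps_e)
    p = {'g': math.exp(-beta * eps_g) / Z, 'e': math.exp(-beta * eps_e) / Z}
    p_yz = lambda y, z: (1.0 - eta) if y == z else eta
    p_y = {y: sum(p_yz(y, z) * p[z] for z in 'ge') for y in 'ge'}
    omega = eps_e - eps_g
    total = 0.0
    for z in 'ge':
        for y in 'ge':
            joint = p_yz(y, z) * p[z]
            if joint == 0.0:
                continue
            # swap is applied only if the detector reports 'excited'
            W = (-omega if z == 'e' else omega) if y == 'e' else 0.0
            I = math.log(p_yz(y, z) / p_y[y])  # stochastic mutual information
            total += joint * math.exp(-beta * W - I)
    return total
```

The function returns unity for any β, level splitting, and error rate η ∈ (0, 1), whereas without the I term the same average gives the efficacy γ, which generally differs from 1.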

Conclusions and Outlook
In the present paper, we generalized the original JE, expressed in terms of the "true" work done on the system, to an equation for arbitrary measurement errors based on the measurement record y. The key ingredient for this was the conditional probability distribution p_m(y|z), which quantifies the uncertainty of a measurement outcome y given that the system state is z and which defines an abstract measurement model. In fact, by shifting the attention from z to y we have taken only a first step in generalizing stochastic thermodynamics to the presence of measurement errors, because much more sophisticated inference schemes could be considered as well (we actually did not even use Eq. (14) in our derivations, leaving this interesting problem to future work).
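As one illustration of such an inference scheme, the Bayesian posterior implied by p_m(y|z) can be written down explicitly in the Gaussian case. The sketch below is a toy example under stated assumptions (Gaussian prior p(z) and Gaussian error model y = z + noise; it is not a scheme used in the paper's derivations):

```python
def gaussian_posterior(mu, var_z, y, var_m):
    """Posterior p(z|y) for prior z ~ N(mu, var_z) and measurement
    model y | z ~ N(z, var_m): the standard conjugate Gaussian update."""
    var_post = 1.0 / (1.0 / var_z + 1.0 / var_m)   # posterior variance
    mu_post = var_post * (mu / var_z + y / var_m)  # precision-weighted mean
    return mu_post, var_post
```

For var_m → 0 the posterior collapses onto y (a perfect measurement), while for var_m → ∞ it reverts to the prior, mirroring the two limits of the measurement model discussed in the main text.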
Then, using the formalism of stochastic path integrals, we derived the MJE (measured JE) without feedback (Eq. (20)) and with feedback control (Eq. (35)). These expressions are general (under the assumption of a Markovian measurement apparatus), but explicitly involve quantities that depend on the system trajectory. For two important paradigmatic examples we could overcome this difficulty and express the MJE in terms of fixed Hamiltonian parameters or average quantities, which can be computed from a master equation. For an OBP trapped in a harmonic potential the derived expressions are exact, whereas for the TLS exact solutions were found only for quenches, together with very good approximations for continuous driving protocols. We also checked our findings against simulation results. In the limiting case of a perfect measurement the general MJE equations reduce to the original JE without and with feedback. For the non-ideal case we hope that our theory provides a convenient way to account for the inevitably noisy statistics in experiments, which so far have beautifully demonstrated the validity of the JE and other fluctuation theorems within the given statistical accuracy; see, e.g., Refs. [36][37][38][39][40][41][42][43][44].
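The error-free limit can be checked by direct sampling. The sketch below verifies the original JE for an instantaneous stiffness quench of the trapped OBP, assuming the potential convention V(z) = f z² (equilibrium variance (2βf)^{−1}), for which ΔF = ln(f1/f0)/(2β); the parameter values are illustrative:

```python
import math
import random

def je_quench_average(beta=1.0, f0=1.0, f1=2.0, n=200_000, seed=1):
    """Monte Carlo estimate of <exp(-beta (W - dF))> for an
    instantaneous quench f0 -> f1 of the potential V(z) = f z^2."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(2.0 * beta * f0)  # equilibrium std dev at f0
    dF = math.log(f1 / f0) / (2.0 * beta)     # free-energy difference
    acc = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, sigma)
        W = (f1 - f0) * z * z                 # work done by the quench
        acc += math.exp(-beta * (W - dF))
    return acc / n
```

Replacing z by a noisy record y_m = z + noise in the work makes the estimate drift away from unity; this is exactly the deviation the MJE quantifies (provided σ_m² < (2β|Δf|)^{−1}, so the exponential average still converges).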
Furthermore, in the case of feedback control the correct handling of measurement errors is even more important, because the obtained information is fed back into the system to influence its future behaviour. Here we have seen that the measured efficacy γ_m may exceed the system efficacy γ and, contrary to previous intuition, increases with larger measurement errors, which we calculated explicitly for an information ratchet of an OBP and a conditional swap of the TLS. Furthermore, we showed that the "Jarzynski-Sagawa-Ueda relation" incorporating the full stochastic mutual information always holds for a "superobserver" who has access to both the measured and the system trajectories, without and with measurement errors and without and with feedback.
Finally, we would like to mention that a lot of research has already been carried out to understand the stochastic thermodynamics of coarse-grained systems, see, e.g., Refs. [45][46][47][48][49][50][51][52][53][54][55]. There, given a set of microstates, a subset of observable states is introduced, which defines the coarse-graining and which is sometimes explicitly modeled by a detector or sensor. Based on the observability of this subset, the modified laws of (stochastic) thermodynamics are investigated. Though one can argue that both approaches pursue the same research goal, it is worthwhile to point out that our approach is different in principle. First, the coarse-graining approach still assumes that it is possible to observe the particular subsets perfectly, i.e., error-free; and second, it is also implicitly assumed that it is actually possible to identify these subsets or to physically model a detector, which might be challenging for some large detectors such as a camera. Nevertheless, the question to what extent our approach based on an abstract measurement model p_m(y|z) is equivalent to an explicit detector model with underlying coarse-grained system dynamics is, from our point of view, interesting to study in the future.
We first look at the integral of Eq. (A.3): in the limit N → ∞ the time steps dt = t_f/N become infinitesimal and we can approximate the term in the exponential, where the prime (e.g., f') denotes a derivative with respect to time t. Note that the additional dt in front of the integral is correct. Furthermore, this step is only exact provided that the protocol is differentiable. However, as long as it is continuous and nondifferentiable only at a finite number of points 0 < t_1 < ... < t_K < t_f, the argument is easily generalized by splitting the integral at the respective points (the subintervals remain infinitesimally small at all points). Then, by the mean value theorem of integration, we know that there exists a ξ ∈ [0, t_f] such that the resulting identity holds for N → ∞. In the last step, we write the product as an exponential and use an approximation of the logarithm up to first order. Taking the limit N → ∞, with f_{λ_0} = f_{λ(t_0)} and f_{λ_N} = f_{λ(t_N)}, we arrive at Eq. (25).
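The step of writing the product as an exponential via a first-order expansion of the logarithm can be illustrated numerically. A minimal sketch, with an arbitrary smooth test function a(t) standing in for the integrand (not the paper's specific rates):

```python
import math

def product_vs_exponential(a, t_f, N):
    """Compare prod_k (1 + a(t_k) dt) with exp(sum_k a(t_k) dt);
    the two agree to first order in dt = t_f / N."""
    dt = t_f / N
    prod = 1.0
    riemann = 0.0
    for k in range(N):
        prod *= 1.0 + a(k * dt) * dt
        riemann += a(k * dt) * dt   # left-endpoint Riemann sum
    return prod, math.exp(riemann)
```

The discrepancy between the two return values shrinks as O(dt), consistent with the approximation becoming exact in the limit N → ∞.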

Appendix A.2. Derivation of MJE for instantaneous driving of OBP
Here, we derive Eq. (26), where we assume that the stiffness of the harmonic potential as well as the position are instantaneously changed at the same time t_m. Since the driving protocol is constant before and after t_m, it holds that δf_{λ†,k+1} = 0 as well as δ[fμ]_{λ†,k+1} = 0 for all k ≠ m. In this case the right-hand side of Eq. (20) reads as follows after integration over all y_k and z_k with k ≠ m. For the integral over y_m to converge, it must again hold that σ_m² < (2β|Δf|)^{−1}. We now use that for the harmonic potential the equilibrium probability distribution of the position of the OBP (the initial system state) is Gaussian with mean μ_{λ(t_f)} and variance (2βf_{λ(t_f)})^{−1} in the time-reversed protocol. The integration over z_m then finally yields Eq. (26).
Note that the integral over z_m only converges under the condition stated above. For large N we use an approximation such that Eq. (A.14) simplifies. Writing the product explicitly, we then make the crucial assumption that P†[z_{k_1}, ..., z_{k_n}] ≈ p†(z_{k_1}) ··· p†(z_{k_n}). To ensure equality, we introduce a "rest" term R. Taking the limit N → ∞, we again assume that the protocol is differentiable (see the remark below for the case of a c.p.d. protocol). Evaluating the sums over z_{i_k}, and denoting the time derivative of the energy gap of the TLS by ω̇_{λ†(t_k)} = ε̇_{λ†(t_k)}(e) − ε̇_{λ†(t_k)}(g) and the probability of the system to be in the ground/excited state at time t_i by p†_{g/e}(t_i), both in the backward protocol of the driving scheme, we arrive at Eq. (A.22). Note that Eq. (A.22) is exact up to first order in η.
Finally, we remark that for a c.p.d. protocol with nondifferentiable points at 0 < t_1 < ... < t_K < t_f the result above readily generalizes: in Eq. (A.22) we have to split the integral at the respective points, cf. Eq. (A.23).