Stochastic entropy production for continuous measurements of an open quantum system

We investigate the total stochastic entropy production of a two-level bosonic open quantum system under protocols of time dependent coupling to a harmonic environment. These processes are intended to represent the measurement of a system observable, and consequent selection of an eigenstate, whilst the system is also subjected to thermalising environmental noise. The entropy production depends on the evolution of the system variables and their probability density function, and is expressed through system and environmental contributions. The continuous stochastic dynamics of the open system is based on the Markovian approximation to the exact, noise-averaged stochastic Liouville-von Neumann equation, unravelled through the addition of stochastic environmental disturbance mimicking a measuring device. Under the thermalising influence of time independent coupling to the environment, the mean rate of entropy production vanishes asymptotically, indicating equilibrium. In contrast, a positive mean production of entropy as the system responds to time dependent coupling characterises the irreversibility of quantum measurement, and a comparison of its production for two coupling protocols, representing connection to and disconnection from the external measuring device, satisfies a detailed fluctuation theorem.


Introduction
Most applied and theoretical quantum mechanics research is underpinned by the theory of open quantum systems. Any realistic quantum system interacts with its environment, though the interaction in many cases is weak. Open quantum systems are of interest due to the characteristic dynamical properties they display, specifically irreversible dissipative behaviour on approach to a steady state, and decoherence, neither of which is found in closed systems. These phenomena are of crucial importance in quantum technological applications such as quantum computing [1] and in theoretical developments in quantum thermodynamics [2].
In this paper, we concern ourselves with the thermodynamic behaviour of an open quantum system, more specifically with the stochastic entropy production associated with the evolution of its reduced density matrix brought about by changes in interactions with the environment. Such a framework can naturally describe the consequences arising from a time dependent Hamiltonian coupling of the system to its environment, but it can also represent the effect of quantum measurements involving parts of the environment, such as a measuring device. The field of stochastic thermodynamics began as a generalisation of the laws of thermodynamics applied to stochastic systems [3], such as the behaviour of colloidal particles or molecular systems exposed to heat baths [4]. These developments allow us to compute the entropy production associated with individual stochastic trajectories of the evolving reduced density matrix. This stochastic entropy production can satisfy a detailed fluctuation theorem describing the relationship between the effects of time-reversed versions of the coupling protocol [5].
In order to apply the tools of stochastic thermodynamics to the quantum regime, a notion of quantum trajectories must be established. Quantum trajectories have long been used in the field of quantum optics [6], and they are typically realised through the unravelling of a deterministic equation of motion for the variables describing the system. Unravelling is the elaboration of the deterministic dynamics into a stochastic equation of motion that is capable of describing randomness in system behaviour, such as fluctuations around the average evolution. The deterministic dynamics implicitly describes an average over the range of random behaviour. Randomness might be ascribed to outcomes of continuous measurements involving the environment [7], such as photon counting or homodyne/heterodyne detection, though unitary evolution of a system together with its environment where the state of the environment is not fully specified can also be represented using a framework of stochastic unravelling.
To obtain these unravellings, we start with the stochastic Liouville-von Neumann (SLN) equation for the system's reduced density matrix [8][9][10][11][12], a non-Markovian stochastic differential equation derived from the Feynman-Vernon path integral-based consideration of open quantum system behaviour [13]. It is important to note that the stochasticity in the SLN is merely a mathematical feature that requires averaging in order to produce the exact deterministic behaviour of the reduced density matrix. We then derive the Markovian limit of the noise-averaged SLN equation in order to make possible a simple unravelling of the deterministic dynamics using a Kraus operator representation of physical stochasticity brought about by the environment. The stochastic differential equation that emerges then describes physical randomness and can provide a basis for deriving the stochastic entropy production.
We employ stochastic entropy production to quantify the irreversibility of the continuous measurement of the quantum state of a two-level bosonic system. We develop a framework where coupling to an external device causes the system to select an eigenstate of the measured observable, and subsequent decoupling returns the system to the initial thermal state. We find that this requires the system to be coupled, initially with equal strength, to three harmonic baths representing elements of the environment. Each bath couples to an observable of the system represented by one of the Pauli matrices. The system and baths are illustrated in Figure 1. The resulting random exploration of the Bloch sphere, with bias brought about by the system Hamiltonian, produces behaviour consistent with a thermal Gibbs state at high temperature. This situation is then disturbed by increasing one of the environmental coupling strengths, as a representation of additional interaction associated with a measuring device, which obliges the system to move towards and dwell in the vicinity of an appropriate eigenstate according to the Born rule. In our case, we will be dealing with continuous measurements of the system energy. Returning the coupling to its initial strength reverses this dynamical behaviour, such that we can regard the whole sequence as a simple representation of the dynamics of quantum measurement. Having established the stochastic dynamics, we can then derive the stochastic entropy production of measurement using established analysis [5] and show that measurement is associated, on average, with positive entropy production. Note that incorporating stochasticity into the dynamics is crucial to computing the irreversibility of a process. It can be demonstrated that the mean evolution of the reduced density matrix towards stationarity entirely misses the quantum selection of an eigenstate of the measured observable. 
Quantifying the irreversibility of a stochastic trajectory, or indeed the average irreversibility over an ensemble of trajectories, requires the incorporation of physical randomness into the dynamics.
The plan for the paper is as follows: in Section 2 the essential ideas of stochastic thermodynamics are introduced, as well as the SLN equation describing a system coupled to several independent baths. In Section 3 we obtain a Markovian approximation of the noise-averaged SLN equation by taking a high temperature limit and assuming the relaxation times of the environment to be short. We then construct a stochastic unravelling of the equation of motion by choosing appropriate Kraus operators and approximations. In Section 4 we study protocols in which the strength of one of the bath couplings is varied, and present our calculations of the stochastic entropy production, indicating that there is an entropic cost of measurement and showing that the entropy production adheres to a detailed fluctuation theorem.

Theory
Continuous or weak quantum measurements are central to the work we will be discussing. They bring about a continuous change to the system being monitored, induced by the environment to which the system is coupled. Continuous quantum measurements produce evolution described by a stochastic differential equation (SDE), with an entropy production which can then be assessed by applying methods of stochastic thermodynamics. We describe methods for stochastic entropy production in Section 2.1, followed by the SLN equation in Section 2.2, which is the starting point for the dynamics we will be exploring.

Stochastic Thermodynamics
The concept of entropy arose over a century ago from a consideration of the irreversibility of macroscopic phenomena, but our understanding has evolved significantly since then. Insights into entropy production were expanded by applying thermodynamic concepts to small and individual systems in the form of fluctuation theorems. These were first introduced by Evans et al. [14][15][16][17], and then similar ideas appeared in chaos theory [18] and stochastic modelling [19,20], and were also developed by Jarzynski [21] and Crooks [22,23], amongst others. These developments introduce the idea of entropy production associated with individual stochastic trajectories.
Stochastic entropy production can be defined as the contrast in likelihoods of forward and reverse sequences of system behaviour [4]. Specifically, it takes the form of the logarithm of a ratio of the probabilities of a forward trajectory driven by a forward protocol of driving forces, and the corresponding reversed trajectory driven by a reverse protocol. These protocols are defined such that one is the time-reversed version of the other, e.g., if the system Hamiltonian varies in a specific way for the forward protocol, the exact opposite time dependence takes place in the reverse protocol. It may be shown that irreversible behaviour such as relaxation towards a stationary state is then accompanied by positive mean stochastic entropy production.
We can write the total stochastic entropy production associated with the evolution of a set of time-dependent coordinates x(t) for 0 ≤ t ≤ τ as

∆s_tot = ln [ p_F(x(0), 0) P_F[x(τ)|x(0)] / ( p_F(x(τ), τ) P_R[x†(τ)|x†(0)] ) ],   (1)

where x† is a set of coordinates evolving to form a time-reversed trajectory during the reverse protocol. Concretely, the reverse trajectory is defined as x†(t) = x(τ − t), such that the starting and finishing points x†(0) and x†(τ) are x(τ) and x(0) respectively. It is a detailed reversal of the sequence of events described by the forward trajectory. For simplicity, our notation disregards an inversion in coordinates that are odd under time reversal symmetry when defining the reverse trajectory [5]. p_F(x, t) is the probability density function (pdf) over the coordinates obtained from solving the appropriate Fokker-Planck equation for the forward protocol, while P_F and P_R are the conditional probability densities for a trajectory from x(0) to x(τ) under the forward protocol, and for the time reversed trajectory from x†(0) to x†(τ) under the reverse protocol, respectively. Denoting the pdfs of the total entropy production ∆s_tot in forward and reverse driving protocols as P_F(∆s_tot) and P_R(∆s_tot), respectively, it is possible in certain circumstances to relate them as follows [24,25]:

P_F(∆s_tot) / P_R(−∆s_tot) = exp(∆s_tot),   (2)

which is called the detailed fluctuation theorem. This result states that the probability of a negative stochastic entropy production for a reverse protocol is not zero, in spite of the usual demands of the second law, though it is exponentially smaller than the probability of a positive production of the same magnitude during the corresponding forward protocol.

SDEs and Fokker-Planck equation
We need to construct an appropriate evolution equation for the reduced density matrix [7]. Kraus operators define mappings of the density matrix between an initial and final state, and continuous stochastic trajectories of the density matrix can be generated using a sequence of Kraus operator maps defined for an infinitesimal time interval dt.
The general Kraus representation of the evolution of a density matrix ρ̄ in a time interval dt is given by

ρ̄(t + dt) = Σ_k M_k ρ̄(t) M_k†,   (3)

where the M_k are the Kraus operators, labelled by the index k. The map should be trace preserving, which requires the operator sum identity Σ_k M_k† M_k = I to be satisfied (I is the identity operator). The Kraus operators depend on dt, and for continuous evolution we require them to differ only incrementally from the identity, i.e. M_k ∝ I, as dt → 0.
An interpretation of Eq. (3) is that each Kraus operator can implement a stochastic action on the system by the environment, namely

ρ(t + dt) = M_k ρ(t) M_k† / Tr( M_k ρ(t) M_k† ),   (4)

where ρ is a member of an ensemble of density matrices representing the uncertain current state of the system. The trace of ρ is clearly preserved. The operation takes place with a conditional probability given by p_k = Tr( M_k ρ(t) M_k† ), such that the average over all possible transformations of ρ(t) is given by

Σ_k p_k M_k ρ(t) M_k† / Tr( M_k ρ(t) M_k† ) = Σ_k M_k ρ(t) M_k†,   (5)

and a further averaging over the ensemble of ρ(t) yields Eq. (3), with the over-bar therefore denoting an ensemble average. For this interpretation to be physically acceptable, ρ must remain positive definite under the mapping of Eq. (3). This approach can then lead to a Lindblad equation for the ensemble averaged density matrix ρ̄ [26]:

dρ̄/dt = −i[H_sys, ρ̄] + Σ_k ( L_k ρ̄ L_k† − ½ {L_k† L_k, ρ̄} ),   (6)

with Lindblad operators L_k related to the M_k. Such a deterministic Lindblad equation can be unravelled into a stochastic differential equation to simulate stochastic interactions such as continuous measurements [27]. We show how this framework can be implemented for our two-level bosonic system in Section 3.
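As a concrete numerical illustration of these requirements (a sketch using the standard amplitude-damping channel for a qubit, which is not one of the coupling schemes employed later in this paper), the following checks the operator sum identity, the conditional probabilities p_k, and trace preservation of the map:

```python
import numpy as np

# Amplitude-damping Kraus pair for a qubit, with decay probability p per step
p = 0.3
M0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
M1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)
kraus = [M0, M1]

# Operator sum identity: sum_k M_k^dagger M_k = I  (trace preservation)
identity_sum = sum(M.conj().T @ M for M in kraus)
assert np.allclose(identity_sum, np.eye(2))

# A density matrix (the |+> state) and the probabilities p_k = Tr(M_k rho M_k^dagger)
rho = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
probs = [np.trace(M @ rho @ M.conj().T).real for M in kraus]
assert np.isclose(sum(probs), 1.0)   # outcome probabilities sum to one

# The p_k-weighted average of the normalised outcomes equals the unnormalised sum
avg = sum(M @ rho @ M.conj().T for M in kraus)
assert np.isclose(np.trace(avg).real, 1.0)
print(probs)
```

Each normalised outcome M_k ρ M_k†/p_k has unit trace, while the average map retains the simple operator-sum form.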
To calculate stochastic quantum entropy production, forward and reverse Kraus operators would be needed to construct forward and reverse stochastic trajectories, respectively. This was first proposed by Crooks [28] by considering forward and reverse changes in the density matrix with respect to the invariant equilibrium state of the system. This approach has been used to calculate the entropy production corresponding to quantum jump unravellings [29][30][31][32][33], although the method would not be appropriate for systems without an equilibrium state [34,35]. Other recent developments instead construct the reverse Kraus operators from the time reversal of the forward operators [34][35][36].
However, we need not seek reverse Kraus operators for situations where the trajectories are continuous and the stochastic evolution is Markovian. We need only derive a set of Itô SDEs for the forward dynamics (according to appropriate forward Kraus operators) from which the entropy production associated with the evolution of the reduced density matrix may be computed using the approach developed in [5]. We could construct SDEs for each element of the density matrix, though it is more convenient, and physically more transparent, to consider the dynamics of quantum expectation values of various physical operators, as we shall see.
Together with the set of SDEs, we require an associated Fokker-Planck equation for the pdf of the chosen system variables. Let us therefore consider an Itô SDE for a vector of dynamical variables x involving a vector of Wiener increments dW, a vector of deterministic rate functions A and a matrix of noise coefficients B:

dx = A(x, t) dt + B(x, t) dW.   (7)

The corresponding Fokker-Planck equation that describes the evolution of the pdf p(x, t) is [37]

∂p(x, t)/∂t = −Σ_i ∂[A_i(x, t) p(x, t)]/∂x_i + Σ_ij ∂²[D_ij(x, t) p(x, t)]/∂x_i ∂x_j,   (8)

where D = ½ B Bᵀ is the diffusion matrix. We shall find that D is diagonal for our model, which simplifies matters.
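As a minimal check of this SDE-to-Fokker-Planck correspondence (an illustrative one-variable example of our own choosing, not the model of this paper), an Ornstein-Uhlenbeck process dx = −θx dt + σ dW has D = σ²/2, and its stationary Fokker-Planck solution is a Gaussian of variance D/θ, which an Euler-Maruyama simulation reproduces:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, dt = 1.0, 1.0, 1e-3
D = sigma**2 / 2                      # diffusion coefficient, D = B^2/2 in one dimension
n_steps, n_traj = 5000, 20000

x = np.zeros(n_traj)                  # start all trajectories at x = 0
for _ in range(n_steps):              # Euler-Maruyama integration of the Ito SDE
    x += -theta * x * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_traj)

# Stationary pdf predicted by the Fokker-Planck equation: Gaussian with variance D/theta
print(x.var(), D / theta)
```

The elapsed time (5 relaxation times) is long enough for the ensemble variance to settle at the stationary value D/θ = 0.5.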

Stochastic entropy production
From the definition in Eq. (1), it is possible to divide the total entropy production into two contributions associated with the environment and the system, respectively. This division is somewhat arbitrary and is introduced largely for ease of interpreting the behaviour. We calculate the total entropy production using the approach of Ref. [5], first separating the deterministic part of Eq. (7) into two contributions:

A = A^irr + A^rev,   (9)

where A^irr and A^rev are the irreversible and reversible contributions to A, respectively. Specifically, if dx_i (the i-th component of dx) is even with respect to time reversal, then A^rev_i dt is also even. Since dt is odd, A^rev_i has to be odd: A^rev_i transforms in the opposite way to x_i under time reversal. By similar reasoning, A^irr_i transforms in the same way as x_i. This separation is necessary for the calculation of the entropy production [5].
Explicit expressions for calculating the two contributions to the entropy production have been derived. For a system with a diagonal diffusion matrix D, the incremental environmental stochastic entropy production is [5]:

d∆s_env = Σ_{i=1}^{N} [ (A^irr_i / D_ii)( dx_i − A^rev_i dt ) + (∂A^irr_i / ∂x_i) dt ],   (10)

where N is the number of dynamical variables contributing to the entropy production. Note that this expression (10) contains Wiener increments through the dx_i, and hence the environmental entropy production ∆s_env is explicitly a stochastic variable. The second contribution, the incremental system stochastic entropy production, is related to the change in the logarithm of the pdf p(x, t) of the system, obtained from the Fokker-Planck equation:

d∆s_sys = −d ln p(x, t) = −(∂ln p/∂t) dt − Σ_i (∂ln p/∂x_i) dx_i − Σ_ij D_ij (∂²ln p/∂x_i ∂x_j) dt,   (11)

where the expansion on the right hand side is obtained using Itô's Lemma. The total stochastic entropy production is then given by the sum of the environmental and system contributions, d∆s_tot = d∆s_sys + d∆s_env. From these individual stochastic contributions, it is possible to obtain the averaged system and environmental entropy productions by cumulatively summing their respective incremental stochastic contributions, followed by averaging over many realisations of the noises and hence of the system evolution. The averaged total entropy production also has an analytical form. In certain circumstances, namely sufficient elapsed time for the pdf to become time independent, and the maintenance of a condition of detailed balance, this average is given by the Kullback-Leibler divergence between the initial and final pdfs [5]:

⟨⟨∆s_tot⟩⟩ = ∫ p(x, t_init) ln [ p(x, t_init) / p(x, t_final) ] dx.   (12)

The double angled brackets denote averages over all noise histories and initial coordinates, over the time interval from t_init to t_final, arising from all initial states in the ensemble. Likewise, the averaged system entropy production also has an analytical form, usually taken to be

⟨⟨∆s_sys⟩⟩ = −∫ p(x, t_final) ln p(x, t_final) dx + ∫ p(x, t_init) ln p(x, t_init) dx,   (13)

namely the difference in the Gibbs entropy of the final and initial probability distributions. This result again emerges only under certain conditions, and we investigate additional contributions in Section 3.3.
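To make these expressions concrete, here is a small simulation (an illustrative example with parameters of our own choosing, not the qubit model of this paper) of an Ornstein-Uhlenbeck process relaxing after a quench of its stiffness, for which A is purely irreversible and D is constant. The environmental increments of Eq. (10) are accumulated in midpoint (Stratonovich) form, the system contribution follows from the analytic Gaussian pdf as in Eq. (11), the mean of ∆s_tot approaches the Kullback-Leibler divergence of Eq. (12), and ⟨exp(−∆s_tot)⟩ ≈ 1, the integral counterpart of the detailed fluctuation theorem discussed in Section 2.1:

```python
import numpy as np

rng = np.random.default_rng(1)
theta1, theta2, sigma = 2.0, 1.0, 1.0     # stiffness quenched from theta1 to theta2 at t = 0
D = sigma**2 / 2
v0, vinf = D / theta1, D / theta2         # initial and final equilibrium variances
T, dt = 4.0, 1e-3
n_steps, n_traj = int(T / dt), 20000

def var(t):                               # Gaussian variance from the Fokker-Planck solution
    return vinf + (v0 - vinf) * np.exp(-2 * theta2 * t)

x = np.sqrt(v0) * rng.standard_normal(n_traj)   # ensemble equilibrated at stiffness theta1
x0 = x.copy()
ds_env = np.zeros(n_traj)
for _ in range(n_steps):
    A = -theta2 * x                       # purely irreversible drift after the quench
    dW = np.sqrt(dt) * rng.standard_normal(n_traj)
    x_new = x + A * dt + sigma * dW
    # Eq.(10)-type increment with A_rev = 0, in midpoint (Stratonovich) form
    ds_env += 0.5 * (-theta2) * (x + x_new) / D * (x_new - x)
    x = x_new

def logp(xv, t):                          # ln p(x, t) for the Gaussian pdf
    v = var(t)
    return -0.5 * np.log(2 * np.pi * v) - xv**2 / (2 * v)

ds_sys = logp(x0, 0.0) - logp(x, T)       # system contribution, Eq. (11) integrated
ds_tot = ds_sys + ds_env

kl = 0.5 * (v0 / vinf - 1.0 + np.log(vinf / v0))   # Eq. (12) for two Gaussians
print(ds_tot.mean(), kl)                   # positive mean production vs KL divergence
print(np.exp(-ds_tot).mean())              # integral fluctuation theorem, close to 1
```

The final time is several relaxation times, so the pdf is effectively stationary at both ends, as required for the Kullback-Leibler form of the average to apply.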

Stochastic Liouville-von Neumann equation
Having described some of the background concepts of stochastic entropy production, we now introduce the stochastic dynamics under consideration. As briefly discussed in the introduction, a non-Markovian SLN equation can provide an exact treatment of the dynamics of the open system reduced density matrix, implicitly after averaging over all manifestations of the noises associated with the environment (see below). The approach involves taking an average over stochastic solutions of the equation resulting in a deterministic trajectory from which physical predictions can be obtained [12,38]. However, in order to unravel the SLN equation in a straightforward way, we need to start from a Markovian limit of the system behaviour, such that the system interacts with an environment that does not possess any memory. Another perspective is to consider the environmental correlation times, which in a Markovian limit are very short when compared to the characteristic timescales for the dynamics of the open system. In this section we obtain the deterministic Markovian equation of motion for the physically meaningful, ensemble averaged reduced density matrix of our system. In Section 3 that follows, the unravelling procedure is considered.
Many non-Markovian methods for open quantum systems begin their development with the full density matrix of the environment and system and proceed to take a partial trace in order to derive an equation of motion exclusively for the reduced density matrix of the system. The Feynman-Vernon influence functional formalism is one such method, where the reduced density matrix of the open system is represented as a path integral over all environmental modes made up of an infinite number of harmonic oscillators [13]. Methods which rely on this include quasi-adiabatic path integrals [39], hierarchical equations of motion [40,41], stochastic Schrödinger equations [42] and the Stochastic Liouville-von Neumann (SLN) equation and its variations [8][9][10][11][12]. Other methods that do not rely on the Feynman-Vernon functional include projection operator methods such as the Nakajima-Zwanzig equation [43][44][45].
The SLN equation is a stochastic differential equation containing complex cross-correlated coloured noises representing the environment; to arrive at the appropriate deterministic description, an average needs to be taken over such noises. Note again that the outcome of such a process will describe the evolution of an average of an ensemble of physical density matrices, according to the framework of interpretation set out in Section 2.1.1.
The SLN equation describes an open quantum system, with coordinates q, coupled to a large number of harmonic oscillators that represent the environment. Here we shall consider the system to be coupled to three independent sets of oscillators; the reason for doing so will become apparent. The SLN formalism [9] can easily be extended to accommodate this. The Hamiltonian coupling of the system to the k-th set of oscillators is taken to be linear in the oscillator coordinates ξ_ik, where i labels the oscillators, but it can depend on a general coupling operator f_k(q) of the system coordinates q. The full Hamiltonian can be written as follows:

H = H_sys + Σ_k [ H_env;k(ξ_ik) + f_k(q) Σ_i c_ik ξ_ik ],   (14)

where H_env;k(ξ_ik) is the Hamiltonian for the k-th set of environmental oscillators, H_sys is the Hamiltonian of the open system, and the c_ik are system-environment coupling constants. To derive the SLN equation, the initial density matrix of the full system is taken to be the tensor product of the reduced density matrix of the system ρ_q(t_0) and the environmental density matrix ρ_ξ(t_0), i.e. ρ_0 = ρ_q(t_0) ⊗ ρ_ξ(t_0), where t_0 is an initial time. We shall be considering a weak coupling limit, so this partitioned state is an appropriate approximation for the system we are investigating. Based on the Hamiltonian of Eq. (14), it is possible to obtain a stochastic differential equation for the (unphysical) stochastic reduced density matrix ρ_s(t) of the system driven by a pair of complex coloured Gaussian noises, η_k(t) and ν_k(t), for each oscillator set k [8][9][10][11]. It should be stressed again that these stochastic noises are purely mathematical constructs, and do not generate individual physical trajectories [11,12]. Physical results only arise when the noise-driven trajectories are averaged over their many realisations. The SLN equation (with ℏ = 1) becomes:

dρ_s(t)/dt = −i[H_sys, ρ_s] + i Σ_k η_k(t) [f_k, ρ_s] + (i/2) Σ_k ν_k(t) {f_k, ρ_s},   (15)

where the square brackets correspond to a commutator, and the curly brackets to an anticommutator.
Each set of environmental oscillators is connected independently to the open system and the associated noises satisfy the following correlations:

⟨η_k(t) η_l(t′)⟩ = δ_kl K_η(t − t′) = δ_kl (1/π) ∫₀^∞ dω J_k(ω) coth(β_k ω/2) cos(ω(t − t′)),   (16)

⟨η_k(t) ν_l(t′)⟩ = δ_kl K_ην(t − t′) = δ_kl (2i/π) Θ(t − t′) ∫₀^∞ dω J_k(ω) sin(ω(t − t′)),   (17)

where J_k(ω) is the spectral density of the k-th oscillator set, β_k = 1/(k_B T_k) is its inverse temperature (with k_B being set to 1 hereafter), Θ(t − t′) is the Heaviside function, and the angled brackets denote an average over the environmental noises; all other correlation functions are zero. There exist several ways to construct the coloured noises, with each scheme affecting the convergence of results in different ways [38].

Noise-averaged SLN equation
We need to average Eq. (15) with respect to realisations of the environmental noises (to be denoted by single angle brackets) to obtain an exact and deterministic evolution equation for the physical reduced density matrix of the open system. Taking the average of both sides of Eq. (15) and denoting the physical density matrix averaged over the noises as ρ̄ = ⟨ρ_s⟩, we write:

dρ̄(t)/dt = −i[H_sys, ρ̄] + i Σ_k [f_k, ⟨η_k(t) ρ_s(t)⟩] + (i/2) Σ_k {f_k, ⟨ν_k(t) ρ_s(t)⟩}.   (18)

In order to calculate the averages of the stochastic density matrix multiplied by the noises that appear on the right hand side, the Furutsu-Novikov theorem may be used [46,47]. For a set of noises ζ_i with the correlation

⟨ζ_i(t) ζ_j(t′)⟩ = C_ij(t, t′),   (19)

the Furutsu-Novikov theorem states that

⟨ζ_i(t) A[ζ]⟩ = Σ_j ∫ dt′ C_ij(t, t′) ⟨δA[ζ]/δζ_j(t′)⟩,   (20)

where A[ζ] is a functional of the noises and δA[ζ]/δζ_j is its functional derivative with respect to ζ_j. The averaged product of a noise and a noise-dependent functional may therefore be transformed into an integral. We use Eq. (20) in Eq. (18) to obtain the required explicit expressions for the averages of the noises multiplied by ρ_s:

⟨η_k(t) ρ_s(t)⟩ = ∫_{t₀}^{t} dt′ [ K_η(t − t′) ⟨δρ_s(t)/δη_k(t′)⟩ + K_ην(t − t′) ⟨δρ_s(t)/δν_k(t′)⟩ ],
⟨ν_k(t) ρ_s(t)⟩ = ∫_{t₀}^{t} dt′ K_ην(t′ − t) ⟨δρ_s(t)/δη_k(t′)⟩.   (21)

From Eq. (17) we observe the presence of the Heaviside function Θ(t − t′), which vanishes for a negative argument. The integrals in Eq. (21) have an upper limit t, which implies that K_ην(t′ − t) = 0 in the last term, since t′ is always smaller than t due to the integration limits. Hence, the last term is zero and can be dropped; only the commutator terms remain. We then need to calculate the functional derivatives, and after some manipulation, see Appendix A.1, we obtain

dρ̄(t)/dt = −i[H_sys, ρ̄(t)] + i Σ_k [ f_k, ∫_{t₀}^{t} dt′ ⟨ U₊(t, t′) ( i K_η(t − t′) [f_k, ρ_s(t′)] + (i/2) K_ην(t − t′) {f_k, ρ_s(t′)} ) U₋(t, t′) ⟩ ],   (22)

where U₊ and U₋ are forward and reverse propagators, respectively (see Appendix A.1). Eq. (22) is the most general form of the averaged SLN equation, as no approximations have been made so far. Note its non-Markovian nature. Also note that it describes the evolution of a reduced density matrix averaged over physical fluctuations, along the lines of Eq. (5). However, this equation still contains the stochastic density matrix ρ_s and hence is not a self-contained equation for the physical ensemble-averaged density matrix ρ̄.
To be able to work with an equation depending exclusively on the latter, some simplifications need to be made; this will be done in the next Section.

Markovian limit of the SLN equation
Eq. (22) in its exact non-Markovian form is difficult to work with, as it still contains the stochastic density matrix. A self-contained equation for the physical density matrix ρ̄(t) can be obtained, however, in the Markovian limit by approximating certain properties of the environment. We begin by assuming that all the sets of oscillators coupled to the system have the same temperature T_k ≡ T, and the same spectral density. This is not a necessary assumption or approximation in our analysis, but it will simplify our notation. Next we take the limit in which the temperature of the environment T is very large, such that T is much greater than any characteristic energy quantum of its harmonic oscillators, βω ≪ 1, and we will also adopt the same Ohmic spectral density J(ω) = αω for all oscillator sets, where α is a proportionality constant. In this case we can use the first order expansion coth(βω/2) ≈ 2/(βω), following [48], and obtain the following results for the correlation functions of Eqs. (16) and (17):

K_η(t − t′) ≈ (2α/β) δ(t − t′),   (23)

and

K_ην(t − t′) ≈ −2iα Θ(t − t′) δ′(t − t′).   (24)

Note that the correlation functions are the same for all the oscillator sets, hence we have dropped the index k here. Eqs. (23) and (24) appear in [49], although a different route to obtain them was taken, specifically that of introducing a cutoff in the spectral density and allowing it to be much larger than the dynamical timescales of the system. Inserting Eqs. (23) and (24) into Eq. (22) leads to the Markovian limit of the averaged SLN equation that contains only ρ̄(t). The equation takes a similar form to that found in other work [48]:

∂ρ̄(t)/∂t = −i[H_sys, ρ̄] − (2α/β) Σ_k [f_k, [f_k, ρ̄]] − iα Σ_k [f_k, {ḟ_k, ρ̄}],  with  ḟ_k = i[H_sys, f_k] + ∂f_k/∂t,   (25)

but unlike the result published in [48], this equation contains a term involving the time dependence of the system coupling operators f_k.

System setup and equations of motion
In the last section we obtained the Markovian limit of the evolution equation for the ensemble averaged reduced density matrix. Here we shall unravel this equation, namely add noise terms, such that individual trajectories correspond to different physical realisations of the quantum state diffusion of the system under the influence of an underspecified environment.
3.1 Setup of the system and the unravelling procedure

Two-level system and choice of environments
We consider a two-level bosonic system with Hamiltonian

H_sys = ε σ_z,   (26)

where σ_z is the usual Pauli matrix and ε has the dimension of energy. We take the system Hamiltonian to be constant in time. We require the environmental interaction with the open system to be consistent with a thermal state, subject to our earlier assumptions of weak coupling and high temperature. This means that the insertion of ρ̄ ∝ I − βH_sys should cause the right hand side of Eq. (25), with time independent f_k, to vanish; the surviving contribution is

−αβ Σ_k [f_k, {[H_sys, f_k], H_sys}] = −αβ Σ_k [f_k, [H_sys², f_k]] = 0.   (27)

The anticommutator vanishes exactly for our choice of Hamiltonian, where H_sys² ∝ I, and the implication is that any choice of the operators f_k, through which the system interacts with the environment, is consistent with the appropriate stationary thermal state. Since the f_k are hermitian according to the structure of the Hamiltonian in Eq. (14), it is natural, in our case, to employ three coupling operators proportional to the Pauli matrices. Furthermore, it is natural for the initial strength of the coupling to be the same for each operator, in order that the system-environment interaction should be isotropic. Note that the same framework, if needed, could then be employed for system Hamiltonians proportional to σ_x or σ_y as well as the choice in Eq. (26).
This coupling scheme has the added advantage that it allows us to model the act of measurement of the system energy in a straightforward way, merely by elevating the strength of the coupling to the operator σ_z. This represents an additional interaction between the system and an external measuring device, and in the next section we shall see that the unravelled system dynamics under such an interaction can indeed correspond to the stochastic selection of one of the energy eigenstates. Specifically, the Hamiltonian, Eq. (26), has two eigenvectors |±⟩ with eigenvalues ±ε, respectively, and the environmental coupling to σ_z drives the density matrix stochastically towards the form |+⟩⟨+| or |−⟩⟨−|. However, the scheme of coupling to the three Pauli matrices means that these eigenstates are not fixed points of the dynamics: the tendency is only for the system to dwell in the vicinity of one or other of the eigenstates, to an extent that depends on the strength of environmental coupling to σ_z.
To summarise, we shall couple three independent sets of harmonic oscillators to our open system using the operators

f_x = γ₀ σ_x,  f_y = γ₀ σ_y,  f_z = γ(t) σ_z,   (28)

where for simplicity we replace the labels k by the Cartesian indices x, y and z. γ₀ is the coupling coefficient between the system and the environment, and δγ(t) = γ(t) − γ₀ is the coupling coefficient between the system and the device measuring H_sys. If γ(t) is increased from an initial value of γ₀, we can model measurement of the system energy while under the influence of a thermal bath. We can calculate the stochastic entropy production related to this process. Furthermore, we can model the detachment of the measuring device by a protocol where γ(t) returns to γ₀, allowing the system to re-thermalise, and again compute the associated entropy production. Note also that, with this choice of the coupling, the last term in Eq. (25) containing the time derivative of the f_k vanishes exactly; this is obvious for the constant f_x and f_y, while for f_z it follows from the fact that ∂f_z/∂t is proportional to σ_z.

Unravelled stochastic equation
To obtain a form of Eq. (25) that allows for a straightforward stochastic unravelling, we would need to write it in a Lindblad form with positive coefficients [7]. It would be possible to do so by extending the Hilbert space and then adding stochasticity [50,51], though that approach will not be taken here. Instead we construct Lindblad operators that match the dynamics of Eq. (25) in accordance with our high temperature approximation βH_sys ≪ 1. We shall now show that the Markovian averaged (deterministic) SLN Eq. (25) is equivalent, to lowest order in βε, to the Lindblad equation when an appropriate choice of the Lindblad operators is made. Indeed, consider the Lindblad equation (6) using operators

L_k = λ ( f_k − (β/4) [H_sys, f_k] ),   (29)

where λ = 2√(α/β). By inserting these operators into Eq. (6) we obtain

∂ρ̄/∂t = −i[H_sys, ρ̄] − (2α/β) Σ_k [f_k, [f_k, ρ̄]] − (α/2) Σ_k [[H_sys, f_k²], ρ̄] − iα Σ_k [f_k, {ḟ_k, ρ̄}] + (αβ/8) Σ_k [[H_sys, f_k], [[H_sys, f_k], ρ̄]].   (30)

We note that [H_sys, f_k²] vanishes when f_k² ∝ I, which holds in our case. The final contribution on the right hand side is a factor βH_sys smaller in magnitude than the penultimate term, and therefore can be neglected. With these considerations we recover the Markovian averaged SLN Eq. (25), confirming the suitability of the Lindblad operators in Eq. (29).
Starting from Eqs. (6) and (29) we can now construct an unravelling consistent with the quantum state diffusion (QSD) approach [52][53][54][55][56]. Using the set of Kraus operators [26,57,58] consistent with the development in Section 2.1.1, and substituting Eq. (31) into Eq. (4), we obtain (see Refs. [26,58]): where we use ρ to denote the physical reduced density matrix of the system that evolves stochastically, and where environmental noise is represented by a set of independent Wiener increments dW k . Notice that upon averaging over the Wiener noises this evolution equation reduces to Eq. (6), and hence the noise average of ρ corresponds to the noise-averaged density matrix ρ̄. Eq. (32) finally provides us with the dynamics, and the means by which to determine the irreversibility of system behaviour in various cases.

Equations of motion for components of the coherence vector
With the particular environmental couplings defined in Eq. (28) (k = x, y, z), and with γ 0 = 1, we obtain the Lindblad operators: Instead of simulating the dynamics of the elements of the density matrix directly, we will evolve the coherence vector defined by the three components r i = Tr(σ i ρ). This corresponds to a representation of the density matrix in the form ρ = (1/2)(I + r · σ).
We obtain from Eq. (32) the following equations:

dr x = −2ϵ r y dt − 2λ²(1 + γ²) r x dt + λ(βϵ r z + 2(1 − r x ²)) dW x − 2λ r x r y dW y − 2γλ r x r z dW z (35)

dr y = 2ϵ r x dt − 2λ²(1 + γ²) r y dt − 2λ r x r y dW x + λ(βϵ r z + 2(1 − r y ²)) dW y − 2γλ r y r z dW z (36)

Notice how important it is to consider the stochastic evolution of the coherence vector rather than the averaged behaviour. The latter, described by d⟨r x ⟩ = −2ϵ⟨r y ⟩ dt − 2λ²(1 + γ²)⟨r x ⟩ dt, etc., takes any initial state to ⟨r x ⟩ → 0, ⟨r y ⟩ → 0 and ⟨r z ⟩ → −βϵ at long times, and hence is not disturbed by the evolution of γ away from unity. What does change in these circumstances is the pdf of the system variables, and hence the higher moments of the components of the coherence vector. The noise-averaged Lindblad equation for ρ̄ cannot capture the stochastic selection of an eigenstate under measurement.
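As a quick consistency check on the averaged behaviour quoted above, one can integrate the deterministic drift of the transverse components numerically. The sketch below uses the parameter values of Section 4 (λ = √0.2, ϵ = 1, and γ = 2 as in protocol M) with a hypothetical initial mean coherence; it illustrates only the drift terms of Eqs. (35) and (36), not the full stochastic dynamics:

```python
import numpy as np

# Noise-averaged drift of Eqs. (35)-(36):
#   d<r_x> = -2*eps*<r_y> dt - 2*lam^2*(1+gam^2)*<r_x> dt, and x <-> y with a sign flip.
eps, lam, gam = 1.0, np.sqrt(0.2), 2.0   # parameters from Section 4 (gamma = 2)
dt, n_steps = 1e-3, 10_000               # integrate up to t = 10

rx, ry = 0.5, 0.5                        # arbitrary (hypothetical) initial mean coherence
for _ in range(n_steps):
    drx = (-2*eps*ry - 2*lam**2*(1 + gam**2)*rx) * dt
    dry = ( 2*eps*rx - 2*lam**2*(1 + gam**2)*ry) * dt
    rx, ry = rx + drx, ry + dry

# The averaged transverse components spiral in to zero for any coupling strength gamma.
print(abs(rx) < 1e-3, abs(ry) < 1e-3)
```

The decay rate of the mean transverse coherence is 2λ²(1 + γ²), so by t = 10 the averaged components are negligible, consistent with the fixed point quoted in the text.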

Equations of motion in cylindrical coordinates
Next we derive the equations of motion for the convenient set of coordinates (r², r z , φ), where r² = r x ² + r y ² + r z ² and φ = arctan(r y /r x ). Using Itô's Lemma, see Appendix A.2, we can obtain: and Due to the coordinate singularities, suitable care needs to be taken with regard to the choice of initial states and the dynamics themselves.

Initial state simplification
We see from Eq. (39) that r = 1 is a fixed point of the dynamics. Defining a pure state by the condition Tr(ρ²) = 1, it can be shown that r = 1 corresponds to a pure state by inserting ρ in its coherence vector representation of Eq. (34). In order to reduce the computational difficulty of solving the equations, we set r = 1 and consider Hence, the coherence vector describing our system will always lie on the surface of the Bloch sphere, and the pdf will depend on only two variables: r z and φ. These equations form the basis of our simulations, with the corresponding results presented in Section 4.
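The identification of r = 1 with purity follows directly from the coherence-vector representation, since Tr(ρ²) = (1 + r²)/2. A minimal numerical check (the specific vectors are arbitrary illustrative choices):

```python
import numpy as np

# Pauli matrices and the coherence-vector representation rho = (I + r . sigma)/2
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

def purity(r):
    rho = 0.5*(I2 + r[0]*sx + r[1]*sy + r[2]*sz)
    return np.trace(rho @ rho).real          # equals (1 + |r|^2)/2

r_pure  = np.array([0.0, 0.6, 0.8])          # |r| = 1: on the Bloch sphere surface
r_mixed = np.array([0.0, 0.3, 0.4])          # |r| = 0.5: an interior point

print(purity(r_pure))    # 1.0, a pure state
print(purity(r_mixed))   # 0.625 = (1 + 0.25)/2, a mixed state
```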

Fokker-Planck equation
The definition of the entropy production in Eq. (1) requires an evolving pdf for the system coordinates over time, and hence we need to derive and solve the associated Fokker-Planck equation. To do so, we need expressions for the vector A and matrices B and D = (1/2)BBᵀ, obtained by comparing their definitions in Eq. (7) with the equations of motion (42) and (43). We get and hence and D φφ = 2λ²((βϵ r z /2)² + βϵ r z + 1). There is no φ dependence in either A or D, and the latter matrix is indeed diagonal, as anticipated. With these expressions, the Fokker-Planck equation for the pdf p(r z , φ, t) becomes: The Fokker-Planck equation can be simplified further if we assume the initial pdf to be independent of φ. As there is no explicit φ dependence in Eq. (49), a φ-independent initial pdf will always remain so. This leads us to the simplification p(r z , φ, t) → p(r z , t), and a more succinct equation: The stationary solution p st (r z ) of the Fokker-Planck equation, corresponding to setting ∂p(r z , t)/∂t = 0, can be obtained analytically using tools such as Mathematica. For γ = 1, we have: while for γ ≠ 1 we have where N γ is a γ dependent normalisation factor, and the exponents are given by where we note that c ≠ d. The stationary pdfs in Eqs. (51) and (52) are plotted in Figure 2. We can see that the coupling strength γ has a very significant effect on the stationary probability density of the system coordinate r z . Increasing γ concentrates the pdf closer to the boundaries, and in an asymmetric manner due to the system Hamiltonian bias ϵ. By evolving γ from unity to a larger value, we can capture the effect of a measurement of the system energy, as this would cause individual trajectories to dwell in the vicinity of the eigenstates of H sys at r z = ±1, or equivalently at system energies Tr(H sys ρ) = ϵ r z of ±ϵ.
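Equations of this type can be integrated numerically with a simple explicit conservative scheme that propagates the probability current with zero-flux boundaries. In the sketch below the drift and diffusion coefficients are an Ornstein-Uhlenbeck stand-in, not the paper's A and D, so it only illustrates the method used for Eq. (50):

```python
import numpy as np

# Explicit conservative scheme for  dp/dt = -dJ/dz  with  J = A p - d(D p)/dz,
# and zero-probability-current (reflecting) boundaries, as imposed for Eq. (50).
# Toy coefficients (an Ornstein-Uhlenbeck stand-in, NOT the paper's A_z, D_zz):
D0 = 0.5                       # constant diffusion coefficient (toy choice)
A = lambda zz: -zz             # linear restoring drift (toy choice)

z = np.linspace(-4.0, 4.0, 401)
dz = z[1] - z[0]
dt = 0.2 * dz**2 / D0          # well inside the explicit stability bound dz^2/(2 D0)

p = np.exp(-(z - 2.0)**2)      # arbitrary initial pdf displaced from equilibrium
p /= p.sum() * dz

zm = 0.5*(z[1:] + z[:-1])      # staggered interface points
for _ in range(50_000):        # integrate to t = 8, well past relaxation
    # probability current evaluated at cell interfaces
    J = A(zm)*0.5*(p[1:] + p[:-1]) - D0*(p[1:] - p[:-1])/dz
    J = np.concatenate(([0.0], J, [0.0]))   # zero flux through both boundaries
    p = p - dt*(J[1:] - J[:-1])/dz          # conservative update: mass preserved

p_st = np.exp(-z**2/(2*D0)) / np.sqrt(2*np.pi*D0)   # stationary Gaussian, var D0
print(np.max(np.abs(p - p_st)) < 0.02)               # relaxed to stationarity
```

The zero-flux condition at the two ends of the grid mirrors the boundary conditions at r z = ±1 described in Section 4.1, and the conservative update keeps the pdf normalised without any rescaling.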

Average system entropy production and boundary contributions
Using its definition in Eq. (11), the incremental stochastic system entropy production over each time step dt can be calculated from the time-dependent pdf p(r z , t) obtained from the Fokker-Planck equation (50), and then averaged over many realisations of the behaviour. However, we have noticed in practice that combining the average system entropy production with the average environmental entropy production of Eq. (10), obtained numerically for time independent environmental coupling, does not yield the vanishing total entropy production that we would expect for a system in a stationary state with zero probability current. We have therefore investigated this situation further. The incremental system entropy production averaged over r z and the noises may be written for our system as where d⟨∆s sys ⟩ is the noise averaged increment in system entropy production, which is a function of r z and t. By integrating by parts twice and retaining the boundary terms, see Appendix A.3 for details, we obtain: where dS G is the increment in the Gibbs entropy of the system, shown in Eq. (13), and J z is the r z -component of the probability current defined in Appendix A.3. In Eq. (13) we noted that d⟨∆s sys ⟩ = dS G holds only under certain conditions, now shown to be those for which the boundary terms in Eq. (55) vanish. Commonly the current J z , and often the pdf and its derivatives, vanish at the boundaries of the parameter space, and the diffusion coefficients can also go to zero there. For our system, although D zz = 0 at r z = ±1, it can be shown that the gradient ∂p st (r z )/∂r z is singular at the boundaries for the stationary solution of the Fokker-Planck equation, and by implication this applies to nonstationary situations as well. There is therefore a need to correct the usual result d⟨∆s sys ⟩ = dS G .
In contrast to this analysis, the numerically computed mean system entropy production is consistent with zero when our system is in a stationary state, which suggests that the numerically obtained time dependent pdf is not sufficiently accurate near the boundaries, or that the sampling of these regions by the generated trajectories is too limited, to produce the appropriate additional contributions. Our approach, therefore, is to estimate the boundary terms analytically for the stationary pdf and to add them to the numerical results.
Using the stationary pdfs given in Eqs. (51) and (52), together with the diffusion coefficient D zz given in Eq. (46), it may be shown that the boundary correction involving J z in Eq. (55) vanishes at equilibrium, as does dS G . This allows us to write the mean stationary system entropy production increments for general γ as which does not vanish. In fact, it is possible to obtain a simple analytical expression for γ = 1 to lowest order in βϵ: d⟨∆s sys ⟩ This shows that the boundary correction can be obtained exactly for γ = 1, though matters are harder for γ ≠ 1. In the latter case we have a more complicated form of p st (r z ), and it is then necessary to expand it analytically for small βϵ before calculating the boundary correction.

Average environmental entropy production
With our choice of couplings and Hamiltonian, we can calculate the environmental entropy production from Eq. (10). We need to determine A irr and A rev from Eqs. (42) and (43). This is done by establishing how the components of the coherence vector, and the various contributions to A, transform under time reversal. For a system without spin degrees of freedom, the time reversal operator is given by Θ = K, where K is the operation of complex conjugation [59], and the time reversal operation on the Pauli matrices produces Thus r x and r z are even under time reversal, while r y , and hence φ, is odd, and we can separate the coefficients of the dt terms in Eqs. (42) and (43) into their A irr and A rev components. We can write and obtain an expression for the environmental entropy production using Eq. (10), since we already have the form of D from Eq. (46).

Simulations and Results
In this section we describe two protocols designed to represent the dynamics of connecting the system to and disconnecting it from a measuring device, respectively, and we compute the associated stochastic entropy production for each.

The computational procedure
In order to demonstrate adherence to the detailed fluctuation theorem we consider two simple protocols that are time reversals of each other:
• Connecting a measuring device (protocol M): begin at t = 0 with the system thermalised for γ = 1 and the initial pdf defined as in Figure 2; for t > 0 perform simulations of Eqs. (42) and (43) using γ = 2;
• Disconnecting a measuring device (protocol M̄): begin at t = 0 with the system thermalised for γ = 2 and the initial pdf defined as in Figure 2; then, for t > 0, perform simulations of Eqs. (42) and (43) using γ = 1.
The thermalisation of the system is defined by the pdf given by the stationary Fokker-Planck equation, as plotted in Figure 2. For both protocols, the initial value of r z for a trajectory is randomly sampled from the appropriate pdf p st (r z ) corresponding to either γ = 1 or γ = 2, while the value of φ is randomly sampled from a uniform distribution within the range [0, 2π). Whenever we specify a particular value of γ, we will be referring to the value used for the dynamics (i.e., for t > 0), unless we explicitly mention the thermalisation condition (t = 0). For both protocols several parameters of the system are kept constant: the environmental inverse temperature β = 0.1, the Hamiltonian parameter ϵ = 1, the proportionality constant in the spectral density α = 0.01, and λ = √0.2. In order to calculate the system entropy production, we solve the Fokker-Planck equation (50) for r z and observe the behaviour of the system as each protocol drives it from the stationary state at one value of γ to the stationary state defined by the other. The evolution from one stationary pdf to the other occurs over a timescale of order t = 1 for both protocols.
In summary, the computational procedure is as follows: 1. Obtain the solution p(r z , t) of the Fokker-Planck equation (50), using the appropriate stationary pdf as the initial condition (at t = 0); in both cases the boundary conditions at r z = ±1 correspond to zero probability current J z , defined in Appendix A.3. Note that each pdf is obtained on a grid of r z values.
2. Start a loop over the noises (independently for each protocol):
(a) generate noises dW x , dW y and dW z for each time step dt;
(b) run the stochastic dynamics for r z (t) and φ(t) for both protocols using Eqs. (42) and (43), based on initial values sampled from the corresponding γ-dependent stationary pdf;
(c) for each time increment dt, and using the obtained values of r z (t) and φ(t), calculate the incremental contributions to the environmental, d∆s env , and system, d∆s sys = −d(ln p(r z , t)), stochastic entropy productions via, respectively, Eq. (10) and the first line of Eq. (11); in the latter case the increment of the logarithm of the pdf is calculated as a difference between its values at two consecutive times, with the pdf p(r z , t) at the required value of r z obtained by linear interpolation between the two nearest r z values on the grid;
(d) calculate the incremental total stochastic entropy production at each time step dt as the sum d∆s tot = d∆s env + d∆s sys , and accumulate these contributions over an entire trajectory;
(e) go back to step 2(a) to run another trajectory; the calculation is repeated the required number of times using different manifestations of the noises. The stochastic entropy production is averaged over the trajectories, and each protocol has its own independent ensemble of entropy productions.
3. Once the averaged values of the entropy production increments are obtained from a large ensemble of trajectories, the boundary term from Eq. (55) is added to the ensemble average d ∆s tot as for the specific value of γ. This yields the final increment of the mean total entropy production d ∆s tot for the given time step dt, separately for the two protocols.
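The loop in step 2 can be illustrated on a solvable toy model, with an overdamped Ornstein-Uhlenbeck process standing in for Eqs. (42) and (43), and its analytic Gaussian pdf standing in for the numerical Fokker-Planck solution (both are assumptions made purely for illustration, not the paper's dynamics):

```python
import numpy as np

# Toy stand-in for the trajectory loop: dx = -x dt + sqrt(2 D) dW with D = 1,
# relaxing from N(m0, 1) to N(0, 1); the time-dependent pdf is known exactly.
rng = np.random.default_rng(1)
D, m0 = 1.0, 2.0
dt, n_steps, n_traj = 2e-3, 1000, 50_000

def log_p(x, t):
    # analytic pdf: Gaussian with mean m0*exp(-t) and unit variance at all times
    return -0.5*(x - m0*np.exp(-t))**2 - 0.5*np.log(2*np.pi)

x = m0 + rng.standard_normal(n_traj)               # step 2(b): sample initial state
s_env = np.zeros(n_traj)
ln_p0 = log_p(x, 0.0)
for n in range(n_steps):
    dW = np.sqrt(dt)*rng.standard_normal(n_traj)   # step 2(a): generate noises
    x_new = x - x*dt + np.sqrt(2*D)*dW
    # step 2(c): environmental increment (A_irr/D) o dx, Stratonovich midpoint rule
    s_env += -(x + x_new)/2 * (x_new - x) / D
    x = x_new

s_sys = ln_p0 - log_p(x, n_steps*dt)    # step 2(c): accumulated -d(ln p)
s_tot = s_env + s_sys                   # step 2(d): total entropy production
print(np.mean(s_tot))                   # ~ (m0^2 - m(T)^2)/2, positive for relaxation
```

For this toy relaxation the ensemble mean of s_tot approaches m0²/2 at long times and is positive throughout, the behaviour the main text seeks for the measurement protocols.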

Dynamics of r z and φ
The thermal state ρ̄ eq of the system is approximately proportional to e −βH sys for weak system-environment coupling.
In the derivation of Eq. (32) we have assumed βH sys ≪ 1, so we can write ρ̄ eq ≈ (1/2)(I − βH sys ) and hence obtain the average of r z as r̄ z = Tr(σ z ρ̄ eq ) = −βϵ ≈ −0.1 for our parameters. Our numerics supports this: running one million realisations of the dynamics for protocol M (representing the connection of the measuring device), we observe that the average of r z does indeed remain constant throughout the simulation. Furthermore, the dynamics of φ maintains a pdf that is uniform in φ, which matches the assumptions used to derive Eq. (50). While the averaged system behaviour remains unchanged, the stochastic trajectories differ significantly due to the measurements, as shown in Figure 3. Note that each individual trajectory, if run for a sufficiently long time, would dwell in the vicinity of either of the boundaries r z = ±1, jumping between the two. However, once the system reaches stationarity (equilibrium), the ensemble of trajectories attains the stationary distribution p st (r z ) associated with the corresponding value of γ. One would expect this distribution to match that obtained by solving the Fokker-Planck equation of Section 3.2, and as can be seen from Figure 3 (right panel), this is indeed the case in our simulations. This demonstrates the stochastic behaviour of the system and the effect of the measurement interactions that send r z to the vicinity of the eigenstates, while maintaining the same thermal averaged state throughout the evolution. It exemplifies how crucial it is to represent the stochastic behaviour of quantum measurements, and to recognise that all physical changes in the system are stochastic in nature. We then show in Figure 4 how the system explores the surface of the Bloch sphere, biased by the Hamiltonian and under the influence of the environment, for a couple of individual trajectories for protocols M and M̄ (connecting and disconnecting the measurement device).
The trajectory with γ = 1 explores the Bloch sphere fairly uniformly, as seen in Figure 4, but for γ = 2 the trajectories tend to dwell near the boundaries at r z = ±1.
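The high temperature estimate r̄ z ≈ −βϵ used above can be compared against the exact thermal value −tanh(βϵ), for instance:

```python
import numpy as np

# Exact thermal coherence for H_sys = eps*sigma_z at inverse temperature beta:
#   r_z = Tr(sigma_z e^{-beta H}) / Tr(e^{-beta H}) = -tanh(beta*eps),
# which reduces to -beta*eps in the regime beta*eps << 1 assumed for Eq. (32).
beta, eps = 0.1, 1.0
H = eps*np.array([[1.0, 0.0], [0.0, -1.0]])      # eps * sigma_z
rho_eq = np.diag(np.exp(-beta*np.diag(H)))       # unnormalised thermal state
rho_eq /= np.trace(rho_eq)
sz = np.array([[1.0, 0.0], [0.0, -1.0]])

rz_exact = np.trace(sz @ rho_eq)
print(rz_exact)    # -tanh(0.1), within 0.5% of the lowest-order value -0.1
```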

Entropy production
The total stochastic entropy production is obtained by summing the stochastic system entropy production in Eq. (11), the stochastic environmental entropy production in Eq. (10), and (for the ensemble average) the boundary correction term from Eq. (55). The results for protocol M with γ = 2 are shown in Figure 5. The noise and r z averaged total entropy production is shown with (solid black) and without (dotted black) the boundary correction. The red curve corresponds to a single stochastic realisation. The dashed blue line denotes the asymptotic analytical average total stochastic entropy production for the process, from Eq. (12). The averages result from one million realisations for β = 0.1, ϵ = 1, t max = 2, dt = 10 −5 , and α = 0.01.
As is seen from the figure, without the boundary correction the mean entropy production continues to increase, which is inconsistent with a stationary system. The need for boundary corrections is probably a consequence of our choice of system variables: it arises from the singular behaviour of the pdf at the boundaries.

Figure 6: The black curve corresponds to the noise- and r z -averaged total stochastic entropy production for protocol M̄, with γ = 1, including the boundary correction. The red curve is a single stochastic realisation, and the dashed blue curve is the asymptotic analytical averaged total stochastic entropy production from Eq. (12). The averages are obtained from one million realisations for β = 0.1, ϵ = 1, t max = 2, dt = 10 −5 , and α = 0.01. The boundary correction for γ = 1 is significantly smaller than that for γ = 2.
The entropy production for protocol M̄, with γ = 1, is shown in Figure 6. As with protocol M, these results display a plateau in the average total stochastic entropy production once the system equilibrates. We see that the process of disconnecting the measurement device also leads to a positive mean entropy production; in fact, the mean entropy production is higher than for the connection process. Note that the entropy production tends to a constant asymptotically even for an individual trajectory. Both Figs. 5 and 6 display close agreement at late times between the analytical mean total entropy production and its numerical counterpart. This validates the approximations we have made and the form we have used for the boundary correction terms.

Entropy production pdf and detailed fluctuation theorem
As a check on the accuracy of the entropy production results, we verify that the detailed fluctuation theorem Eq. (2) is satisfied. The detailed fluctuation relation concerns processes described by forward and reverse protocols that are time reversals of one another, where the final pdf under the forward protocol is the same as the initial pdf under the reverse protocol. However, as we deal here with distributions of the entropy production calculated over many trajectories, the strict reversal condition on the trajectories in protocols M and M̄ is not necessary, and we can compare e^∆s tot directly with the ratio of entropy pdfs P M (∆s tot ) / P M̄ (−∆s tot ). Figure 7 displays the asymptotic distributions of total stochastic entropy production for protocols M and M̄. These distributions should satisfy the detailed fluctuation relation, and we indeed find very good adherence, except for deviations at the extremes of the range due to insufficient sampling.
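The structure of this comparison can be illustrated with a toy example: if both protocols happened to produce Gaussian entropy-production pdfs with variance equal to twice the mean (an assumption for illustration, not a property of our simulated distributions), the detailed fluctuation relation would hold identically:

```python
import numpy as np

# Toy Gaussian entropy-production pdfs with variance = 2 * mean; for such
# distributions ln[P_M(s) / P_Mbar(-s)] = s exactly, i.e. the detailed
# fluctuation theorem P_M(s) / P_Mbar(-s) = e^s is satisfied.
mu = 1.5                                  # hypothetical mean entropy production

def gauss(s, mean, var):
    return np.exp(-(s - mean)**2/(2*var)) / np.sqrt(2*np.pi*var)

s = np.linspace(-3.0, 5.0, 9)
lhs = np.log(gauss(s, mu, 2*mu) / gauss(-s, mu, 2*mu))
print(np.allclose(lhs, s))                # the log-ratio equals s for every s
```

In the paper the ratio P M (∆s tot ) / P M̄ (−∆s tot ) is instead estimated from histograms of simulated trajectories, so deviations appear where sampling is sparse, as noted above.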

Discussion and Conclusions
In this paper, we have considered the stochastic entropy production induced by continuous measurements of a simple open quantum system. We begin with the stochastic Liouville-von Neumann (SLN) equation for the system dynamics, in which the environments are represented by coloured noises, taking the appropriate stochastic average with respect to these noises while retaining the non-Markovian generality of the dynamics. Trajectories of the reduced density matrix generated by the SLN equation are not physical: only the averaged evolution is of physical significance. In contrast, we require a stochastic dynamical unravelling that can be interpreted as the outcome of a physical weak continuous measurement of the system in conjunction with the thermalising influences of the environment. To this end, we take the Markovian limit of the behaviour of the environment in the SLN by assuming a high temperature, with a specific choice of two-level bosonic Hamiltonian H sys = ϵσ z . These steps enable us to perform a Markovian stochastic unravelling described by an Itô process, which then allows us to derive an appropriate Fokker-Planck equation and understand how the pdf of the reduced density matrix changes with the strength of the continuous measurements while the system is coupled to a thermalising bath. We have set up a system that interacts with three independent harmonic baths through the three Pauli matrices, allowing for a quasi-isotropic stochastic exploration of the Bloch sphere consistent with residence in a thermal state. We take the view that a system under constant interaction with an underspecified environment is best represented by an ensemble of reduced density matrices that conveys its uncertainty. The measurement of system energy is then realised by increasing γ, the strength of environmental coupling to σ z , causing the system to dwell in the vicinity of one of the two energy eigenstates.
We can then obtain the stochastic entropy production associated with each stochastic trajectory of the system. Note that if the pdf of the reduced density matrix is stationary for a particular set of environmental couplings, raising γ will cause the pdf to increase in the vicinity of the eigenstates of the system Hamiltonian but this will not change the average energy of an ensemble of systems. A model of the dynamics and thermodynamics of measurement requires the system to be represented by a member of an ensemble: the average behaviour will not suffice.
To calculate the entropy production associated with measurement, we use an analysis of the Markovian system dynamics developed in [5]. This contrasts with the calculation of the stochastic entropy production for quantum systems based on forward and reverse trajectories constructed using forward and reverse Kraus operators [28,31,36]. We also compute system entropy production by considering the evolution of the pdf of the reduced density matrix of the system.
We find that the numerically calculated entropy production contains some subtle numerical artefacts. These are corrected by calculating the mean analytical system entropy production in a stationary state, allowing us to obtain boundary correction terms that eliminate the artefacts.
We find that the increase and decrease in coupling strength γ, corresponding to the operation of a measuring device, are both accompanied by a positive mean stochastic entropy production. This is the entropic cost of quantum measurement. Furthermore, we show that the stochastic entropy production associated with a quantum measurement is finite as the system achieves a new stationary state. Finally, we have shown that the processes of attachment and detachment of a measuring device generate distributions of stochastic entropy production that satisfy a detailed fluctuation relation. We aim next to extend this approach to more complicated systems, to include non-Markovian dynamics, and to relax the requirement for a high temperature approximation.

Acknowledgments
D. Matos is supported by the EPSRC Centre for Doctoral Training in Cross-Disciplinary Approaches to Non-Equilibrium Systems (CANES, Grant No. EP/L015854/1). Calculations in this paper were performed using the King's College HPC cluster Gravity.

A.1 Averaged SLN equation
Here we shall consider the calculation of the functional derivatives of ρ s with respect to the noises needed for the derivation of Eq. (22). The formal solution of the SLN equation is given by ρ s (t) = U + (t, 0) ρ s (0) U − (0, t), where U + and U − are the appropriate forward and backward propagators defined as where T + and T − are the forward and backward time ordering operators, respectively, and are the Hamiltonian operators. Note that they are not adjoints of each other, see Ref. [11] for details. Hence the functional derivative of the density matrix can be written as

δρ s (t)/δζ(t′) = [δU + (t, 0)/δζ(t′)] ρ s (0) U − (0, t) + U + (t, 0) ρ s (0) [δU − (0, t)/δζ(t′)].
The derivatives of the forward and backward propagators with respect to arbitrary perturbations of the noises are given by: where V ± (τ) = −(η(τ) ± ν(τ)/2) f , for convenience dropping the suffix k. We can now evaluate the variation of the propagators with respect to the noises η(t) and ν(t). Starting with η(t): For ν(t): This then allows us to write η(t)ρ s (t) as Using this expression, we arrive at Eq. (22) given in the text.

A.2 Cylindrical coordinate equations of motion
In this section, we derive the equations of motion in cylindrical coordinates (r², r z , φ) for general γ. With these coordinates, the x and y components are given by r x = (r² − r z ²)^{1/2} cos φ and r y = (r² − r z ²)^{1/2} sin φ.
From these expressions, and using Itô's Lemma, we can obtain the equations of motion for this set of coordinates: The required derivatives are: which allows us to simplify Eqs. (72) and (73):

dr² = 2r z dr z + (dr z )² + 2r y dr y + (dr y )² + 2r x dr x + (dr x )² (78)

dφ = −[r y /(r x ² + r y ²)] dr x + [r x r y /(r x ² + r y ²)²] (dr x )² + [r x /(r x ² + r y ²)] dr y − [r x r y /(r x ² + r y ²)²] (dr y )² + [(r y ² − r x ²)/(r x ² + r y ²)²] dr x dr y . (79)
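The coefficients in Eq. (79) can be checked symbolically, confirming that the first and second derivatives of φ = arctan(r y /r x ) reproduce the Itô terms:

```python
import sympy as sp

# Symbolic check of the derivatives entering Ito's lemma for phi = arctan(r_y/r_x),
# reproducing the coefficients of Eq. (79).
rx, ry = sp.symbols('r_x r_y', real=True, positive=True)
phi = sp.atan2(ry, rx)
r2 = rx**2 + ry**2

assert sp.simplify(sp.diff(phi, rx) + ry/r2) == 0        # coeff. of dr_x
assert sp.simplify(sp.diff(phi, ry) - rx/r2) == 0        # coeff. of dr_y
assert sp.simplify(sp.diff(phi, rx, 2) - 2*rx*ry/r2**2) == 0   # 1/2 of this -> (dr_x)^2 term
assert sp.simplify(sp.diff(phi, ry, 2) + 2*rx*ry/r2**2) == 0   # 1/2 of this -> (dr_y)^2 term
assert sp.simplify(sp.diff(phi, rx, ry) - (ry**2 - rx**2)/r2**2) == 0  # dr_x dr_y term
print("all Ito coefficients of Eq. (79) verified")
```

The factors of 1/2 in Itô's lemma applied to the pure second derivatives yield exactly the ±r x r y /(r x ² + r y ²)² coefficients of the (dr_x)² and (dr_y)² terms above.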
dr 2 = 4λ 2 r 2 − 1 γ 2 r 2

A.3 Boundary terms in the system entropy production
Next we integrate the second term inside the integral by parts, and replace the r z derivative of the probability current with the left hand side of the Fokker-Planck equation (95): ∂ ln p(r z , t)/∂t p(r z , t) dt + ln p(r z , t) (∂J z /∂r z ) dt dr z − [J z ln p(r z , t)] where dS G is the increment of the Gibbs entropy, see Eq. (13).