A Framework for Sequential Measurements and General Jarzynski Equations

Heinz-Jürgen Schmidt; Jochen Gemmer

doi:10.1515/zna-2019-0272

Publicly Available Published by De Gruyter December 12, 2019

From the journal Zeitschrift für Naturforschung A

https://doi.org/10.1515/zna-2019-0272

Abstract

We formulate a statistical model of two sequential measurements and prove a so-called J-equation that leads to various diversifications of the well-known Jarzynski equation including the Crooks dissipation theorem. Moreover, the J-equation entails formulations of the Second Law going back to Wolfgang Pauli. We illustrate this by an analytically solvable example of sequential discrete position–momentum measurements accompanied by an increase of Shannon entropy. The standard form of the J-equation extends the domain of applications of the standard quantum Jarzynski equation in two respects: It includes systems that are initially only in local equilibrium, and it extends this equation to the cases where the local equilibrium is described by microcanonical, canonical, or grand canonical ensembles. Moreover, the case of a periodically driven quantum system in thermal contact with a heat bath is shown to be covered by the theory presented here if the quantum system assumes a quasi-Boltzmann distribution. Finally, we briefly consider the generalised Jarzynski equation in classical statistical mechanics.

Keywords: Jarzynski Equations; Second Law; Sequential Measurements

1 Introduction

The famous Jarzynski equation represents one of the rare exact results in nonequilibrium statistical mechanics. It is a statement about the expectation value ⟨e^{−βw}⟩ of the exponential of the work performed on a system that is initially in thermal equilibrium with inverse temperature β, but can be far from equilibrium after the work process. This equation was first formulated for classical systems [1] and subsequently proven to hold for quantum systems [2], [3], [4]. Extensions to systems initially in local thermal equilibrium [3], microcanonical ensembles [5], and grand canonical ensembles [6], [7], [8], [9], [10], [11] have been published. The literature on the Jarzynski equation and its applications is abundant; a concise review is given in [12] with the emphasis on the connection with other fluctuation theorems. The most common approach to the quantum Jarzynski equation is in terms of sequential measurements. This approach will also be adopted in the present article. The “work” that appears in the Jarzynski equation is then understood in terms of the energy differences according to two sequential measurements and hence as a random variable. Although “work” is not an observable [13] in the sense of a self-adjoint operator giving rise to a projection-valued measure, it can be viewed as a generalised observable [14] in the sense of a positive operator–valued measure, see Roncaglia et al. [15], De Chiara et al. [16], and Section 3.1.

Interestingly, one can derive from the Jarzynski equation certain inequalities that resemble the Second Law, see, e.g. Campisi and Hänggi [17]. However, a closer inspection shows that these inequalities are not exactly statements about the nondecrease of entropy. This interpretation would be valid only in the limiting case where the system is approximately in thermal equilibrium also after the work process. On the other hand, there are numerous attempts to derive a Second Law in the sense of nondecreasing entropy in quantum mechanics, starting with the article of W. Pauli [18] “on the H-theorem concerning the increase of entropy in the view of the new quantum mechanics.” It is the aim of the present article to unify these two routes of research and to identify their common roots.

The structure of the article is as follows. In Section 2, we develop a general framework for sequential measurements and prove a so-called J-equation essentially based on the assumption of a (modified) doubly stochastic conditional probability matrix. The J-equation depends on an arbitrary sequence q(j) of hypothetical probabilities, but in this article we will consider only two special cases, case R (“real probabilities”) and case S (“standard probabilities”) to be defined in Section 3. In case R, the J-equation implies, via Jensen’s inequality, an increase of the (modified) Shannon entropy from the first to the second measurement. In Section 3, we specialise the general framework of Section 2 to the case of quantum theory such that the first measurement is of Lüders type satisfying two more assumptions. Then, we can reformulate the J-equation for case S where the initial density matrix is a function 𝒢 of L commuting self-adjoint operators, see Theorem 1. Special choices for the function 𝒢 and the L commuting self-adjoint operators lead to various diversifications of the Jarzynski equation: the local equilibrium given by N canonical ensembles (Subsection 3.2), the microcanonical ensemble (Subsection 3.3), and the grand canonical ensemble case (Subsection 3.4). Moreover, in Subsection 3.5, we consider recently discovered cases of a periodically driven quantum system that are in quasi-equilibrium with a heat bath possessing a quasi-temperature 1/ϑ and show how these cases can also be covered by the present theory. In all these applications, we will obtain case S variants of the Second Law–like statements following from the Jarzynski equations via Jensen’s inequality.

The aforementioned case R variant of the Second Law also holds in quantum theory. This will be discussed in some more detail in Section 4 containing further applications. It will be instructive to consider the analytically solvable example of two subsequent discrete position–momentum measurements on a free particle moving in one dimension and to confirm the mentioned increase of Shannon entropy, see Subsection 4.2. In the following Subsection 4.3, we show how to integrate the quantum version of the Crooks dissipation theorem into our approach. We briefly discuss how the results hitherto derived can be transferred to the classical realm in Section 5. We close with a summary and outlook in Section 6. In order to make the article more readable, we have shifted most of the proofs and further mathematical details to two Appendices.

2 Statistical Model of Sequential Measurements

2.1 Simple Case

We consider two sequential measurements on the same physical system at times t₀<t₁ with respective outcome sets ℐ and 𝒥. These sets are assumed to be finite or countably infinite. Hence, the joint outcome of the two measurements can be represented by the pair (i,j)∈ℐ×𝒥. We define

(1)𝖤≡ℐ×𝒥

as the set of “elementary events” and describe the probability of elementary events by a function

(2)P:𝖤→[0,1]

subject to the natural condition

(3) ∑_{(i,j)∈𝖤} P(i,j) = 1.

As usual, one defines the first and second marginal probability functions

(4)p:ℐ→[0,1]

(5) p(i) ≡ ∑_{j∈𝒥} P(i,j),

and

(6) p̂:𝒥→[0,1]

(7) p̂(j) ≡ ∑_{i∈ℐ} P(i,j).

For the sake of simplicity, we will assume

(8)p(i)>0 for all i∈ℐ.

This could be achieved by deleting all outcomes i∈ℐ with p(i) = 0, thereby reducing the set ℐ. Due to (8), the “conditional probability”

(9) π(j|i) ≡ P(i,j)/p(i)

can be defined for all (i,j)∈𝖤. It satisfies

(10) ∑_{j∈𝒥} π(j|i) = 1 for all i∈ℐ,

and hence can be considered as a stochastic matrix. Note further that

(11) p̂(j) = ∑_{i∈ℐ} π(j|i) p(i).

If additionally π is a “doubly stochastic matrix,” i.e.

(12) ∑_{i∈ℐ} π(j|i) = 1 for all j∈𝒥,

the triple (ℐ,𝒥,P) will be called a “statistical model of two sequential measurements” (SM2).

In accordance with the usual nomenclature of probability theory, functions X:𝖤→ℝ are also called “random variables.” Their expectation value is defined as

(13) ⟨X⟩ ≡ ∑_{(i,j)∈𝖤} X(i,j) P(i,j),

if the series converges. In a slight abuse of notation, the expectation value will sometimes also be written as ⟨X(i,j)⟩ if no misunderstanding is likely to occur. We have the following result:

Proposition 1

If (ℐ,𝒥,P) is an SM2 and q:𝒥→[0,1] is a sequence satisfying

(14) ∑_{j∈𝒥} q(j) = 1,

then

(15) ⟨q(j)/p(i)⟩ = 1.

Conversely, if (ℐ,𝒥,P) satisfies the above conditions, but not necessarily (12), and (15) holds for all q:𝒥→[0,1] satisfying (14), then π(j|i) will be doubly stochastic.

The q(j) will also be called “hypothetical probabilities” in contrast to the “real probabilities” p̂(j). The choice q(j)=p̂(j) will be referred to as the “case R.” The above proposition essentially says that the J-equation is equivalent to π(j|i) being doubly stochastic. The proof can be found in Appendix A.
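Proposition 1 can be checked numerically on a small example. The following is a minimal sketch, not part of the original article: the 3×3 matrix, the distributions `p` and `q`, and the helper name `j_average` are all hypothetical choices made for illustration. Exact rational arithmetic is used so that the value 1 is reproduced without rounding.

```python
from fractions import Fraction as Fr

# Hypothetical example: pi[i][j] = pi(j|i); every row and every column sums to 1.
pi = [
    [Fr(1, 2), Fr(1, 3), Fr(1, 6)],
    [Fr(1, 3), Fr(1, 6), Fr(1, 2)],
    [Fr(1, 6), Fr(1, 2), Fr(1, 3)],
]
# Stochastic but not doubly stochastic: all weight goes to outcome j = 0.
pi_bad = [[Fr(1), Fr(0), Fr(0)] for _ in range(3)]

p = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]   # assumed first marginal, all entries > 0, cf. (8)
q = [Fr(1, 5), Fr(3, 10), Fr(1, 2)]  # assumed hypothetical probabilities, cf. (14)

def j_average(pi, p, q):
    """Return <q(j)/p(i)> with respect to P(i,j) = pi(j|i) p(i), cf. (9), (13), (15)."""
    return sum(
        (q[j] / p[i]) * pi[i][j] * p[i]
        for i in range(len(p))
        for j in range(len(q))
    )

j_value = j_average(pi, p, q)          # doubly stochastic case
j_value_bad = j_average(pi_bad, p, q)  # violates (12)
print(j_value, j_value_bad)
```

In this sketch the average equals 1 for the doubly stochastic matrix, whereas the second matrix, which violates (12), gives a different value. This illustrates both directions of the proposition for one particular choice of q.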

We will call (15) and its modified form (27) the “J-equation” as we think that it contains the probabilistic core of the Jarzynski equation but should be distinguished from the latter for the sake of clarity. To illustrate this claim, we note that any sequence p:ℐ→[0,1] of probabilities satisfying (8) may be written in the form

(16) p(i) = exp(−β(E_i − F)),

and, analogously,

(17) q(j) = exp(−β(E′_j − F′)),

where the β, E_i, E′_j, F, F′ are certain real parameters, not uniquely determined by (16) and (17). Then, (15) can be written as

(18) ⟨e^{−βw}⟩ = e^{−βΔF},

where w:𝖤→ℝ is a random variable defined by w(i,j) ≡ E′_j − E_i and ΔF ≡ F′ − F. Indeed, (18) has the form of the standard Jarzynski equation, but in general, the parameters occurring in (18) will not have the physical meaning of inverse temperature β, work w, and difference of free energies ΔF, as required for the Jarzynski equation. Even in the special case where the usual physical interpretation of the parameters β, E_i, E′_j, F, F′ holds, we have not yet proven the standard Jarzynski equation, because we still would have to confirm the conditions of Proposition 1 for this special case.
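The step from (15) to (18) can be spelled out as follows. The derivation uses only the parametrisations (16) and (17) and the definitions of w and ΔF given above; it is added here as a worked intermediate step.

```latex
\begin{align*}
\frac{q(j)}{p(i)}
  &= \frac{\exp\bigl(-\beta(E'_j - F')\bigr)}{\exp\bigl(-\beta(E_i - F)\bigr)}
   = \exp\bigl(-\beta\,(E'_j - E_i)\bigr)\,\exp\bigl(\beta\,(F' - F)\bigr)
   = e^{-\beta w(i,j)}\, e^{\beta \Delta F}, \\
1 &= \Bigl\langle \frac{q(j)}{p(i)} \Bigr\rangle
   = e^{\beta \Delta F}\,\bigl\langle e^{-\beta w} \bigr\rangle
   \quad\Longrightarrow\quad
   \bigl\langle e^{-\beta w} \bigr\rangle = e^{-\beta \Delta F}.
\end{align*}
```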

Let (ℐ,𝒥,P) be an SM2 and choose q(j)=p̂(j) for all j∈𝒥, that is, replace the hypothetical probabilities by the real ones. Then, by Proposition 1,

(19) ⟨p̂(j)/p(i)⟩ = 1.

As the logarithm (to an arbitrary base) is a concave function, Jensen’s inequality yields

(20) ⟨log X⟩ ≤ log⟨X⟩

for any random variable X:𝖤→ℝ+. If we define the Shannon entropy [19] as usual by

(21) S(p) ≡ −∑_i p_i log p_i,

it follows immediately from (20) that S(p) does not decrease between two sequential measurements:

Proposition 2

(22) S(p̂) ≥ S(p).

The proof can be found in Appendix A.
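The chain of arguments leading to (22) can be illustrated numerically. The sketch below is not taken from the article; the matrix `pi` and the distribution `p` are assumed example data. It evaluates ⟨log(p̂(j)/p(i))⟩, which by (19) and (20) cannot exceed log 1 = 0, and compares it with the entropy difference.

```python
import math

# Hypothetical doubly stochastic matrix: pi[i][j] = pi(j|i), cf. (10) and (12).
pi = [
    [1 / 2, 1 / 3, 1 / 6],
    [1 / 3, 1 / 6, 1 / 2],
    [1 / 6, 1 / 2, 1 / 3],
]
p = [0.5, 0.3, 0.2]  # assumed first marginal, all entries > 0

# Second marginal via (11): p_hat(j) = sum_i pi(j|i) p(i).
p_hat = [sum(pi[i][j] * p[i] for i in range(3)) for j in range(3)]

def shannon(dist):
    """Shannon entropy (21), here with the natural logarithm."""
    return -sum(x * math.log(x) for x in dist if x > 0)

# Left-hand side of Jensen's inequality (20) for X = p_hat(j) / p(i).
mean_log = sum(
    math.log(p_hat[j] / p[i]) * pi[i][j] * p[i]
    for i in range(3)
    for j in range(3)
)

s_before, s_after = shannon(p), shannon(p_hat)
print(s_before, s_after, mean_log)
```

Since ⟨log p̂(j)⟩ = −S(p̂) and ⟨log p(i)⟩ = −S(p), the quantity `mean_log` equals S(p) − S(p̂), so its nonpositivity is exactly the statement (22).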

It is an obvious question under which circumstances the inequality in (22) will be a strict one. We will answer this question only for the case of finite ℐ=𝒥:

Proposition 3

Let ℐ=𝒥 and |ℐ|=n. Then, S(p̂)=S(p) if the conditional probability is of permutational type, i.e. if π(j|i)=δ_{j,σ(i)} for some permutation σ∈S(n).

One may ask which assumption is responsible for the asymmetry between the two sequential measurements that appears in (22). Obviously, this is the property (12) of the conditional probability matrix being doubly stochastic that is postulated only for the first conditional probability π and not for the second one π̂(i|j) ≡ P(i,j)/p̂(j). It will be instructive to consider the situation in which both matrices, π and π̂, are doubly stochastic. For the sake of simplicity, we will assume that the outcome sets ℐ and 𝒥 are finite, both containing exactly n elements, and that P(i,j)>0 for all i∈ℐ and j∈𝒥. It follows that, in matrix notation, πp=p̂ and π̂p̂=p, cf. (11); moreover, the double stochasticity may be written as π𝟏=π̂𝟏=𝟏, where 𝟏 denotes the constant vector 𝟏=(1,1,…,1). Hence, the matrix Π≡π̂π has the two positive invariant distributions (1/n)𝟏 and p. As P(i,j)>0, the matrix Π is irreducible, and hence, its positive invariant distribution is unique (Theorem 54 of [20]). Consequently, p(i)=p̂(j)=1/n for all i∈ℐ and j∈𝒥. This characterises the constant distribution with maximal Shannon entropy and hence a completely symmetric situation.

An equation similar to the J-equation (15) considered above has been proven in [21]. In our notation, it can be formulated as

(23) ⟨p(i)/π̂(i|j)⟩ = ⟨p̂(j)/π(j|i)⟩ = ⟨p(i)p̂(j)/P(i,j)⟩ = 1.

However, a closer comparison of (23) and (15) shows that these equations are not equivalent, which is also clear from the fact that (23) does not presuppose additional assumptions like the double stochasticity of the conditional probability.

2.2 Modified Case

Now we will formulate a slightly more general framework for SM2 that is motivated by applications using quantum theory in Section 3 and partially follows the account of Wolfgang Pauli in [18], Ch. I §2.

When defining the modified framework for sequential measurements, we will again consider the triple (ℐ,𝒥,P) assumed for the simple case and additionally postulate two nonvanishing functions

(24)d:ℐ→ℕ and D:𝒥→ℕ.

In Section 3, the d(i) and D(j) will be interpreted as the degeneracies of certain eigenspaces of measured observables. In Appendix B, we will derive the following assumption (25) characterising the modified case by coarse graining of the outcome sets of the simple case. Here, the d(i) and D(j) play the role of cell sizes of the coarse graining.

The quintuple (ℐ,𝒥,P,d,D) will be called a “modified statistical model of two sequential measurements” (mSM2), if the condition of π being doubly stochastic is replaced by the unprimed version of (B9):

(25) ∑_{i∈ℐ} π(j|i) d(i) = D(j) for all j∈𝒥.

We will generally denote a conditional probability function π:𝖤→[0,1] satisfying (25) as being of “modified doubly stochastic” type.

Consequently, we obtain the following variant of Proposition 1:

Proposition 4

If (ℐ,𝒥,P,d,D) is an mSM2 and q:𝒥→[0,1] is a sequence satisfying

(26) ∑_{j∈𝒥} q(j) = 1,

then

(27) ⟨d(i)q(j)/(D(j)p(i))⟩ = 1.

Conversely, if (ℐ,𝒥,P,d,D) satisfies the above conditions, but not necessarily (25), and (27) holds for all q:𝒥→[0,1] satisfying (26), then π(j|i) will be of modified doubly stochastic type.

The proof is completely analogous to that of Proposition 1. Analogously, it follows that the modified Shannon entropy

(28) S′(p) ≡ −∑_i p(i) log(p(i)/d(i))

does not decrease in the modified statistical model of two sequential measurements, i.e.

Proposition 5

(29) S′(p̂) ≥ S′(p),

where

(30) S′(p̂) ≡ −∑_j p̂(j) log(p̂(j)/D(j)).

This inequality is analogous to the statement dS/dt ≥ 0 after (22) in [18] that has been proven by Pauli using the stronger symmetry condition (in our notation)

(31) π(j|i)/D(j) = π(i|j)/d(i),

see (21) in [18], justified by first-order perturbation theory (“Fermi’s Golden Rule”). We note that in general (31) need not hold, see Section 4.2 for a counter-example, but there are also positive examples beyond the Golden Rule, see Subsection 4.3.

2.3 Symmetric Formulation

In this subsection, we will give a more symmetric and slightly more formal account of the framework theory for the (modified) statistical model of sequential measurements that will be used later in Subsection 4.3.

The basic concepts will be ℐ,𝒥,Π,p,q. As in Section 2.1, ℐ and 𝒥 will be finite or countably infinite sets and

(32)𝖤≡ℐ×𝒥.

We first postulate

Axiom 1

p is a function p:ℐ→(0,1) such that the series ∑_i p(i) converges and has the value ∑_i p(i)=1. Analogously,

q is a function q:𝒥→(0,1) such that the series ∑_j q(j) converges and has the value ∑_j q(j)=1.

Next, Π will be a function Π:𝖤→ℝ+ called the “conditional matrix” that has no direct physical meaning. Its values will be written as Π(j|i). It is subject to the following.

Axiom 2

Π is a function Π:𝖤→ℝ+ such that the two following series converge for all i∈ℐ, j∈𝒥 and have positive values:

(33) d(i) ≡ ∑_j Π(j|i) > 0,

(34) D(j) ≡ ∑_i Π(j|i) > 0.

As an immediate consequence, the “first conditional probability”

(35) π(j|i) ≡ Π(j|i)/d(i)

satisfies

(36) ∑_j π(j|i) = 1 for all i∈ℐ,

and

(37) ∑_i π(j|i) d(i) = D(j) for all j∈𝒥.

Hence, π is a doubly stochastic matrix in the modified sense. Define

(38)P(i,j)≡π(j|i)p(i) for all (i,j)∈𝖤,

then

(39) ∑_{(i,j)∈𝖤} P(i,j) = 1,

and m=(ℐ,𝒥,P,d,D) will be an mSM2 in the sense of Subsection 2.2.
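The construction of this subsection can be traced on a small example. The following sketch is an illustration under stated assumptions, not the authors' code: the 2×3 conditional matrix `Pi` and the distributions `p` and `q` are hypothetical. Starting from Π, it builds d, D, π and P as in (33) to (38) and checks the normalisations (36), (37), (39) together with the J-equation (27).

```python
from fractions import Fraction as Fr

# Hypothetical conditional matrix: Pi[i][j] = Π(j|i) > 0 with |I| = 2, |J| = 3.
Pi = [
    [Fr(2), Fr(1), Fr(1)],
    [Fr(1), Fr(3), Fr(2)],
]
p = [Fr(1, 4), Fr(3, 4)]            # assumed probabilities p(i), Axiom 1
q = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]  # assumed probabilities q(j), Axiom 1
I, J = range(len(p)), range(len(q))

d = [sum(Pi[i][j] for j in J) for i in I]  # (33)
D = [sum(Pi[i][j] for i in I) for j in J]  # (34)

pi = [[Pi[i][j] / d[i] for j in J] for i in I]  # (35): pi[i][j] = π(j|i)
P = [[pi[i][j] * p[i] for j in J] for i in I]   # (38)

row_sums = [sum(pi[i][j] for j in J) for i in I]        # should all be 1, cf. (36)
weighted = [sum(pi[i][j] * d[i] for i in I) for j in J]  # should equal D, cf. (37)
total = sum(P[i][j] for i in I for j in J)               # should be 1, cf. (39)

# J-equation (27): <d(i) q(j) / (D(j) p(i))> evaluated with P.
j_equation = sum(d[i] * q[j] / (D[j] * p[i]) * P[i][j] for i in I for j in J)
print(d, D, j_equation)
```

The example deliberately uses outcome sets of different size, which is possible in the modified framework because only the weighted column sums (37) are constrained.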

As Axiom 1 and Axiom 2 are completely symmetric with respect to the transpositions ℐ↔𝒥 and p↔q, we may analogously define a second model m̃=(𝒥,ℐ,P̃,D,d) by

(40) π̃(i|j) ≡ Π(j|i)/D(j) = π(j|i) d(i)/D(j) [by (35)],

and

(41) P̃(j,i) ≡ π̃(i|j) q(j),

such that

(42) ∑_j π̃(i|j) D(j) = d(i) for all i∈ℐ,

and

(43) ∑_{(j,i)∈𝖤} P̃(j,i) = 1.

Hence, m̃=(𝒥,ℐ,P̃,D,d) will also be an mSM2 in the sense of Subsection 2.2 called the “reciprocal model” with respect to m and will satisfy a reciprocal J-equation of the form

(44) ⟨D(j)p(i)/(d(i)q(j))⟩ = 1.

Let us reconsider the original J-equation (27) and define the corresponding random variable Y (in a less sloppy way than above) as

(45) Y(i,j) = d(i)q(j)/(D(j)p(i)) for all i∈ℐ and j∈𝒥.

We then rewrite the expectation value of Y in the following way. For any real number y∈ℝ, we define

(46) 𝖤_y ≡ {(i,j)∈𝖤 | Y(i,j)=y}.

Let 𝖸 ≡ {y∈ℝ | 𝖤_y≠∅}, then

(47) ⟨Y⟩ = ∑_{y∈𝖸} (∑_{(i,j)∈𝖤_y} P(i,j)) y.

The sum in the brackets can be interpreted as the probability that Y assumes the value y, or, in symbols:

(48) P(Y=y) = ∑_{(i,j)∈𝖤_y} P(i,j).

Next, we repeat the above definitions for the reciprocal model m̃=(𝒥,ℐ,P̃,D,d) setting

(49) Ỹ(j,i) = D(j)p(i)/(d(i)q(j)) = 1/Y(i,j),

(50) 𝖤̃_z = {(j,i)∈𝖤̃ | Ỹ(j,i)=z},

for z∈ℝ and

(51) P̃(Ỹ=1/y) = ∑_{(j,i)∈𝖤̃_{1/y}} P̃(j,i).

We note that

(52) (i,j)∈𝖤_y ⇔ (j,i)∈𝖤̃_{1/y},

and formulate the following “C-equation”:

Proposition 6

For all y∈𝖸, there holds

(53) P(Y=y)/P̃(Ỹ=1/y) = 1/y.

The proof can be found in Appendix A.

From the C-equation, we may again derive the J-equation (27) in the following way:

(54) ⟨Y⟩ = ∑_{y∈𝖸} P(Y=y) y [by (47), (48)] = ∑_{y∈𝖸} P̃(Ỹ=1/y) [by (53)] = 1.
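The C-equation (53) and the rederivation (54) of the J-equation can be illustrated in the same spirit. The sketch below rests on assumed example data (the matrix `Pi` and the distributions `p`, `q` are hypothetical): it collects the distributions of Y under P and of Ỹ under P̃ and then compares them value by value.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical data: Pi[i][j] = Π(j|i) > 0, with p and q as in Axiom 1.
Pi = [[Fr(2), Fr(1), Fr(1)], [Fr(1), Fr(3), Fr(2)]]
p = [Fr(1, 4), Fr(3, 4)]
q = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]
I, J = range(len(p)), range(len(q))
d = [sum(Pi[i][j] for j in J) for i in I]  # (33)
D = [sum(Pi[i][j] for i in I) for j in J]  # (34)

dist_Y = defaultdict(Fr)     # y -> P(Y = y), cf. (48)
dist_Yrec = defaultdict(Fr)  # z -> probability that the reciprocal variable equals z, cf. (51)
for i in I:
    for j in J:
        y = d[i] * q[j] / (D[j] * p[i])             # (45)
        dist_Y[y] += Pi[i][j] * p[i] / d[i]         # P(i,j), cf. (35) and (38)
        dist_Yrec[1 / y] += Pi[i][j] * q[j] / D[j]  # reciprocal P(j,i), cf. (40) and (41)

# C-equation (53): P(Y = y) divided by the reciprocal probability of 1/y equals 1/y.
c_equation_holds = all(dist_Y[y] / dist_Yrec[1 / y] == 1 / y for y in dist_Y)

# J-equation via (54): the mean of Y equals the total reciprocal probability, i.e. 1.
mean_Y = sum(y * prob for y, prob in dist_Y.items())
print(c_equation_holds, mean_Y)
```

Exact fractions are used so that equal values of Y are grouped reliably; with floating-point numbers the grouping into the sets 𝖤_y would require a tolerance.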

3 Applications to Quantum Theory

3.1 General Case

We will investigate how the (modified) statistical model of sequential measurements outlined in the preceding subsections can be realised within the framework of quantum theory. The identification of the respective concepts will be facilitated by denoting them with the same letters. Additionally to a number of usual assumptions, we will use Assumption 1 and Assumption 2 that are highlighted below.

We consider a quantum system with a Hilbert space ℋ and a finite number of mutually commuting self-adjoint operators Ẽ₁,…,Ẽ_L defined on (suitable domains of) ℋ. They are assumed to have a pure point spectrum and hence a family of common eigenprojections (P̃_i)_{i∈ℐ} such that

(55) Ẽ_λ = ∑_{i∈ℐ} E_i^{(λ)} P̃_i, λ=1,…,L.

Here ℐ is a finite or countably infinite index set to be identified with the outcome set of the first measurement according to Section 2. The P̃_i are assumed to be of finite degeneracy,

(56) d(i) ≡ Tr(P̃_i) < ∞ for all i∈ℐ,

and are chosen as maximal projections in the sense that i≠j implies E_i^{(λ)} ≠ E_j^{(λ)} for at least one λ=1,…,L. Note the completeness relation

(57) ∑_{i∈ℐ} P̃_i = 𝟙.

Physically, the Ẽ₁,…,Ẽ_L correspond to observables that can be jointly measured. We assume a (mixed) state of the system before the time t=t₀ described by a density operator ρ and perform a joint Lüders measurement, cf. [14] (10.22), of Ẽ₁,…,Ẽ_L at the time t=t₀. The probability of the outcome i∈ℐ will be

(58) p(i) = Tr(ρP̃_i),

satisfying

(59) ∑_{i∈ℐ} p(i) = 1.

In accordance with (8), we will make the following assumption:

Assumption 1

(60) p(i) > 0 for all i∈ℐ.

The validity of this assumption could be achieved by restricting the Hilbert space ℋ to the subspace spanned by the eigenspaces of those P̃_i with p(i)=Tr(ρP̃_i)>0.

After the first measurement of the Ẽ₁,…,Ẽ_L, the system is subject to a further time evolution and a second measurement of (possibly) other observables. Thus, the primary preparation together with the first measurement may be considered as another preparation of a certain state, in general different from the initial state ρ. If a selection according to a particular outcome i∈ℐ is involved, this state will be, according to the assumption of a Lüders measurement, cf. [14] (10.22),

(61) ρ_i = P̃_i ρ P̃_i / Tr(ρP̃_i) = P̃_i ρ P̃_i / p(i).

If no selection according to a particular outcome is involved, the state resulting after the first measurement will rather be the mixed state

(62) ρ₁ = ∑_{i∈ℐ} p(i) ρ_i = ∑_{i∈ℐ} P̃_i ρ P̃_i [by (61)].

In order to apply the results of the preceding section we will make the following crucial assumption:

Assumption 2

(63) ρ_i = (1/d(i)) P̃_i for all i∈ℐ.

If P̃_i is a one-dimensional projection, i.e. if d(i)=1, the assumption (63) will be automatically satisfied. In the case of d(i)>1, this assumption means that ρ is diagonal with respect to any common eigenbasis of the Ẽ₁,…,Ẽ_L. An important case where (63) holds is given if ρ is a function of the operators Ẽ₁,…,Ẽ_L, say,

(64) ρ = 𝒢(Ẽ₁,…,Ẽ_L).

This has to be interpreted in the sense of functional calculus as

(65) ρ = ∑_{i∈ℐ} 𝒢(E_i^𝝀) P̃_i,

where E_i^𝝀 will be the short-hand notation for (E_i^{(1)},…,E_i^{(L)}). It follows that, in accordance with (63),

(66) ρ_i = (1/p(i)) P̃_i ρ P̃_i [by (61)] = (1/p(i)) 𝒢(E_i^𝝀) P̃_i [by (65)] = (1/d(i)) P̃_i [by (67)],

(67) p(i) = Tr(ρP̃_i) [by (58)] = 𝒢(E_i^𝝀) Tr P̃_i [by (65)] = 𝒢(E_i^𝝀) d(i) [by (56)].

In what follows, we will refer to the case (64) as the “standard case” (case S). However, we stress that this is not the most general case compatible with Assumption 2 as the counter-example of all d(i)=1 and ρ not commuting with the P̃_i shows.

Next, we consider a second set of observables described by the mutually commuting self-adjoint operators F̃₁,…,F̃_L subject to analogous assumptions. Hence, the following holds:

(68) F̃_λ = ∑_{j∈𝒥} F_j^{(λ)} Q̃_j, λ=1,…,L,

(69) D(j) ≡ Tr(Q̃_j) < ∞ for all j∈𝒥,

and

(70) ∑_{j∈𝒥} Q̃_j = 𝟙.

We have chosen another index set 𝒥 for the second set of observables in order to stress that no natural identification between both index sets is required in what follows. Obviously, 𝒥 has to be identified with the second outcome set introduced in Section 2. In general, the Ẽ_λ will not commute with the F̃_μ. We assume that a second measurement of the F̃₁,…,F̃_L will be performed at the time t=t₁>t₀, not necessarily of Lüders type. Between the two measurements in the time interval (t₀,t₁), the evolution of the system can be quite arbitrary and will be described by a unitary evolution operator U=U(t₁,t₀).

Next, we will show that the suitably defined “physical conditional probability” p(j|i) is of modified doubly stochastic type, and hence, the J-equation (27) also holds in cases of physical relevance. Recall that the state of the system immediately after the first measurement at time t=t₀ with outcome i∈ℐ is assumed to be of the form (63) and that the time evolution between t=t₀ and t=t₁ is given by the unitary evolution operator U. Hence, according to the rules of quantum theory

(71) p(j|i) = Tr(Q̃_j U (P̃_i/d(i)) U*) for all i∈ℐ, j∈𝒥.

Moreover,

Lemma 1

The physical conditional probability (71) is of modified doubly stochastic type.

The proof of this Lemma can be found in Appendix A.
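Lemma 1 can be made plausible with a finite-dimensional toy example. The sketch below is an assumption-laden illustration rather than the proof of Appendix A: it uses a three-dimensional real Hilbert space, projectors that are diagonal in the standard basis (any change of basis between the two measurements can be absorbed into U), and a hypothetical orthogonal matrix `U` built from two plane rotations. For such projectors, the trace in (71) reduces to a sum of squared matrix elements of U.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def rot(plane, angle):
    """3x3 rotation by `angle` in the plane spanned by the two given basis indices."""
    m = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    a, b = plane
    m[a][a] = m[b][b] = math.cos(angle)
    m[a][b] = -math.sin(angle)
    m[b][a] = math.sin(angle)
    return m

# Hypothetical real orthogonal (hence unitary) time evolution.
U = matmul(rot((0, 1), 0.7), rot((1, 2), 1.1))

cells_first = [[0, 1], [2]]   # basis indices spanning the ranges of the first projectors
cells_second = [[0], [1, 2]]  # basis indices spanning the ranges of the second projectors
d = [len(c) for c in cells_first]   # degeneracies d(i), cf. (56)
D = [len(c) for c in cells_second]  # degeneracies D(j), cf. (69)

def cond_prob(j, i):
    """p(j|i) of (71) for basis-diagonal projectors: sum of U[a][b]**2 over the cells, over d(i)."""
    return sum(U[a][b] ** 2 for a in cells_second[j] for b in cells_first[i]) / d[i]

row_sums = [sum(cond_prob(j, i) for j in range(2)) for i in range(2)]         # cf. (36)
weighted = [sum(cond_prob(j, i) * d[i] for i in range(2)) for j in range(2)]  # cf. (25)
print(row_sums, weighted, D)
```

Both checks rely only on the rows and columns of U being normalised, which mirrors the role of unitarity and of the completeness relations (57) and (70) in the general argument.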

Recall that certain “hypothetical probabilities” q(j) occur in Proposition 4 of Section 2.2. For the quantum case, we will always assume that these probabilities are of the following form:

(72) q(j) = Tr(𝒢(F̃₁,…,F̃_L) Q̃_j) = D(j) 𝒢(F_j^𝝀)

for all j∈𝒥, where the function 𝒢 is chosen to be the same as in (64). We understand the “standard case S” as including the condition (72).

Lemma 1 and Proposition 4 immediately entail the following theorem, referred to as claiming the general Jarzynski equation, which will be formulated only for the standard case S:

Theorem 1

Let Ẽ₁,…,Ẽ_L be a family of mutually commuting self-adjoint operators with the spectral decomposition (55) satisfying (56), likewise F̃₁,…,F̃_L a second family satisfying (68) and (69). Further, let ρ=𝒢(Ẽ₁,…,Ẽ_L) be a density operator such that

(73) p(i) = Tr(ρP̃_i) > 0

holds for all i∈ℐ. Further, let U be some unitary time evolution operator. Then, the following holds

(74) ⟨𝒢(F_j^𝝀)/𝒢(E_i^𝝀)⟩ = 1,

where the expectation value has been calculated by means of the physical probability function p(i,j) = p(j|i) p(i) = Tr(Q̃_j U (P̃_i/d(i)) U*) p(i).

We note in passing that the physical probabilities p(i, j) can be written as

(75)p(i,j)=Tr(ρF(i,j)),

where the positive operators

(76) F(i,j) ≡ P̃_i U* Q̃_j U P̃_i ≥ 0

satisfy

(77) ∑_{(i,j)∈𝖤} F(i,j) = ∑_{i∈ℐ} P̃_i U* (∑_{j∈𝒥} Q̃_j) U P̃_i = ∑_{i∈ℐ} P̃_i [by (70)] = 𝟙 [by (57)]

and hence constitute a “positive operator valued measure” (POVM) F:𝖤→ℒ(ℋ). Here, ℒ(ℋ) denotes the space of bounded, linear operators defined on ℋ. (For a more general definition, see [14]; note that we use a simplified form of POVM adapted to countably infinite outcome spaces 𝖤.) Hence, the various random variables defined on 𝖤, including “work,” can be viewed as generalised observables in the sense of [14], albeit generally not “sharp observables” (i.e. observables described by projection valued measures). This observation puts the statement of [13] “work is not an observable” into perspective, see also [15], [16], [22].

For the examples in the following subsections, it suffices to identify the families Ẽ₁,…,Ẽ_L and F̃₁,…,F̃_L and the function 𝒢. The conditions of Theorem 1 can be easily verified by the reader; it remains to evaluate the general Jarzynski equation (74) for the various examples. Moreover, as in all examples the convex exponential function is involved, we may invoke Jensen’s inequality and obtain special relations that may be viewed as manifestations of the Second Law for nonequilibrium case S scenarios, but have to be distinguished from the case R statement of nondecreasing modified Shannon entropy in Subsection 4.1.

3.2 Systems in Local Canonical Equilibrium

We assume that the quantum system consists of N subsystems and consequently ℋ = ⊗_{μ=1}^{N} ℋ_μ. For each subsystem, we assume a, possibly time-dependent, Hamiltonian H_μ(t), where the lift to the total Hilbert space by means of suitable tensor products with identity operators will be tacitly understood. Its spectral decomposition will be written as

(78) H_μ(t) = ∑_{i_μ∈ℐ_μ} E^{(μ)}_{i_μ}(t) P_{i_μ}(t).

Then, we set L = N, ℐ=𝒥=ℐ₁×…×ℐ_N, and

(79) Ẽ_μ = H_μ(t₀) and F̃_μ = H_μ(t₁) for μ=1,…,N.

These Hamiltonians are not necessarily connected with the unitary time evolution operator U=U(t₁,t₀). Further, we choose

(80) ρ = 𝒢(H₁(t₀),…,H_N(t₀)) = ∏_{μ=1}^{N} [Tr exp(−β_μ H_μ(t₀))]^{−1} exp(−β_μ H_μ(t₀)),

with the usual interpretation of the parameters β_μ > 0 as the inverse temperatures of the subsystems. The generalised Jarzynski equation (74) then assumes the form

(81) ⟨exp(−∑_{μ=1}^{N} β_μ (w_μ − ΔF_μ))⟩ = 1,

where

(82) w_μ(i,j) ≡ E^{(μ)}_{j_μ}(t₁) − E^{(μ)}_{i_μ}(t₀),

(83) Z_μ(t) ≡ Tr(e^{−β_μ H_μ(t)}) ≡ e^{−β_μ F_μ(t)},

(84) ΔF_μ ≡ F_μ(t₁) − F_μ(t₀),

for all μ=1,…,N. As exp is convex, Jensen’s inequality yields e^{⟨x⟩} ≤ ⟨e^x⟩, and hence (81) implies

(85) exp⟨−∑_{μ=1}^{N} β_μ (w_μ − ΔF_μ)⟩ ≤ 1,

or, equivalently,

(86) ∑_{μ=1}^{N} β_μ (⟨w_μ⟩ − ΔF_μ) ≥ 0.

Note that the left-hand side of (86) has the form of a sum of entropy changes in the quasi-static limit and hence can be viewed as a manifestation of the Second Law for the present nonequilibrium scenario. For similar results, see [3] and [23].
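The generalised Jarzynski equation (81) and the inequality (86) can be checked on a toy model. The following sketch is a hypothetical illustration, not an example from the article: it assumes N = 2 two-level subsystems with nondegenerate levels, Hamiltonians that are diagonal in the same product basis at both times, arbitrary example values for the levels and inverse temperatures, and a real orthogonal matrix `U` that couples the subsystems during the evolution.

```python
import math
from itertools import product

beta = [1.0, 0.5]              # assumed inverse temperatures of the two subsystems
E0 = [[0.0, 1.0], [0.0, 2.0]]  # E0[mu][k]: assumed energy levels at the first measurement
E1 = [[0.0, 1.5], [0.3, 1.2]]  # E1[mu][k]: assumed energy levels at the second measurement
states = list(product(range(2), repeat=2))  # composite outcomes (k_1, k_2)

def rotation(a, b, angle, n=4):
    """n x n rotation by `angle` in the plane of basis vectors a and b."""
    m = [[float(r == c) for c in range(n)] for r in range(n)]
    m[a][a] = m[b][b] = math.cos(angle)
    m[a][b], m[b][a] = -math.sin(angle), math.sin(angle)
    return m

def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

# Real orthogonal (hence unitary) evolution that mixes the two subsystems.
U = matmul(matmul(rotation(0, 3, 0.4), rotation(1, 2, 1.3)), rotation(0, 1, 0.9))

Z0 = [sum(math.exp(-beta[m] * e) for e in E0[m]) for m in range(2)]
Z1 = [sum(math.exp(-beta[m] * e) for e in E1[m]) for m in range(2)]
dF = [-(math.log(Z1[m]) - math.log(Z0[m])) / beta[m] for m in range(2)]  # (83), (84)

def p_init(i):
    """Initial probability of outcome i: product of canonical weights, cf. (80)."""
    return math.prod(math.exp(-beta[m] * E0[m][i[m]]) / Z0[m] for m in range(2))

lhs = 0.0         # left-hand side of (81)
mean_sigma = 0.0  # left-hand side of (86)
for a, i in enumerate(states):
    for b, j in enumerate(states):
        prob = U[b][a] ** 2 * p_init(i)  # p(i,j) = p(j|i) p(i) with all degeneracies 1
        sigma = sum(beta[m] * (E1[m][j[m]] - E0[m][i[m]] - dF[m]) for m in range(2))
        lhs += math.exp(-sigma) * prob
        mean_sigma += sigma * prob
print(lhs, mean_sigma)
```

In this sketch `lhs` reproduces the value 1 up to rounding for any orthogonal `U`, while `mean_sigma` is nonnegative as required by (86); its actual size depends on the chosen levels and rotation angles.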

3.3 Systems in Microcanonical Equilibrium

We choose L = 1 and a one-parameter family of Hamiltonians H(t) with spectral decomposition

(87) H(t) = ∑_{i∈ℐ} E_i(t) P_i(t).

The microcanonical ensemble will not be represented by a characteristic function concentrated on a small energy interval but in the physically equivalent form

(88) ρ = 𝒢(H(t₀)) ≡ (1/W(t₀)) exp(−((E − H(t₀))/w)²),

where

(89) W(t) ≡ Tr exp[−((E − H(t))/w)²] ≡ e^{−f(t)},

and E, w > 0 are parameters. The generalised Jarzynski equation (74) then assumes the form

(90) ⟨exp[−((E − E_j(t₁))/w)² + Δf + ((E − E_i(t₀))/w)²]⟩ = 1,

where Δf ≡ f(t₁) − f(t₀). Application of Jensen’s inequality analogous to that in Section 3.2 yields

(91) ⟨((E − E_j(t₁))/w)² − Δf − ((E − E_i(t₀))/w)²⟩ ≥ 0.

The generalisation to systems in local microcanonical equilibrium analogous to the case treated in Section 3.2 is straightforward and need not be given here in detail.

3.4 Systems in Grand Canonical Equilibrium

The Hilbert space of the system is chosen as the bosonic or fermionic Fock space over the one-particle Hilbert space ℋ:

(92) ℱ_±(ℋ) = ⊕_{n=0}^{∞} 𝒮_± ℋ^{⊗n},

where ℋ^{⊗n} denotes the n-fold tensor product and 𝒮_± the projector onto the totally symmetric (+) part or the totally antisymmetric (−) part of ℋ^{⊗n}. We choose L = 2 and Ẽ₁=H(t₀), where H(t) is the canonical lift of a time-dependent one-particle Hamiltonian H₁(t) to ℱ_±(ℋ). Further, we choose Ẽ₂=Ñ, the particle number operator in ℱ_±(ℋ). By definition, Ẽ₁ and Ẽ₂ commute. Let the respective spectral decompositions with a common system of eigenprojections be written as

(93) H(t) = ∑_{i∈ℐ} E_i(t) P_i(t),

and

(94) Ñ = ∑_{i∈ℐ} N_i P_i(t).

Moreover, we set

(95) ρ = 𝒢(H(t₀),Ñ) ≡ exp[β(Ω(t₀) + μÑ − H(t₀))],

where

(96) exp(−βΩ(t)) ≡ Tr exp[β(μÑ − H(t))],

and β,μ,Ω have the usual physical interpretation as inverse temperature, chemical potential, and grand potential, respectively.

The generalised Jarzynski equation (74) then assumes the form

(97) ⟨exp[−β((E_j(t₁) − E_i(t₀)) − μ(N_j − N_i) − ΔΩ)]⟩ = 1,

where ΔΩ ≡ Ω(t₁) − Ω(t₀). Application of Jensen’s inequality analogous to that in Section 3.2 yields

(98) β(⟨E_j(t₁) − E_i(t₀)⟩ − μ⟨N_j − N_i⟩ − ΔΩ) ≥ 0.

The generalisation to systems in local grand canonical equilibrium analogous to the case treated in Section 3.2 is straightforward and need not be given here in detail. For similar results, see also [6], [7], [8], [9], [10], [11].

3.5 Application to Periodic Thermodynamics

Analogously to Section 3.2, we consider two systems (i.e. N = 2) and assume that the first system is periodically driven with a Hamiltonian K₁(t) satisfying K₁(t+T)=K₁(t). We have chosen the letter “K” as we will have to distinguish between the Hamiltonian and the (quasi) energy operator H₁(t) and want to conform, as far as possible, with the notation introduced in the preceding sections. According to Floquet theory, the general solution of the corresponding Schrödinger equation will be of the form

(99) ψ(t) = ∑_{i∈ℐ₁} a_i u_i(t) e^{−𝗂ε_i t},

with time-independent coefficients a_i. Here, the ε_i denote the quasi-energies, unique up to integer multiples of ω ≡ 2π/T, and the u_i(t) are T-periodic functions of t. We assume a pure point spectrum of the quasi-energies, and accordingly, ℐ₁ will be a countably infinite or possibly finite index set.

Upon choosing a selection of quasi-energies from their equivalence classes, we may define a quasi-energy operator

(100) H₁(t) = ∑_{i∈ℐ₁} ε_i P̃_i^{(1)}(t) ≡ ∑_{i∈ℐ₁} ε_i |u_i(t)⟩⟨u_i(t)|.

Hence, Tr P̃_i^{(1)}(t) = 1 for all t and

(101) ∑_{i∈ℐ₁} P̃_i^{(1)}(t) = 𝟙.

The first system is coupled to a heat bath with Hamiltonian

(102) H₂ = ∑_{n∈𝒩} E_n P̃_n^{(2)},

where the P̃_n^{(2)} are assumed to be finite-dimensional projectors with dimension (degeneracy) d(n) = Tr P̃_n^{(2)}. Without loss of generality, we also assume a pure point spectrum of the heat bath corresponding to a countably infinite index set 𝒩. The outcome sets introduced in Section 2 can be chosen as ℐ=𝒥=ℐ₁×𝒩. Note the completeness relation

(103) ∑_{n∈𝒩} P̃_n^{(2)} = 𝟙.

The total Hamilton operator of the system plus bath will be written as

(104) K(t) = K₁(t)⊗𝟙₂ + 𝟙₁⊗H₂ + H₁₂,

with some self-adjoint operator H₁₂ defined on the total Hilbert space ℋ=ℋ₁⊗ℋ₂ describing the system–bath interaction. It is assumed to be valid for t<t₀. Strictly speaking, the form of K(t) is irrelevant for the Jarzynski equation to be formulated below. Its only purpose is to motivate the following assumptions about the state of the total system at the time t=t₀.

We assume that for times t<t₀, the heat bath will be in a thermal equilibrium state

(105) ρ₂ = (1/Z₂) e^{−βH₂},

where, as usual, β is the inverse temperature and the heat bath partition function is

(106) Z₂ = Tr(e^{−βH₂}) = ∑_{n∈𝒩} e^{−βE_n} d(n).

The crucial assumption of this subsection will be that also the system assumes, for times t<t₀, a quasi-stationary distribution ρ₁(t) of Floquet states that will be of Boltzmann type with an inverse quasi-temperature ϑ, namely

(107) ρ₁(t) = (1/Z₁) e^{−ϑH₁(t)},

and the corresponding time-independent quasi-partition function reads

(108) Z₁ = Tr(e^{−ϑH₁(t)}) = ∑_{i∈ℐ₁} e^{−ϑε_i}.

With respect to the conditions of Theorem 1, we thus may write the initial state as

(109) ρ = ρ₁(t₀)⊗ρ₂ = 𝒢(H₁(t₀),H₂) = [Tr(e^{−ϑH₁(t₀)}) Tr(e^{−βH₂})]^{−1} exp(−ϑH₁(t₀) − βH₂).

Whereas the general existence of a quasi-stationary distribution has been made plausible in the literature [24], the more restrictive assumption of a quasi-Boltzmann distribution has been demonstrated for only four kinds of systems:

For the particular case of a linearly forced harmonic oscillator, the authors of [24] have shown that the Floquet-state distribution remains a Boltzmann distribution with the temperature of the heat bath, i.e. ϑ = β, see also [25].
Similarly, the parametrically driven harmonic oscillator assumes a quasi-stationary state with a quasi-temperature that is, however, generally different from the bath temperature, see [26].
A spin s exposed to both a static magnetic field and an oscillating, circularly polarised magnetic field applied perpendicular to the static one, as in the classic Rabi set-up, and coupled to a thermal bath of harmonic oscillators has been shown to approach a quasi-Boltzmann distribution, see [27].
And finally, every quasi-stationary distribution of Floquet states of a two-level system, see [25], can be trivially viewed as a quasi-Boltzmann distribution.

As in Section 3, we will assume that at times t=t0 and t=t1 there will be performed measurements of the observables corresponding to the commuting (quasi) energy operators H1(t) and H₂. The interaction between the system and the heat bath in the time interval (t0,t1) can be quite arbitrary and will be described by a Hamiltonian H~(t). It follows that all mathematical assumptions necessary to prove the general Jarzynski equation (74) are satisfied. But note the following difference: Typically, the general Jarzynski equation holds in a situation of local thermal equilibrium at the initial time t=t0. In this section, we will rather apply Theorem 1 to a situation of a quasi-stationary distribution of Floquet states of a periodically driven system in contact with a heat bath. This situation may be far from local thermal equilibrium.

Analogously to Section 3.2, we will set β1=ϑ, the inverse quasi-temperature, whereas β2=β is the ordinary inverse temperature of the heat bath. Consequently, we will rewrite w₁ as the “change of quasi-energy e” and w₂ as the heat q absorbed by the heat bath. Strictly speaking, this would exclude a time-dependent Hamiltonian for the heat bath as otherwise w₂ could also be composed of both heat and work. Nevertheless, we will stick to this more intuitive notation.

As noted above, both partition functions Z₁ and Z₂ are time-independent. Hence, (74) simplifies to

(110)⟨exp(−ϑe−βq)⟩=1.

The inequality derived by means of Jensen’s inequality analogous to (86) hence will read

(111)ϑ⟨e⟩+β⟨q⟩≥0,

and can again be viewed as a manifestation of the Second Law for periodic thermodynamics.
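For completeness, the Jensen step leading from (110) to (111) can be spelled out as follows. This is only a sketch that uses the convexity of the exponential function; the symbol X is an auxiliary abbreviation introduced here and does not appear elsewhere in the article.

```latex
\begin{align*}
X &\equiv -\vartheta e - \beta q, \\
1 &= \langle \exp(X) \rangle \;\ge\; \exp\bigl(\langle X \rangle\bigr)
  \quad\Longrightarrow\quad \langle X \rangle \le 0
  \quad\Longleftrightarrow\quad
  \vartheta \langle e \rangle + \beta \langle q \rangle \ge 0 .
\end{align*}
```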

4 Further Applications to Quantum Theory

4.1 A Second Law–like Statement for the Nonstandard Case

In the preceding sections, we have formulated a number of Second Law–like statements, namely (86), (91), (98), and (111), which follow from the respective Jarzynski equations in the standard case S. However, these statements are not special cases of the “Pauli-type” inequalities (22) and (29) as these are based on the assumption q(j)=p^(j) for all j∈𝒥 (case R) and hence do not belong to the standard case that is characterised by (72). In view of the fundamental significance of the Second Law, it will be in order to add a few remarks on the realisation of (22) and (29) in quantum mechanics.

First, we will reformulate (29) in the context of quantum theory:

Theorem 2

We assume the notations and general conditions of Section 3, in particular Assumption 1 and Assumption 2. It follows that the quintuple (ℐ,𝒥,p,d,D) will be a modified statistical model of sequential measurements (mST2) where the physical probability function p:ℐ×𝒥→ℝ is given by

(112)p(i,j)=Tr(Q∼ jUP∼ iρP∼ iU∗)=p(j|i)p(i),

and the second marginal probabilities are defined by

(113)p^j=∑ip(i,j).

Then, the following holds:

(114) $S'(\hat{p}) = -\sum_{j} \hat{p}_j \log\frac{\hat{p}_j}{D(j)} \;\ge\; S'(p) = -\sum_{i} p(i) \log\frac{p(i)}{d(i)}.$

This statement is certainly not new but has a couple of forerunners albeit formulated in different frameworks [18], [28], [29]. We note that the modified Shannon entropy S′(p) can be identified with the von Neumann entropy −Tr(ρ1logρ1) of the mixed state ρ₁ after the first measurement according to (62). Indeed,

(115) $\rho_1 \overset{(62)}{=} \sum_i p(i)\,\rho_i \overset{(63)}{=} \sum_i \frac{p(i)}{d(i)}\, P_i$

implies

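(116) $-\mathrm{Tr}\left(\rho_1\log\rho_1\right) = -\mathrm{Tr}\left(\sum_i \frac{p(i)}{d(i)} \log\frac{p(i)}{d(i)}\; P_i\right) = -\sum_i p(i)\log\frac{p(i)}{d(i)} = S'(p).$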

For the modified Shannon entropy S′(p^), this identification is not possible in general. Even if we additionally assume that the second measurement will be of Lüders type, it is not clear whether an assumption analogous to (63) would hold. Below we will consider a simplified scenario where this identification is nevertheless possible.

Next, we note that the Second Law–like statement (114) holds for closed systems irrespective of their size and is in this respect more general than the usual formulations of the Second Law for large systems including small systems coupled to a heat bath. Moreover, (114) is not restricted to sequential energy measurements and e.g. would also hold for (discretised) position measurements, thereby describing the spreading of wave packets, see the example of the following Subsection 4.2. In this context, it might be instructive to discuss the well-known Umkehreinwand (reversibility paradox) of Loschmidt. There exist solutions ψ(t) of, say, the 1-particle Schrödinger equations that are time reflections of spreading wave packets and hence concentrate on smaller and smaller regions. These solutions do not lead to a violation of (114) as after the first measurement this special solution ψ(0) is transformed into a mixed state ρ₁ that again will spread with increasing time. The delicate phase relations of ψ(0) needed for the inverse spreading are destroyed by the first measurement.

Similarly, the related Wiederkehreinwand (recurrence paradox) of Poincaré and Zermelo that would be particularly serious for small systems with short recurrence times can be rebutted. It may happen that the modified Shannon entropy S′(p^) will be a periodic function s(t) of the time difference t≡t1−t0 between the two measurements, but this does not impair the validity of (114). The reason is simply that the latter inequality reads s(t)≥S′(p) and not s(ta)≥s(tb) for all ta>tb. Physically speaking, the Wiederkehreinwand does not apply as in quantum mechanics the entropy difference is not a definite quantity defined for all times t but rather should be construed as the mean value of entropy differences over many measurements of a pair of observables performed at a fixed time difference t.

As (114) is a fundamental inequality that is valid for a large class of sequential measurements, it will be interesting to investigate its possible geometrical meaning.

To this end, we generalise our considerations to a finite number of L sequential Lüders measurements but restricted to the case of a finite n-dimensional Hilbert space ℋ and nondegenerate projections P_i, Q_j. This corresponds to Subsection 2.1 dealing with the “simple case.” In particular, Assumption 2, see (63), will be satisfied for the corresponding state before each measurement. Consider first the simplest case of an n = 2-dimensional Hilbert space where all mixed states correspond to the points of a unit ball with centre C≅(1/2)𝟙. The boundary of the unit ball is usually denoted as the “Bloch sphere.” Two orthogonal projections P₁ and P₂ are represented by antipodal pairs of points of the Bloch sphere, and the first Lüders operation ρ↦ρ1 is just the projection onto the line joining P₁ and P₂. Upon this projection, the distance d of the state to the centre C decreases (or remains constant); i.e. the Lüders operation is contractive. This distance can be expressed in terms of the scalar product (A,B)↦TrA∗B for A,B∈ℒ(ℋ). The unitary time evolution between the first and the second measurements corresponds to a rotation of the Bloch sphere and can be discarded as far as only geometric relations are considered. Then, the second measurement with orthogonal projections Q₁ and Q₂ again yields a projection that maps ρ₁ onto, say, ρ₂ and further decreases the distance to the centre C. In this way, the L sequential Lüders measurements yield a sequence ρ,ρ1,ρ2,…,ρL of mixed states represented by points inside the Bloch sphere with nonincreasing distance to the centre C, see Figure 1.
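The contraction property for n = 2 can be illustrated numerically. The following is a minimal sketch, assuming the standard Bloch-vector parametrisation ρ = (𝟙 + r·σ)/2, in which a nondegenerate Lüders measurement along a unit vector keeps only the component of r along that vector and the Hilbert-Schmidt distance to the centre is |r|/√2. The function names, the random choice of measurement axes (which absorbs the intermediate unitary rotations), and the initial state are illustrative assumptions, not part of the article.

```python
import math
import random

def luders_projection(r, axis):
    """Project the Bloch vector r onto the line spanned by the unit vector axis.

    For a qubit, the Lueders operation with projections (1 +/- axis.sigma)/2
    keeps only the component of r along axis.
    """
    dot = sum(a * b for a, b in zip(r, axis))
    return tuple(dot * a for a in axis)

def random_unit_vector(rng):
    """Draw a random direction on the sphere (normalised Gaussian triple)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def distances_under_measurements(r0, n_measurements, seed=0):
    """Return the Bloch-vector lengths |r| of rho, rho_1, ..., rho_L."""
    rng = random.Random(seed)
    r = tuple(r0)
    lengths = [math.sqrt(sum(x * x for x in r))]
    for _ in range(n_measurements):
        r = luders_projection(r, random_unit_vector(rng))
        lengths.append(math.sqrt(sum(x * x for x in r)))
    return lengths

# Start from a pure state (|r| = 1) and apply L = 5 measurements.
lengths = distances_under_measurements((0.0, 0.6, 0.8), n_measurements=5)
```

The list `lengths` is nonincreasing, since each projection multiplies |r| by the absolute value of a cosine.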

Figure 1:

Geometric interpretation of the Lüders operation ρ1↦ρ2 as a projection of points inside the Bloch sphere onto the line joining two orthogonal projections Q₁ and Q₂.

Figure 2:

The von Neumann entropy S versus the squared distance d² to the maximally mixed state. Left panel: In the case of Hilbert space dimension dim = 2, S will be a monotonically decreasing function (121) of d². Right panel: In the case of Hilbert space dimension dim > 2, S will no longer be a function of d². We show the case of dim = 3 and 2000 randomly chosen density matrices.

Before we connect this geometric picture to the Second Law–like statement (114), we will sketch the generalisation to finite n > 2, although the corresponding geometry cannot be visualised in a likewise simple manner. The unit ball in the case of n = 2 has to be replaced by the convex set K of mixed states such that the pure states are the extremal points of K. The centre C now corresponds to the maximally mixed state C≅(1/n)𝟙. A family of n mutually orthogonal 1-dimensional projections P_i spans an (n−1)-simplex Σ of extremal points of K consisting of all mixed states that commute with all Pi,i=1,…,n. The centre C is always contained in the simplex Σ. The Lüders operation ρ↦ρ1 is the projection onto the affine subspace spanned by the Pi,i=1,…,n. Again, by subsequent projections, we obtain a sequence ρ,ρ1,ρ2,…,ρL of mixed states such that the distance d to the centre C is nonincreasing.

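In passing, we note that the squared distance d² = ∥ρ − (1/n)𝟙∥² is related to the (dimensionless) Tsallis entropy [30] of the eigenvalues p_i of ρ,

(117) $S_q = \frac{1}{q-1}\left(1-\sum_{i=1}^{n} p_i^{\,q}\right),$

for q = 2 by

(118) $d^2 = \frac{n-1}{n} - S_2.$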
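The relation between the squared distance and the Tsallis entropy with q = 2 can be checked directly for a state that is diagonal in some basis. The sketch below is illustrative only: the function names and the sample spectrum are assumptions, and the distance is taken to be the Hilbert-Schmidt distance, which for a diagonal state reduces to a sum over eigenvalues.

```python
def tsallis_entropy(probs, q):
    """Dimensionless Tsallis entropy (117) of a probability vector."""
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

def squared_distance_to_centre(probs):
    """Squared Hilbert-Schmidt distance of diag(probs) to the maximally mixed state."""
    n = len(probs)
    return sum((p - 1.0 / n) ** 2 for p in probs)

eigenvalues = [0.5, 0.3, 0.15, 0.05]   # illustrative spectrum, n = 4
n = len(eigenvalues)
lhs = squared_distance_to_centre(eigenvalues)
rhs = (n - 1) / n - tsallis_entropy(eigenvalues, q=2)
```

Both `lhs` and `rhs` evaluate to 0.115 for this spectrum, up to floating-point rounding.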

The connection to the Second Law–like statement (114) will be first discussed for the special case of n = 2. Let p,1−p be the eigenvalues of a general statistical operator ρ. Then

(119)S(ρ)=−plog(p)−(1−p)log(1−p),

and

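(120) $d^2(\rho) = 2\left(p-\tfrac{1}{2}\right)^2,$

where d(ρ) denotes, as above, the Euclidean distance of ρ to the maximally mixed state (1/2)𝟙. Obviously, (120) can be solved for p and the result inserted into (119), which yields a monotonically decreasing function S(d²), namely

(121) $S(d^2) = \frac{1}{2}\left(\log(4) + \left(-1+\sqrt{2}\,d\right)\log\left(1-\sqrt{2}\,d\right) - \left(1+\sqrt{2}\,d\right)\log\left(1+\sqrt{2}\,d\right)\right).$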
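As a consistency check, the closed form (121), read with √2·d as the argument of the logarithms, can be compared numerically with the entropy (119) evaluated at the eigenvalue p that corresponds to the same distance via (120). The function names below are illustrative assumptions.

```python
import math

def entropy_from_p(p):
    """Von Neumann entropy (119) of a qubit state with eigenvalues p and 1 - p."""
    return -sum(x * math.log(x) for x in (p, 1.0 - p) if x > 0.0)

def dist_sq_from_p(p):
    """Squared distance (120) to the maximally mixed state."""
    return 2.0 * (p - 0.5) ** 2

def entropy_from_d(d):
    """Closed form (121) as a function of d, valid for 0 <= d < 1/sqrt(2)."""
    s = math.sqrt(2.0) * d
    return 0.5 * (math.log(4.0)
                  + (s - 1.0) * math.log(1.0 - s)
                  - (1.0 + s) * math.log(1.0 + s))

# Entropy along increasing distance; the values should decrease monotonically.
values = [entropy_from_d(0.05 * k) for k in range(14)]
```

At d = 0 the function returns log 2, the entropy of the maximally mixed qubit state.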

4.2 An Analytically Solvable Example

As a nontrivial example, we consider a particle in one dimension and a free time evolution between the two measurements described by the Schrödinger equation

(122) $\mathsf{i}\hbar\,\frac{\partial}{\partial t}\psi(x,t) = -\frac{\hbar^2}{2m}\,\frac{\partial^2}{\partial x^2}\psi(x,t)$

with self-explaining notation. The two measurements are unsharp position–momentum measurements. More specifically, the projections P∼ i,i∈ℐ considered in Section 3 are one-dimensional and of the form P∼ i=|i⟩⟨i| where

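(123) $|i\rangle = |\nu,n\rangle \equiv \chi_{[\nu\Delta,(\nu+1)\Delta]}\,\frac{1}{\sqrt{\Delta}}\exp\left(\frac{2\pi\mathsf{i}\,n\,x}{\Delta}\right),\quad \nu,n\in\mathbb{Z}.$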

Here, χ[νΔ,(ν+1)Δ] denotes the characteristic function of the interval [νΔ,(ν+1)Δ]. Thus, the first measurement is a discretised joint measurement of position q_ν and momentum p_n=2πℏn/Δ. Analogous definitions hold for the second measurement with projections Q∼ j=|j⟩⟨j| and |j⟩=|μ,m⟩ such that ℐ=𝒥=ℤ2.

We choose the physical units such that Δ=m=ℏ=1 and an initial pure state given by the Gaussian

(124) $\psi(x) = \frac{1}{\pi^{1/4}\sqrt{\sigma}}\exp\left(-\frac{x^2}{2\sigma^2}\right).$

After the first (Lüders) measurement, the particle is in one of the pure states |ν,n⟩ with probability

(125) $p(i) = p(\nu,n) = \left|\langle \nu,n|\psi\rangle\right|^2,$

where

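(126) $\langle \nu,n|\psi\rangle = \int_{\nu}^{\nu+1}\psi(x)\exp\left(-2\pi\mathsf{i}\,n\,x\right)dx = \frac{\sqrt[4]{\pi}}{\sqrt{2}}\,\sqrt{\sigma}\;e^{-2\pi^2 n^2\sigma^2}\left(\mathrm{erf}\left(\frac{\nu+1+2\pi\mathsf{i}\,n\,\sigma^2}{\sqrt{2}\,\sigma}\right)-\mathrm{erf}\left(\frac{\nu+2\pi\mathsf{i}\,n\,\sigma^2}{\sqrt{2}\,\sigma}\right)\right).$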
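The probabilities (125) can also be obtained without the closed form (126) by numerical quadrature of the overlap integral. The following sketch, in units Δ = m = ℏ = 1 and with σ = 1, uses a composite Simpson rule; the truncation limits `nu_max` and `n_max`, the number of quadrature steps, and all function names are illustrative assumptions.

```python
import cmath
import math

SIGMA = 1.0  # width of the Gaussian (124), units with Delta = m = hbar = 1

def psi(x, sigma=SIGMA):
    """Normalised Gaussian wave function (124)."""
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (math.pi ** 0.25 * math.sqrt(sigma))

def amplitude(nu, n, steps=200):
    """Overlap <nu,n|psi> by composite Simpson quadrature on [nu, nu + 1]."""
    h = 1.0 / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = nu + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * psi(x) * cmath.exp(-2j * math.pi * n * x)
    return total * h / 3.0

def first_measurement_distribution(nu_max=6, n_max=20):
    """Truncated table of p(nu, n) = |<nu,n|psi>|^2, cf. (125)."""
    return {(nu, n): abs(amplitude(nu, n)) ** 2
            for nu in range(-nu_max, nu_max)
            for n in range(-n_max, n_max + 1)}

p = first_measurement_distribution()
total_probability = sum(p.values())
shannon_entropy = -sum(v * math.log(v) for v in p.values() if v > 0.0)
```

Because the wave function is not periodic on the unit cells, the amplitudes decay only like 1/n, so `total_probability` approaches 1 slowly as `n_max` grows, and the truncated `shannon_entropy` should be read as an approximation to the entropy after the first measurement.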

After the time t=t1−t0, the state |ν,n⟩ evolves into |ν,n,t⟩. To calculate the latter, we have integrated the free propagator over one-unit interval and thereafter performed a Galilean boost with the result

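(127) $|\nu,n,t\rangle = \frac{\mathsf{i}}{2}\,e^{-2\pi\mathsf{i}\,n(\pi n t-x)}\left\{\mathrm{erfi}\left(\frac{1+\mathsf{i}}{2\sqrt{t}}\,(\nu+2\pi n t-x)\right)-\mathrm{erfi}\left(\frac{1+\mathsf{i}}{2\sqrt{t}}\,(\nu+1+2\pi n t-x)\right)\right\}.$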

Here, erfi(z) denotes the imaginary error function erf(𝗂z)/𝗂. The second (Lüders) measurement yields the result j=(μ,m) with conditional probability

(128) $p(j|i) = p(\mu,m|\nu,n) = \left|\langle \mu,m|\nu,n,t\rangle\right|^2.$

If m≠n, the corresponding amplitudes are obtained as

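(129) $\langle \mu,m|\nu,n,t\rangle = \frac{e^{-4\mathsf{i}\pi^2 t\,(m^2-2mn+2n^2)}}{4\pi(m-n)}\Bigl\{ e^{2\mathsf{i}\pi^2 t\,(m-2n)^2}\bigl(-2R(m,\mu,\nu)+R(m,\mu,\nu+1)+R(m,\mu+1,\nu)\bigr) - e^{2\mathsf{i}\pi^2 t\,(2m^2-4mn+3n^2)}\bigl(-2R(n,\mu,\nu)+R(n,\mu,\nu+1)+R(n,\mu+1,\nu)\bigr)\Bigr\},$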

where the abbreviation

(130) $R(k,\mu,\nu) \equiv \mathrm{erfi}\left(\frac{(1+\mathsf{i})\,(2k\pi t-\mu+\nu)}{2\sqrt{t}}\right)$

has been used. In the case m=n, we have

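(131) $\langle \mu,n|\nu,n,t\rangle = \frac{e^{-2\mathsf{i}\pi^2 n^2 t}}{2\sqrt{\pi}}\Bigl( (1+\mathsf{i})\sqrt{t}\,\bigl( e^{\frac{\mathsf{i}(\mu-\nu-2\pi n t+1)^2}{2t}} - 2\,e^{\frac{\mathsf{i}(-\mu+\nu+2\pi n t)^2}{2t}} + e^{\frac{\mathsf{i}(-\mu+\nu+2\pi n t+1)^2}{2t}} \bigr) - \mathsf{i}\sqrt{\pi}\,\bigl( -2(-\mu+\nu+2\pi n t)\,R(n,\mu,\nu) + (-\mu+\nu+2\pi n t+1)\,R(n,\mu,\nu+1) + (-\mu+\nu+2\pi n t-1)\,R(n,\mu+1,\nu) \bigr)\Bigr).$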

We have noted in Subsection 2.2 that in general the p(j|i) need not be symmetric. Indeed, for our example, we find, e.g. that for t=σ=1 we have p(1,1|0,0)=0.00483946, but p(0,0|1,1)=0.00258997.

The equations (125–131) yield the second marginal probabilities p^(j)=∑ip(j|i)p(i) in the form of a doubly infinite series of terms given by analytical expressions. Hence, the p^(j) can be numerically calculated by a suitable truncation of the infinite series. Moreover, the Shannon entropies S(p) and S(p^) can be numerically calculated, and the Second Law–like statement (114) can be tested, see Figure 3. Due to the localisation of the particle by the first measurement, an additional spreading of the momenta is generated that leads to a stronger increase of the Shannon entropy than the increase that is solely produced by the spreading of the Gaussian wave packet (124) in the course of time.

Figure 3:

Illustration of the increase of the Shannon entropies, S(p)<S(p^,t), in the case of two sequential Lüders measurements. t denotes the time difference between the two measurements and assumes the dimensionless values 10−4,…,10−1; the parameter σ in (124) is chosen as σ = 1. The entropy after the first measurement is S(p)=1.3654 (dashed red line).

4.3 Crooks Fluctuation Theorems

In the literature, the Jarzynski equation is sometimes derived from so-called Crooks fluctuation theorems, see [12]. The notation refers to [31] where G. E. Crooks proved a classical work fluctuation theorem. Quantum versions of the Crooks fluctuation theorem have first been considered in [2] and [3]. One may ask whether the quantum version of a Crooks fluctuation theorem has a counterpart in the (modified) statistical model of sequential measurements and whether one would need additional assumptions to prove it.

We adopt the notation of Subsection 2.3 and again consider a modified model m=(ℐ,𝒥,P,d,D) of sequential measurements as well as its “reciprocal model” (𝒥,ℐ,P~,D,d). It remains to show that the C-equation (53) indeed entails the Crooks fluctuation theorem in the case of quantum mechanics. To this end, we assume the case of Subsection 3.2 with N = 1 such that

(132)Y(i,j)=exp(−β(W(i,j)−ΔF)),

where we have denoted the random variable “work” by a capital letter W, and hence, writing y=exp(−β(w−ΔF)),

(133)P(Y=y)=P(W=w).

Analogously,

(134)P~(Y~=1/y)=P~(W~=−w),

and the Crooks fluctuation theorem assumes its familiar form

(135) $\frac{P(W=w)}{\tilde{P}(\tilde{W}=-w)} = \exp\bigl(\beta(w-\Delta F)\bigr).$

Next, we will investigate the question how the reciprocal model (𝒥,ℐ,P~,D,d) can be realised in quantum theory. We consider its conditional probability

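(136) $\tilde{\pi}(i|j) \overset{(40)}{=} \pi(j|i)\,\frac{d(i)}{D(j)} \overset{(71)}{=} \mathrm{Tr}\left(\underset{\sim}{Q}_j\,U\,\frac{\underset{\sim}{P}_i}{d(i)}\,U^{*}\right)\frac{d(i)}{D(j)} = \mathrm{Tr}\left(\underset{\sim}{P}_i\,U^{*}\,\frac{\underset{\sim}{Q}_j}{D(j)}\,U\right),$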

where the last equation was obtained by cyclic permutation of the operators inside the trace. This suggests the following realisation: We prepare a state described by a statistical operator ρ~ such that

(137)q(j)=Tr(ρ~Q∼ j),

and the assumption

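(138) $\underset{\sim}{Q}_j\,\tilde{\rho}\,\underset{\sim}{Q}_j = \frac{q(j)}{D(j)}\,\underset{\sim}{Q}_j \quad\text{for all } j\in\mathcal{J},$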

analogous to (63) is satisfied. Then, at the time t=t0, we measure the set of observables described by the mutually commuting self-adjoint operators F∼ 1,…,F∼ L with common eigenprojections Q∼ j,j∈𝒥, the measurement being of Lüders type. Thereafter, a unitary time evolution U~≡U∗ takes place until t=t1 where a second measurement of E∼ 1,…,E∼ L with common eigenprojections P∼ i,i∈ℐ is performed. The corresponding conditional probabilities of the sequential measurements are then given by (136).

In an experiment, it might be difficult to realise the adjoint time evolution U~≡U∗. As a more practical alternative, we briefly recapitulate the time evolution considered in [12], Section IV A, using our own notation. Let the time evolution U(t,t1) for t∈[t0,t1] of the original model be given as the solution of the differential equation

(139) $\frac{\partial}{\partial t}U(t,t_1) = -\mathsf{i}\,H(t)\,U(t,t_1),$

with “initial” value U(t1,t1)=𝟙. Then, consider the Hamiltonian

(140)Hˇ(t)≡H(t1+t0−t)

according to a “time-reversed protocol” and the corresponding evolution operator Uˇ(t,t1) satisfying

(141) $\frac{\partial}{\partial t}\check{U}(t,t_1) = -\mathsf{i}\,\check{H}(t)\,\check{U}(t,t_1),$

with initial value Uˇ(t1,t1)=𝟙. Upon the transformation t↦t1+t0−t, (141) assumes the form

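(142) $\frac{\partial}{\partial t}\check{U}(t_1+t_0-t,t_0) = \mathsf{i}\,\check{H}(t_1+t_0-t)\,\check{U}(t_1+t_0-t,t_0) \overset{(140)}{=} \mathsf{i}\,H(t)\,\check{U}(t_1+t_0-t,t_0).$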

Now assume “microreversibility,” i.e. the existence of an antiunitary operator Θ commuting with all H(t):

(143) $\Theta^{*}H(t)\,\Theta = H(t) \quad\text{for all } t\in[t_0,t_1],$

and further

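(144) $\Theta^{*}\,\underset{\sim}{Q}_j\,\Theta = \underset{\sim}{Q}_j \quad\text{and}\quad \Theta^{*}\,\underset{\sim}{P}_i\,\Theta = \underset{\sim}{P}_i,$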

for all j∈𝒥 and i∈ℐ. The latter assumption already follows from (143) if the P∼ i are the eigenprojections of H(t0) and the Q∼ j the eigenprojections of H(t1). Our assumption of microreversibility is somewhat weaker than the usual formulation in so far as it only requires that there exists some basis such that H(t) is real for all t, see [32] for a similar approach.

Accordingly, (142) implies

(145) $\frac{\partial}{\partial t}\,\Theta^{*}\check{U}(t_1+t_0-t,t_0)\,\Theta = -\mathsf{i}\,H(t)\,\Theta^{*}\check{U}(t_1+t_0-t,t_0)\,\Theta.$

Comparison with (139) together with the initial conditions yields

(146)U(t,t1)=Θ∗Uˇ(t1+t0−t,t0)Θ,

cp. (40) in [12], especially

(147)U∗=U(t1,t0)∗=U(t0,t1)=Θ∗Uˇ(t1,t0)Θ.

Inserting this result into (136) and using (144) gives

(148) $\tilde{\pi}(i|j) = \mathrm{Tr}\left(\underset{\sim}{P}_i\,\check{U}(t_1,t_0)\,\frac{\underset{\sim}{Q}_j}{D(j)}\,\check{U}(t_1,t_0)^{*}\right),$

and thus shows that the time-reversed protocol correctly realises the reciprocal model. We stress, however, that the assumptions of microreversibility are convenient but not necessary for the validity of the Crooks fluctuation theorem, in contrast to the impression generated by [12].

As a special case, we mention the situation where H(t)=Hˇ(t)=H(t1+t0−t) and Q∼ i=P∼ i for all i∈ℐ=𝒥, but all preceding assumptions still hold, in particular (143) and (144). This includes the case where H is time-independent. Then, it follows that U∗=Θ∗UΘ and hence

(149)π(j|i)d(i)=Tr(P∼jUP∼ iU∗)=Tr(P∼ iΘ∗UΘP∼jΘ∗U∗Θ)=Tr(P∼ iUP∼ jU∗)=π(i|j)d(j).

This means that the symmetry condition (31) considered by W. Pauli will be exactly satisfied, not only in the Golden Rule approximation. In the counter-example to (31) in Subsection 4.2, the condition (144) is violated as the momentum p_n is inverted under time reflections.

5 Applications to Classical Theory

In classical statistical mechanics, all observables have definite values for each individual system. Hence, it is not necessary to adopt the scenario of sequential measurements in the context of Jarzynski equations. Nevertheless, the statistical model of sequential measurements introduced in Section 2 can be useful if suitably reinterpreted. To this end, we set ℐ=𝒥=𝒳, where 𝒳 is the 2N-dimensional phase space of the system under consideration. Summations over ℐ or 𝒥 will be replaced by integrations using the canonical volume form dx=dp1…dpNdq1…dqN on 𝒳. At the time t=t0, the state of the system will be described by a probability distribution p:𝒳→ℝ+ satisfying

(150)p(x)>0for allx∈𝒳

and

(151)∫𝒳p(x)dx=1.

The time evolution between t=t0 and t=t1 is deterministic and will be described by a volume preserving map

(152)U:𝒳→𝒳.

Let A:𝒳×𝒳→ℝ be a random variable. Its expectation value will be defined by

(153)⟨A⟩≡∫𝒳×𝒳dxdyp(x)δ(y,U(x))A(x,y)=∫𝒳dxp(x)A(x,U(x)).

In the special case of A(x,y)=q(y)/p(x), where q:𝒳→ℝ+ is assumed to satisfy

(154)∫𝒳q(y)dy=1,

we conclude

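(155) $\left\langle \frac{q(y)}{p(x)} \right\rangle \overset{(153)}{=} \int_{\mathcal{X}} dx\; p(x)\,\frac{q(U(x))}{p(x)}$

(156) $= \int_{\mathcal{X}} dx\; q(U(x))$

(157) $= \int_{\mathcal{X}} dy\; q(y) \overset{(154)}{=} 1,$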

using that U is volume preserving in (157). Hence, (74) has a classical counterpart and the general Jarzynski equation also holds classically, in particular for the examples treated in Subsections 3.2 to 3.4.
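The classical identity can be probed by a simple Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: a single degree of freedom (2-dimensional phase space), a standard Gaussian for p, a shifted and narrower Gaussian for q, and the free-particle shear map as the volume-preserving time evolution U. None of these specific choices is taken from the article.

```python
import math
import random

def gaussian_density(x, centre, sigma):
    """Isotropic normalised Gaussian density on the 2-dimensional phase space."""
    dq, dp = x[0] - centre[0], x[1] - centre[1]
    return math.exp(-(dq * dq + dp * dp) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

def free_flow(x, t=0.5):
    """Free-particle flow (q, p) -> (q + p t, p); a shear with unit Jacobian."""
    return (x[0] + x[1] * t, x[1])

def estimate_ratio_mean(samples=100_000, seed=1):
    """Monte Carlo estimate of <q(U(x)) / p(x)> with x drawn from p."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))         # x ~ p
        p_x = gaussian_density(x, (0.0, 0.0), 1.0)
        q_y = gaussian_density(free_flow(x), (0.3, 0.0), 0.7)   # arbitrary normalised q
        acc += q_y / p_x
    return acc / samples

estimate = estimate_ratio_mean()
```

The estimate should scatter around 1 with a statistical error of a few times 10⁻³ for this sample size. The density q was chosen narrower than p so that the ratio q(U(x))/p(x) stays bounded and the estimator has finite variance.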

6 Summary and Outlook

The usual formulation of the quantum Jarzynski equation applies to closed systems that are initially in thermal equilibrium, described by the canonical ensemble, and then subject to two sequential energy measurements. Between the two measurements, the system may be arbitrarily disturbed under the influence of a time-dependent Hamiltonian. The present work can be understood as a gradual generalisation of this situation. Some of these generalisations have already been considered in the literature, see [3], [5], [6], [7], [8], [9], [10], [11], but now they appear in a coherent way as results of a unified approach. First, we allow for equilibrium scenarios that are rather described by microcanonical or grand canonical ensembles. In the next step, we also consider local equilibria, i.e. N subsystems that have initially different temperatures. This case is treated in the present article for the case of local canonical ensembles, but the extension to the case of local microcanonical or grand canonical ensembles is straightforward. Moreover, it turns out that for the general quantum Jarzynski equation the restriction to energy measurements is no longer necessary. The only essential assumptions are those postulating that the first measurement is of Lüders type and, additionally, that the state resulting after this measurement is diagonal in the eigenbasis of the measured observables (Assumption 2). An example, where this more general point of view is crucial, is the sequential measurement of the “quasi-energy” in the case of periodic thermodynamics. This example is briefly touched in our article but could be expanded with respect to the results on the dissipated heat obtained in [27] and [26].

At this point, a further natural generalisation suggests itself, namely the replacement of the two sequential (projective) measurements by more general ones described by POVMs and involving more general state transformations than those of Lüders type, see [23], [28], [33] for related approaches. It turns out that the simple form of the general Jarzynski equation and of the resulting Second Law–like statements will be lost upon this generalisation. In the article at hand, we have followed a different route of generalisation by analysing the probabilistic core of the general Jarzynski equation. The result is what we have called a “statistical model of sequential measurements” that does not explicitly presuppose quantum mechanics and includes the “J-equation,” cf. Prop. 3, as a progenitor of the general quantum Jarzynski equation. Another benefit of the abstract statistical model is to make clear that the J-equation will exactly hold even if the correct quantum time evolution is replaced by an approximation, e.g. the Golden Rule approximation, as far as the modified doubly stochasticity is retained. The mathematical clue to prove the J-equation is the assumption of a modified doubly stochastic transition probability that is satisfied in quantum theory and breaks the time-reflection symmetry of the model. Consequently, a Second Law–like statement follows that is different from those mentioned above and joins the theory to previous approaches to the Second Law going back to W. Pauli and G. D. Birkhoff. We have illustrated this result by an example involving discrete position–momentum measurements and describing the spreading of an initial Gaussian wave packet. The arrow of time remains mysterious, but the two arrows arising in thermodynamics and quantum measurement theory point into the same direction.

Appendix A: Proofs of Some Propositions

Proof of Proposition 1.

The first part of the proposition follows from

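(A1) $\left\langle \frac{q(j)}{p(i)} \right\rangle \overset{(13)}{=} \sum_{i,j} \frac{q(j)}{p(i)}\,P(i,j)$

(A2) $\overset{(9)}{=} \sum_{j} q(j) \sum_{i} \pi(j|i)$

(A3) $\overset{(12)}{=} \sum_{j} q(j) \overset{(14)}{=} 1.$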

For the converse statement, choose j′∈𝒥 arbitrarily and let q(j)=δj,j′. Then

(A4) $1 = \left\langle \frac{q(j)}{p(i)} \right\rangle = \sum_{i,j} q(j)\,\pi(j|i) = \sum_{i,j} \delta_{j,j'}\,\pi(j|i) = \sum_{i} \pi(j'|i),$

which means that π(j|i) will be doubly stochastic. ⊡
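The first part of the proposition can be verified in exact rational arithmetic for a small example. The sketch below builds a doubly stochastic matrix as a convex sum of permutation matrices and evaluates the expectation value of q(j)/p(i); the helper names and the numerical values of the weights, p, and q are illustrative assumptions.

```python
from fractions import Fraction as F

def convex_sum_of_permutations(weights, permutations):
    """Doubly stochastic matrix pi[j][i] = sum_mu lambda_mu * delta(j, sigma_mu(i))."""
    n = len(permutations[0])
    pi = [[F(0)] * n for _ in range(n)]
    for lam, sigma in zip(weights, permutations):
        for i, j in enumerate(sigma):
            pi[j][i] += lam
    return pi

def j_expectation(p, q, pi):
    """<q(j)/p(i)> with respect to P(i,j) = pi(j|i) p(i)."""
    n = len(p)
    return sum(q[j] / p[i] * pi[j][i] * p[i] for i in range(n) for j in range(n))

pi = convex_sum_of_permutations(
    [F(1, 2), F(1, 3), F(1, 6)],
    [(0, 1, 2), (1, 2, 0), (2, 0, 1)],
)
p = [F(1, 2), F(1, 3), F(1, 6)]   # initial probabilities, all positive
q = [F(1, 5), F(1, 5), F(3, 5)]   # arbitrary normalised q
value = j_expectation(p, q, pi)
```

Here `value` equals exactly 1, independently of the particular choice of p and q, as long as the sums of π(j|i) over i are all 1.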

Proof of Proposition 2.

It follows that

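(A5) $0 = \log 1 \overset{(19)}{=} \log\left\langle \frac{\hat{p}(j)}{p(i)} \right\rangle$

(A6) $\overset{(20)}{\ge} \left\langle \log\frac{\hat{p}(j)}{p(i)} \right\rangle$

(A7) $= \langle \log\hat{p}(j) \rangle - \langle \log p(i) \rangle$

(A8) $= \sum_{j\in\mathcal{J}} \hat{p}(j)\log\hat{p}(j) - \sum_{i\in\mathcal{I}} p(i)\log p(i).$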

⊡

Proof of Proposition 3.

The if part follows as the sum (21) is invariant under permutations. For the only-if part, we invoke the theorem of Birkhoff-von Neumann [34] saying that any doubly stochastic matrix is the convex sum of permutational matrices. Assume that π is not of permutational type, and hence, the convex sum will be a proper one. This means that π can be written in the form

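(A9) $\pi = \sum_{\mu=1}^{M} \lambda_\mu\,\hat{\sigma}_\mu,$

such that λμ>0 for all μ=1,…,M, M > 1, ∑μλμ=1, and

(A10) $\hat{\sigma}_\mu(j|i) = \delta_{j,\sigma_\mu(i)} \quad\text{for some } \sigma_\mu\in\mathcal{S}(n).$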

Recall that the function f(x)≡−xlogx is strictly concave for x∈[0,1]. Then, we obtain

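(A11) $S(\hat{p}) = \sum_{j} f(\hat{p}(j))$

(A12) $\overset{(11,A9)}{=} \sum_{j} f\Bigl(\sum_{\mu}\lambda_\mu \sum_{i} \hat{\sigma}_\mu(j|i)\,p(i)\Bigr)$

(A13) $> \sum_{j}\sum_{\mu} \lambda_\mu\, f\Bigl(\sum_{i} \hat{\sigma}_\mu(j|i)\,p(i)\Bigr)$

(A14) $\overset{(A10)}{=} \sum_{j}\sum_{\mu} \lambda_\mu\, f\Bigl(\sum_{i} \delta_{j,\sigma_\mu(i)}\,p(i)\Bigr)$

(A15) $= \sum_{j}\sum_{\mu} \lambda_\mu\, f\bigl(p(\sigma_\mu^{-1}(j))\bigr)$

(A16) $= \Bigl(\sum_{\mu}\lambda_\mu\Bigr)\Bigl(\sum_{i} f(p(i))\Bigr)$

(A17) $= S(p),$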

where in (A13) we have applied Jensen’s inequality using that f is strictly concave and the convex sum (A9) is a proper one. Summarising, S(p^)>S(p) if π is not of permutational type, which completes the proof of Proposition 3. ⊡
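The entropy increase can be observed numerically by sampling doubly stochastic matrices as convex sums of permutation matrices, in the spirit of (A9). The following sketch is an illustration only: the dimension, the number of terms, the initial distribution, and the function names are assumptions. Since the random permutations may coincide, only the non-strict inequality is guaranteed in every sample.

```python
import math
import random

def shannon(p):
    """Shannon entropy S(p) = -sum p log p, with 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def random_doubly_stochastic(n, n_terms, rng):
    """Convex sum of n_terms random permutation matrices (they may coincide)."""
    raw = [rng.random() + 0.1 for _ in range(n_terms)]
    weights = [w / sum(raw) for w in raw]
    pi = [[0.0] * n for _ in range(n)]
    for lam in weights:
        sigma = list(range(n))
        rng.shuffle(sigma)
        for i, j in enumerate(sigma):
            pi[j][i] += lam
    return pi

def push_forward(pi, p):
    """Second marginal p_hat(j) = sum_i pi(j|i) p(i)."""
    return [sum(row[i] * p[i] for i in range(len(p))) for row in pi]

rng = random.Random(7)
p = [0.6, 0.25, 0.1, 0.05]
entropy_gaps = []
for _ in range(100):
    pi = random_doubly_stochastic(4, 3, rng)
    entropy_gaps.append(shannon(push_forward(pi, p)) - shannon(p))
```

All entries of `entropy_gaps` are nonnegative up to rounding, and they vanish only when the sampled matrix acts on p like a permutation.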

Proof of Proposition 6.

We consider

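(A18) $\tilde{P}(\tilde{Y}=1/y) \overset{(51)}{=} \sum_{(j,i)\in\tilde{\mathsf{E}}_{1/y}} \tilde{P}(j,i)$

(A19) $\overset{(40,41,52)}{=} \sum_{(i,j)\in\mathsf{E}_{y}} \pi(j|i)\,\frac{d(i)\,q(j)}{D(j)}$

(A20) $\overset{(45,46)}{=} \sum_{(i,j)\in\mathsf{E}_{y}} \pi(j|i)\,p(i)\,y$

(A21) $\overset{(9)}{=} \Bigl(\sum_{(i,j)\in\mathsf{E}_{y}} P(i,j)\Bigr)\,y$

(A22) $\overset{(48)}{=} P(Y=y)\,y.$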

From this, the proposition follows immediately. ⊡
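The chain (A18) to (A22) can be reproduced in exact arithmetic for a small example. The sketch below is restricted to the simple case d = D = 1 and rests on two assumptions that are not spelled out in this appendix: that the random variable takes the form Y(i,j) = q(j)/p(i), and that the reciprocal model has joint probability P̃(j,i) = π(j|i) q(j) with Ỹ(j,i) = 1/Y(i,j). The numerical values are illustrative.

```python
from collections import defaultdict
from fractions import Fraction as F

# Simple case d = D = 1 with a doubly stochastic pi[j][i] (illustrative numbers).
pi = [[F(1, 2), F(1, 3), F(1, 6)],
      [F(1, 6), F(1, 2), F(1, 3)],
      [F(1, 3), F(1, 6), F(1, 2)]]
p = [F(1, 2), F(1, 3), F(1, 6)]
q = [F(1, 5), F(1, 5), F(3, 5)]

forward = defaultdict(F)    # y   -> P(Y = y)
backward = defaultdict(F)   # 1/y -> P~(Y~ = 1/y)
for i in range(3):
    for j in range(3):
        y = q[j] / p[i]                       # assumed form of Y(i, j)
        forward[y] += pi[j][i] * p[i]         # P(i, j) = pi(j|i) p(i)
        backward[1 / y] += pi[j][i] * q[j]    # assumed P~(j, i) = pi(j|i) q(j)

c_equation_holds = all(backward[1 / y] == y * prob for y, prob in forward.items())
```

In this example the value y = 6/5 is attained by two different pairs (i,j), so the grouping into level sets of Y is not trivial.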

Proof of Lemma 1.

First, we will show that p(j|i) is a stochastic matrix. This follows from

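(A23) $\sum_{j\in\mathcal{J}} p(j|i) \overset{(71)}{=} \mathrm{Tr}\left(\Bigl(\sum_{j\in\mathcal{J}} \underset{\sim}{Q}_j\Bigr)\,U\,\frac{\underset{\sim}{P}_i}{d(i)}\,U^{*}\right) \overset{(70)}{=} \mathrm{Tr}\left(U\,\frac{\underset{\sim}{P}_i}{d(i)}\,U^{*}\right) = \mathrm{Tr}\left(\frac{\underset{\sim}{P}_i}{d(i)}\right) \overset{(56)}{=} 1,$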

for all i∈ℐ.

Next, according to the definition of “modified doubly stochastic type,” we have to confirm that (25) holds

(A24)∑i∈ℐp(j|i)d(i)=(71)Tr(Q∼ jU(∑i∈ℐP∼ i)U∗)=(57)Tr(Q∼ jUU∗)=Tr(Q∼ j)=(69)D(j),

for all j∈𝒥. This completes the proof of the Lemma. ⊡
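Both sum rules of the Lemma can be checked numerically in a small finite-dimensional example. The sketch below assumes a 4-dimensional space, a real unitary (orthogonal) matrix U composed of plane rotations, and projectors onto groups of coordinate basis vectors; all of these choices and the helper names are illustrative. For such coordinate projectors the trace in the transition probability reduces to a sum of squared matrix elements of U. The last lines also evaluate the modified entropies of Theorem 2 for an arbitrary initial distribution.

```python
import math

def givens(n, a, b, angle):
    """n x n rotation by angle in the (a, b) coordinate plane."""
    g = [[float(r == c) for c in range(n)] for r in range(n)]
    g[a][a] = g[b][b] = math.cos(angle)
    g[a][b] = -math.sin(angle)
    g[b][a] = math.sin(angle)
    return g

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

# Illustrative real unitary U on a 4-dimensional space.
U = matmul(givens(4, 0, 1, 0.7), matmul(givens(4, 1, 2, 1.1), givens(4, 2, 3, 0.4)))

cells_P = [[0, 1], [2], [3]]     # basis indices spanned by each P_i, so d = (2, 1, 1)
cells_Q = [[0], [1, 2, 3]]       # basis indices spanned by each Q_j, so D = (1, 3)
d = [len(c) for c in cells_P]
D = [len(c) for c in cells_Q]

def transition(j, i):
    """p(j|i) = Tr(Q_j U P_i U*) / d(i) for coordinate projectors."""
    return sum(U[a][b] ** 2 for a in cells_Q[j] for b in cells_P[i]) / d[i]

def modified_entropy(probs, sizes):
    """Modified Shannon entropy -sum p log(p / size)."""
    return -sum(x * math.log(x / s) for x, s in zip(probs, sizes) if x > 0.0)

row_sums = [sum(transition(j, i) for j in range(len(D))) for i in range(len(d))]
weighted_sums = [sum(transition(j, i) * d[i] for i in range(len(d))) for j in range(len(D))]

p = [0.5, 0.3, 0.2]              # arbitrary initial distribution over the cells of P
p_hat = [sum(transition(j, i) * p[i] for i in range(len(d))) for j in range(len(D))]
```

The entries of `row_sums` are all 1 and those of `weighted_sums` reproduce D(j), while `modified_entropy(p_hat, D)` is not smaller than `modified_entropy(p, d)`.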

Appendix B: Derivation of the Modified Statistical Model of Sequential Measurements

We assume that the outcome sets ℐ and 𝒥 are divided into disjoint cells (“Elementarbereiche” in [18]) such that the probability P(i, j) is constant over the cells. The construction is similar to the operation of “coarse graining” in physical theories, but in those cases, the probability will typically not be constant over the cells, and the following considerations will be at most approximately valid. The mentioned cells will be written as the inverse images of suitable maps

(B1)Πℐ:ℐ→ℐ′ and Π𝒥:𝒥→𝒥′,

and are assumed to be finite. ℐ′ and 𝒥′ can hence be viewed as the respective sets of cells. We define the cell sizes

(B2)d:ℐ′→ℕ,d(i′)≡|Πℐ−1(i′)|,

(B3)D:𝒥′→ℕ,D(j′)≡|Π𝒥−1(j′)|.

As mentioned above, the probability is assumed to be constant over cells and hence gives rise to a modified probability function P′:ℐ′×𝒥′→[0,1] via

(B4)P′(i′,j′)≡d(i′)D(j′)P(i,j),if Πℐ(i)=i′ and Π𝒥(j)=j′.

We note that

(B5)1=(3)∑i∈ℐ,j∈𝒥P(i,j)=∑i′∈ℐ′,j′∈𝒥′∑i∈Πℐ−1(i′)∑j∈Π𝒥−1(j′)P(i,j)

(B6)=∑i′∈ℐ′,j′∈𝒥′d(i′)D(j′)P(i,j)=(B4)∑i′∈ℐ′,j′∈𝒥′P′(i′,j′),

as it must hold for a probability function. Here, and in what follows, the index i within a sum over i′ denotes an arbitrary element of the cell Πℐ−1(i′), analogous for j. As in Subsection 2.1, we define the modified marginal probability and obtain

(B7) $p'(i') \equiv \sum_{j'} P'(i',j') \overset{(B4)}{=} \sum_{j'} P(i,j)\,d(i')\,D(j') = \sum_{j} P(i,j)\,d(i') \overset{(5)}{=} p(i)\,d(i').$

Analogously for the modified conditional probability,

(B8) $\pi'(j'|i') \equiv \frac{P'(i',j')}{p'(i')} \overset{(B4,B7)}{=} \frac{P(i,j)\,d(i')\,D(j')}{p(i)\,d(i')} \overset{(9)}{=} \pi(j|i)\,D(j').$

The condition of π being doubly stochastic entails the following property of π′:

(B9) $\sum_{i'} \pi'(j'|i')\,d(i') \overset{(B8)}{=} \sum_{i'} \pi(j|i)\,D(j')\,d(i') = \sum_{i} \pi(j|i)\,D(j') \overset{(12)}{=} D(j').$

Next, we express the Shannon entropy in terms of the modified probabilities:

(B10) $S(p) \overset{(21)}{=} -\sum_{i} p(i)\log p(i) \overset{(B7)}{=} -\sum_{i} \frac{p'(i')}{d(i')}\log\frac{p'(i')}{d(i')} = -\sum_{i'} p'(i')\log\frac{p'(i')}{d(i')} \equiv S'(p'),$

cp. (17) of [18] or the “observational entropy” according to (15) of [35].
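The identity (B10) is easy to confirm numerically for a distribution that is constant over cells. In the following sketch the cell sizes and cell probabilities are illustrative assumptions.

```python
import math

cell_sizes = [3, 1, 2]                     # d(i') for three cells (illustrative)
cell_probabilities = [0.6, 0.1, 0.3]       # p'(i'), total probability of each cell

# Fine-grained distribution that is constant over each cell: p(i) = p'(i') / d(i').
fine = [pc / size
        for pc, size in zip(cell_probabilities, cell_sizes)
        for _ in range(size)]

shannon_fine = -sum(x * math.log(x) for x in fine)
modified_coarse = -sum(pc * math.log(pc / size)
                       for pc, size in zip(cell_probabilities, cell_sizes))
```

The two numbers agree up to rounding, which is the content of (B10): the modified entropy of the cell probabilities equals the ordinary Shannon entropy of the underlying fine-grained distribution.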

Acknowledgement

This work was funded by the Deutsche Forschungsgemeinschaft (DFG), Funder Id: http://dx.doi.org/10.13039/501100001659, grant 397107022 (GE 1657/3-1) within the DFG Research Unit FOR 2692. The authors thank all members of this research unit, especially Andreas Engel, for stimulating and insightful discussions and hints to relevant literature.

References

[1] C. Jarzynski, Phys. Rev. Lett. 78, 2690 (1997).10.1103/PhysRevLett.78.2690Search in Google Scholar

[2] J. Kurchan, arXiv:0007360v2 [cond-mat.stat-mech].Search in Google Scholar

[3] H. Tasaki, arXiv:0000244v2 [cond-mat.stat-mech].Search in Google Scholar

[4] S. Mukamel, Phys. Rev. Lett. 90, 170604 (2003).10.1103/PhysRevLett.90.170604Search in Google Scholar PubMed

[5] P. Talkner, M. Morillo, J. Yi, and P. Hänggi, New J. Phys. 15, 095001 (2013).10.1088/1367-2630/15/9/095001Search in Google Scholar

[6] T. Schmiedl and U. Seifert, J. Chem. Phys. 126, 044101 (2007).10.1063/1.2428297Search in Google Scholar PubMed

[7] K. Saito and Y. Utsumi, Phys. Rev. B 78, 115429 (2008).10.1103/PhysRevB.78.115429Search in Google Scholar

[8] D. Andrieux, P. Gaspard, T. Monnai, and S. Tasaki, New J. Phys. 11, 043014 (2009), Erratum in: New J. Phys. 11, 109802 (2009).10.1088/1367-2630/11/4/043014Search in Google Scholar

[9] J. Yi, P. Talkner, and M. Campisi, Phys. Rev. E 84, 011138 (2011).10.1103/PhysRevE.84.011138Search in Google Scholar PubMed

[10] M. Esposito, Phys. Rev. E 85, 041125 (2012).10.1103/PhysRevE.85.041125Search in Google Scholar PubMed

[11] J. Yi, Y. W. Kim, and P. Talkner, Phys. Rev. E 85, 051107 (2012).10.1103/PhysRevE.85.051107Search in Google Scholar PubMed

[12] M. Campisi, P. Hänggi, and P. Talkner, Rev. Mod. Phys. 83, 771 (2011), Erratum in: Rev. Mod. Phys. 83, 1653 (2011).10.1103/RevModPhys.83.771Search in Google Scholar

[13] P. Talkner, E. Lutz, and P. Hänggi, Phys. Rev. E 75, 050102 (2007).10.1103/PhysRevE.75.050102Search in Google Scholar PubMed

[14] P. Busch, P. Lahti, J.-P. Pellonpä, and K. Ylinen, Quantum Measurement, Springer-Verlag, Berlin 2016.10.1007/978-3-319-43389-9Search in Google Scholar

[15] A. J. Roncaglia, F. Cerisola, and J. P. Paz, Phys. Rev. Lett. 113, 250601 (2014).10.1103/PhysRevLett.113.250601Search in Google Scholar PubMed

[16] G. De Chiara, A. J. Roncaglia, F. Cerisola, and J. P. Paz, New J. Phys. 17, 035004 (2015).10.1088/1367-2630/17/3/035004Search in Google Scholar

[17] M. Campisi and P. Hänggi, Entropy 13, 2024 (2011).10.3390/e13122024Search in Google Scholar

[18] W. Pauli, in: Probleme der Moderne Physik, Arnold Sommerfeld zum 60, Geburtstag 1928. Reprinted in Collected Scientific Papers by Wolfgang Pauli, Vol. 1 (Eds. R. Kronig and V. Weisskopf), Interscience, New York 1964, p. 549.Search in Google Scholar

[19] C. E. Shannon, Bell Syst. Tech. J. 27, 379 (1948).10.1002/j.1538-7305.1948.tb01338.xSearch in Google Scholar

[20] R. Serfozo, Basics of Applied Stochastic Processes, Springer-Verlag, Berlin 2009, Corrected 2nd printing 2012.10.1007/978-3-540-89332-5Search in Google Scholar

[21] V. Vedral, J. Phys. A 45, 272001 (2012).10.1088/1751-8113/45/27/272001Search in Google Scholar

[22] G. P. Martins, N. K. Bernandes, and M. F. Santos, Phys. Rev. A 99, 032124 (2019).10.1103/PhysRevA.99.032124Search in Google Scholar

[23] M. Campisi, J. Pekola, and R. Fazio, New J. Phys. 19, 053027 (2017).10.1088/1367-2630/aa6acbSearch in Google Scholar

[24] H.-P. Breuer, W. Huber, and F. Petruccione, Phys. Rev. E 61, 4883 (2000).10.1103/PhysRevE.61.4883Search in Google Scholar

[25] M. Langemeyer and M. Holthaus, Phys. Rev. E 89, 012101 (2014).10.1103/PhysRevE.89.012101Search in Google Scholar PubMed

[26] O. R. Diermann, H. Frerichs, and M. Holthaus, Phys. Rev. E 100, 012102 (2019).10.1103/PhysRevE.100.012102Search in Google Scholar PubMed

[27] H.-J. Schmidt, J. Schnack, and M. Holthaus, Phys. Rev. E 100, 042141 (2019).10.1103/PhysRevE.100.042141Search in Google Scholar PubMed

[28] J. Gemmer and R. Steinigeweg, Phys. Rev. E 89, 042113 (2014). doi:10.1103/PhysRevE.89.042113

[29] O. Penrose, Foundations of Statistical Mechanics: A Deductive Treatment, Pergamon Press, Oxford 1970. doi:10.1016/B978-0-08-013314-0.50011-X

[30] C. Tsallis, J. Stat. Phys. 52, 479 (1988). doi:10.1007/BF01016429

[31] G. E. Crooks, Phys. Rev. E 60, 2721 (1999). doi:10.1103/PhysRevE.60.2721

[32] D. Schmidtke, L. Knipschild, M. Campisi, R. Steinigeweg, and J. Gemmer, Phys. Rev. E 98, 012123 (2018). doi:10.1103/PhysRevE.98.012123

[33] Y. Morikuni and H. Tasaki, J. Stat. Phys. 143, 1 (2011). doi:10.1007/s10955-011-0153-7

[34] G. Birkhoff, Tres observaciones sobre el algebra lineal, Univ. Nac. Tucumán Rev. Ser. A 5, 147 (1946).

[35] D. Šafránek, J. M. Deutsch, and A. Aguirre, Phys. Rev. A 99, 012103 (2019). doi:10.1103/PhysRevA.99.012103

Received: 2019-08-21

Accepted: 2019-11-13

Published Online: 2019-12-12

Published in Print: 2020-03-26

