Universal Ideal Behavior and Macroscopic Work Relation of Linear Irreversible Stochastic Thermodynamics

We revisit the Ornstein-Uhlenbeck (OU) process as the fundamental mathematical description of linear irreversible phenomena, with fluctuations, near an equilibrium. By identifying the underlying circulating dynamics in a stationary process as the natural generalization of classical conservative mechanics, a bridge between a family of OU processes with equilibrium fluctuations and thermodynamics is established through the celebrated Helmholtz theorem. The Helmholtz theorem provides an emergent macroscopic "equation of state" of the entire system, which exhibits a universal ideal thermodynamic behavior. Fluctuating macroscopic quantities are studied from the stochastic thermodynamic point of view and a non-equilibrium work relation is obtained in the macroscopic picture, which may facilitate experimental study and application of the equalities due to Jarzynski, Crooks, and Hatano and Sasa.


Introduction
Gaussian fluctuation theory is one of the most successful branches of equilibrium statistical mechanics [1,2]. Since the work of Onsager and Machlup [3,4], the Ornstein-Uhlenbeck process (OUP) has become the stochastic, mathematical description of dynamic, linear irreversible phenomena [5]. It has been extensively discussed in the literature [6,7,8,9]. Several recent papers have studied, in particular, the OUP without detailed balance [10,11,12]. In recent years, taking stochastic processes, rigorously developed by Kolmogorov, as the mathematical representation, stochastic thermodynamics has emerged as the finite-time thermodynamic theory of mesoscopic systems, near and far from equilibrium [13,14,15,16]. The fundamental aspects of this new development are the mathematical notion of stochastic entropy production [17,18,19], novel thermodynamic relationships collectively known as nonequilibrium work equalities and fluctuation theorems [20,21,22,23,24], and the mathematical concept of a non-equilibrium steady state [25,26,27].
Fundamental to all these advances is the notion of time reversal. The Newtonian dynamic equation, in Hamiltonian form,
$$\frac{dx_i}{dt} = \frac{\partial H}{\partial y_i}, \qquad \frac{dy_i}{dt} = -\frac{\partial H}{\partial x_i}, \tag{1}$$
is a canonical example of dynamics with time-reversal symmetry [28]: Under the transformation $(t, x_i, y_i) \to (-t, x_i, -y_i)$, Eq. (1) is invariant. This invariance requires that $H(x_i, y_i) = H(x_i, -y_i)$: $H$ is usually a function of $y_i^2$ and terms like $B \cdot y$, where $B$ changes sign upon time reversal, such as a magnetic field with a Lorentz force. Adapting this definition to linear stochastic processes, one has a novel definition of time reversibility that is distinctly different from Kolmogorov's, as we shall show below.
Consider the linear stochastic differential equation
$$dX(t) = -M(\alpha)X(t)\,dt + \epsilon\,\Gamma\,dB_t, \tag{2}$$
which is an OUP with parameters $\alpha$ and $\epsilon$; $M$ and $\Gamma$ are two $n \times n$ constant matrices, and $B_t$ is standard Brownian motion. We further assume that all the eigenvalues of $M$ are strictly positive and that $\Gamma$ is non-singular. According to the concept of detailed balance, Eq. (2) can be uniquely written as [29,30,31,32]
$$dX(t) = -\left\{ D\,\Xi^{-1}(\alpha)X + \left(M(\alpha) - D\,\Xi^{-1}(\alpha)\right)X \right\}dt + \epsilon\,\Gamma\,dB_t, \tag{3a}$$
where $D$ and $\Xi(\alpha)$ are positive definite matrices: $D = \frac{1}{2}\Gamma\Gamma^T$ and $M\Xi + \Xi M^T = 2D$. If one identifies the two terms inside $\{\cdots\}$ as dissipative (transient) and conservative (perpetual) motions, respectively, then a time reversible process should be defined as a statistical equivalence between the probability density of a finite path $\{X(t_0) = x_0, X(t_1) = x_1, \cdots, X(t_n) = x_n\}$, in which $t_0 < t_1 < \cdots < t_n$, namely $f_{X(t_0)X(t_1)\cdots X(t_n)}(x_0, x_1, \cdots, x_n)$, and the probability density $f_{X^\dagger(t_n)X^\dagger(2t_n - t_{n-1})\cdots X^\dagger(2t_n - t_0)}(x_n, x_{n-1}, \cdots, x_0)$, in which $X^\dagger(t)$ follows the adjoint stochastic differential equation [32,33]
$$dX^\dagger(t) = -\left\{ D\,\Xi^{-1}(\alpha)X^\dagger - \left(M(\alpha) - D\,\Xi^{-1}(\alpha)\right)X^\dagger \right\}dt + \epsilon\,\Gamma\,dB_t, \tag{3b}$$
with initial distribution for $X^\dagger(t_n)$ identical to that of $X(t_n)$.
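The decomposition into dissipative and conservative parts can be checked numerically. The following is a minimal sketch, assuming NumPy; the 2 × 2 matrices `M` and `Gamma` and the helper `solve_xi` are hypothetical choices made for illustration and do not come from the text. It solves MΞ + ΞM^T = 2D by vectorization and then confirms that Ξ is positive definite and that MΞ − D is skew-symmetric.

```python
import numpy as np

def solve_xi(M, Gamma):
    """Return (D, Xi) with D = Gamma Gamma^T / 2 and M Xi + Xi M^T = 2 D."""
    n = M.shape[0]
    D = 0.5 * Gamma @ Gamma.T
    I = np.eye(n)
    # With row-major flattening, vec(M Xi) = kron(M, I) vec(Xi)
    # and vec(Xi M^T) = kron(I, M) vec(Xi).
    K = np.kron(M, I) + np.kron(I, M)
    Xi = np.linalg.solve(K, (2.0 * D).reshape(-1)).reshape(n, n)
    return D, Xi

# Hypothetical example: non-symmetric M with eigenvalues 2 and 1, identity noise matrix.
M = np.array([[2.0, 1.0],
              [0.0, 1.0]])
Gamma = np.eye(2)

D, Xi = solve_xi(M, Gamma)
S = M @ Xi - D                      # circulating part; should be skew-symmetric

print("Xi =\n", Xi)
print("M Xi - D =\n", S)
print("eigenvalues of Xi:", np.linalg.eigvalsh(Xi))
```

For this particular example the linear system can also be solved by hand, which gives a quick cross-check of the vectorized solver.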
Recognizing the underlying circulating, conservative dynamics in Eqs. (3a) and (3b) allows us to connect a Hamiltonian structure with linear stochastic processes, and consequently develop a Helmholtz theorem, which historically has served as the fundamental mathematical link between classical Newtonian mechanics and thermodynamics. For high dimensional stochastic processes, variables in the Helmholtz theorem provide the systems' underlying dynamics with a macroscopic picture. An ideal gas-like relation between a set of new, macroscopic variables emerges, confirming the simplicity of the OUP. A work-free energy equality in terms of the macroscopic thermodynamic variables, which are fluctuating with the underlying dynamics, captures the nature of the fluctuation in the underlying stochastic processes. We emphasize that even though the mathematical derivations are essentially the same, the physical meaning of the work relation is closer to the classical thermodynamics.
The paper is structured as follows. In Sec. 2, we first provide the necessary preliminaries on the OUP. Sec. 2.1 introduces the conservative dynamics as a part of the stationary behavior of the OUP. Sec. 2.2 then discusses a long neglected issue of zero energy reference. Secs. 3.1 and 3.2 introduce the stationary free energy function and the dynamic free energy functional. Sec. 3.3 studies the novel object of equation of state. It is shown that the OUP has a simple, universal ideal thermodynamic behavior. In Sec. 4, we turn to the circulating dynamics and its relation to classical mechanics as well as stochastic dynamics. Sec. 4.1 focuses on the simplicity of the circulating dynamics as being totally integrable. Sec. 4.2 contains a proof that the stationary probability density of the OUP, conditioned on an invariant torus of the underlying conservative dynamics, analogous to a microcanonical ensemble, is an invariant measure of the latter. If the dynamics on an invariant torus is ergodic, then the conditional probability is the only, natural invariant measure on the torus. Work equalities and fluctuation theorems are discussed in Sec. 5. Using a macroscopic presentation of the Jarzynski equality, its relation to the Helmholtz theorem is revealed in Sec. 5.3. The paper concludes with discussions in Sec. 6.

Stationary Gaussian density and underlying conservative dynamics
The OUP in Eq. (3a) satisfies the important fluctuation-dissipation relation: $2D\Xi^{-1} = \epsilon^2\Gamma\Gamma^T \times$ covariance matrix of the stationary OUP. In fact, it has a stationary Gaussian distribution $Z^{-1}(\alpha)e^{-\varphi(x;\alpha)/\epsilon^2}$, in which $Z(\alpha)$ is a normalization factor and $\varphi(x;\alpha) = x^T\Xi^{-1}(\alpha)x$. In addition, there is an underlying circulating dynamics in which the scalar $\varphi(x;\alpha)$ is conserved [11]:
$$\frac{dx}{dt} = -\left(M(\alpha) - D\,\Xi^{-1}(\alpha)\right)x. \tag{4}$$
In fact, this conservative dynamics can be expressed as [32]
$$\frac{dx}{dt} = -\frac{1}{2}\left(M(\alpha)\Xi - D\right)\nabla_x\varphi(x;\alpha),$$
where $\nabla_x\varphi(x;\alpha) = 2\,\Xi^{-1}(\alpha)x$ and $M(\alpha)\Xi - D$ is skew-symmetric. It is of paramount importance to recall that for a Markov process without detailed balance, its stationary dynamics is quantified by two mathematical objects: a stationary probability density and a stationary circulation [25,34] characterized as a divergence-free, conservative vector field. In general, the latter accounts for the complexity arising from the system's dynamics [35]: how many integrals of motion it has, whether the conservative dynamics is ergodic on an invariant set, etc. Many of the characteristics persist in the stationary stochastic process, and can be used to classify long-time, complex behaviors in high-dimensional systems. On the other hand, the dissipative (transient) dynamics plus noise drive the system towards the stationary distribution while characterizing "energy" fluctuations.
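That the quadratic form x^T Ξ^{-1} x is conserved by the circulating part follows from the skew-symmetry of MΞ − D. A minimal numerical sketch, assuming NumPy, is given below. The matrices are a hypothetical 2 × 2 example (Ξ is the hand-computed solution of MΞ + ΞM^T = 2D for this M and D), and the helper names `phi` and `flow` are illustrative. The flow dx/dt = −(M − DΞ^{-1})x is propagated exactly through an eigendecomposition rather than by a time-stepping scheme.

```python
import numpy as np

# Hypothetical example: M has eigenvalues 2 and 1, D = Gamma Gamma^T / 2 with Gamma = I.
M = np.array([[2.0, 1.0],
              [0.0, 1.0]])
D = 0.5 * np.eye(2)
Xi = np.array([[1/3, -1/6],
               [-1/6, 1/2]])          # solves M Xi + Xi M^T = 2 D for this M, D
Xi_inv = np.linalg.inv(Xi)
A = M - D @ Xi_inv                    # generator of the circulating part: dx/dt = -A x

def phi(x):
    """Quadratic form x^T Xi^{-1} x."""
    return x @ Xi_inv @ x

def flow(x0, t):
    """Exact solution x(t) = exp(-A t) x0, using the eigendecomposition of A."""
    lam, V = np.linalg.eig(A)
    propagator = (V @ np.diag(np.exp(-lam * t)) @ np.linalg.inv(V)).real
    return propagator @ x0

x0 = np.array([1.0, -0.5])
for t in (0.3, 2.0, 10.0):
    print(t, phi(flow(x0, t)) - phi(x0))   # differences at round-off level

# Algebraic form of the same statement: Xi^{-1} A + A^T Xi^{-1} = 0.
conservation_residual = Xi_inv @ A + A.T @ Xi_inv
```

The vanishing of `conservation_residual` is the matrix identity behind the conservation law; the loop simply confirms it along one trajectory.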
For the OUP in Eq. (3a), the conservative dynamics will be shown to be totally integrable. That is, symmetries would be implied through $\lfloor n/2 \rfloor$ first integrals of motion, which are the natural generalizations of the time-reversal symmetries. The remaining part, $dX(t) = -\frac{1}{2}D\nabla_x\varphi(x;\alpha)\,dt + \epsilon\,\Gamma\,dB_t$, has a stationary dynamics that is detailed balanced. It is worth noting that any $\tilde\varphi(x;\alpha) = \varphi(x;\alpha) + C(\alpha)$ is also a valid substitution for the $\varphi(x;\alpha)$ in Eq. (7). As far as the stochastic dynamical system Eq. (3a) is concerned, there is no unique $\varphi(x;\alpha)$ as a function of both the dynamic variable $x$ and the parameter $\alpha$.

Zero energy reference: A hidden assumption in classical physics
The central object that connects classical Newtonian mechanics with equilibrium thermodynamics is the entropy function $S(E, V, N)$, with $V$ and $N$ being the volume and the number of particles of a classical mechanical system in a container, and $E$ its total mechanical energy, which is conserved according to Newton's Second Law of motion. In Hamilton's formulation Eq. (1), $E$ is simply the initial value of the Hamiltonian function $H(\{x_i\}, \{y_i\})$, in which $x_i$ and $y_i$ are the position and momentum of the $i$th particle, respectively, $1 \le i \le N$.
We recognize that in the classical theory of mechanical motions, replacing $H$ with $\tilde H = H + C(V, N)$ leaves the equations of motion unchanged. However, an additive function $C(V, N)$ would cause non-uniqueness in the thermodynamic forces in the relation
$$dE = T\,dS - p\,dV + \mu\,dN, \tag{8}$$
in which $T = \left(\partial S/\partial E\right)^{-1}_{V,N}$, $p = T\left(\partial S/\partial V\right)_{E,N}$, and $\mu = -T\left(\partial S/\partial N\right)_{E,V}$. Since pressure $p$ has a mechanical interpretation, one can, by physical principle, uniquely determine the form of $p$ as a function of $V$. The situation for $\mu$ is much less clear: Since there is not an independent mechanical interpretation of the chemical potential other than the thermodynamic one given in Eq. (8), the non-uniqueness is inherent in the mathematical, as well as the physico-chemical, theory. The problem has the same origin as Gibbs' paradox [37,38].
In classical chemical thermodynamics, the Hamiltonian function as a function of varying number of particles $N$, $H(x_1, \cdots, x_N, y_1, \cdots, y_N) = \frac{1}{2}\sum_{i=1}^{N} m_i y_i^2 + V(x_1, \cdots, x_N)$, is uniquely determined via a Kirkwood charging process [36]: where lim With this convention, the Hamiltonian for a molecular system is uniquely determined in chemical thermodynamics, which yields a consistent chemical potential $\mu$. How to generalize this chemical approach to Hamiltonian dynamics Eq. (1) with no clear separation between kinetic and potential parts, however, is unclear. The problem of uniqueness of the Hamiltonian function $H$ is intimately related to the uniqueness of $\varphi(x;\alpha)$ in Sec. 2.1. As we shall show in the rest of this paper, the zero energy reference has deep implications for the theory of stochastic thermodynamics. The resolution to the problem will be discussed in Sec. 4.3.

Free energy functions and functional
Like the notion of entropy, the definition of free energy varies widely in the literature (see footnote 1). The most general features of free energy, perhaps, are: it is the difference between "internal energy" and entropy; it is the entropy under a "natural invariant measure". In this section, we shall present two different types of free energies associated with the OU dynamics in Eq. (3a): (i) Thermodynamic free energy of a stationary dynamics, as a function of mean internal energy $E$ and parameter $\alpha$: $A(E, \alpha)$. We identify a "thermodynamic state" as a state of sustained motion, either for a deterministic conservative dynamics Eq. (4), or for a stochastic stationary process defined by Eq. (3a).

Thermodynamic free energy functions A(E, α)
With a particular given $\varphi(x;\alpha)$, we now introduce two different free energy functions. The first one is defined following the microcanonical ensemble approach; the definition of the second one follows Gibbs' canonical ensemble approach. While the second one is frequently used in the work-free energy relation (discussed in Sec. 5), the two definitions agree perfectly in the large dimension limit.
The first thermodynamic free energy function, $A_1(E, \alpha)$, associated with the conservative deterministic motion of Eq. (4) on the surface of $\varphi(x;\alpha) = E$, is obtained following the microcanonical ensemble approach through Boltzmann's entropy function. Letting $\sigma_B(E, \alpha)$ correspond to the entropy $S$ and $\Theta^{-1}(E, \alpha)$ correspond to $\partial S/\partial E$ in Eq. (8), we can define: where $V_n = \pi^{n/2}\,\Gamma\!\left(\frac{n}{2} + 1\right)^{-1}$ is the volume of an $n$-dimensional Euclidean ball with radius 1, $\Gamma(\cdot)$ is the gamma function, and $n$ is the dimension of the OUP in Eq. (3a). The second one, $A_2(E, \alpha)$, follows Gibbs' canonical ensemble approach via the "partition function" $Z(\alpha)$: in which mean internal energy The two free energy functions $A_1$ in Eq. (13c) and $A_2$ in Eq. (14b) differ only by a function of $n$ inside the $\{\cdots\}$. For large $n$, $\ln\Gamma\!\left(\frac{n}{2} + 1\right) \approx \frac{n}{2}\ln\frac{n}{2} - \frac{n}{2}$. Therefore, $A_1$ and $A_2$ agree perfectly in the limit of $n \to \infty$.

Footnote 1: It has become increasingly clear that Boltzmann's entropy for a Hamiltonian dynamics is not unique: There are different geometric characterizations of the level sets of the Hamiltonian that can be acceptable choices. Neither is Shannon's entropy in stochastic dynamics unique: other convex functions such as Tsallis' entropy can also be found in the literature.
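The two ingredients quoted above, the unit-ball volume V_n and the large-n approximation of ln Γ(n/2 + 1), are easy to evaluate with the Python standard library. The sketch below is illustrative only; the function names `unit_ball_volume` and `stirling_rel_error` are hypothetical. It shows that the relative error of the approximation decays as n grows, which is the sense in which the two free energy functions agree in the large dimension limit.

```python
import math

def unit_ball_volume(n):
    """Volume of the n-dimensional Euclidean ball of radius 1: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def stirling_rel_error(n):
    """Relative error of (n/2) ln(n/2) - n/2 as an approximation to ln Gamma(n/2 + 1)."""
    exact = math.lgamma(n / 2 + 1)
    approx = (n / 2) * math.log(n / 2) - n / 2
    return abs(exact - approx) / exact

for n in (10, 100, 1000, 10000):
    print(n, stirling_rel_error(n))
```

The neglected term in the approximation grows only logarithmically in n, so it becomes negligible relative to the leading terms, although it does not vanish in absolute value.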

Dynamic free energy functional
The thermodynamic free energy $A_2(E, \alpha)$ in Sec. 3.1 sets a universal energy reference point for the entire family of stochastic dynamics in Eq. (3a) with different $\alpha$. For a given $\alpha$, the time-dependent probability density function $f_\alpha(x, t)$ follows the partial differential equation The $f_\alpha(x, t)$ represents an instantaneous "state" of the probabilistic system, which has a free energy functional This is a dynamic generalization of the free energy functions in Sec. 3.1. It has two important properties. First, Second [32,33], in which the equality holds if and only if $f_\alpha(x, t)$ reaches its stationary distribution $Z^{-1}(\alpha)e^{-\varphi(x;\alpha)/\epsilon^2}$. The negated rate of change in the dynamic free energy functional, $-d\Psi/dt$, is widely recognized as the non-adiabatic entropy production rate. The entropy production rate also has a finite-time, stochastic counterpart in terms of the logarithm of the likelihood ratio: where $\tilde X(\tau) = X(s - \tau + t)$, and the expectation $E^P[\cdots]$ is carried out over the diffusion process defined by Eq. (3a) and the corresponding Eq. (15):

Universal equation of state of OU process
With the introduction of the internal energy $E$ and the parameter $\alpha$, the thermodynamic relation Eq. (8), known as the Helmholtz theorem, can be expressed for the OUP model by $\sigma_B$, $\alpha$ and their conjugate variables. We notice that $\alpha$ enters Eq. (13) only through $\det\Xi(\alpha)$. If one measures $\alpha$ through $\tilde\alpha = \det\Xi(\alpha)$, then the Helmholtz theorem reads: The two conjugate variables, $\Theta$ and $F_\alpha$, correspond to the macroscopic quantities in classical thermodynamics as temperature and force. Following either Boltzmann's microcanonical or Gibbs' canonical approach, Sec. 3.1 revealed that $E = \frac{1}{2}n\Theta$, in which $\theta = \frac{1}{2n}\Theta$ could be interpreted as an "absolute temperature". Since the absolute temperature $\theta$ is a fluctuating quantity with respect to $E$ and $\alpha$, it may, in general, not bear a simple relationship with the noise strength $\epsilon^2$. But here in the OUP, by comparing the microcanonical approach with the canonical one, we note that the mean absolute temperature is $\bar\theta = \frac{\epsilon^2}{2n}$. The thermodynamic conjugate variable of $\alpha$, the $\alpha$-force: A mathematical relation between $\alpha$, $F_\alpha$, and $\theta$ is called an equation of state in classical thermodynamics.
The "internal energy" $E$ being a sole function of temperature $\theta$, and the product of thermodynamic conjugate variables, $\alpha F_\alpha$, equaling $n\theta$, are hallmarks of the thermodynamic behavior of an ideal gas and an ideal solution. We thus conclude that the OUP has a universal ideal thermodynamic behavior.
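The statement that the mean internal energy does not depend on the parameter-dependent matrix Ξ can be illustrated by sampling. The sketch below assumes NumPy and takes the stationary density exactly as stated in the text, proportional to exp(−x^T Ξ^{-1} x / ǫ²), which implies a Gaussian with covariance ǫ²Ξ/2 and hence a mean of x^T Ξ^{-1} x equal to nǫ²/2 for any positive definite Ξ. The specific 4 × 4 matrix, the value of ǫ, and the helper `mean_internal_energy` are hypothetical.

```python
import numpy as np

def mean_internal_energy(Xi, eps, samples=200_000, seed=0):
    """Monte Carlo mean of x^T Xi^{-1} x under the density proportional to exp(-x^T Xi^{-1} x / eps^2)."""
    rng = np.random.default_rng(seed)
    n = Xi.shape[0]
    cov = 0.5 * eps**2 * Xi                  # covariance implied by the stated density
    L = np.linalg.cholesky(cov)
    X = rng.standard_normal((samples, n)) @ L.T
    E = np.einsum("ij,jk,ik->i", X, np.linalg.inv(Xi), X)   # energy of each sample
    return E.mean()

# Hypothetical positive definite Xi (diagonally dominant) and noise strength.
Xi = np.array([[2.0, 0.3, 0.0, 0.1],
               [0.3, 1.0, 0.2, 0.0],
               [0.0, 0.2, 1.5, 0.4],
               [0.1, 0.0, 0.4, 0.8]])
eps = 0.7

estimate = mean_internal_energy(Xi, eps)
expected = 0.5 * Xi.shape[0] * eps**2        # n * eps^2 / 2, independent of Xi
print(estimate, expected)
```

Rescaling or rotating Ξ changes the shape of the level sets but not the sampled mean energy, which is the ideal-gas-like feature described above.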

Circulating conservative flow and its invariant measures
After discussing the energy function and stationary probability, we now focus on the dynamic complexity of the system and study the circulating, conservative dynamics. The universal ideal thermodynamic behavior reveals one aspect of the simplicity in OUP; another is reflected in the divergence-free motions. For the linear conservative dynamics, Eq. (4), its structure is known to be simple: the vector field is integrable.
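The conservative dynamics in Eq. (4) can be proved to be purely cyclic (e.g., periodic, or quasi-periodic on an invariant torus). This is because $(M - D\Xi^{-1})\Xi = M\Xi - D$ is skew-symmetric and $\Xi$ is positive definite, so the matrix $(M - D\Xi^{-1})$ is similar to a skew-symmetric matrix and has only pairs of imaginary eigenvalues $\{\lambda_\ell \mid 1 \le \ell \le n\}$. We can also find the real Jordan form of $(M - D\Xi^{-1})$: $QJQ^{-1}$, where $J$ is block diagonal, with $2 \times 2$ skew-symmetric blocks, $\begin{pmatrix} 0 & \omega_i \\ -\omega_i & 0 \end{pmatrix}$ with real $\omega_i$ being the $i$th block on the diagonal. Natural coordinates for the conservative flow Eq. (4) are therefore $y = Q^{-1}x$.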
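The spectral claim can be checked numerically. The sketch below, assuming NumPy, does not start from a given M; instead it builds a hypothetical positive definite Ξ and a hypothetical skew-symmetric matrix S standing in for MΞ − D, so that SΞ^{-1} plays the role of M − DΞ^{-1}. It then verifies that this matrix is similar to the skew-symmetric matrix Ξ^{-1/2} S Ξ^{-1/2} and that its eigenvalues are purely imaginary and come in conjugate pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
Xi = B @ B.T + n * np.eye(n)          # hypothetical symmetric positive definite Xi
C = rng.standard_normal((n, n))
S = C - C.T                           # hypothetical skew-symmetric matrix, in the role of M Xi - D
A = S @ np.linalg.inv(Xi)             # in the role of M - D Xi^{-1}

# Symmetric square root of Xi and its inverse, from the eigendecomposition of Xi.
w, U = np.linalg.eigh(Xi)
Xi_half = U @ np.diag(w ** 0.5) @ U.T
Xi_inv_half = U @ np.diag(w ** -0.5) @ U.T

K = Xi_inv_half @ S @ Xi_inv_half     # skew-symmetric
A_from_K = Xi_half @ K @ Xi_inv_half  # similarity transform that recovers A

eigenvalues = np.linalg.eigvals(A)
print(eigenvalues)
```

Any M of the form (D + S)Ξ^{-1}, with D and Ξ positive definite and S skew-symmetric, satisfies MΞ + ΞM^T = 2D, which is why this construction is a legitimate stand-in for solving the matrix equation.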

The conservative flow and general time reversal symmetries
A Poisson bracket $\{\cdot, \cdot\}$ can be defined for the linear conservative system as: Then the conservative flow expressed in terms of its Hamiltonian function $\varphi(x)$ is: $\dot{x}$ First integrals $I_i$ of the conservative flow are: Here, $I_{(2i-1)\sim(2i)}$ denotes the diagonal matrix with 1 on the $(2i-1)$-th to $(2i)$-th diagonal entries, and zero everywhere else. The conservative flow is totally integrable, and can be written in canonical action-angle variables. Angular coordinates $\theta_i$ accompanying $I_i$ can be found as: Hence, in the canonical action-angle variables, $\varphi =$ There are $\lfloor n/2 \rfloor$ first integrals, but for the given Poisson bracket, one combination of them is unique, which is the Hamiltonian $\varphi$ that connects to the stationary distribution and generates the conservative flow.
In the action-angle variables, it is observable that the system bears the following symmetries:

Conditional probability measure as invariant measure of the conservative flow
The OUP yields an equilibrium probability density function for $X$: $f^{eq}_X(x;\alpha) = Z^{-1}(\alpha)e^{-\varphi(x;\alpha)/\epsilon^2}$. In this section, we calculate the conditional probability density for $X$ restricted on an equal energy surface $D_{\varphi=E} = \{x \mid x \in \mathbb{R}^n, \varphi(x;\alpha) = E\}$ and prove it to be an invariant measure of the conservative dynamics, Eq. (4), restricted on $D_{\varphi=E}$. Therefore, in the absence of fluctuation and dissipation, our definition of "equilibrium free energy" (Eq. (13c)) in stochastic thermodynamics reduces to Boltzmann's microcanonical ensemble approach in classical mechanics.
One can obtain a conditional probability density for $X$ restricted on the equal energy surface $D_{\varphi=E}$ as: in which [41] $S(E, \alpha) = \ln$ The conditional probability density at $x \in D_{\varphi=E}$ is: Note that this conditional probability is one of the invariant measures of the conservative dynamics Eq. (4) restricted to $D_{\varphi=E}$.
To prove this fact, define the dynamics of the conservative part as $S_t$, mapping a measurable set $A \to S_t(A)$. Then the measure of a set $A \subseteq D_{\varphi=E}$ under $d\mu = e^{-S(E,\alpha)}\,\|\nabla_x\varphi(x;\alpha)\|^{-1}\,d\Sigma_{n-1}$ is: since $S_t^{-1}(A) \subseteq D_{\varphi=E}$ and $S_t$ is volume preserving. In general, if the dynamics is ergodic on the entire $D_{\varphi=E}$, then its invariant measure $\mu$ is the physical measure: the $\mu$-average equals the time average along a trajectory; if there are other first integrals for the conservative dynamics, then $\mu$ can be projected further to lower dimensional invariant sets.
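The two properties used in the argument, that the conservative flow maps each energy surface into itself and that it preserves volume, can be checked for the linear flow directly. The sketch below assumes NumPy and uses hypothetical matrices: a random positive definite Ξ and a random skew-symmetric S standing in for MΞ − D, with A = SΞ^{-1} playing the role of M − DΞ^{-1}. It also checks that the flow leaves the matrix Ξ, and hence any Gaussian whose covariance is proportional to Ξ, invariant.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
B = rng.standard_normal((n, n))
Xi = B @ B.T + n * np.eye(n)      # hypothetical symmetric positive definite Xi
C = rng.standard_normal((n, n))
S = C - C.T                       # hypothetical skew-symmetric matrix, in the role of M Xi - D
Xi_inv = np.linalg.inv(Xi)
A = S @ Xi_inv                    # conservative flow: dx/dt = -A x

def propagator(t):
    """exp(-A t), computed from the eigendecomposition of A."""
    lam, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(-lam * t)) @ np.linalg.inv(V)).real

P = propagator(1.3)
volume_factor = np.linalg.det(P)              # equals 1 for a volume-preserving linear map
pushed_cov = P @ Xi @ P.T                     # image of Xi under the flow

x0 = rng.standard_normal(n)
energy_before = x0 @ Xi_inv @ x0
energy_after = (P @ x0) @ Xi_inv @ (P @ x0)   # same energy surface after the flow

print(volume_factor, energy_before, energy_after)
```

Volume preservation here follows from the vanishing trace of A, since the trace of a skew-symmetric matrix times a symmetric one is zero.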

Resolutions to the energy reference problem
Up to now, there are clearly several possibilities to uniquely determine the free additive function $C(V, N)$ in the Hamiltonian $H(\{x_i\}, \{y_i\}; V, N)$ discussed in Sec. 2.2.
(1) C(V, N) is chosen such that the global minimum of H = 0 for each and every V and N. This is widely used, implicitly, in application practices, as in our Eq. (7).
(2) C(V, N) is chosen according to the "equilibrium free energy": Note that this is precisely the "energy function" in Hatano and Sasa [24].
(3) Extra information concerning the fluctuations in V , such as in an isobaric ensemble, and fluctuations in N in grand ensemble, provides an empirically determined basis for the free energy scale.
In terms of the theory of probability, choice (2) uniquely determines the energy reference point according to a conditional probability, and in choice (3) it is uniquely determined according to a marginal distribution. How to normalize a probability, which has always been considered non-consequential in statistical physics, seems to be a fundamental problem in the physics of complex systems.

Work equalities and fluctuation theorems
The previous discussions suggest that while a great deal of complexity of a detailed, mesoscopic stationary dynamics is captured by the circulating conservative dynamics, the OUP also has a macroscopic state of motion that is defined by the internal energy $E$, or equivalently the level sets of $\varphi(x)$, $D_{\varphi=E} \subset \mathbb{R}^n$. Thus, from the macroscopic point of view, a stochastic system could be studied through the one-dimensional (1-D) time sequence of fluctuating internal energy $E$, as a function of $t$, or the change in $E$ due to changes in the parameter $\alpha$.
The celebrated Jarzynski equality connects the mesoscopic fluctuating force with the change in free energy. We present this result through a projection from n-D phase space to 1-D function E(t) that facilitates experimental verification of the work-free energy relation. This approach reveals a close connection between the Jarzynski equality and the Helmholtz theorem. We start with stating the Jarzynski and Crooks' equalities, with the mathematical proofs collected in the Appendix A for readers' convenience. We then demonstrate the novel formulation of the Jarzynski equality in the projected space.
We have shown that $A_2(\alpha)$ is uniquely determined only up to a particular $\varphi(x;\alpha)$. As shown below, the existence of an $A_2(\alpha)$ is of paramount importance in the theories of work equalities, in which the notion of a common energy for a family of stochastic dynamical systems with different $\alpha$ has to be given a priori [42].

The Jarzynski equality
The macroscopic $\alpha$-force in the Helmholtz theorem, as a function of $E$ and $\alpha$, is defined through Boltzmann's entropy $\sigma_B$. The Jarzynski equality, on the other hand, is concerned with a mesoscopic $\alpha$-force and the statistical behavior of its corresponding stochastic work The Jarzynski equality dictates that if the initial distribution of $X(\tau)$ follows the equilibrium distribution, then [20]
$$\left\langle e^{-\frac{1}{\epsilon^2}W[X(\tau),\alpha(\tau)]} \right\rangle = e^{-\frac{1}{\epsilon^2}\Delta A_2},$$
where $\Delta A_2$ is the change in $A_2(\alpha) = -\epsilon^2\ln Z(\alpha)$ over the protocol, and the average of a functional over the ensemble of paths is defined as: in which $P[X(\tau), \alpha(\tau)]\,D[X(\tau)]$ is an infinite-dimensional probability distribution for the entire paths $[X(\tau)]$.
It is clear from the proof in Appendix A that the Jarzynski equality is general for Markov processes with or without detailed balance. Several recent papers have studied extensively the latter case [43,44].
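A small simulation can make the equality concrete. The following sketch assumes NumPy and a deliberately simple one-dimensional setting that is not taken from the text: the energy is φ(x; k) = k x², so the equilibrium density proportional to exp(−φ/ǫ²) is Gaussian with variance ǫ²/(2k) and Z(k) = sqrt(πǫ²/k); the parameter k is ramped linearly; and between parameter changes the state is relaxed with an exact OU update whose rate is set equal to k (a hypothetical choice; any update that preserves the equilibrium density at fixed k would serve). Work is accumulated as the change of φ at fixed x when the parameter is switched. The function name `jarzynski_estimate` and all default values are illustrative.

```python
import numpy as np

def jarzynski_estimate(k0=1.0, k1=2.0, eps=1.0, steps=200, dt=0.01,
                       paths=50_000, seed=1):
    """Compare the path average of exp(-W/eps^2) with Z(k1)/Z(k0) for phi(x; k) = k x^2."""
    rng = np.random.default_rng(seed)
    ks = np.linspace(k0, k1, steps + 1)
    # Start in equilibrium at k0: variance eps^2 / (2 k0).
    x = rng.standard_normal(paths) * np.sqrt(eps**2 / (2.0 * k0))
    work = np.zeros(paths)
    for k_old, k_new in zip(ks[:-1], ks[1:]):
        # Work step: change the parameter at fixed x.
        work += (k_new - k_old) * x**2
        # Relaxation step: exact OU update that preserves the equilibrium density at k_new.
        decay = np.exp(-k_new * dt)          # relaxation rate k_new is an assumption
        var_eq = eps**2 / (2.0 * k_new)
        x = x * decay + np.sqrt(var_eq * (1.0 - decay**2)) * rng.standard_normal(paths)
    lhs = np.mean(np.exp(-work / eps**2))
    rhs = np.sqrt(k0 / k1)                   # Z(k1)/Z(k0) = exp(-(A(k1) - A(k0)) / eps^2)
    return lhs, rhs, work.mean()

lhs, rhs, mean_work = jarzynski_estimate()
print(lhs, rhs, mean_work)
```

Because the relaxation step preserves the equilibrium density exactly, the discrete-time identity holds without time-step bias, and the remaining discrepancy between `lhs` and `rhs` is sampling error. The mean work exceeds the free energy difference ǫ² ln(k1/k0)/2, as expected for a finite-rate protocol.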

Crooks' approach
G. E. Crooks' approach, when applied to processes without detailed balance [24], considers the probability functional of a backward path $P[\tilde X(t)|\tilde X(0); \tilde\alpha(t)]$ over a forward one $P[X(t)|X(0); \alpha(t)]$, where both the initial and final distributions of $X(\tau)$ follow the equilibrium distribution: are the heat dissipation and the house-keeping heat, respectively. If a process describes a physical system in equilibrium, which is expected to be "microscopic reversible" in [22], then On the other hand, if the system is in detailed balance for each and every $\alpha$, $M(\alpha) - D(\alpha)\Xi^{-1}(\alpha) = 0$, then the house-keeping heat $Q_{hk}[X(\tau), \alpha(\tau)] \equiv 0$. Therefore, the path-ensemble average of Eq. (36) gives: This is Hatano-Sasa's result [24]. For systems without detailed balance, $Q_{hk}[X(\tau), \alpha(\tau)]$ measures the magnitude of the divergence-free vector field, or the extent to which the system is away from detailed balance, even when the stationary distribution is attained. At the same time, measures how much on average the behavior of backward paths is statistically different from forward ones.

Crooks' approach through adjoint processes
Jarzynski's approach is based on a mesoscopic $\alpha$-force, while Crooks' approach is concerned with the stochastic entropy production rate, which reflects "heat dissipation". Therefore, for systems with detailed balance, they are essentially the same result according to the First Law of thermodynamics. For systems without detailed balance, one can again obtain a Jarzynski-like equality from the probability $P$ of the forward path over the adjoint probability $P^\dagger$ of the backward one, according to the notion of time reversal in Eq. (3b): Thus, whether a system is in detailed balance or not, one has the Hatano-Sasa equality [24]:

Macroscopic work equalities
We are now in the position to study the work-free energy relation from a macroscopic view. Essentially, we will consider the stochastic, fluctuating $(E(t), \alpha(t))$ instead of $(X(t); \alpha(t))$ directly. In doing so, we are observing the evolution in the probability distribution of $E$ through a projection from $(X; \alpha)$ to $(E, \alpha)$. With the projection of the $n$-dimensional phase space to the one-dimensional time series $E(t)$, the stationary probability density function $f^{ss}_E(E, \alpha)$ of $E$ with $\alpha$ is also a projection of the original stationary probability density function $f^{ss}_X(x; \alpha)$ in Euclidean space (as discussed in Sec. 4.2): in which For the process of $(E(t), \alpha(t))$, the total internal energy is no longer $E$ itself. Rather, it would include the "entropic effect", $S(E, \alpha)$, caused by the curved space structure, and become $A(E, \alpha)$, as will be discussed in more detail in [45]: The Helmholtz theorem for the new $(E(t), \alpha(t))$ process reads: Hence, the total force that does the work in this new coordinate is: where $\epsilon^2\,\partial S(E, \alpha)/\partial\alpha$ is what chemists call an "entropic force". Now we define the work that the external environment has done to the system through the controlled change of $\alpha(t)$ as: Then the work-free energy relation in macroscopic variables is: Therefore, the average of the exponential of minus work equals the exponential of minus the free energy difference. Here, we notice that the free energy stays the same through the change of free variables, as a result of Eq. (27):

Discussion
In the present work, using the OUP as an example, we have illustrated a possible method of deriving emergent, macroscopic descriptions of a complex stochastic dynamics from its mesoscopic law of motion. In recent years, there has been a growing awareness of the role of probabilistic reasoning as the logic of science [46,47]. In this framework, prior information, data, and probabilistic deduction are three pillars of a scientific theory. In fields with very complex dynamics, statistical inferences focus on the latter two aspects starting with data. In the physical sciences, which include chemistry and cellular biology, the prior plays a fundamental role as a feasible "mechanism" which enters a scientific model based on "established knowledge": no biochemical phenomena should violate the physical laws of mechanics and thermodynamics. Indeed, many priors have been rigorously formalized in terms of mathematical theories. Unfortunately, most of these theories are expressed in terms of deterministic mathematics for very simple individual "particles"; obtaining a meaningful probabilistic prior for a realistic, macroscopic-level system requires a computational task that is neither feasible nor meaningful [52,53]. Nonlinear stochastic dynamical study is the mathematical deductive process that formulates a probabilistic prior based on a given mechanism.
Open systems, when represented in terms of Markov processes, are ubiquitously non-symmetric processes according to Kolmogorov's terminology. This is one of the lessons we learned from the open-chemical systems theory. The non-symmetricity can be quantified by entropy production [25]. For discrete-state Markov processes, symmetric processes are equivalent to Kolmogorov's cycle condition [54]. Interestingly, concepts such as cycle condition, detailed balance, dissipation and irreversible entropy production had all been independently discovered in chemistry: Wegscheider's relation in 1901 [55], detailed balance by G.N. Lewis in 1925 [56], Onsager's dissipation function in 1931 [57], and the formulation of entropy production in the 1940s [58,59].
A non-symmetric Markov process implies circulating dynamics in phase space. Such dynamics is not necessarily dissipative, as exemplified by harmonic oscillators in classical mechanics. One of us has recently pointed out the important distinction between overdamped thermodynamics and underdamped thermodynamics [32]. The present paper is a study of OUP in terms of the latter perspective, in which we have identified the unbalanced circulation as a conservative dynamics, a hallmark of the generalized underdamped thermodynamics [33]. In terms of this conservative dynamics, Boltzmann's entropy function naturally enters stochastic thermodynamics, and we discover a relation between the Helmholtz theorem [60] and the various work relations.
In the past, studies on stochastic thermodynamics with underdamped mechanical motions have always required an explicit identification of even and odd variables. See the recent Refs. [61] and [32] and the references cited within. One of us has introduced a more general stochastic formulation of "underdamped" dynamics, with thermodynamics, in which circulating motion can be a part of a conservative motion [32] without dissipation. The present work is an in-depth study of the OUP within this new framework. It seems to us that even the term "nonequilibrium" in the literature has two rather different meanings: From a classical mechanical standpoint, any system with a stationary current is "nonequilibrium", even though it can be non-dissipative. From a statistical mechanics standpoint, on the other hand, "nonequilibrium", "irreversible", and "dissipative" are almost all synonymous.

Absolute information theory and interpretive information theories
We now discuss two rather different perspectives on the nature of information theory, or theories [62,46]. First, in the framework of classical physics in terms of Newtonian mechanics, Boltzmann's law, and Gibbs' theory of chemical potential, there is a universal First Law of Thermodynamics based on the function $S(E, V, N, \alpha)$, where $S$ is the Boltzmann's entropy of a conservative dynamical system at total energy $E$, e.g., Hamiltonian $H(\{x_i\}, \{y_i\}) = E$, with $V$ and $N = (n_1, n_2, \cdots, n_m)$ being the volume and numbers of particles in the chemomechanical system, and $\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_\nu)$ represents controllable parameters of the system. Then one has
$$dE = T\,dS - p\,dV + \mu\,dN + F_\alpha\,d\alpha, \tag{49}$$
in which $T = \left(\partial S/\partial E\right)^{-1}_{V,N,\alpha}$ is the absolute temperature, and $p$ and $\mu$ are pressure and chemical potential, the corresponding thermodynamic forces for changing volume $V$ and number of particles $N$, respectively. It is natural to suggest that if an agent is able to manipulate a classical system through changing $\alpha$ while holding $S$, $V$, and $N$ constant, then he or she is providing to, or extracting from, the classical system non-mechanical, non-chemical work. It will be the origin of a Maxwell's demon [40].
For an isothermal system, one can introduce Helmholtz's free energy function $A = E - TS$; then Eq. (49) becomes
$$dA = -S\,dT - p\,dV + \mu\,dN + F_\alpha\,d\alpha. \tag{50}$$
And for an isothermal, isobaric information manipulation process without chemical reactions, one has the Gibbs function $G = E - TS + pV$ and $dG = \mu\,dN - S\,dT + V\,dp + F_\alpha\,d\alpha = F_\alpha\,d\alpha$. Note that while the first three terms contain "extensive" quantities $N$, $S$, and $V$, the last term usually does not. It is nanothermodynamic [63]. Note also that for a feedback system that controls $F_\alpha$, one has $\Theta = G - F_\alpha\alpha$ and $d\Theta = -\alpha\,dF_\alpha$. Just as $\mu$ is a function of temperature $T$ in general, so is $F_\alpha$: It has an entropic part [64,65]. This is where the "information" in Maxwell's demon enters thermodynamics. Eqs. (49) and (50), thus, are a grander First Law which now includes feedback information as a part of the conservation [39], with "informatic energy" $F_\alpha\,d\alpha$ on a par with heat energy $T\,dS$, mechanical energy $p\,dV$, and Gibbs' chemical energy $\mu\,dN$. Eq. (49) is the theory of absolute information in connection to controlling $\alpha$.
In engineering and biological research on complex systems, however, the notion of information often has a more subjective meaning, or meanings, usually hidden in the form of a statistical prior [66,67]. One of the best examples, perhaps, is in current cellular biology: Many key biochemical processes inside a living cell are said to be "carrying out cellular signal transduction". Various biochemical activities and changing molecular concentrations are "interpreted" as "intracellular signals" that instruct a cell to respond to its environment. Here, two very different, but complementary, mathematical theories are equally valid: Since nearly all cellular biochemical reactions can be considered at constant temperature and volume, one describes the stochastic biochemical dynamics in terms of Gibbs' theory based on the $\mu$ in Eq. (50). On the other hand, the same stochastic biochemical dynamics described in terms of probability theory can also be represented as an information processing machine with communication channels and transmissions of bits of information, carrying out a myriad of biological functions such as sensing, proofreading, timing, adaptation, and amplifications of signal magnitude, detection sensitivity, and response specificity [48]. The information flow narratives provide bioscientists a higher level of abstraction of a physicochemical reality [49].
Such an interpretive information theory, however, lacks the fundamental character of Eqs. (49) and (50). Still, as a multi-scale, coarse-grained theory, it allows some inequalities to be established [50,51]. It is also noted that changing α can always be further represented mechanistically in terms of changing geometric quantities such as volume, and particle numbers via chemical reactions: the ultimate physical bases of information and its manipulation have to be matter and known forces.
We believe this dual possibility has a fundamental reason, rooted in Kolmogorov's rigorous theory of probability: a probability space is an abstract object with which many different random variables, as measurements, can be associated. At this point, it is interesting to read the preface of [46] written by E. T. Jaynes, who is considered by many to be one of the greatest information theorists since Shannon: "From many years of experience with its applications in hundreds of real problems, our views on the foundations of probability theory have evolved into something quite complex, which cannot be described in any such simplistic terms as 'pro-this' or 'anti-that'. For example, our system of probability could hardly be more different from that of Kolmogorov, in style, philosophy, and purpose. What we consider to be fully half of probability theory as it is needed in current applications -- the principles for assigning probabilities by logical analysis of incomplete information -- is not present at all in the Kolmogorov system." Then, with remarkable candor, Jaynes goes on: "Yet, when all is said and done, we find ourselves, to our own surprise, in agreement with Kolmogorov and in disagreement with its critics, on nearly all technical issues."

Acknowledgements. We thank Professors P. Ao, M. Esposito, and H. Ge for helpful advice, and Ying Tang, Lowell Thompson, Yue Wang, and Felix Ye for many discussions.

A.1 The Jarzynski equality
The basic idea for the derivation is as follows. We first represent a path $X(t)$ by a discrete version with $N$ steps and write the path probability as the product of transition probabilities given by the $\prod_{i=0}^{N}(\cdots)$ in Eq. (51). The mean exponential of the negative work then takes the form of Eq. (51) [24], in which the work from state $(X_i; \alpha_i)$ to state $(X_{i+1}; \alpha_{i+1})$ is defined as the difference in the global $\phi(x; \alpha)$ with a common zero reference. This is a consequence of the First Law of Thermodynamics. Since equilibrium is attained at $t_0$, $p(x_0; t_0) = f^{eq}_X(x_0; \alpha_0)$. With the global $\phi(x; \alpha) = -\epsilon^2 \ln f^{eq}_X(x; \alpha) - \epsilon^2 \ln Z(\alpha)$, and with the free energy defined in Sec. 3.1 as $A_2(\alpha) = -\epsilon^2 \ln Z(\alpha)$, we obtain the Jarzynski equality:

$$\left\langle e^{-W/\epsilon^2} \right\rangle = e^{-\Delta A_2/\epsilon^2},$$

where $\Delta A_2$ is the difference in $A_2(\alpha)$ between the final and the initial values of $\alpha$.

In a very similar vein, for the macroscopic thermodynamic variables $(E, \alpha)$, one defines the work done on the system by the external environment through controlling $\alpha(t)$ with rate $\dot{\alpha}$. Write $\phi_\tau(\alpha) = E(\tau, \alpha)$. The discretized mean exponential of minus work is then expressed through $S(E, \alpha)$, defined in Eq. (43), and combined with the equilibrium probability density function of $E$ at $(E_i, \alpha_i)$. Therefore, the log-mean exponential of minus work is equal to minus the free energy difference.
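As a numerical illustration of the discretized argument above, the following minimal sketch checks the equality for a one-dimensional OU process with a time-dependent stiffness. All specifics here are assumptions made for illustration and are not taken from the paper: the potential $\phi(x;\alpha) = \alpha x^2/2$, the dynamics $dX = -\alpha X\,dt + \sqrt{2}\,\epsilon\,dB_t$, the linear protocol from $\alpha_0$ to $\alpha_1$, and the hypothetical function name `jarzynski_estimate`. For this choice $Z(\alpha) = \sqrt{2\pi\epsilon^2/\alpha}$, so $\Delta A_2 = (\epsilon^2/2)\ln(\alpha_1/\alpha_0)$.

```python
import numpy as np


def jarzynski_estimate(alpha0=1.0, alpha1=2.0, eps=1.0, n_steps=50,
                       total_time=1.0, n_paths=200_000, seed=0):
    """Monte Carlo check of <exp(-W/eps^2)> = exp(-dA/eps^2).

    Illustrative assumptions (not from the paper):
      phi(x; alpha) = alpha * x**2 / 2
      dX = -alpha * X dt + sqrt(2) * eps dB
    so the equilibrium density at fixed alpha is N(0, eps^2 / alpha).
    """
    rng = np.random.default_rng(seed)
    alphas = np.linspace(alpha0, alpha1, n_steps + 1)  # stepwise protocol
    dt = total_time / n_steps

    # Start every path in equilibrium at alpha0.
    x = rng.normal(0.0, eps / np.sqrt(alpha0), size=n_paths)
    work = np.zeros(n_paths)

    for a_old, a_new in zip(alphas[:-1], alphas[1:]):
        # Work step: change alpha at fixed x, W += phi(x; a_new) - phi(x; a_old).
        work += 0.5 * (a_new - a_old) * x**2
        # Relaxation step: exact OU transition at fixed alpha = a_new.
        decay = np.exp(-a_new * dt)
        sigma = eps * np.sqrt((1.0 - decay**2) / a_new)
        x = decay * x + sigma * rng.normal(size=n_paths)

    lhs = np.mean(np.exp(-work / eps**2))             # sampled mean exponential
    delta_A = 0.5 * eps**2 * np.log(alpha1 / alpha0)  # A_2 = -eps^2 ln Z
    rhs = np.exp(-delta_A / eps**2)
    return lhs, rhs, work.mean(), delta_A


if __name__ == "__main__":
    lhs, rhs, mean_work, delta_A = jarzynski_estimate()
    print(f"<exp(-W/eps^2)> = {lhs:.4f},  exp(-dA/eps^2) = {rhs:.4f}")
    print(f"<W> = {mean_work:.4f},  dA = {delta_A:.4f}")
```

Because each relaxation step uses the exact OU transition density, which leaves the instantaneous equilibrium distribution invariant, the discrete equality holds for any step size and only sampling error remains. The mean work, by contrast, exceeds $\Delta A_2$ for a finite-time protocol.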
When the system is in detailed balance, Crooks' approach recovers the Jarzynski equality. If one instead chooses the zero reference of the global energy $\phi$ separately for each equilibrium, i.e., $\Delta A_2 = 0$ for all $\alpha$, then one recovers the Hatano-Sasa equality.