From free energy measurements to thermodynamic inference in nonequilibrium small systems

Fluctuation theorems (FTs), such as the Crooks or Jarzynski equalities (JEs), have become an important tool in single-molecule biophysics where they allow experimentalists to exploit thermal fluctuations and measure free-energy differences from non-equilibrium pulling experiments. The rich phenomenology of biomolecular systems has stimulated the development of extensions to the standard FTs, to encompass different experimental situations. Here we discuss an extension of the Crooks fluctuation relation that allows the thermodynamic characterization of kinetic molecular states. This extension can be connected to the generalized JE under feedback. Finally we address the recently introduced concept of thermodynamic inference or how FTs can be used to extract the total entropy production distribution in nonequilibrium systems from partial entropy production measurements. We discuss the significance of the concept of effective temperature in this context and show how thermodynamic inference provides a unifying comprehensive picture in several nonequilibrium problems.


Introduction
Experiments determine the ultimate fate of scientific theories. Theory must be put to test in experiments, to check whether predictions are met. Moreover, if a theory survives this initial stage, it will rapidly gain widespread acceptance if it spurs new experiments, possibly in situations where measurement had been, at first sight, out of the question. Theories which uncover previously unnoticed connections between observables are especially suited to this aim and equilibrium statistical mechanics offers important examples: the fluctuation-dissipation theorem (FDT) and the Onsager reciprocal relations. Both these results highlight a connection between observables (fluctuation and response in the first case and different transport coefficients in the second) and both of them are of great practical value. From an experimental perspective the FDT offers two different strategies to measure susceptibilities: from fluctuations or through a small perturbation. Experimentalists can freely choose at their convenience. Onsagerʼs reciprocal relations reduce the number of transport coefficients to be measured when characterizing a physical system. The practical value of these results is made even higher by the small number of assumptions on which they rest, essentially the time reversal symmetry of equilibrium dynamics: it is easy to identify the settings in which they apply.
In the last 20 years several new theoretical relations known as 'fluctuation relations' (FRs) have been discovered. The different FRs apply to different physical situations but, in general, they connect the probability of a given trajectory and that of its time-reversed version with the entropy production along the trajectory. Most importantly for our discussion, the FDT and Onsager reciprocal relations are implied by a FR for entropy production in steady states [1]. At the fundamental level, i.e. considering the system in its full phase space picture, FRs hold under simple assumptions on the dynamics. For example the Crooks fluctuation relation (CFR), which applies to irreversible processes between equilibrium states, holds for systems undergoing Hamiltonian dynamics. Such fundamental and general formulations of FRs may not be directly applicable to experiments: in most practical cases the system is observed through some coarse-grained configurational variable (e.g. the end-to-end distance of a single molecule). In these cases FRs are valid if the coarse-grained variables undergo Markovian dynamics. The theoretical development of FRs has been accompanied by intense experimental activity. FRs have been shown to hold in different physical systems, including colloidal particles in optical traps [2], harmonic oscillators [3], single molecules [4], a periodically excited single defect center in a diamond [5], single-electron boxes [6] and quantum systems [7], to cite just a few. Up to now, experiments were mainly designed to test the validity of FRs in different systems and under different conditions. It is now reasonable to assume that, at least for the aforementioned specific systems, FRs do hold. At this point we want to ask a different question: could FRs be used to measure physical quantities that would otherwise be difficult to access in non-equilibrium conditions?
A first answer to this question comes from the single-molecule field where FRs are used to measure free energy differences from non-equilibrium pulling experiments. By manipulating biomolecules at the single-molecule scale, researchers are able to observe rare configurations (e.g. misfolded configurations) which are often difficult to observe by classical bulk methods. In these cases, FRs provide a way to measure thermodynamic parameters from non-equilibrium experiments, something which is impossible by other methods. We shall discuss the state of the art in this kind of measurement in sections 2-4. In section 5 we will bring the discussion to a more general level and will present a tentative but general strategy to use FRs in measurements, which we call inference via the FR. Our starting point will be the following: the violation of FRs in a given setting is itself important information, since it hints that we are probably missing some contribution to the full entropy production. Can the extent of this violation tell us something about the missing entropy? Can we extract meaningful quantitative information from such a violation? These questions are particularly interesting if a 'hidden' entropy source is not directly measurable. This situation is found in many experiments, e.g. in systems with hidden degrees of freedom [8], systems with incomplete detection [9], systems with more than one configurational variable [10] and coarse-grained systems [11]. In section 5 we will show how this inference works on real experiments performed in a dual-trap optical tweezers setup and generalize our results to different experimental situations.

Equilibrium free energies from nonequilibrium pulling experiments
The measurement of free-energy differences in classical thermodynamics is attained by quasi-statically changing the control parameter of a given experimental system and measuring the net amount of energy exchanged between the system and its environment during the process. At macroscopic scales, the experimental output of the thermodynamic manipulation of a system does not significantly change over different repetitions, even when experiments are carried out under non-equilibrium conditions. Samples contain a large number $N$ of molecules and relative fluctuations in the experimental outcome, which are of the order of $1/\sqrt{N}$, are negligible. When the experiment is carried out under non-equilibrium conditions the average work $\langle W \rangle$ evaluated over different realizations of the same experimental protocol is larger than or equal to the free energy difference between the initial and the final states of the protocol, i.e. $\langle W \rangle \geq \Delta G$, as stated by the second law of thermodynamics [12]. The situation is starkly different at the microscopic scale. Advances in nanomanipulation carried out during the last 20 years grant access to events occurring at the single-molecule level ($N \sim 1$). At this microscopic scale, work measurements are of the same order of magnitude as thermal fluctuations, $\sim k_B T$ ($k_B$ is the Boltzmann constant and $T$ the temperature of the environment). These systems are typically known as small systems. Here fluctuations are relevant and different repetitions of the same experimental protocol give different outcomes [13]. The second law of thermodynamics holds on average, but transient violations can be observed in which the work performed on the system is smaller than the free energy difference, $W < \Delta G$. The need to physically characterize such systems boosted non-equilibrium statistical theories, which favored the development of stochastic thermodynamics and the appearance of FRs.
In this contribution we focus on work relations, which are FRs that relate non-equilibrium work measurements with free-energy differences [4,14,15]. We consider thermodynamic processes where both pressure and temperature are kept constant; by convention, we will therefore always refer to variations of the Gibbs free energy, $\Delta G$. We first provide an experimental verification of these relations using single-molecule experiments, by pulling a DNA molecule with optical tweezers. Work relations allow us to extract its free energy of formation $\Delta G_0$, i.e. the free energy difference at zero force between the unfolded and the native folded state. Next, we show how work relations can be extended in order to gain access to the thermodynamic characterization of kinetic states. These are metastable states such as molecular intermediate or misfolded states. They are difficult to observe in full equilibrium conditions, but nevertheless play an important role in many regulatory reactions inside cells.

CFRs and Jarzynski Equality (JE)
Consider a system initially in thermal equilibrium with a value for the control parameter $\lambda(0) = \lambda_0$. An experimental protocol $\lambda(t)$ is applied to the system during a time interval $\tau$ by manipulating the control parameter $\lambda$. At the end of the protocol, the value of the control parameter is $\lambda(\tau) = \lambda_1$ (figure 1(a)). The work measured along one realization of the experiment is:
$$W = \int_0^{\tau} \frac{\partial H}{\partial \lambda}\,\dot{\lambda}\,\mathrm{d}t = \int_{\lambda_0}^{\lambda_1} \frac{\partial H}{\partial \lambda}\,\mathrm{d}\lambda,$$ (1)
where $H$ is the Hamiltonian of the system, which depends on configurational variables $\{x\}$ such as atomic coordinates and the control parameter $\lambda$. When dealing with small systems, the work $W$ measured along the experimental protocol can be significantly different upon different independent realizations of an identical protocol because we do not have control over the microscopic configurations explored by the system (i.e., variables $\{x\}$) and consequently the term $\partial H/\partial \lambda$ in equation (1) varies in each trajectory. Now suppose that the time-reversed experimental protocol is performed: the system is in equilibrium at $\lambda_1$ and the control parameter varies according to $\lambda(\tau - t)$, until reaching $\lambda_0$ (figure 1(b)). An important work relation is the CFR, which reads as [14]:
$$\frac{P_F(W)}{P_R(-W)} = \exp\left(\frac{W - \Delta G}{k_B T}\right),$$ (2)
where $P_F(W)$ is the probability density function of the work performed along a forward protocol, $P_R(-W)$ is the probability density function of the work (with opposite sign) performed along the reversed protocol, and $\Delta G = G(\lambda_1) - G(\lambda_0)$ is equal to the Gibbs free-energy difference between the equilibrium states of the system at $\lambda_1$ and $\lambda_0$, respectively.
The JE [16] can be obtained by multiplying equation (2) by $P_R(-W)\exp(-W/k_B T)$ and integrating over $W$:
$$\left\langle \exp\left(-\frac{W}{k_B T}\right) \right\rangle_F = \exp\left(-\frac{\Delta G}{k_B T}\right),$$ (3)
where $\langle \ldots \rangle_F$ denotes the average over forward trajectories. Even though the JE is a corollary of the CFR, it was derived earlier. A consequence of equation (2) is that the value of the work with identical probability to occur in the forward and the reversed experimental protocols (i.e., $P_F(W) = P_R(-W)$) is equal to the free-energy difference $\Delta G$, and it is always the same no matter how far from equilibrium the system is driven during the experimental process. Another interesting consequence of both the CFR and the JE is that they predict the existence of trajectories where $W < \Delta G$ even though the work average $\langle W \rangle$ evaluated over multiple independent realizations of the experiment is always larger than, or equal to if the protocol is applied slowly enough, the free-energy difference, $\langle W \rangle \geq \Delta G$. This result is the second law of thermodynamics for small systems: for systems subject to stochastic thermodynamics that have few degrees of freedom the second law of thermodynamics is recovered by taking the average over an infinite number of repetitions of the experiment.
The recovery of free-energy differences from irreversible work measurements is possible by applying the JE (equation (3)) to unidirectional work measurements or by applying the CFR (equation (2)) to bidirectional work measurements (when both the forward and the reversed protocols are feasible). Typically, the combination of information from the forward and reversed protocols provides less biased free-energy estimates. However, when dissipation and hysteresis effects between the forward and the reversed processes are high, the work distributions in equation (2) separate from each other and a large error is introduced in the free-energy estimate. A theory of bias is then required to improve results [17].
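As an illustration of the unidirectional route, the JE can be turned into a free-energy estimator in a few lines. The following is a minimal sketch, not the analysis used for the data in this paper: the function name `jarzynski_free_energy` and the work values are hypothetical, and energies are assumed to be expressed in units of $k_B T$.

```python
import math

def jarzynski_free_energy(works, kT=1.0):
    """Estimate Delta G from forward work values using <exp(-W/kT)> = exp(-Delta G/kT)."""
    # Factor out the smallest work value so that the exponentials stay well scaled.
    w_min = min(works)
    acc = sum(math.exp(-(w - w_min) / kT) for w in works)
    return w_min - kT * math.log(acc / len(works))

# Hypothetical work values in units of kT (illustrative only, not measured data).
works = [3.1, 2.4, 4.0, 1.2, 2.9]
dg_estimate = jarzynski_free_energy(works)
mean_work = sum(works) / len(works)
```

By construction the estimate lies between the smallest work value and the average work, consistent with $\langle W \rangle \geq \Delta G$. With few trajectories the exponential average is dominated by the lowest work values, which is one reason why the estimate is biased when dissipation is large.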
Applications of FRs include the measurement of the free energy of formation of RNA and DNA hairpins [18]; the determination of the stability of native domains in proteins [19]; the measurement of mechanical torque in rotary motors [20]; the conversion of information into work in systems under feedback control [21]; the recovery of free energy landscapes from unidirectional work measurements [22,23]; the reconstruction of the free-energy branches for the different molecular states of a system as a function of the control parameter [24,25]; the determination of the free energies of formation of kinetic states [25]; and even the measurement of binding free-energies and equilibrium constants in chemical reactions [26-28].

Experimental validation of the Crooks equality
Here we experimentally test the CFR as was done in 2005 by Delphin Collin and collaborators [18], a study that turned out to be fundamental in establishing how to determine the free energies of formation of molecules from irreversible work measurements [19, 29-31]. Here, we pull a DNA hairpin using optical tweezers (figure 2). Pulling experiments consist of unfolding and folding processes. Hereafter, the unfolding process will be identified with the forward protocol, whereas the folding process will be identified with the reversed one. The dynamics of molecules during a single-molecule experiment can be described through a single collective variable: the end-to-end distance. The mechanical work performed on the system, defined in equation (1), can be directly measured without knowing the internal configurations of the different elements of the experimental system.
In the unfolding process (red-solid trajectory in figure 3(a)), the trap-pipette distance $\lambda$ is initially set to $\lambda_0$, where the molecule is fully equilibrated in its folded-native state N. Next, $\lambda$ is increased at a constant pulling speed $v$ during a time interval $\tau$ ($\dot{\lambda} = \mathrm{d}\lambda/\mathrm{d}t = v$). During this period, the mechanical force applied to the DNA hairpin also increases. At a given stochastic value of the force, the hairpin can no longer withstand the force and it unfolds. This is observed as an abrupt drop in force that corresponds to the relaxation of the bead into the center of the optical trap due to the release of ssDNA associated with the unfolding of the hairpin. Hopping events between states N and U are occasionally observed along a given trajectory. Regardless of the molecular state of the hairpin (folded in the native conformation N or unfolded in the stretched conformation U), $\mathrm{d}\lambda/\mathrm{d}t$ equals $v$ until the value $\lambda_1$ is reached at time $\tau$, where the protocol stops and the molecule remains in state U. According to equation (1), the work measured along an unfolding trajectory is:
$$W = \int_{\lambda_0}^{\lambda_1} f\,\mathrm{d}\lambda,$$ (4)
where $f$ is the force applied to the molecule. This is equal to the area below the force-distance curve (FDC) between $\lambda_0$ and $\lambda_1$ (figure 3(a)), and its value is different for each trajectory.
In the refolding process the time-reversed protocol $\lambda(\tau - t)$ is applied (blue-dashed trajectory in figure 3(a)). Therefore, the trap-pipette distance $\lambda$ is initially set to $\lambda_1$, where the molecule is equilibrated in state U. Next, $\lambda$ is decreased at the constant pulling speed $-v$ ($\dot{\lambda} = \mathrm{d}\lambda/\mathrm{d}t = -v$) during the time interval $\tau$ until it reaches the value $\lambda_0$, where the protocol ends and the molecule equilibrates in state N. Along the folding process, the force applied to the DNA hairpin decreases. When it reaches a sufficiently low value, the molecule folds and a jump in force is observed. The work in a given folding trajectory is measured as:
$$W = \int_{\lambda_1}^{\lambda_0} f\,\mathrm{d}\lambda,$$ (5)
which is equal to the area below the FDC with a negative sign (figure 3(a)). Again, the value of $W$ is different for each trajectory.
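In practice the work along a trajectory is obtained by numerically integrating the recorded force over the trap-pipette distance. The sketch below uses the trapezoidal rule; the function name `work_from_fdc` and the sample points are hypothetical and are not taken from the experiments described here.

```python
def work_from_fdc(lam, force):
    """Trapezoidal estimate of W = integral of f d(lambda) along one trajectory."""
    if len(lam) != len(force) or len(lam) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    return sum(0.5 * (force[i] + force[i + 1]) * (lam[i + 1] - lam[i])
               for i in range(len(lam) - 1))

# Hypothetical trajectory: distance in nm and force in pN, so W is in pN nm.
lam = [0.0, 10.0, 20.0, 30.0]
force = [5.0, 10.0, 15.0, 12.0]

w_unfold = work_from_fdc(lam, force)            # lambda_0 -> lambda_1
w_fold = work_from_fdc(lam[::-1], force[::-1])  # same curve traversed backwards
```

Traversing the same curve from $\lambda_1$ to $\lambda_0$ changes the sign of the result, which is the sign convention used for the folding work. In a real experiment the unfolding and folding curves differ because of hysteresis, so the two works are not simply opposite.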
Figure 3(b) shows the experimental $P_F(W)$ and $P_R(-W)$ measured by pulling the hairpin at two different pulling speeds. It can be observed that, even though hysteresis effects (and therefore dissipation) increase with the pulling speed, the work value at which the two distributions cross each other does not depend on $v$. According to the CFR, such value is equal to $\Delta G_{NU}$, since at $\lambda_0$ the molecule is equilibrated in state N whereas at $\lambda_1$ the system is in equilibrium in state U. The measurement of the crossing point of work distributions obtained at 60 and 180 nm s$^{-1}$ gives $\Delta G_{NU} = 335 \pm 1$ and $336 \pm 1\ k_B T$, respectively (figure 3(b)).
A validation of the CFR is shown in figure 3(c), where the logarithm of the ratio between the probabilities $P_F(W)$ and $P_R(-W)$ is plotted as a function of the work $W$ [32] for data obtained by pulling the hairpin at 60 and 180 nm s$^{-1}$ (red squares and blue circles, respectively). The linear fit to the experimental data gives a slope equal to 0.95 ± 0.05, which is in excellent agreement with the theoretical prediction provided by equation (2),
$$\log\frac{P_F(W)}{P_R(-W)} = \frac{W - \Delta G}{k_B T},$$ (6)
which implies that the slope equals 1 when the work is measured in units of $k_B T$. In addition, from this linear fit we can measure $\Delta G_{NU}$ as the value of the work at which $\log\left(P_F(W)/P_R(-W)\right) = 0$, which is essentially equivalent to determining the work value where $P_F(W) = P_R(-W)$. Yet another verification can be obtained by rewriting the CFR as:
$$P_F(W) = P_R(-W)\exp\left(\frac{W - \Delta G}{k_B T}\right).$$ (7)
Accordingly, if we multiply the experimentally measured reversed work distribution by $\exp\left((W - \Delta G)/k_B T\right)$ we should get the forward work distribution $P_F(W)$ [10,17]. This is shown in figure 3(d) for the work distributions measured at 60 and 180 nm s$^{-1}$ (squares and circles, respectively). There, the reversed work distribution obtained at 60 nm s$^{-1}$ (solid blue histogram in figure 3(b)) has been multiplied by $\exp\left((W - \Delta G_{NU})/k_B T\right)$, thus obtaining the empty squares in figure 3(d), which are in good agreement with the experimentally measured forward work distribution at 60 nm s$^{-1}$ (solid squares in figure 3(d) and solid red line in figure 3(b)). In addition, the term $\exp\left((W - \Delta G_{NU})/k_B T\right)$ allows us to infer the shape of the left-most tails of the forward work distribution obtained at 60 nm s$^{-1}$. An identical approach is performed for work measurements obtained at 180 nm s$^{-1}$ (circles in figure 3(d)). Notably, both values of $\Delta G_{NU}$ are in good agreement with the two previous estimators.
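The reweighting test can be illustrated on a model where the answer is known exactly. The sketch below assumes Gaussian work distributions with a common variance and a mean dissipated work of $\sigma^2/(2 k_B T)$ in both directions, which is the condition under which Gaussians satisfy the CFR; all names and numbers are hypothetical and energies are in units of $k_B T$.

```python
import math

def gaussian_pdf(x, mean, var):
    """Normal probability density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def reweight_reverse(p_reverse, w, dg, kT=1.0):
    """CFR prediction for P_F(W): P_R(-W) * exp((W - dG) / kT)."""
    return p_reverse * math.exp((w - dg) / kT)

# Hypothetical model: Delta G = 10 kT and work variance 4 kT^2.
DG, VAR = 10.0, 4.0
MEAN_F = DG + VAR / 2.0        # mean of P_F(W)
MEAN_MINUS_R = DG - VAR / 2.0  # mean of P_R(-W)

checks = []
for w in (6.0, 10.0, 14.0):
    predicted = reweight_reverse(gaussian_pdf(w, MEAN_MINUS_R, VAR), w, DG)
    checks.append((w, predicted, gaussian_pdf(w, MEAN_F, VAR)))
```

Each entry of `checks` pairs the reweighted reverse density with the forward density at the same work value; in this idealized model they coincide up to rounding. With experimental histograms the comparison is limited by the statistics in the tails.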
Finally, one could describe the work distributions obtained in figure 3(b) as Gaussian functions. However, it must be stressed that this is a particular result for this hairpin and not a general consequence of non-equilibrium single-molecule experiments. It can be mathematically proved that Gaussian work distributions satisfying the CFR (equation (2)) must fulfill the following relation:
$$\Delta G = \pm\langle W \rangle \mp \frac{\sigma^2}{2 k_B T},$$ (8)
where $\langle W \rangle$ is the average work over trajectories and $\sigma^2$ is the variance of the work distribution. The signs above (+ in $\langle W \rangle$ and − in $\sigma^2$) are used when extracting the free energy difference $\Delta G_{NU}$ from forward work measurements, while the signs below (− in $\langle W \rangle$ and + in $\sigma^2$) are used when extracting $\Delta G_{NU}$ from reverse work measurements.
In order to extract the free energy difference between states N and U at zero force, $\Delta G_0$, we need to subtract the elastic contributions due to stretching the handles and the ssDNA, displacing the bead in the optical trap, and orienting the hairpin double helix. In the example depicted we get $\Delta G_0 = 50 \pm 4\ k_B T$.
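The Gaussian relation between the mean work, its variance and the free-energy difference translates directly into an estimator. The following sketch assumes that the work values are well described by a Gaussian, which the text stresses is not generally the case; the function name and the sample values are hypothetical and energies are in units of $k_B T$.

```python
import statistics

def gaussian_free_energy(works, kT=1.0, reverse=False):
    """Delta G = <W> - var/(2 kT) for forward works, -<W> + var/(2 kT) for reverse works."""
    mean = statistics.fmean(works)
    var = statistics.variance(works)  # unbiased sample variance
    if reverse:
        return -mean + var / (2.0 * kT)
    return mean - var / (2.0 * kT)

# Hypothetical work values (illustrative only). Reverse works carry their own sign,
# i.e. they are negative when energy is recovered during refolding.
forward_works = [11.0, 13.0, 12.0, 10.0, 14.0]
reverse_works = [-9.0, -7.0, -8.0, -10.0, -6.0]

dg_from_forward = gaussian_free_energy(forward_works)
dg_from_reverse = gaussian_free_energy(reverse_works, reverse=True)
```

The two estimates need not coincide for small samples; their difference gives a rough indication of how well the Gaussian description and the sampling support the result.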

The extended fluctuation relation (EFR)
Standard work relations allow us to measure free-energy differences between a final state and an initial state of the system along an experimental protocol. A requirement of standard FRs is that the initial states in both the forward and the reversed protocols are sampled in full equilibrium conditions. This is a limitation if one wants to measure free-energy differences between states that are difficult to observe in full equilibrium conditions and that are only transiently sampled in non-equilibrium experiments, such as intermediates or misfolded molecular states. The thermodynamic characterization of such states is interesting because of their crucial role in the fate of many molecular reactions, for instance protein and peptide-nucleic acid binding, specific cation binding, antigen-antibody interactions, transient states in enzymatic reactions or the formation of transient intermediates and non-native structures in molecular folders.
Here we show that it is possible to extend the CFR in order to overcome this limitation and recover free energy differences for kinetic molecular states that can be observed in partial equilibrium conditions along a non-equilibrium protocol. In what follows, we define a 'kinetic state' as a partially equilibrated region $\mathcal{S}'$ of the configurational space, meaning that configurations inside each region are sampled according to the Boltzmann-Gibbs equilibrium distribution restricted to such region [12]. In contrast, the statistical weights of the different regions $\mathcal{S}'$ do not necessarily follow an equilibrium distribution. Partial equilibrium can be mathematically described as:
$$P^{\rm eq}_{\lambda,\mathcal{S}'}(x) = P^{\rm eq}_{\lambda}(x)\,\chi_{\mathcal{S}'}(x)\,\frac{Z_{\lambda}}{Z_{\lambda,\mathcal{S}'}},$$ (9)
where $P^{\rm eq}_{\lambda}(x)$ is the equilibrium distribution over the full configurational space, $\chi_{\mathcal{S}'}(x)$ is equal to one if $x \in \mathcal{S}'$ and zero otherwise, $Z_{\lambda}$ is the partition function of the system at $\lambda$, and $Z_{\lambda,\mathcal{S}'}$ is the partition function restricted to the region $\mathcal{S}'$ [33]. In the case of biomolecules, the configurational space can be considered to be partitioned into different molecular kinetic states, such as the native conformation, intermediate and misfolded states, or the unfolded conformation. As a result, during a pulling experiment we assume that the molecule follows a sequence of kinetic states that determines its trajectory. Because of thermal fluctuations and the stochastic nature of small systems, each independent realization of a pulling experiment may result in a different trajectory. Hence, the molecule does not necessarily follow the same sequence of kinetic states for different realizations of the identical protocol.
Let A and B denote any two kinetic states of a thermodynamic system and let $\lambda$ denote the control parameter. In a forward process the system starts in partial equilibrium and $\lambda$ varies from $\lambda_0$ to $\lambda_1$ during a time $\tau$ according to a predetermined protocol $\lambda(t)$. In the time-reversed process the system is initially set in partial equilibrium and $\lambda$ varies from $\lambda_1$ to $\lambda_0$ according to the time-reversed protocol $\lambda(\tau - t)$. In this situation, different kinetic states can be accessed by the system at the beginning of both the forward and the reversed protocols, and consequently different trajectories connecting different kinetic states can be experimentally observed. For the trajectories that connect the kinetic state A at the beginning of the forward protocol with the kinetic state B at the beginning of the reversed protocol the EFR reads as [24,25]:
$$\frac{\phi_F^{A \to B}}{\phi_R^{A \leftarrow B}}\,\frac{P_F^{A \to B}(W)}{P_R^{A \leftarrow B}(-W)} = \exp\left(\frac{W - \Delta G_{AB}}{k_B T}\right),$$ (10)
where $\Delta G_{AB} = G_B(\lambda_1) - G_A(\lambda_0)$ is the free energy difference between kinetic states B at $\lambda_1$ and A at $\lambda_0$; $P_F^{A \to B}(W)$ and $P_R^{A \leftarrow B}(-W)$ denote the partial work distributions for the forward trajectories that start in A and end in B and for the reversed trajectories that start in B and end in A, respectively; and $\phi_F^{A \to B}$ and $\phi_R^{A \leftarrow B}$ are the fraction of paths starting in A (or B) at $\lambda_0$ (or $\lambda_1$) and ending in B (or A) at $\lambda_1$ (or $\lambda_0$). The EFR implies that the work value at which the forward and the reversed work histograms cross each other ($P_F^{A \to B}(W) = P_R^{A \leftarrow B}(-W)$) is no longer equal to the free energy difference of the system at $\lambda_1$ and $\lambda_0$ but it is equal to $\Delta G_{AB} + k_B T \log\left(\phi_F^{A \to B}/\phi_R^{A \leftarrow B}\right)$. If equation (10) is multiplied by $P_R^{A \leftarrow B}(-W)\exp(-W/k_B T)$ and the resulting expression is integrated over $W$ one gets an extended version of the JE for kinetic states,
$$\left\langle \exp\left(-\frac{W}{k_B T}\right) \right\rangle_F^{A \to B} = \frac{\phi_R^{A \leftarrow B}}{\phi_F^{A \to B}}\exp\left(-\frac{\Delta G_{AB}}{k_B T}\right),$$ (11)
where $\langle \ldots \rangle_F^{A \to B}$ denotes the average over forward trajectories that start in A and end in B. There are two main differences between the EFR in equation (10) and the CFR in equation (2). First, the use of partial work distributions in the EFR implies that from all the measured forward (reversed) trajectories, only those starting in state A (B) and ending in state B (A) are selected. Second, the presence of the prefactor $\phi_F^{A \to B}/\phi_R^{A \leftarrow B}$ introduces a correction into the Crooks estimation of the free-energy difference between kinetic states.
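The extended JE for kinetic states differs from the standard estimator only by the logarithm of the ratio of path fractions. A minimal sketch follows, assuming that the forward works of the selected A-to-B trajectories and the two fractions have already been measured; the function name and the numerical values are hypothetical, and energies are in units of $k_B T$.

```python
import math

def efr_free_energy(works_ab, phi_forward, phi_reverse, kT=1.0):
    """Delta G_AB from forward A->B works, corrected by the path fractions.

    Uses <exp(-W/kT)>_(A->B) = (phi_reverse / phi_forward) * exp(-Delta G_AB / kT).
    """
    if not (0.0 < phi_forward <= 1.0 and 0.0 < phi_reverse <= 1.0):
        raise ValueError("path fractions must lie in (0, 1]")
    # Exponential average computed relative to the smallest work for numerical safety.
    w_min = min(works_ab)
    acc = sum(math.exp(-(w - w_min) / kT) for w in works_ab) / len(works_ab)
    uncorrected = w_min - kT * math.log(acc)
    return uncorrected + kT * math.log(phi_reverse / phi_forward)

# Hypothetical inputs: works of trajectories going from A to B, and path fractions.
works_ab = [5.0, 5.0, 5.0]
dg_ab = efr_free_energy(works_ab, phi_forward=0.5, phi_reverse=1.0)
dg_ab_full_eq = efr_free_energy(works_ab, phi_forward=1.0, phi_reverse=1.0)
```

When both fractions are equal to one the correction vanishes and the standard JE estimate is recovered, in line with the CFR being a particular case of the EFR.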
Notably, the EFR is a generalization of the CFR (equation (2)), since equilibrium is a particular case of partial equilibrium: if the forward and reverse protocols start in full equilibrium at states A and B, respectively, the two fractions $\phi_F^{A \to B}$ and $\phi_R^{A \leftarrow B}$ are equal to 1 and hence the CFR is recovered from equation (10). However, in partial equilibrium conditions the omission of the prefactor $\phi_F^{A \to B}/\phi_R^{A \leftarrow B}$ leads to systematically biased results for the free-energy differences between different kinetic states [24]. We emphasize that for the case of kinetic structures that apparently behave reversibly under the protocol, $\Delta G_{AB}$ is not just equal to the work measured during the experiment, since the term containing $\phi_F^{A \to B}/\phi_R^{A \leftarrow B}$ must also be taken into account.
3. We find the partial work distributions for each corresponding set of forward and reversed trajectories (figure 4).
4. By applying the EFR, we find $\Delta G_{NN}$ and $\Delta G_{NU}$ using the partial work distributions and the corresponding prefactors.
The free-energy difference $\Delta G_{NN}$ or $\Delta G_{NU}$ as a function of the control parameter $\lambda_1$ is usually referred to as the free-energy branch of state N or U, respectively. In figure 4(c) we show the free-energy branches $\Delta G_{NN}(\lambda_1)$ and $\Delta G_{NU}(\lambda_1)$ obtained using the EFR for the two different pulling speeds. In both cases, the free energy of state N at $\lambda_0 = -40$ nm is taken as the reference energy. As expected, the profile of the free-energy branches does not depend on the speed of the pulling protocol.
For a better visualization, in figure 5(a) we plot the free-energy branches for states N and U taking as the reference energy the full free energy $\Delta G$ of the system at each value of $\lambda_1$, defined as:
$$\Delta G(\lambda_1) = -k_B T \log\left[\exp\left(-\frac{\Delta G_{NN}(\lambda_1)}{k_B T}\right) + \exp\left(-\frac{\Delta G_{NU}(\lambda_1)}{k_B T}\right)\right].$$ (12)
If the prefactors are omitted instead, the free-energy branches depend on the pulling speed, especially for state U. Moreover, these results suggest that the stability of the hairpin is always dominated by state U under pulling experiments (i.e., the free-energy branches for N and U do not cross at any value of $\lambda_1$). Hence, it is observed that the use of the EFR and the presence of the prefactors ($\phi_F^{A \to B}/\phi_R^{A \leftarrow B}$) in equation (10) are required to properly recover the thermodynamic stabilities of the two states.

The EFR and feedback protocols
In recent years much attention has been devoted to thermodynamic transformations involving feedback. These transformations, instead of using a fixed protocol $\lambda(t)$, choose among different protocols depending on the evolution of the system. A simple example of a pulling experiment with feedback performed on a DNA hairpin would be the following (figure 6): we start at time $t = 0$ at a low force and with the molecule in state N; we pull at a constant speed $v$ until time $t = t_m$, where $\lambda = \lambda_m > \lambda_0$, and a measurement is performed on the molecule. If the molecule is still folded the pulling goes on until $\lambda(t) = \lambda_1$ with the same pulling speed $v$. If the molecule is unfolded the pulling speed changes to $v' > v$ and the pulling still goes on to the final value of the control parameter $\lambda_1$. Such experiments can readily be implemented in an optical tweezers setup. The fact that the pulling speed is raised only if the molecule is found in the unfolded state will prevent temporary refolding events.
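Under feedback, we expect the JE not to be fulfilled as, on average, we are decreasing the work needed to unfold the molecule. The EFR gives us a method to quantify the violation of the JE. We will have to consider free-energy differences for different values of $\lambda$ so we extend the previous notation to:
$$\Delta G_{AB}^{\lambda_0,\lambda_1} = G_B(\lambda_1) - G_A(\lambda_0),$$ (13)
where $\Delta G_{AB}^{\lambda_0,\lambda_1}$ is the free-energy difference between a system in partial equilibrium in state A at $\lambda_0$ and a system in partial equilibrium in state B at $\lambda_1$. For free energies conditioned only on the state at the start (or end) of the protocol we will use the symbol $\Delta G_{A\cdot}^{\lambda_0,\lambda_1}$ ($\Delta G_{\cdot B}^{\lambda_0,\lambda_1}$). The EFR enables us to consider conditional averages, where the condition is on the trajectory of the system. We could for example condition the path average, so that the molecule is in a given state when $\lambda = \lambda_m$. This amounts to inserting a term $\chi_A(x_{t_m})$ in the average, where $\chi_A(x)$ is equal to one if $x \in A$ and zero otherwise, and $x_t$ denotes the configurational variable at time $t$. As a first exercise we will consider the standard JE and write it as a sum over contributions conditioned to visiting a given state at time $t_m$:
$$\left\langle e^{-W/k_B T} \right\rangle = \sum_A \left\langle \chi_A(x_{t_m})\,e^{-W/k_B T} \right\rangle = \sum_A \phi_{\cdot A}^{\lambda_0 \to \lambda_m} \left\langle e^{-W_{0,m}/k_B T} \right\rangle_{\cdot A} \left\langle e^{-W_{m,1}/k_B T} \right\rangle_{A \cdot},$$ (14)
where the sum is taken over disjoint sets partitioning all the phase space; $W_{i,j}$ denotes the work performed in the interval $[\lambda_i, \lambda_j]$; $\phi_{\cdot A}^{\lambda_0 \to \lambda_m}$ is the fraction of trajectories which start at equilibrium at $\lambda_0$ and end in state A at $\lambda_m$; and $\langle \ldots \rangle_{\cdot A}$ ($\langle \ldots \rangle_{A \cdot}$) denotes an average conditioned to ending (starting) in state A. Using the EFR (equation (11)) we can compute explicitly the two conditional averages:
$$\left\langle e^{-W_{0,m}/k_B T} \right\rangle_{\cdot A} = \frac{1}{\phi_{\cdot A}^{\lambda_0 \to \lambda_m}}\,e^{-\Delta G_{\cdot A}^{\lambda_0,\lambda_m}/k_B T}, \qquad \left\langle e^{-W_{m,1}/k_B T} \right\rangle_{A \cdot} = \phi_{\cdot A}^{\lambda_1 \to \lambda_m}\,e^{-\Delta G_{A \cdot}^{\lambda_m,\lambda_1}/k_B T},$$ (15)
where $\phi_{\cdot A}^{\lambda_1 \to \lambda_m}$ is the fraction of trajectories which start at equilibrium at $\lambda_1$ and arrive at A at $\lambda_m$ in the reverse protocol. Inserting these expressions in equation (14), and noting that $\Delta G_{\cdot A}^{\lambda_0,\lambda_m} + \Delta G_{A \cdot}^{\lambda_m,\lambda_1} = \Delta G$ and that the fractions $\phi_{\cdot A}^{\lambda_1 \to \lambda_m}$ sum to one, we recover the result of the JE (equation (3)). In this first exercise we have written the exponential average of the work as a sum of conditional averages and have then recovered the JE by computing the conditional averages using the EFR. We will now follow a similar strategy to compute the exponential average of the work under feedback.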
We consider again the feedback protocol mentioned earlier in this section: up to $\lambda_m$ the pulling speed will be constant and equal to $v$; at $\lambda(t) = \lambda_m$ we perform a measurement and change the pulling speed to $v_A$ depending on the measurement outcome. We will consider the exponential average of the work and break it again, as in equation (14), into conditional contributions:
$$\left\langle e^{-W/k_B T} \right\rangle_{\rm fb} = \sum_A \phi_{\cdot A}^{\lambda_0 \to \lambda_m} \left\langle e^{-W_{0,m}/k_B T} \right\rangle_{\cdot A} \left\langle e^{-W_{m,1}/k_B T} \right\rangle_{A \cdot,\,v_A}.$$ (16)
By conditioning the trajectory on the state of the system at the moment of the measurement, $\lambda(t) = \lambda_m$, we are able to write the exponential work average in the feedback protocol as a sum of conditional averages with standard protocols. The only difference with the case of the standard JE is that the second conditional average now depends on the measurement outcome through the pulling speed, as denoted by the subscript $v_A$. We can now perform the same steps as in the previous computation (equation (15)) and get:
$$\left\langle e^{-W/k_B T} \right\rangle_{\rm fb} = e^{-\Delta G/k_B T} \sum_A \phi_{\cdot A}^{\lambda_1 \to \lambda_m}(v_A).$$ (17)
The symbol $\phi_{\cdot A}^{\lambda_1 \to \lambda_m}(v_A)$ denotes the fraction of trajectories starting at equilibrium at $\lambda_1$ and arriving at A at $\lambda_m$ in a reverse protocol with pulling speed $v_A$. In the previous computations these fractions summed to one: they were evaluated using the same pulling speed. Here they do not sum to one anymore: each fraction is evaluated using a different pulling speed which depends on the folding state A. The sum of these fractions quantifies the violation of the JE by feedback-based protocols. The reader familiar with the theory of fluctuation theorems (FTs) in the presence of feedback will recognize in this sum the parameter introduced in [34]. Summing up, we have used conditional averages to connect the theory of feedback protocols and that of the EFR, in an effort to develop unifying concepts in the rapidly expanding field of FRs.

Thermodynamic inference
To put the discussion in perspective let us suppose we have two optical traps focused in a microfluidics chamber filled with water in a dual-trap optical tweezers setup. A DNA molecule is then tethered between two beads captured in the optical traps, forming a dumbbell (figure 7(a)). The molecule is pulled by moving one optical trap while the other remains at rest in the reference frame of water. The two optical traps can measure forces so in principle one could measure the work using the force in either of the two traps. The question is whether both forces yield equivalent work measurements or not. This problem has been addressed in much detail in [10] where we combined theory and experiments to demonstrate that the force measured in the moving trap (with respect to the reference frame of water) is the one that must be used to extract the correct mechanical work $W$ according to the usual definition in stochastic thermodynamics (equation (1)), or its extension in the presence of a mean flow [35]. In contrast, the force measured in the trap at rest provides an incorrect work measurement $W'$ which does not satisfy the FT (figure 7(b)). The difference between the two works, $W_+ = W - W'$, is equal to the energy dissipated by the center of mass of the dumbbell. In the over-damped limit, which applies to our setup, this energy is determined by the hydrodynamic coefficient of the dumbbell $\gamma_+$, the speed of the moving trap $v$ and the duration of the pull $t$. We will call $W'$ a partial work measurement because it misses a part ($W_+$) of the full work exerted, $W = W' + W_+$, and thus a part of the total entropy production.
Let us now suppose that we are in a situation where we can only measure the work in the trap at rest, $W'$, rather than $W$. This is not a purely hypothetical scenario, as several dual-trap setups can only measure the work in the trap at rest for technical reasons [36][37][38]. In this case we should not apply the CFR or the JE to extract free-energy differences. In particular, the CFR would not be satisfied and the JE applied to $W'$ would underestimate free-energy differences, in apparent violation of the second law. The question remains whether we can infer the full work distribution $P(W)$ from the partial one, $P'(W')$. The answer is positive: for symmetric dumbbells one can show that by shifting all measured partial work values $W'$ by a constant $\Delta$, $W = W' + \Delta$, it is possible to adjust the value of $\Delta$ to infer the $P(W)$ that satisfies the FR (figures 6(c) and (d)). From the inferred $P(W)$ we can now extract free-energy differences. Moreover the value of $\Delta = \langle W_+ \rangle$ provides a measure of the missing dissipation due to the Stokes friction experienced by the dumbbell. From $\Delta$ we can then infer the value of the hydrodynamic coefficient of the dumbbell, $\gamma_+$, a quantity that can also be extracted by measuring equilibrium fluctuations of the center of mass of the dumbbell, but which then requires simultaneous tracking of the beads in both traps. The above example provides a case of thermodynamic inference: by only measuring $W'$ values we can infer the correct work distribution $P(W)$ and from that the value of $\Delta G$ using FRs (figure 7(c)). For asymmetric dumbbells the inference procedure is more elaborate but still possible.
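As a rough illustration of the shift-based inference described above, the following Python sketch treats the simplest case in which the forward and reverse work distributions are Gaussian with a common variance. Under that assumption (ours, made for illustration, and not a description of the procedure used in [10]) the Crooks relation requires $\sigma^2 = k_BT(\langle W\rangle_F + \langle W\rangle_R)$, so a common shift $\Delta$ applied to both sets of partial work values can be solved for in closed form. The function names, the synthetic numbers and the use of NumPy are hypothetical choices.

```python
import numpy as np

def infer_shift_gaussian(w_partial_fwd, w_partial_rev, kT=1.0):
    """Infer the shift Delta and Delta G from partial work samples.

    Assumes (for illustration only) Gaussian forward/reverse work
    distributions with equal variance and the same shift Delta in both
    directions, W = W' + Delta.
    """
    wf = np.asarray(w_partial_fwd, dtype=float)
    wr = np.asarray(w_partial_rev, dtype=float)
    mean_f, mean_r = wf.mean(), wr.mean()
    # A constant shift leaves the variance unchanged, so pool both estimates.
    var = 0.5 * (wf.var(ddof=1) + wr.var(ddof=1))
    # Gaussian Crooks condition: var = kT * (<W>_F + <W>_R), <W> = <W'> + Delta.
    delta = 0.5 * (var / kT - mean_f - mean_r)
    # Crossing point of P_F(W) and P_R(-W) for equal-variance Gaussians.
    delta_g = 0.5 * (mean_f - mean_r)
    return delta, delta_g

def jarzynski_estimate(w, kT=1.0):
    """Free-energy estimate -kT ln <exp(-W/kT)> from work samples."""
    w = np.asarray(w, dtype=float)
    return -kT * np.log(np.mean(np.exp(-w / kT)))

# Synthetic data in units of kT (all numbers are made up for illustration).
rng = np.random.default_rng(0)
true_dg, mean_dissipation, true_delta, n = 10.0, 2.0, 0.5, 200_000
sigma = np.sqrt(2.0 * mean_dissipation)      # Gaussian FT: var = 2 kT <W_dis>
w_fwd = rng.normal(true_dg + mean_dissipation, sigma, n)    # full forward work
w_rev = rng.normal(-true_dg + mean_dissipation, sigma, n)   # full reverse work
wp_fwd, wp_rev = w_fwd - true_delta, w_rev - true_delta     # partial work W'

delta_hat, dg_hat = infer_shift_gaussian(wp_fwd, wp_rev)
dg_partial = jarzynski_estimate(wp_fwd)               # JE applied directly to W'
dg_inferred = jarzynski_estimate(wp_fwd + delta_hat)  # JE applied to inferred W
```

In this toy setting the JE applied directly to $W'$ falls below the value obtained after the shift by exactly the inferred $\Delta$, which mirrors the underestimation of free-energy differences mentioned in the text. For non-Gaussian data the shift would have to be adjusted numerically until the FR is satisfied.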

The inference problem
The general setting for the inference problem is illustrated in figure 8 for the case of non-equilibrium steady states. System A (for Agent) produces a total entropy $S_t$ during a time interval $t$. System A is observed via a second system, e.g. a detector T such as an optical trap, coupled to A via a noisy channel C. Measurements on T report a partial entropy production $S'_t$ with

$$P_T(S'_t) = \int dS_t\, P_C(S'_t|S_t)\, P_A(S_t) \tag{18}$$

where $P_A(S_t)$ is the probability of the system A to produce a total entropy $S_t$, $P_C(S'_t|S_t)$ is the transfer function of the noisy channel, i.e. the probability of measuring an entropy production $S'_t$ given that the total entropy production is $S_t$, and $P_T(S'_t)$ is the observed distribution of entropy production. The FT holds for $S_t$:

$$\frac{P_A(S_t)}{P_A(-S_t)} = \exp\left(\frac{S_t}{k_B}\right) \tag{19}$$

but not, in general, for $S'_t$. The inference problem can be stated as follows: can we infer $P_A(S_t)$ from a measurement of $P_T(S'_t)$ under the additional assumption that the former satisfies a FT? As we shall see, in many practical cases the answer is positive, and the inference process also serves as a means to characterize the transfer function $P_C$ of the noisy channel. In what follows, for the sake of generality, the inference problem is formulated in an abstract setting; however, it applies to several experimental situations in stochastic thermodynamics. In the case of the dumbbell discussed in the previous section, the agent A, producing entropy, is the trap which is moved with respect to water. The noisy channel is the dumbbell and the detector is the trap at rest with respect to water. A second setting for the inference problem is the field of molecular motors: in these systems the total entropy production has at least two contributions, one from translocation against an applied force and one from ATP hydrolysis. Modern experimental systems allow measurements of the former contribution, while the latter is not observable at the single-molecule level.
Is it possible to extract useful information about the mechano-chemical step by observing translocation and assuming a FT for the total entropy production? In this case the agent A is the motor that injects power on a substrate through ATP hydrolysis; the noisy channel C is the mechano-chemical coupling, i.e. the stochastic coupling between entropy production and translocation; and T is the device used to measure translocation under an applied force, be it an optical or a magnetic trap. Inference problems also arise beyond the single-molecule field: in [45] the authors consider a situation in which the current flowing through a quantum dot is monitored using a capacitively coupled quantum point contact. These experiments provide another setting for the inference process. More recently, calorimetric work measurements on two-state quantum systems have been considered [9]. Here, work is estimated through a measurement of photon exchange between the system and the baths. If, as is realistic, some of the photons exchanged with the baths remain unrecorded, only a partial work measurement is available. Here the agent A is the two-state system, the channel C is the detection efficiency of photons and the detector T is the calorimeter, setting the stage for an inference of the total work (or entropy production).

Inference close to equilibrium: the Gaussian case
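Inference is not possible in general: special forms for $P_C(S'_t|S_t)$ are necessary. Close to equilibrium one can assume the different probability distributions $P_A(S_t)$, $P_T(S'_t)$ and $P_C(S'_t|S_t)$ to be approximately Gaussian. In this case:

$$P_A(S_t) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \exp\left(-\frac{(S_t - \langle S_t\rangle)^2}{2\sigma_s^2}\right) \tag{20}$$

$$P_T(S'_t) = \frac{1}{\sqrt{2\pi\sigma'^2_s}} \exp\left(-\frac{(S'_t - \langle S'_t\rangle)^2}{2\sigma'^2_s}\right) \tag{21}$$

where $\langle S_t\rangle$, $\langle S'_t\rangle$ and $\sigma_s^2$, $\sigma'^2_s$ are the means and variances of both Gaussian distributions. We also assume that the noisy channel is Gaussian, with variance $\sigma_C^2$ and a mean shifted by $\Delta_C$ with respect to $S_t$ (equation (22)). We have in mind a situation in which only the distribution of $S'_t$ is measurable. From equations (18), (20)-(22) one gets the mean and the variance of the measured distribution (equations (23) and (24)): the mean $\langle S'_t\rangle$ differs from $\langle S_t\rangle$ by the shift $\Delta_C$, and the variances add, $\sigma'^2_s = \sigma_s^2 + \sigma_C^2$. We assume $P_A(S_t)$ to satisfy the FT, equation (19), and therefore $\sigma_s^2 = 2k_B\langle S_t\rangle$. In contrast, $P_T(S'_t)$ does not fulfill the FT, equation (19). In fact, if one calculates the ratio $P_T(S'_t)/P_T(-S'_t)$ one gets:

$$\frac{P_T(S'_t)}{P_T(-S'_t)} = \exp\left(\frac{x S'_t}{k_B}\right), \qquad x = \frac{2k_B\langle S'_t\rangle}{\sigma'^2_s} \tag{25}$$

with the dimensionless parameter $x$ quantifying how much $P_T(S'_t)$ deviates from the FT. As we discuss below, we will call this an x-FT. The parameter $x$, quantifying the violation of the FR for $S'_t$, also characterizes the Gaussian noisy channel: through equation (26) it is fixed by $\langle S_t\rangle$ together with the channel parameters $\Delta_C$ and $\sigma_C^2$. Although equation (26) provides important information on the channel, complete inference of $P_A(S_t)$ from $P_T(S'_t)$ (i.e. the simultaneous determination of $\sigma_C$ and $\Delta_C$) is not possible. Two limiting cases can be considered, in which the noisy channel affects only the variance or only the mean of the distribution. Equation (27) corresponds to the case ($\Delta_C = 0$) in which the noisy channel affects the variance of the probability distribution but not its mean; vice versa, equation (28) corresponds to the case ($\sigma_C^2 = 0$) in which the mean is affected but not the variance, a situation physically realized in the dumbbell example discussed in the previous section.

Figure 8. Measurement and inference. System A (for Agent) produces a total entropy $S_t$ during a time interval $t$. System A is observed via a second system, e.g. a detector T such as an optical trap, coupled to A via a noisy channel C. Measurements on T report a partial entropy production $S'_t$.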
In both these limiting cases, by measuring the distribution of $S'_t$ and assuming a FT for $S_t$ it is possible to recover the distribution of $S_t$. In the intermediate cases, in which neither the value of $\Delta_C$ nor that of $\sigma_C^2$ is fixed by physical constraints, inference can still be possible if some more information about the system is available (e.g. the Fano factor of the noisy channel). In [10] we give an experimental example in which equilibrium information on the system is used to complement the inference.
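The two limiting cases can be made concrete with a short numerical sketch. It assumes Gaussian statistics throughout, works in units where $k_B = 1$, and uses only the sample mean and variance of the measured $S'_t$. The fluctuation parameter is taken as $x = 2k_B\langle S'_t\rangle/\sigma'^2_s$, which is the slope (times $k_B$) of $\ln[P_T(S'_t)/P_T(-S'_t)]$ for a Gaussian. The function names and the synthetic data are hypothetical, and the mean-only case reports the missing mean entropy production $\langle S_t\rangle - \langle S'_t\rangle$ rather than committing to a sign convention for $\Delta_C$.

```python
import numpy as np

K_B = 1.0  # entropy measured in units of k_B (illustrative choice)

def fluctuation_parameter(s_partial):
    """x = 2 k_B <S'> / var(S') for an (assumed) Gaussian P_T."""
    s = np.asarray(s_partial, dtype=float)
    return 2.0 * K_B * s.mean() / s.var(ddof=1)

def infer_variance_only(s_partial):
    """Channel adds noise but no bias: <S> = <S'>, var(S') = 2 k_B <S> + sigma_C^2."""
    s = np.asarray(s_partial, dtype=float)
    mean_total = s.mean()
    sigma_c2 = s.var(ddof=1) - 2.0 * K_B * mean_total
    return mean_total, sigma_c2

def infer_mean_only(s_partial):
    """Channel shifts the mean but adds no noise: var(S) = var(S') = 2 k_B <S>."""
    s = np.asarray(s_partial, dtype=float)
    mean_total = s.var(ddof=1) / (2.0 * K_B)
    missing_mean = mean_total - s.mean()  # average entropy production not detected
    return mean_total, missing_mean

# Synthetic total entropy production that satisfies the Gaussian FT.
rng = np.random.default_rng(1)
n, mean_s = 200_000, 5.0
s_total = rng.normal(mean_s, np.sqrt(2.0 * K_B * mean_s), n)

s_noisy = s_total + rng.normal(0.0, 2.0, n)  # variance-only channel, sigma_C^2 = 4
s_shifted = s_total - 1.0                    # mean-only channel, shift of 1 k_B

x_noisy = fluctuation_parameter(s_noisy)
x_shifted = fluctuation_parameter(s_shifted)
mean_a, sigma_c2 = infer_variance_only(s_noisy)
mean_b, missing = infer_mean_only(s_shifted)
```

In both synthetic channels $x$ comes out below one, and each routine recovers the mean total entropy production only because the other channel parameter is known to vanish; applied to data where both are non-zero, neither estimate would be reliable.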

Inference far from equilibrium
In the previous section we considered inference in a Gaussian setting, where we could quantify the violation of the FT by a single parameter $x$. The assumption of a Gaussian $P_A(S_t)$ is particularly restrictive, as it is limited to near-equilibrium macroscopic systems. To address general non-equilibrium settings we must consider non-Gaussian $P_A(S_t)$. We will, however, still consider for simplicity a Gaussian noisy channel with transfer function:

$$P_C(S'_t|S_t) = \frac{1}{\sqrt{2\pi\sigma_C^2}} \exp\left(-\frac{(S'_t - S_t)^2}{2\sigma_C^2}\right) \tag{29}$$

i.e. a channel affecting the variance of the distribution of entropy production but not its mean. The measured distribution $P_T(S'_t)$ and the full entropy production distribution $P_A(S_t)$ are related by:

$$P_T = P_A * \mathcal{N}(0, \sigma_C^2) \tag{30}$$

where $\mathcal{N}(0, \sigma_C^2)$ is a normal distribution with mean zero and variance $\sigma_C^2$, equation (29), and $*$ denotes convolution. As always we will assume $P_A(S_t)$ satisfies the FT, equation (19). Using this symmetry we can evaluate $P_T(-S'_t)$:

$$P_T(-S'_t) = \exp\left(-\frac{S'_t}{k_B} + \frac{\sigma_C^2}{2k_B^2}\right) P_T\left(S'_t - \frac{\sigma_C^2}{k_B}\right)$$

so that the fluctuation symmetry yields:

$$\frac{P_T(S'_t)}{P_T(-S'_t)} = \exp\left(\frac{S'_t}{k_B} - \frac{\sigma_C^2}{2k_B^2}\right) \frac{P_T(S'_t)}{P_T(S'_t - \sigma_C^2/k_B)}$$

which shows that the FT equation (25) is not fulfilled in general, but is recovered in the limit $\sigma_C \to 0$. The variance $\sigma_C^2$ can now be inferred by comparing $P_T(S'_t)$ with the one-parameter family of probability distributions shown in figure 9. This allows one to select the value $\Delta = \Delta^*$ of the parameter for which the two agree (figure 9). Finally, although in this section we used a transfer function ($P_C$ in equation (29)) with zero mean ($\Delta_C = 0$), the discussion can be extended to the case $\Delta_C \neq 0$. Similarly to the Gaussian case, when both $\sigma_C^2$ and $\Delta_C$ are different from zero, complete inference is not possible.

The x-FT and the effective temperature
Let us go back to our near-equilibrium inference example and let us consider the partial entropy production distribution $P_T(S'_t)$ (equation (21)). Since it is a Gaussian distribution, we have already shown in equation (25) that it fulfills the following relation:

$$\frac{P_T(S'_t)}{P_T(-S'_t)} = \exp\left(\frac{x S'_t}{k_B}\right) \tag{35}$$

For reasons that will become clear soon we will call $x$ the fluctuation ratio.
For x = 1 we get the standard FT while in the most general case where x is different from 1 we might better speak of an x-FT. What is the physical meaning of the x-FT? In our example, x characterizes the noisy channel through which we are observing agent A: measuring a violation of the FT (equation (35)) yields quantitative information about the system.
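One simple way to test for an x-FT in data is to histogram the measured $S'_t$ on a grid that is symmetric about zero and to fit the slope of $\ln[P_T(S'_t)/P_T(-S'_t)]$ against $S'_t$; according to the x-FT that slope equals $x/k_B$. The sketch below does this with NumPy. The function name, the number of bins, the minimum-count cutoff and the synthetic Gaussian data are arbitrary illustrative choices, and the unweighted fit through the origin is only one of several reasonable estimators.

```python
import numpy as np

def xft_slope(samples, n_bins=40, min_count=50, k_B=1.0):
    """Estimate x from the slope of ln[P(S)/P(-S)] versus S.

    Uses a histogram that is symmetric about zero, so bin i and its mirror
    bin cover S and -S. Bins where either count is below `min_count` are
    ignored to limit statistical noise in the log-ratio.
    """
    s = np.asarray(samples, dtype=float)
    s_max = np.abs(s).max()
    counts, edges = np.histogram(s, bins=n_bins, range=(-s_max, s_max))
    centers = 0.5 * (edges[:-1] + edges[1:])
    mirrored = counts[::-1]  # counts in the bin centred at -S
    keep = (centers > 0) & (counts >= min_count) & (mirrored >= min_count)
    log_ratio = np.log(counts[keep] / mirrored[keep])
    # Least-squares line through the origin: log_ratio ~ (x / k_B) * S
    slope = np.sum(centers[keep] * log_ratio) / np.sum(centers[keep] ** 2)
    return k_B * slope

# Synthetic partial entropy production (units of k_B): Gaussian with
# mean 1 and variance 4, for which x = 2 * mean / variance = 0.5.
rng = np.random.default_rng(2)
s_partial = rng.normal(1.0, 2.0, 400_000)
x_hat = xft_slope(s_partial)
```

A roughly linear log-ratio with slope different from $1/k_B$ is the signature of an x-FT, while curvature in the log-ratio would indicate that a single parameter $x$ does not describe the data over the sampled range.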
The fluctuation ratio $x$ can also be interpreted in terms of an effective temperature. To better understand this let us consider the special case of the DNA molecule tethered between two beads captured in two optical traps discussed in section 5.1 (figure 7(a)). Under a pulling cycle protocol $\Delta G = 0$, so the dissipation $W_{\rm dis}$ (measured in the moving trap) equals the full work $W = W' + W_+$, with $W'$ and $W_+$ the partial and the missing work respectively. In the linear dissipative regime, where the DNA molecule is not pulled too fast (the dissipated work varying linearly with the pulling speed), all work distributions are Gaussian to a very good approximation. The partial work distribution $P'(W')$ then satisfies a FR equivalent to equation (35),

$$\frac{P'(W')}{P'(-W')} = \exp\left(\frac{x W'}{k_B T}\right) = \exp\left(\frac{W'}{k_B T_{\rm eff}}\right)$$

where $T_{\rm eff} = T/x$ has the dimensions of a temperature and is often referred to as the effective temperature. Here, $x = \langle W'\rangle/\langle W\rangle$ is equal to the fraction of the average total dissipated work along a cycle, $\langle W\rangle$, captured by the partial work measurement. If $x = 1$ we recover the standard FT with $T_{\rm eff} = T$. Both $x$ and the effective temperature $T_{\rm eff}$ carry the same information about the inference process: they quantify the fraction of entropy production missed in the measurement of a nonequilibrium process. We can summarize this by saying that the standard FT, equation (2), holds for the full dissipation $W$ but does not hold when only a part of the total dissipation, $W'$, is measured. In this case an x-FT may hold generally, with a value of the effective temperature typically higher than $T$ ($0 < x < 1$). Our discussion has focused on the Gaussian case. For the general non-Gaussian case the x-FT (equation (35)) may hold asymptotically for sufficiently long times $t$ in a given sector of entropy production rate values, i.e. $S'_t/t < p$ with $p$ of order 1. This result would be in line with the heat FT [41,42], where a similar conclusion has been reached. The x-FT scenario is realized, for example, in weakly ergodic aging systems, as recently shown in [44].
In this case $S'_t$ is equal to the so-called exclusive work, which is the work delivered by an external field $h$ applied to an aging system; it is determined by $h$ and by $\Delta A$, the change during time $t$ of the observable conjugated to the field $h$. As shown in [44], the distribution $P(S'_t)$ shows a crossover at a characteristic value $S^*$. Below $S^*$, $P(S'_t)$ satisfies the standard FT, equation (19), just as an equilibrium system would. Above $S^*$ a crossover to an x-FT (equation (25)) is observed. Also in this case the parameter $x$ can be given a clear physical meaning. In aging systems the effective temperature and the fluctuation ratio are used to quantify violations of the FDT that relates correlations and responses [43]. In [44] it was demonstrated that, in weakly ergodic aging systems in a scenario of entropy-driven relaxational dynamics, the $x$ defined from the x-FT and the $x$ defined from the FDT are equal. In spin-glass theory the physical meaning of $x$ is related to the presence of frozen degrees of freedom that cannot relax over the observable timescales. The parameter $x$ is also related to the Parisi replica symmetry breaking parameter introduced in the static solution of mean-field spin glasses. The meaning of $x$ then appears quite similar to that provided by thermodynamic inference: $x$ quantifies the missing entropy production or dissipation due to the presence of frozen degrees of freedom in the system.

Conclusions
Over the past 20 years we have witnessed a fast development of theoretical concepts and experimental tools that have contributed to our understanding of energy processes in non-equilibrium small systems. FTs are nowadays widely used to recover, from irreversible work measurements, the free energy of formation of native molecular states of proteins and nucleic acids. However, a major requirement to correctly apply standard FRs is that at the beginning of the forward and reversed experimental protocols the system is fully equilibrated. This makes it difficult to characterize both intermediate and misfolded states with standard FRs. EFRs were born when full equilibrium was replaced by partial equilibrium at the beginning of both forward and reversed experimental protocols. This introduces a pre-factor in the CFR which accounts for the fraction of experimental trajectories observed between two partially equilibrated molecular states. The use of EFRs paves the way to investigate the thermodynamic properties (such as the free-energy branches or the free energy of folding) of not only native states, but also intermediate, misfolded, and even intermolecular-bound states, which might be difficult to study under equilibrium and become accessible in partial equilibrium conditions. In recent years there has been a growing interest in the use of thermodynamic transformations involving feedback. In these cases the experimental protocol is stochastic and depends on the trajectory of the system. FRs in the presence of feedback, then, need to take into account and quantify the information extracted from the system. Using conditional averages on the microscopic trajectories we have demonstrated how the theory of feedback control and that of EFRs are equivalent. Finally, we have shown how FTs can be applied to extract useful information from a variety of biological and physical systems. The recent extension of FTs to thermodynamic inference opens exciting new perspectives.
The fact that in complex systems only a partial amount of information is accessible through direct experimental measurements calls for a completely new approach. The list of examples where thermodynamic inference could be applied is large: characterizing the mechano-chemical cycle of molecular motors, inferring work distributions in quantum systems, unravelling feedback effects in autonomous systems, quantifying heterogeneity in molecular ensembles and investigating molecular evolution in mutational ensembles.