Thermodynamic work from operational principles

In recent years we have witnessed a concentrated effort to make sense of thermodynamics for small-scale systems. One of the main difficulties is to capture a suitable notion of work that realistically models the purpose of quantum machines, in analogy to the role played, for macroscopic machines, by the energy stored in the idealisation of a lifted weight. Despite several attempts to resolve this issue by putting forward specific models, these are far from realistically capturing the transitions that a quantum machine is expected to perform. In this work, we adopt a novel strategy by considering arbitrary kinds of systems that one can attach to a quantum thermal machine and seeking work quantifiers. These are functions that measure the value of a transition and generalise the concept of work beyond the model of a lifted weight. We do so by imposing simple operational axioms that any reasonable work quantifier must fulfil and by deriving from them stringent mathematical conditions with a clear physical interpretation. Our approach allows us to derive much of the structure of the theory of thermodynamics without taking the definition of work as a primitive. We can derive, for any work quantifier, a quantitative second law in the sense of bounding the work that can be performed using some non-equilibrium resource by the work that is needed to create it. We also discuss in detail the role of reversibility and correlations in connection with the second law. Furthermore, we recover the usual identification of work with energy in degrees of freedom with vanishing entropy as a particular case of our formalism. Our mathematical results can be formulated abstractly and are general enough to carry over to resource theories other than quantum thermodynamics.


I. INTRODUCTION
With the advent of highly-controlled experiments on small-scale quantum devices and the technological perspective to use such devices as machines [1][2][3][4][5], it is becoming increasingly important to understand what precisely it means for such a machine to perform work. For macroscopic, classical machines, one way to measure performed work is by introducing a work-storage device, for example a suspended weight that is lifted. The height of the weight can be seen as a deterministic state variable that cannot be changed by bringing the weight into contact with a heat bath. Hence, by defining work to be proportional to the height difference of the weight, one ensures that work captures a notion of useful energy, in contrast to the energy stored in the microscopic degrees of freedom of the weight, which is of fluctuating nature.
At the nano-scale, however, not only the machine but also the work-storage device may eventually be formed of a few atoms. In this regime, it is impossible to conceive a weight with a deterministic state variable like the height: all the degrees of freedom are subject to non-negligible thermal fluctuations. This makes it challenging to distinguish precisely between the energy corresponding to work and the energy corresponding to heat. Furthermore, at this scale quantum effects become relevant, hence the work-storage device is better modelled by a quantum description. This makes the definition of work even more problematic, since it is unclear which role quantum coherences of the work-storage device play in the definition of work [6][7][8][9].
Motivated by such considerations, a variety of approaches have recently been put forward, each addressing the problem of defining work as well as accounting for the probabilistic or even quantum description of the work-storage device [6][7][8][9][10][11][12][13][14][15][16][17]. For each of them, a quantitative second law of thermodynamics can be derived in terms of the usual free energy or generalisations thereof. Still, despite constituting significant progress in resolving the above challenge, it seems fair to say that each of the proposals has certain deficits: either they impose to some extent artificial limitations on the set of allowed operations and states of the work-storage device in order to arrive at meaningful work quantifiers, or (and) they give rise to paradoxical scenarios where positive work is assigned to thermal energy.
In this work, we present a proposal for how to measure work in general, for arbitrary systems and processes. We do so by adopting a strictly operational perspective, guided by the mindset of an interactive proof system. We introduce an operational framework to formulate the question of measuring work in a device-independent manner and propose basic operational principles that we expect any reasonable measure of work to fulfil. From these elementary principles, we derive surprisingly stringent and specific conditions on the work quantifier. These conditions generalise relevant notions in quantum information theory [18] such as monotonicity and strong sub-additivity [19]. Our approach relies strongly on ideas from quantum resource theories [20][21][22][23][24]; as a consequence of this approach, we obtain several results on resource theories that are interesting in their own right. In particular, our results highlight the role of so-called catalysts [25][26][27] and how the local-global structure of multi-partite systems and the role of correlations interrelate with catalysis.

II. THE CHALLENGE OF A DEFINITION OF WORK
In this section, we would like to argue that the question of how to reasonably quantify work in the presence of a quantum work-storage device is by no means settled. To provide substance to this claim, we carefully review recent approaches to the definition of work and flesh out certain limitations that come along with them. At the same time, we highlight precisely which limitations still need to be overcome in order to arrive at a fully satisfactory definition.
A seminal approach to the role of energy fluctuations in work extraction is undoubtedly the one followed in Refs. [28][29][30]. There, the operations performed on a system E consist of a unitary evolution under time-dependent Hamiltonians, together with energy measurements on E at the beginning and at the end of the process. Work is then simply taken to be the energy difference between those measurements, giving rise to a real-valued random variable. The process is characterised by a probability distribution $P_W$ of work w.
In the light of the observations of the introduction, there are some aspects of this picture that seem not entirely satisfactory. Firstly, it is restricted to the situation where precisely unitary dynamics takes place on a system. Secondly, and more importantly, it has been emphasised in Ref. [14] that in such a picture, fluctuating, "disordered" energy is easily treated as work rather than heat. For example, suppose that $P_W$ is a thermal distribution, i.e., $P_W(w) = \exp(-\beta w)/Z$ (1), where $Z = \sum_w \exp(-\beta w)$. Imagine now that such extracted work is stored in a work-storing device labelled by R, initially in the ground state. The result would be that R is described by a Gibbs state at inverse temperature β for a given Hamiltonian. Clearly, such a Gibbs state of R could have been obtained simply by putting it in contact with a thermal bath. Hence, whether this process is ultimately extracting work has been critically reviewed in the literature [10][11][12][13][14].
One of the aims of so-called single-shot thermodynamics is precisely to tackle this criticism [10][11][12][13][14][47]. In particular, the notion of ε-deterministic work $W_\epsilon$ has been introduced to remedy the situation: $W_\epsilon$ is, roughly speaking, the maximum value of w that occurs with probability greater than $1 - \epsilon$, for a suitable $\epsilon > 0$. In this way, one captures the intuition that work is "ordered" energy, in the sense that fluctuations are suppressed if one demands ε to be sufficiently small. Although these notions provide key insights towards a sensible definition of work, the notion of ε-deterministic work still suffers to some extent from an attenuated version of the previous criticism. That is, given a thermal distribution $P_W$ of work that could have been obtained simply by energy transfer from a bath (i.e. heat), $W_\epsilon$ takes a value larger than zero [31]. In other words, a positive value of work is assigned to a type of energy which is obviously of heat-like nature. A similar criticism applies to the formalism presented in Ref. [32]. There, the engine E and the work-storage device R are considered explicitly. The system R has a two-level Hamiltonian with gap $\Delta > 0$. A process is said to have performed ε-deterministic work equal to Δ if a transition from the ground state to the excited state of R is performed with probability larger than $1 - \epsilon$. Again, this definition of work assigns positive values to processes that merely bring R into contact with a heat bath [33]. A further drawback is that it is restricted to the situation where R is a qubit initially in the ground state.
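The criticism of ε-deterministic work can be made concrete with a small numerical sketch. The snippet below is only an illustration under stated assumptions: the four work values, β = 0.1 and ε = 0.3 are hypothetical, and `eps_deterministic_work` implements one possible reading of the rough definition given above (the largest w with Prob(W ≥ w) ≥ 1 − ε), not necessarily the exact definition used in the cited references.

```python
import math

def thermal_distribution(levels, beta):
    """P_W(w) = exp(-beta * w) / Z on the given work values."""
    weights = [math.exp(-beta * w) for w in levels]
    z = sum(weights)
    return [x / z for x in weights]

def eps_deterministic_work(levels, probs, eps):
    """Largest w such that Prob(W >= w) >= 1 - eps (assumed reading of W_eps)."""
    best = None
    for w in sorted(levels):
        tail = sum(p for v, p in zip(levels, probs) if v >= w)
        if tail >= 1 - eps:
            best = w
    return best

# Hypothetical example: four equally spaced work values, a hot bath.
levels = [0.0, 1.0, 2.0, 3.0]
beta = 0.1
probs = thermal_distribution(levels, beta)
w_eps = eps_deterministic_work(levels, probs, eps=0.3)

# A purely thermal (heat-like) distribution is assigned a positive value.
print(probs, w_eps)
```

In this toy case the purely thermal distribution yields a strictly positive value, which is the behaviour the text objects to.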
Another recent approach to the definition of work is the one put forth in Refs. [15][16][17]. There, the engine E and the work-storage device R are again considered explicitly, equipped with specific Hamiltonians. Work is simply defined as the average energy that can be stored in R. In this way, the optimal work extraction is limited by the usual second law in terms of free energy differences. A similar definition of work has also been used to derive the third law [34]. However, this approach has the limitation of considering rather restricted sets of operations: these are global transformations on ER that, among other properties, do not allow one to simply bring R into thermal contact with a heat bath in such a way that it reaches the equilibrium Gibbs state [35]. Such a restriction is essential: it effectively circumvents the problem of distinguishing work from heat by simply not allowing heat to be transferred to the work-storage device. Similarly, this restriction has been imposed in other approaches by considering that R is initially in a high-energy state, so that it cannot receive energy from the heat bath [36]. Related situations have again been considered when average extracted energy is used as a work quantifier in Refs. [14][37][38][39]. There, the set of operations is given by time-dependent Hamiltonians combined with interactions with a thermal bath. The work-storage device R acts like a battery supplying the energy gained by E during the time-dependent Hamiltonian evolution. However, the possibility of bringing R into contact with a thermal bath is again not considered [40]. This is difficult to reconcile with a meaningful notion of work as the average energy difference on R.
It seems clear that a similar restriction on the set of operations is unnecessary to derive meaningful results in phenomenological thermodynamics. There, bringing the macroscopic lifted weight into contact with the heat bath is a perfectly valid operation, and surely one that would not compromise the validity of the usual work-extraction bounds. The reason for this is precisely that the definition of work (e.g. as a lifted weight) is adequate to describe the thermodynamical processes that take place at the macroscopic scale. Such a feature must prevail in an eventual adequate definition of work for the nano-scale.
In summary, we have seen how each of the different notions of work in the literature appears to fail to satisfy at least one of the following natural requirements:
i) Do not allow positive work to be extracted from a system in a thermal state.
ii) Do not restrict the set of operations in such a way that the work-storage device cannot absorb any energy from a heat bath.
In fact, the criticism formulated in Refs. [14,32] of the notion of work as a random variable can also be captured in this way. If one takes the work-storage device explicitly into account as a physical system, then non-trivial probability distributions of work $P_W$ can be obtained just by putting the work-storage device in contact with a thermal bath, hence running into contradiction with i). The alternative is to simply restrict the validity of the notion of work as a random variable to scenarios where the whole process is thermally isolated, thus running into contradiction with ii). This justifies our claim that the definition of work is unsettled, since i) seems a necessary requirement to connect work to the familiar notion in phenomenological thermodynamics, and ii) seems a basic condition so that one can even pose the problem of how to distinguish the heat-like and the work-like energy stored in a work-storage device.
To conclude, we stress that fluctuation-theorems [41], single-shot thermodynamics [10][11][12][13] and second laws in terms of smooth-entropies [14,32] are seminal key insights in thermodynamics. The present work questions the adequacy of using the term and notion of work therein.

III. THE OPERATIONAL AXIOMS
The previous considerations motivate our approach to the problem of defining work from an operational viewpoint. We formulate it as a game between two players [42]. The first one is Arthur, who possesses a quantum system which takes the role of the work-storage device. The system is described by a pair consisting of a quantum state and a Hamiltonian, referred to as an object. The system is initially described by $p^{(i)}$. The second player is referred to as Merlin, who has a machine capable of performing transitions from the initial object $p^{(i)}$ to a final object $p^{(f)}$. This machine will play the role of the thermal machine or engine, which performs a transformation on the work-storage device.
We assume that the transition $p^{(i)} \to p^{(f)}$ is performed in an environment of temperature $T = 1/\beta$ (we set $k_B = 1$), which we consider to be fixed throughout the rest of our work. In a resource-theoretic setting, this means that Merlin performs the transitions while having unlimited and free access to arbitrary heat baths at inverse temperature β. The term "free" is here used in the sense of a resource theory, a notion that will be made precise later.
Arthur and Merlin, having agreed on the free character of heat baths, want to establish a fair way of quantifying the value of a given transition $p^{(i)} \to p^{(f)}$. That is, they want to agree on a function that establishes the price at which Merlin sells to Arthur the transition that he has performed, where we will take the convention that $W(p^{(i)} \to p^{(f)}, \beta) \geq 0$ implies that Arthur has to pay Merlin. The price of the transition is what we define as work, and the function W is called a work quantifier. Apart from the agreement on the free heat baths at inverse temperature β, the work value has to be established solely on the basis of which transition $p^{(i)} \to p^{(f)}$ is performed by Merlin, without any assumption or restriction on the internal mechanism of Merlin's device. In this sense, our approach is device-independent, reminiscent of quantum certification or the situation encountered in Bell's theorem, where Merlin has to convince Arthur, solely from the output data, that he performed a valuable process (in that case, a process that cannot be simulated classically).

A. Free catalytic transitions
Since the notions and the use of language may be unfamiliar in the quantum thermodynamic context, we will now specify clearly what we mean by free operations in the context of a resource theory. Here, Arthur and Merlin have free access to heat baths at inverse temperature β. This will be relevant for the choice of the function W, since Arthur will not pay a positive amount for a transition that can be performed by only employing free resources. That is to say, it is important to specifically characterise the transitions that can be performed without expending valuable resources and only using baths.
Concretely, we assume that both Arthur and Merlin can pick heat baths, that is, a quantum system B prepared in a Gibbs state with arbitrary Hamiltonian $H_B$. They can also apply any global unitary U that commutes with the total Hamiltonian $H + H_B$. We use the short-hand $H_A + H_B := H_A \otimes \mathbb{1} + \mathbb{1} \otimes H_B$ whenever the support of two operators is clear from the context. This is the formalism of thermal operations first introduced in Ref. [32]. It is also meaningful to allow for more general sets of operations such as the so-called Gibbs-preserving maps [8,39,43] (see also the appendix), or simply thermalising maps where the only possible interaction with the thermal bath is to bring the system to the Gibbs state, in the spirit of Refs. [11,14,37]. Undoubtedly, in many thermodynamic settings, the latter is the most realistic one, capturing actual experimental situations. The final form of the work quantifier W will in principle depend on which model of operations with the bath is chosen, but the formalism is general enough to be applicable in any of these situations. In the appendix we discuss in detail the minimal properties of the free operations that are explicitly needed to derive our results and show that the examples presented above have such properties. More importantly, we will assume that both Arthur and Merlin, in addition to the heat bath, can also borrow any quantum ancillary system uncorrelated with the bath and the work-storage device, as long as it is returned in the same initial state and also uncorrelated with the work-storage device (see Figure 1). This ancillary system is referred to as a catalyst, and its usage extends the set of transitions that can be performed with a bath [25,26]. Such catalytic operations have been frequently studied in the recent literature on quantum thermodynamics, and naturally capture "bystanders", that is, auxiliary systems that may help in performing transformations.
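As a minimal sketch of a thermal operation in the sense just described, the following standard-library snippet checks the simplest case: a qubit system and a single bath qubit with the same energy gap, coupled by a swap. The gap, β, the choice of a one-qubit bath and the initial excited state are assumptions made purely for illustration; the helper functions are hypothetical and not taken from the paper.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def diag(values):
    return [[v if i == j else 0.0 for j in range(len(values))] for i, v in enumerate(values)]

def gibbs_state(energies, beta):
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return diag([w / z for w in weights])

def trace_out_bath(rho, d_sys, d_bath):
    """Partial trace over the second (bath) tensor factor."""
    return [[sum(rho[i * d_bath + k][j * d_bath + k] for k in range(d_bath))
             for j in range(d_sys)] for i in range(d_sys)]

beta, gap = 1.0, 1.0                      # assumed illustrative values
identity = diag([1.0, 1.0])
h_sys = diag([0.0, gap])
h_bath = diag([0.0, gap])
h_total = [[x + y for x, y in zip(r1, r2)]
           for r1, r2 in zip(kron(h_sys, identity), kron(identity, h_bath))]

# The swap exchanges |01> and |10>, which have equal total energy.
swap = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]

uh, hu = matmul(swap, h_total), matmul(h_total, swap)
commutes = all(abs(uh[i][j] - hu[i][j]) < 1e-12 for i in range(4) for j in range(4))

rho_sys = diag([0.0, 1.0])                # system starts in the excited state
rho_in = kron(rho_sys, gibbs_state([0.0, gap], beta))
rho_out = matmul(matmul(swap, rho_in), swap)  # swap is real symmetric: U^dagger = U
rho_sys_out = trace_out_bath(rho_out, 2, 2)
print(commutes, rho_sys_out)
```

Here the energy-conserving swap simply thermalises the system, which is the kind of transition for which Arthur should not pay.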
In the following we will refer to the operations described in this section as free operations when done without catalyst and catalytic free operations when performed with catalyst. Similarly, we will refer to the transitions induced by free and catalytic free operations as free transitions and catalytic free transitions, respectively.

B. Two basic axioms
We are now in the position to formulate two operational axioms concerning the work quantifier W. They seem as innocent as they are natural, and clearly capture features that a function of the above type quantifying work should satisfy. They are physically very intuitive: To develop our operational framework, however, they will be formulated in the mindset of the game played by Arthur and Merlin. In this language, they simply ensure that none of the players can get arbitrarily rich without expending valuable resources.

Axiom 1 (Work values for cyclic transitions).
For any cyclic sequence of transitions $p^{(1)} \to p^{(2)} \to \cdots \to p^{(n)} = p^{(1)}$, the sum of the work-values of the individual transitions is larger than or equal to zero.
According to our convention, if W takes a negative value, then Arthur is benefiting from the transaction, i.e. Merlin pays Arthur. Hence, the previous axiom ensures that, taking the simplest case, Arthur cannot get rich by demanding that Merlin first perform a transition $p^{(1)} \to p^{(2)}$ and then demanding that he undo the transition. If this principle were violated, Arthur could get infinitely rich just by repeatedly interacting with Merlin. Note that Arthur is not even implementing the transition himself; hence, he is by definition not expending any resource. Our second axiom is the following.
Axiom 2 (Catalytic sequences). If $p^{(1)} \to \cdots \to p^{(n)}$ is a sequence that can be performed by only using thermal baths and a catalyst that, at the end of the sequence, is returned in the same state, then the sum of the work values for the individual transitions is smaller than or equal to zero.
Note that this axiom does not require that each transition in the sequence can be done by catalytic free operations, as the catalyst only has to be returned unchanged at the end of the sequence. We can rephrase this axiom more informally: Arthur will not pay for work when he is receiving heat, where in this context, we mean by heat a transition that could have been performed simply by an energy flow from the heat bath to the work-storage device. To see this, suppose that a sequence of transitions $p^{(1)} \to \cdots \to p^{(n)}$ is performed using thermal baths and a catalyst. As the catalyst by definition remains untouched in the end, all the energy stored in the work-storage device comes from the heat bath. Therefore we impose that the work value of these transitions is at most zero. It is worth pointing out how the device-independence of our notion of work comes into play here. The set of transitions $p^{(1)} \to \cdots \to p^{(n)}$, even though they might have been performed with catalytic free operations, are performed by Merlin in a way unknown to Arthur. Hence, it may be the case that Merlin is extracting work content from the work-storage device, in such a way that it can be sold afterwards. This is the reason why Axiom 2 allows W to take a negative value.
Axioms 1 and 2 encode the spirit of the second law of thermodynamics: by preventing either of the two players from becoming arbitrarily rich without spending resources, we are enforcing the impossibility of creating a perpetuum mobile. Our approach is, however, inverse to what is usually found in phenomenological thermodynamics. There, work is defined a priori through the lifted weight and the second law is understood as a constraint on the possible physical processes. In contrast, Axioms 1 and 2 do not impose any constraint on the allowed physical operations that Merlin is performing. They simply state that one does not count as work what can be generated with a bath and a catalyst with the a priori given physical operations. As such, in our set-up it is also impossible to violate the second law: if by using, say, a forthcoming post-quantum theory, someone claimed to extract work from a single heat bath, then it simply means (regardless of the details of such a theory) that what is referred to as work does not fulfil our operational principles.

IV. GENERAL PROPERTIES OF THE WORK QUANTIFIER
The advantage of the framework developed here is that very basic principles already allow for formulating stringent properties of possible work functions W. In this section we will turn to discussing properties of a work-function W that respects Axioms 1 and 2. For conceptual clarity, we will keep the discussion rather informal here. For a mathematically detailed and rigorous treatment, we refer to the appendix. Nevertheless, we will have to introduce some notation and definitions first. We are looking for a function W [44] that assigns a real number to a pair of objects $p^{(i)} = (\rho^{(i)}, H^{(i)})$ and $p^{(f)} = (\rho^{(f)}, H^{(f)})$. We will use Latin letters p, q, r, s, . . . to denote objects and denote the work-value of a transition by $W(p \to q, \beta)$, or simply $W(p \to q)$ if β is clear from the context. If the Hamiltonian of the two objects is identical, we will also use the notation $W(\rho^{(i)} \to \rho^{(f)})$. Given any object p, we define $F(p)$ as the set of objects that can be reached from p by free operations. Similarly, $F_C(p)$ denotes the set of objects that can be reached from p by catalytic free operations. Furthermore, we will denote by $w_\beta$ any pair of the form $(\omega_{H,\beta}, H)$, where $\omega_{H,\beta} = e^{-\beta H}/\mathrm{tr}(e^{-\beta H})$ is the Gibbs state of H.
The first result is that Axioms 1 and 2 translate into three intuitive properties of the work-function W.
Theorem 1 (Properties of work quantifiers). A function W respects Axioms 1 and 2 if and only if it has the following properties for all objects p, q, r: $W(p \to q, \beta) \leq 0$ for all $q \in F_C(p)$; $W(p \to q, \beta) = -W(q \to p, \beta)$; and $W(p \to q, \beta) + W(q \to r, \beta) = W(p \to r, \beta)$.
This theorem already implies that the work-cost of forming a state out of equilibrium, $W(w_\beta \to p)$, is exactly minus the work yield of converting the state into thermal equilibrium, $W(p \to w_\beta)$, a property that is violated in certain formulations of one-shot quantum thermodynamics [14,32]. Furthermore, the third property in the theorem implies that the sum of work-values for a sequence of transitions only depends on the initial and final state of the work-storage system. From these properties, the subsequent theorem follows easily by choosing some arbitrary reference object.

Theorem 2 (Form of work quantifiers).
A function W respects Axioms 1 and 2 if and only if it can be written as $W(p \to q, \beta) = M(q, \beta) - M(p, \beta)$ for a function M that is a monotone of catalytic free operations, i.e., $M(q, \beta) \leq M(p, \beta)$ for all $q \in F_C(p)$.
Thus the work-storage device can be treated similarly to the case of a massive body under the influence of a conservative force in classical mechanics: there is a state variable M, and its difference along a transition determines the work-value of the transition. Using catalytic free operations, which generalise the concept of putting a system in contact with a heat bath in phenomenological thermodynamics, this state variable cannot be increased.
According to the constraints given by Theorem 2, there still exist many work-functions that fulfil Axioms 1 and 2: one for each monotone M of catalytic free operations. A simple way to obtain such a monotone is the following. Choose any real-valued function f on objects p. Then the function $M_f(p) := \sup_{q \in F_C(p)} f(q)$ (10) is a monotone under catalytic free operations. Conversely, any monotone can be written in the form of Eq. (10). An example of such a function f would be the energy expectation value, or the probability of finding the system in the highest energy eigenstate. Neither is itself monotonic under catalytic free operations. An example of a function that already is a monotone under catalytic free operations is the difference of the non-equilibrium free energy to the Gibbs state (see Theorem 4).
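To illustrate how a monotone induces a work quantifier as a difference of a state variable, the sketch below restricts attention to states that are diagonal in the energy basis (probability vectors) and takes, as an assumption for illustration, M to be the free energy difference to the Gibbs state, written as a classical relative entropy divided by β. The energy levels, β and the three test distributions are hypothetical. The partial thermalisation used at the end maps the Gibbs distribution to itself, so it is at least a Gibbs-preserving map.

```python
import math

def gibbs(energies, beta):
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy(p, q):
    """Classical relative entropy D(p||q) in nats; q must have full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def monotone(p, energies, beta):
    """Assumed monotone: M(p) = D(p || gibbs) / beta (free energy difference)."""
    return relative_entropy(p, gibbs(energies, beta)) / beta

def work(p_initial, p_final, energies, beta):
    """Work value as a difference of the state variable: W(p -> q) = M(q) - M(p)."""
    return monotone(p_final, energies, beta) - monotone(p_initial, energies, beta)

energies = [0.0, 1.0, 2.0]                # hypothetical three-level system
beta = 1.0
g = gibbs(energies, beta)
p1 = [1.0, 0.0, 0.0]
p2 = [0.2, 0.3, 0.5]
p3 = [0.0, 0.0, 1.0]

# Work values along a cyclic sequence p1 -> p2 -> p3 -> p1 sum to zero.
cycle = [p1, p2, p3, p1]
cycle_sum = sum(work(a, b, energies, beta) for a, b in zip(cycle, cycle[1:]))

# Partial thermalisation (mixing with the Gibbs distribution) has work value <= 0.
lam = 0.4
p2_partial = [(1 - lam) * x + lam * y for x, y in zip(p2, g)]
w_thermalise = work(p2, p2_partial, energies, beta)
print(cycle_sum, w_thermalise)
```

Because W is a difference of a state function, cyclic sums vanish identically and reversing a transition flips the sign, independently of which monotone is chosen.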

V. MULTI-PARTITE WORK-STORAGE DEVICES
So far, we have not considered the possibility of work-storage systems that are made up of smaller non-interacting parts. We will now do so, and we will see that such considerations severely constrain the set of monotones that can be used for a valid work-function. That is to say, in this section we will formulate operational principles for the case that Arthur owns a work-storage device that is made up of smaller parts that do not interact. Still, we assume that the subsystems can be correlated or even entangled. In this sense, genuine quantum features are allowed for.
Imagine that Arthur has such a bi-partite and non-interacting work-storage device made up of parts A and B and he provides only part A to Merlin. Merlin then performs a transition, which we will denote as $p^{(i)}_{AB} \xrightarrow{A} p^{(f)}_{AB}$, where the index A indicates that Merlin only has access to this subsystem. Clearly, because Merlin only acts on A, there exist limitations on which pairs of objects can be connected as $p^{(i)}_{AB} \xrightarrow{A} p^{(f)}_{AB}$. The quantum states are related as $\rho^{(f)}_{AB} = (\mathcal{E}_A \otimes \mathbb{1}_B)(\rho^{(i)}_{AB})$, where $\mathcal{E}_A$ is a completely-positive trace-preserving map, and the Hamiltonian is changed only by adding Hamiltonian terms with support only on A. Again, Arthur and Merlin want to agree on a quantifier, this time for the local transitions, referred to as $W_A(p^{(i)}_{AB} \to p^{(f)}_{AB}, \beta)$, and similarly for B or any other subsystem of a multipartite system. The next axiom encodes the notion of locality into the definition of work, in the sense that if Merlin only had access to subsystem A, then the work is calculated in the same way as if a regular transition was performed on the A-marginal.
Axiom 3 (Locality). Given two objects $p^{(i)}_{AB}$ and $p^{(f)}_{AB}$ such that a transition between them can be implemented by acting locally on subsystem A, then $W_A(p^{(i)}_{AB} \to p^{(f)}_{AB}, \beta) = W(p^{(i)}_A \to p^{(f)}_A, \beta)$ (12), where $p^{(i)}_A$ and $p^{(f)}_A$ denote the corresponding objects on the A-marginal. This also holds for transitions implemented on B or for any multipartite generalisation.
This axiom is justified in the light that if Merlin implements the transition acting only on A, then he will not know of part B of Arthur's work-storage device. From Merlin's perspective, the situation is indistinguishable from having performed a transition on the marginal on part A. Eq. (12) simply ensures that it is also indistinguishable when it comes to evaluating the work value of the transition. Note that Axiom 3 is perfectly compatible with having two objects $p^{(i)}_{AB}$ and $p^{(f)}_{AB}$ such that the transition between them can be performed with free operations globally, which implies a non-positive work value for W, while nevertheless requiring Merlin to invest resources if he only has access to system A.
We will introduce a final axiom concerning the way local and global transitions are combined.
Axiom 4 (Work values for cyclic local transitions). Suppose a cyclic sequence is a concatenation of local and global transitions, that is, we have a sequence $p^{(1)} \xrightarrow{S_1} p^{(2)} \xrightarrow{S_2} \cdots \xrightarrow{S_{n-1}} p^{(n)} = p^{(1)}$, where $S_i$ is either a subsystem or the whole multipartite system. Then the sum of the work-values of the individual transitions is larger than or equal to zero.
Axiom 4 is indeed a generalisation of Axiom 1, where the cyclic sequences can contain local transitions. Hence, it is justified on the same basis. To give an example of how Axiom 4 operates, consider the following situation: first Merlin is asked to do a local transition $p^{(i)}_{AB} \xrightarrow{A} p^{(f)}_{AB}$, and then Arthur asks Merlin to reverse the transition, but giving him the full system AB. Then the total work cost of these two transitions is non-negative by Axiom 4. Using from Theorem 1 that the work-value has to be anti-symmetric in the states, we immediately get the relation $W(p^{(i)}_{AB} \to p^{(f)}_{AB}, \beta) \leq W_A(p^{(i)}_{AB} \to p^{(f)}_{AB}, \beta)$ (14) for any $p^{(i)}_{AB}$ and $p^{(f)}_{AB}$ such that there exists a local transition on A connecting them. That is, performing a transition globally costs at most as much work as doing it locally. From an intuitive point of view, one could say that this is obviously true, since by acting locally, Merlin has fewer free operations at hand and therefore the price has to be higher. This, however, is not precisely true. Indeed, there are operations which can be done locally using free catalytic operations, but which are not free when considered as a global operation. An example is the following (see also Fig. 2): Arthur gives Merlin part A of the bipartite system. Then Merlin takes a catalyst with the same state as the A-marginal of Arthur and simply exchanges the two systems. The catalyst, which after the swap is Arthur's subsystem A, remains unchanged and uncorrelated with the system that Merlin acted on. Thus from Merlin's point of view, this is a free operation. The effect of the operation is to de-correlate Arthur's system, $\rho_{AB} \to \rho_A \otimes \rho_B$.
FIG. 2. De-correlating a system using local free operations. The framed region corresponds to Merlin when handed one part of Arthur's system (yellow region). The green dots denote the catalyst and one part of Arthur's system. After the exchange of the catalyst and Arthur's part, Arthur's system is de-correlated. However, the catalyst is now correlated with the blue part of Arthur's system.
When viewed as a global operation, however, this operation is not free, because after the swap, the catalyst is correlated with subsystem B of Arthur. This does not imply, however, that the transition is not a free transition when considered as a global transition. From this protocol and using relations (14) and (12) we learn that $W(\rho_{AB} \to \rho_A \otimes \rho_B) \leq 0$. Here, $\rho_A = \mathrm{tr}_B(\rho_{AB})$ and $\rho_B = \mathrm{tr}_A(\rho_{AB})$ denote the marginals of $\rho_{AB}$. In terms of the monotone M, which induces the function W, this means that $M(\rho_A \otimes \rho_B, H_A + H_B) \leq M(\rho_{AB}, H_A + H_B)$ (17). Thus, our operational principles, although from an operational point of view not very restrictive, already imply strong properties for valid monotones. Now, we will use all the axioms previously stated to derive the set of properties of the monotones M that determine the work quantifier.
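The statement that de-correlating a bipartite work-storage device has non-positive work value can be checked numerically in a simple classical setting. The sketch below assumes, for illustration only, that the monotone is the relative entropy to the product Gibbs distribution divided by β, and uses a hypothetical correlated two-bit distribution with hypothetical local energy levels. In that case the drop of the monotone equals the mutual information divided by β.

```python
import math

def gibbs(energies, beta):
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy(p, q):
    """Classical relative entropy D(p||q) in nats; q must have full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def marginals(joint):
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return p_a, p_b

def product(p_a, p_b):
    return [[a * b for b in p_b] for a in p_a]

def flatten(joint):
    return [x for row in joint for x in row]

beta = 1.0
# Non-interacting parts: the reference is the product of the local Gibbs distributions.
g_ab = product(gibbs([0.0, 1.0], beta), gibbs([0.0, 0.5], beta))

joint = [[0.4, 0.1],
         [0.1, 0.4]]                      # hypothetical correlated distribution on A and B
p_a, p_b = marginals(joint)

m_joint = relative_entropy(flatten(joint), flatten(g_ab)) / beta
m_product = relative_entropy(flatten(product(p_a, p_b)), flatten(g_ab)) / beta
mutual_information = relative_entropy(flatten(joint), flatten(product(p_a, p_b)))

work_decorrelate = m_product - m_joint    # W(rho_AB -> rho_A (x) rho_B)
print(work_decorrelate, mutual_information)
```

With this choice of monotone, removing correlations never has a positive work value, in line with the relation obtained from the swap protocol.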

VI. POSSIBLE WORK-FUNCTIONS FOR MULTIPARTITE WORK-STORAGE DEVICES
We have already seen that taking into account the operational principles concerning local actions on work-storage devices leads to interesting properties of the work function that may seem surprising given the basic nature of the underlying axioms. It turns out that one can derive more, and indeed very stringent, conditions on the monotones from the operational principles. In this section, we will present these properties and also present a function which fulfils all the properties.
Theorem 3 (Possible work quantifiers for multi-partite systems). A work-function W respects Axioms 1-4 if and only if it can be written as $W(p \to q, \beta) = M(q, \beta) - M(p, \beta)$, where M is a monotone under catalytic free operations and has the following properties: 3. Generalised strong super-additivity:
Let us discuss a few key properties that can be derived from the properties of M. A first example concerns super-additivity. Combining Eq. (17) with extensivity, we immediately obtain $M(\rho_{AB}, H_A + H_B) \geq M(\rho_A, H_A) + M(\rho_B, H_B)$. It turns out, however (see Theorem 20 in the appendix), that we can derive a still significantly stronger property. Consider a three-partite work-storage device ABC with any $H = H_A + H_B + H_C$ and any $\rho_{ABC}$. Using properties 1., 2. and 3. we obtain $M(\rho_{ABC}, H) + M(\rho_B, H_B) \geq M(\rho_{AB}, H_A + H_B) + M(\rho_{BC}, H_B + H_C)$. This property, but with the reversed inequality sign, is known as strong sub-additivity in the case of trivial Hamiltonians and is famously fulfilled by the von Neumann entropy [19]. It is a corner-stone of quantum information theory. Our results generalise strong sub-additivity from information theory to the case of quantum thermodynamics, that is, when one has an arbitrary non-trivial Hamiltonian. This motivates the use of the term generalised strong super-additivity. Furthermore, this property follows solely from operational principles. If we furthermore assume that the function $\rho \mapsto M(\rho, H)$ is continuous for any H, we can also show that it is convex (see Proposition 4 in the appendix). At this point, it is tempting to guess that the monotone M is closely related to the von Neumann entropy. This is indeed true for the case that the Hamiltonian is trivial. In this case it follows from the axiomatic characterisation of the von Neumann entropy (see, e.g., Ref. [45] and references therein) that for any continuous (on quantum states) monotone that defines a valid work-function we have $M(\rho, 0) = \alpha\,(\log d - S(\rho))$, where d is the dimension of the quantum state ρ and $\alpha \geq 0$ is some constant independent of d. Let us introduce the quantum relative entropy $S(\rho \| \sigma) = \mathrm{tr}(\rho \log \rho) - \mathrm{tr}(\rho \log \sigma)$ for any full-rank state σ. Then we can also write $M(\rho, 0) = \alpha\, S(\rho \| \mathbb{1}/d)$ (23). From a physical point of view, a Hamiltonian is only defined up to a constant, since only energy differences are important. And indeed, from extensivity and the fact that $\omega_H = \omega_{H + \lambda \mathbb{1}}$ for any $\lambda \in \mathbb{R}$ it is easy to deduce that $M(\rho, H + \lambda \mathbb{1}) = M(\rho, H)$. This property implies that, since the temperature of the accessible heat baths is fixed, we can introduce a new function $\tilde{M}$ on pairs of states such that $\tilde{M}(\rho, \omega_H) = M(\rho, H)$. In view of Eq. (23) it is now tempting to conjecture that $\tilde{M}(\rho, \omega_H) \propto S(\rho \| \omega_H)$. Indeed, this is a possible choice.
Theorem 4 (Free energy difference). The non-equilibrium free energy difference to the Gibbs state, $\Delta F_\beta(\rho, H) := F_\beta(\rho, H) - F_\beta(\omega_H, H)$, is a valid monotone fulfilling Axioms 1-4.
The free energy used in the theorem is defined as
$F_\beta(\rho, H) = \mathrm{tr}(\rho H) - \frac{1}{\beta} S(\rho).$
Given the stringent set of Properties 1.-3. of Theorem 3, to our knowledge there is no monotone other than the free energy $\Delta F_\beta$ fulfilling the operational Axioms 1-4. In fact, we have strong evidence that the free energy can be shown to be the unique monotone fulfilling Axioms 1-4 in the case that the free transitions are given by Gibbs-preserving transitions (see Section E in the appendix), based on axiomatic characterisations of the relative entropy. This will be discussed in detail elsewhere [46]; see also Ref. [47], where the role of the free energy of the work-storage device is discussed.
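As an illustration of the definition above, the following minimal Python sketch evaluates the free energy difference for states that are diagonal in the energy eigenbasis, so that they reduce to probability vectors. The restriction to diagonal states, the normalisation $\Delta F_\beta = S(\rho \| \omega_H)/\beta$, the function names and all numerical values are assumptions made only for this example.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state for the given energy levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy(p, q):
    """Classical relative entropy S(p||q) in nats; q must have full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def free_energy(p, energies, beta):
    """F_beta = <E> - S/beta for populations p."""
    mean_energy = sum(pi * e for pi, e in zip(p, energies))
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return mean_energy - entropy / beta

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state, S(p||omega_H)/beta."""
    return relative_entropy(p, gibbs(energies, beta)) / beta

energies = [0.0, 1.0, 2.0]  # hypothetical three-level system
beta = 1.5                  # hypothetical inverse temperature
p = [0.2, 0.5, 0.3]         # hypothetical non-equilibrium populations
omega = gibbs(energies, beta)

# Difference of free energies, to be compared with the relative-entropy form.
gap = free_energy(p, energies, beta) - free_energy(omega, energies, beta)
print(delta_f(p, energies, beta), gap)   # the two numbers agree
print(delta_f(omega, energies, beta))    # vanishes on the Gibbs state
```

In this sketch the relative-entropy form and the difference of free energies coincide up to rounding, and the value is strictly positive for any state other than the Gibbs state.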

VII. SECOND LAW OF THERMODYNAMICS FOR WORK-EXTRACTION
It has been our main concern to identify a good definition of work that overcomes the challenges highlighted in the introduction. It is key to thermodynamics to know how much work can actually be performed on some work-storage device given some non-equilibrium resource. Our approach to the definition of work ensures that no positive work can be obtained from a heat bath. However, we have not yet bounded the maximum amount of work that Merlin can perform given a resource that is not a thermal bath. We will now derive such bounds.
We assume that Merlin has some object $p_M = (\rho_M, H_M)$ and is given some work-storage device $p_A = (\rho_A, H_A)$ initially uncorrelated with M. Merlin can now perform a catalytic free operation on $p_A \otimes p_M$ to obtain a final object $p'_{MA}$ with marginals $p'_M$ and $p'_A$. It follows straightforwardly from extensivity and super-additivity of any valid monotone M that
$W(p_A \to p'_A, \beta) \leq M(p_M, \beta) - M(p'_M, \beta).$ (26)
Optimising the bound by setting $p'_M = w_{H_M}$ we get back the second law of thermodynamics for isothermal work-extraction,
$W(p_A \to p'_A, \beta) \leq M(p_M, \beta).$ (27)
This bound, if we take $M = \Delta F_\beta$, coincides with the one obtained when work is defined as the average energy change and the work-storage device cannot be brought in contact with a heat bath [14][15][16][37][38][39]. It also corresponds to the case of phenomenological thermodynamics, where the maximum amount of energy that can be stored in a lifted weight by an isothermal process is bounded by the free energy. Note, however, that we obtain the same bound although work is not defined as an average energy difference of the work-storage device.
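The bound can be illustrated numerically with a small Python sketch, taking $M = \Delta F_\beta$ for diagonal states. The protocol is an assumption chosen for simplicity: resource and device have the same Hamiltonian, the device starts in its Gibbs state, and Merlin swaps the two systems with probability `lam` (a mixture of the identity and a swap, both of which preserve the joint Gibbs state). All names and numbers are hypothetical.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

energies = [0.0, 1.0]   # same two-level Hamiltonian for M and A (assumption)
beta = 1.0
omega = gibbs(energies, beta)
resource = [0.1, 0.9]   # Merlin's non-equilibrium populations (hypothetical)

# Right-hand side of the second-law bound: the monotone value of the resource.
bound = delta_f(resource, energies, beta)

def work_on_device(lam):
    """Work value of the device transition omega -> its marginal after the mixed swap."""
    marginal = [lam * r + (1 - lam) * w for r, w in zip(resource, omega)]
    return delta_f(marginal, energies, beta) - delta_f(omega, energies, beta)

works = [work_on_device(k / 10) for k in range(11)]
print(max(works) <= bound + 1e-12)  # the bound holds for every swap probability
```

In this example the bound is saturated only by the full swap (`lam = 1`), in which the resource is left in its Gibbs state.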

VIII. PATH-DEPENDENCE OF WORK
We now turn to an important feature of the above work function that was only hinted at in the introduction: the path dependence of notions of work. It is well known that work in phenomenological thermodynamics is not an exact differential and is therefore path-dependent. How can this be reconciled with the fact that our work function seems, at first sight, not to be path-dependent? When we talk about work in phenomenological thermodynamics, we usually mean the work that is done by the working system on the work-storage device. Formulated in our operational language, in phenomenological thermodynamics we usually take the role of Merlin and not that of Arthur. But also in our setting we cannot calculate, just from knowing the initial and final state of Merlin, how much work he has done on any work-storage device. This work-value depends on the precise operation, involving the work-storage device, a heat bath and a catalyst, that Merlin actually implements. It is therefore path-dependent in the very same manner as in phenomenological thermodynamics. What we call work is the work that is needed to perform a given state-transition on a work-storage device and therefore only depends on the initial and final state of the work-storage device.

IX. ENERGY-MEASUREMENTS AND WORK DISTRIBUTIONS
Let us finally turn to the role of work distributions. In our discussion, we have concluded that the free-energy difference is a valid (and presumably the only) work-quantifier in the above sense. If one considers the particular case of initial and final states being energy-eigenstates $|E_i\rangle$ and $|E_f\rangle$ of a constant Hamiltonian, then the free-energy difference reduces to
$W(|E_i\rangle\langle E_i| \to |E_f\rangle\langle E_f|) = E_f - E_i.$ (28)
In this sense, our formalism incorporates as a valid work definition the ε-deterministic work of Refs. [14,32] in the particular case of ε = 0. The form of Eq. (28) may at first sight suggest that our formalism also incorporates as a particular case the definitions of work given by Refs. [28-30], where an energy measurement is performed before and after the process, so that work is a random variable $w := E_f - E_i$ distributed according to $P_W$. However, note that performing an energy measurement and post-selecting on the outcome, so that $(\rho, H)$ is transformed into $(|E_j\rangle\langle E_j|, H)$ with probability $\mathrm{tr}(\rho |E_j\rangle\langle E_j|)$, is not a free transition in our sense, nor a catalytic free transition. Indeed, any attempt to accommodate energy measurements as free operations seems problematic in our approach, since from a thermal state $(w_{H,\beta}, H)$ one could produce for free any pure eigenstate of H with some probability, leading to a trivial resource theory. Note that measurements can indeed be accommodated as free operations in other resource theories, for instance the one of entanglement and LOCC ("local operations with classical communication"). This is because, in the resource theory of entanglement, local measurements with post-selection map resourceless (separable) states into resourceless states. On the contrary, for a resource theory of thermodynamics, the resourceless (thermal) state can be mapped into resourceful (non-thermal) states by a measurement.
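The reduction to an energy difference for eigenstates can be checked with a short Python sketch, again for diagonal states and with $\Delta F_\beta = S(\rho \| \omega_H)/\beta$ taken as the work quantifier. The spectrum, the temperature and the helper `eigenstate` are hypothetical choices for illustration.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

def eigenstate(j, dim):
    """Populations of the j-th energy eigenstate."""
    return [1.0 if k == j else 0.0 for k in range(dim)]

energies = [0.0, 0.7, 1.9]  # hypothetical spectrum
beta = 2.0                  # hypothetical inverse temperature
dim = len(energies)

# Work value of the transition |E_0><E_0| -> |E_2><E_2|.
work = delta_f(eigenstate(2, dim), energies, beta) - delta_f(eigenstate(0, dim), energies, beta)
print(work, energies[2] - energies[0])  # agree up to rounding

# Every eigenstate has a strictly positive monotone value, which is why
# post-selected energy measurements on a thermal state cannot be free.
values = [delta_f(eigenstate(j, dim), energies, beta) for j in range(dim)]
```

The second part of the sketch makes the obstruction in the text concrete: a thermal state has monotone value zero, whereas each post-selected eigenstate has a positive value.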

X. SUMMARY
In this work, we have approached the question of defining work from the perspective of simple operational axioms, inspired by device-independent notions in quantum information theory. It is key to our approach that it is applicable to very general classes of physical systems and situations, allowing one to quantify the work-value of arbitrary state-transitions and even changes of the Hamiltonian. Remarkably, simple and elementary as these axioms may appear, they provide sufficient structure to give rise to surprisingly detailed necessary and sufficient properties that any function quantifying work has to fulfil. These properties generalise well-known properties from (quantum) information theory, such as strong sub-additivity, to the realm of quantum thermodynamics. Since these properties follow from purely operational considerations, we thereby also establish new operational explanations for these properties. We have also proved that the non-equilibrium free energy difference to the Gibbs state fulfils all the properties, and we conjecture that it is the only function fulfilling all the properties when we consider thermodynamics. When applied to the particular case of initial and final states being eigenstates of the Hamiltonian, our work-definition reduces to the energy difference, thus on the one hand recovering the traditional definition of work and on the other highlighting the restricted applicability of such a notion. We have also clarified the relation to the well-known concept of probability distributions of work when work is considered as a random variable of energy differences.
For coherence of the presentation, we have in the main text kept strictly to the interpretation of work quantifiers in quantum thermodynamics. The technical results are formulated in the appendix in such a way that they are also applicable to other quantum resource theories, beyond the quantum thermodynamic context. The arguments laid out here highlight the role that catalysts play and how the local-global properties of multi-partite systems relate to them. Furthermore, our results show that there is a close connection between catalysis and reversibility.
The mindset of this work has been purely operational, and yet it presents a fresh approach to the problem of defining work in the context of quantum thermodynamics. Our approach is not to consider a specific model or to formulate a definition of a notion of work, to be justified by physical considerations. Rather, we look at basic defining features that any such notion has to fulfil, and arrive at surprisingly precise and rich predictions. It is our hope that the operational approach taken here can be seen as a further invitation to approach physical problems with a mindset inspired by resource theories and notions of device-independent quantum information theory.

XI. ACKNOWLEDGEMENTS
This work has been supported by the EU (SIQS, AQuS, RAQUEL), the ERC (TAQ), the Alexander von Humboldt-Foundation and the Studienstiftung des Deutschen Volkes.

This has been noted in a related form in Ref. [6]. It is shown that one can extract a positive amount of ε-deterministic work from a thermal bath, in a clear violation of the second law. The violation gets exponentially suppressed with the success probability, which ensures that the second law is recovered in the macroscopic case. Still, at the nano-scale, the values of work extracted from a thermal bath (i.e., the amount by which the second law is violated), however exponentially attenuated, may be of the order of magnitude of typical energies at which the engine operates.

In Refs. [15][16][17], the work-storage device is described as having a Hamiltonian unbounded from below. Hence, strictly speaking, it is in any case impossible to prepare it in a Gibbs state, regardless of the restrictions on the set of valid operations considered. However, this is clearly an idealisation of a real physical situation with bounded Hamiltonians. One could in this model also construct global operations on ER such that one increases the average energy of the work-storage device. Operations of this kind are explicitly forbidden by their formalism to ensure that the battery is not used as a source of free energy, and it is in this sense that we mean that explicit restrictions are imposed.

Transitions and free transitions
Let us consider a pair p = (ρ, H) consisting of a quantum state and a Hamiltonian. In the following we will call such pairs objects and denote the associated Hilbert space by H(p), which for most of this work is taken to be finite-dimensional. We will later give a more refined definition of objects in our context.

Definition 5 (Transition).
A transition is defined by a pair of objects p (i) , p (f ) and an ordering between them. We will refer to a transition as p (i) → p (f ) .

Definition 6 (State transition). This is a transition in which the Hamiltonian remains constant.
That is, if (ρ (i) , H) → (ρ (f ) , H), we will refer to a state transition and denote it simply, if the Hamiltonian is clear from the context, by ρ (i) → ρ (f ) .
Such transitions are to be interpreted, in the context of the present work, as changes of the state and Hamiltonian of Arthur's battery as implemented by Merlin. State transitions, in which the Hamiltonian remains unchanged, are relevant in the context of thermodynamics and will be the object of study for most of this manuscript.

Definition 7 (Free image).
A free image is a function F that maps an object p (i) and a parameter β into a set of objects {p k } = F(p (i) , β). When F is such that the Hamiltonian remains constant, we will refer to it as a free state-image.

Definition 8 (Free transition).
A free transition is defined as any transition p (i) → p (f ) , where p (f ) ∈ F(p (i) , β). When the Hamiltonian is constant and the parameter β is clear from the context, we will denote a free transition simply as ρ (i) → F(ρ (i) ).
Definition 9 (Tensoring objects). Given two objects $p = (\rho, H)$ and $p' = (\rho', H')$, we define the tensor product
$p \otimes p' := (\rho \otimes \rho', H \otimes \mathbb{1}' + \mathbb{1} \otimes H').$
In the definition we explicitly indicated on which tensor-factor the identity maps act. In the following, we will omit such indications when the information is clear from the context.
Definition 10 (Non-interacting objects). If an object based on a bipartite system of parts A and B has the form
$p_{AB} = (\rho_{AB}, H_A \otimes \mathbb{1}_B + \mathbb{1}_A \otimes H_B),$
we refer to it as a non-interacting object.
Non-interacting objects are those objects on which we define a partial trace.
Definition 11 (Partial traces). Given any two objects $p_S = (\rho_S, H_S)$ and $p_{|S} = (\rho_{|S}, H_{|S})$, we define the trace $\mathrm{tr}_{|S}$ as an operator acting on objects p of the form $p = p_S \otimes p_{|S}$ such that $\mathrm{tr}_{|S}(p) = p_S$. We extend this definition to all non-interacting objects by the partial trace on quantum states.
At this point a remark about Hamiltonians is in order. When we consider non-interacting objects, the local Hamiltonians are not well-defined: we can always change their traces by adding a global zero of the form $\lambda \mathbb{1}_A \otimes \mathbb{1}_B - \mathbb{1}_A \otimes \lambda \mathbb{1}_B$ to the global Hamiltonian. Therefore we will from now on call two Hamiltonian operators equivalent if they differ by a multiple of the identity, $H \sim H + \lambda \mathbb{1}$. For simplicity, we will, however, not indicate this in our notation and will just refer to the equivalence classes as Hamiltonians. We could also just fix the trace of the Hamiltonians. It will become clear later why we do not follow this path.
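That equivalent Hamiltonians describe the same thermodynamic situation can be seen in a small Python sketch: shifting all energies by a constant leaves the Gibbs state, and hence the free energy difference $\Delta F_\beta = S(\rho \| \omega_H)/\beta$ of a diagonal state, unchanged. The numbers and function names are hypothetical and serve only as an example.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

energies = [0.0, 0.4, 1.3]               # hypothetical spectrum of H
lam = 2.5                                # hypothetical shift, H -> H + lam * identity
shifted = [e + lam for e in energies]
beta = 1.2
p = [0.6, 0.3, 0.1]                      # hypothetical populations

# Largest deviation between the two Gibbs states, component by component.
gibbs_gap = max(abs(a - b) for a, b in zip(gibbs(energies, beta), gibbs(shifted, beta)))
# Deviation between the two monotone values.
monotone_gap = abs(delta_f(p, energies, beta) - delta_f(p, shifted, beta))
print(gibbs_gap, monotone_gap)           # both vanish up to rounding
```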
Definition 12 (Catalytic free image). Given the free image F, we define the catalytic free image $F_C$ as
$F_C(p, \beta) := \{ p' \mid \exists\, q : p' \otimes q \in F(p \otimes q, \beta) \}.$
When the Hamiltonian H is constant and the parameter β is clear from the context, we will denote a catalytic free state-transition simply as $\rho^{(i)} \to F_C(\rho^{(i)})$.
Note that this is less stringent than requiring that $p^{(i+1)} \in F_C(p^{(i)}, \beta)$ for every i. In Def. 14, it is not required that the catalyst is returned in the same state after each individual transition $p^{(i)} \to p^{(i+1)}$; it can be recycled for subsequent transitions in such a way that merely the initial state $q^{(1)}$ and the final state $q^{(n)}$ coincide. In the case of n = 2, a catalytic free sequence is simply a catalytic free transition.

Basic assumptions on the free transitions
In the main text we have focused on the resource theory of a-thermality, where the free operations are, loosely speaking, defined as the energy-preserving joint operations on system and bath. These are mathematically characterised by the GP-maps, or strictly contained subsets of operations, such as the thermal operations. However, our results potentially apply to widely different resource theories defined by other classes of free operations, not motivated by the thermodynamic context. In this endeavour, we aim at contributing to the emerging understanding of general resource theories [22,23,48]. We state below the first two assumptions on the free operations that are needed in order to derive the results of Sec. IV in the main text, in particular Theorems 1 and 2 (restated as Theorems 16 and 17 in this appendix). Later in this appendix, we state another two properties that are needed only to derive Theorem 3 of Sec. VI in the main text (restated as Theorem 20 in this appendix).
Lastly, note that Property 2 implies that the identity is a catalytic free transition, that is, p ∈ F C (p, β) for all β. This follows since one can take as catalyst q = p and perform a swap between the system and the catalyst.

Work quantifiers
Once we have specified the transitions and the free transitions, we can define a quantifier of the value of a given transition.

Definition 15 (Work quantifier).
We define the work quantifier as a function W that maps a pair of transition and parameter (p (i) → p (f ) , β) into the real numbers. If the transition is a state transition and the Hamiltonian and β are both clear from the context, we will simply write W(ρ (i) → ρ (f ) ).

Appendix B: General axioms
We will now present Axioms 1 and 2 of the main text, restated in a more precise manner by making use of the mathematical definitions of Sec. A 1.

Axiom 1. Given a collection of objects $(p^{(1)}, \ldots, p^{(n)})$ such that $p^{(n)} = p^{(1)}$, then
$\sum_{i=1}^{n-1} W(p^{(i)} \to p^{(i+1)}, \beta) \geq 0.$ (B1)
Axiom 1 ensures that if a set of states can be arranged in a cyclic sequence, the total work, given by the l.h.s. of (B1), cannot be negative. Otherwise, Arthur, who receives at the end the same object he possessed at the beginning, could repeat the protocol an arbitrary number of times and obtain an arbitrarily large benefit.

Axiom 2. Given a collection of objects $(p^{(1)}, \ldots, p^{(n)})$ that form a catalytic free sequence, then
$\sum_{i=1}^{n-1} W(p^{(i)} \to p^{(i+1)}, \beta) \leq 0.$ (B2)
If a set of states can be arranged in a sequence that can be performed by Merlin alone without expending any resources, that is, by making use of the free transitions and their catalytic extension, then the work has to be accounted for as being non-positive. In that way, Merlin does not obtain benefit from a procedure in which he does not invest resources. As β > 0 is fixed and defined by the context, we make use of the notation $W(p^{(i)} \to p^{(f)}, \beta) = W(p^{(i)} \to p^{(f)})$ and $F_C(p, \beta) = F_C(p)$ in the following.

Implications for the work definition
We now turn to exploring implications for the work quantifiers.
Theorem 16 (Theorem 1 in the main text). Consider a free state image F fulfilling Properties 1 and 2. In this case, Axioms 1 and 2 are fulfilled if and only if W satisfies the following properties,
i) $W(p \to q) \leq 0$ for all p and $q \in F_C(p)$,
ii) $W(p \to q) = -W(q \to p)$ for all p and q,
iii) $W(p^{(1)} \to p^{(2)}) + W(p^{(2)} \to p^{(3)}) = W(p^{(1)} \to p^{(3)})$ for all $p^{(1)}, p^{(2)}, p^{(3)}$.
Proof. We will first show that Axioms 1 and 2 imply properties i-iii). Property i) follows straightforwardly from Axiom 2 by taking n = 2. In this case, a catalytic free sequence is simply a catalytic free transition $p^{(1)} \to q \in F_C(p^{(1)})$.
In order to derive property ii) we will show first that any collection of objects $(p^{(1)}_W, p^{(2)}_W, p^{(3)}_W = p^{(1)}_W)$ forms a catalytic free sequence if the free transitions fulfil Property 2 (we include the label W to refer to the actual system and distinguish it from the catalyst). We will prove this by providing the particular catalyst A and the protocol so that the conditions of Definition 14 are met. Let us take the catalyst to be initially the object $p^{(2)}_A$. The first step is to swap W and A, which yields $p^{(2)}_W \otimes p^{(1)}_A$. The second, and last, step to complete the sequence is to swap W and A again, which yields $p^{(1)}_W \otimes p^{(2)}_A$; the catalyst is returned in its initial state and hence the conditions of Definition 14 are fulfilled. We have thus shown that any such collection forms a catalytic free state-sequence, which by Axiom 2 implies that
$W(p^{(1)} \to p^{(2)}) + W(p^{(2)} \to p^{(1)}) \leq 0.$
Combined with Axiom 1, one obtains ii).
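Condition iii) is derived in a similar fashion. We show first that any $(p^{(1)}, p^{(2)}, p^{(3)}, p^{(4)} = p^{(1)})$ constitutes a catalytic free state-sequence. Now we take as catalyst the object $p^{(2)}_{A_1} \otimes p^{(3)}_{A_2}$. We can first swap systems W and $A_1$ so that
$p^{(1)}_W \otimes p^{(2)}_{A_1} \otimes p^{(3)}_{A_2} \to p^{(2)}_W \otimes p^{(1)}_{A_1} \otimes p^{(3)}_{A_2}.$ (B5)
Secondly, we swap W and $A_2$, which yields
$p^{(3)}_W \otimes p^{(1)}_{A_1} \otimes p^{(2)}_{A_2}.$ (B6)
Then, swapping W and $A_1$ together with a swap of $A_2$ and $A_1$ delivers $p^{(1)}_W \otimes p^{(2)}_{A_1} \otimes p^{(3)}_{A_2}$, in which the catalyst $A_1 A_2$ is back in its initial state; hence the condition of Definition 14 is fulfilled. Applying Axioms 1 and 2 one arrives at the conclusion that
$W(p^{(1)} \to p^{(2)}) + W(p^{(2)} \to p^{(3)}) + W(p^{(3)} \to p^{(1)}) = 0.$ (B8)
This, together with property ii), easily yields iii).
Now we show that properties i-iii) imply Axioms 1 and 2. To start with, note that ii) implies that $W(p \to p) = 0$. Axiom 1 follows by iii), which implies that
$\sum_{i=1}^{n-1} W(p^{(i)} \to p^{(i+1)}) = W(p^{(1)} \to p^{(n)}) = W(p^{(1)} \to p^{(1)}) = 0.$
To show Axiom 2, note that a catalytic free sequence is defined by the property $p^{(i+1)} \otimes q^{(i+1)} \in F(p^{(i)} \otimes q^{(i)})$. Using Property 1 (composability), we find that
$p^{(n)} \otimes q^{(n)} \in F(p^{(1)} \otimes q^{(1)})$
with $q^{(n)} = q^{(1)}$. Hence $p^{(n)} \in F_C(p^{(1)})$. Now, using properties i) and iii), one can easily show (B2).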
Let us now show that Axioms 1 and 2, or equivalently Conditions i-iii) of Theorem 16, imply that the work function W must take a very particular form.
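Theorem 17 (Theorem 2 in the main text). Given a free image F that fulfils Properties 1 and 2, the function W fulfils Axioms 1 and 2 if and only if it can be written as
$W(p^{(i)} \to p^{(f)}, \beta) = M(p^{(f)}, \beta) - M(p^{(i)}, \beta),$ (B11)
where M is a monotone under $F_C$, that is,
$M(p^{(f)}, \beta) \leq M(p^{(i)}, \beta)$ if $p^{(f)} \in F_C(p^{(i)}, \beta)$.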
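Proof. First we will show that Axioms 1 and 2 imply (B11). For this, an intermediate step is proving that the axioms imply
$W(p^{(i)} \to p^{(f)}) \leq W(\tilde{p}^{(i)} \to p^{(f)})$ for all $\tilde{p}^{(i)} \in F_C(p^{(i)}, \beta).$ (B13)
This step is shown by contradiction. That is, if there were a $\tilde{p}^{(i)} \in F_C(p^{(i)}, \beta)$ violating (B13), then Axioms 1 and 2 would be violated, or equivalently, we arrive at a contradiction by using i-iii) of Theorem 16. For this, consider the total work $W_T$ along the cyclic sequence $p^{(i)} \to \tilde{p}^{(i)} \to p^{(f)} \to p^{(i)}$,
$W_T = W(p^{(i)} \to \tilde{p}^{(i)}) + W(\tilde{p}^{(i)} \to p^{(f)}) - W(p^{(i)} \to p^{(f)})$ (B14)
$\leq W(\tilde{p}^{(i)} \to p^{(f)}) - W(p^{(i)} \to p^{(f)})$ (B15)
$< 0,$ (B16)
where (B14) follows from ii), (B15) from i) and the assumption that $\tilde{p}^{(i)} \in F_C(p^{(i)}, \beta)$, and (B16) from the assumption that $W(p^{(i)} \to p^{(f)}) > W(\tilde{p}^{(i)} \to p^{(f)})$.
Clearly, (B16) is incompatible with Conditions i-iii), which imply that $W_T = W(p^{(i)} \to p^{(i)}) = 0$. With a similar reasoning, one can argue that the axioms imply that
$W(p^{(i)} \to p^{(f)}) \geq W(p^{(i)} \to \tilde{p}^{(f)})$ for all $\tilde{p}^{(f)} \in F_C(p^{(f)}, \beta).$
Also, let us note that conditions ii-iii) imply that W can always be written as $W(p^{(i)} \to p^{(f)}) = M(p^{(f)}) - M(p^{(i)})$. To see this, let us fix an object t and define $W(t \to p) =: M(p)$. Then we have, using ii-iii), that
$W(p^{(i)} \to p^{(f)}) = W(p^{(i)} \to t) + W(t \to p^{(f)}) = M(p^{(f)}) - M(p^{(i)}).$ (B18)
Combining (B18) with (B13) we get that
$M(p^{(f)}) - M(p^{(i)}) \leq M(p^{(f)}) - M(\tilde{p}^{(i)}),$
from which one easily obtains $M(p^{(i)}) \geq M(\tilde{p}^{(i)})$, which implies that M is a monotone. Lastly, one can easily check that if W is written as in (B11), then i-iii) are fulfilled, which in turn implies Axioms 1 and 2.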
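The converse direction of the proof, namely that any work function of the form $W = M(p^{(f)}) - M(p^{(i)})$ with M a monotone fulfils the conditions of Theorem 16, can be made concrete with a small Python sketch. It assumes diagonal states and the particular choice $M = \Delta F_\beta$; the free transition used for property i) is a partial thermalisation, which is an illustrative assumption, and all names and values are hypothetical.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

energies = [0.0, 0.5, 1.2]  # hypothetical spectrum
beta = 1.0

def M(p):
    """Monotone used in this sketch (assumed to be Delta F_beta)."""
    return delta_f(p, energies, beta)

def W(p, q):
    """Work value of the state transition p -> q."""
    return M(q) - M(p)

omega = gibbs(energies, beta)
p1, p2, p3 = [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.3, 0.3, 0.4]

# Hypothetical free transition: replace the state by the Gibbs state with probability t.
t = 0.4
q = [(1 - t) * a + t * w for a, w in zip(p1, omega)]

prop_i = W(p1, q)                                   # should be non-positive
prop_ii = W(p1, p2) + W(p2, p1)                     # should vanish
prop_iii = W(p1, p2) + W(p2, p3) - W(p1, p3)        # should vanish
cycle = W(p1, p2) + W(p2, p3) + W(p3, p1)           # total work of a cyclic sequence
print(prop_i, prop_ii, prop_iii, cycle)
```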
The transitions as defined in Definitions 5 and 6 are interpreted as the change of the quantum state and Hamiltonian of a physical system. Such transitions are implemented by an agent, Merlin, who has in principle access to the whole system.
We will now turn to considering multipartite systems with non-interacting Hamiltonians and we will be concerned with transitions in which Merlin has only access to a subset of the subsystems when performing a given transition. This will impose limitations on the pairs of p (i) and p (f ) that can be connected by such a local transition, as well as on the free transitions that can be performed locally.
and an ordering between them. S-local transitions are only defined for $p^{(i)}, p^{(f)}$ with the property that there exists a completely-positive trace-preserving map $E_S$ with support on S only, so that $E_S \otimes \mathbb{1}_{|S}(\rho^{(i)}) = \rho^{(f)}$. Furthermore, the two objects have to be non-interacting w.r.t. the S-cut. We will refer to the S-local transition as $p^{(i)} \xrightarrow{S} p^{(f)}$. Note that if $S = A_1 \cup \ldots \cup A_N$, then the S-local transition reduces to the usual transition of Definition 5.
An S-local transition is then any transition that can be performed by acting on the subsystems S only, when S and |S do not share any interacting Hamiltonian term. In that way, the Hamiltonian that is not supported on S does not change, and the state is altered by a completelypositive map acting on S only.

Definition 19 (S-local work quantifier).
We define an S-local work quantifier as a function $W_S$ that maps S-local transitions $p^{(i)} \xrightarrow{S} p^{(f)}$ and a parameter β into the real numbers. If the parameter β is clear from the context, we will simply write $W_S(p^{(i)} \xrightarrow{S} p^{(f)})$.

Axioms on local transitions
We will now introduce further axioms that apply to the function W S and discuss how it compares to the work measure for global transitions W. They correspond to Axioms 3 and 4 in the main text. (C8) Axiom 4 is a generalisation of Axiom 1 for the case where different local state-transitions are combined in a sequence. In order to denote compactly the arrangement of a given set of n objects and some S i -local transitions among them, we will subsequently employ the notation

Extending and reducing Hilbert spaces
In this subsection we introduce two further properties of the free operations. These are needed, together with Properties 1 and 2, to derive the results of Section VI of the main text, in particular Theorem 3. No other property of the free operations is required throughout the manuscript. Hence, all the results in the main text hold for any resource theory with free transitions fulfilling Properties 1-4.
Property 3 (Free objects). There exists a set of free objects $w_\beta$, such that for any object p, $p \otimes w_\beta \in F(p, \beta)$. (C10) The set of free objects is closed under tensor-products: for two free objects $w_\beta$, $w'_\beta$, the tensor-product $w_\beta \otimes w'_\beta = w''_\beta$ is again a free object.
We stress the parameter-dependence of the free objects: a free object for some parameter-value β will in general not be a free object for some other parameter-value β'.
Property 4 (Tracing as free operation). For any subsystem S of $A_1, \ldots, A_N$ of a product object, tracing out is in the free image. That is, $\mathrm{tr}_S(p) \in F(p, \beta)$.
In the case where the entire system is traced out, we introduce the notation $\mathrm{tr}_S(p) := \emptyset$. In this instance Property 4 is also fulfilled and we denote it by $W(p \to \emptyset, \beta) \leq 0$. The object ∅ can be seen as the pair (1, 0) on H = C. Note that it therefore fulfils $p \otimes \emptyset = p$ for every object p. It is therefore a free object independent of β. Properties 3 and 4 interplay with Property 1 (composability), yielding non-trivial properties as shown in the following proposition.
Proposition 1 (Free states have minimum monotone value). Given a catalytic free image $F_C$ that fulfils Properties 3 and 4, any monotone M under $F_C$ is such that $M(w_\beta, \beta) = M(\emptyset, \beta)$ for all free objects $w_\beta$. Furthermore $M(p, \beta) \geq M(w_\beta, \beta)$ for any p and free object $w_\beta$.
Proof. A monotone under $F_C$ is also a monotone under F transitions. Hence, it fulfils $M(p^{(f)}) \leq M(p^{(i)})$ whenever $p^{(f)} \in F(p^{(i)})$. Using Property 3, taking p = ∅, we find that $M(w_\beta) \leq M(\emptyset)$ for any free object $w_\beta$. Also, using Property 4 for the transition $w_\beta \to \emptyset$, we find that $M(w_\beta) \geq M(\emptyset)$, which implies that $M(w_\beta) = M(\emptyset)$ for every free object $w_\beta$. Lastly, Property 4 implies that $M(p) \geq M(\emptyset) = M(w_\beta)$ for any p.
Proof. We will first show that Axioms 1, 2, 3 and 4 imply (D1) with Properties 1.-3. First note that Axioms 1 and 2 imply (D1); this is the content of Theorem 17, which we include here for completeness. Let us also note that Property 3 together with Axioms 1 and 2 (or the equivalent conditions i-iii) of Theorem 16) implies that $W(p \to p \otimes w) \leq 0$. The same reasoning applied to Property 4 yields $0 \geq W(p \otimes w \to p) = -W(p \to p \otimes w)$. Hence we conclude that $W(p \to p \otimes w) = 0$ for all p and free objects w.
We will now show that the Axioms imply extensivity (D3). For this, let us consider the set of objects, arranged in the notation of (C9), Note that the conditions of Definition 18 for the two B-local transitions are met, since they apply to a product state; hence there always exists a completely-positive trace-preserving map performing the transition. Applying Axiom 4 and Eq. (D7) Using Axiom 3 and Property ii) of Theorem 16, one obtains A similar reasoning can be applied to the transitions . The reasoning holds for any object q B , hence we conclude that (D12) Now we use Property iii) from Theorem 16, which gives rise to and insert (D12) and W(p (i) → p (f ) ) = M (p (f ) ) − M (p (i) ) to get Using (C8) and (C6) we find that from which (D5) follows using (D1).
Lastly, we show that W with the properties of Theorem 20 fulfils Axioms 1-4. Axioms 1 and 2 follow from (D1). This is shown in Theorem 17. Axiom 3 is fulfilled independently of the particular form of W. Finally, Axiom 4 can be proven by using that (D5) can be rewritten as where the first equality follows from (D1). The second is a consequence of the fact that the conditions under which Axiom 4 must be proven are such that This completes the proof.

Appendix E: Gibbs-preserving and thermal operations
In this section, we will turn to two classes of operations that can be used to model thermodynamics previously discussed, that is, Gibbs-preserving operations (GPO) [8,39,43] and thermal operations (TO) [32]. We will first introduce the necessary objects, then define what state transitions are possible, and finally show that all the necessary properties are indeed fulfilled. Both GPO and TO have the same sets of free objects, induced by Gibbs-states.
Definition 21 (Gibbs objects). The free objects of GPO and TO are given by $w_\beta = (\omega_H, H)$, where $\omega_H = e^{-\beta H} / \mathrm{tr}(e^{-\beta H})$ is the Gibbs state, with any Hamiltonian H, and are called Gibbs objects.
A particular way to describe Gibbs-preserving transitions is through Gibbs-preserving operations: these are maps from objects to objects which induce Gibbs-preserving transitions.

Definition 23 (Gibbs-preserving operations).
A Gibbs-preserving operation is a map G that maps objects onto objects, such that
Proposition 4 (Convexity). Let M be a monotone for catalytic thermal transitions that is extensive and super-additive. Then, if the function ρ → M (ρ, H) (fixed Hamiltonian) is continuous (for a fixed Hilbert space), it is also convex.
Before we prove the validity of the proposition, let us show that random permutations with uniform distribution over the symmetric group can be implemented making use of thermal operations.
Lemma 31 (Permutations). Consider an N-partite state $\rho = \otimes_{i=1}^{N} \rho_i$ with the same Hamiltonian on each subsystem. For any permutation $\pi \in S(N)$ let $S_\pi$ be the unitary operator on the system that permutes the subsystems according to the permutation π. Define
$\rho' := \frac{1}{N!} \sum_{\pi \in S(N)} S_\pi (\rho_1 \otimes \cdots \otimes \rho_N) S_\pi^\dagger.$ (F1)
Then the transition $\rho \to \rho'$ can be done by thermal operations, independent of temperature.
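Proof. Consider a thermal bath of dimension N! and trivial Hamiltonian. We denote an orthonormal basis of the Hilbert space of the bath by $\{|\pi\rangle\}_{\pi \in S(N)}$. Then the operator
$U = \sum_{\pi \in S(N)} S_\pi \otimes |\pi\rangle\langle\pi|$
commutes with the total Hamiltonian of both system and bath since the system Hamiltonian is permutation invariant. Since the thermal state of the bath is maximally mixed, we get
$U \left( \rho \otimes \frac{\mathbb{1}}{N!} \right) U^\dagger = \frac{1}{N!} \sum_{\pi \in S(N)} S_\pi \rho S_\pi^\dagger \otimes |\pi\rangle\langle\pi|.$
Tracing out the bath now delivers ρ' on the system.
Proof of Proposition 4. Consider states $\sigma_1$, $\sigma_2$ and $\Omega = \sigma_1^{\otimes m} \otimes \sigma_2^{\otimes (N-m)}$ for some m = 1, . . . , N, and let Ω' be the state obtained from Ω by the random permutation (F1). By the previous lemma, the transition Ω → Ω' is a free operation. The marginals of Ω' on the N subsystems are given by
$\sigma = \frac{m}{N} \sigma_1 + \frac{N-m}{N} \sigma_2.$
Since decorrelating is a free operation (by correlated catalysis), the transition $\Omega' \to \sigma^{\otimes N}$ is free. Now consider the chain of transitions
$\rho^{\otimes N} \to \Omega \to \Omega' \to \sigma^{\otimes N}.$
By the above arguments, the second and third transitions have work cost less than or equal to 0. Thus, for the total transition we get
$W(\rho^{\otimes N} \to \sigma^{\otimes N}) \leq W(\rho^{\otimes N} \to \Omega).$ (F7)
Using extensivity we get $W(\rho^{\otimes N} \to \sigma^{\otimes N}) = N\, W(\rho \to \sigma)$ and $W(\rho^{\otimes N} \to \Omega) = m\, W(\rho \to \sigma_1) + (N-m)\, W(\rho \to \sigma_2)$, and hence
$N\, W(\rho \to \sigma) \leq m\, W(\rho \to \sigma_1) + (N-m)\, W(\rho \to \sigma_2).$
This shows convexity of M for rational mixtures. By continuity, we get convexity for arbitrary mixtures.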
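The final inequality of the proof can be checked numerically for one particular monotone. The following Python sketch does so for diagonal states with $M = \Delta F_\beta$, written in terms of M rather than W (the common reference state ρ drops out). The states, the values of N and m, and the function names are hypothetical choices for illustration.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

energies = [0.0, 1.0, 1.5]  # hypothetical spectrum
beta = 0.8
sigma1 = [0.8, 0.1, 0.1]    # hypothetical states entering the mixture
sigma2 = [0.1, 0.2, 0.7]
N, m = 5, 2                 # rational weight m/N

# Marginal of the symmetrised state: the mixture with weights m/N and (N-m)/N.
sigma = [(m * a + (N - m) * b) / N for a, b in zip(sigma1, sigma2)]

lhs = N * delta_f(sigma, energies, beta)
rhs = m * delta_f(sigma1, energies, beta) + (N - m) * delta_f(sigma2, energies, beta)
print(lhs <= rhs)           # convexity for this rational mixture
```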
Note that this theorem also implies that ∆F β is a monotone for catalytic thermal transitions, since these constitute a strict subset of catalytic Gibbs-preserving transitions. Hence ∆F β defines a valid work-quantifier for both Gibbs-preserving transitions and thermal operations.
We will separate the proof into several propositions. We will frequently use the following well-known properties of the relative entropy: Positivity directly implies that, for a fixed Hamiltonian H, the Gibbs-state ω H at inverse temperature β is the unique minimum of the function ρ → ∆F β (ρ, H) ≥ 0.
Thus we already know that ∆F β (w) = 0 for any Gibbs object w and that ∆F β (p) > 0 if p is not a Gibbs object.
Proposition 5 (Extensivity of the free energy difference). The function ∆F β is extensive.
Proof. The proof follows immediately from Property 3. of the relative entropy.
Proposition 6 (Super-additivity of the free energy difference). The function ∆F β fulfils generalised strong super-additivity.
Proof. First note that it is sufficient to show the property for all bipartite systems. So assume that two objects . We need to show that (G12) If ω A , ω B are two Gibbs-states, then it is easy to prove, using locality, that for any state ρ AB on AB we have Using this relation we can rewrite the r.h.s. of Eq. (G12) as .
Using ρ B and extensivity, we find that the second term in brackets reduces to But from the data-processing inequality we get that
By choosing as final object a Gibbs object we get as a corollary the "usual" super-additivity: If p A1A2 is a non-interacting object, ∆F β (p A1A2 ) ≥ ∆F β (p A1 ) + ∆F β (p A2 ). (G18)
What is left to be proven is that ∆F β is a monotone under free (catalytic) transitions.
where the last inequality is the data-processing inequality.
Proposition 8 (Monotonicity under catalytic Gibbs-preserving transitions). The function ∆F β is a monotone under catalytic Gibbs-preserving transitions.

Corollary 33 (Mapping Gibbs objects to Gibbs objects).
Catalytic Gibbs-preserving transitions map Gibbs objects to Gibbs objects.
Proof. Consider a transition w ⊗ q → r ⊗ q. Then ∆F β (r) ≤ ∆F β (w) = 0. But ∆F β ≥ 0 and ∆F β vanishes only on Gibbs-objects. Hence r has to be a Gibbs object.
This finishes the proof of Theorem 32.
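The extensivity and super-additivity of $\Delta F_\beta$ established in Propositions 5 and 6 can be illustrated with a minimal Python sketch for a classical bipartite distribution on a non-interacting system. In this restricted setting (diagonal states, $\Delta F_\beta = S(\rho \| \omega_H)/\beta$), the excess of the joint value over the sum of the local values equals the mutual information divided by β. The joint distribution, spectra and names are hypothetical.

```python
import math

def gibbs(energies, beta):
    """Populations of the Gibbs state."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def delta_f(p, energies, beta):
    """Free energy difference to the Gibbs state for diagonal states."""
    omega = gibbs(energies, beta)
    s = sum(pi * math.log(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return s / beta

beta = 2.0
e_a, e_b = [0.0, 1.0], [0.0, 0.5]     # local spectra (hypothetical)
joint = [[0.4, 0.1], [0.1, 0.4]]      # correlated populations of AB (hypothetical)

p_a = [sum(row) for row in joint]                # marginal on A
p_b = [sum(col) for col in zip(*joint)]          # marginal on B
flat = [x for row in joint for x in row]         # joint state as one vector
e_ab = [ea + eb for ea in e_a for eb in e_b]     # non-interacting Hamiltonian

local_sum = delta_f(p_a, e_a, beta) + delta_f(p_b, e_b, beta)
excess = delta_f(flat, e_ab, beta) - local_sum   # super-additivity: excess >= 0

mutual_info = sum(
    joint[i][j] * math.log(joint[i][j] / (p_a[i] * p_b[j]))
    for i in range(2) for j in range(2) if joint[i][j] > 0
)

# Extensivity: for the product of the marginals the excess vanishes.
product = [x * y for x in p_a for y in p_b]
excess_product = delta_f(product, e_ab, beta) - local_sum
print(excess, mutual_info / beta, excess_product)
```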