Limits to catalysis in quantum thermodynamics

Quantum thermodynamics is a research field that aims at fleshing out the ultimate limits of thermodynamic processes in the deep quantum regime. A complete picture of quantum thermodynamics allows for catalysts, i.e., systems that facilitate state transformations while remaining essentially intact in their state, much like catalysts in chemical reactions. In this work, we present a comprehensive analysis of the power and limitations of such thermal catalysis. Specifically, we provide a family of optimal catalysts that can be returned with minimal trace distance error after facilitating a state transformation process. To incorporate the genuine physical role of a catalyst, we identify very significant restrictions on arbitrary state transformations under dimension or mean energy bounds, using methods of convex relaxations. We discuss the implications of these findings for possible thermodynamic state transformations in the quantum regime.


I. INTRODUCTION
In chemical reactions, it is common that a certain reaction should in principle be allowed, but in reality cannot take place (or occurs at extremely low rates) because of the presence of some large energy barrier. Fortunately, the situation is sometimes redeemed by the presence of certain chemical substances, referred to as catalysts, which effectively lower the energy barrier across the transformation. That is to say, catalysts significantly increase the reaction rates. Importantly, these catalysts can remain unchanged after the occurrence of the reaction, and hence a small amount of catalytic substance could be used repeatedly and is sufficient to facilitate the chemical reaction of interest.
The basic principles of chemical reactions are governed by thermodynamic considerations such as the second law. There have been a number of recent advances in the quest to understand the fundamental laws of thermodynamics [1][2][3][4][5][6]. These efforts are especially focused on the quantum nano-regime, where finite size effects and quantum coherences are becoming increasingly relevant. One particularly insightful approach is to cast thermodynamics as a resource theory [2,3,7,8], reminiscent of notions in entanglement theory [9][10][11]. In this framework, thermodynamics can be seen as the theory that describes conditions for a state transformation ρ → σ from some quantum state to another under thermal operations (TO). The notion of TO means allowing for the full set of global unitaries which are energy preserving in the presence of some thermal bath. This is a healthy and fruitful standpoint, and allows the application of many concepts and powerful tools derived from information theory [12][13][14].
In the context of thermal operations, catalysts emerge as ancillary systems that facilitate state transformation processes: there are cases where ρ → σ is not possible, but there exists a state ω C such that ω C ⊗ ρ → ω C ⊗ σ is possible. The metaphor of catalysis is indeed appropriate: by using such a catalyst ω C , one is enabled to perform the thermodynamic transformation ρ → σ, while returning the catalyst in its exact original form. This is called exact catalysis. The inclusion of catalyst states in thermal operations serves as an important step towards an eventual complete picture of quantum thermodynamics; it allows us to describe thermodynamic transformations in the full picture, where the system is interacting with experimental apparatus, for example a clock system. Furthermore, it has been shown that one can obtain necessary and sufficient conditions for exact catalysis in terms of a whole family of generalised free energies [1]. The ordinary second law of ever-decreasing free energy is but the constraint on one of these free energies.
Naturally, for physically realistic scenarios inexact catalysis is anticipated, where the catalyst is returned except for a slight degradation. However, rather surprisingly, it has been shown [1] that at least in some cases, the conditions for catalytic transformations are highly non-robust against small errors induced in the catalyst. The form of the second law thus depends crucially on the measure used to quantify inexactness. In particular, if inexactness is defined in terms of small trace distance, then there is no second law at all: for any ε > 0 one could pick any two states ρ and σ, and starting from ω C ⊗ ρ, get ε-close in terms of trace distance to ω C ⊗ σ via thermal operations. We refer to this effect as thermal embezzling: Here one observes that instead of merely catalysing the reaction, energy/purity has possibly been extracted from the catalyst and used to facilitate thermodynamic transformations, while leaving the catalyst state arbitrarily close to being intact [15]. On physical grounds, such a setting seems implausible, even though it is formally legitimate. A clarification of this puzzle seems very much warranted.
Argued formally, a first hint towards a resolution may be provided by looking at how the error depends on the system size. Naturally, the trace distance error ε depends on the dimension of the catalyst states dim(ω C ) = n; nevertheless one can find examples of catalysts where ε → 0 as n approaches infinity. While examples show that in principle thermal embezzling may occur [1], hardly anything else is known. Indeed, it would be interesting to understand the crucial properties that distinguish between a catalyst and an active reactant in thermodynamics. From a physical perspective, it seems highly desirable to understand to what extent the effect of embezzling can even occur for physically plausible systems.
In this work, we highlight both the power and limitations of thermal catalysis, by providing comprehensive answers to the above questions raised. Firstly, we construct a family of catalyst states depending on dimension n, which achieves the optimal trace distance error while facilitating the state transformation ρ → σ, for ρ and σ being some arbitrary m-dimensional states. This is done for the regime where the Hamiltonians of the system and catalyst are trivial. Secondly, we show that thermal embezzling with arbitrary precision cannot happen under reasonable constraints on the catalyst. More precisely, whenever the dimension of the catalyst is bounded, we derive non-zero bounds on the trace distance error. By making use of splitting techniques to simplify the optimization problems of interest, such bounds can also be obtained when the expectation value of energy of the catalyst state is finite, for catalyst Hamiltonians with unbounded energy eigenvalues and a finite partition function. We hence set very strong limitations on the possibility of enlarging the set of allowed operations in quantum thermodynamics, if systems with reasonable Hamiltonians are being considered.

A. The power of thermal embezzling
We begin by exploring the case of vanishing, trivial Hamiltonians, where it is known that thermal embezzling can occur. This is also the simplest case of thermodynamics in resource theory [1], when all energy levels are fully degenerate, and the Hamiltonian is simply proportional to the identity operator. Entropy and information, instead of energy, become the main quantities that measure the usefulness of resources. In such cases, the sole condition governing a transition from some quantum state ρ to σ is that the eigenvalue vector of ρ majorizes that of σ [2]. This is commonly denoted as ρ ≻ σ. Such a condition also implies that entropy can never decrease under thermal operations [1].
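As a minimal sketch of the majorization criterion above (not part of the original analysis), the following Python function tests ρ ≻ σ on eigenvalue vectors by comparing partial sums of the decreasingly ordered entries. The function name `majorizes`, the tolerance `tol`, and the example vectors are illustrative assumptions.

```python
def majorizes(p, q, tol=1e-12):
    """Return True if the probability vector p majorizes q (p ≻ q)."""
    n = max(len(p), len(q))
    # Pad with zeros so that both vectors have the same length.
    p_sorted = sorted(list(p) + [0.0] * (n - len(p)), reverse=True)
    q_sorted = sorted(list(q) + [0.0] * (n - len(q)), reverse=True)
    sum_p = sum_q = 0.0
    for a, b in zip(p_sorted, q_sorted):
        sum_p += a
        sum_q += b
        # Every partial sum of p must dominate the matching partial sum of q.
        if sum_p < sum_q - tol:
            return False
    return True


# Illustrative spectra (assumed values): a pure state, a generic state,
# and the maximally mixed state in dimension 3.
pure = [1.0, 0.0, 0.0]
rho = [0.6, 0.3, 0.1]
mixed = [1 / 3, 1 / 3, 1 / 3]
```

In this example the pure state majorizes `rho`, which in turn majorizes the maximally mixed state, in line with the statement that a pure state majorizes every state and the maximally mixed state is majorized by every state.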
To investigate thermal embezzling in this setting, one asks: given fixed m and n, what is the smallest ε such that there exists a catalyst state $\omega_C$ satisfying

$$\omega_C \otimes \frac{\mathbb{I}}{m} \rightarrow \omega'_C \otimes |0\rangle\langle 0|, \qquad (1)$$

where the trace distance $d(\omega_C, \omega'_C)$ between the input catalyst $\omega_C$ and the output catalyst $\omega'_C$ is not greater than ε? This trace distance is used as a measure of catalytic error throughout our analysis. If some catalyst pair $(\omega_C, \omega'_C)$ satisfies condition Eq. (1) with trace distance ε, then it also facilitates $\omega_C \otimes \rho \rightarrow \omega'_C \otimes \sigma$ for any m-dimensional states ρ, σ. This is because a pure state majorizes any other state, while the maximally mixed state $\mathbb{I}/m$ is majorized by any other state.
Since majorization conditions depend solely on the eigenvalues of the density matrices $\omega_C$ and $\omega'_C$, one can phrase this problem of state transformation in terms of a linear minimization program over catalyst states diagonal and ordered in the same basis (see appendix). In fact, the eigenvalues of $\omega_C$, $\omega'_C$ which give rise to the optimal trace distance error can be obtained from such a linear program, although these eigenvalues are non-unique. Whenever m ≥ 2 and $n = m^a$ where a ≥ 1 is an integer, we provide an analytic construction of catalyst states, which we later show to be optimal for the state transformation in Eq. (1). Let the initial catalyst state be

$$\omega_C = \sum_{i=1}^{n} \omega_i\, |i\rangle\langle i|, \qquad \omega_1 = \frac{m}{1+(m-1)\log_m n}, \qquad \omega_i = \begin{cases} m^{-\lceil \log_m i \rceil}\, \omega_1 & 2 \le i \le n/m, \\ 0 & i > n/m. \end{cases} \qquad (2)$$

Note that our catalyst state $\omega_C$ does not have full rank, and this is crucial for the majorization condition in Eq. (1) to hold, since ρ ≻ σ implies that rank(ρ) ≤ rank(σ), and the joint state $\omega'_C \otimes |0\rangle\langle 0|$ can have at most rank n. The output catalyst $\omega'_C$ can be obtained from $\omega_C$ by subtracting a small value ε from the largest eigenvalue $\omega_1$ and distributing the amount ε equally over the indices i > n/m. This brings $\omega'_C$ to be a state of full rank n. We show that this family achieves the trace distance error

$$d(\omega_C, \omega'_C) = \frac{m-1}{1+(m-1)\log_m n}, \qquad (3)$$

which we prove by mathematical induction to be optimal, given fixed m, n where $n = m^a$ (see appendix). Figs. 1 and 2 compare our final catalyst state with the state

$$\tilde{\omega}_C = \frac{1}{C(n)} \sum_{i=1}^{n} \frac{1}{i}\, |i\rangle\langle i|, \qquad (4)$$

with $C(n) = \sum_{i=1}^{n} 1/i$ being the normalization constant. The family $\tilde{\omega}_C$ was proposed in Ref. [15] for embezzling in the LOCC setting. In Fig. 3, we compare the trace distance error achieved by the catalyst $\tilde{\omega}_C$ from Ref. [15] with the error achieved by our catalyst $\omega_C$. We see that for small dimensions our catalyst outperforms $\tilde{\omega}_C$; asymptotically, however, the error scales as 1/log n for both catalysts.

Figure 1. The eigenvalues of our final catalyst state $\omega'_C$ (blue) versus those of $\tilde{\omega}_C$ proposed in Ref. [15] (red, dashed), for m = 2 and n = 8.

Figure 2. The eigenvalues of our final catalyst state $\omega'_C$ (blue) versus those of $\tilde{\omega}_C$ proposed in Ref. [15] (red, dashed), for m = 3 and n = 27.

B. The limits of thermal embezzling
In this section, we are interested in finding additional physical restrictions which prevent thermal embezzling. To do so, we look at general Hamiltonians for both the system and catalyst, where the energy of the system comes into play. In [1], it is shown that the monotonicity of quantum Rényi divergences [16] forms a necessary condition for state transformations. More precisely, for arbitrary $\rho_S$ and $\rho'_S$, if $\rho_S \rightarrow \rho'_S$ is possible via catalytic thermal operations, then for all α ≥ 0,

$$D_\alpha(\rho_S \| \tau_S) \ge D_\alpha(\rho'_S \| \tau_S) \qquad (5)$$

holds, where $\tau_S$ is the thermal state of system S, at temperature T of the thermal bath. Eq. (5) implies that one can use the monotonicity of Rényi divergences to find lower bounds on the thermal embezzling error for state transformations between arbitrary states $\rho_S$ and $\rho'_S$. For simplicity, we present the case where $\rho_S$ and $\rho'_S$ are diagonal (in the energy eigenbasis of $H_S$). The case of arbitrary states can be treated similarly, and details are given in the appendix.
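For the case where two states ρ and σ are diagonal in the same basis, the Rényi divergences are defined as

$$D_\alpha(\rho \| \sigma) = \frac{1}{\alpha-1} \log \sum_i \rho_i^{\alpha} \sigma_i^{1-\alpha}, \qquad (6)$$

where $\{\rho_i\}$, $\{\sigma_i\}$ are the eigenvalues of ρ and σ. Again, for states $\rho_S$ and $\rho'_S$ diagonal, it suffices to look at a single transformation

$$\omega_C \otimes \tau_S \rightarrow \omega'_C \otimes \Pi^S_{\max}, \qquad (7)$$

where $\Pi^S_{\max} = |E^S_{\max}\rangle\langle E^S_{\max}|$ is the pure energy eigenstate with energy $E^S_{\max}$. Note that both $\tau_S$ and $\Pi^S_{\max}$ are diagonal in the energy eigenbasis. As explained in the appendix, Eq. (7) is sufficient to ensure universal thermal embezzling for arbitrary states $\rho_S$ and $\rho'_S$ as long as they are diagonal in the same energy eigenbasis. Similarly, one can take $\omega_C$ and $\omega'_C$ to be diagonal in the energy eigenbasis of $H_C$ [1]. This can be written as the following minimization problem, ε being the solution of

$$\varepsilon = \min_{\omega_C, \omega'_C} d(\omega_C, \omega'_C) \quad \text{s.t.} \quad D_\alpha(\omega_C \otimes \tau_S \| \tau_{CS}) \ge D_\alpha(\omega'_C \otimes \Pi^S_{\max} \| \tau_{CS}) \quad \forall \alpha \ge 0, \qquad (8)$$

where $\tau_{CS} = \tau_C \otimes \tau_S$ is the thermal state of the catalyst and system. The system Hamiltonian $H_S$ is assumed to be finite.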
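A straightforward relaxation of Eq. (8) allows us to now consider an alternative problem for some fixed α,

$$\varepsilon_\alpha = \min_{\omega_C, \omega'_C} d(\omega_C, \omega'_C) \quad \text{s.t.} \quad D_\alpha(\omega_C \otimes \tau_S \| \tau_{CS}) \ge D_\alpha(\omega'_C \otimes \Pi^S_{\max} \| \tau_{CS}). \qquad (9)$$

From Ref. [1], we know that any $(\omega_C, \omega'_C)$ feasible for Eq. (8) is also feasible for Eq. (9). Therefore, for any α ≥ 0, $\varepsilon \ge \varepsilon_\alpha$. By choosing α suitably, one can arrive at much simpler optimization problems that provide lower bounds for the trace distance error. We apply this to study two cases, detailed below.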
1. Bounded dimension: Consider the case where both the system H S and catalyst Hamiltonians H C have fixed dimensions, and denote the maximum energy eigenvalues as E S max , E C max respectively. One sees that the solution of Eq. (8) is lower bounded by Eq. (9) for α → ∞. Recall that w.l.o.g. we can assume that ω C and ω ′ C are diagonal in the same basis, which we take to be the energy eigenbasis. Eq. (9) can be rewritten as where τ i = Z C −1 e −βE C i are the probabilities defined by the thermal state of the catalyst Hamiltonian, and Z C is the partition function of the catalyst system. To solve this problem, we note that the optimal strategy to maximize the quantity max i ω i /τ i within the ε−ball of ω ′ C is to increase one of the eigenvalues by ε, so that the quantity max i (ω i + ε)/τ i is maximized. With further details in the appendix, we show that the trace distance error can therefore be lower-bounded by where Z S , Z C are the partition functions of the system and catalyst. Although this bound is valid for arbitrary finite-dimensional Hamiltonians, it is not tight. Indeed, in the case of trivial Hamiltonians where all states have constant energy value, normalized to 0, the partition functions Z S , Z C reduce to the dimension m, n of the system and catalyst. This bound then yields d opt (0 S , 0 C ) ≥ (m − 1)/n, which is much weaker than the optimal trace distance we derived in Eq. (3).

2. Hamiltonians with unbounded energy levels:
A more general result holds for unbounded dimension and energy levels where the partition function Z C is finite. More precisely, for such cases, we show that setting an upper bound on the average energy of the catalyst state limits thermal embezzling.
Let us now explain the proof of our results. Consider some H C with unbounded energy levels {E C j }. For simplicity, we restrict ourselves to the case where the catalyst states are diagonal in the energy eigenbasis, and assume the system Hamiltonian to be trivial with dimension m = 2. A more general derivation involving arbitrary system Hamiltonians may be found in the appendix.
A) Formulation of the problem: Consider the minimization of catalytic error under the relaxed constraint that monotonicity for the α-Rényi divergence is satisfied. Using Eq. (9) with α = 1/2, by substituting H S = 0 S , the first constraint can be simplified as follows Furthermore, we require that the initial catalyst state has an expectation value of energy no larger than some finite E. In summary, we now look at the minimization of trace distance under the following constraints where $\gamma = e^{-\beta/2} \in (0, 1)$. Denote the solution of this problem as ε. In the subsequent steps, our goal is to show that ε is lower bounded by a non-zero constant, by making use of techniques of convex relaxations of optimisation problems. This is an intricate problem, as it is non-convex in both ω i and ω ′ i .

B) Splitting a relaxed minimization problem:
The key idea to proceed is to suitably split the problem into two independent optimization problems in a relaxation, which then turn out to be convex optimization problems whose duals can be readily assessed. The starting point of this approach is rooted in the observation that for any ω i , ω ′ i ∈ [0, 1], the following inequality holds true, Since requiring the R.H.S. of Eq. (13) to be positive is less stringent than requiring the same of the L.H.S., one can now further use it to obtain a lower bound for the minimization in Eq. (12). By defining a new variable the solution ζ of which obeys ε ≥ ζ. In the next step, we will see that the relaxed problem in Eq. (14) is much simpler to solve, since it can be written as two separate, independent optimization problems. One can see now that the variables x i , ω i are independent of each other. This allows us to first perform a minimization of the function $\sum_i \sqrt{\omega_i}\, \gamma^{E^C_i}$ under constraints involving ω i only.
C) Invoking energy constraints to provide lower bound: The energy constraint on ω C plays a crucial role in lower bounding the solution. Intuitively, when such a constraint is placed for some finite E, it implies that the probability of populating some relatively low energy levels cannot be vanishingly small. We prove this with more rigor in the appendix. Along this line of reasoning, one concludes that for the minimization its solution ε 1 > 0 has to be strictly positive. More precisely, where j(W ) = min{j : . A derivation of this expression can be found in the appendix.

D) Merging both problems:
After obtaining a lower bound for the subproblem Eq. (15), we recombine the two problems into Eq. (14) to obtain This is a quadratic optimization problem in the variables √ x i , hence it is easy to obtain the Lagrange dual of this problem, which takes on a very simple form involving the minimization of a quadratic function w.r.t. λ. Solving this we arrive at a lower bound where $Z_C = \sum_i e^{-\beta E^C_i}$ is the partition function of the catalyst Hamiltonian. We summarize our findings in Table I.

III. DISCUSSION AND CONCLUSION
The bounds on dimensionality are closely related to energy restrictions. While placing an upper bound on the dimension directly implies an upper bound on the average energy, the reverse statement is not generally true. However, if one restricts not only the expectation value of the energy distribution, but also restricts its variance to be finite, then this is almost equivalent to placing a dimension restriction. For example, given any non-degenerate Hamiltonian H C with unbounded eigenvalues, consider the set of catalyst states such that the average energy and variance of a given catalyst are finite. Then, by the Chebyshev inequality, this is approximately equivalent to introducing a cut-off on the maximum energy eigenvalue (and therefore on the dimension), since the population far above the mean energy is small. We note that a bounded mean energy does not by itself imply a bounded variance: for the harmonic oscillator, for example, there are states with bounded mean energy whose energy variance is unbounded.
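The role of the variance in this argument can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a computation from the original work: levels are taken equally spaced with E_k = k (a harmonic oscillator in units of the level spacing), the geometric ratio q = 0.5 and the power-law population proportional to (k + 1)^-3 are hypothetical choices, and all function names are introduced here. It checks the Chebyshev tail bound for a distribution with finite variance, and shows a population whose mean stays bounded while its variance keeps growing with the cut-off.

```python
def moments(probs):
    """Mean and variance of the level index k under the distribution probs."""
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


def tail_weight(probs, mean, var, k_sigma):
    """Total population at least k_sigma standard deviations away from the mean."""
    radius = k_sigma * var ** 0.5
    return sum(p for k, p in enumerate(probs) if abs(k - mean) >= radius)


def power_law(cutoff):
    """Hypothetical heavy-tailed population p_k proportional to (k + 1)^-3 for k < cutoff."""
    weights = [(k + 1) ** -3 for k in range(cutoff)]
    total = sum(weights)
    return [w / total for w in weights]


# Geometric (thermal-like) population with assumed ratio q, truncated at 200 levels.
q = 0.5
geometric = [(1 - q) * q ** k for k in range(200)]
g_mean, g_var = moments(geometric)
g_tail = tail_weight(geometric, g_mean, g_var, k_sigma=3)  # Chebyshev: at most 1/9

# Heavy tail: the mean barely moves with the cut-off, the variance keeps growing.
mean_small, var_small = moments(power_law(10 ** 3))
mean_large, var_large = moments(power_law(10 ** 5))
```

For the geometric population the weight beyond three standard deviations respects the Chebyshev bound 1/9, so an energy cut-off discards little population. For the power-law population no such cut-off is implied by the mean alone, since the second moment grows logarithmically with the number of levels.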
In the case of infinite-dimensional Hamiltonians, we have also shown that for certain classes of catalyst Hamiltonians, explicit bounds can be derived on the trace distance error of a catalyst when the average energy is finite. Our results cover a large range of Hamiltonians which are commonly found in physical systems (including the important case of the harmonic oscillator in free systems) with the minimal assumption that the partition function Z C is finite, which holds for all systems for which the canonical ensemble is well-defined. However, we know that thermal embezzling can be arbitrarily accurate as the dimension grows, at least in the simplest case of the trivial Hamiltonian. This implies that there will be specific cases of infinite-dimensional Hamiltonians where bounds on the average energy alone do not give explicit bounds on the thermal embezzling error. We suspect that this may be true for Hamiltonians with unbounded dimension, but upper bounded energy levels. The reason is that if the dimension is unbounded, then there must exist an accumulation point in the energy spectrum. The subspace of this accumulation point will be very similar to the trivial Hamiltonian.
In summary, we have investigated the phenomenon of thermal embezzling under different physical scenarios. While one acknowledges that thermal embezzling is possible in the fully degenerate Hamiltonian case, we show that under many realistic circumstances, with physically motivated restrictions, thermal embezzling cannot happen with arbitrary accuracy. In this sense, we resolve the puzzle of thermal embezzling, hence further contributing to a complete understanding of the thermodynamic laws in the quantum world.

APPENDIX
In this appendix we fully elaborate our findings on thermal catalysis. We begin in Section I A by explaining the similarities and subtle differences between thermal embezzling and embezzling in the LOCC setting. The Rényi divergences and their relation to thermal operations are detailed in Section I B. Proceeding to Section II, we focus on thermal embezzling for trivial Hamiltonians with fixed dimensions. On the one hand, we investigate the problem of finding a catalyst which allows us to perform thermal embezzling with minimum possible error in trace distance. We detail the proofs on our construction of a catalyst family (given dimension parameters for both system of interest and catalyst), and prove that our construction achieves the optimal embezzling error.
On the other hand, by placing restrictions on the dimension, we derive non-zero lower bounds for the embezzling error, considering arbitrary system and catalyst Hamiltonians. The proofs are detailed in Section III. Some technical background on α-Rényi divergences and their relation to thermodynamic operations is given. Lastly, in Section IV we focus on infinite-dimensional Hamiltonians, with unbounded energy levels (and finite partition function). We show that as long as the average energy of the catalyst is finite, explicit lower bounds on the accuracy of embezzling can be obtained.

A. Thermodynamics as a resource theory
Resource theories are frameworks useful in identifying states which are valuable, under specific classes of allowed operations and states given for free. A state is a valuable resource if one can use it to create many other states under the set of allowed operations. Thermodynamics can be viewed as a resource theory [2,3], where the allowed operations are the so-called thermal operations. They are summarized as follows: consider a system S, given a state ρ S and the Hamiltonian H S , one can
Recently, the framework of thermal operations has been used to prove a second law in Ref. [1] by including catalytic effects. This is because there exist certain states ρ and σ such that via thermal operations, ρ ↛ σ, but ρ ⊗ ω C → σ ⊗ ω C for some state ω C . More precisely, catalytic effects can be accounted for by adding a fourth rule
4. for any catalyst system C with Hamiltonian H C , attach any additional catalyst state ω C , as long as the returned state ω ′ C is ε-close to its original state ω C ,
to the set of allowed operations. One can now ask, given ρ S , which states ρ ′ S can be obtained from ρ S under approximate catalytic thermal operations? More precisely, do there exist ω C , ω ′ C which are ε-close to each other, such that ω C ⊗ ρ S → ω ′ C ⊗ ρ ′ S ? Depending on ε and the measure of closeness used, the conditions for ρ S → ρ ′ S to occur can vary. For example, if ε is required to be zero, i.e. the catalyst must be returned in its exact form, then Ref. [1] shows that for any ρ S and ρ ′ S such that ρ S → ρ ′ S is possible via catalytic thermal operations, a whole set of Rényi divergences must necessarily decrease. In the next Section I B, we define the Rényi divergences and state the results of [1] in detail. On the other hand, if ε is measured in terms of trace distance between the input and output catalyst only, Ref. [1] also proves that for any ε > 0, the state transformation conditions are trivial, i.e. any ρ S can be transformed to any ρ ′ S .
We refer to the following phenomenon as thermal embezzling: by requiring only the input and output catalyst to be close in terms of trace distance, one can achieve ρ S → ρ ′ S for any ρ S , ρ ′ S . Another well-studied example of a resource theory is entanglement theory, where the allowed operations are those that can be implemented using local operations and classical communication (LOCC), while free states are the set of separable states. The interconversion of resource states in entanglement theory has been studied intensively, and has also provided insight into the resource theory of thermodynamics.
Embezzling states were originally introduced for the LOCC setting in Ref. [15]. An entangled state $|\nu(n)\rangle_{AB} \in \mathbb{C}^n \otimes \mathbb{C}^n$ shared between two parties A and B can be used as a resource to prepare some other state $|\psi\rangle_{AB}$ (of much smaller dimension). The fidelity between the actual final state and $|\nu(n)\rangle_{AB}|\psi\rangle_{AB}$ is denoted by 1 − ε, which goes to 1 when n goes to infinity. This enables the approximate preparation of the state $|\psi\rangle_{AB}$, while the embezzling resource state is also left close to its original state. Such a preparation can even be achieved simply via local operations (LO). The family $|\nu(n)\rangle_{AB}$ is called a universal embezzling state if it enables the preparation of any $|\psi\rangle_{AB}$. While this seemingly violates entanglement monotonicity under LOCC operations, one quickly realises that this is because the closeness in entanglement content of two states depends not only on the fidelity, but also on the dimension. Hence entanglement is used up to prepare $|\psi\rangle_{AB}$, while $|\nu(n)\rangle_{AB}$ remains close to intact on the whole. However, there is also something special about embezzling states, in the sense that a maximally entangled state does not serve as a good embezzling state. In Ref. [11], a comprehensive study of the general characteristics of embezzling states was conducted, providing insight into the structure a state must have to be a good embezzler. The power of embezzling in LOCC has been applied in several areas of quantum information, such as coherent state exchange protocols [10], projection games [17], or as a theoretical tool in proving the Quantum Reverse Shannon Theorem [18]. There are some similarities between thermal embezzling and LOCC embezzling, but also many distinctive features. Most significantly, in thermodynamical systems, the Hamiltonian which determines the evolution of the system plays an important role in state conversion conditions [19]. This feature is absent in LOCC embezzling.
We summarize the similarities and differences of LOCC and thermal embezzling in Table II.

Table II. An overview of differences between LOCC and thermal embezzling.

B. Rényi divergences as thermal monotones
In this section we detail the conditions for state transformation under catalytic thermal operations, which are closely related to the Rényi divergences. The simplest case of catalytic thermal operations is when all Hamiltonians H S , H C are trivial. For arbitrary states ρ and σ, ρ → σ is possible if and only if ρ ≻ σ [2]. In the case where H S or H C is non-trivial, state conversion conditions are affected by the involved Hamiltonians. More precisely, instead of majorization, we need to consider the monotonicity of Rényi divergences as a (necessary) condition for state transformations. These conditions are used later in Sections III and IV to investigate the limits of thermal embezzling. Let us first define these quantities in Definition I.1.
Definition I.1 (Rényi divergences [16]). Given arbitrary states ρ, σ ≥ 0, for α ∈ [0, ∞], the Rényi divergence of ρ relative to σ is defined as For ρ, σ diagonal in the same basis, let $p = (p_1, p_2, \ldots, p_n)$ and $q = (q_1, q_2, \ldots, q_n)$ denote the eigenvalue vectors of ρ and σ respectively. Then the Rényi divergences reduce to the form

$$D_\alpha(\rho \| \sigma) = \frac{1}{\alpha-1} \log \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}.$$

It has been shown that the quantities $D_\alpha(\rho \| \tau)$ are thermal monotones, where τ is the thermal state of the system of interest. Intuitively, this implies that thermal operations can only bring the system of interest closer to its thermal state with the same temperature T as the bath. We detail this in the following Lemma I.2.
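For diagonal states the divergences can be evaluated directly on eigenvalue vectors. The following is a minimal sketch, assuming the standard classical form $D_\alpha(p\|q) = \frac{1}{\alpha-1}\log\sum_i p_i^\alpha q_i^{1-\alpha}$ with the usual limits at α = 1 (relative entropy) and α = ∞ (logarithm of the largest ratio), and assuming q has full support, as a thermal state does. The function name, the qubit energies (0 and 1) and the inverse temperature β = 1 are illustrative assumptions.

```python
import math


def renyi_divergence(p, q, alpha):
    """Classical Rényi divergence D_alpha(p||q) in nats; q must have full support."""
    if alpha == 1:
        # Limit alpha -> 1: the relative entropy.
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    if alpha == math.inf:
        # Limit alpha -> infinity: log of the largest ratio p_i / q_i.
        return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1)


# Assumed example: a qubit with energies 0 and 1 at inverse temperature beta = 1.
beta = 1.0
z = 1.0 + math.exp(-beta)
tau = [1.0 / z, math.exp(-beta) / z]   # thermal state
excited = [0.0, 1.0]                   # pure excited energy eigenstate
alphas = [0, 0.5, 1, 2, math.inf]
d_excited = [renyi_divergence(excited, tau, a) for a in alphas]
d_thermal = [renyi_divergence(tau, tau, a) for a in alphas]
```

In this example the divergence of the pure excited state relative to the thermal state equals −log τ₁ for every α, while the thermal state itself has zero divergence, so any thermal operation mapping the thermal state to the excited state would violate monotonicity for all α.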

Lemma I.2 (Monotonicity under thermal operations [1]). Given some Hamiltonian $H_A$, consider arbitrary states $\rho_A$, $\rho'_A$ such that $\rho_A \rightarrow \rho'_A$ is possible via catalytic thermal operations. Denote by $\tau_A$ the thermal state of system A. Then for any α ∈ [0, ∞),

$$D_\alpha(\rho_A \| \tau_A) \ge D_\alpha(\rho'_A \| \tau_A). \qquad (4)$$

Furthermore, for any $\rho_A$, $\rho'_A$ diagonal in the eigenbasis of $H_A$, if Eq. (4) holds for all α ≥ 0, then $\rho_A \rightarrow \rho'_A$ is possible via catalytic thermal operations.
In essence, Lemma I.2 implies that the monotonicity of Rényi divergences is a necessary condition for arbitrary state transformations, and for states diagonal in the energy eigenbasis it is also sufficient. Let us also use a notation which was introduced in [19] for diagonal states: if $\rho_S \rightarrow \rho'_S$ is possible via catalytic thermal operations, we say that there exists a catalyst $\omega_C$ such that $\omega_C \otimes \rho_S \succ_T \omega_C \otimes \rho'_S$. We refer to the relation $\succ_T$ as thermo-majorization. Now, let us consider the scenario of preparing a pure excited state of maximum energy $\Pi^S_{\max} = |E^S_{\max}\rangle\langle E^S_{\max}|$ from a thermal state $\tau_S$. Intuitively, if we concern ourselves only with diagonal state transformations, then this is the hardest thermal embezzling scenario possible. This is because $\Pi^S_{\max} \succ_T \rho_S \succ_T \tau_S$ holds for any diagonal $\rho_S$. Therefore, whenever we investigate the case where the involved states are diagonal, it suffices to analyse the preparation of such a pure excited state. The necessary and sufficient conditions are

$$D_\alpha(\omega_C \otimes \tau_S \| \tau_C \otimes \tau_S) \ge D_\alpha(\omega'_C \otimes \Pi^S_{\max} \| \tau_C \otimes \tau_S) \quad \forall \alpha \ge 0.$$

In the next lemma, we show that given fixed Hamiltonians and dimensions, any catalyst state that succeeds in preparing such a state can also be used to facilitate any other state transformation.

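Lemma I.3 (Universal embezzlers for diagonal states). Suppose there exist catalyst states $\omega_C$, $\omega'_C$ such that $\omega_C \otimes \tau_S \rightarrow \omega'_C \otimes \Pi^S_{\max}$ is possible via thermal operations. Then for any states $\rho_S$, $\rho'_S$ diagonal in the energy eigenbasis of $H_S$, the transformation $\omega_C \otimes \rho_S \rightarrow \omega'_C \otimes \rho'_S$ is possible as well.
Proof. This can be proven by noting that $\omega_C \otimes \tau_S \rightarrow \omega'_C \otimes \Pi^S_{\max}$ is equivalent to the existence of a thermal operation, denoted by $M$, such that $M(\omega_C \otimes \tau_S) = \omega'_C \otimes \Pi^S_{\max}$. It remains to show that for any $\rho_S$, $\rho'_S$, there exists a thermal operation $M'$ such that $M'(\omega_C \otimes \rho_S) = \omega'_C \otimes \rho'_S$. Since the thermal state $\tau_S$ is thermo-majorized by any state $\rho_S$, i.e. $\rho_S \succ_T \tau_S$, and $\Pi^S_{\max}$ thermo-majorizes any other state $\rho'_S$, i.e. $\Pi^S_{\max} \succ_T \rho'_S$, there exist thermal operations $N_1$, $N_2$ such that $N_1(\rho_S) = \tau_S$ and $N_2(\Pi^S_{\max}) = \rho'_S$. Finally, consider

$$M' = (\mathrm{id}_C \otimes N_2) \circ M \circ (\mathrm{id}_C \otimes N_1);$$

then one sees that

$$M'(\omega_C \otimes \rho_S) = (\mathrm{id}_C \otimes N_2)\, M(\omega_C \otimes \tau_S) = (\mathrm{id}_C \otimes N_2)(\omega'_C \otimes \Pi^S_{\max}) = \omega'_C \otimes \rho'_S.$$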

II. OPTIMAL THERMAL CATALYST FOR TRIVIAL HAMILTONIANS
In this section we look at a specific thermodynamic transformation involving system (S) and catalyst (C) states of dimension m and $n = m^a$ respectively. For the trivial Hamiltonian where all states have the same energy, the thermal state of the system is simply the fully mixed state $\mathbb{I}_S/m$, while any pure state corresponds to $\Pi^S_{\max}$, so we simply pick $|0\rangle\langle 0|$ without loss of generality. Note that thermo-majorization conditions are reduced to the simplest form, i.e. that

$$\omega_C \otimes \frac{\mathbb{I}_S}{m} \rightarrow \omega'_C \otimes |0\rangle\langle 0|_S$$

is possible if and only if the initial state majorizes the latter, i.e.

$$\omega_C \otimes \frac{\mathbb{I}_S}{m} \succ \omega'_C \otimes |0\rangle\langle 0|_S. \qquad (9)$$
In this section we give a construction of catalyst states which allow this transformation, and prove that our construction achieves the optimal trace distance $d(\omega_C, \omega'_C) = \frac{1}{2} \|\omega_C - \omega'_C\|_1$ in any fixed dimension $n = m^a$. Furthermore, these states are universal embezzlers, since any catalyst which successfully creates $|0\rangle\langle 0|_S$ from $\mathbb{I}_S/m$ would also allow one to obtain any $\rho'_S$ from any $\rho_S$, as shown in Lemma I.3.
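Definition II.1. Consider integers m ≥ 2 and $n = m^a$ where a ≥ 1. Let $S_{m,n}$ be the set of n-dimensional catalyst state pairs $(\omega_C, \omega'_C)$ enabling the transformation $\omega_C \otimes \mathbb{I}_S/m \rightarrow \omega'_C \otimes |0\rangle\langle 0|_S$. Let $d_{m,n} = \min\{ d(\omega_C, \omega'_C) \mid (\omega_C, \omega'_C) \in S_{m,n} \}$.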

A family of catalyst states
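We offer the following construction of catalyst input and output states in any dimension $n = m^a$ where m ≥ 2 and a ≥ 1 are integers. We take the output catalyst to be

$$\omega'_C = \sum_{i=1}^{n} \omega'_i\, |i\rangle\langle i|, \qquad \omega'_i = \begin{cases} \omega'_1 & 1 \le i \le m, \\ m^{1-\lceil \log_m i \rceil}\, \omega'_1 & m < i \le n, \end{cases} \qquad \omega'_1 = \frac{1}{1+(m-1)\log_m n}.$$

A simple way to visualise this is as follows: for the first m elements, the distribution is uniform with some probability $\omega'_1$; for elements m + 1 up to $m^2$ the distribution is uniform again, with probability $\omega'_1/m$; and so on up to $n = m^a$. The value $\omega'_1$ is then chosen so that the full distribution is normalised. We choose the input catalyst state to be

$$\omega_C = \sum_{i=1}^{n} \omega_i\, |i\rangle\langle i|, \qquad \omega_i = \begin{cases} m\, \omega'_1 & i = 1, \\ \omega'_i & 2 \le i \le n/m, \\ 0 & i > n/m. \end{cases}$$

Such a state $\omega_C$ is obtained from $\omega'_C$ by setting all the probabilities for i > n/m to zero, while renormalizing by increasing the largest peak of the probability distribution. Note that $\omega_1 > \omega'_1$ while $\omega_i \le \omega'_i$ for all i ≥ 2. The trace distance between $\omega_C$ and $\omega'_C$ can be calculated to be

$$d(\omega_C, \omega'_C) = \omega_1 - \omega'_1 = (m-1)\, \omega'_1 = \frac{m-1}{1+(m-1)\log_m n}.$$

This shows that

$$d_{m,n} \le \frac{m-1}{1+(m-1)\log_m n}, \qquad (14)$$

since we have constructed a specific state pair achieving this trace distance. In the next section we will see that for catalysts satisfying Eq. (9), smaller values of trace distance cannot be achieved, which implies that Eq. (14) holds with equality, and the family presented above is optimal.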
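The block structure described above can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the normalisation $\omega'_1 = 1/(1+(m-1)a)$ is derived here from the block description (m entries at $\omega'_1$, then blocks of $m^k - m^{k-1}$ entries at $\omega'_1/m^{k-1}$), and the function names `catalyst_pair`, `trace_distance` and `majorizes` are hypothetical helpers, not the authors' code.

```python
from fractions import Fraction


def catalyst_pair(m, a):
    """Eigenvalues (input, output) of the block-uniform catalyst family for n = m**a."""
    n = m ** a
    w1_out = Fraction(1, 1 + (m - 1) * a)  # normalisation derived from the block sizes
    out = []
    for k in range(1, a + 1):
        block_size = m if k == 1 else m ** k - m ** (k - 1)
        out += [w1_out / m ** (k - 1)] * block_size
    cutoff = n // m
    # Input state: drop all weight beyond index n/m and add it to the largest eigenvalue.
    inp = [out[i] if i < cutoff else Fraction(0) for i in range(n)]
    inp[0] += 1 - sum(inp)
    return inp, out


def trace_distance(p, q):
    """Trace distance between two states diagonal in the same basis."""
    return sum(abs(x - y) for x, y in zip(p, q)) / 2


def majorizes(p, q):
    """Exact majorization test on equal-length vectors of Fractions."""
    sum_p = sum_q = Fraction(0)
    for x, y in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        sum_p += x
        sum_q += y
        if sum_p < sum_q:
            return False
    return True


def check(m, a):
    """Return (trace distance, whether omega ⊗ I/m majorizes omega' ⊗ |0><0|)."""
    inp, out = catalyst_pair(m, a)
    n = m ** a
    lhs = [w / m for w in inp for _ in range(m)]   # spectrum of omega_C ⊗ I/m
    rhs = out + [Fraction(0)] * (n * (m - 1))      # spectrum of omega'_C ⊗ |0><0|
    return trace_distance(inp, out), majorizes(lhs, rhs)
```

For example, `check(2, 3)` (m = 2, n = 8) gives a trace distance of 1/4 and `check(3, 3)` (m = 3, n = 27) gives 2/7, both equal to (m − 1)/(1 + (m − 1)a), with the majorization condition satisfied. In this construction the two joint spectra coincide, so the majorization condition holds with equality.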

Optimal catalysis
In this section we show by induction that
$$d_{m,n} \geq \frac{m-1}{1 + (m-1)a}.$$
Recall that our problem is to minimize, over states $\omega_C, \omega'_C$, the trace distance $d(\omega_C, \omega'_C)$ such that Eq. (9) is satisfied. We first show that it suffices to minimize over states which are diagonal in the same basis.
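Lemma II.2 (States diagonal in the same basis). Consider fixed $n$-tuples of eigenvalues $(\omega_1, \dots, \omega_n)$ and $(\omega'_1, \dots, \omega'_n)$, and states $\omega_C$, $\omega'_C$ with these eigenvalues such that $\omega_C$ is diagonal in a basis $\{|e_i\rangle\}$ and $(\omega_C, \omega'_C)$ satisfies Eq. (9). Then there exists a state $\tilde\omega_C$, diagonal in the basis $\{|e_i\rangle\}$, such that $d(\omega_C, \tilde\omega_C) \leq d(\omega_C, \omega'_C)$ and that $(\omega_C, \tilde\omega_C)$ also satisfies Eq. (9).

Proof. There are two steps in this proof: firstly, we construct $\tilde\omega_C$ from $\omega'_C$ and show that the trace distance does not increase by invoking the data processing inequality. Then, we use Schur's theorem to show that majorization holds. Let $\tilde\omega_C = N(\omega'_C)$, where $N(\rho) = \sum_i |e_i\rangle\langle e_i|\,\rho\,|e_i\rangle\langle e_i|$ is the fully dephasing channel in the basis $\{|e_i\rangle\}$. Note that since $\omega_C$ is already diagonal in $\{|e_i\rangle\}$, $N(\omega_C) = \omega_C$. Because the trace distance is non-increasing under quantum operations [20], we have
$$d(\omega_C, \tilde\omega_C) = d(N(\omega_C), N(\omega'_C)) \leq d(\omega_C, \omega'_C).$$
On the other hand, we will show that $\omega'_C \succ \tilde\omega_C$. For any matrix $M$, let $\lambda(M)$ be the vector of its eigenvalues. We want to show that $\lambda(\omega'_C) \succ \lambda(\tilde\omega_C)$. Recall that $\tilde\omega_C = N(\omega'_C)$ and, from the definition of $N$, observe that the eigenvalues $\lambda(\tilde\omega_C)$ are precisely the diagonal elements of $\omega'_C$ in the basis $\{|e_i\rangle\}$. Schur's theorem ([21], Chapter 9, Theorem B.1.) says that for any Hermitian matrix $M$, the diagonal elements of $M$ are majorized by $\lambda(M)$. Therefore, $\lambda(\omega'_C) \succ \lambda(\tilde\omega_C)$ and thus $\omega'_C \succ \tilde\omega_C$. Making use of the initial assumption $\omega_C \otimes \mathbb{I}_S/m \succ \omega'_C \otimes |0\rangle\langle 0|_S$, we now see that
$$\omega_C \otimes \frac{\mathbb{I}_S}{m} \succ \omega'_C \otimes |0\rangle\langle 0|_S \succ \tilde\omega_C \otimes |0\rangle\langle 0|_S,$$
which concludes the proof.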
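Both steps of the proof can be illustrated on a single qubit. The sketch below is an illustrative example under simplifying assumptions, not part of the original argument: states are restricted to real symmetric $2 \times 2$ matrices, stored as triples `(a, b, c)` for $\begin{pmatrix} a & c \\ c & b \end{pmatrix}$, and the particular numbers for `omega` and `omega_prime` are arbitrary choices.

```python
import math

def eigenvalues_2x2(a, b, c):
    """Eigenvalues (descending) of the real symmetric matrix [[a, c], [c, b]]."""
    mean = (a + b) / 2
    radius = math.hypot((a - b) / 2, c)
    return mean + radius, mean - radius

def trace_distance_2x2(rho, sigma):
    """Half the trace norm of rho - sigma, each given as a triple (a, b, c)."""
    a, b, c = (r - s for r, s in zip(rho, sigma))
    hi, lo = eigenvalues_2x2(a, b, c)
    return (abs(hi) + abs(lo)) / 2

def dephase(rho):
    """Fully dephasing channel N in the computational basis: drop the off-diagonal."""
    a, b, _ = rho
    return (a, b, 0.0)

omega = (0.9, 0.1, 0.0)          # input catalyst, already diagonal
omega_prime = (0.6, 0.4, 0.3)    # output catalyst with coherence (0.6 * 0.4 >= 0.3**2)
omega_tilde = dephase(omega_prime)

d_before = trace_distance_2x2(omega, omega_prime)
d_after = trace_distance_2x2(omega, omega_tilde)

# Schur's theorem: the diagonal (0.6, 0.4) is majorized by the eigenvalues of omega_prime.
largest_eigenvalue = eigenvalues_2x2(*omega_prime)[0]
largest_diagonal = max(omega_tilde[0], omega_tilde[1])

if __name__ == "__main__":
    print(d_before, d_after, largest_eigenvalue, largest_diagonal)
```

For two-component probability vectors with equal sums, majorization reduces to comparing the largest entries, which is why a single comparison suffices here.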
We are now ready to establish our lower bound on $d_{m,n}$, where we will use the fact established in Lemma II.2 that we can take both states to be diagonal in the same basis.

Theorem II.3. Consider integers $m \geq 2$ and $n = m^a$ where $a \geq 1$. Then
$$d_{m,n} = \frac{m-1}{1 + (m-1)a},$$
where $d_{m,n}$ is defined in Definition II.1. Hence, the family of catalyst states from Section II.1 is optimal.
Proof. The majorization condition only depends on the eigenvalues of $\omega$ and $\omega'$. For fixed eigenvalues, the trace distance $d(\omega, \omega')$ is minimized if the two states share the same eigenbasis and the eigenvalues are ordered in the same way, e.g., in decreasing order, as discussed in Lemma II.2. Hence, from now on we consider only diagonal states $\omega = \mathrm{diag}(\omega_1, \dots, \omega_n)$ and $\omega' = \mathrm{diag}(\omega'_1, \dots, \omega'_n)$, with $\omega_1 \geq \dots \geq \omega_n$ and $\omega'_1 \geq \dots \geq \omega'_n$. Here, $\mathrm{diag}(\cdots)$ denotes the diagonal matrix with the corresponding diagonal elements. To prove the theorem we only need to show that
$$d_{m,n} \geq \frac{m-1}{1 + (m-1)a},$$
as the other inequality follows from the family of embezzling states exhibited in Section II.1. We use induction on the power $a$.
For the base case $a = 1$, we need to show that $d_{m,m} \geq 1 - 1/m$. Consider any feasible solution $(\omega, \omega')$ in dimension $n = m$. From the majorization condition it follows that $\omega_1/m \geq \omega'_1$ and $\omega_i = 0$ for $i \geq 2$. Hence, $\omega_1 = 1$ and $1/m \geq \omega'_1$. Since $\omega'_1$ is the largest of the $m$ values $\omega'_i$, we get $\omega'_i = 1/m$ for all $i$. Finally, a simple calculation reveals that $d(\omega, \omega') = 1 - 1/m$, which establishes the base case.

For the inductive step, we assume that
$$d_{m,n} \geq \frac{m-1}{1 + (m-1)a}$$
for some $n = m^a$ and aim to show that
$$d_{m,k} \geq \frac{m-1}{1 + (m-1)(a+1)} \tag{22}$$
for $k = m^{a+1}$. The main idea is to consider an optimal catalyst pair $(\omega, \omega') \in S_{m,k}$ and from it construct a catalyst pair $(\sigma, \sigma') \in S_{m,n}$ in dimension $n = m^a$. Since our construction will allow us to relate $d(\sigma, \sigma') \geq d_{m,n}$ to $d(\omega, \omega') = d_{m,k}$, we then obtain a lower bound on $d_{m,k}$ in terms of $d_{m,n}$ as in Eq. (22).

Let us start by using the state pair that satisfies Eq. (19) and achieves $d_{m,k}$, and from it derive some useful properties. Firstly, pick $(\omega, \omega') \in S_{m,k}$ so that $d(\omega, \omega') = d_{m,k}$. As before, without loss of generality, we assume that $\omega = \mathrm{diag}(\omega_1, \dots, \omega_k)$ and $\omega' = \mathrm{diag}(\omega'_1, \dots, \omega'_k)$, with diagonal elements in decreasing order. The majorization condition
$$\omega \otimes \frac{\mathbb{I}_S}{m} \succ \omega' \otimes |0\rangle\langle 0|_S \tag{24}$$
again implies that $\omega_1 > \omega'_1$ and $\omega_i = 0$ for $i > k/m = m^a$. To further simplify matters, we can also assume that $\omega_i \leq \omega'_i$ for all $i \geq 2$. This is because we can always replace $\omega$ with $\tilde\omega = \mathrm{diag}(\tilde\omega_1, \dots, \tilde\omega_k)$, where
$$\tilde\omega_i = \min\{\omega_i, \omega'_i\}$$
for $i \geq 2$ and $\tilde\omega_1$ is chosen so that $\sum_i \tilde\omega_i = 1$. In essence, all the majorization advantage of $\omega$ against $\omega'$ can be piled upon the first, largest eigenvalue of $\omega$. This replacement is valid since $(\tilde\omega, \omega')$ still satisfies the majorization condition. Furthermore,
$$d(\tilde\omega, \omega') = \tilde\omega_1 - \omega'_1 = \sum_i \left(\omega_i - \min\{\omega_i, \omega'_i\}\right) = d(\omega, \omega')$$
implies that the distance is unchanged.

Subsequently, we proceed to bound $d_{m,n}$. To do this, construct a catalyst pair $(\sigma, \sigma') \in S_{m,n}$ in dimension $n = m^a = k/m$. Essentially, this is done by directly applying a cut to the dimension of the final catalyst state $\omega'$, reducing it to dimension $k/m = n$. Similarly, the same amount of probability is cut from the initial state, and both states are renormalized.
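Let us describe this in more detail: denote $\delta = \sum_{i > k/m} \omega'_i$ and pick an index $s$ and a value $\tilde\omega_s \leq \omega_s$ so that $\sum_{i<s} \omega_i + \tilde\omega_s = 1 - \delta$. Note that $s \leq k/m^2$, since the majorization condition Eq. (24) implies that
$$\sum_{i \leq k/m^2} \omega_i \geq \sum_{i \leq k/m} \omega'_i = 1 - \delta.$$
This inequality is obtained by summing up the first $k/m$ elements of both distributions in the L.H.S. and R.H.S. of Eq. (24). We now define
$$\sigma = \frac{1}{1-\delta}\,\mathrm{diag}(\omega_1, \dots, \omega_{s-1}, \tilde\omega_s, 0, \dots, 0), \qquad \sigma' = \frac{1}{1-\delta}\,\mathrm{diag}(\omega'_1, \dots, \omega'_{k/m}).$$
Since $\sum_{i<s} \omega_i + \tilde\omega_s = \sum_{i \leq k/m} \omega'_i = 1 - \delta$, the states $\sigma$ and $\sigma'$ are properly normalized. To establish that $(\sigma, \sigma') \in S_{m,n}$, we need to show that the majorization condition holds true. We consider two separate cases: when $\tilde\omega_s = \omega_s$, and when $\tilde\omega_s \neq \omega_s$.

If $\tilde\omega_s = \omega_s$, then the inequalities in the majorization condition for $(\sigma, \sigma')$ have already been enforced by the majorization condition of $(\omega, \omega')$. Hence, $(\sigma, \sigma')$ is a valid catalyst pair in dimension $n = k/m$, i.e., $(\sigma, \sigma') \in S_{m,n}$. Let us now make the following two observations.

1. $d(\omega, \omega') \geq \delta$. To see this, recall that $\omega_i = 0$ for $i > k/m = n$, and thus
$$d(\omega, \omega') \geq \sum_{i > k/m} (\omega'_i - \omega_i) = \delta.$$
2. $d(\sigma, \sigma') = d(\omega, \omega')/(1-\delta)$. To see this, note that
$$d(\sigma, \sigma') = \sigma_1 - \sigma'_1 = \frac{\omega_1 - \omega'_1}{1-\delta} = \frac{d(\omega, \omega')}{1-\delta}, \tag{31}$$
since only the first diagonal element of $\sigma$ is strictly larger than the corresponding diagonal element of $\sigma'$.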
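Combining observations (1) and (2) gives
$$d_{m,n} \leq d(\sigma, \sigma') = \frac{d(\omega, \omega')}{1-\delta} \leq \frac{d_{m,k}}{1 - d_{m,k}}.$$
Rearranging and using the inductive hypothesis gives us
$$d_{m,k} \geq \frac{d_{m,n}}{1 + d_{m,n}} \geq \frac{m-1}{1 + (m-1)(a+1)},$$
and we have completed the inductive step.

If $\tilde\omega_s \neq \omega_s$, then the majorization inequalities involving $\tilde\omega_s$ might fail to hold. Therefore, instead of $(\sigma, \sigma')$ we consider the following, slightly different, pair of states
$$\zeta = \sigma, \qquad \zeta' = \frac{1}{1-\delta}\,\mathrm{diag}(\omega'_1, \dots, \omega'_{m(s-1)}, \underbrace{l, \dots, l}_{m}, \omega'_{ms+1}, \dots, \omega'_{k/m}),$$
where
$$l = \frac{1}{m} \sum_{i = m(s-1)+1}^{ms} \omega'_i.$$
The diagonal elements of $\zeta'$ are still in descending order, and the state is properly normalized. To argue that $(\zeta, \zeta')$ is a valid pair of catalyst states, we need to verify the majorization inequalities that are not directly implied by the majorization condition for $(\omega, \omega')$. That is, we need to verify that for all $1 \leq j \leq m$,
$$C + j\,\frac{\tilde\omega_s}{m} \geq C' + j\,l, \tag{38}$$
where
$$C = \sum_{i < s} \omega_i, \qquad C' = \sum_{i \leq m(s-1)} \omega'_i.$$
We can see that this is true for the state pair $(\zeta, \zeta')$ because in the regime of Eq. (38) both sides increase linearly with the index $j$, and at the endpoints $j = 0$ and $j = m$ the L.H.S. is no smaller than the R.H.S., which is guaranteed by the majorization condition for $(\omega, \omega')$,
$$C \geq C', \qquad C + \tilde\omega_s \geq C' + m\,l.$$
Therefore, $(1-p)\,C + p\,(C + \tilde\omega_s) \geq (1-p)\,C' + p\,(C' + m\,l)$ for any $0 \leq p \leq 1$. Taking $p = j/m$ yields the desired inequality (38) and hence $(\zeta, \zeta')$ is a valid catalyst pair. Lastly, note that reasoning similar to the one in Eq. (31) can be used to deduce that
$$d(\zeta, \zeta') = \frac{d(\omega, \omega')}{1-\delta}.$$
Therefore, $d(\zeta, \zeta') = d(\sigma, \sigma')$ and we can use the argument from the previous case to complete the inductive step.

By this induction we have shown that $d_{m,n} \geq (m-1)/(1 + (m-1)a)$ for all $m$, $n = m^a$ and $a \geq 1$. This together with the conclusion in Section II.1 that $d_{m,n} \leq (m-1)/(1 + (m-1)a)$ proves that
$$d_{m,n} = \frac{m-1}{1 + (m-1)a},$$
and the state pair described in Eq. (11) and (12) is optimal.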
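The arithmetic behind the induction can be checked exactly. The sketch below assumes the inductive step takes the form $d_{m,k} \geq d_{m,n}/(1 + d_{m,n})$, which is our reading of how observations (1) and (2) combine, and verifies that iterating this recursion from the base case $1 - 1/m$ reproduces the closed form $(m-1)/(1+(m-1)a)$. The function names are illustrative.

```python
from fractions import Fraction

def closed_form(m, a):
    """Claimed optimal trace distance in dimension n = m**a."""
    return Fraction(m - 1, 1 + (m - 1) * a)

def iterate_bound(m, a):
    """Start from the base case a = 1 and apply the inductive step a - 1 times."""
    d = Fraction(m - 1, m)              # base case: d_{m,m} = 1 - 1/m
    for _ in range(a - 1):
        d = d / (1 + d)                 # assumed step: d_{m,k} >= d_{m,n} / (1 + d_{m,n})
    return d

if __name__ == "__main__":
    for m in (2, 3, 5):
        print(m, [str(iterate_bound(m, a)) for a in range(1, 5)])
```

The closed form decays only as $1/a$, i.e. logarithmically in the catalyst dimension $n = m^a$.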

A. Diagonal states
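In our work, we use two particular quantities, the Rényi divergences for $\alpha = 1/2$ and $\alpha = \infty$, which for classical probability distributions $p$, $q$ have the following form:
$$D_{1/2}(p\|q) = -2 \log \sum_i \sqrt{p_i q_i}, \qquad D_\infty(p\|q) = \log \max_i \frac{p_i}{q_i}. \tag{42}$$
As mentioned in Section I B, given Hamiltonians $H_S$ and $H_C$, it suffices to consider
$$\omega_C \otimes \tau_S \to \omega'_C \otimes \Pi^S_{\max}.$$
Here, we prove that whenever the dimensions of the catalyst and the system are finite, there exists a lower bound on the accuracy of thermal embezzling. Such a bound depends on $H_S$ and $H_C$. To do so, consider the problem
$$\varepsilon = \min_{\omega_C, \omega'_C} \|\omega_C - \omega'_C\|_1 \quad \text{s.t.} \quad \omega_C \otimes \tau_S \to \omega'_C \otimes \Pi^S_{\max}. \tag{44}$$
In [1], it has been shown that for initial and target states commuting with the Hamiltonian $H_S$, it is sufficient to consider catalyst states commuting with $H_C$. Therefore, since $\tau_S$ and $\Pi^S_{\max}$ both commute with $H_S$, it is sufficient to consider input and output catalyst states which are diagonal in the basis of $H_C$.

Since all Rényi divergences $D_\alpha$ are thermal monotones according to Lemma I.2, this holds in particular for the min-relative entropy ($D_\infty$), obtained for $\alpha \to \infty$,
$$D_\infty(\rho\|\rho') = \log \max_i \frac{\rho_i}{\rho'_i},$$
where $\rho_i$ and $\rho'_i$ are the eigenvalues of $\rho$, $\rho'$ respectively. Therefore, satisfying the thermo-majorization conditions in Eq. (44) implies that
$$D_\infty(\omega_C \otimes \tau_S \,\|\, \tau_{CS}) \geq D_\infty(\omega'_C \otimes \Pi^S_{\max} \,\|\, \tau_{CS}).$$
To further simplify this expression, note that $\tau_{CS} = \tau_C \otimes \tau_S$ and that $D_\alpha(\rho \otimes \rho' \| \sigma \otimes \sigma') = D_\alpha(\rho\|\sigma) + D_\alpha(\rho'\|\sigma')$. The additivity of Rényi divergences under tensor products holds for all states. Furthermore, $D_\alpha(\rho\|\rho) = 0$ for any $\rho$. Therefore, we arrive at the expression
$$D_\infty(\omega_C \| \tau_C) \geq D_\infty(\omega'_C \| \tau_C) + \log \frac{Z_S}{e^{-\beta E^S_{\max}}}, \tag{46}$$
where $Z_S$ is the partition function of the system. The spectral values of $\omega_C$ and $\omega'_C$ are denoted as $\{\omega_j\}$ and $\{\omega'_j\}$, respectively. Using the definition of $D_\infty$ as shown in Eq. (42), we obtain
$$\max_i \frac{\omega_i}{\tau_i} \geq \frac{Z_S}{e^{-\beta E^S_{\max}}} \max_i \frac{\omega'_i}{\tau_i},$$
where $\tau_i = e^{-\beta E^C_i}/Z_C$ are the eigenvalues of the thermal state of the catalyst, for the energy eigenstate with energy eigenvalue $E^C_i$, with normalization $Z_C$, the partition function of the catalyst. Since $\tilde\varepsilon$ is the minimum trace distance between states $\omega_C$, $\omega'_C$, and $D_\infty$ depends only on the maximum of $\omega'_i/\tau_i$ across the distribution, the optimal strategy to increase $D_\infty$ while going from $\omega'_C$ to $\omega_C$ is to increase a specific $\omega'_i$ by an amount $\tilde\varepsilon$.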
Therefore, we can consider a relaxation of Eq. (44)

In the next lemma, we show that $\varepsilon \geq \tilde\varepsilon \geq \delta > 0$ whenever $E^C_{\max}, E^S_{\max} < \infty$.

Lemma III.1 (Lower bound to error in catalysis). Consider system and catalyst Hamiltonians which are finite-dimensional, and denote $\{E^S_i\}$ and $\{E^C_i\}$ to be the sets of energy eigenvalues respectively. Then for some fixed $E^C_{\max}$, $E^S_{\max}$, consider any probability distribution $r$ (which corresponds to the eigenvalues of a catalyst $\omega$), and $\tilde\varepsilon$ such that

where $\tau_i = e^{-\beta E^C_i}/Z_C$. Note that the index $i$ runs over all energy levels $E^C_i$. Then

In other words, thermal embezzling of diagonal states with arbitrary accuracy is not possible.
Proof. Firstly, let $r^*, \tau^*$ denote the pair such that $r^*/\tau^* = \max_j r_j/\tau_j$. Then

The first term of the L.H.S. is equal to $r^*/\tau^*$, and therefore can be grouped with the R.H.S. to form

since we know that $D_\infty(r\|\tau) = \log \max_i r_i/\tau_i = \log r^*/\tau^* \geq 0$, and therefore $r^*/\tau^* \geq 1$. Finally, taking the maximization of $1/\tau_i$ over $i$ gives $1/\tau_{\min}$; recall that $\tau_i$ corresponds to the probability of the thermal state being in the eigenstate with energy $E^C_i$. Therefore, $\tau_{\min} = e^{-\beta E^C_{\max}}/Z_C$, and we get the stated bound on $\tilde\varepsilon$.

B. Arbitrary states
The case of arbitrary states is treated separately, since our Lemma I.3 on universal embezzlers holds only for diagonal states, where necessary and sufficient conditions are known for state transformations. Nevertheless, since the monotonicity of $D_\alpha$ is necessary for arbitrary state transformations $\rho_S \to \rho'_S$, one can use techniques very similar to those in Section III A to lower bound the embezzling error, if we minimize over diagonal catalysts.
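More precisely, denote $\varepsilon(\rho_S, \rho'_S)$ to be the solution of

Recall that $\tau_{CS} = \tau_C \otimes \tau_S$, and that $D_\alpha$ is additive under tensor products. Therefore, by defining
$$\kappa_1(\rho_S, \rho'_S) := D_\infty(\rho'_S \| \tau_S) - D_\infty(\rho_S \| \tau_S),$$
we can rearrange the first constraint in Eq. (54) to
$$D_\infty(\omega_C \| \tau_C) \geq D_\infty(\omega'_C \| \tau_C) + \kappa_1(\rho_S, \rho'_S).$$
Note that this is almost equivalent to Eq. (46), except that the constant $\log (Z_S/e^{-\beta E^S_{\max}})$ is now replaced with $\kappa_1(\rho_S, \rho'_S)$. By following the same steps used to prove Lemma III.1, we obtain a lower bound depending on $\rho_S$, $\rho'_S$.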

Lemma III.2. Consider system and catalyst Hamiltonians which are finite-dimensional, and denote $\{E^S_i\}$ and $\{E^C_i\}$ to be the sets of energy eigenvalues respectively. Then for some fixed $0 \leq E^C_{\max}, E^S_{\max}$, consider any probability distribution $r$ (which corresponds to the eigenvalues of a catalyst $\omega$), and $\tilde\varepsilon$ such that

Note that the index $i$ runs over all energy levels $E^C_i$. Then

This implies that thermal embezzling with arbitrary accuracy using a diagonal catalyst is not possible.
Comparing Lemma III.1 and Lemma III.2, which are very similar, one sees that for non-diagonal states Lemma III.2 gives a state-dependent lower bound on the embezzling error. For diagonal states, however, the bound in Lemma III.1 can be made state-independent because of the existence of universal embezzlers.
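The two divergences used in this section are easy to evaluate for diagonal states. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it uses the standard classical expressions $D_\infty(p\|q) = \log \max_i p_i/q_i$ and $D_{1/2}(p\|q) = -2 \log \sum_i \sqrt{p_i q_i}$ with natural logarithms, and the energy levels, the inverse temperature `beta` and the catalyst spectrum `omega` are arbitrary example values. It checks additivity under tensor products and the value $\log(Z_S/e^{-\beta E^S_{\max}})$ obtained for the pure state $\Pi^S_{\max}$.

```python
import math

def thermal_state(energies, beta):
    """Gibbs weights e^{-beta E_i} / Z and the partition function Z."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def d_inf(p, q):
    """Renyi divergence of order infinity for classical distributions (q_i > 0)."""
    return math.log(max(pi / qi for pi, qi in zip(p, q)))

def d_half(p, q):
    """Renyi divergence of order 1/2 for classical distributions."""
    return -2 * math.log(sum(math.sqrt(pi * qi) for pi, qi in zip(p, q)))

def tensor(p, q):
    """Spectrum of the product state p (x) q."""
    return [pi * qj for pi in p for qj in q]

beta = 1.0                                            # example inverse temperature
tau_s, z_s = thermal_state([0.0, 1.0], beta)          # example system, E_max = 1
tau_c, z_c = thermal_state([0.0, 0.5, 2.0], beta)     # example catalyst
pi_max = [0.0, 1.0]                                   # pure state of maximal energy
omega = [0.7, 0.2, 0.1]                               # example diagonal catalyst state
log_a = math.log(z_s / math.exp(-beta * 1.0))         # log(Z_S / e^{-beta E_max})

if __name__ == "__main__":
    print(d_inf(pi_max, tau_s), d_half(pi_max, tau_s), log_a)
```

For this pure target state both divergences evaluate to the same constant, which is why the same quantity $A = Z_S/e^{-\beta E^S_{\max}}$ appears in the dimension-based and the energy-based bounds.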

C. Relation to energy constraints
Rather than bounding the dimension of the catalyst, one can ask if restrictions on other physical quantities, such as the average energy of the catalyst, would prevent indefinitely accurate embezzling from occurring. While this by itself is an independently interesting problem, we can first note that such restrictions are sometimes related to restrictions on the dimension. In one direction this is straightforward: if the catalyst is finite-dimensional, then the average energy and all other moments of the energy distribution are finite as well.
Here, we show that restricting the first and second moments of the energy distribution of the catalyst to be finite implies that the states involved are always close to finite-dimensional states. In other words, for any catalyst state whose energy has finite mean and variance, there always exists a finite-dimensional state $\varepsilon$-close to it. This can be shown by invoking a simple theorem, namely the Chebyshev inequality, which implies that for any non-zero error $\varepsilon$, all but at most $\varepsilon$ of the probability weight of the energy distribution lies within a finite energy window.

Lemma III.3 (Chebyshev inequality).
Consider a random variable $X$ with finite mean $\bar{X}$ and finite variance $\sigma_X^2$, then for all $k > 0$,
$$\Pr\left(|X - \bar{X}| \geq k\,\sigma_X\right) \leq \frac{1}{k^2}.$$
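As a rough illustration of how this inequality yields a finite energy window, one may choose $k = 1/\sqrt{\varepsilon}$, so that at most $\varepsilon$ of the probability weight lies outside $[\bar{X} - k\sigma_X, \bar{X} + k\sigma_X]$. The sketch below applies this to an example distribution; the geometric weights on equally spaced levels, the cut at 200 levels and the function name `chebyshev_window` are assumptions made for the example only.

```python
import math

def chebyshev_window(energies, probs, eps):
    """Mean, half-width k*sigma with k = 1/sqrt(eps), and the weight outside the window.

    Assumes a non-zero variance, as in the Chebyshev inequality.
    """
    mean = sum(p * e for p, e in zip(probs, energies))
    var = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    half_width = math.sqrt(var / eps)
    outside = sum(p for p, e in zip(probs, energies) if abs(e - mean) >= half_width)
    return mean, half_width, outside

# Example: weights proportional to 2^{-j} on levels E_j = j, j = 0, ..., 199.
energies = [float(j) for j in range(200)]
raw = [0.5 ** j for j in range(200)]
total = sum(raw)
probs = [w / total for w in raw]

eps = 0.01
mean, half_width, outside = chebyshev_window(energies, probs, eps)

if __name__ == "__main__":
    print(mean, half_width, outside)
```

Chebyshev's bound is typically far from tight, so the actual weight outside the window is usually much smaller than $\varepsilon$; only the existence of some finite window matters for the argument.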

IV. LIMITS OF THERMAL EMBEZZLING FROM ENERGY CONSTRAINTS
In this section we provide lower bounds for the error in catalysis, given constraints on the average energy of the catalyst state. We do so by adding a constraint on the average energy of the catalyst to the problem stated in Eq. (44). By looking at the Rényi divergence for α = 1/2, we can show a non-zero lower bound on the catalytic error, for cases where the partition function of the catalyst Hamiltonian Z C is finite. This minimal assumption covers most physical scenarios, especially if we want the thermal state to be a trace class operator to begin with. Again we start with diagonal states, then later generalize to arbitrary states.

A. Diagonal states
Firstly, let us recall the problem stated in Eq. (44). We aim to minimize the trace distance between input and output catalyst states, such that the most significant thermal embezzlement of a smaller system S can be achieved. We denote again the initial and final catalysts by $\omega_C$ and $\omega'_C$, with spectral values $\{\omega_j\}$ and $\{\omega'_j\}$. Again, by restricting ourselves to catalysts diagonal in the Hamiltonian basis, and by invoking only the thermal monotone $D_{1/2}(\cdot\|\cdot)$, one can find the alternative relaxed problem

where
$$A = \frac{Z_S}{e^{-\beta E^S_{\max}}} \tag{62}$$
and $\gamma = e^{-\beta/2} < 1$. Furthermore, since $A = 1/\min_i \tau_i$ with $\tau_i$ forming a probability distribution (that of the thermal state of the system), one can deduce that whenever the dimension of system S is $m \geq 2$, $A \geq m \geq 2$ holds as well.
The solution of this minimization problem serves as a lower bound to the optimal trace distance error. This problem can be relaxed to a convex optimisation problem. We can arrive at a simple bound, however, with rather non-technical means. In essence, we introduce split bounds, so that the optimization can be written as two independent, individually significantly simpler optimization problems. We make use of the inequality which holds true for x, y ∈ [0, 1], a ≥ 2 and with f : We can then relax the problem by replacing the first constraint in Eq. (61), with x j taking the role of |ω j − ω ′ j |, to arrive at s.t.
These are now two independent optimisation problems, by treating x j and ω j as independent variables. Define ε C to be the solution of the simple linear problem involving only variables {ω j }, which we explicitly write out in Corollary IV.2. In this subproblem, one notes that the constraint on expectation value of the energy implies that the total probability of having relatively low energy eigenvalues cannot be vanishingly small, which we prove in Lemma IV.1. One can then use this fact to place a lower bound on the quantity ε C , which we detail in Corollary IV.2.
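Lemma IV.1 (Lower bound to sums of eigenvalues). Consider any probability distribution $\{\omega_i\}$ over ascendingly ordered energy eigenvalues $\{E^C_i\}$, with the property that the energy eigenvalues are unbounded, i.e. $\lim_{n\to\infty} E^C_n = \infty$. If the expectation value of the energy satisfies $\sum_{i=1}^{\infty} \omega_i E^C_i \leq E$ for some finite constant $E$, define for any $0 < W < 1$
$$j(W) = \min\{j : E^C_{j+1} > E/(1-W)\}.$$
Then
$$\sum_{i=1}^{j(W)} \omega_i \geq W.$$
Proof. One can easily prove this by contradiction. Assume that $\sum_{i=1}^{j(W)} \omega_i < W$, and therefore $\sum_{i=j(W)+1}^{\infty} \omega_i > 1 - W$. This violates the energy constraint, since
$$\sum_{i=j(W)+1}^{\infty} \omega_i E^C_i \geq E^C_{j(W)+1} \sum_{i=j(W)+1}^{\infty} \omega_i > \frac{E}{1-W}\,(1-W) = E.$$

Corollary IV.2 (Lower bound to $\varepsilon_C$). For a set of unbounded energy eigenvalues $\{E^C_i\}$, consider the minimization problem
$$\varepsilon_C = \min_{\{\omega_i\}} \sum_i \gamma^{E^C_i}\,\omega_i \quad \text{s.t.} \quad \omega_i \geq 0, \quad \sum_i \omega_i = 1, \quad \sum_i \omega_i E^C_i \leq E.$$
Denote $\gamma = e^{-\beta} \in (0, 1)$. Then for $j(W) = \min\{j : E^C_{j+1} > E/(1-W)\}$,
$$\varepsilon_C \geq \max_{W \in (0,1)} W\,\gamma^{E^C_{j(W)}}.$$
Proof. This is a direct application of Lemma IV.1, since the first and second constraints are satisfied automatically by any probability distribution. Given some $W \in (0, 1)$, by Lemma IV.1 we know that
$$\sum_{i=1}^{j(W)} \omega_i \geq W.$$
The objective function can then be lower bounded as
$$\sum_i \gamma^{E^C_i}\,\omega_i \geq \sum_{i=1}^{j(W)} \gamma^{E^C_i}\,\omega_i \geq \gamma^{E^C_{j(W)}} \sum_{i=1}^{j(W)} \omega_i \geq W\,\gamma^{E^C_{j(W)}}$$
for any such $W$. To obtain the best lower bound, one maximizes over all $W \in (0, 1)$.

Remark IV.3 (Temperature dependence). The bound obtained in Corollary IV.2 depends on the temperature of the bath, and goes to zero in the limit $T \to 0$.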
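The quantities $j(W)$ and $\max_W W \gamma^{E^C_{j(W)}}$ can be evaluated numerically for a concrete spectrum. The sketch below is illustrative only: it assumes equally spaced levels $E^C_i = i - 1$ truncated at 5000 levels, takes $\gamma = e^{-\beta}$ with $\beta = 1$ as in Corollary IV.2, replaces the maximization over $W$ by a scan over a finite grid, and uses hypothetical function names. It also tests Lemma IV.1 on one example distribution whose mean energy is below the assumed bound `e_bound`.

```python
import math

def j_of_w(energies, e_bound, w):
    """1-based j(W) = min{ j : E_{j+1} > E / (1 - W) } for ascending energies."""
    threshold = e_bound / (1 - w)
    for j in range(1, len(energies)):
        if energies[j] > threshold:          # energies[j] is E_{j+1} (0-based list)
            return j
    raise ValueError("energy list too short for this W")

def eps_c_lower_bound(energies, e_bound, gamma, grid=999):
    """Grid approximation (from below) of max_W W * gamma^{E_{j(W)}}."""
    best = 0.0
    for t in range(1, grid + 1):
        w = t / (grid + 1)
        j = j_of_w(energies, e_bound, w)
        best = max(best, w * gamma ** energies[j - 1])
    return best

energies = [float(i) for i in range(5000)]   # assumed spectrum E_i = i - 1
gamma = math.exp(-1.0)                       # gamma = e^{-beta} with beta = 1
e_bound = 1.5                                # assumed bound E on the mean energy

# Example distribution with mean energy close to 1, below e_bound.
raw = [0.5 ** i for i in range(5000)]
total = sum(raw)
probs = [x / total for x in raw]

bound = eps_c_lower_bound(energies, e_bound, gamma)
objective = sum(gamma ** e * p for e, p in zip(energies, probs))

if __name__ == "__main__":
    print(bound, objective)
```

Since the grid only samples $W$, the returned value is itself a valid, possibly slightly weaker, lower bound.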
We have now solved the subproblem involving the variables $\{\omega_i\}$. Inserting the solution into the former optimisation problem, we arrive at the lower bound for $\varepsilon$,

The optimal solution of this minimization can easily be lower bounded by considering the Lagrange dual, which is

In fact, this can be solved immediately as a quadratic problem in one variable. Let

and consider the stationary point of the function by setting the first derivative with respect to $\lambda$ to zero,

where the second derivative is negative, so this stationary point is a maximum. Substituting this into the objective function gives $f(A)\,\varepsilon_C^2/Z_C$, and hence we conclude that
$$\varepsilon \geq \frac{f(A)\,\varepsilon_C^2}{Z_C}.$$
In this way, we arrive at the main result.
Theorem IV.4 (Energy constraint limits the accuracy of thermal catalysis). Consider the transformation $\omega_C \otimes \tau_S \to \omega'_C \otimes |E^S_{\max}\rangle\langle E^S_{\max}|$, where $d_{\rm opt} = \frac{1}{2}\|\omega_C - \omega'_C\|_1 = \frac{1}{2}\varepsilon$ is the error induced on the catalyst. Then for all catalyst states with finite average energy, $d_{\rm opt}$ is lower bounded by
$$d_{\rm opt} \geq \frac{f(A)\,\varepsilon_C^2}{2 Z_C},$$
where $f(x)$ is defined in Eq. (64), $A = Z_S/e^{-\beta E^S_{\max}}$, $\varepsilon_C = \max_{W \in (0,1)} W\,\gamma^{E^C_{j(W)}}$ and $j(W) = \min\{j : E^C_{j+1} > E/(1-W)\}$. In other words, thermal embezzling of diagonal states with arbitrary accuracy is not possible.

B. Arbitrary states
Similar to our previous discussion in Section III B, when the states $\rho_S$ or $\rho'_S$ are non-diagonal, we can still obtain a state-dependent lower bound for the embezzling error. For any states $\rho_S$, $\rho'_S$, let us define the quantity $\kappa_2(\rho_S, \rho'_S) := D_{1/2}(\rho'_S \| \tau_S) - D_{1/2}(\rho_S \| \tau_S)$.
Then a lower bound can be obtained by following the steps in Section IV A, only now replacing the constant $A$ defined in Eq. (62) with a state-dependent function.