Thermodynamically optimal creation of correlations

Correlations lie at the heart of almost all scientific predictions. It is therefore of interest to ask whether there exist general limitations to the amount of correlations that can be created at a finite amount of invested energy. Within quantum thermodynamics such limitations can be derived from first principles. In particular, it can be shown that establishing correlations between initially uncorrelated systems in a thermal background has an energetic cost. This cost, which depends on the system dimension and the details of the energy-level structure, can be bounded from below but whether these bounds are achievable is an open question. Here, we put forward a framework for studying the process of optimally correlating identical (thermal) quantum systems. The framework is based on decompositions into subspaces that each support only states with diagonal (classical) marginals. Using methods from stochastic majorisation theory, we show that the creation of correlations at minimal energy cost is possible for all pairs of three- and four-dimensional quantum systems. For higher dimensions we provide sufficient conditions for the existence of such optimally correlating operations, which we conjecture to exist in all dimensions.


I. INTRODUCTION
Correlations can be regarded as the fundamental means of obtaining information: From a physical perspective, obtaining knowledge requires correlation of physical variables in an observed system with physical variables in a measurement apparatus [1]. As long as one regards system and measurement apparatus as separate entities which are not persistently strongly interacting [2], this correlation requires an investment of energy. Conversely, all correlations imply extractable work [3]. Since correlations and energy (work) can be considered to be resources in (quantum) information theory and thermodynamics, respectively, these observations establish one of the fundamental connections between these theories. Indeed, both theories already share a common framework using statistical ensembles to capture knowledge about collections of physical systems. This knowledge can then be harnessed to facilitate the most efficient use of the available resources, e.g., energy, for the tasks at hand. The young field of quantum thermodynamics combines features from both fields and investigates their interplay in the quantum domain [4][5][6].
Here, we explore the specific quantitative relation between correlations and energy [7]. While general qualitative insights can help to understand some quandaries arising from Maxwell's demon or Szilard's engine [8][9][10], precise quantitative statements about the trade-off between work and correlations are generally complicated. For example, by allowing arbitrarily slow quasi-static operations and perfect control over arbitrarily many auxiliary systems, one may provide tight lower bounds on the work cost of creating bipartite correlations as measured by the mutual information [7,[11][12][13][14].
However, for two identical systems, these bounds are tight only in the case when specific so-called symmetrically thermalizing unitaries (STUs) exist, i.e., unitaries that map initial thermal states to locally thermal states at higher local effective temperatures. Moreover, how well two systems can be correlated for a given energy in finite time and with limited control is generally not known. In particular, the possibility of optimal conversion of energy into correlations for the interesting special case of fully controlled, closed systems with two identical subsystems rests entirely on the assumed existence of STUs. In this sense, central open questions at the very foundations of quantum thermodynamics concern the quantitative correspondence between correlations and energy.
In this paper, we put forward a framework for investigating potential marginal spectra in the unitary orbit of any quantum state via decompositions into locally classical subspaces (LCSs) and majorisation relations. Here, LCSs are subspaces of a bipartite Hilbert space that support only states that are locally classical, i.e., states which have diagonal marginals with respect to a chosen basis. We use this approach to investigate the question of existence of STUs for arbitrary bipartite quantum systems. In particular, we show that STUs exist for two identical copies of arbitrary three-and four-dimensional systems, thereby extending previous results on the special case of Hamiltonians with equally-spaced energy gaps [12,13]. In addition, we formulate sufficient conditions for the existence of STUs for identical subsystems with arbitrary local dimension and for multiple copies of the initial state. Indeed, we show that STUs exist in the limit of infinitely many copies. This paper is structured as follows: In Sec. II, we describe the conceptual framework of our investigation, formulate our central question, and briefly review the state of the art. In Sec. II.1.1, we further discuss that, while STUs do not in general exist for asymmetric situations where the two systems have different Hamiltonians, this does not resolve the problem of maximizing correlations for asymmetric systems. We then focus on the open case of two identical subsystems in Sec. II.2 and discuss a family of unitary transformations that result in symmetric, diagonal marginals. These unitaries have a blockdiagonal structure with respect to a specific choice of LCSs, and give rise to a rich structure of potential marginal transformations. In particular, they allow showing that STUs exist in the asymptotic limit of infinitely many copies. In Sec. III, we then formulate three alternative approaches to demonstrating the existence of STUs based on unitaries in these LCSs. As we show, all three approaches allow proving the existence of STUs in local dimension d = 3. However, only two of these approaches are successful (to different extent) in local dimension d = 4 and provide general sufficient conditions for the existence of STUs for higher dimensions. Finally, in Sec. IV we provide a summary and discussion and put forward a hypothesis for general dimensions.
II. FRAMEWORK

II.1. Main question
Let us start our investigation with a detailed description of the problem: We consider the process of correlating two previously uncorrelated quantum systems. More precisely, we study the controlled interaction of two quantum systems A and B with Hamiltonians H A and H B , respectively, which are initially in a tensor product state A ⊗ B , with the goal of increasing the correlations between them. As we want to obtain optimal bounds, we consider optimal coherently controlled systems, i.e. the ability to externally engineer and time any interaction Hamiltonian between the two quantum systems, resulting in unitary dynamics on the system and we thus study the global unitary orbit of product states. Furthermore, we are interested in the energy needed to establish correlations between these initially uncorrelated quantum systems.
One could generalise the question from perfect system control and thus unitaries on the system, to correlating maps generated by unitaries on the system and arbitrary auxiliary systems, inducing completely positive and trace preserving (CPTP) maps on the system. However, some transformations (e.g., cooling to the ground state) may only be achievable asymptotically (i.e., using infinite time or energy [15][16][17][18]). Moreover, the implementation of CPTP maps generally requires access to and control over auxiliary systems, resulting in an implicit energy cost for preparation and manipulation of the auxiliaries that is obscured by the use of the CPTP maps instead of the explicit description of the corresponding unitaries on the total Hilbert space. Consequently, it is of interest to understand the limits of unitary correlating processes, which allow for a complete account of all the energetic changes within the system. Moreover, to give a fair account of the energy that needs to be supplied to the system, it is important to consider initial states that do not implicitly supply free energy. That is, we assume that the initial states of the subsystems are thermal states at the same temperature T , i.e., A = τ A (β) and B = τ B (β) with β = 1 T (we use units where k B = 1 throughout), where τ (β) ∶= e −βH Z and Z = Tr(e −βH ). In this way, the initial state τ A (β) ⊗ τ B (β) minimises the local energies for given entropies and is also completely passive [19], i.e., a state from which no energy can be extracted unitarily even when taking multiple copies.
Correlations can be quantified in different ways. Qualitatively, one may describe correlations between two subsystems as the feature that more information is available about properties of the joint system than about properties of the subsystems. If there is no preference as to the specific properties that are to be correlated, i.e., if one is interested in quantifying any correlations, it is most useful to consider entropic measures. The most paradigmatic among them is the mutual information I( AB ) = S( A ) + S( B ) − S( AB ), based on the von Neumann entropy S( ) ∶= −Tr ln . Although we focus on the von Neumann entropy, in principle one can replace it with any other measure of local 'mixedness'. And indeed, our methods are framed in the context of majorisation relations, and thus in principle also open an avenue towards studying other entropic measures.
When focusing on the mutual information, the problem of unitarily creating correlations can be reduced to the task of maximising the sum of the marginal entropies under local energy constraints. That is, the invested energy ∆E ∶= Tr H AB U AB τ A (β) ⊗ τ B (β)U † AB − τ A (β) ⊗ τ B (β) only depends on the local marginals, while the global entropy is invariant under unitary operations. The change in mutual information thus reduces to ∆I = ∆S A + ∆S B . We thus arrive at the following question: for an invested energy of at most ∆E?
The fact that thermal states maximise the local entropies under local energy constraints suggests that the maximal value of ∆I is always achieved when the final state marginals are thermal at a higher temperature T ′ ≥ T (or, equivalently, for Let us consider Question 1 in the case H A ≠ H B . Although the problem is in general complex, note that nontrivial constraints on marginal spectra for given global spec-tra exist in the form of entropy inequalities. In particular, both the von Neumann entropy S( ) and the Rényi 0-entropy S 0 ( ) ∶= − log 2 (rank( )) satisfy subadditivity [20,21], while no other entropy satisfies (dimension independent) linear inequalities [22]. Since we are interested in thermodynamic questions, where the rank of the involved states is typically full, only subadditivity of the von Neumann entropy, which we can write as provides a nontrivial constraint. Also observe that, given an initial state τ A (β) ⊗ τ B (β) subject to a unitary evolution, the entropy of the total system is fixed to S( AB ) = S(τ A (β)) + S(τ B (β)). If we assume that the final marginals are thermal states with equal higher temperature i.e., β ′ ⩽ β, inequality (2) can be rewritten as This already implies a nontrivial constraint, and also provides a counterargument to the naive assumption that two thermal marginals at the same temperature are reachable in general. While this makes the structure complicated to deal with in general, one can still solve the optimisation in some special cases. In particular, for a system initialised in the ground state, i.e., a product of two pure states τ AB (β → ∞) = 0 A 0 B ⟩⟨ 0 A 0 B , the total state remains pure during the unitary evolution and hence has a Schmidt decomposition. The final state can therefore be written as This already means that the marginals (since they have equal rank) cannot both be thermal states if However, in this case it is still possible to maximise the amount of correlation subject to a given maximal amount of energy consumption. In other words, one can find optimal points of the following optimisation problem: To do so one first solves the (loosely speaking) inverse problem of minimising the energy consumption for a given amount of correlation. This problem can be turned into two instances of the well-known passivity problem [19], for which passive states are the solutions. Using this fact, the problem reduces to a simple problem that can be solved using Lagrange multipliers. One then shows that the solution found for that problem is strictly monotonically increasing in the constraint, allowing us to reverse the problem and show that it is a solution of the problem in (6). For more details see Appendix A.II. See also Ref. [23] for an alternative approach to the energy minimisation problem. The solutions obtained for (6) are thermal states at inverse temperature β(c) (uniquely defined by the constant c) of the HamiltoniansH A andH B , respectively, defined as where The solutions hence take the form The maximum amount of correlation achievable subject to a given maximal amount of energy c is then

II.1.2. Symmetric case
Let us now consider the symmetric case H A = H B . Following our considerations in the previous section and Appendix A.I, it is clear that an upper bound for ∆I is in this case achieved when ∆S A = ∆S B , and since the states with maximal entropy given a fixed energy are thermal states, Question 1 can be substantially simplified to that of the existence of symmetrically thermalizing unitaries.

Question 2: Existence of STUs
Does there exist a unitary U AB on H AB such that for every pair of local Hamiltonians H A = H B , for all final temperatures T ′ = 1 β ′ and all initial temperatures T = 1 β ≤ T ′ ?
If such STUs exist, then they are automatically the optimally correlating unitaries for a given system. It is already known that Question 1 can be answered affirmatively when the subsystem Hamiltonians are equally spaced, i.e., H A = H B = ∑ d−1 n=0 E n n ⟩⟨ n with E n+1 −E n = ω ∀n ∈ {0, 1, . . . , d− 1} for some constant ω (with appropriate units) [13] (see also the more recent formulation of the derivation in Ref. [24]). In particular, this implies that such optimally correlating unitaries always exist for two qubits, i.e., when d = 2. However, there is (yet) no answer for the general situation of arbitrary Hamiltonians and dimensions. To fill this gap, we present a framework that generalises previous approaches [13] and allows investigating this general question in the symmetric cases for any dimension. This approach is based on unitary operations in LCSs, as we will explain in Sec. II.2, and enables us to show that STUs exist for the simplest nontrivial cases of two qutrits (d = 3) and two ququarts (d = 4).

II.2. Unitary operations on locally classical subspaces
Let us now discuss our general framework for marginal transformations. The central idea of this approach is to decompose the diagonal elements of the marginals into elements originating from different subspaces, with the property that any unitary that leaves the division into these subspaces invariant never creates local off-diagonal elements.
More precisely, we consider a pair of d-dimensional systems A and B with matching local Hamiltonians where we set E 0 = 0 without loss of generality. Moreover, let the energy eigenvalues, sorted in nondecreasing order (with E i+1 ≥ E i ∀ i), be measured in units of E 1 . The initial thermal states of A and B are where p i (β) = e −βEi Z(β) and Z(β) = ∑ i e −βEi . For convenience, we use the shorthand p i ≡ p i (β). The joint initial state τ AB = τ A (β) ⊗ τ B (β) is also diagonal with respect to the energy eigenbasis and its diagonal entries are products of the diagonal entries of the reduced states, i.e., we write p ij ∶= p i p j such that τ AB = ∑ d−1 i,j=0 p ij ij⟩⟨ij . Since the unitaries that we use throughout the manuscript do not create coherence in the marginals, it is convenient to introduce a vectorised notation for the diagonal entries of the reduced states. That is, for arbitrary diagonal joint states where the p ij need not factorise with respect to i and j, the reduced states A = Tr B ( AB ) and B = Tr A ( AB ) are diagonal, and the diagonal entries can be collected into vectors p A , p B ∈ R d with nonnegative components and unit 1-norm,

II.2.1. Locally classical subspaces
Having expressed the diagonals of the marginals in this vectorised form, we further choose particular subspace vector decompositions, as we illustrate for d = 3 in Box 1. For general local dimension d ≥ 3, we write p A and p B as sums of d vectors according to where j=0 denotes the chosen orthonormal basis of R d and all indices are to be taken modulo d. In the second equality, we have further used the fact that the i-th vector r B i in the decomposition of p B can be related to the i-th vector r A i in the decomposition of p A via the i-th power of the permutation matrix Π = ∑ d−1 j=0 j + 1⟩⟨j . One further observes that the decomposition of p A and p B into these vectors corresponds to the selection of a total of d subspaces H q and {H ri } d−1 i=1 of the joint Hilbert space H AB , with H ri , such that arbitrary unitaries U q and U ri applied in either of the d subspaces cannot lead to nonzero off-diagonal elements in the reduced states A or B , provided that none are present to begin with. In other words, any state on H AB that has support on only one of these subspaces, or which has a block-diagonal structure with respect to this subspace decomposition, is locally classical, i.e., has diagonal marginals with respect to the local bases { j ⟩} d−1 j=0 . We hence call these subspaces locally classical. However, note that the choice of LCSs we make here is not unique, as discussed below in Sec. II.2.2.
A unitary transformation AB ↦˜ AB = U AB U † with U = U q ⊕ d−1 i=1 U ri hence does not preclude off-diagonal elements from appearing in the joint state˜ AB but leaves the reduced states diagonal, implying that we can directly read off the spectra of the marginals. In such a case, it is still useful to describe the marginals by d-component vectorsp A andp B collecting their diagonal elements, and the transformation of these vectors can be represented as where q = r A 0 = r B 0 , and M q and M ri are unistochastic d × d matrices, i.e., doubly-stochastic (entries in rows and columns sum to 1) square matrices M α (with α ∈ {q, r i }) whose components can be understood as the moduli squared of unitary matrices, (M α ) kl = (U α ) kl 2 . As a technical remark, note that the Schur-Horn theorem implies a weak convexity property for unistochastic matrices, namely that for any v ∈ R d , the set of vectors obtained by applying the set of unistochastic d × d matrices to v is equivalent to the set of vectors obtained by applying the set of doublystochastic d × d matrices to v, see Ref. [25]. This means that for any doubly-stochastic matrix M and vector v, there exists Box 1: Unitary operations on locally classical subspaces for two qutrits For any tensor product of two identical states τ AB = ∑ 2 i,j=0 p ij ij⟩⟨ij , we can separate the local diagonal elements (i.e., the eigenvalues, since the matrix is diagonal) through an economic notation where we denote the diagonal entries of the marginals as entries in a vector p 00 + p 01 + p 02 p 11 + p 12 + p 10 p 22 + p 20 + p 21 p 00 + p 10 + p 20 p 11 + p 21 + p 01 p 22 + p 02 + p 12 Employing the unitary U AB = U q ⊕ U r1 ⊕ U r2 , where U q , U r1 , and U r2 act unitarily on the subspaces respectively, the transformation of the marginals can be described by unistochastic matrices The components of the unistochastic matrices are determined by the moduli squared of corresponding unitary matrices, (M α ) kl = (U α ) kl 2 for α = q, r 1 , r 2 . Due to the symmetry p ij = p ji , we can further identify r 2 = Πr 1 to obtaiñ Here, this property permits us to conclude that the vectorised marginals of any state reachable by unitaries of the form U = U q ⊕ d−1 i=1 U ri can be written in terms of the action of doubly-stochastic matrices M q and M ri on q and r i , respectively.
In general, Eq. (19) can be rewritten as where k = d−1 2 if d is odd and k = d 2 if d is even. Further taking into account the symmetry p ij = p ji , which implies r d−i = ∑ d−1 j=0 p j j+d−i Π i e j−i = Π i r i , the vectorised form of the marginal A can be written in a compact way as where ⌊x⌋ denotes the floor function of x, and the prefactor (⌊ 2i d ⌋ + 1) −1 is equal to 1 unless d is even and i = k, in which case it is 1 2 . Using Eq. (20), one may obtain the transformed marginal B as where we have used the property Π −i = Π d−i of the ddimensional cyclic permutation. To satisfy the conditions of Question 2, we need to further restrict ourselves to a subgroup of transformations which generate the same marginals,p A =p B . This requirement results in the condition for the doubly-stochastic matrices mentioned in Eq. (22). With this, we arrive at the form for both vectorised marginals. Now that we have ensured that both marginals transform in the same way, we may concentrate on one of them, sayp A (dropping the subscript A for ease of notation) and use the established framework to investigate the existence of STUs as specified in Question 2.

II.2.2. General locally classical subspaces
While we focus here on the decomposition into the specific LCSs from Eq. (18), we note that this is by far not the only option. Indeed, let {P i } i=0,...,d−1 be a set of permutations of d elements j = 0, . . . , d − 1. Then a sufficient condition for obtaining an LCS decomposition H AB = ⊕ d−1 i=0Hri into LCSs of the formHr i = span{ j P i (j) ⟩} d−1 j=0 is that the d × d matrix Γ with components Γ ij = P i (j) is a Latin square, i.e., every entry j = 0, . . . , d − 1 appears exactly once in each row and in each column. For every such choice of LCS decomposition, one may then apply unitariesŨ = ⊕ d−1 i=0 Ur i that leave the LCSs invariant, i.e., where Ur i acts nontrivially only on the subspaceHr i . Denoting the corresponding vector decomposition of the first vectorised marginal as In this case, unitaries that leave the LCSs invariant transform the vectorised marginals according to where the matrices Mr i are the unistochastic matrices corresponding to the unitaries Ur i . This implies the existence of symmetric marginal transformations, e.g., for unistochastic matrices M that commute with all the P i , such that the marginals take the form M p A = M p B =p. For the remainder of this work we focus again on the specific special case we consider in Eq. (18), where P i = (Π −1 ) i .

II.2.3. Asymptotic case
Before we go further into detail on trying to answer Question 2, let us briefly showcase that this is indeed a problem connected to the finite size of the system. More specifically, we can consider a scenario where one wishes to correlate multiple copies of the initial thermal states τ A (β) and τ B (β) via a joint unitary, such that the final state of n copies is U τ AB (β) ⊗n U † , with τ AB (β) ⊗n = τ A (β) ⊗n ⊗ τ B (β) ⊗n . As we will show now, STUs such that˜ A =˜ B = τ (β ′ ) exist for all β ′ , β such that β ′ ≤ β, and for all local Hamiltonians in the limit of infinitely many copies, n → ∞.
To see this, note that for any n we can find a unitary such that the marginals A B = Tr B A (U τ A (β) ⊗n ⊗ τ B (β) ⊗n U † ) are passive states whose entropy equals that of the thermal state with the target temperature, i.e., S( A B ) = S(τ A B (β ′ )). This is a consequence of Eq. (25) and the continuity of the von Neumann entropy. More specifically, note that a trivial way of obtaining marginals with the same final spectrum is to select M q and all M ri in Eq. (25) to be circulant, i.e., convex combinations of cyclic permutations. Since these matrices commute with Π i , one may reach any marginal whose vectorised marginal is any cyclic permutation of the original marginal vector, or indeed, any convex combination of cyclic permutations of the original marginal vector. In particular, the equally weighted convex combination of all cyclic permutations yields maximally mixed marginals, The convex structure of the set of vectors reachable from a given vector via application of unistochastic matrices then suggests that the points corresponding to p A andp A are connected by a continuous line of states reachable from p A via circulant unistochastic matrices. The entropy must vary continuously along this line from the initial value S τ A (β) to S A = S τ A (β ′ ) . Thus, one may obtain marginals A = B that are passive states with the desired entropy, but which might have a different spectra than τ A (β ′ ) = τ B (β ′ ).
Given this fact, it is sufficient to note that for n → ∞ we can convert passive states with a given entropy S( ) into thermal states with the same entropy only by using local operations [19]. Because the application of local operations to the subsystems leaves the mutual information I( AB ) invariant, this consequently proves the existence of STUs for the asymptotic case n → ∞. Further discussion of the case of finitely many copies is presented in Appendix A.III.

III. OPTIMALLY CORRELATING UNITARIES FOR BIPARTITE STATES WITH MATCHING HAMILTONIANS
In this section, we present three approaches to construct STUs for bipartite systems in an (uncorrelated) initial thermal state at temperature T = 1 β of H = H A + H B with matching Hamiltonians H A = H B . We focus on bipartite 3 × 3 and 4 × 4dimensional systems, and show explicit constructions for such STUs, thereby proving that Question 2 can be answered affirmatively for d = 3 and d = 4. In the d = 3 case, we show the existence of STUs via all three alternative approaches as a basis for generalisations to higher dimensions. We then turn to the particular case of dimension d = 4, highlighting the challenges and strengths encountered for each of these methods, before proving the d = 4 case using a geometric argument.
Concretely, referring to Sec. II.2 as our framework, we start from Eq. (25), which provides a transformations of the reduced states that leaves both marginals equal and diagonal with respect to the chosen basis. In general, M q and all the M ri are arbitrary doubly-stochastic matrices, and therefore one can try to make use of majorisation conditions to prove the existence of STUs via the Hardy-Littlewood-Pólya (HLP) theorem [26, p. 75] or [27, p. 33]. The HLP theorem states that a necessary and sufficient condition that y ≻ x is that there exist a doubly-stochastic matrix M such that x = M y. This, in fact, is the first argument that we use below for d = 3, where it allows proving a statement that is actually even stronger than what is required for STUs. However, the theorem that we are going to prove holds only for the d = 3 case.

III.1. Majorised marginals approach
In the particular case of d = 3 the final expression for the (equal) marginals from Eq. (25) is where, as in Eq. (20), we have introduced the vectors q and r, with components Based on the HLP theorem and Eq. (28), we are going to prove the existence of STUs in d = 3 via the following lemma: Lemma 1. For any 3 × 3 doubly-stochastic matrix M , there exists a 3 × 3 doubly-stochastic matrixM such that The proof of Lemma 1 is presented in Appendix A.IV. We are now ready to state the first theorem for d = 3. For every pair of states and˜ in a 3-dimensional Hilbert space which satisfy the condition λ(˜ ) ≺ λ( ), where λ( ) is the vector of eigenvalues of , there exists a unitary U AB on H AB such that Proof. The unitary is given by U AB = U loc U ent , i.e., by a product of an entangling unitary U ent which acts on the global system preserving the equal marginals as discussed in Sec. II.2, and a local unitary of the form U loc = U ⊗ U , such that the marginals are kept equal. From Lemma 1 we then see that, given any doubly-stochastic matrix M q , one can find M r such that M q (1 + Π) = (1 + Π)M r and consequently a mapping to each probability vectorp such that From the HLP theorem, we thus know that it is possible to produce all the vectorsp that are majorised by the vector p corresponding to the initial state. Thus, we proved that through the entangling unitary, all the marginals ′ = ∑ 2 i=0pi i ⟩⟨ i such thatp ≺ p can be reached from the initial state. Then, if we can also arbitrarily change their eigenstates in a symmetric way, Theorem 1 is proven. For that, we use a local unitary U ⊗ U with U = ∑ 2 i=0 φ i ⟩⟨ i to change the eigenstates of the marginal to an arbitrary set of eigenstates { φ i ⟩} 2 i=0 , such that finally the marginals become˜ = ∑ 2 i=0pi φ i ⟩⟨ φ i in which the set { φ i ⟩} can be any orthonormal basis in d = 3.
As a corollary of Theorem 1, the existence of STUs is proven for the two-qutrit case for initial and final thermal states with inverse temperatures β and β ′ ≤ β, respectively, since p(β ′ ) ≺ p(β) holds whenever β ′ ⩽ β.
Unfortunately, generalisations of Lemma 1 to higher dimensions fail, as we explain in more detail in Appendix A.V. Despite the general statement of Theorem 1, any attempts at generalisations must hence be based on a different approach, which does not require the existence of a doubly-stochastic matrixM as in Eq. (30) for all doubly-stochastic matrices M .
III.2. Alternative approach: "passing on the norm" Here, we explore another possible approach that makes use of the HLP theorem and tries to overcome the difficulties encountered for generalising the previous approach. Let us first discuss the method for two d−dimensional systems, before turning to the special cases d = 3 and d = 4.
To begin, we once again employ the transformation presented in Eq. (25) to generate equal marginals. Recalling that the marginals of our final target state are thermal states at inverse temperature β ′ < β, we decompose the final state vector into the formp Further, recall that the initial states have the same decomposition with inverse temperature β. One way of transforming p intop is to transform q into a and r i into b i for all i = 1, . . . , k. Looking at Eq. (25), one realises that this is possible if Unfortunately, for any β ′ < β we have q > a , where ⋅ denotes the 1-norm, see Appendix A.VII, and since doubly stochastic transformations conserve the norm of vectors, Eq. (33) cannot hold true. The norms of the r i also generically differ from those of the b i , meaning Eq. (34) does not hold either. However, it is not clear whether r i (β) is a monotonic function of β, and so r i may in principle be smaller or larger than b i , depending on β, β ′ and i. Loosely speaking, the vector q has 'too much norm'. Intuitively, one may thus make use of the excessive norm of q by transforming the required 'amount' of q into a, and redistributing its excessive norm to the rest ofp, or in other words by passing the excessive norm of q on to the rest ofp, giving rise to the name of the approach. Formally, one achieves this by splitting M q into a convex combination of two doubly stochastic matrices, i.e., where α 0 , α 1 ≥ 0, α 0 + α 1 = 1, and M q→a and M q→b are doubly stochastic matrices such that where f depends on the vectors b i . The main idea of this approach is that if one chooses f and M q→b in an appropriate way, the M ri may potentially be chosen such that in order for p to transform intop as desired. One way of 'passing on the norm' of q in this way, i.e. choosing f and M q→b , is to pass it to each b i individually bi bi , where the {α i } i constitutes a probability distribution. However, this is not always possible as the condition q q ≻ b i b i , necessary for the existence of M q→bi , cannot be satisfied in general. See Appendix A.VI for details. One is therefore forced to pass the norm of q in a more clever way. A good candidate is However, this separation of M q into one part "passing on the norm" from q to a and another part from q to the b i is not strictly needed. One may instead only require the existence of a doubly stochastic matrix M q such that with (44) Unfortunately, Eq. (43) is challenging to check, e.g., even confirming whether a +f (r 1 , b 1 , . . . , r k , b k ) is a vector of nonnegative components is already complicated. In summary, the approach requires checking the two following conditions: while at the same time q ≥ a , where a ∶= q(β ′ ).
(ii) There exists a doubly stochastic matrix M q such that or the stronger version: r i ≤ b i for all i = 1, . . . , k and there exist doubly stochastic matrices M q→2bi , i = 1, . . . , k such that Condition (i) indeed holds, and we thus state it below as the following lemma, the proof of which is presented in Appendix A.VII.
Lemma 2. For any d-dimensional system with arbitrary Hamiltonian, for any β ′ < β, and for all i = 0, . . . , d − 1 we have Given the above lemma and assuming that Condition (ii) also holds, one can prove the existence of STUs for all dimensions by using the following majorisation relations. The majorisation relation (49) from Lemma 2 ensures the existence of doubly-stochastic matrices M ri which satisfy M ri r i r i = b i b i . To reach any thermal state with higher temperature, i.e.,p = p(β ′ ), we then need the existence of M q such that therefore compensating for the required norm of the vectors associated to the subspace However, since Condition (ii) is in general cumbersome to prove, one can check whether (47) holds instead. See Appendix A.VIII for more details about how to prove the existence of STUs in general in this way. In the particular case of d = 3, the strong version of Condition (ii) can be reduced to a ("norm passing") requirement that we formulate in the following lemma: Lemma 3. In d = 3, for every choice of E 1 and E 2 with E 2 ≥ E 1 and for any β ′ ≤ β, the following majorisation relation holds: The proof of Lemma 3 is presented in Appendix A.IX. Together, Lemmas 2 and 3 then confirm the existence of STUs for the d = 3 case. Now, let us examine the case d = 4. Consider a bipartite system with equal local Hamiltonians H = ∑ 3 i=0 E i i ⟩⟨ i in which the energy eigenvalues are ordered in increasing order and E 0 = 0. Following Eq. (17), the vectorised form of the marginals for the initial uncorrelated thermal state is given by where r i is a shorthand for r i (β). Furthermore, we again denote the vector decomposition of any thermal state with higher Unitaries on the LCSs that generate the same marginals, according to Eq. (25), lead to the transformation where the M ri are arbitrary doubly-stochastic matrices. To achieve a thermal marginal with higher temperature, one can transform each r i to b i as prescribed by Condition i and pass the extra norm of the vector q as dictated by the stronger version of Condition (ii). This is possible at least under some restrictions on the energy level spacings δ i ∶= E i+1 − E i . Let us phrase this statement more precisely in the following Lemma 4, a detailed proof of which is presented in Appendix A.IX.
Lemma 4. In d = 4, for every choice of the set {E i } 3 i=0 with E i+1 ⩾ E i and δ i+1 ≤ δ i and for any β ′ ≤ β, the following relations hold for i = 1, 2: With Lemma 4 at hand, we are ready to state the result achieved for the d = 4 case with this method.
Theorem 2: Existence of STUs in d = 4 In any 4-dimensional system, for every set of energy which implies the existence of STUs in d = 4 for symmetric Hamiltonians with decreasing energy gaps.
Proof. The statement follows from Lemmas 2 and 4.
We have thus seen that the approach discussed in this section does, at least partially, generalise to local dimension 4. However, the proof of Theorem 2, which exploits the stronger version of Condition (ii) fails for E 3 ≫ 1. Note that this failure is not an artifact of even dimensions, but rather continues to persist in subsequent higher dimensions. Still, the possibility remains that above results can be generalised by proving the weaker version of Condition (ii).

III.3. Geometric approach
Our attempts to generalise the approaches of Secs. III.1 and III.2 to higher dimensions have shown that the problem can be recast in terms of different sets of conditions. However, checking these conditions has proven to be increasingly complex with growing dimension, and has thus only provided partial results even for the case d = 4. We therefore now turn to a third approach which at least provides a complete proof for d = 4. This approach is centred around the geometric structure generated by doubly-stochastic matrices. More specifically, recall that the Schur-Horn theorem implies that for any v ∈ R n , the set of vectors obtained by applying the set of unistochastic n × n matrices to v is a convex polytope given by the convex hull of all permutations of the entries of v, see, e.g., Ref. [25]. Here, we can apply this idea to the vectors q and r i and the matrices M q and M ri , respectively, to generate matching diagonal marginals according to Eq. (25). In other words, the set of all possible symmetric marginal vectors reachable by unitaries that are block-diagonal with respect to the chosen LCS decomposition is the polytope with vertices given by the set of points with i j ∈ {1, 2, . . . , d!} for all j = 0, . . . , k with k as in Eq. (21). Here, Π (i) for i = 1, . . . , d! are the possible permutations of d elements. The question about the existence of STUs is then equivalent to asking if the curve defined by the set of points {p(β ′ ) β ≥ β ′ ≥ 0} is enclosed within the polytope corresponding to the convex hull of the (d!) k+1 points To answer this question, we then proceed in the following way. First, we note that the d-component vectors p = (p i ) only have d − 1 independent entries due to the normalisation condition ∑ for n = 0, 1, . . . , d − 2, while the additional last coordinate i=0 p i = −1 is fixed by normalisation and can thus be disregarded. In these coordinates, the point obtained for infinite temperature (β → 0) is the origin, p(β → 0) = (0, 0, . . . , 0, −1), the point for vanishing temperature is p(β → ∞) = (−1, −1, . . . , −1) and all thermal states lie on a continuous curve connecting these points that is strictly confined to negative coordinate regions, x i ≤ 0∀i. However, note that there are points corresponding to reachable marginals that have positive values for some of the new coordinates.
For any given initial temperature 1 β and dimension d, a sufficient set of conditions for the inclusion of the curve defined by the points {p(β ′ ) β ≥ β ′ ≥ 0} in the reachable polytope is as follows: in the polytope of achievable marginals.
The initial state of the marginals is represented by the point p(β) indicated by a blue arrow. The shaded blue area shows the polytope of diagonal reduced states withpA =pB that is reachable by the application of a unistochastic matrix M on q in combination with a circulant unistochastic matrixM applied to r. This polytope is the convex hull of the points obtained from combining any (out of 6 possible) permutations of q (green arrows) with cyclic permutations (out of 3 possible) of r (orange arrows). We have chosen to restrict to cyclic permutations on r here to illustrate that this is enough for d = 3, whereas this is no longer the case when d = 4. The red arrows between p(β),p (6,1) , and p(β → 0) delineate a triangle, which we show contains all points p(β ′ ) that correspond to thermal states with temperatures higher than the original temperature, β > β ′ . (b) The polytope is shown in terms of the independent coordinates x ≡ x0 = −p0 + p1 and y ≡ x1 = −p0 − p1 + 2p2 from Eq. (58). The vertex v (3) 1 is located at the intersection of the y axis with the line connecting p(β) andp (6,1) . Lemma 5. In the coordinates defined by Eq. (58), the curve of thermal states at inverse temperature β ′ ≤ β satisfies Condition (II) for all β in all dimensions d.
Besides satisfying Condition (II), it is also easy to see that the vertices v 0 and v d−1 from Condition (I) are reachable for all dimensions (see, e.g., Sec. II.2.3). However, showing the inclusion of the rest of the points in (I) is increasingly difficult, due to the rapidly growing number of possible polytope vertices and the difficulty of visualizing the (d − 1)dimensional polytope beyond d = 4. In the following, we prove that Condition (I) holds (at least) in the particular cases of d = 3 and d = 4. See also Fig. 1 for an illustration in dimension 3 and Appendices A.XI and A.XII for further details. In d = 3 and d = 4 systems, for every choice of Hamiltonians and initial inverse temperature β, the set of thermal states with β ′ ≤ β is contained within the polytope with vertices defined in Eq. (59), which proves the existence of STUs in the symmetric two-qutrit and two-quqart cases.
Proof. For d = 3 we have to prove that the point v (3) 1 = (0, x 1 (β), −1) can be reached with transformations of the type p ↦p = M q q + (1 1 + Π)M r r for some doubly-stochastic 3 × 3 matrices M q and M r , where q and r are as in Eq. (29). This is indeed the case, e.g., for M r = 1 1 being the identity matrix and , which is a doubly-stochastic matrix since p 0 + p 1 ≥ 1 2 for d = 3. Note that, geometrically, this choice of M q and M r represents a convex combination of the points p(β) andp (6,1) shown in Fig. (1). For this transformation, we havep = ∑ The point v (4) 1 can be reached with the equivalent of the transformation used to reach v (3) 1 above, that is, using M r1 = M r2 = 1 1 and with m = 1 − 1 [2(p 0 + p 1 )], which is again doubly-stochastic since p 0 + p 1 ≥ 1 2 also holds for d = 4.
To prove that v (4) 2 can be reached, one can see that it can be obtained as a convex combination of (at most) 5 of the vertices in Eq. (57), as we show in detail in Appendix A.XII.
For higher dimensions, as we already mentioned, the problem becomes more and more complex, but one can try to build a recursive approach based on the above lower dimensional proofs and borrowing some ideas from the "passing on the norm" approach. In particular, we outline a possible route for such an approach for the case d = 5 in Appendix A.XIII, where we can show the existence of STUs for d = 5 for a subset of all possible Hamiltonians.

IV. CONCLUSIONS
We have investigated the generation of correlations in initially thermal, uncorrelated systems. For two identical ddimensional systems, the conversion of energy into correlations as measured by the mutual information is optimal when the final state can be reached unitarily and both marginals of the final state are thermal at the same effective temperature. For any given system, the possibility of such an optimal conversion for all input energies (or desired amout of correlations) hence hinges upon the existence of symmetrically thermalizing unitaries for all initial temperatures and effective local final temperatures. This gives rise to the central question: Is it possible to find unitaries (STUs) transforming thermal marginals to other thermal marginals with higher temperature for any local Hamiltonian?
In asymmetric cases, where the two local Hamiltonians are different, this is generally not possible, as we have shown via constraints on the subsystem entropies. For the symmetric case (equal local Hamiltonians), we have provided a framework based on locally classical subspaces in d×d-dimensional systems to address this question beyond previous partial results for equally gapped Hamiltonians [13]. In particular, we have shown that STUs exist for all (locally matching) Hamiltonians in local dimensions d = 3 and d = 4, and we conjecture that STUs exist in all local dimensions.
To showcase the complexity and interesting features of the problem, as well as to provide further guidance for proving (or disproving) our conjecture, we have discussed three approaches operating within our framework. Using the "majorised marginals" approach we showed for two qutrits (d = 3) that, not only do STUs generically exist for any local Hamiltonian at any temperature, but also it is indeed possible to symmetrically reach any marginal that is majorised by the initial marginals. However, since this approach fails to be generalised to higher dimensions, we introduced two alternative approaches that we call "passing on the norm" and "geometric approach", respectively. Both allow proving the existence of STUs in the two-qutrit case. Using the "passings on the norm" approach, we were further able to show that STUs exist for d = 4 when the local Hamiltonians satisfy specific conditions on their energy gaps, i.e., δ i+1 ⩽ δ i . Finally, we have used the "geometric" method to prove the existence of STUs in local dimension d = 4 for all symmetric Hamiltonians, and we formulate a set of conditions to extend this approach to higher dimensions.
Our work addresses a fundamental question in quantum thermodynamics, whether correlations can always be created energetically optimally, or not. Besides a question about conversion between thermodynamic and information-theoretic resources, the question at hand can be considered a part of the quantum marginal problem. What kind of marginals can be unitarily reached from (are compatible with) a particular global state? The framework we put forward in terms of locally classical subspaces is more general than symmetric marginal transformations and also goes beyond mere majorisation relations of marginal eigenvalues. As such, it may also be relevant for other variants of this question, such as addressing the catalytic entropy conjecture of [28]. Just using one of the many possible such subspaces, we managed to resolve our main question for dimensions d = 3 and d = 4 and it would be interesting to know if all marginal eigenvalue distributions could potentially be reached by operating in locally classical subspaces only.

ACKNOWLEDGMENTS
We are grateful to Paul Boes, Mirdit Doda, Christian Gogolin, Claude Klöckl, and Alex Monras for fruitful discussions. We acknowledge support by the Austrian Science Fund (FWF) through the START project Y879-N27, the Lise-Meitner project M 2462-N27, the project P 31339-N27, the Zukunftskolleg ZK03 and the joint Czech-Austrian project MultiQUEST (I 3053-N27 and GF17-33780L In these appendices, we provide detailed proofs of the lemmas supporting the main theorems, as well as additional detailed calculations and counterexamples mentioned in the main text. In Appendix A.II, we investigate the maximal amount of correlations unitarily achievable for a fixed amount of energy that can be created between two arbitrary asymmetric systems initialised in a pure state. In Appendix A.III, we propose a scheme to transform finitely many copies of bipartite thermal states with thermal marginals to states with symmetric thermal marginals at a higher temperature. In Appendix A.IV, we give a detailed proof of Lemma 1. We also discuss why Lemma 1 cannot be generalised to higher dimensions in Appendix A.V. In Appendix A.VI, we show via counterexample that it is in general not possible to map the normalised versions of the vectors q i to the vectors r i by doublystochastic matrices. In Appendix A.VII, we present a detailed proof of Lemma 2 that confirms that Condition (i), used in the "passing on the norm" approach, holds in general. In Appendix A.VIII, we discuss how one can show the existence of STUs via the stronger version of Condition (ii). In Appendix A.IX, by proving Lemma 3 and Lemma 4, we complete the proof of the existence of STUs via this approach in d = 3 and under specific constraints on the energy gaps in d = 4. Then, we turn our attention to the geometric approach and show the monotonicity and convexity of the thermal curve in Appendix A.X. The detailed proofs of the existence of STUs using the geometric approach in dimensions 3 and 4 are presented in Appendix A.XI and Appendix A.XII, respectively. Finally, we discuss the possibility of generalising the geometric method to higher dimensions in Appendix A.XIII.

A.I. Upper bound on correlation
In this appendix, we show that if STUs exist in general, i.e., in particular for the desired temperature, they provide an upper bound for the amount of correlation that can be achieved unitarily. Using the same notation as in the main text, we are interested in solving the problem This can be rewritten as According to Jaynes' principle, the maximum is obtained for whereβ is chosen such that This solution can be found using Lagrange multipliers by considering the n×n matrix AB as an vector with n 2 components. We therefore have as desired Here, we discuss and solve the problem of maximising the correlations under an energy constraint in the asymmetric case with an initial pure state. That is, we want to solve are orthonormal bases of H A and H B , respectively. Without loss of generality we then assume d = d A . Note that S(˜ A ) = S(˜ B ) and i ⟩⟨ ϕ A i such that the problem may be rewritten as where we have dropped the tilde and subscript A on for ease of notation. To solve this problem we consider the converse problem min ,U and show that (at least a family of) optimal points of (A.9) are optimal points of (A.8). To simplify the notation further let us write Proof. Denoting the spectrum of as λ = (λ 0 , . . . , λ d−1 ), we first show that ( opt (β(κ)), U opt ) is a solution of the following minimisation problem where the minimisation on the right-hand side is over all unitaries V . The passive state with spectrum λ is well-known to solve this minimisation. Adopting the notation v ↓ = (v ↓ i ) to denote the vector obtained by arranging the components of the vector v = (v 1 , . . . , v n ) in decreasing order, i.e., such that This solution is in fact also an attainable solution of the lefthand side of (A.17). Hence We have therefore reduced the minimisation problem of (A.14) to solving This problem, in turn, can be solved by means of Lagrange multipliers, which yields which is precisely what is delivered by the solution ( opt (β(κ)), U opt ).

(A.24)
Strictly speaking, the last line is not valid whenH ∝ 1, but in that case one can straightforwardly check that (U opt , 1 d ) solves our original problem (A.8) for any allowed c. We hence (tacitly) discard it from the start.
(A.26) We can then establish the following proposition.
The above is saying that if f has an inverse then that inverse is strictly monotonically increasing. This is indeed how the proof proceeds.
Proof. First, note that we have Further, we have already seen that Similarly one calculates This proves that f is strictly monotonically increasing. It therefore has an inverse f −1 that is also strictly monotonically increasing which proves our claim.
We are now ready to solve the maximisation in (A.8).
Proposition 3. There is a (unique) β such that ( opt (β), U opt ) is a solution of (A.8). and ensures that there exists a (unique) β, sayβ, such that Let us further consider opt (β(κ)), the solution of (A.9) and choose κ = S( 1 ). Then it holds that which, using Proposition (2), implies A.III. Correlating finitely many copies Here we discuss a protocol that can be applied to n copies of the initial state, i.e., τ A (β) ⊗n ⊗ τ B (β) ⊗n , to increase the temperature of the marginals in cases where STUs for single copies exist for small temperature differences, but are not attainable for larger temperature differences. For n copies, the protocol consists of n consecutive steps. In each step, a fixed STU achieving some finite temperature increase is applied to different LCSs that correspond to particular pairs of subsystems, such that for i, j = 1, . . . , n, each subsystem A i interacts with one and only one subsystem B j , and no subsystem interacts again with a subsystem it has previously interacted with. This ensures that all marginals are left diagonal and thermal after each step, and leads to a step-wise increase in the temperature of the marginals. In particular, in the jth step the unitary is applied to the subsystems A i and B i+j−1 mod(n) . This construction guarantees that the tensor product structure of thermal states is preserved for each pair, i.e., τ A (β j ) ⊗ τ B (β j ) where β j denotes the effective local inverse temperature after the jth step.. Furthermore, this means that in each step the STU can be applied, as the marginals are in a thermal state with the same temperature and moreover, product to each other. While this protocol ensures that the desired structures are preserved, the necessary conditions for the protocol to achieve arbitrary final temperatures are still part of ongoing research. In general the question remains, whether it is possible to reach any arbitrary temperature difference, β ′ < β, within finitely many steps. This discussion contrasts the proof of the existence of STUs in the asymptotic case, discussed in Sec. II.2.3.
To illustrate the protocol, let us illustrate the case of n = 4 copies here. In the following, each dot reprsents a subsystem and the application of STUs is denoted by lines connnecting pairs of subsystems. Initially one is confronted with the following situation.
In the first step, we connect subsystems corresponding to the same copy, i.e., A i and B i .
As specified, we are now in the situation where we have slightly decreased the inverse temperature of the marginals to τ (β 1 ). We now apply a unitary to the subsystems A i and B i+1 mod(n) , further decreasing the inverse temperature of the marginals.
In the third step, the subsystems A i and B i+2 mod(n) are connected.
Lastly, we connect A i and B i+3 mod(n) as prescribed.
We have now used the four copies of the initial state, to increase the correlations between the two sides step by step, while retaining the product structure of the thermal marginals, arriving at the final reduced states ρ A B = τ (β 4 ) ⊗4 for the subsystems A and B.
A.IV. Proof of Lemma 1 In this section, we will first show that for any 3 × 3 doubly-stochastic matrix M , there exists another 3×3 doublystochastic matrixM such that M (1+Π) = (1+Π)M , a statement we called Lemma 1 in the main text.
Proof of Lemma 1. In general, all doubly-stochastic matrices can be written as convex combinations of permutation matrices as the following where Π (i) indicate permutation matrices in dimension d, and ∑ i α i = 1. Since we need to show that the statement is true for any doubly-stochastic matrix M , it is sufficient to prove that the statement is true for M being a permutation matrix. In the following we will just focus on dimension 3. In this dimension there are 3! = 6 permutation matrices where we collect the cyclic permutations Π (1) in a particular order, i.e., where Π (1) , Π (5) , and Π (3) trivially commute with Π (5) = Π. It is further straightforward to show that One of the approaches to investigating the existence of the unitaries mentioned in Question 1 is to generalise the 'majorised marginals approach' of Sec. III.1. This approach appears to be promising because of its simplicity and utility. For such a generalisation to higher dimensions to be successful, Eq. (57) would demand to prove the following claim.  Π)) 31 = 0 we arrive at a contradiction. Since Lemma 1 does not hold in general, one can try to relax it to a weaker statement which is still suitable for our purposes. One way to do so is to demand that an equivalent of Eq. (A.44) holds only when applied to (certain) vectors, in the spirit of the observation that although the set of doublystochastic matrices does not coincide with that of unistochastic ones, given a vector v with nonnegative components and a doubly-stochastic matrix M , there is yet a unistochastic M U such that M v = M U v. Thus, we investigate whether the following statement is true or not: Claim 2. Given a vector v of nonnegative components and a doubly-stochastic matrix M , there exists a doubly-stochastic matrixM v , which may depend on v, such that The relaxation being clearly that nowM is allowed to depend on v. Unfortunately, the previous counterexample carries over to Claim 2, as we shall see.

A.VI. Counterexample for majorisation relations
Here, via counterexample, we show that the following claim (discussed in Sec. III.2) does not hold in general: Claim 3. The relation q q ≻ bi bi holds for all d ≥ 2 and β ′ ≤ β.
Counterexample to Claim 3. Consider the case where the last eigenvalue of the local Hamiltonian is infinitely large, E d−1 → ∞. In this case, we know that for a thermal state with a finite temperature p d−1 → 0 and p d−1 (β ′ ) → 0. Due to the definition of the vector q and b i [Eqs. (29) and (32), respectively], it is clear that p d−1 , respectively p d−1 (β ′ ), contributes to one element of the vector q, (q) d−1 = p 2 d−1 , respectively two elements of the vector b i , . Hence, the vectors q and b i have d − 1 and d − 2 nonzero elements, respectively. Now recalling from majorisation theory that no vector can be majorised by a vector with higher rank one can see that Claim 3 does not hold.

A.VII. Proof of Lemma 2
Here, we present the proof of Lemma 2 for any ddimensional system, which ensures that Condition (i) holds in general.
Proof of Eq. (48) in Lemma 2. To prove that Eq. (48) holds, we need to show that for any β > β ′ , g(β) ∶= q is greater than g(β ′ ) = a in d-dimensional systems. To achieve this, one may use the positivity of the first derivative of the function g(β). We therefore calculate Without loss of generality, we have ordered the energy eigenvalues in increasing order, E i+1 ≥ E i . Then, for any pair (m, i) with m > i, we know that (e −β(2Ei+Em) − e −β(Ei+2Em) ) is nonnegative, implying ∂ β q ≥ 0. Also note that the last inequality is strict unless E i = E m for all m > i implying H ∝ 1 for which the problem is already solved. For all practical purposes, the inequality is therefore strict. where p j j+i (β) ∶= e −β(Ej +Ej+i) Z(β) 2 and p ′ j j+i ∶= p j j+i (β ′ ). We then calculate where A j ∶= E j + E j+i for all j = 0, . . . , d − 1. The last expression is nothing else than the vectorised form of a thermal state at inverse temperature β with respect to the Hamiltonian ∑ d−1 j=0 A j j ⟩⟨ j . Since the vector of diagonal entries of any thermal state majorises the vectorised diagonal of any thermal state (with respect to the same Hamiltonian) with lower inverse temperature β ′ < β, which is nothing else than b i b i , Eq. (49) holds, which concludes the proof.
A.VIII. The existence of STUs with the stronger version of Condition (ii) In this section, we will show how one can prove the existence of the STUs if Condition (i) and the stronger version of Condition (ii) from Sec. III.2 hold.
Condition (i) ensures that there exist doubly-stochastic matrices M ri which transform r i r i to b i b i and also q q to a a , i.e., Using Eqs. (25) and (A.55), the marginals then becomẽ To reach a thermal state with higher temperature, one can compensate the required norm of (1 1 + Π i ) b i by extra norm from the vector q. To achieve that, we need to have q ⩾ a , which is ensured by Condition (i). In addition, another requirement for shifting the norm is the existence of doubly-stochastic matrices M q→bi to transform q q to (1 1 + Π i ) b i 2 b i . Due to the HLP theorem, Condition (ii) ensures that such matrices exist. Therefore, M q can be written as where M q→a and the M q→bi are doubly-stochastic matrices that map q to a and q to each of the (1 1 Applying M q to the vector q in Eq. (A.56), we havẽ In order to obtain a thermal state at inverse temperature β ′ ≤ β, one must choose the coefficients α i as Using the fact that convex combinations of doubly-stochastic matrices are doubly-stochastic matrices, to ensure that M q is indeed a doubly-stochastic matrix, one needs to show that the coefficients α i as given by Eq. (A.59) are positive and sum to one. The inequality b i ⩾ r i in Condition (ii) ensures their positivity. Furthermore by using that p = p we have that Here we present detailed proofs of Lemma 3 and Lemma 4.
Proof of Lemma 3. To prove q q ≻ (1+Π)b 2 b , we can employ the following two majorisation relations: If we show that these relations are true for any β ′ ⩽ β, we can say that our statement is proven.
To prove the relation in (A.61), we need to show that the greatest/smallest entry of q q is greater/smaller than the greatest/smallest entry of 1+Π 2 r r . That is, we first need to check that p 00 p 00 + p 11 + p 22 ≥ p 01 + p 20 2(p 01 + p 12 + p 20 ) .
Disregarding the trivial case where p 0 = 1, and p 1 = p 2 = 0, this inequality can be transformed to This inequality indeed holds, since the left-hand side is larger than (or equal to) (e −βE1 + e −βE2 ) 2 , which, in turn, is larger or equal to the righ-hand side of (A.64). Then, second, we need to check the inequality p 22 p 00 + p 11 + p 22 ≥ p 20 + p 12 2(p 01 + p 12 + p 20 ) , where the relevant case (i.e., for p 1 ≠ 1 such that at least p 1 ≠ 0) can be rewritten as This second inequality holds since the right-hand side of (A.66) is larger than (or equal to) (e −βE2 + e −β(E1+E2) ) 2 , which, in turn, is larger or equal to the righ-hand side of (A.66).
To prove Eq. (A.62), we show that the greatest entry of (1+Π)r 2 r is monotonically increasing with β, and that its smallest entry is monotonically decreasing with β. For the former we calculate And for the latter Proof of Eq. (54) in Lemma 4. We prove this lemma under the stated condition on the gap structure of the local Hamiltonians, i.e., δ i ≥δ i+1 . Here we need to show that To do so, we argue that the majorisation relations q which can be rewritten as which is true since the right-hand side is larger or equal e 2βE2 + 2, which, in turn, is larger than (or equal to) the lefthand side, because (1 − e βE2 ) 2 ≥ 0. This proves Inequality (A.72c). Thus far, we have shown that q q ≻ 1+Π 2 r1 r1 . To complete the first part of the proof, we also need to show that To do so, we first prove that the derivatives with respect to β of the two largest elements of 1+Π 2 r1 r1 are positive and those of the two smallest elements are negative. Let us represent the elements of 1+Π 2 r1 r1 as 1 + Π 2 Note that f 0 (β) = p 01 + p 30 2(p 01 + p 12 + p 23 + p 30 ) = 1 2 (A.83) which yields (A.84) p 00 p 00 + p 11 + p 22 + p 33 ≥ p 12 + p 01 2(p 01 + p 12 + p 23 + p 30 ) (A.96) must hold in order for the relation to be true. But in the limit E 3 → ∞, E 1 → 0, E 2 → 0 the above inequality becomes p 00 3p 00 ≥ 2p 00 2(p 00 + p 00 ) which is obviously invalid. The relation q q ≻ r1 r1 also fails sometimes. Looking at the limit E 3 → ∞, q q → (p 00 , p 11 , p 22 , 0) and r1 r1 → (p 02 , 0, p 20 , 0). For the majorisation relation to hold, we would need p 22 ≤ 0, which is clearly not true in general. Proof of Eq. (55) in Lemma 4. We need to prove that r i ⩽ b i , for i = 1, 2, for d = 4. To do so, we again use the first derivative of r i and show that it is always negative. From the partial derivative with respect to β reads (A.99) Now we need to show that for any i ≠ 0, ∂ β r i is negative. For i = 1 we find Since we have ordered the energy eigenvalues in the increasing order, it is straightforward to show that ∂ β r 1 is always nonpositive for any set of energy eigenvalues in dimension 4.
Now similarly, we can show that r 2 ⩽ b 2 . By employing Eq. (A.99), ∂ β r 2 is obtained as The above expression is also nonpositive under the assumption δ i ≥ δ i+1 .
A.X. Proof of monotonicity and convexity of the thermal curve In this section, we will show that the Condition (II) mentioned in Sec. III.3 is true for every choice of the set {E i } d−1 i=0 and any initial inverse temperature.
Expressing a general point in the curve p(β ′ ) = (x 0 (β ′ ), . . . , x d−2 (β ′ ), −1) as a linear combination of the vertices v i in Condition (I) we have where the (a 0 , . . . , a d−1 ) are coefficients given by (A.103) which are positive for all i and satisfy ∑ i a i = 1 (and thus below 1 for all i) if the following condition holds d dβ which means that the function x i x i+1 (β) is monotonically decreasing with β. This in turn means that the final point can be reached as a convex combination of the vertices.
With a convenient relabeling of the index we can write x m+1 x m (β) in the form To prove our claim [Condition (II)], we need to show that for any β ≥ 0 the function x m x m−1 is monotonically decreasing with β. That is, we have to show that The derivative takes the form To prove the nonpositivity of R(β) we need to show that To do so, we define y m ∶= β(E m − E i ), and write inequality (A.108) in the form This inequality is satisfied for all y m+1 ≥ y m if the function h(y) is monotonically decreasing. To see that this is the case, we calculate the derivative which can be seen to be smaller or equal to zero since (1 − y)e y is a monotonically decreasing function for y ≥ 0, i.e., ∂ ∂y (1 − y)e y = −ye y ≤ 0 and hence has its maximum value of 1 at y = 0, confirming that (1 − y)e y − 1 ≤ 0 and that h(y) is monotonically decreasing.
A.X.a. Explicit partial derivatives in dimension 3 To show explicitly that the curve x(β) is monotonically increasing in d = 3, we first calculate where the functions f x (β) and f y (β) are given by With this, we can then evaluate which allows us to confirm that the curve segment lies above (with respect to the coordinates x and y in the plane of the polytope) the line connecting x(β), y(β) and x,ỹ .
To show that this curve is a convex function, we need to show that ∂ 2 y ∂x 2 ≥ 0, and calculate The second partial derivatives with respect to β are found to be Combining this with Eq. (A.114) we find The derivatives of f x (β) and f y (β) are which we insert into Eq. (A.116) to arrive at where we have made use of the fact that f x ≥ 0 (since 1 ≥ e −βE1 ), along with E 1 , E 2 , Z ≥ 0 and E 2 ≥ E 1 .
Finally, to show that the pointÕ = (0, 0, z(β)) is contained within the polytope, we show that the points C, D and E satisfy, x(C) ≥ 0, y(D) ≥ 0 and x(E) ≤ 0, while (AÕ ×AE) z ≥ 0, (EÕ × ED) z ≥ 0 and (DÕ × DC) z ≥ 0, where for any two points X and Y we use the notation XY = Y − X. In other words, the closed path connecting the five points A, B, C, D and E encircles the pointÕ, meaning thatÕ can be written as a convex combination of these points. In a slight abuse of notation, we now use the shorthand p i ≡ p i (β) for the thermal state diagonal components to express the relevant coordinates for the points C, D and E as x(C) = p 0 − p 1 2p 0 + 2p 1 + p 2 − 1 + p 1 − p 2 p 0 + p 1 + p 2 ≥ 0, (A.139a) y(D) = 3 p 0 − p 1 p 0 + p 1 + p 2 − 1 3 + 3 p 1 − p 2 p 0 + p 1 + p 2 − 2 3 ≥ 0, (A.139b) x(E) = − p 0 − p 1 1 − p 0 − p 1 + p 1 − p 2 p 1 + p 2 ≤ 0, (A.139c) while the z-components of the relevant cross products are given explicitly below. In Appendix A.X.b we have further shown that the partial derivatives ∂y ∂x, ∂z ∂x, and ∂z ∂y as well as second derivatives ∂ 2 y ∂x 2 , ∂ 2 z ∂x 2 , and ∂ 2 z ∂y 2 are nonnegative along the curve of thermal states, meaning that Conditions (I) and (II) are satisfied for d = 4.

A.XIII. Outlook on geometric method in higher dimensions
Here we present a possible recursive generalisation of the geometric approach, based on re-expressing Condition (I) in terms of majorisation relations rather than trying to prove it in a full geometric way.
To prove that v 2 = (0, 0, x 2 (β), −1) can be reached, we also considered M r2 = 1 1, which turns out to be a big simplification, due to the following reasoning. Assuming M r2 = 1 1 we can rewrite the general transformation that (potentially) reaches v 2 as v 2 + q − p − (1 1 + Π)(M r1 − 1 1)r 1 = M q q, (A.142) where we used the fact that 1 2 (1 1 + Π 2 )r 2 = p − q − (1 1 + Π)r 1 . Thus, with this simplification of assuming M r2 = 1 1, the problem reduces to showing that there exist some pair (M q , M r1 ) such that Eq. (A.142) holds. In particular, we use the fact that (1 + Π)(M r1 − 1 1) has norm equal to zero, which in turn implies that the sum of vectors on the left hand side of Eq. (A.142) has the same norm as q, since both v 2 and p have norm 1 (i.e., they are probabilities). Note also that the vector (1 1+Π)(M r1 −1 1) by itself need not be ordered with decreasing components, as well as the whole expression in the left hand side of Eq. (A.142). More precisely, in this case the greatest component of expression v 2 + q − p is the third.
To prove the statement, then, we can show that v 2 + q − p − (1 1 + Π)(M r1 − 1 1)r 1 is majorised by q, or, more precisely, we have to find a proper stochastic matrix M r1 such that the above majorisation holds. This is achieved again with a matrix of the form M r1 = M q1 that leads to which implies that we have to choose such that the majorisation relation is correctly satisfied, since the other conditions are also trivially satisfied. The last step to check is that with this choice of the parameter 1 − m, M r1 is indeed a doubly-stochastic matrix. This is true because 1 − m ≤ 1 ⇐⇒ p 0 + p 1 − 2p 2 − 3(p 2 0 − p 2 2 ) ≤ 3p 1 (p 0 − p 2 ) (A.146) also holds, as well as 1 − m ≥ 0, which holds by construction.
For higher dimensions, we can try a sort of recursive reasoning, that we work out here for the case d = 5. Let us hypotesise that the three vertices (v 1 , v 2 , v 3 ) can be found with a set of transformations {(M qi , (M r1 ) i , 1 1)} i=1,2,3 , i.e., that are such that the last stochastic matrix is always the identity (M r2 ) i = 1 1 for all i = 1, 2, 3. This allows us to decompose the last vector as which is satisfied if and only if the left hand side is majorised by q. So now, the task would be to find suitable doubly-stochastic matrices (M r1 ) i such that the left hand side is majorised by q for each of the v i . Let us then start by considering v 1 , which in the probability coordinates is v 1 = ((p 0 + p 1 ) 2, (p 0 + p 1 ) 2, p 2 , p 3 , p 4 ). In this case, the greatest component of v i − p + q is the second, namely p0−p1 2 + p 2 1 . Unlike lower dimensions, however, in this case we have that p0−p1 2 + p 2 1 ≤ p 2 0 ⇐⇒ p 0 + p 1 ≥ 1 2 does not