The intrinsic non-equilibrium nature of thermophoresis

Exposing a solution to a temperature gradient can lead to the accumulation of particles on either the cold or warm side. This phenomenon, known as thermophoresis, has been discovered more than a century ago, and yet its microscopic origin is still debated. Here, we show that thermophoresis can be observed in any system such that the transitions between different internal states are modulated by temperature and such that different internal states have different transport properties. We establish thermophoresis as a genuine non-equilibrium effect, whereby a system of currents in real and internal space that is consistent with the thermodynamic necessity of transporting heat from warm to cold regions. Our approach also provides an expression for the Soret coefficient, which decides whether particles accumulate on the cold or on the warm side, that is associated with the correlation between the energies of the internal states and their transport properties, that instead remain system-specific quantities. Finally, we connect our results to previous approaches based on close-to-equilibrium energetics. Our thermodynamically consistent approach thus encompasses and generalizes previous findings.

A solution in contact with a temperature gradient supports the onset of a phenomenon known as thermophoresis, thermodiffusion, or Ludwig-Soret effect [1,2]. This is characterized by the net migration of particles towards either the cold or warm side of the gradient, leading to a tilted stationary distribution [3]. The first evidence of thermophoresis goes back to the work of Ludwig in 1856 [4]. Since then, it has been observed in several different systems, such as colloidal suspensions [5], bimolecular solutions [6,7], fluid mixtures [8] and DNA beads [9], just to cite some examples.
Despite the overwhelming experimental evidence, a comprehensive microscopic theory of thermophoresis is still lacking [10]. One of the main difficulties consists in the fact that details about particle-solution interactions seems to be non-negligible, being indeed necessary to provide reliable predictions [11], and most of the theoretical efforts rooted in system-dependent modeling have fallen short of determining the essential ingredients responsible for the emergence of thermophoresis. Moreover, although thermophoresis feeds upon the imposed thermal gradient, the role of energy fluxes and their inevitable dissipation is a long-standing enigma [12].
Heuristically, thermophoresis acts on the system as an external velocity drift, v. For diluted concentrations, it is usually assumed that v is proportional to the temperature gradient, ∂ x T [3]. This additional flux competes with standard diffusion, just as any other drift term would, resulting in a non-uniform distribution at steadystate. Therefore, the total flux is: where c is the particle concentration, D and D T are, respectively, diffusion and thermodiffusion coefficients. For the sake of simplicity, we consider here a one-dimensional system. The steady-state concentration, c ss , is usually determined employing the zero-flux condition: with S T = D T /D is the so-called Soret coefficient. Depending on the sign of S T , the particles accumulate on the cold or warm side of the gradient. Eq. (2) relies on observations, and it is not obtained from a microscopic theory. It is not known, for example, which are the system properties determining the sign of the Soret coefficient. Most importantly, it is still unclear how to reconcile the zero-flux condition with the presence of a thermal gradient and its thermodynamic consequences (e.g. heat transport, energy fluxes).
Thermophoresis can be tackled in two different ways [3,13]. Hydrodynamic arguments ascribe a dominant role to the pressure difference caused by thermo-osmotic fluid flow around a particle [14]. On the contrary, thermodynamic models are based on mesoscopic energetic analyses, which account for the leading contributions to thermophoresis when particles are too small to experience appreciable temperature differences on their surroundings.
In this work we mostly focus on the thermodynamic approach, building upon a preliminary observation presented in [15], where thermophoresis emerged in a simple three-state chemical system in the presence of a thermal gradient, as a consequence of the different diffusivities of internal states. Here we aim at formulating a general theory for particles with multiple internal states, providing a microscopic derivation of the phenomenological equation, Eq. (1), finding an expression of the Soret coefficient as a function of the internal parameters. Furthermore, we show that the onset of thermophoresis is inextricably related to the presence of non-vanishing fluxes in the system. Our approach extends to single-state particles, highlighting that in this case the role of internal states is played by different velocities in the complete underdamped description of the system. A previously reported result [16,17], built on the assumption of a closeto-equilibrium regime, is also properly discussed within our framework, by means of a thermodynamic energetic approach.

Diffusivities, energies and thermophoresis
Discrete state-space We first consider a system composed by particles with multiple internal states in a solution, in the presence of a thermal gradient, T (x). Despite being idealized, this model can be easily adapted to describe a large variety of systems, ranging from polymer chains to enantiomers, and catalytic enzymes. In the case of polymer chains, for which thermophoresis has been observed [17], internal states can be associated to different conformations. Analogously, simple molecules can explore different isomers, each associated with an internal state, and enzymes can be associated with the substrate, the product or alone. In short, multi-state particles capture a large array of systems with internal degrees of freedom.
In what follows, we deal with linear reactions X i ↔ X j , such as isomerization processes, even if it is straightforward to generalize the model to multi-molecular reactions (e.g. catalysis), as we will show later in this work. The probability of occupation of each internal state i = 1, . . . n (e.g. chemical species), P i , evolves according to Markovian dynamics and follows a reactiondiffusion equation [18]: where k ij is the rate at which state j transforms into state i, with the usual relation k ii = − j k ij to ensure probability conservation, and D i is the diffusion coefficient of state i. In this formulation, temperature enters implicitly in the kinetic rates, for which we do not provide yet an explicit expression. As a first approximation, we are considering diffusion coefficients that do not depend on temperature, in order to show that thermophoresis can emerge even in this simple setting. Summing over the internal states, we get rid of the information about the dynamics in the internal space obtaining: where P tot = i P i . This equation describes the evolution of the probability to be at x at time t, independently of the chemical states, but it does not provide a complete solution to the system, which requires instead as many equations as the number of internal states.
A simple example: two internal states The onset of thermophoresis in this simple case has been preliminary discussed in [15]. To fully characterize the evolution of the system, we write the dynamical equations for P tot = P 1 + P 2 (analogous to Eq. (4)) and Π = P 1 − P 2 : Examination of Eq. (5) immediately reveals that a necessary condition for thermophoresis (i.e. a non uniform P ss tot ) is that D 1 = D 2 , which we assume to be true throughout this work. Solving Eq. (5) for Π, and introducing it in Eq. (6), the exact stationary solution for the total probability, P ss tot , employing the zero-flux condition, is (see Appendix A1): . If the transition rates are of the usual Arrhenius-like form: where E i is the free energy of the state i, then with Z eq the equilibrium partition function. For slowly varying functions (small wavelength approximation, i.e. ∂ 2 x P ss tot 0), Eq. (7) takes the same form as Eq. (2), with the identification (see Appendix A1 for the detailed derivation of this expression). The spatial and temperature dependence in the effective diffusion coefficient, D eq , results from an averaging procedure over internal states. Hence, D eq depends on x and T (x) through the kinetic rates k ij . Eqs. (5) and (6) represent a system of coupled diffusion equations (with a non-diagonal diffusion matrix) with an extra term in Eq. (6), linear in P tot and Π, that does not obey local continuity (i.e. conservation) conditions and that can thus be considered a source/sink. At equilibrium, it vanishes because of detailed balance (it is indeed nothing else than k 12 P 2 −k 21 P 1 ). In non-equilibrium conditions, instead, it does not vanish, indicating that the system is supporting non-equilibrium fluxes, that are deceptively hidden in the customary, zero-flux phenomenological description, Eq. (2). In turn, this implies that currents of the two species are present in the system.
These fluxes are actually a thermodynamic necessity for heat transport from warm to cold regions. The heat flux across the system is A B Using the relations between P 1 , P 2 , P tot and Π, it is possible to show that, at steady-state and within the small wavelength approximation for Eq. (7), the heat flux is directed from the warm to the cold side: (see Appendix A2 for details of the derivation). Eq. (12) relies on Eq. (7) (which is equivalent to the phenomenological equation (2)), which is behind thermophoresis. Thus, thermophoresis, particle fluxes, and thermodynamically necessary heat fluxes are inextricably intertwined.
Soret coefficient in the fast reaction limit In the Appendix B1, we show that the expression for the Soret coefficient derived above, Eq. (10), can be obtained for a general reaction network obeying Eq. (3), by employing the fast reaction limit. In this approximation, which is equivalent to the small wavelength approximation used for Eq. (7), the transition rates between chemical states, k ij , are much faster than diffusion, a realistic condition in many experimental settings [19,20], and the system locally relaxes to equilibrium. Eq. (10) can then be further developed, leading to (see Appendix B1 for details) Fig. 1, we show an illustrative example of a discretestate system in which, inverting the covariance between energies and diffusion coefficients, the steady-state distribution P tot inverts its tilting accordingly.
This results provides an insight into the physical origin of the Soret coefficient and into its intimate structure.
It highlights the intrinsic relation between thermophoresis and how energies and diffusion coefficients are distributed among the internal states. In particular, when high energy states diffuse faster, particles tend to accumulate on the cold side, and viceversa. This observation might stimulate a new avenue of research about the possibility to design and control thermophoretic response of bio-inspired chemical nanodevices [21,22].

Continuous state-space
Discrete internal states are of course an approximation applicable to systems with continuous internal variables q = {q 1 , . . . q m } that are nonetheless localized most of the time in a few regions because of, for example, deep minima of the potential energy function U ( q). In this case, there are no discrete jumps from one internal state to the other. Instead, the system evolves in the internal space according to a diffusion equation, with diffusion constant ∆(x), and subject to a force −∂ q U ( q). The system also evolves in space according to a diffusion equation with diffusion constant in space, D( q), that crucially depends on the internal state, as in the case of the discrete state system described before.
The corresponding Fokker-Planck (in one spatial dimension and with one internal degree of freedom for simplicity) is: where the spatial and internal variable currents are highlighted. Integrating over all internal degrees of freedom, the system can be described in terms of P tot (x, t) = dqP (x, q, t), which is the probability of finding a particle in the position x at time t independently of the state q: Eq. (15) does not provide a complete description of the system, since it results from the procedure of integrating out the information on q. As a consequence, the noflux boundary condition, translating also in this case into J tot = 0 everywhere, deceptively hide the presence of spatial fluxes for different values of the internal variable q. Once again, these fluxes are a consequence of the nonequilibrium setting and are necessary for heat transport. In Fig. 2 we show the stationary profile of P tot for the simple case of a double-well potential, exhibiting a nonuniform distribution of particles in space, consistently with thermophoresis. Upper inset -Full distribution in the (x, q) space, with probability peaks corresponding to the location of wells in the subspace parametrized by q. Lower inset -The double-well potential is sketched. Here, we set the diffusion coefficients to ∆(x) = 1, and D(q) = α(U (q) + 2.5), where α = 10, 10 −2 or +∞, as indicated by the legend.
The limit of fast internal dynamics can be exploited also in this case, leading to (see Appendix B2): The continuous case is thus equivalent to the discrete one.
Underdamped picture for single-state particles

Internal states and phase-space
In the previous models, the diffusion constant was independent from space to highlight that thermophoresis emerges by the interplay between currents in real and internal space. Of course, if temperature depends on space, then, according to Einstein relation D(x) = k B T (x)/γ (γ being the Stokes' friction coefficient) also the diffusion constant depends on x. Actually, this property alone is enough to induce an accumulation of particles on the cold side (as if in the presence of a positive Soret coefficient). Using a revised derivation of the diffusion equation from the underdamped Kramers equation, we show that also this effect is a consequence of the intrinsic nonequilibrium nature of a non-uniform temperature, with fluxes in the full phase-space.
The Kramers equation for the evolution of the probability P (x, v, t) is where m is the particle mass. It is possible to show (see Appendix C1) that the correct parameter to perform a consistent overdamped limit is the friction characteristic time-scale τ = m/γ. In particular, when τ −1 1, the relaxation due to the friction is much faster than all other time-scales in play, i.e. the system experiences a faster equilibration in velocity space, and the system satisfies the following equation: where P is the marginalized distribution obtained by integrating , that is the ensemble average over the equilibrium distribution in velocity space, with Z eq a normalization factor, and D(x) the overdamped diffusion coefficient satisfying Einstein's relation. Solving by using the zero-flux condition, as above, we obtain: This is clearly an oversimplified model, whose aim is only to show that the Soret coefficient stems, again, from the presence of internal states with different energies and different transport coefficients. In this case, the internal state variable is the velocity, and the internal state energy is the kinetic energy. Clearly, higher kinetic energy positively correlate with faster transport resulting in a positive Soret coefficient.

Non-equilibrium fluxes in phase-space
We integrated out variables associated with a faster relaxation to obtain equations for the total probability of finding a particle in position x at time t, as the ones in Eq. (4) and Eq. (18). There are hidden non-equilibrium fluxes associated to these hidden degrees of freedom. In the previous simple case of two internal states, although the total concentration was flux-less, there where spatial fluxes of the two states that, by conservation of the probability, are accompanied by fluxes in the internal space (Fig.3, upper panel). Analogously, when dealing with the case of underdamped single-state particles, nonequilibrium fluxes take place in the whole phase-space, although they do not appear in the overdamped dynamical description, Eq. (18). Their presence, however, is crucial both to sustain a non-uniform stationary marginalized distribution P(x), which is a signature of the presence of a thermophoretic effect, and to transport heat as dictated by the laws of thermodynamics.
The component of the current in position-space is . Its integral over the velocity space is zero because of the zero-flux condition in space. However, J x (x, v, t) is not zero in the whole phase-space (x, v), consistently with the nonequilibrium conditions. Indeed, we show in the Appendix C2, that the probability current of particles slower than |v|, for any |v|, is always parallel to ∂ x T (thus directed from the cold to the warm side), implying that, because of the no-flux condition, the current accounting for particles faster than |v|, is always parallel to −∂ x T (thus running from the warm to the cold side). Both of them vanish only for |v| = 0 and |v| → +∞, reaching their maximum absolute value at v * = 3T (x)/m. Just as in the two-states case currents close in the internal space, here they close in velocity space, with slow particles warming up on the warm side and fast particles cooling down on the cold side (Fig.3, lower panel). The system thus picks up heat on the warm side, transports it across the system and releases it on the cold side. A detailed analysis of the total energy current in position space shows that, to the leading order, at stationarity (see Appendix C3 for details) where N is the normalization factor of steady-state solution.

Soret coefficient and dimer formation
As an extension to the presented model, consider the case of a non-diluted solution in which interactions among particles are not negligible. This picture allows for the formation of complex states, with a more complex energy landscape. Here, we investigate the simple case of dimer formation, as sketched in Fig. 4. The system is described by the following reaction-diffusion equation: where c 1 and c 2 are, respectively, monomer and dimer concentrations, satisfying the normalization condition c 1 + 2c 2 = c tot , with c tot total concentration. The dissociation constant has the usual form K d (x) = k−(x) k+(x) , where both association and dissociation rates depend on space through temperature (as a reminder, the dissociation constant has the dimensions of a concentration).
Defining · eq = 1 ctot 2 n=1 · nc eq n , the Soret coefficient in the fast reaction limit is again of the form Eq. (13), and in particular (see Appendix D) where , with g(T, K d , c tot ) a positive function. Since dimers are typically larger than monomers, hance D 1 − D 2 > 0, and the dissociation constant increases with temperature (dimers tend to dissociate at higher temperatures), the Soret coefficient can be either negative or positive, with the overall concentration of molecules higher on the warm or cold side, respectively. While an accumulation on the cold side intuitively follows the direction of the heat flow, it is also possible to conceive complex scenarios, whereby dimers are stabilized by contacts between unstructured regions, as in proteins, hence increasing temperatures might stabilize the dimer state.
In Fig. 4, we simulate the system of equation Eq. (D1), showing the appearance of thermophoresis. Previously, we showed that single-state particles exhibit a positive Soret coefficient in dilute solutions (Eq. (19)). When particle-particle interactions become non-negligible, in non-dilute solutions, the potential formation of complex molecules may lead to an additional contribution to the Soret coefficient. The combination of these two effects might even result in an inversion of the thermophoretic response, as a function of particle concentrations.

Discussion and conclusions
A theoretical understanding of thermophoresis has to date been elusive, despite the effect being well established. A source of confusion has for sure been the lack of a microscopic characterization of the Soret coefficient, likely due to its apparent dependence on the system details. Furthermore, even though thermophore-sis is intrinsically a non-equilibrium effect, since it depends on the presence of a thermal gradient, its connection with non-equilibrium statistical physics has not yet been fully established. As a matter of fact, several approaches are based on a free-energy description, which is formally inappropriate in a non-equilibrium scenario, and which can be recovered only from quasi-equilibrium or local-equilibrium assumptions, that must nonetheless be justified on rigorous grounds.
In this work we have tried to move a first step in this direction, by firmly treating thermophoresis in the framework of stochastic thermodynamics, which is being broadly accepted as the correct way to cast nonequilibrium phenomena. We could thus establish a few, important, facts about thermophoresis: • Thermophoresis emerges through the interplay between transport in real space and temperaturemodulated transitions in some internal space, which can be a chemical, conformational, or velocity space • The phenomenological approach to thermophoresis, Eq. (2), is an approximation of the correct equations, which is valid only in the fast reaction limit, corresponding to the local-equilibrium assumption • Eq. (2) is customarily solved with the no-flux condition, hiding the presence of currents for the underlying degrees of freedom, which are present both in real and internal spaces • These currents are consistent with the nonequilibrium setting determined by the thermal gradient, and are actually a necessity for heat transport from warm to cold regions, as dictated by thermodynamics • The Soret coefficient is related to the microscopic features of the system through the correlation between transport properties of each internal state and its energy In particular, we have provided here a general (albeit valid only within the fast-reaction approximation) formula for the Soret coefficient, which proposes a bridge toward its microscopic understanding, and rationalizes its dependence on a multitude of system-specific factors. For example, the diffusion coefficient of a given conformation (internal state) of the system might depend on its peculiar interactions with all the components of the surrounding solvent, that can be derived only through a careful microscopic treatment. Nonetheless, Eq. 13 provides the mathematical framework through which microscopic details must be assembled.
In this respect, the connection to thermodynamic approaches to thermophoresis deserve a special comment. Indeed, as shown in [16,17], the thermodynamic derivation of the Soret coefficient argues that S T = T −1 ∂ T G, where G is the free-energy. However, in order to define a free-energy, each point in space should be in equilibrium with a bath at the local temperature T (x). From stochastic thermodynamics arguments, dG = dQ − T dS, (Q being the heat and S the entropy) with dQ = 0 and dS = d(− log P tot ), since thermophoresis is captured by a description in terms of P tot , which follows a purely diffusive equation, Eq. (4). Hence, S T = ∂ T S = −∂ T P tot /P tot , being indeed paired with a quantity encoding, through the derivative, information about spatial transport due to thermal gradient. This mixed approach highlights the link between our approach and previous ones.
As already mentioned, our appraoch is not restricted to the overdamped regime. As a matter of fact, through a careful and unambiguous derivation of the Smoluchowsky equation from the underdamped Kramers equation, we have shown that also the presence of a diffusion constant that depends on space, through its dependence on a nonuniform temperature field, goes hand-in-hand with the presence of currents in phase space whose presence is necessary for heat transport.
We have also presented the extension of our model to the case of non-dilute solutions with dimer formation, to highlight how our approach can be straightforwardly extended to several other systems with internal states. Moreover, simple chemical systems could be experimentally tested, in order to verify and improve the theoretical grasp on the relationship between non-equilibrium fluxes, microscopic parameters, and thermophoresis. To find the exact solution of a two-state diffusion-reaction system, let us define two variables: the sum P tot = P 1 +P 2 and the difference Π = P 1 − P 2 . Using these two, we rewrite the time evolution equation: Moving in Fourier space, we introduce the Fourier transform of each function: hence, naming D = D 1 + D 2 and ∆ = D 1 − D 2 , we have the following set of equations: Solving the first equation, when k = 0, we have: that, in real space, corresponds to: A non-local term appears, which is the propagator of the one-dimensional Laplacian operator. It governs the diffusive dynamics of the system toward stationarity. However, the general solution of Π has to include an additional term: where f (x) is such that ∂ 2 x f (x) = 0. Inserting (A6) into the second line of (A3), multiplying by ∆, and dividing by k sum = k 12 + k 21 , we obtain: This equation is directly in real space. We remark that we have divided by k sum , in order to isolate the term f (x).
Since we also know that the second spatial derivative of this function has to vanish, we can derive twice with respect to x: This form constitutes the exact solution, without any approximation. The first line is the same equation that appears in the fast reaction limit, derived in detail in the next Section. The second line is the correction to compute the steady state, P ss tot , when the chemical reactions are not faster than diffusion. The last line controls the evolution of P tot towards the steady state, and it contains the kernel of long-range interactions. At this point, we note that the fast reaction limit is, in general, a good approximation, since the only correction, at stationarity, is a fourth-order spatial derivative of P ss tot . In Fig. 5 we compare the steady state profile in the fast reaction limit with two choices of parameters, D 1 = 10 −2 (reactions faster than diffusion), and D 1 = 10 −4 (reactions slower than diffusion).
For slowly varying functions, i.e. in the small wavelength approximation, corresponding to ∂ 2 x P ss tot 0, (A8) takes the following form at stationarity: If the transition rates are of the usual Arrhenius-like form: where E i is the free energy of the state i, we define the equilibrium ensemble average of the diffusion coefficient: with Z eq the equilibrium partition function. Then, employing the zero-flux condition, we have the following equation for the stationary state: which leads to the following identification of the Soret coefficient: The energy flux from the warm side to the cold side For two-state particles diffusing in a temperature gradient, the energy flux associated to transport of heat across the system is: where −D i ∂ x P i is the diffusive probability flux for the state i. Writing P 1 and P 2 as functions of P tot and Π, we have: Reminding that, at steady-state, Π ss = −P ss tot D/∆ (see the previous subsection), after some algebra, we have: Spelling out ∂ x P ss tot by inverting (A13), we finally have: Appendix B: Soret coefficient in the fast-reaction limit

Discrete chemical space
Let us start from the reaction-diffusion system defined in the main text: Let us suppose that chemical reaction are faster than diffusion. Starting from Eq. (B1), we can employ a standard time-scale separation analysis. First, we explicit the time scale of reaction rate, k ij =k ij / , with 1. Second, we propose a solution in the following form: Plugging this expression into Eq. (B1), and solving order by order, we find the following zeroth order equation: This means that P i (x, t) can be written as: where T (x) is the local temperature at position x, Z = i e −Ei/k B T (x) is the partition function. In other words, it is the product between Boltzmann distribution in chemical space and a generic time-dependent function. At first order, summing over all chemical states, the system satisfy: Noting that π(x, t) = j P (0) j (x) = P tot (x), up to the leading order in , from Eq. (B5), we obtain: with the equilibrium ensemble average defined as: Hence, the Soret coefficient is the ensemble covariance of energies and diffusion coefficients. This provides some physical insights into the thermophoresis problem from an energetic perspective. If energy and diffusion coefficients are positively correlated, i.e. particle in high-energy states diffuse faster, then the Soret coefficient is positive, and the particles tend to move from the hot region to the cold region. The opposite motion happens when there is a negative correlation.

Continuous chemical space
A system may experience a continuum of possible internal states, and the reaction-diffusion equation results as a coarse-grained description of transition among local minima. When this approximation is not valid, the complete dynamics of the system can be captured by a Fokker-Planck equation, as introduced in the main text: where P ≡ P (x, q), is the probability distribution in the position-chemical space. We aim at investigating the fast reaction limit in this context, i.e. the flux acting on chemical space, J q →J q / , with 1, is much stronger than the one acting on real space, J x . This condition is also equivalent to the following assumption: where the last equality corresponds to the Einstein relation. Guessing a solution of the following form: at the zeroth order, we have: By solving this equation, we have: where P(x) has to be determined solving the first order equation, and the other factor is the equilibrium distribution in chemical space. At first order, integrating over the chemical space by employing the zero-flux condition, we have: where Φ(x) = d dq P 0 (x, q) is the marginal distribution along x, while the other term is the average of the D x over chemical space. Writing the ensemble average in chemical space as · q = 1 Zq q dq · exp − Uq k B T , we obtain which is the same as the discrete-state case. The Soret coefficient can be equally written in covariance form: (B15) We assume that the system is operating in the overdamped regime, i.e. in a region of parameters where the time-scale associated to the friction, τ −1 = γ/m is much faster than all the others in play. It is then natural to employ a time-scale separation procedure to develop an equation describing the dynamics at the leading order. In order to understand the correct scaling of (C1) with respect to the expansion parameter τ 1, we perform the following change of variables: Then, in terms of these new variables, (C1) becomes: In the limit of small 1/τ , the probability distribution can be written as: since the smallest order appearing in the modified Kramers equation (C3) is proportional to √ τ , which is then the natural expansion parameter to be adopted.

a. Zeroth order
Substituting (C4) in (C3) we obtain a set of equations at different orders in √ τ . The zeroth order equation is equal to: It is easy to verify that the solution is in the form P 0 (x, v, t) = e − v 2 2T (x) Φ 0 (x, t).

b. First order
The equation resulting from the first order terms is: Introducing the expression for P 0 , and guessing the P 1 has a similar form, i.e. P 1 = e − v 2 2T (x) Φ 1 (x, v, t), we get: where, for sake of clarity, we have not reported all the dependence on x and v. Guessing a solution of the form we obtain: Integrating this equation between −∞ and v , we get: where we assumed that the probability distribution has no divergent behavior at any order. Then, we have the following solution for the full probability distribution: where f (x, t) is an arbitrary function acting as a perturbation of Φ 0 (x, t). Moreover, P (x, v, t) is the full probability distribution of x and v. However, we are interested in deriving an effective equation for the evolution of the the marginal distribution of x and t. In fact, the time-scale expansion we have performed is compatible with the situation in which the variables v thermalize faster than x. Using the standard idea of the time-scale separation, we integrate over v, in order to understand what is the effective probability distribution we would like to describe, P(x, t), in terms of the full state space: where, from now on, we adopt the following notation: · = +∞ −∞ · e − v 2 2T (x) dv. The integration can be easily performed by noting the parity of the function involved.

c. Second order
Going up to the second order in √ τ , and using the general expressions for P 0 and P 1 , we obtain the following equation: Since we want to find the dynamical evolution of the function P(x, t) = Φ 0 (x, t) 1 defined in (C11), we integrate (C12) over the domain of v, obtaining: 1 ∂ t Φ 0 (x, t) + ∂ x ( vΦ 1 (x, v, t) ) = ∂ t P(x, t) + ∂ x ( vΦ 1 (x, v, t) ) = 0 (C13) where we have used again the non-singularity condition of the probability distribution.
Here, we evaluate the following integral: Notice that the function f introduced in (C10) disappears after the integration over v both in the evaluation of P, (C11), and in the equation above. Then, f has no effect in the limit of thermalizing velocities (i.e. overdamped) we are considering. Using the fact that v 4 and v 2 can be evaluated explicitly, and that the following relation holds: v 4 = 3 v 2 T (x), we get: Moreover, we highlight that: Substituting this equality in (C15), we finally get: Then, (C18), in terms of the marginal probability distribution P(x, t), becomes: which is the standard Smoluchowski equation, where D(x) = v 2 eq . Note that we identified v 2 / 1 = v 2 eq , that is the equilibrium ensemble average of v 2 .

d. Conclusions
Hence, the origin of the thermophoretic effect for hard spheres, with no internal states, diffusing in a temperature gradient has to be found in the fact that spheres with different velocities have different transport properties. In other words, the ensemble of possible states in the velocity space plays the same role of the internal states of a molecule. It is important to note that the integration over the velocity states is crucial, since it allows us to associate to the same position x a plethora of states with different velocities: this is why these latter assume the same flavour of internal (e.g. configurational, energetic) states.
By writing down the expression of D(x), and mapping back the final equation to the original variables (x, v), we obtain the following standard Smoluchowski equation with a diffusion coefficient satisfying Einstein relation, as for the original Kramers equation we started with: ∂ t P(x, t) = m γ ∂ 2 x v 2 eq P(x, t) = ∂ 2 x (D(x)P(x, t)) (C19) where D(x) = k B T (x)/γ is the standard overdamped diffusion coefficient.
e. Beyond the second order -determination of f (x, t) The second order equation (C12) can be solved by looking for a solution of the form P 2 = e − v 2 2T x) Φ 2 (x, v, t), and integrating over v between −∞ and v, as before. This leads to: employing the non-singular behaviour of P 2 , where · (v) = v −∞ . Expressing · (v) in terms of · , we obtain: where I(x, v) is the remaining integral that does not contain exponential terms. Using the Smoluchowski equation, (C18):