Irreversibility, heat and information flows induced by non-reciprocal interactions

We study the thermodynamic properties induced by non-reciprocal interactions between stochastic degrees of freedom in time- and space-continuous systems. We show that, under fairly general conditions, non-reciprocal coupling alone implies a steady energy flow through the system, i.e., non-equilibrium. Projecting out the non-reciprocally coupled degrees of freedom renders non-Markovian, one-variable Langevin descriptions with complex types of memory, for which we find a generalized second law involving information flow. We demonstrate that non-reciprocal linear interactions can be used to engineer non-monotonic memory, which is typical for, e.g., time-delayed feedback control, and is automatically accompanied by a nonzero information flow through the system. Furthermore, even a single non-reciprocally coupled degree of freedom can extract energy from a single heat bath (under isothermal conditions), and can thus be viewed as a minimal version of a time-continuous, autonomous ‘Maxwell demon’. We also show that for appropriate parameter settings, the non-reciprocal system has characteristic features of active matter, such as a positive energy input on the level of the fluctuating trajectories without global particle transport.


Introduction
Fundamental physical interactions between mutually coupled particles, such as atoms or molecules, are typically reciprocal. They are derivable from a Hamiltonian (i.e., conservative) and thus fulfill, automatically, Newton's third law, actio = reactio. In the absence of driving forces or (temperature) gradients, systems with reciprocal interactions equilibrate and are well described by traditional thermodynamics. This holds even on the mesoscale, that is, when instead of the full microscopic dynamics, only few representative (stochastic) variables are considered by integrating out all other degrees of freedom (d.o.f.). This is the key idea of the celebrated Mori-Zwanzig approach [1] yielding a generalized Langevin equation (LE), which involves noise and a memory kernel satisfying a fluctuation-dissipation relation (FDR), and may stochastically describe the motion of a colloid in a complex environment (e.g., a viscoelastic fluid [2][3][4][5]).
While some models which involve non-reciprocal interactions have already been studied from a thermodynamic perspective [35][36][37][38][39][40][41][42], the general thermodynamic and information-theoretical implications of non-reciprocity itself have, to our knowledge, not been discussed so far. This is the first major goal of this paper. To this end, we will review and reinterpret some results from the literature (for systems with two d.o.f.), and derive new formulae for larger systems. In particular, we consider (mostly) overdamped Markovian systems of n + 1 non-reciprocally coupled subsystems X_{0,1,...,n} with white noise. Each subsystem can represent, e.g., the position of a colloid in an experiment (accordingly, we will assume that the variables are even under time-reversal, like positions or angles). By considering different thermodynamic quantities, we investigate the following questions: can non-reciprocal systems reach a state of thermal equilibrium? Is there a crucial difference between nonequilibrium states induced by non-reciprocity and those induced by external driving? Indeed, we show here that, except for some specific cases, non-reciprocal systems are inherently out of equilibrium, even in the absence of external forces or (temperature) gradients. In order to discuss the fundamental consequences of non-reciprocity on a purely analytical basis, we will consider linear models. However, as we will discuss, several conclusions carry over to non-linear models. As different representatives of non-reciprocally coupled systems that share some crucial features, we will consider, on the one hand, active systems and, on the other hand, feedback-controlled systems. The second main goal of this paper is to explain why, under certain conditions, a setup with non-reciprocal linear couplings can be used to build a 'microswimmer', a 'feedback controller', or a 'Maxwell demon'. For microswimmers, thermodynamic notions have already attracted considerable attention [32, 35-41, 43, 44].
Here, we calculate the information and energy flow between the particle (here X_0) and its propulsion mechanism (here represented by at least one subsystem X_1), confirming general expectations, e.g., that the active swimmer heats up its environment but never cools it down. In contrast, in the context of time- and space-continuous feedback [45][46][47][48][49], the connection to non-reciprocal coupling is rather uncommon and new. Therefore, we dedicate a more detailed analysis to this point. We show that linear non-reciprocal couplings can be used to construct a time-delayed feedback loop, and clarify under which conditions a non-reciprocally coupled d.o.f. can extract energy from a single heat bath, making it a 'Maxwell demon'. We further find conditions under which thermal fluctuation suppression (or enhancement), i.e., 'isothermal compression or expansion' of a single-particle gas, are possible.
While some of the questions and connections discussed here may seem intuitively clear, almost representing 'common wisdom', there are only few studies where these issues are formally addressed. Moreover, we also detect counter-intuitive phenomena. For example, non-Markovian processes can exhibit a nonequilibrium steady state (NESS) without dissipation, where the entropy is exported purely in the form of information, implying that information and entropy are transported without accompanying energy flow (although sustaining this process as a whole relies on an external energy supply). Furthermore, we show that, under certain conditions, a system of two isothermal subsystems with non-reciprocal coupling can be mapped onto a reciprocal system with a temperature gradient, building a bridge to other active matter models [44,50,51]. In this context, we also consider the underdamped case. In addition, we provide a detailed derivation of the relevant information flows, a quantity that is so far not well established for time- and space-continuous systems.
From a conceptual viewpoint it is important to also think about situations where a portion of the d.o.f. might be invisible to a ('marginal') observer. Moreover, in some theoretical models, a portion of the d.o.f. has no direct physical interpretation. Then, the dynamics can be equivalently formulated as a non-Markovian, one-variable equation (for X_0) with a memory kernel and colored noise, upon projecting out X_{j>0}. In such a situation, the interpretation of thermodynamic quantities must be treated with care, and is indeed the subject of a recent debate [35,37,40,47,48]. To account for this fact, we will pay special attention to the different measures of (non)equilibrium on the levels of the Markovian and non-Markovian descriptions, and also explicitly consider the entropy balance of an individual subsystem. We will further comment on the connection to so-called 'effective thermodynamic' descriptions [52,53].
We close this introduction with a brief outline. After introducing the model in section 2, we will investigate under which conditions detailed balance (DB) and the FDR are satisfied (section 3). Then, we will calculate the total entropy production of the entire system and the dissipation of an individual subsystem in section 4. Thereafter, we will consider the entropy balance of an individual subsystem and derive explicit expressions for the information flows through the system (section 5). In section 6, we show that, under certain conditions, a non-reciprocal (overdamped) system can be mapped onto a reciprocal one. This is also possible for the corresponding underdamped case, as discussed in section 7. There, we also consider the heat flow in a non-reciprocal system with inertia. We finally conclude in section 8.

Model
We consider time- and space-continuous systems described by the Markovian LEs

γ_jẊ_j = δ_{j0} f_0(X_0) + Σ_{i=0}^{n} a_ji X_i + ξ_j,  j ∈ {0, 1, . . ., n},  (1)

with the vector X = (X_0, X_1, . . ., X_n)^T ∈ R^{n+1} involving n + 1 stochastic d.o.f. We will discuss thermodynamic properties of both, the entire system {X_0, X_1, . . ., X_n}, and of the individual X_j. To set the focus, we will occasionally call {X_0, X_1, . . ., X_n} the 'super-system', while an individual X_j will be called a 'sub-system'. Further, ξ_j denote zero-mean, Gaussian white noises with ⟨ξ_j(t)ξ_k(t')⟩ = 2k_B T_j γ_j δ_jk δ(t − t'), j, k ∈ {0, . . ., n}, with k_B, γ_j being the Boltzmann and friction constants that also appear in the diagonal friction matrix γ with γ_jj = γ_j. f_0 is an, in general, nonlinear force acting on X_0 only. The coupling matrix a defines the strength of the couplings a_ij, and gives the timescale γ_j/|a_jj| of the exponential relaxation dynamics of each d.o.f., due to the restoring forces a_jj X_j. We will focus on cases where the motion of X_0 is confined, i.e., a_00 < 0, and consider natural boundary conditions, i.e., the probability to find the particle vanishes at X_0 → ±∞. Further, we will focus on situations where a stable steady state exists, which is the case whenever the real part of the largest eigenvalue of the coupling matrix a is negative.
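The coupled LEs (1) can be integrated straightforwardly. The following minimal sketch (our own illustration, not part of the original analysis) uses an Euler-Maruyama scheme for an illustrative non-reciprocal pair; the function name `simulate` and all parameter values are assumptions chosen for demonstration.

```python
import numpy as np

def simulate(a, gammas, temps, kB=1.0, dt=1e-3, steps=400_000, seed=1):
    """Euler-Maruyama integration of gamma_j dX_j/dt = sum_i a_ji X_i + xi_j
    with <xi_j(t) xi_k(t')> = 2 kB T_j gamma_j delta_jk delta(t - t')."""
    rng = np.random.default_rng(seed)
    g = np.asarray(gammas, dtype=float)
    T = np.asarray(temps, dtype=float)
    x = np.zeros(a.shape[0])
    traj = np.empty((steps, a.shape[0]))
    # per-step noise increment on x: sqrt(2 kB T_j gamma_j dt) / gamma_j
    amp = np.sqrt(2.0 * kB * T * g * dt) / g
    for step in range(steps):
        x = x + dt * (a @ x) / g + amp * rng.standard_normal(x.size)
        traj[step] = x
    return traj

# illustrative, stable non-reciprocal pair (a_01 = 0.5 != a_10 = 1.0)
a = np.array([[-1.0, 0.5],
              [1.0, -1.0]])
traj = simulate(a, gammas=[1.0, 1.0], temps=[1.0, 1.0])
var0 = traj[len(traj) // 2:, 0].var()   # stationary variance of X_0
```

The second half of the trajectory serves as a rough estimate of the steady state; for quantitative comparisons, the stationary covariance can instead be obtained from the corresponding Lyapunov equation.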
At this point, we may already note one apparent difference between reciprocal systems (a_ij = a_ji ∀ i, j) and those that involve non-reciprocal couplings (a_ij ≠ a_ji): only the reciprocally coupled equations can be expressed as derivatives of a Hamiltonian, plus noise terms (and, if present, plus non-conservative forces f_0). In that case, (1) can be written as γ_jẊ_j = −∂H/∂X_j + ξ_j, with the Hamiltonian

H = −Σ_{j=0}^{n} (a_jj/2) X_j² − Σ_{i<j} a_ij X_i X_j,  (2)

where the last term in (2) represents the interaction part, H_int. In contrast, non-reciprocal couplings appear as a non-conservative force (like f_0). In that case, (1) corresponds to γ_jẊ_j = −∂V_j/∂X_j + Σ_{i≠j} a_ji X_i + ξ_j, with the single-particle potentials V_j = −(a_jj/2)X_j². Equivalently to (1), one can describe the dynamics of one d.o.f., say X_0, by a one-variable LE

γ_0Ẋ_0(t) = f_0(X_0) + a_00 X_0(t) + ∫_{−∞}^{t} K(t − t') X_0(t') dt' + ξ_0(t) + ν(t),  (3)

which can be derived by projecting out the X_{j>0}, as described in [1,54] and in appendices A and B. Generally (unless the time-scales of X_0 and X_{j>0} are well-separated), (3) is a non-Markovian LE, i.e., it comprises memory. In particular, it involves a time-nonlocal force depending on the past trajectory, weighted with a memory kernel K, and ν is a zero-mean, Gaussian colored noise (both depend on the topology of the coupling matrix; concrete examples are given below). For T_{j>0} ≡ 0, there is no colored noise in (3). We emphasize that the dynamics of X_0 in (3) is identical to that in (1). Using (3) instead of (1) can be regarded as a coarse-graining or marginalization, because the dynamics of X_{j>0} is not explicitly considered. However, it does not imply loss of information about, or approximation of, X_0. One should note that, in reverse, for a non-Markovian process (3), a corresponding Markovian representation (1) is not unique. Thus, a specific memory can be realized by different Markovian networks [this can be seen, e.g., from equation (4) by the fact that a_01 and a_10 only arise as a product, a_01a_10].
For the sake of generality, we deliberately do not focus on a specific model, and rather offer different interpretations for the involved d.o.f.; explicit examples will be given below. However, a situation of special interest is that the observer only sees parts of the system (say only X_0), while the other d.o.f. are 'hidden'. Moreover, in some cases, only certain d.o.f. (say only X_0) represent actual, physical d.o.f. (such as the position of a colloid), whereas the others (say X_{j>0}) are effective (or auxiliary) variables representing those parts of the complex environment which generate a feedback loop or active motion. In such a situation, a non-Markovian description (3), which only involves X_0, may be the more fundamental one. We will discuss both situations, only X_0 or all X_j being observed, in this paper.
Before we start investigating the thermodynamic consequences of non-reciprocity, we first discuss the relationship between non-reciprocal coupling in (1) and the resulting memory in (3), and then give some examples of systems that can be modeled by (1) and (3).

Figure 1. (a) For reciprocal coupling κ = p, which corresponds to a mechanical system, the memory kernel is exponentially decaying; non-reciprocal coupling κ ≠ p yields non-monotonic memory (5). (b) and (c) Memory kernels K(T) (solid black lines) and noise correlations C_ν (gray dashed lines) generated by systems with the coupling topologies shown in the insets, (b) n = 1, (c) n = 2, and k = 1, τ = 1/2, T_{j>0} = 1.

Memory induced by non-reciprocally coupled systems
We begin by considering the smallest version of (1) with n = 1. While various aspects of this case have been studied previously [35,37,40,[55][56][57][58]], the full implications of non-reciprocity have so far, to the best of our knowledge, not been discussed. For n = 1, the memory kernel K and the noise correlations C_ν(T) := ⟨ν(t)ν(t + T)⟩ are both found to decay exponentially for reciprocal as well as non-reciprocal coupling, and read

K(T) = (a_01a_10/γ_1) e^{a_11T/γ_1},  C_ν(T) = (a_01² k_B T_1/|a_11|) e^{a_11T/γ_1}.  (4)

An exemplary plot of both functions is given in figure 1(a).
Let us now investigate the effect of adding more sub-systems X_j to the super-system (1), such that there may be an interplay of multiple non-reciprocal interactions. Most importantly in the present context, this leads to complex types of memory beyond the single exponential decay. To illustrate this, let us consider a ring of three d.o.f., where all clockwise couplings are set to κ and all counter-clockwise couplings to p (i.e., a_{j,j−1} = κ, a_{j,j+1} = p, −a_jj = p + κ, with indices taken modulo 3), as sketched in figure 1(a). This super-system generates the memory kernel (see appendix A for a derivation; here we set γ_j = 1)

K(T) = e^{−(p+κ)T} [2pκ cosh(√(pκ) T) + (p³ + κ³) sinh(√(pκ) T)/√(pκ)].  (5)

For reciprocal, i.e., conservative couplings, κ = p, (5) simplifies to an exponential decay K(T = |t − t'|) = 2κ² e^{−κT}. In contrast, if the coupling is non-reciprocal, we find that the super-system (1) generates a non-monotonic memory kernel, despite the linearity of all couplings. In the present example, the memory kernel (5) has a maximum at a finite time difference. In the limit of unidirectional coupling p → 0, the memory kernel (5) converges to a Gamma distribution, K(T) = κ³ T e^{−κT}, which has a pronounced maximum at T = 1/κ, see figure 1(b). Noteworthy, in this limit, the kernel vanishes at T = 0, i.e., the instantaneous position does not contribute to the integral ∫ X_0(t')K(t − t')dt' in (3) [while the integral is dominated by the instantaneous position for reciprocal coupling]. In appendix A, we discuss the general case where all couplings are different, yielding very cumbersome expressions while the overall characteristics are the same.
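The ring kernel and its two limits can be checked numerically by comparing the closed form against the matrix exponential of the hidden (X_1, X_2) subsystem; this is our own consistency check (assuming γ_j = 1, with illustrative function names), not a computation from the paper.

```python
import numpy as np
from scipy.linalg import expm

def K_ring(T, kappa, p):
    """Kernel from projecting X_1, X_2 out of the 3-ring (gamma_j = 1)."""
    A = np.array([[-(p + kappa), p],
                  [kappa, -(p + kappa)]])   # hidden-subsystem dynamics
    back = np.array([p, kappa])             # couplings of X_1, X_2 back into X_0
    out = np.array([kappa, p])              # couplings of X_0 into X_1, X_2
    return back @ expm(A * T) @ out

def K_closed(T, kappa, p):
    """Closed form of the ring kernel as reconstructed in eq (5)."""
    s = np.sqrt(p * kappa)
    return np.exp(-(p + kappa) * T) * (2 * p * kappa * np.cosh(s * T)
                                       + (p**3 + kappa**3) * np.sinh(s * T) / s)
```

For κ = p the expression collapses to 2κ² e^{−κT}, and for p → 0 it approaches the Gamma-shaped kernel κ³ T e^{−κT}, in line with the limits discussed above.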
Playing around with different coupling topologies and system sizes, we generally find that non-reciprocal coupling is a crucial ingredient to generate non-monotonic memory, while reciprocal couplings always yield monotonic kernels. With an appropriate coupling topology, it is also possible to generate memory kernels with multiple maxima. We observe that a kernel with n extrema can be represented via (at least) n d.o.f.
On the other hand, we observe that the correlation C ν (T) of the colored noise produced by Markovian systems with ring topology [of type (5)] is always monotonically decreasing with T (see, e.g., figure 1). This implies a broken FDR, as we will discuss below in section 3.2. For other coupling topologies, linearly and non-reciprocally coupled d.o.f. can also induce non-monotonic noise correlations. We have performed a systematic study of the connections between coupling topology, generated memory, and the resulting correlation functions in [59].

Examples
Let us now consider some exemplary systems of type (1) with non-reciprocal interactions. We start with a brief summary of models known from the literature and then introduce our new models with feedback. Figure 2 provides an overview for the case n = 1.

Figure 2. Overview of various systems describable by the generic model (1) with n = 1. Left: for unidirectional coupling (a_01 = 0), X_1 corresponds to a 'cellular sensor' in the model [21], and X_0 to a d.o.f. measured by that sensor (e.g., a ligand concentration), see equation (7) and text below. Center: for reciprocal coupling (a_01 = a_10), X_{0,1} correspond to the angles of two mechanically coupled vanes [60]. Right: for unidirectional coupling (a_10 = 0), X_0 corresponds to the position of a microswimmer within the AOUP model, while X_1 represents the self-propulsion velocity [35,37,40,55,56,61], see (6) and below. In the intermediate cases with bidirectional non-reciprocal coupling, X_1 corresponds to a feedback controller acting on a colloid at position X_0.
For reciprocal coupling, the dynamics of the two d.o.f. X_0 and X_1 corresponds to the angles of two vanes that rotate in two different heat baths at T_0 and T_1, and are coupled by a torsion spring with spring constant a_01 = a_10. At T_0 ≠ T_1, this setup was considered as a minimal model for heat conduction through mechanical motion, as discussed in [60] (see p 154). Further, for unidirectional coupling a_10 = 0, a_01 > 0, a_11 < 0, our model (1) reduces to the active Ornstein-Uhlenbeck particle (AOUP) model with translational noise [35,37,40,55,56,61], reading

γ_0Ẋ_0 = a_00 X_0 + X_1 + ξ_0,  Ẋ_1 = −X_1/τ + ξ_1/γ_1,  (6)

which corresponds to (1) with τ = γ_1/|a_11| > 0 and a_01 = 1. This is a simple (overdamped) model for active swimmers, where X_0 corresponds to the position of a microswimmer in a harmonic trap with stiffness a_00 < 0, while X_1 represents the 'self-propulsion velocity' [61] propelling X_0. In a real system, the propulsion could be created by the flagella of a bacterium, or the asymmetric flow field around a Janus colloid. In the corresponding non-Markovian representation (3), the colored noise (which is here the only type of memory, as K ≡ 0 when a_10 = 0) yields the persistence of the motion, and τ quantifies the 'persistence' of the 'active noise' [61]. Next, the super-system with reversed unidirectional coupling (i.e., a_01 = 0, a_10 ≠ 0) was recently suggested as a model for a cellular sensor [21],

γ_0Ẋ_0 = a_00 X_0 + ξ_0,  Ẋ_1 = (a_11/γ_1)[X_1 − X_0] + ξ_1/γ_1,  (7)

with a_00 < 0, a_11 < 0, which corresponds to (1) with a_10 = −a_11 > 0. Thereby, the cellular sensor is described by a one-dimensional variable X_1 (giving the state of the sensor at time t, which is, according to [21], related to the number of bound receptors). The purpose of the sensor is to measure a certain external d.o.f., X_0, which could be the concentration of some ligand [21].
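For the AOUP case, the stationary covariance follows from a Lyapunov equation, and one can verify that the propulsion noise enhances the positional fluctuations beyond the passive equilibrium value k_BT_0/|a_00|. The following sketch is our own illustration with assumed unit parameters, not a result quoted from the references.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

kB = 1.0
g0, g1 = 1.0, 1.0          # friction constants (illustrative values)
a00, tau = -1.0, 0.5       # trap parameter a_00 < 0 and persistence time tau
T0, T1 = 1.0, 1.0

# AOUP as the n = 1 super-system: a_10 = 0, a_01 = 1, a_11 = -gamma_1/tau
a = np.array([[a00, 1.0],
              [0.0, -g1 / tau]])
g = np.array([g0, g1])
M = a / g[:, None]                              # gamma^{-1} a
D = np.diag([kB * T0 / g0, kB * T1 / g1])
Sigma = solve_continuous_lyapunov(M, -2.0 * D)  # M Sigma + Sigma M^T = -2 D

passive_var = kB * T0 / abs(a00)   # equilibrium variance without propulsion
active_var = Sigma[0, 0]           # enhanced by the colored propulsion noise
```

For the parameters above, the active variance exceeds the passive one, reflecting the persistent push exerted by X_1 on X_0.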
Last, we note that the model for a cellular sensor with memory from reference [21] corresponds to the case n = 2, where X_2 represents the past state of the sensor, i.e., the memory (related to the number of phosphorylated internal proteins [21]). Then, X_0 → X_1 and X_1 → X_2 are coupled unidirectionally, and there is no direct link between X_0 and X_2. As we will show in this paper, the generic system (1) with non-reciprocal couplings also includes cases where the d.o.f. X_{j>0} can be regarded as a feedback controller continuously operating with the force F_c on a system X_0, yielding a dynamical equation of the colloid which is a special type of (3). A characteristic aspect of feedback control is the occurrence of a time delay between 'measurement' and 'control action'. In experimental setups, this delay either emerges naturally due to finite signal transmission or information processing times (e.g., think of optical feedback with the help of videomicroscopy [45, 62-65]), or may be implemented intentionally (e.g., in Pyragas control [66,67]), because it is known to induce interesting dynamical and thermodynamical behavior, such as particle oscillations [62,68,69], transport [69], or a net energy extraction from the bath [45]. The controller model with n = 1 and bidirectional non-reciprocal coupling can be interpreted as a minimal realization of such a controller. However, it yields an exponentially distributed delay with maximum at t − t' = 0. In contrast, the feedback loop often has a typical finite duration, i.e., the control action depends on X_0(t − τ) with a distinct characteristic delay time τ > 0, implying that the equation of the controlled system (here X_0) involves a memory kernel with a maximum around τ. It now becomes clear that a unidirectional ring with n = 2 can describe such a controller with preferred delay time. Specifically, setting a_10 = a_21 = −a_11 = −a_22 = γ_{1,2}/τ and a_02 = k (all other cross-couplings being zero), the projection yields a non-Markovian equation for X_0 of the form (3),

γ_0Ẋ_0(t) = a_00 X_0(t) + ∫_{−∞}^{t} K(t − t') X_0(t') dt' + ξ_0(t) + ν(t),  (8)

with the memory kernel

K(T) = (k/τ²) T e^{−T/τ},  (9)

with a pronounced maximum at τ.
The feedback force in the non-Markovian representation (8) [or generally (3)] is F_c(t) = ∫_{−∞}^{t} K(t − t') X_0(t') dt', and, in the Markovian description, the feedback force is kX_n, respectively. Note that, due to this setting, the only remaining free controller parameters are the time delay τ and the feedback gain k. To better compare the controllers with n = 1 and n = 2, we analogously set a_10 = −a_11 = γ_1/τ and a_01 = k in the case with n = 1, obtaining from (4)

K(T) = (k/τ) e^{−T/τ}.  (10)

In this paper, we focus on the cases n = 1, 2; a generalization toward higher n will be discussed in a future publication by the same authors. We note that the limit n → ∞ yields a δ-distributed memory kernel around τ [45,54], i.e., K ∝ δ(T − τ). Such stochastic delay differential equations are infinite-dimensional, which makes their treatment very involved, especially when it comes to thermodynamics [47,48,70,71]. In comparison, the model proposed here has in total three d.o.f. and is thus quite handy.
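The characteristic delay of the n = 2 controller can be made explicit numerically: the kernel of the form (k/τ²) T e^{−T/τ} peaks exactly at T = τ, and its tail integral reproduces the integrated kernel used later in the FDR discussion. This is a small self-check with assumed parameter values, not a computation from the paper.

```python
import numpy as np

def K_n2(T, k, tau):
    """n = 2 controller kernel of the form (9): K(T) = (k/tau^2) T e^{-T/tau}."""
    return (k / tau**2) * T * np.exp(-T / tau)

def Kbar_n2(T, k, tau):
    """Integrated kernel Kbar(T) = int_T^infty K(s) ds = k (1 + T/tau) e^{-T/tau}."""
    return k * (1.0 + T / tau) * np.exp(-T / tau)

tau, k = 0.5, 2.0
T = np.linspace(0.0, 10.0 * tau, 100_001)
dT = T[1] - T[0]
T_peak = T[np.argmax(K_n2(T, k, tau))]                 # kernel maximum -> delay time
mask = T >= tau
Kbar_tau_numeric = K_n2(T[mask], k, tau).sum() * dT    # tail integral, truncated at 10 tau
```

The numerically located peak coincides with τ, and the Riemann-sum tail integral agrees with the closed form up to the truncation error of the finite grid.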

Intrinsic non-equilibrium
Now we turn to the thermodynamic properties induced by the occurrence of non-reciprocal interactions, focusing on the long-time behavior t → ∞, when transient dynamics due to the initial conditions have decayed and the system has approached a steady state. We start by clarifying whether thermal equilibrium can exist despite non-reciprocity. As mentioned before, non-reciprocal interactions are non-conservative. One might therefore guess that a system with non-reciprocal interactions cannot reach thermal equilibrium. To investigate this question, we check the DB condition on the level of the Markovian representation (1). Since the latter is only meaningful when all d.o.f. have a physical interpretation, we also discuss the FDR on the level of the non-Markovian description (3).
Since we are interested in analytical solutions, we will focus on the linear case, i.e., f_0 = 0. We stress, however, that the framework is readily adaptable to cases where a nonlinear force acts on X_0, which then requires numerical solutions.

Detailed balance
To investigate whether the super-system (1) can approach thermal equilibrium, we check the DB condition. To this end, we consider the flow of the (n + 1)-point joint probability density function (pdf), ρ_{n+1}(x, t), of x = (x_0, . . ., x_n)^T. To access this quantity, we utilize the closed, multivariate Fokker-Planck equation (FPE) [54] corresponding to (1), which reads

∂_tρ_{n+1}(x, t) = −∇ · J(x, t),  J = γ^{−1}a x ρ_{n+1} − D∇ρ_{n+1},  (11)

with the probability current J and diagonal diffusion matrix D_jj = k_BT_j/γ_j. We note that J is generally divergence-free in steady states, and zero in equilibrium. Using the identity ∂_xρ = [∂_x ln(ρ)]ρ, we rewrite (11) as

J = [γ^{−1}a x − D∇ ln ρ_{n+1}] ρ_{n+1} =: v ρ_{n+1},  (12)

which defines the local mean velocity v, connected to the probability current by J = vρ_{n+1}. DB means that all probability currents vanish, hence, v_j = 0 ∀ j. From (12), we obtain the condition D^{−1}γ^{−1}a x = ∇ ln ρ_{n+1}, which implies that the vector D^{−1}γ^{−1}a x is the gradient of a scalar function. This, in turn, is true if and only if ∇ × (D^{−1}γ^{−1}a x) = 0. Noting that γ and D are diagonal, this brings us to

a_ij/T_i = a_ji/T_j,  (13)

for all pairwise coupling constants between every two mutually coupled sub-systems. We stress that this condition is irrespective of the coupling topology, or system size. Remarkably, (13) shows that non-reciprocal systems that fulfill DB do exist, as long as a_ija_ji > 0 (the bath temperatures must then satisfy T_i/T_j = a_ij/a_ji). However, unidirectional super-systems are by construction pure nonequilibrium models, including the (AOUP) microswimmer, or the controller with non-monotonic memory (n = 2), see equation (9). Condition (13) further implies that non-reciprocal systems can reach equilibrium despite T_i ≠ T_j. Below, we will show that also the total entropy production vanishes at this point, as well as the heat and information flows [see equations (23), (24) and (37)]. This is in sharp contrast to reciprocally coupled (or 'passive') systems, which generally never equilibrate when being simultaneously coupled to heat baths of different temperatures. This has been shown, e.g., in [73,74].
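The DB condition can be probed numerically for a linear system: the local mean velocity of the stationary Gaussian state is v(x) = (γ^{−1}a + DΣ^{−1})x, with Σ the stationary covariance, and DB holds iff the matrix in brackets vanishes. The following check is our own illustration with assumed parameter values.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def local_mean_velocity_matrix(a, gammas, temps, kB=1.0):
    """Matrix V of the steady-state local mean velocity v(x) = V x of the
    linear FPE; detailed balance (zero probability current) holds iff V = 0."""
    g = np.asarray(gammas, dtype=float)
    M = a / g[:, None]                               # gamma^{-1} a
    D = np.diag(kB * np.asarray(temps, float) / g)
    Sigma = solve_continuous_lyapunov(M, -2.0 * D)   # stationary covariance
    return M + D @ np.linalg.inv(Sigma)

# non-reciprocal pair with a_01/a_10 = 2: condition (13) fixes T_0/T_1 = 2
a = np.array([[-1.0, 1.0],
              [0.5, -1.0]])
V_db = local_mean_velocity_matrix(a, [1.0, 1.0], temps=[2.0, 1.0])
V_neq = local_mean_velocity_matrix(a, [1.0, 1.0], temps=[1.0, 1.0])
```

With the temperature ratio tuned to the coupling ratio the current vanishes despite a_01 ≠ a_10, while the same couplings at equal temperatures produce a nonzero circulating current.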
We, however, do not think that our results contradict the previous findings, which exclusively refer to reciprocally coupled systems, like mechanical ones. The non-reciprocal coupling considered in this paper does not correspond to a mechanical coupling, and is typically only realizable with the help of some external apparatus acting on the system (for an example of non-reciprocal coupling realized by light, see [24]).

Fluctuation-dissipation relation
Let us now turn to the corresponding non-Markovian process (3) in x_0-space, which is more appropriate for models where X_{j>0} have no direct physical interpretation, or if a marginal observer only sees X_0. On this level of description, the definition of a probability current is less clear, as there is, in general, no corresponding closed FPE [54]. However, from the non-Markovian LE (3) [at f_0 = 0] alone, we can immediately deduce that the probability current in this marginalized space must vanish by a simple symmetry argument: on an ensemble-averaged level, the non-Markovian system has no preferred direction. In other words, the ensemble average of equation (3) is completely symmetric w.r.t. a coordinate inversion x_0 → −x_0. Consequently, the probability current cannot have any direction. Thus, naively repeating the analysis from section 3.1, the system would always appear to be in equilibrium. This is, however, not true, as we see by instead considering the FDR [75], which describes a balance between the friction kernel γ and the thermal noise μ,

⟨μ(t)μ(t')⟩ = k_B T_0 γ(|t − t'|).  (14)

As is well known for, e.g., viscoelastic fluids, the validity of an FDR would imply that the system equilibrates in the absence of external driving [4,75].
To check (14) for the present model, we rewrite (3) in the form of a generalized LE by converting the time-integral with K in (3) via partial integration into a friction-like integral that involves the 'velocity' Ẋ_0 and the friction kernel γ(|t − s|). This yields

γ_0Ẋ_0(t) = [a_00 + K̄(0)] X_0(t) − ∫_{−∞}^{t} K̄(t − s) Ẋ_0(s) ds + μ(t),  (15)

involving the noise μ(t) = ξ_0(t) + ν(t), the integrated kernel K̄(T) = ∫_T^∞ K(s) ds, and the friction kernel γ(|t − s|) = 2γ_0δ(t − s) + K̄(|t − s|). For the case n = 1, the integrated kernel reads K̄(T) = (a_01a_10/|a_11|)e^{a_11T/γ_1}. It can easily be verified (using (4) for the noise correlations) that the FDR holds only if

a_01 T_1 = a_10 T_0,  (16)

which agrees with the DB condition (13). Thus, the non-Markovian process is out of equilibrium unless (16) holds. This is, for example, never the case for the active microswimmer (where a_10 = 0). For our n = 2 controller (9), we find K̄(T) = k(1 + T/τ)e^{−T/τ} and C_ν(T) = [k²k_BT_1/(2γ_1)](3τ + T)e^{−T/τ}, and the FDR thus amounts to

[k²T_1/(2γ_1)](3τ + T) = (kT_0/τ)(τ + T)  ∀ T ≥ 0.  (17)

There is no pair of k, τ that simultaneously obeys 2γ_1T_0 = 3τkT_1 and 2γ_1T_0 = τkT_1 [the constant and linear parts of (17)], which would be necessary to fulfill the FDR. Thus, in this case, the FDR (and DB) are never fulfilled (except for the trivial cases, where k or τ vanish, or tend toward ∞). For other coupling schemes and n > 1, we observe that a non-reciprocal system may fulfill the FDR, but violate DB. We present a detailed investigation, which is beyond the scope of this paper, in [59].
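For n = 1, the FDR check reduces to comparing the colored-noise correlation against k_BT_0 times the integrated kernel; both are single exponentials, so the relation holds iff a_01T_1 = a_10T_0. A quick numerical illustration (assuming the exponential forms of the two-variable case, with illustrative parameter values):

```python
import numpy as np

kB, g1 = 1.0, 1.0
a01, a10, a11 = 1.0, 2.0, -1.5   # illustrative non-reciprocal couplings

def C_nu(T, T1):
    """Colored-noise correlation of the projected n = 1 system, cf. (4)."""
    return (a01**2 * kB * T1 / abs(a11)) * np.exp(a11 * T / g1)

def Kbar(T):
    """Integrated kernel of the n = 1 system: (a01 a10/|a11|) e^{a11 T/g1}."""
    return (a01 * a10 / abs(a11)) * np.exp(a11 * T / g1)

T = np.linspace(0.0, 5.0, 50)
T1 = 1.0
T0_match = a01 * T1 / a10        # FDR condition a01 T1 = a10 T0 solved for T0
fdr_holds = np.allclose(C_nu(T, T1), kB * T0_match * Kbar(T))
fdr_fails = not np.allclose(C_nu(T, T1), kB * 1.0 * Kbar(T))  # generic T0
```

Tuning T_0 to the non-reciprocity ratio restores the FDR, while a generic temperature choice breaks it, mirroring the DB discussion of section 3.1.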
In this section, we have seen that non-reciprocity implies an intriguing property of the corresponding non-Markovian stochastic process: the existence of NESSs with zero probability currents. This, in turn, also implies the absence of global particle transport despite the intrinsic nonequilibrium. Such states have been considered, e.g., in [76]. They commonly occur in active systems [77][78][79][80], but can also be found in feedback-controlled systems [45]. The reason is that, in both cases, the 'driving' occurs directly on the level of the stochastic trajectories, yielding, e.g., persistence, but it does not come in the form of a global gradient, i.e., there is no global symmetry breaking (using the language of control theory, one might say that the driving is in a 'closed-loop' form [62,69,81]). In particular, in the present case, the driving is hidden in the coupling forces. To further investigate this, we will next reconsider the system from an energetic perspective.

Energy & entropy
To further unravel the nature of the intrinsic non-equilibrium, we consider the energy flows. Sekimoto's framework [60] tells us that the fluctuating heat exchange between each X_j and its heat bath along a stochastic trajectory of length dt is given by

δq_j = (γ_jẊ_j − ξ_j) ∘ dX_j,  (18)

yielding for the entire super-system a total dissipated energy of δq = Σ_{j=0}^{n} δq_j. Here, ∘ indicates Stratonovich calculus. Note that we employ the sign convention that a positive heat flow corresponds to energy flowing from the particle to the heat bath, different from [60]. Using the LE (1), we can write the ensemble average of the heat rate, denoting Q̇ = ⟨δq⟩/dt, Q̇_j = ⟨δq_j⟩/dt, as

Q̇_j = Σ_{i=0}^{n} a_ji ⟨X_i ∘ Ẋ_j⟩  (19)

(recall that f_0 = 0). Now we utilize the steady-state identity ⟨X_kẊ_l⟩ = −⟨X_lẊ_k⟩ ∀ k, l, which readily follows from the fact that the correlations ⟨X_k(t)X_l(t)⟩ are time-independent and thus (d/dt)⟨X_k(t)X_l(t)⟩ = ⟨Ẋ_k(t)X_l(t)⟩ + ⟨X_k(t)Ẋ_l(t)⟩ = 0. Using these identities, we immediately obtain from (19)

Q̇ = Σ_{i<j} (a_ji − a_ij) ⟨X_iẊ_j⟩.  (20)

Accordingly, if all couplings are reciprocal, the rate of total dissipated energy Q̇ is zero, as expected. Equation (20) further reveals that, in contrast, a non-reciprocal interaction a_ij ≠ a_ji leads to a net dissipation. Let us discuss this in more depth. First, we realize that Q̇ is nonnegative, as follows from the connection to the total entropy production rate (EP) [82]

Ṡ_tot = Σ_{j=0}^{n} Q̇_j/T_j + Ṡ_sh ≥ 0,  (21)

with S_sh being the ensemble average of the fluctuating multivariate (joint) Shannon entropy −k_B ln ρ_{n+1}, and Ṡ_sh ≡ 0 in steady states. Noteworthy, (21) describes the actual total thermodynamic EP only when all d.o.f. have a physical interpretation. In other cases its meaning is debatable. However, in any case, the second law Ṡ_tot ≥ 0 holds (as formally shown below in (32)), where Ṡ_tot = 0 in thermal equilibrium. Second, according to the first law of thermodynamics, δq = δw + du, the net dissipation associated with each non-reciprocal interaction (20) must result from work δw applied to the system, while the internal energy is conserved in steady states, du = 0.
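For a linear system, both the total heat rate and the per-subsystem heat rates follow from the stationary covariance, so the bookkeeping above can be verified numerically. The following sketch (our own check, with an assumed, illustrative 3-d.o.f. coupling matrix) confirms that the antisymmetric-coupling expression for Q̇ equals the sum of the individual Q̇_l, and that it is positive for a non-reciprocal, isothermal system.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

kB = 1.0
gammas = np.array([1.0, 2.0, 1.5])
temps = np.array([1.0, 1.0, 1.0])     # isothermal, so Qdot = T * Sdot_tot >= 0
# stable (diagonally dominant), non-reciprocal 3-d.o.f. coupling matrix
a = np.array([[-2.0, 0.5, 1.0],
              [1.5, -2.0, 0.3],
              [0.2, 0.8, -2.0]])
M = a / gammas[:, None]
D = np.diag(kB * temps / gammas)
Sigma = solve_continuous_lyapunov(M, -2.0 * D)

# cross correlations <X_i Xdot_j> = (a Sigma)_{ji} / gamma_j for i != j
C = (a @ Sigma) / gammas[:, None]

# total heat rate from the antisymmetric part of the couplings (eq (20) form)
Qdot_total = sum((a[j, i] - a[i, j]) * C[j, i]
                 for i in range(3) for j in range(i + 1, 3))

# per-subsystem heat rates: Qdot_l = sum_{j != l, i} a_lj a_li <X_j X_i> / gamma_l
Qdot_sub = [sum(a[l, j] * a[l, i] * Sigma[j, i] / gammas[l]
                for j in range(3) if j != l for i in range(3))
            for l in range(3)]
```

The agreement of the two routes rests on the steady-state antisymmetry of the cross correlations, which the Lyapunov solution satisfies exactly.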
In other words, the total entropy production is due to a positive energy input at rate Ẇ = Q̇ ≥ 0 (20) into the system. Where does this energy come from? Because fundamental physical interactions are generally reciprocal, some (external) mechanism is necessary to realize a non-reciprocal coupling; this mechanism is not explicitly modeled here but 'hidden' in the equations within the non-reciprocity. The positive energy input Ẇ = Q̇ ≥ 0 (20) gives the minimal energy needed (by this mechanism) to sustain the non-reciprocal coupling. We also note that a positive energy input on the level of the fluctuating trajectories is considered a defining property of active systems [30-32, 83, 84]. As we see here, it can be introduced in the form of a non-reciprocal interaction.
Next, we take a closer look at the individual heat flow between X_0 and its bath. We focus on Q̇_0, as it is a characteristic thermodynamic quantity, and it is independent of whether all d.o.f. have a clear physical interpretation or not, and independent of the employed description (Markovian or non-Markovian). To calculate the steady-state ensemble average, we again utilize ⟨X_jẊ_j⟩ = 0 and ⟨X_lξ_j⟩ = 0 for all j ≠ l, and therewith find from (18)

Q̇_0 = Σ_{j≠0} Σ_{i=0}^{n} (a_0ja_0i/γ_0) ⟨X_jX_i⟩.  (22)

Likewise, one can calculate the heat flows of the other d.o.f., Q̇_l = Σ_{i=0}^{n} Σ_{j≠l} (a_lja_li/γ_l) ⟨X_jX_i⟩. It should be noted that by writing down this expression for the dissipation of X_{j>0} and the total EP (21), we implicitly assume that all X_j are even under time-reversal, that means, position-like variables. In contrast, odd variables would not contribute to the total EP, see [35]. We note that for active matter models the parity of X_{j>0} is in fact a nontrivial aspect, and subject of an ongoing debate, see e.g., [35,37,85,86].
Together with the correlations ⟨X_i X_j⟩ that are derived in appendix C, equations (21) and (22) represent analytical expressions for heat flow and entropy production for any n. For example, in the case n = 1 (which was also discussed in [57]), where the expressions simplify significantly, we find from (22) and (21) the explicit results (23) and (24). From (23) one immediately sees that the EP vanishes if, and only if, DB (13) and the FDR (16) are fulfilled (as one would expect). Thus, all three notions of equilibrium are consistent. Further, if (13) is fulfilled, the heat flow vanishes as well. As we show in section 7, this also holds for the corresponding underdamped model, see equation (50). Thus, we have now convinced ourselves that non-reciprocal systems which are simultaneously coupled to baths at different temperatures can indeed reach states of thermal equilibrium, if (13) holds.
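This can be illustrated numerically (a sketch with hypothetical parameters, using the same Lyapunov-equation route to the steady state as above, with γ_j = k_B = 1): for a_10 T_0 = a_01 T_1, i.e., condition (13), both heat flows vanish although a_01 ≠ a_10 and T_0 ≠ T_1, while the isothermal system with the same couplings dissipates:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def heat_rates(a, T, kB=1.0):
    # overdamped LE with gamma_j = 1:  dX_j/dt = sum_i a_ji X_i + xi_j
    A = np.asarray(a, float)
    D = np.diag(kB * np.asarray(T, float))
    Sigma = solve_continuous_lyapunov(A, -2.0 * D)
    M = A @ Sigma + D                       # <Xdot_j o X_i>
    return np.einsum('ji,ji->j', A, M)      # heat rates into the baths

# a_10 T_0 = 1 * 2 = a_01 T_1 = 2 * 1: condition (13) holds -> equilibrium
Q_eq = heat_rates([[-3.0, 2.0], [1.0, -3.0]], T=[2.0, 1.0])
# same couplings, isothermal: (13) broken -> nonzero heat flows and EP
Q_neq = heat_rates([[-3.0, 2.0], [1.0, -3.0]], T=[1.0, 1.0])
EP = (Q_neq / np.array([1.0, 1.0])).sum()   # total EP rate, kB = 1
```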
Let us now take a closer look at the heat flow (24) for different coupling schemes, shown in figure 3 for T_0 = T_1, a_00 = a_11 and n = 1. Note that these isothermal conditions allow us to better isolate the effect of non-reciprocity and, at the same time, are most realistic in regard to experimental realizations. For example, this could represent a system of two colloidal particles trapped in harmonic potentials of stiffness |a_00| = |a_11| and coupled with each other with the help of an external setup similar to [22,24]. When the system is reciprocally coupled (along the dotted diagonal), it equilibrates and the heat flow nullifies. Then, the EP (23) is zero as well. The heat flow also vanishes in the trivial case a_01 = 0, i.e., when X_0 does not 'see' X_1 (dashed horizontal line), as is the case when X_1 corresponds to a sensor [21]. As one would expect, being measured does not bring X_0 out of equilibrium. If the unidirectional coupling is reversed (a_10 = 0), the heat flow is strictly nonnegative (dashed vertical line). This fits the idea that X_0 is an active swimmer: the swimmer eventually heats up the surrounding fluid, but never has a net cooling effect. Remarkably, for cases with bidirectional non-reciprocal coupling, we observe that Q̇_0 can also become negative. We will discuss this in more depth in the next section.

Conditions for reversed, i.e., negative heat flow
When Q̇_0 is negative (as in the blue regions of figure 3), heat is constantly flowing out of the bath (on average). This happens due to the coupling with another (or multiple) subsystem(s) X_1, although the other subsystem is not colder, which would be a trivial case of heat extraction. Let us take a moment and think about the meaning of this observation. We remind the reader that a steady-state heat flow induced by a non-conservative external force (e.g., a constant, a time-dependent, or a space-dependent driving force) acting on a passive, Markovian system is strictly nonnegative, as dictated by the second law, Q̇_0/T_0 = Ṡ_tot ≥ 0. Loosely speaking: 'stirring a particle from outside will eventually heat up the environment.' Here we find that, in contrast, the non-conservative force a_01 X_1, or F_c in the notation (8), can induce a negative, i.e., reversed heat flow, Q̇_0 < 0. Thus, F_c can be viewed as an external force which stirs the particle in a clever way, and thereby cools down the particle's environment. (The underlying reason is the usage of extracted information, see section 5.) The negative sign of Q̇_0 implies a steady extraction of energy from the bath, which is converted into work Ẇ_0, i.e., a (potentially useful) form of energy. It is, of course, well-known that such an energy extraction can be realized by 'Maxwell-demon'-type devices [87,88]. Here we see that the non-reciprocally coupled d.o.f. represents a minimal, time-continuous version of such a device, where the control action is automatically encoded in the non-reciprocity of the coupling. Note that the total EP, which is proportional to the sum over both heat flows, Q̇_0 + Q̇_1, is strictly positive also in this case, i.e., the isothermal 'Maxwell demon' X_1 must heat up its own environment.
[Figure 3 caption: the super-system (4) at n = 1. Along the diagonal a_01 = a_10, the super-system is reciprocally coupled, and X_{0,1} may model the angles of vanes coupled by a spring [60]. For unidirectional coupling a_01 = 0, X_1 corresponds to a cellular sensor (7).]
[Figure 4 caption: (13) holds at k = 1/τ (dotted lines), i.e., the system is in equilibrium. A corresponding thin line is added in the n = 2 plot of the information flow, serving as a guide to the eye. Trivially, in the uncoupled case (k = 0), the subsystem X_0 is in equilibrium as well (for arbitrary n). Note that the (n = 1)-controller with τ = 1 corresponds to the system in figure 3 along the (dashed) line a_10 = 1 with a_01 = k.]
To find out under which conditions the negative heat flow occurs for n = 1 and 2 [with the parameter setting from (9), (10)], we vary the two important parameters, the feedback gain k and delay time τ. Figure 4 reveals that the heat flow Q̇_0 is qualitatively and quantitatively similar for n = 1 and 2. The similarity of the two cases is indeed striking, given the differences between both systems. In particular, we here compare systems with a monotonic memory kernel K(t − t′) vs a non-monotonic K(t − t′) which nullifies at t′ = t (for n = 2). At n = 1, the feedback force k ∫ K(t − t′) X_0(t′) dt′ mostly depends on the instantaneous position X_0(t), while at n = 2 it is independent of the latter, and mostly depends on t − τ. Further, in regard to the Markovian super-system, there is a direct coupling from X_1 to X_0 in the case n = 1, while this coupling is only indirect (via a third sub-system) in the case n = 2. Nevertheless, the (blue) area of reversed heat flow lies in the same region of the (τ, k)-plane and is of similar size. Also, in both cases, it only occurs if k > 0.
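Returning to the coupling-matrix picture of figure 3, a concrete instance of this demon-like regime can be sketched numerically (our own illustration with hypothetical parameters, γ = k_B = T_0 = T_1 = 1): for |a_10| > |a_01| one can find couplings for which Q̇_0 < 0 while the total dissipation Q̇_0 + Q̇_1 remains positive, i.e., the demon heats its own bath.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# isothermal pair, gamma = kB = T = 1; non-reciprocal with |a_10| > |a_01|
a = np.array([[-2.0, 0.5], [2.0, -2.0]])
D = np.eye(2)
Sigma = solve_continuous_lyapunov(a, -2.0 * D)   # steady-state covariance
M = a @ Sigma + D                                # <Xdot_j o X_i> (Stratonovich)
Qdot = np.einsum('ji,ji->j', a, M)               # Qdot = (Qdot_0, Qdot_1)
# Qdot ~ (-0.1875, 0.75): X_0 extracts heat, the total dissipation is positive
```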
In the context of control theory, it is common to characterize feedback loops as positive or negative feedback, according to whether a small perturbation (from the desired state) is enhanced or reduced by the feedback [62]. In the present case, k < 0 corresponds to positive feedback, while k > 0 is negative feedback; see appendix D for an explanation of the terminology and an illustration. Thus, in both models, only negative feedback may induce a negative heat flow.
Besides the trivial case k = 0, there is, for both n, a second line in figure 4 along which the heat flow vanishes. For n = 1, this line corresponds to parameters where DB and the FDR (13) are fulfilled (dashed line), i.e., the system is in equilibrium. For n = 2, DB and the FDR are generally broken for all (τ, k). This second line hence reveals another interesting property of non-Markovian systems: they may be out of equilibrium without exhibiting dissipation (zero heat flow), in sharp contrast to reciprocal systems. In our system, such a state is found for n > 1 and non-reciprocal coupling only. From the viewpoint of the non-Markovian process X_0, this is indeed a bit puzzling. If X_0 is in a true nonequilibrium steady state, there must be an associated entropy production. However, the zero heat flow indicates zero medium entropy production. Thus, where does the entropy go? To answer this question, we shall consider the entropy balance of the individual subsystem X_0, as we will do in the next section.
We note that an NESS with zero heat flow and regimes of negative heat flow may also occur in systems with δ-distributed memory, which are, moreover, nonlinear, as we have reported in [45]. Further, such states may also occur in Markovian (reciprocal) systems with non-Gaussian noise. As was shown in [89], the presence of nonlinear forces is then a necessary condition, different from the reversed heat flow induced by non-reciprocal coupling or time-delayed feedback.

Information
Now we turn to an information-theoretical investigation of non-reciprocal coupling. The motivation for this is twofold. First, it will help us better understand the previous observations, for example: why is heat extraction only possible for negative feedback (see figure 4), and only if |a_10| > |a_01| (figure 3)? So far, these conditions appear arbitrary. Second, by also considering information flows, we will be able to describe the entropy balance of an individual subsystem, whereas, so far, we have studied entropic properties of the entire super-system only. This is especially important in situations where only one part of the system is observable (or has a direct physical interpretation).
It has been established in previous literature [21,90,91] that the entropic impact of the information exchanged between two coupled subsystems (say X_0 and X_1) is captured by the information flows between them. This quantity is closely connected to the mutual information [40,91], which describes the total amount of information exchange in the entire super-system but, in contrast to the information flows, is not directed. While information flows are commonly used to investigate discrete systems [87, 91-93], this quantity is less established for time- and space-continuous systems (with time-continuous feedback) [92]. First steps in this direction have been undertaken in [21,91] and in [90] (where the reciprocally coupled n = 1 case was studied). It should be noted that there are various other notions of information flows and information exchanges, which are more appropriate in other contexts, see [94] for an educational overview.
However, the previously developed framework based on information flows, and the definition of the mutual information itself, are only applicable for situations where two subsystems exchange information (n = 1). Here we will generalize this framework to arbitrary system sizes and topologies.
We start by considering the total temporal derivative of the Shannon entropy (21), i.e., equation (25). In steady states, the first term naturally vanishes. To calculate the ensemble average of the second term, we use ⟨Ẋ_j A(x, t)⟩ = ∫ J_j(x, t) A(x, t) dx [82,95], with the probability currents J_j. We consider natural boundary conditions lim_{x→±∞} ρ(x) = 0, and denote improper integrals lim_{r→∞} ∫_{−r}^{r} simply as ∫. With these tools, we find the ensemble average of each summand of (25), given in (26), where we have introduced the multivariate information flow İ_{→j} to X_j. We note that when applied to the case n = 1, the İ_{→j} defined here reduces to the information flow from [90,91], with the sign convention as in [90]. From (26), one can see that in equilibrium all individual information flows are necessarily zero.
To further proceed, we utilize the closed, multivariate FPE (11), and find (27), where we have introduced the change of the Shannon entropy of the marginal pdf ρ_1(x_j). In sum, we have shown that Ṡ_sh decomposes into the marginal entropy changes and the information flows. Let us now consider the multivariate information flows defined in (26) in more detail. For the case n = 1, it has been shown that İ_{→0} + İ_{→1} = İ [90], i.e., the individual information flows sum up to the temporal derivative of the mutual information. As we show in appendix F, for systems with multiple subsystems (n ≥ 1), the individual information flows sum up to the multivariate generalization of the mutual information, I = k_B ∫ ρ(x) ln[ρ(x)/Π_j ρ_1(x_j)] dx (30). For n = 1, this reduces to the usual mutual information. Just like the latter, we have, on the one hand, Σ_{j=0}^n İ_{→j} = İ, and, on the other hand, İ = 0 in steady states (because the pdfs are time-independent). (We note that there is not a unique way of generalizing the mutual information to systems with more than two subsystems, see [96-99].)
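For the linear models considered here, this multivariate mutual information has a closed Gaussian form, I = (k_B/2) ln[Π_j Σ_jj / det Σ], the standard total-correlation identity for zero-mean Gaussians (stated here as our own illustration, with a hypothetical three-variable coupling matrix):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def multi_information(Sigma, kB=1.0):
    """Multivariate (total-correlation) mutual information of a zero-mean Gaussian."""
    Sigma = np.asarray(Sigma, float)
    return 0.5 * kB * np.log(np.prod(np.diag(Sigma)) / np.linalg.det(Sigma))

# hypothetical stable, non-reciprocal n = 2 system (gamma_j = T_j = kB = 1)
a = np.array([[-2.0, 0.5, 0.0],
              [1.0, -2.0, 0.3],
              [0.0, 1.0, -2.0]])
Sigma = solve_continuous_lyapunov(a, -2.0 * np.eye(3))
I_multi = multi_information(Sigma)   # nonnegative; constant in the steady state
```

Since the steady-state covariance is time-independent, İ = 0 here by construction, in line with the statement above.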
Since İ = 0, the information flows among all d.o.f. X_j in total cancel each other out (thus, from an information-theoretical point of view, the super-system as a whole is 'closed'). However, they constitute an important contribution to the entropy balance when an individual subsystem is considered.
To see this, we reconsider the summands of (25), and rewrite them using the FPE (11) as (31). Combining (25), (26), and (31), we obtain the entropy balance of each subsystem, Ṡ_tot,j = Ṡ_sh,j + Q̇_j/T_j − İ_{→j} ≥ 0 (32); summing (32) over all j, equation (33), recovers the mean total EP (21). Further, equation (32) may be seen as a generalized second law for each d.o.f., giving the entropy balance of an individual sub-system. In steady states, where Ṡ_sh,j = 0, it reduces to Q̇_j/T_j ≥ İ_{→j} (34), consistent with [90,91]. Equation (34) states that a negative steady heat flow, Q̇_0 < 0, is only possible if İ_{→0} < 0, i.e., information is flowing from X_0 to the rest of the system. The more information about X_0 is gathered by the other X_{j>0} (the controller d.o.f.), the more heat can be extracted from the bath. Figure 5 shows (for n = 1) the information and heat flows, as well as the total EP, which are all connected via (32) and (33). It also illustrates that, in the reciprocal and isothermal case, there is no 'entropic cost' (zero EP), but, at the same time, no net information extraction is achieved, nor is a heat flow induced.
[Figure 5 caption: information (38) and heat (24) flows vs a_01/a_10, for n = 1. The heat flow (solid red and gray lines) of each d.o.f. is bounded from below by the information flow (dashed lines), as predicted by (34). The total EP is given by the sum Σ_{j=0,1} [Q̇_j − İ_{→j}]. The plots pertain to a_11 = a_00 = −1, and all other parameters and k_B are set to unity.]
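For the Gaussian steady states at hand, both sides of (34) can be evaluated explicitly, because the probability current of the FPE is linear in x. The sketch below is our own construction with hypothetical parameters (γ_j = k_B = 1) and uses the sign convention of the text (İ_{→0} < 0 means information flows out of X_0); it checks that the individual information flows sum to zero and that Q̇_j/T_j − İ_{→j} ≥ 0 holds for each j:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def heat_and_info_flows(a, T, kB=1.0):
    """Qdot_j and Idot_{->j} for dX/dt = a X + xi (gamma_j = 1), Gaussian NESS."""
    A = np.asarray(a, float)
    Dv = kB * np.asarray(T, float)                  # noise intensities
    Sigma = solve_continuous_lyapunov(A, -2.0 * np.diag(Dv))
    P = np.linalg.inv(Sigma)                        # precision matrix
    M = A @ Sigma + np.diag(Dv)                     # <Xdot_j o X_i>
    Qdot = np.einsum('ji,ji->j', A, M)
    Idot = np.empty(len(Dv))
    for j in range(len(Dv)):
        r1 = A[j] + Dv[j] * P[j]                    # J_j / rho = (r1 . x)
        r2 = -P[j].copy()
        r2[j] += 1.0 / Sigma[j, j]                  # d_j ln(rho / rho_j) = (r2 . x)
        Idot[j] = r1 @ Sigma @ r2                   # Idot_{->j} = <(r1.x)(r2.x)>
    return Qdot, Idot

# demon-like isothermal example: heat extraction accompanied by Idot_{->0} < 0
Qdot, Idot = heat_and_info_flows([[-2.0, 0.5], [2.0, -2.0]], T=[1.0, 1.0])
```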
Due to the linearity of the model, we can calculate the information flows analytically. The steady-state pdfs are multivariate Gaussians with zero mean and with the covariance matrix (Σ)_ij = ⟨X_i X_j⟩, which is given in appendix C. To derive explicit expressions for the steady-state information flows, it turns out to be most convenient to start with (26). Using the general property of normal distributions (35), inserting the LE (1), and utilizing 2⟨X_l Ẋ_l⟩ = d⟨X_l²⟩/dt = 0 and ⟨X_l ξ_j⟩ = 0 for j ≠ l, we obtain the general formula (36). Equation (36) represents, in combination with (C3), an analytic expression for the steady-state information flow to any sub-system in (super-)systems of arbitrary size.

A single non-reciprocal interaction, n = 1
We are now in the position to clarify the information-thermodynamic implications of non-reciprocal coupling. We start with n = 1, where we find from (36) in combination with (C5) the explicit expression (37), whose numerator is proportional to [a_01 T_1 − a_10 T_0][a_00 a_01/T_0 + a_11 a_10 γ_0/(T_1 γ_1)] and whose denominator is a positive expression quadratic in the couplings. Equation (37) explicitly shows that the information flow vanishes in thermal equilibrium when DB holds, T_i a_ji = T_j a_ij, as already follows from its definition (26). Furthermore, it trivially vanishes if the cross-correlations nullify. If a_01 ≠ 0, the information flow can be expressed in terms of the heat flow, equation (38), revealing that the information flow out of and into X_0 necessarily nullifies if the heat flow is zero (provided a_01 ≠ 0). The information flow is shown in figure 3 together with the heat flow. Along the unidirectional coupling axis a_01 = 0, there is net information flow from X_0 to X_1, but no net work applied to X_0 (Q̇_0 = Ẇ_0 = 0). Thus, it is indeed sensible to consider X_1 a 'sensor' and the coupling a 'sensing interaction'. If the unidirectional coupling is reversed (a_10 = 0), the heat flow is always positive, Q̇_0 > 0, i.e., an active swimmer eventually heats up its surrounding. In this case, there is as well a nonzero information flow, which is directed from the source of propulsion (e.g., the flagella) to the particle. This is also reasonable, as the propulsion force 'carries' information: one could, on average, reconstruct the position of the flagella by only monitoring X_0.
For non-reciprocal, bidirectional coupling, the information flow can be positive or negative, depending on whether the 'sensing' or the 'active force' is stronger. It seems intuitive to consider X_0 a feedback-controlled system only if the net information flow out of X_0 is positive, i.e., the controller 'knows' more about X_0 than vice versa. According to this definition, the control regime is given if |a_10| > |a_01| (blue regions in the middle panel of figure 3). This is exactly the regime where we have detected the negative heat flow, i.e., here the controller may extract energy from a single heat bath (under isothermal conditions). Note that this observation is consistent with the generalized second law (34), which does not predict, but allows for, a negative heat flow in this very regime only.
Interestingly, we find that another intriguing phenomenon may occur (only) when the information flow is negative, namely, the suppression of thermal fluctuations. The latter can be measured by a reduced second moment ⟨X_0²⟩ < ⟨X_0²⟩_{a_01=0}, which we have displayed in figure 3 (right panel). In the blue areas, the second moment is reduced; thus, the feedback has the same effect as stiffening the trap. This resembles the situation in a recent experiment involving colloids in an optical trap [64], where time-delayed feedback was used to effectively stiffen a trap. Thermal fluctuation suppression can further be viewed as 'isothermal compression' of a single-molecule gas, which represents, for example, an important step in the cycle of a (colloidal) heat engine [100,101]. It also implies noise reduction, which is desired in various experimental setups, and is indeed one of the main applications of feedback control [102-104]. Interestingly, by only varying a_10 (which does not explicitly appear in the equation for X_0), one can switch between fluctuation enhancement (isothermal expansion) and fluctuation suppression (isothermal compression). The suppression of thermal fluctuations is limited to the area where one direction of the coupling is attractive (a_ij < 0) while the reverse direction is repulsive (a_ij > 0). We find it quite remarkable that whenever İ_{→0} < 0, such that X_1 can be viewed as a controller, it either yields a suppression of the fluctuations of X_0 (reduction of Shannon entropy), or a heat flow from the bath to X_0 (reduction of medium entropy).
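Fluctuation suppression can be read off directly from the steady-state covariance. A minimal sketch (our own hypothetical parameters, γ = T = k_B = 1): with opposite-sign couplings and |a_10| > |a_01|, ⟨X_0²⟩ drops below its uncoupled value T_0/|a_00|, whereas a same-sign choice enhances it.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def x0_variance(a):
    # steady-state <X_0^2> for dX/dt = a X + xi, gamma = T = kB = 1
    a = np.asarray(a, float)
    Sigma = solve_continuous_lyapunov(a, -2.0 * np.eye(2))
    return Sigma[0, 0]

# opposite-sign coupling, |a_10| > |a_01|: suppression below T_0/|a_00| = 1
var_sup = x0_variance([[-1.0, 0.5], [-2.0, -1.0]])   # = 13/16 < 1
# same-sign coupling: enhancement above T_0/|a_00| = 0.5
var_enh = x0_variance([[-2.0, 0.5], [2.0, -2.0]])    # = 29/48 > 0.5
```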
Lastly, we detect a further counter-intuitive property appearing exclusively in non-reciprocal super-systems: there are NESS where all information flows nullify (note that İ_{→1} = −İ_{→0} for n = 1). Thus, the subsystems may be driven out of equilibrium just due to their interaction (as signaled by finite dissipation), but without exchanging any information with each other.

Two non-reciprocal interactions, n = 2
For higher n, the explicit expressions for the information flow are quite cumbersome; for n = 2, the result is given in (39). Again, equation (39) reflects that the existence of a nonzero information flow necessarily implies that the d.o.f. are cross-correlated among each other. However, different from the case n = 1, there is no proportionality between heat and information flow. Instead, we find that for n > 1, the relationship between these quantities becomes more complicated. To better understand their relationship, let us consider the cases n = 1, 2 again with the parameter setting from (9) and (10), shown in figure 4. Remarkably, despite the different nature of the super-system and the different type of memory, the information flow maps look almost identical for n = 1 and 2. This indicates that the information flow is almost exclusively determined by the direct coupling (here from X_0 to X_1), which is, in principle, the same in both cases [given by the force −(1/τ)X_1]. Thus, different from the energy flows, the information exchange is not affected by the additional indirect coupling through a third d.o.f. in the case n = 2. Furthermore, we again find that the areas of negative heat flow (blue region in figure 4(a)) appear in the control regime of İ_{→0} < 0 (blue region in figure 4(b)). We note that, as in the case n = 1, the regime of thermal fluctuation suppression (not shown here) is limited to the area of negative information flow, i.e., to the control regime.
Apart from these similarities, we observe a phenomenon which only occurs for n > 1 and non-reciprocal coupling, that is, the existence of NESS where X_0 is out of equilibrium with broken FDR and İ_{→0} < 0, but Q̇_0 = 0. Considering the entropy balance (32), the entropy produced in X_0 due to the non-reciprocal coupling force is exported exclusively in the form of information. This state corresponds to the aforementioned non-Markovian NESS with zero dissipation (see section 4.1).

Mapping non-reciprocity onto temperature gradients
In the course of this paper, we have demonstrated that non-reciprocal coupling introduces 'activity', or more generally, intrinsic nonequilibrium. In contrast, several other recent publications discuss (hidden) temperature gradients between reciprocally coupled stochastic d.o.f. as possible mechanisms that fuel active motion, see, e.g., [44,50,51]. In this section we show that, in some cases, non-reciprocally coupled systems can indeed be mapped onto a reciprocally coupled system with an internal temperature gradient.
Consider the non-reciprocal system with n = 1 and a_01 a_10 ≠ 0, equation (40). We now introduce new variables via √|a_01| X̃_0 = X_0 and √|a_10| X̃_1 = X_1, and rescaled temperatures |a_01| T̃_0 = T_0 and |a_10| T̃_1 = T_1. We note that if the X_j are position-like d.o.f., their scaling should indeed be accompanied by a scaling of the temperatures, due to the connection between temperatures and the time-derivative of the positions. In this way, we find
γ_0 dX̃_0/dt = a_00 X̃_0 + sgn(a_01) √(a_01 a_10) X̃_1 + ξ̃_0,
γ_1 dX̃_1/dt = sgn(a_10) √(a_10 a_01) X̃_0 + a_11 X̃_1 + ξ̃_1,   (41)
with the accordingly rescaled noises ξ̃_0 = ξ_0/√|a_01| and ξ̃_1 = ξ_1/√|a_10|. If a_01 a_10 > 0, this system has reciprocal coupling. Further, even if T_0 = T_1, it involves a temperature gradient. The symmetric system (41) could, for example, model the angles of two vanes in different heat baths, coupled by a torsion spring [60].
As is well-known [60,73], such a reciprocally coupled system equilibrates if, and only if, T̃_1 = T̃_0 ⇔ |a_01| T_1 = |a_10| T_0. The equilibrium condition found in this way is identical to the equilibrium condition (13) found from DB and FDR. Importantly, these considerations are not restricted to the case n = 1. In appendix E, we give an explicit example of a non-reciprocal system with n = 2 that can be mapped onto a reciprocally coupled one, if a_ij a_ji > 0 ∀ i, j ∈ {0, 1, 2}. Again, this mapping yields equilibrium conditions identical to (13), and the strategy can be generalized to larger n. Thus, we have shown that, when the equilibrium model with non-reciprocal coupling and temperature difference is mapped onto a reciprocally coupled system (which is potentially realizable by a mechanical setup), the temperature difference vanishes. Now we turn to the impact of this scaling on the thermodynamic quantities, using n = 1 as an illustration. For the heat flows, we find the relations Q̃̇_0 = Q̇_0/|a_01| and Q̃̇_1 = Q̇_1/|a_10| (42). This further means Q̃̇_0/T̃_0 + Q̃̇_1/T̃_1 = Q̇_0/T_0 + Q̇_1/T_1 (43), i.e., the EP in the scaled model is identical to the EP in the original model, while the energy flows in general differ. We conclude that the two 'driving mechanisms', that is, non-reciprocal coupling (with a_ij a_ji > 0) or a temperature gradient, cannot formally be distinguished on the level of the EP. This mapping also builds a bridge to active matter models where temperature gradients between reciprocally coupled stochastic d.o.f. fuel the active motion [44,50,51]. It should be emphasized, however, that a scaling as employed here cannot be found if a_ij a_ji ≤ 0 (which, interestingly, includes unidirectional coupling, e.g., the AOUP model). This suggests that non-reciprocal coupling is the more general way to introduce intrinsic non-equilibrium.
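The invariance of the EP under this mapping is straightforward to verify numerically (a sketch with hypothetical parameters, γ_j = k_B = 1): the isothermal non-reciprocal pair and its reciprocal counterpart with rescaled temperatures T̃_0 = T_0/|a_01|, T̃_1 = T_1/|a_10| yield different heat flows but identical EP.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def heat_rates(a, T, kB=1.0):
    A = np.asarray(a, float)                       # gamma_j = 1
    D = np.diag(kB * np.asarray(T, float))
    Sigma = solve_continuous_lyapunov(A, -2.0 * D)
    return np.einsum('ji,ji->j', A, A @ Sigma + D)

a01, a10 = 2.0, 0.5                                # a01 * a10 > 0
T = np.array([1.0, 1.0])                           # original model: isothermal
Q = heat_rates([[-2.0, a01], [a10, -2.0]], T)
kappa = np.sqrt(a01 * a10)                         # reciprocal coupling strength
T_sc = np.array([T[0] / a01, T[1] / a10])          # scaled temperatures
Q_sc = heat_rates([[-2.0, kappa], [kappa, -2.0]], T_sc)

EP = (Q / T).sum()            # EP of the non-reciprocal isothermal model
EP_sc = (Q_sc / T_sc).sum()   # EP of the mapped model with temperature gradient
```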

Underdamped dynamics
So far, we have focused on overdamped descriptions, which are appropriate when the inertia is negligible, or if one is mainly interested in the dynamics above the ballistic timescale. However, in certain situations the inertia terms might yield contributions to thermodynamic quantities that are crucial to obtain a physically consistent description, even above the ballistic timescale. This is, e.g., the case for feedback systems with very short delay times [45]. More importantly in the present context, this is also true for Markovian systems that are simultaneously coupled to multiple heat baths at different temperatures, since then energy may be transferred between different heat baths via the kinetic energy of the system, see, e.g., [74]. Therefore, we dedicate this last section to the consideration of inertia effects in the presence of non-reciprocal coupling. We will pay special attention to the following two aspects: (i) does the equilibrium nature of the non-reciprocal models which fulfill (13) persist when we account for inertia terms? (ii) Is our calculation of the heat flow consistent with underdamped dynamics?
As before, the mapping yields a reciprocal system if a_01 a_10 > 0. Due to the explicit inclusion of the inertia terms in the underdamped case, we can now consider the equipartition theorem, which represents yet another measure of equilibrium. If the reciprocal system (45) is in equilibrium, traditional thermodynamics tells us that equipartition holds, i.e., ⟨p̃_{0,1}²⟩/m_{0,1} = k_B T̃_{0,1} with p̃_{0,1} = m_{0,1} dX̃_{0,1}/dt, and that T̃_0 = T̃_1. Transforming back to the original variables, this corresponds to (47). Hence, the non-reciprocal (underdamped) system (44) also fulfills the equipartition theorem if T̃_0 = T̃_1, i.e., if a_10 T_0 = a_01 T_1. This condition is in agreement with the equilibrium condition from DB for overdamped dynamics, equation (13). We emphasize that the arguments presented here (including the mapping (45)) can readily be generalized to n > 1. Next, we consider the heat flow in the presence of inertia. To this end, we consider as a specific example the case m_{0,1} = m, γ_{0,1} = γ, a_00 = a_11 = −sgn(a_10) √(a_10 a_01), and introduce the new variable κ̃ = sgn(a_10) √(a_10 a_01) to simplify the notation. Then, (45) reduces to (48). For this system, the heat flow between X̃_0 and its bath has been calculated in reference [73] (see there equation (A16), and note the different sign convention); in our notation, it reads (49). Transforming back to the original variables (and recalling (42), which also holds in the underdamped description), this yields the heat flow (recall a_01 a_10 > 0)
Q̇_0 = −|a_01| k_B κ̃ (T̃_0 − T̃_1) / [2(γ + κ̃ m/γ)],   (50)
with T̃_0 = T_0/|a_01| and T̃_1 = T_1/|a_10| as before. Thus, the heat flow vanishes if (13) is fulfilled. This exactly agrees with the condition under which the heat flow in the overdamped description vanishes. Finally, we take the overdamped limit m/γ → 0 of (50), which yields lim_{(m/γ)→0} Q̇_0 = −|a_01| k_B κ̃ (T̃_0 − T̃_1)/(2γ) (51). This is indeed identical to (24) for the given parameters, confirming the consistency of our considerations.
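These underdamped statements can be probed numerically as well. The sketch below is our own illustration with hypothetical parameters satisfying a_10 T_0 = a_01 T_1 (condition (13)) and a generic confining choice a_00 = a_11 = −3 rather than the symmetric special case considered in the text; it solves the Lyapunov equation for the full state (X_0, X_1, V_0, V_1) and confirms both equipartition, m⟨V_j²⟩ = k_B T_j, and vanishing heat flows:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

m, gam, kB = 0.1, 1.0, 1.0
a = np.array([[-3.0, 2.0], [1.0, -3.0]])   # a_10 T_0 = a_01 T_1 with T below
T = np.array([2.0, 1.0])                   # condition (13) fulfilled

# state z = (X_0, X_1, V_0, V_1):  dX = V dt,  m dV = (a X - gam V) dt + xi dt
Z, I2 = np.zeros((2, 2)), np.eye(2)
A = np.block([[Z, I2], [a / m, -(gam / m) * I2]])
D = np.zeros((4, 4))
D[2:, 2:] = np.diag(kB * gam * T) / m**2   # velocity-sector noise intensities
Sigma = solve_continuous_lyapunov(A, -2.0 * D)

Vvar = np.diag(Sigma)[2:]                  # <V_j^2>
Qdot = gam * (Vvar - kB * T / m)           # heat rates into the baths
```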

Conclusion
This paper addresses the thermodynamic implications of non-reciprocal coupling between stochastic d.o.f., which is a form of non-conservative interaction appearing in various artificial and natural complex systems across the fields. The most important result is that the occurrence of a non-reciprocal coupling alone implies nonequilibrium, as indicated by a broken DB and fluctuation-dissipation relation, and is automatically associated with a net energy and information flow. Remarkably, we found that under special conditions (specifically if a_ij T_j = a_ji T_i), non-reciprocal systems can reach a state of thermal equilibrium, despite being simultaneously coupled to two heat baths at different temperatures. To prove the equilibrium nature of this state, we have considered a variety of equilibrium measures, that is, the FDR, DB, the equipartition theorem (when we additionally include inertia terms), zero total entropy production, and zero heat and information flows. In these equilibrium situations, the non-reciprocal system with internal temperature gradient can be formally mapped onto a reciprocal one at isothermal conditions, giving a mathematical explanation for the observed exceptions. Another key result is that a non-reciprocal coupling between isothermal d.o.f. may induce, for one of the two d.o.f., a negative heat flow (while the total dissipated energy is always positive), meaning that energy is extracted from the bath. This shows a crucial difference between the thermodynamic implications of a non-conservative (non-reciprocal) interaction vs a non-conservative external force, which can only induce a positive heat flow (as dictated by the second law). Both the existence of isothermal systems with negative heat flow and the existence of thermal equilibrium despite temperature gradients are intriguing phenomena which significantly depart from the thermodynamic behavior of reciprocal systems. Indeed, giving intuitive explanations appears to be challenging.
We hope that this manuscript will stimulate fruitful discussions and future research on this matter. As different, exemplary representatives of non-reciprocal systems, we have considered active matter and feedback-controlled systems. While a single unidirectional coupling makes X_1 a 'propulsion mechanism' and X_0 an 'active swimmer', a single non-reciprocal bidirectional coupling may make X_1 a 'feedback controller' that operates on X_0. Moreover, when the controller knows more about the controlled system than vice versa (indicated by an information flow to the controller), some major goals of feedback control can be achieved, including thermal fluctuation suppression and energy extraction from the heat bath (i.e., a negative heat flow), making X_1 a minimal version of a continuously operating 'Maxwell demon'. The latter can only be achieved if (i) the information flow is directed from the system to the controller and (ii) the controller applies negative feedback, i.e., a feedback force pointing away from the delayed position of X_0.
Whereas one non-reciprocal coupling (n = 1) only induces exponentially decaying memory in the corresponding non-Markovian equation for the single d.o.f. (e.g., X_0), the interplay of multiple linear non-reciprocal interactions (n > 1) allows one to generate non-monotonic memory, which, in turn, is typical for time-delayed feedback control. From a thermodynamic point of view, the cases n = 1 and n = 2 share the main characteristics. However, there is one crucial difference: the heat and information flows are not proportional to each other if n > 1. Thus, one can find for n = 2 some interesting NESS which only occur for n > 1 and non-reciprocal coupling. On the one hand, mutually coupled systems can be driven out of equilibrium due to their interaction, without at the same time exchanging any information. On the other hand, for a different non-reciprocal coupling topology, one can also find a state where one of these subsystems is in an NESS in which it exports entropy exclusively in the form of information, without displaying a heat flow (no entropy is exported to the bath).
We close this paper by giving some perspectives on future research. In the present paper, we have shown that, under certain conditions, non-reciprocal forces can be mapped onto temperature gradients. Moreover, it is known that non-reciprocal couplings may result from gradients of chemical potentials [10,105,106]. This is, e.g., the case in the cellular sensor model [21], used as an example in this paper (section 2). Thus, it seems worthwhile to systematically explore in the future whether, and under which conditions, a mapping onto other thermodynamic 'forces' is feasible.
A major focus of recent research is the search for meaningful thermodynamic descriptions of active systems. This is not the main topic of the present work, and we have merely scratched the surface of this issue here. For example, it is generally not possible to access the full dissipation of a complex living system, as long as not all underlying bio-chemical processes are fully known, understood, and also observable. The last point, i.e., the observability, is related to another main problem in this context, that is, the thermodynamic treatment of auxiliary, or effective, variables, which lack a clear physical interpretation, as is the case for the variable X_1 in the AOUP model. As we have pointed out several times throughout the paper, in such a situation the meaning of, e.g., the total EP is questionable. To account for this fact, we have discussed the different measures of (non)equilibrium on the Markovian and non-Markovian level of description. However, the DB condition or the FDR only yield a binary classification (equilibrium or not), but cannot quantify the distance from equilibrium. Finding an appropriate way to do this is discussed, e.g., in [44]. An interesting line of research in the context of observability and auxiliary variables is the search for 'effective thermodynamic' descriptions [52,53,86]. For a similar underdamped model with n = 1, different ways to obtain an 'effective thermodynamic' description were recently compared in [52]. A generalization toward higher n (and overdamped models) represents a nontrivial but certainly worthwhile direction for future research. It would also be interesting to investigate the special types of NESS observed here, e.g., with zero dissipation but nonzero information flow, from this perspective.
Also, regarding the different measures of (non)equilibrium, our preliminary observations indicate that for n > 1, there are non-reciprocal systems that fulfill the FDR but violate DB, i.e., nonequilibrium models with FDRs. It would be interesting to study the corresponding information flows for these cases.
In this paper, we have analyzed the thermodynamic properties of small stochastic systems of few colloids with non-reciprocal couplings. As a next step, one could think about the implications of our findings for larger systems with numerous non-reciprocal couplings, which are, in fact, already realized in recent experiments [24]. Indeed, the non-reciprocity is found to yield intriguing collective behavior such as clustering. We also note that in nonlinear dynamics and network science, studying the effects of symmetry-broken coupling on collective behavior is already a well-established research field [107]. For example, the existence of chimera states, a special type of clustering, has been linked to symmetry-broken coupling [108], and shown to persist in the presence of discrete delay [109] and Gamma-distributed memory [110].
Lastly, the unidirectionally coupled ring system studied here is very similar to the reservoir computers investigated in [111, 112]. A reservoir computer of this type may be experimentally realized by a laser network [113, 114] or by coupled RC circuits [115, 116]. Another link to machine learning is the similarity between the unidirectional ring and recurrent neural networks [117], used, for example, for reinforcement learning. In these contexts, the connection between non-reciprocal coupling and information flow discussed here might be of particular importance. Notably, the architecture of the unidirectional ring considered here also resembles that of a Brownian clock [118], which, in contrast, has discrete dynamics.
Since we are interested in steady-state dynamics in this paper, we can safely set $X_j(0) \equiv 0$ without loss of generality. We therewith obtain, for all $j$,
$$\tilde{X}_j(s) = \frac{\sum_{k\neq j} a_{jk}\,\tilde{X}_k(s) + \tilde{\xi}_j(s)}{s\gamma_j - a_{jj}}. \quad \text{(A2)}$$
Let us first consider the case $n = 1$. We plug (A2) for $j = 1$ into the equation (A1) for $X_0$ and immediately find
$$s\gamma_0\tilde{X}_0 = a_{00}\tilde{X}_0 + \frac{a_{01}a_{10}}{s\gamma_1 - a_{11}}\,\tilde{X}_0 + \frac{a_{01}}{s\gamma_1 - a_{11}}\,\tilde{\xi}_1 + \tilde{\xi}_0.$$
Now we make use of the convolution theorem and the linearity of the Laplace transformation to transform back to the time domain, obtaining the non-Markovian process (3) with a memory kernel given by the inverse Laplace transform of $a_{01}a_{10}/(s\gamma_1 - a_{11})$, as explicitly given in (4). Analogously, one finds the Gaussian colored noise in (3),
$$\nu(t) = \xi_0(t) + \frac{a_{01}}{\gamma_1}\int_0^t e^{a_{11}(t-t')/\gamma_1}\,\xi_1(t')\,\mathrm{d}t'. \quad \text{(A5)}$$
In the steady state ($t \to \infty$), the second, transient term of the corresponding correlation vanishes (if $a_{11} < 0$), yielding the correlation from (4).
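The inverse Laplace transform yielding the exponential memory kernel can be checked symbolically. A minimal sketch using sympy, with symbol names chosen to match $a_{01}$, $a_{10}$, $a_{11}$, $\gamma_1$ above (assuming $a_{11} < 0$ and $\gamma_1 > 0$, as required for a decaying kernel):

```python
import sympy as sp

# Laplace variable s and time t; t > 0 so the Heaviside factor drops out
s, t = sp.symbols('s t', positive=True)
a01, a10 = sp.symbols('a01 a10', real=True)
g1 = sp.Symbol('gamma1', positive=True)
a11 = sp.Symbol('a11', negative=True)

# Transfer function obtained after eliminating X_1 in Laplace space
G = a01 * a10 / (s * g1 - a11)

# Inverse Laplace transform gives the memory kernel K(t)
K = sp.inverse_laplace_transform(G, s, t)
print(sp.simplify(K))  # (a01*a10/gamma1) * exp(a11*t/gamma1)
```

Since $a_{11} < 0$, the kernel decays exponentially on the timescale $\gamma_1/|a_{11}|$, consistent with the memory discussed in the main text.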

Appendix D. On the terminology of positive and negative feedback
In control theory, it is common to characterize feedback loops as positive or negative, according to whether the force points toward or away from the desired state once the system is perturbed from it, see, e.g., [62]. In the following, we check which type of feedback is considered in this paper. We recall that in the present case, the control problem, given in (8) or, equivalently, in (3), reads $\gamma_0\dot{X}_0(t) = a_{00}X_0(t) + F_c + f_0 + \mu(t)$, with the feedback force $F_c = \int K(t-t')X_0(t')\,\mathrm{d}t'$. For the sake of illustration, let us explicitly consider the limit $n \to \infty$, where the notation simplifies, while the following reasoning is the same for any $n$. In this limit, $F_c = kX_0(t-\tau)$. This control problem can alternatively be expressed as $\gamma_0\dot{X}_0(t) = (a_{00}+k)X_0(t) - k[X_0(t) - X_0(t-\tau)] + f_0 + \mu(t)$, which suits the picture of a colloid in a static harmonic trap of stiffness $a_{00}+k<0$, subject to a co-moving feedback trap centered around $X_0(t-\tau)$ with stiffness $k$. When $X_0(t) \equiv X_0(t-\tau)$, the control term $-k[X_0(t)-X_0(t-\tau)]$ vanishes; thus, this is a 'non-invasive' control. In contrast, when the system is perturbed from the delayed state, the control may yield a positive or negative force on $X_0$. Specifically, if $k>0$, we have a negative force $-k[X_0(t)-X_0(t-\tau)]<0$ whenever $X_0(t)>X_0(t-\tau)$, and a positive force whenever $X_0(t)<X_0(t-\tau)$. Hence, the feedback with $k>0$ points toward the past state $X_0(t-\tau)$; therefore, this case is denoted positive feedback. On the contrary, the feedback force always points away from $X_0(t-\tau)$ if $k<0$.
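The rewritten delayed equation can be illustrated numerically. A minimal sketch (deterministic part only, noise $\mu(t)$ omitted; parameter values are purely illustrative, not taken from the paper) integrating $\gamma_0\dot{X}_0(t) = (a_{00}+k)X_0(t) - k[X_0(t)-X_0(t-\tau)]$ with an Euler scheme and a constant history:

```python
import numpy as np

def simulate_delayed(a00, k, tau, gamma0=1.0, x_init=1.0, dt=1e-3, t_max=20.0):
    """Euler integration of gamma0 * dX0/dt = (a00 + k) X0 - k [X0(t) - X0(t - tau)].
    The history for t < 0 is held constant at x_init."""
    n_delay = int(round(tau / dt))
    n_steps = int(round(t_max / dt))
    x = np.full(n_steps + 1, x_init)
    for i in range(n_steps):
        x_past = x[i - n_delay] if i >= n_delay else x_init
        force = (a00 + k) * x[i] - k * (x[i] - x_past)
        x[i + 1] = x[i] + dt * force / gamma0
    return x

# Stable positive-feedback case: a00 + k < 0 and 0 < k < |a00| (illustrative values)
traj = simulate_delayed(a00=-2.0, k=0.5, tau=1.0)
print(traj[-1])  # relaxes toward 0
```

For $k>0$ the co-moving trap pulls $X_0$ toward its own past state; with $|k| < |a_{00}|$ the combined dynamics remains stable for any delay, so the trajectory decays to the fixed point.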
Let us now consider the individual summands. By applying basic properties of the logarithm, integration by parts, and the natural boundary conditions, we find
$$\int \partial_{x_j} J_j(\mathbf{x})\, \ln\frac{\rho_1(x_0)\rho_1(x_1)\cdots\rho_1(x_n)}{\rho_{n+1}(\mathbf{x})}\,\mathrm{d}\mathbf{x} = \int J_j(\mathbf{x})\,\partial_{x_j}\ln\frac{\rho_{n+1}(\mathbf{x})}{\rho_1(x_j)}\,\mathrm{d}\mathbf{x} = \dot{I}^{\to j}.$$
Thus, the change of mutual information is given by the sum over all information flows, $\sum_{j=0}^n \dot{I}^{\to j} = \dot{I}$. (As was shown in [90], the information flow $\dot{I}^{\to j}$ is actually the 'time-shifted mutual information' with the time shift applied to $X_j$.)
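For a linear system, the stationary mutual information entering these flows can be evaluated in closed form from the steady-state covariance. A sketch for a two-variable overdamped system $\dot{\mathbf{X}} = A\mathbf{X} + \boldsymbol{\xi}$ with $\langle\boldsymbol{\xi}(t)\boldsymbol{\xi}^\top(t')\rangle = 2D\,\delta(t-t')$; the non-reciprocal coupling matrix and noise strengths below are illustrative values, not parameters from the paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Non-reciprocal coupling: off-diagonal entries differ (a01 != a10)
A = np.array([[-2.0,  1.0],
              [-0.5, -1.0]])
D = np.eye(2)  # isothermal, unit noise strengths

# Steady-state covariance Sigma solves the Lyapunov equation
#   A Sigma + Sigma A^T + 2 D = 0
Sigma = solve_continuous_lyapunov(A, -2.0 * D)

# Mutual information of the stationary bivariate Gaussian:
#   I = -(1/2) ln(1 - rho^2), with rho the correlation coefficient
rho = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
I = -0.5 * np.log(1.0 - rho**2)
print(I)  # approximately 0.0385 for these illustrative parameters
```

The nonzero stationary correlation, and hence nonzero mutual information, is induced here purely by the asymmetry of $A$; with reciprocal coupling and equal temperatures, the flows $\dot{I}^{\to j}$ would vanish in the steady state.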