Implications of non-Markovian dynamics on information-driven engine

Understanding the memory effects that arise from the interaction between a system and its environment is key to engineering quantum thermodynamic devices beyond the standard Markovian limit. We study the performance of a measurement-based thermal machine whose working medium is subject to a backflow of information from the reservoir, described via a collision-based model. The non-Markovian effect is introduced by allowing for additional unitary interactions between the environments. We present two strategies for realizing non-Markovian dynamics and study their influence on the performance of the engine. We find that system-environment memory effects can benefit both the work extraction and the information gain through measurement at short times.


Introduction
The second law of thermodynamics is ubiquitous in nature: it stipulates that heat always flows from a hot place to a cold one. However, in 1867 Maxwell proposed the opposite with his idea of an intelligent demon, devised to illustrate the statistical nature of the second law of thermodynamics [1]. The demon, with sufficient information about the microscopic motions of individual atoms and molecules, is able to separate the fast-moving ('hot') ones from the slow-moving ('cold') ones and induce heat to flow from cold to hot, in apparent contradiction with the second law of thermodynamics. It took nearly a century to resolve this apparent paradox through a series of works, starting from Szilard's engine [2] and continuing with Landauer [3], Bennett [4] and others, which clarified the link between the information recorded by the demon and the thermodynamic entropy, see [5]. Advances in nanotechnology have recently made it possible to realize Maxwell's thought experiment and Szilard's engine [6][7][8].
In parallel, there has been a line of development on the non-Markovian dynamics of systems interacting with a reservoir. Theoretical advances have been made on the characterization of non-Markovianity [9][10][11], and it has been verified in various experimental setups [12][13][14]. The role of memory (non-Markovian) effects in information processing at both the classical and quantum level is currently attracting research interest [15][16][17][18]. Likewise, over the last few years, there has been an increase in studies that aim to understand or harness non-Markovian effects in quantum thermodynamic machines [19][20][21][22][23][24].
Recently, the study of the non-Markovian dynamics of a system has shed more light on the Landauer principle [17].
Over the past few years, great effort has been devoted to studying the interplay between thermodynamics and quantum mechanics [25][26][27][28][29][30]. Remarkable progress has been made in understanding non-equilibrium processes in thermodynamics, as well as in generalizing the second law of thermodynamics to incorporate measurement and feedback-driven processes [31][32][33][34][35][36][37]. Recently, the role of feedback control in information thermodynamic engines has been studied experimentally on different platforms [38][39][40][41][42][43]. However, the performance of the machine when the feedback engine protocol is run on a system exhibiting non-Markovian dynamics is still not understood. Although the self-consistent formulation of an interpretation of thermodynamic laws in the presence of measurements and feedback is still work in progress, and is attracting much attention, more practical issues such as the enhancement of the performance of cooling algorithms by feedback-based mechanisms are already under investigation and exploitation [44][45][46][47]. Non-Markovian effects have, however, been examined from the point of view of information flow [48], and a feedback-assisted work-extraction demon has been proposed in [49].
In this paper, we investigate the implications of non-Markovian quantum dynamics on feedback-based information-driven machines described by collisional models (CM) [16,50-55]. We discretize the continuous time evolution into a series of steps during which the system of interest interacts with different components of a many-body quantum system that simulates an extended environment. By properly controlling the intra-environment collisions one can pass from purely Markovian dynamics to a strongly non-Markovian regime. In fact, extended control over the amount of non-Markovianity based on CM has been demonstrated experimentally in photonic setups [13,56,57]. Controllable non-Markovian quantum dynamics of an electronic spin qubit has been realized using a nitrogen-vacancy center in diamond [58][59][60]. In addition, the system under scrutiny is subjected to weak measurements, implemented by weakly coupling it to an ancilla M that is then subjected to strong projective measurements. This provides an elegant way to infer the state of a quantum system with very little disturbance [61]. We show that memory can enhance the overall performance of the engine (work done and information gain) over a small number of discrete steps, i.e. at short times. We remark that our protocol can easily be implemented in a photonic experimental setup [13].
The rest of the paper is organized as follows. In section 2, we first present the measurement-based engine and then briefly discuss its thermodynamic analysis, section 2.2. In section 3 we introduce the CM-based model of non-Markovian dynamics and outline two different strategies for the tracking of the dynamics. Then, the characterization of the non-Markovian features is given in section 4.1, while the analysis of the feedback-driven engine in both Markovian and non-Markovian situation is reported in section 4.2. Finally, section 5 draws our conclusions.

Measurement-based thermo-machine
The system is initially brought into contact with a heat reservoir. It is then decoupled from the reservoir and attached to a measuring apparatus. The latter consists of a quantum system, prepared in a given state, coupled to the system and subjected to projective measurements. The apparatus acquires information on the state of the system and, depending on the result of the measurement performed on its state, a feedback operation is performed on the system. The setup consists of three components: the system, the reservoir and the ancilla.

Description of the protocol
We now introduce the protocol for investigating the effects that a process of information-gathering and feedback has on the capability of a system undergoing non-Markovian quantum dynamics to perform work, see figure 1. While Steps 1 and 2 of the scheme illustrated in figure 1 and described herein generate the non-Markovian dynamics, Steps 3-6 correspond to the protocol in [35]. We proceed step by step, as follows:
Step 1: Initial preparation. System S and thermal reservoir(s) R are prepared in their respective equilibrium states at inverse temperature $\beta_i = 1/(k_B T_i)$, where $i = S, R$. The initial system-reservoir state is described by the density matrix
$$\rho_{SR} = \rho_S \otimes \rho_R = \frac{e^{-\beta_S H_S}}{Z_S} \otimes \frac{e^{-\beta_R H_R}}{Z_R},$$
where $H_i$ denotes the Hamiltonian of element $i$ and $Z_i = \mathrm{Tr}[e^{-\beta_i H_i}]$ is the corresponding partition function. For simplicity, we consider the case in which the system and the reservoir are made of two-level systems.
Step 2: System-environment coupling. System and reservoir interact unitarily. In line with the usual formalism used in collisional models for quantum open-system dynamics [16,37,50-53,62], in what follows we focus on a time-evolution operator $U_{SR}(\tau)$ of the partial-SWAP form, equation (2), where $\tau$ is a dimensionless interaction time and $U_{\rm sw}$ is the two-particle SWAP transformation, $U_{\rm sw}\,|\psi\rangle\otimes|\phi\rangle = |\phi\rangle\otimes|\psi\rangle$. This results in a sequential coherent exchange of information between the system and the element of the reservoir it has collided with at each collision/iteration. The S-R state after such unitary evolution is thus
$$\rho'_{SR} = U_{SR}\,(\rho_S \otimes \rho_R)\,U_{SR}^{\dagger}.$$
In general, the joint dynamics embodied by $U_{SR}$ gives rise to quantum correlations between system and environment. The environment is then discarded, leaving us with the reduced state of the system only,
$$\rho_S^{u} = \mathrm{Tr}_R[\rho'_{SR}].$$
Step 3: Pre-measurement. The system is then brought into contact with a measuring apparatus, i.e. an ancillary qubit M prepared in state $\rho_M$. The S-M coupling takes place according to the unitary transformation $U_{SM}$, which gives the joint density matrix
$$\rho_{SM} = U_{SM}\,(\rho_S^{u} \otimes \rho_M)\,U_{SM}^{\dagger},$$
where $\tau_m$ is the dimensionless system-probe interaction time and $H_{SM}$ the corresponding S-M coupling Hamiltonian, such that $U_{SM} = \exp(-i\tau_m H_{SM})$. The coupling Hamiltonian can take different forms depending on the coupling direction. However, without loss of generality, we consider the case where we aim at performing projections onto the eigenstates of the Pauli spin matrix $\sigma_z$. This can be achieved by preparing a probe qubit in the state $|0\rangle$ and then applying a controlled-NOT (CNOT) gate from the system to the probe before inferring $\sigma_z$ from the probe. Thus the unitary operator describing the general interaction between the system and the probe is [63]
$$U_{SM} = \exp(-i\tau_m U_{\rm CNOT}) = \cos(\tau_m)\,\mathbb{1} - i\sin(\tau_m)\,U_{\rm CNOT},$$
where $U_{\rm CNOT} = |0\rangle\langle 0|\otimes\mathbb{1} + |1\rangle\langle 1|\otimes\sigma_x$ is the definition of the CNOT gate. For $\tau_m = 0$, there is no correlation between the system and the probe, whereas $\tau_m = \pi/2$ (CNOT up to the global phase $-i$) implies perfect correlation between them. The system and the probe become partially correlated for $0 < \tau_m < \pi/2$.
Step 4: Measurement. This is the actual information-gathering step, where the information on S acquired by the ancilla during Step 3 is transferred to M via an actual measurement process. The latter is described by the complete set of projective operators $\{M_k^{(M)}\}$, defined in the Hilbert space of the ancilla M. Let us assume that the ancilla is initially prepared in one of its computational-basis states, i.e. $\rho_M = |0\rangle\langle 0|$. The probability $p_k$ that outcome $k$ is obtained as a result of such a measurement is given by an element of the positive-operator valued measure (POVM) induced on the system. In terms of the joint system-probe state $\rho_{SM}$ after the pre-measurement, the probability and the corresponding post-measurement state of the system read
$$p_k = \mathrm{Tr}\big[(\mathbb{1}_S\otimes M_k^{(M)})\,\rho_{SM}\big], \qquad \rho_S^{(k)} = \frac{1}{p_k}\,\mathrm{Tr}_M\big[(\mathbb{1}_S\otimes M_k^{(M)})\,\rho_{SM}\,(\mathbb{1}_S\otimes M_k^{(M)})\big].$$
Concretely, we consider a weak (gentle) measurement: a projective measurement on the probe after the pre-measurement interaction gives only partial information about the system and thus only partially projects the system state.
The projectors describing the local $\sigma_z$ measurement on the probe are $M_0^{(M)} = |0\rangle\langle 0|$ and $M_1^{(M)} = |1\rangle\langle 1|$. For $\tau_m \ll 1$, the outcome $|0\rangle$ occurs with probability $P_0 \approx 1$, and the post-measurement state of the system is almost unchanged from the initial state [64] (cf. appendix A). The resulting probability and post-measurement system state follow from the expressions above with $k = 0$. In the context of photon polarization, one might direct single photons toward a weakly polarization-dependent beam splitter to simulate such a measurement. In addition, it is possible to design a measurement protocol on an NMR quantum information processor that outputs only the post-selected state of a weak measurement, by means of controlled gate operations [65].
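Step 5: Feedback control operation. Based on the outcome of the measurement at Step 4, the controller performs a conditional operation on the state of the system [31,35]. The most general unitary transformation on a single-qubit state is a rotation $R_{\mathbf{n}}(\alpha) = \exp[-i\alpha\,\mathbf{n}\cdot(\sigma_x,\sigma_y,\sigma_z)]$ that depends on an angle $\alpha$ about an arbitrary axis identified by the unit vector $\mathbf{n} = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)$, which has been written in polar coordinates specified by the polar angle $\theta$ and the azimuthal angle $\phi$. By including a general global phase $\gamma$, such an arbitrary unitary rotation operator is
$$e^{i\gamma}R_{\mathbf{n}}(\alpha) = e^{i\gamma}\begin{pmatrix} \cos\alpha - i\cos\theta\sin\alpha & -i e^{-i\phi}\sin\theta\sin\alpha \\ -i e^{i\phi}\sin\theta\sin\alpha & \cos\alpha + i\cos\theta\sin\alpha \end{pmatrix}. \tag{10}$$
In our case, the set of parameters upon which such a rotation depends should be interpreted as conditioned on the outcome of the measurement performed, at Step 4, on the ancilla M. That is, the parameters take outcome-dependent values $v_k = (\alpha_k, \theta_k, \phi_k, \gamma_k)$, which define the conditional unitary $U_k = e^{i\gamma_k}R_{\mathbf{n}_k}(\alpha_k)$. The use of such a conditioned rotation, which embodies our simple feedback control operation, delivers the state of the system
$$\rho_S^{\rm fb} = \sum_k p_k\, U_k\, \rho_S^{(k)}\, U_k^{\dagger},$$
where $p_k$ and $\rho_S^{(k)}$ are the probability and post-measurement system state associated with outcome $k$. However, we remark that the feedback unitary operation could cancel the actual measurement effect depending on the choice of the parameters $v_k$; for more discussion, see appendix B.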
Step 6: The reset. The system evolves independently and a fresh ancilla is made available for the next iteration of the protocol, which proceeds again from Step 1 onwards. This stage has no effect on the analysis that follows.
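As an illustration of Steps 1-5, the following is a minimal numerical sketch of one protocol iteration for qubits. It is not the authors' code. It assumes $H = \omega\sigma_z/2$ for both system and reservoir, the partial-SWAP convention $\cos\tau\,\mathbb{1} + i\sin\tau\,U_{\rm sw}$ common in the collision-model literature (the sign and the normalization of $\tau$ in equation (2) may differ), and a purely hypothetical choice of feedback angles. The function names are illustrative; the numerical parameters are those quoted in the caption of figure 4.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1.0, 0.0]).astype(complex)   # |0><0|
P1 = np.diag([0.0, 1.0]).astype(complex)   # |1><1|
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0],
                 [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)
CNOT = np.kron(P0, I2) + np.kron(P1, SX)   # control: system, target: probe


def thermal_state(omega, beta):
    """Step 1: Gibbs state of the assumed Hamiltonian H = omega * sigma_z / 2."""
    weights = np.exp(-beta * 0.5 * omega * np.array([1.0, -1.0]))
    return np.diag(weights / weights.sum()).astype(complex)


def trace_out_second(rho):
    """Partial trace over the second qubit of a two-qubit state."""
    return np.einsum('ikjk->ij', rho.reshape(2, 2, 2, 2))


def collide(rho_s, rho_r, tau):
    """Step 2: partial-SWAP collision (assumed convention), reservoir discarded."""
    u = np.cos(tau) * np.eye(4) + 1j * np.sin(tau) * SWAP
    return trace_out_second(u @ np.kron(rho_s, rho_r) @ u.conj().T)


def weak_measure(rho_s, tau_m):
    """Steps 3-4: couple to a probe in |0>, project the probe; returns [(p_k, rho_k)]."""
    u = np.cos(tau_m) * np.eye(4) - 1j * np.sin(tau_m) * CNOT  # exp(-i tau_m CNOT)
    rho_sm = u @ np.kron(rho_s, P0) @ u.conj().T
    outcomes = []
    for proj in (P0, P1):
        m = np.kron(I2, proj)
        unnorm = trace_out_second(m @ rho_sm @ m)
        p = float(np.real(np.trace(unnorm)))
        outcomes.append((p, unnorm / p if p > 1e-15 else unnorm))
    return outcomes


def rotation(alpha, theta, phi, gamma=0.0):
    """Step 5: exp(i gamma) * exp(-i alpha n.sigma) for the unit vector n(theta, phi)."""
    n_sigma = (np.sin(theta) * np.cos(phi) * SX
               + np.sin(theta) * np.sin(phi) * SY
               + np.cos(theta) * SZ)
    return np.exp(1j * gamma) * (np.cos(alpha) * I2 - 1j * np.sin(alpha) * n_sigma)


def feedback(outcomes, params):
    """Apply the outcome-dependent rotation and average over outcomes."""
    rho = np.zeros((2, 2), dtype=complex)
    for (p, rho_k), v in zip(outcomes, params):
        u = rotation(*v)
        rho += p * (u @ rho_k @ u.conj().T)
    return rho


rho_u = collide(thermal_state(1.0, 0.94), thermal_state(3.0, 0.94), np.pi / 42)
outcomes = weak_measure(rho_u, np.pi / 14)
# Hypothetical feedback: do nothing on outcome 0, flip about x on outcome 1.
rho_fb = feedback(outcomes, [(0.0, 0.0, 0.0), (np.pi / 2, np.pi / 2, 0.0)])
```

Setting `tau_m = np.pi / 2` in `weak_measure` reduces the scheme to a projective $\sigma_z$ measurement of the system, while `tau_m = 0` leaves the system state untouched, mirroring the two limits discussed in Step 3.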

Thermodynamics of the machine
We proceed with the thermodynamic analysis of the protocol presented above, by calculating the changes in the internal energy $E = \mathrm{Tr}[\rho H_S]$ of the system associated with the preparation, measurement and feedback-control steps.
First, after the system preparation (interaction with the reservoir), the change in the system internal energy is
$$\Delta E_u = \mathrm{Tr}\big[H_S\,(\rho_S^{u} - \rho_S)\big],$$
where $\rho_S$ and $\rho_S^{u}$ denote the system state before and after the interaction with the reservoir, and the change in system entropy reads
$$\Delta S_u = S(\rho_S^{u}) - S(\rho_S) = -\mathrm{Tr}[\rho_S^{u}\ln\rho_S^{u}] + \mathrm{Tr}[\rho_S\ln\rho_S].$$
From the first law of thermodynamics, $\Delta E = W + Q$, and assuming that the heat exchange between the system and reservoir is governed by $Q_u^{S} = -Q_u^{R}$, which is reasonable in the absence of any channel for heat exchange other than the S-R interaction, the work done on/by the system can be written as
$$W_u = \Delta E_u - Q_u^{S} = \Delta E_u + Q_u^{R}.$$
During the measurement stage, the information acquired from the system leads to entropy reduction. The resulting system state is out of equilibrium, but its entropy and average energy are still well defined [32]. The gain of information about the system achieved through the measurement, after the pre-measurement and measurement stages, is
$$I_g = S(\rho_S^{u}) - \sum_k p_k\, S(\rho_S^{(k)}),$$
where $p_k$ and $\rho_S^{(k)}$ are the probability and post-measurement system state associated with outcome $k$.
Equation (18), which bounds the work in the presence of feedback, goes beyond the conventional second law of thermodynamics owing to the correlation between the system and the memory. We remark that the form of such a bound was first given in [31] for a discrete quantum feedback protocol (Steps 3-6) starting and ending in equilibrium states, while details on the subject can be found in [37]. In section 3, we present the model that we use to account for non-Markovianity in the dynamics of S. Such effects can be characterized by the work done on/by the system, $W_u$. Then, in section 4, we illustrate numerically the influence of the preparation on the information gain $I_g$ and the work extraction $W_t$.
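The quantities entering this analysis are straightforward to evaluate numerically. The sketch below is illustrative only: it assumes units with $k_B = 1$, takes the information gain in the entropy-difference form $S(\rho) - \sum_k p_k S(\rho_k)$, and uses hypothetical helper names that do not come from the paper.

```python
import numpy as np


def entropy(rho):
    """von Neumann entropy S(rho) = -Tr[rho ln rho], in nats."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]            # 0 ln 0 is taken as 0
    return float(-np.sum(ev * np.log(ev)))


def energy(rho, h):
    """Internal energy E = Tr[rho H]."""
    return float(np.real(np.trace(rho @ h)))


def work_first_law(delta_e, heat):
    """Work from the first law, Delta E = W + Q."""
    return delta_e - heat


def information_gain(rho, outcomes):
    """Assumed form: S(rho) - sum_k p_k S(rho_k), outcomes = [(p_k, rho_k)]."""
    return entropy(rho) - sum(p * entropy(r) for p, r in outcomes if p > 0)


# Example: a maximally mixed qubit measured projectively yields ln 2 of information.
rho_mixed = 0.5 * np.eye(2, dtype=complex)
projective = [(0.5, np.diag([1.0, 0.0]).astype(complex)),
              (0.5, np.diag([0.0, 1.0]).astype(complex))]
gain = information_gain(rho_mixed, projective)

# Example: energy change between two diagonal states for H = sigma_z / 2.
h_s = 0.5 * np.diag([1.0, -1.0]).astype(complex)
delta_e = (energy(np.diag([0.5, 0.5]).astype(complex), h_s)
           - energy(np.diag([0.3, 0.7]).astype(complex), h_s))
```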

Non-Markovian dynamics of the system-collisional based model
Here, we consider a situation where the system undergoes non-Markovian dynamics as a result of its interaction with the environment (taking place at steps 1 and 2 of our protocol). The realization of the dynamics that we decide to consider is that of collisional models, which offer great flexibility and richness of phenomenology [51,52].
In particular, we consider the case in which the reservoir's memory mechanism arises from collisions between different elements of a structured, multi-party environment, following an interaction with the system. This scenario has been successfully used in the past to model memory-bearing mechanisms able to propagate to the environment information acquired on the state of the system [66]. More recently, this realization of memory-bearing effects has been used to assess the performance of a quantum Otto cycle having a harmonic system as a working medium [67]. Collisional models allow for the tracking of the dynamics of both system and environments, which in turn makes it possible to follow the ensuing emergence of the system-environment correlations responsible for memory effects [16,50-55,68]. They are thus invaluable methodological tools to assess the back-action of memory-bearing environments on the information-driven engine at the core of our study.
As anticipated above, we assume an environment R made out of a large number of elements, which we label $\{E_1, E_2, \ldots, E_n\}$ and which we assume, for the sake of simplicity, to be identical. The total state of system and environment is initially factorized, and the dynamics proceeds through sequential collisions (interaction processes) between S and an element $E_n$ of the environment. These are followed by pairwise collisions/interactions between the elements of the environment, as illustrated in figure 2. In [66], it has been shown that the degree of non-Markovianity of the reduced system dynamics depends on how the erasure of system-environment correlations is performed.
Here, we will consider two inequivalent schemes for tracing out the degrees of freedom of the environment. The first scenario that we consider to compute the reduced dynamics of S requires the environmental particle $E_n$ to be traced out after it has interacted with S and $E_{n+1}$, but before the system interacts with $E_{n+1}$. In the second scenario, the reduced dynamics of the system is obtained by tracing out the environmental particle once it has interacted with system S. The remaining environmental particle interacts with the next particle of the homogeneous environment before the latter subsequently collides with the system. We also assume that the environment-environment interaction is described by the unitary operator $U_{E_nE_{n+1}}$ of equation (19) [16,51-53], which describes another partial-SWAP gate between two consecutive elements of the environment, parameterized by the dimensionless interaction time $\tau_e$.

Figure 2. Schematic of non-Markovian dynamics via a collision model for nearest sub-environment collisions. The system and the sub-environment particles are initially uncorrelated. In the first step (a), system S interacts with $E_1$. In step (b), $E_1$ interacts with $E_2$, thereby correlating the system and particles $E_1$ and $E_2$. Then, in step (c), $E_1$ is traced away. After this, the system interacts with $E_2$ before being isolated for the measurement and feedback processes (Steps 3-5) entailed in strategy-1. Then, the system moves forward (d), while $E_2$ and $E_3$ interact.

The first scenario (which we term strategy-1) involves tracing out particle $E_n$ after it has collided with $E_{n+1}$, as exemplified in figures 2(a)-(c). It starts with a collision between S and $E_n$, modelled through the unitary operation $U_{SR}$ in equation (2), which delivers the joint state
$$\rho'_{SE_n} = U_{SR}\,(\rho_S\otimes\rho_{E_n})\,U_{SR}^{\dagger}.$$
The three particles S, $E_n$ and $E_{n+1}$ then become correlated through the intra-environment interaction $U_{E_nE_{n+1}}$ in equation (19), after which particle $E_n$ is traced out. This results in the bipartite S-$E_{n+1}$ state
$$\rho_{SE_{n+1}} = \mathrm{Tr}_{E_n}\big[U_{E_nE_{n+1}}\,(\rho'_{SE_n}\otimes\rho_{E_{n+1}})\,U_{E_nE_{n+1}}^{\dagger}\big].$$
The marginal state of the system is computed after the interaction with $E_{n+1}$. Thus, strategy-1 prepares the system in the state
$$\rho_S^{u} = \mathrm{Tr}_{E_{n+1}}\big[U_{SR}\,\rho_{SE_{n+1}}\,U_{SR}^{\dagger}\big],$$
where $U_{SR}$ now acts on S and $E_{n+1}$. We remark that retaining the correlations up to the third environment, which corresponds to the systematic collision with the environmental components $E_n$, $E_{n+1}$ and $E_{n+2}$ as in figure 2, does not change the resulting dynamics [54]. At the end of the system-environment interaction, the engine-protocol steps (Steps 3-6) are performed before the system collides with another fresh environment.
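In the second scenario, dubbed strategy-2, the correlation established between S and $E_n$ is removed before the intra-environment interaction $E_n$-$E_{n+1}$. The states achieved at each stage of strategy-2 are thus as follows. First, the collision between system and $E_n$ occurs, which gives the state
$$\rho'_{SE_n} = U_{SR}\,(\rho_S\otimes\rho_{E_n})\,U_{SR}^{\dagger}.$$
The S-$E_n$ correlations are then discarded by retaining only the marginals $\rho'_S = \mathrm{Tr}_{E_n}[\rho'_{SE_n}]$ and $\rho'_{E_n} = \mathrm{Tr}_S[\rho'_{SE_n}]$. Next, $E_n$ interacts with $E_{n+1}$, which leaves the latter in the state
$$\rho'_{E_{n+1}} = \mathrm{Tr}_{E_n}\big[U_{E_nE_{n+1}}\,(\rho'_{E_n}\otimes\rho_{E_{n+1}})\,U_{E_nE_{n+1}}^{\dagger}\big],$$
with which the system subsequently collides. This scenario clearly differs from the first one in both the number of particles involved and the amount of correlations that are retained as a result of the system-environment interaction. In turn, this influences the non-Markovian features of the dynamical maps applied to S that arise from the implementation of such strategies.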
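The difference between the two strategies is easiest to see when one preparation round of each is written out explicitly. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It uses the partial-SWAP convention $\cos\tau\,\mathbb{1} + i\sin\tau\,U_{\rm sw}$ for both the system-environment and the intra-environment gate, so a full swap occurs at $\tau_e = \pi/2$ here, whereas the text associates it with $\tau_e = \pi/4$, which suggests a factor-of-two difference in how the interaction time is defined. It takes all environment qubits to be identical, and the function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0],
                 [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)


def partial_swap(tau):
    """Assumed partial-SWAP convention: cos(tau) 1 + i sin(tau) SWAP."""
    return np.cos(tau) * np.eye(4) + 1j * np.sin(tau) * SWAP


def partial_trace(rho, keep, n):
    """Reduced state of the qubits listed in `keep` out of n qubits."""
    t = rho.reshape([2] * (2 * n))
    # Trace from the highest index down so lower axis numbers stay valid.
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        m = t.ndim // 2
        t = np.trace(t, axis1=q, axis2=q + m)
    d = 2 ** len(keep)
    return t.reshape(d, d)


def strategy_1(rho_s, rho_e, tau, tau_e):
    """S-E_n collision, E_n-E_{n+1} collision, trace E_n, S-E_{n+1} collision."""
    u_se, u_ee = partial_swap(tau), partial_swap(tau_e)
    rho_se = u_se @ np.kron(rho_s, rho_e) @ u_se.conj().T
    u3 = np.kron(I2, u_ee)                       # acts on (E_n, E_{n+1})
    rho3 = u3 @ np.kron(rho_se, rho_e) @ u3.conj().T
    rho_s_next = partial_trace(rho3, keep=[0, 2], n=3)   # correlated S-E_{n+1}
    rho_s_next = u_se @ rho_s_next @ u_se.conj().T
    return partial_trace(rho_s_next, keep=[0], n=2)


def strategy_2(rho_s, rho_e, tau, tau_e):
    """S-E_n correlations are erased before E_n meets E_{n+1}."""
    u_se, u_ee = partial_swap(tau), partial_swap(tau_e)
    rho_se = u_se @ np.kron(rho_s, rho_e) @ u_se.conj().T
    rho_s1 = partial_trace(rho_se, keep=[0], n=2)
    rho_e1 = partial_trace(rho_se, keep=[1], n=2)
    rho_ee = u_ee @ np.kron(rho_e1, rho_e) @ u_ee.conj().T
    rho_e2 = partial_trace(rho_ee, keep=[1], n=2)        # modified E_{n+1}
    rho_se2 = u_se @ np.kron(rho_s1, rho_e2) @ u_se.conj().T
    return partial_trace(rho_se2, keep=[0], n=2)


rho_s0 = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)   # illustrative state
rho_e0 = np.diag([0.2, 0.8]).astype(complex)
out_1 = strategy_1(rho_s0, rho_e0, np.pi / 42, np.pi / 4)
out_2 = strategy_2(rho_s0, rho_e0, np.pi / 42, np.pi / 4)
```

For `tau_e = 0` both functions reduce to two successive memoryless collisions with fresh environment qubits, which provides a simple consistency check of the construction.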
To quantify the degree of non-Markovianity of the reduced dynamics undergone by S, we employ the measure of non-Markovianity proposed in [9], which is associated with the back-flow of information from the environment to the system. This is based on the time behaviour of the trace distance between two different initial quantum states of S, that is
$$D(\rho_1, \rho_2) = \frac{1}{2}\,\|\rho_1 - \rho_2\|_1, \tag{26}$$
where $\|\rho\|_1 = \mathrm{Tr}\sqrt{\rho^{\dagger}\rho}$ is the trace norm of operator $\rho$ and $\rho_{1,2}$ are two density matrices of S. For Markovian dynamics, $D(\rho_1, \rho_2)$ monotonically decreases with time for any pair of initial states $\rho_{1,2}(0)$. On the contrary, a dynamical process is signalled as non-Markovian if there is a pair of such states for which this quantity exhibits a non-monotonic behaviour.
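As a hedged numerical illustration of this criterion, the sketch below computes the trace distance for Hermitian differences and tracks it under repeated memoryless collisions with fresh reservoir qubits, for which it cannot increase. The partial-SWAP convention $\cos\tau\,\mathbb{1} + i\sin\tau\,U_{\rm sw}$, the Hamiltonians of the form $\omega\sigma_j/2$, and the helper names are assumptions made for the example; the numerical values are those listed in the caption of figure 4.

```python
import numpy as np

SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0],
                 [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)


def trace_distance(rho_1, rho_2):
    """D = (1/2) ||rho_1 - rho_2||_1, via eigenvalues of the Hermitian difference."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho_1 - rho_2))))


def gibbs(sigma, omega, beta):
    """Thermal state of H = omega * sigma / 2 for a Pauli matrix sigma."""
    return 0.5 * (np.eye(2, dtype=complex) - np.tanh(0.5 * beta * omega) * sigma)


def collide(rho_s, rho_r, tau):
    """One memoryless collision: partial SWAP (assumed form), reservoir traced out."""
    u = np.cos(tau) * np.eye(4) + 1j * np.sin(tau) * SWAP
    joint = u @ np.kron(rho_s, rho_r) @ u.conj().T
    return np.einsum('ikjk->ij', joint.reshape(2, 2, 2, 2))


beta, tau = 0.94, np.pi / 42
rho_1, rho_2 = gibbs(SZ, 1.0, beta), gibbs(SY, 1.0, beta)
rho_r = gibbs(SZ, 3.0, beta)

distances = []
for _ in range(60):
    distances.append(trace_distance(rho_1, rho_2))
    rho_1, rho_2 = collide(rho_1, rho_r, tau), collide(rho_2, rho_r, tau)
```

A revival, that is any step at which an entry of `distances` exceeds its predecessor, would signal information back-flow. It cannot occur in this memoryless setting and requires the intra-environment collisions of section 3.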

Analysis of non-Markovianity and its role in the performance of the engine
We now present the numerical analysis of the non-Markovian dynamics of the collision model for both strategies described above, and then their role in the thermodynamics of the engine. In the remainder of the paper, we will assume both the system and the reservoir to be two-level systems with Hamiltonian $H_i = \omega_i\,\sigma_j^{i}/2$, where $j = x, y, z$ is a label for the $j$-Pauli spin operator of particle $i = S, R$, and $\beta_i$ is the corresponding inverse temperature. We remark that, provided that the frequencies are positive ($\omega_R > \omega_S$) and the inverse temperatures are the same, the results presented hold qualitatively.

Non-Markovianity features from both strategies
We numerically analyze the behaviour of the trace distance $D(\rho_1, \rho_2)$ as the collision-based model for the system-environment interaction is repeatedly executed. This analysis elucidates how to make the dynamics non-Markovian using the different strategies described in section 3, and corresponds to the first two steps of the engine protocol, see section 2.1. We present the behaviour of the trace distance in equation (26) for two initial states prepared in $\rho_S^{1}(\sigma_z^{S})$ and $\rho_S^{2}(\sigma_y^{S})$. We have assumed that all environmental particles/qubits are initialized in the state $\rho_R(\sigma_z^{R})$. A large value of the trace distance corresponds to distinguishable states, while a null value is achieved when the states are identical. Figures 3(a) and (b) show the differences between the two strategies addressed in this study. For purely Markovian dynamics ($\tau_e = 0$, red dotted curves), the trace distance decreases monotonically, while switching on the inter-environment interaction ($\tau_e \neq 0$, blue dashed and green dot-dashed curves) results in revivals that are evidence of non-Markovianity. In fact, the intra-environment interaction produces a backflow mechanism, which is seen as oscillations of the trace distance that fade out after a large number of collisions with fresh ancillae. We observe that the strong environment-environment interaction time $\tau_e = \pi/4$ corresponds to a full state-swap between two consecutive environment particles, which results in a non-vanishing trace distance, see the green dot-dashed curves in figures 3(a) and (b). It can be seen that the oscillations are more persistent in strategy-1 (figure 3(a)) but fade out to a non-zero value in strategy-2, see figure 3(b). While the non-Markovian dynamics persists for both strategies under strong intra-environment interaction, the intermediate coupling strength shows a clear dependence of the non-Markovian nature on the way information/correlation is developed via collisions.
For weaker environment-environment interaction times, $\tau_e < \pi/4$, the trace distance decreases for both strategies as the number of environmental collisions increases, see the blue dashed curves in figure 3. For a more extensive discussion of the way information is exchanged between the system and environment in the two strategies, and of their differences and relative merits, see [54,66].

Feedback-driven engine analysis
Let us now evaluate the influence of non-Markovianity on the performance of the measurement-based machine described in section 2 above. We consider a two-level system initially prepared in the thermal state $\rho_S$ of Step 1, and a feedback rotation is performed on the system state according to the measurement outcome. The thermodynamic quantities, such as the work extraction and the quantum information gain, are calculated numerically and presented in figure 4. The quantum information gain gives a measure of the correlations between the system and the measurement apparatus, while the work extraction deals with the energy exchange during the preparation, measurement and feedback operation. Note that a deeper and more complete theoretical explanation of the link between the thermodynamic quantities is still missing.
In figure 4, the feedback-engine performance, namely the work performed by the engine protocol and the corresponding quantum mutual information associated with the measurement step, is presented as a function of the number of repeated collisions for the two non-Markovian strategies described above. For Markovian dynamics ($\tau_e = 0$, red dotted curves in figures 4(a) and (b)), the total work done and the quantum mutual information increase with the number of system-environment interactions until they reach constant values after many collisions. For strategy-1, figure 4(a), when the system dynamics is made non-Markovian, both engine performance quantities (work done and information gain) display an oscillatory behaviour that vanishes in the limit of many collisions. The non-Markovian features are strong at short collision times, where both quantities can exceed their Markovian counterparts. However, intermediate numbers of system-environment iterations are marked by a suppression of the engine performance due to memory effects. For non-swap environment-environment interactions (e.g. $\tau_e = 10\pi/43$), the total work done and information gain approach the Markovian values after a large number of collisions, see figure 4(a). This results from the reduction of information back-flow, and the saturation point corresponds to the collision number at which the thermodynamic quantities of the preparation step ($\Delta E_u$, $Q_u$ and $W_u$) vanish. In addition, we remark that including the work done on/by the system during the preparation (Steps 1 and 2) does not affect our results qualitatively. Moreover, the work done on/by the system during the preparation, $W_u$, exhibits a similar oscillatory behaviour but alternates between positive and negative values, see the right panel of figure 4(a). Figure 4(b) shows the work extraction and information gain through measurement resulting from the implementation of strategy-2.
We observe that this non-Markovian scenario ($\tau_e \neq 0$) gives rise to non-oscillatory behaviour, contrary to strategy-1, and that the work extraction and information gain never exceed their Markovian values. This behaviour is akin to what is observed for the trace distance in figure 3(b), in which the strategy-2 oscillations are short-lived. Interestingly, for the strong environment-environment interaction time $\tau_e = \pi/4$, the total work done and information gain saturate to a finite value that is lower than in the Markovian case, see the green curves in figure 4(b). Likewise, the saturation occurs at a vanishing change in the system work done, $\Delta W_u = 0$. For more iterations with fresh environments under the weaker environment-environment interaction time $\tau_e = 10\pi/43$, the quantities attain the Markovian values. However, the two strategies require different numbers of environment collisions to reach the Markovian values.
Strategy-1 is evidently superior to strategy-2 in setting a nonzero degree of non-Markovianity as well as an oscillatory behaviour of the engine performance (work done and information gain). The oscillations may lead to enhanced work and information gain compared to the Markovian dynamics ($\tau_e = 0$). Interestingly, even the small oscillatory behaviour observed for the strong environment-environment interaction time in strategy-2 (see the trace distance in figure 3(b)) does not leave a trace in the performance analysis. However, the work resulting from the system-environment interaction, $W_u$, has a behaviour that is reminiscent of the quantum information gain of the engine protocol. We remark that the differences depend on the way system-environment correlations are accounted for at the system preparation stage (i.e. Steps 1 and 2).

Conclusion
We have investigated the interplay between memory effects and the performance of a feedback-driven quantum engine. The engine setup consists of a system, a reservoir and a measurement probe, which we have modelled as a set of two-level systems. We have employed the trace distance as a measure of memory effects (non-Markovianity) to illustrate two strategies for realizing non-Markovian dynamics. We have observed that memory effects can enhance the performance (work and information gain) of a feedback-driven engine for a small number of system-environment collisions. However, the performance decreases as the number of collisions grows and approaches the Markovian value for a very large number of collisions. Besides shedding light on the interplay between non-Markovianity and measurement-driven engines, this study suggests that more theoretical effort is needed to understand the role of memory in information thermodynamics. Furthermore, it will be interesting to understand the influence of non-Markovian dynamics that arise from the intrinsic uncertainties associated with measurement, e.g. quantum projection noise [69].

Figure 4. Feedback-driven engine performance: the total work done $W_t$, the quantum information gain $I_g$ and the work done during the preparation step $W_u$ as a function of the number of collisions $n$ with the environment. The upper panel (a) corresponds to strategy-1, while the lower panel (b) is for strategy-2. The red dotted curve corresponds to the Markovian dynamics, $\tau_e = 0.0$, while the blue dashed curve represents the non-Markovian dynamics with $\tau_e = 10\pi/43$. The green dot-dashed curve represents the full-swap non-Markovian dynamics, $\tau_e = \pi/4$. The system-environment interaction time is $\tau = \pi/42$ for weak coupling, and the system and environment frequencies are $\omega_S = 1$ and $\omega_R = 3.0$, respectively. The system-probe interaction time is $\tau_m = \pi/14$ and $\beta_S = \beta_R = 0.94$. In addition, the heat associated with the pre-measurement step is $Q^{M}_{pm} \approx \pm 4\times 10^{-16}$ for the chosen parameters.