Fully differentiable optimization protocols for non-equilibrium steady states

In the case of quantum systems interacting with multiple environments, the time-evolution of the reduced density matrix is described by the Liouvillian. For a variety of physical observables, the long-time limit or steady state solution is needed for the computation of desired physical observables. For inverse design or optimal control of such systems, the common approaches are based on brute-force search strategies. Here, we present a novel methodology, based on automatic differentiation, capable of differentiating the steady state solution with respect to any parameter of the Liouvillian. Our approach has a low memory cost, and is agnostic to the exact algorithm for computing the steady state. We illustrate the advantage of this method by inverse designing the parameters of a quantum heat transfer device that maximizes the heat current and the rectification coefficient. Additionally, we optimize the parameters of various Lindblad operators used in the simulation of energy transfer under natural incoherent light. We also present a sensitivity analysis of the steady state for energy transfer under natural incoherent light as a function of the incoherent-light pumping rate.

In the case of quantum systems interacting with multiple environments, the time-evolution of the reduced density matrix is described by the Liouvillian. For a variety of physical observables, the long-time limit or steady state solution is needed for the computation of desired physical observables. For inverse design or optimal control of such systems, the common approaches are based on bruteforce search strategies. Here, we present a novel methodology, based on automatic differentiation, capable of differentiating the steady state solution with respect to any parameter of the Liouvillian. Our approach has a low memory cost, and is agnostic to the exact algorithm for computing the steady state. We illustrate the advantage of this method by inverse designing the parameters of a quantum heat transfer device that maximizes the heat current and the rectification coefficient. Additionally, we optimize the parameters of various Lindblad operators used in the simulation of energy transfer under natural incoherent light. We also present a sensitivity analysis of the steady state for energy transfer under natural incoherent light as a function of the incoherent-light pumping rate.

I. INTRODUCTION
Nano-scale devices are commonly described as quantum systems that interact with multiple environments or baths. Their performance is usually quantified through quantum observables of the form, Ô (t) = Tr[Ôρ(t)], where ρ(t) is the reduced representation of the quantum system's state at time t [1]. For systems such as quantum heat engines, batteries, and incoherently excited exciton transport systems, the observables of interest depends on the long-time limit, where ρ ss is the steady state (SS) solution which satisfies dρ ss (t) dt = 0. To solve for the steady state, the timeevolution of ρ is usually described by a quantum master equation of the form, where F is the Liouvillian, and Θ ∈ R d represents any set of d-parameters used to construct the Liouvillian, e.g., bath parameters such as the decay rates for Lindblad operators, temperature(s) of the bath(s), and system-bath parameters [1]. From the above equations, it is clear that ρ ss and O ρ ss both directly depend on the functional form of the Liouvillian and the value of the parameters Θ. By knowing the gradient of ρ ss with respect to any parameter of F , we could understand more in depth the effect Θ has on ρ ss and O ρ ss .
The study and optimization of non-equilibrium steady state systems has only been done by brute force search or physically motivated methods, Refs. [45][46][47][48]. This is due to two main difficulties, i) the need to solve for ρ ss a large number of times, and ii) inefficient numerical techniques for computing dρ ss dΘ and d O ρ ss dΘ . For the former problem, there have been recent works on how to alleviate the large cost in obtaining ρ ss by parametrizing the steady state solution [49][50][51][52], and determining the Liouvillian gap [53] using various deep learning architectures. Additionally, the memory kernel has also been approximated with deep learning methodologies [54][55][56][57].
Gradient based methods, based on dρ ss dΘ and d O ρ ss dΘ , could facilitate the navigation/search for the optimal parameters Θ * in the steady states where the O ρ ss observable is maximum/minimum. While deep learning based approaches could alleviate some of the computational cost associated with obtaining ρ ss , none of these methodologies can improve the computation of dρ ss dΘ since they were not designed to learn the relation between the steady state and the Liouvillian's parameters, ρ ss = f (Θ). The work presented here introduces a new route to efficiently compute these quantities by combining automatic differentiation [58] and the implicit function theorem [59].
The sections of the paper are organized as follows, Section II presents the methodology. Section III contains the results and discussion for the optimization of a quantum heat transfer device and the energy transfer efficiency in a exciton model. Lastly, the summary is in Section IV.

II. METHOD
In general, any physical observable that depends on the steady state is a scalar function, e.g. Ô ρ ss = g(ρ ss , Θ) (Eq. 1), whose gradient with respect to the parameters can be decomposed with the chain rule, where terms of the form ∂Ô(Θ) ∂Θ can be computed efficiently with automatic differentiation [58], or using closed form expressions when available. On the other hand, the gradient of the steady state with respect to some parameters, dρ ss dΘ , is not readily available, and the standard ways of obtaining these gradients is either analytically (when possible) or via finite differences. However, the latter approach requires computing dρ ss dΘi for each single parameter separately, for a total of O(d) evaluations of the steady state, where d is the number of free-parameters in the Liouvillian, Θ ∈ R d . This makes the finite difference approach intractable for systems that depend on a larger number of parameters.
Computing gradients through the chain rule is the role of an automatic differentiation (AD) framework, though we note that naïvely using AD is insufficient for computing dρ ss dΘ . While we can solve for ρ ss by running an appropriate ODE solver for a sufficiently long period of time, differentiating through the internals of the ODE solver is prohibitive as it requires storing all intermediate quantities of the solver. There exists low-memory methods for computing gradients of ODE solutions but they require either the trajectory ρ(t) to be stored in memory or solving ρ(t) in reverse time for constant memory [31,32,60]. However, the reversing approach is not applicable as the steady state, once reached, cannot be reversed.
To differentiate the steady state solution with respect to any parameter with constant memory usage, we view ρ ss as the solution of a fixed point problem, By differentiating both-sides of Eq. 4 and solving for dρ ss /dΘ we obtain, The implicit function theorem [59] (Eq. 5) permits us to exactly compute dρ ss dΘ without knowing the explicit or analytic dependence of ρ ss on Θ, ρ ss = f (Θ). The full derivation of Eq. 5 is presented in Section I-A in the Supplemental Material.
For the inverse design of non-equilibrium quantum systems using gradient based methods, we require the Jacobian of O ρ ss with respect to any parameter (Eq. 3). Specifically, we only require vector-Jacobian products (VJP) of the form v dρ ss dΘ . In the context of computing Eq. 3, v is the vectorized form ofÔ(Θ). For any vector v, the vector-Jacobian product for dρ ss dΘ is given by the implicit function theorem as, where the term v is the solution of a linear set of equations of the form, v = y ∂F (ρ ss ; Θ) ∂ρ .
The Jacobian J ρ := ∂F (ρ ss ;Θ) ∂ρ is a m 2 ×m 2 matrix, where m is the number of degrees of freedom to describe a quantum system, and it could be efficiently inverted only for small quantum systems. Additionally, to compute J ρ using automatic differentiation, we must to evaluate the Liouvillian O(m 4 ) times, which could be computationally expensive for larger quantum systems.
The approach presented here avoids the computation of J −1 ρ by realizing that v J −1 ρ is the steady state solution of second ODE of the form, where the steady state solution (y ss ) is, Notably, simulating this second ODE only requires vector-Jacobian products of the form y (∂F (ρ ss ; Θ)/∂ρ), that can be efficiently computed with AD. Thus, to differentiate the steady state solution of a QME, we essentially solve two steady state problems, both of which can be computed using any black-box steady state solver. For all our simulations we used an adaptive step size Runge-Kutta algorithm of order 5 [61] to solve for the steady state. Steady state solutions obtained by this approach were checked by exact long time computations. In summary, we propose an efficient algorithm for computing v dρ ss dΘ in order to allow any black-box steady state solver to be placed within an AD framework, such as JAX [62]. For more details about AD and the implicit function theorem we refer the reader to Ref. [58] and Section I-A in the Supplemental Material.
The implicit function theorem for ODE steady states was first proposed by F. J. Pineda in 1987 to generalize the back propagation algorithm for recurrent neural networks [63]. However, the methodology presented here is a generalization of that work since it does not depend on analytic forms for the Jacobian of the Liouvillian.
To summarize, we propose a method of efficiently computing exact derivatives for all parameters simultaneously with just a single evaluation of ρ ss , compared to finite difference approximations computed using O(d) evaluations. The computation of dρ ss dΘ is independent of how we solve for ρ ss . The implementation of the proposed algorithm is available online [64]. In the following sections, we carry out inverse design and sensitivity analysis by differentiating through the steady state for the optimization of quantum heat and energy transfer systems.

A. Redfield theory: Model systems
Quantum heat transfer (QHT) models have been proposed as rich systems to study quantum effects in thermodynamics [45,[65][66][67][68]. QHT models are commonly studied within the framework of open quantum systems where various approaches of the quantum master equation have been applied. In the limit of weak interactions between the system and the baths, the time evolution of a QHT model can be described in a perturbative manner using Redfield theory (RT). RT assumes that the baths are prepared in a canonical thermal state, that the system-bath interaction can be factorized, and that the environments are Markovian. The Redfield master equation for multiple non-interacting baths is [1,19], where D α is the dissipator for each α environment, and the Redfield tensor for the α environment is R α . In the Markovian limit and neglecting the Lamb shift, the R α µ,ν,κ,λ are, where the transition rates α Γ ± are, where S α is the interaction of the system with the α−bath and Υ α (ω) is defined as, Each Redfield tensor depends on the systemenvironment interactions, the spectral density function, G α (ω) = γ α ωe −ω/ωc , and the average phonon occupation number, n α (ω). [µ, ν, κ, λ] are the index of the eigenstates of H S , i.e. they satisfy H S |µ = µ |µ , and ω µ,ν is the difference between eigenvalues µ and ν . Additionally, each environment is characterized by the temperature T α and a friction coefficient γ α . For more details about Redfield theory we refer the reader to the Supplemental Material and to Refs. [1,18,19].
Quantum heat transfer models are characterized by the change in the system's energy, d H S /dt = Tr H Sρ (t) , corresponding to the heat flow. This equality only holds if the system's Hamiltonian H S is time independent. In the long-time limit,ρ(t) = 0, the energy exchange is comprised of the flow of heat, where the rate of heat exchange with the α-bath is given by, If one tries to inverse design the system, baths, or the system-baths interactions to maximize J α , one needs to understand the effect each parameter has in the QME, e.g., ∂D α ∂θ or ∂ρ ss ∂γα . Here, we combine automatic differentiation and Eq. (5), to inverse design the heat current using gradient based methods. We illustrate this new methodology on a QHT model with a three-level system interacting with three baths, where the system Hamiltonian is, θ and J describe the energy of sites |1 and |2 and the hopping between them. We set g = 0 for reference. The shorthand "h.c." denotes the Hermitian conjugate of former terms in the expressions. For the construction of the Redfield tensors we invoked the Markovian approximation, and each environment's state is described with an ohmic spectral density function and a friction coefficient γ α . We also neglect the Lamb shift. For more details see the Supplemental Material.
The most general type of interactions with the hot (H), cold (C) and decoherence (D) baths are, where [a 0 , a 1 ] and [b 0 , b 1 ] are the coupling strength parameters to the H and C bath, left panel Fig. 1. The D−bath is a control mechanism to study the role of coherences between the sites. For this system, analytic results for J α can only be derived in the secular limit [45]. These system-bath interactions (Eqs. [17][18][19] are needed to construct the transition rates α Γ ± (Eq. 13). The parameters of this QHT model can be efficiently optimized by maximizing the heat exchange with the hot bath, J H (Eq. (15)). For these simulations, the temperatures are held fixed for all three baths, T H = 0.15, For each optimization procedure, all initial parameters were randomly sampled. Values of a i and b i were constrained to [0,1] to avoid physically incorrect models, and the values of θ and J were constrained to the positive domain by casting them as the exponential function of unconstrained variables. To maximize J H , we used the Adam [69] optimization algorithm with a learning rate of 0.02.
In Ref. [45], this system was studied using fixed parameters a 0 = b 1 = 1 and a 1 = b 0 = 0. Through optimization, we found that the majority of the optimized systems also recovered these parameter values (Model A). However, the remainder of the optimized results indicate that J H is maximum when both baths, hot and cold, interact with only one site, e.g., Fig. 2 contains a collection of optimizations, with the optimized values of θ and J indicated and those of the remaining parameters not explicitly shown. In addition, the upper inset shows that all of these optimizations converge to essentially the same value of J H . It is worth noticing that independent of the initial parameters, J H > 1.85 × 10 −5 after 200 iterations; Fig. 2 upper inset panel.
Our QHT model considers any linear combination of interactions between both hot and cold baths with any site. It is not a surprise that J H is maximized when a i ≈ 0 or b i ≈ 1, since a stronger system-bath interaction increases the heat transfer. However, by being able to independently optimize a i and b i , we found that the magnitude of the heat exchange for Model B is the same as for Model A, making it an interesting future route for further examination since it has never been studied.
We stressed that the parameters for both Models A and B where found by maximizing J H with Adam [69], a first order gradient descent method. The optimal value of the J parameter for Model A is J ≈ 0.018, and for Model B 0.0025 < J < 0.03. For both systems, Model A and B, the optimal value of the energy of sites |1 and |2 is θ ≈ 0.3895. The optimized parameters are reported in Table I in the Supplemental Material.
γ H and γ C are the friction parameters that describe the strength of the coupling to each bath. For each individual optimization we found that at the end of the search procedure, γ H /γ C ≈ 1. This is an interesting property that can depend on the fixed values of the temperatures. We also found that the value of ∂J H /∂γ D is zero in the regions where J H is maximum. Given the weak-interacting assumption inherent within Redfield theory, the maximum value allowed for γ i is 0.0025 and the initial values for these parameters were only sampled from [10 −5 , 10 −4 ]. The gradient of ∂J H /∂γ H and ∂J H /∂γ C show that their values must increase in order to maximize the heat exchange.
For this case, each iteration requires only two steady state evaluations with our approach, while a finite difference approach would have needed 2×|Θ| = 18 evaluations of the steady state. Additionally, each gradient-based optimization took less than 150 iterations to find optimal parameters with high precision, Fig. 2. A standard gridsearch approach of 10 points for each parameter would have required over 10 9 steady state evaluations.
We also considered a quantum heat transfer model with non-degenerate states, (20) using the same procedure and the same system-bath interactions, Eqs. 18-19. We found that degenerate systems, θ 1 /θ 2 ≈ 1 are the systems with the highest FIG. 2. The symbols in the main and lower inset panels represent the optimal set of parameter that maximize the heat exchange for the hot bath (JH ), Eq. 15, for a quantum heat transfer system, (Fig. 1). We categorize the optimal set of parameter into two different cases, i) model A for a0 = b1 ≈ 1 or a1 = b0 ≈ 1, and ii) model B for ai = bi ≈ 1. Model A the hot (H) and cold (C) baths interact with different sites, and for model B the H and C baths interact with the same site. All parameters, Θ = [θ, J, a0, a1, b0, b1, γH , γC , γD], were optimized with the Adam algorithm. The upper inset figure depicts the value of the heat exchange during the optimization procedure, where the initial values of Θ were sampled randomly. We only considered 100 random initializations of Θ.
J H . The optimal values of using separate parameters for each site (θ 1 and θ 2 ) were similar to the optimized value of using a single on-site energy parameter (θ) to describe both sites in Eq. 16. The optimal value found was θ i = θ ≈ 0.389 (Fig. 2). For these degenerate Hamiltonians it was found that the hot and the cold bath interacting with different sites (Model A) was ideal. It is well known that degeneracy between sites leads to increased coherences in transport systems which in turn increases the currents. This demonstrates that even with limited input the method outlined in Sec. II can lead to correct physical models. All parameters are reported in Table II in the Supplemental Material.
Another common observable to study quantum heat transfer models is the rectification coefficient, R, which is the net heat current when the temperature difference, T H − T C , in the H and C reservoirs is reversed [70], where J H is the heat current when the hot bath interacts with |1 and cold bath with |2 , and surrogate observable was J H can be used to tune the free parameters for R. We use the same gradient-based algorithm, Adam, to maximize R. For these simulations, we only considered as free parameters, θ, J and the friction coefficients for all three baths, [γ H , γ C , γ D ]. Results of the optimal parameters, as in Fig. 2 for a set of obtained cases, are displayed in Fig. 3. As stressed through this paper, inverse designing quantum heat transfer model with modern gradient methods like Adam, requires very few number of iterations-roughly 100 or less for this case-to find optimal parameter values. For each individual optimization, we first sampled the parameters θ, J, γ H , γ C , γ D and used Adam to find the maximizer of R. All optimizations were stopped once R > 95%, which on averaged took approximately 100 iterations; lower inset in Fig. 3. Our results illustrate that there is a linear trend between θ and J. However, the optimal range of θ when R is maximum is wider than for J H , indicating that both quantum observables do not share the same set of optimal physical parameters. B. Energy transfer for the V-system As a second system, we consider a simplified model of energy transfer shown in the right panel in Fig. 1. When the radiation incident on the donor state |1 is taken to originate from an incoherent source, such as the sun, the system is a useful minimal model of biological energy transfer. The steady state efficiency, η loc , is quantified by; Here ρ ss 2 is the probability of being in the site neighboring the reaction center, (|2 ), Fig. 1, Γ RC is the rate of energy transfer from the acceptor state |2 to the reaction center |RC , and r is the incoherent-light pumping rate. The time evolution of this system is modeled by, where the first term is the unitary evolution of the system, , and the rest of the terms, L i , describe the radiation (rad), the trapping of the excitons at the reaction center (RC), environmental dephasing (deph), and the recombination of the excitons (rec). See the Supplemental Material and Ref. [46] for more information.
We optimize Θ = [Γ, γ d , | 1 − 2 |, J] by maximizing η loc using the Adam algorithm, Figs. 4-5. | 1 − 2 | is the energy difference between site |1 and |2 , J is the hopping amplitude, both being system parameters. Γ corresponds to the recombination rate, and γ d is the phonon bath dephasing rate. For each optimization, we randomly sampled different values for the parameters. We fixed the values of Γ RC to Γ RC = 0.5 ps −1 and the incoherent-light pumping rate r = 6.34 × 10 −10 ps −1 ; parameters taken from Ref. [48]. Optimal set of parameters are presented in Fig. 4 and Table III in the supplemental material.
From Fig. 4, we can observe that when the value of J is small, there is a linear correlation with the difference between the energy sites, | 1 − 2 |. For γ d the optimal possible values span a wider range, from 10 9 to 10 13 Hz; however, for larger values of γ d the | 1 − 2 | optimal parameters must be greater as well. The optimal range for Γ was less wider than the rest, and interestingly, this range is aligned with the values used in physical simulations of Ref. [47]. For these simulations, the initial parameters were sampled from a region where η loc is not optimal, however, our approach managed to optimize the parameters regardless of their initial values. Throughout our optimizations, we notice that it only took approximately 20 iterations to reach η loc > 99% (Fig. 4 upper inset and left panel in Fig. 5). As we can observe from the left panel of Fig. 4, for some values of | 1 − 2 | and J the plateau where η loc is maximum is when Γ and γ d have small values. We report all optimal parameters in Table III in the Supplemental Material.
In Fig. 5 we display the optimization trajectories, using Adam, where all random initial parameters had η loc < 20% and the end result where η loc > 99%. The time evolution of ρ(t) for a pair of random and optimal set of parameters is display in the right panel of Γ RC = 0.5 ps −1 , | 1 − 2 | = 1.3 ps −1 , and r = 6.34×10 −10 ps −1 [47]. As it can be observed, there is a significant difference in ρ ss for the random initial parameters and the optimized ones.
As we pointed above, our algorithm allows us to compute the vector-Jacobian product of the steady state with respect to any parameter of the Liouvillian. So far, the main application has been the inverse design of open quantum systems by maximizing/minimizing quantum observables using gradient based methods. However, the Jacobian can also be used to understand the effect a set of parameters have on a quantity of interest, sensitivity analysis. For non-equilibrium steady state systems and before this work, the Jacobian of the steady state was only computed when analytic solutions were available, or by finite differences. Here, as the last example, we study the effects of the attenuation of the incident radiation which is important since photon absorbing centers are found in a variety of environments [71][72][73]. By computing ∂ρ ss /∂r we found that ρ ss e−,e− is the most sensitive to r, i.e. ∂ρ ss e−,e− /∂r > ∂ρ ss e+,e+ /∂r, Fig. 6. Here, e ± are the eigenstates of the system Hamiltonian, H S |e ± = ± |e ± , and ρ ss e±,e∓ are the matrix elements of the steady state density matrix in the eigenbasis. Additionally, from Fig. 6 we can also observe that the imaginary part of the coherence, (ρ e+,e− ), which relates to the exciton flux [46], is less sensitive until r ≈ 0.510 −9 ps −1 , indicating that the pumping rate will not increase the flux below r ∼ 0.510 −9 ps −1 . The markers represent the optimization trajectories that the Adam algorithm follows. (lower panel) The time evolution of ρ until a steady state is reached. We considered two cases, a random initial parameters (dashed curves) where η loc ≈ 0.0, and for optimized parameters (solid curves) where η loc ≈ 1. The random initial parameters considered were, γ d = 2.88 × 10 12 Hz and Γ = 0.0194 ps −1 , and the optimized ones are γ d = 3.53 × 10 11 Hz and Γ = 7.2 × 10 −5 ps −1 . For these simulations, we fixed the values of ΓRC = 0.5 ps −1 and r = 6.34 × 10 −10 ps −1 [47].

IV. SUMMARY
Inverse design and optimal control protocols for open quantum systems that are quantified through observables that depend on the steady state, O ρ ss , are commonly done with brute-force search or inspired methods. On the other hand, gradient-based algorithms have proven to be efficient tools to minimize/maximize functions. For non-equilibrium steady state systems, the technical limitation was the inability to efficiently compute the Jacobian of the steady state with respect to any parameter of the Liouvillian; dρ ss dΘ . We circumvent this by combining automatic differentiation and the implicit function theorem. Furthermore, we believe that the present work is the first example of the application of gradient-based methods to efficiently inverse design non-equilibrium steady state systems. All systems were optimized with Adam, a first order gradient algorithm; however, the procedure proposed here, combined with automatic differentiation, can be also applied to efficiently compute the Hessian matrix, used in second order gradient optimization algorithms.
The optimal design of non-equilibrium systems is still driven by physical intuition. However, with the possibility to compute dρ ss dΘ , we could engineer more robust systems, baths, and system-bath interactions. Furthermore, this methodology could also be used to study the sensitivity of ρ ss with respect to any parameter in the Liouvillian, and, for example, could lead to more insight in how light affects biological processes.
Any inverse design protocol must create physically valid parameters. While this could be taken into account by adding some constraints to the main physical observable of interest, here we decided to take a different route by leveraging the flexibility of automatic differentiation. For example, to constrain the system-bath parameters to [0,1] we used the soft-max function [74], and for γ i 's the range was constrained to [0, 0.0025], the soft-max function times the maximum value allowed. By constraining the range of a i , b i , and γ i , the optimization of the system remains in the weak-interacting limit where the Redfield theory is valid. Similar algebraic transformations could be applied to constrain the value of other parameters to ensure a valid physical range for experimental setups.
In the cases introduced, optimization of a single target quantity (e.g. the heat transfer, or energy efficiency) was carried out in models parametrized by several quantities. As a result, numerous optimized models were obtained, each with similar values of the optimized target. That is, interestingly, the parameter surface has multiple maxima of similar depth. In cases where one is attempting to achieve optimization of multiple quantities (e.g. heat transfer and verification) the proposed methodology could be integrated into gradient-based algorithms for multi-objective optimization to construct the Pareto front [75].
In all the simulations presented here, an ODE solver was used to obtain ρ ss ; however, Eq. 5 is agnostic to the exact method used to obtain ρ ss . This makes the present methodology particularly valuable. Additionally, this method could be used to study any other system whose observables depend on a long-time solution for equations of the form of Eq. 2. Finally, this methodology opens the possibility of studying more complex quantum heat transfer devices or natural-light induced processes.
A tutorial of the method presented here is publicly available in the repository: https://github.com/ RodrigoAVargasHdz/steady_state_jax We acknowledge fruitful discussions with Professor Dvira Segal. This work was funded by components of two grants from the US Air Force Office of Scientific Research, FA9550-19-1-0267 and FA9550-20-1-0354.

Abstract
The purpose of this supplemental material is to provide details of the numerical calculations we present in this work. Section I presents the core of the algorithm, and sections II B and II A describe the details of the quantum systems used in the main work. In section III we present numerical solutions of the optimal parameters presented in the main text.

I. AUTOMATIC DIFFERENTIATION
Automatic differentiation (AD) is a framework designed to compute exact vector-Jacobian (v J) and Jacobian-vector (Jv) products [1], where J is any Jacobian and v is a vector.
The former is often referred to as reverse-mode automatic differentiation, and the latter as forward-mode. For a function f : R n → R m the full Jacobian evaluated at an input value x is a matrix of m by n: where the row vector is a shorthand notation for the matrix.
For scalar functions, f : R n → R 1 it is clear that with a single vector-Jacobian product it is possible to construct the entire Jacobian. For this case, v is the scalar 1. In our setting, the optimization objective is a quantum observable, Ô = Tr [Ôρ], which is a scalar function.
Thus, with reverse-mode AD, it is possible to construct the entire Jacobian with the same asymptotic cost as a single computation of Ô .
In this work we take advantage of efficient gradient computation to illustrate that complex quantum observables could be numerically optimized by combining AD and gradient-based optimization methods such as Adam [2]. Since quantum observables are often defined as a function of the steady state, we require efficient evaluation of dρ ss dΘ , where ρ ss is the steady state solution, dρ(t) dt = 0, of a first-order differential equation, Θ is any set of parameters of F (·), and ρ is the reduced density matrix. Here we studied the evolution of quantum systems that are affected by multiple external environments, and Θ can contain controllable parameters of the environment.
A. Fixed-point problem: computing the Jacobian for the steady state solution In the steady state limit, dρ(t) dt = 0, we can recognize that, Conditioned on a given set of parameters Θ 0 , we denote ρ ss as the solution that satisfies the above equation. By differentiating with respect to Θ 0 and evaluating at the point (ρ ss , Θ 0 ) we get, We can re-arrange this equation and obtain an expression for the term of interest: Any of the terms in Eq. 5 can be computed exactly with AD. In other words, given an Θ 0 no matter how we solve for ρ ss , we can still compute the Jacobian using just derivative information at the solution point Θ 0 . Also, the above procedure is agnostic to how the steady state is solved, and avoids differentiating through the ODE solver. The memory cost is thus constant and does not scale with the complexity of the problem.
All the simulations in this work were coded in the suite JAX [3]. Figure 1 is the computational diagram of the methodology, presented in the main work, to compute dρ ss dΘ , where Θ is the collection of parameters of F (·).

II. MODEL DESCRIPTIONS
The time-evolution of systems of interest in this work are described within an open quantum system formalism. We describe the steady state of the quantum heat transfer using Redfield theory [4,5], and for the energy transfer model for the donor-acceptor excitonic model we applied Lindblad [6][7][8]. Both systems and theories are briefly presented in the following sections. We optimize the parameters of both models using a gradient descent method (Adam). A. Heat transfer model The system Hamiltonian for a quantum heat transfer model is, where θ and J describe the energy of sites |1 and |2 and the hopping between them. We set g = 0 for reference. This system will be interacting with two baths and is treated using Redfield theory (RT), which is a second-order time-local master equation [4,6]. For multiple non-interacting baths with factorized initial conditions the Redfield quantum master equation for states |g and |i , where D α is the dissipator for each α environment, and the Redfield tensors for the α environment R α , are, where α Γ ± are the transition rates, Ω α (τ ) is the correlation function for the α-bath, and λ|S α |ν are the matrix elements of the system-bath interaction (S), where G α (ω) is the spectral density function, and n α (ω) is the average phonon occupation number for the α−bath, characterized by the temperature T α . [µ, ν, κ, λ] label the eigenstates of H S , and ω µ,ν is the difference between eigenvalues µ and ν . Specific functional forms for S are described below.
The spectral density function of each bath is taken to be of the Ohmic form with an exponential cutoff, The golden rule rates in Eq. 11 can be solved in closed form in the Markovian limit to give where ∆ α is the Lamb shift, and The heat transfer is characterized by the rate of heat exchange with the hot-bath, where D H is the dissipator of the hot-bath, Eqs. 9-10.
We also computed the rectification coefficient, where J H is the heat current when the hot bath interacts with site one and the cold bath with site two, while J H is computed by swapping the temperatures of the H and C baths. As we illustrate in the main draft, both functions, J H and R, can be maximized using gradient ascent based methods.

B. Energy transfer model
The quantum master equation for a donor-acceptor exciton composite system is, where L 0 [ρ] = −i Ĥ S ,ρ S andĤ S is the Hamiltonian of the system of interest. For this excitonic system, where i are the site-energies for each state |i , and J is the hopping constant [9][10][11]. The energy eigenstates of H S are |e ± ; H S |e ± = ± |e ± .
In energy transfer scenarios one needs a metric to characterize the amount of energy transferred through the system. This is quantified by the exciton efficiency Clearly η loc is a function of the parameters of the quantum master equation, Eq. 23, and ρ ss 2 is the density matrix of site |2 in the steady state limit.
To optimize η loc with gradient descent we require the knowledge of ∂ρ ss 2 ∂Θ i , where Θ i is any parameter of the quantum master equation.
A code for the energy transfer system is publicly available through the following repository: https://github.com/RodrigoAVargasHdz/steady_state_jax.

III. RESULTS
In this section we present the values of the optimal models for a quantum heat transfer model.
The most general type of interactions S α (Eq.14) with the hot (H), cold (C) and decoherence (D) baths are, where [a 0 , a 1 ] and [b 0 , b 1 ] are the coupling strength parameters to the H and C bath. The D−bath is a control mechanism to study the role of coherences between the sites. Analytic results for J α can only be derived in the secular limit [13].  Table I. For Table I, we can find that the optimal models could be categorize in two groups, where the hot and cold bath mainly interact with independent sites, (i) a 0 = b 1 ≈ 1 and a 1 = b 0 ≈ 0 or vice versa, and (ii) when both hot and cold bath mainly interact with one site a i = b i ≈ 1 and a j = b j ≈ 0. This trend is discussed in greater detail in the main text.
For the quantum heat transfer system were we maximize the rectification coefficient, we optimized θ, J, γ H , γ C , and γ D . We held fixed T H = 0.15, T C = 0.1, and T D = 0.12. We use the Adam optimizer with a learning rate = 0.01. For these simulations the hot and cold bath only interact with opposite sites. θ and J were restricted to only positive values by an exponential function. We report the collection of optimal parameters in Table II.
From Table II, we can observe that for values where R > 0.95, there exists a strong linear correlation between the energy of sites |1 and |2 θ, and the hopping parameter J; Figure   3 in the main text. Adam. For each set of parameters, the initial values were sampled randomly. We held fixed T H = 0.15, T C = 0.1, and T D = 0.12.   Table III, we can observe that the optimal values of Γ span in the range between 10 −3.8 and 10 −3.17 ps −1 . We can also observe that J and | 1 − 2 | are linearly correlated when | 1 − 2 | < 0.5 ps −1 and J < 3.0 ps −1 . Furthermore, γ d is the parameter that span the largest range for efficient steady state transport, 10 9 < γ d < 10 13 Hz. Each independent optimization was stopped when the steady state efficiency was greater than 99%, which for the majority of the trajectories took approximately 20 iterations with Adam.