EPR-Net: constructing a non-equilibrium potential landscape via a variational force projection formulation

ABSTRACT We present EPR-Net, a novel and effective deep learning approach to a crucial challenge in biophysics: constructing potential landscapes for high-dimensional non-equilibrium steady-state systems. EPR-Net leverages an elegant mathematical fact: the desired negative potential gradient is simply the orthogonal projection of the driving force of the underlying dynamics in a weighted inner-product space. Remarkably, our loss function is intimately connected with the steady entropy production rate (EPR), enabling simultaneous landscape construction and EPR estimation. We introduce an enhanced learning strategy for systems with small noise, and extend our framework to include dimensionality reduction and state-dependent diffusion coefficients in a unified fashion. Comparative evaluations on benchmark problems demonstrate the superior accuracy, effectiveness, and robustness of EPR-Net compared with existing methods. We apply our approach to challenging biophysical problems, such as an eight-dimensional (8D) limit cycle and a 52D multi-stability problem, obtaining accurate solutions and interesting insights into the constructed landscapes. With its versatility and power, EPR-Net offers a promising solution for diverse landscape construction problems in biophysics.

Since Waddington's famous landscape metaphor on the development of cells in the 1950s [1], the construction of potential landscapes for non-equilibrium biochemical reaction systems has been recognized as an important problem in theoretical biology, as it provides insightful pictures for understanding the complex dynamical mechanisms of biological processes. This problem has attracted considerable attention in recent decades in both the biophysics and applied mathematics communities. To date, several approaches have been proposed to realize Waddington's landscape metaphor in a rational way; see [2-10] and the references therein for details and [11-14] for reviews. Broadly speaking, these proposals can be classified into two types: (T1) the construction of the potential landscape in the finite noise regime [3-5] and (T2) the construction of the quasi-potential in the zero noise limit [2,6-9].
For low-dimensional systems (i.e., dimension less than 4), the potential landscape can be numerically computed either by solving a Fokker-Planck equation (FPE) with grid-based methods until the steady solution is approximately reached, as in (T1) type proposals [3,5], or by solving a Hamilton-Jacobi-Bellman (HJB) equation using, for instance, the ordered upwind method [15] or a minimum action type method [8], as in (T2) type proposals. However, these approaches suffer from the curse of dimensionality when applied to high-dimensional systems. Although methods based on mean field approximations can provide a semi-quantitative description of the energy landscape for typical systems [4,16], direct and general approaches are still favored in applications. In this respect, pioneering work has been done recently, which allows direct construction of high-dimensional potential landscapes using deep neural networks (DNN), based on either the steady viscous HJB equation satisfied by the landscape function in the (T1) case [17,18], or the pointwise orthogonal decomposition of the force field in the (T2) case [19]. These works have brought significant methodological advances in both cases. However, approaches based on solving HJB equations alone may encounter numerical difficulties due to the non-uniqueness of the weak solution to the nonviscous HJB equation in the (T2) case [20], and challenges in solving the steady HJB equation with small noise in the (T1) case.

Setup. In this letter, we present a simple yet effective DNN approach, EPR-Net, for constructing the potential landscape of high-dimensional non-equilibrium steady state (NESS) systems of (T1) type. Our key observation is that the negative potential gradient is the orthogonal projection of the driving force under a weighted inner product with respect to the steady-state distribution.
To be specific, let us consider the stochastic differential equations (SDEs)

ẋ = F(x) + √(2D) ẇ, x(0) = x₀, (1)

where x₀ ∈ R^d, F: R^d → R^d is a smooth function, ẇ = (ẇ₁, …, ẇ_d) is the d-dimensional temporal Gaussian white noise with Eẇ_i(t) = 0 and E[ẇ_i(t)ẇ_j(s)] = δ_ij δ(t − s) for i, j = 1, …, d, s, t > 0, and D > 0 is the noise strength, which is often related to the system's temperature T by D = k_B T, where k_B is the Boltzmann constant. We assume that (1) is ergodic and denote by p_ss(x) its steady-state probability density function (PDF). We follow the (T1) type proposal in [3] to derive the potential landscape of (1) in the case D > 0. That is, we define the potential U = −D ln p_ss and the steady probability flux J_ss = p_ss F − D∇p_ss in the domain Ω, which we assume for simplicity is either R^d or a d-dimensional hyperrectangle. The steady-state PDF p_ss(x) satisfies the Fokker-Planck equation (FPE)

∇·(p_ss F) − DΔp_ss = 0, for x ∈ Ω, (2)

and we assume the asymptotic boundary condition (BC) p_ss(x) → 0 as |x| → ∞ when Ω = R^d, or the reflecting boundary condition J_ss · n = 0 when Ω ⊂ R^d is a d-dimensional hyperrectangle, where n is the unit outer normal. In both cases, we have p_ss(x) ≥ 0 and ∫_Ω p_ss(x) dx = 1.

Learning approach. Aiming at an effective approach for high-dimensional applications, we employ DNNs to approximate U(x), and the key idea in this letter is to learn U by training a DNN with the following loss function

L_EPR(V) := ∫_Ω |F(x) + ∇V(x)|² dπ(x), (3)

where V := V(x; θ) is a neural network function with parameters θ [21], and dπ(x) = p_ss(x) dx. To justify (3), we note that U satisfies the important orthogonality relation: for any suitable function W,

∫_Ω (F + ∇U) · ∇W dπ = 0. (4)

Therefore, U(x) is the unique minimizer (up to a constant) of the loss L_EPR and, moreover, the negative potential gradient −∇U is in fact the projection of the force field F in the π-weighted Hilbert space. See Sec. A and B in the Supplemental Material (SM) for detailed derivations.
The minimum loss L_EPR(U) has a clear physical interpretation. Indeed, we have (see SM Sec. B)

L_EPR(U) = D e_p^ss, (5)

where e_p^ss denotes the steady entropy production rate (EPR) of the NESS system (1) [3,22,23]. Therefore, minimizing (3) is equivalent to approximating the steady EPR. This explains the name EPR-Net of our approach.
To utilize (3) in numerical computations, we replace the spatial integral in (3) with respect to the unknown π by its empirical average over data sampled from (1):

L_EPR(θ) ≈ (1/N) Σ_{i=1}^N |F(x_i) + ∇V(x_i; θ)|², (6)

where (x_i)_{1≤i≤N} could be either the final states (at time T) of N trajectories starting from different initializations or equally spaced time series along a single long trajectory up to time T, where T ≫ 1. In both cases, the ergodicity of the SDE (1) guarantees that (6) is a good approximation of (3) as long as T is large [24]. We adopt the former approach in the numerical experiments in this work, where the gradients of both V (with respect to x) and the loss itself (with respect to θ) in (6) are calculated by auto-differentiation through PyTorch [25]. The stability analysis of this approximation is presented in detail in SM Sec. C.
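As a self-contained illustration of this sampling-based approximation, the following stdlib-only Python sketch (not the PyTorch implementation used in our experiments) simulates the linear system F(x) = −x + Ax with a skew-symmetric A by the Euler-Maruyama scheme and evaluates the empirical loss (6) for two candidate potentials; the true potential |x|²/2 yields a markedly smaller loss than the trivial candidate V = 0. The specific parameter values are illustrative assumptions.

```python
import math, random

random.seed(0)
D = 0.5          # noise strength
omega = 1.0      # rotation strength (skew part)
dt, n_steps, burn = 1e-2, 60000, 10000

def force(x, y):
    # F = -x + A x with A = omega * [[0, -1], [1, 0]] (skew-symmetric)
    return (-x - omega * y, -y + omega * x)

# Euler-Maruyama simulation: x_{k+1} = x_k + F(x_k) dt + sqrt(2 D dt) * N(0, 1)
samples = []
x, y = 1.0, 0.0
s = math.sqrt(2 * D * dt)
for k in range(n_steps):
    fx, fy = force(x, y)
    x += fx * dt + s * random.gauss(0, 1)
    y += fy * dt + s * random.gauss(0, 1)
    if k > burn and k % 10 == 0:
        samples.append((x, y))

def epr_loss(grad_V):
    # empirical loss (6): mean of |F(x_i) + grad V(x_i)|^2 over the samples
    tot = 0.0
    for (px, py) in samples:
        fx, fy = force(px, py)
        gx, gy = grad_V(px, py)
        tot += (fx + gx) ** 2 + (fy + gy) ** 2
    return tot / len(samples)

# true potential U = |x|^2 / 2 (grad U = x) versus the trivial candidate V = 0
loss_true = epr_loss(lambda px, py: (px, py))
loss_zero = epr_loss(lambda px, py: (0.0, 0.0))
print(loss_true, loss_zero)  # the true potential gives a markedly smaller loss
```

For this example the minimum value is ω²E|x|² = 2Dω², consistent with relation (5).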
We apply our method to a toy model first in order to check its applicability and accuracy. We take

F(x) = −∇U₀(x) + A∇U₀(x), (7)

where A ∈ R^{d×d} is a constant skew-symmetric matrix, i.e., Aᵀ = −A, and U₀ is some known function. With this choice of F, one can check that the true potential landscape is U(x) = U₀(x). In particular, the system is reversible when A = 0. Based on the proposed method, we construct a double-well model with known potential U₀ for verification. We take D = 0.1. As shown in Fig. 1(A), the learned potential agrees well with the simulated samples. Also, the decomposition of the force field shows that the negative gradient part −∇V(x; θ) around the wells points towards the attractor and is nearly orthogonal to the non-gradient part. The overall non-gradient field shows a counter-clockwise rotation. The relative root mean square error (rRMSE) of the potential V(x; θ) learned by the EPR loss is 0.0987 (averaged over 3 runs), which supports the effectiveness of our approach. See SM Sec. F.1 for details of the problem setting. The correct interpretation of the computational results based on the EPR loss (3) is that the accuracy of V(x) is guaranteed only where π(x) is evidently above zero. In the "visible" domain of π (i.e., the places where there are sample points of {x_i}), the trained potential V gives a reliable approximation; in the weakly visible or invisible domain, especially in local transition regions between meta-stable states and at boundaries of the visible domain, we must resort to the original FPE (2), which holds pointwise in space.
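The pointwise orthogonality underlying this construction can be checked directly. The sketch below uses a hypothetical double-well U₀ (the exact U₀ of our experiments is specified in the SM) and verifies numerically that the non-gradient part l = A∇U₀ is orthogonal to ∇U₀ at every grid point.

```python
import math

# Hypothetical double-well stand-in for U0 (the paper's exact U0 is in its SM);
# any smooth U0 works for this check of the construction F = -grad U0 + A grad U0.
def U0(x, y):
    return (x * x - 1.0) ** 2 + y * y

def grad_U0_exact(x, y):
    return 4.0 * x * (x * x - 1.0), 2.0 * y

def grad_U0_fd(x, y, h=1e-6):
    # central finite differences, as an independent check on the gradient
    gx = (U0(x + h, y) - U0(x - h, y)) / (2 * h)
    gy = (U0(x, y + h) - U0(x, y - h)) / (2 * h)
    return gx, gy

c = 0.5  # rotation strength in A = c * [[0, -1], [1, 0]] (skew-symmetric)
max_dot = 0.0
for i in range(-10, 11):
    for j in range(-10, 11):
        x, y = 0.2 * i, 0.2 * j
        gx, gy = grad_U0_exact(x, y)
        lx, ly = -c * gy, c * gx          # non-gradient part l = A grad U0
        fx, fy = grad_U0_fd(x, y)
        max_dot = max(max_dot, abs(lx * fx + ly * fy))
print(max_dot)  # zero up to finite-difference error
```

Since l·∇U₀ = 0 pointwise and ∇·(A∇U₀) = 0 for skew-symmetric A, the density exp(−U₀/D) remains invariant for any rotation strength c.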
Learning strategy for small D. Substituting the relation p_ss(x) = exp(−U(x)/D) into (2), we get the viscous HJB equation

N_HJB(U) := |∇U|² + F·∇U − D(∇·F + ΔU) = 0, x ∈ Ω, (8)

with the asymptotic BC U → ∞ as |x| → ∞ in the case Ω = R^d, or the reflecting BC (F + ∇U)·n = 0 on ∂Ω when Ω is a d-dimensional hyperrectangle, respectively. As in the framework of physics-informed neural networks (PINNs) [26], (8) motivates the HJB loss

L_HJB(V) := ∫_Ω |N_HJB(V(x))|² dµ(x), (9)

where µ is any desirable distribution. By choosing µ properly, this loss allows the use of sample data that better cover the domain Ω and, when combined with the loss in (3), leads to significant improvement of the training results in our numerical experiments when D is small. Specifically, for small D, we propose the enhanced loss

L_enh(θ) := L_EPR(θ) + λ L_HJB(θ), (10)

where L_EPR(θ) is defined in (6), L_HJB(θ) = (1/N′) Σ_{i=1}^{N′} |N_HJB(V(x′_i; θ))|² is an approximation of (9) using an independent data set (x′_i)_{1≤i≤N′} obtained by sampling the trajectories of (1) with a larger D′ > D, and λ > 0 is a weight parameter balancing the contribution of the two terms in (10). The proposed strategy is both general and easily adaptable. For instance, one can alternatively use data (x′_i)_{1≤i≤N′} that contain more samples in the transition region, or employ a modification of the loss (9) in (10) [17].

FIG. 1. Potential landscapes of (A) the toy model, (B) a 2D limit cycle model [3], and (C) a tri-stable cell development model [5], learned by the enhanced loss (10). The force field F(x) is decomposed into the gradient part −∇V(x; θ) (white arrows) and the non-gradient part (gray arrows). The length of an arrow denotes the magnitude of the vector. The solid dots are samples from the simulated invariant distribution.
We apply our enhanced loss (10) to construct the landscape for a 2D biological system with a limit cycle [3] and a 2D multi-stable system [5]. The potential V(x; θ) learned by the enhanced loss (10), the force decomposition, and sample points from the simulated invariant distribution are shown in Fig. 1(B) and (C). As in the toy model case, the gradient part (white arrows) points directly towards the attractors, while the non-gradient part (gray arrows) shows a counter-clockwise rotation for the limit cycle, and a splitting-and-back flow from the middle attractor to the other two attractors for the tri-stable dynamical model. To further verify the accuracy of the method, we numerically solve the FPE (2) on a fine grid to obtain reference solutions. Comparisons between the proposed method and the method based on the naive HJB loss on these two problems are demonstrated in the SM. Averaged over 3 runs, the rRMSE of the potential V learned by our enhanced loss is 0.0524 and 0.0402, respectively, which shows an evident advantage over the naive HJB loss. See SM Sec. F for details of the comparisons.
Dimensionality reduction. When applying the approach above to high-dimensional problems, dimensionality reduction is necessary in order to visualize the results and gain physical insights. A straightforward approach is to first learn the high-dimensional potential U and then find its low-dimensional representation, i.e., the reduced potential or free energy function, using dimensionality reduction techniques (see SM Sec. D.1). In the following, we present an alternative approach that allows one to directly learn the low-dimensional reduced potential.
For simplicity, we consider the linear case and, with a slight abuse of notation, denote x = (y, z)ᵀ, where z = (x_i, x_j) ∈ R² contains the coordinates of two variables of interest, and y ∈ R^{d−2} corresponds to the other d − 2 variables. The domain Ω (either R^d or a d-dimensional hyperrectangle) has the decomposition Ω = Σ × Ω̃, where Σ ⊆ R^{d−2} and Ω̃ ⊆ R² are the domains of y and z, respectively. As can be seen in the numerical examples, this setting is applicable to many interesting biochemical systems. Extensions to nonlinear low-dimensional reduced variables with general domains are possible, e.g., by applying the approach developed in [27]. In the current setting, the reduced potential is

Ũ(z) := −D ln ∫_Σ p_ss(y, z) dy, (11)

and one can show that Ũ minimizes the following loss function:

L_P-EPR(V) := ∫_Ω |F_z(y, z) + ∇_z V(z)|² dπ(x), (12)

where F_z(y, z) ∈ R² is the z-component of the force field F = (F_y, F_z)ᵀ. Similar to (6), the empirical form of (12) can be used to learn the reduced potential Ũ. Moreover, one can derive an enhanced loss as in (10) for systems with small D. To this end, we note that Ũ satisfies the projected HJB equation

Ñ_HJB(Ũ) := |∇_z Ũ|² + F̃·∇_z Ũ − D(∇_z·F̃ + Δ_z Ũ) = 0, z ∈ Ω̃, (13)

with the asymptotic BC Ũ → ∞ as |z| → ∞, or the reflecting BC (F̃ + ∇_z Ũ)·ñ = 0 on ∂Ω̃, where F̃(z) := ∫_Σ F_z(y, z) dπ(y|z) is the projected force defined using the conditional distribution dπ(y|z) = p_ss(y, z) dy / ∫_Σ p_ss(y, z) dy, and ñ denotes the unit outer normal on ∂Ω̃. Based on (13), we can formulate the projected HJB loss

L̃_HJB(V) := ∫_Ω̃ |Ñ_HJB(V(z))|² dµ̃(z), (14)

where µ̃ is any suitable distribution over Ω̃, and F̃ in (13) is learned beforehand by training a DNN G with the loss

L_For(G) := ∫_Ω |F_z(y, z) − G(z)|² dπ(x). (15)

The overall enhanced loss used in numerical computations comprises two terms, which are empirical estimates of (12) and (14) based on two different sets of sample data. See SM Sec. D for derivation details.

We then apply our dimensionality reduction approach to construct the landscape for an 8D cell cycle model containing both a limit cycle and a stable equilibrium point for the chosen parameters, taking CycB and Cdc20 as the reduced variables following [4]. As shown in Fig. 2, the depth of the reduced potential and the force strength agree well with the density of the projected samples. Moreover, Fig. 2 offers some important insights into the projection of high-dimensional dynamics with a limit cycle onto two dimensions. One particular feature is that the limit cycle induced by the projected force G (outer red circle) differs from the limit cycle directly projected from high dimensions (yellow circle), and the difference is slight or moderate depending on whether the local density of samples is high or low. This is natural in the reduction, since the distribution π(y|z) in the projection is not of Dirac type when D > 0, and this difference will disappear as D → 0. Another feature is that we unexpectedly obtain an additional stable limit cycle (inner red circle) and a stable point (red dot in the center) emerging inside the limit cycle. Though spurious in high dimensions and biologically irrelevant, the existence of these two limit sets is reminiscent of the Poincaré-Bendixson theorem in planar dynamics theory [28, Chapter 10.6], and reflects a common phenomenon when projecting dynamics with limit cycles onto a 2D plane. The emergence of these two limit sets, though not generic, is specific to the considered model due to the relatively flat landscape of the potential in the central region. In addition, close to the saddle point (0.13, 0.55) of V (green star), there is a barrier along the limit cycle direction and a local well along the Cdc20 direction, which characterizes the region that biological cycle paths mainly pass through. Last but not least, a zoom-in view of the local well domain outside the limit cycle reveals its detailed spiral structure (Fig. 2C), which could not be revealed before by making a Gaussian approximation.
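The defining marginalization of the reduced potential can be illustrated with a low-dimensional quadrature. The sketch below assumes a toy full potential U(y, z) (not one of the models studied here) whose y-marginal is Gaussian, and checks that the numerically computed Ũ(z) matches the analytic reduced potential up to an additive constant.

```python
import math

# Reduced potential via marginalization: U~(z) = -D ln ∫ exp(-U(y, z)/D) dy.
# For the toy full potential U(y, z) = ((y - z)^2 + z^2)/2 the y-integral is
# Gaussian and the exact reduced potential is U~(z) = z^2/2 + const.
D = 0.1

def U_full(y, z):
    return 0.5 * ((y - z) ** 2 + z ** 2)

def U_reduced(z, y_lo=-6.0, y_hi=6.0, n=4000):
    # trapezoidal quadrature of the marginal density over y
    h = (y_hi - y_lo) / n
    tot = 0.0
    for k in range(n + 1):
        y = y_lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        tot += w * math.exp(-U_full(y, z) / D)
    return -D * math.log(tot * h)

# additive constants cancel in differences of the reduced potential
diff_num = U_reduced(1.0) - U_reduced(0.0)
diff_exact = 0.5 * 1.0 ** 2 - 0.0
print(diff_num, diff_exact)
```

In high dimensions this y-integration is precisely the step that becomes intractable, which motivates learning Ũ directly from the projected loss instead.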
Some other applications of our approach, to Ferrell's three-ODE model [29], a 52D stem cell network model [16], and a 3D Lorenz model, are demonstrated in SM Secs. G and H.
Extension to the variable diffusion coefficient case. The EPR-Net formulation extends to state-dependent diffusion coefficients without any difficulty. Consider the Itô SDEs

ẋ = F(x) + √(2D) σ(x)ẇ, (16)

with diffusion matrix σ(x) ∈ R^{d×m}, where ẇ is an m-dimensional temporal Gaussian white noise. We assume that m ≥ d and that the matrix a(x) := (σσᵀ)(x) satisfies uᵀa(x)u ≥ c₀|u|² for all x, u ∈ R^d, where c₀ > 0 is a positive constant. Using a similar derivation as before, we can again show that the high-dimensional landscape function U of (16) minimizes the EPR loss

L_V-EPR(V) := ∫_Ω |F_v(x) + a(x)∇V(x)|²_{a⁻¹(x)} dπ(x), (17)

where F_v(x) := F(x) − D∇·a(x) and |u|²_{a⁻¹(x)} := uᵀa⁻¹(x)u for u ∈ R^d. We provide derivation details of (17) in SM Sec. E. However, we will not pursue a numerical study of (16)-(17) in this paper.
Discussions and conclusion. Below we make some final remarks. First, concerning the use of the steady-state distribution π(x) in (3) and its approximation by a long time series of the SDE (1) in EPR-Net, we emphasize that it is the sampling approximation of π that naturally captures the important parts of the potential function, and the landscape beyond the sampled regions is not essential in practice. Second, as exemplified in SM Sec. F.4, we found that a direct application of density estimation methods (DEM), e.g., normalizing flows [30], to the sampled time series data does not give a potential landscape with satisfactory accuracy. We speculate that this deficiency of DEM is due to its over-generality and the fact that, compared with (3), it does not explicitly take advantage of the force field information.
Overall, we have presented EPR-Net, a simple yet effective DNN approach for constructing the non-equilibrium potential landscape of NESS systems. This approach is both elegant and robust due to its variational structure and its flexibility to be combined with other types of loss functions. Further extension of the dimensionality reduction to nonlinear reduced variables and numerical investigations of the state-dependent diffusion coefficient case will be explored in future work.
SUPPLEMENTAL MATERIAL

In this supplemental material (SM), we present further theoretical derivations and computational details for the contents of the main text (MT). This SM consists of two parts: theory and computation.

PART 1: THEORY
We will first provide details of theoretical derivations omitted in the MT.

A. VALIDATION OF THE EPR LOSS
In this section, we show that, up to an additive constant, the potential function U (x) := −D ln p ss (x) is the unique minimizer of the EPR loss (3) defined in the MT.
First, we show the orthogonality relation

∫_Ω (F + ∇U) · ∇W dπ = 0 (18)

for any suitable test function W. Indeed, since J_ss = p_ss(F + ∇U), we have

∫_Ω (F + ∇U) · ∇W dπ = ∫_Ω ∇W · J_ss dx = ∫_∂Ω W J_ss · n dS − ∫_Ω W ∇·J_ss dx =: P₁ − P₂,

where we have used integration by parts and the relation p_ss(x) = exp(−U(x)/D). The term P₁ is zero due to the fact that p_ss(x) tends to 0 exponentially as |x| → ∞ when Ω = R^d, or due to the reflecting BC J_ss · n = 0 which holds on ∂Ω when Ω is bounded. The term P₂ is zero due to the steady-state Fokker-Planck equation (FPE) ∇·J_ss = 0 satisfied by p_ss. Now consider the EPR loss. For any V, we have

L_EPR(V) = ∫_Ω |F + ∇U + ∇(V − U)|² dπ = L_EPR(U) + ∫_Ω |∇(V − U)|² dπ,

where we have used the orthogonality relation (18) with W = V − U to arrive at the last equality, from which we conclude that U(x) is the unique minimizer of the EPR loss up to an additive constant. In fact, defining the π-weighted inner product for any square integrable functions f, g on Ω,

(f, g)_π := ∫_Ω f(x)g(x) dπ(x), (19)

and the corresponding L²_π-norm ‖·‖_π by ‖f‖²_π := (f, f)_π, we get a Hilbert space L²_π (see, e.g., [31, Chapter II.1]). We observe that the minimization of the EPR loss finds the orthogonal projection of F under the π-weighted inner product, i.e.,

F = −∇U + ℓ, with (ℓ, ∇W)_π = 0 for any suitable W. (20)

However, we remark that this orthogonality holds only in the L²_π-inner product sense instead of the pointwise sense. Furthermore, the two orthogonality relations (18) and (20) can be understood as follows. Using (20), the relation (18) is equivalent to ∫_Ω ℓ · ∇W dπ = 0 for any W. Integration by parts gives ∇·(ℓ e^{−U/D}) = 0, which is equivalent to ∇U · ℓ = D∇·ℓ. When D → 0, we recover the pointwise orthogonality ∇U · ℓ = 0, which is adopted in computing quasi-potentials in [19].
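For a concrete sanity check of (18), consider the linear system F(x) = −x + Ax with skew-symmetric A, for which U(x) = |x|²/2, p_ss ∝ exp(−|x|²/(2D)), and F + ∇U = Ax. The following stdlib-only sketch evaluates the weighted integral by grid quadrature for a generic test function W and confirms that it vanishes; all parameter values are illustrative.

```python
import math

# Grid-quadrature check of the orthogonality relation (18) for the linear
# system F(x) = -x + Ax: here ∫ (Ax)·∇W dπ should vanish for any smooth W.
D, omega = 0.2, 1.0
n, L = 201, 5.0            # grid resolution and half-width (covers the Gaussian)
h = 2 * L / (n - 1)

def grad_W(x, y):
    # gradient of a generic (asymmetric) test function W = sin(x+0.3)cos(2y+0.7)
    return (math.cos(x + 0.3) * math.cos(2 * y + 0.7),
            -2.0 * math.sin(x + 0.3) * math.sin(2 * y + 0.7))

inner, total_abs = 0.0, 0.0
for i in range(n):
    x = -L + i * h
    for j in range(n):
        y = -L + j * h
        w = math.exp(-(x * x + y * y) / (2 * D))   # unnormalized p_ss
        lx, ly = -omega * y, omega * x             # (F + grad U)(x, y) = A x
        gx, gy = grad_W(x, y)
        inner += (lx * gx + ly * gy) * w
        total_abs += abs(lx * gx + ly * gy) * w

rel = abs(inner) / total_abs
print(rel)  # vanishes up to quadrature error
```

Analytically, ∇·(Ax p_ss) = Ax·∇p_ss = −(1/D)(Ax·x) p_ss = 0 for skew-symmetric A, so the integral is exactly zero after integration by parts.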

B. EPR LOSS AND ENTROPY PRODUCTION RATE
In this section, we show that the minimum EPR loss coincides with the steady entropy production rate in nonequilibrium steady state (NESS) theory.
Following [22,23], we have the important identity concerning the entropy production for the SDE (1) defined in the MT:

dS(t)/dt = e_p(t) − h_d(t),

where S(t) := −∫_Ω p(x, t) ln p(x, t) dx is the entropy of the probability density function p(x, t) at time t, e_p is the entropy production rate (EPR)

e_p(t) := (1/D) ∫_Ω |J(x, t)|²/p(x, t) dx,

and h_d is the heat dissipation rate

h_d(t) := (1/D) ∫_Ω F(x) · J(x, t) dx,

with the probability flux J(x, t) := p(x, t)F(x) − D∇p(x, t). At the steady state, e_p^ss = h_d^ss and

e_p^ss = (1/D) ∫_Ω |J_ss(x)|²/p_ss(x) dx = (1/D) ∫_Ω |F + ∇U|² dπ = (1/D) L_EPR(U),

where J_ss(x) = p_ss(x)(F(x) + ∇U(x)) is the steady probability flux. This shows the relation between the proposed EPR loss function and the entropy production rate in the NESS theory.
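The identity L_EPR(U) = D e_p^ss can be verified in closed form for a linear example. Restricting the EPR loss to quadratic candidates V_a(x) = a|x|²/2 for F(x) = −x + Ax with rotation strength ω (an illustrative assumption, not the training procedure), the sketch below confirms that a = 1 minimizes the loss and that the minimum equals D e_p^ss = 2Dω².

```python
import math

# For F(x) = -x + Ax (A skew-symmetric, strength omega) and noise D, the
# steady state is Gaussian with U(x) = |x|^2/2.  Scan the EPR loss over
# quadratic candidates V_a(x) = a|x|^2/2 (so grad V_a = a x): the minimizer
# is a = 1 and the minimum equals D * e_p^ss = 2 D omega^2.
D, omega = 0.2, 1.0
n, L = 201, 5.0
h = 2 * L / (n - 1)

# tabulate the |x|^2-moment of the (unnormalized) Gaussian steady state
mass, second = 0.0, 0.0
for i in range(n):
    x = -L + i * h
    for j in range(n):
        y = -L + j * h
        w = math.exp(-(x * x + y * y) / (2 * D))
        mass += w
        second += (x * x + y * y) * w
Ex2 = second / mass               # E|x|^2 = 2 D for this Gaussian

def loss(a):
    # |F + a x|^2 = |(a-1) x + A x|^2 = ((a-1)^2 + omega^2) |x|^2 pointwise,
    # since x . A x = 0, so the integral reduces to a multiple of E|x|^2
    return ((a - 1.0) ** 2 + omega ** 2) * Ex2

best_a = min((0.05 * k for k in range(41)), key=loss)
print(best_a, loss(best_a), 2 * D * omega ** 2)
```

For this system e_p^ss = (1/D)ω²E|x|² = 2ω², so the minimum loss 2Dω² matches D e_p^ss exactly.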

C. STABILITY OF THE EPR MINIMIZER
In this section, we formally show that small perturbations of the invariant distribution π do not drastically change the minimizer of the corresponding EPR loss. We only consider bounded domains, i.e., Ω a hyperrectangle; the argument for unbounded domains is similar.
Suppose dπ(x) = p(x) dx and dµ(x) = q(x) dx, and let U(x) and Ū(x) be the unique minimizers (up to a constant) of the two EPR losses

L_π(V) := ∫_Ω |F + ∇V|² p dx, L_µ(V) := ∫_Ω |F + ∇V|² q dx,

respectively. It is not difficult to find that the Euler-Lagrange equations of U and Ū are given by the following partial differential equations (PDEs) with suitable BCs:

∇·(p(F + ∇U)) = 0, ∇·(q(F + ∇Ū)) = 0, x ∈ Ω.

The PDEs above, defined inside the domain Ω, can be converted to

ΔU p + ∇U·∇p = −∇·(pF), ΔŪ q + ∇Ū·∇q = −∇·(qF).

D. DIMENSIONALITY REDUCTION
In this section, we study dimensionality reduction for high-dimensional problems in order to learn the projected potential.
Denote x = (y, z)ᵀ ∈ Ω. As in the MT, we assume the domain decomposition Ω = Σ × Ω̃, where Σ ⊆ R^{d−2} and Ω̃ ⊆ R² are the domains of y and z, respectively. The reduced potential Ũ(z) is defined as

Ũ(z) := −D ln ∫_Σ p_ss(y, z) dy. (26)

One natural approach for constructing Ũ(z) is to directly integrate p_ss(y, z) based on the potential U(y, z) learned with the EPR loss, i.e.,

Ũ(z) = −D ln ∫_Σ exp(−U(y, z)/D) dy. (27)

However, performing this integration is not a straightforward numerical task in high dimensions (see, e.g., [34, Chapter 7]).

D.1. Gradient projection loss
In this subsection, we study a simple approach to approximate Ũ(z) based on sample points, which approximately obey the invariant distribution π(x), and on the high-dimensional potential function U(x) learned by the EPR loss. This approach is not investigated numerically in this work, but it will be useful for the derivations in the next subsection. The idea is to utilize the gradient projection (GP) loss on the z-components of ∇U:

L_GP(V) := ∫_Ω |∇_z U(y, z) − ∇_z V(z)|² dπ(x). (28)

To justify (28), write p̃_ss(z) := ∫_Σ p_ss(y, z) dy and dπ̃(z) := p̃_ss(z) dz. Then

∇_z Ũ(z) = −D ∇_z p̃_ss(z)/p̃_ss(z) = (1/p̃_ss(z)) ∫_Σ ∇_z U(y, z) p_ss(y, z) dy = ∫_Σ ∇_z U(y, z) dπ(y|z),

i.e., ∇_z Ũ is the conditional expectation of ∇_z U given z. Therefore, for any candidate V(z),

L_GP(V) = ∫_Ω |∇_z U − ∇_z Ũ|² dπ + ∫_Ω̃ |∇_z Ũ − ∇_z V|² dπ̃,

since the cross term

2 ∫_Ω (∇_z U(y, z) − ∇_z Ũ(z)) · (∇_z Ũ(z) − ∇_z V(z)) dπ(x) = 0

vanishes after first integrating over y at fixed z. Hence the minimization of the GP loss is equivalent to minimizing ∫_Ω̃ |∇_z Ũ − ∇_z V|² dπ̃, which clearly implies that Ũ(z) is the unique minimizer (up to a constant) of the proposed GP loss.

D.2. Projected EPR loss
In this subsection, we study the projected EPR (P-EPR) loss, which has the form

L_P-EPR(V) := ∫_Ω |F_z(y, z) + ∇_z V(z)|² dπ(x), (29)

where F_z(y, z) ∈ R² is the z-component of the force field F = (F_y, F_z)ᵀ. Define the lifted function

V̂(x) := V(z), x = (y, z)ᵀ ∈ Ω, (30)

so that the full gradient with respect to x satisfies ∇V̂ = (0, ∇_z V)ᵀ. To justify (29), we first note the equivalence

L_P-EPR(V) = ∫_Ω |F + ∇V̂|² dπ − ∫_Ω |F_y|² dπ, (31)

since ∇_y V̂ = 0 and the y-components of F + ∇V̂ only introduce an irrelevant constant in (31). Furthermore, we have

∫_Ω |F + ∇V̂|² dπ = ∫_Ω |F + ∇U|² dπ + ∫_Ω |∇(V̂ − U)|² dπ,

where the last equality is due to the orthogonality relation (18). Expanding components gives ∫_Ω |∇(V̂ − U)|² dπ = ∫_Ω |∇_y U|² dπ + ∫_Ω |∇_z U − ∇_z V|² dπ, whose V-dependent part is exactly the GP loss in (28). Using the equivalence (31), we thus get

argmin_V L_P-EPR(V) = argmin_V L_GP(V).

Since Ũ minimizes the GP loss, as shown in the previous subsection, we conclude that Ũ minimizes the loss in (29).

D.3. Force projection loss
In this subsection, we study the force projection (P-For) loss for approximating the projection of F z onto the z-space.
Denote by

dπ(y|z) := p_ss(y, z) dy / p̃_ss(z), p̃_ss(z) := ∫_Σ p_ss(y, z) dy, (33)

the conditional distribution of y given z, and by

F̃(z) := ∫_Σ F_z(y, z) dπ(y|z) (34)

the projected force defined using this conditional distribution.
We can learn F̃(z) via the following force projection loss:

L_For(G) := ∫_Ω |F_z(y, z) − G(z)|² dπ(x). (35)

To justify (35), we note that, for any candidate G(z),

L_For(G) = ∫_Ω |F_z − F̃|² dπ + ∫_Ω̃ |F̃ − G|² dπ̃ + 2 ∫_Ω (F_z(y, z) − F̃(z)) · (F̃(z) − G(z)) dπ(x).

The last (cross) term vanishes: integrating first over y at fixed z and using the definition (34), we have ∫_Σ (F_z(y, z) − F̃(z)) dπ(y|z) = 0. Therefore, we have the equivalence

argmin_G L_For(G) = argmin_G ∫_Ω̃ |F̃ − G|² dπ̃,

from which we conclude that F̃(z) minimizes the loss in (35). In other words, F̃ is the conditional expectation of F_z given z.

D.4. HJB equation for the reduced potential
In this subsection, we show that the reduced potential Ũ satisfies the projected HJB equation

|∇_z Ũ|² + F̃·∇_z Ũ − D(∇_z·F̃ + Δ_z Ũ) = 0, z ∈ Ω̃, (37)

with the asymptotic BC Ũ → ∞ as |z| → ∞, or the reflecting BC (F̃ + ∇_z Ũ)·ñ = 0 on ∂Ω̃, where ñ denotes the unit outer normal on ∂Ω̃. We will only consider the rectangular domain case here; the argument for the unbounded case is similar. Recall that p_ss(x) satisfies the FPE

∇·(p_ss F) − DΔp_ss = 0, x ∈ Ω. (38)

Integrating both sides of (38) over Σ with respect to y and utilizing the boundary condition J_ss · n = 0, where J_ss = p_ss F − D∇p_ss, we get

∇_z · (∫_Σ p_ss F_z dy) − DΔ_z p̃_ss = 0, z ∈ Ω̃. (39)

Taking (33) and (34) into account, we obtain

∇_z·(p̃_ss F̃) − DΔ_z p̃_ss = 0, z ∈ Ω̃, (40)

i.e., an FPE for p̃_ss(z) with the reduced force field F̃, where J̃ := p̃_ss F̃ − D∇_z p̃_ss. The corresponding boundary condition can also be derived by integrating the original BC J_ss · n = 0 over Σ with respect to y for z ∈ ∂Ω̃, which gives

J̃ · ñ = (p̃_ss F̃ − D∇_z p̃_ss) · ñ = 0, z ∈ ∂Ω̃. (41)
Substituting the relation p̃_ss(z) = exp(−Ũ(z)/D) into (40) and (41), we get (37) and the corresponding reflecting BC after some algebraic manipulations.

E. STATE-DEPENDENT DIFFUSION COEFFICIENTS
In this section, we study the EPR loss for NESS systems with a state-dependent diffusion coefficient.
Consider the Itô SDEs

ẋ = F(x) + √(2D) σ(x)ẇ, (42)

with state-dependent diffusion matrix σ(x). Under the same assumptions as in the MT, we have the FPE

∇·(p_ss F) − D∇²:(p_ss a) = 0, (43)

where a := σσᵀ and ∇²:(p_ss a) := Σ_{i,j} ∂²(p_ss a_ij)/∂x_i∂x_j.
We show that the high-dimensional landscape function U of (42) minimizes the EPR loss

L_V-EPR(V) := ∫_Ω |F_v(x) + a(x)∇V(x)|²_{a⁻¹(x)} dπ(x), (44)

where F_v(x) := F(x) − D∇·a(x) and |u|²_{a⁻¹(x)} := uᵀa⁻¹(x)u for u ∈ R^d.
To justify (44), we first note that (43) can be rewritten as

∇·(p_ss(F_v + a∇U)) = 0, (45)

which, together with the BC, implies the orthogonality relation

∫_Ω (F_v + a∇U) · ∇W dπ = 0 (46)

for a suitable test function W(x). Following the same reasoning used in establishing (18) and utilizing (46), we have, for any V,

L_V-EPR(V) = ∫_Ω |F_v + a∇U + a∇(V − U)|²_{a⁻¹} dπ = L_V-EPR(U) + ∫_Ω (∇(V − U))ᵀ a ∇(V − U) dπ.

Since a is uniformly positive definite, the last expression implies that U(x) is the unique minimizer of L_V-EPR(V) up to a constant. The above derivation for the state-dependent diffusion case will permit us to construct the landscape for chemical Langevin dynamics, which will be studied in future work.

PART 2: COMPUTATION
We now present the computational details and results omitted in the MT.

F. 2D MODELS AND COMPARISONS
In this section, we describe the computational setup and results for several 2D models that we use to test different formulations, including the toy model with known potential in the MT, a 2D biological system with a limit cycle [3], and a 2D multi-stable system [5]. We also demonstrate the motivation for the enhanced EPR loss and its advantage over other methods.

F.1. Toy model and enhanced EPR
In the toy model, we set the force field as

F(x) = −∇U₀(x) + A∇U₀(x), (47)

and choose a double-well potential U₀(x), where x = (x, y)ᵀ. We take a constant anti-symmetric matrix A, which introduces a counter-clockwise rotation for a focusing central force field. This sets up a simple non-equilibrium system. In this model, the non-gradient part is l(x) = A∇U₀(x), and l(x)·∇U₀(x) = 0 holds in the pointwise sense. We have thus constructed a double-well non-reversible system with analytically known potential, which can be used to verify the accuracy of the learned potential. We focus on a bounded rectangular domain Ω. The single EPR loss already works well for the toy model with a relatively large diffusion coefficient D = 0.1, as shown in Fig. 1(A) in the MT. A slice plot of the potential at y = 1.5 (Fig. 3(A)) shows that the EPR solution coincides well with the analytical solution. The relative root mean square error (rRMSE) and the relative mean absolute error (rMAE), which are defined in Section F.4, have mean and standard deviation of 0.099 ± 0.010 and 0.081 ± 0.013 over 3 runs, respectively.
However, when decreasing D to 0.05, the samples from the simulated invariant distribution mainly stay in the two wells, away from the transition region (orange points in Fig. 3(C)). In this case, the double-well domain can still be learned well, yet the transition region, without enough samples, is not effectively trained. Thus, as shown in Fig. 3(B), the single EPR result captures the double wells but cannot accurately connect them across the transition domain, which makes the left well a bit higher than the right one. The pointwise HJB loss, evaluated on enhanced samples that better cover the transition domain, thus complements the EPR loss, whose samples for small D mainly concentrate in the well domains. Using enhanced samples generated with D′ = 0.1 (green points in Fig. 3(A)), the enhanced EPR method performs much better in the transition domain between the two wells and thus agrees well with the true solution.
The above strategy is general, and we apply it to compute the landscape for all of the remaining 2D problems and compare it with other methods in Section F.4.

F.2. 2D limit cycle model
We apply our approach to the limit cycle dynamics with a Mexican-hat-shaped landscape [3].
Before proceeding to the concrete dynamical model, we make the following observation. For any SDE of the form

ẋ = F(x) + √(2D) ẇ, (50)

the corresponding steady FPE is ∇·(F p_ss) − DΔp_ss = 0.
If we make the transformation F → κF, D → κD in (50), then the steady-state PDF is not changed; the transformation only rescales the time of the dynamics (50), t → t/κ. However, it changes the learned potential from U to κU if we use the drift κF(x) and noise strength κD in the system (50), which is helpful for setting the scale of U to be O(1) by adjusting κ suitably for a specific problem. An alternative way to accomplish this is to replace F by κF directly in the EPR loss.
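This rescaling property is easy to verify numerically in one dimension. The sketch below uses a hypothetical 1D double-well gradient system (an illustrative assumption) and checks that the invariant density is unchanged under (F, D) → (κF, κD), while the learned potential −D ln p_ss scales by κ.

```python
import math

# 1D gradient example F = -U0': the invariant density is p ∝ exp(-U0/D).
# Under F -> kappa F, D -> kappa D the density exp(-(kappa U0)/(kappa D))
# is unchanged, while the learned potential -D ln p scales from U0 to kappa U0.
kappa, D = 100.0, 0.1

def U0(x):
    return (x * x - 1.0) ** 2   # hypothetical double-well potential

xs = [-2.0 + 0.01 * k for k in range(401)]

def density(pot, noise):
    w = [math.exp(-pot(x) / noise) for x in xs]
    Z = sum(w)
    return [v / Z for v in w]

p1 = density(U0, D)                                  # original system
p2 = density(lambda x: kappa * U0(x), kappa * D)     # rescaled system
max_diff = max(abs(a - b) for a, b in zip(p1, p2))

# learned potentials, shifted so that their minima coincide at zero
V1 = [-D * math.log(v) for v in p1]
V2 = [-(kappa * D) * math.log(v) for v in p2]
m1, m2 = min(V1), min(V2)
ratio = max((b - m2) / (a - m1) for a, b in zip(V1, V2) if a - m1 > 1e-6)
print(max_diff, ratio)  # density unchanged; potential scaled by kappa
```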
We take D = 0.1 and consider the limit cycle dynamics of [3], where the parameters are κ = 100, α = a = b = 0.1, c = 100, and τ₀ = 5. Here the choice κ = 100 makes U ∼ O(1), following [17]. We focus on the domain Ω = [0, 8] × [0, 8] and compute the potential landscape and force decomposition presented in the MT. As explained in the preceding paragraph, this corresponds to the case D = 0.1/κ = 0.001 for the force field considered in [3].

F.3. 2D multi-stable model
We also apply the enhanced approach to study the dynamics of a multi-stable system [5],

dx/dt = a xⁿ/(Sⁿ + xⁿ) + b Sⁿ/(Sⁿ + yⁿ) − k₁ x,

together with the corresponding equation for y.

F.4. Numerical comparisons
In this subsection, we conduct a comparison study on the previous 2D problems to show the superiority of our enhanced EPR approach over other methods. For the toy model, we have the analytical solution; for the other two 2D examples, we take as reference solution the numerical solution of the steady FPE obtained by a piecewise bilinear finite element method on fine rectangular grids, using a least squares solver for the resulting sparse linear system (a normalization condition ∫_Ω p_ss(x) dx = 1 is added to fix the extra shift degree of freedom).
We use a fully connected neural network with 3 layers and 20 hidden units per layer as the potential V(x; θ). We train the network with a batch size of 2048 and a learning rate of 0.001 using the Adam [35] optimizer for 3000 epochs. We simulate the SDEs by the Euler-Maruyama scheme with reflection at the boundary of the domain and obtain a dataset of size 10000 to approximate the invariant distribution. We advance the dataset by one time step at each training iteration to bring it closer to the invariant distribution. In the toy model, we try different noise scales for the enhanced samples and report the best performance (obtained at D′ = 2D) for naive HJB. For fairness, we use the same enhanced samples in enhanced EPR as in naive HJB. In the SM, we denote the enhanced loss as λ₁L_EPR + λ₂L_HJB and use λ₁ = 0.1, λ₂ = 1.0 in the three models. One can also use Gaussian perturbations of the SDE data to obtain enhanced data, as we do in the limit cycle problem. We use D′ = 5D in the multi-stable problem for a better covering of the transition domain. For the comparison with normalizing flows, we train a neural spline flow [36] using the implementation from [37]. We stack 4 blocks of the rational quadratic spline with 3 layers of 64 hidden units, each block followed by an LU linear permutation. We train the flow model by Adam with a learning rate of 0.0001 for 20000 epochs, based on the same sample dataset as enhanced EPR.
We shift the potential by subtracting its minimum and focus on the domain D. We then define the modified potential for the shifted potentials U_0(x) and V(x; θ), since only the potential values in the domain D are of practical interest. We use the relative root mean square error (rRMSE) and the relative mean absolute error (rMAE) to measure the accuracy.
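The two error metrics can be written down explicitly; a minimal sketch, where normalizing by the reference magnitude is our reading of "relative":

```python
import numpy as np

def relative_errors(v, v_ref):
    """rRMSE and rMAE of a learned potential v against a reference v_ref,
    both evaluated on the same grid restricted to the domain of interest
    and shifted by their minima."""
    v = v - v.min()
    v_ref = v_ref - v_ref.min()
    diff = v - v_ref
    rrmse = np.sqrt(np.mean(diff**2)) / np.sqrt(np.mean(v_ref**2))
    rmae = np.mean(np.abs(diff)) / np.mean(np.abs(v_ref))
    return rrmse, rmae
```

Shifting both potentials by their minima makes the metrics invariant to the arbitrary additive constant in the potential.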
We summarize the comparison of numerical errors for the 2D problems in Table I. The advantages of enhanced EPR over both naive HJB and the normalizing flow can be identified from the following points.
• Without the guidance of the EPR loss, naive HJB cannot effectively optimize to the true solution under the heuristically chosen distribution. As shown in Table I, enhanced EPR achieves significantly better performance than naive HJB. Moreover, in the toy model with D = 0.05, naive HJB training leads to an unreliable solution in Fig. 4(B), with relative errors larger than 0.1. Our computational experience shows that enhanced EPR is more robust than naive HJB and less sensitive to the enhanced data distribution and parameters.
• The enhanced EPR converges faster than naive HJB. For instance, in the toy model with D = 0.1, enhanced EPR achieves an rRMSE of 0.087 ± 0.069 and an rMAE of 0.066 ± 0.013 within 2000 epochs, while naive HJB cannot attain the same level even after 3000 epochs.
• Without information from the dynamics, the normalizing flow performs the worst, relying only on the simulated invariant-distribution dataset. The learned potential tends to be rough and nonsmooth at the edge of the samples, as shown in Fig. 4. Thus enhanced EPR, which explicitly utilizes the information of the force field, helps to train the potential more accurately.
We further compare the potential landscapes computed by the different methods in Fig. 4. We remark that we omit the region {x | V(x) ≥ 30D} in both Fig. 3 and Fig. 4, since these domains are not of practical interest (their probability is less than 10^−9 according to the Gibbs form of the invariant distribution). Enhanced EPR presents a landscape more consistent with the simulated samples and the true/reference solution than the other methods. The decomposition of the force also matches better for the toy model. The normalizing flow captures the high-probability domain but lacks information on the dynamics, making its error larger than those of enhanced EPR and naive HJB.

G. 3D MODELS
In this section, we describe the computational setup for the Lorenz system in three dimensions and Ferrell's three-ODE model. We demonstrate slices of the 3D potential for the former and conduct the proposed dimensionality reduction on the latter.

G.1. 3D Lorenz system

In this subsection, we apply our landscape construction approach to the 3D Lorenz system [38] with isotropic temporal Gaussian white noise.
The Lorenz system has the form

dx/dt = β_1 (y − x),
dy/dt = x (β_2 − z) − y,
dz/dt = x y − β_3 z,

where β_1 = 10, β_2 = 28 and β_3 = 8/3. We add noise with strength D = 1. This model was also considered in [18] with D = 20. We obtain the enhanced data by adding Gaussian noise with standard deviation σ = 5 to the SDE simulation data. We directly train the 3D potential V(x; θ) by enhanced EPR with λ1 = 10.0, λ2 = 1.0 and present a slice view of the landscape in Fig. 5. The learned 3D potential agrees well with the simulated samples and shows a butterfly-like shape, as the original system does.
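The simulation setup for this example can be sketched as follows; the step size and initial condition are illustrative choices:

```python
import numpy as np

def lorenz_force(x, beta1=10.0, beta2=28.0, beta3=8.0 / 3.0):
    """Drift of the Lorenz system for a state array of shape (3,)."""
    return np.array([beta1 * (x[1] - x[0]),
                     x[0] * (beta2 - x[2]) - x[1],
                     x[0] * x[1] - beta3 * x[2]])

def simulate(n_steps, dt=1e-3, D=1.0, sigma=5.0, seed=0):
    """Euler-Maruyama trajectory with noise strength D, plus the enhanced
    dataset obtained by adding N(0, sigma^2) perturbations to the samples."""
    rng = np.random.default_rng(seed)
    x = np.ones(3)
    traj = np.empty((n_steps, 3))
    for i in range(n_steps):
        x = x + lorenz_force(x) * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(3)
        traj[i] = x
    enhanced = traj + sigma * rng.standard_normal(traj.shape)
    return traj, enhanced
```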

G.2. Ferrell's three-ODE model
In this subsection, we consider Ferrell's three-ODE model for a simplified cell cycle dynamics [29], with variables x, y, z denoting the concentrations of CDK1, Plk1, and APC. The ODEs read

dx/dt = α_1 − β_1 x z^{n_1}/(K_1^{n_1} + z^{n_1}),
dy/dt = α_2 (1 − y) x^{n_2}/(K_2^{n_2} + x^{n_2}) − β_2 y,
dz/dt = α_3 (1 − z) y^{n_3}/(K_3^{n_3} + y^{n_3}) − β_3 z,

where α_1 = 0.1, α_2 = α_3 = β_1 = 3, β_2 = β_3 = 1, K_1 = K_2 = K_3 = 0.5, and n_1 = n_2 = n_3 = 8. We add noise of scale D = 0.01 with isotropic temporal Gaussian white noise. By taking the reduced variables z = (x, y), we can apply our force projection loss and enhanced loss to learn the projected force G̃ and potential Ṽ(·; θ), and the results are shown in Fig. 6. We use three-layer networks with 80 hidden states in this problem and enhanced samples simulated from a more diffusive distribution with D̃ = 5D. We train the projected force G̃ for 1000 epochs and then conduct enhanced EPR with λ1 = 0.1, λ2 = 1.0 for 4000 epochs to compute the projected potential. The obtained reduced potential shows a plateau in the central region and a local-well tube domain along the reduced limit cycle.
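Under the Hill-type form above (our reading of the model in [29]; parameter names follow the text), the drift can be coded as:

```python
import numpy as np

def ferrell_force(u, a1=0.1, a2=3.0, a3=3.0, b1=3.0, b2=1.0, b3=1.0,
                  K=0.5, n=8):
    """Drift of the three-ODE cell cycle model; u = (x, y, z) are the
    concentrations of CDK1, Plk1 and APC.  A sketch assuming the
    Hill-type form with K1 = K2 = K3 = K and n1 = n2 = n3 = n."""
    x, y, z = u
    dx = a1 - b1 * x * z**n / (K**n + z**n)
    dy = a2 * (1.0 - y) * x**n / (K**n + x**n) - b2 * y
    dz = a3 * (1.0 - z) * y**n / (K**n + y**n) - b3 * z
    return np.array([dx, dy, dz])
```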

H. HIGH DIMENSIONAL MODELS
In this section, we apply our approach to an 8D limit cycle dynamics [4] and a 52D multistable dynamics [39]. We directly train the reduced force field G̃(z) and potential Ṽ(z; θ) according to the selected reduction variables suggested in the corresponding literature. We use three-layer networks with 80 hidden states for both force and potential. The training strategies are similar to the previous examples.

H.1. 8D complex system
We consider an 8D system whose dynamics and parameters are the same as in the supporting information of [4], and take CycB and Cdc20 as the reduction variable z. We set the mass in this problem as m = 0.8.
In [4], the noise strength D = 0.0005 is not suitable for direct neural network training since the scale of the potential is O(10^−5). Borrowing the idea in Section F.2, we amplify the original force field F considered in [4] by κ = 1000 times and take D = 0.01 for the transformed force field. This amounts to setting D = 10^−5 for the original force field, which is even smaller than the case considered in [4]. We simulate the SDEs without boundaries first and then fix the dataset without updating. We obtain the enhanced samples by adding Gaussian perturbations to the obtained dataset. Only the data within the considered domain are retained.

FIG. 7. Streamlines and limit sets of the projected force field of the 8D cell cycle model by two reduced variables CycB and Cdc20. The outer red circle is the stable limit cycle of the reduced force field, corresponding to the yellow circle as the projection of the original high-dimensional limit cycle. The inner red circle, red dot and two green circles are stable and unstable limit sets of the reduced dynamics, which are virtual in high dimensions.
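The rescaling can be sanity-checked in the gradient-system special case, where the invariant density has the explicit Gibbs form p_ss ∝ exp(−U/D); the double-well below is illustrative only:

```python
import numpy as np

# Scaling the force by kappa while scaling the noise by the same factor
# leaves the invariant density (hence the landscape) unchanged, since
# exp(-kappa * U / (kappa * D)) = exp(-U / D).
kappa, D = 1000.0, 1e-5
x = np.linspace(-2.0, 2.0, 401)
U = (x**2 - 1.0) ** 2                     # double-well potential, F = -U'

def gibbs(U, D):
    w = np.exp(-(U - U.min()) / D)
    return w / w.sum()                    # normalized on the grid

p_original = gibbs(U, D)                  # original force with D = 1e-5
p_rescaled = gibbs(kappa * U, kappa * D)  # amplified force with D = 1e-2
# p_original and p_rescaled coincide
```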
We train the projected force G̃(z; θ) for 5000 epochs and conduct the enhanced EPR with λ1 = 0.1, λ2 = 1.0 for 10000 epochs. Some essential features of the reduced potential and dynamics on the plane have been presented in MT.
In SM Fig. 7, we present a more thorough picture of the reduced dynamics of the 8D model than MT Fig. 2. To be more specific, we further show two unstable limit cycles of the projected force field (the two green circles, obtained by reverse-time integration) in SM Fig. 7. One falls between the outer and inner stable limit cycles (outer and inner red circles), and the other between the inner stable limit cycle and the inner stable node (red dot in the center); they play the role of separatrices between neighboring stable limit sets. This picture arises because the landscape of the considered system is very flat in the central region. These inner limit sets are virtual in high dimensions, but they naturally appear in the reduced dynamics on the plane. Similar features might also occur in other reduced two-dimensional dynamics.
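Reverse-time integration locates unstable cycles because an unstable limit cycle of dx/dt = G̃(x) is a stable one of dx/dt = −G̃(x). A minimal sketch on a planar normal form with an unstable circle at radius 1 (illustrative only, not the cell cycle field):

```python
import numpy as np

def g(x):
    """Planar vector field with an unstable limit cycle at |x| = 1."""
    r2 = x[0]**2 + x[1]**2
    return np.array([x[0] * (r2 - 1.0) - x[1],
                     x[1] * (r2 - 1.0) + x[0]])

# Forward integration from inside spirals away from the circle; the
# time-reversed flow converges onto it instead.
x, dt = np.array([0.3, 0.0]), 0.01
for _ in range(5000):
    x = x - g(x) * dt          # Euler step of the reversed dynamics
radius = np.hypot(x[0], x[1])  # approaches 1, the unstable cycle of g
```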

H.2. 52D multi-stable system
We also apply our approach to a biological system with 52 ODEs constructed in [39] and take GATA6 and NANOG as the reduction variable z. We define A_i as the set of indices of genes activating x_i and R_i as the set of indices of genes repressing x_i; the corresponding relationships are given by the 52-node network shown in [39]. The dynamics takes the form

dx_i/dt = Σ_{j∈A_i} a x_j^n/(S^n + x_j^n) + Σ_{j∈R_i} b S^n/(S^n + x_j^n) − k x_i,

where a = 0.37, b = 0.5, k = 1, S = 0.5, and n = 3. We choose the noise strength D = 0.01. We train the force G̃(z; θ) for 500 epochs and conduct enhanced EPR with λ1 = 100.0, λ2 = 1.0 for 500 epochs. We use enhanced samples simulated from a more diffusive distribution with D̃ = 5D. As shown in Fig. 8, the projected force demonstrates the reduced dynamics, and the depth of the constructed potential agrees well with the density of the sample points.
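Under the Hill-type form above (our reconstruction from the definitions of A_i and R_i), the network drift can be sketched as follows; the two-gene wiring in the usage line is a toy, not the 52-node network of [39]:

```python
import numpy as np

def network_force(x, act, rep, a=0.37, b=0.5, k=1.0, S=0.5, n=3):
    """Drift of the gene-network ODEs: act[i] and rep[i] list the index
    sets A_i and R_i of activators and repressors of gene i."""
    dx = np.empty_like(x)
    for i in range(len(x)):
        activation = sum(a * x[j]**n / (S**n + x[j]**n) for j in act[i])
        repression = sum(b * S**n / (S**n + x[j]**n) for j in rep[i])
        dx[i] = activation + repression - k * x[i]
    return dx

# Toy two-gene toggle: each gene activates itself and represses the other.
f = network_force(np.array([0.5, 0.5]), act=[[0], [1]], rep=[[1], [0]])
```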