Markov models from the Square Root Approximation of the Fokker-Planck equation: calculating the grid-dependent flux

The dynamics of molecular systems is extremely complex, yet understanding its slow components is essential to understanding the macroscopic properties of these systems. To achieve this, one models the molecular dynamics as a stochastic process and analyses the dominant eigenfunctions of the associated Fokker-Planck operator, or of closely related transfer operators. So far, the calculation of the discretized operators has required extensive molecular dynamics simulations. The Square Root Approximation of the Fokker-Planck equation is a method to calculate transition rates as a ratio of the Boltzmann densities of neighboring grid cells times a flux, and can in principle be applied without a simulation. In a previous work we still used molecular dynamics simulations to determine the flux. Here, we propose several methods to calculate the exact or approximate flux for various grid types, and thus estimate the rate matrix without a simulation. Using model potentials, we test the computational efficiency of the methods and the accuracy with which they reproduce the dominant eigenfunctions and eigenvalues. For these model potentials, rate matrices with up to $\mathcal{O}(10^6)$ states can be obtained within seconds on a single high-performance compute server if regular grids are used.


I. INTRODUCTION
The dynamics of molecular systems is astonishingly complex. Only a small fraction of their high-dimensional state space is actually accessible at room temperature. Yet finding out which regions of the state space are accessible requires sophisticated computer simulations, i.e. molecular dynamics (MD) simulations. Molecular dynamics can be very sensitive to small changes in some variables of the system or the environment, but can also be remarkably robust with respect to changes in other variables. Human-understandable models of the molecular dynamics are therefore essential for the elucidation of complex molecular systems.
Markov state models (MSMs) represent the conformational dynamics of a molecular system as transition probabilities between states in the conformational space [1][2][3][4][5][6]. From the dominant eigenvectors and eigenvalues of the transition matrix T(τ), one can deduce a wealth of useful information on the molecular system, such as the long-lived conformations, the dynamic processes that govern the dynamic equilibrium between them, transition networks and pathways in these networks, and one can quantify the sensitivity of experimental observables with respect to the dynamic processes [7,8]. MSMs are now a well-established and valuable tool for the elucidation of large molecular systems, and in particular biomolecular systems [9][10][11][12][13][14][15].
In the construction of MSMs, one assumes that the molecular dynamics is a stochastic process. The time-evolution of the probability density is governed by the associated Fokker-Planck equation, or equivalently, by the infinitesimal generator Q of the stochastic process.
By formally integrating the Fokker-Planck equation one obtains a transfer operator, whose discretized version is the MSM transition matrix T(τ ). The matrix elements of T(τ ) can conveniently be estimated from MD simulations as correlation functions. On the other hand, this means that the accuracy of the MSM stands and falls with the quality of this simulation.
Because MD simulations are costly and slow to converge, enhanced sampling techniques have been developed to speed up the exploration of state space and the convergence of ensemble averages [16][17][18][19]. With recently developed dynamic reweighting methods one can additionally recover the correlation functions and thus the MSM of the unbiased system from these biased simulations [20][21][22][23]. But despite enhanced sampling techniques, there is usually no way to be certain whether an MD simulation has explored all of the accessible state space, and even assessing whether the sampling within the explored state space has converged can be difficult [24,25]. Thus, there is ample motivation to investigate avenues to obtain a MSM of a molecular system without generating a MD simulation.
Square Root Approximation (SqRA) is a technique that approximates the Fokker-Planck equation by a rate matrix [26,27]. Given a discretization of the state space, the rate from cell Ω_i to cell Ω_j is

Q_ij = Φ (S_ij / V_i) √(π(x_j) / π(x_i)),

where Φ is the flux of the probability density through the intersecting surface S_ij in the absence of any potential energy function, V_i is the volume of cell Ω_i, and π(x_i) and π(x_j) are the Boltzmann densities at the centers of cells Ω_i and Ω_j, respectively. We recently derived the SqRA for N_D-dimensional systems by exploiting Gauss's flux theorem, and showed that, for infinitely small grid cells, the SqRA with the geometric average of the Boltzmann weights converges to the Smoluchowski diffusion equation, i.e. the Fokker-Planck equation associated to overdamped Langevin dynamics [27,28]. Previously, an analogous formula for one-dimensional systems was derived from the one-dimensional Smoluchowski equation [29] and using the maximum caliber (maximum path entropy) approach [30][31][32]. In addition, the geometric average of the Boltzmann weights has been used as a reweighting factor in the dynamic histogram analysis method (DHAM) to reweight transition probabilities [21].
The SqRA opens up a way to calculate the transition rates without having to resort to rare-event simulations, at least for systems with not too many degrees of freedom. The ratio of the Boltzmann densities π(x_i)/π(x_j) can be readily calculated from the potential energy function. The grid volume V_i and the intersecting surface S_ij can be calculated from the discretization of the state space. However, how to best calculate Φ S_ij/V_i is an open question. In our previous work [27], we assumed that the factor S_ij/V_i is constant for all grid cells. This is true for hyper-cubic grids and approximately true for Voronoi grids with very small grid cells. We then estimated the factor Φ S_ij/V_i by comparing the rate matrix to an MSM transition matrix, the construction of which required an MD simulation.
In this contribution, we derive the exact expression for Φ from the equation of the overdamped Langevin dynamics with constant potential, and show that it depends on the diffusion constant and on the discrete Laplace operator. We then compare several methods to calculate S_ij/V_i for different types of discretizations. For regular grids, this ratio can be calculated analytically. For Voronoi grids, we use the quickhull algorithm [33] to calculate S_ij/V_i numerically, and we approximate the ratio by interpolating between all neighbors of the cell Ω_i [34]. We additionally propose a method to calculate Φ S_ij/V_i by comparing to the analytically known transition probability of a Wiener process (i.e. diffusion on a constant potential energy function). With these methods, we can construct the rate matrix without any MD simulation. We test the methods on model potentials with respect to computational efficiency, the dimensionality of the systems, and the accuracy of the resulting rate matrix.

II. THEORY
We consider a system of n_p particles that move in three-dimensional Cartesian space, i.e. in a state space with N_D = 3 n_p dimensions: Ω ⊂ R^{N_D}. Its dynamics is described by the overdamped Langevin equation

dx(t) = −(ξM)^{−1} ∇V(x(t)) dt + σ dB(t),   (1)

where x(t) ∈ Ω is the state vector at time t, ξ is a friction parameter with units of 1/s, M is a diagonal 3n_p × 3n_p mass matrix, M^{−1} is its inverse, V(x) is the potential energy function, and B(t) is an N_D-dimensional Wiener process scaled by the diagonal matrix

σ = √(2 k_B T (ξM)^{−1}),

where T is the temperature and k_B is the Boltzmann constant. Eq. 1 generates a Markovian, ergodic and reversible process [1,35].
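As a concrete illustration of how eq. 1 can be realized numerically (the Euler-Maruyama scheme is used later in the text to generate reference trajectories), the following sketch integrates the overdamped Langevin equation for a one-dimensional particle. The double-well potential and all parameter values are our own illustrative choices, not taken from the text.

```python
import numpy as np

# Illustrative sketch (not the paper's code): Euler-Maruyama integration of the
# overdamped Langevin equation for a 1D particle. Potential and parameters are
# hypothetical examples.

def euler_maruyama(grad_v, x0, sigma, xi_m_inv, dt, n_steps, rng):
    """Integrate dx = -(xi*M)^{-1} grad(V) dt + sigma dB."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        noise = rng.normal(0.0, np.sqrt(dt))  # Wiener increment, variance dt
        x[t + 1] = x[t] - xi_m_inv * grad_v(x[t]) * dt + sigma * noise
    return x

grad_v = lambda x: 4.0 * x * (x**2 - 1.0)   # V(x) = (x^2 - 1)^2, a double well
traj = euler_maruyama(grad_v, x0=-1.0, sigma=1.0, xi_m_inv=1.0,
                      dt=1e-3, n_steps=10000, rng=np.random.default_rng(0))
```

For small time-steps the trajectory stays confined near the wells of the hypothetical potential.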
The time-evolution of the associated probability density ρ(x, t) is given by the Fokker-Planck equation

∂ρ(x, t)/∂t = (ξM)^{−1} ∇·(∇V(x) ρ(x, t)) + (σ²/2) Δρ(x, t) = Q ρ(x, t),   (2)

which is also known as the Smoluchowski diffusion equation. The symbol ∇ denotes the gradient of a function f : R^n → R, and Δ = ∇·∇ is the corresponding Laplacian. The factor in front of the Laplacian can be interpreted as the matrix of the diffusion coefficients, D = ½σ², which are assumed to be independent of the particle positions [35]. Eq. 2 introduces the Fokker-Planck operator Q. Q can also be interpreted as the infinitesimal generator of a transfer operator (or propagator) with lag time τ: T(τ) = exp(Qτ). The operator T(τ) propagates ρ(x, t) forward in time by a time interval τ:

T(τ) ρ(x, t) = ρ(x, t + τ).

The stationary solution of eq. 2 is the Boltzmann density

π(x) = (1/Z) exp(−V(x)/(k_B T)),   (3)

with the partition function Z = ∫_Ω exp(−V(x)/(k_B T)) dx. The square root approximation (SqRA) of the infinitesimal generator is a method to discretize Q, and to calculate the corresponding matrix elements [26,27]. We will briefly review its derivation in the following section.
Consider a disjoint decomposition of the state space Ω into N Voronoi cells Ω_i, such that Ω = ∪_{i=1}^N Ω_i and Ω_i ∩ Ω_j = ∅ for i ≠ j. The characteristic function associated to each Voronoi cell Ω_i is

χ_i(x) = 1 if x ∈ Ω_i, and χ_i(x) = 0 otherwise.   (4)

We introduce the following scalar product:

⟨f | g⟩_π = ∫_Ω f(x) g(x) π(x) dx.

For disjoint sets, the Galerkin discretization of Q is computed via Q = (⟨χ_j | χ_i⟩_π)^{−1}_{ij} (⟨χ_j | Q χ_i⟩_π)_{ij}, which reduces to

Q_ij = ⟨χ_j | Q χ_i⟩_π / π_i   (5)

if we use eq. 4 as ansatz functions. The term π_i = ⟨χ_i | χ_i⟩_π = ∫_{Ω_i} π(x) dx denotes the stationary probability of cell Ω_i.
Eq. 5 defines an N × N transition rate matrix Q with elements Q_ij, where Q_ij, for i ≠ j, denotes the rate from cell Ω_i to cell Ω_j. The discretization is analogous to the discretization of the transfer operator in the derivation of Markov State Models (MSMs) [1,5], which yields a transition matrix T(τ). Just as in MSMs, the state space Ω is usually so high-dimensional that solving the integral in eq. 5 is not a viable option. However, in contrast to MSMs, the numerator in eq. 5 cannot be estimated from correlation functions obtained by simulating the stochastic process in eq. 1 [5,36]. The square root approximation provides a solution to this impasse, which neither requires solving the high-dimensional integral nor sampling the stochastic process.
The derivation starts by noting that for time-homogeneous processes the rate matrix and the transition matrix are related by Q := ∂T(τ)/∂τ |_{τ=0}. For infinitesimally small lag times τ, the transition rates between cells which do not share a common boundary are certainly zero. Thus, we can set the rate matrix elements for non-adjacent cells to

Q_ij = 0,  if i ≠ j and Ω_i is not adjacent to Ω_j.   (6)
Because the matrix elements T_ij(τ) represent transition probabilities, we can use the Gauss theorem to show that the rate matrix elements for adjacent cells satisfy [26,27]

Q_ij = (1/π_i) ∮_{∂Ω_i ∂Ω_j} Φ(z) π(z) dS(z),   (7)

where ∮ denotes a surface integral, and ∂Ω_i ∂Ω_j is the common surface between the cells Ω_i and Ω_j. Φ(z) = v(z)·n(z) denotes the flux of the configurations z through the surface ∂Ω_i ∂Ω_j, where n is the unit vector normal to the surface. The vector v is the velocity field associated to the time-dependent probability density. This is analogous to the fluid velocity in fluid dynamics, which describes the velocity of a small element of fluid such that mass is conserved.
To approximate the surface integral in eq. 7, we introduce the first of two assumptions of the SqRA:

1. The flux does not depend on the position in state space: Φ(x) = Φ. Then

Q_ij = (Φ/π_i) ∮_{∂Ω_i ∂Ω_j} π(z) dS(z).   (8)

The remaining surface integral in eq. 8 represents the stationary density on the intersecting surface ∂Ω_i ∂Ω_j. To approximate it, we formulate our second assumption:

2. Each cell is small, such that the potential energy V(x) is almost constant within the cell.

It follows that the stationary density π(x), and, by extension, also the time-dependent density ρ(x, t), is constant within a given cell Ω_i. The continuous and the discretized probabilities are related by

ρ_i(t) = ∫_{Ω_i} ρ(x, t) dx ≈ ρ(x_i, t) V_i,   (9)

where ∫_{Ω_i} 1 dx = V_i is the volume of the cell Ω_i, and in particular we have

π_i ≈ π(x_i) V_i,   (10)

where x_i is the center of Ω_i. Likewise, we can assume that the potential energy function on ∂Ω_i ∂Ω_j is essentially constant, and that it can be approximated by the arithmetic mean of V(x_i) and V(x_j), because for this type of mean-value calculation one can show that the resulting discretized operator Q converges to the Fokker-Planck operator Q in the limit of infinitesimally small cells [27,28]. The surface integral in eq. 8 then becomes

∮_{∂Ω_i ∂Ω_j} π(z) dS(z) ≈ S_ij √(π(x_i) π(x_j)),   (11)

where ∮_{∂Ω_i ∂Ω_j} 1 dS(z) = S_ij is the area of the intersecting surface. Note that an arithmetic mean of the potential energy function results in a geometric mean of the stationary densities:

√(π(x_i) π(x_j)) = (1/Z) exp(−(V(x_i) + V(x_j))/(2 k_B T)).

With this approximation of the surface integral and with eq. 10, we obtain the following expression for rates between adjacent cells (eq. 7)

Q_ij = Φ (S_ij/V_i) √(π(x_j)/π(x_i)),   (12)

and the following rate matrix:

Q_ij = Φ (S_ij/V_i) √(π(x_j)/π(x_i)) if Ω_i is adjacent to Ω_j; Q_ij = 0 if i ≠ j and Ω_i is not adjacent to Ω_j; and Q_ii = −Σ_{j≠i} Q_ij.   (13)

This is the SqRA of the Fokker-Planck operator Q. Note that in our previous publication [27], we did not write the factor S_ij/V_i explicitly, because we assumed that it is approximately the same for all pairs of adjacent cells and can be incorporated into Φ̃ = Φ S_ij/V_i. The discretization of the Fokker-Planck equation (eq. 2) then is

dρ(t)/dt = Q^T ρ(t),   (14)

where ρ(t) is the vector representation of the continuous probability density ρ(x, t), with elements ρ_i(t) given by eq. 9. Eq. 14 can be rewritten as an evolution equation for the individual vector elements

dρ_j(t)/dt = Σ_i Q_ij ρ_i(t),   (15)

which is often written more concisely as a master equation

dρ_j(t)/dt = Σ_{i∼j} [Q_ij ρ_i(t) − Q_ji ρ_j(t)],   (16)

where i∼j denotes the sum over all adjacent cells Ω_i of cell Ω_j.
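To make the construction concrete, the following sketch assembles the SqRA rate matrix Q_ij = Φ (S_ij/V_i) √(π(x_j)/π(x_i)) on a regular periodic one-dimensional grid, where S_ij/V_i = 1/h, and, anticipating the result derived below for regular grids, the flux is taken as Φ = σ²/(2h). The cosine potential and all parameter values are illustrative choices of ours, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch: SqRA rate matrix on a regular periodic 1D grid.
# S_ij/V_i = 1/h, and Phi = sigma^2/(2h) (assumed, regular-grid result).
# The diagonal is chosen so that each row of Q sums to zero.

def sqra_rate_matrix_1d(potential, n_cells, sigma, kBT=1.0, box_length=2 * np.pi):
    h = box_length / n_cells
    x = -np.pi + (np.arange(n_cells) + 0.5) * h           # cell centers
    boltzmann = np.exp(-potential(x) / kBT)               # unnormalized pi(x_i)
    prefactor = sigma**2 / (2.0 * h**2)                   # Phi * S_ij / V_i
    Q = np.zeros((n_cells, n_cells))
    for i in range(n_cells):
        for j in ((i - 1) % n_cells, (i + 1) % n_cells):  # periodic neighbors
            Q[i, j] = prefactor * np.sqrt(boltzmann[j] / boltzmann[i])
        Q[i, i] = -Q[i].sum()                             # rows sum to zero
    return Q

# Example with a made-up periodic double-well potential V(x) = 2 cos(2x):
Q = sqra_rate_matrix_1d(lambda x: 2.0 * np.cos(2.0 * x), n_cells=50, sigma=1.0)
```

By construction the off-diagonal entries are non-negative and each row sums to zero, as required for an infinitesimal generator.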
The great appeal of the SqRA of the Fokker-Planck operator is that, apart from the factor Φ S_ij/V_i, it only requires the Boltzmann density π(x_i) at the cell centers (eq. 13), which is readily available from the potential energy surface of the system. In principle, no time-series are required to calculate the rate matrix. The challenge lies in estimating Φ S_ij/V_i. In the following, we introduce two different approaches to calculate Φ S_ij/V_i that do not rely on a realization of eq. 1.
If the flux Φ does not depend on the potential energy function (assumption 1), one should be able to determine Φ by analyzing the overdamped Langevin dynamics at a constant potential, V(x) = const.,

dx(t) = σ dB(t),   (17)

and the associated Fokker-Planck equation

∂ρ(x, t)/∂t = (σ²/2) Δρ(x, t).   (18)

This has two advantages. First, the differential operator in eq. 18 essentially consists of the Laplacian, whose discretization is known. Second, the stationary density (eq. 3) of this process is constant, which simplifies the expression for the rates (eq. 12).
Applying the Gauss theorem, the Laplacian of the probability density ρ(x, t) over a small region with volume V and surface S can be written as [37]

Δρ(x, t) = lim_{V→0} (1/V) ∮_S ∇ρ(x, t) · n dS,   (19)

where n is the unit vector orthogonal to the surface S. It follows that, on a Voronoi tessellation of the space, the discrete Laplacian on a small Voronoi cell Ω_i is [38,39]

Δρ(x_i, t) ≈ (1/V_i) Σ_{j∼i} S_ij ∇ρ(x, t)|_{x=x_j} · n_ji.   (20)

The term ∇ρ(x, t)|_{x=x_j} · n_ji is the gradient in the direction j → i (directional derivative), which can be approximated by the finite difference

∇ρ(x, t)|_{x=x_j} · n_ji ≈ (ρ(x_j, t) − ρ(x_i, t)) / h_ji,   (21)

where h_ji = ‖x_j − x_i‖ is the distance between the centers of the cells Ω_j and Ω_i. Inserting this finite difference into eq. 20 yields

Δρ(x_i, t) ≈ (1/V_i) Σ_{j∼i} (S_ij/h_ji) (ρ(x_j, t) − ρ(x_i, t)).   (22)

Assuming that the density ρ(x, t) is approximately constant within each cell (assumption 2), we can replace ρ(x_i, t) by ρ_i(t)/V_i in eq. 22, and inserting into eq. 18 yields the discrete Fokker-Planck equation at constant potential

dρ_i(t)/dt = (σ²/2) Σ_{j∼i} (S_ij/h_ji) (ρ_j(t)/V_j − ρ_i(t)/V_i).   (24)

Comparing eq. 24 to the master equation (eq. 16) and to the definition of rates between adjacent cells within the SqRA (eq. 12), we obtain the following equality

Φ S_ij/V_i = (σ²/2) S_ij/(V_i h_ij),   (25)

where we used that π(x_j)/π(x_i) = 1 at constant potential. Thus,

Φ = σ²/(2 h_ij).   (26)

We have obtained an analytical expression for Φ between adjacent cells that only depends on the distance h_ij between the cell centers. Appendix B contains an alternative derivation of eq. 26 using Fick's first law of diffusion.
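The finite-difference construction above can be verified numerically: on a regular periodic one-dimensional grid, the expression (1/V_i) Σ_j S_ij (ρ_j − ρ_i)/h_ij reduces to the familiar three-point stencil, which converges to the second derivative for small h. A minimal check with an illustrative test function of our choosing:

```python
import numpy as np

# Numerical check of the discrete Laplacian (illustrative test function):
# on a regular periodic 1D grid, two neighbors per cell with S_ij = 1,
# V_i = h, h_ij = h, the discrete Laplacian reduces to the stencil
# (rho_{i-1} - 2*rho_i + rho_{i+1}) / h^2.

def discrete_laplacian_1d(rho, h):
    return (np.roll(rho, 1) - 2.0 * rho + np.roll(rho, -1)) / h**2

n = 200
h = 2.0 * np.pi / n
x = h * np.arange(n)                     # periodic grid on [0, 2*pi)
# exact Laplacian of sin(x) is -sin(x); the stencil error is O(h^2)
err = np.max(np.abs(discrete_laplacian_1d(np.sin(x), h) + np.sin(x)))
```

The maximum deviation from the exact second derivative scales as h²/12 and is far below the grid resolution used here.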
The rate matrix (eq. 13) can now be written more concretely: the diagonal elements are

Q_ii = −Σ_{j≠i} Q_ij,   (27)

and the rates between adjacent cells are

Q_ij = (σ²/2) (1/h_ij) (S_ij/V_i) √(π(x_j)/π(x_i)).   (28)

Section III introduces formulas to evaluate S_ij/V_i for different grid types.

B. Calculating the grid flux Φ_grid by analyzing the transition probability density
Our starting point is again eq. 13, and we introduce a third assumption: 3. The volumes of all cells are approximately equal (V_i ≈ V, ∀ Ω_i), and the intersecting surface areas are approximately equal for all adjacent cells (S_ij ≈ S, ∀ Ω_i adjacent to Ω_j).
In this case, the factor Φ S_ij/V_i ≈ Φ S/V =: Φ_grid has approximately the same value for each pair of adjacent cells. Φ_grid is thus a flux value which is characteristic for a given grid rather than for a specific pair of cells. This is the assumption we used in ref. 27.
Every grid can be represented as an unweighted graph, in which nodes correspond to the grid cells Ω_i, and two nodes are connected by an edge if the corresponding grid cells are adjacent. At constant potential, the rate matrix (eq. 13) can then be written as the Laplacian matrix L of the graph multiplied by the grid flux:

Q = −Φ_grid L,   (29)

where the Laplacian matrix of the graph is defined as L = D − A. Here A is the adjacency matrix of the graph, with elements A_ij = 1 if Ω_i and Ω_j are neighbors, and A_ij = 0 otherwise; D is the degree matrix of the graph, a diagonal matrix whose diagonal entries contain the number of neighbors of each cell, D_ii = Σ_j A_ij. The transition matrix T(τ) and the rate matrix Q are related by

T(τ) = exp(Qτ) = exp(−τ Φ_grid L).   (30)

If one knows the transition probability T_ij(τ) of a single pair of adjacent cells at constant potential, one can calculate Φ_grid by comparing T_ij(τ) to the matrix element [exp(−τ Φ_grid L)]_ij. The transition probability is defined as the integral of the transition probability density p(x, y; τ) over the initial and final cell,

T_ij(τ) = (1/π_i) ∫_{Ω_i} dx ∫_{Ω_j} dy π(x) p(x, y; τ),

where π(x) is the unconditional probability density of finding the system at point x at time t, and p(x, y; τ) is the conditional probability density of finding the system in y + dy at time t + τ, given that it started in point x at time t. In ref. 27 we obtained the transition probability by constructing an MSM based on a simulation of the dynamic process.
Here, we propose a different approach. We again use the idea that the flux, and by extension Φ_grid, does not depend on the potential energy function. Therefore, it can be determined from an overdamped Langevin dynamics at constant potential energy (eq. 17), for which the transition probability density is

p(x, y; τ) = (2πσ²τ)^{−N_D/2} exp(−‖y − x‖²/(2σ²τ)),   (31)

where N_D is the dimension of the state space. If the cells are small (assumption 2), the distance from any point in cell Ω_i to any point in Ω_j is approximately equal to the distance between the centers of the two cells, i.e. ‖y − x‖ ≈ ‖x_j − x_i‖ for all x ∈ Ω_i, y ∈ Ω_j. With this assumption, eq. 31 yields

T_ij(τ) ≈ V (2πσ²τ)^{−N_D/2} exp(−h_ij²/(2σ²τ)),   (32)

where h_ij = ‖x_j − x_i‖ is the distance between the centers of Ω_i and Ω_j. Besides N_D, σ, and h_ij, one only needs the average cell volume V to calculate T_ij(τ). The lag time τ can in principle be chosen freely.
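The closed-form approximation for T_ij(τ) is straightforward to evaluate; the sketch below, with made-up grid spacing, lag time and diffusion parameters, checks the expected qualitative behavior, namely that the transition probability decays with the distance h_ij between cell centers.

```python
import numpy as np

# Evaluate the Gaussian closed-form approximation
#   T_ij(tau) ~ V * (2*pi*sigma^2*tau)^(-N_D/2) * exp(-h_ij^2 / (2*sigma^2*tau))
# with hypothetical parameter values.

def t_ij_approx(h_ij, tau, sigma, n_dims, cell_volume):
    norm = (2.0 * np.pi * sigma**2 * tau) ** (-n_dims / 2.0)
    return cell_volume * norm * np.exp(-h_ij**2 / (2.0 * sigma**2 * tau))

h = 0.1                                  # grid spacing (illustrative)
t_near = t_ij_approx(h, tau=0.01, sigma=1.0, n_dims=2, cell_volume=h**2)
t_far = t_ij_approx(5 * h, tau=0.01, sigma=1.0, n_dims=2, cell_volume=h**2)
```

As expected, the probability of reaching a distant cell within the same lag time is many orders of magnitude smaller than that of reaching an adjacent cell.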
Now that we have a closed-form approximation for T_ij(τ) at constant potential, we can use eq. 30 to calculate Φ_grid. Because the Laplacian matrix L is not invertible, we cannot determine Φ_grid by rearranging eq. 30. Instead, we treat Φ as a parameter that minimizes the difference between T_ij(τ) and [exp(−τΦL)]_ij, i.e. we minimize the function

f(Φ) = (T_ij(τ) − [exp(−τΦL)]_ij)²,   (34)

and set Φ_grid = argmin f(Φ). This approach requires calculating the matrix exponential of the potentially large but sparse matrix L, and is tested in the results section.
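A minimal self-consistency sketch of this minimization approach follows. Everything here is synthetic: instead of the closed-form T_ij(τ) of the Wiener process, we generate a target transition probability from a known flux on a toy cycle graph (graph size, lag time and flux value are arbitrary) and check that minimizing f(Φ) recovers it.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize_scalar

# Toy self-consistency test of the "minimization" method:
# build L = D - A for a periodic 1D grid (cycle graph), generate a target
# T_ij(tau) from a known flux phi_true, and recover phi_true by minimizing
# f(Phi) = (T_ij(tau) - [exp(-tau*Phi*L)]_ij)^2.

n, tau, phi_true = 20, 0.1, 3.0
A = np.zeros((n, n))
for i in range(n):                             # cycle-graph adjacency matrix
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0
L = np.diag(A.sum(axis=1)) - A                 # graph Laplacian L = D - A

target = expm(-tau * phi_true * L)[0, 1]       # synthetic stand-in for T_ij(tau)

def f(phi):
    return (target - expm(-tau * phi * L)[0, 1]) ** 2

phi_grid = minimize_scalar(f, bounds=(0.1, 10.0), method="bounded").x
```

Since f(Φ) is smooth and has a unique zero in the chosen interval for this toy setup, a bounded scalar minimizer recovers the flux to high precision.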
It is tempting to avoid the minimization of f (Φ) (eq. 34) by approximating the matrix exponential as a truncated Taylor series, and solving for Φ. Mathematically this is possible.
But the resulting equation for Φ grid is a poor approximation to the true value of Φ grid , and we do not recommend using this approach. Appendix C discusses the details.

III. METHODS
In section II we showed that the grid-dependent flux factor can be expressed in terms of the known parameters σ and h_ij: Φ = σ²/(2 h_ij) (eq. 26). In this section, we summarize methods to evaluate S_ij/V_i for different grid types.
a. Arbitrary grid / exact method. "The Quickhull Algorithm" [33], implemented in the MATLAB function "convhulln()", computes the convex hull of a set of multidimensional points and can be used to numerically calculate the surface S ij and the volume V i of an arbitrarily shaped cell. We will call this the "exact method", because it directly calculates S ij V i without any assumptions on the grid geometry. However, the algorithm requires not only the centers of the cells, but also the vertices of each cell, which makes it computationally expensive for high-dimensional spaces.
b. Hyper-rectangular grid. On a (hyper-)rectangular grid, the ratio between interface surface and cell volume is simply given by the inverse cell length in direction i → j, i.e. S_ij/V_i = 1/h_ij, so that the rate between adjacent cells is

Q_ij, adjacent, rect = (σ²/2) (1/h_ij²) √(π(x_j)/π(x_i)).   (35)

Note that on a (hyper-)cubic grid h_ij = h is the same in all grid dimensions, while for a one-dimensional grid one obtains the equation derived in ref. 29 from the one-dimensional reaction-diffusion equation. Appendix B shows that eq. 35 converges to the Fokker-Planck equation in the limit of infinitesimally small cells [27,28].
c. Hexagonal grid. The apothem a of a cell is the distance from the cell center to the midpoint of one of its sides. On a two-dimensional hexagonal grid, a = h/2, where h = h_ij is the distance between cell centers. Using the apothem we can calculate the intersecting surface, which is equal to the length of each side of the hexagon, S_ij = h_ij/√3. With the cell volume V_i = (√3/2) h_ij², the rate between adjacent hexagonal cells is

Q_ij, adjacent, hex = (σ²/3) (1/h_ij²) √(π(x_j)/π(x_i)).   (36)

d. Voronoi grid via the neighbors-method. On arbitrary Voronoi grids, several methods [40] have been proposed to approximate S_ij/V_i. For example, from the Taylor expansion of a function on an irregular mesh, the rate between adjacent cells can be expressed as [34]

Q_ij, adjacent, Voronoi = (σ²/2) (2 N_D/(n_i h̄_i h_ij)) √(π(x_j)/π(x_i)),   (37)

where n_i is the number of neighbors of the cell Ω_i, and h̄_i is the average distance between the cell Ω_i and its neighbors. On a hypercubic grid, where n_i = 2 N_D and h̄_i = h_ij = h, eq. 37 reduces to eq. 35.
Tab. I summarizes the methods and the grid types to which they apply; the "exact" method computes the ratio of intersecting surface area and cell volume for arbitrary grids. Rate matrices Q are constructed using eq. 28 in combination with the methods in Tab. I. Note that eq. 28 implies that the row-sums of each of these matrices Q are zero, consistent with eq. 27. The usefulness of the SqRA critically depends on how many dimensions N_D the dynamical system in eq. 1 may have before the calculation of Q via eq. 13 becomes computationally intractable. Q is an N × N square matrix, where N is the number of cells Ω_i. If each dimension of the dynamical system is discretized into N_bins bins, the number of cells is given as N = N_bins^{N_D}, i.e. N grows exponentially with the number of dimensions N_D. Thus, even for low-dimensional systems we have to construct a sparse but extremely high-dimensional rate matrix Q.

IV. RESULTS AND DISCUSSION
To compare the computational efficiency of the methods to estimate Φ S_ij/V_i, we devised a model system, which consists of particles of mass m = 1 kg moving in an N_D-dimensional Cartesian space according to eq. 1 with ξ = 1 s⁻¹ and σ = 2.2 J^{1/2} kg^{−1/2} s^{−1/2}, defined on the domain Ω = {(x_1, ..., x_{N_D}) : −π < x_i ≤ π for i = 1, ..., N_D}. We applied periodic boundary conditions in each direction, and the potential energy

V(x) = Σ_{i=1}^{N_D} k_i cos(m_i (x_i − x_{0i}))   (40)

is 2π-periodic in each direction i. The parameter k_i is the force constant, the multiplicity m_i describes the number of barriers and wells of the function, and x_{0i} is the phase. For each direction i, we used the same triplet of parameters k_i = 2 kg s⁻¹, m_i = 2 and x_{0i} = 0 rad. Fig. 1-A shows the potential for the 1-dimensional system, which is a periodic double-well potential. For an N_D-dimensional system, the potential has 2^{N_D} wells in the N_D-dimensional space. Eq. 40 mimics the function that governs torsion angles in MD force fields. By choosing a Cartesian space with periodic boundary conditions rather than an actual torsion angle, we avoid any complications that arise from the coordinate transformation to the torsion angle space, and a volume element in Ω is simply given as dV = dx_1 dx_2 ... dx_{N_D}.
We next scanned the number of bins between 2 and 60 to find the coarsest possible discretization that still yields accurate results for the dominant processes of the one-dimensional system. For N_bins = 5, κ_1 = −1.58, which is 1.2% lower than the reference value, while κ_2 = −2.69 and κ_3 = −5.68 are 6.2% and 11.4% higher than the reference values, respectively. A smaller number of bins yields considerable deviations from the reference values. In spite of the very low resolution of the eigenvector, we can identify the two peaks corresponding to the two wells of the potential.
We constructed grids for up to N_D = 9 dimensions. For the hypercubic grids, we discretized each dimension into N_bins = 5 bins. We included the following methods in this scan: "rectangular" on a hypercubic grid, and "exact", "neighbors", and "minimization" on a Voronoi grid. The method "hexagonal" is excluded, because a hexagonal grid can only be constructed on a two-dimensional Cartesian space. The method "exact" for the hypercubic grid is not shown explicitly, because it is identical to the method "rectangular". For each rate matrix, we calculated the four leading eigenvalues. For this system the second eigenvalue has a degeneracy equal to the number of dimensions. Because they perfectly overlap, there appear to be fewer eigenvalues for the higher-dimensional systems in fig. 1-D. All four methods yield eigenvalues that are in excellent agreement with the reference solution. Thus, at least at this level of discretization, approximating the ratio S_ij/V_i by these methods does not introduce an error of relevant magnitude.
However, the computational cost varies drastically between the methods (fig. 1-E). Three separate tasks go into calculating the eigenvectors and eigenvalues of Q: (i) constructing the adjacency matrix of the grid, from which the Laplacian matrix L of the grid can then be calculated, (ii) calculating Φ S_ij/V_i using one of the four methods, and (iii) calculating the dominant eigenvalue-eigenvector pairs of the resulting matrix Q. The most efficient method is "rectangular" on a hypercubic grid, for which we could calculate rate matrices for up to nine dimensions on a server with an Intel Xeon CPU (E5-2690 v3 @ 2.60 GHz) and 160 GB of RAM. Using MATLAB, the execution time was 45 seconds. We provide an example script in the supplementary material. On hypercubic grids, the distance h_ij between neighboring cells is a constant h, and the factor Φ S_ij/V_i = σ²/(2h²) can be calculated at negligible cost. Moreover, one can build adjacency matrices and construct the matrix Q very efficiently using sparse matrices and the Kronecker product (see supplementary material). Consequently, approximately 90% of the computational time is used up by the third task: solving the eigenvalue problem (fig. 1-E, magenta triangles). The time to solve the eigenvalue problem primarily depends on the dimension of the matrix Q, i.e. the number of cells N. It does not depend on the method of computing the flux, and it only weakly depends on the type of grid. Thus, the computational cost that is displayed as magenta triangles in fig. 1-E is part of every calculation in fig. 1-E. Note that the execution time depends on the algorithm used to solve the eigenvalue problem. The MATLAB function "eigs()" permits specifying the number of eigenvalue-eigenvector pairs to be calculated, which is particularly useful when one is interested only in the slowest dominant processes, which are associated with the largest eigenvalues.
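The Kronecker-product construction of the adjacency matrix can be sketched as follows; this is our own illustration in Python/SciPy rather than the MATLAB script from the supplementary material, and the grid size and dimensionality are arbitrary examples.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

# Illustration (not the paper's script): the adjacency matrix of a periodic
# N_D-dimensional hypercubic grid is the Kronecker sum of 1D periodic
# adjacency matrices, built entirely with sparse operations.

def periodic_1d_adjacency(n_bins):
    c = sp.lil_matrix((n_bins, n_bins))
    for i in range(n_bins):
        c[i, (i - 1) % n_bins] = 1.0
        c[i, (i + 1) % n_bins] = 1.0
    return c.tocsr()

def hypercubic_adjacency(n_bins, n_dims):
    c = periodic_1d_adjacency(n_bins)
    a = c
    for _ in range(n_dims - 1):            # Kronecker sum over dimensions
        a = sp.kron(a, sp.identity(n_bins)) + sp.kron(sp.identity(a.shape[0]), c)
    return a.tocsr()

A = hypercubic_adjacency(n_bins=5, n_dims=3)   # 125 cells, 6 neighbors each
degree = sp.diags(np.asarray(A.sum(axis=1)).ravel())
Q = A - degree                  # rate matrix at constant potential, flux set to 1
top = eigs(Q.tocsc(), k=4, which="LR", return_eigenvectors=False)
```

Requesting only the few eigenvalues with the largest real part (here via ARPACK's `eigs`, analogous to MATLAB's "eigs()") keeps the eigenvalue problem tractable even for large sparse Q; the largest eigenvalue of a generator is zero.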
All three methods to construct the rate matrix on a Voronoi grid are orders of magnitude slower than the "rectangular" method for hypercubic grids, because building adjacency matrices for a Voronoi discretization is computationally difficult and costly. We constructed the adjacency matrices A using an algorithm based on linear programming as suggested in ref. 26. Note that for Voronoi grids, the computational cost of diagonalizing the rate matrix (magenta triangles) makes up only a small fraction of the total calculation.
Among the methods for a Voronoi grid, the "exact" method (green dots in fig. 1-D) is about an order of magnitude more expensive than the methods "neighbors" or "minimization" (blue and red dots in fig. 1-D), because it requires the exact calculation of cell volumes.
Using the same computer as for the "rectangular" method, we were able to build the rate matrix of the five-dimensional system. The calculation took 8.9 × 10^5 s, corresponding to more than ten days of computation. However, the "exact" method is slightly more accurate than the other two methods for Voronoi grids. The methods "neighbors" and "minimization" slightly overestimate the eigenvalues, but their execution time is roughly an order of magnitude shorter.

B. Accuracy
Next, we test whether the methods in Tab. I differ in their accuracy. We consider a particle of mass 1 kg which moves on a two-dimensional Cartesian space according to eq. 1 with ξ = 1 s⁻¹ and σ = 15 J^{1/2} kg^{−1/2} s^{−1/2}. The potential energy function V(x, y), with parameters k_1 = 0.003, a_1 = 3.3, k_2 = 1, a_2 = 3.5, k_12 = −50 and c = 1, is composed of a one-dimensional term which describes a slow double-well dynamics along the x axis, a one-dimensional term which describes a fast double-well dynamics along the y axis, and a coupling term (Fig. 2-A). The eigenvalue spectrum of the Fokker-Planck operator for this system (eq. 2) exhibits four dominant eigenvalues at κ^{2D}_0 = 0, κ^{2D}_1 = −5.98, κ^{2D}_2 = −10.2, and κ^{2D}_3 = −15.06 (fig. 2-B). The corresponding eigenfunctions l_0(x, y) to l_3(x, y) are shown in fig. 2-D. The eigenvector l_0(x, y) is equal to the stationary density.
Eigenvectors l 1 (x, y) and l 2 (x, y) describe slow transitions along the x and y axis, respectively.
Eigenvector l_3(x, y) represents a dynamic process which mixes x and y, and is due to the coupling term in V(x, y). We constructed this reference solution by evaluating the SqRA (eq. 13) on a quadratic grid (eq. 35) with N = 50 × 50 = 2500 cells on the space [−6, 6] × [−6, 6].
Additionally, we constructed an MSM on the same grid. We generated a time-discretized trajectory of 1 × 10^8 time-steps with a time-step ∆t = 0.001, integrating eq. 1 according to the Euler-Maruyama scheme [41]. The MSM was constructed by counting transitions C_ij(τ) from cell Ω_i to cell Ω_j within a lag time τ, varied in the range of 5 to 500 time-steps [5]. Detailed balance was enforced by symmetrizing the resulting 2500 × 2500 count matrix: C_sym(τ) = C(τ) + C^T(τ), where C^T(τ) denotes the transpose of C(τ). The MSM transition matrix T(τ) was obtained by row-normalizing C_sym(τ).
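The count-symmetrize-normalize procedure can be sketched on a synthetic discrete trajectory; the two-state jump process and its switching probability below are made up for illustration and stand in for the binned 2D trajectory used in the text.

```python
import numpy as np

# Illustrative MSM estimation on a synthetic two-state trajectory:
# count transitions at lag tau, symmetrize to enforce detailed balance,
# row-normalize, and convert eigenvalues to rates via kappa = ln(lambda)/tau.

rng = np.random.default_rng(1)
n_steps, p_switch = 20000, 0.01          # made-up chain length and switch rate
dtraj = np.empty(n_steps, dtype=int)
dtraj[0] = 0
for t in range(1, n_steps):              # two-state jump process
    dtraj[t] = 1 - dtraj[t - 1] if rng.random() < p_switch else dtraj[t - 1]

tau = 5
C = np.zeros((2, 2))
for a, b in zip(dtraj[:-tau], dtraj[tau:]):   # sliding-window counts at lag tau
    C[a, b] += 1.0
C_sym = C + C.T                               # enforce detailed balance
T = C_sym / C_sym.sum(axis=1, keepdims=True)  # row-normalize

lam = np.sort(np.linalg.eigvals(T).real)[::-1]
kappa = np.log(lam[1]) / tau                  # slowest implied rate (kappa_0 = 0)
```

The leading eigenvalue of the row-stochastic matrix T is exactly 1 (the stationary process), and the second eigenvalue yields a negative rate κ_1 close to ln(1 − 2 p_switch) for this toy chain.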
The eigenvectors of the MSM transition matrix are defined as l_i T(τ) = λ_i(τ) l_i. The MSM yielded the same dominant eigenvectors as the SqRA rate matrix. The MSM eigenvalues λ_i(τ) and the eigenvalues of the rate matrix can be interconverted by

κ_i = ln(λ_i(τ)) / τ,

and are in excellent agreement (Fig. 2-C). The fact that the ratio ln(λ_i(τ))/τ does not vary with τ indicates that the MSM on this grid has a negligible projection error. Since the SqRA model and the MSM do not deviate from each other, we can assume that also the SqRA model has a negligible projection error. We will therefore use the SqRA model on a regular grid with N = 2500 cells as reference solution for further tests.
To assess whether the method to estimate the flux influences the accuracy of the SqRA rate matrix, we varied the number of grid cells from N = 4 to N = 225. We constructed quadratic grids, hexagonal grids and arbitrary Voronoi grids. For the arbitrary Voronoi grids, we randomly placed grid centers in the two-dimensional state space. To account for the variance in these randomly constructed grids, we constructed fifty different grids for each value of N and constructed the corresponding rate matrices.
In Fig. 3, we report the mean and the variance of the dominant eigenvalues for Voronoi grids, that were calculated using the methods "exact" (green), "neighbors" (blue), and "minimization" (red). Fig. 3 also shows the dominant eigenvalues for the quadratic grid calculated using the method "rectangular" (black) and the hexagonal grid calculated using the method "hexagonal" (orange), as well as the reference value for the eigenvalues (dashed).
The results for the Voronoi grids seem to converge faster than the results for the regular grids. The mean of the eigenvalues for the Voronoi grids is already reasonably accurate for N = 9 or N = 16 grid cells. Note, however, that the variance is sizeable at these low numbers of grid cells and that, depending on the exact location of the grid cells, the Voronoi results can also deviate considerably from the reference value. For N = 25 all five methods yield results that are close to the reference value, and the accuracy of all five methods increases only slowly with increasing N. In fact, between N = 100 and N = 225 we do not find a significant improvement for any of the five methods. This means that, at least for this potential energy function, 25 to 100 grid cells are sufficient to discretize the two-dimensional state space. This is an order of magnitude lower than previously reported discretizations of two-dimensional molecular state spaces that relied on the SqRA [21,42].
For N > 100 the eigenvalues obtained from regular grids are almost exactly equal to the reference values, whereas the results from Voronoi grids tend to overestimate the eigenvalues. This indicates that, if one is interested in a highly accurate estimate of the dominant eigenvalues, one should opt for a regular grid.

V. CONCLUSION
We have derived, from the equation of the overdamped Langevin dynamics at constant potential, the expression for the flux $\Phi$ that appears in the SqRA formula [26,27]. An analogous formula was previously derived for the one-dimensional Smoluchowski equation discretized on a regular grid [29] and later used to estimate the diffusion coefficients of molecular systems projected on one-dimensional relevant coordinates [43,44]. However, molecules at room temperature only access a small fraction of their state space, and the experience with MSMs has shown that Voronoi grids are useful for discretizing the accessible state space [5]. We therefore do not yet want to rule out Voronoi grids. We have compared three methods to calculate $\Phi S_{ij}/V_i$: an "exact" method that aims at calculating $S_{ij}/V_i$ numerically, and two approximate methods. The "neighbors" method is based on an already known interpolation scheme between all neighbors of a given cell $\Omega_i$. The "minimization" method is an approach that we propose in this contribution, based on a comparison to the analytically known transition probability at constant potential. On Voronoi grids, the construction of the adjacency matrix is computationally much more demanding than on regular grids, and for the approximate methods the computational cost is dominated by the construction of the adjacency matrix. With the "exact" method, however, the calculation of $S_{ij}/V_i$ is the most costly step and increases the computational cost of the entire calculation by an order of magnitude. Given that this method only slightly improves the accuracy of the MSM, the "exact" method is not suited for actual applications.
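On a one-dimensional regular grid, assembling the SqRA rate matrix from these ingredients takes only a few lines. The sketch below is a minimal illustration, not the authors' implementation; it assumes the constant-potential flux of a regular grid with spacing $h$, $\Phi = D/h$, so that $\Phi S_{ij}/V_i = D/h^2$:

```python
import numpy as np

def sqra_rate_matrix(V, h, beta=1.0, D=1.0):
    """SqRA rate matrix on a 1D regular grid with cell potentials V.

    Q_ij = Phi * (S_ij / V_i) * sqrt(pi_j / pi_i) for adjacent cells i, j,
    where pi_i ~ exp(-beta * V_i) is the Boltzmann weight of cell i and
    (assumption) Phi * S_ij / V_i = D / h**2 on a regular grid."""
    N = len(V)
    Q = np.zeros((N, N))
    c = D / h**2
    for i in range(N - 1):
        # square root of the Boltzmann-density ratio of neighboring cells
        Q[i, i + 1] = c * np.exp(-beta * (V[i + 1] - V[i]) / 2)
        Q[i + 1, i] = c * np.exp(-beta * (V[i] - V[i + 1]) / 2)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows sum to zero
    return Q

# usage: double-well potential on [-2, 2]
x = np.linspace(-2.0, 2.0, 100)
Q = sqra_rate_matrix((x**2 - 1)**2, h=x[1] - x[0])
```

By construction $\pi_i Q_{ij} = \pi_j Q_{ji}$ (detailed balance), so $Q$ is similar to a symmetric matrix, has real eigenvalues, and its largest eigenvalue is zero.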
With a few seconds of computing time on a single compute server, we could reach $\mathcal{O}(10^6)$ grid cells for regular grids, whereas we need a few days to reach $\mathcal{O}(10^3)$ grid cells for Voronoi grids.
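This gap partly reflects how cheap adjacency is on regular grids: it follows from index arithmetic alone, with no geometric computation. A minimal sketch for hyper-cubic grids (an illustrative helper, not the script from the supplementary material):

```python
import numpy as np

def cubic_adjacency(shape):
    """Boolean adjacency matrix of a hyper-cubic grid with the given shape.

    Cells are flattened in C order; two cells are adjacent if they differ
    by one step along exactly one grid dimension (no periodic boundaries)."""
    N = int(np.prod(shape))
    A = np.zeros((N, N), dtype=bool)
    idx = np.arange(N).reshape(shape)
    for k in range(len(shape)):
        rolled = np.moveaxis(idx, k, 0)   # bring dimension k to the front
        left = rolled[:-1].ravel()        # cells that have a neighbor in +k direction
        right = rolled[1:].ravel()
        A[left, right] = True
        A[right, left] = True
    return A

A = cubic_adjacency((3, 3))   # 2D example: the central cell has 4 neighbors
```

For large grids one would store the result as a sparse matrix instead, but the index arithmetic stays the same.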
Using the "neighbors" or the "minimization" method, $\mathcal{O}(10^4)$ grid cells are within reach for Voronoi grids, and moving the calculation to high-performance compute clusters or GPUs will likely push the limit to $\mathcal{O}(10^5)$ states. With grids of this size, the SqRA becomes useful for small molecules. However, a brute-force discretization of the entire state space of larger molecules would require even larger grids. There are two possible remedies: (i) one discretizes only the accessible state space, or (ii) one projects the dynamics onto a lower-dimensional space and discretizes this space. In the first approach, the accessible state space is best formulated in terms of internal coordinates. The distances between grid cells, and potentially also the cell volumes, then have to be transformed accordingly. In the second approach, one needs to calculate the free-energy surface and the position-dependent flux on the low-dimensional space, for which MD simulations are needed [29,35,43,45]. Additionally, one needs to adjust the SqRA to account for the position-dependent flux. For one-dimensional regular grids, the adjusted SqRA rates are reported in refs. 29 and 43. Note that the second approach is currently limited not by the computational cost of the SqRA, but by the computational cost of the MD simulations. In future work, we will compare these two approaches and apply the SqRA to molecular systems.

VI. SUPPLEMENTARY MATERIAL
See supplementary material for the example script to construct an adjacency matrix for $N_D$-dimensional systems on hyper-cubic grids and to construct the rate matrix $Q$ using the "rectangular" method.

In the following, we provide an alternative derivation of the quantity $\Phi$ that appears in eq. 13. The Fokker-Planck equation (eq. 2) can be written in the form of a continuity equation [35],
$$\partial_t \rho(x,t) = -\nabla \cdot J(x,t)\,,$$
which at constant potential energy function reduces to Fick's second law of diffusion [46],
$$\partial_t \rho(x,t) = D\, \Delta \rho(x,t)\,,$$
where the flux is given by
$$J(x,t) = -D\, \nabla \rho(x,t)\,,$$
with the diffusion constant $D = \sigma^2/2$. To discretize $\nabla \cdot J$ on a Voronoi grid, we apply the Gauss theorem and discretize the surface integral along the sides of the Voronoi cell $\Omega_j$,
$$\int_{\Omega_j} \nabla \cdot J(x,t)\, \mathrm{d}x = \oint_{\partial \Omega_j} J(x,t) \cdot n\, \mathrm{d}S \approx \sum_{i \sim j} S_{ij}\, J_{ji}\,,$$
where $J_{ji} = J \cdot n_{ji}$ is the flux in direction $n_{ji} = \frac{x_i - x_j}{\lVert x_i - x_j \rVert}$. Using the same finite difference as in eq. 21, we obtain
$$J_{ji} \approx -D\, \frac{\rho(x_i,t) - \rho(x_j,t)}{h_{ij}}\,.$$
To convert the continuous probability density evaluated at the cell centers, $\rho(x_j,t)$, to the discrete probability defined on finite cell volumes, we use the relation $\rho_j(t) = \rho(x_j,t)\, V_j$, which yields an equation identical to eq. 24. Thus, also from the view-point of the continuity equation and Fick's laws of diffusion, the flux is given as
$$\Phi = \frac{D}{h_{ij}}\,.$$

We show that the SqRA on a (hyper-)cubic grid converges to the Fokker-Planck equation in the limit of infinitesimally small cells. On these grids, the cell length $h$ is the same in each grid dimension, but the extension to rectangular grids is straightforward.
The rates between adjacent cells are given by eq. 35, which yields the following master equation (eq. 16):
$$\partial_t \rho_j(t) = \frac{\sigma^2}{2h^2} \sum_{i \sim j} \left[ e^{-\frac{\beta}{2}(V_j - V_i)}\, \rho_i(t) - e^{-\frac{\beta}{2}(V_i - V_j)}\, \rho_j(t) \right]. \qquad (B1)$$
We replace the exponential function in eq. B1 by its first-order Taylor expansion, $\exp\!\left(-\frac{\beta}{2}(V_j - V_i)\right) \approx 1 - \frac{\beta}{2}(V_j - V_i)$, and obtain
$$\partial_t \rho_j = \frac{\sigma^2}{2h^2} \sum_{i \sim j} \left[ \left(1 - \frac{\beta}{2}(V_j - V_i)\right)\rho_i - \left(1 - \frac{\beta}{2}(V_i - V_j)\right)\rho_j \right], \qquad (B2)$$
where we omitted the $t$-dependence of $\rho_i(t)$ and $\rho_j(t)$ for the sake of brevity. Next we write
$$\partial_t \rho_j = \frac{\sigma^2}{2h^2} \sum_{i \sim j} \left[ (\rho_i - \rho_j) - \frac{\beta}{2}(V_j - V_i)(\rho_i + \rho_j) \right]. \qquad (B3)$$
We recover the continuous probability density $\rho(x,t)$ from the discrete probabilities using the relation $\rho_i(t) = \rho(x_i,t)\, V_i = \rho(x_i,t)\, h^n$, and the relation $V_i = V(x_i)$ for the potential energy function. The cell volume $h^n$ appears linearly on both sides of the equation and cancels:
$$\partial_t \rho(x_j) = \frac{\sigma^2}{2h^2} \sum_{i \sim j} \left[ \big(\rho(x_i) - \rho(x_j)\big) - \frac{\beta}{2}\big(V(x_j) - V(x_i)\big)\big(\rho(x_i) + \rho(x_j)\big) \right], \qquad (B4)$$
where $x_i$ and $x_j$ are the (still discrete) cell centers, and we omit the $t$-dependence of $\rho(x,t)$ for the sake of brevity.
We remind the reader that $i \sim j$ denotes a sum over all cells $\Omega_i$ which are adjacent to cell $\Omega_j$. On a regular grid, every cell $\Omega_j$ has two neighbors in each grid dimension, centered at $x_j + h\,n_k$ and $x_j - h\,n_k$, where $n_k$ is the unit vector pointing in direction $k$, and $h\,n_k$ is the lattice vector along the $k$th dimension. We now sort the sum in eq. B4 according to the grid dimension $k$ and split it into three terms,
$$\partial_t \rho(x_j) = \frac{\sigma^2}{2} \left[ \frac{1}{h^2}\sum_{i \sim j}\big(\rho(x_i) - \rho(x_j)\big) - \frac{\beta}{2h^2}\sum_{i \sim j}\big(V(x_j) - V(x_i)\big)\rho(x_i) - \frac{\beta}{2h^2}\,\rho(x_j)\sum_{i \sim j}\big(V(x_j) - V(x_i)\big) \right], \qquad (B5)$$
and take the limit $h \to 0$ to recover the differential equation. With this approach, the first term in eq. B5 becomes
$$\lim_{h \to 0} \frac{1}{h^2} \sum_{k=1}^{n} \big[\rho(x_j + h\,n_k) + \rho(x_j - h\,n_k) - 2\rho(x_j)\big] = \sum_{k=1}^{n} \partial_k^2\, \rho(x_j) = \Delta \rho(x_j)\,,$$
where $\partial_k$ denotes the derivative with respect to the $k$th dimension. Similarly,
$$\lim_{h \to 0} \left(-\frac{\beta}{2h^2}\right) \sum_{k=1}^{n} \Big[\big(V(x_j) - V(x_j + h\,n_k)\big)\,\rho(x_j + h\,n_k) + \big(V(x_j) - V(x_j - h\,n_k)\big)\,\rho(x_j - h\,n_k)\Big] = \beta\, \nabla \rho(x_j) \cdot \nabla V(x_j) + \frac{\beta}{2}\, \rho(x_j)\, \Delta V(x_j)\,.$$
The third term in eq. B5 has the following limit:
$$\lim_{h \to 0} \left(-\frac{\beta}{2h^2}\right) \rho(x_j) \sum_{k=1}^{n} \big[2V(x_j) - V(x_j + h\,n_k) - V(x_j - h\,n_k)\big] = \frac{\beta}{2}\, \rho(x_j)\, \Delta V(x_j)\,.$$
In the limit $h \to 0$, eq. B4 thus becomes
$$\partial_t \rho(x,t) = \frac{\sigma^2}{2} \big[\Delta \rho(x,t) + \beta\, \rho(x,t)\, \Delta V(x) + \beta\, \nabla \rho(x,t) \cdot \nabla V(x)\big]\,,$$
which is the Fokker-Planck equation of the overdamped Langevin dynamics.
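In one dimension, this $h \to 0$ convergence can also be checked numerically. The sketch below compares the right-hand side of the discrete SqRA master equation with the Fokker-Planck right-hand side at a single point; the harmonic potential and Gaussian density are purely illustrative choices:

```python
import numpy as np

beta, D = 1.0, 0.5                  # D = sigma^2 / 2
V = lambda x: 0.5 * x**2            # illustrative harmonic potential
rho = lambda x: np.exp(-x**2)       # illustrative (non-stationary) density

def sqra_rhs(x, h):
    """Right-hand side of the SqRA master equation (eq. B1) for the cell at x."""
    out = 0.0
    for dx in (h, -h):
        q_in = D / h**2 * np.exp(-beta * (V(x) - V(x + dx)) / 2)   # neighbor -> cell
        q_out = D / h**2 * np.exp(-beta * (V(x + dx) - V(x)) / 2)  # cell -> neighbor
        out += q_in * rho(x + dx) - q_out * rho(x)
    return out

def fp_rhs(x):
    """Fokker-Planck right-hand side D * [rho'' + beta*rho*V'' + beta*rho'*V']."""
    drho = -2.0 * x * np.exp(-x**2)
    ddrho = (4.0 * x**2 - 2.0) * np.exp(-x**2)
    return D * (ddrho + beta * rho(x) * 1.0 + beta * drho * x)

x0 = 0.7
errors = [abs(sqra_rhs(x0, h) - fp_rhs(x0)) for h in (1e-1, 1e-2, 1e-3)]
# the discretization error decreases towards zero as h is refined
```

The error shrinks roughly like $h^2$, consistent with the second-order finite differences that appear in the limits above.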
Thus, $\Phi_{\text{grid,approx}}(\tau_{\text{opt}})$ cannot be used as a valid approximation of the characteristic flux of the grid. Since eq. C6 likely shows a similar behaviour for arbitrary Voronoi grids, we do not recommend using it, and have not included it in our analysis in the main part of the publication.