Stochastic Chemical Reaction Networks for Robustly Approximating Arbitrary Probability Distributions

We show that discrete distributions on the $d$-dimensional non-negative integer lattice can be approximated arbitrarily well via the marginals of stationary distributions for various classes of stochastic chemical reaction networks. We begin by providing a class of detailed balanced networks and prove that they can approximate any discrete distribution to any desired accuracy. However, these detailed balanced constructions rely on the ability to initialize a system precisely, and are therefore susceptible to perturbations in the initial conditions. We therefore provide another construction based on the ability to approximate point mass distributions and prove that this construction is capable of approximating arbitrary discrete distributions for any choice of initial condition. In particular, the developed models are ergodic, so their limit distributions are robust to a finite number of perturbations over time in the counts of molecules.


Introduction
Chemical reaction networks (CRNs) with mass-action kinetics [7,22] model the behavior of well-mixed chemical solutions and have a wide range of applications in science and engineering. In particular, they are used to study the behavior of natural, industrial, and biological processes. Thus, it is important to understand their mathematical foundations. For systems where the number of molecules is large, stochasticity averages out so the state variables are the real-valued concentrations of molecular species, and dynamics are expressed as ordinary differential equations. These models are well-studied and much is known about their stationary behavior [17]. Furthermore, the dynamics of deterministic CRNs are capable of simulating arbitrary electrical and digital circuits, and there is a sense in which they are capable of universal computation [32,40]. Of increasing importance is the case of solutions with small volume and discrete counts, such as the interiors of biological cells and nanoscale engineered systems. In this case, the state variables are the integer counts of molecules, and dynamics are expressed as continuous-time, discrete-space stochastic processes [7,19].
These stochastic models are well studied. In particular, both their stationary and finite time dynamics are of interest and have been analyzed [1,2,4,5,6,7,8,11,12,13,21,24,25,27,28,29,39,42,43]. Moreover, their computational properties have been explored and it has been shown that they can simulate Turing machines, so long as a small probability of error is allowed [15]. They are therefore capable of universal computation. However, there are known limitations on the dynamics exhibited by certain classes of stochastic CRNs [30,31] and the full repertoire of behaviors accessible to stochastic CRNs has yet to be characterized.
A particular question is: what are the possible fluctuation sizes that can be seen in molecule counts for stochastic CRNs in stationarity? When used to model systems in thermodynamic equilibrium with particle reservoirs, i.e., models that are detailed balanced and have inflows and outflows of each species, the stationary distributions of stochastic CRNs take the form of products of independent Poisson distributions [46]. Hence, in these cases, the magnitude of fluctuations in the population of each molecule type is equal to the square root of the mean, since the mean and variance of a Poisson distribution are equal. In the physical sciences, it is typical to encounter systems with fluctuations on the order of the square root of the mean; this was noted by Schrödinger in What is Life? when discussing the inaccuracy of physical laws. A distribution can be considered as a representation of knowledge; a CRN with parameters chosen such that it generates a specific distribution can be considered to be storing said knowledge. It is therefore reasonable to ask how information stored in CRN distributions can be further processed, manipulated, and acted upon by other CRNs, or how the knowledge can be extracted by interaction with a (proto)cell's environment, in the form of tuning parameters [20,44,45].

Notation
Let us denote the sets of nonnegative integers, reals, and positive reals by $\mathbb{Z}_{\geq 0}$, $\mathbb{R}$, and $\mathbb{R}_{>0}$, respectively. In what follows, let $d \in \mathbb{Z}_{>0}$. For any set $K \subseteq \mathbb{R}$ of real numbers, we denote by $K^d$ the set of vectors with $d$ entries in $K$. We will often refer to the elements of $\mathbb{Z}^d_{\geq 0}$ as states. Let $u, v \in \mathbb{R}^d$ be vectors. We write vectors as rows $v = (v(1), \ldots, v(d))$, where $v(i)$ denotes the $i$th entry of $v$. For $i \in \{1, \ldots, d\}$ we define $e_i$ to be the vector of $\mathbb{Z}^d_{\geq 0}$ with 1 in the $i$th entry and 0 otherwise, i.e. with $e_i(i) = 1$ and $e_i(j) = 0$ for $j \neq i$. If $u(i) \leq v(i)$ for all $i \in \{1, \ldots, d\}$, we write $u \leq v$. We define the following:
$$u^v = \prod_{i=1}^{d} u(i)^{v(i)}, \qquad v! = \prod_{i=1}^{d} v(i)!,$$
with the conventions that $0^0 = 1$ and $0! = 1$. Let $f : \mathbb{Z}^d_{\geq 0} \to \mathbb{R}$ be a function. We define the infinity norm as usual: $\|f\|_\infty = \sup_{x \in \mathbb{Z}^d_{\geq 0}} |f(x)|$.

If $f$ satisfies $f(x) \geq 0$ for each $x \in \mathbb{Z}^d_{\geq 0}$, and $\sum_{x \in \mathbb{Z}^d_{\geq 0}} f(x) = 1$, we say $f$ is a distribution. We call the set $\{x \in \mathbb{Z}^d_{\geq 0} : f(x) > 0\}$ the support of $f$. Finally, we denote the cardinality of a set $S$ with $|S|$.

Model
We are interested in the counts of molecules of different chemical species undergoing different chemical transformations. We use the standard model of stochastic chemical reaction networks in which the dynamics of the counts of the different chemical species is modeled as a continuous-time Markov chain. We begin with the definition of a reaction network, and then characterize the dynamics.
Definition 1. A chemical reaction network (CRN) is a quadruple N = (S, C, R, κ) where S, C, R, κ are defined as follows. S is a finite set of species. C is a finite set of complexes, which are linear combinations of the species with nonnegative integer coefficients. R is a finite set of reactions, that is, a finite subset of C × C with the property that for any y ∈ C we have (y, y) ∉ R. Usually, a reaction (y, y′) is denoted by y → y′, and we adopt this notation. Finally, given an ordering of the reactions, κ is a vector in $\mathbb{R}^{|R|}_{>0}$ that gives every reaction a rate constant. The rate constant of reaction y → y′ will be denoted by $\kappa_{y \to y'}$.
Let S = {A_1, . . . , A_{|S|}} be an ordering of the species. Complexes will be regarded as vectors in $\mathbb{Z}^{|S|}_{\geq 0}$, and we use the following notation for a complex y: $y = \sum_{i=1}^{|S|} y(i) A_i$. A network is said to be reversible if y′ → y ∈ R whenever y → y′ ∈ R, and we write y ⇌ y′ ∈ R for the pair of reactions. We will often summarize the sets of complexes, reactions, and rate constants of a CRN in a reaction diagram, with reactions and their corresponding rate constants denoted in the following manner: $y \xrightarrow{\kappa_{y \to y'}} y'$. For example, consider the reaction network with species S = {A, B, C}, complexes C = {0, A, C, 2A + B}, and reactions R = {2A + B → C, 0 → A, A → 0}. Assume all rate constants are unity, i.e. κ = (1, 1, 1). The quadruple N = (S, C, R, κ) forms a CRN. Moreover, the reaction network includes the reversible pair 0 → A and A → 0, so we can write the set of reactions as R = {2A + B → C, 0 ⇌ A}. The reaction diagram of this CRN is:
$$2A + B \xrightarrow{\;1\;} C, \qquad 0 \;\underset{1}{\overset{1}{\rightleftharpoons}}\; A.$$
The usual stochastic mass action model of a CRN N = (S, C, R, κ) is defined as a continuous-time Markov chain (CTMC), denoted by X(·), where the propensity for reaction y → y′ in state x is given by:
$$\lambda_{y \to y'}(x) = \kappa_{y \to y'} \prod_{i=1}^{|S|} \frac{x(i)!}{(x(i) - y(i))!} \, \mathbb{1}_{\{x \geq y\}}.$$

For example, the propensity of the reaction 2A + B → C in a state x with x(A) molecules of A and x(B) molecules of B is $\kappa \, x(A)(x(A)-1)\, x(B)$. When a reaction y → y′ fires, the state changes by the net vector y′ − y, so the transition rate of the CTMC from x to x′ is given by:
$$q_{x, x'} = \sum_{\substack{y \to y' \in R \\ y' - y = x' - x}} \lambda_{y \to y'}(x).$$
For an initial condition $x_0 \in \mathbb{Z}^{|S|}_{\geq 0}$, we use the following notation for the probability mass function: $P(x, t \,|\, x_0) = P(X(t) = x \,|\, X(0) = x_0)$.
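As an illustration of the stochastic mass action model, the following sketch simulates a CRN with Gillespie's stochastic simulation algorithm. It is our own illustration, not notation from the text: the species ordering, the encoding of complexes as coefficient tuples, and the function names are assumptions. The helper `propensity` implements the mass action propensity $\lambda_{y \to y'}(x)$ displayed above.

```python
import random

def propensity(kappa, y, x):
    # Mass-action propensity: kappa * prod_i x(i)!/(x(i)-y(i))!, zero unless x >= y.
    rate = kappa
    for xi, yi in zip(x, y):
        if xi < yi:
            return 0.0
        for k in range(xi - yi + 1, xi + 1):
            rate *= k
    return float(rate)

def ssa(reactions, x0, t_end, seed=0):
    # Gillespie simulation; `reactions` is a list of (kappa, reactants, products),
    # each complex given as a tuple of coefficients in the chosen species order.
    rng = random.Random(seed)
    x, t = list(x0), 0.0
    while True:
        props = [propensity(k, y, x) for (k, y, _) in reactions]
        total = sum(props)
        if total == 0.0:
            return x  # no reaction can fire
        t += rng.expovariate(total)
        if t > t_end:
            return x
        r = rng.random() * total
        for p, (_, y, yp) in zip(props, reactions):
            r -= p
            if r < 0.0:
                x = [xi - yi + ypi for xi, yi, ypi in zip(x, y, yp)]
                break

# The example network 2A + B -> C, 0 <-> A with unit rate constants,
# species ordered as (A, B, C):
example = [(1.0, (2, 1, 0), (0, 0, 1)),   # 2A + B -> C
           (1.0, (0, 0, 0), (1, 0, 0)),   # 0 -> A
           (1.0, (1, 0, 0), (0, 0, 0))]   # A -> 0
```

For instance, the propensity of 2A + B → C in state x = (3, 2, 0) evaluates to 1 · (3 · 2) · 2 = 12.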

Key concepts
If there is a t > 0 for which P(x′, t|x) > 0, we say that x′ is reachable from x. Note that if x′ is reachable from x then it is possible to reach x′ from x via a finite number of reactions. The reachability class of a state x_0 is the set of states that are reachable from x_0. If there is a distribution π(·|x_0) for which
$$\pi(x|x_0) = \lim_{t \to \infty} P(x, t|x_0), \qquad (1)$$
for all x in the state space, then π(·|x_0) is said to be the limit distribution of the process for initial condition x_0. We note that limit distributions are stationary distributions for the models, i.e. distributions π such that P(·, t) = π(·) for all t ≥ 0 if X(0) is distributed according to π. For a subset V ⊆ S of species, the marginal of π(·|x_0) onto V is:
$$\pi_{\mathcal V}(v|x_0) = \sum_{x' : (x')_{\mathcal V} = v} \pi(x'|x_0),$$
where (x′)_V is the projection of x′ onto the species in V.
Definition 2. Let U be a set of CRNs. We say that U approximates a distribution $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ if for every ε > 0 there exist a CRN (S, C, R, κ) ∈ U, an initial condition $x_0 \in \mathbb{Z}^{|S|}_{\geq 0}$, and a subset of the species V ⊆ S, called the visible species, with |V| = d, such that a limit distribution π(·|x_0) as in (1) exists and $\|\pi_{\mathcal V}(\cdot|x_0) - q\|_\infty < \varepsilon$. If U approximates every distribution on $\mathbb{Z}^m_{\geq 0}$ for every m ≥ 1, then we say that it is universally approximating.

Universal Approximation with Detailed Balanced Networks
Definition 3. Let the CRN N = (S, C, R, κ) be reversible. If there exists a vector $c \in \mathbb{R}^{|S|}_{>0}$ such that
$$\kappa_{y \to y'} \, c^{y} = \kappa_{y' \to y} \, c^{y'} \qquad (3)$$
for each reaction y → y′ ∈ R, then we say that N is detailed balanced.
Remark 1. We employ one of multiple ways of defining detailed balance for CRNs. In the definition above, the vector c is a vector of steady-state concentrations for the deterministic CRN with mass action kinetics that obeys the detailed balance condition [23]. The choice of c may not be unique. In particular, detailed balanced networks have the remarkable property that if one positive equilibrium satisfies the detailed balance condition (3), then all positive equilibria satisfy (3) [12, Theorem 3.10]. If one recognizes that concentrations correspond to $c_i = e^{-G_i/kT}$, where $G_i$ is the free energy of formation associated to the $i$th species, k is Boltzmann's constant, and T is temperature, the definition takes the form of the thermodynamic formula $\kappa_{y \to y'}/\kappa_{y' \to y} = e^{-\Delta G(y \to y')/kT}$, which relates the equilibrium constant to the change in free energy $\Delta G(y \to y') = \sum_{i=1}^{|S|} G_i \, (y'(i) - y(i))$ associated with reaction y → y′.
The stationary distributions for detailed balanced models are well known:
$$\pi(x|x_0) = M_{x_0} \frac{c^x}{x!} \qquad (4)$$
for each x in the reachability class of x_0, where $M_{x_0}$ is the corresponding normalization constant. This result appears in, for example, Theorem 3.2 in [46], or, more recently, as a special case of Theorem 4.1 in [5]. However, versions of the theorem were published as early as 1958 in [10], and 1967 in [33]. The stationary distributions of detailed balanced systems consist exclusively of restrictions of products of Poisson distributions. Yet, as we will show, every distribution with finite support can be expressed as the marginal of the limit distribution of some detailed balanced system.
Construction 1 (Fully connected network). Let d ≥ 1 and let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support $\{x_1, \ldots, x_m\}$. Define the CRN Full(q) = (S, C, R, κ) as follows.
• The set of species is S = {V 1 , . . . , V d , H 1 , . . . , H m }. Note that there is one species V i for each of the dimensions of q and that there is one species H j for each of the points in the support of q.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$\sum_{k=1}^{d} x_i(k) V_k + H_i \;\xrightarrow{\;x_j! \, q(x_j)\;}\; \sum_{k=1}^{d} x_j(k) V_k + H_j, \qquad (5)$$
for i, j ∈ {1, . . . , m}, i ≠ j.
States of the associated continuous-time Markov chain will reside in $\mathbb{Z}^{d+m}_{\geq 0}$, with the first d dimensions corresponding to the visible species V_i and the final m dimensions corresponding to the H_j. We will therefore write states as (v, h), with $v \in \mathbb{Z}^d_{\geq 0}$ and $h \in \mathbb{Z}^m_{\geq 0}$. Note that if the initial condition for Construction 1 is x_0 = (x_1, e_1) (or, more generally, of the form (x_i, e_i), for i ∈ {1, . . . , m}), then at any future time there is precisely one species H_i that has a count of one and the others have a count of zero. Moreover, the network is designed so that if the count of H_i is 1, then the counts of (V_1, . . . , V_d) are exactly x_i.
Lemma 1. Let d ≥ 1 and let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support. Then, Full(q) is detailed balanced and, for initial condition x_0 = (x_1, e_1) and set of visible species V = {V_1, . . . , V_d}, the marginal of the limit distribution satisfies π_V(·|x_0) = q.
Proof. Let us denote the complexes of Full(q) by $y_i = \sum_{k=1}^{d} x_i(k) V_k + H_i$, for i ∈ {1, . . . , m}. The vector $c \in \mathbb{R}^{d+m}_{>0}$ given by
$$c = (1, \ldots, 1, \; x_1! \, q(x_1), \ldots, x_m! \, q(x_m)) \qquad (6)$$
satisfies $\kappa_{y_i \to y_j} c^{y_i} = x_i! \, x_j! \, q(x_i) q(x_j) = \kappa_{y_j \to y_i} c^{y_j}$ for all i, j ∈ {1, . . . , m}. Hence, Full(q) is detailed balanced. The reachability class of initial condition x_0 = (x_1, e_1) is $\{(x_i, e_i) : i \in \{1, \ldots, m\}\}$. Hence, for initial condition x_0, the limit distribution (4) satisfies:
$$\pi((x_i, e_i)|x_0) = M_{x_0} \frac{c^{(x_i, e_i)}}{(x_i, e_i)!} = M_{x_0} \frac{x_i! \, q(x_i)}{x_i!} = M_{x_0} \, q(x_i).$$
Notice that, since q is normalized, we have $M_{x_0} = 1$. Therefore, for the set of visible species V = {V_1, . . . , V_d}, the marginal of the distribution satisfies $\pi_{\mathcal V}(x_i|x_0) = q(x_i)$ for all i ∈ {1, . . . , m}, and it is 0 otherwise.
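The collapse of the product-form weights to q can be checked numerically. The sketch below is our own illustration (the helper name is hypothetical): it computes the stationary weights $c^{(x_i, e_i)}/(x_i, e_i)!$ from formula (4) for the reachability class of Full(q) and recovers the visible marginal.

```python
from math import factorial

def full_network_marginal(support, probs):
    # Stationary marginal of Full(q) on the visible species, from pi = M * c^x / x!
    # restricted to the reachability class {(x_i, e_i)}, with c_V = 1 and
    # c_{H_i} = x_i! * q(x_i).
    weights = []
    for x, qx in zip(support, probs):
        c_h = qx
        for xi in x:
            c_h *= factorial(xi)        # c_{H_i} = x_i! * q(x_i)
        w = c_h
        for xi in x:
            w /= factorial(xi)          # divide by (x_i, e_i)! = x_i! * 1!
        weights.append(w)
    total = sum(weights)                # normalization 1 / M_{x_0}
    return [w / total for w in weights]

support = [(0, 0), (2, 1), (5, 3)]
probs = [0.2, 0.5, 0.3]
marginal = full_network_marginal(support, probs)
```

The factorials cancel exactly, so the computed marginal reproduces q.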
Notice that if we modify the stoichiometry of the reactions in Construction 1, while preserving the net change of molecules of each reaction, the reachability class for the initial condition x_0 = (x_1, e_1) is the same as before. Also, we can scale the rate constants of reversible pairs of reactions by the same factor without affecting the limit distribution. For example, if a state x_i in the support of q has exactly one more molecule of V_k than another state x_j in the support of q, we may use the reactions
$$H_j \;\underset{x_j! \, q(x_j)}{\overset{x_i! \, q(x_i)}{\rightleftharpoons}}\; H_i + V_k$$
for that pair of states, instead of (5). With this choice of rate constants Lemma 1 remains true.
Let d ≥ 1 and consider the set $\mathbb{Z}^d_{\geq 0}$ of states. We say that two states $x, x' \in \mathbb{Z}^d_{\geq 0}$ are adjacent if there exists i ∈ {1, . . . , d} such that |x(i) − x′(i)| = 1, and x(j) = x′(j) for j ≠ i. Let $U \subseteq \mathbb{Z}^d_{\geq 0}$ be a set of states. If U can be connected using only edges that connect adjacent states, we say that U is a cluster. Then, if the support of a distribution is finite and also a cluster, we can modify Construction 1 to use reactions that have at most two molecules in the reactants and similarly for the products, i.e. using only bimolecular reactions.
Construction 2 (Bimolecular network). Let d ≥ 1 and let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support $\{x_1, \ldots, x_m\}$ that is a cluster. Define the CRN Bimol(q) = (S, C, R, κ) with set of species S = {V_1, . . . , V_d, H_1, . . . , H_m} as follows.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$H_j \;\underset{x_j! \, q(x_j)}{\overset{x_i! \, q(x_i)}{\rightleftharpoons}}\; H_i + V_k$$
for i, j ∈ {1, . . . , m} and k ∈ {1, . . . , d} such that x_i(k) = x_j(k) + 1, and the components of x_i and x_j are otherwise equal. That is, when $x_i = x_j + e_k$.
Remark 3. By an argument similar to the proof of Lemma 1, Bimol(q) is detailed balanced with detailed balanced equilibrium c from (6). Moreover, for initial condition x_0 = (x_1, e_1) and set of visible species V = {V_1, . . . , V_d}, the marginal of the limit distribution satisfies π_V(·|x_0) = q.
We can further simplify Construction 2 by removing reactions but preserving the connectivity of the cluster. If we remove edges of a cluster until no edge can be removed without splitting it into two disconnected graphs, the graph that results is a spanning tree.
Definition 4. Let $U \subseteq \mathbb{Z}^d_{\geq 0}$ be a cluster, and let E ⊆ U × U be a simple graph that consists only of edges that connect adjacent states. If E connects all of the elements of U, but no proper subgraph of E connects the elements of U, we say that E is a cluster spanning tree.
For the bimolecular and spanning tree networks we only consider reactions between adjacent states. For example, we assign to the pair x_1 = (5, 3), x_2 = (5, 4) the reversible reaction pair H_1 ⇌ H_2 + V_2. While the bimolecular network includes one reaction pair for every adjacent pair of elements of the support, the spanning tree network includes only as many adjacent pairs as necessary to leave the support connected.
Construction 3 (Spanning tree network). Let d ≥ 1, let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support $\{x_1, \ldots, x_m\}$ that is a cluster, and let E be a cluster spanning tree of the support. Define the CRN SpanTree(q, E) = (S, C, R, κ) with set of species S = {V_1, . . . , V_d, H_1, . . . , H_m} as follows.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$H_j \;\underset{x_j! \, q(x_j)}{\overset{x_i! \, q(x_i)}{\rightleftharpoons}}\; H_i + V_k$$
for i, j ∈ {1, . . . , m} and k ∈ {1, . . . , d} such that (x_i, x_j) ∈ E and x_i(k) = x_j(k) + 1. Notice that E consists of only edges between adjacent states, so we must have that $x_i = x_j + e_k$.
Remark 4. Notice that, similar to Construction 2, SpanTree(q, E) is detailed balanced with detailed balanced equilibrium c from (6). Also, for initial condition x_0 = (x_1, e_1) and set of visible species V = {V_1, . . . , V_d}, the marginal of the limit distribution satisfies π_V(·|x_0) = q.
In Construction 1 the number of reactions is equal to m(m − 1) = O(m²), i.e. it is quadratic in the size of the support of q. In comparison, the number of reactions in Construction 2 is O(dm), since every point of the support has at most 2d adjacent states. Finally, the number of reactions in Construction 3 is equal to 2(m − 1) = O(m), i.e. it is linear in the size of the support of q.
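A cluster spanning tree can be produced with a breadth-first search over the support. The sketch below is our own illustration (function names are hypothetical); it also makes the reaction-count comparison above concrete.

```python
from collections import deque

def cluster_spanning_tree(states):
    # BFS spanning tree of a cluster; edges join states that differ by exactly 1
    # in exactly one coordinate. Raises if the states do not form a cluster.
    states = [tuple(s) for s in states]
    members = set(states)
    d = len(states[0])
    root = states[0]
    seen = {root}
    edges = []
    queue = deque([root])
    while queue:
        s = queue.popleft()
        for k in range(d):
            for step in (-1, 1):
                nb = s[:k] + (s[k] + step,) + s[k + 1:]
                if nb in members and nb not in seen:
                    seen.add(nb)
                    edges.append((s, nb))
                    queue.append(nb)
    if len(seen) != len(states):
        raise ValueError("support is not a cluster")
    return edges

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tree = cluster_spanning_tree(square)
m = len(square)
counts = {"full": m * (m - 1), "spanning_tree": 2 * (m - 1)}
```

For this 4-point cluster, Construction 1 would use 12 reactions while Construction 3 needs only 6 (one reversible pair per tree edge).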
Lemma 2. Let d ≥ 1, let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution, and let ε > 0. Then, there exists a distribution $q' : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ with finite support that satisfies $\|q - q'\|_\infty < \varepsilon$.
Proof. Consider an ordering $x_1, x_2, \ldots$ of the points in the state space $\mathbb{Z}^d_{\geq 0}$, and choose n large enough that $\sum_{i > n} q(x_i) < \varepsilon$. Define $q'(x_i) = q(x_i)$ for i < n, $q'(x_n) = \sum_{i \geq n} q(x_i)$, and q′(x) = 0 otherwise. Finally, notice that q′ satisfies:
$$\|q - q'\|_\infty \leq \sum_{i > n} q(x_i) < \varepsilon,$$
as desired.
Theorem 3. The set of fully connected network CRNs given by Construction 1 is universally approximating.
Proof. For d ≥ 1, let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution, and let ε > 0. By Lemma 2 we know that there exists a distribution $q' : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ with finite support that satisfies $\|q - q'\|_\infty < \varepsilon$. Consider the CRN Full(q′) as given in Construction 1. We know from Lemma 1 that for V = {V_1, . . . , V_d} the marginal of the limit distribution for initial condition x_0 = (x_1, e_1) satisfies π_V(·|x_0) = q′. Therefore, since q and ε were arbitrary, the set of fully connected network CRNs is universally approximating.
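The truncation in Lemma 2 is easy to carry out in code. The sketch below is our own (it represents the distribution as a finite dict standing in for q): it keeps high-probability states until the tail mass drops below ε and moves the leftover mass onto the heaviest kept state.

```python
def truncate(q, eps):
    # q: dict mapping states to probabilities (summing to 1); returns q' with
    # finite support and sup-norm distance from q strictly below eps.
    items = sorted(q.items(), key=lambda kv: -kv[1])
    kept = {}
    tail = 1.0
    for x, p in items:
        kept[x] = p
        tail -= p
        if tail < eps:
            break
    # every dropped state has probability <= tail < eps; dumping the leftover
    # mass onto the heaviest kept state moves it by less than eps as well
    heaviest = items[0][0]
    kept[heaviest] += 1.0 - sum(kept.values())
    return kept

# geometric-flavored example on {0,...,49}
z = sum(0.7 ** i for i in range(50))
q = {i: 0.7 ** i / z for i in range(50)}
qp = truncate(q, 0.01)
```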
Remark 5. The proof of the theorem above can be carried out using Constructions 2 or 3 instead of 1. If the support of a distribution is not a cluster one may connect the disconnected components using a finite number of edges and distribute a sufficiently small amount of probability throughout the states along the edges that were added.
In order for any of the above constructions to produce the appropriate limit distribution, it is necessary to provide them with the right initial condition. Suppose instead that we wish to have a CRN whose limit distribution is independent of initial conditions. In particular, the process would be able to reach the support of its limit distribution from any state. Moreover, if the CRN satisfies detailed balance, then a limit distribution exists [3] and it is a product of Poisson distributions [46]. Therefore, such a model could not be universally approximating if it satisfies detailed balance, and a new construction is needed if one wishes to remove the dependence on initial conditions.

Universal Approximation with Robust Networks
We shift our attention to systems that have a unique limit distribution. In particular, for these systems, the limit distribution is independent of the initial condition, and is therefore robust to an arbitrary finite number of perturbations in the counts of species over time.
Definition 5. Consider a CRN N for which the limit distribution π(·|x_0) exists for every initial condition x_0. We say that N is robust if the limit distribution π(·|x_0) does not depend upon x_0. Since the limit distribution is then unique, we write π(·) and omit the initial condition.
Remark 6. Notice that our definition of robustness corresponds to the definition of an ergodic stochastic process in the probability theory literature.
Unlike in the detailed balanced case considered in the previous section, we do not have a general form for the limit distribution of robust systems. Instead, we will provide a construction of a robust CRN that can approximate a given point mass distribution. We will then "embed" this point mass construction in a larger robust construction that is capable of approximating an arbitrary distribution. Let $\delta_x$ denote the point mass distribution at $x \in \mathbb{Z}^d_{\geq 0}$, i.e. $\delta_x(x) = 1$ and $\delta_x(x') = 0$ for $x' \neq x$.
Construction 4 (Point mass network). Let d ≥ 1, $x \in \mathbb{Z}^d_{\geq 0}$, and ε > 0. Define the CRN PointMass(x, ε) = (S, C, R, κ) as follows.
• The set of species is S = {V_1, . . . , V_d}.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$0 \xrightarrow{\;1\;} V_i, \qquad (x(i)+1)V_i \xrightarrow{\;\frac{2d}{\varepsilon (x(i)+1)!}\;} x(i)V_i, \qquad \text{if } x(i) \geq 1,$$
$$V_i \xrightarrow{\;1\;} 0, \qquad 2V_i \xrightarrow{\;1\;} 0, \qquad \text{if } x(i) = 0,$$
for i ∈ {1, . . . , d}. With this choice, the ratio of the birth propensity to the death propensity never exceeds ε/(2d) on the relevant states, which is what the proof of Lemma 4 below requires. Note that for every i ∈ {1, . . . , d}, exactly one of the two reactions 0 → V_i or 2V_i → 0 is present in Construction 4.
Lemma 4. Let d ≥ 1, $x \in \mathbb{Z}^d_{\geq 0}$, and ε > 0. Then, the CRN PointMass(x, ε) is robust and its limit distribution π satisfies $\|\pi - \delta_x\|_\infty < \varepsilon$.
Proof. Notice that PointMass(x, ε) consists of d decoupled subnetworks, each of which controls the counts of some V_i independently of all the others. We will therefore focus on one such subnetwork and later generalize. Let π_i be the stationary distribution of the subnetwork that keeps track of the counts of V_i. If x(i) = 0, then the subsystem consists only of the death reactions V_i → 0 and 2V_i → 0, whose unique absorbing state is 0, so that π_i(0|m) = 1 for all m ≥ 0 and the result is shown. Otherwise, we have x(i) > 0 and the subsystem is a birth–death process when restricted to the closed set Υ = {m ∈ Z : m ≥ x(i)}. Note that the set Υ is almost surely reached from any initial condition (because of the reaction 0 → V_i). Hence, we can compute the unique limit distribution explicitly via the standard birth–death formula:
$$\pi_i(x(i) + n \,|\, m) = \frac{1}{M_i} \prod_{\ell = 1}^{n} \frac{\lambda_{x(i)+\ell-1}}{\mu_{x(i)+\ell}} \qquad (10)$$
for any m, n ≥ 0, where $\lambda_k$ and $\mu_k$ denote the birth and death rates of the process in state k and $M_i$ is the corresponding normalization constant, and π_i(n′|m) = 0 for n′ ≤ x(i) − 1.
Since the subsystems behave independently, the joint limit probability will simply be the product of the marginal limit probabilities:
$$\pi(x'|m) = \prod_{i=1}^{d} \pi_i(x'(i) \,|\, m(i)) \qquad (12)$$
for any $x', m \in \mathbb{Z}^d_{\geq 0}$. We will now show that $\|\pi - \delta_x\|_\infty < \varepsilon$. Since $\|p - q\|_\infty \leq 2$ for any probability distributions p, q, we can assume that ε < 2 ≤ 2d. Note that $|\pi(x) - \delta_x(x)| = 1 - \pi(x)$, and for all x′ ≠ x, $|\pi(x') - \delta_x(x')| = \pi(x')$. Also, notice that:
$$\|\pi - \delta_x\|_\infty = \max\Big\{1 - \pi(x), \; \max_{x' \neq x} \pi(x')\Big\} = 1 - \pi(x). \qquad (13)$$
Utilizing equation (10) in the case n = 0 and (12), we deduce that π(x) = 1/M, where $M = \prod_i M_i$. By the choice of rate constants, the ratio of consecutive terms in each $M_i$ is at most ε/(2d), so that $M \leq (1 - \varepsilon/(2d))^{-d}$. Recall that, in general, for variables a and b we have
$$a^d - b^d = (a - b) \sum_{k=0}^{d-1} a^k b^{d-1-k},$$
so for a = 1 and b = 1 − ε/(2d) we have
$$1 - \left(1 - \frac{\varepsilon}{2d}\right)^{d} = \frac{\varepsilon}{2d} \sum_{k=0}^{d-1} \left(1 - \frac{\varepsilon}{2d}\right)^{k} < \frac{\varepsilon}{2d} \, d = \frac{\varepsilon}{2},$$
where the inequality is satisfied because 0 < 1 − ε/(2d) < 1 if ε < 2d, and there are d terms in the sum. Therefore, the following is true:
$$\pi(x) \geq \left(1 - \frac{\varepsilon}{2d}\right)^{d} > 1 - \frac{\varepsilon}{2},$$
which, when combined with (13), concludes the proof.
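The one-species module of the point mass network can be checked numerically. Under our assumed rate choice (constant birth rate and a large death rate constant, matching the ratio bound used in the proof; the function name is hypothetical), the sketch below computes the exact stationary distribution of the birth–death chain restricted to {x(i), x(i)+1, . . .} and verifies that it concentrates at the target count.

```python
from math import factorial

def point_mass_module_dist(target, death_const, birth_const=1.0, cutoff=60):
    # Stationary distribution of 0 -> V (rate birth_const) together with
    # (target+1)V -> target*V (rate constant death_const), truncated at `cutoff`.
    # Restricted to counts m >= target this is a birth-death chain with
    # pi(m+1)/pi(m) = birth_const / (death_const * (m+1)!/(m-target)!).
    pi = [0.0] * (cutoff + 1)
    pi[target] = 1.0   # states below `target` carry no stationary mass
    for m in range(target, cutoff):
        death = death_const * factorial(m + 1) / factorial(m - target)
        pi[m + 1] = pi[m] * birth_const / death
    z = sum(pi)
    return [p / z for p in pi]

# target count 3 with a death constant large enough that the step ratio
# at the target is 1/240
dist = point_mass_module_dist(target=3, death_const=10.0)
```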
Remark 7. Lemma 4 holds true (with an almost identical proof) if Construction 4 is replaced with a variant that uses the reactions 0 → V_i and (x(i)+1)V_i → x(i)V_i for every i ∈ {1, . . . , d}, in which a distinction between null and positive entries of x is not made. However, if such a distinction is made, as in Construction 4, then the approximation of the target limit distribution is more accurate (since the marginal limit distributions of the null entries of x are exactly the point mass distribution at 0). Moreover, the reactions of type 2V_i → 0, utilized when x(i) = 0, provide faster convergence to the limit distribution, which will be essential to obtain the main result of this paper, Theorem 5 (which in turn implies Theorem 6). More details on how the convergence rate is used to prove the result are given in the Appendix.
Construction 5 (Point mass mixture network). Let d ≥ 1, let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support $\{x_1, \ldots, x_m\}$, and let δ > 0. Define the CRN PointMassMix(q, δ) with set of species S = {V_1, . . . , V_d, H_1, . . . , H_m} as follows. In this case, each of the species H_1, . . . , H_m will serve as a catalyst for a network that generates a point mass distribution centered at the corresponding element of the support.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$0 \;\underset{\;\delta\;}{\overset{\;\delta^2 q(x_i)\;}{\rightleftharpoons}}\; H_i, \qquad i \in \{1, \ldots, m\},$$
together with the reactions of PointMass(x_i, δ) with H_i added as a catalyst, i.e. with H_i added to both the reactant and the product complexes of each reaction, for i ∈ {1, . . . , m}.
Remark 8. If δ is small enough, creation of catalyst species is much slower than destruction, and the probability that there is more than one catalyst molecule present at any time can be made arbitrarily small. Furthermore, once a catalyst species is present, if the destruction rate δ is slow enough, the number of catalysts will remain unchanged long enough for the corresponding point mass network to approach its limit distribution.
Theorem 5. Let d ≥ 1 and let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution with finite support. Then, (i) for every δ > 0 the CRN PointMassMix(q, δ) is robust, and (ii) for any ε > 0 there exists δ > 0 such that, for the set of visible species V = {V_1, . . . , V_d}, the limit distribution π^δ of PointMassMix(q, δ) satisfies $\|\pi^\delta_{\mathcal V} - q\|_\infty < \varepsilon$.
The proof of Theorem 5 is given in Appendix A.6. Here we state and prove an immediate important consequence.
Theorem 6. The set of all robust CRNs is universally approximating.
Proof. For d ≥ 1, let $q : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ be a distribution, and let ε > 0. By Lemma 2 we know that there exists a distribution $q' : \mathbb{Z}^d_{\geq 0} \to [0, 1]$ with finite support that satisfies $\|q - q'\|_\infty < \varepsilon$. Consider the CRN PointMassMix(q′, δ), where, by Theorem 5, δ > 0 is chosen such that, for the set of visible species V = {V_1, . . . , V_d}, the limit distribution satisfies $\|\pi^\delta_{\mathcal V} - q'\|_\infty < \varepsilon - \|q - q'\|_\infty$. Finally, by the triangle inequality we have
$$\|\pi^\delta_{\mathcal V} - q\|_\infty \leq \|\pi^\delta_{\mathcal V} - q'\|_\infty + \|q' - q\|_\infty < \varepsilon.$$

More general constructions
In this section we explore generalizations of the constructions used in the previous sections. Theorem 5 states that by following Construction 5 we can design robust CRNs whose limit distribution approximates a given distribution, with any given accuracy and for any chosen distribution. Hence, the aim of generalizing Constructions 4 and 5 does not reside in exploring a larger set of distributions to approximate, but rather in comparing different CRNs with similar stationary distributions. As an example, let q be the Poisson distribution with mean κ. Using the construction provided above, we can design a robust CRN whose limit distribution is arbitrarily close to q by truncating q to a finite support Υ as in Lemma 2, and by using Construction 5 to approximate the truncated q. Note that Construction 5 gives a CRN with |Υ| + 1 species, 4|Υ| reactions, and with high values of the molecularity (i.e. $\max_{y \in C} \|y\|_1$) and of the logarithm of the rate constants, $|\log \kappa_{y \to y'}|$, to obtain its high accuracy. However, the distribution q can be obtained exactly as the limit distribution of the robust CRN
$$0 \;\underset{1}{\overset{\kappa}{\rightleftharpoons}}\; A,$$
which has a single species and only two reactions. Hence, natural questions in terms of complexity arise: given a distribution q and a parameter ε > 0, what is the minimal size of a robust CRN (in terms of number of species, number of reactions, and magnitude of the rate constants) whose limit distribution π satisfies $\|\pi - q\|_\infty < \varepsilon$?
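The claim about the network 0 ⇌ A can be verified directly: the detailed-balance recursion for its stationary distribution is exactly the Poisson recursion. A small numerical check of ours, on a truncated state space:

```python
from math import exp, factorial

def birth_death_stationary(kappa, cutoff=80):
    # Stationary distribution of 0 ->(kappa) A, A ->(1) 0, truncated at `cutoff`:
    # detailed balance gives pi(m+1)/pi(m) = kappa/(m+1), the Poisson recursion.
    pi = [1.0]
    for m in range(cutoff):
        pi.append(pi[-1] * kappa / (m + 1))
    z = sum(pi)
    return [p / z for p in pi]

kappa = 4.0
pi = birth_death_stationary(kappa)
poisson = [exp(-kappa) * kappa ** m / factorial(m) for m in range(len(pi))]
```

With mean 4 and cutoff 80, the truncation error is far below floating-point precision, so the two vectors agree essentially exactly.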
We begin by formulating a generalization of Construction 4. The generalization, detailed in Construction 6 below, allows us to design robust CRNs whose limit distributions are arbitrarily close to the uniform distributions on d-dimensional boxes (see Proposition 7). The molecularity and the logarithm of the rate constants are similar to those of Construction 5, but the number of species utilized is only d, and the number of reactions is 2d.
Remark 9. Note that the point mass distribution at x is a particular case of the uniform distribution, as it can be regarded as the uniform distribution over [x, x].
Construction 6 (Multidimensional uniform network). Let d ≥ 1, let $a \leq b \in \mathbb{Z}^d_{\geq 0}$, and let δ > 0. Define the CRN MultiDimUnif(a, b, δ) with set of species {V_1, . . . , V_d} as follows.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$0 \xrightarrow{\;1\;} V_i, \qquad (b(i)+1)\,V_i \xrightarrow{\;1/\delta\;} a(i)\,V_i, \qquad i \in \{1, \ldots, d\},$$
so that the count of V_i increases by one at a constant rate and, upon exceeding b(i), is quickly reset to a(i).
Remark 10. Note that PointMass(x, δ) can be regarded as a particular case of Construction 6. In particular, the choice a = b = x yields the reactions 0 → V_i and (x(i)+1)V_i → x(i)V_i of Construction 4, up to the choice of rate constants.
Proposition 7. Let $a \leq b \in \mathbb{Z}^d_{\geq 0}$. Then, for any choice of δ > 0 the CRN MultiDimUnif(a, b, δ) is robust. Moreover, if we denote by π^δ its limit distribution and by q the uniform distribution over [a, b], we have
$$\lim_{\delta \to 0} \|\pi^\delta - q\|_\infty = 0.$$
A proof of the proposition is given in Section A.3 in the Appendix, together with a sharper estimate on the distance between π^δ and q.
The second, and probably more important, generalization we deal with in this section is the following. In Construction 5 we combine different robust CRNs whose limit distributions are close to point mass distributions to obtain a new robust CRN whose limit distribution is close to a mixture of point mass distributions. In general, given a finite number of robust CRNs with limit distributions π_i, we want to be able to design a new robust CRN that combines them, and whose limit distribution is arbitrarily close to the mixture of the distributions π_i. This can be accomplished under some general conditions, and the precise result is stated in Theorem 8. The assumptions of Theorem 8 have a slightly more technical nature than those of Theorem 5, which is a particular case. Note that by the known theory on detailed balanced CRNs, Lemma 4, and Proposition 7, it follows that robust CRNs whose limit distributions are arbitrarily close to a mixture of Poisson distributions, point mass distributions, and uniform distributions are readily available, and involve fewer species and reactions than those of Construction 5. The precise statement of the result is given in Theorem 9.
Before stating Theorem 8, we introduce a new construction (which is a generalization of Construction 5) and necessary concepts from the theory of stochastic processes.
Definition 6. Consider a robust CRN with d species, limit distribution π, and let ε > 0. We define the mixing time at level ε to be the quantity
$$\tau^\varepsilon = \inf\Big\{t \geq 0 : \sup_{x_0 \in \mathbb{Z}^d_{\geq 0}} \|P(\cdot, t | x_0) - \pi(\cdot)\|_\infty \leq \varepsilon\Big\}.$$
Figure 3: Example of general mixing. The reaction diagram on the two left columns is an example of general mixing as given by Construction 7, with its target distribution on the right. Notice that whereas the CRN with limit distribution shown in the third panel of Figure 2 has 6 reactions per pixel, giving a total of 6 × 50 × 50 = 1.5 × 10⁴ reactions, the above distribution is generated using only 34 reactions despite the size of its support being larger, namely 100 × 100. In general, the size of the support of a distribution is not an indication of its complexity in terms of the CRNs that generate it.

The CRNs PointMass(x, δ) and MultiDimUnif(a, b, δ) have finite mixing times τ ε , for any δ, ε > 0. This is proven in Lemma A.2 in Section A.2 of the Appendix.
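Mixing times of small CRNs can also be estimated numerically. The sketch below is our own illustration (the function names are hypothetical; uniformization is a standard technique, not notation from the text): it computes P(·, t|x_0) for a finite-state truncation of the chain 0 ⇌ A and reports the first time on a grid at which the worst-case sup-norm distance to the stationary distribution falls below ε.

```python
from math import exp

def ctmc_dist_at(Q, x0, t, terms=400):
    # Distribution at time t of a finite CTMC with generator Q (rows sum to 0),
    # started at x0, computed by uniformization: P(t) = sum_k Pois(lam*t; k) P^k.
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n)) + 1e-9
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)] for i in range(n)]
    v = [1.0 if i == x0 else 0.0 for i in range(n)]
    out = [0.0] * n
    weight = exp(-lam * t)
    for k in range(terms):
        for i in range(n):
            out[i] += weight * v[i]
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / (k + 1)
    return out

def mixing_time_on_grid(Q, pi, eps, t_grid):
    # First t in t_grid with sup_{x0} ||P(., t | x0) - pi||_inf <= eps, else None.
    n = len(Q)
    for t in sorted(t_grid):
        worst = max(
            max(abs(a - b) for a, b in zip(ctmc_dist_at(Q, x0, t), pi))
            for x0 in range(n)
        )
        if worst <= eps:
            return t
    return None

# truncated 0 <-> A chain (birth rate 2, unit per-molecule death) on {0,...,9}
n, kappa = 10, 2.0
Q = [[0.0] * n for _ in range(n)]
for m in range(n):
    if m + 1 < n:
        Q[m][m + 1] = kappa       # 0 -> A
    if m > 0:
        Q[m][m - 1] = float(m)    # A -> 0, propensity m
    Q[m][m] = -sum(Q[m])
pi = [1.0]
for m in range(n - 1):
    pi.append(pi[-1] * kappa / (m + 1))
z = sum(pi)
pi = [p / z for p in pi]
tau = mixing_time_on_grid(Q, pi, eps=0.01, t_grid=[1, 2, 4, 8, 16])
```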
Definition 7. We say that a CRN is explosive if there exists an initial condition x_0 such that, with positive probability, the associated Markov chain undergoes infinitely many jumps within some finite time t > 0. We say that the CRN is non-explosive otherwise.
Construction 7 (General mixture network). Let F = {N_1, . . . , N_m} be a finite ordered set of CRNs with the same set of species V = {V_1, . . . , V_d}, let $\zeta \in \mathbb{R}^m_{>0}$ with $\sum_{i=1}^m \zeta(i) = 1$, and let δ > 0. Define the CRN Mix(F, ζ, δ) with set of species V ∪ {H_1, . . . , H_m} by the reaction diagram containing
$$0 \;\underset{\;\delta\;}{\overset{\;\delta^2 \zeta(i)\;}{\rightleftharpoons}}\; H_i, \qquad i \in \{1, \ldots, m\},$$
together with, for each reaction y → y′ of N_i with rate constant κ, the catalyzed reaction $H_i + y \xrightarrow{\;\kappa\;} H_i + y'$.
Theorem 8. Let F = {N_1, . . . , N_m} be a finite ordered set of m CRNs with the same set of species V = {V_1, . . . , V_d}. Assume that each CRN N_i is robust, and denote by π_i its limit distribution and by τ^ε_i its mixing time at level ε > 0. Let $\zeta \in \mathbb{R}^m_{>0}$ be such that $\sum_{i=1}^m \zeta(i) = 1$, and assume that for every ε > 0,
$$\max_{1 \leq i \leq m} \tau^\varepsilon_i < \infty.$$
Moreover, assume that for every δ > 0 the CRN Mix(F, ζ, δ) is non-explosive. Then, for every δ > 0 the CRN Mix(F, ζ, δ) is robust, and, if we denote by π^δ the limit distribution of Mix(F, ζ, δ) and by π^δ_V its marginal onto V, we have
$$\lim_{\delta \to 0} \Big\| \pi^\delta_{\mathcal V} - \sum_{i=1}^m \zeta(i) \, \pi_i \Big\|_\infty = 0.$$
In order to choose δ such that the distance is smaller than a given quantity, it is important to have upper bounds on the mixing times τ^ε_i. In Appendix B, such bounds are developed for the CRN PointMass(x, δ). We have seen in Theorem 5 (which is a consequence of Theorem 8) how a family of point mass networks can be used to construct a robust CRN whose limit distribution is arbitrarily close to a given distribution q.
However, the application of Theorem 8 does not need to be limited to point mass networks. Before stating the next result, which is more general than Theorem 5, we introduce a choice of CRN constructions with a product-form Poisson as limit distribution. Note that, due to [5], many different choices are possible. Here we choose one with a finite mixing time, so that Theorem 8 can be applied.
Construction 8 (Poisson network). Let d ≥ 1 and $c \in \mathbb{R}^d_{>0}$. Define the CRN Poiss(c) with set of species {V_1, . . . , V_d} as follows.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described by:
$$0 \;\underset{1}{\overset{c(i)}{\rightleftharpoons}}\; V_i, \qquad i \in \{1, \ldots, d\}.$$
Theorem 9. Let {π_1, . . . , π_m} be a family of distributions on $\mathbb{Z}^d_{\geq 0}$ such that there is a partition {I_1, I_2, I_3} of {1, . . . , m} satisfying the following: • for all i ∈ I_1, π_i is a point mass distribution at some $x_i \in \mathbb{Z}^d_{\geq 0}$; • for all i ∈ I_2, π_i is a uniform distribution over [a(i), b(i)] for some $a(i) \leq b(i) \in \mathbb{Z}^d_{\geq 0}$; • for all i ∈ I_3, π_i is a product-form Poisson distribution with mean $c(i) \in \mathbb{R}^d_{>0}$.

Let $\zeta \in \mathbb{R}^m_{>0}$ be such that $\sum_{i=1}^m \zeta(i) = 1$ and, for δ > 0, let $F_\delta = \{N^\delta_1, \ldots, N^\delta_m\}$ be the family of reaction networks with common set of species V = {V_1, . . . , V_d} given by $N^\delta_i = \mathrm{PointMass}(x_i, \delta)$ for i ∈ I_1, $N^\delta_i = \mathrm{MultiDimUnif}(a(i), b(i), \delta)$ for i ∈ I_2, and $N^\delta_i = \mathrm{Poiss}(c(i))$ for i ∈ I_3.
Then, for any δ > 0 the CRN Mix(F_δ, ζ, δ) is robust. Moreover, if π^δ denotes its limit distribution and π^δ_V its marginal onto V, then
$$\lim_{\delta \to 0} \Big\| \pi^\delta_{\mathcal V} - \sum_{i=1}^m \zeta(i) \, \pi_i \Big\|_\infty = 0.$$
The proofs of Theorem 8 and Theorem 9 are given in Sections A.1 and A.5 of the Appendix, respectively.
As a final remark, note that in Constructions 4, 6, and 8, the species do not interact (i.e. two different species are never involved in the same reaction). As a consequence, the counts of the different species evolve independently. It is therefore straightforward to obtain CRNs whose limit distribution approximates products of Poisson distributions and uniform distributions. It is indeed sufficient to consider different constructions for different species: consider for example the distribution q on $\mathbb{Z}^2_{\geq 0}$ given by $q(v_1, v_2) = q_1(v_1) q_2(v_2)$, where q_1 is uniform over {a, a + 1, . . . , b} (say with 1 ≤ a ≤ b) and q_2 is Poisson with mean c. Then, q is approximated by the limit distribution of the CRN obtained by applying, for the species V_1, the reactions of MultiDimUnif(a, b, δ) and, for the species V_2, the reactions of Poiss(c). The design of the CRN in Figure 2 follows this idea.
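The independence claim can be verified numerically: for two non-interacting modules, the generator of the joint chain is the Kronecker sum of the module generators, and the product of the module stationary distributions is stationary for the joint chain. A small check of our own, on truncated state spaces (function names are hypothetical):

```python
def bd_generator(up, down):
    # Birth-death generator on {0,...,n-1}; up[m]: rate m->m+1, down[m]: rate m->m-1.
    n = len(up)
    Q = [[0.0] * n for _ in range(n)]
    for m in range(n):
        if m + 1 < n:
            Q[m][m + 1] = up[m]
        if m > 0:
            Q[m][m - 1] = down[m]
        Q[m][m] = -sum(Q[m])   # diagonal makes the row sum to zero
    return Q

def bd_stationary(up, down):
    # Detailed balance: pi(m+1)/pi(m) = up[m]/down[m+1].
    pi = [1.0]
    for m in range(len(up) - 1):
        pi.append(pi[-1] * up[m] / down[m + 1])
    z = sum(pi)
    return [p / z for p in pi]

def joint_generator(Q1, Q2):
    # Kronecker sum: the two components jump independently, one at a time.
    n1, n2 = len(Q1), len(Q2)
    N = n1 * n2
    Q = [[0.0] * N for _ in range(N)]
    for i in range(n1):
        for j in range(n2):
            s = i * n2 + j
            for i2 in range(n1):
                Q[s][i2 * n2 + j] += Q1[i][i2]
            for j2 in range(n2):
                Q[s][i * n2 + j2] += Q2[j][j2]
    return Q

# module 1: 0 <-> V1 with Poisson(3) stationary; module 2: Poisson(1.5)
up1, down1 = [3.0] * 12, [float(m) for m in range(12)]
up2, down2 = [1.5] * 12, [float(m) for m in range(12)]
Q1, Q2 = bd_generator(up1, down1), bd_generator(up2, down2)
pi1, pi2 = bd_stationary(up1, down1), bd_stationary(up2, down2)
Q = joint_generator(Q1, Q2)
pi = [pi1[i] * pi2[j] for i in range(12) for j in range(12)]
# stationarity residual: max_t | sum_s pi(s) Q[s][t] |
residual = max(abs(sum(pi[s] * Q[s][t] for s in range(144))) for t in range(144))
```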

Acknowledgment
The current structure of the project was conceived when the four authors participated in the BIRS 5-day Workshop "Mathematical Analysis of Biological Interaction Networks." Parts of the proofs for the present work were then completed while two of the authors were taking part in the AIM SQuaRE workshop "Dynamical properties of deterministic and stochastic models of reaction networks." We thank the Banff International Research Station and the American Institute of Mathematics for making this possible.
Anderson gratefully acknowledges support via the Army Research Office through grant W911NF-18-1-0324. Winfree gratefully acknowledges support via the National Science Foundation "Expedition in Computing" grant CCF-1317694.
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE1745301. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Appendix A Proofs and estimates
The aim of this section is to provide a complete proof of Proposition 7, Theorem 5, Theorem 8, and Theorem 9. In particular, we will show how Theorem 5 follows from Theorem 9, which in turn follows from Theorem 8.
From the proof of Theorem 8 it will emerge how, for a fixed ε > 0, the choice of δ such that depends on the mixing times of the reaction networks N 1 , . . . , N m . This holds in the particular case of Theorem 5 as well. Hence, to guide the design of networks that approximate a given distribution with accuracy ε, we will provide in Appendix B useful estimates on the mixing times of PointMass(x, δ).

A.1 Proof of Theorem 8
Denote by X δ (·) the continuous-time Markov chain associated with Mix(F, ζ, δ). Define X δ V (·) and X δ H (·) as the projections of X δ (·) onto the components of the species in V = {V 1 , . . . , V d } and H = {H 1 , . . . , H m }, respectively. Moreover, for convenience we will write a state of Mix(F, ζ, δ) as with v and h indicating the components of the species in V and H, respectively. Note that X δ H (·) is a continuous-time Markov chain itself, and is distributed according to the subnetwork of Mix(F, ζ, δ), which we will denote by N H = (H, C H , R H ), given by N H is detailed balanced with detailed balanced equilibrium δζ, and its state space is irreducible. It follows that N H admits a unique stationary distribution π δ H defined by for all h ∈ Z m ≥0 . Consider a vector v ∈ Z d ≥0 that is a positive recurrent state for at least one reaction system N i , i ∈ {1, . . . , m}. Moreover, denote by σ δ (v) the time of the first visit of X δ (·) to (v, 0), defined as σ δ (v) = inf{t ≥ 0 : X δ (t) = (v, 0) and X δ (s) ≠ (v, 0) for some 0 ≤ s < t}.
We prove that Mix(F, ζ, δ) is robust by proving that for any Indeed, if (16) holds, then (v, 0) is positive recurrent by definition, hence there exists a stationary distribution π δ whose support coincides with the closed irreducible component that contains (v, 0), and this stationary distribution is unique. Moreover, a unique closed irreducible set exists and it is eventually reached with probability 1 from all states of Z d+m ≥0 , otherwise (16) could not hold. We need to prove (16). Let e i ∈ Z m ≥0 be the ith vector of the canonical basis, namely the vector with 1 in the ith entry and 0 in the other components. We have that (14) is positive recurrent and irreducible, and that X δ (·) is non-explosive and therefore well-defined for all times greater than 0. Hence, given X δ (0) = (v 0 , h 0 ), the chain will satisfy X δ H (t) = e i after a time with finite expectation. Assume X δ H (t) = e i , and let u be the time until the next change in copy-numbers of the species {H 1 , . . . , H m }. Then, u is exponentially distributed with rate δ + δ 2 , independently of the value of X δ V (t). It follows that there is a positive probability ϕ i (δ, v) that u > τ πi(v)/2 , where τ η = max 1≤i≤m τ η i is finite by assumption for all η > 0. Moreover, by definition of mixing times, Hence, the number of times the chain satisfies X δ H (t) = e i before visiting (v, 0) is stochastically bounded from above by a geometric random variable with mean Moreover, the expected time between two visits of (14) to e i is finite. Hence, (16) holds.
For all δ > 0 we have For By classical Markov chain theory and by robustness of Mix(F, ζ, δ) we have that almost surely, for any initial condition X δ (0). We now assume that for any v ∈ Z d ≥0 and any ε > 0, there exists δ ε,v > 0 such that almost surely, independently of the initial condition X δ (0). For δ small enough, both δ ≤ δ ε,v and 1 − e −δ < ε/4 hold. Hence, by (17), (18), and the triangle inequality For any ε > 0, there exists a compact set K ε ⊂ Z d ≥0 such that for any Since K ε is compact, the minimum δ ε = min v∈K ε δ ε,v exists and is positive. Hence, for all v ∈ Z d ≥0 and for all δ small enough such that δ ≤ δ ε and 1 − e −δ < ε/4, we have and the proof is concluded. Hence, it suffices to show (18). Let t 0 = 0 and define recursively t j = inf{t ≥ t j−1 : X δ H (t) = 0 and X δ H (s) ≠ 0 for some t j−1 < s < t} for j ≥ 1. That is, t j with j ≥ 1 is the time of the jth visit to a state with no molecules of species H 1 , . . . , H m .
For any j ≥ 1, let s j denote the holding time in the state with no molecules of {H 1 , . . . , H m }, measured from time t j . Then, s j is exponentially distributed with rate It follows from classical renewal theory and from (15) that for any j ≥ 1 It follows that, with probability 1, lim j→∞ t j = ∞. This in turn implies that almost surely For all j ≥ 1, independently of the value of X δ V (t j ), a molecule of H i is produced at time t j + s j with probability ζ(i). Let u j denote the molecule lifetime. Note that with probability δ/(δ + δ 2 ) = 1/(1 + δ) the molecule of H i is degraded before another molecule of a species in H is produced. For convenience, denote this event by A j . Given that A j occurs, u j is the minimum between the degradation time of the H i molecule (exponentially distributed with rate δ) and the time until the production of another molecule of a species in H (exponentially distributed with rate δ 2 ). Hence, given that A j occurs, u j is exponentially distributed with rate δ + δ 2 . It follows that Given that u j > τ ε/2 , all the reactions of the system N i can take place for a time longer than τ ε/2 , and these are the only reactions that can occur. Thus, by the definition of mixing times, Note that the bounds do not depend on X δ V (t j ) = v′. In conclusion, by conditioning and by using the probabilities of the conditioning events calculated above, we obtain The sequence D δ (j) = X δ V (t j ) for j ∈ Z ≥0 defines a discrete-time Markov chain. Since X δ (·) is robust, D δ (·) has a unique closed irreducible set Υ, and for any v ∈ Z d ≥0 we have Moreover, (19) and (20) give lower and upper bounds on the transition probabilities to a state v from a state v′, which do not depend on v′. Hence, for small enough ε and small enough δ such that b δ (v) > 0, D δ (·) restricted to Υ is aperiodic and positive recurrent. It follows that there exists a limit distribution γ such that for any independently of v′.
Furthermore, since the argument of the limit is bounded from below by b δ (v) and from above by If W j (v) is the number of visits of D δ (·) to v up to step j (inclusive), we have that with probability 1 lim j→∞ W j (v) = ∞, and By the strong law of large numbers, we have that almost surely Hence, If δ is small enough, which proves (18) and concludes the proof.

A.2 Analysis of Constructions 4 and 6
In this section, we study the limit distributions and the mixing times of a construction that slightly generalizes Constructions 4 and 6 by allowing for a more general choice of rate constants. We also give explicit bounds on the distance between the uniform distribution and the limit distribution of the construction presented here; these bounds are in general sharper than those presented in the main text.
Construction 6 (General uniform distribution network). Let d ≥ 1 and let a, b ∈ Z d ≥0 be such that a ≤ b. Define the CRN MultiDimUnif (a, b, κ) = (S, C, R, κ) as follows.
• The sets of complexes, reactions, and rate constants are given by the reaction diagram described below: In what follows, we will denote by X κ (·) the continuous-time Markov chain associated with the CRN MultiDimUnif (a, b, κ), and we will use the notation r i = b(i) − a(i) + 1 for all i ∈ {1, . . . , d}. Further, we will denote by X κ (·, i) the ith component of X κ (·). Note that the components X κ (·, i) are distributed as independent continuous-time Markov chains. In particular, X κ (·, i) is distributed as the process associated with the subnetwork of MultiDimUnif (a, b, κ) given by the reactions changing the species V i . We will denote by A κ,i the generator of X κ (·, i) (see [16]). We define σ κ,i and σ κ,i v(i) as the hitting times of [a(i), b(i)] and of v(i) ∈ Z ≥0 , respectively: Finally, we denote by E κ,i v0(i) [·] the expectation with respect to the distribution of X κ (·, i) given X κ (0, i) = v 0 (i).

Lemma A.1. Let d ≥ 1 and let a, b ∈ Z d ≥0 be such that a ≤ b. Consider the function L(·), defined as L(v(i)) = v(i) for all v(i) ∈ Z ≥0 . Then, lim v(i)→∞ L(v(i)) = ∞. Moreover, for any κ > 0 and any i ∈ {1, . . . , d}, there exist α i ∈ R >0 and a compact set K i ⊂ Z ≥0 such that

Proof. Clearly, lim v(i)→∞ L(v(i)) = ∞. Now assume that b(i) = 0. Then, for all v(i) ≥ 2 we have , and the result holds.
which is smaller than or equal to −

Lemma A.2. MultiDimUnif (a, b, κ) is robust and the support of its limit distribution is Moreover, for any κ > 0 and any ε > 0, the mixing time of MultiDimUnif (a, b, κ) at level ε is finite.
Proof. In order to prove the existence of a unique limit distribution and argue that the mixing times are finite, we will make use of Foster-Lyapunov criteria discussed by Meyn and Tweedie in [34]. In particular, we will use the concept of super Lyapunov function, developed by Athreya, Kolba, and Mattingly in [9], to which Appendix C is devoted. Since the components X κ (·, i) are distributed as independent continuous-time Markov chains, in order to prove robustness of X κ (·) it is sufficient to prove it for all its components separately. The same holds for the description of the irreducible closed sets of X κ (·), which are necessarily Cartesian products of closed and irreducible sets of the components X κ (·, i). Finally, the mixing times of X κ (·) at level ε > 0 are finite for all ε > 0 if and only if the mixing times τ ε,i of X κ (·, i) at level ε > 0 are finite for all ε > 0 and for all i ∈ {1, . . . , d}.
If b(i) = 0, then the process X κ (·, i) is the continuous-time Markov chain associated with the CRN with reaction diagram As such, X κ (·, i) can only decrease. It follows that the set {0} is closed and irreducible for X κ (·, i). Moreover, it is the only closed and irreducible set, since the reaction V i → 0 can always take place as long as there is at least one molecule of V i . If b(i) ≠ 0, then the process X κ (·, i) is the continuous-time Markov chain associated with the CRN with reaction diagram Hence, the process X κ (·, i) can always increase by 1, and decrease by r i if and only if at least b(i) + 1 = r i + a(i) molecules of V i are available. Hence, the set Θ(i) = {v(i) ∈ Z ≥0 | v(i) ≥ a(i)} is the only closed and irreducible set of X κ (·, i).
In both cases, we can conclude that X κ (·, i) is robust by showing that a unique limit distribution exists. This, together with the fact that the mixing times τ ε,i are finite, follows from Lemma A.1 and Theorem C.2.
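The uniform mechanism described in the proof above can also be checked numerically. The sketch below uses illustrative rate constants (not the κ δ prescribed in the main text): it builds the truncated generator of a single component, with births at rate k1 and the downward jump v → v − r at rate k2·v(v−1)···(v−b) for v ≥ b + 1, and solves for its stationary distribution. For fast k2 the result is nearly uniform on {a, . . . , b}, and the transient states below a carry no mass.

```python
import numpy as np

def stationary_component(a=2, b=5, k1=1.0, k2=100.0, N=12):
    """Stationary distribution of one component (illustrative rates).
    Transitions: v -> v+1 at rate k1; v -> v-r at rate k2*v*(v-1)*...*(v-b)
    for v >= b+1, where r = b-a+1. Truncated at N (no birth from state N)."""
    r = b - a + 1
    Q = np.zeros((N + 1, N + 1))
    for v in range(N + 1):
        if v < N:
            Q[v, v + 1] = k1                       # birth v -> v+1
        if v >= b + 1:
            ff = float(np.prod(np.arange(v - b, v + 1)))
            Q[v, v - r] = k2 * ff                  # reset jump v -> v-r
        Q[v, v] = -Q[v].sum()
    # solve pi Q = 0 with sum(pi) = 1: replace one balance equation
    # by the normalization condition
    A = Q.T.copy()
    A[-1, :] = 1.0
    rhs = np.zeros(N + 1)
    rhs[-1] = 1.0
    return np.linalg.solve(A, rhs)
```

With the defaults, the mass above b and below a is negligible and each state in {2, . . . , 5} receives probability close to 1/4.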
In Lemma A.2 we proved that X κ (·) is robust, which holds if and only if each process X κ (·, i) is robust. We denote by π κ the unique limit distribution of X κ (·), and by π κ,i the unique limit distribution of X κ (·, i). Since the components of X κ (·) are independent, it follows that Hence, in order to study π κ it is sufficient to study the distributions π κ,i , for i ∈ {1, . . . , d}, and this is what we will do.

Lemma A.3. Let i ∈ {1, . . . , d} with b(i) = 0. Then, π κ,i is the point mass distribution at 0.
Proof. The result follows from the fact that X κ (·, i) is robust, and {0} is its only closed irreducible set, as proven in Lemma A.2. Then, by (21) and since for all times t > 0 we have E κ,i v0(i) [L(X κ (min{t, σ κ,i }, i))] ≥ 0, we may conclude We may use the monotone convergence theorem to conclude the proof.
Lemma A.5. Let i ∈ {1, . . . , d} with b(i) ≥ 1, and assume (21) holds. Then, Proof. By classical theory on continuous-time Markov chains it is known that , where 1/κ i 1 is the expected holding time of X κ (·, i) in the state b(i). By conditioning on the first two steps, we have Hence, From the first two equalities we have 1/r i − π κ,i (b(i)) ≥ 0, which is the first part of the lemma. Moreover, where the second term bounds from above the expected time to reach b(i) from within the set {a(i), . . . , b(i)}. Hence, by Lemma A.4, which concludes the proof.
Lemma A.6. Let i ∈ {1, . . . , d} with b(i) ≥ 1, and assume (21) holds. Let w ∈ Z ≥0 such that w ∈ [2, r i ]. Then, Proof. To simplify the notation, denote for all integers v(i) that are greater than or equal to b(i) + 1, so that the transition rate from v(i) to v(i) − r i is given by κ i 2 φ i (v(i)). Then, using classical Markov chain theory again, we have where the first factor is the expectation of the holding time of X κ (·, i) in the state b(i)+w. By performing first-step analysis on E κ,i b(i)+w [σ κ,i b(i)+w ], we have We will then study E κ,i a(i)+w−1 [σ κ,i b(i)+w ]. Note that a(i) + w − 1 ∈ [a(i), b(i)], because 2 ≤ w ≤ r i by assumption.
From each state v(i) ∈ [a(i), b(i)], it takes at least one exponential time with rate κ i 1 to reach the state b(i) + 1. From b(i) + 1, we reach the state b(i) + w without hitting [a(i), b(i)] with probability .
Hence, if we let G κ,i w be a geometric random variable with parameter p κ,i w , we have Hence, by combining (22), (23), and (24), we have , which concludes the proof.
Finally, we are ready to prove the following result.

Theorem A.7. Moreover,

Proof. By Lemma A.3, if b(i) = 0 then π κ,i = q i , hence the statement holds. We now assume b(i) ≥ 1. By Lemma A.2, X κ (·, i) is non-explosive, hence the forward Kolmogorov equation holds [35]. It follows that, provided that r i ≥ 2 (which is equivalent to b(i) ≥ a(i) + 1), for any integer v(i) ∈ [a(i) + 1, b(i)] Hence, by Lemma A.6 we have Hence, by applying the above inequality iteratively, we have that for any two integers By Lemma A.2, for any integer v(i) < a(i) we have π κ,i (v(i)) = q i (v(i)) = 0. By Lemma A.5 and (26), where D(κ, i) is as in (25). Finally, from (27) it follows that Hence, for any v(i) ≥ b(i) + 1 which concludes the proof.

A.3 Proof of Proposition 7
The CRN MultiDimUnif(a, b, δ) is equal to MultiDimUnif (a, b, κ δ ), with κ δ as described in Construction 6. The quantity D(κ δ , i) defined in (25) becomes where g i (δ) is a continuous function satisfying Hence, by Theorem A.7 for any v ∈ Z d ≥0 such that a ≤ v ≤ b we have where G(δ) is a continuous function satisfying lim δ→0 G(δ) < ∞. If v ∈ Z d ≥0 does not satisfy v ≥ a, then by Lemma A.2 we have π δ (v) = 0 = q(v). Finally, for all v ∈ Z d ≥0 with v ≥ a and v ≰ b, we have The proof is concluded by noting that r i < b(i) + 2.

A.4 Analysis of Construction 8
Let X(·) be the continuous-time Markov chain associated with ProdPois(c). Note that the different components X(·, i) are independent continuous-time Markov chains, each one associated with the subnetwork of ProdPois(c) governing the changes of the species V i . We state and prove the following results concerning Construction 8.
Proof. It is clear that lim v(i)→∞ L(v(i)) = ∞. Moreover, which concludes the proof.
Proposition A.9. The CRN ProdPois(c) is robust, with limit distribution π given by Moreover, the mixing times of ProdPois(c) are finite at any level ε > 0.
Proof. Since the components X(·, i) are independent, the state space of X(·) is irreducible if and only if the same holds for each X(·, i). This is the case, since the molecules of a species V i can always increase by 1, and decrease by 2 whenever at least 2 molecules are available. The CRN ProdPois(c) is complex balanced with complex balanced equilibrium c, in the sense of [23]. Indeed, for any complex y ∈ C it holds that ∑ y′ ∈C: y′ →y ∈R κ y′ →y c y′ = ∑ y′ ∈C: y→y′ ∈R κ y→y′ c y .
Hence, due to [3] the associated continuous-time Markov chain X(·) is non-explosive, and due to [5] and to the fact that the state space is irreducible, the limit distribution satisfies (28).
Finally, the mixing times of X(·) are finite at any level, if and only if the mixing times of any component X(·, i) are finite at any level. The latter is implied by Lemma A.8 and Theorem C.2, hence the proof is concluded.
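The exact reactions of ProdPois(c) are not reproduced here, but the mechanism behind the Poisson limit distribution of a complex balanced network can be illustrated with a simpler, hypothetical birth–death analogue: the network ∅ → V (rate c), V → ∅ (unit rate per molecule), whose stationary distribution is exactly Poisson(c). The sketch below verifies numerically that probability flow balances at every state.

```python
import math

def poisson_pmf(v, c):
    """Poisson(c) probability mass at v."""
    return math.exp(-c) * c**v / math.factorial(v)

def stationarity_residual(c=3.0, N=60):
    """Check that Poisson(c) is stationary for the birth-death chain with
    birth rate c and death rate v (i.e. 0 -> V at rate c, V -> 0 at unit
    per-molecule rate): at every state, probability flow in = flow out.
    Returns the maximum absolute net flow over states 0..N."""
    pi = [poisson_pmf(v, c) for v in range(N + 2)]
    worst = 0.0
    for v in range(N + 1):
        inflow = (pi[v - 1] * c if v >= 1 else 0.0) + pi[v + 1] * (v + 1)
        outflow = pi[v] * (c + v)
        worst = max(worst, abs(inflow - outflow))
    return worst
```

The residual is zero up to floating-point rounding, reflecting that for this network the balance condition holds state by state.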

A.5 Proof of Theorem 9
Let X δ (·) be the continuous-time Markov chain associated with Mix(F, ζ, δ). By Lemma 4, Proposition 7, and Proposition A.9, all the networks N δ i are robust, for all δ > 0. Let π δ i denote the limit distribution of N δ i . By Lemma 4, Proposition 7, and Proposition A.9 we have where o(δ) is a function with the property lim δ→0 o(δ)/δ = 0.
Furthermore, Constructions 4 and 6 are special cases of the construction analyzed in Section A.2, hence by Lemma A.2 and Proposition A.9 it follows that the mixing times of any network N δ i are finite at any level ε > 0, for any δ > 0.
Let X δ H (·) be the projection of X δ (·) onto the species {H 1 , . . . , H m }. Note that X δ H (·) is a continuous-time Markov chain, associated with the CRN The above CRN is detailed balanced, with detailed balanced equilibrium δζ ∈ R m >0 . Hence, due to [3] X δ H (·) is non-explosive. If X δ (·) were explosive, then infinitely many transitions of X δ (·) would occur while X δ H (·) is fixed at a state h. Note that given X δ H (·) ≡ h, the components of X δ (·) relative to the species {V 1 , . . . , V d } are independent continuous-time Markov chains. Let X δ,h (·, j) denote the component of X δ (·) relative to species V j , given that X δ H (·) ≡ h. If X δ (·) were explosive, then a process X δ,h (·, j) would be explosive, for some h ∈ Z m ≥0 and some j ∈ {1, . . . , d}. We will conclude the proof by showing that this is not possible.
Let A δ,i be the generator of N δ i . Then, the generator of X δ,h (·, j) is given by From Lemma A.1 and Lemma A.8, it follows that if L(v(j)) = v(j) for all v(j) ∈ Z ≥0 , then for each i ∈ {1, . . . , m} and for each δ ∈ R >0 , there exist α δ,i ∈ R >0 and a compact set K δ,i ⊂ Z ≥0 such that Hence, if The proof is then concluded by Theorem C.2.
Appendix B Bounds for the mixing times of PointMass(x, δ)
Here we give some useful bounds on the mixing times of a generalization of the one-dimensional PointMass(x, δ), where the choice of rate constants is not constrained. Then, the CRN is robust with unique limit distribution π being the point mass distribution centered at 0. Moreover, for any choice of rate constants κ 1 , κ 2 , and for all ε > 0,

Proof. Robustness of the CRN and the fact that the limit distribution is the point mass distribution at 0 follow from Lemma A.3. Let X(·) denote the continuous-time Markov chain associated with the CRN. Let • σ = inf{t ≥ 0 : X(t) = 0}; • Y (·) be the embedded discrete time Markov chain of X(·): Y (0) = X(0) and for each n ≥ 1, Y (n) is defined as the value of the process X(·) after n jumps; • M = inf{n ≥ 0 : Y (n) = 0}.
Let X(0) = v 0 . Note that since the process X(·) decreases at least by one unit at each jump, necessarily M ≤ v 0 . Denote by (E v ) ∞ v=0 a sequence of independent exponential random variables with rates κ 1 v + κ 2 v(v − 1). Then, Then, by Markov's inequality, .
There are three cases: 1. If v 0 < x, and if we denote by Γ(k, θ) the sum of k independent exponential random variables with mean θ, then by Markov's inequality 2. If v 0 = x, then P (σ < t|X(0) = v 0 ) = 1 for all t ≥ 0.
3. If v 0 > x, then For notational convenience, let and denote by (E v ) ∞ v=x+1 a sequence of independent exponential random variables with rates κ 2 φ x (v). Then, by Markov's inequality, .
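Since σ is here a sum of independent exponential holding times (one per unit decrease), its mean has a closed form and the Markov bound used above can be evaluated directly. A small sketch, with the rate constants passed as arguments:

```python
def expected_absorption_time(v0, k1, k2):
    """E[sigma | X(0) = v0] for the pure-degradation chain that jumps
    v -> v-1 at rate k1*v + k2*v*(v-1): sigma is a sum of independent
    exponentials, one for each state v0, v0-1, ..., 1."""
    return sum(1.0 / (k1 * v + k2 * v * (v - 1)) for v in range(1, v0 + 1))

def markov_bound(v0, k1, k2, t):
    """Upper bound on P(sigma >= t) via Markov's inequality:
    P(sigma >= t) <= E[sigma] / t (capped at 1)."""
    return min(1.0, expected_absorption_time(v0, k1, k2) / t)
```

For instance, with k1 = k2 = 1 the jump rate from v is v^2, so E[σ] is a partial sum of Σ 1/v^2 and is bounded by π^2/6 uniformly in the initial state, illustrating that the quadratic degradation keeps the hitting time of 0 small even from very large initial counts.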

C Super Lyapunov functions
The theory presented here was mostly developed in [9], for a specific family of stochastic differential equations. The concept and terminology of a "super Lyapunov function" were also introduced in [9]. We are interested in an adaptation of [9, Lemma 6.1], which we state here as Theorem C.1 and whose proof we repeat for completeness.
Definition C.1. Let X(·) be a continuous-time Markov chain on Z d ≥0 , with generator A. We say that a function L : Z d ≥0 → R ≥0 is a super Lyapunov function if the following holds true: • lim x→∞ L(x) = ∞ • there exists a compact set K and real numbers α > 0 and γ > 1 such that AL(x) ≤ −αL γ (x) for all x / ∈ K.
Remark C.1. If (35) holds for γ = 1, then the function L is a standard Lyapunov function. While the existence of such a function implies that the process X(·) is non-explosive and that a limit distribution exists for any initial condition [34], in general it does not imply that the mixing times are finite.
Indeed, it is sufficient to consider b = max x∈K |AL(x)|.
It follows that τ ε ≤ T + R ε , which concludes the proof.
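As a concrete illustration of Definition C.1 (and consistent with the computation sketched in Lemma A.1), consider the one-dimensional degradation network of Appendix B, whose chain jumps from x to x − 1 at rate κ1 x + κ2 x(x − 1). The identity function L(x) = x is then a super Lyapunov function with γ = 2:

```latex
(AL)(x) = \bigl(\kappa_1 x + \kappa_2 x(x-1)\bigr)\bigl(L(x-1) - L(x)\bigr)
        = -\kappa_1 x - \kappa_2 x(x-1)
        \le -\tfrac{\kappa_2}{2}\,L^{2}(x)
\qquad \text{for all } x \ge 2,
```

since κ2 x(x − 1) ≥ (κ2/2)x² whenever x ≥ 2. Hence (35) holds with α = κ2/2, γ = 2, and K = {0, 1}.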