Regulation of stem cell dynamics through volume exclusion

Maintenance and regeneration of adult tissues rely on the self-renewal of stem cells. Regeneration without over-proliferation requires precise regulation of the stem cell proliferation and differentiation rates. The nature of such regulatory mechanisms in different tissues, and how to incorporate them in models of stem cell population dynamics, is incompletely understood. The critical birth-death (CBD) process is widely used to model stem cell populations, capturing key phenomena, such as scaling laws in clone size distributions. However, the CBD process neglects regulatory mechanisms. Here, we propose the birth-death process with volume exclusion (vBD), a variation of the birth-death process that considers crowding effects, such as may arise due to limited space in a stem cell niche. While the deterministic rate equations predict a single non-trivial attracting steady state, the master equation predicts extinction and transient distributions of stem cell numbers with three possible behaviours: long-lived quasi-steady state, and short-lived bimodal or unimodal distributions. In all cases, we approximate solutions to the vBD master equation using a renormalised system-size expansion, quasi-steady state approximation and the WKB method. Our study suggests that the size distribution of a stem cell population bears signatures that are useful to detect negative feedback mediated via volume exclusion.

Maintenance and regeneration of adult tissues rely on the self-renewal of stem cells. Regeneration without over-proliferation requires precise regulation of the stem cell proliferation and differentiation rates. The nature of such regulatory mechanisms in different tissues, and how to incorporate them in models of stem cell population dynamics, is incompletely understood. The critical birth-death (CBD) process is widely used to model stem cell populations, capturing key phenomena, such as scaling laws in clone size distributions. However, the CBD process neglects regulatory mechanisms. Here, we propose the birth-death process with volume exclusion (vBD), a variation of the birth-death process that considers crowding effects, such as may arise due to limited space in a stem cell niche. While the deterministic rate equations predict a single non-trivial attracting steady state, the master equation predicts extinction and transient distributions of stem cell numbers with three possible behaviours: long-lived quasi-steady state, and short-lived bimodal or unimodal distributions. In all cases, we approximate solutions to the vBD master equation using a renormalized system-size expansion, quasi-steady state approximation and the WKB method. Our study suggests that the size distribution of a stem cell population bears signatures that are useful to detect negative feedback mediated via volume exclusion.

Introduction
Stem cells (SCs) are a population of cells capable of self-renewing and differentiating into all the cells in a particular lineage. In adult tissue homeostasis, SCs slowly self-renew and differentiate to compensate for the death of other cells while maintaining a constant average population size [1][2][3][4]. Upon injury, however, the SC proliferation and differentiation rates increase dramatically to repair the tissue, only settling back into homeostasis when regeneration is completed [5][6][7]. Such tight control over the SC proliferation and differentiation rates requires regulatory mechanisms providing feedback to the SCs.
Against the backdrop of recent advances in experimental techniques in stem cell biology, a broad range of SC regulatory mechanisms have been reported, such as negative feedback exerted by the more differentiated cells [6][7][8], competition between SCs for fate determinants [9,10], or mechanical feedback [11][12][13]. An additional plausible mechanism stems from the confinement of SCs to a particular microenvironment, the stem cell niche [9,[14][15][16]. This microenvironment plays a key role in maintaining cell stemness and promoting SC quiescence [17][18][19], self-renewal, or differentiation, according to tissue requirements; however, it also triggers a competition between SCs for niche access [10,20,21].
Crowding effects are often associated with volume exclusion [22][23][24][25][26]. The non-negligible volume of particles restricts their movement, thus obstructing their access to available free space [22]. As a consequence, the accessible phase space can be greatly reduced. If cells are dividing without reducing their size, crowding effects can have an impact on the SC proliferation and death (or differentiation) rates. For example, a proliferation event reduces the available space, which in turn reduces the proliferation rate, thus creating a negative feedback loop. Volume exclusion has been also suggested to play a role in the regulation of cancer stem cells and tumour growth [27]. However, it is not yet clear how to distinguish between crowding effects and other regulatory mechanisms from observations of the population evolution (e.g. from snapshots of the SC population at different times).
Stem cell division and differentiation has been previously modelled stochastically by the simple chemical reaction network S → 2S, S → ∅ [1,28], where differentiation is equivalent to death if differentiated progeny do not self-renew. To prevent the population from diverging or vanishing, the birth and death rates must be equal, thus obtaining a critical birth-death process (CBD). For stem cell population dynamics, the CBD process has been frequently treated under well-mixed and dilute gas conditions, which facilitate its computational implementation, e.g., Gillespie algorithm, [29,30] and mathematical analysis through the master equation formalism [31,32]. This approach has been successfully employed to illustrate key features of SC populations, such as population asymmetry (the maintenance of a constant average population via symmetric divisions that are balanced at the population level, instead of asymmetric divisions), neutral competition [1,28,33], and scaling properties of clone size distributions [28,[33][34][35]. However, the CBD ignores the finite-size nature of cells and thus disregards the role of available space in cell division.
Here, we present a modification of the birth-death process that includes competition for niche access, the birth-death process with volume exclusion (vBD). We subdivide the space within a niche into N voxels (small volumes of space); each voxel is either occupied by a stem cell or else is empty. Assuming well-mixed conditions, the effective chemical reaction network describing this process is S + E → 2S, S → E, where S and E describe stem cells and empty voxels in the niche, respectively. The first reaction reflects the need for a stem cell to find an empty voxel to divide, while the second one represents the birth of an empty voxel after stem cell death or differentiation (assuming differentiated progeny leave the niche space). Naturally the system obeys the conservation law n S + n E = N , where N is the niche carrying capacity, and n S , n E are the number of stem cells and empty voxels, respectively. Note that the vBD resembles a stochastic SIS model, with the number of infectious given by the species S, and susceptible by E. However, while the SIS model is usually treated for N → ∞, we are interested in the low N behaviour. In terms of the vBD model parameters, the basic reproduction number for the equivalent SIS model is given by R 0 = N k 1 /k 2 , where k 1 and k 2 are the rates of proliferation and differentiation, respectively. The deterministic rate equations for the vBD process predict a logistic convergence to a non-trivial attracting steady state. From a microscopic perspective (i.e. the master equation's solutions), however, this prediction is not realised, and the vBD process relaxes to extinction, irrespective of the parameter values and initial conditions. Our analysis reveals the three different behaviours of the stochastic vBD process that are absent in its deterministic counterpart. When the birth rate is much larger than the death rate, the system quickly takes the form of a long-lived, quasi-steady state, and very slowly relaxes to extinction through a transient bimodal distribution. Conversely, for death rates much larger than the birth rates, the system quickly converges to extinction through a unimodal transient. Lastly, when the birth and death rates are comparable, the transient distribution is bimodal but the convergence to extinction is fast. For these three different parameter regimes, we approximate the solution of the vBD master equation using a quasi-stationary approximation, a renormalized system-size expansion (including finite size corrections to the linear-noise approximation), and the WKB method. In particular, the renormalized system-size expansion is a recent modification of the original van Kampen's SSE that has not been widely used yet, but proves useful for tackling master equations of non-linear birth-death processes. Finally, we derive an expression for the expected extinction time, based on Kolmogorov's backward equation and first-passage time theory. Our analytical solutions provide insights into the rich behaviours of the vBD model.

vBD model
The birth-death process with volume exclusion is defined by the chemical reaction network where S and E represent stem cells and empty voxels, respectively (See Fig. 1A for an illustration). For the deterministic system to have a non-trivial steady state, we require that k 1 > k 2 . Note that the two species are coupled by the conservation law n S + n E = N . Assuming mass-action kinetics and defining the dimensionless time τ = k 2 t, the rate equation for the average stem cell concentration φ = n S /N adopts the logistic form where φ * = 1 − k 2 /k 1 is the non-trivial steady state. The deterministic evolution of the stem cell concentration has the form φ(τ ) = ( is the initial condition, and we can appreciate that φ 1B). In this deterministic system, the extinction state φ = 0 is never reached unless φ(0) = 0.
The stochastic behaviour of the vBD differs from the deterministic predictions. Trajectories of the vBD generated using the stochastic simulation algorithm (SSA [29]) fluctuate in the vicinity of the deterministic steady state for some finite period of time (Fig. 1B), but fluctuations eventually drive the stem cell number to extinction. The ensemble average of stochastic trajectories thus converges to zero (see blue line in Fig. 1B), disagreeing with the deterministic model's prediction.
A stochastic treatment of the vBD process is provided by its chemical master equation, i.e., Kolmogorov's forward equation. The chemical master equation (CME) describes the timeevolution of the probability that the system is in one of its states [31,[36][37][38]. To construct the CME, we first note that the vBD model only involves reactions that increase or reduce the number of stem cells by one unit. Hence the stochastic process underlying the reaction network (2.1) takes the form of the Markov chain depicted in Fig. 2A, where the states 0, 1, . . . , N represent the number of stem cells. Note that the vBD process features a reflecting boundary at n = N and an absorbing boundary at n = 0. The propensities are determined by the law of mass-action and Let P (n, τ | P (τ 0 ) = P 0 ) be the probability of finding the system in a state of n cells at time τ , given that it was P 0 at time τ 0 , which we will abbreviate as P (n, τ ). Defining the probability vector P(τ ) = (P (0, τ ), P (1, τ ), . . . , P (N, τ )) T , where T denotes the vector transpose, the CME can be expressed as dP/dτ = MP, where M is the operator defined by The n th row of the master equation reads with a 0 = b −1 = 0. Note that the only parameters present are the carrying capacity, N , and the steady state from the deterministic equations, φ * , as per (2.3). The solution of the master equation, for an initial probability distribution P(0), is given by P(τ ) = e Mτ P(0). The main properties of the solution are captured by the eigenvectors and eigenvalues of M. It is easy to prove that λ 0 = 0 is always an eigenvalue associated with the eigenvector [1, 0, . . . , 0] T (the extinction state), while the other eigenvalues are real and negative [39]. Therefore, the extinction state is always reached, irrespective of the parameter values and initial conditions. Moreover, the expected extinction time is the inverse of the spectral gap, |λ 1 − λ 0 | −1 , where λ 1 is the smallest (in absolute value) non-zero eigenvalue. The third 5 rspa.royalsocietypublishing.org Proc R Soc A 0000000 eigenvalue, λ 2 , has a considerably higher absolute value than λ 1 , as we can appreciate from the spectral gap of the reduced system obtained by eliminating the extinction state (orange line in Fig. 2 B) -we observe that the smallest gap is λ 2 ≈ 2.6λ 1 which is achieved in the limit of small φ * . Hence, after an initial transient the PDF is dominated by the eigenvectors associated with λ 0 and λ 1 , leading to where e 0 = [1, 0, . . . , 0] T is the extinction state, and e 1 = [−1, f (1), f (2), . . . , f (N )] T is the leading eigenvector. The first element of e 1 comes from the lower boundary of the state space. The first row in the CME reads dP . Note that f (n) is the PDF of stem cell numbers conditioned on nonextinction. It follows that the expected extinction time is The distribution of the surviving trajectories is defined as where P'(τ ) = [P (1, τ ), . . . , P (N, τ )] T . Hence, the master equation for the PDF conditioned on non-extinction reads where the operatorM results from eliminating the first row and column of M. Note that, even though the steady state in the original CME is the extinction state, the CME for the surviving trajectories presents a non-trivial steady state ( Fig. 2C-E). In effect, the steady state of Eq. (2.9) yields the entries f (k) of e 1 . Numerical experiments reveal the presence of three different behaviours in the solution given by Eq. (2.6). For high φ * values the system relaxes to a long-lived quasi-steady state, f (n), before slowly relaxing again to extinction through a transient bimodal distribution (Fig. 2C). The two modes given by the eigenstates e 0 and e 1 in Eq. (2.6) are located at the extinction state and near (but not necessarily at) N φ * , respectively. The leading eigenvalue vanishes for high φ * (blue line in Fig. 2B), indicating the presence of a quasi-steady state (QSS). For intermediate φ * values, the transient bimodality is still present, but the relaxation to extinction is faster than the time to reach the QSS (Fig. 2D). For low φ * values the system rapidly goes extinct, and the mode around N φ * is absent ( Fig 2E). The phase diagram in Fig. 2F summarises the three types of behaviour of the vBD model as a function of φ * and the carrying capacity N . To determine the parameter regimes for these three behaviours, we compute numerically the solution to the master equation, P(τ ) = e Mτ P(0), choosing P (n, 0) = δ nk , where δ ij is the Kronecker delta and k = N φ * . We then extract the PDF conditioned on non-extinction after the initial transient. The black region in Fig. 2F corresponds to the case in which such distribution does not present maximum for n > 1. To distinguish between the bimodal extinction and QSS regions, we compute numerically the second eigenvalue of M, λ 1 , and set a tolerance α = 1 × 10 −2 . The grey region in Fig. 2F corresponds to λ 1 > α, and the white region to λ 1 ≤ α. In the next two sections we derive approximate solutions for f (n) and T in the three different regimes.

Approximate solutions
Here, we present approximate solutions for the vBD master equation. From Eq. (2.6), it follows that the solution adopts the form P (n, τ ) = f (n)e −λ1τ , ∀n ≥ 1, and P (0, τ ) = 1 − e −λ1τ . Hence, the asymptotic solution is determined by the PDF conditioned on non-extinction, f (n), and the inverse of the expected extinction time, λ 1 . In this section we tackle the problem of approximating . Correspondingly, the expected extinction time, |λ 1 | −1 , is large for high φ * , indicating the presence of a quasi-steady state. The relative spectral gap of the system conditioned on non-extinction, i.e. |(λ 2 − λ 1 )/λ 2 |, increases with φ * (orange line). In general, |λ 2 | is at least 2.6 times higher than |λ 1 |, supporting a solution of the form of Eq. (2.6) for the vBD master equation. f (n), assuming λ 1 is known, while in the next section we derive an accurate expression for the expected extinction time and correspondingly for λ 1 .

C-E
Substituting Eq. (2.6) into the master equation we obtain with the boundary conditions 2) can be solved iteratively, leading to a solution in terms of the continued fraction To obtain an expression for f (n), we apply the boundary condition f (1) = λ 1 /b 0 , thus leading to

(a) QSS approximation
For high φ * values, λ 1 1 and correspondingly the mean extinction time is very large (see Fig. 1B). Since λ 1 = b 0 f (1), we can impose a quasi-steady state condition by disregarding the second term in the r.h.s. of Eq. (2.9), leading to ∂P/∂τ ≈MP. The steady state of the CME conditioned on non-extinction can be obtained by solvingMP = 0; the N th row yielding . This relationship can be iterated to obtain Finally to find f (N ) we make use of the normalisation condition which leads us to a closed form solution for the quasi-steady-state PDF (3.6) Substituting the propensities an and bn from Eq. (2.3) yields the approximate QSS solution The QSS approximation accurately describes the PDF conditioned on non-extinction for high φ * values (Figs. 3A and 4A).
From the QSS approximation we can calculate the position for the non-zero mode. To do so, it is convenient to study the discrete first derivative ∂f (n) minimum to take place, it is necessary that f (n + 1)/f (n) = 1. According to Eq. (3.7), we have Solving for n we have The sign of the discrete second derivative, ∂ 2 f /∂n 2 = f (n + 1) + f (n − 1) − 2f (n) reveals that, when both solutions exist, n * + corresponds to a maximum and n * − to a minimum. Note that, for n * ± to adopt real values, , which is the case for all parameter sets in the QSS region. Hence, in general the position of the non-zero mode differs from the deterministic model's prediction, i.e., n * + = N φ * . As an example, for a carrying capacity of N = 100 and φ * = 0.4, then n * + ≈ 38. Note that, when φ * → 1 or N → ∞, n * + → N φ * and n * − → 0. Thus in these limits, the distribution has a single mode sitting at the deterministic model's prediction of the mean stem cell number.

(b) Renormalized system size expansion
For parameter sets (φ * , N ) in the bimodal and unimodal extinction regions (grey and black areas in Fig. 2F), the QSS approximation is unable to capture the PDF conditioned on non-extinction (as it can be seen in Figs. 3B, C and 4A). In such cases, the condition λ 1 1 does not hold, and hence the second term in the r.h.s. of Eq. (2.9) can no longer be considered negligible (which is needed to find the steady state of the CME conditioned on non-extinction iteratively, as we did in the QSS case). Hence a different approximation is needed. In what follows we derive a new approximate PDF conditioned on non-extinction based on a high-order renormalized system size expansion of the vBD's master equation. In section i we provide a general analytical recipe to obtain an expression for the PDF conditioned on non-extinction, Eqs. (3.20) and (3.21) being the main results. While self-contained, this derivation is mathematically lengthy, and the reader can skip it should they be solely interested in its application to the vBD process. We point the reader to [40] for a detailed derivation of the method. In section ii we apply the renormalized SSE results to the vBD system, and obtain an analytical expression for the PDF conditioned on non-extinction.

(i) Derivation of the renormalized SSE formula
The van Kampen system size expansion (SSE) approximates a master equation by splitting the random variable that describes the stem cell number, n, into deterministic and non-deterministic components, to then obtain a master equation for the non-deterministic components (usually taken as a continuous random variable). The resultant CME can be expanded in powers of N −1/2 , truncated to a desired order and solved, leading to a hierarchy of approximate solutions [41]. Whilst the bulk of applications are centred in truncating the CME after the order N 0 to obtain the linear noise approximation (LNA), which considers the particle concentration equal to its deterministic value, other works have used higher order truncation schemes to obtain corrections to the mean concentration [42,43]. The deterministic component is commonly assumed to be given by the solution of the rate equations. However, the disagreement between the deterministic and stochastic predictions (see Fig. 1B) makes it sensible to introduce a correction term to the mean stem cell concentration in the original ansatz. To do so, we follow the procedure from [40], that starts by considering the ansatz where the first term in the r.h.s. is the zero-order mean stem cell concentration obtained from the deterministic rate equations, the second term is a correction to the mean concentration due to fluctuations, and the third term in the r.h.s. represents fluctuations about the corrected mean concentration. This renormalization of the mean concentration leads to a different system-size expansion than the conventional one by van Kampen, which we refer to as the renormalized system-size expansion. Next, we briefly describe how to compute the corrections to the mean concentration as a series in powers of N −1/2 : (3.11) The expansion coefficients for the correction term to the mean concentration are calculated iteratively as follows: where J is the Jacobian of the deterministic rate equations, and we assume a (0) m = 0. To define the operators D q p,s , we assume that the propensity functions for a birth or death event when the system is in a state of n stem cells (a n+1 and b n−1 ), expressed in terms of the concentrations, can be expanded in power series of the inverse carrying capacity (N −1 ) as where g (Sr) p ∂ q g (s) where Sr is the net change in the number of stem cells when the r th reaction occurs, namely S 1 = 1 and S 2 = −1. The functions I αβ mn in Eq. (3.12) are defined as for (α + β) − (m + n) even, and zero otherwise. We have introduced the notation (2k − 1)!! = (2k)!/(2 k k!) for the double factorial, and σ is the standard deviation of the concentration according to the standard linear noise approximation [41] The variance of the fluctuations can also be computed as a series in powers of N −1/2 where the expansion coefficients are given bŷ has as arguments 1!a 1 , and 3!a continuous random variable) in powers of N −1/2 , and truncating after O(N 0 ) to obtain a Fokker-Planck equation that yields the LNA, thus describing Gaussian fluctuations around the mean concentration. The higher order approximate solutions can then be expressed in terms of the first-order approximation. However, this approach often leads to non-physically meaningful distributions for truncations of the system-size expansion beyond the LNA level of approximation, e.g., yielding negative probabilities or oscillatory behaviour. Such effects can be greatly reduced by introducing a discrete formulation of the SSE's approximate solutions [40].
The discrete formulation replaces the continuous-variable LNA approximate solution, by the discrete approximation where erf is the error function, x = n − N φ − N 1/2 is the stem cell number centred about its (corrected) deterministic value, and Σ 2 = Nσ 2 its variance. Note that the time dependence is implicit in the temporal change of the mean concentration and the variance of concentration fluctuations, φ(τ ) and σ(τ ), respectively. Equation (3.20) is the discrete version of a Gaussian in the sense that every moment of the distribution coincides with their corresponding of a continuous-variable Gaussian. Next, the expansion of the SSE approximate solution up to any order can be expressed in terms of P 0 (n, τ ) and its derivatives where the new expansion coefficientsâ To calculate the derivatives of P 0 (n, τ ), it is possible to prove by induction the following formula where · denotes the ceiling function, and F (y, m) is sin(πy) for m odd and cos(πy) for m even. Calculation of the correction terms to the mean concentration and the variance of fluctuations, andσ 2 respectively, by truncation of Eqs. (3.11) and (3.18) to any desired order, followed by substitution into Eq. (3.21) and truncation, provides a means to systematically obtain approximate solutions to the CME. Note that the order of the approximate solution is determined by the order of truncation of andσ 2 . For example, to obtain an approximation of order (s + 1)/2 with s = 0, 1, . . . , both andσ 2 are expanded up to that order (assuming that at least one of the coefficients is non-zero) using Eqs.

(ii) The expansion of the vBD master equation
We now employ the renormalized discrete-formulation of the system size expansion to approximate the probability distributions conditioned on non-extinction in the bimodal extinction region, i.e., the grey zone in Fig. 2F. To do so, we observe that the absorbing boundary, which is the factor that renders the deterministic rate equations inaccurate for calculating the evolution of the mean stem cell number, is absent when conditioning on non-extinction. In effect, the rate equations capture meaningful information about the mean stem cell number of the surviving stochastic trajectories. Thus, to approximate the steady states of Eq. (2.9) we can apply the renormalized system size expansion described in the previous subsection, under stationary conditions, to the non-conditional master equation (2.5), which includes the extinction state. We then recover the PDF conditioned on non-extinction by removing the extinction state and multiplying the resulting distribution by a normalization constant 1/(1 − P (0)), where P (0) is the probability of being in the extinction state obtained by the SSE. In particular, to capture non-Gaussian fluctuations, we truncate the renormalized SSE after terms of O(N −1 ).
The expansion coefficients for the vBD propensities (g (s) r in Eq. (3.14)) are given by and g (s) r (φ) = 0 ∀s ≥ 1, which allows us to calculate the values of D q p,s . From the standard LNA we obtain the order N 0 approximation for the first two moments of the distributions Note that, since we have imposed the stationary conditions φ = φ * and σ = 1 − φ * , there is no longer a time dependence in P 0 . To calculate the higher order approximate solutions we make use of Eq. (3.21). The expansion coefficients, corrected by renormalization, are given by Eq. (3.22). The first few coefficients arê (3.31) Remarkably, there are fewer non-zero renormalized coefficients than regular ones; hence, the analytical expressions for the SSE distributions corrected by renormalization adopt a simpler form. Next, Eq.( 3.21) yields the following expression for the order N −1/2 approximate solution .

(3.33)
The renormalized SSE accurately describes the probability conditioned on non-extinction for φ * in the QSS region ( Figure 3A), as well as in the fast bimodal extinction region (Fig. 3B), where the QSS approximation breaks down, and accurately captures the distribution skewness. Moreover, this result is robust to changes in the carrying capacity (Fig. 4B). However, the accuracy of the renormalized SSE decreases dramatically for very low φ * values, as it can be seen in Fig. 3C. For non-linear birth-death processes featuring non-Gaussian fluctuations, the renormalized SSE consistently performs better than the LNA, which is unable to capture the distribution skewness under stationary conditions.

(iii) Approximation for low φ * values
The phases with extinction through a bimodal and unimodal transient are separated by a critical curve, which also accurately demarcates the regions of parameter space where the renormalized SSE expansion is accurate and where it is not (see Fig. 4B). We attribute the inaccuracy of the SSE in the low φ * region to the fact that the PDF conditioned on non-extinction no longer features a mode around N φ * , which is the basis of the SSE approximations.
In the unimodal extinction region -black region in Fig. 2F-the time-dependent probability distribution features a mode at n = 0 as extinction is approached. This suggests that we can approximate the cell number concentration by the steady-state of the deterministic equations, φ * ,  D). B In the bimodal extinction region, the renormalized SSE is the only accurate approximation. C In the unimodal extinction region, the HG provides the best approximation. D The Hellinger distance between the various distribution approximations and the exact distribution confirms that the QSS, SSE, and HG are the best approximations for the quasi-steady state, bimodal extinction, and unimodal extinction regions, respectively. and the cell number concentration fluctuations (conditioned on non-extinction) by means of the LNA, Eq. (3.27), yielding σ 2 = 1 − φ * under stationary conditions. Note that the latter conditions naturally arise from the conditioning of the distribution on non-extinction. Given the mode at zero, a Gaussian is clearly not a good approximation and hence instead we try a half-Gaussian approximation with the aforementioned first two moments The half-Gaussian provides an excellent approximation to the PDF conditioned on non-extinction in the low φ * unimodal region (Figure 3 C). However, as expected, this approximation breaks down for higher φ * values, where the non-trivial mode is present (Fig. 3B and D).

(c) WKB approximation
An alternative way to obtain a quasi-steady state approximation is the popular Wentzel-Kramers-Brillouin We can now write the master equation for the PDF of the continuous variable φ, Π(φ, τ ) = P (N φ, τ ). It follows that P (n ± 1, τ ) = P (N (φ ± 1/N ), τ ) = Π(φ ± 1/N, τ ). Mutiplying the CME (3.37) Next, the WKB approximation amounts to assuming a solution of the form, where S(φ) and K(φ) are of the order of unity. Substituting in the quasi-stationary master equation, expanding with respect to N −1 , and collecting the leading order terms yields where S (φ) = dS(φ)/dφ. From here, we note that the above equation corresponds to a stationary Hamilton-Jacobi equation (H(φ, S (φ)) = 0), for an action S with Hamiltonian with p = S (φ). The corresponding Hamilton equations are (3.41) Since we are interested in the zero-energy solution (H = 0), with initial conditions φ(t 0 ) = φ * , the action along a fluctuation trajectory will be given by We are now ready to calculate the action S(φ). Substituting in Eq. (3.42) and integrating, we obtain It is possible to prove, using the next order contributions to the WKB expansion, that the prefactor K(φ) is where A is later determined by the normalisation condition [45]. Finally, the WKB expansion for the vBD system yields the probability distribution .

(3.46)
While the WKB is a good approximation for large values of φ * (Figure 3A), it fails for small and intermediate values of φ * (3 B-C). In general, the range of validity of the WKB approximation coincides with that of the QSS approximation (compare Figs. 4A and C). An advantage of the WKB approximation stems from its simple analytical form. However the QSS approximation is generally more accurate than the WKB approximation (see Fig. 3D).

Calculation of the extinction time
In the previous section we have shown different approximations for the leading eigenvector of the vBD model's master equation solution, Eq. (2.6). The time-dependent component of the solution is determined by the leading eigenvalue, λ 1 , which is the inverse of the expected extinction time. The most direct method for estimating this eigenvalue would be to calculate the probability conditioned on non-extinction, f (n), and making use of Eq. (2.7) to yield λ 1 = b 0 f (1). However, the absorbing boundary at n = 0 leads to an inaccurate estimation of the probability of having n = 1 stem cells (using all approximation methods considered) and thus we cannot use this method to estimate the extinction time (see Fig. 5A). Hence, we present an alternative calculation for the expected extinction time that is based on averaging the mean extinction time starting from any state, and makes use of the Kolmogorov's backward equation. This approach has been effectively used for other similar problems (see for example [48,49]). Instead of relying solely on the estimation of f (1), this method involves averaging among all the f (n) values, which significantly improves the accuracy with respect to direct application of Given an estimate for the probability distribution of the surviving trajectories, f (n), the expected extinction time T is simply the average among initial conditions of the mean first passage times to hit the extinction state where τ * n is the mean first passage time to hit the extinction state, starting from the state with n cells. Hence, to estimate T we need to find the mean first passage times τ * n . To do so, we make use of the discrete-time Kolmogorov's backward equation (4.2) where Q 0,n (τ ) is the probability of being extinct at time τ , given that initially there were n cells. The backward equation just states that the total probability of becoming extinct from state n at time τ + ∆τ equals the probability of jumping to the state with n + 1 cells and then going extinct, plus the probability of jumping to state with n − 1 cells and then going extinct, plus the probability of staying in the same state and going extinct. The probability density of becoming extinct at time τ + ∆τ is Q 0,n (τ + ∆τ ) − Q 0,n (τ ). Hence, the mean first passage time to extinction, starting from a state of n cells, reads Here, we assume Q 0,n (−∆τ ) = 0. To simplify the notation, let us denote with Qn(τ ) the cumulative probability of becoming extinct at time τ given that the system started in state with n cells. Substituting τ + ∆τ → τ in Eq. (4.2), we have . Subtracting both expressions, multiplying by τ = k∆τ and integrating over time (which in this case amounts to sum over all k), leads to obtain a n+1 τ * n+1 + b n−1 τ * n−1 − (a n+1 + b n−1 )τ * n + 1 = 0. (4.5) The first boundary condition for this recurrence relation is τ * 0 = 0, i.e., the mean first passage time to extinction starting from extinction is zero. The second condition is τ * N = τ * N −1 + 1/b N −1 , i.e., the mean first passage time to extinction from N cells is the corresponding one from N − 1 cells plus the mean time in which the system hops from N cells to N − 1 cells.
To solve for τ * n we define vn = τ * n − τ * n−1 . Thus, the recurrence relation becomes which can be solved iteratively (applying the corresponding boundary condition), to yield Finally, solving for τ * n yields Naturally, the accuracy of the expected extinction time T (and its inverse, λ 1 ), is sensitive to the accuracy of the approximate PDF conditioned on non-extinction, f (n). When calculating λ 1 from Eq. (4.1) and Eq. (4.8) (with f (n) estimated numerically), the result is in very good agreement with λ 1 numerically calculated from the master operator's eigenvalues (compare the orange and black dashed lines in Figure 5B). The approximate λ 1 calculated using Eq. (4.8), where f (n) is obtained using one of the approximations for the probability distribution conditioned on nonextinction (Fig. 5 B), is much closer to its real value than λ 1 calculated via Eq. (4.1) (Fig. 5 A). For high φ * , all f (n) approximations lead to accurate λ 1 estimations, with the exception of the half-Gaussian. On the other hand, the half-Gaussian approximation performs well for very low φ * , when every other approximation of f (n) yields inaccurate λ 1 estimations. In the intermediate region, the renormalized-system size expansion turns out to be the best choice.
Extinction times vary significantly between parameter regimes, which we can think of as corresponding to different stem cell niches. Assuming that the system is in quasi-steady state, the average number of stem cells is roughly given by n ≈ N φ * . Hence, given the average number of stem cells in a niche, we can estimate the effective carrying capacity for different φ * values from the quasi-steady state parameter regime. It is unclear how these stem cells are distributed in individual niches, but even for niches hosting 100 stem cells the extinction times fall within 10 11 − 10 13 years, much higher than the human lifetime. For niches hosting a larger number of stem cells the extinction time is even higher.

Discussion and Conclusion
We have introduced the birth-death process with volume exclusion, a variation of the birthdeath process that incorporates crowding effects due to the finite size of stem cell niches. For cell division to occur there needs to be free space in the niche to accommodate the new born cellhence the effective proliferation rate is higher (lower) when the niche is less (more) populated. In effect, the expected stem cell number in the vBD model is independent of the initial condition, in contrast to the CBD case, in which the expected number of stem cells is constant in time.
Regulation through volume exclusion could also affect clone size distributions and their scaling with average clone size. In contrast to the CBD process, in which the size evolution of different clones is independent, in the vBD all clones are coupled through the empty space species. When one clone grows in size, the probability of other clones growing is decreased, thus affecting the clone-size distribution. Hence, it might be possible to detect evidence of volume exclusion effects from snapshots of clone size distributions, which are commonly measured experimentally [54-57]. We will explore this in follow-up research. At the stochastic level, the predictions of the vBD master equation differ significantly from those of its deterministic counterpart. While the deterministic rate equations feature a stable steady state, φ * , to which the system converges logistically, the master equation's solution predicts the vBD model converging to extinction for all parameter sets. However, for φ * sufficiently large, a long-lived quasi-steady state appears, and convergence to extinction is very slow. Hence, for high φ * , a single stochastic trajectory representing a real system, might fluctuate around the quasi-steady state's non-trivial mode for all its lifetime.
We have shown the vBD model to have three phases with different behaviours that, to the best of our knowledge, have not been analysed before: fast extinction dynamics through a unimodal transient, fast extinction dynamics through a bimodal transient, and slow extinction dynamics with a quasi-steady state. Transient bimodality rarely occurs in chemical reaction networks, but has been reported in recent works [58,59]. For the quasi-steady state region, we have shown two independent approximate solutions to the master equation, the QSS and WKB approximations. The QSS provides a more accurate approximation, whilst the WKB adopts a simpler mathematical expression. Moreover, the QSS approximation allowed us to prove that the position of the nontrivial mode generally differs from the deterministic prediction. For the bimodal extinction region, we have derived an approximate time-dependent solution to the master equation by making use of a renormalized system-size expansion, which is particularly useful for solving master equations of non-linear birth-death processes, but has not been widely applied. Remarkably, the expression obtained by the renormalized SSE is simpler than the one from the regular SSE.
The vBD model is mathematically similar to susceptible-infected-susceptible (SIS) models in epidemiology [60-63]. Most of the stochastic SIS models studies consider a very large carrying capacity, or let it tend to infinity. Here we have instead focused on solving the master equation for low carrying capacities, motivated by the application to stem cell population dynamics in niches that can have a carrying capacity as low as few tens of cells. The approximate solutions to the master equations we present here, thus, can shed light into the behaviour of SIS-like systems when the effect of having a finite carrying capacity becomes evident. In particular, the solutions of the vBD master equation can be interpreted as the time-dependent PDF of the number of infected individuals, and our approximation of the expected extinction time as the population's expected time to recovery. Another similar model can be found when studying the role of positive feedback in cluster formation of signalling molecules [64]. In this case, the transition from quasisteady to extinction states are interpreted as a switch from clustered to non-clustered states. A difference between such model and the vBD is that transitions to non-clustered to clustered states are allowed, whilst in the vBD model it is not possible to exit the extinction state.
Our study proposes and describes a minimal model for stem cell dynamics with regulation through competition for space. In this context, the system parameters should be taken as effective parameters encompassing many different features. Let us conclude by discussing aspects of biological realism that could be represented explicitly in future extensions of our model. The vBD model assumes Markovian dynamics, which implies exponentially distributed waiting times between consecutive cell divisions. There is increasing evidence showing that the cellcycle times are not exponentially distributed [65][66][67][68][69]. The inclusion of realistic cell-cycle time distributions in the vBD model will require further research. Spatially, our model considers cells distributed on a grid, assuming that the cell shapes remain unaltered. This hypothesis ignores the mechanical plasticity of cells, their mechanical response to pressure in different environments, and the irregular geometry of the stem cell niches [16,70]. Moreover, here we have modelled cell populations as well-mixed systems. This well-mixing assumption may be inappropriate when a niche is highly occupied. Since our aim is to study the hallmarks of regulation through volume exclusion, we have also neglected other plausible regulatory mechanisms that might be acting in parallel, such as competition for other resources [10,71,72] or cell-cell communication pathways between more differentiated (for example, transit amplifying) cells and stem cells [6,73]. In open niches these more differentiated cells might also compete with stem cells for space [10,72,74,75]. The role of multiple cell types in competition for space and the interaction between different plausible regulatory mechanisms are interesting avenues for future research.