Master equation approach to the stochastic accumulation dynamics of bacterial cell cycle

The mechanism of bacterial cell size control has been a mystery for decades, which involves the well-coordinated growth and division in the cell cycle. The revolutionary modern techniques of microfluidics and the advanced live imaging analysis techniques allow long term observations and high-throughput analysis of bacterial growth on single cell level, promoting a new wave of quantitative investigations on this puzzle. Taking the opportunity, this theoretical study aims to clarify the stochastic nature of bacterial cell size control under the assumption of the accumulation mechanism, which is favoured by recent experiments on species of bacteria. Via the master equation approach with properly chosen boundary conditions, the distributions concerned in cell size control are estimated and are confirmed by experiments. In this analysis, the inter-generation Green's function is analytically evaluated as the key to bridge two kinds of statistics used in batch-culture and mother machine experiments. This framework allows us to quantify the noise level in growth and accumulation according to experimental data. As a consequence of non-Gaussian noises of the added sizes, the non-equilibrium nature of bacterial cell size homeostasis is predicted, of which the biological meaning requires further investigation.


Introduction
Bacterial cells manage to grow and divide into two cells in a cell cycle. As a common observation, the sizes of the bacterial cells varies from one to another, but the fluctuation is well controlled [1][2][3]. The fascinating problem arises how bacteria coordinate the mass growth and division processes to keep the cell size stable. As a first approach to investigate the cell size control mechanism, the experiments were performed to measure the mean cell mass in various growth conditions. In 1958, Schaechter, Maaløe, and Kjeldgaard discovered the relation between the mean cell size and the growth rate in their experiments with extraordinary precision [4], which set the foundation of studies on cell size control for 60 years. Following their work, there had been a wave of fruitful investigations in bacterial physiology for quantitative understanding of the cell size distribution in an exponentially growing culture [5][6][7][8][9]. The control mechanism beneath the distribution was, however, not clear due to limited information provided by experiments in that age. In the recent decade, the newly developed mother machine experiment makes it possible to track cell cycles of single cells for long term [10], which initiated a new wave of studies on cell size control [1,3,[10][11][12][13]. With the help of the single-cell data, deterministic models of bacterial cell size control have been established on the level ofmean values [3,14,15]. By incorporating randomness in these models, attempts for understanding the ever-fluctuating cell cycles were also made in various approaches [3,13,[16][17][18][19][20][21][22][23].
As the key technique boosting the recent wave of studies, mother machine refers to a new bacterial culturing method realized by the microfluidic techniques, where channels of micrometre-width are fabricated to trap single bacterium while continuously supplying it with fresh medium [10]. In the experiments, only one mother cell is constantly trapped in the channel while all the daughter cells are pushed out during the cell growth and replication. One can easily measure the birth size, division size, and inter-division time of mother cells for generations in such setup. It was observed that the added size of the cell between successive divisions is uncorrelated with the birth size. The "adder principle" was hence established as a common strategy of bacterial cell size control, which suggests that a cell attempts to add a constant size before the next division [1,3]. The following question appears how a cell could sense the added size. A natural guess comes that the added size is sensed by certain circuits of chemical reactions. The reaction locales of the circuits are, however, non-trivial. They can not be uniformly distributed in the bulk of cytoplasm, since the chemical reactions in bulk is relevant to the concentration but not the absolute abundance, where the total cell volume plays no role. In bacteria with no nucleus, the reaction locales may either be on DNA or cell membrane. A likely candidate of the reaction is the assembly of the contractile ring on the membrane [13,24]. It was reported in the recent experiment the assembly is re-initiated in each newborn cells and the division can be conducted only if the assembly is completed [13]. Based on the above knowledge, an accumulation model was introduced that a newborn cell accumulates certain molecules as the adder indicator and it divides when the absolute abundance of the indicator reaches a threshold [3,25].
Quantitative modelling of the cell growth and division processes dates back to the pre-DNA age [26,27], where cell growth was considered as simple and clean stochastic processes due to the lack of detailed biological knowledge. Enlightened by the success of the microbiologists in the 1960s, Bell, Anderson and their colleagues developed the mathematical framework in the fashion of master equation for systematic description of cell growth and division processes [28][29][30][31][32]. In the new era of bacterial physiology, intergeneration dynamics [17,[33][34][35][36] has been intensively studied aiming to reveal the cell control mechanism from the single-cell data provided by mother machine experiments. The models integrating the dynamics at both single cell level and the population level were also developed in the context of modern biological knowledge [2,3,16,[19][20][21][22]37]. While stochastic features were inevitably incorporated in the models, a clear investigation on the stochastic accumulation model is still absent.
In this theoretical study, we investigate a transparent model for the stochastic growth-accumulation-division process with only two free parameters. The distributions concerned in cell size control are estimated, which are well confirmed by the experiments. As the central result of this work, we demonstrate the inter-generation Green's function is the key to bridge two kinds of statistics employed in the batch-culture experiments and that of the mother machine ones. It is analytically shown that not only the mean value but also the distribution of added size is independent of the birth size, which is elucidated in the picture of subordinated random walk. The analysis on experimental data shows that the noise levels significantly vary for different growth conditions. It hence suggests that in addition to growth rate, the noise strengths are necessary parameters to describe the bacterial cell cycle on the level of distributions. Evaluating the entropy production of the inter-generation stochastic dynamics, the non-equilibrium nature of bacterial cell size homeostasis is further predicted.
The paper is organized as follows. Section 2 introduces the stochastic accumulation model in the framework of master equation. Section 3 revisits the cell size distribution of the exponentially growing culture. Section 4 studies the inter-generation stochastic dynamics derived from the comprehensive model, where the analytic expression of the distribution of added size is shown. Section 5 estimates the noise levels of the experiments. Section 6 reveals the non-equilibrium nature of the cell size control mechanism. Section 7 contains discussions, including brief historical remarks, the connections with the other models, the differences and connections between the statistics in batch culture and mother machine, the noise levels in various growth conditions, and some biological implications. Section 8 gives a summary.

The deterministic population dynamics
This work studies the accumulation model, which is illustrated in Fig.1. According the model, the cell size x exponentially grows with the rate dx/dt = γx, while it accumulates certain adder indicator s from the newborn value s = 0 with the rate ds/dt = x/m 0 . The cell division happens when the adder indicator reaches the division threshold s = 1. Assuming even division, the birth sizes of the daughter cells are half of the division size of the mother, The framework of master equation of the population dynamics is employed to investigate the model. Let us denote the number of the cells of size x, age τ and the adder indicator s at time t in the whole population by u(x, s, τ, t). Taking cell age, size growth and indicator accumulation into account, the time course of u is governed by the master equation where we have adopted the exponential cell growth assumption that the rate coefficients are set as k (x) = γx with the cell growth rate γ and k (s) = x/m 0 . We will see later m 0 sets the mean added size for each cell cycle.
The accumulation model assumes cell accumulates certain protein s as the adder indicator from s = 0 at birth. The cell divides at the threshold s 0 = 1. The division current, i.e. the number of division events in unit time, can be written as To avoid additional complexity, we restrict ourselves in the case of perfect even-division, of which the two daughter cells have the same size. It suggests the birth current Noting that the age and the adder indicator are reset for the newborn cells, the full division-birth process can be written as the boundary condition Eq.(1) and Eq.(3) define the population dynamics of exponentially growing bacteria colony.
In this model, the growth rate γ is the only parameter relevant to the time scale. The equation is thus invariant under rescaling by The dynamics of the rescaled cell size is independent of the growth rate γ and the typical size m 0 . The famous bacteria growth law observed by Schaechter-Maaløe-Kjeldgaard [4], x ∝ γm 0 , is hence a direct consequence of the rescaling. The rescaling approach remains valid for the slow growth cases, of which the recent experimental study suggested the correction on m 0 by γm 0 ∝ (γ + γ 0 ) [25]. The dynamics defined by the above continuous master equation is deterministic. To be explicit, let us consider a colony starting from a cell of known birth size x 0 , with the initial condition u(x, s, τ )| t=0 = δ(x − x 0 )δ(s)δ(τ ). According to Eq.(1) and Eq.(3), the joint distribution u would ever be of zero-width. It is obviously not the case of a real bacterial colony. The stochastic theory including noises in cell growth and division is hence required.

The stochastic population dynamics
Master equation is naturally a language for Markov processes on discrete states. The deterministic dynamics defined above can be easily generalized to the stochastic process on discrete cell sizes {x i = iδ x } and adder indicators {s j = jδ s }, which follows the master equation where u ij denotes u(x i , s j , τ, t) for simplicity. Cell division is defined by the boundary condition where n 0 = 1/δ s is the division threshold for the adder indicator. Acknowledging the proteins are generally synthesized in the cell in a bursty manner [38], n 0 would be understood as the number of bursts required for synthesizing adequate specific divisionrelated protein.
Similar to that in the continuous case, scaling can be introduced by

equation is written as
while the boundary condition is given by Compared with the continuous case, there are two additional parameters left in the rescaled dynamics, δ x and δ s , which come from the noises in growth and division. In some cases, it would be convenient to introduce the inversed noise strength α = 1/δ x and β = 1/δ s . In the rest of the paper, the dynamics and distributions are investigated in the rescaled form, where γ and m 0 are set to 1 and the noise parameters play the key roles. The stochastic processes described by the above master equation can be understood as random walk on the two-dimensional lattice {x i , s j }, as shown in Fig.2. The random walk initiates at the cell birth with (x = x b , s = 0). For each state (x i , s j ), the walker jumps either to (x i+1 , s j ) by the rate k (x) i = αx i or to (x i , s j+1 ) by the rate k (s) i = βx i , until the cell division at (x d , s = 1). The pseudocode for simulation in Gillespie's fashion [44] is provided in Appendix D for better understanding, through most of the results are obtained by analytic and numerical approaches instead of simulation.
the discrete cell size s j = j · δ s the discrete adder indicator δ x , δ s minimal increments of cell size and adder indicator Rescaled noise parameters the rescaled increment rate for adder indicator Cell cycle Since the model is defined for discrete {x i , s j } and continuous τ , we would like to clarify that the expressions are often simplified as P (x i , s j , τ ) = (dτ ) −1 Prob (x i , s j , τ ∈ (τ, τ + dτ )), with the capital letters. The probability density functions (PDF) are introduced for comparisons among various δ x , δ s , and the experimental data, which are denoted by the lower letters as p(x, s, τ ) = (δ x δ s ) −1 P (x i , s j , τ ). There have been already series of studies on bacterial cell cycle, where the stochastic dynamics were constructed in different ways [3, 16-18, 21, 22, 27, 33, 34, 37, 39]. The connections and differences among the previous models and our model will be discussed in Sec. 7.2.

Batch culture and the cell size distribution
In biological studies, bacteria are usually batch cultured where they are kept in the exponentially growing phase by frequent transfer between batches of fresh medium before it exhausts the environment nutrient. Due to least disturbance on bacterial  11)). The symbols are from the experiments by Gray et al. [12].
(Right) Same as left, but in log-scale.
cells and the ability to keep a steady growth rate, the batch culture procedure was widely employed in the studies of the 1950s and 1960s when the cell size distribution has been intensively studied. Due to the rapidly increasing cell number in the crowding colony, the batch culture experiment is not convenient for single-cell tracking even with the help of modern techniques. The statistics in this case hence usually involve all the cells in different stages, from the newborn one to the ready-to-division one. The master equation Eq. (7) with the boundary condition Eq.(8) provides a straightforward description of the exponentially growing population.
One can numerically integrate the dynamics over time and observing the development of a bacteria colony. The cell number of the colony can be evaluated by summing over all the states as As Eq.(7) defines the growth process where the cell number is conserved, cell doubling arises only during the division process defined by Eq. (8). Considering a colony starting from one cell, the doubling events behave as steps of N (t) in the initial generations. The synchronization blurs afterwards due to noises. In the long time limit, a steady state of exponential growth is achieved, where the cell number grows smoothly.
In the current framework, the cell size distribution can be evaluated as the marginal probability The corresponding PDF is defined as p(x) ≡ P (x i )/δ x . In the long time limit, the steady distribution can be achieved, which is shown in Fig.3. To our surprise, the cell size distribution is so well confirmed by the data from the experiment [12]. Revisiting the historical literatures, we realize that the microbiologists in the 1960s have already obtained similar results in a series of fruitful investigations of different approaches other than master equation. (A relevant historical remark is included in Sec. 7.1.) It is known since then the inverse-square law for the exponentially growing and evenly divided cells, which is shaped by the growing cells with size x ranging from the birth size x b to the division size x d = 2x b [5,7]. Including the fluctuation of birth size and division size in consideration, the distribution is blurred with fast-decaying tails, as shown by the solid lines in Fig.3. Similar results were also noticed by Koch [9]. While the experiments in the 1960s have roughly confirmed the results, the high-quality modern experiment [12] provides more details of the distribution. As shown in Fig.3, the experiment suggests a rather fat tail for large cell size, which may indicate postponed cell divisions due to certain unknown biological reasons.

Mother machine and the inter-generation stochastic dynamics
Being developed in the 21st century, the mother machine technique provides a new way to cultivate bacteria. Since only one of the two daughters of each division is tracked, the cell number in the channels of mother machine is constant over generations. Noting this, the modified boundary condition is introduced for mother machine as where the prefactor for the number of tracked newborn cell is changed to 1. One can of course investigate the cell size distribution of the population, similar to Sec.3. It is, however, more interesting to investigate the distributions of birth sizes x b , division sizes x d and inter-division times τ d of the mother cells as they can be recorded as series over generations through the mother machine experiment, which makes it ideal to study the cell size control mechanism.

Master equation and Green's function of the inter-generation stochastic dynamics
The cell size control mechanism is usually formulated as the generation-wise dynamics of the division size (or birth size) from the ith generation to the (i + 1)th one as where perfect even-division is assumed for the birth size x where ∆ 0 is the expected added size and ξ is the noise. It is referred to the discrete-time Langevin dynamics in literature, where the noise ξ has been assumed following normal distribution [34] or log-normal distribution [17,33] based on arguments.
The inter-generation stochastic dynamics (Eq.(13)) can be equivalently described by another master equation, which concerns the evolution of division size distribution over generations as where Green's function G x is the conditioned probability of the division size for a cell with the given birth size x In the case of discrete cell size, the above equation is written as where P In the more compact matrix form, it is given as It shows that the inter-generation stochastic dynamics are fully determined by Green's function, which works as a matrix on the vector of division size distribution.

The distribution of added size
In general, the inter-generation dynamics is a consequence of the intra-generation dynamics, i.e. the growth and division processes. Assuming the current stochastic when the adder indicator s reaches the threshold. The initial condition can be written as u| t=0 = δ(x − x b )δ(s). Concerning the division size and inter-division time, the full Green's function is given by the first-passage distribution at for three typical x b in which one may notice the inter-division time decreases with the birth size and the division size increases with it.
In the generation-wise dynamics described in Sec.4.1, one focuses on the series of division sizes (or equivalently birth sizes). By integrating over the inter-division time, the Green's function can be obtained as Bearing in mind the adder principle, one may reorganize the Green's function of the successive division sizes as the distribution of the added size by It is well known in the deterministic accumulation model that the mean added size depends only on the growth rate but not on the birth size. The current stochastic model provides the opportunity to further investigate the distribution of the added size. As one of the main results of this study, the distribution is explicitly obtained as where the inversed noise strength α = 1/δ x and β = 1/δ s are used for simplicity. The corresponding PDF follows: It can be obtained by directly solving the discrete master equation, of which the technical details can be found in Appendix A. Another quick and illustrative approach will be soon shown in this section, which explains why Eq. (20) is in the form of the negative binomial distribution. The noise strengths determine the shape of the rescaled distribution of added size, which can be characterized by the mean value, coefficient of variation, and skewness, as shown in Appendix C. Figure 5 shows the distributions along with the experimental data [3,11], where the theoretical results are presented by lines for illustration without fine-tuning on the parameters. One can read significant differences among the distributions of experiments in various growth conditions, even after rescaling over the mean added size (or saying, rescaling over the growth rate). It would be a consequence of different noise levels in various growth conditions, which leads the investigation in Sec. 5.
Let's keep in the track of theoretical investigation of the model in this section. One may note that P (∆|x b ) is independent of x b . It would be a general feature of the accumulation mechanism, which has also been reported in the experiment [3] and a previous theoretical study on a model neglecting growth fluctuation [16]. The picture of random walk on the two-dimensional lattice {x i , s j }, as illustrated by Fig.2, would be helpful to clarify it. Since the transition rates depend only on the current state, it is a kind of trap dynamics [40], of which the subordination idea works [41][42][43]. Following this idea, the time-dependent process {x i , s j , t} can be described as the jumps of a random walk on the lattice {x i , s j , n} subordinated by the random waiting time {t n } for jumps. It leads to Gillespie's algorithm [44], of which the trajectory on the lattice and the series of waiting time can be generated independently. Estimating the distribution of division size, only the former one, i.e. {x i , s j , n}, is concerned. It depends on the ratio r w = k (x) /k (s) but is irrelevant of the scale of the waiting time τ w = k (x) + k (s) −1 . Noting that r w = α/β is constant for all the states, one can immediately see the random walk {x i , s j , n} is independent of the initial cell size x b . The first-passage distributions of x at s = 1 hence have the same shape, as illustrated in Fig. 6. The above reasoning can be extended to more general cases, as long as the growth rate k (x) and accumulation rate k (s) depends only on the current cell state while the ratio k (x) /k (s) is independent of cell state. This requirements can be satisfied by most adder dynamics, which explains the collapse of distributions observed in the experiment and the previous model study [3,16]. More discussion on the biological implication can be found in Sec.7.5.
A quick and illustrative approach to Eq.(20) follows the above insights is to map the random walk to "Bernoulli trials" with success probability q = β/(α + β). The cell divides when n s = 1/δ s + 1 success trials, i.e. n s -steps of accumulation of the adder indicator, are achieved. In this case, the number of failed trials n x , i.e. the steps of cell growth, follows the negative binomial distribution as Noting ∆ = n x δ x , α = 1/δ x , β = 1/δ s and the symmetry of combination number n m = n n−m , one can obtain Eq. (20). There have been recently a series of investigations [16,[21][22][23] on a special case of the current model with δ x → 0 (or α → ∞). In this case, the exponential growth is assumed in the continuous and deterministic manner while cell evolves along serial stages with random waiting times. At the final stage, the division happens. In the case of α → ∞, the added size distribution given by Eq. (20) and Eq.(21) becomes the Erlang distribution with the PDF: To obtain the above result, Stirling's formula, ln n! = n ln n − n + O(ln n), is used. The above special case leads to a long argued assumption that the division size follows the log-normal distribution, see e.g. [3,17,33,45]. This assumption also follows the insight of the cell cycle as serial stages while the cell grows exponentially with a constant rate. In the case that the number of the stages is large, the inter-division time   can be approximated by the normal distribution with the probability density Noting the exponential growth with the constant rate γ leads to x d = x b e γτ , and also p(τ d )dτ d = p(x d )dx d , one can obtain where , one can expect the added size follows a shifted log-normal distribution by the probability density Although the above argument can not justify the independence of added size on x b , it is still supported by our model that Eq.(26) with x b = ∆ 0 is a good approximation of g(∆), as shown by the lines in Fig.5, of which the deviance can be distinguished merely in the tail regions.

The steady distributions in the long time limit
It has been a common practice to measure the steady distributions of division size and inter-division time in long time mother machine experiments. We estimate the steady distributions here.
Following Eq. (16), the distribution of division size evolves over generations according toĜ. Noting the conditional probability G x d , we see in the discrete version m G mn = 1 for any n. In other words,Ĝ is a Markov matrix. The distribution hence should evolve towards a steady one as the eigenvector ofĜ by P ss x =ĜP ss x .
Provided Eq. (20) and G mn = P (∆ = x m − x n /2), the eigen-problem can be solved in principle. With P ss x as input, the steady distribution of inter-division time can be further evaluated by For the above evaluation, the full Green's function, where P (τ |x . It may be estimated in the subordination picture of random walk, under the assumption that the waiting time for a jump from (x i , s j ) follows the exponential distribution with the mean value τ i = [(α + β)x i ] −1 . This analytical approach is however too complicated. We use the numerically obtained Fig.7 and Fig.8 show the theoretical results along with those of the recent experiments. Given different levels of cell cycle noises, the theoretical results capture the features of the experimental data, except the heavier tails observed in experiments.
We would like to remark here that the distribution of division size and inter-division time, P (x d , τ d ) can also be obtained in the exponentially growing batch culture by counting the size and age of the dividing cells in a given time interval. In the model defined by Eq.(7) and Eq.(8), the distributions can be evaluated by the division current where C = i dτ βx i u i,n 0 (τ ) −1 is introduced for normalization. The marginal distributions can be then evaluated for x d and τ and In the long time limit, the memory on the sizes of the first cells is lost. The steady distributions are hence achieved. It has been confirmed that the distributions obtained in this approach well agree with the above results from the inter-generation stochastic dynamics, which is not shown here.

Noises level in cell growth and division processes: estimated from experimental data
The added size distributions shown in Fig.5 suggest significant variance of noise strengths in various growth conditions. In this section, we estimate the strengths of growth and division noises from the experimental data. The analytic expression given by Eq. (20) and Eq.(21) provide the exact solution of the added size distribution, of which the PDF is written as It is difficult to directly apply Eq.(33) for parameter estimation, due to numerical disasters brought by the drastic changing factors. For an applicable approximation, the improved form of Stirling's formula may be adopted that ln n! n ln n − n + 1 2 ln 2n + 1 3 π .
This logarithmic PDF can be easily employed for the maximum likelihood estimation (MLE) for the parameters. Handling the experimental data, we would also note that the added size ∆ in the above expression is the rescaled one ∆ = ∆ exp /(γm 0 ), while the scaling factor ∆ 0 = γm 0 varies for experiments. Take it into account, the added size recorded by experiments, ∆ exp , are expected following For each experimental dataset {x i }, the MLE algorithm estimates the three parameters α, β and ∆ 0 by maximizing the logarithmic likelihood Due to the non-Gaussian distribution, the transcendental equation arises in the MLE problem, which is numerically solved here. We performed MLE for the seven growth conditions in [3] and the fast growth condition in [11]. As shown in Table 2, the parameters have been stably estimated for five conditions out of eight, while MLE is not stable for the other three. All the data and the fitted distributions are shown in Fig.9. The estimated parameters suggest that the noise of cell growth can vary for three folds for different nutrient conditions, while the division noises are more or less at the similar level.
A heavy tail appears in the distribution of added size for all the experimental data sets, as shown in Fig.9. One may already notice the heavy tail appears in every distribution of cell size for the considered experiments. (See Fig.3, Fig.5, Fig.7, and Fig.9.) A conjecture arises that it comes from the postponed cell division, the reason of which is still unknown in biology. It is because of the heavy tail, which significantly reshapes the distributions shown in Fig.9f-9h, the expressions from the current model (Eq.(33), Eq.(35), and Eq.(36)) are not suitable to fit the data. While one can always add ingredients in model to fit the effects, our understanding can be truly improved only by careful experiment investigations on the rare events of the postponed division. The theory beyond the current accumulation model, based on novel biological insight, is required to embrace the strange tail.  The Gaussian fitting is also attempted to fit the data, which is shown by the dashed lines in Fig.9. Although the data are with considerable positive skewness, the Gaussian fitting with zero skewness still seems not too bad a choice except in the tail region. The following argument would help for better understanding. At the end of Sec.4.2, we have argued in the case of exponential growth with x = x b exp(γt), the Gaussian distributed inter-division time leads to the log-normal distributed division size, which is a good approximation of Eq. (33). Assuming here the linear growth case with x = x b (1+γt), the Gaussian distributed inter-division time would lead to the Gaussian distributed division size. Since the cell typically doubles the size over a cycle, λt ∼ ln 2 < 1 in the picture of exponential growth. For such λt, the linear growth is a not-too-bad approximation of the exponential growth, and so for the Gaussian distribution as an approximation of the log-normal distribution and eventually of Eq. (33).
Although the non-Gaussian and Gaussian distributions of added size look similar, it will be shown in the next section the different features of the cell size control mechanism, with or without energy consumption.

Non-equilibrium nature of the cell size control dynamics
When talking about the fluctuating yet stationary distributed cell size, one often uses the term "homeostasis". However, the precise meaning is not clear. In the view of statistical mechanics, one could go further to explicitly demonstrate the non-equilibrium nature of the cell size control mechanism, as shown in this section.
Let us start from the master equation on the division sizes over generations, Eq. (15), and re-organize it as where The two terms of the right-hand side are the inward current J in m and the outward current J out m . When the balance between the inward and outward currents is achieved for all the m, i.e. n G mn P ss n = n G nm P ss m , the distribution becomes stationary over generations. This is the general balance condition for the steady state. As a special steady state, the equilibrium state requires a stricter constraint that the balance is achieved between any pair of states, i.e. G mn P eq n = G nm P eq m , which is called the detailed balance condition. For more background on non-equilibrium statistical physics, one can see e.g. [46].
Equilibrium is usually assumed for the steady state of a closed physical system since it is the only state in which no entropy production (or energy dissipation) arises and hence requires no housekeeping energy. It is, however, not necessary for bacterial cells to satisfy this constraint. In contrast, a common strategy of living beings is to keep the system in non-equilibrium steady state (NESS) to best implement important biological functions at a cost of energy consumption [47,48]. We notice that in this model of cell cycle, the transition current J ss mn from x i d = x m to x i+1 d = x n is asymmetric for m and n, as shown in Fig.10(left). The detailed balance between the division sizes is substantially broken. The non-equilibrium feature can be quantitatively measured by the entropy production [46] for each generation as ∆S = m<n (J ss mn − J ss nm ) (ln J ss mn − ln J ss nm ) .
The heatmap of ∆S mn = (J ss mn − J ss nm ) (ln J ss mn − ln J ss nm ) /2 is shown in Fig.10(right). The non-trivial pattern reveals the structure of energy dissipation in the cell control process. Tracking the origin of the non-equilibrium feature, we find it is a consequence of the non-Gaussian noise on the added size. Assuming the Gaussian distributed P (∆) in Eq.(13) instead, the detailed balance would be recovered, which leads to an equilibrium scheme of cell size control. It is however not the case observed in experiments, where P (∆) are reported with significant skewness. It is interesting to notice that Thomas find in his theoretical investigation on the population dynamics [35] that the non-Gaussian noise with the positive skewness, which is the current case, would decrease the cell-to-cell size variability in the population level. A connection between the energy consumption in the cell size control and its biological utility is indicated, which requires further investigation.

Historical remarks on the studies of cell size distribution of an exponentially growing population
The bacterial cell cycle has been intensively investigated in the 1950s and 1960s by microbiologists in quite a quantitative way. A recent review has been given from the viewpoint of physicists [15]. Here we make short historical remarks on the achievement of cell size statistics. Since Schaechter, Maaløe and Kjeldgaard's precise measurement on the bacterial cell sizes in 1958 [4], the mechanism behind the well-controlled cell size was one of the most attractive topics in this field. In the 1960s, the Coulter counter was employed for automatic measurement of the size of cells in a culture, which drew theoretical interests to estimate the distribution. In 1961, Maclean and Munson discovered the "inverse square law of cell size distribution", saying p(x) ∼ x −2 for x ranging from the minimum (new-born) cell size and the maximum (ready-for-division) cell size [5], under the assumptions of exponential cell mass growth and perfect evendivision with sharp distributed division sizes. It has been a well-known law in biology textbooks. Collins and Richmond [6] and Koch [7] independently obtained the general relation between the distribution of division size and the distribution of cell size of the whole population. Assuming normal distributed division size, Koch obtained the cell size distribution of the whole population [9]. It suggested that the distribution was in general composed of the inverse-square peak blured by the fluctuation of division size. The shape of the distribution would be rather universal that the one obtained by Koch ( Fig.1 and Fig.3 in [9]) is quite similar to that in the current work (Fig.3), while the division size distributions are assumed in different forms. Noting the brilliant works on the cell size statistics, Bell, Anderson and their colleague from the physics community soon formulated the theory in the language of master equation [28][29][30][31][32], which led to the modern theoretical investigations.

Connections to other models
In this work, bacterial cell cycle is investigated by the model of stochastic cell growth and adder indicator accumulation as random walk on the two-dimensional lattice, combined with divisions at the threshold s = 1 for the adder indicator. There have been other approaches to construct the growth and division dynamics. Here we give a review of the previous studies and explain the connections to this study.
Taheri-Araghi et al. developed the master equation for specific cell size control mechanisms [3] with the inputs from modern biological experiments. In their work, the master equation was kept in continuous form, while the stochastic division was defined by the rate function of the cell size ('sizer') or the added size ('adder'). With the distribution of elongation rate and the distribution of division size measured in the experiment as inputs, the theoretical results fit the experimental data very well.
Noting the bursty feature of protein synthesis [38], various models of discrete dynamics have been introduced [16,21,22,37,39]. Ghusinga et al. studied the deterministic continuous cell size growth with the stochastic bursty accumulation of the adder indicator [16]. The analytic expression of the distributions were obtained in their study on the first-passage problem, where the added size is shown independent of the birth size. Modifying the accumulation rate, Nieto et al. generalized the above bursty accumulation model to encompass various cell size control mechanism [21]. In this track, Jia et al. has recently investigated the bursty accumulation model with additional noise in cell partition [22]. The cell size distribution and the inter-division time distribution have been analytically evaluated in the context of mother machine experiments. Jędrak et al. studied the model of stochastic protein bursts and stochastic partition during division by an exactly solvable generalized Fokker-Planck equation, where the division rate is set to constant [39]. Ibis-Biswas et al. performed experiments on Caulobacter crescentus cells, in which experiment the collapse of cell size and interdivision time distributions was observed [2]. The stochastic Hinshelwood cycle (SHC) model was introduced to explain their observation [37]. The SHC model is basically a master equation on discrete states, of which the boundary condition for cell division can be set in various ways. Digging into the historical literature, one can see that Kendall has proposed a similar model by considering the cell cycle as a Markov process exactly in the language of master equation of a stochastic birth process [27]. In the current work, we renew the theory by the accumulation mechanism, which is supported by recent bacterial experiments [3,13,14]. The collapse of the distributions in the adder case arises in the stochastic model. Via the explicitly obtained Green's function, we bridged the accumulation model to the inter-generation processes.
The inter-generation stochastic dynamics is a phenomenological description of the propagation of division features, such as the birth size or the division size, in cell lineages. It is a convenient language to investigate the cell size control mechanism [17,33,34]. In the current study on the accumulation model, the steady distributions are evaluated and well confirmed by the mother machine experimental data. Noting the non-Gaussian distribution of added size, we further predict the non-equilibrium nature of the intergeneration stochastic dynamics and the cell size control mechanism.
We note in the last that the stochastic cell growth can be also introduced by adding diffusion terms to the continuous master equation (Eq.(1)), which leads to a Fokker-Planck (FP) equation. The equivalent Langevin dynamics of the FP equation has been investigated by Pirjol et al. for cell growth [18]. Under Kramers-Moyal expansion, the processes defined by the FP equation and the discrete master equation are equivalent to the second order. The third moments are different, which affect the skewness of the added size distribution. Physical reasoning of the difference comes that the cell size does not shrink during growth following the discrete master equation, while it is possible in that defined by the FP equation with naive diffusion terms. The moments of stochastic growth defined by the current model and the Fokker-Planck equation are estimated in Appendix B.

Statistics in batch culture and in mother machine
In this study, the batch culture experiments and mother machine experiments are discussed separately because they are different in statistics. In batch culture experiments, due to the difficulty to track single cells for a long time, the statistics are usually on snapshots of the population, or saying, on cross-sections of the expanding lineage tree. In mother machine experiments, one can track few single branches on the lineage tree for generations, but not for all the offspring. The differences recalls the similar case in statistical mechanics, where one can either take a snapshot of the system by recording the states of all the particles at a certain time (known as the Gibbs ensemble), or track the motion of few particles and take the averages along the trajectories.
To solve the different statistics, different languages have been developed to describe the two observations. The inter-generation stochastic dynamics focuses on the feature propagating along a single branch and the master equation concerns distributions in the view of population dynamics. In the current study, we have translated the master equation description into the inter-generation stochastic dynamics. The reverse translation has been formulated by Thomas by studying the statistics on a Cayley tree [35]. It is still subtle how to bridge two types of experiments by assigning proper weights to division events [35,49], especially in the case of large noises.

The noises in the cell cylce
It has been a great achievement to construct the dynamics of bacterial cell cycle with the growth rate as the most important parameter, since the amazing finding [4] by Schaecther et al. To understand why the system with external inputs of multi-degrees of freedom can be characterized by the single parameter, recent studies conducted by Hwa group constructed the coarse-grained model of cellular processes [50][51][52][53], which reveal the hidden principle on the connection between growth rates and the partition of ribosome. This principle is well confirmed on the level of mean value by experiments with perturbations on various resources to break the degeneracy.
Looking at the distribution of the data, the variance of the noise strengths can, however, not be neglected. The noise strengths are included in this study as the additional parameters. It is shown that the strength of the growth noise can vary for three folds in various nutrient conditions. A biological reasoning comes that cell growth involves most of the noise sources in cell life [54], including the nutrient uptake, the energy metabolism, the synthesis of macromolecules, etc. The noise strength may hence significantly vary for different growth conditions due to all the fluctuations in the processes. In contrast, the division noises mainly come from the bursty synthesis of few kinds of division-related proteins, which may be less affected by the fluctuations of other cellular processes and hence can be kept at the similar level.
Although the bacterial physiological behavior has already been examined for various temperatures on the level of mean value in the seminal study by Schaechter et al. [4], the modern experiments on this topic controls bacteria mostly via nutrients and antibiotics. While the recent study on neurons can construct a clear model on the temperaturedependence of neuronal activities [55], it is currently still a question how to integrate the temperature effects into the stochastic model of cellular processes. As the temperature is often thought to affect the noise level of cell growth and other reactions, it would be interesting to investigate the cell cycle of bacteria in the mother machine in thermostat of various temperatures, to provide the cellular information on noises and help to understand the temperature effect on cell cycle control. Combining the proteome information from the modern mass spectrometry techniques [56], it will further offer the opportunity to understand the molecular mechanisms helping bacteria survive in the wild nature of perpetually varying temperature.

Biological implication of the decoupling between the birth size and the added size
It was observed in the experiment [1,3] that the added size is decoupled from the birth size. As a consequence, the added size distribution collapses for different birth sizes under a given growth condition [3]. Assuming the accumulation mechanism, certain connections between the cell growth and the accumulation of the adder indicator are expected. In the picture of random walk on the two-dimensional space of cell size and indicator abundance (see Fig.6), the decoupling requires the ratio of cell growth rate k (x) and indicator accumulation rate k (s) are independent of cell size. In other words, r w = k (x) /k (s) may vary as a function of s or other variables of the cell cycle but is expect with no direct correlation with cell size. In the language of biology, the expression level of the indicator, i.e. the fraction of the indicator mRNA in the whole transcriptome, is expected not to be a direct function of cell size, although it may change for different growth conditions or vary over a cell cycle.

Summary
The revolutionary modern techniques, including microfluidics and the advanced live imaging, lead to the emergence of high-quality data on cell cycle at the single-cell level. While the celebrated theory on bacterial physiology has successfully explained the experimental observations on the level of mean value, the distributions of data suggest different noise levels for different growth conditions. The features of the noises are hence necessary for a theory concerning the distributions, which leads to the stochastic model studied here.
In this theoretical work, we investigate the stochastic accumulation model in the framework of master equation. The inter-generation Green's function is analytically estimated for the model, which bridges two kinds of statistics used in the batch-culture and the mother machine experiments. It is shown that not only the mean value but also the distribution of added size is independent of the birth size. The picture of subordinated random walk is introduced to understand its biological origin, which is relevant to the expression level of the adder indicator. The theoretical results are utilized to analyze experimental data, which suggests that the division noise is kept in the similar level for different growth conditions while the strength of growth noise can vary for three folds. In the last, the non-equilibrium nature of bacterial cell size homeostasis is predicted by evaluating the entropy production in the inter-generation stochastic dynamics. The biological utility of the energy consumption during cell size control calls for further investigation.
It is fortunate thatM −1 can be explicitly obtained as In this appendix, we estimate the time course of moments and cumulants during cell growth. Let us recall the deterministic process defined by the continuous master equation, of which the ageing term is neglected for the one-generation growth τ is alway equal to t. The stochasticity is introduced by the parameters δ x and δ s in the discrete master equation (ME) For a cell with the given birth size x = x b , the initial condition is written as u ij | t=0 = δ i,i 0 δ s,0 , where the Kronecker delta is employed and i 0 = x b /δ b . The moment can be defined by 3) The cumulants are hence given as µ FP = x b e t = µ ME , σ 2 FP = δ x x b e t (e t − 1) = σ 2 ME , κ 3 FP = 3 2 δ 2 x x b (e t − 1) 2 e t < κ 3 ME . One can see the two stochastic approaches are equivalent to the second order. It recalls us the fact that the Fokker-Planck equation is the second-order approximation of the master equation under Kramer-Moyal expansion [57]. The difference arises for the higher moments, resulting in different skewness κ 3 of the size distribution.
With the expression of the cumulants, one may try the Gram-Charlie series to obtain the analytic expression of u(x, s) [58]. It would however not help, due to the serious minus probability problem when the higher-order terms with m ≥ 3 are included, which is also a known feature of Kramer-Moyal expansion [57].
We note that the Langevin dynamics corresponding to the above FP approach has been studied by Pirjol et al. [18], where the same results on x and x 2 were obtained.
Appendix C. The mean value, coefficient of variation, and skewness of the added size In the deterministic accumulation model, the added size is determined by the growth rate γ and the mean rescaled added size m 0 by ∆ = γm 0 . In present of noises, the mean value of the added size can be evaluated from the analytic expression as ∆ = γm 0 1 + 1 β . (C.1) It slightly shifts from that by the deterministic model due to the noise of division, while the growth noise would not affect it. As estimated from experimental data, β ∼ 50, the correction to the mean value is not significant. The coefficient of variation (CV) and skewness can be further evaluated as where the ratio of the inversed noise strengths, µ = β/α, is introduced. In the limit of the deterministic and continuous growth, α → ∞ and µ → 0, CV → 1/(1 + β) and Skewness → 2/ √ 1 + β. The CV and Skewness for various α and β are shown in Fig.C1. Figure C1: The heatmap of the coefficient of variation (left) and skewness (right) for different noise levels.

Appendix D. The pseudocode for cell cycle simulation
This appendix provides the pseudocode of cell cycle simulation by Gillespie's algorithm [44]. To be noted, most of the results in this work are obtained by analytic or numerical approaches instead of simulation.
Algorithm 1 Cell cycle simulation by Gillespie's algorithm Update k (s) and k (x) according to x while t < t max do t 0 ← 1/(k (s) + k (x) ) Draw a random number δt from exponential distribution with mean value t 0 t ← t + δt τ ← τ + δt Draw a random number jump uniformly distributed in [0, 1] if jump < k (s) /(k (s) + k (x) ) then s ← s + δ s else x ← x + δ x Update k (s) and k (x) according to x end if if s ≥ s 0 then s ← 0 x ← x/2 τ ← 0 end if end while