Quantifying the interplay between genetic and epigenetic regulations in stem cell development

Waddington epigenetic landscape, as a classic metaphor, has been used to explain cellular development and differentiation. However, it remains challenging to quantify the epigenetic landscape. Especially, a key issue arises as what are the underlying mechanisms for the interplay between genetic and epigenetic regulations to govern cell fate decisions in development. Based on a developmental epigenetic model combining histone modifications and gene regulations, we studied state switching mechanisms of histone modifications for stem cell development, and uncovered corresponding epigenetic landscape. The topography of landscape provides a quantitative measure for the relative stability of different attractors or phenotypes. We showed that histone regulations facilitate the occurrence of intermediate states or multistability. From the epigenetic landscape of stem cell differentiation, we identified key cellular states characterized by attractors, including pluripotent stem cell state, differentiated state and intermediate states. We also quantified representative kinetic transition paths for differentiation, reprogramming and transdifferentiation, which agree well with previous experimental observations. Specifically, previous experiments indicate that transdifferentiation can go through a mixed, unspecific intermediate or progenitor-like state. By calculating the kinetic transition paths, our developmental epigenetic models are able to replicate all these three experimental results, and therefore provide theoretical explanations for these experimental observations. We propose that epigenetic regulations play critical roles on the kinetic transitions for differentiation, reprogramming and transdifferentiation, which also provide a source for the heterogeneity of gene expressions observed in developmental process. Our work provides new insights into the roles of epigenetic modifications on controlling gene expression and stem cell differentiation, and facilitates our mechanistic understanding for the cell fate determinations regarding the interplay between genetic and epigenetic regulations.


Introduction
Cells change their phenotypes during development, giving rise to different lineages and differentiated cell types. This cell fate decision dynamics during development has been illustrated by Waddington using a metaphor, known as the famous Waddington epigenetic landscape [1]. Recently, it has been found that differentiated cells can be reprogrammed to induced pluripotent stem cells or other differentiated cell types by manipulating certain key transcription factors [2][3][4][5]. However, the mechanisms for cellular development and reprogramming have not been fully elucidated.

Mathematical model of chromatin
To combine genetic and epigenetic regulations, we first construct the Chromatin model based on previous work [21]. We considered the H3K27 methylation, demethylation using epigenetic model [21], and coupled this epigenetic model to different gene network model, including single gene transcription model, self-activation model and mutual repressed self-activation model (simplified developmental model). For the coupling between epigenetic model and simplified developmental model, we aim to obtain an epigenetic model to illuminate the interplay between genetic and epigenetic regulations in development.
First, we consider the single gene transcription model coupled with epigenetic regulations. In our model, the simulated region of the chromatin domain contains N/2 nucleosomes, each of which consists of a pair of histones (H3). Here, N denotes the number of histones (we consider N=60 histones here). Histones outside the simulated region are not considered. We chose histone number N = 60 by following [18,21]. This is roughly corresponding to an average gene size (about 10 kb). The value of N can be adjusted for a specific gene. In the activated state, Pol II can carry H3K27-demethelases (KDM) to promote demethylation and nucleosome exchange ( figure 1(A)). Demethylation occurs through both transcription-dependent and transcriptionindependent mechanisms. In the repressed state, the H3K27me3-modified histones can recruit PRC2 to methylate any neighboring histones ( figure 1(A)). In our model, PRC2 activity leads to methylation of H3K27 ( figure 1(B), orange arrows), and transcription leads to H3K27 demethylation and histone exchange ( figure 1(B), green arrows).
In our epigenetic model, me0, me1, me2 and me3 represent four methylation state of H3K27, and me1/me2 act as intermediate state between me0 and fully methylated me3. H3K27me3 is suggested to activate PRC2, H3K27me2 is suggested to activate PRC2 with a 10 fold reduced efficacy, but H3K27me1 does not activate PRC2 [34]. Therefore, the me0/me1 modification states can be grouped as neutral marks and me2/me3 as repressive marks, which inhibit transcription ( figure 1(B), orange repressive arrows). Then, the H3K27 methylation model is combined with single gene transcription model ( figure 1(B)). In our model, transcription is assumed to promote demethylation of H3K27, and methylation states are assumed to methylate neighboring histones by The elliptical nodes from left to right (me0 to me3) represent the methylation state of H3K27. Here, four histone methylation states (me0/me1/me2/me3) can switch to each other by methylation or demethylation reactions, histone methylation states affect transcription rate f (orange repression arrows) and methylation reactions of their neighbors (orange active arrows). Transcription also promotes histone demethylation (green arrows). The module with yellow arrows represent the process of gene expression for a selfactivated gene. Black arrows represent state transitions (STs). DNA * represents the DNA with protein bound to the promoter. f and f h indicate the transcription rate for gene without and with protein bound, respectively. (C) Mathematical description of model. Sum over neighbors in E i includes the other histone on the same nucleosome, and four histones on the neighboring nucleosomes. So the total number of the neighbors for each histone i is 5. P me2 me3 is the fraction of H3 histones carrying K27me2 or K27me3. α > 1 is the self-activation constant. A detailed description of each parameter can be found in table 3. activating PRC2 [21]. So, our model considers the methylation and demethylation (transition reactions among me0, me1, me2 and me3), the effects of transcription on methylation (green arrows), the effects of methylation states on transcription (orange repressive arrows), and PRC2-mediated positive feedback promoting methylation (orange arrows). To our knowledge, the mechanism for how the transcription influences the histone methylation remains to be clarified. So, for simplification, we modeled demethylation and histone exchange processes only depending on the default transcription rate f.
). The state of S i will change as each methylation or demethylation reaction happens ( figure 1(B)). For 1iN, the rates of methylation for the ith histone (denoted by r i me ) are calculated as [21]: Here, --k k , me0 1 me1 2 and k me are PRC2-mediated methylation rate and g g --, me0 1 me1 2 and g -me2 3 are noisy methylation rate. E i incorporates the positive feedback from neighboring me2/me3, and ρ me2 =0.1 accounts for the reduced efficiency of PRC2 activation from H3K27me2. δ x,y is the Kronecker delta function, equal to 1 if i=j and 0 otherwise, and E i is summed over 'neighboring' histones, where 2) has 5 neighbors except the 4 histones (i=1, 2, 58, 59) in the boundary nucleosomes. For these 4 histones (i=1, 2, 58, 59), each histone has three neighbors. We consider that this boundary effect will be small since the region of chromatin domain we are simulating is relatively large (60 histones) relative to the boundaries (4 histones).
The rates of transcription-independent demethylation, r i dem are calculated as: where γ dem is noisy demethylation rate. Transcription-dependent demethylation results in the removal of methylation groups at each histone and also replacement of each nucleosome. The rates of removal of methylation groups and histone exchange rate are represented by fp dem and fp ex , respectively (figure 1(C)).

ST approach for simulations
Stochastic reaction kinetics can be simulated by the Stochastic Simulation algorithm (SSA) [35], but the SSA can be low efficient for large chemical reaction systems. Based on the above epigenetic model, we improved the simulation approach by tracking the number of histones in different methylation state within a simulated region of chromatin instead of considering the state of each histone explicitly. Starting from the current state, we only consider which reaction occurs at the next moment without considering the specific histone location for the reaction. In this case, the number of possible reactions is reduced from 2N (N methylation reactions and N demethylation reactions) to 2 (methylation and demethylation) for a system including N histones. This significantly reduces the computational complexity of the system. We will use methylation and histone exchange as examples to to describe the improved approach, which we call the ST approach. In the original model, the propensities of methylation are given by r i If we don't consider where the reaction occurs, there are three possibilities for methylation reactions, which occur at me0, me1, and me2 state, respectively. Let Z l denote the number of histones in me l state, and = z Z N l l (lä{0, 1, 2, 3}) represent the percentage of histones in four different methylation states. The propensities of methylation are recalculated as:  where E i incorporates the positive feedback from neighboring me2/me3. We use expected value to estimate this effect, and we have: Similarly, for histone exchange reactions, there are totally 16 possible situations (figure 1(C)). We take one of the situations as an example. If the reaction occurs at me a at a certain location, the histone status at the neighbor position is me b , the propensity of this situation is calculated as fp ex ×Z a ×z b .
We made simulations by tracking the state of the system which is given by (Z 0 , Z 1 , Z 2 , Z 3 ). The reaction propensity function and state change of each reaction channel are shown in table 1. With above propensities, we let the system evolve with time and obtain the trajectories and probability distributions from statistics of trajectories. In this paper, all the simulations for histone modifications are performed with this method. Through this approach, the calculation speed of the algorithm can be significantly improved compared to using SSA. The improved ST approach can be consider as an approximation from original SSA models. By making comparisons for calculating distributions using the ESG model as an example (figure S1 available online at stacks.iop.org/NJP/21/103042/mmedia), we show that the relative error (1.87%) is very small for using our ST approach compared to using SSA. For estimating average transition time between high protein level state and low protein level state, ST approach is not as accurate as calculating landscape compared with SSA approach (with a relative error of 8.7% and 38.4%, respectively for high to low protein level and low to high protein level transition, see figure S1). However, considering the approximation used in ST approach and complexity that exists in our dynamical models, we believe that ST approach, as an approximation approach, provides a useful way to study landscape and transition time.
Epigenetic model of self-activated gene The mechanistic basis of transcriptional repression by PRC2 and H3K27me2/me3 is poorly understood. We assume the transcriptional rate f a simple linear function of the proportion of H3K27me2/me3 marked histones at the gene as = - where P me2,me3 is the proportion of me2/me3 marks [21]. In the transcriptional regulation of gene expression, we hypothesized that the protein molecules bind to DNA as a dimer and activate (or inhibit) gene expression. In our model, * DNA represents the DNA state with promoter bound to the protein ( figure 1(B)). For the self-activation regulation, the transcription rate takes the form as , where α>1 is defined as the self-activation constant, and we have = + P z z me2,me3 2 3 . Here, we modeled transcriptional activation as a factor to weaken the role of histone methylation, as indicated in above formula. In the second way, self-activated transcription is modeled as also increasing the maximal transcription rate, in which the formula changes to: m a x me2,me3 max min . Here, α m <1 represents that self-activation will also increase the maximal transcription rate (figure S2).
For the transcriptional regulation module, we use m and p to represent numbers of mRNA and protein respectively. The reaction propensity function and state change of each reaction channel are shown in table 2. By combining the transcriptional regulation module using direct Gillespie stochastic algorithm and histone modification module using improved ST approach, we can perform the stochastic simulations for the full epigenetic model of self-activated gene (see parameters values in table 3). We did not recorded the DNA state in our simulations here. We believe that this does not influence our conclusions, but warrants further explorations.

Results
Epigenetic landscape for single gene and self-activated gene To study the key properties of epigenetic regulations interacting with gene regulations, we first examine a simple single gene example, in which only the transcription and translation of one gene are considered, and couple it with the epigenetic regulations (figure 1(B), no yellow arrows). We call this model epigenetic model for single gene (ESG model). Figure 2 shows the simulation results for the ESG model. In this model, no gene regulation is considered, so only epigenetic modifications take roles. Here we plot both trajectories (figures 2(A)-(D)) for different components and potential landscape with different pairs of components (figures 2(E)-(I)). From the trajectories, we see that the system switches between two different states (low and high protein level) under fluctuations. However, the global stability is not easy to study only from trajectories. So we quantified the potential energy landscape based on the statistics for the long time trajectories [9]. Here, the potential landscape is defined as: = -U P log ss [6][7][8][9][10][11], and P ss is steady state probability distribution. The landscape surface is characterized by different colors, in which the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. On the landscape of the epigenetic model of single gene transcription, two attractors or basins of attraction emerge, which characterize two different cell states (corresponding to low protein level and high protein level), respectively (figures 2(E)-(I)). We define the barrier height as the potential difference between local minimum and saddle point, which quantifies the feasibility of transitions between two cell states. So, the topography of landscape provides a measure to quantify the global stability and transition dynamics for epigenetic cell fate decisions.
To further study the interplay between epigenetic and transcriptional regulations, we combined the selfactivation regulation to the single gene based on the ESG model ( figure 1(B), whole network). This model The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. exhibits some of the hallmarks found in developmental systems because many self-activation regulations have been suggested from developmental systems [8,36]. We call this model self-activated epigenetic (SAE) model. Figure 3 shows the trajectories of different components and the potential landscape (using different pairs as the coordinates) for the SAE model. We found that three stable states or attractors emerge on the epigenetic landscape (figures 3(E), (F), (H) and (I)), i.e. the self-activated gene can have three different expression levels (also see figure S2 for a different modeling strategy, where self-activated transcription is modeled as also increasing the maximal transcription rate). Specifically, in the methylation state (me3 high), the mRNA and protein both have two different levels (figures 3(E) and (F)), i.e. the gene or protein can have a middle level state. It can be imagined that the middle expression state will occur when the methylation and self-activation are both on because the methylation inhibits gene expression while the self-activation promotes gene expression, which results in a middle expression level state. Previous work has suggested that a self-activated gene can give rise to bistability (low and high expression) [37], where the epigenetic modifications (such as histone modifications) were not been explicitly explored. Here, our landscape results suggest that the epigenetic regulations can promote the multistability or the intermediate state. In the single gene case, gene can have low, high and middle expression level state. Of note, the intermediate state we identified here is related with our regulatory network coupling genetic and histone regulations ( figure 1(B)). Our histone modification network is based on the work of Berry et al [21], and a major difference compared with previous histone modification models [18] is that we explicitly considered the interplay between transcription and histone modifications.
Mutual repressed self-activation model (simplified developmental epigenetic model) Epigenetic factors have been long related to the cell fate decisions in development and differentiation [30][31][32]. But the mechanisms for the roles of epigenetic regulations in this process remain elusive. To address this problem, we developed an epigenetic model for development based on a gene circuit consisting of a pair of self- The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. activated and mutually repressed genes. This self-activated mutual repression gene circuit has been commonly studied as an example for developmental system [6,38]. In fact, this gene regulatory circuit has been found in many tissues where a pluri/multipotent stem cell needs to go through a binary cell fate determination for the lineage commitment [38]. For example, in the multipotent common myeloic progenitor cell (CMP), a binary cell fate decision has to be made between the myeloid and the erythroid fate. The transcription factors PU.1, and GATA1, promote the myeloid and the erythroid fates, respectively, which forms such a circuit [5]. We need to stress that for the model of self-activation and mutual repression with self-activation, only a few of the vast literature are cited.
Here, we developed a simplified developmental epigenetic model by combining the histone modification model with the mutual repressed self-activation gene network structure (gene A and B mutually repress each other). In this model, two genes mutually repress each other and self-activate themselves, and each gene (A and B) is combined with the epigenetic modification model as illustrated in figure 4. The simplified developmental epigenetic model can be expressed by the following chemical reactions:  4). Taking gene A as an example, for the gene state index ij, the first index i=0(1) stands for gene A unbound(bound) to protein B (repression); the second index j=0(1) stands for gene A unbound(bound) to protein A (self-activation). Here, α 0 (α 1 ) represent the binding(unbinding) rate constant for self-activation, and β 0 (β 1 ) represent the binding(unbinding) rate constant for repression. For these four states, the rates of transcription become: Here, f represents the transcription rate for gene without bound to any protein, whose range is [f min , f max ], f h represents transcription rate for the gene only bound to the activator protein without bound to the repressor protein, whose range is max min max , which has the highest transcription rate, and f l represents transcription rate for the gene bound to the repressor protein, whose range is b , which has the lowest transcription rate.
In our algorithm, we divide the reactions of the system into three groups, where the first group is for the reactions of gene A, the second group is for the reactions of gene B, and the third group is for the reactions related to the interplay between gene A and gene B. In the process of simulation, we first determine in which group the next reaction will occur, and then determine which reaction will occur in corresponding group. Let p A , p B denote the level of protein A and protein B, respectively. The system state is given by (p A , p B ). With above reactions, we combined SSA (for transcription and translation) and our ST approach (for histone methylation) to perform the simulations for the developmental epigenetic model (see parameter values in table 4).

Epigenetic landscape for development
We further quantified the landscape for the developmental epigenetic model ( figure 4). Figure 5 shows the trajectories and landscape for gene A and B, respectively. In figure 5, the trajectories and landscape for gene A (figures 5(A)-(F)) or gene B (figures 5(G)-(L)) are similar, since this is a simplified symmetric model. Take gene A as an example, we found that gene A can have four different states in the space of protein level and me3 level ( figure 5(E)), which also suggests that protein A can have three different expression levels (high, low and middle). We can further analyze the reasons for the appearance of attractors. Take gene A as an example, we see four   Noisy methylation rate (me0 to me1) (histone −1 s −1 ) Noisy methylation rate (me1 to me2) (histone −1 s −1 ) Noisy methylation rate (me0 to me1) (histone −1 s −1 ) k me [21]   The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. the methylation is on and there is no self-activation regulations, which leads to the lowest expression level for gene A. On the contrary, for the attractor in the up-right corner, the self-activation regulation is on and repression is off, which results in a high level expression for gene A although the methylation is also on.
From the color on the landscape, which denotes the potential U, we also see that the middle expression level state (for both A and B) is very stable as it occupies a very deep landscape basin (figures 5(D) and (E)). When we further plot the landscape (figures 6(A) and (B)) and the steady state distribution (figures 6(C) and (D)) of stem cell differentiation in terms of gene A and B, we identify the middle expression state as the pluripotent stem cell state (SC), which indicates that stem cell state is a stable cellular state. In fact, we discovered six attractors emerging on the landscape, including pluripotent stem cell state (SC), differentiated states (DA and DB) and some intermediate states (IA, IB and IO). This again suggests that the epigenetic regulations promote multistability or intermediate states. So, epigenetic regulation provides a source for the heterogeneity observed in the populations of development. Here, the intermediate expression level state is defined as stem cell state because this state is in a symmetrical position, which can go left (express A) or right (express B), and therefore corresponds to the pluripotent properties for stem cell. When the symmetry is broken, the cell has to choose to go to a state where gene A is expressed (DA) or a state where gene B is expressed (DB), which corresponds to a cell fate decision process in lineage commitment. Besides these stable pluripotent and differentiated cell states (SC, DA and DB), we found some other attractor states, which we define as the intermediate states (IA, IB and IO). The intermediate states have been suggested to play critical roles during cell lineage transitions [9,[39][40][41] and cancer progression [11].

Kinetic transition paths for differentiation, reprogramming and transdifferentiation
From the simulation trajectories of the developmental epigenetic model, we can obtain some representative transition paths between different cell types (basins). These paths will provide us the information on how the The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability.
transition happens in terms of the changes in protein (gene) levels for different transition process. Specifically, we acquired some typical transition paths for differentiation (figures 7(A)-(D)), reprogramming (figures 7(E)-(H)) and transdifferentiation (a process in which a differentiated cell fate is converted into another one) (figures 7(I)-(L) and (M)-(P)), and projected them on the landscape background ( figure 7). Here, magenta, white, and red lines correspond to three stages for the transitions in temporal order, i.e. magenta lines are for early stage of transitions and red lines are for late stage of transitions. From the typical transition paths, we found that differentiation will go through the intermediate state (IA or IB). This can happen in different ways. For example, the stem cell state can directly switch to differentiation state ( figure 7(A)). It can also first enter intermediate state IA, and then enter the differentiation state DA ( figure 7(B)). Or, it can go through intermediate state IA and IO, and finally enter the differentiation state DA ( figure 7(C)). It can also go through all three intermediate states IA and IB and IO, and then finish the differentiation ( figure 7(D)). Similarly, from the typical transition paths for reprogramming, we found that the reprogramming process can also take different paths, including direct transition (figure 7(E)), going through intermediate state IA (figure 7(F)), going through IA and IO (figure 7(G)), or going through both intermediate states IA and IB ( figure 7(H)). So, our results show that the differentiation and reprogramming can undergo direct or indirection transitions (go through intermediate states), which not only agree well with previous experimental and theoretical studies [9,42,43], but also provide reasonable explanations for these observations.
In the same way, we can discover the transition paths for transdifferentiation. Here, to see more possible transdifferentiation path, we use two different sets of parameter values. Specifically, in figures 7(I)-(L) we use default parameter values (table 4), and in figures 7(M)-(P) the parameters are changed as k α =20, k β =0.001 while keeping other parameters unchanged. We found that the transdifferentiation process can go through stem cell state ( figure 7(I)), going through stem cell state and intermediate state ( figure 7(J)), or going through both The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. For (M)-(P), the parameters are changed as k α =20, k β =0.001 while keeping other parameters the same.
intermediate states ( figure 7(K)). Interestingly, we found that the transidifferentiation can also go through a mixed intermediate state (figures 7(N)-(P)), where both lineage marker genes are expressed (high A and high B). These results agree well with recent studies showing that cell could pass a mixed (MX), unspecific intermediate (UI) or progenitor-like (PG) state during transdifferentiation [33]. For example, single cell RNA-seq analysis showed that fibroblast undergoes transdifferentiation to neurons, which passes a progenitor-like state [44].
Additionally, experiments also suggest that transdifferentiation can also go through a 'mixed' state where both differentiation state marker gene are expressed. For example, single-cell transcriptome analysis has suggested that the Transdifferentiation of mouse cardiac fibroblasts to induced cardiomyocytes (iCMs) can go through several 'mixed' states where both cardiomyocyte and fibroblast-specific markers are expressed [45]. We also calculated the possibility (percentage) for the occurrence of each of these transition path and the distribution of average transition time of each transition path for differentiation, reprogramming and transdifferentiation, respectively (figure S3, see text S1 for details).
To summarize our transition path results, we provide an illustration figure for the comparisons between experimental observation and our simulations results (figure 8). Figure 8(A) gives an illustration for the experimental observations based on classic Waddington landscape [33]. Specifically, experiments suggest that transdifferentiation can go through a mixed (MX), unspecific intermediate (UI) or progenitor-like (PG) state. Figure 8(B) gives a summary for our major results of the transition paths for differentiation and transdifferentiation based on the epigenetic model of development (figure 7). Our simulation results agree well with the experimental observations in that three possible routes for transdifferentiaiton suggested by experiments all can be identified from our landscape model (see the red paths in figures 8(A) and (B)).
The adiabatic approximation (much faster protein-DNA binding/unbinding rate compared with epigenetic regulations) has been used in modeling developmental process [23]. However, the roles of non-adiabaticity in developmental process have also been emphasized [7,9]. To see how the adiabaticity affects our results, we calculated landscape for different a 1 (unbinding rate for self-activation) and β 1 (unbinding rate for mutual repression) while keeping the equilibrium constant (k α and k β ) for the binding/unbinding reactions the same (figure 9). Figure 9 shows that the unbinding rate β 1 does not affect the landscape results significantly. For example, along the vertical direction from down to top, β 1 increases from 10 −5 to 10 −3 , but the landscape does not change a lot. When we look at the horizontal direction from left to right (α 1 increases from 10 −5 to 10 −3 ), the landscape does change, and the number of basins (stable states) decreases and intermediate states are inclined to disappear. So, the time scale for the binding/unbinding rate of self-activation, i.e. the degree of adiabaticity, does influence the landscape or the number of cell states. However, as shown in figure 8, we believe that the slower binding/unbinding rate for self-activation (e.g. a = - Therefore, although the adiabatic approximation is important in modeling development [23], we believe that it is possible that cell can use different mechanisms for multiple fate transitions, including using the nonadiabatic regime to generate intermediate states and increase plasticity. It warrants further explorations about whether models using adiabatic approximation can also generate multiple intermediate states, consistent with experimental observations ( figure 8(A)).

Interplay between genetic and epigenetic regulations governs cell fate decisions in development
To study the influence of gene regulations on the stem cell differentiation, we quantified the landscape at different regulatory strengths. Specifically, we changed two critical parameters related to regulatory strengths to see how the landscape is affected ( figure 10). Here, k α represents the equilibrium constant for self-activation and k β represents the equilibrium constant for mutual repression. So, k α and k β can quantify the activation and repression strength, respectively. We found that when the repression strength decreases, the stem cell state becomes less stable (figure 10, vertical direction from up to down), indicated by the landscape change where stem cell attractor becomes more and more shallow and finally disappears. Also, as the repression strength declines, more attractor states occur on the landscape. When the activation strength increases, more intermediate states emerge on the landscape (figure 10, horizontal direction from left to right). This suggests that the decrease of repression strength or increase of activation strength might promote the occurrence of heterogeneity in the populations of differentiated cells. Therefore, our results indicate that the change in gene regulations can lead to the differentiation or reprogramming. In current case, the decrease of repression strength provides a possible way to induce the cellular differentiation. On the contrary, the increase of repression strength might provide a way to induce reprogramming.
Here, we can use two different ways to explain the landscape results. In the first way, we have a fixed landscape (with fixed parameters), and different cell states (e.g. pluripotent stem cell state and differentiated state) can make transitions among each other ( figure 6). In the second way, the landscape changes as parameter values change ( figure 10). Here the point is that cell fate transitions in biological systems are driven by signals Figure 9. Landscape comparisons at different unbinding constant (a 1 for self-activation and β 1 for repression) when protein-DNA binding/unbinding equilibrium constant k α and k β are fixed. The unbinding constant α 1 for self-activation is increasing from left to right, while the unbinding constant β 1 for mutual repression is declining from top to bottom. As α 1 increases, the number of stable states decreases, i.e. as non-adiabaticity increases for the bind/unbinding reactions of self-activation, the number of stable states increases. The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability.
(corresponding to parameters in our model), and different parameter values will generate different landscape shapes, which determine the stability of cell state attractors. Take differentiation as an example, if we look at the vertical direction from top to down for the first column in figure 10, we will see that as repression strength decreases, the landscape shape changes and differentiated states become more stable. That is, the decrease of repression strength might promote differentiation. As a result, when the repression strength is very small, differentiated states are very stable over time because the two differentiated states are much more stable than other states ( figure 10(D)). In summary, it is the landscape changing with signals, rather than fixed landscape, that can explain the distinct stability of cell types driven by signals.
We also calculated the mean first passage time (MFPT) for differentiation, reprogramming, and transdifferentiation ( figure 11). MFPT reflects the average transition time of the system switching from one basin to another basin on the landscape, and therefore can quantify the feasibility of the cells switching from one cell type to another one (for example, from stem cell to differentiated cell in development and from differentiated cell to stem cell in reprogramming). We estimated the MFPT based on the simulation trajectories (see text S1 for details) [9]. As k β increases, the MFPT for differentiation becomes longer while the MFPT for reprogramming becomes shorter ( figure 11(B), corresponding to third column in figure 10, k α is fixed as 0.2). This is consistent with the landscape results showing that larger k β leads to a more stable stem cell state (figure 10). As k α increases, the MFPT for differentiation becomes a little bit shorter but the MFPT for reprogramming does not change too much ( figure 11(A), corresponding to second row in figure 10, k β is fixed as 0.2). So, the change of k α does not change the landscape significantly. Similarly, the MFPT for transdifferentiation becomes shorter as k α increases but longer as k β increases (figures 11(C) and (D)). This shows that larger activation strength k α and smaller repression strength k β (right bottom figure in figure 10) Figure 10. Landscape comparisons at different equilibrium constant (k α for self-activation and k β for repression) when protein-DNA unbinding rate α 1 and β 1 are fixed. The equilibrium constant of self-activation k α is increasing from left to right (self-activation strength increases), while the equilibrium constant of repression k β is declining from top to bottom (repression strength declines). As k β declines, more stable states occur. The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. promotes transdifferentiation, where the two differentiation states are very stable compared to other states. Of note, the relative time scales for differentiation, reprogramming, and transdifferentiation depend on the landscape shape (here depending on the parameter k α and k β ). So, we also calculated the MFPT results for for different parameter regions of k α and k β (see figure S4).
To further study the effects of epigenetic regulation on the cell fate decision process in stem cell development, we changed the epigenetic strength (represented by the parameter pdem), to see the landscape change. We found that in the case without epigenetic regulations ( figure 12, increasing pdem), the number of attractors on the landscape becomes less (compare figures 10 and 12). This agrees well with our conclusion that epigenetic regulations promote the occurrence of intermediate states. Additionally, in the case without epigenetic regulations, we also obtain the consistent results showing that the decrease of repression strength might induce the cellular differentiation.

Discussion
Waddington epigenetic landscape, as a famous metaphor, has been used to explain cellular differentiation. However, it remains challenging to quantify the epigenetic landscape. For example, a critical issue is how to integrate genetic and epigenetic factors to study the mechanisms of development in a holistic perspective. Previous studies have been devoted to quantifying the landscape for development. However, the epigenetic modifications (such as histone modifications) have not been investigated explicitly in previous landscape models [6,8,12]. In this work, by combining genetic and epigenetic regulations, we constructed a developmental epigenetic model and quantified the epigenetic landscape for cellular development. We identified some attractors on the landscape characterizing different cell states, including pluripotent stem cell state, differentiated states, and some intermediate states. Our results show that epigenetic regulations can promote Figure 11. The mean first passage time (MFPT) for differentiation, reprogramming and transdifferentiation as the equilibrium constant of self-activation k α (A and C, k β is fixed as 0.2) and equilibrium constant of repression k β (B and D, k α is fixed as 0.2) change. Also see figure S4 for the MFPT results in different k α and k β regime. multistability or intermediate states. This suggests that epigenetic regulations might play critical roles in stem cell development, which can be used to intervene the reprogramming.
We also quantified the kinetic transition paths for differentiation, reprogramming and transdifferentiation. We discovered that differentiation, reprogramming and transdifferentiation all can go through intermediate states. Intriguingly, the kinetic transition paths for transdifferentiation from our model predictions agree well with experimental observations, showing that transdifferentiaiton can go through a mixed (MX), unspecific intermediate (UI) or progenitor-like (PG) state. Therefore, our modeling results based on a simplified developmental epigenetic model recapitulate experimental observations and provide insights into the underlying mechanisms of cell fate decisions in development.
To study the roles of gene regulations on the differentiation or reprogramming, we changed the gene regulatory strengths to see how the landscape is influenced. We found that the decrease of repression strength makes stem cell state less stable. Therefore, decreasing repression strengths provides a way to promote differentiation, and increasing repression strengths provides a way to induce reprogramming. We also calculated the MFPT, which provides another measure for the feasibility of cell STs. The results of MFPT give the consistent results with the landscape results. From calculating MFPT, we found that the increase of activation strength or decrease of repression strength promotes transdifferentiation. Additionally, by changing the epigenetic strength, we found that the number of attractors decreases as the epigenetic strength declines. This is consistent with our findings that epigenetic modifications promote the occurrence of intermediate states. In our models, the rates of histone modifications are much lower than the degradation or synthesis rate of protein (tables 3 and 4). This corresponds to the non-adiabatic regulations, where the effective binding/unbinding ) when protein-DNA unbinding rate α 1 and β 1 are fixed. The equilibrium constant of self-activation k α is increasing from left to right (self-activation strength increases), while the equilibrium constant of repression k β is declining from top to bottom (repression strength declines). With no epigenetic regulations, less stable states occur compared with the landscape with epigenetic regulations as shown in figure 10. The potential landscape is defined as: = -U P log ss , and P ss is steady state probability distribution. On the landscape the blue region represents lower potential or higher probability, and the red region represents higher potential or lower probability. process can be comparable or even slower than the synthesis and degradation of proteins. Therefore, our results in this work are reminiscent of previous studies showing that non-adiabatic effects can facilitate the occurrence of more cell types [9]. Here, the non-adiabaticity comes from the slow epigenetic dynamics (histone modifications).
In this work, we considered the H3K27 (repressive) as an example for histone modifications in gene regulations. It would be straightforward to extend our models to integrate other forms of histone modification, e.g., the lysine 4 (active) on histone H3 (H3K4). Future work can incorporate other forms of histone modifications and/or DNA methylation to study more intricate mechanisms for epigenetic regulations on gene expressions in development. In our simplified developmental model, we use a two-gene network to simulate stem cell differentiation and the SC state and PG state are treated as one single state. However, in a more realistic developmental model, which usually involves more genes, the SC and PG state could be different, i.e. there could be many different SC/PG like states appearing in development and differentiation. It would be interesting to integrate the epigenetic control to more realistic developmental gene network models [8], and examine these epigenetic cell states. That will reveal more intricate epigenetic control mechanisms in development. Furthermore, current model is in the level of single cell, and it is desirable to develop multiscale approaches formulating different models at different levels (genetic, epigenetic, cellular etc). Also, it is of great importance to integrate the genetic and epigenetic data [46] into the stem cell development model. In our opinion this is a very challenging but crucial task in the fields of systems biology.
The epigenetic mechanisms for cell fate decisions also have indications for cancer. Although cancer is traditionally considered as a genetic disease, epigenetic plasticity and alterations are suggested to contribute to hallmarks of cancer [47]. Our modeling framework can be applied to construct the epigenetic model for cancer and study the epigenetic control mechanisms in tumorigenesis. In summary, our developmental epigenetic model provides new insights into the interplay between genetic and epigenetic factors on controlling lineage commitment in development. Our approach provides a general way to study the epigenetic cell fate decisions in biological systems, by integrating transcriptional, translational and epigenetic players.