A theoretical framework for the regulation of Shh morphogen-controlled gene expression

How morphogen gradients govern the pattern of gene expression in developing tissues is not well understood. Here, we describe a statistical thermodynamic model of gene regulation that combines the activity of a morphogen with the transcriptional network it controls. Using Sonic hedgehog (Shh) patterning of the ventral neural tube as an example, we show that the framework can be used together with the principled parameter selection technique of approximate Bayesian computation to obtain a dynamical model that accurately predicts tissue patterning. The analysis indicates that, for each target gene regulated by Gli, which is the transcriptional effector of Shh signalling, there is a neutral point in the gradient, either side of which altering the Gli binding affinity has opposite effects on gene expression. This explains recent counterintuitive experimental observations. The approach is broadly applicable and provides a unifying framework to explain the temporospatial pattern of morphogen-regulated gene expression.

beyond the neutral point increasing binding site number decreases expression.
The two models are therefore qualitatively similar. Indeed, changing binding affinity in the three--site model also has a similar effect to the single binding site model ( Figure S1C).
Increasing the number of sites also results in an increasingly non--linear response to the graded input. It should be noted that the model described here represents an idealized case in which each bound GBS has the same cumulative effect on the total activation or inhibition -regardless of how many sites are already bound. It is potentially more likely that there would be some kind of diminishing effect so that each additional GBS that is bound has less effect on polymerase than those already bound. In this case adding more binding sites would make the gene response relatively less non--linear [data not shown].
Section 2 -A model in which transcription factor binding rates are unaffected by bound polymerase has a neutral point that is dependent on basal transcription levels.
The thermodynamic model, defined by equation 1 in the main text, explicitly describes a situation in which activators (or repressors) act by reducing (or increasing) the total energy of the state in which they are bound at the DNA together with polymerase - such that the total energy is less (or more) than the sum of the energies of the individually bound species. Thus, the presence of a bound activator increases the probability that free polymerase will bind. The model also implies that bound polymerase will act to increase the probability that free activator will bind (or reduce the probability of free repressor binding).
This makes intuitive sense if one regards the DNA, transcription factor and promoter as single bound complex; however, it is unclear whether or not this is the most appropriate description of the underlying molecular processes in a cis-regulatory system. An alternative model would be to assume that an activator (or repressor) can increase (or reduce) the rate of polymerase binding --or alternatively reduce (or increase) unbinding --whilst the binding rate of the activator (or repressor) is unaffected by the presence of polymerase. In this case it is not straightforward to model the system using the thermodynamic framework because it is not possible to define a unique energy level associated with each bound state.
Instead, the probability of gene expression , can be expressed such that: Since in this model the binding of GliA and GliR are independent of P, a simple consideration of the binding reactions gives: We assume that A and R will change the binding constant of P by the factors cAP and cAR respectively. Therefore we obtain:

This gives the expression for
We can apply a similar logic as was applied previously to determine how changes to the Gli binding affinity, ( ≡ ! = ! ) will affect . We derive: In this case the neutral point, where At positions where ′ > 0 increasing K will increase . Conversely at position where ′ < 0 increasing K will decrease . In this model the neutral point Although this model exhibits these subtle differences to the thermodynamic model described in the main text, the fundamental result still holds; close to the source, increasing Gli binding affinity to the enhancer will increase gene expression probability and further from the source it will decrease. Determining precisely which molecular model of gene expression is more appropriate will require further experimental investigation.

Section 3 -Gene expression probability can be expressed as a linear function of signal strength
It is possible to rearrange the thermodynamic model for gene expression probability in the presence of S (i.e Sox2) binding (Equation 7) so that it becomes a linear function of the graded signal of GliA and GliR.
This takes the form: where, We can regard as the ratio of the probability of polymerase being bound (and hence the probability of gene expression) and the probability of it being unbound. ! is a measure of how much Gli will be bound in the absence of polymerase. If KA=KR and total Gli =constant=[A]+[R] then ! will be a spatially uniform constant. In this case is representative of the gradient and defines the proportion of activating Gli to the total Gli. The remaining terms, !"#"$ , ! , ! contain only parameters that are spatially uniform and independent of Gli. Thus, will be linear in .
In the case described in Supplemental section 2, where Gli binding is independent of Polymerase binding, we can also incorporate the binding of S to obtain the expression: and are the same as in previous case. In this model, where Gli binding is independent of polymerase occupancy (in contrast to the thermodynamic model above) will be linear in (when ! is a constant). This therefore highlights a way in which these models could be distinguished experimentally.

Differential binding affinity
The first simple modification to the single binding site model described in Equation 1 is that the activator and repressor forms of Gli (or Ci in this case) could bind with different affinities to different enhancer regions. In particular for a shift towards the source in the position at which gene expression exceeds basal levels, this requires: where KR_low and KA_low are the binding affinities of the 'low' affinity enhancer to R and A respectively, and KR_high and KA_high the 'high' affinity enhancer. In this case it is intuitive to see that increasing binding affinity proportionally more for the repressor could lead to more overall repression relative to the basal level of expression. This is shown in Figure S3A where Equation 1 has been simulated (solid blue line) but in this case the activator affinity has been increased 2 fold and the repressor affinity 3 fold (dashed blue line).
In regions very close to the source the gene expression has still increased (as in Figure 2C) and similarly very far from the source gene expression has decreased with the increase in GBS affinities. However in contrast to Figure 2C the point at which gene expression is at basal levels, has also shifted towards the source.

Cooperative repressor binding
A second alternative model, favored by Parker et. al. (Driever et al., 1989;Parker et al., 2011;White et al., 2012) to explain the Dpp response, was one in which the binding affinities for A and R are the same but there is cooperative binding between the repressive form of Gli at multiple binding sites. We illustrate this model schematically in Figure S3B for a simplified case in which there are two binding sites (note that three Ci binding sites were identified in the Dpp enhancer and included in the original analysis of the wing disc data).
The function for in this case has the form: cRR, cAA, cAR, represent the cooperativity factors for binding of RR, AA, RA at adjacent GBSs.
The basal level of expression is still the same as in Equation 2. Crucially in this model it can be shown analytically that if cRR≠cAA then the position at which the function is equal to the basal level of gene expression will be dependent on the GBS affinity K (in contrast to single binding site model).

Development | Supplementary Material
A numerical example is illustrated in Figure  S3C where cRR>>cAA. In this model there is an increase in gene expression close to the source and a decrease far from the source. However, now the position of gene expression above the basal (thin grey line) is shifted towards the gradient source when GBS affinity is increased. The cooperative binding of the inhibitor means that increasing GBS affinity increases the inhibition proportionally more than the activation.
The cooperative model was favored over the differential affinity model by

A - The same gradient defined in Figure 2B for activator [A] and repressor [R].
These are defined such that A=e --x/1.5 , R=1--A.   These are plotted on a log--scale x--axis ranging from --3 to 2 (the range of the initial prior search space). The height of the bars represents the relative density of the posterior parameter space occupied by each parameter. The joint distributions map the relative density of a two--dimensional slice of parameter space orthogonal to the two parameters in the row and column as indicated -both axes in this case range from --3 to 2 (the x--axis refers to the parameter in that column, the y--axis to the parameter in that row), density ranges from blue (low) to red (high). Note that in this case, no two parameters are tightly correlated and the joint distributions simply reflect the overlapping marginal distributions.