Implementing Arithmetic and Other Analytic Operations By Transcriptional Regulation

The transcriptional regulatory machinery of a gene can be viewed as a computational device, with transcription factor concentrations as inputs and expression level as the output. This view begs the question: what kinds of computations are possible? We show that different parameterizations of a simple chemical kinetic model of transcriptional regulation are able to approximate all four standard arithmetic operations: addition, subtraction, multiplication, and division, as well as various equality and inequality operations. This contrasts with other studies that emphasize logical or digital notions of computation in biological networks. We analyze the accuracy and precision of these approximations, showing that they depend on different sets of parameters, and are thus independently tunable. We demonstrate that networks of these “arithmetic” genes can be combined to accomplish yet more complicated computations by designing and simulating a network that detects statistically significant elevations in a time-varying signal. We also consider the much more general problem of approximating analytic functions, showing that this can be achieved by allowing multiple transcription factor binding sites on the promoter. These observations are important for the interpretation of naturally occurring networks and imply new possibilities for the design of synthetic networks.


Introduction
In grappling with the biochemical complexity of gene regulation, some have turned to computational metaphors for explaining gene behavior (e.g., [1][2][3][4][5]). The lac operon, for example, is often described as implementing a simple logical rule-the gene is on when lactose is present and glucose is absent [6]. Theoretically, a variety of nonlinear chemical systems, including gene regulatory systems, are capable of implementing arbitrary logical rules [7][8][9][10][11]. Networks of such systems can implement finite state machines [12], and families of such networks of increasing size can be said to implement arbitrary Turing-computable functions [13,14]. In practice, logical models have proven capable of accounting for the qualitative dynamics of a variety of genetic systems [15][16][17][18][19][20][21][22][23]. Models that combine logical rules with concentration thresholds for the action of regulatory molecules, as in the French-flag model of Wolpert [24] or Glass networks more generally [25], satisfactorily describe other systems either qualitatively [26][27][28] or quantitatively [29]. Synthetic biologists have constructed gene networks that perform elementary logical operations, such as storing a bit of memory [30] or turning off and on with a fixed period [31].
However, detailed analysis of transcriptional regulatory networks reveals a behavior richer than logical responses. Yuh et al.'s model of the sea urchin developmental gene Endo-16 contains logical as well as additive and multiplicative operations [32]. Recent measurements of lac transcription rate as a function of the concentrations of cAMP and an analogue of allolactose show four plateaus of different rates connected by smooth boundaries [4,5]. Even ''logical'' models of gene regulation often require more than two qualitatively distinct levels of gene expression (e.g., [19]), recognizing that some genes cannot be treated simply as on or off.
Besides logical or digital computation, several other notions of computation have been explored. Analog computations made by artificial neural networks can in principle be implemented by chemical systems [8,12,33]. Deckhard and Sauro experimented with evolving reaction networks to compute square and cubic roots of an input [34] and have since evolved networks to compute a variety of other arithmetic functions. This work shares the greatest commonality withours, the differences being that we specifically study single-gene transcriptional regulatory networks and that our designs are arrived at prescriptively, by analysis of the steady state equations, rather than by a computational search.
A different line of reasoning has focussed on the robustness of biochemical networks to variability in inputs and parametersintuitively important features for real systems [35][36][37]. Reduction in noise by development gene networks was experimentally observed [38,39] and confirmed as a property of mathematical models [20,40]. In this background, control theoretic concepts, especially noise filtering, were studied more carefully [41] and found in a wide variety of systems (see [37] for a summary). More recently, the behavior of several genes has been viewed in the context of information theory [42] and Bayesian decision theory [43].
The present work focuses on the general steady-state analog computational capacities of genes. Most of the paper considers a simple chemical kinetic model of transcriptional regulation for a single gene with two transcription factors. We assume that concentrations of proteins represent non-negative real numbers, with the transcription factor concentrations acting as inputs to the gene and the steady state expression of the gene acting as the output. Different choices for the kinetic rates allow the gene to approximate different binary arithmetic operations: addition, subtraction, multiplication, division, and testing for equality and various inequalities. Variations on the model allow alternative implementations of the same functions as well as other functions, such as the square root. In the model, these operations can be reproduced with arbitrary accuracy, although in reality, biological limits on the parameters would limit accuracy. We analyze the accuracy of these approximations in terms of the deviation between the steady state output and the desired output. We also analyze the precision of the approximations, in terms of the variability of the output over time in a stochastic interpretation of the model. Through theoretical analysis and simulations we show that such ''arithmetic genes'' can be combined in networks to compute more sophisticated functions. As an example, we describe an eight gene network that tracks the mean and standard deviation of a time-varying signal and detects times at which the signal is statistically significantly elevated. We also consider a model of a gene regulated by a single type of transcription factor but having multiple binding sites on the promoter. With such a gene, arbitrary analytic functions can be approximated up to a fixed order based on power series expansions. We demonstrate by designing a gene that approximates the cosine function.

Results
A chemical-kinetic model of transcriptional regulation Figure 1 presents a diagram of the model of transcriptional regulation and gene expression that we analyze. It models a single gene regulated by two transcription factors, A and B. These factors may bind irreversibly to form an inertdimer, or they may bind to the DNA, where they affect the rate of transcription. Transcripts are translated into proteins at a fixed rate, and both transcripts and proteins decay at fixed rates. The reaction equations below formalize the model. (Please note that Tables 1, 2, and 3 show the symbols employed in this paper and their meanings.) AzB ? rd C ð1Þ T ? rtz TzZ ð10Þ For the bidirectional reactions, equilibrium association constants are ratios of forward to backward rates (e.g., Most of our analysis concerns the steady state behavior of this system. We use t o , t a , t b , and t ab to denote the fractions of time that the promoter spends unbound or bound by different combinations of transcription factors. We assume that the reactions for transcription factors binding to the promoter are at equilibrium.
Author Summary The biochemistry of the cell is daunting in its complexity. In order to understand this complexity, we are often forced to use metaphors or construct analogies to systems that we understand better. One long-standing analogy is to digital computers, with their large networks of interacting components that manage to act in coherent and useful ways. Indeed, we know from both theoretical models and empirical observations that biological entities such as genes can sometimes be described accurately by digital, or logical, expressions-turning on or off in response to regulatory signals. However, far more sophisticated computations can also take place, as has been documented in the responses of genes such as the lac operon or From this system of equations, one can deduce that K oa K ab = K ob K ba . However, the system is degenerate and does not lead to a solution for the occupancy times until we recognize that the occupancy times must sum to one.
With the addition of this equation, we obtain four linearly independent equations, which can be manipulated to solve for the occupancy times.
Þð 21Þ This allows us to calculate the net rate of transcription.
The steady state concentration of protein Z then follows.
Because K oa K ab = K ob K ba , the term K oa K ab can be replaced by   K ob K ba in any of the equations above. These equations are also true for more restricted binding scenarios, including: A and B cannot be bound to the DNA simultaneously (signified by K ab = K ba = 0); A must bind before B binds, and B must unbind before A can unbind (K ob = K ba = 0); and B cannot bind the DNA at all (K ob = K ba = K ab = 0). In these alternative scenarios, however, it no longer holds that K oa K ab = K ob K ba .

Approximating arithmetic operations
By employing subsets of the allowed reactions and setting kinetic parameters appropriately, different arithmetic operations can be approximated, with [A tot ] and [B tot ] treated as the inputs. (See Figure 2 for a summary.) For example, consider Equation 24 under the conditions: These conditions can be interpreted as: (i) A and B do not dimerize, (ii) A and B cannot both be bound to the DNA at the same time, perhaps because their binding sites overlap, (iii) there is not transcription if neither A nor B are bound to the DNA, (iv) there is a certain balance between the production and decay rates of mRNA and protein, and (v),(vi) either A and B bind to the DNA comparatively weakly or else we are considering only comparatively small concentrations of A and B. Then Equation 24 tells us that [Z]<[A tot ]+[B tot ], so that the gene approximates the addition operation. (See Figure 2B for the exact steady state expression.) The error in the approximation is analyzed in more detail in the next section.
Alternatively, suppose that A and B bind the promoter sequentially, perhaps because A is a cofactor without which B cannot bind, and that transcription is activated only when both are bound. If production and decay are balanced as r tz r ab K oa  zero-truncated subtraction. These arithmetic operations can be implemented in other ways, even restricting attention to the model in Figure 1. For example, the addition gene could allow A and B to bind simultaneously, as long as there is no transcription when both are bound. Similarly, the multiplication gene could allow A and B to bind to the promoter in either order, as long as there is only transcription when both are bound. However, these alternate versions are less accurate than the designs in Figure reffig:diagrams because they introduce unnecessary promoter binding states. Mathematically, the decrease in accuracy corresponds to extra terms appearing in the denominator of the second term in Equation 24. Variations on these models can produce other useful functions. Trivial modifications allow the other comparison operations: greaterthan-or-equal, less-than, and less-than-or-equal. More interestingly, consider a model with a single transcription factor A, which can be obtained by dropping all equations from the model that involve B. Suppose that A activates transcription. Further, suppose that Z dimerizes at forward rate r z and that the dimer itself is inert and degrades at some rate d z , but that Z on its own does not decay. Then Equation 12 is replaced by If r tz K oa r a = d t r z and K oa [A tot ]%1, then at steady state Genes that approximate other fractional powers can be constructed by assuming more complicated promoterbinding or degradation schemes.

Accuracy depends on promoter occupancy
The accuracy with which the models in Figure 2 approximate the desired operations depends on the kinetic parameters as well as the transcription factor concentrations [A tot ] and [B tot ]. We analyze accuracy in terms of relative error. For the arithmetic operations of addition, multiplication, division, and zero-truncated subtraction, we define relative error as where f is the desired operation. This is not well defined when f is zero. However, for the models in Figure 2, [Z] = 0 whenever f is zero, so we can take the relative error to be zero. For the comparison operations we define relative error as . The relative error can be calculated by substituting the formula for [Z] in Figure 2B into the definition for relative error. Figure 3 shows the relative errors of the six gene models from Figure 2. An advantage to studying this notion of accuracy is that the relative error can be easily expressed in terms of the fraction of time the promoter spends in different binding states (Figure 3, third column). For the arithmetic operations, relative error is generally increasing in [A tot ] and [B tot ] as well as the equilibrium association constants for the binding of A and B to the DNA (Figure 3, second column). Thus, the approximations are most accurate when the concentrations of the transcription factors are small or when they bind weakly to the DNA-as already emphasized in the third column of Figure 2B. Intuitively, this keeps the transcriptional response in the linear regime. Saturation of the transcription rate, due to high concentration of transcription factor(s) or strong transcription factor binding, reduces accuracy. Indeed, the relative error of the addition, multiplication and zero-truncated subtraction genes is simply the fraction of time that the DNA is bound by either transcription factor. The division gene is a partial exception to these rules. First, its relative error is decreasing in K ob and [B tot ], not increasing. Second, the relative error is equal to the fraction of time the DNA is not bound by B.
The effects of these parameters cannot be considered in isolation, however, because the parameters as a group must satisfy the production-decay balance shown in Figure 2B. For the addition gene, for example, small transcription factor association constants, which result in an accurate approximation, must be balanced by large rates of transcription and/or translation or small rates of transcript and/or protein decay.
For the comparison operations, accuracy is highest when A and B bind strongly to the promoter. If K oa and K ob are small and if [A tot ] and [B tot ] are just slightly different, then there is only a small amount of undimerized transcription factor binding weakly to the promoter, and this does not provide sufficient transcriptional activation (in the case of .) or repression (in the case of = ). For either gene, the relative error is either zero or t o , depending on whether the comparison is true or false.

Noise in the output is independent of accuracy
In a stochastic kinetics interpretation of the model in Figure 1 (Equations 1 to 12), the output protein concentration varies over time even if [A tot ] and [B tot ] are fixed. Let X(t) denote the concentration of molecular species X at time t. Noise in Z(t), defined as the standard deviation of Z(t) over time divided by the mean of Z(t) over time, can be attributed to three sources: (i) inherent fluctuations due to the stochastic birth-death process for Z, (ii) variability in T(t) due to its own inherently stochastic birthdeath process, which in turn creates variability in the production rate of Z, and (iii) variability in the promoter state, which affects the production rate of T, and by extension, of Z. If one assumes that the transcription factor binding reactions are at steady state, then the third source of noise is absent, and the noise in Z(t) is equivalent to that in the simpler system T ? rtz TzZ ð29Þ where r t is the net rate of transcript production.
For this system, the moments of Z can be calculated exactly from the chemical master equation, and the noise is [44] g~ffi Importantly, the noise in Z(t) bears no necessary relationship with the accuracy with which [Z] approximates a desired function, because accuracy and noise depend on different sets of parameters. For the addition gene, for example, accuracy is determined by K oa and K ob . These could be large or small regardless of the noise level, which is determined by r tz , d t and d z . This is true even considering the production-decay balance constraint for the gene, r tz r a K oa = r tz r b K ob = d t d z . The parameters r a and r b do not occur in the formulas for either accuracy or noise and can be used to ensure that the constraints are satisfied. This is essentially the same as the observation by Thattai and van Oudenaarden that the mean and variance of gene expression levels are controlled by independent sets of parameters [44].
If the transcription factor binding reactions are not at steady state, then the third source of noise in Z(t) returns. Intuitively, however, the faster the binding and unbinding reactions are compared to the rate of transcription, the more this noise is filtered out by the slower transcription process. For example, consider the addition gene with inputs [A tot ] = 30 molecules and [B tot ] = 70 molecules. We used the Gillespie algorithm [45] to simulate the stochastic chemical kinetics defined by Equations 1 to 12, but replacing the transcription factor-DNA binding and unbinding rates (f oa , f ob , b oa , b ob ) by scaled versions (rf oa , rf ob , rb oa , rb ob ). By varying r, we could change the rates of binding and unbinding, while leaving K oa and K ob constant. (See Materials and Methods for the full set of kinetic parameters.) Figure 4A shows three sample traces of Z(t) for three different choices of r, in which the noise in Z(t) can be seen to decrease for increasing r. Figure 4B shows the noise in the simulated Z(t) for a wider range of r. For sufficiently large r, the steady state approximation for the promoter is good, and the noise is seen to converge to the value predicted by Equation 33, indicated by the dashed line. Figure 4C shows that the empirical mean concentration [Z] is at nearly the correctly value of 100, and is independent of r. Independence of accuracy from promoter state fluctuations has been shown analytically via Master equation analysis for some binding scenarios [46], but should hold in general for our models, as noise depends on both the binding and unbinding rates of transcription factors, whereas accuracy depends only on the ratios of the rates.

Example: A network for detecting statistically significant elevation in a time varying signal
Arithmetic genes can be combined into networks to perform more sophisticated computations. As an example, consider a cell in which the concentration of a molecule I varies with time. Suppose it is important for the cell to detect times at which I(t) is significantly elevated. For example, the cell might be a bacterium and I(t) might be correlated to the concentration of an extracellular sugar. If the bacterium encounters a high-sugar environment, it may want to begin expressing the genes needed to transport and metabolize the sugar. Or, I may be a toxin, and high toxin levels might trigger a defensive or developmental decision, such as sporulation, to protect the bacterium. In a synthetic biology context, I might be a signal sent by the experimenter to trigger a response or a marker for a diseased cell that should be destroyed. A single threshold-activated gene, for example the ''greater than'' gene of Figure 2, provides a simple solution to this problem. However, such a gene requires a predefined notion of how large a signal is considered elevated. Further, a chance fluctuation in the signal or in the transcriptional machinery itself might trigger a false response. The feedforward motif [47] is one way to reduce incorrect responses due to fluctuations. This motif describes a set of three genes in which gene 1 regulates gene 2, and both genes 1 and 2 regulate gene 3. The feedforward motif appears far more often than chance in natural genomes [48,49], and presents a variety of temporal information processing possibilities [50,51]. Among them is the ability to detect sustained, rather than merely transient or accidental, increases in an incoming signal. However, the level of signal that is considered elevated is still preordained.
We consider a more challenging version of the problem in which the statistics of the signal are not known ahead of time. This may be because the operating environment of the cell is not known a priori or because the signal itself is difficult to measure experimentally. Whether or not the signal is elevated at a given moment thus depends on the the signal's mean value and typical fluctuationsproperties which may themselves change over time. More formally, if the statistics of I(t) are relatively constant for some period, then a natural definition of elevated is I(t).m I +cs I , where m I is the mean of I(t) over time, s I is the standard deviation, and c is a constant specifying how large an elevation is of interest. To solve the problem, the cell must identify the signal statistics m I and s I , and then compute whether the signal is elevated at any particular time. Figure 5A depicts a network of arithmetic genes that accomplishes this task. The circles represent genes and an arrow from one gene to another means that the first gene's protein acts as an input to the second gene. The network comprises arithmetic genes for multiplication, addition, subtraction, taking the square root, and comparison, as well as two genes labeled by m. The arithmetic and comparison genes are assumed to reach steady state at a faster time scale than variations in the input signal I(t). The two m genes are simply activated proportional to their input and are assumed to reach steady state at a slower time scale than variations in the input signal. Thus, the m genes effectively compute a recency-weighted time-average of their inputs, the exact nature of which depends on the details of the kinetic parameters. Gene G1 multiplies I(t) by itself, and because G1 operates at a faster time scale than I, we can approximate the expression of its protein as Z 1 (t) = I(t) 2 . Gene G2 averages Z 1 (t) over time, so we can approximate its expression as Z 2 (t) = E t (Z 1 (t)) = E t (I(t) 2 ), where E t denotes averaging over time. By similar reasoning, the expression of genes G3 and G4 is Z 3 (t) = E t (I(t)) and Z 4 (t) = (Z 3 (t)) 2 = (E t (I(t))) 2 . Genes G5 and G6 compute ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi (See Equations 26 and 27 and surrounding discussion for the square root function.) G7 is a modified addition gene which computes Z 7 (t) = m I +cs I , for constant c, and G8 compares this result to I(t), turning on when the latter is greater than the former.
To test this network, we simulated a differential equation model of transcript and protein dynamics subject to a time-varying input (see Figure 5B and Materials and Methods). The simulation covered 20 days during which the signal's mean or standard deviation changed three times. There were also six brief spikes in the signal. We chose c = 3, so that G8(t) should turn on, to an expression level of 100 nM, whenever I(t) exceeded its mean plus three standard deviations. Details of the simulation, including the exact equations and kinetic parameters, can be found in the Materials and Methods section. Figure 5C-E shows the results of our simulation. The simulation covers 20 days of simulated time, though we do not show the first 3 days during which a transient due to the initial conditions disappears. Figure 5B shows the input signal, and Figure 5C shows the overall response of the network to that signal. The network responds at the times that it should and at no other times. Its response to the input spikes and the increase in oscillation amplitude on day 14 are brief and do not reach the fully-on level of 100 nM, mainly because the elevations themselves are brief. When the signal jumps higher on day 8 and remains there, the network responds for longer and reaches a fully-on level. However, this response too vanishes as the network adjusts to the change in signal statistics. (The problem statement requires adjustment to changing signal statistics, so that elevation is relative to the recent mean and standard deviation of the signal. If a more sustained response is desired, G8 could be augmented with a positive feedback loop to keep it activated once triggered.) Figure 5D,E show the mean and standard deviation of the signal's oscillations and the network's recency-weighted estimates of signal mean and standard deviation. Interestingly, changes to the signal mean on days 8 and 11 are initially interpreted by the network as changes in the standard deviation of the signal. Only over the course of several days is the mistake rectified, with the mean estimate shifting to the correct value and the standard deviation estimate returning to its previous, correct level.

Approximating analytic functions
The elementary arithmetic operations provide a basis for analog computation, in that these operations can be combined to compute more complex functions. However, from the standpoint of the cell, it seems desirable to perform computations with as little biomolecular machinery as possible. From the form of Equation 24, it is not clear how other elementary operations, such as exponentiation, logarithms, or trigonometric functions, might be implemented. Whether these particular functions are of useto cells can be debated, but certainly cells need to compute more complicated functions than elementary arithmetic. One way to approximate more complicated operations is by power series expansions. The focus of this section is on analytic functionsfunctions with Taylor series expansions that converge pointwise to the correct values, at least for some range of inputs. For simplicity, we consider single-input functions. Generalization to multipleinput functions is straightforward. Figure 6A shows a schematic of a model describing a single gene regulated by one type of transcription factor, denoted by A. We assume N binding sites for A on the gene's promoter, each acting independently with equilibrium association binding constant K. We assume that transcription rate depends only on the number of binding sites occupied by A. This model is formalized by the reaction equations below.
T ? rtz TzZ ð36Þ Here, P i denotes the promoter bound by i copies of factor A. T denotes transcript, and Z denotes protein. The steady state expression of protein Z is This can also be written as Z ½ ~r tz dtdz | P N i~0 r i t i , where Suppose first that all the c i are nonnegative. Then the gene can  (39) and (40).
If some c i are negative, then the series can be divided into positive and negative components.
where c i + = max(c i ,0) and c i 2 = 2min(c i ,0). Two genes of the type in Figure 6A can then combine to compute the function, one computing the positive part of the series and one computing the negative part. The difference in the protein concentrations of the two genes approximates f.
For example, suppose the cell wanted to approximate the cosine function using a pair of genes of the type in Figure 6A with five binding sites. The Taylor series expansion for cosine, centered at zero, has coefficients c 0 = 1, c 1 = 0, c 2 = 21, c 3 = 0, c 4 = 1, … The blue and green curves in Figure 6B show, respectively, the cosine function and its 5th order Taylor series approximation. Using biologically plausible parameters for the transcription factor binding affinity, translation rate, and rates of transcript and protein decay (see Materials and Methods), Equation 41 can be solved for the necessary transcription rates, r i . The red curve in Figure 6B shows

Discussion
We have shown that different parameterizations of a simple model of transcriptional regulation can reproduce binary arithmetic operations (addition, zero-truncated subtraction, multiplication, division), various inequality comparisons, and fractional powers. Unary, ternary, quaternary, etc. operations can be similarly implemented at a biochemical level. These models are not the only way, and may not be the best way, that these operations can be implemented. For example, we have not considered models with cooperativity between binding sites, allosteric features for transcription factors, regulation of translation or degradation, or chromatin-related mechanisms. One way to explore the utility of these mechanisms would be to perform an explicit computational search through some appropriate space of allowed reaction systems (as in Deckard and Sauro [34], for example), searching for systems with superior performance in terms of various metrics (such as accuracy, noise, robustness, evolvability, or energy consumption). Nevertheless, our simple models demonstrate the possibility of arithmetic computations at the level of transcriptional regulation.
Other functions, such as exponentiation, logarithms, and trigonometric functions, could be computed by combinations of the arithmetic operations. However, this would be energetically wasteful and would require many genes. A more efficient approach is toapproximate directly by using more complicated regulatory machinery. We showed that a pair of genes, each with N binding sites for a transcription factor, can approximate any analytic function up to order N by reproducing its power series approximation. We also showed that by adjusting the parameters of such a gene pair, the analytic function can be approximated well over a larger range of inputs. While this drastically reduces the number of genes used to compute such functions, other mechanisms may be yet more efficient, and this is an avenue for future research.
Our emphasis on arithmetic computations stands in contrast to work on logical or Boolean notions of computation in genetic networks. Interestingly, Hjelmfelt et al. [12] and Magnasco [13] both considered the problem of addition, but in the binary sense, with different chemical species effecting each bit of the operation. In general, there is no relationship between an arithmetic interpretation of a gene's behavior-which is accurate for a limited range of transcriptionfactor concentrations-and a logical interpretation-which focuses on the extremes of low and high transcription factor concentrations. For example, in our model for an Addition gene, transcription occurs whenever factor A or factor B is bound, keeping in mind that factors A and B cannot bind simultaneously (Figure 2A). Such a gene could be interpreted as implementing the logical OR function, because it would be expressed if either factor A or factor B is in high concentration. Recall, however, that an alternate design for an Addition gene allows A and B to bind simultaneously, as long as there was no transcription when both were bound. This gene would be interpreted as implementing XOR, as it would be expressed if A or B, but not both, were in high concentrations. Conversely, our designs for the zero-truncated subtraction function and the greater than function have identical logical structure (Figure 2A), but behave quite differently as arithmetic operations. Thus, a gene's arithmetic behavior, if any, is not determined by its logical behavior and viceversa. This is an important caveat for the interpretation of empirical measurements of gene response surfaces, which may be probed at non-physiological concentrations of regulatory molecules.
For our gene models to accurately approximate the desired operations, certain products of kinetic parameters must be equal. For example, the multiplication gene requires that r tz r ab K oa K ab = d t d z . Such a constraint may be biologically implausible or difficult to maintain over different cellular conditions. If the constraint did not hold, the expression of the multiplication gene would still be proportional to the product of the transcription factor expression levels-the role of the constraint is only to scale the output expression correctly. Thus, violation of the constraint in a particular situation need not invalidate the interpretation of the gene as performing multiplication. The case is similar for all the genes in Figure 2 except the addition gene. This gene requires r tz r a K oa = r tz r b K ob = d t d z . If those equalities failed, but r a K oa = r b K ob , then the expression of the gene would still be proportional to the sum of the transcription factor expression levels. If the latter equality failed as well, then the gene could be interpreted as performing a weighted sum. Indeed, we took advantage of this possibility in our design for the network that detects significant elevations in a time varying signal (see gene G7 in Figure 5A).
In our models, precision (noise) in the output depends on the transcription factor binding and unbinding rates, while the accuracy of the output depends only on the ratio of the rates. This implies that there is no trade-off between precision and accuracy. Each can be selected independent of the other, by evolution or by synthetic biologists, subject to biological limitations on the parameters. We have not analyzed the speed with which these networks reach equilibrium, which could also be an important factor in some settings. Indeed, in many processes, such as development, expressing genes at the correct time is just as important as expressing them at the right level. Further, some genetic networks, such as circadian networks [52] or the feed-forward motif [50,51], exist solely to keep time or process temporal signals. Our design and analysis of a network that tracks the first and second moments of a time varying signal and responds when the signal is statistically significantly elevated demonstrates that genes performing arithmetic computations can be useful in temporal information processing. An important avenue for further research is to consider more dynamical notions of computation, including investigation of how biochemical networks might implement computations from sequential signal processing or control theory.
We expect that notions of computation more sophisticated than simple logical operations, already being documented experimentally [4,5,32,53], will play an increasing role in our understanding of transcriptional regulation and of biochemical systems more generally. Our analysis of the analog computational abilities of transcriptional regulatory networks is a step in this direction.

Materials and Methods
Our Gillespie simulations for the analysis of noise for the Addition gene used a slightly simplified set of equations. Because factors A and B cannot bind the promoter simultaneously, we lumped the two bound states into one.
where P 1 denotes the promoter bound by either A or B, f is the forward binding rate of A and B to the promoter, and b is the unbinding rate. Transcription then follows the reaction Translation, degradation of transcripts, and degradation of proteins remained the same, following Equations (10) to (12 21 , where r is the dimensionless parameter we used to control the binding and unbinding rates of A and B to the promoter, while leaving the equilibrium association constant unchanged. For different values of r, the system was simulated from the initial condition P 0 = 1, P 1 = T = Z = 0, for one million reactions. Figure 4A plots initial portions of several of those trajectories, but Figure 4B,C are based on the full one million reactions. Our simulations of the network for detecting elevated signals involved 17 variables: the exogenous input I(t), and a transcript concentration T i (t) and protein concentration Z i (t) for each gene i. All variables have units nanomolar (nM). The input signal was primarily a sinusoidal oscillation with a period of four hours. Oscillations had mean 10 and amplitude 2 for the first eight of the 20 days simulated, mean 20 nM and amplitude 2 nM for the next three days, mean 10 nM and amplitude 2 nM for another 3 days, and mean 10 nM and amplitude 6 nM for the final six days. An independent Gaussian disturbance with mean zero and variance one was added to the signal during every 10 minute period. At the starts of days 5, 6, 7, 17, 18 and 19, there were 10 minute spikes in the signal at levels 16,18,20,20,22 and 24 nM respectively. We assumed the transcription factor binding equations were at steady state in order to speed the calculations. Transcription was driven at the rate expected based on the transcription factor concentrations. The exact differential equations simulated were: The kinetic parameters are summarized in Table 4. The ''fast'' genes were given protein half-lives of 5 minutes and the ''slow'' genes were given protein half lives of 10 hours. These values are somewhat extreme, but the short half-life is consistent with measurements for actively degraded proteins [54] and both halflives are within the range of recent measurements in a genomewide study of yeast [55]. All other parameters are in typical ranges.
For our study of the cosine approximation, we used parameters  21 . For the optimized curve, we allowed all the parameters to vary, including the transcription rates of the positive and negative genes, and their rates of translation and transcript and protein decay. We use the matlab function ''fminsearch'' to optimize the parameters, minimizing the mean squared error between steady state [Z] and cos ([A]). Because all parameters should be non-negative, they were reformulated as p i = exp(b i ) where p i is the i th parameter and b i is the unconstrained parameter that we actually optimized. This resulted in the constrained parameters shown in Table 5.