Noise-processing by signaling networks

Signaling networks mediate environmental information to the cell nucleus. To perform this task effectively they must be able to integrate multiple stimuli and distinguish persistent signals from transient environmental fluctuations. However, the ways in which signaling networks process environmental noise are not well understood. Here we outline a mathematical framework that relates a network’s structure to its capacity to process noise, and use this framework to dissect the noise-processing ability of signaling networks. We find that complex networks that are dense in directed paths are poor noise processors, while those that are sparse and strongly directional process noise well. These results suggest that while cross-talk between signaling pathways may increase the ability of signaling networks to integrate multiple stimuli, too much cross-talk may compromise the ability of the network to distinguish signal from noise. To illustrate these general results we consider the structure of the signaling network that maintains pluripotency in mouse embryonic stem cells, and find that an incoherent feedforward loop structure involving Stat3, Tfcp2l1, Esrrb, Klf2 and Klf4 is particularly important for noise-processing. Taken together these results suggest that noise-processing is an important function of signaling networks and that they may be structured in part to optimize this task.

Cellular identities are regulated by a variety of complex, interconnected, molecular regulatory networks, including signaling networks, metabolic networks and core transcriptional regulatory networks [1][2][3][4][5][6][7][8][9] . Signaling networks are of particular importance in maintaining robust cellular identities since they mediate noisy environmental information from the local cellular micro-environment to the cell nucleus 7,[10][11][12][13][14] . In order to perform this task effectively they must be able to transmit complex environmental information robustly, and failure to do this has been linked to cancer initiation and progression, as well as defects in embryonic development [15][16][17] . Much of what is known about signaling networks comes from the detailed reductionist analysis of their constituent signaling pathways. Several of these have been studied in great detail, and the core components and biochemical mechanisms of signal transduction in pathways such as Wnt, TGF-β and MAP Kinase signaling are now well-defined. These signaling pathways function in a wide diversity of different biological processes and systems, and they are known to have a central role in maintaining pluripotency and specifying cell identities, for example 18,19 . A long-standing question of interest is why, despite the myriad of biological processes and systems that involve signaling, there are only a few distinct, but widely conserved and re-used pathways 20 . An emerging feature that may in part explain this observation is that signaling pathway 'modules' are interconnected in many different ways and cross-talk between pathways has been shown for most signaling pathways, with mechanisms ranging from transcriptional activation of pathway components through to direct interactions between proteins in different pathways 20 . In addition to cross-talk between pathways, specific feedback mechanisms allow for the homeostatic control of signaling activity. 
A well-defined example in the Wnt signaling pathway is the transcriptional activation of the Dickkopf proteins, which negatively regulate Wnt signaling by binding to Wnt receptors in response to Wnt activation 21 . However, while much is now known about the function of specific signaling pathways, very little is known about how cross-talk between pathways affects information-processing. In the context of signal processing it has been suggested that promiscuity in protein-protein interactions is a major source of intrinsic noise, and that signaling pathways have evolved features (for example receptor clustering) to better distinguish signal from noise 22 . Although some studies have considered the ways in which noise propagates through regulatory networks [23][24][25][26][27] , the general mechanisms by which signaling networks distinguish persistent environmental signals from the noise that is inherent to the cellular micro-environment are still not well understood. To address this problem, here we outline a general mathematical framework which relates a network's structure to its capacity to process noise, and use this framework to dissect the noise-processing ability of signaling networks. As an illustrative example we examine the noise-processing ability of the network that maintains pluripotency in mouse embryonic stem cells.
Embryonic stem (ES) cells are found naturally in the pre-implantation embryo and are able to give rise to all embryonic lineages, a property known as pluripotency. The molecular basis of pluripotency has been extensively studied, and it is now known that activation of a small number of core transcription factors (including Oct3/4, Sox2, and Nanog, along with other secondary factors such as Myc, Klf4 and Lin28) is sufficient to maintain the pluripotent state 5,6,[28][29][30][31][32][33][34][35] . Indeed, forced expression of combinations of these factors in somatic cells is sufficient to induce pluripotency de novo 28,30,36 . Although this central transcriptional circuit is self-sustaining when shielded from external stimulation 37 , it is known that a network of signaling pathways which process extra-cellular environmental information is also essential both to maintenance of, and exit from, the pluripotent state 38 . Importantly, while the core transcriptional circuitry is broadly similar in mouse and human pluripotent cells 33 , their dependency on external signaling is markedly different: mouse ES cells are dependent on Lif/Stat signaling 39,40 , Bmp 41 and canonical Wnt 37 to promote self-renewal, while Fgf/Erk signaling disrupts pluripotency 37,[42][43][44] ; by contrast human ES cell self-renewal is independent of Lif 45 , yet requires Activin and Fgf signaling 46,47 , and human ES cells undergo differentiation when exposed to Bmp 47 .
The remainder of the paper is organized as follows: we begin by outlining our general mathematical theory and setting out our assumptions, before establishing a mathematical formula that makes the connection between network structure and noise-processing explicit. To illustrate these results we then use this expression to investigate the structure of the regulatory network for pluripotency in mouse ES cells. This network is chosen since it is particularly well characterized 34 and so constitutes a good test model. We find that certain elements in this network, notably incoherent feedforward structures, are particularly important for its noise-processing ability. Interestingly, these elements are distinct from the core feedback structures that are known to maintain the pluripotent ground state 48 , suggesting that different portions of this network perform different regulatory tasks.

Results
Our concern is with how a network G processes noise from an external source. In the context of signaling networks the nodes in the network are molecules in the signaling cascades, and edges are regulatory interactions (e.g. phosphorylation) between molecules. Since signaling networks pass information from the cell exterior to the nucleus, we assume that the network G is inherently directed: the presence of an edge (i, j) indicates that node i exerts a regulatory effect on node j but not necessarily vice versa. Since regulatory interactions may be activatory or inhibitory we also allow each edge to have positive or negative weight, representing the strength of activation or inhibition respectively. We denote the weight of edge (i, j) by $A_{ij}$. Assuming that there are n nodes in G, the n × n adjacency matrix A then describes the strength of all interactions in the system.
In general the regulatory interactions between nodes may be highly nonlinear. However, to better understand the relationship between network structure and function we will assume here that the dynamics are linear. By doing so we are effectively considering the linearization near a fixed point of the nonlinear dynamics; this rationale for studying the linear case has been taken elsewhere 49 . In the absence of external fluctuations, the dynamics of the system are described by the following system of ordinary differential equations (ODEs):
$$\frac{dx}{dt} = (A - dI)\,x = -Mx, \tag{1}$$
where x is the vector of node states (for example, protein concentrations), I is the n × n identity matrix, $M = dI - A$, and we have assumed that all nodes decay at the same rate d, which then sets a timescale for the dynamics. Without loss of generality we may take d = 1 since this may always be achieved by suitable re-scaling. Given the linearity of this system there are only two possible types of long-term behavior: convergence toward a stable fixed point or divergence to infinity. We will assume that only the first behavior can happen, i.e. convergence to a stable fixed point is the only physically realistic scenario. This occurs whenever the real parts of the eigenvalues of M are all strictly positive. Properties of the network G for which Eq. (1) admits a stable solution have been discussed at length, and it is known that sparse modular networks confer stability, for example 49,50 . However, our concern here is not with stability per se but rather with the effect that external noise has on the magnitude of fluctuations around a stable equilibrium. To investigate this we will consider the following stochastic differential equation associated with Eq. (1):
$$dx = -Mx\,dt + \Sigma\,dW(t), \tag{2}$$
where $\Sigma$ specifies the amplitude of the noise and W(t) is a standard n-dimensional Brownian motion (and dW/dt is therefore n-dimensional Gaussian white noise). Equation (2) describes a multivariate Ornstein-Uhlenbeck process. Whenever Eq. (1) admits a stable solution, Eq. (2) is ergodic and therefore admits a unique invariant measure 51 . Furthermore, by the linearity of Eq. (2) we know that x(t) is distributed according to a multivariate normal with mean $e^{-Mt}x(0)$ and covariance matrix K(t) given by
$$K(t) = \int_0^t e^{-M(t-s)}\,\Sigma\Sigma^T\,e^{-M^T(t-s)}\,ds. \tag{3}$$
If Eq. (2) describes an ergodic process then the limit
$$K = \lim_{t\to\infty} K(t) \tag{4}$$
exists and satisfies the Lyapunov equation
$$MK + KM^T = \Sigma\Sigma^T. \tag{5}$$
Although this is the standard formulation 51 , instead of working with Eq. (5) we will work directly with Eq. (3), as it ultimately allows a more transparent assessment of the effects of network structure on the stationary covariance of the system.
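As a numerical illustration of the stationary covariance, the following minimal sketch solves the Lyapunov equation for a small network. It assumes the dynamics take the form dx = −(I − A)x dt + Σ dW with d = 1, that noise of amplitude σ enters at a single node, and that `A[i, j]` stores the weight with which node j acts on node i (transpose the matrix if edges are stored the other way round). The function name `stationary_covariance` and the two-node example are illustrative choices, not part of the original analysis.

```python
import numpy as np

def stationary_covariance(A, sigma, source=0):
    """Stationary covariance K of dx = -(I - A) x dt + Sigma dW.

    Assumed convention: A[i, j] is the weight with which node j acts on node i.
    Noise of amplitude `sigma` enters at the `source` node only.
    """
    n = A.shape[0]
    I = np.eye(n)
    M = I - A
    if np.min(np.linalg.eigvals(M).real) <= 0:
        raise ValueError("unstable network: no stationary covariance exists")
    D = np.zeros((n, n))
    D[source, source] = sigma ** 2            # this is Sigma Sigma^T
    # Row-major vectorisation turns M K + K M^T = D into a linear system.
    L = np.kron(M, I) + np.kron(I, M)
    return np.linalg.solve(L, D.ravel()).reshape(n, n)

# Two-node example: node 0 (the noisy input) activates node 1 with weight a.
a, sigma = 1.0, 1.0
A = np.array([[0.0, 0.0],
              [a,   0.0]])
K = stationary_covariance(A, sigma)
print(K)   # K[0, 0] is the input variance, K[1, 1] the output variance
```

Solving the linear system directly is convenient for small networks; for larger ones a dedicated Lyapunov solver would be the more natural choice.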
Since our purpose is to determine the way in which input noise is processed by the network G it is natural to consider a single noisy input to the system, which represents the fluctuating extra-cellular environment, and a single output, representing the computational core of the network. To do so we may, without loss of generality, choose a labeling of the nodes such that the first node is the noisy input and the n-th node is the output. Thus, we set
$$\Sigma = (\sigma, 0, \ldots, 0)^T \tag{6}$$
and we are interested in calculating the variance of the n-th node in the network, which is given by $K_{nn}$, relative to the magnitude of the input noise. If the input fluctuations carry no information, then this is a measure of the extent to which the network suppresses or amplifies random environmental fluctuations. If the fluctuations contain important environmental information, then this is a measure of the extent to which the target node can 'sense' this extra-cellular information. From Eq. (3) the limiting covariance, in index notation and using Einstein summation notation, is
$$K_{ij} = \lim_{t\to\infty}\int_0^t \left[e^{-M(t-s)}\right]_{ik}\left[\Sigma\Sigma^T\right]_{kl}\left[e^{-M^T(t-s)}\right]_{lj} ds = \sigma^2 \lim_{t\to\infty}\int_0^t e^{-2(t-s)}\left[e^{A(t-s)}\right]_{i1}\left[e^{A^T(t-s)}\right]_{1j} ds. \tag{7}$$
Although it is not immediately transparent, this expression connects strongly to the structure of the network via the fact that the (i, j)-th entry of the k-th power of the adjacency matrix is the total weight of all walks from node i to node j with length k (the weight w(P) of a walk P from node i to node j is the product of its edge weights, $\prod_{(i,j)\in P} A_{ij}$). Thus, the (i, j)-th entry of the exponential of the adjacency matrix of a network is a weighted sum of all walks between nodes i and j, and so is a simple measure of network 'communicability' 52 . Using the notation $\beta_{ijk} = [A^k]_{ij}$ we may re-write the exponential terms in Eq. (7) as
$$\left[e^{At}\right]_{ij} = \sum_{k=0}^{\infty} \beta_{ijk}\,\frac{t^k}{k!}. \tag{8}$$
Making use of this connection and using the shorthand $\beta_{ik} = \beta_{i1k}$ we then obtain,
$$K_{ij} = \sigma^2 \sum_{k=0}^{\infty}\sum_{l=0}^{\infty} \frac{\beta_{ik}\,\beta_{jl}}{k!\,l!} \lim_{t\to\infty}\int_0^t (t-s)^{k+l}\,e^{-2(t-s)}\,ds. \tag{9}$$
Since
$$\int_0^{\infty} u^{k+l}\,e^{-2u}\,du = \frac{(k+l)!}{2^{k+l+1}}, \tag{10}$$
we may furthermore simplify Eq. (9) to
$$K_{ij} = \frac{\sigma^2}{2} \sum_{k=0}^{\infty}\sum_{l=0}^{\infty} \binom{k+l}{k}\left(\frac{1}{2}\right)^{k+l} \beta_{ik}\,\beta_{jl}. \tag{11}$$
Since the variance of the noisy input is given by $K_{11} = \sigma^2/2$, we may investigate the noise-processing ability of the network by considering the ratio
$$R = \frac{K_{nn}}{K_{11}} = \sum_{k=0}^{\infty}\sum_{l=0}^{\infty} \binom{k+l}{k}\left(\frac{1}{2}\right)^{k+l} \beta_k\,\beta_l, \tag{12}$$
where we have further simplified notation by setting $\beta_{nk} = \beta_k$. When R > 1 noise is amplified by the network; when R < 1 noise is suppressed by the network. Importantly, if the process described by Eq. (2) is ergodic then R is finite and depends only on the structure of the network G. This formula therefore provides an explicit connection between network architecture and noise-processing; our interest is to determine how R is affected by different network architectures. To do so we note that Eq. (12) has a natural interpretation in terms of random walks on G, as follows.
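To make the ratio R concrete, the sketch below evaluates it as a truncated double series, assuming Eq. (12) has the form R = Σ_k Σ_l C(k+l, k) (1/2)^(k+l) β_k β_l, where β_k is the total weight of walks of length k from the input to the output. The helper name `noise_ratio_series`, the truncation length `max_len`, and the three-node example networks are all hypothetical choices made for illustration. As in the previous convention, `A[i, j]` is taken to be the weight with which node j acts on node i, and the input node is assumed to have no incoming edges (so that its variance is σ²/2).

```python
from math import comb
import numpy as np

def noise_ratio_series(A, source, target, max_len=60):
    """Truncated double series for R = K[target, target] / K[source, source].

    Assumes A[i, j] is the weight of the edge j -> i and that `source`
    has no incoming edges.
    """
    n = A.shape[0]
    beta = []
    P = np.eye(n)                      # P holds A**k during the loop
    for _ in range(max_len + 1):
        beta.append(P[target, source]) # total weight of walks of length k
        P = A @ P
    R = 0.0
    for k, bk in enumerate(beta):
        if bk == 0.0:
            continue
        for l, bl in enumerate(beta):
            R += comb(k + l, k) * 0.5 ** (k + l) * bk * bl
    return R

a = 0.8
# Chain 0 -> 1 -> 2 with equal weights.
chain = np.zeros((3, 3))
chain[1, 0] = a
chain[2, 1] = a
R_chain = noise_ratio_series(chain, source=0, target=2)

# Same chain plus a direct edge 0 -> 2, with either sign.
coherent = chain.copy()
coherent[2, 0] = a
incoherent = chain.copy()
incoherent[2, 0] = -a
R_coherent = noise_ratio_series(coherent, source=0, target=2)
R_incoherent = noise_ratio_series(incoherent, source=0, target=2)
print(R_chain, R_coherent, R_incoherent)
```

For acyclic networks the series terminates after finitely many terms, so the truncation is exact; with feedback loops `max_len` controls the approximation error.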
Since each walk P from the input to the target has an associated weight w(P), a pair of (possibly intersecting) walks P, Q from the input to the target also has an associated weight w(P, Q) = w(P)w(Q), the product of the edge weights involved. If we write $P_k$ and $P_l$ for arbitrary walks from the input to the target of length k and l respectively, then the product $\beta_k\beta_l$ can be written as
$$\beta_k\,\beta_l = \sum_{P_k} w(P_k) \sum_{P_l} w(P_l) = \sum_{P_k,\,P_l} w(P_k, P_l). \tag{13}$$
Substituting this into Eq. (12) and rewriting the second sum in terms of m = k + l gives,
$$R = \sum_{m=0}^{\infty}\sum_{k=0}^{m} \binom{m}{k}\left(\frac{1}{2}\right)^{m} \sum_{P_k,\,P_{m-k}} w(P_k, P_{m-k}), \tag{14}$$
from which it can be seen that R is a weighted sum of all pairs of walks through G from the source node to the target, with the relative importance of each walk-pair determined by a coefficient drawn from a binomial distribution B(m, 1/2), where m is the length of the walk-pair. The appearance of binomial probabilities arises as the natural probability measure for pairs of random walkers on the network G. To see this consider two independent random walkers starting at the same time at the input node. At each time step, one walker is chosen with probability 1/2, and that walker moves through G choosing available edges with equal probability (i.e. if the walker is at node i then each outgoing edge from node i is chosen with probability $1/d_i^{\mathrm{out}}$, where $d_i^{\mathrm{out}}$ is the out-degree of node i). Of m = k + l steps in total, the first walker takes exactly k and the second exactly l with binomial probability $\binom{m}{k}(1/2)^m$, which is the coefficient that weights each walk-pair in Eq. (14). In this sense Eq. (14) is the expected weight of a pair of walks from input to target, with respect to the probability measure generated by two independent random walkers [the presence of two random walkers rather than one, as might be expected, arises from the fact that K depends upon two exponential terms, $e^{A(t-s)}$ and $e^{A^T(t-s)}$, in Eq. (7)]. If G is a directed acyclic graph (in which there are no feedback loops), then all random walks have finite length and the first sum in Eq. (14) has finitely many terms.
However, if cycles are present in the network then random walks may be arbitrarily long and the first sum will correspondingly have infinitely many terms. Positive feedback loops add infinitely many positive terms to the sum, and therefore always serve to amplify noise with respect to similarly structured acyclic networks; negative feedback loops add both positive and negative terms to the sum and may amplify or reduce noise with respect to similarly structured acyclic networks, depending on the particular arrangement of inhibitory edges in the network. This observation is consistent with previous studies on the effect of positive and negative feedback loops in noise propagation, particularly in biological regulatory networks 25,26,53,54 . As the random walks are independent of the edge weights, the importance of each edge to the noise-processing ability of the network has two contributions: (1) the probability that either of the random walkers traverses that edge (which depends solely on its position within the network relative to the source and target); and (2) the extent to which it contributes to any walk it participates in (which depends solely on its weight).
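The effect of feedback on the walk-pair sum can be illustrated with a small sketch. It assumes the walk-pair form R = Σ_m Σ_k C(m, k) (1/2)^m β_k β_(m−k), with β_k accumulated node by node as the total weight of walks of length k from input to output. The function names, the node labels `"in"`, `"mid"`, `"out"`, the weights and the truncation length are hypothetical; the ordering reported for the negative-feedback case holds for this particular example and is not a general rule.

```python
from math import comb

def walk_weights(edges, source, target, max_len):
    """beta[k] = total weight of all walks of length k from source to target.

    `edges` maps a node to a list of (successor, weight) pairs.
    """
    beta = [0.0] * (max_len + 1)
    current = {source: 1.0}              # summed walk weight ending at each node
    for k in range(max_len + 1):
        beta[k] = current.get(target, 0.0)
        nxt = {}
        for node, w in current.items():
            for succ, weight in edges.get(node, []):
                nxt[succ] = nxt.get(succ, 0.0) + w * weight
        current = nxt
    return beta

def noise_ratio(edges, source, target, max_len=80):
    """Sum over walk pairs of total length m = k + l, truncated at max_len."""
    beta = walk_weights(edges, source, target, max_len)
    R = 0.0
    for m in range(2 * max_len + 1):
        for k in range(max(0, m - max_len), min(m, max_len) + 1):
            R += comb(m, k) * 0.5 ** m * beta[k] * beta[m - k]
    return R

# Acyclic chain in -> mid -> out, then the same chain with a feedback edge
# out -> mid that is either activating or inhibiting.
acyclic = {"in": [("mid", 0.5)], "mid": [("out", 0.5)]}
positive = dict(acyclic, out=[("mid", 0.5)])
negative = dict(acyclic, out=[("mid", -0.5)])

R_acyclic = noise_ratio(acyclic, "in", "out")
R_positive = noise_ratio(positive, "in", "out")
R_negative = noise_ratio(negative, "in", "out")
print(R_acyclic, R_positive, R_negative)
```

In this example the positive loop only adds positive walk weights, whereas the negative loop produces walk weights of alternating sign, so some walk-pair terms cancel.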
It is worth noting here that although random walkers are a natural way to explore directed networks 55,56 , if the matrix A is normal (that is, if it commutes with its transpose A T ; a strong condition that is not typically satisfied by directed networks but does occur if the network G is undirected, for example), then a much simpler related result for the trace of the covariance matrix Tr(K) may be obtained. In the rest of this section we will study the particular case that A is normal. Although not directly relevant to signaling networks, this case does nevertheless provide insight into Eq. (12) by relating noise-processing to well-known network-theoretic notions.
First, let us take the sum of the ratios $K_{jj}/K_{11}$ in Eq. (12) over all output nodes j,
$$\sum_{j} \frac{K_{jj}}{K_{11}} = \sum_{m=0}^{\infty}\sum_{k=0}^{m}\sum_{j} \binom{m}{k}\left(\frac{1}{2}\right)^{m} \beta_{jk}\,\beta_{j,m-k}. \tag{17}$$
In general this sum cannot be simplified and we resort to interpreting it in terms of random walkers, as above. However, if the matrices A and $A^T$ commute then we can use the binomial formula to expand the m-th power of the symmetric part of A, $A_s = (A + A^T)/2$, as
$$A_s^m = \left(\frac{A + A^T}{2}\right)^{m} = \left(\frac{1}{2}\right)^{m}\sum_{k=0}^{m}\binom{m}{k}\,(A^T)^k A^{m-k}. \tag{18}$$
The (1, 1)-entry of the matrix above is precisely the double inner sum in Eq. (17), so we may write
$$\sum_{j} \frac{K_{jj}}{K_{11}} = \sum_{m=0}^{\infty} \left[A_s^m\right]_{11} \tag{19}$$
$$= \left[(I - A_s)^{-1}\right]_{11} = \left[M_s^{-1}\right]_{11}, \tag{20}$$
where $M_s = (M + M^T)/2$ is the symmetric part of M. By taking into account all possible walks through the network, this is a simple variation on the exponential of the adjacency matrix as a measure of network communicability, although in this case the communicability of G is taken with respect to the analytic function $(1 - x)^{-1}$, rather than exp(x) 52 . Using this result we finally obtain,
$$\mathrm{Tr}(K) = \frac{\sigma^2}{2}\left[M_s^{-1}\right]_{11}. \tag{21}$$
Thus, in order to determine the noise-processing ability of the network using this measure, we need only calculate the matrix inverse of the symmetric part of M, and take the (1, 1)-entry (where without loss of generality the first node is the input node). This result is strongly related to the Laplacian matrix of the network G: if A is symmetric and normalized so that each row (or column) sums to 1 then $M_s$ is precisely the normalized Laplacian of G, which is well-known to be closely related to network connectivity 57,58 .
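The trace result for normal matrices can be checked numerically. The sketch below assumes it reads Tr(K) = (σ²/2)[M_s⁻¹]₁₁ with M = I − A and M_s = (M + Mᵀ)/2, and compares it with the trace of the stationary covariance obtained from the Lyapunov equation MK + KMᵀ = ΣΣᵀ. The helper `traces` and both example networks are hypothetical; the directed chain is included only to show that the shortcut is not expected to hold when A is not normal.

```python
import numpy as np

def traces(A, sigma=1.0, source=0):
    """Return (Tr(K) from the Lyapunov equation, (sigma^2 / 2) [M_s^{-1}]_{source, source})."""
    n = A.shape[0]
    I = np.eye(n)
    M = I - A
    D = np.zeros((n, n))
    D[source, source] = sigma ** 2
    # Solve M K + K M^T = D via row-major vectorisation.
    K = np.linalg.solve(np.kron(M, I) + np.kron(I, M), D.ravel()).reshape(n, n)
    M_s = (M + M.T) / 2
    return np.trace(K), 0.5 * sigma ** 2 * np.linalg.inv(M_s)[source, source]

# Undirected three-node path: A is symmetric, hence normal.
undirected = np.array([[0.0, 0.3, 0.0],
                       [0.3, 0.0, 0.3],
                       [0.0, 0.3, 0.0]])
assert np.allclose(undirected @ undirected.T, undirected.T @ undirected)
t_direct, t_formula = traces(undirected)

# Directed two-node chain: A does not commute with its transpose.
directed = np.array([[0.0, 0.0],
                     [0.3, 0.0]])
d_direct, d_formula = traces(directed)

print(t_direct, t_formula)   # agree for the normal case
print(d_direct, d_formula)   # differ for the directed case
```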
To illustrate how Eq. (12) works in practice, we now consider a couple of examples.
Signaling cascade. The simplest example network is a signaling cascade, consisting of a chain of m = n − 1 interactions between n nodes. To facilitate a transparent illustration we shall assume that all the weights in the network are the same. In this case, we may index the nodes such that $A_{ij} = a$ for edges (i, i + 1), where i = 1, 2, …, n − 1, and is zero otherwise. There is then a single walk from the input to the output, of length m and weight $a^m$, so that $\beta_m = a^m$ and all other $\beta_k$ vanish. Equation (12) then reduces to
$$R = \binom{2m}{m}\left(\frac{a}{2}\right)^{2m}, \tag{22}$$
which for large m is well approximated, via Stirling's formula, by
$$R \approx \frac{a^{2m}}{\sqrt{\pi m}}. \tag{23}$$
Three conclusions are apparent from this result: (1) since the variance of the target node depends upon the magnitude of the interaction strength squared, the sign of the interactions (i.e. whether they are activating or inhibiting) does not affect the ability of the cascade to process noise; (2) for fixed a with |a| ≤ 1, R decreases monotonically with m, so the effect of input noise diminishes with longer cascades; (3) if |a| < 1 then R < 1 for any m and noise is diminished by the cascade, while if |a| > 1 noise may be amplified by the cascade depending on the magnitude of a relative to m. In general, since R increases with |a| and, for weak interactions, decreases with m, this analysis suggests that long signaling cascades with weak interactions process noise better than short cascades with stronger interactions. This result is in accordance with previous studies on noise-processing by transcriptional cascades 60 .
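The behavior of the cascade can be tabulated with a short sketch. It assumes that, for a chain with a single walk of length m and weight a^m from input to output, Eq. (12) reduces to R = C(2m, m) (a/2)^(2m); the function names and the parameter values are illustrative only, and the Stirling form is included purely as a large-m approximation.

```python
from math import comb, pi, sqrt

def cascade_ratio(a, m):
    """R for a chain of m equal-weight interactions (assumed closed form)."""
    return comb(2 * m, m) * (a / 2) ** (2 * m)

def cascade_ratio_stirling(a, m):
    """Large-m approximation of cascade_ratio via Stirling's formula."""
    return a ** (2 * m) / sqrt(pi * m)

for a in (0.5, 1.0, 2.0):
    row = [cascade_ratio(a, m) for m in range(1, 6)]
    print(f"a = {a}: " + ", ".join(f"{r:.4g}" for r in row))

# The sign of the interactions does not matter, only their magnitude.
print(cascade_ratio(-0.5, 3) == cascade_ratio(0.5, 3))
```

Printing the rows side by side shows the two regimes discussed above: for weak interactions R shrinks as the cascade lengthens, whereas for sufficiently strong interactions it grows.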

Feedforward loop.
In reality signaling cascades do not operate in isolation; rather cross-talk between pathways means that many different paths from the input to the output may exist, and each may process different aspects of the extra-cellular signal. The simplest example of such a network is the feedforward loop motif, a commonly occurring structure in biological regulatory networks which is known to be involved in a range of biological functions, including distinguishing persistent signals from noise 61 . The simplest feedforward loop consists of three nodes with two paths from the source (node 1) to the target (node 3): one direct and one indirect, via an intermediary (node 2). Writing $\beta_1$ for the weight of the direct path and $\beta_2$ for the weight of the indirect path, Eq. (12) gives
$$R = \frac{1}{2}\beta_1^2 + \frac{3}{4}\beta_1\beta_2 + \frac{3}{8}\beta_2^2. \tag{24}$$
In this case, assuming that all interactions are of equal weight a > 0 in magnitude, so that $|\beta_1| = a$ and $|\beta_2| = a^2$, we obtain
$$R_{\pm} = \frac{1}{2}a^2 \pm \frac{3}{4}a^3 + \frac{3}{8}a^4, \tag{25}$$
where the sign of the middle term is the sign of the product $\beta_1\beta_2$. By contrast to the simple signaling cascade, the sign of the edges in the feedforward loop does affect its noise-processing ability. If all edges are positive then $\beta_1, \beta_2 > 0$ (all paths in the network are positive) and the target receives a consistent signal from the source. In this case the feedforward loop is said to be coherent 61 . However, if exactly one of $\beta_1$ and $\beta_2$ is negative (which occurs if either one or three of the edges is negative) then the target receives an inconsistent signal from the source. In this case, the feedforward loop is said to be incoherent 61 . Denoting the noise-processing ratios in the coherent and incoherent cases by $R_+$ and $R_-$ respectively, it follows from Eq. (25) that $R_+ > R_-$, and therefore that the incoherent feedforward loop is better at processing noise. This general conclusion also holds when the paths from the source to the target are of arbitrary length: incoherence of pairs of paths through the network tends to lead to better noise-processing. These results are consistent with previous studies on the effect of positive and negative paths in feedforward loops on noise propagation 25, 62-65 .
Noise-processing by stem cells.
In order to apply these general results we now consider signal transduction in the regulatory network for pluripotency. The skeleton of this network has recently been inferred from analysis of correlations between expression patterns of important regulatory factors 34 and is illustrated in Fig. 1.
Scientific Reports | 7: 532 | DOI:10.1038/s41598-017-00659-x
The behavior of this network is determined by input from three extra-cellular factors commonly added to ES cell culture media preparations: the cytokine leukemia inhibitory factor (Lif), and selective inhibitors of glycogen synthase kinase 3 (Gsk3) (Chiron99021, denoted CH) and mitogen-activated protein kinase kinase (Mek) (PD0325901, denoted PD). Since it is known that stimulation of Lif signaling is sufficient to maintain pluripotency in vitro, we will choose Lif as the noisy source in our analysis. When Lif is present in the extra-cellular environment it binds to its receptor and activates Stat3 66,67 , triggering signaling pathways that stimulate the core transcriptional regulatory network for pluripotency in mouse ES cells 35,68 . At the heart of this core network is the trio of transcription factors Oct4, Sox2 and Nanog 29,69,70 . Since Oct4 is well-established as the most central factor in this core, we will consider the propagation of a noisy signal from (the input) Lif through the network to (the output) Oct4. Although the sign of the interactions (i.e. whether they are activatory or inhibitory) is known 34 , their strength is unknown. In the absence of this information we assume that all interactions are of equal unit strength since this represents the most economical model. To investigate how network structure affects noise-processing, we sought to determine how the ratio R changes upon targeted removal of different interactions from this network by comparing the R-values of perturbed networks with that of the unperturbed network (denoted $R_{\mathrm{full}}$). To uncover some of the structural determinants of noise-processing we also calculated two simple network measures based upon our interpretation of Eq. […]: (1) c, […] paths from Lif to Oct4 respectively. This is a simple measure of structural coherence; and (2) f, the total number of feedback loops in the network, as a measure of network complexity. To determine how network properties varied with removal of specific edges, we also determined how these measures changed upon targeted removal of different interactions from the network by comparing values with those of the unperturbed network (denoted $c_{\mathrm{full}}$ and $f_{\mathrm{full}}$ respectively). The results of this analysis are summarized in Table 1.
Figure 1 (caption, partial). The reduced network that we study here, in which Lif is taken as a noisy input and Oct4 is taken as the target. Since CH and PD cannot be reached from Lif via a walk on this network, we can exclude these nodes, along with Tcf3 and ERK, from our analysis. Edges that participate in feedback loops in the network are shown in red.
Table 1. The effect of targeted removal of interactions on network noise-processing. The first column identifies the edge removed from the network; the second column shows the effect of targeted removal of the given edge on the ratio R by comparison with that of the unperturbed network; the third column shows the effect that targeted removal of the given edge has on network coherence; the fourth column shows the effect that targeted removal of the given edge has on network feedback. Edges that emanate from Oct4 do not contribute to the noise-processing capacity of the network and their removal does not affect R, so they are excluded from this table. Since all paths from Lif to Oct4 pass through the edge Lif → Stat3 its removal disconnects the network; this edge is accordingly also excluded from the table. Interactions are ordered by column 1.
The shortest paths from Lif to Oct4 in this network have length 4; there are two such paths: (1) Lif → Stat3 → Tfcp2l1 → Esrrb ⊣ Oct4; and (2) Lif → Stat3 → Klf4 → Klf2 → Oct4. The first of these paths is negative (due to the inhibitory interaction Esrrb ⊣ Oct4) while the second is positive. Thus, when taken together this pair of paths forms an incoherent feedforward loop; since these are the shortest paths in the network, we anticipate from Eq. (12) that this incoherent feedforward loop will have an important role in noise-processing in this network. Indeed, this is what is observed: if any of the elements of this structure are removed, then the noise-processing capacity of this network is severely inhibited and the ratio R increases substantially (see Table 1). Conversely, since feedback loops introduce arbitrarily long walks in the network [and therefore contribute infinitely many terms to the sum in Eq. (12)], we also anticipate that removal of edges which participate strongly in the feedback structure of the network will result in a substantial improvement in its noise-processing capacity. Again, this is what is observed: when edges which participate in large numbers of feedback loops are removed, the ratio R decreases substantially (see Table 1). Although we have focused on the effect of edge removal on two particular properties (coherence and feedback), each edge in the network is likely to have multiple functional roles, and an individual edge may be important for both coherence and feedback (for example, since Esrrb has an important role both in mediating environmental signals to the computational core and in maintaining the feedback structure of the core, edges such as Tfcp2l1 → Esrrb and Esrrb ⊣ Oct4 naturally have a dual role). For this reason, while we observe general trends between noise-processing and coherence/feedback, we do not, as expected, see strong correlations (see Fig. 2).
However, when taken together, these results indicate that those interactions in the feedback-rich core transcriptional circuitry (shown in red in Fig. 1) that are needed to maintain a self-perpetuating pluripotent identity 5, 48 have a tendency to amplify extrinsic noise. To compensate, a distinct set of interactions between auxiliary factors, structured into a set of incoherent feedforward loops, tends to suppress environmental noise, and thereby ensures that environmental signals are robustly mediated to this core circuit.
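The role of the incoherent pair of shortest paths can be illustrated with a toy calculation. The sketch below uses only the seven interactions named in the text (not the full network of Fig. 1, whose remaining edges and feedback loops are not listed here), assigns them unit strength as assumed above, and evaluates R as the double series R = Σ_k Σ_l C(k+l, k) (1/2)^(k+l) β_k β_l. Because this sub-network contains exactly one positive and one negative path of equal length and weight, the two cancel completely; that exact cancellation is a property of the toy example, not a claim about the full network. The function and variable names are hypothetical.

```python
from math import comb
import numpy as np

# (regulator, target, sign) for the interactions named in the text;
# unit strength is assumed, as in the main analysis.
EDGES = [
    ("Lif", "Stat3", +1), ("Stat3", "Tfcp2l1", +1), ("Tfcp2l1", "Esrrb", +1),
    ("Esrrb", "Oct4", -1), ("Stat3", "Klf4", +1), ("Klf4", "Klf2", +1),
    ("Klf2", "Oct4", +1),
]
NODES = sorted({name for edge in EDGES for name in edge[:2]})

def noise_ratio(edges, source="Lif", target="Oct4"):
    """R for an acyclic signed network given as (regulator, target, sign) triples."""
    idx = {name: i for i, name in enumerate(NODES)}
    n = len(NODES)
    A = np.zeros((n, n))
    for u, v, sign in edges:
        A[idx[v], idx[u]] = sign          # column = regulator, row = target
    beta, P = [], np.eye(n)
    for _ in range(n):                    # acyclic: walks have fewer than n edges
        beta.append(P[idx[target], idx[source]])
        P = A @ P
    return sum(comb(k + l, k) * 0.5 ** (k + l) * bk * bl
               for k, bk in enumerate(beta) for l, bl in enumerate(beta))

R_sub = noise_ratio(EDGES)
knockouts = {f"{u}->{v}": noise_ratio([e for e in EDGES if e != (u, v, s)])
             for u, v, s in EDGES}

print("intact sub-network:", R_sub)
for edge, value in knockouts.items():
    print(f"remove {edge}: R = {value:.4f}")
```

Removing any edge from one of the two paths leaves a single path of length 4, so the cancellation is lost and R rises; removing Lif → Stat3 disconnects the input from the output altogether.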

Discussion
In this paper we have investigated noise-processing by networks. To do so we have combined a stochastic approach (similar to that taken by Anderson and co-workers in ref. 59) with graph-theoretic notions to derive a simple expression that relates a network's structure to its noise-processing ability. This expression is easily calculated even for large networks [particularly if the network is undirected, see Eq. (21)], and so provides an economical measure that may be used to examine the structure of naturally occurring networks, and guide the design of man-made networks. In practical applications it is often desirable to maintain signal intensity while noise is simultaneously controlled. In such cases some noise-minimizing strategies, such as introducing long or incoherent paths through the network, may not be desirable or feasible. It would be interesting to investigate further how the interplay between noise-processing and signal maintenance shapes the structure of biological regulatory networks, for example using recent methods from the theory of adaptive networks 71 . Similarly, it would also be interesting to examine how network structures affect noise-processing more generally, taking into account more complex network dynamics such as limit cycles. To illustrate our results we have considered the structure of the network that maintains pluripotency in mouse ES cells, and found that important network structures, distinct from those that maintain the core pluripotent state, are responsible for noise-processing in this system, suggesting that different features of this network are responsible for different regulatory tasks. Accordingly, we anticipate that the structure of many natural networks may be determined, in part, to optimally process noise.
It will be interesting to elucidate the extent to which cross-talk between pathways in natural networks, which typically have to process multiple complex signals, is shaped by the trade-off between signal integration and noise-processing.
Figure 2 (caption, partial). […] Table 1. Removal of edges that results in an increase of coherence in the network tends to diminish the system's noise-processing ability, while removal of edges that reduces the overall feedback structure of the network tends to improve the system's noise-processing ability. Red lines show linear regression.