The Dynamics of Canalizing Boolean Networks

Boolean networks are a popular modeling framework in computational biology to capture the dynamics of molecular networks, such as gene regulatory networks. It has been observed that many published models of such networks are defined by regulatory rules driving the dynamics that have certain so-called canalizing properties. In this paper, we investigate the dynamics of a random Boolean network with such properties using analytical methods and simulations. From our simulations, we observe that Boolean networks with higher canalizing depth have generally fewer attractors, the attractors are smaller, and the basins are larger, with implications for the stability and robustness of the models. These properties are relevant to many biological applications. Moreover, our results show that, from the standpoint of the attractor structure, high canalizing depth, compared to relatively small positive canalizing depth, has a very modest impact on dynamics. Motivated by these observations, we conduct mathematical study of the attractor structure of a random Boolean network of canalizing depth one (i.e., the smallest positive depth). For every positive integer ℓ, we give an explicit formula for the limit of the expected number of attractors of length ℓ in an n-state random Boolean network as n goes to infinity.


Introduction
Dynamic mathematical models are a key enabling technology in systems biology. Depending on the system to be modeled, the data and information available for their construction, the questions to be answered, and different modeling frameworks can be used. For kinetic models, systems of ordinary differential equations have a long tradition.
Generally, they will have the very special structure of polynomial equations representing Michaelis-Menten kinetics, even in the case of systems, such as gene regulatory networks, that are not proper biochemical reaction networks. It is this special structure that gives models desirable properties and aids in model analysis. Besides continuous models, a range of discrete models are finding increasingly frequent use, in particular Boolean network models of a broad variety of biological systems, from intracellular molecular networks to population-level compartmental models (see e.g., [1][2][3][4][5]), going back to the work of Kauffman in the 1960s [6][7][8]. While Boolean network models, a collection of nodes, whose regulation by other nodes is described via a logical rule built from Boolean operators, are intuitive and mathematically simple to describe, their analysis is severely limited by the lack of mathematical tools. It generally consists of simulation results. Any set function on binary strings that takes on binary values can be represented as a Boolean function, so that the class of general Boolean networks is identical to the class of set functions on binary strings of a given length, making any general analysis impossible. The search for special classes of Boolean functions that are broad enough to cover all or most rules that occur in biology, but special enough to allow for mathematical approaches has a long history.
It was again Kauffman who proposed a class of functions [7] with properties inspired by the developmental biology concept of canalization, going back to Waddington in the 1940s [9]. There is some evidence that canalizing Boolean functions do indeed appear disproportionately in published models and that the dynamics of Boolean network models consisting of canalizing functions has special properties, in particular a "small" number of attractors. This is important since, in the case of intracellular molecular network models, attractors correspond to the different phenotypes a cell is capable of. Here, again, the majority of available results are obtained by simulating large numbers of such networks. The main question of this paper is as follows: What do the dynamics of a random canalizing Boolean network look like? We approach this question using both computer simulations and analytical methods, with the main result of the paper being Theorem 2, which gives a provable formula for the number of expected attractors of a general Boolean network with a particular canalization property. In addition to providing important information about canalizing Boolean network models, this result can be viewed as a part of a growing body of mathematical results characterizing this class of networks that promises to be as rich as that for chemical reaction network models based on ordinary differential equations.

Background
The property of canalization for Boolean functions was introduced by Kauffman in [7], inspired by the concept of canalization from developmental biology [9]. A Boolean function is canalizing if there is a variable and a value of the variable such that if the variable takes the value, then the value of the function does not depend on other variables. It was shown that models defined by such functions often exhibit less chaotic and more stable behavior [10,11]. Nested canalizing functions, obtained by applying the concept of canalization recursively, were introduced in [2]. They form a special subset of canalizing functions and have stable dynamics [11]. We note that there are other important properties shared by Boolean networks arising in modeling (for example, sparsity [7]). In this paper we focus only on canalization and its impact on the dynamics, and one of the natural future directions would be to consider several such properties simultaneously.
To cover more models arising in applications, the notion of nested canalizing function was relaxed by Layne et al. [12] by assigning to every Boolean function its canalizing depth. Noncanalizing functions have canalizing depth zero, and nested canalizing functions have the maximal possible canalizing depth equal to the number of variables. Canalizing depth of a Boolean network is defined as the minimum of the canalizing depths of the functions defining the network. In [12], activities and sensitivities of functions with different canalizing depths and stability and criticality of Boolean networks composed from such functions were investigated. It has been observed that Boolean networks of higher canalizing depth tend to be more stable and less sensitive. However, increasing the canalizing depth to the maximum does not improve the stability significantly compared to moderate positive canalizing depth. These observations give a strong indication of the biological utility of canalizing function, even with small canalizing depth.
Attractors in Boolean network models can be interpreted as distinct cell types [13, p. 202] and their lengths can be viewed as the variety of different gene expression patterns corresponding to the cell type. Thus, understanding the attractor structure of a random Boolean network defined by functions of a fixed canalizing depth is important for assessing biological relevance of such models. Analytic study of the attractor structure of nested canalizing Boolean networks has been carried out in [11]. For discussion about attractors of length one (i.e., steady state), we refer to [14].

Our Results
The main question of this paper is as follows: What do the dynamics of a random canalizing Boolean network look like? We approach this question using both computer simulations and analytical methods.
In our computational experiments, we generate approximately 30 million random Boolean networks of all possible canalizing depths with the number of variables ranging from 4 to 20. For each of these networks, we determine sizes of all the attractors and basins of attraction and analyze the obtained data. We discover the following:

1.
For a fixed number of variables, the sample mean of the number of attractors and average size of an attractor decrease when the canalizing depth increases

2.
The decrease of the average size of an attractor is much greater than the decrease of the number of attractors as the canalizing depth increases

3.
Both decreases from (8) are substantial when the canalizing depth changes from zero to small canalizing depths, but a further increase of the canalizing depth does not lead to a significant decrease for either the sample means or for the empirical distributions

4.
The relative decrease of the sample mean of the number of attractors and the average attractor size when the canalizing depth changes from zero to one becomes sharper when the number of variables increases Observations (8) and (A.4) are consistent with the results obtained in [12] for sensitivity and stability. This provides new evidence that Boolean networks of small positive canalizing depth are almost as well-suited for modeling as those with nested canalizing functions, from the point of view of stability. Since there are many more canalizing functions of small positive canalizing depth than nested canalizing functions [15,Section 5], they provide a richer modeling toolbox.
Motivated by observation (A.4), we conduct a mathematical study of the attractor structure of a random Boolean network of canalizing depth one (that is, the minimal positive depth). Our main theoretical result, Theorem 2, gives, for every positive integer ℓ, a formula for the limit of the expected number of attractors of length ℓ in a random Boolean network of depth one. The same formulas are valid for a random Boolean network defined by canalizing functions (see Remark 5). In particular, our formulas show that a large random network of depth one, on average, has more attractors of small sizes that an average Boolean network (Remark 6).
Formulas similar to the ones in our proofs (e.g., in Lemma A.4) have already appeared in the study of the average number of attractors of a given length in sparse Boolean networks, e.g., [16, equation (2)] and [17, equation (6)]. The results of [16,17] are based on describing the asymptotic behavior of these formulas in terms of N, the number of nodes in the network, and the asymptotics is of the form O N α . In our case, the average number of attractors of a given length simply approaches a constant as N→∞ (that is, O(1)), but our methods allow us to find the exact value of this constant.

Structure of the Paper.
The rest of the paper is organized as follows. Section 4 contains necessary definitions about canalizing functions and Boolean networks. Outlines of the algorithms used in our computational experiments are in Section 5. The main observations are summarized in Section 6. Our main theoretical result about attractors in a random Boolean network of canalizing depth one (Theorem 2) is presented in Section 7. Section 8 contains conclusions. The proofs are located in the Appendix.

Preliminaries Definition 1.
A Boolean network is a tuple f = f 1 , f 2 , …, f n of Boolean functions in n variables. For a state a t = a t, 1 , a t, 2 , …, a t, n ∈ 0, 1 n at time t, we define the state a t + 1 : = f a t = a t + 1, 1 , …, a t + 1, n ∈ 0, 1 n at time t + 1 by a t + 1, 1 = f 1 a t, 1 , …, a t, n , ⋮ a t + 1, n = f n a t, 1 , …, a t, n . (1)

Definition 2 (attractors and basins).
Let f = f 1 , …, f n be a Boolean network.

i.
A sequence a 1 , …, a ℓ ∈ 0, 1 n of distinct states is called an attractor of f if f a i = a i + 1 for every 1 ≤ i < ℓ and f a ℓ = a 1 . ii.

iii.
Let A = a 1 , …, a ℓ ∈ 0, 1 n ℓ be an attractor of f. The basin of A is the set (2)

Definition 3.
A nonconstant function f x 1 , …, x n is canalizing with respect to a variable x i if there exists a canalizing value a ∈ 0, 1 such that f x 1 , …, x i − 1 , a, x i + 1 , …, x n ≡ const . (3)

Example 1.
Consider f x 1 , x 2 = x 1 ⋅ x 2 (the product is understood modulo 2, that is, logical AND). It is canalizing with respect to x 1 with canalizing value 0 because f 0, x 2 = 0 regardless of the value of x 2 . Analogously, it is canalizing with respect to x 2 with canalizing value 0.
The same argument works for x 2 as well.
where i. i 1 , …, i k are distinct integers from 1 to n ii.
iii. g is a noncanalizing function in the variables x 1 , …, x n / x i 1 , …, x i k

Remark 1.
Since g in Definition 4 is noncanalizing, every function has a single well-defined canalizing depth. In particular, a function of depth two is not considered to have depth one.

Definition 5.
We say that a canalizing Boolean function f x 1 , …, x n is nested if f has canalizing depth n, that is, g = 0 or g = 1 (see Definition 4). For example, f x 1 , x 2 , x 3 = x 1 x 2 x 3 is nested canalizing because so the canalizing depth of f is 3, which is equal to n = 3.

Definition 6.
We say that a Boolean network f = f 1 , …, f n has canalizing depth k if f 1 , …, f n are Boolean functions of canalizing depth k.

Simulations: Outline of the Algorithms
In our computational experiment, we generated random Boolean networks of various canalizing depths. For each network, we store a list of pairs a i , b i , where a i is the size of the ith attractor of the network and b i is the size of its basin. The generated data are available at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics/tree/ master/data. To generate the data, we used two algorithms: one for generating a random Boolean network of a given canalizing depth and one for finding the sizes of attractors and their basins (Algorithm 1).

Generating Random Boolean Functions of a Given Canalizing Depth.
[12, Section 5] contains a sketch of an algorithm for generating random Boolean functions that have canalizing depth at least k for a given k. Here, we generate functions of canalizing depth equal to k and take a different approach than [12]. In order to ensure that the probability distribution of possible outputs is uniform, we use the following structure theorem due to He and Macaulay [15].
Theorem 1 (see [15], Theorem 4.5).-Every Boolean function f x 1 , …, x n ≢0 can be uniquely written as where M i = ∏ j = 1 k i x i j + a i j for every 1 ≤ i ≤ r, p C ≢0 is a noncanalizing function, and k = ∑ i = 1 r k i is the canalizing depth. Each x i appears in exactly one of M 1 , …, M r , p C , and the only restrictions on equation (8) are the following "exceptional cases": (E1) If p C ≡ 1 and r ≠ 1, then k r ≥ 2 (E2) If p C ≡ 1 and r = 1 and k 1 = 1, then b = 0 Example 3.-Consider f x 1 , x 2 , x 3 , x 4 = x 1 x 2 + 1 x 3 x 4 + x 3 + x 4 can be represented as so M 1 = x 1 + 0 x 2 + 1 , M 2 = x 3 + 1 x 4 + 1 , b = 0, k = 4, and p C = 1. This can be verified by expanding the brackets in the original and new representations of f.
Our algorithm is summarized in Algorithms 2 and 3 below. Correctness of Algorithm 2 follows from Theorem 1, and correctness of Algorithm 3 can be proved directly by induction on k.
Remark 2.-The complexity of Algorithm 2 is O n2 n (see Proposition B.2). Given that the size of the output is O 2 n , and this is nearly optimal.
We measured the runtimes of our implementation of Algorithm 2 on a laptop with a Core i5 processor (1.60 GHz) and 8 Gb RAM. Generating a single function with 20 variables (the largest number we used in our simulations) takes 4.9 − 5.5 seconds (faster for smaller canalizing depth). On a laptop, our implementation can go up to 24 variables (~2 minutes to generate a function), and then hits memory limits. One can go further by using a lower level language and more careful packing. However, already a Boolean function in 40 variables would require at least 128 Gb of memory.
Remark 3.-We generate a random noncanalizing function as follows. We generate a random Boolean function and test for canalization until we generate a noncanalizing one. Then, we return it. Since canalizing functions are rare [15,Section 5], this algorithm is fast enough for our purposes (see Lemma B.1).

Notation 1.
For a Boolean network f = f 1 , …, f n , let N(f) and S(f) denote the number of the attractors of f and the sum of the sizes of the attractors of f, respectively. We define the average size of an attractor as AS (f): = S(f)/N(f).

Sample Means of N(f) and AS(f).
For every n = 4, …, 20 and every 0 ≤ k ≤ n, we generate random Boolean networks in n variables of canalizing depth k and compute the mean of N(f) and AS (f). Figure 1 shows how these means depend on k for n = 15 (based on 50,000 samples for each k). The shape of the plots is similar for other values of n we did computation for (that is, n = 4, …, 20). Note that although both means are decreasing, the decrease of the mean of AS (f) is more substantial.

ALGORITHM 1:
Finding the sizes of the attractors and their basins. (1) (Network → Graph) Build a directed graph G with 2 n vertices corresponding to possible states and a directed edge from a to f(a) for every a ∈ 0, 1 n .
(2) (Attractors) Perform a depth-first search [18, § 22.3] traversal on G viewed as an undirected graph to detect the unique cycle in each connected component, these cycles are the attractors.
(3) (Basins) For each cycle from Step 2, perform a depth-first search traversal on G with all the edges reversed. The dfs trees will be the basins.
(4) Return the sizes of the attractors and basins found on Steps 2 and 3.

ALGORITHM 2:
Generating a random Boolean function of a given canalizing depth.
In: Nonnegative integers k and n with k ≤ n Out: A Boolean function f in n variables of canalizing depth k such that, for fixed k and n, all possible outputs have the same probability (1) In the notation of Theorem 1, generate the following: (a) random bits b, a 1 , …, a n ∈ 0, 1 ; (b) random subset X ⊂ x 1 , …, x n with |X | = k; (c) random ordered partition X = X 1 ⊔ ⋯ ⊔ X r of X (using Algorithm 2); (d) random noncanalizing function p C ≢0 in variables x 1 , …, x n /X (see Remark 3).
(2) Form a function f x 1 , …, x n using the data generated in Step 1 as in Theorem 1, where M i involves exactly the variables from X i for every 1 ≤ i ≤ r.
(3) If f does not satisfy any of the conditions (E1) or (E2), discard it and run the algorithm again. Otherwise, return f.

ALGORITHM 3:
Generating a random ordered partition of a given finite set.
In: A finite set X with |X | = k Out: An ordered partition X = X 1 ⊔ ⋯ ⊔ X r into nonempty subsets X 1 , …, X r such that, for a fixed X, all possible outputs have the same probability (1) Compute p 0 , …, p k , where p i is the number of ordered partitions of a set of size i, using the recurrence p j = ∑ i = 0 j − 1 j i p j − i , p 0 = 1 (see [19, equation (9)]).
(2) Generate an integer N uniformly at random from 1, p k .
Find the minimum integer j between 1 and k such that ∑ i = 0 j − 1 k i p k − i ≥ N.
(4) Randomly select a subset X 1 ⊂ X of size j.
(6) Return X 1 ⊔ ⋯ ⊔ X r . Figure 2 shows the empirical distributions of N(f) and AS f for n = 12 and k = 0, 1, 3, 12 based on 300,000 samples for each k. From the plot, we can make the following observations:

Distributions of N(f) and AS(f).
i.
The distributions become more concentrated and the peak shifts towards zero when k increases ii.
The distributions for nonzero canalizing depths (especially for larger depths) are much closer to each other that to the distribution for zero canalizing depth. This agrees with the plots on Figure 1.

Relative Decreases.
From Figure 1, we can observe that, for both N(f) and AS f), the sample mean decreases rapidly for small canalizing depths. In order to understand how this decrease behaves for large n, we introduce N k (n): = the sample mean of N (f) for n variables and canalizing depth k the sample mean of N (f) for n variables and canalizing depth 0 . (11) AS k (n) is defined analogously. Figure 3 plots N 1 (n), N 2 (n), N 3 (n), and N n (n) and AS 1 (n), AS 2 (n), AS 3 (n), and AS n (n) as functions of n. From the plots we see that i.
The relative initial decrease from canalizing depth 0 to canalizing depth 1 becomes even more substantial when n increases ii.
The relative decrease from canalizing depth 0 tocanalizing depth 3 is already very close to the relative decrease from depth zero to the maximal depth (i.e., nested canalizing functions)

Theory: The Main Result
We will introduce notation needed to state the main theorem. Let us fix a positive integer ℓ.
For a binary string α ∈ S: = 0, 1 ℓ , we define Then, we define a 2 ℓ × 2 ℓ matrix G ℓ by where we interpret numbers 1 ≤ a and b ≤ 2 ℓ as binary sequences of length ℓ.

Theorem 2.
Let A ℓ be the limit of the expected number of attractors of length ℓ in a random Boolean network of canalizing depth one (see Definition 6) when the number of variables n goes to infinity. Then, where P G ℓ is the characteristic polynomial of matrix G ℓ introduced above. In particular, we have A 1 = 1,

Remark 4.
The plots below show that the result of Theorem 2 agrees with our simulations (Figure 4).

Remark 5.
As explained in Remark A.1, Theorem 2 stills holds if we replace a random Boolean network of canalizing depth one with a random Boolean network defined by canalizing functions.

Remark 6.
Theorem 2 and Corollary A.1 imply that A ℓ > 1/ℓ for every ℓ > 1. On the other hand, a direct computation shows that the expected number of attractors of length ℓ in a random Boolean network (without any canalization requirements) is 1/ℓ. This is consistent with our observations from Section 6.1.

Conclusions
We conducted computational experiments to investigate the attractor structure of Boolean networks defined by functions of varying canalizing depth. We observed that networks with higher canalizing depth tend to have fewer attractors and the sizes of the attractors decrease dramatically when the canalizing depth increases moderately. As a consequence, the basins tend to grow when the canalizing depth increases. These properties are desirable in many biological applications of Boolean networks, so our results give new indications of the biological utility of Boolean networks defined by functions of positive canalizing depth.
We proved a theoretical result, Theorem 2, which complements the above observation as follows. The theorem implies that a large random Boolean network of canalizing depth one has on average more attractors of small size than a random Boolean network of the same size although it has less attractors in total. This also provides an explanation to the fact that the total size of attractors decreases faster than the number of attractors as the canalizing depth grows.
Furthermore, we observed that all the statistics we computed are almost the same in the case of the maximal possible canalizing depth (so-called nested canalizing Boolean networks) and in the case of moderate canalizing depth. This agrees with the results of Layne et al. [12]. This observation elucidates an interesting and powerful feature of canalization: even a very moderate canalizing influence in a Boolean network has a strong constraining influence on network dynamics. It would be of interest to explore the prevalence of these features in published Boolean network models.
Finally, we provided evidence that the observed phenomena will occur for Boolean networks with larger numbers of state variables.
For every 1 ≤ i < j ≤ ℓ, let G ℓ; i, j be the submatrix of G ℓ with rows and columns having indices from S i, j .
For every ℓ, we have
Since 0 ∈ S i, j the same argument as in the proof of the first part of the lemma shows that 2 ℓ + 2 /2 ℓ + 2 − 1 G ℓ; i, j + C ℓ; i, j T is stochastic and has exactly one of the eigenvalues being equal to 1. □ Corollary A.1.
(A. 9) The main result of [23] implies that, for every complex s × s matrix A, we have F (z) can be rewritten as F (z) = 1 y(z) s P A (1/y(z)) . (A.14) Finding the asymptotic behavior of the Taylor coefficients of F (z) would yield an asymptotic for C(A) n . We will do this using singularity analysis [24, Chapter VI] (similarly to [22, Theorem 2]). Since |y(z) | < 1 for | z | < 1/e (see [21, Figure 1]) and all roots of P A lie in the unit circle due to the stochasticity of A, 1/e is the singularity of F (z) with the smallest absolute value. Due to Lemma A. On the set of all Boolean networks with n states consider two probability distributions:

A.
All the networks with canalizing depth one have the same probability, and all others have probability zero

B.
İe probability assigned to each network is proportional to the product of the number of canalizing variables of the functions defining this network We fix a positive integer ℓ. By A ℓ, n and B ℓ, n we denote the average number of attractors of length ℓ in a random Boolean network with n states with respect to distributions (A) and (B), respectively. Then, We will illustrate the (B) distribution by an example. Consider the following three networks with two states: f 1 = x 1 x 2 + 1, x 1 + x 2 , f 2 = x 1 x 2 , x 1 , f 3 = x 1 x 2 + 1, x 1 x 2 .
(A. 19) Since the canalizing depth of x 1 + x 2 is zero, P B f 1 , the probability of f 1 with respect to B, is zero. Since the canalizing depths of x 1 x 2 and x 1 are 2 and 1, respectively, the ratio P B f 2 /P B f 3 is equal to (2 ⋅ 1/2 ⋅ 2) = (1/2).
Proof.-Let F n and F n * be the number of Boolean functions in n variables with canalizing depth exactly one and more than one, respectively. We will use the following bounds:

1.
F n * ≤ n 2 ⋅ 4 ⋅ 4 ⋅ 2 2 n − 2 : we look term-by-term. There are at most n 2 ways to choose first and second canalizing variables. There are at most 4 choices for the canalizing outputs and at most 4 choices for canalizing values for these two variables. There are at most 2 2 n − 2 core functions, since that is all possible functions, which may or may not be canalizing. Since redundant arrangements of canalizing variables are not accounted for, this must overcount.

2.
F n ≥ 2 2 n − 1 − (n − 1) ⋅ 2 ⋅ 2 ⋅ 2 2 n − 2 : this is a lower bound for the number of noncanalizing core function in n − 1 variables because (n − 1) ⋅ 2 ⋅ 2 ⋅ 2 2 n − 2 is an upper bound on the number of canalizing functions in n − 1 variables (obtained in the same way as the bound above).
We also introduce For X being (A) or (B) and positive integer n, let P X, n denote the probability (it is always the same) of choosing a network from distribution X with all functions being of depth exactly one. Let P n * be the maximal probability of choosing a network from (B) with at least one function being of depth more than one, respectively. By S n and S n * we denote the total number of attractors of length ℓ in networks with all functions being of depth exactly one and with at least one function being of depth more than one, respectively.
The statement of the lemma is equivalent to the statement that lim n ∞ A ℓ, n − B ℓ, n = 0. (A.21) Using the notation introduced above, we can bound A ℓ, n − B ℓ, n as P n, A S n − P n, B S n − P n * S n * ≤ A ℓ, n − B ℓ, n ≤ P n, A S n + P n, B S n .
(A. 22) We set U n : = S n P n, A − P n, B and V n : = P n * S n * . Then, (A.21) would follow from lim n ∞ U n = 0 and lim n ∞ V n = 0, so we will prove these two equalities.
Since any network has at most 2 n attractors of length ℓ, S n ≤ 2 n F n n . Since the total sum of the products of canalizing depths over all Boolean networks does not exceed F n + nF n * n , we have P n, B ≥ 1/ F n + nF n * n . Since P n, A = 1/F n n , we have U n ≤ 2 n F n n 1 F n n − 1 F n + nF n * n = 2 n 1 − By similar arguments, P n * ≤ n n /F n n and S n * ≤ 2 n n F n + F n * n − 1 F n * , so V n ≤ 2 n n n + 1 F n + F n * n − 1 F n * 1 F n n ≤ 2 n n n + 1 1 + R n n − 1 R n . ii. There exists a constant P n, C such that, if the canalizing depth of every f i is one, then P C (f) = P n, C

iii.
We have P C (f)/P n, C ≤ P B (f)/P n, B (using notation from the proof of Lemma A.3) The above properties hold, for example, for the following distribution.
(C) All the networks defined by canalizing functions have the same probability, and all others have probability zero.
Using this distribution instead of (A), we see that Theorem 2 holds also for a random Boolean network defined by canalizing functions.

Lemma A.4.
We will use Notation A.1 and notation from Lemma A.2. Then, for every positive integers ℓ and n, we have C G ℓ n − ∑ 1 ≤ i < j ≤ ℓ C G ℓ; i, j n ≤ ℓB ℓ, n ≤ C G ℓ n .
To prove (A.29), we will use that the functions f i i = 1, …, n in the network are chosen independently to decompose the left-hand side as P X 1 , …, X ℓ form an attractor in this order = ∏ i = 1 n P f i X j = X j + 1, i for every 1 ≤ j ≤ n , where we use notation X n + 1 = X 1 and the probability of each Boolean function to be chosen is assumed to be proportional to the number of its canalizing variables. We show that, for every 1 ≤ i ≤ n, P f i X j = X j + 1, i for every 1 ≤ j ≤ n = ∑ β ∈ S g X i , s(β) n β n . with a uniform probability distribution P Ω . Observe that for a function f with canalizing variables x k 1 , …, x k s , we have P (f) = P Ω f, k 1 + ⋯ + P Ω f, k s . If we can show that, for every 1 ≤ k ≤ n, P Ω f X j = X j + 1, 1 for every 1 ≤ j ≤ n | (f, k) ∈ Ω = g X 1 , s X k , (A.34) then (A.31) would follow by summing up (A.34) over all k and using the law of total probability.
To prove (A.35), consider any j, say j = 1. There are then 4 cases for the values of X 1, k and X 2, 1 : 1. X 1, k = 1 and X 2, 1 is 0 or 1. With probability 1/2, we have f X 1 = X 2, 1 . This is true due to symmetry, as for any f 1 which takes on the value w at X 1 , we can produce another function g that is equal to 0 if X 1, k = 0 and f 1 if X 1, k = 1. Then, g X 1 = w.
The only case in which X 1 ∨ s X k ≠ s X k is where there is at least one j such that case 2 is realized. In this case, the probability in the left-hand side of (A.35) will be zero. Otherwise, each occurrence of case 1 will multiply the total probability by 1/2 and each occurrence of case 3 will multiply the total probability by 1. Thus, we show that the left-hand side of (A.35) is indeed equal to ℎ X 1 , s X k . This finishes the proof of (A.29).
To finish the proof of the lemma, we set U : = n ∈ ℤ ≥ 0 S | ∑ α ∈ S n α = n& the support of n does not belong to Summing (A.29) over all ℓ-tuples X 1 , …, X ℓ of distinct elements of 0, 1 n , we obtain (see (A.7)) ℓB ℓ, n = ∑ n ∈ U G ℓ n n n n ≤ C G ℓ n . (A.37) On the other hand, if n is supported on one some S i, j , then G ℓ n = G ℓ; i, j n S i, j , where n S i, j denotes the restriction of n on the coordinates from S i, j . This implies that C G ℓ n − ℓB ℓ, n ≤ ∑ selecting a subset of size j amounts to selecting and removing j indices). In total, we obtain O k 2 .
The depth of the recursion calls is at most k. Since the complexity of each single call is O k 2 , so the total complexity is O k 3 . □ Lemma B.1.
The average complexity of the algorithm for generating a function in n > 0 variables which is either 1 or noncanalizing described in Remark 3 is O n2 n .
Proof.- [25, p. 116] implies that the proportion of functions which are canalizing in n variables is bounded from above by 4n/2 2n − 1 . Note that [25] considers constant functions canalizing which we do not. Thus, the probability P n of choosing a function which is either 1 or noncanalizing is bounded from above by This bound is less than 3/4 for all values of n except 1 and 2, but we can compute directly that P 1 = 3/4 and P 2 = 13/16. Therefore, the number of times the generation of a function needs to be repeated averages to 1/1 − P n , which does not exceed 4, so the average complexity of the whole procedure is the same as of a single generation step.
The complexity of a single step consists of generating a random function (which is O 2 n ) and checking whether it is canalizing or not. We perform this check by running linearly through the table for each variable, so the complexity is O n2 n time. Thus, the total complexity is indeed O n2 n . □

Lemma B.2.
There is a constant c < 1 such that the probability that a function generated in steps 1 and 2 of Algorithm 2 does not satisfy one of the conditions (E1) or (E2) is bounded by c for every n.

(B.2)
We will show that there is a constant c < 1 such that P ((E1) is false | r ≠ 1) and P (5.1 is false | r = 1) do not exceed c.
i. P ((E1) is false | r ≠ 1): the probability of having k r = 1 (the only possible k r < 2) is just the proportion of ordered partitions with a single element at the end. We can construct all of these by picking an element and then picking a partition of the remaining elements, so this creates k ⋅ p k − 1 possibilities. Thus, the probability of this occurring is kp k − 1 /p k . [19, equation (5)] implies that this approaches ln(2) < 1 as n goes to infinity. Thus, there exists such c.

Proposition B.2.
Complexity of Algorithm 2 is O n2 n .