Maximum mutational robustness in genotype–phenotype maps follows a self-similar blancmange-like curve

Phenotype robustness, defined as the average mutational robustness of all the genotypes that map to a given phenotype, plays a key role in facilitating neutral exploration of novel phenotypic variation by an evolving population. By applying results from coding theory, we prove that the maximum phenotype robustness occurs when genotypes are organized as bricklayer’s graphs, so-called because they resemble the way in which a bricklayer would fill in a Hamming graph. The value of the maximal robustness is given by a fractal continuous everywhere but differentiable nowhere sums-of-digits function from number theory. Interestingly, genotype–phenotype maps for RNA secondary structure and the hydrophobic-polar (HP) model for protein folding can exhibit phenotype robustness that exactly attains this upper bound. By exploiting properties of the sums-of-digits function, we prove a lower bound on the deviation of the maximum robustness of phenotypes with multiple neutral components from the bricklayer’s graph bound, and show that RNA secondary structure phenotypes obey this bound. Finally, we show how robustness changes when phenotypes are coarse-grained and derive a formula and associated bounds for the transition probabilities between such phenotypes.


Introduction
A single genotype is a collection of biological information. It can be encoded by a sequence of DNA or RNA or in a more coarse-grained way, for example, in the weights of a gene regulatory network. A genotype is mapped to a phenotype, which is a biologically observed output, trait or behaviour, via a genotype-phe-Because many mutations are effectively neutral there will typically be many more genotypes than phenotypes; for instance, in the RNA secondary structure GP map, for nucleotide sequences of length L, there are approximately 1.8 L phenotypes while there are exactly 4 L genotypes [5]. The set of all genotypes that map to a given phenotype is called a neutral set, or sometimes also a neutral network. It plays an important role in shaping the way that novel variation arises in evolutionary dynamics. Properties of neutral sets have been extensively studied [1][2][3][4]. A number of key features of neutral sets are shared across GP maps [2,3,19]. For example, the size of the neutral sets for different phenotypes typically vary over many orders of magnitude, with a small fraction of the phenotypes taking up the majority of genotypes. Such phenotype bias can strongly affect evolutionary outcomes [15,18,[26][27][28][29][30][31][32].
Another important shared trait is that neutral sets are typically highly connected by point mutations due to a high average mutational robustness, meaning that they are likely to be fully connected, or percolate. This property hugely enhances the probability that a neutral set can be traversed by single mutational steps, allowing a much larger set of alternative phenotypes to be accessible than one could reach from a single genotype. In this way, enhanced robustness can lead to enhanced evolvability, which is the ability to discover new phenotypes [10,33]. In some cases, the neutral set is split into smaller component networks which are disconnected, for example due to biophysical constraints [8,11,34]. Then it is not the full neutral set, but rather each neutral component that percolates and is easily traversed via point mutations.
The property that we will focus on in this paper is the mutational robustness ρ p of a phenotype p, defined as the average probability that a single character mutation of a genotype mapping to phenotype p does not change the phenotype p. Typically larger neutral sets have higher robustness. For the (3-non-crossing [35]) RNA sequence-tosecondary structure GP map, it has been shown that the distribution of robustness found upon random sampling of sequences accurately predicts the distribution of robustnesses for functional or non-coding RNAs found in nature [36], although for very short strands, naturally occurring RNA are marginally more robust [26]. In other words, for this system, the structure of GP map appears to largely determine the mutational robustness found in nature. Thus studying these more abstract mathematical features of the GP map may directly lead to predictions about naturally occurring phenotypes.
To study ρ p in a mathematically convenient way, we will use the language of graphs. The entire set of k ℓ possible sequences in a GP map with input sequences of length ℓ drawn from an alphabet of k characters is representable as a generalization of a hypercube graph called a Hamming graph H ℓ,k . Each sequence maps onto a vertex, and two vertices are connected by an edge in H ℓ,k only if the corresponding sequences differ by a single character. The neutral set of all genotypes mapping to phenotype p define a vertex set V(G p ) on a vertex-induced subgraph (or neutral set) G p in H ℓ,k . In other words, each vertex represents one of the genotypes in the neutral set. Similarly, the edge set E(G p ) is defined as the set of all edges between vertices in G p , and represents genotypes that are connected by neutral mutations. The robustness can now be defined as the average degree of G p divided by ℓ(k − 1) such that it is normalized to 0 ≤ ρ p ≤ 1. Using the relationship between the average degree and the edge-to-vertex ratio, we have a graph-theoretic definition of robustness ρ(G p ) of the neutral set G p rðG p Þ ¼ 2jEðG p Þj 'ðk À 1ÞjVðG p Þj : ð1:1Þ In other words, this definition of the robustness scales with the average number of edges per vertex, normalized so that it is equal to the fraction of possible mutations in the neutral set that are neutral. Relatively high phenotype robustness is important because high robustness has been shown to facilitate navigability of fitness landscapes [37]. To a good approximation, if the frequency of a phenotype p is above the threshold f p > 1/[ℓ(k − 1)], then the graph G p , or equivalently, the neutral set or neutral component, should percolate [19], resulting in very high robustness. Even under the assumption of random GP pairings, above the percolation threshold, the number of neutral components within a neutral set is expected to drop precipitously, and the size of the largest neutral component is expected to sharply increase [19], both of which suggest a highly connected neutral component and thereby high robustness. A null expectation of the robustness predicts that the probability of a mutation being neutral should scale as r p % f p (random null model), ð1:2Þ where the frequency f p ≡|V(G p )|/k ℓ is equal to the probability of obtaining p when choosing a genotype at random. This scaling holds when genotypes are completely uncorrelated as they would be in a random null model [19] where genotypes are randomly attributed to phenotypes under the constraint that the neutral set sizes are kept fixed. For the typically strong phenotype bias observed in GP maps, this scaling implies that many phenotypes will have a robustness that is too small to allow neutral set graphs to percolate. However, empirical studies of robustness have consistently shown a log-linear relationship with the frequency of a phenotype for a wide range of GP maps in biology, as well as for similar input-output maps in computer science and physics [2,3,[16][17][18][19][20][21][22][23][24][25]. The empirically measured robustness is orders of magnitude higher than what is predicted by the random null model. There must therefore be strong correlations between the genotypes mapping to a particular phenotype [19]. One important biological consequence of this empirically measured higher robustness is that, typically, a large fraction of all neutral set graphs in a GP map should percolate, leading to enhanced evolvability. These empirical results also raise an interesting question that will be the main focus of this paper, namely What is the upper bound on robustness, and how close are physical GP maps to this bound? Here, we prove, by applying concepts from coding theory harking back to the 1960s [38,39], that a particular type of graph called bricklayer's graphs [40] are maximally robust.
We then derive explicit expressions for the maximum robustness of bricklayer's graphs, which by extension gives a maximum on the possible robustness of neutral sets.
These expressions are closely related to the well-known sumsof-digits function from number theory. The dominant term scales as 1 + log k f p /ℓ, which resembles the empirical scaling of equation (1.3). Next, we numerically confirm that many individual neutral components for the RNA sequence-tosecondary structure GP maps and the hydrophobic-polar (HP) protein folding GP map introduced by Dill [41] can attain the bricklayer's graph bound exactly.
We prove that if a full neutral set is made up of several neutral components, then the maximal attainable robustness will be below the maximum given by a bricklayer's graph. By deriving a new property of the sums-of-digits function, we are able to calculate a lower bound on the robustness of a neutral set made up of multiple connected neutral components. We show numerically that the RNA GP map obeys this lower bound.
Lastly, we consider the coarse-graining of phenotypes. We show how robustness and transition probabilities change when multiple phenotypes are combined into compound phenotypes, and use our equations to explain some results about the robustness of these compound phenotypes.

Bricklayer's graphs have maximum robustness
The term 'bricklayer's graph' was coined by Reeves et al. [40] in an interesting paper where they studied the principal eigenvalue of the subgraph, which is a different measure of phenotype robustness because it also takes into account population structure at steady state [42]. The name arises because they are constructed by repeatedly adding an adjacent vertex in the Hamming graph in a manner resembling the process of laying bricks. While these particular graphs have a much older origin in coding theory, we will use this more recent nomenclature throughout this work. Bricklayer's graphs are formally defined as follows: Definition 2.1. Consider a Hamming graph H ℓ,k in which all k ℓ vertices are labelled with integers 0 to k ℓ − 1 according to each integer's base-k representation, and two vertices are connected by an edge if the base-k representations differ by exactly one character. A bricklayer's graph G n,k is an induced subgraph of a Hamming graph H ℓ,k containing the first |V(G n,k )| = n vertices, labelled from {0, 1, …, n − 1}.
From our definition of the robustness (1.1), it follows that maximizing robustness for a given number of vertices V(G) (the size of graph G that represents the neutral set) is equivalent to maximizing the number of edges of the equivalent graph. Finding the subgraph G with a fixed number of vertices of a Hamming graph H ℓ,k that maximizes the number of edges (and robustness) is equivalent to minimizing the 'edge boundary' of the subgraph, i.e. minimizing the number of edges {u, v} which connect a subgraph vertex u ∈ V(G) to a vertex outside the subgraph v [ VðH ',k n GÞ. This is known as the 'edgeisoperimetric problem' for the Hamming graph. This problem is connected to coding theory because, for example, it was effectively proven in [38,39] that if the vertices of a bricklayer's graph (which they referred to as a vertex numbering which maximizes 'connectedness', similar to robustness) represent the set of k-ary sequences that map to a particular codeword being transmitted, then this set of sequences minimizes the number of single-site mutations that would cause an incorrect transmission of the codeword, averaged over all the sequences mapping to that codeword. This is akin to maximizing mutational robustness, which measures the average number of point mutations that do not change the phenotype, in biological systems. More specifically, Harper [38] showed that bricklayer's graphs attain the maximum bound for the k = 2 case, and Graham [43] and Hart [44] calculated the exact value of the bound for k = 2, namely |E(G)| ≤ S 2 (n), where S k ðnÞ ¼ P nÀ1 i¼0 s k ðiÞ is formulated in terms of s k (i), the sum of all digits in the base-k representation of the integer i. We will call S k (n) the sums-of-digits function.
Importantly for this study, Lindsey [39] generalized the work of Harper [38] to prove that bricklayer's graphs (not necessarily uniquely) attain the maximum bound for all k ≥ 2 although he did not calculate the value of the bound. No other graph can have a robustness higher than a bricklayer's graph, although some graphs may attain the same robustness.

Exact robustness/number of edges
To exactly calculate the maximum robustness, we first need to prove a theorem about the maximum number of edges of a bricklayer's graph.
Theorem 3.1. A bricklayer's graph G n,k (V, E) with n vertices has jEj ¼ S k ðnÞ ¼ P nÀ1 i¼0 s k ðiÞ edges, where s k (i) is the sum of all digits in the base-k representation of the integer i, and S k (n) is the sumsof-digits function.

Proof. See appendix A. ▪
This theorem generalizes the proof by Graham [43] and Hart [44] for all k ≥ 2, and improves on the bound given by Squier et al. [45]. A graphical example of the theorem and the relationship between the coding/graph theory and number theory perspectives are shown in figure 1.
To calculate the upper bound on robustness, we therefore need to work out the properties of the sums-of-digits function. As far back as 1940, Bush [46] already showed that the asymptotic behaviour (for large n) of the sums-of-digits function scaled as S k (n) ∼ (n/2)(k − 1)log k n. An exact analytical form for k = 2 was given by Trollope in 1968 [47] and later generalized by Delange in 1975 for all k [48] as  [50]. The fractal Takagi function is sometimes called the Blancmange function, because it resembles the blancmange dessert. It has many applications, including but not limited to mathematical analysis, probability theory and number theory [51]. The general sums-of-digits function has also interesting connections to many fields, and in particular to number theory. For example, Delange [48] famously showed that the Fourier series coefficients {c n } of g k (k {x}−1 ) (which is periodic in x with a period of one) are defined by which are linked to ζ, the Riemann zeta function. Combining our expression for phenotype robustness (1.1) with theorem 3.1 and equation (3.1) provides an expression for the maximum phenotype robustness for a neutral set of size n p = |V(G p )| This upper bound is optimal because we can always construct a bricklayer's graph with n p vertices. The maximum robustness is plotted in figure 2, and exhibits the expected blancmange-like self-similar form.
Since the frequency f p = n p /k ℓ , the first term reproduces the empirically observed scaling of equation (1.3), namely ρ p ≈ ℓ −1 log k n p = 1 + log k ( f p )/ℓ, which upperbounds the exact robustness in figure 2.  [40] representation of the (top, right) neutral network with maximal robustness for n = 5 vertices. (Bottom) Table showing the maximal number of edges for different numbers of vertices, together with an example of a bricklayer graph. In theorem 3.1, |E| = S k (n) for k = 2, making a direct connection between number theory and the maximal number of edges, which is calculated by adding up the digits of the base-k representations of the integers 0 to n − 1, which are listed in the white rows of the table. bricklayer's graphs upper bound blancmange-like curve lower bound Figure 2. A linear-log plot of the bricklayer's graphs' robustness r max p for n vertices, given by equation (3.6), versus frequency (number of vertices n divided by k ℓ ), where ℓ = 6 and k = 2. Each blue dot denotes a possible neutral set size. The green line denotes the continuous everywhere but differentiable nowhere 'blancmange-like curve' (here k = 2, so one component of this line is exactly equivalent to the Tagaki curve [50]) that is given by the continuous n p version of equation (3.6), corresponding to r max p . The upper and lower bounds on r max p , given by equation (3.8), are also plotted. The upper bound is equivalent to the simple form ρ p = ℓ −1 log k n p = 1 + log k ( f p )/ℓ. Plots like this, containing the exact maximum robustness as well as the upper and lower bounds, can be generated with our free, open-source web tool RoBound Calculator [52]. royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 the bound A k S k ðnÞ n À k À 1 2 log k n 0, ð3:7Þ

Bounds on neutral set robustness
in terms of a constant A k that only depends on the alphabet size k. This implies that the robustness of a bricklayer's graph, or equivalently the maximum robustness of a neutral set of size n from equation (3.6) is bounded below and above by In other words, the empirical scaling (1.3) is a strict upper bound to the robustness, and it also provides a lower bound up to terms that scale with Oð1='Þ. We have created a free, open-source web tool called RoBound Calculator using Google Colaboratory which can generate, for specified k and ℓ, exact values and continuous interpolations of the bricklayer's graph robustness r max n , as well as the upper and lower bounds from equation (3.8). RoBound Calculator is available at [52], and several example plots from this tool are provided in appendix C.
There is no short formula to calculate A k , but Galkin & Galkina [49] have found an exact, though fairly involved, algorithm to determine A k . For k = 2, A 2 = log 4 . Additional biologically relevant values of k can be calculated using their algorithm. For instance, proteins have k = 20 amino acids comprising their primary sequence, and we can use the algorithm to find that A 20 ¼ ð19 logð84=23Þ=logð400ÞÞ À 355 46 % À3:6097. Moreover, as k → ∞, Finally, note that the correction term 2A k /((k − 1)ℓ) that appears on the left-hand side of the equation (3.8) is typically quite small compared with the scale of the typical values of ρ p found for neutral components of these systems. For instance, this correction term has the value −0.0283 for RNA12, and −0.0226 for RNA15. Equation (3.8) is therefore typically a tight bound (see also figure 2). Thus, the maximum robustness can be quite reasonably approximated by simple r max

Neutral components of biological genotypephenotype maps can attain the bricklayer's graph bound
We next turn to the question of how close the neutral components of physical GP maps are to the upper bound (3.6).
We study the RNA (length 12 and 15) secondary structure models with all four nucleotides (ACUG) as well as the HP protein folding models (length 24, and 5 × 5 lattice). The RNA and HP simulation data were obtained from prior work [19]. A neutral set for a phenotype may not fully percolate, but it can be broken into multiple neutral components of various sizes that do percolate [19] (though some phenotypes may have only a single component). By definition, a neutral component has no single mutation connections to any other neutral component of that same phenotype's neutral set.
In figure 3, the robustness values of each neutral component in the RNA12, RNA15, HP24 and HP5 × 5 models are plotted against the logarithm of the number of vertices in that neutral component. For all of the systems studied, many neutral components over several orders of magnitude of phenotype frequencies f p exactly attain the same robustness as bricklayer's graphs. Typically, this is more common for smaller than for larger neutral components. In the HP model GP maps, the size of the largest neutral components that still reach the bricklayer's graph line have fewer vertices than the same for RNA secondary structure GP maps. This is probably due to the architecture of the GP maps themselves; it has been shown recently [53] that neutral components are often modular in that they consist of highly packed clusters of vertices which are then connected to other clusters by a smaller set of linking vertices. We speculate that in the HP model GP map there may be higher modularity, leading to less-than-maximally robust neutral components above a lower threshold. It is also worth mentioning that the RNA and HP systems examined here are fairly small in length because it is computationally quite expensive to exhaustively check the neutral components for much larger sequence lengths. Figure 3 also depicts the minimum robustness of each fully connected neutral component G i The above formula follows from the fact that a neutral component, by definition, is connected, and the minimum number of edges in a connected graph with n vertices is n − 1. This minimum value can be attained by many graphs, including the path graph P n and star graph K 1,n . This is also the robustness that individual phenotypes/outputs have for the one-dimensional Edwards-Anderson spin glass input-output map [25]. Note that the null-model for robustness, ρ p ≈ f p includes many disconnected components, and so can be much lower than equation (4.1), which holds for a fully connected component. The RoBound Calculator tool [52] we have introduced also calculates and plots the ρ min curve as well.
The strict upper bound (3.6) is close to empirical scaling (1.3) observed for many GP maps, and quite far from the naive scaling ρ p ≈ f p . Since the actual robustness cannot be higher than the bound, this suggests that, at least on the scale given by f p , physical GP maps exhibit a robustness that is close to the maximum value attainable. This empirical observation raises interesting theoretical questions because the upper bound is not imposed by the specific properties of an individual GP map (biological or not); rather, the robustness is bounded above due to very general mathematical properties of the Hamming graph underlying the GP map.

Robustness of full neutral sets and the bricklayer's bound
A bricklayer's graph is maximally dense in its edge-to-vertex ratio; it must certainly be fully connected. Therefore, if a neutral set is not fully connected, and thus broken down into neutral components, then the full phenotype robustness ρ p must deviate from this optimum, even if the robustness of each of its components attains the optimal bound.
royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 To calculate a bound on how much phenotypes broken down into neutral components deviate from the optimal robustness, we consider a neutral set that has n vertices and is split into m neutral components. If each neutral component is maximally robust (as many of the RNA/HP neutral components are), then each robustness would be where n ¼ P m i¼1 n i , with n i being the number of vertices in the ith neutral component. In other words, the total robustness ρ p for this specific case of a neutral set made up of bricklayer's graph components is simply a frequency-weighted sum of the robustnesses of the individual neutral components where the last inequality simply follows from the fact that r max p is the strict upper bound on robustness, which can only be obtained when the entire neutral set only has one single component.
To calculate a more accurate bound, we first prove an interesting property of the sums-of-digits function S k (n), generalizing the proof by Graham [43], who proved the following for k = 2, which we now prove for general k Theorem 5.1. For k non-negative integers {n 1 , n 2 , …, n k } obeying 0 ≤ n 1 ≤ n 2 ≤ · · · ≤ n k , the following property of the sums-of-digits function holds: Proof. See appendix A. ▪ Theorem 5.1 is not only an interesting property of the sumsof-digits function which may be useful in coding theory, but it also can be used to provide, for the specific case that there are exactly k neutral components, a tight bound on the difference between the maximum phenotype robustness for a fully connected neutral set of size n, given by equation   .2), is plotted as the red line. Note that for these small neutral components, the minimum component robustness exceeds the random null expectation since, by definition, the neutral component consists of vertices which are a connected component. Such a collection of vertices will naturally have more edges than the average collection of vertices randomly selected to form a neutral network in the random null model. The data for these curves can be generated by using the RoBound Calculator [52] tool we have introduced. The natural GP maps all contain neutral components which attain the bricklayer's graph bound (as well as some very low robustness components that attain the minimum bound). The unfolded (trivial) phenotype is omitted from each of these plots. The minimum robustness appears to be larger than the bricklayer's graph line for low frequencies; but, this only happens for non-integer values of the number of vertices. Of course, any graph will have an integer number of vertices; in all of those cases, the bricklayer's graph robustness will be greater than or equal to the minimum robustness.
royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 (6.1) and r k p ¼ ð1=nÞ P k i¼1 n i rðG n i ,k Þ, the robustness of a neutral set consisting of k bricklayer's graphs as neutral components. A further generalization and discussion of this formula is provided in §6.
The assumption that the neutral components are bricklayer's graphs is not necessary, however. Weakening this assumption to assume that each neutral component has an arbitrary topology simply weakens the tightness of the bound. For an arbitrary neutral component A n i with n i vertices, It is important to note that this inequality has been proven to hold only when the number of neutral components is less than or equal to k. Indeed, it does hold perfectly for the biological RNA neutral component/phenotype robustness data from our dataset from [19]. In figure 4, each plot point represents a phenotype, and the vertical axis coordinate is given by the log of the left-hand side of equation (5.5), which is (the log of) the difference in the optimal number of edges and the actual number of edges for that phenotype. The horizontal axis coordinate is given by the log of the righthand side of equation (5.5), which is a theoretical bound computed from the frequencies of the neutral components for that phenotype. The figure 4a shows actual RNA12 phenotype data, and figure 4b RNA 15 phenotype data. Green plot points have k = 4 or fewer neutral components; it is for these phenotypes that the theoretical bound rigorously holds. The biological data support the theory if all green plot points are above or on the dashed 1 : 1 diagonal line, as this would indicate that the inequality is valid; indeed that is the case.
Most of the phenotypes in the RNA12 and RNA15 secondary structure GP maps do not have ≤k = 4 neutral components, however. Also in figure 4, we have plotted, in magenta, the values of the left-and right-hand side of equation (5.5) for phenotypes with >k = 4. The theoretical bound seems to hold even for most of the cases outside the range for which it is proven, but it begins to fail for sufficiently large phenotypes.
Here, we discussed the process by which many neutral components are combined into one larger phenotype; these neutral components are not connected to each other (by definition), so the phenotype robustness is simply a frequencyweighted average of the component robustnesses that will be necessarily lower than the maximum possible achievable robustness for a single-component phenotype. With this being said, in practice, an evolving population will typically be confined to a component, as double neutral mutations are typically quite rare. Therefore, even though the robustness of the phenotype is lower, the robustness experienced in shorter time scales by the population may often be closer to the bricklayer's graph bound.

Robustness of coarse-grained outputs/ phenotypes
In discussing neutral set and neutral component topologies for natural systems, one should keep in mind that the definition of a phenotype can vary depending on which biological property one is interested in. Working out exactly how to map genotypes to phenotypes in biological systems can be difficult due to questions of how a phenotype is defined. For RNA structures, for example, one may be interested in a specific secondary structure, as we are here, or in a more restricted phenotype, such as a certain tertiary structure. Conversely, one may be interested in a broader class of secondary structures or in a different property, such as a certain catalytic function. It is therefore interesting to ask how robustness changes if one zeroes in on different levels of description of a phenotype.
In this section, we will study the robustness of a phenotype and transition probabilities between phenotypes that have been generated from the union of multiple neutral sets. We refer to this process of merging neutral sets of different phenotypes as 'coarse-graining' of phenotypes. Tackling this question requires a more generalized approach than was used in the previous section because different phenotypes typically have non-zero transition probabilities between each  . Plot of left-hand side (ordinate) and right-hand side (abscissa) of equation (5.5) for physical phenotype data for RNA12 and RNA15 from our dataset from [19]. Green plot points represent phenotypes with ≤k = 4 neutral components; the theoretical bound in equation (5.5) rigorously holds for these phenotypes. Magenta plot points have greater than 4 neutral components; despite the fact that the bound should not rigorously hold for such phenotypes, it still does seem to approximately hold for many phenotypes, or at least many plot points lie close to the dashed line.
We illustrate our approach for RNA secondary structure using a coarse-graining method from Giegerich et al. [54], who defined a set of new 'abstract shape' RNA secondary structures which ignore fine details of the stem and loop lengths and nesting. There may, in fact, be good biological reasons for doing this coarse-graining if one thinks that function or structure is not that sensitive to small changes in the full secondary structure. Another reason for our interest in this coarse-graining scheme comes from the work of Dingle et al. [31], who showed that the frequency with which noncoding or functional RNA abstract structure appears in the Rfam database [55,56] of non-coding or functional RNA could be remarkably well predicted over five orders of magnitude by the frequency by which these structures appear upon random sampling of sequences. This phenomenon illustrates how biases in the arrival of variation, which are mediated through the GP map, can dramatically affect evolutionary outcomes [15,26].
In coarse-grained GP maps, the value of the robustness is determined by the level of coarse-graining as well as by the underlying neutral set topologies. To study this problem, we will first derive analytic formulae which follow from the underlying graph theory. We then present numerical results on coarse-grained RNA GP maps and explain qualitative changes in robustness with phenotype coarse-graining. We conclude by deriving a critical transition probability that would be needed for two coarsegrained phenotypes to maintain 'high' robustness when coarse-grained together.

Robustness and transition probabilities for coarse-grained phenotypes
As before, we consider a GP map whose genotypes consist of sequences of length ℓ drawn from an alphabet of k characters. The genotype space is the Hamming graph H ℓ,k , and phenotype neutral sets are induced subgraphs of H ℓ,k . The ith phenotype's neutral set G i (assuming 1 ≤ i ≤ N p , where N p is the total number of phenotypes) is an induced subgraph of H ℓ,k . Once again, we let V(G) denote the vertex set of a graph G, E(G) denote the edge set of G, and we additionally define E(G i , G j ) = E H (G j , G i ) to denote the set of edges induced in graph H ℓ,k by union VðG i Þ < VðG j Þ which are neither elements of E(G i ) nor E(G j ), where we have taken both G i and G j for i ≠ j to be induced subgraphs of H ℓ,k . Precisely, As we have defined before, the robustness of the ith phenotype is defined as In other words it is proportional to the ratio of edges to vertices in the neutral set graph. The transition probability that a single point mutation in the genotype leads to a change from phenotype i to phenotype j is defined by which is again the average number of links per node between the two distinct neutral sets i and j. Note that ϕ ji |V(G i )| = ϕ ij-|V(G j )|. If we define the diagonal terms ϕ ii ≡ ρ i then there needs to be an additional prefactor of 2 since the connections are between nodes of the same neutral set, and so must be counted twice.

Robustness of coarse-grained phenotypes
We first derive a general formula for robustness of coarsegrained phenotypes. Let S be the set of phenotype indices that indicate which phenotypes are being coarse-grained into a new neutral set G S . The vertex set of G S is the union of all vertices The edge set of G S includes all edges in each individual neutral set as well as the edges joining the neutral sets where (r, s) ∈ S denotes an ordered pair of elements of S such that r > s. It follows that the robustness ρ S of coarse-grained phenotype S is ð6:5Þ Using equation (6.1) and the normalized phenotype frequency f i = |V(G i )|/k ℓ , we can rewrite the coarse-grained robustness in terms of familiar biological parameters where in the last step we have used ϕ ss = ρ s . It is easy to check that, if S = {1, 2, …, N p } is the set of all phenotypes, then P 1 s Np f rs f s ¼ f r (given the definition of ϕ rs ), so ρ S = 1 as expected. The intuition behind equation (6.6) is that the robustness of the coarse-grained phenotype includes terms that come from the frequency-weighted sum of the robustnesses of the original phenotypes, as was the case for combining neutral components in equation (5.2), plus a term that adds in the contribution from transition probabilities between the combined phenotypes. If there are many transitions between them, then the combined robustness will be higher because these are now also classed as an additional contribution to robustness.

Transition probabilities between coarse-grained phenotypes
We now calculate a general formula for transition probabilities between coarse-grained phenotypes. Let S and T be two non-overlapping sets of phenotype indices that indicate which phenotypes are being coarse-grained into two coarse-grained neutral sets G S and G T , respectively. The set of edges E(G S , G T ) joining G S and G T is the union of all sets of edges that adjoin every pair of (non-coarse-grained) phenotypes, where within each pair one element is picked from the constituent phenotypes of S and the other is picked from constituent phenotypes of T. It follows that royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 It now follows that the transition probability ϕ TS from the Sth coarse-grained phenotype to the Tth coarse-grained phenotype (assuming S and T have no overlap) is We can now see from equations (6.6) and (6.8) that the coarsegraining procedure takes on the same functional form, which is represented graphically in figure 5. The process of coarse-graining neutral components into a phenotype neutral set, handled in §5, is a specific case of the more general process of coarse-graining phenotype neutral sets, but in the former case, all transition probabilities between neutral components ϕ ji = 0 since neutral components are not connected to each other, by definition. We now examine coarse-graining in the RNA secondary structure GP map. The RNASHAPES program [57] merges RNA phenotypes at different 'levels' of coarse-graining by progressively ignoring more and more layers of detail in the stem and loop lengths and nesting structure of the folded RNA oligonucleotides. The 'dot-bracket' structure is the actual RNA secondary structure phenotype, obtained from the ViennaRNA program [58]. The dot-bracket structure is then provided to the RNASHAPES program to produce the coarse-grained structures. Level 1 of coarse-graining ignores some details of the dot-bracket structure and combines similar phenotypes into the same abstract phenotype; the Level 2 structures include further coarse-graining, and so forth. There are five possible levels of coarse-graining. A schematic of the RNA coarse-graining process is shown in figure 6. In figure 7, we present results in which RNA secondary structure GP maps have robustness values calculated for these various levels of coarse-graining, performed using the RNASHAPES tool.
Even though the systems in figure 7 are too small to show many Level 4 or 5 coarse-grained phenotypes, the overall trends are visible. The dot-bracket structures appear to be closest to the bricklayer's graph maximum robustness curve. At the highest levels of coarse-graining (Level 4/5), abstract phenotypes are so densely packed with dot-bracket phenotypes that a substantial portion of the Hamming graph sequence space is covered by only a small number of abstract phenotypes. This leads to a percolation-like phenomenon that allows for highly coarse-grained, large-frequency phenotypes having high robustness as would be intuitively expected.
At lower levels of coarse-graining, however, we see a trend that we did not at first expect. It seemed to us reasonable to assume that coarse-graining dot-bracket phenotypes together, which increases the frequency, would simply 'push' the robustness parallel to the diagonal logarithm line. However, the data show that coarse-grained phenotypes with sufficiently small frequencies deviate further from the maximal possible robustness (the bricklayer's graph bound) than the phenotypes that comprise them. This is because the transition probabilities between these phenotypes being coarse-grained are probably too low to provide adequate increase in robustness after coarse-graining. Note that, in contrast to robustness, transition probabilities are expected to scale as ϕ ij ≈ f j for RNA [15], as well as some other models [19] such as the HP model [41] and the polyomino model for protein quaternary structure [12,59]. In other words, the scaling of the transition probability versus frequency is no different from what would be expected from random assignment of GP pairs, so that these values are typically much lower than the robustness.

Critical threshold for the coarse-graining of phenotypes with high robustness
We now consider the example of coarse-graining two phenotypes and ask how much the transition probability between those phenotypes should be in order to keep them along the same diagonal robustness line parallel to the bricklayer's graph bound. Recall that a phenotype's neutral set G i which contains n vertices has at most |E(G i )| = S k (n) edges, where S k (n) is once again the sums-of-digits function. We know that asymptotically S k (n) ∼ (n/2)log k n, and a reasonable approximation to the maximum robustness is In the high-robustness asymptotic assumption, employed below, we assume that the robustness is near this bound and approximate it as ρ i ≈ 1 + ℓ −1 log k f i . For two phenotypes p and q that are being coarse-grained into a new phenotype S, we can use equation (6.6) to show that Let us assume that the two phenotypes have robustness values that are displaced from the (asymptotic) optimal robustness curve by amounts Δ p and Δ q Substituting these approximations into equation (6.10), we have We would intuitively expect two very robust phenotypes p and q that have dense connections to each other (i.e. relatively high values of ϕ qp f p = ϕ pq f q ) to deviate from the optimal robustness curve by Δ p and Δ q . That is to say, we expect the robustness of the coarse-grained phenotype S to be bounded φ ts coarse-graining 6 s S 6 t T φ ts f s φ TS φ TS 6 s S f s Figure 5. Schematic diagram of phenotype coarse-graining on the transition matrix ϕ ts → ϕ TS , which includes transition probabilities (off-diagonals) and robustness (diagonals). If two non-overlapping sets of original phenotypes are coarse-grained into two new coarse-grained phenotypes T and S, then the transition probability from coarse-grained phenotype S to coarse-grained phenotype T is given by ϕ TS , calculated in equation (6.8), which involves taking a frequency-weighted sum over the transition probabilities ϕ ts between the original non-coarse-grained phenotypes that comprise T and S.
royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 above by The following inequalities provide bounds on the transition probability ϕ qp in order for r l S r S r u S to be satisfied.
In the case of RNA12, Levels 4 and 5 are identical because the Level 4 phenotypes are already coarse-grained as much as possible. Also plotted are the bricklayer's bound indicating the maximum possible robustness, the null model robustness (equation (1.2)), and the minimum robustness for a phenotype that contains only one component (equation (4.1)); this would be the robustness of a star graph [25].
royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 required to obtain a coarse-grained robustness that is no less than the max possible robustness minus max (Δ p , Δ q ). Four hundred and twenty-three transition probabilities fall within the bounds derived above. The remaining 43 transition probabilities overshoot the upper bound, thereby bringing the coarse-grained phenotype closer to the logarithmic upper bound. Moreover, 42 of those 43 pairs of phenotypes involve the non-folding phenotype, which is the highest occurring phenotype, with frequency 0.854, since most sequences in the RNA12 GP map do not have a folded structure as the minimum free energy structure. The frequency of 0.854 is higher than the percolation threshold of 1/[ℓ(k − 1)] = 1/36 ≈ 0.028, above which the neutral set contains a large component on the order of the system size. It is also higher than the giant component threshold 1 − k −(1/(k−1)) = 1 − 2 −2/3 ≈ 0.37, above which the components of the neutral network coalesce into one large giant component that is (nearly) fully connected [19]. A giant component which occupies most of genotype space is more likely to share many edges with neutral networks of other phenotypes; as a result, the transition probabilities tend to be high, and coarse-graining another neutral network into the giant component simply forms an even larger, more robust giant component in most cases. Figure 8, along with the bounds, shows colour-coded plots of the transition probabilities which undershoot, overshoot and fall within the bounds. It should be noted that, from a physical standpoint, it makes little sense to coarse-grain the non-folding phenotype along with a folding phenotype unless one were constructing, for instance, a coarse-grained 'non-functional' phenotype which incorporated phenotypes with low biological prevalence of functionality. Here, we entertain the above discussion only to offer physical intuition for why some transition probabilities cross the upper bound calculated earlier. In general, we expect that the intuition and theoretical insight gained from our formulation will be useful in the study of coarse-grained phenotypes, which is an active area in the field of GP maps [31,60].

Discussion
In this paper, we investigate maximally robust neutral sets known as bricklayer's graphs. By applying concepts from coding theory as well as results from number theory on the sums-of-digits function, we analytically calculate the maximum phenotype robustness of a biological neutral set, a quantity which plays an important role in GP maps. We used numerical simulation to show, for the RNA sequence-to-secondary structure GP map and for the HP map for protein folding, that many neutral components have robustness that is near or achieves the upper bound.
We then derived a new property of sums of digits and used it to calculate a lower bound on the deviation of the robustness from the maximum bound when a neutral set is made up of independent neutral components. Similar bounds for the deviation from the maximum robustness that occur when phenotypes are coarse-grained together were also derived. By coarse-graining RNA secondary structures into abstract shapes [54], we demonstrated that our bounds provide intuition for trends observed in the behaviour of the robustness of coarse-grained abstract RNA shapes.
It remains an open question as to why GP maps generically have such high robustness. There are heuristic arguments based on constrained and unconstrained sites that help point in this direction [17,20,61] for systems such as RNA, and it would be interesting to explore how they link to our graph-theoretical approach. The bricklayer's graph involves genotype networks that have precisely constrained genetic sites by construction. It is therefore perhaps surprising to see that many neutral components identified in §4 achieve the exact bricklayer's bound, though it is possible that some of these graphs may not exactly be bricklayer's graphs, but rather other small graphs that obtain the same bound. While combined phenotypes and neutral sets made up of royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 multiple neutral components cannot exactly achieve this bound, it remains the case that their robustness is much closer (on a log scale) to this bricklayer's bound than it is to a random-null model of uncorrelated phenotypes. Important ideas for future work include studying other GP maps, to see how close their robustness is to our theoretical maximum. In naturally occurring functional RNA, the mutational robustness is very close to that predicted by random sampling of genotypes for the GP map [26,36], which provides a neat example of detailed mathematical structure of the GP map being reflected in the living world. It would be extremely interesting to see if other biological systems exhibit mutational robustness that can also be predicted in this way. Arguments from algorithmic information theory suggest that biological GP maps and other input-output maps share common underlying principles of organization [62]. This begs the question of how close the robustnesses of non-biological systems such as spin glasses [25], quantum circuits [63] and linear genetic programs [24] are to the maximum robustness we calculate here.
Another interesting direction of future work would be to better understand the spectral properties of bricklayer's graphs. These may provide insight into population distributions and average robustness on long evolutionary time scales [40,42], allowing the exploration of relationships between mutational robustness and spectral properties.
Data accessibility. We have introduced the web tool RoBound Calculator, a Google Colaboratory notebook which can generate, for specified ℓ and k, a continuous interpolation of the maximum robustness curve, tight upper and lower bounds on the maximum robustness curve, the exact robustnesses of bricklayer's graphs comprising 1 to k ℓ genotypes, the random null expectation of robustness and the minimum robustness curve for a single neutral component. The RoBound Calculator is available free of charge, with open-source code at the GitHub link in [52]. The data are available from the Dryad Digital Repository: https://datadryad.org/stash/dataset/ doi:10.5061/dryad.sj3tx969f.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration. We declare we have no competing interests. Acknowledgements. The authors thank Nora Martin and Akshay Jaggi for helpful discussions.
Appendix A. Proofs of main text theorems Theorem 3.1. A bricklayer's graph G n,k (V, E) with n vertices has jEj ¼ S k ðnÞ ¼ P nÀ1 i¼0 s k ðiÞ edges, where s k (i) is the sum of all digits in the base-k representation of the integer i, and S k (n) is the sumsof-digits function.
Proof. To see this, let ℓ be the length of the input sequence so ℓ ≥ log k n, and let (x ℓ−1 (n), …, x 0 (n)) be the vector of integers containing the digits of the base-k representation of the integer n such that n ¼ P 'À1 i¼0 x i ðnÞk i . Consider the bricklayer's graph G n−1,k . When we add one more vertex such that G nÀ1,k 7 ! G n,k , we look at the base-k representation of n. An edge can be added if the base-k representation of n differs from the base-k representation of the neighbouring vertex by exactly one digit. Going through digit by digit, we see that the only allowed flips for the ith digits are x i ðnÞ 7 ! f0, . . . , x i ðnÞ À 1g. This set has cardinality x i (n). Summing this over all digits, we find that the number of edges added to the graph G n−1,k when adding an additional vertex n is the sum of digits of n in the base-k representation s k (n). Therefore, the total number of edges in G n,k is S k ðnÞ ¼ P nÀ1 i¼0 s k ðiÞ. ▪ Theorem 5.1. For k non-negative integers {n 1 , n 2 , …, n k } obeying 0 ≤ n 1 ≤ n 2 ≤ · · · ≤ n k , the following property of the sums-of-digits function holds: Proof. Let n ¼ P k i¼1 n i , and choose ℓ such that n k ≤ k ℓ . We must necessarily have that n ≤ k n k ≤ k ℓ+1 . Consider the Hamming graph H ℓ+1,k . Since H 'þ1,k ¼ H ',k K k , we can decompose the edge set of H ℓ+1,k into EðH 'þ1,k Þ ¼ where H ðiÞ ',k is the Hamming graph consisting of the base-k representations of the integers whose (arbitrarily) first digit is i − 1, and EðG 1 , Þg is the set of edges in H ℓ+1,k which join two subgraphs G 1 and G 2 . Note that for 1 ≤ i ≤ k, we can construct a bricklayer's graph G n i ,k with n i vertices that is a subgraph of the Hamming graph H ðiÞ ',k . By theorem 3.1, each bricklayer's graph has size (number of edges) equal to S k (n i ). Let us assume that each bricklayer's graph has been constructed starting on the vertex such that its index's first digit is i − 1, and all other digits are 0. This ensures that the jEðG n i ,k , G n j ,k Þj ¼ n i for i < j is maximal. The total number of edges in this graph G-i.e. the subgraph G of the Hamming graph H ℓ+1,k induced by the vertex set VðGÞ ¼ S k i¼1 VðG n i ,k Þ-is given by the contributions from within the bricklayer's graphs and the connections between them royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 20: 20230169 However, we also know that G has |V(G)| = n and, by theorem 3.1, size |E(G)| ≤ S k (n). Therefore, ðk À iÞn i S k ðnÞ, ðA 3Þ and this completes the proof. ▪

Appendix B. Derivation of bounds on transition probability for coarse-graining of robustness
Starting with equation (6.13), we now perform a change of variables to β ≡ f q /f p and f S ≡ f p + f q , so f p = f S /(1 + β) and f q = f S β/(1 + β). Without loss of generality, we can choose phenotype assignment such that f q ≤ f p , i.e. 0 ≤ β ≤ 1. We now rewrite equation (6.13) as The actual coarse-grained robustness deviates from the upper bound, equation (6.13), by and it deviates from the lower bound, equation (6.14), by In order for the actual coarse-grained robustness ρ S to fall within the bounds of the interval ½r l S , r u S , for Δ p > Δ q , and with the constraint 0 ≤ ϕ qp ≤ 1, we have  Figure 9. Plots of the continuous interpolation of the maximum robustness curve (bricklayer's bound/blancmange-like curve), upper and lower bounds on the maximum robustness, the random null expectation robustness and the single neutral component minimum robustness, generated using RoBound Calculator [52] using k = 2 (corresponding to a binary alphabet, such as in the HP lattice protein model). Plots for ℓ = 10, 30, 50 and 100 are shown. The k = 2, ℓ = 10 plot also shows, as blue dots, the exact robustness of bricklayer's graphs with an integer number of genotypes/vertices. and for Δ p < Δ q , with the constraint 0 ≤ ϕ qp ≤ 1, we have max 0, 1 2 ð1 þ bÞ log k ð1 þ bÞ À b log k b ' À ðD q À D p Þ f qp min 1, 1 2 ð1 þ bÞ log k ð1 þ bÞ À b log k b ' þ bðD q À D p Þ : ðB 5Þ Collectively, we can write max 0, 1 2 ð1 þ bÞ log k ð1 þ bÞ À b log k b ' þ D p þ bD q À ð1 þ bÞ maxðD p , D q Þ f qp min 1, 1 2 ð1 þ bÞ log k ð1 þ bÞ À b log k b ' þ D p þ bD q À ð1 þ bÞminðD p , D q Þ : ðB 6Þ The lower and upper bounds on ϕ qp for the cases Δ p > Δ q and Δ p < Δ q are plotted in figure 8. For Δ p = Δ q , the range ½r l S , r u S collapses to a single value, and the constraint on ϕ qp becomes an equality Appendix C. Example plots using the RoBound calculator web tool In this section of the appendix, we show representative plots generated from our web tool RoBound Calculator [52], which generates-provided ℓ, k and a resolution value-the continuous interpolation of the maximum robustness curve (bricklayer's bound/blancmange-like curve), upper and lower bounds on the maximum robustness, the random null expectation robustness, the single neutral component minimum robustness and, if desired, the exact robustnesses of bricklayer's graphs with an integervalued number of nodes. The examples we provide are for sequences lengths ℓ = 10, 30, 50 and 100 for alphabet sizes k = 2 (corresponding to the HP lattice protein model, in figure 9), k = 4 (corresponding to the RNA alphabet, in figure 10), and k = 20 (corresponding to the number of amino acids in proteins, in figure 11). These plots help illustrate how for larger k and ℓ, the upper and lower bounds from equation (3.8) become tighter, so that the simple r max p % log k ðn p Þ=' ¼ 1 þ log k ðf p Þ=' form provides good approximation to the maximum robustness.  Figure 11. Plots of the continuous interpolation of the maximum robustness curve (bricklayer's bound/blancmange-like curve), upper and lower bounds on the maximum robustness, the random null expectation robustness and the single neutral component minimum robustness, generated using RoBound Calculator [52] using k = 20 (corresponding to the number of standard amino acids found in proteins). Plots for ℓ = 10, 30, 50 and 100 are shown.