HOW SIGNALING GAMES EXPLAIN MIMICRY AT MANY LEVELS: FROM VIRAL EPIDEMIOLOGY TO HUMAN SOCIOLOGY

Mimicry is exhibited at multiple scales, ranging from the molecular, to the organismal, and then to human society. ‘Batesian’ type mimicry entails a conflict of interest between sender and receiver, reflected in a deceptive mimic signal. ‘Müllerian’ type mimicry occurs when there is perfect common interest between sender and receiver, manifested by an honest co-mimic signal. Using a signaling games approach, simulations show that invasion by Batesian mimics will make Müllerian mimicry unstable, in a coevolutionary chase. We use these results to better understand the deceptive strategies of SARS-CoV-2 and their key role in the COVID-19 pandemic. At the biomolecular level, we explain how cellularization promotes Müllerian molecular mimicry and discourages Batesian molecular mimicry. A wide range of processes analogous to cellularization are presented; these might represent a manner of reducing oscillatory instabilities. Lastly, we identify examples of mimicry in human society that might be addressed using a signaling game approach.

Ognuno vede quel che tu pari, pochi sentono quel che tu sei (Everyone sees what you appear to be, few really know what you are)
Niccolò Machiavelli, Il Principe

1. The imitation game: mimicry and signaling game theory.
Niccolò Machiavelli has been much maligned among philosophers for emphasizing the utility of deception in his book Il Principe. However, deception (along with cheap and imitative signaling) turns out to be rather natural when rational agents strategically interact in information asymmetric contexts. In this paper, we formally study the utility of deception from a signaling games perspective, focusing on mimicry and its universal occurrence in natural and artificial worlds.
Deception can occur because of information asymmetry, which refers to incomplete information regarding a situation or object and differential states of knowledge held by participants. We suggest that the existence of information asymmetry in nature has deeply affected fundamentals of the genome, organismal biology, and human society and its institutions. Molecular, organismal and cultural evolution are all profoundly influenced by information asymmetry. Here we attempt to unify its effects on all three evolutionary processes, via signaling game theory.
A common form of deception is that of deceptive mimicry. 'Mimicry' refers to imitation, and is expected to bring fitness benefits to the mimic. 'Batesian' mimicry was the first type of mimicry to be formally described, and involves a 'mimic' organism that imitates a 'model' organism, in order to deceive a third organism, the 'dupe'. Typically, a non-toxic species will mimic a toxic species, deceiving a potential predator into wrongly avoiding the non-toxic mimic [1]. Batesian mimicry implies a loss of fitness of the dupe, because it is deprived of a meal, and of the model, given that the value of the warning signal to potential predators is diluted.
One way of reducing Batesian type mimicry and establishing the honesty of a signal is through the use of costly signals. Creating a costly signal carries a significant price, which deters mimicry and other forms of deception because the signal is prohibitively expensive to imitate [2] [3]; the signal therefore reveals the true type of the sender.
Mimicry may also be cooperative (mutualistic), when two organisms converge on a common signal that is sent to a third organism, to the benefit of all three. This signaling is termed 'Müllerian' mimicry [4]. This type of mimicry was originally characterized for toxic animals that share a common warning signal to potential predators. A shared signal is easier for a predator to learn, in turn benefiting the animals that send the warning signal. This type of mimicry is non-deceptive, given perfect common interest between the participants.
When more than two organisms share a common signal in a cooperative fashion, they may form a 'Müllerian' mimicry ring. There are only a few described examples of mimicry rings; a well-known example is that of bees and wasps (Figure 6a), which share black and yellow markings as a warning of toxicity. Further examples of mimicry rings are identified in this work.
A third type of mimicry is termed 'cue' mimicry [5]. A cue is an observable feature that may be inanimate, or of biological origin, and is non-strategic, not being intended as a signal. A signal, in contrast, has a communication purpose, or meaning [6], and can be regarded as strategic. In cue mimicry, a cue is mimicked by an organism, typically to deceive another organism, which could be predator or prey. This can involve blending into the background (crypsis or camouflage). Table 1 gives examples of the different aspects of mimicry observed in biotic systems.

Table 1.

Batesian mimicry
  Human society: Asymptomatic patients that go undetected in contact tracing analysis. Phishing attacks. Psychopaths display affective mimicry (mimicry of emotions) [7].

Müllerian mimicry
  Molecular: Cellular tRNA isoacceptors are co-mimics, as are the common 5' leaders and 3' polyA tails of mRNAs for different genes (Section 3.1).
  Organismal: Bees and wasps are Müllerian co-mimics, sharing a common signal of a black body with yellow stripes.
  Human society: The Anonymous hacktivist collective is a manifestation of cooperative mimicry, as are voluntary COVID-19 research teams.

Mimicry ring
  Molecular: tRNA molecular mimicry ring comprised of tRNAs and tRNA-like mimics, where the receiver is the ribosome (Figure 6b and c).
  Organismal: Multiple bee and wasp species form a mimicry ring, where the receiver is a potential predator (Figure 6a).
  Human society: The Silkroad vendor website, used to sell illegal products and services, has spawned offspring after it was shut down that mimic the appearance and functionality of the original website, including the webmaster, Dread Pirate Roberts. Pirate flags themselves constitute a mimicry ring, discussed in Supplementary Material Section 2.

Cue mimicry
  Molecular: Cancers can blend into the host tissue by a variety of mechanisms, such as surface antigen masking by sialic acids [8], down-regulation of MHC Class I expression [9] and truncation of oligosaccharides on cell surface proteins [10]. Devil facial tumor disease 1 (DFT1) is a contagious cancer in Tasmanian Devils, which blends into the somatic background, facilitated by loss of MHC Class I molecules [11].
  Organismal: Spiders and chameleons can blend into their respective backgrounds.
  Human society: Zero day vulnerabilities are very difficult to detect as they blend into the source code. Some may occur accidentally, others may be deliberately introduced. A form of cue mimicry in cyberspace is protocol spoofing, which describes a means of concealing a communication within another type of communication so as to avoid its detection by a government or service provider that could potentially snoop. Additional examples include Tor's anonymization routing, anti-censorship and free speech technologies. A recent example is found with protesters in Hong Kong who, aware that officials and police use biometrics such as facial identification, have utilized face masks to protect their anonymity [12].

Complexity and stability
  Molecular: Simple, stable, close to optimal.
  Organismal: Moderately complex and stable, but requires costly mechanisms that might cause species extinctions [13].
  Human society: Highly complex, involving multiple institutions with complex checks and balances. Stability is poorly understood.

Costly (honest) signal
  Molecular: The tertiary structure of proteins represents a costly signal difficult to mimic [14].
  Organismal: Animals display costly signals such as sexual ornamentation [3].
  Human society: Proof of work by bitcoin miners is a costly signal. Other examples of costly signaling include risk taking by health workers, conspicuous consumption [15], educational attainment [16], and potentially cognitive capacity [17].

Signaling game theory provides an ideal framework for analyzing the different types of mimicry, providing a better understanding of its evolution and purpose. Signaling games involve an incomplete information setting that results in the transmittal of a strategically chosen signal between two players, from a sender to a receiver, resulting in an action by the receiver [18]. The strategically chosen action by the receiver results in an increase in utility (analogous to organismal fitness), which is distributed to the players. If the signal is honest, then utility accrues to both the sender and the receiver; here the signal is mutually beneficial. If the signal is deceptive, however, then the sender experiences an increase in utility, but the receiver experiences a shortfall in its expected utility [19].
'Replicator' dynamics refers to the propensity of a player with a strategy that produces a higher utility to preferentially replicate within a population, and introduces an evolutionary aspect to repeated games [20]. Senders with signaling strategies that result in increased utility, or receivers with an action strategy that produces a higher utility, will be more likely to become fixed in the population. Repeated games facilitate learning by the receiver to recognize the signal; these dynamics can occur within the lifetime of the organism, or over generations.
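As an illustration of replicator dynamics, the following minimal sketch applies the standard discrete-time replicator update, in which each strategy's population share is rescaled by its payoff relative to the population mean. The two strategies and their payoff values are hypothetical and are held fixed for simplicity; this is not the simulation model used later in the paper, where payoffs depend on the strategies of other types.

```python
def replicator_step(shares, payoffs):
    """One discrete-time replicator update.

    Each share is multiplied by its payoff and renormalized by the
    population mean payoff, so above-average strategies gain share.
    Assumes positive payoffs and shares that sum to 1.
    """
    mean_payoff = sum(x * f for x, f in zip(shares, payoffs))
    return [x * f / mean_payoff for x, f in zip(shares, payoffs)]


# Hypothetical example: two strategies with fixed payoffs.
shares = [0.5, 0.5]
payoffs = [1.2, 0.8]
for _ in range(10):
    shares = replicator_step(shares, payoffs)
```

After repeated updates the higher-payoff strategy approaches fixation, while the shares continue to sum to one.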
Different types of signaling equilibria are associated with different types of mimicry.
A so-called 'separating' equilibrium (also known as a signaling system/convention, or Lewis signaling game [21]) is where a specific signal from a sender leads to a specific action by the receiver, to mutual benefit. The signal is expected to be honest, providing an accurate indication of the type of the sender (Figure 1a). A 'pooling' (or 'babbling') equilibrium is where the signal has no meaning and does not result in a specific action by the receiver [22] (Figure 1b). Pertinent to mimicry is the partial pooling equilibrium. Here, there are at least three senders, and two of them will send the same signal to the receiver, which results in the same action [23]. Both Müllerian and Batesian mimicry correspond to a partial pooling equilibrium (Figure 1c and 1d, respectively). Müllerian mimicry rings may be viewed as assuming a star network structure. A Shapley value is generated amongst all the senders and the receiver that comprise the mimicry ring (the Shapley value is a measure of the distribution of utilities amongst players in a cooperative game [25]). An important observation is that Batesian mimicry is frequency dependent: the higher the frequency of Batesian mimics in a population, the smaller their fitness advantage. This is because too many Batesian mimics will dilute the value of the signal [26], to the extent that selection favors better discrimination mechanisms over the continued use of the diluted signaling system [27]. Extensive form decision trees for Batesian and Müllerian mimicry are shown in Figure 2.

2. Methods.
We construct a mathematical model to study signal evolution in these informationally asymmetric scenarios. We start by formalizing population structure as an ensemble of communicating types (species), with each type representing a population of agents. Next we define the space of signaling strategies, the signaling game encounters, and the differing rewards for encounters due to type and receiver action. We assume a rudimentary decision function for the uninformed receiver that imposes a decision surface in signal space. Differing signaling strategies will yield differing rewards from encounters; to guide evolution, we assume mechanisms that relate rewards to replication rates within a type. Strategies that gain higher reward will spread at a quicker rate than their lower yielding peers. Additionally, we incorporate a stationary mutation process that periodically generates new strategies. We thus incorporate natural selection, replication and mutation for each type within an evolutionary game.
With the evolutionary game defined we set the stage for analyzing mimicry forms and their dynamic consequences.
Population Structure and Predation Rewards: Agents are organized into type groups. Aside from constraining replication, types determine the outcomes of certain receiver actions during encounters. For example, as predators encounter other organisms in the wild, predation decisions are made using markings (the signal). Poor metabolic outcomes are possible, and the organism's type will be the most important factor if predation occurs. The simplest interesting scenario is when two types have related markings but one is toxic while the other is nourishing.
We denote the population of agents of type $x$ as $\tau_x$; the population size of type $x$ is denoted $N_x = |\tau_x|$. The population structure refers to the types (and population sizes) within a model. The reward from predation depends on an agent's type. Assuming a predator consumes agent $a$, the reward will be $R(t)$, where $t$ is the type of $a$. In our simplest scenario, $m = 2$, with the first type toxic and the second type nourishing, the reward is represented by values $r_1 < 1 < r_2$. The metabolic reward of 1 can be thought of as neither loss nor gain.

Fig. 1: Separating, pooling and partial pooling equilibria. A separating equilibrium (a) occurs where senders of different types send distinct signals, each of which elicits a specific action by the receiver, to the benefit of both sender and receiver [24]. A pooling or babbling equilibrium (b) is where senders of different types send the same signal, and the resulting action of the receiver is always the same. Most relevant to the phenomenon of mimicry is the 'partial pooling' equilibrium. Here, two or more senders of different types may send the same signal to a receiver, eliciting the same action, while other senders send different signals. Both Müllerian type (c) and Batesian type (d) mimicry can be represented by a partial pooling equilibrium. The partial pooling equilibrium can represent a common signal that two or more senders have converged upon, to their mutual benefit (Müllerian). Alternatively, it can also describe deceptive signaling by one sender, which imitates or mimics a signal sent by one of the other senders (Batesian). Different types of wine (Chianti, Riesling and Sylvaner), which each possess distinctive wine bottles (the signal), are used to illustrate the different types of equilibria. In (a) the distinct bottles act as a separating equilibrium. In (b) all three types of wine use the same bottle; this is a pooling equilibrium. In (c) different Chianti wineries, Ruffino and Opici, use the same characteristic Chianti bottle; this is cooperative Müllerian type mimicry. In (d) fake Chianti manufacturers use the typical Chianti bottle to deceive customers into buying the wine. This is Batesian type mimicry. Aposematic mimicry represents a special case of partial pooling equilibrium with a pooled signal of toxicity, contrasting with organisms that lack the signal, the action of the receiver being 'avoid' and 'consume', respectively.

Fig. 2: Extensive form decision trees for signaling strategies involving mimicry. Extensive form decision trees show the sequence of interactions between players in a game, and the respective payoffs, depending on the strategies that they follow. Extensive form decision trees are shown for a) Batesian type, b) Müllerian type mimicry, and c) Mixed type. The open circle represents a decision taken by nature regarding the type of sender, which may be of two types. Two potential signals may be sent by senders: avoid or otherwise (consume). The dotted lines indicate that the receiver has incomplete information regarding the identity of the sender, which may be of two (or more) potential types. The receiver has two potential options, avoid or consume. The utility payoffs are in brackets (S,R), and the Nash equilibrium is indicated by an asterisk. In (a) there are two types of sender, $T_{\text{model}}$ and $T_{\text{mimic}}$, corresponding to the model and Batesian mimic respectively. In (b) the sender may be of type $T_{\text{co-mimicA}}$ or $T_{\text{co-mimicB}}$, corresponding to two Müllerian co-mimics, A and B. In (c) we show how scenarios are composed to form a mixed type.
Signaling Strategy and Predation Decisions: The signaling space is a set of distinguishable features that lie within both the sender's repertoire of expression and the receiver's sensory repertoire. All relevant signals can be represented within a signal space, a vector space $\Gamma = \mathbb{R}^N$ for sufficiently large $N$. The signal space may consist of morphology, phenotype, markings, sounds, smells, coloration, or other attributes sensed or communicated that factor into the receiver's predation decision. For agent $a$, the signaling strategy will be denoted $s(a) \in \Gamma$.
Additionally, we will use the same space $\Gamma$ (along with a scalar threshold) as a parameter space for the receiver's decision function. In our simple scenario we model the predator's choice of action, predation or avoidance, by partitioning $\Gamma$ into regions that prescribe those actions. The simplest decision model for the receiver utilizes a reference point $q \in \Gamma$ termed the avoidance feature, which identifies the center of an avoidance region in $\Gamma$. Using the cosine distance, $\mathrm{cosd}(x, y) = 1 - \frac{\langle x, y \rangle}{\|x\|\,\|y\|}$, the avoidance region is symmetric around the avoidance feature, with radius determined by the threshold parameter $\delta \in [0, 2]$. This results in the decision function:
$$A(s \mid q, \delta) = \begin{cases} \text{avoid}, & \mathrm{cosd}(s, q) \le \delta, \\ \text{predate}, & \text{otherwise.} \end{cases}$$
When encountering an organism, any signal received within the avoidance region will be instinctively avoided; otherwise predation instincts prevail.
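The receiver's decision rule described above can be sketched in a few lines. This is a minimal illustration, assuming that signals are non-zero vectors given as Python lists and that a signal lying exactly on the boundary of the avoidance region is avoided; the function names `cosine_distance` and `decide` and the example vectors are hypothetical.

```python
import math


def cosine_distance(x, y):
    """cosd(x, y) = 1 - <x, y> / (||x|| ||y||); ranges over [0, 2]."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def decide(signal, q, delta):
    """Avoid signals inside the avoidance region (q, delta), predate otherwise.

    Assumption: the boundary cosd == delta counts as avoidance.
    """
    return "avoid" if cosine_distance(signal, q) <= delta else "predate"


# Hypothetical avoidance feature and signals in a 2-dimensional signal space.
q = [1.0, 0.0]
near_signal = decide([0.9, 0.1], q, delta=0.2)   # nearly aligned with q
far_signal = decide([-1.0, 0.0], q, delta=0.2)   # opposite direction to q
```

Because the cosine distance depends only on direction, scaling a signal vector leaves the decision unchanged; only the angle between the signal and the avoidance feature matters.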
Signaling Game and Metabolic Objectives: Signaling games start with nature, which selects the sender type. Nature provides agent encounters, assigning the type (identity) for the organism (sender) and receiver (predator). Next, the sender transmits a message $s \in \Gamma$. The receiver, having received the sender's message but without certain knowledge of the sender's type, must select an action. Differing rewards occur depending on the combination of nature's selection of types, the sender's signal, and the receiver's action.
Abstractly, this payoff function takes as arguments the sender's type and signal, followed by the receiver's action and type (the latter also selected by nature if there are multiple receiver types), and returns the rewards for sender and receiver, in that order. The signaling game in our scenario can be viewed most clearly during an encounter: a predator encounters an organism that sends a certain signal (markings or coloration). The predator receiver, uncertain of the sender's type, selects an action, predation or avoidance, resulting in dramatic and differing health outcomes.
The repertoire of types (formalized as population structure) that nature presents to predators during encounters will play a significant role in signaling game outcomes. Figure 2 illustrates how this affects the signaling game structure, showing extensive forms for signaling games with a variety of population structures. We provide a general reward matrix (Table 2) for game outcomes; this table includes the types used within the population structures studied. We use these types to create various ensembles for analysis, and consider the types of mimicry they express. Our results suggest that the population structure and predation reward function are critical to understanding how mimicry is formed, maintained, and destroyed in populations.
To keep our descriptions as simple as possible we will consider only one type of predator in the study. The outcome from signaling games is given by the reward payoff function $U : T \times \Gamma \times A$. Values of $R(t)$ apply for predation; the small deduction $\eta < 1$ accounts for metabolic loss and deferred replenishment for avoidance. The payoff may be further modified by a parameter strictly between 0 and 1, representing the reduction in fitness (reproductive likelihood) for a sender organism consumed during the encounter.

Table 2: Reward payoffs (predator, prey) for each encounter, indexed by predator action and by prey type (model, mimic, co-mimic-A, co-mimic-B).
Best receiver action: Within a population of replicating strategies guided by evolution, more successful strategies will have increased replication rates. To explore these dynamics further we consider how the utility optimization implicitly depends on the statistical distribution of types. We consider the distribution $\mu$, which measures the probability of a type given a specific signal $s$.
Let $\Theta(A \mid s) = \sum_{t \in T} U_2(t, s, A(s \mid q, \delta))\,\mu(t \mid s)$, with $U_2$ the receiver's utility and $\mu(t \mid s)$ the probability of type $t$ given signal $s$. The best response may be taken as an arg-max over policies, which seeks the action yielding the highest reward averaged over all agents of all types. By integrating over the signal space, calculation of the best response is revealed as a geometric problem, in which $\lambda_{(q,\delta,t)}$ is the proportion of the $t$-type population with signaling strategy (signal or markings) contained in the avoidance region $(q, \delta)$. Notice the interesting case for mixed types, for example if $r_1 < 1 - \eta < 1 < r_2$ in our simplest scenario. Selection of parameters for the best response is a nontrivial instance of the geometric knapsack problem. The dynamic evolutionary game further places dependence on the frequency or distribution of signals. The geometric interpretation gives a sense of the type of distributed co-optimization that occurs during evolutionary games, and provides an analogous geometric problem involving reinforcement and adversarial learning.
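To make the expected-utility comparison concrete, the sketch below evaluates the receiver's expected utility for each action at a single signal and picks the larger. It assumes, following the description of the payoffs, that predation of a type-t sender yields the receiver a type-dependent reward and that avoidance yields 1 − η regardless of type. The reward values, the value of η, and the conditional type distributions are hypothetical, chosen only to satisfy r_1 < 1 − η < 1 < r_2.

```python
def expected_receiver_utility(action, type_probs, rewards, eta):
    """Theta(action | s): receiver utility averaged over mu(t | s).

    Assumption: predation yields rewards[t]; avoidance yields 1 - eta.
    """
    total = 0.0
    for sender_type, prob in type_probs.items():
        utility = rewards[sender_type] if action == "predate" else 1.0 - eta
        total += utility * prob
    return total


def best_action(type_probs, rewards, eta):
    """Return the action with the highest expected receiver utility."""
    return max(
        ("avoid", "predate"),
        key=lambda action: expected_receiver_utility(action, type_probs, rewards, eta),
    )


# Hypothetical values satisfying r_1 < 1 - eta < 1 < r_2.
rewards = {"toxic": 0.2, "nourishing": 1.5}
eta = 0.05

# mu(t | s) for two hypothetical signals.
mostly_toxic = {"toxic": 0.8, "nourishing": 0.2}
mostly_nourishing = {"toxic": 0.2, "nourishing": 0.8}

choice_toxic = best_action(mostly_toxic, rewards, eta)
choice_nourishing = best_action(mostly_nourishing, rewards, eta)
```

With these numbers, avoidance is preferred when the signal is sent mostly by toxic senders, and predation is preferred once nourishing senders dominate the same signal, which is the frequency dependence that Batesian invasion exploits.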
Evolutionary Games: In evolutionary games the dynamic evolution of strategies within a population is considered. We illustrate the phases of the evolutionary game in figure 3 and summarize the three main steps as: meet, mate (& repopulate), and mutate. Each type in the population structure is a component in the evolution process; the repopulate and mutation phases are self-contained within the type component. The encounter phase intermingles the agent types, coupling the component processes through signaling games that define the evolving objectives and problem-solving requirements linking outcomes to replication rates.
Initialization: We define a constant-size population of agents for each type $t$ as $n_t$. We use an initial distribution to generate $n_t$ pure strategies and assign those to the agents of each type. We set the loop count $k$ to zero.
Measure: For each time step $k$ we record measures of the population, to study empirically the evolving distribution of quantities.
Meet/Encounter: Encounters are generated from an encounter distribution, which generates random pairs of predator and prey agents.
Games: For every encounter, we use the sender type, sender signal, and receiver (predator) avoidance region to calculate the rewards gained from the signaling game.
Score: For each type and each strategy $i$, $\phi_i = g_i / \sum_j g_j$, where $g_j$ is the total reward received by all agents who use strategy $j$ during encounters. The vector $\phi$ forms a probability distribution over the set of strategies.
Mate/Repopulate: With details given in the Supplementary Materials, $\phi$ is slightly modified to derive $\Phi$, which is used to re-select $n_t$ strategies (from $\mathrm{Multinomial}(n_t, \Phi)$). These new strategies are assigned to the agents of type $t$. Note that this technique is nothing but a simple form of statistical boosting: the better the score, the more likely a strategy is to be re-selected. Said differently, this phase will prefer to replenish better performing strategies over poorly performing strategies.
Mutation: Next, a random mutation process is applied to agents, whose strategies are mutated in place. Mutation acts to generate and probe novel strategies which the population can try.
Increment: Time step $k$ is incremented and a next population is defined; processing continues from the Measure step.
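The Score, Mate/Repopulate and Mutation steps for a single type can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it samples directly from φ rather than from the modified distribution Φ (whose derivation is in the Supplementary Materials and is not reproduced here), it assumes non-negative rewards with a positive total, and it uses Gaussian perturbations as a stand-in for the unspecified mutation process. The function names, the mutation `rate` and `scale`, and the example strategies are hypothetical.

```python
import random


def score(strategies, rewards):
    """Return phi: each distinct strategy's share of the total reward."""
    totals = {}
    for strategy, reward in zip(strategies, rewards):
        totals[strategy] = totals.get(strategy, 0.0) + reward
    grand_total = sum(totals.values())
    return {strategy: g / grand_total for strategy, g in totals.items()}


def repopulate(phi, n_t, rng):
    """Re-select n_t strategies with probabilities phi (a multinomial draw)."""
    pool = list(phi)
    weights = [phi[strategy] for strategy in pool]
    return rng.choices(pool, weights=weights, k=n_t)


def mutate(strategies, rate, scale, rng):
    """Mutate strategies in place: each is perturbed with probability rate.

    Assumption: Gaussian noise of standard deviation scale per coordinate.
    """
    for i, strategy in enumerate(strategies):
        if rng.random() < rate:
            strategies[i] = tuple(x + rng.gauss(0.0, scale) for x in strategy)
    return strategies


# Hypothetical generation: four agents of one type using two strategies.
rng = random.Random(0)
strategies = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
rewards = [0.95, 0.2, 0.95, 0.2]

phi = score(strategies, rewards)
next_generation = repopulate(phi, len(strategies), rng)
next_generation = mutate(next_generation, rate=0.1, scale=0.05, rng=rng)
```

Strategies are stored as tuples so that identical strategies can be pooled when scoring; the population size stays constant across generations, as in the Initialization step.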
Under natural selection a mimicry ring appears to be in a stable equilibrium: although mutation allows out-of-equilibrium signaling, such signaling presumably offers less benefit and accordingly will not survive for long. Once established in a ring, we expect mimicry to endure and evolve. While a Müllerian ring appears stable in isolation, considerable advantages await Batesian mimics that invade the ring. Under natural selection, such invasions lead to loss of rewards by receivers. The question therefore arises as to how many Batesians it will take to destabilize the ring.
The methods outlined are analyzed rigorously to evaluate the subjects of ring formation and robustness (to Batesian invasions) with simulation studies of evolving populations of agents under a variety of population structures.
3. Results and Discussion.
The evolutionary game, discussed earlier, expresses a range of mimicries. Here we will further illustrate and discuss the surprisingly diverse dynamics expressed for a variety of population structures. We will draw attention to three prominent behaviors: 1) the emergence of Müllerian rings and their stability, 2) the adversarial behavior introduced by Batesian mimics, and 3) the behavior of mixed mode mimicry. We will illustrate how the mixed mode expresses both the emergence of rings and the antagonistic dynamics of Batesian types, and further we study how the Batesian types interact with the ring, how they destabilize it and, eventually, how they may cause its complete collapse. To investigate this phenomenon we illustrate how the mixed mode system evolves by transitions from one game equilibrium to another, and the conditions that trigger transitions within the model. We also draw connections to existing biological studies where our results bear close resemblance to natural phenomena.

Fig. 3 (caption, partial): The process measures signaling strategies by their metabolic/protection rewards. The process guides evolution by constructing a relation between strategic rewards and reproduction, implemented with statistical boosting. The process architecture is simple, but it is worth noting that each type in the population structure forms a component in the evolution process. During the encounter phase the type components intermingle with signaling games. Accordingly the strategic objectives are dynamic, dependent on frequency (of strategies used by other types), and can behave in complex ways.
Emergence of Rings: In systems with potential common interest for coordinated behaviors, we observe the emergence of Müllerian rings. Once established we expect the ring to endure and evolve. This dynamic is observed most clearly in systems comprised of only toxic types and predators, such as systems (1, 0) and (2, 0), which have respectively one and two toxic types, one predator type, and zero non-toxic types. In these systems we observe a signal locking phenomenon, in which the predator and toxic types adapt and hold a purposeful signaling convention. Initialized in 'babbling,' where encounters between predator and toxic types invariably lead to predation, either a mutant predator (with advantageous avoidance behavior) or a mutant toxic organism (that the predator instinctively avoids) will eventually occur. Once such an event occurs, both populations quickly replicate those strategies, catalyzed by the increased rewards they offer, as can be seen in figure S5 (b) and figure S9 (b). Figure S6 illustrates the transition and the rapid crossover the populations take. This transition is driven by the new strategies being favored and boosted in replication. It coincides with the abandonment and extinction of many inferior strategies in favor of the single new one; accordingly, as the population becomes more clonal, we observe a decreased variance in strategies. The temporal requirements for such a transition can be understood as the expected search time for 'paths to cross.' Within a few generations a 'separating equilibrium,' where the signal is used to distinguish type (e.g., the receiver's basic problem of determining what to eat), takes over and a Müllerian ring is formed. Once established, we expect the ring to endure, its stability achieved by selective advantage, stumbled upon by mutants probing alternative strategies. Should a mutant strategy break the signaling convention, its replication rate is attenuated by the less satisfying outcome.
The stability of the ring can be observed in figures S5 and S9; these plots show the cosine distance between the strategies of encountering organisms. Any encounter within the avoidance region will result in the predator avoiding the organism, and in predation otherwise. Clearly, the avoidance region is an attractor and absorbing state. Additionally, the avoidance decision can be interpreted as delineating foreground from insignificant or abiotic background, to address cue mimicry forms. We emphasize that signal locking need not imply a constant or frozen signal convention (indicated by the motion of centroids in figures S7 and S10). Still, this possibility points to an interesting mathematical question: namely, that of asymptotic behavior as the ring grows in size.
As the ring endures, substantial advantages await Batesian mimics that invade. Since these invasions weaken the utility of the signaling convention, they raise a critical question: namely, how many Batesians it takes to destabilize the ring entirely.
Adversarial chase: Systems with a conflicting reward structure will exhibit adversarial dynamics, as is observed most clearly in systems comprised of non-toxic types and predators, such as systems (0, 1) and (0, 2), which have one and two non-toxic types respectively, one predator type, and zero toxic types. In these systems we observe an antagonistic chase: the non-toxic types attempt signal locking; however, the predator repels any such signal convention by rapidly mutating and moving its avoidance region in response (see figures S8 and S13).
Initialized in 'babbling,' the non-toxic type seeks to mutate and rapidly adopt a strategy that predators instinctively avoid. Predators repel selective forces in this direction by attenuating replication of easily duped avoidance strategies. Should a non-toxic mutant dupe all predators into avoidance, predator mutation will eventually ensure a return to predation. Evolutionary events that could offer a common signaling convention are no longer sought by all types (as they are when a Müllerian ring forms), so they are no longer the flash-points for rapid adaptation in both the predator and prey class. Rather, in these cases what is good for one class is bad for the other, thus setting the stage for the antagonistic chase.
Our simulations, as presented here, give the appearance that the predator has greater control, repelling the separating equilibrium (where Batesian mimics would thrive) in favor of the babbling equilibrium (where the predator minimizes loss); however, which type has greater control will depend on key model parameters. Note that the decision function has an important geometric aspect: with threshold δ = 0.2 the avoidance region has far less volume than the predation region, whereas with δ = 1.8 we observe that the non-toxic type controls the equilibrium by repelling babbling while maintaining the separating equilibrium.
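The geometric point about the threshold can be illustrated with a small Monte Carlo estimate of how much of signal space falls inside the avoidance region. This sketch assumes that signal directions are drawn uniformly at random (isotropic Gaussian samples) in a three-dimensional signal space, which is an illustrative choice and not a property of the simulations reported here; the function name, dimension and sample count are hypothetical.

```python
import math
import random


def avoidance_fraction(delta, dim, samples, rng):
    """Estimate the share of random signal directions with cosd(s, q) <= delta.

    Assumption: signals are isotropic Gaussian vectors, so their directions
    are uniform on the sphere; q is the unit vector along the first axis.
    """
    inside = 0
    for _ in range(samples):
        s = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in s))
        cosd = 1.0 - s[0] / norm  # <s, q> = s[0] because q = (1, 0, ..., 0)
        if cosd <= delta:
            inside += 1
    return inside / samples


rng = random.Random(0)
small_region = avoidance_fraction(0.2, dim=3, samples=20000, rng=rng)
large_region = avoidance_fraction(1.8, dim=3, samples=20000, rng=rng)
```

In three dimensions the exact share of directions inside the region is δ/2 (the area of a spherical cap), so a threshold of 0.2 covers roughly a tenth of the directions and a threshold of 1.8 roughly nine tenths. Under this assumption, a randomly mutating signal rarely lands in a small avoidance region and almost always lands in a large one.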
Mixed mode behaviors: In mixed mode systems with both mutualistic and conflicting reward structures (systems (n, m) are ensembles with n toxic, m non-toxic and one predator type) a novel and important behavior arises. Epochs of familiar behaviors are observed, such as ring formation via signal locking between predator and toxic types (as before in (n, 0) systems), as well as adversarial chase between non-toxic type and predator (as before in (0, m) systems). But critically, we observe a new dynamic behavior: namely, the destabilizing effects of Batesian (non-toxic) invasions on previously formed rings. Since rings emerge from babbling (as in the (n, 0) component), and Batesian invasion collapses rings back to babbling, the system cycles and we observe an oscillator (figure 4) whose main cycle is succinctly understood as a sequence of transitions from one game equilibrium to another, ordered as babbling, partial pooling, pooling and back to babbling. We illustrate the equilibria transition graph (in figure 5) for the simplest case exhibiting the novel cycle, i.e., (1, 1), which has one toxic, one non-toxic and one predator type. We generalize the discussion to other more complex scenarios where mimicry rings form.
Initialized to babbling, toxic and predator types (attracted by common interest) seek signal locking and the formation of a ring (Müllerian mode mimicry), while a non-toxic type seeks to exploit any avoidance cues that the predator instinctively employs (Batesian mode). Once a ring forms, enlistment of additional toxic types strengthens the ring by reinforcing the predator's reward for its avoidance behavior. As non-toxic types (Batesian invaders) invade the ring, they reduce the predator's reward. The predator's utility is thus the critical ballast for the ring's stability, and while the predator may tolerate a certain number of Batesian invaders, there may be a threshold at which the babbling equilibrium is preferred. The transition occurs when a mutant predator modifies its avoidance (or forgets the inherited avoidance habit) and breaks out of the existing partial pooling equilibrium, thus putting non-toxic mimics back in play for predation. Because this approach increases metabolic rewards above those of the fully timid strategy held by the majority, the mutant strategy will quickly replicate among the predators.
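The threshold argument can be illustrated with a back-of-the-envelope calculation. The following is a minimal sketch under stated assumptions, not the payoff structure of the simulations: it assumes the predator meets ring members in proportion to their numbers and cannot tell them apart, that avoiding pays 0, and that attacking yields a hypothetical `benefit` for a non-toxic mimic and a hypothetical `cost` for a toxic model. The function names are likewise introduced here for illustration.

```python
import math


def attack_payoff(n_toxic, n_mimic, benefit, cost):
    """Expected payoff per encounter for a predator that attacks every bearer
    of the ring signal (assumed payoffs: +benefit for a non-toxic mimic,
    -cost for a toxic model; avoiding is taken to pay 0)."""
    total = n_toxic + n_mimic
    return (n_mimic * benefit - n_toxic * cost) / total


def mimic_threshold(n_toxic, benefit, cost):
    """Smallest whole number of mimics m at which attacking strictly beats
    avoiding, i.e. the smallest m with m * benefit > n_toxic * cost."""
    return math.floor(n_toxic * cost / benefit) + 1


# Hypothetical ring: 10 toxic individuals, eating a mimic is worth 1,
# eating a toxic individual costs 3.
n_toxic, benefit, cost = 10, 1, 3
for n_mimic in (5, 30, 31, 60):
    payoff = attack_payoff(n_toxic, n_mimic, benefit, cost)
    print(f"mimics={n_mimic:3d}  attack payoff={payoff:+.3f}")
print("ring-breaking threshold:", mimic_threshold(n_toxic, benefit, cost))
```

Under these assumptions a mutant predator that abandons avoidance gains only once the ratio of mimics to models exceeds cost/benefit, which is one way to read the statement that the predator tolerates a certain number of Batesian invaders.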
Our result with mixed mode mimicry is consistent with other available evidence: namely, that models will diverge from mimics in a process of antagonistic co-evolution [28]. Theoretical considerations have indicated a 'coevolutionary chase' with a continual process of model divergence and mimic catch-up [29] [30]. Studies that monitor changes over time are scarce and experimentally difficult, given the time scales necessary to observe multiple cycles of divergence and catch-up. More readily available examples might be found in human society; for example, currency counterfeiting has to be countered with the periodic introduction of new markings into banknotes. Our simulation study is simple and restricted to understanding the basic dynamics of one ring (anchored by a predator with limited avoidance parameters); however, it connects in informative ways to studies that employ frequency dependence or consider multiple rings. For example, frequency dependent selection means that the effectiveness of the mimicry signal is reduced at high frequency, which could break the oscillation. Simulations have shown that invasion by Batesian mimics promotes convergence between rings through the promotion of signal divergence, which means that one ring might become similar to a second ring [31]; generally, however, the effects of such invasions have been little studied. Still, the criticality of the predator's reward for ring stability suggests that, within larger networks, a type of mean field game may arise from the games described here. In such networks organisms join as many rings as possible for protection, while predators, dealing with rings cluttered with various levels of deception, make critical ring-breaking decisions.
3.1. Molecular mimicry, the origin of life and cellularization. Molecular mimicry can be more fully understood within a signaling games framework [14]. The gene for an RNA or a protein macro-molecule can be considered as the sender, while the signal consists of the three dimensional conformation of the expressed gene product. The receiver is the macro-molecule, which specifically interacts with the signal macro-molecule, typically a protein, but could also be an RNA or DNA molecule. An action results from the binding of the receiver macro-molecule with the signal macro-molecule, which results in an increase in utility (fitness) for both sender and receiver, if there is perfect common interest between the two. The action might be an enzymatic reaction or conformational change by the receiver macro-molecule. The binding (substrate) specificity of the receiver macro-molecule is analogous to organismal receiver discrimination [32].

Fig. 4: Mixed Mimicry modes are shown to oscillate: A ring is established frequently but invaded by the non-toxic type which destabilizes the ring and precipitates the abandonment of the partial pooling in favor of babbling. When the ring is invaded the partial pooling equilibrium transitioned to a pooling equilibrium rendering the receiver's discerning strategy into one that is too timid. In (a) a population of one toxic, one non-toxic, and one predatory type evolve signaling strategies over time. In (b) a more complex scenario with a population of two toxic types, two non-toxic types and one predator. For every generation, a set of encounters results in a cosine distance measure between the predator's avoidance feature and the signaling organism. Plotted on the vertical axis is the average (and variation band) of cosine distance measures between predator and toxic type (blue) and predator and nourishing type (green). The horizontal shaded region (orange) represents the avoidance region, where encounters will lead to avoidance rather than predation. Since avoidance is mutually beneficial to toxic type and predator we observe epochs (measured in hundreds of generations) where the partial pooling equilibrium is stable and separates toxic from non-toxic types. The stability of the signaling system appears to be disrupted and destabilized when non-toxic types signal within the avoidance region.
This model of a bio-molecular signaling game implies that the first signaling games were played between bio-molecules in the earliest life forms, once the bio-molecules were large enough to exert specificity [33] [14] [34]. This in turn implies that 'meaning' first arose from the primordial soup, as the first signal was strategically sent between a pair of replicating macro-molecules. However, as soon as 'meaning' arose, it became susceptible to deception, effectively the original sin. mRNAs and tRNAs may be regarded as Müllerian co-mimics, given that they typically have perfect common interest with each other. All mRNA 5' leaders and 3' polyA tails may be regarded as signal co-mimics of each other, the receiver being the translation initiation apparatus. In this sense, the protein coding portion of the genome may be regarded as an instantiation of Müllerian mimicry.

Fig. 5: … Figure 2 with predator, toxic type, non-toxic type initialized in a babbling equilibrium. Each type leverages mutant and diverse strategies to search signal space. When toxic type and predator are first to lock signals, transition A is compelled by the utility seeking behavior of both, causing a transition to partial pooling I; this is where predator and toxic type establish a signaling convention that is mutually beneficial and increases their utility. We observe that from partial pooling I, organisms from the non-toxic type will eventually invade, leading to transition B, which yields a higher utility for the mimic at the expense of the receiver (predator). This transition leads to a mixed mode where cooperative and Batesian mimicry strategies are simultaneously expressed and the signaling system is in pooling equilibrium. Note that the predator has lost average utility from its prior state in partial pooling I and could exploit diverse or mutant strategies to return to a babbling state (transition C) if the first of such mutants stands to gain utility, as would clearly be the case when the benefits of consuming the non-toxic type outweigh the risk of consuming the toxic type. It is also possible that from the babbling state the non-toxic type first coalesces to the predator's avoidance feature, as identified by transition D, leading to partial pooling II. This outcome is purely deceptive (Batesian) and will lead to gains for the nourishing type and a loss for the predator. Since the predator can leverage diverse or mutant strategies which forget the avoidance feature, transition E is clearly possible and preferred as a unitary move by the predator species.
Likewise, all cellular tRNAs are signal co-mimics of each other, with the receiver being the A-site of the ribosome. There are numerous additional tRNA-like co-mimics that are normal parts of the cell. These include the yeast aspartyl-tRNA synthetase mRNA, Escherichia coli threonyl-tRNA synthetase mRNA and E. coli methionyl-tRNA synthetase mRNA (which all possess tRNA mimics on the mRNA leader [35] [36] [37]), the Salmonella typhimurium his operon [38], the mitochondrial Group I intron catalytic core [39], and ribosomal tRNA mimics (which are tRNA-like proteins that interact with the ribosome [40]). Several tRNA-like proteins interact with the ribosome in the A-site. These molecules display Müllerian type mimicry, involving a cooperative relationship between the different tRNA shaped proteins and the ribosome, in both prokaryotes and eukaryotes (Figure 6b and c). These comprise molecular Müllerian mimicry rings, conferring a direct reward to the receiver, as opposed to the avoidance of harm in classical aposematic mimicry rings [41]. (Aposematism might be a special case of mimicry, with a pooled signal ('I am toxic') contrasting with the absence of a signal. In rewarding mimicry one might see a pooled signal contrasting with other signals, as in the wine bottles.)
In contrast, Batesian molecular mimicry involves a conflict of interest between sender and receiver genes. Batesian molecular mimics of mRNAs and tRNAs may be termed 'deceiver' mRNAs [33] and 'deceiver' tRNAs [14], respectively. Virus mRNAs are all effectively deceiver mRNAs, tricking the host translational apparatus into translating them regardless. Viruses also harbor a variety of 'deceiver' tRNAs, which trick the host translational apparatus by mimicking normal cellular tRNAs [14] [42] [43]. The fitness of the virus is enhanced, but at a cost to the host (an example is provided in Figure 6c). Numerous further parallels between molecular and organismal mimicry are discussed in Supplementary Material Section 3. The relevance of the signaling games perspective of molecular mimicry to the coronavirus SARS-CoV-2, and to the severity of the COVID-19 pandemic, has not escaped the authors' notice. A key example is illustrated in Figure 7.
In early life, cellularization would have led to synchronization of sender and receiver gene replication, thus inducing common interest and resulting in the alignment of their respective utilities. This process would have acted to promote cooperation, including Müllerian type molecular mimicry. In contrast, Batesian type molecular mimicry would have been disincentivized by the promotion of common interest. Conflicts of interest may still arise within the cell from selfish elements (analogous to insider threats in an intelligence organization), from other forms of genetic conflict, and from external pathogens: this predicts the occurrence of molecular deception [14], which includes Batesian type molecular mimicry.
3.2. The role of molecular mimicry in COVID-19. The SARS-CoV-2 virus makes multiple uses of molecular mimicry in its efforts to exploit its human host, beginning with its emergence via a zoonotic event from an earlier host, the bat, which tolerates the virus in a quasi-Batesian mimicry ring. There follows a sample of some of the mimicry strategies that SARS-CoV-2 utilizes. i) Replication organelles, inside which the virus replicates [44], are a form of camouflage. ii) The addition of a cap-like structure onto the 5' end of viral mRNA by SARS-CoV-2 nsp16 [45] produces virus deceiver mRNAs, as discussed in Section 3.1. These are Batesian mimics of normal cellular mRNAs, which constitute a Müllerian mimicry ring that is invaded by the viral deceiver mRNAs.
Furthermore, iii) glycosylation of the SARS-CoV-2 spike protein shields it from immune system surveillance [46]. Host glycans are acquired in the endoplasmic reticulum by several RNA viruses, and so glycosylated viral proteins are regarded as self by the immune system [47], while the glycans shield the protein epitopes from recognition; this deceptive strategy constitutes a mix of Batesian and cue mimicry. iv) An elevated Ka/Ks in exposed regions of the spike protein is an indication of ancient adversarial chases between the virus and the immune system of mammalian host(s). This may be understood as an oscillation between recognition, evasion, recognition, evasion and so on, equivalent to the adversarial chase (0, 1) simulation presented earlier in the Results section. Finally, v) the polybasic cleavage site (PCS) present in the spike protein represents a Batesian molecular mimic, and plays a crucial role in the severity of the COVID-19 pandemic, described in more detail in Figure 7.
The most potent weapon in the human biotechnology armamentarium against the Batesian deception of the virus is even cheaper molecular mimicry of the pathogen by vaccines. Vaccines may mimic different parts of a virus: its surface proteins, DNA or RNA. After administration, a vaccine deceives the human immune system into sensing that it is being attacked by a viable virus. This deception is costly to the vaccinated subject in the short term, as the vaccine itself does not bear any threat. However, the immune system retains a memory and so is prepared for a (likely) future encounter with the real virus.
A virus may circumvent vaccination via antigenic drift, whereby virus epitopes mutate over time, leading to novel immunogenic properties. This process will reduce the effectiveness of the vaccine, and is commonly encountered with flu vaccines, which must be generated on a yearly basis [48]. The response of the biomedical community is to develop a new vaccine, which is a more accurate mimic of the newly evolved virus. Thus, a coevolutionary chase is joined between vaccine and virus that bears similarity to the oscillatory effect displayed in Figure 4.
Better knowledge of the deceptive strategies of SARS-CoV-2 will help to inform vaccine design. Particularly, a better understanding of decoy (non-neutralizing) epitopes will help in the rational design of vaccines using peptides. Decoy epitopes result in the production of non-neutralizing antibodies by the immune system, and can lead to antibody dependent enhancement (ADE). This phenomenon occurs when decoy epitopes bind to non-neutralizing antibodies which facilitate the entry of the virus into the host cell [49]. Decoy epitopes result in a reduction of efficiency of vaccines, by diverting immune resources away from the recognition of neutralizing epitopes [50], and by potentially causing ADE [49].
The prediction of decoy epitopes from virus protein sequences has been little studied. Understanding the evolutionary dynamics of decoy epitopes may allow their more precise identification. A key question to be answered is whether they are adaptive; if so then they may be better understood as Batesian mimics of neutralizing epitopes. In this scenario, the decoy epitope is deliberately exposed to the immune system, rather than being shielded by glycans, in order to divert antibodies from the neutralizing epitopes. Identification of decoy epitopes will allow the design of vaccines that circumvent such epitopes, thus sharpening the immune response to the vaccine.
Anti-viral drugs are also typically molecular mimics. For example, remdesivir is a molecular mimic of ribonucleotides. The drug represents a deceptive molecular signal, luring the virus replicase into binding to it. The response of virus over time is to develop drug resistance, by ceasing to bind the drug; it has 'learned' that the drug is deceptive.
The virus may develop resistance more slowly to some drugs than to others. We take as our inspiration the model of polybasic cleavage site (PCS) mimicry displayed in Figure 7, where the Müllerian molecular mimicry ring of PCS signals present in endogenous proteins constrains the signal sequence from diverging in response to PCS mimicry by the SARS-CoV-2 spike protein. Likewise, if an anti-viral drug invades a molecular mimicry ring formed by the viral protein and its canonical substrates, then the viral protein may not easily change its specificity and develop drug resistance. This stability persists because the viral protein is constrained by the need to bind several canonical substrates, which are similar in structure.

Cellularization-like processes and the evolution of cooperation.
Analogous processes to cellularization, which involve the alignment of utilities thus promoting cooperation, abound at higher levels of biological organization (Table 2). In human society, the formation of trading blocs and religious denominations, tribalism, and groupings propelled by homophilic and other group splitting processes, including some considered harmful such as balkanisation, may also be considered forms of cellularization.
While such cellularizations are relatively stable, they can nonetheless be sporadically destabilized by environmental changes that alter relationships among players (e.g., trust as measured by correlation of encounters), for instance social distancing to mitigate the spread of a viral pandemic. The resulting cascade of disruptions among employer-employee relationships gives rise to a Schumpeter's gale, a creative destruction dynamic analogous to a coevolutionary chase, resulting in recellularizations. These may have to be coordinated carefully with artificial and temporary shifts in the utility functions (e.g., unemployment benefits, bail-outs, basic incomes), but may also lead to extensive mimicry. The dynamics of recellularization in the presence of Batesian mimicry may lead to L, U, V or W shaped economic recoveries, and warrant further investigation in macro-economic contexts.
To break the cycle of constant invasion (W-shaped recovery), we speculate that nature and games must have a mechanism for stabilizing cellularization. Such a procedure solidifies a tighter alignment between the utilities of cooperative components and preserves the signaling system, while also offering security recourse or a means for its protection, often by costly signaling or by increasing the price of mimicry, thus disadvantaging invasive Batesian types.
We also speculate that the mixed mode cycle completed by transitions A, B and C in Figure 5 may afford evolution a duty cycle. The return to a babbling state can allow retrial of various combinations of cooperative mimicry components.
A particularly interesting socio-technological question comes up in the context of the social-distancing aspects of pandemic measures and their effects on economic relations, frequently mischaracterized in terms of lives-vs-livelihood trade-offs. Social distancing has led to novel applications of digital communications, automation, artificial intelligence and in silico simulations, and poses interesting questions about restructuring dynamics for our macro-socio-economic worlds (e.g., the guitar-string model and whether the recovery would be V or W-shaped). An important question posed in this context is the role likely to be played by the currently available (unexplainable) AI technologies and the "pandemonia of imitations" they may give rise to. Here, a mass of imitations may be used to increase the probability of a successful invasion of a Müllerian mimicry ring.
Though creativity, intelligence and problem solving play many important roles in modern economic relations, they have been difficult to formalize. For instance, computability has a widely accepted model in terms of the Church-Turing thesis, Turing-reducibility and Turing-universality, but as a consequence of these, it remains impossible to define computers' (classical or otherwise) general problem solving capability necessary for automation of economic tasks, including estimating whether a particular task (specified in a contract) may be considered reasonably completed (the classical Halting Problem). In fact, given two programs, one genuine and the other (presumably) imitative, there can be no decision procedure to determine if they are Turing equivalent. These statements have deep implications for how we may wish to define Artificial Intelligence and its potential role in economic infrastructure.
The solution Turing suggested was in terms of mimicry in Information-Asymmetric Signaling games, involving a certain set of sender agents, some of which will be of the type Oracles (e.g., humans) and the others of the type Imitators (e.g., models). The senders send certain signals (e.g., conversational statements in English) to receivers (e.g., humans) who must act by responding to Oracles, but ignoring Imitators. Such a game may be called an Imitation Game and the receivers' test a Turing Test. As a signaling game, the classical Imitation Game and its extension both have Nash Equilibria: some trivial, such as Babbling or Pooling, but others far more relevant to the present discussion, namely separating. A natural way to define Artificial Intelligence would be in terms of the Imitators' ability to achieve a reasonably informative and stable pooling (non-separating) Nash Equilibrium when introduced into a society of human Oracles.
One may propose a solution to the economic cellularization problem, which involves extending the economic system to include additional non-strategic agents: namely, Recommenders and Verifiers. These AI agents will have no explicit utilities to optimize (or even, satisfice) other than those described in terms of winning (or losing) certain tokens. An individual (homo-economicus) may envision organizing one's recommenders and verifiers not just playing imitation games in various disjoint circles of one's socio-economic lives, but also forming stable Müllerian mimicry rings to restore one's relationship with others in a rational utility-optimizing manner. Engineering these AI-augmented humans would be the core problem for AI: The ultimate Turing Test for the set of intertwined imitation games we call a modern civic society and its markets, falling and rising as motioned by an invisible hand.

Table 3: Cellularization-like processes at multiple levels. Cellularization-like processes promote cooperation and involve the alignment of utilities. This can occur at different levels of biotic complexity, ranging from biomolecules to whole organisms and then to human society, and ultimately supra-national organizations.

Level
Type of cellularization

Fig. 7: … SARS-CoV-2 spike protein also contains a PCS, which leads to cleavage of the spike protein by endogenous proteases including furin, increasing the infectivity of the virus, and has directly contributed to the devastating effects of the COVID-19 pandemic [58]. The viral PCS deceives furin into cleaving it, constituting a Batesian mimic, which has invaded the PCS Müllerian molecular mimicry ring. This deceptive strategy is difficult to counteract pharmaceutically, because drugs that inhibit proteases from cleaving the viral PCS will also inhibit the cleavage of endogenous PCSs that comprise the mimicry ring. The endogenous PCS signal consists of a short sequence, which is a cheap signal, meaning that it is easy to mimic. The large size of the mimicry ring means that the endogenous PCS signal is difficult to change, given the number of individual PCSs in endogenous proteins, and so an adversarial chase between endogenous and viral PCS signals is perhaps unlikely. These features may explain why the PCS is so commonly utilized by a wide range of microbial pathogens [63]. The ease of mimicry may also explain why it can arise in cancers [63], which, despite utilizing a range of deceptive strategies, do not appear to make much use of Batesian mimicry, which can be costly to evolve. The structure of furin was obtained from the Protein Data Bank (5JXG).
Geometry and visualization of high dimensional signals, cues, and markings. Signal space. To model signals for organisms we use a high dimensional vector space, with signals as points in $\Gamma = [-1, 1]^D$, for $D$ in the range $[5, 1000]$. The components could generally translate to specific features of morphology or behavior. The geometry of symmetric sets (such as $\Gamma$) in high dimensional vector spaces exhibits interesting orthogonality properties, stated statistically: as $D$ increases, two random vectors from $\Gamma$ are increasingly likely to be nearly orthogonal. More precisely, for any $0 < \epsilon \ll 1$:

\[ \lim_{D \to \infty} \Pr\left( \left| \frac{\langle x_1, x_2 \rangle}{\|x_1\| \, \|x_2\|} \right| < \epsilon \right) = 1. \]

This convergence can be strongly sensed in vector spaces with dimension $D$ as low as in $[5, 15]$. Noting that the range of $\frac{\langle x_1, x_2 \rangle}{\|x_1\| \, \|x_2\|}$ is bounded in $[-1, 1]$, the cosine distance function, commonly defined as

\[ d_{\cos}(x_1, x_2) = 1 - \frac{\langle x_1, x_2 \rangle}{\|x_1\| \, \|x_2\|}, \]

will have range $[0, 2]$ with concentration of values at one for random vectors $x_1, x_2 \in \Gamma$.
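The concentration of cosine distances around one can be checked numerically. The following is a minimal Monte Carlo sketch, assuming signals are drawn independently and uniformly from Γ and taking the cosine distance to be one minus the cosine similarity; the function names, the sample size and the seed are choices made here for illustration only.

```python
import math
import random


def cosine_distance(x, y):
    """Cosine distance 1 - <x, y> / (||x|| ||y||), with range [0, 2]."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))  # guard against rounding
    return 1.0 - cosine


def random_signal(dim, rng):
    """Draw a point uniformly from Gamma = [-1, 1]^dim."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def distance_spread(dim, pairs=500, seed=0):
    """Mean and standard deviation of the cosine distance between random pairs."""
    rng = random.Random(seed)
    distances = [
        cosine_distance(random_signal(dim, rng), random_signal(dim, rng))
        for _ in range(pairs)
    ]
    mean = sum(distances) / len(distances)
    variance = sum((d - mean) ** 2 for d in distances) / len(distances)
    return mean, math.sqrt(variance)


# The spread around 1 shrinks as the dimension D grows.
for dim in (5, 15, 100, 1000):
    mean, spread = distance_spread(dim)
    print(f"D={dim:4d}  mean distance={mean:.3f}  spread={spread:.3f}")
```

The mean stays close to one at every dimension, while the spread narrows as D increases, which is the sense in which random signals become nearly orthogonal.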
We use these facts to create a model for predator decision making. In our model the receiver (predator) maintains a high dimensional vector as a mechanism to evaluate whether it should perform an action (avoid predation or otherwise). The organism sender S will present a signal vector $x \in \Gamma$. The receiver R will utilize a decision vector $y \in \Gamma$ (fixed over the organism's life span but amenable to mutation in successive generations) and a threshold $\tau$ (likewise fixed but amenable to mutation) to decide whether to avoid S, if $d_{\cos}(x, y) < \tau$, or to consume it (otherwise). The decision function generalizes a half-plane model where $\tau = 1$.
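The decision rule and its geometric aspect can be sketched in a few lines. This is an illustrative sketch, assuming (consistent with the avoidance region of Fig. 4 and the discussion of τ = 0.2 versus τ = 1.8) that a signal within cosine distance τ of the decision vector is avoided and any other signal is consumed. The names `predator_avoids` and `avoidance_fraction`, the dimension 7 and the sample size are hypothetical choices made here.

```python
import math
import random


def cosine_distance(x, y):
    """Cosine distance 1 - <x, y> / (||x|| ||y||), with range [0, 2]."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))  # guard against rounding
    return 1.0 - cosine


def predator_avoids(signal, decision_vector, tau):
    """True when the signal falls inside the avoidance region (assumed rule:
    avoid if the cosine distance to the decision vector is below tau)."""
    return cosine_distance(signal, decision_vector) < tau


def avoidance_fraction(tau, dim=7, samples=5000, seed=1):
    """Estimate the share of signal space that a random predator avoids."""
    rng = random.Random(seed)
    decision_vector = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    avoided = 0
    for _ in range(samples):
        signal = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if predator_avoids(signal, decision_vector, tau):
            avoided += 1
    return avoided / samples


# Small tau: a narrow avoidance cone; tau = 1: a half-space; large tau: almost everything.
for tau in (0.2, 1.0, 1.8):
    print(f"tau={tau:.1f}  avoided fraction of signal space={avoidance_fraction(tau):.3f}")
```

With τ = 1 the rule reduces to the sign of the inner product, so by the symmetry of Γ about one half of signal space is avoided; smaller τ shrinks the avoidance region and larger τ expands it.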
Visualizing Strategies: To visualize these high-dimensional signals we use star glyphs. Star glyphs provide a means to visualize high dimensional data in a way that is relatively neutral to recognition features which would otherwise be cognitively engaged. Below, in figure S1, we illustrate an example.
To glimpse the effects of mutating a signal in high dimensional space, where random vectors are likely to be found orthogonal (see figure S2 (a)), we can still get an idea of the displacements our mutation operation makes by fixing a base vector $X \in \Gamma = [-1, 1]^7$ and showing the effects of weighted averages with other random vectors. In figure S2 (b) we illustrate a few samples, which are scaled in increments, and their associated distance from the base vector. Holding a signal $X \in \Gamma$ constant we generate a sequence of displaced signals $Y_k = \lambda_k W_k + (1 - \lambda_k) X$, for $W_k$ drawn uniformly from $\Gamma$ for each $k \in \{1, 2, \ldots, 20\}$ (twenty distinct samples) and $\lambda_k = k/20$. We plot the resulting cosine distance from $X$ to the vectors $Y_k$ as a function of $k$. This illustrates both the trend and the variation inherent to the mutation operation.
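The displacement experiment is straightforward to reproduce in outline. The sketch below follows the weighted-average construction described above, with the mixing weight growing in steps of 1/20; the function name `displaced_signals`, the seeds and the printed format are assumptions made for illustration, and the output is a list of values rather than the plot of figure S2 (b).

```python
import math
import random


def cosine_distance(x, y):
    """Cosine distance 1 - <x, y> / (||x|| ||y||), with range [0, 2]."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))  # guard against rounding
    return 1.0 - cosine


def displaced_signals(base, steps=20, seed=2):
    """Return (lambda_k, distance) pairs for Y_k = lambda_k * W_k + (1 - lambda_k) * X,
    where each W_k is a fresh uniform draw from [-1, 1]^D and lambda_k = k / steps."""
    rng = random.Random(seed)
    results = []
    for k in range(1, steps + 1):
        lam = k / steps
        w = [rng.uniform(-1.0, 1.0) for _ in base]
        y = [lam * wi + (1.0 - lam) * xi for wi, xi in zip(w, base)]
        results.append((lam, cosine_distance(base, y)))
    return results


# Base signal X in [-1, 1]^7, then twenty increasingly strong mutations.
base_rng = random.Random(7)
base_signal = [base_rng.uniform(-1.0, 1.0) for _ in range(7)]
for lam, distance in displaced_signals(base_signal):
    print(f"lambda={lam:.2f}  cosine distance from X={distance:.3f}")
```

Small weights keep the mutant close to the base signal, while at a weight of one the mutant is an unrelated random vector whose distance from the base is scattered around one.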
Measuring Population Dynamics and Identifying Mimicry: We expect that signal mimicry will emerge due to the increased utility it offers. Noting that signal mimicry need not imply that signals be fixed and constant, a suitable measure should account for dynamic signal evolution so long as the co-evolutionary constraints are clearly discerned. To capture mimicry we design statistical measures referenced from the evolving predator strategies.

Fig. S1: Star glyphs provide a useful means to visualize organism signals or the multidimensional sensory cues that they send to a receiver predator. The same star glyphs can be used to represent what features a predator may avoid. The star glyph is a set of wedges which indicate the weight assigned to each component. These glyphs represent vectors from $\Gamma = [-1, 1]^7$; starting from the positive three o'clock position and progressing in a counterclockwise rotation around the circle, seven weights can be seen as the radius at which the wedge starts: the more of the wedge that is missing, the more weight is placed on that component.
With these measures, signal mimicry among types will be characterized by: 1) low intra-type signal variation and 2) low extra-type signal variation. Taken together these indicate co-evolution to the benefit of at least one type. For example, in the Müllerian ring, when several types have similar markings, the signaling would be characterized by low extra-type signal variation. Low intra-type signal variation occurs if the signaling traits are conserved.
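One simple way to operationalize the two criteria is sketched below. It is a hypothetical instantiation rather than the measures used in the simulations: it assumes each organism is summarized by the cosine distance between its signal and the predator's decision vector, takes intra-type variation to be the standard deviation of those distances within a type, and takes extra-type variation to be the gap between the largest and smallest type means. The function names, the `tolerance` parameter and the example numbers are all made up for illustration.

```python
from statistics import mean, pstdev


def intra_type_variation(distances):
    """Spread of predator-referenced distances within a single type."""
    return pstdev(distances)


def extra_type_variation(distances_by_type):
    """Gap between the most and least distant types, compared by their means."""
    type_means = [mean(d) for d in distances_by_type.values()]
    return max(type_means) - min(type_means)


def looks_like_mimicry(distances_by_type, tolerance=0.05):
    """Both criteria at once: every type is internally tight and the types
    sit at nearly the same distance from the predator's decision vector."""
    tight_within = all(
        intra_type_variation(d) < tolerance for d in distances_by_type.values()
    )
    tight_between = extra_type_variation(distances_by_type) < tolerance
    return tight_within and tight_between


# Made-up distances: two toxic types sharing a marking versus a type being chased.
ring = {"toxic_a": [0.10, 0.12, 0.11], "toxic_b": [0.13, 0.11, 0.12]}
chase = {"toxic_a": [0.10, 0.12, 0.11], "non_toxic": [0.90, 1.10, 1.00]}
print("ring:", looks_like_mimicry(ring))
print("chase:", looks_like_mimicry(chase))
```

Because the distances are referenced from the predator, the measure tolerates signals that drift over time, provided the types drift together.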
Boosting Distribution: Fixing the time step and the type, let $\phi_s$, for $s \in \Gamma$, be the performance measure attained by strategies implemented during that time step. Letting $\phi_* = \min(\{\phi_s\})$ and $\phi^* = \max(\{\phi_s\})$, we can safely transfer the performance measures to the interval $[0, 1]$ as the limit of fractional transformation: The term $\eta$ simply prevents division by zero, and the term $\xi$ is a statistical shrinkage term used as a model parameter that helps to distort global information available to agents when they re-select a strategy. We describe the probability that $s \in \Gamma$ switches over to use the strategy $s' \in \Gamma$ as: In our simulations, noting that all rewards are positive (albeit greatly reduced when prey is consumed), we fix $\xi = 0$.
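To make the roles of the two parameters tangible, the sketch below gives one plausible reading of such a re-selection step. It is an assumption throughout, not the transformation used in the paper: it supposes a min-max rescaling in which ξ is added to numerator and denominator (so that large ξ shrinks all scores toward equality) and η is added to the denominator only, and it supposes that the probability of switching to a strategy is proportional to its rescaled score. The function name and the string strategy labels are hypothetical.

```python
def boosting_distribution(performance, xi=0.0, eta=1e-9):
    """Map performance measures to re-selection probabilities.

    Assumed form (illustration only):
        score_s = (phi_s - phi_min + xi) / (phi_max - phi_min + xi + eta)
        P(switch to s) = score_s / sum of all scores
    """
    phi_min = min(performance.values())
    phi_max = max(performance.values())
    scores = {
        s: (phi - phi_min + xi) / (phi_max - phi_min + xi + eta)
        for s, phi in performance.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # All strategies performed identically: fall back to a uniform choice.
        return {s: 1.0 / len(scores) for s in scores}
    return {s: score / total for s, score in scores.items()}


# Three strategies, labelled by name rather than by their signal vectors.
performance = {"a": 1.0, "b": 2.0, "c": 3.0}
print(boosting_distribution(performance))            # xi = 0: the worst strategy is never re-selected
print(boosting_distribution(performance, xi=1000))   # heavy shrinkage: close to uniform
```

Under this reading, fixing ξ = 0 gives agents undistorted information about relative performance, while larger ξ blurs the differences between strategies.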
Descriptions of mimicry modes. Batesian, Müllerian, Deceptive Cue, Cooperative Cue mimicry.

Fig. S2: … In (b) we illustrate what a few sampled random scaled displacements of a base vector (blue) look like, as well as their cosine distance. Holding a signal $X \in \Gamma$ constant we generate a sequence of signals $Y_k = \lambda_k W_k + (1 - \lambda_k) X$ for $W_k$ drawn uniformly at random over $\Gamma$ and $\lambda_k = k/20$. The cosine distance is plotted as a function of $k$ and a few star glyph visualizations are illustrated. This gives a sense of the high dimensional random mutation operations in the evolutionary game.
Supplementary Material 2: Pirate flags as Müllerian mimicry. The 17th-19th century pirate flag, the Jolly Roger, appears to have constituted a shared signal of toxicity to those crews that resisted boarding. There were a range of such flags with common elements, such as a black background, with a white skull and other objects such as weapons, bones and bleeding hearts. An interesting example of Batesian type mimicry in this context is the use of a legitimate flag by pirates in order to get close to a ship, and then raise the pirate flag when close enough to board ('showing one's true colors').
There were different gradations of pirates. A true pirate was an outlaw, who would be executed on capture. A 'privateer' practiced a legitimized form of piracy, at least on the part of the licensing nation, and so was expected to give quarter when encountering intransigent ships. Pirates varied in ferocity (i.e., toxicity), and so a fearsome reputation could be an asset. Consistent with this, some pirates took efforts to enhance their notoriety. Blackbeard presents a well known example: he braided and grew his beard long and attached lit fuses to his hat in battle [64]. He was also reported to have taken periodic potshots at his crew, reasoning that "if he did not now and then kill one of them, they would forget who he was" [64]. An early pirate history recounted that "In the Commonwealth of Pyrates, he who goes the greatest Length of Wickedness, is looked upon with a kind of Envy amongst them..." [64].
Lastly, mixed (quasi-Batesian) mimicry would occur if a pirate imitated a more famous pirate, in order to more easily board merchant ships. A potential example of this appears to be Francis Spriggs, who is reported to have flown a Jolly Roger identical to that of the pirate Captain Low, from whom he had deserted, "...a white Skeliton in the Middle of it, with a Dart in one Hand striking a bleeding Heart, and in the other, an Hour-Glass..." [64]. In this scenario the pirate flag would be partly honest (allowing identification as a pirate) and partly deceptive (promoting identification with a more famous and presumably fearsome pirate), and so constitutes mixed mimicry between Müllerian and Batesian type mimicry.
Supplementary Material 3: Further parallels between molecular and organismal mimicry. Given that natural selection operates at both the organismal and molecular levels, one might expect to observe numerous parallels between organismal and molecular mimicry. For example, at the organismal level the frequency of the model is an influence on the development of Batesian mimics: the more common the model, the more likely mimics are to arise [65]. In a potential parallel, there is a high concentration of tRNA in the cell because protein translation is a major cellular function, so the model tRNA molecule is present at high frequency in all living organisms, and so this may have influenced the widespread development of Batesian tRNA mimics by viruses.
In organismal Müllerian mimicry, the mimic is expected to evolve to become more similar in appearance to a model organism, in a process known as advergence [41]. The two step hypothesis proposes that a potential co-mimic first undergoes a mutation that causes a major change in phenotype, becoming more similar to the model. Then, if this is advantageous, further mutations lead to an increase in similarity to the model (summarized in [66]). In molecular Müllerian type mimicry, a similar process might be expected to occur. Consistent with this, in the case of EF-P, a cellular tRNA mimic, it appears that the protein has evolved over time to become more tRNA-like, in a process of advergence [67]. Presumably, tRNAs predate the various other tRNA co-mimics found within the cell, and so comprise the model. A potential example of Batesian type molecular mimicry resulting from intragenomic conflict is that of the Mauriceville mitochondrial retroplasmid of Neurospora crassa, which uses a tRNA-like structure at the 3' end of the plasmid transcript to initiate cDNA synthesis [68]. Examples of organismal and molecular Müllerian mimicry are illustrated in Figure S3.
'Rewarding' mimicry has been proposed as an additional category of mimicry [5]. A rare example is provided by floral mimicry, where there is evidence that two plant species Turnera sidoides ssp. pinnatifida (Turneraceae) and Sphaeralcea cordobensis (Malvaceae) share a common signal, the shape and color of the flowers [69]. The sharing of a common signal between two flower species leads to more effective signaling to the pollinator, because the frequency with which the pollinator encounters the signal is increased, which means that the association between signal and reward (the pollen / nectar) is more efficiently learnt by the pollinator.
While organismal behavior is more complex than macromolecular behavior, some parallels can be drawn between the two. Thus, while 'receiver psychology' at the organismal level refers to the receiver's ability to detect, discriminate and remember a signal [32], an equivalent can be found at the molecular level in the aptitude of a receiver macromolecule to distinguish a particular macromolecular signal. This is dictated by its biochemical binding affinity, which reflects the strength and specificity of binding between a macromolecular signal and receiver macromolecule, quantified by a binding constant such as the Michaelis-Menten constant (Km). Thus, a macromolecular mimic should display a binding affinity and specificity for a receiver macromolecule similar to that of the model macromolecular signal.
At the organismal level, selection on the model to evolve differences from the Batesian mimic is expected, so that the dupe can better distinguish the model, as opposed to the mimic [70]. With cellular tRNA models, their conformation is tightly constrained by their role within the ribosome. However, we propose that the large number of tRNA base modifications [71] may have evolved so that the host organism can better distinguish viral deceiver tRNAs, which are deficient in modifications. If tRNAs with base modifications are harder to mimic, then these constitute a 'costly' signal.
At the organismal level, a variety of scanning and surveillance mechanisms have evolved so that the potential dupe can detect Batesian mimics [19]. Likewise, we expect molecular scanning mechanisms to have evolved to detect Batesian tRNA mimicry. These might include translational proofreading mechanisms (by aminoacyl-tRNA synthetases and the ribosome); however, these are likely to have evolved primarily for the important role of ensuring translational fidelity. Information asymmetry exists between the pathogen (or selfish element) and the host organism, hence there should be a selection pressure to evolve scanning mechanisms that can ameliorate this asymmetry. Consistent with this, in the case of the mimicry of eukaryotic initiation factor 2α (eIF2α) by viral eIF2α mimics, there is evidence for a selective pressure on the macromolecular receiver, protein kinase R, to evolve more effective recognition of eIF2α [72].
In a parallel, antimicrobial drugs often act as Batesian molecular mimics of biological macromolecules. For example, the antibiotic puromycin is an organic compound synthesized by Streptomyces alboniger that mimics tRNA [73], inhibiting bacterial translation. This process involves Batesian type tRNA mimicry on behalf of the physician (the sender), directed at the bacterium (the receiver), which experiences a drop in utility. Microbial drug resistance can be understood as the result of a selective pressure for more efficient detection of molecular mimicry by the pathogen.
In cue mimicry, there is only one sender, which mimics a cue, sending the mimicry signal to one or more receivers. This may be considered a modification of the partial pooling equilibrium, where the receiver may receive both a signal and cue, both of which will elicit the same action.
Rather than relying on increasingly costly signals, an additional strategy to detect mimicry might be to harness multiple receivers to determine signal veracity (which could be summarized by the phrase 'you can fool some of the receivers some of the time, but not all of the receivers all of the time'). In contrast, at the organismal level, multiple receivers have been tentatively linked with the occurrence of imperfect (low fidelity) mimicry. This is hypothesized to result from differences in sensory perception amongst receivers [74].
Lastly, learning a visual warning signal is important in the evolution of Müllerian and Batesian mimicry [41] [75]. While learning is a complex behavior deriving from neurological processes, a simpler but equivalent process may be found at the macromolecular level given that learning can be viewed as a process that a population undergoes over generations, as the strategies exhibited by individual members of the population change as a result of selection [18].
Supplementary Material 4: Mixed mimicry. Butterflies of weak toxicity show mixed mimicry, also termed 'quasi-Batesian' mimicry [77]. There is evidence that this allows them to freeride on the signal value of highly toxic species [78]. The prediction has been made that quasi-Batesian mimics might have a stronger selection pressure for signal accuracy than their Müllerian co-mimics, and so the quality of the mimicry is superior [78]. This then suggests a manner of detecting quasi-Batesian mimics, by their greater quality signal (the common phrases 'too good to be true' or 'holier than thou' suggest that humans may be attuned to detect such strategies).
(b) shows molecular Müllerian mimicry, illustrated by an example of cellular tRNA mimicry, where the mRNA of E. coli threonyl-tRNA synthetase (thrRS) mimics a tRNA structure (apparently in a negative feedback loop, [35]). ThrRS-mRNA and thr-tRNA are co-mimics, both interacting with thrRS, the common receiver. Here, the genes for thr-tRNA and thrRS are the senders, while the signals are the three-dimensional structure of thr-tRNA and the tRNA-like structure in the leader of thrRS mRNA. The two senders, and their respective signals, are co-mimics. Information regarding three-dimensional structures and photo sources is located in Supplementary Materials.
A mimicry spectrum has been proposed to exist from those organisms that display purely Müllerian mimicry to those that are purely Batesian mimics [79]. This is because different species can vary in their levels of toxicity, and so those with lower toxicity are expected to be less cooperative, and their mimicry more Batesian, as a consequence. Signaling game theory would predict that mixed mimicry would occur when there is partial common interest between sender and receiver. The mimicry spectrum ranges from opposed interests (Batesian), to perfect common interest (Müllerian). At the molecular level, mixed mimicry would be expected to occur in cases of intragenomic conflict, given that interests are not perfectly aligned between differing components of the genome. Likewise, some cases may occur where pathogenic microbes do not have completely opposed interests to the host, and there is a degree of common interest, which concords with the Tradeoff Hypothesis [80].
Mixed mimicry might also be expected to occur within the 'parasitism-mutualism continuum'. This refers to the transition of microbes from pathogenicity (with opposed interests to the host) to mutualism (with perfect common interest with the host) that occurs over evolutionary time [81]. Between the two extremes of pathogenicity and mutualism there exist states of partial common interest. Here, a mixture of Batesian and Müllerian molecular mimicry is expected to be encountered. Thus, in the case of viruses, deceiver mRNAs might be expected to bring some benefit to the host, if there is a cooperative component to the virus infection. In this case, the characteristic sequence and structure of the 5' leaders and 3' polyA tails of cellular mRNAs can be considered as cooperative co-mimicry signals. Virus-encoded mRNAs may mimic these signals [82], but if in some way the virus contributes to organismal fitness [83], this might present an example of quasi-Batesian type molecular mimicry.
Supplementary Material 5: An economic Müllerian mimicry ring. Many Chiantis use a distinct bottle with a basket (the 'fiasco toscano'). Originally, this had the practical purpose of protecting the bottles during transport [84], and so constituted a 'cue' (an observable feature not intended as a signal) rather than a signal. The original purpose of the basket has been superseded by the use of stronger bottles, but the fiasco toscano has been retained by some producers, acting as an identity signal [85]. This presumably benefits consumers, as they can more easily identify a Chianti wine by the common signal, the shape of the bottle and the basket. Therefore, this appears to be an example of a rewarding Müllerian type mimicry ring, as both the multiple senders (the vineyards, which act as co-mimics) and the receiver (the customer) benefit (Figure 6b).
Chianti is one of the most heavily counterfeited of all wines [85], to the extent that there is a webservice dedicated to verifying the identity of a bottle of Chianti (www.chianticlassico.com/en/wine/traceability/). The counterfeiting constitutes Batesian type mimicry, causing harm to both the deceived customer and producers of genuine Chianti. Interestingly, in order to maintain honest signaling, the signal (the basket) itself has evolved over time to reduce counterfeiting. For example, the basket was reduced to shoulder height in order to accommodate an identifying lead seal (a 'costly signal') in the neck of the bottle.
Simulation Studies. Using the methodology outlined above, we construct a set of systems characterized by differing population structures. While most parameters of the system are kept fixed, the differing numbers of Müllerian and Batesian types provide insights into how mimicry forms, becomes stable, and becomes unstable, so that populations may adapt signaling strategies which meet their protective and metabolic requirements. Specifically, for each system we perform simulations and apply measures to their simulated histories in order to evaluate mimicry.
To organize the study we introduce a structure index. To keep the study as simple as possible we fix the number of predator types at 1. The index therefore identifies the number of Müllerian types and the number of Batesian types as a tuple: (M, B). Recall that we focus on the predator's avoidance of toxic organisms, so the Müllerian types are toxic and work with the predator to learn avoidance. The Batesian types are non-toxic, and use deceptive or mimetic strategies. For each type (including the predator) we fix the population size at N = 100.
The systems are then ordered as a lattice in figure S4. We refer to each system by its structure index (M, B).
Simulation experiments. Below we outline the simulation experiments by stating the hypotheses that are considered and tested. Results are illustrated and discussed.
RL: System (1, 0) expresses signal locking, a type of reinforcement learning. It has one toxic type and one predator type, so it is in their common interest for an honest signaling convention to emerge. This experiment tests whether the types learn a convention of avoidance. We hypothesize that once learned, the signaling convention is stable, a condition we call signal locking. Even if the convention is stable, the signal itself will likely drift.
AD: System (0, 1) expresses adversarial dynamics and learning. It has one non-toxic type and one predator type, whose goals are in opposition. The non-toxic type may exploit the predator's avoidance for its protection by a Batesian signaling strategy. The predator must update its avoidance to secure its metabolic requirements. The results are used to evaluate whether this dual optimization toward opposing goals is expressed.
MR: System (2, 0) has two species of toxic prey and a single species of predator. We test whether the predator and both toxic types can lock signals to secure a ring of honest signaling. The result will indicate the stability and duration of Müllerian rings.
MM: System (1, 1) is a mixed mode composed of systems (1, 0) and (0, 1). It has one non-toxic, one toxic and one predator type. This experiment evaluates the dynamics of the classical model (honest sender), mimic (dishonest sender), and dupe (receiver) scenario. In particular, it examines the role of Batesian mimicry: if a signaling convention between the predator and toxic types emerges, will it be evolutionarily stable against invasion by a deceptive signal (arising from a nourishing species mimicking a toxic one)?
AD2: System (0, 2) expresses a double adversarial learning scenario. It has two non-toxic types and one predator type. The goals of the non-toxic types are in opposition to those of the predator. The receiver must update its avoidance parameters to prevent loss.
MRA: System (2, 1) expresses a Müllerian ring that can be invaded by a Batesian type. It has two toxic, one non-toxic, and one predator type. Results will evaluate the dynamics leading to ring formation, stability, loss and reformation.
S12: System (1, 2) expresses mostly adversarial dynamics inherited from system (0, 2), but includes the component system (1, 0), which offers the receivers a minor reprieve if an honest signaling convention can be achieved. Results help to evaluate the mixed mode dynamics and the destabilizing role of multiple Batesian types.
MM2: System (2, 2) expresses numerous mixed modes of mimicry. It has two toxic, two non-toxic, and one predator type. The result will indicate how the additional equilibria affect the overall transitional dynamics. It will provide a glimpse of a more complex ecosystem. Results shed light on transitional pathways among equilibria.
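As a reading aid, the eight systems listed above can be enumerated by their structure index (M, B). The sketch below is illustrative only: the dictionary and helper names are ours, and the componentwise ordering (one system sits directly above another when it adds exactly one prey type) is an assumption about how the lattice of figure S4 is arranged.

```python
from itertools import product

# Experiment labels keyed by structure index (M, B), as listed in the text.
SYSTEMS = {
    (1, 0): "RL", (0, 1): "AD",
    (2, 0): "MR", (1, 1): "MM", (0, 2): "AD2",
    (2, 1): "MRA", (1, 2): "S12",
    (2, 2): "MM2",
}

def covers(a, b):
    """True if system b adds exactly one toxic or one non-toxic type to a."""
    return (b[0] - a[0], b[1] - a[1]) in {(1, 0), (0, 1)}

def lattice_edges(systems):
    """All (lower, upper) pairs under the assumed componentwise ordering."""
    return sorted((a, b) for a, b in product(systems, repeat=2) if covers(a, b))

def levels(systems):
    """Group systems by the total number of prey types M + B."""
    out = {}
    for m, b in sorted(systems):
        out.setdefault(m + b, []).append((m, b))
    return out

if __name__ == "__main__":
    for rank, members in levels(SYSTEMS).items():
        print(rank, [(idx, SYSTEMS[idx]) for idx in members])
```

Under this ordering, system (1, 1) sits directly above both (1, 0) and (0, 1), which matches its description as a composition of the two.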
(RL) Signal locking in system (1, 0). We consider a population of two type groups: one set of 100 toxic type organisms, and one set of 100 predators. We initialize signal vectors for all toxic organisms by selecting a uniform random vector in Γ = [−1, 1]^15, and replicating this to all 100 members of the toxic type. We initialize the avoidance feature vector for the predators similarly, but hold the tolerance parameter fixed throughout at τ = 0.2. The initialization technique and the geometry of higher dimensional space ensure with high probability that the signaling convention for avoidance must be learned, as opposed to being present initially. Notice also that the clonal strategies within each type will be temporary, as mutation will diversify the population with mutants. The mutation process selects each individual for mutation by a Bernoulli trial (with p = 0.15). A strategy mutation occurs in place, updating the input vector x ∈ Γ to x′, distributed as N(x, 0.1). The encounter process used in each time step is random pairing from the two types, thus each organism will encounter a randomly chosen predator.
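The initialization, mutation, and avoidance steps described above can be sketched as follows. This is a minimal illustration and not the simulation code itself: all function names are ours, 0.1 is read as the standard deviation of the mutation kernel (it could equally be a variance), and clipping mutants back into Γ is an added assumption.

```python
import math
import random

DIM = 15         # dimension of the signal space Γ = [-1, 1]^15
N = 100          # organisms per type
P_MUTATE = 0.15  # Bernoulli probability that an individual mutates
SIGMA = 0.1      # assumption: 0.1 is the standard deviation of N(x, 0.1)
TAU = 0.2        # predator tolerance parameter

def clonal_population(rng, n=N, dim=DIM):
    """Draw one uniform vector in Γ and replicate it to all n members."""
    founder = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [list(founder) for _ in range(n)]

def mutate(population, rng, p=P_MUTATE, sigma=SIGMA):
    """Mutate each individual in place with probability p."""
    for x in population:
        if rng.random() < p:
            for i in range(len(x)):
                # Assumption: clip so that the mutant stays inside Γ.
                x[i] = max(-1.0, min(1.0, rng.gauss(x[i], sigma)))

def cosine_distance(u, v):
    """1 - cos(angle between u and v); assumes neither vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def is_avoided(signal, avoidance_feature, tau=TAU):
    """A prey signal falls in the avoidance region when its cosine
    distance to the predator's avoidance feature is below tau."""
    return cosine_distance(signal, avoidance_feature) < tau

if __name__ == "__main__":
    rng = random.Random(0)
    prey = clonal_population(rng)
    predators = clonal_population(rng)
    avoided = sum(is_avoided(s, a) for s, a in zip(prey, predators))
    print("avoided at initialization:", avoided, "of", N)
```

Two independent uniform draws in a 15-dimensional cube are close to orthogonal with high probability, so their cosine distance is typically near 1 and well above τ = 0.2. This is the sense in which the convention has to be learned rather than being present at initialization.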
In figure S5 we illustrate the evolutionary dynamics of signals and rewards. First, the distance between predator and prey vectors (in the encounters, as a function of time) undergoes a notable phase transition at around generation 800. The average cosine distance falls steeply and stays under the tolerance threshold τ, within the predator's avoidance region. At the same time, rewards increase sharply. This establishes the signaling convention which satisfies both receiver and sender, and is the separating equilibrium of the signaling game. The equilibrium appears stable. Notice that both mutant predators and mutant toxic organisms will probe various out-of-equilibrium signaling strategies; however, their lesser rewards ensure that their replication is outpaced by the better strategies in equilibrium. The differential in rewards is significant during replication, so once a separating equilibrium is established, the decreased rewards for mutants act to strongly stabilize it. Even though the equilibrium appears stable, its durability implicitly relies on the mutation rate. At the extremes, when the mutation rate cools to zero the equilibrium will last indefinitely, whereas if there is too much mutation the equilibrium may not hold. Mutation rates have a sweet spot: high enough to search for improvements, and low enough to hold onto improvements once found.
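One simple way to locate such a transition in a simulated history is to find the first generation from which the mean cosine distance stays below τ for a sustained window. The sketch below is a hypothetical post-processing helper, not a measure defined in this study; the function name and the `hold` window length are assumptions.

```python
def first_lock_generation(mean_distances, tau=0.2, hold=50):
    """Return the first generation t such that the mean cosine distance is
    below tau for `hold` consecutive generations starting at t, or None."""
    run = 0  # length of the current streak of below-tau generations
    for t, d in enumerate(mean_distances):
        run = run + 1 if d < tau else 0
        if run >= hold:
            return t - hold + 1
    return None

if __name__ == "__main__":
    # Synthetic trace: babbling for 800 generations, then locked.
    trace = [0.9] * 800 + [0.1] * 200
    print(first_lock_generation(trace))
```

Requiring a sustained window guards against counting a brief mutational excursion below τ as the onset of signal locking.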
Another insightful feature of this transition is that once the separating equilibrium is found, the variance of the cosine distance of interactions notably decreases. This also reflects the stabilizing effect of the equilibrium's mutual increase in satisfaction for predator and prey alike. We find that whatever variance remains is primarily driven by mutation, which continually searches for possibly better equilibria. In figure S6 we illustrate how this signal locking occurs as a sequence of distinct activities.
Additionally, measures of signal distance and path motion in the signal space are of interest because they indicate how signals evolve, distinguishing signal locking from the stronger notion of sample and hold. In figure S7 we plot intra-type and extra-type variation together with displacement-to-centroid measures. This allows us to evaluate how predator and toxic organisms adjust a population of signals. The greatest motion (measured by the path distance of type centroids swept over time) is found prior to the transition to the separating equilibrium; thereafter, decreases in residual variance and motion are observed. That motion still exists indicates that the signal is somewhat fluid and moves according to a co-evolutionary process.
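The exact definitions of these measures are not restated here, so the following sketch shows one plausible reading, under stated assumptions: intra-type variation is taken as the mean Euclidean distance of members to their type centroid, extra-type variation as the distance between two type centroids, and path motion as the cumulative distance travelled by a centroid over time. All names are illustrative.

```python
import math

def centroid(population):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(population)
    return [sum(column) / n for column in zip(*population)]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def intra_variation(population):
    """Assumed measure: mean distance of members to their own centroid."""
    c = centroid(population)
    return sum(euclidean(x, c) for x in population) / len(population)

def extra_variation(pop_a, pop_b):
    """Assumed measure: distance between the centroids of two types."""
    return euclidean(centroid(pop_a), centroid(pop_b))

def centroid_path_length(history):
    """Cumulative distance swept by a type centroid, given one population
    snapshot per time step."""
    centroids = [centroid(p) for p in history]
    return sum(euclidean(a, b) for a, b in zip(centroids, centroids[1:]))

if __name__ == "__main__":
    prey = [[0.0, 0.0], [2.0, 0.0]]
    predators = [[1.0, 3.0], [1.0, 5.0]]
    print(intra_variation(prey), extra_variation(prey, predators))
```

With these readings, signal locking would show up as a drop in the extra-type measure, while a centroid path length that keeps growing after the transition corresponds to the residual drift noted above.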
(AD) Adversarial dynamics in system (0, 1). We consider a population of two type groups: one set of 100 non-toxic type organisms, and one set of 100 predators. We initialize signal vectors for all non-toxic organisms by selecting a uniform random vector in Γ = [−1, 1]^7, and replicating this to all 100 members of the non-toxic type. We initialize the avoidance feature vector for the predators similarly, but hold the tolerance parameter fixed throughout at τ = 0.2. The initialization technique and the geometry of higher dimensional space ensure that the non-toxic type must seek avoidance for its own protection, as opposed to being avoided initially. The clonal strategies within each type will be temporary, as mutation will diversify the population with mutants. The mutation process selects each individual for mutation by a Bernoulli trial (with p = 0.15). A strategy mutation occurs in place, updating the input vector x ∈ Γ to x′, distributed as N(x, 0.1). The encounter process used in each time step is random pairing from the two types, thus each organism will encounter a randomly chosen predator.
In figure S8 we illustrate the evolutionary dynamics of signals. Notice that the scenario is completely adversarial: whenever the non-toxic type population approaches the predator's avoidance region, the predator evades, preventing a separating equilibrium from forming. The result illustrates the ease with which the predator type can evade and repel such hazards. In summary, for equal-size populations with identical mutation structures and a specific avoidance region, it appears that Batesian type mimicry is no match for adaptive, reward-oriented predators. Possible modulators which could change this outcome include: doubling the non-toxic mutation rate, doubling the non-toxic population size, halving the signaling space dimension, doubling the predator's avoidance threshold, and adjusting the metabolic reward.
(MR) Müllerian ring emergent in system (2, 0). We consider a population of two toxic types and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. The predator's tolerance parameter is fixed throughout at τ = 0.2. As in system (1, 0), the high dimensional geometry makes avoidance unlikely at initialization, so it must be learned. Also, the initial clonality within types will be short lived due to mutation. The encounter process used in this experiment is like that of system (1, 0); however, we pool the non-predator types and compute six random pairs for each predator per time step for signaling game play. We use the co-mimic-A and co-mimic-B rewards for the types (see table 2).
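The pooled encounter process can be sketched as below. This is an illustration under assumptions rather than the study's implementation: the function and variable names are ours, and drawing the six prey for each predator with replacement is a choice the text does not specify.

```python
import random

ENCOUNTERS_PER_PREDATOR = 6  # six random pairs per predator per time step

def draw_encounters(predators, prey_by_type, rng, k=ENCOUNTERS_PER_PREDATOR):
    """Pool all non-predator types and pair each predator with k prey
    drawn uniformly from the pool.

    Returns a list of (predator_index, prey_type_name, prey_index)."""
    pool = [(name, i)
            for name, members in prey_by_type.items()
            for i in range(len(members))]
    encounters = []
    for p in range(len(predators)):
        # Assumption: sampling with replacement from the pooled prey.
        for name, i in rng.choices(pool, k=k):
            encounters.append((p, name, i))
    return encounters

if __name__ == "__main__":
    rng = random.Random(1)
    predators = [None] * 100  # placeholders for avoidance feature vectors
    prey = {"toxic_1": [None] * 100, "toxic_2": [None] * 100}
    encounters = draw_encounters(predators, prey, rng)
    print(len(encounters), "encounters in one time step")
```

Because the prey types are pooled before pairing, each predator meets the types in proportion to their population sizes, which are equal in all of the systems considered here.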
Dynamics are shown in figure S9. Similar to system (1, 0), signal locking occurs first between one of the two toxic types and the predator. While that equilibrium is reached first, it is reinforced when the second toxic type later joins the signal convention, forming a stronger separating equilibrium and enhancing the value of the signal. Not surprisingly, this suggests that the signal locking condition is an attractor, and may be joined by multiple toxic types to further grow and reinforce the ring. It is still unclear what attributes of stability are improved by the addition of another type to the ring. For example, it appears from figure S10 that the motion of signal centroids is no slower with a mimicry ring than with a single type.
(MM) Mixed mode mimicry yields an oscillator in system (1, 1). We consider a population of one toxic type, one non-toxic type and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. The predator's tolerance parameter is fixed throughout at τ = 0.2. As in system (1, 0), and due to the high dimensional geometry, initialization places the system in a state where predators have yet to learn to avoid the toxic type, and the non-toxic type has yet to exploit the predator's avoidance. Also, the initial clonality within types will be short lived due to mutation. The encounter process used in this experiment is like that of system (2, 0): all non-predator types are pooled for random pairing with predators, and six random pairs for each predator per time step are generated for signaling game play. We use the model and mimic rewards for the toxic and non-toxic types, respectively (see table 2).
The most notable characteristic of this system is oscillation. In Figure S11 we plot the dynamic characteristics of encounters and rewards. In Figure S12 a more detailed view for a subset of time steps is given.
Our simulations suggest that there are two prominent cycles starting from babbling. The main progression appears to continue with partial pooling (mutual convention with toxic and predator types), pooling (when that convention is invaded by non-toxic type), and then back to babbling (when the Batesian mimics completely invert the predator benefits for maintaining the signal convention). The second progression occurs when the non-toxic type is the first to be avoided. This is followed by a rapid destabilization and return to babbling. We discuss these two progressions.
The main progression: The system is initialized to a babbling equilibrium; next, a mutant toxic or predator type realizes a means of avoidance. Due to the boosted rewards, the convention rapidly replicates in both the toxic and predator types to yield a partial pooling equilibrium as found in system (1, 0). This equilibrium is protected by natural selection, as out-of-equilibrium signaling by either the predator or the toxic prey results in less satisfying rewards and is consequently replicated at lower rates. Aside from low-probability mutational events, the only disruptive factor is that the non-toxic type could discover a mimetic strategy to exploit the avoidance for its own protection. Eventually the non-toxic type invades, causing a transition to a pooling equilibrium. But that is short lived if the expected benefit of eating non-toxic prey outweighs the penalty for predation of toxic prey. In that case, mutations of the predator's avoidance feature are no longer deterred by selection, and a rapid relocation of the avoidance feature is expected. This completes the steps of the main progression, and the system returns to a babbling regime.
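The condition under which the pooling equilibrium collapses can be made concrete with a small worked calculation. The sketch below uses hypothetical payoffs (the actual rewards are those of table 2, which are not reproduced here) and assumes that avoiding prey yields a payoff of zero. If a fraction q of the prey sharing the signal are non-toxic, attacking a prey that shows the signal pays q·benefit − (1 − q)·penalty on average, which becomes positive once q exceeds penalty / (benefit + penalty).

```python
def attack_payoff(q, benefit, penalty):
    """Expected payoff of attacking a prey showing the shared signal.

    q       -- fraction of signal-sharing prey that are non-toxic mimics
    benefit -- hypothetical reward for eating a non-toxic prey
    penalty -- hypothetical cost of eating a toxic prey
    Assumption: avoiding the prey pays zero."""
    return q * benefit - (1.0 - q) * penalty

def critical_mimic_fraction(benefit, penalty):
    """Mimic fraction above which attacking beats avoiding."""
    return penalty / (benefit + penalty)

if __name__ == "__main__":
    benefit, penalty = 1.0, 3.0  # illustrative values only
    print("critical fraction:", critical_mimic_fraction(benefit, penalty))
    for q in (0.25, 0.5, 0.9):
        print(q, attack_payoff(q, benefit, penalty))
```

With equal populations of 100 toxic and 100 non-toxic organisms fully sharing the signal, q = 0.5, so under these assumptions the predator gains from abandoning the convention whenever the benefit exceeds the penalty.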
The second progression: Starting from babbling, the second progression begins with the non-toxic type being the first to be avoided. This can occur when the non-toxic type produces a mutant that exploits the avoidance feature of the predator. When this happens, the predator quickly responds with its own mutants, which adjust avoidance to expel the non-toxic type from the avoidance region, thus returning to babbling in order to reestablish the predation benefit. The dynamics are essentially those of system (0, 1) and are independent of the behavior of the toxic type, which is still occupied with search.
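The regimes named in the two progressions can be read off a simulated history by checking which prey types currently lie inside the predator's avoidance region. The sketch below is a hypothetical labelling helper for system (1, 1): it takes the mean cosine distance of each prey type to the predator and uses the regime names from the text, with "non-toxic avoided" as our own label for the opening state of the second progression.

```python
TAU = 0.2  # predator tolerance parameter

def classify_regime(dist_toxic, dist_nontoxic, tau=TAU):
    """Label the state of system (1, 1) from mean cosine distances
    between each prey type and the predator's avoidance feature."""
    toxic_locked = dist_toxic < tau
    nontoxic_locked = dist_nontoxic < tau
    if toxic_locked and nontoxic_locked:
        return "pooling"            # convention invaded by the mimic
    if toxic_locked:
        return "partial pooling"    # convention between toxic type and predator
    if nontoxic_locked:
        return "non-toxic avoided"  # start of the second progression
    return "babbling"

def regime_sequence(trace, tau=TAU):
    """Collapse a per-generation trace of (dist_toxic, dist_nontoxic)
    pairs into the sequence of distinct consecutive regimes."""
    sequence = []
    for dist_toxic, dist_nontoxic in trace:
        label = classify_regime(dist_toxic, dist_nontoxic, tau)
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence

if __name__ == "__main__":
    trace = [(0.9, 0.8), (0.1, 0.8), (0.1, 0.1), (0.9, 0.9)]
    print(regime_sequence(trace))
```

Applied to a full history, the main progression would appear as repeated runs of babbling, partial pooling, pooling, and babbling again, while the second progression would appear as short excursions into the "non-toxic avoided" state.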
(AD2) Adversarial learning and aversion in system (0, 2). We consider a population of two non-toxic types and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. The predator's tolerance parameter is fixed throughout at τ = 0.2. The encounter process pools the non-predator types and computes six random pairs for each predator per time step for signaling game play. We use the mimic rewards for both non-toxic types (see table 2).
Dynamics are shown in figure S13. Note that the dynamics are adversarial. In system (0, 1), the predator, reserving its avoidance region to mark toxic organisms, protects against nutritious non-toxic prey entering. System (0, 2) is similar but is two such independent components played at once against the predator. We conclude that if the signal space is reasonably large the predator will be able to avoid a large number of such types.
(MRA) Stability of the Müllerian ring to Batesian invasion in system (2, 1). We consider a population of two toxic, one non-toxic and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. Additionally, the predator's tolerance parameter is fixed throughout at τ = 0.2. The encounter process used in this experiment pools the non-predator types and computes six random pairs for each predator per time step for signaling game play. We use the mimic, co-mimic-A and co-mimic-B rewards for the types (see table 2).
In figure S14 we illustrate the evolutionary dynamics of signals and rewards. Similar to the behavior of system (2, 0), we observe the formation of a ring. The ring, however, lacks the resilience to withstand the invasion of a Batesian mimic. Interestingly, we observe an extensive period of uncertainty (during time steps 850 to 1600), in which the system appears neither fully in pooling nor able to eject the Batesian invader from the predator's avoidance region. Eventually the system restarts from a babbling equilibrium. Additionally (after time step 2400), we observe that while toxic type 1 has drifted far from avoidance, toxic type 2, the non-toxic type and the predator seem to behave much as they would in system (1, 1).
(S12) Mostly adversarial dynamics in system (1, 2), with a slight reprieve of short-lived signal locking with the toxic type. We consider a population of one toxic, two non-toxic and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. Additionally, the predator's tolerance parameter is fixed throughout at τ = 0.2. The encounter process used in this experiment pools the non-predator types and computes six random pairs for each predator per time step for signaling game play. We use the mimic, co-mimic-A and co-mimic-B rewards for the types (see table 2).
Dynamics are shown in figure S15. When we compare the dynamics of this system with that of (2, 1) we notice substantially less time spent in any signal locking state, and this policy of blanket avoidance seems to anticipate the predator behavior.
(MM2) Complex equilibria transitions in system (2, 2). We consider a population of two toxic, two non-toxic and one predator type, all composed of 100 organisms. For each type, we initialize signal vectors by selecting a uniform random vector in Γ = [−1, 1]^7, and replicate the selected vector to all 100 members of the type. Additionally, the predator's tolerance parameter is fixed throughout at τ = 0.2. The encounter process used in this experiment pools the non-predator types and computes six random pairs for each predator per time step for signaling game play. We use the mimic, co-mimic-A and co-mimic-B rewards for the types (see table 2).
The dynamics are shown in figure S16. We note that with system (2, 2) we see many of the dynamics from the constituent systems: formation of the ring, destabilization by a mimic, reformation, and multiple invasions. Despite the complexity of the dynamics, the transitions of equilibria can be understood in terms of our basic image presented in 5. Rather than having two partial pooling equilibria (one for locking signals with the toxic type, the other for locking signals with the non-toxic type), we may view the separating equilibria as a binary relation on four types. Transitions between separating equilibria occur by a change of relation in one of the types. The babbling equilibria correspond to the absence of a signal locking relation for all the types. This then casts the pooling equilibria as separating equilibria that contain a relation with at least one toxic and at least one non-toxic type. The strongest Müllerian ring is composed of a separating equilibrium with all members of the toxic types and without any members of the non-toxic types.
Evolutionary games result in a separating equilibrium in the common interest of both predator and toxic organism. The evolutionary game history begins with babbling or incoherent signaling; mutation drives search for both types until at least one predator and one toxic type organism realize a meaningful signaling/avoidance convention; this leads to a transition in population strategies to yield a steady and stable separating equilibrium. In this example the signal space is Γ = [−1, 1]^7.
Fig. S7: System (1, 0) learns. Variation and motion in signal space: in (a) the intra-type variations as a function of time are plotted for the toxic type (blue) and predator (orange). Additionally, the extra-type variations clearly illustrate the transition to an honest signaling purpose (i.e., separating equilibrium). The centroid path variation for each type's signaling is plotted in (b). It is worth noting that both toxic and predator types appear to have similar characteristics, suggesting that the signaling convention meets in the middle. Throughout, this learning is done in Γ = [−1, 1]^15.
Fig. S8: Aversion dynamics. The predator and the non-toxic types have opposing goals: the predator seeks its metabolic requirements at the expense of the non-toxic type, while non-toxic type organisms seek protection at the expense of the predator. There is no common goal, so the dynamics are adversarial. The signaling convention is repelled by the predator: when the non-toxic type learns and exploits the predator's avoidance region, the predator immediately updates and obscures the region. Throughout, Γ = [−1, 1]^7.
Fig. S10: Signal centroid motions as species search for a separating equilibrium. Notice that the motion of the predator and prey 2 slows after a shared equilibrium is found around generation 250. On the other hand, the motion of prey 1 remains high until it joins the equilibrium signal. Samples of signals are shown in (b), where the top row shows the mean signals for (prey 1, prey 2, predator) at generation 500. Notice that prey 2 and predator are similar but prey 1 is not. In the bottom row the mean signals are shown again for generation 1300; notice that prey 1 (purple) transitions and all signals look alike. Throughout, the signal space is Γ = [−1, 1]^7.
Fig. S11: A signaling system oscillator. In a system with one toxic, one non-toxic and one predatory type an oscillation is observed. The system is initialized from a babbling equilibrium. The sequence continues in search, followed by the toxic type securing a separating equilibrium. From this equilibrium, the nourishing non-toxic type invades as a Batesian mimic. This partial pooling equilibrium does not appear to be dynamically stable, as these invasions seem to eject the convention in favor of a babbling equilibrium.
Fig. S12: A detailed view of the Memetic Oscillator. In (a) the main progression can be seen, but also of interest are the corresponding centroid path movements in (b). We observe that Batesian invasion seems to precipitate an increase in predator path movement, thus suggesting urgency of return to babbling.
Fig. S13: Repulsion to invasions. In system (0, 2) predators must stabilize their rewards by preventing the entry of non-toxic types into an avoidance region reserved for toxic types. The dynamics are adversarial: non-toxic mimics probe for a predator's vulnerability, and the predator adjusts its defenses to prevent being exploited. The constant probing and evasion is illustrated by the cosine distance plotted in (a). The corresponding reward functions, which drive replication, are plotted over the same time in (b). Throughout, the signal space is Γ = [−1, 1]^7.
Fig. S14: Stability of a Müllerian ring will depend on the Batesian invasion. The signaling system (a) first finds a separating equilibrium with toxic type 1 and predator (formed around time step 100). Next, toxic type 2 joins to create a partial pooling ring (around time step 330). The ring is stable until (near time step 800) it is invaded by the non-toxic type. The ring appears destabilized but neither broken nor fully restored, until finally reaching a full babbling equilibrium (near time step 1700). The corresponding reward functions over the same time are plotted in (b). Throughout, the signal space is Γ = [−1, 1]^7.
Fig. S15: In system (1, 2) predators must stabilize their rewards by preventing the entry of one or both of the non-toxic types into an avoidance region reserved for the toxic types. The dynamics are adversarial, with slight moments of reprieve when the predator can lock to the toxic organisms' signal. However, these signal locking epochs are not long lived, due to invasion from non-toxic types. The cosine distance of encounters are plotted in (