The Role of Redundant Information in Cultural Transmission and Cultural Stabilization

Redundant copying has been proposed as a way to achieve the high fidelity necessary to pass on and preserve complex traits in human cultural transmission. There are at least two ways to define redundant copying. One refers to the possibility of repeatedly copying the same trait over time, and the other to the ability to exploit multiple layers of information pointing to the same trait during a single copying event. Using an individual-based model, we explore how redundant copying (defined in the latter way) helps to achieve successful transmission. We show that increasing redundant copying increases the likelihood of accurately transmitting a behavior more than either augmenting the number of copying occasions across time or boosting the general accuracy of social learning. We also investigate how different cost functions, deriving, for example, from the need to invest more energy in cognitive processing, affect the evolution of redundant copying. We show that populations converge either to high-fitness/high-cost states (with high redundant copying and complex culturally transmitted behaviors; resembling human culture) or to low-fitness/low-cost states (with low redundant copying and simple transmitted behaviors; resembling social learning forms typical of nonhuman animals). This outcome may help to explain why cumulative culture is rare in the animal kingdom.

Redundancy is a feature of many natural and artificial systems, such as the genetic code, natural languages, or modern computer networks. In information theory, redundancy is defined as the difference between the quantity of information used to transmit a message and the quantity actually conveyed by the message (Shannon, 1948). The presence of redundancy poses a fundamental trade-off (Krakauer & Plotkin, 2002): As it involves the production and the processing of information that is not always necessary, redundancy is costly; therefore a perfect storing/transmission mechanism would not benefit from it. However, because all transmission systems are, up to a certain degree, error-prone, redundancy can be useful to overcome errors in both transmission and storage. In the DNA, for example, deleterious effects of negative mutations can be avoided as long as different genes contribute to the same function, or when different copies of the same gene are present in the genotype (Tautz, 1992).
In this article we examine the role of redundancy in social learning, and its effect on the transmission and maintenance of cultural traditions. Recently, it has been argued that the existence of stable animal traditions conflicts with the fact that their individual-level processes of transmission seem not to be faithful enough to sustain them (Claidière & Sperber, 2010; Tennie, Call, & Tomasello, 2009). However, in the case of animal traditions, stability may largely be the result of various constraints that limit the number of possible alternative behaviors (Acerbi, Jacquet, & Tennie, 2012; Claidière & Sperber, 2010). Many behaviors can be seen as "latent solutions," resulting from interactions of genetic setup, ontogeny, and environmental factors, with (low-fidelity) social learning playing a role in influencing their frequencies. Models and experiments have indeed shown that simple social learning mechanisms, such as local and social enhancement, are sufficient to support the existence of some forms of animal traditions (Franz & Matthews, 2010; Matthews, Paukner, & Suomi, 2010).
Redundancy in social transmission can be obtained in different ways. One of the advantages of selective, as opposed to random, copying is, for example, that it may allow individuals to repeatedly attempt to copy the same variant (compare also Byrne, 1999), counteracting errors produced by low-fidelity transmission mechanisms. Various social learning strategies (general-purpose heuristics orienting individuals toward who, when, and what to copy; see Laland, 2004; Rendell et al., 2011) have, among others, the effect of increasing the probability of encountering the same behavior, thus creating more opportunities to copy it. Boyd and Richerson (1985) pointed out how a conformist bias, that is, a disproportionate tendency to copy the most common cultural variants in a population, likewise leads individuals to encounter the same behavior repeatedly.
Larger groups, or denser social networks, may have a similar effect (though see Pradhan, Tennie, & van Schaik, 2012, for a critique of this approach). Mathematical and simulation models have shown how cumulative culture may be linked to population size. Whereas small groups are supposedly unable to sustain complex cultures, large groups, which contain enough individuals, prevent cultural innovations from deteriorating or disappearing (Henrich, 2004; Powell, Shennan, & Thomas, 2009). One effect of large groups is that individuals can potentially have access to more examples of the same behavior (Sterelny, 2011). Recent laboratory experiments support these results (Derex, Beugin, Godelle, & Raymond, 2013; Kempe & Mesoudi, 2014; Muthukrishna, Shulman, Vasilescu, & Henrich, 2014).
What characterizes all these examples is that redundancy is the possibility to repeatedly learn over time, by copying the same behavior from several models, and/or from several repetitions by the same model across time. Byrne and Russon (1998) have also pointed out how repeated demonstrations may further help achieve copies by allowing observers to form a hierarchy of the demonstrated behavior (so-called program-level imitation). Although we do not intend to dispute these former insights, we would like to add a new perspective to the debate (in the future, we may explore how our model relates to, or perhaps even extends, earlier proposals such as program-level imitation). Indeed, another complementary way to think about redundancy in social learning refers to the possibility of simultaneously accessing different, but equivalent, layers of information at the same copying event, that is, at the same time (see also Heyes, 2013; Tennie, 2009; Tennie et al., 2012). We call this redundant copying, and we additionally, and tentatively, postulate that only humans are engaged in this type of copying (see Discussion).
As described in the observational learning literature (e.g., Call & Carpenter, 2002), individuals can potentially access different layers of information simultaneously when copying. Examples of different information layers that may be acquired are (Call & Carpenter, 2002) actions (the particular actions or techniques with which a behavior is performed, e.g., a power-transmitting grip of a twig with both arms), results (environmental effects, e.g., the resulting breaking of a twig), goals (as in the likely external goal of an action, e.g., that the twig should break-or to produce smaller twigs), as well as explicit language descriptions ("I am breaking this twig"), and potentially others conveyed through active teaching (e.g., by molding a pupil's hands to break a twig).
Up until now, the literature on observational learning has mainly concentrated on describing the effects that any particular information layer in isolation can have on copying performances. For example, action copying can preserve behavioral sequences that would be lost if individuals only copied results. In some cases, such as the learning of arbitrary gestural tools for communication, precise action copying is necessary for social learning, because key elements of behavior could not otherwise be acquired (Moore, 2013). More generally, action copying may be the only way to acquire behaviors when feedback on results is difficult to access or not available (Acerbi, Tennie, & Nunn, 2011; Tennie et al., 2012). However, although we acknowledge the importance of this point, here we are not interested in distinguishing between the specific information that is transmitted, but rather in modeling what happens if several information layers, when considered equivalent, are exploited at the same time (a possibility briefly mentioned by Heyes, 2013, and Tennie et al., 2012). This allows us to abstract from the details of the content and to analyze how the mere availability of additional and redundant information, in each copying event, could enhance cultural transmission.
Below, we first describe a simple model that simulates learning "sessions" in which naïve observers are paired with knowledgeable demonstrators. In these sessions they attempt to learn behaviors represented by sequences of discrete steps. We explore how "redundant copying" (i.e., the ability to exploit multiple layers of different, but equivalent, information during a single copying event) increases the probability of learning a behavior correctly more than other strategies, assuming, realistically, that individuals' working memory is subject to decay (for a recent comparison of working memory in primates and humans, see Read, 2008). Showing how, in an idealized situation, redundant copying may increase the effectiveness of social transmission, we are then able to consider a more complex scenario, with evolving populations of individuals and different cost functions for redundant copying (costs that, e.g., may derive from the need of additional brain tissue to allow simultaneous processing of information). We analyze under which conditions redundant copying can be expected to evolve, and the consequences for the complexity of the population's "culture." We conclude by discussing how our results suggest that the evolution of the ability to exploit multiple layers of redundant information might be one of the hallmarks of human high-fidelity cultural transmission and stabilization.

Methods
In this model, target behaviors were shown by one demonstrator (who performed a perfect demonstration and who was not affected by any interactions) and were observed by one fully naïve observer. Behaviors were modeled as sequences of discrete and independent steps. These steps are independent because acquiring a correct step at a certain position in the sequence had no effect on the probability of also correctly acquiring the subsequent steps. The first relevant parameter of our model was thus the length of the sequence (n). Intuitively, long sequences represent more complex behaviors, and short sequences represent simpler behaviors. A learning event is only considered successful if the observer has correctly acquired all steps of a sequence.
For illustrative purposes, it is useful to think of concrete examples to explain our rationale. Nut cracking in chimpanzees is considered by some a cultural behavior (Boesch, 2012). If this is the case, a subject would have to learn the correct sequence to meaningfully use this combination of tools. An incomplete sequence (say, placing the nut on the anvil without hitting it) would not give any advantage to the subject, and neither would a sequence in the incorrect order (say, hitting the anvil before placing the nut). Or let us assume a relatively complex behavior typical of (some) human cultures (like tying a Windsor knot for a tie). This is a complex behavior (i.e., a sequence), where one needs to correctly perform several subbehaviors (i.e., steps) before the Windsor knot (i.e., success) appears. Note that, even if one performs 90% of the correct actions to produce a Windsor knot, the result will have little resemblance to the intended knot (i.e., no success).
This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Going back to our model description, social learning is error prone: a further parameter of the model (a) indicated the accuracy of transmission, implemented as the probability that a given step of the sequence was copied correctly.
The amount of information gathered by the observer was further influenced by the number of learning occasions (l), meaning that the demonstrator "performed" the sequence l times, and the observer attempted to copy and reproduce it each time.
Finally, another parameter (r) determined the number of layers of information an observer was capable of accessing in a copying event. Usually, in models of social transmission, observers try to copy the demonstrated sequence by acquiring one layer of information in each demonstration (in our model this is the case when r = 1; i.e., these individuals were not redundant copiers). In this model, an increase of the redundant layers of information allowed the observer to simultaneously acquire several "replicates" of the same sequence at once. If r = 2, for example, the observer had one additional layer of redundant information on the demonstrated sequence (i.e., a replicate of the sequence; this individual was a redundant copier). In this situation, an error in a step of a sequence can be "fixed," as long as the same step is correct in another replicate of the sequence.
It is important to note that the layers of information we model here are closely related to each other (this fact enables the concept of redundant copying in the first place). Thus, we envision each single layer of information as fine-grained. Let us take again the Windsor knot as an example, and let us focus on the first two steps in the sequence of producing such a knot. Here, the tie is A) first placed around the neck, and then B) crossed in front. To further illustrate our approach, the steps A and B can be translated as follows. The action layer of information would code A and B as something like this: A) grasp tie with both hands equally apart and then lift arms over head and release, B) move arms to front and grasp object, then cross arms. The results layer of information would code A and B as something like this: A) object moves centrally over head and then comes to rest over neck, B) object's loose fronts are crossed and touch. Similar descriptions would hold for all the necessary steps of tying the knot. Note also that these layers of information are indeed very different, but they are also in important ways equivalent. This underlying reasoning applies to each information layer that we model. Note also the fine-grainedness of information: here we are not interested in "results information" as in end-state information (a frequent use of the term). Our use of the term results information is thus close to the type of information underlying the subtype of results emulation called object movement reenactment, for which there is good empirical evidence (Huang & Charman, 2005, provide a nice overview of emulation types discussed in the literature).
A simplified pictorial representation of the basic model is reported in Figure 1.
Figure 1. A simplified pictorial representation of the basic model. The observer aims to copy a sequence composed of 4 steps (n = 4). (a) The observer is not a redundant copier, that is, the observer can access only a single layer of information during each observation (r = 1), and here it has two learning occasions, that is, two consecutive attempts to copy the sequence (l = 2). As transmission is error prone, the observer cannot reproduce the correct sequence. (b) The observer is a redundant copier, and thus can access two layers of information during each observation (r = 2). In this case, even though the transmission is again error prone, the observer can fix the error that occurred during this first observation.
Simulation procedures. We used behavioral sequences of four different lengths (n = 2, 4, 8, 16), and we analyzed the effect of variation of a single parameter, while keeping the other parameters fixed (see Table 1). We designed three sets of simulated learning sessions varying (a) the accuracy of transmission, (b) the number of layers of information, and (c) the number of learning occasions. Each condition was replicated 10^6 times, and we collected the proportion of successful copying events. The simulations were run with the Matlab software package; R was used for data analysis and visualizations. All original data are available upon request.
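As an illustration only, the dyadic learning session described above can be sketched in Python (the original simulations were run in Matlab, and this is not the authors' code). The sketch assumes that a step is acquired when at least one of the r layers transmits it correctly, and that a memoryless observer succeeds if any one of its l occasions yields a fully correct sequence; both are readings of the text rather than confirmed implementation details. All function names are hypothetical.

```python
import random


def copy_once(n, a, r, rng):
    """Return True if all n steps are acquired in a single copying event.

    Assumption: each step is observed through r equivalent layers, and the
    step counts as acquired if at least one layer is copied correctly
    (each layer is correct with probability a).
    """
    for _ in range(n):
        if not any(rng.random() < a for _ in range(r)):
            return False
    return True


def session_success(n, a, r, l, rng):
    """Memoryless observer: success if any of the l occasions is fully correct."""
    return any(copy_once(n, a, r, rng) for _ in range(l))


def success_rate(n, a, r, l, replicates=100_000, seed=0):
    """Proportion of successful learning sessions over many replicates."""
    rng = random.Random(seed)
    hits = sum(session_success(n, a, r, l, rng) for _ in range(replicates))
    return hits / replicates


def expected_rate(n, a, r, l):
    """Closed form implied by the independence assumptions stated above."""
    per_event = (1 - (1 - a) ** r) ** n
    return 1 - (1 - per_event) ** l
```

Under these assumptions, `success_rate(4, 0.5, 2, 1)` (one extra layer) can be compared with `success_rate(4, 0.5, 1, 2)` (one extra occasion) to reproduce the kind of contrast examined in the three sets of simulated sessions.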

Results
Our results showed that, as expected, when the complexity of the sequence increased (i.e., when the behavior became more complex), performance decreased (see Figure 2).
Observers had three alternative options to counteract this drop in performance: increase the number of layers of information they exploited (Figure 2, left panel), increase the accuracy of transmission (Figure 2, center), or increase the number of learning occasions (Figure 2, right). All three options resulted in better copying, but redundant copying, that is, increasing the number of layers of information exploited, was more effective than increasing the number of learning occasions (compare left and right panel). Increasing learning occasions was effective for short sequences but not for longer sequences, where the positive effect disappeared.
Although it is possible to directly compare the effect of increasing redundant copying to the effect of increasing the number of learning occasions (adding one layer of information is equivalent to adding one learning occasion), there is no obvious way to compare the effects of increased accuracy of transmission (central panel) and increased redundant copying (left panel). However, on visual inspection the effect of redundant copying appears more robust, especially for long sequences. A very high accuracy of transmission (a = 0.9), for example, enabled observers to acquire only about half of the 16-step sequences that a scenario with a = 0.5 and three levels of redundancy permitted.
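The "about half" comparison can be checked by hand under the independence assumptions of the model. The closed form below is a derivation from the Methods rather than a formula stated in the original text: it assumes a step is acquired when at least one of the r layers is copied correctly, a single learning occasion, and that "three levels of redundancy" corresponds to r = 4.

```latex
\begin{align*}
P(n, a, r) &= \left[ 1 - (1 - a)^{r} \right]^{n} \\
P(16,\ 0.9,\ 1) &= 0.9^{16} \approx 0.185 \\
P(16,\ 0.5,\ 4) &= \left( 1 - 0.5^{4} \right)^{16} = 0.9375^{16} \approx 0.356 \\
\frac{P(16,\ 0.9,\ 1)}{P(16,\ 0.5,\ 4)} &\approx 0.52
\end{align*}
```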
It is important to note that, in this model, an increase in redundant copying was more effective than an increase in learning occasions because we assumed (to keep the model simple) that individuals had no memory. When individuals were provided with memory, however, the performances of individuals with an equal number of learning occasions and layers of information were equivalent only under perfect memory, that is, with no memory decay or memory corruption at all, which is an unrealistic assumption. We ran the same model for a subset of parameters (a = 0.5; r = 1; l = 4; i.e., 50% accuracy of transmission, no redundancy, and four learning occasions), for individuals provided with a realistic type of memory. Here, a further parameter m determined the decay of the memory (the probability that, at each learning occasion, a memorized step will be retained, with m = 1 equivalent to a perfect memory). Figure 3 shows the results of this model. Comparing these results with the matching condition with no memory and redundancy (Figure 2, left panel, r = 4; consider again that four learning occasions are equivalent to three levels of redundancy), only perfect memory (m = 1) produced the same performances for redundant and nonredundant learners.
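The effect of memory decay can be illustrated with a short Python sketch. It is a hypothetical reading of the memory model, not the original implementation: it assumes that steps decay independently, that decay is applied at the start of each learning occasion, and that any step not currently held in memory can then be acquired with probability a (with r = 1). The function names are illustrative.

```python
def step_known_probability(a, l, m):
    """Probability that one step is held in memory after l occasions (r = 1).

    Assumption: at each occasion a memorized step is first retained with
    probability m; a step that is not in memory is then acquired with
    probability a.
    """
    p = 0.0
    for _ in range(l):
        retained = p * m                    # step survived decay
        p = retained + (1 - retained) * a   # otherwise it may be acquired now
    return p


def success_with_memory(n, a, l, m):
    """Probability that all n independent steps are known after l occasions."""
    return step_known_probability(a, l, m) ** n
```

With m = 1 the per-step probability reduces to 1 - (1 - a)^l, so four learning occasions match a single occasion with r = 4; any m below 1 gives a lower value, in line with the pattern described for Figure 3.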

Methods
In a second set of simulations, we considered populations of interacting individuals (N = 100) and explored the evolution of redundant copying. The start of simulations was as simple as possible: at the beginning, the number of layers of information accessible (r) was equal to 1 for all individuals (no redundancy), and only a single simple sequence (i.e., behavior), consisting of one step (n = 1), was present in the population. Furthermore, this sequence was present in only one individual. To save computational time (and to prevent unrealistic results) the maximum number of accessible layers was fixed to 4, and we also used this as a "stopping rule" for the simulations (simulations were terminated when the population reached an average r > 3.5; otherwise they ended at T = 200,000 time steps).
Payoffs. A payoff P was associated with each sequence by randomly extracting a value from a normal distribution with a mean equal to the length of the sequence (n) and a fixed standard deviation equal to 1. This simply means that more complex sequences tend to be associated with higher payoffs, but that there is also a stochastic component, so that different sequences of the same complexity (i.e., length n) may have different payoffs (even though, on average, they have a payoff equivalent to their n). This stochastic component was included in the model to add more realism, but results are qualitatively the same if the payoff P was equal to the length of sequence n (data not shown).
Copying process. At each time step, individuals were ranked according to their sequence's payoff and a fraction (10% of the population) of highest ranked individuals was chosen to act as demonstrators. We implemented here a copy-successful-individuals strategy (Laland, 2004), as it is generally considered effective in cultural evolution studies (see, e.g., Mesoudi, 2008; see also introduction). Demonstrators were paired at random with the remaining individuals, and the observers tried to copy the demonstrators' sequences (i.e., their behaviors, as described in the "dyadic learning sessions" model above).
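The payoff draw and the copy-successful-individuals pairing described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab code: ties in payoff are broken by list order, each observer is assigned one demonstrator drawn uniformly at random from the top 10%, and the function names are hypothetical.

```python
import random


def draw_payoff(n, rng):
    """Payoff of a sequence of length n: normal with mean n and SD 1."""
    return rng.gauss(n, 1.0)


def pair_observers_with_demonstrators(payoffs, fraction=0.1, rng=None):
    """Return a dict mapping each observer index to a demonstrator index.

    payoffs[i] is the payoff of the sequence currently held by individual i.
    Assumption: every non-demonstrator observes exactly one demonstrator,
    chosen uniformly at random among the highest-ranked individuals.
    """
    rng = rng or random.Random()
    n_demo = max(1, round(len(payoffs) * fraction))
    ranked = sorted(range(len(payoffs)), key=lambda i: payoffs[i], reverse=True)
    demonstrators = ranked[:n_demo]
    observers = ranked[n_demo:]
    return {obs: rng.choice(demonstrators) for obs in observers}
```

For a population of 100, this yields 10 demonstrators and 90 observers per time step; how individuals without any sequence are ranked is not specified in the text and is left out of the sketch.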
Innovation. At each time step, each individual had a low probability of inventing a new sequence (0.01, meaning that, on average, one new sequence was introduced at each time step), and, when this happened, there was again a low probability (c = 0.05, meaning that, on average, one out of 20 new sequences introduced) that this new sequence was more complex than the one they already performed, becoming one step longer (n + 1).
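A possible sketch of the innovation step is given below. It assumes, since the text does not say otherwise, that a new sequence which is not more complex keeps the same length as the inventor's current sequence; the constant and function names are hypothetical.

```python
import random

INNOVATION_PROB = 0.01  # per individual and time step, as given in the text
COMPLEXITY_PROB = 0.05  # c: chance that a new sequence is one step longer


def innovate(length, rng):
    """Return (new_length, invented) for one individual at one time step.

    Assumption: an innovation that is not more complex has the same length
    as the sequence the individual already performs.
    """
    if rng.random() >= INNOVATION_PROB:
        return length, False      # no innovation at this time step
    if rng.random() < COMPLEXITY_PROB:
        return length + 1, True   # more complex: one step longer
    return length, True           # new sequence of unchanged length
```

In a full simulation, each invented sequence would also receive a fresh payoff drawn as described in the Payoffs paragraph.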
Evolutionary algorithm and fitness calculation. At each time step, one individual randomly chosen from the population "died" and was then replaced by a new individual who inherited her level of copying redundancy from one of the 10% of individuals in the population with the best fitness (see below). With a small probability (0.01) mutations happened, in which case the inherited number of accessible information layers (r) was increased or decreased, with equal probability, by one unit (notice that newborns inherited only the level of acquisition redundancy but not the sequence). Individual fitness was a function of the payoff provided by the sequence expressed by the individual, and of the number of information layers she was capable of accessing. Given that it is not possible to have a clear estimate of the "cost" of redundant copying (e.g., additional brain tissue), we explored two different options: one in which the cost was a linear function of the number of layers that were available to the individual, and one in which the cost was an exponential function. A further parameter (k) scaled the cost of redundant copying such that, for the linear scenario, the fitness of individual i was equal to
$F_i = P_i - k r_i$
and, for the exponential scenario, the fitness of individual i was equal to
$F_i = P_i - r_i^{k}$
where $P_i$ is the payoff associated with the sequence shown by individual i, and $r_i$ is the number of layers of information to which individual i has access.
Note. n = length of the sequence; r = number of layers of information; a = accuracy of transmission; l = number of learning occasions.
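The fitness calculation and the replacement rule can be sketched as follows. The exponential cost is assumed here to take the form r_i^k, which is consistent with the statement that the exponential and linear scenarios coincide when k = 1; the lower bound of 1 on r and the function names are likewise assumptions made for illustration.

```python
import random

R_MIN, R_MAX = 1, 4    # cap of 4 is from the text; the floor of 1 is assumed
MUTATION_PROB = 0.01


def fitness(payoff, r, k, cost="linear"):
    """Fitness of an individual with sequence payoff `payoff` and r layers."""
    if cost == "linear":
        return payoff - k * r
    if cost == "exponential":
        return payoff - r ** k  # assumed form; equals the linear cost at k = 1
    raise ValueError("cost must be 'linear' or 'exponential'")


def newborn_r(fitnesses, layers, rng, top_fraction=0.1):
    """Return r for a newborn inheriting from one of the fittest 10%."""
    n_top = max(1, round(len(fitnesses) * top_fraction))
    ranked = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i], reverse=True)
    parent = rng.choice(ranked[:n_top])
    r = layers[parent]
    if rng.random() < MUTATION_PROB:
        r += rng.choice((-1, 1))        # mutation: one layer more or fewer
    return min(R_MAX, max(R_MIN, r))    # keep r within the assumed bounds
```

The newborn inherits only r; in line with the text, it starts without a sequence and must acquire one through copying or innovation.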
Simulation procedures. We ran two sets of simulations, one for the "linear-costs" scenario, and one for the "exponential-costs" scenario. For each set, three values of k (1, 2, 3) were tested, giving a total of five conditions (note that, if k = 1, the exponential and the linear scenarios are equivalent). Finally, for each condition, we tested levels of accuracy of transmission (a) from 0.1 to 0.9 (using steps of 0.1) and ran 100 replicates for all parameter values (see all parameter values in Table 2). The simulations were run with the Matlab software package; R was used for data analysis and visualizations. All original data are available upon request.
Figure 4 shows an output from the first 10,000 time steps of a typical simulation run (in the exponential cost scenario, with k = 2 and a = 0.5). At the beginning, sequences are simple, and individuals with no redundancy can learn them correctly in approximately half of the cases (see top-left panel). However, as more complex sequences are invented (see bottom-right panel), the proportion of successful transmissions drops (see again top-left panel). Just before T = 2,000, most of the individuals had acquired a first level of redundancy (top-right panel), which led to an increase in the proficiency of transmission (see top-left panel around T = 2,000) and allowed a further increase in the complexity of sequences. The same happens between 5,000 and 6,000 time steps, when a further level of redundancy evolves. Population fitness (bottom-left panel) is the result of a trade-off between more complex sequences (with increasing payoffs) and increasing levels of acquisition redundancy (with increasing costs).

Results
Redundant copying evolved in four cost-conditions out of a total of five (excluding k = 3 exponential, see Figure 5). We considered that redundant copying evolved for runs when the population reached an average r .5 (with a maximum of 4; see the "stopping rule" above).
An interesting aspect of this result is the relationship between the level of accuracy of transmission (which is a parameter of the model and not costly for individuals), and the evolution of redundant copying (see Figure 5).
On the one hand, redundant copying was, in general, more likely to evolve with higher levels of accuracy of transmission. On the other hand, when redundant copying was high-cost (as in the k = 2 exponential cost scenario in Figure 5), redundant copying was less likely to evolve when accuracy of transmission was high. In sum, costly redundant copying tended to evolve for intermediate values of accuracy of transmission.
We conclude by considering the results of one of the cost-conditions in more detail, specifically the k = 2 exponential cost scenario (see Figure 5). This scenario is especially interesting because the cost of redundant copying here is substantial, so that it did not evolve for all the values of accuracy of transmission tested, but it was also not so high that redundant copying could not evolve at all (as in the k = 3 exponential cost scenario). Figure 6 shows the final complexity of culture (measured as the average length of the sequences present in the population when the simulation was stopped). A discontinuity in the final complexity of culture exists between low and high levels of accuracy of transmission. Up to a = 0.3 (i.e., before redundant copying started to evolve, see Figure 5) there was no complex culture present in the population. From a = 0.4 we observe an abrupt increase in the complexity of culture, which then gently keeps increasing alongside the accuracy of transmission.

Discussion
Our simulations provide new insights into the importance of redundant copying (as defined in our introduction) for the transmission and maintenance of cultural traditions. First, we showed in an idealized model (dyadic learning sessions) how the possibility to access and process multiple layers of information simultaneously (redundant copying) was associated with better performance (with individuals being more likely to acquire and correctly perform demonstrated behaviors) compared to alternative ways to improve social learning, such as copying repeatedly over time, or increasing the accuracy of transmission. Our aim in this initial simple model was not to provide a neurologically detailed model of memory and learning, but to show how a largely neglected aspect of the social learning literature, redundant copying, may affect the fidelity of the transmission process.
It may be interesting to relate our results to actual cultural transmission experiments in modern humans. The results of the most relevant studies (transmission chain experiments that track actual cultural evolution in microsocieties) are coherent with our predictions. One study looked at improvements across chains of performances in a simple task: building an effective paper airplane (Caldwell & Millen, 2009). This is a simple task that can be resolved mainly by individual trial and error (especially as most participants will probably have made paper airplanes in their youth), combined with some basic form of social information. In line with our predictions, improvements in this task occurred and were largely independent of the number of "layers of information" to which participants had access. Cumulative improvements even occurred when participants had access to results information alone, perhaps by reminding them of the way these paper airplanes are made. However, more complex tasks (novel to participants and not easily solved by individual learning alone; e.g., Wasielewski, 2014) showed that cumulative improvements were possible only if participants watched full demonstrations (thus including, at the very least, actions and results; i.e., at least one level of redundancy, sensu our model), but that participants failed when presented with the end result alone. In addition, when humans have unrestricted access to different layers of information, they tend, in many contexts, to use all of them, even when they are not actually relevant for the task at hand, including superfluous actions, a phenomenon dubbed overimitation (Lyons, Damrosch, Lin, Macris, & Keil, 2011).
In the second model (evolutionary population model) we explored how evolutionary constraints influence the characteristics of redundant copying. The first main finding was that the ability to access additional layers of information did indeed evolve, at least in some conditions. Especially interesting is the fact that redundant copying seemed to evolve for intermediate levels of accuracy of transmission. This might reflect a trade-off between the cost and the utility of redundant copying: when accuracy of transmission is very low, costly evolution of redundant copying is unlikely, because even if individuals could access multiple layers of information, transmission would remain largely ineffective. When accuracy of transmission is instead high, the marginal gain of adding redundant copying is decreased, preventing the evolution of the costly ability to access further layers of information. In our model, accuracy of transmission was a parameter not under evolution, and memory was kept constant, but future work can explore in more detail their own evolutionary dynamics in a similar scenario. Taken together, our findings indicate that we should expect a coevolutionary trade-off for which species develop the costly machinery for redundant copying only after having developed learning mechanisms for sufficiently accurate transmission fidelity in the first place. This is also reflected in a second aspect of the results of our evolutionary model, namely the discontinuity between conditions in which populations developed a redundant copying-sustained complex culture and the conditions in which they did not. After developing the minimally necessary transmission fidelity, the evolution of redundant copying drastically changed the complexity of the culture evolved. Here it is important to note that it is not the increase of transmission fidelity per se that allowed complex culture but the fact that, at a certain threshold level, it made redundant copying useful, which then kick-started a major leap of cultural evolution. This may provide a justification for the massive gaps we observe between human and nonhuman culture, which are difficult to explain when assuming a gradual increase in the accuracy of transmission alone (e.g., Dean, Kendal, Schapiro, Thierry, & Laland, 2012; Tennie et al., 2012; for opposite opinions see, e.g., Boesch, 2012; Sanz & Morgan, 2009). Therefore, an important specific prediction of our model is that intermediate forms of culture should not exist (thus, there should not be "half-cumulative cultures").
Figure 6. Average sequence length versus accuracy of transmission in the "evolutionary population model" for redundant copying cost-condition k = 2 (exponential).
Our results contribute to the debate on the differences between social learning abilities of humans and other primates. In the case of humans' closest living relatives, the great apes, the current common opinion is that their copying fidelity is generally low (Tennie, Greve, Gretscher, & Call, 2010; see also Claidière & Sperber, 2010) or, at best, medium (Whiten et al., 2009) when compared to human copying fidelity. With respect to our concept of redundant copying, the current best evidence is that primates can, when copying from others, access one information layer only (namely the copying of environmental results; Tennie et al., 2009; see also Tennie et al., 2012). Others claim that, in addition, great apes can and do copy actions, though in that case their performance could not be nearly as good as humans' action copying (Whiten et al., 2009). Finally, it is also possible that great apes may occasionally understand and copy some simple goals (Tennie, Call, & Tomasello, 2010). Thus, there may be some potential for redundant copying in great apes, in the sense that they may be able to track several layers of information (though the crucial tests are still outstanding). However, given their lack of copying fidelity within all or most of these layers of information, our results indicate that great apes' copying fidelity may be too low to have fuelled the evolution of a robust redundant copying system. In contrast, for humans, the fidelity of social learning mechanisms coevolved with the possibility to access different layers of information at each copying event. Teaching (absent in nonhuman great apes; Moore, 2013) can be, as we discussed above, one way to add a layer of information during social learning, and it could have played an important role in stabilizing the human redundant copying system, as argued by several authors (Dean et al., 2012; Pradhan et al., 2012; Tennie et al., 2009).