Food-exchanging Norway rats apply the direct reciprocity decision rule rather than copying by imitation

The decision rule of direct reciprocity states that an individual helps someone who previously helped them. An alternative explanation for observations of reciprocal exchanges of help is copying by imitation. Norway rats, Rattus norvegicus, are known to exchange food and allogrooming reciprocally among social partners. We asked whether this behaviour is based on copying by imitation or the application of the direct reciprocity decision rule. We tested Norway rats in a sequential food-pulling paradigm. To assess whether focal rats help according to the direct reciprocity decision rule, we predicted that focal rats should be less helpful to partners from an experimental defection or self-pulling treatment than to partners that previously helped them in a cooperation treatment. To assess whether focal rats help partners by copying via imitation, we predicted that focal rats should be more helpful to partners that previously pulled more often than to partners that had pulled less often. The experimental design involved experience phases consisting of three treatments and three sessions per treatment for each experimental subject, in which a partner operated a stick-pulling apparatus providing food to the focal rat or only for themselves. This was followed by a test phase in which the focal rat could help the partners. Focal rats gave less help to partners from the defection or self-pulling treatments than to previously cooperative partners, and latency to the first help by focal rats was longer for partners from the self-pulling treatments than for cooperative partners, results that are consistent with the direct reciprocity decision rule. Focal rats did not give more help to partners that pulled more often, which is not consistent with copying by imitation. Hence our results are consistent with the direct reciprocity decision

Reciprocity refers to an apparently cooperative behaviour that benefits the recipient of the help at a cost to the actor and increases the probability of the actor of the cooperative behaviour receiving help in return from the same or different partners (Carter, 2014; Taborsky et al., 2016; Trivers, 1971). If a member of a mutual relationship involving iterated occasions to help a partner is uncooperative, it will receive less help in return in the future, which prevents exploitation. Conditions for reciprocity to evolve include that (1) helping costs for the actor should be low, (2) benefits for the receiver should be high, and (3) the probability of receiving help in return for help given should be high (Trivers, 1971). At the proximate level of cooperation, the mechanism of direct reciprocity implies the decision rule of an individual helping someone who has previously helped them (Trivers, 1971). Evolutionarily stable cooperation is easier to establish if the exchange of help is concurrent; however, there is often a delay between help given and help received (van Doorn et al., 2014). For direct reciprocity to work in sequential encounters, individual recognition and memory of the outcomes of past interactions with a specific individual are required (Kettler et al., 2021; Stevens & Hauser, 2004). Evidence supporting direct reciprocity has been reported, among others, in humans (Bartlett & DeSteno, 2006; Rand & Nowak, 2013; Trivers, 1971), vampire bats, Desmodus rotundus (Carter et al., 2020; Carter & Wilkinson, 2013a; Wilkinson, 1984), brown capuchin monkeys, Cebus apella (de Waal, 2000; de Waal & Brosnan, 2006; Suchak & de Waal, 2012), and Norway rats, Rattus norvegicus (Delmas et al., 2019; Dolivo & Taborsky, 2015a, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth, Stieger, et al., 2017;, 2018a, 2018b; Wood et al., 2016). A comprehensive review is provided in Taborsky et al. (2021).
Male and female wild-type Norway rats help same-sex conspecifics in accordance with direct reciprocity, providing more help to previously cooperating partners than to defecting partners (Dolivo & Taborsky, 2015a, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth & Taborsky, 2018a, 2018b; Wood et al., 2016). Females also allogroom according to direct reciprocity, and their allogrooming patterns depend on their partner's previous cooperation level and relative rank, and it may involve the exchange of different commodities (Schweinfurth & Taborsky, 2018a; Stieger et al., 2017). Female Norway rats help according to the quality of help they received (Dolivo & Taborsky, 2015a) and their partner's need (Schneeberger et al., 2012, 2020; Schweinfurth & Taborsky, 2018c). In experiments testing whether rats return food donations of social partners reciprocally, a sequential experience test paradigm has been developed (Rutte & Taborsky, 2008). The experience phase is a period in which focal rats may receive help from partners; the consecutive test phase is a period in which focal rats can decide to help partners, which may be contingent on the partner's behaviour in the preceding experience phase (Dolivo & Taborsky, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth & Taborsky, 2018a, 2018b). Direct reciprocity in female Norway rats is mainly based on the outcome of the most recent encounter with a specific partner, independent of the last interaction preceding the test, as shown in a series of experience phases with different partners (Kettler et al., 2021). This resembles tit-for-tat with a time delay between help received and given of up to 4 days (Kettler et al., 2021). These results highlight that Norway rats meet the required cognitive demands of direct reciprocity (Kettler et al., 2021).
An alternative explanation to direct reciprocity for the manifestation of mutual aid is copying, that is, copying what another individual does (Whiten et al., 2009). Copying involves imitative and emulative social learning processes (Ashley & Tomasello, 1998; Byrne, 2002; Hoppitt & Laland, 2013; Tennie et al., 2006; Whiten et al., 2009). Imitation is defined as (1) action learning by copying the motor patterns used by the demonstrator to achieve its goal (Tennie et al., 2006) or as (2) copying the form of an action (Whiten et al., 2009). Imitation in the broader, everyday sense includes actions in an individual's own general repertoire (Whiten et al., 2009). In previous studies of reciprocal cooperation in rats, both focal and partner rats were trained to use the same provisioning mechanism, that is, pulling a stick fixed to a movable platform that slides into the cage and provides a food reward to another rat, and the donor does not receive a food reward whenever it pulls (Dolivo & Taborsky, 2015a, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012;, 2018b, 2018c). These studies did not experimentally test for reciprocity resulting from copying but experimentally excluded copying by (1) imitation, (2) production imitation (i.e. after observing a demonstrator perform a novel action or a novel sequence or combination of actions, none of which are in its own repertoire, an observer then becomes more likely to perform that same action or sequence of actions; Hoppitt & Laland, 2013), and (3) the emulative learning process of object movement re-enactment (i.e. copying the form of a caused object movement; Whiten et al., 2009). In one study, rats were trained to use two provisioning mechanisms to provide help that differed in shape and form, and they used different provisioning mechanisms in the experience and test phases.
In another experiment, rats were trained on the same provisioning mechanism to provide help, that is, food rewards, and they reciprocally traded food rewards for allogrooming and vice versa (Schweinfurth & Taborsky, 2018a). Since help was provided by using different means, these studies did not test (1) whether the copying of actions can directly affect the rats' helping decisions (, 2018a), and (2) whether focal rats copy when there is an opportunity to copy, such as when focal rats can help partners that used the same provisioning mechanism to acquire food merely for themselves, that is, by 'self-pulling'. To rule out copying by imitation in the broader, everyday sense, which includes actions in an individual's own general repertoire (Whiten et al., 2009), we need to experimentally distinguish between direct reciprocity and copying by imitation. Depending on context, an individual copying the behaviour of a social partner may thereby reciprocate received help, that is, reciprocation of received help may be a side-effect of a general copying mechanism. To check whether reciprocal exchanges are merely due to the copying by imitation of a social partner's actions or whether other processes are involved, experiments are required that allow us to distinguish between these alternatives.
Here we assessed whether female wild-type Norway rats helped their partners according to the direct reciprocity decision rule. Given this mechanism, we predicted that (1) rats should provide more help to a partner that had previously helped them (i.e. 'cooperation') than to a partner that had defected (i.e. a partner that could not pull since no stick was available) and (2) the latency to the first pull should be shorter for a partner that had previously helped them than for a partner that had defected. To distinguish between the direct reciprocity decision rule and copying by imitation, we ran an experiment with three experience phase treatments in which each treatment consisted of three 7 min sessions, as follows: (1) partners that helped focal rats, that is, focal rats received a food reward for each pull by their partners ('cooperative treatment'); (2) partners that helped themselves to food rewards by pulling, without focal rats receiving anything (the number of self-pulls was experimentally limited to the number of pulls partners performed for focal individuals in the cooperative treatment, 'limited self-pulling treatment'); and (3) partners that helped themselves to food rewards by pulling an unlimited number of times, again without focal rats receiving anything ('unlimited self-pulling treatment'). After each session of each experience treatment, focal rats could provide help to their partner during 7 min test phases, that is, each focal rat experienced nine experience and test phases. To test whether the experimental design worked appropriately, we predicted that unlimited self-pulling partners should pull more often than both limited self-pulling partners and cooperative partners in the experience phase. 
To test the hypothesis that focal rats help partners according to the decision rule of direct reciprocity, we predicted that (1) focal rats should provide less help, by pulling a stick, in the test phase to limited and unlimited self-pulling partners than to cooperative partners (Fig. 1a), and (2) the latency to the first help by focal rats in the test phase should be longer for limited and unlimited self-pulling partners than for cooperative partners (Fig. 1b). To test the hypothesis that focal rats pull a stick according to copying by imitation, we predicted that focal rats in the test phase should pull a stick more often and the latency to the first pull should be shorter for partners that had pulled more often than for partners that had pulled less often in the experience phase, regardless of who had received the food reward (Fig. 1c and d).
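The two sets of predictions for the number of pulls can be written down as a small sketch. The treatment labels, function names and the example values are hypothetical and serve only as illustration; the check compares treatment means directly and does not replace the statistical models used in the study. Latency predictions run in the opposite direction and are not covered here.

```python
# Illustrative encoding of the predictions for the number of pulls by focal rats.
# All names and the example numbers are hypothetical, not taken from the study.
TREATMENTS = ("cooperative", "limited_self", "unlimited_self")


def predicted_help_level(hypothesis):
    """Relative predicted helping level per treatment (higher = more pulls)."""
    if hypothesis == "direct_reciprocity":
        # Only the cooperative partner previously helped the focal rat.
        return {"cooperative": 1, "limited_self": 0, "unlimited_self": 0}
    if hypothesis == "imitation":
        # Only the number of pulls the focal rat observed matters.
        return {"cooperative": 0, "limited_self": 0, "unlimited_self": 1}
    raise ValueError(f"unknown hypothesis: {hypothesis}")


def is_consistent(observed_pulls, hypothesis):
    """True if every predicted 'more help' ordering holds in the observed means."""
    level = predicted_help_level(hypothesis)
    return all(
        observed_pulls[a] > observed_pulls[b]
        for a in TREATMENTS
        for b in TREATMENTS
        if level[a] > level[b]
    )


# Hypothetical mean pulls per treatment, for illustration only.
example = {"cooperative": 12.0, "limited_self": 7.0, "unlimited_self": 6.0}
print(is_consistent(example, "direct_reciprocity"))
print(is_consistent(example, "imitation"))
```

With these made-up means the ordering matches the direct reciprocity prediction and not the imitation prediction.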

Housing Conditions
Thirty-four female, outbred, wild-type Norway rats (source: Animal Physiology Department, University of Groningen, Netherlands) were used. We drove the rats to the Ethologische Station Hasli of the University of Bern, Switzerland, where the study took place. To visually distinguish individuals, upon arrival the rats were marked by ear punches, which caused light momentary bleeding at the ear. If blood was visible after the ear-punching procedure, we stopped the bleeding by gently pressing on the ear with a paper tissue for 10 s, after which the bleeding stopped.
Sisters were housed together in groups of three to six individuals per cage. Housing cages (80 × 50 cm and 37.5 cm high) were separated from each other by opaque walls to limit interactions between groups. They contained litter, hay, a wooden shelter, a tunnel, paper toys and a wooden block. Conventional rat pellets and water were provided daily ad libitum. Grain mix was additionally provided three times a week, and fresh fruits or vegetables were provided twice a week. We performed daily health checks.
The rats were habituated to handling from weaning onwards and did not show signs of stress during rearing and all experimental stages. Rats were handled regularly to keep them habituated to the experimenters. The average ambient temperature was 20 ± 1 °C, and the relative humidity ranged from 50% to 60%. We used a daily, reversed 12:12 h light:dark cycle, and the white light was turned on at 2000 hours with 30 min of dusk and dawn. All stages of the experiment were conducted under red light during the dark phase of the daily cycle, because Norway rats are primarily nocturnal (Barnett, 1963; Calhoun, 1963; Norton et al., 1975) and cannot perceive the colour red due to a lack of red light receptors (Yokoyama & Radlwimmer, 1998, 2001). A pilot study and the experiment to test for direct reciprocity were run with 21 of the 34 rats. The experiment to distinguish between the direct reciprocity decision rule and social learning mechanisms was run with 30 of the 34 rats. The four additional rats did not pass the social pulling training criteria. The mean mass and age of rats were 248 ± 3 g and 143.7 ± 0.5 days, respectively. The permanent ear punches and temporary tail markings were used in combination to visually identify individual rats in the experiments. The tail markings (i.e. lines and dots) were applied weekly with a black felt pen.

Ethical Note
We followed the ASAB/ABS Guidelines for the treatment of animals in research. The licence to perform animal experiments was provided by the Swiss Federal Veterinary Office of the Canton of Bern (licence number BE 55/18) to M.T. and a ticket for indispensable research was provided by the University of Bern (ticket number EAC-201216-T#212) to M.T.

Pre-experimental Set-up
The pre-experimental set-up followed the methods developed by Rutte and Taborsky (2008), which were derived from de Waal and Berger's (2000) two-player sequential food exchange task. Test cages (80 × 50 cm and 37.5 cm high) were divided into two compartments by a wire mesh. The training protocol consisted of two training procedures. The first was a solo pulling procedure in which rats were trained to pull a stick fixed to a movable platform that would slide into the cage, which gave the rat access to a food reward, an oat flake. Solo pulling sessions lasted 7 min, and the criterion to classify a rat as a successful solo puller was ≥50 pulls/7 min on two consecutive sessions. On average, rats participated in 10 sessions. The second procedure was social pulling in which, over 22 sessions, rats were paired with a partner to learn how to operate a food donation paradigm. The donor could pull without itself having access to a food reward, while the recipient received the food reward without being able to pull.
Subsequently, the roles were switched. Throughout the training period, paired partners were cage mates to ensure familiarity and avoid perturbing effects associated with rats being unfamiliar to each other. The paired partners were in separate compartments of a test cage, and the delay between interchanging the roles of the partners increased from one food donation (the partners interchanged roles after each food donation) to a 24 h delay across the sessions of the training period (see Appendix Table A1). To move to the next session, both partners had to donate food at least five times. We classified rats as high, medium and low cooperators based on the number of pulls across the last 10 sessions. We used the same social pulling training procedure (sessions 1–18 as described in Table A1) as used in several previous studies of the direct reciprocity decision rule of wild-type Norway rats (Dolivo & Taborsky, 2015a; Gerber et al., 2020; Schweinfurth & Taborsky, 2016). From sessions 9 to 18, rats classified as showing a high, medium or low cooperative propensity pulled an average of 13.1 times, 9.1 times and 7.8 times in 7 min, respectively. We ran four additional 7 min training sessions with a 24 h delay between interchanging partner roles to validate the individual pulling propensity of rats by comparing the average number of pulls during sessions 17 and 18 with the number of pulls in sessions 19–22 (see Appendix). From sessions 17 to 22, rats classified as showing a high, medium or low cooperative propensity pulled an average of 10.4 times, 6.8 times and 4.9 times in 7 min, respectively.

Figure 1. Predictions to assess whether rats help their partners because (a, b) they are following a direct reciprocity decision rule or (c, d) they are copying by imitation. The three experience phase treatments were (1) 'cooperative', that is, the partner had previously provided help, (2) 'limited self-pulling', that is, the partner had pulled food for itself as often as the cooperative partner had pulled for the focal subject (self limited; the number of pulls was limited by the experimenter and not by the partner's choice), and (3) 'unlimited self-pulling', that is, the partner had pulled for food rewards for itself as often as it wanted (self unlimited). Predictions are for (a, c) the number of pulls by focal rats and (b, d) the latency to the first pull by focal rats in the test phase based on the treatment in the experience phase. The whiskers represent the envisaged 95% confidence intervals. An asterisk marks a predicted significant difference.

Test of the Direct Reciprocity Decision Rule
The direct reciprocity experiment was designed as a sequential iterated Prisoner's dilemma (Axelrod & Hamilton, 1981; Nowak & Sigmund, 1994). The experimental cage consisted of two compartments divided by wire mesh, and each rat was in a separate compartment. The experience phase is a period when focal rats may receive help from partners, and the test phase follows the experience phase and is a period when focal rats can give help to partners (Dolivo & Taborsky, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth & Taborsky, 2018a, 2018b). In both the experience and test phases, the rats pulling the stick did not receive food rewards. Both the experience and test phases lasted 7 min, and there was a 30 min delay between them (Fig. 2a). There were two experimental treatments: (1) a 'cooperation treatment', in which the focal rat received help (i.e. a food reward) each time a partner pulled the stick, and (2) a 'defection treatment', in which the partner could not pull since no stick was available and the focal rat received no help (i.e. no food reward). Each rat had two experience sessions with a cooperative partner, and the same partner was used for both sessions. A cooperative partner pulled a stick fixed to a movable platform that would slide into the cage. The focal rat received one food reward (an oat flake) for each pull by a cooperative partner. Each of the 21 rats acted as both a focal rat and a partner. Each rat also had two experience sessions with a partner that defected, that is, a partner that could not pull since no stick was available and the platform was blocked, and again the same partner was used for both sessions. To avoid carryover effects, the dyads between the cooperative and defection sessions were different, and the focal and partner rat roles were not interchanged during the defection sessions: for example, when rat A was the focal rat, its defecting partner was rat B; the latter had rat C as its defecting partner instead of rat A. Experimental partners were pseudorandomly assigned to each focal rat, avoiding (1) previous experience between pairs and (2) cage mates. Each focal rat had all four sessions on subsequent days. The experience treatments and the test order of rats were randomized. Focal and partner rats were potential donors, that is, partners in the experience phase and focal rats in the test phase, only once per day. We recorded the number of pulls and the latency to the first pull.

Figure 2. Experimental set-up following Rutte and Taborsky (2008). (a) For the direct reciprocity decision rule experiment, focal rats during the experience phase either received help from a cooperative partner or did not receive help from a defecting partner. The cooperative partner, partner 1, pulled on a stick, which was attached to a platform with a food reward, to donate food to the focal rat. The defecting partner, partner 2, could not pull on the stick to donate food to the focal subject, because there was no stick and the platform was blocked. Following a 30 min delay, the focal rat was then enabled to donate food to the partner. Focal rats experienced one treatment per day. There were two sessions per treatment, and each focal subject was tested in both treatments in a randomized sequence. (b) For the direct reciprocity versus copying by imitation experiment, the experience phase consisted of three treatments: (1) a focal rat receiving help from cooperative partner 1 ('cooperative treatment'), (2) partner 2 pulling food for itself as often as the cooperative partner had pulled food for the focal rat, but without donating food to the focal rat ('limited self-pulling treatment') and (3) partner 3 pulling food for itself as often as it wanted until the end of the experience phase ('unlimited self-pulling treatment'). For both the limited and unlimited self-pulling treatments, only the partner received the food reward during the experience phase, whereas in the cooperative treatment only the focal rat received food during the experience phase. Following a 30 min delay, the focal rat was then enabled to donate food to the partner. There were three sessions per treatment, and each focal subject was tested in all three treatments. The sequence of cooperative and unlimited self-pulling treatments was randomized.

Direct Reciprocity Decision Rule Versus Copying
For the experiment distinguishing between the direct reciprocity decision rule and copying by imitation, the experience phase of 10 focal rats consisted of three sessions, each with three treatments: (1) focal rats receiving help in the form of food rewards each time partner 1 pulled the stick ('the cooperative treatment'); (2) partner 2 helping itself by pulling the stick and eating the food rewards without the focal rat receiving food rewards, and the pulling stick was removed once partner 2 had pulled as often as the cooperative partner 1 had pulled before for the focal rat ('the limited self-pulling treatment'); and (3) partner 3 helping itself by pulling the stick and eating the food rewards as often as it could until the end of the experience phase, again without the focal rat receiving food rewards ('the unlimited self-pulling treatment'; Fig. 2b). Each pull of the stick provided one oat flake as a food reward, and which rat received the food reward depended on the treatment as outlined above. Following a 30 min interval after the experience phase, the focal rat was given access to the stick to pull food for its partner in the test phase (Fig. 2b). The cooperative and unlimited self-pulling experience phases and test phases lasted 7 min, whereas the limited self-pulling experience phase lasted until the limited self-pulling partner pulled as often as the cooperative partner had pulled, that is, usually for less than 7 min. Focal rats only received food rewards during the cooperative experience phases, and partners only received food rewards during the test phase. We recorded the number of pulls and the latency to the first pull. Each focal rat was tested in all three treatments. Eight high cooperators and two medium cooperators were chosen as focal rats, and they were paired with eight low cooperators and two medium cooperators as cooperative partners (see the Appendix for the classification of cooperator levels). 
Previous studies of reciprocity in Norway rats also used high cooperators as focal rats (Rutte & Taborsky, 2008; Schneeberger et al., 2012;, 2018b, 2018c). We added two medium cooperators to increase the sample size of the focal rats. Ten additional rats trained for the solo and social pulling procedures (six and four rats with high and medium cooperative propensities, respectively) were chosen as limited self-pulling and unlimited self-pulling partners. The experiment was designed as a sequential iterated Prisoner's dilemma. Partners 1, 2 and 3 remained the same for all three sessions per treatment. The order of treatments was pseudorandomized into three rounds of the three treatments for a total of nine sessions of experience and test phases. The limited self-pulling treatment in each round occurred after the cooperative treatment, because information about the number of pulls by the cooperative partner was needed to determine when the self-pulling activity had to be terminated. The sequences of the unlimited self-pulling and cooperative treatments were randomized within each round. Focal rats and partners were selected from the training stock, pilot study and experiment to test for direct reciprocity, thereby avoiding (1) previous experience between members of experimental dyads and (2) cage mates. Focal subjects and their partners pulled only once per day. The experiment was designed and run between December 2020 and January 2021.
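The ordering constraint described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name, the treatment labels and the choice to let the limited self-pulling treatment fall at any position after the cooperative treatment within a round are assumptions.

```python
import random

# Hypothetical treatment labels for illustration.
TREATMENTS = ("cooperative", "limited_self", "unlimited_self")


def make_schedule(rng, n_rounds=3):
    """Return n_rounds treatment orders for one focal rat.

    Within each round the cooperative and unlimited self-pulling treatments
    are shuffled, and the limited self-pulling treatment is placed at a random
    position after the cooperative treatment (assumption: any later position),
    so that the cooperative partner's pull count is already known.
    """
    schedule = []
    for _ in range(n_rounds):
        order = ["cooperative", "unlimited_self"]
        rng.shuffle(order)
        coop_idx = order.index("cooperative")
        # Valid insertion slots lie strictly after the cooperative treatment.
        slot = rng.randint(coop_idx + 1, len(order))
        order.insert(slot, "limited_self")
        schedule.append(order)
    return schedule


rng = random.Random(2020)  # fixed seed only to make the sketch reproducible
for round_no, order in enumerate(make_schedule(rng), start=1):
    print(round_no, order)
```

Three rounds of three treatments give the nine experience and test sessions per focal rat.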

Statistical Analyses
For the experiment to test the direct reciprocity decision rule, we first conducted a generalized linear mixed model with a Poisson distribution and a log-link function with the number of pulls by the 21 focal rats in the test phase as the response variable, and the treatment in the experience phase, that is, cooperation or defection, as a fixed effect. The identities of the focal rats and partners were each included as random intercept effects. The residuals were not overdispersed. Second, we conducted a Cox proportional hazard model with the latency to the first pull by focal rats in the test phase as the response variable, treatment in the experience phase as a fixed effect, and the identity of focal rats as a random intercept effect (Klein & Moeschberger, 2003; Landes et al., 2020). The proportional hazards assumption of the Cox proportional hazard model was met.
To compare the number of pulls by partners in the experience phase, we conducted a generalized linear mixed model with a Poisson distribution and a log-link function with the number of pulls by partners in the experience phase as the response variable and treatment in the experience phase as a fixed effect. The identities of partners and focal rats were each included as random intercept effects. The residuals were not overdispersed.
For the experiment to distinguish between the direct reciprocity decision rule and copying by imitation, we first conducted a generalized linear mixed model with a Poisson distribution and a log-link function with the number of pulls by the focal rat in the test phase as the response variable, and treatment in the experience phase and session (categorical variable) as fixed effects. The identity of the focal rats and partners were each included as random intercept effects. The residuals were not overdispersed. Second, we ran a linear mixed model with a Gaussian distribution with the log of the latency to the first pull as the response variable, and treatment (categorical variable) and session (categorical variable) as fixed effects. The identities of the focal and partner rats were included as random intercept effects. We log-transformed the latency to the first pull, so the model residuals were normally distributed.
To compare the limited self-pulling and unlimited self-pulling treatments, and to compare sessions 2 and 3, we performed post hoc comparisons for the models with Bonferroni correction for P value adjustment to account for multiple testing. The alpha corrected for multiple comparisons was 0.025. All means and coefficients are reported with standard errors or 95% confidence intervals (CI), and an alpha of 0.05 was chosen. All statistical analyses were performed with R (R Core Team, 2020), and the packages lme4 (Bates et al., 2015), lmerTest (Kuznetsova et al., 2017), survival (Therneau, 2021), effects (Fox & Weisberg, 2018), multcomp (Hothorn et al., 2008) and tidyverse (Wickham et al., 2019) were used.
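As a worked illustration of the Bonferroni correction, a corrected alpha of 0.025 corresponds to dividing an alpha of 0.05 by two comparisons. The sketch below assumes two comparisons per family; the function name and the P values are hypothetical, and the study's analyses were run in R, not with this code.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the Bonferroni-corrected alpha and the adjusted P values.

    The corrected threshold is alpha / m for m comparisons; equivalently,
    each P value is multiplied by m (capped at 1) and compared with alpha.
    """
    m = len(p_values)
    corrected_alpha = alpha / m
    adjusted = [min(1.0, p * m) for p in p_values]
    return corrected_alpha, adjusted


# Two hypothetical post hoc P values, for illustration only.
corrected_alpha, adjusted = bonferroni([0.01, 0.04])
print(corrected_alpha)  # threshold applied to the unadjusted P values
print(adjusted)         # adjusted P values, compared with alpha = 0.05
```

Both views give the same decision: an unadjusted P value below the corrected alpha is the same as an adjusted P value below 0.05.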
The latency to the first pull by focal rats was 18.20 s longer for limited self-pulling partners (raw data for the latency to the first pull for limited self-pulling partners: 48.43 ± 7.00 s; predicted log-transformed latency to the first pull for limited self-pulling partners: 3.58 ± 0.23 s, 95% CI = 3.13–4.03 s) than for cooperative partners (raw data for the latency to the first pull for cooperative partners: 30.23 ± 4.63 s; predicted log-transformed latency to the first pull for cooperative partners: 3.01 ± 0.23 s, 95% CI = 2.56–3.46 s; limited self-pulling versus cooperative: estimate ± SE = 0.56 ± 0.23, P = 0.019; Fig. 4). The latency to the first pull by focal rats was 49.17 s longer for unlimited self-pulling partners (raw data for the latency to the first pull for unlimited self-pulling partners: 79.40 ± 16.73 s; predicted log-transformed latency to the first pull for unlimited self-pulling partners: 3.75 ± 0.23 s, 95% CI = 3.30–4.20 s) than for cooperative partners (unlimited self-pulling versus cooperative: estimate ± SE = 0.74 ± 0.23, P = 0.002; Fig. 4). The latency to the first pull by focal rats did not differ for limited and unlimited self-pulling partners (post hoc comparison: estimate ± SE = −0.18 ± 0.23, P = 0.73, 95% CI = −0.73–0.37; Fig. 4). These results are consistent with rats helping according to the direct reciprocity decision rule, whereas they are not consistent with the predictions of copying by imitation. The latency to the first pull by focal rats was longer (22.53 s) in session 3 than in session 1 (estimate ± SE = 0.51 ± 0.23, P = 0.03). The latency to the first pull by focal rats did not differ between session 2 and session 1 (estimate ± SE = 0.21 ± 0.23, P = 0.37) and between session 3 and session 2 (post hoc comparison: estimate ± SE = 0.30 ± 0.23, P = 0.41, 95% CI = −0.25–0.85).

Figure 3. The number of food donations by focal rats in the test phase to previously cooperative, limited self-pulling and unlimited self-pulling partners. There were 10 focal rats, and each was part of nine test phases, that is, three test phases per treatment. The central dots and whiskers represent the predicted values and the 95% confidence intervals. The black dots represent the raw data of food donations by focal rats in each of the three treatments. **P < 0.01; ***P < 0.001.

Figure 4. The latency to the first pull (s) by focal rats in the test phase for previously cooperative, limited self-pulling and unlimited self-pulling partners. The treatments in the experience phase included cooperative, limited self-pulling and unlimited self-pulling partners. The latency to the first pull was log-transformed, so the model residuals were normally distributed. There were 10 focal rats, and each was part of nine test phases, that is, three test phases per treatment. The central dots and whiskers represent the predicted values and the 95% confidence intervals. The black dots represent the raw data of the log-transformed latency to the first pull by focal rats in each of the three treatments. *P < 0.05; **P < 0.01.
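The reported latency differences can be reproduced from the raw means, and the predicted log-scale values can be back-transformed for orientation. The sketch below uses only the means quoted in the text and assumes that the log transformation was the natural logarithm, which the text does not state. Under that assumption the back-transformed values are on a geometric-mean scale and are therefore expected to be lower than the raw arithmetic means.

```python
import math

# Raw mean latencies (s) and predicted log-transformed latencies from the text.
raw_mean = {"cooperative": 30.23, "limited_self": 48.43, "unlimited_self": 79.40}
log_pred = {"cooperative": 3.01, "limited_self": 3.58, "unlimited_self": 3.75}

# Differences of raw means relative to the cooperative treatment.
raw_diff = {
    k: round(v - raw_mean["cooperative"], 2)
    for k, v in raw_mean.items()
    if k != "cooperative"
}

# Back-transformation, assuming natural logs (an assumption of this sketch).
back_transformed = {k: math.exp(v) for k, v in log_pred.items()}

print(raw_diff)
for treatment, seconds in back_transformed.items():
    print(f"{treatment}: {seconds:.1f} s")
```

The raw differences match the 18.20 s and 49.17 s given in the text.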

DISCUSSION
In the first experiment, focal rats gave more help in the test phase to previously cooperative partners than to previously defecting partners, which supported our first prediction for the direct reciprocity decision rule and is consistent with previous results (Delmas et al., 2019; Dolivo & Taborsky, 2015a, 2015b; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth, Stieger, et al., 2017;, 2018a, 2018b; Wood et al., 2016). The latency to the first pull by focal rats in the test phase did not differ between previously cooperative partners and previously defecting partners, which did not support our second prediction for the direct reciprocity decision rule. This result is consistent with previous studies that used wild-type Norway rats as the model system, which supported the direct reciprocity decision rule but did not find a difference in the latency to the first help provided to cooperative or defecting partners (, 2018a). This study reinforces previous results that supported the direct reciprocity decision rule in Norway rats (Delmas et al., 2019; Dolivo & Taborsky, 2015a, 2015b; Li & Wood, 2017; Rutte & Taborsky, 2008; Schneeberger et al., 2012;, 2018b, 2018c). However, these previous studies did not experimentally test for the social learning processes of copying by imitation. The direct reciprocity decision rule is 'help someone who previously helped you'. In the experience phase of the second experiment, the cooperative partners but not the limited and unlimited self-pulling partners helped the focal rats. In the test phase, focal rats helped limited and unlimited self-pulling partners, which had pulled food for themselves, less often than cooperative partners, and their latency to the first help was longer for self-pulling partners. These results are consistent with our predictions of the hypothesis that focal rats apply the direct reciprocity decision rule when choosing to help a social partner.
This result again reinforces previous results that supported the hypothesis that Norway rats apply the direct reciprocity decision rule (Dolivo & Taborsky, 2015a, 2015b; Li & Wood, 2017; Rutte & Taborsky, 2008; Schneeberger et al., 2012; Schweinfurth & Call, 2019;, 2018a, 2018b). These previous studies and the results of this study showed that focal rats also provide some help to partners from whom they did not receive help, for example defecting and self-pulling partners, although focal rats give significantly more help to partners from whom they received help than to partners from whom they did not. Focal rats learned during the social pulling training that they only receive food when a partner pulls, and the focal rats did not go to the food tray to attempt to get food when they themselves pulled on the stick during the test phases. As such, focal rats are apparently not pulling for previous defectors in the test phases just in case they might receive food for themselves, for example via reinforcement learning. Direct reciprocity in a tit-for-tat scenario, as previous studies suggested applies in Norway rats (e.g. Kettler et al., 2021;, requires some general helping propensity (or 'generosity') to get a cooperation chain started. This might be the reason why animals selected to apply the direct reciprocity decision rule may sometimes behave generously even if they have not experienced previous help.
In contrast to our predictions for copying by imitation, focal rats did not give more help to partners that pulled more often (unlimited self-pulling partners) than to partners that pulled less often (cooperative and limited self-pulling partners), and the latency to the first pull also was not shorter for partners that pulled more often (i.e. unlimited self-pulling partners) than for partners that pulled less often (i.e. cooperative and limited self-pulling partners). Hence these results were not consistent with the hypothesis that focal rats help social partners through copying by imitation. If focal rats in our study had copied the action of the demonstrator (the partners) or the form of an action after observing what another individual does, they should have pulled the stick more often for partners that had pulled more often during the experience phase, that is, in the unlimited self-pulling treatment, than for cooperative partners and for limited self-pulling partners that pulled as often as cooperative partners. More generally, if focal rats were imitating social partners irrespective of whether these were helping themselves or helping the focal rats, focal rats should have helped cooperative and limited self-pulling partners indiscriminately. Yet, neither of these predictions was supported in this study.
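The contrast between the two hypotheses can be made explicit with a minimal sketch. The pull counts below are hypothetical placeholders, not data from this study; they were chosen only to respect the design, in which limited self-pulling partners pulled as often as cooperative partners and unlimited self-pulling partners pulled more often. The names `Treatment`, `reciprocity_score`, `imitation_score` and `most_helped` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Treatment:
    name: str
    partner_pulls: int          # hypothetical pull count in the experience phase
    focal_received_food: bool   # did the partner's pulling feed the focal rat?


# Hypothetical values that only mirror the ordering imposed by the design.
TREATMENTS: List[Treatment] = [
    Treatment("cooperative", 10, True),
    Treatment("limited self-pulling", 10, False),
    Treatment("unlimited self-pulling", 25, False),
]


def reciprocity_score(t: Treatment) -> int:
    """Direct reciprocity: help depends on whether help was received."""
    return 1 if t.focal_received_food else 0


def imitation_score(t: Treatment) -> int:
    """Copying by imitation: help depends on how often the partner pulled."""
    return t.partner_pulls


def most_helped(score: Callable[[Treatment], int]) -> List[str]:
    """Return the treatment(s) predicted to receive the most help."""
    best = max(score(t) for t in TREATMENTS)
    return sorted(t.name for t in TREATMENTS if score(t) == best)


print(most_helped(reciprocity_score))
print(most_helped(imitation_score))
```

Under these assumptions the two rules single out different partners as the ones that should receive the most help, which is why the three treatments can separate the hypotheses.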
Unlike previous studies, our study was designed to test for copying by imitation. Previous studies have not tested whether focal rats help partners that used the same provisioning mechanism to acquire food merely for themselves, that is, self-pulling. In a previous study, focal Norway rats learned to help their partners by using two different devices, pulling a stick or pushing down a lever, prior to the study . In this previous study, focal Norway rats helped their partners by using a different device in the test phase than their partners had used in the experience phase, which excluded the possibilities of copying by imitation, that is, copying what others do, and of the emulative learning process of object movement re-enactment . This ability was also demonstrated in domestic dogs, Canis familiaris (Gfrerer & Taborsky, 2018). Rats have also been shown to exchange different commodities, which again cannot be explained by copying (Schweinfurth & Taborsky, 2018a). Nevertheless, previous studies have not explicitly tested for the possibility that copying may be involved when there is an opportunity to copy. The potential for the exchange of the same commodities to be accounted for by social learning processes, such as copying, rather than by applying direct reciprocity decision rules has yet to be tested in other models of the reciprocal exchange of food, such as brown capuchins (de Waal & Berger, 2000; de Waal & Brosnan, 2006; Parrish et al., 2015; Suchak & de Waal, 2012) and common vampire bats (Carter et al., 2020; Carter & Wilkinson, 2013a, 2013b; Wilkinson, 1984).
Copying and the direct reciprocity decision rule are not always distinct decision rules, for example when there is no learned instrumental task. If an individual were to give goods or services, such as food or allogrooming, to a previously helpful individual simply by copying the actions of the individuals from whom it received similar goods or services, copying and direct reciprocity would both be involved. 
The tit-for-tat strategy in principle involves both copying and reciprocity, since copying a partner's last action, that is, cooperate or defect, results in direct reciprocity (Axelrod & Hamilton, 1981). Copying is one of several cognitive processes that can lead to reciprocity; however, alternative cognitive processes underlying reciprocity should be investigated in future studies.
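The overlap between copying and reciprocity in tit-for-tat can be illustrated with a minimal sketch of an iterated game. This is an illustrative toy model, not the analysis used in this study; the function names (`tit_for_tat`, `always_defect`, `always_cooperate`, `play`) and the choice of an initial cooperative move are assumptions made for the example.

```python
from typing import Callable, List, Tuple

COOPERATE, DEFECT = "C", "D"
Strategy = Callable[[List[str]], str]


def tit_for_tat(partner_history: List[str]) -> str:
    """Cooperate on the first move, then copy the partner's last action."""
    return partner_history[-1] if partner_history else COOPERATE


def always_defect(partner_history: List[str]) -> str:
    return DEFECT


def always_cooperate(partner_history: List[str]) -> str:
    return COOPERATE


def play(a: Strategy, b: Strategy, rounds: int) -> Tuple[List[str], List[str]]:
    """Play `rounds` simultaneous moves; each player sees only the other's past moves."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    for _ in range(rounds):
        move_a = a(hist_b)
        move_b = b(hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return hist_a, hist_b


# Tit-for-tat withdraws help from a defector after one unreciprocated first move,
# and keeps helping a cooperator.
print(play(tit_for_tat, always_defect, 4)[0])
print(play(tit_for_tat, always_cooperate, 4)[0])
```

In this toy model, copying the partner's last action and reciprocating received help produce the same behaviour, which is why an experiment needs partners whose actions and helpfulness are decoupled to tell the two apart.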

Conclusions
First, we ran an experiment to test for the direct reciprocity decision rule. Focal rats gave more help to previously cooperative partners than to previously defecting partners, which supported this rule. Second, we ran an experiment to distinguish between direct reciprocity and copying by imitation. Our results are most consistent with the hypothesis that Norway rats apply the direct reciprocity decision rule when providing help to a social partner. Focal rats helped limited and unlimited self-pulling partners, which had pulled food only for themselves, less often than cooperative partners, which had helped the focal rats receive food. The latency to the first help by focal rats was also longer for the self-pulling partners than for the cooperative partners. In addition, focal rats did not give more help to partners that had pulled more often. Consequently, our results are not consistent with copying by imitation and further strengthen the conclusions of previous studies that Norway rats apply the direct reciprocity decision rule when helping each other in a sequential iterated Prisoner's Dilemma game.

Author Contributions
S.C.E. and M.T. conceptualized and designed the study. S.C.E. ran the study and collected the data from video files. S.C.E. performed the statistical analyses. S.C.E. and M.T. wrote the manuscript. M.T. supervised the research project.

Data Availability
The R script and data files are included in the Supplementary material.

Declaration of Interest
The authors do not have conflicts of interest.

Two rats were paired to learn how to operate a food donation paradigm. The paired rats were cage mates in sessions 1–18 and noncage mates in sessions 19–22. Each time a rat pulled on a stick its training partner received a food reward (an oat flake). The paired partners were in separate compartments of a test cage, and the roles of donor and recipient were interchanged. The time between interchanging the roles of the partners increased with the sessions. The table is adapted from Dolivo and Taborsky (2015a) and Kettler et al. (2021). In session 1, individual A is in compartment 1 (L; left) and pulls first (F), and individual B is in compartment 2 (right) and pulls second (see the scheme in Fig. 2). Once A pulls once and B eats the food reward, the roles are interchanged, and B pulls once and A eats the food reward. The roles are interchanged for 14 min. In session 2, B is the first one to pull. Sessions 1 and 2 last 14 min. In sessions 3 and 4, the positions of the rats are exchanged, such that A is in compartment 2 and B is in compartment 1, and both sessions last 14 min. In session 3, A pulls first and pulls twice before the roles are interchanged. In session 4, B pulls first and pulls twice before the roles are interchanged. The identity of the first puller, the position of the first puller and the duration of the sessions are the same in sessions 5 and 6 as in sessions 1 and 2, but rats pull four times for their partner before the roles are interchanged. From session 7 to session 18, the rats pull for a given amount of time rather than a fixed number of pulls. In sessions 7 and 8, the sequence is as follows: (1) the first rat to pull can donate food for 4 min, (2) the partner pulls for 4 min and (3) the first rat to pull can donate food for another 4 min; for example in session 7, A pulls first for 4 min, B then pulls for 4 min, and then A pulls again for 4 min. This sequence is reversed for session 8, in which B pulls first. Sessions 7 and 8 both last 12 min. From session 9 to 18, each rat can pull for 7 min before the roles are interchanged. The interchange of roles in sessions 13–22 is not immediate as in the previous sessions 1 to 12; instead, the first puller has to wait 1 h (sessions 13 and 14), 3 h (sessions 15 and 16) and 24 h (sessions 17–22), respectively, before its partner has the opportunity to provide food in return. During the delay periods in sessions 13–22, the rats are transferred to their housing cage.

Figure A1. Test of the direct reciprocity decision rule. The number of pulls by rats to cooperative or defecting partners in the test phase is the response variable. The central dots and whiskers represent the predicted values and the 95% confidence intervals. The black dots represent the raw data of pulls by rats. *P < 0.05.

Figure A2. Validation of the experimental design. The number of pulls by cooperative, limited and unlimited self-pulling partners in the experience phase is the response variable. The central dots and whiskers represent the predicted values and the 95% confidence intervals. The black dots represent the raw data of pulls by partners. ***P < 0.001.