Introduction

The function of gaze and general body orientation, sometimes strengthened by pointing with an arm and/or a finger, is crucial to human referential communication. Referential communication per se implies “joint attention”: a triadic relationship between an informant, a recipient and a precise object, place, or event on which the attention of the first individual is fixed (Butterworth, 1995). Deictic gesture following, as opposed to merely looking where someone else is looking, implies a certain degree of comprehension of the communicative and referential nature of this gesture. The specific informative gesture has to be identified among all other gestures as indicative of a specific object among other objects. The capacity to follow such gestures is developed quite early in human children (Lempers, 1979; Morissette, Ricard, & Decarie, 1995), and plays a crucial role in language learning.

Numerous studies have been conducted to assess this ability in other species in the last few decades, using the so-called object-choice task. This task tests the ability to use human-given cues to find a hidden reward located in one of two (or more) containers. The question is whether certain animals can follow human pointing gestures to the correct object without any prior formal training, and to what extent the distortion of these cues or the complexity of the situation affects their behavior. Dogs (Canis familiaris) have been reported to be particularly sensitive to human pointing cues in this type of task (Hare, Call, & Tomasello, 1998; Lakatos, Gácsi, Topál, & Miklósi, 2009; Soproni, Miklosi, Lorand, Topal, & Csanyi, 2002). Other domestic species have been successful in exploiting at least some explicit pointing (i.e., with an extended arm): cats (Felis catus; Miklosi, Pongracz, Lakatos, Topal, & Csanyi, 2005), horses (Equus caballus; Maros, Gacsi, & Miklosi, 2008; Proops, Walton, & McComb, 2010), goats (Capra hircus; Kaminski, Riedel, Call, & Tomasello, 2005), and pigs (Sus scrofa domestica; Nawroth, Ebersbach, & von Borell, 2013). Taken together, these results have led some researchers to hypothesize that successful interpretation of human communicative cues arises as a side effect of domestication (e.g., see Agnetta, Hare, & Tomasello, 2000; Hare & Tomasello, 2005; Virányi, Gácsi, Kubinyi, Topál, Belényi, Ujfalussy, & Miklósi, 2008).

However, this hypothesis has been countered by other findings: human-socialized undomesticated individuals, such as wolves (Canis lupus; Udell, Dorey, & Wynne, 2008; Udell, Spencer, Dorey, & Wynne, 2012), African elephants (Loxodonta africana; Smet & Byrne, 2013), dolphins (Tursiops truncatus; Pack & Herman, 2004, 2007; Tschudin, Call, Dunbar, Harris, & van der Elst, 2001), fur seals (Arctocephalus pusillus, Schuemann & Call, 2004) and a gray seal (Halichoerus grypus, Shapiro, Janik, & Slater, 2003) have also been shown to be very successful, some not far from a dog's level of performance. Great apes are generally unskillful (e.g., see Call & Tomasello, 2005; Povinelli, Reaux, Bierschwale, Allain, & Simon, 1997), but recently Lyn, Russel, and Hopkins (2010) showed that chimpanzees and bonobos raised in socio-linguistically rich environments perform much better than chimpanzees raised in standard laboratory housing. These findings suggest a complementary hypothesis: in some species, rich daily interactions with and exposure to human gestural communication can be sufficient for correctly interpreting pointing cues. This could be especially the case for species that have native ability in attending to and exploiting social cues (Smet & Byrne, 2013).

To further test this hypothesis, we investigated the understanding of human pointing gestures in four human-socialized Californian sea lions (Zalophus californianus). Even though other social marine mammals have already been reported to succeed to some degree (Herman, Abichandani, Elhajj, Sanchez, & Pack, 1999; Pack & Herman, 2007; Schuemann & Call, 2004; Shapiro et al., 2003), only dolphins have been tested in object-choice tasks involving more than two targets, and none have been tested with pointing gestures made with unusual body parts (i.e., not associated with any specific commands). The formal training of these four sea lions included several commands involving arm gestures, and two subjects had to be trained to respond to proximal points (i.e., with an extended arm, finger at approximately 30 cm from the target while pointing) at the beginning of the current experiment (see “Pretraining” in the Methods section). The aim of this experiment was to investigate whether they could exploit new pointing gestures in relatively complex situations.

Three studies were conducted, using different object-choice tasks. In Study 1, we tested these sea lions’ ability to spontaneously transfer their response from explicit pointing (i.e., with an extended arm) to variations of the pointing gesture, to select the correct target between two similar objects, placed on either side of the informant. However, here we define the ability to use the referential property of the pointing gesture as the ability to rely on extrapolating precise linear vectors along the pointing arm in order to identify a precise object, instead of a general direction. In the basic version of the object-choice task used in Study 1, it may be possible that the subject would simply choose the object situated on the side of the human at which they saw a protruding body part and/or a movement (Lakatos et al., 2009; Lakatos, Gácsi, Topál, & Miklósi, 2011; Soproni et al., 2002), without any use of the referential property of the pointing gesture. This possibility was addressed by Study 2, where two similar targets were placed along the same line on each side of the informant (resulting in two proximal and two distal targets). Finally, Study 3 tested the sea lions’ response to pointing to a hidden target, in the presence of a visible distractor item.

Four subjects were tested in Study 1, but only three participated in Studies 2 and 3. This was because one of the subjects arrived at the dolphinarium only one month prior to the beginning of the experiment, and required a period of familiarization with this new environment before participating in our experiment. Thus, this subject (Kaï) started participation in the experiment later than the other three subjects (i.e., when they were already into Study 2), and completed Study 1 only.

Study 1

Methods

Subjects

The subjects comprised four male sea lions (Zalophus californianus), housed together at Parc Asterix, France. All were born in captivity, but came from different zoos and had different life experiences prior to coming to Parc Asterix. Santo, Smack, and Gonzo (respectively 6, 5, and 3 years old) arrived at the park 30 months prior to the experiment. They were familiar with training sessions and shows, and knew a wide range of commands. Kaï (2 years old) arrived only one month prior to the beginning of the experiment, and had never before been trained to respond to any human gestural and/or vocal command. None of the subjects had previously participated in any behavioral study. They lived in an enclosure of 125 m², with an outdoor saltwater pool (313 m³), and a set of two indoor habitats (total over 50 m²). They were fed a mixture of capelin, squid, and herring, distributed during their five daily training sessions. Typically, they had to respond correctly to both vocal and gestural commands to be rewarded with a whistle blow and pieces of fish. Caretakers proceeded in such a way that by the end of the session (including experimental sessions), each sea lion had received his planned ration, regardless of his performance.

Procedure

Experimental sessions were conducted as much as possible like usual training sessions. We made the choice not to separate the tested individual from the other subjects. The tested sea lion was simply led to the part of the enclosure where the experimental setup was placed. He could meanwhile keep visual and acoustic contact with his conspecifics, as they participated in routine training sessions with the other caretakers (learning new commands and being trained for shows and medical examinations, without any direct connection with our pointing experiment).

Along the same lines, the informant was not an unfamiliar experimenter, but one of their customary caretakers. Four caretakers participated in the experimental sessions, each with the same animal throughout the experiment. We made sure that all caretakers had been trained beforehand to execute each cue correctly, and strong emphasis was placed on displaying only the gestural and verbal cues defined for each experimental condition throughout each trial, in order to avoid unintentional extra cues.

Two identical and familiar green Frisbees (24 cm in diameter) served as the response targets. They were placed on the ground, at equal distance from the subject (approximately 150 cm), on either side of the subject. Each trial began with the sea lion sitting down at his starting position (represented by a piece of carpet), with the caretaker in front of him between the two Frisbees (approximately 100 cm from both). He/she obtained visual contact with the subject (calling him by name if necessary), demonstrated the cue for 2 s, and then returned to his/her initial position. If the subject did not move at the first cue, the caretaker repeated it every 2 s, up to a maximum of three times. The subject indicated his choice by touching one of the Frisbees with his muzzle. Correct choices were rewarded with a whistle blow and a piece of fish. If the subject made an incorrect choice, no reward was given and he was simply signaled to return to his starting position. The cues were presented in a predetermined semi-random order (14 trials per cue); each Frisbee was pointed to an equal number of times, and the same Frisbee was never pointed to more than twice in a row.
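A predetermined semi-random order satisfying these constraints can be generated programmatically. The following Python sketch is illustrative only and is not the procedure reported by the authors: it assumes that each cue is shown equally often toward each side (seven left, seven right), and the function names `side_sequence` and `make_schedule` are hypothetical.

```python
import random
from collections import Counter

CUES = ["PG", "P", "C", "E", "G", "F"]
SIDES = ["left", "right"]


def side_sequence(n_per_side, max_run, rng):
    """Random left/right sequence with n_per_side of each side and
    no run of the same side longer than max_run."""
    while True:  # retry whenever the random draw reaches a dead end
        remaining = {s: n_per_side for s in SIDES}
        seq = []
        while len(seq) < 2 * n_per_side:
            allowed = [
                s for s in SIDES
                if remaining[s] > 0
                and not (len(seq) >= max_run
                         and all(x == s for x in seq[-max_run:]))
            ]
            if not allowed:
                break  # dead end: start over
            # Favour the side with more trials left to keep the draw balanced.
            weights = [remaining[s] for s in allowed]
            pick = rng.choices(allowed, weights=weights)[0]
            seq.append(pick)
            remaining[pick] -= 1
        if len(seq) == 2 * n_per_side:
            return seq


def make_schedule(trials_per_cue=14, max_run=2, seed=None):
    """List of (cue, side) trials: every cue appears trials_per_cue times,
    half toward each side (an assumption, not stated in the text)."""
    rng = random.Random(seed)
    per_side_per_cue = trials_per_cue // 2
    sides = side_sequence(per_side_per_cue * len(CUES), max_run, rng)
    pools = {}
    for s in SIDES:
        pool = CUES * per_side_per_cue  # cues still to be shown on side s
        rng.shuffle(pool)
        pools[s] = pool
    schedule = []
    for s in sides:
        schedule.append((pools[s].pop(), s))
    return schedule


schedule = make_schedule(seed=0)
print(schedule[:6])
print(Counter(side for _, side in schedule))
```

The side sequence is built first because the run-length constraint applies to the Frisbees rather than to the cues; cues are then dealt onto the left and right positions from shuffled pools.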

Six different cues were presented to the subjects, defined as follows (Fig. 1). In all cases, the caretaker was positioned in front of the subject and looked at him while pointing, unless otherwise specified below:

Fig. 1

Experimental setup and illustration of the six different pointing cues used in Study 1, from the sea lion’s point of view. PG: Point and Gaze; P: Point; C: Cross-body point; E: Elbow point; G: Gaze; F: Foot point.

  • Point and gaze (PG): Caretaker extended his/her ipsilateral arm and index finger toward the target object while turning his/her head to look at it.

  • Point (P): Caretaker extended his/her ipsilateral arm and index finger toward the target object.

  • Cross-body point (C): Caretaker extended his/her contralateral arm and index finger toward the target object.

  • Elbow point (E): Caretaker stuck his/her ipsilateral elbow out toward the target object while keeping the hand behind the back at hip level.

  • Gaze (G): Caretaker turned only his/her head to look at the target object.

  • Foot point (F): Caretaker stood on one foot while pointing at the target object with his/her ipsilateral foot.

The number of trials per session varied between five and 18, depending on the study, the age of the individual, his interest in participating in the training and the current state of the other sea lions’ training sessions. Sessions were conducted until the subject completed the number of trials planned for each study, over a period of two months. Inter-trial time varied from approximately 5 to 30 s, during which the caretaker sometimes gave known and rewarded commands (unrelated to pointing cues) to maintain the sea lion’s motivation and attention.

Pretraining

The subjects had previously been trained to respond to several commands that included arm gestures. Only two of these commands involved an extended arm pointing toward a general direction. The first was known by all of the subjects, who had been trained to return to the water when the caretaker extended his/her arm in the direction of the pool, saying “water.” The second was known only by Santo, who had been trained to return a floating toy when the caretaker extended his/her arms in the direction of the water and formed a square with his/her two hands. Two other commands involved an extended arm oriented toward the sea lion: with the palm open, while saying “stay,” to make him sit down and wait at a specific place, and with a finger pointed toward the animal, while saying “shy,” to make him cover his muzzle with his flipper. The subjects had not been trained on any other gestural command including an extended arm and, more specifically, had never been trained to select (e.g., by touching with the muzzle) a precise object indicated by an extended arm and pointed finger. Therefore, considering their previous experience of commands involving arm gestures, all described above, the main aim of this study was not to test their ability to follow gestures to a general direction; it was to assess their ability to spontaneously generalize their response when other body parts of the caretaker were used to emit the pointing cue (Study 1), and to follow it to a precise object rather than merely in a general direction (Studies 2 and 3).

Accordingly, the initial session was conducted as follows: if the subject spontaneously gave a correct response to the first pointing gesture, the session continued as planned (i.e., with the six cues presented in a predetermined semi-random order). If he did not, training sessions with proximal pointing were conducted until the subject gave eight correct responses in a row (i.e., moving to the pointed object and touching it with his muzzle). Training sessions with proximal pointing were conducted in the same setup as Study 1 (as described above), except that the caretaker was on his/her knees, with the tip of his/her finger at approximately 30 cm from the target while pointing.

Data analysis

All trials were recorded by digital camera for post-session review and analysis. Performances were analyzed individually, using nonparametric statistical tests only. Binomial tests were conducted for each individual to determine if he performed better than chance when using a particular cue. Permutation tests were used to compare latencies of choices between the different cues (latency was measured from when the caretaker began to demonstrate the cue until the subject touched the Frisbee). To test the hypothesis that the sea lions learned to respond correctly to each cue across trials, the numbers of correct trials in the first and the second half of the trials (seven trials each) for each cue were compared using Fisher’s test. Additionally, the response of each individual on the first trial for each one of the six cues was recorded. Binomial tests were conducted to determine for each individual whether he performed better than chance on these first trials (six in total).
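The binomial and Fisher’s tests described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the authors’ analysis code: the text does not state whether the binomial tests were one- or two-sided, so the one-sided (upper-tail) version used here is an assumption, the function names are hypothetical, and the 2 × 2 table in the usage lines contains made-up counts.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): one-sided binomial test p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_for_alpha(n, p=0.5, alpha=0.05):
    """Smallest number of correct trials whose one-sided p-value is <= alpha."""
    for k in range(n + 1):
        if binom_upper_tail(k, n, p) <= alpha:
            return k
    return None


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# 14 trials per cue at 50 % chance: how many correct choices are needed?
print(min_correct_for_alpha(14))
# Six first trials, all correct.
print(binom_upper_tail(6, 6))
# Hypothetical halves: 7/7 correct in the first half, 5/7 in the second.
print(fisher_exact_two_sided(7, 0, 5, 2))
```

With only seven trials per half, Fisher’s test has little power, so a non-significant result is weak evidence against learning; the correct first-trial responses carry much of the argument.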

Results and discussion

Spontaneous responses to an explicit pointing gesture were tested first. Santo and Smack responded correctly on the very first trial, by moving to the pointed target and touching it with their muzzle. In contrast, Gonzo and Kaï both attempted to respond by executing previously learned commands (specifically by touching the caretaker’s pointing hand with their muzzle, or by executing the “shy” behavior, signaled by the caretaker’s finger pointed toward the sea lion); both therefore needed two training sessions with proximal pointing to reach the learning criterion (as described above, in the Pretraining section), while Santo and Smack moved directly on to the test sessions.

Figure 2 presents the percentage of correct responses in test sessions, for each sea lion, across the six pointing cues. All subjects performed at or near ceiling level with the Point cues, with or without gaze (binomial tests: p < .001 in both cases, with 50 % chance). Three of our four subjects also performed beyond chance with the cross-body point (C), elbow point (E), gaze (G), and foot point (F) cues (ps ≤ .05). However, one subject, Gonzo, failed to respond successfully to the cross-body point (C), and another one, Smack, failed to respond to the three cues that did not involve the arm (i.e., G, E, and F). Table 1 presents the response on the very first trial, and the number of correct trials in the first and second half of the trials for each subject. For each correctly interpreted cue (i.e., except for Smack and Gonzo for the cues cited above), the subject’s very first choice was correct. Santo and Kaï performed beyond chance level over the six first trials (ps < .05 for both), but Smack and Gonzo did not (ps > .13). There was no significant difference in the number of correct responses between the first and the second half of the trials (Fisher’s tests (N = 14), ps > .19 in all cases).

Fig. 2

Bars represent the percentage of correct responses for each sea lion across the six pointing cues used in Study 1: Point and Gaze (PG), Point (P), Cross-body point (C), Elbow point (E), Gaze (G), and Foot point (F). The cues were presented in a predetermined semi-random order (14 trials per cue for each subject). Curves represent their mean latencies (starting when the caretaker began to produce the cue and ending when the subject touched the Frisbee). Dotted lines represent chance level (50 %). *p ≤ .05, **p ≤ .01, ***p ≤ .001, binomial tests.

Table 1 Number of correct trials in the first and second half of the trials (seven trials each) for Study 1, and in the first and last ten-trial blocks for Studies 2–3, and response in the very first trial (underlined value: correct; non-underlined value: incorrect), across conditions for each subject (Santo, Smack, Gonzo, Kaï).

Latencies (Fig. 2) were shorter with PG, P, and C than with the less conspicuous cues (G, E, and F). Santo's latencies were significantly longer with F compared to all pointing gestures involving the arm; with E versus PG and C; and with F versus PG and P (permutation tests: ps < .05 in all cases). Smack's latencies were longer with G than PG, P, C, and E (p < .05 for C, ps < .01 for the others). It took Gonzo longer to choose with G versus PG, P (ps < .05) and F (p < .01). It took Kaï longer with F compared to PG, P and C (ps < .01). During these intervals, the sea lions typically stayed at their starting place, or moved very slowly, making gaze alternations between the caretaker and the Frisbees. As their performances show, the sea lions were able to interpret variations of the pointing gesture correctly, but sometimes hesitated and waited for cue repetition before acting.
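A permutation test of the kind used for these latency comparisons can be sketched as follows. The text does not specify the test statistic or the number of permutations, so this Python example assumes a two-sided Monte Carlo test on the difference in mean latency; the function name `permutation_test` is hypothetical and the latency values are invented purely to show the call.

```python
import random


def permutation_test(x, y, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.

    Returns the proportion of random relabelings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_x = len(x)
    observed = abs(sum(x) / n_x - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign latencies to the two cues
        new_x, new_y = pooled[:n_x], pooled[n_x:]
        diff = abs(sum(new_x) / len(new_x) - sum(new_y) / len(new_y))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented latencies in seconds for two cues (NOT data from the study).
latency_point = [2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 2.0]
latency_foot = [4.8, 5.1, 4.5, 5.3, 4.9, 5.0, 4.7]
print(permutation_test(latency_point, latency_foot))
```

Because the test only reshuffles the observed values, it makes no assumption about the shape of the latency distribution, which suits small per-subject samples.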

Study 2

Methods

Subjects

Subjects included three of the four previous participants: Santo, Smack, and Gonzo.

Procedure

The general procedure was the same as in Study 1, except that instead of two targets, there were four. We used a similar design to that of Morissette et al. (1995) and Lakatos et al. (2011). Four identical Frisbees were placed in front of the caretaker in such a way that the distances between each Frisbee and the tested subject were the same (approximately 1.5 m). In the starting position, the caretaker stood behind the two inner Frisbees (hereafter called “proximal” targets), while the subject faced him behind the two outer (“distal”) targets. Consequently, the targets lay along the same visual line, and pointing gestures towards distal and proximal targets differed by the angle of the caretaker’s arm (50° and 20°, respectively). Thus, even when the caretaker pointed to a distal target, his/her pointing finger was actually closer to the proximal target located on the same side (respectively 1.50 m and 0.90 m) (Fig. 3a). Each Frisbee was pointed to an equal number of times, in a semi-random order, with a total of 60 trials per subject.

Fig. 3

(a) Experimental setup of Study 2 (left side only), illustrating the distal pointing cue. (b) Experimental setup of Study 3, illustrating pointing toward the left hidden target

Data analysis

Binomial tests were conducted for each individual to determine if performance was better than chance. The ability of sea lions to choose the side (left vs. right) of the object correctly on the basis of a point and gaze cue (PG) was not in question (and indeed, they chose the correct side 100 % of the time in this study); rather, the question was whether they would choose the indicated Frisbee between the two present on the correct side. This is why the level of chance was 50 % and not 25 %. Learning across trials was investigated by comparing for each subject the number of correct responses in the first and the last ten-trial blocks of the study (i.e., trials 1–10 vs. trials 51–60), using Fisher’s test. The response of each individual on the first trial was also recorded.
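The choice of chance level matters because it sets how many correct responses are required before performance counts as above chance. The short Python sketch below compares the two candidate levels for 60 trials. It assumes a one-sided binomial test at α = .05, which the text does not specify, and the helper names are hypothetical.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, p, alpha=0.05):
    """Smallest number of correct trials with a one-sided p-value <= alpha."""
    return next(k for k in range(n + 1) if upper_tail(k, n, p) <= alpha)


N_TRIALS = 60
for chance in (0.25, 0.50):
    k = critical_count(N_TRIALS, chance)
    print(f"chance = {chance:.2f}: at least {k}/{N_TRIALS} correct required")
```

Testing against 50 % is the more conservative option: it requires more correct choices than testing against 25 %, so a subject that merely picks the correct side and then guesses between the two Frisbees there would not be credited with following the point.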

Results and discussion

All subjects performed well in this task: Santo was correct in 85 % of the trials (binomial test: p<.001, with 50 % chance), Smack 67.8 %, and Gonzo 71.7 % (ps<.01) (Fig. 4a). There was no difference in performance between proximal and distal targets (Pearson’s Chi-Square tests (1, N = 60), ps>.14), despite a slight tendency to be more efficient with distal targets than proximal ones. Table 1 presents the number of correct trials on the first and the last ten-trial blocks, and the response on the very first trial for each subject. There was no significant difference between the blocks for any subject (Fisher’s tests (N=20), ps>.21). However, none of the subjects were correct on the first trial.

Fig. 4

Bars represent the percentage of correct responses for each sea lion in Study 2 (a) and Study 3 (b). The total number of trials per subject was 60 for Study 2 and 42 for Study 3. Dotted lines represent the level of chance (50 % in Study 2: a correct response was choosing the indicated Frisbee between the two present on the pointed side; 33 % in Study 3: a correct response was choosing the indicated Frisbee among the three available). *p ≤ .05, **p ≤ .01, ***p ≤ .001, binomial tests.

Study 3

Methods

Subjects

The subjects were the same as in Study 2: Santo, Smack, and Gonzo.

Procedure

The general procedure and the pointing gesture (point and gaze cue) were the same as in Study 2. However, three Frisbees were used instead of four. One Frisbee was placed in front of the caretaker (approximately 60 cm from him/her) and another on either side of him/her (approximately 140 cm). Two opaque screens (100 cm²) were placed in such a way that the subject could not see the two lateral targets while he was at the starting position in front of the caretaker. Only the Frisbee placed in front of the subject was visible (Fig. 3b). Each of the three Frisbees was pointed to an equal number of times, in a semi-random order, with a total of 42 trials per subject.

We suspected the central Frisbee to be a highly salient stimulus in Study 3, considering that no other target stimulus was visible. We reasoned that strong behavioral inhibition of responding to that stimulus might be necessary in trials where the occluded Frisbees were pointed to. Considering this, we planned the first trials as follows: if the sea lion correctly moved to the hidden Frisbee when it was pointed to, the test session was conducted directly. In case of systematic choice of the visible Frisbee, however, we conducted a training session. This training session consisted of eight trials without the visible Frisbee: only the two hidden Frisbees were present, and each was pointed to four times, in a semi-random order. After that, test sessions would be conducted under the conditions described above.

Data analysis

Given that three Frisbees were present in the testing sessions of this experiment, binomial tests were conducted for each individual with a chance level of 33 %. Learning across trials in the test sessions was investigated by comparing for each subject the number of correct responses in the first and the last ten-trial blocks of the study (i.e., trials 1–10 vs. trials 33–42), using Fisher’s tests.

Results and discussion

The subjects’ initial responses differed. Santo responded correctly and immediately (i.e., in less than 2 s for each trial) during the first session. Smack was correct but took longer than Santo during the first four trials: it took him approximately 10 s to respond, making several gaze alternations, but, little by little, he got around the screen (no reinforcement was given before he finally touched the Frisbee). After that, he always responded immediately. In contrast, Gonzo systematically chose the visible Frisbee in the first session, including in the trials where the hidden Frisbees were pointed to. Thus, Gonzo was the only one to be trained for a session of eight trials without the visible Frisbee. Gonzo’s response attempts during the first training trials were similar to those of Smack in his first test trials (e.g., he tried to respond by touching the screen with his nose). During these training trials, no feedback was given to Gonzo until he had chosen a Frisbee. The same reinforcement procedure as for test trials was applied: in case of a correct choice, the subject was rewarded, while in case of an incorrect choice, no reward was given and he was signaled to return to his starting position to perform the next trial. By the end of the training session, Gonzo had successfully completed five of eight trials. Results presented hereafter are from the first test trials for Santo and Smack, and from test sessions following the training for Gonzo.

The percentage of correct choices for each subject was significantly above the values expected from chance: Santo was correct in 92.9 % of the trials (binomial test: p < .001, with 33 % chance), Smack in 66.7 %, and Gonzo in 85.7 % (ps < .01) (Fig. 4b). Performance was no better when the pointing was directed toward the visible target than when it was directed toward the hidden targets. Table 1 presents the number of correct trials in the first and the last ten-trial blocks, and the response on the very first trial for each subject. No significant difference between the blocks was found for any subject, but Smack and Gonzo had a tendency to perform better in the last block of trials compared with the first block (Fisher’s tests (N = 20), p = .14 and p = .08, respectively).

Because one out of the three tested subjects, Santo, responded directly (i.e., in less than 2 s, and without any other attempt at response) to the pointing gestures towards the hidden Frisbees, we conclude that this sea lion was able to follow pointing gestures made towards occluded targets, despite the presence of a visible and directly accessible target. The two remaining subjects needed training in order to correctly follow these pointing gestures. For Gonzo, a training session without the visible Frisbee had to be conducted, as described in the Methods section. Smack did not systematically choose the distractor, and thus did not receive this explicit training. However, he also made other attempts at response before touching the pointed hidden Frisbee (e.g., touching the screen). Touching the visible Frisbee despite the caretaker’s gesture to a hidden one was considered as an incorrect choice (no reward was given and the sea lion was signaled to return to his starting position), while in the case of touching the opaque screen the trial continued until the sea lion touched one of the Frisbees. It may thus be possible to consider the lack of reinforcement in both cases (touching the visible Frisbee as well as touching the screen) as a form of training. As Santo was the only tested subject to not attempt either of those actions, he was the only one considered to be successful without any prior training.

General discussion

The current findings establish that three of the four tested sea lions generalized their responses to all novel pointing cues presented in Study 1: cross-body point, elbow point, foot point, and gaze only. Because correct responses were rewarded, it would be tempting to explain such results by reinforcement learning. However, these three subjects were errorless on the very first trial for each cue, and tests comparing the number of correct responses between the first and the second half of trials for each subject for each condition revealed no difference in performance with time. The increase in latencies with decreasing conspicuousness of the cues suggests that even if three out of the four sea lions performed above chance level with all the pointing cues, it was easier for them to respond to the most “explicit” ones (i.e., those involving an extended arm).

Their level of generalization was equivalent to that observed in dogs. Sea lions followed the foot point cue, which involves a body part never used before with them to give commands, an ability that has also been reported in dogs (Lakatos et al., 2009), half of the tested wolves (Udell et al., 2012), and elephants (Smet & Byrne, 2014). They also correctly exploited the elbow point cue, while dogs and wolves have been reported to fail to do so (Soproni et al., 2002; Udell et al., 2012). Unlike the two previously cited studies, in our study we asked the caretaker to put the hand of the used arm behind his/her back while producing the elbow point. This may have facilitated elbow point following by reducing the conflict between the different potential cues (i.e., pointing elbow and hand). Such interference might also explain why chimpanzees (Povinelli et al., 1997), dogs (Lakatos et al., 2009; Soproni et al., 2002), and elephants (Smet & Byrne, 2013) performed below the level of chance with a cross-elbow point cue (i.e., elbow protuberance could interfere with the index finger on the midline of the body, pointing at the opposite target). The sea lions also used gaze as a directional cue, as do dogs (Hare et al., 1998; Miklosi, Polgardi, Topal, & Csanyi, 1998), dolphins (Pack & Herman, 2004), fur seals (Schuemann & Call, 2004), and chimpanzees (Barth, Reaux, & Povinelli, 2005; Povinelli et al., 1997; Povinelli, Bierschwale, & Cech, 1999). This ability seems to be restricted to certain species: horses (Proops et al., 2010), goats (Kaminski et al., 2005), wolves (Udell et al., 2012), a gray seal (Shapiro et al., 2003), and elephants (Smet & Byrne, 2013) failed. A possible explanation could be that the species that were unsuccessful in this task have poor eyesight, which does not allow them to perceive this very subtle cue (Shyan-Norwalt, Peterson, Milankow King, Staggs, & Dale, 2010, in Smet & Byrne, 2013). We also suggest that sensitivity to human head orientation could have been enhanced in the tested sea lions because of the importance of visual contact during interactions with the caretakers (e.g., during shows, sea lions have learnt to respond to a caretaker only when in visual contact with him/her).

The results of the first experiment show that three sea lions out of four selectively responded to the only fundamental characteristic which appears to link the six presented cues: the target was located in the extension of the pointing body part. However, subjects could rely on very basic decision-making rules to respond correctly in this task. They could merely move to the unique target located on the side of the caretaker where they perceived a protruding body part and/or a movement (Lakatos et al., 2009, 2011; Soproni et al., 2002) or to the closest one, according to a distance-based rule from hand to target (Povinelli et al., 1997). Therefore, in the second study we assessed their use of the referential property of the pointing gesture by testing whether they could not only follow it in a general direction, but more precisely to the pointed object, in the presence of two potential targets along the same line. All the sea lions tested in the current study correctly selected, beyond the level of chance, the pointed object between the two present on each side of the informant. There was no indication of learning across trials, as performance in the first and last ten-trial blocks did not differ for any subject. In studies using only two potential targets, a gray seal (Shapiro et al., 2003), fur seals (Schuemann & Call, 2004), wolves (Udell et al., 2012), dogs (Hare et al., 1998; Soproni et al., 2002), and elephants (Smet & Byrne, 2013) successfully chose the pointed target even when the informant stood closer to the incorrect one. These results show that the pointing cue prevails over body position, and that the hand-to-target distance-based rule is not sufficient to explain the performances of the tested subjects, contrary to some findings in chimpanzees (Povinelli et al., 1997). The sea lions’ performance in our setup leads to the same conclusion, because the informant’s hand was closer to the proximal target than to the distal one, even when pointing to the distal item. But in these previously cited studies, subjects could still succeed simply by following the discriminative rule of choosing the object located on the side on which a body part of the informant is protruding or moving. Tasks in which the subject has to follow a precise linear vector along the pointing arm to a particular object among several objects located in the same general direction appear to be more valuable (Pack & Herman, 2007). To our knowledge, only two other species have been tested in this way. Bottlenose dolphins were tested in a setup similar to the one we used, with the same positive results (Pack & Herman, 2007). Results in dogs are more heterogeneous: using the same setup, Lakatos et al. (2011) recently showed that 14 out of 16 tested dogs failed to select the correct target between the two present on the pointed side, and tended to choose the containers closer to the informant, while Hare et al. (1998), using three instead of four containers, tested two dogs that succeeded above chance level.

Results of our second study showed that the tested sea lions understood the pointing cue as indicating a particular object, rather than merely a general direction. To investigate this more thoroughly, we tested the subjects’ point following to a target located outside their visual field. This has been tested in other studies by placing a third target behind the subject, which has led to mixed results: dogs succeeded in selecting the correct target (Hare et al., 1998), but the gray seal (Shapiro et al., 2003) and one of the two tested bottlenose dolphins failed to do so (Pack & Herman, 2006). The lack of salience of this cue may represent an additional difficulty, and could make this setup a weak test of the ability of these animals to follow pointing gestures to targets located outside their visual field. Moreover, our limited knowledge of these species' visual field makes it difficult to be sure that the target is in fact not visible to them (Pack & Herman, 2006). To our knowledge, the present study is the first in which barriers and distractors, commonly used in studies on gaze-following (e.g., see Tomasello, Hare, & Agnetta, 1999), are used in an object-choice task on point following. We showed that only one sea lion out of the three tested was successful from the very first trial in following the pointing cue past a distractor target (i.e., a similar item placed in front of him and directly accessible), to targets located behind opaque screens that prevented him from seeing the target at the moment of his choice. The two other subjects needed extra training, as they began by attempting different responses (i.e., touching the distractor or the screens). Their subsequent success could thus be explained in terms of reinforcement learning: because these first attempts were not rewarded, the sea lions continued to search, guided only by the general direction of the pointing (i.e., left or right), until finding the rewarded object behind the screen. Considering these mixed results, we suggest that following pointing gestures towards targets situated outside an animal’s visual field remains a difficult task, even when the pointing cues’ saliency is improved (i.e., in comparison with studies placing the target behind the animal).

The current experiment was designed to test the sea lions' ability to exploit human pointing cues. This necessitated preventing the animals from basing their responses on inadvertent cues instead of, or in addition to, the tested cues. For example, some unintentional subtle changes in the caretakers’ attitude could indicate to the tested sea lion that he is looking at or moving to the correct target. In our experiment, the greatest caution was taken to avoid any possible uncontrolled cueing from the experimenter. For instance, the caretakers were instructed to avoid any extra changes in their position, movement, vocalizations, or facial expression while providing a pointing cue. Additionally, if the subject did not respond, they repeated the cue every 2 s, totally independently of the sea lion's current behavior. Of course, we cannot completely guarantee that the sea lions could not perceive any subtle and incidental cueing from the caretakers, including some that human observers may not perceive. In fact, this limitation is present in many experiments that use pointing gestures. Future research will have to be conducted, in sea lions as well as in other species, in order to directly address this unintentional-cue account. One way would be to use an experimenter made unaware of the animal’s behavior (e.g., by asking the experimenter acting as informant to close his eyes from the pointing cue initiation until the final response of the animal, signaled to him by someone else). Animals that succeed in pointing experiments involving such controls could then be argued with much greater confidence to be relying on human-given pointing cues to manage the task. Nonetheless, the current study shows that sea lions match the performance of dogs reported in previous studies using similar experimental conditions.

In conclusion, these results obtained in a non-domesticated species support the hypothesis of a transfer from conspecific directional and social cue-reading skills to sensitivity to human communicative cues (Smet & Byrne, 2013; Udell, Dorey, & Wynne, 2010a). Spontaneous production of pointing cues can be observed in the species tested successfully in joint-attention tasks. Several reports have been made about production of pointing gestures involving an extended arm without any formal training in great apes, both in the wild (Hobaiter, Leavens, & Byrne, 2013; Inoue-Nakamura & Matsuzawa, 1997; Vea & Sabater-Pi, 1998), and in captivity (Brakke & Savage-Rumbaugh, 1996; Call & Tomasello, 1994; Leavens & Hopkins, 1998). African elephants, which have been demonstrated to successfully interpret human pointing without any prior training, have been reported to regularly make prominent trunk gestures (Smet & Byrne, 2013). But whether or not those motions act in elephants as “points” has to be explored further. Other researchers demonstrated the ability of dogs (Miklosi, Polgardi, Topal, & Csanyi, 2000) and captive dolphins (Xitco, Gory, & Kuczaj, 2001, 2004) to spontaneously indicate the location of a hidden food item by aligning the axis of their body with a container, while producing gaze-alternations between their owner (or caretaker) and the object.

To our knowledge, no such experimental studies have yet been conducted in California sea lions. However, specific features of their ecology and social behaviors would make sensitivity to gaze and body-line orientation of conspecifics highly adaptive for this species. Their foraging behavior in particular, which seems to involve precise coordination of high-speed directional movements among large groups of conspecifics, appears to be precisely the sort of behavior that would lead to such sensitivity. As has been suggested for some of the other species tested successfully in pointing studies, their ability to exploit varied human communicative cues may be due to the generalization of conspecific cue-reading skills to this interspecific context (Pack & Herman, 2006; Shapiro et al., 2003; Smet & Byrne, 2013). Smet & Byrne (2013) posed the hypothesis that some species’ native ability in interpreting social cues may have contributed to their effective use by man, regardless of whether or not domestication has taken place, with the notable example of elephants. Although sea lions do not have such a long history of close cooperation with humans, according to Smet & Byrne’s (2013) hypothesis, the extensive use of sea lions nowadays in dolphinariums for shows and educational programs could be due in part to their ability to exploit social cues.

In addition to this ecological account for the current findings, it may be interesting to consider the potential influence of daily interactions with humans. Lyn et al. (2010) have shown that chimpanzees and bonobos raised in a socio-linguistically rich environment performed much better in exploiting human pointing gestures than chimpanzees raised in standard laboratory housing. In the same way, stray dogs living in shelters perform very poorly compared to pet dogs (Udell, Dorey, & Wynne, 2010b), and some recent experimental data suggest that young pet dogs are unable to exploit such gestures (Dorey, Udell, & Wynne, 2010), contrary to previous findings (Hare, Brown, Williamson, & Tomasello, 2002). Like the majority of the tested marine mammals, the sea lions tested here had undergone training and participated in public shows, during which they focus their attention on their caretaker’s gestures in order to detect his/her commands. Similar to pet dogs and primates raised in a socio-linguistically rich environment, sea lions would have become highly attuned to human gestural cues in these previous experiences because of their frequent association with food rewards and social praise (Elgier, Jakovcevic, Mustaca, & Bentosela, 2012; Prato-Previde, Marshall-Pescini, & Valsecchi, 2008). This is supported by observations in a young gray seal born in the wild who was trained to follow explicit pointing gestures; while he was able to respond appropriately to different types of points, he did not spontaneously exploit more subtle cues such as head orientation (Shapiro et al., 2003). The results of the present experiment support both an ecological and a human-socialization account of some undomesticated species’ ability to use human pointing gestures. Further exploration of California sea lions’ behavior in the wild as well as additional experiments would be needed to disentangle the two hypotheses (for example, comparative studies between human-socialized and wild individuals).

In summary, we have provided evidence that the ability to exploit varied human pointing gestures can emerge in human-socialized California sea lions, at least when these are performed by their familiar caretakers. However, potential exploitation by the sea lions of unintended cues that the caretakers might have provided instead of, or in addition to, the tested pointing cues still has to be assessed. We suggest that they understand to some extent the referential property of the pointing gesture, because they were able to follow a precise linear vector along the pointing arm to a particular object among several objects located in the same general direction. But saying that they understand pointing as referring to a particular object, place, or event does not imply that they understand that another social agent who looks at and/or indicates an object has a mental experience of this object. For instance, 3-year-old children can choose the correct cup (i.e., the one pointed to by the experimenter), but cannot explain why they have done so (Povinelli & deBlois, 1992). This supports the idea that no theory of mind (i.e., the ability to interpret the behavior of oneself or others in terms of internal states such as knowledge, beliefs, intents, etc.) is necessary to explain the sea lions' current results. Further investigations would have to be made into the ecological background and the histories of learning underlying the sophisticated social cue-reading skills observed in studies on joint attention in some non-human animals, and their relations with the potential development of high-level representations in both humans and non-human animals.