The Role of Perspective-Taking in Children’s Quantity Implicatures

ABSTRACT Young children excel at pragmatic inferences known as ad hoc quantity implicatures: they can infer, for example, that a speaker who said “the card with apples” meant the card with nothing but apples. However, it is not known whether children take into account the speaker’s perspective in deriving such inferences, as adults are able to do, and as the received theories of pragmatics claim. In two experiments, we tested children (5–7 years, N = 33 and N = 25) and adults using a picture-matching director task, in which participants played a game giving cards to the speaker, with some cards being in common ground and some in privileged ground. We found that adults can both derive implicatures when all information is in common ground and not derive them when relevant information is in privileged ground. Children also derive ad hoc implicatures when relevant information is in common ground but, crucially, fail to not derive them when it is in privileged ground. Children’s difficulty with integrating perspective-taking with pragmatic inferencing challenges the received theories about the necessity of perspective-taking in pragmatics.


Introduction
Learning to communicate involves developing pragmatic skills to make inferences about what others mean, beyond what they say explicitly. One type of communicative inference that children have to learn is known as an "implicature": for instance, if in answer to the question, "What is on your card?" the speaker replies, "there are apples," then the hearer may infer that there are only apples on the speaker's card. This case is known as an ad hoc quantity implicature (Grice, 1975; Hirschberg, 1991).
Widely accepted, though diverse, approaches to implicature have in common the notion that such inferences not only involve an assumption that the speaker is being fully informative by giving the maximum quantity of relevant information, but also take into account the speaker's perspective and epistemic state, including what is in common ground with the listener (e.g., Frank & Goodman, 2012; Grice, 1975; Horn, 1984; Sperber & Wilson, 1995). In the example above, the hearer assumes that the speaker knows all the objects on the card (the Competence Assumption, Geurts, 2010) and infers that, had there been other objects on the card, the speaker would have said so (the Epistemic Step, Sauerland, 2004). If the hearer knows that the speaker is not fully knowledgeable, then the hearer does not derive this implicature but instead arrives at the intended meaning that there are at least apples on the card, as far as the speaker knows. These are linguistic-theoretical models at the computational level of explanation, but they have implications for behavior and competence in development: if pragmatic inferencing and epistemic reasoning occur together, then either they have to develop simultaneously or epistemic reasoning has to be in place first in order to enable implicature comprehension.
According to alternative proposals, reasoning about the speaker's epistemic state is not always required in pragmatic inferences (e.g., Andrés-Roqueta & Katsos, 2017; Breheny, 2006; Jary, 2013; Kissine, 2016; Moore, 2018; Sperber, 1994). For example, Kissine (2016) suggests that pragmatic difference in perspective between the speaker and the hearer: the relevant information is available to both the speaker and the hearer. Furthermore, it is mostly assumed that perspective-taking, as part of social cognition more generally, is foundational for deriving these inferences.
Children also learn Level 1 perspective-taking (Flavell, 1977), that is, assessing what someone can or cannot see, from 2 years, and by 5 years they are competent with more complex perspective-taking, including Level 2, that is, assessing how what someone else sees differs (Moll & Meltzoff, 2011; Moll & Tomasello, 2006). The understanding that seeing leads to knowing is also evident explicitly, as well as implicitly, from age 3 or 4 years (Robinson, 2011). There is also some evidence that from 3-4 years, they are able to harness this skill in communicative situations to identify the speaker's intended referent. In the director task, Nilsen and Graham (2009) gave children a potentially ambiguous instruction like "pick up the duck," either where one of the two ducks was visible only to the participant or where both ducks were visible to both the speaker and participant. They found that children aged 3.8-4.2 years were significantly less likely to pick the alternative target object when it was visible only to them, than when it was visible to both them and the speaker along with the target object; children aged 4.6-5.6 years performed even better (also see Nadig & Sedivy, 2002, for complementary findings with eye tracking). This suggests that in an experimental context where the difference in perspective is current and salient, and the target and distractor objects are of similar saliency, young children are able to take into account the speaker's perspective when it differs from theirs. We adopted the director task paradigm in the current study in order to investigate whether children can also integrate perspective-taking with implicature derivation. However, evidence from other experimental paradigms is mixed: on the one hand, Matthews, Lieven, and Tomasello (2010) found evidence in favor of early perspective-taking, in that children were slower in processing a new label for an object when the speaker had previously used a different label, compared to when a new speaker used the new label.
On the other hand, Ostashchenko et al. (2019) found evidence only for slower processing of new labels by both the same and a new speaker, and a slowdown for a new speaker, suggesting that for young children, apparent perspective-taking effects may be due to low-level memory mechanisms. They concluded that for young children, perspective-taking may be independent of language processing. Recent research in children with more or less access to Theory of Mind skills (children with Autism Spectrum Disorder), children with Developmental Language Disorder and neurotypical peers also suggests that some pragmatic inferences may be available without perspective-taking (Andrés-Roqueta & Katsos, 2017, 2020). Likewise, for implicatures, there is some empirical evidence in favor of both hypotheses, although, as shall be seen, no study to date has straightforwardly tested children's ability to integrate the speaker's perspective into the implicature derivation itself. On the one hand, there is accumulating evidence that children are able to derive quantity implicatures (either ad hoc or scalar) before ignorance inferences, which supports the alternative view that perspective-taking is not an inherent part of implicature derivation. In an ignorance inference, the hearer infers from a speaker's utterance that the speaker does not know some relevant information: for example, if the speaker says, "he ate an apple or a pear," the hearer can infer that she does not know which, otherwise she would have said so. Crucially, the sensitivity to informativeness and reasoning about the speaker's perspective needed for an ignorance inference are arguably some of those abilities required for deriving an implicature, on the received Gricean view. They would therefore be expected to emerge in development at least at the same time as implicatures, which, in addition to the Competence Assumption, require the Epistemic Step (Hochstein et al., 2016). However, Barner et al. (2018) found that 4-year-olds were competent with ad hoc implicatures in a Felicity Judgment Task, where the speaker's perspective was not at stake, but were not competent with ignorance inferences. Similarly, Papafragou et al. (2018) observed that 4-year-olds could not attribute an under-informative utterance to an ignorant speaker (who did not share their perspective) when asked "who said that?" In a follow-up task, 4-year-olds were able to answer correctly when there was no manipulation of perspective (a straightforward quantity implicature). Both Papafragou et al. (2018) and Hochstein et al. (2016) found, however, that older 5-year-olds were developing competence with ignorance inferences. Note, though, that by using Felicity Judgment Tasks or "who said that?" paradigms, these studies are not testing ignorance inferences per se but rather the ability to match an under-informative utterance to an ignorant speaker, given the utterance, the state of the world, and two speakers whose perspective is made clear; an ignorance inference instead involves inferring the epistemic state of the speaker given the utterance. These studies are also not testing children's ability to integrate perspective-taking into implicature derivation, but a related skill; they do not address whether children can appropriately derive an implicature or not given the speaker's perspective.
On the other hand, Kampa and Papafragou (2020) provided evidence that children's perspective-taking abilities are integral to their development of implicature inferencing. In their study, they presented 4-year-olds with two pictures of a speaker with a box: in one picture the speaker could see only a spoon, for instance, in the box, while the hearer could see a spoon and a bowl, and in the other picture both objects were visible to both interlocutors. They found that 4-year-olds were mostly able to answer correctly when asked "which box is she talking about?" regarding the utterance "I see a spoon," which suggests that young children can do some sort of epistemic reasoning in pragmatic inferencing. These findings are open to interpretation, though: the correct choice could be arrived at purely based on sensitivity to informativeness (by reasoning that "I see a spoon" is an under-informative description of a box with a bowl and a spoon, so it must be the other one), or on the ability to match an implicature interpretation to the speaker's perspective (reasoning that "I see only a spoon" is not a true description of a box with a bowl and a spoon, so it must be the other one), or by instead answering the implied question "which speaker said that?" (given the display in which the same speaker appeared twice on screen, once with each type of box). That is, it is still possible but not certain that children are taking into account the speaker's perspective in deriving an implicature, and appropriately taking the Epistemic Step.
The upshot of the emerging research, therefore, is that the evidence is mixed, with some studies suggesting that perspective-taking is not consistently used by young children in pragmatic inferences (e.g., Barner et al., 2018) and others suggesting that it can be (e.g., Kampa & Papafragou, 2020). In this study, we investigated children's ability to integrate perspective-taking into the derivation of pragmatic inferences, in particular quantity implicatures, using a paradigm combining the director task, which tests referential communication and perspective-taking, with a simple picture-matching task, which tests implicature derivation. The director task has been successfully employed to test visual perspective-taking in communication in children aged 4 years and above (e.g., Nilsen & Graham, 2009) and has been used extensively with adults to test their comprehension of referential utterances. Importantly, it makes what is visible or not visible to the speaker and to the hearer obvious and salient through a physical display which the interlocutors interact with, and avoids the potential ambiguity in interpretation of presenting the speaker twice on a screen, as in Kampa and Papafragou (2020). Likewise, picture-matching tasks have been widely used to test young children's understanding of implicatures (e.g., Horowitz et al., 2018; Stiller et al., 2015) and are arguably a better measure of comprehension than Felicity Judgment Tasks, which may require metalinguistic processing and only tap into sensitivity to informativeness (Veenstra & Katsos, 2018). We tested 5- to 7-year-olds, as at this age, children are known to be able to reliably take another's perspective and derive ad hoc implicatures independently, and can possibly use perspective-taking in some communicative situations (referential communication and ignorance inferences).
In our study, children saw a simple display containing four double-sided picture cards, one of which was occluded for the speaker, and were invited to play a game in which they had to select which card the speaker would like and put it in a "card box." In the critical condition, children heard, for example, "pick the card with pears," and saw two cards with pears: one with only pears but occluded for the speaker (in privileged ground), and one with pears and bananas, visible to both speaker and hearer (in common ground). Crucially, given the other items in common ground, "pick the card with pears" is an informative way for the speaker to give an instruction to pick the target card with pears and bananas, as it uniquely identifies the only card with pears from the speaker's perspective. A child who is able to integrate perspective-taking and implicatures will not derive an ad hoc implicature (that the speaker means the card with only pears) and will instead choose the card in common ground. We included three further conditions: an ad hoc implicature condition, which conceptually replicates standard picture-matching tasks with both the target and distractor cards in common ground; a perspective-taking condition, in which the speaker's perspective has to be taken into account to resolve a semantic ambiguity, with an identical card in common ground and in privileged ground; and finally, an unambiguous condition with only one possible target card.
This study therefore differs additionally from Kampa and Papafragou (2020) in that we were most interested in whether children could appropriately not derive an implicature, given relevant information in privileged ground (akin to Breheny et al., 2013 with adults), rather than deriving an implicature which incorporates knowledge of privileged ground. This provides a clearer indication as to whether hearers are actively reasoning about the speaker's perspective in deriving an implicature, rather than just possibly matching an inferred interpretation to a speaker. We were also able to test the two components of, first, implicature derivation (where the speaker's perspective is not at stake) and, second, perspective-taking (for a semantic ambiguity) within the same paradigm, allowing for a clearer comparison of how these skills relate developmentally.

Participants
Thirty-eight English-speaking children aged 5;3 to 6;3 were recruited from primary schools in Cambridge, UK; the group included both monolingual and multilingual speakers, but all included were competent in English and had been in English language schooling for at least one year. Of these, 5 children were excluded from analysis due to experimenter error (N = 1), little knowledge of English (N = 1), not completing the task (N = 2), or for failing a Theory of Mind task (N = 1). Adult native speakers of English (N = 36) were recruited via Prolific (www.prolific.co), an on-line research recruitment platform. A further 4 adults were excluded due to issues with the audio, one for clicking randomly and one for completing the task twice. This and the next study were approved by the Humanities and Social Sciences Research Ethics Committee of the University of Cambridge.

Stimuli
Participants sat on one side of a wooden display case with four cubby-holes, each with a double-sided picture-card (Figure 1). The picture cards were placed vertically in clear plastic card holders, which secured the base of the card only, so it was clear that the pictures were visible to both participants and experimenters. By the display, there was also a "card box" for placing selected cards in during the game. Three cards were in common ground with the speaker, a puppet, who sat on the opposite side of the display. One card was in privileged ground behind a screen, meaning that it could be seen only by the participant; for this card, the entire cubby-hole was covered on the speaker's side. Each picture card showed five items, either five of the same items (e.g., five bananas) or two of one item and three of another (e.g., two bananas and three pears). In each display, three of the cards showed five of the same item, and one showed two types of item. There were six sets of picture cards, each with a theme such as fruit or animals. During the warm-up phase, children were able to handle the cards and were shown that they were double-sided; throughout the experimental session, they picked up the cards, as instructed by the puppet; these features mean that it should have been clear to the children that the cards had the same picture on both sides and were visible to both interlocutors just as 3D objects would be. Intuitively, this is an improvement in the direction of a naturalistic setup, compared to screen-based tasks (e.g., Kampa & Papafragou, 2020).

Procedure
Participants were introduced to the speaker, a puppet, who was positioned on the opposite side of the picture display from the participant. His voice was prerecorded and played via a laptop. He was operated by the experimenter, who sat between the child and puppet. The puppet's voice was always male, and the experimenter was always female, to reinforce that they were different agents. This does mean that children were expected to attribute an epistemic state (of knowledge or ignorance) to a puppet, but previous studies using puppets suggest that children are capable of this and that there is no reason to think that this renders the task a less valid measurement than if the speaker were a human interlocutor (e.g., Diesendruck & Markson, 2001; Hochstein et al., 2016; Siegal et al., 2010); indeed, standard false-belief tests such as the Sally-Anne change-of-location task are often carried out using puppets (Baron-Cohen et al., 1985), and in implicature tasks, the speaker is often a character (physical or on screen) rather than a human experimenter (e.g., Stiller et al., 2015).
For the warm-up phase, based on Nilsen and Graham (2009), the puppet explained that he wanted to play a guessing game: he could see three of the items in the display, but not the fourth. He asked the child to describe it, so that he could guess what it was. Each card showed only one item, all of which were different from those used in the test phase and were common objects or animals (e.g., a book and a frog). Children could describe the privileged ground item in any way they wanted (for example, by providing a description, "it is green and hops," or a label, "a frog"), and for each of the three warm-up trials, the puppet correctly guessed the item. The aim was to highlight the difference in perspective between the speaker and the hearer. The puppet also explained after the first warm-up trial that he would turn around so that he could not see the experimenter putting out the new cards for each trial. Between each trial thereafter, throughout the experiment, he thanked the child and said, "now I'm turning around," so that it was clear he never saw what was on the privileged ground card. For the test phase, the puppet explained that they were going to play a different game, in which he would tell the child to pick a card, by saying, for example, "Pick the card with pears." The child had to select the cards and put them in a "card box." He also explained that each time the child collected four cards, he or she would receive a sticker for a sticker chart; this was shown to the child, but the stickers were kept out of the way to avoid distraction.
There were four conditions, with six trials per condition, so that each child saw 24 trials. The order of presentation of conditions within each set of four trials (containing one of each condition) was counterbalanced across six lists, and the position of the privileged ground card was rotated across sets. The experimenter replaced the cards as necessary between each trial, turning the puppet around so that he could not see which cards were being changed. Every four trials, children were asked which cards the puppet could and could not see, and whether he knew what was on the covered card. This reinforced the difference in perspective between the puppet and child. At the end of the testing session, children were given the Sally-Anne change-of-location task to test their ability to track false belief (Baron-Cohen et al., 1985).
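The trial structure described above (six sets of four trials, one trial per condition in each set) can be sketched in Python. This is only an illustration under stated assumptions: the text says that condition order was counterbalanced across six lists and that the position of the privileged ground card was rotated across sets, but it does not give the exact scheme. The cyclic rotation used here, the condition labels, and the names `rotate` and `build_list` are all hypothetical.

```python
# Hypothetical condition labels; the paper names the conditions in prose only.
CONDITIONS = [
    "unambiguous",
    "common_ground_ad_hoc",
    "privileged_ambiguous",
    "privileged_ad_hoc",
]
N_SETS = 6        # six themed card sets (e.g., fruit, animals)
N_POSITIONS = 4   # four cubby-holes in the display


def rotate(seq, k):
    """Return seq cyclically shifted left by k places."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def build_list(list_index):
    """Build one presentation list of 24 trials.

    Assumption: condition order is shifted by one place per set and per
    list, and the privileged card position advances by one per set.
    """
    trials = []
    for set_index in range(N_SETS):
        order = rotate(CONDITIONS, set_index + list_index)
        privileged_position = (set_index + list_index) % N_POSITIONS
        for condition in order:
            trials.append(
                {
                    "set": set_index,
                    "condition": condition,
                    "privileged_position": privileged_position,
                }
            )
    return trials


if __name__ == "__main__":
    for trial in build_list(0)[:4]:
        print(trial)
```

Whatever the actual scheme, the invariants stated in the text should hold: 24 trials per participant, six per condition, and each set of four containing every condition once.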
In the unambiguous condition, only one card, visible to both the puppet and participant, matched the puppet's utterance (Figure 1).
In the common ground ad hoc implicature condition, two cards visible to both the puppet and participant were semantic matches for the utterance, but only one matched an exhaustive implicature interpretation. This tested children's ability to make ad hoc inferences with full common ground and replicated the ad hoc implicature condition in previous picture-matching tasks where the speaker's perspective was not at stake (e.g., Horowitz et al., 2018).
In the privileged ground ambiguous condition, two identical cards matched the utterance, but one was in common ground and the other in privileged ground. This condition tested children's perspective-taking with semantic ambiguity.
In the critical privileged ground ad hoc condition, two cards were semantic matches for the utterance, but only one of them matched an exhaustive ad hoc implicature interpretation. This card was in privileged ground, though, while the other was in common ground. This condition tested the participants' ability to take into account the speaker's epistemic state and not derive an implicature, instead selecting the semantically matched card in common ground. Crucially, from the puppet's point of view, his utterance was an informative way of instructing the child to pick this card, given the cards he could see: from his perspective, the object in the utterance was a unique identifier of that card as that card displayed the only objects of that type which he could see. A hearer who is able to take into account the puppet's perspective will suspend the implicature and pick the card with both types of object in common ground; a hearer who is not able to do so will pick the card with only one type of object in privileged ground.
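The predicted choices in the critical condition can be made concrete with a small Python sketch. It is a toy model, not the authors' analysis: a listener first restricts attention to common ground (if taking the speaker's perspective), picks the unique semantic match if there is one, and otherwise falls back on the exhaustive reading. The `Card` class, the `choose_card` function, and the specific fruit items are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    label: str
    items: frozenset          # item types shown on the card
    in_common_ground: bool    # False means visible to the hearer only


def choose_card(cards, named_item, take_perspective):
    """Return the card a toy listener selects, or None if still ambiguous."""
    # A perspective-taking listener ignores cards the speaker cannot see.
    candidates = [c for c in cards if c.in_common_ground or not take_perspective]
    # Semantic matches: every candidate card that shows the named item.
    matches = [c for c in candidates if named_item in c.items]
    if len(matches) == 1:
        return matches[0]  # reference is already unique; no implicature needed
    # Exhaustive (ad hoc implicature) reading: the card with only that item.
    exhaustive = [c for c in matches if c.items == frozenset({named_item})]
    if len(exhaustive) == 1:
        return exhaustive[0]
    return None


# Hypothetical display for the critical privileged ground ad hoc condition.
critical_display = [
    Card("pears and bananas", frozenset({"pears", "bananas"}), True),
    Card("pears only", frozenset({"pears"}), False),
    Card("bananas only", frozenset({"bananas"}), True),
    Card("oranges only", frozenset({"oranges"}), True),
]

if __name__ == "__main__":
    print(choose_card(critical_display, "pears", take_perspective=True).label)
    print(choose_card(critical_display, "pears", take_perspective=False).label)
```

In this sketch the perspective-taking listener selects the common ground card with pears and bananas, while the listener who ignores perspective derives the implicature and selects the privileged card with only pears, which mirrors the two response patterns described above.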
Adults carried out the same task online via Qualtrics (Qualtrics, 2016), except that they heard the audio stimuli but saw an avatar instead of a puppet; they did not do the warm-up production task, but instead completed questions to check they had understood the situation correctly; and they were asked which cards the speaker could see only twice, at the beginning and halfway through. On-screen methods with virtual speakers have previously been used with both children and adults in perspective-taking tasks (e.g., Wang et al., 2016).

Results and analysis
The adult control group was at ceiling in all conditions except the critical one, and the child group was at ceiling for both common ground conditions (Table 1 and Figure 2). All children passed the Sally-Anne Theory of Mind test, except for one who was therefore excluded from the analysis. The responses in the critical privileged ground ad hoc condition were bimodally distributed (see Figure 3), so participants were coded as passers (scoring 5/6 or 6/6) or failers (otherwise) for each condition, and chi-squared-based analyses were used to compare the two privileged ground conditions across adults and children (McNemar's χ² test was used for within group comparison, and Fisher's exact test for between group; see Skordos & Papafragou, 2016, for a similar approach to analysis).
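The passer/failer coding described above can be sketched as follows. The threshold of 5 out of 6 comes from the text; the per-participant scores are invented purely for illustration, and the function names `is_passer` and `paired_table` are hypothetical.

```python
PASS_THRESHOLD = 5  # passers scored 5/6 or 6/6 in a condition


def is_passer(n_correct, threshold=PASS_THRESHOLD):
    """Code a participant as a passer (True) or failer (False)."""
    return n_correct >= threshold


def paired_table(scores_a, scores_b):
    """Cross-tabulate passer status in two conditions for the same participants.

    Keys are (passed condition A, passed condition B). This paired 2x2
    layout is the input a within-group McNemar test works on.
    """
    table = {
        (True, True): 0,
        (True, False): 0,
        (False, True): 0,
        (False, False): 0,
    }
    for a, b in zip(scores_a, scores_b):
        table[(is_passer(a), is_passer(b))] += 1
    return table


# Invented scores (number correct out of 6) for six hypothetical children.
ambiguous_scores = [6, 6, 5, 2, 6, 0]
ad_hoc_scores = [6, 0, 1, 0, 5, 0]

table = paired_table(ambiguous_scores, ad_hoc_scores)

if __name__ == "__main__":
    for key, count in table.items():
        print(key, count)
```

Only the two discordant cells, (True, False) and (False, True), enter the McNemar statistic; participants who pass both or fail both conditions do not contribute to it.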
Among children, there were more passers in the privileged ground ambiguous condition than in the critical privileged ground ad hoc condition (McNemar's χ² = 8.1, p = .0044; Table 2). There were significantly more adult passers than child passers in both the privileged ground ambiguous condition (Fisher's exact test p < .001) and the privileged ground ad hoc condition (Fisher's exact test p < .001; Figure 5). 2
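For readers unfamiliar with the two tests, the following Python sketch implements McNemar's χ² with a continuity correction and a two-sided Fisher's exact test using only the standard library. The counts are hypothetical: discordant counts of 10 and 0 happen to give χ² = 8.1 under the continuity-corrected formula, but the actual cell counts (Table 2) are not reproduced in this section, and the text does not state whether a continuity correction was applied. The function names are also assumptions.

```python
from math import comb, erfc, sqrt


def mcnemar_chi2(b, c):
    """McNemar's chi-squared with continuity correction (1 df).

    b and c are the two discordant counts from a paired 2x2 table.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper-tail probability of a chi-squared variable with 1 df.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


if __name__ == "__main__":
    # Hypothetical discordant counts, chosen only for illustration.
    statistic, p_value = mcnemar_chi2(10, 0)
    print(round(statistic, 2), round(p_value, 4))
    # Hypothetical passers/failers by group: rows are groups, columns outcome.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

In practice an established statistics package would normally be used instead; the hand-written versions are shown only to make the computations explicit.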

Discussion
The results indicate that children, like adults, excel in deriving ad hoc quantity implicatures in a picture-matching task when the speaker's perspective does not differ from theirs, in accord with previous findings (e.g., Horowitz et al., 2018; Katsos & Bishop, 2011; Stiller et al., 2015; Yoon & Frank, 2019). Adults were able to take into account the speaker's perspective to resolve a semantic ambiguity, and largely, to not derive an ad hoc quantity implicature when the speaker did not know the relevant information. In contrast, the majority of children were not able to take into account the speaker's perspective to not derive an ad hoc quantity implicature, and many also struggled to do so to resolve a semantic ambiguity. This lends support to the second hypothesis that, at least in contexts such as the task we use, children learn to derive implicatures and to take another's perspective and then to integrate the two skills online. The experiment was designed to follow as closely as possible previous director tasks and implicature picture-matching tasks (Horowitz et al., 2018; Nilsen & Graham, 2009). However, some resulting features of the experimental context could have hindered children's performance, masking their actual competence. First, children might have perseverated with the warm-up game of showing the puppet what was on the hidden card, increasing the incorrect responses in the two privileged ground conditions. Second, in the privileged ground ad hoc condition, the privileged ground card displayed five objects of the same type, while the common ground card displayed only three of those objects (and two others): this could make it harder to ignore the privileged ground card and, for those children not taking into account the speaker's perspective at all, could mean that they are choosing this card simply because it has more of the relevant items. Finally, the pseudo-randomized trial order might have increased the difficulty of the task: if children are unable to integrate speaker perspective in implicature derivation, this forces them to choose the privileged card in the privileged ground ad hoc condition, which, in turn, licenses selection of the privileged card for the privileged ground ambiguous condition. We addressed these concerns in Experiment 2.

2
A mixed-effects logistic regression model with condition and age and their interaction as fixed effects, and item and subject random intercepts, could not converge due to lack of variance and ceiling effects in the children's data, particularly in the unambiguous and common ground ad hoc conditions. Non-parametric tests were therefore conducted as best suited to the data.

Participants
Twenty-five English-speaking children aged 5;11 to 7;11 were recruited from a primary school in Sussex, UK, and Saturday schools in Cambridge, UK (an unintended age difference to Experiment 1, which tested 5- to 6-year-olds, but one that does not seem to make an important difference to results; see discussion below). Five children were excluded, due to not being English-dominant speakers (N = 3), for falling outside this age range (N = 1), and for failing the Theory of Mind task (N = 1). Adult native speakers of English (N = 18) were recruited via Prolific (www.prolific.co).

Stimuli
The stimuli replicated those used in Experiment 1 (Figure 4), except that cards with just one type of item showed either three items or two items (e.g., three bananas or two pears). Also, in the unambiguous condition, for half of the trials, the target card had three of the requested item and two of the other items or two of the requested items and three of the other items, although in each case, the request was unambiguous given the display. For the other half of the trials, a card with three of the same items was used. This highlighted that the "correct" choice of card could display two types of items and ensured that this was not only the case in the privileged ground ad hoc condition. 3

Procedure
The procedure replicated that of Experiment 1, except that in the warm-up phase, the experimenter asked the child which cards the puppet could and could not see and then presented two trials in the unambiguous condition. By keeping the game the same from the warm-up to experimental phase, we removed any possible confusion about the aim of the game in the experimental phase. As in Experiment 1, children consistently answered correctly the questions about which cards the puppet could and could not see. Also, the order of presentation of conditions within each set was again counterbalanced across the six sets, but within any one set, the privileged ground ambiguous condition always appeared before the critical privileged ground ad hoc condition; this meant that if children were struggling in the critical ad hoc condition such that they chose the privileged ground card in those trials, this choice would be less likely to have an effect on privileged ground ambiguous trials by licensing the privileged ground card as there was now a bigger time period and change of card set between the two trials.

Results
The same analysis was followed as for Experiment 1, given that again the data were bimodally distributed (Table 3 and Figure 6). Among children, there were more passers in the privileged ground ambiguous condition than in the critical privileged ground ad hoc condition (McNemar's χ² = 10.08, p = .001; Table 4). There were significantly more adult passers than child passers in the privileged ground ad hoc condition (Fisher's exact test p = .005) but not in the privileged ground ambiguous condition (Fisher's exact test p = .37).

3
Additionally, for half of the privileged ground implicature items, the unnamed object on the target card was also present on another card in common ground, as in Experiment 1, while for the other half both object types on the target card appeared only on that card. However, this makes no difference to the informativeness of the utterance, and no difference to participants' performance: in a logistic regression model (response ~ age * item type + (1 | utterance) + (1 | participant)), only age (adult or child) was a predictor of performance in the privileged ground ad hoc condition (β = −26.1, SE = 5.9, p < .001) while item version and its interaction with age were not significant (β = −3.5, SE = 3.1, p = .25; β = 2.1, SE = 3.2, p = .5).

Discussion
The results of Experiment 2 corroborate the main finding of Experiment 1: there was still a significant difference between adults and children in the critical privileged ground ad hoc condition, such that children were not able to take into account the speaker's perspective to not derive an implicature. However, in the privileged ground ambiguous condition, there was no longer evidence for a difference in performance between adults and children. This could be due to the methodological improvements to the task or to the somewhat older age of the sample of children in Experiment 2, although a lack of correlation between age and performance in both privileged ground conditions favors the former explanation (tau = −.004, p = .98; tau = −.04, p = .8; Figure 7). Before turning to a general discussion of our findings, we need to consider another explanation for children's (and adults') behavior in this task, suggested by reviewers: it could be that the speaker's utterance in the critical condition may be perceived as under-informative, even though logically speaking it is perfectly informative and moreover adequate given the goal at hand in the game. That is, hearers may be expecting the speaker to say "pick the card with pears and bananas," rather than "pick the card with pears." If this were the case, then it would make hearers more likely to choose the card with only pears, reconsider what the puppet can see, and select the card that is in privileged ground. This would explain the lower rate of correct responses in the critical condition. To test this possibility, we ran a follow-up study with adults using an acceptability judgment task.
In the task, participants saw a display of three picture-cards, which matched the speaker's perspective in Experiments 1 and 2, and were asked to rate how acceptable an utterance was as an instruction to pick up a particular card. Importantly, they were given the context of the utterance: that it was part of a game in which an "instructor" had to tell the "matcher" which cards to collect, where both the instructor and matcher can see the three cards (following the online version of Experiments 1 and 2). As in Experiments 1 and 2, the instructions were always of the type, "pick the card with . . . " and occurred in four conditions: ambiguous, ad hoc implicature, the critical condition for this follow-up study (possibly under-informative), and fully informative (see Figure 8). For instance, for an item in the critical condition, they were first presented with a picture card with pears and bananas, a picture card with oranges, and a picture card with pears; after 2 seconds, the card with pears and bananas was highlighted with a red border; then, one second later, the utterance "pick the card with bananas" and the Likert scale appeared. They were asked to rate each instruction on a 4-point scale (bad, kind of bad, kind of good, good), following the findings of Jasbi et al. (2019) about the most informative response scales in implicature rating tasks. An acceptability judgment task arguably measures production, as it invites participants to model what they would have said as the speaker, given a state of the world (Degen & Goodman, 2014). We decided to test adults only, as prior findings show that children tend to be more accepting of under-informative utterances than adults (Katsos & Bishop, 2011) and also under-informative in their production (Davies & Katsos, 2010; Nilsen & Graham, 2009); in other words, testing adults is more likely to provide support for this objection to the paradigm.
We predicted that the fully informative condition would be rated as overwhelmingly "good," while the ambiguous condition would be rated as "bad." We expected the ad hoc implicature condition to also be rated as "kind of good" to "good," given that "the card with oranges" likely gives rise to an exhaustive interpretation, "the card with only oranges," and thereby identifies a single card. For the critical condition, if participants expected the speaker to be informative (and also succinct) then they would also rate it as "kind of good" to "good"; if, on the other hand, the context gives rise to expectations of strictly speaking over-informative utterances, then they would rate it more poorly. English-speaking adults (N = 49) completed the task online using the Gorilla Experiment Builder (Anwyl-Irvine et al., 2020); they were recruited via Prolific (www.prolific.co).
Table 4. Experiment 2 number of child failers and passers for the privileged ground ambiguous and privileged ground ad hoc conditions.

                                Privileged ground ambiguous fail    Privileged ground ambiguous pass
Privileged ground ad hoc fail   5                                   12
Privileged ground ad hoc pass   0                                   8

We found that participants rated the critical condition (in which an instruction with a single item was used to describe a card with two kinds of object) as "good" or "kind of good" for 52% and 43.2% of trials, respectively (Table 5 and Figure 9). As predicted, such utterances are much more acceptable in this context than ambiguous cases ("good" and "kind of good" for 6.5% and 14.6% of trials). Planned pairwise comparisons revealed a significant effect of condition, such that ambiguous trials are judged worse than critical ones (Wilcoxon signed rank test V = 536, p < .001). Straightforward implicature trials (where 7.1% and 35% of trials are rated "good" and "kind of good") are also, interestingly, judged worse than critical condition trials (Wilcoxon signed rank test V = 1343, p < .001). This indicates that for adults, at least, in Experiments 1 and 2, the utterance in the privileged ground ad hoc condition was largely considered to be informative enough for its context. The ad hoc implicature condition proved to be less acceptable than we expected, which was surprising given the ceiling performance for the common ground ad hoc implicature condition in Experiments 1 and 2. An obvious explanation for this is that participants considered a more informative alternative, "with only oranges," given the strictly speaking ambiguous nature of the utterance. This may seem at odds with the critical condition, but note that in the critical condition, the utterance is not semantically ambiguous given the display (although a more informative alternative utterance, "with bananas and pears," is available), whereas it is semantically ambiguous in the ad hoc implicature condition. In summary, these results suggest that in Experiments 1 and 2, it is not the perceived under-informativeness of the utterance in the privileged ground ad hoc condition which lowers the rate of correct responses, to the extent that adults do respond incorrectly in this condition. Furthermore, it does not seem likely that hearers completely revise their assessment of what the speaker can see, given that in the privileged ground ambiguous condition, in Experiments 1 and 2, they are performing much better (at ceiling for adults). This means that it is likely to be the added requirement of perspective-taking which accounts for performance in the critical privileged ground ad hoc condition. The discrepancy between adults and children (and the lower performance in adults compared to the other conditions) is therefore the finding of interest, which could be explained by a variety of factors, which we turn to below.
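For readers unfamiliar with the V statistic reported for the Wilcoxon signed rank comparisons in the follow-up study, the following minimal Python sketch shows how V can be computed for paired ratings. It rests on assumptions that the text does not confirm: that ratings are coded 1 (bad) to 4 (good) and averaged per participant within each condition, and that zero differences are dropped. The function names and all data values are hypothetical and are not the study's data; no p-value is computed.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_v(first: Sequence[float], second: Sequence[float]) -> float:
    """Return V: the sum of the ranks of |first - second| over positive differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for d, r in zip(diffs, ranks) if d > 0)


# Hypothetical per-participant mean ratings (1 = bad ... 4 = good); NOT the study's data.
critical = [3.5, 4.0, 3.0, 3.75, 2.5]
ambiguous = [1.5, 2.0, 3.0, 1.25, 3.0]
v_statistic = wilcoxon_v(critical, ambiguous)
print(v_statistic)
```

A larger V relative to its maximum, n(n + 1)/2 for n non-zero differences, means that the first condition tends to be rated higher than the second.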

General discussion
We found that adults are largely able to take into account the speaker's perspective in implicature processing in order to not derive an implicature when the speaker lacks the relevant knowledge. This is, to our knowledge, the first demonstration of this ability with ad hoc quantity implicatures using off-line methods. In contrast, children aged 5–7 years are not able to take into account the speaker's perspective to not derive an implicature when the speaker is ignorant of relevant information. This is despite the fact that they excel at deriving implicatures where the speaker's perspective is not at stake and are able to track the speaker's perspective in other situations (being able to explicitly say which cards the puppet could not see and passing the Sally-Anne False Belief task). This suggests that they have trouble integrating the two skills, implicatures and perspective-taking, and implies originally distinct trajectories of development, in support of our second hypothesis: first, children learn to derive implicatures and to track someone else's perspective, as separate skills; then later, they learn to integrate the two skills in online interpretation.
These results complement findings from other developmental studies. First, much younger children can excel with ad hoc implicatures in simple picture-matching tasks (e.g., Stiller et al., 2015;Yoon & Frank, 2019), so the ceiling performance from children in this study in the common ground ad hoc implicature condition is not at all surprising and furthermore suggests that the combination of a picture-matching task and director task yields an effective measure of children's inferencing.
Second, children's non-adult-like performance in the privileged ground ambiguous condition in Experiment 1 was somewhat surprising, given that children seem able to incorporate perspective-taking in straightforward reference resolution by 5 years (Graham et al., 2016; Nilsen & Graham, 2009). However, it seems plausible that the experimental context, in which privileged ground implicature trials and ambiguous trials were mixed, at least partly explains this difference: some children's inability to pick the common ground picture in the privileged ground ad hoc condition made selection of the privileged ground picture permissible for them throughout the experiment. This seems particularly likely given the better performance in Experiment 2, where changes to the procedure were designed to mitigate this effect.
Third and crucially, the challenge for most children in taking into account the speaker's perspective in order to appropriately not derive an implicature accords with previous findings which indicated that children could derive quantity implicatures where the speaker's perspective was not at stake before they could make ignorance inferences, which arguably require some of the same reasoning about speaker knowledge (Barner et al., 2018; Papafragou et al., 2018). It also fits in with the more general suggestion that a key challenge for children in becoming adult-like in their pragmatic inferencing is integrating non-linguistic with linguistic sources of information (Huang & Snedeker, 2009; Skordos & Papafragou, 2016). Indeed, this seems to be a challenge in other domains of language development, too, for instance, in syntactic or semantic processing where visual stimuli conflict with common ground more generally (e.g., De Cat, 2015; Pomper & Saffran, 2016; Trueswell et al., 1999).
Our results stand in contrast, however, to the study by Kampa and Papafragou (2020), in which 4-year-olds were observed to combine perspective-taking with informativeness in inferencing. One possibility is that the precise inference being tested makes a difference: remember that in their study, children had to take into account the speaker's different perspective in deriving an ad hoc implicature, whereas in ours, taking into account the speaker's perspective led to not deriving the implicature. However, it is not immediately clear a priori whether one should be easier than the other. Alternatively, children might be choosing the target display in their study not because they are deriving an implicature, but by reasoning by exclusion that "a box with a spoon" is an under-informative way to describe a box with a spoon and a bowl, and so it must be the other box; or, again, they might instead be answering an implicit question, "who said that, the speaker who can see what I can see or the other one," with the same kind of reasoning. In our study, there does not seem to be an alternative way of arriving at the correct choice other than by engaging with a quantity implicature and taking into account the speaker's perspective. Finally, the simpler design of Kampa and Papafragou (2020), with only two visual alternatives and two conditions, could also make it an easier task for children. The current data does not allow us to tease apart these possibilities, but at the very least, comparison of the two studies suggests that fine differences in experimental context, and consequently communicative context, can have a significant impact on what children are able to infer successfully.
More importantly, if we take the conclusions of the study by Kampa and Papafragou (2020) as truly pertaining to implicature and perspective-taking rather than to the alternative skills we noted above, they help us restrict our conclusions while still making a crucial point: we have shown that at least in some contexts, such as the ones created by our experimental tasks, the two skills, pragmatic inferencing and perspective-taking, are likely to be separate. This is a finding with major implications for theories of pragmatics and development to which we turn later in this section.
Turning briefly to the results from adults in our study, as expected, adults were able to derive ad hoc implicatures, take into account the speaker's perspective in a simple semantic ambiguity resolution, and also largely appropriately not derive an ad hoc implicature when the speaker was ignorant. This follows the findings of a previous eye-tracking experiment with ad hoc implicatures (Breheny et al., 2013) but is the first demonstration of this to our knowledge using off-line techniques. Adults were, however, not at ceiling in this condition (with 9/36 "failers" in Experiment 1 and 4/18 in Experiment 2). This is in line with the evidence that there are individual differences in pragmatic abilities (Franke & Degen, 2016); in other studies where there are additional contextual factors to integrate into pragmatic inferencing, such as the speaker's level of informativeness or co-operativity, adults may fail to do so (e.g., Dulcinati & Pouscoulous, 2016; Pogue et al., 2016). We consider possible explanations for this as we now turn to the theoretical implications of our findings.
Overall, the findings lend support to the alternative views of pragmatic inferencing, which suggest that reasoning about the speaker's epistemic state is not always required for pragmatic inferencing: the study provided evidence that children can derive a simple ad hoc quantity implicature when the speaker's perspective is not at stake without being able to appropriately not derive an implicature when the speaker is ignorant. If children were reasoning about the speaker's epistemic state where relevant information was in common ground, making the Competence Assumption and taking the Epistemic Step, there is no theoretical reason to suppose a priori that this is easier than reasoning about the speaker's epistemic state and not taking the Epistemic Step. That is, it seems that an implicature may be derived independently of integrating information about the speaker's epistemic state: the inferencing process is separate from the inferencing strategy identified. Presumably, over development, children acquire more pragmatic strategies, which enable them to infer the speaker's meaning when the speaker's perspective differs from theirs, as can be inferred from the three groups of children identified here: those who never took the speaker's perspective, those who did but only in the privileged ground ambiguous condition, and those who did all the time.
As proposed by alternative theories of pragmatics, it may be that adults retain these different strategies (e.g., Andrés-Roqueta & Katsos, 2017; Kissine, 2016; Ostashchenko, Deliens et al., 2017) and do not themselves always engage in epistemic reasoning about the speaker. The evidence for egocentric biases in referential processing accords with this possibility (e.g., Keysar et al., 2003). This could explain why adults in this study were not at ceiling in the critical condition: if perspective-taking is not an integral part of implicature derivation but requires certain cues in the context, it may be that these cues were not provided to a sufficient degree of salience for some participants in an online version of the task. The question then arises, however, of why the privileged ground ad hoc condition was more challenging than the privileged ground ambiguous one, or, to put it another way, why an egocentric strategy is harder to overcome when not deriving an implicature is required, rather than resolving a potential semantic ambiguity. One harmonising explanation could again pertain to contextual cues to the pragmatic strategy: in the case of potential semantic ambiguity, the visual cue and need to resolve the utterance's reference can trigger the use of a more sophisticated strategy; the ambiguity, in a sense, forces the hearer to "notice" the additional information about the speaker's differing perspective. In contrast, the utterance referent in the privileged ground ad hoc condition can happily, if incorrectly, be resolved from an egocentric perspective. Whether this is indeed the case, both in adults and, to a greater extent, in children, can be further investigated.
There is, however, another possible explanation for the developmental trajectory we observed: common ground with the speaker could always be assumed in deriving an inference. That is, children are actually doing some sort of epistemic reasoning as part of their pragmatic inferencing, but failing to update their representation of the speaker from a fully knowledgeable to a partially ignorant speaker (see, for example, Katsos & Andrés-Roqueta, 2021, who suggest that whether perspective-taking is used to update common ground might be dependent on the communicative situation). Along similar lines, there could also be a difference in how children and adults weight factors in the communicative context or in how certain they are about the speaker's perspective: children may give more weight to informativeness than perspective or be less certain about the speaker's perspective and more willing to revise it when faced with conflicting information. These alternatives need to be pursued in further research, in which different aspects of the visual, social, and discourse context are manipulated. However, the main findings of this study remain novel and interesting as a demonstration that at least in certain contexts, children cannot accurately integrate the speaker's visual perspective and pragmatic inferencing to arrive at the speaker's intended meaning.