Intonation, yes and no

English polar particles yes and no are interchangeable in response to negative sentences, that is, either one can be used to convey both positive and negative responses. We provide a critical discussion of recent research into this phenomenon (Kramer & Rawlins 2009; Krifka 2013; Roelofsen & Farkas 2015; Holmberg 2016), which leads to three questions: Does the intonation produced on yes and no depend on whether the response is positive or negative, and can intonation affect the interpretation of bare polar particle responses? Which particles do speakers prefer to use when? Are preference patterns sensitive to the polarity of preceding sentences in the context? In a series of experiments, we demonstrate that the contradiction contour (Liberman & Sag 1974) is an intonation that is commonly produced on positive responses to negative sentences, and that it affects hearers’ interpretations of bare particle responses. Beyond intonation, our experimental results add new evidence regarding speakers’ preferences for using yes and no in response to negative polar questions and rising declaratives. Finally, our results suggest that preference patterns are not sensitive to the polarity of context sentences.


Introduction
Whether a speaker chooses to say yes or no in response to a polar question (PQ) depends on the intended answer. Consider (1). If B means to say that Jane is coming, she uses yes, not no. 1 If she means to answer that Jane isn't coming, she uses no, not yes.
However, in response to negative PQs, B is free to choose yes or no to convey either answer. 2
1 Note that (1b) is not always impossible in response to positive PQs. We will return to this below in footnote 35.
2 High negation questions in which the negation is fronted with the auxiliary, e.g. Isn't Jane coming?, do not have this effect, and pattern with (1) (at least in American English). Yes can only mean that she is, while no can only mean that she is not (Romero 2006; Kramer & Rawlins 2009). We will not discuss high negation questions in detail in this paper, and when we say "negative questions" below, we will be referring to PQs with low negation (negation following the subject as in (2)). There is a further difficulty here, which is that low negation questions can sometimes be interpreted as if the negation were syntactically high (Romero & Han 2004; Reese 2007). We will ignore such readings. Moreover, an examination of our experimental results by item shows that participants did not interpret any of the low negation questions in our experiments as high negation.
(2015) and Holmberg (2016) have offered accounts meant to explain the diversity found in the polar particle systems of the world's languages. Interestingly, almost all prior work notes that factors such as intonation and whether the particle is followed by an overt clause or not could play a role in the felicity of the responses. However, there has been little agreement on the nature of the intonational effect. Which intonations do naïve speakers actually produce in yes/no responses? How do those intonations affect interpretation and preference patterns? We tried to answer these questions by running several production studies eliciting yes/no responses to polar questions and rising declaratives, in order to see which intonational tunes speakers choose to convey certain intentions. Furthermore, we ran a perception experiment to see how intonation affects the interpretation of bare particle responses. Taken together, these production and perception experiments expand our understanding of how intonation interacts with the use of polar particles in particular, but also of how intonation affects interpretation more generally.
Our experimental results contribute to the empirical landscape beyond the intonational findings. We also collected participants' felicity ratings. Our results partly replicate earlier findings, but also go beyond them. For example, we compared responses to polar questions with those to rising declaratives, and we compared yes with yeah responses. Neither comparison had been tested experimentally before. 7 Finally, we ran a follow-up reading experiment in English, modeled on work by Meijer et al. (2015) on German, that tested whether preference patterns are sensitive to the polarity of sentences in the context that precede the sentence that yes and no respond to.
In order to appreciate how intonation affects the interpretation of polar particles and how our results interact with the literature, we need to understand the issues involved in the analysis of polar particles. We therefore begin by summarizing some contemporary theories of polar particles in section 2. While doing this, we make some new observations that suggest a synthesis of the different theories may be needed. Then in section 3, we discuss relevant background on the contradiction contour (Liberman & Sag 1974), which, as we will see, plays a major role in responses to polar questions.
Having introduced the relevant background, we then turn in section 4 to experiments in which participants produce polar particle responses to polar questions and rising declaratives in scripted dialogues. In section 5, we describe a perception experiment in which participants interpret bare polar particles varied by intonation. In section 6, we discuss the reading experiment modeled on the work of Meijer et al. (2015).

Accounts of interchangeability
What does interchangeability in negative contexts tell us about the nature of polar particles like yes and no? Kramer & Rawlins (2009), Holmberg (2013; 2016), Krifka (2013), and Roelofsen & Farkas (2015) offer various answers to this question. In the following, we distill their theories down to their core explanations. Each account has its merits, and we will discuss some new evidence that suggests that certain aspects of each are needed.

Krifka (2013)
Krifka (2013) explains interchangeability by arguing that negative sentences introduce multiple antecedents for yes and no to pick up. Consider the likely interpretation of the propositional anaphor that in the following two examples:
(3) A: You didn't win the jackpot.
a. B: I expected that.
b. B: I didn't expect that.
That can either take the proposition that B won the jackpot or its negation as its antecedent, and which interpretation is more likely depends on the plausibility of the different readings given our world knowledge (winning a jackpot is unlikely and therefore should be unexpected). In Krifka's analysis, yes and no are propositional anaphora which, just like that or so, require linguistic antecedents. Polar particles are different from regular propositional anaphora, however, in that they do not just refer to a proposition, they also either assert it (in the case of yes) or assert its negation (in the case of no). Krifka therefore treats yes and no as standing in semantically for entire speech acts, and assumes that they have a corresponding syntactic phrasal category which Krifka calls ActP. This analysis aims to explain their syntactic distribution. For example, it can account for why it is not possible to say Yes surprised me.
The antecedents for polar particles are contextually salient linguistic expressions that denote propositions, which Krifka (2013) calls propositional discourse referents. A positive PQ as in (4) makes available a discourse referent d anchored to the TP, which denotes the proposition Jane is coming:
(4) Is Jane coming?
simplified LF: [ TP Jane is coming] ↪ d
a. Yes = assert (d) [meaning: Jane is coming]
b. No = assert (¬d) [meaning: Jane is not coming]
The particle yes can pick up this discourse referent d as in (4a) and assert Jane is coming. It cannot assert ¬d as there is no antecedent denoting that proposition. Similarly, no can only negate d producing ¬d as in (4b) because d is the only antecedent available. To produce d, no would need to be able to negate an antecedent that is equivalent to ¬d, but there is no such antecedent available in the context in (4). Krifka's account hence correctly predicts the non-interchangeability of yes and no in the context of a positive polar question (1).
The key idea accounting for interchangeability in negative contexts in Krifka's analysis is that these structures make two antecedents available, similar to (3). The first antecedent is the embedded antecedent d anchored to the TP, which denotes the positive proposition Jane is coming. The second antecedent is d′ anchored to the NegP, which denotes the negated proposition Jane is not coming:
(5) Is Jane not coming?
simplified LF: [ NegP not [ TP Jane is coming]] (NegP ↪ d′; TP ↪ d)
a. (i) Yes = assert (d) [meaning: Jane is coming]
(ii) No = assert (¬d) [meaning: Jane is not coming]
b. (i) Yes = assert (d′) [meaning: Jane is not coming]
(ii) No = assert (¬d′) [meaning: Jane is coming]
In (5a), yes asserts d, and no asserts its negation. In (5b), yes asserts d′ (which is equivalent to ¬d), and no asserts its negation. Under this analysis, bare polar particles are in principle expected to show the same interchangeability in negative contexts, since they should also be able to pick up one of the two antecedents. However, Krifka makes additional assumptions about pragmatic factors which affect which uses of polar particles in general and bare particles in particular should be more or less preferable given a certain discourse. We will return to these finer grained claims when discussing the experimental results.
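The core of this account, before any of the pragmatic preferences are added, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: propositions are reduced to a positive core plus a negation flag, and the names Proposition, antecedents and readings are hypothetical labels introduced here, not part of Krifka's formalism.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """A proposition reduced to a positive core plus a negation flag (a simplification)."""
    content: str            # the positive core, e.g. "Jane is coming"
    negated: bool = False   # True for the negation of that core

    def negate(self) -> Proposition:
        return Proposition(self.content, not self.negated)


def antecedents(core: str, question_is_negative: bool) -> list[Proposition]:
    """Discourse referents made available by a polar question (hypothetical helper)."""
    d = Proposition(core)              # d, anchored to the TP
    if question_is_negative:
        return [d, d.negate()]         # d plus d', anchored to the NegP
    return [d]


def readings(particle: str, core: str, question_is_negative: bool) -> set[Proposition]:
    """All propositions a particle could assert, ignoring pragmatic preferences."""
    refs = antecedents(core, question_is_negative)
    if particle == "yes":
        return set(refs)                   # yes asserts the antecedent it picks up
    if particle == "no":
        return {r.negate() for r in refs}  # no asserts the negation of its antecedent
    raise ValueError(f"unknown particle: {particle!r}")
```

With a positive question each particle has exactly one reading, while with a negative question readings("yes", ...) and readings("no", ...) return the same two propositions, which is the interchangeability pattern.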

Kramer and Rawlins (2009)
Under Krifka's analysis, yes and no are syntactically not related to the sentences that follow them. They form separate speech acts, and the relation to a following sentence would be guided by the same principles that guide sequences of speech acts in discourse more generally. Prior analyses have often assumed a much more direct syntactic relation between the particle and the following sentence, however, in which yes and no are part of the structure of the following sentence. For example, Kramer & Rawlins (2009; referred to as K&R), building on Laka (1990), propose that yes and no are adverbs in the specifier of the functional polarity head in the left periphery. 8 According to K&R's account, the use of no is constrained in that it can only occur if the following, optionally elided, sentence contains negation. This is achieved formally by assuming that no has an uninterpretable negative feature (uneg) that must enter into a negative concord chain with an interpretable negative feature (ineg). Yes does not contribute any features, and thus is not constrained by the polarity of the following sentence. This explains why both are compatible with negative responses to negative questions.

(6) A: Is Jane not [ineg] coming?
a. B: [ PolP No [uneg] [ TP she is not [ineg] coming] ]
b. B: [ PolP Yes [ TP she is not [ineg] coming] ]
While for Krifka the use of polar particles without a following sentence is unexceptional, under this analysis such cases are analyzed as fragment answers, where the TP-complement of the polar projection has been elided. As in all fragment answers, ellipsis requires an appropriate antecedent. K&R assume with Merchant (2001) that elided structure and the antecedent have to mutually entail each other. In the context of a negative question, the antecedent for ellipsis is simply the TP of the question:
(7) A: Is Jane not [ineg] coming?
a. B: [ PolP No [uneg] [ TP she is not [ineg] coming] ]
b. B: [ PolP Yes [ TP she is not [ineg] coming] ]
Yes and no are both predicted to be acceptable in (6) and (7). No is in a negative concord relationship with ineg in the elided constituent. Yes, on the other hand, does not place any syntactic requirements on the follow-up sentence, and hence is also acceptable here. The source of the interchangeability of yes and no in negative answers to negative questions is therefore that both particles are compatible with cooccurring with a clause that contains negation.
One issue with this account is that it only deals with one half of the interchangeability phenomenon, negative answers. As K&R note themselves, it is not obvious under their analysis why no can appear with a positive clause in response to a negative PQ, e.g. No, she IS coming. Since no contains a uneg feature, it must enter into a negative concord chain with an ineg feature, but there is none present in such a response. To explain this use of no, K&R claim that there is a second lexical entry for no that encodes, in combination with an obligatory intonational peak on the auxiliary, a reverse feature urev, similar to reverse particles such as French si and German doch.
Ellipsis is argued to be impossible in reversing no responses due to the obligatory intonational peak on the auxiliary. While K&R do not discuss positive yes responses to negative questions, for example Yes, she IS, we believe that their account predicts that ellipsis should be impossible, since there is no appropriate antecedent.
What about polar particles in response to a positive question? The analysis of the use of yes in positive responses is straightforward. The elided structure is simply identical to the antecedent Jane is coming provided by the question, see (8a). The reason no cannot be used in a positive response is also clear: There is no syntactic negative feature that it could enter into a concord chain with, and no is therefore not licensed, see (8b).
But how do no responses convey negative answers to positive questions? It would seem that the positive question cannot provide the required antecedent for the elided negative sentence in (8c). K&R argue that in such cases, an ineg feature appears in the polarity head, higher up in the structure than the ellipsis site, see (8c). This account of the use of no works technically, but it seems a bit ad hoc in that it runs counter to the idea that ellipsis is licensed by mutual entailment. Both Krifka (2013) and Roelofsen & Farkas (2015) note an additional problem: Since yes does not impose any syntactic requirement on the structure it occurs in, there is actually nothing in this account to stop yes from appearing in negative responses to positive PQs. Thus contrary to fact, the response in (9) is predicted to be entirely acceptable.
(9) A: Is Jane coming?
B: [ PolP Yes [ TP she is not [ineg] coming] ]
We turn now to an account in the same vein as Kramer & Rawlins (2009) that addresses some of these issues. Holmberg (2013; 2016), like Kramer & Rawlins, treats yes and no as specifiers of a functional projection, and analyzes bare particles as fragment answers in which everything but the polar particle remains unpronounced. For Holmberg, yes and no are operators that assign positive and negative values respectively to the polarity head of their complement. The polarity variables of the answers in these cases are different from that of the question, an apparent problem since the identity condition on ellipsis seems to be violated. Holmberg argues, however, that elided pronoun variables can have different interpretations than their antecedents, and he suggests that polarity variables behave in the same way. For example, Jane took her car, and Amanda did, too has an interpretation where Amanda took her own car. The idea then is that yes and no bind an elided polarity variable and assign it [+] or [-] respectively. The feature value [-] would be spelled out as not if the following sentence were not elided.
While Krifka finds the source of interchangeability in the possibility of either picking up the entire proposition in the context or just the one embedded under negation, Holmberg finds its source in a structural ambiguity in the negated context sentence. According to his analysis, negation in English attaches at two different heights. High negation attaches between TP and vP, and low negation is adjoined to vP/VP. When A asks the question "Is Jane not coming?", B is free to interpret the question as having high or low negation. If it is low as in (11a), then the polarity head is open and yes and no are both free to bind it. Yes will produce a negative response (11a-i), while no produces a double negative, which reduces to a positive response (11a-ii).
If negation is high, as in (11b), then the polarity head is negative and yes requires an unelided clause to change it to positive (11b-i), while no forms a negative concord chain with it (11b-ii). The structural ambiguity of the two attachment heights of negation in the polar question is what makes yes and no interchangeable in negative contexts in this account. The two polar questions (11a) and (11b) are string identical and interpretively indistinguishable. 10 Similar to the analysis in K&R, a bare yes cannot convey a positive answer, but must be followed by an overt clause as in (11b-i). Holmberg (2013) notes that in some contexts, bare yes seems to be able to convey a positive response to a negative question. He suggests that intonation may play a crucial role in making such responses felicitous, though he does not discuss what the intonation may be or how it interacts with his account to make the response felicitous.

Roelofsen and Farkas (2015)
Unlike Krifka (2013), Roelofsen & Farkas (2015; referred to as R&F) do not find the source of interchangeability in the presence of two potential antecedents, nor do they find it in a structural ambiguity in the context as Holmberg (2016) does. Rather, building on Farkas & Bruce (2010), which was in turn inspired by Pope (1972), R&F attribute the interchangeability of yes and no to their ability to encode two different types of features. On this view, polarity particles in English do "double duty": They can either signal the polarity of the answer being given or they can signal whether the present response agrees or disagrees with a prior utterance. According to R&F, absolute polarity features are responsible for the former function, while relative polarity features perform the latter.
Roelofsen & Farkas (2015) assume a polarity projection in the left periphery whose head takes a clausal argument, which is elided in the case of bare particle utterances, similar to Kramer & Rawlins (2009) and Holmberg (2016). There are two types of features that are realized on the polarity head. The absolute features [+] and [-] presuppose that the complement of the polarity head has positive or negative polarity respectively. 11 The relative features [agree] and [reverse] each introduce presuppositions relative to a unique most salient antecedent in the discourse: [agree] presupposes that the complement has the same polarity and identical propositional content to the antecedent, while [reverse] presupposes that the complement has opposite polarity and complementary propositional content to the antecedent. Yes is capable of realizing [+] or [agree], while no is capable of realizing [-] or [reverse]. Only one of the features it realizes needs to be present to license the polar particle.
This system is demonstrated in (12) and (13) -] are not felicitous, since one of their presuppositions would not be fulfilled. Similar to Krifka's analysis, R&F's account is in principle compatible with bare particles conveying either response, but just like Krifka, they make additional assumptions that predict bare particle interpretations to be more constrained, and we return to these when we discuss the experimental results.
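The licensing logic of the feature system described above can be illustrated with a short Python sketch. It rests on simplifying assumptions: the response is taken to concern the same propositional core as the antecedent, so [agree] and [reverse] reduce to a comparison of polarities, and the additional assumptions R&F make about bare particles are left out. The function names licensing_features and licensed are hypothetical.

```python
def licensing_features(particle: str, antecedent_positive: bool,
                       response_positive: bool) -> set[str]:
    """Return the features whose presuppositions are met for this particle.

    Assumes the response is about the same propositional core as the
    antecedent, so the relative features only compare polarities.
    """
    same_polarity = antecedent_positive == response_positive
    if particle == "yes":
        # yes can realize [+] (positive complement) or [agree] (same polarity)
        candidates = {"[+]": response_positive, "[agree]": same_polarity}
    elif particle == "no":
        # no can realize [-] (negative complement) or [reverse] (opposite polarity)
        candidates = {"[-]": not response_positive, "[reverse]": not same_polarity}
    else:
        raise ValueError(f"unknown particle: {particle!r}")
    return {feature for feature, satisfied in candidates.items() if satisfied}


def licensed(particle: str, antecedent_positive: bool, response_positive: bool) -> bool:
    """A particle is licensed if at least one feature it can realize is satisfied."""
    return bool(licensing_features(particle, antecedent_positive, response_positive))
```

After a positive antecedent, yes is licensed only in positive responses and no only in negative ones. After a negative antecedent, all four particle and response combinations are licensed, each by exactly one feature, which is how the double duty of the particles yields interchangeability in this sketch.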

Summary and synthesis
Now that we have reviewed several theories of polar particles, let's recap how they differ: First, they differ in the structural analysis of polar particles themselves. Some treat them as separate speech acts, syntactically unrelated to the following clause (Krifka 2013: the pure sentential anaphor view), others treat them as part of the syntactic structure of the following clause, which may possibly be elided (Kramer & Rawlins 2009; Holmberg 2013; 2016; Roelofsen & Farkas 2015: the single structure view). Second, they differ in their accounts of interchangeability. They attribute interchangeability to the presence of multiple potential antecedents (Krifka 2013), to a structural ambiguity in the context sentence (Holmberg 2016), or to the ability of yes and no to realize absolute or relative polarity features (Roelofsen & Farkas 2015, building on Farkas & Bruce 2010 and Pope 1972).
Are there any arguments that can distinguish between these accounts on these two points? In the following we will make several novel observations that bear on these issues. But before discussing these observations, we note that the various accounts we discussed also differ in other ways. First, they make more fine grained claims about the possible readings of bare particles, and about which of the four possible combinations of polar particles and responses in the interchangeability paradigm in (2) are more or less felicitous. As mentioned before, we postpone discussion of these more fine-grained issues until section 4, in which we discuss results from our production experiments that bear on these predictions. Second, these accounts differ in their crosslinguistic empirical reach. The proponents of each of these accounts discuss ways that they can be extended to explain the behavior of polarity particles crosslinguistically. However, Roelofsen & Farkas (2015) and Holmberg (2016) are more ambitious than Krifka (2013) in this regard in that they each make proposals for how their accounts might make universal predictions for polar particle responses. Krifka's account would need to be extended in nontrivial ways in order to make such universal predictions. Given space limitations and the fact that our experiments below focus only on English, we will not engage in an in depth discussion of these issues.

An argument for the single structure view
One difference between Krifka's theory of polar particles and all of the others is that it assumes that polar particles do not stand in a direct syntactic relationship with the following sentence, but are instead analyzed as separate speech acts (Krifka 2013: 7-8). While any analysis must allow for uses of yes and no as separate speech acts, we present two arguments in favor of a single structure analysis of polar particles from the intonation of polar particles and their following clauses in English.
The first argument is a simple observation: Polar particles and following sentences can easily be pronounced with a single intonational tune (observed already in Pope 1972: 147, ex. (73R2), and also demonstrated in our experiments below). This seems to be impossible with sequences of two separate speech acts. Consider (14). In (14b), B clearly makes two separate speech acts. 12
(14) A: Jane likes steak, right?
a. B: No, she doesn't.
b. B: She doesn't, she's vegetarian.
(14a) can be pronounced using a single intonational contour without difficulty. In the recordings (see footnote 12), it is produced with a single contradiction contour over the whole utterance. However, (14b) cannot be pronounced this way. The recordings include three pronunciations of (14b). (i) is a single contradiction contour over the whole utterance. (ii) is a single falling contour over the whole utterance. (iii) is a falling contour on she doesn't, followed by a separate falling contour on she's vegetarian. We believe there is a clear contrast between the first two, which are infelicitous, and the third, which is felicitous. 13 If we are correct that two separate speech acts cannot be pronounced with a single intonational tune, then the fact that (14a) can be pronounced with a single contour suggests that polar particles and following clauses can exist in a single speech act. This would run contrary to the predictions of Krifka's account.
A second, related argument is due to Michael Rochemont (p.c.), who pointed out to us that it is possible to shift prominence to a polar particle and deaccent a following sentence as in (15)
If prominence shifts can indeed only occur intrasententially, this suggests that contrary to Krifka's analysis, yes and no are at least sometimes part of a larger syntactic structure with the following sentence. We note further that they pattern with propositional adverbs in this regard:
(18) A: Does she like coffee?
a. B: Of COURSE, she likes coffee.
b. B: SURE, she likes coffee.
The meaning of the prominence shifted utterances in (15) as well as those in (18) seems to have focus on the polarity of the proposition (similar to "verum focus", see Höhle 1992), which is compatible with Holmberg's and R&F's proposed connection between polar particles and the polarity head. These observations, taken together with crosslinguistic facts discussed by Kramer & Rawlins (2009), Roelofsen & Farkas (2015) and Holmberg (2016), arguably speak in favor of analyzing the syntax of polar particles similarly to propositional adverbs, high in the left periphery, and analyzing uses of bare yes and no as involving ellipsis. Note that this conclusion only impacts Krifka's syntactic assumptions, not his account of interchangeability. One can imagine the minimal modifications required to make Krifka's account compatible with the preceding facts. The result is a hybrid between Krifka's and R&F's theories that retains Krifka's explanation for interchangeability. First, suppose that yes and no attach high in the left periphery, as proposed in K&R's, Holmberg's and R&F's analyses. Krifka's semantics for yes and no are roughly identity and negation. Translating these into a form that operates on syntactic complements produces operators that are indistinguishable from R&F's relative polarity features: Yes requires its complement to have identical propositional content and polarity to a salient antecedent (R&F's [agree] feature), while no requires its complement to have the opposite propositional content and polarity from a salient antecedent (R&F's [reverse] feature). Despite this, the revised theory does not capture interchangeability in the manner of R&F's account, which depends on having both relative and absolute features. 
Instead, it makes all of the same predictions as Krifka's theory (apart from the prominence shifting facts above) because it captures interchangeability via the same basic insight proposed in Krifka (2013): Negative sentences introduce two discourse referents, one corresponding to the NegP and one corresponding to the TP. Therefore, there are still three accounts of interchangeability on the table: Multiple discourse referents (the modified Krifka 2013), structural ambiguity of negation (Holmberg 2016), and multiple kinds of polarity features (Roelofsen & Farkas 2015). 14
14 Based on the intuitions of a few informants as well as our own, (ib) and (id) seem to be degraded relative to the other responses. This might suggest that in responses like (ia) and (ic), the polar particles can be within the same structure as the following sentence, while in responses like (ib) and (id), they cannot be. Huddleston & Pullum (2002: 848) claim that the responses Yes it isn't and No it is are ungrammatical as single clauses. It may be that while on the face of it English polar particles seem to be interchangeable, uses of yes with a following negated sentence and of no with a following non-negated sentence are in fact necessarily separate utterances. This might be taken as an argument in favor of an account like Roelofsen & Farkas's (2015), which allows yes and no to realize either absolute or relative polarity features, if we were to make the additional assumption that relative features are more pragmatic in nature, while only absolute features are true polarity features that take part in a syntactic structure with a following clause. Clearly more work is needed, both to fully establish the empirical facts, and to flesh out an explanation.

An argument for the multiple antecedents view of interchangeability
Krifka hypothesizes that yes and no are sensitive to multiple propositional discourse referents arising from negative sentences. If yes and no can pick up multiple antecedents from other kinds of sentences, this would provide independent evidence in favor of the multiple antecedent view. Krifka (2013: 5) already demonstrates the existence of multiple antecedents with the propositional anaphor that, similar to our example (3). (19) provides more direct evidence for the analysis from uses of yes and no responses themselves.
of the [reverse] feature to be met in each response, it has to be assumed that each embedded propositional constituent introduces an antecedent. Thus under R&F's account, we are forced by (19) to make the same crucial assumption undergirding Krifka's account, namely that polar particles can refer to more than one propositional discourse referent made available by a preceding utterance. Given that both of the accounts must allow for the existence of multiple antecedents arising from other embedding environments, and given that that is sensitive to both positive and negative discourse referents arising from negative sentences, analyzing yes and no as picking up propositions embedded under negation à la Krifka seems appealing. It explains the phenomenon of interchangeability without positing any other extra factors such as multiple kinds of polarity features, or sensitivity to the structural height of negation.
In response to ( We think that the cause of these intuitions is pragmatic. In particular, the availability of an antecedent may be constrained by whether the resulting assertion is relevant to the question under discussion (QUD, Roberts 1996/2012). Discourse referents attached to matrix propositions are usually relevant to the QUD, thus are usually available. In (19), the QUD is whether John is home, so the most deeply embedded clause is available as an antecedent. Moreover the issue is disputed, so Mary's beliefs on the topic may be relevant, and since this is the case, what A knows about Mary's beliefs on the topic is also relevant, making each of these larger constituents available as antecedents in (19). Suppose we alter (20) by switching the matrix subject to the first person pronoun I. The resulting I hope that Peter wrote a thank you note may be taken as an indirect question about whether Peter wrote a thank you note. Thus we would predict the responses in (20b) to be more felicitous, and we believe this matches intuitions. 15,16
15 These observations connect to Simons (2007), which reports the following judgements:
(i) A: Who was Louise with last night?
a. B: Henry thinks/believes/suggested/hinted that she was with Bill.
b. B: I think/believe/imagine/suppose that she was with Bill.
c. B: (?)Henry hopes/I hope that she was with Bill.
Simons claims that embedded clauses can answer questions in some cases, with the matrix clause serving as an evidential. On (ic), she writes, "The oddity […] is presumably due to the fact that Henry's hopes […] do not provide very good evidence as to what is the case, and so are not evidence on which answers to a factual question should be based" (Simons 2007: 1037). We think that there is a contrast between the third and first person subject in (ic) with the latter being more acceptable. This contrast may be related to the contrast in whether the polar particle can pick up the embedded clause in (20): First person subjects with hope may make embedded clauses more directly relevant to the QUD (or "main point" in Simons's terms), thus making them more available as antecedents.
16 The Glossa reviewer also introduced the example in (i):
(i) If Jane is at home, Peter will be too.
Similar to our comments about (20), we think that when the conditional in (i) is embedded into a larger discourse with a QUD about either Jane or Peter's whereabouts, the relevant responses may improve, however the intuitions here are subtler than those for (20), and may require further exploration.
If we return now to considering embedding via negation, it should not be surprising if we don't see strong asymmetries with respect to whether the embedded antecedent is available or not. If the truth of a proposition is relevant to the QUD, then we would also expect its negation to be relevant, thus discourse referents arising from both propositions should be available to polar particles. We conclude that there is some independent evidence for a multiple antecedents account of interchangeability such as Krifka (2013). In combination with the evidence in favor of treating polar particles as part of a larger syntactic structure with following clauses (as advocated for by Kramer & Rawlins 2009; Roelofsen & Farkas 2015; Holmberg 2016), this suggests that the correct theory may be a combination of ingredients from the described accounts. But as mentioned above, there is still non-trivial work to be done to extend a multiple antecedents analysis of interchangeability so that it could capture the full range of crosslinguistic facts, a task that remains beyond the scope of this paper. Further comparisons of the theories discussed above, in particular their accounts of preference patterns, will be taken up when we discuss our experimental results.

The contradiction contour
What we know so far about interchangeability in responses to polar questions in English is mostly based on impressionistic judgments and two rating studies in which participants evaluate written dialogues (Kramer & Rawlins 2012; Brasoveanu et al. 2013). The reliance on written stimuli is potentially problematic, however. All prior research we have discussed invariably notes that certain intonational tunes may correlate with certain types of responses. Roelofsen & Farkas (2015) claim that positive responses to negative sentences have to bear verum focus prominence on the auxiliary of the following clause, for example No, she IS. Krifka (2013) claims such positive responses require a "rejecting accent", though he does not describe what it should sound like. Cooper & Ginzburg (2011) observe a "rise fall" tune on the polar particle no in disagreeing responses. The observations about the role of intonation have so far remained impressionistic, and there has not been any empirical work to test which intonational patterns English speakers actually produce.
Our own intuition was that one particular rising tune, the contradiction contour (CC) described in Liberman & Sag (1974), would play a role. The contradiction contour, as its name suggests, has been associated with evoking a sense of contradiction with the prior discourse. We are not alone in the intuition that it is crucial in understanding responses with polar particles. Pope (1972: 145-147) identifies a rising-falling-rising tune that can be imposed on the utterance no, it isn't in response to a positive assertion (ex. (67R2)) or the utterance yes, he is in response to a negative assertion (ex. (73R2)). If this is correct, and the contradiction contour can appear on particles themselves, then it might disambiguate bare particle responses to negative questions. In the following, we will describe the form and meaning of the contour in question. 17

The form of the contradiction contour
The example in (21) exhibits a common use of the CC. 18 The image in Figure 1 is a pitch track of a participant's performance of B's utterance made using Praat (Boersma & Weenink 2017). 17 In experimental work on Catalan and Russian, González-Fuente et al. (2015) find that Catalan speakers frequently use a rise fall rise intonation they call "contradiction tune". This tune seems to be similar in form to the English contradiction contour (Liberman and Sag 1974), a fact meriting further exploration. Interestingly, González-Fuente et al.'s (2015) intonational results for Russian differ from those for Catalan (and English, based on our results), both in terms of distribution and form. 18 "(CC)" following an utterance indicates that the utterance bears a CC tune. There are two CCs in (21).
Again, recordings of all examples where intonation is crucial can be found via section 7. According to Ladd (1980: 150), the CC begins with a rise-fall, followed by a low-rise pitch accent on the nuclear or main stress of the sentence. The initial high-fall is likely to be preceded by a rise if the utterance involves multiple syllables preceding the syllable carrying the final accent, such as in our example (21). On a monosyllable like no, the tune starts high and falls to the low pitch accent before the final rise. 19 Pierrehumbert & Hirschberg (1990) transcribe the nuclear tune of the CC in ToBI as L* L-H%, while Constant (2012) transcribes it as L*(+H) L-H%. These transcriptions do not capture the fact that the CC necessarily has an initial rise or at least starts high before reaching the final low pitch accent. This pre-nuclear part of the CC is reflected in Ladd's (1996) transcription L*+H L* L-H% and Bartels's (1999) H+!H* L* L-H%. Hedberg et al. (2003) conduct a corpus study that lends support to Ladd's and Bartels's transcriptions, and they further observe that the CC may have one or more L* pitch accents before reaching the obligatory final L* L-H% nuclear tune (Hedberg et al. 2003: 2). 20 A problem we see with these transcriptions is that it is not clear that the initial rise or high tone is a pitch accent: it does not need to align with a stressed syllable. For instance, in the monosyllabic no in (21), the tune reliably starts high even though there can only be one pitch accent, and it is a low one. Moreover, in longer utterances, the initial rise-fall (or high-fall) does not necessarily align with a stressed syllable, contrary to what we would expect if it were the result of a pitch accent (cf. Ladd 1980: 150). Thus, we believe a more accurate transcription might be %H L* L-H%, with the possibility of iterated L* pitch accents after the initial %H in cases where multiple words are accented.
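The proposed transcription can be read as a simple template: an initial %H boundary tone, one or more L* pitch accents, and the final L-H% edge tones. The following Python sketch spells out that template as an illustration only; the function name cc_transcription and its parameter n_accents are our own hypothetical labels, and the sketch makes no claim about how accents align with syllables.

```python
def cc_transcription(n_accents: int = 1) -> str:
    """Build a ToBI-style label string for the contradiction contour.

    Assumes the template %H (L*)+ L-H%: an initial high boundary tone,
    at least one low pitch accent (the last one is nuclear), and a
    low phrase accent followed by a high boundary tone.
    """
    if n_accents < 1:
        raise ValueError("the contour needs at least the nuclear L* accent")
    tones = ["%H"] + ["L*"] * n_accents + ["L-H%"]
    return " ".join(tones)


# A monosyllabic response such as "No" carries only the nuclear accent.
print(cc_transcription(1))
# A longer utterance with two accented words iterates the L* accent.
print(cc_transcription(2))
```

On this view, the only thing that varies with the length of the utterance is the number of L* accents between the two boundary tones.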
19 In Iberian Spanish, Torreira & Grice (2017) demonstrate experimentally that tunes may have one form when produced with a single prosodic word, but another, longer form when produced with two or more prosodic words. We think something similar may be happening with English CC, although more phonological work is needed. 20 We are discussing here what Hedberg et al. (2003) refer to as "Classic Contradiction Contour", which they contrast with two variations of the CC. One of them, the "Contrastive Contradiction Contour", may be related to the rise-fall-rise contour, discussed in section 3.2.

Keeping the contradiction contour separate from the phonetically similar rise-fall-rise contour
Pope (1972: 139-147) observes the use of a rising-falling-rising intonation in responses contradicting a previous statement, which she characterizes as a "B-accent", drawing on Jackendoff (1972) who discusses B-accents in the context of contrastive topics. Utterances with a final B-accent were later analyzed as involving the so-called rise-fall-rise contour (RFR, Ward & Hirschberg 1985), which is usually transcribed as L*+H L-H%. We believe that the tune Pope observes is not the RFR/B-accent, but the CC. Arguably, the two contours have often been confounded. In fact, Liberman & Sag (1974) analyze certain examples that are now widely agreed to exhibit the RFR as involving the CC. In response to Liberman & Sag, Ladd (1980: 148-152) gives compelling arguments for distinguishing the two, but identifies a particular set of circumstances in which they are practically indistinguishable, thus providing an explanation for the common confusion. We illustrate Ladd's insights with the following examples: 21
(22) A: Jane doesn't like movies.
a. B: Jane likes movies (CC). %H (L*) L* L-H%
b. B: JANE likes movies (RFR). L*+H L-H%
The example in (22a) has the CC with the nuclear stress falling on the object. (22b) has the RFR with subject focus, hence the nuclear accent falls on the subject. Ladd's insight is that these two types of utterances sound very similar to each other, despite the difference in tune and prominence location. Varying the choice of tune and prominence such that we have CC + broad focus and RFR + subject focus produces two intuitively indistinguishable contours. Ladd (1980) also helps us convince ourselves that, despite appearances, the intonational tunes in (22)
L*+H L-H%
Whereas the contour in (23a) rises and falls utterance initially, in (23b) the contour does not start rising until the third syllable ra. This is because RFR locates its initial rise on the stressed syllable of the focussed constituent, whereas the CC's rise-fall always occurs utterance initially. When we try to use these two utterances in a context that motivates the use of the RFR but not the CC, we can see that they are also clearly semantically distinct:
(24) A: Who do we know who likes movies?
a. B: #Alvarado likes movies (CC).
b. B: ALVARADO likes movies (RFR).
The meaning of the CC as described by Liberman & Sag (1974) requires some element of contradiction, which is not motivated in the context in (24), hence (24a) is infelicitous, whereas the requirement is clearly met in (23), so (23a) is felicitous. The RFR, on the other hand, signals that the utterance is incomplete or that some further implication is intended. 23 This contribution is compatible with the contexts in both (23) and (24), but has a very different effect on meaning from the CC: When the CC is used in (23a), B simply conveys that she disagrees with A's statement; when the RFR is used in (23b), B conveys that while Alvarado does like movies, there is someone else who does not (or at least remains noncommittal about others). There is no such implication in (23a). This incomplete/implication meaning of RFR is also compatible with (24), where it implies that there may be others who like movies as well, and B isn't sure whether she has completely answered A's question.
Thus it is clear that the CC and the RFR are distinct contours with distinct semantic contributions. While the RFR necessarily involves some sort of implication about focus alternatives, the CC necessarily involves a sense of contradiction. Ladd teaches us that when we are in doubt about which contour is in question, we should try to construct the example so that the first stress does not appear until at least three syllables in.
We turn now to the task of producing a more precise characterization of the sense in which the contradiction contour contradicts.

The meaning of the contradiction contour
About the meaning of the contradiction contour, Liberman & Sag (1974: 421) write, "This contour is appropriate (although of course optional) just when the speaker is using the utterance that bears it to contradict-he may contradict what has just been said by another, he may contradict some assumption or implication of what has been said or done by another, or he may contradict himself" [boldface ours, replaces underlining in the original text]. Defining the notion of "contradict" in this quote is not trivial, as is anticipated by the authors, who allow for contradictions of implications of both verbal and non-verbal actions. If we were to define a contradiction as two contradictory sentences in the logical sense, then we would not predict the use of the contradiction contour in responses to PQs, as in for example (25). 24 (25) It's been a busy day at work. You have ten clients to meet with before your boss gives a presentation at 4 pm that everyone is expected to attend. You are intent on going to the presentation because you have an important question to raise.
In your haste to meet with all ten of your clients, you completely lose track of time. Your coworker Thomas knocks at your door. You look at the clock which reads 4:07 pm and you realize you are late for the presentation. Thomas asks: Thomas: Are you not coming to the presentation? You: No (CC). I'm coming to the presentation (CC).
The apparent antecedent of the CC is not a prior assertion, but the information provided by the context and the question, which at best imply evidence in favor of the negative response (for discussions of the notion of evidence in negative questions, an issue we will take up below, see Büring & Gunlogson 2000; Trinh 2014; Roelofsen & Farkas 2015; Krifka 2017). Clearly, a weaker notion of contradiction is needed. One possible analysis is that an utterance of p(CC) requires a propositional discourse referent anchored to ¬p, and thus involves a propositional anaphor similar to polar particles under Krifka's (2013) or Roelofsen & Farkas's (2015) analyses. But such an analysis would miss an important difference between polar particles and the CC. Consider (26), in which A asks a positive PQ. (26) It's been a busy day at work. You have ten clients to meet with before your boss gives a presentation at 4 pm that everyone is expected to attend. In your haste to meet with all ten of your clients, you completely lose track of time. Your coworker Thomas knocks at your door. You look at the clock which reads 4:07 pm and you realize you are late for the presentation. The judgments in (26b) through (26e) are exactly as expected given existing theories of polar particles: Since A asks a positive PQ, yes and no can only be used as in (26b) and (26c). (26d) and (26e) are unavailable. Note, however, that using the CC on a positive response as in (26a) is perfectly acceptable here. An analysis that treats the CC as similar to polar particles creates the following puzzle: Why is the CC licensed on a positive response here when interchangeability is not licensed? To explain how the CC is licensed, we will adapt ideas about contextual evidence introduced by Büring & Gunlogson (2000) for constraints on polar questions. We propose that the CC requires contextually salient evidence against the asserted proposition.
We define evidence as follows: (27) Contextual Evidence: Evidence for p is a change in the context that increases the likelihood that p is true.
As Büring & Gunlogson note, contextual evidence needs to be publicly available. We further note that contextual evidence can come from any kind of perceptual experience or from interlocutors' actions, including speech acts. Moreover, contextual evidence does not necessarily affect the speaker's commitments or expectations about p (Büring & Gunlogson 2000: 8). For example, the contextual evidence for p in (28) and (29) does not seem to affect B's commitment to ¬p: (28) B is an experienced animal tracker who knows that mountain lions no longer live in these parts. Then B sees some mountain lion scat, and says to herself: B: There aren't any mountain lions around here (CC).
(29) A: There are mountain lions around here. B: There aren't any mountain lions around here (CC).
Using p as a label for the proposition that there are mountain lions around here, we take (28) and (29) to both be contexts in which there is contextual evidence for p by our definition in (27). In (28) evidence for p is in the form of B's perception of scat (combined with B's expertise), and in (29) the evidence for p is in the form of A's assertion of p. In both cases B asserts ¬p despite the evidence for p. This presumably means that B's commitments or expectations have not been altered with respect to p by the new evidence. So, while evidence for p is some publicly available feature of the context that would ordinarily increase the likelihood that p is true, it needn't necessarily cause B to change her own beliefs about p.
Before demonstrating the role that this notion of evidence plays in the meaning of the CC, we note that it is independently needed for other linguistic phenomena. For example, it can be motivated based on broad principles of question answering. Consider Roberts's (1996/2012) formulation of Gricean relevance (Grice 1989): (30) A conversational move is relevant to the question under discussion (QUD) iff it either introduces a partial answer to the QUD, or raises another question as part of a strategy to answer the QUD.
Suppose that questions denote sets of alternative propositions (Hamblin 1973). For Roberts, a partial answer is a proposition that contextually entails that one of the Hamblin alternatives is true or false. There may be reason to recast relevance in terms of our notion of contextual evidence in (27) by adjusting (30) with the italicized part in (31).
(31) A conversational move is relevant to the QUD iff it either provides evidence for or against one or more of the propositions denoted by the QUD, or raises another question as part of a strategy to answer the QUD.
(31), but not (30), is broad enough to be compatible even with very indirect responses to questions, such as in (32): (32) A: Who will come to the reading group meeting today? B: Jane's mother is in town. ⇝ Jane is unlikely to come to the meeting.
B's response does not contextually entail that Jane is not coming, thus it does not provide a partial answer to the QUD according to (30). It is nevertheless an intuitively valid response. When combined with world knowledge about visiting mothers, it provides evidence for the proposition that Jane will not come to the meeting by our definition of evidence in (27), that is, it increases the likelihood that the proposition is true. Since this proposition is the negation of a partial answer to A's question, B's response counts as providing evidence against an answer, and (31) is met. The need for a notion of contextual evidence such as (27) is further motivated by its use in Büring & Gunlogson (2000) and Sudo (2013). For example, in (28) in which B sees mountain lion scat, B could use There are mountain lions around here? to convey her incredulity. In fact, a question rise on a declarative is by far the contour speakers prefer to use to convey incredulity in North American English (Goodhue et al. 2016). This suggests that an analysis of rising declaratives in terms of speaker bias in favor of the proposition would be too strong (though see Westera 2017 for a defense of a speaker bias analysis). Rather, rising declaratives seem to require contextual evidence for p, and are compatible with speaker bias in either direction.
With the notion of contextual evidence in (27) in place, we return to the task of characterizing the meaning of the contradiction contour. Following Truckenbrodt (2012), we assume intonational contours compose with proposition denoting constituents and act as partial identity functions, placing felicity requirements on the utterance.
The contradiction contour takes a proposition p as input and presupposes that there is contextual evidence for the complement of p, i.e. evidence for ¬p. If the presupposition is met, the contradiction contour returns p.
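The proposal can be pictured as a partial identity function: it is defined only when its presupposition holds, and in that case it hands back its input unchanged. The Python sketch below illustrates this under simplifying assumptions of our own. Propositions are modeled as labeled objects with a polarity flag, and the context is reduced to a set of propositions for which there is contextual evidence. The names Prop, PresuppositionFailure and contradiction_contour are hypothetical and serve only this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    """A toy proposition: a content label plus a polarity flag."""
    content: str
    negated: bool = False

    def neg(self) -> Prop:
        # The complement of p is the same content with flipped polarity.
        return Prop(self.content, not self.negated)


class PresuppositionFailure(Exception):
    """Raised when the contour is used without the required evidence."""


def contradiction_contour(p: Prop, evidence: set[Prop]) -> Prop:
    """Partial identity function: defined only if the context contains
    evidence for the complement of p; if so, it returns p unchanged."""
    if p.neg() not in evidence:
        raise PresuppositionFailure(f"no contextual evidence against: {p}")
    return p


# The mountain lion examples: the scat, or A's assertion, is evidence for p.
lions = Prop("there are mountain lions around here")
context = {lions}

# B asserts the negation with the CC: the presupposition is met.
asserted = contradiction_contour(lions.neg(), context)
print(asserted == lions.neg())
```

Calling contradiction_contour(lions, context) in the same context raises PresuppositionFailure, since nothing in that context counts as evidence for the negation of the asserted proposition.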
The resulting assertion will have the same propositional content as p. It is the tension between presupposing contextual evidence for ¬p and being true iff p is true that creates the signal of disagreement that the CC is known for. 26,27 Returning to our comparison between the CC and polar particles, we can see that they are similar in that they are both dependent on previous context, but they differ in that polar particles require a discourse referent denoting a certain proposition, while the CC merely requires contextual evidence for a certain proposition. This explains the asymmetry in (26).
Our analysis of the CC is similar to that of Liberman & Sag (1974) in that we tie its meaning to the notion of contradiction, but it is more precise and is therefore able to address an objection that Pierrehumbert & Hirschberg (1990: 293) raise via the following example: (36) A: My chances? The election isn't over till the last ballot has been counted. B: #But CBS has just declared you the next president (CC).
In (36), B is in some sense trying to contradict A's claim that the election is not over, thus Pierrehumbert & Hirschberg note that Liberman & Sag's analysis incorrectly predicts
26 A possible challenge to treating the CC as a presuppositional operator is that, assuming the principle of Maximize Presupposition (Heim 1991), we might expect the CC to appear obligatorily in all contexts in which it is licensed. However, as Liberman & Sag (1974) already note, the CC is used optionally. It may be that the CC includes some additional expressive meaning that a speaker may or may not consider appropriate in a given context. However, we think it is likely that a speaker is always free to choose one tune or another given some context and sentence to be uttered. Thus, if we wish to maintain an analysis of tunes as imposing felicity conditions on utterances as a general program, we may need a more general explanation for this optionality.
27 Portes & Reyle (2014) provide a QUD account of French implication contour, which the authors claim primarily encodes a contradiction. Therefore, one might wonder whether such an account could also be applied to the CC. Note however that there are empirical differences in the distributions of the two contours. E.g. Portes & Reyle's example (3), in which French implication contour is acceptable, would be unacceptable with the CC in English, but acceptable with a contrastive accent on curtains:
(37) it is odd to use the CC, but if the order of the two statements is reversed as in (38), using the CC in the response is possible: A's claim in (37) (37), it combines with p, the proposition that Alvarado said ¬q, and there is no evidence against p in the context, so the CC utterance is infelicitous.
But in (38), the CC combines with q, that there are mountain lions around here, and B's utterance provides evidence against q since B asserts p, which embeds ¬q under the verb said (see Simons 2007 for an argument that utterances of the form X said p introduce evidence for p). So while the CC cannot see embedded propositions in its complement, utterances in the preceding discourse can provide contextual evidence for propositions embedded within them, at least in some cases. 29 There are other observations in the prior literature that pose a challenge to our analysis, however. Pierrehumbert & Hirschberg (1990) argue that the contour we describe can also be used in completely non-contradictory speech acts such as in greetings (e.g. Good morning (CC)!). We also note here the similarity of the contour we analyze as the CC and O'Connor & Arnold's (1973) "low bounce" intonation, which they demonstrate can be used to ask a question, among other things. If what we call the contradiction contour can really also be used in these different types of speech acts, then our characterization of its meaning may not be general enough to cover all its uses. We leave it to future work to determine if these other cases are indeed instances of the contradiction contour, or whether they may be another distinct contour. 28 Nor can the CC itself be embedded, as noted in Ladd (1980). If it could, we would expect there to be a pronunciation and reading of (37)B that is felicitous. This suggests that the CC might actually be an operator over speech acts, similar to Wagner's (2012) proposal for the RFR contour. Another possibility is that the CC contributes a conventional implicature, as in the analysis of expressive meaning in Potts (2005). These issues are left to future work. 29 Meanwhile, the meaning that Pierrehumbert & Hirschberg (1990: 293) propose for the CC-roughly "the addressee should already be aware that p"-cannot be accurate. 
Suppose Betty tells Ann that tomorrow she wants waffles for breakfast. In the morning, Ann says, "What do you want for breakfast today?" Since Ann should be aware that Betty wants waffles and the question implies she is not aware, Pierrehumbert & Hirschberg predict the CC to be felicitous on "I want waffles," but it clearly isn't. However, it is clearly felicitous on "You know what I want." Since Ann's question provides evidence against the latter sentence but not the former, our account predicts the asymmetry.

Production experiments
There are three groups of empirical questions that we aim to address with the experimental work below:
i. Intonation: Does a special intonation appear on positive yes/no responses to negative PQs, as claimed in previous studies? If so, how does it affect the interpretation of bare particles?
ii. Preference patterns: When responding to a negative sentence, which particles do speakers prefer to use when giving a response with negative polarity? With positive polarity? How are bare polar particle responses to negative sentences interpreted?
iii. Context sensitivity: If the negative sentence that the polar particle responds to is itself responding to a negative sentence, are preference patterns affected, and in particular is yes now more acceptably interpreted as a negative response, e.g. "she didn't"?
As discussed in the introduction, previous experimental work has already provided partial answers to the questions in ii. Brasoveanu et al. (2013) found that no was preferred over yes when giving a negative agreeing response to a negative declarative with a referential NP subject, e.g. "No, she didn't." However, their study does not consider positive responses to negative sentences, which are included in our experiments. Moreover, Kramer & Rawlins (2012) found that bare polar particles are more likely interpreted as agreements with negative questions, e.g. "she didn't." Our results will suggest that it is crucial to control for intonation when exploring these questions. Question iii has been addressed for German in Meijer et al. (2015), where it was found that preference patterns are not context sensitive. We will report on a similar experiment in section 6.
Finally, our experiments provide empirical evidence answering question i for the first time.
In total we will report on five experiments:
1 - A production experiment that gathers intonations and naturalness ratings for the polar particles yes and no in response to both positive and negative polar questions (PQs) in which inversion has taken place and the auxiliary is fronted (section 4).
2 - A production experiment testing yeah and no in response to negative rising declaratives, that is, sentences with declarative word order that have the rising intonation typically associated with PQs.
3 - A follow up to experiment 2 that tests naturalness ratings for bare particles (section 4).
4 - A perception experiment that tests participants' interpretations of bare yes/no responses to negative PQs, controlling for intonation (section 5).
5 - A rating experiment modeled after the experiments in Meijer et al. (2015) that tests whether the polarity of a preceding context sentence affects preference patterns (section 6). 30

Methods
The methods of our three production experiments (Experiments 1, 2 and 3) are very similar. We will describe the methods for experiment 1 in detail first, then we will note how experiments 2 and 3 differ from it.
In each trial, the participant silently read a context story, followed by two lines of dialogue. See (25) above and (39) below for samples of our stimuli. Participants were asked to produce the second line of dialogue as naturally as possible. When ready, they would press any button to hear a recording of the first line of dialogue through headphones, and then they would be recorded producing the second line. The recording of the first line always featured rising, polar question intonation.
Experiment 1 has three factors: question, whether the questioner asked a positive or a negative question; particle, whether the participant used the polar particle yes or no; and answer, whether the participant gave a positive answer (yes/no, I am coming) or a negative answer (yes/no, I'm not coming).
Each level of the factor answer required a different context story. The context in (25) sets up a positive answer (e.g. yes/no, I'm coming to the presentation). The context in (39) sets up a negative answer:
(39) It's been a busy day at work. You have ten clients to meet with before your boss gives a presentation at 4 pm that everyone is expected to attend. You've been to hundreds of your boss's presentations, and you think they are boring and keep you from doing important work. You plan to meet with your clients, and if you can't finish meeting with all ten by 4 pm, then you'll just have to miss the presentation since clients come first. Your coworker Thomas knocks at your door at 4:07 pm. He asks:
Thomas: Are you not coming to the presentation?
You: No ___ I'm not coming to the presentation
Regardless of whether the context sets up a negative or positive answer, all contexts were designed to make contextual evidence for the negative response salient. E.g. in both (25) and (39), there is contextual evidence suggesting that the participant's character is not coming to the presentation, namely they are in their office working even though it has already started. When a negative polar question was used, it reinforced the contextual evidence, at least if the analysis of negative polar questions in (33) is correct. 31 There were eight context pairs total (eight items). Responses were always complete sentences, but we will refer to them as "Am" and "Am Not" in plots and tables for brevity. We instructed participants to pause at "___" to maximize unique intonations on polar particles themselves. This made annotations of intonations easier, and also enabled us to use participants' polar particle productions in the perception experiment to be discussed in section 5. Participants were not made aware that the experiment was about intonation.
After recording each trial, we asked for the participant's naturalness judgments: "Please indicate how natural this response seems on a scale of 1 to 5 (1 = least natural, and 5 = most natural)." We intended for participants to rate the naturalness of the response they were asked to give, not the naturalness of their own production of the response. Evidence that they conformed to our intentions is found in the fact that the intonations they produced had no effect on their naturalness ratings. 32 31 Note that our experimental trials sometimes featured positive PQs in contexts like (25) and (39) in which there was contextual evidence for ¬p. A NELS reviewer writes "... that in recent work [Trinh (2014)] has argued that positive polar questions are incongruent with contextual evidence towards ¬p." While Trinh's examples seem to support this claim, we believe that positive PQs are nevertheless acceptable in our experimental contexts. Further work on the relationship between the licensing conditions of positive PQs and contextual evidence is needed. 32 Moreover, we ran a follow up experiment (not reported here) in which participants rated the responses without actually producing them themselves. The results were not affected by this variation, and were qualitatively identical to the results reported below.
The three factors were crossed, 2 × 2 × 2, making eight conditions. Each participant saw each condition for each item in eight randomly ordered blocks of trials with one condition from each item. 33 We ran 30 native speakers of North American English (mostly McGill undergraduates), but had to exclude 7 due to technical issues, making for 1,472 observations total.
To facilitate analysis, we split the results between the levels of the factor question. I.e., we analyzed the results of responses to negative questions and positive questions separately, each as a 2 × 2 design with the two factors particle and answer.
Experiment 2 tested responses to negative rising declaratives, e.g. You're not coming to the presentation?, instead of polar questions as in experiment 1. Moreover, instead of using yes, yeah was used. Otherwise, the design was the same, crossing the two two-level factors, particle and answer, 2 × 2. There were six items and four conditions. 34 There were 22 participants, therefore 528 observations total.
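The observation counts reported for experiments 1 and 2 follow directly from the fully crossed designs, since each retained participant contributes one observation per condition per item. The short Python sketch below reproduces that arithmetic; the helper n_observations and the level labels are our own hypothetical names, and the participant, item and factor numbers are those stated in the text.

```python
from itertools import product


def n_observations(participants: int, items: int, *factors: tuple) -> int:
    """Count observations in a fully crossed within-participant design:
    every participant sees every condition of every item once."""
    conditions = list(product(*factors))  # all combinations of factor levels
    return participants * items * len(conditions)


# Experiment 1: question x particle x answer (2 x 2 x 2), eight items,
# 30 participants tested, 7 excluded.
exp1 = n_observations(
    30 - 7, 8,
    ("positive", "negative"),  # question
    ("yes", "no"),             # particle
    ("am", "am not"),          # answer
)

# Experiment 2: particle x answer (2 x 2), six items, 22 participants.
exp2 = n_observations(22, 6, ("yeah", "no"), ("am", "am not"))

print(exp1, exp2)  # 1472 528
```

Both totals match the figures given above, which confirms that the count for experiment 2 covers only the four test conditions with negative rising declaratives.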
Finally, experiment 3 is a follow up experiment to experiment 2 with the exact same design, except that participants were only asked to produce the particles yeah and no themselves, with stage directions indicating the meaning of the response. E.g. if the question was You're not coming to the presentation?, the response would have been indicated to the participant as follows: Yeah (You want to convey: I'm coming to the presentation). There were 33 participants.

Naturalness results
Figure 2 displays a plot for the naturalness ratings of responses to positive PQs in experiment 1.
As expected, when indicating a positive response to a positive PQ, yes is rated as highly natural and no is rated as unnatural, and vice-versa in negative responses. Yes and no are therefore indeed not interchangeable in response to positive PQs. 35 Figure 3 displays plots for experiments 1, 2 and 3: The experiment 1 plot (left) shows how participants rated responses to negative PQs, while the experiment 2 plot (middle) and the experiment 3 plot (right) show ratings for full sentence responses and bare particle responses respectively to negative rising declaratives.
We observe that yes/yeah and no are overall acceptable in both positive and negative responses. Experiment 3 exhibits certain nuances to be discussed shortly, but nevertheless has a qualitative pattern that is distinct from responses to positive PQs in Figure 2. Taken together, the results indicate a high degree of interchangeability of English polar particles when used in response to negative questions or rising declaratives.
We fitted cumulative link mixed model regressions for each experiment (Christensen 2010), with random intercepts and slopes for participant and item (see Table 1). We also ran a cumulative link mixed model regression with random intercepts and slopes for participant and item on all of the data combined, so that we could check for statistically significant differences between the different experiments. 36
First we consider the models for experiments 1 and 2. The interaction between particle and answer, which is the largest effect, is due to the fact that in positive answers, both yes/yeah and no are equally acceptable, while in negative answers, yes/yeah is rated as significantly less natural than no. Moreover, particle had an overall effect in that no responses are rated more natural than yes/yeah responses, and answer had an overall effect in that positive responses are rated more natural than negative responses. These effects are driven by the fact that no is more acceptable than yes/yeah when the answer polarity is negative. No is not more acceptable than yes/yeah in general. Likewise, positive responses are not more acceptable than negative responses in general.
The pattern of the naturalness ratings for experiment 3 is somewhat different from that for experiments 1 and 2. Yeah-am not responses are still rated as significantly less natural than no-am not responses; however, there is a clear main effect of answer in that positive responses are significantly less natural than negative responses in general.
33 If need be, this enables a latin-square analysis by looking at only the first quarter of trials. The same holds for all other experiments to be described below.
34 In fact, there were six conditions, four test-conditions with negative rising declaratives, and two additional conditions testing responses to positive PQs. In these two latter conditions in experiment 2, we did not fully cross the two factors, particle and answer, because we took it as given that yeah, I'm not and no, I am are infelicitous in response to a positive PQ. We will only report on the subset of the data with negative rising declaratives, since experiment 1 captures responses to positive PQs more fully.
35 However, we note that No, I am, while clearly degraded relative to Yes, I am/No, I'm not, is nevertheless rated as somewhat more natural than Yes, I'm not. We suspect that the difference exists because no is able to pick up on a questioner's negative bias, even when they are asking a positive PQ. Thus if a questioner asks ?p, but in doing so in a certain context clearly implies that they suspect ¬p, the speaker can reply, no, p. Cf.
36 We did not include the intonation that participants produced as a predictor since exploratory data analysis as well as a separate model showed there was no effect. Thus, the intonation participants produced had no effect on how natural they rated a response to be.
Considering now the model for all of the data, we note significant differences between the data from experiments 1 and 2, and between experiments 1 and 3: Overall, the polar particle responses are rated slightly less natural in response to polar questions than in response to rising declaratives. We had no expectations about this, and do not expect there to be a theoretical motive behind it.
As one would expect from Figure 3, there is a large effect of answer polarity on the difference between experiments 1 and 3. Positive (am) responses were significantly less natural in experiment 3 than in experiment 1. There are no other significant differences between experiment 1 and the other two experiments.
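For readers less familiar with cumulative link mixed models, the general shape of such a model for ordinal naturalness ratings can be sketched as follows. This is the standard textbook form with a logit link, given only as an illustration. The choice of link, the symbols, and the grouping of the random effects are assumptions and are not taken from the fitted models reported in Table 1.

```latex
% Y_{pi}: rating given by participant p to item i, on an ordinal scale with J categories
% \theta_j: ordered thresholds between adjacent rating categories
% x_{pi}: fixed-effect predictors (particle, answer, and their interaction)
% u_p, v_i: by-participant and by-item random intercepts and slopes
\operatorname{logit} P(Y_{pi} \le j)
  = \theta_j - \bigl( \mathbf{x}_{pi}^{\top}\boldsymbol{\beta}
      + \mathbf{z}_{pi}^{\top}\mathbf{u}_p
      + \mathbf{z}_{pi}^{\top}\mathbf{v}_i \bigr),
  \qquad j = 1, \dots, J-1
```

Under this parameterization, a positive coefficient in β shifts probability mass toward higher (more natural) ratings.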

Discussion of naturalness ratings
Recall question ii from section 4, "Preference patterns: When responding to a negative sentence, which particles do speakers prefer to use when giving a response with negative polarity? With positive polarity? How are bare polar particle responses to negative sentences interpreted?" Our participants prefer to use no to convey a negative polarity response to a negative PQ or rising declarative. This result confirms earlier findings by Brasoveanu et al. (2013), but also expands on them since their study considered responses to declarative sentences while ours is on polar questions and rising declaratives. Moreover, our participants are equally happy using yes or no to convey a positive polarity response, which had not been tested before. However, participants disprefer using bare particles to convey a positive response. This result confirms Kramer & Rawlins's (2012) felicity judgment findings, but again expands on them by considering responses to rising declaratives instead of PQs, and using yeah instead of yes. Moreover, we found a significant interaction effect, showing that negative no bare particles are more felicitous than negative yeah bare particles. This is a new result since although this trend was seen in Kramer & Rawlins's (2012) data, it was not significant there.
How do the theories of polar particles proposed by Krifka (2013), Roelofsen & Farkas (2015: R&F) and Holmberg (2016) compare in light of these results? The main result of the naturalness ratings is that no is more natural than yes when agreeing with a negative question. The theories of Krifka and R&F capture this result. To see this, we need to describe briefly how preference patterns are accounted for in these theories.
Krifka hypothesizes two pragmatic markedness principles to explain preference patterns.
The relative salience of discourse referents is contextually determined. Krifka argues that in unmarked contexts the discourse referent anchored to the embedded TP is more salient than its negative counterpart because negative sentences are usually uttered in contexts in which the positive sentence is already salient. Krifka uses these principles in the optimality theory (OT) tableau in Table 2 to predict different preferences for yes/no responses to negative sentences. Krifka's account of preference patterns accurately predicts the result from the naturalness ratings that no is more natural than yes in negative agreeing responses. For Krifka, this is because no picks up the most salient discourse referent, the positive one, and negates it, while yes incurs a violation for picking up the less salient negative discourse referent.
The ranking in Table 2 erroneously predicts negative yes responses to be preferred to positive yes responses. However, Krifka notes that *DisAgr is only ranked higher than *NonSal in responses to assertions, which indicate a high degree of commitment to the proposition on the part of the original speaker, making disagreement costly. In response to questions, *NonSal is ranked higher, which predicts negative yes responses to be less natural and is more in line with the results.
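The effect of the two rankings can be illustrated with a minimal sketch of OT-style evaluation. The violation profiles below are our reading of the surrounding prose (a response incurs *NonSal when it picks up the less salient negative discourse referent, and *DisAgr when it disagrees with the negative antecedent). They are assumptions for illustration, not a reproduction of Krifka's tableau, and the function and variable names are hypothetical.

```python
# Assumed violation profiles for the four responses to a negative sentence.
CANDIDATES = {
    "no, negative answer":  {"*DisAgr": 0, "*NonSal": 0},
    "yes, negative answer": {"*DisAgr": 0, "*NonSal": 1},
    "yes, positive answer": {"*DisAgr": 1, "*NonSal": 0},
    "no, positive answer":  {"*DisAgr": 1, "*NonSal": 1},
}


def rank_candidates(ranking):
    """Order candidates from best to worst by comparing violations lexicographically,
    starting with the highest-ranked constraint."""
    return sorted(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))


# Ranking for responses to assertions: *DisAgr outranks *NonSal.
assertion_order = rank_candidates(["*DisAgr", "*NonSal"])
# Ranking for responses to questions: *NonSal outranks *DisAgr.
question_order = rank_candidates(["*NonSal", "*DisAgr"])

print(assertion_order)
print(question_order)
```

Under these assumed profiles, reranking swaps only the two yes responses, while the positive no response comes out last in both orders.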
Under either ranking, however, there is a problem for Krifka's theory: Positive no responses are predicted to be the least natural. This does not match our participants' judgments.
Roelofsen & Farkas (2015) argue that some feature combinations in their account are more marked than others. The more marked a feature is, the greater its need to be realized overtly.
[reverse] is more marked than [agree] since complementation is more marked than identity. [-], which stands for negative polarity, is assumed to be more marked than [+], which means positive polarity, because crosslinguistically negativity always seems to be pragmatically marked (cf. Horn 1989). In [agree, -] responses, only one of the features is marked, [-]. Since only no realizes this feature, no is predicted to be more natural than yes for realizing agreements with preceding negative sentences. This prediction matches the results. Furthermore, R&F predict yes and no to be equally natural when indicating a positive response to a negative PQ ([reverse, +]), which is borne out by the results above.
Holmberg (2016) argues that speakers who disprefer negative yes responses do so because they interpret the preceding negative PQ as having a syntactically high negation between TP and vP. Those who interpret the PQ as having a low negation, adjunct to vP/VP, are predicted to find such negative yes responses to be completely natural. Therefore, this account predicts that such responses should be rated by individual participants either as completely natural or completely unnatural, but should not receive in-between ratings, i.e. a bimodal distribution is predicted. However, this is not what we found. The median rating of 3 in our results was not caused by participants being evenly split between ratings of 5 and ratings of 1. Instead, negative yes responses are consistently rated as somewhat degraded relative to all other response combinations by most of our participants. Thus Holmberg's account does not capture the results from the naturalness ratings.
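The contrast between a bimodal and a graded rating pattern can be made concrete with a small sketch. The ratings below are invented purely for illustration and are not our data. The sketch assumes a rating scale running from 1 to 5, and the helper name is hypothetical. It shows that a median of 3 alone cannot distinguish the two patterns, whereas the share of ratings at the scale endpoints can.

```python
from collections import Counter
from statistics import median


def extreme_share(ratings, low=1, high=5):
    """Fraction of ratings that fall on either endpoint of the scale."""
    counts = Counter(ratings)
    return (counts[low] + counts[high]) / len(ratings)


# Hypothetical samples, invented for illustration only.
split_sample = [1] * 10 + [5] * 10            # raters split between the two extremes
graded_sample = [2] * 5 + [3] * 10 + [4] * 5  # raters consistently give middling ratings

for name, sample in [("split", split_sample), ("graded", graded_sample)]:
    print(name, median(sample), extreme_share(sample))
```

Both hypothetical samples have the same median, but only the split sample has all of its ratings at the endpoints, which is the pattern a bimodal prediction would require.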
We used yeah in experiment 2 because we had the intuition that it would be more acceptable in negative, agreeing responses. Roelofsen & Farkas (2015) report this same intuition in a footnote, and suggest more empirical work is needed. We were therefore surprised that negative agreeing responses were dispreferred with yeah in this experiment, just like yes was in experiment 1. It is an open question whether systematically varying yes and yeah in the same experiment might nevertheless produce preferences for the latter in the negative response level of the answer condition.
As for the decreased naturalness of positive bare particle responses in experiment 3, both Krifka (2013) and Roelofsen & Farkas (2015) predict this result. Both argue that positive sentence polar particle responses to negative utterances require overt following sentences, and that such responses are less natural with bare particles. Holmberg (2016) predicts bare yeah to be less natural when conveying a positive response since it requires an unelided elliptical clause to change the polarity from negative to positive. However, the correspondingly low rating of bare no indicating a positive response is not predicted by Holmberg. Interestingly, the results of the perception experiment in section 5 below will conflict with these naturalness ratings by our speakers. That is, the way hearers interpret bare particles differs from the preferences exhibited by the speakers themselves.
37 Central to R&F's account of the preference patterns is the claim that the [agree, +] and [reverse, -] feature combinations each form what R&F call a "natural class", importing terminology from phonology. However, it is not clear to us why two marked features should be likely to co-occur. In fact, the combination of two marked features is assumed to be the least likely combination in phonology. The "worst-of-the-worst", in the terminology of Smolensky (2006), is least likely to occur crosslinguistically (e.g. voiced aspirates are rare

Intonation results
The recorded responses in experiments 1, 2 and 3 were annotated for intonational contour by an RA and the first author. We annotated contours on polar particles and their following sentences separately. After listening to a subset of the data, we determined that the vast majority of productions fell into one of three intonational categories: Declarative fall, contradiction contour (CC), and rise fall. Intonations were marked "unclear/other" if they did not fit any of these categories, and "problematic" in case of disfluencies, recording errors, etc. Polar particles were marked as "none" when the participant produced a single contour over the whole utterance. The form and meaning of contradiction contours were discussed in section 3. Declarative falls have a high final pitch accent followed by a fall, H* L-L%. Rise fall intonation is an upstepped high pitch accent that can appear on either the particle or the following sentence.
Counts and percentages of intonations used in response to PQs and rising declaratives in experiments 1, 2 and 3 are summarized in Table 3. 38 Considering Table 3, we can see that declarative falling intonation is by far the most frequent contour observed, followed by CC, and then rise fall. There were relatively few contours that fell into the other/unclear category, suggesting that the annotation scheme was well suited to the data. Cooper & Ginzburg (2011) claim that when no is used to convey a positive response to a negative PQ, it will bear a distinct rise fall tune. We think that our rise fall category corresponds to what they have in mind, however it does not have the restricted distribution they have suggested. While it does appear more frequently in positive responses to negative sentences, it is well represented in negative responses as well, as demonstrated in Table 4. Moreover, it was produced relatively infrequently (see percentages). Thus it does not seem likely that the rise fall is the special intonation hypothesized by theorists to be reserved only for positive responses to negative sentences.
Before moving on, we note that we believe that the production of rise fall in our data set was correlated with verum focus prominence shifts, that is prominence shifts to the auxiliary, for example Yes, I AM coming. Roelofsen & Farkas (2015) argue that verum focus is obligatory in positive disagreeing responses, but we note that, like rise fall, it appeared in both positive and negative responses in our data. We believe this makes intuitive sense: Verum focus can emphasize the truth of an assertion regardless of whether it has the same or opposite polarity of the preceding sentence. E.g. when responding to the negative PQ Are you not coming to the presentation?, both verum focus and rise fall intonation can appear on either a positive response (No, I AM coming), or a negative response (No, I'm NOT coming).
38 A category for polar question rises was included in the annotation, but was only used 4 times total. Question rise annotations are included in the "other/unclear" row of Table 3.
The tune with the most interesting distribution in our results was the contradiction contour (CC): It was produced almost exclusively in positive responses to negative sentences. Figure 4 shows the proportion of CC that participants produced in response to various antecedents: Positive polar questions (left), negative polar questions (second from left), negative rising declaratives (second from right), and in bare particle responses to negative rising declaratives (right). The top panels show the proportion of CC produced by participants on the particles themselves, while the bottom panels show the proportion of CC on the following sentences. Positive sentence answers are in blue, negative in purple. For instance, the Exp1-PosPQs column (leftmost) shows that the CC was produced on the sentence in 36% of trials in which positive sentences followed the particle no, but only in 2% of trials in which negative sentences followed the particle no.
Descriptively, the CC appears quite frequently in positive responses (blue), but hardly ever in negative responses (purple), suggesting an effect of answer polarity. In each case, the rate of CC is greater on following sentences than on the particles themselves. Rates of CC are greatest in response to rising declaratives (experiments 2 and 3), followed by responses to negative polar questions (experiment 1-Negative), and with the least amount produced in response to positive polar questions (experiment 1-Positive). Furthermore, in responses to PQs only, the CC appears more frequently on no than yes, suggesting an effect of particle choice.
We fitted a mixed effects logistic regression for the combined data with random intercepts and slopes for participant and item (for discussion on the advantages of mixed effects regressions, see Baayen 2008; Jaeger 2008; Barr et al. 2013). The dependent variable was whether or not the CC was produced. The model tested whether it matters that the response was positive or negative (answer), and that yes or no was used (particle). It also tested whether it matters that the context sentence was positive or negative (ContextPos.vs.Neg, the difference between the positive half of experiment 1 and the rest of the data), whether the context consisted of a polar question or a rising declarative (ContextPQ.vs.RiseDec, the difference between the negative half of experiment 1 and experiments 2 and 3), and whether the particle was bare or not (ContextFull.vs.Bare, the difference between experiment 2 and experiment 3), as well as interactions between particle and these latter predictors (see Table 5).
The effects of answer and particle just mentioned are borne out by the model. We also found several other effects: More CC is produced in response to negative PQs and rising declaratives than in response to positive PQs (ContextPos.vs.Neg). More CC is produced in full sentence responses to rising declaratives than bare particle responses (ContextFull.vs.Bare). Finally, as one might expect from Figure 4, whether yes or no was used had a fairly large effect on the difference in CC production between responses to positive and negative sentences (ParticleY.vs.N.:ContextPos.vs.Neg) and on the difference between PQs and rising declaratives (ParticleY.vs.N.:ContextPQ.vs.RiseDec).
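The three context predictors described above amount to a set of nested comparisons over the four contexts. One way such comparisons could be coded is sketched below. The numeric weights, their signs, and the context labels are assumptions chosen for illustration (centered, mutually orthogonal contrasts) and are not necessarily the coding used in the fitted model.

```python
from fractions import Fraction as F

# Hypothetical labels for the four contexts, in a fixed order.
CONTEXTS = ["Exp1-PosPQ", "Exp1-NegPQ", "Exp2-RiseDec-Full", "Exp3-RiseDec-Bare"]

# Assumed contrast weights: each predictor compares one context with all later ones.
CONTRASTS = {
    "ContextPos.vs.Neg":    [F(-3, 4), F(1, 4), F(1, 4), F(1, 4)],
    "ContextPQ.vs.RiseDec": [F(0), F(-2, 3), F(1, 3), F(1, 3)],
    "ContextFull.vs.Bare":  [F(0), F(0), F(-1, 2), F(1, 2)],
}


def dot(a, b):
    """Dot product of two weight vectors."""
    return sum(x * y for x, y in zip(a, b))


names = list(CONTRASTS)
# Each contrast is centered (weights sum to zero) ...
centered = all(sum(CONTRASTS[n]) == 0 for n in names)
# ... and every pair of contrasts is orthogonal.
orthogonal = all(
    dot(CONTRASTS[a], CONTRASTS[b]) == 0
    for i, a in enumerate(names)
    for b in names[i + 1:]
)
print(centered, orthogonal)
```

With weights of this kind, each coefficient can be read as the difference between the group means on either side of the comparison, independently of the other two comparisons.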

Discussion of intonation results
Consider again question i from section 4, "Intonation: Does a special intonation appear on positive yes/no responses to negative PQs?" The answer is yes, the CC appears on positive yes/no responses to negative PQs. This result confirms the suspicions of researchers that a special intonation appears only in positive responses to negative sentences. It also establishes for the first time that it is the contradiction contour that is used in these circumstances. This is significant since, now that we know that this intonation plays a crucial role in these contexts, we can test experimentally whether it has a significant impact on the interpretation of bare polar particles (discussed in section 5).
Table 5: Logistic mixed effects regression modeling whether the contradiction contour was used on the particle or the sentence (or both). The model includes which answer was intended (I am vs. I am not), which particle was used (yes vs. no), and which context it was used in.
That the CC appears in positive responses to negative PQs could be taken to show one of two things. Either the CC requires a linguistic antecedent with opposite polarity, or it merely requires contextual evidence for that proposition, as we claimed in section 3. We can determine which is the case by looking at responses to positive PQs in contexts that supply evidence for the negative answer before the question is asked (as all of our experimental contexts did). If the former view is correct, then the contextual evidence shouldn't matter and the CC should only appear on negative responses to positive PQs. If the latter is correct, then the polarity of the PQ shouldn't matter and the CC should appear on positive responses. As noted above, we found the latter to be true. The CC is not reserved just for contradicting a linguistic antecedent with opposite polarity; it is sensitive more generally to contextual evidence for a proposition that is opposite from the proposition that the speaker asserts. This fits with our analysis of the contradiction contour in section 3.3, and demonstrates the importance of a notion of contextual evidence like that in (27) to certain linguistic phenomena.
As noted in footnote 26 at the end of section 3, one might expect the CC to appear in 100% of trials in which it is licensed, assuming it is a presuppositional operator and maximize presupposition is in effect. But this is not what we found. Liberman & Sag (1974) were right when they claimed the CC is optional. Interestingly, not only is the CC optional, but its rate of use is sensitive to the kind of antecedent, with negative rising declaratives eliciting it more frequently than negative polar questions. We speculate that there may be a gradient correlation between the likelihood of producing the CC on p and the strength of the contextual evidence for ¬p: The stronger the evidence for ¬p the more likely that an intonation reserved for disagreement, the CC, is produced. If this is right, then we would have to make the intuitively plausible assumption that negative rising declaratives convey stronger evidence for ¬p than negative polar questions. This in turn would require a theory of negative question licensing that predicts that negative rising declaratives require more or stronger evidence for ¬p than negative PQs. We leave these issues to future work.
The second interesting result is that the CC is produced more frequently in no responses than yes responses in experiment 1, but not in experiments 2 and 3. One possible explanation for this effect is that, given the non-interchangeable use of polar particles in response to positive PQs, no is generally more likely to be used in disagreements than yes. Therefore it is more likely to appear with the CC. This correlation affects speakers' choices even in cases where in principle the CC would be licensed on yes. One possible reason that this effect is missing in experiments 2 and 3 is that the latter used yeah whereas yes was used in experiment 1. For a reason that remains unknown, it may be the case that yeah more readily admits CC intonation than yes.

Perception experiment
Given the finding from our production experiments that yes and no themselves can carry the CC, we wondered whether its presence affects a listener's interpretation. This corresponds to question i from section 4: Does the contradiction contour (CC) affect the interpretation of bare particles? Here is a sketch of how the CC might affect the interpretation of a bare particle: In response to "Are you not coming to the presentation?" a bare yes/no can either convey a positive disagreeing response (I am), or a negative agreeing response (I'm not). Therefore, it may be unclear to the hearer which interpretation was intended. Intonation can provide a clue, however: If the particle bears the CC, the speaker conveys that the response disagrees with some contextually salient evidence.
The negative question requires that there is contextually salient evidence in favor of the negative response (Büring & Gunlogson 2000; Trinh 2014; Roelofsen & Farkas 2015; Krifka 2017). If this is the evidence that the CC signals disagreement with, then the particle must indicate the positive response.
On the other hand, the choice not to use the CC in a context in which there is negative evidence might lead a listener to conclude that the speaker does not disagree with the evidence, leading to a negative interpretation. We conducted a perception experiment to test the effects of intonation on the interpretation of bare polar particles.
The perception experiment also provides answers to question ii: How are bare polar particle responses to negative sentences interpreted? While Roelofsen & Farkas (2015) predict bare particles to be ambiguous, they also say that they are more likely to be used to convey a negative response to negative PQs. Krifka (2013) predicts that bare no will unambiguously convey a negative response. Holmberg (2016) predicts that bare yes will unambiguously convey a negative response. The results of our experiment test these predictions.

Methods
Participants were presented with a context story on a computer screen. The experiment's contexts were similar to those in the production experiments, except that now they crucially leave open whether the character will give a positive or negative response:
(42) Context: Taylor and Mark are coworkers. Their boss is giving a presentation at 4 pm that they are both supposed to attend. Mark is running a bit late, and on his way to the presentation at 4:05, he notices Taylor is on the phone and hasn't gone to the presentation yet either.
The question recordings came directly from the stimuli of production experiment 1. The bare particle answers were extracted from recordings of the participants in experiment 1. The perception experiment had three factors:
particle: Whether the word uttered is yes or no.
intonation: Whether the intonation used was the CC or a declarative fall (Dec).
origin: Whether the sound file used was originally followed by a positive or negative sentence in the production experiment.
Thus the response in (42) is particle = yes. Suppose the recording used has the CC, then intonation = CC. Finally, if the recording came from a trial in the production experiment in which yes was originally followed by a positive sentence, e.g. I'm coming to the presentation, then origin = positive.
On each trial, participants first silently read the context and the dialogue, then pressed a key to hear the dialogue. Afterwards, the participants were asked how they interpreted the response:
(43) Question: Based on Taylor's response, which of the following is true:
1. Taylor is coming to the presentation.
2. Taylor is not coming to the presentation.
The participants were 25 North American English speakers, mostly undergraduate students. There were eight different dialogues (items), and the experiment was run so that each participant saw all conditions in all items, therefore 1,536 observations total. 39 The trials were randomized so that participants never saw the same condition twice in a row, and trials from the same item were organized into different blocks to maximize their distance.
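The crossing of the three two-level factors can be enumerated with a short sketch. The factor names and levels are taken from the description of the design; the variable names are hypothetical, and the sketch assumes that the factors are fully crossed within each item.

```python
from itertools import product

# Factors and levels of the perception experiment, as described in the text.
FACTORS = {
    "particle": ("yes", "no"),
    "intonation": ("CC", "Dec"),
    "origin": ("positive", "negative"),
}

# Every combination of factor levels is one condition.
conditions = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]

N_ITEMS = 8  # eight dialogues (items)
trials_per_participant = len(conditions) * N_ITEMS

for condition in conditions:
    print(condition)
print(trials_per_participant)
```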

Results
The results showing the percentage of positive interpretations chosen are visualized in Figure 5. Results for yes are in blue, no in purple. Particles that were originally followed by a positive sentence are in the left panel, those originally followed by a negative sentence are in the right panel. Within each panel, particles bearing declarative falling intonation are on the left, particles bearing the CC are on the right. The CC made participants more likely to interpret the particle as conveying a positive answer. There were also more positive interpretations if the particle yes was used. For the particle yes, it didn't seem to matter whether it originated from a positive or negative response: the rate of interpreting it positively was roughly the same. However, for the particle no, it mattered whether it was originally followed by a positive or a negative sentence, with the latter making negative interpretations more likely. So besides obvious main effects of particle and intonation, it looks like there may be an interaction between particle and origin.
We fitted a logistic mixed effects regression, which included particle, intonation, origin, and all interactions, with random intercepts and slopes for participant and item (see Table 6).
The largest significant effect was for intonation. Particles bearing the CC were significantly more likely to be interpreted as a positive sentence, e.g. I am coming to the presentation (26% positive interpretation with Dec, 65% with CC). There was also an effect of the choice of particle. Yes responses were significantly more likely to be interpreted as conveying a positive sentence response, e.g. I'm coming to the presentation (30% for no, 53% for yes).
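To give a rough sense of the size of these two differences on the odds scale, the reported percentages can be converted into descriptive odds ratios, as in the sketch below. These are computed from the raw proportions only; they are not the coefficients of the mixed model in Table 6, which also accounts for the other predictors and the random effects. The function names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds."""
    return p / (1 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of a positive interpretation under condition a relative to condition b."""
    return odds(p_a) / odds(p_b)


# Proportions of positive interpretations reported in the text.
cc_vs_dec = odds_ratio(0.65, 0.26)  # CC vs. declarative fall
yes_vs_no = odds_ratio(0.53, 0.30)  # yes vs. no

print(round(cc_vs_dec, 2), round(yes_vs_no, 2))
```

On this descriptive measure, the odds of a positive interpretation are roughly five times higher with the CC than with a declarative fall, and between two and three times higher with yes than with no.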
Whether an utterance was originally uttered in a negative agreeing response or a positive disagreeing response (our factor origin) did not significantly affect interpretation (p > 0.32), nor was there an interaction between origin and intonation (p > 0.39). This gives our intonational annotation some validation: there could have been some crucial prosodic cues revealing the intent of the speaker that are not captured by annotating whether they used the CC or not, or what we annotated as CC could have been quite different depending on which type of utterance it occurred in. But there is no evidence for either of these conceivable problems with the way we annotated the data.
Figure 5: Percent positive I am interpretations of bare particle responses by particle (yes vs. no), intonation (CC vs. Dec), and origin (whether the particle was originally part of a positive or negative response in the production study that it was culled from).
On the other hand, participants were more likely to interpret no responses as agreements with the negative question when the no sound file came from a negative agreement response in experiment 1 than when it came from a positive disagreement response (the interaction between origin and particle in Table 6). We make two observations about this effect: First, because there were relatively few CCs produced in confirming responses in the production experiments, the perception experiment could not be completely balanced. That is, there were only three noes followed by negative sentences that were produced with the CC in experiment 1, and only two yeses. To have had a completely balanced design, we would have needed eight of each. Therefore, the data for this experiment includes relatively few observations in which polar particles that were originally followed by a negative sentence bore the CC (the rightmost dots in Figure 5; this is likely why the error bars are larger here than elsewhere in the plot). Since there are fewer negative-origin CCs, we might expect any effect of origin to be attributable to intonation. That is, fewer CCs result in more negative interpretations, given the strong effect that the CC has on interpretation.
The second observation is that despite this lack of balance leading to possible interference by intonation, it must be noted that even the negative-origin no responses bearing Dec intonation were interpreted as a negative response more frequently than the positive-origin no responses with Dec intonation. Thus, it seems likely that this interaction effect between particle and origin is at least in part genuine: Negative-origin noes must sound more negative to our participants than their positive-origin counterparts. This means that there is likely an effect of prosody on these no responses not captured by the intonation annotation. We note that this effect is smaller than the main effects of intonation and particle.

Discussion
The questions we set out to answer were: How are bare polar particle responses to negative sentences interpreted? Does the contradiction contour (CC) affect the interpretation of bare particles? We found that in response to negative PQs, bare yes is interpreted as a positive response at about chance level (53% of trials), while bare no is interpreted as positive in 30% of trials. However, intonation has a big effect on bare particle interpretation, both in the case of yes and no. Particles bearing the CC are interpreted as positive in 65% of trials, while particles bearing a declarative contour are interpreted as positive in only 26% of trials.
None of the theories discussed above explicitly considers predictions for the interpretations of bare polar particles that have different intonations. The theory of Krifka (2013) predicts that in order to convey a positive response to negative PQs, no must be followed by an overt sentence. Holmberg (2016) makes the same prediction for yes. Roelofsen & Farkas (2015) predict negative interpretations of bare particles to be preferred, other things being equal. One might read this as meaning that a systematic manipulation of intonation could alter the expected interpretation. On the other hand, R&F claim that positive interpretations of polar particles will require an overt following clause with verum focus. Thus, our results are not directly anticipated in the previous literature. However, theorists have suggested that intonation could have some effect, and that positive responses to negative questions in particular would bear a special intonation, even if that intonation is predicted to appear on a following clause.
Table 6: Logistic mixed model looking at how the three factors particle, intonation and origin, and all possible interactions affect whether listeners interpret the bare particle to mean I am or I am not. ***p < 0.001, **p < 0.01, *p < 0.05.
Goodhue and Wagner: Intonation, yes and no Art. 5, page 35 of 45
Our results lend empirical support to this intuition, but expand on it by identifying for the first time that this unique intonation is the CC, that it can appear on bare particles themselves, and that it has a large effect on the interpretation of bare particles.
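To relate the percentages reported above to the scale on which a logistic model operates, the raw proportions can be converted to odds and log-odds. The following is a minimal sketch that uses only the descriptive percentages from this section; the function and variable names are assumptions made for illustration, and the resulting numbers are raw descriptive quantities, not the coefficients of the mixed model in Table 6.

```python
import math

def odds(p):
    """Odds corresponding to a proportion p (0 < p < 1)."""
    return p / (1.0 - p)

def log_odds(p):
    """Log-odds (logit) of a proportion p, the scale used by logistic models."""
    return math.log(odds(p))

# Proportions of positive interpretations reported in the text.
reported = {
    "bare yes": 0.53,
    "bare no": 0.30,
    "CC": 0.65,
    "declarative": 0.26,
}

for label, p in reported.items():
    print(f"{label:12s} proportion={p:.2f} log-odds={log_odds(p):+.2f}")

# Raw (unadjusted) odds ratio for a positive interpretation, CC vs. declarative.
raw_odds_ratio = odds(reported["CC"]) / odds(reported["declarative"])
print(f"raw odds ratio, CC vs. declarative: {raw_odds_ratio:.2f}")
```

On these raw figures, the odds of a positive interpretation are roughly five times higher with the CC than with the declarative contour, which is one way to read the size of the intonation effect.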

Perception
Given our proposed meaning for the CC, we could ask why positive interpretations are not at 100% when the CC is present. After all, in order for the CC to be licensed on a bare particle in response to a negative PQ, the intended response must be positive, right? Actually, not quite. Note that using the CC on a negative response to a negative question is possible in principle, if there is also positive contextual evidence (in addition to the negative evidence necessary to license the negative question). The contexts in our perception experiment may have been open enough to leave some room for listeners to posit that there may have been such evidence. Sometimes, even asking a negative PQ can suggest that the speaker considers there to be some evidence for p (in addition to evidence for not p). For example, the following question (adapted from examples by Trinh 2014) both suggests contextual evidence that B is not left-handed (antecedent for CC in (44a)), and, given the lower odds of being left-handed, the formulation of the question suggests a prior belief of A that B is left-handed, which B can take as positive evidence (antecedent for CC in (44b)).
The fact that negative PQs can license CC in both responses might explain why not all CC-bearing particles were interpreted as positive.
The main effect of particle is that yes responses were significantly more likely than no responses to be interpreted as conveying a positive sentence response. Viewed through the lens of Krifka's theory, this is expected since a pragmatic constraint is in effect, *NonSal: Being anaphoric to a less salient discourse referent is dispreferred. Since the negative discourse referent is typically less salient, both yes and no prefer picking up the inner, positive discourse referent, which results in a positive response interpretation for yes and a negative interpretation for no.
R&F also predict no to be more frequently interpreted as a negative response, since it realizes the [-] feature of the [agree, -] response. However, they predict positive responses to be the most marked, thereby requiring an overt following clause, so they predict that bare particles will more likely be interpreted as negative responses in general. This latter prediction does not fit with the effect we found here. Holmberg predicts bare yes should only be interpreted negatively, which is exactly the opposite of the main effect of particle in our experiment.
Suppose Krifka's theory is right, and participants are driven by *NonSal to interpret polar particles as picking up the inner antecedent. But now we have a question: If the positive antecedent is preferred, why is yes with declarative intonation only interpreted as positive in under 50% of observations? It should be interpreted positively much more of the time. This can be explained if there is a preference to use the CC whenever possible. Failing to use the CC in a context where there is negative evidence (such as in response to negative PQs) then licenses the inference that the speaker must agree with the negative evidence. The absence of the CC is a cue in favor of the outer, negative antecedent. 40
There is an aspect of Kramer & Rawlins's (2012) results that diverges from ours. They found that when trying to convey a positive response (I am), bare yes is mostly judged false, while no is judged more ambivalently. In our experiment, yes was more likely than no to be interpreted as conveying a positive response. The discrepancy might be due to the fact that Kramer & Rawlins did not control for intonation. Our participants were more likely to produce the CC on no than on yes (production experiment 1), and bare particles carrying the CC are more likely to be interpreted as positive (perception experiment).
It is plausible then that Kramer & Rawlins's participants imagined more no responses with the CC when silently reading the dialogues, leading to more positive interpretations, while yes was imagined with more Dec intonation. This would lead to more negative interpretations of yes than no, and therefore more judgments of falsity when the particle was meant to convey a positive response in K&R's experiment. We note that the possibility of such issues is a good argument in favor of controlling for intonation when asking participants for judgments about bare particle responses.
Before moving on, we would like to note that the interpretation patterns found here differ in a substantive way from the naturalness ratings found in the production experiments. The latter were best predicted by the theory of Roelofsen & Farkas (2015), especially the fact that yes and no were rated equally natural when conveying a positive response. On the other hand, as just argued above, Krifka's (2013) theory best predicts the preference to interpret yes as a positive response more frequently than no in the perception experiment. This difference in results will be discussed in detail in section 7.

Context Sensitivity experiment
Recall question iii from section 4: "Context Sensitivity: If the negative sentence that the polar particle responds to is itself responding to a negative sentence, are preference patterns affected, and in particular is yes now more acceptably interpreted as a negative response?" Meijer et al. (2015) attempt to answer this question for German, and they note that Krifka (2013) and Roelofsen & Farkas (2015: R&F) make different predictions with respect to these questions. As discussed above in section 4.2.1, Krifka argues that negative sentences are usually uttered in a context in which the positive discourse referent is already salient, thus making the positive TP discourse referent more salient to anaphora. However, the relative salience of discourse referents introduced by negative sentences can be flipped according to Krifka, in a context like (45). According to Krifka, yes is most naturally interpreted as meaning "he didn't climb it," since yes picks up the more salient negative discourse referent. No is also predicted to be most naturally interpreted as meaning "he didn't climb it," since, even though this would require picking up the less salient positive discourse referent, *DisAgr is more highly ranked than *NonSal when responding to an assertion (see the OT tableau in Table 2). Finally, yes is predicted to be preferred over no when indicating a negative response, since *DisAgr plays no role when the response agrees, and yes picks up the more salient discourse referent. This all holds for German ja and nein as well, with the difference that doch is preferred for indicating a positive response to a negative sentence.
On the other hand, neither R&F nor Holmberg predict shifts in context to have any effect on preference patterns in polar particle responses. Meijer et al.'s (2015) results do not favor any theory. They found two groups of speakers when it came to preferences for negative, affirming responses. One group preferred nein for such responses while the other preferred ja. Their preferences held regardless of whether the context WH-question was positive or negative, counter to Krifka's predictions.
We wondered whether these results would be the same in English. The goal of this experiment was to see if putting our original contexts into Meijer et al.'s (2015) experimental design would affect the naturalness ratings of our stimuli, either in the way predicted by Krifka, or to be more like the results of Meijer et al. for German.

Methods
This experiment has three two-level factors (2 × 2 × 2). Two are identical to those in the first set of production experiments: the particle used (yes vs. no) and the polarity of the answer (positive vs. negative). In all conditions, these responses are made to a negative declarative antecedent. The third factor is the polarity of the context sentence preceding the antecedent. Participants were asked to rate the naturalness of the responses, and they were asked a follow-up verification question. Here is an example stimulus. The participants were 12 North American English speakers (mostly McGill undergraduates), making for 768 observations total. The trials were pseudo-randomized so that participants never saw the same condition twice in a row, and trials from the same item were organized into different blocks to maximize their distance.
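The pseudo-randomization described above can be sketched as follows. This is a hypothetical reconstruction, not the script used in the experiment: it assumes 8 items, a figure inferred from 768 observations / 12 participants / 8 conditions under the further assumption that every participant saw every item in every condition. It also assumes a Latin-square assignment of conditions to blocks, and the function name build_trial_list is invented for the example.

```python
import random
from itertools import product

PARTICLES = ("yes", "no")
ANSWERS = ("positive", "negative")
CONTEXTS = ("positive", "negative")
# The 2 x 2 x 2 design yields 8 conditions.
CONDITIONS = list(product(PARTICLES, ANSWERS, CONTEXTS))

def build_trial_list(n_items=8, seed=None):
    """Return a list of (item, condition) trials for one participant.

    Each block contains every item once, each in a different condition
    (Latin square), so repetitions of an item are spread across blocks
    and no condition occurs twice in a row.
    """
    n_cond = len(CONDITIONS)
    if not 1 <= n_items <= n_cond:
        raise ValueError("this sketch needs between 1 and 8 items")
    rng = random.Random(seed)
    trials = []
    for block in range(n_cond):
        block_trials = [
            (item, CONDITIONS[(item + block) % n_cond])
            for item in range(n_items)
        ]
        rng.shuffle(block_trials)
        # Avoid repeating a condition across a block boundary: conditions
        # are distinct within a block, so swapping first and last fixes it.
        if trials and trials[-1][1] == block_trials[0][1]:
            block_trials[0], block_trials[-1] = block_trials[-1], block_trials[0]
        trials.extend(block_trials)
    return trials

trials = build_trial_list(seed=1)
print(len(trials), "trials per participant;", 12 * len(trials), "observations for 12 participants")
```

Under these assumptions each participant contributes 64 trials, which matches the reported total of 768 observations for 12 participants.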

Results and discussion
From Figure 6, it looks like there is an interaction between particle and answer, with yes being rated as less natural than no when giving a negative response, as we saw in the production experiments in section 4. The polarity of the context preceding the antecedent appears unlikely to have any effect, except perhaps on yes, I'm not responses, which appear to be rated slightly more natural in negative contexts (median 5) than in positive contexts (median 4). If this three-way interaction is significant, it would lend support to Krifka's hypothesis that preceding context polarity modulates naturalness ratings. The null hypothesis is that the preceding context has no effect. We fitted a cumulative link mixed model regression with all three factors and all possible interactions, with random intercepts and slopes for participant and item (see Table 7). We found a significant main effect of particle, as well as a significant interaction between particle and answer, as expected and as in the other production experiments. We did not find a significant three-way interaction (p > 0.67), nor were any other factors significant. A likelihood ratio test revealed no difference between this model and one with only particle, answer and their interaction (p > 0.46). Therefore, we fail to reject the null hypothesis. No, I'm not responses are rated more natural than their yes counterparts regardless of context, which is predicted by Roelofsen & Farkas's (2015) and Holmberg's (2016) accounts. However, we note that, according to Meijer et al.'s (2015) results, context polarity had no effect on naturalness in German either, yet the result did not clearly fit with R&F's predictions. Therefore, when taking our results together with Meijer et al.'s, the results of both experiments suggest that manipulating the polarity of a preceding context sentence has no detectable effect on preference patterns in either English or German, at least not in this experimental paradigm.
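The logic of the likelihood ratio test mentioned above can be illustrated with a short sketch. It assumes the two models differ by four fixed-effect parameters (context and its three interactions with particle and answer) and share the same random-effects structure; that assumption, the function names, and the log-likelihood values are all placeholders for illustration and are not taken from the fitted models. The closed-form tail probability used here is valid only for an even number of degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only).

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = max(x, 0.0) / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )

def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return the test statistic 2 * (llf_full - llf_reduced) and its p-value."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf_even_df(statistic, df)

# Placeholder log-likelihoods, NOT values from the paper.
statistic, p_value = likelihood_ratio_test(-1000.0, -1001.0, df=4)
print(f"LR statistic = {statistic:.2f}, p = {p_value:.3f}")
```

A large p-value in such a test means the extra context terms do not improve the fit detectably, which is the sense in which the simpler model is retained.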
Thus, it seems that the answer to question iii, "If the negative sentence that the polar particle responds to is itself responding to a negative sentence, are preference patterns affected, and in particular is yes now more acceptably interpreted as a negative response?" is no. Naturalness ratings of particle responses appear to be insensitive to systematic manipulation of the polarity of context sentences.
Nevertheless, we think there is some merit to the intuition that in a context like (45), yes more naturally indicates a negative, agreeing response, whether as a bare particle or with a following overt sentence. We point out that we only fail to reject the null hypothesis here, and that it is possible that the experiment is underpowered. In future work, we would like to see other experimental designs used to test whether context can affect preference patterns. If, after further testing, yes-negative responses remain dispreferred regardless of context, then it seems likely that a context-insensitive theory is on the right track.
Figure 6 (caption fragment): ... and context (whether the sentence preceding the sentence that yes and no respond to was positive or negative).

Concluding discussion and future directions
In section 4, we posed three groups of research questions, reprinted below.
i. Intonation: Does a special intonation appear on positive yes/no responses to negative PQs? If so, can it affect the interpretation of bare particles?
ii. Preference patterns: When responding to a negative sentence, which particles do speakers prefer to use when giving a response with negative polarity? With positive polarity? How are bare polar particle responses to negative sentences interpreted?
iii. Context sensitivity: If the negative sentence that the polar particle responds to is itself responding to a negative sentence, are preference patterns affected, and in particular is yes now more acceptably interpreted as a negative response?
As discussed in the introduction, various researchers working on polar particle responses have shared the intuition that a special intonation may appear in positive responses to negative utterances, and affect the interpretation of polar particles. However, there has been disagreement over the form that the intonation takes, and no quantitative data has been presented in support of any particular claim. The main finding of our work is that there is indeed a particular intonation that is produced in positive responses to negative utterances: A fall rise that we identify as the contradiction contour (Liberman & Sag 1974). The CC is the only intonation we found that is systematically produced in positive responses that disagree with the negative bias of negative PQs, while not appearing in negative, agreeing responses. The results of our perception experiment demonstrate that this intonation has a strong effect on how hearers interpret bare polar particle responses to negative utterances: Participants are much more likely to interpret a bare yes or no as indicating a positive response when it bears the CC than when it bears falling intonation. Building off of Liberman & Sag (1974), we have argued that the CC, when used on an assertion of p, conveys that there is contextual evidence for ¬p. This hypothesis accounts for why it is used almost exclusively on positive responses, given that our contexts made evidence in favor of the negated proposition salient. Prior to this experiment, it would have been hard to guess how sensitive to intonation naïve participants are. We believe the results suggest that intonation plays a prominent role in interpretation. Thus, when researchers probe intuitions about the interpretation of polar particles (whether from the armchair or the lab), the effects of intonation might need to be kept in mind.
Table 7: Cumulative link mixed effects model looking at three factors affecting whether readers found the responses natural given the context: Which particle was used, what the intended reading was, and whether the context preceding the antecedent was positive or negative.
Regarding question ii, one clear finding from the naturalness ratings, replicating a finding from Brasoveanu et al. (2013), is that no is more acceptable than yes when giving a negative, agreeing response to a negative sentence. Both Roelofsen & Farkas's (2015) and Krifka's (2013) theories account for this preference, as discussed above, while Holmberg's (2016) does not.
Besides this result, the production naturalness ratings and the perception results clearly bear on question ii in other ways. However, determining exactly how the results bear on previous theoretical claims about preference patterns is complex. The reason for this is that the literature on preference patterns does not distinguish between preferences regarding the use of polar particles and preferences regarding the interpretation of bare particle responses, but our results do. Our results provide data in the form of (i) naturalness ratings by the producers of polar particles with complete, overt following sentences, 41 and (ii) forced choice interpretations of bare particle responses in a perception experiment. The two kinds of results do not perfectly align with one another, and could be argued to support the predictions of competing theories as follows.
First, the naturalness ratings from the production experiments reveal that participants find no followed by a positive sentence and yes/yeah followed by a positive sentence to be equally acceptable. This holds true both when the particles are followed by overt sentences in experiments 1 and 2, and when they were bare in experiment 3 (section 4.2). This result runs counter to Krifka's (2013) theory, which clearly predicts that positive no responses should be less natural than positive yes responses, regardless of previous context (see the OT tableau in Table 2). Roelofsen & Farkas (2015), on the other hand, predict this result.
On the other hand, the interpretation results from the perception experiment (section 5.2) reveal that participants are more likely to interpret bare no as conveying a negative response than bare yes, regardless of the intonation used on the bare particle. While the naturalness ratings contradicted the predictions of Krifka (2013), those predictions are confirmed by this result. Since positive no responses are predicted to be less natural, it makes sense that we would find that bare no responses are more likely to be interpreted as negative than bare yes responses (Table 2). Roelofsen & Farkas's (2015) theory predicts both bare yes and bare no to be more likely interpreted as negative. Thus it does not predict our finding. Finally, Holmberg's (2016) theory also has trouble explaining this result, since it makes the strong prediction that bare yes in English cannot be interpreted as a positive response to a negative utterance, but instead must be followed by an overt positive clause.
Therefore, the results for question ii and how they bear on existing theories of polar particles are somewhat mixed. More testing using various experimental paradigms may be needed. For example, if we had allowed participants the freedom to give responses in their own words, along the lines of the completion task described in González-Fuente et al. (2015), then perhaps the production and perception results would have been more similar. However, a drawback to this approach is that the experimenter cannot control whether the participant will even use a polar particle at all.
Regarding question iii, the experimental results from section 6 reveal that, so far, no experimental paradigm has successfully demonstrated that manipulating the polarity of preceding context sentences affects the result that negative yes responses are dispreferred. This result contradicts the predictions of Krifka (2013), and supports those of any theory that predicts preference patterns to be context insensitive, such as Roelofsen & Farkas (2015) and Holmberg (2016).
Our experiments have several limitations. For one, we only used contexts which convey negative contextual evidence. We did not vary evidence in order to keep the complexity of the experiments under control. As noted in footnote 31, there are theoretical questions about whether it is even felicitous to use a positive PQ in the presence of negative contextual evidence. We believe our experimental contexts do license the use of both positive and negative PQs, but this raises the question of why. Is it because theories that predict this to be impossible are wrong, or is it because our contexts were complex enough that they could be taken to have both positive and negative evidence? If it is the latter, then why was the CC only produced on positive responses, regardless of whether the PQ was positive or negative? If evidence can be manipulated between purely negative and purely positive in a production experiment, we predict it to have an effect on the distribution of the CC, with CC appearing on negative responses in the presence of positive evidence and on positive responses in the presence of negative evidence.
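The prediction at the end of the preceding paragraph can be made concrete as a small predicate. This is only an illustrative sketch of the hypothesis as stated in this paper (the CC on an assertion of p conveys that there is contextual evidence for ¬p); the function name cc_licensed and the encoding of contextual evidence as a set of polarities are assumptions made for the example, not part of the proposal itself.

```python
def cc_licensed(response_polarity, contextual_evidence):
    """Return True if the CC is predicted to be licensed on the response.

    response_polarity: "positive" or "negative" (polarity of the asserted p).
    contextual_evidence: set of polarities for which the context supplies
        evidence, e.g. {"negative"} or {"positive", "negative"}.
    The CC is licensed when the context contains evidence against p,
    i.e. evidence of the opposite polarity.
    """
    if response_polarity not in ("positive", "negative"):
        raise ValueError("polarity must be 'positive' or 'negative'")
    opposite = "negative" if response_polarity == "positive" else "positive"
    return opposite in contextual_evidence

# Purely negative evidence: CC expected on positive responses only.
print(cc_licensed("positive", {"negative"}), cc_licensed("negative", {"negative"}))
# Purely positive evidence: the pattern is predicted to flip.
print(cc_licensed("positive", {"positive"}), cc_licensed("negative", {"positive"}))
# Mixed evidence: the CC is predicted to be possible on either response.
print(cc_licensed("positive", {"positive", "negative"}), cc_licensed("negative", {"positive", "negative"}))
```

The mixed-evidence case corresponds to the situation discussed in section 5, where a negative PQ can itself suggest some evidence for p and thereby license the CC on both responses.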
Another limitation is that the perception experiment only tested CC against falling intonation. We noted in section 4.3 that rise fall intonation was distributed relatively evenly across conditions in the production experiment. However, it may nevertheless be interesting to run a perception experiment with rise fall intonation on the polar particles to see how it affects interpretation.
Moreover, there are clear next steps to be taken. In particular, which intonations are produced in response to other kinds of sentences such as falling declaratives, positive rising declaratives, high negation questions, and tag questions? In the introduction, we said we focussed only on responses to PQs and rising declaratives to simplify the number of experimental conditions and discussion, as well as to expand on the work of Brasoveanu et al. (2013), which focussed on responses to falling declaratives. Therefore, it remains to be seen how using falling declaratives might affect intonation and naturalness ratings in our experimental paradigms. Likewise, positive rising declaratives would be interesting to test, as they are usually thought to require evidence for the positive response, as already mentioned above in (34). Thus we might expect the CC to appear in negative responses to these questions. However, given that rising declaratives can be used to express incredulity (Goodhue et al. 2016), it is possible that the very question itself, even if phrased as a positive rising declarative, suggests some doubt on the part of the speaker, i.e. an expectation that ¬p. Thus it may be possible that, depending on context manipulations, the CC could appear in positive responses as well as negative responses to such utterances.
To conclude, we have provided experimental evidence demonstrating that a particular intonation, the contradiction contour, is used primarily in positive, disagreeing polar particle responses to negative polar questions and rising declaratives, and that this tune has a large impact on the interpretation of bare particle responses to negative polar questions. This suggests that when formulating theories of the preferred interpretation of bare particles, the way that intonation impacts empirical data should be taken into account.

Additional files
Sound files: This zip folder contains audio recordings of numbered examples in which intonation is crucial. Each wav file in the folder is named after the example that it comes from. For example, the first file "14a.wav" is a recording of example (14a). "14bi.wav", "14bii.wav", and "14biii.wav" are three different recordings of (14b), as described in the text surrounding that example. The rest of the recordings in the folder relate in this self-explanatory way to the examples in the text, with two exceptions: Recordings corresponding to examples in footnotes are labeled after the footnote number and example number. For example, "fn14_ia.wav" is a recording of example (ia) in footnote 14. Finally, the two final examples in the folder, "z_declarative_intonation.wav" and "z_rise fall on no_CC on following sentence.wav" do not correspond to any numbered examples in the main text, but are examples of the declarative intonation and the rise fall intonation discussed in section 4.3.
If it would be helpful to visualize the pitch tracks, we recommend that readers examine the sound files with free phonetic analysis software such as Praat (Boersma & Weenink 2017).