Measuring the cognitive cost of downward monotonicity by controlling for negative polarity

Our goal in this study was to behaviorally characterize the property (or properties) that render negative quantifiers more complex in processing compared to their positive counterparts (e.g. the pair few/many). We examined two sources: (i) negative polarity; (ii) entailment reversal (aka downward monotonicity). While negative polarity can be found in other pairs in language such as dimensional adjectives (e.g. the pair small/large), only in quantifiers does negative polarity also reverse the entailment pattern of the sentence. By comparing the processing traits of negative quantifiers with those of non-monotone expressions that contain negative adjectives, using a verification task and measuring reaction times, we found that negative polarity is cognitively costly, but in downward monotone quantifiers it is even more so. We therefore conclude that both negative polarity and downward monotonicity contribute to the processing complexity of negative quantifiers.


Introduction
Negative quantifiers like few or less take longer to process and cause more comprehension errors than their respective positive counterparts many and more (Just & Carpenter 1971; Deschamps et al. 2015). We call this additional complexity of the negative antonym the polarity effect. How general is this effect? Does it carry over to adjectival antonyms? Identifying the correct generalization will tell us which property exactly evokes this effect, and therefore which property is cognitively relevant for linguistic processing. In this study, we compare the polarity effect of monotone quantifiers with that of expressions that contain antonymous adjectives (examples of such polar pairs are a small number and a large number). Using these adjectival antonymous pairs (henceforth: adjectives) as a control allows us to test two possible sources for the polarity effect in quantifiers and to quantify their unique contributions: (i) negative polarity; (ii) downward monotonicity. Note that these are not mutually exclusive hypotheses. Theoretically, it could be the case that only negative polarity contributes to processing and not downward monotonicity. It could also be the opposite case, that only downward monotonicity contributes to processing and not negative polarity. Finally, both could play a role and make their own independent contributions to processing. In this study, we assess the size of the contribution of each source. The structure of the paper is as follows: First, we present linguistic data to show that the comparison with adjectives is justified, as adjectives are similar to quantifiers in many respects except downward monotonicity. Then, we present a reaction time (RT) experiment to assess the contribution of these two sources. We finish with a discussion of each source.

The similar: Negative polarity
Quantifiers denote a property of a set, which makes them second-order predicates (Barwise & Cooper 1981). Usually, as implied by their name, they convey information about the quantity of the set. For the sake of comparison with gradable adjectives, we will focus on quantifiers that make reference to a scale by having a degree of numerosity in their denotation. The degree can be overt as in (1) or contextually dependent as in (2):
(1) More than three students came: |{x: x is a student} ∩ {x: x came}| > d & d = 3
(2) Many students came:
The numerosity argument d, whether explicit or implicit, is a degree on a linearly ordered scale. This scale is lower bounded by 0 (as the minimal intersection of two sets is the empty set, whose size is zero). Gradable adjectives, by definition, also make reference to a scale with linearly ordered degrees. The gradable adjectives which will be discussed here are dimensional adjectives (cf. Bierwisch 1989), i.e. adjectives that measure a natural dimension (e.g. 2 meters tall, 3 inches wide). As with quantifiers, the degree can be explicit (3) or equal to a contextual standard (4).¹
(3) John is 2 meters tall:
(4) John is tall:
Dimensional adjectives are similar to quantifiers not only in their reference to a linear scale, but also in the lower bound of the scale. Both are lower bounded by zero: natural dimensions, like quantities, cannot take a negative value. A second point of similarity is that of polarity. Both polar quantifiers and antonymous adjectives come in pairs consisting of a positive and a negative member. The negative counterpart reverses the ordering of the scale relative to the positive (i.e. if P is more than Q then Q is less than P; if x is taller than y then y is shorter than x). The positive quantifiers/adjectives denote that their degree is above some standard, and their negative counterparts denote that this degree is below some standard.
For instance, in the following example where the standard is contextual, the positive counterparts (5a) and (6a) have a more-than sign, while the negative counterparts (5b) and (6b) have a less-than sign.
If the polarity effect in quantifiers stems from a "less than" computation (<), as given in the denotations in (5b) and (6b), a similar polarity effect in adjectives is also expected. Indeed, there are some hints that negative adjectives are more cognitively costly than positive adjectives (Clark & Card 1969; Just & Carpenter 1971; Sherman 1976; Tucker, Tomaszewicz & Wellwood 2018), but in these experiments a polarity effect in dimensional adjectives was not the main interest and therefore the contrast was not a clean one. Importantly, it was not directly compared with that of quantifiers.
A different perspective on the nature of negative polarity is through a morpho-syntactic analysis. Negative quantifiers are often analyzed as containing a hidden negation in their underlying structure (Hackl 2000: 126; Heim 2006; Penka 2011: Chapter 4; Xiang, Grove & Giannakidou 2016). A similar proposal was made for negative adjectives (Rullmann 1995; Kennedy 2001; Büring 2007). In quantifiers, this hidden negation evokes an interaction of the polarity effect with the truth value of the sentence (Just & Carpenter 1971; Grodzinsky et al. 2018), similarly to how overt sentential negation interacts with truth value (Krueger 1972; Mayo, Schul & Burnstein 2004; Kaup, Lüdtke & Zwaan 2005; 2006; Kaup, Zwaan & Lüdtke 2007): the polarity effect is smaller for false sentences; false negatives have similar or shorter RTs than true negatives. If the polarity effect in adjectives stems from the same source as the polarity effect in quantifiers, not only do we expect a polarity effect in adjectives, but we also expect it to be sensitive to the truth value of the sentence in the same manner as in quantifiers.
To summarize, negative adjectives are similar to negative quantifiers. Both types of negation involve a "less than" computation on a linear scale, and possibly also a hidden negation. Therefore, it is natural to compare the polarity effects between these two types. If the polarity effect in quantifiers stems from the morpho-syntactic representation or from cognitive computations over degrees on scales, then we expect both quantifiers and adjectives to manifest a polarity effect, and this polarity effect should interact with truth value in the same way.

² To formalize, we can abstract over the degree element in a proposition. The relationship between a proposition with a negative adjective/quantifier (neg) and a proposition with a positive adjective/quantifier (pos) has these two properties: (a) scale reversal; (b) proximity to zero. Scale reversal means that if degree d₁ makes pos stronger (relative to degree d₂), then it makes neg weaker, and vice versa. For example, the proposition with the degree four is stronger than the proposition with the degree three in pos (e.g. more than four students came ⇒ more than three students came), and this reverses for neg (fewer than four students came ⇐ fewer than three students came). Proximity to zero means that stronger neg propositions have smaller degrees, and vice versa (e.g. fewer than three students came ⇒ fewer than four students came, therefore 3 < 4 and vice versa). See Sassoon (2010) for a semantic analysis of negative adjectives somewhat similar to the one suggested here.

The different: Downward monotonicity
Negative quantifiers (few or less than x) are downward monotone, which means that they reverse the entailment pattern of the sentence relative to the pattern observed with their positive counterparts (e.g. more than half of the students came early ⇒ more than half of the students came; less than half of the students came ⇒ less than half of the students came early). The formal definition is provided in (7):
(7) a. If Q is a positive quantifier:
Downward monotonicity is a property of linguistic importance (Ladusaw 1980); it is therefore reasonable to assume that its reflections can be found with cognitive measures (Geurts & Van Der Slik 2005; Chemla, Homer & Rothschild 2011). However, it is difficult to test whether downward monotonicity contributes to the processing complexity of negative quantifiers, since it is usually confounded with negative polarity. We will use dimensional adjectives in constructions like a small number or a low proportion as a control for negative polarity.
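For reference, monotonicity in the scope argument of a quantifier is standardly stated as below. This is the textbook formulation in the style of Barwise & Cooper (1981), offered as an assumption about notation rather than as the exact wording of (7); Q is a quantifier, A its restrictor, and B, B′ are scope sets.

```latex
\begin{align*}
\text{upward monotone:}\quad
  & \bigl(Q(A)(B) \wedge B \subseteq B'\bigr) \;\Rightarrow\; Q(A)(B') \\
\text{downward monotone:}\quad
  & \bigl(Q(A)(B') \wedge B \subseteq B'\bigr) \;\Rightarrow\; Q(A)(B)
\end{align*}
```

With B = came early and B′ = came, the first line yields the more than half inference and the second line yields the less than half inference cited above.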
A small number is similar in meaning to few: it is negative in polarity, as it contains a hidden negation and triggers a "less than" computation. Specifically, a small number refers to small amounts and therefore seems synonymous with few. However, as we will now show, a small number is not downward monotone. Thus, we can isolate downward monotonicity by comparing the polarity effects in quantifiers and in adjectives. For the sake of simplicity, we will refer to expressions like a small number as "adjectives" to distinguish them from monotone quantifiers such as few, although the expression a small number as a whole is not an adjective and could be analyzed as a generalized quantifier. The exact status of the whole expression need not concern us, for we focus on the contrast between the positive and negative members of the antonymous pairs a small/large number of NP_PL vs. few/many of NP_PL.
First, let us establish that a small number is not downward monotone.³ Prima facie, both few and a small number convey that a numerosity (of e.g. students who read the book) is below some contextual standard on a scale:
(8) a. Few of the students read the book.
b. A small number of the students read the book.
Although (8a) and (8b) seem synonymous, only (8a) can license negative polarity items (NPIs) like ever, a hallmark of downward monotonicity (Ladusaw 1980):
(9) a. Few of the students have ever read the book.
b. *A small number of the students have ever read the book.
A small number does not license NPIs and therefore it cannot be downward monotone. This might seem surprising, because given the similarities between quantifiers and dimensional adjectives, one might expect a small number to have the same denotation as few. The difference in monotonicity lies in the inclusion of the zero point in the scale.
While few is consistent with none, a small number entails existence and is therefore inconsistent with none; hence it is not downward monotone.⁴ The exclusion of a zero point is evident when embedding under a universal modal. For example, (10b) is odd relative to (10a) because it is intuitively understood as requiring that some mistakes be made in order to pass the exam (an odd requirement for passing exams in the world as we know it):
(10) a. To pass the exam, you should make few mistakes.
b. #To pass the exam, you should make a small number of mistakes.
The contrast in (10) suggests that a small number excludes zero from the scale, and therefore:
(11) a. Few students came:
Prima facie, a small number entails the existence of students due to the existential force introduced by the article a. One problem with this explanation is that a quantifies over the NP small number and not over students. There is no obvious reason why zero should be excluded from the extension of small number, for after all, isn't zero also a small number? The precise explanation for the existential entailment is beyond the scope of this work, i.e. we will not give a semantic analysis of the internal structure of a small number.⁵ For our purpose, it is sufficient to have shown that a small number is not consistent with zero, and therefore not downward monotone. This property makes expressions like a small number a proper control to test the unique contribution of downward monotonicity to the polarity effect in quantifiers. Algorithmically, it has already been suggested that consistency with the empty set adds a step to the verification process, thus explaining the processing difficulty of downward monotone quantifiers (Bott, Klein & Schlotterbeck 2013; Bott, Schlotterbeck & Klein 2018). However, it is only in the present study that negative polarity is experimentally distinguished from downward monotonicity.⁶
To summarize, if the source of complexity in negative quantifiers is negative polarity, we expect to find a similar effect in adjectival expressions like a small number (a main polarity effect and an interaction with truth value). If the source of complexity in negative quantifiers is also downward monotonicity, then we expect the polarity effect in quantifiers to be larger than the polarity effect of adjectival expressions that are negative but not downward monotone. It might also be that both sources play a role and all of the above effects are found.
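The difference between few and a small number can be checked mechanically on a toy model. The sketch below is illustrative only: it assumes simple cardinality-based meanings (few as "fewer than a standard", a small number as "more than zero and fewer than the standard"), a hypothetical standard of 3, and a six-element restrictor set; none of these values come from the study.

```python
from itertools import combinations

STANDARD = 3  # hypothetical contextual standard, chosen only for illustration


def few(n):
    """'few': numerosity below the standard; zero is allowed."""
    return n < STANDARD


def many(n):
    """'many': numerosity above the standard."""
    return n > STANDARD


def a_small_number(n):
    """'a small number': below the standard, but zero is excluded."""
    return 0 < n < STANDARD


def subsets(domain):
    """Yield every subset of the domain as a frozenset."""
    items = list(domain)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_upward_monotone(q, domain):
    """q holds of every superset of any set it holds of."""
    sets = list(subsets(domain))
    return all(q(len(big)) for small in sets if q(len(small))
               for big in sets if small <= big)


def is_downward_monotone(q, domain):
    """q holds of every subset of any set it holds of."""
    sets = list(subsets(domain))
    return all(q(len(small)) for big in sets if q(len(big))
               for small in sets if small <= big)


students = range(6)  # toy restrictor set
report = {
    q.__name__: (is_upward_monotone(q, students), is_downward_monotone(q, students))
    for q in (many, few, a_small_number)
}
```

Under these assumptions, `report` classifies many as upward monotone only, few as downward monotone only, and a small number as neither, because the empty set satisfies few but not a small number.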

Experiment
The goal of the experiment is to test whether the polarity effect in quantifiers stems from the entailment pattern, from negative polarity, or from both. We do so by measuring reaction times (RT) in a verification task, and comparing the polarity effect in quantifiers to the polarity effect in adjectives. If our cognitive system is indeed sensitive to downward monotonicity, we should expect a larger RT difference in few-many compared to small-large. This would result in a significant interaction between the Type of the degree operator (quantifier/adjective) and its Polarity (positive/negative). If negative polarity is a component in processing (perhaps in addition to downward monotonicity), we expect to find a polarity effect and a Polarity × Truth-value interaction in both quantifiers and adjectives.
We introduce a verification paradigm with the factors Type (adjective/quantifier), Polarity (positive/negative) and Truth-value (true/false). To increase the number of items and add variability, we added another factor, Standard, which concerned the kind of comparison required from the participants: whether the degree is a proportion or not (cardinal/proportional). Having more than one downward entailing quantifier allows us to correlate performance: if downward monotonicity is a cognitively relevant property, then with enough between-subject variability in the size of the polarity effect, we should find a high correlation between the polarity effects of the quantifiers.

Materials
For a verification task, we used English sentences that described some ratio between blue and yellow circles using one of the degree operators presented in Table 1 and Table 2 in the structure of X of the circles are blue/yellow (e.g. many of the circles are blue).
Each trial could be true or false. For the sake of counterbalancing, sentences also varied in the referred Color (blue/yellow).
All sentences were recorded in English (female voice, native English speaker, Canadian accent), and later processed in Audacity to equalize them in terms of their average pitch, duration, average amplitude and speed.
Images depicted either more blue circles than yellow circles or vice versa. The number of blue circles was fixed at 16 and the number of yellow circles varied. The comparison task was kept easy by clustering circles of the same color close to each other and using ratios such that it is clear which color is the majority (4:16, 8:16, 32:16 and 64:16). Hence, for each truth value (yellow < blue and yellow > blue) there were 2 levels of Ratio, small (8:16, 32:16) and large (4:16, 64:16). Pictures were created in Mathematica.
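The picture design can be enumerated in a few lines. This is a minimal sketch of the yellow:blue ratios listed above; the function and field names are hypothetical, and the rule that a 4:1 imbalance counts as "large" and a 2:1 imbalance as "small" is inferred from the grouping of the four ratios.

```python
BLUE = 16                      # number of blue circles, fixed in every picture
YELLOW_COUNTS = (4, 8, 32, 64)  # number of yellow circles, varied


def describe(yellow, blue=BLUE):
    """Classify one picture by its majority color and Ratio level."""
    larger, smaller = max(yellow, blue), min(yellow, blue)
    return {
        "yellow": yellow,
        "blue": blue,
        "majority": "blue" if blue > yellow else "yellow",
        # 4:16 and 64:16 are 4-to-1 imbalances; 8:16 and 32:16 are 2-to-1
        "ratio_level": "large" if larger // smaller == 4 else "small",
    }


design = [describe(y) for y in YELLOW_COUNTS]
```

Each majority color appears with one small and one large Ratio level, which is what makes the design balanced across the two truth values.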

Procedure
The experiment was run using Presentation (version 17.0). In each trial, subjects had to decide whether a sentence they heard through earphones correctly described a picture that later appeared on the screen. The procedure and timing of a single trial are summarized in Figure 1.
Each trial started with a fixation cross on the screen (see Figure 1). After 400 ms, the subject heard a sentence, while the fixation cross was still on the screen. The duration of each sentence was 1600 ms (cardinal quantifiers) or 2000 ms (all the rest). At the offset of the sentence, the fixation cross disappeared and a picture appeared after an ISI of 200 ms, and disappeared once the subject decided whether the sentence correctly described the picture or not (true/false) by pressing one of two possible buttons on the keyboard (the "←" key or the "→" key, counterbalanced for coding between subjects). To avoid ceiling effects and bring out subtle differences, subjects were encouraged to answer as fast as they could. If a subject did not respond within 1100 ms, the trial was counted as a "miss" and the next trial began. Reaction times were measured from the onset of the picture until the press of the button.⁷

Participants
Participants were 35 students, aged 22 ± 3 (average ± standard deviation), all native English speakers who were taking a summer course in Hebrew at the Hebrew University International School. They participated in this experiment for payment, after signing informed consent approved by the Hebrew University Research Ethics Committee. Twenty-nine were right-handed; 15 were male and 20 female. One subject quit the experiment in the middle, so his data were not included in the analysis. Two other subjects were excluded from the analysis due to low accuracy rates (<70% correct responses).

⁷ The sentences in the adjective condition were with the copula is (as in a large number of circles is blue), which five participants reported as ungrammatical for them. The exclusion of these subjects did not affect the results in any way. In addition, we ran a similar experiment in Hebrew where no such issue of copula arises, and replicated the results reported here in English.

Analysis and results
Based on our theoretical considerations regarding the similarities between quantifiers and adjectives, these are our predictions for the RTs: (i) a main effect of Polarity; (ii) a Polarity × Truth-value interaction that is not significantly different between Types. If downward monotonicity also plays a role in processing, we also expect: (iii) a Type × Polarity interaction that stems from a larger effect for quantifiers than for adjectives; (iv) a relatively high correlation between subjects' Polarity effects of quantifiers (i.e. between proportional quantifiers and cardinal quantifiers). Analysis was carried out on 32 subjects, whose accuracy was relatively high (>70%, average = 83%). Misses and incorrect responses were removed (1.4% and 15.6% of data respectively). RTs were log-transformed.
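The exclusion and transformation steps can be sketched as follows. The trial records and field names are hypothetical, and the natural logarithm is an assumption, since the base of the transformation is not stated.

```python
import math

# Hypothetical trial records; the field names are illustrative only.
trials = [
    {"subject": 1, "rt_ms": 640.0, "correct": True},
    {"subject": 1, "rt_ms": None, "correct": False},   # miss: no response before the deadline
    {"subject": 1, "rt_ms": 910.0, "correct": False},  # incorrect response
    {"subject": 2, "rt_ms": 720.0, "correct": True},
]


def preprocess(records):
    """Drop misses and incorrect responses, then add a log-transformed RT."""
    kept = []
    for trial in records:
        if trial["rt_ms"] is None:   # miss
            continue
        if not trial["correct"]:     # incorrect response
            continue
        # Natural log is assumed here; any fixed base only rescales the values.
        kept.append({**trial, "log_rt": math.log(trial["rt_ms"])})
    return kept


clean = preprocess(trials)
```

Only the two correct, answered trials survive in `clean`, each with an added `log_rt` field.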
As shown in Figure 2, negative polarity is generally costly, but in quantifiers it is more costly than in adjectives by 36 ± 8 ms on average (see Table 3). This accounts for about 30% of the total polarity effect in quantifiers. To test the significance of this result, we fitted a linear mixed effects model using R's lmer function (Bates et al. 2015), with the logarithmic transformation of RT as the dependent variable. Polarity, Type, Standard and Truth-value were used as fixed effects, as well as all interactions. Random intercepts and slopes of Polarity, Type, Truth-value, Polarity × Type and Polarity × Truth-value were included, adjusted by subjects. P-values were obtained using R's lmerTest package (Kuznetsova, Brockhoff & Christensen 2017). A significant main effect of Polarity was found (t = 16.2, p < 0.0001) as well as a significant Polarity × Type interaction (t = 4.5, p < 0.0001). This interaction stems from a stronger Polarity effect for quantifiers, in each of the Standard levels, as can be visualized in Figure 2.
In addition, a significant Polarity × Truth-value interaction was found (t = -8, p < 0.0001). This interaction was not found to be different between adjectives and quantifiers (non-significant Polarity × Truth-value × Type interaction: t = -0.6, p = 0.56), nor between cardinal comparison and proportional comparison (non-significant Polarity × Truth-value × Standard interaction: t = 1.9, p = 0.06). This is visualized in Figure 3. The effects are summarized in Table 3. As shown in Figure 2 and presented in Table 3, an effect of Standard was also found: proportional comparison takes longer than cardinal comparison. This effect can be explained by additional working memory resources required to process proportions, as corroborated by experimental data on quantifiers (Zajenkowski, Szymanik & Garraffa 2014). We did not find the effect of Standard to be affected by Type. We did find it, though, to be affected by Polarity (stronger for negatives than for positives) and Truth-value (stronger for false than for true). We have no explanation for these interactions. A main effect of Type was found, as quantifiers took longer to verify than adjectives. Examining the means in Figure 2, this main effect can be explained by the elevated RTs in the negative quantifiers, which, when averaged across polarity, result in a significantly higher average for quantifiers.
A second analysis we performed was to correlate the Polarity effects of all four classes of Standard × Type (cardinal/proportional, quantifier/adjective). Our prediction was to find the strongest correlation between the two quantifier Polarity effects (cardinal and proportional), as only they require processing downward monotonicity. First, for each subject in each Standard × Type condition, we calculated the Polarity effect by subtracting the averaged RT for the positive from the averaged RT for its negative counterpart, and dividing by the mean RT of that subject: $(\overline{RT}_{neg} - \overline{RT}_{pos}) / \overline{RT}$. We then calculated all possible correlations ($\binom{4}{2} = 6$ correlations). We found a high correlation between the two quantifier Polarity effects (r = 0.49). Figure 4 visually summarizes these correlations.
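The normalized Polarity effect and the six pairwise correlations can be sketched as below. The condition labels, the toy numbers, and the reading of "mean RT of that subject" as the mean over all of the subject's retained trials are assumptions; Pearson's r is computed by hand to keep the example self-contained.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

CONDITIONS = (
    "cardinal quantifier",
    "proportional quantifier",
    "cardinal adjective",
    "proportional adjective",
)


def polarity_effect(rt_neg, rt_pos, rt_all):
    """(mean negative RT - mean positive RT) / the subject's overall mean RT."""
    return (mean(rt_neg) - mean(rt_pos)) / mean(rt_all)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def all_pairwise(effects):
    """Correlate every pair of conditions; 4 conditions give 6 pairs."""
    return {
        (a, b): pearson(effects[a], effects[b])
        for a, b in combinations(CONDITIONS, 2)
    }


# Toy per-subject effects (four made-up subjects), for illustration only.
toy_effects = {
    "cardinal quantifier": [0.10, 0.20, 0.15, 0.30],
    "proportional quantifier": [0.12, 0.22, 0.11, 0.28],
    "cardinal adjective": [0.05, 0.02, 0.09, 0.04],
    "proportional adjective": [0.07, 0.10, 0.03, 0.06],
}
toy_correlations = all_pairwise(toy_effects)
```

Dividing by the subject's mean RT expresses each effect as a proportion of that subject's overall speed, so that slow and fast responders become comparable before correlating.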
Of course, it might not be surprising that our predicted correlation turned out to be the highest among six, as there is an a priori probability of 1/6 of getting this result by chance. At least, this fact does not go against our hypothesis. However, as pointed out by an anonymous reviewer, the differences between the correlations are not major, casting doubt on whether being the highest in rank is even meaningful. We agree with this comment, but do want to point out that the difference between the highest correlation and the second highest correlation was relatively large (0.08, while the other differences, by order, are 0.02, 0.01, 0.11 and 0.03). However, without a larger sample we cannot draw a more decisive conclusion.

Table 3: Summary of the effects up to 2-way interactions (3-way and 4-way interactions were not significant); predictors were sum coded; average difference is calculated within subjects; p-values are obtained by using the Satterthwaite approximation of degrees of freedom (Satterthwaite 1946), which is implemented in R's lmerTest package (Kuznetsova et al. 2017).

Discussion
The current study examined the components of the polarity effect in quantifiers by comparing them to highly similar polar pairs: dimensional adjectives embedded in an indefinite construction. We used pairs of adjectives and pairs of quantifiers that have many properties in common (their reference to a scale with ordered degrees, and the classification into positive and negative antonyms) but differ in their logical properties. The negative quantifiers are downward monotone while the negative adjectival constructions are not. We found this difference to be of cognitive relevance, as a larger polarity effect was found in quantifiers. Furthermore, a correlation between the polarity effects in quantifiers was also found, not strong, yet consistent with our hypothesis that downward monotonicity makes a distinct contribution to processing. We estimate that downward monotonicity contributes about 30% of the polarity effect in quantifiers, while the rest of the effect is explained by factors that are shared with negative adjectives, such as a hidden negation or a "less than" computation. We will now discuss these two sources and their cognitive underpinnings.

Negative polarity
In the beginning of this paper we presented negative polarity as triggering a "less than" computation. However, since every "<" is symmetrical with a ">", it would be more accurate to define negative polarity as satisfying the following two properties (see also Footnote 2): (i) scale reversal; (ii) proximity to zero. Scale reversal means that any degree d that makes the positive proposition (pos) stronger (compared to d′) makes the negative proposition (neg) weaker, and vice versa:

POS(d) ⊂ POS(d′) ⇔ NEG(d) ⊃ NEG(d′)
Proximity to zero means that smaller degrees make negative propositions stronger, and vice versa (i.e. d < d′ ⇔ NEG(d) ⊂ NEG(d′)). Given these two properties of negative polarity, we can examine each separately, thus being more specific about the reasons for the cognitive cost of negative polarity. Scale reversal can be thought of as a consequence of an operator in the underlying structure, akin to the sentential negation not, whose output is the complement of its input. And indeed, a hidden negation was suggested by many linguists to exist in the representation of both negative quantifiers and negative adjectives (Rullmann 1995;Heim 2006;Büring 2009;Penka 2011: Chapter 5;Sassoon 2012). Negation and its processing costs have been the center of many discussions in the psycholinguistic literature. Therefore, it could offer us insights regarding negative polarity in quantifiers and in adjectives. If hidden negations have processing demands similar to those of overt sentential negation, it is no surprise that negative polarity is more costly than positive polarity across linguistic types. Of course, a thorough discussion of sentential negation is beyond the scope of this paper; hence, we will focus here only on negation as denial, which we believe is the most relevant property for understanding the processing cost of sentential negation.
Denial is one of the main functions of sentential negation (Clark 1976; Givón 1978; Horn 1989; Glenberg et al. 1999). For example, the negation in the sentence the square is not blue serves to deny a presupposition, namely that the square is blue. The meaning of the denied presupposition has to be considered in the comprehension process of the negative sentence, and therefore also mentally represented, thus inducing a heavier load on the cognitive system.⁸ The same line of argumentation was made regarding negative quantifiers. Moxey and her colleagues argue that when a sentence like few of the fans were at the match is uttered, it is to deny that a larger quantity of fans were there, and thus the alternative with the larger quantity is also mentally represented (Moxey, Sanford & Dawydiak 2001; Ingram & Moxey 2011). Unlike with sentential negation, where we find experimental evidence for the representation of the denied presupposition (MacDonald & Just 1989; Kaup 2001; Kaup & Zwaan 2003; Hasson & Glucksberg 2006), we are not aware of experiments that directly test this hypothesis for negative quantifiers. We are also not aware of similar claims regarding the denial function of negative adjectives (it sounds odd to say that a sentence like the picture is small presupposes a larger size of the picture). Thus, it is not clear that denial, which brings about a richer mental representation and presumably a heavier cognitive load, is really what underlies the processing effects of negative polarity. In addition, one would also like to understand why it is the negative antonyms that are the denials of the positive ones, rather than the other way around. Since scale reversal is a symmetrical relation, it cannot be used alone to classify antonyms by their polarity. The second property we mentioned, proximity to zero, breaks this symmetry.
The zero point classifies antonyms into positives and negatives unequivocally, as only negative antonyms support stronger propositions the closer the degree argument is to zero. But for that, a zero point has to exist. The adjectives and quantifiers discussed in this paper have a scale with a natural zero point, hence polarity is defined. Positive polarity reflects the ordinary way we perceive the world, as the positive antonym points away from the zero point, toward a naturally observable direction (Clark 1973). Scalar ordering with an opposite directionality, towards the zero point, is therefore less natural and more difficult to process (De Soto, London & Handel 1965;Huttenlocher & Higgins 1971;Clark 1973;Geurts & Van Der Slik 2005: 111;Tribushinina 2009;Hoorens & Bruckmüller 2015). It follows that an effect of negative polarity will be measured only when the scale has a zero point. An interesting question that immediately arises is what happens in the processing of antonymous adjectives that lack a zero point (consider, for instance, the pair ugly/beautiful). Do they have the same processing cost? And how should polarity be determined for such a pair, if at all? We hope future research would shed more light on this matter.
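The two properties used above to define negative polarity, scale reversal and proximity to zero, can be verified on a toy numerosity scale. The sketch below assumes that POS(d) is "more than d" and NEG(d) is "fewer than d", that a proposition is the set of counts that make it true, and that a stronger proposition is a proper subset of a weaker one; the scale maximum of 10 is arbitrary.

```python
N = 10                  # arbitrary maximum count on the toy scale
COUNTS = range(N + 1)   # possible numerosities: 0..N
DEGREES = COUNTS        # degrees d range over the same values


def pos(d):
    """POS(d): the counts that make 'more than d' true."""
    return frozenset(n for n in COUNTS if n > d)


def neg(d):
    """NEG(d): the counts that make 'fewer than d' true."""
    return frozenset(n for n in COUNTS if n < d)


# On frozensets, < is proper subset and > is proper superset.
scale_reversal = all(
    (pos(d) < pos(e)) == (neg(d) > neg(e))
    for d in DEGREES for e in DEGREES
)

proximity_to_zero_neg = all(
    (d < e) == (neg(d) < neg(e))
    for d in DEGREES for e in DEGREES
)

# The same test applied to POS fails: smaller degrees make POS weaker.
proximity_to_zero_pos = all(
    (d < e) == (pos(d) < pos(e))
    for d in DEGREES for e in DEGREES
)
```

Scale reversal holds symmetrically between the two members of the pair, while proximity to zero holds only for NEG, which is how it singles out the negative antonym.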
Finally, we wish to point at a potential difficulty to the claim that negative polarity has an independent processing cost. As an anonymous reviewer kindly points out, Chemla et al. (2011) distinguish between formal monotonicity, a property of linguistic environments, and perceived monotonicity, a property of cognitive systems that is expressed in particular experimental settings. While the former is evaluated and determined by logical tools (e.g. entailment tests), the latter is evaluated by the measurement of some continuous dependent variable, typically RT or error rate. In Chemla et al.'s case, individual judgments of NPI licensing (an indicator of formal monotonicity) and monotonicity judgments (an indicator of perceived monotonicity) correlate positively. In our case, we find a simple effect of polarity in the adjectives antonym contrast. Negative adjectives are formally not downward monotone, but it could be that subjects perceive the negative antonyms as almost downward monotone, which explains the effect. To assess the perceived downward monotonicity of negative adjectives, it could have been useful to test for the correlation between the individual polarity effects in adjectives and individual acceptability judgments of NPI licensing. Such a correlation would suggest that perceived monotonicity plays a role in the polarity effect of adjectives. We do not have such data for the subjects of the current study. Yet, in response to the reviewer's suggestion, we examined unpublished data from a study in which subjects filled judgment questionnaires after performing a similar verification task. A correlation was not found. Though circumstances were different (e.g. the task was carried out inside an MRI scanner), the absence of a correlation is suggestive. Future studies may hopefully control for perceived monotonicity in testing polarity effect evoked by non-monotone expressions.

Downward monotonicity
We now turn to discuss downward monotonicity as an independent source of the polarity effect. Prima facie, it is unclear why a logical property such as downward monotonicity should be a determinant in language processing. One possibility is that part of understanding a sentence is to know what it entails and what it is consistent with (i.e. entailed by), and therefore entailment reversal is highly informative and has to be mentally represented. However, meaning is more than its semantics. For instance, although few is downward monotone, few of the circles are blue sounds odd when describing a scenario with no blue circles.⁹ Reiterating, few triggers a scalar implicature of not-none (i.e. some), making it pragmatically identical to a small number. People split on whether they interpret a sentence on its logical level or its pragmatic level (Spychalska, Kontinen & Werning 2016). Taking into account the pragmatic meaning seemingly destroys downward monotonicity (see Footnote 4). Nevertheless, an effect was found, implying that the semantic meaning is automatically calculated, even if it is later rejected for pragmatic reasons. Evidence from NPI licensing (Chierchia 2004; Gajewski 2011) or from processing modified numerals (Marty, Chemla & Spector 2015) seems to support such a view.
A different view attributes the processing cost of downward monotone quantifiers to the verification strategies they invite (Barwise & Cooper 1981; Bott et al. 2018). In their seminal work, Barwise & Cooper (1981) predicted that the verification process of sentences with downward monotone quantifiers should be longer than that of upward monotone ones. The reason is that to verify a sentence with a downward monotone quantifier, subjects need to go over all relevant subsets, while with an upward monotone quantifier it is sufficient to find one relevant subset. For example, to verify few of the circles are blue, one needs to verify that all blue subsets have a numerosity smaller than some standard. To verify many of the circles are blue, it is sufficient to find one blue subset whose numerosity is above some standard. However, this algorithm turns out to make some wrong predictions. It wrongly predicts a dependency of the effect on the total number of items (Deschamps et al. 2015: 125), as well as a complete reversal of the effect for false sentences (Grodzinsky et al. 2018).
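The subset-based strategy described above can be made concrete with a toy sketch. This is only an illustration under our own assumptions, not Barwise & Cooper's formal proposal: the function names, the shared numeric `standard`, and the largest-first search order are all hypothetical choices made for the example. Each function returns the truth value together with the number of blue subsets inspected.

```python
from itertools import combinations


def subsets_largest_first(items):
    """Yield all subsets of items, from the largest down to the empty set."""
    for size in range(len(items), -1, -1):
        for combo in combinations(items, size):
            yield combo


def verify_many(circles, standard):
    """Upward monotone case: one blue subset above the standard suffices."""
    blue = [i for i, colour in enumerate(circles) if colour == "blue"]
    inspected = 0
    for subset in subsets_largest_first(blue):
        inspected += 1
        if len(subset) > standard:
            return True, inspected  # a single witness ends the search
    return False, inspected


def verify_few(circles, standard):
    """Downward monotone case: every blue subset must stay below the standard."""
    blue = [i for i, colour in enumerate(circles) if colour == "blue"]
    inspected = 0
    for subset in subsets_largest_first(blue):
        inspected += 1
        if len(subset) >= standard:
            return False, inspected  # a single counterexample ends the search
    return True, inspected


# Hypothetical displays of eight circles each; the standard of 4 is an assumption.
three_blue = ["blue"] * 3 + ["red"] * 5
six_blue = ["blue"] * 6 + ["red"] * 2

print(verify_few(three_blue, 4))   # true "few": all blue subsets are inspected
print(verify_many(six_blue, 4))    # true "many": the first subset is a witness
print(verify_few(six_blue, 4))     # false "few": the first subset is a counterexample
print(verify_many(three_blue, 4))  # false "many": all blue subsets are inspected
```

In this sketch, true sentences with few require an exhaustive search whose length grows with the number of blue circles, whereas false sentences with few terminate immediately and false sentences with many require the exhaustive search. The sketch thus reproduces both problematic predictions noted above: sensitivity to the number of items and a reversal of the effect for false sentences.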
A different verification algorithm was suggested by Bott et al. (2018). Their algorithm explains the processing cost of downward monotone quantifiers as a result of their being consistent with an empty-set scenario (e.g. few of the circles are blue is true also when there are no blue circles at all). The fact that a sentence with negative quantification is true in an empty-set scenario requires an algorithm that considers negative information (i.e. the circles that are not blue) rather than just positive information (i.e. the circles that are blue). Such an algorithm is more complex and takes longer to employ. According to this theory, there is nothing difficult in downward monotonicity per se; the only relevant question is whether the quantifier is consistent with a zero numerosity of the argument or not. Downward monotone quantifiers are, by definition, consistent with it (see Footnote 4). Therefore, what we called here the effect of downward monotonicity is really the effect of verifying a sentence with a quantifier consistent with the empty set (an empty-set quantifier).
The question of which of the two properties is the one relevant for processing (empty-set or downward monotonicity) can be tested empirically. Bott and colleagues ran an experiment comparing two non-monotone quantifiers, only one of which was also an empty-set quantifier (Bott et al. 2018: 32-37). The items they used were disjunctions of the sort none or three (empty-set) and one or three (non-empty-set). Their results overall confirmed their hypothesis: verification times for the empty-set quantifier none or three were longer than those for the non-empty-set quantifier one or three. On the other hand, although none or three is non-monotone, it does contain a downward monotone quantifier in an embedded position, namely none. This might confound their results, if the number of downward monotone operators is actually the index of processing complexity (as suggested by Grodzinsky et al. 2018: Section 5). To further support the empty-set hypothesis, and to adjudicate between it and the downward monotonicity hypothesis, it could be interesting to compare two empty-set quantifiers, only one of which contains a downward monotone quantifier, and test whether a polarity effect is still found. Excitingly, there might already be a candidate: no X versus zero X (in sentences like no circles are blue). While both are consistent with the empty set, only no X is downward monotone (consider the unacceptability of *zero students have ever opened the book, compared to the acceptability of no student has ever opened the book). Zero X is not a downward monotone quantifier, and in fact, it is not even a quantifier (Bylinina & Nouwen 2018). The everyday interpretation of zero as none, according to Bylinina & Nouwen (2018), comes about via an implicature. Thus, if downward monotonicity has a processing effect, no X should be more costly than zero X, despite the superficial similarity between their meanings. If the only thing that matters is consistency with the empty set, there should be no difference between the two. Again, we leave this direction for future research.
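The way the two properties come apart can be illustrated with a brute-force check. The sketch below rests on simplifying assumptions of our own: each quantifier is modeled as a function of the number of blue circles only, the display size `TOTAL` and the threshold for few are arbitrary, and all names are hypothetical. Zero X is deliberately left out, since on Bylinina & Nouwen's (2018) analysis it is not a quantifier and a count-based model would misrepresent it.

```python
TOTAL = 6  # assumed number of circles in the display


def is_downward_monotone(q, total=TOTAL):
    """If q holds for n blue circles, it must hold for every smaller count."""
    return all(q(m) for n in range(total + 1) if q(n) for m in range(n))


def is_upward_monotone(q, total=TOTAL):
    """If q holds for n blue circles, it must hold for every larger count."""
    return all(
        q(m) for n in range(total + 1) if q(n) for m in range(n + 1, total + 1)
    )


def is_empty_set_quantifier(q):
    """The quantifier is true when there are no blue circles at all."""
    return q(0)


# Count-based truth conditions; the threshold for "few" is an assumption.
quantifiers = {
    "no": lambda n: n == 0,
    "few": lambda n: n < 3,
    "none or three": lambda n: n in (0, 3),
    "one or three": lambda n: n in (1, 3),
}


def classify(q):
    """Return (downward monotone, upward monotone, empty-set) for q."""
    return (
        is_downward_monotone(q),
        is_upward_monotone(q),
        is_empty_set_quantifier(q),
    )


for name, q in quantifiers.items():
    print(name, classify(q))
```

Under these assumptions, no and few come out as downward monotone empty-set quantifiers, none or three as a non-monotone empty-set quantifier, and one or three as non-monotone and not consistent with the empty set, which is the contrast the disjunction items exploit.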

Coda
The upshot of this study is that the polarity effect in quantifiers is a more complex phenomenon than has usually been thought. It is not explained by a single factor; rather, at least two factors are involved: negative polarity and downward monotonicity. We discussed the possible cognitive underpinnings of both sources, and suggested some interesting directions for future investigations in the realm of negation, adjectives and quantifiers.