An analysis of all-clefts

This paper provides an in-depth analysis of the semantic and syntactic properties of all-clefts (All I ate for dinner was a salad). The main characteristic of all-clefts is the inference that what is designated by the cleft is not much (the “smallness effect”). On the basis of novel observations on all-clefts with multi-clausal precopular clauses, and the interaction with negation and questions, I argue for three claims: (i) the word all is the head of a relative clause (not a free relative), (ii) the precopular clause is derived by syntactic movement, and (iii) the source of the smallness effect is the mirativity of only (Beaver & Clark 2008; Zeevat 2009). The little formal work that exists on all-clefts (Homer 2019) does not offer an analysis that reflects these three claims. Instead, I propose a derivational account of all-clefts based on Boeckx (2007).


Introduction
This paper investigates copular sentences in which all heads a relative clause, as in (1):
(1) a. [All I ate for dinner] was a salad.
b. That's [all I ate for dinner].
I will refer to these constructions as all-clefts, since they have been discussed under that label in earlier literature. Whether the term "cleft" is justified will be discussed in more detail in this paper. The most striking characteristic of all-clefts is that they come with an inference that what is designated by the relative clause, in a way to be made precise, "isn't much": in (1), that what I ate for dinner wasn't much. I will refer to this as the smallness effect. This effect is perhaps best appreciated when (1a) is compared with the seemingly similar sentence with everything instead of all, in which case no smallness effect arises:
(2) a. All I ate for dinner was a salad.
b. Everything I ate for dinner was healthy.
The everything-relative clause (RC) (2b) has a transparent universally quantified meaning, which may be represented as "[∀x : I ate x for dinner][x is healthy]". On the other hand, no such straightforward meaning representation is obtained for (2a), because of the smallness effect. Another way of looking at the smallness effect is to say that all-clefts can be paraphrased by a phrase containing only: for (2a), for example, suitable paraphrases are The only thing I ate for dinner was a salad, or I only ate a salad for dinner. Such paraphrases do not work for (2b). We will see that while only-paraphrasability is a good way to capture our intuition about the meaning of all-clefts, it doesn't quite work in all cases.
There is a line of recent work in which the meaning components of various types of cleft constructions and exclusive particles (such as only) are analyzed and compared, both theoretical (Velleman et al. 2012; Büring & Križ 2013; Coppock & Beaver 2014, a.o.) and experimental (DeVeaugh-Geiss et al. 2015; 2018). The analysis of all-clefts proposed here adds to this body of research by showing both a formal connection between all-clefts and exclusives, and differences between all-clefts and other types of clefts such as wh-pseudoclefts.
This paper investigates all-clefts in English, but all-clefts exist in various other languages. Below are a few examples (De Cesare 2014 provides further data from Italian and Spanish):
(3) a. Dutch (example from CGN corpus) 1
Alles wat je moet doen is dat terzijde leggen.
all what you must do is that aside put
'All you should do is put that aside.'
all what I want, it is that Gottéron wins the title
'All I want is that Gottéron wins the title.'
In the current paper I propose that a formal account of all-clefts should satisfy three desiderata. First, the precopular clause is a relative clause headed by all, and not (as has been previously assumed) a free relative. Second, movement has taken place in the precopular clause in the syntactic derivation of all-clefts. Support for this claim comes from novel data that involve syntactically complex all-clefts (i.e. with a bi-clausal precopular constituent). They show that the smallness effect is subject to syntactic movement constraints. Third, all-clefts are derived from syntactically simple sentences containing only, which is the source of the smallness effect, an effect usually known as the mirative function of exclusives (Beaver & Clark 2008; Zeevat 2009). I provide an analysis that meets these criteria, according to which all-clefts are derived in a way similar to how wh-pseudoclefts are derived, following the account by Boeckx (2007).
The paper is structured as follows. In section 2, I will present an overview of the main semantic and syntactic properties of all-clefts, including their specificational nature, connectivity effects, the smallness effect, and the semantic relation to only. Section 3 presents the three desiderata for an analysis of all-clefts listed above. In section 4, I consider a number of different theoretical options for constructing an analysis that meets these criteria, taking into consideration work on the diachronic development of all-clefts by Traugott (2008). This eventually leads to the proposal sketched above.

Earlier work
Despite the large amount of work on the syntax and semantics of (pseudo)clefts (see e.g. den Dikken 2017), there is no detailed study of the empirical properties of all-clefts, nor a formal analysis, other than the recent work by Homer (2019). Homer provides an analysis of all-clefts in which all is a quantity superlative, and derives properties of all-clefts from the semantics of superlatives and adjectival only. I will argue against Homer's analysis and offer a different one, in part because of a class of readings that Homer's superlative analysis cannot account for. The focus of this paper also differs somewhat from that of Homer (2019): it is on novel data on the smallness effect in embedded sentences, as well as data on the interaction between all-clefts and negation. A more detailed discussion of Homer (2019) requires the introduction of some descriptive and technical aspects of all-clefts, so this will be postponed until section 4.1.2.
There has been some work on the diachronic development of all-clefts (Traugott 2008), as well as descriptive work (Bonelli 1992; Kay 2002). Some side remarks on all-clefts have appeared in work mainly directed at other types of cleft constructions (Collins 1991: 32-34; Horn 1996: 18; den Dikken, Meinunger & Wilder 2000: §4.3; den Dikken 2017). There is a bit of work on a specific variant of all-clefts that is known as the "alls-construction", which is attested in certain dialects of English:
(4) All's I see is a crazy woman throwing away our supplies. (Kay 2002)
See Kay (2002) and Wood (2013) for some brief overviews. Putnam & van Koppen (2009; 2011) provide a more formal in-depth study, but their main focus is on the nature of the -s agreement morpheme, not on the semantics of regular all-clefts. They do, however, make some observations on regular all-clefts in passing, which I will comment on where relevant. I will not investigate the specific variant of the alls-construction further.

Basic empirical properties of all-clefts
I will start with some basic properties of all-clefts that will reveal the complexity of the construction. Further details will be covered in section 3 in the discussion of the internal structure of all-clefts.

Specificity and connectivity effects
All-clefts are specificational copular sentences, and cannot be predicational. This can be seen in (5a), where the postcopular part is the predicate healthy. 2,3 In contrast, as is well known, wh-pseudoclefts can be either predicational or specificational, (6).
(5) a. *All I ate for dinner was healthy.  *predicational
b. All I ate for dinner was a salad.  ✓specificational
(6) a. What I ate for dinner was healthy.  ✓predicational
b. What I ate for dinner was a salad.  ✓specificational
Everything-RCs show the opposite pattern from all-clefts. They are predicational, as in (7a), whereas the specificational reading in (7b) is pragmatically odd because it conveys that every item I ate was identical to a sandwich.
(7) a. Everything I ate for dinner was healthy.  ✓predicational
b. Everything I ate for dinner was a sandwich.  #specificational
The claim that all-clefts are specificational is further supported by observing that all-clefts display the familiar connectivity effects associated with specificational copular sentences. While connectivity effects are typically studied for specificational pseudoclefts only, they are in fact attested with all specificational copular sentences, as pointed out by den Dikken (2017). There is an extensive literature on connectivity effects; see e.g. Heycock & Kroch (1999) and Sharvit (1999) among many others, or den Dikken (2017) for an overview.
Here I illustrate a number of connectivity effects for all-clefts: reciprocity connectivity (8), principle C connectivity (9), and NPI connectivity (10). In each case, the (a) sentence shows that all-clefts display the connectivity effect, i.e. they behave like the corresponding simple sentence in (b) with respect to the phenomenon in question. The (c) sentences show the contrast with everything-RCs, which are predicational: they do not show the connectivity effect.
(8) a. All they_i did was embrace each other_i.
b. They_i embraced each other_i.
c. *Everything they_i did was surprising to each other_i.
(9) a. *All she_i said to me was that I should call Mary_i.
b. *She_i said to me that I should call Mary_i.
c. Everything I said to her_i was frustrating to Mary_i.
(10) a. All I said (that) I hadn't done, was read any books about syntax.
b. I said (that) I hadn't read any books about syntax.
c. *Everything I didn't want, was important in any way.
The main theoretical task that connectivity effects give rise to is to give an account of specificational pseudoclefts that explains why "structural" relations such as binding or licensing in pseudoclefts work as in their non-clefted counterparts. In the large literature on this topic, purely semantic accounts have been proposed, as well as a variety of syntactic proposals. This question thus extends to all-clefts.

Meaning components of only and all-clefts
In order to better understand the various meaning components of all-clefts, it is useful to compare them to the larger class of constructions called exclusives in Coppock & Beaver (2014). These include lexical items such as only, mere(ly), sole(ly), etc. Although all items discussed in Coppock & Beaver (2014) are lexical items, and an all-cleft is a complex construction, the meaning similarities between them make a direct comparison possible. The basic reading of VP only (and several other exclusives) is the complement exclusion reading, paraphrasable by nothing other than: (11) Mary only ate a salad. ⇝ Mary ate nothing other than a salad. This is the "negative" or "at most" meaning component of only, which is at-issue content for only. Additionally, exclusives have a "positive" or "at least" meaning component: in (11) this is the prejacent, i.e. that Mary ate a salad. With VP only, this positive component is non-at-issue (presupposed). 4 Coppock & Beaver (2014) provide a unifying analysis of exclusives in which the positive and negative meaning components are represented by the operators min(π) and max(π), respectively. These express that there is some answer to the current question under discussion that is at least as strong (min) as the prejacent π, or that there is no stronger answer (max) than the prejacent π (see Velleman et al. 2012;Coppock & Beaver 2014;DeVeaugh-Geiss et al. 2018 for more discussion and details on the min-max-analysis of exclusives).
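The prose description of the two operators can be written out as a short formal sketch. The notation below is an assumption made for illustration and may differ in detail from the formulation in Coppock & Beaver (2014): CQ stands for the set of answers to the current question under discussion, ≥ for the contextually given strength ranking on those answers, and w for the world of evaluation.

```latex
% Sketch only: CQ, \geq and w are illustrative notation, not quoted from the source.
\begin{align*}
\textsc{min}(\pi) &= \lambda w.\ \exists p \in \mathrm{CQ}\ [\, p(w) \wedge p \geq \pi \,]
  && \text{(some true answer is at least as strong as } \pi\text{)} \\
\textsc{max}(\pi) &= \lambda w.\ \forall p \in \mathrm{CQ}\ [\, p(w) \rightarrow \pi \geq p \,]
  && \text{(no true answer is stronger than } \pi\text{)}
\end{align*}
```

On this sketch, VP only in (11) presupposes min(π) and asserts max(π), with π the prejacent that Mary ate a salad. The formulation of max as "every true answer is at most as strong as π" matches the prose "no stronger answer" only if any two answers are comparable on the scale, which is a further assumption of the sketch.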
All-clefts also have the complement exclusion reading; for example, (5b) conveys that I ate nothing other than a salad. In order to see how this meaning component of all-clefts relates to the division of positive and negative meaning components of exclusives, it is worth making a brief excursion to it-clefts (e.g. It was John that called) and their semantic analysis. Velleman et al. (2012) propose that it-clefts and only are symmetric in the sense that what only presupposes, an it-cleft makes at-issue, and vice versa. The following test, going back to Horn (1981) and used by Velleman et al. (2012: 445), is based on how at-issue and non-at-issue meaning components behave in context.

(12) It-cleft
I know Mary ate a pizza …
a. #but I've just heard it was a pizza she ate.
b. #but it wasn't a pizza she ate.
(13) Only
I know Mary ate a pizza …
a. but I've just heard she only ate a pizza.
b. but she didn't only eat a pizza.
Both the it-cleft (12a) and the only-sentence (13a) contain a meaning component that Mary ate a pizza. In (12a), this leads to infelicity because of information redundancy, whereas (13a) is fine: this supports the idea that Mary ate a pizza is at-issue meaning in an it-cleft (so that (12a) asserts the same information again), but non-at-issue meaning with only ((13a) asserts something new, namely that Mary ate nothing other than a pizza). The negation sentences in (12b) and (13b) make a similar point: assuming that negation targets at-issue meaning, the tests show that Mary ate a pizza is at-issue in (12) (so that negating the cleft contradicts the preceding sentence). For only, we can explain that (13b) does not lead to contradiction because negation targets the at-issue component that Mary ate nothing other than a pizza. This does not contradict the preceding sentence.
We can apply the same tests to wh-pseudoclefts and all-clefts (see also Homer 2019 for discussion of similar tests):
(14) Wh-pseudocleft
I know Mary ate a pizza …
a. #but I've just heard that what she ate was a pizza.
b. #but it's not the case that what she ate was a pizza.
(15) All-cleft
I know Mary ate a pizza …
a. but I've just heard that all she ate was a pizza.
b. but it's not the case that all she ate was a pizza.
We find that wh-pseudoclefts behave like it-clefts in making the proposition that Mary ate a pizza at-issue, which leads to redundancy in (14a), and contradiction in (14b). All-clefts, however, despite their syntactic similarity to wh-pseudoclefts, pattern with only: in (15a), that Mary ate a pizza is non-at-issue, and what is asserted is the novel, at-issue information that Mary ate nothing other than a pizza. In (15b), what gets negated is the claim that Mary ate nothing other than a pizza (these judgments are a little easier with a pitch accent on all).

Smallness effect
As mentioned in the introduction, the characteristic property of all-clefts is the associated inference that what is represented by the relative clause isn't much, the smallness effect. 5 What is crucial about the smallness effect is its very limited distribution: no smallness effect arises when a relative clause headed by all appears in a non-copular sentence.
Below are some examples drawn from the British National Corpus (BNC), 6 for which we don't find a smallness effect:
(16) a. Praise the Lord for all he has done.
b. He deserves all he has got.
c. The dropping of these conditions may have allowed my great-grandfather to qualify, as by all I have heard about him he had none of the virtues which were needed to gain entry.
Likewise, as mentioned in relation to (2b), repeated below, no smallness effect is found with everything-RCs in copular constructions.
(17) Everything I ate for dinner was healthy.
Hence, the smallness effect is neither specific to the lexical item all as opposed to everything (witness (16)), nor to the copular construction as opposed to the non-copular construction (witness (17)), but to the combination of both. Moreover, the smallness effect is not only found in all-clefts, but also in constructions with only, as well as some other constructions, such as no more (see Homer 2019: 17). The remainder of this subsection studies the smallness reading in more detail, and compares the smallness effects in various constructions.
In addition to the complement exclusion reading, only in (11) carries an inference that a salad isn't much to eat (for Mary). This is similar to the smallness effect of all-clefts, but in the literature on exclusives this is usually known as the mirative effect of only (e.g. Beaver & Clark 2008: 250; Zeevat 2009). For example, Beaver & Clark (2008) talk about the "discourse function" of only as "weakening an expectation":
(18) Discourse function of exclusives: To make a comment on the Current Question (CQ), a comment which weakens a salient or natural expectation. To achieve this function, the prejacent must be weaker than the expected answer to the CQ on a salient scale. (Beaver & Clark 2008: 251)
Likewise, Zeevat (2009) argues that "the semantic contribution of only is only low quantity mirativity: less than expected" (p. 124). That this mirative effect is a separate meaning component from the complement exclusion reading can be shown by using a complement that denotes something that is contextually understood as "large":
(19) Mary only ate a 1 kg steak.
This still conveys that Mary ate nothing other than a steak (the complement exclusion reading), but has a "funny" interpretation because it suggests that a huge steak is a small meal (for Mary). This is due to the mirativity effect (Homer 2019 has a similar discussion of "funny" smallness readings). A similar conclusion is reached by Homer (2019), who points out that the smallness effect is defeasible, as it can be detached from the complement exclusion reading. He makes the point for all-clefts, but the same observation holds for only (replace A's response by No, he only ate a roasted pig for lunch):
(20) Forensic context (Homer 2019: 18)
A: The victim ate a roasted pig for lunch, sir.
B: Anything else you maybe forgot to mention?
A: No, all he ate for lunch was a roasted pig.
In this context, only the complement exclusion reading remains in A's response, and there is no smallness effect.
Thus, we see that both only and all-clefts carry a reading that I refer to as the smallness effect, but which is also known as mirativity. It is independent of the complement exclusion reading, and is weak in the sense that it can be overridden by context. To further investigate the status of the smallness meaning component, it is useful to see what happens when the triggering construction (only or an all-cleft) is embedded in environments such as questions or negation (the discussion and data below follow a similar discussion in Homer 2019). In questions, the smallness effect is preserved (again, this effect is stronger when a "large" complement is used instead, such as 1 kg steak; recall (19)):
(21) Was a salad all you ate for dinner? ⟹ smallness effect (a salad isn't much to eat for dinner)
At-issue meaning is what gets asked in a question, so (21) targets the at-issue component of all-clefts, i.e. whether you ate nothing other than a salad for dinner. The presupposed part (that you ate a salad for dinner) projects out of the question, and the relevant judgment is what happens with the smallness effect. The reported judgment is that the inference that a salad isn't much to eat for dinner is preserved in (21), although "maybe not as strongly" according to Homer (2019: 17).
For negation, the facts are more complicated:
(22) a. ??All I ate for dinner wasn't a salad.
cannot mean: I didn't only eat a salad for dinner
b. A salad wasn't all I ate for dinner.
As for (22a), it has been observed before that specificational copular sentences are incompatible with predicate negation, unless a specific sort of correction or contrast reading is intended (den Dikken 2017: 40). In (22b), Homer (2019) claims there is no smallness effect, although the judgment seems to be more subtle here.
The only-counterparts of (21) and (22b) have a similar status: mirativity is preserved in (23), but less clearly so in (24).

(23) Did you only eat a salad for dinner? ⟹ smallness effect
(24) I didn't only eat a salad for dinner (… I ate more).
The data discussed above, which show that the smallness effect is defeasible, and does not project in all environments, lead Homer (2019) to the conclusion that the smallness effect is not a presupposition, but possibly akin to a conversational implicature (his p. 18), although different views exist in earlier literature regarding the mirativity of only (Zeevat 2009, and see Homer 2019: fn. 17). I will remain neutral with regard to the precise status of the smallness effect in all-clefts and other constructions, but what is important for my proposed analysis is the close parallelism between all-clefts and only.

Rank-order readings
In addition to the complement exclusion reading (section 2.2.1 above), some exclusives have what is called a rank-order reading by Coppock & Beaver (2014). This is illustrated for the exclusive merely in (25):
(25) This is merely a down payment. (Coppock & Beaver 2014: 373)
This sentence does not mean that this is nothing other than a down payment, but rather that it is nothing more than a down payment. Complement-exclusion and rank-order readings are analyzed in the same min-max framework in Coppock & Beaver (2014), but the nature of the scale is different (see there for details). Note that in addition to a complement exclusion reading, VP only can also have a rank-order reading, but noun-modifying only cannot. This raises the question whether all-clefts allow rank-order readings as well. A simple case, based on a hierarchy scale of employment, works with only as well as with all-clefts:
(26) All he is is a simple employee.
For a more contextual example, the rank-order reading of all-clefts can be illustrated in the following context, borrowed from Beaver & Clark (2008: 258). Imagine that John is at a conference, which is also attended by many philosophers. John is collecting signatures, and tries to get signatures from the most famous philosophers. With this context in mind, the following sentences refer to the fame scale, and not the quantity scale:
(27) a. John only got a Soames [… but Bill got a very famous one].
b. All John got was a Soames.
Both sentences mean that John didn't get an autograph from a more famous philosopher than Soames. They do not mean that the only signature John got was one from Soames. Another case, provided to me by Jessica Rett (p.c.), is modeled after a scenario from Zanuttini & Portner (2003: 50): Again, the relevant reading here is not that I put in nothing other than A peppers, but that I didn't put in anything spicier than A peppers. In other words, the scale here is one of spiciness, and not of exclusion.
The presence of rank-order readings with all-clefts will turn out to be important in developing a formal analysis (in section 4), because here paraphrasability with the only thing breaks down:
(29) a. The only thing he is, is a simple employee. ≠ (26)
b. The only thing John got is a Soames. ≠ (27b)
c. The only thing I put in is A peppers. ≠ (28)
The reason is that the only thing contains adjectival only, which does not have a rank-order reading. Hence, an analysis that is based on quantity scales alone, or on adjectival only, will not be able to account for rank-order readings of all-clefts.
In summary, in this section I have described parallels between the mirativity of only and the smallness effect in all-clefts in terms of their meaning components and information-structural properties, their behavior under embedding in questions and under negation, and the availability of rank-order readings. This strongly suggests that the smallness effect and the mirativity of only have a similar source. I will come back to this in section 3.3. I will finish this section with a side note on focus properties of all-clefts as compared to only.

Side note: Focus properties of all-clefts and only
Overt only, when used adverbially, is focus-sensitive in that its semantic scope is marked by a pitch accent. See Coppock & Beaver (2014) for more discussion on the focus-sensitivity of exclusives and its source. In the case of all-clefts, it is a rather obvious observation that the semantic scope of only in the paraphrase cannot be bigger than the counterweight clause in the all-cleft:

Internal structure of all-clefts
Now that we have gone through the major properties of all-clefts (they are specificational copular sentences showing a smallness effect), I proceed to the analysis of the internal structure of all-clefts. Above, I have used the labels "relative clause headed by all" and "cleft" in a descriptive manner without formal justification. I will now provide arguments that this is indeed the right structure. To ease discussion of the alternative options for the structure of all-clefts, I first introduce some neutral terminology. I will refer to the gapped clause that directly follows all as the "accompanying clause", which together with all forms the precopular clause. 7 Following the practice in the cleft literature, I refer to the postcopular material as the "counterweight".
(33) [All [I ate t for dinner]] was [a sandwich].
ALL + accompanying clause (I ate t for dinner) = precopular clause; was = copula; a sandwich = counterweight
I will argue for the following claims:
i. The precopular clause is a headed relative clause, headed by all. In particular, I argue against the idea that the accompanying clause is a free relative or amount relative.

The precopular clause is a headed relative clause
In order to consider the status of the precopular clause of an all-cleft, it is useful to look at what has been said about the precopular clause of wh-pseudoclefts (compare (1a) with What I ate for dinner was a salad). The syntactic status of the wh-phrase has been a major issue in the literature (see den Dikken 2017: §5 for an overview). Two major options that have been proposed are that it is a full wh-question, or that it is a free relative (Caponigro 2003). An example of a proposal that argues for the former is den Dikken, Meinunger & Wilder (2000), who emphasize the similarity between a wh-pseudocleft and a question-answer pair:
(34) What John ate is a pizza. ⟷ What did John eat? A pizza.
I will not summarize the various arguments that the authors present for analyzing the wh-phrase in a wh-pseudocleft as a wh-question.  (16), and does not seem to be related semantically to all-clefts.
The alternative proposal that the wh-phrase in a wh-pseudocleft is a free relative has a long history in the literature (see den Dikken 2017: 84 for references), but to the best of my knowledge, this literature is not concerned with its applicability in the domain of all-clefts. On the other hand, Homer (2019), although not mentioning a link to wh-pseudoclefts, assumes that the accompanying clause in an all-cleft is a free relative. I will argue against the free relative approach, and propose instead that the accompanying clause is a regular relative clause headed by all.
First, a property characteristic of headed relative clauses in English is that subject relative clauses must contain a complementizer that or a relative pronoun (e.g. the man *(that/who) saw me), an effect known as the anti-that-trace effect (Douglas 2017). Subject all-clefts show the same effect: they must have a complementizer that. 8
(35) All *(that) surprised me was that Mary was there.
Second, English is a language that allows dropping the relative pronoun or complementizer in non-subject RCs, i.e. it allows bare RCs. As for all-clefts, in most (non-subject) examples there is no relative pronoun or complementizer present, so they could be called bare all-clefts. Corpus data show that non-subject English all-clefts are not necessarily bare, as they may contain a complementizer that (or, for some speakers, a wh-word what 9 ):
The optionality of a relativizer in regular English RCs thus finds a parallel in all-clefts. Conversely, Dutch is a language that does not allow bare RCs at all. In regular relative clauses, a relative pronoun is obligatory:
(37) Dutch
de man *(die) ik zag t
the man that_rel I saw
'the man that I saw'
Likewise, in Dutch all-clefts a relative pronoun is obligatory. It must be a wh-item wat, instead of the d-series relative complementizer in (37).
We thus find a parallel between the optional or obligatory nature of a relative pronoun in regular relative clauses and all-clefts, both in English and in Dutch (summarized in Table 1). These findings suggest that all and the accompanying clause form a headed relative clause, and are problematic within a free relative approach. Clearly, there is no such thing as a "bare free relative": if the wh-word is left out in a free relative, only the gapped clause is left. Moreover, English free relatives must contain a wh-word, not the complementizer that. English all-clefts, by contrast, can be bare, and if they do have a relativizer, it is typically the complementizer that, not what.
Another argument against the free relative approach comes from a set of curious facts concerning the interaction between all-clefts and negation, which to my knowledge have not been reported before. The accompanying clause in an all-cleft cannot contain matrix negation: 11,12
(39) a. *All (that) I don't like about her, is that she is sometimes late.
intended: The only thing I don't like about her, is that she is sometimes late.
b. *All (that) I would never do is climb a mountain.
intended: The only thing I would never do is climb a mountain.
10 See Sportiche (2011) for more on the w-series and d-series relative pronouns in Dutch.
11 Various native speakers I polled have confirmed the constraint on matrix negation in all-clefts, and negative examples are not found in corpora. Note that Putnam & van Koppen (2011: 102) provide the example "All John doesn't buy is any wine" in passing (their main focus is on alls-constructions), but this appears to be a constructed example to illustrate NPI connectivity effects.
12 The constraint is specifically about matrix negation. Negation is possible in all-clefts when it is in an embedded clause. More on embedded clauses in all-clefts in section 3.2 below.
Note that on the other hand, matrix negation is fine in a variety of related constructions, such as the only thing-RCs, wh-pseudoclefts, everything-RCs, and, crucially, free relatives.
(40) a. The only thing I would never do is climb a mountain. b. What I didn't do was climb a mountain. c. Everything I don't want anymore is in this box. d. John put what I didn't eat in the fridge.
According to the free relative approach, the accompanying clause in an all-cleft is a free relative. This would require an explanation for why a normal free relative allows negation, but the one in an all-cleft does not. On the basis of the strong similarities with regular relative clauses, as well as the negation data, I conclude that the accompanying clause in an all-cleft is not a free relative, but a regular relative clause headed by all.
I conclude this discussion on relative clauses with a remark on one further potential direction of analysis that I will reject. It may be suggested that the accompanying clause is a special sort of headed relative clause, namely an amount relative (also known as degree relative) (Carlson 1977; Grosu & Landman 1996; 1998; McNally 2008). This is because Carlson (1977), who first investigated the construction, discusses some relative clauses headed by a universal quantifier as amount relatives:
(41) a. That's all there is. (Carlson 1977: (10b))
b. Marv put everything (that) he could in his pockets. (Carlson 1977: (17))
The relevant degree reading for (41b) is "for the maximal amount d that Marv could put in his pocket, Marv put d in his pocket" (as paraphrased in McNally 2008). The sentence also has a regular relative clause (non-degree) reading, which says that for every object that Marv could put in his pocket, he put that object in his pocket. It is not clear to me whether there is any truth-conditional difference between the (putative) degree reading and the non-degree reading in (41a), which in my proposal is analyzed as an all-cleft with a regular relative clause headed by all. I am not aware of any further discussion in the literature of the specific example of (41a) as an amount relative. I take it that, in general, all-clefts do not have degree readings, so I do not analyze the relative clauses inside them as amount relatives.

Syntactic movement in the precopular clause
In this section I will argue for the claim that all-clefts involve syntactic movement in the precopular clause. Support for this idea comes from considering what happens when the accompanying clause in an all-cleft contains an embedded clause, as in the following example:
(42) All [Mary said she wanted to do] is sleep.
I will consider how two properties of all-clefts already discussed above extend to this syntactically more complex case: the smallness effect (§3.2.1), and negation restrictions (§3.2.2).

Smallness effect in embedded clauses
The few mentions of the smallness effect in previous literature have all been illustrated with cases in which the relative clause headed by all is syntactically a single clause. In (43) the accompanying clause has an embedded clause introduced by the verb say. In this case, we find an ambiguity with respect to the smallness effect: a "high construal" reading in which smallness is about John's saying, and a "low construal" reading in which smallness is about Mary's wanting. When the smallness effect associates with the higher clause, (43) has reading (i), paraphrasable as John only said that Mary wanted a pencil for her birthday. When the smallness effect associates with the lower clause, we get reading (ii), paraphrasable as John said that Mary only wanted a pencil for her birthday. This ambiguity, which to my knowledge has not been reported before, is important because it suggests that the smallness effect has a syntactic origin. To see this, it is useful to compare (43) with two sets of similar scope ambiguities with reportative verbs discussed in earlier literature. The first arises with adjectival superlatives or only, as discussed in Bhatt (2002). 
The superlative adjective first/only in the DP in (44) can have a high construal reading (paraphrased in (i)), or a low construal reading (paraphrased in (ii)):
(44) the first/only book that John said that Tolstoy had written
(i) the first/only book of which John said that Tolstoy wrote it
(ii) the book of which John said that Tolstoy wrote it first / only wrote it
The second ambiguity that I want to compare (43) to is one associated with temporal adverbial clauses (Larson 1987; Haegeman 2009).
Both the superlative adjective ambiguity in (44), and the temporal adverbial ambiguity in (45), have been accounted for by a syntactic mechanism where the low and high construal readings correspond to syntactic movement from the lower or higher clause (movement of the superlative operator -est in (44), and movement of the temporal adverbial clause in (45)).
A reviewer suggests that the reported ambiguities with all-clefts may instead be due to different ways of association with focus. However, as was discussed in the note on page 13, all-clefts do not associate with a focus-marked constituent in the way that exclusives such as only do. Hence, it is not clear how the ambiguity in (43) can be explained as an ambiguity of focus association.
Note that all examples in (43)/(44)/(45) contain the verb say. Indeed, it has been noted that the ambiguity in (45) does not arise with factive verbs (Haegeman 2009: §6). When say is replaced with factive regret, only the high construal reading remains:
This may be related to factive verbs being a barrier for movement from the embedded clause (sentences with factive embedding verbs also do not show syntactic Main Clause Phenomena, see Haegeman 2009 for discussion), or to the evidential reading that reporting verbs such as say and claim have (see Hunter et al. 2006). Likewise, although not discussed by Bhatt (2002), we may observe that when say is replaced by regret in (44), the low construal reading is no longer available:
(47) the first book that John regretted that Tolstoy had written
= first in the order of John's regretting
NOT: the first book that Tolstoy had written (which John regretted)
Now going back to all-clefts, we find that the "embedded smallness effect" observed with say in (43) is not found with factive verbs such as regret:
(48) All John regretted that Mary asked for her birthday, was a pencil.
(i) John didn't regret much
NOT: (ii) Mary didn't ask for much
In summary, there are systematic parallels between all-clefts and the other two constructions illustrated in (46) and (47): they all have an ambiguity of high and low construal readings with reporting verbs, but not with other verbs such as factive ones. These parallels provide support for the view that, just as has been proposed for superlative DPs (Bhatt 2002, recall discussion above) and temporal adverbial clauses (Haegeman 2009, recall discussion above), the smallness effect in all-clefts is due to syntactic movement in the accompanying clause.

Negation
The second set of data relevant for the movement claim goes back to the restriction on matrix negation in all-clefts (recall the discussion around (39) above). We have already seen that all-clefts do not allow matrix negation in their accompanying clause, (49). However, when the negation is in an embedded clause, the all-cleft is grammatical, as in example (50):
(49) *All (that) I don't like about her, is that she is sometimes late. (=(39a))
(50) All I said (that) I don't like about him, is that he is sometimes late.
(i) I didn't say much
NOT: there isn't much I don't like about him
Here only the high construal reading is available. This suggests that negation has a blocking effect: movement out of the embedded clause past negation (in order to obtain a low construal reading) is not possible. On the other hand, the shorter movement step out of the matrix clause, which does not pass negation, is possible, and results in the high construal reading.

All-clefts are syntactically related to non-clefted sentences with only
Arguments for the third claim have already been discussed earlier in the paper, and will be brought together here. A first argument for the claim that all-clefts are syntactically related to their non-clefted counterparts comes from the presence of connectivity effects in all-clefts, as already discussed in section 2.1. Examples (8), (9), and (10) show the formal relation between the all-clefts (in the (a) sentences) and their non-clefted counterparts (in the (b) sentences). A natural analysis of these facts is that the all-clefts are syntactically related to their non-clefted counterparts (although, of course, there exist some alternative accounts for connectivity effects, such as purely semantic ones, see den Dikken 2017 for an overview).
I moreover propose that the underlying base sentences of all-clefts contain an expression with the meaning of exclusive only. The semantic similarities between all-clefts and exclusive only were discussed in section 2.2.2. The first similarity that was discussed is the information status of the negative meaning component ("at most"), which is at-issue for both constructions, and the positive meaning component ("at least"), which is presupposed for both constructions, as diagnosed by the tests (12)-(15). In this respect all-clefts behave differently from wh-pseudoclefts. Second, a number of similarities relating to the smallness effect were discussed. A smallness or mirative effect is found in both constructions, and facts about the strength of this inference (defeasibility, context-sensitivity), and its behavior in embedded contexts (questions, negation) are parallel. Hence, I assume that the source of the smallness effect in all-clefts is the exclusive particle in the non-clefted sentences they are derivationally related to.

Three analytical options
In the previous section we have collected three desiderata for a formal analysis of all-clefts. Before presenting an account that meets these criteria, I would like to distinguish three different types of analysis.
Approach 1 We may assume that all in all-clefts is the same lexical item as the regular universal quantifier all, and the two have the same semantics. Then the account has to provide an explanation for how the particular semantic/syntactic structure of clefts conspires to give all its smallness inference, but no such inference arises when all appears in any other environment. The main challenge for this approach is to explain why the smallness effect does not arise with everything-RCs, for these are also copular constructions containing a relative clause headed by a universal quantifier.
Approach 2 We may assume that there is homophony in the case of all, and there is a universal quantifier all₁, and a different lexical item all₂ whose meaning is similar to only. This approach seems to be too radical, as one would need to explain why there is no smallness effect in relative clauses headed by all in non-copular contexts (see (16)). In other words, one would have to explain the very peculiar distribution of all₂ (e.g. I brought all₁ that you ordered is fine, but *I brought all₂ that you ordered is not).
Approach 3 As an intermediate option, we may propose that the smallness effect arises only in cleft structures because it comes into existence during the derivational building process of a specificational cleft sentence. Below I will argue that Approach 3 is more compatible with diachronic observations suggesting that all-clefts are a specific grammaticalized construction, rather than a general property of the meaning of the word all.
Approach 1 is perhaps the natural inclination of the formal semanticist: when faced with a single expression that means different things in different environments, we would like to provide a single semantic entry that is flexible enough to fit various different uses. This is also the approach that Homer (2019) takes. This general approach is attractive from a theoretical perspective, but is only plausible if the various meanings of the expression under investigation are in some sense of equal standing. In the case of all-clefts, there is diachronic evidence that this is not the case. The smallness reading of all-clefts has been shown to be a grammaticalized inference linked to very particular contexts. This inference then got reinterpreted as part of the meaning of all in the particular environment of a cleft structure. This makes Approach 1, which posits a single semantics for all that in the vast majority of cases does not have the smallness reading, highly implausible. I will now review in some more detail the diachronic development of all-clefts as reconstructed in Traugott (2008).

Historical development
The following review is a summary of Traugott (2008).
a. For all he did, was to deceiue good knights. [1590]
'everything he did was in order to deceive good knights'
b. And all I did, was to aduance thy state. [1601]
'everything I did was in order to advance your state'
These were distributive universally quantified statements, which would be expressed with everything-RCs in present-day English.
After 1600, all-clefts started to be used in a reading with a smallness effect. Traugott argues that these constructions appeared mainly in adversative contexts, that is, ones in which "everything that one person says or does may not be enough for some other person or may be interpreted as mistaken/inadequate" (p. 160). This is illustrated with the negative case in (53a), and the modal case in (53b) in which the to-phrase no longer has a purposive reading.
(53) a. But as for my self he doth me notorious wrong. I did not mention any Principle of Vnity in this place, nor so much as dream of them, … All I said was this, that we doe not separate from other Churches, but from their Accidentall Errours [1658]
b. there is no possibilitie of overthrowing the new election which shalbe made when the place is voyd, and if it be so allready, or shalbe so, all you can doe is to do some good for the tyme to come [1624]
In these examples, the smallness was literally expressed in the negative contexts all-clefts appeared in. The assertion that one's actions or words were inadequate/not sufficient eventually became grammaticalized, and is now an inference associated with the word all in all-clefts. This is an example of change occurring in "dialogic" discourse contexts, which is to say involving two different viewpoints in turn-taking discourse. This is a situation that has been claimed to enable pragmatic change from neutral expressions to expressions encoding multiple perspectives (see Traugott 2008: 145 for more details).
The crucial historical change was thus that a distributive, universally quantified statement ("every action I took is x") got interpreted as something measuring out an amount ("the sum of actions I took is x"). This amount may now be compared to some level of expectation.
This historical perspective gives us a number of insights that are relevant for developing a formal synchronic account of all-clefts. Traugott shows that the development of all-clefts took place as a separate "track" from other types of clefts such as wh-pseudoclefts. Thus, the highly restricted distribution of the smallness effect can be understood because it is a grammaticalized inference connected to a specific construction.
Hence, pursuing a strategy that posits a general semantics for the universal quantifier all that comes out as having a smallness effect in certain specific environments, but not others (Approach 1 above), although perhaps attractive from a general principle of parsimony, is not compatible with the historical fact that all in this construction simply got reinterpreted because of the contexts it appeared in, and shifted away from its original universal quantifier meaning. The opposite option, to posit a full lexical ambiguity (Approach 2), is too extreme: all₂ is not a separate lexical item, but specifically associated with the construction it appears in. Hence, I will argue for Approach 3. All in all-clefts is no longer a universal quantifier, but got reinterpreted as part of the derivational process of building all-clefts. Before proceeding to the analysis, I will review two existing accounts. The first is a recent theoretical account of all-clefts, Homer (2019), which is an instantiation of Approach 1. The second is a prominent analysis of specificational pseudoclefts, den Dikken, Meinunger & Wilder (2000), whose authors also offer some comments on all-clefts.

Homer (2019)
Homer (2019) assumes that there is a single morpheme all, which is not a universal quantifier, but rather a quantity superlative, similar to words such as most, following Hallman (2016). The accompanying clause is taken to be a free relative, which semantically serves as the comparison set for the superlative.
Homer explains the narrow distribution of the smallness effect as reflecting a semantic type distinction: all-clauses with a smallness reading denote properties (type ⟨e,t⟩). This makes them suitable to fit in copular sentences, but not in argument positions (which require an expression of type e, or a generalized quantifier). The explanation for their status as properties follows Sharvit (2015), who makes the same claim for DPs of the form the only NP. They can get a non-definite (i.e. property) reading in copular positions.
The regular distributive (i.e. universal quantifier) reading of all-RCs in argument position is obtained by type-lifting the superlative all (∃-closure). The smallness effect, which Homer assumes is associated with the semantics of the superlative operator -est, then disappears with this type-lifting operation, although the details of the process are not made entirely clear.
I have given arguments against the free relative analysis of the accompanying clause in section 3.1. In addition, there are a number of reasons to doubt Homer's claim that the accompanying clause is an argument of all. First, English morphology does not distinguish between nominal all and adjectival all, but other languages, such as Dutch, do. The nominal form, alles, is the one that is also used in all-clefts. This makes it less likely that it takes the accompanying clause as an argument, as one would expect this for adjectival all only.
A second problem comes from partitive constructions. Quantity expressions such as most and (according to the hypothesis) all can take partitive constructions. In this case, it seems clear that the partitive phrase it combines with indeed contains a free relative:
Crucially, though, (56b) does not have a smallness reading, even though the sentence has precisely the structure (quantity superlative + free relative) that Homer ascribes to all-clefts.
The type-based analysis for the distribution of the smallness effect runs into a problem described by Bošković (1997: 238n). He draws attention to specificational pseudoclefts in which the counterweight is morphologically an adjective:
What John is, is unusual. [specificational reading]
Bošković relates his discussion to an analysis of Partee (1986), but the same argument goes for Homer's story. In order for unusual to function as the postcopular material, it has to be type-lifted into something of type e. This incorrectly predicts, as Bošković points out, that a sentence like What John likes is tall should have a meaning like What John likes is tallness. The corresponding all-clefts (All John is, is tall/unusual) make the same point: in Homer's story tall/unusual has to be of type e. So *All John likes, is tall is predicted to mean All John likes, is tallness. These data argue instead for an account in which all-clefts are related to their simple counterparts John {is/*likes} tall (cf. my claim in section 3.3).
In Homer's (2019) account there is no movement in the precopular clause: it assumes that all modifies a free relative, but there is no movement of all from the accompanying clause into its final position. So his account does not meet desideratum 2, and it is not clear how it would account for the embedded smallness data, and the blocking effect of negation.
Finally, Homer's (2019) account has a problem with rank-order readings of all-clefts (discussion in section 2.2.3 above). Because the semantics of all as a quantity superlative is based on the part-whole relation on quantity (mereology), it would analyze All John is, is a simple employee as having a complement-exclusion reading, i.e. The only thing John is, is a simple employee (see (26)/(29a) above). It is precisely the rank-order readings in which this type of only-paraphrasability breaks down: rank-order readings are possible for adverbial only, but not for adjectival only (as in the only thing), which is the basis of Homer's analysis.

Den Dikken et al. (2000)
There is a great variety of accounts of the structure of pseudoclefts available (see den Dikken 2017 for a very extensive overview), and a review of even a selection of them is outside the scope of this paper. However, I will discuss one prominent analysis of specificational pseudoclefts, den Dikken, Meinunger & Wilder (2000), which is also one of the few that makes some remarks on all-clefts.
Den Dikken, Meinunger & Wilder (2000) make a distinction between two types of specificational pseudocleft (SPC): Type A and Type B. The main differences have to do with the syntactic category of the counterweight (full IP for Type A, other possibilities for Type B), as well as the reversibility of the pseudocleft (Type A non-reversible, Type B reversible; I refer to the original paper for further details). In their brief discussion of all-clefts, the authors state that all-clefts "display all the restrictions on Type A SPCs" (p. 81, their italics), such as connectivity effects (as discussed in my section 2.1). On the other hand, a basic property of all-clefts is that they are reversible. This would classify all-clefts as Type B SPCs, not Type A. When we look at the syntactic category of counterweights, we find a variety of options, such as IPs, VPs, DPs, and APs:

An analysis of all-clefts based on a derivational account of pseudoclefts
Instead, I will base my account of all-clefts on Boeckx (2007). Boeckx gives a "fully derivational" account of pseudoclefts. Very concisely, the derivation of a wh-pseudocleft proceeds as follows:
A wh-pseudocleft is derived from its non-clefted counterpart, (60a). First, what will end up as the counterweight moves to the specifier of Foc(us)P, (60b). This is supported by the common idea that this part of a cleft construction is focused, i.e. provides new information. In the case of all-clefts this is even clearer, because the counterweight is the scope of only, as discussed before. Next, the remnant moves to the specifier of Top(ic)P, (60c). This is a "focus reinforcing" process that articulates the information-structural contrast between counterweight and wh-clause of pseudocleft sentences (see Boeckx 2007 for further details). Then, a relativization process happens by lexicalizing the wh-word what out of the copy tᵢ, (60d). Finally a copula is inserted in the Topic head, (60e).
The "construction-specific" nature of Boeckx's account makes it suitable to extend to all-clefts. I propose that all-clefts are derived in a way similar to (60). The main issue is where all is introduced in the derivation. I assume that all is the phonetic realization of the relativization of only.
A reviewer points out that the account for the alls-construction in Putnam & van Koppen (2011) shares some properties of the account proposed here for all-clefts, because the former also employs syntactic movement inside the pseudocleft (T-to-C movement in order to check tense/mood features of the copular verb). However, the type of movement is different in both analyses, and the empirical properties of the construction that are to be explained are also not the same (all-clefts and the alls-construction are distinct). Hence a direct comparison between the two accounts is not possible.
We find that the account presented in this section satisfies the three desiderata for an analysis that I formulated in section 3. First, it relates all-clefts to non-clefted sentences, providing an easy explanation of connectivity effects, and avoiding Bošković's problem of adjectival counterweights (desideratum 3). Second, it involves movement of only (desideratum 2). In a syntactically complex accompanying clause, only in the base sentence can either appear in the embedded sentence, or in the matrix sentence, thus accounting for the embedded smallness effect: (62) a. John only said that Mary wanted a pencil for her birthday. b. John said that Mary only wanted a pencil for her birthday.
When a factive verb, such as regret, is used instead of say, it blocks the movement of the lower only, parallel to the other restrictions of factive verbs discussed in section 3.2.1. Similarly, negation in the embedded clause blocks movement of only out of that clause, as discussed in section 3.2.2. Finally, all comes out as the head of a relative clause, as desired (desideratum 1).
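The derivational steps summarized in this section can be made concrete with a small procedural sketch. The Python code below is only an illustration under strong simplifying assumptions: it tracks linear order over a flat list of constituents rather than building syntactic trees, and the function names derive_cleft and surface, the trace placeholder, and the spell-out rule (a base containing only is realized with all, with only itself left unpronounced) are expository assumptions, not part of Boeckx's (2007) formalism. It does not model the blocking effects of factive verbs or negation. What it shows is that the two base sentences in (62) map onto one and the same surface all-cleft, which corresponds to the ambiguity between high and low construal.

```python
from typing import List

TRACE = "t"  # hypothetical placeholder for the copy left by the moved counterweight


def derive_cleft(base: List[str], counterweight: str, copula: str = "was") -> List[str]:
    """Toy linearization of a pseudocleft or all-cleft from a base sentence.

    `base` is a flat list of constituents; `counterweight` must be one of them.
    """
    if counterweight not in base:
        raise ValueError("counterweight must be a constituent of the base sentence")

    # Focus movement: the counterweight leaves the clause and a copy stays behind.
    remnant = [TRACE if c == counterweight else c for c in base]
    focus = counterweight

    # Remnant movement to the topic position: in this flat model it only means
    # that the remnant will be linearized before the focused counterweight.

    # Relativization: the copy is lexicalized as a relativizer at the left edge.
    # Assumption of this sketch: if the base contains "only", the relativizer is
    # spelled out as "all" and "only" itself is not pronounced.
    relativizer = "all" if "only" in remnant else "what"
    clause = [c for c in remnant if c not in (TRACE, "only")]
    precopular = [relativizer] + clause

    # Copula insertion between the precopular clause and the counterweight.
    return precopular + [copula, focus]


def surface(constituents: List[str]) -> str:
    """Join constituents into a capitalized sentence string."""
    s = " ".join(constituents)
    return s[0].upper() + s[1:] + "."


# Base sentences modeled on (62a) and (62b): "only" in the matrix vs. embedded clause.
high = ["John", "only", "said that", "Mary", "wanted", "a pencil", "for her birthday"]
low = ["John", "said that", "Mary", "only", "wanted", "a pencil", "for her birthday"]

print(surface(derive_cleft(["I", "ate", "a salad", "for dinner"], "a salad")))
print(surface(derive_cleft(["I", "only", "ate", "a salad", "for dinner"], "a salad")))
print(surface(derive_cleft(high, "a pencil")) == surface(derive_cleft(low, "a pencil")))
```

In this toy model the base without only yields a wh-pseudocleft and the base with only yields an all-cleft, while the position of only in the base is not recoverable from the surface string. The two construals are thus distinguished only by their derivational source.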

Conclusion
I provided an analysis of all-clefts that reflects the special relationship with only. In earlier work, cleft structures have been analyzed in such a way that they consist of the same meaning components min and max as exclusives, but their informational status (at-issue vs. non-at-issue/presupposed) differs (Velleman et al. 2012; Büring & Križ 2013; Coppock & Beaver 2014). For all-clefts, I have argued that they have a different type of relationship with exclusives: both assert a "negative" meaning component about complement-exclusion (or exhaustivity) at-issue. This is reflected in the formal analysis in the sense that only is present in the process of deriving an all-cleft. This explains the close parallelisms between only and all-clefts that have been reviewed in this paper.
The internal syntactic structure of all-clefts turned out to be more complex than was previously assumed. In particular, by considering data in which the accompanying clause contains an embedded clause, effects of movement could be recognized: low and high construal readings of the smallness effect, and blocking effects of embedded negation. As a result, the analysis I put forward is different in important respects from Homer (2019), whose analysis based on superlative semantics and adjectival only cannot account for the rank-order readings of all-clefts.
This paper aimed to set out the most important empirical facts about all-clefts in English, and propose an analysis of the key theoretical issues relating to their syntactic and semantic structure. Consequently, various empirical and theoretical details could not be addressed here, and this leaves room for much future work. Not all syntactic details of the English construction could be addressed in this paper, and further work is in order on the semantics side as well, for instance on the feasibility of a compositional semantics in view of the diachronic facts, and on the focus-sensitivity of all-clefts. Experimental work should address some of the more subtle claims and judgments relating to all-clefts in English. All-clefts exist in various other languages, as was briefly touched upon in the introduction. Research on cross-linguistic variation in the domain of all-clefts is a promising avenue for further research, as it gives the opportunity to study the interface of syntax and semantics from a typological perspective.