Unifying the that-trace and anti-that-trace effects

This article proposes a unified analysis of the that-trace and anti-that-trace effects in English. Unification of these two seemingly diametrically opposed effects remains an outstanding problem. It is argued that complement and relative clauses in English exhibit systematic variation in terms of how articulated their C-domains are. This, combined with Spec-to-Spec Anti-Locality, leads to a novel analysis of the anti-that-trace and that-trace effects. The analysis has interesting theoretical implications for phase theory and the mechanics of successive cyclicity, particularly concerning the position of the phase escape hatch, which is claimed to be the specifier of the complement of the phase head, and not the specifier of the phase head as in standard phase theory.

(1) a. You said (that) Mary saw John. b. Who(m) did you say (that) Mary saw? c. Who did you say (*that) saw John?
(2) a. You said (that) Mary left. b. Who did you say (*that) left?
(3) a. I met the man (that/who(m)) Mary saw. b. I met the woman *(that/who) saw John.
The complementiser that is generally optional (at least in bridge verb contexts) when introducing a finite declarative clause, as in (1a). This optionality is also seen in A'-movement contexts, as in (1b) and (3a), where the direct object of the embedded clause has been questioned and relativised respectively. When the subject is A'-extracted, as in (1c), (2b) and (3b), the optionality of the complementiser vanishes.¹ However, in (1c) and (2b), that must be absent (the that-trace effect), whereas in (3b) that or a relative pronoun must be present (the anti-that-trace effect).

Clause structure
In this section, I argue that the optionality of that found in complement clauses and relative clauses (RCs) in English reflects systematic variation in terms of how articulated the C-domain of the clause is. I argue that clauses introduced by that, which I will call that-clauses and that-RCs, systematically project more syntactic structure in their C-domains than clauses without an overt complementiser (or relative pronoun), which I will call Ø-clauses and Ø-RCs (where Ø is a null complementiser). In other words, that and Ø are not morphophonological variants of a single C head (Weisler 1980; Doherty 1993, 2000; Bošković 1996, 1997, 2016; Grimshaw 1997; Douglas 2016).
¹ [...] languages; there seems to be genuine parametric variation on this point (Maling & Zaenen 1978; Rizzi 1982; Chacón et al. 2015). However, for there to be genuine parametric variation, the Primary Linguistic Data must contain robust and salient cues for the setting of whichever parameter is ultimately responsible for the presence/absence of the that-trace effect (see Chacón et al. 2015 for discussion of such cues). For those positing a parametric difference between different varieties of English, it is unclear what differences exist between English varieties that would lead an acquirer to set the relevant parameter(s) appropriately.
² There is a small class of (apparent) exceptions.
(i) There's a man sells bread at the market.
However, McCawley (1998: 460-463) observes that relatives in such "existential" contexts behave quite differently from normal restrictive relatives, e.g., in terms of extraction possibilities, co-occurrence with proper names and ability to insert parentheticals (Harris & Vincent 1980; Lambrecht 1988; Henry 1995; den Dikken 2005). I therefore do not consider these to be genuine Ø-relatives (pace Doherty 1993, 2000).
³ Anti-locality and economy accounts of the that-trace effect have been gaining currency (see, e.g., Roussou 1994, 2002; Ishii 1999, 2004; Pesetsky & Torrego 2001; Erlewine 2014; Bošković 2016, among many others), especially with the advent of Minimalism and the abandonment of the Empty Category Principle.
Evidence for differences in structural size between that-clauses and that-RCs on the one hand, and Ø-clauses and Ø-RCs on the other, comes from fronting.⁴ In brief, fronted material is permitted in the former but not in the latter (see Douglas 2016 for more detail and discussion). Turning first to complement clauses, observe that fronted adverbials are permitted in that-clauses (obligatorily following that on the embedded construal), as in (6) (Doherty 2000: 15 [...]). The contrast is even stronger in RCs. Fronted adverbials are permitted in both object and subject that-RCs and wh-RCs, i.e. RCs introduced by a relative pronoun, as in (8).
The contrast between that-clauses and that-RCs on the one hand, and Ø-clauses and Ø-RCs on the other, strongly suggests that that and Ø are not simply phonological variants of the same head. If they were, we would not expect a difference. I thus interpret these facts to mean that Ø-clauses and Ø-RCs actually lack the requisite structure to host such fronted elements, whilst that-clauses and that-RCs (and wh-RCs) do project enough structure to host such elements. Many analyses, assuming a C-domain with a single C head, adopt the conclusion that Ø-clauses and Ø-RCs actually lack the C-domain altogether (see, e.g., Weisler 1980; Doherty 1993, 2000; Bošković 1994, 1996, 1997; Grimshaw 1997; Ishii 2004). However, if the C-domain can be split (see Rizzi 1997 et seq.), Ø-clauses and Ø-RCs may have a C-domain, just one that is less articulated than the C-domain found in that-clauses and that-RCs.
Indeed, positing a less articulated C-domain rather than no C-domain at all seems preferable given the observation that that-RCs and Ø-RCs exhibit no discernible interpretive difference (Doherty 1993, 2000), i.e. both involve A'-movement to the C-domain thereby creating an A'-chain in the RC (Chomsky 1977). This is interpreted as lambda abstraction and turns a proposition into a predicate (Heim & Kratzer 1998). If Ø-RCs lacked a C-domain entirely, there would be no available target for A'-movement (at least on standard assumptions) and hence no clausal predicate could be formed (though see Doherty 1993, 2000; Bošković 1994, 1996, 1997 for alternative proposals).
I thus assume that Ø-clauses and Ø-RCs have a C-domain, but that there is only a single C head in such cases, spelled out with the null complementiser Ø. In contrast, I assume that that-clauses and that-RCs always have at least two C heads in their C-domains, with that spelling out the higher/highest C head. In fact, we have already seen evidence for multiple C heads in the C-domain of that-RCs in (8b), which involved negative adverbial preposing and subject-auxiliary inversion. In (8b), the inverted auxiliary occupies a low C head position, whilst that spells out a higher C head position. For exposition, I will refer to the single C head of Ø-clauses and Ø-RCs as Cᵒ, and to the higher and lower C heads of that-clauses and that-RCs as Forceᵒ and Finᵒ, respectively.⁵
There is an interesting question about whether Cᵒ is equivalent to Finᵒ (a truncation approach) or is rather a fusion of Forceᵒ and Finᵒ (a syncretic approach; see Giorgi & Pianesi 1997). On the truncation approach, Ø-clauses and Ø-RCs would effectively be FinPs, whilst their that-counterparts would be ForcePs. This is illustrated in (10). On this approach, we could straightforwardly say that Forceᵒ is spelled out as that and Finᵒ is null Ø. Furthermore, when that is absent, Forceᵒ is absent. If both that-clauses and Ø-clauses are phasal, as I will argue below, this approach would be consistent with the dynamic phase approach (see, e.g., Bobaljik & Wurmbrand 2005; Bošković 2014; Harwood 2015), according to which the phase head of the clause is simply the highest head that is projected.
On the syncretic approach, Ø-clauses and Ø-RCs are full CPs just like their that-counterparts, but the former project less syntactic structure than the latter. This could be understood in terms of Giorgi & Pianesi's (1997) Feature Scattering.
Applied to the C-domain, this idea says that the features of the C-domain may be located on a single (syncretic) syntactic head or scattered across a number of heads (always in a strict functional sequence as revealed by cartographic studies). In other words, the C-domain may be split or unsplit/fused, with a split C-domain projecting more syntactic structure than an unsplit/fused C-domain. According to this, the C-domain of Ø-clauses and Ø-RCs would be unsplit/fused, with the single Cᵒ head being null Ø, whilst the C-domain of their that-counterparts would be split, with the Forceᵒ head being spelled out as that. This is illustrated in (11).
Cartographic considerations arguably favour this syncretic approach. Research has revealed that relativisation typically targets a high position within a split C-domain, often taken to be SpecForceP (Rizzi 1997). This suggests that whatever properties or features are responsible for relativisation are associated with this high head Forceᵒ. If Ø-RCs have only a single head in their C-domain, this head must also be associated with the same properties/features that allow relativisation, which suggests that the single Cᵒ head must be syncretic. If Ø-RCs were truncated, i.e. lacked a Forceᵒ head and all Force-associated features entirely, we would have to say that relativisation in English could target SpecForceP or SpecFinP, the latter being unexpected from a cartographic perspective. I will not commit myself to either of these two approaches as the present analysis seems to be compatible with both, as far as I am aware. For this reason, I will continue to use the labels Forceᵒ, Finᵒ and Cᵒ in what follows.
Finally, I assume the so-called Matching Analysis of RCs (see, e.g., Munn 1994; Citko 2001; Salzmann 2006), according to which the noun modified by the RC (the RC head) is syntactically represented both internally and externally to the RC.
Importantly, the RC-external and RC-internal representations of the RC head are not related by movement, and only the RC-external representation is pronounced. In what follows, only the structure of the RC itself will be illustrated. In other words, the pronounced, RC-external representation of the RC head will not be shown in the labelled bracketing. Consequently, the representations of the RC head that are illustrated are all RC-internal and hence unpronounced. I will also assume that that in both that-clauses and that-RCs is a complementiser (see, e.g., Kayne 1994; Bianchi 1999; de Vries 2002; Bhatt 2002; Douglas 2016), though, as far as I can tell, the analysis would be unaffected if that were treated as a relative pronoun (Arsenijević 2009; Kayne 2014; Manzini 2014). What is crucial for the present article is the idea that wh-relative pronouns and that occupy the highest part of the clausal structure in a split C-domain, i.e. Forceᵒ or SpecForceP, whilst Ø spells out the single Cᵒ head of an unsplit/fused C-domain.
To summarise, I have argued that Ø-clauses and Ø-RCs are CPs, i.e. clauses where the C-domain contains a single C head called Cᵒ, as in (12), and that that-clauses and that-RCs (and wh-RCs) are ForcePs, i.e. clauses where the C-domain contains at least two C heads called Forceᵒ and Finᵒ, as in (13). I assume that Forceᵒ, which is ordinarily spelled out as that, is null when its specifier is overtly pronounced (see, e.g., Koopman 2000; Starke 2004; Neeleman & van de Koot 2006). Recall that in (13b) the representation of the RC head in SpecForceP is not pronounced, hence Forceᵒ is spelled out as that.

Analysis
In this section, I provide an analysis of the that-trace and anti-that-trace effects. We will turn first to the anti-that-trace effect. Given the structures we have concluded for RCs, we are now in a position to see more precisely what the anti-that-trace effect consists in. Consider (15) and (16). Recall that the DP in SpecForceP in (15) is not spelled out so a Doubly-filled Comp Filter violation does not arise. Furthermore, only the relative clause itself, enclosed in square brackets, is represented in the tree diagram, and I use traces for expository convenience.
(15) I met the man [that saw Mary].
These are (short) subject RCs and I assume that A'-extraction is from SpecTP (pace Rizzi & Shlonsky 2007). (15) shows that movement from SpecTP to SpecForceP is permitted, whilst (16) shows that movement from SpecTP to SpecCP (where CP immediately dominates TP) is not. There is nothing wrong with A'-extraction to SpecCP per se since all elements except the subject in (the highest) SpecTP can be relativised using Ø-RCs, as we saw in (12b) for example. Therefore, the problem seems to be that SpecTP is too close to SpecCP, i.e. movement from SpecTP to SpecCP is ruled out because it is anti-local (see also Bošković 2016). This falls under the type of anti-locality proposed by Erlewine (2016: 445).⁶
(17) Spec-to-Spec Anti-locality: A'-movement of a phrase from the Specifier of XP must cross a maximal projection other than XP.
(18) Definition (crossing): Movement from position α to position β crosses γ if and only if γ dominates α but does not dominate β.
According to (17), (16) is ungrammatical because A'-movement of the subject is from SpecTP to SpecCP, which only crosses the maximal projection TP. In contrast, (15) is fine because movement from SpecTP to SpecForceP crosses TP and FinP. See Section 4.1 for an attempt to derive this anti-locality condition.⁷ Importantly, this anti-locality condition only rules out movement to SpecCP from SpecTP. It does not rule out movement from lower positions. Other arguments can thus be relativised using either that-RCs or Ø-RCs. The structures for direct object that- and Ø-RCs are shown below. I assume that internal arguments transit through the edge of the v-domain on their way to the C-domain (Chomsky 2000, 2001, 2004, 2008). Recall that only the relative clause itself, enclosed in square brackets, is represented in the tree diagram.

(19) The man [that John saw] was tall.
(20) The man [John saw] was tall.
Furthermore, if the subject moves from a position lower than SpecTP, as in relativisation in expletive-associate constructions, anti-locality is not violated and a Ø-RC is grammatical.
(21) a. There was a man in the garden. b. The man [there was __ in the garden] was tall.
To summarise, anti-locality combined with the systematic variation in the structural size of that-RCs and Ø-RCs straightforwardly captures the anti-that-trace effect.
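The definitions in (17) and (18) are mechanical enough to be checked by a small program. The following is a minimal sketch, not part of the analysis itself: it assumes that a clause can be modelled as a top-down list of maximal projections (the name `spine` and both function names are hypothetical), and that "the Specifier of XP" can be identified by the label XP.

```python
from typing import List

# Assumption: a clause is modelled as a top-down list of maximal projections
# (a "spine"); "Spec of XP" is identified simply by the label XP.

def crossed(spine: List[str], source: str, target: str) -> List[str]:
    """Projections crossed by movement from Spec of `source` to Spec of `target`.

    Following (18), a projection is crossed iff it dominates the launching
    site but not the landing site. On a spine, these are the projections from
    `source` up to, but not including, `target`.
    """
    s, t = spine.index(source), spine.index(target)
    if t >= s:
        raise ValueError("the landing site must be higher than the launching site")
    return spine[t + 1 : s + 1]

def is_anti_local(spine: List[str], source: str, target: str) -> bool:
    """(17): movement from Spec of XP must cross a projection other than XP."""
    return all(p == source for p in crossed(spine, source, target))

THAT_RC = ["ForceP", "FinP", "TP", "vP"]  # split C-domain
ZERO_RC = ["CP", "TP", "vP"]              # single C head

print(is_anti_local(ZERO_RC, "TP", "CP"))      # True: subject Ø-RC, as in (16)
print(is_anti_local(THAT_RC, "TP", "ForceP"))  # False: subject that-RC, as in (15)
print(is_anti_local(ZERO_RC, "vP", "CP"))      # False: object Ø-RC, as in (20)
```

The list representation deliberately ignores everything except dominance along the clausal spine, which is all that (18) refers to.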
Assuming that this is correct, what implications does it have for an analysis of the that-trace effect? We first consider subject extraction from Ø-clauses, which is permitted. As argued above, Ø-clauses have a C-domain with a single C head Cᵒ. There are two possible derivations to consider, one of which is ruled out by anti-locality. In the following diagrams, the highest vP is the matrix vP and CP is the embedded Ø-clause. The diagrams show movement of the subject only as far as the edge of the matrix vP.

(22) Who did you think saw Mary?
[Tree diagrams (22a) and (22b) not shown.]
In (22a), the subject moves directly from SpecTP of the embedded clause to the matrix SpecvP without transiting through the edge of the embedded C-domain. In (22b), the subject moves via SpecCP. However, the first step of A'-movement from SpecTP to SpecCP violates the anti-locality condition. Therefore, anti-locality rules out the derivation in (22b), leaving (22a) as the only remaining derivation (see also, e.g., Ishii 1999, 2004; Erlewine 2014).
We turn now to subject extraction from that-clauses, which is ruled out.⁸ As argued above, that-clauses have a C-domain with at least two heads, Forceᵒ and Finᵒ. There are three basic options to consider (the highest vP is the matrix vP and ForceP is the embedded that-clause).
(23) *Who do you think that saw Mary?
[Tree diagrams (23a), (23b) and (23c) not shown.]
⁸ The that-trace (or rather, the whether-trace) effect is also found in embedded interrogatives, insofar as extraction is allowed at all.
(i) a. ?Who do you wonder whether John saw? b. *Who do you wonder whether left?
Unlike embedded declaratives, there is no Ø-clause option for embedded interrogatives. In other words, embedded interrogative clauses obligatorily have a split C-domain.
(23a) shows that an extracted subject cannot move directly from SpecTP of the embedded clause to SpecvP in the matrix clause across both FinP and ForceP. Such movement is too far, i.e. it violates locality. (23b) shows that it is impossible to move via SpecForceP. Recall that A'-extraction from SpecTP to SpecForceP is precisely what we proposed for short subject that-RCs, which are perfectly grammatical. Therefore, the problem with (23b) is not the movement to SpecForceP per se, but rather the nature of the landing site.
In other words, SpecForceP is a possible final landing site, as in short subject that-RCs, but an impossible intermediate landing site. Finally, (23c) is ruled out by anti-locality since the subject moves from SpecTP to SpecFinP crossing only TP. As above, if A'-movement is from a position lower than the embedded SpecTP, movement to SpecFinP does not violate anti-locality. Consequently, direct objects and other internal arguments can be extracted from that-clauses without problem.
(24) Who(m) do you think that Mary saw?
Furthermore, subjects moving from a position below SpecTP can also be extracted from that-clauses without violating anti-locality, hence they do not trigger a that-trace effect (see also Rizzi & Shlonsky 2007: 126).⁹
(25) a. You think that there was a man in the garden. b. Who do you think that there was __ in the garden?
To summarise, movement through the edge of the C-domain of the embedded clause does not target the very edge of the clause, but rather the specifier of the complement of the highest C head.
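The reasoning behind (22)-(24) can be sketched in the same style. This is an illustrative model under stated assumptions, not a definitive implementation: the embedded clause is again a top-down list of maximal projections, the phase head is taken to be the highest head of that clause, and the function names (`escape_hatch`, `can_exit`) are hypothetical. Only the three cases discussed in the text are covered; non-subject extraction from Ø-clauses would require further assumptions about the structure between vP and TP and is not modelled.

```python
from typing import List

def is_anti_local(spine: List[str], source: str, target: str) -> bool:
    """Spec-to-Spec Anti-locality over a top-down list of maximal projections."""
    s, t = spine.index(source), spine.index(target)
    if t >= s:
        raise ValueError("the landing site must be higher than the launching site")
    # Crossed projections: from `source` up to, but excluding, `target`.
    return all(p == source for p in spine[t + 1 : s + 1])

def escape_hatch(spine: List[str]) -> str:
    """Specifier of the complement of the phase head.

    Assumption: the phase head is the highest head of the embedded clause,
    so its complement is the second projection in the list.
    """
    return spine[1]

def can_exit(spine: List[str], source: str) -> bool:
    """Can an element in Spec of `source` leave the embedded clause?"""
    hatch = escape_hatch(spine)
    if source == hatch:
        return True  # already in the escape hatch, as in (22a)
    return not is_anti_local(spine, source, hatch)

THAT_CLAUSE = ["ForceP", "FinP", "TP", "vP"]
ZERO_CLAUSE = ["CP", "TP", "vP"]

print(escape_hatch(THAT_CLAUSE))    # FinP
print(can_exit(ZERO_CLAUSE, "TP"))  # True: subject extraction from a Ø-clause, (22)
print(can_exit(THAT_CLAUSE, "TP"))  # False: the that-trace effect, (23c)
print(can_exit(THAT_CLAUSE, "vP"))  # True: object extraction, (24)
```

On this model the that-trace effect and the grammaticality of (22) follow from the same two ingredients: where the escape hatch is, and whether the step to it is anti-local.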
An anonymous reviewer asks about A'-extraction from whether-clauses and tensed wh-islands. The reviewer correctly points out that whether-clauses and tensed wh-islands should be analogous to (24) in terms of the structural configurations involved, yet A'-extraction in such contexts is degraded or ungrammatical, as in (26).
I propose that Relativised Minimality (Rizzi 1990, 2013; Starke 2001) is responsible. According to Relativised Minimality, in the abstract configurations in (27), X can be related to Y by movement so long as Z's feature specification is a proper subset of X's. Hence (27a) is permitted, but (27b, c) are not (α and β are features).
On the reasonable assumption that wh-arguments are [wh,φ] and that wh-complementisers and wh-adverbs are [wh], the configuration in (27a) corresponds to (26a, b), the configuration in (27b) corresponds to (26c), and the configuration in (27c) corresponds to (26d).¹⁰ It is plausible to assume that the complementisers that and Ø do not have [wh] features. Consequently, extraction out of that- and Ø-clauses instantiates the abstract configurations in (28).
There is thus no Relativised Minimality effect in cases of extraction from that- and Ø-clauses, but there is such an effect in cases of extraction from whether-clauses and wh-islands, hence the difference between (24) and (26).
⁹ I assume that pro-drop languages, which do not exhibit the that-trace effect, involve subject extraction from a position below SpecTP more generally (Rizzi 1982).
¹⁰ Note that (26a, b) are not fully grammatical as the configuration in (27a) would predict. This is arguably due to the fact that Z in (27a) is a non-empty, proper subset of X and Y in featural terms. Relativised Minimality straightforwardly permits the relation between X and Y based on the [β]-feature, but would not permit this relation based on the [α]-feature. This means the result is not ungrammatical, but not fully grammatical either.
Before moving on to the discussion, let us summarise the main claims so far:
(29) a. The anti-that-trace effect derives from Spec-to-Spec Anti-locality. b. SpecForceP can be a final but not an intermediate landing site. c. Successive cyclic movement does not proceed through the very edge of the C-domain, rather it targets the specifier of the complement of the highest C head. Putting this in phase theoretic terms, the escape hatch is not the specifier of the phase head, but rather the specifier of the complement of the phase head.
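The Relativised Minimality condition invoked above for whether-clauses and wh-islands amounts to a proper-subset test on feature sets. The sketch below is illustrative only and rests on assumptions that go beyond the text: features are plain strings, the complementisers that and Ø are modelled as contributing no relevant features at all (the text says only that they lack [wh]), and the three-way outcome ("fine", "degraded", "blocked") is one way of encoding the observation about non-empty proper subsets in footnote 10. The function name `minimality` is hypothetical.

```python
from typing import FrozenSet

def minimality(x: FrozenSet[str], z: FrozenSet[str]) -> str:
    """Classify movement of X across a potential intervener Z.

    X may be related to its target across Z only if Z's features are a
    proper subset of X's. A non-empty proper subset is treated as
    degraded rather than fully grammatical (an encoding assumption).
    """
    if not z < x:  # `<` on sets is the proper-subset test
        return "blocked"
    return "degraded" if z else "fine"

WH_ARGUMENT = frozenset({"wh", "phi"})  # wh-arguments: [wh, phi]
WH_ONLY = frozenset({"wh"})             # wh-complementisers and wh-adverbs: [wh]
NO_WH = frozenset()                     # assumption: that/Ø bear no relevant features

print(minimality(WH_ARGUMENT, WH_ONLY))  # degraded
print(minimality(WH_ONLY, WH_ONLY))      # blocked
print(minimality(WH_ONLY, WH_ARGUMENT))  # blocked
print(minimality(WH_ARGUMENT, NO_WH))    # fine
```

The last call corresponds to extraction across that or Ø, where no minimality effect is expected.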
In the next section, we will discuss the theoretical consequences of these claims and attempt to derive Spec-to-Spec Anti-locality.

Anti-locality
To account for the anti-that-trace effect, we adopted Erlewine's (2016) Spec-to-Spec Anti-locality condition, repeated below.
(30) Spec-to-Spec Anti-locality A'-movement of a phrase from the Specifier of XP must cross a maximal projection other than XP.
However, as it stands, this condition is more a descriptive generalisation than an explanation. Bošković (2016) proposes to subsume it under a more general theory of labelling. In essence, he proposes that movement must cross a labelled projection. Following suggestions in Chomsky (2013, 2015), Bošković proposes that there are essentially two types of labelling. The first type is labelling when a minimal and non-minimal projection merge, i.e. a head and its complement. This type of labelling is argued to take place in the syntax as soon as the relevant configuration arises. The second type is labelling when two non-minimal projections merge, i.e. two phrases. There are two options: (i) some shared feature of the two phrases projects; or (ii) one of the phrases must move (traces/copies not being relevant to labelling). In this latter case, Bošković argues that labelling does not (and cannot) take place immediately. Instead, there is indeterminacy and the structure is, at that point of the derivation, syntactically unlabelled. Bošković proposes that movement counts as anti-local when an element crosses only an unlabelled projection.
Thus consider Bošković's analysis of the anti-that-trace effect in (31), which is configurationally identical to ours. First, the subject DP merges with TP. Bošković claims that this cannot be labelled by any shared feature(s), so one of the phrases must move. Therefore, at this point in the derivation, the structure is unlabelled (indicated by ?P), as in (31a). The next head is then merged. In Ø-RCs, using our terminology, this head is Cᵒ. Because Cᵒ is a head, it can label the structure immediately via the first type of labelling, as in (31b). If the subject DP internally merges with CP, it will only cross the unlabelled phrase ?P and will thus qualify as anti-local according to Bošković, as in (31c). Note that the subject DP is internally merging with CP; it is not crossing CP.
In that-RCs, on the other hand (and still using our terminology), Finᵒ is merged with ?P, projecting FinP. Forceᵒ is then merged and projects ForceP. The subject DP then internally merges with ForceP crossing not only ?P but FinP as well. This involves crossing a labelled projection so this movement is not anti-local. This is illustrated in (32).
Bošković's approach thus derives the effects of Spec-to-Spec Anti-locality. However, it is unclear why, when the subject DP merges with TP, the resulting structure cannot be labelled with the φ features that the subject and T(P) are typically assumed to share. Indeed, this is Chomsky's (2013, 2015) assumption, although Chomsky also assumes that such labelling results in the subject being frozen, thereby deriving the Subject Criterion (Rizzi 2006; Rizzi & Shlonsky 2007). However, as Bošković (2016) points out, the Subject Criterion is too strong. If it were impossible to move subjects from SpecTP at all, we would not expect the well-known adverb effects, where that-trace violations are alleviated (Culicover 1992, 1993; Browning 1996). I will return to the adverb effect in Section 5.
I thus propose that the subject DP and TP project a φ label from their shared φ-features without this resulting in a subject freezing effect. According to this view, in the case of Ø-RCs, Cᵒ would effectively take φ as its complement. The result would be labelled CP, as in (33a). The subject DP then attempts to internally merge with CP, yielding the configuration in (33b). Now, recall that the φ label came (in shared fashion) from the subject DP. Therefore, as far as the system is concerned, Cᵒ has merged with the same φ twice: once as its complement and again as its specifier. In other words, this configuration resembles the illicit movement of a complement to the specifier of the same projection, i.e. we have a (derived) form of Comp-to-Spec Anti-locality, which is itself derived from Economy (Abels 2003, 2012a).
Note that internally merging a non-subject DP with CP will not trigger any problem because the φ in Cᵒ's complement and the φ in Cᵒ's specifier would come from different arguments. In that-RCs, Finᵒ takes φ as its complement and projects FinP. Forceᵒ is then merged and projects ForceP. The subject DP then internally merges with ForceP. There is thus no derived Comp-to-Spec Anti-locality and the result is grammatical, as in (34).
To summarise, I have proposed that Spec-to-Spec Anti-locality is a (derived) form of Comp-to-Spec Anti-locality arising as a consequence of φ labelling. Given that φ-agreement is typically related to A-positions, this type of anti-locality will only affect the first step of movement from an A-position, namely the first step of A'-movement.
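The φ-labelling proposal can also be stated as a simple check on what a head has merged with. The sketch below is a toy illustration, not a claim about how the computational system is implemented: the `Phrase` record, its `label_source` field and the function name are all hypothetical, and the only thing recorded about a φ-labelled phrase is which DP contributed the shared φ-features.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Phrase:
    label: str
    # Assumption: for a phi-labelled phrase, we record which DP supplied the
    # shared phi-features; other phrases leave this as None.
    label_source: Optional[str] = None

def derived_comp_to_spec(complement: Phrase, moved_dp: str) -> bool:
    """True if a head would merge with the 'same' phi twice.

    This holds when the head's complement is phi-labelled by the very DP
    that is now internally merging as the head's specifier.
    """
    return complement.label == "phi" and complement.label_source == moved_dp

phi_p = Phrase(label="phi", label_source="subject")  # [subject DP + TP]
fin_p = Phrase(label="FinP")                         # Fin + phi-labelled phrase

# Ø-RC: C takes the phi-labelled phrase as its complement.
print(derived_comp_to_spec(phi_p, "subject"))  # True: subject Ø-RC is ruled out
print(derived_comp_to_spec(phi_p, "object"))   # False: non-subject Ø-RC is fine

# that-RC: Force takes FinP as its complement.
print(derived_comp_to_spec(fin_p, "subject"))  # False: subject that-RC is fine
```

The check makes explicit why only the subject is affected: it is the only DP whose φ-features label the complement of Cᵒ.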

The phase escape hatch
Our adoption of the Spec-to-Spec Anti-locality analysis of the anti-that-trace effect suggested a novel analysis of the that-trace effect. This novel analysis involved the claim that (i) SpecForceP can be a final but not an intermediate landing site, and (ii) the phase escape hatch is not the specifier of the phase head, but rather the specifier of the complement of the phase head. These claims depart from the standard claim that successive cyclic movement takes place via the phase edge, i.e. through the specifier of the phase head (Chomsky 2000, 2001, 2004, 2008), though note that phase theory is typically associated with skeletal phrase structures, meaning that the phase escape hatch can only be the specifier of the phase head. The cartographic enterprise deals in much more fine-grained phrase structures, which raises the question of how phase theory and cartography fit together (see also, e.g., Chomsky 2001; Rizzi 2004; Shlonsky 2010; Biberauer & Roberts 2015). The exact location of the phase escape hatch thus becomes an empirical problem, albeit one that is difficult to test.
Evidence in favour of the standard characterisation of the phase escape hatch comes from quantifier float in West Ulster English (McCloskey 2000). In this variety, floated quantifiers may appear at the edge of an embedded clause through which an A'-extracted element has moved. Crucially, in cases of extraction from a that-clause, the floated quantifier appears to the left of that, as in the following examples (McCloskey 2000: 61).
(35) a. What did he say all (that) he wanted t? b. Where do you think all they'll want to visit t? c. Who did Frank tell you all that they were after t? d. What do they claim all (that) we did t?
In our terms, this suggests that successive cyclic movement takes place to the left of Forceᵒ, i.e. through SpecForceP, and not to its right, i.e. not through SpecFinP. However, the evidence is not as conclusive as it may first appear (see also Bobaljik 2003;Koopman 2009). Assuming punctuated paths (as in standard phase theory), evidence from A-movement shows that floated quantifiers can appear in positions through which the DP has not moved, as in (36).
According to standard phase theory, the subject DP moves from a position internal to the v-domain directly to SpecTP. The variety of positions in which all can appear is thus unaccounted for. Furthermore, in (36), the DP subject is standardly assumed to have originated as an internal argument, yet floated quantifiers are impossible in such positions. The presence of all to the left of that thus cannot be taken as conclusive evidence for movement through SpecForceP and not through SpecFinP.
Evidence in favour of our non-standard characterisation of the phase escape hatch comes from cases where overt C-elements appear to the left of the position through which successive cyclic movement has occurred. One such case comes from Dinka.¹¹
(37) Dinka [...]
Van Urk & Richards provide several arguments for the idea that successive cyclic movement proceeds via the position marked by the underscore (__). Importantly, this position can be preceded by an overt complementiser element (in bold). This is unexpected if the phase escape hatch were at the very edge of the embedded clause. Instead it suggests that the phase escape hatch may actually be slightly lower. I thus take the empirical evidence currently available to be inconclusive: it does not rule out the standard phase theoretic approach, but equally it does not rule out our non-standard approach.
Interestingly, our approach is compatible with previous ideas concerning motivations for CP recursion. Following Watanabe (1992), Browning (1996) proposes that CP recursion (in our terms, the splitting of the C-domain) is motivated by clause-typing (Cheng 1991). In essence, wh-clauses must have a wh-phrase in their specifier, whilst non-wh-clauses must have no specifier. This is a derivational property so that if, at some point in the derivation, a non-wh-clause has a filled specifier, CP recursion will be triggered. This creates another (higher) C-projection without a specifier. In this way, the embedded clause is not typed as a wh-clause. Although Browning does not seem to note this, her approach implies that intermediate wh-movement cannot be through the very highest specifier position of an embedded declarative clause (otherwise that clause would be typed as a wh-clause). We thus have an implicit precedent based on selection and clause-typing for the idea that the phase escape hatch is the specifier of the phase head's complement.
This analysis claims that SpecForceP in that-clauses cannot be an intermediate landing site and that the phase escape hatch in such cases is SpecFinP. What about Ø-clauses? Recall that we concluded that A'-extraction of subjects takes place from SpecTP directly into the matrix clause (see also, e.g., Ishii 1999, 2004; Erlewine 2014). There are two possible reasons for this. The first is that, being in SpecTP, the subject is already effectively in the phase escape hatch of a Ø-clause (SpecTP being the specifier of the complement of the phase head Cᵒ). The second is to say that Cᵒ is not a phase head at all. These options can be teased apart using A'-extraction of non-subjects. If Cᵒ is not a phase head, we would predict that non-subjects could move directly into the matrix clause without having to transit through the edge of a Ø-clause. However, if Cᵒ is a phase head, we would predict that non-subjects will transit through SpecTP (one that is higher than the specifier occupied by the subject). Evidence from reconstruction effects, as in (38), suggests that Cᵒ is a phase head.¹²
(38) a. *You told the girlsᵢ that/Ø Peter likes these pictures of each otherᵢ. b. Which pictures of each otherᵢ did you tell the girlsᵢ that/Ø Peter likes?
In (38a), each other cannot be bound by girls: the anaphor is too far from its antecedent. However, if the phrase containing the anaphor undergoes A'-extraction, binding by girls becomes possible, as in (38b). This suggests that there is an intermediate landing site at the edge of the embedded clause where each other is c-commanded by girls but not by Peter, and so can be bound by girls. Crucially, the result is the same regardless of whether the embedded clause is a that-clause or a Ø-clause. If the Ø-clause were non-phasal, we would predict that A'-extraction would move which pictures of each other from the edge of the embedded v-domain directly to the edge of the matrix v-domain. In this case, each other would never be in a position where it is c-commanded by girls but not by Peter, hence we would incorrectly predict that binding fails even under A'-extraction. Since this is not the case, we conclude that, if reconstruction in intermediate positions is evidence for the phasal status of that-clauses, then Ø-clauses must be phasal too.¹³

Summary
Recall the main components of our analysis from Section 3, repeated below.
(39) a. The anti-that-trace effect derives from Spec-to-Spec Anti-locality. b. SpecForceP can be a final but not an intermediate landing site. c. Successive cyclic movement does not proceed through the very edge of the C-domain, rather it targets the specifier of the complement of the highest C head.
In this section, we derived Spec-to-Spec Anti-locality (39a) from labelling, claiming that the computational system effectively treats it as a derived Comp-to-Spec Anti-locality violation. We also claimed that the phase escape hatch is not the specifier of the phase head (as in standard phase theory), but rather the specifier of the complement of the phase head (39b, c). We noted that the question of the exact position of the phase escape hatch is not often addressed and, indeed, only makes sense when one adopts more articulated syntactic structures (unlike standard phase theory). We saw that empirical evidence is inconclusive in deciding between our idea and the standard idea of phase escape hatches, but noted that our idea fits well with previous suggestions concerning the motivation for CP recursion based on selection and clause-typing.
(40) Lee forgot which dishes Leslie had said that *(under normal circumstances) should be put on the table.
According to our analysis, A'-extraction of a subject out of a that-clause is ruled out because the subject attempts to A'-move from SpecTP to the phase escape hatch in SpecFinP, a movement step which is anti-local. SpecFinP is the phase escape hatch in such contexts because it is the specifier of the complement of the phase head Forceᵒ. We thus predict that there are two possible ways to avoid this problem: either we insert syntactic material between TP and FinP, or we insert material above FinP but below ForceP. In the former case, SpecFinP is the escape hatch but the subject can now move there without violating anti-locality, whilst in the latter case, SpecFinP would no longer be the escape hatch since it would no longer be the specifier of the complement of the phase head Forceᵒ, and A'-moving the subject to the escape hatch would no longer be anti-local. We saw in Section 2 that Ø-clauses and Ø-RCs do not permit fronted adverbials. If these contain a C-domain with a single C head, we conclude that fronted adverbials cannot be inserted directly above TP. For that-clauses, this means that fronted adverbials do not occur between TP and FinP, but rather between FinP and ForceP. This is consistent with Rizzi's (2001, 2004) proposal that fronted adverbials occupy SpecModP, where ModP occurs between ForceP and FinP. I assume that ModP is only present when it hosts a fronted adverbial in its specifier and that its presence requires a split C-domain to ensure that it is positioned between Forceᵒ and Finᵒ. 14
(41) [ForceP Forceᵒ [ModP Modᵒ [FinP Finᵒ [TP Tᵒ …]]]]
We thus conclude that the adverbs which participate in the adverb effect are located in SpecModP. If fronted adverbials are present, the phase escape hatch is SpecModP since this is now the specifier of the complement of the phase head Forceᵒ. Subjects can move from SpecTP to SpecModP (one that is higher than the one hosting the fronted adverbial) without violating anti-locality. Adverbs lower than TP are correctly predicted to have no alleviation effect since they are too low to affect the distance moved from SpecTP, as shown in (42) (Rizzi 1997: 311; Brillman & Hirsch 2015). 15
(42) *Whoᵢ did she say that tᵢ hardly speaks to her?
In (42), hardly is below TP. Consequently, the phase escape hatch is still SpecFinP and movement from SpecTP to SpecFinP is still anti-local.
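The configurational reasoning above can be made explicit with a small illustrative sketch in Python. It is not part of the analysis itself: the function names (`escape_hatch`, `subject_extraction`) and the encoding of a clause as a top-down list of head labels are hypothetical conveniences. The sketch assumes, following the text, that the highest C head is the phase head, that the escape hatch is the specifier of its complement, and that movement from SpecTP to the specifier of the projection immediately dominating TP is anti-local.

```python
# Illustrative sketch only: the encoding and all names are assumptions,
# not the paper's formalism. A clause is a top-down list of head labels,
# e.g. ["Force", "Fin", "T"]; the first head is taken to be the phase head.

def escape_hatch(spine):
    """Return the head whose specifier is the escape hatch, i.e. the
    head of the complement of the phase head."""
    return spine[1]

def subject_extraction(spine):
    """Classify A'-extraction of a subject that sits in SpecTP."""
    hatch = escape_hatch(spine)
    if hatch == "T":
        # O-clause: SpecTP is itself the escape hatch.
        return "ok: subject is already in the escape hatch"
    # Spec-to-Spec Anti-locality: moving from SpecTP to the specifier of
    # the projection immediately dominating TP is too short.
    if spine.index("T") - spine.index(hatch) == 1:
        return "anti-local"
    return "ok: movement to Spec" + hatch + "P crosses a full projection"

print(subject_extraction(["C", "T"]))                    # O-clause
print(subject_extraction(["Force", "Fin", "T"]))         # that-clause
print(subject_extraction(["Force", "Mod", "Fin", "T"]))  # that-clause with fronted adverbial
```

Under these assumptions, the bare that-clause spine is the only one classified as anti-local, which mirrors the contrast between the that-trace effect and its alleviation by fronted adverbials. Adverbs below TP, as in (42), are not represented in the list and so leave the result for the that-clause unchanged.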
As things stand, our analysis predicts that any material between ForceP and FinP will alleviate the that-trace effect. However, it is well-known that fronted arguments have no such effect, as in (43) (Browning 1996; Rizzi 1997, 2004). 16
(43) a. Adapted from Boeckx & Jeong (2004: 84) *Who did you say that to Sue introduced Bill?
b. Adapted from Koizumi (1995: 140) *Who did you say that to Aaron will give these books?
If the fronted argument occupies a position between ForceP and FinP (Rizzi 1997), it is unclear why A'-extraction of subjects from that-clauses is not alleviated for the same reason that it is with fronted adverbials. 17 However, fronted arguments are standardly claimed to actually prevent otherwise licit A'-extraction, as in (44); this is the topic island effect (Haegeman 2012: 116 and references therein).
b. Koizumi (1995: 140) *Which books did Becky say that to Aaron she will give?
15 Though see Levine & Hukari (2006: Ch. 2) for potential counterexamples.
16 Though see Culicover (1992: 98, fn. 1) for potential counterexamples.
17 As Rizzi (2014), for example, points out, this difference between fronted adverbials and fronted arguments suggests that the that-trace effect cannot be (or at least cannot only be) a PF-effect. Evidence for the syntactic nature of that-trace effects comes from their presence at LF (Kayne 1981; Rizzi 1982).
The following examples of embedded multiple interrogatives exhibit a subject-object asymmetry (Kayne 1981: 322).
(i) ?I know perfectly well which man said that he/I was in love with which girl.
(ii) *I know perfectly well which man said that which girl was in love with him/me.
According to Kayne, which girl must undergo movement at LF. In (i), this movement is from an oblique position, whilst in (ii) it is from subject position. The contrast between (i) and (ii) suggests a that-trace effect violation in (ii) even though which girl is not overtly displaced. In current theory, we could say that there is movement but spell-out of the lower copy. If this is a that-trace effect, we predict that removing that should improve acceptability, which seems to be true (Ian Roberts p.c.), though the judgement is admittedly delicate.
(iii) ?I know perfectly well which man said which girl was in love with him/me.
I assume that fronted arguments are in SpecTopP (Rizzi 1997), ignoring the distinction between topic and focus. Following Rizzi (2004), I also assume that TopP is higher than ModP (TopP is only present when there is a fronted argument in its specifier). The topic island effect suggests that SpecTopP is incapable of serving as an intermediate landing site. Therefore, in a configuration like (45), even though SpecTopP is the phase escape hatch in virtue of being the specifier of the complement of the phase head Forceᵒ, it cannot fulfil this function. One possibility is that SpecTopP is a bona fide criterial position. Any element in SpecTopP is frozen there and interpreted as a topic. SpecTopP therefore cannot function as an intermediate landing site. By assumption, SpecModP is not a criterial position and hence can serve as a phase escape hatch in configurations like (41). Alternatively, SpecTopP may be able to serve as an intermediate landing site, but the topic constituent in the lower SpecTopP acts as an intervener to any constituent being A'-extracted across it.
Finally, Haegeman (2003: 644) observes that long-distance fronted adverbials do not alleviate that-trace violations, as in (46c, d), unlike short-distance fronted adverbials, as in (46a, b).
(46) a. *This is the linguist who I think that t will get appointed in Geneva.
b. This is the linguist who I think that next year t will get appointed in Geneva.
c. *This is the linguist who I think that t expects that all his students will have a job.
d. *This is the linguist who I think that next year t expects that all his students will have a job.
Haegeman (2000, 2003) argues that long-distance fronted adverbials are scene-setters and more akin to topics. If so, they can be assimilated to the topic cases rather than to those involving short-distance fronted adverbials.

The v-domain
Two anonymous reviewers ask what implications our analysis has for the phasal v-domain. Given our configurational analysis of the that-trace effect, we might expect similar effects to occur at the boundaries of other phasal domains as well, including the v-domain. I tentatively suggest that this is correct. The evidence comes from double object constructions, as in (47).
(47) John gave Mary a book.
All varieties of English permit A'-extraction of the theme argument (48a), but there appears to be variation in how acceptable A'-extraction of the recipient argument is (48b) (Hornstein & Weinberg 1981;Holmberg et al. 2015).
(48) a. What did John give Mary?
b. (?/*) Who(m) did John give a book?
18 Rizzi (2004: 242) proposes the following articulated C-domain:
(i) Force … Top* … Int … Top* … Focus … Mod* … Top* … Fin … IP
However, since we are not concerned with Int (the position for higher wh-phrases) or Focus, and since topics cannot follow foci in English, we adopt the simplified structure in (45).
Furthermore, all varieties of English permit passivisation of the recipient argument (49a), but there is variation in how acceptable passivisation of the theme argument is (49b) (Haddican 2010; Haddican & Holmberg 2012; Biggs 2014; Holmberg et al. 2015).
(49) a. Mary was given a book. b. (?/*) A book was given Mary.
Crucially, whilst all varieties allow A'-extraction of the theme argument from a recipient-passive (50a), no variety allows A'-extraction of the recipient argument from a theme-passive (50b), even if the variety allows theme-passives, and this seems to be true cross-linguistically (Holmberg et al. 2015).
(50) a. What was Mary given? b. *Who(m) was a book given?
If we assume that recipient arguments are introduced by an applicative head between VP and vP (Holmberg et al. 2015), and that Voiceᵒ is present in passive contexts but not in active contexts and is higher than vP (following Harwood 2015), the articulated v-domain of a theme passive would be as in (51).
Taking Voiceᵒ to be the phase head, A'-extraction of the recipient argument would be forced to proceed through SpecvP according to our analysis, but A'-movement from SpecApplP to SpecvP would be anti-local, hence the ungrammaticality of (50b). In active contexts, Voiceᵒ is absent. The phase head is vᵒ and A'-extraction from SpecApplP is permitted. This also accounts for those speakers who find (48b) acceptable, though admittedly it leaves the degradedness reported by other speakers somewhat puzzling (see Holmberg et al. 2015 for a different proposal). Finally, it captures (48a) since the theme argument moves from the complement of V to the escape hatch in SpecApplP, which does not violate anti-locality. Though tentative, the configurational parallels between A'-extraction of recipient arguments from theme-passives and A'-extraction of subjects from that-clauses are intriguing. However, further discussion of the v-domain goes beyond the scope of this paper so I leave this for future research.
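The v-domain reasoning can be sketched in the same configurational terms. The following Python fragment is illustrative only: the function name `extract`, the list encoding of the v-domain, and the treatment of an argument already sitting in the escape-hatch projection as licit (by analogy with subjects of Ø-clauses) are assumptions made for exposition, not claims of the analysis.

```python
# Illustrative sketch only: the encoding and all names are assumptions.
# spine: top-down list of heads in the v-domain; the first head is taken
# to be the phase head (Voice in passives, v in actives).

def extract(spine, launch_kind, launch_head):
    """Classify A'-extraction of an argument from the v-domain.

    launch_kind is "spec" or "comp"; launch_head names the head whose
    specifier or complement hosts the argument before movement.
    """
    hatch = spine[1]  # head of the complement of the phase head
    if launch_kind == "spec" and launch_head == hatch:
        # Assumed parallel to subjects in O-clauses.
        return "ok: already in Spec" + hatch + "P"
    gap = spine.index(launch_head) - spine.index(hatch)
    if launch_kind == "spec" and gap == 1:
        # Spec-to-Spec movement across a single projection is anti-local.
        return "anti-local"
    return "ok"

theme_passive = ["Voice", "v", "Appl", "V"]
active = ["v", "Appl", "V"]

print(extract(theme_passive, "spec", "Appl"))  # recipient, as in (50b)
print(extract(active, "spec", "Appl"))         # recipient, as in (48b)
print(extract(active, "comp", "V"))            # theme, as in (48a)
```

On this encoding, only extraction of the recipient from a theme-passive is classified as anti-local, matching the pattern the text describes for (50b) as against (48a, b).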

Subject-object asymmetries in topicalisation and wh-questions
We have argued that anti-locality provides an account of the subject-object asymmetry of the that-trace and anti-that-trace effects. This raises the question of whether this analysis can be extended to other subject-object asymmetries in English, and eventually beyond. 19 In English, matrix wh-questions and topicalisation exhibit subject-object asymmetries. However, I argue that, whilst our anti-locality analysis extends to topicalisation, it does not extend to matrix wh-questions. For a detailed discussion of anti-locality in relativisation, topicalisation and wh-questions in English, I refer the reader to Douglas (2016; see also Bošković 2016), but I will briefly review some of the arguments here.
Turning first to the subject-object asymmetry in English topicalisation, it is standardly claimed that short subject topicalisation is impossible in English (see, e.g., Lasnik & Saito 1992;Bošković 2016).
(52) a. John thinks that Bill, Mary would never love. b. *John thinks that Mary, would never love Bill.
Whilst object topicalisation is possible, as in (52a), subject topicalisation is not, as in (52b). Like Bošković (2016), I analyse this as topicalisation targeting the lowest C head of a split C-domain (see Douglas 2016), which in the case of subjects moving from SpecTP would trigger an anti-locality violation, directly analogous to the anti-that-trace effect. However, I also observe that short subject topicalisation becomes acceptable if there is material intervening between the topic and SpecTP, as in (53).
(53) John thinks that Mary, under no circumstances would ever love Bill.
When adverbial material is present, topicalisation targets a higher C position and movement from SpecTP to this higher position is no longer anti-local. Evidence that Mary in (53) is in a topic position rather than the subject position comes from expletive-associate constructions. Expletive there cannot be topicalised and can only appear in SpecTP. The adverbial phrase under no circumstances can only precede there, as in (54), showing that the adverbial in (53) is higher than SpecTP. Consequently, Mary in (53) is not in SpecTP.
(54) a. *John thinks that there, under no circumstances would ever be a woman who loves Bill. b. John thinks that under no circumstances would there ever be a woman who loves Bill.
I thus conclude that our anti-locality analysis can account for the subject-object asymmetry in English topicalisation (see Douglas 2016 for further discussion).
We will now turn to English matrix wh-questions. The subject-object asymmetry here is illustrated in (55).
(55) a. Who(m) did John see? b. Who saw Mary?
In object wh-questions, as in (55a), the wh-phrase is fronted to the left periphery and there is subject-auxiliary inversion. In subject wh-questions, as in (55b), there is no subject-auxiliary inversion and it is not immediately clear whether the wh-phrase has been fronted or not. We believe that there are strong conceptual and empirical arguments for thinking that wh-subjects do move to the left periphery. Conceptually, given the ample and salient evidence for wh-movement of all non-subject wh-phrases in English, the simplest hypothesis is that wh-subjects behave in the same way (see, e.g., Cheng 1991; Rizzi 1996, 1997; Trotta 2004, among many others). This is supported by a range of empirical arguments. I will illustrate here using echo interpretations and wh-the hell licensing.
Leaving aside instances of multiple wh-questions, an in-situ wh-phrase results in an echo interpretation in English, not a matrix wh-question. 20 If wh-subjects were always in-situ, we would only expect them to have echo interpretations. This is incorrect: wh-subjects can have matrix wh-question or echo interpretations. An in-situ wh-phrase in an embedded question also yields an echo interpretation.
Note that it is possible for an overt complementiser to appear in (57b) under the echo interpretation. Under a non-echo interpretation, however, overt complementisers are impossible. This can also be seen with wh-subjects (Trotta 2004: 4).
(58) a. Bill didn't say that/whether/if John would arrive first.
b. Bill didn't say that/whether/if who would arrive first? (echo)
c. *Bill didn't say that/whether/if who would arrive first. (non-echo)
The wh-subject in (58b) is in-situ, hence it can appear with an overt complementiser and receives an echo interpretation. In (58c), however, if we have the non-echo interpretation, overt complementisers are impossible. This suggests the wh-subject has moved. Evidence from wh-the hell licensing yields the same result. Wh-the hell questions are only licensed if the wh-phrase undergoes wh-movement, i.e. they are not licensed in-situ (Ginzburg & Sag 2001; Pesetsky & Torrego 2001). As can be seen in (61), wh-the hell questions are also licensed with wh-subjects. This would be unexpected if wh-subjects were always in-situ. When a wh-subject is forced to be in-situ, e.g. in an embedded context with an overt complementiser, as in (61c), we can see that wh-the hell questions are not licensed.
(61) a. Who the hell would have seen Bill? b. I wondered who the hell would have seen Bill. c. *I wondered whether/if who the hell would have seen Bill.
These considerations, among others, strongly suggest that wh-subjects undergo movement to the left periphery in English. Therefore, according to our analysis, this movement cannot be anti-local. This is also Bošković's (2016) conclusion. However, Bošković claims that anti-locality is avoided because in wh-contexts subjects move from a position lower than SpecTP. In other words, according to Bošković, in Ø-RCs and matrix subject questions, the clause has only a single head in the C-domain whose specifier is targeted by A'-movement. In Ø-RCs, subjects move from SpecTP and so violate anti-locality, resulting in the anti-that-trace effect, whilst in matrix subject questions, subjects move from below SpecTP and so do not violate anti-locality. However, I would like to propose that wh-subjects do move from SpecTP and do not violate anti-locality because they target a higher C position in a split C-domain. It is often argued that wh-phrases move to SpecFocP (see, e.g., Rizzi 1997, 2004; Haegeman 2012, among many others), where FocP is higher than FinP and ModP. Independent evidence for this comes from the fact that wh-phrases target a position to the left of fronted negative adverbials (recall that we are assuming that fronted adverbials are in SpecModP).
(63) a. What under no circumstances would you ever consider buying? b. Who under no circumstances would ever consider buying that dress?
Since fronted adverbials are impossible in Ø-RCs, these data show that wh-phrases in matrix wh-questions do not target the same position as relativisation in Ø-RCs. I therefore conclude that the subject-object asymmetry in English matrix wh-questions concerns the distribution of T-to-C movement and not wh-movement. Our analysis thus suggests that relativisation and topicalisation pattern together, to the exclusion of wh-questions, in being sensitive to anti-locality (see also Kuno 1976; Abels 2012b; Douglas 2016).

Conclusion
The major contribution of this paper lies in the proposal of a novel and unified analysis of both the that-trace and anti-that-trace effects in English, a long-standing and recalcitrant problem in the literature. I have argued that these effects arise from Spec-to-Spec Anti-locality interacting with systematic variation in the degree of articulation of the C-domain in clauses and RCs with and without that. This offers a new perspective on the mechanics of phases and successive cyclicity. Our analysis suggests that the phase escape hatch is the specifier of the complement of the phase head rather than the specifier of the phase head itself, contrary to standard assumptions. Finally, we argued that this anti-locality analysis can be extended to the subject-object asymmetry in English topicalisation, but should not be extended to the subject-object asymmetry in English matrix wh-questions. This indicates that subject-object asymmetries are not homogeneous, but their typology must for now be left as a topic for future research.