Demonstratives as bundlers of conceptual structure

Pronoun resolution has long been central to psycholinguistics, but research has mostly focused on personal pronouns (“he”/“she”). However, much of linguistic reference is to events and objects, in English often using demonstrative pronouns, like “that”, and the non-personal pronoun “it”, respectively. Very little is known about potential form-specific preferences of non-personal and demonstrative pronouns and the cognitive mechanisms involved in reference using demonstratives. We present a novel analysis arguing that the bare demonstrative “that” serves a different function by bundling, and making linguistically accessible, complex conceptual structures, while the non-personal pronoun “it” has a form-specific preference to refer to noun phrases mentioned in the previous discourse. In two English self-paced reading studies, each replicated once with slight variations, we show that readers read the demonstrative more slowly throughout, independently of the frequency or complexity of the referent, which we take to reflect differences in processing demonstratives vs. pronouns. These findings contribute to two distinct but connected research areas: First, they are compatible with an emergent experimental literature showing that pronominal reference to events is preferably done with demonstratives. Second, our model of demonstratives as conceptual bundlers provides a unified framework for future research on demonstratives as operators on the interface between language and broader cognition.


Introduction
This paper investigates how demonstrative pronouns in English are resolved in comparison to non-personal pronouns, what their respective form-specific preferences are, and which mental processes they engage in a comprehender. Pronouns as a whole are crucial discourse-structuring devices: They identify and, as we argue later, make available a mental representation created by linguistic or nonlinguistic context, such that it can be further talked about.
For instance, imagine a comprehender listening to her friends talking about their recent trip to Paris, how they visited museums and Napoleon's tomb, how they ate macarons, and how they went to see some famous sights. While listening, the comprehender is creating mental representations, sometimes called "situation models" (Radvansky & Zacks, 2014; Zwaan, Langston & Graesser, 1995), of the traveling, walking, eating, and visiting events her friends are talking about. Then the friends end the story with one of the following sentences:

(1) a. It's a really beautiful city.
    b. They did not taste as sweet as the American ones.
    c. Did you know that he crowned himself?
    d. Then we walked up to Montmartre. That was a really steep hike.

Once the comprehender hears one of the sentences (1a-d), the pronouns in each sentence trigger a search back in memory to one of the concepts Paris, macarons, Napoleon, or ascending Montmartre, in order to pick the intended (or at least what the comprehender perceives as the most likely) referent, and integrate it with the predicate structure required by one of the sentences in (1). 1

1 In this paper, we use the term 'referent' to denote the linguistic and conceptual entity a pronoun refers to and we use the term 'entity' for concrete or abstract objects, excluding events, facts, situations, or propositions. We use 'event' to generally mean 'things that happen over time,' as nothing in this paper hinges on the exact definition of event (Casati & Varzi, 2008). We use the terms 'concept' and 'conceptual structure' to refer to mental representations, broadly construed as mental objects with semantic properties (Jackendoff, 2002). We do not distinguish between concepts and percepts, using the term 'concept' to refer to both.

Glossa: a journal of general linguistics DOI: 10.5334/gjgl.917

(2) a. The catacombs hold the remains of more than six million people. This is fascinating!
    b. I like Beaubourg. It always has great exhibitions.
    c. They took a walk from the Quartier Latin to Pigalle, and then had dinner. That took almost five hours.

In example (2a), the proximal demonstrative pronoun "this" makes reference to the fact described in the first sentence; in example (2b), the pronoun "it" refers to an inanimate object, namely a museum; and in example (2c), the distal demonstrative pronoun "that" can refer to the whole event of walking and dinner, or either of the subevents; the ambiguity is likely resolved by a combination of world knowledge and a recency bias (e.g., Gordon & Scearce, 1995, but see Stewart, Holler & Kidd, 2007).
Thus, to use Nunberg's (1993) terminology, the 'classificatory component' of these pronouns is qualitatively different from that of personal pronouns: Whereas personal pronouns restrict the search space to the set of animate entities, non-personal and demonstrative pronouns (in English, at least) restrict the search to its complement set. 2 But the classificatory component is also quantitatively different: The set of non-animate concrete and abstract entities in the world is substantially larger and more heterogeneous than the set of animate entities, let alone events, facts, or situations, which non-personal and demonstrative pronouns can also refer to.
We can therefore conclude that in English, neither non-personal nor demonstrative pronouns have as tight a link between their form and the conceptual category of their referent as personal pronouns do. Investigating this richer space of referential possibilities allows new insights into reference resolution that go beyond what one can observe with personal pronouns.

Previous research on non-personal pronouns and demonstratives
Understanding the different roles and representations of non-personal and demonstrative pronouns has been notoriously difficult, since their functions and uses overlap considerably. Some researchers claim that "it" and "this"/"that" "are indistinguishable with respect to the description they provide for the intended referent (an inanimate object)" (Ariel, 2001, p.29).
Others claim that the interpretation of "it" and "this"/"that" depends largely on discourse status, such that non-personal pronouns are used for topics and/or salient referents, and demonstratives are used for non-topical but activated content (e.g., Gundel et al., 1993; see also Grosz, Weinstein & Joshi, 1995; Grosz & Sidner, 1986; see Bosch, Rozario & Zhao, 2003, for a similar argument for German d-pronouns). Halliday (1985) proposes a system in which demonstrative pronouns establish reference to a specific token, whereas "it" specifies a non-specific token; this model was extended by Strauss (2002), who proposes a gradience in focus from "this" (high focus, important referent) and "that" (medium focus) to "it" (low focus, unimportant referent). According to these systems, the choice of a particular pronoun depends upon how much attention a speaker is asking the interlocutor to pay to the particular referent. In a similar vein, some researchers have simply claimed that bare demonstratives like "this" or "that" refer to anything that best fits all the cues which a "reasonable and attentive addressee will take the speaker to be exploiting" (Wettstein, 1984: 73; cited in Smit, 2012).
Finally, there is also a claim that "it" is sensitive to syntactic prominence, or grammatical category, while demonstrative pronouns like standalone "that" are more likely to refer to complex/composite entities (Brown-Schmidt, Byron, & Tanenhaus, 2005). In their study, Brown-Schmidt et al. (2005) found that when people were told "Move the cup onto the saucer. Now move it onto the table", they were more likely to move only the cup; but when they were told to "move the cup onto the saucer. Now move that onto the table", people were more likely to move the composite object cup+saucer, which has no unified linguistic antecedent.
This finding that comprehenders tend to resolve demonstratives as referring to more complex entities than pronouns is also confirmed by Çokal, Sturt and Ferreira (2016), who found that demonstratives tend to refer to propositions, and simple pronouns to objects linguistically encoded in noun phrases (NPs). In an eye-tracking-while-reading study, they found longer reading times when the non-personal pronoun "it" referred to a proposition than when the demonstrative pronoun "this" referred to a proposition; the pattern was reversed for reference to an object NP.
Recent work employing the sentence continuation method, using both constructed texts and snippets of naturally occurring stories, found that people have a strong tendency to use "this" for event reference, and "it" for object reference, modulated both by verb class and intention to re-mention (Loáiciga, Bevacqua, Rohde & Hardmeier, 2018). The observation that reference to events tends to be accomplished with demonstrative pronouns received further support from recent corpus studies showing that almost three-quarters of demonstratives in dialogues refer to events, while only about 5% of non-personal pronouns do (Evans, 2001; Müller, 2007; Poesio, 2015).
Put together, these studies, in addition to corroborating data from other corpus analyses (e.g., Gundel et al., 1993, 2005), seem to demonstrate two things: First, simple non-personal pronouns tend to refer to easily accessible referents encoded in prior discourse by noun phrases; and second, there is evidence for form-specific constraints (Kaiser & Trueswell, 2008): Demonstrative pronouns trigger comprehenders to construct an antecedent from previous context, one that does not necessarily have to have a simple NP antecedent. That is, the form itself (simple non-personal pronoun "it" vs. demonstrative pronouns "this" and "that") provides valuable cues to the comprehender about what kind of referent the speaker intends, even if these cues are probabilistic.
These data fit with a theoretical proposal by Elbourne (2008), who analyzes demonstratives as denoting 'individual concepts' packaged as definite descriptions. Crucially, demonstrative pronouns in this model also introduce existence and uniqueness presuppositions, just as strong definite determiners do (Abbott, 2004; Strawson, 1950). That is, "I like this" presupposes that there exists something to like, and that this 'something' is uniquely identifiable in the context (just like "I want to eat the raspberry macaron" triggers the presupposition that a particular unique raspberry macaron exists), and can be found in the visual/spatial context (for instance, among the other macarons in the box). 3 In this theory of demonstratives, Elbourne (2008) argues that "that" and "this" are forms with a relational component, for instance, distal and proximal factors: It has been argued that "this" tends to refer to things closer in context, and "that" to things further away (e.g., Kruisinga, 1925-32; Quirk, Greenbaum, Leech & Svartvik, 1985, and many others; see Scott, 2013, for a detailed analysis distinguishing the form-based preferences of both demonstrative pronouns). Demonstratives also take a property as argument, and use both the property and the proximal/distal cue to map onto a specific referent (Reuter & Lew-Williams, 2018). For instance, in the sentence "This is the best macaron", the comprehender will use the proximal cue and combine it with the property of being a macaron; ideally, this way the comprehender finds the referent that the speaker had in mind (presumably a very delicious macaron nearby).
Elbourne (2008) does not explicitly discuss reference to events, and how exactly the comprehender may map a demonstrative pronoun to a chunk of linguistic or non-linguistic conceptual structure is unclear; but we agree with the analysis that the demonstrative pronoun identifies an individual concept by means of reference (Loar, 1976; see also O'Madagain, 2020, for a compatible approach).
We argue furthermore that it is the function and purpose of the demonstrative to bundle a potentially complex, diffuse set of conceptual structure into such an individual concept, which the linguistic discourse can access and use down the road (cf. Grosz, 2018). Thus, in an extension of this model, we argue in the spirit of Wiese and Maling (2005) that demonstratives can serve as 'universal bundlers' for complex concepts, such as events.

3 As a reviewer points out, the uniqueness requirement depends on context, and can apply to types as well as tokens. In a restaurant setting, asking for "the raspberry macaron" would be construed as "one of those macarons the chef makes", whereas asking for a specific token of a macaron would likely be done with a demonstrative pronoun, as below.

A proposal: Demonstratives as 'universal bundlers'
We propose an approach to demonstratives that views them as potential markers of a conceptual process that bundles a chunk of conceptual structure, and marks that chunk linguistically, such that the bundle can be referred to in the ongoing discourse. In this view, demonstratives tend to accomplish more, and have different goals, than simple pronouns.
Simple pronouns like "he" or "it" are indices that typically link the pronoun to an easily accessible noun phrase in the context, provided that this noun phrase adheres to the constraints posited in the relational and classificatory components of the pronoun (including discourse-level, lexical-level, syntactic, and semantic constraints). This is a complex process, as evidenced by the pronoun resolution literature; yet simple pronouns are still usually properly co-referential with a noun phrase in the discourse, except in the occasional case of an unheralded pronoun (Greene, Gerrig, McKoon & Ratcliff, 1994), or other rare exceptions.
Our account argues that the processes of simple pronoun resolution and demonstrative pronoun resolution, given the same context, are fundamentally different (for precursors of this idea, see e.g., Hankamer & Sag, 1976; Jackendoff, 2002, among others): Unlike simple pronouns, demonstratives serve as triggers for bundling chunks of conceptual or linguistic structure into an individual concept, such that this individual concept can serve as a referent for further discourse. This bundling procedure can be purely conceptual in nature, and the demonstrative pronoun is simply the linguistic marker for this 'universal bundling.' 4 We propose that non-personal simple pronouns like "it" and demonstrative pronouns like "that" have different form-specific constraints in English (see Kaiser & Trueswell, 2008, for the form-specific multiple-constraints framework), and employ different psycholinguistic mechanisms to go from index to interpretation: Whereas "it" has a bias to quickly attach to the first noun phrase that satisfies both its relational and classificatory component, "that" can function as a 'universal bundler' that makes a chunk of conceptual structure (whether linguistically encoded or not) accessible to linguistic discourse. 5 In other words, demonstratives can serve as universal feature bundlers that allow the subsequent linguistic structure to further address the content of these bundles (or, in Lewis & Vasishth's (2005) terminology, the content of these 'chunks').
This idea of a linguistic element serving as linguistic marker and trigger for a conceptual operation is obviously not new. Other 'universal machines' have been introduced in linguistics and philosophy to capture mapping functions that take conceptual objects as input, and yield continuous substances as output, or vice versa. The examples in (3) illustrate cases of the 'universal grinder', 'universal sorter', and 'universal packer,' all of which are conceptual operations triggered by mass and count syntax, respectively (Bunt, 1985; Pelletier, 1975; Pelletier & Schubert, 1989; examples from Wiese & Maling, 2005):

(3) a. There is chicken in the soup.
       (mass syntax with a typically count noun => 'universal grinder': animal to meat)
    b. The best wines are from Chile.
       (count syntax with a typically mass noun => 'universal sorter': substances to kinds)
    c. Two beers and a coffee, please.
       (count syntax with a typically mass noun => 'universal packer': substances to portions)

4 One prediction of this account is that once a complex conceptual structure has been bundled by a demonstrative (i-a), further reference to it with a simple non-personal pronoun (i-b) should be preferred over another demonstrative reference (i-c):
(i) a. The friends rented an Airbnb near Gare du Nord. That was much cheaper than staying in a hotel.
    b. It was also more convenient than being further south.
    c. ?That was also more convenient than being further south.
This paper does not aim to answer this question, but we hope to test it in further research (see also Loáiciga et al., 2018, for similar findings).

5 Note that under some circumstances and in some contexts, "it" also does not need a linguistic antecedent. To us, these circumstances are limited to expletive "it", cataphoric usages of the pronoun (e.g., "After it mates, the male bee dies"), or when the Question Under Discussion is abundantly clear (e.g., "Oh no! It's clogged again", standing in front of a toilet). This fact does not change our analysis of the demonstrative's role in reference. Also, we want to remind the reader that none of the observations on pronoun behavior are absolute; rather, they reflect preferences that have different probabilistic distributions. This is to say, one will likely be able to find a demonstrative pronoun triggering a simple reference process to an NP, and one will likely be able to find a non-personal pronoun triggering a conceptual bundling process, such as is needed in example 3a. Our goal here, as in most psycholinguistic studies, is to describe and explain an experimentally restricted situation that disentangles these referential mechanisms (Mook, 1983).

We propose that demonstratives are linguistic markers, analogous to mass or count syntax, of a conceptual operation that takes a conceptual structure as its input. This operation, in English, can only apply to conceptual structures that satisfy the demonstratives' form-based relational and classificatory components. Its output is a bundle of conceptual structure (an 'individual concept') in a linguistic form, which can serve as a definite description of that individual concept in the ongoing linguistic discourse (see Figure 1). 6 For instance, an informal, simplified outline of the conceptual structure for the situation depicted on the lower left in Figure 1 could roughly be sketched as Example (4), … Note that this sketch does not take into account the social, visual and spatiotemporal complexities that the static picture in Figure 1 implies: The children's and woman's facial expressions and gaze, inferences about the persons' ages and relations among each other, the architectural style of the kitchen in the picture, inferences about its location and inhabitants, and so on. All of these additional aspects of the event could be, in principle, included in a finer-grained representation of Figure 1. Important to note is that out of this infinitely rich conceptual structure, a speaker can uniquely target any structure and substructure using a bare demonstrative:

(5) That looks like in my house!

The referent of Example (5) is necessarily vague without more context; "that" can bundle anything from the style of the cabinets to the mess on the counter. However, in the absence of prior discourse, once the speaker includes more lexical content to restrict the classificatory component of the pronoun, identification of the referent is easier:

Thus, demonstratives tend to bundle up and refer to eventive or otherwise complex antecedents that are not encoded with a noun phrase, based on these observations and on the evidence from Brown-Schmidt et al. (2005) and Çokal et al. (2016). Hence, we propose the following form-based classificatory and relational properties of English demonstratives:

1. The classificatory component of standalone demonstratives specifies that the search for a referent needs to be restricted to a non-animate entity.
2. The relational component of standalone demonstratives specifies that the referent is not immediately accessible in context.

The classificatory component's restrictions are intuitively quite straightforward and easily verifiable. For instance, *"I like Nils_i. This_i has such a sunny smile" is ungrammatical in English (even though the use of demonstratives in conjunction with copula verbs may be acceptable for some speakers, i.e., ?"I like Nils_i. This_i is such a happy kid"). We do not discuss this component further.
The relational component's specification can surface in varying ways. Linguistically, either the conceptual structure to be bundled up is (i) far away in the discourse (Çokal, Sturt, and Ferreira, 2014), or it is (ii) far away from upper layers in discourse structure, i.e., it is neither the current focus nor the current topic in discourse (Webber, 1989), and is less salient on the level of discourse (Gundel et al., 1993) or the level of argument structure (Chafe, 1976;Brennan, Friedman, & Pollard, 1987).
But the referent of a demonstrative does not have to be linguistically expressed, as evidenced by examples (5) or (6) above, drawn from Figure 1. It can also be a percept in the visual or auditory domain, or another non-linguistic and purely conceptual structure, as (7) shows (e.g., Kaplan, 1989; Jackendoff, 2002):

(7) a. [gesture at terrible shirt] I cannot believe you want to go out dressed like that.
       (=bundle from a visual conceptual structure)
    b. [screeching sound] What was that?
       (=bundle from an auditory conceptual structure)
    c. [smelling cigarette smoke outside] That must be my mom.
       (=bundle from an olfactory conceptual structure)
    d. [confronted with a surprise party, friends, and a birthday song] This is too much!
       (=bundle from a multisensory conceptual structure)

Crucially, demonstratives can not only bundle up non-linguistic conceptual structures, but also conceptual structures that have been encoded linguistically, for instance, when the conceptual structure to be bundled up is not an easily co-referential noun phrase. In the case of Brown-Schmidt et al.'s (2005) study, this bundle contained both the cup and the saucer. The data from their study provide evidence that demonstratives may be used as a cue to gather and bundle up concepts expressed by noun phrases. But how do people refer back to things that are realized linguistically but not in the form of noun phrases, such as in (6a) and (6c)? To address this question, we turn to event descriptions, such as in "The friends visited Paris", "The hikers explored the forest", or "Adam heated the lasagna".

Current studies and predictions
In this paper, we use reference to objects (linguistically realized as NPs) and events (conveyed by an entire clause) to investigate whether non-personal pronouns like "it" and stand-alone demonstrative pronouns like "this" or "that" access different cognitive mechanisms in reference resolution in English.
We use sentence pairs like in (8), where a context sentence sets up a potential referent for the pronoun that is present in the next (critical) sentence: "it" refers to "lasagna"; "that", to the act of making the lasagna. We use "it" and "that" because they are distributionally more similar to each other than "it" and "this" (Strauss, 2002).
(8) Sentence 1: Adam made lasagna for me last night.
    Sentence 2: a. It was really amazing.
                b. That was really amazing.
Specifically, we propose that non-personal pronouns are interpreted as coreferential with a (salient) lexical item that satisfies the pronoun's form-specific relational and classificatory constraints: inanimate, grammatically singular noun phrases ("lasagna" in (8)). We propose that, in the context of our experiment, this is a process in which only the linguistic surface needs to be implicated (cf. Hankamer & Sag, 1976): In order to determine the referent of "it", it is sufficient to access form-based information of the candidate words in context. In languages with grammatical gender, this information may be morphological information (Cacciari, Carreiras & Barbolini Cionini, 1997); but it may also include lexemic properties such as the length and frequency of a word (Simner & Smyth, 1999; Duffy & Rayner, 1990). Thus, we predict that "it" should be sensitive to the frequency and length of the preceding linguistic material during resolution (although see Egusquiza, Navarrete & Zawiszewski, 2016, and Lago, 2014, for failure to find frequency effects with personal pronouns).
We also predict that the demonstrative "that" should be sensitive to the same surface-form-based features, because these features need to be accessed for any kind of reference. Overall, we predict that after encountering "it" or "that" at the start of the critical sentence, reading times (which reflect processing ease) will reveal effects of the antecedents' surface properties.
We further predict that "that" will lead to slower reading times than "it." This is because under our approach, "that" crucially differs from "it" in that only "that" is accessing and bundling up complex conceptual structures, processes which can be assumed to carry a cognitive cost. Thus, we predict that "that" should be read more slowly than "it" throughout, above and beyond the difference in orthographic length of the pronouns.
Additionally, we predict that "that" should be uniquely sensitive to higher-level conceptual features, such as the complexity of a concept. Thus, we manipulate the conceptual complexity of an event (as discussed below) that a subsequent demonstrative "that" will refer back to. 7 In particular, we predict that (i) more complex events will lead to faster reading times than less complex events for sentences where "that" is used to refer to the event, whereas (ii) we do not predict effects of event complexity for sentences where "it" is used to refer to the event. This prediction originates from literature suggesting that semantically rich representations lead to faster re-access than semantically poor representations (Fisher & Craik, 1980;Craik & Tulving, 1975;Gallo, Meadow, Johnson & Foster, 2008;van Gompel & Majid, 2004;Heine et al., 2006a,b;Hofmeister, 2011;Karimi & Ferreira, 2016).
Finally, we also make an additional prediction that when non-personal pronouns are subsequently specified by event-denoting adjectives (such as "It was very adventurous" or "It was quite laborious"), comprehension should slow down at the adjective, due to an effect of mismatched expectations; likewise, we expect the same when demonstratives are subsequently specified by object-denoting adjectives (such as "That was very small" or "That was quite pretty"). We refer to the difference between adjectives like "adventurous" (event-denoting) and "small" (object-denoting) as adjective bias. Observing these kinds of slow-down patterns would indicate that readers consider an event reference more when they have read a demonstrative, and an object reference more when they have read a non-personal pronoun. Violations of these expectations would result in a type mismatch, which would surface as an interaction of adjective bias with pronoun type. However, we issue this prediction with caution for two reasons: First, in order to limit the length of the experiment, we introduced such a mismatch only for half of our experimental items, and thus, power is substantially reduced for this analysis. Second, the adjectives were in sentence-final position, which has been associated with complex sentence wrap-up effects (e.g., Warren, White & Reichle, 2009).

7 In this paper, we only talk about eventive referents, since the distinction between events, states, situation, and facts is not crucial for the claims we are making here: Looking at non-eventive states and situations, as well as at different kinds of conceptual features, is an important direction for future work.

In what follows, we present two experiments, each replicated once with a slight difference in stimuli, to test these hypotheses. Both experiments allow us to test the prediction that "that" is read more slowly than "it", and that adjective bias may interact with pronoun type. Experiments 1a and 1b investigate whether both non-personal and demonstrative pronouns are sensitive to surface features of the linguistic context, including those of the potential referents. Experiments 2a and 2b test the prediction that only demonstrative pronouns are sensitive to higher-level features of the linguistic context.

Experiment 1a
This study tests the hypothesis that both "it" and "that" access the surface features of a potential referent, the prediction that "that" is read more slowly than "it", and the interaction of adjective bias with pronoun type. We operationalize surface features as lexical frequency and word length in number of letters, which are inversely correlated: longer words are usually less frequent (e.g., Kliegl, Grabner, Rolfs & Engbert, 2004).

Participants
Two hundred self-described native English speakers with IP addresses within the United States, recruited from Amazon Mechanical Turk, participated in the experiment for monetary compensation. Mechanical Turk is used widely in research because it allows access to a large number of study participants, and most results, although perhaps somewhat noisier, are comparable to results obtained in the lab (e.g., Mason & Suri, 2012; Munro et al., 2010; Sprouse, 2011).

Materials
We created 40 sets of stimuli, each consisting of a sequence of two sentences, as shown in (9) and (10).

It was really adventurous.

The context sentence in each set (9a or b) always contained an animate subject, and a singular non-personal object, that is, a potential referent for "it". The critical sentence started with a pronoun ("it" or "that"), continued with a copula verb, and ended with an intensifier and an adjective (10a or b). Half of the adjectives were compatible with an event reading as in (9) and (10), where "adventurous" refers to the whole exploring event; and half of the adjectives were compatible with the noun phrase (which would have been "forest" or "jungle"). 8 We refer to this as the adjective bias manipulation. This was done to ensure that participants would not be inadvertently learning a pattern of (in)compatibility throughout the course of the experiment (Fine et al., 2013), and allows us to test the prediction that pronoun type should interact with the reference type signaled by the adjective (object vs. event).
The experimental manipulations were (i) pronoun type ("that"/"it") and (ii) noun frequency ("forest" = high frequency, "jungle" = low frequency) in the context sentence, using synonyms or semantically closely related nouns. As described above, we also manipulated (iii) adjective bias (event-denoting vs. object-denoting) between items. Frequency was determined by comparing the members of each noun pair in the CELEX corpus (Baayen, Piepenbrock, & Gulikers, 1995): Within each pair, the low-frequency noun was always less frequent than the high-frequency noun. As expected, noun length was inversely correlated with frequency: Low-frequency nouns were significantly longer on average (average length: 6.1 characters) than high-frequency nouns (average length: 5.4 characters; F(1,40) = 4.31; p < .05). Since both noun length and noun frequency are surface-level factors, in this paper we do not aim to separate these factors from each other.
In order to ensure that the pronoun "it" is indeed biased to refer to the noun phrase, and the demonstrative "that" to the event, we conducted a norming study. We created a forced-choice rating task, in which we presented each scenario but replaced the final adjective (e.g., "adventurous" in (10)) with the nonsense adjective "dax." This resulted in sentences like "The hikers entered the forest. That was really dax." The adjective replacement was done in order to prevent semantic interference from the final adjective, thus better reflecting interpretations at our regions of interest, before participants read the whole scenario. Participants (40 native speakers of English from Amazon Mechanical Turk) were asked to indicate whether the object ("the forest that the hikers entered") or the event ("that the hikers entered the forest") was "dax." Confirming our intuition, in sentences that contained the pronoun "it", people chose the object meaning in the majority of cases (67.9%); in sentences that contained the demonstrative "that", however, people strongly dispreferred the object meaning (24.2% object meaning; β = 1.7, p < .0001 in a mixed binomial regression). These norming data show that, in the presence of a neutral adjective, the object interpretation in sentences containing "it" was more likely than the event interpretation; and the reverse was true for sentences containing "that."

In the main experiment, we presented trials in random order in a masked self-paced reading paradigm, together with 40 filler items, using Ibex, an experiment software and platform tailored to the self-paced reading paradigm (Drummond, 2014). Each filler was followed by a comprehension question. 9 In addition to the sentences that people read word-by-word, there were forty comprehension questions in total. However, when calculating accuracy statistics, one comprehension question was removed, because the answer to the question was coded incorrectly.
We predict that both "it" and "that" access the surface features of the context, and thus, we expect a main effect of noun frequency in the critical sentence, at and after the pronoun: People should read both pronoun types faster after sentences containing high-frequency nouns than after sentences with lower-frequency nouns. We also predict a main effect of pronoun, namely that the demonstrative should lead to slower reading times than a simple pronoun; and an interaction of adjective bias with pronoun type.

Data Analysis
For this and all other experiments, we used mixed-effects regression models on log-transformed data (Baayen, Davidson, & Bates, 2008) with R's lme4 package (Bates et al., 2014) to analyze the reading times for each word. As justified by the design, we implemented a maximal random effects structure (Barr, Levy, Scheepers, & Tily, 2013). 10 Where noted, the regression structure was modified to ensure model convergence by eliminating interactions on random effects first, then, if necessary, taking out subject random effects, or item random effects. In Experiments 1a and 1b, our fixed effects were Frequency (high vs. low) and Pronoun ("it" vs. "that"); in Experiments 2a and 2b, the fixed effects were Complexity (high vs. low) and Pronoun ("it" vs. "that"). All of these categorical factors were centered (i.e., coded as -0.5 and 0.5) in regression analyses. In the adjective region, we also included Adjective Bias as a predictor, i.e., whether the final adjective was more compatible with an event ("adventurous") or an object ("wild").
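The coding scheme described above can be illustrated with a minimal Python sketch. This is not the authors' analysis code (the models were fit with lme4 in R, which is not reproduced here); the function name, the dictionary keys, and the choice of which level receives -0.5 are assumptions made purely for illustration.

```python
import math

# Centered coding: the two levels of each factor are coded -0.5 and 0.5.
# Which level gets which sign is an assumption; the text only gives the values.
CONTRASTS = {
    "pronoun": {"it": -0.5, "that": 0.5},
    "frequency": {"high": -0.5, "low": 0.5},
}


def prepare_trial(rt_ms, pronoun, frequency):
    """Return the log reading time and centered predictors for one observation."""
    p = CONTRASTS["pronoun"][pronoun]
    f = CONTRASTS["frequency"][frequency]
    return {
        "log_rt": math.log(rt_ms),  # natural log of the reading time in ms
        "pronoun_c": p,
        "frequency_c": f,
        "interaction": p * f,  # product term for the Frequency x Pronoun interaction
    }


if __name__ == "__main__":
    print(prepare_trial(400, "that", "low"))
```

With this kind of coding, each predictor averages to zero in a balanced design, so a main effect can be read as the effect of one factor averaged over the levels of the other.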

9 The full list of stimuli can be found under https://osf.io/59qzj/.

Results
We excluded 34 participants because of low comprehension-question accuracy (<75%), or because their median reading times were below 200 ms or above 1500 ms. The accuracy of comprehension questions in the remaining 166 participants was 92% (range = 79% ~ 100%). Reading times above 2000 ms (<.5% of the data) or below 100 ms (<1% of the data) were also excluded. Figure 2 shows log reading times for each region in Experiment 1, and Table 1 shows the mean reading times with Standard Errors. A summary of the regression results is shown in Table 2. While our predictions only pertain to the critical sentence, we report reading times in each region (including the context sentence) for completeness's sake.
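The exclusion criteria just described can be sketched in a few lines of Python. The thresholds are taken from the text; the function names, the data layout (one accuracy value and one list of reading times per participant), and the order in which the two steps are applied are assumptions for illustration.

```python
from statistics import median


def keep_participant(accuracy, rts_ms):
    """True if accuracy is at least 75% and the median RT lies within 200-1500 ms."""
    med = median(rts_ms)
    return accuracy >= 0.75 and 200 <= med <= 1500


def trim_trials(rts_ms):
    """Drop individual reading times below 100 ms or above 2000 ms."""
    return [rt for rt in rts_ms if 100 <= rt <= 2000]


if __name__ == "__main__":
    # Hypothetical reading times (ms) for one participant
    rts = [320, 95, 410, 2300, 380]
    if keep_participant(0.92, rts):
        print(trim_trials(rts))
```

Here the participant-level median is computed before trial-level trimming; the text does not state which order was used.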
In the context sentence, there was a spurious effect of pronoun in the very first region, the subject NP ("the hunters"). Spurious effects are very common in self-paced reading experiments (e.g. Omaki, Lau, Davidson White, Dakan, Apple & Phillips, 2015; Meng & Bader, 2020; among many others). We classify effects as spurious when they fulfil two criteria: first, they could not have been introduced by our manipulation, i.e., because they occurred before the manipulation; and second, they do not consistently occur between experiments or across subsequent regions.
As predicted, starting in the object noun region ("the forest/jungle"), people read faster in conditions with a high-frequency noun ("forest") than in conditions with a low-frequency noun ("jungle"). This main effect of frequency remains significant over three contiguous regions: the object noun region (the region where frequency was directly manipulated) in the context sentence, and the pronoun region and the copular verb region in the critical sentence. It is marginal in the subsequent adverb region. Furthermore, we also find a main effect of pronoun type, with "it" conditions being read faster than "that" conditions, at the copular verb that immediately follows the pronoun as well as at the immediately following adverb. There are no interactions involving frequency and pronoun type anywhere in either of the two sentences.
We also analyzed the adjective region ("adventurous"), with adjective bias as an additional fixed predictor: Here, no main effects were significant. We also did not find any significant interactions of adjective bias with frequency or pronoun type (all ps > .15).

Experiment 1b
This experiment was a replication of Experiment 1a, with one slight change: We included a spillover region after the critical noun ("forest"/"jungle"), to ensure that the object-induced frequency effect found on the pronoun in Experiment 1a was not due to a simple spillover effect of people slowing down, in general, after reading low-frequency words.

Methods
The method was the same as Experiment 1a: We used masked self-paced reading on Ibex, hosted on Ibex farm.

Participants
We recruited a different set of 200 native speakers on Amazon Mechanical Turk.

Materials
We used the 40 sets of stimuli used in Experiment 1a, consisting of pairs like in (11) and (12), but added an adjunctive two-word spillover region ("at night") specifying a location or time:
(11) Context sentence:
a. The hikers explored the forest at night.
b. The hikers explored the jungle at night.
(12) Critical sentence:
a. That was really adventurous.
b. It was really adventurous.
Our predictions were the same as in Experiment 1a, namely a main effect of frequency, a main effect of pronoun, and an interaction of adjective bias with pronoun type.

Results of Experiment 1b
Of 200 initial participants, we excluded 29 based on the same criteria as in Experiment 1a: comprehension question accuracy of less than 75% (N = 11), or median reading time (across all regions) of less than 200 ms (N = 19) or more than 1500 ms (N = 0). As before, we also excluded trials with reading times above 2000 ms (2.5% of observations) or less than 100 ms (<1% of observations). The accuracy of comprehension questions in the remaining participants was 94% (range = 77% ~ 100%). Figure 3 shows log reading times for each region in Experiment 1b; Table 3 shows the mean reading times with Standard Errors, and Table 4 shows regression results on raw reading times over the critical regions.
We find a spurious effect at the Context Sentence's subject, in the form of a main effect of frequency. Predicted effects of our frequency manipulation started at the object noun: High-frequency nouns were read faster than low-frequency nouns. This effect of frequency continued and did not subside until the end of the second sentence; specifically, pertaining to our predictions, starting in the pronoun region ("that/it"), people read faster after a high-frequency noun ("forest") than after a low-frequency noun ("jungle"), with a significant main effect of frequency. Importantly, at the following copular verb ("was"), at the adverb ("very"), and, marginally, at the sentence-final adjective, we also found a main effect of pronoun type, with people reading faster after "it" than after "that." We also found a marginal interaction of pronoun type with frequency at the pronoun region, with an unpredicted, slightly larger effect of frequency for demonstratives than for pronouns. Again, we did not find any significant interactions of adjective bias with frequency or pronoun type (all ps > .34). 11

Discussion of Experiments 1a and 1b
Experiments 1a and 1b both show that surface properties, operationalized here as frequency (which is also correlated inversely with word length) of the antecedent object noun, influence the speed with which people process both non-personal "it" and demonstrative "that." This confirms our prediction that both pronoun types are sensitive to surface features in the linguistic context. We did not find a mismatch effect for adjective bias, contrary to what we expected based on our norming data. This may be due to reduced power in that region, since the adjective-bias manipulation was between-items; and also due to the adjectives' position as sentence-final words, which has been shown to introduce complex wrap-up effects (e.g., Warren et al., 2009).
In addition, we had predicted that people would read faster after they resolve the non-personal pronoun "it" compared to the demonstrative "that". Our data confirmed this prediction. We argue that this effect is based on our model of conceptual bundling: When resolving the demonstrative "that", readers execute a different, perhaps more extensive, search for a referent or a referential structure, than when resolving the non-personal pronoun "it". These effects are unlikely to be due to short-lived spillover from orthographic differences between the two referring expressions, since these effects remain significant over two regions in Exp. 1a and three regions in Exp. 1b.
In addition to showing that demonstratives and non-personal pronouns lead to subsequent differences in reading behavior, and thus providing evidence for our main hypothesis that demonstratives accomplish a fundamentally different operation than pronouns, these results also provide a crucial foundation for Experiments 2a and 2b, which test the prediction that only "that", and not "it", is uniquely sensitive to higher-level conceptual features of its referent. This prediction about the asymmetrical sensitivity of the two pronoun types to conceptual features is derived from our claim that demonstratives are universal bundlers that make a chunk of conceptual structure available to the discourse, and thus have to access the conceptual, not only surface, features of the referent. Non-personal pronouns, in contrast, do not act as bundlers and thus are not expected to show the same level of sensitivity to conceptual properties of referents.
To test this claim, in Experiments 2a and 2b we manipulate the conceptual complexity of an event that a subsequent demonstrative "that" (or non-personal pronoun "it") will refer back to.
Based on previous studies arguing for semantically richer representations leading to faster reaccess (e.g., van Gompel & Majid, 2004; Heine et al., 2006a,b; Hofmeister, 2011; Karimi & Ferreira, 2016), we predict an interaction: reading "that" will be faster after complex events, but reading times for "it" will not be affected, since "it" tends to refer only to the object, not to the whole event.

Experiment 2a
In the following, we test our prediction that only "that", and not "it", is uniquely sensitive to higher-level conceptual features of its referent, following the claim that demonstratives tend to be universal bundlers that make a chunk of conceptual structure available to the discourse, and thus have to access the conceptual, not only surface, features of the referent. In addition, we seek to replicate our findings from Experiments 1a and 1b, namely that demonstratives lead to slower reading times than non-personal pronouns; and we test our prediction that adjective bias may interact with pronoun type.

Participants
200 new native English speakers from Amazon Mechanical Turk participated in the experiment for monetary compensation.

Materials
The same experimental item sets as in Experiments 1a and 1b were used, but they were adjusted to manipulate event complexity instead of nouns' linguistic frequency: We used complex events, such as "explore" (13a), and simple events, like "enter" (13b), combined with the high-frequency nouns of Experiment 1 (e.g., "forest").
(13) Context sentence:
a. The hikers explored the forest.
b. The hikers entered the forest.
(14) Critical sentence:
a. That was really adventurous.
b. It was really adventurous.
Simple events were created by using presupposed or potential sub-events of complex events. For instance, exploring a place (complex) presupposes entering that place (simple), or cleaning a room (complex) may include sweeping it (simple). Verbs' lexical frequency was matched, using the Celex English wordform database (simple event verbs: 18 per million words, complex event verbs: 17 per million words; no statistically significant difference, as determined by a one-way ANOVA, F(1,78) = .01, p > .92). If we find a reading time difference, it should thus not be due to verb frequency.
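As an illustration of the frequency-matching check, the following Python sketch computes the F statistic of a one-way ANOVA from scratch. It is a generic textbook computation, not the authors' script; the function name and the toy values are invented, and no p-value is computed because that would require an F distribution, which the standard library does not provide.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up per-million frequencies, NOT the study's data
    simple_verbs = [1.0, 2.0, 3.0]
    complex_verbs = [2.0, 3.0, 4.0]
    print(one_way_anova_f([simple_verbs, complex_verbs]))
```

Assuming one simple and one complex verb per item set, two groups of 40 verbs yield 1 and 78 degrees of freedom, which matches the reported F(1,78).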
Event complexity was normed in two different ways (see Figure 4). First, we gave 20 native English speakers on Amazon Mechanical Turk a forced-choice task in which they had to rate which event was conceptually more complex, pitting a simple sub-event (i.e., 13b) against its more complex counterpart (i.e., 13a). 12 Our "explore"-type events were rated as more complex than the "enter"-type events 95.48% of the time. Second, since more complex events often take longer than simple events, we asked 20 different native English speakers on Amazon Mechanical Turk to rate the duration of the events, in another forced-choice test (Wittenberg & Levy, 2017). 13 Here, 95.49% of the "explore"-type events were rated as taking longer than the "enter"-type events.
12 The exact instructions were: "Your task is simply to imagine the described actions, and tell us which action is more complicated compared to the other. What does it mean to be more complicated? Just imagine what needs to happen in each. For instance, "eating an apple" may be less complicated than "slicing an apple", because the latter involves more hand movements, with an instrument (a knife), and it may take more time. "Smelling an apple", on the other hand, may be less complicated than eating it - it only takes a moment, there is no movement involved, and it happens effortlessly and automatically."
13 The exact instructions were: "Your task is simply to imagine the described actions, and tell us which action takes longer compared to the other. For instance, "slicing an apple" may take longer than "eating an apple". "Smelling an apple", on the other hand, may take even less time."
Figure 4 Norming of events used in Experiment 2; complex (e.g., "explored the forest") is shaded light grey, simple is shaded dark grey (e.g., "entered the forest"). Rated as more complex: simple event 4.52%, complex event 95.48%. Rated as taking longer: simple event 4.51%, complex event 95.49%.
These results taken together indicate that the items were constructed and classified appropriately into two conceptual, nonlinguistic classes: more complex and less complex events.
Again, the critical items were presented randomly in a self-paced reading paradigm, together with the same 40 filler items as in Experiment 1a and 1b, using Ibex. Each filler was followed by a comprehension question. 14

Results
We excluded 37 participants because of low question accuracy, or because their median reading times were below 200 ms or above 1500 ms. Reading times slower than 2000 ms (<2% of observations) or faster than 100 ms (<2% of observations) were excluded as well.
The accuracy of comprehension questions in the remaining participants was 94% (range = 75% ~ 100%). Figure 5 shows log reading times for each region in Experiment 2a; Table 5 shows the mean reading times with Standard Errors, and Table 6 shows model results for the critical regions.
Before the pronoun region, no significant effects were found (ps > .05), except for a main effect of Complexity at the object NP following the manipulated verb, but this effect had subsided by the next region. As predicted, and consistent with results from Experiments 1a and 1b, we found a main effect of pronoun type starting in the pronoun region ("that/it"), which remained significant throughout the rest of the trial: "it" conditions were read faster than "that" conditions. No other effects were significant, except for a main effect of complexity at the sentence-final adjective; we did not find any significant interactions of adjective bias with complexity or pronoun type (all ps > .38).

Experiment 2b
This experiment was a replication of Experiment 2a, with the same change as from Experiment 1a to Experiment 1b: We included a spillover region after the critical noun ("forest").
14 The full list of stimuli can be found under https://osf.io/59qzj/.

Materials
We changed the 40 sets of stimuli used in Experiment 2a, consisting of pairs like in (15) and (16), such that they contained a two-word spillover region ("at night"):
(15) Context Sentence:
a. The hikers explored the forest at night.
b. The hikers entered the forest at night.
(16) Critical Sentence:
a. That was really adventurous.
b. It was really adventurous.

Procedure
Again, we used self-paced masked reading, collecting data over the internet.

Results
From the initial set of 200 speakers, we excluded 25, because of low accuracy on comprehension questions (N = 3), or due to median reading times being too fast (N = 20) or too slow (N = 2). The accuracy of comprehension questions in the remaining participants was 94% (range = 78% ~ 100%). Again, we also excluded individual trials based on reading times: Longer than 2000 ms (<3% of observations) or shorter than 100 ms (<2% of observations). Figure 6 shows log reading times for each region in Experiment 2b; Table 7 shows the mean reading times with Standard Errors, and Table 8 shows results of the regression. 15 We find a spurious main effect, and spurious effects of interactions with pronoun type in four regions of the context sentence. However, as expected, and as found in Experiment 2a, we also found a main effect of complexity at the manipulated verb ("entered/explored") and the object NP. Reassuringly, this effect had subsided in the spillover region, and only reached marginal significance at the pronoun at the start of the critical sentence.
In the auxiliary region ("was") and the adverbial region ("very") of the critical sentence, we find a main effect of pronoun type, but no effect of complexity, and no interaction, in line with our other experiments. This time, we also find the predicted interaction of adjective bias with pronoun type at the final adjective (marginal, β = .03, p < .07), a main effect of complexity (β = .03, p < .05), and a three-way interaction between pronoun type, complexity, and adjective bias (β = .07, p < .02).
15 Supplemental analyses as well as raw (anonymized)
Glossa: a journal of general linguistics DOI: 10.5334/gjgl.917

Discussion of Experiments 2a and 2b
Experiments 2a and 2b had three main aims: (i) to see whether the main effect of pronoun type that we observed in the first two experiments is replicable with events of differing complexity, (ii) to test whether adjective bias interacts with pronouns, and (iii) to test our prediction that the bare demonstrative "that", but not the pronoun "it", is sensitive to higher-level conceptual features, operationalized in this study as the conceptual complexity of an event. These predictions were derived from our model of demonstratives as 'universal bundlers' that take chunks of conceptual structure that are not referred to (in our stimuli) with noun phrases and make them available to the discourse. In order to do this bundling, demonstratives must access the conceptual, not only surface, features of the referent. To test this claim, we manipulated the conceptual complexity of the events that subsequent demonstratives referred back to.
Specifically, we predicted that more complex events (as denoted by verbs) would lead to faster reading times for sentences containing "that", compared to sentences containing "it". We did not find this predicted interaction, but we did replicate the main effect of pronoun type that was predicted and found in Experiments 1a and 1b: People read more slowly after a demonstrative than after a non-personal pronoun, throughout several regions. We discuss these findings in more depth in the general discussion.
We also found the predicted mismatch effect, in the form of an interaction of adjective bias with pronoun type or complexity or both, at the sentence-final adjective in Experiment 2b. We can only speculate as to why this effect only surfaced in one of four experiments: One reason may be that at the sentence-final word, complex wrap-up effects can mask other effects (e.g., Warren et al., 2009).

General Discussion
This paper set out to test our hypothesis that the English bare demonstrative pronoun "that" tends to refer to bundles of chunks of conceptual structure, whereas the non-personal pronoun "it" can simply refer to a noun phrase that satisfies its classificatory and relational components. We argued that because "that" accesses and bundles up complex conceptual structures, it should induce longer reading times, and perhaps be uniquely sensitive to higher-level conceptual features, such as the complexity of a concept. Thus, we predicted that (i) both "that" and "it" would be sensitive to surface features such as frequency and word length, whereas (ii) only "that" would be sensitive to event complexity, and (iii) "that" would be read more slowly than "it" overall. These predictions were partially supported by two sets of self-paced reading studies:

Sensitivity to surface properties
In the first pair of self-paced reading studies (Experiment 1a and 1b), the data show that both non-personal and demonstrative pronouns were read faster after nouns that were shorter in length and higher in frequency than after nouns that were longer and less frequent. Importantly, these results also surface when there is a delay between the referent and the pronoun, such that the reduction in reading times cannot merely be taken as a spillover effect from the nouns themselves, but rather can be attributed to the reference resolution process itself, supporting our model of demonstratives as bundlers of conceptual structure as opposed to simple linguistic anaphora devices. In addition to this, these findings provided the baseline for Experiments 2a and 2b, which asked whether "that", and not "it", is uniquely sensitive to higher-level conceptual features of its referent.

Sensitivity to conceptual properties
The second part of our hypothesis, that the bare demonstrative "that" tends to bundle up complex conceptual structures, whereas "it" does not necessarily do so, results in two predictions: First, "that" should result in longer reading times throughout compared to "it", and second, "that" should be uniquely sensitive to referential complexity.

Wittenberg et al.
The first prediction was confirmed in all four experiments: The demonstrative "that" resulted in longer reading times than "it", and this main effect was robust, stable, and replicable. This pattern is very likely not due to orthographic differences, since we also analyzed the data residualized over the length of the pronouns, and still found this effect. On the contrary: Since both forms (non-personal pronouns and demonstratives) are on the very high end of the lexical frequency spectrum, finding any effect on the pronoun region is striking, given that due to the word frequency and the word length effect, reaction times to short, high-frequency words tend to stick to floor level (see e.g., Morton, 1970; Kuperman, Drieghe, Keuleers & Brysbaert, 2013; Dirix, Brysbaert & Duyck, 2019, and others for data from many behavioral paradigms). Thus, these results are a powerful demonstration of the different reading behaviors induced by non-personal pronouns and demonstratives.
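The text does not spell out how the length residualization was carried out. One common approach, sketched below in Python under that assumption, is to regress (log) reading times on word length by ordinary least squares and to analyze the residuals instead of the raw values. The function name and the toy numbers are hypothetical.

```python
def residualize(lengths, log_rts):
    """Return residuals of log_rts after a least-squares fit on word length."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(log_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("word lengths must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, log_rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus the value predicted from length alone
    return [y - (intercept + slope * x) for x, y in zip(lengths, log_rts)]


if __name__ == "__main__":
    # Made-up word lengths (characters) and log reading times
    print(residualize([2, 4, 6, 8], [5.0, 5.5, 6.2, 6.1]))
```

In practice such a regression would typically be fit over many words of varying length (for example, all words read by a participant), since fitting it on the two pronouns alone would absorb the pronoun contrast itself.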
To test the second prediction, we conducted Experiments 2a and 2b, which manipulated potential referents' conceptual complexity. Specifically, we manipulated event complexity: We compared sentences with more complex events to sentences with less complex events.
We expected the demonstrative "that" to pattern differently from "it" as follows: Based on prior research, and our own previous data, we expected an interaction such that "that" would be read faster in conditions with more complex events than in conditions with simpler events. This is because "that" tends to refer to the event, and events have been shown to be more easily retrieved the more complex they are (Hofmeister, 2011; see below). 16 While this second prediction was not borne out, the first was, showing that demonstratives are processed differently from non-personal pronouns, affecting reading times overall.

Our results in light of previous literature
Let us briefly compare our results to other experiments on antecedent frequency effects on pronoun processing, because our results may, at first blush, seem surprising given prior work: Several recent studies have reported that less frequent antecedents lead to faster reading times at the pronoun (van Gompel & Majid, 2004), or to no reliable differences at the pronoun (Egusquiza et al., 2016; Lago, 2014). However, each of these studies is crucially different from ours in several aspects. First and foremost, van Gompel & Majid (2004) as well as Lago (2014) used animate antecedents ("the arsonist/the criminal"), and likely more importantly, unambiguous possessive determiners ("his bag") to investigate effects of frequency. Second, they measured effects in eyetracking-while-reading paradigms. And third, they used frequency as a proxy for saliency, on the assumption that all low-frequency items are more salient than high-frequency items.
In contrast, we used a standard masked self-paced reading paradigm, and measured reading times at a bare (non-human-referring) pronoun which was, in principle, compatible with at least two different referent types: the sentential object alone, such as "forest" or "jungle", or the whole event ("The hikers entered/explored the forest/the jungle"). We also used items that were close synonyms, such as "forest" and "jungle", and not as conceptually far apart as many of van Gompel & Majid's (2004) as well as Lago's (2014) stimuli (pairs in these papers were, for instance, "student" vs. "vagrant", or "doctor" vs. "envoy", which may have entailed not only a differential in frequency, but also in register, pragmatics, and semantic associations and features). In light of these differences, direct comparisons between experiments are presumably not meaningful. 17
In contrast, Hofmeister (2011) explicitly manipulated conceptual complexity. He found that people were faster reading "banned" in sentences like (17b) than in sentences like (17a); that is, the more complex NP "alleged Venezuelan communist" in (17b) was easier to retrieve from working memory and integrate in the argument structure than the less complex NP "communist" in (17a):
(17)
a. It was a communist who the members of the club banned from ever entering the premises.
b. It was an alleged Venezuelan communist who the members of the club banned from ever entering the premises.
In Hofmeister's (2011) stimuli, 'alleged Venezuelan communist' is indeed semantically richer than 'communist', but it is also a longer, more complex noun phrase (and retrieval effects could potentially be due to longer encoding time, Karimi, Diaz & Wittenberg, 2020); whereas in our study, the only variation between the verbs was one of explicitly controlled conceptual complexity. Hofmeister's Experiment 2, which replaced semantically specific referents like "soldier" with non-specific referents like "person" and is in some sense similar to ours, found only marginally significant retrieval effects.
Even without an effect of conceptual complexity, however, our data can be taken as evidence that "that" and "it" result in significantly different processing. We propose a model of how a bundling process may be triggered by demonstratives like "this" or "that", and we view this proposal as an extension and unification of approaches to demonstratives that have described their functions in discourse from the perspective of information structure (Ariel, 2001; Gundel et al., 1993; Strauss, 2002), or from the perspective of anaphora (Çokal, Sturt, & Ferreira, 2016).
We must stress that the form-specific preferences and mechanisms that we propose here are not deterministic: In English at least, both "it" and "that" can refer to both objects and events. Furthermore, both "it" and "that" can be temporarily ambiguous: "it" could be an expletive ("It is raining"), and "that" could be a demonstrative determiner ("that macaron"; although see Strauss, 2002, for corpus data on how "that" is more than twice as often used on its own as a bare demonstrative pronoun, as opposed to as a modifier to an NP). It seems unlikely that these temporary ambiguities could explain away the results we observed, but we do think that our studies should be extended, possibly into other languages.
Languages other than English also often show a contrast between simple pronouns and demonstrative pronouns, as extensive fieldwork has shown (e.g. Diessel, 1999; Dixon, 2003; Givón, 1978; Himmelmann, 1996); and it must be assumed that they slice the conceptual pie between object and event reference in interestingly different ways than English does (for instance, see Bosch et al., 2003; Grosz, 2018; Kaiser, 2011; and many others for German d-pronouns). To take an example that has caught the attention of psycholinguistic research, Kaiser & Trueswell (2008) investigated the processing of simple and demonstrative personal pronouns in Finnish.
Their data indicate that these types of pronouns exhibit different form-specific constraints: Whereas the personal pronouns were sensitive to grammatical role, the demonstratives were sensitive to both information structure and grammatical role. Little is known so far about event or object reference in Finnish, however. It will be interesting to see whether all languages have elements that can function as bundlers, and if so, whether demonstratives (or their equivalent) cross-linguistically show this tendency to be able to refer to wider conceptual structures than simple pronouns can. 18
Other open questions concern the limits of bundling. If it is true that bundling is a conceptual process, it should operate under the same working memory constraints as operations on other cognitive units (Baddeley, 2012). In recent years, there have been many promising attempts at integrating research on working memory and language processing. Lewis & Vasishth (2005), a prominent example, proposed a content-addressable memory architecture that is integrated within linguistic theory. Our model of the role of the bare demonstrative "that" as a universal bundler fits squarely within this theory, while also accounting for the non-linguistic, conceptual content that demonstratives bundle up for use by linguistic means.
In sum, in this paper we presented data from two English self-paced reading studies, each replicated once, showing that demonstratives are processed differently from non-personal pronouns, and that this affects reading patterns throughout the whole sentence. These data can be seen as initial evidence supporting a new, unified model of reference by demonstratives as a process of conceptual bundling, with demonstratives as operators on the interface of language and broader cognition.