An electrophysiological investigation of referential communication

A key aspect of linguistic communication involves semantic reference to objects. Presently, we investigate neural responses at objects when reference is disrupted, e.g., “The connoisseur tasted *that wine…” vs. “…*that roof…”. Without any previous linguistic context or visual gesture, use of the demonstrative determiner “that” renders interpretation at the noun incoherent. This incoherence is not based on knowledge of how the world plausibly works but instead is based on grammatical rules of reference. Whereas Event-Related Potential (ERP) responses to sentences such as “The connoisseur tasted the wine…” vs. “…the roof…” would result in an N400 effect, it is unclear what to expect for doubly incoherent “…*that roof…”. Results revealed an N400 effect, as expected, preceded by a P200 component (instead of the predicted P600 effect). These independent ERP components at the doubly violated condition support the notion that semantic interpretation can be partitioned into grammatical vs. contextual constructs.


Introduction
Reference is a key aspect of human communication. That is, both speakers and hearers (and/or writers and readers) assume a shared common ground of objects during communication (Stalnaker, 1978; Heim, 1982). Languages use numerous grammatical devices for the purposes of referring to both animate and inanimate objects (Roberts, 2002). For example, English uses names ("Pat is an excellent art teacher"), pronouns ("She loves her easel") and demonstrative determiners ("That easel has been in her studio forever") as tools of reference (King, 2006). These objects (names, pronouns, nouns with demonstrative determiners) are assumed to exist for both speaker and hearer; that is, they are "presupposed" to exist.
Reference can be disrupted if a term is used that is unknown or unfamiliar to the hearer/reader in the context of utterance. For example, if "Pat" is mentioned in conversation and she is unknown to the listener, communication is disrupted. Next, starting a conversation with a pronoun, as in "She loves her easel", without any previous context to indicate who the pronoun "she" would refer to, also results in incoherence. Similarly, in the absence of context or any visual/gestural cue of demonstration, using the determiner "that" in "That easel has…" would be difficult to understand, resulting in incoherence. We note that the latter two devices for reference, pronouns and determiners, are closed class elements (also known as grammatical or function words) that contribute to a sentence's grammatical meaning. Thus, semantic context can refer to a shared discourse or text between individuals; if an anaphoric term (e.g., s/he, they; that in that easel; that car) is used that has no explicit previous mention (i.e., no antecedent), reference is disrupted. On a wider scale, semantic context can also refer to a shared understanding of how the world plausibly works. This shared understanding of how the world works is dependent on experience (and is independent of grammatical function words and/or structure). For example, "The connoisseur tasted the roof" is a nonsensical sentence vs. "The connoisseur tasted the wine". In the former sentence, "roof" does not violate grammatical expectations (it is a concrete noun in direct object position, as expected), nor is it a violation of presupposed existence. Instead, the sentence is incoherent because "roof" does not fit the immediate sentence context. In other words, it violates our shared understanding of what a connoisseur might taste. The word is contextually implausible. For coherence to ensue, massive contextual adjustments would need to be made (e.g., where "the roof" would be the name of a brand of wine).
In ERP language studies, both kinds of aforementioned incoherence are typically labeled as 'semantic' violations; however, it is clear from the descriptions above that the violations arise from independent modules of knowledge of language. The lexical-experience based violation is derived from conceptual word-level semantics or world knowledge. The violation of referential meaning is derived from compositional semantic knowledge, i.e., grammatical knowledge.
In the present work, we asked the question: what is the neural response to sentences that are doubly incoherent according to both experience in the world, as well as grammatical devices of reference?
We expect that when sentences are incoherent due to experience in the world (e.g., "The connoisseur tasted the roof" vs. "The connoisseur tasted the wine"), an N400 response would be elicited. Since Kutas & Hillyard (1980; 1983), the N400 component has been hailed as the neural signature of (lexical) 'semantic anomaly'. This negative-going waveform generally peaks around 400 ms post-stimulus onset (Kutas & Federmeier, 2011), as in "Jai spread the warm toast with #socks/butter."
Predictions for sentences that exhibit anomaly at both the lexical level ("…tasted the #roof") and grammatical level ("…tasted *that #roof") are less clear. However, based on previous work in our lab where reference was disrupted due to linguistic semantic context, we predict a P600 response (discussed below), in addition to an N400 for such double violations.
In previous work (Dwivedi et al., 2006; see also replication in Dwivedi et al., 2010) we examined ERP responses to incoherent sentences embedded in context. In that 2x2 study, the semantically anomalous condition (1c below) included the pronoun "it" which did not have an appropriate referent in the previous sentence (see Karttunen, 1976). That is, we compared two-sentence discourses such as in (1) below:
(1) a. John is considering writing a novel. It might end quite abruptly.
    b. John is reading a novel. It might end quite abruptly.
    c. John is considering writing a novel. #It ends quite abruptly.
    d. John is reading a novel. It ends quite abruptly.
In (1a), the use of a modal auxiliary "might" in "It might end…" indicates that the hypothetical nature of the first sentence "John is considering…" is continued in the second sentence. As a result, the pronoun "it" can successfully co-refer with its antecedent, (hypothetical) "a novel" in the first sentence. In contrast, in (1c), the lack of a modal auxiliary in "It ends…" means that the sentence (and therefore the pronoun) is not hypothetical, in contrast to its antecedent. This results in referential disruption or anomaly, since now the anaphoric pronoun "it" is asserted to exist but is linked with the antecedent "a novel" which was hypothetical. Results showed that this intuitive contrast for sentence (1c) (vs. its control 1d) did result in an ERP contrast: a frontal P600 effect (Osterhout & Holcomb, 1992; Hagoort et al., 1993; Kaan et al., 2000; Kaan & Swaab, 2003a, 2003b); no such P600 effect was in evidence for sentence (1a) vs. its control (1b). This positive-going waveform, usually peaking in the 600 ms range, had been typically associated with difficulty in structural integration of a word into a sentence (e.g., "The spoilt child *throw the toys; The broker persuaded *to conceal the transaction…"). We interpreted our finding as a 'semantic' P600 effect (see also Aurnhammer et al., 2021; Bornkessel-Schlesewsky & Schlesewsky, 2008; van Herten et al., 2006; Kuperberg, 2007; among others), where in our case, the structure that was difficult to integrate was the (semantic) discourse representation structure (Dwivedi, 1996; Roberts, 1989). Moreover, given that we found frontal positivity, which is associated with revision of structure (Kaan et al., 2000; Kuperberg et al., 2020), we speculated that the frontal positivity was a re-interpretation of the noun phrase "a novel". That is, rather than as a non-specific, hypothetical novel (that does not yet exist), perhaps the noun phrase was re-interpreted as a specific novel, e.g., "John is considering writing a (specific) novel. It ends quite abruptly (given what we know about John and his tendencies)". Given that the meaning violation arose via grammatical rules governing the interpretation of pronouns and other closed-class elements such as modal auxiliaries (might, should, would, etc.) in discourse structure (Dwivedi, 1996; Kaplan, 1978; Karttunen, 1976; Roberts, 1996; Heim & Kratzer, 1998), we called this a violation of compositional semantics, or a semantic P600 effect (see also van Herten et al., 2006; van Berkum, 2009; among others for other claims regarding semantic P600 results).
Given these previous findings, we hypothesize a similar semantic P600 effect in the current experiment, when reference is again incoherent, albeit now due to the anaphoric nature of the determiner "that". Thus, in addition to N400 ERP responses expected at semantically anomalous critical words such as roof (vs. wine), as in (i) "The connoisseur tasted the #roof on the tour" vs. (ii) "The connoisseur tasted the wine…", we compared responses at these conditions to sentences using demonstrative determiners such as (iii) "The connoisseur tasted *that #roof…" vs. (iv) "The connoisseur tasted *that wine…". For the double violation condition *that #roof… we expected an N400-P600 effect at the critical word. Following the logic above, for the condition *that wine… a P600 effect was also expected at semantically congruent "wine".
On the other hand, we note that in our previous work, we examined pronouns embedded in sentences in discourse, in contrast to the current experiment which used single sentences only. In addition, we note that the current experiment is exploratory since little is known about whether empirical effects are found (and if so, under what conditions) when determiners, such as the demonstrative "that", are used with nouns that lack appropriate referents in context, vs. the definite determiner "the" (see Murphy, 1984; Anderson & Holcomb, 2005 for experimental work on "the"). Most cognitive neuroscientific investigations of anaphoric elements have focused on pronouns with and without appropriate antecedents (see, among others, Filik et al., 2008; Hammer et al., 2005; Osterhout and Mobley, 1995; Nieuwland and van Berkum, 2008). We build on those findings here by examining potential interpretive costs at the direct object position when it is preceded by the demonstrative vs. definite determiner. Differing empirical effects are expected at the direct object "roof" vs. "wine" when preceded by "that" vs. "the" because noun phrase interpretation is compositional. That is, the interpretation of the sentence "The apples on the table are delicious" differs from "All apples on the table are delicious" due to meaning differences associated with "the apples" vs. "all apples" (Chierchia & McConnell-Ginet, 1990).
Thus, in this within-participants study, we examined two independent variables, direct object type (Plausible vs. Implausible) and determiner type (Demonstrative vs. Definite). See Table 1 for a list of the four conditions.

Participants
37 Brock University undergraduates were recruited and either paid for their participation or received partial course credit. All participants were native, monolingual speakers of English, had normal or corrected-to-normal vision and were right-handed, as assessed by the handedness inventory. No participants reported any neurological impairment, history of neurological trauma, or use of neuroleptics. Four participants with comprehension question accuracy for filler items (discussed below) at less than 85 % were excluded from analysis, leaving 33 eligible participants (25 females; mean age = 19.6 years; ranging from 18 to 25 years).
This study received ethics approval from the Brock University Bioscience Research Ethics Board (BREB) prior to the commencement of the experiment (REB 13-282). Written, informed consent was received from all participants prior to their participation in the experiment.

Materials
Stimuli used here are described in Dwivedi & Selvanayagam (2021). In brief, these stimuli consisted of 160 critical items in four conditions (see Table 1) counterbalanced across four lists and 170 filler items. Critical items were simple active sentences with an animate subject (e.g., the connoisseur), an active past-tense verb (e.g., tasted), a determiner (definite: the vs demonstrative: that), an inanimate direct object (plausible: e.g., wine vs implausible: e.g., roof) and a prepositional phrase (e.g., on the tour). Direct objects were not repeated, and word length and frequency for direct objects in plausible vs implausible conditions were controlled for.
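As a rough illustration of the counterbalancing described above, the sketch below rotates the 160 critical items through the four conditions across four lists, so that each list contains every item once and 40 items per condition. The Latin-square rotation and all names (`CONDITIONS`, `build_lists`) are assumptions made for illustration; the text does not specify how the assignment was implemented.

```python
from collections import Counter

# The four conditions of the 2x2 design (determiner type x direct object type).
CONDITIONS = [
    ("definite", "plausible"),         # the wine
    ("definite", "implausible"),       # the #roof
    ("demonstrative", "plausible"),    # *that wine
    ("demonstrative", "implausible"),  # *that #roof
]


def build_lists(n_items=160, n_lists=4):
    """Assign each item one condition per list via Latin-square rotation (assumed scheme)."""
    lists = []
    for list_idx in range(n_lists):
        assignment = {}
        for item in range(n_items):
            # Shift the condition by one step for each successive list.
            assignment[item] = CONDITIONS[(item + list_idx) % len(CONDITIONS)]
        lists.append(assignment)
    return lists


lists = build_lists()
per_condition = Counter(lists[0].values())
```

Under this scheme a participant who sees one list never encounters the same item in two conditions, while across the four lists every item appears in every condition.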
No tasks were associated with critical trials. 170 filler sentences were included in order to reduce predictability. Comprehension questions were asked at 125 filler items (38 % of all trials) consisting of superficial Yes/No or True/False questions. An example filler item and corresponding question is given below:
(2) After her surgery Anita slept for two days.
    Q: Anita had a vacation. 1) True 2) False.

Offline plausibility ratings
We evaluated the plausibility of our critical materials by conducting a web-based norming study using Qualtrics software, Version (March 2020) of the Qualtrics Research Suite (Qualtrics, 2020). Participants were asked to rate sentences in terms of plausibility on a scale from 0 (very implausible) to 5 (neutral) to 10 (very plausible), in steps of 0.1. There was no time pressure as 16 sentences were presented on each webpage, for a total of ten pages. The 160 critical items were presented in eight pseudorandomized, counterbalanced lists such that half of the critical items were presented in each list and each participant only saw each item once. 80 filler items were presented in all lists, for a total of 160 items in each list.
109 participants completed the study, of which 66 met the eligibility criteria described above (as outlined in Section "Participants"). Twenty-five participants were excluded for having a mean plausibility rating lower than seven on filler items (all of which were perfectly plausible). Data from the remaining 41 participants (36 females; mean age = 18.73; ranging from 18 to 22) were used to calculate plausibility ratings.

Electrophysiological measures
Electroencephalographic recordings were made using a 64-channel Active Two BioSemi system (BioSemi, Amsterdam). Data were sampled at a rate of 512 Hz and digitized with a 24-bit analog-to-digital converter. Two infinite impulse response filters were applied at 12 dB/octave: a bandpass filter from 0.1 to 100 Hz used to remove high and low frequency noise and a bandstop filter from 59 to 61 Hz used to remove 60 Hz noise. All electrodes were re-referenced to the averaged mastoids for analysis. Prior to segmentation, eye movement artifacts and blinks were filtered from the data using a spatial ocular artifact correction algorithm (Pflieger, 2001) available in the EMSE v5.5.1 software (Cortech Solutions, 2013). Due to equipment malfunction, data from electrode Fp1 were lost in some participants. A spatial interpolation filter (Cortech Solutions, 2013) was applied for this electrode, for all participants. Epochs were created from an interval 200 ms prior to stimulus onset to 1200 ms after stimulus onset.
… requiring experimental participation for course credit. Data collected from ineligible students are later eliminated from data analysis (presently, 38 bilingual/multilingual students, 4 students diagnosed with neurological disorders, and 1 student under 18 years of age were ineligible. Thus, a total of 43 students were excluded; as such, their data were not included for analyses).
V.D. Dwivedi and J. Selvanayagam

Procedure
Participants were tested individually in one session of approximately three hours. In each session, participants completed a short questionnaire regarding reading habits, a handedness inventory (Briggs & Nebes, 1975) and the Positive and Negative Affect Schedule (PANAS, Watson et al., 1988) before the application of the electrodes.¹¹ Following a practice session of eight trials, each participant completed the experimental trials in six blocks of 55 trials, with rest periods between each block. Each participant saw one of four pseudorandomized, counterbalanced lists consisting of 330 items. The pseudorandomized lists were created using the Mix utility (van Casteren & Davis, 2006) such that the first three items and last two items of each block were always filler sentences; no more than two critical items were presented sequentially and items from the same condition were never presented sequentially. Sentences were presented in the centre of the computer monitor in light grey, 18-point Courier New font on a black background. See Fig. 1 for a sample trial procedure.
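The ordering constraints on each block can be expressed as a simple validity check. This is a hedged sketch, not the logic of the Mix utility itself; the trial encoding (a list of labels, with `"filler"` for filler sentences and a condition name for critical items) and the function name `block_is_valid` are assumptions for illustration.

```python
def block_is_valid(block):
    """Check one block against the ordering constraints described in the text.

    `block` is a list of labels: "filler" or a critical condition name
    (hypothetical encoding chosen for this sketch).
    """
    # The first three and last two items must be fillers.
    if any(label != "filler" for label in block[:3] + block[-2:]):
        return False
    run = 0  # length of the current run of consecutive critical items
    for prev, cur in zip([None] + block[:-1], block):
        if cur == "filler":
            run = 0
            continue
        run += 1
        # No more than two critical items in a row.
        if run > 2:
            return False
        # Items from the same condition are never adjacent.
        if prev == cur:
            return False
    return True
```

A generator of candidate orders could call this check repeatedly and keep the first shuffle that passes, which is one simple way to obtain a pseudorandomized block.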
Each trial began with the participant being prompted to press a button on the response pad, then the word "Blink" was presented for 1000 ms, followed by a fixation cross (+) for 500 ms. After a variable inter-trial interval lasting between 200-400 ms, the sentence was presented one word at a time with a stimulus onset asynchrony (SOA) of 600 ms and an inter-stimulus interval (ISI) of 200 ms. 125 filler items were followed by comprehension questions after the last word of the sentence, to which participants were asked to press a "1" or "2" key corresponding to answers on the screen using the response pad. Response time and accuracy were recorded for each response. The next trial began following another inter-trial interval lasting between 500-1000 ms.
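To make the word-by-word timing concrete, the sketch below lays out word onsets for one sentence under the reported SOA of 600 ms and ISI of 200 ms, which together imply that each word is visible for 400 ms. The function name and the returned structure are illustrative assumptions and do not describe the presentation software actually used.

```python
SOA_MS = 600  # onset-to-onset interval between successive words
ISI_MS = 200  # blank interval between one word's offset and the next onset
WORD_DURATION_MS = SOA_MS - ISI_MS  # time each word stays on screen


def word_schedule(sentence):
    """Return (word, onset_ms, offset_ms) triples relative to sentence onset."""
    schedule = []
    for position, word in enumerate(sentence.split()):
        onset = position * SOA_MS
        schedule.append((word, onset, onset + WORD_DURATION_MS))
    return schedule


schedule = word_schedule("The connoisseur tasted that roof on the tour")
```

In this example the critical word "roof" is the fifth word, so its onset falls 2400 ms after the first word appears.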

Behavioural analyses
Comprehension rates for questions at filler conditions were 95.2 % (SD = 2.75 %), indicating that participants were indeed paying attention to sentence materials.

Electrophysiological analyses
The grand average ERPs, time-locked at the position of the critical word #roof vs. wine, are shown for all four conditions, i.e., the/*that wine and the/*that #roof. Fig. 2A shows a clear N400 effect for incongruent "roof" vs. "wine", occurring with typical distribution (see Fig. 2C): maximal over centroparietal sites with a slight right lateralization (Kutas & Federmeier, 2011). Contrary to our a priori hypothesis, no P600 effect emerged at the dual violation condition *that roof.
That is, a positive deflection did not follow the N400 effect but instead preceded it (see Fig. 2B). A P200 effect was elicited before the N400 component for the double violation condition *that #roof but not the #roof or *that wine. Moreover, the N400 effect at the double violation condition *that #roof was attenuated (see Fig. 2A and 2C). In order to ensure that N400 amplitudes were not influenced by the immediately preceding positive-going P200 deflection, we renormalized N400 amplitudes relative to the post-stimulus interval of 100-300 ms (the P200 latency window) after onset of roof/wine¹² (see also Hagoort, 2003, and Carreiras, Vergara & Barber, 2005 for similar analyses). Results below are reported using the renormalized N400 amplitude (see Fig. 3).¹³
Next, we conducted single trial, linear mixed effect regression analyses using the R statistical programming language (v4.2.2) with packages lme4 (v1.1.34, for linear mixed effects regression model fitting) and emmeans (v1.8.8, for Bonferroni corrected pairwise contrasts). Statistical analyses reported below were completed using custom R code, and figures were generated using custom Python code. All materials (stimuli, data, and scripts) associated with this experiment are available at https://gitlab.com/dwivedilab/erp_reference.
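The renormalization of the N400 amplitude can be sketched as follows: take the mean voltage in the 300-500 ms window and subtract the mean voltage in the 100-300 ms window. The sketch assumes a single-channel epoch stored as a plain list of voltages, sampled at 512 Hz and starting 200 ms before word onset as described in the recording section; the helper names and the toy waveform are hypothetical and are not taken from the authors' scripts.

```python
SAMPLING_RATE_HZ = 512
EPOCH_START_MS = -200  # the epoch begins 200 ms before critical word onset


def ms_to_sample(time_ms):
    """Convert a latency relative to word onset into a sample index."""
    return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def window_mean(epoch, start_ms, end_ms):
    """Mean voltage over [start_ms, end_ms) for a single-channel epoch."""
    samples = epoch[ms_to_sample(start_ms):ms_to_sample(end_ms)]
    return sum(samples) / len(samples)


def renormalized_n400(epoch):
    """N400 window mean (300-500 ms) relative to the P200 window (100-300 ms)."""
    return window_mean(epoch, 300, 500) - window_mean(epoch, 100, 300)


# Toy epoch (hypothetical data): +1 µV during 100-300 ms, -2 µV during 300-500 ms.
epoch = [0.0] * ms_to_sample(1200)
for i in range(ms_to_sample(100), ms_to_sample(300)):
    epoch[i] = 1.0
for i in range(ms_to_sample(300), ms_to_sample(500)):
    epoch[i] = -2.0
```

Because the earlier positivity is subtracted out, a negativity of a given size in the 300-500 ms window yields a larger (more negative) renormalized value when it follows a P200 than when it follows a flat baseline.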

100-300 ms
A significant P200 effect was revealed for the double violation condition *that #roof vs. *that wine (and the #roof). The models with the highest order interaction of Object, Determiner, and Anteriority (and Hemisphere for lateral ROIs) significantly improved fit as compared to lower order models for medial, χ²(8) = 17.87, p = 0.022, and lateral ROIs, χ²(12) = 22.89, p = 0.029. Pairwise contrasts revealed a significant increase in mean voltage in the dual violation condition (*that #roof vs. *that wine, Δ = 0.70-1.03 µV) 100-300 ms following critical word onset. This effect was largely observed at medial posterior sites (p's < 0.05) and was marginal (.045 < p < .058) at lateral posterior sites as well as at medial anterior and central sites but not at lateral anterior sites (see Fig. 2B). No such effects were observed in the definite conditions (the #roof vs. the wine, p's > 0.05; see Fig. 2B). In sum, we observed a significant medial posterior P200 effect for the implausible critical word following the demonstrative but not the definite determiner. This effect appears to index the double violation, and critically, this difference is observed in the contrast (*that #roof vs. *that wine) where the baseline is held constant.
¹¹ This questionnaire was employed to ask questions orthogonal to the current paper and is not discussed further. For a thorough account of that question and results, see Dwivedi & Selvanayagam (2021).
¹² That is, we subtracted the mean voltage in the 100-300 ms time window from the 300-500 ms time window to compute a re-normalized N400 amplitude.
¹³ For a discussion of statistical analyses for the N400 effect observed in Fig. 2C, see Supplementary Material S-1.

N400 effect: 300-500 ms
Fig. 3A shows the grand average ERPs, time-locked to the onset of the critical word (#roof vs. wine) at medial and lateral electrode sites, with a post-stimulus baseline of 100-300 ms as opposed to the pre-stimulus baseline of -200 to 0 ms used in Fig. 2A. Visually, it is evident that the N400 effect is typical with respect to both latency and topography (see Fig. 3B). The models with the highest order interaction of Object, Determiner, and Anteriority (and Hemisphere for lateral ROIs) significantly improved fit beyond all other reduced models, medial: χ²(8) = 17.862, p = 0.022, lateral: χ²(12) = 28.248, p = 0.005. Pairwise contrasts here confirmed an N400 effect robustly across both Determiner types, with a Central-Posterior distribution with slight right lateralization. N400 amplitudes were slightly attenuated for the demonstrative condition (medial: Δ = 1.08-1.80 µV, lateral: Δ = 0.175-0.975 µV) and spatially restricted (not significant in the left anterior ROI) as compared to the definite condition (medial: Δ = 1.67-1.97 µV, lateral: Δ = 0.837-1.356 µV, significant in all ROIs).
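The model comparisons reported here appear to be likelihood-ratio tests, whose p-values come from the upper tail of a chi-square distribution. As a quick sanity check on the reported values, the sketch below uses the closed-form tail probability that holds for even degrees of freedom. The function name is an assumption for this illustration, and the authors' own analyses were run in R rather than with this code.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k, P(X > x) = exp(-x/2) * sum over i < k of (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


p_medial = chi2_sf_even_df(17.862, 8)    # text reports p = 0.022
p_lateral = chi2_sf_even_df(28.248, 12)  # text reports p = 0.005
```

Both computed tail probabilities round to the p-values given in the text, which also confirms that the degrees of freedom differ between the medial (8) and lateral (12) comparisons.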

Critical word minus one position, determiner, the vs *that
Next, grand average ERPs, time-locked at the position of the Determiner, or critical word minus one position (the vs. that), are shown at medial sites in Fig. 4. Visual inspection of the waveform reveals a difference in voltage starting at 300 ms which persists until 500 ms. Although it is maximal at right, centroparietal sites, there is no peak as characteristic of a typical N400 component. To investigate these differences, as above, we evaluated linear mixed effect regression models separately for medial and lateral ROIs, omitting the factor of Object type and all associated interactions: medial … While the effect observed here resembles the N400 in timing and topography, the shape of the waveform does not correspond to this component. This negativity likely indexes the word-frequency differences between "the" and "that" (van Petten & Kutas, 1990; van Petten, 1995).

Discussion
In the present study, we were interested in neural responses to words that exhibited dual meaning violations: first, in terms of real-world plausibility, and second, in terms of referential meaning. That is, "The connoisseur tasted *that #roof…" (vs. "…*that wine"; also vs. control condition "…the wine") is incongruent both in terms of contextual plausibility and in terms of rules of reference. Given our previous findings regarding semantic anomaly associated with reference, we predicted an N400-P600 complex. We predicted "roof", an implausible direct object in its immediate sentence context, would elicit an N400 component, and the use of the demonstrative determiner "that" would result in an independent (semantic) P600 effect, since "that" violated discourse structure algorithms regarding semantic reference (as in Dwivedi et al., 2006, 2010).
Our predictions were partially borne out. We did, in fact, see independent neural responses to the combined violation condition, *that #roof. However, instead of an N400 followed by a P600 component at the critical word "roof", we observed a P200-N400 complex. Given the clear implausibility of tasting a roof vs. wine, the N400 was an expected neural response at this condition. However, the P200 was not. Below, we discuss the cognitive significance of the P200-N400 complex and then conclude with why the P600 was not observed.

P200, attention and the algorithm of meaning
The P200 component has been associated with allocation of attention, where stimuli that are attended to yield larger P200 components vs. unattended stimuli (Hillyard & Münte, 1984; Luck & Hillyard, 1994). With respect to language, studies with P200 effects often discuss this ERP component in terms of attention and salience of the relevant linguistic cue.
In a recent ERP language experiment, Vergis et al. (2020) showed that when participants listened to sentences that were spoken with either rude or polite voices, P200 effects were found at sentence-final words in the rude prosody conditions. The researchers hypothesized that the P200 effect found for rude-sounding intonation reflected greater attention by listeners since that cue was salient to the task at hand; during the experiment, participants had to rate how likely it was that someone might comply with rude vs. polite requests by the speaker. Thus, the rude vocal cue was more noticeable since it was germane to the task of deciding whether someone might comply or co-operate with the speaker. In another experiment, Zhao et al. (2021) examined scalar implicature sentences in Mandarin and showed that focus conditions elicited larger P200 components. They reasoned that focus conditions would require more attentional resources than non-focused conditions for interpretation.
On a view where the P200 indexes attention, the results for the current experiment become straightforward. As mentioned previously, meaning is compositional such that the interpretation of a sentence varies when the noun is preceded by a different determiner (e.g., "the apples…" vs. "all apples…"). When "roof" was perceived in the sentence containing "connoisseur… taste…", it was clearly not expected or associated with the local sentence context (Dwivedi, Goertz, & Selvanayagam, 2018), in contrast to "wine", which fit perfectly. The extra effort required in retrieving the meaning of "roof" vs. "wine" (Aurnhammer et al., 2021; Federmeier & Kutas, 1999) would necessarily result in extra attentional resources for interpreting "that roof", indexed here by a P200-N400 complex. No evidence of extra attention or salience is there for "wine" due to its 'good enough' fit with the local context (see more …).
We note that a P200-N400 was also observed in another ERP language experiment (Carreiras et al., 2005) and that a clear (though speculative) link can be made between those previous findings and ours, especially on a view that takes difficulty/ease of lexical retrieval into account. Carreiras et al. (2005) were interested in whether and how sublexical rules of syllabification applied to single words during reading. In two experiments, syllable boundaries were marked by colour boundaries for (both high and low frequency) words and pseudo-words in a lexical decision task. These colour boundaries either matched (e.g., "casa"), or did not match (e.g., "casa") syllable boundaries. When a mismatch between syllable and colour boundaries occurred, a P200-N400 complex emerged for low frequency and pseudo-words, but not high frequency words. Presumably, interpreting low frequency and pseudo-words required more cognitive effort resulting in more attention, the same idea as proposed above. This increase in salience would have elicited the P200 effect for low-frequency words, vs. high frequency words. Regarding the lack of a P200 effect for incongruently marked high frequency words, they indicated that "[s]yllabic parsing may routinely occur for high-frequency words but may be quickly overshadowed by the fast lexical access to the word itself," (Ibid., p. 1811). Without extra attentional resources, the effort required for syllabic parsing would not occur. Similarly, in the present experiment, interpreting "wine" in the sentence did not require extra attentional resources due to its 'good enough' (Townsend & Bever, 2001; Ferreira, 2003; Dwivedi, 2013) fit with the immediate context, resulting in no P200 effect.

Addendum: P200 and presupposition
We note here that P200 effects were also found in a series of ERP language studies by Regel and colleagues (2010, 2011, 2014) that examined comprehension of ironic vs. literal sentences. That is, sentences, when interpreted on their ironic vs. literal interpretation, elicited P200 effects (exhibiting similar parietal topography as reported herein) at sentence-final words when preceded by appropriate context. For example, P200 effects were shown in sentences such as "You should take a break" only on the ironic interpretation (where the context sets up the addressee as someone who has barely worked at all) vs. the literal interpretation (the context is about an addressee who has worked for several hours). Ironic sentences have a presuppositional meaning, in that they require context for interpretation (i.e., the sentence expresses the opposite of its literal meaning, which can only be derived by context; see Bollobás, 1981; Schlöder, 2017). Thus, another related way of interpreting the P200 effect found in the current experiment would be that, provided attentional resources are available, the P200 marks presupposition. If so, then the finding of this early ERP response emerging before the N400 would suggest that the interpretation of presupposition occurs at the earliest stages of nominal processing. In fact, one could further speculate that the P200 component is a neural signature of discourse linking (Pesetsky, 1987), and consider that recent P300 findings by Jouravlev et al. (2016) examining presuppositional failure consist of the same component, or family of components (relatedly, see also Leckey and Federmeier, 2020). We leave these questions for further research.
Furthermore, we note that P200 effects found for sentences in visual field studies conducted by Federmeier and colleagues (2002, 2005) are likely not related to the current findings. First, larger P200 effects were found for expected vs. unexpected words, which is the opposite of our findings. Moreover, the aforementioned studies manipulated highly constraining vs. weakly constraining sentences, which was not an aspect of the present design.

The lack of a P600
We did not observe the P600 effect at *that roof, as predicted, given our earlier studies (Dwivedi et al., 2006; 2010) examining referential anomaly. Perhaps differences between our preceding work and the present experiment could explain why. First, our previous work examined anaphoric dependencies between two sentences vs. the current single sentence study. Second, it was suggested in the previous studies that the observed semantic P600 might reflect the cost of a cognitive procedure of revision. That is, for co-reference to occur between the pronoun "it" and the (hypothetical) antecedent "a novel", interpretation of the antecedent in the first sentence would need to be adjusted. In the current experiment, the only context available is the single sentence, and it cannot be revised in any way to help with interpretation. That is, when "roof" is perceived after *that, there is no probable or possible adjustment to be made. This could explain the difference in the ERP components: different cognitive procedures are at play.
A difference in cognitive procedures would also explain the lack of an empirical effect at *that wine… That is, there is no previous context to adjust to accommodate the presuppositional meaning associated with that wine; and/or if any adjustments are made, the cost of updating the common ground is minimal (see von Fintel, 2008).This would be because "wine" is a 'good enough' fit with the internal sentence context and few resources (if at all) would be required for accommodation (Ferreira, 2003;Chwilla & Kolk, 2005;Dwivedi, 2013).
Interestingly, our off-line findings did show empirical differences between the/that wine sentences.This is likely due to the differences in methodology of the norming study vs. ERP methods.That is, the off-line norming study displayed the entire sentence all at once, and participants were tasked with rating sentences for naturalness, under zero-time pressure.This contrasts with the ERP experiment, where no task was associated with critical sentences (Kaan & Swaab, 2003a;Kolk et al., 2003;Schacht et al., 2014).In addition, the presentation was timed, using standard RSVP methods.As a result, participants would not have the opportunity to look back and review the sentence for interpretation, resulting in potentially different interpretive processes.

Conclusion
In sum, our findings support the notion that meaning can be derived from separate sources of information, namely contextual heuristics and grammar, as indexed by independent ERP components. We note that although we did not find a semantic P600 effect, we did find a P200-N400 complex for the combined violation condition, supporting the independence of meaning derived by context vs. grammar (in contrast to Hagoort et al., 2004). We construed the P200 effect as a marker of increased attention, perhaps due to the increased effort associated with interpreting an implausible noun. We further speculated that the P200 effect could be a marker of presupposition, as argued for the P300 component found in Jouravlev et al. (2016). Finally, we note that our findings are consistent with our sentence processing model of Heuristic first, algorithmic second (Dwivedi, 2013), as well as the Retrieval-Integration account of language processing (Aurnhammer et al., 2021).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Condensed sample trial for the current paradigm. Time values above the screen represent the duration of stimulus presentation, and time values below represent inter-stimulus intervals. The "Ready?" slide requires user input to proceed and was occasionally preceded by a comprehension question.

Fig. 2. Grand average ERPs at the critical word (wine/#roof) with a 200 ms pre-stimulus baseline. (A) ERP waveforms at medial electrode sites for all four conditions: the wine (solid black), *that wine (dotted black), the #roof (solid red), *that #roof (dotted red). Topographic plots of mean amplitudes (µV) during time windows for P200 (100-300 ms) in (B) and N400 (300-500 ms) in (C) after stimulus onset at the critical word. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. Grand average ERPs at the critical word (wine/#roof) with a 100-300 ms post-stimulus baseline to compensate for differences due to the preceding P200 effect. (A) ERP waveforms at medial electrode sites for all four conditions: the wine (solid black), *that wine (dotted black), the #roof (solid red), *that #roof (dotted red). (B) Topographic plot of mean amplitude (µV) during 300-500 ms time window for renormalized N400. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
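
The two baselines named in the captions for Figs. 2 and 3 can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the 4 ms sampling step, the two-channel array, and the simulated data are all assumptions made for illustration. Only the window boundaries (200 ms pre-stimulus, 100-300 ms, 300-500 ms) come from the captions.

```python
import numpy as np

def window_mask(times, window):
    """Boolean mask for samples with start <= t < end (times in ms)."""
    start, end = window
    return (times >= start) & (times < end)

def baseline_correct(erp, times, window):
    """Subtract each channel's mean amplitude within `window` from that channel."""
    mask = window_mask(times, window)
    return erp - erp[:, mask].mean(axis=1, keepdims=True)

def mean_amplitude(erp, times, window):
    """Mean amplitude per channel within `window`."""
    return erp[:, window_mask(times, window)].mean(axis=1)

# Hypothetical data: 2 channels, one sample every 4 ms from -200 to 796 ms.
times = np.arange(-200, 800, 4)
rng = np.random.default_rng(0)
erp = rng.normal(size=(2, times.size)) + 5.0  # arbitrary offset, simulated noise

# Standard 200 ms pre-stimulus baseline (as in the Fig. 2 caption).
pre = baseline_correct(erp, times, (-200, 0))

# 100-300 ms post-stimulus baseline (as in the Fig. 3 caption), which removes
# amplitude differences already present in the P200 window.
post = baseline_correct(erp, times, (100, 300))

# Renormalized N400 measure: mean amplitude in the 300-500 ms window.
n400_renormalized = mean_amplitude(post, times, (300, 500))
```

Under this sketch, the renormalized N400 value for each channel equals its raw 300-500 ms mean minus its raw 100-300 ms mean, which is why a post-stimulus baseline factors out a preceding P200 difference.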

Fig. 4. Grand average ERPs at the determiner (critical word minus one; the/*that) with a 200 ms pre-stimulus baseline. (A) ERP waveforms at medial electrode sites for both conditions: the (solid black) and *that (dotted black). (B) Topographic plots of mean amplitude (µV) 300-500 ms after stimulus onset at the determiner to investigate the distribution of the observed effect.

Table 1
Experimental conditions with example stimuli. Critical words are in bold and underlined.