Is Passive Priming Really Impervious to Verb Semantics? A High-Powered Replication of Messenger Et al. (2012)



Introduction
There can hardly be a question more central to the cognitive sciences than that of how language, and in particular grammatical structure, is represented in the brain. To frame the question in more concrete terms, consider a sentence such as A witch is being hugged by a cat (a sentence that the vast majority of English speakers have never previously encountered). What are the syntactic representations that allow any English speaker to produce (and, indeed, comprehend) this sentence?
One class of approaches, which we term semantics-based approaches, holds that speakers produce and comprehend such utterances using constructions: pairings of forms and functions that they have acquired by abstracting across input utterances (e.g., Goldberg, 1995, 2006; Langacker, 2008). For example, the utterance A witch is being hugged by a cat might be formed using the construction (approximately speaking) [PATIENT] [BE] [ACTION] by [AGENT].
A rival class of approaches, which we term pure syntax approaches, holds that there exists a "syntactic level of representation [that] includes syntactic category information but not semantic information…or lexical content" (Branigan & Pickering, 2017, p. 8). This system "operates in a particular way, manipulating categories via their form, and not their meaning" (Adger, 2017, p. 29). For example, the utterance A witch is being hugged by a cat might be formed using the syntactic representation (again, very approximately speaking) [S [NP]
Of course, pure syntax approaches do not assume that semantic information is absent from the grammar altogether. On the contrary, they assume that syntax and semantics are intimately linked, and different individual accounts make different assumptions regarding the nature of these links. For example, one possible interpretation of accounts such as Pickering & Branigan (1998) is that encountering a particular verb may activate not only the relevant lexical node, but also lexical nodes for verbs with similar meanings. Similarly, Cai et al. (2012) advocate, and present evidence for, an account under which thematic role-syntax mappings (e.g., THEME=SUBJECT) are stored, and yield priming effects.
At least one such account, however, is (at least on our reading) unambiguous in its claim that "syntactic representations do not contain semantic information" (Branigan & Pickering, 2017, p. 8). In particular, Branigan & Pickering (2017, p. 2) claim that the results of syntactic priming studies, including Messenger et al. (2012), are "consistent with priming of representations that are specified for syntactic information but not semantic, lexical, or phonological information".

Evidence for Pure-syntax Representation of the Passive
A key testing ground for this debate has long been studies of the passive (mainly, but not exclusively, the English passive). In addition to the study of Bock & Loebell (1990; but see Ziegler et al., 2019), Branigan & Pickering (2017, p. 16) cite as a key piece of evidence for their approach the syntactic priming study of Messenger et al. (2012), in which both adults and children "were primed to produce passives involving Patient/Agent thematic roles (e.g., The witch was hugged by the cat) to the same extent when the prime involved Experiencer/Theme roles (e.g., The girl was shocked by the tiger) and Theme/Experiencer roles (e.g., The girl was ignored by the tiger)". [Emphasis added].
This finding is particularly key to Branigan and Pickering's (2017) argument, since it undermines a large number of previous studies that showed apparent effects of semantics on passive production and comprehension. Pinker et al. (1987) characterized the semantics of the passive construction in terms of "affectedness" such that [B] (mapped onto the surface subject [of a passive]) is in a state or circumstance characterized by [A] (mapped onto the by-object or an understood argument) having acted upon it. Accordingly, several previous comprehension studies (Fox & Grodzinsky, 1998; Gordon & Chafetz, 1990; Hirsch & Wexler, 2006; Maratsos et al., 1985; Meints, 1999; Sudhalter & Braine, 1985) had found that children showed better performance for passives with agent-patient verbs (e.g., The girl was bitten by the tiger) than for passives with experiencer-theme verbs (e.g., The girl was ignored by the tiger) (see also Ferreira, 1994, for adults). These results have been interpreted by some as reflecting limitations in young children's representations of passive syntax (e.g., Borer & Wexler, 1987; Fox & Grodzinsky, 1998; cf. Messenger et al., 2012). However, since adults' spontaneous passives more often contain theme-experiencer verbs (e.g., The girl was shocked by the tiger; Maratsos et al., 1985), and since the subject of a passive with (for example) bitten or shocked is, almost by definition, more affected than the subject of a passive with (for example) ignored, these findings have alternatively been taken as evidence that children's representation of the passive (and possibly adults' too) is semantically constrained in something like the way proposed by Pinker et al. (1987). That is, these findings have been taken as evidence for semantics-based approaches (Maratsos et al., 1985).
Messenger et al.'s (2012) finding that theme-experiencer and experiencer-theme verbs appear to be equally effective at priming passives (e.g., The witch was hugged by the cat) challenged both conclusions by showing that (a) both adults and children have a syntactic representation for the passive and (b) this representation is seemingly impervious to semantic information. That is, these priming effects constitute evidence for pure-syntax approaches. They are difficult to reconcile with semantics-based approaches, which would seem to predict a greater priming effect for theme-experiencer (e.g., frighten) than experiencer-theme (e.g., ignore) passives, at least on the assumption that semantically more prototypical passives (i.e., theme-experiencer passives) lead to greater activation of speakers' passive representation than do semantically less prototypical passives (i.e., experiencer-theme passives).

Do Syntactic Representations of the Passive Contain Semantics After All?
Following the publication of Messenger et al. (2012), Ambridge and colleagues published a series of studies demonstrating apparent semantic effects on the passive, for both adults and children.
First, focussing on adults, Ambridge et al. (2016) showed that independent ratings of verbs' "affectedness" (designed to capture Pinker's semantic constraint on the passive construction) predicted both the rated grammatical acceptability of passives and (negatively) reaction time in a forced-choice comprehension task. Importantly, while similar effects were observed for actives too, a significant interaction demonstrated that the effect was bigger for passives. This latter finding contradicts another finding reported by Messenger et al. (2012) that, for both adults and children, forced-choice comprehension was worse for experiencer-theme than theme-experiencer verbs, but to an equal extent across passives and actives, perhaps because the former are more difficult to illustrate (cf. The girl was ignored/frightened by the tiger). The grammatical acceptability findings of Ambridge et al. (2016) were subsequently replicated in Indonesian (Aryawibawa & Ambridge, 2018), Mandarin Chinese (Liu & Ambridge, 2021), Balinese (Darmasetiyawan & Ambridge, submitted) and Hebrew (Ambridge, Arnon & Bekman, in preparation).
Second, adopting Messenger et al.'s (2012) distinction between theme-experiencer, experiencer-theme and agent-patient verbs (e.g., frighten, ignore, hit), Bidgood et al. (2020) again found that experiencer-theme verbs showed the worst performance in a forced-choice comprehension task; in this case for both children and adults. Again, although a similar effect was observed for actives, a significant interaction demonstrated that (contra the findings of a similar study in Messenger et al., 2012) the effect was bigger for passives.
Third, Bidgood et al. (2020) went on to show that, in a passive priming study, both adults and children produced fewer experiencer-theme passives (e.g., The girl was ignored by the tiger) than theme-experiencer passives (e.g., The girl was shocked by the tiger), and also fewer than agent-patient passives (e.g., The girl was hit by the tiger). This finding was later replicated (using a slightly different methodology) for children with and without autism spectrum condition. Note that these later priming studies reversed the design used by Messenger et al. (2012): Messenger et al. held constant the type of the target verb as agent-patient (e.g., hit) and investigated the effect of manipulating the prime verb: theme-experiencer (e.g., frighten) vs experiencer-theme (e.g., ignore). Ambridge and colleagues held constant the type of the prime verb as agent-patient (e.g., hit) and investigated the effect of manipulating the target verb: theme-experiencer (e.g., frighten) vs experiencer-theme (e.g., ignore).

The Present Study
To sum up, the current literature yields contradictory evidence regarding the representation of the passive construction. Consistent with semantics-based accounts, several studies using grammaticality-judgment, comprehension and production-priming methods have shown an advantage for theme-experiencer passives (e.g., The girl was shocked by the tiger) over experiencer-theme passives (e.g., The girl was ignored by the tiger). Inconsistent with such accounts, and consistent instead with pure-syntax accounts, Messenger et al. (2012) found that theme-experiencer and experiencer-theme verbs appear to be equally effective at priming agent-patient passives (e.g., The witch was hugged by the cat). Semantics-based accounts predict that theme-experiencer passives (e.g., frighten) will yield a greater priming effect than experiencer-theme passives (e.g., ignore), since the former are more consistent with the semantics of the construction.
A key to resolving this contradiction may lie with the fact that, at least numerically speaking, the adult findings of Messenger et al. (2012) are in the direction predicted by semantics-based accounts: Participants' increased production of passives following passive versus active primes is indeed greater for theme-experiencer primes (26% vs 9%; i.e., 17 percentage points) than for experiencer-theme primes (17% vs 9%; i.e., 8 percentage points). This raises the possibility that the findings of Messenger et al. (2012) are indeed consistent with the predictions of semantics-based accounts, but that the study was not sufficiently powered to detect the effect.
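The numerical pattern described above is a simple difference of differences, which can be sketched as follows. This is purely descriptive arithmetic on the reported percentages (written in Python for illustration, with hypothetical variable names); it is not the log-odds interaction estimated by the mixed-effects models.

```python
# Passive production rates (%) reported above for the adult data of
# Messenger et al. (2012), by prime verb type and prime structure.
passive_rate = {
    "theme-experiencer": {"passive_prime": 26, "active_prime": 9},
    "experiencer-theme": {"passive_prime": 17, "active_prime": 9},
}


def priming_effect(rates):
    """Percentage-point increase in passives after passive vs active primes."""
    return rates["passive_prime"] - rates["active_prime"]


effects = {verb: priming_effect(r) for verb, r in passive_rate.items()}

# Descriptive "interaction": how much larger the priming effect is for
# theme-experiencer than for experiencer-theme primes.
interaction = effects["theme-experiencer"] - effects["experiencer-theme"]

print(effects)      # {'theme-experiencer': 17, 'experiencer-theme': 8}
print(interaction)  # 9
```

On these figures the priming effect is 9 percentage points larger for theme-experiencer primes, which is the difference that the replication is powered to detect.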
The aim of the present study was therefore to conduct a pre-registered replication of the adult condition of Study 2 from Messenger et al. (2012) using an online methodology, and a sample size appropriately powered to detect the crucial interaction of prime-type by verb-type, such that participants' increased production of passives following passive versus active primes is bigger for theme-experiencer (e.g., frighten) than experiencer-theme verbs (e.g., ignore).

Participants
A sample size of N=240 was chosen on the basis of a power analysis based on Messenger et al.'s Study 2 adult data (kindly supplied by Kate Messenger). Details of the analysis can be found at https://osf.io/7fekv/ (R syntax). In brief, we first used the lme4 package (Bates, Mächler, et al., 2015) to build a mixed-effects model of the original data:
M2 = glmer(RecodeStrict ~ PrimeType*VerbType + (1 + PrimeType*VerbType | Participant) + (1 + PrimeType | Prime_Verb), data = adults, family = binomial, control = glmerControl(optimizer = "bobyqa"))
The dependent variable was (binomial) participant response ("RecodeStrict": Active = 1, Passive = 0), with independent variables of PrimeType (Active/Passive) and VerbType (Theme-Experiencer/Experiencer-Theme), and the interaction term. Treatment coding (the default in R) was used. Following the recommendation of Barr et al. (2013), we used all random intercepts and slopes that were justified given the design; this model converged in lme4, provided that the bobyqa optimizer was used.
We then used the "extend" function of the simr package (Green & MacLeod, 2016) to extend this model to 250 simulated participants, while retaining the model parameters. (Interestingly, with these 250 simulated participants, the crucial interaction is statistically significant, but only narrowly so, at p=0.028). Next, we used the "powerSim" function of this package to run 20 simulations of this model at each of ten sample sizes: 24, 48, 72…240 (for output, see https://osf.io/m8wx2/). These simulations found that a sample size of N=240 is required to yield at least 95% power for detecting a significant effect of the crucial interaction (PrimeTypeP:VerbTypeTE): Point estimate = 100%, 95% Confidence Interval = (83.16%, 100%).
The 240 adult (18+) participants were recruited from a student experiment participation pool at the University of Liverpool, and from https://www.prolific.co. As in Messenger et al. (2012), all were monolingual native speakers of British English. In accordance with our pre-registration (https://osf.io/a4tm5/), participants who completed the study but did not produce any passives were discarded and replaced (N=50). However, for consistency with Messenger et al. (2012), who did not replace such participants (N=5/24), we also ran additional non-preregistered analyses in which they were retained. The study was approved by the University of Liverpool research ethics committee, and participants gave informed consent via the Gorilla platform (see https://gorilla.sc/openmaterials/44690; look for "Information and Consent" and click "Preview").
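The confidence interval reported for the power estimate earlier in this section can be checked by hand. The sketch below is in Python rather than the R used for the analyses, and rests on two assumptions inferred from the reported figures: that all 20 simulations at N=240 returned a significant interaction, and that the interval is an exact (Clopper-Pearson) binomial interval. The function name is hypothetical.

```python
def exact_ci_all_or_none(successes, n, conf=0.95):
    """Clopper-Pearson interval for the boundary cases successes == 0 or n.

    In these two cases the usual beta-quantile bounds reduce to closed
    forms, so no statistics library is needed. Other counts are
    deliberately not handled by this sketch.
    """
    alpha = 1 - conf
    if successes == n:
        return (alpha / 2) ** (1 / n), 1.0
    if successes == 0:
        return 0.0, 1 - (alpha / 2) ** (1 / n)
    raise ValueError("closed form only covers successes == 0 or successes == n")


# Assumed outcome: 20 significant results out of 20 simulations.
lower, upper = exact_ci_all_or_none(20, 20)
print(f"{100 * lower:.2f}% to {100 * upper:.0f}%")  # 83.16% to 100%
```

With only 20 simulations per sample size, even a perfect run cannot push the lower bound above roughly 83%, which is why the interval is wide despite the 100% point estimate.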
Is Passive Priming Really Impervious to Verb Semantics? A High-Powered Replication of Messenger Et al. (2012) Collabra: Psychology

Analysis Code
The remainder of the R syntax available at https://osf.io/7fekv/ constitutes our pre-registered data analysis code (written and tested on the basis of the simulated data described above). Briefly, we obtained priors for the Intercept, the main effects of Verb Type and Prime Type and the Verb Type x Prime Type interaction from a new model of Messenger et al.'s (2012) N=24 adult data (model M2 above; see Footnote 1).
We then, for 240 simulated participants, used the Savage-Dickey method to calculate a one-sided Bayes Factor for the crucial interaction (based on the methods outlined in Bannard, Rosner, & Matthews, 2017, and at https://rpubs.com/lindeloev/bayes_factors). These steps required the use of the packages brms (Bürkner, 2017), for running the Bayesian model, and logspline (Stone et al., 1997), for calculating the Bayes Factor. The use of a Bayesian approach is important here, as it allows us to quantify the strength of evidence for and, crucially, against the interaction of theoretical interest, and thus avoids the problem of inferring a null effect from a non-significant result. The use of preregistered analysis code is an important strength of the present replication, because it removes all researcher degrees of freedom with regard to the statistical analyses. The preregistration document (https://osf.io/a4tm5/) also specifies the reference levels according to which we interpret our Bayes Factor (those in Jarosz & Wiley, 2014).
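The logic of a Savage-Dickey Bayes Factor can be illustrated with a minimal sketch, written in Python rather than the R used for the analyses. It assumes, for simplicity, that the prior and posterior for the interaction coefficient are both normal distributions with known parameters; the analysis described above instead estimates the posterior density from brms samples using logspline. The directional adjustment shown (rescaling by the posterior-to-prior mass in the predicted direction) is one common way of obtaining a one-sided Bayes Factor and is not necessarily the exact computation in the pre-registered code. All function names and numerical values are hypothetical.

```python
from statistics import NormalDist


def savage_dickey_bf10(prior, posterior, null_value=0.0):
    """Bayes Factor for H1 (effect free to vary) over H0 (effect == null_value):
    prior density at the null divided by posterior density at the null."""
    return prior.pdf(null_value) / posterior.pdf(null_value)


def one_sided_bf(prior, posterior, null_value=0.0, direction="negative"):
    """Directional Bayes Factor: the two-sided value rescaled by the ratio of
    posterior to prior probability mass lying in the predicted direction."""
    bf10 = savage_dickey_bf10(prior, posterior, null_value)
    if direction == "negative":
        post_mass = posterior.cdf(null_value)
        prior_mass = prior.cdf(null_value)
    else:
        post_mass = 1 - posterior.cdf(null_value)
        prior_mass = 1 - prior.cdf(null_value)
    return bf10 * post_mass / prior_mass


# Hypothetical distributions for illustration only; these are NOT the
# priors or posteriors from the models reported in this paper.
prior = NormalDist(mu=0.0, sigma=1.0)
posterior = NormalDist(mu=-0.5, sigma=0.5)

print(round(savage_dickey_bf10(prior, posterior), 3))
print(round(one_sided_bf(prior, posterior), 3))
```

A value above 1 indicates that the null value has become less plausible after seeing the data (evidence for H1), and a value below 1 indicates evidence for H0, which is what allows this approach to quantify support for a null effect.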
Note that the Bayesian model used for our main analysis, like the frequentist model on which it was based (model M2 above), used maximal random effects structure (following Barr et al., 2013). However, because Barr et al.'s (2013) recommendation has attracted some controversy in the literature, we additionally ran a set of exploratory (i.e., non-preregistered) analyses with different random-effects structures.

Design and Materials
The study was run online using the Gorilla platform. Readers can complete the study procedure at the following link: https://gorilla.sc/openmaterials/44690 (look for "Syntax Priming" and click "Preview").
Our goal was to replicate Messenger et al.'s (2012) Study 2 as precisely as possible, with the only major difference being the online nature of the study. That is, we used the same 2x2 (Active/Passive Prime Sentence x Theme-Experiencer/Experiencer-Theme Prime Verb) design, the same number of trials per participant (24, plus 8 "snap" filler trials), and the same prime-target verb pairings, constructed according to the same four counterbalance lists. We used the same six experiencer-theme prime verbs (ignore, remember, see, love, hear, like), the same six theme-experiencer prime verbs (frighten, surprise, scare, shock, annoy, upset), and the same eight agent-patient target verbs (shake, wash, push, hug, kick, chase, kiss, drop). The prime and target sentences, as well as the pictures that accompanied/elicited them (kindly supplied by Kate Messenger), were also identical to those used in Messenger et al. (2012), and the audio recordings used to present the prime sentences were voiced by the same experimenter (Kate Messenger). A complete set of stimuli (for one of the four counterbalance lists) is shown in Table 1 below (though note that, within each list, trials were presented in fully random order, as determined by the Gorilla platform).

Procedure
In order to replicate as closely as possible the procedure of Messenger et al. (2012), which was optimized for use with both adults and children, we adopted the same "Snap" game framing. First, participants read the following onscreen instructions: In this experiment, you will take turns with a (virtual) experimenter to describe pictures. The experimenter will describe her picture, then you should - out loud - describe yours. BUT there is one more thing to remember: Sometimes, the experimenter's picture and your picture will be identical. When this happens, DON'T describe your picture - instead say "SNAP!" as quickly as possible.
The instructions then introduced the procedure for testing the online audio recording, and a set of X practice trials: Let's have a practice… We will record your voice. But first, before we start, let's just check the sound is working. When prompted, you will need to give Gorilla permission to access your microphone. Have fun! Participants then completed the five practice trials shown in Table 2 (again, identical to those used in Messenger et al., 2012). All used agent-patient verbs and consisted of two active primes, two passive primes and one "snap" filler trial. For each practice trial, unlike the main study, the prime and target sentences used the same agent, patient or both.
No feedback was given during the practice trials (again, mirroring the original study, in which only general encouragement was given), although participants were presented with a reminder of the task: That's the end of the practice trials. Did you remember to either describe your picture as soon as it appears or - if it's the same as the experimenter's - say SNAP? Now click Next to start the study proper.
Participants then completed the 32 experimental trials in random order (see Figure 1 for an example of a standard trial and a "snap" filler trial respectively). At the start of each trial, the experimenter's picture was already present on the left-hand side of the page, and playback of the prime sentence began immediately. 1.5 seconds after the offset of the prime sentence, the participant's picture then appeared. After speaking her sentence, the participant clicked "Stop Recording" to move immediately on to the next trial.

Footnote 1: This model differed slightly from that reported in Messenger et al. (2012) which, due to a coding error, treated adult and child data with the same participant number as having been produced by the same participant.

Figure 1. Participants hear an audio recording of the Prime Sentence (accompanied by a matching picture) and are then presented with the accompanying Target Picture, which they then describe verbally (with their audio recorded), usually producing either an active (e.g., A tiger is shaking a doctor) or a passive (e.g., A doctor is being shaken by a tiger).
The following instructions remained onscreen at all times:
• The first recording you hear describes the picture on the left screen
• Describe your picture immediately after you see the second picture on the right screen
• Press Stop Recording when you are ready to continue.

Transcription and Coding
Audio responses were transcribed by the first author, and all were subsequently coded by both the first and second authors independently. Initial agreement was 95.1% (Kappa=0.87) and 96.1% (Kappa=0.89) according to the strict and lenient coding schemes set out in Messenger et al. (2012) respectively (defined below). In all but three cases, apparent disagreements reflected simple misunderstandings of the coding scheme, and were easily rectified. For the remaining three sentences, agreement was reached by discussion.
As per Messenger et al. (2012), and our preregistration document, we "base our interpretation on the analysis resulting from the strict scoring criteria". These criteria (from Messenger et al., 2012, p. 574) are reproduced below: A target description was scored as an Active if it was a complete sentence that provided an appropriate description of the transitive event in the target picture and contained a subject bearing the agent role, a verb, and a direct object bearing the patient role, and could also be expressed in the alternative form (i.e., a passive). A target description was scored as a Passive if it was a complete sentence that appropriately described the picture's event and contained a subject bearing the patient role, an auxiliary verb (get or be), a main verb, a preposition by and an object bearing the agent role, and that could also be expressed in the alternative form (i.e., an active)…. We also re-coded the data using more lenient scoring criteria …whereby short passive and short active descriptions were coded as Passive and Active descriptions respectively.
Note that these criteria do not necessarily require that the participant use the verb and/or noun phrase intended, provided that it constitutes "an appropriate description". For example, if instead of the intended A doctor is being shaken by a tiger a participant produced A surgeon is being attacked by a leopard, the sentence would still be scored as an appropriate passive. Such substitutions are allowed, since the experimental manipulation concerns the prime verb, not the target verb (and does not directly relate per se to the verbs' arguments). As in Messenger et al. (2012), only trials scored as complete appropriate Active or Passive responses were retained in the statistical analysis, with all others treated as missing data.

Confirmatory Preregistered Analysis
Figure 2 (produced using the yarrr package; Phillips, 2018) shows the mean number of passives versus actives produced following active and passive primes with experiencer-theme (e.g., see) and theme-experiencer verbs (e.g., frighten), along with 95% Bayesian Highest Density Intervals (HDIs). The pattern of these means is consistent with the prediction that participants' increased production of passives following passive versus active primes is bigger following primes with theme-experiencer verbs (
Note that because these effects are not of primary theoretical interest, we did not include investigation of them in our pre-registered syntax; these claims are based solely on whether or not the credible interval includes zero. It is also important to bear in mind that since we used treatment (/dummy/baseline) coding rather than effect (/sum/deviation) coding, the effects of Prime Type and Verb Type are simple effects rather than ANOVA-style main effects (e.g., https://mediaup.uni-potsdam.de/Play/Chapter/223). That is, the effect of Prime Type (more passives following passive than active primes) is the effect of prime type when verb type is Experiencer-Theme (the baseline). In hindsight, it would probably have been better to use effect coding, in order to yield an estimate of Prime Type as a main effect. However, this is not a serious problem given that (a) a main effect of Prime Type is clearly visible in Figure 2 and (b) the effect of primary theoretical interest is the interaction of Prime Type by Verb Type, whose interpretation is identical under treatment and effect coding.
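The point about coding schemes can be made concrete with a small sketch (in Python, for illustration). It takes four hypothetical cell means on the log-odds scale and solves for the coefficients of a 2x2 model under treatment coding (0/1) and under effect coding, here assumed to be -0.5/+0.5. The cell values and the function name are invented for the example and are not estimates from the present data.

```python
# Hypothetical cell means on the log-odds scale, keyed by (prime, verb).
# Level 0 = baseline (active prime / experiencer-theme verb), 1 = other level.
cells = {(0, 0): 2.0, (1, 0): -0.5, (0, 1): 2.25, (1, 1): -0.75}


def coefficients(cells, low, high):
    """Intercept, prime, verb and interaction terms for a saturated 2x2 model
    y = b0 + b1*P + b2*V + b3*P*V, with both factors coded as low/high."""
    span = high - low
    # Slope of prime at each verb level: equals b1 + b3 * (verb code).
    prime_slope = {v: (cells[(1, v)] - cells[(0, v)]) / span for v in (0, 1)}
    interaction = (prime_slope[1] - prime_slope[0]) / span
    prime = prime_slope[0] - interaction * low
    # Slope of verb at the baseline prime level: equals b2 + b3 * low.
    verb = (cells[(0, 1)] - cells[(0, 0)]) / span - interaction * low
    intercept = cells[(0, 0)] - (prime + verb) * low - interaction * low * low
    return intercept, prime, verb, interaction


treatment = coefficients(cells, 0.0, 1.0)
effect = coefficients(cells, -0.5, 0.5)

print("treatment:", treatment)  # prime term is the simple effect at baseline
print("effect:   ", effect)     # prime term is the average (main) effect
```

In this example the Prime Type coefficient changes from -2.5 (simple effect at the baseline verb type) to -2.75 (average across verb types), while the interaction coefficient is -0.5 under both schemes. Note that under +1/-1 sum coding the interaction estimate would be rescaled by a constant factor, although it would still test the same contrast.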
To test the crucial prediction of an interaction of Verb Type by Prime Type (recall from Figure 2 that the observed means were in the predicted direction), we calculated one-sided Bayes Factors using the Savage-Dickey method (see Appendix A for model summary and calculations). The Bayes Factor was 2.11 which, according to our pre-registered reference standard (Jarosz & Wiley, 2014), constitutes "Weak" (Raftery) or "Anecdotal" (Jeffreys) evidence for H1 over H0. That is, the observed data are roughly twice as likely under a scenario in which participants' increased production of passives following passive versus active primes is bigger for theme-experiencer than experiencer-theme prime verbs than under a scenario in which participants' increased production of passives following passive versus active primes is unrelated to prime verb type.

Are These Findings Robust to Coding and Exclusion Decisions (Exploratory Analyses)?
The findings above (like the main findings in Messenger et al., 2012) are based on the strict coding scheme. Recall, however, that we also coded responses under a more lenient coding scheme which allows short passive and active forms. Furthermore, and in contrast to Messenger et al. (2012), the findings above are based on data from 240 participants, all of whom produced at least one passive, excluding data from 50 participants who did not. In order to check whether the findings reported above are robust to these (preregistered) decisions, we ran additional exploratory Bayesian analyses using the lenient coding scheme, N=240 (Appendix B), the strict coding scheme, N=290 (Appendix C), and the lenient coding scheme, N=290 (Appendix D). Note that the (in principle) N=290 analyses in fact include only 280 participants, since 10 failed to produce at least one scorable active or passive under either the strict or lenient coding scheme, and so were automatically excluded.
The findings of these additional analyses were all but identical to those of the main analysis. This is to be expected given that (a) the vast majority of responses were full actives or passives, meaning that the inclusion of short forms under the lenient coding scheme makes little difference and (b) the additional inclusion of participants who produced no passives inevitably dilutes the overall priming effect to a small degree but, since they produced no passives, makes little difference to the relative rates of passives following experiencer-theme vs theme-experiencer passive primes. For the record, the Bayes Factor for the crucial interaction of Verb Type by Prime Type was 2.11, 2.00, 2.11 and 2.13 for the analyses in Appendices A-D respectively.

Are These Findings Robust to Different Random Effects Structures, and to the Use of a Frequentist Analysis Strategy (Exploratory Analyses)?
All of the findings reported so far (both confirmatory and exploratory) are based on models with maximal random effects structure (Barr et al., 2013). However, a number of recent studies (Bates, 2019; Bates, Kliegl, et al., 2015; Matuschek et al., 2017) have argued that maximal models are too conservative (which decreases power) and instead advocate model selection by some goodness-of-fit criterion (e.g., AIC, BIC, likelihood ratio test). Other studies have cautioned against removing terms from the random effects structure simply because they cause convergence failure (Eager & Roy, 2017) or fail some goodness-of-fit criterion (Heisig & Schaeffer, 2019). Given the lack of agreement amongst experts, we therefore decided to adopt a mixed-effects-multiverse approach (Ambridge, 2021), and test for the crucial interaction of Verb Type by Prime Type (as well as the observed simple priming effect of Prime Type) under models with all possible random effects structures. Given the very large number of models this entails, and the fact that each takes several hours to run under a Bayesian approach, we adopted a frequentist approach, using the lme4 package (Bates, Mächler, et al., 2015). This also allows us to check whether the conclusions drawn on the basis of the main analysis, which used a Bayesian maximal models approach, hold under a frequentist approach. Using the bobyqa optimizer, 74/83 possible lme4 models achieved convergence (including all of those with the closest to maximal random effects structure).
Figure 3 (see also Appendix E) plots, for these 74 models, (a) the mean estimate and standard error and (b) p values (approximated via the z-distribution) for the crucial interaction of Verb Type by Prime Type, as well as the simple priming effect of Prime Type. The simple priming effect is comfortably significant (adopting the conventional cutoff of p<0.05) under all random effects structures. For the crucial interaction, the picture is more complicated. Rather alarmingly, an unscrupulous researcher could achieve almost any p value required, from well under 0.05 to almost 1.0, by choosing a particular random effects structure. Reassuringly, though, the models with low AIC values, indicating good model fit, give much more uniform, nonsignificant estimates. Fortunately, the maximal model structure adopted for the main Bayesian analysis (AIC=4212; (1 + PrimeType * VerbType | Participant) + (1 + PrimeType | Prime_Verb); shown 22nd from the left) was a fairly typical one, although (at least on the basis of AIC) it was somewhat overparameterized: The most parsimonious model (AIC=4206) includes by-participant random slopes for Prime Type and Verb Type (but not the interaction) and a by-prime-verb random slope for Prime Type, but no random intercepts at all. Importantly, all of the models with low AIC values yielded estimates of the interaction close to that obtained from the main Bayesian analysis (M= -0.47, SE=0.45), whose conclusions can therefore be taken as robust.

Are These Findings Robust to the Use of a Continuous Measure of Verb Semantics (Exploratory Analyses)?
All of the findings reported so far (both confirmatory and exploratory) are based on statistical models that treat verb semantics as a categorical predictor (experiencer-theme / theme-experiencer). However, several other studies of this construction (Ambridge et al., 2016; Aryawibawa & Ambridge, 2018; Liu & Ambridge, 2021; Darmasetiyawan & Ambridge, submitted; Ambridge, Arnon & Bekman, in preparation) have instead used a continuous measure of passive-relevant verb semantics: "affectedness" ratings obtained from adult speakers. In order to investigate whether the findings above are robust to the use of a continuous measure of verb semantics, we reran the main analysis above replacing the dichotomous predictor of Verb Type with scaled and centred continuous affectedness ratings taken from Ambridge et al. (2016). Because we have no basis for setting priors for this analysis, we used a wide, flat prior (M=0, SD=10) and did not calculate Bayes Factors.
The findings of this analysis are shown in Figure 4 and Appendix F. Although the magnitude of the simple effect of Prime Type was virtually unchanged (M= -2.62 [-3.33, -1.92]), the crucial interaction of Prime Type by Verb Semantic Rating (cf. Verb Type) was reduced (M= -0.15 [-0.82, 0.53]). Thus, as shown in Figure 4, although the proportion of passives (blue line) versus actives (red line) is, as predicted, greater following verbs in which the passive subject is highly affected (SUBJECT is being annoyed/scared/shocked/surprised… vs heard/seen/liked/remembered…), the 95% credible interval straddles zero, indicating no strong evidence for an effect. This confirms the finding from the main analysis that the effect of verb semantics, while probably not quite zero, is negligible.

Do These Data Show Any Evidence of Prime-Surprisal Effects (Exploratory Analyses)?
Several syntactic priming studies (e.g., Bernolet & Hartsuiker, 2010; Jaeger & Snider, 2013; Peter et al., 2015) have observed prime surprisal or inverse frequency effects, such that the priming effect is increased when the verb+Prime Type combination that serves as the prime sentence is of low frequency (i.e., "surprising"). For example, the verb tell is considerably more frequent in the DO dative (The writer told the publisher a story) than the PO dative (The writer told a story to the publisher). Conversely, the verb pass is considerably more frequent in the PO dative (The writer passed a story to the publisher) than the DO dative (The writer passed the publisher a story). Thus, holding construction constant (here, as DO dative), The writer passed the publisher a story is considerably more surprising than The writer told the publisher a story, and thus leads to greater priming; i.e., greater production of DO versus PO datives.
In order to investigate whether the present data show any evidence of prime-surprisal effects, we repeated the analysis from the previous section, replacing the by-verb continuous semantics measure with, for separate analyses, two different by-verb surprisal measures. Both of these measures were calculated from the by-verb active and passive corpus counts reported in Ambridge et al. (2016).
• Proportion of passives versus actives. Jaeger and Snider's (2013) corpus measure of surprisal was based on the conditional probability of the prime structure (in our case, passive) given the verb. However, because, for the present dataset, active and passive uses sum to 100%, conditional probability is equivalent to the simple proportion of passive versus active uses of each verb. We therefore used this simpler measure (scaled and centred).
• Chi-square measure. A disadvantage of the proportion measure above is that it is insensitive to the raw frequency of passive versus active uses of each verb. We therefore calculated for each verb a chi-square statistic which reflects the extent to which, compared to other verbs in the corpus (N=475), it is biased towards (multiply by 1) or against (multiply by -1) passives. Again, this measure was scaled and centred.
The findings of this analysis are shown in Figure 5 and Appendix G (proportional measure) and Figure 6 and Appendix H (chi-square measure). In both plots, the regression lines for active and passive sentences are almost flat and almost parallel, suggesting no evidence of a prime-surprisal effect (i.e., no evidence of an interaction of Prime Type by either the Proportional or Chi-Square surprisal measure); a pattern confirmed by the statistical models. Indeed, if anything, the plots suggest a reverse-prime-surprisal effect: a larger passive priming effect for verbs that are more frequent in the passive (e.g., annoy, scare, shock, surprise vs hear, see, like, remember). This pattern is consistent with the (albeit tiny) effects observed in the main and continuous-semantics analyses above. Compared to experiencer-theme verbs (e.g., hear, see, like, remember), theme-experiencer verbs (e.g., annoy, scare, shock, surprise) (1) score higher for continuously-rated semantic affectedness, (2) are more frequent in the passive and (3) yield (marginally) higher rates of passive priming (NOT lower rates as would be predicted under prime-surprisal).
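The two by-verb surprisal measures described in this section can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: the verb labels and counts are invented placeholders (not the Ambridge et al., 2016, corpus counts), the function names are hypothetical, and the chi-square is assumed to be a Pearson statistic on a 2×2 table contrasting each verb with all other verbs, which the text does not specify in detail.

```python
from statistics import mean, stdev

# Invented placeholder counts: verb -> (active uses, passive uses).
counts = {"verb_a": (980, 20), "verb_b": (4990, 10), "verb_c": (1985, 15)}


def passive_proportion(active: int, passive: int) -> float:
    """Proportion of a verb's uses that are passive."""
    return passive / (active + passive)


def signed_chi_square(active, passive, other_active, other_passive):
    """Pearson chi-square for a 2x2 table (this verb vs. all other verbs),
    multiplied by +1 if the verb is passive-biased and by -1 otherwise."""
    table = [[passive, active], [other_passive, other_active]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    expected_passive = row_totals[0] * col_totals[0] / total
    sign = 1 if passive >= expected_passive else -1
    return sign * chi2


def scale_and_centre(values):
    """Centre on the mean and divide by the sample standard deviation."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


all_active = sum(a for a, _ in counts.values())
all_passive = sum(p for _, p in counts.values())

proportions = {v: passive_proportion(a, p) for v, (a, p) in counts.items()}
chi_squares = {
    v: signed_chi_square(a, p, all_active - a, all_passive - p)
    for v, (a, p) in counts.items()
}
scaled_proportions = scale_and_centre(list(proportions.values()))
scaled_chi_squares = scale_and_centre(list(chi_squares.values()))
```

Unlike the proportion measure, the signed chi-square grows with the raw number of observations, so a frequent verb with a modest bias can receive a larger absolute value than a rare verb with a stronger one.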
One possible reason why a prime surprisal effect was not observed in these data is that, regardless of the identity of the verb, the passive construction is extremely surprising in and of itself, constituting, in the corpus counts used for the present analyses, around 1% of all verb uses. Consequently, all verb+passive combinations were hugely, and roughly equally, surprising: Even the least surprising (i.e., most passive-biased) verb, ignore, is 98.6% surprising in the passive (i.e., 1.4% passive uses), meaning that all other verbs can be more surprising to the tune of less than 1½ percentage points. No wonder, then, that we failed to find any evidence that one verb is more surprising in the passive than another. On the other hand, it is important to remember that any prime surprisal effect for the present dataset would run counter to the effect of verb semantics already observed (albeit very weakly). Perhaps the frequency with which a verb appears in a particular construction (here the passive) is somehow differently related to surprisal and/or priming effects than the semantic compatibility between the verb and the construction. That said, given that neither a semantic nor a prime surprisal effect was strongly evidenced in the present study, this issue must await further research.

Summary
To return to the main, preregistered analysis, while these data constitute only weak support for the experimental hypothesis, they can certainly not be taken as support for the original claim of Messenger et al. (2012, p. 568): that "the magnitude of priming was unaffected by verb type". That is, they do not offer any support for this null hypothesis, which -on the basis of the present data -is only around half as likely as the alternative hypothesis (BF=2). Then again, the finding of such weak, anecdotal evidence from such a large sample suggests that the magnitude of priming is affected, if at all, to only a very small degree.
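The relationship between a Bayes Factor of around 2 and the statement that the null hypothesis is "only around half as likely" can be made explicit with a short worked example. The sketch below assumes equal prior odds for the two hypotheses (an assumption introduced here for illustration, not one stated in the text), and the function name is hypothetical.

```python
def posterior_probabilities(bayes_factor_10: float, prior_odds: float = 1.0):
    """Convert a Bayes Factor (alternative over null) into posterior
    probabilities for the alternative and the null hypothesis.

    prior_odds is P(H1) / P(H0); 1.0 (equal prior odds) is an assumption.
    """
    posterior_odds = bayes_factor_10 * prior_odds
    p_alternative = posterior_odds / (1.0 + posterior_odds)
    return p_alternative, 1.0 - p_alternative


p_h1, p_h0 = posterior_probabilities(2.0)
print(f"P(H1 | data) = {p_h1:.2f}, P(H0 | data) = {p_h0:.2f}")
```

With BF=2 and equal prior odds, the alternative receives two thirds of the posterior probability and the null one third, so the null is half as probable as the alternative: weak evidence in either direction.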

Discussion
The aim of the present study was to conduct a particularly stringent pre-registered investigation of the claim that there exists a level of linguistic representation that "includes syntactic category information but not semantic information" (Branigan & Pickering, 2017, p. 8). As a test case, we focussed on the English passive; a construction for which previous findings have been somewhat contradictory. On the one hand, several studies using different methodologies have found an advantage for theme-experiencer passives (e.g., The girl was shocked by the tiger; and also agent-patient passives; e.g., The girl was hit by the tiger) over experiencer-theme passives (e.g., The girl was ignored by the tiger). On the other hand, Messenger et al. (2012) found no evidence that theme-experiencer and experiencer-theme passives vary in their propensity to prime production of agent-patient passives.
The aim of the present study was therefore to conduct a pre-registered replication of the adult condition of Study 2 from Messenger et al. (2012) using an online methodology, and a sample size (N=240) appropriately powered to detect the crucial interaction of prime-type by verb-type, such that participants' increased production of passives following passive versus active primes is bigger for theme-experiencer (e.g., frighten) than experiencer-theme verbs (e.g., ignore).
In fact, our preregistered Bayesian analysis found only "Weak" (Raftery) or "Anecdotal" (Jeffreys) evidence for the presence of this interaction, with a Bayes Factor of around 2 indicating that the observed data are roughly twice as likely under the presence of this interaction as under its absence. This conclusion of, at most, a small, anecdotal effect of verb semantics was robust to (a) different coding and exclusion decisions, (b) different random effects structures and a frequentist approach and (c) the use of a continuous, as opposed to dichotomous, measure of verb semantics. Neither did we find any evidence for (d) a prime-surprisal effect whose predictions are, although differently operationalized, more-or-less in the opposite direction to those of the verb semantics hypothesis.
On the other hand, these findings do not constitute support for the claim of Messenger et al. (2012, p. 568): that "the magnitude of priming was unaffected by verb type", since this null hypothesis received only half as much support as the alternative hypothesis.
It is important to bear in mind, however, that in contrast to the interaction, the main effect of prime type, which is generally considered to constitute evidence of syntactic priming, was very large: Participants produced passives at a rate of 39% following passive primes (Bayesian 95% Highest Density Interval = 38%-41%) but only 10% (HDI = 9%-11%) following active primes. Thus, in contrast to very weak evidence for an influence of semantics, we seemingly have very strong evidence for the role of pure syntax.
This conclusion, however, is called into question by the findings of a recent study by Ziegler et al. (2019), which suggests that "syntactic priming" effects may not be purely syntactic. Almost certainly the study that is most often cited as evidence of purely syntactic priming is that of Bock & Loebell (1990). In this study, passive sentences such as The construction worker was hit by the bulldozer were primed by intransitive locative (i.e., non-passive) sentences such as The 747 was landing by the airport's control tower, providing evidence for a level of syntactic representation of the (approximate) form [S [NP] [VP [AUX] [V] [PP [P] [NP]]]].
In a high-powered modified replication of Bock & Loebell (1990), Ziegler et al. (2019) found that this apparently-syntactic priming effect was driven solely by the lexical item by, which was both necessary and sufficient for priming to occur. That is, no priming of passives occurred following locatives that lacked by (e.g., The 747 was landing next to [cf. by] the airport's control tower). Conversely, priming of passives did occur following active locative sentences with by (e.g., The pilot landed the 747 by the control tower).
Note, however, that hearing the by phrase is not always necessary for priming of passives: Messenger et al. (2011) showed that children and adults produced more 'full' passives (e.g., The king was scratched by the tiger) following short passive primes (e.g., The girls are being shocked) that did not contain the by phrase than following active primes. These findings imply an underlying syntactic element of syntactic priming, but Ziegler et al.'s (2019) findings do highlight the importance of lexical factors.
Indeed, although, to our knowledge, Ziegler et al. (2019) is the first study to demonstrate that priming is influenced by closed-class lexical items (here, by), at the level of the verb, the so-called lexical-boost effect is well accepted in the literature (see, for example, the meta-analysis of Mahowald et al., 2016). This is the phenomenon that priming effects are increased if the same verb appears in the prime and target sentence (e.g., between The vase was broken by the ball and The window was broken by the hammer).
This raises the question of what type of account could incorporate all of these different types of representations. One viable candidate here is the class of usage-based models of language acquisition, which assume that learners retain, and are influenced by, individual lexical strings even when they have formed more abstract representations too (e.g., Abbot-Smith & Tomasello, 2006; Ambridge, 2020a, 2020b; Goldberg, 2006; Langacker, 1998).
In particular, Ambridge (2020b, p. 640) argues for an "abstractions made of exemplars" account under which "(a) we store all the exemplars that we hear (subject to attention, decay, interference, etc.) but (b) in the service of language use, re-represent these exemplars at multiple levels of abstraction, as simulated by computational neural-network models such as BERT, ELMo and GPT-3". Lexical effects are driven by low-level representations (at the lowest level, individual stored passive sentences), while effects of pure syntax are driven by the highest-level, most-abstract representations, which correspond, if only approximately, to traditional linguistic representations of the passive construction. Semantic effects are driven by mid-level representations that are more abstract than individual sentence exemplars but less abstract than the (approximate) passive construction representation. For example, although these representations notoriously defy intuitive explanation, one level might constitute separate, and relatively distinct, clusters of passives with experiencer-theme and theme-experiencer verbs. Indeed, there already exist computational models along these lines which exhibit both syntactic priming effects and sensitivity to lexical overlap (e.g., Johns et al., 2020; Prasad et al., 2019). An interesting direction for future research would be to investigate whether these models can also simulate the semantic effects observed in previous studies of the passive.
Finally, on a methodological note, it is important to acknowledge that while the method used in this study has a long pedigree, there is something rather unnatural about presenting passive sentences with no prior discourse context. In more naturalistic settings, the passive is used when the Noun Phrase about which the speaker wishes to make some comment or assertion is already highly topical in the current discourse (e.g., Have you heard the news about YouTube? It was bought by Google). Utterances that violate this principle are infelicitous and difficult to process (e.g., Have you heard the news about Google? YouTube was bought by it; examples from Pullum, 2014, p. 64). It may well be the case, then, that the relative unnaturalness of the present context-free passives either boosted the overall rate of passive priming (on a prime-surprisal account whereby context-free passives are more surprising) or inhibited it (if participants were reluctant to produce passives with no such topicalization function); or perhaps both, perhaps for different participants. In ongoing research (Darmasetiyawan & Ambridge, in preparation) we are investigating the effect of discourse context on the relative acceptability of passive sentences similar to those used in the present study.
In the meantime, and to sum up, the present high-powered online replication of Messenger et al.'s (2012) passive priming study found strong evidence for syntactic priming, but only weak evidence for an influence of verb semantics. Future studies, ideally incorporating a computational modeling component, should seek to explain not only this finding, but also the finding that semantic effects on the passive appear to vary quite dramatically according to the paradigm used to assess them (cf. Ambridge et al., 2016; Bidgood et al., 2020). Given the importance of the passive construction as a test case, future work along these lines holds the promise of uncovering the representations that underlie humans' remarkable ability to produce and understand novel utterances.