Syntactic priming in the classroom: using narratives to prime L2 Arabic speakers

A robust finding in psycholinguistics is that prior language experience influences subsequent language processing. This phenomenon is known as syntactic priming. Most of the empirical support for L2 syntactic priming comes from lab-based experiments. However, this evidence might not reflect how priming occurs in typical language activities in the L2 classroom. As such, we conducted a classroom-based priming study. Using a between-subject design, 52 L2 Arabic speakers read and listened to eight story-reading sessions over two weeks that either included a high proportion of the fronted temporal phrase (TP) structure (experimental group) or included no fronted TPs (controls). The effect of L2 proficiency was also investigated. Results revealed that the experimental group did not significantly increase their use of fronted TPs in the immediate posttest or the one-week delayed posttest relative to the baseline. A null effect of Arabic L2 proficiency was also observed. We discuss our findings in light of related priming theories and previous findings. This study highlights the need for more research on syntactic priming via common language tasks in the L2 classroom.

A common design in syntactic priming research is the lab-based experimental design (Contemori, 2023). In the classic syntactic priming paradigm, participants read or listen to a series of unrelated sentences (primes) either from an experimenter or from the computer (McDonough & Trofimovich, 2009). These sentences are typically introduced individually and without discourse context preceding them. Following each prime sentence, participants are asked to describe a picture representing the target event. This turn-taking between the experimenter and the participant constitutes an atypical communicative situation where there is little opportunity for purposeful linguistic interaction.
The use of such a controlled experimental design in syntactic priming research has not only stood the test of time but also offers several advantages. Researchers can carefully control the stimuli, which allows them to examine the effect of interest in more detail. For example, one well-studied effect in syntactic priming studies is the lexical boost effect, which can only be captured when the stimuli are designed so that the head verb/noun is repeated in the prime and target sentences (e.g., Branigan & McLean, 2016). Another benefit of the experimental design is that it makes it possible to examine less-frequent syntactic structures (e.g., reduced relative clauses) that otherwise would be difficult to investigate in natural data (e.g., Tooley, 2020). Additionally, a controlled lab-based design allows the robust examination of the effects of several factors (e.g., working memory) on syntactic priming (e.g., Coumel et al., 2023a).
Although experimental syntactic priming studies have informed our understanding of language processing and acquisition (for a review, see Jackson, 2018), it remains unknown whether syntactic priming effects also occur in classroom-based activities. Unlike the classic syntactic priming paradigm, a typical classroom language activity such as story reading occurs in linguistically and situationally embedded contexts. Exposing L2 learners to the prime structure through short stories in a group setting provides an opportunity to examine whether syntactic priming could occur in a common L2 classroom activity. In this study, we focused on syntactic priming in an L2 classroom context to shed more light on the ecological validity of syntactic priming. This gap has not been sufficiently addressed in the syntactic priming literature, as stated in a recent review by Contemori (2023).

Related theoretical frameworks
Several theoretical accounts have been developed over the years to explain the mechanisms underlying syntactic priming. An early account is the residual activation account, which maintains that when language speakers are exposed to a syntactic structure, both the head of that structure and the whole structure are activated in the speaker's explicit memory (Pickering & Branigan, 1999). This activation does not disappear quickly, which facilitates the subsequent reuse of that structure. Although this account could successfully explain immediate abstract priming effects and the lexical boost effect, it does not explain long-term priming effects (e.g., Hartsuiker et al., 2008).
To account for the longevity of priming effects, the implicit learning account posits an error-based learning mechanism underlying syntactic priming (Chang et al., 2006). This account assumes that speakers make predictions about upcoming utterances. When the predictions do not match the actual utterance, the speakers update their linguistic knowledge in the direction of the actual input. Repeated exposure to the target structure bolsters this linguistic adaptation, eventually leading to implicit syntactic knowledge. Notably, the account identifies one important determinant of the magnitude of priming: previous experience with a structure. A less frequent structure is expected to induce larger priming effects than more frequent structures. Support for this account comes largely from two main observations: long-term abstract structural priming (Branigan & Messenger, 2016; Grüter et al., 2021; Kaschak et al., 2011b, 2014; Kroczek & Gunter, 2017) and the inverse-frequency effect (Flett et al., 2013; Kaschak et al., 2011a; Montero-Melis & Jaeger, 2020; Muylle et al., 2021b; Reitter et al., 2011). Nevertheless, the implicit learning account does not sufficiently address the role of explicit memory in syntactic priming, although there is some evidence to suggest this role (Coumel et al., 2023a; Zhang et al., 2020).
Extending earlier models, multifactorial accounts of syntactic priming maintain that both implicit and explicit memory processes might give rise to syntactic priming (e.g., Reitter et al., 2011; Zhang et al., 2020). They assume that speakers temporarily store the surface structure and the lexical items of the prime in explicit memory while also being susceptible to an implicit error-based learning mechanism. One form of the multifactorial accounts (Zhang et al., 2020) argues that an explicit memory-based retrieval process might be implicated in lexically-independent syntactic priming (no lexical overlap) as well as lexically-dependent syntactic priming (lexical overlap). Speakers are believed to show priming effects via a cue-dependent memory retrieval process. In lexically-dependent priming, the repetition of the verb "read" in the prime (e.g., the story was read by the boy) and target (e.g., the newspaper was read by the father) might serve as a cue and trigger priming. In lexically-independent priming, the repetition of the event structure in the prime (e.g., the cat was chased by the dog) and target (e.g., the cake was eaten by the man) could constitute a memory retrieval cue and give rise to priming. Thus, multifactorial accounts generally argue that explicit memory-related processes play a larger role in syntactic priming than originally proposed in the implicit learning account.
Another relevant model for the present study is Hartsuiker and Bernolet's (2015, 2017) shared syntax model, as it directly addresses the role of L2 proficiency in syntactic priming. This model assumes that L2 syntactic representations vary with language proficiency: beginners have item-specific representations, while more advanced L2 speakers have more abstract syntactic knowledge. For instance, when a beginner L2 speaker is asked, "Do you want a hot drink or a cold one?", she may reply, "A hot drink" or "A cold one", whereas a more proficient L2 speaker could also reply "A hot one" or "A cold drink" (Coumel et al., 2023b). Thus, beginners may exhibit priming only when there is a lexical overlap, while advanced speakers may show both abstract and lexically-mediated priming effects. As for intermediate L2 speakers, the model posits that although they rely less on surface forms and explicit memory strategies (unlike beginners), they have abstract representations only for the more frequent structures (unlike advanced learners). As such, intermediate learners may show abstract priming effects for more frequent structures but not for less frequent structures. For example, Arabic L2 speakers may show priming for the Double Object dative (DO) structure in the absence of a lexical overlap (Alzahrani, 2023), whereas a lexical overlap may be required for the priming of relative clauses.
Overall, these theories suggest that syntactic priming is based mainly on the explicit memory system (Pickering & Branigan, 1999), the implicit memory system (Chang et al., 2006), or both (e.g., Zhang et al., 2020). Syntactic priming via classroom language activities might be explained by any of these theories, as this area of research is still in its early stages. Therefore, all discussed theories could provide valuable insights into how syntactic priming might arise in the classroom context.

Syntactic priming in language comprehension
Two characteristics of syntactic priming often found in the literature include the lexical boost effect and the inverse-frequency effect. Previous research has consistently reported larger priming effects when the prime (e.g., the book was read by the man) and target (e.g., the story was read by the girl) share one or more open lexical items (e.g., verbs, nouns, adjectives). Experimental evidence confirms this for L1 and L2 speakers (Branigan & McLean, 2016; Hartsuiker et al., 2008; Jackson & Ruf, 2018; Muylle et al., 2021a; Ruf, 2011; Wei et al., 2019). Another widely reported finding is the observation that speakers exhibit enhanced priming when primed to less frequent structures (e.g., the English passive) compared to more frequent structures (e.g., the English active). This finding is termed the inverse-frequency effect, and much of its strong empirical support comes from studies on L1 adult speakers (e.g., Flett et al., 2013; Kaschak et al., 2011; Reitter et al., 2011).
While L2 speakers may show the inverse-frequency effect (e.g., Montero-Melis & Jaeger, 2020; Wei et al., 2023), frequency effects are sometimes observed in the L2 syntactic priming literature (Hurtado & Montrul, 2021; Jackson & Ruf, 2017; Kaan & Chun, 2017). For example, Jackson and Ruf (2017) reported that L2 German speakers showed more priming for the fronted temporal phrases than the fronted locative phrases because the former are more frequent in German. Likewise, Kaan and Chun (2017) found that L2 English speakers showed weak immediate priming effects for the less frequent double object dative structure and continued to prefer the more frequent prepositional object structure during the priming phase. Hurtado and Montrul (2021) also found that L2 Spanish speakers produced more target constructions with recipient structures (more frequent in Spanish) compared to nonrecipient constructions (less frequent). These studies suggest that L2 speakers may show a frequency effect.
Another factor that may modulate the strength of syntactic priming is previous knowledge and use of the L2 prime structure. It has been shown that a prerequisite for successful syntactic priming is basic prior knowledge of the prime structure (McDonough & Trofimovich, 2015). This is based on the idea that syntactic priming arises from an implicit learning process (Chang et al., 2006), whereby speakers are not given explicit information about language form. Syntactic priming, thus, may not facilitate the development of
Alzahrani and Almalki Asian. J. Second. Foreign. Lang. Educ. (2024) 9:68

Method
This study used a mixed design with a between-subject variable: (a) condition (experimental, control), as well as within-subject variables: (b) task phase (baseline, immediate posttest, delayed posttest) and (c) proficiency level (continuous; measured using a C-test).
Three tests were administered to the participants: a baseline (pretest), immediate posttest, and delayed posttest. In all these tests, participants were asked to type a sentence describing the target picture without exposure to the prime. The baseline was completed before reading and listening to the narratives. The baseline assessed participants' preference to use the fronted TP versus non-fronted TP structures prior to prime exposure. The two posttests were administered after participants read and listened to the narratives. They measured whether exposure to the prime via narratives could trigger short-term learning (immediate posttest) and long-term learning (delayed posttest). All study materials and R code are available on the Open Science Framework at https://osf.io/ebtsq/?view_only=1f80c04a13e841b58f3c49009209040e.

Participants
A total of 52 L2 speakers of Arabic took part in this study. We recruited two intact L2 Arabic learning classes from the third level of an Arabic learning program at King Saud University. These two classes comprised L2 learners who were at least at a pre-intermediate proficiency level and had at least one year of experience in an Arabic-speaking country. Our rationale for this was to ensure that they had the required language skills to understand and engage with the narratives, which would have been challenging for L2 beginners and easy for highly proficient L2 Arabic learners. There were 34 L2 female learners enrolled in the experimental class, while the control class had 32 female L2 learners. However, some participants across both groups either withdrew from the classes prior to the study (n = 9), did not complete the baseline due to a technical issue (n = 3), or only completed the baseline test (n = 2). Participants who completed both the baseline and either the immediate posttest (n = 3) or the delayed posttest (n = 2) were included in the study. Data from the remaining 27 L2 learners in the experimental group and 25 learners in the control group were analyzed. As shown in Table 1, participants were L1 speakers of various languages (n = 24) due to the limited number of L2 speakers who shared the same L1 background at the recruitment site. The study was approved by the Humanities and Social Sciences Research Ethics Committee at King Saud University. All participants were informed about the study and gave their consent prior to completing the study.

Target structure
The target linguistic structure is the MSA TP alternation (fronted vs. non-fronted). The canonical word order in MSA is Verb-Subject-Object (VSO) (Alhawary, 2011; Ryding, 2005). This preference for the VSO word order suggests two crucial points. First, it indicates that the default position of TP is at the end of the sentence, as MSA is typically a verb-initial language. Second, it provides some evidence that fronted TPs are likely less frequent than non-fronted TPs. Due to the relative flexibility of word order in MSA (Ryding, 2005), the TP structure could be placed at the beginning (Example 1), middle (Example 2), or at the end of the sentence (Example 3). The TP structure has two components: a preposition (e.g., fi:/in) and a noun phrase (NP, e.g., sˤabah/morning). The NP in this structure is always in the accusative case, regardless of where it appears in the sentence (Ryding, 2005).
(1) In (the month of) Ramadan, the Muslim fasted.
(2) The Muslim in (the month of) Ramadan fasts.
(3) The Muslim fasted in Ramadan.
One of the main functions of fronted TPs in MSA is to foreground the time of an event, which is crucial in narratives. By placing the TP at the beginning of the sentence, the narrative emphasizes the "when" of the event. This directs the readers' attention to the sequence of events and could possibly aid comprehension.

Prime and target pictures
All experimental and filler stimuli were adopted from a previous study (Alzahrani, 2023).
A total of 18 experimental sentences were used across the baseline, immediate posttest, and delayed posttest (Table 2). Four prepositions were used in the target trials: في، بعد، قبل، يوم /fi:, bәʕdә, qәblә, jәwmә/ (in, after, before, day of). Two of them (fi:/in and bәʕdә/after) were used six times, one (qәblә/before) was used five times, and the other (jәwmә/day of) was used four times. The use of an unequal number of prepositions was dictated by the semantic and grammatical compatibility of prepositions with nouns. Some prepositions like "day of" can precede a smaller number of nouns compared to the prepositions "in" and "after". Each preposition was combined with a different noun to create 18 unique TPs. Only verbs that appeared in the participants' textbooks were used in the target trials to ensure familiarity with their meaning (e.g., Grüter et al., 2021). To control sentence length, all the experimental sentences included a transitive verb followed by one argument (the subject). Furthermore, 36 fillers were included. Filler sentences had transitive and intransitive verbs that were distinct from those used in the experimental sentences. All pictures were labeled with the appropriate vocabulary, and the infinitive form of the verb was printed below each picture in bold to limit the production of unrelated structures (Branigan & Gibb, 2018). In MSA, the infinitive form of a verb is the third person, singular, masculine, and past tense form. Participants were informed that they could change the form of the verb to match the gender of the subject if the subject was feminine. Finally, this study also counterbalanced the placement of the TP image in target trials to account for the effect of TP-image placement on priming (Coumel et al., 2022a; Jackson & Hopp, 2020; Jackson & Ruf, 2018). Half of the images representing TPs were placed in the upper right corner, and the other half in the upper left corner. A sample trial is included in Fig. 1.

Narratives
Two sets of six stories were adapted from existing children's books (Dar Al-Manhal, 2005): one set for the "fronted TP" condition and another for the "non-fronted TP" condition. All stories contained 200 to 400 words. Two L1 Arabic speakers with MA/PhD degrees in Arabic rated the difficulty of the stories on a five-point scale (1 = very easy, 5 = very difficult). The raters assessed the difficulty of the following aspects of each story: syntactic structure, vocabulary, figurative language, clarity of events, and overall perceived difficulty. The Cohen's kappa coefficient was 0.833, indicating substantial agreement between the two raters. Four stories had matched ratings (i.e., 1) across all categories, while the remaining two had different overall perceived difficulty levels (range = 1, 5). Thus, only four stories were included in the story-sessions.
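To make the agreement statistic concrete, the following is a minimal sketch of how an unweighted Cohen's kappa can be computed for two raters. The text does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption here, and the function name `cohen_kappa` and the ratings are hypothetical illustrations rather than the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items that received identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, derived from each rater's marginal distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings on the 1-5 difficulty scale (assumed, not study data).
rater_1 = [1, 1, 2, 2]
rater_2 = [1, 1, 2, 1]
kappa = cohen_kappa(rater_1, rater_2)
```

With ordinal ratings such as a five-point difficulty scale, a weighted kappa that gives partial credit for near-misses is a common alternative; the sketch treats every disagreement as equally severe.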
All four stories were centered around animal characters or young people and discussed general and culturally sensitive themes such as the importance of hard work and perseverance as well as the negative effects of greed. For instance, one story mentioned that jungle animals worked together to build a well, except for the lions, and when the well was finished, the other animals refused to let the thirsty lions drink from the well because they did not contribute.
The fronted TP condition consisted of stories with a high proportion of TP constructions, with around 70% of all sentences per story including a fronted TP structure (e.g., Vasilyeva et al., 2006). This percentage ensured that participants were sufficiently exposed to the prime structure while maintaining the naturalness of the stories. A total of 56 unique TP constructions were included in the four stories, and no TP was repeated across stories to expose L2 speakers to a wide range of the TP structure. Additionally, if the original story used a TP at the end of the sentence, it was modified to be placed at the beginning. This guaranteed that all TPs in the fronted-TP condition were always placed at the start of the sentence.
The "non-fronted TP" condition set was created by moving all fronted TPs to the end of their sentences. For example, the sentence "In one week, Ashoor planted the seed" was turned into "Ashoor planted the seed in one week". Thus, the two story sets were identical except for the fronted/non-fronted TP manipulation.
All stories in both conditions were audio-recorded by a native female Arabic speaker who was instructed to read them as naturally as possible in Modern Standard Arabic (MSA). The speaker was not aware of the study's goal to minimize unnatural production of the target structure.

Comprehension task
In a yes/no picture-sentence matching task, participants were asked to determine whether each prime sentence matched the accompanying picture in prime trials (e.g., Bernolet et al., 2013; Coumel et al., 2022b; Jackson & Hopp, 2020). This was done to ensure that participants were paying attention during the task. There were six mismatches in each task phase.

Proficiency test
There exist very few objective L2 Arabic proficiency measures, and most are either not open-access or lack reliability (for a review, see Raish, 2021). A recently developed C-test was shown to have a strong correlation (r = 0.63) with self-assessment of overall Arabic ability (Raish, 2017). As such, it was selected.
In a C-test, every second word is half-deleted. Test-takers must restore the missing parts of individual words across several L2 texts. The half-deleted words could be lexical items (e.g., cat) or morphosyntactic information like prepositions (e.g., for) and conjunctions (e.g., although). The C-test not only taps into lexical and morphosyntactic knowledge, but it also requires the use of collocational (e.g., strong coffee), pragmatic, logical, and situational clues in the text for successful test completion (e.g., Trace, 2020). Thus, the C-test requires L2 speakers to use micro-level (linguistic) and macro-level (contextual) knowledge. The C-test is widely reported as being objective, easy to administer and score, and a reliable and valid format for estimating overall L2 ability (e.g., Eckes & Grotjahn, 2006; Tidball & Treffers-Daller, 2008), which makes it a suitable test for the present study.
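The half-deletion principle can be illustrated with a small sketch that gaps every second word of a text. The function name `make_c_test`, the choice to start at the second word, the rule of keeping the first half rounded down, and the skipping of one-letter words are all assumptions made for illustration; published C-tests, including the one adopted here, may follow different conventions, and Arabic script would need additional handling (e.g., of diacritics and clitics).

```python
def make_c_test(text, first_gap=1):
    """Return (gapped_text, answers), half-deleting every second word.

    Assumptions for illustration only: gapping starts at word index
    `first_gap`, the first half of a word (rounded down) is kept, and
    one-letter words are left intact.
    """
    words = text.split()
    gapped, answers = [], []
    for i, word in enumerate(words):
        is_target = i >= first_gap and (i - first_gap) % 2 == 0 and len(word) > 1
        if is_target:
            keep = len(word) // 2
            # Replace each deleted character with an underscore.
            gapped.append(word[:keep] + "_" * (len(word) - keep))
            answers.append(word)
        else:
            gapped.append(word)
    return " ".join(gapped), answers


# English toy example (hypothetical, not one of the study's texts).
gapped_text, answer_key = make_c_test("the cat sat on the mat")
```

The returned answer list gives the full words that test-takers would need to restore, in order of appearance.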
The adapted C-test (Raish, 2017) was originally composed of five texts that had a range of difficulty and discrimination scores and took 30 min to complete. Because some of the texts had the same difficulty level and discriminatory power, only three texts with the best difficulty and discrimination scores were adopted (i.e., text 1, text 2, text 5). This decision also served the purpose of reducing completion time and, consequently, minimizing participants' fatigue. The three texts are all authentic texts drawn from different genres (autobiographical, biographical, and news). There were 25 gaps per text. The lowest possible score was 0, and the highest possible score was 75. Participants had only 20 min to complete the test to capture an accurate assessment of L2 knowledge (e.g., Harsch & Hartig, 2016; Raish, 2017; Trace, 2020).

Background questionnaire
The LEAP questionnaire (Marian et al., 2007) was used to collect basic linguistic information about the L2 speakers since it has been validated with other similar L2 groups.
The questionnaire asked participants to rate their reading, speaking, listening, and writing skills in Arabic on an 11-point scale, ranging from a minimum of 0 (none) to a maximum of 10 (perfect). An example of a self-rating question is, "On a scale from zero to ten, please select your level of proficiency in reading". The questionnaire also required participants to provide information about their age, L1, years of Arabic learning, and the order in which they acquired languages.

Procedure
The experiment included five in-lab sessions held on different days over three weeks. In the first session, participants used computers in the university laboratory to complete two practice trials. The practice trials asked participants to write a sentence description for non-target structures to familiarize themselves with the task. Then, they completed the baseline test on Gorilla.sc while being assisted by the second researcher. In the baseline, participants first saw a prime trial comprising a picture in the middle of the page, a sentence description printed above the picture, and a yes/no comprehension question below the picture. The comprehension question required participants to judge whether the sentence correctly described the shown picture. Participants had to click on either the "Yes" or the "No" box to proceed to the next screen. In the next screen, participants saw the target picture and were asked to type a sentence describing the picture in the box using all of the labeled lexical items. Participants had to press the "Next" button to continue to the next trial. Participants took 20-30 min to complete the baseline test.
Immediately after completing the baseline, participants were exposed to stories 1 and 2. They first listened to the audio recording of the first story while the story text was displayed on the class smartboard until the recording ended. After this, the second researcher asked the participants general comprehension questions such as "Who were the main characters in this story? What was the main idea in this story? Was it the elephant who told the lion about the hunters?" to check for participants' attention during the reading session. The same steps were followed in presenting the second story.
The second session occurred two days after the baseline test. In the second session, participants were exposed to stories 3 and 4 in the same way as in the first session. The third and fourth sessions took place a week following the baseline test. In the third session, participants read and listened to stories 1 and 2 again. In the fourth session, they read and listened to stories 3 and 4. After the end of story 4, participants took the immediate posttest, which had a format similar to that of the baseline. The immediate posttest was completed within 10-15 min. The fifth session took place a week after the immediate posttest.
In the fifth session, participants completed the delayed posttest, the C-test, and the LEAP background questionnaire. The steps for completing the delayed posttest were the same as the ones in the baseline. For the C-test, participants were provided instructions and two practice trials. They were also informed that the C-test would last only 20 min and that they would see their final scores at the end. Finally, participants filled out the LEAP questionnaire. The last session took around 50-60 min.

Target sentences
Responses in target trials were coded as fronted TP, non-fronted TP, or Other. Fronted TP sentences started with a preposition followed by a noun, then a verb, and finally a subject. Sentences that started with a TP and had additional adjectives or a subject preceding the verb were also coded as fronted TP. Meanwhile, sentences that started with a fronted locative phrase, such as "min Madinah ila uxra/from one city to another", were not coded as fronted TP responses. Non-fronted TP sentences included an initial verb, followed by a subject, and ended with a preposition and a noun. 'Other' sentences never included a TP, and sometimes they also had a missing subject or verb. Spelling and grammatical (i.e., agreement, tense) errors were ignored in the analysis (Coumel et al., 2022b; Grüter et al., 2021; Kaan & Chun, 2017).

Proficiency test
A dichotomous scoring method (i.e., correct/incorrect) was used for the C-test (Eckes, 2011; Raish, 2017). Responses that had spelling, agreement, or tense errors were not considered correct. Scores were calculated by automatically comparing participants' responses to a list of all the correct answers and their acceptable written forms. The test demonstrated excellent reliability, with Cronbach's α = 0.92 (95% CI = [0.90, 0.95]) and ω total = 0.94.
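As an illustration of dichotomous scoring against a list of acceptable written forms, the sketch below awards one point per gap when the response exactly matches an accepted form. The function name `score_c_test`, the data layout (one set of accepted forms per gap), and the whitespace trimming are assumptions; the authors' actual scoring script is the R code shared on the Open Science Framework and may differ.

```python
def score_c_test(responses, answer_key):
    """Count gaps whose response exactly matches an accepted written form.

    `answer_key` holds one set of acceptable forms per gap (hypothetical
    layout). Anything else, including misspellings, scores zero.
    """
    if len(responses) != len(answer_key):
        raise ValueError("Each gap needs exactly one response")
    return sum(
        1
        for response, accepted in zip(responses, answer_key)
        if response.strip() in accepted
    )


# Toy English key with one gap that has two accepted spellings (assumed).
key = [{"cat"}, {"on"}, {"colour", "color"}]
total = score_c_test(["cat ", "in", "color"], key)
```

Under this scheme a participant's total ranges from 0 to the number of gaps, which for three texts of 25 gaps each gives the 0 to 75 range described earlier.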

Statistical modeling
A mixed-effect logistic regression model was built using the lme4 package (Bates et al., 2015) in R version 4.2.2 (R Core Team, 2022) since the dependent variable is binary. The dependent variable was the structure produced, with non-fronted TP sum-coded as −0.5 and fronted TP as 0.5.
The between-subject fixed effect of group was sum-coded (control group = −0.5, experimental group = 0.5). The within-subject fixed effect of task phase was treatment-coded with the baseline as the reference level, the continuous proficiency test scores were standardized and centered around 0, and TP placement was sum-coded (TP images placed on the left = −0.5, on the right = 0.5). The model also included the maximal random-effects structure that was justified by the design (Barr et al., 2013) and that converged: by-item and by-participant intercepts.
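The predictor coding described above can be sketched as follows. This is only an illustration in Python of the coding scheme (the analysis itself was run in R with lme4); the function and column names are hypothetical, and the use of the sample standard deviation for standardization is an assumption.

```python
from statistics import mean, stdev

GROUP_CODES = {"control": -0.5, "experimental": 0.5}      # sum coding
TP_SIDE_CODES = {"left": -0.5, "right": 0.5}              # sum coding
PHASES = ("baseline", "immediate", "delayed")             # baseline = reference


def code_trial(group, phase, tp_side):
    """Map one trial's categorical predictors to numeric codes."""
    if phase not in PHASES:
        raise ValueError(f"Unknown task phase: {phase}")
    return {
        "group": GROUP_CODES[group],
        # Treatment coding: two dummies, both 0 at the baseline level.
        "phase_immediate": 1 if phase == "immediate" else 0,
        "phase_delayed": 1 if phase == "delayed" else 0,
        "tp_side": TP_SIDE_CODES[tp_side],
    }


def standardize(scores):
    """Center scores at 0 and scale by the sample standard deviation."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


# Hypothetical values, not study data.
example_row = code_trial("experimental", "delayed", "left")
z_scores = standardize([10, 20, 30])
```

With sum-coded predictors at ±0.5, a coefficient can be read as the difference between the two levels, while treatment coding of task phase makes each phase coefficient a contrast against the baseline.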

Background information
Participants' background and linguistic information is presented in Table 3. The non-parametric Mann-Whitney U test was used to compare the two groups when the data were ordinal or non-normally distributed. A Bayesian t-test was conducted for the non-normal continuous proficiency scores using the R package BEST (Kruschke, 2013). Participants across groups did not differ in any of the examined variables in Table 3. The Bayesian t-test's estimate suggests that there is no significant difference in means between the two groups (d = 0.177), as the 95% CI for the mean difference includes 0. Figure 2 exhibits the distribution of the C-test scores.

The comprehension task
All participants correctly answered more than 70% of the comprehension check questions on the picture-sentence matching task (Table 4), indicating an acceptable level of attention during each task phase.

The priming task
As shown in Table 5, the control participants produced only one fronted TP response per task phase, suggesting a limited increase in the production of the target structure from the baseline. In contrast, the experimental group slightly increased their production of fronted TP sentences in both the immediate (n = 4) and the delayed (n = 3) posttest relative to the baseline (n = 1).
The logistic regression model (Table 6) suggests that there was no difference in the production of fronted TP between the experimental and control groups in the immediate (estimate = 0.66, SE = 0.89, z = 0.74, p = 0.457) and delayed (estimate = 0.23, SE = 0.88, z = 0.26, p = 0.794) posttests compared to the baseline. This indicates that there was no fronted TP priming following extensive exposure to fronted TPs in narrative contexts (Fig. 3).
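The relation between the reported estimates, standard errors, z values, and p values can be checked with a short sketch of the Wald test, where z is the estimate divided by its standard error and the two-sided p value comes from the standard normal distribution. The function name `wald_test` is hypothetical; because the published estimates and standard errors are rounded, the recomputed values should only approximately match those in the text.

```python
import math


def wald_test(estimate, se):
    """Return the Wald z statistic and its two-sided normal p value."""
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded values as reported for the group-by-phase interaction terms.
z_immediate, p_immediate = wald_test(0.66, 0.89)
z_delayed, p_delayed = wald_test(0.23, 0.88)
```

Exponentiating an estimate (e.g., `math.exp(0.66)`) would express it as an odds ratio, which some readers find easier to interpret than log-odds.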

Discussion
This study examined whether the Arabic fronted TP structure could be primed via repeated exposure in classroom-based story-reading sessions. The priming effect was assessed in two posttests: an immediate posttest and a one-week delayed posttest. Although L2 speakers in the experimental group slightly increased their production of the fronted TP construction in the immediate and delayed posttests relative to the baseline, there was no significant priming effect. Further, L2 speakers' scores on the C-test did not mediate their fronted TP priming. These findings will be explained next in light of related priming research and theories.

Story priming
The current results contrast with prior classroom-based research, which found significant priming effects in both posttests (Garraffa et al., 2021;Hesketh et al., 2016;Serratrice et al., 2015;Vasilyeva et al., 2006).This difference in participants and study design between previous research and our current study could explain these results.Previous classroom-based studies were conducted on monolingual English-speaking children who might show different priming patterns than adult L2 speakers.There is some evidence suggesting that L1 children might exhibit unique priming effects compared to L1 adults (e.g., Peter et al., 2015).Given this, it is possible to posit that L1 children and L2 adult speakers may vary in their susceptibility to priming as well.The implicit learning account (Chang et al., 2006) predicts that speakers at different linguistic development stages show different priming effects, with larger priming effects for less experienced speakers.However, this prediction was based on L1 speakers and might not accurately capture variation in syntactic priming across L2 development.
Another explanation for this inconsistency in results is that some of the previous studies (Hesketh et al., 2016; Serratrice et al., 2015) required children to produce the prime structure during the reading sessions, which could facilitate future productions of the prime, while the current study did not do so. The use of both comprehension-to-production and production-to-production priming methods might have enhanced the priming effects obtained in prior research. Nevertheless, the other two classroom priming studies (Garraffa et al., 2021; Vasilyeva et al., 2006) only implemented a comprehension-to-production method and found significant priming effects. In addition to differences in participants and priming methods, this study also examined a syntactic structure and language different from those of previous studies. As there are many possible reasons for the difference in results, it is difficult to compare this study directly to previous research.
The L2 syntactic priming literature could offer some explanations for the present findings. First, frequency effects may account for the current results. Previous research suggests that L2 speakers might show a frequency effect rather than an inverse-frequency effect (e.g., Hurtado & Montrul, 2021; Jackson & Ruf, 2017; Kaan & Chun, 2017). For instance, Jackson and Ruf (2017) reported that L2 German speakers were primed for the more frequent construction fronted TP, but not for the less frequent structure non-fronted TP. In this study, the fronted TP is likely to be less frequent than the non-fronted TP in MSA, as the canonical word order in MSA is VSO (Ryding, 2005). This preference for. The recruited L2 Arabic speakers strongly preferred the non-fronted TP structure at baseline and continued to prefer it through the delayed posttest. This high preference for the non-fronted TP construction might have inhibited the priming of fronted TP in the present study.
Fig. 3 The proportion of Fronted Temporal Phrase (TP) production by task phase and condition
Second, the degree to which the L2 Arabic participants demonstrated prior knowledge and use of the prime could have contributed to the reported results. Empirical evidence indicates that L2 speakers must not only have knowledge of the prime (McDonough & Trofimovich, 2015) but also show actual use of it (Hurtado & Montrul, 2021; McDonough, 2006). All the recruited participants were introduced to the TP alternation in the first year of their current Arabic program, suggesting that they had some knowledge of fronted TPs. However, there were very few fronted TP productions in the baseline by the experimental group (n = 2) and the controls (n = 1). This suggests that the L2 Arabic participants had some knowledge of the fronted TP structure but preferred not to use it. This limited production of the prime in the baseline could have reduced participants' susceptibility to priming.
Third, the lack of lexical overlap between the prime and target in the posttests could constitute another possible reason for the current limited priming effects. Two priming accounts explicitly predict a positive role of lexical repetition in the prime and target: the residual activation account (Pickering & Branigan, 1998) and the shared syntax model (Hartsuiker & Bernolet, 2017). The activation account posits that repeating some of the lexical items in the prime sentence and target sentence could increase the magnitude of priming. Extending this idea, the shared syntax model postulates that the lexical overlap effect may be stronger for L2 speakers at beginner and intermediate L2 proficiency levels relative to more proficient L2 speakers. Most of the recruited participants were at either a pre-intermediate or intermediate stage in their L2. The presence of lexical overlap might therefore have benefited participants, as repeated items can trigger memories of prior encounters, and could have resulted in a significant priming effect.
Fourth, the decay of explicit memory might offer an alternative explanation. In multifactorial accounts of structural priming (Zhang et al., 2020), it is argued that even lexically-independent syntactic priming might rely on explicit memory. This account postulates that explicit memory can be used to retrieve and reuse recently encountered syntactic structures based merely on similarity in the event structure (e.g., an initial preposition followed by a noun encoding time) even if those structures do not contain any repeated lexical items. In the present study, participants read and listened to the fronted TP structure embedded within semi-natural stories. This presentation method places the prime in a more linguistically rich context, which may shift focus away from sentence structure. As a result, participants may have been less likely to sufficiently encode the fronted TP structure into explicit memory, leading to faster decay of the priming effect over time. This explanation seems likely, considering the results of a related study. Using the classic syntactic priming paradigm, an L2 Arabic priming study found immediate abstract priming for fronted TP in the priming phase for the control group, although this priming effect was not sustained in the posttests (Alzahrani, 2023). This suggests that the method of presenting the prime may play a role in inhibiting (the present study) or triggering (Alzahrani, 2023) syntactic priming effects. Overall, current evidence hints that using a context-rich priming design might not be effective when priming less frequent structures in the L2.
While the four discussed reasons might have played some role in inhibiting syntactic priming effects in the immediate and delayed posttests, the current results remain difficult to explain. More research is needed to better understand when and how syntactic priming could promote syntactic learning in the L2 classroom.

L2 proficiency
This study did not find an effect of L2 proficiency, as measured by a C-test, on syntactic priming via stories. The C-test scores showed wide variability in the experimental group (M = 12.58, SD = 10.06, range = 0-33), indicating that the L2 participants came from different proficiency levels. All participants, regardless of their C-test scores, preferred to produce a non-fronted TP rather than a fronted TP across all experimental phases (Table 5). A possible explanation for the null L2 proficiency effect is the observed limited production of the prime structure (i.e., a floor effect). The absence of a significant priming effect made it impossible to capture an L2 proficiency effect in the current study.

Limitations
This study has a few limitations that should be addressed in future research to examine the generalizability of the current findings. These limitations were imposed by practical constraints (e.g., the short academic term, limited class availability, scheduling conflicts) and scarce L2 empirical evidence (e.g., the number of required story primes).
First, the number of story-reading sessions (n = 8) may have been insufficient to trigger priming. Previous research has included ten sessions (Hesketh et al., 2016; Serratrice et al., 2015; Vasilyeva et al., 2006), and it is possible that a longer priming period is needed to produce observable effects for less frequent structures. Second, the priming effect for fronted TP structures may have been simply too weak to be detected in the present study. Therefore, subsequent research should consider increasing the sample size or target trials to increase the likelihood of detecting the priming effect. Further, researchers may modify the stories to emphasize sentence structure, for example, by presenting one sentence per line. Additionally, prompting participants to rewrite and reread parts of the story independently could enhance the magnitude of priming.
Third, this study did not examine the potential role of crosslinguistic influence (CLI) in L2 syntactic priming. CLI here refers to the observation that shared L1-L2 syntactic features are more likely to boost priming, while non-shared L1-L2 syntactic features might hinder priming (Serratrice, 2022). Based on this view, a reviewer suggested that L1-L2 differences in TP position might mitigate TP priming among lower-level L2 Arabic learners. On the other hand, L1-L2 similarity in TP position might facilitate TP priming among higher-level L2 Arabic learners. Nevertheless, it was not possible to investigate the effect of CLI in the current study, because too few participants in the target Arabic learning program shared the same L1 to control for it. Future studies may explore the potential effects of CLI in L2 Arabic speakers.
Fourth, the current study did not include a balanced number of L2 participants based on their L2 proficiency level due to the employed design (i.e., a classroom-based intervention). It remains desirable to examine the effect of L2 proficiency on classroom-based syntactic priming using a more balanced L2 sample.

Conclusion
The current study conducted an initial examination of L2 syntactic priming in a more ecologically valid context, the L2 classroom. L2 Arabic participants were primed to the fronted TP structure using a typical language activity in the classroom: group reading sessions over two weeks. Results indicated limited priming effects in the immediate and delayed posttests and no mediating effect for L2 proficiency. Several potential reasons could have contributed to these results. The present study provides a valuable starting point for future research on the potential of syntactic priming in the L2 classroom.

Fig. 2 Distribution of participants' scores in the proficiency test. The blue dotted line indicates the mean score. One participant from the experimental group and three from the control group did not complete the C-test

Table 3
L2 learners' linguistic background, showing means, standard deviations, and ranges by group. One participant from the experimental group and three from the control group did not complete the background questionnaire

Table 4
Frequency and percentage of correct responses in the comprehension task by condition and experiment phase. Imm. posttest = Immediate posttest, Del. posttest = Delayed posttest

Table 5
Frequency of target responses in the priming task by condition and experiment phase. Imm. posttest = Immediate posttest, Del. posttest = Delayed posttest