Cross-linguistic differences in case marking shape neural power dynamics and gaze behavior during sentence planning

Languages differ in how they mark the dependencies between verbs and arguments, e.g., by case. An eye tracking and EEG picture description study examined the influence of case marking on the time course of sentence planning in Basque and Swiss German. While German assigns an unmarked (nominative) case to subjects, Basque specifically marks agent arguments through ergative case. Fixations to agents and event-related synchronization (ERS) in the theta and alpha frequency bands, as well as desynchronization (ERD) in the alpha and beta bands revealed multiple effects of case marking on the time course of early sentence planning. Speakers decided on case marking under planning early when preparing sentences with ergative-marked agents in Basque, whereas sentences with unmarked agents allowed delaying structural commitment across languages. These findings support hierarchically incremental accounts of sentence planning and highlight how cross-linguistic differences shape the neural dynamics underpinning language use.


Introduction
Planning sentences requires speakers to transform thoughts into sequences of words that follow the grammatical rules of their language. Central to all grammars are syntactic dependencies, such as the dependencies between verbs and their arguments. Many languages use overt marking ("case") to indicate the nature of the dependencies, e.g., to signal whether an argument is an agent or a patient. An example is Basque, which has a special case marker, the "ergative" case, for agent arguments (Laka, 2006b). If the argument structure of a verb specifies an argument as an agent (e.g., in verbs like "dance"), its corresponding noun phrase receives ergative marking; if not (e.g., in verbs like "fall"), the noun phrase appears in the unmarked nominative case. This is different from other languages, such as Swiss German, where agents and patients are not normally differentiated by case markers, i.e., they receive the same marking.
This contrast, illustrated in Fig. 1, poses a challenge for sentence planning. In Basque, the form of the canonically sentence-initial noun phrase depends on the verb at the end of the sentence. This is not the case in Swiss German, where the form of the first noun phrase is compatible with any verb. Contrasts of this kind can be modelled in different ways in theories of sentence planning. Overall, sentence planning proceeds incrementally, and the planning scope may range from little advance preparation and structure building (linear incrementality, Gleitman, January, Nappa, & Trueswell, 2007;Myachykov, Thompson, Garrod, & Scheepers, 2012) to the preparation of structures spanning full sentences before speaking (structural incrementality, Griffin & Bock, 2000). While linear incrementality does not necessarily assume advance planning of dependencies (Schriefers, Teruel, & Meinshausen, 1998), under structural incrementality at least some dependencies are assumed to be planned well in advance (Bock & Levelt, 1994;Levelt & Meyer, 2000;Norcliffe, Konopka, Brown, & Levinson, 2015;Sauppe, Norcliffe, Konopka, Van Valin, & Levinson, 2013).
In particular, dependencies that occur before the verb appear to require some degree of advance planning on a structural level and are thus incompatible with a radical version of linear incrementality that builds up structures strictly word by word (Gleitman et al., 2007;Hwang & Kaiser, 2014b). While sentence planning time courses have been shown to adapt to various aspects of individual languages' grammars (Hwang & Kaiser, 2014b;Sauppe et al., 2013), it is an open question to what extent the signaling of argument-verb dependencies with ergative case, as in Basque, requires speakers to commit early to at least some structural properties of sentence plans. Basque speakers might be required to decide early on whether the event described in a sentence is best referred to with a verb that assigns an ergative case to the initial noun phrase or with a verb that does not (Laka, 2006b). German speakers, by contrast, could delay these planning decisions to exploit the structural flexibility (Ferreira, 1996) provided by the grammar (as initial noun phrases in canonical word order sentences carry unmarked nominative case by default and also most frequently; see Haider, 2010 for exceptional case marking used with, e.g., experiencers). They could, in particular, initially activate multiple sentence plans that all start with an unmarked noun phrase and only later decide on the specific plan (cf. Myachykov, Scheepers, Garrod, Thompson, & Fedorova, 2013;Stallings, MacDonald, & O'Seaghdha, 1998). Momma (2021) demonstrates that the elements of a dependency are planned together. For Basque, but not necessarily for Swiss German, this would mean that the verb and its argument structure are prepared together with the argument(s) because the argument structure determines the overall sentence type.
In previous work on sentence planning in Hindi, Sauppe et al. (2021) showed that sentences with ergative and sentences with nominative case marking exhibit different planning profiles. Transitive sentences with ergative agents were associated with more intensive relational-structural encoding early in the planning process. This finding indicates that Hindi speakers quickly attended to encode the transitivity information relevant for the dependency between the ergative case marker and the verb. Nominative sentences, on the other hand, did not require speakers to commit to a structure early, giving them more flexibility in planning (Ferreira, 2008;Ferreira, 1996;Kempen & Hoenkamp, 1987) and the possibility to entertain multiple alternative continuations (Myachykov et al., 2013). Planning (transitive and intransitive) nominative sentences in Hindi was also associated with greater working memory engagement.
In a structural priming experiment on Basque, Santesteban, Pickering, Laka, and Branigan (2015) found a tendency to repeat the constituent structure but not case marking and that there was a lexical boost effect with verb repetition between prime and target sentences. These results suggest that case marking is planned after constituent structure selection and, at the same time, that verb selection precedes constituent structure planning. However, whether an argument is assigned ergative case is associated with the verb in Basque and case marking information should become available through the argument structure of the verb selected for a sentence under planning (cf. Pickering & Branigan, 1998). The two contrasting findings of Santesteban et al.'s study, however, do not allow us to untangle the role of case marking in early sentence planning. Here, we compare the time course of sentence planning in Basque and German in a picture description experiment (cf. Griffin & Bock, 2000), aiming at complementing and extending Santesteban et al. (2015)'s priming evidence and Sauppe et al. (2021)'s eye tracking and EEG evidence. We characterize the temporal dynamics of early sentence formulation by analyzing eye movements and time-frequency representations of neural processing while speakers described pictures with sentences with either ergative-marked or nominative-marked agents (cf. Fig. 2). With this paradigm, we test how the differences between overt and covert signaling of agent-verb dependencies through case marking shape sentence planning processes across languages.
Eye tracking and time-frequency analyses of neural processing are valuable tools for this purpose. Visual attention and eye movements provide insights into the timing of different planning stages because they are tightly linked to structural and lexical processes (Griffin & Bock, 2000;Griffin, 2004). Patterns of gazes to characters in event pictures can thus shed light on early relational and structural information encoding processes (e.g., Hwang & Kaiser, 2014b;Konopka, 2019;Sauppe et al., 2021;Sauppe et al., 2013;Sauppe, 2017). Structurally incremental accounts of sentence planning propose that speakers start out with encoding the event relations, the "who does what to whom" of a depicted event (Griffin & Bock, 2000), and with generating a structural plan of their utterance. To encode the event relations, speakers need to identify the agent, the patient, and the action that is being carried out; this is achieved by visually inspecting the depicted characters and their configuration. The action, i.e., the relation between the agent and the patient, arises from cues that are distributed between these two characters, such as their spatial configuration and postures (e.g., Dobel, Gumnior, Bölte, & Zwitserlood, 2007;Glanemann, Zwitserlood, Bölte, & Dobel, 2016;Hafri, Trueswell, & Strickland, 2018). Relational encoding is thus reflected in early distributed fixations (e.g., Hwang & Kaiser, 2014b;Konopka, 2019;Sauppe, 2017;Sauppe et al., 2021). Furthermore, cross-linguistic evidence suggests that event-relational and structural encoding processes are tightly intertwined. In verb-initial languages, e. g., the grammatical structure under planning already influences the earliest fixation patterns (before 400-600 ms; Sauppe et al., 2013).
Fig. 1. Case marking varies by verb type in Basque, so that agents are assigned ergative case, while patients are assigned nominative case. Swiss German assigns the same form to agents and patients of transitives and to the sole argument of intransitives, glossed here as "nominative" because it is the same form as used for naming; it contrasts only with a dative case that is used with a handful of transitive verbs and for recipients in ditransitive verbs (abbreviations: AUX = auxiliary, COP = copula, ERG = ergative, NOM = nominative).
If the case marking differences between Basque and German shape the time course of relational-structural planning, especially when speakers need to decide on which kind of verb to use and when they need to commit to a sentence plan at the outset, we expect this to modulate the distribution of early fixations. Specifically, we expect Basque speakers to distribute their gaze more between depicted event participants or towards other aspects of the picture to gather information about event relations when they encode the "who-does-what-to-whom" (Sauppe, 2017). This would aid the decision on which type of verb to plan and consequently which case marking to prepare for the first noun phrase (ergative or nominative, Sauppe et al., 2021).
Neural oscillations play an important role in cognitive functions (e. g., Friederici & Singer, 2015;Siegel, Donner, & Engel, 2012) and prominently feature in sentence comprehension research (Hauk, Giraud, & Clarke, 2017;Prystauka & Lewis, 2019). By exploring the role of different frequency bands during sentence planning, we extend work on event-related synchronization/desynchronization (ERS/ERD, Pfurtscheller & Lopes da Silva, 1999) of oscillatory power in language production. Oscillatory power analyses are able to capture a wide range of neural processes because they represent both evoked (phase-locked) and induced (non-phase-locked) activity (David, Kilner, & Friston, 2006;Schneider & Maguire, 2018), so that both sentence planning processes that set in instantaneously with stimulus picture presentation and planning processes that set in later (and with somewhat varying latencies) can be captured.
While the functional role of activity in the theta, alpha, and beta frequency bands has been laid out for the comprehension of sentences (largely with reference to domain-general processes, Meyer, 2018;Prystauka & Lewis, 2019), the relation between sentence planning processes and ERS/ERD responses has only begun to be studied. Sauppe et al. (2021) provide evidence for the involvement of theta and alpha band activity in sentence planning, linking them to working memory and inhibitory processes that support syntactic structure generation. Bögels, Casillas, and Levinson (2018) show that alpha-beta ERD is related to response planning in conversation. Piai and Zheng (2019) review the role of theta, alpha, and beta power changes in the planning of single words.
In view of this scarcity of research, one aim of the current study is to probe further the association of ERS/ERD and sentence planning across diverse languages. It is essential to build up a corpus of results in order to characterize the functions of different frequency bands in planning and to assess whether sentence comprehension and sentence planning processes match with respect to the way they draw on the theta, alpha, and beta frequency bands, or whether potentially different affordances (MacDonald, 2013;Meyer, Huettig, & Levelt, 2016) lead to different neural signatures and implementations.
We target the early encoding processes of sentence planning. These are assumed to encompass both relational encoding processes, during which speakers encode the relations between participants in the depicted events (the "who does what to whom", Griffin & Bock, 2000), and structural encoding processes, during which speakers create or start to create at least parts of a grammatical plan for their utterance (Bock & Ferreira, 2014). Based on cross-linguistic studies using eye tracking, we expect these processes to be tightly intertwined and that they play out in the first 800 ms of sentence planning (Sauppe et al., 2013;Sauppe et al., 2021). There were four sentence types of interest elicited during the current picture description experiment (cf. Fig. 1): unmarked intransitive sentences in German, unmarked transitive sentences in German, unmarked intransitive sentences in Basque (with patient-like sole arguments), and transitive sentences in Basque in which the agency of the first noun phrase is marked overtly by the ergative case (Basque ergative-marked intransitives were only elicited, cf. below).
Fig. 2. Details on task and trial structure. Trials started with a version of the stimulus picture in which all pixels were distributed randomly to provide a period for participants to return to rest before transitioning into the actual trial (random duration between 1750 and 2250 ms). This display was followed by the presentation of an auditory lead-in sentence fragment. Participants fixated on a small black square (in one out of five evenly spaced and randomly selected positions at the top of the screen) to ensure that gaze did not fall on event participants at stimulus picture onset (Gleitman et al., 2007). They then described the stimulus pictures by completing the lead-in fragment. Participants proceeded to the next trial by button press. Latencies are given relative to stimulus picture onset. The fixation square and the stimulus picture display are shown enlarged in the lower panel.

Participants
Right-handed, native speakers of Basque (N = 40, age: 18-28 years, 32 female) and Swiss German (N = 26, age: 18-39 years, 18 female) 1 with normal or corrected-to-normal vision took part for payment. In addition, data were excluded from one participant due to technical problems during recording and from three participants who were not native speakers of Basque. Ethical approval was obtained from the University of the Basque Country (85/2017) and the Ethics Committee of the Canton of Zurich (Req-2016-00294).

Materials and procedure
Stimuli were black-line drawings of 53 transitive (two-participant, agent-patient) and 57 intransitive (one-participant) events. Participants' task was to describe pictures with one sentence, overtly naming the agent, the patient, and the verb (cf. Fig. 2 for task and trial structure details), while their eye movements and electrophysiological activity were monitored. Before stimulus picture presentation, an auditory lead-in sentence fragment was presented (cf., e.g., Piai, Roelofs, & Maris, 2014;Piai et al., 2016;Schriefers et al., 1998), the Basque or German translation of "What has happened here/in this picture is that…" (Basque: Irudi honetan gertatu dena da…, (Swiss) German: Was daa passiert isch, isch das…). Participants were instructed to describe the stimulus pictures, starting as soon as they were ready and so that the lead-in sentence fragment was being completed. This elicited strictly verb-final clauses in both languages. The lead-in cue required a continuation in perfective aspect (conceptualizing the events as being completed) in order to avoid the use of a periphrastic progressive aspect (conceptualizing events as ongoing) in Basque which follows a different syntax, without ergative case marking (Laka, 2006a). To ensure broad recognizability of the depicted events, stimuli were selected for having at least 50% naming agreement for the verb, based on a separate norming study (with 40 Basque and 34 German speakers, using PsyToolkit; Stoet, 2010;Stoet, 2017).
Picture orientation (left/right agent) was balanced across participants by mirroring the original pictures. Trial order was randomized for each participant. Twenty practice trials were presented at the beginning. Participants first saw ten training pictures and heard prerecorded example descriptions, and then described these pictures themselves, receiving feedback from the experimenter if necessary. Experimental sessions, including application of the EEG and eye tracker calibration, took approximately 90 minutes.

Data recording and analyses
Stimuli were presented with E-Prime 2.0 (Schneider, Eschman, & Zuccolotto, 2002) on a 15.6 inch screen laptop computer placed approximately 60 cm away from the participants. Vocal responses were recorded for later transcription. Eye movements were recorded with an SMI RED250 mobile eye tracker (SensoMotoric Instruments, Teltow; 60 Hz sampling rate). Electrophysiological activity was recorded with an Enobio 32 EEG (Neuroelectrics Inc., Barcelona), using the manufacturer's NIC 2.0 software (500 Hz sampling rate) on a separate laptop computer. Twenty-six Enobio Geltrode electrodes (4 mm Ag/AgCl sintered) were placed on the scalp in a 10-20 montage (Fig. 3). To detect eye movements as well as articulator muscle movements during speech preparation (Piai et al., 2014;Porcaro, Medaglia, & Krott, 2015), two additional electrodes recorded the electrooculogram (placed on the outer right canthus and on the orbital part of the orbicularis oculi, below the right eye) and two electrodes recorded the lip electromyogram (placed on the left orbicularis oris superior and the right orbicularis oris inferior).
For reaction time analyses, the onset of the first word of each trial's response was annotated manually in Praat (Boersma, 2001). For the eye tracking analysis, areas of interest for agents (and patients) in the stimulus pictures were manually defined with SMI's BeGaze software; data were further preprocessed in R (R Core Team, 2021). For each trial, consecutive fixations within areas of interest were subsumed into gazes (Griffin & Davison, 2011) and then aggregated into 100 ms time bins to reduce temporal autocorrelation (Barr, 2008;Cho, Brown-Schmidt, & Lee, 2018).
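The aggregation of gaze data into 100 ms time bins described above can be sketched as follows. This is a minimal illustration in Python rather than the R preprocessing actually used; the function name `agent_fixation_bins`, the `(time_ms, aoi)` sample format, and the half-open bin boundaries are assumptions made for the example. The counts per bin (agent fixations out of all samples) correspond to the kind of input a binomial regression would take.

```python
from collections import defaultdict

BIN_MS = 100  # bin width reported in the text


def agent_fixation_bins(samples, start_ms=200, end_ms=800, bin_ms=BIN_MS):
    """Count agent-directed samples per time bin for one trial.

    samples: iterable of (time_ms, aoi), with time_ms an integer latency
             relative to picture onset and aoi one of "agent", "patient", None
             (hypothetical format, not the BeGaze export format).
    Returns {bin_start_ms: (n_agent, n_total)} for bins in [start_ms, end_ms).
    """
    counts = defaultdict(lambda: [0, 0])
    for time_ms, aoi in samples:
        if start_ms <= time_ms < end_ms:
            # Map the sample to the left edge of its bin (assumed convention).
            bin_start = start_ms + ((time_ms - start_ms) // bin_ms) * bin_ms
            counts[bin_start][1] += 1
            if aoi == "agent":
                counts[bin_start][0] += 1
    return {b: tuple(c) for b, c in sorted(counts.items())}
```

Samples outside the analysis window are simply ignored, so the same function could be applied to a different window by changing `start_ms` and `end_ms`.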
1 Since participants described the pictures in this experiment without a scripted utterance format, Basque participants also dropped some noun phrases and did not overtly express the agent or patient argument, as it is done in natural language use. Only sentences with both arguments overtly expressed were included in the analyses (see exclusion criteria). Unconditionally dropping arguments is not possible in German. For this reason, we decided to increase the number of Basque participants to obtain an approximately equal number of responses in each language.
EEG data were preprocessed in EEGLAB (Delorme & Makeig, 2004), FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011), and R. Recordings were re-referenced offline to the average of the mastoid electrodes; lip EMG electrodes were referenced to each other. Data were band-pass filtered (0.16 to 48 Hz), downsampled to 250 Hz sampling rate, and epoched (-1750 to 1750 ms relative to stimulus picture onset, Fig. 2). Channels with signal probabilities or kurtosis exceeding ±5 SD were rejected. The SASICA (Chaumon, Bishop, & Busch, 2015) and FASTER (Nolan, Whelan, & Reilly, 2010) algorithms were used to automatically remove artifactual independent components, identified by correlation with EMG and EOG electrodes, temporal autocorrelation (lag = 20 ms), or by being focal to individual epochs or channels. Next, the rejected channels were spherically interpolated. Time-frequency representations of EEG power were calculated with Hanning tapers and a sliding wavelet convolution transform (width = 3 cycles) in 0.5 Hz and 50 ms steps between 1 and 34 Hz. ERS/ERD on the single-trial level was defined as dB relative to the median power in a 300 ms interval during the presentation of the lead-in cue (ending 150 ms before the end of the cue, −600 to −300 ms relative to stimulus picture onset). To reduce the data dimensionality for statistical analyses, the power was then averaged into individually defined theta, alpha, and beta frequency bands and into four regions of interest (ROIs, Fig. 3).
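The single-trial baseline normalization can be illustrated with a short sketch. It assumes the conventional decibel definition, 10 · log10(power / baseline), and treats both ends of the baseline interval as inclusive; the function name `ers_erd_db` and the flat list input are hypothetical and stand in for one trial, one channel, and one frequency of the time-frequency output.

```python
import math
from statistics import median


def ers_erd_db(power, times_ms, base_start=-600, base_end=-300):
    """Express power as dB change relative to the median baseline power.

    power, times_ms: equal-length sequences for one trial (assumed layout).
    Positive values indicate synchronization (ERS), negative values
    desynchronization (ERD).
    """
    baseline = median(
        p for p, t in zip(power, times_ms) if base_start <= t <= base_end
    )
    return [10.0 * math.log10(p / baseline) for p in power]
```

With 50 ms steps, the default interval covers seven time points. A tenfold power increase relative to baseline yields +10 dB and a tenfold decrease yields −10 dB.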
Individual peak alpha frequencies (IAFs) were established using the channel reactivity-based method (Goljahani et al., 2012;Goljahani, Bisiacchi, & Sparacino, 2014). This method determines the frequency in the alpha range that is most responsive, in the sense of exhibiting the greatest desynchronization during a task (here, the first 1000 ms of picture description) compared to at rest (here, 1000 ms in the middle of the fixation square display, from −1250 to −250 ms relative to stimulus picture onset, Fig. 2). All trials that were not excluded based on EEG criteria (cf. below) were included to determine IAFs. Individual peak alpha frequencies ranged from 7.1 to 12.6 Hz (cf. Bazanova & Vernon, 2014, for a review of variability in IAFs). Individually adjusted bands were defined as follows: theta ranging from IAF − 6 to IAF − 4 Hz, alpha ranging from IAF − 4 to IAF + 2 Hz, and beta ranging from IAF + 2 to IAF + 20 Hz (cf. Bice, Yamasaki, & Prat, 2020;Klimesch, 1999). The IAF of one participant could not be calculated using the method described above and was imputed with the median of all other participants' IAF.
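As a worked illustration of the band definitions, the following sketch derives the three bands from a given IAF and averages power within a band. The function names and the dictionary-based spectrum are hypothetical, and treating the bands as half-open intervals (so that a boundary frequency is counted only once) is an assumption of the example, not something the definitions above specify.

```python
def individual_bands(iaf_hz):
    """Return (lower, upper) band edges in Hz, anchored on the IAF."""
    return {
        "theta": (iaf_hz - 6.0, iaf_hz - 4.0),
        "alpha": (iaf_hz - 4.0, iaf_hz + 2.0),
        "beta": (iaf_hz + 2.0, iaf_hz + 20.0),
    }


def band_mean(power_by_freq, band):
    """Average power over frequencies f with lower <= f < upper (assumed)."""
    lower, upper = band
    values = [p for f, p in power_by_freq.items() if lower <= f < upper]
    return sum(values) / len(values)
```

For an IAF of 10 Hz, this gives theta = 4 to 6 Hz, alpha = 6 to 12 Hz, and beta = 12 to 30 Hz. For the reported IAF range of 7.1 to 12.6 Hz, the lowest theta edge (1.1 Hz) and the highest beta edge (32.6 Hz) both fall within the 1 to 34 Hz range of the time-frequency decomposition.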
Trials were excluded from analyses if participants omitted the agent or the patient from a transitive sentence or the sole argument in an intransitive sentence, uttered ungrammatical sentences or sentences that did not match the pictures, restarted or corrected their utterance, did not use an ergative case (for transitive sentences in Basque), or did not describe events as being completed. For the speech onset analysis, trials with latencies shorter than 400 ms or longer than 6000 ms were excluded. For the eye tracking analysis, only responses to transitive (two-participant) pictures were included. Additionally, trials were excluded if the first fixation to either the agent or the patient occurred later than 500 ms after stimulus onset, or if track loss occurred (defined as a gap of more than 500 ms between subsequent fixations in the analysis time window, 200-800 ms). Trials in which participants fixated already on the position of the agent or patient at stimulus picture onset (Gleitman et al., 2007;Pokhoday, Shtyrov, & Myachykov, 2019) were excluded from the eye tracking analysis. For the EEG analyses, trials with flat-lined mastoid channels, or amplitudes surpassing ±200 μV or identified to be artifactual by visual inspection (after independent component analysis) were excluded. Trials with vocalizations before 1400 ms (including uttering fillers like "uh") were also excluded. Thus, only trials where speaking began at least 600 ms after the end of the analysis time window (0-800 ms) were included in the EEG analyses, largely avoiding contamination of the signal by muscle movement artifacts from the articulators (Ganushchak, Christoffels, & Schiller, 2011;Riès, Legou, Burle, Alario, & Malfait, 2012;Whitham et al., 2007). Six participants with flat-lined mastoid channels in more than half of the epochs were excluded from the EEG analyses. 
Responses to both transitive and intransitive pictures were included in the EEG analyses (except for sentences with intransitive verbs assigning ergative case in Basque as there were only 389 such responses in total, before applying any rejection criteria).
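The numeric exclusion thresholds listed above can be summarized as simple predicates. This is only a sketch: the function names and arguments are hypothetical, the content-based criteria (omitted arguments, ungrammatical or corrected utterances, visual inspection) are not covered, and whether values exactly at a threshold were kept is an assumption read off the wording ("shorter than", "later than", "surpassing", "before").

```python
def keep_for_speech_onset(latency_ms):
    """Keep trials with speech onsets between 400 and 6000 ms."""
    return 400 <= latency_ms <= 6000


def keep_for_eye_tracking(first_character_fixation_ms, longest_gap_ms,
                          on_character_at_onset):
    """Keep trials with an early first fixation and no track loss."""
    return (
        first_character_fixation_ms <= 500   # first agent/patient fixation
        and longest_gap_ms <= 500            # track loss: gap of more than 500 ms
        and not on_character_at_onset        # gaze not already on a character
    )


def keep_for_eeg(first_vocalization_ms, max_abs_amplitude_uv):
    """Keep trials without early vocalization or extreme amplitudes."""
    return first_vocalization_ms >= 1400 and max_abs_amplitude_uv <= 200.0
```

The 1400 ms vocalization criterion leaves a margin of 600 ms after the end of the 0-800 ms EEG analysis window.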
Overall, 4279 trials were included in the speech onset analysis (Basque: 2021 trials, German: 2258 trials, 58.9% of all trials), 1465 trials were included in the eye tracking analysis (Basque: 640 trials, German: 825 trials, 41.9% of all transitive trials) and 2976 trials were included in the EEG analysis (Basque: 1296 trials, German: 1680 trials, 41.0% of all trials). 2 The analysis time window spanned from 200 to 800 ms after the stimulus picture onset for the eye tracking analysis. Before 200 ms, few language-related eye movements are expected because it takes approximately this long to program the first saccade from the fixation square into the picture as soon as it appears (Pierce, Clementz, & McDowell, 2019;Rayner, 1998;Richardson & Spivey, 2008). The EEG analysis time window spanned from 0 to 800 ms because no such physiological restrictions applied.
Data were analyzed with Bayesian hierarchical regression on the single-trial level using the brms (version 2.13.0) interface to Stan (Bürkner, 2018;Bürkner, 2017;Carpenter et al., 2017) in R (R Core Team, 2021). Speech onset latencies were modeled Eye tracking and EEG data were modeled following a growth curve approach (Mirman, 2014;Mirman, Dixon, & Magnuson, 2008), which employs orthogonalized polynomial time terms to describe non-linear changes over time. These polynomials model the overall slope of the curves (linear time), the shape of the primary inflection points (quadratic time, describing curves as flat or "peaky"), and shifts in peak latencies (cubic time) (cf., e. g., Kuchinsky et al., 2013). The eye tracking analysis modeled the time course of agent character fixations with a binomial regression, which is well-suited for eye movement data (Cho et al., 2018;Donnelly & Verkuilen, 2017;Sauppe, 2017). The EEG analyses separately modeled the time course of theta, alpha, and lower beta band ERS/ERD with a Gaussian regression.
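To make the notion of orthogonalized polynomial time terms concrete, the following sketch constructs linear, quadratic, and cubic terms over a sequence of time bins by Gram-Schmidt orthogonalization. It is similar in spirit to what R's `poly()` produces, but it is not the code used for the analyses; the function name is hypothetical, and the signs and scaling may differ from those of other implementations.

```python
def orthogonal_time_terms(n_bins, degree=3):
    """Return `degree` lists of length n_bins (linear, quadratic, cubic, ...).

    Each term has zero mean, unit length, and is orthogonal to all
    lower-order terms, so the terms are uncorrelated with one another.
    """
    time = [float(i) for i in range(n_bins)]
    basis = [[1.0] * n_bins]  # constant term, used only for orthogonalization
    for power in range(1, degree + 1):
        term = [t ** power for t in time]
        for lower in basis:
            # Remove the projection of this term onto each lower-order term.
            coef = (sum(a * b for a, b in zip(term, lower))
                    / sum(b * b for b in lower))
            term = [a - coef * b for a, b in zip(term, lower)]
        norm = sum(a * a for a in term) ** 0.5
        basis.append([a / norm for a in term])
    return basis[1:]
```

For the eye tracking window (200-800 ms in 100 ms bins), `orthogonal_time_terms(6)` returns three terms of six values each. Because the terms are uncorrelated, the estimate for, e.g., the cubic term can be interpreted independently of the linear and quadratic terms.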
For the eye tracking analysis, the predictors were orthogonalized linear, quadratic, and cubic time terms, and their interactions with language. EEG analyses also used this model structure and additionally included as a factor the syntactic transitivity of the sentences to capture the differences between transitive and intransitive responses, and the spatial factors anteriority (anterior, posterior) and laterality (left, right) to capture the topography of effects, as well as all of their interactions with the time terms and language. We focus on analyzing the proportion of fixations to agent characters. Agents are the locus of the case marking difference, mentioned first in the sentences, the instigators of events, and make up readily definable areas of interest in the stimulus pictures. Areas of interest for the action and the patient affected by the action, by contrast, are often not easily defined because the corresponding event information is spread over several parts of the picture.
As control variables, we included the length of the first noun phrase (in syllables) to statistically capture the potential effects of phonological encoding, and the trial number to statistically capture potential cumulative priming effects (Kaschak, Kutta, & Jones, 2011;Pickering & Ferreira, 2008). The eye tracking model additionally included control variables for the size of the agent and patient areas of interest because larger characters might be more likely to be gazed at in general. To account for temporal autocorrelation in trial-level time series, the eye tracking model included agent fixations from the respective previous time bin as predictor (Cho et al., 2018;Sauppe, 2017) and EEG models included an AR(1) term. Continuous predictors were z-transformed (Schielzeth, 2010), except for the time terms, and categorical predictors were sum-coded (-1, 1). Maximal random effects structures justified by design were included (Barr, 2013;Barr, Levy, Scheepers, & Tily, 2013). All predictors had Student's t priors centered on 0, with a standard deviation of 2, and 5 degrees of freedom (thus slightly fattening the tails). The models were fitted in six chains of 3000 iterations each (after 3000 warm-up iterations, 6000 in total). The default priors of brms were used for group-level predictors. We consider predictors with a posterior probability mass of at least 80% above or below 0 (i.e., where the 80% highest density interval excludes 0, cf. Kruschke, 2015) noteworthy and report for these effects how much of the posterior probability mass lies above or below 0. This indicates how probable an effect is given the data and the priors; it is not a significance threshold in the frequentist sense.
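The predictor coding and the reporting criterion can be sketched in a few lines. This is an illustration under stated assumptions, not the brms code used for the models: the function names are hypothetical, the choice of which level receives +1 is arbitrary, and the posterior draws are represented as a plain list of sampled coefficient values.

```python
from statistics import mean, stdev


def z_transform(values):
    """Center a continuous predictor and scale it by its sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def sum_code(labels, positive_level):
    """Code a two-level factor as +1 / -1 (level assignment is assumed)."""
    return [1 if label == positive_level else -1 for label in labels]


def posterior_mass(draws):
    """Return (P(beta < 0), P(beta > 0)) as proportions of posterior draws."""
    n = len(draws)
    below = sum(d < 0 for d in draws) / n
    above = sum(d > 0 for d in draws) / n
    return below, above


def is_noteworthy(draws, threshold=0.80):
    """True if at least `threshold` of the posterior mass lies on one side of 0."""
    return max(posterior_mass(draws)) >= threshold
```

For example, if 99% of the sampled values for a coefficient are negative, `posterior_mass` returns approximately (0.99, 0.01), which would be reported as P(β < 0) = 0.99.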
In interpreting the EEG models, we focus on predictors that involve interactions between language and syntactic transitivity. In combination, these factors define the different sentence types and allow us to single out the planning of ergative sentences in Basque. We refrain from interpreting effects that involve the language difference but not transitivity because these could most parsimoniously be explained by differences between participant groups.
Agent fixations were overall similar between Basque and German, but exhibited a shifted, later peak in Basque, reflected in a negative interaction between the language contrast and the cubic time term (mean β_Time³×Language = −0.30, P(β < 0) = 0.99, Fig. 4, Table S3, Figs. S3 and S4). The difference between Basque and German fixation curves was restricted to a shift in the latency of the peaks: the point of maximal differentiation occurred later in Basque, i.e., Basque speakers spent more time distributing their attention between agents and other aspects of the picture than German speakers. All other aspects of the fixation time course did not differ. In particular, the overall likelihood of looking at the agent was estimated to be practically identical in both languages (β_Language = 0.01, P(β < 0) = 0.57, Fig. S3).
In the beta band, the planning of transitive sentences in Basque elicited a stronger desynchronization at left electrode sites, whereas intransitive sentences desynchronized more at right electrode sites (β_Language×Transitivity×Laterality = 0.02, P(β < 0) = 0.91). German sentences did not show substantial differences (Fig. 7).

Discussion
A picture description experiment contrasted the planning of sentences in Basque and German, two languages differing in how argument-verb dependencies are marked. The time course and neural underpinnings of planning were shaped by how structural dependencies between agent arguments and verbs are signaled: overtly, through ergative case marking in Basque, or covertly, through ambiguous nominative case that is used by default for both arguments of transitives and also for the sole argument (subject) of intransitive verbs in Swiss German (cf. Fig. 1).
As in previous studies on the interaction of case marking systems and sentence planning (Hwang & Kaiser, 2014b;Sauppe et al., 2021), we found that preparing to produce sentences with ergative case marking leads to differences in early relational-structural encoding. The planning strategy of Basque speakers suggests that the timing of dependency planning is determined by the argument structure and how this structure is expressed by case marking.

Relational-structural encoding and ergative case marking
Case marking in Basque is not solely assigned based on the position in the syntactic structure of a sentence, but rather reflects the thematic role that is assigned to an argument (Laka, 2006b;Laka, 2006a;Laka, 2017). Therefore, Basque speakers need to encode the argument structures of verbs that could be used to describe a given event to determine whether to plan an ergative or nominative noun phrase. Only agent arguments must be marked by ergative case. Gathering enough information about the depicted event and how it could be described during the initial stages of sentence planning likely involves lexical access because syntactic information is generally assumed to be part of the lemma stratum of the mental lexicon. There, argument structures are represented separately from the lemmas of individual verbs, in the form of "combinatorial nodes" (Levelt, Roelofs, & Meyer, 1999;Pickering & Branigan, 1998;Wheeldon, 2011). It might be possible that argument structure information could be accessed directly without necessarily going through the activation of the representations of specific verbs. Speakers, however, need to engage in at least some minimal form of verb encoding because they need to arrive at an appropriate and felicitous description for the event they are about to talk about. This would make it possible that speakers access an argument structure through encoding specific verbs, at least to a degree that allows accessing the relevant syntactic information (e.g., by deciding on a larger class of verbs that share combinatorial nodes and from which a felicitous verb can be selected at a later time point).

Eye movements and speech onset latencies
Support for distinct planning strategies of Basque and German speakers comes from differences between the planning of sentences with ergative-marked agents in Basque and sentences with unmarked agents in German in the time course of fixations during picture description. The analysis of overt visual attention allocation to agents during the first 800 ms of planning targeted the relational-structural encoding phase, which has been shown to take place during this time window (Griffin & Bock, 2000; Konopka, 2019; Sauppe et al., 2013). In both Basque and German, speakers quickly looked at the agent referents. During the planning of ergative sentences, however, Basque speakers reached the agent fixation peak later than German speakers (around 500 ms, Fig. 4). This means that they spent more time dividing their attention between the agents and other aspects of the pictures, i.e., information that is important for encoding the event relations (cf. Hwang & Kaiser, 2014b; Sauppe, 2017). The fixation behavior of Basque speakers when planning ergative sentences is consistent with the findings from Hindi ergative production (Sauppe et al., 2021).
Visual attention that is distributed over different elements of a picture aids speakers in encoding relational information about the depicted event (Griffin & Bock, 2000). Patterns of looks to agents (and patients) during the relational-structural encoding phase of sentence planning have been suggested to be an index of verb planning. In a picture description study, Sauppe (2017) showed that speakers of German engaged in more distributed fixation behavior when verbs were placed in sentence-medial positions compared to sentence-final positions. Likewise, Konopka (2019) showed that speakers of Dutch fixated both agents and patients during verb encoding when describing pictures in response to questions that either put the agent or the patient in focus ("What does the panda do?" vs. "What happened to the wall?"). The later peak of visual attention to ergative-marked agents in Basque is thus consistent with the interpretation of increased theta and alpha ERS as reflecting the retrieval of argument structure information and early structural encoding.
In addition, the speech onset latency results are in line with the account that the planning of ergatives requires an earlier relational-structural commitment to the type of the argument-verb dependency. Basque speakers started articulating ergative-marked sentences faster than nominative sentences (and also faster than German speakers started articulating transitive and intransitive sentences, Table 1). These speech onset latency differences could reflect faster planning when there is no competition from alternative sentence plans in Basque ergatives compared to when there is competition from alternatives for sentences with nominative-marked NPs. Myachykov et al. (2013) showed that the partial activation of syntactic alternatives led to longer speech onset latencies in a comparison of Russian (high syntactic flexibility, more alternatives) and English (low flexibility, fewer alternatives).
The current results provide a link to recent reaction time findings on the role of argument structure in English and Japanese sentence planning (Momma & Ferreira, 2019; Momma, Slevc, & Phillips, 2016; Momma, Slevc, & Phillips, 2018). These studies show that speakers plan the dependency between verbs and patient arguments before articulating the patient noun phrase, but they do not plan ahead the dependency between verbs and nominative-marked agents. In addition, in English intransitive sentences, verbs are retrieved earlier when the subject is a patient than when it is an agent because these kinds of arguments are represented in the argument structure of the verbs, with patient arguments being more tied to the verb (Momma et al., 2018). The time course of Basque ergative sentence planning mirrors these findings by highlighting the role of argument structure encoding for the signaling of agent-verb dependencies through case marking morphology.

Theta- and alpha-band synchronization
In the current study, the planning of sentences with ergative-marked agents in Basque led to a pattern of agent fixations characterized by a shifted peak compared to nominative sentences in German (Fig. 4), to event-related synchronization of power in the theta and alpha frequency bands (Figs. 5 and 6), and to shorter speech onset latencies, compared to nominative sentences (Table 1).
Increases of neural activity in the theta band are indices of the processing of syntactic and semantic dependencies (Bastiaansen, van Berkum, & Hagoort, 2002b; Bastiaansen, van Berkum, & Hagoort, 2002a; Hald, Bastiaansen, & Hagoort, 2006; Weiss et al., 2005, i.a.) as well as working memory engagement (Karrasch et al., 2004; Krause et al., 2000; Pavlov & Kotchoubey, 2020; Riddle et al., 2020; Sauseng et al., 2004). Crucially, theta band synchronization supports the "structured retrieval of choice-relevant information around decision points" and is thus a neural mechanism that coordinates the integration of multiple types of information across brain networks (Womelsdorf, Vinck, Leung, & Everling, 2010, p. 10). Theta oscillations during memory encoding and retrieval also bind memory traces in the neocortex, coordinated by hippocampal activity (Herweg et al., 2016). Based on the function of theta to also bind and integrate language-specific relational information (Covington & Duff, 2016; Duff & Brown-Schmidt, 2012), Cross, Kohler, Schlesewsky, Gaskell, and Bornkessel-Schlesewsky (2018, p. 9) proposed that theta activity combines linguistic elements "into successively more complex representations, establishing relations between (nonadjacent) elements in a sentence". We accordingly suggest that the theta ERS during the planning of Basque sentences with ergative-marked arguments reflects that speakers quickly encoded the events' relational structure and accessed syntactic information in the mental lexicon to decide on case marking at an early time point. Consequently, the integration of event relations and the marking of syntactic dependencies could be the cause of the increased theta-band activity in the current study.
The fact that the current study found more pronounced theta ERS for the planning of sentences with overtly ergative-marked argument-verb dependencies in Basque points to two possible scenarios (which are not mutually exclusive): First, activity in the theta band reflects working memory engagement (e.g., Hsieh & Ranganath, 2014; Jensen & Tesche, 2002; Klimesch et al., 2005; Krause et al., 2000; Sauseng, Griesmayr, Freunberger, & Klimesch, 2010). Theta ERS could primarily reflect Basque speakers' increased working memory engagement when selecting appropriate verbs that assign ergative case and when accessing their associated argument structure information. In this process, multiple lemmas are activated simultaneously or the activation of inspected but not selected lemmas has not yet decayed (as decay is generally assumed to be a relatively slow process compared to activation, Dell, 1986; Levelt, 1999; Pickering & Branigan, 1998). This interpretation would also be in agreement with increased theta synchronization as a reflex of lexical retrieval in lexical decision tasks (Bastiaansen, Oostenveld, Jensen, & Hagoort, 2008; Bastiaansen, van der Linden, ter Keurs, Dijkstra, & Hagoort, 2005). Second, more intensive theta ERS could indicate that Basque speakers have reached or are close to reaching a decision on which verb to choose for sentences with ergative case marking. This would happen earlier than in the other, less constraining sentence types. Peaking theta-synchronized spiking activity would then mark the decision point for a verb lemma (Womelsdorf et al., 2010) and its retrieval from memory (Herweg et al., 2016). Basque speakers could then immediately engage in more intensive structural encoding of ergative sentences.
For sentence planning in Hindi, by contrast, Sauppe et al. (2021) found that theta band synchronization was associated with sentences without ergative case marking. Sauppe et al. argued that this reflects that speakers defer structural decisions and entertain multiple alternative sentence plans when they do not have to decide on the signaling of a specific argument-verb dependency (e.g., through a grammatical feature like case marking), which constrains the possible structures that could be prepared. Sentences with nominative arguments allow deferring the commitment to a structure because these arguments are compatible with several sentence continuations. As long as speakers have not yet committed to a sentence structure, syntactic alternatives could be activated and prepared in parallel (Dell & O'Seaghdha, 1994; Myachykov et al., 2013). While this provides more flexibility in planning, e.g., to accommodate attention or accessibility fluctuations (Ferreira, 1996; Velde & Meyer, 2014; Wagner, Jescheniak, & Schriefers, 2010), it likely also requires speakers to simultaneously maintain multiple utterance plans or to handle potential competitor plans (Hwang & Kaiser, 2014a; Myachykov et al., 2013; Stallings et al., 1998).
While Basque nominatives also allow continuation in different ways, the choices are limited because, unlike in Hindi, nominatives are only compatible with intransitive verbs assigning patients or themes. More importantly, Hindi differs from Basque in exhibiting a fundamental split in ergative case assignment: only the agent arguments in the perfective aspect (describing completed events) take ergative case marking (Bickel, Witzlack-Makarevich, Choudhary, Schlesewsky, & Bornkessel-Schlesewsky, 2015; Kachru, 2006; Sauppe et al., 2021). The decisions that speakers need to make in Hindi thus concern abstract properties of sentences and described events, specifically whether there are one or two participants involved (for choosing the verb's syntactic transitivity) and whether the event is completed or ongoing (for choosing the verb's grammatical aspect). These decisions are arguably based on conceptual and relational information about the event (Griffin & Bock, 2000), but without access to syntactic information represented in the lemma stratum. The processes during early argument structure encoding in Basque ergative sentences, by contrast, require more specific processing, which we propose is reflected in increased theta band synchronization. This could also exceed the required activity for simultaneously considering multiple abstract sentence plans. In summary, we propose that theta ERS in Basque reflects early lemma choice and retrieval necessitated by ergatives, while in Hindi, theta ERS reflects sentence plan choices necessitated by nominatives.
The planning of Basque ergative sentences also elicited a synchronization in the alpha band in frontal and central electrode sites between approximately 100 and 300 ms after picture onset (Fig. 6). In a sentence comprehension study, Segaert, Mazaheri, and Hagoort (2018) found an alpha ERS effect with a similar topography when syntactic binding processes could be anticipated to be applied to the next word. In relation to the current findings, alpha ERS could thus go hand in hand with the theta ERS as an additional index of planning the relationship between the verb and its case-marked argument. In a lexical decision study, Bastiaansen et al. (2008) also found that alpha power increased shortly after stimulus words were presented and participants accessed their mental lexicon to assess the wordhood of the stimuli. Meyer, Obleser, and Friederici (2013) argue that the role of alpha-band synchronization for working memory-related functions can be extended to language and that its function is to hold linguistic information active until it can be released. For sentence comprehension, this means the retention of arguments until the verb is encountered in long-distance dependencies. Furthermore, Klimesch, Sauseng, and Hanslmayr (2007) propose that alpha ERS plays an integral role in the retention of memory traces through top-down processes specific to the experimental task. The alpha synchronization for ergative Basque sentences could thus also tentatively be linked to the process of early argument structure encoding: By acting as a gating mechanism, alpha-band synchronization inhibits neural processing (Klimesch, 2012). These processes could contribute to controlling the information flow during access to the lemma stratum and during structural encoding (Herweg et al., 2016, also found frequencies in the alpha range to be involved in memory retrieval and relational binding).
The function of alpha ERS could therefore be to retain the combinatorial node (or even the verb lemma itself) until the later linguistic encoding of the sentence-final verb. Based on the finding by Momma (2021) that heads and dependents (here, the argument and the verb) may be planned together locally and then expanded with additional words later, a question for future research would be whether alpha ERS is a reliable marker of dependency planning in which the linguistic encoding of one element may need to be delayed (e.g., for sentence-final verbs).

Beta- and alpha-band desynchronization
The planning of ergative-marked transitive and nominative-marked intransitive sentences in Basque elicited beta-band desynchronization in different hemispheres and at different time points: Basque nominative intransitive sentences went hand in hand with right hemispheric beta ERD starting at approximately 200 ms, whereas ergative transitive sentences went hand in hand with left hemispheric ERD around 400-600 ms (Fig. 7). Power decreases in the beta band have been shown to occur during the processing of syntactic and semantic information during comprehension (Davidson & Indefrey, 2007; Meltzer & Braun, 2011; Weiss & Mueller, 2012). Hanslmayr et al. propose that beta desynchronization subserves information retrieval from long-term memory and that the degree of desynchronization "represent[s] the richness of information encoded in a memory trace" (Hanslmayr et al., 2012, p. 10). For sentence planning in the current study, the right hemispheric desynchronization could thus reflect the greater number of potentially simultaneously activated alternatives for sentences with nominative arguments. (That no within-language differences for German are observed could be due to all subjects being in the nominative, so that all German sentences are planned with the initial activation of alternative structures.)
The left hemispheric beta ERD for transitive sentences with ergative marking might be driven by a different process. Meyer (2018) summarizes the evidence from sentence comprehension studies and concludes that activity in the beta band reflects the prediction of lexical-semantic properties of upcoming words (cf. also Lewis, Wang, & Bastiaansen, 2015). For ergative planning, the beta ERD over the left hemisphere could thus reflect that speakers' earlier structural commitment allows the projection of a sentence plan with more confidence than when there are still alternatives that could be considered. Based on the above interpretation of theta- and alpha-band ERS, the selection of a combinatorial node could constitute "anticipatory" processing here by making speakers look further ahead (cf. Lee, Brown-Schmidt, & Watson, 2013).
In the alpha band, the planning of sentences in Basque and German elicited an ERS-ERD pattern. In Basque specifically, a larger desynchronization for intransitive compared to transitive sentences was observed in posterior electrodes between 600 and 800 ms (Fig. 6). In German, by contrast, no differences in alpha ERD between sentence types were detected (cf. Fig. S8). In Sauppe et al.'s (2021) study on Hindi, the planning of nominative sentences elicited stronger alpha ERD than the planning of ergative sentences, and we interpret the difference in Basque in line with Sauppe et al.'s proposal. As discussed above, Basque speakers engage more intensively in the encoding of verbal argument structure-related information during the planning of sentences with ergative-marked arguments (as reflected in theta and alpha ERS). The preparation of sentences with nominative-marked arguments, by contrast, could proceed with a later commitment, increasing speakers' flexibility (Ferreira, 1996). At the same time, however, this means that potential verbs and competitor plans need to be kept distinct until a definitive commitment is made. This could be achieved through the increased activation of the cortical networks that are implicated in syntactic information processing (Davidson & Indefrey, 2007; Vassileiou, Meyer, Beese, & Friederici, 2018), resulting in alpha-band desynchronization. This interpretation is also corroborated by the reaction time results from the current study: Basque intransitive sentences were initiated more slowly than ergative-marked transitive sentences, consistent with findings from the literature implicating higher processing loads for the planning of sentences under more flexible conditions (Hwang & Kaiser, 2014a; Myachykov et al., 2013; Stallings et al., 1998).
Future research will, at any rate, need to further tease apart the exact relationship between theta-, alpha-, and beta-band activity and relational-structural encoding processes during sentence planning.

Conclusions
Overall, the current eye tracking and electrophysiological evidence implies that the cross-linguistic differences in case marking can only be captured by structurally incremental production accounts. When speakers prepare sentences with ergative-marked agents, they need to encode aspects of the first and last constituents together at the outset of sentence planning. At the same time, the similarities in the general planning time course allow the possibility that there are no all-or-nothing categorical differences in how Basque and German speakers prepare their utterances but rather that languages with richer morphosyntactic signaling systems follow strategies that shift structural encoding towards earlier stages of planning (Hwang & Kaiser, 2014b; Sauppe et al., 2013). The view that the case marking difference affected relational-structural encoding specifically is supported by studies on the planning of dependencies in verb-initial languages with sentence-final agents. In these languages, differences in verbal morphology are also associated with fixation differences in overlapping time windows (e.g., 0-600 ms in Tzeltal or Tagalog, Sauppe et al., 2013). Santesteban et al.'s (2015) priming study on syntactic choices in Basque sentences suggested that verb and constituent structure selection precedes (or is at least partly independent of) case assignment. Our current results from temporally more fine-grained measures show that ergative case marking shapes planning from early on. We also find that planning case marking in Basque appears to be tightly intertwined with argument structure encoding and possibly even verb selection (or at least the decision on a class of verbs, Antón-Méndez, 2020; Sauppe, 2017). This qualifies the offline effects measured through syntactic priming by Santesteban et al. (2015) by showing that the encoding of the verbal argument structure plays a central role in the planning of sentences with ergative case-marked agent arguments.
Early argument structure encoding may be less central, however, in the planning of sentences with nominative arguments when speakers can more flexibly choose between alternative continuations of the sentence they started preparing. To characterize the relationship between case marking and verb and argument structure planning and to describe the determinants and mechanisms of advance planning in Basque, further studies are required that specifically contrast ergative-marked and nominative intransitive sentences. There is already evidence from sentence comprehension that the different kinds of intransitive sentences in Basque are processed differently: Martinez de la Hidalga, Zawiszewski, and Laka (2019) found that the processing of ergative-marked intransitives patterns with the processing of transitives, which always assign ergative case to agents.
On balance, the effects found in the current study are compatible with related findings on the planning of ergative sentences in another language, Hindi (Sauppe et al., 2021). At the same time, the comparison of sentence planning in Basque and German revealed neural activity patterns that were different from those previously reported. We argue that these differences can be traced to grammatical contrasts that govern the assignment of overt ergative case marking between Basque and German. In the future, it will need to be mapped out in more detail how sentence planning in languages with different ways of signaling grammatical relationships between the syntactic elements is supported by and encoded in neural oscillatory activity. The current study is among the first to study neural oscillatory activity during the planning of full sentences and makes use of a cross-linguistic perspective (cf. Norcliffe, Harris, & Jaeger, 2015, on the role of cross-linguistic comparison in psycholinguistics). The exploratory character of this endeavor is also reflected in the interpretation of the effects, which future research will need to refine.
The picture description task used in the current experiment allowed the elicitation of spontaneous utterances, while steering the semantic content of what speakers said (cf. also, e.g., Bögels, 2020). At the same time, this approach gives up experimental control over the form of participants' responses and increases the variability of lexical choices and sentence lengths. While we statistically controlled for some of these aspects, the use of approaches with more prescribed response formats in future studies will be beneficial to further describe the neural basis of speaking and the planning of ergative case marking.
In conclusion, with this comparison of Basque and German, we show that the early time course of sentence planning is tightly intertwined with the grammar of a language. This highlights the continuing need for the field to compare planning in different languages that are carefully selected based on their make-up to explore how speakers adapt their planning to different grammatical affordances (especially since the cross-linguistic basis of sentence production research remains narrow, Jaeger & Norcliffe, 2009). This study also shows that the analysis of event-related neural synchronization/desynchronization can be employed to gain insight into how sentences are planned. This opens up new possibilities of studying how structural representations are processed during planning and how they relate to representations for comprehension and for domain-general cognition (e.g., Martin & Doumas, 2017; Meyer, Sun, & Martin, 2020).

Data availability
Data and analysis scripts are available from https://osf.io/s8tq5/.
for help with data collection and processing, and the Phonogram Archives of the University of Zurich for technical support. The authors also thank two anonymous reviewers for their helpful comments on an earlier version of the manuscript.