No evidence for prosodic effects on the syntactic encoding of complement clauses in German

Does linguistic rhythm matter to syntax, and if so, what kinds of syntactic decisions are susceptible to rhythm? By means of two recall-based sentence production experiments and two corpus studies – one on spoken and one on written language – we investigated whether linguistic rhythm affects the choice between introduced and un-introduced complement clauses in German. Apart from the presence or absence of the complementiser dass (‘that’), these two sentence types differ with respect to the position of the tensed verb (verb-final/verb-second). Contrary to our predictions, which were based on previously reported rhythmic effects on the use of the optional complementiser that in English, the experiments fail to obtain compelling evidence for rhythmic/prosodic influences on the structure of complement clauses in German. An overview of pertinent studies showing rhythmic influences on syntactic encoding suggests that these effects are generally restricted to syntactic domains smaller than a clause. We assume that, in the course of language production, clause-level syntactic projections are specified first; their specification is in fact the prerequisite for phonological encoding to start. Consequently, prosodic effects may only touch upon the lower-level categories that are to be integrated into the clausal projection, but not upon the syntactic makeup of the higher-order projection itself.


Introduction
The sound of a sentence, its melody and rhythm is, in great measure, dependent on word choice, word order, and the choice of a particular syntactic construction. This paper is concerned with the question whether, and to what extent, the reverse holds as well: that is, whether the syntax of a sentence is dependent on prosodic aspects like melody and rhythm; or, put differently, whether speakers consider prosodic well-formedness when making syntactic decisions in language production.
The influence of prosody on syntax is most obviously attested in metered poetry; poets may tweak sentence structure to the benefit of sound and they do so to an extent that would be unacceptable in normal speech, sometimes violating otherwise high-ranking syntactic rules (Kiparsky 1975; Youmans 1983; Fitzgerald 2007). Similarly, given the importance of phonological form for persuasive speech (Menninghaus et al. 2015), speakers are known to adjust the syntax for the sake of prosody in rhetoric registers, be it in speech or writing (Bolinger 1957; Anttila et al. 2018). Even in normal spontaneous language, prosodic influences on sentence structure have been reported. In what follows, we consider one specific prosodic feature, viz. the linguistic rhythm that is due to the distribution of stressed and unstressed syllables or of accented and unaccented words. The literature on this topic considers at least three different ways in which linguistic rhythm may affect sentence structure (cf. Anttila 2016 for a recent, more general, overview): First, rhythmic alternation of stressed and unstressed syllables may be obtained by the inclusion or omission of optional elements. A language production experiment by Lee & Gibbons (2007) suggests that speakers use the unstressed optional complementiser that to maximise rhythmic alternation of weak and strong syllables, as it is more often produced when the top of the complement clause starts in a stressed (Lucy) as opposed to unstressed (Louise) syllable (1).
(1) Ian guessed (that) {Louise, Lucy} signed the contract.
Secondly, in languages with sufficient flexibility concerning the word order, speakers have been shown to make use of this flexibility to ensure rhythmic alternation (Schlüter 2005; Rohdenburg 2014). For example, Vogel et al. (2015) find that the syntactic placement of the inherently unaccented pronominal adverb drin 'in' in (2) is conditioned by the prosodic structure of the immediate environment, suggesting that speakers favour rhythmic (here: dactylic) linearisations like (2a) over adverb placements which would lead to dysrhythmic word orders (2b), i.e. those in which the alternation between stressed and unstressed syllables is less regular.
(2) a. Da wollte der Peter Tomaten drin kochen. (rhythmic)
there wanted the Peter tomatoes in cook
b. Da wollte der Peter drin Tomaten kochen. (dysrhythmic)
there wanted the Peter in tomatoes cook
'Peter wanted to cook tomatoes in there.'
Thirdly, prosodic constraints may condition the choice between two or more (quasi-)synonymous sentence constructions. Anttila et al. (2010) and Anttila (2016) show that the choice between competing ditransitive constructions is dependent on the viability of the resulting prosody: specifically, Anttila et al. (2010) show that double object constructions involving a stressed goal (and concomitantly involving a stress clash, i.e. a sequence of two stressed syllables: give John the ball) are clearly underrepresented compared to double object constructions involving an unstressed goal, as in give him the ball.
Similar prosodic effects on the choice between genitive case and prepositional constructions for the expression of possessive or partitive attributes (e.g. the mayor's house vs. the house of the mayor) have been reported by Shih et al. (2015) for English and Kentner (2018) for German.

On the nature of allegedly "syntactic" variations
The syntactic alternatives for both i) the ditransitives (double object versus prepositional dative; Anttila et al. 2010) and ii) the genitive attributes (genitive case vs. prepositional genitive; Shih et al. 2015) involve different constructions that are semantically quasi-equivalent alternatives but that do not share a syntactic relationship. It is therefore not entirely clear whether the reportedly prosody-driven choice between these options affects syntactic operations per se, or whether it is better conceived as the selection between two precompiled syntactic frames, as advocated in construction grammar (Goldberg 1995; Kay 2002). In the latter case one might instead assume a prosodic effect on the selection from the "lexicon of constructions", as it were, not so much a direct prosodic effect on syntactic computation or on syntactic relations.
Even when word order is affected by linguistic rhythm, it is debatable whether this constitutes a prosodic effect on syntactic computation. Agbayani & Golston (2010) consider hyperbaton in Classical Greek and Latin to be a type of word order alternation that involves phonological constituents rather than syntactic ones, and cannot be explained in purely syntactic terms. In this regard, their notion of "prosodic movement" (Agbayani et al. 2015; Bennett et al. 2016) embodies a conception of word order variation as phonological rather than syntactic, and this notion may well apply to cases like (2) in which the syntax is largely indifferent to the word ordering.
Finally, the prosody-driven inclusion or omission of optional elements need not be a prosodic effect on syntax, but merely a case of phonological ellipsis. This may be true of optional that in English complement clauses as there is no obvious syntactic difference between clauses with vs. without that other than the presence or absence of the complementiser.
With these considerations in mind, the studies reviewed so far suggest that the observable rhythmic effects on sentence structure do not necessarily touch upon intrinsically syntactic relations or computations. Instead, they may affect either the resulting phonological representation (the mere linearisation of syntactic constituents), or the choice among syntactically unrelated, possibly precompiled, constructions that happen to be quasi-synonymous.
However, there are cases of syntactic variability that appear to involve processes that may be more properly construed as syntactic in nature and that cannot be relegated either to the phonology or to the "lexicon of constructions". One example is the alternation between introduced (3a) and unintroduced (3b) complement clauses (CC) in German, which is the object of the study to be presented here. These structures are the German equivalent to English complement clauses with or without that.
(3) a. Sandra glaubt, dass Gisbert Techno hört. (verb-final CC)
Sandra thinks that Gisbert Techno listens
b. Sandra glaubt, Gisbert hört Techno. (verb-second CC)
Sandra thinks Gisbert listens Techno
'Sandra thinks (that) Gisbert listens to Techno music.'
In contrast to its English analogue, this kind of alternation does constitute a difference in word order (verb-final subordinate clause versus verb-second structure), but the word order difference is, crucially, not reducible to phonology, as it affects syntactic constituents, not phonological ones. Also, the word order difference is not a simple one, as the structure in (3a) requires the complementiser dass, which is not licensed in verb-second structures like (3b), i.e. there is a complementary distribution of subordinating conjunctions and verb-second order. On the other hand, one might say that (3a) and (3b) are different and independent constructions; but in contrast to the cases discussed above (i.e. the English dative alternation or genitive construction choice), there is a systematic syntactic relationship between the two sentence structures. The conventional wisdom on these German structures holds that the verb-final order of a subordinate clause is the underlying word order. From this basic word order, other orders may be derived via verb movement and topicalisation (4) (see, among many others, Thiersch 1978; Wöllstein-Leisten et al. 1997).
(4) a. (dass) Gisbert Techno hört (SOV) underlying order
b. hört_i Gisbert Techno t_i (VSO) verb movement → V-initial order
c. Gisbert_j hört_i t_j Techno t_i (SVO) topicalisation → V2 order
The alternation between complement clauses like (3a) and (3b) therefore constitutes a test case for the introductory question whether linguistic rhythm affects syntax in language production. Applying this question to the alternation concerning complement clauses in German is furthermore motivated by the above-mentioned language production experiment by Lee & Gibbons (2007), who found that the rhythmic environment affects the inclusion or omission of the optional complementiser that in English. Section 2 provides more background on the syntax, semantics and pragmatics of introduced versus unintroduced complement clauses. In Section 3, we report on four experiments that largely fail to produce compelling evidence for a rhythmic effect on the choice between introduced and unintroduced complement clauses in German. We consider potential reasons for this in Section 4; specifically, we present a synoptic discussion of studies concerned with ostensible rhythmic effects on sentence structure and argue that rhythmic effects are restricted to sub-clausal domains of structure building. Section 5 concludes the paper.

Introduced vs. unintroduced complement clauses
Finite complement clauses in German come in two varieties, viz. i) those that feature a complementiser (3a) and ii) those that don't (3b). The former are known as introduced complement clauses; they display verb-final syntax, which is characteristic of most subordinate clauses in German. The latter are called unintroduced complement clauses for their lack of a subordinating conjunction (complementiser), or dependent main clauses (Auer 1998) because their word order resembles the syntax of simple declarative clauses with the tensed verb in second position (V2). In the following, we focus on complement clauses that serve as sentential objects to a preceding verbal head and ignore sentence-initial complement clauses or ones that are licensed by nouns or adjectives.
(3) a. Sandra glaubt, dass Gisbert Techno hört.
Sandra thinks that Gisbert Techno listens
b. Sandra glaubt, Gisbert hört Techno.
Sandra thinks Gisbert listens Techno
'Sandra thinks (that) Gisbert listens to Techno music.'
The presence or absence of the complementiser usually does not affect the core meaning of the sentences; (3a) and (3b) are strictly synonymous. However, the literature on the subject notes several conditions for the choice between introduced vs. unintroduced complement clauses (see e.g. Reis 1997; Truckenbrodt 2006 for formal accounts). For one thing, the syntax of finite complement clauses depends on the embedding verb. Several verbs license both introduced and unintroduced complement clauses, but they do so to different degrees. While some embedding verbs appear equally with introduced and unintroduced sentential complements (e.g. sagen, glauben 'say, believe'), other verbs hardly allow unintroduced complement clauses (this holds especially for factive verbs like akzeptieren 'to accept', or deontic verbs like befehlen 'to command'). However, in general, the environments that license unintroduced complement clauses also license the variant with the complementiser.
The matrix clauses with embedding verbs have to be distinguished from so-called "epistemic parentheticals" (Thompson & Mulac 1991) or "comment clauses" (Brinton 2008). These are clauses that exclusively feature a restricted set of verba putandi in the first person singular (e.g. I think, I believe, I suppose). Rather than serving as heads to the complement clause, these clauses are considered discourse markers or "hedges" that signal uncertainty on the side of the speaker (Auer 1998).
Apart from the lexical specifics of the embedding verb, the syntactic environment conditions the choice between introduced and unintroduced complement clauses. Negated matrix clauses decrease the likelihood of unintroduced complement clauses. The same holds when the embedding verb does not directly precede the complement clause. On the other hand, unintroduced complement clauses are more likely if they are set in conjunctive mood (Auer 1998).
Auer claims that the relative pragmatic import of matrix clause and complement clause predicts the structure of the complement clause: If the complement clause contains presupposed information or information that the speaker considers known or discourse-given, it is more likely to be introduced with a complementiser; conversely, if the information in the complement clause is new or specifically relevant to the discourse, it is more likely to be realised as dependent main clause, i.e. without a complementiser. That is, Auer assumes that the syntactic subordination marker signals semantic or pragmatic subordination with respect to the main clause. This claim seems somewhat at odds with the findings by Ferreira & Dell (2000) and more recent research on English complement clauses: Ferreira & Dell show that speakers are more likely to omit the complementiser when the subsequent words were previously mentioned (i.e. known to the speaker). Jaeger (2010) found that the inclusion of the complementiser is more likely when the content of the complement clause is more informative, i.e. when its wording is less predictable from the context. Moreover, the frequency of the lexical material at the top of the complement clause is inversely correlated with the presence of the complementiser (Jaeger 2010). Also, that-mention is strongly correlated with hesitations or disfluencies. A study by Hawkins (2004) suggests that the syntactic complexity or length of the complement clause increases the likelihood of the complementiser being present. Together, these findings suggest that, at least in English, the accessibility of the (lexical) material within the complement clause guides the production of this optional and syntactically redundant word; basically, speakers include that when they need to buy time to plan the subordinate clause, because its production turns out to be demanding.
Other psycholinguistic studies find an effect of syntactic persistence: speakers are more likely to produce the complementiser that when they did so in previous complement clause productions, but not necessarily when they produced that as demonstrative or relative pronoun (Ferreira 2003). Weinert (2012) compares the use of introduced and unintroduced complement clauses in English and German; her data suggest that, in spoken conversation, more than 80% of English complement clauses lack a complementiser, while the number is significantly lower in German (∼60%). In both languages, the number of unintroduced complement clauses is lower in the written as compared to the spoken modality.
We are aware of two studies that specifically ascertain the role of linguistic rhythm regarding the choice between introduced vs. unintroduced complement clauses, viz. Jaeger (2006) and Lee & Gibbons (2007). Both studies are concerned with English optional that and both reach the conclusion that that-inclusion or omission is sensitive to phonological rhythm. The corpus study by Jaeger suggests that the inclusion of the generally unstressed that is significantly more likely in complement clauses the subject of which starts in a stressed as opposed to unstressed syllable. Jaeger's later, more comprehensive, follow-up study (Jaeger 2010), however, does not include the rhythmic predictor any more. The likely reason for this is that, in this data set, the coding for stress at the top of the complement clause is confounded with the morpho-syntactic type of subject at the top of the complement clause (e.g. unstressed determiner or pronoun vs. lexical noun with initial stress) and with the frequency of this word (e.g. high-frequency determiner/pronoun vs. lower frequency lexical noun); these factors turn out to be strongly correlated with that-mention and probably render any effects of stress and rhythm redundant.
The language production experiment by Lee & Gibbons (2007), on the other hand, controls for the morpho-syntactic type of the embedded subject and its frequency: it is always a disyllabic proper name with stress falling either on the first or on the second syllable.
Lee & Gibbons, who found that the stress quality of the embedded subject affects that-mention, take their finding to support models of language production in which the phonological encoding feeds back to the grammatical encoding stage. This is especially noteworthy because psycholinguistic research has produced mixed, and rather little, evidence in favour of direct phonological feedback to grammatical encoding (Vigliocco & Hartsuiker 2002). For example, while Bock (1987) reports an effect of phonological priming on word order variation, neither Bock (1986) nor Cleland & Pickering (2003) find a comparable effect.
However, as mentioned above, for English, it remains doubtful whether unintroduced complement clauses really differ, in syntactic terms, from those that present with the overt complementiser. Alternatively, the complementiser that may be considered syntactically redundant and the presence or absence of it the business of the phonology. If the latter were true, with no tangible syntactic difference between the two types of complement clause, Lee and Gibbons' experiment would, after all, not constitute evidence for a phonological effect on grammatical encoding, but an effect that plays out exclusively within the phonological encoding stage.
In the following experiments we will examine whether the use of the German complementiser dass is systematically susceptible to the stress quality of the surrounding syllables. As discussed above, the choice between these variants involves a difference in word order that is not reducible to phonology and therefore clearly syntactic in nature. A positive result would strengthen the idea of bidirectional information flow between syntactic encoding and phonological encoding in language production. Anticipating the results, we did not find compelling evidence for such an interaction, against our predictions.
The following sections present the studies in detail. The data and the scripts used for the statistical analyses are provided as supplementary material to this article.

Experiment 1
Experiment 1 is a conceptual replication of the language production experiment by Lee & Gibbons (2007), adapted to German, that closely follows their experimental protocol. Also, the sample size of this and the following experiment (i.e. the number of participants and items) is modeled on the example of Lee & Gibbons. Participants read sentence pairs silently in order to produce them afterwards in response to a recall cue. Sentence pairs consisted of a filler sentence as first stimulus and an experimental target sentence as second stimulus. Following Lee & Gibbons, target sentences were simple matrix clause–complement clause structures that license the optional complementiser dass in German like (5) and (6). The presence or absence of dass (together with the word order of the complement clause) and the stress quality of the surrounding syllables were systematically varied.
Assuming that participants vary their use of dass when producing the memorised sentences, we predicted that they do so in favour of an alternating rhythm. Thus, in sentences with an embedded trochaic subject (e.g. Nadja), the unstressed dass should be produced more frequently than in those with an iambic one (e.g. Nadine) as the latter would result in a stress lapse (i.e. a sequence of two unstressed syllables, deviating from the optimal rhythmic alternation of stressed and unstressed). Additionally, we predicted that an unstressed final syllable of the main verb would lead to more omissions of dass, again in order to avoid a stress lapse. Conversely, the stressed monosyllabic embedding verb should result in more frequent use of the unstressed complementiser in order to avoid a stress clash (i.e. a sequence of two stressed syllables).
We constructed 32 items like (5) and (6), each involving a sentence frame with four conditions. Half of the presented sentences include the optional complementiser dass (5a), (5b), (6a), (6b), and concomitantly a verb-final structure for the embedded sentence. The other half lacks the complementiser, featuring the tensed auxiliary hat 'has' in second position within the subordinate clause (5c), (5d), (6c), (6d). In each frame, the subject of the embedded complement clause is one of two disyllabic female first names, which differ in whether the first or the second syllable is stressed. As a consequence, in half of the items, the subject at the top of the subordinate clause starts in an unstressed syllable (5a), (5c), (6a), (6c), and in the other half it starts in a stressed one (5b), (5d), (6b), (6d). For the embedded subject we used 16 different name pairs (e.g. Nadja – Nadine) for the 32 items so that every pair appears twice in the set. We chose pairs that match closely with respect to segmental content and usage frequency, as gleaned from the Leipzig Wortschatz corpus (http://wortschatz.uni-leipzig.de/).
Furthermore, in each frame the embedded sentence was constructed with one of 32 transitive verbs and their object in accusative case. The lexical verbs in the embedded clause are trisyllabic past participle forms (e.g. gereinigt 'cleaned') with the monosyllabic auxiliary hat.
The main clause verbs were eight different verbs that license a sentential complement which may or may not be introduced by the complementiser dass in spoken German. These eight main clause verbs were selected on the basis of Auer's (1998) study on complement clauses in spoken German. Auer identifies 13 high-frequency verbs that license both introduced and unintroduced complement clauses. For the purpose of our experiment, we included the ones that alternate between trochaic and monosyllabic word forms in their 3rd person indicative, depending on tense (sagt/sagte, glaubt/glaubte, weiss/wusste, hofft/hoffte, hört/hörte, findet/fand, meint/meinte, denkt/dachte, 3rd singular present/preterite of the verbs 'say, believe, know, hope, hear, find, reckon, think').
We systematically varied the stress position of the embedding verb as a between-items factor. Every main clause verb appears four times in the set, twice as a monosyllabic verb (e.g. glaubt 'believes') and twice in trochaic form (e.g. glaubte 'believed'). As a consequence, there are 16 items with monosyllabic (5) and 16 with disyllabic main verbs (6). Thirty-two male proper names serve as main clause subjects, half of them trochees and the other half monosyllabic ones, i.e. matrix clause subject and verb together consist of exactly three syllables, as in the study by Lee & Gibbons. That is, 16 of these matrix clauses end in a stressed syllable (Felix denkt 'Felix thinks') and 16 in an unstressed schwa-syllable (Tim dachte 'Tim thought'). That way, depending on the quality of surrounding syllables, the presence or absence of the unstressed dass yields either rhythmically alternating sequences, or stress lapses or stress clashes at the clause boundary.
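The rhythmic logic behind these predictions can be spelled out in a short Python sketch. It is an illustration only: the function names, the encoding of stressed and unstressed syllables as "S" and "w", and the restriction to the syllables immediately at the clause boundary are assumptions made for this sketch, not part of the experimental materials.

```python
from itertools import product

def boundary_pattern(verb_final_stressed: bool, dass: bool, name_initial_stressed: bool) -> str:
    """Stress sequence at the clause boundary:
    final syllable of the embedding verb, optional dass, first syllable of the name."""
    mark = lambda stressed: "S" if stressed else "w"
    middle = "w" if dass else ""  # dass is treated as unstressed
    return mark(verb_final_stressed) + middle + mark(name_initial_stressed)

def classify(pattern: str) -> str:
    """Label a boundary pattern as a clash, a lapse, or an alternating sequence."""
    if "SS" in pattern:
        return "clash"   # two adjacent stressed syllables
    if "ww" in pattern:
        return "lapse"   # two adjacent unstressed syllables
    return "alternating"

# Enumerate the eight combinations of verb stress, dass, and name stress.
for verb, dass, name in product([True, False], repeat=3):
    pattern = boundary_pattern(verb, dass, name)
    print(f"verb-final stressed={verb!s:5} dass={dass!s:5} "
          f"name-initial stressed={name!s:5} -> {pattern:3} {classify(pattern)}")
```

Under these assumptions, a stressed verb followed directly by a trochaic name yields a clash that dass would resolve, whereas dass after a schwa-final verb always produces a lapse.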
The 32 sets of materials were rotated around the experimental conditions in a Latin square and mixed with fillers to yield four experimental lists. Each list contains 56 sentence pairs, 32 of which are experimental sentences paired with filler sentences; the remaining 24 are pairs of filler sentences. All filler sentences are unrelated to the experiment and do not contain the sentential complement structure.
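The Latin-square rotation can be illustrated as follows. This is a minimal sketch, not the procedure actually used to build the lists: the function name and the simple modular rotation are assumptions, and the condition labels a to d merely follow the sub-example labels of the items (a/b with dass, c/d without; a/c with an iambic, b/d with a trochaic embedded subject).

```python
CONDITIONS = ("a", "b", "c", "d")

def latin_square_lists(n_items: int = 32, conditions=CONDITIONS):
    """Return one dict per experimental list, mapping item index -> condition.

    Each item is shifted by one condition from list to list, so that every item
    occurs in every condition exactly once across the lists.
    """
    k = len(conditions)
    return [
        {item: conditions[(item + shift) % k] for item in range(n_items)}
        for shift in range(k)
    ]

lists = latin_square_lists()
for number, assignment in enumerate(lists, start=1):
    counts = {c: list(assignment.values()).count(c) for c in CONDITIONS}
    print(f"list {number}: {counts}")
```

With 32 items and four conditions, each list contains eight items per condition, and no participant sees the same item in more than one condition.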

Participants
Thirty-two students from Goethe University Frankfurt, Germany, took part in the experiment. They identified as native speakers of German. In case of multilingualism, German was one of the mother tongues. Participants were paid 10 Euros for participation.

Procedure
Participants performed three practice trials followed by 53 trials. Each of the four lists was pseudo-randomised eight times using Mix software (van Casteren & Davis 2006) such that trials of the same condition or the same item had a minimum distance of two. This way, we created 32 different orders, one for each participant. Participants sat in front of a screen in a quiet room and were instructed to read and memorise the two successively appearing sentences silently. Each sentence appeared for exactly five seconds. After both sentences of a pair had been presented, the recall cue for the first sentence (filler) appeared and participants had to produce it from memory. The cue stayed on the screen until participants indicated that they had finished by pushing the enter key on a computer keyboard (60 seconds at most). After that, the cue for the second sentence appeared immediately (consisting of three words in the case of experimental sentences, viz. determiner-object-participle verb, e.g. den Anzug gereinigt 'cleaned the suit') and participants had to recall and produce the second sentence and push the key when they had finished. At this point, participants could take short breaks (again, 60 seconds at most) before they moved on to the next pair of items. Participants were instructed to recall the names and words in the sentences as accurately as possible but were also encouraged to use their own words to express the meaning when they could not remember the exact wording. The presentation on the screen and the recordings of the participants' speech were programmed using PsychoPy software (Peirce 2007). Additionally, the spoken productions were digitally recorded with an external recording device.
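The ordering constraint described above can be approximated by simple rejection sampling, as in the following sketch. This is not the algorithm implemented in the Mix software; it also assumes that a "minimum distance of two" means that two trials of the same condition may not be adjacent, and that filler-only pairs are exempt from the constraint. Since each item occurs only once per list, only the condition constraint is modelled. All names are hypothetical.

```python
import random

def violates(order, min_distance=2):
    """True if two experimental trials of the same condition are fewer than
    min_distance positions apart. Trials whose condition is None (fillers) are ignored."""
    last_seen = {}
    for position, (_, condition) in enumerate(order):
        if condition is None:
            continue
        if condition in last_seen and position - last_seen[condition] < min_distance:
            return True
        last_seen[condition] = position
    return False

def pseudo_randomise(trials, seed, min_distance=2, max_tries=100_000):
    """Shuffle until an order satisfies the distance constraint (rejection sampling)."""
    rng = random.Random(seed)
    order = list(trials)
    for _ in range(max_tries):
        rng.shuffle(order)
        if not violates(order, min_distance):
            return order
    raise RuntimeError("no admissible order found within max_tries")

# One list: 32 experimental pairs (8 per condition) plus 24 filler-only pairs.
experimental = [(item, "abcd"[item % 4]) for item in range(32)]
fillers = [(100 + i, None) for i in range(24)]
order = pseudo_randomise(experimental + fillers, seed=1)
print(order[:8])
```

Calling the function with eight different seeds per list would give the 32 participant-specific orders.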

Scoring
All recordings were transcribed by an undergraduate student and scored by an additional student, adapting the scoring scheme of Lee & Gibbons to the present design: First, irrespective of presence/absence of the complementiser dass, a recall-based production was considered valid if it had one of the eight main clause verbs immediately followed by a sentential complement with a first name as the embedded subject. Owing to a very high number of failed recalls (see results), we deviated from the scoring scheme by Lee & Gibbons and did not exclude productions in which the word-prosodic structure of the embedding verb or of the embedded subject differed from the one in the stimulus sentence (Lee & Gibbons deemed those trials unusable). Rather, we scored for each valid production the prosodic status of the final syllable of the main verb (stressed or unstressed) and of the first syllable of the embedded subject (stressed or unstressed). Finally, we determined for each production whether it contained the complementiser dass or not.

Data analysis
Out of the 1024 recordings, 470 trials (=46%) were considered valid according to the above criteria. This number is considerably lower than the number of the usable trials that Lee & Gibbons (2007) obtained in their experiment (686 out of 1024 = 67%). The reason for this discrepancy is unclear. We speculate that our filler sentences (the first sentence used in each pair of trials) are more complex than the ones used by Lee & Gibbons and correspondingly strained the memory to a higher extent. Table 1 lists the distribution of recalls with and without dass broken down by the stress quality of the surrounding syllables. As evident in Table 1, recalls involving dass are by far more common than those without. Also, there were considerably more valid recalls with trochaic names (n = 278) as embedded subject compared to iambic names (n = 191).
In order to estimate the influence of the rhythmic environment on the inclusion/omission of dass (and concomitantly on the syntactic structure of the subordinate clause), we fit a Bayesian generalised linear mixed model (GLMM) with a Bernoulli response distribution, using the brms package (Bürkner 2017) in the statistical computing environment R (R Core Team 2014) (see the tutorial by Nicenboim & Vasishth 2016). The brms package provides an interface between R and the Stan programming language for Bayesian statistical inference (Stan Development Team 2016). Rather than a point estimate alone, the Bayesian models yield so-called Credible Intervals (CI) along with the coefficient estimate, i.e. a distribution of plausible values of the model parameters given the data. (Bayesian Credible Intervals for posterior distributions differ from confidence intervals for point estimates. Among other things, Credible Intervals are affected by the prior distributions assumed for the model; see, e.g. Nicenboim & Vasishth 2016; Nalborczyk et al. 2018.) The binomial dependent variable was the inclusion vs. omission of dass. In line with the experimental design, we included as fixed effects i. the presence of dass in the stimulus sentence, ii. the stress status of the initial syllable of the proper name serving as embedded subject (stressed vs. unstressed), and iii. the stress status of the final syllable of the embedding verb (stressed vs. unstressed) as a between-items variable; finally, iv. the interaction term of ii. and iii. was included as fixed effect. To avoid correlation of the fixed effects, we applied simple coding (orthogonal sum contrasts, with the two levels of each factor coded as -.5 and .5, respectively). The contrast coding is shown in Table 2. Accordingly, for each of the three main effects, a positive coefficient would indicate that the factor works in the expected direction, i.e. increased dass-mention when dass is present in the stimulus as opposed to when it is not (DassStimulus); or when the initial syllable of the embedded subject is stressed as opposed to unstressed (NameStress); or when the final syllable of the embedding verb is stressed rather than unstressed (VerbStress).
The experiment involves repeated measures over participants (n = 31; one participant did not produce any valid data points and is therefore not represented in the data set), items (n = 32) and embedding verbs (n = 8). The model accounts for the association between the data points coming from the same participant, item or embedding verb when estimating the model's coefficients by including all of these factors as random effects, with random intercepts and all random slopes justified by the experimental design (Barr et al. 2013).
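To make the contrast coding concrete, the following Python sketch computes the fixed-effects part of the linear predictor for a single trial. It is an illustration under stated assumptions rather than the model code used for the analysis (which was run with brms in R): the logit link is assumed, random effects are left out, and the coefficient values are hypothetical placeholders, not the estimates reported in Table 3.

```python
import math

def code(expected_level: bool) -> float:
    """Sum-contrast coding: .5 for the level expected to favour dass, -.5 otherwise."""
    return 0.5 if expected_level else -0.5

def linear_predictor(beta, dass_stimulus, name_stressed, verb_stressed):
    """Log-odds of producing dass for one trial (fixed effects only)."""
    d, n, v = code(dass_stimulus), code(name_stressed), code(verb_stressed)
    return (beta["Intercept"]
            + beta["DassStimulus"] * d
            + beta["NameStress"] * n
            + beta["VerbStress"] * v
            + beta["NameStress:VerbStress"] * n * v)

def inv_logit(x: float) -> float:
    """Map log-odds to a probability (assumes a logit link)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical placeholder coefficients, chosen only to show the mechanics.
beta = {"Intercept": 1.0, "DassStimulus": 1.0, "NameStress": 0.0,
        "VerbStress": 0.0, "NameStress:VerbStress": 0.0}

with_dass = inv_logit(linear_predictor(beta, True, True, True))
without_dass = inv_logit(linear_predictor(beta, False, True, True))
print(round(with_dass, 3), round(without_dass, 3))
```

With levels coded as -.5 and .5, each main-effect coefficient corresponds to the difference in log-odds between the two levels of that factor, averaged over the levels of the other factors, and the intercept corresponds to the mean log-odds over the design cells.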
For each estimated parameter, the prior was a normal distribution with mean 0 and a standard deviation of 1 (we report model comparisons with tighter prior distributions below). The parameter for the LKJ prior for the variance-covariance matrices of the random effects for subjects, items and embedding verb was set to 2 (see the tutorial by Sorensen et al. 2016). Four sampling chains with 10000 iterations each were run for each model, including a warmup period of 2000 iterations.
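The following lines spell out two consequences of these settings. The sketch assumes that the coefficients are on the log-odds scale (i.e. a logit link) and, as stated above, that the 10000 iterations per chain include the warmup period; the variable names are chosen for illustration only.

```python
import math
from statistics import NormalDist

# Sampler settings as described in the text.
chains, iterations, warmup = 4, 10_000, 2_000
post_warmup_draws = chains * (iterations - warmup)

# Central 95% mass of a Normal(0, 1) prior on a coefficient.
prior = NormalDist(mu=0.0, sigma=1.0)
lower, upper = prior.inv_cdf(0.025), prior.inv_cdf(0.975)

# On the log-odds scale, a coefficient b corresponds to an odds ratio of exp(b).
odds_ratio_range = (math.exp(lower), math.exp(upper))

print("post-warmup draws per model:", post_warmup_draws)
print("prior 95% range (log-odds):", round(lower, 2), round(upper, 2))
print("prior 95% range (odds ratio):",
      round(odds_ratio_range[0], 2), round(odds_ratio_range[1], 2))
```

In other words, under these assumptions the prior places most of its mass on effects that change the odds of dass-mention by less than roughly one order of magnitude in either direction.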

Results and discussion
The parameters of interest of this model are shown in Table 3. Along with the mean estimate, we report the 95% Credible Interval and the probability of the coefficient being greater than 0. We consider an effect to be of significance if the 95% CI does not include 0 or when at least 95% of its posterior distribution lies either below or above 0.
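The decision criterion stated above can be made concrete with a small sketch that summarises a vector of posterior draws. This is an illustration under stated assumptions: `summarise_posterior` is a hypothetical helper, the quantiles are taken with a simple index-based approximation, and the draws are simulated (32000 values, matching four chains of 10000 iterations minus 2000 warmup each), not the posterior of any model reported here.

```python
import random
import statistics


def summarise_posterior(draws, level=0.95):
    """Mean, central credible interval and P(beta > 0) from posterior draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Index-based approximation of the lower and upper quantiles
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int((1.0 - tail) * (n - 1))]
    p_greater_0 = sum(d > 0 for d in ordered) / n
    ci_excludes_zero = lower > 0 or upper < 0
    one_sided = p_greater_0 >= level or (1.0 - p_greater_0) >= level
    return {
        "mean": statistics.fmean(ordered),
        "ci": (lower, upper),
        "p_greater_0": p_greater_0,
        # Either criterion counts as an effect "of significance"
        "credible": ci_excludes_zero or one_sided,
    }


# Simulated draws for illustration only (NOT results from the experiments)
rng = random.Random(1)
simulated = [rng.gauss(0.3, 0.4) for _ in range(32000)]
print(summarise_posterior(simulated))
```

Note that the second criterion is the weaker one: a 95% central interval that excludes 0 implies that at least 97.5% of the posterior mass lies on one side of 0.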
The positive estimate of the intercept of this model reflects the fact that recall-based productions involving dass were significantly more frequent than productions with unintroduced complement clauses (cf. Table 1). Furthermore, this model shows a clear effect of presence or absence of dass in the stimulus. When dass was present in the stimulus, participants were clearly more likely to use dass (∼92% of the time) than when dass was not part of the stimulus (∼68%) (hence the positive coefficient for the factor DassStimulus). The main effect coefficients of NameStress and VerbStress and the interaction have values closer to 0, with 0 well within the respective 95% Credible Intervals. Consequently, this experiment does not provide evidence in favour of a rhythmic effect on complementiser use in German. In fact, there is some evidence that the influence of the rhythmic environment on complementiser use is qualitatively different from the corresponding effect in English, as reported by Lee & Gibbons. In the following, we report on a reanalysis of Lee & Gibbons' experiment and compare it to our results.
3.1.7 Comparison to Lee & Gibbons (2007)
Based on the data reported in Lee & Gibbons, we can compare the effect of NameStress in our experiment to the corresponding effect that Lee & Gibbons obtained for English. To this end, we re-created a data frame with the 686 valid productions they reported (Lee & Gibbons 2007: 453). We then fit a Bayesian generalised linear mixed effects model with a Bernoulli response distribution and logit link, with that-mention as the dichotomous dependent variable. The independent variables were the ones used by Lee & Gibbons: i. ThatStimulus, i.e. the presence/absence of that in the stimulus sentence, ii. NameStress, i.e. the stress status of the initial syllable of the proper name serving as embedded subject (stressed vs. unstressed) and iii. the source of the recall cue (matrix clause or embedded clause). As we do not have any information about the distribution of data points over items and participants, this model lacks a random effect term. As in the model for our data, we applied simple coding with orthogonal sum contrasts for the fixed effects (as in our model, the levels were coded as .5 and -.5, respectively). As priors for the fixed effects, we used normal distributions with mean 0 and standard deviation of 1, as we did in the case of Experiment 1 reported above. This model yields the coefficients in Table 4, i.e. significant effects for all three independent variables.
The comparison of our experiment on the one hand, and the reanalysis of the experiment by Lee & Gibbons on the other, reveals two noteworthy differences: first, in contrast to the data from our Experiment 1, the participants in Lee & Gibbons' experiment on English obviously use the complementiser to a far lesser extent (cf. the negative intercept of the model); secondly, the coefficient for NameStress is clearly positive in the case of Lee & Gibbons' experiment. In fact, this value lies outside of the 95% CI for the posterior distribution of the NameStress coefficient in our experiment (in Experiment 1, the probability that the NameStress coefficient assumes a value as extreme as in the English experiment, i.e. .55 or greater, is very small, viz. 0.015).
In sum, Experiment 1 yields no evidence for a rhythmic effect on complementiser inclusion/omission in German. To the contrary, a rhythmic effect of the size found by Lee & Gibbons is highly unlikely given our data. However, because of the considerable data loss due to a great number of invalid recalls, the validity of this experiment may be compromised. We therefore set out to replicate this experiment with an easier distractor task in order to reduce memory load and to obtain a higher number of valid responses.

Experiment 2
Experiment 2 was a replication of Experiment 1 with a few alterations to reduce memory load; this was done to achieve a higher number of valid productions. As in Experiment 1, participants read sentences silently in order to produce them afterwards in response to a three-word recall cue. Unlike in Experiment 1, the stimuli were not sentence pairs, but combinations of a target sentence as first stimulus and a simple arithmetic task as second stimulus.

Participants
Thirty-two students from Frankfurt and surrounding areas in Hesse, Germany, took part in the experiment. They identified as native speakers of German. Participants were paid 10 Euros. None of them participated in Experiment 1.

Design and materials
In order to create an easier task, the second stimulus sentence of each of the 56 sentence pairs (involving the 32 experimental sentences) of the item lists of Experiment 1 was extracted and paired with a simple addition task. Correspondingly, the stimuli were 56 combinations of a sentence and an arithmetic task, in that order. The arithmetic task was a simple addition involving as one addend the numbers 1, 2 or 3, and as second addend a double-digit number (e.g. 2 + 34) – the calculation never involved crossing a group of ten.
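The constraint on the arithmetic tasks can be stated compactly in code. The following is a minimal sketch, not the procedure actually used to build the item lists: `make_addition_task` is a hypothetical helper, and the random selection and the seed are assumptions. "Never crossing a group of ten" is read here as: the units digit of the double-digit addend plus the small addend does not exceed 9.

```python
import random


def make_addition_task(rng):
    """Return (small, large): small in {1, 2, 3}, large a double-digit number,
    chosen so that small + large stays within the same group of ten."""
    small = rng.choice([1, 2, 3])
    while True:
        large = rng.randint(10, 99)
        # Reject candidates whose units digit would carry over into the tens
        if large % 10 + small <= 9:
            return small, large


rng = random.Random(0)  # arbitrary seed, for reproducibility of the sketch only
tasks = [make_addition_task(rng) for _ in range(56)]  # one task per stimulus
for small, large in tasks[:5]:
    print(f"{small} + {large}")
```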

Procedure
The procedure was the same as in Experiment 1 with the exception that the second stimulus of each item was a simple arithmetic task. Participants were instructed to read and memorise one sentence silently and to solve the arithmetic task afterwards. Every sentence appeared for exactly five seconds. Then the arithmetic task appeared and participants were to speak out loud the corresponding solution. The numbers stayed (60 seconds at most) on the screen until participants indicated that they had finished by pushing a button. Then the recall cue of the sentence appeared immediately and participants had to produce the sentence from memory and push the button when they had finished. At this point, participants could take short breaks (again, 60 seconds at most) before they moved on to the next item.

Scoring
The scoring scheme was the same as in Experiment 1 (see Section 3.1.4).

Results and discussion
Out of the 1024 recordings, 973 (=95%) were valid according to the above criteria, demonstrating that the recall-task was substantially easier in this experiment compared to Experiment 1. Table 5 depicts the distribution of recalls with vs. without dass, broken down by the characteristics of the rhythmic environment (left panel: preceding syllable; right panel: following syllable). As in Experiment 1, productions involving introduced complement clauses (with the complementiser) outnumber those with unintroduced complement clauses. Also, participants were more likely to recall the embedded subject as trochaic rather than iambic, and the embedding verb as ending in a stressed as opposed to unstressed syllable. In order to ascertain the effect of the rhythmic environment on complementiser use, we fit a Bayesian GLMM with the same parameters as in Experiment 1. The output of the model is shown in Table 6.
As in Experiment 1, the significantly positive intercept of the model shows a preference for productions involving dass over productions without dass. The main effect of dass-presence/absence in the stimulus (again, a positive coefficient) clearly reveals that participants used dass more often when it was part of the stimulus sentence; the posterior distributions for all other main effects and the interaction term are centered closer to 0. According to this model, the presence of a stressed initial syllable on the embedded subject name hardly promotes complementiser use (less than a 2% increase on average). Consequently, the size of the rhythmic effect reported for English complementiser clauses (coefficient NameStress in Table 4) is relatively unlikely given the data from Experiment 2, with P(NameStress > .55) = 0.054.
In sum, the lack of a significant effect of the rhythmic manipulation at the clause boundary in this experiment (and in the one reported above) does not hint at an influence of linguistic rhythm on dass-mention – contrary to what was found for that-mention in English complement clauses.

Evidence against rhythmic effects on complementiser use? A model selection approach via Bayes factors
In contrast to the study by Lee & Gibbons (2007) on English, the two language production experiments on German fail to provide evidence in favour of rhythmic effects on complementiser use. The posterior distributions for both rhythmic effects, NameStress and VerbStress, and their interaction, are centered relatively close to 0 in both experiments. Using Bayes factors, it is possible to ascertain the degree to which this absence of evidence provides evidence of absence of these effects (Kass & Raftery 1995). In general, Bayes factors can be used for hypothesis testing, as they express the extent to which the data support one hypothesis (as formulated in terms of a model for the data) over another hypothesis (i.e. a competing model). By means of Bayes factors, we compare the likelihood of the observed data given the "full" models (as reported in Tables 3 and 6) and the likelihood of the data given models from which factors of interest are eliminated, i.e. assuming a point null value for these factors (= models conforming to the null hypothesis). Specifically, we compare the likelihood of the data given the full models for Experiment 1 (Exp1), Experiment 2 (Exp2), and a model for the combined data set (Exp1 + Exp2) 3 with i. the likelihood of the data given a "reduced" model and ii. a "null" model. The reduced model assumes that the factor NameStress does not contribute to the manifestation of the dependent variable, i.e. to dass-mention; consequently, the coefficient of NameStress and the interaction involving NameStress are culled from both the fixed effects and the random effects terms of these models. The null models assume that both rhythmic factors, NameStress and VerbStress, have no influence on complementiser use; that is, both factors and the interaction are eliminated from these models.
Bayes factors are affected by the prior distributions assumed for the model coefficients. Following suggestions by Vasishth et al. (2018), we report Bayes factors under different priors that are all centered around 0 but have different spreads. The spread σ of the prior around 0 expresses the assumption that the null model is Normal (0, σ). Thus, smaller values of σ express more prior certainty that the true mean is 0. Apart from the weakly informative prior with sd = 1 (under which posteriors with means greater than 1 on the log odds scale are assumed to be quite likely), 4 we also chose tighter priors with sd = .5 and sd = .2 for the rhythmic effects NameStress and VerbStress.
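To give a feel for how much the three priors differ, the sketch below computes how much prior mass a Normal(0, sd) distribution places on effects larger than one unit on the log odds scale, in either direction. The helper name `prior_mass_beyond` and the choice of 1 as threshold are assumptions made for illustration; the calculation uses only the normal distribution function and involves none of the experimental data.

```python
import math


def prior_mass_beyond(threshold, sd):
    """P(|beta| > threshold) for a prior beta ~ Normal(0, sd)."""
    return math.erfc(threshold / (sd * math.sqrt(2.0)))


for sd in (1.0, 0.5, 0.2):
    # Prior mass on effects larger than 1 log-odds unit in absolute value
    print(f"sd = {sd}: P(|beta| > 1) = {prior_mass_beyond(1.0, sd):.4g}")
```

Under sd = 1, roughly a third of the prior mass lies beyond ±1, whereas under sd = .5 it is below 5%, and under sd = .2 it is negligible.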
The Bayes factors reported in Table 7 indicate the degree to which the data support the reduced models (top) or null models (bottom) over the full model. Bayesian convention (Jeffreys 1998; Lee & Wagenmakers 2014) holds that Bayes factors greater than 3 indicate moderate evidence, and Bayes factors greater than 10 constitute strong evidence in favour of the reduced/null models over the full model (the corresponding numbers are marked in bold in Table 7); Bayes factors close to 1 are deemed inconclusive, and Bayes factors smaller than .3 constitute evidence in favour of the full model (the latter are italicised in Table 7).
Table 7: Bayes factors (BF) in favour of the reduced models (top) or the null models (bottom) over the full models for Experiment 1, Experiment 2 and the model for the combined data set (Exp1 + Exp2). Bayes factors under three different prior distributions (mean = 0; sd = 1 or sd = .5 or sd = .2) for the fixed effects were computed. The value given is the mean from five samples, with the lowest and highest values from these samples in brackets. Bayes factors < .3 indicate evidence against the reduced/null models (marked with italics). Bolded cells indicate moderate (BF > 3) or strong (BF > 10) evidence in favour of the reduced/null models.
In the case of Experiment 1, the Bayes factors do not indicate a preference for the reduced model over the full one, and in fact provide some evidence against the reduced or null models. However, given the relative sparseness of the data from Exp 1, we suspect that the success of the full model in Experiment 1 is due to overfitting. 5 With accumulating data in Exp 2 and the combined data set, the reduced or null models are clearly supported over the more complex models. The comparison of the models for the combined data set does support the reduced and the null models; that is, the models that do without the rhythmic coefficients NameStress and VerbStress are more plausible given the data than models including these factors.
Note that the support for the reduced or null models is weaker the tighter the priors are. This is to be expected given that the reduced or null models are relatively similar to a model that assumes a narrow prior around 0 for the effect in question.
In sum, for the combined data analyses (which have all the data available), the model comparisons using Bayes factors provide either moderate or strong evidence in favour of the null/reduced models. That is, local linguistic rhythm at the clause boundary does not appear to affect the syntactic choice between introduced and unintroduced complement clauses in spoken German.
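The conventional reading of Bayes factors used in this section can be written down as a simple decision rule. In this sketch, `classify_bayes_factor` is a hypothetical helper and the example values are invented for illustration; treating everything between .3 and 3 as inconclusive is a simplifying assumption (the text above only says that values close to 1 are inconclusive).

```python
def classify_bayes_factor(bf):
    """Label a Bayes factor expressed in favour of the reduced/null model."""
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf > 10:
        return "strong evidence for the reduced/null model"
    if bf > 3:
        return "moderate evidence for the reduced/null model"
    if bf < 0.3:
        return "evidence for the full model"
    # Assumption: the whole range from .3 to 3 is treated as inconclusive
    return "inconclusive"


# Invented example values, NOT the Bayes factors reported in Table 7
for bf in (0.1, 1.2, 4.5, 25.0):
    print(bf, "->", classify_bayes_factor(bf))
```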

Experiment 3 – spoken language corpus
In order to validate the results of the language production experiments, we examined the dgd archive (http://dgd.ids-mannheim.de), the largest collection of spoken German corpora. 6 For our purpose, we chose sub-corpora that contained transcriptions of unscripted speech by native speakers of Standard German only, namely the FOLK corpus and the Freiburg corpus, which comprise 2.6 million word tokens in 270 hours of speech in total. The dgd archive does not provide syntactic annotations other than part-of-speech (POS) tagging. Therefore, the search for relevant structures turned out to be rather time-consuming because the data had to be sifted, and the relevant prosodic features annotated, by hand.

Method
We searched for structures with an embedding verb directly followed by a complement clause with a proper name at the top. To find the relevant sentences, we looked up all bigrams with any of the previously identified, eight potentially embedding verbs (in any inflectional form) followed by a proper name, and the respective trigrams with the intervening complementiser dass. In all cases, we checked whether the proper name was indeed the subject of a subordinate clause headed by the embedding verb (especially for the searches without complementiser, this was very often not the case). Also, we discarded all instances in which the verb was part of a comment clause rather than a matrix clause. The search was hampered by the fact that the POS tag for proper names was identical to the POS tag for negations ("NE"), increasing the number of false hits substantially.
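The bigram and trigram search described above can be sketched as follows. This is an illustration under several assumptions, not the query actually run on the dgd archive: the corpus is represented as a list of (word, POS tag) pairs, verb tags are assumed to start with "V", verb forms are matched by stem prefix (which misses irregular forms such as weiß or dachte), and the example sentences are invented. The stems are written as they appear in Table 8.

```python
# Stems of the eight embedding verbs; prefix matching is a simplifying assumption
VERB_STEMS = ("denk", "find", "glaub", "hoer", "hoff", "mein", "sag", "wiss")
COMPLEMENTISERS = ("dass", "daß")


def is_embedding_verb(word, tag):
    """True if the token is tagged as a verb and starts with one of the stems."""
    normalised = word.lower().replace("ö", "oe")
    return tag.startswith("V") and normalised.startswith(VERB_STEMS)


def find_candidates(tagged_tokens):
    """Return (verb_index, has_dass) for verb + NE and verb + dass + NE sequences."""
    hits = []
    for i, (word, tag) in enumerate(tagged_tokens):
        if not is_embedding_verb(word, tag):
            continue
        following = tagged_tokens[i + 1:i + 3]
        if following and following[0][1] == "NE":
            hits.append((i, False))  # bigram: verb directly followed by a name
        elif (len(following) == 2
              and following[0][0].lower() in COMPLEMENTISERS
              and following[1][1] == "NE"):
            hits.append((i, True))   # trigram: verb + dass + name
    return hits


# Invented example sentences, not corpus data
with_dass = [("ich", "PPER"), ("glaube", "VVFIN"), ("dass", "KOUS"),
             ("Peter", "NE"), ("kommt", "VVFIN")]
without_dass = [("ich", "PPER"), ("glaube", "VVFIN"),
                ("Peter", "NE"), ("kommt", "VVFIN")]
print(find_candidates(with_dass), find_candidates(without_dass))
```

Such a search only yields candidates; as noted above, every hit still has to be checked by hand, because a name following the verb is often not the subject of a complement clause.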

Results and discussion
In the end, the search yielded a very small sample of 41 sentences, of which two cases had to be dropped because of disfluencies in the critical region (i.e. at the clause boundary). We coded the stress quality of the final syllable of the embedding verb and the initial syllable of the embedded subject proper name. Table 8 displays the instances of complement clauses with vs. without dass, broken down by the stress quality of the embedding verb (left panel) and the embedded subject (right panel). All in all, the results suggest that unintroduced complement clauses are much more common than introduced ones in the spoken corpus.
We employed Fisher's exact tests to test the (in)dependence of dass-mention and i. the quality of the preceding syllable (final syllable of embedding verb) and ii. the quality of the following syllable (initial syllable of the subject). Even though we observe more dass-mention when syllables at either side of dass are stressed, as predicted, neither test provides compelling reasons to discard the null hypothesis that dass-use is statistically independent of the stress quality of surrounding syllables (cf. test statistics in Table 8).
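For readers unfamiliar with the test, the following sketch implements a two-sided Fisher's exact test for a 2 × 2 table directly from the hypergeometric distribution. It is a generic illustration: the function names are hypothetical, the cell counts in the example are invented and are not the counts in Table 8, and the simple cross-product odds ratio computed here may differ from the conditional estimate that statistical software reports.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at most as likely as the observed one
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-7))


def sample_odds_ratio(a, b, c, d):
    """Cross-product odds ratio; None if undefined because b * c == 0."""
    return None if b * c == 0 else (a * d) / (b * c)


# Invented counts: rows = stressed / unstressed syllable, columns = with / without dass
print(fisher_exact_two_sided(3, 1, 1, 3), sample_odds_ratio(3, 1, 1, 3))
```

With an empty cell in the table, the cross-product odds ratio cannot be computed, which is one way in which an undetermined odds ratio can arise in small samples.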

Experiment 4 – written language
The previous experiments do not hint at an effect of linguistic rhythm on the choice between introduced and unintroduced complement clauses in spoken German. However, we know that certain types of language use, styles or registers are especially prone to observe constraints on prosodic well-formedness. A host of recent research suggests (see e.g. the collections of papers in Frazier & Gibson 2015; Kentner & Steinhauer 2017) that prosody plays a significant role in the processing of written language, in line with the Implicit Prosody Hypothesis (Fodor 1998; 2002). This may seem paradoxical because written language generally lacks explicit cues to prosodic structure (e.g. Chafe 1988), but see Evertz & Primus (2013) and Ktori et al. (2018) on sublexical cues to word stress. In any case, psycholinguistic research strongly suggests that readers have immediate access to prosodic features like stress (Ashby & Clifton 2005) when reading words, and they make use of this information when parsing sentences (Bader 1996; Breen & Clifton 2011; Kentner 2012; Kentner & Vasishth 2016). Writers are likewise known to structure their text in a way that aligns prominent words with sentence positions that are likely to receive a (nuclear) accent when the sentence is spoken (Bolinger 1957; Anttila et al. 2018). Furthermore, the process of writing itself appears to be accompanied by prosody. This is at least suggested by Fuchs & Krivokapić (2016) who show that pauses between key strokes during spontaneous writing are correlated with prosodic breaks in a read rendition of the same text. Similarly, in handwriting, writers leave greater spaces between letters belonging to different prosodic constituents (syllables, feet, phonological words) than between letters that belong to the same prosodic constituent (Domahs et al. 2016; Bronner et al. 2018).
Given the role of prosody in written language, we ask whether the choice between introduced and unintroduced complement clauses in the written modality remains unaffected by rhythmic-prosodic features of the words surrounding the clause boundary, as in the studies on spoken language (Experiments 1, 2, and 3). This would suggest that this specific syntactic decision is a very stable one and cannot easily be undone in favour of phonological well-formedness – in contrast to other types of syntactic variation. On the other hand, an effect of linguistic rhythm in the written modality would not make us assume bidirectional interaction between syntactic encoding and phonological encoding in language production. Writers can revise their wording and the final text does not give away the spontaneous syntactic choice writers make, but only the end-result of a process that may well involve revisions. Still, an effect would at least show that writers do not tie themselves down to their initial syntactic decision concerning the form of the complement clause but consider the local rhythmic environment at the clause boundary when formulating complement clauses.
Table 8, test statistics: Fisher's P = 0.1818, odds ratio = 5.755; Fisher's P = 0.1578, odds ratio undetermined. Counts by embedding verb: denk (0/6), find (0/3), glaub (3/17), hoer (0/1), hoff (0/1), mein (1/4), sag (0/0), wiss (2/1).

Method
To answer the above questions, we examine the TÜPP-D/Z corpus 7 that comprises all editions of the daily newspaper die tageszeitung (taz) from 1 September 1986 until 7 May 1999. This corpus contains 11.5 million sentences comprising 204.4 million word tokens. Using CSniper (Eckart de Castilho et al. 2012), we searched for all tokens of the 8 embedding verbs that were immediately followed by a complement clause (with or without complementiser) with a proper name as clause-initial subject. This search yielded 2751 complement clauses: 1476 with, and 1275 without, the complementiser dass. Two student assistants hand-annotated the stress status of i. the final syllable of the embedding verb (stressed or unstressed), and ii. the initial syllable of the embedded subject proper name. While the verb-final syllable was either a stressed syllable or an unstressed schwa-syllable, the initial syllable of the proper name could bear primary stress (as in Theodor), secondary stress (as in Manuela) or remain unstressed (we assigned all syllables that directly precede the stressed syllable to this category, independently of the vowel quality, e.g. the first syllable in Nicole). In a couple of instances the stress status could not be determined, either because the word shows variable stress (e.g. Saddam, which may be stressed on either the first or the second syllable) or because the name, and therefore its stress pattern, was unknown. The affected cases (65 or 2.4%) were discarded from further analysis, so 2686 cases remain, 1429 of which feature the complementiser.
As the use of the complementiser is assumed to be affected by specifics of the clause-initial subject (Roland et al. 2006; Jaeger 2010), we determined the usage frequency and length of this word in order to consider these factors in the analysis. As an approximation for the phonological length, we simply took the number of (orthographic) characters. Furthermore, we devised a simple measure of frequency that is sensitive to the (rather narrow) temporal context of the corpus. We did this because we assume the frequency of names in a newspaper corpus to be heavily affected by the nature of the events reported – at least more so than in the case of generic words. To this end, we calculated the logarithm of the absolute frequency of the name within the corpus sample and used this as a predictor in our statistical model. Table 9 shows the distribution of complement clauses with vs. without dass. This tabulation indicates that the rate of complementiser use is highest (.62) after embedding verbs ending in a stressed syllable, and lowest (.49) after verbs ending in a schwa-syllable. Moreover, the rate of complementiser use decreases as the degree of stress on the initial syllable of the embedded subject increases.
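The two control predictors for the clause-initial subject described above can be derived as in the following sketch. The helper `subject_predictors`, the use of the natural logarithm and the example names are assumptions for illustration; the names below are invented input, not a sample from the corpus.

```python
import math
from collections import Counter


def subject_predictors(subject_names):
    """For each clause, return the log frequency of its subject name within
    the sample and the name's length in orthographic characters."""
    counts = Counter(subject_names)
    rows = []
    for name in subject_names:
        rows.append({
            "name": name,
            # Logarithm of the absolute frequency of the name in the sample
            "log_freq": math.log(counts[name]),
            # Number of characters as a proxy for phonological length
            "length": len(name),
            "log_length": math.log(len(name)),
        })
    return rows


# Invented example: one subject name per extracted complement clause
sample = ["Kohl", "Fischer", "Kohl", "Kohl"]
for row in subject_predictors(sample):
    print(row)
```

Because the frequency is counted within the sample itself, a name that is prominent only during the period covered by the corpus still receives a high value, which is the point of the measure.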

Results and discussion
In the following, we report the Bayesian GLMM. As in the previous models (Experiments 1 and 2), we set weakly informative priors with mean 0, sd = 1, and LKJ prior set to 2. 8 The model included as fixed effects i. the stress quality of the final syllable of the embedding verb (VerbStress), ii. the stress quality of the initial syllable of the proper name serving as embedded subject (NameStress), iii. the logarithmised frequency (FreqSubj) and iv. the logarithmised length of the embedded subject (LengthSubj). Note that, in contrast to the models above, NameStress was treated as a categorical variable with three levels because instead of a dichotomy between stressed and unstressed we are dealing with three degrees of stress in the sample at hand: unstressed (serving as baseline), secondary stress, and primary stress. The embedding verb (VerbLemma) was entered as random intercept into the model, with random slopes for NameStress and VerbStress. Four sampling chains with 4000 iterations each were run for the sampling of the model, including a warmup period of 2000 iterations. The parameters of interest are shown in Table 10.
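Treating NameStress as a three-level factor with unstressed as baseline amounts to two indicator predictors, each contrasting one stress level with the baseline. The sketch below assumes standard treatment (dummy) coding; the function and column names are hypothetical, and the example names are the ones used above to illustrate the three categories.

```python
LEVELS = ("unstressed", "secondary", "primary")


def treatment_code(level):
    """Code a NameStress level against the baseline 'unstressed'."""
    if level not in LEVELS:
        raise ValueError(f"unknown stress level: {level}")
    return {
        # Both indicators are 0 for the baseline level
        "NameStress_secondary": int(level == "secondary"),
        "NameStress_primary": int(level == "primary"),
    }


# Stress on the initial syllable of the example names mentioned in the Method section
examples = {"Nicole": "unstressed", "Manuela": "secondary", "Theodor": "primary"}
for name, level in examples.items():
    print(name, treatment_code(level))
```

Under this coding, each NameStress coefficient expresses the change in the log odds of dass-mention relative to names with an unstressed initial syllable.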
The negative coefficient for FreqSubj with a Credible Interval not including 0 confirms predictions based on Jaeger (2010) and Roland et al. (2006): high-frequency embedded subjects promote dass-omission, i.e. unintroduced complement clauses. As predicted, the coefficient of VerbStress is positive. However, the 95% Credible Interval contains 0, so the evidence for this effect is weak at best. Still, a sizeable portion (89%) of the probability mass of the posterior is above 0.
Interestingly, the comma that necessarily marks the boundary between embedding verb and embedded clause, according to German orthographic rules, does not entirely nullify the effect of VerbStress; we suggest, in line with Truckenbrodt (2005), that this kind of clause boundary does not normally constitute a prosodic boundary in speech (although it certainly is a potential position for a prosodic break, see discussion in Truckenbrodt & Darcy 2010). In fact, it is possible, and perhaps a means to promote reading fluency, for prosodic constituents to straddle syntactic boundaries: In keeping with assumptions on trochaic phrasing by Lahiri & Plank (2010), final stress on the embedding verb would then open a prosodic position for the unstressed dass to be integrated in, as in the case of (7a). The same position would already be filled in the case of trochaic embedding verbs (7b), hence dass would remain unparsed in the prosodic representation. Unparsed syllables, however, constitute a violation of regularities concerning prosodic structure, and renditions like (7b) are therefore avoided.
The (likewise non-significant) coefficient of NameStress points in the opposite direction, suggesting that unstressed dass is more likely the lower the degree of stress is on the initial syllable of the following name. This tendency is clearly against predictions (and outside of our hypothesis space): We assumed, in line with Lee & Gibbons, that initial stress on the embedded subject would promote dass-mention for reasons of rhythmic well-formedness, i.e. to maximise the alternation of stressed and unstressed syllables.
We note that, as in Experiments 1, 2, and 3, this experiment does not suggest that the syntactic choice regarding complement clause structure is susceptible to the stress pattern of its subject. There is only very weak evidence for an effect of VerbStress in the predicted direction.

General discussion
Before discussing in detail the potential reasons for the sparse outcome regarding the effect(s) of rhythm, and potential implications, we note two general results from the experiments that are largely in line with insights from the previous literature on the topic of complementiser use.
First, both Experiments 1 and 2 replicate familiar findings regarding structural priming (Ferreira & Dell 2000; Ferreira 2003; Lee & Gibbons 2007): The participants showed a clear bias to mention the complementiser (and hence: a verb-final complement clause) when this conformed to the syntactic structure of the written stimulus (or to the presence of dass in it).
Secondly, the results of all four experiments confirm a hypothesis suggested by previous findings by Ferreira & Dell (2000), Hawkins (2004), and Jaeger (2010) concerning English complement clauses, namely that complementiser use is inversely correlated with the accessibility of the to-be-uttered material and with general ease of processing: Experiments 1 and 2 involved recall tasks and thus put the participants under cognitive pressure; in these experiments, productions involving the complementiser clearly outnumbered those with dass-less, unintroduced complement clauses. Also, the written corpus search in Experiment 4 yielded a higher rate of complement clauses including dass; written language is deemed to be less spontaneous and more effortful than normal speech, and this effort may be a factor (apart from the written norm) that promotes dass-mention. In contrast, the data from the spoken corpus (Experiment 3) shows the opposite: as in Auer (1998), unintroduced complement clauses were clearly more frequent in this spontaneously spoken data set, as it was likely produced with less effort when compared to the material from the other experiments.

Lack of rhythmic effect on dass-mention: Potential reasons and implications
The two sentence production experiments and the two corpus studies presented here largely fail to support our initial predictions concerning the role of rhythm in the choice between introduced and unintroduced complement clauses in German. The only evidence pointing in the predicted direction is the rather uncertain effect of VerbStress found in Experiment 4: writers apparently favour the complementiser dass, and hence a verb-final complement clause, when the embedding verb ends in a stressed syllable. We interpret this effect to reflect the propensity for rhythmic alternation in written language. We hasten to add that this effect does not necessitate the possibility of direct interaction between grammatical encoding and phonological encoding in models of sentence production (as advocated in e.g. Vigliocco & Hartsuiker 2002; Shih 2014). Rather, this effect is explicable with recourse to the notion of a monitoring loop (Levelt 1983; Hartsuiker & Kolk 2001) that checks the output after the formulation stage of language production has been completed, and may trigger stylistic repairs.
Apart from the uncertain effect of VerbStress in Experiment 4, the lack of a rhythmic effect in the direction predicted by Anttila et al. (2010), Shih (2014) or Vogel et al. (2015) in general, and Jaeger (2006) or Lee & Gibbons (2007) in particular, is consistent across the four experiments. While it is problematic to argue on the basis of "absence of evidence", this consistency warrants commentary (but note the suggestive evidence of absence obtained from the Bayes factor analysis in Section 3.3).
As alluded to in the introduction, the kind of syntactic variability studied here is qualitatively different from the kinds of syntactic variability studied by other authors. Specifically, the choice between a complement clause with vs. without the overt complementiser in German necessarily involves the choice between verb-final structures (with complementiser (4a)) vs. verb-second structures (without complementiser (4b)). No such word order difference is involved in the choice between English complement clauses with vs. without complementiser.
The fact that, in English, the difference between the two constructions merely affects the presence or absence of that invites the assumption that this seemingly syntactic difference is, in essence, a phonological one: that is, both complement clause variants may provide a structural position for the complementiser which, in the case of that-less complement clauses, simply remains unpronounced. If the presence or absence of that is indeed regulated by the phonological processing module, a rhythmic effect on complementiser use, as reported by Lee & Gibbons, is explicable and expectable without assuming a bidirectional interaction between grammatical and phonological encoding.
In the following, we submit an admittedly ad hoc and speculative, but testable, approach that accounts for the difference in susceptibility to phonological influences between complement clause selection in German (putatively no phonological effect), and other kinds of syntactic variation that have been shown to be affected by phonological constraints.
We argue that the encoding of clauses (or propositions for that matter) is less prone to be affected by rhythmic constraints than the encoding of phrases below the clause; we do so on the basis of the following assumption: the syntax of clauses needs to be specified earlier in sentence production than phrasal categories smaller than a clause; we conjecture that the specification of a coarse syntactic skeleton of clauses is in fact the prerequisite for phonological encoding, and that is why we find complement clause structure to be largely immune to influences of stress and rhythm.
This assumption is in line with a proposal by Ferreira (2000) who suggests that the formulation stage in language production involves, in a first step, the generation of basic syntactic frames or "elementary trees" (Frank 1992) that consist of a simple proposition, i.e. a predicate plus its extended projection including its arguments, in short: a clause; only once the argument slots are syntactically specified, e.g. with the corresponding NP treelets, their phonological encoding may begin. Evidence for this kind of hierarchical planning comes from a study by Bock & Cutting (1992) who found subject-verb agreement errors (8) triggered by the local noun books to be more likely within the work space of a clause (8a) than across clause boundaries (8b). This suggests that sentence production "proceeds in hierarchical rather than sequential fashion, with the planning of clauses preceding the sequencing of words" (Bock & Cutting 1992: 122). 9 Applied to the case at hand, we assume that the grammatical encoding of the complement clause is triggered once the embedding verb in the matrix clause is syntactically encoded (in the parlance of the language production literature: at the stage of "lemma selection"). However, the syntactic specification of the complement clause is encapsulated from the subsequent phonological encoding of the embedding verb.
Likewise, the syntactic skeleton of the complement clause is specified without regard to the phonological properties of the complement clause subject. That is, phonological imperfections that arise when the two clauses are fused cannot easily be undone because, at this point, the production system has already decided upon their syntactic form. If true, this explanation makes strong predictions about what kinds of syntactic variability are susceptible to phonological well-formedness conditions and what kinds are relatively immune. Testing this prediction is beyond the scope of this paper. However, based on the available evidence concerning rhythmic-phonological effects on syntactic encoding, we can at least provide a plausibility check. To this end, we give a synopsis of relevant studies and categorise the available evidence as follows (see Table 11).
The first set of studies (upper section of Table 11) suggests a rhythmic effect on the realisation of optional elements (e.g. the optional infinitive marker to in English that is often omitted in the context of unstressed syllables). As discussed above, a phonological effect on the mention of such elements is explicable without assuming bidirectional information flow between phonological and syntactic encoding; instead, we assume that these elements are part of the syntactic representation even when they are phonologically empty. That is, the presence/absence of these elements is exclusively regulated in the phonological encoding stage. Accordingly, the fact that the phrasal level affected is seemingly a clause in the case of Lee & Gibbons (2007) does not constitute counterevidence to our proposal.
The second set (middle section in Table 11) consists of studies in which speakers/writers exploit word order flexibility in order to achieve a felicitous rhythmic representation. The most prominent phenomenon in this set is the ordering of (mostly NP/DP) conjuncts, e.g. salt and pepper vs. pepper and salt (McDonald et al. 1993; Benor & Levy 2006; Lohmann 2014). The word order variations do not touch upon the semantics of these constructions; however, some of them may signal a higher degree of expressiveness (e.g. the determiner inversion in Schlüter 2005; Kentner 2018) or possibly involve stylistic mannerisms (e.g. the verb cluster ordering reported in Vogel et al. 2015). The syntax appears to be indifferent to at least some of these word order alternations: For example, determiner inversion engenders split constituents (quite a long report ∼ [a [[quite long] report]]) and thus violates rules concerning phrasal integrity.
Nevertheless, we assume that the phenomena in this set are relevant for the stage of grammatical encoding that is concerned with linearising the syntactic structure (the "positional level" of sentence formulation according to Bock & Levelt 2002). In this set, the phrasal levels affected by linguistic rhythm are invariably below the clause level.
Finally, the third set (bottom section of Table 11) contains cases in which speakers/writers consider linguistic rhythm when choosing a particular syntactic construction. This set involves studies on genitive (Shih et al. 2015) and dative construction choice in English (Anttila et al. 2010) and the case of negated attributive adjectives (avoidance of stress clash: */? a not popular person but a not very popular person, Schlüter 2005). These phenomena involve syntactic decisions that go beyond the mere ordering of constituents (possibly touching upon the "functional level" of sentence production, cf. Bock & Levelt 2002). The phrasal categories affected by rhythm in this set are, again, smaller than a clause.
All in all, given this synopsis, it seems plausible that effects of linguistic rhythm on syntactic encoding are restricted to phrase categories that are smaller than a clause. The lack of a rhythmic effect in our experiments is consistent with this assumption. However, the nature, and the sparseness, of the data in Table 11 do not allow firm conclusions to be drawn. For one thing, there are, as far as we are aware, no other studies directly testing effects of linguistic rhythm on the production of higher syntactic levels such as the clause. The present study on complement clause structure in German appears to be the only one. Moreover, several studies only consider written material. As mentioned above, written corpora are not very informative about the process of language production, as they merely reflect the result of the process, and they do so in a modality that is often considered secondary to the spoken modality. Nevertheless, in the studies that consider written and spoken data, the findings are largely consistent. On a more general note concerning the current state of science communication, there is a very strong bias for positive results to be published, and negative or null results are hardly reported (Open Science Collaboration 2015). Therefore, the list in Table 11 has to be taken with some caution.

Conclusion
The experiments presented here were designed to ascertain the extent to which linguistic rhythm (i.e. the preference for rhythmic alternation of stressed and unstressed syllables) affects syntactic encoding in sentence production. To this end, guided by the example of Lee & Gibbons (2007), we tested whether the choice between introduced and unintroduced complement clauses in German is influenced by the immediate rhythmic environment at the clause boundary. Against the predictions that were derived from similar effects in the literature, we failed to find compelling evidence for rhythmic influences on sentence structure in two recall-based production experiments and in two corpus studies on German complement clauses.
Based on a synoptic view of relevant studies, we propose a taxonomy of different phenomena that have been claimed to show rhythmic effects on syntactic structure building. This taxonomy distinguishes rhythmic effects on i. the (non-)realisation of optional elements; ii. word order; and iii. the choice of a particular syntactic construction. The (non-)realisation of optional elements can be interpreted as exclusively affecting the phonological encoding of a sentence, leaving the syntactic representation untouched. This type of variation is therefore deemed uninformative regarding potential phonological feedback to grammatical encoding in language production. Word order and construction choice, however, do represent stages of syntactic encoding. Conspicuously, in the studies we revisited, the phrasal levels that were shown to be affected by rhythm are always smaller than a clause, i.e. they concern VP-, PP-, or NP-internal structure. We take this to suggest that the lack of a rhythmic effect in the case of German complement clause structures is due to their being of a higher syntactic order: they represent a clause, specifically the structure of the CP. We argue that only once the syntactic slots that the clause keeps available are syntactically specified may the phonological encoding begin.
Accordingly, we assume that decisions about the syntactic structure of clauses remain largely unaffected by rhythmic-phonological encoding effects in language production.

Additional File
The additional file for this article can be found as follows:
• Supplementary file 1. Data and R-Scripts for reproducing the analyses of Experiments 1, 2, and 4. DOI: https://doi.org/10.5334/gjgl.565.s1