Speaking Rate, Oro-Laryngeal Timing, and Place of Articulation Effects on Burst Amplitude: Evidence From English and Tamil

The relationship between speaking rate and burst amplitude was investigated in plosives with differing oro-laryngeal timing: long-lag voice-onset time (VOT) (North American English) and short-lag VOT (Indian Tamil). Burst amplitude (reflecting both intraoral pressure and flow geometry of the oral channel) was hypothesized to decrease in pre-vocalic plosive syllables with the increase in speaking rate, which imposes temporal constraints on both intraoral pressure buildup behind the oral occlusion and respiratory air flow. The results showed that decreased vowel duration (which is associated with increased speaking rate) led to decreased burst amplitude in both short- and long-lag plosives. Aggregate models of bilabial and velar plosives (found in both languages) suggested lower burst amplitudes in short-lag stops. Place-of-articulation effects in both languages were consistent with models of stop consonant acoustics, and place interactions with vowel duration were most apparent with long-lag English stops. The results are discussed in terms of speaking rate and language-internal forces, contributing to burst amplitude variation and their implications for speech perception and potential to affect lenition phenomena.


Background
Speaking rate is a highly variable phenomenon affected by a host of linguistic and extra-linguistic factors, resulting in adjustments to articulatory targets (Gay, 1981) and velocities (Adams et al., 1993;Dromey & Ramig, 1998).The phonetic consequences of modulations of speaking rate broadly affect both consonant temporal characteristics (e.g., voice-onset time [VOT]; Kessinger & Blumstein, 1997) and spectral properties (e.g., vocalic trajectories; Lindblom, 1983).There has been considerable attention given to the phonetic effects of speaking rate in the context of (non-phonologized) consonant lenition, with much of the literature focused on intervocalic lenition processes in a variety of languages as indexed by consonant intensity (Priva & Gleason, 2020;Warner & Tucker, 2011; see Lavoie, 2001 for a review) and consonant duration.Intensity (or root-mean-square [RMS] amplitude) in this literature is a measure of periodic, sonorous energy in intervocalic consonants often without complete closure and lacking the release characteristics of initial plosives.The general pattern is for increased speaking rate (either implemented directly or as a consequence of style or register shifts) to result in increased intensity in intervocalic consonants and reduced consonant duration (Priva & Gleason, 2020;Soler & Romero, 1999;Warner & Tucker, 2011;Winter & Grawunder, 2012).Lenition of this sort affects perception (Tucker, 2011) such that words with phonetically reduced variants are difficult to process.While a clear picture is emerging of how speaking rate modulates the acoustic characteristics of postvocalic and intervocalic consonants, very little is known about the relationship between speaking rate and the amplitude characteristics of initial prevocalic consonants, and in particular their release burst.

Obstruent bursts and extra-linguistic variation
The release of an oral stop consonant results in transient energy across the frequency spectrum, which is generally followed by broadband noise, collectively taken as the "burst."The transient portion of the burst, reflecting the release of pressurized intraoral air posterior the oral occlusion, together with the noise or aspiration following the transient, reveals the resonant properties of the tube anterior the oral occlusion being excited by air escaping the lungs prior to phonation.For example, velars generally have a mid-frequency prominence as a result of a relatively long oral cavity.Indeed much of the literature does not examine burst variation directly, but rather intraoral pressure (P o ), which serves as a proxy measure suggesting possible effects on burst characteristics.
Although a buildup of intraoral pressure P o is necessary for the plosive burst characterizing oral stop consonants, it is subject to naturally occurring variation caused by extra-linguistic factors like stress and emotional state (Laukkanen et al., 1996) or affective voice quality such as whispering (Murry & Brown, 1976).While the extra-linguistic feature of speaking rate would seem, at first blush, to affect P o (and by proxy, release bursts), models of P o for voiceless stops suggest that it is not straightforwardly affected by the temporal constraints imposed by increased rate.Müller & Brown (1980) (following Rothenberg, 1966) show P o in voiceless stops has a rise time on the order of 50 ms (beginning before complete oral occlusion), which is potentially sufficient time to achieve P o maxima in speeded speaking rate conditions where closure duration at fasts rates are generally greater than 50 ms (Allen & Miller, 1999;Pickett et al., 1999).Nonetheless, some studies suggest that P o is indeed correlated with speaking rate.In a study of Swedish aspirated "pa" spoken by a single speaker at different rates (1-3 syllables/s), Hertegård et al. (1995) found a high correlation between subglottal pressure and P o , which decreased by approximately 4-5 cm H 2 O at fast speaking rates.C. J. Miller and Daniloff (1977) examined P o in three speakers' productions of English CVCs in monologues (thought to be comparable to conversational speaking rate) and citation contexts.For all three speakers, P o for voiceless stops was at least 1 cm H 2 O lower in monologue than citation contexts.Speaking rate also affects physiological properties of respiration.Lung volumes in speeded speaking tasks are lower than in slow speech (Dromey & Ramig, 1998), which in turn results in lower subglottal pressure (Sundberg, 2018) roughly corresponding to P o during complete closure in plosive production.
Given the potential effect of speaking rate variation on P o , comparable effects on burst amplitude are expected; however, there is a gap in the literature directly addressing this question.There is some indirect evidence on the relationship between speaking rate and burst amplitude, which shows mixed effects.In their study of P o and speaking style, C. J. Miller and Daniloff (1977) also reported on peak absolute sound pressure level at consonant release, which showed mixed results between the monologue-and citation-style consonant bursts.Burst amplitude was an absolute measure of the maximum sound pressure level in the RMS trace of release.One of the three speakers showed a correlated effect of speaking style on burst amplitude, which was lower in monologue speech (presumably articulated at a faster rate) than citation in all three places of articulation.The two other speakers exhibited varying effects of speaking style.The authors note that in their data, burst amplitude was correlated with both P o and closure duration in two speakers, suggesting that it is not straightforwardly a reflection of P o and that aerodynamic variables along with additional physiological parameters must be incorporated into models of normal speech.Indeed speakers' passive expansion of the oral cavity (e.g., depressing tongue body, expanding cheeks) and tissue compliance have been shown to slow the buildup of P o during stop closure, and are highly variable both within and across speakers, thereby contributing to the complex effects on burst amplitude (Bell-Berti, 1975;Westbury, 1983).

Obstruent bursts and linguistic variation
Burst amplitude varies according to phonological features of voicing and place of articulation.Burst amplitudes in phonologically voiceless stops are generally higher than for voiced stops (Chodroff & Wilson, 2014;Stevens et al., 1999), reflecting greater intraoral pressure (P o ) and airflow at consonant release (Haag, 1977;Subtelny et al., 1966).Voicing in initial obstruents, which has as its primary articulatory implementation the synchronization of oral and laryngeal gestures, necessarily affects the pressurized air in the oral cavity (reflected in burst amplitude), but also results in corresponding phonetic characteristics like vocal fold oscillation prior to closure release, depressed fundamental frequency in the following vowel, and so on (Lindqvist, 1972;Kingston & Diehl, 1994), all of which potentially affect P o , and consequently, burst amplitude.While burst amplitude is affected by the laryngeal feature of voicing, there is some limited evidence that it may not be affected by the fortis-lenis contrast (DiCanio, 2012).
Place of articulation affects the shape and amplitude of the burst.The spectral envelope reflects the length of the cavity anterior the oral closure (e.g., no cavity for bilabials, short cavity for alveolars, and long cavity for velars) and resonances (Stevens et al., 1999).Place of articulation affects burst amplitude as well, generally reflecting P o at varying cavity volumes and rates of airflow after closure release, with bilabials having the lowest amplitudes (Stevens, 1998;Stevens et al., 1999).Importantly, both spectral envelope and amplitude characteristics of the release burst in certain frequency bands have been shown to be perceptually relevant for both place of articulation and voicing, respectively (Blumstein & Stevens, 1979;Chodroff & Wilson, 2014;Krull, 1990;Ohde & Stevens, 1983;Repp & Lin, 1989;Winitz et al.,1972).

Research goals
The goals of the present research are to (1) test the hypothesis that burst amplitude in CV syllables decreases with the increase in speaking rate, (2) examine whether such speaking rate effects are more or less evident at various places of articulation, and (3) explore the interaction between speaking rate and oro-laryngeal synchronization (which is phonetically implemented in terms of short-and long-lag VOT) on burst amplitude.It is hypothesized that burst amplitude is related to VOT, with long-lag VOT being associated with high P o and burst amplitude.The faster initiation of vocal fold oscillation in short-lag VOTs suggests that P o is sufficiently low to promote enough of a trans-glottal pressure differential for the rapid onset of phonation.This would be manifested in lower burst amplitude.The study focuses on consonants broadly classified as "voiceless," that is, consonants where the primary acoustic and perceptual characteristic is VOT.Previous research on speaking rate, P o , and burst amplitude was conducted in languages (English and Swedish), where pre-vocalic word-initial voiceless plosives are necessarily aspirated.The present study examines the relationship between relative burst amplitude and speaking rate in speakers of North American English (Indo-European) and Indian Tamil (Dravidian), a language which lacks phonemic voicing and exhibits short-lag VOT word-initial plosives (Keane, 2004).
In comparing short-lag (Tamil) and long-lag (English) plosives, the present study addresses the contribution of speaking rate and oro-laryngeal timing dynamics to observed burst amplitude effects.Short-lag plosives from Tamil are examined instead of English short-lag stops (the voiced series to avoid aerodynamic and acoustic characteristics of phonologically voiced stops (such as closure voicing/voicing lead variability, passive cavity expansion during closure voicing, vocal fold adduction variability, F 0 perturbation; Docherty et al., 2011;Flege, 1982;Kingston & Diehl, 1994;Lindqvist, 1972;Lisker, 1986) and associated P o variability which might potentially obscure the effects of VOT on burst amplitude.These phonetic characteristics associated with English have not been reported for short-lag Tamil plosives, where there is no phonological contrast.Given that the hypothesized mechanism affecting burst amplitude variation in the current study is P o constrained by time in speeded conditions, the short-lag series from Tamil, rather than the phonologically voiced English series, was considered a more appropriate comparison with the long-lag English series.Language differences are, therefore, explained in terms of the effects of VOT type on burst amplitude.
This more general effect of the temporal constraints imposed by speaking rate may have been obscured in previous research by an absolute burst amplitude measure (C.J. Miller & Danilo, 1977).In this study, we examine the effect of speaking rate on relative measures of burst amplitude, where peak amplitudes in various bands of burst noise are analyzed in relation to the amplitude of the following vowel.Such a normalization accounts for the overall amplitude of the utterance and may be a better index of the perceptual adjustments made by listeners when assessing place of articulation (Ohde & Stevens, 1983).
Finally, the results of the study are discussed in the context of broader sources of variation, both language-internal (oro-laryngeal synchrony) and extra-linguistic (speaking rate), and directions for future study examining potential perceptual effects of burst amplitude variation are offered.

Speakers
The speech of 12 North American (Canadian and American) English speakers (six female and six male), aged 36-45 years, and 10 Indian Tamil speakers (five female and five male), aged 24-44 years, was analyzed.All participants were native speakers of their respective languages, with no reported speech or hearing disorders.Participants were recruited via email solicitation and through various social media platforms.

Procedure
Due to restrictions to laboratory access during the global pandemic, participants were asked to record themselves in a quiet room at their respective residences (in Canada, America, and India).Participants were instructed to use a wired microphone (either ear-bud or closed-ear gaming style) along with recording software (such as Audacity or Praat) installed on their home computer.English-speaking participants were instructed to make three recordings, one each for the monosyllables "pa," "ta," and "ka."For each recording, participants were asked to repeat the syllable at four self-paced speaking rates (slow, normal, fast, very fast) for approximately 5 s per rate.Instructions for Tamil-speaking participants were identical but with recordings for four places of articulation, bilabial ([pa]), dental ([t ⊓ a]), velar ([ka]), and retroflex ([ʈa]).Instructions included an example audio file of the syllable "pa" (American English as recorded by the author) modeling the rate manipulation and providing a general structure of the recording.Participants made recordings with sampling rates at or above 22 kHz and saved as .wavor .aifffiles.Given the nature of the instructions (5 s per rate) participants produced more fast-rate tokens than slower rate tokens.There was some concern that amplitude measurements would be affected by the variety of microphones and recording interfaces employed by the participants.No post hoc scaling was performed on raw amplitude measurements (burst) as difference measures (between the burst and adjacent vowel) served as dependent variables (see Section 2.3).

Measurements
All measurements were made by hand using Praat speech processing software by trained phoneticians.Two temporal measurements were made: duration of consonant release/burst (VOT), and vowel duration.The onset of consonant release was identified at the first appearance of broadband transient noise, which was often followed by a burst of frication noise.Both transient and frication were considered the "burst."Burst offset was identified at the zero-crossing before the first glottal cycle of the following vowel.The vowel onset (coinciding with the burst offset) was indexed by the first low-amplitude periodic oscillation.The end of the vowel (in the CV syllables) was marked by three co-occurring events: (1) a dramatic change in amplitude in the waveform, (2) a change in the energy in the formants accompanied by a change in complexity in the waveform indicating a loss of energy in F2 and F3, and (3) the onset of aperiodicity.
Following Stevens et al. (1999), the spectrum of the burst was measured using an averaging technique with a 5 ms Hamming window.The onset of periodic glottal oscillation of the vowel was avoided in the averaging window.In cases where consonant release was immediately (<5 ms) followed by glottal oscillation (generally in Tamil bilabial recordings), the unique measurement window was centered at the zero-crossing immediately preceding the onset of voicing.As one of our questions was motivated by potential place-specific effects of speaking rate-induced burst amplitude variation, three amplitude measurements were taken reflecting important frequency bands for place perception following (Ohde & Stevens, 1983): (1) AvF1, or the average amplitude of F1 in the vowel; (2) MaxHi, or the maximum spectrum amplitude in the burst above 3 kHz (for males) and 3.5 kHz (for females); and (3) MaxMid, or the maximum spectrum amplitude in the burst in the F2/F3 range.Following Stevens et al. (1999), which identified two regions in the burst spectrum important for place perception, two relational measures were computed for each token, high-frequency HiDiff (AvF1-MaxHi), and mid-frequency MidDiff (AvF1-MaxMid) burst amplitudes.Relational measures of burst amplitude have been used in other studies examining strength (DiCanio, 2012) and provide a token-specific normalization that mitigates the potential effect of varying microphone fidelity.Figure 1 shows the waveform and spectrogram representations of a typical token along with average spectra of the burst and vowel portions.

Speaking rate and vowel duration
Although some literature on speaking rate follows from a controlled implementation of speaking rate regulation procedures, where participants are instructed to synchronize their production with a metronome ensuring a consistent implementation of rate (e.g., J. L. Miller & Baer, 1983), much of the literature on rate effects proceed in a way similar to the present study, where participants vary their speaking speed in a subjective manner (e.g., Kessinger and Blumstein, 1997;Beckman et al., 2011).The modeling strategies in this literature rely on collapsing rate-dependent critical measurements (Gay, 1978;Kessinger & Blumstein, 1997;Soler & Romero, 1999).For example, dependent variables are analyzed as a function of a categorical independent rate variable with levels such as "slow," "fast."The present study is motivated by the question of acoustic consequences of durational shortening as induced by speaking rate, and as such, variability along the vowel duration continuum, treating it as a continuous variable.For this reason, vowel duration is taken as a proxy for subjective speaking rate, that is, it is assumed that faster speech results in shorter syllabic durations-but this may not necessarily be the case (e.g., speakers may vary the duration of inter-syllable intervals and thereby achieve slower or faster rates).To confirm this assumption, the relationship between subjective speaking rate and vowel duration was examined.The raw vowel duration data according to speaking rate and language are shown in Figure 2. A linear mixed-effects model (nlme) was fit to the vowel duration data with rate (slow, normal, fast, very fast) crossed with language (English, Tamil) as predictors.The model was fit with random slopes and intercepts for rate by subject.Model coefficients are given in Table 1.
The model suggests that subjectively implemented speaking rate adjustments affect vowel duration in the direction predicted, increasing with the decrease in subjective rate.The interaction between rate and language reflects the longer mean vowel duration for slow rate in English speakers than Tamil speakers, which flips the more general trend in the data, where vowel durations in Tamil are longer than in English in normal, fast, and very fast rates.To determine the effectiveness of incremental speaking rate increases in each language, separate linear models were built for English and Tamil.Table 2 shows the mixed model coefficients for each language, while Table 3 shows pairwise comparisons from estimated marginal means computed using the emmeans package.
Pairwise comparisons of rate effects on vowel duration in English speakers showed significant differences between all rates, suggesting shortening with each successive increasing rate.The difference in vowel duration between slow and normal rates in Tamil speakers was small and not significantly different, whereas all other rates showed significant differences in the expected direction.

Vowel duration effects on burst amplitude
Vowel duration was used as a stand-in for speaking rate in all of the following analyses.Linear mixed-effects models were fit to examine (1) short-lag (Tamil) and long-lag (English) VOT differences in the effect of vowel duration on MidDiff and HiDiff across all places of articulation and (2) the language-specific interaction effects of vowel duration and place of articulation on MidDiff and HiDiff.All models were fit with random slopes and intercepts for vowel duration by subject.
Model coefficients for both MidDiff and HiDiff are given in Table 4.The models show an overall negative effect of vowel duration on burst amplitude measures: with the increase in vowel duration, the difference in amplitude between the consonant and vowel decreases ( 35 dB/sec).The  models also show a small (though not significant) language difference with short-lag (Tamil) burst amplitudes being lower in the high-frequency band (t = 1.58, p = 0.12) than long-lag (English) burst amplitudes.Figure 3 shows the observed amplitude measurements as well as the model predictions for each language.
The results confirm the hypothesis that temporal constraints on articulation negatively affects the amplitude of release bursts in word-initial stops-with the increase in vowel duration, the difference in amplitude between the consonant burst and the following vowel decreases, or put another way, with the increase in vowel duration consonant bursts increase in amplitude.Model estimates also suggest that short-lag Tamil burst amplitude may be lower than in longlag English plosives, especially for the high-frequency prominence at vowel durations longer than 200 ms where confidence intervals no longer overlap.The language/VOT difference may be obscured by the differing consonant inventories in English and Tamil.To explore the possible language/VOT differences on burst amplitude, the next section models the effect of vowel duration on a subset of the data where English and Tamil share consonant places of articulation-bilabial and velar plosives.

Place-of-articulation and vowel duration effects on burst amplitude
To tease apart possible differences between short-lag (Tamil) and long-lag (English) plosives, a subset of the full data with only bilabial and velar stops was analyzed (shared by both Tamil and English).Fully crossed models (Vowel Duration × POA × Language) of MidDiff and HiDiff were fit to the bilabial-velar data with random slopes and intercepts for vowel duration by subject.Table 5 gives model coefficients for both MidDiff and HiDiff.
The model coefficients suggest that short-lag (Tamil) burst amplitudes (for bilabials and velars) are lower (higher difference measures, βs  7-8 dB in both the mid-and high-frequency promi- nences) than in long-lag (English) plosives (MidDiff: t = 2.02, p = 0.05; HiDiff: t = 1.91, p = 0.07).The effect of vowel duration on burst amplitude is comparable to the models including all places of articulation in Section 3.2.
Figure 4 gives the vowel duration varying model estimates of MidDiff and HiDiff for bilabials and velars in English.For the model of MidDiff burst amplitude, the relationship between bilabial and velar was different in Tamil relative to English.The interaction between POA and vowel duration suggests that vowel duration has a different effect on velars than bilabials across both languages, and the three-way interaction term, although not significant (t =−1.65, p = 0.098), suggests the effect of vowel duration on the relationship between bilabials and velars is different in the two languages (as evidenced by a slightly steeper slope for bilabials in English).
In the HiDiff model, the effect of vowel duration (across both POAs) on burst amplitude was different in short-lag (Tamil) relative to long-lag (English) plosives.The model also shows that the relationship between bilabial and velar was different in short-lag (Tamil) relative to long-lag (English).There was a significant interaction between POA and vowel duration suggesting that vowel duration has a different effect on velar burst amplitude than bilabial burst amplitude across both laryngeal specifications (English and Tamil).Finally, the three-way interaction term suggests differing effects of vowel duration on the relationship between bilabials and velars in the two languages/laryngeal specifications.
The effect of vowel duration on the various places of articulation in the two languages (with language-specific models) is explored in the next section.

Language-specific place-of-articulation effects on burst amplitude
Place-of-articulation effects in English and Tamil were analyzed separately to account for their differing stop-consonant inventories.Although both languages share bilabial and velar places, coronal places vary-Tamil with two, dental ([t ⊓ ]) and retroflex ([ʈ]).Linear models of burst amplitude were fit with place of articulation and vowel duration as predictors and the same randomeffects structure as the aggregate models above.

English long-lag plosives.
The effects of vowel duration and place of articulation on MidDiff and HiDiff in English are given in Table 6.
All fixed effects and interactions were significant as were the vowel duration interactions with alveolar and velar places of articulation (relative to bilabials).That is, the effect of vowel duration on the burst amplitudes of alveolars and velars is different from vowel duration effects on the burst amplitudes of bilabials.These interactions are evident in the model estimates shown in Figure 5.
The overall effect of place of articulation on burst amplitude measures are consistent with those given in Stevens et al. (1999) and correspond to peak P o given in Subtelny et al. (1966), with bilabials having the highest difference measures (lowest amplitude relative to the vowel) in both regions of the spectrum (MidDiff = 40.03dB; HiDiff = 46.93dB).Velars have a high-amplitude mid-frequency prominence (MidDiff = 23.43 dB) and a high-frequency prominence that is between bilabials and alveolars in amplitude (HiDiff = 34.5 dB).Alveolars have a mid-frequency amplitude higher (MidDiff = 28.85dB) than bilabials, and a highamplitude, high-frequency prominence (HiDiff = 24.05dB).

Tamil short-lag plosives. The effects of vowel duration and place of articulation on MidDiff and
HiDiff in Tamil are given in Table 7.
All fixed POA effects were significantly different from the reference bilabial.Similar to the long-lag English data, and consistent with Stevens et al. (1999), bilabials had the lowest burst amplitudes in both mid-and high-frequency ranges (MidDiff = 46.87dB; HiDiff = 52.50dB).Velars had the highest burst amplitude in the mid-frequency range, (MidDiff = 25.32 dB) and retroflexes followed by dentals had the highest burst amplitude in the high-frequency range (HiDiff ret = 38.80dB, HiDiff den = 38.50dB), again consistent with Stevens et al. (1999) which showed alveolars with the highest amplitude in that range.Figure 6 shows the model estimates of the burst amplitudes for each place of articulation as a function of vowel duration.The vowel duration effects on the mid-frequency burst amplitudes were similar at all places of articulation (i.e., there are no significant interactions).While the overall effect of vowel duration on high-frequency burst amplitudes was not significant (t =−0.70, p = 0.48), retroflex stops showed a significantly steeper increase in amplitude with the increase in vowel duration relative to bilabials.

VOT, vowel duration, and burst amplitude in English and Tamil
To better understand the oro-laryngeal timing differences in burst amplitude as a function of speaking rate, the relationship between VOT and vowel duration was modeled in the two languages.The results of the VOT model are given in Table 8 and estimates visualized in Figure 7.  Vowel duration has a clear positive effect on VOT.This result was consistent with literature showing that as speaking rate decreases, vowel duration and VOT increase in equal proportions (e.g., Kessinger & Blumstein, 1997;Theodore et al., 2009), as well as the laryngeal realism literature, suggesting that phonetic cues to phonological categories decrease in duration as speech rate increases (Beckman et al., 2011).The effect of vowel duration on VOT is different between the two languages, with short-lag Tamil plosives having a shallower slope.
Although there is an overall positive association between speaking rate and VOT, the size of the effect is reduced in Tamil short-lag implementation of VOT (75% of VOTs are less than 0.025 s), while the long-lag VOT status in English allows for a more flexible oro-laryngeal timing.Given the relative immutability of VOT in short-lag Tamil plosives, does it contribute to burst amplitude variability?The effect of VOT on burst amplitude measures were modeled separately in both languages.3.6 VOT effects on burst amplitude 3.6.1 Long-lag English plosives.Mixed-effects models of MidDiff and HiDiff were fit to the data with VOT as a predictor and random intercepts and slopes for VOT by subject.Model coefficients are shown in Table 9.
The results show that VOT is correlated with burst amplitude in both the mid-frequency and high-frequency prominences.Similar to vowel duration, with the increase in VOT, burst amplitude increases (difference measures decrease).
3.6.2Short-lag Tamil plosives.Mixed models were likewise fit to the Tamil short-lag burst amplitude data, with VOT as a predictor and random slopes and intercepts for VOT by subject.Table 10 shows the model coefficients, which suggest a significant positive effect of VOT on burst amplitude in only the mid-frequency prominence, while the effect is muted (t =−1.83, p = 0.07) in the high-frequency prominence.
The overall temporal constraining of the Tamil syllable (by increasing speaking rate, which minimally decreases VOT) primarily affects the mid-frequency burst amplitude.That is, the effects of VOT are comparable to the effects of vowel duration on Tamil short-lag burst amplitudes (Section 3.4).

General discussion
Linguistic factors such as syllable structure (Mackay, 1974) and phrasal length (Yuan et al., 2006), and extra-linguistic factors like emotional content of the speech (Mozziconacci & Hermes, 2000), and age and geographical background (Quené, 2005) of the speaker all have an effect on speaking rate.The current study investigated whether speaking rate variation imposes constraints on the articulation (and consequently aerodynamics and acoustics) of consonants via changes in burst amplitude, a proxy for intraoral pressure and glottal frication.Speaking rate (as evidenced in vowel duration) affected burst amplitude, such that decreasing vowel duration led to a decrease in  amplitude (relative to the following vowel) in speakers' consonant bursts.Although the effect is found in both long-lag (North American English) and short-lag (Indian Tamil) plosives, it is more pronounced in the long-lag series, as revealed by its interaction with place of articulation.
The results suggest that burst amplitudes of bilabials were lower in short-lag (Tamil) than in long-lag (English) plosives; the effect was smaller for velars.These differences were unpacked by examining the association between vowel duration and VOT.Consistent with the extant literature, long-lag VOT is more tightly associated with vowel duration when intended speaking rate increases or decreases.Although there is a relationship between vowel duration and VOT in short-lag plosives (as a function of rate adjustments), it is considerably more shallow.This suggests that the differing oro-laryngeal timing in the two types of plosives may be responsible for the burst amplitude variation.Long-lag VOT is strongly correlated with vowel duration as well as higher overall burst amplitudes than in short-lag plosives.When coupled with the general tendency for voiceless aspirated stops to have long closure durations (Lisker, 1957;Stathopoulos & Weismer, 1983), it can be deduced that the high burst amplitude results from sufficient time for P o buildup.Conversely, short-lag stops have lower burst amplitudes than English (suggesting lower P o ), with less pronounced effects of speaking rate.However, the automatic relationship between short-lag VOT and burst-amplitude need not necessarily follow in a way exemplified in the present data.For example, Korean fortis stops show high P o and have burst energy comparable to lenis stops despite having short-lag VOTs (Cho et al., 2002).This suggests that the correlation between short-lag VOT, low P o and low burst amplitude may be an articulatory-aerodynamic default when there is no phonological laryngeal contrast for stops in initial position.

Conclusion and future directions
We can conclude from this study that speaking rate, as manifested in vowel duration, affects the burst amplitude of preceding consonants, and that VOT interacts with that effect.These results are very much in line with a growing phonetics literature focused on rate variability and its contribution to phonological lenition (Kirchner, 2004;Priva & Gleason, 2020;Warner & Tucker, 2011), which has identified fast speech as affecting shorter duration and increased intensity of the periodic portion of intervocalic consonants.Lenition of consonants in these environments, which are often produced without a release burst or complete closure, is characterized by more intense sonorous low-frequency energy.The comparable measure in the current study of prevocalic consonants is decreased energy concentrated in the high-frequency bands of the spectrum.Both of these results are consistent with increased speaking rate resulting in essentially weaker consonants. 1  While the outcome of fast speech in the intervocalic consonant literature is a non-phonologized synchronic lenition in production, directions for future research building upon the present results might consider how the acoustic variability (resulting from either speaking rate or oro-laryngeal timing) in consonant bursts contribute to both our understanding of (1) general consonant perception and (2) listener-driven diachronic patterns in languages.
From a purely speech processing perspective, we might ask about the extent to which the phonetic association between vowel duration and burst amplitude is represented in the listener, that is, does a listener expect an association between vowel duration (rate-conditioned or otherwise) and reduced burst amplitude?If so, how does a listener's language experience affect their perception of this trading relationship (Repp, 1982)?Would the association between rate/vowel duration and burst amplitude be stronger long-lag than in short-lag plosives?Does the relationship serve as a perceptual cue for phonological voicing in English speakers (or other languages that contrast longand short-lag VOT)?This line of research will contribute to not only our understanding of cue

Figure 1 .
Figure1.Waveform and wide-band spectrogram representations (left) of a typical "pa" token spoken by a male North American English speaker and averaged spectra for the burst and vowel (right).Three dashed lines on the waveform represent burst onset, burst offset/vowel onset, and vowel offset.Critical spectral measurements described in Section 2.3 are indicated on the burst spectrum (black) and vowel spectrum (red).

Figure 2 .
Figure 2. Observed vowel duration according to speaking rate and language.Violin plots show the density of vowel durations, while notched boxplots give medians and interquartile range.

Figure 3 .
Figure 3. Predicted estimates (with confidence intervals) from linear mixed-effects models and observed values of MidDiff and HiDiff in long-lag (English) and short-lag (Tamil) plosives.

Figure 4 .
Figure 4. Predicted estimates (with confidence intervals) from linear mixed-effects models of MidDiff and HiDiff in bilabial and velar places in long-lag (English) and short-lag (Tamil)plosives.

Figure 5 .
Figure 5. Predicted estimates (with confidence intervals) of MidDiff and HiDiff as a function of place of articulation and vowel duration in long-lag English plosives.

Figure 6 .
Figure 6.Predicted estimates (with confidence intervals) and observed values of MidDiff and HiDiff as a function of place of articulation and vowel duration in short-lag Tamil plosives.

Figure 7 .
Figure 7. Predicted estimates (with confidence intervals)mixed-effects models and observed values of VOT in (A) long-lag English and (B) short-lag Tamil plosives.

Table 1 .
Mixed-Effects Model of Vowel Duration as a Function of Subjective Speaking Rate and Language.Note.References are Rate Slow and Language English .

Table 2 .
Mixed-Effects Model of Vowel Duration as a Function of Subjective Speaking Rate in (A) English and (B) Tamil.

Table 3 .
Pairwise Comparisons of Subjective Speaking Rates from Vowel Durations in (A) English and (B) Tamil.

Table 5 .
Linear Mixed-Effects Models of Relative Amplitude of English and Tamil Stop Bursts in (A) the F2/F3 Region (MidDiff) and (B) Above 3-3.5kHz (HiDiff), as a Function of Vowel Duration and Consonant Place of Articulation.

Table 6 .
Linear Mixed-Effects Models of Relative Amplitude of English Long-Lag Plosive Bursts in (A) the F2/F3 Region (MidDiff) and (B) Above 3-3.5kHz (HiDiff), as a Function of Vowel Duration and Consonant Place of Articulation.

Table 7 .
Linear Mixed-Effects Models of Relative Amplitude of Tamil Short-Lag Plosives in (A) the F2/F3 Region (MidDiff) and (B) Above 3-3.5kHz (HiDiff), as a Function of Vowel Duration and Consonant Place of Articulation.

Table 8 .
Mixed-Effects Model of VOT as a Function of Vowel Duration in Long-Lag English and Short-Lag Tamil Plosives.Note.Reference level is Language English .**p < 0.001.

Table 9 .
Linear Mixed-Effects Models of Relative Amplitude of Long-Lag English Plosive Bursts in (A) the F2/F3 Region (MidDiff) and (B) Above 3-3.5 kHz (HiDiff), as a Function of Voice Onset Time.

Table 10 .
Linear Mixed-Effects Models of Relative Amplitude of Tamil Short-Lag Plosives in (A) The F2/ F3 Region (MidDiff) and (B) Above 3-3.5 kHz (HiDiff), as a Function of Voice Onset Time.