Tracking word frequency effects through 130 years of sound change

Contemporary New Zealand English has distinctive pronunciations of three characteristic vowels. Did the evolution of these distinctive pronunciations occur in all words at the same time or were different words affected differently? We analyze the changing pronunciation of New Zealand English in a large set of recordings of speakers born over a 130 year period. We show that low frequency words were at the forefront of these changes and higher frequency words lagged behind. A long-standing debate exists between authors claiming that high frequency words lead regular sound change and others claiming that there are no frequency effects. The leading role of low frequency words is surprising in this context. It can be elucidated in models of lexical processing that include detailed word-specific memories.


Introduction
Living languages are always in flux. Not only do our inventories of words change, but also how we say the same words. Eventually, sound changes affect all relevant words the same way; at the completion of the change, all words containing the sound are affected. This generalization allowed the 19th century Neogrammarian linguists to reconstruct Proto-Indo-European by comparing lists of cognate words in different languages, laying the foundations of historical linguistics as a science.
The sound changes under investigation in this paper are cases of what Labov (2004) terms 'regular sound change'. That is, they can be described as the gradual transformation of the phonetic realizations of phonemes in a continuous phonetic space. More than a century after the insights of the Neogrammarians, the detailed dynamics of such regular sound changes are still not well understood. In particular, it is not agreed whether sound changes affect all words concurrently and uniformly, or whether different groups of words are differentially affected. Many scholars claim that frequent words lead gradual phonetic sound changes and that less frequent words follow (Bybee, 1985, 2000; Pierrehumbert, 2001). This claim is contested by prominent scholars in mainstream historical linguistics and sociolinguistics. They maintain that regular changes proceed uniformly, and that documented cases of word-specific effects actually involve mechanisms other than gradual phonetic sound change (see Labov, 2004, 2010, and reviews by Kiparsky (2014) and Garrett (2014)).
Deep understanding of this controversial issue has been severely limited by the lack of speech recordings at historical time scales. Here, we investigate this question using the longest historical archive of transcribed speech now available. Using this archive, we investigate frequency effects in a set of interconnected regular sound changes: the short front vowel chain-shift that has unfolded over 130 years in New Zealand English. Our analysis is a considerable advance on previous work on frequency effects because it (a) considers how word frequency effects evolve over time and (b) considers how word frequency effects manifest in interlinked sound changes involving two or more categories. The results reveal that words of different frequencies behave differently in a surprising way that can be elucidated by psycholinguistic theory.

Lexical frequency and regular sound change
According to theories in which people remember the detailed phonetic properties of individual words (such as Bybee, 2000; Phillips, 1984, 2006; Pierrehumbert, 2001, 2002), it is expected that frequent words should lead regular sound changes. The memory of each word's pronunciation is updated at a rate that is proportional to the frequency with which the word is encountered. 1 The distribution of remembered pronunciations affects subsequent productions of the word. Assuming that words are subject to a persistent bias in pronunciation when a change is in progress, and that the bias is constant, then frequent words should change faster simply because they undergo the bias more often (Pierrehumbert, 2001).
Indeed, for sound changes that involve the deletion or reduction of speech sounds, it has often been claimed that frequent words display the innovative variant more at any given time than infrequent words do. For example, a highly influential study of the loss of final /t/ claims that frequent words are leading the change, on the basis that the /t/ is more often absent in frequent words such as just or plant than in less common words like rust or lint (Bybee, 2000, 2002). However, this result (and others like it) is also consistent with a stable effect where frequent words simply have more deletion, independent of the ongoing change. It is well documented that frequent words tend to be articulated with less effort, even in the absence of sound change (e.g. Gahl, 2007; Jurafsky, Bell, & Girand, 2002; van Son, Bolotova, Lennes, & Pols, 2004; Wright, 1979). The presence of an effect during periods of both change and stability is not sufficient to demonstrate a relationship between the change and that effect. Indeed, the pervasiveness of these effects has been used to argue that frequency does not actually play a role in regular sound change, because "structured variation is not in itself sound change; it can persist for centuries and even millennia" (Kiparsky, 2014:71). Dinkin (2008) argues that the 'real effect' of frequency on sound change is to exert a simple, stable lenition effect. In the terminology of mathematics and statistics (see Priestley, 1988), these authors suggest that a stationary process (one which is invariant with regard to shifts in time) is responsible for the fact that the distributions of phonetic realizations for frequent words typically include more lenited variants than those for rare words.
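The prediction can be made concrete with a toy simulation. The following is a minimal sketch, not the model of Pierrehumbert (2001): each word is reduced to a single remembered mean on one phonetic dimension, and the names `BIAS`, `LEARNING_RATE`, and `simulate_shift`, together with all parameter values, are illustrative assumptions. Each encounter nudges the stored mean by a constant bias, so the total shift scales with how often the word is encountered.

```python
# Toy illustration: constant production bias + frequency-proportional updating.
# All names and parameter values are hypothetical.

BIAS = 0.01          # constant phonetic bias applied to every produced token
LEARNING_RATE = 0.5  # how strongly one heard token updates the stored mean


def simulate_shift(encounters_per_generation: int, generations: int) -> float:
    """Return the total shift of a word's remembered pronunciation."""
    stored_mean = 0.0
    for _ in range(generations):
        for _ in range(encounters_per_generation):
            token = stored_mean + BIAS  # biased production of the word
            # The listener moves the stored mean part of the way to the token.
            stored_mean += LEARNING_RATE * (token - stored_mean)
    return stored_mean


low = simulate_shift(encounters_per_generation=2, generations=10)
high = simulate_shift(encounters_per_generation=20, generations=10)
print(f"low-frequency word shift:  {low:.3f}")
print(f"high-frequency word shift: {high:.3f}")
```

Under these assumptions the word encountered ten times as often shifts ten times as far, which is the sense in which frequent words are expected to lead.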
The claim that words of different frequencies are affected differently by regular sound changes amounts to the claim that there is a non-stationary effect, i.e. the effect of frequency is different during periods of language change than during stable periods, a fact that can be overlooked by researchers on both sides of the debate. As outlined above, there has been a strong tendency to interpret the presence of stationary frequency effects as indicative of a role of frequency in sound change. Conversely, the absence of stationary frequency effects has been interpreted as indicating that lexical frequency is not involved in sound change. For example, Labov (2010) investigates a range of sound changes in progress, finding negligible evidence for frequency effects at work. His conclusion is that frequency is not involved in sound change. However he also does not look for non-stationary effects. A non-stationary effect would reveal itself as a statistical interaction between word frequency and the progress of the change, and not as a constant effect.
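The distinction can be stated schematically as a regression. The symbols below are not in the original text and are introduced only for illustration: $y$ is a phonetic measure, $t$ the progress of the change, and $f$ the log frequency of the word.

```latex
y = \beta_0 + \beta_t\, t + \beta_f\, f + \beta_{tf}\,(t \times f) + \varepsilon
```

A stationary frequency effect corresponds to $\beta_f \neq 0$ with $\beta_{tf} = 0$. A non-stationary effect corresponds to $\beta_{tf} \neq 0$, so that the rate of change over time, $\beta_t + \beta_{tf} f$, depends on word frequency.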
If frequent words actually lead in a sound change, then the effect of word frequency should be largest when the change is progressing most rapidly, and smallest during stable periods in the language. A new study of the lenition of medial intervocalic /t/ in the ONZE corpus addresses this prediction (Hay & Foulkes, in press). Originally a voiceless plosive, /t/ in this position has become increasingly voiced as New Zealand English has developed. In the early stages of the change, there was a small difference in voicing rates between lower and higher frequency words, and this difference increased once the change was well underway. This result provides empirical support for the predictions of the models described above, in which the memory of a word's pronunciation is updated at a rate that derives from the frequency with which the word is encountered (Pierrehumbert, 2001).
Low frequency words are known to lead in a different type of linguistic change: analogical change (Bybee, 1985; Lieberman, Michel, Jackson, Tang, & Nowak, 2007; Phillips, 1984). In analogical changes, irregular words in a grammatical paradigm are replaced by words conforming to more regular patterns. For example, in the English past tense, irregular forms such as wept tend to become regularized (cf. weeped) by analogy with regular past tense forms. Low frequency irregular forms can be relatively quickly replaced with regular forms, whereas more frequent irregular forms (such as kept) have remained in use long after many rare irregular forms were lost. This happens because irregular high frequency forms are extremely well learned and resist competition from regular patterns. A related phenomenon, demonstrating the entrenchment of high frequency forms in memory, is the finding that high frequency words are slower to be replaced with other lexical forms (Pagel, Atkinson, Calude, & Meade, 2013; Pagel, Atkinson, & Meade, 2007). But this explanation cannot extend to phonetically regular sound changes, as they do not involve any analogical force.
1 Note that updating a distribution need not involve storing separate memories for the encountered token. As outlined in Pierrehumbert (2001), the phonetic parameter space is assumed to be granularized, and groups of tokens which are similar with respect to the granularization will be encoded as identical (see Kruschke, 1992).
Several other types of low frequency-leading changes have been reported. According to Phillips (2006), changes that involve abstract generalizations over the phonological structures of the word types in the lexicon may act like analogical changes in affecting low frequency words first. She contrasts such changes with those that only involve the phonetic realizations of phonemes (such as vowel shifts), which she describes as affecting more frequent words first. Though intriguing, these observations are limited by the fact that they rest on written historical records and synchronic experimental studies, rather than on longitudinal data. As a result, the gradualness of the changes can only be inferred. Ogura (2012) also suggests that changes may behave differently depending on whether they are driven by perceptual, rather than productive forces. She claims that, while high frequency words generally lead change, low frequency forms can dominate perceptually driven change because "perceptually or cognitively unfavourable forms can be learned and maintained in their unfavourable forms if they are high frequency in the input" (Ogura, 2012, p. 438).
For phonetically gradual sound changes that are not structurally complex, we are not aware of any reported case of low frequency forms leading. For this type of sound change, it is either argued that high frequency forms should lead (Bybee, 2000), or that there should be no frequency effect (Labov, 2004). Almost no work has investigated how frequency effects pattern over time during the course of such change. Only recently, with the creation of large transcribed and aligned longitudinal speech archives, has it even become possible to rigorously test for such patterns. The present study and Hay and Foulkes (in press), both using the ONZE corpus, are the only two studies we are aware of that carry out such a test. Hay and Foulkes explore a consonant lenition. Here we explore a different type of change: a set of interlinked vowel changes that shifts the vowels without leniting them.

The New Zealand English short front vowel shift
Modern New Zealand English (NZE) is notable for the unusual quality of three short front vowels. These vowels are so different from their counterparts in other dialects as to cause misunderstandings: NZE bat often sounds to speakers from elsewhere like bet, NZE bet sounds like bit, and NZE bit often sounds like but. We henceforth refer to the three sets of short front vowels as BAT vowels, BET vowels, and BIT vowels.
This pattern resulted from gradual changes over more than a century. Over the history of NZE, the vowels in BAT and BET words came to be closer and more front, and BIT words became more open and back. Gordon et al. (2004) provide evidence that the BAT words may have initially shifted in response to the encroaching vowel in words like bart, which are pronounced with long open vowels in the r-dropping New Zealand dialect. This is corroborated by the analysis presented here, which shows that BAT vowels before voiced sounds are most advanced in the change (see Table A1). These are exactly the vowels that would be most confusable with advancing BART words, due to their durational overlap (cf. Chen, 1970; Mack, 1982). The later shifts in BET and BIT words occurred as a reaction to the advance of the BAT words. 2 The end result is that the vowels have rotated in acoustic space but have remained distinct from each other.
The rotation of these three vowels is a regular sound change, which eventually affected all words containing the vowels. It is characterized as a chain-shift because the changes are interlinked, and the original set of lexical contrasts is maintained. Specifically, it is a push chain because the lower vowels moved first, and the higher vowels moved away when the lower vowels encroached on them. The typology of chain shifts dates from Martinet (1952), who contrasted push chains with 'pull chains', in which categories move to occupy phonetic space previously vacated by the retreat of other categories. Although historical records with sufficient detail to distinguish these types of chain shift are rare, Lubowicz (2011) cites 5 other examples of push chains in addition to the New Zealand English Vowel Shift.
Chain shifts rotate vowels without any loss of lexical contrast. In this sense, they differ from /t/-deletion and many other leniting sound changes which can neutralize lexical contrasts. A major limitation of the model discussed above (Pierrehumbert, 2001) is that it in fact has no treatment of lexical contrasts. As a result, it does not provide a general mechanism for chain shifts, in which the system of lexical contrasts is actually preserved. In push chains, a category with a systematic bias moves first, and encroaches on another category which then retreats. According to Pierrehumbert's model, the result of such an invasion would be neutralization of the categories. Indeed, no existing model integrates a treatment of word-specific phonetic patterns with an up to date treatment of lexical competition.
When one vowel encroaches on another, a region of phonetic ambiguity between the two vowels is created. Our suggestion about the mechanism for push chains builds on findings that word frequency effects are greater for poor or ambiguous speech signals than for clear ones (Norris & McQueen, 2008). In the ambiguous region near the category boundary, a low frequency word is more vulnerable to being misunderstood than a high frequency word. The best understood tokens of low frequency words will be those far from the boundary. We presume that neutralization is avoided because chain shifts proceed by such small increments that the region of phonetic ambiguity between adjacent vowels at any given time is also small, in relation to the total within-category variation. The consequences of this situation will be discussed in more detail in Section 6.
2 All vowels have F2 voicing effects in which the pre-voiced (longer) vowels are fronter. This can be interpreted as stemming from the well-documented effect that longer vowels provide more opportunity for an articulatory target to be reached, and so can have more peripheral realizations (Lindblom, 1963). However, the effect of voicing on F1 of BAT vowels cannot be interpreted in this way. The pre-voiced vowels are the higher (and thus more centralized) tokens, and so would actually take less effort to produce.

Methods
Our data-set comprises a total of 2741 word types (80646 word tokens) from the speech of 549 different speakers, with birth-dates spanning 136 years. The large size of our data-set gives us more statistical power than other studies of sound change. The historical time-depth provides a unique opportunity to track sound change over many generations of speakers.
This time-depth is made possible by the unique corpus from which the data is drawn. The Origins of New Zealand English (ONZE) corpus contains recordings of speakers born between 1851 and 1987 (Gordon, Maclagan, & Hay, 2007). These have been orthographically transcribed, and timestamped at the utterance-level. The HTK-Toolkit (Young et al., 2006) was used to force-align the speech to the phoneme level, using phonological representations extracted from the CELEX lexical database (Baayen, Piepenbrock, & Gulikers, 1995). Using a LaBB-CAT interface (Fromont & Hay, 2008), first and second formants (F1 and F2) were automatically extracted at the midpoint of the vowel, using Praat's standard settings (Boersma & Weenink, 2005). 3 F1 and F2 are the first and second resonances of the vowel, which can be estimated from the acoustic signal and which effectively characterize dimensions of the vowel quality. F1 captures the open/close dimension. Vowels with a higher F1 are more open. F2 captures the front/back dimension. Vowels with a higher F2 are more front. Formants were converted to the Bark scale, a transformation that reflects the perceptual scaling of the two dimensions.
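The text does not say which Hz-to-Bark formula was applied. As an illustration only, the sketch below uses the widely cited approximation of Traunmüller (1990), z = 26.81 f / (1960 + f) - 0.53. The choice of formula, the function name `hz_to_bark`, and the example formant values are all assumptions.

```python
def hz_to_bark(freq_hz: float) -> float:
    """Convert a frequency in Hz to Bark (Traunmueller 1990 approximation).

    Assumption: this formula stands in for whichever conversion was
    actually used; the source does not specify one.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


# Hypothetical midpoint formants (Hz) for a single vowel token.
f1_hz, f2_hz = 600.0, 1900.0
print(round(hz_to_bark(f1_hz), 2), round(hz_to_bark(f2_hz), 2))
```

The mapping is monotonic, so a higher F1 in Bark still means a more open vowel and a higher F2 a more front vowel.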
The range of formant values varies across speakers, partially as a function of vocal tract length. Some sociolinguistic work attempts to normalize these values, for example by adjusting with reference to the midpoint of the speaker's values for each formant, combined across vowels (see overview in Adank, Smits, & van Hout, 2004). We do not do this because normalization can make changes in one vowel manifest as adjustments to another vowel's formants. The adjustment could also be unduly influenced by frequent words in a way that would obscure potential frequency effects. To avoid such artifacts, we deal with the varying formant ranges across speakers statistically, by using mixed effects models, as described below.
The data-set was restricted to measurements of lexically stressed vowels occurring in monosyllabic and disyllabic content words. Vowels occurring before /l/ were eliminated because New Zealand English has a merger of BET and BAT vowels before /l/ (Hay, Drager, & Thomas, 2013). Because many of the recordings were made in field conditions using early recording technology, extensive precautions were taken to identify and remove values indicative of errors in the force-alignment or formant tracking. Specifically, tokens with F1 or F2 measurements beyond 3 vowel × gender standard deviations were removed. Extreme value cut-offs were set individually based upon each speaker's mean F1 and F2 vowel formants. Distributions of F1 and F2 values for each individual speaker were manually inspected. Multimodal and non-bivariate-normal formant distributions were then further checked for measurement error in the source recordings. After remediation, speaker vowel means were recalculated and the extreme value cut-offs were adjusted appropriately.
The analyzed data-set comprises 18,464 tokens of BAT words (868 types), 36,736 of BET words (1132 types), and 25,446 of BIT words (741 types) from 549 speakers (269 females, 280 males). The frequency of each word type was drawn from the full corpus. Word counts ranged from 1 to 4905 (mean 29.48, sd 201.10). The log of these values was taken as an estimate of lexical frequency. 4
Following established practice in sociolinguistics (see Bailey, 2008) we use speaker year of birth as a proxy for tracking the progress of the change. 5 Linear mixed effects models (see Baayen, 2008) were fit separately to Bark F1 and Bark F2 of each vowel, using the lme4 library in R (Bates, Maechler, & Bolker, 2011; R Core Team, 2012). Each model contained random intercepts for word and speaker. Fixed effects of following voicing, gender, log lexical frequency (centered), and year of birth (centered) were tested, as well as interactions between these factors.
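The first screening step described above, removal of tokens lying more than 3 standard deviations from the mean of their vowel-by-gender group, can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the record layout, the function name `screen_outliers`, and the toy data are hypothetical, and the later per-speaker cut-offs and manual checks are not reproduced.

```python
from collections import defaultdict
from statistics import mean, stdev


def screen_outliers(tokens, n_sd=3.0):
    """Keep tokens whose F1 and F2 both lie within n_sd standard deviations
    of the mean for their (vowel, gender) group."""
    groups = defaultdict(list)
    for tok in tokens:
        groups[(tok["vowel"], tok["gender"])].append(tok)

    kept = []
    for group in groups.values():
        if len(group) < 2:  # SD is undefined for a single token; keep it
            kept.extend(group)
            continue
        limits = {}
        for formant in ("f1", "f2"):
            values = [tok[formant] for tok in group]
            limits[formant] = (mean(values), stdev(values))
        for tok in group:
            if all(abs(tok[f] - m) <= n_sd * sd for f, (m, sd) in limits.items()):
                kept.append(tok)
    return kept


# Toy data (Bark): 20 plausible BAT tokens plus one gross F1 tracking error.
tokens = [{"vowel": "BAT", "gender": "F", "f1": 6.0 + 0.1 * (i % 5), "f2": 13.0}
          for i in range(20)]
tokens.append({"vowel": "BAT", "gender": "F", "f1": 12.0, "f2": 13.0})
screened = screen_outliers(tokens)
print(len(tokens), "->", len(screened))
```

Because the mean and standard deviation are themselves inflated by gross errors, a single pass of this kind is conservative, which is one reason a second, speaker-level pass of the sort described in the text is useful.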
Non-linearities in the effect of year of birth were modeled using restricted cubic splines from the rms library (Harrell, 2013). Model selection was guided by χ² likelihood tests, the Akaike Information Criterion (Akaike, 1974) and the Bayesian Information Criterion (Schwarz, 1978). The best fit models are reported. Models which included linguistic factors (frequency and/or voicing) as fixed effects also included these factors as random slopes for speaker. If these factors were not justified in the model, they were dropped from both the fixed and random effects. The final models were then rerun with uncentered factors to aid in interpretability. Significance levels were calculated using Satterthwaite's (1946) approximations for the degrees of freedom using the lmerTest library (Kuznetsova, Brockhoff, & Christensen, 2013).
3 The choice of the midpoint was a pragmatic one, in that it can be objectively and automatically identified for large numbers of vowels. Because all of the vowels involved are largely monophthongal in NZE, there is no reason to believe that a location based on target identification would have different results.
4 A reviewer asks about the relationship of these corpus-derived frequencies to CELEX frequencies (Baayen et al., 1995). Our analysis includes 84 types that have no entry in CELEX (mainly proper nouns). Omitting these forms, a Spearman's correlation between the log corpus frequency and log CELEX wordform frequency returns a highly significant correlation (rs = .71, p < .0001). Examples of words overrepresented in the corpus, as opposed to CELEX, include dredges, paddock, and lambing, all farming terms associated with the New Zealand rural economy. Underrepresented words include cent, weapons, and actors. We take the corpus frequencies to be a closer, more context-relevant approximation of word frequency distributions as experienced by the speakers.
5 Using speech-date as a predictor rather than year of birth would not be practical, as it does not provide enough range, and would simply cluster the speakers into three main groups relating to when the data was recorded. The earlier born speakers were older when they were recorded than the later born speakers. If we assume that speakers may continue to move in the direction of change throughout their lifetime (Harrington, Palethorpe, & Watson, 2000), then using year of birth as a proxy for change will slightly under-estimate the speed of change (Gordon et al., 2004). In the case of vowels, formant values are affected as speakers age, so the fact that older speakers are over-represented amongst the earliest speakers will also affect the apparent speed of change in this respect. Effects of ageing on F1 (Harrington, Palethorpe, & Watson, 2007), for example, may mean that our data underestimate the speed of BAT and BET height changes, and overestimate the BIT changes. Taken together, these factors tell us to interpret the exact apparent speed and timing of movement of vowels in our data with some caution. However, none of these factors interfere with the distribution of the change over different word frequencies. Any observed word frequency effects should be independent of the effects of this common simplifying assumption.
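The two information criteria used for model selection have simple closed forms: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood. The sketch below compares two candidate models. The log-likelihoods, parameter counts, and sample size are invented for illustration and are not values from this study.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: model B adds one interaction term to model A.
n_obs = 1000
model_a = {"loglik": -21510.0, "k": 7}
model_b = {"loglik": -21502.0, "k": 8}

for name, m in (("A", model_a), ("B", model_b)):
    print(name,
          round(aic(m["loglik"], m["k"]), 1),
          round(bic(m["loglik"], m["k"], n_obs), 1))
# Lower values indicate the preferred model under each criterion.
```

BIC penalizes each extra parameter by ln n rather than 2, so with large samples it is the stricter of the two criteria when deciding whether an added interaction term is justified.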

Results
For all three vowels, there was a significant interaction between year of birth and lexical frequency (in BAT F1 and F2, BET F1, and BIT F2). In each case, the interaction term goes in the opposite direction from the year of birth coefficient, indicating that high frequency forms are not advancing as quickly as low frequency forms. That is, we observe a non-stationary effect, with low frequency words leading the change. A non-linear effect of year of birth was statistically justified for BIT but not for the other vowels. The coefficients from the final statistical models are shown in Table A1.
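To see why an interaction coefficient of opposite sign to the year of birth coefficient means that high frequency words lag, consider a linear model with centered predictors. The coefficients below are hypothetical, chosen only to show the sign pattern; the fitted values are those reported in Table A1.

```python
# Hypothetical coefficients for one formant (Bark), with centered predictors.
BETA_YEAR = -0.010        # change per year of birth at mean log frequency
BETA_YEAR_X_FREQ = 0.001  # interaction term: opposite sign to BETA_YEAR


def slope_over_time(log_freq_centered: float) -> float:
    """Rate of change per year for a word of the given centered log frequency."""
    return BETA_YEAR + BETA_YEAR_X_FREQ * log_freq_centered


low_freq_slope = slope_over_time(-3.0)   # a rare word
high_freq_slope = slope_over_time(+3.0)  # a common word
print(f"low frequency:  {low_freq_slope:+.4f} Bark/year")
print(f"high frequency: {high_freq_slope:+.4f} Bark/year")
```

With these illustrative numbers both kinds of word move in the same direction, but the rare word's slope is larger in magnitude, which is the pattern described as low frequency words leading.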
This outcome is the opposite of that predicted by the models of sound change outlined above. Figs. 1 and 2 display the changes in the short front vowels, as described by our analysis. For clarity, we show the values for female participants only. Note, however, that no models include interactions containing gender. The model predictions for males are therefore uniformly shifted in the direction of the gender effect, and otherwise identical.
In Fig. 1, all three vowels are plotted on a traditional vowel quadrant, and the modeled trajectory of movement from the beginning of our analyzed time-period to the end is indicated by separate arrows, one for the highest frequency words and one for the lowest frequency words.
The lower frequency words move further, over the same time period. In Fig. 2, the separate models for F1 and F2 of each vowel are plotted over heatmaps, illustrating the underlying distributions over which they are modeled. The frequency effect is again illustrated by showing separate trajectories for the highest and lowest frequency words.

Discussion
We have found significant interactions between year of birth and frequency for all three vowels. These interactions provide evidence for the involvement of lexical frequency in regular sound change. For all three vowels, low frequency words lead the way. These effects appear surprising in the context of the sound-change literature, but a conjecture about the mechanism can be formulated using theories in which listeners have detailed phonetic memories for specific words (e.g. Goldinger, 1998; Pierrehumbert, 2006; Wedel, Kaplan, & Jackson, 2013).
The factors that set push-chains in motion are not well understood, though it seems likely that social factors as well as cognitive ones must be involved. However, once a push chain has been initiated, the encroachment of one vowel on the acoustic space of the next can be viewed as an advancing front of sound change. The phonetic overlap between the two distributions creates the potential for ambiguity between the categories.
How does the existence of a region of ambiguity between two categories cause the category encroached on to be repelled? A mechanism is sketched in Fig. 3.
Fig. 1 (caption, partial). ... Table A1). Solid lines summarize the model predictions for the lowest frequency words in the dataset. Dashed lines summarize the model predictions for the highest frequency words. All arrows cover the same time period. Low frequency words move more quickly.
The mechanism relies on the idea that if tokens are difficult to understand, they are less likely to be encoded in memory. Indeed, there is evidence that poor or noncanonical examples of words are less likely to be robustly stored even if they can be understood (e.g. Sumner & Samuel, 2009).
In Fig. 3, the advancing front of Category A (here, the BAT words) has created a phonetic overlap with a neighbouring Category B (here, the BET words). Examples of BET words that are well away from the ambiguous region are more reliably encoded and remembered than those falling in the ambiguous region. The loss of tokens from the ambiguous region (in combination with random variation in production that extends the distribution of B on the side away from Category A) causes an overall shift of the B category away from the encroaching A (see also Blevins & Wedel, 2009; Martinet, 1952; Wedel, 2006). This category-level interpretation makes a further prediction regarding the distribution of phonetic variants across the words in the category, because ambiguity places low frequency words at a particular disadvantage. High frequency words are much more easily accessed than low frequency words (Forster & Chambers, 1973). The speech processing system resolves ambiguous speech signals by bias toward the more frequent lexical candidate (Norris & McQueen, 2008) and listeners have a greater tendency to mishear low frequency words as high frequency words than vice versa (Savin, 1963). The effects of phonological ambiguity are not confined to close lexical competitors. In lexical decision tasks, low frequency words are also more likely to be misinterpreted as nonwords (Luce & Pisoni, 1998). Thus, pronunciations of high-frequency words that lie near the advancing front of a nearby vowel should be easily and accurately recognized, on the average. We assume that these tokens will then be robustly stored and affect subsequent productions. Pronunciations of low frequency words that lie near the advancing front, on the other hand, are less easily recognized. They may be misheard as a competing high frequency word, or as a nonword. Or they may be recognized, but not reliably stored. 6 On the average, this leads to more attrition for low-frequency words than for high-frequency words in the ambiguous region created by the sound change.
Fig. 2 (caption, partial; x-axis: Year of Birth). Hotter colors (yellow tones) mark more frequent formant realizations than cooler colors (blues). The relative frequency data was generated using the bkde2D function from the KernSmooth package in R. The frequency data were plotted as a heatmap using a modified form of the filled.contour function from the graphics package in R. Note that formant values on the y-axis are laid out from low to high, in contrast to the traditional vowel polygon in Fig. 1. Model predictions are laid over the data. Solid lines summarize the model predictions for the lowest frequency words. Dashed lines summarize the model predictions for the highest frequency words. 2C shows an effect of year of birth but no effect of word frequency. All other panels show a significant frequency effect. For F2 of bet words, low frequency words are consistently ahead of high frequency words. In all other panels (2A, 2B, 2D, 2F) there is a significant interaction between year of birth and word frequency. For all three vowels, low frequency words move more quickly (i.e. have a steeper slope) in one or both formants.
6 Non-canonical pronunciations may, paradoxically, have an advantage in being stored if they are socially salient (Sumner, Kim, King, & McGowan, 2014). This factor would not affect Category B in Fig. 3, which represents a stage in the sound change in which Category B has no active socially triggered bias.
These factors conspire to mean that the distribution of remembered low frequency words should be repelled more quickly from the advancing front of a sound change. Importantly, we are not suggesting that the speaker actively enhances contrast in cases of potential ambiguity. The frequency effects simply follow from general mechanisms of word perception and storage.
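The conjectured mechanism can be illustrated with a small simulation. This is a deliberately simplified sketch and not a model proposed in the paper: a Category B word is reduced to one remembered mean on a single phonetic dimension, the ambiguous region is a fixed interval below `ambiguity_edge`, and word frequency enters only through the probability that an ambiguous token is stored. All names and parameter values are assumptions.

```python
import random
from statistics import mean


def simulate_word(p_store_ambiguous, generations=20, n_tokens=2000,
                  ambiguity_edge=-0.5, spread=1.0, seed=1):
    """Return how far a Category B word's remembered mean moves away
    from the encroaching Category A (which lies at lower values)."""
    rng = random.Random(seed)
    remembered_mean = 0.0
    for _ in range(generations):
        stored = []
        for _ in range(n_tokens):
            token = rng.gauss(remembered_mean, spread)
            in_ambiguous_region = token < ambiguity_edge
            # Clear tokens are always stored; ambiguous tokens are stored
            # only with probability p_store_ambiguous.
            if not in_ambiguous_region or rng.random() < p_store_ambiguous:
                stored.append(token)
        remembered_mean = mean(stored)
    return remembered_mean


# Assumption: ambiguous tokens of a rare word are stored less reliably.
low_shift = simulate_word(p_store_ambiguous=0.5)
high_shift = simulate_word(p_store_ambiguous=0.9)
print(f"low-frequency word shift:  {low_shift:.2f}")
print(f"high-frequency word shift: {high_shift:.2f}")
```

No step in the sketch has the speaker enhancing contrast. The retreat arises purely from which tokens survive into memory, and it is faster for the word whose ambiguous tokens are lost more often.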
The effects of word frequency in our study are small in comparison to the within-category variation for each vowel. This fact helps to explain why such a large sample has been needed to find the effects. It is consistent with Pierrehumbert's (2006) suggestion that word-specific phonetic patterns are subtle, within-category, effects. It is of course possible for words to acquire multiple, categorically distinct phonological spell-outs, as shown for example in studies of how frequent words are produced in spontaneous speech (Kemps, Ernestus, Schreuder, & Baayen, 2004). However, for the New Zealand short front vowel shift, the movement within a single generation was much smaller than the separation between categories. Differential effects of word frequency are therefore even smaller.
Importantly, it does not directly follow from our findings that such a frequency effect would definitely be observable in any push-chain relationship. The effect of frequency on confusability will be influenced by a range of factors, including the shape and size of the distributions, the degree of overlap and phonetic confusability, and the speed of change.

Conclusion
We have tracked three interrelated vowel changes as they unfolded over the history of New Zealand English. Our mixed effects models of automatically extracted formant values reveal robust frequency effects in all three changes, with the lowest frequency words moving most quickly. For the vowels being pushed in the chain shift that we investigated, the low frequency words were in advance. This finding is very surprising in the context of contemporary sociolinguistic debate about the role of frequency in regular sound change. However it follows naturally from speech processing models in which people remember detailed phonetic properties of words, and in which ambiguous tokens of low-frequency words are vulnerable to misperception.
Claims that word frequency can play no role in regular sound change are therefore wrong. However, competing claims that high frequency words always lead are also wrong. The particular nature of lexical frequency effects in a given sound change can only be predicted and interpreted in the light of a detailed understanding of the mechanisms underlying the change itself.

Acknowledgments
This work has been supported by a Royal Society of New Zealand Marsden Grant and a Rutherford Discovery Fellowship to the first author, and a University of Canterbury Erskine Fellowship to the second author. In addition, this project was made possible through a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. The ONZE data was collected by the Mobile Disc Recording Unit of the NZ Broadcasting Service, Rosemary Goodyear, Lesley Evans, members of the NZ English class of the Linguistics Department, University of Canterbury, and members of the ONZE team. The work done by members of the Origins of New Zealand English Project (ONZE) in preparing the data, making transcripts, and obtaining background information is also gratefully acknowledged. The Corpus was created and supported with funding from the following sources: University of Canterbury, Foundation for Research, Science and Technology (the New Zealand Public Good Science Fund), the Royal Society of New Zealand, The New Zealand Lotteries Board Fund, and the Canterbury History Foundation. We are particularly grateful to Robert Fromont for his work programming LaBB-CAT, the ONZE Corpus search engine and interactive interface. This manuscript has benefited from feedback from Cynthia Clopper, Donald Derrick, Matt Goldrick, Peter Racz and three anonymous reviewers.

Fig. 3. Schematic diagram of the mechanism for word frequency effects in a push chain. A shifting distribution of the low front vowel in BAT words (Category A) encroaches on the distribution for the mid front vowel in BET words (Category B). The colored ellipses represent three examples of lexical items within each category. Their placement and shape are arbitrary. Once a region of ambiguity is created (shown here by the region of the BAT distribution that has crossed the original category boundary), productions of BET words are encoded with variable reliability. Productions of BET words that are far from the region of ambiguity are reliably encoded. Productions of BET words that fall in the region of ambiguity are more reliably encoded for high frequency words, such as "red", than for low frequency words, such as "wrestle". The noise in the production system that enables the BET distribution to retreat from the encroaching BAT distribution is not illustrated.

Satterthwaite's (1946) approximations for the degrees of freedom using the lmerTest library (Kuznetsova et al., 2013). Broad significance classes are indicated with * (< .05), ** (< .01), and *** (< .001). Four models include significant frequency × year of birth interactions. In each case, the interaction coefficient goes in the opposite direction from the year of birth coefficient, indicating that high frequency forms are not advancing as quickly as low frequency forms. Note that for BIT words, the effect of year of birth on formant values was modeled as a non-linear effect using a restricted cubic spline (RCS) from the rcs() function in the R package rms (Harrell, 2013). Non-linear effects were also tested for year of birth for other vowels and were found to be non-significant. RCS applies a function to the predictor similar to the nonlinearity provided by a polynomial function. RCS functions are described by the number of endpoints (called knots) in the function, which is the number of subsets used plus one. For both Bark F1 and F2, an RCS with 3 knots was used, which corresponds to a quadratic polynomial (x^2). The first year of birth parameter is labeled rcs(year of birth, 3)′ and is applied to the first subset of the data. The second parameter is labeled rcs(year of birth, 3)″ and is applied to the second subset of the data. The interactions of year of birth with word frequency are similarly labeled. A Bark scaling is chosen because it more accurately reveals the perceptual magnitude of the modeled shifts. Because this is a simple transformation, this choice does not have any meaningful influence on the shape of the final models.
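The two transformations mentioned in this note can be sketched as follows. Both functions rest on assumptions. The note does not say which Hz-to-Bark formula was used, so the sketch adopts Traunmüller's (1990) approximation without its corrections for the extremes of the scale. The spline basis follows the truncated-power formulation of a restricted cubic spline described by Harrell, scaled by the squared distance between the outer knots, which is our understanding of the rms default. The knot locations in the usage example are hypothetical and are not the ones fitted in the paper.

```python
def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Traunmueller 1990 approximation).

    Assumption: the end-of-scale corrections of the original formula are
    omitted, which is adequate for typical F1 and F2 values.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def rcs3_basis(x, knots):
    """Return the (linear, nonlinear) basis values of a 3-knot restricted
    cubic spline at x. The fitted curve is linear outside the outer knots.
    """
    t1, t2, t3 = knots

    def cube_plus(u):
        # Truncated cubic: u^3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    nonlinear = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    )
    # Scale so the nonlinear column is on roughly the same scale as x.
    return x, nonlinear / (t3 - t1) ** 2


if __name__ == "__main__":
    hypothetical_knots = (1870.0, 1920.0, 1970.0)  # illustrative only
    for year in (1860, 1900, 1940, 1980):
        print(year, rcs3_basis(float(year), hypothetical_knots))
    print(hz_to_bark(500.0), hz_to_bark(2000.0))
```

With three knots the basis has one linear and one nonlinear column, which matches the two year of birth parameters per model described above. Each column would also enter the interaction with word frequency separately.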