Affrication as the cause of /s/-retraction: Evidence from Manchester English

Retraction of /s/ to a more [ʃ]-like sound is a well-known sound change attested across many varieties of English for /stɹ/ words, e.g. street and strong. Despite recent sociophonetic interest in the variable, there remains disagreement over whether it represents a case of long-distance assimilation to /ɹ/ in these clusters or a two-step process involving local assimilation to an affricate derived from the sequence /tɹ/. In this paper, we investigate Manchester English and apply similar quantitative analysis to two contexts that are comparatively under-researched but allow us to tease apart the presence of an affricate and a rhotic: /stj/ as in student, which exhibits similar affrication of the /tj/ cluster in many varieties of British English, and /stʃ/ as in mischief. In an acoustic analysis conducted on a demographically-stratified corpus of over 115 sociolinguistic interviews, we track these three environments of /s/-retraction in apparent time and find that they change in parallel and behave in tandem with respect to the other factors conditioning variation in /s/-retraction. Based on these results, we argue that the triggering mechanisms of retraction are best modelled with direct reference to /t/-affrication and with /ɹ/ playing only an indirect, and not unique, role. Analysis of the whole sibilant space also reveals apparent-time change in the magnitude of the /s/–/ʃ/ contrast itself, highlighting the importance of contextualising this change with respect to the realisation of English sibilants more generally as these may be undergoing independent change.


Introduction
In recent years there have been a number of sociophonetic studies investigating /s/-retraction in English, a process by which /s/ is realised as a more retracted [ʃ]-like variant in complex onsets. This is most widely reported in the word-initial cluster /stɹ/, as in street and strength, but is also found word-medially, as in district and frustrated.
It has been attested in Estuary English (Altendorf 2003), Colchester (Bass 2009) and Edinburgh (Sollgan 2013); however, to date, there has been no detailed community-level study of the type seen for American English. While this dearth of research has been somewhat alleviated by a recent cross-dialectal study across varieties of Scotland (and North America) by Stuart-Smith et al. (2019), this work has focused primarily on comparisons of the status of /s/-retraction in a range of English varieties rather than a detailed investigation of change in a single speech community.
In this paper, we present the first evidence of /s/-retraction in Manchester English, spoken in the North West of England and, in doing so, we address the question of what phonetic factors motivate this process of retraction. It has often been claimed that /s/ retracts in these contexts due to long-distance assimilation to the rhotic segment in /stɹ/ clusters (see e.g. Shapiro 1995).
An alternative account to this is that retraction is in fact local, arising as a consequence of the affrication of /t/ by /ɹ/ rather than by /ɹ/ directly (Lawrence 2000). The problem in determining which of these competing explanations receives the strongest empirical support is summarised perfectly by Wilbanks (2017: 302) who writes that "it may prove difficult to tease apart the effects of contact with affricated /t/ and variably-articulated /ɹ/ […] and isolate a single underlying cause". That this affrication trigger is often overlooked in work on /s/-retraction is perhaps a consequence of the fact that work has primarily been conducted on yod-dropping varieties of American English; by turning the focus instead to yod-retaining varieties of British English we can consider the behaviour of /s/ in another environment, namely /stj/ clusters (e.g. student, stupid), in which /tj/ undergoes coalescence to [tʃ]. Crucially, this provides independent evidence of how /s/ is realised in a cluster with affrication but in the absence of a rhotic segment. To date, there is no quantitative evidence of the behaviour of /s/ in these clusters and consequently no comparison with the more widely-attested retraction in /stɹ/. In addition to this, we also provide the first acoustic comparison between these contexts and the /stʃ/ environment (e.g. mischief), where /s/ occurs before an underlying affricate rather than one derived through an independent (and still partially variable) phonological process.
Finally, we also address /s/-retraction in the context of the wider sibilant space, defined here as the speaker-specific range of typical spectral values for underlying /s/ and /ʃ/ segments in pre-vocalic environments, e.g. seep and sheep. By considering these two "end points" of the sibilant continuum, which may be changing independently of context-specific /s/-retraction, we can gain insight into how advanced this process is in Manchester English. In doing so, we ask the following question: is there evidence that /s/-retraction has become stabilised as a categorical phonological rule, with speakers for whom /s/ in /stɹ/ is phonetically identical to their realisation of an underlying /ʃ/?
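As a purely illustrative sketch (not the measure used in this paper), the position of a contextual /s/ within a speaker's sibilant space can be expressed as a proportion of the distance between that speaker's pre-vocalic /s/ and /ʃ/ means. The function name `retraction_ratio` and the centroid values in the usage note are hypothetical; a value near 0 would indicate an [s]-like realisation and a value near 1 a realisation matching the speaker's own /ʃ/.

```python
from statistics import mean


def retraction_ratio(cog_s, cog_sh, cog_target):
    """Locate target tokens between a speaker's /s/ (0.0) and /ʃ/ (1.0) means.

    All arguments are lists of spectral centroid values (e.g. in Hz)
    from a single speaker. This index is an assumption made for
    illustration, not a measure reported in the source.
    """
    s_mean = mean(cog_s)
    sh_mean = mean(cog_sh)
    span = s_mean - sh_mean
    if span <= 0:
        raise ValueError("expected /s/ to have a higher mean centroid than /ʃ/")
    return (s_mean - mean(cog_target)) / span
```

For instance, with invented values `retraction_ratio([7000, 7200], [4000, 4200], [5500, 5700])` places the target tokens exactly halfway between the two end points of that speaker's sibilant space.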
In sum, the research described here is guided by the following questions:
i. How advanced is /s/-retraction in Manchester English and is there evidence of an apparent-time change in this community?
ii. Is retraction observed in /stj/ clusters and to what extent does /s/ show similar behaviour in /stj/, /stɹ/ and /stʃ/ contexts?
iii. In light of this, which of the two competing accounts of /s/-retraction (local vs non-local assimilation) finds the strongest empirical support in Manchester English?
The results of this study provide the first robust empirical evidence of a community-level change in /stɹ/ in a British variety of English. This appears to be a regular coarticulatory sound change which is led by young women (and, possibly, working-class speakers) and in which higher frequency words are more advanced.
Our findings further provide new insight into the mechanisms of /s/-retraction. That is, in this first large-scale quantitative investigation of retraction in /stj/ and /stʃ/, we find that they are changing in parallel with /stɹ/ and this suggests that, although /ɹ/ and /j/ may have some direct effect on /s/, this is unlikely to be the primary cause that initiates this change. The proposed solution to the actuation problem advanced by Baker et al. (2011), which relies on covert articulatory variation in /ɹ/, is therefore unable to account for this particular instance of /s/-retraction. In addition to significant change in /stɹ/, /stj/ and /stʃ/, we also observe change in pre-vocalic /s/ and /ʃ/ in the form of a larger acoustic contrast between these sibilants among younger speakers. We discuss these results and their implications for the origins and underlying mechanisms of /s/-retraction, as well as variation in the wider sibilant space.

Preliminaries on /s/ and /s/-retraction
Although this study is concerned with context-specific /s/-retraction, there is of course a wealth of evidence highlighting the degree of within- and between-speaker variability in the production of sibilants more generally and the factors that condition these patterns of synchronic variation (Newman et al. 2001; Stuart-Smith 2007; Levon et al. 2017). This is particularly the case for /s/, in which variation has been shown to have taken on socio-indexical meaning: more fronted /s/ realisations are perceived as less masculine and more gay and, in production, are more frequent among male speakers who are gay or bisexual (Campbell-Kibler 2011; Podesva & Van Hofwegen 2014; Levon 2014).
However, it is important to separate the realisation of /s/ in /stɹ/ from the wider spectral variation in /s/ production. /stɹ/ has been studied as a sociolinguistic variable in its own right, separate from the wider patterns of socio-indexical variation in /s/ found in other environments (see e.g. Durian 2007; Gylfadottir 2015; Wilbanks 2017). There is also strong perceptual evidence to support this distinction: Phillips & Resnick (2018; 2019) find that a retracted /s/ in /stɹ/ clusters does not carry the same socio-indexical meaning of masculinity, toughness or heterosexuality that has been observed for retracted /s/ in other /sCɹ/ environments.
There is a general lack of large-scale studies of /s/-retraction that combine robust acoustic analysis with community-level data in order to investigate its status within a given speech community in detail, especially with a view to investigating change. Existing work is also variable with respect to the method of coding /s/-retraction. Some studies have adopted a binary classification, coding tokens of /s/ as either retracted or non-retracted (e.g. Janda & Joseph 2003; Bass 2009); while this finds some support from a study by Rutter (2011), who reports that a majority of retracted forms fall within a speaker's normal range for [ʃ], Labov (2001) argues that, in Philadelphia English, there are at least four variants differing in how [ʃ]-like they are. A fine-grained acoustic measure of retraction is vital given that this process is also argued to occur, though to a much lesser degree, in other contexts such as pre-vocalic /sp, st, sk/ and /spɹ, skɹ/ (Labov 1984; Janda & Joseph 2003; Baker et al. 2011).
The few large-scale studies that have been conducted concern almost exclusively varieties of English spoken in North America (e.g. Gylfadottir 2015 in Philadelphia, PA and Wilbanks 2017 in Raleigh, NC) and all show evidence of apparent-time change with increasing /s/-retraction among younger speakers.¹ However, the role of other social factors in explaining variation in retraction, such as gender and socioeconomic status, is less consistent. Durian (2006) and Bass (2009) describe "rapid anonymous" surveys, in which /s/ is auditorily coded, but the former finds a female lead for /s/-retraction in American English (Columbus, OH) and the latter a male lead in British English (Colchester). Larger-scale studies based on acoustic data are similarly variable, finding either that women are leading in this change (Wilbanks 2017; Stuart-Smith et al. 2019) or detecting no gender effect at all (Gylfadottir 2015). Additionally, while Labov (2001) finds an association between /s/-retraction and working-class speech in subjective evaluation tests, there is not yet any clear evidence of a relationship between /s/-retraction and social class on the basis of actual production data. Although we plan to address these topics in future work that considers the wider status of /s/-retraction in the speech community, and we do discuss these factors briefly in Section 4 since they are included in the model, the primary focus of this paper is on the mechanisms by which /s/ undergoes retraction in this environment.
1 One exception to this bias towards American English is Ahlers & Meer (2019), a large-scale corpus study of /s/-retraction in /stɹ/ in Trinidadian English which also found retraction among younger speakers.
As discussed in Section 1, there are two major proposals that aim to explain why /s/ retracts in these contexts. These differ from one another with respect to the locality of the triggering mechanism: i. /s/ undergoes long-distance assimilation to /ɹ/; ii. /s/ undergoes local assimilation to an adjacent affricated /t/, itself affricated locally under the influence of a following /ɹ/.
A fuller consideration of these competing explanations follows.

Arguments for /ɹ/ as the trigger
The first of these competing explanations, that retraction is caused directly by the /ɹ/ in these /stɹ/ clusters, was initially proposed by Shapiro (1995). This argument was motivated by the observation that /s/ retracts to a significantly lesser degree in /st/ clusters, e.g. steep, suggesting that the rhotic segment in /stɹ/ plays a crucial role and that the retraction process is therefore a case of assimilation "at a distance". The presence of only low-level retraction in /st/ and other complex onsets has been widely attested and even varieties that are reportedly not undergoing the /stɹ/ change itself, such as Australian English, still show a slightly lower centroid frequency for /s/ in contexts such as /sp, st, sk/, described as the "phonetic pre-conditions" of the change in /stɹ/ (Stevens and Harrington 2016: 118).
While Shapiro (1995) relies on the inspection of secondary data and anecdotal evidence, later studies provide more strongly quantitative and empirical support for these claims. An articulatory experiment by Baker et al. (2011), in which ultrasound tongue imaging is combined with simultaneous acoustic analysis, finds evidence of a coarticulatory bias towards retraction in other /sCɹ/ clusters in American English. In their study, /s/ is produced with a slight dampening of centre of gravity in all complex onset clusters, with a stronger lowering in centroid frequency observed when the complex onset contains /ɹ/, as in words such as spring and screech. Although the effect is registered most strongly in /stɹ/, which is clearly set apart from all other onset types, this result suggests that /ɹ/ has some long-distance lowering effect on the frequency of /s/ regardless of the intervening consonant.
Another line of argumentation provided by Baker et al. (2011) draws upon the inherent variability in the articulation of /ɹ/. It is well established that /ɹ/ can be produced with a range of lingual constriction types, ranging from more "tip-up" retroflex to more "tip-down" bunched articulations, often with little to no perceptible acoustic difference between them (Delattre & Freeman 1968; Twist et al. 2007). Baker et al. (2011) report that all speakers in their study used a bunched /ɹ/ variant in /stɹ/, but with variation within this category with respect to the similarity of the lingual shape in the articulations of /s/ and /ɹ/ in these clusters. Among speakers they classify as "non-retractors", for whom retraction in /stɹ/ stems from gradient coarticulation rather than a distinct production target, the magnitude of retraction correlates with this inter-speaker variation in the tongue shape for /ɹ/ and specifically its similarity to the tongue shape for /s/.
It is also interesting to note the claim by Baker et al. (2011) that /stɹ/ clusters favour a bunched rather than retroflex tongue configuration for /ɹ/ as there is independent evidence (albeit from British rather than American English) that bunched /ɹ/ is accompanied by more extreme lip protrusion relative to retroflex /ɹ/ (King & Ferragne 2020). This might suggest that any role of /ɹ/ may in part be due to regressive assimilation of labialisation and less so the lingual gesture, a possibility also raised by Janda & Joseph (2003: fn. 8) in a discussion of crosslinguistic differences in anticipatory labialisation.
One final piece of suggestive evidence regarding the role of /ɹ/ comes from a small-scale study of /s/-retraction in British English: Sollgan (2013) reports variability in tongue shape for /ɹ/ among speakers of Edinburgh English and, crucially, observes that alveolar realisations of /ɹ/ rarely co-occur with a retracted /s/ variant. Based on this, one might argue that there is a close relationship between the tongue shape of /s/ and of the rhotic segment in these onset clusters,

Arguments for affrication as the trigger
Competing accounts of /s/-retraction have proposed that affricated /tɹ/ clusters are responsible for the retraction of a preceding /s/. This argument was first made by Lawrence (2000), in response to the /ɹ/-centric explanation provided by Shapiro (1995) as detailed in the preceding subsection. He points out that when innovative [ʃ]-like variants are produced, they are always followed by an affricated /tɹ/ cluster and claims that the derivation follows a two-step process as follows: /stɹ/ → [stʃɹ] → [ʃtʃɹ] (Lawrence 2000: 83). Although this is based largely on anecdotal evidence, it is supported by several studies of /s/-retraction that note how participants who retract always have affricated /tɹ/ clusters and that /tɹ/-affrication predates /s/-retraction (Magloughlin & Wilbanks 2016; Smith et al. 2019). These results suggest a strong (though not necessarily unidirectional) link between these two processes: speakers may affricate the /tɹ/ cluster without also retracting the /s/ but they do not retract /s/ without affricating the following /t/. Affrication has also played a central role in explanations of /s/-retraction in other, lesser-studied varieties of English, such as Trinidadian English where it has been described as the catalyst for this change in /s/ centroid frequency (Ahlers & Meer 2019).
The affrication of such clusters, e.g. in words like train and try (similarly in its voiced counterpart /dɹ/, e.g. drink, dry) is well-accepted in descriptions of English, dating back to the relatively early reference by Jones (1956: §270) that "[t]he Southern English tr and dr seem to be intermediate between single affricates and sequences of two distinct sounds". Its status is noted by textbooks on English pronunciation and highlighted for foreign learners of English (Cruttenden 2014; Lindsey 2019). Additional support for affrication of /tɹ/ clusters comes from children's spellings, e.g. try as CHRIE and dragon as JRAGIN (O'Neil 2013: 222).
The extent to which /tɹ/-affrication is a stable phenomenon or worthy of study as a change in progress is not clear in existing descriptions and this is clouded by the lack of empirical studies, particularly when compared to /stɹ/. Wells (2011), at least, asserts its stability in Southern that within a given speech community, /tɹ/-affrication could be a sound change, even if Wells is right concerning present-day British English and it is stable.² /tɹ/ clusters are certainly affricated in Manchester, although we cannot comment here on whether there is full affrication to [tʃɹ] or some kind of in-between variant. Three of the four authors are native Mancunians and report complete homophony between sentry and compressed century, perhaps suggesting a more advanced situation than that reported by Wells (2011).
An attractive aspect of the affricate-based explanation, as opposed to assimilation to /ɹ/, is that it would also capture the behaviour of /s/-retraction in /stj/ and /stʃ/ clusters, for which this study provides the first acoustic evidence in a large-scale investigation of the speech community. This is because /t/ also affricates before /j/ in a process often called "yod-coalescence", e.g. tune [tʃʉːn], although there is not yet a detailed quantitative/acoustic study of this phenomenon.
/tj/-coalescence is discussed by Wells (1997), who claims that it was initially found in unstressed syllables, e.g. perpetual, before spreading to stressed contexts, e.g. tune, in the late twentieth century. Hannisdal (2006) provides a good overview of this widespread change in British English, with a focus on Received Pronunciation, by comparing various pronunciation dictionaries over the course of the twentieth century and the extent to which they list /tʃ/ as a possible variant for words historically containing /tj/. Hannisdal notes that coalesced forms are absent in the first edition of the English Pronouncing Dictionary (Jones 1917) but begin to appear sporadically by the later 16th edition (Jones 2003). In other such dictionaries published since the turn of the century, affricated variants are listed more consistently, e.g. in the Longman Pronunciation Dictionary (Wells 2000) and the Oxford Dictionary of Pronunciation for Current English (Upton et al. 2001). Increases in /tj/-coalescence have also been attested within individual speech communities, including the likes of Ipswich and the Fens in East Anglia (Britain et al. 2008), where it co-exists with variable yod-dropping, and across various locales in the East Midlands (Braber & Flynn 2016).
As with the discussion of /tɹ/-affrication earlier and the fact that speakers have to affricate /tɹ/ for /s/-retraction to be licensed, it is also the case that significant retraction of /s/ in /stj/ clusters is similarly limited to instances where the /tj/ cluster itself does actually undergo coalescence.
2 Note that Prokofieva (2021) claims that /tɹ/-affrication is stable in Canada. However, she only records 18-21 year olds, using a male-led effect in terms of relative advancement as a proxy for stability. This is an unreliable assumption: it rests on the observation that, in stable variation, males show more of the non-standard variant than women, but stability should not be deduced backwards from a male lead. Thus, Canadian affrication may be stable but we cannot tell from the data of the 18-21 year olds surveyed in this investigation.
Although the coalesced form is now incredibly widespread across varieties of British English, some speakers do still retain the /tj/ pronunciation. However, this resistance to /tj/-coalescence is largely restricted to speakers of "conservative RP", who see such realisations as a less formal variant, particularly in word-initial onsets of stressed syllables (Upton 2008: 229; see also Ramsaran 1990 and the Daily Telegraph quote reported by Kerswill 2001: 12 describing coalescence as an "insidious degradation of spoken English"). Anecdotally, coalescence is certainly predominant in Manchester English, applying consistently across the lexicon and throughout the whole social scale.
As outlined in Section 1, investigations of /s/-retraction have almost all focused exclusively on the /stɹ/ context. There is, however, some evidence from smaller-scale studies of how the affricate derived from /tj/-coalescence influences a preceding /s/ in words such as student.
Retraction in this environment is discussed briefly by Glain (2014) in a study of what he terms "instances of contemporary palatalisation", referring to the increasing occurrence of palatoalveolar sounds such as /ʃ, tʃ, ʒ, dʒ/ in certain segmental contexts in British English. However, it should be noted that Glain draws no causal link between affrication and retraction, suggesting instead that it is caused directly by the /ɹ/ and /j/ in these clusters. A retracted /stj/ variant is also mentioned briefly in the Longman Pronunciation Dictionary (Wells 2000: 50), where it is interesting to note that the derivation is given as /stj/ → [stʃ] → [ʃtʃ] but for a word like strong it is given as /stɹ/ → [ʃtɹ] with no equivalent affrication.
Finally, retraction in /stj/ has been attested in New Zealand English. In his response to Shapiro (1995), Lawrence (2000) explicitly mentions the behaviour of /stj/ sequences but only cites examples in word-medial position (e.g. moisture) and across word boundaries (e.g. last year).
Like Shapiro (1995), this paper is also largely based on anecdotal evidence rather than a robust quantitative investigation. A study of this type was conducted by Warren (2006) using elicitation recordings from the New Zealand Spoken English Database (Warren 2002), though the /stj/ results are based solely on the words student and Stewart and tokens were only coded auditorily in a binary fashion (retracted vs non-retracted) rather than measured acoustically. The results suggest a strong gender divide with male speakers much more likely to retract in /stj/ relative to female speakers (42% vs 14%), with this stark divide leading to an overall significant difference in the rate of retraction between /stɹ/ and /stj/ among these speakers of New Zealand English.
The behaviour of /s/-retraction in this study is also further complicated by the way in which the process interacts with variable /t/-deletion in the same clusters.
Overall, there is some evidence of retraction in /stj/ across varieties of English. However, the severe lack of large-scale acoustic analyses, particularly compared to the widely-studied retraction in /stɹ/, means that we know very little about its exact synchronic and diachronic behaviour and the extent to which these two environments of retraction behave in a similar fashion.

Methodology
The present study contributes to our understanding of /s/-retraction with a number of methodological strengths, including robust acoustic analysis rather than auditory coding and working with conversational sociolinguistic interview data with a large and balanced sample of a single speech community, including sociodemographic metadata for all speakers. The methods of analysis are detailed further in the following subsections.

Data collection
This study is based on a sample of 118 speakers (61 male, 57 female) who grew up in Manchester from the age of 3 or younger, with at least one local parent, stratified by age, gender, social class and ethnicity. For the purposes of the study, Manchester is defined as the urbanised area within the M60 ring-road motorway, including neighbourhoods immediately south of the M60, such as Sale, Wythenshawe, Northenden, Cheadle and Stockport.
The informants' ages at interview range from 16 to 87. Social class is operationalised in terms of occupational levels as occupation has been shown to be the best single indicator of socio-economic status for the purposes of explaining linguistic variation, both in the US (Labov 2001) and in the UK (Baranowski & Turton 2018). Following Baranowski (2017), there are five occupational levels ranging from lower-working for unskilled workers to upper-middle class for occupations such as university professors and high-level managers and administrators. The assignment to a particular social class is based on the occupational history of a speaker rather than just the last job they held; children are assigned the social class of the parents. The coding of ethnicity is based on speakers' self-identification, with 118 white British, 18 Pakistani and 13 Black Caribbean informants in the wider corpus, though in this study we focus on the white British speakers due to the relatively smaller samples in other groups. A more detailed breakdown by age and social class is given in Table 1.
The informants were recorded during sociolinguistic interviews, conducted mostly between 2011 and 2018 with only a small number of speakers recorded earlier than this. Some informants were recorded in their homes, some at their place of work and others on a university campus.
All were recruited through a "friend of a friend" approach (after Tagliamonte 2006: 21-2). The interviews focused on eliciting narratives of personal experience, which are known to approximate speakers' vernaculars (Labov 1984; Tagliamonte 2006). The interviews were supplemented with two formal elicitations: word-list reading and minimal pairs for a number of vocalic and consonantal contrasts. However, as these elicitation tasks do not contain sufficient tokens of /s/ in the target environments, the results in this paper are based solely on spontaneous speech. The recordings were forced-aligned using FAVE (Rosenfelder et al. 2014), the online Forced-Alignment and Vowel Extraction suite developed at the University of Pennsylvania, in order to produce a time-aligned transcription for efficient and automated extraction of the relevant acoustic measures. This process of acoustic data extraction and processing is described in the following section.

Acoustic analysis
Studies of /s/-retraction almost always characterise the fricative quality using its centroid frequency, commonly referred to as its centre of gravity (CoG), which has been shown to correlate

Results
Although the primary focus of this paper is to establish the status of /s/-retraction in Manchester English and the extent to which /stɹ/, /stj/ and /stʃ/ pattern together, it is first important to establish an overall picture of the wider sibilant space in this community. Figure 1 shows the normalised CoG values for all tokens of underlying /s/ and /ʃ/ split by environment, ordered in the expected direction of retraction magnitude based on earlier work (Baker et al. 2011) with the addition of two contexts: /stj/, which is thus far understudied and here placed next to /stɹ/ for ease of comparison and /stʃ/ (e.g. mischief), which is similarly overlooked in the literature.
The overall picture appears to be quite comparable to that established by Baker et al. (2011) in their study of American English varieties: pre-vocalic /s/ unsurprisingly has the highest CoG and this is followed by /sp, sk, st/ clusters (e.g. spin, skin, sting) which demonstrate a very slight tendency for retraction. Interestingly, further along the hierarchy there is evidence from /spɹ/ and /skɹ/ clusters (e.g. spring, scream) that the presence of /ɹ/ in these clusters can lead to slightly more advanced retraction despite this not being an environment in which /t/-affrication takes place (see also Stuart-Smith et al. 2019 for similar results). However, crucially, it is evident that the /stɹ/ context is set apart from the other /sCɹ/ environments, demonstrating even more extreme retraction of /s/. Figure 1 also provides the first quantitative evidence of community-level change in /stj/ clusters (e.g. student), which appears to be at a similar level to /stɹ/.³ Furthermore, the /stʃ/ context provides an important point of comparison to /stɹ/ and /stj/ as here we see the realisation of /s/ before what is an indisputable underlying affricate rather than the derived affricate we see in the latter two environments. It is interesting to note that /s/-retraction is evident here too and, additionally, is slightly more advanced relative to /stɹ/ and /stj/.
3 Retraction in /stj/ clusters has previously been reported in studies by Warren (2006) and Nichols & Bailey (2018) but these are based on elicited lab speech and neither investigate potential apparent-time change in these clusters.
Averages of the raw frequencies are additionally provided in Table 2 to facilitate comparison with earlier studies in which only non-normalised measures are reported.
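The relationship between raw and normalised values can be sketched as follows. The centre of gravity is computed here, under one common definition, as the power-weighted mean frequency of a spectrum, and normalisation is assumed, for illustration only, to be a z-score over a set of values (for instance one speaker's sibilant tokens). The function names, the toy spectrum and the use of the sample standard deviation are assumptions rather than the procedure used in this study.

```python
import math


def centre_of_gravity(freqs_hz, power):
    """Power-weighted mean frequency (Hz) of a spectrum.

    freqs_hz and power are equal-length sequences giving each
    spectral bin's frequency and its (non-negative) power.
    """
    total = sum(power)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, power)) / total


def z_normalise(values):
    """z-score values against their own mean and sample SD (assumed scheme)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


# Toy spectrum: more energy in the highest bin pulls the centroid upwards.
toy_cog = centre_of_gravity([2000, 4000, 6000], [1, 1, 2])
```

Under a scheme of this kind, a normalised value of 0 corresponds to the mean of the reference set, so raw averages such as those in Table 2 remain necessary for comparison with studies that report frequencies in Hz.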
While there is still a sizeable gap between the centroid frequencies of /s/ in /stɹ/ and /stj/ words compared with the /ʃ/ end-point of this continuum, it is important to note that these values are aggregated over the entire community. However, as reported in Section 2, there is of course substantial evidence from other varieties of English demonstrating that /s/-retraction is an ongoing, or recently-completed, sound change. With this in mind, Figure 2 plots the distribution of CoG values separately for the youngest and oldest cohorts of speakers in the corpus, with speakers in the "youngest" group being born between 1990 and 2001 (aged between 16-25 at time of interview) and speakers in the "oldest" group being born between 1907 and 1949 (aged between 63-87 at time of interview). Overall, there is a great deal of stability in the wider sibilant space whereas a clear and striking change between generations can be found in the /stɹ/ and /stj/ contexts that, for the younger speakers of Manchester English, now partially overlap with the frequency range for underlying /ʃ/.
A more in-depth analysis of the /stɹ/ and /stj/ contexts will be presented later in this section.
Before then, we will briefly explore these potential changes across all environments to lend insight    (Wilbanks 2016), it is also possible that this is simply an artefact of relying on apparent-time data for diagnosing change and that these results arise instead from the physiological effects of ageing, with the phonetic range of this /s/-/ʃ/ contrast diminishing within speakers as they age.
While further work would be needed to tease apart these potential explanations, we will briefly return to this point in Section 5.
Unsurprisingly, the /stɹ/ and /stj/ contexts demonstrate the most striking change with magnitude of this difference being almost identical across the two environments (β = 0.879 and β = 0.889 respectively, p < 0.001 for both). It should be noted that while no significant change was found in /stʃ/ in this particular model, the estimate was the next largest in size (β = 0.269, p = 0.513) and the lack of statistical significance is likely attributed to the small number of observations of this context, particularly among the smaller cohort of older speakers.
Nevertheless, it is interesting to note that /stʃ/ words already demonstrated a more retracted /s/ segment even for these oldest speakers born in the first half of the twentieth century, and therefore a less dramatic change over time, another point that will be considered in Section 5.
Taking a closer look now at the three main sources of /s/-retraction, Figure 3 shows the change discussed thus far in a more fine-grained manner through the use of birth year rather than binned age groups. The wider fricative space among younger speakers is again evident based on the distance between pre-vocalic /s/ and /ʃ/, but the most crucial finding for the purposes of this study is the striking change observed in /stɹ/, /stj/ and /stʃ/. All three of these contexts appear to change in parallel, once again providing strong evidence that the retraction of /s/ before these affricates (whether underlyingly present or not) is governed by the same process and behaves in a unified manner. In all three cases, there are significant negative correlations between speakers' birth years and mean CoG values: ρ = -0.468 (p < 0.001) for /stɹ/, ρ = -0.487 (p < 0.001) for /stj/ and ρ = -0.343 (p = 0.017) for /stʃ/. Although there is little data before 1925, extrapolation from the observed change also suggests that at the beginning of the twentieth century this process had not yet been initiated and that /s/ was very much [s]-like in these three contexts.
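The correlations between birth year and mean CoG can be illustrated with a short sketch. It assumes that the reported ρ values are Spearman rank correlations (the usual reading of ρ, though this is not stated explicitly above), computed over one mean CoG value per speaker. The helper names and the toy data are hypothetical and are not drawn from the corpus.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy per-speaker data, invented for illustration only.
birth_years = [1930, 1950, 1970, 1990, 2000]
mean_cog_z = [0.9, 0.7, 0.75, 0.2, -0.1]
toy_rho = spearman_rho(birth_years, mean_cog_z)
```

A negative ρ in this setup corresponds to later-born speakers having lower mean CoG, i.e. more retracted /s/, which is the direction of the effect reported for all three environments.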
A set of mixed-effects linear regression models was fitted to the data to explain the variation observed in these three environments, taking into account a wider set of social and language-internal factors. A baseline model was fitted to all /stɹ/, /stj/ and /stʃ/ tokens containing this full range of predictors but excluding environment completely, thus not differentiating between these three groups. This was compared to a model with environment as a predictor, to determine whether or not there is a significant difference between them, and also to a third model in which environment interacts with all of the other predictors, to determine whether or not /stɹ/ and /stj/ behave differently with respect to their conditioning factors. Table 3, based on the baseline model without an environment factor, indicates that there are a number of significant factors involved in the variation in /s/-retraction.

Younger speakers are significantly more retracted than their older counterparts (p < 0.001), confirming the pattern of change in progress illustrated earlier. Retraction is also significantly more advanced for female speakers (p = 0.008) and in word-medial position (p = 0.003). The nature of these effects is not surprising given the widespread nature of female-led change established across decades of sociolinguistic study in similar communities (Labov 2001), as well as previous reports that claim /s/-retraction actually started in word-medial position before spreading to initial positions (Durian 2007; see also Baker et al. 2011; Gylfadottir 2015; Wilbanks 2017 for similar reports of word-medial position leading the change). Retraction is also more advanced in segments of shorter duration (p < 0.001) and marginally so in words of higher token frequency (p = 0.04), which is also to be expected of an assimilatory sound change (see e.g. Bybee 2012). There is already some consideration of the coarticulatory effect of vowels in the existing literature on /s/-retraction, with mixed results. Rutter (2011) discusses the possibility that anticipatory lip-rounding from a following rounded vowel can influence the spectral profile of /s/ and trigger a more extreme retraction, which he observes in words such as strudel, although this is not consistent across all speakers. On the other hand, Durian (2007) and Gylfadottir (2015) find no effect of the following vowel and argue that it is too distant to have a coarticulatory effect on the sibilant in these /stɹ/ clusters. Neither following vowel type (p = 0.391) nor social class is significant, although there is a weak trend within the latter for upper-middle-class speakers (the highest social group within our data) to be more conservative and produce less /s/-retraction compared with working-class speakers.
It is interesting to note that the magnitude of this social class effect is larger than that of the other categorical variables included in the model, even the significant predictors of gender and word position (β = 0.351, cf. -0.235 and -0.159 respectively), suggesting that the lack of statistical significance stems largely from a small sample size: there are only 6 upper-middle-class speakers in the corpus (producing 81 tokens of /stɹ/, /stj/ and /stʃ/), compared with 54 lower-middle-class and 58 working-class speakers.
Crucially, environment is not significant when added to this model (p = 0.243 for /stj/; p = 0.608 for /stɹ/) nor do any of the other predictors change in their behaviour in terms of the direction or significance of their effect on CoG. ANOVA comparison between these nested models also leads to no significant increase in model fit (p = 0.345), with the full details reported in Table 4 suggesting that allowing the model to differentiate between the three environments of interest leads to barely any increase in explanatory power. A further comparison with the third model, which includes an interaction between environment and all other factors, is also reported in this table but this also leads to no significant improvement in the statistical modelling (p = 0.437).

Table 3: Output of the mixed-effects linear regression model; random intercepts for speaker and word (reference levels for dummy-coded categorical variables given in square brackets). Columns: Fixed effects, Estimate, Std. Error, t-value.
Although the environment-sensitive models explain slightly more of the observed variation in the dataset (indicated by the deviance values), the baseline model in which /stɹ/, /stj/ and /stʃ/ are treated as a single group would win in a model selection process based on AIC and BIC, which evaluate the trade-off between goodness of fit and model simplicity. In other words, the slightly reduced "information loss" in the more complex models is not large enough to warrant the increased complexity in the model structure. As such, the results here suggest strongly that there are no significant differences in the behaviour of /s/-retraction across any of these three contexts under study.
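The logic of this model selection step can be sketched in a few lines. The example below is an illustration under stated assumptions, not the analysis reported here: the log-likelihoods, parameter counts and token count are invented, the function names are hypothetical, and the likelihood-ratio p-value uses a closed form for the chi-square upper tail that is only valid for an even number of degrees of freedom (adding a three-level environment factor costs two).

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_lik


def lrt_p_value_even_df(log_lik_small, log_lik_big, df):
    """Likelihood-ratio test p-value for nested models, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half_stat = max(0.0, log_lik_big - log_lik_small)  # (2 * delta lnL) / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Invented values for illustration; NOT the figures from Table 4.
N_TOKENS = 2000
baseline = {"log_lik": -1000.0, "k": 10}   # no environment predictor
with_env = {"log_lik": -998.9, "k": 12}    # + environment (3 levels, 2 df)

aic_base = aic(baseline["log_lik"], baseline["k"])
aic_env = aic(with_env["log_lik"], with_env["k"])
bic_base = bic(baseline["log_lik"], baseline["k"], N_TOKENS)
bic_env = bic(with_env["log_lik"], with_env["k"], N_TOKENS)
p_env = lrt_p_value_even_df(baseline["log_lik"], with_env["log_lik"], df=2)
```

In this toy case the richer model has the higher log-likelihood (lower deviance), yet both criteria prefer the baseline because the gain in fit does not offset the penalty for two extra parameters, which is the same pattern described above.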
Though we do not provide robust quantitative evidence here of the status of /t/-affrication (a thorough investigation of this lies beyond the scope of the current paper), we do see clear evidence of it in our data. To illustrate this, in Figure 4, we provide representative examples of /t/-affrication in word-initial /stɹ/ and /stj/ clusters. A future study of /t/-affrication in Manchester English is certainly warranted, both as an independent instance of change and to potentially better inform our understanding of /s/-retraction, a point we will return to in the following discussion of these results.

Discussion
The results here point to a number of interesting properties of /s/-retraction in Manchester English, including not only its synchronic behaviour but also its trajectory of change over the past century. At the start of this paper we set out three major research questions concerning (i) the presence of an ongoing change in /s/-retraction in Manchester English, (ii) the behaviour of /stj/ and /stʃ/ clusters specifically and the extent to which these pattern like /stɹ/ and (iii) the underlying mechanisms that motivate the change in these environments.
The results first provide strong evidence that Manchester English, like many varieties across the English-speaking world, is currently involved in an ongoing sound change of /s/-retraction.
The speakers in this corpus cover birth years ranging from 1907 to 2001 (although most speakers are born after 1925) and, in this time frame, we observe a notable change with the average centroid frequency of /s/ in retracting environments starting out at levels closer to a baseline prevocalic /s/ and ending up at a level that sees significant overlap with /ʃ/. This would suggest that the change is at a relatively advanced stage, nearing completion in this community, assuming of course that a speaker's own target for /ʃ/ is a natural endpoint for the change in this sibilant. This would be a sensible assumption in a framework that sees sound change progressing from phonetic rules that involve gradient coarticulatory pressures to a stabilised phonological rule in which /s/ is categorically changed to a discrete /ʃ/ target as part of the phonological derivation (see e.g. Bermúdez-Otero & Trousdale 2012 on the life cycle of phonological processes). Empirically, this might be reflected by a unimodal distribution in which there is no significant differentiation between the centroid frequencies of /s/ (in these retracting environments) and /ʃ/.

On the triggering mechanisms of /s/-retraction
This paper is the first to report a comparable apparent-time change also occurring in the /stj/ (e.g. student) and /stʃ/ (e.g. mischief) environments. Crucially, these three contexts appear to behave as one with respect to their current rate of retraction and there are no significant differences between their rates of change in this time period either. In Section 2, we outlined the two major proposals that have been put forward concerning the mechanisms behind /s/-retraction, specifically whether it is caused directly through a long-distance effect of /ɹ/ (see e.g. Shapiro 1995; Baker et al. 2011) or alternatively through contact with an adjacent affricated /t/ (see e.g. Lawrence 2000). Returning now to this question, the new evidence presented here lends strong support to the latter proposal. While it is not impossible that retraction is triggered by three distinct mechanisms in /stɹ/, /stj/ and /stʃ/, the fact that their behaviour is so similar and that they change in tandem suggests that we should instead appeal to a single unifying explanation that invokes affrication as the cause of retraction.
Of course, under a frequentist statistics framework centred around null hypothesis significance testing, it is perfectly possible that the lack of significant difference between these three environments is in part due to issues with sample size and statistical power. Indeed, while the three environments are broadly similar among the youngest speakers, visual inspection of the frequency distributions from Figure 2 indicated that /stʃ/ was perhaps slightly ahead of /stɹ/ and /stj/ at an earlier point of the change. While it would be difficult to account for a difference in the opposite direction, there is in fact an obvious reason why at one point /stʃ/ may have been ahead of the other two environments in this change: the affrication of /t/ before /ɹ/ and /j/ was (and to some extent still is) variable and, as a result, the change in these environments could be underestimated due to the inclusion of older speakers who may not actually take part in this /t/-affrication process at all. The change will naturally appear to be further ahead in /stʃ/ than in /stɹ/ and /stj/ while affrication remains optional in the latter two compared to its obligatory presence in the former, at least for older speakers who acquired the dialect at an earlier stage while affrication in /tɹ/ and /tj/ was still an ongoing change. The inclusion of "underlying" /stʃ/ words such as mischief in this analysis brings with it an obvious benefit: it allows us to isolate the effects of /s/-retraction from the interacting changes affecting the adjacent clusters, since these words have, and have always had, an affricate present underlyingly. A follow-up study should be conducted with a specific focus on apparent-time change in /tɹ/-affrication and /tj/-coalescence.
This would shed light on whether these two changes have been taking place alongside /s/-retraction (such that the three sound changes have been working in tandem) or whether they have in fact been stable for a long time in this speech community, laying the foundations for /s/-retraction from an early stage.
It is important to note that the results may not necessarily be generalisable to all instances in which /s/-retraction has developed and propagated through a speech community. While this change appears to have been initiated in a number of varieties at a roughly similar time, it is possible that they are all independently triggered by a range of different mechanisms. This is especially pertinent given the fact that most varieties of American English are yod-dropping: given the absence of /j/, they demonstrate no comparable affrication or retraction in /stj/ words like student and stupid and, as such, the distribution of retracted /s/ tokens is quite different. Children acquiring the language are exposed to fewer instances of retracted /s/ before an affricate and therefore might not receive the kind of input required to reanalyse the segmental conditions of /s/-retraction in this way. However, even setting aside our data for a moment, there are still questions that can be raised regarding the proposal put forward by Baker et al. (2011). Most notably, if retraction is caused directly by assimilation in tongue shape to /ɹ/, why is retraction consistently so much further advanced in /stɹ/ relative to /spɹ/ and /skɹ/ in all varieties of English affected by this change? This is particularly surprising for /spɹ/, where the intervening consonant has a labial place of articulation and as such the tongue has no distinct target to hit between the /s/ and /ɹ/. In a study of /s/-retraction in Philadelphia English, Gylfadottir (2015) raises a similar concern and in doing so also argues against this being a case of "assimilation at a distance" to the non-local /ɹ/ in these clusters.
This question could be further illuminated with articulatory data on this covert variability in rhotic production in different segmental environments and across a range of speech communities but, in the absence of this, it remains the case that an explanation in terms of adjacency to /tʃ/ more closely matches the data.

On the origins of /s/-retraction
Alongside the disagreement over why /s/-retraction takes place in these environments, there are also conflicting reports over the origins of /s/-retraction, specifically with respect to the initiation of change in /stɹ/ relative to the other environments that demonstrate some low-level retraction. Janda & Joseph (2003) argue that retraction started in /stɹ/, where today it is registered most strongly, and later "spread" to the other contexts such as /sp, st, sk/. While this kind of rule generalisation is entirely possible, in which the target environment of the change is reanalysed to encompass a wider range of segmental contexts, this claim is based only on the fact that retraction is "sporadic" and less advanced in these contexts. In other words, it is based on neither real-time nor apparent-time data from which change can be observed or inferred. As such, there are obvious limitations to relying on these differences in effect magnitude from synchronic data as a tool for estimating the initiation of change.
Instead, it is entirely plausible that retraction began in all complex onset clusters simultaneously, before advancing at a faster rate of change in /stɹ/ (and /stj/, as demonstrated in this paper for Manchester English), leading to the clear separation of contexts we observe in many varieties of English spoken today. Indeed, this has been suggested by Stevens & Harrington (2016) for Australian English, in which low-level retraction is observed in /sp, st, sk/ and even /stɹ/. This has been described as the "phonetic pre-conditions" for the more advanced retraction in /stɹ/ that has developed in most varieties of English, suggesting a trajectory of change that sees retraction starting off in all complex onset clusters first before advancing more rapidly in /stɹ/.
In our data, we observe that /st/ is significantly more retracted than pre-vocalic /s/ but that it actually shows very little change over time. Moreover, the minor change we do find is actually in the opposite direction, with younger speakers producing more /s/-like tokens with a higher CoG in /st/ relative to older speakers. This would not be the direction of change we expect if retraction in /st/ represented a later stage of the wider retraction process after spreading to more contexts.

On variation and change in the wider fricative space
This increase in the CoG of /st/ in apparent time mirrors the change observed in our data for pre-vocalic /s/, which is also produced with a higher CoG, and therefore a "hissier" quality, among younger speakers. Conversely, these same speakers produce a "hushier" /ʃ/ with a small but significant decrease in its CoG in apparent time. Taken together, these results illustrate an apparent expansion of the sibilant space over time, leading to a greater acoustic contrast between /s/ and /ʃ/ for younger speakers. Interestingly, a similar result appears in work conducted by Wilbanks (2017) on the variety of American English spoken in Raleigh, NC, although there it is restricted to male speakers.
There are two possible interpretations of this finding: (i) that it represents a change in progress involving more distinctive productions of /s/ and /ʃ/ targets and therefore an expansion of the fricative space over time or (ii) that it actually represents the physiological effects of ageing and a reduction in the acoustic /s/-/ʃ/ contrast within speakers' own productions as they age. There are also two possible pathways by which the latter might occur, both of which find support in previous literature (Matthies et al. 1994; Perkell et al. 2004; Koch & Janse 2015). An articulation-centric explanation might foreground the effects of reduced motor control in old age and by consequence the greater difficulty in producing the precise articulations of /s/ and /ʃ/, which already involve a combination of gestures such as tongue grooving and lip rounding in addition to the midsagittal tongue shape (Rutter 2011). An alternative explanation is routed through perception and the feedback loop: speakers with hearing loss, which disproportionately affects older adults (Bowl & Dawson 2019), would suffer reduced auditory feedback, which may in turn affect their own production of these segments. This would likely be registered most strongly in sibilant production, and particularly /s/ itself, since hearing loss primarily affects the higher frequency range where more of the energy in /s/ is concentrated.
Further work should be conducted in other speech communities to establish how geographically widespread this phenomenon is beyond the disparate locales of Manchester and North Carolina (Wilbanks 2017) and also to shed more light on the exact status of this change. The difficulty in identifying its direction (a community-level increase in sibilant contrast over time or an individual-level decrease as speakers age) is a natural limitation of relying on apparent-time data to infer language change. As such, future work would ideally draw upon a longitudinal or real-time approach in order to tease apart the various temporal dimensions of birth year, age at interview and time of interview (see Fruehwald 2017).
Regardless of the direction of change, the fact that pre-vocalic /s/ and /ʃ/ are not themselves stable highlights the importance of interpreting the degree of change in /s/-retraction with respect to the wider sibilant space itself rather than in isolation.

Conclusion and thoughts for future work
In this paper, we have provided evidence of /s/-retraction in Manchester English, marking the first time this variable has been studied in detail within a single British English speech community.
In doing so, we identify significant change in apparent time not only in the widely-studied /stɹ/ context but also in /stj/ and /stʃ/ words, all of which now approximate /ʃ/, suggesting that the changes are near completion. These latter two contexts had not previously been tracked in apparent time within a speech community, but all three cases of retraction appear to be changing in parallel and are also comparable synchronically in terms of their absolute rates of retraction in the present day. Our results therefore speak to ongoing questions regarding the triggering mechanisms of this process, which focus either on the role of /ɹ/ in a case of long-distance assimilation or on the role of the adjacent affricate. Given this new evidence of parallel change in /stj/ and /stʃ/, in which /s/ appears adjacent to an affricate but in the absence of /ɹ/, we argue that a "retraction by affrication" explanation more accurately captures the relevant segmental environments involved in this process. These results are of course based on a single speech community and so we encourage similar work to be conducted on this full range of contexts in other varieties of English.
While the primary focus of this investigation concerned the triggering mechanisms of retraction, we also report a change in the wider sibilant space that sees a more acoustically distinct /s/-/ʃ/ contrast among younger speakers. This has been reported independently in other varieties of English but there is not yet a consensus on whether this constitutes change in progress or simply a case of physiologically-motivated age-graded variation with speakers producing less extreme articulations of /s/ and /ʃ/ as they age. Future work is planned to lend further insight into this question and to help tease apart these two opposing vectors of change.
Future work is also needed to provide a more focused and dedicated investigation of variation and change in /t/-affrication in the same community, in the vein of Magloughlin (2018), as well as /tj/-coalescence. This will shed light on how these processes have developed together and the patterns of co-variation they exhibit not just with each other but also with /s/-retraction itself.
In addition to this line of inquiry, it would be fruitful to incorporate lab speech to complement the existing sociophonetic studies of /s/-retraction. While conversational data is excellent for sociolinguistic purposes, it would be beneficial to analyse controlled elicitations in order to conduct a more phonetically-detailed analysis including dynamic spectral measures to track retraction across the duration of individual /s/ tokens. Not only would this provide a closer look at the exact phonetic realisation and the coarticulatory nature of this change in relation to adjacent sounds, it would also open up the opportunity to analyse rates of retraction across various morphological, syntactic and prosodic boundaries to better understand how the change may interact with other elements of the grammar.⁶
On a general note, the results presented here illustrate the importance of tracking /s/-retraction (and any sound change for that matter) in a more holistic way, which in this case involves considering its implementation across a range of segmental contexts and its status within the wider sibilant space.
⁶ Another advantage of looking at further environments is that it will open up analysis of retraction and affrication involving additional segments, e.g. /z, d, dʒ/, which may take part in similar processes to /s, t, tʃ/ but in more limited contexts, such as across word boundaries, e.g. these jars and his drink.