“My Vocal Cords are Made of Tweed”: Style-Shifting as Speaker Design

Intraspeaker variation is evaluated in terms of speaker design in a number of studies (Coupland 1985, Schilling-Estes 1998, Podesva 2008). This study explores possible motives for variation from a speaker design perspective through the analysis of three phonetic variables with differing social status. The variables occur in the speech of Stephen Fry, an intellectual whose public identity is closely linked with his Received Pronunciation (RP) speech. Fry uses more non-standard forms in contexts where his identity is more directly relevant, suggesting his desire to “accentuate the positive and eliminate the negative” associations of the RP register (Meyerhoff 2011:28). However, not all the data fit this pattern, demonstrating the need for a broad model of speaker design incorporating multiple motives for style-shifting. It is proposed that the use of linguistic variables with differing social evaluation can give insight into the prioritisation of speaker motives in future speaker-centred studies.

Melissa Geere, Joy Everett, and Alasdair MacLeod

Introduction
Several studies of intraspeaker variation describe how individuals may use style-shifting as an active resource in the dynamic creation of an identity (Coupland 1985, Schilling-Estes 1998, Podesva 2008). These studies come under the theoretical umbrella of speaker design theory (Schilling-Estes 2002). Meyerhoff considers possible motivations for style-shifting under the speaker design approach, among them the desire to "accentuate the positive and eliminate the negative" (2011:28). The present study investigates how such a motivation may influence one individual's speech style across contexts involving a greater or lesser degree of personal involvement.
The speaker is Stephen Fry, a respected intellectual in the mass media spotlight. This study contrasts Fry's speech as narrator of the Harry Potter audiobooks, where his personal identity is somewhat irrelevant, with his speech as presenter of his own podcasts, where his identity is the direct topic of conversation. Three variables are investigated within Received Pronunciation (RP). Though perceived as neutral, RP is a variety of British English which has become enregistered (Agha 2003), and is rich in layers of social meaning, as demonstrated by the phonological phenomena chosen for analysis: the strongly stigmatised 'g-dropping' ([ɪŋ]~[ɪn]), the relatively unstigmatised reduction of unstressed medial syllables, and the conservative and highly prestigious glide cluster retention ([hw]~[w], following Minkova 2004). Each variable has a different social status and shows a different usage pattern, giving insight into how speakers might prioritise their many style-shifting motives. The results suggest that Fry uses fewer standard forms in his podcasts, where his identity is more directly relevant, and more standard forms in the audiobooks, where his identity is less relevant. Thus he is able to avoid negative associations attached to RP in the more personal context and capitalise on its positive associations in the less personal context. The concepts of attention paid to speech and covert prestige help explain some apparent contradictions in the data, demonstrating the need for multi-faceted accounts of style-shifting that recognise a speaker's complex and ever-changing motivations for style-shifting.

Theories of Style-Shifting
Speaker design theory arose in response to a perceived inadequacy in existing theories of style-shifting, such as Labov's (1966) seminal theory that attention paid to speech is a key predictor of standardness of style, and Bell's (1984) audience design theory, in which audience is the key predictor. These were criticised for characterising speakers as being too passive. By contrast, the speaker design approach attempts to capture an individual's conscious and unconscious internal motivations for style-shifting. Following Austin's (1962) theory that performative speech acts are utterances which perform actions in the world, LePage and Tabouret-Keller (1985) stated more specifically that speech performs acts of identity, and that a speaker actively selects styles "so as to resemble those of the group or groups with which from time to time he wishes to be identified, or to be unlike those from whom he wishes to be distinguished" (1985:181). While their account is similar to Giles's (1973) earlier ideas of convergence and divergence in accommodation theory, LePage and Tabouret-Keller placed an emphasis on the speaker's own agency in style-shifting.
Later, Schilling-Estes challenged audience design in her 1998 study of Ocracoke English. The speaker she focused on, Rex O'Neal, shifted into performance speech, an exaggeratedly non-standard version of his own variety, when reminded that he was under linguistic analysis. Schilling-Estes argued that "[p]erformance speech does not fit neatly into models that view style-shifting as a primarily reactive phenomenon" (1998:77). She argued that shifts are motivated by role-changing within the conversation, not by the audience, as Bell (1984) had proposed. Rex's style-shifting also demonstrated that non-standard speech was no less a 'performance' than standard speech, implying that there is no single 'default' variety for a speaker. Instead, all utterances may be conceived of as stylistic choices.
Numerous other studies have explored speakers' motivations for style-shifting as a complex phenomenon which, rather than being motivated solely by audience, is better explained as speakers designing an identity with reference to present or absent speech communities (some examples include Trudgill [1983], Podesva [2008], and Beal [2009]). A common theme in these studies is the way speakers shift style to, as Meyerhoff (2011:28) puts it, "accentuate the positive" or "eliminate the negative" in a given context.
What is considered positive or negative depends on context. For example, Labov (1966) identified that nonstandard forms, usually socially stigmatised, carry covert prestige in certain situations. Coupland (1985) describes how a Cardiff radio DJ employs more standard forms at technical moments in his show, accentuating his competence, and more non-standard forms when accentuating his solidarity with listeners. Thus he is able to construct a balanced identity, referencing the positive associations of both varieties, but avoiding the negative by shifting as the context changes.
The present study, from the perspective of speaker design theory, attempts to explore further the idea that speakers style-shift in order to accentuate the positive and eliminate the negative. It is thought that a speaker may select a different style in a context where their personal identity is foremost than they would in a context where their identity is not directly relevant.

Shifting within an Enregistered Standard
When investigating positive and negative associations, Received Pronunciation (RP) is a particularly apt choice of variety. Agha (2003) discusses how RP, originally the spoken variety of Southern Standard British English (SSBE), has become imbued with cultural meaning, leading to its enregisterment as an internationally recognised standard, no longer tied exclusively to Southern Britain. Agha (2003:233 fn.1) notes that RP is "preeminent in public life due to its social prestige, its links to education and economic advancement".
Yet Wells (1994), Milroy (2001), and Roach (2004) all argue that both the status and nature of RP are changing. With the declining importance of social class in British society, it no longer carries the prestige it once had: for example, the BBC no longer broadcasts exclusively in RP, nor is it the prescriptive norm for public schools. Additionally, phonological change has been observed over the last century as RP has come under the influence of other varieties such as Cockney (Wells 1994). Even the most canonical RP speaker, Queen Elizabeth II, showed diachronic vowel change in a study by Harrington et al. (2000). Between the 1950s and the 1980s, her vowels became closer to those of a younger generation of SSBE speakers. These changes in phonology and social status suggest that the social meaning of RP is currently in a state of flux.
A study by Giles (1970, cited in Wells 1982b:30) found that RP speakers are considered more intelligent and self-confident, but also less serious, less good-natured, and possessing less of a sense of humour than non-RP speakers. Other cultural associations are clear from RP's other names: 'Public School Pronunciation', 'the Queen's English' or 'BBC English'. Agha notes that "[s]uch labels personify speech by linking sound patterns to attributes of speakers" (2003:234). These attributes are positive (educated, genteel, good diction) as well as negative (posh, elitist, out of touch with reality). Although RP is progressively less used in the contexts which gave rise to these labels, the propagation of these values continues by means of what Agha (2003) calls a 'speech chain', disseminating beliefs throughout a community. Wells (1982b:279) notes that there is variation within RP, despite its reputation as a standard. He proposes a distinction between "U-RP", more conservative and especially associated with the upper class, and a more general "Mainstream RP" which is closer to SSBE norms. The possibility for variation and the community beliefs surrounding RP make it fruitful ground for style-shifting research. The variables in this study were particularly chosen for their differing degrees of recognition and prestige in the RP speech community, from the notorious phenomenon of 'g-dropping' ([ɪŋ]~[ɪn]) to the relatively neutral and overlooked reduction of unstressed syllables, to the highly self-conscious and esteemed retention of a [hw]~[w] contrast. RP contains internal variation about which the speech community holds a wide range of beliefs, making it ideal for intraspeaker variation research. Following a speaker design approach, this study contrasts a context where the RP speaker's personal identity is particularly salient (a podcast) with one where it is not at the forefront (the narration of a fictional audiobook), in order to investigate the speaker's use of style-shifting to accentuate positive associations and eliminate negative ones.

Choice of Speaker
Stephen Fry is an actor, broadcaster and journalist from Norfolk, England, with a career spanning three decades at the time of writing. He was educated in public schools and at Cambridge University. His influence is evidenced by what has been described as "the Fry effect" (Chishick 2011): his mentions of other content on Twitter or elsewhere cause dramatic increases in web traffic and book sales. His voice alone has its own Facebook page with 748 likes (as of August 31st, 2014), and his RP accent reinforces his persona. One blogger describes his public image as "an eccentric English boffin, reassuringly upper-class but never snootily posh, a loveable professor" (Stekelman 2010). These words describe several of the social indexicalities of an RP accent, both positive and negative, relating to education, class, and personality.
Fry himself asserts "[m]y vocal cords are made of tweed. I give off an air of Oxford donnishness and old BBC wirelesses" (Fry 2004). This statement implies that he classes his own variety somewhere between Wells's conservative U-RP: "the popular image of an elderly Oxford Don" (1982b:280), and mainstream RP: "typified by the pronunciation adopted by the BBC" (Gimson 1980, cited in Wells 1982b:279-80). As will be seen, Fry is competent in both varieties, and is capable of manipulating their usage as a stylistic device.

Corpus Design
Fry's style was compared across two contexts: his narration of the audiobook Harry Potter and the Deathly Hallows (Rowling 2007) and his Podgrams, podcasts in which he comments on various cultural and personal topics (Fry 2008a, 2008b). Several hours of high-quality recordings were available. Both contexts were recorded at a similar time, minimising diachronic variation, and both required careful production aimed at an absent audience.
The two contexts were differentiated on the basis of "personal involvement". This may be considered a measure of the relevance of Fry's personal identity to the context. Fry himself wrote the podcasts about his real-life experiences and opinions, so his identity is more salient and he is more personally involved than in the audiobook context, where the topic is fictional and the words scripted by another author. It was posited that these differences would prompt a style-shift towards or away from conservative U-RP.
Under speaker design theory, the direction of the shift would depend on Fry's communicative motives. Agha (2003:233) writes that "RP is a supra-local accent [...] it is valued precisely for effacing the geographic origins of speaker". We might then expect Fry's usage to be more standard in the audiobook recordings, where his geographic origins are not directly relevant. Equally, as Coupland (1985) found, the standard can index competence, which may be wanted when Fry is interpreting the words of another author. Coupland conversely found that less standard forms index solidarity with other non-standard speakers. This may be a stronger motive for Fry in the podcast context, where he is more personally involved. Fry is publicly known to be linguistically anti-prescriptivist (indeed this is the topic of one of his podcasts in this study), and therefore the use of nonstandard features may for him symbolise solidarity, so as to identify more with the mainstream community of RP speakers.
Two files from each context were analysed (see Table 1 below). Podcasts were obtained from iTunes (via Fry 2012). The mp3s were converted to .wav format compatible with Praat (Boersma and Weenink 2011), and all character dialogue was removed from the audiobooks using Audacity (Audacity Team 2006), so that only Fry's performance as narrator was examined.

Variables
This study measured conformity to the standard of what Wells (1982b:285) calls "speech-conscious", conservative U-RP according to three variables (see Table 2).
Where relevant, additional effects were coded, such as part of speech (following Labov 1989) and word frequency (following Bybee 2002). The frequency rank in the British National Corpus for the word containing the variable was retrieved (Harris 2003). Words not occurring in the corpus were coded as "very rare". Because frequency ranks are exponential, they were converted to a Standard Frequency Index (SFI, following Carroll 1970:65) using the formula (where p = frequency rank):
SFI = 10(log₁₀ p + 10)
We also coded the context as high or low "drama", as Labov found high drama topics can affect a style-shift at utterance level (cf. Labov's 1972 'Danger of Death' question). High drama utterances contained at least one intense action lexeme (e.g., "shouted", "frantically"), and were directly preceded or followed by at least one other utterance containing such a lexeme. All other utterances were considered low drama.
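The frequency conversion and the drama coding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function names are hypothetical, the lexeme set contains just the two examples quoted in the text, and the tokenisation is deliberately naive.

```python
import math
from typing import List, Optional

# Illustrative lexicon: only "shouted" and "frantically" come from the text;
# a full list of intense action lexemes would be needed in practice.
INTENSE_LEXEMES = {"shouted", "frantically"}


def sfi(rank: Optional[int]) -> Optional[float]:
    """Standard Frequency Index as stated in the text: 10 * (log10(p) + 10)."""
    if rank is None:  # word absent from the corpus ("very rare")
        return None
    return 10 * (math.log10(rank) + 10)


def has_intense_lexeme(utterance: str) -> bool:
    """True if any word of the utterance is in the intense-lexeme set."""
    words = {w.strip(".,;:!?\"'").lower() for w in utterance.split()}
    return bool(words & INTENSE_LEXEMES)


def code_drama(utterances: List[str]) -> List[str]:
    """Label an utterance 'high' only if it contains an intense lexeme AND
    the directly preceding or following utterance also contains one."""
    flags = [has_intense_lexeme(u) for u in utterances]
    labels = []
    for i, flag in enumerate(flags):
        before = i > 0 and flags[i - 1]
        after = i + 1 < len(flags) and flags[i + 1]
        labels.append("high" if flag and (before or after) else "low")
    return labels
```

Note that an isolated utterance with an intense lexeme is still coded as low drama, because the rule requires an adjacent utterance that also contains one.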

G-Dropping
G-dropping has many social predictors (see Hazen 2006 for a comprehensive review). Wells (1982a:262) states that the velar variant is nowadays preferred in conservative RP, so this variant was coded as standard. The most common non-standard variant [ɪn] regularly attracts criticism from prescriptivist commentators.
Monomorphemic tokens (e.g., 'ceiling') were excluded following Hazen (2006:583). Coding was based on auditory impressions, with ambiguous cases being diagnosed as [ɪŋ] by a velar 'pinch' in the spectrogram. Minority variants such as [ɪm] were coded as non-standard. We also coded the following phonological environment (after Houston 1985) as [± coronal] on the basis that [ɪŋ] and [ɪn] themselves are distinguished in this way.

Reduction of Unstressed Syllables
The envelope of variation was the syllabic peak of non-final unstressed syllables following the word's primary stressed syllable. Possible variants were the two unstressed vowels of RP, [ɪ] and [ə], syllabified sonorants [m̩], [n̩], [r̩] or [l̩], or total vowel deletion, coded "0". Coding was based on auditory impressions, with ambiguous cases categorised as "0" if the preceding sound transitioned to the following sound without clear intervening formants in the spectrogram. [ɪ] and [ə] were coded as "standard" while "0" and syllabified sonorants were "non-standard". We also controlled for voicing and manner of articulation of both preceding and following environment, dividing manner into [± sonorant] with the expectation that a voiced or sonorant environment would favour reduction (following Murray 1997).

Glide Cluster Retention
This variation has been called [hw]~[w] or [ʍ]~[w] in the literature. Following Minkova (2004), this study treats both as the same phenomenon. According to Wells (1994:5), the conservative variant [hw] "is part of perhaps most speakers' ideal of 'good' pronunciation; but it is not part of the actual usage of most real-life RP speakers". Wells (1982b:285) and Milroy (2004:51) mention that [hw] is a highly self-conscious form, taught prescriptively at public schools such as the one Fry attended, with a high level of awareness in the speech community.
All words containing orthographic <wh> were coded. Coding was based on auditory impressions, with ambiguous cases categorised as [hw] if there was clear voicing or aspiration in the spectrogram which could not be attributed to overlapping phones. Both word-initial and compound-medial tokens (e.g., 'anywhere') were included. In some words containing a following back rounded vowel (e.g., 'who'), <wh> is pronounced [h] due to assimilation of the approximant to the vowel (Minkova 2004:33). These words were excluded from analysis.
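The token selection rule for this variable could be approximated as a simple orthographic filter. The sketch below is an assumption-laden illustration: the exclusion list is hypothetical (only 'who' is named in the text), and a purely orthographic search will also catch forms that a human coder would reject, so it is no substitute for manual checking.

```python
import re
from typing import List

# Hypothetical exclusion list of words in which <wh> is pronounced [h];
# only "who" is given as an example in the text.
H_PRONOUNCED = {"who", "whom", "whose", "whole", "wholly", "whoever"}


def wh_tokens(text: str) -> List[str]:
    """Return words containing orthographic <wh>, word-initially or
    medially, excluding words in which <wh> is realised as [h]."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if "wh" in w and w not in H_PRONOUNCED]
```

For example, `wh_tokens("Who saw the white owl anywhere?")` keeps 'white' and 'anywhere' but drops 'who'.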

Motivations for Variables
The variables were chosen to reflect the range of positive and negative beliefs held about RP by the speech community. Fry's differential use across the two contexts of strongly stigmatised g-dropping, unstigmatised reduction of unstressed medial syllables, and the conservative but prestigious retention of glide clusters will help build a more detailed picture of the identity he wishes to portray. For example, in the audiobook context, where it is hypothesised that Fry will use more standard forms, g-dropping may carry more negative social meaning than glide cluster reduction, whose non-standard form [w] is not so stigmatised. However, the stigmatised variant of g-dropping may carry covert prestige in the podcast context, where Fry is more personally involved and so perhaps more motivated to stress solidarity with mainstream RP speakers. We may not see as much differentiation between contexts of unstressed syllable reduction, which is not generally stigmatised.

Coding
All utterances containing a token of one of the variables were transcribed orthographically and tokens were coded manually in Praat (Boersma and Weenink 2011). There were three coders (two of whom were native RP speakers) who coded for one variable each, and 25 tokens of each file for each variable were cross-checked to monitor accuracy. TextGrid files were converted to tab-delimited text files using ELAN (Max Planck Institute for Psycholinguistics 2011, Lausberg and Sloetjes 2009). All tokens were coded as "standard" or "nonstandard". Where relevant, logistic regression tests were run using Rbrul (Johnson 2009).

Whole Corpus
Results for the entire corpus for each variable are given in Figure 1 (below). There were only two tokens of the standard [hw]. For g-dropping, there were 17 tokens of non-standard [ɪn]. Reduction is more variable, with the non-standard reduced vowels occurring 31% of the time. Excluding glide cluster retention because of its limited tokens, the distribution of standard/non-standard forms for the other two variables across the four files can be seen in Figure 2 (next page). Standardness is lower in the podcasts than the audiobooks. The audiobook chapter "Shell Cottage" is the most standard file. The podcast "Language" patterns more closely to the audiobooks than it does to the other podcast, "Broken Arm", with an 18% difference in standardness between the two podcasts.
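Per-file standardness of the kind reported here can be derived directly from the binary token codes. The following is a minimal sketch, assuming each token is available as a (file, code) pair; the function name and the pair layout are hypothetical, and no figures from the study are reproduced.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def standardness_by_file(tokens: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Proportion of tokens coded 'standard' in each file.

    Each token is a (file_name, code) pair, where code is either
    'standard' or 'nonstandard'.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for file_name, code in tokens:
        counts[file_name][code] += 1
    return {
        name: c["standard"] / (c["standard"] + c["nonstandard"])
        for name, c in counts.items()
    }
```

The difference between two files is then simply the difference between their proportions, which is how a gap such as the one between the two podcasts would be expressed.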

Reduction of Unstressed Syllables
Table 3 (below) shows p values for all significant predictors of unstressed syllable reduction in the logistic regression model. The following predictors were included in the model: file, manner of articulation of preceding and following phone, word frequency, part of speech, and high/low drama. File is the second most significant predictor after manner of articulation of following phone. All factors were significant except high/low drama. Nagelkerke's R² showed that 49% of the variation could be explained by these predictors. With 14 degrees of freedom, an intercept of -4.123 and a deviance of 311, this was the best step up/step down model using our predictors.
Table 3: P-values for significant predictors of syllable reduction (in order of significance).
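For readers unfamiliar with the measure, Nagelkerke's R² rescales the Cox and Snell pseudo-R² so that its maximum is 1. The sketch below shows the standard definition in terms of the log-likelihoods of the null and fitted models; it is not a reconstruction of the Rbrul output, and the function and parameter names are assumptions made for illustration.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's pseudo-R² for a logistic regression.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of observations (tokens)
    """
    # Cox and Snell R²: 1 - (L_null / L_model) ** (2 / n), in log form.
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell R² can reach for this null model.
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell / max_cox_snell
```

Because a model's deviance equals -2 times its log-likelihood, the same quantity can be computed from null and residual deviances when those are what the software reports.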

Table 4 (below) shows a pattern similar to the overall pattern in Figure 2. The log odds show that standardness is lower in the podcasts than in the audiobooks, with "Shell Cottage" as the most standard context, and "Broken Arm" as the least. "Broken Arm" is the only context which favours unstressed syllable reduction.

Results for Other Variables
The data for the podcast file "Language" for g-dropping do not pattern with the data trend shown above. "Language" contains no non-standard tokens of g-dropping at all, while all the other contexts do. As for glide cluster retention, there were only two tokens of the conservative [hw] variant, both in the audiobook file "Shell Cottage", and both in the word "white". Other tokens of "white" in the corpus did not contain the [hw] variant, ruling out frequency and phonological effects.

Analysis
The results for unstressed syllable reduction and g-dropping (see Figure 2) indicate that standardness is higher in the audiobooks than in the podcasts (apart from the anomalous lack of g-dropping in the podcast "Language", which will be discussed below). Clearly Fry is competent in the use of both standard and non-standard forms.
Mainstream RP contains frequent use of non-standard forms, but is standard enough that it would be an acceptable situational norm in both podcast and audiobook contexts. The significant pattern in the observed variation can perhaps instead be explained by the social meaning of conservative U-RP versus less standard mainstream RP. Coupland (1985) posited that one motive for use of the standard is to emphasise competence in a technical context. It may be that by using more standard forms in the audiobook context, Fry is drawing attention to his competence as a professional narrator. When asked what makes a good audiobook narrator, Fry commented that "[t]he key thing is for the reader not to show off, after a while you should forget that they are there" (amazon.co.uk 2006). By conforming to the standards of U-RP, with its lack of geographical ties, he consciously strives not to index any specific group, as this would interfere with the listener's immersion in the story. As a competent professional, he keeps his own identity distant from his narrative performance.
However, no variety, least of all a standard, is free of social meaning. As Agha points out: "[RP] is enregistered in public awareness as indexical of speaker's class and level of education" (2003:233). Harry Potter is set in a British boarding school and is part of a rather nostalgic British literary tradition of boarding school novels. By using U-RP, Fry evokes the tradition of British public schools for any listener who shares these enregistered notions. Thus style becomes an extra resource for narrative performance, rather than a means to neutralise it as Fry believes, pointing to a possible disparity between his conscious and unconscious style use.
Fry's narrative performance has parallels with the performance speech of Rex in Schilling-Estes's 1998 study: "No one on Ocracoke really talks or ever talked the way he talks in his speech performance. He is, however, evoking the cultural image of the old-time Ocracoke waterman; in effect, he is playing a part" (1998:74-5). Fry too performs an exaggeratedly standard version of his own variety in the audiobook, evoking cultural associations of British boarding school attendees. Though Rex shifts away from the standard and Fry shifts towards it, both evoke a cultural image as part of a performance.
This theory is borne out by the appearance in the audiobook context of the only two tokens of the conservative [hw] variant in the corpus. Its usage is limited to the word "white" in "Shell Cottage", the file with the most standard forms. Though it is difficult to draw a conclusion from such a small number of tokens, their presence shows that this variant is part of Fry's repertoire. Its reputation as an "ideal" of U-RP (Wells 1994:5), but its absence from mainstream use, make it appropriate in the audiobook context, where it may be considered a part of Fry's performance speech evoking public school speakers.
Meanwhile, in the podcasts, Fry's own identity is to the fore. Like the radio DJ in Coupland's 1985 study, Fry may be stressing solidarity with the mainstream RP speech community by becoming less standard in the podcasts. "Broken Arm", the least standard of the four files, is also the most intimate, as Fry shares an anecdote with his listeners. "Language" is a prime example of stressing solidarity: contrary to the 'pompous' stereotypical conservative RP speaker (Agha 2003:237), he argues against prescriptivism and praises socially stigmatised varieties. As a result, his reduction of unstressed syllables is higher than in the audiobook context, though not as high as in the deeply personal "Broken Arm" podcast. Paradoxically though, in "Language", Fry's g-dropping decreases to zero, when it is present in all other files. One would expect g-dropping to increase if he were trying to stress social solidarity, because of the covert prestige of stigmatised variants, which Fry himself praises in this podcast. The simplest way to account for this is through Labov's (1966) original theory of style-shifting, attention paid to speech. In this context of high metalinguistic awareness, Fry standardises in direct contradiction to the actual content of his speech, but only in the variable that has a high level of social awareness. Syllable reduction, carrying less social stigma, appears more affected by covert prestige motives than by attention paid to speech. This would not be incompatible with speaker design theory: attention paid to speech and covert prestige can be incorporated into the model as elements of a complex range of speaker-centred motives for style-shifting.
Fry's more conservative speech in the audiobook context is a performance indexing a positive nostalgic stereotype of a privately educated U-RP speaker. Yet the positive characteristics of U-RP, such as quality education, gentility and 'good diction', would surely serve to increase his prestige in the podcast context too, where it is not used to the same extent. It is when the negative characteristics of U-RP speakers (such as being posh, elitist, and old-fashioned) are brought into the equation that Fry's strategy becomes clear. In the podcast context, where his own identity is more relevant, he relies on the covert prestige of non-standardness in order to avoid these negative associations with the standard, and to express solidarity with the mainstream RP speech community.
This interpretation seems entirely in line with Meyerhoff's (2011:28) assertion, under the speaker design approach, that speakers style-shift in order to "accentuate the positive and eliminate the negative". For Fry, the positive social evaluation of RP is outweighed by the negative one in a personally involved context such as the podcasts. Fry, with his persona as an RP stereotype, probably confronts these negative prejudices from the public. His podcasts are an opportunity to control the identity he conveys, to show his originality of thought and rejection of elitism. The covert prestige of non-standard variants is appropriate for these motives. In the audiobook context, where he speaks the words of another author and his identity is effaced, he is freer to demonstrate his competence at manipulating the positive social connotations of RP as a resource for narrative performance without being personally subject to the negative judgements accompanying it. The avoidance in the podcasts of the highly overtly prestigious glide cluster [hw] can tentatively (given the limited number of tokens in the corpus overall) be seen as 'elimination of the negative', as the overtly prestigious becomes disfavoured in contexts that favour covert prestige forms. Finally, the lack of stigmatised g-dropping in the podcast "Language" contradicts the trend: here attention paid to speech appears to become a stronger motive than prestige, perhaps due to the metalinguistic topic.

Conclusion
The now well-established concept of speakers using linguistic variation as a resource for performing identity is borne out by these data. Speaker design theory has provided a complex account for Fry's style-shifting. The data attest to speakers' motives to accentuate the positive, eliminate the negative, and evoke cultural associations through performance speech. Classic sociolinguistic concepts such as covert prestige and attention paid to speech still carry explanatory power, and ought to be incorporated into the speaker design model, with the emphasis remaining on the speaker's conscious or unconscious agency in dynamic style choices.
This study has also shown that a great degree of variation can exist within a so-called standard, neutral variety. In fact the enregisterment of the standard makes it ideal ground for investigation of sociolinguistic attitudes. It has also been shown that different variables can carry very different associations and so should be interpreted carefully, with reference to their status in the wider speech community. The use of variables with differing social evaluations can give insights into how speakers consciously or unconsciously prioritise style-shifting motives. Future studies could further investigate the interaction of simultaneous and perhaps contradictory speaker motives, by analysis of multiple variables with different social statuses. Thus our investigation into the complexity and creativity of style-shifting can deepen our understanding of the interaction between an individual's identity and their language use.

Figure 1: Distribution of standard and non-standard forms of each variant across the corpus.

Figure 2: Standard and non-standard realisations per file of g-dropping and syllable reduction combined.

Table 1: Files in the corpus.

Table 4: Results of logistic regression analysis of effect of file on unstressed syllable reduction.