The Young and the Old: (t) Release in Elderspeak

Elderspeak refers to a speech style used when talking to the elderly. The aim of this study was to find out whether a higher rate of standard phonetic variants of phonemes is a feature of elderspeak. To test this, the (t) release in the speech of the radio presenter Kirsty Young was analysed, comparing her speech towards younger and older guests. A significant correlation was found between her rate of (t) release and the age of the guests. After analysing the results, an alternative account of elderspeak is presented as well as possible avenues of future research.


Introduction
Elderspeak is the name given (since Cohen and Faulkner 1986) to what many researchers (e.g., Coupland et al. 1988, Ryan et al. 1991, Kemper 1994, Draper 2005) consider to be a speech style or register used when talking to the elderly. Features of elderspeak have been shown to exist on the full spectrum of linguistic levels, from the phonetic to the pragmatic (see overview by Samuelsson et al. 2013:618); however, while various phonetic studies have been undertaken on elderspeak, e.g., looking at speech rate and intonation, to the best of my knowledge, none have looked at the phonetics of elderspeak on the segmental level.
This paper investigates the following hypothesis: that the speech of an adult speaker towards the elderly will contain more released word-final (t) consonants than equivalent speech towards addressees of a similar age to the speaker. This research question was prompted by the fact that elderspeak appears to share many similarities with child-directed speech (CDS), which does show this pattern (Foulkes et al. 2005); moreover, researchers have mentioned the subjective experience of elderspeak having exaggerated articulation and pronunciation (Samuelsson et al. 2013, Ryan et al. 1995), suggesting that elderspeak uses stylistic features associated with Lindblom's (1990) notion of hyper-speech.
The hypothesis is investigated by looking at the rate of released (t) in the speech of a radio show host in her 40s to guests of different ages on a radio programme, with the data showing a significant correlation between an elderly addressee and increased (t) release. The implications of this result are then explored, and a novel account of elderspeak is presented that aims to help explain why certain features occur in elderspeak.

Elderspeak
Elderspeak is the term used to describe a speech style or register that appears to be employed when talking to the elderly (Cohen and Faulkner 1986, Keller 2006, Simpson 2002, Kemper 1994). It is characterised by features that are perceived, in many cases falsely, to improve communication (Cohen and Faulkner 1986, Kemper and Harden 1999, Samuelsson et al. 2013). Over the years, a variety of features have been ascribed to elderspeak at all levels of speech, such as syntactically simpler sentence structures, increased use of diminutives, exaggerated stress and intonation, and exaggerated articulation (Samuelsson et al. 2013:618).
To explain the existence of elderspeak, many have suggested accommodation to a perceived impairment on the part of the addressee. Kemper et al. (1998a) showed that features associated with elderspeak were more prevalent when the addressee was known to have reported cognitive problems such as memory lapses than when they were presented as healthy and independent. Kemper et al. (1998b) further demonstrated that the same was true in speech directed at those displaying signs of dementia. The use of elderspeak in nursing homes is also well established: in a seminal paper on the subject, Caporael (1981) recorded that over 22% of speech in nursing homes between caregivers and care receivers was of this kind. However, elderspeak has also been found to occur in contexts where the addressee is neither clearly impaired nor dependent: Kemper et al.'s (1996) study showed that even when older adults did not request clarification or express confusion during an experiment, younger speakers were still likely to switch to elderspeak.
One interesting result in the study by Kemper et al. (1996) was that older speakers did not shift in style when talking to older addressees that were not clearly impaired, which may suggest that the shift in the younger adults was based on assumptions about cognitive impairment related to the age of their addressees that the older adults did not share. Indeed, many researchers (e.g., Ryan et al. 1986) have concluded that elderspeak is at least partly influenced by stereotypes linked to aging, such as "hearing loss" and "lower cognitive level" (Samuelsson et al. 2013), rather than a response to any actual impairment the addressee might have. Furthermore, while some features of elderspeak such as elaboration, repetition, and simpler clause structures could increase communicational effectiveness in the case of addressing older speakers, features such as shorter sentences, pausing, slower speech, higher pitch, and unnatural stress are in fact damaging to communication with the elderly (Cohen and Faulkner 1986, Kemper and Harden 1999, Samuelsson et al. 2013). Another problem is that elderspeak is also usually considered patronising and degrading, and constant use has been reported to be cognitively and psychologically damaging to older adults (Ryan et al. 1986, Simpson 2002, Williams et al. 2003, Samuelsson et al. 2013), with some researchers such as Caporael (1981) and Lanceley (1985) suggesting that elderspeak may engender dependency in elderly addressees over time. In fact, in much of the older literature (e.g., Caporael 1981), elderspeak is referred to as "baby talk", due to its perceived similarities to infant-directed speech (IDS), both in terms of its features and its social context of care and dependence.

Hyper-speech Styles
Hyper and hypo-articulation (H&H) theory, first posited by Lindblom (1990), aims to explain the adaptability of speech in different contexts. According to the theory, speakers "tune their performance according to communicative and situational demands" (Lindblom 1990:403). Speech thereby falls along a continuum, from hyper to hypo-speech. Where something falls along this continuum is attributed to how the speaker balances two opposite tendencies. On the one hand, there is the scientific principle that "[u]nconstrained, a motor system tends to default to a low-cost form of behaviour" (Lindblom 1990:413), and on the other, that speech is "output-oriented" (Lindblom 1990:415), like other areas of motor control, so it is expected that the speaker will compensate for any difficulties they think that the listener will have in recovering information from the speech signal, and hence be able to distinguish between lexical items. Hyper-speech occurs when the latter principle trumps the former, while the opposite is true in hypo-speech. While Lindblom formalised H&H theory in 1990, the general phenomena it aims to describe have been researched since before its advent; for example, Picheny et al. (1986) found decreased vowel reduction and increased release of stops and all word-final consonants in speech towards the hard of hearing.
One relatively popular application of this theory in the study of speech styles is to use it to understand IDS and CDS. Kuhl et al. (1997), for example, found that when speaking to their infant children, the vowels of American English, Russian, and Swedish native-speaking mothers were "acoustically more extreme" (Kuhl et al. 1997:684), with a more exaggerated place of articulation, which is in line with what would be predicted under a hyper-speech model. Similar results were found by Beckford Wassink et al. (2007), who also argue that IDS fits further towards the hyper-speech end of the continuum than speech between adults, based on formant frequencies, segment duration, and speech intensity. Foulkes et al. (2005) found significantly higher rates of the standard released (t) (i.e., [t]) variants in CDS than inter-adult speech, which again fits in with the characterisation of IDS and CDS as speech styles utilising hyper-speech.
There are three main reasons to expect that elderspeak may also involve the hyper-speech features of the exaggerated articulation of vowels and a higher rate of released stops. First of all, elderspeak shares a large number of features with IDS. For example, an increased number of repetitions, simpler sentence structures, higher overall pitch, and exaggerated intonation have been reported in both speech styles (Samuelsson et al. 2013, Ryan et al. 1995, Fernald and Simon 1984, Snow 1972). Given these similarities, then, it is not unreasonable to suspect that they might share other features. The second reason is somewhat more direct: aging is associated with hearing loss (Ryan et al. 1995, Samuelsson et al. 2013), and if it is true that elderspeak occurs due to stereotypes about the elderly, then it makes sense that some of its features would be shared with speech towards the hearing impaired, which involves an increased release of stops (Picheny et al. 1986). Beyond the realm of speculation, there is additional evidence that both elderspeak and speech towards the hard of hearing involve a slower speech rate (Picheny et al. 1986, Ryan et al. 1986), something which they also share with IDS (Fernald and Simon 1984). Finally, while there have not been any quantitative phonetic studies of elderspeak at the segmental level, previous studies have mentioned "exaggerated articulation" (Samuelsson et al. 2013:618) and "exaggerated pronunciation" (Ryan et al. 1995:152), which suggests that elderspeak may, at least qualitatively, share features of hyper-speech. All in all, then, a large body of the previous literature suggests that elderspeak could be expected to have an increased rate of released word-final consonants.

Released (t)
Most studies on (t) realisation have focused on the non-standard variants of (t), such as deletion (e.g., Patrick 1991, Guy and Boberg 1997, Schuppler et al. 2009), glottal variants (e.g., Trudgill 1988, Milroy et al. 1994, Docherty et al. 1997, Marshall 2003, Foulkes et al. 2005), taps and flaps (e.g., Fukaya and Byrd 2005), and palatal variants (e.g., Lahiri and Evers 1991, Zsiga 1995). However, a growing body of work (e.g., Bucholtz 1996, Docherty and Foulkes 1999, Benor 2001, Podesva et al. 2002, Eckert 2003, Podesva 2008, Eckert 2008, Podesva et al. 2015) presents strong evidence that released (t) should also itself be viewed as a stylistic feature. If this is true (as this paper takes it to be), it means that rather than being analysed in terms of the presence or absence of other variants, released (t) should be considered a sociolinguistic variant in its own right, with its own linguistic and non-linguistic contexts. Some of the social meanings it has been associated with are nerdiness among nerd girls (Bucholtz 1996), (academic) authority among Orthodox Jews (Benor 2001), and competence in professional environments (Podesva 2008). Eckert (2008:469) gives a more general account of the variable, including meanings that range from "articulate", "clear", "elegant", and "polite", to "emphatic" and "exasperated", depending on the context of its use.

Overview
Based on previous research, the aim of this study was to investigate whether elderspeak involves features associated with hyper-speech, which may explain some of the perceptions of previous researchers (and participants) of its exaggerated articulation and pronunciation (as discussed in Section 2.2). To investigate this hypothesis, this study looks at the variation in the rate of the released (t) variant in a single speaker. The speech of a woman in her 40s to similar-aged or younger addressees and to those over the age of 80 was compared. As well as age, other relevant linguistic and extralinguistic variables that could influence this dependent variable were identified and included in the statistical model.

The Corpus
Desert Island Discs (n.d.) is a longstanding weekly radio programme that airs on BBC Radio 4 in the UK. It invites people who are prolific or successful in their field, or extraordinary in some other way, as guests to discuss their life and achievements. It has a regular format, and this discussion is interspersed with the presenter asking the guest about what music they would take if they were stranded on a desert island, as well as a book and a luxury item. This study analyses the speech of the current presenter Kirsty Young to 20 guests from 2011 to 2014.
The time period of 2011 to 2014 was chosen partly to minimise any possible effects of lifespan change or age grading (as discussed in Sankoff and Blondeau 2007), and partly to avoid any variation that might have occurred early on in her time as presenter (after having taken over in 2006), i.e., when she would have been "settling" into the role.
The episodes were then chosen from within this period based on the age and gender of the guests. Two AGE CATEGORIES were used, YOUNGER and OLDER, each with 5 MALE and 5 FEMALE guests, giving a total of 20 episodes. Over the time period, Young's age was 42-45, and so the YOUNGER category was chosen to roughly correspond to this (given the limitations in possible guests), with an ADDRESSEE age range of 35-44 (median = 39, mean = 39). In the OLDER category, the ADDRESSEE age range was 80-95 (median = 84, mean = 85.6). GENDER was matched in each AGE CATEGORY and included as a possible factor that could affect (t) release, as there is evidence that gender can affect the realisation of the (t) variable (Foulkes et al. 2005). Socioeconomic class was also matched as much as possible.
A further restriction was that only the first section of each episode was used, in order to reduce any possible effects of familiarisation between the speaker and addressee over the course of the episode. To keep this as consistent as possible between episodes, this section was defined as starting after the radio-audience-directed introduction of the guest, and ending at the beginning of the first song.

The Variables
The dependent variable that was coded for was the realisation of word-final (t). This study follows Benor (2001) and Podesva et al. (2015) in comparing the rate of released (t) to other "phonetically weaker" (Podesva et al. 2015:65) variants. A binary distinction was therefore made between the RELEASED (i.e., [t]) and NOT RELEASED variants of (t). This latter variant included all other (non-[t]) variants of (t), such as glottaled, deleted, voiced, and assimilated (e.g., palatalised) variants, as well as flaps and taps.
Each token of (t) was identified auditorily, and the VARIANT was determined by auditory perception and inspecting the spectrogram in Praat (Boersma and Weenink 2016). As well as the previously mentioned AGE CATEGORY and GENDER of the addressees, each individual ADDRESSEE was also coded for to account for by-addressee variation. To examine their potential effects on the data, 5 additional linguistic variables were coded for, based on the study undertaken by Podesva et al. (2015): PRECEDING ENVIRONMENT (phonetic), FOLLOWING ENVIRONMENT (phonetic), CLAUSE POSITION (medial or final), WORD (the word in which it appeared), and MORPHEME STATUS (whether the (t) appeared at the end of a monomorphemic word or as part of a separate morpheme). The phonetic environments were then collapsed into the categories identified by Podesva et al. (2015) as affecting (t) release: OBSTRUENT, SONORANT CONSONANT, and VOWEL for PRECEDING ENVIRONMENT; and PAUSE, VOWEL, NONSIBILANT CONSONANT, and SIBILANT for FOLLOWING ENVIRONMENT.
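The coding scheme described above can be pictured as one record per token. The following Python sketch is purely illustrative: the class name, field names, and label strings are assumptions introduced here, not the coding sheet actually used in the study. Only the grouping of realisations and the collapsed environment categories are taken from the text.

```python
from dataclasses import dataclass

# Fine-grained realisations that the text groups under NOT RELEASED.
# The label strings themselves are hypothetical.
NOT_RELEASED_REALISATIONS = {
    "glottaled", "deleted", "voiced", "assimilated", "flap", "tap",
}

# Collapsed environment categories named in the text.
PRECEDING_ENVIRONMENTS = {"OBSTRUENT", "SONORANT CONSONANT", "VOWEL"}
FOLLOWING_ENVIRONMENTS = {"PAUSE", "VOWEL", "NONSIBILANT CONSONANT", "SIBILANT"}


def binarise(realisation: str) -> str:
    """Collapse a fine-grained realisation of (t) into the binary VARIANT."""
    if realisation == "released":
        return "RELEASED"
    if realisation in NOT_RELEASED_REALISATIONS:
        return "NOT RELEASED"
    raise ValueError(f"unknown realisation: {realisation!r}")


@dataclass(frozen=True)
class Token:
    """One coded token of word-final (t); field names are assumptions."""
    word: str
    addressee: str
    age_category: str     # "OLDER" or "YOUNGER"
    gender: str           # "MALE" or "FEMALE"
    preceding: str        # one of PRECEDING_ENVIRONMENTS
    following: str        # one of FOLLOWING_ENVIRONMENTS
    clause_position: str  # "MEDIAL" or "FINAL"
    variant: str          # "RELEASED" or "NOT RELEASED"

    def __post_init__(self) -> None:
        # Reject environment labels outside the collapsed categories.
        if self.preceding not in PRECEDING_ENVIRONMENTS:
            raise ValueError(f"bad preceding environment: {self.preceding!r}")
        if self.following not in FOLLOWING_ENVIRONMENTS:
            raise ValueError(f"bad following environment: {self.following!r}")


# Hypothetical token, for illustration only.
example = Token(
    word="what", addressee="guest_01", age_category="OLDER", gender="FEMALE",
    preceding="VOWEL", following="PAUSE", clause_position="FINAL",
    variant=binarise("glottaled"),
)
```

Keeping the fine-grained realisation separate from the binary VARIANT, as `binarise` does, would allow the same tokens to be re-analysed later with a different grouping of variants.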

Inter-rater Reliability and Data Refinement
Two inter-rater reliability checks were performed with fellow researcher Deborah Orr. The preliminary test (before any other coding had taken place) had only 73% agreement, after which it was decided not to include tokens occurring within interrupted words or within prosodic words such as sort've. For the second test, the two coders coded 268 tokens for 11 ADDRESSEES independently, with 98.1% (263/268) agreement. The 5 tokens where there was disagreement were removed from the data. The remaining 253 tokens in the corpus were then coded by the author, totalling 521 coded tokens overall.
A few further tokens were then removed to reduce the number of possible factors affecting the (t) release. As there were only 35 tokens where the (t) was not in a MONOMORPHEMIC word (these exceptions were made up of contractions such as won't, and verbs in the regular and semi-weak past tense such as walked and kept), these were removed from the data set. Finally, given that reported speech, quotes, and sayings are known to have unique phonetic features (e.g., Günthner 1999), an additional 5 tokens of this kind were removed. This gave a final token count of 481 (267 to OLDER addressees and 214 to YOUNGER addressees). One final variable was added at this point, namely, WORD FREQUENCY. WORD FREQUENCY was coded as either HIGH or LOW, where the requirement for a HIGH frequency was that the word accounted for 5% or more of all tokens. There were 5 such words.
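The agreement figure for the second check is a simple percentage of matching codes. As a minimal sketch, assuming each coder's decisions are stored as a list of VARIANT labels in the same token order (the function name and the placeholder labels below are hypothetical), it could be computed as follows:

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of tokens on which two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty token list")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Placeholder labels that reproduce only the reported counts:
# 263 agreements and 5 disagreements out of 268 tokens.
coder_a = ["RELEASED"] * 263 + ["RELEASED"] * 5
coder_b = ["RELEASED"] * 263 + ["NOT RELEASED"] * 5

second_check = percent_agreement(coder_a, coder_b)
print(f"{second_check:.1f}%")  # 98.1%
```

Raw percentage agreement does not correct for agreement expected by chance; a chance-corrected statistic would need the full cross-tabulation of the two coders' labels, which is not reported here.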
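The HIGH/LOW coding of WORD FREQUENCY amounts to a threshold on each word's share of the tokens. The sketch below shows one way this could be done; the function name is an assumption, and the word list is a toy example invented for illustration, not the study's data.

```python
from collections import Counter


def code_word_frequency(words, threshold=0.05):
    """Label each word HIGH if it makes up at least `threshold` of all tokens."""
    counts = Counter(words)
    total = len(words)
    return {
        word: "HIGH" if count / total >= threshold else "LOW"
        for word, count in counts.items()
    }


# Toy token list (hypothetical): 40 tokens in total.
toy_words = ["that"] * 6 + ["it"] * 3 + [f"word{i}" for i in range(31)]

labels = code_word_frequency(toy_words)
high_words = sorted(w for w, label in labels.items() if label == "HIGH")
print(high_words)  # ['it', 'that']
```

In the toy list, "that" accounts for 15% of tokens and "it" for 7.5%, so both are HIGH, while every other word occurs once (2.5%) and is LOW.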

Results
Figure 1 presents a graph of VARIANT by AGE CATEGORY. The graph shows that 37.1% of the instances of word-final (t) produced by Young were RELEASED with OLDER addressees, compared with 22.0% when they were in the YOUNGER category. This supports the original hypothesis of this paper, i.e., that there would be more released variants of (t) when the speaker was talking to OLDER addressees than YOUNGER addressees. The graph also shows that Young produced more of the NOT RELEASED variant than the RELEASED variant of (t) with both sets of addressees. A preliminary chi-squared test showed the difference between the two AGE CATEGORIES to be significant (χ²(1) = 12.134, p = 0.000495). The data were then tested further using the lme4 package (Bates et al. 2015) in RStudio (RStudio Team 2015). A linear mixed-effects model was created to see the effect of AGE CATEGORY on VARIANT. This model included PRECEDING ENVIRONMENT, FOLLOWING ENVIRONMENT, CLAUSE POSITION, and WORD FREQUENCY as fixed effects, and ADDRESSEE and WORD as random effects. The p-values for each effect in the model were calculated by performing likelihood ratio tests, comparing the full model to a model not including that effect. The results are shown in Table 1, with significant results in bold. In this model, AGE CATEGORY was a significant predictor of (t) VARIANT, as well as PRECEDING ENVIRONMENT, FOLLOWING ENVIRONMENT, and WORD FREQUENCY. CLAUSE POSITION and GENDER did not have a significant effect.
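The preliminary chi-squared test can be re-derived approximately from the figures reported above. The sketch below makes two assumptions that are not stated in the text: the raw counts are reconstructed from the reported percentages and token totals (37.1% of 267 and 22.0% of 214, rounded to whole tokens), and Yates' continuity correction is applied, as is common for 2×2 tables. Under these assumptions, the result comes out very close to the reported statistic and p-value.

```python
import math


def chi2_2x2_yates(a, b, c, d):
    """Chi-squared test with Yates' correction for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    cells = ((a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2))
    chi2 = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        chi2 += max(abs(observed - expected) - 0.5, 0.0) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts reconstructed from the reported percentages (an assumption).
older_total, younger_total = 267, 214
older_released = round(0.371 * older_total)      # 99 tokens
younger_released = round(0.220 * younger_total)  # 47 tokens

chi2, p_value = chi2_2x2_yates(
    older_released, older_total - older_released,
    younger_released, younger_total - younger_released,
)
print(f"chi2(1) = {chi2:.3f}, p = {p_value:.6f}")
```

This test treats every token as an independent observation, which is why the mixed-effects model with ADDRESSEE and WORD as random effects is the more appropriate analysis for the main result.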

Data Analysis
The results show that the speaker Kirsty Young produced the released (t) variant compared to other variants at a significantly higher rate when her addressee was elderly than when her addressee was a similar age to her. This lends support in the form of quantitative evidence to the claim that elderspeak uses stylistic features associated with hyper-speech. In fact, given elderspeak's association with hearing impairment, elderspeak fits the prototypical account of hyper-speech that Lindblom (1990) seems to have in mind, where the speaker judges that the listener may have difficulty interpreting their speech signal and so pronounces words more "clearly".
The results also further cement the connection between elderspeak, IDS, and speech towards the hard of hearing: as well as sharing a reduced speech rate (Picheny et al. 1986, Fernald and Simon 1984), the present study provides quantitative evidence that supports the claim that they also all involve an increase in the release of word-final stops (Fernald and Simon 1984, Picheny et al. 1986).

Re-evaluating Elderspeak
While listening to the data, both coders observed that, other than a slightly slower speech rate, neither perceived any of the other features previously identified as part of elderspeak. Of course, some features are more perceptually salient, but it is interesting to note that this subjective judgement is consistent with the considerable amount of variation in the kinds of features reported to occur in elderspeak (Ryan et al. 1995, Samuelsson et al. 2013). Ryan et al. (1995), for example, conclude that the elderspeak found in interactions between carers and dependents at nursing homes is generally the "most extreme" (Ryan et al. 1995:162), while Kemper et al. (1998a) found that some features depended on whether or not an elderly addressee was presented as cognitively impaired or dependent. These findings, overall, suggest that elderspeak can vary in degree depending on how the speaker perceives the addressee. In this section, I present a brief sketch of how to take this observation a step further to demonstrate more explicitly how elderspeak can be related to the other speech styles that I have previously discussed.
The first step is to consider and adopt a theoretical claim advanced by researchers such as Eckert (2001), Irvine (2001), and Coupland (2001), namely, that speech styles should not be viewed as monolithic entities; instead, it is the individual stylistic features that a speaker uses that are enregistered with social meaning. There are two main implications of this that are of particular relevance to the present discussion. On the one hand, it allows speakers to use more than one feature to index the same meaning; on the other hand, and more importantly, it allows a speaker to index more than one thing at the same time. Eckert (2008:460-461) gives an example of the latter that is found in Zhang's (2005) study on variation in Beijing: while female managers in foreign-owned businesses tend to stick to using a sociolinguistic variant associated with a cosmopolitan, international context, male managers also use the variant associated with the streetwise "smooth operator" archetype, one of the variants that is commonly used by managers of both genders that work in state-owned businesses.
With such a theoretical background, then, patterns start to emerge. For example, slower speech rate is a feature that can be found in elderspeak (Samuelsson et al. 2013), IDS (Fernald and Simon 1984), speech towards the hearing impaired (Picheny et al. 1986), and speech towards foreigners that do not have English as their native language (Scarborough et al. 2007). Based on the present study, the same pattern can be found for the release of stops (Fernald and Simon 1984, Picheny et al. 1986, Scarborough et al. 2007). Considering that the speech modifications made when speaking to the hard of hearing are intended to increase clarity (Picheny et al. 1986:434), and an increased rate of released stops is associated with hyper-speech, which has the purpose of making speech clearer and easier to understand (Lindblom 1990), it would make sense to consider these particular features to be enregistered as features related to clear speech. With IDS and foreigner-directed speech this makes sense, as the addressees may not be perceived to be fluent in the language, while in the case of elderspeak, the stereotype of hearing loss in the elderly suggests that speaking to the elderly incorporates the concept of speaking towards the hard of hearing. In other words, it seems that this might be the first step in integrating the previous research on elderspeak with such a model: rather than looking at elderspeak as involving a cluster of features, it may be more useful to look at it as a common intersection of certain stylistic features, for example, features associated with known or perceived hearing loss, cognitive impairment, or dependency on the part of the addressee.
Further evidence of such a pattern comes from looking at other features. Kemper et al. (1998a) found that when the elderly addressee was presented as cognitively impaired, the speaker was more likely to use repetitions or shorter sentences, which suggests that these features are related to the speaker's perception of the addressee's cognitive capabilities. This could also explain why these occur in speech towards infants (Fernald and Simon 1984; Snow 1972), who are still far from fully cognitively developed. Similarly, some of the more extremely patronising features such as higher pitch and exaggerated intonation seem to be more likely to occur in carer-dependent speech in care institutions (Ryan et al. 1995). In this case, there is clearly an association with dependency, which would again explain why it shares such features with IDS (Fernald and Simon 1984): the parent-child relationship is probably the most prototypical case of a carer-dependent relationship.
If these generalisations are valid, this might explain why some of the features were not perceived in the data in the present study. Simply appearing on the radio programme shows that all the guests were at the very least cognitively able and independent enough to come into the studio to give a radio interview. In fact, almost all of the OLDER guests discussed how they were active either in the area that they are known for or in some other way. Furthermore, considering the high status of the guests, who all "come from […] really the upper echelons of all their areas of achievement" (Waterstones 2012), it seems highly unlikely that Young would adopt a patronising attitude towards any of them.
Overall, then, it seems that separating the features in this way can help both with understanding why elderspeak involves the features it does when it does, and how it relates in social meaning to other speech styles. One interesting future avenue of research that would be useful and does not appear to have been investigated yet would be to look closely at the stylistic features of speech towards younger adults with cognitive impairments, and to see whether any similarities can be drawn between this and speech towards the elderly. Similarly, it would be interesting to investigate speech towards adults that are fully cognitively able but dependent in some other way.

Conclusions
This study has shown a younger speaker, Kirsty Young, increasing her rate of word-final (t) release when speaking to an elderly addressee compared to when speaking to an addressee of roughly the same age. The main implication of this is that there is now quantitative evidence to back up previous claims (e.g., Ryan et al. 1995, Samuelsson et al. 2013) that have suggested that elderspeak involves features associated with hyper-speech.
I have also argued that, especially in future, it may be useful to consider elderspeak as a bundle of features that depend on the context and the speaker's perception of the addressee rather than as a single style in its own right. Overall, I believe that if it is adopted (or at least considered) in future work, this approach may help with understanding why elderspeak has the features it does, what they mean individually and holistically, and how their use can affect the speaker and addressee. On a wider level, it could also help to incorporate elderspeak into broader accounts of sociolinguistics.

Table 1: Results of the Likelihood Ratio Tests