Short-term and long-term test-retest reliability of the Nasality Severity Index 2.0

doi:10.1016/j.jcomdis.2016.05.001

Journal of Communication Disorders

Volume 62, July–August 2016, Pages 1-11

https://doi.org/10.1016/j.jcomdis.2016.05.001

Highlights

• Test-retest measurements of the NSI 2.0 can be performed reliably.
• No significant difference in reliability of NSI 2.0 between adults and children.
• Difference of 2.82 between two NSI 2.0 scores represents genuine change in adults.
• Difference of 2.68 between two NSI 2.0 scores represents genuine change in children.
• NSI 2.0 can be reliably applied for follow-up of hypernasality in clinical practice.

Abstract

Purpose

The Nasality Severity Index 2.0 (NSI 2.0) is a new multiparametric approach to the assessment of hypernasality. To enable its clinical implementation, the short- and long-term test-retest reliability of this index was explored.

Methods

In 40 normal-speaking adults (mean age 32y, SD 11, 18–56y) and 29 normal-speaking children (mean age 8y, SD 2, 4–12y), the acoustic parameters included in the NSI 2.0 (i.e. nasalance of the vowel /u/ and an oral text, and the voice low tone to high tone ratio (VLHR) of the vowel /i/) were obtained twice at the same test moment and during a second assessment two weeks later. After determination of the NSI 2.0, a comprehensive set of statistical measures was applied to determine its reliability.

Results

Long-term variability of the NSI 2.0 and its parameters was slightly higher than the short-term variability, both in adults and in children. Overall, a difference of 2.82 for adults and 2.68 for children between the results of two consecutive measurements can be interpreted as a genuine change. With an ICC of 0.84 in adults and 0.77 in children, the NSI 2.0 additionally shows excellent relative consistency. No statistically significant difference was found in the reliability of test-retest measurements between adults and children.

Conclusion

Reliable test-retest measurements of the NSI 2.0 can be performed. Consequently, the NSI 2.0 can be applied in clinical practice, in which successive NSI 2.0 scores can be reliably compared and interpreted.

Learning outcomes

The reader will be able to describe and discuss both the short-term and long-term test-retest reliability of the Nasality Severity Index 2.0, a new multiparametric approach to hypernasality, and its parameters. Based on this information, the NSI 2.0 can be applied in clinical practice, in which successive NSI 2.0 scores, e.g. before and after surgery or speech therapy, can be compared and interpreted.
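As an illustration of how successive scores might be compared, the following minimal sketch checks whether the difference between two NSI 2.0 scores reaches the thresholds reported in the abstract (2.82 for adults, 2.68 for children). The function name, the group labels and the choice to treat a difference exactly equal to the threshold as genuine are assumptions made for this example, not part of the study.

```python
# Thresholds taken from the abstract; treating them as inclusive is an assumption.
GENUINE_CHANGE_THRESHOLD = {"adult": 2.82, "child": 2.68}


def is_genuine_change(first_score, second_score, group):
    """Return True if two NSI 2.0 scores differ by at least the group threshold."""
    if group not in GENUINE_CHANGE_THRESHOLD:
        raise ValueError("group must be 'adult' or 'child'")
    return abs(second_score - first_score) >= GENUINE_CHANGE_THRESHOLD[group]


# Hypothetical scores, e.g. before and after speech therapy.
print(is_genuine_change(-6.0, -2.0, "child"))  # difference of 4.0
print(is_genuine_change(1.0, 3.0, "adult"))    # difference of 2.0
```

A smaller difference would, under this reading, fall within measurement variability rather than indicate a change in resonance.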

Introduction

To assess and diagnose hypernasality, speech-language pathologists as well as other clinicians mostly rely on a combination of perceptual and instrumental measurements. A perceptual assessment based on spontaneous speech, automatic speech and reading or repeating sentences and words remains the gold standard to determine resonance disturbance. However, perceptual measurements are subjective and therefore can be influenced by vocal quality (Kataoka, Warren, Zajac, Mayo, & Lutz, 2001) and articulation errors of the patient (Bzoch, 1997) or by experience of the examiner (Lewis, Watterson, & Houghton, 2003). To support the perceptual analysis, several instrumental measurements are available to determine the presence and amount of resonance disturbance (Bettens, Wuyts, & Van Lierde, 2014). However, contradictory results can emerge when the outcomes of different assessment techniques are compared. Acoustic analyses based on, for example, nasometry or spectral analyses do not always strongly correlate with perceptual judgments (Keuning, Wieneke, van Wijngaarden, & Dejonckere, 2002; Lewis et al., 2003; Nellis, Neiman, & Lehman, 1992; Prado-Oliveira, Marques, Souza, Souza-Brosco, & Dutka Jde, 2015; Watterson, McFarlane, & Wright, 1993) or are based on vowels only, which may limit their representativeness of spontaneous speech (Lee, Wang, & Fu, 2009; Rah, Ko, Lee, & Kim, 2001; Vijayalakshmi, Reddy, & O’Shaughnessy, 2007). Therefore, combining complementary test results into a multiparametric index may offer a solution. In a pilot study, Van Lierde, Wuyts, Bonte, and Van Cauwenberge (2007) developed the Nasality Severity Index (NSI) based on a combination of five parameters, more specifically the nasalance values of the vowel /a/, an oral text and an oronasal text obtained with the Nasometer (model 6200); the maximum duration time (MDT) of /s/; and the mirror-fogging test by Glätzel of /a/. 
The equation yielded NSI = −60.69 − (3.24 × nasalance oral text (%)) − (13.39 × Glätzel value /a/) + (0.244 × MDT (s)) − (0.558 × nasalance /a/ (%)) + (3.38 × nasalance oronasal text (%)). However, influence of personal and environmental variables due to the inclusion of MDT of /s/ and the use of the mirror-fogging test by Glätzel (Foy, 1910) was detected (Bettens, Wuyts, De Graef, Verhegge, & Van Lierde, 2013). Therefore, Bettens, Van Lierde, Corthals, Luyten, and Wuyts (2016) proposed an adaptation of the NSI based on the data of different instrumental measurement techniques and the optimal statistical discrimination between 50 children without resonance disturbance and 35 children with hypernasality, in a stepwise statistical approach, with sensitivity and specificity as the serving criteria. A weighted linear combination of three variables was established, more specifically the nasalance scores of the vowel /u/ and an oral text obtained with the Nasometer (model II 6450) and the voice low tone to high tone ratio (VLHR) of the vowel /i/ with a cutoff frequency of 4.47*F0Hz (originally described by Lee, Yang, and Kuo (2003)) (see Bettens et al. (2016) for more information about the rationale behind and the derivation of the formula). The formula of the adapted NSI yields NSI 2.0 = 13.20 − (0.0824 × nasalance /u/ (%)) − (0.260 × nasalance oral text (%)) − (0.242 × VLHR /i/ 4.47*F0Hz (dB)). The mean NSI 2.0 value of patients with perceived hypernasality was −6.82 (SD 5.14), whereas the mean NSI 2.0 value of the control children with normal resonance was +4.08 (SD 1.59). With a cutoff score of zero, the NSI 2.0 discriminated patients with hypernasality from persons with normal resonance with a sensitivity of 92% and a specificity of 100%, in which patients with perceived hypernasality had scores below zero. 
The validity of this new index was shown to be high when the derived formula was applied to the parameter results of an independent patient and control group (sensitivity 88%, specificity 89%), in which all patients were perceptually judged to have hypernasality and all control children to have normal resonance.
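The NSI 2.0 formula and the cutoff of zero quoted above can be written out as a short sketch. The coefficients and the cutoff come from the text; the function names and the input values are hypothetical and were chosen only to exercise the arithmetic.

```python
def nsi_2_0(nasalance_u_pct, nasalance_oral_text_pct, vlhr_i_db):
    """Weighted linear combination of the three NSI 2.0 parameters.

    nasalance_u_pct:         nasalance of the vowel /u/ (%)
    nasalance_oral_text_pct: nasalance of the oral text (%)
    vlhr_i_db:               VLHR of the vowel /i/ at 4.47*F0 Hz (dB)
    """
    return (13.20
            - 0.0824 * nasalance_u_pct
            - 0.260 * nasalance_oral_text_pct
            - 0.242 * vlhr_i_db)


def suggests_hypernasality(nsi_score, cutoff=0.0):
    """Scores below the cutoff (zero in the text) point to hypernasality."""
    return nsi_score < cutoff


# Hypothetical parameter values, not data from the study.
low_nasalance = nsi_2_0(10.0, 12.0, 20.0)   # 13.20 - 0.824 - 3.12 - 4.84
high_nasalance = nsi_2_0(40.0, 50.0, 25.0)  # 13.20 - 3.296 - 13.0 - 6.05
print(round(low_nasalance, 3), suggests_hypernasality(low_nasalance))
print(round(high_nasalance, 3), suggests_hypernasality(high_nasalance))
```

Because every coefficient is negative, higher nasalance or VLHR values lower the index, which is why more negative scores correspond to more severe hypernasality.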

However, before the NSI 2.0 can be implemented in daily clinical practice, the reliability of this new index has to be verified. According to the literature, several sources can affect the stability of instrumental measurements (Lewis, Watterson, & Blanton, 2008). More specifically, instrumental variance (e.g. microphone and sound card characteristics, machine model), test procedure (e.g. distance from the microphone), subject performance (e.g. physiological factors, nasal patency) and the environment (e.g. air moisture and temperature) can influence the reliability of assessment techniques. Similarly, the components of the NSI 2.0 are susceptible to these sources of variation.

Two of the three parameters included in the index are obtained by the Nasometer. This device, originally developed by Fletcher and Bishop (1970) and manufactured by KayPentax (Lincoln Park, NJ), determines the amount of nasal resonance based on an acoustic analysis of both a nasal and an oral signal, and is considered an indirect measure of nasality. The signals are obtained by two microphones separated by a sound separation plate which is positioned between the nose and the upper lip of the participant. After filtering the signals using a band pass filter with a center frequency of 500 Hz and a bandwidth of 300 Hz, the ratio of the nasal signal to the (nasal + oral) signal, multiplied by 100, yields the nasalance score as a percentage. Several authors state that, although based on similar acoustic analyses of nasal and oral signals, nasalance scores of different instruments, such as the Nasometer, NasalView and OroNasal System, are not interchangeable (Awan, Omlor, & Watts, 2011; Bressmann, 2005; Bressmann, Klaiman, & Fischbach, 2006; Lewis & Watterson, 2003). Additionally, scores obtained with different models of the same instrument can also vary significantly (Awan et al., 2011; Awan & Virani, 2013; de Boer & Bressmann, 2014; Watterson & Lewis, 2006). Even results determined with different devices of the same model may differ due to the characteristics of the nasal and oral microphone (Zajac, Lutz, & Mayo, 1996). When the same device is used, replacement of the headgear can introduce a second source of variability (Watterson, Lewis, & Brancamp, 2005; Watterson & Lewis, 2006), although Lewis et al. (2008) and Kavanagh, Fee, Kalinowski, Doyle, and Leeper (1994) found only small differences between the condition of no change of the headgear and headgear change between two successive measurements. In addition to instrumental and procedural variation, personal variation also has an influence on the reliability of the test results. 
Extensive research has focused on between-subject variability, more specifically the influence of age (Brunnegard & van Doorn, 2009; Luyten et al., 2012; Prathanee, Thanaviratananich, Pongjunyakul, & Rengpatanakij, 2003; Van der Heijden, Hobbel, Van der Laan, Korsten-Meijer, & Goorhuis-Brouwer, 2011; van Doorn & Purcell, 1998), gender (Abou-Elsaad, Quriba, Baz, & Elkassaby, 2012; Brunnegard & van Doorn, 2009; Karakoc, Akcam, Birkent, Arslan, & Gerek, 2013; Luyten et al., 2012; Nichols, 1999; Park et al., 2014; Prathanee et al., 2003; Sweeney, Sell, & O’Regan, 2004; Van de Weijer & Slis, 1991; Van der Heijden et al., 2011; van Doorn & Purcell, 1998; Van Lierde, Wuyts, De Bodt, & Van Cauwenberge, 2003) and dialect (Awan et al., 2015; D’haeseleer, Bettens, De Mets, De Moor, & Van Lierde, 2015; Kavanagh et al., 1994; Mayo, Floyd, Warren, Dalston, & Mayo, 1996; Nichols, 1999; Rochet, Rochet, Sovis, & Mielke, 1998; Seaver, Dalston, Leeper, & Adams, 1991). This resulted in normative values for nasalance scores in different languages.
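The nasalance ratio described above, nasal / (nasal + oral) × 100, can be sketched as follows. This is a simplified illustration, not the Nasometer's implementation: it assumes both channels have already been band-pass filtered, and it uses the root-mean-square amplitude of each channel as the signal measure, which is an assumption of this example. The function names and sample values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def nasalance_percent(nasal_samples, oral_samples):
    """Nasalance score: nasal / (nasal + oral) * 100.

    Both inputs are assumed to be band-pass filtered already
    (center frequency 500 Hz, bandwidth 300 Hz in the text).
    """
    nasal = rms(nasal_samples)
    oral = rms(oral_samples)
    total = nasal + oral
    if total == 0:
        raise ValueError("both channels are silent")
    return 100.0 * nasal / total


# Hypothetical samples: the oral channel is three times stronger than the nasal one.
print(nasalance_percent([1, -1, 1, -1], [3, -3, 3, -3]))
```

Because the score is a ratio of the two channels, it depends on the relative sensitivity of the nasal and oral microphones, which is consistent with the device-to-device differences reported above.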

Another personal inconsistency arises from intra-subject variability, possibly due to variation in physiological factors such as small changes in nasal patency (de Boer & Bressmann, 2014; Lewis et al., 2008; van Doorn & Purcell, 1998). Due to this variability, nasalance scores of an oral text or oral sentences vary by 4 to 6 percentage points in 95% of the recordings of participants with normal speech in the ‘no change of the headgear’ condition using a Nasometer (Lewis et al., 2008; Sweeney et al., 2004; van Doorn & Purcell, 1998; Watterson et al., 2005; Watterson, Lewis, Ludlow, & Ludlow, 2008) and by 5 to 9 percentage points in 95% of the recordings in the ‘change of the headgear’ condition between sessions (de Boer & Bressmann, 2014; Lewis & Watterson, 2003; Lewis et al., 2008; van Doorn & Purcell, 1998; Watterson et al., 2005; Whitehill, 2001). These studies are based on data obtained from adults or children. However, to the best of our knowledge, no study has yet investigated whether nasalance can be measured as reliably in children as in adults.

A third parameter included in the NSI 2.0 is VLHR of the vowel /i/ with a cutoff frequency of 4.47*F0Hz (Lee et al., 2003), determined by PRAAT software (Boersma & Weenink, 2014). After determination of the power spectrum of the sound wave by use of a Fast Fourier Transformation, the spectrum of the sound sample is divided into a low-frequency band (LFB) and a high-frequency band (HFB) using a specific cutoff frequency derived from the fundamental frequency, 4.47*F0Hz. To quantify LFB, the power of each frequency band ranging from 65 Hz to the cutoff frequency is summed; the power of HFB is calculated as the summation of the power ranging from the cutoff frequency up to 8000 Hz, in accordance with the protocol of Lee et al. (2003, 2006). VLHR is defined as the power ratio of LFB vs. HFB expressed in decibels. Lee et al. (2003) reported no significant correlation between sound intensity and VLHR, due to the use of a relative rather than an absolute index. Dividing LFB by HFB also eliminates the possible influence of different sound recording conditions such as characteristics of the microphone, sound card of the computer and distance from the microphone. Therefore, variability due to equipment and test procedure may be limited. However, between-subjects and subject performance variability can still influence the stability of the test results for this parameter.
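The VLHR computation described above can be sketched from an already computed power spectrum. This is a minimal illustration and not the PRAAT procedure used in the study: the function name, the input layout (parallel lists of bin frequencies and powers) and the decision to assign a bin lying exactly on the cutoff to the high-frequency band are assumptions of this example.

```python
import math


def vlhr_db(freqs_hz, powers, f0_hz, factor=4.47, low_hz=65.0, high_hz=8000.0):
    """Voice low tone to high tone ratio in decibels.

    freqs_hz and powers are parallel sequences describing a power spectrum.
    The cutoff is factor * F0; power from low_hz up to the cutoff forms the
    low-frequency band (LFB), power from the cutoff up to high_hz the
    high-frequency band (HFB).
    """
    cutoff = factor * f0_hz
    lfb = sum(p for f, p in zip(freqs_hz, powers) if low_hz <= f < cutoff)
    hfb = sum(p for f, p in zip(freqs_hz, powers) if cutoff <= f <= high_hz)
    if lfb <= 0 or hfb <= 0:
        raise ValueError("both bands need positive power")
    # A power ratio is converted to decibels with 10 * log10.
    return 10.0 * math.log10(lfb / hfb)


# Hypothetical four-bin spectrum with F0 = 100 Hz, so the cutoff is 447 Hz.
freqs = [100.0, 200.0, 1000.0, 2000.0]
powers = [50.0, 50.0, 5.0, 5.0]
print(vlhr_db(freqs, powers, f0_hz=100.0))  # LFB = 100, HFB = 10
```

Scaling every power value by the same factor leaves the result unchanged, which illustrates why a relative index of this kind is insensitive to overall recording level.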

To enable the implementation of the NSI 2.0 in the diagnosis of hypernasality and evaluation of surgical treatment, such as palatal re-repair, or speech therapy, the aim of this study was to explore both the short-term and long-term test-retest reliability of the NSI 2.0 and its parameters in adults and children without resonance disturbance. Furthermore, the possible difference in test-retest reliability between adults and children was verified in both conditions. Based on the literature, long-term variability of the NSI 2.0 and its parameters is hypothesized to be larger than short-term variability. Additionally, variability of the NSI 2.0 and its parameters is expected to be larger in children than in adults.

Section snippets

Participants

Forty-one adults between 18 and 56 years old (mean age 32y, SD 11), 29 women and 12 men, and 29 children between 4 and 12 years old (mean age 8y, SD 2), 16 girls and 13 boys, were included in this study. All adult participants were recruited via friends, family or colleagues from the department of Speech, Language and Hearing Sciences at Ghent University. Children were recruited via their youth movement or primary school. According to a short questionnaire (orally completed by the adults,

Results

Normality tests (Kolmogorov-Smirnov test, Q–Q plots, boxplots) revealed normal distribution of all parameters and the NSI scores in both the adults’ and children’s groups. No heteroscedasticity was detected, as the Pearson correlation coefficients of the Bland-Altman plots were not statistically significant (p > 0.05), except for the parameter ‘nasalance of the oral text’ in the short-term condition of the children’s group (r = 0.64, p < 0.001) and the parameter ‘nasalance of the vowel /u/’ in the

Discussion

To enable the implementation of the NSI 2.0 in the diagnosis of hypernasality and evaluation of intervention, this study aimed to verify the short-term and long-term reliability of the NSI 2.0 and its parameters in adults and children without resonance disturbance. Additionally, the possible difference in test-retest reliability between adults and children was explored. The NSI 2.0 and its parameters were determined twice within one session and between two sessions separated by approximately

Conclusion

Based on the results of the current study, the long-term test-retest variability of the NSI 2.0 and its parameters is slightly higher than the short-term test-retest variability. This may be explained by the variation in personal performance and physiological changes in nasal patency (de Boer & Bressmann, 2014; Lewis et al., 2008; van Doorn & Purcell, 1998), which may be larger when differences over a longer timespan are considered. The interval [NSI 2.0 ± MDD], i.e. [NSI 2.0 ± 2.82] for

References (63)

K. Bettens et al. The instrumental assessment of velopharyngeal function and resonance: a review. Journal of Communication Disorders (2014)

C. Costa-Santos et al. The limits of agreement and the intraclass correlation coefficient may be inconsistent in the interpretation of agreement. Journal of Clinical Epidemiology (2011)

K.E. Lewis et al. The influence of listener experience and academic training on ratings of nasality. Journal of Communication Disorders (2003)

M. Park et al. Nasalance scores for normal Korean-speaking adults and children. Journal of Plastic Reconstructive Aesthetic Surgery (2014)

P. Van der Heijden et al. Nasometry normative data for young Dutch children. International Journal of Pediatric Otorhinolaryngology (2011)

T. Watterson et al. The relationship between nasalance and nasality in children with cleft palate. Journal of Communication Disorders (1993)

T. Abou-Elsaad et al. Standardization of nasometry for normal Egyptian Arabic speakers. Folia Phoniatrica et Logopaedica (2012)

R.T. Anderson. Nasometric values for normal Spanish-speaking females: a preliminary report. The Cleft Palate-Craniofacial Journal (1996)

G. Atkinson et al. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine (1998)

S.N. Awan et al. Nasometer 6200 versus Nasometer II 6400: effect on measures of nasalance. The Cleft Palate-Craniofacial Journal (2013)

S.N. Awan et al. Effects of computer system and vowel loading on measures of nasalance. Journal of Speech, Language and Hearing Research (2011)

S.N. Awan et al. Dialectical effects on nasalance: a multicenter, cross-continental study. Journal of Speech, Language and Hearing Research (2015)

H. Beckerman et al. Smallest real difference, a link between reproducibility and responsiveness. Quality of Life Research (2001)

K. Bettens et al. Effects of age and gender in normal speaking children on the Nasality Severity Index: an objective multiparametric approach to hypernasality. Folia Phoniatrica et Logopaedica (2013)

K. Bettens et al. The Nasality Severity Index 2.0: revision of an objective multiparametric approach to hypernasality. The Cleft Palate-Craniofacial Journal (2016)

J.M. Bland et al. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet (1986)

J.M. Bland et al. Statistics notes: measurement error proportional to the mean. British Medical Journal (1996)

Boersma, P., & Weenink, D. PRAAT: Doing phonetics by computer (Version 5.3.78). Available at http://www.praat.org/...

T. Bressmann. Comparison of nasalance scores obtained with the Nasometer, the NasalView, and the OroNasal System. The Cleft Palate-Craniofacial Journal (2005)

T. Bressmann et al. Same noses, different nasalance scores: data from normal subjects and cleft palate speakers for three systems for nasalance analysis. Clinical Linguistics & Phonetics (2006)

K. Brunnegard et al. Normative data on nasalance scores for Swedish as measured on the Nasometer: influence of dialect, gender, and age. Clinical Linguistics & Phonetics (2009)

K.R. Bzoch. Communicative disorders related to cleft lip and palate (1997)

D.V. Cicchetti. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment (1994)

E. D’haeseleer et al. Normative data of nasalance scores for Flemish adults using the Nasometer II: influence of dialect and gender. Folia Phoniatrica et Logopaedica (2015)

G. de Boer et al. Comparison of nasalance scores obtained with the nasometers 6200 and 6450. The Cleft Palate-Craniofacial Journal (2014)

S.G. Fletcher. Diagnosing speech disorders from cleft palate (1978)

S.G. Fletcher et al. Measurement of nasality with TONAR. Cleft Palate Journal (1970)

R. Foy. Contribution rhinométrique à l’étude de la respiration nasale. Annales des Maladies de l’Oreille, du Larynx, du Nez et du Pharynx (1910)

A.P. Fukushiro et al. Nasometric and aerodynamic outcome analysis of pharyngeal flap surgery for the management of velopharyngeal insufficiency. Journal of Craniofacial Surgery (2011)

O. Karakoc et al. Nasalance scores for normal-speaking Turkish population. Journal of Craniofacial Surgery (2013)

R. Kataoka et al. The relationship between spectral characteristics and perceived hypernasality in children. The Journal of the Acoustical Society of America (2001)

