Elsevier

Journal of Communication Disorders

Volume 62, July–August 2016, Pages 1-11

Short-term and long-term test-retest reliability of the Nasality Severity Index 2.0

https://doi.org/10.1016/j.jcomdis.2016.05.001

Highlights

  • Test-retest measurements of the NSI 2.0 can be performed reliably.

  • No significant difference in reliability of NSI 2.0 between adults and children.

  • Difference of 2.82 between two NSI 2.0 scores represents genuine change in adults.

  • Difference of 2.67 between two NSI 2.0 scores represents genuine change in children.

  • NSI 2.0 can be reliably applied for follow-up of hypernasality in clinical practice.

Abstract

Purpose

The Nasality Severity Index 2.0 (NSI 2.0) is a new multiparametric approach to the assessment of hypernasality. To enable clinical implementation of this index, its short- and long-term test-retest reliability was explored.

Methods

In 40 normal-speaking adults (mean age 32y, SD 11, 18–56y) and 29 normal-speaking children (mean age 8y, SD 2, 4–12y), the acoustic parameters included in the NSI 2.0 (i.e. nasalance of the vowel /u/ and an oral text, and the voice low tone to high tone ratio (VLHR) of the vowel /i/) were obtained twice at the same test moment and during a second assessment two weeks later. After determination of the NSI 2.0, a comprehensive set of statistical measures was applied to determine its reliability.

Results

Long-term variability of the NSI 2.0 and its parameters was slightly higher than the short-term variability, both in adults and in children. Overall, a difference of 2.82 for adults and 2.68 for children between the results of two consecutive measurements can be interpreted as a genuine change. With an ICC of 0.84 in adults and 0.77 in children, the NSI 2.0 additionally shows excellent relative consistency. No statistically significant difference in the reliability of test-retest measurements was found between adults and children.
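
As an illustration of how these thresholds might be used in follow-up, the sketch below flags whether the difference between two consecutive NSI 2.0 scores reaches the value reported above for the relevant group. The function name, the group labels and the example scores are hypothetical, and treating a difference exactly equal to the threshold as genuine change is an assumption.

```python
# Differences reported in the abstract as representing genuine change
# (NSI 2.0 units). The dictionary keys are illustrative labels.
GENUINE_CHANGE_THRESHOLD = {"adult": 2.82, "child": 2.68}


def is_genuine_change(score_before, score_after, group):
    """Return True when the absolute difference between two NSI 2.0 scores
    reaches the reported threshold for the given group."""
    threshold = GENUINE_CHANGE_THRESHOLD[group]
    return abs(score_after - score_before) >= threshold


# Hypothetical scores for one child, e.g. before and after speech therapy.
print(is_genuine_change(-6.5, -2.0, "child"))  # difference of 4.5
print(is_genuine_change(-6.5, -5.0, "child"))  # difference of 1.5
```

A smaller difference is not evidence that nothing changed; it only means the change cannot be distinguished from test-retest variability.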

Conclusion

Reliable test-retest measurements of the NSI 2.0 can be performed. Consequently, the NSI 2.0 can be applied in clinical practice, in which successive NSI 2.0 scores can be reliably compared and interpreted.

Learning outcomes

The reader will be able to describe and discuss both the short-term and long-term test-retest reliability of the Nasality Severity Index 2.0, a new multiparametric approach to hypernasality, and its parameters. Based on this information, the NSI 2.0 can be applied in clinical practice, in which successive NSI 2.0 scores, e.g. before and after surgery or speech therapy, can be compared and interpreted.

Introduction

To assess and diagnose hypernasality, speech-language pathologists as well as other clinicians mostly rely on a combination of perceptual and instrumental measurements. A perceptual assessment based on spontaneous speech, automatic speech and reading or repeating sentences and words remains the “gold” standard to determine resonance disturbance. However, perceptual measurements are subjective and therefore can be influenced by vocal quality (Kataoka, Warren, Zajac, Mayo, & Lutz, 2001) and articulation errors of the patient (Bzoch, 1997) or by experience of the examiner (Lewis, Watterson, & Houghton, 2003). To support the perceptual analysis, several instrumental measurements are available to determine the presence and amount of resonance disturbance (Bettens, Wuyts, & Van Lierde, 2014). However, contradictory results can emerge when the outcomes of different assessment techniques are compared. Acoustic analyses based on, for example nasometry or spectral analyses, do not always strongly correlate with perceptual judgments (Keuning, Wieneke, van Wijngaarden, & Dejonckere, 2002; Lewis et al., 2003; Nellis, Neiman, & Lehman, 1992; Prado-Oliveira, Marques, Souza, Souza-Brosco, & Dutka Jde, 2015; Watterson, McFarlane, & Wright, 1993) or are based on vowels only, which may limit their representativeness of spontaneous speech (Lee, Wang, & Fu, 2009; Rah, Ko, Lee, & Kim, 2001; Vijayalakshmi, Reddy, & O’Shaughnessy, 2007). Therefore, a combination of complementary test results into a multiparametric index can form a solution. In a pilot study, Van Lierde, Wuyts, Bonte, and Van Cauwenberge (2007) developed the Nasality Severity Index (NSI) based on a combination of five parameters, more specifically the nasalance value of the vowel /a/, an oral and oronasal text derived by the Nasometer (model 6200); the maximum duration time (MDT) of /s/; and the mirror-fogging test by Glätzel of /a/. 
The equation yielded NSI = −60.69 − (3.24 × nasalance oral text (%)) − (13.39 × Glätzel value /a/) + (0.244 × MDT (s)) − (0.558 × nasalance /a/ (%)) + (3.38 × nasalance oronasal text (%)). However, influence of personal and environmental variables due to the inclusion of MDT of /s/ and the use of the mirror-fogging test by Glätzel (Foy, 1910) was detected (Bettens, Wuyts, De Graef, Verhegge, & Van Lierde, 2013). Therefore, Bettens, Van Lierde, Corthals, Luyten, and Wuyts (2016) proposed an adaptation of the NSI based on the data of different instrumental measurement techniques and the optimal statistical discrimination between 50 children without resonance disturbance and 35 children with hypernasality, in a stepwise statistical approach, with sensitivity and specificity as the serving criteria. A weighted linear combination of three variables was established, more specifically the nasalance scores of the vowel /u/ and an oral text obtained with the Nasometer (model II 6450) and the voice low tone to high tone ratio (VLHR) of the vowel /i/ with a cutoff frequency of 4.47*F0Hz (originally described by Lee, Yang, and Kuo (2003)) (see Bettens et al. (2016) for more information about the rationale behind and the derivation of the formula). The formula of the adapted NSI yields NSI 2.0 = 13.20 − (0.0824 × nasalance /u/ (%)) − (0.260 × nasalance oral text (%)) − (0.242 × VLHR /i/ 4.47*F0Hz (dB)). The mean NSI 2.0 value of patients with perceived hypernasality was −6.82 (SD 5.14), whereas the mean NSI 2.0 value of the control children with normal resonance was +4.08 (SD 1.59). With a cutoff score of zero, the NSI 2.0 discriminated patients with hypernasality from persons with normal resonance with a sensitivity of 92% and a specificity of 100%, in which patients with perceived hypernasality had scores below zero. 
The validity of this new index was shown to be high when the derived formula was applied to the parameter results of an independent patient and control group (sensitivity 88%, specificity 89%), in which all patients were perceptually judged to have hypernasality and all control children to have normal resonance.
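
The NSI 2.0 formula above can be written as a short function. This is a minimal sketch rather than the authors' implementation: it assumes the three weighted terms are subtracted from the constant 13.20, which is consistent with speakers with hypernasality scoring below zero, and the function names and input values are hypothetical.

```python
def nsi_2_0(nasalance_u_pct, nasalance_oral_text_pct, vlhr_i_db):
    """Weighted linear combination of the three NSI 2.0 parameters.

    nasalance_u_pct:         nasalance of the vowel /u/ (%)
    nasalance_oral_text_pct: nasalance of the oral text (%)
    vlhr_i_db:               VLHR of the vowel /i/ at cutoff 4.47 x F0 (dB)
    """
    return (13.20
            - 0.0824 * nasalance_u_pct
            - 0.260 * nasalance_oral_text_pct
            - 0.242 * vlhr_i_db)


def suggests_hypernasality(score, cutoff=0.0):
    """Scores below the cutoff of zero fall on the hypernasal side."""
    return score < cutoff


# Hypothetical parameter values, chosen only to exercise the formula.
score = nsi_2_0(nasalance_u_pct=10.0, nasalance_oral_text_pct=12.0, vlhr_i_db=15.0)
print(round(score, 3), suggests_hypernasality(score))
```

Because every weight is negative, a higher nasalance or a higher VLHR lowers the index.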

However, before the NSI 2.0 can be implemented in daily clinical practice, the reliability of this new index has to be verified. According to the literature, several sources can affect the stability of instrumental measurements (Lewis, Watterson, & Blanton, 2008). More specifically, instrumental variance (e.g. microphone and sound card characteristics, machine model), test procedure (e.g. distance from the microphone), subject performance (e.g. physiological factors, nasal patency) and the environment (e.g. air moisture and temperature) can influence the reliability of assessment techniques. Similarly, the components of the NSI 2.0 are susceptible to these sources of variation.

Two of the three parameters included in the index are obtained by the Nasometer. This device, originally developed by Fletcher and Bishop (1970) and manufactured by KayPentax (Lincoln Park, NJ), determines the amount of nasal resonance based on an acoustic analysis of both a nasal and an oral signal, and is considered an indirect measure of nasality. The signals are obtained by two microphones separated by a sound separation plate which is positioned between the nose and the upper lip of the participant. After filtering the signals using a band-pass filter with a center frequency of 500 Hz and a bandwidth of 300 Hz, the ratio of the nasal signal to the (nasal + oral) signal, multiplied by 100, yields the nasalance score as a percentage. Several authors state that, although based on similar acoustic analyses of nasal and oral signals, nasalance scores of different instruments, such as the Nasometer, NasalView and OroNasal System, are not interchangeable (Awan, Omlor, & Watts, 2011; Bressmann, 2005; Bressmann, Klaiman, & Fischbach, 2006; Lewis & Watterson, 2003). Additionally, scores obtained with different models of the same instrument can also vary significantly (Awan et al., 2011; Awan & Virani, 2013; de Boer & Bressmann, 2014; Watterson & Lewis, 2006). Even results determined with different devices of the same model may differ due to the characteristics of the nasal and oral microphone (Zajac, Lutz, & Mayo, 1996). When the same device is used, replacement of the headgear can introduce a second source of variability (Watterson, Lewis, & Brancamp, 2005; Watterson & Lewis, 2006), although Lewis et al. (2008) and Kavanagh, Fee, Kalinowski, Doyle, and Leeper (1994) found only small differences between the condition of no change of the headgear and headgear change between two successive measurements. In addition to instrumental and procedural variation, personal variation also has an influence on the reliability of the test results. 
Extensive research has focused on between-subject variability, more specifically the influence of age (Brunnegard & van Doorn, 2009; Luyten et al., 2012; Prathanee, Thanaviratananich, Pongjunyakul, & Rengpatanakij, 2003; Van der Heijden, Hobbel, Van der Laan, Korsten-Meijer, & Goorhuis-Brouwer, 2011; van Doorn & Purcell, 1998), gender (Abou-Elsaad, Quriba, Baz, & Elkassaby, 2012; Brunnegard & van Doorn, 2009; Karakoc, Akcam, Birkent, Arslan, & Gerek, 2013; Luyten et al., 2012; Nichols, 1999; Park et al., 2014; Prathanee et al., 2003; Sweeney, Sell, & O’Regan, 2004; Van de Weijer & Slis, 1991; Van der Heijden et al., 2011; van Doorn & Purcell, 1998; Van Lierde, Wuyts, De Bodt, & Van Cauwenberge, 2003) and dialect (Awan et al., 2015; D’haeseleer, Bettens, De Mets, De Moor, & Van Lierde, 2015; Kavanagh et al., 1994; Mayo, Floyd, Warren, Dalston, & Mayo, 1996; Nichols, 1999; Rochet, Rochet, Sovis, & Mielke, 1998; Seaver, Dalston, Leeper, & Adams, 1991). This resulted in normative values for nasalance scores in different languages.
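
The nasalance ratio described in this section can be sketched as follows. The sketch covers only the final ratio step: the band-pass filtering around 500 Hz is assumed to have been applied already, and `nasal_level` and `oral_level` are hypothetical names for whatever non-negative signal measure the device derives from each filtered channel.

```python
def nasalance_percent(nasal_level, oral_level):
    """Nasal signal divided by the (nasal + oral) signal, multiplied by 100.

    Both inputs are assumed to be non-negative levels taken from the
    already band-pass filtered nasal and oral channels.
    """
    total = nasal_level + oral_level
    if total <= 0:
        raise ValueError("nasal and oral levels must sum to a positive value")
    return 100.0 * nasal_level / total


# Hypothetical levels: the nasal channel carries a quarter of the total.
print(nasalance_percent(1.0, 3.0))
```

Since the score is a ratio of two channels recorded by different microphones, differences in microphone characteristics between devices feed directly into the result, in line with the device-dependence reported above.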

Another source of personal inconsistency is intra-subject variability, possibly due to variation in physiological factors such as small changes in nasal patency (de Boer & Bressmann, 2014; Lewis et al., 2008; van Doorn & Purcell, 1998). Due to this variability, nasalance scores of an oral text or oral sentences vary within 4 to 6 percentage points in 95% of the recordings of participants with normal speech in the ‘no change of the headgear’ condition using a Nasometer (Lewis et al., 2008; Sweeney et al., 2004; van Doorn & Purcell, 1998; Watterson et al., 2005; Watterson, Lewis, Ludlow, & Ludlow, 2008) and within 5 to 9 percentage points in 95% of the recordings in the ‘change of the headgear’ condition between sessions (de Boer & Bressmann, 2014; Lewis & Watterson, 2003; Lewis et al., 2008; van Doorn & Purcell, 1998; Watterson et al., 2005; Whitehill, 2001). These studies are based on data obtained from adults or children. However, to the best of our knowledge, no study has yet investigated whether nasalance can be examined as reliably in children as in adults.

The third parameter included in the NSI 2.0 is VLHR of the vowel /i/ with a cutoff frequency of 4.47*F0Hz (Lee et al., 2003), determined by PRAAT software (Boersma & Weenink, 2014). After determination of the power spectrum of the sound wave by means of a fast Fourier transform, the spectrum of the sound sample is divided into a low-frequency band (LFB) and a high-frequency band (HFB) using a specific cutoff frequency derived from the fundamental frequency, 4.47*F0Hz. To quantify LFB, the power of each frequency component ranging from 65 Hz to the cutoff frequency is summed; the power of HFB is calculated as the summation of the power ranging from the cutoff frequency up to 8000 Hz in accordance with the protocol of Lee et al. (2003, 2006). VLHR is defined as the power ratio of LFB to HFB expressed in decibels. Lee et al. (2003) reported no significant correlation between sound intensity and VLHR, due to the use of a relative rather than an absolute index. Dividing LFB by HFB also eliminates the possible influence of different sound recording conditions such as characteristics of the microphone, sound card of the computer and distance from the microphone. Therefore, variability due to equipment and test procedure may be limited. However, between-subjects and subject performance variability can still influence the stability of the test results for this parameter.
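
The band summation described above can be sketched as below. This is not the PRAAT procedure used in the study: it assumes a power spectrum has already been computed, assigns a component lying exactly on the cutoff to the high band (the text does not specify this), and converts the power ratio to decibels as 10·log10. All names and values are hypothetical.

```python
import math


def vlhr_db(freqs_hz, powers, f0_hz, low_hz=65.0, high_hz=8000.0, factor=4.47):
    """Ratio of low-band to high-band spectral power, in decibels.

    freqs_hz, powers: a precomputed power spectrum (same length).
    f0_hz:            fundamental frequency; the cutoff is factor * f0_hz.
    """
    cutoff_hz = factor * f0_hz
    # Low-frequency band: 65 Hz up to (not including) the cutoff.
    lfb = sum(p for f, p in zip(freqs_hz, powers) if low_hz <= f < cutoff_hz)
    # High-frequency band: the cutoff up to 8000 Hz.
    hfb = sum(p for f, p in zip(freqs_hz, powers) if cutoff_hz <= f <= high_hz)
    if lfb <= 0 or hfb <= 0:
        raise ValueError("both bands must contain positive power")
    return 10.0 * math.log10(lfb / hfb)


# Hypothetical four-component spectrum with F0 = 200 Hz (cutoff 894 Hz).
print(vlhr_db([100.0, 500.0, 1000.0, 5000.0], [10.0, 10.0, 1.0, 1.0], 200.0))
```

Scaling every power value by the same factor leaves the result unchanged, which illustrates why a relative index is insensitive to overall recording level.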

To enable the implementation of the NSI 2.0 in the diagnosis of hypernasality and the evaluation of surgical treatment, such as palatal re-repair, or speech therapy, the aim of this study was to explore both short-term and long-term test-retest reliability of the NSI 2.0 and its parameters in adults and children without resonance disturbance. Furthermore, the possible difference in test-retest reliability between adults and children was verified in both conditions. Based on the literature, long-term variability of the NSI 2.0 and its parameters is hypothesized to be larger than short-term variability. Additionally, variability of the NSI 2.0 and its parameters is expected to be larger in children than in adults.

Participants

Forty-one adults between 18 and 56 years old (mean age 32y, SD 11), 29 women and 12 men, and 29 children between 4 and 12 years old (mean age 8y, SD 2), 16 girls and 13 boys, were included in this study. All adult participants were recruited via friends, family or colleagues from the department of Speech, Language and Hearing Sciences at Ghent University. Children were recruited via their youth movement or primary school. According to a short questionnaire (orally completed by the adults,

Results

Normality tests (Kolmogorov-Smirnov test, Q–Q plots, boxplots) revealed a normal distribution of all parameters and the NSI scores in both the adult and the child group. No heteroscedasticity was found, as the Pearson correlation coefficients of the Bland-Altman plots were not statistically significant (p > 0.05), except for the parameter ‘nasalance of the oral text’ in the short-term condition of the children’s group (r = 0.64, p < 0.001) and the parameter ‘nasalance of the vowel /u/’ in the

Discussion

To enable the implementation of the NSI 2.0 in the diagnosis of hypernasality and evaluation of intervention, this study aimed to verify the short-term and long-term reliability of the NSI 2.0 and its parameters in adults and children without resonance disturbance. Additionally, the possible difference in test-retest reliability between adults and children was explored. NSI 2.0 scores and its parameters were determined twice within one session and between two sessions separated by approximately

Conclusion

Based on the results of the current study, the long-term test-retest variability of the NSI 2.0 and its parameters is slightly higher than the short-term test-retest variability. This may be explained by the variation in personal performance and physiological changes of the nasal patency (de Boer & Bressmann, 2014; Lewis et al., 2008; van Doorn & Purcell, 1998), which may be larger when differences over a longer timespan are considered. The interval [NSI 2.0 ± MDD], i.e. [NSI 2.0 ± 2.82] for

References (63)

  • S.N. Awan et al. (2011). Effects of computer system and vowel loading on measures of nasalance. Journal of Speech, Language and Hearing Research.
  • S.N. Awan et al. (2015). Dialectical effects on nasalance: a multicenter, cross-continental study. Journal of Speech, Language and Hearing Research.
  • H. Beckerman et al. (2001). Smallest real difference, a link between reproducibility and responsiveness. Quality of Life Research.
  • K. Bettens et al. (2013). Effects of age and gender in normal speaking children on the Nasality Severity Index: an objective multiparametric approach to hypernasality. Folia Phoniatrica et Logopaedica.
  • K. Bettens et al. (2016). The Nasality Severity Index 2.0: revision of an objective multiparametric approach to hypernasality. The Cleft Palate-Craniofacial Journal.
  • J.M. Bland et al. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet.
  • J.M. Bland et al. (1996). Statistics notes: measurement error proportional to the mean. British Medical Journal.
  • Boersma, P., & Weenink, D. PRAAT: Doing phonetics by computer (Version 5.3.78). Available at http://www.praat.org/...
  • T. Bressmann (2005). Comparison of nasalance scores obtained with the Nasometer, the NasalView, and the OroNasal System. The Cleft Palate-Craniofacial Journal.
  • T. Bressmann et al. (2006). Same noses, different nasalance scores: data from normal subjects and cleft palate speakers for three systems for nasalance analysis. Clinical Linguistics & Phonetics.
  • K. Brunnegard et al. (2009). Normative data on nasalance scores for Swedish as measured on the Nasometer: influence of dialect, gender, and age. Clinical Linguistics & Phonetics.
  • K.R. Bzoch (1997). Communicative disorders related to cleft lip and palate.
  • D.V. Cicchetti (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment.
  • E. D’haeseleer et al. (2015). Normative data of nasalance scores for Flemish adults using the Nasometer II: influence of dialect and gender. Folia Phoniatrica et Logopaedica.
  • G. de Boer et al. (2014). Comparison of nasalance scores obtained with the nasometers 6200 and 6450. The Cleft Palate-Craniofacial Journal.
  • S.G. Fletcher (1978). Diagnosing speech disorders from cleft palate.
  • S.G. Fletcher et al. (1970). Measurement of nasality with TONAR. Cleft Palate Journal.
  • R. Foy (1910). Contribution rhinométrique à l’étude de la respiration nasale. Annales des Maladies de l’Oreille, du Larynx, du Nez et du Pharynx.
  • A.P. Fukushiro et al. (2011). Nasometric and aerodynamic outcome analysis of pharyngeal flap surgery for the management of velopharyngeal insufficiency. Journal of Craniofacial Surgery.
  • O. Karakoc et al. (2013). Nasalance scores for normal-speaking Turkish population. Journal of Craniofacial Surgery.
  • R. Kataoka et al. (2001). The relationship between spectral characteristics and perceived hypernasality in children. The Journal of the Acoustical Society of America.