Improvement of cochlear implant performance: changes in dynamic range
Ahmed Khater a, Amira El Shennaway b, Ahmed Anany a

Context
Theoretically, a wide input dynamic range (IDR) will capture more of the incoming acoustic signal than a narrow IDR, allowing the cochlear implant (CI) user to hear soft, medium, and loud sounds. A narrow IDR may restrict the CI user's ability to hear soft speech and sounds because less of the incoming acoustic signal is mapped into the CI user's electrical dynamic range.

Aim
The overall goal of the study is to provide guidelines for audiologists to efficiently and effectively optimize the performance of CI recipients in two difficult listening situations: understanding soft speech and understanding speech in noise.

Settings and design
The independent variables were the IDR and the electric dynamic range of the channels. The dependent variables were the six Ling sounds, a monosyllabic word test, and a speech in noise test.

Materials and methods
Fourteen patients participated in the study. For each patient, seven programs were created. The dependent variables were assessed under each program's settings of the independent variables.

Results
A restricted IDR resulted in poorer speech recognition than a relatively wide IDR. A subjectively determined T level and a most comfortable level (MCL) set at the most, not the maximum, comfortable level appear to have a positive effect on both soft sound recognition and speech discrimination.

Conclusion
Dynamic range is an important factor, among others, in improving the ability of CI users to understand soft speech as well as speech in noise.

Introduction
Cochlear implant (CI) patients who perform well on word and sentence tests presented in quiet at a comfortable listening level often report considerable difficulty understanding speech in most noisy environments encountered in daily life [1]. Moreover, they report difficulty understanding soft speech spoken by children and by individuals speaking from a distance. If optimizing patient performance in daily life is the goal, then it is essential that clinical fitting address the ability of CI users to understand soft speech as well as speech in noise [2].
The acoustic information carried by speech is quite complex and has many dynamic variations. Sounds are, by their nature, dynamic, changing over time in level and spectral content [3]. Formant analysis has shown that the dynamic spectral variation in vowels provides reliable acoustic cues in fluent speech that contribute toward both consonant and vowel identification [4]. As the speech signal is highly variable in intensity, the relationship between consonant and vowel amplitudes plays an important role in speech intelligibility.
A characteristic finding in individuals with sensorineural hearing loss, in addition to an increase in hearing threshold, is a reduction in dynamic range. This reduction in dynamic range impairs speech intelligibility. Dynamic range includes both the input dynamic range (IDR) and the electric dynamic range (the range between T and C levels). The IDR is the range of the incoming acoustic signal that is mapped into the CI user's electrical dynamic range, that is, the range between minimum stimulation levels (T levels) and maximum stimulation levels (C levels) [5]. Theoretically, a wide IDR will capture more of the incoming acoustic signal than a narrow IDR, allowing the CI user to hear soft, medium, and loud sounds. A narrow IDR may restrict the CI user's ability to hear soft speech and sounds because less of the incoming acoustic signal is mapped into the CI user's electrical dynamic range [6]. Jacquelyn et al. [7] reported that speech is not all the same loudness; consequently, lowering the IDR too much can decrease speech comprehension, even without any background noise.
The objectives of the present study were two-fold. The first was to compare two ways of setting the minimum threshold level (T level), a fixed value and a subjectively determined one, for enabling audibility of soft speech cues. The second objective was to evaluate the effect of different IDRs on CI patient performance. The overall goal of the study is to provide guidelines for audiologists to efficiently and effectively optimize the performance of CI recipients in two difficult listening situations: understanding soft speech and understanding speech in noise.

Materials and methods
The study was approved by the ethical committee of Zagazig University, Faculty of Medicine, Zagazig, Egypt. Fourteen patients participated in the study. All patients were implanted with Advanced Bionics (AB; Valencia, CA 91355, USA) 90K CI devices in the ENT Medical Center, Kingdom of Saudi Arabia. Patients ranged in age from 14 to 27 years at the time of the study. The duration of hearing loss ranged from 2 to 9 years. Onset of hearing loss was perilingual in six patients and postlingual in eight patients. This category of patients was chosen to yield reliable results during the study. The length of implant use ranged from 2 to 4 years. Patients' data are presented in Table 1. Neural response imaging (NRI) confirmed well-functioning electrodes at all channels, with a good response for all patients in the study. Intraoperative plain radiography confirmed the correct position of the electrode array within the cochlea. The participants used the HiRes 120 speech processing strategy and all had open-set speech recognition.

Electric dynamic range (difference between T and C levels)
After a sufficient healing period, initial programming and activation of the sound processor were performed using the SoundWave fitting software, the HiRes clinical fitting tool, version 2.2, for the Advanced Bionics (AB) device.

C level
C levels reflect the amount of electrical current needed for each electrode to elicit a comfortable loudness percept. The maximum (C) stimulation level for each electrode was programmed using ascending loudness judgments. The patient reported the loudness of the sound on a five-point scale (barely audible, soft, most comfortable, maximum comfortable, and uncomfortable). Each participant's preferred program, with the C level at the maximum comfortable level, had been used for at least 1 month before being clinically evaluated for the current study. Another trial period was allowed with the C levels at all electrodes set at the most comfortable level, which was 10 current units (CU) below the maximum comfortable level.

T level
The T level represents the minimum amount of electrical current needed for each electrode to elicit a low-level or soft percept for the recipient. When the manufacturer's software (SoundWave) is used to set T levels for an Advanced Bionics speech processor, a 'default' setting can be selected. With this setting, T levels are calculated automatically as 10% of the recipient's C levels. Alternatively, T levels can be set manually on the basis of the patient's perception of minimally audible sounds. Two programs were created. In one program, the T level was set at 10% of the C level. In the other program, the T level was assessed behaviorally: the patient reported the loudness of the sound on the scale, and the T level was set just below soft. This behaviorally determined T level was higher than the value calculated as 10% of the C level.
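
As a rough illustration of the default rule described above, the following Python sketch computes T levels as 10% of the C levels and compares the resulting electric dynamic range (C minus T) with that of a behaviorally set program. The function names and all numeric values are hypothetical and chosen only for illustration; they are not study data and not the SoundWave implementation.

```python
def default_t_levels(c_levels, fraction=0.10):
    """Return T levels computed as a fixed fraction of each channel's C level."""
    return [round(c * fraction, 1) for c in c_levels]


def electric_dynamic_range(t_levels, c_levels):
    """Return the per-channel range between T and C levels, in current units."""
    return [c - t for t, c in zip(t_levels, c_levels)]


# Hypothetical C levels in current units (CU) for three channels; not study data.
c_levels = [200, 180, 220]

# Program 1: the 'default' rule, T = 10% of C.
auto_t = default_t_levels(c_levels)

# Program 2: hypothetical behaviorally determined T levels, set higher than 10% of C.
behavioral_t = [28, 25, 31]

print("Default T levels:", auto_t)
print("Range with default T:", electric_dynamic_range(auto_t, c_levels))
print("Range with behavioral T:", electric_dynamic_range(behavioral_t, c_levels))
```

With a higher behavioral T level, the numeric range between T and C is slightly smaller, but every current level inside it lies at or above the level the patient reported hearing.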

Input dynamic range
The IDR is the range from the softest to the loudest sounds that are detected by a sound processor. The wider the range, the more sounds the patient hears. The IDR of a CI sound processor is the ratio, expressed in decibels, between the loudest and the softest sounds that it will present at any given time.
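
The idea of mapping an acoustic window onto the range between T and C can be sketched in Python as follows. This is a simplified illustration, not the processor's actual algorithm: it assumes a mapping that is linear in decibels, and the window ceiling of 85 dB SPL (`ceiling_db`) and the T and C values are hypothetical. Sounds below the bottom of the window are treated as not mapped at all.

```python
def map_to_current(level_db, idr_db, t_level, c_level, ceiling_db=85.0):
    """Map an input level (dB SPL) into the electric range between T and C.

    Assumptions (hypothetical, for illustration only):
    - the IDR window spans [ceiling_db - idr_db, ceiling_db];
    - the mapping is linear in dB between T and C;
    - inputs below the window are not mapped (None is returned).
    """
    floor_db = ceiling_db - idr_db
    if level_db < floor_db:
        return None
    clipped = min(level_db, ceiling_db)  # inputs above the window saturate at C
    fraction = (clipped - floor_db) / idr_db
    return t_level + fraction * (c_level - t_level)


# A soft 30 dB SPL sound with hypothetical T = 20 CU and C = 200 CU.
for idr in (50, 60, 80):
    print(idr, map_to_current(30.0, idr, t_level=20.0, c_level=200.0))
```

Under these assumptions, the 30 dB SPL input falls below the window for a 50 dB IDR but is mapped into the electric range for the 60 and 80 dB settings, which is the sense in which a wider IDR captures more soft sounds.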

Plan of the study
The independent variables tested in this study were the input acoustic dynamic range (IDR) and the electric dynamic range of the channels (the range between T and C levels). Seven programs were created. In the first trial, two programs were created. With the C level fixed at the maximum comfortable level, the IDR at 60 dB (default), and the sensitivity at zero, the T level was tested using the six Ling sounds thresholds. Two programs were compared, one with the T level at 10% of the C level and one with the T level set at a barely audible sound, to determine the program with the best detection threshold for the six Ling sounds. In the second trial, with the T level that yielded the best detection threshold for the six Ling sounds and all other parameters fixed, two more programs were created: one program with the C level at the maximum comfortable level and another program with the C level 10 CU below the maximum comfortable level (at the most comfortable level). Speech discrimination was then tested with a monosyllabic word list to establish the program with the C level that yielded the best score. In the third trial, using the T level that yielded the best detection threshold for the six Ling sounds and the C level that yielded the best score for monosyllabic words, with all other parameters fixed, three programs with three different IDRs (50, 60, and 80 dB) were created to compare the results of the speech in noise test (SPIN) between the three IDRs.
The dependent variables tested in this experiment were the six Ling sounds, the monosyllabic word test, and SPIN. All speech tests were performed in a soundproof booth through a loudspeaker placed at ear level at 0° azimuth and 1 m from the center of the participant's head. The test materials were presented through an IBM-compatible Pentium II computer that controlled a mixing and attenuation network to present stimuli through a power amplifier and loudspeaker.

Six Ling sounds detection threshold levels
The Ling six sounds [8,9] represent different speech sounds from low to high pitch. The test was developed as a quick and easy way for professionals to check the hearing of the patient. It checks that the patient can hear (detection) and in time recognize (identification) each sound across the different speech frequencies. The Ling six sounds test uses isolated phonemes consisting of three vowels [(ah), (oo), (ee)] and three consonants [(m), (s), (sh)] that span the speech frequency range of 250-8000 Hz. They are uttered as follows: ah (as in father), oo (as in moon), ee (as in key), sh (as in shoe), s (as in sock), and m (as in mommy). These phonemes were recorded by a female speaker, were 800 ms in duration, and had a root mean square level within 1 dB of each other. Detection thresholds for the recorded Ling sounds were obtained.

Ling sound frequency [10]
M is a low-frequency sound
O has low-frequency information
E has some low-frequency and some high-frequency information
A is at the center of the speech range
Sh is in the moderately high-frequency speech range
S is in the very high-frequency speech range

Speech discrimination
Speech discrimination was assessed using the Arabic monosyllabic word list according to Soliman [11]. It was presented at 65 dB sound pressure level (SPL). The patient responded by repeating the word heard.

Speech in noise test
The Arabic SPIN was used according to Tawfik et al. [12].
It is an open-set test that includes 25 items. The speech material was delivered to the patients through a front loudspeaker at 0° azimuth, while the background noise (multitalker babble) was delivered from a back loudspeaker. The intensity of the signal was set at 65 dB SPL with a 0 dB S/N ratio. The participant was instructed to ignore the noise and to repeat the speech signals.

Results
The results are shown in Tables 2-4. Subjectively determined T values, which were slightly higher than the manufacturer's recommended setting (10% of the C levels), resulted in a statistically significant decrease in sound-field threshold levels for the six Ling sounds. Monosyllabic word discrimination was significantly better with the C level at the most comfortable level than at the maximum comfortable level.

Discussion
An earlier study [13] found decreased sound-field thresholds and improved perception of soft speech when T levels were increased so that low-level sounds were mapped to higher levels within Nucleus 22 CI users' electrical dynamic range. Zeng et al. [14] found that an IDR between 50 and 60 dB provided the best vowel and consonant recognition for 10 Clarion CI users. Spahr et al. [15] recommended an IDR of 60 dB for use with the AB CII BTE speech processor.

Maximum versus most comfortable level
The results of the present study indicated that performance with the C level at the most, not the maximum, comfortable level was significantly better. Although higher stimulus levels exert positive effects, including better encoding of speech signals because of increased discharge rates and increased numbers of fibers carrying the signal [16], negative effects could include rate saturation and increased channel overlap. The loudness of a pulse train presented on a single electrode of a CI grows monotonically with the stimulus amplitude [17,18]. When multiple electrodes provide interleaved stimulation, as is typical of most modern CI processing strategies, the loudness of the interleaved stimulus is greater than the loudness provided by stimulation from any one of the individual electrodes presented in isolation. This phenomenon is known as loudness summation [19]. The mechanism by which electrical stimulation level affects perception is related to the spatial extent of neural excitation. With a higher level of stimulation, the degree of current spread is increased. Increasing the level of electrical stimulation causes larger activating-potential fields and thus increases the size of the stimulated neural population [20]. With more neurons contributing toward the representation of temporal cues at higher electrical stimulus levels, improved psychophysical and speech perception performance is the outcome. However, with further increases in electrical stimulus level, degradation in the specificity of tonotopic stimulation is expected, secondary to greater overlap of adjacent populations of stimulated neurons [16].
Shifts in the site of stimulation as a function of stimulus level are assumed to be another explanation for the effects of higher stimulus levels on speech performance. Such shifts can affect speech perception [21].

Input dynamic range
Good recognition of vowels and consonants, the constituent units of words, is dependent on the maintenance of spectral contrast. One set of the acoustic cues that specify manner and voicing is found in the gross shape of the amplitude envelope [14]. The present study showed that patients' performance on speech recognition evaluation was worse with a small dynamic range than with a large dynamic range.
Phoneme spectra are characterized by peaks and valleys; vowel spectra typically have high-amplitude peaks and relatively low-amplitude valleys [22]. Although the frequencies of the spectral peaks are considered to be the primary cues to phoneme identity, the spectral contrast, that is, the difference between the spectral peak and the spectral valley, needs to be maintained to some extent for accurate phoneme identification, as mentioned by Loizou and Poroy [22]. Normal-hearing listeners required a 1-2 dB peak-to-valley difference to identify four vowel-like harmonic complexes with relatively high accuracy (75% correct). Listeners with a flat, moderate hearing loss required a 6-7 dB peak-to-valley difference for vowel identification [14]. This was attributed to the lack of compression and the abnormally broad auditory filters associated with hearing loss. Spectral contrast is reduced when phonemes are processed through broad filters because of the shallow filter roll-off. As a result, the internal phoneme representation is 'blurred', leading to poorer identification [23].
Spectral contrast is reduced in CI listeners, not because of abnormally broad auditory filters, which are bypassed with electrical stimulation, but primarily because of the reduced dynamic range and amplitude compression [16]. The large acoustic dynamic range is typically compressed in implant speech processors, using a logarithmic function, to a small electrical dynamic range of 5-15 dB [24]. Another factor that could potentially reduce spectral contrast is the steepness of the compression function used for mapping acoustic amplitudes to electric amplitudes [22]. A highly compressive mapping function would yield a small spectral contrast even if the dynamic range were large. A third factor, background noise, could also reduce spectral contrast, probably to a larger degree in CI listeners than in normal-hearing listeners because of the limited electrical dynamic range [25].
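
The effect of the steepness of the compression function on spectral contrast can be illustrated with a small Python sketch. It assumes a generic logarithmic map, log(1 + kx) / log(1 + k), applied to a normalized acoustic amplitude x between 0 and 1; this form, the parameter k, and the T and C values are assumptions made for illustration and are not taken from any particular speech processor or from the studies cited above.

```python
import math


def compress(x, k):
    """Map a normalized acoustic amplitude x in [0, 1] onto [0, 1].

    Generic logarithmic compression (an assumed form); larger k is more compressive.
    """
    return math.log(1 + k * x) / math.log(1 + k)


def electric_contrast(peak, valley, k, t_level, c_level):
    """Return the peak-to-valley difference, in current units, after compression."""
    span = c_level - t_level
    return span * (compress(peak, k) - compress(valley, k))


# Hypothetical spectral peak and valley amplitudes and hypothetical T and C levels.
for k in (1, 100, 1000):
    contrast = electric_contrast(peak=1.0, valley=0.25, k=k, t_level=20.0, c_level=200.0)
    print(f"k = {k}: electric peak-to-valley contrast = {contrast:.1f} CU")
```

In this sketch the same acoustic peak-to-valley difference yields a progressively smaller electric contrast as k increases, consistent with the statement that a highly compressive mapping reduces spectral contrast even when the electric range itself is unchanged.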
A large IDR maintains sufficient spectral contrast to make the peak in the channel amplitude spectrum more distinct and perceptually more salient, leading to a significant improvement in identification. A wider IDR may present a more complete picture of the sound environment. A narrower IDR may improve speech comprehension when background noise is troublesome, as it reduces unnecessary noise and limits the sound range to that of the normal variations in speech; however, this occurs at the expense of a feeling of isolation, as patients do not hear as many of the sounds around them.

Higher T level
As the consonant envelope distribution was about 20 dB lower than the vowel envelope distribution, the consonants are likely to be mapped into a less-than-optimal electric range [23]. First, some low envelope levels may be mapped into electric levels below threshold. Second, some of the upper portion of the electric dynamic range may not be utilized because few amplitude envelope levels are present. Third, most envelope levels are likely mapped into the lower portion of the electric dynamic range, where both intensity discrimination and modulation detection are poor. A higher T level will raise previously inaudible low envelope levels above the threshold, reduce the unused portion of the electric dynamic range, and map more of the envelope into the upper electric dynamic range, where intensity discrimination and modulation detection are optimal [26]. One negative trade-off of the higher T level and the more compressive mapping is the possibility that low-level noise may become audible [27]. Another negative trade-off is the slightly distorted envelope level distribution; however, Kewley and Burkle [26] found in their study that this distortion should produce little, if any, decrease in consonant recognition.
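
The first point above, that low envelope levels may be mapped below threshold, can be made concrete with a short Python sketch. It assumes a mapping that is linear in decibels across the IDR and a patient whose true behavioral threshold lies above the programmed T level; the function name and all numbers are hypothetical and serve only to illustrate the reasoning.

```python
def inaudible_span_db(programmed_t, c_level, true_threshold, idr_db):
    """Return how many dB at the bottom of the IDR map below the true threshold.

    Assumes (for illustration only) a mapping that is linear in dB from the
    programmed T level at the bottom of the IDR to the C level at the top.
    """
    if programmed_t >= true_threshold:
        return 0.0
    fraction = (true_threshold - programmed_t) / (c_level - programmed_t)
    return min(fraction, 1.0) * idr_db


# Hypothetical channel: C = 200 CU, true behavioral threshold = 40 CU, IDR = 60 dB.
default_program = inaudible_span_db(programmed_t=20.0, c_level=200.0,
                                    true_threshold=40.0, idr_db=60.0)
raised_program = inaudible_span_db(programmed_t=40.0, c_level=200.0,
                                   true_threshold=40.0, idr_db=60.0)

print(f"T at 10% of C: lowest {default_program:.1f} dB of the IDR is inaudible")
print(f"T raised to threshold: lowest {raised_program:.1f} dB of the IDR is inaudible")
```

Under these assumptions, setting T below the patient's true threshold leaves the softest part of the acoustic window inaudible, which is where low-level consonant envelopes are most likely to fall; raising T to the behavioral threshold removes that inaudible portion.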