Electrolarynx in voice rehabilitation
Introduction
The removal of the entire larynx as a treatment for laryngeal cancer usually results in the loss of the ability to produce voice and speech. Statistical data show that there are over 600,000 laryngectomees in the world [1], and voice restoration is clearly essential to these people. Standard esophageal (SE) speech and tracheoesophageal (TE) speech are two main methods used by laryngectomees for voice rehabilitation. However, because of the low acquisition rate of SE speech (∼6%) [2] and the fact that as many as one-third of laryngectomized patients find TE speech unsuitable for anatomical or personal reasons [3], electrolarynx (EL) phonation is the most commonly adopted form of phonation. An EL is a battery-powered device with an internal preset pitch that can be adjusted to suit the individual preferences of male and female speakers. Lauder [4] and Rothman [5] found that the EL was easier to use, allowed longer sentences to be produced without special care, and was more effective for communication in many situations.
Since the debut of the first EL, named Sonovox, by Wright in 1942, the EL has undergone many modifications. In 1945, the Aurex company in Chicago started producing an EL named the Aurex Neovox M-520T, which set the design foundation of the modern EL. In 1959, a transistorized EL was developed by Bell Laboratories [6]. To date, several ELs are in commercial use, including the Nu-voice, Romet, Amplicode, Cooper-Rand, and Servox. The first four do not allow pitch adjustment during speaking, and the Servox has only two preset pitch levels (high and low) with an external tone activation switch for use during conversation (see Fig. 1). There are two types of EL: the neck type and the intra-oral type. The neck-type EL is the most widely used among laryngectomees. During phonation, the hand-held device is held against the neck at approximately the level of the former glottis, and an electromechanical vibrator delivers sound into the oral and pharyngeal cavities. The vibration is transmitted through the neck tissues, and the user shapes it into speech by moving articulators such as the lips, teeth, tongue, jaw and velum [6]. The intra-oral type EL differs in the path along which the sound is transmitted: an intra-oral tube delivers the sound directly into the mouth. Energy leakage is therefore limited and speech quality is better than with the neck-type EL. However, because of the inconvenience of using the device, such as articulation and sanitary problems, the intra-oral type EL is mostly used by laryngectomees immediately after laryngectomy or while still undergoing radiation therapy.
Acoustic and perceptual characteristics associated with EL phonation have been studied extensively in the English-speaking alaryngeal population. Because the pitch and intensity of EL speech are fixed during phonation, only a few studies have focused on its acoustic properties [5], [7], [8]. Experiments characterizing the spectral differences between EL and normal phonation indicate that EL speech has an energy peak near 500 Hz with a maximum amplitude near 2.5 kHz, whereas normal speech displays an energy peak near 500 Hz. Qi and Weinberg [8] also reported that EL speech has a substantial low-frequency deficit, with output 30 dB lower than that of normal speech below 500 Hz but 5–10 dB higher above 2 kHz. Most studies have instead described the perceptual characteristics of EL speech, usually in comparison with SE or TE speech [9], [10], [11], [12], [13], [14], [15], [16]. It is generally agreed that EL speech is associated with lower intelligibility, poorer listener acceptability, and more serious voicing confusions than SE or TE speech, although all of these alaryngeal modes are considerably worse than normal laryngeal (NL) speech. A review of the literature also indicates that some studies have examined the characteristics of EL speech in tone languages, including Thai, Cantonese, and Mandarin [17], [18], [19], [20], [21], [22], [23], [24], [25]. Compared with non-tone languages, tone languages are characterized by wider ranges of fundamental frequency (F0) and faster F0 changes, which helps explain the poorer performance of EL speech in these languages. The results indicated that EL speakers were generally unable to produce phonemic tones at a level of proficiency comparable to that of normal speakers. The patterns of tonal confusion in EL speech differed from those in NL speech, and intelligibility was higher for level tones than for rising or falling tones.
Such poor performance was attributed to EL speakers’ inability to consistently produce pitch contours comparable to those of normal speakers, a limitation of the instrument itself. Although a number of studies have focused on the acoustic and perceptual characteristics of EL speech, the literature presents conflicting information regarding comparisons of EL and other alaryngeal speakers. Discrepancies among the findings of these studies may be due to differences in subject sampling, recording methods, methods of analysis, or the speech samples used.
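The band-level differences reported by Qi and Weinberg [8] (30 dB lower below 500 Hz, 5–10 dB higher above 2 kHz) can be illustrated with a short numerical sketch. The Python code below is not their analysis procedure. It assumes NumPy, uses hypothetical helper names (band_level_db, band_difference_db), and applies them to synthetic two-tone signals that were constructed to show a 30 dB deficit in the low band and a 5 dB excess in the high band, so the resulting numbers are illustrative rather than measured.

```python
import numpy as np


def band_level_db(signal, fs, f_lo, f_hi):
    """Energy of `signal` in the band [f_lo, f_hi) Hz, in dB (arbitrary reference)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    energy = np.sum(np.abs(spectrum[in_band]) ** 2)
    # Small offset avoids log10(0) for an empty or silent band.
    return 10.0 * np.log10(energy + 1e-20)


def band_difference_db(test, reference, fs, f_lo, f_hi):
    """Band level of `test` minus band level of `reference`, in dB."""
    return band_level_db(test, fs, f_lo, f_hi) - band_level_db(reference, fs, f_lo, f_hi)


# Synthetic stand-ins (assumptions, not recorded speech): one tone below
# 500 Hz and one tone above 2 kHz, sampled for exactly one second.
fs = 16000
t = np.arange(fs) / fs
low_tone = np.sin(2 * np.pi * 200 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)

reference = low_tone + high_tone  # stands in for normal speech
test = 10 ** (-30 / 20) * low_tone + 10 ** (5 / 20) * high_tone  # stands in for EL speech

low_diff = band_difference_db(test, reference, fs, 0, 500)
high_diff = band_difference_db(test, reference, fs, 2000, 8000)
print(f"below 500 Hz: {low_diff:.1f} dB, above 2 kHz: {high_diff:.1f} dB")
```

With recorded material, the same two functions could be applied to level-matched EL and normal speech samples. The band edges of 500 Hz and 2 kHz follow the text, while the 8 kHz upper limit is simply the Nyquist frequency of the assumed sampling rate.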
Section snippets
EL with pitch-control function
With more advanced technologies, several newer generations of EL have been developed to improve sound quality. As discussed above, the monotonic and robotic sound quality associated with EL speech is due to the lack of pitch adjustment during phonation: pitch is preset at a certain level before use and remains steady during speech production. Improvements in EL design should therefore focus on real-time adjustment of pitch. According to the method of pitch
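The difference between a preset pitch and real-time pitch adjustment can be made concrete with a small signal sketch. The Python code below is only an illustration under stated assumptions: a plain sinusoid stands in for the EL vibrator, the 100 Hz preset and the 100 to 150 Hz rising contour are hypothetical values, and the function names are invented for this example. It does not model any commercial device or any specific pitch-control method.

```python
import numpy as np


def excitation_from_f0(f0_contour_hz, fs):
    """Sinusoidal source whose instantaneous frequency follows the F0 contour."""
    # Integrating frequency over time gives phase, so F0 can change smoothly.
    phase = 2.0 * np.pi * np.cumsum(f0_contour_hz) / fs
    return np.sin(phase)


def mean_f0_by_zero_crossings(signal, fs):
    """Rough mean F0 estimate: a sinusoid changes sign twice per cycle."""
    signs = np.signbit(signal)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    duration_s = len(signal) / fs
    return crossings / (2.0 * duration_s)


fs = 8000
n = fs // 2  # half a second of signal

# Preset pitch: F0 stays at one (hypothetical) level for the whole utterance.
fixed_source = excitation_from_f0(np.full(n, 100.0), fs)
# Real-time control: F0 follows a (hypothetical) rising contour.
rising_source = excitation_from_f0(np.linspace(100.0, 150.0, n), fs)

half = n // 2
fixed_first = mean_f0_by_zero_crossings(fixed_source[:half], fs)
fixed_second = mean_f0_by_zero_crossings(fixed_source[half:], fs)
rising_first = mean_f0_by_zero_crossings(rising_source[:half], fs)
rising_second = mean_f0_by_zero_crossings(rising_source[half:], fs)

print("fixed pitch, first/second half (Hz):", fixed_first, fixed_second)
print("rising pitch, first/second half (Hz):", rising_first, rising_second)
```

The fixed source yields essentially the same estimate in both halves, whereas the rising source yields a clearly higher estimate in the second half, which is the kind of contour a rising tone requires and a preset pitch cannot supply.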
EL speech enhancement
During EL phonation, some of the sound produced by the vibrating diaphragm is radiated directly from the device. A poor interface between the device and the neck or surrounding neck tissues may result in radiated noise that interferes with the intelligibility of the speech. In extreme cases, stiff neck tissue resulting from radiation therapy may reflect all the acoustic energy from the EL back into the environment without propagating it into the oral cavity for articulation. This apparently fails the
Summary
As mentioned above, considerable research has been conducted into practical and theoretical improvements of EL speech. With the development of state-of-the-art technology, we have reason to believe that significant advances will be made in improving the quality of EL speech in two respects. The first is the ability of the EL to adjust pitch and intensity in real time during phonation. The EMG-control EL will likely be adopted for several reasons. First of all, the method of
References (40)
- et al. The effects of training in comprehension of electrolaryngeal speech. J Commun Disord (1973).
- et al. Perception of contrastive stress in alaryngeal speech. J Phonet (1982).
- et al. Speech performance of adult Cantonese-speaking laryngectomees using different types of alaryngeal phonation. J Voice (1997).
- et al. Aerodynamic characteristics of laryngectomees breathing quietly and speaking with the electrolarynx. J Voice (2004).
- et al. Alaryngeal speech aid using an intra-oral electrolarynx and a miniature fingertip switch. Auris Nasus Larynx (2005).
- Hirokazu S, Takahashi H. Voice generation system using an intra-mouth vibrator for the laryngectomee. MS thesis. Japan:...
- et al. Functional outcomes following treatment for advanced laryngeal cancer. Part 1. Voice preservation in advanced laryngeal cancer. Part II. Laryngectomy rehabilitation: the state-of-the-art in the VA system. Ann Otol Rhinol Laryngol (1998).
- et al. Utilization of microprocessors in voice quality improvement: the electrolarynx. Curr Opin Otolaryngol Head Neck (2000).
- The laryngectomee and the artificial larynx—a second look. J Speech Hear Disord (1970).
- Acoustic analysis of artificial electronic larynx speech.
- An experimental transistorized artificial larynx. Bell Syst Tech J.
- Acoustic and perceptual characteristics of speech produced with an electronic artificial larynx. J Acoust Soc Am.
- Low-frequency energy deficit in electrolaryngeal speech. J Speech Hear Res.
- An experimental study of artificial-larynx and esophageal speech. J Speech Hear Disord.
- The relative intelligibility of esophageal speech and artificial larynx speech. J Speech Hear Disord.
- Frequency, duration, and perceptual measures in relation to judgments of alaryngeal speech acceptability. J Speech Hear Res.
- A comparison of the intelligibility of esophageal, electrolarynx, and normal speech in quiet and in noise. J Commun Disord.
- Differences in speaking proficiency in three laryngectomy groups. Arch Otol.
- Production of intonation and contrastive stress in electrolaryngeal speech. J Speech Hear Res.
- Vowel length in Thai alaryngeal speech. Folia Phoniatric Logo.