Research Article
The effects of larynx height on vowel production are mitigated by the active control of articulators
Introduction
The origin and evolution of language and speech are heavily debated, with a major division between models proposing a recent and sudden origin restricted to modern humans (Berwick and Chomsky, 2017; Hauser et al., 2014; Klein, 2009) and models proposing a deep origin, gradual evolution, and a wider distribution that also includes archaic humans such as the Neanderthals (Dediu and Levinson, 2013, 2018; Johansson, 2015; Lieberman, 2016). In particular, the speech capacities of archaic humans have been linked to the position of the larynx (itself linked to the position of the hyoid bone) and to the corresponding ratio between the horizontal and the vertical parts of the vocal tract (Lieberman, 2016).
While it is currently unclear what this ratio might have been in Neanderthals and when its “modern” value evolved (Dediu and Levinson, 2013; Gokhman et al., 2017; Lieberman, 2016), a more tractable question concerns its effects on speech and language (Boë et al., 2002; de Boer and Fitch, 2010; Lieberman, 2016). More precisely, the seminal claim by Lieberman and Crelin (1971) that a high larynx (a position suggested by some for Neanderthals) reduces the vowel space, making it impossible to produce the widely used [a], [i], [u] and [ɔ], has generated a lively debate centered on the use of computer models of the vocal tract to make such inferences (Boë et al., 2007; de Boer and Fitch, 2010; Lieberman, 2007).
For example, starting from the suggestion (Honda & Tiede, 1998) that larynx height may be deduced from the shape of the oral cavity, Boë (1999) used the “variable linear articulatory model” (VLAM) (Maeda, 1990) coupled with factor analysis and a growth model to argue against Lieberman and Crelin (1971). Building on this and work by Ménard and Boë (2000), Boë et al. (2002) concluded that “the maximal vowel space of a given vocal tract does not depend on the larynx height index: gestures of the tongue body (and lips and jaw) allow compensation for differences in the ratio between the dimensions of the oral cavity and pharynx” (p. 481). Boë et al. (2007) reiterated that, in VLAM, a high larynx does not lead to a less distinctive vowel space. However, de Boer and Fitch (2010) attributed circular reasoning to Boë et al. (2002), as the growth scaling in Boë (1999) and Boë et al. (2002, 2007) was applied after the articulatory factors had been extracted in the VLAM, meaning that any inferred anatomies (Neanderthals, infants) have the same degrees of articulatory freedom as modern female adults, just with a different scaling. This does not hold, for example, in observational data on the pre-babbling vocalizations of infants, which are (epilaryngeally) constricted and clearly have fewer degrees of articulatory freedom (Esling, Benner, & Moisik, 2015). Furthermore, such global scaling preserves the layout of the different components of the model, including the angle and ratio between the pharynx and the oral cavity, but a change in this layout is precisely what has been hypothesized to set modern humans apart.
Finally, de Boer and Fitch (2010) argued that the use of factor analysis in VLAM linearly extrapolates from observed to unobserved cases, likely overestimating the ability of the articulators to compensate for any effects of anatomy. In response, they developed a model that adheres more closely to the anatomical constraints of the vocal tract, showing that a larynx height similar to that of a human female would be ideal for a maximally distinctive vowel inventory (Lieberman, 2012).
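The global-scaling point raised by de Boer and Fitch (2010) can be illustrated with a toy calculation. The sketch below is not VLAM or any published model: it assumes a hypothetical two-segment description of the vocal tract (`ToyTract`, with made-up lengths) and simply shows that a uniform scaling leaves the horizontal-to-vertical ratio unchanged, whereas lengthening only the vertical part changes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTract:
    """Hypothetical two-segment tract; lengths in cm are illustrative only."""
    horizontal_cm: float  # oral (horizontal) part
    vertical_cm: float    # pharyngeal (vertical) part

    def ratio(self) -> float:
        """Horizontal-to-vertical length ratio."""
        return self.horizontal_cm / self.vertical_cm


def scale_uniformly(tract: ToyTract, factor: float) -> ToyTract:
    """Apply the same growth factor to both parts (a global scaling)."""
    return ToyTract(tract.horizontal_cm * factor, tract.vertical_cm * factor)


def lengthen_vertical(tract: ToyTract, extra_cm: float) -> ToyTract:
    """Lengthen only the vertical part, as a stand-in for a lower larynx."""
    return ToyTract(tract.horizontal_cm, tract.vertical_cm + extra_cm)


adult = ToyTract(horizontal_cm=8.0, vertical_cm=8.0)  # assumed values
scaled = scale_uniformly(adult, 0.5)
lowered = lengthen_vertical(adult, 2.0)

print(adult.ratio(), scaled.ratio(), lowered.ratio())  # 1.0 1.0 0.8
```

Under these assumptions, no amount of uniform scaling can produce a tract whose proportions differ from the reference, which is why the layout itself has to be manipulated to test the hypothesis.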
Here we introduce a novel computer model that has several advantages over its predecessors. First, it is based on a widely used, realistic 3D geometric model of the vocal tract (VocalTractLab 2.1) built on modern phonetic theory and calibrated with data (MRI and otherwise) from actual humans (Birkholz, 2005, 2013a; Birkholz and Kröger, 2006). Second, this model allows the programmatic control of multiple meaningful articulatory parameters (such as the position of the tongue tip or the degree of lip rounding), and produces the corresponding acoustic output. Third, with the author’s permission, we modified this model to allow (among other things) the specification of hyoid position. Fourth, we implemented a complete agent that can control this vocal tract model using a generic machine learning algorithm, and which is capable of learning to produce a set of auditorily presented target vowels (here, [ə], [ɑ], [a], [æ], [e], [i], [o] and [u]) by controlling the free articulators of the model. This allows us to systematically study the impact of larynx height on vowel production, and to find both the optimal height for the production of widely used vowels and the compensatory strategies that can mitigate the impact of extreme larynx positions.
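The general shape of such a learning agent can be sketched in a few lines. This is only an illustration under loud assumptions: `toy_synthesize` is a made-up linear stand-in for a vocal tract synthesizer (it is not VocalTractLab, and its coefficients are arbitrary), `anatomy_shift` is a hypothetical scalar standing in for a change in larynx height, and the learner is a simple (1+λ) evolutionary search chosen for brevity rather than the algorithm used in the paper.

```python
import math
import random


def toy_synthesize(params, anatomy_shift):
    """Hypothetical stand-in for a synthesizer (NOT VocalTractLab).

    params: [tongue_height, tongue_frontness], each in [0, 1].
    anatomy_shift: made-up scalar standing in for a change in larynx height.
    Returns an (F1, F2) pair in Hz; all coefficients are arbitrary.
    """
    height, frontness = params
    f1 = 800.0 - 500.0 * height + 100.0 * anatomy_shift
    f2 = 800.0 + 1500.0 * frontness - 150.0 * anatomy_shift
    return (f1, f2)


def acoustic_error(produced, target):
    """Euclidean distance in Hz between two (F1, F2) pairs."""
    return math.dist(produced, target)


def mutate(params, sigma, rng):
    """Add Gaussian noise to each parameter and clip to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in params]


def learn_vowel(target, anatomy_shift, generations=200, offspring=10,
                sigma=0.1, seed=0):
    """(1+lambda) search: keep the parent unless the best child beats it."""
    rng = random.Random(seed)
    parent = [0.5, 0.5]  # neutral starting articulation (assumed)
    parent_err = acoustic_error(toy_synthesize(parent, anatomy_shift), target)
    for _ in range(generations):
        children = [mutate(parent, sigma, rng) for _ in range(offspring)]
        scored = [(acoustic_error(toy_synthesize(c, anatomy_shift), target), c)
                  for c in children]
        child_err, child = min(scored, key=lambda pair: pair[0])
        if child_err < parent_err:
            parent, parent_err = child, child_err
    return parent, parent_err


# Target "heard" from a reference anatomy, then learned under two anatomies.
target = toy_synthesize([0.9, 0.9], 0.0)
base_params, base_err = learn_vowel(target, anatomy_shift=0.0)
shifted_params, shifted_err = learn_vowel(target, anatomy_shift=0.3)
print("reference anatomy:", base_params, base_err)
print("shifted anatomy:  ", shifted_params, shifted_err)
```

In this toy setting, the agent under the shifted anatomy has to settle on different articulatory parameters to approach the same acoustic target, which is the kind of compensation the full model is designed to measure with a realistic vocal tract.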
While still far from perfect, we think that our model represents an important advance, allowing more refined answers to questions surrounding the impact of larynx height on vowel production, and providing a platform for further improvement and application to other aspects of inter-group and inter-individual variation in speech, both pathological and normal (Dediu, Janssen, & Moisik, 2017). Given that the work reported here is in many ways novel, one of our main aims was to start from assumptions that are as “generic” and “theory-free” as possible and to write our code as easily replaceable and upgradeable modules.
Data and methods
The fundamental idea is to study how learning a set of vowels is affected by controlled changes in a particular aspect of vocal tract anatomy, here, larynx height. Such experimental manipulations are extremely difficult to conduct with human participants, but computer simulations using realistic models of the human vocal tract may offer approximations that, while imperfect, may still be good enough for answering specific questions in an objective, repeatable and quantitative manner. For more
Results
The analyses and plots reported here used R 3.4.4 (R Core Team, 2017). The full analysis (including aspects and details, including considering formants, not reported here due to space constraints) can be found in the Supplementary materials in the Appendix. The patterns obtained considering and formants are roughly similar, so we focus here on the first.
We will describe first the tight relationship between the dynamically-adjusted continuous vocal tract ratio
Discussion and conclusions
We focused here on the systematic variation of larynx height, on its effects on vowel acoustics, and on the articulatory mechanisms engaged in compensating for it. Our computational agents, using a generic machine-learning mechanism that controls a realistic geometric model of the vocal tract, learned to a very high degree of accuracy eight target vowels ([ə], [ɑ], [a], [æ], [e], [i], [o] and [u]) that are widely attested cross-linguistically and cover the modern human vowel space. However, this
Acknowledgements
We wish to thank Peter Birkholz for sharing the source code of VocalTractLab 2.1, for allowing us to modify it and for answering our questions, and three anonymous reviewers whose comments and suggestions greatly improved the paper. This work was funded by the Netherlands Organisation for Scientific Research (NWO) VIDI grant 276-70-022 to DD. During the writing of this paper, DD was supported by a European Institutes for Advanced Study (EURIAS) Fellowship (2017–2018) and an IDEXLyon
References (72)
- et al. Keep the lips to free the larynx: Comments on de Boer’s articulatory model (2010). Journal of Phonetics (2014)
- et al. Why only us: Recent questions and answers. Journal of Neurolinguistics (2017)
- et al. Anatomy and control of the developing human vocal tract: A response to Lieberman. Journal of Phonetics (2013)
- et al. The potential Neandertal vowel space was as large as that of modern humans. Journal of Phonetics (2002)
- et al. The vocal tract of newborn humans and Neanderthals: Acoustic capabilities and consequences for the debate on the origin of language. A reply to Lieberman (2007). Journal of Phonetics (2007)
- Self organization in vowel systems. Journal of Phonetics (2000)
- Investigating the acoustic effect of the descended larynx with articulatory models. Journal of Phonetics (2010)
- et al. Language is not isolated from its wider environment: Vocal tract influences on the evolution of speech and language. Language and Communication (2017)
- et al. Neanderthal language revisited: Not only us. Current Opinion in Behavioral Sciences (2018)
- The evolution of speech: A comparative review. Trends in Cognitive Sciences (2000)
- Organization of tongue articulation for vowels. Journal of Phonetics
- Current views on Neanderthal speech capabilities: A reply to Boë et al. (2002). Journal of Phonetics
- Vocal tract anatomy and the neural bases of talking. Journal of Phonetics
- Ontogeny of postnatal hyoid and larynx descent in humans. Archives of Oral Biology
- Human hyoid bones from the middle Pleistocene site of the Sima de los Huesos (Sierra de Atapuerca, Spain). Journal of Human Evolution
- Communicative capacities in Middle Pleistocene humans from the Sierra de Atapuerca in Spain. Quaternary International
- Descent of the hyoid in chimpanzees: Evolution of face flattening and speech. Journal of Human Evolution
- The dispersion-focalization theory of vowel systems. Journal of Phonetics
- Normative standards for vocal tract dimensions by race as measured by acoustic pharyngometry. Journal of Voice
- Reducing bias and inefficiency in the selection algorithm
- Do we need a symbol for a central open vowel? Journal of the International Phonetic Association
- Evolution strategies: A comprehensive introduction. Natural Computing
- 3D-artikulatorische Sprachsynthese. Logos
- Modeling consonant-vowel coarticulation for articulatory speech synthesis. PLoS One
- Vocal tract model adaptation using magnetic resonance imaging
- Modeling the judgment of vowel quality differences. The Journal of the Acoustical Society of America
- Modelling the growth of the vocal tract vowel spaces of newly-born infants and adults: Consequences for ontogenesis and phylogenesis
- Temporal development of compensation strategies for perturbed palate shape in German/sch/-production
- Analogy between laryngeal gesture in Mongolian Long Song and supracricoid partial laryngectomy. Clinical Linguistics & Phonetics
- Micro-biomechanics of the Kebara 2 hyoid and its implications for speech in Neanderthals. PLoS One
- Modelling vocal anatomy’s significant effect on speech. Journal of Evolutionary Psychology
- Computer models of vocal tract evolution: An overview and critique. Adaptive Behavior
- Pushes and pulls from below: Anatomical variation, articulation and sound change. Glossa: A Journal of General Linguistics
- On the antiquity of language: The reinterpretation of Neandertal linguistic capacities and its consequences. Frontiers in Language Sciences