Frequency sensitivity in mammalian hearing from a fundamental nonlinear physics model of the inner ear

A dominant view holds that the outer and middle ear are the determining factors for the frequency dependence of mammalian hearing sensitivity, but this view has been challenged. The ensuing debate has lacked an account of in what sense, and to what degree, the biophysics of the inner ear might contribute to this frequency dependence. Here, we show that a simple model of the inner ear, based on fundamental physical principles, on its own reproduces the experimentally observed frequency dependence of the hearing threshold. This provides direct support from cochlea modeling for the possibility that the inner ear has a substantial role in determining the frequency dependence of mammalian hearing.

Fig. 1 collects some of the data (for the example of the Mongolian gerbil) [4] that led Ruggero and colleagues to conclude that the inner ear could have a more substantial role in shaping the frequency sensitivity of the mammalian hearing system. The data were read off from the original publications and compiled by the present authors. A number of newer biological measurements and finite element simulations seem to support the lack of frequency specificity of the outer and middle ear (e.g., [5,6], and [7], respectively). For the reader's convenience, data underlying this view are presented for the example of the gerbil's hearing system.

II: METHODS
Over varying species-specific frequency intervals, mammalian hearing is able to access a huge dynamic range of sound (120-130 dB). This is due to the ability of the cochlea's outer hair cells to amplify the signal nonlinearly, leading to strong amplification of weaker sounds and weaker amplification of stronger sounds [8]. In physical space and in frequency space (connected by the tonotopic map), outer hair cells follow a largely scaled building plan [9]. The mammalian hearing sensor, the cochlea, can therefore be described at several levels. The finest one is the level of the outer hair cells, focusing on explaining the intriguing interaction between hair bundle mechanics and electromotility of the hair cell bodies. On a more mesoscopic level, the cochlea's building plan can be captured in terms of so-called Hopf amplifier systems (e.g. [9]) that are composed, as a sequence of mesoscopic sections representing discretized parts of the cochlea, into the Hopf cochlea (e.g. [10]). The device reflects the biophysics of the real cochlea, including hair cell, basilar membrane, and fluid properties, in one model, to such an extent that all salient measured properties in biology could be verified in corresponding simulations. The composition of such sections into a macroscopic model of the sensor [11-13] is based on the detailed biophysics and nonlinear dynamics at work in the cochlea [14,15]. Fundamental for this model is that the sections share the dynamical properties of the microscopic amplification-providing outer hair cells [15,16], which are well modeled by a stimulated Hopf process

$$\dot{z} = \omega_{ch}\left[(\mu + i)\,z - |z|^{2} z - F(t)\right], \qquad z \in \mathbb{C},$$

where z(t) denotes the response amplitude, F(t) a stimulation signal, ω_ch is the characteristic frequency of the Hopf system, and µ is the Hopf parameter [14,15,17-19]. At values µ < 0, the system is below the bifurcation to self-oscillation, but responds to stimulation signals F(t) as a small-signal amplifier [20-22].
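The small-signal amplifier behavior of a single section can be illustrated numerically. The following is a minimal sketch, not the model implementation used here: it assumes the commonly used form of the driven Hopf normal form, ż = ω_ch[(µ + i)z − |z|²z − F(t)], a pure-tone stimulus exactly at ω_ch, and a plain fourth-order Runge-Kutta integrator. The function names, step size, and integration length are illustrative choices.

```python
import cmath
import math


def hopf_rhs(z, t, mu, omega, f0):
    """Right-hand side of the ASSUMED driven Hopf normal form."""
    forcing = f0 * cmath.exp(1j * omega * t)  # pure tone at the characteristic frequency
    return omega * ((mu + 1j) * z - abs(z) ** 2 * z - forcing)


def steady_amplitude(f0, mu=-0.25, omega=2.0 * math.pi, periods=60, steps_per_period=200):
    """Integrate with classical RK4 and return |z| once the transient has decayed."""
    dt = 2.0 * math.pi / omega / steps_per_period
    z, t = 0j, 0.0
    for _ in range(periods * steps_per_period):
        k1 = hopf_rhs(z, t, mu, omega, f0)
        k2 = hopf_rhs(z + 0.5 * dt * k1, t + 0.5 * dt, mu, omega, f0)
        k3 = hopf_rhs(z + 0.5 * dt * k2, t + 0.5 * dt, mu, omega, f0)
        k4 = hopf_rhs(z + dt * k3, t + dt, mu, omega, f0)
        z += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return abs(z)


if __name__ == "__main__":
    # Gain = response amplitude / stimulus amplitude, for weak to strong stimuli.
    for f0 in (1e-4, 1e-2, 1.0):
        gain_db = 20.0 * math.log10(steady_amplitude(f0) / f0)
        print(f"F0 = {f0:g}: gain = {gain_db:.2f} dB")
```

Under these assumptions the gain is largest for the weakest stimulus and shrinks as the stimulus grows, which is the qualitative behavior described in the text for µ < 0.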
Dissipation by viscous losses in the fluid can be described by tailored 6th-order Butterworth low-pass filters [10,12]. The main characteristics of the isolated node dynamics are collected in Ref. [19]. When embedded into a compound cochlea, the response profiles broaden due to the sections' interaction with neighboring ones, reproducing the biological data [23] extremely well [10]. The distance of µ from the bifurcation at µ = 0 defines how strongly a node amplifies an incoming signal; we choose this parameter to match the human hearing sensor. The biophysical properties of the cochlea suggest selecting the characteristic frequencies of the nodes according to a geometric sequence. We use a software implementation of an earlier hardware realization with 29 sections (nodes) covering 7 octaves (14.08-0.11 kHz), or a 31-section model covering the interval 19.912-0.11 kHz. Our partition is optimal in the sense that finer partitions yield identical results for the human amplification range, whereas coarser partitions lead to distortions in the frequency dependence if sufficiently strong amplification is required. For 'flat tuning' (µ ≡ const), all nodes have identical Hopf parameters (conventionally µ ≡ −0.25) [19]. We depart from flat tuning when we impose a soft, gradual decay of amplification towards lower frequencies, to optimally match the human hearing system, or when we tune the nodes actively (mimicking the effect of efferent cochlear connections) in the context of learning [13]. In the latter case, we condition the network towards chosen sounds by tuning unsuited nodes towards weaker amplification. A detailed account of the design of the cochlea used is provided in the supplements of Ref. [19].
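The quoted frequency ranges are consistent with a geometric sequence of four sections per octave: 29 sections give 28 intervals across 7 octaves, and 30 intervals from 0.11 kHz reach about 19.912 kHz. The short sketch below reproduces these endpoints. The quarter-octave ratio is inferred from the quoted numbers rather than stated explicitly, and the function name and the ordering from high to low frequency are illustrative choices.

```python
def characteristic_frequencies(f_low_khz, n_sections, sections_per_octave=4):
    """Return characteristic frequencies in kHz, ordered from highest to lowest.

    ASSUMPTION: neighboring sections differ by a constant ratio of
    2 ** (1 / sections_per_octave), inferred from the quoted endpoints.
    """
    ratio = 2.0 ** (1.0 / sections_per_octave)
    ascending = [f_low_khz * ratio ** k for k in range(n_sections)]
    return ascending[::-1]


freqs_29 = characteristic_frequencies(0.11, 29)  # 7 octaves above 0.11 kHz
freqs_31 = characteristic_frequencies(0.11, 31)  # 7.5 octaves above 0.11 kHz

if __name__ == "__main__":
    print(f"29 sections: {freqs_29[0]:.3f} kHz down to {freqs_29[-1]:.3f} kHz")
    print(f"31 sections: {freqs_31[0]:.3f} kHz down to {freqs_31[-1]:.3f} kHz")
```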

Analytical approximation of the Hopf amplification
In Fig. 2, we demonstrate the essentials of the approximation by Eq. (1) in the main text. Here we compare our cochlea model with the corresponding biological measurements. The hearing threshold involves the whole extension of the cochlea; across this distance, detuning becomes essential. Almost all comparisons made below, however, refer to local properties (involving neighboring sections only), for which the detuning has no noticeable effect. Exceptions are Figs. 8, 10 and 13 of this section, where the measurements are from an appropriately adapted cochlea.
Left-hand panels normally show the biophysical measurements, right-hand panels the corresponding modeling data. Modeling data are either from a hardware implementation of the model (Figs. 3b-7b, 9b) or from real-time computation (Figs. 8b, 10a,b, 11, 12, 13a,b). These results indicate the level of reliability of the model in describing the biophysical processes.
Note that our cochlea model focuses on the active amplification by the inner ear. In addition to active amplification, other processes influence hearing, in particular at high SPL (e.g., bone conductance) or at low frequencies (vibrotactile excitation). These effects are not the subject of our model.

Local amplification
Originally, our model of the cochlea was tuned to match the measured amplification profile of the chinchilla cochlea, one of the best-studied biological cochleas. Gain lines representing iso-intensity stimulation curves, measured at a place along the cochlea of preferred stimulation frequency f_ch for (a) two mammalian species [24] (see also [8,25]), where the basilar membrane displacement relative to the sound pressure level at the eardrum was measured; (b) Hopf cochlea [12].

Compression
Compression of strong inputs is one characteristic nonlinear feature of the mammalian cochlea. Particularities of the compression also justify the usage of the Hopf small-signal amplifier as the underlying amplification concept [12].
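The compressive behavior of a single section can be made explicit with a short worked calculation. It is a sketch under stated assumptions: the driven Hopf normal form is taken as ż = ω_ch[(µ + i)z − |z|²z − F(t)] with µ < 0, the stimulus is a pure tone exactly at ω_ch, and coupling to neighboring sections and viscous filtering are ignored.

```latex
% Assumed stimulus and steady-state ansatz (single isolated section, on resonance)
\begin{align}
F(t) &= F\,e^{i\omega_{ch} t}, \qquad z(t) = A\,e^{i\omega_{ch} t}, \\
% Inserting into the assumed normal form; the terms i*omega_ch*A cancel
0 &= \mu A - |A|^{2} A - F
  \;\;\Longrightarrow\;\;
  |A|\left(|\mu| + |A|^{2}\right) = F \quad (\mu < 0), \\
% Weak stimuli: linear amplification with gain 1/|mu|
|A| &\approx \frac{F}{|\mu|} \quad \text{for } F \ll |\mu|^{3/2}, \\
% Strong stimuli: cube-root compression
|A| &\approx F^{1/3} \quad \text{for } F \gg |\mu|^{3/2}.
\end{align}
```

On a double-logarithmic plot the response therefore changes from a slope of 1 dB/dB at weak stimulation to about 1/3 dB/dB at strong stimulation, with the crossover set by the distance |µ| from the bifurcation.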

Two-tone suppression
Mutual compression of neighboring tones is another nonlinear feature of human hearing. The effect can be considered as a prototype of computation done by the mammalian hearing sensor [26]. (... and 'suppressor tone'), as a function of their intensity: (a) chinchilla [27], (b) Hopf cochlea [12].

Combination tones
The nonlinearities in the amplification process also introduce, by means of amplifier interaction, additional tones called combination tones. Such tones, and in particular their decay laws, are of great importance for the human perception of pitch.

Phase characteristics
The phase behavior along the cochlea also follows that in the biological example.

Medial efferent inhibition
The effect of tuning the cochlea by medial olivocochlear efferent stimulation has also been compared with animal data. Filled circles: medial olivocochlear efferent stimulation; µ is shifted to −0.5. Insets: corresponding animal data [31].
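To get a feeling for what a shift of µ from −0.25 to −0.5 means, the on-resonance response of a single isolated section can be compared for the two values. This is only a rough sketch: it assumes the steady-state amplitude relation a³ + |µ|a = F that follows from the driven Hopf normal form ż = ω_ch[(µ + i)z − |z|²z − F(t)] for a tone at ω_ch, and it ignores the coupling between sections and the viscous filtering of the compound cochlea. Function names and stimulus levels are illustrative.

```python
import math


def on_resonance_amplitude(f0, mu):
    """Solve a**3 + |mu| * a = f0 for a >= 0 by bisection (ASSUMED relation, mu < 0)."""
    lo, hi = 0.0, max(1.0, f0 / abs(mu))  # the left-hand side at hi is >= f0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid ** 3 + abs(mu) * mid < f0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gain_loss_db(f0, mu_ref=-0.25, mu_efferent=-0.5):
    """Reduction of the response (in dB) when mu is shifted from mu_ref to mu_efferent."""
    a_ref = on_resonance_amplitude(f0, mu_ref)
    a_eff = on_resonance_amplitude(f0, mu_efferent)
    return 20.0 * math.log10(a_ref / a_eff)


if __name__ == "__main__":
    for f0 in (1e-6, 1e-2, 1.0):
        print(f"F = {f0:g}: response reduced by {gain_loss_db(f0):.2f} dB")
```

In this simplified picture the shift halves the response to weak stimuli and leaves strong stimuli almost unaffected, because the cubic term dominates the response at high levels regardless of µ.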

Pitch shift
Another characteristic feature of the biological cochlea is the so-called pitch shift effect. The effect describes the human perception of pitch when stimulated by two tones, one of which is shifted in frequency with respect to the other's frequency.

FIG. 13: Pitch-shift experiment [29]. (a) Two-frequency stimulation f_2 = f_1 + 200 Hz. Black stars: psychoacoustic data [32] (partial sound levels 40 dB sound pressure level, two subjects). Red circles: Hopf cochlea (sections as indicated, tones at -74 dB each). Black lines: false predictions by de Boer's formula [33] for k′ = k, k′ = k + 1/2 (dashed) and k′ = k + 1, respectively. (b) Response of a cell of the cat ventral nucleus [34] ('On-L-cell', f_ch = 1.1 kHz) to a three-frequency stimulation (f_c − f_mod, f_c, f_c + f_mod; f_mod = 200 Hz) at 50 dB SPL. Black stars: inverse of the most frequent interspike intervals. Red circles: pitch from 15th cochlea section (f_ch = 1.095 kHz, -64 dB).