Rectifying and sluggish: Outer hair cells as regulators rather than amplifiers

In the cochlea, mechano-electrical transduction is preceded by dynamic range compression. Outer hair cells (OHCs) and their voltage dependent length changes, known as electromotility, play a central role in this compression process, but the exact mechanisms are poorly understood. Here we review old and new experimental findings and show that (1) just audible high-frequency tones evoke an ∼1-microvolt AC receptor potential in basal OHCs; (2) any mechanical amplification of soft high-frequency tones by OHC motility would have an adverse effect on their audibility; (3) having a higher basolateral K+ conductance, while increasing the OHC corner frequency, does not boost the magnitude of the high-frequency AC receptor potential; (4) OHC receptor currents display a substantial rectified (DC) component; (5) mechanical DC responses (baseline shifts) to acoustic stimuli, while insignificant on the basilar membrane, can be comparable in magnitude to AC responses when recorded in the organ of Corti, both in the apex and the base. In the basal turn, the DC component may even exceed the AC component, lending support to Dallos' suggestion that both apical and basal OHCs display a significant degree of rectification. We further show that (6) low-intensity cochlear traveling waves, by virtue of their abrupt transition from fast to slow propagation, are well suited to transport high-frequency energy with minimal losses (∼2-dB loss for 16-kHz tones in the gerbil); (7) a 90-dB, 16-kHz tone, if transmitted without loss to its tonotopic place, would evoke a destructive displacement amplitude of 564 nm. We interpret these findings in a framework in which local dissipation is regulated by OHC motility.


Introduction
The anatomical differentiation between inner and outer hair cells (IHC, OHC) in the mammalian cochlea has been known since the work of Retzius in the early 1880s (Grant, 1999), but the functional correlate of this dichotomy remained obscure for almost a century. Because OHCs outnumber IHCs more than threefold, it came as a surprise when Spoendlin (1969, 1972) reported that 95% of the afferent auditory nerve fibers innervate IHCs. This strongly suggested that auditory information processing is primarily carried by the IHCs. What, then, is the function of OHCs? An important step was the discovery that kanamycin-induced OHC destruction caused a ∼40-dB elevation of behavioral threshold exactly in the frequency region where OHCs were absent (Ryan and Dallos, 1975). The authors concluded that "OHCs are specialized to perform a fa[cilitatory] […]."
Around the same time other studies, not directly related to OHCs, provided accumulating evidence that the classic view of the cochlea as a passive, linear mechanical device needed revision. Rhode (1971, 1978) found that basilar membrane (BM) vibrations measured in vivo show a compressive growth with sound pressure. Indirect evidence of the strongly nonlinear character of inner-ear vibrations was obtained from auditory nerve (AN) recordings by Goblick and Pfeiffer (1969) and Rose et al. (1971). In their analyses, both papers explore the possibility of a cochlear gain control mechanism. Rose et al. (1971) wrote "[our findings] suggest the existence of a cochlear sensitivity control mechanism which may, but perhaps need not be, mechanical in nature. Such a mechanism could be a major source of the demonstrable nonlinearity of the system […]." They went on to discuss the potential role of DC response components in the operation of the sensitivity control.
The discovery of prolonged otoacoustic emissions in the ear canal following transient acoustic stimulation (Kemp, 1978) led to speculations about some form of wave amplification at low intensities (Kemp, 1979; Zwicker, 1979). OHCs were the obvious candidate for providing such "active" feedback. Davis (1983) proposed a "motor function for OHC" and speculated that "a piezoelectric effect may be involved in the putative transduction of electrical energy […] to mechanical vibration in the BM." Note that the scenario proposed by these authors, in which active feedback actually drives the motion, is a specific realization of the "facilitatory function" posited by Ryan and Dallos, but by no means the only possible realization. To illustrate, releasing the handbrake of a car parked on a slope facilitates the ensuing motion, but does not drive it; gravity does.
At this point it is important to clearly distinguish two possible ways in which OHCs may provide cochlear feedback. (1) They may control the mechanical properties of the partition, e.g., modify the stiffness or resistance on a time scale of one or several cycles; (2) alternatively, they may act as direct sources of vibrational energy. We refer to these roles as regulatory and cycle-by-cycle, respectively (see Cooper et al., 2018). The two roles are not mutually exclusive, and other roles may exist, too.
The possibility of cycle-by-cycle feedback was quickly embraced by modelers, who sought to explain the apparent discrepancy between the sharply tuned AN responses and the poor frequency selectivity of basilar membrane (BM) vibrations in cadaver cochleas ( Bekesy, 1960 ). The first one to incorporate active feedback into a cochlear model was Zwicker (1979) , who presented "a preliminary model assuming that the outer hair cells act as an amplifier which contains saturation (corresponding to 40 dB) and feed back to sensitize the inner hair cells." One of his motives to consider this type of model was his thorough knowledge of distortion products, audible nonlinear distortions generated in the cochlea that he had studied in psychoacoustic tests ( Zwicker, 1955 ). At the time, in vivo data on tuning sharpness of the BM were inconclusive, and Zwicker assumed that it was as poor as it is post mortem. In that sense, he was modeling a "second filter" in terms of an active process.
Unlike Zwicker, Kim et al. (1980b) and Neely and Kim (1983) built their active models on the assumption that the sharp tuning in the AN has mechanical origins, even though the evidence was controversial at the time. Whereas Kemp (1979) had stated that, to preserve stability, "net damping must never become negative," Neely and Kim went further and used a "BM damping function that is negative in a small region basal to the [peak]" (Neely, 1981). In this scenario the vibrations in the peak region are truly driven by an internal energy source, and the energy flux at the peak exceeds the flux at more basal locations. The active model of Neely and Kim (1983) produced results that matched quite well the sharp tuning known from AN data and from sensitive BM measurements at the time (Sellick et al., 1982). Neely and Kim conclude their paper with the suggestion "that the negative damping components in the model may represent some physical action of the outer hair cells, functioning in the electromechanical environment of the normal cochlea and serving to boost the sensitivity of the cochlea at low levels of excitation."
It is against this background that Brownell et al. (1985) reported that isolated OHCs reversibly change their length upon current injection. In their introduction the authors mention the circumstantial evidence for energy injection from the work of Kemp (1978) and Neely and Kim (1983), but for methodological reasons their own pioneering study was restricted to DC mechanical responses (to both DC and AC current injection). Consequently, the discussion of their results hardly touches upon cycle-by-cycle feedback, and instead focuses on the regulatory effects that OHC length changes may induce in the mechanical properties of the cochlear partition. Referring to the triangle formed by one OHC, the phalangeal process of the Deiters' cell underneath it and the stretch of reticular lamina spanned by them (their fig. 1C) as a "mechanical unit", Brownell et al. remark that "increases in OHC length would make this unit more rigid whereas decreases would make it more compliant." Although they do not state it explicitly, their description of a regulatory action of OHCs fits well in the sensitivity control schemes considered by Goblick and Pfeiffer (1969) and Rose et al. (1971) and in particular the DC effects discussed in the latter study.
The perspective rapidly changed in the years to follow. On the experimental side, advances in measurement techniques, reviewed in Ashmore (2008), allowed the recording of AC motile responses at increasingly higher frequencies. Frank et al. (1999) reported a frequency limit (3-dB down point) of OHC electromotility of 79 kHz. While this may seem to support the physiological feasibility of cycle-by-cycle feedback in the spirit of Kemp (1979), Zwicker (1979) and Kim et al. (1980b), other studies seemed to spell trouble for this hypothesis, notably the finding that electromotility is driven by the OHC membrane potential rather than by transmembrane currents (Santos-Sacchi and Dilger, 1988). Especially at very high frequencies, the AC receptor potential is strongly attenuated as it is shunted by the membrane capacitance. This is commonly referred to as the "RC problem" (Santos-Sacchi, 1989), although it may be argued that this name is misleading (see Section 3.2). Corner frequencies for the OHC membrane, measured in vitro, range from 480 Hz to 1250 Hz (Mammano and Ashmore, 1996; Johnson et al., 2011), and this contrasts with their purported role as high-frequency amplifiers up to 150 kHz in some species (Vater and Kössl, 2011).
On the theoretical side, the vast majority of cochlear models published since the mid-1980s incorporate cycle-by-cycle feedback by OHCs. Refinement of "negative resistance models" improved the match with the sharply tuned BM responses of sensitive ears (Neely and Kim, 1986) and reproduced their nonlinear compressive behavior, too (e.g., De Boer and Nuttall, 2000). Based on a mathematical analysis of AN data, it was claimed by de Boer (1983) that it was impossible to explain the frequency selectivity of the cochlea unless "the resistance component of the BM impedance is negative over a part of the length of the cochlea." Similar analyses applied to BM data have aimed at proving that power injection cannot be dispensed with (Brass and Kemp, 1993; Shera, 2007). However, such mathematical analyses are based on (often implicit) premises that may turn out to be wrong, and one is reminded of John Bell's aphorism that "what is proved by impossibility proofs is lack of imagination" (Bell, 2014). In that context it is worth pointing out that the same cochlear models that successfully reproduced existing BM data failed to anticipate the complex vibration patterns and wideband nonlinearities within the organ of Corti that were revealed by recent advances in measurement techniques (Lee et al., 2015; Ren et al., 2016; Cooper et al., 2018). At the moment it is fair to say that in the field of cochlear mechanics, experiment is ahead of theory.
Over the last two decades, the hypothesis of cycle-by-cycle feedback has gained ground, and only a limited number of studies have been published that explicitly question it (Allen and Fahey, 1992; Van der Heijden and Versteegh, 2014, 2015; Cooper et al., 2018; Vavakou et al., 2019; Santos-Sacchi and Tan, 2018). Cycle-by-cycle amplification is often presented in textbooks as an accomplished fact, even though direct evidence is absent and the micromechanical operation of the cochlea is still poorly understood. The purpose of this article is to review a number of facts, findings and insights, both old and new, that suggest that an unconditional acceptance of the cycle-by-cycle character of OHC function is not justified, and that there are good reasons to consider potential regulatory aspects of OHC function. In our opinion it is important to keep an open mind toward such alternative explanations as long as we cannot claim that we understand how the cochlea works.

High-frequency motility and its limitations
Some mammals hear up to 150 kHz (Vater and Kössl, 2011). The cycle of a 150-kHz tone is 6.7 μs. If hearing at such frequencies is indeed aided by cycle-by-cycle feedback from OHCs, electromotility must be extremely fast and accurately timed. Recent work by Santos-Sacchi's group indicates that the motility process itself (mediated by conformational changes of prestin) may be much slower than previously believed (Santos-Sacchi and Tan, 2018). They report inherent motility corner frequencies of ∼5 kHz, in marked contrast with the 79 kHz of Frank et al. (1999). This sluggishness is hard to reconcile with the hypothesis of cycle-by-cycle amplification at high frequencies. But even if the motile process itself were extremely fast, it would still need a sufficiently large AC receptor potential to provide amplification because it is voltage driven. Cody and Russell (1987) reported in vivo recordings of the receptor potential of hair cells in the basal turn of the guinea pig cochlea. Their fig. 6B shows the AC component of the receptor potential in an OHC in the 17-kHz region. They corrected their raw data for two effects: (1) the low-pass characteristics of their equipment; (2) the low-pass filtering by the cell membrane, which they estimated from the 1200-Hz corner frequency of the cell. Fig. 1 reproduces the 17-kHz curve after undoing the second correction. This restores the AC receptor potential seen by the cell, which cannot ignore its own membrane filtering.
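To give a feeling for the size of the membrane filtering that is restored here, the following sketch evaluates the gain of a low-pass filter with a 1200-Hz corner frequency at 17 kHz. The first-order form of the filter and the function name are assumptions made for illustration only; the correction procedure actually used by Cody and Russell may differ.

```python
import math

def lowpass_gain_db(freq_hz, corner_hz):
    """Gain (dB) of a first-order low-pass filter (the first-order form is an assumption)."""
    magnitude = 1.0 / math.sqrt(1.0 + (freq_hz / corner_hz) ** 2)
    return 20.0 * math.log10(magnitude)

# Attenuation of a 17-kHz AC signal by a membrane with a 1200-Hz corner frequency
attenuation_db = lowpass_gain_db(17_000.0, 1_200.0)
print(f"{attenuation_db:.1f} dB")
```

Under this assumption the membrane attenuates the 17-kHz component by roughly 23 dB relative to low frequencies.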

Magnitude of the AC receptor potential in OHC
In the low-SPL range, the AC receptor potential varies linearly with sound pressure (dashed lines in Fig. 1 ). The behavioral threshold for guinea pigs is ∼0 dB SPL near 16 kHz ( Prosen et al., 1978 ). Extrapolating Cody and Russell's data, the AC receptor potential evoked by a just audible 17-kHz tone amounts to 2.0 μV. Note that this is at the tone's best place, already after the putative amplification. The 17-kHz recording site is where waves of lower frequencies acquire their "amplitude boost" on their way to peaking at a more apical location. The lowest frequency subject to this boost is half an octave below the characteristic frequency (CF) ( Robles and Ruggero, 2001 ), here 12 kHz. Thus, 15 kHz, for which Cody and Russell present data, falls well into the range of frequencies eligible to receive local cycle-by-cycle feedback from this OHC. Extrapolating the 15-kHz curve to the 0-dB-SPL hearing threshold ( Fig. 1 ) yields an estimated 0.6-μV AC receptor potential.
The foregoing estimates of the receptor potential are based on a straightforward extrapolation of Cody and Russell's in vivo data. They do not depend on any assumptions regarding corner frequencies or other parameters. It is unfortunate that Cody and Russell did not report 12-kHz data, as this would have enabled an even more critical test. A 12-kHz wave is one that just enters its "active region" when arriving at the 17-kHz place. A plausible estimate can still be derived. Knowing that low-SPL waves in the base receive a total amplitude gain of ∼15 dB when traveling from the spatial onset of their active region to their peak ( fig. 3A of Ren, 2002 ), this estimate amounts to 15 dB below the 17-kHz receptor potential, yielding 0.36 μV.
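The arithmetic behind the 12-kHz estimate can be retraced in a few lines. This is only a sketch of the calculation described above; the variable and function names are ours, and the input values are those quoted in the text.

```python
import math

def scale_by_db(amplitude, gain_db):
    """Scale an amplitude by a gain expressed in dB (20*log10 convention)."""
    return amplitude * 10.0 ** (gain_db / 20.0)

cf_khz = 17.0      # characteristic frequency of the recording site
v_cf_uv = 2.0      # extrapolated AC receptor potential at threshold for 17 kHz (uV)
boost_db = 15.0    # amplitude gain accumulated across the active region

onset_khz = cf_khz / math.sqrt(2.0)             # half an octave below CF
v_onset_uv = scale_by_db(v_cf_uv, -boost_db)    # wave that just enters its active region

print(f"{onset_khz:.1f} kHz, {v_onset_uv:.2f} uV")
```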
These estimates of the receptor potential of just audible tones, directly derived from Cody and Russell's data, are consistent with independent estimates based on the equivalent electrical circuit of OHCs (Appendix 2 of Vavakou et al., 2019), which also yield values in the order of 1 μV.
To put these estimates in context, we note that in isolated OHCs, a 1-μV variation in membrane potential evokes a fractional length change of 3 × 10⁻⁷, or 1 part in 3.3 million (Ashmore, 2008). For the ∼25-μm tall OHC from the 17-kHz region (Dannhof et al., 1991), this is a length change ΔL = 0.0075 nm, or 7.5% of the diameter of a hydrogen atom. It is difficult to assess the resultant motion in the in vivo situation. One may argue that the in vivo displacement is even smaller, because OHCs are firmly constrained by the structural embedding in the organ of Corti. Alternatively, one may argue that resonances within the organ could boost the motion of structures coupled to the OHC. A more insightful way to assess the magnitude of the receptor potential of soft sounds is by relating it to the expected noise floor, since this is what limits audibility (De Vries, 1948; Allen, 1997).
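The length-change estimate follows from a single multiplication, sketched below. The names are ours, and the 0.1-nm figure for the diameter of a hydrogen atom is an assumed round value used only for the comparison.

```python
fractional_change_per_uv = 3e-7   # fractional length change per microvolt (value quoted above)
cell_length_nm = 25e3             # ~25-um tall OHC, expressed in nm
receptor_potential_uv = 1.0       # AC receptor potential of a just audible tone
hydrogen_diameter_nm = 0.1        # assumed round figure for a hydrogen atom

delta_l_nm = fractional_change_per_uv * receptor_potential_uv * cell_length_nm
fraction_of_atom = delta_l_nm / hydrogen_diameter_nm

print(f"{delta_l_nm:.4f} nm, {100 * fraction_of_atom:.1f}% of a hydrogen atom")
```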

Amplifying internal noise
The ∼1-μV amplitude of the AC receptor potential evoked by just audible tones is very small, not only when compared to the > 200-mV range of membrane potentials that evoke OHC length changes (Ashmore, 2008), but in particular when compared to the expected electrical noise, i.e., statistical fluctuations in the absence of stimulation. We briefly consider thermal (Johnson) noise and shot (quantization) noise. For an insightful discussion of thermal noise in hair cells and its fundamental impact on their dynamic range, see Allen (1997). The variance of Johnson noise is σ² = kT/C, where k is Boltzmann's constant, T the absolute temperature and C the cell membrane capacitance (Allen, 1997). With C = 4 pF in the 17-kHz region (Johnson et al., 2011), the RMS value of the thermal noise is 33 μV.
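The thermal-noise figure can be reproduced directly from σ² = kT/C. In the sketch below, the temperature of 310 K (body temperature) is our assumption, since the text does not state the value used; the function name is also ours.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant (J/K)

def thermal_noise_rms_v(capacitance_f, temperature_k=310.0):
    """RMS Johnson noise (V) across a capacitance: sqrt(kT/C). 310 K is an assumed value."""
    return math.sqrt(K_BOLTZMANN * temperature_k / capacitance_f)

rms_uv = 1e6 * thermal_noise_rms_v(4e-12)   # C = 4 pF in the 17-kHz region
print(f"{rms_uv:.1f} uV")
```

Note that the result depends only weakly on the assumed temperature, because the RMS value scales with the square root of T.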
Shot noise arises from the bimodal character of the mechanotransducer (MT) channels which, in the absence of stimulation, spontaneously clatter between their open and closed states. It is a form of quantization noise: the fewer MT channels ("bits") a cell has, the poorer its S/N ratio is. With 60 MT channels (Beurg et al., 2006) the RMS of the shot noise amounts to 13% of the peak-to-peak range V_pp of the receptor potential at maximum stimulation, i.e., all channels closed versus all channels open (Van der Heijden and Versteegh, 2015). From the maximum value of the 10-kHz curve (Fig. 1) of Cody and Russell's recording, V_pp is at least 1.5 mV for this OHC, yielding a lower boundary for the shot-noise floor of 195 μV.
Even if a 1-μV AC receptor potential could somehow produce significant electromotile feedback, this would not serve its purported goal, namely to enhance auditory sensitivity. It would achieve the very opposite. For a tone evoking a 1-μV receptor potential, the S/N ratio of the OHC feedback is −30 dB and −46 dB for thermal and shot noise, respectively. Any motile response would then be completely hijacked by stochastic fluctuations, drowning the soft tone in amplified noise. It would take the parallel, synchronized operation of 1000 OHCs to just overcome the thermal noise and improve the S/N ratio to a meager 0 dB. It would take 40,000 synchronized OHCs to just overcome the shot noise. The "active region" that determines the sensitivity to single tones spans ∼0.5 mm and contains ∼150 OHCs. And even these ∼150 OHCs cannot work in synchrony: wave propagation is slow in the active region, creating sizeable phase differences within that region (> 1 cycle total; Ren, 2002). This means that OHCs must work in series rather than in parallel. If they really were amplifiers, they would be cascaded, exactly as in the wave amplifier envisioned by Kemp (1979). Indeed, cochlear compression exhibits a spatial build-up that reflects a cascaded OHC action (Versteegh and Van der Heijden, 2013). But if the nature of that action is amplification, the cascaded configuration only exacerbates the problem, because the S/N ratio further degrades at every link of the chain. The noise will grow exponentially.
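The S/N figures and cell counts quoted above can be retraced as follows. The sketch assumes that the noise of different cells is independent, so that averaging N synchronized cells improves the amplitude S/N ratio by √N; this assumption and the function names are ours.

```python
import math

signal_uv = 1.0                 # AC receptor potential of a just audible tone
thermal_rms_uv = 33.0           # thermal (Johnson) noise floor
shot_rms_uv = 0.13 * 1500.0     # 13% of V_pp, with V_pp of at least 1.5 mV

def snr_db(signal, noise):
    """Amplitude signal-to-noise ratio in dB."""
    return 20.0 * math.log10(signal / noise)

def cells_for_unity_snr(signal, noise):
    """Cells needed for 0-dB S/N, assuming independent noise (S/N grows as sqrt(N))."""
    return round((noise / signal) ** 2)

snr_thermal = snr_db(signal_uv, thermal_rms_uv)
snr_shot = snr_db(signal_uv, shot_rms_uv)
n_thermal = cells_for_unity_snr(signal_uv, thermal_rms_uv)
n_shot = cells_for_unity_snr(signal_uv, shot_rms_uv)

print(f"thermal: {snr_thermal:.0f} dB, {n_thermal} cells")
print(f"shot:    {snr_shot:.0f} dB, {n_shot} cells")
```

Under these assumptions the required numbers come out at roughly 1.1 × 10³ and 3.8 × 10⁴ cells, in line with the rounded figures in the text.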
Finally, it is worth noting that even a hypothetical high-fidelity mechanical amplifier would not improve detection of soft tones. Detection of faint signals is limited by noise and in the best case the S/N is preserved by the amplifier, which means that detectability is spared, not improved. Put differently, by postulating a mechanical amplifier the detection problem is pushed from the input of transduction (IHCs) to the input of the amplifier (OHCs). To quote Ray Meddis, "The idea that the system is detecting a signal in order to amplify it in order to detect it has always seemed odd."

The (R)C problem
In vitro measurements of OHC corner frequency f_c or, equivalently, the RC time constant τ_RC = 1/(2πf_c) have yielded values of 480 Hz (7-kHz region of the guinea pig; Mammano and Ashmore, 1996) and 300-1250 Hz (300-2500-Hz regions of the gerbil; Johnson et al., 2011). In addition, Cody and Russell (1987) used current pulses to measure f_c in vivo for a 17-kHz OHC, yielding a 1200-Hz value.
Such f_c values seem hard to reconcile with the putative role of OHCs as high-frequency amplifiers. Reconciliation has been sought along two different lines: (1) the motile process somehow "circumvents" the low-pass character of the receptor potential; (2) RC times of OHCs in more basal regions are much smaller than in the apical and middle turns.
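For reference, the conversion between corner frequency and RC time constant is sketched below for the corner frequencies quoted above. The function name is ours.

```python
import math

def time_constant_us(corner_hz):
    """RC time constant (microseconds) for a given corner frequency: 1 / (2*pi*f_c)."""
    return 1e6 / (2.0 * math.pi * corner_hz)

for corner_hz in (300.0, 480.0, 1200.0, 1250.0):
    print(f"f_c = {corner_hz:6.0f} Hz  ->  tau_RC = {time_constant_us(corner_hz):6.1f} us")
```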
The first class of proposals accepts the low-pass character of the receptor potentials, but introduces hypothetical schemes that somehow overcome or compensate it in an attempt to make the electromotile output less band-limited than its electrical drive. Examples of such "circumvention schemes" include: gating by extracellular potential ( Dallos and Evans, 1995 ); chloride influx by stretch activation of the lateral membrane ( Rybalchenko and Santos-Sacchi, 2003 ); current flow in a 3D model of the organ of Corti ( Mistrik et al., 2009 ); the purported wideband character of the imaginary component of piezoelectrics postulated to drive electromotility ( Rabbitt, 2020 ). For more details on such schemes, see Ashmore (2008) . The circumvention schemes often invoke hypothetical processes that are difficult to test directly by experiment. The proposal of Rabbitt (2020) was experimentally tested by Santos-Sacchi et al. (2021) , who found that the effect of the imaginary component was too small to alleviate the lowpass character of electromotility.
A generic experimental approach to addressing circumvention schemes was taken in Vavakou et al. (2019). They measured sound-evoked vibration in the basal turn of the gerbil cochlea (CF, 13-25 kHz) and determined the spectral characteristics of OHC motility in vivo by an analysis of the vibrations in the OHC area. They found a clear-cut first-order low-pass characteristic (6-dB/octave roll-off; 0.25-cycle phase accumulation) with corner frequencies ranging from 2 to 3.1 kHz, i.e., three octaves below CF. Importantly, these corner frequencies were extracted from the mechanical responses of the intact cochlea, so they directly reflect the low-pass character of electromotility itself rather than the electrical properties of OHCs that the circumvention schemes sought to outsmart. With respect to circumvention schemes Vavakou et al. therefore remarked "Our findings do not support such schemes, as the ∼2.5-kHz corner frequency is evident in the motile response itself."
We now turn to the proposal that RC times are smaller at higher CFs, i.e., in the base, which allegedly makes them fast enough to provide cycle-by-cycle feedback at CF (Johnson et al., 2011). There is experimental evidence against that proposal: Cody and Russell's f_c = 1200 Hz in the 17-kHz OHC, and the data of Vavakou et al. (2019), who measured the corner frequency of motility in the basal turn (CF, 13-25 kHz) and found it to be much lower than the values proposed by Johnson et al. (2011). There is, however, a more fundamental issue at stake here, namely the implicit assumption that a higher corner frequency implies a larger high-frequency drive to electromotility.
Basal OHCs are smaller than apical OHCs, but R and C scale in opposite ways with membrane surface, so size per se does not affect RC times. Capacitance of bilipid cell membranes is quite universally given by ∼1 μF/cm², and OHCs are no exception (Huang and Santos-Sacchi, 1993). This leaves the option of having increasingly leaky OHCs toward the base (smaller R, hence smaller RC and larger f_c), as indeed argued by Johnson et al. (2011), based on the observation that the K⁺ conductance of OHCs increases from apex to base.
Apart from the experimental underpinning, the question is: would this help high-frequency motility? Lowering R makes f_c larger, but does it also boost the amplitude of the receptor potential? After all, time constants and corner frequencies only inform us about the relative amplitudes of low- and high-frequency components. Fig. 2 answers this question. Using the equivalent electrical circuit of the OHC from fig. 6 of Johnson et al. (2011), the AC component of the receptor potential is shown as a function of frequency for a constant stimulation level of the hair bundle (1% modulation of the MT conductivity, well within the linear range of the model). The different curves illustrate the effect of varying the K⁺ conductance G_K from 50 nS to 1000 nS. The corner frequency increases proportionally, from 1.6 kHz to 32 kHz, but this primarily reflects a loss in low-frequency sensitivity, as illustrated in Fig. 2B. The gain at high frequencies is marginal once G_K exceeds 100 nS. It never exceeds 3.5 dB, and for it to exceed 2 dB, a value of G_K > 200 nS is required, whereas the highest experimental value in fig. 6 of Johnson et al. is 155 nS. In summary, pushing the corner frequency beyond 3 kHz has little effect on the high-frequency receptor potential. We conclude that corner frequencies and RC times per se are not the key to high-frequency motility. It is the membrane capacitance that shunts the high-frequency receptor potential, so perhaps the "RC problem" should rather be called the "C problem."
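The qualitative point can be illustrated with a deliberately simplified, single-compartment small-signal circuit. This is not the circuit of Johnson et al. (2011) and its numbers are not expected to match Fig. 2; the circuit topology, the assumption that G_K alone sets the corner frequency, and all function names are ours. The parameter values are those listed in the Fig. 2 caption.

```python
import math

C_M = 5e-12      # total membrane capacitance (F)
G_MT = 60e-9     # resting MT conductance (S)
EP = 90e-3       # endocochlear potential (V)
E_K = -75e-3     # reversal potential of K+ channels (V)
MOD = 0.01       # 1% modulation of the MT conductance

def corner_frequency_hz(g_k):
    """Corner frequency if G_K alone sets the membrane conductance (assumption)."""
    return g_k / (2.0 * math.pi * C_M)

def ac_potential_v(g_k, freq_hz):
    """AC receptor potential amplitude (V) in the assumed one-compartment circuit."""
    v_rest = (G_MT * EP + g_k * E_K) / (G_MT + g_k)    # resting membrane potential
    i_ac = MOD * G_MT * (EP - v_rest)                   # modulated transducer current
    admittance = complex(G_MT + g_k, 2.0 * math.pi * freq_hz * C_M)
    return abs(i_ac / admittance)

def gain_db(g_k, freq_hz, g_ref=100e-9):
    """Change in AC receptor potential relative to the G_K = 100 nS case."""
    return 20.0 * math.log10(ac_potential_v(g_k, freq_hz) / ac_potential_v(g_ref, freq_hz))

for g_k in (50e-9, 1000e-9):
    print(f"G_K = {g_k * 1e9:5.0f} nS -> f_c = {corner_frequency_hz(g_k) / 1e3:.1f} kHz")
print(f"gain at 100 Hz : {gain_db(1000e-9, 100.0):.1f} dB")
print(f"gain at 100 kHz: {gain_db(1000e-9, 100e3):.1f} dB")
```

Even in this crude sketch, a tenfold increase of G_K costs more than 10 dB at low frequencies while changing the response in the capacitance-dominated range by only a few dB, because there the amplitude is set by the current divided by 2πfC rather than by the membrane resistance.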

Rectification by OHCs
IHC receptor currents are strongly rectified. This yields a DC component in the receptor potential that follows the envelope of the acoustic stimulus (Russell and Sellick, 1978). The ability to hear frequencies above a few kHz (the phase locking limit of the AN) entirely depends on this rectification. If cochlear compression is based on a regulatory action by OHCs, their receptor current would also need to be rectified in order to produce a "control signal" that regulates the gain. Rectification in OHCs should then occur over the range of sound intensities that evoke compressive growth of the vibrations, i.e., starting below 30 dB SPL and extending to at least 100 dB SPL in many cases (Robles and Ruggero, 2001). In this section we review the experimental evidence for rectification by OHCs.
Fig. 2 (caption; beginning not recovered). […] Different curves illustrate the effect of lowering the K⁺ conductance G_K as indicated in the graph. When G_K is increased beyond 100 nS, the corner frequency f_c increases proportionally, but the effect on the receptor potential above 10 kHz is marginal. (B) Data of panel A normalized to the G_K = 100 nS curve. The gain resulting from increasing G_K never exceeds 3.5 dB, and for it to exceed 2 dB, a value of G_K > 200 nS is required. (Model parameters: resting value of MT conductance, 60 nS; total membrane capacitance, 5 pF; endocochlear potential, 90 mV; reversal potential of K⁺ channels, −75 mV).

Electrophysiological data
In the intact cochlea, the resting position of OHC hair bundles is determined by their embedding in the tectorial membrane, so in vitro hair bundle stimulation of isolated OHCs provides little information concerning the extent of rectification. In vivo recordings of OHCs in the basal turn (Cody and Russell, 1985) only showed DC responses to CF tones at high (> 90 dB SPL) sound levels. In contrast, OHC recordings by Dallos and colleagues in the apical and middle turns (reviewed in Dallos, 1986) showed a significant DC component at intensities as low as 30 dB SPL. Dallos (1986) discusses potential explanations for the apparent discrepancy with the basal-turn data of Russell and colleagues, including the possibility of abnormal polarization of the OHC in the latter data set. An intriguing aspect of Dallos' (1985, 1986) data is the polarity reversal of the DC response, both as a function of stimulus level for a fixed frequency and vice versa. Fig. 3 reproduces some of Dallos' observations. Similar level- and frequency-dependent polarity reversals had been observed in the summating potential (SP), the extracellular cochlear response generally attributed to collective OHC receptor currents (Dallos and Cheatham, 1976; Pappa et al., 2019). The in vivo intracellular recordings of Dallos and colleagues suggest that the polarity reversals in the SP are not the result of interference between different groups of OHCs (or between OHCs and IHCs), but that they reflect properties of the DC receptor current of individual OHCs.

Temporal and spectral effects of rectification
Rectification of single tones creates a DC component and even harmonics, predominantly the second harmonic (Fig. 4B,C). For complex stimuli such as tone pairs ("beats"), rectification also produces an envelope-following difference tone (f2 − f1) and a sum tone (f1 + f2), as shown in Fig. 4D,E. These 2nd-order distortion products (DP2s) are the generalizations of the DC response and the second harmonic, respectively. The main message of Fig. 4 is that DC responses, envelope-following components and difference tones are inseparable aspects of one and the same rectification process.
Also illustrated in Fig. 4 is the fact that immediately after the rectifying process (prior to any subsequent filtering) difference tones (f2 −f1) and sum tones (f1 + f2) have equal magnitude. This also applies to the single-tone case where the DC component ( f − f ) and the 2nd harmonic ( f + f ) have the same magnitude. These amplitude equalities between DPs having identical "parent primaries" are explained in Meenderink and Van der Heijden (2011) . Thus, if the magnitudes of such DP2 pairs are unequal in actual recordings, this reveals the operation of a frequency selective process ("filter") subsequent to the rectifying process. For the familiar case of a low-pass filter, this is illustrated in Fig. 4 C,E.
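These amplitude equalities can be checked numerically for the simplest possible rectifier, a pure square-law (second-order) nonlinearity. The sketch below is an illustration under that assumption and is not the analysis of Meenderink and Van der Heijden (2011); the tone frequencies, sampling rate and function names are arbitrary choices of ours.

```python
import math

FS = 1000    # sampling rate (Hz); arbitrary choice
N = 1000     # one second of samples, so integer frequencies are resolved exactly

def tone(freq_hz):
    """Unit-amplitude cosine sampled at FS."""
    return [math.cos(2 * math.pi * freq_hz * i / FS) for i in range(N)]

def square_law(signal):
    """Assumed rectifier: a pure second-order nonlinearity, y = x**2."""
    return [s * s for s in signal]

def component_amplitude(signal, freq_hz):
    """Amplitude of the component at freq_hz, by direct projection onto cos and sin."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / FS) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / FS) for i, s in enumerate(signal))
    scale = 1.0 if freq_hz == 0 else 2.0   # DC is a plain mean; AC needs a factor 2
    return scale * math.hypot(re, im) / len(signal)

f, f1, f2 = 100, 90, 110

# Single tone: DC component (f - f) versus 2nd harmonic (f + f)
single = square_law(tone(f))
dc_single = component_amplitude(single, 0)
h2_single = component_amplitude(single, 2 * f)

# Tone pair: difference tone (f2 - f1) versus sum tone (f1 + f2)
pair = square_law([a + b for a, b in zip(tone(f1), tone(f2))])
diff_tone = component_amplitude(pair, f2 - f1)
sum_tone = component_amplitude(pair, f1 + f2)

print(dc_single, h2_single, diff_tone, sum_tone)
```

Passing `single` or `pair` through any low-pass filter before calling `component_amplitude` would break these equalities in favor of the low-frequency member of each pair, which is the diagnostic described above.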

Rectification in cochlear mechanics
Evidence for the presence of rectification products in cochlear vibration predates the discovery of OHC motility. Difference tones f2 − f1 had been studied in psychoacoustics (Zwicker, 1955; Zwicker, 1979), and the fact that they can be canceled by acoustic tones at f2 − f1 shows that there exist cochlear-mechanical correlates of rectification. Responses to tone pairs recorded from the AN (Kim et al., 1980a) and cochlear microphonics (Gibian and Kim, 1982) showed that DP2s propagate from the best place of the primaries to their own best place.
Fig. 4 (caption; beginning not recovered). […] f1, f2). The low-pass filtered (1st-order, τ = 1.5 tone cycle) version of the rectified waveform (green) is dominated by the envelope of the original waveform (blue), while its fine structure is reduced to a mere ripple. (E) Schematic power spectrum of the waveforms in C. The envelope-following waveform in D is spectrally represented by the combination of DC and the difference tone f2 − f1.
Until recently, most cochlear mechanical studies reported BM vibrations in the base of the cochlea ( Robles and Ruggero, 2001 ). Despite initial claims of tone-evoked DC shifts of the BM ( LePage, 1987 ), DC shifts of the BM were shown to be insignificant, except at very high ( > 90 dB SPL) stimulus levels ( Cooper and Rhode, 1992 ). However, large DC shifts were later found in the apical turn of the guinea pig and chinchilla ( Rhode and Cooper, 1996 ;Cooper and Dong, 2003 ). These data differed in two methodological aspects from the majority of cochlear mechanics data known at the time: (1) they were recorded in the apex of the cochlea; (2) the recordings were not obtained from the BM, but from structures inside the organ of Corti.
As an example of these apical data, the stimulus-evoked Hensen's cell displacement from fig. 1 of Cooper and Dong (2003) is reproduced in Fig. 5. The 300-Hz tone evoked a substantial DC shift towards scala vestibuli that was comparable to the amplitude of the AC response. Figure 9B of Rhode and Cooper (1996) shows DC shifts in the tectorial membrane (CF, 600 Hz). Over a wide range of frequencies (200-1000 Hz), 50 dB-SPL tones evoked DC shifts of ∼2 nm, while 60 dB SPL tones evoked ∼10-nm DC shifts. The DC shifts were typically smaller than the AC displacements and were physiologically vulnerable. The DC shifts reported by Rhode and Cooper (1996) were directed towards scala vestibuli, except for low-intensity tones about an octave below CF, which evoked small displacements toward scala tympani. As noted by the authors, this polarity reversal strikingly resembles Dallos' intracellular OHC recordings in the apex (reproduced in our Fig. 3).
The mechanical DC shifts that Rhode and Cooper (1996) and Cooper and Dong (2003) found in the apex were not the only observation that made their data very different from the more familiar basal BM recordings. Another peculiarity of the apical data was the very wide frequency range (spanning several octaves) over which the responses were compressive, compared to the narrow ( ∼1/2 octave) compressive range consistently found in basal BM recordings. The authors primarily discussed these contrasting behaviors in terms of base-versus-apex differences, and so did later commentators ( Robles and Ruggero 2001 ). However, recent studies of vibration inside the organ of Corti show that Rhode and Cooper's (1996) findings in fact foreshadowed very similar observations in the base of the cochlea: wideband compression ( Ren et al., 2016 ;Cooper et al., 2018 ) as well as strong rectification ( Vavakou et al., 2019 ;He and Ren, 2021 ). As discussed by Cooper et al. (2018) and by Vavakou et al. (2019) , the actual dichotomy is not base versus apex, as had previously been assumed, but rather BM versus organ of Corti. Thus, there is less reason than existed formerly to postulate essential differences between apical and basal cochlear mechanics, and this is reassuring in view of the basic anatomical homogeneity of the cochlear partition along its length.
The rectified response to two-tone stimuli of fig. 1 of Vavakou et al. (2019) , recorded in the OHC area of the gerbil basal turn (13-kHz location), is reproduced in Fig. 6 A-C, together with another two-tone response from the same cochlea ( Fig. 6 D-F). In both cases, the distance between the tones (the beat rate) was 800 Hz, but the tones were centered at 5 kHz and 13 kHz, respectively. The extent of rectification, as manifested by the relative contributions of the envelope-following and fine-structure response components, is much larger in the 13-kHz case; its envelope-following component ( Fig. 6 E, thick line; 21 nm peak-to-peak) exceeds the fine structure ( Fig. 6 E, thin line; max 16 nm peak-to-peak). As illustrated in Fig. 4 D,

[Fig. 6, caption (partial): ( C ) Zoom-in of the response, highlighting the relative timing between the minimum of the peak-to-peak amplitude (red line connecting the envelopes) and the reversal of the rectified component (green arrow). The black scale bar indicates the lag between these two. ( D-F ) Same as A-C, but now for a tone pair centered at CF (13 kHz). Positive displacements in panels B,C,E,F are away from the measurement beam, with approximately equal components along the 3 anatomical axes, being directed toward the apex, modiolus and scala vestibuli. Panels A-C show data from Vavakou et al. (2019) . Data in panels D-F are from the same recording location in the same cochlea.]
The lower panels of Fig. 6 zoom in on the relative timing of the envelope-following component (solid lines in Fig. 6 C,F) and the peak-to-peak amplitude of the fine structure, which is the distance between the lower and upper envelopes (dashed lines in Fig. 6 C,F). The reversal of the envelope-following component (green arrow in Fig. 6 C,F) is seen to lag the minimum of the peak-to-peak amplitude (vertical red line in Fig. 6 C,F) by 60 μs and 72 μs, respectively, which is close to one CF cycle. Close inspection of Fig. 4 D shows that this lag stems from the low-pass filtering of the rectified waveform.
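That low-pass filtering of a rectified beat delays its turning point can be checked with a minimal simulation. The sketch below assumes a half-wave rectifier followed by a 1st-order low-pass filter with the time constant of the schematic in Fig. 4 (τ = 1.5 tone cycle); it is not a model of the recordings in Fig. 6, and its output is not expected to reproduce the measured 60-μs and 72-μs lags. The function name and the sampling rate are choices made here.

```python
import math

def turning_point_lag(f_center=13e3, f_beat=800.0, tau_cycles=1.5, fs=5.2e6):
    """Time (s) by which the rectified, low-pass filtered beat keeps falling
    after the envelope null of the two-tone stimulus."""
    f1, f2 = f_center - f_beat / 2, f_center + f_beat / 2
    tau = tau_cycles / f_center                 # filter time constant (s)
    alpha = 1.0 / (1.0 + tau * fs)              # per-sample smoothing coefficient
    half_beat = int(round(0.5 * fs / f_beat))   # samples in half a beat period
    i_null = 7 * half_beat                      # envelope null at t = 3.5 beat periods
    y, filtered = 0.0, []
    for i in range(i_null + half_beat):
        t = i / fs
        x = math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t)
        y += alpha * (max(x, 0.0) - y)          # rectify, then 1st-order low-pass
        filtered.append(y)
    segment = filtered[i_null:]                 # from the null to the next envelope peak
    i_min = segment.index(min(segment))         # sample where the output turns around
    return i_min / fs

print(f"lag after envelope null: {1e6 * turning_point_lag():.0f} microseconds")
```

Because the rectified input is essentially zero at the envelope null while the filter output is still positive, the output necessarily continues to fall for some time after the null; the size of that lag depends on the assumed time constant.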
Intriguingly, these lags are comparable to the ∼50-μs "reaction time" of cochlear compression reported by Cooper and Van der Heijden (2016) . That study in the basal turn of the gerbil cochlea analyzed BM responses to fluctuating stimuli and found that cochlear compression was not instantaneous. For slowly varying magnitudes, the time-varying cochlear gain faithfully followed the stimulus envelope, but for more rapid ( > 200 Hz) fluctuations the gain started to fall behind the envelope and became smoothed and flattened. This caused an increasing amount of hysteresis in the dynamical input/output functions. The authors analyzed their results in terms of an automatic gain control scheme. The similar values of the delays observed in both studies suggest that the rectified (DC) component in the OHC area ( Fig. 6 ) may actually reflect the regulatory input or "control signal" to the gain mechanism that affects BM vibration. Obviously, more experimental work is needed to explore the link between compression and rectification by OHCs.

In our recordings in the basal turn of the gerbil we also found polarity reversals of the rectified displacement component for frequencies well below CF ( Fig. 7 ), similar to the polarity reversals observed in Dallos' (1985) intracellular OHC recordings (reproduced here in Fig. 3 A) and in the cochlear mechanical recordings of the tectorial membrane in the apex by Rhode and Cooper (1996) . Although the polarity reversals themselves are as difficult to interpret as they were in the 1980s and 1990s, they represent yet another agreement between apical and basal cochlear mechanics, and appear to confirm Dallos' (1986) generalization to basal OHCs of his observations in the apex.

Discussion
We have reviewed old and new findings concerning
• the inherent sluggishness of electromotility
• the minute amplitude of high-frequency AC receptor potentials
• the inability of motility to improve high-frequency sensitivity
• significant rectification by apical and basal OHCs.
They call for the exploration of mechanisms of OHC function other than cycle-by-cycle feedback.
Given the crucial role of OHCs in cochlear compression, we begin by considering what it takes to deal with a large dynamic range. The low end of the large dynamic range of hearing requires a sensitive system, while the high end requires the ability of the same system to cope with high intensities without getting saturated or damaged. Of these two requirements, sensitivity has received most attention. The common view appears to be that sensitivity is something that needs explanation in terms of a "special feature" of the system. In comparison, the ability of the cochlea to operate at high (noise) levels without being deafened, temporarily or permanently, often appears to be taken for granted. It is important to test whether these presumptions are justified. The current level of knowledge of the cochlea allows a quantitative assessment of this question.

Soft sounds and damping in the cochlea
Concerning sensitivity, it is often stated that the fluid in the cochlea is too viscous to mediate efficient high-frequency transport, thus necessitating power injection from an internal source. A popular metaphor invoked to demonstrate the alleged need for power injection is a spring-mass system submerged in water; the viscous drag would damp its resonance. A more specific and quantitative analysis is needed here.
First of all, resistance by itself rarely defines the mechanical behavior of a system. When a ping-pong ball and a tennis ball are thrown at the same speed, the tennis ball will travel farther even though it meets with more resistance due to its larger size. In this case the slowdown is determined by the ratio of resistance to mass. An example from electrical engineering illustrates an important general principle. Given a power plant that supplies a town over an electrical power line with a total resistance of 1 Ω, what is the maximum achievable transport efficiency? The answer is that there is no theoretical limit: transport losses can be made arbitrarily small by choosing increasingly higher voltages, thereby reducing the current and the resistive losses. This is why long-distance power lines are operated at very high ( > 100 kV) voltages.
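The power-line argument can be made concrete with a few lines of arithmetic. In the sketch below the transmitted power of 10 MW is a hypothetical figure chosen for illustration; only the 1-ohm line resistance comes from the example above. For a fixed transmitted power P, the current is I = P/V, so the fraction lost in the line is I²R/P = PR/V².

```python
def line_loss_fraction(power_w, voltage_v, resistance_ohm=1.0):
    """Fraction of the transmitted power dissipated in the line (I^2 R / P)."""
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm / power_w

POWER_W = 10e6   # hypothetical power demand of the town (W); not from the text

for kilovolt in (10, 100, 400):
    loss = line_loss_fraction(POWER_W, kilovolt * 1e3)
    print(f"{kilovolt:4d} kV: {100 * loss:.4f} % of the power is lost in the line")
```

Each tenfold increase in voltage reduces the loss a hundredfold, with no lower bound other than practical constraints on the voltage.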
The mechanical equivalent of this strategy to achieve efficient transport in the face of friction, is the use of high pressure and small velocity, i.e., a stiff system. This appears to apply to the cochlear base where the BM is stiff and displacements are minute, resulting in BM velocities << 1 mm/s near hearing threshold ( Robles and Ruggero, 2001 ).
The theoretical possibility to minimize dissipative losses, however, does not mean that the actual system can achieve low-loss performance. Any system has its design constraints. In the example of the electrical power line, safety requirements (e.g. arc flash hazard) limit the use of arbitrarily high voltages. Likewise we can expect there to be cochlear "design constraints" that prevent the use of extremely stiff structures for the processing of high frequencies. A plausible estimate, however, can be made of the minimum attainable frictional losses in the cochlea.
The transport of acoustic energy from stapes to IHC is provided by the traveling wave ( Fig. 8 ). The local speed of the transport is the group velocity, which is not constant, but diminishes drastically between the stapes and the peak location ( Fig. 8 C). As illustrated in Fig. 8 , the transition from the initial, fast part of the wave to the slower part is quite abrupt for soft tones. For low-intensity tones it involves a drop in group velocity by a factor 10 or more ( Van der Heijden and Versteegh, 2015 ;Cooper et al., 2018 ).
In order to appreciate the functional implications of this abrupt deceleration, note that damping is a measure of energy loss per time unit. For a propagating wave this means that significant losses can occur only in regions where the wave spends enough time, i.e., where propagation is slow ( Lighthill 1978 ). In cochlear waves this occurs only over a short ( ∼0.5-mm) stretch basal to their peak ( Ren, 2002 ;Van der Heijden and Cooper 2018 ). Although this region (gray bands in Fig. 8 ) is only a small portion of the travel distance from stapes to the peak, it represents the vast majority of the travel time. Therefore, the analysis of dissipative losses can be restricted to this slow-wave stretch.
The slow-wave stretch coincides with the so-called short-wave portion ( Fig. 8 D) for which the wavelength is smaller than 2π times the diameter of the scalae ( Van der Heijden and Versteegh, 2015 ). The main characteristic of short waves (also referred to as "waves on deep water") is the negligible fluid motion near the rigid boundaries, leaving the internal friction in the fluid as the only source of attenuation. Damping of short waves on water was analyzed by Stokes in the 1840s, who found the proportional loss of wave energy per cycle to be

$$L = \frac{16 \pi^2 \nu}{\lambda^2 f} \qquad \text{(eq. 1)}$$

where ν is the kinematic fluid viscosity, λ the wavelength and f the frequency ( Lighthill, 1978 ). Stokes' formula predicts that sea waves with λ = 30 m (which count as "short waves" when the water is deeper than λ/2π = 4.8 m) can travel more than halfway around the globe before accumulating a threefold attenuation of their amplitude. The reason for the very low susceptibility to damping of short waves is the low degree of shearing in the fluid motion pattern they evoke.

In order to apply Stokes' formula for the internal friction of the fluid ( eq. (1) ) to the energy transport by cochlear waves, we must follow the pace of the transport by counting the number of cycles that fit in the group delay from stapes to the BM location. For low-intensity tones in the base of the cochlea, this amounts to ∼7 cycles. In the 16-kHz region of the gerbil, we have λ ≅ 300 μm for 16-kHz tones, and eq. (1) gives a proportional energy loss per cycle L = 7.2%. The efficiency of the 7-cycle energy transport from stapes to best place is thus $(1 - 0.072)^7 \approx 0.6$, which amounts to an attenuation of 2.3 dB. We conclude that low-loss transport of high-frequency energy by traveling waves is well possible, as long as the slow portion of the wave is in the short-wave regime, i.e., as long as the vibrating structures are surrounded by a sufficiently large fluid chamber.
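The numbers in this estimate can be retraced as follows. The wavelength, frequency and number of cycles are taken from the text; the value of the kinematic viscosity is an assumption made here (0.66 × 10⁻⁶ m²/s, close to that of water near body temperature), chosen because it reproduces the quoted 7.2% loss per cycle. The text does not state which value was used.

```python
import math

NU = 0.66e-6          # kinematic viscosity (m^2/s); ASSUMED, not given in the text
WAVELENGTH = 300e-6   # wavelength near the 16-kHz place (m), from the text
FREQ = 16e3           # tone frequency (Hz), from the text
N_CYCLES = 7          # cycles within the group delay, from the text

def stokes_loss_per_cycle(nu, wavelength, freq):
    """Proportional energy loss per cycle of a short wave, after eq. (1)."""
    return 16 * math.pi ** 2 * nu / (wavelength ** 2 * freq)

loss = stokes_loss_per_cycle(NU, WAVELENGTH, FREQ)
efficiency = (1 - loss) ** N_CYCLES            # energy fraction left after 7 cycles
attenuation_db = -10 * math.log10(efficiency)

print(f"loss per cycle : {100 * loss:.1f} %")
print(f"efficiency     : {efficiency:.2f}")
print(f"attenuation    : {attenuation_db:.1f} dB")
```

Because the loss scales with 1/λ², the estimate is sensitive to the wavelength: halving λ at the same frequency would quadruple the loss per cycle.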

Boundary layers?
In hydrodynamic models of the cochlea, damping is either introduced by a free parameter (e.g. Siebert, 1974 ;Steele and Taber 1979a ) or, as we did in the previous section, explicitly attributed to fluid viscosity (e.g., Steele and Taber 1979b ;Wang et al., 2016 ). In the latter models, however, the main locus of viscous losses is not the internal friction of the fluid, but a Stokes boundary layer, a shearing motion pattern that arises when fluid moves periodically along a rigid surface with a no-slip condition ( Fig. 9 ). Boundary layers have been proposed to occur near the BM ( Steele and Taber, 1979b ) and between the reticular lamina and the opposing surface of the tectorial membrane ( Allen, 1980 ). It is assumed without justification that the material properties and motion of these structures are adequately modeled by the vibration of fluid-immersed, rigid objects along their surface, i.e., the conditions for creating a boundary layer. In a cochlear context, "rigid" means: unable to move by a few nanometers in the longitudinal direction. These are strong assumptions and they are not self-evident. Boundary layer losses can be reduced with simple means. Lubrication, a method abundantly used in biological systems, can minimize damping by simply replacing water by a less viscous substance in the thin ( < 5 μm above 10 kHz) layer where shear motion occurs. 1 Furthermore, neither the BM nor the tectorial membrane is a rigid structure like the beams featuring in theoretical models. Their intricate composition with layers of oriented fibers permits elastic deformations that can soften the transition between their own motion and that of the adjacent fluid, thereby reducing the amount of shear and friction. In fact, the very function of the fiber networks may well be to minimize shear.
Reduction of viscous drag is of major importance in industrial transport and processing of fluids, and the extensive engineering literature on this topic (e.g., Bushnell and Hefner, 1990 ) describes many methods to reduce viscous drag, including lubrication, surfactants, microbubble injection, suspension of polymers and the use of compliant walls.
Very little is known about the (passive) viscoelastic properties of the intact organ of Corti on a nanometer scale. Modeling this delicate organ as a system of standard elastic plates and rigid rods immersed in water might provide a crude approximation of their potential behavior, but cannot be claimed to be physiologically realistic or accurate. These biological structures serve a highly specialized function and must be expected to have adapted to their task. If one nevertheless insists on modeling the BM as an elastic plate with a rigid surface, the proportional loss per cycle from the boundary layer is

$$L = \frac{\pi^2 \delta}{\lambda}$$

where δ is the boundary layer thickness (see Appendix). For the 16-kHz wave considered in the previous section, δ = 3.6 μm, and this would give a 3.8-dB boundary loss on top of the 2.3 dB from internal friction, yielding a total loss of 6.1 dB. This is for two-dimensional (2D) waves; for 3D waves the boundary losses will be smaller because the area-to-volume ratio is smaller (i.e., less boundary area per fluid mass).
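The boundary-layer figure can be retraced from the Appendix result L = π²δ/λ (eq. A5), using only values quoted in the text. The sketch assumes that the loss is compounded over the same 7 cycles as in the internal-friction estimate, which is how the 3.8 dB appears to have been obtained.

```python
import math

DELTA = 3.6e-6        # Stokes boundary layer thickness (m), from the text
WAVELENGTH = 300e-6   # wavelength near the 16-kHz place (m), from the text
N_CYCLES = 7          # cycles within the group delay, from the text
INTERNAL_DB = 2.3     # loss from internal friction (dB), from the text

def boundary_loss_per_cycle(delta, wavelength):
    """Proportional energy loss per cycle in the boundary layer (eq. A5)."""
    return math.pi ** 2 * delta / wavelength

loss = boundary_loss_per_cycle(DELTA, WAVELENGTH)
boundary_db = -10 * math.log10((1 - loss) ** N_CYCLES)
total_db = INTERNAL_DB + boundary_db

print(f"boundary loss per cycle: {100 * loss:.1f} %")
print(f"boundary loss          : {boundary_db:.1f} dB")
print(f"total loss             : {total_db:.1f} dB")
```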
The second aspect in which our analysis of energy transport differs from most cochlear models is the abruptness of the transition between fast and slow propagation of low-intensity waves ( Fig. 8 ). This abruptness, a key ingredient of our analysis, is an experimental fact ( Kim et al., 1980a ;Ren et al., 2011b ;Van der Heijden and Versteegh, 2015 ;Van der Heijden and Cooper, 2018 ), but it is not reproduced by prevailing hydrodynamic models. Even in Steele and Taber (1979b) who analyzed strongly dispersive 3D waves, the transition is smooth and fails to reproduce the sharp kink observed in measured phase curves (e.g. fig. 3 of Kim et al., 1980a ). In our opinion, this intriguing aspect of cochlear waves deserves more attention from experimenters as well as modelers. An elementary waveguide model ( Van der Heijden, 2014 ) offers a potential mechanism of the sharp deceleration, but its relevance to cochlear mechanics is presently unclear.

High intensities and the harnessing of damping
We now turn to the other end of the dynamic range: the processing of high-intensity sounds. For this task the cochlea urgently needs lots of damping. To demonstrate this, we evaluate what would happen if all the energy of a 90-dB-SPL tone made it to its tonotopic place. This is a straightforward calculation based on data on BM stiffness and group velocity of the traveling wave (for details, see Van der Heijden and Versteegh, 2015 , where an independent computation based on fluid inertia is shown to match it to within 1.7 dB). Traveling waves satisfy a proportionality relation $A^2 = \gamma P$ between energy flux P and the square of the displacement amplitude A. The coefficient γ is a metric of the compliance. For the 16-kHz region of the gerbil cochlea γ = 4.8 × 10⁻⁵ m·s/N. A 90-dB-SPL tone at 16 kHz provides an acoustic power input into the gerbil cochlea of ∼6.7 μW. If all of this acoustic power traveled to the 16-kHz place, the wave amplitude would be 564 nm, an extremely destructive vibration amplitude in the basal turn. In reality the displacement is ∼11 nm ( Ren et al., 2011a ). Thus in the actual gerbil cochlea only $(11/564)^2$ or 0.04% of the acoustic power of the 90-dB-SPL tone reaches its transduction site. The remaining 99.96% of the acoustic energy entering the cochlea must be dissipated in the ∼0.5-mm region just basal to the 16-kHz place where propagation has slowed down. Such very high rates of dissipation suggest that the system is specifically equipped to absorb mechanical energy when needed.
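The power budget of the 90-dB-SPL tone follows from the quoted amplitudes alone, because power scales with the square of the displacement amplitude. The sketch below uses only the numbers given in the text (564 nm, 11 nm and 6.7 μW); the variable names are chosen here.

```python
import math

A_LOSSLESS_NM = 564.0   # amplitude if all power reached the 16-kHz place (nm), from the text
A_MEASURED_NM = 11.0    # measured amplitude at 90 dB SPL (nm), from the text
P_INPUT_W = 6.7e-6      # acoustic power entering the cochlea (W), from the text

fraction = (A_MEASURED_NM / A_LOSSLESS_NM) ** 2   # power ~ amplitude squared
attenuation_db = -10 * math.log10(fraction)
power_delivered_w = fraction * P_INPUT_W

print(f"fraction reaching the transduction site: {100 * fraction:.3f} %")
print(f"equivalent attenuation                 : {attenuation_db:.1f} dB")
print(f"power reaching the transduction site   : {power_delivered_w:.2e} W")
```

Expressed in decibels, the dissipation at 90 dB SPL corresponds to a loss of roughly 34 dB, to be compared with the few decibels estimated above for low-intensity tones.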
The 0.04% of the acoustic power that does reach the transduction site amounts to 2.5 × 10⁻⁹ W. This number is directly derived from experimental data of BM displacement, group velocity and BM stiffness, and does not depend on any theoretical assumptions other than wave propagation. It is important to realize that the cochlear response remains strongly compressive well beyond 90 dB SPL. In order to explain this high-SPL compression in terms of the customary "saturating amplifier" one would have to postulate that (1) most of the 2.5 × 10⁻⁹ W is generated by OHCs (otherwise the amplifier would already be exhausted); (2) prior to being amplified, even less than 0.04% of the acoustic power reaches the onset of amplification, that is, more than 99.96% is lost on the way. The first postulate is hard to reconcile with estimates of maximum power output of OHCs, which are in the order of 10⁻¹⁴ W ( Wang et al., 2016 ), 4 to 5 orders of magnitude smaller than the experimentally determined power of the wave. The second postulate cannot be rejected on the basis of first principles, but we find it difficult to see the functional benefits of a scheme that first discards more than 99.96% of the acoustic power and then mechanically amplifies the tiny residual in order to realize the desired magnitude. To paraphrase Ray Meddis, it seems odd that the system is amplifying the signal in order to attenuate it.
At lower intensities the net loss is less, and one is naturally led to consider a feedback scenario in which the amount of local damping is continually regulated to suit the intensity of the spectral components processed in the region in question. In engineering terms, this is a multi-band automatic gain control operating in the negative-gain range. OHCs are well equipped to play a central role in it. They sense the local excitation with their hair bundles. By rectifying and low-pass filtering their receptor currents they provide a temporally integrated metric of local excitation. By changing their length or stiffness accordingly, they can regulate the local impedance of the adjacent structures. In other words, OHCs have what it takes to cope with a large dynamic range. It is unlikely, however, that the OHCs themselves are the main absorbers, because when OHCs are damaged or absent, the affected portion of the cochlea is known to fall into the high-SPL, high-resistance state 2 ( Ryan and Dallos, 1975 ). So it seems more likely that OHCs regulate the amount of dissipation of adjacent structures.
The earliest instance of a model based on regulated damping ( Allen, 1980 ) predated the discovery of OHC motility and sought to explain nonlinear damping in terms of variable OHC bundle stiffness. The concept of nonlinear damping, however, is more fundamental than specific hypotheses concerning its anatomical correlate. Thus from a wider perspective, the proposal by Van der Heijden and Versteegh (2015) that OHC length changes regulate local damping, perhaps by changing the impedance of the Deiters' cell layer, falls in the same category as Allen's (1980) model.

2 This is reminiscent of the dead man's brake in trains.
An attractive feature of regulated-damping schemes, also discussed in Versteegh and Van der Heijden (2013) , is the clever use of slowly propagating waves to realize strong compression with minimal means: friction causes exponential decay, so a slight regulation of the damping coefficient (the exponent of the decay) suffices to accumulate a considerable loss over the region of slow propagation. Instead of counteracting friction, electromotility carefully regulates it. This interpretation puts both the traveling wave and its unusual dispersion properties into a clear functional framework.
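The leverage that exponential decay gives to a regulated damping coefficient can be illustrated with a toy calculation. Everything in the sketch below is hypothetical: the decay exponents are arbitrary illustrative values, and the number of cycles spent in the slow-wave region is simply held at the low-intensity estimate of 7, although it may well vary with intensity.

```python
import math

def transmitted_fraction(decay_per_cycle, n_cycles):
    """Fraction of wave energy left after n_cycles of exponential decay."""
    return math.exp(-decay_per_cycle * n_cycles)

N_CYCLES = 7   # cycles in the slow-wave region; ASSUMED constant for this illustration

for k in (0.1, 0.5, 1.0):   # hypothetical energy decay exponents per cycle
    frac = transmitted_fraction(k, N_CYCLES)
    print(f"decay exponent {k:.1f}/cycle: transmitted {frac:.2e} "
          f"({-10 * math.log10(frac):.1f} dB loss)")
```

The loss in decibels is proportional to the decay exponent, so the transmitted power depends exponentially on it: doubling the exponent squares the transmitted fraction.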

Acknowledgments
Supported by EU H2020-MSCA-ITN-2016 [LISTEN - 722098]. This paper is dedicated to Nigel Cooper, in friendship and wonder.

Appendix: viscous loss in the Stokes boundary layer of 2D waves
Here we derive the viscous damping of 2D short waves caused by a Stokes boundary layer at its surface, i.e., arising from a no-slip surface that is completely rigid in the longitudinal direction. This is a common assumption in cochlear models, even though there is no compelling reason to assume that it holds for the BM. Note that the ability of the BM to mediate waves does not depend on this assumption; the only necessary condition for wave propagation is elasticity in the transverse direction.
The boundary layer is assumed to occur at the side of the surface that faces the freely moving fluid, to be identified with the scala tympani side of the BM. The derivation is based on the analysis of fluid waves in chapters 2 and 3 of Lighthill (1978) . Denoting the amplitude of the transversal velocity of the surface by U, the time-averaged kinetic energy T per unit area is

$$T = \frac{\rho \lambda U^2}{4\pi} \qquad \text{(eq. A1)}$$

with ρ the specific mass of the fluid and λ the wavelength, and where it is taken into account that the mass of the wave is twice that of ordinary water waves, which have air on one side. The total (kinetic + potential) energy per unit area is 2T. For short waves, the trajectory of the fluid particles is circular, so the amplitude of the longitudinal component of the fluid motion just outside the boundary layer (i.e., the sliding motion that evokes the boundary layer) is also U. The amplitude F of the force per unit surface area exerted by the fluid on the surface (and vice versa) is

$$F = \rho U \sqrt{2\pi f \nu} \qquad \text{(eq. A2)}$$

with f the frequency and ν the kinematic viscosity of the fluid. This force leads the fluid velocity by π/4 radians (equal inertial and resistive contributions). The amplitude $F_R$ of the resistive component is therefore

$$F_R = F \cos(\pi/4) = \rho U \sqrt{\pi f \nu} \qquad \text{(eq. A3)}$$

The rate of loss R is the time-averaged product of resistive force $F_R$ and velocity U:

$$R = \tfrac{1}{2} F_R U = \tfrac{1}{2} \rho U^2 \sqrt{\pi f \nu} \qquad \text{(eq. A4)}$$

The proportional loss per cycle, L, becomes

$$L = \frac{R}{2Tf} = \frac{\pi^2}{\lambda} \sqrt{\frac{\nu}{\pi f}} = \frac{\pi^2 \delta}{\lambda} \qquad \text{(eq. A5)}$$

with $\delta = \sqrt{\nu/(\pi f)}$ the thickness of the Stokes boundary layer.
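The chain of steps from eq. A1 to eq. A5 can be checked numerically. In the sketch below, the resistive force is taken as F cos(π/4) and the loss rate as ½ F_R U, which is how eqs. A3 and A4 are read here from the surrounding text. The fluid density, the velocity amplitude U (which cancels out) and the kinematic viscosity are assumed values, not taken from the text; the viscosity is the one that reproduces the boundary layer thickness of 3.6 μm quoted in the Discussion.

```python
import math

RHO = 1000.0          # fluid density (kg/m^3); assumed, cancels in the result
NU = 0.66e-6          # kinematic viscosity (m^2/s); ASSUMED
FREQ = 16e3           # frequency (Hz), from the Discussion
WAVELENGTH = 300e-6   # wavelength (m), from the Discussion
U = 1e-3              # transversal velocity amplitude (m/s); arbitrary, cancels

T = RHO * WAVELENGTH * U ** 2 / (4 * math.pi)        # eq. A1: kinetic energy per area
F = RHO * U * math.sqrt(2 * math.pi * FREQ * NU)     # eq. A2: force per area
F_R = F * math.cos(math.pi / 4)                      # eq. A3: resistive component
R = 0.5 * F_R * U                                    # eq. A4: time-averaged loss rate
L_direct = R / (2 * T * FREQ)                        # eq. A5, first form

delta = math.sqrt(NU / (math.pi * FREQ))             # Stokes boundary layer thickness
L_closed = math.pi ** 2 * delta / WAVELENGTH         # eq. A5, closed form

print(f"delta    = {1e6 * delta:.2f} micrometers")
print(f"L direct = {L_direct:.4f}")
print(f"L closed = {L_closed:.4f}")
```

The two forms of L agree to within rounding error, and neither depends on the density or on the velocity amplitude.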