Amplitude Modulation May Be Confused with Infrasound

Environmental infrasound is usually accompanied by low-frequency (LF) sounds. Considering that inner hair cell transduction equals half-wave rectification, activity of low-frequency auditory nerve fibres may be indistinguishable whether elicited by LF sound that is amplitude-modulated at an infrasonic rate, or LF sound that is superimposed onto infrasound that “biases” the basilar membrane position. We tested whether listeners are able to distinguish a 63-Hz carrier tone, amplitude modulated at 8 Hz, from a 63-Hz pure tone that was perceptually loudness-modulated by an 8-Hz biasing tone. Using a maximum-likelihood procedure, 12 participants first adjusted the intensity of the 8-Hz tone so that the perceived modulation of the pure tone matched a reference amplitude-modulated tone. Both stimulus types were then presented in random order, and participants had to identify the presentations that contained the infrasound tone. About half the participants performed close to chance. The best had 81% correct. Experiments with a 125-Hz carrier tone gave similar results. Although performance may improve in a 2-interval discrimination task, this would not be representative of real listening conditions. Results suggest that slowly amplitude-modulated LF sounds may underlie complaints about environmental infrasound, where measured infrasound levels are well below sensation threshold. © 2018 The Author(s). Published by S. Hirzel Verlag · EAA. This is an open access article under the terms of the Creative Commons Attribution (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).


Introduction
Noise spectra assessed in response to complaints about environmental infrasound most often reveal that the sound pressures of spectral components in the infrasound range (< 20 Hz) are well below sensation threshold and therefore should be inaudible to human listeners (e.g. [1]). Commonly, such a finding closes the complaint case. The measured spectra, however, often cross the auditory sensation threshold somewhere between 20 Hz and 100 Hz. These supra-threshold low-frequency noise components might have pronounced envelope fluctuations with spectral content well below 20 Hz, and might therefore easily be mistaken for supra-threshold infrasound.
Using simple sinusoidal stimuli, we tested under laboratory conditions whether listeners can actually distinguish between stimulus mixes that contain true infrasound and stimuli that are amplitude-modulated (AM) at an infrasonic rate, i.e. do not actually contain frequency components below 20 Hz. This might well not be the case if one considers the similarity in auditory nerve (AN) output that these two stimuli produce. Figure 1 illustrates the stimulus types under consideration. Because the inner hair cell releases neurotransmitter only during basilar membrane (BM) movement towards scala vestibuli, signals become effectively half-wave rectified. In order to illustrate this operation, the lower parts of the signals that are not coded by the AN are greyed out in the figure. Comparison of the remaining upper parts shows schematically that the spiking probabilities of the AN fibres in response to those two stimuli are almost identical.
Of course, this scenario holds only if the stimulus components are not spectrally resolved by the cochlea. This might easily be the case at its very apical end because the lowest auditory filters have relatively wide spectral tuning and, more importantly, the lowest stimulus frequency that has a characteristic place on the human BM is assumed to be as high as 50 Hz (see reviewing remarks in [2, pages 51-52]), if not even 80 Hz [3]. In other words, all frequency components lower than this share the very apical end of the BM as their characteristic place (the place of maximum vibration amplitude). The condition illustrated in Figure 1 is therefore not unlikely, and our test results show that listeners indeed have great difficulty in distinguishing these two types of stimuli with carrier tones of 63 Hz and even 125 Hz.
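The half-wave rectification argument can be checked numerically with a minimal sketch. It assumes the idealised linear superposition of Figure 1, unit carrier amplitude, and a biasing amplitude equal to the AM depth; the function names, the sampling rate and the per-cycle peak measure are illustrative choices and are not taken from the study.

```python
import math

def half_wave(x):
    """Idealised inner-hair-cell transduction: keep only positive deflections."""
    return max(x, 0.0)

def per_cycle_peaks(signal, fs, fc):
    """Largest half-wave-rectified sample within each carrier cycle."""
    peaks = {}
    for i, x in enumerate(signal):
        cycle = int(i * fc / fs)  # index of the carrier cycle this sample belongs to
        peaks[cycle] = max(peaks.get(cycle, 0.0), half_wave(x))
    return [peaks[c] for c in sorted(peaks)]

def rectified_envelopes(fc=63.0, fm=8.0, depth=0.25, fs=8000, duration=1.0):
    """Return per-cycle rectified peaks of an AM tone and of a biased tone."""
    n = int(fs * duration)
    t = [i / fs for i in range(n)]
    # AM tone: the carrier amplitude itself fluctuates at the infrasonic rate.
    am = [(1 + depth * math.sin(2 * math.pi * fm * ti)) * math.sin(2 * math.pi * fc * ti)
          for ti in t]
    # Biased tone: constant-amplitude carrier riding on an added 8-Hz tone
    # (assumption: biasing amplitude = depth x carrier amplitude).
    biased = [math.sin(2 * math.pi * fc * ti) + depth * math.sin(2 * math.pi * fm * ti)
              for ti in t]
    return per_cycle_peaks(am, fs, fc), per_cycle_peaks(biased, fs, fc)

if __name__ == "__main__":
    am_peaks, biased_peaks = rectified_envelopes()
    worst = max(abs(a - b) for a, b in zip(am_peaks, biased_peaks))
    print(f"largest difference between rectified peak sequences: {worst:.5f}")
```

Under these assumptions the upper (rectified) peaks of the two signals follow nearly the same 8-Hz pattern, although the unrectified waveforms differ in their lower halves.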

Instrumentation
The infrasound source used in this study is identical to that used by Kuehler et al. [4]: From a hermetically enclosed electrodynamic loudspeaker, an 8-Hz biasing tone was transmitted via an 8-m long polyethylene tube to the listener's ear. (It was originally developed to be used in MEG and fMRI experiments.) For the last 40 cm, the 14 mm inner diameter of this tube was reduced via a coupling into a more flexible tube with 2.5 mm inner diameter that took at its end an audiometric ear tip (ER3-14A, Etymotic Research) that was hermetically fitted in the listener's ear canal. Before, in-between and after all psychometric measurements, the fitting was tested by measuring the SPL in situ with a miniature microphone (Knowles FG-23453) that was coupled to the ear canal via a small plastic tube (20 mm length, 1 mm inner diameter) that penetrated the foam of the ear tip. It was calibrated in a 1.3-cm³ cavity using a Brüel & Kjaer 4153 microphone. A second such plastic tube penetrated the ear tip foam to deliver the other sounds into the ear canal. These were produced by a small insert earphone (ER4B, Etymotic Research) that was directly driven by the line output of the audio device (RME Fireface UC). Using separate sound sources ensured that AM due to hardware non-linearities was < 1%. In contrast, the line output of the 8-Hz biasing tone was low-pass filtered (6 dB/octave, passive, $f_c$ = 10 Hz) before power amplification (BEAK Type BAA120).
This filter, together with the acoustic low-pass filtering effect of the 8-m polyethylene tube and appropriate electric attenuation, made sure that at maximum electric output of the audio device, the sound pressure in the ear canal could not exceed 105 phon [5,6].

Subjects
Ten female and four male normal-hearing subjects (18 to 49 years), without self-reported hearing abnormality, were recruited. Their auditory threshold for 8 Hz, 63 Hz and 125 Hz was tested using the standard audiometric procedure (but with 3-dB instead of 5-dB steps). None of the subjects had thresholds above 10 dB HL [5,6]. However, one male and one female subject were unable to perform the psychometric procedure so that only 12 subjects completed the study.

Stimulus conditions
A total of eight reference AM stimuli were used, four with a carrier frequency of 63 Hz and four with 125 Hz. We predicted that infrasound detection might be easier at larger sound pressure, and therefore modified two further parameters of the AM tone for which we expected the required biasing tone level ($L_{\mathrm{RBT}}$) to vary in order to produce a matching perceived loudness modulation of the biased pure tone: the modulation depth of the AM tone (25% and 37.5%) and its sound pressure. All modulations were clearly audible. Since we expected a suppression in loudness due to biasing [7], the AM tones and pure tones to be biased were set to equal peak level. Levels corresponded to either 40 phon or 50 phon of the pure tone (ISO 226). The phases of the amplitude modulation and of the 8-Hz biasing tone were independent and random in each trial. The duration of all stimuli was 1200 ms, including 250 ms cosine ramps at onset and offset.
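The stimulus timing described above can be sketched as follows. This is only an illustration under stated assumptions: the sampling rate, the function names and the normalisation of the AM tone to a peak of 1 (standing in for "equal peak level" relative to a unit-amplitude pure tone) are hypothetical and are not taken from the study.

```python
import math
import random

def cosine_ramp_gate(t, duration=1.2, ramp=0.25):
    """Gate value in [0, 1] at time t (s), with raised-cosine onset and offset ramps."""
    if t < 0.0 or t > duration:
        return 0.0
    if t < ramp:
        return 0.5 * (1.0 - math.cos(math.pi * t / ramp))
    if t > duration - ramp:
        return 0.5 * (1.0 - math.cos(math.pi * (duration - t) / ramp))
    return 1.0

def am_tone(fc, depth, fs=8000, fm=8.0, duration=1.2, ramp=0.25, rng=random):
    """AM reference tone with a random modulator phase, peak-normalised to 1.

    Assumption: dividing by (1 + depth) keeps the peak equal to that of a
    unit-amplitude pure tone, one possible reading of 'equal peak level'.
    """
    phase = rng.uniform(0.0, 2.0 * math.pi)  # drawn anew for each trial
    samples = []
    for i in range(int(round(duration * fs))):
        t = i / fs
        envelope = (1.0 + depth * math.sin(2.0 * math.pi * fm * t + phase)) / (1.0 + depth)
        samples.append(cosine_ramp_gate(t, duration, ramp) * envelope
                       * math.sin(2.0 * math.pi * fc * t))
    return samples

if __name__ == "__main__":
    tone = am_tone(63.0, 0.375)
    print(len(tone), "samples, peak", round(max(abs(x) for x in tone), 3))
```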

Procedures
The measurement of each subject, including breaks, took up to three hours. It was done in the ear of their preference. Before the discrimination task could commence, $L_{\mathrm{RBT}}$ had to be determined for the eight AM reference stimuli. Initially, subjects made themselves familiar with the infrasound biasing tone and its effect on the pure tone in a simple $L_{\mathrm{RBT}}$ adjustment procedure (buttons "up" and "down"), starting from low levels. For all eight conditions, they were asked to set the biasing tone level so that the pure tone in the second interval was perceived with equal modulation as the leading reference AM tone. These settings served as individual starting levels for the subsequent, more accurate maximum-likelihood-tracking (MLT) procedure. Here, both kinds of stimuli were presented in random order in two intervals. In a 2AFC task, the subjects were asked: "Which of the two intervals was stronger modulated?" The tracks of the eight conditions, each ending after sixteen trials, were interleaved in random order. For each condition, the average $L_{\mathrm{RBT}}$ of three repeats defined the individual's $L_{\mathrm{RBT}}$, unless a value deviated by more than 4 dB from any other, in which case it was excluded from the average. The $L_{\mathrm{RBT}}$ were then individually set for the final discrimination task, which was a 1-interval, yes-no procedure with the question: "Did this stimulus contain infrasound?" For each of the eight conditions, a total of 50 trials were presented, where half consisted of the AM tone and half of the corresponding biased tone (containing the infrasound at $L_{\mathrm{RBT}}$). This gave a total of 400 trials, which were presented in random order. The variation in loudness and modulation depth across the eight stimulus conditions made the task less monotonous for the subjects. Before the actual test, the subjects had a training session with feedback, using a shorter series of 48 trials. No feedback was given during the formal test.
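The trial structure of the final 1-interval, yes-no task can be sketched as below. The data layout, the function names and the scoring helper are hypothetical illustrations of the described design (8 conditions, 50 trials each, half AM and half biased, random order); they are not the software used in the study.

```python
import random

CARRIERS_HZ = (63, 125)
MOD_DEPTHS = (0.25, 0.375)
LEVELS_PHON = (40, 50)

def build_trial_list(trials_per_condition=50, seed=None):
    """Return a shuffled list of trials covering all eight stimulus conditions."""
    rng = random.Random(seed)
    half = trials_per_condition // 2
    trials = []
    for fc in CARRIERS_HZ:
        for depth in MOD_DEPTHS:
            for level in LEVELS_PHON:
                # Half of the trials use the AM tone, half the biased tone.
                for kind in ("AM",) * half + ("biased",) * half:
                    trials.append({"carrier_hz": fc, "mod_depth": depth,
                                   "level_phon": level, "kind": kind})
    rng.shuffle(trials)
    return trials

def percent_correct(trials, said_yes):
    """Score yes-no answers: 'yes' is correct only when the trial was a biased tone."""
    correct = sum((trial["kind"] == "biased") == answer
                  for trial, answer in zip(trials, said_yes))
    return 100.0 * correct / len(trials)

if __name__ == "__main__":
    trials = build_trial_list(seed=1)
    always_yes = [True] * len(trials)
    print(len(trials), "trials; always answering yes scores",
          percent_correct(trials, always_yes), "%")
```

A listener who always answers "yes" (or always "no") scores exactly 50% in this balanced design, which is why the percent-correct scores are compared against a 50% chance level.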

Results and discussion
Before looking at how well listeners were able to distinguish the AM stimuli from the stimuli that truly contained infrasound, let us first consider the infrasound levels that the listeners adjusted during the MLT procedure so that the perceived modulation of the biased tones matched that of the corresponding AM reference stimulus.
The mean data across subjects for all 8 stimulus conditions are shown in Figure 2.
According to the infrasound equal-loudness contours proposed by Møller and Pedersen [5], the $L_{\mathrm{RBT}}$ values fell roughly in the range of 20-40 phon, and were therefore well above the 8-Hz perception threshold of most human listeners (approx. 100 dB SPL [4]). Although the levels across the eight conditions typically stayed within a range of 10 dB for each individual and roughly resembled the pattern of the mean data, the individual curves were quite offset from each other. (For individual $L_{\mathrm{RBT}}$, see the table below.) As expected, an increase in $L_{\mathrm{RBT}}$ is required to achieve an increased perceptual modulation that matches the increase of modulation depth from 25% to 37.5% in the reference AM stimuli. Apart from this offset, the two curves look rather similar across the remaining parameter variations. As similarly expected, the louder 50-phon probe tones generally require more intense infrasound tones than the softer 40-phon probe tones.
It has to be pointed out, however, that a simple linear superposition model, as illustrated in Figure 1, does not quantitatively account for these results: A 25% modulation depth in the half-wave rectified output signal would require a BM-biasing amplitude that is 25% of the probe tone BM response amplitude. Similarly, a 37.5% modulation depth would require a BM-biasing amplitude of 37.5% of the probe tone BM response amplitude. This theoretical 3.5-dB increase in $L_{\mathrm{RBT}}$ is not observed for the 40-phon data. The required increase in BM biasing amplitude would also just linearly scale up with probe tone amplitude if the perceived modulation could be simply explained by linear superposition of the two tones. In other words, the roughly 7-dB increase in probe tone level (from 40 to 50 phon) should then also require a roughly 7-dB increase in $L_{\mathrm{RBT}}$ in order to maintain the same amplitude modulation of the half-wave rectified output signal. However, this is only the case for the 63-Hz condition with 37.5% modulation depth; the increase was clearly lower in the other conditions. This brings us to the last, rather puzzling observation: We expected that the 125-Hz carrier tone requires a slightly higher biasing tone level than the lower 63-Hz carrier tone, because its excitation centroid is located at a more basal and stiffer BM location. However, this is only the case with 40-phon carrier tones. We therefore conclude that perceptual modulation of an LF tone during infrasound exposure cannot be simply explained by linear superposition of the two. We believe that large cochlear microphonic potentials, experimentally observed by Salt et al. [8] in guinea pigs, might cause an "electrical" biasing of the inner hair cell neurotransmitter release, leading to a modulation of the AN output. The involved electrical phenomena are likely very non-linear, and only a detailed model of these rather complex processes might have the potential to quantitatively explain the present results, as well as previously published data on modulation of the AN activity by low-frequency biasing tones (e.g. [9,10,11]).
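The two predictions of the linear superposition model discussed above can be written out as a short worked calculation. The symbols $a_{\mathrm{bias}}$, $a_{\mathrm{probe}}$ and $m$ are introduced here only for illustration (BM biasing amplitude, probe tone BM response amplitude, and modulation depth); they do not appear in the original text.

```latex
% Linear superposition: the modulation depth equals the amplitude ratio
m = \frac{a_{\mathrm{bias}}}{a_{\mathrm{probe}}}
\quad\Longrightarrow\quad
20\log_{10} a_{\mathrm{bias}} = 20\log_{10} m + 20\log_{10} a_{\mathrm{probe}}

% (i) Raising m from 25% to 37.5% at fixed probe amplitude
\Delta L_{\mathrm{RBT}} = 20\log_{10}\frac{0.375}{0.25} = 20\log_{10} 1.5 \approx 3.5\ \mathrm{dB}

% (ii) Raising the probe level by about 7 dB at fixed m
\Delta L_{\mathrm{RBT}} = \Delta L_{\mathrm{probe}} \approx 7\ \mathrm{dB}
```

Both predictions follow only if the biasing tone level maps linearly onto BM displacement, which is the assumption the measured data call into question.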
Although the adjustment of biasing tone levels was a prerequisite to address the main question of this study, the data so far have not yet answered whether listeners can pick out the stimuli that contain the infrasound tone. Discrimination results per condition showed that the number of subjects scoring above 64% was always about half. Note that only scores above 64% have a probability below 5% of being obtained by guessing (n = 50). In other words, in each condition about half of the subjects performed no better than chance. Taking the performance across all eight conditions (Figure 3), however, most subjects performed well above chance, since they scored on average better than 55% (p < 0.05 for n = 400). The best performance was that of listener 5, with an average of 81% correct responses across the eight conditions. He is one of the authors, and performed at this level similarly across all 8 conditions. In his best condition, he achieved 86% correct responses. But note that this does not necessarily mean that he detected the infrasound in 86% of the trials, as the percentage also includes correct guesses with a 50% chance. At the other end, subject 1 was amongst the worst performers, responding with 48% correct responses purely at chance. He is the other author and also had long experience in listening experiments with infrasound. In summary, it is fair to say that none of the listeners picked the stimuli containing the infrasound tone with great confidence.
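The chance-level criteria quoted above (n = 50 per condition, n = 400 overall) can be reproduced approximately with an exact one-sided binomial tail. This is a sketch under the assumption that pure guessing means a 50% probability of a correct response on every trial; the function names are illustrative, and the study may have used a different test.

```python
from math import comb

def p_at_least(k, n):
    """Probability of k or more correct responses out of n when guessing at 50%."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def min_significant_score(n, alpha=0.05):
    """Smallest number of correct responses whose guessing probability is below alpha."""
    for k in range(n + 1):
        if p_at_least(k, n) < alpha:
            return k
    return None  # not reachable for alpha > 2**-n

if __name__ == "__main__":
    for n in (50, 400):
        k = min_significant_score(n)
        print(f"n = {n}: {k} correct ({100 * k / n:.1f}%), p = {p_at_least(k, n):.4f}")
```

With 50 trials the criterion falls at 32 correct responses (64%), in line with the per-condition limit cited in the text; with 400 trials a score of 55% (220 correct) lies below the 5% guessing probability.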
Comparing the eight stimulus conditions, none of them appeared to be particularly hard or easy: As Figure 4 shows, the across-subject mean scores ranged between 60% and 64.3%. However, the pattern across the conditions was inconsistent amongst the listeners. In a 3-way ANOVA, neither modulation depth, carrier frequency, nor carrier level turned out to be a significant factor (p = 0.58, 0.97, 0.87 and F = 0.31, 0.0, 0.03, respectively). This was also the case when only considering the six best performing subjects. In other words, the easiest condition for one subject was one of the hardest conditions for another. Also surprisingly, there was no correlation of infrasound detectability with infrasound SPL when pooling all data ($R^2$ = 0.00082, p > 0.05). Individually, however, there were cases of significant positive as well as negative correlations, possibly reflecting different listening strategies.

Conclusion
It has been shown under laboratory conditions that a low-frequency tone that is 8-Hz amplitude-modulated is perceptually similar to a stimulus that contains a low-frequency tone and an 8-Hz infrasound tone. We speculate that other slowly amplitude-modulated low-frequency stimuli, which actually do not contain spectral content below 20 Hz, might also sound as if they contained infrasound. This finding may help to explain cases of annoyance attributed to infrasound, where measurements show audible low-frequency content, but with infrasound content well below sensation threshold. Nevertheless, given their perceptual similarity, LF sound that is slowly amplitude-modulated can still cause annoyance, similar to that attributed to true infrasound.

Figure 1. The two stimulus types considered in this study: a biased tone (upper panel) and an AM tone (lower panel). The half-wave rectified versions of the signals (only the upper parts) are almost indistinguishable.

Figure 4. The table header also labels the percent correct bars, which show the across-subject average score in each of the eight conditions. The length of each bar is divided proportional to the number of hits ($N_{\mathrm{Hits}}$) and correct rejections ($N_{\mathrm{CR}}$).