Effects of age on electrophysiological measures of cochlear synaptopathy in humans

Highlights • We used highpass masking to record ABRs and FFRs from low-frequency regions.• Wave I ABRs with highpass masking are not consistent with low-frequency synaptopathy.• Wave I ABRs without highpass masking are consistent with high-frequency synaptopathy.• Highpass-masked FFRs do not provide evidence of low-frequency synaptopathy.

105 dB ppeSPL -HP Noise Figure S1: ABR grand averages with the HF-ITPR montage for each experimental condition as a function of age group. It should be noted that although the grand averages are shown as a function of age group for illustration purposes here, the statistical analyses used age as a continuous rather than nominal variable. : Posterior medians (circles) and 99% credibility intervals for the differences in ABR wave I and V amplitudes between females and males estimated by the Bayesian multiple regression models. The multiple regression models did not include montage by sex interactions, only the overall effects of sex across montages were estimated, hence the plot does not show montage specific sex effects. Lat. Change (ms) X PTA 0.5-2 (dB) Figure S6: Posterior medians (circles) and 99% credibility intervals for the effects of PTA 0.5−2 on ABR wave I and V latencies estimated by the Bayesian multiple regression models. The bottom row shows the effects for the wave I-V interpeak latencies. Lat. Difference [Female -Male] (ms) Figure S9: Posterior medians (circles) and 99% credibility intervals for the differences in ABR wave I and V latencies between females and males estimated by the Bayesian multiple regression models. The multiple regression models did not include montage by sex interactions, only the overall effects of sex across montages were estimated, hence the plot does not show montage specific sex effects. SNR Change (dB) X PTA 1-2 (dB) Figure S10: Posterior medians (circles) and 99% credibility intervals for the main effects (across montages) of PTA 1−2 on FFR ENV SNR estimated by the Bayesian multiple regression model. The top row shows the effect difference between the 70% and 100% MD. Latency Ch. (ms) X PTA 1-2 (dB) Figure S12: Posterior medians (circles) and 99% credibility intervals for the main effects (across montages) of PTA 1−2 on FFR ENV latency estimated by the Bayesian multiple regression model. The top row shows the effect difference between the 70% and 100% MD. Figure S13: Posterior medians (circles) and 99% credibility intervals for the main effects (across montages) of log 10 TCNE on FFR ENV latency estimated by the Bayesian multiple regression model. The top row shows the effect difference between the 70% and 100% MD.

Choice of ABR montages
The data from the HF-IERL, and HF-ITPR montages were retained for the statistical analyses, while the IMST referenced data were not analyzed further for the following reasons: i) the average root-mean-square (RMS) amplitude during the pre-stimulus baseline window, indicative of noise level, was ∼ 1.5 dB higher for the IMST referenced data than for the other two montages, that had similar baseline RMS amplitudes; ii) the ratio of the average ABR wave I amplitude to the amplitude of a dummy wave estimated with the same criteria as wave I during the pre-stimulus baseline window (Prendergast et al., 2017) was > 1 dB lower for the HF-IMST montage compared to the other two montages (HF-IERL: 4.42 dB; HF-ITPR: 5.1 dB; HF-IMST: 3.38 dB) iii) the HF-IMST data were often contaminated by the postauricular muscle reflex (PAM). Although the PAM triggered by the click falls outside the time-window of ABR waves I and V, the PAM response to the onset of the noise preceding the click could affect responses in this time window. The PAM to the onset of the noise should cancel out in the long run through averaging, due to its random start time with respect to the averaging window start time. However, given the large size of the PAM it is possible that the cancellation was not complete; iv) previous studies indicate that tiptrode-referenced ABRs provide larger, and slightly more reliable wave I amplitudes compared to mastoid-referenced ABRs (Bauch and Olsen, 1990;Prendergast et al., 2018). The metrics described above suggest that the HF-ITPR montage may provide somewhat better measurements of wave I over the HF-IERL montage, but the differences between the two montages are likely small. For this reason, and because multilevel models generally provide better parameter estimates compared to separate analyses for each level of a given factor (Gelman, 2006;Gelman and Hill, 2007), the HF-ITPR and HF-IERL ABR data were modeled jointly in the statistical analyses.

ABR waves peak-picking algorithm
ABR peaks were first identified on the grand-average waveforms using a semiautomatic peak-picking procedure similar to that proposed by Bradley and Wilson (2004). The approximate latency for each wave was first identified visually on the grand average waveforms separately for each condition. The grand-average peaks for each wave were then defined as the highest local maxima within a tolerance window of ±0.63-ms for wave V, and ±0.51-ms for wave I of the visually identified peak latencies.
The wave peaks were then searched in the individual subject waveforms within a search window centered at the grand-average peak latencies, and with bounds of ±0.51 ms of the grand-average peak latency for wave I, and of ±0.84 ms for wave V. These bounds correspond respectively to ±3, and ±4 standard deviations of the ABR wave latencies reported by Issa and Ross (1995) for wave I, and for wave V. Peaks were identified by selecting the highest local maximum in the search window. Troughs were identified by selecting the lowest local minimum in a search window going from 0.25 to 1.5 ms for wave I, and from 0.25 to 2 ms for wave V, from the estimated peak latency. Wave amplitudes were measured from peak to trough. If no local maxima were present in the peak search window the peak amplitude was estimated by the highest absolute point in the search window (this point was also used to set the search window for the following trough). If no local minima were present in the trough search window, the trough amplitude was estimated by the inflection point with the shallowest slope in the trough search window, if present. If no inflection points were present the trough amplitude was estimated by the lowest absolute point in the trough search window.
It should be noted that because of the constraint that the trough latency is at least 0.25 ms after the peak latency there is no guarantee that either the local or the absolute minimum found in the trough search window will have a lower amplitude than the peak found in the peak search window. Therefore the algorithm could fail to find positive peak-trough amplitudes. This usually occurred for noisy waveforms. In these cases the peak-trough amplitude was set as missing. A censored analysis, described in the section on statistical methods was then used to deal with these missing data.
Latencies for peaks that could not be identified using the largest local maximum were set as missing. The rationale for this is that while the largest absolute maximum can provide a reasonable estimate of the peak amplitude when no local maxima are present, it does not necessarily provide a good estimate of the peak latency. Peak latencies for peaks with an amplitude smaller than 100 nV were also considered unreliable and set as missing.
The noise floor was calculated by applying the same algorithm used to find wave I to the pre-stimulus baseline, on a time window centered at -2.1 ms.

Noise exposure
Lifetime noise exposure was estimated via the structured interview developed by Lutman et al. (2008). The interview covers both occupational and recreational noise exposures. For each of these two categories participants were asked to recall up to five activities with the greatest amount of noise exposure in their lifetime, and with levels exceeding a threshold of 85 dBA. Noise levels were estimated mostly using a speech communication difficulty table that listed approximate noise levels as a function of vocal effort required for communication. For each activity participants were asked to estimate the duration and frequency of occurrence, and the cumulative noise exposure for the activity (U) was calculated by U = 10 (L−A−90)/10 · Y · W · D · H/2080 where L is estimated noise exposure level in dBA, A is hearing protection in dB, Y is years of exposure, W is weeks of exposure per year, D is days of exposure per week, H is hours of exposure per day, and 2080 is the number of hours in a working year. One unit of noise exposure so calculated corresponds to an eight hour daily exposure, for five days a week, for 52 weeks, for a year, to a noise level of 90 dBA. The cumulative noise exposure was summed across all activities to estimate the total cumulative noise exposure (TCNE). For the analyses the TCNE was logtransformed using base 10, so that a unit difference in the log 10 -transformed TCNE corresponds to a tenfold difference in noise exposure energy.

Audiometric thresholds
The tones had a duration of 200 ms, including 10-ms raised-cosine onset and offset ramps. Thresholds were measured with a two-interval two-alternative forced-choice task. The presentation level of each tone was varied adaptively using a two-down one-up transformed up-down procedure tracking the 70.7% correct point on the psychometric function (Levitt, 1971) to determine its detection threshold. On each trial the tone was randomly presented during one of two observation intervals marked by flashing lights on the computer screen, and separated by a 500-ms silent interval. Participants were asked to indicate the interval in which the sound occurred by pressing the corresponding button on a numeric keypad. Feedback was provided at the end of each trial by means of a colored light on the computer screen.
A single block of trials was run for each combination of ear and frequency (in random order). Each block was terminated after 16 turnpoints of the adaptive track. The level was varied in 4-dB steps for the first four turnpoints, and by 2 dB for the remaining turnpoints. The threshold was estimated as the average of the last 12 turnpoints.
The pure tones were synthesized in Python (Python Software Foundation, Delaware, United States) with a sampling rate of 48 kHz, and 32-bit depth, were played through a E-MU 0204 USB sound card (E-MU Systems, Scotts Valley, U.S.A.), and presented via Sennheiser HDA300 headphones (Sennheiser electronic GmbH & Co. KG, Hanover, Germany). Testing took place in doublewalled IAC (IAC Acoustics, Winchester, UK) soundproof booths.

Statistical models and results
All analyses were performed using Bayesian models implemented by Markov Chain Monte Carlo (MCMC) simulations using JAGS (Plummer, 2003) and R (R Core Team, 2020). For all MCMC simulations the chains for the main parameters of interest were monitored for convergence using trace plots, and where available the Gelman-Rubin statistics. The chains were also monitored for autocorrelation to ensure an effective sample size of at least ≃ 10, 000 samples for the main parameters of interest.

Bayesian correlation model
The Bayesian model to estimate the correlations among predictor variables was based on the model of Lee and Wagenmakers (2014, chap. 5) but used vague uniform priors for estimating the standard deviations of the variables instead of inverse-square-root-gamma priors.

Mixed effect multiple regression models
The data were analyzed using robust mixed-effect multiple regression models which included both categorical and continuous predictors, as well as random effects of subjects. Robust regression uses a Student's t distribution instead of a Normal distribution for describing residuals, minimizing the potential influence of outliers on the estimated regression coefficients (Kruschke, 2014).
For categorical predictors an unweighted effect coding scheme was used (Aiken et al., 1991). Continuous variables were standardized using the Friedrich method (Friedrich, 1982;Aiken et al., 1991) before being entered into the analyses. Unstandardized coefficients corresponding to those resulting from an analysis of the mean-centered variables can be obtained by scaling using the appropriate standard deviation terms (Aiken et al., 1991). The priors for the slope coefficients in the models were set differently for coefficients that were of main interest in the analysis, and coefficients that were expected to affect the dependent variable, but were not of great analytical interest, such as the effect of stimulus level on ABR amplitude. For the latter effects, the priors were very broad on the scale of the data. Shrinkage priors were used for the former: the standardized coefficients were described by a t distribution centered at zero, with 1 degree of freedom, and scale parameter set to 0.1. This prior assumes that the standardized slope coefficients should be generally close to zero, where the narrow peak of the t distribution is located, reflecting a belief that effect sizes will be generally small. However, owing to its heavy tails the t prior can accommodate coefficients much larger than zero if the likelihood provides clear evidence for this (Kruschke, 2014).
The interpretation of the standardized slope coefficients, and hence of the priors set on them, differs for continuous and categorical variables. For continuous variables the standardized slope coefficient is the change of the dependent variable in standard deviation (sd) units, for a 1-sd change of the dependent variable. Categorical variables were not standardized, and the coefficients represent the shift in the value of the dependent variable (which was still set in sd units in our models) for the categorical level coded as 1, from the the unweighted grand mean of the dependent variable of all the levels. The model code is available at https://doi.org/10.17605/OSF.IO/S3BD9. Tables S1, and S2 indicate the dummy codes used to encode categorical variables through an unweighted effect coding scheme. Tables S3, S4, S5, S6, S7, S8, and S9 list all the terms included in each statistical model (excluding the random effect of participant). The first column indicates the variable to which each coefficient refers (abbreviated as previously defined in the main text of the manuscript or as indicated in Tables S1 and S2). The second column indicates the type of variable (continuous, categorical, or interaction). The third column indicates (for all the terms except the intercept) the scale parameter of the 1-degree-of-freedom t distribution used as a prior for the standardized slope coefficient; for the intercept term this column indicates the standard deviation of the zero-centered normal distribution used as a prior for the intercept. The fourth column indicates the same quantity as the third column, but in unstandardized mean-centered units. The fifth column indicates the median of the posterior distribution in unstandardized mean-centered units. The sixth column indicates the 99% CI for the coefficients in unstandardized mean-centered units.  To give a sense of the prior distribution Fig. S14 plots t distributions with the same mean and degrees of freedom as the priors used in the current study for several values of the scale parameter. In each case the prior probability is highest for values around zero; while it is sharply centered around zero for small scale values, it becomes more diffuse as the scale value increases. Even when the scale value is relatively small, due to its heavy tails the t distributions can accommodate coefficients much larger than zero if the likelihood provides clear evidence for this. For a more in depth overview of t priors see Kruschke (2014). The models for the ABR wave amplitude (Table S4) and latency (Table S6) in quiet had two predictors (and relative interaction terms) for audiometric thresholds instead of a single one: One predictor for low-frequency audiometric thresholds (PTA 0.5−2 ), and one for high-frequency audiometric thresholds (PTA 4−12 ). The choice of these two predictors was motivated by several factors: i) age-related audiometric losses typically are not flat across the frequency range, but more pronounced at frequencies 2 kHz ii) with broadband stimulation the contribution of lower and higher cochlear frequency regions to ABR wave amplitudes is level dependent (Don and Eggermont, 1978;Eggermont and Don, 1980), with a breakpoint occurring roughly around 2 kHz, hence an interaction term between frequency region and stimulus level is needed to capture the effects of audiometric loss on ABR wave amplitudes at different levels.

Abbreviation Variable Dummy codes
Residuals plus component plots (Ezekiel, 1924;Chatterjee and Hadi, 2006;Fox, 2016) were used to check that relations between the dependent variable and each predictor (with the effects of other predictors partialed out) were approximately linear.

ABR wave amplitudes for the HF-IERL montage
ABR grand averages with the HF-IERL montage are shown in Fig. S15 separately for each age group. Fig. S16 shows the ABR wave I and V amplitudes measured for each participant in each condition as a function of age with the HF-IERL montage. Overall, the pattern of results as a function of age was very similar to that observed for the HF-ITPR montage. 105 dB ppeSPL -HP Noise Figure S15: ABR grand averages with the HF-IERL montage for each experimental condition as a function of age group. It should be noted that although the grand averages are shown as a function of age group for illustration purposes here, the statistical analyses used age as a continuous rather than nominal variable.
The effects of age (Fig. S17), PTA 0.5−2 (Fig. S18), PTA 4−12 (Fig. S19), and log 10 TCNE (Fig. S20) estimated by the Bayesian multiple regression models for the HF-IERL montage were qualitatively similar to those described in the main manuscript for the HF-ITPR montage.  Figure S16: ABR wave I and V amplitudes by age for the HF-IERL montage. The inverted triangles represent data points for which the peak-trough amplitude could not be measured. These data points were modeled as having an amplitude lower than the lowest recorded peak-trough amplitude in the dataset through a censored analysis. The two downward arrows in the panel for wave V in HP noise represent the data points of a 47 years old participant with a peaktrough amplitude of 0.38 nV, and a 61 years old participant with a peak-trough amplitude of 3.35 nV. These two datapoints are not plotted at their actual coordinate simply for aesthetic reasons to avoid excessively enlarging the panel. Each panel shows a least squares line fit of wave amplitude by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure. : Posterior medians (circles) and 99% credibility intervals for the effects of log 10 TCNE on ABR wave I and V amplitudes estimated by the Bayesian multiple regression models for the HF-IERL montage.  Figure S21: ABR wave I and V latencies by age for the HF-IERL montage. Each panel shows a least squares line fit of wave latency by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure.

ABR wave latencies for the HF-IERL montage
effects of age (Fig. S22), PTA 0.5−2 (Fig. S23), PTA 4−12 (Fig. S24), and log 10 TCNE (Fig. S25) estimated by the Bayesian multiple regression models for the HF-IERL montage were largely similar to those described in the main manuscript for the HF-ITPR montage. Lat. Change (ms) X Noise [log 10 (Energy)] Figure S25: Posterior medians (circles) and 99% credibility intervals for the effects of log 10 TCNE on ABR wave I and V latencies estimated by the Bayesian multiple regression models for the HF-IERL montage. The bottom row shows the effects for the wave I-V interpeak latencies.

FFR ENV SNR results for each montage
Figs. S26, S27, S28, and S29 show the FFR ENV SNR measured for each participant in each condition as a function of age. Each of these figures shows the results for a different montage. The results for the HF-C7, HF-LERL, and HF-LTPR montages showed the same pattern described for the across-montage average in the main manuscript, while for the HF-LMST montage, SNRs tended to decrease with age not only for the 0.6-kHz CF, but also for the 2-kHz CF. We chose SNR over raw signal level as a measure of FFR amplitude because the former measure is normalized, and easily interpretable in absolute terms. Nevertheless, i) scatterplots of raw signal levels by age showed essentially the same trends as the scatterplots of FFR SNR by age shown in Figs. S26, S27, S28, and S29, and ii) correlations between SNRs and raw signal levels were high (ρ ranging from 0.68 to 0.88 for different montages); this suggests that the effects of age on SNR were largely driven by age-related changes in signal levels rather than by age-related changes in noise levels. Fig. S30 shows the CIs for the effects of age on FFR ENV SNR for each montage, as well as for the main effects across montages. The montage-specific effects were qualitatively similar to the main across-montage effects at the 0.6 kHz CF. At the 2-kHz CF, the effects for the HF-C7, HF-LERL, and HF-LTPR montages were also similar to the main across-montage effects, while for the HF-LMST montage there were trends for age-related decreases at both MDs (posterior median ∼ 0.6 dB per age decade). Fig. S31 shows the CIs for the effects of PTA 1−2 on FFR ENV SNR for each montage, as well as for the main effects across montages. The montage-specific effects were qualitatively similar to the main across-montage effects at the 0.6 kHz CF. At the 2-kHz CF, the effects for the HF-C7, HF-LERL, and HF-LTPR montages were also similar to the main across-montage effects, with trends for SNR increases with increasing PTA 1−2 , while for the HF-LMST montage there were no trends for SNR increases with increasing PTA 1−2 . The effects of log 10 TCNE for each montage are shown in Fig. S32, and overall, are qualitatively similar to the main effects across montages described in the main manuscript. : Posterior medians (circles) and 99% credibility intervals for the effects of age on FFR ENV SNR estimated by the Bayesian multiple regression model. The black "Acr. Mnt." segments plot the main effects across all montages. The top row shows the effect difference between the 70% and 100% MD. Effects are plotted as an SNR change for an age increase of ten years. 42 Figure S31: Posterior medians (circles) and 99% credibility intervals for the effects of PTA 1−2 on FFR ENV SNR estimated by the Bayesian multiple regression model. The black "Acr. Mnt." segments plot the main effects across all montages. The top row shows the effect difference between the 70% and 100% MD. Fig. S33 shows, for each montage, the FFR TFS SNR measured for each participant in each condition as a function of age. Overall, the montage-specific effects of age (Fig. S34), PTA 1−2 (Fig. S35), and log 10 TCNE (Fig. S36) were qualitatively similar to the main across-montage effects of these predictors described in the main manuscript. Effects are plotted as an SNR change for an age increase of ten years.

FFR ENV latency results for each montage
Figs. S37, S38, S39, and S40 show the FFR ENV latency measured for each participant in each condition as a function of age. Each of these figures shows the results for a different montage. The results of the Bayesian model indicate the presence of systematic latency differences across montages, with shorter than average latencies for the HF-C7 montage, longer latencies for the HF-ITPR montage, and a trend for slightly longer latencies for the HF-LERL montage Age (years) FFR Latency (ms) Figure S37: FFR ENV latency by age for the HF-C7 montage. Each panel shows a least squares line fit of FFR latency by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure. Age (years) FFR Latency (ms) Figure S38: FFR ENV latency by age for the HF-LERL montage. Each panel shows a least squares line fit of FFR latency by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure. Age (years) FFR Latency (ms) Figure S39: FFR ENV latency by age for the HF-LMST montage. Each panel shows a least squares line fit of FFR latency by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure. Age (years) FFR Latency (ms) Figure S40: FFR ENV latency by age for the HF-LTPR montage. Each panel shows a least squares line fit of FFR latency by age with 95% confidence intervals as a visual aid. The slope for the effect of age estimated by the Bayesian multiple regression model is not the same as that shown in the figure.  Figure S41: Posterior medians (circles) and 99% credibility intervals for the effects of age on FFR ENV latency estimated by the Bayesian multiple regression model. The black "Acr. Mnt." segments plot the main effects across all montages. The top row shows the effect difference between the 70% and 100% MD. Effects are plotted as a latency change, in ms, for an age increase of ten years. : Posterior medians (circles) and 99% credibility intervals for the effects of log 10 TCNE on FFR ENV latency estimated by the Bayesian multiple regression model. The black "Acr. Mnt." segments plot the main effects across all montages. The top row shows the effect difference between the 70% and 100% MD.