Effects of hearing acuity on psychophysiological responses to effortful speech perception

In recent studies, psychophysiological measures have been used as markers of listening effort, but there is limited research on the effect of hearing loss on such measures. The aim of the current study was to investigate the effect of hearing acuity on physiological responses and subjective measures acquired during different levels of listening demand, and to investigate the relationship between these measures. A total of 125 participants (37 males and 88 females, age range 37–72 years, pure-tone average hearing thresholds at the best ear between −5.0 and 68.8 dB HL and asymmetry between ears between 0.0 and 87.5 dB) completed a listening task. A speech reception threshold (SRT) test was used with target sentences spoken by a female voice masked by male speech. Listening demand was manipulated using three levels of intelligibility: 20 % correct speech recognition, 50 %, and 80 % (IL20 %/IL50 %/IL80 %, respectively). During the task, peak pupil dilation (PPD), heart rate (HR), pre-ejection period (PEP), respiratory sinus arrhythmia (RSA), and skin conductance level (SCL) were measured. For each condition, subjective ratings of effort, performance, difficulty, and tendency to give up were also collected. Linear mixed effects models tested the effect of intelligibility level, hearing acuity, hearing asymmetry, and tinnitus complaints on the physiological reactivity (compared to baseline) and subjective measures. PPD and PEP reactivity showed a non-monotonic relationship with intelligibility level, but no such effects were found for HR, RSA, or SCL reactivity. Participants with worse hearing acuity had lower PPD at all intelligibility levels and showed lower PEP baseline levels. Additionally, PPD and SCL reactivity were lower for participants who reported suffering from tinnitus complaints. For IL80 %, but not IL50 % or IL20 %, participants with worse hearing acuity rated their listening effort to be relatively high compared to participants with better hearing. The reactivity of the different physiological measures was not, or only weakly, correlated across measures. Together, the results suggest that hearing acuity may be associated with altered sympathetic nervous system (re)activity. Research that uses psychophysiological measures as markers of listening effort to study the effect of hearing acuity on such measures is best served by the use of the PPD and PEP.


Introduction
Hearing loss is globally one of the most prevalent disabilities (World Health Organization (WHO), 2021). It increases the effort needed in daily life to listen and communicate (Beechey et al., 2020), resulting in increased levels of reported stress and fatigue (Holman et al., 2021).
Hard of hearing (HH) individuals may find themselves too fatigued or not willing to take the effort to participate in social events, leading to social isolation (e.g., see Mick et al., 2014). Hearing loss increases the risk of physical and mental health complications (Pichora-Fuller et al., 2015) and has been associated with greater cognitive decline (Lin et al., 2013) and poorer quality of life (e.g., Hasson et al., 2011; Holman et al., 2021; Mick et al., 2014). One of the mechanisms by which hearing loss may impact physical and mental health is a chronic activation of the physiological stress systems, in particular the autonomic nervous system (ANS). Such prolonged activation is known to lead to allostatic load and poor health (Epel et al., 2018). In this pathway from hearing loss to poor health, repeated increases in listening effort likely play a role. It is therefore important to gain more insight into the relationship between listening effort and changes in ANS activity.
Abbreviations: ANS, autonomic nervous system; ECG, electrocardiography; EDA, electrodermal activity; HR, heart rate; HRV, heart rate variability; ICG, impedance cardiography; LME, linear mixed-effects; PEP, pre-ejection period; PNS, parasympathetic nervous system; PPD, peak pupil dilation; RSA, respiratory sinus arrhythmia; SCL, skin conductance level; SNR, signal-to-noise ratio; SNS, sympathetic nervous system; SRT, speech reception threshold.

Listening effort
Listening effort is defined as "the deliberate allocation of mental resources to overcome obstacles in goal pursuit when carrying out a task that involves listening" (Pichora-Fuller et al., 2016). Listening effort is generally not monotonically associated with listening demand. The Framework for Understanding Effortful Listening (FUEL) illustrates how listening effort varies as a function of factors like demand and motivation (Pichora-Fuller et al., 2016; Fig. 2, p. 16S), where listening effort is largest when both motivation and listening demand are high, while the task is still possible. The motivational intensity theory (MIT) states that the mobilization of effort increases as task demand increases up to the point where individuals reach a maximum and then completely disengage when the difficulty outweighs the prospect of any reward or if one's capacity is insufficient (Brehm and Self, 1989; Richter et al., 2016). This results in a saw-tooth shaped function between listening effort and demand, where the height and width of the function are influenced by factors such as success importance (motivation) and/or mood (Richter et al., 2016). Kuchinsky and Vaden (2020) proposed an inverted-U shaped function where listening effort increases as a function of increasing task demand until a maximum is reached, followed by a gradual decline with further increasing demand. Regardless of the precise shape, it is clear that listening effort behaves non-monotonically as a function of demand.
Listening effort is also influenced by other intrinsic and external factors, such as cognitive factors (Peelle, 2018), reward (Koelewijn et al., 2018; Richter, 2016; Vassena et al., 2014), and the presence of another individual (Pielage et al., 2021). Next to hearing loss, other hearing-related complaints like tinnitus can also affect listening effort (e.g., Degeest et al., 2022). Tinnitus is the sensation of auditory perception, such as ringing, without physical presence of the sound in the environment (Baguley et al., 2013). Tinnitus is generally associated with hearing loss, but is also reported in people with (close to) normal hearing (e.g., Møller et al., 2011).
A common strategy in lab-based studies to induce listening effort is to manipulate auditory signals and vary the difficulty level of auditory tasks (e.g., Pichora-Fuller et al., 2016; Richter et al., 2023). Such a task can involve listening to sentences masked by background noise, where the difficulty is controlled by varying the signal-to-noise ratio (SNR). The speech reception threshold (SRT) test involves such a task, where the SRT is the SNR a listener requires to understand a fixed percentage of the target stimuli (Plomp, 1978; Plomp and Mimpen, 1979). A higher SRT means that a more favorable SNR is required to obtain comparable intelligibility scores and indicates worse speech reception. HH individuals tend to have higher SRTs compared to normal hearing (NH) individuals, despite adjustments to stimuli to compensate for their hearing loss (Bernstein and Grant, 2009; Festen and Plomp, 1990; Houtgast and Festen, 2008; Zekveld et al., 2011).

Physiological measures of listening effort
Recent research has focused on the effects of listening demand on psychophysiological activity, which is assumed to be influenced by listening effort.Such physiological measures aim to capture changes in, for example, ANS activity.The ANS is involved in involuntary bodily functions and can be divided in the sympathetic and parasympathetic nervous system (SNS and PNS, respectively).Simply put: the SNS is associated with physical activity and stressful or cognitively challenging situations (McCorry, 2007) whereas the PNS activity is associated with rest (McCorry, 2007).The SNS and PNS generally show a complementary pattern of activity, where SNS activity decreases as PNS activity increases and vice versa, although alternative patterns can be found as well (Berntson et al., 1994).SNS activation and PNS withdrawal are both associated with acute psychological stress (Brindle et al., 2014) and increased cognitive load (Ayres et al., 2021).There are various methods in use to measure ANS activity, a number of which are described below.Additionally, Supplementary Fig. 1 provides an overview of the relevant physiological measures and their expected direction of effect with increased listening effort.
A physiological measure that is widely associated with listening effort is the pupil dilation response.Pupil size increases with auditory task demand and decreases with auditory overload (Zekveld et al., 2018).Pupil dilation is influenced by a complex mixture of both SNS and PNS activity changes (Joshi and Gold, 2020).An increase in pupil dilation due to listening effort is indicative of both increasing SNS activity and decreasing PNS activity (Loewenfeld and Lowenstein, 1993).In line with the proposed non-monotonic relationship between listening demand and effort, the pupil dilation response in NH has been found to be largest around 40-50 % sentence intelligibility (Ohlenforst et al., 2017;Wendt et al., 2018).Interestingly, pupil dilation in response to effortful listening is smaller in HH individuals compared to NH in difficult listening conditions (Ohlenforst et al., 2017;Zekveld et al., 2011).This indicates that NH and HH individuals may differ in how the SNS and/or PNS respond to listening demand.Wang et al. (2018) suggested that this difference may be associated with a larger PNS inhibition for NH compared to HH subjects.No other study has yet reported on the role the SNS and PNS may have in the smaller pupil dilation response in HH individuals.
Specific measures of either SNS or PNS activity are required to get a better understanding of the contribution of either system to the changes in physiological responses to listening demand related to hearing loss.Physiological measures that predominantly reflect activity from either the SNS or PNS include pre-ejection period (PEP), electrodermal activity (EDA), and heart rate variability (HRV).PEP is the time interval between the ventricular depolarization at the heart and the opening of the aortic valve.It decreases with cardiac contractility, which is influenced by the SNS (Harris et al., 1967).This results in a decrease in PEP when SNS activity increases, which means that increases in effort are expected to be associated with a task-induced decrease from baseline (i.e., decreased PEP reactivity).PEP reactivity to listening demand has been reported in four studies so far.Slade et al. (2021) found a non-monotonic relationship between PEP reactivity and listening demand in NH adults, where PEP reactivity decreased with increasing listening demand during a speech-in-noise task with short stories, but increased during the impossible condition.Richter (2016) showed decreased PEP reactivity with high demand and high (monetary) reward compared to the lower demand and/or lower reward conditions of an auditory discrimination task.On the other hand, Plain et al. (2020) found a linear relationship, indicating increased PEP reactivity with increasing SNR of masked sentences (ranging from − 21 to − 1 dB SNR), and no effect of monetary reward.Only one study included HH subjects, where no effect of intelligibility (50% vs 80 %) or social observation on PEP was found during a speech perception task (Plain et al., 2021).The effect of the degree of hearing loss or tinnitus on PEP reactivity to listening demand in a group of participants including both NH and HH individuals is therefore still unclear.
One measure of EDA is skin conductance level (SCL), which fluctuates with sweat secreted from eccrine glands underneath the skin. SCL increases with increased perspiration, as conductivity of the skin increases. Eccrine glands are exclusively innervated by nerves from the sympathetic branch of the ANS and produce more sweat when SNS activity increases (Lidberg and Wallin, 1981). SCL is a measure of general arousal, but it is also influenced by other (unintended) factors, such as environmental temperature (Boucsein, 2012). Following from the general assumption that SNS activity increases with effort, it is hypothesized that skin conductance also increases with effort. EDA measures have been shown to be affected by listening demand manipulations (e.g., see Cvijanović et al., 2017; Francis et al., 2021; Mackersie and Cones, 2011), although other studies did not find significant effects (e.g., see Francis et al., 2016; Mackersie et al., 2015). Additionally, Mackersie et al. (2015) found an overall increased SCL reactivity in HH compared to NH subjects, indicating higher overall SNS responses. In contrast, higher levels of tinnitus-related distress in NH individuals have been associated with lower SCL reactivity to listening (Cartocci et al., 2023). The simultaneous effects of hearing loss and tinnitus on EDA measures have not been investigated thus far, especially in a larger sample.
HRV is the beat-to-beat variability in inter beat intervals and is mainly caused by cardiorespiratory coupling through the phenomenon of respiratory sinus arrhythmia (RSA) (Eckberg, 2003;Grossman and Kollai, 1993) and by the baroreflex effects of slow blood pressure oscillations (de Boer et al., 1985;Legramante et al., 1999).HRV associated with cardiorespiratory coupling reflects PNS activity and can be obtained through time-or frequency-domain methods.PNS specific HRV measures include peak-valley RSA, high frequency HRV (HF-HRV), and root mean square of successive differences (RMSSD).Peak-valley RSA (from here on referred to as RSA) is quantified as the mean difference between the longest inter beat interval during expiration and the shortest interval during inspiration for each respiratory cycle in the time domain (Grossman et al., 1990).At comparable breathing, it serves as a PNS measure as it reflects vagal activity on the heart during expiration.As PNS activity is presumed to decrease with increasing effort, listening demand should induce a decrease in these HRV measures.Mackersie and colleagues (2015) found that HH participants showed a decreased HF-HRV with increasing listening demand, whereas their relation was absent in NH participants.In other words, HH participants showed lower HF-HRV compared to NH at the lowest SNRs, indicating lower PNS levels.However, Mackersie and Kearney (2017) did not replicate this finding for RMSSD and Plain et al. (2021) did not find this effect in HH participants for both HF-HRV and RMSSD either.Studies including only NH subjects also showed inconsistent results: Mackersie and Calderon-Moultrie (2016) and Seeman and Sims (2015) found decreased HRV with increasing listening demand, but Cvijanović et al. (2017) and Slade et al. (2021) found no effect.Based on current literature, it is yet unknown if and how tinnitus affects HRV responses to effortful listening.
Assuming hearing loss is associated with increased listening effort and that physiological responses to listening demand are indicative of listening effort, one would assume that HH individuals tend to show increased physiological responses. As described above, if effects were found, the SNS responses were higher and PNS lower for HH individuals compared to NH (Mackersie et al., 2015), which is in line with this assumption. In contrast, pupil dilation shows the opposite effect as these responses are consistently smaller in HH individuals during listening (e.g., Zekveld et al., 2011, 2023). One could argue that this indicates attenuated SNS reactivity and/or PNS withdrawal in HH individuals, possibly due to higher SNS and/or lower PNS baseline levels in response to higher stress levels in daily life compared to NH individuals. Although the findings by Mackersie et al. (2015) and the smaller pupil dilation in HH are contradictory, it is also important to note that the SNS/PNS effects were not only based on different listening tasks, but also measured from different organs (skin, heart, pupil). SNS/PNS activity is only partly unitary, and can show organ-specificity (Esler et al., 1988). To this end, the current study aimed to measure pupil dilation and specific SNS and PNS activity simultaneously during listening.
Currently, in the context of listening effort, only three studies have reported on the relationship between pupil dilation, EDA, and cardiovascular measures.First, Mackersie et al. (2015) found a moderate correlation between HF-HRV at baseline and SCL reactivity (r = − 0.41), indicating that subjects with a low HF-HRV at baseline showed a higher SCL reactivity during listening.The authors concluded that individuals with low PNS baseline levels tended to show increased SNS reactivity to listening.Additionally, Alhanbali et al. (2019) found a weak correlation between SCL reactivity and peak pupil dilation (r = 0.21), suggesting higher pupil dilation responses corresponded with higher SNS reactivity.Lastly, Plain et al. (2021) found a moderate negative correlation between PEP and HF-HRV reactivity (r = − 0.43), but only for the condition in which the participants completed the easy listening task alone (as opposed to being observed).No study has yet reported on the association between pupil dilation and cardiovascular measures obtained during listening tasks.

Current study
Existing literature on physiological measures of listening effort, especially cardiovascular measures, and the relationship between these measures is limited. Additionally, there is little known about the effect of hearing acuity on these physiological responses. The aim of the current study was to investigate the effect of hearing acuity on EDA, cardiovascular, and pupil dilation responses and subjective measures to different levels of speech intelligibility and to assess the relationship between these measures in a relatively large sample of adults. We hypothesized that the physiological reactivity would show a non-monotonic relationship with intelligibility level with the strongest response at a moderate intelligibility level and lower responses at low and high intelligibility levels. Additionally, we hypothesized that hearing acuity affects physiological reactivity, such as a decrease in PPD with worse hearing acuity, thereby replicating previous pupillometry findings (Ohlenforst et al., 2017; Zekveld et al., 2011). Based on the interactions between SNR and hearing status found by Ohlenforst et al. (2017) and Zekveld et al. (2011), we also expected an interaction between hearing acuity and intelligibility level. For the other physiological responses we had no unequivocal expectations with regard to the directionality of the effect of hearing acuity, because current knowledge on this topic is limited. On the one hand, if individuals with worse hearing acuity were to have higher SNS activity at baseline, we also expected attenuated SNS reactivity to listening, corresponding to the expected smaller PPD. Opposite effects were then expected for PNS (re)activity. On the other hand, higher listening effort was expected to be associated with worse hearing acuity, indicated by more pronounced physiological responses. Moreover, tinnitus complaints were included in the analyses, as tinnitus often co-occurs with hearing loss (Oosterloo et al., 2021) and also may affect listening effort (Burke and Naylor, 2020; Degeest et al., 2022; Juul Jensen et al., 2018). Due to their close relationship with hearing loss, tinnitus complaints were expected to follow the same pattern of physiological responses as hearing acuity. Finally, the relationships between the outcome measures (physiological responses and subjective ratings) were assessed for each condition, as well as the relationship between hearing acuity and the physiological baseline levels.
Most previous studies on listening effort have focused on group effects between NH and HH individuals. Grouping HH individuals in one group does not allow for testing of the effect of degree of hearing loss. One solution is to create subgroups based on level of hearing acuity, but this leads to loss of information and statistical power (Burke et al., 2015). Instead, in the current study, we aimed to investigate the effect of hearing loss using hearing acuity based on the pure-tone hearing thresholds. Additionally, most studies on listening effort have applied listening demand manipulations covering the higher end of the performance scale (50-100 %), whereas only a small number of studies also implemented very difficult to impossible (<50 % performance) task conditions (Keur-Huizinga et al., 2024). Here, we aimed to include test conditions with relatively low (20 %), moderate (50 %), and high (80 %) performance (or intelligibility) levels. The current study was part of a large research project performed by several researchers. The remaining tests are beyond the scope of the current article, as they were associated with specific research questions that did not cover the currently analyzed SRT test data.

Participants
The current study was approved by the Ethics Committee of the Amsterdam University Medical Center and written informed consent was obtained from all participants prior to participation. A total of 136 adults participated in the study, with hearing acuity levels ranging from normal hearing to severe hearing loss (World Health Organization (WHO), 2021). They were recruited through the Department of Otolaryngology - Head and Neck Surgery of Amsterdam University Medical Center, flyers in the Amsterdam area, hearing aid dispensers, and social media. Participants had to be between 35 and 75 years old, native Dutch speakers, and have normal or corrected-to-normal eyesight. Exclusion criteria included cochlear implant use, cardiovascular diseases, diabetes, severe psychiatric issues, neurological diseases or other diseases affecting the eyes (e.g., glaucoma or cataract). Eligibility screening was based on self-reports. An a priori power analysis was performed based on the full scope of the research project, not for the specified research questions and analyses presented in the current article. No post hoc power analysis was performed, as we expected sufficient statistical power based on the current sample size compared to that of previous studies. For example, a sample size of around 32 participants was sufficient to detect the effect of intelligibility on PEP by Plain et al. (2020). Similarly, Zekveld et al. (2011) found the expected effects of intelligibility and hearing acuity on the pupil response in 38 NH and 36 HH participants. Due to technical limitations (limited headphone output), a total of 11 participants with severe hearing loss were not able to complete the SRT test. The final sample size was 125 participants (37 males and 88 females), with a mean age of 58.6 years (SD = 8.6, range: 37–72) and average body mass index (BMI) of 26.2 kg/m² (SD = 4.5). A total of 48 of the participants wore hearing aid(s) in daily life.
The pure-tone average (PTA) air conduction hearing threshold across octave frequencies at 500, 1000, 2000, and 4000 Hz (World Health Organization (WHO), 2021) of all participants was 24.2 dB HL (SD = 17.2, range: −5.0 to 68.8) for the best ear and 34.1 dB HL (SD = 23.3, range: 2.5 to 103.8) for the worst ear. These frequencies were used to calculate PTA following standard practice (World Health Organization (WHO), 2021) and based on the frequency range important for speech understanding (e.g., Fletcher and Galt, 1950). The average audiogram for each ear is depicted in Fig. 1. The PTA of the best ear was used in all statistical models to test for effects of hearing acuity. To account for asymmetrical hearing loss, the difference in PTA between the best and worst ear was added to the statistical models (see Section 2.7). The average PTA asymmetry was 9.9 dB (SD = 15.9, range: 0.0 to 87.5). In addition, all participants were asked the following question about tinnitus: "Do you suffer from tinnitus or tinnitus-like complaints?" (yes/no). Tinnitus was defined as: 'Tinnitus means ringing in the ears, but the sound is not physically present in the surroundings. Such sounds include whistling, whooshing, beeping, humming, buzzing, murmurs or a combination of sounds. These sounds can vary in loudness and pitch, may always or occasionally be present, and can be heard continuously or in intervals'. A total of 74 participants confirmed suffering from tinnitus.
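The two hearing predictors described above (best-ear PTA and PTA asymmetry) can be illustrated with a minimal sketch. The function names and the example audiograms are hypothetical and are not taken from the study; only the four averaging frequencies and the best-minus-worst definition follow the text.

```python
from statistics import mean

# Frequencies (Hz) averaged for the PTA, as stated in the text.
PTA_FREQS_HZ = (500, 1000, 2000, 4000)


def pure_tone_average(thresholds_db_hl):
    """Mean air-conduction threshold (dB HL) over the four PTA frequencies.

    `thresholds_db_hl` maps frequency in Hz to threshold in dB HL;
    frequencies outside PTA_FREQS_HZ are ignored.
    """
    return mean(thresholds_db_hl[f] for f in PTA_FREQS_HZ)


def best_ear_pta_and_asymmetry(left, right):
    """Return (best-ear PTA, worst-minus-best PTA difference), both in dB."""
    pta_left = pure_tone_average(left)
    pta_right = pure_tone_average(right)
    best = min(pta_left, pta_right)   # lower threshold = better hearing
    worst = max(pta_left, pta_right)
    return best, worst - best


# Hypothetical audiograms, for illustration only.
left_ear = {250: 10, 500: 10, 1000: 15, 2000: 20, 4000: 30, 8000: 40}
right_ear = {250: 15, 500: 20, 1000: 25, 2000: 35, 4000: 45, 8000: 60}
best_pta, asymmetry = best_ear_pta_and_asymmetry(left_ear, right_ear)
print(best_pta, asymmetry)
```

Because both quantities enter the models as continuous predictors, no cut-off for "normal" or "impaired" hearing is needed in this step.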

Procedure
At the start of the test session, pure-tone air-conduction hearing thresholds were obtained for both ears at 250, 500, 1000, 2000, 4000, and 8000 Hz and bone-conduction thresholds at 500, 1000, 2000, and 3000 Hz.Thresholds were obtained in accordance to ISO 8253-1 (2010) using the Hughson-Westlake procedure (Hughson and Westlake, 1944) and standard audiometric equipment (Decos or Otometrics Natus Madsen Astera2, IEC 60645-1 [2017] compliant).The headphones (TDH39P or DD450) and bone conductor (B71) were calibrated according to ISO 389-1 (2017) and 389-3 (2016), respectively.The tests were conducted in a soundproof room, compliant to ISO 8253-1.Afterwards, participants' height and weight were measured and the electrodes were attached and recording started.Next, the participants completed a questionnaire, namely the Amsterdam Inventory for Auditory Disabilities (Kramer et al., 1995; not discussed in the present article), which took 10 to 15 min.After completing the survey, the participants were told to rest and stay seated for 5 min.This resting interval was used as baseline for cardiovascular and skin conductance measures.After the baseline interval, the pupillometer was calibrated and the SRT test was performed.The SRT test consisted of a practice test, followed by three conditions with varying intelligibility level (IL) in a randomized order: IL20 %, IL50 %, IL80 %.Following each condition, participants completed rating scales on subjective effort, difficulty, performance and giving up.Subsequently, several remaining tests were completed, which are not discussed in the present article.

Speech-reception-threshold test
Stimuli in the SRT test were Dutch target sentences spoken by a female voice and masked by concatenated sentences spoken by a male speaker (Versfeld et al., 2000).The long-term average frequency spectrum of the masker was shaped to match that of the target speech and the bandwidth of both the target and masker was limited to 330-6300 Hz.The target sentences were logical and syntactically correct.The masker sentences started at a random part of the sentence and the sentences uttered by the target speaker were different from those uttered by the masker (for more details on the sentences, see Versfeld et al., 2000).The masker level was fixed at 65 dB SPL.Intensity of the target sentence (i.e., the SNR) was varied adaptively to reach the desired intelligibility level.The stimuli were presented binaurally using headphones (Sennheiser, HDA 200, 40 Ω) through an external soundcard (24 bit; Sound Blaster, Creative Technology) connected to a personal computer.
Each trial started with 2 s of masker only, after which the target sentence was presented while the masker continued, followed by another 3 s of masker only.The duration of the target sentences varied between 1.5-2.5 s.After masker offset, participants were instructed to verbally repeat all words they heard, even if they did not hear the full target sentence and they were encouraged to make a guess if unsure.The researcher was seated behind the participant and scored the repetition accuracy for each trial.Sentence repetition was considered correct if all words of the target sentence were repeated correctly.The interval between trials varied depending on the response of the participant, but was at least 4 s to provide the pupil size sufficient time to return to baseline levels (Winn et al., 2018).Participants were asked to remove their hearing aids, if applicable.To accommodate their hearing levels, the overall sound level of the stimuli was adjusted using the NAL-R algorithm (Byrne and Dillon, 1986; within frequency range of 330-6300 Hz) based on the pure-tone air and bone conduction hearing thresholds.
At the start of the SRT task, a short practice test was completed consisting of five trials. Next, the IL20 %, IL50 % and IL80 % conditions were completed in a randomized order. Each condition included 28 trials. If the participant's PTA at the best ear was equal to or better than 30 dB HL, the SNR of the first sentence was −22 dB, −18 dB, or −14 dB SNR for the IL20 %, IL50 %, and IL80 % conditions, respectively. Otherwise, the conditions started at −12 dB, −8 dB, and −4 dB SNR for the IL20 %, IL50 %, and IL80 % conditions, respectively. The first sentence of each condition was repeatedly presented until the participant repeated it correctly. The SNR of this first sentence was increased by 4 dB after every incorrect repetition. The remaining trials in the condition were presented only once. The practice condition started at 0 dB SNR; the SNR was increased by 4 dB if the sentence was repeated incorrectly and decreased by 4 dB when it was repeated correctly. To reach the required intelligibility levels (20 %, 50 %, and 80 %) in the experimental conditions, the adaptive procedure used different step sizes to vary the SNR (Kaernbach, 1991). For the IL20 % condition, the step sizes were +1.6 dB and −6.4 dB after incorrect and correct responses, respectively. For the IL80 % condition, these step sizes were reversed (i.e., +6.4 dB and −1.6 dB after incorrect and correct responses, respectively) and for the IL50 % condition they were +2 dB and −2 dB after incorrect and correct responses, respectively. The SRT per condition was calculated as the mean SNR (in dB) of trials 5 to 28.
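The choice of step sizes can be understood from the equilibrium of a weighted up-down track: the SNR stops drifting when the expected downward step equals the expected upward step, i.e. p · down = (1 − p) · up, so the track converges on the proportion correct p = up / (up + down). The sketch below checks this for the three conditions and computes the SRT from a track. The function names are hypothetical, and the repeated presentation of the first sentence is deliberately not modeled.

```python
from statistics import mean

# (step up after an incorrect response, step down after a correct response), in dB.
STEP_SIZES_DB = {
    "IL20": (1.6, 6.4),
    "IL50": (2.0, 2.0),
    "IL80": (6.4, 1.6),
}


def target_proportion(condition):
    """Proportion correct at which the expected SNR change is zero."""
    up, down = STEP_SIZES_DB[condition]
    return up / (up + down)


def snr_track(start_snr_db, responses, condition):
    """SNR presented on each trial, given per-trial correctness (True/False).

    The response on trial n sets the SNR of trial n + 1, so the last
    response does not generate a further SNR value.
    """
    up, down = STEP_SIZES_DB[condition]
    snrs = [start_snr_db]
    for correct in responses[:-1]:
        snrs.append(snrs[-1] - down if correct else snrs[-1] + up)
    return snrs


def srt(snrs):
    """Mean SNR (dB) over trials 5 to 28 (list indices 4 to 27)."""
    return mean(snrs[4:28])


for name in STEP_SIZES_DB:
    print(name, round(target_proportion(name), 3))
```

For example, `snr_track(-8.0, [True, False, True], "IL50")` yields the SNR sequence −8, −10, −8 dB.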

Subjective ratings
Following each SRT condition, participants answered four questions on a Visual Analogue Scale from 0 to 10, with steps of 0.1 (based on Zekveld et al., 2010).The first question was (translated to English) 'How much effort did it take to understand the sentences?',with the answers ranging from 'no effort' (0) to 'a lot of effort' (10).The second question was 'How well were you able to understand the sentences?' and the answers ranged from 'I did not understand any sentence' (0) to 'I understood all sentences' (10).The third question was 'How difficult was the task?' and the scale ranged from 'I found the task not difficult at all' (0) to 'I found the task extremely difficult' (10).The fourth question was 'How many times did you give up trying to perceive the sentences?',with the answers ranging from 'I did not give up at all' (0) to 'I gave up every sentence' (10).These questions will be referred to as subjective effort, performance, difficulty, and tendency to give up, respectively.

Pupillometry
Pupil size was measured during the SRT test using an infrared eye tracker (SMI RED250mobile) positioned on a table in front of the participant at 55 cm distance.Luminance of the room was fixed at approximately 450 lux.The eye tracker was used without a PC or laptop screen and sampling rate was 60 Hz and spatial resolution 0.03 mm.Participants were instructed to focus on a fixation point positioned 1.4 m in front of them to reduce eye movements.If possible, participants performed the task without glasses, but contact lenses were not removed.Participants were required to remove eye make-up if applicable.
Pupil responses were measured during each trial, where the average pupil size in the 1000 ms window before target sentence onset was used as baseline.Recordings of the left eye were used by default, but the right eye was used if data quality of the left eye was considered low or due to medical reasons (e.g., if the participant had a glaucoma in the left eye).The first four trials were excluded for all participants, due to starting the adaptive procedure of the test (Zekveld et al., 2010).Additionally, trials were automatically excluded if they contained equal to or more than 25 % missing data, for example due to blinks or signal loss.Blinks were defined as intervals with pupil sizes below 3 SDs of the mean for each trial, with a minimal duration of 83 ms.All trials were visually inspected to check for artefacts and were manually removed in case of poor data quality (e.g., noisy signal or a significant amount (>50 %) of missing data due to blinks or signal loss in the baseline interval).For nine conditions from six participants with <10 remaining trials, an adapted criterion of at most 40 % missing data was used.If this still resulted in <10 acceptable trials despite the adjusted maximum, all trials for that condition were excluded.In total, 11 % of the pupillometry data were excluded from final analyses due to poor quality.
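The automatic part of the trial selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name and the input format (one missing-data fraction per trial) are hypothetical, and the manual visual inspection step is not modeled.

```python
def select_trials(missing_fraction, min_trials=10):
    """Return the 1-based trial numbers retained for one condition.

    `missing_fraction` holds, per trial, the fraction of missing samples
    (0.0 to 1.0). Trials 1 to 4 are always dropped (start of the adaptive
    procedure). Trials with >= 25 % missing data are excluded; if fewer
    than `min_trials` remain, the criterion is relaxed to at most 40 %.
    If that still leaves too few trials, the condition is excluded.
    """
    candidates = {
        trial: fraction
        for trial, fraction in enumerate(missing_fraction, start=1)
        if trial > 4
    }
    kept = [t for t, m in candidates.items() if m < 0.25]
    if len(kept) < min_trials:
        kept = [t for t, m in candidates.items() if m <= 0.40]
    if len(kept) < min_trials:
        return []  # whole condition excluded
    return kept


clean = select_trials([0.10] * 28)
print(len(clean))
```

With 28 clean trials, the 24 trials after the first four are retained.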
Linear interpolation was applied to the remaining included trials to remove missing data and eye blinks, and started four samples before and ended eight samples after each blink. For shorter periods of missing data, linear interpolation was restricted to the interval of missing data. Thereafter, data were smoothed using a five-point moving average filter. For each condition and participant, the pupil data were then averaged. Based on this average pupil response, the peak pupil dilation (PPD, i.e., maximum pupil size change compared to baseline) was calculated. The PPD was analyzed because it has been shown to be sensitive to hearing loss and listening demand changes in previous studies (Zekveld et al., 2018) and allows the current research questions to be answered.
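The preprocessing chain described above (blink interpolation, smoothing, averaging, peak extraction) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the function names, the `(start, stop)` blink format, the shrinking window at the trace edges, and the search for the peak from sentence onset to the end of the trace are all hypothetical choices. The padding of four and eight samples, the five-point filter, and the 1000 ms baseline (60 samples at 60 Hz) come from the text.

```python
from statistics import mean


def interpolate_blinks(trace, blinks, pad_before=4, pad_after=8):
    """Linearly bridge each blink, from 4 samples before to 8 samples after it.

    blinks holds (start, stop) index pairs with stop exclusive (assumed format).
    """
    out = list(trace)
    for start, stop in blinks:
        lo = max(start - pad_before, 0)               # left anchor sample
        hi = min(stop - 1 + pad_after, len(out) - 1)  # right anchor sample
        for i in range(lo + 1, hi):
            frac = (i - lo) / (hi - lo)
            out[i] = out[lo] + frac * (out[hi] - out[lo])
    return out


def moving_average(trace, width=5):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(trace)):
        window = trace[max(i - half, 0): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def peak_pupil_dilation(trials, onset_index, baseline_samples=60):
    """PPD of the trial-averaged, baseline-corrected pupil trace.

    trials: equal-length cleaned traces of one participant and condition.
    The baseline is the mean of the samples just before sentence onset.
    """
    corrected = []
    for trace in trials:
        baseline = mean(trace[onset_index - baseline_samples: onset_index])
        corrected.append([value - baseline for value in trace])
    average = [mean(column) for column in zip(*corrected)]
    return max(average[onset_index:])
```

Because baseline subtraction and averaging are both linear, correcting each trial before averaging gives the same average trace as averaging first and subtracting the mean baseline afterwards.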

Cardiovascular and electrodermal activity
Electrocardiography (ECG), impedance cardiography (ICG), and electrodermal activity (EDA) were measured using the Vrije Universiteit Ambulatory Monitoring System (VU-AMS; https://vu-ams.nl/). For ECG and ICG, seven 55 mm Kendall H98SG hydrogel ECG electrodes were placed on the thorax and back area. For EDA, a disposable Biopac Systems EL507A electrode with GEL101A or EL507 with isotonic gel was placed on the thenar eminence of the non-dominant hand and a reference ECG electrode on the forearm. ECG and ICG were sampled at a frequency of 1000 Hz and EDA at 10 Hz. Recording started after visually checking the quality of the signals and ended after the end of the test session. Data were scored and data quality was checked for each participant afterwards using the Data Analysis and Management Software (DAMS). Clipping levels in the signals were removed automatically and any additional noise or deviant samples were removed manually. For the ECG signal, any segments with missed or incorrect R-peaks from QRS complexes were corrected manually. The heart rate (HR) was based on the automatic R-peak detection in the ECG signal by DAMS and calculated as the average number of R-peaks per minute per condition (in beats per minute). Respiratory sinus arrhythmia (RSA) was used as the measure of HRV, which is based on the peak-valley method (de Geus et al., 1995; Grossman et al., 1990). For each respiratory cycle (as measured from the thorax impedance of the ICG), the shortest inter beat interval during inspiration is subtracted from the longest inter beat interval during expiration. These values (in milliseconds) are then averaged over each condition as well as the baseline period. PEP was obtained by calculating the time (in milliseconds) between the B-point of the ensemble averaged ICG complex and Q-point on the average ECG complex per condition and during the baseline. The B-point signals the opening of the aortic valve, whereas the Q-point signals the ventricular depolarization. The ensemble averaged ICG and ECG complexes were visually inspected for quality and placements of relevant time points (B- and Q-points) were corrected if necessary. SCL (in µSiemens) was obtained from the raw EDA signal and further processed using the EDA Matlab toolbox (Joffily, 2012). After filtering and removing artefacts (outside of the range of 2-60 µSiemens), SCL was averaged over baseline and each condition. For each participant, the reactivity of each measure (HR, RSA, PEP, and SCL) was calculated as the mean level during the condition minus the mean level during the 5-minute resting period (baseline). Due to low quality or missing signals, data of some participants were excluded from analyses: HR and RSA data of 12 conditions from 4 participants, PEP data of 9 conditions from 3 participants, and SCL data of 26 conditions from 10 participants.
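The peak-valley RSA computation and the reactivity score described above amount to simple arithmetic, sketched below in Python. The function names and the input layout (one pair of inter beat interval lists per breath) are assumptions for illustration, as is the choice to skip breaths without beats in either phase; the actual scoring was done in DAMS. The numbers in the usage example are made up.

```python
from statistics import mean


def peak_valley_rsa(cycles):
    """Mean peak-valley RSA in ms over a recording period.

    cycles: one (inspiration_ibis_ms, expiration_ibis_ms) pair per breath.
    Per breath: longest expiratory IBI minus shortest inspiratory IBI.
    Breaths lacking beats in either phase are skipped (assumption).
    """
    per_breath = [max(expi) - min(insp) for insp, expi in cycles if insp and expi]
    return mean(per_breath)


def reactivity(condition_level, baseline_level):
    """Reactivity score: condition mean minus resting-baseline mean."""
    return condition_level - baseline_level


# Hypothetical inter beat intervals (ms), for illustration only.
task_cycles = [([800, 780, 790], [820, 850, 830]), ([810, 795], [860, 840])]
rest_cycles = [([800, 790], [880, 870])]
rsa_task = peak_valley_rsa(task_cycles)       # (70 + 65) / 2
rsa_rest = peak_valley_rsa(rest_cycles)       # 90
rsa_change = reactivity(rsa_task, rsa_rest)   # negative: RSA lower than at rest
```

The same `reactivity` subtraction applies to HR, PEP, and SCL; a negative PEP reactivity means a shorter pre-ejection period than at rest.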

Statistical analyses
All physiological measures were converted to reactivity scores by subtracting the baseline values. For HR, PEP, RSA, and SCL this was done using the means of the baseline condition; for PPD the pre-trial pupil diameter acted as the baseline. Linear mixed-effects (LME) models were applied to assess the effect of hearing acuity (PTA of the best ear) and intelligibility level (task conditions IL20 %, IL50 %, and IL80 %) on the SRT, physiological reactivity, and the subjective measures. The LME analyses were performed using the lme4 package in R (v1.1-34; Bates et al., 2015). In each model, the physiological reactivity measures were set as dependent variables (outcome), PTA and ear asymmetry as continuous independent variables, and condition and tinnitus complaints (yes/no) as categorical independent variables, with an interaction term between PTA and condition. All independent variables were added as fixed factors and participant was added as random effect to account for the repeated measures. The LME model was chosen as it allows for simultaneous testing of the effect of hearing acuity and intelligibility level and because it is robust against missing and non-normally distributed data (Krueger and Tian, 2004; Schielzeth et al., 2020). All LME analyses were performed using a maximum likelihood estimation method. F-tests with Satterthwaite's estimation of degrees of freedom were used to test the overall contribution of each fixed factor in the LME model. To further test for differences between conditions, follow-up post hoc tests were performed using t-tests with Satterthwaite's method for denominator degrees of freedom. In the LME models of PPD, HR, PEP, RSA, and SCL reactivity, polynomial contrasts were added to the models to assess whether the physiological reactivity measures showed a linear or quadratic relationship with intelligibility level. To this end, the linear and quadratic relationships in the models were assessed using t-tests with Satterthwaite's method for denominator degrees of freedom. This assessment was only performed for the physiological measures, as we hypothesized a non-monotonic relationship between intelligibility level and physiological reactivity, but not for the SRT and subjective measures.
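To illustrate what the linear and quadratic terms capture, the sketch below applies orthogonal polynomial contrast weights for a three-level ordered factor to a set of condition means. It is a Python illustration only, not the authors' R code: the ordering of the levels (IL20 %, IL50 %, IL80 %), the function name, and the example means are assumptions, and the mixed model itself (random participant intercept, covariates, Satterthwaite tests) is not reproduced.

```python
from math import sqrt

# Orthonormal polynomial contrast weights for three equally spaced levels,
# assumed here to be ordered IL20 %, IL50 %, IL80 %.
LINEAR = (-1 / sqrt(2), 0.0, 1 / sqrt(2))
QUADRATIC = (1 / sqrt(6), -2 / sqrt(6), 1 / sqrt(6))


def contrast_score(means, weights):
    """Weighted sum of the condition means for one contrast."""
    return sum(m * w for m, w in zip(means, weights))


# Hypothetical condition means with a peak at the middle level.
example_means = [0.20, 0.25, 0.18]
linear_score = contrast_score(example_means, LINEAR)
quadratic_score = contrast_score(example_means, QUADRATIC)
```

With this coding, a response that peaks at the middle level yields a negative quadratic score, and a response that dips at the middle level yields a positive one. A quadratic term without a linear term is therefore the pattern expected under a non-monotonic relationship.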
Additionally, potential confounding factors were added to the models. These potential confounders included age, sex, BMI, and respiration rate. Considering the general association between higher age and more severe hearing loss (Hoffman et al., 2017), age was added to the models of all measures. Sex was only added to the models of HR, RSA, PEP and SCL reactivity, as physiological responses generally differ based on biological sex (Allen et al., 1993; Huang et al., 2013; Koenig and Thayer, 2016). BMI was also added to the models of HR, RSA, PEP and SCL reactivity, because body composition generally affects physiological measures (e.g., see Doberenz et al., 2011; Steptoe and Wardle, 2005). As verbal repetition of the target sentences affects breathing and thus RSA (de Geus et al., 2019; Eckberg, 2003), respiration rate was added to the model of RSA reactivity. Finally, correlations between all physiological and subjective measures for each condition were tested using Pearson's correlations. To account for potential effects of hearing acuity on baseline levels, Pearson's correlation tests were also performed for PTA and tinnitus complaints with the physiological baseline measures. Pearson's correlations were chosen based on the close to normal distribution of the data and large sample size. Effects in the LME analyses were considered significant when the 95 % confidence interval (CI) of the parameter estimate excluded zero; correlations were considered significant when the p-value was lower than 0.05.
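As a brief illustration of the correlation tests, the Python sketch below computes Pearson's r and the t statistic commonly used to test it (with n − 2 degrees of freedom). The function names are hypothetical, and the p-value lookup is left out because it requires a t-distribution that the standard library does not provide. The worked value uses r = −0.20 and assumes n = 125; the number of participants actually available for a given measure may be lower because of the exclusions described above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Worked value: r = -0.20 with an assumed n of 125 gives t of about -2.26.
t_example = t_statistic(-0.20, 125)
```

With a sample of this size, even a weak correlation of this magnitude produces a t statistic beyond the conventional two-sided 5 % threshold of roughly ±1.98, which is why weak correlations can still be statistically significant here.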

Results
Table 1 summarizes the descriptive statistics per condition of the SRT, physiological reactivity, and subjective ratings. Additionally, the number of participants included in the analyses per outcome measure and condition is listed in this table, as some participants had missing or low quality data (see Methods section). The omnibus results of the LME models are described below; the post hoc test results of these models, together with the parameter estimates and their confidence intervals, are listed in Tables 2 and 3. The tables include all t-values for all LME results, despite the effect of condition, for consistency. Lastly, the correlations between the outcome measures per condition are presented in Table 4 and discussed below (see 3.5).

Behavioral results
Fig. 2A shows the SRT per condition with error bars indicating the standard error and Fig. 2B illustrates the SRT as a function of PTA per condition. The LME analysis for SRT with condition (IL20 %, IL50 %, IL80 %) and PTA (best ear) as fixed factors indicated a significant main effect of condition (F(2, 250) = 324.2, p < 0.01) as well as PTA (F(1, 125) = 197.3, p < 0.01). The interaction and the effect of PTA asymmetry were not significant. The post hoc results indicated a higher SRT for IL50 % compared to IL20 % and higher for IL80 % compared to IL50 % (see Table 3). Participants with a higher PTA obtained higher SRTs, meaning that participants with worse hearing acuity had poorer SRTs in all conditions. Additionally, self-reported tinnitus complaints also showed a significant effect (F(1, 125) = 4.6, p = 0.03).

Pupil dilation
Fig. 3 shows the mean PPD per condition separated by tinnitus complaints status (A) and the PPD as a function of PTA per condition (B-D). The average pupil response over time for the different conditions is presented in Fig. 4. The LME model demonstrated significant effects of condition (F(2, 243.6) = 4.2, p = 0.02) and PTA (F(1, 123) = 7.5, p < 0.01). The interaction and PTA asymmetry were not significant. The effect of PTA indicated that PPD decreased with increasing PTA (i.e., worse hearing acuity). Moreover, there was a significant effect of tinnitus complaints (F(1, 123.5) = 5.1, p = 0.03), indicating that participants suffering from tinnitus showed a smaller PPD. The quadratic term for condition was significant (t(243.8) = −2.7, p < 0.01), indicating a non-monotonic relationship between condition and PPD. The linear term was not significant. The follow-up tests revealed a larger PPD for IL50 % compared to IL80 %, but no difference between IL20 % and IL50 % (see Table 2).

Table 1
Descriptive statistics with mean (standard deviations) of the outcome measures per condition, as well as the number of participants included in the analyses.

Cardiovascular and skin conductance measures
In Fig. 5, the mean HR, PEP, RSA and SCL reactivity per condition are shown (Fig. 5D also separated by tinnitus complaints status). The physiological reactivity over PTA per condition is shown in Fig. 6. The LME model of HR reactivity revealed a significant effect of condition (F(2, 240) = 4.3, p = 0.02), but the post hoc tests did not reveal any significant differences between conditions (see Table 2). There was no effect of PTA and no interaction. PTA asymmetry and tinnitus complaints did not significantly affect HR reactivity. The quadratic term was not significant, whereas the linear term did show a significant effect (t(240) = 2.9, p < 0.01), as well as the interaction between PTA and the linear term (t(240) = −2.26, p = 0.02). RSA reactivity was not significantly affected by condition or PTA and there was no interaction. The effects of PTA asymmetry and tinnitus were not significant. The linear and quadratic terms also did not reveal a significant effect.
Condition showed a significant effect in the LME model on PEP reactivity (F(2, 242) = 8.8, p < 0.01), but the effect of PTA and the interaction were not significant. PTA asymmetry and tinnitus complaints were also not significant. Moreover, the quadratic term for condition was significant (t(242) = 3.8, p < 0.01), indicating a non-monotonic relationship between condition and PEP reactivity. The linear term was not significant. The post hoc test (see Table 2) confirmed that IL50 % showed the strongest PEP decrease from baseline compared to both IL20 % and IL80 %. The LME model for SCL reactivity indicated no significant effect of PTA or condition and there was no interaction. PTA asymmetry was not significant. The effect of tinnitus complaints was significant (F(1, 117.2) = 4.2, p = 0.04), indicating a lower SCL reactivity for participants who reported suffering from tinnitus. The linear and quadratic terms were not significant.

Table 2
Significant effects as indicated by t-value and confidence intervals in bold. IL20-50 indicates the contrast between IL20 % and IL50 %, IL50-80 the contrast between IL50 % and IL80 %. Tinnitus refers to self-reported suffering from tinnitus (complaints). Sex was coded as Female = 0, Male = 1. BMI = body mass index; HR = heart rate; PEP = pre-ejection period; PPD = peak pupil dilation; PTA = pure tone average; RSA = respiratory sinus arrhythmia; SCL = skin conductance level.

Subjective measures
Fig. 7 displays the mean ratings per condition for the four subjective measures and Fig. 8 shows the ratings as a function of PTA per condition. There was a main effect of PTA (F(1, 125) = 7.3, p < 0.01) and condition (F(2, 250) = 97.4, p < 0.01) on subjective effort, as well as an interaction between the two fixed factors (F(2, 250) = 3.3, p = 0.04). Table 3 and Fig. 8A indicate that PTA affected the subjective effort predominantly in the IL80 % condition, with higher effort ratings for participants with lower hearing acuity. Additionally, PTA asymmetry also showed a significant effect (F(1, 125) = 5.4, p = 0.02), indicating higher subjective effort with increased PTA asymmetry. Tinnitus complaints did not significantly affect subjective effort. Subjective performance was significantly affected by condition (F(2, 250) = 125.7, p < 0.01), but not PTA and there was no interaction. Participants rated their performance higher with increasing intelligibility level (see Table 3). PTA asymmetry and tinnitus complaints did not reveal a significant effect. There was no significant effect of PTA or PTA asymmetry on subjective difficulty. There was a significant effect of condition (F(2, 250) = 104.1, p < 0.01) as well as an interaction between PTA and condition (F(2, 250) = 7.5, p < 0.01). The post hoc contrasts indicated only a significant interaction between PTA and IL20 % vs. IL50 % (see Table 3), suggesting that PTA affected the subjective difficulty for IL20 % differently compared to IL50 %. There was no interaction between PTA and IL50 % vs. IL80 %, which implies that the relationship between PTA and subjective difficulty was similar for IL50 % and IL80 %. Together, this suggests that PTA showed a positive relationship with difficulty ratings for IL50 % and IL80 %, but not for IL20 %, which was rated overall difficult by all participants, independent of hearing acuity. The effect of tinnitus complaints on subjective difficulty was not significant. The LME model for the subjective tendency to give up indicated a significant effect of condition (F(2, 250) = 45.0, p < 0.01), but not PTA and no interaction. PTA asymmetry and tinnitus complaints were also not significant. The follow-up assessment indicated a higher tendency to give up for IL20 % compared to IL50 % and for IL50 % compared to IL80 % (see Table 3). Although the omnibus test did not reveal a significant interaction, the post hoc tests showed a significant interaction between PTA and IL20 % vs. IL50 % (see Table 3), indicating a dissimilar relationship between PTA and subjective tendency to give up for IL20 % and IL50 %. Specifically, participants with a better hearing acuity reported a stronger tendency to give up in the IL20 % condition, whereas this relationship was reversed for IL50 % and IL80 % (see also Fig. 8D).

Table 3
Linear mixed models post hoc test results of the SRT and subjective outcome measures, with estimates (standard error) and the 95 % confidence interval (CI). Significant effects as indicated by t-value and confidence intervals in bold. IL20-50 refers to the contrast between IL20 % and IL50 %, IL50-80 the contrast between IL50 % and IL80 %. Tinnitus refers to self-reported suffering from tinnitus (complaints). PTA = pure tone average; SRT = speech reception threshold in dB SNR.

Correlations
Table 4 summarizes the Pearson correlation coefficients of the physiological responses and subjective ratings per condition. PPD correlated positively with SCL reactivity in IL50 % and IL80 %, but not IL20 %, indicating that PPD increases as SCL reactivity increases when listening demand is not too high. Additionally, PPD showed a weak negative correlation with PEP reactivity in the IL50 % condition, suggesting that the PPD becomes larger as PEP reactivity is more negative (indicating increased SNS reactivity) in moderate listening conditions. HR reactivity correlated negatively with RSA reactivity in all conditions, indicating that an increase in heart rate, when it occurred, was associated with a decrease in RSA. HR reactivity correlated positively with SCL reactivity in IL20 % and IL80 %, showing that an increase in heart rate was associated with an increase in SCL reactivity. In all conditions, SRT showed a weak to moderate negative correlation with PPD, indicating that PPD decreases with increasing (worse) SRT. SRT did not show any significant relationship with the other physiological measures and only correlated weakly with subjective difficulty in IL20 % and subjective effort and performance in IL50 %. Among the subjective measures, the correlations are all moderate to high, except between tendency to give up and the other three ratings in the IL20 % condition. Overall, the subjective ratings showed no correlations, or weak correlations in some conditions, with the physiological responses. Tendency to give up did not correlate with any of the physiological responses in any condition.
To account for possible physiological differences at baseline based on PTA, the relationship between the physiological baseline levels and PTA was assessed. This was also assessed for tinnitus complaints. Baseline PEP levels correlated negatively with PTA (r = −0.20, p = 0.03), indicating lower PEP baseline level (higher SNS activity) with worse hearing acuity (see Supplementary Fig. 1). None of the other correlations between baseline physiological measures and PTA or tinnitus complaints were significant (all p > 0.05).

Discussion
The current study aimed to assess the effect of hearing acuity on various ANS measures over different levels of listening demand as well as the relationships between these measures. To this end, we applied an adaptive SRT test and measured HR, RSA, PEP, and SCL reactivity and PPD at three levels of intelligibility: 20 %, 50 %, and 80 % accuracy. Additionally, participants' subjective ratings of effort, performance, difficulty and tendency to give up were measured at each condition. We hypothesized that the physiological reactivity would have a non-monotonic relationship with listening demand with the highest response at moderate intelligibility levels and that hearing acuity affects these physiological responses. Below, the main findings will be discussed with regard to the influence of hearing acuity and tinnitus complaints on physiological reactivity and subjective ratings to listening demand. We do so separately for each of the measures employed, and also discuss the relationships between these measures.
In accordance with previous studies (e.g., Zekveld et al., 2011), worse hearing acuity was associated with higher (worse) SRTs and the SRTs increased with increasing intelligibility level. This further supports that increasingly favorable SNRs are needed with poorer hearing acuity to reach the same speech reception accuracy, despite signal level adjustments to accommodate hearing levels (Plomp, 1978). Interestingly, participants who reported suffering from tinnitus complaints showed higher SRTs, which was likely mediated by its association with worse hearing acuity.

Physiological measures
The PPD showed a quadratic relationship with listening demand, supporting the expected non-monotonic relationship between effort and listening demand. This also coincides with the studies of Ohlenforst et al. (2017) and Wendt et al. (2018), in which pupil dilation was found to be largest around 50 % intelligibility levels. Additionally, there was a negative association between PTA and PPD, where PPD was smaller for those with worse hearing acuity. This confirms the hypothesized decrease in PPD with worse hearing acuity and is consistent with previous research on pupil dilation in HH adults (e.g., see Kramer et al., 2016; Ohlenforst et al., 2017; Zekveld et al., 2023). However, we did not replicate the finding that participants with worse hearing acuity showed lower PPD specifically in the difficult listening situations (Zekveld et al., 2011), as would have been indicated by an interaction between PTA and condition. Zekveld and colleagues (2011) found such an interaction between pupil dilation and the relative SNR in the listening condition with an intelligibility level of 50 %, whereas the current study focused on condition effects. Interestingly, participants suffering from tinnitus also showed a smaller PPD during the listening task. Similar findings have been reported by Juul Jensen et al. (2018) and Sendesen et al. (2023).
Unexpectedly, we did not find an overall increase in heart rate, but rather a decrease compared to the resting baseline. However, this decrease was relatively small (e.g., see Charles and Nixon, 2019 for an overview), approximately one beat per minute on average. Although the LME results indicated a significant effect of condition, the contrasts in the post hoc analyses were not significant. Furthermore, there was a linear relationship between condition and HR reactivity including an interaction with PTA. Fig. 6A suggests this interaction is due to a stronger negative relationship between PTA and HR reactivity for IL80 % compared to the other intelligibility levels. This would suggest a larger HR decrease from baseline with worse hearing acuity in the easier condition. However, again, this difference is relatively small (around 1 bpm) and may be considered negligible.
The results of the current study did not indicate an effect of hearing acuity or intelligibility level on RSA reactivity. Participants' cardiac PNS activity was expected to decrease with increasing listening effort, indicated by a decrease in RSA compared to baseline. Considering the large variation in RSA reactivity levels, it is possible that RSA reactivity may not be sensitive enough to the SRT task to be able to detect differences in cardiac PNS responses. Previous studies that used various HRV measures during listening tasks are approximately equally divided in finding no effect (Cvijanović et al., 2017; Mackersie et al., 2015 [NH]; Plain et al., 2021; Slade et al., 2021) or a decrease with increasing listening demand (Mackersie and Calderon-Moultrie, 2016; Mackersie and Kearney, 2017; Mackersie et al., 2015 [HH]; Seeman and Sims, 2015). Furthermore, the results did not indicate an effect of tinnitus on RSA reactivity. No other study has yet reported on possible effects of tinnitus on PNS reactivity to listening demand. More research is needed to assess the sensitivity of RSA (and other HRV measures) and for optimizing the experimental tests or task design for detecting potential PNS changes to listening demand.
PEP reactivity showed a non-monotonic relationship with intelligibility level, with the strongest reactivity decrease (PEP decrease compared to baseline) in the IL50 % condition, in accordance with the hypothesized relationship. The current PEP reactivity results correspond to the findings of Slade et al. (2021), who also found a non-monotonic relationship between PEP reactivity and listening demand in NH adults. However, they are in conflict with the linear relationship between SNR and PEP reactivity found by Plain et al. (2020). One possible explanation for this discrepancy is the difference in sample characteristics, as Plain et al. (2020) used younger NH participants. The current study is the first to report on the effect of a wide range of hearing acuity levels and tinnitus on PEP. Although there were no effects of hearing acuity or tinnitus on PEP reactivity, baseline PEP levels were negatively correlated with PTA, suggesting higher SNS baseline activity for those with worse hearing acuity. This may suggest that these individuals were more stressed in anticipation of the test session.
Although not significant, Fig. 6D suggests a trend of decreasing SCL reactivity with lower hearing acuity in all conditions, which would indicate a lower SNS activation compared to baseline with worse hearing acuity. Reactivity of EDA measures to listening demand has shown inconsistent results in previous studies including HH participants (Holube et al., 2016; Mackersie et al., 2015). The lack of significant effects of intelligibility level or hearing acuity on SCL reactivity could be because the SRT task may not have been stressful enough to induce significant electrodermal changes. For example, in the study of Mackersie and Kearney (2017), SCL only increased with task demand if evaluative threat was high. Interestingly, however, SCL reactivity was lower for participants who reported suffering from tinnitus complaints. In one other (pilot) study, SCL has been used as a measure of listening effort in NH tinnitus patients (Cartocci et al., 2023). Here, participants with higher levels of tinnitus-related distress displayed lower SCL reactivity to listening to an audiobook in noise at different SNRs. To the authors' knowledge, the aforementioned study of Cartocci et al. (2023) and the current study are the first to report on the association between tinnitus and lowered SCL (responses) during effortful listening tasks.

Subjective measures
Subjective measures were affected by intelligibility level as expected: self-rated effort, difficulty and giving up increased and subjective performance decreased with decreasing intelligibility level. Interestingly, specifically in the IL80 % condition, participants with worse hearing acuity still rated their listening effort higher compared to participants with better hearing. This indicates that they felt like they still had to expend a lot of effort despite the more favorable intelligibility level, while participants with better hearing indicated that listening effort reduced to a larger extent in this condition compared to the more demanding conditions. Additionally, those with a better hearing acuity reported a higher tendency to give up when intelligibility level was lowest. Contrary to the non-monotonic relationship hypothesized for the physiological responses, self-reported listening effort generally increases linearly with listening demand (Decruy et al., 2020; Zekveld and Kramer, 2014), which was replicated in this study. Participants tend to take into account their perceived difficulty and performance when rating their effort (Moore and Picou, 2018): the more difficult a task, the higher their subjective effort, even when their tendency to give up increases. This is also supported by the correlations between the subjective measures in this study: in all conditions, subjective effort increased with decreasing subjective performance and increasing subjective difficulty. For IL50 % and IL80 %, subjective effort also increased with increasing tendency to give up.

Relationships between measures of listening effort
Overall, we found a number of weak to moderate correlations between the physiological responses and between the physiological and subjective measures. The correlations between the physiological measures in particular were relatively weak, but the subjective measures showed moderate to high correlations with each other, except in the IL20 % condition. This corresponds to previous research on listening effort using multiple physiological measures (Keur-Huizinga et al., 2024; Shields et al., 2023). For example, Alhanbali et al. (2019) applied pupil size, electroencephalographic alpha power, skin conductance, and self-reported listening effort as measures of listening effort, which showed no or weak correlations. They argued that these measures represent listening effort differently. Potentially, different physiological systems are differentially sensitive to effortful listening tasks.
The weak to moderate negative correlations between SRT and PPD in all conditions are in line with the effects of PTA on both SRT and PPD, which may have contributed to their negative correlation. Interestingly, SRT correlated with PPD in all conditions, further supporting the sensitivity of PPD to effortful listening tasks. PPD only correlated with SNS measures (PEP and SCL), but not RSA in any of the conditions. Additionally, the SNS measure PEP was sensitive to listening demand manipulations, whereas the PNS measure RSA was not. The present findings are supported by the study of Slade et al. (2021), who also found the SNS measures to be sensitive to listening demand, whereas the PNS measures were not. Furthermore, the PNS and SNS specific measures used in this study were not sensitive enough to hearing acuity, except for PEP baseline levels.
Contrary to the generally expected increase in listening effort with worse hearing acuity, pupil dilation responses are consistently shown to be decreased for HH individuals (e.g., Zekveld et al., 2011, 2023). A potential explanation for the smaller PPD with worse hearing acuity is decreased listening effort due to earlier disengagement. Note that the subjective ratings on tendency to give up were not related to hearing acuity for the IL80 % and IL50 % conditions, and were even smaller for those with worse hearing acuity in the IL20 % condition. This indicates that the smaller pupil dilation response is not accompanied by hearing-related subjective notions of withdrawal or disengagement. However, as subjective ratings of withdrawal are not necessarily valid markers of disengagement (e.g., Moore and Picou, 2018; van de Mortel, 2008), the absence of a relationship between hearing acuity and the subjective tendency to give up does not necessarily imply that individuals did not disengage to some extent. In any case, due to the adaptive procedure, performance levels were matched, so all individuals were reaching similar performance levels.
One could argue that individuals with hearing loss and those suffering from tinnitus have a higher overall sympathetic tone due to daily stress and fatigue from their hearing loss and/or tinnitus, resulting in a flattened SNS response. Such an attenuated SNS response would result in smaller pupil dilation responses, as SNS activation is associated with pupil dilation. The current study aimed to simultaneously record SNS and PNS specific measures with pupil dilation responses to effortful listening to not only investigate the sensitivity of these SNS and PNS measures to listening demand, but also to provide a potential indication of how each system contributes to the physiological responses to listening. It is important to note, however, that the PPD and SNS/PNS measures were recorded from different modalities, meaning that the current results do not represent direct SNS or PNS activity on pupil size. No other study has yet reported on the effect of hearing acuity on the simultaneous recording of pupil dilation with both SNS and PNS measures during effortful listening.
Both PPD and SCL reactivity were associated with tinnitus complaints, with lower responses to the listening task in participants suffering from tinnitus. Together with the hearing acuity results, this suggests that the smaller PPD with lower hearing acuity and with tinnitus complaints is potentially related to attenuated SNS activation. Previous studies have found that HH individuals show higher overall cortisol levels (Kramer et al., 2016) and overall increased SCL reactivity (speech in noise vs. quiet) (Mackersie et al., 2015). Tinnitus patients also tend to show lower HR reactivity (Betz et al., 2017), lower salivary alpha-amylase responses during mental arithmetic tasks (Alsalman et al., 2016) and higher overall cortisol levels (Hebert and Lupien, 2009). If HH individuals and tinnitus patients had a higher overall sympathetic tone, one would expect the SNS baseline levels to be increased. Although the SCL baseline levels did not correlate with hearing acuity, PEP did, indicating a higher cardiac sympathetic activity level during the baseline for those with worse hearing acuity. However, this is opposed to the study by Wang et al. (2018), who concluded that the lower pupil response in HH individuals is due to smaller PNS inhibition compared to NH. They compared the pupil responses to speech perception in light and dark conditions, based on the theory that the PNS inhibitory activity is minimal in darkness, as the pupillary sphincter muscles are relaxed (Loewenfeld and Lowenstein, 1993). The HH subjects showed smaller differences between the pupil responses in light and dark conditions compared to NH subjects. Although decreased PNS inhibition and decreased SNS reactivity are not mutually exclusive, the current data do indicate different key factors for the decreased pupil dilation responses. However, it is important to reiterate that SNS and PNS activity as measured from one organ does not necessarily represent uniform SNS/PNS activity levels, as they can be organ specific (e.g., Esler et al., 1988). It is recommended to interpret the results of the present study with a multimodal approach as well.
The current study is the first to investigate the relationship between pupil dilation and cardiovascular responses (PEP and RSA) in the context of listening effort. The results showed that PPD correlated negatively and weakly with PEP reactivity, but did not correlate with RSA reactivity in any of the conditions. Aliakbaryhosseinabadi et al. (2023) used both pupil dilation and HF-HRV as measures of communication difficulty, but did not look at the relationship between these measures. Pupil dilation and HRV have also been used simultaneously in a number of studies on emotion regulation and mental workload, in which these measures also did not correlate (Vanderhasselt et al., 2015). To the authors' knowledge, there are currently no other reports on the association between PEP and pupil dilation.

Limitations
One of the limitations of the current study was that individuals with severe hearing loss were not able to complete the SRT test due to technical limitations of the test. This means that, despite the wide range of participants' hearing acuity, people with severe hearing loss are not well represented in our sample, restricting the generalizability of our findings. The number of male participants was also substantially lower than that of females, possibly affecting the effects of sex in the statistical models and also the generalizability of the results. The current study used short sentences to induce listening effort, which does not represent communication and listening effort in daily life. Nevertheless, sentences have more face validity than digits or words, and were chosen to efficiently reach the fixed intelligibility levels. In addition, sentence repetition allowed for testing the participants' ability to correctly perceive the target sentences, whereas stimuli with much longer duration do not. Additionally, although PPD seems to be the most sensitive measure to intelligibility level changes and hearing acuity, this measure was acquired on a trial-by-trial basis, whereas the other physiological measures were assessed in a block-wise manner. This means that the cardiovascular and SCL measurements also include the intervals in which participants repeated the sentences and in which their response was scored by the experimenter, whereas PPD does not. Furthermore, the current study design included one baseline interval prior to all conditions, rather than a resting baseline interval prior to each task condition. Despite randomization, this may have led to carry-over effects, possibly affecting the strength of the current effects. A single baseline interval prior to task onset was chosen in consideration of the full duration of participation and the burden on the participant.
It is also necessary to highlight the exploratory nature of the correlation analysis. The benefit of this analysis is that it shows the associations between the different physiological measures that are all assumed to be associated with effort and ANS activity in response to difficult listening. As the correlation coefficients were determined between each pair of outcome measures per condition, this resulted in a large number of tests. Testing correlation coefficients on the data averaged across conditions would reduce the number of tests. However, it is recommended to test for correlations per condition (Keur-Huizinga et al., 2024). Due to the resulting increased probability of type-I errors, the significance of the correlation coefficients should be interpreted with caution, and the strength of the correlation coefficient should also be taken into account. However, merely evaluating the strength of the correlations to quantify significance level is also considered arbitrary to some extent (Schober et al., 2018).
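To make the type-I error concern concrete, the following minimal Python sketch computes how the probability of at least one false positive grows with the number of tests, under the simplifying assumption that the tests are independent. It also shows a Holm step-down adjustment as one possible way to account for multiple tests. The function names, the test counts, and the p-values are hypothetical and chosen for illustration only; the study does not report applying this or any other correction.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def holm_adjust(p_values: list[float]) -> list[float]:
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce that adjusted p-values never decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical numbers of tests, not the count used in the study.
    for n in (1, 10, 45):
        print(n, round(familywise_error_rate(0.05, n), 3))
    # Hypothetical p-values for illustration.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

Because correlations between physiological measures recorded in the same participants are unlikely to be independent, the first function should be read as a rough indication of the inflation rather than an exact rate.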
Lastly, the three intelligibility levels used in the SRT test were carefully selected to capture relevant sections of the psychometric function (Festen and Plomp, 1990; Ohlenforst et al., 2017) and the hypothesized non-monotonic relationship between listening demand and physiological responses. Due to feasibility concerns, it was not possible to cover the entire spectrum of listening demand and effort. If SCL and RSA follow a different pattern of reactivity across listening demand conditions, a broader range of conditions may be needed to detect this. Although PPD and PEP showed a non-monotonic relationship with intelligibility level as expected, it is impossible to infer from the current results how this relationship is shaped exactly. For example, the biggest difference between the MIT and inverted-U shaped functions is what happens when demand increases after maximum effort is reached: according to the MIT, effort drops immediately as individuals disengage, whereas the inverted-U shaped function illustrates a gradual decline of effort. For future studies it is therefore recommended to implement more levels of intelligibility or SNRs to be able to make this distinction, and to analyze whether the drop in engagement differs between individuals.
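The distinction between the two functions can be illustrated with a toy sketch in Python. Both parameterizations below are assumptions made purely for illustration: effort under the MIT is modeled as tracking demand up to a maximum and then dropping to zero, and the inverted-U is modeled as a simple quadratic. Neither function, nor any of the parameter names or values, is taken from the study or fitted to its data.

```python
def effort_mit(demand: float, max_effort: float) -> float:
    """Toy MIT-style function: effort follows demand, then drops to zero.

    Once demand exceeds the maximum effort a listener is able or willing
    to invest, the listener disengages and effort falls immediately.
    """
    if demand < 0:
        raise ValueError("demand must be non-negative")
    return demand if demand <= max_effort else 0.0


def effort_inverted_u(demand: float, peak_demand: float, half_width: float) -> float:
    """Toy inverted-U function: effort declines gradually past its peak.

    Effort is 1.0 at peak_demand and reaches 0.0 at peak_demand +/- half_width.
    """
    scaled = (demand - peak_demand) / half_width
    return max(0.0, 1.0 - scaled ** 2)


if __name__ == "__main__":
    # Hypothetical demand levels in arbitrary units; both functions peak at 1.0.
    for demand in (0.5, 1.0, 1.2, 1.6):
        print(
            demand,
            effort_mit(demand, max_effort=1.0),
            round(effort_inverted_u(demand, peak_demand=1.0, half_width=1.0), 2),
        )
```

In this sketch the two functions only diverge clearly at demand levels just beyond the peak, which is why several closely spaced conditions in that region would be needed to tell them apart.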

Implications
The current study suggests attenuated SNS responses due to higher SNS baseline levels with worse hearing acuity. This is in line with higher subjective listening effort and fatigue for individuals with hearing loss (Beechey et al., 2020; Holman et al., 2021). Prolonged increased levels of sympathetic activity can significantly impact one's health and quality of life. For instance, increased SNS and decreased PNS activity in daily life are associated with metabolic dysregulations (Licht et al., 2010). Numerous studies have indicated that enhanced SNS activity leads to increased risk of high blood pressure and other cardiovascular issues (e.g., see Bruno et al., 2012; Grassi et al., 2015; Hering et al., 2015). This also applies to individuals suffering from tinnitus. It is therefore important in clinical audiology to measure beyond the hearing thresholds indicated by the audiogram and take into account the potential health risks of increased listening effort, chronic stress, and prolonged enhanced sympathetic activity in daily life if no countermeasures are applied. Consequently, it is crucial to make HH individuals aware that they may have to compensate for their hearing loss.
The PPD was the most sensitive measure among the physiological measures to intelligibility level, hearing acuity, and tinnitus complaints. This is not surprising considering previous research using physiological measures as indicators of listening effort (for overviews, see Keur-Huizinga et al., 2024; Shields et al., 2023). Of the SNS- and PNS-specific measures in the current study, the SNS measure PEP (reactivity) was seemingly the most sensitive, at least to intelligibility level and hearing acuity during baseline. Based on the current study design and results, pupil dilation and PEP are recommended both for studies using physiological measures as indicators of listening effort and for studies investigating the effect of hearing acuity on such measures. As stated earlier, the results of this study support the conclusions by Alhanbali et al. (2019) that measures of listening effort are multidimensional and should not be applied interchangeably. Furthermore, the results support FUEL, which states that there is a non-monotonic relationship between listening effort and demand and that various mechanisms underlie listening effort (Pichora-Fuller et al., 2016). Pichora-Fuller et al. (2016) also posed the question whether it is necessary to combine tests. Considering the weak correlations between measures and their varying sensitivity to listening demand and hearing acuity, the current study suggests it is advantageous to combine multiple (physiological) measures. There is no gold standard among markers of listening effort, and different physiological measures are likely to represent different aspects of listening effort. It is therefore important to develop a multimodal approach to listening effort by applying multiple physiological measures and to provide a rationale for the use of such measures as markers of listening effort (Alhanbali et al., 2019; Richter et al., 2023; Strand et al., 2020).
Additionally, the current study showed effects not only of hearing acuity on physiological responses to effortful listening, but also of tinnitus complaints. Currently, the literature on such physiological responses in tinnitus patients is limited. It is recommended for future studies on hearing loss to additionally assess suffering from tinnitus complaints, as this may have an additional and/or confounding effect on the physiological measures.

Conclusion
This study investigated the effect of hearing acuity on physiological responses to different levels of listening demand as well as subjective measures of effort and the relationships among these outcome measures. PPD and PEP reactivity followed a non-monotonic relationship with listening demand. PPD decreased with decreasing hearing acuity, and participants suffering from tinnitus complaints showed smaller PPD and SCL reactivity, which are likely predominantly associated with attenuated SNS activation. PPD was predominantly associated with SNS measures, and worse hearing acuity was associated with higher SNS activity levels at baseline. Together, the results indicate that pupil dilation responses to effortful listening may be primarily affected by SNS activation. Additionally, the participants with worse hearing acuity still rated their effort to be high in the relatively easy condition and were less likely to disengage in the difficult listening situation compared to participants with better hearing. The results of the current study provide evidence that hearing loss and suffering from tinnitus are associated with altered physiological reactivity to effortful listening, specifically for PPD and SNS measures. Altered physiological responses may be associated with general health in HH individuals experiencing increased effortful listening in daily life. It is recommended for future studies to examine how these findings obtained in the lab translate to findings obtained in real life.

Fig. 1. Average pure-tone air-conduction thresholds at each frequency for the best and worse ear. Error bars indicate standard deviation.

Fig. 4. Average pupil dilation response during listening to the target speech per intelligibility level (IL) condition. Responses are relative to baseline (between −1 and 0 s). Shaded areas indicate the 95 % confidence interval.

Fig. 6. Physiological reactivity (compared to resting baseline) at each intelligibility level (IL) condition as a function of the pure-tone average (PTA) threshold of the best ear for heart rate (HR; A), respiratory sinus arrhythmia (RSA; B), pre-ejection period (PEP; C), and skin conductance level (SCL; D). Higher PTA indicates worse hearing acuity.

Fig. 7. Subjective ratings at each condition for effort (A), performance (B), difficulty (C), and tendency to give up (D). Error bars indicate standard error.

Fig. 8. Subjective ratings at each intelligibility level (IL) condition as a function of the pure-tone average (PTA) threshold of the best ear for effort (A), performance (B), difficulty (C), and tendency to give up (D). Higher PTA indicates worse hearing acuity.

Table 2
Linear mixed models post hoc test results of the physiological responses, with estimates (standard error) and the 95 % confidence interval (CI).