Altered event-related potentials and theta oscillations index auditory working memory deficits in healthy aging

Speech comprehension deficits constitute a major issue for an increasingly aged population, as they may lead older individuals to social isolation. Since conversation requires constant monitoring, updating and selecting information, auditory working memory decline, rather than impoverished hearing acuity, has been suggested as a core factor. However, in stark contrast to the visual domain, the neurophysiological mechanisms underlying auditory working memory deficits in healthy aging remain poorly understood, especially those related to on-the-fly information processing under increasing load. Therefore, we investigated the behavioral costs and electrophysiological differences associated with healthy aging and working memory load during continuous auditory processing. We recorded EEG activity from 28 younger (∼25 years) and 29 older (∼70 years) participants during their performance on an auditory version of the n-back task with speech syllables and 2 workload levels (1-back; 2-back). Behavioral measures were analyzed as indices of function; event-related potentials as proxies for sensory and cognitive processes; and theta oscillatory power as a reflection of memory and central executive function. Our results show age-related differences in auditory information processing within a latency range that is consistent with a series of impaired functions, from sensory gating to cognitive resource allocation during constant information updating, especially under high load.


Introduction
One of the hallmarks of cognitive aging is working memory (WM) decline. WM can be defined as a set of cognitive processes that allow one to update, maintain and manipulate information over a short period of time (Baddeley, 2003; Cowan, 1999). It is considered a critical function for successfully executing adaptive, goal-oriented behavior (Baddeley, 2003; Baddeley and Hitch, 1974; Goldman-Rakic, 1996). WM decline is indicated as a main factor underlying deficits in learning, reasoning, planning, direction of attention, task goal maintenance, decision making and inhibition of irrelevant information (Baddeley, 2003; Glisky, 2007; Lubitz et al. , Merikle, 1996). This seems rather unsurprising as, during conversation, speech comprehension requires information processing supported by WM functions associated with the central executive component (Baddeley, 2012), such as focused attention and constant monitoring, updating and maintenance of relevant (and inhibition of irrelevant) information.
Traditionally, WM functions have been studied employing mainly 2 different types of tasks: item-recognition tasks, such as Sternberg's WM task ( Sternberg, 1966 ), which involve rehearsal and storage of perceived items and are considered to reflect maintenance functions; and information manipulation tasks which involve, in addition to maintenance, cognitive functions related both to processing and continuous dynamic updating of information. An example of the latter type is the n-back task ( Kirchner, 1958 ), which consists in deciding whether the current item from a set of presented items in succession matches the one presented n items before. The n-back task thus relies on functions related to executive processes ( Jablonska et al. 2020 ), such as continuously updating a rehearsal set of maintained information while taking a decision on each item, regardless of whether a response is executed to a match or withheld to a non-match (i.e., all stimuli are relevant), and hence better reflects the requirements of successful speech comprehension than WM maintenance tasks.
However, while the neurophysiological bases of healthy aging-related changes in WM functions reflecting the central executive component have extensively been studied in the visual domain (Daffner et al., 2011; Falkenstein, 2014, 2018; Jonides et al., 1997; Lubitz et al., 2017; McEvoy et al., 2001; Missonnier et al., 2004; West and Bowry, 2005; Wild-Wall et al., 2011), they have been overlooked in the auditory domain, where research has favored tasks mostly addressing storage capacity, such as Sternberg's task and Delayed Match-to-Sample Tasks (DMTS; Chao and Knight, 1997; Golob and Starr, 2000; Karrasch et al., 2004; Pelosi and Blumhardt, 1999; Pratt et al., 1989). Although equivalent visual-verbal and auditory-verbal WM tasks recruit the activity of overlapping brain regions due to the supramodal characteristics of WM, these different modalities are associated with different activity patterns as well (Rodriguez-Jimenez et al., 2009). Thus, there is a lack of evidence on how a healthy aged brain processes continuous auditory information under different levels of workload.
The present study was hence designed to address the effects of healthy aging and auditory WM (AWM) load during continuous information processing. We asked younger (∼25 years) and older (∼70 years) participants to perform an auditory n-back task (Kirchner, 1958) with speech syllables and 2 workload levels (1-back; 2-back; Fig. 1) while recording their electroencephalographic (EEG) activity. We analyzed performance as an indicator of AWM function; event-related potentials (ERPs) as measures of sensory and cognitive processes; and oscillatory power in the theta band during continuous performance (∼6 Hz) as a measure of memory and central executive function (Sauseng et al., 2010). Furthermore, we correlated behavioral indices of the n-back task with frontal theta power differences across workload levels in order to explore the link between performance and central executive activity (Finnigan and Robertson, 2011). According to auditory WM literature and visual n-back task aging studies, we expected to find: (1) a decrease in performance with workload, especially affecting older participants (Bopp and Verhaeghen, 2020); (2) an age-related increase in early sensory ERPs, as a result of placing more weight on the early processing of incoming auditory stimulation, in line with accounts of impaired "sensory gating" (Chao and Knight, 1997); (3) an age-related decrease of late frontal and parietal ERPs, particularly the Sustained Frontal Negativity (SFN), related to sustained auditory attention during item retention in auditory DMTS tasks (Chao and Knight, 1997), and the P3b, related to cognitive resource allocation, among other functions (Chao and Knight, 1997; Polich, 2007); (4) a workload-related increase of the SFN, at least in young adults, as seen in auditory n-back tasks (Alain et al., 2009) and DMTS (Guimond et al., 2011; Lefebvre et al., 2013; Alunni-Menichini et al., 2014); workload effects on the P3b usually show a reduction of this component (Polich, 2007), albeit results in visual and auditory n-back tasks are inconsistent (e.g., Daffner et al., 2011; Gajewski and Falkenstein, 2014; Lubitz et al., 2017; Saliasi et al., 2013; Arjona-Valladares et al., 2021); (5) an age-related decrease and a workload-related increase in frontal theta power, correlating with performance, indexing central executive and memory function (Cummins and Finnigan, 2007; Jensen and Tesche, 2002); and (6) considering that older participants may reach supra-capacity levels at high load, finding the task too difficult to continuously apply full cognitive effort (Gajewski and Falkenstein, 2014; Van Snellenberg et al., 2015), we may find the direction of workload effects in late ERPs (SFN and P3b) and theta power measures to be opposite in older (decrease) as compared to younger (increase) participants.

Materials and methods
The study was approved by the local Ethical Commission at the University of Social Sciences and Humanities in Warsaw, Poland (permission no 3/II/11-12), conforming to the Code of Ethics of the World Medical Association (Declaration of Helsinki). All participants gave their written informed consent prior to the study. The data that support the findings of this study and the code used for data analysis are available upon reasonable request to the authors.

Participants
Sixty-three participants were included in the study. However, due to large artefacts in the EEG signal, the results from 57 participants are reported here: 28 young adults from 21-31 years of age (Younger group; 16 females, 12 males; mean age = 25.39 years; SD = 3.03) and 29 older participants from 65-78 years of age (Older group; 23 females, 6 males; mean age = 70.17 years; SD = 3.38). All participants reported being healthy, with no history of head trauma, neurological or psychiatric diseases. Age groups were equivalent in education level and handedness (Edinburgh Handedness Inventory (Oldfield, 1971); all were right-handed; comparisons performed with independent samples t-tests; Table 1). All participants passed a screening audiometry ensuring a normal hearing level for frequencies from 250-3000 Hz (pure-tone average; ≤30 dB HL; Carhart, 1971; Kung and Willcox, 2007), which covered the main spectrum of the presented speech stimuli.
All older participants included in the study underwent testing with the Mini-Mental State Examination and filled in a Geriatric Depression Scale (Short Version) to screen for mental deterioration and depression. Inclusion criteria were a score above 26 points in Mini-Mental State Examination and less than 6 points in Geriatric Depression Scale. Moreover, for each of the older participants a geriatric examination conducted by a professional geriatrician was provided. It comprised a physical examination, an assessment of functional performance and an evaluation of currently taken medication. All candidates presented a stable condition and were not receiving medication that could affect the functioning of the nervous system.
The noticeable gender imbalance in the Older group is a consequence of an unsuccessful effort to promote the cooperation of older male individuals in the study. As previous research has shown gender effects in the neural correlates of auditory perception (e.g., men show greater activation in speech perception under noisy conditions than females; Kocak et al. , 2005 ), speech control (e.g., females exhibit faster N1 responses than males; Li, et al. , 2018 ) and cognition (e.g., females require longer monitoring conflict and response execution times; Melynyte et al. , 2017 ; females show greater prefrontal activation in AWM tasks than males; Goldstein et al. , 2005 ), we acknowledge that our results could be biased by this factor, possibly increasing the observed differences across age groups. Future studies focusing on the interaction between age and gender in the neural correlates of AWM should be performed to elucidate this matter.

Stimuli
A set of 30 syllables with a consonant-vowel structure, recorded from a professional female speaker (44.1 kHz sampling rate), served as stimuli. Each syllable consisted of 1 of 6 consonants (b, d, g, m, l and z) and 1 of 5 vowels (a, o, e, u, y; e.g., /ba/, /do/, /zu/...). The length of the syllables was matched with Adobe Audition 2.0 (Adobe Systems, CA, USA) to a total duration of 300 ms, including 5 ms rise/fall time. Syllables were delivered at 80 dB Sound Pressure Level (SPL; measured with Artificial Ear, Bruel & Kjaer, type 2250) using Presentation Software v14.9 (Neurobehavioral Systems Inc.) via E•A•RTone 5A insert earphones placed in the ear canal. All audio files can be downloaded from Supp. Audio Files (syllable_stimuli_WMtask.zip).

Auditory n-back task
An auditory version of the n-back paradigm with 2 experimental conditions was applied (Fig. 1). In the 1-back condition (low memory load), participants had to indicate whether the currently presented syllable matched the previous one. In the 2-back condition (high memory load), they had to indicate whether it matched the syllable presented 2 trials back. Syllables matching syllables presented n positions back are called targets, while the rest are called non-targets. Participants responded only to targets by pressing a button with their right index finger. Two blocks were presented per condition, with 150 non-target and 30 target stimuli each (a total of 300 non-targets and 60 targets per condition). Blocks were presented alternately and the order of the starting condition was randomized across participants. Syllables were presented randomly, with a stimulus-onset asynchrony of 2000 ms.
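The target definition above can be illustrated with a minimal sketch. The function name `label_targets` and the example sequence are hypothetical and only serve to show the n-back matching rule; they are not taken from the stimulus presentation scripts used in the study.

```python
def label_targets(sequence, n):
    """Mark each item as a target (True) if it matches the item n positions back.

    The first n items can never be targets because nothing precedes them
    at the required distance.
    """
    return [i >= n and sequence[i] == sequence[i - n]
            for i in range(len(sequence))]


# Hypothetical syllable stream used only for illustration.
stream = ["ba", "ba", "do", "ba", "do"]
one_back = label_targets(stream, 1)   # low memory load
two_back = label_targets(stream, 2)   # high memory load
```

The same stream yields different targets under the two conditions, which is why the participant must keep a rolling set of the last n syllables in memory.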

Procedure
Before the main experiment, each participant received information about the task and underwent a training session consisting of 2 practice blocks, 1 per condition ( 1-back; 2-back ). Each practice block contained 3 targets only but was otherwise identical to the experimental block. Participants had to complete the training session without errors in a maximum of 3 attempts for the actual experimental procedure to begin. During the experiment, participants were asked to look at a white fixation cross presented on a monitor placed at a distance of 60 cm in front of their eyes while listening to the stimuli.

Behavioral measures and analyses
Performance was assessed by measuring reaction times (RT) and the log-linear corrected sensitivity index (d') (Hautus, 1995). Behavioral measures (natural logarithm of RT in seconds, computed to normalize the typically skewed distribution of RT data; and d') were independently submitted to a 2-factor repeated measures analysis of variance (ANOVA) with 'Age' (Younger; Older) as between-subject factor and 'Workload' (1-back; 2-back) as within-subject factor (SPSS 23 software, IBM, NY, USA).
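As an illustration of the behavioral measures, the following minimal sketch computes d' with the log-linear correction, assuming its usual form (0.5 added to the hit and false alarm counts and 1 added to the number of target and non-target trials), together with the natural logarithm of RT. The function names are hypothetical; the study itself ran its statistics in SPSS.

```python
import math
from statistics import NormalDist


def loglinear_dprime(hits, n_targets, false_alarms, n_nontargets):
    """Sensitivity index d' using the log-linear correction.

    Adding 0.5 to each count and 1 to each trial total keeps the rates
    away from 0 and 1, where the z-transform would be infinite.
    """
    hit_rate = (hits + 0.5) / (n_targets + 1)
    fa_rate = (false_alarms + 0.5) / (n_nontargets + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def log_rt(rt_seconds):
    """Natural logarithm of a reaction time expressed in seconds."""
    return math.log(rt_seconds)


# Hypothetical participant: 54 hits out of 60 targets, 6 false alarms out of 300 non-targets.
example_dprime = loglinear_dprime(54, 60, 6, 300)
```

With 60 targets and 300 non-targets per condition, a perfect score (60 hits, 0 false alarms) still yields a finite d' thanks to the correction.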

EEG acquisition and analysis
EEG recording
EEG was continuously recorded by BrainVision Recorder© v.1.10 software (Brain Products, Germany) with a frequency bandpass of 0.1-100 Hz and digitized at a sampling rate of 1000 Hz. The signal was recorded from 32 Ag/AgCl active electrodes (ActiCAP, Brain Products, Germany) placed on a cap (EasyCap, Germany) according to the 10-20 system. The ground electrode was placed at AFz and the common reference electrode at FCz. All impedances were kept below 10 kΩ during the whole recording session, which lasted approximately 26 minutes (4 blocks * 6 minutes + short breaks).

EEG preprocessing
The initial preprocessing steps were performed in Brain Vision Analyzer v. 2.1 (Brain Products, Germany). Data were downsampled to 256 Hz, re-referenced to average (FCz electrode was reused for further analyses) and bandpass filtered at 0.5-70 Hz (zero-phase Butterworth filter with 24 dB/octave). Eye blinks and horizontal eye movements were removed using Independent Component Analysis (ICA). Only 2-3 clearly eye-related independent components were removed from each participant's data after visual inspection of their scalp topography and time course (Jung et al., 2000). EEGLab (Delorme and Makeig, 2004) and Fieldtrip (Oostenveld et al., 2011) toolboxes running under Matlab R2016a (Mathworks) were used for further analysis.

Event-Related Potential (ERP) processing
Data were epoched from -200 ms to 2000 ms with respect to each syllable onset and baseline corrected by subtracting the mean amplitude between -200 ms and 0 ms. Epochs containing improbable data 3 SD above or below the mean probability distribution of values across all epochs were excluded (EEGlab's function pop_jointprob.m). ERPs were obtained by averaging each participant's epochs separately for correctly answered targets and for non-targets, except for those triggering False Alarm responses (from now on, they will be simply termed targets and non-targets). Only participants who had a minimum of 40% (24) correct trials after artefact removal were included in the analysis. Accordingly, 4 participants from the older group and 2 participants from the younger group were excluded from the study, resulting in the total number of 57 participants reported here (28 younger and 29 older; see Participants). Descriptive statistics regarding the number of trials are detailed in Supp. Table 1. It should also be noted that button presses were included within the length of the epoch cut around target stimuli; however, the potential confounds in ERP data due to movements and motor preparation appeared minimal. Moreover, as we did not directly compare target and non-target stimuli, any potential confound in the results would appear only in the analyses of ERPs to target stimuli, and would be driven by an effect of age or workload on the motor responses. However, while RTs exhibited a large variability across groups and conditions (e.g., a difference of 200 ms between 1-back and 2-back in the older participants; Fig. 2), a visual inspection of the grand-averaged ERP data reveals that the latencies of the evoked ERP components were comparable (e.g., in the older participants, all components peak at similar latencies independently of workload; Fig. 3).
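The baseline correction and averaging steps can be sketched as follows for a single channel. This is a simplified illustration in plain Python, not the EEGLab/Fieldtrip pipeline used in the study; the function names and the convention that the baseline window includes -200 ms and excludes 0 ms are assumptions.

```python
def baseline_correct(epoch, srate_hz=256.0, epoch_start_ms=-200.0,
                     baseline_ms=(-200.0, 0.0)):
    """Subtract the mean pre-stimulus amplitude from every sample of one epoch."""
    step_ms = 1000.0 / srate_hz
    times = [epoch_start_ms + i * step_ms for i in range(len(epoch))]
    # Assumed convention: baseline window is [start, end), i.e. 0 ms is excluded.
    base = [v for t, v in zip(times, epoch)
            if baseline_ms[0] <= t < baseline_ms[1]]
    if not base:
        raise ValueError("epoch contains no samples inside the baseline window")
    mean_base = sum(base) / len(base)
    return [v - mean_base for v in epoch]


def average_erp(epochs):
    """Average equally long, baseline-corrected epochs sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


# Toy epoch: flat 2 uV offset before stimulus onset, 5 uV afterwards.
toy_epoch = [2.0] * 52 + [5.0] * 10
corrected = baseline_correct(toy_epoch)
```

At 256 Hz, the 200 ms baseline spans 52 samples, so the toy epoch above ends up at 0 before onset and at 3 afterwards.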

Time-frequency analysis
The n-back task requires constant monitoring of the presented stimuli. As the cognitive load increases, the amount of cognitive resources engaged in the task may also increase. As this is a continuous task, a conventional analysis of evoked oscillatory power, baseline-corrected with pre-stimulus activity, may conceal the differences between conditions and groups (i.e., if there were differences in oscillatory power between conditions and/or age groups unrelated to stimulus-evoked neural responses, applying a baseline correction would eliminate such differences). Therefore, in order to investigate the continuous effect of monitoring and cognitive load in aging, we focused on the temporal modulation of total power (single-trial based; no baseline correction) around all non-target syllables, using a wavelet-based time-frequency (TF) analysis.
Data were epoched from -750 ms to 2850 ms relative to stimulus onset. Epochs containing improbable data 3 SD above or below the mean probability distribution of values across all epochs were excluded (EEGlab's function pop_jointprob.m). The complex Fourier spectrum was obtained by convolving single trials with complex Morlet wavelets with a linearly increasing number of wavelet cycles from 3-12 as center frequencies ranged from 3-70 Hz, in 72 exponentially spaced frequency bins. Center frequencies were spaced exponentially to have a greater representation of the lower frequencies, as spectral bandwidth increases with frequency, and the EEG spectrum is best represented using a log scale (Buzsáki, 2004). We increased the number of wavelet cycles (in integer numbers) in order to balance the temporal and spectral resolution across the frequency range (so that at higher frequencies, spectral resolution is higher at an acceptable temporal resolution). Supp. Table 2 contains details of the wavelet analysis such as the center frequency, the number of wavelet cycles and the temporal and frequency resolution as computed in Tallon-Baudry et al. (1998). Estimates of total power, calculated by averaging the squared absolute values of the convolutions over trials, were computed according to the procedures described by Tallon-Baudry et al. (1996). Power estimates from each frequency bin were log transformed and multiplied by 10 so that comparisons across conditions or groups would be expressed in dB units, as: $\text{Relative difference (dB)} = 10 \log_{10}(\text{power}_1) - 10 \log_{10}(\text{power}_2)$. Furthermore, a subtraction of the oscillatory power in the 1-back condition from the 2-back condition was conducted in each participant to obtain the difference in oscillatory power between conditions in dB to test for Age x Workload interaction effects.
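The total power computation and the dB comparison can be sketched as follows for a single channel and a single center frequency. This is a plain-Python illustration, not the Matlab code used in the study. The function names are hypothetical, and the wavelet parameterization (temporal SD equal to the number of cycles divided by 2πf0, truncation at ±3 SD, unit-energy normalization) is an assumption; any constant scaling factor cancels out in the dB difference.

```python
import cmath
import math


def center_frequencies(f_min=3.0, f_max=70.0, n_bins=72):
    """Exponentially (log-) spaced center frequencies between f_min and f_max."""
    ratio = f_max / f_min
    return [f_min * ratio ** (k / (n_bins - 1)) for k in range(n_bins)]


def morlet(f0, n_cycles, srate):
    """Complex Morlet wavelet; sigma_t = n_cycles / (2*pi*f0) is an assumed convention."""
    sigma_t = n_cycles / (2.0 * math.pi * f0)
    half = int(math.ceil(3.0 * sigma_t * srate))      # truncate at +/- 3 SD
    norm = 1.0 / math.sqrt(sigma_t * math.sqrt(math.pi))
    return [norm * math.exp(-((k / srate) ** 2) / (2.0 * sigma_t ** 2))
            * cmath.exp(2j * math.pi * f0 * k / srate)
            for k in range(-half, half + 1)]


def total_power(trials, f0, n_cycles, srate):
    """Average over trials of the squared magnitude of each single-trial convolution."""
    w = morlet(f0, n_cycles, srate)
    n_valid = len(trials[0]) - len(w) + 1             # samples without edge effects
    if n_valid <= 0:
        raise ValueError("trials are shorter than the wavelet")
    power = [0.0] * n_valid
    for trial in trials:
        for t in range(n_valid):
            # Reversed wavelet index makes this a true convolution.
            acc = sum(trial[t + k] * w[-1 - k] for k in range(len(w)))
            power[t] += abs(acc) ** 2
    return [p / len(trials) for p in power]


def db_difference(power_1, power_2):
    """Relative difference in dB between two power estimates."""
    return 10.0 * math.log10(power_1) - 10.0 * math.log10(power_2)
```

Because power is averaged over trials after squaring, this estimate retains both phase-locked and non-phase-locked activity, which is the property the analysis relies on when no baseline correction is applied.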

Statistical analyses
Statistical analyses were performed using a mass-univariate non-parametric randomization procedure (Maris, 2004; Maris and Oostenveld, 2007) with a 2D (space-time) cluster correction to overcome the problem of multiple comparisons over a large group of electrodes and time samples. Neighbouring electrodes were defined using a Delaunay triangulation over a 2D projection of the electrode montage, which connects nearby electrodes independently of their physical distance. A minimum of 2 nearby electrodes was set per cluster. Two-dimensional (time, electrode) analyses were conducted on the ERP amplitudes (from 0-2000 ms). We statistically compared the activity drawn from 1 condition against the other (2-back vs. 1-back; all participants from both groups merged to test a main effect of Workload); from 1 group against the other (Younger vs. Older; both conditions merged to test a main effect of Age); and, to reveal Age x Workload interactions, we compared the memory load effect (2-back condition minus 1-back condition in each participant) between age groups (Younger vs. Older). For each comparison, the ERP amplitude at each time point and electrode underwent a 2-tailed dependent (Workload) or independent (Age & Age x Workload) t-test. The significance probability (p value) of the t statistic was determined by calculating the proportion of 2D samples from 10000 random partitions of the data (individual ERP data to compute main effects; individual ERP difference data across workload conditions to compute the interaction) that resulted in a larger test statistic than the observed test statistic (Monte Carlo method). Then, clusters were created by grouping adjacent 2D points exceeding a significance level set to 0.05 (2-tailed). A cluster-level statistic was calculated by taking the sum of the t-statistics within every cluster. The significance probability of the clusters was assessed with the described nonparametric Monte Carlo method. Values of p < 0.01, corrected for 2-tailed tests, were considered significant. For each significant cluster we report its temporal spread, cluster statistic and p value.
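The logic of the cluster-based randomization test can be sketched in a deliberately simplified form. The sketch below works on a single channel (clusters over time only), uses a pooled-variance independent t-test, and forms clusters with a fixed |t| threshold rather than sample-level Monte Carlo p values; these are all simplifying assumptions, and the function names are hypothetical. It is not the Fieldtrip implementation used in the study.

```python
import math
import random


def t_independent(a, b):
    """Pooled-variance independent-samples t statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    se = math.sqrt(ss / (na + nb - 2) * (1.0 / na + 1.0 / nb))
    return (ma - mb) / se if se > 0 else 0.0


def cluster_sums(tvals, thresh):
    """Sum t values within runs of adjacent same-sign supra-threshold samples."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > thresh) - (t < -thresh)      # +1, -1 or 0
        if s != 0 and s == sign:
            current += t                      # extend the running cluster
        else:
            if sign != 0:
                sums.append(current)          # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums


def cluster_permutation_test(group_a, group_b, thresh=2.0, n_perm=1000, seed=0):
    """Return (cluster statistic, Monte Carlo p value) for each observed cluster."""
    rng = random.Random(seed)

    def t_series(a, b):
        return [t_independent([s[i] for s in a], [s[i] for s in b])
                for i in range(len(a[0]))]

    observed = cluster_sums(t_series(group_a, group_b), thresh)
    pooled, na = group_a + group_b, len(group_a)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(pooled)                   # random partition of participants
        sums = cluster_sums(t_series(pooled[:na], pooled[na:]), thresh)
        null_max.append(max((abs(s) for s in sums), default=0.0))
    return [(s, sum(m >= abs(s) for m in null_max) / n_perm) for s in observed]
```

Comparing each observed cluster statistic against the distribution of the maximum cluster statistic across random partitions is what provides control over multiple comparisons; extending the sketch to 2D (time, electrode) only changes how adjacency is defined when forming clusters.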
Regarding TF power estimates, our main focus was the theta band (5-7 Hz center frequencies [CF]), as it has been related to memory and central executive function (Sauseng et al., 2010) and shown to be reduced in healthy aging (Cummins and Finnigan, 2007). However, we computed the TF transformation in a wider frequency range, as described above, to allow an exploratory analysis of effects on the delta (3 Hz CF), alpha (8-13 Hz CF), beta (14-30 Hz CF) and gamma (35-70 Hz CF) frequency bands, as previous literature attributed them a role in object representation maintenance in WM, attention and executive processing (Kambara et al., 2017; Palva et al., 2011; Von Lautz et al., 2017). Our first exploratory approach was to compute a hypothesis-blind, statistically strict 3D mass-univariate analysis with a cluster-based correction for multiple comparisons in frequency (3-70 Hz CF), channels (32) and time (0-2 seconds), following the procedures described above for the 2D analysis on ERP data (the only difference being that clustering is performed by grouping adjacent 3D samples). As this analysis yielded no significant clusters in the Age x Workload interaction, which is our main focus of interest because main effects of age and auditory workload have been discussed elsewhere (Kaiser, 2015; Karrasch et al., 2004), we performed a 2D analysis (time-electrode) for each defined frequency band (averaging over frequency bins). No significant Age x Workload interactions were found in any frequency band, except for theta. We then performed another exploratory analysis to ascertain whether the defined frequency bands were meaningful in light of our data. We computed an Age x Workload interaction on each channel-TF bin (independent t-test assessed with the non-parametric Monte Carlo method as described above; uncorrected; thresholded at p < 0.001), as shown in Fig. 7. The only meaningful interaction obtained was a midline frontal increase in theta power with increasing workload that was larger in the Younger versus the Older group. Therefore, as both the exploratory analysis and the theoretically grounded a priori hypothesis supported a major relation of frontal midline theta to WM modulations with workload and aging, for the sake of simplicity, only results in the theta band will be further reported.
A note of caution has to be considered given the particular statistical approach to analyze electrophysiological data used here. Opting for a data-driven, cluster-based statistical analysis effectively controlling for multiple comparisons ( Maris, 2004 ;Maris and Oostenveld, 2007 ) may lead to an overestimation of the onset and spatial extent of the effects ( Sassenhagen and Draschkow, 2019 ), so keen observation of the raw data pattern and avoiding precise time-space claims is crucial for interpretation. Albeit acknowledging these limitations, we are quite confident that our data can consistently relate to classic ERP components and frontal theta power, given the shapes and scalp topographies of the obtained waveforms and the breadth of our statistical results ( Figs. 3 and 4 ).

Brain-behavior correlation
Additionally, in order to explore the relation between task performance and continuous stimulus monitoring under different levels of workload, we performed robust correlational analyses between the difference in mean frontal theta power across conditions, extracted from each participant from those electrodes and time points that participated in the Age x Workload interaction cluster ( see Results), and performance indicators ( d' index and log transformed RT), separately per age group. Specifically, we used the skipped Pearson correlation approach, controlling for bivariate outliers while providing a better estimate of the true relationship between variables ( Pernet et al. , 2013 ;Rousseeuw, 1984 ;Rousseeuw and Driessen, 1999 ;Verboven and Hubert, 2005 ).
Likewise, a 2-way ANOVA performed on the RTs (log transformed) revealed a main effect of 'Workload' (F(1,55) = 83.276; p < 0.001; η² = 0.602), and an Age x Workload interaction (F(1,55) = 20.650; p < 0.001; η² = 0.273). Older participants, in contrast to younger ones, exhibited a greater increase of RTs in the 2-back compared to the 1-back task (Fig. 2 right). Therefore, our results show that increasing AWM load impairs task performance in older participants to a greater extent than in younger ones. The descriptive statistics of all behavioral measures, including hit and false alarm rates, are shown in Table 2.

Event-related potentials
ERP waveforms evoked to correctly detected target and all-but-false-alarm non-target syllables (from now on, simply targets and non-targets) are shown in Fig. 3 for each workload condition and age group separately. Their corresponding ERP scalp topography distributions at different relevant time-points are depicted in Fig. 5B and Fig. 6B. Obligatory auditory ERP components (P1, N1 and P2) can be observed, as well as N2 (heavily reduced in the older group) and prominent parietal P3b and SFN components.

Age-related effects in ERPs
Regarding target stimuli, a significant negative fronto-central cluster was obtained between 148 and 1019 ms (T = -6169.5; p
Regarding non-target stimuli, a significant negative fronto-central cluster was obtained between 60 and 1383 ms (T = -11470; p < 0.001), revealing more negative ERP amplitudes in the Younger group already from the earliest observable obligatory ERP components. Moreover, a significant positive centro-posterior cluster was obtained between 137 and 1445 ms (T = 11450; p < 0.001), revealing more positive ERP amplitudes in the Younger group, overlapping with the P2 component and encompassing the entire P3b component. Thus, amplitudes around the P1 were larger in the Older group, as well as around the N1 (albeit only at temporo-mastoidal electrodes, with positive amplitudes due to the polarity inversion below the Sylvian fissure). Additionally, in a similar fashion to the responses to target stimuli, the ERP around the P2 exhibited a different spatial distribution and latency between groups, being more frontal and later (and longer-lasting) in the Older group than in the Younger, differences that may have contributed to the fronto-lateral portion of the negative cluster and to the central portion of the positive cluster at this time range. Finally, amplitudes around the N2 component were larger in the Younger group (while barely observable in Older), as well as around the SFN.
In summary, while older participants showed higher amplitudes around obligatory ERP components to non-target stimuli, younger participants exhibited stronger late processing of both target and non-target stimuli, as revealed by more negative fronto-central and more positive posterior responses encompassing the SFN and P3b ERP components respectively. The evolution of the significant clusters in the time domain is depicted for targets in Fig. 5A (top row) and for non-targets in Fig. 6A (top row). For the sake of simplicity, Fig. 4A (top row, left and middle columns) depicts the average of the significant clusters within a 600-1000 ms time window, which roughly corresponds to the time range in which an Age x Workload interaction was found (see "Age x Workload interaction" below).

Workload effects in ERPs
Regarding target stimuli, a significant negative centro-parietal cluster was obtained between 430 and 781 ms ( T = -1735.1; p < 0.001), revealing more positive ERP amplitudes in the time range of the P3b for the 1-back versus the 2-back conditions. No significant clusters were found for the workload effect in non-target stimuli.
Therefore, while target syllables elicited larger amplitudes around the P3b under low compared to high auditory workload, no main effect of workload was observable in the processing of non-target syllables. The evolution of the significant clusters in the time domain is depicted for targets in Fig. 5A (middle row) (and for non-targets, albeit no clusters were found, in Fig. 6A, middle row), as well as the evolution in time of the scalp topography of the difference between ERP amplitudes elicited to the 2-back minus the 1-back tasks (Fig. 5B and Fig. 6B). For the sake of simplicity, Fig. 4A (middle row, left and middle columns) depicts the average of these clusters within a 600-1000 ms time window, which roughly corresponds to the time range in which an Age x Workload interaction was found (see "Age x Workload interaction" below), and Fig. 4B (left and middle columns) depicts the scalp topography of the difference between ERP amplitudes elicited to the 2-back minus the 1-back tasks in that time range.

Fig. 7. Workload effect per age group and age x workload interaction on the time-frequency decomposition using the whole frequency (3-70 Hz) and time (-0.2 to 2 s) ranges. Left, time-frequency plots for a selected subset of channels depicting the relative difference (in dB) between the spectral power in the 2b minus 1b workload conditions in the Younger group. Redder colors indicate larger power in the 2b versus 1b; bluer colors indicate larger power in the 1b versus 2b. A mid-frontal power increase in the theta band is apparent, as well as a generalized alpha power decrease. Right, same but for the Older group. While a frontal lower theta/delta increase seems apparent, the strongest power modulation is a reduction in the alpha band with increasing workload. Center, statistical comparison of the workload effect across age groups (y[2b-1b] vs. o[2b-1b]; independent t-test at each channel-time-frequency bin assessed with the non-parametric randomization Monte Carlo method; uncorrected; thresholded at p < 0.001). The only evident interaction is in the mid-frontal theta band, indicating that the increase in power with increasing workload was larger in the Younger group as compared to the Older.

Age x workload interaction in ERPs
Regarding target stimuli, a significant negative bilateral cluster, extending from fronto-lateral to parieto-lateral sites, was obtained between 614 and 1082 ms (T = -2256; p < 0.001) and a significant positive fronto-central cluster between 660 and 1094 ms (T = 2123; p < 0.001). The mean scalp distribution of these 2 clusters within a 600-1000 ms time window is depicted in Fig. 4A (bottom row, left column) and their detailed temporal evolution can be seen in Fig. 5A (bottom row). As interactions of this type are difficult to interpret, it is useful to observe the spatio-temporal distribution of the workload effect per age group separately (Fig. 4B, left column; Fig. 5B): it shows an increased bilateral negativity in the Younger group in the 2-back versus the 1-back condition, while the Older group exhibited the opposite pattern; and an increased fronto-central positivity in the Younger group due to a decreased negative potential around the SFN in the 2-back versus the 1-back condition, while the Older group barely showed any SFN.
Regarding non-target stimuli, a significant negative bilateral cluster, resembling that obtained to target stimuli, was obtained between 488 and 875 ms ( T = -2023; p < 0.001), and a significant positive centro-parietal cluster between 551 and 894 ms ( T = 1748; p < 0.001). The mean scalp distribution of these 2 clusters within a 60 0-10 0 0 ms time window is depicted in Fig. 4 A ( bottom row, middle column ) and their detailed temporal evolution can be seen in Fig. 6 A. The spatio-temporal distribution of the workload effect is depicted for each age group separately in Fig. 4 B ( middle column ) and Fig. 6 B, showing: (1) similarly to the results obtained with target stimuli, an increased bilateral negativity in the Younger group in the 2-back versus the 1-back condition, while the Older group exhibited the opposite pattern; and (2) an increased parietal positivity in the Younger group encompass-ing the P3b component in the 2-back versus the 1-back condition, while the Older group showed the opposite pattern (albeit an observable topographic shift, with more posterior potentials at the P3b time range in the 2-back condition, could partially account for this load-related decrease).
In summary, parietal amplitudes around the P3b evoked to target syllables (which were larger in the Younger group) decreased with increasing workload regardless of age. Negative SFN amplitudes evoked to target syllables at fronto-central electrodes decreased in younger participants, while older participants barely showed any fronto-central negativity. Regarding non-target syllables, a cross-over interaction showed that parietal amplitudes around the P3b component increased with higher workload in the Younger group but decreased in the Older group. Furthermore, in both target and non-target syllables, the effect of workload during the SFN time range at fronto-lateral electrodes was differentially modulated by age, being increased (more negative) with higher workload in the Younger while decreased in the Older group.

Theta (5-7 Hz) oscillatory power measurements
As both exploratory analyses ( see Fig. 7 and section 2.4.5. of Materials and methods ) and theoretically grounded a priori hypotheses supported a major role of frontal midline theta in WM modulations with workload and aging, we focused our analyses on the 5-7 Hz oscillatory band.

Age-related effects in theta power
A significant positive mid-line cluster, from frontal to occipital sites, was obtained through all the analysed time range (0-2 seconds; T = 1727; p < 0.01), indicating a higher ongoing theta power in the Younger versus the Older group (please note that no baseline correction was applied in this analysis of continuous effects). The evolution of this significant cluster in the time domain is depicted in Fig. 8 A ( top row ). For the sake of simplicity, Fig. 4 A ( top row, right column ) depicts the average of the significant cluster within a 600-1000 ms time window, which roughly corresponds to the time range in which an Age x Workload interaction was found ( see "Age x Workload interaction" below ). A visual examination of the theta power elicited per condition and group separately renders this age effect particularly clear as, while both age groups show a weak theta power at centro-lateral sites, older participants show a strong decrease of theta power at midline sites compared to younger participants.

[...] 1-back [1b]) in grand-mean theta ( θ) power (dB) in non-target syllables for the Younger (Y) and the Older (O) groups separately, within the selected time windows. Below each plot, 2 smaller scalp topographies depicting the grand mean theta ( θ) power (log μV²/Hz) for each n-back condition separately within the same time range.

Workload effects in theta power
No main effects of workload survived the cluster-based correction for multiple comparisons ( Fig. 8 A, middle row ; Fig. 4 A, middle row, right column ). This is hardly surprising, as a look at the subtraction of the theta power in the 2-back minus the 1-back condition per each group separately ( Fig. 8 B; Fig. 4 B) reveals opposite workload effects, with younger participants showing an increase of frontal theta power with higher workload, while older participants exhibited a pronounced decrease over frontal and parietal sites.
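The cancellation described above can be illustrated with a small numerical sketch. The values below are hypothetical and chosen only for illustration (they are not data from this study), and the helper name `workload_effects` is an assumption: when two groups show load-related changes of similar size but opposite sign, the change averaged over all participants is close to zero, even though each group taken separately shows a clear effect.

```python
from statistics import mean

# Hypothetical 2-back minus 1-back theta power changes (dB).
# These numbers are illustrative only and are NOT data from the study.
young_change = [0.9, 0.6, 1.1, 0.8]
old_change = [-1.0, -0.7, -0.9, -0.8]


def workload_effects(young, old):
    """Return the mean load-related change per group and pooled over groups."""
    return {
        "young": mean(young),
        "old": mean(old),
        "pooled": mean(list(young) + list(old)),
    }


effects = workload_effects(young_change, old_change)
for name, value in effects.items():
    print(f"{name}: {value:+.2f} dB")
```

In such a situation the informative test is the comparison of the load-related change between groups, that is, the interaction, rather than the main effect of workload.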

Age x workload interaction in theta power
Indeed, a significant positive cluster was obtained in the interaction analysis from 344-1712 ms ( T = 6329; p = 0.01), with a topographical distribution resembling a merging of a frontocentral cluster, resulting from the increase of frontal theta power in younger participants as well as the decrease in older participants with higher workload, and a bilateral posterior parietal cluster, resulting from a strong decrease of parietal theta power with higher workload in older participants compared to younger ones. The mean scalp distribution of this cluster within a 600-1000 ms time window is depicted in Fig. 4 A ( bottom row, right column ) and its detailed temporal evolution can be seen in Fig. 8 A ( bottom row ).
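Conceptually, an Age x Workload interaction of this kind compares the load-related difference (2-back minus 1-back) between the two groups. The following is a minimal sketch of that logic for a single value per participant (for instance, theta power averaged over one time-electrode window), using a Monte Carlo permutation of group labels. It is a simplified illustration and not the cluster-based procedure used in the analyses: it omits cluster formation over channels and time points, the function name and parameters are assumptions, and the numbers are hypothetical.

```python
import random
from statistics import mean


def interaction_permutation_test(young_diff, old_diff, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference between
    group means of load-related difference scores (2-back minus 1-back)."""
    rng = random.Random(seed)
    observed = mean(young_diff) - mean(old_diff)
    pooled = list(young_diff) + list(old_diff)
    n_young = len(young_diff)
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign participants to the two groups
        rng.shuffle(pooled)
        stat = mean(pooled[:n_young]) - mean(pooled[n_young:])
        if abs(stat) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so that p is never exactly zero
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical theta power changes (dB); NOT data from the study
young = [0.8, 1.1, 0.5, 0.9, 1.3, 0.7, 1.0, 0.6]
old = [-0.6, -0.2, -0.9, -0.4, 0.1, -0.7, -0.5, -0.3]

effect, p = interaction_permutation_test(young, old)
print(f"difference of differences = {effect:.2f} dB, p = {p:.4f}")
```

A positive difference of differences here corresponds to a larger load-related increase in the younger than in the older group, which is the direction of the cluster reported above.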

Correlations between frontal theta power and behavioral measures
Scatter plots relating the difference (2-back minus 1-back task) in frontal theta power to the difference in sensitivity index ( d' ) or RT are depicted in Fig. 9 . Frontal theta power was extracted from each individual in the time-electrode range of the significant Age x Workload interaction cluster, restricted to frontocentral electrodes (344-1712 ms; Fz, FC1, FCz, FC2). In the Younger group, skipped Pearson correlations yielded a significant negative linear relation between frontal theta power and RTs ( r = -0.35; p < 0.05), and a positive linear relation between frontal theta power and d' approaching significance ( r = 0.31; p = 0.06). No correlations were significant in the Older group ( d', r = -0.04; RTs, r = 0.29). These results suggest that the stronger the increase in frontal theta power of younger individuals when facing tasks involving a higher working memory load, the better their performance, while older individuals fail to show this relation.
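The brain-behavior relation examined here is a correlation between two difference scores per participant. A minimal sketch of that computation is given below, assuming one frontal theta power value and one mean RT per participant and load level. It uses a plain Pearson coefficient: the outlier-skipping step of the skipped correlation is deliberately not implemented, the function and variable names are assumptions, and all values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Plain Pearson correlation coefficient (no outlier removal)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant values; NOT data from the study
theta_1b = [2.0, 2.1, 1.8, 2.4, 2.2, 1.9]  # frontal theta power, 1-back
theta_2b = [2.2, 2.6, 2.7, 3.6, 2.6, 2.6]  # frontal theta power, 2-back
rt_1b = [520, 540, 500, 510, 530, 525]     # mean RT in ms, 1-back
rt_2b = [700, 690, 590, 570, 700, 645]     # mean RT in ms, 2-back

# Load-related change (2-back minus 1-back) for each participant
theta_change = [b - a for a, b in zip(theta_1b, theta_2b)]
rt_change = [b - a for a, b in zip(rt_1b, rt_2b)]

r = pearson_r(theta_change, rt_change)
print(f"r(theta change, RT change) = {r:.2f}")
```

With these illustrative numbers the coefficient is negative, meaning that participants with a larger load-related theta increase show a smaller load-related RT cost, which is the direction reported for the Younger group.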

Summary of results
Overall, our results can be summarized in 5 major points: (1) increasing memory load in the auditory n-back task impaired performance, the Older group being the most affected; (2) early processing of non-target stimuli, as indexed by the amplitude of obligatory auditory ERPs (P1, N1, P2), was enhanced in healthy aging, while late processing, as indexed by the amplitude of slow frontal negative (SFN) and parietal positive potentials (P3b), was diminished; (3) the effects of AWM load in late stimulus processing were age-dependent and occurred in opposite patterns according to the nature of the stimulus: while SFN measured from fronto-lateral electrodes and P3b ERPs elicited to non-target stimuli were enhanced in the high-load condition in younger but reduced in older participants, SFN measured from fronto-central electrodes elicited to target stimuli was reduced with increasing load in younger (SFN from fronto-lateral electrodes was increased) and P3b reduced in both age groups; (4) the effect of AWM load in midline frontal theta power was age dependent: power increased with higher AWM load in the younger while it decreased in the older group, who showed a strong overall power reduction; and (5) younger participants with a higher midline frontal theta power increment in response to increased AWM load demands exhibited a better performance in the n-back task.

Discussion
In the current study we investigated the effects of healthy aging and working memory (WM) load in auditory processing during continuous information monitoring and updating. In order to provide a comprehensive picture of the associated electrophysiological correlates, we employed an auditory n-back task ( Kirchner, 1958 ;Owen et al. , 2005 ) with 2 levels of cognitive load ( 1-back; 2-back ), coupled with EEG recordings. Our results, detailed above, suggest an age-related enhanced weighting of incoming sensory stimulation, in line with propositions of impaired sensory gating ( Friedman, 2011 ), weaker inhibitory control ( Chao and Knight, 1997 ) and unsuccessful allocation of cognitive resources towards the maintenance of multiple items held in working memory ( Polich, 2007 ), especially under high cognitive load. Similar deficits have been related to impaired speech comprehension ( Pichora-Fuller, 2003 ;Rönnberg et al. , 2013 ;Wingfield et al. , 2015 ;cf. Evans et al. , 2015 ).
As expected, our behavioral results showed that increasing workload in AWM from 1- to 2-back hinders performance, both in accuracy ( d' ) and reaction times, the Older group being the most affected. This replicates a reliable finding from visual n-back task studies ( Daffner et al. , 2011 ;Falkenstein, 2014 , 2018 ;Jonides et al. , 1997 ;Lubitz et al. , 2017 ;McEvoy et al. , 2001 ;Missonnier et al. , 2004 ;West and Bowry, 2005 ;Wild-Wall et al. , 2011 ), in line with the idea that age-related differences are revealed when the difficulty of the tasks increases ( Hess, 2005 ;Klencklen et al. , 2017 ). AWM studies manipulating memory load in Sternberg type tasks ( Sternberg, 1966 ) also reported effects of aging in performance and electrophysiological as well as fMRI indices ( Chao and Knight, 1997 ;Golob and Starr, 2000 ;Grady et al. , 2008 ;Karrasch et al. , 2004 ;Pelosi and Blumhardt, 1999 ;Pratt et al. , 1989 ). However, as noted by Bopp and Verhaeghen (2020) , simple memory tasks involving stimulus encoding, storage and retrieval produce weaker and more unreliable age effects than tasks that require executive functions such as updating, inhibition and focus switching within the contents of working memory. Thus, the n-back task appears particularly suited to study cognitive function decline in aging, as it can be considered a dual-task requiring continuous updating, maintenance and information processing, heavily taxing the limited pool of cognitive resources ( Daffner et al. , 2011 ;Mattay et al. , 2006 ;McEvoy et al. , 2001 ;McEvoy et al. , 1998 ).
Interestingly, the finding that the strongest performance decrements occur from 1- to 2-back tasks has been associated with a qualitative change in task demands: while 0- and 1-back tasks are, essentially, memory search tasks that require comparing items within the limited focus of attention, n > 1 tasks demand, in addition to storing presented items, storing their sequence of presentation, bringing back items from short-term memory stores to the focus of attention (i.e., focus switching; Oberauer, 2002 ) and inhibiting interfering items. The effects of such a qualitative change are aggravated in aged individuals who, as also indicated by our results, do not exhibit deficits in memory search processes but are rather more susceptible to short-term memory decay and interference ( Bopp and Verhaeghen, 2020 ). As, to the best of our knowledge, the only existing auditory n-back study on aging used a 1-back task in fMRI ( Grady et al. , 2008 ), we are here the first to provide a comprehensive account of this behavioral bifurcation and its electrophysiological correlates in AWM.
It has long been suggested that inhibitory deficits in the processing of irrelevant information underlie age-related performance decrements in a variety of cognitive, attentional and perceptual tasks (Inhibitory Deficit Hypothesis; Hasher and Zacks, 1988 ). Accordingly, the enhancement of early sensory ERPs to irrelevant or unattended stimuli in normal-hearing older individuals is a reliable finding in electrophysiological literature ( Aghamolaei et al. , 2018 ;Alain and Woods, 1999 ;Anderer et al. , 1996 ;Chao and Knight, 1997 ;Friedman, 2011 ;Pelosi and Blumhardt, 1999 ;Stothart and Kazanina, 2016 ;cf. Tusch et al. 2016 ). Possibly arising from a general disinhibition of sensory systems and frontal lobe alterations ( Knight et al. , 1999 ), such an impaired control in the access of irrelevant information to the focus of attention and WM (i.e., "sensory gating") may lead to information overload. A crucial distinction between experimental paradigms assessing the processing of irrelevant or unattended information and the n-back task, though, is that in the latter there are no irrelevant stimuli. The matching and updating process takes place for both target and non-target stimuli, as both must be retained in WM, albeit only targets require response execution. We found increased ERP amplitudes to non-target stimuli in aged individuals encompassing the P1 (ca. 70 ms) and N1 (ca. 120 ms; albeit only in its temporo-mastoidal component), resembling impaired sensory gating (and ruling out subclinical diminished hearing sensitivity as an influencing factor; Hyde, 1997 ). We also observed an age-related shift in the scalp distribution around the time range of the P2 ERP (ca. 200 ms) from central to fronto-lateral electrodes ( Figs. 5 and 6 ; please be aware that scalp topographies were not directly assessed statistically), similarly to findings suggesting a deficient disengagement of irrelevant information processing ( Anderer et al. , 1996 ;McEvoy et al. , 2001 ).
However, given the fundamental differences between different experimental paradigms, our results must be interpreted under a different light. We suggest that, as the n-back task demands constant monitoring, the increased auditory cortex responsiveness and prolonged engagement in incoming stimuli may divert cognitive resources towards early stages of processing, leaving fewer resources to other task demands and rendering representations in WM more susceptible to decay and interference ( Bopp and Verhaeghen, 2020 ).
In line with this, we found a strong decrement in older participants' ERP amplitudes around the N2 component (ca. 300 ms; Figs. 5 and 6 ). Since decreased N2 has been interpreted to index impaired signalling to prevent further stimulus processing ( Bertoli and Probst, 2005 ;Stothart and Kazanina, 2016 ), as well as deficient early processes of matching/mismatching incoming stimuli with short-term memory representations ( Daffner et al. , 2011 ;Folstein and Van Petten, 2008 ;Pratt et al. , 1989 ), it is indeed plausible that overly engaged resources in early sensory processing impact further task requirements. Crucially, our sample of older participants exhibited a prominent decrease of a late positive parietal component, presumably the P3b (ca. 400-1000 ms), associated with impaired posterior generators ( McEvoy et al. , 2001 ) and cognitive decline in aging ( Missonnier et al. , 2004 ), consistently found in DMTS studies (e.g., Chao and Knight, 1997 ). The P3b has been related to cognitive resource allocation, inhibition and transference of attentional contents from frontal to temporal and parietal regions for subsequent memory processes (i.e., context updating; Friedman, 2011 ;Polich, 2007 ). Although electrophysiological aging studies using visual n-back tasks are scarce and present inconsistent methods and findings (e.g., Daffner et al. , 2011 ;Gajewski and Falkenstein, 2014 ;Lubitz et al. , 2017 ;Saliasi et al. , 2013 ), our results fit with the assumption that P3b indexes these cognitive operations ( Polich, 2007 ). First, targets, as relevant infrequent events, prompt higher resource allocation, yielding larger P3b's in both groups (evident in Fig. 3 ; but please be aware that we did not directly compare amplitudes across stimulus roles); second, both groups exhibited a load-related decrease in response to targets, expected as cognitive resources are divided between target detection (orienting attention), behavioral response and other task requirements (akin to dual-tasks; Sirevaag et al. , 1989 ); and third, younger participants showed a load-related increase to non-targets, reflecting higher resource allocation/context updating (e.g., stimulus and sequential order). Conversely, older participants exhibited a decrease, which may index discontinuous cognitive effort at high load due to experiencing the task as too difficult ( Gajewski and Falkenstein, 2014 ;Saliasi et al. , 2013 ;Van Snellenberg et al. , 2015 ). An alternative, non-exclusive explanation is that a higher allocation of cognitive resources to early sensory processing (indexed by enhanced early auditory ERPs) may affect the reallocation of attention to the contents held in WM that is needed for subsequent stimulus comparison/decision making processes, in line with impaired focus switching ( Bopp and Verhaeghen, 2020 ).
Moreover, older participants exhibited a drastic reduction in a late slow frontal component, which we identified as the sustained frontal negativity (SFN; or sustained anterior negativity [SAN]; Kaiser, 2015 ), an ERP related to sustained attention processes allowing item maintenance in WM ( Chao and Knight, 1997 ). SFN is distinctly elicited during retention periods in WM maintenance tasks (e.g., DMTS tasks) involving the auditory system ( Guimond et al. , 2011 ;Lefebvre et al. , 2013 ;Nolden et al. , 2013 ;Ruchkin et al., 1997 ). It steadily increases with load until reaching a plateau ( Alunni-Menichini et al. , 2014 ) and is severely diminished in aged individuals, possibly due to prefrontal activity deficits ( Chao and Knight, 1997 ). The effects of workload on SFN in our study are complex and apparently do not fit the literature, as younger participants surprisingly showed a larger SFN to targets as measured from fronto-central electrodes in the low load condition and no difference to non-targets. However, activity at frontolateral and temporal electrodes to both stimulus types did increase with load, while it decreased in older participants. In this regard, we would like to underline the differential nature of the n-back task as compared to other previously used, storage-centered AWM tasks. As a matter of fact, auditory n-back EEG studies are scarce and the effects of workload on the SFN component (if analyzed, and using varied terminology) are inconclusive: decreased negativity with workload to targets (Late Slow Wave; Arjona-Valladares et al. , 2021 ); increased negativity without distinguishing targets from non-targets (Slow Wave activity; Alain et al. , 2009 ); and increased negativity at central (not frontal) electrodes (Slow Brain Potentials; Rämä et al. , 2000 ).
Given the inconclusive previous literature, we propose an alternative interpretation. If, as discussed above, our results on the P3b suggested a larger allocation of cognitive resources to processing targets in the low load condition, the activity related to reassigning cognitive resources to the task after giving a behavioral response should be larger as well. Thus, we speculate that the observed sustained negativity at fronto-central electrodes may reflect, at least in part, the Reorienting Negativity (RON) component, which has been related to reorienting to original task goals ( Escera et al. , 2000 ;Escera et al. , 2001 ), is reduced in aging ( Correa-Jaraba et al. , 2016 ;Getzmann et al. , 2015 ) and has been postulated as a marker of frontal and high-order function integrity ( Justo-Guillén et al. , 2019 ). This speculation calls for future studies better suited to disentangle, at the source level, this intriguing possibility.
Notwithstanding the relevance of oscillatory activity at multiple frequency bands for cognitive operations involved in AWM ( Kaiser, 2015 ), we decided to focus on the theta band for its reliable association to item maintenance, sequential order encoding and coordination of the reactivation of information represented in posterior areas through interregional synchronization ( Hsieh and Ranganath, 2014 ;Kawasaki et al. , 2010 ;Sauseng et al. , 2010 ). Moreover, our additional exploratory analyses also pointed specifically to the theta band as differentially modulated by age and workload ( see Materials and methods and Fig. 7 ). The n-back task, especially at high load conditions, heavily relies on the integrity of such processes, which is reflected by more robustly elicited theta power increments than in other WM tasks ( Brookes et al. , 2011 ;Scharinger et al. , 2017 ). Moreover, theta resting state power, but not delta nor alpha, has been proposed as a marker of healthy neurocognitive aging, as it correlates with several cognitive measures of attention, memory and executive function ( Finnigan and Robertson, 2011 ). Consistent with our results, frontal theta power is severely decreased in aging during WM tasks as well as in resting state ( Cummins and Finnigan, 2007 ;Dustman et al. , 1993 ). Furthermore, previous studies consistently reported frontal theta power increments with WM load ( Brookes et al. , 2011 ;Deiber et al. , 2007 ;Jensen and Tesche, 2002 ), as we observed in our younger group. Additionally, we found that younger participants with stronger load-related increments of frontal theta power performed better in our auditory n-back task. 
However, the Older group exhibited a decrease of theta power both at frontal and parietal electrodes, which may reflect an accelerated decay of sensory representations ( Bopp and Verhaeghen, 2020 ) and an unsuccessful maintenance of constant monitoring at high load, possibly due to reaching supra-capacity levels ( Gajewski and Falkenstein, 2014 ;Saliasi et al. , 2013 ;Van Snellenberg et al. , 2015 ), as supported by behavioral results. Such strongly reduced frontal theta power in the older participants may explain the lack of correlation with performance measures ( Fig. 9 ).
Crucially, theta activity deficits in a network supporting domain-general cognitive abilities, functionally decoupled from a left-hemispheric dorso-frontal language-specific sentence comprehension network, are associated with impaired verbal working memory leading to speech comprehension problems in healthy aged individuals ( Beese et al. , 2017 ). Furthermore, theta power and interregional coherence are associated with the retrieval of verbal information during unfolding sentences, increasing with syntactic complexity and sequencing demands ( Meyer, 2018 ). As activity in the alpha band, reflecting unbalanced cortical inhibition, has been recently related to speech comprehension in aging as well ( Beese et al. , 2019 ), studying the sources of theta oscillatory activity in older individuals during the comprehension of unfolding sentences at different levels of syntactic complexity constitutes a promising avenue for further research.

Conclusion
The results reported here show age-related differences in auditory information processing within a latency range that is consistent with a series of impaired functions, from sensory gating to cognitive resource allocation during constant information updating. While consistent with most studies using visual n-back tasks, they represent a substantial addition to the auditory working memory literature in healthy aging populations. The auditory n-back task, especially at the bifurcation between taxing cognitive operations within ( n ≤ 1 ) and outside ( n > 1 ) the focus of immediate attention, appears sufficiently sensitive to reveal subtle deficits that may hypothetically underlie speech comprehension problems in older individuals.