There is a great deal of evidence that sleep plays a functional role in maintaining cognitive abilities. For example, sleep deprivation leads to a profound impairment in sustained attention and deficits in executive function (Lim & Dinges, 2010). Another aspect of cognitive function served by sleep is working memory (WM), which refers broadly to the system that actively manipulates, updates, and maintains mental representations (Baddeley & Hitch, 1974; Baddeley, 2000).

WM performance is influenced by both state and trait factors and is closely related to executive control of attention (Ilkowska & Engle, 2010). The amount of information or number of items one can maintain over short intervals is referred to as working memory capacity (WMC), and it is most often studied in terms of the capacity of visual short-term memory (Cowan, 2010; Vogel & Machizawa, 2004). WMC is frequently studied in conjunction with WM performance and is considered an individual trait that influences one’s performance in a WM task. For example, trait anxiety predicted worse performance among individuals with low WMC on the maths computation tests of the Wide Range Achievement Test (Wilkinson & Robertson, 2006) and Raven’s Standard Progressive Matrices (Raven, Raven, & Court, 1998), two tasks involving WM, but trait anxiety had no relation to performance among those with average WMC, and a positive relationship with performance for those with high WMC (Owens, Stevenson, Hadwin, & Norgate, 2014).

Sleep deprivation has been shown to negatively impact performance on a variety of cognitive tasks involving WM, such as digit span (Quigley, Green, Morgan, Idzikowski, & King, 2000) and N-back tasks (Choo, Lee, Venkatraman, Sheu, & Chee, 2005; for review, see Frenda & Fenn, 2016), and these effects are likely driven, in part, by impairments in sustained attention (Lim & Dinges, 2010) and altered functioning of frontal and parietal networks (Chee & Choo, 2004). Continuous sleep restriction has also been reported to lead to WM deficits measured using a digit symbol substitution task (Van Dongen, Maislin, Mullington, & Dinges, 2003). Recent research suggests that sleep loss may impact various components of WM differently, and may depend on the degree of sleep loss and tasks employed. For example, Angel et al. (2015) reported that having sleep restricted to 4 h per night for five nights led to deficits in the phonological and visuospatial storage components on both visual and auditory n-back tasks. Drummond, Anderson, Straus, Vogel, and Perez (2012) reported that neither total sleep deprivation nor four nights restricted to 4 h in bed impacted visual WMC, yet total sleep deprivation did alter filtering efficiency in a visual distracter task. Of note, sleep disruption has been associated with behavioral deficits in both reaction time (e.g., Jiang et al., 2011) and accuracy (e.g., Tempesta, De’Gennaro, Presaghi & Ferrara, 2014) in WM tasks. Taken together, this body of work confirms disruption to WM performance following sleep loss and suggests that sleep plays a functional role in WM; however, it remains unclear what properties of sleep support WM performance.

In a recent study, Lau, Wong, Lau, Hui, and Tseng (2015) investigated WM performance before and after a 90-min nap in comparison with a no-nap group. Lau et al. utilized an N-back WM task that required participants to recall the location of a symbol that appeared two trials prior in one of 12 possible locations. Accuracy in identifying repeated symbols improved after a nap, but not following an equivalent period of wakefulness. Moreover, performance improvement in the last of three blocks of this task was positively correlated with REM sleep duration and total sleep time (TST) in the nap group, which may indicate that the nap was most beneficial for those with greater sleep pressure before the nap. In addition to sleep architecture (i.e., amount of sleep time spent in each sleep stage), sleep spindles may also contribute to WM improvements. Sleep spindles are 12–16 Hz oscillations present in the electroencephalogram (EEG) of non-rapid eye movement (NREM) sleep, predominantly at central sites, with durations ranging from about 0.5 to 3 s (Rechtschaffen & Kales, 1968). Sleep spindle activity has been positively correlated with declarative memory recall after sleep (Clemens, Fabo, & Halasz, 2005; Mednick et al., 2013; Schabus et al., 2004) and various measures of general cognitive ability and intelligence (Bódizs et al., 2005; Fogel, Nader, Cote, & Smith, 2007; Lustenberger, Maric, Dürr, Achermann, & Huber, 2012; Schabus et al., 2006). More specifically, Fang et al. (2017) reported that the relationship between sleep spindles and cognitive ability was specific to fast (13.5–16 Hz) sleep spindles of NREM stage 2 sleep (NREM2) and slow-wave sleep (SWS) and to reasoning abilities such as those needed for logical problem solving and identification of complex patterns (i.e., “fluid intelligence”). Importantly, Fang et al. (2017) found this relationship to be independent of markers of sleep maintenance, which suggests that spindles may have some function beyond protecting sleep (Cote, Epps, & Campbell, 2000; Dang-Vu, McKinney, Buxton, Solet, & Ellenbogen, 2010). Fluid intelligence and WM have been conceptualized as highly related constructs linked through executive attention (Engle & Kane, 2004; Engle, Tuholski, Laughlin, & Conway, 1999); thus, one may expect sleep spindles to be related to WM performance and WM improvements resulting from a nap.

The current study sought to examine the role of sleep in WM by comparing WM performance after a 90-min afternoon nap opportunity to performance within the same participants after a control condition without a nap opportunity. A nap was predicted to improve WM performance over the waking control condition, and the main goal of the study was to investigate the nature of this sleep-related WM performance improvement. We employed a delayed-estimation, continuous-report task, which allows for measurement of both the ability to hold and report visual representations and the fidelity of those representations. While such tasks are used frequently to understand the properties of WM (e.g., Bays, Catalao, & Husain, 2009; Wilken & Ma, 2004), the use of a continuous-report WM task in a sleep study is novel and allows for better understanding of how sleep may separately, or jointly, affect the odds of successful WM storage and retrieval and the fidelity of WM. The delayed-estimation task used asked participants to view two, four, or six uniquely colored squares, hold them in WM over a brief delay, and then, from memory, report the color of a probed square on a 360° color wheel. Data were analyzed at a trial-by-trial level with mixed-effect modelling to capture performance changes over the task, an important consideration given the time-on-task effect (i.e., the progressive worsening of performance over time on tasks requiring sustained focus), its relationship to sleep and sleep loss (Lim & Dinges, 2008), and the potential for relationships between sleep and performance to vary over the duration of a task (Lau et al., 2015). Models were constructed to estimate the effect of the experimental conditions on two measures of WM performance, namely the odds of WM success and the error within success. A report for a trial was considered a WM success if the reported color was within 25° of the color of the probed square. 
Error within success reflects the fidelity of successful reports and refers to the distance in degrees between a report and the color of the probed square.

To elucidate mechanisms and potential moderators of WM improvements following a nap, trait WMC and sleep physiology, including sleep architecture, sleep spindles, and EEG power during sleep, were quantified and included in models predicting WM performance. WMC was predicted to moderate the effects of the nap on WM performance; those low in WMC were predicted to show greater benefits from the nap given that these individuals might be more susceptible to declines that accumulate over wakefulness. Similarly, a nap may be most beneficial to those with a greater sleep need prior to the nap. Thus, markers of sleep pressure such as SWS and NREM delta (0.5–4 Hz) power (Dijk, Brunner, Beersma, & Borbély, 1990) may predict sleep-related WM improvements. Given the results of Lau et al. (2015), TST and REM sleep duration were predicted to be associated with post-nap WM improvements. Because WM has been associated with fluid intelligence (Engle & Kane, 1999), and fluid intelligence has been associated with sleep spindle activity (Bódizs et al., 2005; Fang et al., 2017; Schabus et al., 2006), sleep spindle density in NREM sleep was predicted to relate to WM performance or specifically sleep-related improvements in WM performance if sleep spindles have an active role in maintaining cognitive performance.

Finally, we conducted exploratory analyses on waking EEG recorded during the WM task. EEG associated with task demands was obtained by extracting event-related potentials (ERPs) from the delay period during which items were held in WM. Quantification of an occipital component, namely occipital delay activity (ODA), was guided by previous research on the ERP component known as the sustained posterior contralateral negativity (SPCN) or contralateral delay activity (CDA), which is a lateralized component extracted from occipital channels during the delay period of delay-estimation tasks that has been shown to vary in amplitude based on visual WM load (e.g., Vogel & Machizawa, 2004; Jolicœur, Brisson, & Robitaille, 2008). A similar load-dependent, long-duration component has been observed in tasks that were not lateralized in stimuli presentation (Ruchkin, Johnson, Canoune, & Ritter, 1990). In addition, a frontal delay activity (FDA) component was quantified and explored due to past research linking frontal EEG activity and attentional processes of WM (e.g., Liesefeld, Liesefeld, & Zimmer, 2014) and the established link between sleep and sustained attention (Lim & Dinges, 2010). This guided selection of load-dependent FDA and ODA ERPs was used to better understand the nature of sleep-related improvements to WM performance. The rationale was that components that vary by load during the maintenance period may index memory-specific processes, and learning how these indicators may vary with a nap could provide insight into how a nap affects WM performance.

Methods

Participants

Participants were recruited from the local university to participate in a larger study on the role of sleep in memory and emotion. They were required to be 17–30 years of age, healthy, good sleepers, non-smokers, right-handed, taking no medications that disrupt sleep or cognition, and to have no history of brain injury, anxiety, or other psychiatric conditions. In addition, all participants completed the Ishihara Color Test (Ishihara, 2014) during screening to verify accurate color vision for inclusion in the study. A target sample size of approximately 40 participants was determined based on anticipated effect sizes in the declarative memory tasks not reported here. Of the initial 41 participants who completed the study, two were excluded for particularly poor performance (>3 SD below the mean number of WM successes) on the least difficult level of the WM task (WM load of two items). An additional three participants were removed because they obtained fewer than 20 min of sleep during the 90-min nap opportunity, whereas all other participants obtained more than 45 min. The final sample thus included 36 participants (mean age = 20 years; 20 men, 16 women). Participants in the study were given an honorarium of Can$50 or course credit. The study was cleared by the local Biosciences Research Ethics Board.

Materials

Questionnaires

Questionnaires assessing sleep habits and history were administered during an orientation session. A seven-item napping questionnaire was also given to explore whether napping preferences predicted performance outcomes. The question used to measure a preference for napping was: “If your schedule was such that you could nap everyday, how frequently would you nap?” Participants indicated their response from the following options: everyday, more than twice a week, once or twice a week, once or twice a month, and never, which were numerically scored as 5, 4, 3, 2, and 1, respectively. Pre- and post-sleep questionnaires were administered during the experimental session to measure subjective sleepiness, fatigue, and mood and to give participants an opportunity to report any unusual events and confirm that they followed instructions regarding alcohol, caffeine, and sleep times.

Working memory capacity (WMC)

A change detection task was administered to estimate WMC (Luck & Vogel, 1997). Participants were shown two, four, or six uniquely colored squares for 500 ms followed by fixation for 1,000 ms before the whole display returned either exactly as it was or with one square containing a new unique color different from its previous color. This test display remained on the screen until participants reported whether they saw a change in color at any location (whole-probe). The colored squares were each 1° of visual angle in length. The squares were randomly assigned to any of 12 possible locations (clock positions) 4° of visual angle from the fixation point. Both reaction time and accuracy were recorded but only accuracy was stressed to the participants. WMC was calculated using Pashler’s formula for capacity (K) in whole-probe change detection (Pashler, 1988).

$$ K = (\text{set size}) \times \frac{\text{hit rate} - \text{false alarm rate}}{1 - \text{false alarm rate}} $$
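As a worked illustration of the capacity formula, the following minimal Python sketch computes K for one set size. The function name and the example hit and false alarm rates are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
def pashler_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    """Pashler (1988) capacity estimate K for whole-probe change detection."""
    if not 0.0 <= false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must be in [0, 1)")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


# Hypothetical rates for one participant at set size 4 (not study data).
k = pashler_k(set_size=4, hit_rate=0.80, false_alarm_rate=0.20)
print(round(k, 2))  # 4 * 0.60 / 0.80 = 3.0
```

With a hit rate of .80 and a false alarm rate of .20, the correction for guessing yields an estimate of three items held in memory out of the four displayed.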

Working memory performance

Participants performed 300 trials of a delayed-estimation task that utilizes continuous-report to measure WM performance (Zhang & Luck, 2008). Memory load varied between two, four, and six items across trials of this task with each item on the screen being a colored square 1° of visual angle in size and 3° of visual angle from a central fixation point. The colors of each square were randomly selected from 360 colors from a continuous color wheel made using CIE L*a*b* color space to present all colors in equal luminance across the entire color wheel. The colors selected for each memory display were required to be 20° apart on the color wheel to ensure they would be seen as unique colors while still allowing for substantial variation in the colors selected for high-load trials. For each trial, the memory display was shown for 200 ms followed by a 1,000-ms delay over which participants would maintain the items in memory and then the test display to measure WM performance (Fig. 1). The test display consisted of square outlines in the position of each of the items from the memory display all enclosed by a color wheel with a diameter of 15° of visual angle. For each trial, one of the items was randomly selected to be probed, and the outline of this item was thicker than the others. Participants used a computer mouse and the displayed color wheel to report the color that was previously contained within the probed square during the memory display; the color in the probed square changed as they moved their mouse pointer around the color wheel to show the color currently under their pointer. The accuracy of their response was recorded as the circular difference between the true target value and their own report. Responses within 25° were categorized as WM success (i.e., correct responses) for analysis. The color wheel orientation was jittered between trials to prevent participants from memorizing the color locations on the wheel as spatial locations. 
The inter-trial interval began when participants made their selection and was temporally jittered between 1,000 and 1,500 ms. The trial conditions were presented intermixed and randomly ordered.
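The scoring of a single report can be sketched as follows. This is an illustrative Python sketch rather than the authors' scoring code: the function names are hypothetical, the error is taken as an unsigned angular distance, and treating a report exactly 25° from the target as a success is an assumption, since the text says only "within 25°".

```python
def circular_error(report_deg: float, target_deg: float) -> float:
    """Smallest unsigned angular distance between two hues, in degrees (0 to 180)."""
    diff = (report_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)


def is_wm_success(report_deg: float, target_deg: float,
                  criterion_deg: float = 25.0) -> bool:
    """True if the report falls within the success criterion.

    Assumption: the boundary itself (exactly 25 degrees) counts as a success.
    """
    return circular_error(report_deg, target_deg) <= criterion_deg


# A report at 350 degrees for a target at 10 degrees wraps around the wheel.
print(circular_error(350.0, 10.0))   # 20.0
print(is_wm_success(350.0, 10.0))    # True
print(is_wm_success(100.0, 10.0))    # False
```

Taking the distance modulo 360 matters because the color wheel is circular: hues at 350° and 10° are only 20° apart, not 340°.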

Fig. 1. Sample trial of the continuous-report delayed-estimation task used to measure working memory performance. Trials had memory loads of two, four, or six uniquely colored squares. A trial with a memory load of four is shown. Participants were asked to remember the colors of the squares over the delay, and then report the color of the probed square (thick border) from memory using the color wheel.

Electrophysiological recordings and analysis

Electrophysiology was recorded using Neuroscan SynAmps2 amplifiers with Scan 4.5.1 software (Compumedics Inc., Abbotsford, Victoria, Australia). Gold-plated silver electrodes were placed at 12 scalp sites (F3, Fz, F4, C3, Cz, C4, P3, Pz, P4, PO7, Oz, and PO8) according to the modified International 10–20 system (Pivik et al., 1993). Bipolar left and right electro-oculography (EOG) and submental electromyography (EMG) were also recorded. Data were recorded with reference to an electrode at Fpz at a sampling rate of 1,000 Hz and were filtered online from DC to 200 Hz with an additional notch filter at 60 Hz. Electrical impedances were below 5 kΩ at all scalp sites prior to recording and below 10 kΩ at peripheral sites.

Sleep

Scalp EEG recorded during the nap was re-referenced offline to the contralateral mastoids for lateral sites and the average of the two mastoids for central sites. Sleep records from the 90-min nap opportunities were scored by the first author using standard criteria (Rechtschaffen & Kales, 1968). Stages 3 and 4 of NREM sleep were collectively categorized as slow-wave sleep (SWS). Total sleep time and the time spent in each stage were determined for each participant. In addition, sleep spindles were identified in sleep records through visual inspection of EEG at site Cz by a trained researcher. To aid identification, a wide band-pass (0.5–35 Hz) and narrow band-pass (12–16 Hz) filter were applied separately to two copies of the EEG at Cz, and these filtered channels were examined simultaneously. Spindles were defined as bursts of 12–16 Hz EEG activity lasting a minimum duration of 0.5 s and consisting of a fusiform shape. Spindle onset and offset were marked for each sleep spindle identified in epochs of NREM sleep. As done by Fang et al. (2017) and others, a measure of sleep spindle density was calculated as the number of sleep spindles divided by minutes of NREM stage 2 sleep and SWS combined (NREM stage 1 sleep was not included in this calculation because spindles are not present during this stage).
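The spindle density measure described above reduces to a simple ratio. The sketch below is illustrative only; the function name and the example counts and durations are hypothetical.

```python
def spindle_density(n_spindles: int, nrem2_min: float, sws_min: float) -> float:
    """Spindles per minute of NREM stage 2 sleep and SWS combined.

    NREM stage 1 minutes are deliberately left out of the denominator.
    """
    minutes = nrem2_min + sws_min
    if minutes <= 0:
        raise ValueError("no NREM stage 2 or SWS time to divide by")
    return n_spindles / minutes


# Hypothetical nap: 120 spindles over 30 min of stage 2 and 10 min of SWS.
print(spindle_density(120, nrem2_min=30.0, sws_min=10.0))  # 3.0
```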

All epochs of NREM stage 2 sleep and SWS that did not include artefact, arousals, or transitions to or from lighter sleep or REM sleep were subjected to fast Fourier transformation (FFT) with 2-s, 75% overlapped Hanning windows. For each epoch, absolute power (μV²) was obtained for three measures: delta power at frontal sites (mean of F3, Fz, and F4), where it is maximal and most reflective of sleep pressure (Finelli, Borbély, & Achermann, 2001); delta power at occipital sites (mean of PO7, Oz, and PO8), because slow-wave activity local to regions crucial to visual WM (Vogel & Machizawa, 2004) may relate to WM performance by reflecting the sleep need specific to these regions (Vyazovskiy et al., 2011); and fast sigma (14–16 Hz) power at central sites (mean of C3, Cz, and C4), to index activity of fast sleep spindles shown to relate to fluid intelligence (Fang et al., 2017). After logarithmic transformation, the mean over all qualifying epochs was computed for each measure to obtain the average power in an epoch of NREM sleep.
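The windowing scheme can be sketched in Python as below. This is a rough illustration under several assumptions, not the analysis pipeline used in the study: it uses a naive discrete Fourier transform in place of an FFT so that it needs only the standard library, a 100 Hz synthetic signal instead of the 1,000 Hz recordings, a simple window-energy normalization (so values are proportional to, not calibrated as, μV²), and a base-10 logarithm, which the text does not specify. All function and variable names are hypothetical.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def band_power(signal, fs, f_lo, f_hi, win_s=2.0, overlap=0.75):
    """Mean windowed-periodogram power summed over DFT bins in [f_lo, f_hi] Hz."""
    n = int(round(win_s * fs))              # samples per 2-s window
    step = int(round(n * (1.0 - overlap)))  # hop size for 75% overlap
    if len(signal) < n:
        raise ValueError("signal shorter than one window")
    w = hann(n)
    w_energy = sum(v * v for v in w)        # normalization choice (assumption)
    k_lo = math.ceil(f_lo * n / fs)         # first bin inside the band
    k_hi = math.floor(f_hi * n / fs)        # last bin inside the band
    totals = []
    for start in range(0, len(signal) - n + 1, step):
        seg = [signal[start + i] * w[i] for i in range(n)]
        total = 0.0
        for k in range(k_lo, k_hi + 1):
            coef = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n)
                       for i in range(n))
            total += abs(coef) ** 2 / w_energy
        totals.append(total)
    return sum(totals) / len(totals)


# Synthetic 30-s "epoch": a large 2 Hz wave plus a small 15 Hz wave.
fs = 100
epoch = [20 * math.sin(2 * math.pi * 2 * i / fs)
         + 2 * math.sin(2 * math.pi * 15 * i / fs) for i in range(30 * fs)]
delta = band_power(epoch, fs, 0.5, 4.0)
sigma = band_power(epoch, fs, 14.0, 16.0)
print(math.log10(delta) > math.log10(sigma))  # True
```

With 2-s windows the frequency resolution is 0.5 Hz, so the delta band (0.5–4 Hz) spans eight bins and the fast sigma band (14–16 Hz) spans five.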

Wake

For waking EEG recorded during the delayed-estimation task, scalp EEG was re-referenced offline to the average of the mastoids. To explore ERPs associated with task demands, EEG collected during this task was cut into 1-s epochs time-locked to the onset of the stimuli (colored squares) with a 200-ms pre-stimulus interval. Epochs were down-sampled to 256 Hz and were baseline-corrected to the pre-stimulus interval. Automatic artefact rejection was performed, which removed any epochs exceeding ±70 μV in amplitude between 0 and 700 ms on the channels of interest or the vertical eye channel (to detect blinks). The epochs were averaged according to WM load of the trial (2, 4, or 6) and nap condition (nap or wake control) for a total of six ERP averages. Trials marked as containing artefact were not included in individual averages, and individual averages based on fewer than 20 trials were not included in grand averages or analyses. This criterion removed four participants from all ERP analyses and removed a total of five other individual averages across three additional participants.
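The epoch-level steps above (baseline correction, amplitude-based rejection, and the 20-trial minimum) can be sketched for a single channel as follows. This is a simplified illustration with hypothetical function names and toy values; the study applied the rejection criterion across the channels of interest and the vertical eye channel, which would mean running the check on each of those channels.

```python
def baseline_correct(epoch_uv, times_ms):
    """Subtract the mean of the pre-stimulus samples (t < 0) from every sample."""
    baseline = [v for v, t in zip(epoch_uv, times_ms) if t < 0]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch_uv]


def has_artifact(epoch_uv, times_ms, limit_uv=70.0, window_ms=(0, 700)):
    """True if any sample inside the window exceeds +/- limit_uv."""
    lo, hi = window_ms
    return any(abs(v) > limit_uv
               for v, t in zip(epoch_uv, times_ms) if lo <= t <= hi)


def average_if_enough(epochs, min_trials=20):
    """Point-by-point average, or None when fewer than min_trials epochs remain."""
    if len(epochs) < min_trials:
        return None
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


# Toy epoch sampled every 100 ms from -200 to 100 ms (values in microvolts).
times = [-200, -100, 0, 100]
print(baseline_correct([5, 5, 6, 7], times))  # [0.0, 0.0, 1.0, 2.0]
print(has_artifact([0, 0, 80, 0], times))     # True
```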

Investigation of the delay period in the ERPs at frontal and occipital sites was guided by previous research on ERP components of visual WM (e.g., Jolicœur et al., 2008; Liesefeld et al., 2014; Ruchkin et al., 1990; Vogel & Machizawa, 2004). This guided investigation, combined with a visual inspection of the grand average ERPs for regions in which delay activity amplitude varied with respect to WM load, led to the selection of two long-lasting potentials, namely FDA and ODA, defined in a window 400–675 ms post-stimulus onset across frontal and occipital sites, respectively. With little evidence of topographic lateralization of these components, the three frontal channels (F3, Fz, and F4) and the three occipital channels (PO7, Oz, and PO8) were averaged to form two new channels, respectively “frontal” and “occipital,” for all analyses.

Procedure

Participants first attended a 1-h orientation in the Sleep Research Laboratory. During this off-protocol session, participants gave informed consent and were administered the various measures used for eligibility screening. Those meeting inclusion criteria then completed the change detection task to measure WMC followed by a practice period in which participants completed a few example trials of the tasks they would be asked to complete during the experimental sessions.

The experimental protocol consisted of two repeated testing sessions from 13:00 to 18:00 in the Sleep Research Laboratory. Upon arriving, participants underwent electrode application procedures. At 13:30 participants started the first part of the task battery, which required them to learn material for a later memory test. These memory tasks involved declarative and procedural memory, but are not the focus of this paper. Following these tasks, participants were given the pre-retention questionnaire and assigned to complete one of the two conditions: nap or control. Participants assigned to the nap condition were asked to sleep in their assigned bedroom from 14:00 to 15:30. At the end of the nap, participants were given the post-retention questionnaire and were then given 30 min to play tabletop games with a researcher to ensure they had time to recover from sleep inertia before continuing the tasks at 16:00. Participants assigned to the control condition were given the pre- and post-retention questionnaires at the same time as napping participants, but played games with the researcher for the full 14:00–16:00 period. For both the nap and wake conditions, participants were only allowed to play games they already knew how to play so as to limit the cognitive demand during this time. In addition, the available games were selected to be low to mildly arousing and simple in design (e.g., UNO® and Yahtzee®). While visual and cognitive processing is inevitable for this, and all other, wake retention control tasks, we believe this form of social activity to be an ecologically valid control in that it is a reasonably likely alternative way for young adults to spend afternoon leisure time in lieu of napping. All participants completed both the nap and control conditions with the order counterbalanced among participants. 
To control for any effects of expectation, participants were informed that they may be asked to nap (short or long nap) or remain awake on either day, regardless of any previously assigned condition.

At 16:00, after the retention period, the post-sleep battery of tasks was administered. This battery began with tests on the material learned before the nap. After a brief break, all participants completed the WM performance task at approximately 16:30. The WM performance task took approximately 20–40 min to complete with self-paced breaks throughout. Experimental sessions were scheduled 1 week apart with both sessions falling on the same day of the week to best control for other daily activities before or after the session (e.g., classes) that would also follow a repeating weekly schedule.

Data analysis

One participant was a statistical outlier (>3 SD) in both the number and the density of sleep spindles. To retain this case, their spindle scores were winsorized to 2.75 SD above the mean of the other participants, which preserved their place in rank order.
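A minimal Python sketch of this winsorizing step is shown below. The function name and the toy values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) of the other participants is an assumption, as the text does not state which estimator was used.

```python
import statistics


def winsorize_high_outlier(values, outlier_index, cap_sd=2.75):
    """Cap one high outlier at mean + cap_sd * SD of the remaining values."""
    others = [v for i, v in enumerate(values) if i != outlier_index]
    # Assumption: sample SD (n - 1 denominator) of the other participants.
    cap = statistics.mean(others) + cap_sd * statistics.stdev(others)
    adjusted = list(values)
    adjusted[outlier_index] = min(adjusted[outlier_index], cap)
    return adjusted


# Toy spindle densities; the last participant is the outlier.
densities = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
adjusted = winsorize_high_outlier(densities, outlier_index=5)
print(adjusted[5] == max(adjusted))  # True: still the highest-ranked score
```

Because the cap sits above every other score whenever the remaining values lie within 2.75 SD of their mean, the adjusted case keeps its position at the top of the rank order while its leverage on the model is reduced.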

To test the effects of the experimental manipulation on task performance, mixed-effects models were used to estimate the odds of WM success and error within successful reports at the trial-by-trial level. The odds of WM success refers to the odds of a report being within 25° of the target color, and the error within success refers to the distance in degrees that a successful report is away from the target color ranging from a perfect response (0°) to the maximum of 25°. The criterion of 25° was chosen as a compromise in the trade-off between capturing true low-fidelity memories as correct responses and excluding guesses that happen to fall under the criterion. To understand the effect of this decision, the main analyses were repeated using 20° and 30° as the correct response criteria, and these results did not differ qualitatively from those presented. Mixed-effects modelling was used because this method can be applied to the large trial-level dataset without the loss of information that comes from aggregating observations; it can account not only for individual variation in intercepts (similar to a repeated-measures ANOVA), but also for individual variation in experimental effects, leading to more precise parameter estimates; and it can handle unbalanced designs with non-independent observations (Baayen, Davidson, & Bates, 2008). The binary outcome of a WM success or failure was examined using a logistic mixed-effect model to predict the odds of a WM success, whereas the continuous outcome of error within success was predicted using a linear mixed-effect model. Each model was specified to predict the respective outcome from the fixed effects of trial, load, nap condition, and the interactions of these variables. 
Following the recommendations of Barr, Levy, Scheepers, and Tily (2013) for confirmatory hypothesis testing, we used a maximal random effects structure, which allowed for random intercepts for each participant and random slopes for the effects of load, condition, and the load-by-condition interaction. The models were fit to the data using maximum likelihood estimation available in the lme4 package for R (Bates, Maechler, Bolker, & Walker, 2015). Statistical significance, evaluated at α of .05, of the fixed effects of trial, load, and condition was tested using type III Wald χ² tests available in the car package for R (Fox & Weisberg, 2011) for logistic models and tested using type III F tests with Satterthwaite approximated degrees of freedom (Satterthwaite, 1946) for linear models available in the lmerTest package for R (Kuznetsova, Brockhoff, & Christensen, 2015). Estimated β coefficients were obtained and reported for each statistically significant model to understand the nature and magnitude of the effects. Nap condition was coded to reflect the effect of the nap relative to the control condition of not having a nap. Trial was centered such that a value of 0 represented the middle of the task and scaled by a factor of 0.01 such that a 1-unit change in trial represented a change of 100 trials. Given the three levels of WM load, load4-2 and load6-4 contrasts were specified to represent the change from load 2 to load 4 and the change from load 4 to load 6, respectively. The significance of these contrasts was tested using asymptotic Wald tests for logistic models and using t-tests with Satterthwaite approximated degrees of freedom for linear models. Estimated β coefficients from logistic regression are reported in both logit form and exponentiated form to reflect odds ratios for each effect. For interpretation of significant interactions, we examined conditional β coefficients estimated for one variable while holding other variables at specified levels. 
All figures illustrating significant effects and interactions were drawn from unadjusted, raw response data.
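The predictor coding described above can be illustrated with a short Python sketch. Two details here are assumptions rather than statements from the text: the exact center of the 300-trial task is taken as 150.5, and the load contrasts are represented with successive-difference coding (the scheme produced by, for example, `contr.sdif` in the MASS package for R), which is one common way to obtain coefficients that equal the load 4 minus load 2 and load 6 minus load 4 differences. The names `code_trial` and `SDIF` are hypothetical.

```python
def code_trial(trial: int, n_trials: int = 300) -> float:
    """Center trial number on the middle of the task and scale by 0.01."""
    center = (n_trials + 1) / 2  # assumption: 150.5 for a 300-trial task
    return (trial - center) * 0.01


# Successive-difference codes for the three loads (assumed coding scheme).
# Columns: (load 4 vs. 2, load 6 vs. 4).
SDIF = {
    2: (-2 / 3, -1 / 3),
    4: (1 / 3, -1 / 3),
    6: (1 / 3, 2 / 3),
}

print(round(code_trial(1), 3), round(code_trial(300), 3))  # -1.495 1.495
# Moving from load 2 to load 4 changes only the first contrast, by one unit.
print(round(SDIF[4][0] - SDIF[2][0], 6), round(SDIF[4][1] - SDIF[2][1], 6))  # 1.0 0.0
```

Under this coding, each contrast coefficient is directly interpretable as the change in the outcome between adjacent loads, and each column sums to zero so the intercept reflects the average across loads.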

To examine whether there were properties of sleep physiology or other individual differences that predicted WM performance on their own or moderated other effects, variables of interest were selected to include in subsequent mixed-effect models. These variables of interest included WMC, preferred napping frequency, sleep spindle density, TST, SWS duration, REM sleep duration, both frontal and occipital NREM delta (0.5–4 Hz) power, and central NREM fast sigma (14–16 Hz) power, and each was centered prior to inclusion in the models. New models were created that built upon the previous experimental effects models (i.e., models with trial, load, nap condition, and their interactions) by adding one of these predictor variables and all the associated interaction terms up to the four-way interaction with trial, load, and nap condition. Models investigating a measure of EEG power as a predictor were conducted with and without participant sex included in the model to statistically control for sex differences in EEG power (Dijk, Beersma, & Bloem, 1989). The overall patterns of results did not differ between these methods and only the results for the models not including sex are reported. Due to the reduced degrees of freedom from only including trials with successful WM reports, in the error analyses, the random effects structure included only the random intercept for each participant and the random slopes for the condition and load effects, and not the random slope for the load-by-condition interaction. Rather than testing every predictor and interaction term in each of these models (hereafter, additional predictor models), likelihood ratio tests comparing each additional predictor model to its corresponding experimental effects model were performed to determine whether the inclusion of a predictor significantly improved the goodness of fit. 
If the inclusion of a predictor did significantly (p < .05) improve the goodness of fit, the additional predictor model was further explored using the same strategy as described above for examining the effects and interactions in the experimental effects models.

Event-related delay activity, FDA and ODA, was analyzed to quantify neural response to stimuli in the WM task. Mixed-effect linear models were fit to predict individual FDA and ODA averages computed over all accepted trials and grouped by nap condition and load. Separate experimental effects models predicted FDA and ODA averages from the fixed effects of load and nap condition and allowed random intercepts for each participant and a random effect of nap condition. Models were analyzed using the same method as the linear models predicting hit error in WM performance. To better understand these ERPs and how they might relate to behavior, both FDA and ODA were examined as potential predictors of WM performance. FDA and ODA were examined in relation to aggregate scores reflecting the percentage of trials with WM success and the average error (i.e., distance from the correct color) among these hits. For these analyses, FDA and ODA were standardized to z-scores within each combination of load and nap condition to reflect individual differences in delay-activity amplitude. Analyses were conducted with mixed-effect models predicting these outcomes from the fixed effects of load, nap condition, FDA, ODA, the FDA by load interaction, and the ODA by load interaction with random intercepts allowed for each participant and a random slope for load and nap condition.

Results

Sleep data

Descriptive statistics regarding properties of sleep obtained during the 90-min nap opportunity are reported in Table 1.

Table 1. Descriptive Statistics of Working Memory Capacity, Preference for Napping, and Properties of the Nap

Working memory performance

Odds of WM success

The odds of a WM success (i.e., a report within 25° of the target color) were modelled using a mixed-effect logit model with the fixed effects of trial, load, and nap condition. Estimated ß coefficients in log odds and exponentiated forms for each fixed effect term in the model are reported in Table 2. For reference, the intercept of the model was not significantly different from zero, χ2 = 0.21, p = .647, indicating approximately equal odds of success and failure in WM when controlling load, condition, and trial. The main effect of trial was significant, χ2 = 6.02, p = .014, reflecting a trend for reduced WM success as the task progressed. Not surprisingly, the odds of WM success also varied significantly by WM load, χ2(2) = 859.63, p < .001. Investigation of the ß coefficients for the contrast terms revealed that the odds of WM success were significantly lower for load 4 than load 2, z = -22.35, p < .001, and significantly lower for load 6 than load 4, z = -14.42, p < .001, with respective odds ratios of 0.27 and 0.53 indicating that the odds of a success were 3.68 times greater on trials of load 2 than trials of load 4, and 1.89 times greater on trials of load 4 than trials of load 6. In testing the research question of whether a nap would increase the odds of WM success, it was found that the main effect of nap condition was not significant, χ2 = 2.70, p = .100, but there was a significant nap condition by trial interaction, χ2 = 6.13, p = .013. While holding trial constant at the start of the task, the nap was estimated to have little effect on the odds of WM success, ß = -0.03 at trial 1, but by the end of the task, the odds of WM success were estimated to be 1.26 times greater in the nap condition compared to control, ß = 0.23 at trial 300. These results are illustrated in terms of the percentage of successful reports observed for each trial in Fig. 2. The trial-by-load interaction, χ2(2) = 0.62, p = .734, load-by-condition interaction, χ2(2) = 0.16, p = .922, and three-way interaction, χ2(2) = 0.32, p = .852, were all non-significant.
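The conversions between log-odds coefficients and odds ratios used in this section can be checked with a brief sketch. The function names are illustrative, and the linear interpolation of the nap effect between trials 1 and 300 is an assumption that follows from a nap-by-trial interaction that is linear on the log-odds scale. Because the inputs are the rounded values printed in the text, the results can differ slightly from the reported figures (e.g., the reciprocal of 0.27 is about 3.70, whereas 3.68 is reported, presumably from unrounded estimates).

```python
import math

def odds_ratio(beta):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(beta)

def invert_odds_ratio(ratio):
    """Express an odds ratio below 1 as 'times greater' in the other direction."""
    return 1.0 / ratio

def nap_log_odds(trial, beta_trial_1=-0.03, beta_trial_300=0.23):
    """Nap effect (log odds) at a given trial, assuming a linear trial interaction."""
    slope = (beta_trial_300 - beta_trial_1) / (300 - 1)
    return beta_trial_1 + slope * (trial - 1)

print(round(invert_odds_ratio(0.53), 2))  # load 4 vs. load 6
print(round(invert_odds_ratio(0.27), 2))  # load 2 vs. load 4
print(round(odds_ratio(nap_log_odds(300)), 2))  # nap vs. control at trial 300
```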

Table 2. Regression Coefficients for Fixed Effects Predicting Odds of Working Memory Success
Fig. 2

Percentage of reports within 25° of the target color in the working memory performance task. (a) Data are grouped by working memory load. (b) Data are grouped by trial and condition. Linear fits are represented with solid lines with shaded areas displaying 95% confidence intervals around the linear fit. Dashed lines represent a loess fit line plotting the average success percentage over a moving window of 30

To determine whether any of the potential predictor variables were associated with WM success, the experimental effects model (i.e., predicting the odds of WM success from trial, load, condition, and their interactions) was compared to each of the additional predictor models (i.e., predicting the odds of WM success from trial, load, condition, one of WMC, preferred napping frequency, spindle density, TST, SWS duration, REM sleep duration, NREM frontal delta power, NREM occipital delta power, or NREM central fast sigma power, and all possible interactions). Preferred napping frequency, χ2 = 7.49, p = .824, spindle density, χ2 = 5.71, p = .930, TST, χ2 = 10.60, p = .564, SWS duration, χ2 = 10.77, p = .549, REM sleep duration, χ2 = 9.50, p = .660, NREM frontal delta power, χ2 = 13.97, p = .303, NREM occipital delta power, χ2 = 14.41, p = .273, and NREM central fast sigma power, χ2 = 11.27, p = .506, all failed to significantly increase the goodness of fit over the experimental effects model, although inclusion of WMC, χ2 = 35.04, p < .001, did significantly improve the goodness of fit. There was a significant main effect of WMC, χ2 = 25.32, p < .001, such that higher WMC was, not surprisingly, associated with greater odds of reporting within 25° of the target color; a 1-unit increase in WMC (z-score of 1.06) was associated with 1.28 times greater odds of WM success controlling for trial, load, and nap condition, ß = 0.25.

Error within success

Estimated ß coefficients for each fixed-effect term in the mixed-effect linear model predicting error within success (i.e., distance between report and target color given WM success) from trial, load, nap condition, and their interactions are reported in Table 3. There was a significant effect of load, F(2, 99) = 30.14, p < .001, on error when controlling for trial and condition. Examination of the significant load4-2 and load6-4 contrasts revealed that error was estimated to be 0.84° greater in successful load 4 trials compared to successful load 2 trials and 0.48° greater in successful load 6 trials compared to successful load 4 trials. A nap was found to significantly improve accuracy of hit responses (less error) relative to control, F(1, 432) = 5.75, p = .017, with an estimated error reduction of 0.35° associated with the nap. Figure 3 illustrates these effects with estimated probability density functions displaying the relative frequency of error distances (0–25°) given WM success on a trial of a given load (a) or condition (b). The main effect of trial, F(1, 10736) = 0.26, p = .612, the trial-by-load interaction, F(2, 10681) = 0.78, p = .456, the trial-by-condition interaction, F(1, 10726) = 0.42, p = .515, the load-by-condition interaction, F(2, 106) = 0.03, p = .973, and three-way interaction, F(2, 10668) = 0.14, p = .870, were all found to be non-significant. In addition, goodness of fit of the model predicting error within success was not significantly improved by the inclusion of WMC, χ2(12) = 14.69, p = .259, preferred napping frequency, χ2(12) = 13.25, p = .351, TST, χ2(12) = 11.47, p = .489, SWS duration, χ2(12) = 5.92, p = .920, REM sleep duration, χ2(12) = 8.53, p = .743, spindle density, χ2(12) = 11.94, p = .451, NREM frontal delta power, χ2(12) = 9.28, p = .679, NREM occipital delta power, χ2(12) = 13.09, p = .363, or NREM central fast sigma power, χ2(12) = 11.81, p = .461.

Table 3. Regression Coefficients for Fixed Effects Predicting Error within Trials of Working Memory Success
Fig. 3

Estimated probability density functions depicting the distribution of error within trials of working memory success. Data grouped by (a) load and (b) nap condition. Density values reflect the relative frequency of reports being a given distance (0–25°) from the target color

Event-related delay activity

Both FDA and ODA were averaged over all non-rejected trials and grouped by load, nap condition, and participant. Mixed effect linear models with fixed effects of load and condition, random intercepts for each participant, and a random slope for load and condition were used to model average amplitude of delay activity separately for FDA and ODA. Grand average ERPs grouped by load and nap condition are shown in Fig. 4a. Respectively for loads 2, 4, and 6, the mean (SD) number of trials constituting the individual participant averages was 56.8 (25.7), 55.6 (25.6), and 57.0 (24.4) for the control condition and 66.9 (27.5), 67.7 (26.5), and 68.1 (29.0) for the nap condition.

Fig. 4

(a) Grand average event-related potentials for all levels of load and nap condition averaged over the frontal (F3, Fz, F4; upper panels) and occipital (PO7, Oz, PO8; lower panels) sites. Dotted lines mark boundaries of windows defining frontal delay activity (FDA) and occipital delay activity (ODA). (b) Average amplitude of frontal delay activity (FDA) by levels of load and nap condition. (c) Average amplitude of occipital delay activity (ODA) by levels of load and nap condition

Frontal delay activity (FDA)

Estimated ß coefficients for the fixed effects predicting average FDA amplitude are reported in Table 4. There was a significant main effect of load, F(2, 127) = 8.27, p < .001, and the main effect of nap condition was non-significant, F(1, 30) = 0.38, p = .543. Importantly, there was also a significant load-by-nap condition interaction, F(2, 127) = 3.70, p = .028. Both the condition by load4-2 contrast and condition by load6-4 contrast interactions were significant, indicating that the pattern of change in FDA amplitude between loads differed by nap condition. Analyses of the differences in estimated least squares means revealed that, in the nap condition, there was no significant difference in average FDA between loads 2 and 4, t(127) = 0.01, p = .990, but FDA was significantly more negative on average for load 6 compared to load 4, t(127) = -3.12, p = .002. In the control condition, this change in FDA averages was observed in the difference between load 4 and load 2, t(127) = -3.04, p = .003, but there was no further change as load increased from 4 to 6, t(127) = 0.42, p = .672. This pattern is depicted in Fig. 4b. Additional predictor models including WMC, χ2(6) = 7.08, p = .313, preferred napping frequency, χ2(6) = 5.03, p = .541, TST, χ2(6) = 4.56, p = .601, SWS duration, χ2(6) = 1.75, p = .941, REM sleep duration, χ2(6) = 1.93, p = .926, spindle density, χ2(6) = 4.74, p = .578, NREM frontal delta power, χ2(6) = 4.27, p = .641, NREM occipital delta power, χ2(6) = 3.59, p = .732, or NREM central fast sigma power, χ2(6) = 2.88, p = .824, did not significantly improve the goodness of fit over the model including only load, nap condition, and their interaction as fixed effects.

Table 4. Regression Coefficients for Fixed Effects Predicting Frontal Delay Activity Averaged over Trials by Load and Nap Condition

Occipital delay activity (ODA)

Estimated ß coefficients for the fixed effects predicting average ODA amplitude are reported in Table 5. There was both a significant main effect of load, F(2, 128) = 69.22, p < .001, and a significant main effect of nap condition, F(1, 32) = 4.57, p = .040, but a non-significant load-by-condition interaction, F(2, 128) = 0.25, p = .783. ODA average was significantly more negative for load 4 relative to load 2 trials by an estimated 1.27 μV and significantly more negative still for load 6 relative to load 4 by an estimated 0.34 μV. ODA was more negative with increasing load, but a nap was associated with less negative ODA on average by an estimated 0.57 μV. Average ODA by each level of load and nap condition is depicted in Fig. 4c. Again, additional predictor models including WMC, χ2(6) = 7.62, p = .268, preferred napping frequency, χ2(6) = 3.36, p = .732, TST, χ2(6) = 4.74, p = .578, SWS duration, χ2(6) = 7.68, p = .263, REM sleep duration, χ2(6) = 4.63, p = .592, spindle density, χ2(6) = 12.18, p = .058, NREM frontal delta power, χ2(6) = 6.52, p = .368, NREM occipital delta power, χ2(6) = 8.37, p = .212, or NREM central fast sigma power, χ2(6) = 11.64, p = .071, did not significantly improve the goodness of fit over the model including only load, nap condition, and their interaction as fixed effects. Although it did not hold up to the conservative testing approach of checking for improved goodness of fit, it is worth noting that the nap condition by occipital delta power interaction term was significant, ß = 0.92, se = 0.42, t(33) = 2.19, p = .035, suggesting that the effect of a nap in making task ODA more positive was strongest for those with greater delta power over occipital sites in NREM sleep.

Table 5. Regression Coefficients for Fixed Effects Predicting Occipital Delay Activity Averaged over Trials by Load and Nap Condition

Relationship between delay activity and performance

Given the observed patterns of both FDA and ODA averages varying according to load and nap condition, analyses were conducted to explore whether event-related delay activity related to WM performance, which would assist in the interpretation of these neurophysiological effects. Models were specified to predict the percentage of successful reports and average error within success for each combination of load and condition from the fixed effects of load, condition, FDA, ODA, and the interactions of these ERPs with load. For WM success, the main effects of FDA, F(1, 151) = 1.83, p = .178, and ODA, F(1, 116) ≈ 0.00, p = .980, were not significant, but the load-by-FDA interaction was a significant predictor, F(2, 76) = 5.06, p = .009. In examining the contrasts, it was found that the interaction was driven by the FDA interaction with the load6-4 contrast. This interaction is depicted in Fig. 5 with average percentage of WM successes plotted against average FDA for each WM load. From these data, it appears that more positive FDA was associated with more WM success at loads 2 and 4, but not load 6. At loads 2 and 4, a 1 standard deviation increase in FDA was estimated to increase the success percentage by 1.81 and 1.82 points, respectively, though at load 6, the same increase in FDA was estimated to decrease success percentage by 0.81 points. The interaction between load and ODA was not significant in predicting WM success percentage, F(2, 65) = 0.42, p = .662. Regarding average error within success, neither average ODA, F(1, 90) = 3.49, p = .065, nor average FDA, F(1, 124) = 0.83, p = .366, was a significant main effect, though the main effect of average ODA was marginally significant, with a 1 standard deviation increase in ODA associated with a 0.17° increase in average error. The load-by-FDA interaction, F(2, 102) = 1.29, p = .279, and the load-by-ODA interaction, F(2, 82) = 0.07, p = .934, were also non-significant in predicting average error within success.

Fig. 5

Percentage of reports within 25° of the target color plotted as a function of average frontal delay activity (standardized within each load and condition) for all working memory loads

Discussion

In the current study, we sought to further examine the role of sleep in WM performance by employing a task that measures both the ability to hold and report visual representations and the fidelity of those reports. WM performance was compared between nap and wake conditions, and we investigated whether WMC or properties of sleep physiology related to performance or moderated the effects of the nap. ERPs obtained from the period in which items were held in WM were examined to better understand the neurophysiological underpinnings of the observed effects. As predicted, WM performance was greater after the 90-min nap opportunity compared to the control condition. Interestingly, the nap condition performance was greater for both the ability to report the approximate color of the probed item and the fidelity of those reports. With respect to potential moderators, it was predicted that the nap would be most beneficial for those with low WMC. This pattern, however, was not observed, and further, none of the measured properties of the nap were found to predict the performance benefits of the nap.

Napping increased the likelihood of responding within 25° of the target, specifically for the later trials of the task. The ability to hold and retrieve items in visual WM was similar on nap and no-nap days near the start of the task, but performance in the two conditions deviated as the task continued (Fig. 2b). On days with a nap, participants maintained their performance over the course of the task, whereas on days without a nap, performance degraded over time. By the end of the task, the estimated effect of a nap on the odds of a successful WM report, ß = 0.23, was of similar strength to the overall effect of a one-item increase in WMC, ß = 0.25. This pattern is consistent with the reports of a nap benefiting accuracy in a spatial N-back task (Lau et al., 2015). Although an overall benefit of the nap to WM performance was observed by Lau et al. (2015), block-by-block analyses showed that the nap only had significant benefits in the second and third blocks of the task and not the first. This pattern of results may reflect progressive decreases in attentional control in the wake conditions. Consistent engagement of attentional control mechanisms is required in tasks of visual WM to prevent errors from complete attentional lapses and more subtle causes such as insufficient individuation of items or swapping item positions (Adam, Mance, Fukuda, & Vogel, 2015). Moreover, neural measures of attentional control also predict WMC directly (Emrich & Busseri, 2015). Both sleep deprivation and sleep restriction are known to result in deficits to sustained attention and executive control of attention (Durmer & Dinges, 2005), and such deficits may account for impaired WM under conditions of sleep loss (Frenda & Fenn, 2016). 
One view of performance deficits following sleep loss is that they are driven by use-dependent, bottom-up mechanisms such that activity in neural populations over wakefulness leads to increased sleep need in those populations and subsequent occurrences of a local, sleep-like state in these populations that leads to impaired performance (Van Dongen, Belenky, & Kruger, 2011; Vyazovskiy et al., 2011). From this perspective, performance deterioration on the no-nap day may have been driven by the fact that participants had been awake for about 10 h by the end of the task and had already completed two attention-demanding memory tasks. In contrast, a nap opportunity mid-day may have protected against these effects by alleviating sleep need incurred throughout the day.

A novel contribution of the current study to the investigation of sleep and WM was the ability to measure the fidelity of the memory representations beyond the ability to simply report an item or not. The fidelity of visual WM reports is thought to be explained by the proportion of a shared attentional resource allocated to items (Bays & Husain, 2008; Dube, Emrich & Al-Aidroos, 2017; Emrich, Lockhart, & Al-Aidroos, 2017). In the current study, and others (e.g., Zhang & Luck, 2008), greater fidelity at lower WM loads results from having a greater proportion of resources available for each item. In both the nap and control conditions, participants likely allocated their attention across all items, but it is possible that after a nap, participants had greater control over their attentional resources, allowing for a more effective distribution of those resources and greater fidelity across all loads. However, if a progressive decline in attentional control in the wake condition underlies the differences in WM success between nap and wake (discussed above), one might expect the fidelity difference between conditions to have a similar pattern if it too was the result of differences in executive attention. The fact that nap-day accuracy was greater than control over the whole task and not only in the later trials suggests a separate mechanism beyond the control of attentional resources may contribute to this effect.

A second explanation for greater fidelity in the nap condition is that the nap resulted in a beneficial reduction in neural noise (random firing) in the visual system. The contents of visual WM can be decoded from patterns of neural activity in early visual cortices using multivoxel pattern analysis (Harrison & Tong, 2009), and the performance of this decoding method is correlated with estimates of memory precision (Emrich, Riggall, LaRocque, & Postle, 2013), suggesting that signal-to-noise ratio in the early visual cortices has a direct relation to the accuracy of reports from visual WM. Bays (2014) supports this notion that neuronal noise accounts for error in visual WM by presenting a population coding model that accurately models human behavior. Although this model suggests that increases in a multiplicative gain factor can largely counter the effect of increased stimulus-independent firing (i.e., noise), small changes in the precision of WM representations from increased noise are predicted. Importantly, firing rates of cortical neurons have been shown to be reduced after sleep (Vyazovskiy et al., 2009), so it is reasonable to suggest such a mechanism may be responsible for the nap benefit to fidelity observed in the current study.

Exploration of ERPs measured over the delay period revealed interesting results that suggest differences in neural processing resulting from the nap. FDA, measuring the average amplitude across frontal sites, varied by load and nap condition with the key difference between conditions found at load 4 (Figure 4b). After the nap, FDA at load 4 was relatively low (more negative) and like the FDA observed at load 2 in both conditions. In contrast, FDA at load 4 was relatively high in the waking control condition and like the FDA observed at load 6 in both conditions. Liesefeld et al. (2014) found event-related EEG amplitude at frontal sites in a visual WM task to vary in response to presence or absence of distractor items. Specifically, they found in distractor-present conditions a negative-going peak about 245 ms post-stimulus, and (although not analyzed in their report), a pattern of more positive amplitude on average over the second half of the delay period. Importantly, the activity associated with filtering processes has been shown to predict WMC independent of the presence of distractors, and it has been proposed that such activity may index the efficient processing of target information (Emrich & Busseri, 2015). Thus, in the current study, although no distractors were present, variations in FDA between load and condition may reflect demands associated with effective allocation of attentional resources.

In terms of the population coding model of visual WM (Bays, 2014), FDA observed here may reflect the strength of the gain factor working to boost the signal of the neurons coding for the items in memory. When multiple items are present, attentional resources are spread amongst these items, which may result in the signal coding for each item being lower in load 6 trials than load 4 trials and still lower in load 4 trials compared to load 2 trials. For load 6, a large gain factor may be needed to boost the small signals over the level of neural noise to obtain a functional signal-to-noise ratio, whereas for load 2, such a large gain factor would not be required. As mentioned previously, the nap may have generally reduced the level of neural noise in visual cortices, resulting in small reductions in error. Importantly, the magnitude of the gain factor required is inversely related to the level of neural noise (Bays, 2014). Thus, a general reduction in neural noise may also account for the divergence in FDA between nap and control conditions for load 4 trials. Under normal (wake control) conditions, a large gain factor may be required to obtain a functional signal-to-noise ratio; however, if neural noise is reduced after the nap condition, similar performance could be obtained without as large a gain factor. The view that FDA may represent a beneficial gain factor is indeed supported by the pattern for higher average hit percentage at WM loads 2 and 4 to be associated with greater FDA. For loads 2 and 4, those able to engage the gain factor more reliably would be more likely to respond within 25° of the target color. This relationship was not present for load 6, for which even a high gain factor may not have been enough to reliably overcome the poor signal when operating in high load conditions. Finally, Edin et al. (2009), using functional magnetic resonance imaging, found evidence of a boost signal present in the dorsolateral prefrontal cortex (dlPFC) and associated with improved WM performance at high loads; thus, the dlPFC may be considered a potential generator for the FDA observed in the current study.

ODA in the current study was more positive with lower WM loads, suggesting that more negative ODA is associated with greater demands on visual memory. Along this line of thought, the fact that a nap was associated with more positive ODA relative to control across all loads suggests that the nap may benefit visual WM by generally reducing demands on the visual memory system. Individual differences in ODA did not significantly predict the proportion of hit responses or average hit error within levels of condition and load. Thus, it is difficult to presume what, if any, effect may come from a nap-related positive shift in ODA. It may be that greater demands on the visual memory system do not directly affect performance, but do lead to greater strain or fatigue over time, and a nap helps delay these consequences. One limitation for interpreting the ERPs in the current study is that memory load (i.e., the number of colored squares to remember) increased concurrently with visual load (i.e., the number of colored squares viewed). Thus, it cannot be determined whether the ERPs measured reflect the visual demands or the memory demands of the task. To avoid this conflation, future research investigating the role of sleep in visual WM should control for visual demand by utilizing lateralized ERP components, such as the SPCN, obtained by cueing attention to one side of a symmetric display during encoding and then subtracting ipsilateral activity from contralateral activity (Jolicœur et al., 2008; Vogel & Machizawa, 2004).

One purpose of this investigation was to examine what individual differences or properties of the nap might moderate the effects of the nap. It was predicted that those lowest in WMC would benefit the most from the nap. Although low WMC was associated with fewer hit responses, particularly in later trials, there were no significant interactions between WMC and nap condition. The distribution of WMC scores in our sample of 36 individuals was approximately normal and ranged from 2.0 to 5.2 items; thus, a restricted range in WMC was likely not an issue. Our WMC measures were taken during an orientation session and not during the main protocol. Although WMC is expected to be a stable measure under normal conditions, it is possible that WMC measured before the nap would have been a better predictor of performance and would have shown an interaction with condition. It was also predicted based on Lau et al. (2015) that TST and REM sleep duration would predict sleep-related gains in WM performance. These interactions were also not observed in the data, as none of TST, REM sleep duration, SWS duration, sleep spindle density, NREM delta power, or NREM fast sigma power were found to significantly improve the models predicting trial-by-trial performance. Again, considerable ranges for these variables were observed in the 90-min nap opportunity. In our study, napping produced performance benefits relative to control regardless of the specific makeup of the nap. Larger samples may be required to reliably identify the properties of sleep associated with performance improvement, and the relationships may be more apparent when WM is examined following a full night of sleep.

The analytic approach of the current study required a criterion by which a response would be classified as a WM success. If a very permissive criterion were used or error were modelled for all trials without a criterion, the analysis may be unable to detect relatively small changes in fidelity of memories given the large amount of error variance coming from responses that were only guesses resulting perhaps from lapses of attention. In contrast, a very restrictive criterion would decrease the validity of our WM success classification given that many true but relatively low-fidelity memories would be excluded. We chose a criterion of 25° after identifying it as the typical distance needed between two points on the color wheel for these two points to be subjectively categorized by us as different colors (e.g., light blue vs. blue-green). Although chosen somewhat arbitrarily, the ±25° criterion is in line with results from a study of the categorical naming of colors in which the range of hues consistently assigned to a specific color category covered, on average, approximately 50° of the color wheel (Bae, Olkkonen, Allred, & Flombaum, 2015). The results of our analyses did not qualitatively differ when 20° and 30° were used as the criterion of WM success, but other criteria for determining WM success may affect the results and their interpretation. Having participants report guesses or lapses of attention to exclude these trials from analyses of fidelity is one alternative to consider for future research.
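The role of the success criterion can be made concrete with a small sketch. It assumes a 360° color wheel, treats a report exactly at the criterion distance as a success (the text does not state whether the boundary is inclusive), and uses made-up report and target angles rather than study data; the function names are illustrative.

```python
def circular_error(report_deg, target_deg):
    """Smallest angular distance (0 to 180 degrees) between two color-wheel positions."""
    diff = abs(report_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)

def success_rate(trials, criterion_deg):
    """Proportion of (report, target) pairs whose error is within the criterion."""
    hits = sum(circular_error(r, t) <= criterion_deg for r, t in trials)
    return hits / len(trials)

# Hypothetical (report, target) angles in degrees; not study data.
trials = [(10, 350), (100, 128), (200, 20), (45, 50), (300, 322)]

# Sensitivity check across the three criteria mentioned in the text.
for criterion in (20, 25, 30):
    print(criterion, success_rate(trials, criterion))
```

Repeating the analysis at neighbouring criteria, as in the loop above, is one way to check that conclusions do not hinge on the exact cut-off.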

An important point to consider in interpreting the results of this study is the recent sleep and activity history of participants and the limited information obtained regarding this history. All participants were screened to be healthy individuals free of sleep disorder and excessive daytime sleepiness who typically slept 7–9 hours starting between 22:00 and 0:00 each night, and all participants reported sleep from approximately 23:00 to 7:00 on the night before their experimental sessions. Despite these controls, given that an estimated 60% of college students could be classified as poor sleepers (Lund, Reider, Whiting, & Prichard, 2010), it is possible that some participants carried sleep deficits that could impair performance on the demanding visual WM task. Furthermore, playing the tabletop games with a researcher for 1.5 h longer in the wake condition than in the nap condition may also contribute to relatively impaired WM performance through increased sleepiness or a state of mental fatigue. Thus, although one may view the results of the current study as a nap leading to improved performance over a baseline level, an alternative explanation is that the nap relieved an underlying sleep need, returning participants’ performance to their hypothetical baseline level. Considering that no sleep variables were found to be significant predictors of sleep-related WM improvement, it may be that composition of participants’ naps was naturally personalized to their specific sleep needs, which varied between individuals. To better control for or understand the potential effects of recent sleep and activity history, future studies should collect detailed sleep and wake diaries or actigraphy from participants.

In conclusion, the current study contributes to the understanding of the role of sleep in WM by showing further evidence of visual WM performance benefits following an afternoon nap. With the novel use of a continuous-report delayed-estimation task, it was shown that the nap improved both the odds of successful WM storage and retrieval as well as the accuracy of successful WM reports. The nap-associated benefit to the odds of WM success was particularly evident for later trials, implicating improved sustained attention as a potential mechanism, whereas the benefit to accuracy was more general in nature. Further research to replicate and expand upon the novel ERP evidence presented here may further clarify how sleep alters neurophysiological processing during visual WM and how these changes impact performance. While the current study improves our understanding of how WM performance can be improved by an afternoon nap, given the ubiquity of sleep loss and the importance of WM in many cognitive functions, further research into the functional role of sleep in WM performance is warranted.