Introduction

Aging research has proliferated in the past decade, particularly in the context of cognitive intervention and enhancement (Hertzog et al. 2008; Lustig et al. 2009; Zhu et al. 2016). Numerous studies have shown that single-modality training methods, such as computerized cognitive training, or aerobic exercise, are associated with positive neural changes and enhanced cognition, particularly in the domains of executive functions (EFs) and memory (Colcombe and Kramer 2003; Hertzog et al. 2008; Lustig et al. 2009). More recent studies have examined the efficacy of combined training formats such as pairing exercise and cognitive training, either in concurrent or sequential formats (Lustig et al. 2009; Zhu et al. 2016). However, studies on combined training report mixed findings, prompting the direct comparison of sequential and concurrent formats in the present study. Additionally, individual difference factors, such as motivation and aerobic fitness, have rarely been examined as potential moderators of training efficacy (cf. Stine-Morrow et al. 2014; Voss et al. 2013) and are thus also considered in the present study.

Single-Modality Training

Substantial evidence has accrued in studies of single-modality training as compared to no-treatment or active control conditions. Computerized cognitive training studies have demonstrated some positive effects in older adults that range from the reduction of global cognitive decline to the preservation of specific cognitive processes such as executive functions, although such improvements have rarely led to the elimination of age differences in performance (Hertzog et al. 2008; Lampit et al. 2014; Lustig et al. 2009). Focused single-process training and, in particular, protocols that target EFs appear to hold more promise for transfer effects than protocols targeting other cognitive processes (Hertzog et al. 2008). For example, working memory training has led to significant transfer gains in EFs in both young and older adults (Baniqued et al. 2015; Brehmer et al. 2012; Morrison and Chein 2011; Rhodes and Katz 2017). Similarly, dual-task (DT) training has led to cross-modality near-transfer (Bherer et al. 2005, 2006, 2008; Desjardins-Crépeau et al. 2016; Erickson et al. 2007a, b; Lussier et al. 2015, 2016), up-regulation and increased efficiency in prefrontal regions in younger and older adults (Erickson et al. 2007a, b), and improved balance and mobility (Li et al. 2010). Such DT training is thought to improve processes such as task coordination and working memory (Bherer et al. 2005). Dahlin et al. (2008) demonstrated that the degree of transfer follows the principle of neural overlap, such that gains are greater when the trained task and outcome measures share neural substrates. This implies that not all cognitive outcome variables should show equivalent transfer effects but that the magnitude of improvement should depend on the relationship to the trained task.

Another commonly used single-modality training approach involves physical exercise or aerobic training (Bherer et al. 2013; Colcombe and Kramer 2003; Lustig et al. 2009). Exercise training has led to improved EFs, episodic memory, processing speed, and other cognitive processes in older adults (Nouchi et al. 2014). Neuroanatomical changes due to aerobic training have also been observed, such as increased gray and white matter volume in prefrontal and temporal cortices in humans (Colcombe and Kramer 2003; Erickson and Kramer 2009) and neurogenesis in the hippocampus in animal models (Duzel et al. 2016).

Multimodal Approaches: Combining Cognitive and Aerobic Training

A newer focus of cognitive training research compares different training formats in efforts to maximize cognitive gains using some combination of cognitive and exercise training. Combined formats are commonly compared with single-modality training to determine whether synergistic effects can be found, with mixed results (see Zhu et al. 2016 for a meta-analysis). A possible reason for these mixed results is the heterogeneity of training schedules: Most multimodal studies have employed a sequential training schedule, in which participants engage in exercise followed by cognitive training, or vice versa, either in the same sessions or in separate sessions. For example, cognitive training followed by aerobic training in the same session produced significantly greater gains on EFs and verbal episodic memory when compared to cognitive training alone (Rahe et al. 2015; see also Oswald et al. 2006). By contrast, another sequential training protocol administered on different days led to comparable gains when compared to single-modality cognitive training on measures of EFs and episodic memory (Shatil 2013). Desjardins-Crépeau et al. (2016) also failed to find synergistic effects of sequential multimodal training utilizing Bherer et al.’s (2005) DT training protocol and moderate-intensity aerobic training administered on separate days.

Studies using simultaneous multimodal training schedules (during which both training components are given concurrently within the same training day) have also produced mixed results. For example, Theill et al. (2013) paired a verbal working memory task with concurrent moderate-intensity aerobic exercise. Both the multimodal and pure cognitive training groups showed the same degree of cognitive improvement with the exception of a task of visual memory, which improved more in the multimodal group. Hiyamizu et al. (2012) contrasted combined simultaneous training (cognitive plus strength and balance training) with a pure physical training group and found a significant advantage for the combined simultaneous training group. Pooling sequential and simultaneous training studies in a meta-analysis, Zhu et al. (2016) concluded that multimodal training is superior to pure aerobic training and active control conditions in conferring cognitive gains but is comparable to pure cognitive training. However, the scarcity and heterogeneity of multimodal training studies suggest that this conclusion may be premature. We therefore aimed to examine one aspect of multimodal training schedules, namely the comparison of sequential vs. simultaneous training, delivered within each training session.

There are pros and cons associated with both schedules of training: Simultaneous training helps to avoid monotony and reduce training time, and acute bouts of exercise have been associated with small (but significant) positive effects on cognitive performance either during or just after the exercise (e.g., Chang et al. 2012). However, simultaneous multimodal training can be conceptualized as cognitive-motor dual-tasking, which in many studies shows sizeable DT costs in older adults (e.g., Verhaeghen et al. 2003). The combination of cognitive and motor tasks is particularly detrimental to cognitive performance for older adults, due to the tendency to prioritize motor performance and incur DT costs in the cognitive domain (Li et al. 2001). Along similar lines, Labelle et al. (2013) found that participants committed significantly more errors on a Stroop test while cycling when the exercise intensity was increased from 60 to 80% of their peak power output (PPO). For these reasons, simultaneous exercise may not be an effective way to deliver multimodality training in older adults.

Determinants of Cognitive Improvements Due to Training: Moderating Factors

Another under-studied dimension of cognitive training research is the potential for individual difference factors to influence the magnitude of training gains. For example, Stine-Morrow et al. (2014) found that motivation to engage in cognitively challenging activities, as measured by the Need for Cognition (NFC) questionnaire (Cacioppo et al. 1996), was a moderator of training gains. Interestingly, however, the association was negative, such that participants who scored higher on NFC showed smaller training gains than those who scored low on NFC (Hess et al. 2012). Similarly, although aerobic fitness is positively associated with cognitive abilities (Bherer et al. 2013; Colcombe and Kramer 2003), baseline aerobic fitness has rarely been investigated as a potential moderator of training-related change (cf. Burzynska et al. 2017; Voss et al. 2013). This association is paralleled by the finding that fit older adults show less cognitive decline than their sedentary counterparts and that physical exercise can improve cognitive functions (Bherer et al. 2013; Colcombe and Kramer 2003).

The Current Study

The current study was designed to address the primary question of whether the magnitude of cognitive gains differs as a function of training format (simultaneous vs. sequential) when healthy older adults undergo multimodal cognitive and exercise training. Our approach was to use a well-established DT training task (Bherer et al. 2005) and combine it with moderate-intensity aerobic exercise (Bherer et al. 2013). In recent work (Desjardins-Crépeau et al. 2016), this same combination was delivered sequentially and compared with an active control condition (computer lessons plus stretching). Notably, no cognitive gains were observed for this active control combination, leading us to simplify our design to the direct comparison of the two training schedules. We included a range of EF and memory outcome measures in keeping with the reviewed literature. A secondary question was whether individual differences in motivation to engage in cognitively challenging activity and aerobic fitness status would influence the magnitude of cognitive gains.

Across both training formats, we anticipated that older adults would show significant cognitive gains based on the foregoing review. From the array of cognitive outcome measures chosen (measuring response inhibition, switching, working memory, immediate and delayed recall), we anticipated that the training-related benefits would be more pronounced for outcome measures with the greatest functional and neural overlap with the trained dual task (Dahlin et al. 2008), namely measures of divided attention or working memory. Based upon the literature on age-related increases in cognitive-motor DT costs (Li et al. 2001), we hypothesized that sequential training would confer greater cognitive benefits than simultaneous training, due to the cognitive advantage of training under full attention. Based upon the reviewed individual difference literature (Stine-Morrow et al. 2014; Voss et al. 2013), we also expected that Need for Cognition and aerobic fitness level at baseline would be negatively associated with the magnitude of training-related gains in cognition.

Method

Participants

Older adults were recruited to compare the effects of two schedules of multimodal training (simultaneous vs. sequential) on measures of EFs and memory before and after training. A total of 85 participants were initially recruited from the community via notices and newspaper advertisements. Due to the eligibility criteria and participant attrition, 42 older adults (age range = 61–83 years, M = 68.05, SD = 4.65) completed the full study and contributed data. Eligible participants were free of chronic medical conditions such as cardiopulmonary or musculoskeletal diseases and were not on medications that could impair their cognitive and physical test performance. Participants were excluded if they failed the assessment of readiness to exercise using the PAR-Q+ (Physical Activity Readiness Questionnaire: Canadian Society of Exercise Physiologists, Thomas et al. 1992) and did not obtain approval from their physician, or if they scored less than 26/30 on the Montreal Cognitive Assessment (MoCA: Nasreddine et al. 2005). Participants were also excluded if they had uncontrolled hypertension, or if they could not commit to the training schedule. Forty-three participants did not meet the cognitive and/or physical prerequisites and were not included in the study. Written consent was obtained from all participants, and an honorarium of $300.00 was given for the completion of the study. All participants who qualified for the study were then randomly assigned to either the simultaneous or sequential training condition.

Materials

In addition to the tests described below, other outcome measures (balance, mobility, social engagement, hearing acuity) were administered for other purposes and will not be reported here.

Background Measures

Background measures used in the individual difference analyses included the Need for Cognition questionnaire (Cacioppo et al. 1996), in which participants’ tendency to engage in or enjoy cognitively effortful tasks was assessed using an 18-item questionnaire, with each item rated on a 5-point Likert scale. The maximum score one could obtain from the questionnaire is 90. The Jones Test (Jones et al. 1985) served to determine baseline aerobic fitness. Participants cycled on stationary bikes with increasing physical intensity until reaching 85% of their estimated maximum heart rate, from which sub-maximal VO2 was calculated. This information was used to set the physical workload for the training phase.

Training Tasks

Dual-Task (DT) Training

The DT training task (Lussier et al. 2015) was adapted for iPad use (MD785CL/B, iOS 8.2). The participants held the iPad in landscape orientation. Two tasks were performed either separately or simultaneously. One task required participants to discriminate fruits by pressing the corresponding button on the left edge of the tablet with their left thumb. The other task required discriminating vehicles with the right thumb. Stimuli measured 150 pixels² and were randomly presented above or below the focus point where the fixation cross appeared (a visual angle of approximately 3.17° from the cross at arm’s length). Figure 1 shows an example display. There were two types of blocks, pure (A) and mixed (B). In the pure blocks (A), participants were presented with single-pure trials, in which stimuli from the same category were presented throughout the block. In the mixed blocks (B), participants were presented with a random order of single-mixed and dual-mixed trials. In the single-mixed trials, a single stimulus from one of the two categories was displayed, while in the dual-mixed trials, two stimuli (one from each category) were displayed simultaneously and participants were instructed to respond to one stimulus after the other as quickly as possible. The training task was adaptive based on the participant’s performance in the previous block (Lussier et al. 2015).

Fig. 1 Dual-mixed trial of the DT training task

All participants performed the iPad training task in the same A_fruit A_vehicle BBBB A_fruit A_vehicle order. In each training session, participants completed two pure blocks (A), one for each category. Next, participants completed five mixed blocks (B) followed by two pure blocks (A). In each of the pure blocks (A), there were 12 single-pure trials, and in the mixed blocks (B), there were 30 single-mixed trials and 90 dual-mixed trials. Practice trials preceded each first occurrence of a block and were not included in the analyses. Mean response time and error rate (in percentage; number of errors/total trials per session) were the two primary outcome measures (analyses using the median yielded similar results). Feedback was shown at the end of each trial, and mean RT and accuracy per session were shown at the end of each training session. Although there was no minimum accuracy rate required, all participants maintained an error rate of < 5% even on the most difficult condition (dual-mixed), replicating Lussier et al. (2015). Each cognitive training session lasted approximately 30 min.

Physical Training

The aerobic training component involved recumbent cycling, chosen to minimize balance demands and allow participants in the simultaneous group to hold the iPad comfortably while cycling. Over the 12 training sessions, the physical workload incremented gradually, such that participants cycled at 40% of their baseline estimated maximum heart rate in Sessions 1–4, at 44% in Sessions 5–8, and 48% in Sessions 9–12, as determined by the Jones Test (Jones et al. 1985). Each training session began with a 5-min low-intensity warm-up on the bike, followed by 25-min cycling at the target heart rate and a 5-min cooldown. Heart rate was monitored throughout each physical training session. The participants then performed 5 min of stretching to minimize any potential muscle soreness.

Outcome Measures

The outcome measures were selected to represent a range of EFs and memory abilities. We considered the DT near-transfer task, which uses a similar protocol as the DT training task, and the working memory measure (Letter Number Sequencing) as sharing more neural overlap than the other measures (response inhibition, processing speed, immediate and delayed memory), which we considered as far-transfer outcome measures (Zelinski 2009).

Dual-Task Near-Transfer

We administered a variant of the DT training task with new response categories (animals and astral bodies) to assess near-transfer effects. The near-transfer assessment consisted of 48 single-pure trials, 68 single-mixed trials, and 204 dual-mixed trials and followed an A_animal A_astral BB A_animal A_astral format similar to the training sessions. All participants performed this near-transfer task while seated in a chair. Mean response times and errors per condition were derived for analyses.

Executive Functions and Memory

Letter Number Sequencing, a subtest of the WAIS-IV (Wechsler 2008), was included to assess working memory and requires participants to report increasing spans of letters and numbers re-arranged in numerical and alphabetical order. The dependent variable is the total number of correct sequences recalled by the participants (out of a possible 30). Digit Symbol Coding (Wechsler 2008) was included to measure processing speed. The number of correct responses completed in 120 s is the dependent measure, with a maximum possible score of 135. The Stroop task (MacLeod 1991) was included to estimate conflict resolution and response inhibition (Banich et al. 2000; Lezak 2004). The participants were first given 120 s to complete as many items as possible in the congruent condition, followed by another 120 s for the incongruent (color naming) condition. For each condition, speed was derived by dividing the number of correct responses by the completion time of the task (the maximum being 120 s). A ratio was obtained by dividing the incongruent condition speed by the congruent condition speed, with a higher ratio indicating better inhibition. The immediate and delayed memory subtests of the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS: Randolph et al. 1998) were given to assess episodic memory. Four subtests were included: immediate and delayed tests for words (maximum scores = 40 and 10, respectively) and story recall (maximum scores = 24 and 12, respectively). The dependent variable derived from these measures was the total number of items recalled from each subtest.
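The Stroop scoring described above can be illustrated with a minimal sketch. It assumes that speed is simply correct responses per second and that the completion time never exceeds the 120-s limit; the function names and the example scores are hypothetical and are not taken from the study.

```python
TIME_LIMIT_S = 120.0  # maximum completion time per condition, as stated in the text


def stroop_speed(correct_responses: int, completion_time_s: float) -> float:
    """Correct responses per second for one Stroop condition."""
    if not 0 < completion_time_s <= TIME_LIMIT_S:
        raise ValueError("completion time must be in (0, 120] seconds")
    return correct_responses / completion_time_s


def stroop_ratio(incong_correct: int, incong_time_s: float,
                 cong_correct: int, cong_time_s: float) -> float:
    """Incongruent speed divided by congruent speed; higher = better inhibition."""
    congruent_speed = stroop_speed(cong_correct, cong_time_s)
    if congruent_speed == 0:
        raise ValueError("congruent speed is zero; ratio is undefined")
    return stroop_speed(incong_correct, incong_time_s) / congruent_speed


# Hypothetical participant: 100 congruent and 60 incongruent items in 120 s each.
print(round(stroop_ratio(60, 120, 100, 120), 2))  # 0.6
```

Under this scoring, a participant whose incongruent speed matches the congruent speed would obtain a ratio of 1, and lower values reflect a larger interference cost.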

Procedure

During the initial evaluation session, the participants underwent cognitive testing. On a subsequent day, the physical assessment was administered and included the measurement of blood pressure, height, weight, heart rate, and sub-maximal VO2. Heart rate and sub-maximal VO2 were measured by a certified exercise physiologist while the participants were on recumbent bikes. Those participants deemed eligible to participate began the training phase in the following week under the supervision of an exercise physiologist. Those in the simultaneous group completed the cognitive and physical training at the same time and were told to give equal emphasis to both tasks, whereas those in the sequential group performed the DT training in a separate room, followed by the cycling. This fixed order was used to avoid fatiguing the participants prior to cognitive training. We note that participants in the sequential group received approximately double the contact time compared to the simultaneous group to ensure equal dosages of exercise and DT training. In both training conditions, participants were trained in groups of 3–4 individuals. Both groups completed 12 sessions of training over a 6-week period. Make-up sessions were offered to participants who were absent due to illness or scheduling conflicts, so that all participants who finished the study contributed complete data sets. Following the training phase, participants returned for the post-training assessment and repeated the same cognitive tests performed before training. The total completion time for most participants was 2 to 3 months.

Data Analysis

Data from 22 participants in the simultaneous group and 20 participants in the sequential group were analyzed. There were very few outliers (0–4 cases, defined as values more than 3.5 SD from the mean) on the neuropsychological tests, and these were winsorized for data analyses. Additionally, with the exception of Need for Cognition, all variables employed in the study had distributions that were considered normal (defined as skewness and kurtosis within ± 2). The reaction times from the iPad training task underwent outlier screening, in that trials with a reaction time of < 200 or > 4000 ms were identified and deleted. In total, 531 trials were deleted, and most of the deleted trials were from the first four training sessions with a random distribution among all participants (approximately 2–3 trials per training session per participant). The error rate variables from the iPad training task and the near-transfer task were derived by dividing the total number of errors by the total number of trials of the respective trial type and then multiplying by 100. For most of the outcome measures, a mixed-factorial ANOVA design was applied, with group as the between-subjects factor (simultaneous vs. sequential) and time (pre- vs. post-training) as the within-subjects factor.
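The reaction time screening and error rate calculation described above can be sketched as follows. This is an illustration only: the function names and the example values are hypothetical, and the sketch assumes that trials exactly at 200 or 4000 ms are retained.

```python
def trim_reaction_times(rts_ms, lower_ms=200.0, upper_ms=4000.0):
    """Drop trials with RT below lower_ms or above upper_ms (bounds are kept)."""
    return [rt for rt in rts_ms if lower_ms <= rt <= upper_ms]


def error_rate_percent(n_errors: int, n_trials: int) -> float:
    """Errors divided by total trials of a given trial type, times 100."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 100.0 * n_errors / n_trials


# Hypothetical block: two of five trials fall outside the accepted RT window.
print(trim_reaction_times([150, 200, 850, 4000, 4200]))  # [200, 850, 4000]
# Hypothetical count: 3 errors in 120 trials.
print(error_rate_percent(3, 120))  # 2.5
```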

Results

Baseline Characteristics

Descriptive statistics and group contrasts for all baseline measures are presented in Table 1. The two training groups did not differ significantly on any background or baseline measures, suggesting that there were no systematic between-groups confounds.

Table 1 Means, standard deviations, and group contrasts for all baseline measures by training condition

Training Phase

We next analyzed the training phase data to confirm that the participants improved on the trained task and to assess whether the training format (simultaneous vs. sequential) affected the overall level of DT performance or the rate of improvement during the training phase. Mean RT and error data are shown in Table 2.

Table 2 DT near-transfer task: mean RTs (SD) and error rate as a function of trial type and time

DT Reaction Time

To determine the magnitude of improvement in reaction time on the cognitive training task, a 12 (time: training sessions 1–12) × 3 (trial type: single-pure vs. single-mixed vs. dual-mixed) × 2 (training group: simultaneous vs. sequential) mixed-factorial ANOVA was carried out. Figure 2 depicts the mean RTs across 12 sessions for each training group and trial type. The analysis revealed a significant effect of time, F(11, 253) = 72.08, p < .001, ηp² = 0.76, with polynomial contrasts showing significant linear, F(1, 23) = 169.13, p < .001, ηp² = 0.88; quadratic, F(1, 23) = 119.52, p < .001, ηp² = 0.84; and cubic, F(1, 23) = 22.94, p < .001, ηp² = 0.50, trends. Furthermore, there was a significant effect of trial type, F(2, 46) = 569.39, p < .001, ηp² = 0.96, such that the mean RT on the single-pure trials (M = 870.92 ms, SE = 19.43) was significantly shorter than that of single-mixed trials (M = 929.99 ms, SE = 23.77, p < .001), which was in turn significantly shorter than that of dual-mixed trials (M = 1377.69 ms, SE = 29.21, p < .001). The analysis also revealed a significant interaction of time and trial type, F(22, 506) = 2.49, p < .001, ηp² = 0.14. Polynomial contrasts on the dual-mixed trials revealed significant linear, F(1, 25) = 104.56, p < .001, ηp² = 0.81; quadratic, F(1, 25) = 47.56, p < .001, ηp² = 0.66; and cubic, F(1, 25) = 9.16, p = .006, ηp² = 0.27, trends. Additionally, polynomial contrasts on the single-pure trials revealed significant linear, F(1, 26) = 128.09, p < .001, ηp² = 0.83; quadratic, F(1, 26) = 67.12, p < .001, ηp² = 0.72; and cubic, F(1, 26) = 9.50, p = .005, ηp² = 0.27, trends. Furthermore, polynomial contrasts on the single-mixed trials revealed significant linear, F(1, 25) = 139.80, p < .001, ηp² = 0.85; quadratic, F(1, 25) = 82.62, p < .001, ηp² = 0.78; and cubic, F(1, 25) = 44.93, p < .001, ηp² = 0.64, trends. Lastly, the main effect of group and the interactions of group with time and with trial type were non-significant (ps ≥ .354).
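For readers who wish to check reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom, because ηp² = SS_effect / (SS_effect + SS_error) and F = (SS_effect / df_effect) / (SS_error / df_error). The sketch below applies this identity; the function name is hypothetical and the code is not part of the original analysis pipeline.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F ratio: F*df1 / (F*df1 + df2)."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of time on training RT: F(11, 253) = 72.08
print(round(partial_eta_squared(72.08, 11, 253), 2))  # 0.76
# Linear trend across sessions: F(1, 23) = 169.13
print(round(partial_eta_squared(169.13, 1, 23), 2))  # 0.88
```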

Fig. 2 DT training task. Mean RT for the three trial types across 12 sessions by group. Note: error bars represent one standard error of the mean

DT Error Rate

To assess the reduction of error rate on the trained task over time, the same mixed-factorial ANOVA design was employed using the error rate. The analysis revealed a significant main effect of time, F(11, 440) = 11.57, p < .001, ηp² = 0.22. Polynomial contrasts revealed significant linear, F(1, 40) = 24.03, p < .001, ηp² = 0.38; quadratic, F(1, 40) = 24.57, p < .001, ηp² = 0.38; and cubic, F(1, 40) = 18.04, p < .001, ηp² = 0.31, trends. There was also a significant trial type effect, F(2, 80) = 71.98, p < .001, ηp² = 0.643, such that the error rate of single-pure trials (M = 0.05%, SE = 0.01) was significantly lower than that of the single-mixed trials (M = 0.65%, SE = 0.15; p = .012), which in turn was significantly lower than that of the dual-mixed trials (M = 2.48%, SE = 0.33; p < .001). These main effects were qualified by a significant interaction of time and trial type, F(22, 880) = 12.33, p < .001, ηp² = 0.24. Polynomial contrasts on the single-pure trials did not yield any significant trend, but polynomial contrasts on the single-mixed trials revealed significant linear, F(1, 40) = 17.49, p < .001, ηp² = 0.30; quadratic, F(1, 40) = 15.67, p < .001, ηp² = 0.28; and cubic, F(1, 40) = 12.84, p < .001, ηp² = 0.24, trends. Furthermore, polynomial contrasts on the dual-mixed trials revealed significant linear, F(1, 40) = 24.08, p < .001, ηp² = 0.38; quadratic, F(1, 40) = 24.61, p < .001, ηp² = 0.38; and cubic, F(1, 40) = 18.23, p < .001, ηp² = 0.31, trends. All other main effects or interactions were non-significant (ps ≥ .422).

The results suggest that the training was successful in significantly improving the participants’ reaction time and accuracy on the DT training task, particularly on the most difficult condition (dual-mixed trials), regardless of the training schedule.

Aerobic Exercise

To characterize the exercise training, we examined subjective (Borg Scale; Borg 1982) and objective (mean power output in watts) measures of physical workload during the training phase. A 12 (time: training sessions 1–12) × 2 (training group: simultaneous vs. sequential) mixed-factorial ANOVA was carried out for each outcome. For mean power output, a significant main effect of time was observed, F(11, 440) = 20.66, p < .001, ηp² = 0.341. Polynomial contrasts revealed a significant linear trend, F(1, 40) = 80.70, p < .001, ηp² = 0.67. For the Borg Scale, no significant main effect or interaction was found. More importantly, the main effect of group and the group by time interaction were non-significant for both measures (ps ≥ .116), suggesting that objective and subjective measures of aerobic effort did not differ between the two training groups.

Dual-Task Near-Transfer

Reaction Time

To assess the magnitude of improvement in reaction time on the near-transfer task, a time (pre- vs. post-training) × trial type (single-pure, single-mixed, dual-mixed) × training group (simultaneous vs. sequential) mixed-factorial ANOVA was employed. Figure 3 depicts mean RTs across the three trial types and training groups over time. A significant effect of time was found, F(1, 33) = 79.96, p < .001, ηp² = 0.708, such that participants’ reaction time was markedly shorter after the training phase. Furthermore, there was a significant effect of trial type, F(2, 66) = 88, p < .001, ηp² = 0.882, which was qualified by a significant interaction of time and trial type, F(2, 66) = 17.68, p < .001, ηp² = 0.349. Post hoc tests revealed that the reduction in RT on the dual-mixed trials was significantly larger than that on the single-mixed trials (ps ≤ .001), which in turn was significantly larger than that on the single-pure trials (ps ≤ .001). All other main effects or interactions were non-significant (ps ≥ .247).

Fig. 3 Near-transfer task. Mean RT for the three trial types across assessment stages by group. Note: error bars indicate one standard error of the mean. SP single-pure trials, SM single-mixed trials, DM dual-mixed trials

Error Rate

A similar mixed-factorial ANOVA was employed using the error data. This analysis revealed significant main effects of time, F(1, 40) = 6.887, p = .012, ηp² = 0.147, and trial type, F(2, 80) = 59.51, p < .001, ηp² = 0.598. Post hoc pairwise comparisons using Bonferroni correction revealed significant error rate differences among all three trial types, such that error rates of the single-pure trials and the single-mixed trials were significantly lower than that of the dual-mixed trials (ps < .001). All other main effects and interactions were not statistically significant (ps ≥ .218).

Executive Functions and Memory

To test whether one training format was superior to the other in the neuropsychological measures, a time (pre vs. post) × group (simultaneous vs. sequential) mixed-factorial ANOVA was carried out for each of the EF and memory outcome measures. Means and standard deviations for the relevant tests are presented in Table 3.

Table 3 Means and standard deviations of outcome measures by training group and assessment session

Working memory, as measured with Letter Number Sequencing, revealed no significant main effects of time, F(1, 40) = 0.728, p = .339, ηp² = 0.018, or group, F(1, 40) < 1, p = .848, ηp² = 0.00. However, as predicted, there was a significant interaction of time and group, F(1, 40) = 7.924, p = .008, ηp² = 0.165 (Fig. 4). Paired-sample contrasts with Bonferroni correction showed that the sequential group improved significantly from Time 1 to 2, t(19) = 2.43, p = .025, whereas the simultaneous group showed negligible improvement, t(21) = 1.48, p = .153.

Fig. 4 Letter Number Sequencing performance as a function of group and time. Note: error bars indicate one standard error of the mean. The asterisk symbol denotes a significant effect of time, p < .025

Analysis of the episodic memory measures (RBANS subtests) revealed significant main effects of time for immediate word list recall, F(1, 40) = 7.447, p = .009, ηp² = 0.157, and delayed story recall, F(1, 40) = 7.683, p = .008, ηp² = 0.162, indicating some improvements to memory performance following training. No other main effects or interactions proved significant. Analyses on the RBANS delayed word list recall and immediate story recall did not reveal any significant main effects or interactions (ps ≥ .233).

Processing speed, as measured by Digit Symbol Coding, showed a marginally significant main effect of time, F(1, 40) = 3.687, p = .06, ηp² = 0.084, indicating a slight improvement from pre- to post-training overall. All other main effects and interactions were non-significant (ps ≥ .248). Finally, response inhibition, as measured with the Stroop task (incongruent vs. congruent), showed no significant main effects or interactions (ps ≥ .69).

Together, the neuropsychological results suggest that training format affected the magnitude of gains in working memory (Letter Number Sequencing, LNS) favoring the sequential group, while comparable gains were observed across training groups on measures of episodic memory and processing speed.

Moderators of Training Gains

As a secondary question, we examined baseline aerobic fitness and motivation for intellectually stimulating activities as potential moderators of training-related gains in cognitive performance. As the two training groups did not differ significantly on the potential moderators, we examined all participants together as a single group. We used baseline sub-maximal VO2 as the predictor variable and residualized change scores for each of the cognitive outcome measures in a series of regression analyses. Baseline aerobic fitness significantly predicted the magnitude of change in Stroop task performance, β = −.336, t(35) = −2.107, p = .042, with an R² of 0.11, such that those with a lower baseline aerobic fitness level showed a greater training-related improvement in response inhibition. Baseline fitness did not significantly predict training-related improvements in any of the other outcome measures.
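For readers unfamiliar with residualized change scores, the following is a minimal sketch of the general approach, assuming a single predictor and ordinary least squares: the post-training score is regressed on the pre-training score, the residuals are kept as the change measure, and those residuals are then regressed on baseline fitness. All data values and function names are hypothetical and are not taken from the study.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residualized_change(pre, post):
    """Residuals of post regressed on pre: change not predicted by baseline."""
    intercept, slope = ols_fit(pre, post)
    return [yi - (intercept + slope * xi) for xi, yi in zip(pre, post)]


def standardized_beta(x, y):
    """Standardized slope for a single predictor (equal to Pearson's r)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: pre/post cognitive scores and baseline VO2.
pre = [10, 12, 9, 14, 11, 13]
post = [12, 13, 12, 15, 12, 16]
vo2 = [20, 25, 18, 30, 22, 27]

change = residualized_change(pre, post)
beta = standardized_beta(vo2, change)
print(f"beta = {beta:.3f}, R^2 = {beta ** 2:.3f}")

# With one predictor, R^2 is the square of the standardized beta,
# so the reported beta of -.336 corresponds to an R^2 of about 0.11.
print(round((-0.336) ** 2, 2))
```

With one predictor, R² equals the squared standardized coefficient, so the reported β and R² in the paragraph above are mutually consistent under this assumption.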

We next considered motivation to engage in cognitively challenging activity, as measured with the NFC questionnaire. Due to the distributional properties of the NFC data (kurtosis > 2 even after square root transformation), an alternate form of analysis was used instead of regression. The participants were categorized as either high or low in NFC based on a median split, and paired-samples t tests were conducted to compare the pre- and post-training scores within each motivation group. Those who scored low on NFC improved significantly on the RBANS immediate word recall, t(13) = −4.69, p < .01, whereas those in the high NFC subgroup did not improve. By contrast, the high NFC subgroup improved significantly on the delayed story recall task, while the low NFC subgroup did not (Table 4). As an added precaution, we also ran non-parametric Mann-Whitney U tests, which revealed patterns similar to those of the t tests, with marginally significant group differences on the two RBANS measures, ps ≤ .08.
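The median-split procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the data are invented, the function names are hypothetical, and the choice to assign scores exactly at the median to the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median, stdev


def median_split(scores):
    """Label each score 'high' if above the median, else 'low' (tie rule assumed)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def paired_t(pre, post):
    """Paired-samples t statistic on pre minus post, with its degrees of freedom."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1


# Hypothetical toy data: NFC scores and pre/post memory scores for 8 people.
nfc = [55, 62, 48, 70, 66, 51, 59, 73]
pre = [20, 22, 19, 25, 24, 18, 21, 26]
post = [24, 22, 23, 26, 24, 23, 24, 27]

labels = median_split(nfc)
results = {}
for group in ("low", "high"):
    idx = [i for i, lab in enumerate(labels) if lab == group]
    results[group] = paired_t([pre[i] for i in idx], [post[i] for i in idx])
    t, df = results[group]
    print(f"{group} NFC: t({df}) = {t:.2f}")
```

Because the difference is taken as pre minus post, improvement appears as a negative t value, matching the sign convention of the t(13) = −4.69 reported above.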

Table 4 Memory measures by Need for Cognition (NFC), group, and time

Discussion

The primary aim of this study was to compare the magnitude of training-related improvements in cognitive performance after undergoing exercise and cognitive training in sequential or simultaneous formats. Consistent with predictions, and with the principle of neural overlap, significant gains were observed overall in the tasks most closely related to the trained DT processes (near-transfer DT, LNS). As well, the sequential training group improved significantly more on a measure of working memory than the simultaneous group, whereas other measures showed equal improvement across groups, suggesting a differential sensitivity to training format. Furthermore, individual differences in baseline aerobic fitness and motivation for cognitive challenge were related to the magnitude of cognitive gains in response inhibition and memory.

We first confirmed that our version of the DT training protocol replicated the training gains and near-transfer effects reported in previous studies (Bherer et al. 2005, 2006, 2008; Lussier et al. 2016). Notably, there was a non-significant trend in the DT training data such that the sequential group appeared to outperform the simultaneous group throughout the training sessions (Fig. 1), with consistently faster RT demonstrated by the sequential group, suggesting a slight cost incurred by concurrent cycling. This observation is compatible with other work showing that mild to moderate physical intensities interfere less with concurrent EF performance than higher intensities (Labelle et al. 2013). Accordingly, in the present study, the intensity of the exercise training was kept low (ranging from 40 to 48% of maximum heart rate); thus, the cost to concurrent cognitive performance was non-significant. As well, the two training groups did not differ on their baseline DT performance. This finding suggests that the sequential group’s slight advantage on the DT training task was not due to pre-existing group differences but, rather, was due to the lack of DT interference from concurrent cycling. This result is in line with cognitive-motor DT research that suggests a tendency for older adults to prioritize motor performance over cognitive performance and incur greater DT costs in the cognitive domain than in the motor domain (Li et al. 2001; Li et al. 2005; Woollacott and Shumway-Cook 2002).

In line with our predictions, it was found that sequential training was more beneficial than simultaneous training for working memory, as measured by the LNS task. Additionally, the LNS results parallel findings from the same study (Bruce et al. 2017) in which the sequential group showed an advantage over the simultaneous group on complex working memory performance (1-back working memory task during a concurrent mobility test). These results converge to suggest that sequential training is superior to simultaneous training with respect to working memory outcomes, but not to other measures of EF or memory.

The selective sensitivity of working memory to training format can be interpreted in terms of the functional overlap between the trained task, which targeted divided attention, and the outcome variables (working memory, switching, response inhibition, processing speed, episodic memory). Based on the model proposed by Miyake et al. (2000), EFs such as working memory and inhibition can be conceptualized as moderately correlated, but distinct. Accordingly, the dual-mixed condition of the present training task requires holding two task sets in mind, similar to many working memory tasks (Baddeley 1986). Therefore, the selective effect of training format may reflect the relative functional overlap between the trained divided attention task and the outcome measures, with working memory showing the greatest sensitivity and overlap, memory and processing speed showing non-specific improvement after training (some functional overlap), and inhibition showing no improvement (little functional overlap). Similarly, one can consider the principle of neural overlap (Dahlin et al. 2008; Lustig et al. 2009), in that the brain regions showing increased neural efficiency in prefrontal cortex after the same DT training (Erickson et al. 2007a, b) are also associated with a variety of working memory tasks (e.g., Braver et al. 1997; D’Esposito et al. 2000; Kane and Engle 2002). The advantage observed in the sequential training group for the working memory outcomes may reflect the benefits of training under full attention (i.e., without concurrent exercise).

Along similar lines, one can consider the brain regions known to be affected by aerobic exercise training (Bherer et al. 2013) in order to better understand the general improvement seen in the other outcome measures. Based on previous evidence of older adults prioritizing motor performance during cognitive-motor dual-tasking (e.g., Li et al. 2001), we argue that both training groups benefited from doing aerobic training, with little cost to motor performance incurred in the simultaneous condition. This interpretation is supported by observing group equivalence in the subjective and objective measures of physical workload taken during the training phase. Hence, the non-specific training-related improvements observed in episodic memory (Nouchi et al. 2014) and processing speed (Smith et al. 2010) might be more closely tied to the exercise component of the training protocol than the cognitive component. As noted previously, aerobic exercise is associated with volumetric change in the hippocampus, which is correlated with better memory performance (Colcombe et al. 2006; Duzel et al. 2016; Erickson et al. 2011; Kramer et al. 2006).

Together, the pattern of format-specific and non-specific training gains shows that training cognition under full attention is advantageous for functionally similar outcomes but may not matter for tasks that are far-transfer tasks or that implicate brain regions that are globally affected by aerobic exercise.

Moderators of Training Gains

The second objective of the study was to explore potential moderators of training gains at the level of individual differences. Baseline aerobic fitness (sub-maximal VO2) significantly predicted changes in Stroop task performance, such that lower baseline aerobic fitness was associated with greater improvements in response inhibition. This finding complements other works showing that greater aerobic fitness gains were associated with greater short-term memory gains (Voss et al. 2013). It may be that the absence of transfer effects in Stroop task performance was due to the high level of fitness in our sample.

The second individual difference factor under consideration, Need for Cognition (NFC), was found to be a significant moderator of episodic verbal memory; however, the direction of the moderation depended on whether the test involved immediate or delayed recall. Specifically, the immediate word recall results are consistent with those of Stine-Morrow et al. (2014), in showing a negative relationship with NFC. In contrast, we found that individuals who were high in NFC gained more in delayed story recall than those low in NFC. A possible explanation is that whereas immediate word recall reflects working memory capacity, delayed story recall is driven by more elaborate encoding and semantic processes (Park et al. 2014). Thus, the DT training may have conferred greater benefits to individuals without ongoing cognitive stimulation (i.e., low NFC), but at the same time, a cognitively enriched lifestyle associated with high NFC (Baer et al. 2012) might support long-term memory encoding and retrieval. Together, the moderation findings have methodological implications, in that cognitive training programs should consider the match between participant characteristics (e.g., baseline fitness, motivation to seek out intellectual stimulation) and the type of outcome measures. More generally, these findings underscore the importance of considering non-cognitive factors in understanding the heterogeneity of responsiveness to cognitive intervention.

Limitations and Future Directions

A few methodological issues warrant consideration. First, we did not include a traditional control group in the current study, for several reasons. For one, many of the published studies on multimodal training directly compare active treatments only (Zhu et al. 2016). In addition, we found in our recent work that an active control group (i.e., stretching plus internet lessons) did not exhibit pre-post improvements in cognition (Desjardins-Crépeau et al. 2016). Finally, the main purpose of our study was to directly compare the effectiveness of the two formats (simultaneous and sequential) of multimodal training and not to test for the synergistic effects of multimodal training, which would require at least a single-modality control group (Zhu et al. 2016).

A second issue to consider is that the same materials were used for both pre- and post-training sessions; thus, some of the general improvement could be attributed to practice, although our recent findings with an active control group argue against this explanation (Desjardins-Crépeau et al. 2016). There is also evidence to the contrary regarding the significance of practice effects on neuropsychological tests, particularly in older adult samples (Mitrushina and Satz 1991). As well, the moderation findings cannot be explained by simple practice effects; nevertheless, it would be desirable to replicate the memory results with multiple forms at pre- and post-training.

Another issue is that the sequential group technically received double the training time when compared to the simultaneous group. This design was chosen to ensure that both groups received the same dosage of DT and aerobic training. While this might have influenced the finding that favored sequential training for working memory gains, we felt that this format would be preferable to shortening the simultaneous training sessions to 15 min, which would introduce a different confound and would likely be too brief to induce cognitive gains.

An additional consideration for future studies is to include a questionnaire on participants’ motivation to engage in physical activity, in parallel with the NFC questionnaire. In the current study, the Borg scale (Rating of Perceived Exertion) was not found to be associated with the magnitude of training gains. However, it is a standard measure of subjective physical effort and not a measure of motivation per se. Thus, it would not capture how the participants behave in their everyday activities (cf. Ahlskog et al. 2001; Burzynska et al. 2015; Steffener et al. 2016). Future studies could investigate motivation for physical challenge as an additional moderator of training-related gains.

Finally, in the current study, the sequentially trained participants completed the cognitive component first, followed by the aerobic component, for practical reasons. Future studies could examine the reverse order, which might yield different results given the evidence suggesting that aerobic exercise could boost subsequent cognitive performance (Chang et al. 2012). Another future consideration would be to compare simultaneous multimodal training, as done presently, against a more integrated multimodal intervention such as dance (Burzynska et al. 2017), which has also shown positive effects on cognition and associated neuroanatomical changes in older adults.

Conclusions

To summarize, both multimodal training formats led to general improvements on measures of working memory, processing speed, and verbal memory. Importantly, the sequentially trained group exhibited greater gains in working memory relative to the simultaneous group, suggesting that focused cognitive training under full attention is more effective, particularly for working memory. Additionally, the format-specific training effects are consistent with the concepts of process and neural overlap (Dahlin et al. 2008; Lustig et al. 2009). Thus, the present findings underscore the importance of matching the format of training with the goals of training (i.e., broad or narrow gains). The moderator results also have implications for the type of participants to recruit. Overall, this study adds to the rapidly growing literature on multimodal training (Zhu et al. 2016) by directly contrasting different schedules of training and by considering individual differences in non-cognitive factors.