Introduction

Heart rate variability (HRV) is the variability of the time intervals between adjacent heartbeats (Malik et al., 1996). High frequency HRV reflects parasympathetic nervous system activity, while low frequency HRV predominantly represents baroreflex function, with neither reflecting sympathetic tone (Goldstein et al., 2011; Houle & Billman, 1999; Moak et al., 2007; Rahman et al., 2011). Previous studies suggest that higher HRV is associated with better cognitive function. For instance, in younger adults, higher resting HRV was related to better performance in modified versions of the Eriksen Flanker task, which is a measure of inhibitory control (Sørensen et al., 2019; Williams et al., 2016). Another study with younger adults (Hansen et al., 2003) found that the high resting HRV group (based on median split of the root mean squared successive differences or RMSSD) showed more correct responses than the low resting HRV group on an n-back working memory task. Older adults have also shown a positive association between HRV and executive functions. For example, in people aged 50 years and over, lower HRV during resting and paced breathing periods was associated with worse performance on the Montreal cognitive assessment, which is a measure of global cognitive function (Frewen et al., 2013). Furthermore, a 5-month aqua aerobics training improved older adults’ performance in a Stroop inhibitory task, which was correlated with an increase in resting HRV (Albinet et al., 2016). Among older adults at high risk of cardiovascular diseases, lower resting HRV was associated with worse performance on inhibitory control and processing speed tasks (Mahinrad et al., 2016). The same study found that lower resting HRV was associated with a steeper decline in the processing speed score during a mean follow-up of 3.2 years.

But why is HRV associated with cognitive function? One possible explanation is that overlapping brain regions are involved in both HRV and cognitive function (Thayer & Lane, 2000; Thayer et al., 2009, 2012). Many cognitive tasks, such as inhibitory control and working memory, are associated with prefrontal cortex activity (Miller & Cohen, 2001; O'Reilly, 2006) and prefrontal-subcortical functional connectivity (Gangopadhyay et al., 2021; Salzman & Fusi, 2010; Yizhar & Klavir, 2018). Similarly, the prefrontal cortex plays a central role in regulating HRV. The neurovisceral integration model (Thayer & Lane, 2000, 2009; Thayer & Sternberg, 2006) proposes that the interaction between prefrontal cortices and subcortical brain regions, such as the amygdala and brainstem, regulates sympathetic and parasympathetic nervous system activities, thus modulating HRV. Therefore, effective prefrontal activation is key to successful regulation of HRV and cognitive function.

However, not all studies have found a relationship between HRV and cognitive function (Britton et al., 2008; Duschek et al., 2009; Jennings et al., 2015; Kimhy et al., 2013; Mann et al., 2015). Previous research using a large sample of middle-aged adults from the UK Whitehall II study examined cross-sectional and longitudinal associations between resting HRV and cognitive function but found no reliable correlation between them (Britton et al., 2008). Another study using a large sample from the Midlife in the United States-II study revealed that the correlations between resting HRV and executive performance were not significant when controlling for age (Mann et al., 2015). Since both HRV and cognitive function decline with age, controlling for age may mask associations between these two variables.

Since most previous findings on the relationship between HRV and cognitive function were correlational in nature, it is unclear whether individual differences in HRV play a causal role in cognitive performance in younger and older adults. To better understand whether there are causal relationships, the current study employed a simple breathing manipulation that increases HRV, delivered as a 5-week HRV biofeedback intervention, and examined whether this manipulation improves cognitive performance in younger and older adults. This study is part of a larger study examining the effects of HRV biofeedback on brain mechanisms associated with emotion regulation (Clinicaltrials.gov NCT03458910 “Heart Rate Variability and Emotion Regulation” or HRV-ER). We have summarized the primary and secondary emotion regulation outcomes in a separate paper (Nashiro et al., 2022). The current paper focuses on the effects of HRV biofeedback on cognitive performance, which was a secondary outcome measure of the study.

Materials and Methods

Participants

We recruited 193 participants (Fig. 1) including 121 younger adults between 18 and 35 years old, and 72 older adults between 55 and 80 years old. They were recruited through the University of Southern California (USC) Healthy Minds community subject pool, a USC online bulletin board, Facebook and flyers between January 2018 and March 2020. The study was terminated in March 2020 due to the COVID-19 pandemic (see ‘Power considerations’ in Supplementary Information). All participants provided informed consent approved by the USC Institutional Review Board. Prospective participants were screened for major medical, neurological, or psychiatric illnesses. We excluded people who: (1) had a disorder that would impede performing the HRV biofeedback procedures (e.g., coronary artery disease, angina, cardiac pacemaker); (2) currently were training using a relaxation, biofeedback or breathing practice; or (3) were on any psychoactive drugs other than antidepressants or anti-anxiety medications. We included people who were taking antidepressant or anti-anxiety medication and/or attending psychotherapy only if the treatment had been ongoing and unchanged for at least 3 months and no changes in treatment were anticipated. Older adults were screened for possible dementia using a validated telephone screening called TELE (Gatz et al., 1995), and those who scored lower than 16 were excluded. Eligible participants were assigned to small groups of 3–6 people, with each group meeting at the same time and day each week, except that MRI assessments on the second and seventh lab visits were scheduled individually. Group assignment was based on their availability (e.g., people who were available on Mondays each week were assigned to the Monday group). After recruitment and scheduling of each wave of groups were complete, groups were randomized to one of two conditions by coin flip.
Thus, each participant was randomly assigned to a daily biofeedback intervention that either increased heart rate oscillations (Osc+) or decreased heart rate oscillations (Osc−). Out of 193 participants, 25 people (15 younger adults and 10 older adults) dropped out of the study due to a medical or personal emergency, time conflict or unexpected job offers. Three older adults did not complete the post-intervention cognitive and HRV assessments due to the COVID-19 pandemic, leaving 165 participants in this study (Table 1). For the NIH Toolbox cognitive battery data, one additional younger adult was excluded since they did not complete the NIH Toolbox cognitive battery; thus, 164 people were included in the analyses of those data. For analyses involving HRV data, four younger adults were excluded because heart rate data from ear sensors failed to save due to technical issues, and three older adults were excluded because they could not complete the post-intervention HRV assessments due to the pandemic. Additionally, we excluded six outliers based on RMSSD from pre-intervention HRV (N = 4) and post-intervention HRV (N = 2), leaving an N of 152 in the analyses. These outliers were identified using Tukey’s approach with box-and-whisker plots (Tukey, 1977). The interquartile range (IQR) represents the distance between the lower (Q1) and upper (Q3) quartiles. Outer fences are located at a distance of 3 IQR below Q1 and above Q3. Values beyond the outer fences were identified as outliers.
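The outer-fence rule described above can be illustrated with a minimal sketch. The function name, the quartile interpolation method (Python's "inclusive" method) and the RMSSD values are assumptions made for illustration only; the quartile convention behind the box-and-whisker plots used in the study is not stated and may differ.

```python
import statistics


def tukey_outer_fence_outliers(values, k=3.0):
    """Return the values lying beyond Tukey's outer fences.

    The fences sit k * IQR below Q1 and above Q3 (k = 3 for outer fences).
    The quartile method is an assumption; other conventions exist.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical RMSSD values in ms, not data from the study
rmssd_values = [22.0, 25.0, 28.0, 30.0, 31.0, 33.0, 35.0, 38.0, 40.0, 190.0]
outliers = tukey_outer_fence_outliers(rmssd_values)
print(outliers)
```

With these hypothetical values, only the single extreme observation falls beyond the upper outer fence.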

Fig. 1 CONSORT flow diagram

Table 1 Participant demographic information

Overview of 7-Week Protocol Schedule

A full description of the study has been reported elsewhere (Nashiro et al., 2022). Briefly, the study protocol involved seven weekly lab visits at the Emotion and Cognition Lab at USC and five weeks of home biofeedback training. On the first lab visit, we administered the NIH-toolbox cognitive battery including the Flanker inhibitory control and attention test, List Sorting Working Memory Test (LSWM) and Pattern Comparison Processing Speed test (PCPS). The second lab visit involved the first biofeedback training session and baseline MRI assessments. Between the second and seventh lab visits (approximately five weeks), participants were instructed to engage in daily biofeedback practice and visit the lab weekly. Participants were asked to practice 20 min/day for the first week of training, 30 min/day for the second week of training and 40 min/day for the remaining weeks. This adds up to a requested 1190 min or 19 h and 50 min over 5 weeks. Note that participants started daily practice the day after their second lab visit and ended the day before their seventh lab visit. Since these MRI assessment visits were scheduled individually based on participants’ availability for those weeks, the number of days between the two lab visits was slightly different among participants. While some participants had 35 days (or exactly 5 weeks) between the two lab visits, others had a few days shorter or longer than 35 days between the two visits. The sixth lab visit repeated the cognitive assessments from the first lab visit. The seventh lab visit repeated the baseline MRI session. Resting HRV was measured on the second and seventh visits (see the “Main HRV indices” section for more details).
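The requested practice total can be checked with a short calculation. This sketch assumes exactly 35 practice days (five full weeks); as noted above, the actual number of days varied slightly across participants. Variable names are illustrative.

```python
# Requested daily practice (minutes) for each of the five training weeks
minutes_per_day_by_week = [20, 30, 40, 40, 40]
DAYS_PER_WEEK = 7  # assumes five complete weeks of practice

total_minutes = sum(m * DAYS_PER_WEEK for m in minutes_per_day_by_week)
hours, remaining_minutes = divmod(total_minutes, 60)

print(total_minutes, hours, remaining_minutes)
```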

Biofeedback Training

Osc+ Condition

Using an HRV biofeedback procedure, participants can be trained to breathe at their “resonance frequency”, which is around 0.1 Hz or 10 s per breath (Lehrer et al., 2013). During paced breathing at their own baroreflex resonance frequency, people can significantly increase acute levels of heart rate oscillations. In the second lab visit, participants were asked to try out several breathing paces around 10 s/breath to see which induced the largest oscillations in their heart rate (their own resonance frequency; Lehrer et al., 2013) using the emWave Pro software (Heartmath, 2016). Participants were asked to wear a HeartMath ear sensor to measure their pulse and to breathe in and out with a visual pacer. They were instructed to inhale through the nose when the blue bar went up, and to exhale through the mouth when the blue bar went down. They were asked to breathe at 5 different paces for 5 min each (9 s, 10 s, 11 s, 12 s, and 13 s per breath, which approximately corresponds with 6.5, 6, 5.5, 5, and 4.5 breaths per minute, as in Lehrer et al., 2013). To identify each participant’s best approximate resonance frequency, we followed published recommendations (Lehrer et al., 2013) and computed various aspects of the oscillatory dynamics for each breathing pace episode using Kubios HRV Premium 3.1 software and assessed which one had the most of the following characteristics: highest LF power, the highest maximum LF amplitude peak on the spectral graph, highest peak-to-trough amplitude, cleanest and highest-amplitude LF peak and highest RMSSD. In addition, we included coherence scores associated with each breathing pace episode as one of the characteristics we evaluated (see below for details on the coherence score).

To complete home training, participants received a HeartMath ear sensor and a small laptop with the emWave Pro software. During their home training, participants breathed to a pacer set to their resonance frequency determined in their prior lab visit. They received biofeedback on their heart rate oscillatory activity via a real-time plot of their heart rate and a ‘coherence’ score, which is a built-in metric score provided by the emWave Pro software. Coherence is characterized by a sine-wave-like pattern in the HRV waveform in the low frequency (LF) range (McCraty et al., 2009). Thus, a higher coherence score is reflected in the HRV power spectrum as a large increase in power in the LF band around 0.1 Hz. Participants were instructed to aim for a high coherence score. The coherence score was calculated as peak power/(total power − peak power). Peak power was identified by finding the highest peak within the range of 0.04–0.26 Hz and calculating the integral of the window 0.015 Hz above and below this highest peak. Total power was computed for the 0.0033–0.4 Hz range.
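The coherence formula above can be sketched as follows. This is an illustrative reading of the description rather than the emWave Pro implementation, which is not documented here: the function names, the uniformly spaced frequency grid, the rectangle-rule integration and the synthetic spectra are all assumptions.

```python
def band_power(freqs, psd, low, high):
    """Approximate the integral of the spectrum over [low, high] Hz.

    Assumes a uniformly spaced frequency grid and uses a rectangle rule.
    """
    df = freqs[1] - freqs[0]
    return sum(p * df for f, p in zip(freqs, psd) if low <= f <= high)


def coherence_score(freqs, psd):
    """peak power / (total power - peak power), following the text above."""
    # Highest spectral peak within 0.04-0.26 Hz
    in_range = [(p, f) for f, p in zip(freqs, psd) if 0.04 <= f <= 0.26]
    _, peak_freq = max(in_range)
    # Integrate a window 0.015 Hz above and below the peak
    peak_power = band_power(freqs, psd, peak_freq - 0.015, peak_freq + 0.015)
    total_power = band_power(freqs, psd, 0.0033, 0.4)
    # Undefined if all power lies within the peak window
    return peak_power / (total_power - peak_power)


# Synthetic spectra for illustration: flat vs. a sharp peak near 0.1 Hz
freqs = [i * 0.005 for i in range(101)]  # 0 to 0.5 Hz
flat_psd = [1.0] * len(freqs)
peaked_psd = [100.0 if abs(f - 0.1) < 1e-9 else 1.0 for f in freqs]

flat_score = coherence_score(freqs, flat_psd)
peaked_score = coherence_score(freqs, peaked_psd)
print(flat_score, peaked_score)
```

In this sketch, the spectrum with power concentrated near 0.1 Hz yields a much higher score than the flat spectrum, which is the behavior participants in the Osc+ condition were asked to aim for.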

In the third visit, participants returned to the lab to receive coaching from researchers who checked again which breathing frequency produced the strongest heart rate oscillations (i.e., which was likely to approximate resonance frequency). Participants were asked to breathe at 3 different paces for 5 min each: the best approximated resonance frequency from the prior week’s visit, a one second per breath shorter, and a one second per breath longer than their best pace (e.g., if their best pace in the prior week was 10 s, they tried 9 s, 10 s and 11 s per breath). As in the second week, we evaluated the same aforementioned characteristics of these 3 paces and assigned the participant the pace that best approximated their resonance frequency for home training that week. In subsequent weekly visits, during 5-min training segments, they were asked to try out abdominal breathing with pursed lips (Lehrer et al., 2013) as well as other strategies of their choice to increase their coherence score (e.g., occasionally closing eyes).

Osc− Condition

An ideal comparison to the Osc+ condition would be another condition with similar biofeedback information, participant expectations and time spent training but no increases in heart rate oscillatory activity during the training sessions. Thus, we designed a comparison condition in which participants received heart rate biofeedback aimed at reducing their heart rate oscillations while breathing normally (Osc−). In order to avoid participants figuring out that one way to reduce their heart rate oscillations and get positive feedback would be through physical activity, such as jumping jacks, we instructed them to also try to decrease their heart rate during the training sessions.

In the second lab visit, participants were asked to come up with five strategies to lower heart rate and heart rate oscillations (e.g., imagining the ocean, listening to nature sounds, listening to instrumental music). Participants were instructed to wear the same HeartMath ear sensor as Osc+ participants and view real-time heart rate biofeedback while they tried each strategy for five minutes. We analyzed the data in Kubios and identified which strategy had the most of the following characteristics: lowest LF power, the minimum LF amplitude peak on the spectral graph, lowest peak to trough amplitude, multiple and lowest-amplitude LF peak and lowest RMSSD. Additionally, we included calmness scores (see below for details) as one of the characteristics we evaluated.

To complete home training, participants received a HeartMath ear sensor and a small laptop with custom software. During their daily training sessions, participants aimed to reduce heart rate and heart rate oscillations using the best strategy determined in their prior lab visit. The custom-developed software provided a ‘calmness’ score, which was calculated by multiplying the coherence score that would have been displayed in the Osc+ condition by −1 and adding 10 (i.e., an ‘anti-coherence’ score). The net result was that participants got more positive feedback and higher calmness scores when their heart rate oscillatory activity in the 0.04–0.26 Hz range was low (see ‘Osc− condition’ in Supplementary Information for more details).
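The calmness transformation is a simple linear inversion of the coherence score, sketched below. The function name and the example values are hypothetical; any clipping or rounding applied by the custom software is not described here and is not modeled.

```python
def calmness_score(coherence):
    """Multiply the coherence score by -1 and add 10, as described above."""
    return -1 * coherence + 10


# Hypothetical coherence values: lower oscillatory activity gives higher calmness
low_oscillation = calmness_score(0.5)
high_oscillation = calmness_score(3.0)
print(low_oscillation, high_oscillation)
```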

In the third visit, they were asked to select three strategies and try them out for 5 min each. The strategy identified as best (based on the same characteristics used at the second lab visit) was selected as the one to focus on during home training the following week. In subsequent weekly visits, during 5-min training segments, participants were asked to try out strategies of their choice.

Cognitive Tests

Overview of the National Institutes of Health Toolbox Cognitive Battery

The National Institutes of Health Toolbox Cognitive Battery (NIH-toolbox cognitive battery) is a component of the NIH-toolbox for Assessment of Neurological and Behavioral Function (www.nihtoolbox.org; Weintraub et al., 2013). The NIH-toolbox cognitive battery consists of extensively validated computer-administered cognitive tests with utility across childhood and adolescence, early adulthood, and old age. We administered the NIH-Toolbox cognitive battery using an iPad app on an iPad Air 2, 9.7 inch.

As part of the NIH-Toolbox Cognitive Battery, the Flanker Test, the List Sorting Working Memory (LSWM) Test, and the Pattern Comparison Processing Speed (PCPS) Test were administered to evaluate attention and executive function, working memory, and processing speed, respectively, as the secondary outcome measures of the study. In addition to raw scores and/or reaction time, the NIH-toolbox cognitive battery generates age-adjusted standard scores, which have a mean of 100 and standard deviation of 15. These age-adjusted standard scores were used in the main analyses.

Flanker Inhibitory Control and Attention Test

The Flanker Test is a measure of inhibition and visual attention. On each trial, a central arrow is flanked by arrows on the left and right. The participant chooses the direction of the central arrow. On 12 congruent trials, the flankers face the same direction as the central arrow. On 8 incongruent trials, they face the opposite direction. There are 20 trials, which take approximately 3 min to administer. The computed score is based on an algorithm derived from both accuracy and reaction time if accuracy is greater than 80%. If accuracy is less than or equal to 80%, the score is based only on accuracy. The algorithm integrates accuracy and reaction time, yielding computed scores from 0 to 10. The age-adjusted standard score compares the computed score of the participant to those in the NIH Toolbox nationally representative normative sample of the same age, where a score of 100 indicates performance that was at the national average for the participant’s age. Higher scores indicate better performance. For additional analyses, we used raw reaction time for congruent trials and incongruent trials, and response time difference between congruent and incongruent trials.

List Sorting Working Memory Test (LSWM)

In this task, a series of stimuli are presented on a computer monitor one at a time at a rate of 2 s per stimulus both visually (picture) and auditorily (recording of a one-word description of the stimulus). Participants are then required to repeat all of the stimuli back to the examiner in order of increasing real-world size, from smallest to largest. In the first phase of the test (i.e., the 1-List phase), participants are first shown a list with 2 items drawn from a single category (i.e., food). If participants answer the 2-item list correctly, the number of items in the list presented on the next trial increases by one item, up to a total of 7 items per list (i.e., list length ranges from a 2-item list to a 7-item list, for a total of six levels of list length). If participants do not give a correct response on a trial at a given list length, they receive another trial with the same number of items in the list; if they do not give a correct response on that trial again, this phase of the test is discontinued. Following the 1-List phase, all participants proceed to the second phase of the test (the 2-List phase), in which they see lists of items drawn from two different categories (i.e., food and animals). Participants are instructed to reorder and repeat the stimuli first from one category, then the other, in order of size within each category. Lists in the 2-List phase start with a 2-item list and increase in number of items in the same way as in the 1-List phase. For both phases, for each list length, participants receive a score of 2 points if they are correct on the first trial. A second trial at a given list length is only administered when participants fail the first trial. Participants receive a score of 1 point only for a given list length if they fail the first trial at that list length but pass the second trial. The test takes approximately 7 min to administer. 
Raw scores consist of combined total trials correct on the 1-List and 2-List phases of the task. Age-adjusted standard score, for which the normative mean is 100 and the standard deviation is 15, compares the raw score of the participant to those in the NIH Toolbox nationally representative normative sample of the same age, where a score of 100 indicates performance that was at the national average for the participant’s age. Higher scores indicate better performance.

Pattern Comparison Processing Speed Test (PCPS)

This test requires participants to identify whether two visual patterns are the “same” or “not the same” (responses were made by pressing a “yes” or “no” button). Patterns were either identical or varied on one of three dimensions: color, adding/taking something away, or one versus many. The test takes approximately 3 min to administer. Raw scores reflected the number of correct items (of a possible 130) completed in 85 s. Age-adjusted standard score, for which the normative mean is 100 and the standard deviation is 15, compares the raw score of the participant to those in the NIH Toolbox nationally representative normative sample of the same age, where a score of 100 indicates performance that was at the national average for the participant’s age. Higher scores indicate better performance.

HRV Indices

Main HRV Indices

Resting HRV was measured while participants sat in a chair with knees at a 90-degree angle and both feet flat on the floor for 5 min at pre- and post-intervention lab visits (i.e., the second and seventh lab visits, respectively). They were instructed to breathe normally during these recordings. Note that resting HRV was measured on different days from the cognitive assessments (i.e., the first and sixth lab visits), but both measures were collected before and after the intervention. Each participant’s pulse was measured using the HeartMath emWave Pro software with an infrared pulse plethysmograph (PPG) ear sensor. The pulse wave was recorded at a sampling rate of 370 Hz, and interbeat interval data were extracted after eliminating ectopic beats and other artifacts through a built-in process in the emWave Pro software. We used Kubios HRV Premium Version 3.1 to compute three standard heart rate variability metrics as secondary outcome measures: root mean squared successive difference (RMSSD) in time domain analysis, and high frequency power (HF-power) and low frequency power (LF-power) in frequency domain analysis. In frequency domain analysis, an autoregressive model was applied to the interbeat interval time series, and spectral power was derived in the HF range (0.15–0.40 Hz) and LF range (0.04–0.15 Hz). HF-power and LF-power were natural log-transformed to normalize the distribution (ln HF and ln LF, respectively).
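For readers unfamiliar with the time-domain metric, RMSSD and the natural log transform can be sketched as follows. The study computed these values in Kubios HRV Premium; this sketch only illustrates the standard definitions, and the function names and interbeat interval values are hypothetical.

```python
import math


def rmssd(ibi_ms):
    """Root mean square of successive differences between interbeat intervals (ms)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def ln_power(power):
    """Natural log transform applied to spectral power (e.g., HF or LF power)."""
    return math.log(power)


# Hypothetical interbeat intervals in ms, not data from the study
ibi = [800.0, 810.0, 800.0, 810.0, 800.0]
rmssd_value = rmssd(ibi)
print(rmssd_value)
```

Here every successive difference is 10 ms in magnitude, so the RMSSD of the hypothetical series is 10 ms.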

Average Resonance Frequency Oscillatory Power Across All Daily Biofeedback Sessions

To assess the impact of Osc+ versus Osc− biofeedback during training sessions, we used Kubios HRV Premium 3.1 software to compute autoregressive spectral power for each training session (as an exploratory outcome measure). We extracted the summed power within the 0.063–0.125 Hz range for each participant (corresponding with 8–16 s, a range encompassing paces used by Osc+ participants for their breathing) to obtain a measure of resonance frequency oscillatory activity during biofeedback. Before conducting statistical analyses, we log transformed the power values.

Statistical Analysis

Cognitive Performance

We performed three-way mixed ANOVAs including time (pre vs. post) as a within-subject factor and condition (Osc+ vs. Osc−) and age group (younger vs. older) as between-subject factors for the age-adjusted scores for the three cognitive tasks. For the Flanker Test, we also performed three-way mixed ANOVAs separately for response time for congruent trials, response time for incongruent trials, and response time difference between congruent and incongruent trials.

Resting HRV

We performed the same three-way mixed ANOVAs (time × condition × age group) for ln HF-power, ln LF-power, and RMSSD. In addition, we conducted a univariate analysis with condition and age group as fixed factors and the average resonance frequency oscillatory power across all daily biofeedback sessions as a dependent variable (hereinafter referred to as ‘resonance frequency power’).

Correlation Between HRV and Cognitive Performance

For resting HRV, we calculated pre-to-post percent change of ln HF-power, ln LF-power and RMSSD using the following formula: (post-intervention value − pre-intervention value)/pre-intervention value * 100. Using the same formula, we computed pre-to-post percent change of the age-adjusted Flanker, LSWM and PCPS scores. For the Flanker task, we also calculated percent change of response time for congruent trials, response time for incongruent trials, and response time difference between congruent and incongruent trials. The percent change of each resting HRV index was correlated with that of each cognitive score. We also performed correlation analyses between the resonance frequency power during biofeedback practice and the percent change of each cognitive score.
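The percent change formula can be written as a small helper. The function name and the example scores are hypothetical and are not taken from the study data.

```python
def percent_change(pre, post):
    """(post - pre) / pre * 100, as defined above."""
    return (post - pre) / pre * 100


# Hypothetical age-adjusted standard scores at pre- and post-intervention
example_change = percent_change(100.0, 110.0)
print(example_change)
```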

Results

Cognitive Performance

To examine our main question of the effect of Osc+ training on cognitive performance, we performed three-way mixed ANOVAs (time × condition × age group) for the Flanker task, LSWM and PCPS. For all these measures, there was no significant two-way or three-way interaction among time, condition and age group (Fig. 2), suggesting that, relative to Osc− training, Osc+ training did not improve cognitive function in either age group. There were some significant findings that are not directly related to our main research question, which are reported in Supplementary Information (see ‘Additional results for cognitive performance’).

Fig. 2 NIH Toolbox cognitive assessments at pre- vs. post-intervention. Both Osc+ and Osc− participants across age groups performed better at post-training than pre-training for the Flanker task (A), LSWM (B) and PCPS (C). YA younger adults, OA older adults

For the Flanker task, we additionally performed 3-way mixed ANOVAs (time × condition × age group) on response time for congruent trials, response time for incongruent trials, and response time difference between congruent and incongruent trials (Supplementary Fig. 1). However, there was no significant two-way or three-way interaction among time, condition and age group, suggesting that Osc+ training did not improve response time for either trial type in either age group. There were some significant findings that are not directly related to our main research question, which are reported in Supplementary Fig. 1.

HRV

We performed 3-way mixed ANOVAs (time × condition × age group) for resting HRV measured by ln HF-power, ln LF-power, and RMSSD. There was a significant interaction between time and condition for ln LF-power, F(1,148) = 4.52, p = 0.035, r = 0.17. At rest, Osc+ participants across age groups showed greater ln LF-power at post-training compared to pre-training, t(78) = −2.84, p = 0.006, d = −0.26, 95% CI [−0.446, −0.073] (Mpre = 6.73; Mpost = 7.08), whereas Osc− participants across age groups did not show such a difference, t(72) = 0.93, p = 0.353, d = 0.10, 95% CI [−0.111, 0.307] (Mpre = 6.28; Mpost = 6.14). The lack of a time × condition interaction across age groups for ln HF-power and RMSSD indicated that there was no across-age-group significant impact of Osc+ training on resting-state ln HF-power and RMSSD. There was no significant three-way interaction among time, condition and age group for any of the three measures (Table 2). Other significant findings that are not directly related to our main research question are reported in Supplementary Information (see ‘Additional results for HRV’).

Table 2 Pre-to-post intervention changes in resting HRV

As a manipulation check, we conducted a univariate analysis with condition and age group as fixed factors and resonance frequency power during biofeedback practice as a dependent variable (i.e., performance index for daily biofeedback). As expected, we found a main effect of condition, indicating that Osc+ participants showed greater resonance frequency power than Osc− participants during biofeedback practice, F(1,148) = 98.82, p < 0.001, r = 0.63 (Fig. 3). We also found a main effect of age group, indicating that younger adults showed greater resonance frequency power than older adults during biofeedback practice, F(1,148) = 33.32, p < 0.001, r = 0.43. However, no significant interaction of condition and age group was found.

Fig. 3 The average resonance frequency oscillatory power across all daily biofeedback sessions. Relative to the Osc− participants, the Osc+ participants in both age groups showed greater resonance frequency power during home biofeedback practice. YA younger adults, OA older adults

Correlation Between Pre-post Change in Cognitive Performance and That in HRV

As exploratory analyses, we examined whether pre-post percent changes in each cognitive measure were associated with pre-post percent changes in resting HRV measured by ln HF-power, ln LF-power and RMSSD. There were no significant findings. We also examined whether pre-post percent changes in each cognitive measure were associated with resonance frequency power during biofeedback practice. No significant correlations were found across all participants. However, separate analyses for each condition revealed that the Osc+ participants showed a significant correlation between percent change in the Flanker score and resonance frequency power with FDR correction of p < 0.05, r(77) = 0.296, p = 0.008, 95% CI [0.080, 0.485] (Fig. 4; Table 3), which was not observed in the Osc− condition. Further separate analyses for each age group revealed a significant correlation between percent change in the Flanker scores and resonance frequency power with FDR correction of p < 0.05 in younger Osc+ participants, r(52) = 0.361, p = 0.007, 95% CI [0.103, 0.574], but not in older Osc+ participants, r(23) = 0.344, p = 0.092, 95% CI [−0.059, 0.651]. However, the correlations for younger vs. older Osc+ participants did not differ significantly, z = 0.076, p = 0.47, and their confidence intervals showed considerable overlap. There were no significant correlations between resonance frequency power and percent change in LSWM or PCPS in either the Osc+ or Osc− condition (Table 3).
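The comparison of the two age-group correlations and the reported confidence intervals are numerically consistent with Fisher's r-to-z transformation. The test actually used is not named in the text, so the following is a sketch under that assumption. Sample sizes are inferred from the reported degrees of freedom (n = df + 2), the p-value is one-tailed, and the function names are illustrative.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Compare two independent Pearson correlations via Fisher's r-to-z.

    Returns the z statistic and a one-tailed p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p_one_tailed = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, p_one_tailed


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for a Pearson correlation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Younger Osc+ (r(52) = 0.361, n = 54) vs. older Osc+ (r(23) = 0.344, n = 25)
z_stat, p_value = fisher_z_compare(0.361, 54, 0.344, 25)

# All Osc+ participants: r(77) = 0.296, n = 79
ci_low, ci_high = fisher_ci(0.296, 79)
print(round(z_stat, 3), round(p_value, 2), round(ci_low, 3), round(ci_high, 3))
```

Under these assumptions, the sketch gives values close to the reported z = 0.076, p = 0.47 and 95% CI [0.080, 0.485].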

Fig. 4 Correlation between pre-to-post percent change in Flanker scores and resonance frequency power during biofeedback practice. The Osc+ participants across age groups (red dots in the online version; black dots in print) showed a significant correlation between pre-to-post percent change in Flanker scores and the average resonance frequency oscillatory power across all daily biofeedback sessions. This pattern was not observed in the Osc− participants across age groups (blue dots in the online version; grey dots in print) (Color figure online)

Table 3 Pearson correlations (r) between percent change of cognitive scores and resonance frequency power

For the Flanker task, we additionally examined whether pre-post percent changes in response time for congruent trials, response time for incongruent trials, and the response time difference between congruent and incongruent trials were associated with resonance frequency power during biofeedback practice. However, there were no significant findings (Supplementary Table 1).

Effects of Practice Time on Cognitive Performance

Participants in the Osc+ vs. Osc− conditions did not significantly differ in the average of total practice time (M = 1066 min, SE = 40.71 and M = 1044 min, SE = 33.65, respectively), t(160) = 0.40, p = 0.69, r = 0.03.
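The effect size r reported with this t-test is consistent with the common conversion r = sqrt(t² / (t² + df)). The text does not say which formula was used, so the Python sketch below is an assumption, and the helper name `t_to_r` is hypothetical.

```python
import math


def t_to_r(t, df):
    """Convert a t statistic and its degrees of freedom to an effect size r."""
    return math.sqrt(t * t / (t * t + df))


# Reported values: t(160) = 0.40
print(round(t_to_r(0.40, 160), 2))  # 0.03
```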

We also examined whether the total practice time was associated with cognitive outcomes. We performed correlation analyses between the total practice time and pre-post percent changes in each cognitive measure (i.e., pre-post percent changes in the Flanker, LSWM, and PCPS scores) for each condition. There was a significant negative correlation between the total practice time and percent change in the PCPS scores in the Osc+ condition with FDR correction of p < 0.05, r(82) = −0.312, p = 0.004, 95% CI [−0.493, −0.104] (i.e., more practice was associated with less improvement in the PCPS scores). No other significant correlations were found.
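The FDR correction mentioned above can be illustrated with the Benjamini-Hochberg procedure. This is a sketch under assumptions: the text does not name the FDR method or the family of tests that was corrected, and apart from p = 0.004 the p-values in the example are hypothetical placeholders, not results from the study.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# 0.004 is the reported PCPS p-value; 0.21 and 0.56 are hypothetical.
p_values = [0.004, 0.21, 0.56]
significant = [q < 0.05 for q in bh_adjust(p_values)]
print(significant)  # [True, False, False]
```

With a family of three tests, the smallest p-value is multiplied by 3, so p = 0.004 stays below the 0.05 threshold after adjustment in this illustration.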

Discussion

In this study, we examined whether daily practice of increasing HRV would lead to improvement in cognitive performance. We did not find significant improvements in the Osc+ condition relative to the Osc− condition in either age group. However, greater resonance frequency power during biofeedback practice was associated with improvement in the Flanker score (a measure of inhibition and attention) in the Osc+ condition across both age groups. Such a relationship was not observed with scores on the working memory and processing speed measures. Our findings are consistent with prior research suggesting that greater inhibitory attentional control is associated with higher resting HRV (Sørensen et al., 2019; Wendt et al., 2015; Williams et al., 2016). However, since previous studies also found associations between overall executive functioning and resting HRV (Hansen et al., 2009; Stenfors et al., 2016; Williams et al., 2019), it is not entirely clear why inhibitory control (but not working memory or processing speed) was particularly affected by the Osc+ intervention. One possible explanation is that the Flanker task was a better assessment of executive functioning than the other tasks given previous evidence suggesting that inhibition is a common factor across all executive functions (Miyake & Friedman, 2012). Although our results need to be interpreted with caution, they suggest the possibility that with longer training periods and with strategies that maximize the amplitude of heart rate oscillations during practice, daily practice of increasing HRV may improve inhibitory control.

In the Flanker task, people typically show slower response times and higher error rates in incongruent trials than in congruent trials (i.e., a congruency effect). In this study, we found no training-related changes in the congruency effect but observed a main effect of age group (Supplementary Fig. 1c), suggesting a larger congruency effect in older adults than in younger adults. Consistent with our results, some previous studies found an age-related increase in the congruency effect (Waszak et al., 2010; Weintraub et al., 2013; Zelazo et al., 2014), but others failed to observe such an age-related effect (Hsieh et al., 2012; Larson et al., 2016; Posthuma et al., 2002). A recent study (Erb et al., 2020) suggested that age-related decline in inhibitory control during the Flanker task may occur in a process-specific manner. Relative to younger adults, older adults showed a significantly larger congruency effect, which was driven by slower initiation time. Initiation time is the gap between stimulus onset and the individual's movement onset, which is a measure of global inhibition. Conversely, an age effect was not found in curvature, another inhibition measure that quantifies how much one’s movement deviates from a direct path to the selected response target. In the current study, our task design did not allow us to examine the effects of training and age on separate inhibitory processes; thus, we were unable to determine which type of inhibitory process might have been affected by age or training. Future studies should explore these questions.

We also observed that among those in the Osc+ condition, greater total practice time was associated with less improvement in PCPS scores, which measure processing speed (Carlozzi et al., 2015). This effect may reflect the association between arousal state and processing speed. For instance, engaging in a bout of moderate intensity exercise tends to increase processing speed (McMorris & Hale, 2015), which may be due to activation of the locus coeruleus (LC) and associated noradrenergic activity (McMorris, 2016). As we report elsewhere (Bachman et al., 2022), among younger adults the Osc+ intervention significantly decreased left LC magnetic resonance imaging contrast, a measure reflecting locus coeruleus magnetic properties. LC contrast is influenced by the accumulation of neuromelanin, a by-product of catecholamine metabolism (Keren et al., 2015). Among the younger cohort, relative to the Osc− condition, the Osc+ condition also decreased cAMP-responsive element binding protein (CREB)-regulated gene expression in circulating blood cells, an index of sympathetic nervous system signaling (Bachman et al., 2022). Thus, one possibility that should be examined in future research is whether the Osc+ intervention influences processing speed via its suppression of noradrenergic activity.

Many other intriguing questions remain and need to be addressed in future studies. What are possible brain mechanisms underlying relationships between HRV and cognitive function? In a separate paper, we have examined whether our HRV intervention strengthened resting-state functional connectivity in cognitive versus emotional networks. While we found that the Osc+ intervention increased functional connectivity within emotional networks, we found no significant effects on cognitive networks. One of the limitations of our study was that we administered the cognitive tasks outside the MRI scanner and were thus unable to examine brain activity during cognitive performance. Future studies should examine brain activity and connectivity as well as physiology during cognitive tasks, which may help better understand brain mechanisms linking HRV and cognition. Another critical issue to address in future studies is duration of training. It is possible that 5 weeks were too short to observe marked changes in cognitive performance and associated brain function. Future research with longer intervention periods may clarify the relationship between HRV and cognition. It is also important to identify the appropriate length of training separately for older adults and younger adults since the effective duration of training may differ by age group. Another limitation was the limited number of cognitive measures we included in our study. Future studies should include a variety of cognitive measures, such as verbal fluency and episodic memory as well as executive functioning, to examine the effects of Osc+ training on multiple cognitive domains and associated brain structure and function.

In terms of resting HRV, we found that Osc+ participants across age groups showed greater ln LF power at post-training compared to pre-training. Although participants were instructed to breathe normally during these assessments, it is possible that some Osc+ participants breathed more slowly than Osc− participants at post-intervention, which could account for the significant condition difference in ln LF power. Future studies should measure respiration rate during resting HRV assessments to rule out this possibility.

It is well documented that HRV declines with age at rest (Agelink et al., 2001; Liao et al., 1995), during 24-h recording (Antelmi et al., 2004) and in the supine and standing positions (Shannon et al., 1987). However, this study demonstrated that both younger and older adults successfully increased HRV during biofeedback practice as reflected by greater resonance frequency power in the Osc+ group than the Osc− group, which was associated with improvement in the Flanker task. Other studies have also demonstrated the feasibility of using the HRV biofeedback intervention and its benefits in attention, memory and cognitive function in older adults (Jester et al., 2019; Meeuwsen et al., 2021). Thus, it seems possible that a longer intervention with effective strategies could improve various aspects of cognition in older adults. This question is particularly important for individuals with age-related cognitive decline or dementia, such as Alzheimer’s disease. It would be worthwhile for future research to further investigate neural mechanisms linking HRV and cognition and to examine whether HRV biofeedback helps mitigate age-related decline in HRV and cognitive function.