The efficacy of electroencephalography neurofeedback for enhancing episodic memory in healthy and clinical participants: A systematic qualitative review and meta-analysis

Several studies have examined whether electroencephalography neurofeedback (EEG-NF), a self-regulatory technique where an individual receives real-time feedback on a pattern of brain activity that is theoretically linked to a target behaviour, can enhance episodic memory. The aim of this research was to i) provide a qualitative overview of the literature, and ii) conduct a meta-analysis of appropriately controlled studies to determine whether EEG-NF can enhance episodic memory. The literature search returned 46 studies, with 21 studies (44 effect sizes) meeting the inclusion criteria for the meta-analysis. The qualitative overview revealed that, across EEG-NF studies on both healthy and clinical populations, procedures and protocols vary considerably and many studies were insufficiently powered with inadequate design features. The meta-analysis, conducted on studies with an active control, revealed a small-size, significant positive effect of EEG-NF on episodic memory performance (g = 0.31, p = 0.003), moderated by memory modality and EEG-NF self-regulation success. These results are discussed with a view towards optimising EEG-NF training and subsequent benefits to episodic memory.


Introduction
The ability to remember events or "episodes" from our personal past is known as episodic memory (Tulving, 1972). It includes details about what happened, when and where. For example, try to remember your last birthday. Perhaps you can recall what you did, who celebrated it with you and what presents you received. These are all features of episodic memory. The ability to go back and figuratively relive past experiences is a fundamental aspect of everyday life and is critical to our sense of self. In everyday life we sometimes have memory lapses, where we fail to remember an important detail about an event, and this can become more prevalent in older age (Cansino, 2009). Moreover, deficits in episodic memory are a hallmark feature of certain disorders, such as mild cognitive impairment (Nordahl et al., 2005) and Alzheimer's disease (Greene et al., 1996). There has been a growing impetus in recent years to develop and test interventions to determine if they can enhance memory performance.
One emerging technique that may hold promise is neurofeedback. This is a self-regulatory technique where an individual is given feedback about certain patterns of brain activity which are proposed to be linked to a target behaviour. The assumption is that through this real-time feedback an individual can change their brain activity to the desired pattern and that this will result in enhancements in behaviour. It is a non-invasive procedure which is based upon operant conditioning. There are several imaging modalities which can be used to measure different brain signals, such as functional Magnetic Resonance Imaging (fMRI), which measures changes in blood oxygenation and flow to selected cortical regions, and magnetoencephalography (MEG), which indexes the amplitude of magnetic fields (see review by Thibault et al., 2016). The technique which has been researched the most, and will be the subject of this paper, is electroencephalography (EEG), which measures electrical activity generated by pyramidal cells perpendicular to the scalp. The benefit of EEG for neurofeedback is its prevalence and accessibility, with low-cost headsets available that could be used in participants' homes.
The standard and most prevalent approach using EEG is to examine brain oscillations (Buzsaki, 2006), which arise from the synchronised activity of a population of neurons within a selected frequency band, and feed back the power of this signal to the participant. However, several other approaches have emerged in more recent years. For instance, network or connectivity-based neurofeedback has been employed with EEG, which focuses on inter-electrode phase coherence over certain frequency bands. This can provide an estimate of the functional interactions between neural systems operating in a frequency band (e.g. see Kober et al., 2020). Another approach is low resolution electromagnetic tomography (LORETA; Pascual-Marqui et al., 1994), which utilises multi-channel scalp-recorded EEG data and inverse solutions to estimate underlying brain electrical activity. LORETA neurofeedback targets the regulation of activity in specific brain regions using scalp-recorded multi-channel EEG data (e.g. see Bauer and Pllana, 2014; Congedo et al., 2004). Very recently machine learning algorithms have been proposed for use in EEG neurofeedback paradigms, for example, to train autobiographical memory (see Luján et al., 2021). This approach involves identifying the training targets and features from the multiple-channel data in real-time. As the majority of studies use a standard power-based oscillatory approach, we focus on that.
Moreover, from the memory literature there is good reason to think that EEG-Neurofeedback (EEG-NF) may be effective as there is now substantial evidence delineating a functional role for brain oscillations in episodic memory. For example, numerous studies using intracranial electrodes in patients with epilepsy and scalp-recorded EEG and MEG have found this link, with several frequency bands being investigated, including theta (4-8 Hz), alpha (8-12 Hz) and gamma (25-100 Hz) (Düzel et al., 2003; Fell et al., 2003; Guderian and Düzel, 2005; Klimesch et al., 1997, 2001; Lin et al., 2019; Martín-Buro et al., 2020; Mormann et al., 2005). Research is currently determining the exact functional significance of these frequency bands and their interaction with each other in promoting episodic retrieval (Hanslmayr et al., 2016; Herweg et al., 2020; Nyhus and Curran, 2010).
There have been a few studies which have examined the effects of EEG-NF on episodic memory. One of the first studies completed in healthy volunteers was by Berner et al. (2006), who were interested in the links between sleep, neurofeedback and memory performance. In their study a sample of 11 participants, who had previously been found to be able to regulate their brain activity, took part in four 10-minute neurofeedback sessions where they were required to upregulate sigma/beta activity or were given pseudo feedback which was provided randomly from an inactive EEG channel (within-participants design, sessions counterbalanced and one week apart). After the neurofeedback session participants were required to encode word-pairs by imagining a visual relationship between the two words. Participants were given a cued recall test in the evening around 10-15 min after the encoding phase and then another test in the morning. Neurofeedback had no significant effects on memory performance on either test. In contrast, other studies have found significant effects of neurofeedback on episodic memory. For example, Rozengurt et al. (2017) asked healthy volunteers to upregulate their theta for 30 min in the period between learning object pictures and subsequently free recalling them. In comparison to active (who upregulated low beta, 15-18 Hz) and passive control groups, the participants who completed theta neurofeedback had significantly better memory performance immediately following the intervention and also one day and one week later. Thus, studies differ in their conclusions as to whether EEG-NF has a beneficial effect on memory, and there is heterogeneity in neurofeedback testing protocols, such as which EEG frequency band is targeted.
The issue of whether EEG-NF can enhance episodic memory has also been examined in the context of various clinical conditions, such as mild cognitive impairment, sleep disorders, epilepsy, and stroke. Lavy et al. (2019) conducted a pilot study in 11 individuals who had a diagnosis of mild cognitive impairment. Neurofeedback training comprised 10 × 30-minute sessions delivered over five weeks, in which participants were asked to increase the power of their individual upper alpha band. There was no control group in this study. Participants' performance was examined before the intervention, immediately afterwards and at a 30-day follow-up. Participants were given a standardised battery of tasks measuring a variety of cognitive functions as well as an item-association memory task. In the standardised battery, one of the measures, the composite memory score, was found to improve from before the intervention to afterwards and then was maintained at the 30-day follow-up. However, this reflected improvements in immediate recall, likely more akin to working memory. Participants did not show any enhancement for the item-association task, a measure of episodic memory, for either words or images. Nevertheless, there are other clinical studies which have demonstrated enhancements of episodic memory. Escolano et al. (2014a,b) tested 60 participants with major depressive disorder, who were not randomly allocated to the neurofeedback group and a non-interventional control group. The neurofeedback protocol was targeted at increases in individual upper alpha power, with participants completing eight sessions of 20-minute neurofeedback training, spread over five weeks. For the measure of episodic memory there was an improvement in the number of words recognised from pre to post intervention in the neurofeedback group which was not seen in the control group. In parallel with the findings from healthy volunteer studies there is mixed evidence as to whether EEG-NF is advantageous for episodic memory, and this is complicated further by the range of clinical disorders that have been examined.
The small number of studies discussed above also highlight two critical design issues which need to be considered when determining the efficacy of EEG-NF. One is the presence of an active control group/condition (Enriquez-Geppert et al., 2017; Ros et al., 2020; Sorger et al., 2019). This allows the researcher to determine the extent to which any improvement seen in the experimental group is specifically due to the neurofeedback intervention and not other general factors, such as participant-experimenter interaction, motivation, and repetition-related effects. In the context of EEG-NF experiments there are three general options for an active control: i) non-contingent, where there is no link between the participant's brain activity and the feedback they receive, such as when they receive the same feedback as a participant in the experimental group or artificially generated feedback, ii) contingent, where the participant receives feedback from an alternative frequency band that is not hypothesised to be linked to the target behaviour, and iii) non-neurofeedback, where participants complete a task that they need to engage with that does not require neurofeedback. For all control conditions the participants should have the same schedule as those in the experimental group, including visits to the lab and being actively engaged with a task for the same duration. Moreover, in between-participants designs participants should be randomly allocated to the experimental or control group (or in within-participants designs the conditions should be counterbalanced) to minimise bias by the experimenter or participant. In EEG-NF studies this would also mean that studies which allocate 'non-responders', i.e. those participants who are unable to regulate their brain activity in the desired way, to the control group do not meet this criterion. Therefore, the quality of studies needs to be examined, particularly the presence of an active control group and randomisation of participants to groups.
Given the potential promise of EEG-NF to enhance episodic memory function there is now a need to review, evaluate and quantify the research in this area. The first aim was to conduct a systematic review into the literature on episodic memory and EEG-NF to understand what research has been conducted in this area. This review included both healthy and clinical populations and three key areas were examined: i) sample characteristics, ii) study design, and iii) neurofeedback protocols utilised. This is the first systematic and qualitative review which has been conducted examining both healthy and clinical populations specifically with respect to episodic memory and will provide information concerning the scope of currently published studies. The second aim was to complete a meta-analysis to determine whether EEG-NF can enhance episodic memory performance. Importantly for this aim the analysis was restricted to only those studies where there was an active control group/condition and participants were randomly allocated or counterbalanced to the experimental and control groups/conditions. Furthermore, given the heterogeneity of neurofeedback protocols it was examined whether there would be moderators of memory performance. The essence of EEG-NF is that it is participants' success in modulating their brain activity which results in the behavioural improvement. We therefore also included a measure of EEG-NF success in the moderator analysis to examine this. Our goal with this meta-analysis is to provide critical information for future studies on episodic memory as to whether EEG-NF can enhance memory and what might be the optimal training parameters.

Study searches and inclusion criteria
The search for studies was completed in two rounds. The initial search took place on 1 February 2021, followed by a fresh search which was conducted on 4 March 2022 to ensure the review included newer publications. This was conducted within the databases PsychInfo, PubMed, Scopus (Elsevier), Web of Science, CINAHL and ProQuest using the key word search string: ((EEG OR electroencephalograph*) AND (biofeedback OR neurofeedback OR "bio feedback" OR "neuro feedback") AND (memor* OR cogniti*)). A filter was added to include English language articles only. Following the removal of duplicate studies, the searches generated 2086 potential studies that ranged from published books and articles to conference proceedings, randomised controlled trials, dissertations and theses.
The initial screening process involved scanning the titles and/or abstracts of each study generated by the search, followed by more detailed scrutiny of the remaining 211 full-text studies to ascertain eligibility. Screening was performed by the first author and a random sample containing approximately 10 % of the full-text studies was screened by one of the other authors, to check consistency of eligibility judgements. Raters achieved 90 % alignment and discussed and agreed on the eligibility status of the remaining studies. To be eligible for inclusion in the qualitative review the study needed to meet the following criteria. First, the study needed to involve neurofeedback, which was measured using EEG. Second, the study needed to examine the effects of EEG-NF on episodic memory. A variety of tasks can be used to do this, including recall and recognition, and these could involve verbal or visual information. Third, the participants were adult healthy volunteers or those with a clinical condition. Studies which had tested animals, or children, i.e. those aged 15 or younger, were not included. Importantly, in this paper, the question being examined is whether EEG-NF has an effect on episodic memory and not whether there is a difference between healthy and clinical groups. For the qualitative review the final study set was 46.
Additional criteria were applied for completion of the meta-analysis. First, studies had to have an active control group or control condition, which was attended according to the same schedule as the experimental group. Second, studies needed to have randomised participants to the experimental or control groups if it was a between-participants design or to have counterbalanced the conditions if it was a within-participants design. Finally, the study needed to have sufficient data available for calculating effect sizes. The final study set for the meta-analysis was 21. See Fig. 1 for an overview of study screening and selection.

Data extraction and study coding
Data were extracted by the first author and a random sample containing approximately 10 % of the eligible studies was completed by one of the other authors to check consistency of data extracted. We coded the following variables:

Sample characteristics
This included the number of participants in each study, and per group or condition. The mean age of participants was also recorded including the age range, if reported in the study. The population type was defined as healthy volunteer or a clinical group. In addition, the number of participants who were unable to self-regulate the target band during neurofeedback, i.e. non-responders, was also noted if reported.

Study design
Whether the study was within-participants (a cross-over design where all participants were tested under both the experimental and control conditions), or between-participants (participants were allocated to either the experimental or control group/condition) was noted. Single-case and single-group experiments, where no control condition was included in the design, were labelled as such, as only within-participant changes are noted before and after the neurofeedback. The presence of a control group/condition was coded with the following general categories used: i) no control; there is only a neurofeedback condition, with nothing to compare this to, i.e. pre-post only designs, ii) non-active control; there is a control group or condition but participants do not receive any training, which would include waitlist control groups in clinical studies, and iii) active control; there is a control group or condition where the participant does a task according to the same schedule as the neurofeedback group. For the meta-analysis only studies which had an active control group/condition were included and this category was further split into the following three groups: i) non-contingent, where there is no link between the participant's brain activity and the feedback they receive, ii) contingent, where the participant receives feedback from an alternative frequency band that is not hypothesised to be linked to the target behaviour, and iii) non-neurofeedback, where participants complete a task that they need to engage with that does not require neurofeedback. A study using inverse contingency, where the active control group regulated the target band in the opposite direction, was coded as contingent as well. Studies were also coded as to whether they randomised participants to groups, if it was a between-participants design. This included pseudo-randomisation where participants were matched across groups, e.g. for demographic factors such as age, gender and education. For within-participants designs it was examined whether the order of the experimental and control conditions was counterbalanced. Finally, it was also coded as to whether blinding measures were included in the experimental design. There were three classifications of blinding: none, single (the participant does not know which study group they are in) or double (the participant and experimenter do not know which group the participant has been assigned to).

EEG-neurofeedback training
There were several aspects of the neurofeedback training protocol that we coded. Across different studies the neurofeedback training is structured in different ways; some have many testing sessions, whereas others have only one. Therefore, one variable that was coded is the number of separate neurofeedback testing sessions. Related to this is the total duration of time that participants spend completing neurofeedback training. Therefore, the number of minutes each participant spent performing neurofeedback training was also quantified for each study, excluding resting. A variety of EEG frequency bands can be used for neurofeedback. The following were coded: slow cortical potentials (0.1-1 Hz); theta (4-8 Hz); alpha (8-12 Hz), which includes the mu rhythm (8-13 Hz); beta (12-30 Hz), which includes both the sensorimotor rhythm (12-15 Hz) and sigma (11.6-16 Hz); and finally gamma (30-100 Hz). Clinical studies where participants' feedback was based on their resting baseline quantitative EEG were coded as qEEG. This method measures localisation, frequency, and connectivity of brain activity for every individual, which informs their live z-score training (LZT) in relation to the normative/clinical database (Ko et al., 2021). Neurofeedback is measured from certain electrode sites positioned over the scalp. These were grouped into: frontal, central, or parietal and occipital sites. In addition, the number of feedback electrodes used to measure target activity was recorded. The neurofeedback the participant receives can come from different modalities; we coded visual, auditory and both. Finally, we coded whether in each study participants were given instructions for how they should go about regulating their brain activity. This was coded as yes if any were given, even if they were vague, and as no if no explicit instructions were provided to the participant, i.e. they were instructed to simply relax and let the feedback guide them.

Episodic memory measure
To examine whether EEG-NF affected episodic memory performance in the meta-analysis, an effect size was calculated to reflect the magnitude of change in memory scores pre- and post-EEG-NF in the experimental group, relative to the control group. Episodic memory was further sub-categorised into recognition and recall in the moderator analysis to determine whether the effect of EEG-NF was moderated by these memory types. A measure of recognition memory was obtained from memory paradigms or neuropsychological tests that required participants to make an old/new decision. A measure of episodic recall was acquired where participants were required to recall information studied at least 15 min prior (e.g. delayed memory or source recollection tasks). Group means (M), standard deviations (SD) and sample sizes (n) were extracted from the text or alternatively from figures using WebPlotDigitizer (2020). Alternatively, F and t statistics were used to calculate the effect size. If insufficient data were reported, these were requested by contacting the corresponding author via email; if no response was received, these studies were excluded from the meta-analysis.

Neurofeedback success measure
To generate a measure of participants' overall ability to self-regulate target brain activity, a binary code was assigned to each study, whereby '1' indicates that EEG-NF success was reported and '0' indicates there was no evidence of EEG-NF success. Self-regulation of target brain activity was evidenced by a range of different measures across studies, including absolute and relative power or amplitude, and band-ratio such as theta/low beta. EEG-NF was considered a success when the authors reported a statistically significant increase in the EEG-NF group relative to the control group. This could be reported by way of: i) a significant between-groups p-value (p < 0.05), ii) a significant group effect or interaction between groups and time in an ANOVA, or iii) a significant within-subjects pre-post EEG-NF comparison (e.g. baseline to EEG-NF training session) in the experimental group but not in the control group. This success measure is the same as used by Rogala et al. (2016). The same criteria were applied to each band where more than one band was investigated within a study.

Statistical analyses
A meta-analysis was conducted using the robu() function of the robumeta package in R, version 4.1.3 (RStudio Team, 2022). The output of the primary meta-analysis included the pooled mean population effect size (g), which represents the overall effect of EEG-NF on memory. Also reported are the standard error, a t-value representing the statistical significance of the combined effect size, and the 95 % confidence interval. The proportion of heterogeneity observed across studies is indicated by I², and τ² represents an estimate of the between-study variance of the true effect sizes.
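To illustrate how heterogeneity statistics of this kind are obtained, the following Python sketch computes Cochran's Q, τ² and I² from a set of effect sizes and their sampling variances. It uses the classical DerSimonian-Laird estimator, which is an assumption made purely for illustration: it is not the estimator implemented in robumeta, and the function name and input values are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Return (Q, tau2, I2 in percent) for independent effect sizes.

    Illustrative only: classical DerSimonian-Laird formulas, not the
    robust variance estimation used in the meta-analysis itself.
    """
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Scaling constant for the method-of-moments tau2 estimate
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I2: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, tau2, i2


if __name__ == "__main__":
    # Hypothetical effect sizes and sampling variances
    q, tau2, i2 = dersimonian_laird([0.1, 0.5], [0.04, 0.04])
    print(f"Q = {q:.2f}, tau2 = {tau2:.3f}, I2 = {i2:.1f} %")
```

Both τ² and I² are truncated at zero, so a set of identical effect sizes yields no estimated heterogeneity.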

Effect size calculation
The standardised mean difference (d) was calculated for most studies using the d_ppc2 formula (Morris, 2008). Alternatively, F and t statistics were used in equivalent formulas, and appropriate transformations and corrections applied for studies using within-participants designs (Morris and DeShon, 2002). Individual effect sizes were converted from d to Hedges' g using the bias correction formula (Hedges, 1981), which produces a relatively unbiased estimate of the population standardised mean difference effect size. The small sample correction was applied to studies with a sample size of 50 or less (Hedges and Olkin, 1985).
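The calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the pre-post-control difference is standardised by the pooled pre-test standard deviation, as in the d_ppc2 approach of Morris (2008), and the bias correction uses the common approximation J = 1 - 3/(4df - 1) with df = n_T + n_C - 2. Function names and input values are hypothetical, and the exact implementation used by the authors may differ.

```python
import math


def pooled_pre_sd(sd_pre_t, n_t, sd_pre_c, n_c):
    """Pooled pre-test SD across treatment (t) and control (c) groups."""
    numerator = (n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2
    return math.sqrt(numerator / (n_t + n_c - 2))


def d_pre_post_control(m_pre_t, m_post_t, m_pre_c, m_post_c,
                       sd_pre_t, n_t, sd_pre_c, n_c):
    """Uncorrected standardised difference in pre-post gains."""
    gain_t = m_post_t - m_pre_t      # change in the EEG-NF group
    gain_c = m_post_c - m_pre_c      # change in the control group
    return (gain_t - gain_c) / pooled_pre_sd(sd_pre_t, n_t, sd_pre_c, n_c)


def hedges_g(d, n_t, n_c):
    """Apply the approximate small-sample bias correction to d."""
    df = n_t + n_c - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return d * correction


if __name__ == "__main__":
    # Hypothetical memory scores: both groups start at 10 items recalled
    d = d_pre_post_control(10, 14, 10, 12, sd_pre_t=4, n_t=10,
                           sd_pre_c=4, n_c=10)
    print(f"d = {d:.3f}, g = {hedges_g(d, 10, 10):.3f}")
```

With small groups the correction shrinks d noticeably; as sample size grows the correction factor approaches 1 and g converges on d.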

Outliers and influential cases
Outliers, or 'extreme effect sizes', can contribute disproportionately to the effect size estimate in a meta-analysis. Consequently, if these are included in analyses, the reported pooled effect size estimate could be somewhat greater or smaller than the true effect size. Many different methods exist to detect outliers; however, a common method used to detect outliers in a meta-analysis is to calculate whether the confidence interval of each study effect size overlaps with the confidence interval of the pooled effect size estimate. If either the lower or upper boundary of the former does not overlap with the upper or lower boundary of the latter, respectively, the study effect size is considered an outlier (Viechtbauer and Cheung, 2010). In the current meta-analysis, we report pooled effect sizes that were calculated following the removal of outliers detected using this method.
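The overlap rule described above can be expressed compactly. The Python sketch below is illustrative only: the function names are hypothetical, and the normal-approximation interval (z = 1.96) is an assumption, since the text does not specify how the study-level intervals were constructed.

```python
def ci_from_se(estimate, se, z=1.96):
    """Approximate 95 % confidence interval from an estimate and its SE."""
    return (estimate - z * se, estimate + z * se)


def is_outlier(study_ci, pooled_ci):
    """Flag a study whose CI lies entirely above or below the pooled CI."""
    study_lo, study_hi = study_ci
    pooled_lo, pooled_hi = pooled_ci
    return study_hi < pooled_lo or study_lo > pooled_hi


if __name__ == "__main__":
    pooled = (0.10, 0.50)                      # hypothetical pooled CI
    print(is_outlier((1.20, 2.00), pooled))    # interval entirely above
    print(is_outlier((0.00, 0.30), pooled))    # intervals overlap
```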

Publication bias
Egger's Regression Test (ERT) was used to test for possible influence of publication bias on the analyses (Egger et al., 1997). This test aims to measure any significant relationship between the effect size and its precision, whereby such a relationship might indicate that larger effect sizes are driven by small-study effects, i.e. studies that are less precise. A modified version of the ERT was used in this meta-analysis, whereby the effect sizes were regressed against the sample variance (√W) rather than the standard error, as the latter can overestimate the significance of funnel plot asymmetry when using SMD effect size estimates (Pustejovsky and Rodgers, 2019; Rodgers and Pustejovsky, 2021).
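As a rough illustration of the modified predictor, the Python sketch below computes √W for each study and fits an ordinary least-squares line of effect size on √W. It assumes W = (n_T + n_C)/(n_T n_C), following our reading of Pustejovsky and Rodgers (2019); the function names and data are hypothetical, and the sketch omits the weighting, dependence handling and significance test that a full analysis would require.

```python
import math


def sqrt_w(n_t, n_c):
    """Sample-size-based precision measure, assuming W = (n_t + n_c) / (n_t * n_c)."""
    return math.sqrt((n_t + n_c) / (n_t * n_c))


def ols_line(x, y):
    """Unweighted least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical studies: (n treatment, n control, Hedges' g)
    studies = [(10, 10, 0.60), (20, 20, 0.35), (40, 40, 0.25)]
    predictor = [sqrt_w(n_t, n_c) for n_t, n_c, _ in studies]
    effects = [g for _, _, g in studies]
    slope, intercept = ols_line(predictor, effects)
    print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A clearly positive slope would suggest that smaller, less precise studies report larger effects, which is the pattern the test is designed to detect.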

Data synthesis
Robust variance estimation (RVE) was used to account for the dependency between multiple effect size estimates within each study (Hedges et al., 2010; Tanner-Smith and Tipton, 2014). Accordingly, this method first applies an appropriate correlated weight and standard error to each effect size estimate to allow the balanced inclusion of multiple outcomes in the meta-analysis. Sensitivity analysis was performed to estimate the correlation between the effect sizes within studies (ρ), given that a random effects model was used. A small sample correction was applied because fewer than 40 studies were included in the meta-analysis (Tipton, 2015).
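To make the weighting step concrete, the Python sketch below computes approximate correlated-effects weights of the form w_ij = 1 / (k_j (v̄_j + τ²)), where k_j is the number of effect sizes in study j and v̄_j is their mean sampling variance. This form follows our understanding of Hedges et al. (2010) and should be read as an assumption; the sketch takes τ² as given and omits the τ² estimation, robust standard errors and small-sample correction performed by robumeta. All names and values are hypothetical.

```python
def correlated_effects_weights(studies, tau2):
    """Per-study weight shared by every effect size from that study.

    studies: dict mapping study id -> list of (effect, variance) tuples.
    """
    weights = {}
    for study_id, estimates in studies.items():
        k = len(estimates)
        mean_var = sum(v for _, v in estimates) / k
        weights[study_id] = 1.0 / (k * (mean_var + tau2))
    return weights


def pooled_effect(studies, tau2):
    """Weighted mean effect size across all studies and outcomes."""
    weights = correlated_effects_weights(studies, tau2)
    num = sum(weights[s] * y for s, es in studies.items() for y, _ in es)
    den = sum(weights[s] * len(es) for s, es in studies.items())
    return num / den


if __name__ == "__main__":
    # Hypothetical data: study A reports two outcomes, study B one
    data = {"A": [(0.2, 0.05), (0.4, 0.05)], "B": [(0.6, 0.10)]}
    print(round(pooled_effect(data, tau2=0.05), 3))
```

Dividing by k_j means that a study contributing several outcomes does not dominate the pooled estimate simply because it reported more measures.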

Moderator analyses
To investigate the relationship between individual moderators and the overall mean population effect size, a meta-regression was performed with RVE. Categorical moderators were dummy coded to compare two sub-levels within a factor. Multi-level factors were contrast (sum) coded to compare the mean effect size of each level with the grand mean of the factor (e.g. the difference between the mean effect size for studies employing alpha band as the experimental EEG-NF protocol, and the grand mean of all EEG-NF protocol mean effect sizes). Both the coefficient (B) and the p-value are reported for each comparison, as well as the degrees of freedom (df). Continuous moderators consisted of numerical data which could be directly correlated with effect sizes via a linear regression model with RVE. Similarly, the coefficient (B) of the slope is reported along with the df and p-value, to reflect the magnitude and direction of the relationship (e.g. between the amount of EEG-NF training received by participants and their subsequent memory performance). Categorical moderators that contained fewer than 5 effect sizes were excluded from all analyses. This resulted in the omission of the active non-EEG-NF condition from the control condition analysis (1 effect size), the gamma band frequency (2 effect sizes) being removed from the target frequency band analysis, and the auditory variable (4 effect sizes) being excluded from the modality analysis.
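The two coding schemes mentioned above differ in what each regression coefficient is compared against. The Python sketch below builds the design columns for both; the helper names and the level labels are hypothetical, and the choice of the last level as the one coded -1 under sum coding is a convention assumed here for illustration.

```python
def dummy_code(levels, reference):
    """Treatment (dummy) coding: each coefficient contrasts a level
    with the reference level, which is coded as all zeros."""
    others = [lvl for lvl in levels if lvl != reference]
    return {lvl: [1 if lvl == other else 0 for other in others]
            for lvl in levels}


def sum_code(levels):
    """Sum (deviation) coding: each coefficient contrasts a level with
    the grand mean; the last level is coded -1 in every column."""
    *leading, last = levels
    codes = {lvl: [1 if lvl == other else 0 for other in leading]
             for lvl in leading}
    codes[last] = [-1] * len(leading)
    return codes


if __name__ == "__main__":
    print(dummy_code(["recall", "recognition"], reference="recall"))
    print(sum_code(["alpha", "beta", "theta"]))
```

Under sum coding the columns add to zero across levels, so the intercept estimates the unweighted grand mean of the level means rather than the mean of a reference level.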

Sample characteristics
The systematic review included 46 studies with a total of participants (1192 observations); details of these studies can be found in Table 1. Of these studies just under half had been conducted in healthy volunteers (n = 22) with the rest in clinical populations or looking at the effects of a medical condition (n = 24). A wide variety of conditions have been examined but for many only a single study has been conducted in that area: Alzheimer's disease (n = 1), alcoholic dependence syndrome (n = 1), COVID-19 (n = 1), epilepsy (n = 1), insomnia (n = 2), mild cognitive impairment (n = 3), major depressive disorder (n = 1), multiple sclerosis (n = 2), obsessive compulsive disorder (n = 1), stroke (n = 5), and traumatic brain injury/concussion/brain tumour (n = 6). For all studies reviewed the sample sizes range from single-case studies up to 79 participants in total, with a maximum of participants in the experimental group (excluding single-cases, mean = 16, median = 11). For healthy volunteer studies, where there were no single-case studies, the mean number of participants in the experimental condition of interest is 13.9 (median = 10). In the clinical domain there are a significant number of studies which only have one participant in the experimental condition (n = 8); excluding these studies results in a mean number of participants in the experimental condition of 19.4 (median = 15).
In neurofeedback experiments some participants cannot regulate their brain activity in the desired way. Thus, positive effects on memory cannot be expected in these individuals if they are unable to complete the intervention. There is no standard definition of what would constitute a non-responder, but it has been estimated that the rate of these is between 16 % and 57 % (Alkoby et al., 2018). After excluding studies with one or two participants in the experimental condition we found that 28 (80 %) did not report information regarding how many participants were non-responders. In the 7 studies (including one study with two conditions) that did report the number of non-responders in the experimental condition the percentage ranged from 0 to 33.3 with an average of 17.7 % for healthy participants and 31.3 % for clinical patients.

Study design
A total of 17 studies (37 %) included no control measure, i.e. there was not a group or condition against which to compare the effects of the neurofeedback training on memory. These were largely single-case studies and pilot work. Five of the studies (2 healthy volunteer, 3 clinical) used a non-active control. In all these studies there was a control group, but this group did not do anything instead of the neurofeedback intervention and did not attend the lab according to the same schedule. Twenty-four of the studies did include an active control group or condition. Of the studies with a control condition or group (active or non-active), 3 had a within-subjects design (10.3 %) and 26 (89.7 %) had a between-participants design. All 3 studies with a within-subjects design counterbalanced the conditions, and for the between-subjects design, 22 studies randomised participants to the experimental and control groups. This meant that 4 studies did not implement randomisation or were not clear when reporting this information. A further design feature that studies can apply is blinding. Of those studies with a control group or condition, in 14 (48.3 %) participants were blinded to their group allocation, or the condition under which they were being tested. Double blinding was implemented in seven studies (24.1 %), whereby both participant and experimenter were unaware of who was in what condition. No blinding measures were included in eight studies (27.6 %), or this information was not clearly reported.

EEG-neurofeedback training
The number of feedback sessions included in EEG-NF training schedules ranged from a single session to 42 sessions, and the total amount of training provided to participants ranged from 25 min to 17.5 h, with four studies failing to report the latter. As might be anticipated, and as can be seen from Table 1, there seems to be a difference between single-case and group studies in the number and duration of neurofeedback sessions. The median number of sessions in single-case studies is 19 (mean = 19.8), with a median total duration of 8.8 h (mean = 9.2 h). In group studies there is a median of 10 sessions (mean = 10.8), with a median total duration of 3.5 h (mean = 4.6 h). Thus, the number and duration of neurofeedback sessions vary considerably across studies: even when single-case studies are excluded, training ranges from 1 to 40 sessions and from a few minutes to 16 h.
Forty-six studies were included in the qualitative review, and 7 of these investigated more than one frequency band (besides the neurofeedback control condition). Therefore, k refers to the number of protocols rather than to the number of studies (total k = 53). The EEG-NF protocols used across studies included alpha (8-12 Hz), beta (12-30 Hz), theta (4-8 Hz), gamma (30-100 Hz), slow cortical potential (SCP) and qEEG. All protocols involved up-regulation of the target frequency band unless otherwise stated. The protocol used the most in neurofeedback studies on memory in this review was beta (k = 22). In addition to general broadband beta (k = 4), this includes 16 sensorimotor rhythm (SMR) protocols, 1 up- and down-regulation of SMR coherence, and 1 sigma band. Most of the beta protocols (k = 20) used centrally located electrodes. Fourteen protocols examined alpha, comprising broadband alpha protocols (k = 6), 2 peak alpha frequency (PAF) protocols, 5 upper alpha (UA) and 1 mu. As might be anticipated, alpha was mainly measured at parietal sites (k = 5), with two additional protocols combining parietal with occipital sites. Occipital (k = 3) and central (k = 4) areas were also targeted with alpha. Theta was the focus of 5 protocols and featured in 1 protocol which involved down- instead of up-regulation. Electrode placement was generally at frontal regions (k = 4). A minority of protocols looked at gamma (k = 3), SCP (k = 2) and qEEG (k = 2). Five studies used protocols combining different frequencies. Across all protocols an average of 1.7 electrodes were used (median = 1), with a range of 1-6 electrodes. See Fig. 2 for an overview of EEG-NF protocols and electrode locations.
When participants receive neurofeedback, it can be delivered in different modalities. The studies in this review mainly presented feedback visually only (n = 22). This was typically a bar graph where participants had to try to keep the bar above a line (e.g. Kober et al., 2015b; Rozengurt et al., 2017), but also included richer displays like a rollercoaster (e.g. Eschmann et al., 2020; Wang and Hsieh, 2013). A combination of visual and auditory feedback was also popular (n = 18), and this could be achieved by presenting participants with a short acoustic tone and increasing the clarity of the picture. Less popular was solely auditory feedback (n = 3), where the aim was simply to increase the rate of the tone occurrences. Three studies did not report which modality was used to deliver neurofeedback.
In the majority of studies (n = 36), participants were not provided with explicit instructions on how to self-regulate target brain activity, or this information was not clearly reported. In these studies participants are generally told that the feedback they receive is determined by the characteristics of their EEG, and that they need to work out which mental state provides positive feedback and maintain it. Eight studies provided participants with suggested strategies to modulate target brain activity. One study (Byers, 1995) used instructions for the first part of the protocol but not the second, so is not included in the totals above. Some of these instructions were quite general, e.g. to use a combination of relaxation techniques and positive thought (Hoedlmoser et al., 2008), whereas others gave specific strategies for target bands, e.g. relaxation for theta and concentration for low beta (Rozengurt et al., 2017) and motor imagery for SMR (Kober et al., 2020).

Sample characteristics
The meta-analysis included only studies for which the relevant data were available, which had an active control condition, and which randomised participants to this or the experimental condition (or counterbalanced conditions in a within-participants design). This reduced the sample to 21 studies, with 361 participants across all these studies in the experimental condition/group. Most of these studies were on healthy volunteers, with only 2 conducted in clinical populations. The mean age of participants was 32.8 years (range 20-75.3). Some of the studies had multiple memory measures or looked at several target frequency bands and so generated a total of 44 effect sizes (range of 1-8 per study).

Primary analysis: Effect of EEG-NF on episodic memory
A statistically significant, small effect (Cohen, 2013) of EEG-NF on episodic memory performance was revealed: g = 0.31, SE = 0.09, t(17.1) = 3.49, p = 0.003, 95 % CI [.12, .49]¹ (see Fig. 3). A small amount of heterogeneity (I² = 18.2 %, τ² = 0.03) was detected between the studies analysed. Further exploration of this variance was conducted by way of moderator analyses and their individual estimates to examine the dispersion of effects.
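To make the pooled estimate and the heterogeneity statistics concrete, the following is a minimal sketch of a DerSimonian-Laird random-effects model. It is an illustration only: the function name and inputs are hypothetical, it treats every effect size as independent, and it does not reproduce the model actually used here (the fractional degrees of freedom reported above suggest a small-sample-corrected approach that handles multiple effect sizes per study).

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- list of per-comparison effect sizes (e.g. Hedges' g)
    variances -- list of their sampling variances
    Assumes independent effect sizes (a simplification).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # Between-study variance (tau squared), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # I squared: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau squared to each sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"g": pooled, "se": se, "tau2": tau2, "I2": i2, "Q": q}
```

With two hypothetical comparisons, g = 0.0 and g = 1.0, each with variance 0.1, the sketch returns a pooled g of 0.5 with τ² = 0.4 and I² = 80 %, i.e. most of the observed spread is attributed to true between-study differences rather than sampling error.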

Moderator analyses
Several moderator analyses were conducted to examine the effects of the sample, study design, EEG-NF training parameters and type of episodic memory measure. The results for all these analyses, including individual effects for each group, are summarised in Table 2.
EEG-NF success significantly moderated the effect of EEG-NF on episodic memory (B = 0.46, p = 0.007), such that where studies reported significant modulation of brain activity in the EEG-NF group relative to the active control group, a highly significant, approaching medium size effect on memory performance was revealed (g = 0.47, t(11.2) = 4.95, p < 0.001). In studies where no such modulation was reported, no effect was observed on memory performance.
Memory type (i.e. whether recognition or recall was being measured) was not a significant moderator of memory performance overall. However, at the sub-group level EEG-NF had a highly significant, small size effect on participants' ability to recall information (g = 0.34, t(15.1) = 3.54, p = 0.003). The analysis revealed no significant effect on recognition performance.
Memory modality (whether verbal or visual memory was being measured) significantly moderated the overall effect size (B = −0.34, p = 0.032). A significant, small size effect of EEG-NF on verbal memory was revealed (g = 0.37, t(13) = 3.65, p = 0.003), whereas it had no significant effect on visual memory.
There was no significant moderation effect of control condition on episodic memory performance. However, on a sub-group level, studies using a contingent control generated a highly significant, small size effect (g = 0.31, t(17.1) = 3.49, p = 0.003), whereas the effect was not significant for studies using a non-contingent control.
Another factor we explored was whether the EEG-NF training instructions given to participants moderated the overall memory effect size. Whether or not participants were given instructions regarding how to achieve the target brain state did not significantly influence overall memory performance. However, a small effect on memory performance was found in the sub-group analysis both for those who received no instructions (g = 0.23, t(12) = 2.22, p = 0.047) and for those who did (g = 0.44, t(4.7) = 2.66, p = 0.048). The modality of the neurofeedback did not moderate memory performance. However, on a sub-group level, protocols delivered visually showed a significant effect (g = 0.36, t(10.2) = 3.02, p = 0.013). In studies where a combined visual and auditory protocol was used, there was no significant effect of EEG-NF on memory performance.
There was no evidence that target frequency band, either in the moderation or sub-group analyses, had any impact on memory performance. Similarly, the amount of EEG-NF, whether measured by the total time or number of sessions, did not affect memory.

Discussion
This is the first systematic review and meta-analysis to examine the effect of EEG-NF on episodic memory in both healthy and clinical populations. The first aim of the systematic review was to provide a qualitative overview of the literature based on several factors, such as the participants, study design and neurofeedback protocols, to understand what research has been conducted in this area. Forty-six studies were found, with approximately equal numbers conducted in healthy volunteer and clinical groups. The second aim was to conduct a meta-analysis solely on studies with an active control condition or group, which contained randomised or counterbalanced participants, to determine if EEG-NF can enhance episodic memory and whether success in modulating brain activity affected this result.
The meta-analysis, which included 20 studies (39 effect sizes), revealed a small beneficial effect of EEG-NF on episodic memory performance. This finding is in line with the meta-analysis by Yeh et al. (2021) on six episodic memory studies. However, their effect size was much larger than ours (0.77 versus 0.31). This is likely because in the Yeh et al. (2021) analysis the effect size was calculated using only post-neurofeedback memory performance and one outcome per study was included. In the current meta-analysis, our calculations took into consideration participants' pre-neurofeedback memory performance, to provide an adequate baseline of their ability, thereby generating a more accurate effect size (Morris and DeShon, 2002). Also, multiple outcomes per study were included to avoid the selection bias which can occur when choosing only one outcome when multiple outcomes are available. Therefore, our analysis was more inclusive and based upon more studies, as we included all frequency bands, all episodic outcome variables, and we examined healthy volunteers and clinical populations.
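The difference between a post-only and a baseline-adjusted effect size can be illustrated with a short sketch. The exact formula used in this meta-analysis is not reproduced here, so the version below is an assumption: a common pre-post-control formulation that standardises the difference in gains between groups by the pooled pre-test standard deviation and applies Hedges' small-sample correction. The function name and all input values are hypothetical.

```python
import math


def hedges_g_pre_post_control(pre_t, post_t, sd_pre_t, n_t,
                              pre_c, post_c, sd_pre_c, n_c):
    """Baseline-adjusted standardised mean difference (assumed formulation).

    pre_*/post_* -- group means before and after training
    sd_pre_*     -- pre-test standard deviations
    n_*          -- group sizes (t = neurofeedback, c = control)
    """
    df = n_t + n_c - 2
    # Pooled pre-test standard deviation across the two groups.
    sd_pooled = math.sqrt(
        ((n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2) / df
    )
    # Difference in gains: training gain minus control gain.
    d = ((post_t - pre_t) - (post_c - pre_c)) / sd_pooled
    # Hedges' correction for small-sample bias.
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    return j * d


# Hypothetical example: both groups start at 10 items recalled;
# the training group gains 3 items and the control group gains 1.
g = hedges_g_pre_post_control(10, 13, 4, 20, 10, 11, 4, 20)
```

Because the control group's gain is subtracted, practice effects that are shared by both groups do not inflate the estimate, which is one reason a baseline-adjusted effect can be smaller than a post-only comparison.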
The finding that EEG-NF does improve episodic memory performance provides some incentive to conduct further research in this area, to determine if this technique could be developed as an intervention to enhance memory functioning in individuals. Given that it is low-cost, portable and could be conducted by the individual in their home, it would be ideally suited to this. However, there are further issues which would need to be considered. One, which is the same for any intervention, is how long the behavioural benefits last. Many of the studies in this review tested performance immediately after training; those that do look at longer intervals typically test after one to two weeks (e.g. Eschmann et al., 2020; Rozengurt et al., 2017). It is unknown if improvements are maintained over a longer timescale. Furthermore, there is very little research on training generalisability. If neurofeedback can enhance memory for the task tested in the protocol, will this also lead to a boost in memory capabilities in everyday life? The transfer of learning beyond the specific task tested to other tasks and to more ecologically valid activities is rarely examined. The second major question concerns the mechanisms and brain structures underlying episodic memory that neurofeedback is acting on. In this regard, neurofeedback using other imaging modalities, such as functional Magnetic Resonance Imaging (fMRI), might provide complementary information to EEG, due to its higher spatial resolution and ability to access deeper brain structures which are known to be important to memory, such as the hippocampus. Research in this domain is very much in its infancy, with very few studies. A proof-of-concept study by Hohenfeld et al. (2017, 2020) used real-time fMRI-based neurofeedback training of visuo-spatial memory in older adults and those with Alzheimer's disease. After three sessions of training, which targeted the parahippocampal gyrus, there was potentially some improvement in the delayed recall condition of a different visuo-spatial task. Thus, even if EEG-NF can enhance memory, a better understanding of the neural basis and more data on the longevity and transfer of the effect are required.
Although the moderator analysis was not significant, at the sub-group level it was found that EEG-NF had a small size, significant effect when participants free recalled or remembered source/contextual details, but the effect on recognition was not significant. The majority of tasks administered to participants were bespoke tasks delivered on a computer, but a few studies gave standardised neuropsychological tasks, which tend to be given in paper format (e.g. Rey Auditory Verbal Learning Test, RAVLT; Rey, 1964). These bespoke tasks encompass several different types of paradigms, such as paired associates, where participants learn pairs of items and then at test are given one of the items and have to recall the other (e.g. Berner et al., 2006; Hsueh et al., 2012, 2016); the Remember/Know paradigm, which taps participants' subjective ability to distinguish between being able to recover any contextual details from the encoding episode (a Remember response) or being aware that an item was previously presented but without any of these details (a Know response) (e.g. Keizer et al., 2010; Staufenbiel et al., 2014); and tasks where participants have to indicate if a test item is new or old and, if old, the encoding task that was completed on it (e.g. Eschmann et al., 2020). The memory tests administered can vary substantially in the number of items and the duration of the test. For example, Rozengurt et al. (2017) asked participants to encode 30 items and gave a free-recall test which took approximately 5 min, whereas other studies ask participants to encode and retrieve a few hundred items, which takes much longer (e.g. in Eschmann et al., 2020, 200 words were studied and 300 were in the test phase). There are also differences in the design of studies and how the memory tasks are administered. Rozengurt et al. (2017) were specifically interested in how neurofeedback could enhance consolidation, so participants studied items, received the neurofeedback and then had their memories tested in the same session, 24 h later and a week later. Other studies (e.g. Eschmann et al., 2020) look at transfer effects, whereby participants complete a baseline study and test memory task, receive neurofeedback (typically over several days), and then learn new items and are tested on them. Thus, there is great variety in the characteristics of the memory tasks used.
One way that these seemingly different tasks can be thought of is in terms of process. According to dual-process models of memory (e.g. Jacoby, 1991; Yonelinas, 2002), familiarity describes a fast and relatively automatic process that involves recognition of having previously encountered something, i.e. participants' ability to discriminate between old and new items. In contrast, recollection is a slower, more effortful process that involves conscious recollection of previously studied contextual detail, i.e. participants' ability to retrieve source information. Thus, tasks which require participants to free-recall or recover details from the study phase utilise recollection, whereas recognition tasks require familiarity (but can also be completed with recollection). Our results suggest that EEG-NF may target recollection rather than familiarity. That is extremely useful, as the decline in memory seen in aging (Friedman, 2013) and across clinical conditions such as Alzheimer's Disease and Mild Cognitive Impairment (Westerberg et al., 2006) and Depression (Dillon and Pizzagalli, 2018) all point to specific deficits in recollection. EEG-NF also appeared to have a specific effect on verbal memory for language-based stimuli (e.g. words), but there was no effect on visual memory for spatial form (e.g. objects, places, animals, and people). One explanation for this could be that retrieval of visual stimuli is known to be far superior to that of verbal stimuli (the so-called picture superiority effect; Paivio, 1971), so perhaps there was less capacity for participants to improve on this. It might also be that EEG-NF training has less impact on more automatic visual stimuli-based tasks and instead facilitates communication between the more distributed networks across the left prefrontal and temporoparietal regions used in linguistic processing (Binder et al., 1997). It was not possible in this review, due to a paucity of studies, to examine whether neurofeedback targeting a certain frequency band and location would be more likely to enhance recollection and memory for verbal stimuli, but future empirical work could address this.
A fundamental assumption of EEG-NF is that a participant's ability to successfully regulate their brain activity in the desired manner is related to a change in behavioural performance. The moderator analysis provided support for this by revealing that enhanced episodic memory performance was observed only in studies reporting a significant change in the target brain activity due to neurofeedback. In this meta-analysis, a binary code was used to represent self-regulation success; specifically, 'yes' if participants were able to achieve the target brain activity, and 'no' if not (as used by Rogala et al., 2016). A more robust approach could be to calculate an effect size to represent EEG-NF success and correlate this with memory performance effects. However, there is some variability in the units of measurement used to calculate changes in neural activity across studies (e.g. spectral power, time above threshold). Furthermore, the contrasts used to measure these differences vary: some compare pre- and post-EEG-NF resting blocks, whereas others compare rest/early active EEG-NF blocks with the average of all, or just later, active EEG-NF blocks. Together, this presents a challenge in synthesising these values appropriately in a meta-analysis. Nonetheless, this positive finding demonstrates the importance of the ability to self-regulate target brain activity to receive the associated benefits to memory.
One inherent issue when using EEG for neurofeedback is the production of eye and movement artefacts in the electrical signal during the training session. These artefacts can generate frequencies that overlap with the target brain frequency to be modulated. If artefacts are produced, it could be argued that any improvements in memory performance observed following EEG-NF may be due to artefact feedback, as opposed to any real changes in target brain activity being fed back to the individual. Many protocols try to mitigate these effects by using online, real-time artefact detection: when the thresholds at which eye and movement artefacts are usually seen are exceeded, the neurofeedback is interrupted and paused until the level of artefacts falls below the threshold. In addition, offline analyses can be implemented on the EEG data to detect artefacts and to correct or remove these, to ensure that when researchers quantify whether participants were able to successfully modulate their brain activity in the desired manner, this is not contaminated by the effect of artefacts. The vast majority of studies included in the meta-analysis (all except two) reported implementing some form of control for artefacts. Even if these two studies are excluded from the moderator analysis, the result is still significant. Thus, the enhancements in memory performance found in our meta-analysis when people can successfully modify their brain activity are likely to be the result of real changes in target brain activity rather than eye or movement artefacts driving neurofeedback success.
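The online thresholding logic described above can be sketched as follows. This is a deliberately simplified illustration rather than any reviewed study's pipeline: the function name is hypothetical, the 100 µV peak-amplitude threshold is an assumed placeholder, and real systems typically use channel-specific or frequency-specific criteria.

```python
def gate_feedback(epochs_uv, threshold_uv=100.0):
    """Decide, epoch by epoch, whether feedback should be delivered.

    epochs_uv    -- list of epochs, each a list of samples in microvolts
    threshold_uv -- assumed absolute-amplitude limit for artefact rejection
    Returns a list of booleans: True = deliver feedback, False = pause.
    """
    flags = []
    for epoch in epochs_uv:
        # Peak absolute amplitude is a crude proxy for blinks or movement.
        peak = max(abs(sample) for sample in epoch)
        flags.append(peak <= threshold_uv)
    return flags


# Hypothetical data: the second epoch contains a large deflection,
# so feedback would be paused for that epoch only.
flags = gate_feedback([[10.0, -20.0, 5.0], [150.0, 3.0, 2.0], [0.0, -40.0]])
```

Pausing feedback during contaminated epochs means participants are never rewarded for producing artefacts, which is the property the argument above relies on.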
However, there are some individuals who cannot produce the target brain activity during neurofeedback. This has been reported to be approximately one-third of individuals (Enriquez-Geppert et al., 2017), and our findings suggest rates of up to this figure. However, we also found that the vast majority of studies did not report the number of non-responders, so this number might not be reliable, and practices around non-responders in many studies were not clear. This presents a couple of issues in EEG-NF research. First, the inclusion of non-responders might serve to diminish the overall observed effect of EEG-NF on memory performance at a group level. Second, the exclusion of non-responders from relevant analyses might render a sample insufficiently powered to detect the effect of interest. Furthermore, if studies do identify non-responders, there is a lack of consensus as to what measure to use to do this and how to define a non-responder. For example, Rozengurt et al. (2017) described them as those who cannot increase their target band power ratio by at least 5 % relative to baseline, whereas others have defined them as those whose total target band duration in the last session is not greater than the 95 % confidence interval of the total duration in the first three sessions (Hsueh et al., 2016). Recent research has examined which individual differences predict responder ability. Psychosocial factors such as attention/concentration, motivation and mood have been linked to self-regulation ability (Kadosh and Staunton, 2019). Also, brain volume, fluid intelligence and alpha power at rest have predicted responders (Enriquez-Geppert et al., 2017; Khodakarami and Firoozabadi, 2020; Kober et al., 2017a,b). Taken together, these points suggest that future research using EEG-NF could benefit from the use of a widely adopted, operational definition of a responder; accurate reporting of the number of responders per study; and the collection of informative participant data that may assist researchers in identifying non-responders.
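As an illustration of what an operational definition might look like in practice, the sketch below applies the 5 % relative-increase criterion described above for Rozengurt et al. (2017). The function names and the example values are hypothetical, and how the baseline and training power ratios are computed is left as an assumption.

```python
def is_responder(baseline_ratio, training_ratio, min_gain=0.05):
    """Return True if the target band power ratio rose by at least min_gain.

    baseline_ratio -- power ratio at baseline (must be positive)
    training_ratio -- power ratio during or after training
    min_gain       -- required relative increase (0.05 = 5 %)
    """
    if baseline_ratio <= 0:
        raise ValueError("baseline_ratio must be positive")
    relative_gain = (training_ratio - baseline_ratio) / baseline_ratio
    return relative_gain >= min_gain


def non_responder_rate(pairs, min_gain=0.05):
    """Percentage of participants classed as non-responders.

    pairs -- list of (baseline_ratio, training_ratio) tuples
    """
    flags = [is_responder(b, t, min_gain) for b, t in pairs]
    return 100.0 * flags.count(False) / len(flags)


# Hypothetical sample of four participants: two increase their
# ratio by 10 %, one by 2 %, and one shows no change.
rate = non_responder_rate([(1.0, 1.10), (1.0, 1.02), (2.0, 2.2), (1.5, 1.5)])
```

Reporting both the criterion and the resulting rate in this explicit form would make non-responder figures comparable across studies.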
Perhaps surprisingly, we did not find that total time or the number of neurofeedback sessions that the participant completed moderated the effect of neurofeedback on episodic memory. There are a variety of explanations for this. One possibility is that what is important is the training intensity, i.e. how many sessions participants complete over what period of time (Esteves et al., 2019; Rogala et al., 2016). Alternatively, a critical variable might be the extent to which participants can control and pace the training sessions for themselves rather than this being externally dictated (Uslu and Vögele, 2023). We also found that target band frequency was not a moderator of memory performance. These results could be partly because there is ongoing debate regarding the specific role of different oscillations in memory, but also because of the small number of studies per band (except beta), rendering us possibly underpowered to detect these effects. In any event, drawing confident conclusions about band specificity at a meta-analytical level remains a challenge, given that many studies do not report activity across the full power spectrum, only the target band. Better transparency regarding this should elucidate the contribution from adjoining bands or coupled frequencies (Ros et al., 2020). Furthermore, some research shows enhanced effects of EEG-NF on both neural and cognitive outcomes with personalised feedback, such as individual peak alpha or individualised theta (Alkoby et al., 2018).
There is some debate in the literature regarding whether giving specific instructions to participants assists self-regulation of the target brain activity. The majority of studies in the qualitative review did not give explicit instructions to participants. In the meta-analysis there was tentative evidence of better memory performance both when participants did not receive any instructions with respect to how they should achieve the target brain state and when they did. A recent study (Chikhi et al., 2023) explicitly tested the effect of instructions by giving one group of participants a list of mental strategies, based on previous studies which had trained the same target band, and another group no strategies. Contrary to expectations, they found that giving participants instructions about strategies did not enhance their ability to modulate the target band frequency. They suggest that this might have been because the strategies given were too numerous or not relevant. However, they did find a link between certain self-reported strategies and higher target band activity, highlighting that specific strategies may play a role in how well participants can modulate their brain activity. Further work explicitly examining strategies and applying a more fine-grained classification of them would be useful (see Lubianiker et al., 2022) and might also help researchers to reduce the number of non-responders, as these could be individuals who are unable to find or to implement an effective strategy.
The design quality varied across studies, with just over a third of studies not reporting a control group or condition, and of those that did, the majority randomised or counterbalanced participants. For those with a control group or condition, around three-quarters implemented some form of blinding, with the rest either failing to do this or failing to report it.
There was a suggestion in the moderator analysis, which only included studies with an active control group, that studies using a contingent control showed a beneficial effect on memory performance, which was not found for studies using a non-contingent control group. An explanation for this could be that participants in a contingent group are being trained to specifically regulate activity that is unrelated to the target frequency band, so there is potentially better separation in measured activity between the experimental and control group. Conversely, in a non-contingent group, participants could be upregulating frequencies within the target band, thereby obscuring the effect. Non-contingent controls, where participants detect them, can be associated with negative effects, such as frustration and decreased motivation due to the lack of control over the feedback received (Sorger et al., 2019; Witte et al., 2013), and risk unblinding the participants. Thus, in healthy volunteer research a contingent control condition might be best, as participants can exert control over brain activity, which eliminates the negative issues arising from a lack of this and may allow the experimenter to demonstrate greater specificity in the neurophysiological mechanism (Sorger et al., 2019).
Finally, our analysis of the studies included in the qualitative review revealed that the sample size in many of these, even excluding single-case studies, was very low. A power calculation reveals that for a one-tailed test (alpha = 0.05, power = 0.8) comparing two unmatched groups, 21 participants would be required in each group to detect a large effect (d = 0.8) and 51 to detect a moderate effect (d = 0.5). Thus, many of the studies are insufficiently powered to detect a large effect size, and none of the studies have sufficient participants to detect a moderate effect of neurofeedback on memory in a between-participants design. This review demonstrates the fundamental need for larger samples to be used in EEG-NF research to reliably reveal its true effect on episodic memory.
In conclusion, the meta-analysis based on actively controlled studies revealed a small-size, significant positive effect of EEG-NF on episodic memory performance. Effects of EEG-NF were larger for tasks requiring retrieval of details around the encoding episode, with enhanced performance in remembering verbal stimuli. Importantly, the overall effect was significant for studies reporting that participants were successful in self-regulation of the target frequency band. Therefore, the efficacy of EEG-NF to improve episodic memory shows promise. However, sufficiently powered studies with adequate study design features are required to provide stronger empirical support for this intervention. Moreover, there is a need to investigate the characteristics of responders and the specific effects of different EEG-NF protocols on underlying neural systems involved in memory processes.
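The order of magnitude of these sample sizes can be checked with a short sketch based on the normal approximation for a two-group comparison, n = 2(z_alpha + z_power)² / d² per group. This is an approximation under stated assumptions, not the calculation used here: the function name is hypothetical, and because it ignores the extra uncertainty of the t distribution it returns values slightly below the figures quoted above (20 and 50 rather than 21 and 51).

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.8, one_tailed=True):
    """Approximate participants per group for a two-group comparison.

    d          -- standardised effect size (Cohen's d)
    alpha      -- significance level
    power      -- desired statistical power
    one_tailed -- use a one-tailed critical value if True
    Uses the normal approximation, so results run slightly low
    compared with an exact t-based calculation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha if one_tailed else 1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


large = n_per_group(0.8)     # approximation gives 20 per group
moderate = n_per_group(0.5)  # approximation gives 50 per group
```

The quadratic dependence on d is the practical point: halving the expected effect size roughly quadruples the required sample, which is why small effects such as the pooled g = 0.31 demand far larger groups than most of the reviewed studies recruited.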

Fig. 2. Diagram depicting the number of protocols of each frequency band type and the predominantly used electrode location for each.

Fig. 3. Forest plot showing the overall effect of EEG-NF on episodic memory performance and the distribution and weighting of effect sizes across studies, represented by the size of the square. Error bars represent the 95 % confidence interval of the effect. Squares to the left of zero indicate a negative effect of EEG-NF on memory. Squares to the right of zero indicate a positive effect of EEG-NF on memory. The white diamond and dotted line represent the pooled effect size.

¹ The result of the meta-analysis on the study set before outliers were removed was still significant but with a smaller effect size: g = 0.28, SE = 0.11, t(19.3) = 2.55, p = 0.019, 95 % CI [.05, .50].

Table 1
Sample, study design and EEG-NF training characteristics.

Table 2
Moderator Analyses.
Note. Significant (p < .05) moderators and individual estimates in bold. Dummy-coded categorical moderators: B represents the difference between estimated effects for each group. Contrast (sum)-coded categorical moderators: B represents the difference between estimated effects for each group and the grand mean of that category. Continuous moderators: B represents effect size change relative to one-unit moderator change. Abbreviations: CI = confidence interval; df = degrees of freedom; ES = effect size; g = Hedges' g; GM = grand mean; I² = I-squared measure of heterogeneity; p = probability value; τ² = Tau squared.