Introduction

Cognitive impairments associated with mild cognitive impairment (MCI) or more severe forms of dementia have been well described in terms of memory deficits or executive dysfunction (Bastin et al., 2019; Cohen et al., 2019; Glisky, 2007; Mimura & Yano, 2006), with the predominant difficulties in retrospective memory, at least in the earlier stages of Alzheimer’s disease (AD; Huppert & Beardsall, 1993; Murman, 2015). In the 1990s, increasing interest was dedicated to another form of memory process called prospective memory (PM). PM is defined as remembering to carry out intended actions when a specific target occurs or at an appropriate time in the future (Burgess & Shallice, 1997; Ellis & Kvavilashvili, 2000; Kliegel, Jager, et al., 2008; Kliegel, McDaniel, et al., 2008; McDaniel & Einstein, 2007, 2011). It is a highly complex process that requires formulating plans and intentions, retaining the information, and then executing the planned intention at the appropriate future moment. Two distinct PM components have been identified in the execution of a PM task: the retrospective component, which is responsible for the initial encoding and long-term retention of the content of the intention, and the prospective component, which refers to the ability to autonomously activate the intention at the right moment without any explicit prompt to recall being given (McDaniel & Einstein, 2000). An additional distinction concerns the cue that triggers the PM action: if the cue is event-based, a person performs a PM action when a specific event occurs; conversely, if the cue is time-based, a person forms a self-generated intention to perform an action at a specific time in the future. Everyday examples of PM activities include remembering a doctor’s appointment at 4 pm (i.e. time-based PM) or remembering to buy milk at the store on the way home (i.e. event-based PM; Kliegel, Jager, et al., 2008; Kliegel, McDaniel, et al., 2008).

Executive and declarative memory processes are differentially implicated in the two PM components. Indeed, the encoding and long-term retention of the associative relationship between a specific event or time and the concrete actions to be performed require the correct functioning of the declarative memory system. Conversely, the executive system is mainly implicated in controlling the mental operations needed to spontaneously activate the prospective intention at the appropriate time or at the occurrence of the specific cue. To resemble everyday experiences, laboratory PM tasks are typically embedded in an ongoing task; participants need to share their cognitive resources between performing the ongoing task and keeping track of the PM task. Periodically, they must monitor for the occurrence of the appropriate cue or time to initiate task execution. When the appropriate moment finally occurs (triggered either by a cue or by time), they must stop performing the ongoing task and begin performing the intended action. Top-down attentional control, strategic monitoring of the external environment and/or of the passage of time, and shifting between concurrent activities are all cognitive abilities under the control of the executive system (Laera et al., 2021; Martin et al., 2003; Schnitzspahn et al., 2013), which is known to be affected by age (McFarland & Glisky, 2009).

Previous neuroimaging studies have identified a significant role of anterior frontal regions in PM functions (Burgess et al., 2001, 2003; Lamichhane et al., 2018; Zhuang et al., 2021). Brodmann Area 10 (BA 10) has been implicated in directing attention toward either stimulus-oriented or stimulus-independent thoughts (Burgess et al., 2007). The medial and lateral portions of the anterior prefrontal cortex are critical in balancing attention between the external ongoing stimuli and the internally represented PM intention. A recent meta-analysis (Cona et al., 2015) further indicated that PM may rely on the dorsal frontoparietal network, which is involved mainly in the maintenance phase and seems to mediate the strategic monitoring processes (top-down attention both towards external stimuli and to internal memory contents). The ventral frontoparietal network is recruited in the retrieval phase and probably supports bottom-up attention that is captured by external PM cues and activated, internally, by the intention stored in memory. Together with other brain regions (i.e. insula and posterior cingulate cortex), the ventral frontoparietal network would support the spontaneous retrieval processes. Neuroimaging studies investigating time-based PM have identified specific activations in the superior and middle prefrontal cortex as well as the precuneus (Gonneaud et al., 2014). Okuda et al. (2007) found that the left rostral prefrontal cortex was more active in time-based than in event-based PM tasks, and also reported a bilateral decrease in blood flow in medial BA 10 regions during the event-based relative to the time-based PM tasks. Moreover, the authors found that time-based PM recruited far more prefrontal regions than event-based PM did, depending on clock availability. In a recent study, Morand and colleagues (2021) showed that reduced time-based PM performance in older adults correlated with diminished white matter integrity, particularly within the tracts of the superior fronto-occipital fasciculus, whereas no correlations with grey matter volume were found.

Deficits in memory and executive functioning, which are involved in PM (Laera et al., 2021; McFarland & Glisky, 2009; Schnitzspahn et al., 2013), are characteristic features of MCI and dementia (e.g. Alzheimer’s disease, AD; Arnáiz & Almkvist, 2003; Bäckman et al., 2005; Baddeley et al., 2001; Petersen, 2004). MCI is an intermediate clinical state between normal cognitive decline due to ageing and dementia (Albert et al., 2011; Díaz-Mardomingo et al., 2017). It has been observed that people with MCI progress to dementia at very different rates, with an average conversion rate of 10% per year; Petersen (2003) reported that after approximately 6 years, 80% of the MCI cohort had progressed to dementia. However, not all individuals with MCI progress to AD (Petersen et al., 1999), raising the concern that MCI is both a clinically and etiologically heterogeneous grouping. On the other hand, AD is a progressive age-related neurodegenerative disease associated with distinct pathological changes (extracellular accumulation of amyloid-beta-containing plaques and intracellular development of tau-containing neurofibrillary tangles) in cortical and subcortical regions (McKhann et al., 2011). According to the amyloid cascade theory, the main cause of AD consists in the precipitation of beta-amyloid proteins and the formation of extracellular plaques, which leads to inflammatory processes and finally results in cognitive deficits (de Vrij et al., 2004; Morishima-Kawashima & Ihara, 2002). There is evidence that the AD process may begin years before clinical symptoms appear (Petersen, 2003). Monitoring cognitive decline, and more specifically memory and PM, in older adults is fundamental at the clinical and experimental levels to detect, in each individual, the first manifestations of more severe cognitive decline.

Our capacity to shape and direct our future behaviour is of fundamental importance in the development, pursuit, and maintenance of an independent and autonomous lifestyle from early childhood to late adulthood. Adequate PM abilities are fundamental for social interaction and everyday maintenance, such as remembering a friend’s birthday, stopping at the grocery store on the way home from work, or paying bills before the due date. Many other PM tasks are central to health needs, in particular for older adults, such as remembering to take medication and remembering to monitor indexes of physical function (e.g. blood sugar levels; Hering et al., 2018). Given the importance of PM tasks in everyday life and the demographics of an increasingly ageing society, it is important to understand PM performance in healthy and clinical ageing.

In 2012, van den Berg and colleagues conducted a meta-analysis investigating event-based and time-based PM performance in healthy older adults and in patients with different degrees of cognitive decline. The meta-analysis included 14 studies (7 included AD patients, 4 included MCI patients and 3 included both groups; Footnote 1) and showed no statistical difference between the PM impairment in MCI and AD; both types of patients exhibited large deficits in PM compared to healthy older adults (Cohen’s d of −1.62). Those results were surprising considering that AD patients exhibit more severe overall cognitive impairments than MCI patients (Albert et al., 2011; Arnáiz et al., 2003); the authors took them to corroborate earlier suggestions that PM is already affected in the early stages of cognitive decline (Huppert & Beardsall, 1993). In addition, the size of the impairment was comparable for event-based and time-based PM, as well as for PM and retrospective memory (RM), and the meta-analysis did not find evidence of publication bias. Those findings were of particular interest because they highlighted the importance of including PM measures in clinical settings to further test PM decline in the early stages of dementia.

Since 2012, we have observed an increasing interest in PM performance in clinical populations that mirrors the great interest in understanding PM impairment and developing new training to compensate for and enhance remaining PM competencies (Hering et al., 2018; Kliegel et al., 2016). Considering that the number of older adults will increase drastically over the next decades, it is timely to understand how PM declines as one gets older and as signs of cognitive decline become more severe. The primary aim of the present meta-analysis was to quantify the nature and extent of PM deficits in MCI and AD, adding all the new evidence that has appeared in the last decade to the literature reviewed in the previous meta-analysis. PM is a valid construct in neuropsychological assessment in patients with MCI and AD; a better understanding of prospective memory abilities in patients with MCI or AD will provide the opportunity to better comprehend the functioning of PM and to assist researchers and clinicians in shaping increasingly effective and patient-centred intervention projects.

Method

Literature Search (PRISMA)

A systematic search was performed following the PRISMA recommendations (Moher et al., 2009; Page et al., 2021). Building on the previous meta-analysis (van den Berg et al., 2012; literature search from 1990 to July 1, 2011), we conducted our search from 2011 to February 2022. First, we consulted PsycInfo, PubMed, and Web of Science using the terms: “prospective memory”, “event-based prospective memory”, “time-based prospective memory”, “PM”, “event-based PM”, “time-based PM”, “EBPM”, or “TBPM”, in combination with “Dementia”, “Alzheimer”, “AD”, “mild cognitive impairment” or “MCI”. Reference lists from published reviews, books, and chapters were additionally checked to identify studies that might have been missed by the database search. The literature search was conducted by the librarian assistant working at the Department of General Psychology, University of Padova; article selection was conducted independently by GM and RRC and two research assistants at the Department of General Psychology. Any disagreement was resolved by discussion until a consensus was reached. In total, we found 382 potentially relevant studies, and 232 records remained after duplicates were removed. Among them, 46 studies met the inclusion criteria described below and formed the sample for our meta-analysis (Fig. 1).

Fig. 1 Flowchart of the studies included in the systematic review and meta-analysis

Inclusion Criteria

The inclusion criteria were as follows:

  1. Published articles or theses reporting measures of event-based or time-based PM abilities in patients with AD or MCI,

  2. The studies additionally assessed PM in a group of healthy older participants used as a comparison group,

  3. Designs with standard encoding of the instructions, excluding conditions in which the participants received strategies that presumably affect PM performance (such as implementation intention encoding; Lee et al., 2016; Shelton et al., 2016),

  4. The studies contained sufficient information to calculate at least one effect size (otherwise, authors were contacted, and the studies included if the information was provided),

  5. Participants did not suffer from any other neurological or psychiatric condition.

Statistical Analysis

Effect Size

As the studies used different PM tasks, we opted for the standardized mean difference as an estimator of the effect size of accuracy in PM tasks (for the sake of homogeneity between the studies, measures of reaction times were not included). Specifically, we used the between-group Hedges’ g, which reduces the small-sample bias of the classic Cohen’s d through a correction factor (J),

$$g=J\times \frac{{\text{M}}_{\text{patient}} \, - \, {\text{M}}_{\text{control}}}{{\text{SD}}_{\text{pooled}}},$$
(1)

with variance,

$${V}_{g}={J}^{2}\times \frac{{n}_{patient}+{n}_{control}}{{n}_{patient}\times {n}_{control}}+\frac{{d}^{2}}{2\times \left({n}_{patient}+{n}_{control}\right)},$$
(2)

where \({\text{M}}_{\text{patient}}\) and \({\text{M}}_{\text{control}}\) represent the means of AD/MCI patients and healthy controls, respectively; \({n}_{patient}\) and \({n}_{control}\) are the number of participants in each group; \({\text{SD}}_{\text{pooled}}\) is the pooled standard deviation for the scores of both groups; and d is the uncorrected standardized mean difference (Cohen’s d; Borenstein et al., 2021). In those studies in which the size of the control group exceeded 1.5 times the size of the group of patients, the sampling variance was calculated by replacing the number of control participants with the number of patients:

$${V}_{g}={J}^{2}\times \frac{2\times {n}_{patient}}{{n}_{patient}^{2}}+\frac{{d}^{2}}{4\times {n}_{patient}},$$
(3)

This procedure prevents studies that are large mainly because of the size of their control sample from contributing more (i.e. having a smaller variance) to the final meta-analytic effect than smaller studies with a similarly sized group of patients. The correction factor J was calculated as follows:

$$J=1-\frac{3}{4\times \left({n}_{patient}+{n}_{control}-2\right)-1}$$
(4)

Negative values of g represent worse PM performance for patients than controls, whereas positive values index the opposite case. Where necessary, we multiplied the effect size by −1 to maintain this coherence among the effects (of note, this procedure was only applied to the outcomes in Shelton et al., 2016).
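
As an illustration of Eqs. 1 to 4, the computation of g and its sampling variance can be sketched in Python. This is a minimal sketch under stated assumptions, not the code used for the analyses: the function and parameter names are hypothetical, the pooled standard deviation uses the standard formula (as in Borenstein et al., 2021), the variance is read as J² multiplying the whole variance of d, and J is assumed to be computed from the actual group sizes even when the control group is capped.

```python
import math


def hedges_j(n_patient, n_control):
    """Small-sample correction factor J (Eq. 4)."""
    return 1.0 - 3.0 / (4.0 * (n_patient + n_control - 2) - 1.0)


def pooled_sd(sd_patient, n_patient, sd_control, n_control):
    """Pooled standard deviation of both groups (standard formula, assumed)."""
    num = (n_patient - 1) * sd_patient ** 2 + (n_control - 1) * sd_control ** 2
    return math.sqrt(num / (n_patient + n_control - 2))


def hedges_g(m_patient, sd_patient, n_patient,
             m_control, sd_control, n_control,
             cap_ratio=1.5, higher_is_worse=False):
    """Return (g, V_g) for one patient-vs-control comparison.

    cap_ratio and higher_is_worse are hypothetical parameter names.
    """
    # Uncorrected standardized mean difference (Cohen's d), as in Eq. 1.
    d = (m_patient - m_control) / pooled_sd(
        sd_patient, n_patient, sd_control, n_control)
    # Flip the sign when higher scores mean worse performance, so that
    # negative values always mean worse performance in patients.
    if higher_is_worse:
        d = -d
    j = hedges_j(n_patient, n_control)
    # Eq. 3: if controls exceed cap_ratio times the patients, the variance
    # is computed as if the control group had the size of the patient group.
    n_c = n_patient if n_control > cap_ratio * n_patient else n_control
    v_d = (n_patient + n_c) / (n_patient * n_c) \
        + d ** 2 / (2.0 * (n_patient + n_c))
    # Assumed reading of Eq. 2: J squared scales the whole variance of d.
    return j * d, j ** 2 * v_d
```

For example, `hedges_g(4.0, 2.0, 20, 6.0, 2.0, 20)` gives g ≈ −0.98 with a variance of about 0.108; with 100 controls instead of 20, the variance term is computed as if there were 20 controls.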

Meta-analytic Approach, Heterogeneity, and Moderator Analysis

Because most of the included studies contributed more than one effect size from the same sample, we used the robust variance estimation method (RVE; Hedges et al., 2010) using the robumeta package for R (Fisher et al., 2017) to conduct multilevel models, with a prespecified within-study effect-size correlation of .80 (although sensitivity analyses were conducted with other correlation values to test the robustness of the models: 0, .2, .4, .6, and 1). The significance level was set at .05. This method allows for dealing with a correlated structure of outcomes from the same study. We chose a correlated dependence model with small-sample corrections (Tipton, 2015), and effect sizes were nested within each independent sample of participants. Note that some studies also contributed several experiments and/or multiple samples of patients, so we decided to select independent samples as the nesting variable (i.e. the main source of dependency). First, we tested the overall difference in PM between AD/MCI patients and healthy controls. Moreover, we computed the common heterogeneity indexes: τ² and I².
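
To make the two heterogeneity indexes concrete, the sketch below computes τ² and I² for a set of effect sizes with the classic DerSimonian-Laird estimator. It is a deliberately simplified stand-in and not the RVE correlated-effects model fitted with robumeta: it assumes one independent effect size per sample and ignores the within-study dependence that RVE handles. The function name and the returned dictionary are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects summary with DerSimonian-Laird tau^2 and I^2.

    Simplified illustration: assumes independent effect sizes.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]          # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Percentage of total variability not attributable to sampling error.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}
```

With two effects of −0.5 and −1.5 and sampling variances of 0.1 each, this yields Q = 5, τ² = 0.4 and I² = 80%, that is, most of the observed spread exceeds what sampling error alone would produce.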

In a second step, we repeated the analyses including variables that could have a moderating effect on the final estimate and possibly account for part of the heterogeneity. We, thus, fitted separate multilevel meta-regressive models with the following moderators, one model per moderator:

  1. Neurological condition: AD patients vs. controls or MCI patients vs. controls;

  2. Mean Mini-Mental State Examination (MMSE) score of the group of patients (Footnote 2);

  3. The standardized mean difference (Hedges’ g) in the reported neuropsychological tests of (3a) retrospective memory, (3b) executive functions, (3c) working memory, and (3d) processing speed (Footnote 3);

  4. Mean age of the study participants (in years);

  5. Mean years of education of the study participants;

  6. Type of measure regarding its cue for action: event-based or time-based PM;

  7. Type of PM task: classic neuropsychological PM tasks or other PM tasks (Footnote 4);

  8. Year of publication of the study;

  9. If the study was published before or after the meta-analysis by van den Berg et al. (2012).

Moreover, we conducted a meta-regression with the standard error of the effect size as a covariate to test for the existence of publication bias. If the publication process favours significant results that confirm the predominant theories over null outcomes, it would be more likely to observe larger effects in smaller studies (i.e. small-study effect). This would translate into asymmetrical distributions of the effect sizes, especially among studies with larger standard errors, with few small-to-null results (Egger et al., 1997). Therefore, a way to test the existence of publication bias is through a meta-regressive model using the standard error as a predictor. Moreover, the intercept of that meta-regression can be used as the adjusted overall effect (i.e. the intercept when the standard error is close to zero; Stanley & Doucouliagos, 2014). In the present work, we chose a variance-stabilizing transformation for the standardized mean difference (h) to conduct the test of asymmetry; this transformation prevents the artefactual dependence between the effect size and its precision estimate (Pustejovsky & Rodgers, 2019). In parallel, we implemented the same analysis with the ordinary Hedges’ g and a modified formula of the sampling variance (W) to adjust the final effect without changing the scale of the effect size, unlike the variance-stabilizing transformation.
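
The logic of the small-study test can be illustrated by regressing the effect sizes on their standard errors. The following sketch fits that regression by weighted least squares with inverse-variance weights and returns the intercept (the effect expected as the standard error approaches zero) and the slope (the asymmetry). It is a plain Egger-type regression offered under stated assumptions: it uses neither the RVE multilevel structure nor the h transformation or the modified variance W described above, and the function name is hypothetical. Any precision measure can be passed as `std_errors`.

```python
def egger_regression(effects, std_errors):
    """Weighted regression of effect sizes on their standard errors.

    Weights are 1 / SE^2. Returns the intercept (adjusted effect at
    SE = 0) and the slope (small-study asymmetry).
    """
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    # Weighted means of the predictor (SE) and the outcome (effect size).
    x_bar = sum(wi * xi for wi, xi in zip(w, std_errors)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Weighted covariance and variance give the slope in closed form.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, std_errors, effects))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, std_errors))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return {"intercept": intercept, "slope": slope}
```

With negative effects coded as impairment, a negative slope means that less precise studies report larger impairments, which is the pattern expected under a small-study effect.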

Finally, to find out which combination of moderators provided the best fit for the data, we carried out a backward stepwise selection (\({\alpha}_{\text{exclusion}}\) = .10) with all the moderators. This procedure considers more complex structures of moderators and allows us to examine the residual heterogeneity of the best meta-regressive model.

Results

The meta-analysis included 46 studies (Footnote 5) investigating the differences in PM in AD patients compared to healthy older adults (17 studies), in people with MCI (24 studies), or both conditions in the same article (5 studies: Kazui et al., 2005; Massa et al., 2020; Thompson et al., 2010; Thompson et al., 2011; Troyer & Murphy, 2007; Tables 1 and 2). In three samples (Huppert & Beardsall, 1993; Mori & Sugimura, 2007; Thompson et al., 2010, 2011), AD patients were mixed with patients with other types of dementia (e.g. Lewy-body or vascular dementia), but the latter represented a small proportion of the samples (Huppert & Beardsall: 6%; Thompson et al.: 10%). All the findings in the present work remained identical when these three samples were excluded in subsequent sensitivity analyses. The use of different within-study effect-size correlations in RVE models (i.e. 0, .2, .4, .6, and 1, instead of the prespecified .8) did not affect the results either. The 46 studies contributed a total of 63 independent samples and 129 effect sizes from 4668 participants (2115 patients and 2553 controls).

Table 1 Characteristics of studies including AD patients
Table 2 Characteristics of studies including MCI patients

Consistent with the findings in the preceding literature, patients with AD and MCI showed remarkable impairments in PM compared to healthy controls, g = −1.12 [−1.27, −0.98], p < .0001. In contrast to the meta-analysis by van den Berg et al. (2012), this result arose from a pool of effect sizes that were highly variable among themselves, more than could be explained by sampling error (i.e. heterogeneity): τ² = 0.24, I² = 77.86%. This suggests that a great portion of the observed variability between the effect sizes of the studies (77.86%) was potentially due to the influence of moderating variables and other sources of variability different from chance. Studentized residuals (> 2) and Cook’s distance [> 4/(n − 1)] allowed us to identify one outlier study contributing disparate outcomes (g < −3.4; Dermody et al., 2016), probably because of its small sample of AD patients (12 participants). Another reason for those outlying effects could be that the information necessary for estimating them was not available in the manuscript, and we extracted it from the graphs instead (using WebPlotDigitizer, https://automeris.io/WebPlotDigitizer). After excluding the outlying outcomes from Dermody et al. (2016), the overall effect and heterogeneity were reduced, g = −1.10 [−1.24, −0.96], p < .0001, τ² = 0.22, I² = 76.47%, although heterogeneity remained substantial.

The difference between AD and MCI explained part of the observed variability among studies: AD patients exhibited significantly lower PM performance than patients with MCI (AD: g = −1.45 vs. MCI: g = −0.89; Table 3 and Fig. 2). However, this approach contrasted samples of patients assessed in separate studies and under potentially diverse conditions (different PM tasks, settings, degree of cognitive impairment, etc.). Consistent with the previous result, the difference between AD and MCI patients was statistically significant when the meta-analytic model was fitted only with studies that included samples of both neurological conditions (i.e. assessed under the same procedure), \({g}_{\text{AD vs. MCI}}\) = −0.71 [−0.94, −0.49], p = .005, τ² = 0, I² = 0%. Similarly, AD patients showed larger PM impairment compared to MCI patients when the model only included classic (and the most established) neuropsychological PM tests (i.e. Cambridge Prospective Memory Test, CAMPROMPT, Wilson et al., 2005; Memory for Intentions Screening Test, MIST, Raskin, 2009; Rivermead Behavioral Memory Test, RBMT, Wilson et al., 1989; and Royal Prince Alfred Prospective Memory Test, RPA-ProMem, Radford et al., 2011), g = −2.08 [−3.02, −1.13], p = .003, τ² = 0.20, I² = 74.18%. As expected, lower MMSE scores predicted larger impairments in PM (MMSE: p = .014). However, the average performance in retrospective memory, executive functions, working memory, and processing speed tests, as well as age and education, did not explain PM impairments (ps > .05; Supplementary Tables 1 and 2 report the cognitive tests used in each study). There was no difference between time-based and event-based PM measures (p = .467), nor when we examined it separately in each neurological condition (AD: p = .126; MCI: p = .721). It is important to note that the available number of time-based PM measures, especially in AD patients, remains more limited than for event-based measures (k = 25 vs. k = 92; AD: k = 5 vs. k = 37). When the role of this moderator was examined with a multilevel Bayesian meta-analysis (Footnote 6), the model in MCI patients suggested strong evidence against a difference between both types of PM measures (β = 0.05, 95% CrI [−0.10, 0.20], BF10 = 0.10), whereas the evidence in AD patients was still inconclusive, although compatible with a larger impairment in time-based PM tasks (β = −0.41, 95% CrI [−0.92, 0.11], BF10 = 0.87). The meta-analytic results remained similar when the sample of studies was constrained only to those including both event-based and time-based PM measures (MCI: β = 0.06, 95% CrI [−0.10, 0.22], BF10 = 0.11; AD: β = −0.29, 95% CrI [−0.83, 0.25], BF10 = 0.50). Finally, PM impairments were larger when they were measured with classic neuropsychological tests, p = .024.

Table 3 Results of moderator analyses
Fig. 2 Forest plot of the included studies. The PM measures within the same type (event-based or time-based) and within the same sample of participants were averaged for depicting purposes. Outlying studies were removed from the plot. aMCI, amnesic mild cognitive impairment; EB, event-based; TB, time-based; MMSE, Mini Mental State Examination; naMCI, non-amnesic mild cognitive impairment. Studies are reported and sorted by year of publication (within each cluster)

Interestingly, both the year in which the articles were published and whether they were published after the meta-analysis by van den Berg et al. predicted a reduction in the overall effect size (Table 3). As studies have accumulated in the literature, the estimated overall impairment has been reduced (from g = −1.49 in 2006 to −1.10 in the present; Fig. 3). The reduction has been more remarkable in the case of studies about MCI (from −1.31 to −0.89). In fact, as reported by van den Berg et al., the difference between AD and MCI patients in the magnitude of PM impairments was not statistically significant up to the date at which the literature search of the previous meta-analysis ended (July 1, 2011), p = .108 (AD: g = −1.53 [−1.87, −1.18]; vs. MCI: g = −1.16 [−1.48, −0.84]).

Fig. 3 Cumulative meta-analysis across years. Each effect size represents the meta-analytic results of all the included studies that were available by that year (i.e. all studies accumulated up to that period). Whereas the PM impairments for AD patients have remained similar throughout all these years (a reduction of 9%, −1.38/−1.51), the estimated impairments for MCI patients have been reduced by a third (−0.89/−1.31)

One reason for the unexpected reduction in the overall effect size could be the increasing number of articles with MCI patients across years, r = .42 [.18, .61], p < .001 (Fig. 4A). The proportion of studies investigating PM in MCI patients was smaller before the publication of the meta-analysis than after (46% vs. 67%; Fig. 4B). Given that MCI patients showed smaller PM impairments compared to AD, their greater representation in the latter period may have led to the meta-analytic result being closer to the outcome of MCI patients (Fig. 3).

Fig. 4 Chronological evolution of (A) the number of published studies investigating PM in AD and MCI, as well as (C) the number of patients, (E) the standard error, and (G) the number of task items in the included studies. Across years, (B) the proportion of studies with MCI patients and (H) the number of task items have increased in the period after the publication of the meta-analysis by van den Berg et al. (2012), while (F) the standard error of the studies has decreased. The size of the samples of patients remained similar across years, and (D) it was comparable before and after the reference time point, especially when removing three studies with unusually large samples (n > 100)

Another explanation could come from the reporting process itself, favouring the publication of positive and significant over null results (Mathur & VanderWeele, 2020). Indeed, smaller studies tended to report larger effect sizes (p = .001; Table 3), which is evidence of publication bias in the literature. Publication bias has been progressively reduced across years (\(\sqrt{W}\times\) Year of publication, p = .033) and among studies with MCI patients (\(\sqrt{W}\times\) Neurological condition, p = .002; Fig. 5). Although some studies published since 2012 had large samples of patients (such as Kinsella et al., 2016; Thompson et al., 2017; Tse et al., 2015; Wang et al., 2012; n > 100), the sample sizes remained similar across years, especially if we removed these four exceptions, r = .10 [−.16, .35], p = .465 [mean of 31.6 patients before 2012 vs. mean of 35.8 after 2012, t(57.38) = −0.461, p = .646; Fig. 4C, D]. Nevertheless, the sampling error decreased, r = −.27 [−.48, −.02], p = .036 [mean \({SE}_{g}\) = 0.37 before 2012 vs. mean \({SE}_{g}\) = 0.33 after 2012, t(58.01) = 1.05, p = .300; Fig. 4E, F], in part because the PM tasks in the studies included a larger number of trials, r = .29 [.05, .51], p = .021. Whereas the studies used a mean of 4.4 trials per PM measure before the publication of the meta-analysis by van den Berg et al., the mean increased to 7.5 after that [t(53.95) = −2.62, p = .011; Fig. 4G, H]. Therefore, selective reporting might be more likely in designs with less precision, which produce less stable estimates of the effect size and leave more room for outliers to appear. Finally, the use of classic neuropsychological PM tests has decreased across years, r = −.27 [−.48, −.02], p = .036 (29% of the studies used classic neuropsychological tests before 2012 vs. 18% after 2012).

Fig. 5 Funnel plot of the included studies. The dashed line is the overall effect size, whereas the red line represents the asymmetry in the distribution of effect sizes in terms of their corrected standard error (i.e. fitted meta-regressive coefficient of \(\sqrt{W}\))

Finally, the best meta-regressive model after a backward stepwise selection with all the prespecified moderators included neurological condition, type of PM measure, and small-study effect (\(\sqrt{W}\)), with a residual heterogeneity of τ² = 0.15 and I² = 66.33% (vs. the heterogeneity of the model without moderators: τ² = 0.22 and I² = 76.47%). It is relevant to note that after the inclusion of these three main moderators, the variable year of publication was no longer a significant predictor, and the backward selection excluded it from the model. This supports the idea that it was the changes over the years and not the year of publication per se that explain the overall decrease in effect size.

Discussion

Remembering to take medication or turning up on time to the next doctor’s appointment are two examples of everyday activities that are fundamental for independent living, particularly for older adults. These activities are also two typical PM tasks in everyday situations. PM failures are frequently observed in older adults (Henry et al., 2004; Kliegel et al., 2016); indeed, forgetting intentions and struggling with planning actions account for between 50 and 80% of all reported memory problems in healthy adults (Cohen et al., 2019). Considering that the number of older adults will increase drastically over the next decades, it is timely to understand how PM declines as one gets older and to identify early signs of more severe cognitive decline.

In a previous meta-analysis, van den Berg and colleagues (2012) investigated event-based and time-based PM in healthy older adults and patients with different degrees of cognitive decline. The meta-analysis included 14 studies (seven with AD patients, four with MCI, and three with both types of patients) and surprisingly showed no statistical difference between the impairment in MCI and AD. In the present work, we updated the review of the literature and meta-analysed 46 studies of PM in AD patients (10 new studies), in people with MCI (20 new), or in both groups of patients (2 new). The results of this larger sample of studies confirmed the previous finding of a lower PM performance in patients of both neurological conditions, although, this impairment was more pronounced in AD compared to MCI patients. The difference arose even when AD patients were compared with MCI patients within the same studies (Kazui et al., 2005; Thompson et al., 2010, 2011; Troyer & Murphy 2007) or when patients were contrasted against healthy older adults in studies using classic neuropsychological PM tests (i.e. CAMPROMPT, MIST, RBMT, and RPA-ProMem). These results suggest that although the deficits in PM are already observable in MCI, there is a progression in the decline throughout the advance of the disorder to AD. One clear explanation for the discrepancy between our meta-analysis and the one by van den Berg et al. is the increased amount of evidence that redounded in increased statistical power, allowing us to detect a significant difference. In addition, we have observed an increasing interest in studying PM performance in ageing and in clinical populations (Kliegel, Jager, et al., 2008; Kliegel, McDaniel, et al., 2008; Raskin, 2018) that mirrors the great interest in understanding the causes of PM impairment and monitoring the decline as an early sign of more severe neurological disorders (Hering et al., 2018). 
Indeed, before 2012, only seven studies had been conducted on PM in MCI patients, but since that year the number of papers has almost tripled. In the present meta-analysis, this trend was decisive in showing that MCI is characterized by PM deficits, albeit significantly smaller ones than in AD. Increasing the knowledge concerning the preclinical phase of AD is important for theoretical and clinical reasons. From a theoretical point of view, advancing the knowledge regarding the transition from normal ageing to dementia is vital for understanding how the disease evolves. From a clinical perspective, identifying individuals at risk of developing AD as early as possible is important for boosting treatment efficacy.

Our results also indicated no conclusive evidence for an effect of the type of cue on PM performance. By definition, time-based PM is assumed to rely more on internal, self-initiated control mechanisms than event-based PM because no external cue prompts the action (Kliegel, Jager, et al., 2008; Kliegel, McDaniel, et al., 2008). Following this definition, time-based PM performance should be particularly affected by age-related decline (Vanneste et al., 2016). However, our results did not confirm this assumption for MCI patients, who showed similar impairment in both PM paradigms (event-based, −0.83; vs. time-based, −0.90; BF10 = 0.09), and supported it only numerically for AD patients (event-based, −1.42; vs. time-based, −2.84; BF10 = 0.97). While the lack of conclusive evidence in the case of AD patients might reflect the scarcity of studies using time-based PM tasks in this population (only five studies), our findings suggest that a more severe cognitive deficit is necessary to cause a differential impairment of time-based PM. It is possible that previously observed differences between event-based and time-based PM tasks were due to differences in task characteristics rather than to the type of cue for action. According to the Multiprocess framework (McDaniel & Einstein, 2000), PM performance relies on both strategic monitoring and automatic retrieval processes. Based on this assumption, both event-based and time-based PM tasks can vary in the amount of self-initiated processing they require; indeed, in both cases individuals are required to monitor the environment for the cue. For example, depending on cue focality, certain event-based PM tasks may be more demanding than some time-based PM tasks. Therefore, the observed differences in PM performance between event-based and time-based tasks might be mainly determined by the type of process (i.e. automatic vs. controlled) rather than by the type of cue.
Furthermore, although the available evidence today is substantially greater than it was a decade ago, the number of studies investigating time-based PM is still small compared to the studies investigating event-based PM.

The lack of differences between time-based and event-based PM performance was also observed in the previous meta-analysis conducted by van den Berg et al. (2012) and in other meta-analyses conducted on patients with traumatic brain injury (Shum et al., 2011) and patients with Parkinson’s disease (Ramanan & Kumar, 2013). As mentioned, the similar impairment in event-based and time-based PM across age and clinical groups may be due to methodological differences between the measures used to assess event-based and time-based PM performance. More studies including both event-based and time-based cues are needed to better understand the specificity of these two processes and whether they are differently affected in healthy and pathological ageing.

One of the most striking findings of our meta-analysis was a reduction, across years of publication, of the PM deficits shown by MCI and AD patients. Considering the short period that has passed since 2012, we believe that this finding does not reflect a change in the effect of either neurological condition. Instead, we detected several aspects of the literature that changed in the last decade and can account for this trend. Over the years, the number of items or trials per task has increased, giving greater reliability to the PM measure and, in turn, a more stable estimate of the between-group difference. In parallel, the small-study effect, a sign of potential publication bias, has been reduced in the last decade. Taking both correlations into account, we propose the use of more reliable research designs as one plausible explanation for the reduction of PM impairments: such designs might have produced conditions less favourable for the publication of extreme values (i.e. more stable estimates). These findings highlight the relevance of collecting enough observations per participant to obtain reliable results. However, increasing the number of observations per task should not mean a drastic shortening of the intervals separating PM cues (below 2 min), at the risk of tapping short-term rather than prospective memory. Such a modification could alter the nature of the task and would prevent distinguishing whether the overall reduction of PM deficits is a consequence of higher reliability or a loss of sensitivity. Two observations rule out the possibility that a loss of sensitivity has been the main explanation for our finding. First, most of the tasks, including those in the recent literature, have not crossed that threshold. Second, the year-of-publication effect also appears in the studies that used classic neuropsychological tests, β = 0.05 [0.02, 0.08], p = .011, which have not undergone changes in their number of trials.
Reliability can also be enhanced by including multiple assessment sessions, which is more costly but feasible in institutionalized settings. Neuropsychological studies, which often experience difficulties in accessing samples, should ensure that the information they obtain from their participants is sufficient to achieve meaningful results.
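The link between the number of trials and reliability can be illustrated with the Spearman-Brown prophecy formula. The following is only a minimal sketch and is not part of the analyses reported here: it assumes that PM trials behave as parallel items, and the single-trial reliability of 0.30 is a purely hypothetical value chosen for illustration.

```python
def spearman_brown(single_trial_reliability: float, n_trials: float) -> float:
    """Predicted reliability of a score aggregated over n_trials parallel trials."""
    r = single_trial_reliability
    return n_trials * r / (1 + (n_trials - 1) * r)


# Hypothetical single-trial reliability (an assumption for illustration,
# not an estimate taken from the meta-analysis).
R_SINGLE = 0.30

if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16):
        print(f"{k:>2} trials -> predicted reliability {spearman_brown(R_SINGLE, k):.2f}")
```

Under these assumptions, the predicted reliability rises from 0.30 with a single trial to about 0.77 with eight trials, with diminishing returns thereafter, which is consistent with the argument that tasks with more observations yield more stable estimates.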

Furthermore, other factors can explain the observed heterogeneity between the studies, such as the PM paradigms used. Although all the studies used laboratory-based paradigms, some of them included classical event-based or time-based tasks in which participants were engaged in an ongoing task (e.g. word categorization; Chi et al., 2014; Duchek et al., 2006) and were also instructed to press a key when the designated word appeared on the screen or when a specified amount of time had passed. Other studies used a computerized task that resembles everyday activities (e.g. Virtual Week; Shelton et al., 2016; Thompson et al., 2010, 2011). The advantage of the latter tasks is that they combine computerized, controlled procedures with good psychometric properties and, at the same time, a high resemblance to real-life situations. Other studies employed PM tasks commonly used in the clinical setting. The RBMT (Huppert & Beardsall, 1993; Kazui et al., 2005; Martins & Damasceno, 2008; Mori & Sugimura, 2007) is one of the first tools used in clinical and experimental settings to investigate PM, representing a valid measure of “everyday” memory function [but Shum et al. (2002) concluded that there was little evidence to support the reliability or validity of the PM items separately]. New and more reliable tasks have been developed for use in clinical settings, such as the CAMPROMPT (Delprado et al., 2012; Dermody et al., 2016), the MIST (Belmar et al., 2020; Karantzoulis et al., 2009), and the RPA-ProMem (Aronov et al., 2015; Rabin et al., 2014). All of them include both event-based and time-based activities to be performed during one session lasting approximately 20–30 min (Mioni et al., 2022). Interestingly, the observed PM impairments were larger with these neuropsychological tests than with the rest of the tasks (g = −1.44 vs. −0.98), which could be partially explained, on the one hand, by their reduced number of observations/trials [2.6 vs. 6.6, t(124.80) = 5.82, p < .001] and, on the other hand, by their potentially greater sensitivity to PM deficits, as they were expressly designed and validated for that purpose. Further studies that include both types of paradigms will be crucial for elucidating this result.

It is also important to consider the heterogeneity of the characteristics of the patients recruited and of the methods used to classify them. Among the studies that included AD patients, only two considered the different degrees of patients’ cognitive decline (Huppert & Beardsall, 1993; Tse et al., 2015). Among the studies of MCI patients, only five considered the heterogeneity of this neurological condition (e.g. amnestic or non-amnestic; single domain or multiple domains; Chi et al., 2014; Costa et al., 2015; Rabin et al., 2014; Schmitter-Edgecombe et al., 2009; Wang et al., 2012). These factors may affect the comparison between the studies and the possibility of generalizing the meta-analytic result to MCI patients. It is also important to point out that, in most cases, the Mini-Mental State Examination (MMSE) was the measure used to evaluate global cognitive function, with few exceptions such as the Addenbrooke’s Cognitive Examination Revised (Dermody et al., 2016; Kamminga et al., 2014), the Wechsler Adult Intelligence Scale (Duchek et al., 2006), and the Montreal Cognitive Assessment (Kinsella et al., 2016; Lajeunesse et al., 2021, 2022). The MMSE is well known and extensively used in clinical and experimental settings, but it is important to consider that it has been shown to be less sensitive in detecting early manifestations of cognitive decline than the other measures used (Bergeron et al., 2017).

Conclusions

The global population is ageing at an unprecedented rate; the number of people aged 60 and over is projected to more than double by 2050, and the number of people aged 80 and over is projected to quadruple. The ageing population is likely to have a significant impact on society, including increased demand for healthcare and long-term care services, as well as changes in the labour market and patterns of consumption. Memory complaints are the most common sign of age-related cognitive dysfunction. Interest in subjective memory complaints, and specifically PM complaints, as possible indicators of impending dementia has increased in recent years as the research focus has shifted toward identifying, at the earliest possible stage, people who will develop more severe forms of dementia. Consequently, the proportion of studies investigating MCI has increased in the last decade. The present work confirmed that MCI patients already show lower PM abilities than healthy older adults, and that PM impairments increase when MCI progresses to AD. We found no conclusive evidence for a difference between the deficits in time-based and event-based PM tasks in either MCI or AD patients. Although this needs further research, PM deficits were numerically larger in patients with deficits in episodic memory, such as amnestic MCI. Furthermore, the use of more reliable research designs could explain the reduction of observed PM impairments in recent years. Our findings highlight the relevance of collecting enough observations per participant to obtain reliable results.