Introduction

To behave purposefully in uncertain and changing environments, we must deploy mechanisms that allow us to adapt information processing in accordance with intrinsic task goals—collectively referred to as cognitive control (Goschke, 2003; Miller & Cohen, 2001). The past decades yielded significant insights into how we exert control over cognitive processes, such as naming the color of a Stroop stimulus (e.g., say “red” in response to GREEN) (Cohen et al., 1990; Posner & Snyder, 1975; Shiffrin & Schneider, 1977), or switching between two simple tasks (e.g., categorizing numbers vs. letters) (Allport et al., 1994; Arrington & Logan, 2004; Rogers & Monsell, 1995). However, less is known about how we exert control over control itself. For example, we may have to adjust how much control to deploy depending on the frequency of response conflict in the Stroop task (for a review see Bugg, 2012), considering that exerting cognitive control is associated with a cost (Shenhav et al., 2017). Similarly, in task environments that require frequent switching between tasks, participants may invoke control strategies that promote cognitive flexibility at the expense of cognitive stability (Braem, 2017; Dreisbach & Fröber, 2019; Goschke, 2000; Mayr, 2006; Monsell & Mizon, 2006; Musslick et al., 2019). Borrowing terminology from machine learning and artificial intelligence (AI), we define the subset of control mechanisms responsible for the monitoring and regulation of cognitive control itself as meta-control (Fig. 1). That is, a control process is regarded as an instance of meta-control if it monitors and regulates another control process. 
Note that meta-control processes can operate over different timescales and may be subject to other meta-control processes themselves: one meta-control process (e.g., the regulation of the amount of control applied to a single trial of the Stroop task) may be the target of another meta-control process (e.g., the regulation of the overall amount of cognitive stability versus flexibility across an experiment). Furthermore, as suggested by some of the work featured in this special issue (Bustamante et al., 2021; Dey & Bugg, 2021; Siqi-Liu & Egner, 2020), meta-control itself can be learned from experience (e.g., based on contextual features), without the requirement for external regulation.

Fig. 1

Components of meta-control. Meta-control involves the monitoring of control parameters, behavioral outcomes, and/or environmental features to guide the regulation of control processes (the target of meta-control) in the service of some objective function. A meta-control process itself may be guided by the regulation of another meta-control process and/or shaped by learning.

Meta-control processes can be differentiated based on the target of regulation, as well as the objective function guiding that regulation. The target of regulation corresponds to the subject of meta-control, i.e., the set of control processes being monitored and regulated. For instance, in the Stroop task, participants may invoke meta-control to regulate the amount of proactive engagement in a control-dependent behavior (Bustamante et al., 2021; Dey & Bugg, 2021; Lieder & Iwama, 2021), whereas in a prospective memory task, meta-control is needed to regulate the trade-off between the requirement to shield a current task-goal from interfering stimuli and the requirement to monitor the environment for cues signaling that one should switch to a different task (Goschke & Dreisbach, 2008; Möschl et al., 2020). However, meta-control also may regulate more complex control policies that determine exploratory versus exploitative behavior (Marković et al., 2021).

The objective function specifies what a meta-control process is trying to accomplish. In the Stroop example, the agent may seek to improve performance (i.e., minimize errors and reaction time) by engaging in proactive control but, at the same time, may seek to minimize the amount of control to mitigate associated costs, e.g., in terms of the cognitive effort associated with controlled behavior (Kool et al., 2010; Shenhav et al., 2013). In the exploration-exploitation dilemma, agents may weigh instrumental value (e.g., expected monetary reward) against epistemic value (information to be obtained from future actions) in their objective function when regulating the amount of exploration versus exploitation (Cohen et al., 2007; Wilson et al., 2014). From a more general theoretical perspective, meta-control problems can be conceived of as a by-product of the evolution of advanced cognitive control capacities, which dramatically expanded the flexibility and future-directedness of human action, but also gave rise to fundamental control dilemmas (Del Giudice & Crespi, 2018; Goschke, 2013; Goschke & Bolte, 2014, 2017; Musslick & Cohen, 2020). These control dilemmas confront agents in dynamic and uncertain environments with the challenge of how to balance antagonistic control states. The dilemmas entail challenges such as whether to respond to a conflict with increased goal shielding or by shifting attention to a different goal (stability vs. flexibility); whether to select actions that were rewarded in the past or to explore potentially better but riskier options (exploitation vs. exploration); whether to focus attention on the current task at hand or to monitor the environment for potentially significant information (attentional selection vs. monitoring); and whether to persist in the pursuit of long-term goals or to satisfy immediate desires (delayed vs. instant gratification). 
Importantly, the target and objective function of meta-control do not merely determine how much control should be recruited, but which mode of control and which configuration of control parameters is suited to optimize task performance.

In conclusion, both the target and objective function determine how meta-control can be operationalized and measured, how it can be formalized, and how it may be implemented in terms of neural mechanisms. Below, we present an overview of the work covered in this special issue in terms of a psychological, computational, and neuroscientific perspective.

Psychological perspective

Several studies in this Special Issue focus on behavioral operationalizations of different control states and consider the balancing of antagonistic states an objective function of meta-control. The stability-flexibility dilemma exemplifies this problem in task switching and learning. In task switching, greater cognitive flexibility is assumed to facilitate the switching between tasks but is also associated with increased distractibility (i.e., lower cognitive stability) (Dreisbach & Goschke, 2004; Musslick et al., 2018). The study by Siqi-Liu and Egner (2020), published in a previous issue of Cognitive, Affective and Behavioral Neuroscience, investigates how task switch costs change in response to different demands for cognitive flexibility (i.e., the frequency of task switches). Their work provides evidence that participants adjust control processes involved in the reconfiguration of task sets (Meiran, 1996; Rogers & Monsell, 1995) based on learned associations between contextual demands (i.e., task switch frequency) and specific task sets and stimuli (Siqi-Liu & Egner, 2020). They suggest that participants regulate the balance between cognitive flexibility and stability by adjusting an “updating threshold” (Goschke & Bolte, 2014) as the target of meta-control: a lower updating threshold is assumed to facilitate flexible switching between tasks (high cognitive flexibility), at the expense of poorer shielding of the current task against interference (low cognitive stability). In a related line of work on voluntary task switching, Fröber and Dreisbach (2021) show in the current Special Issue that the performance costs incurred during task switching, as well as participants' preference to switch tasks, depend on the prospect of reward and, more importantly, the immediate reward history (Fröber & Dreisbach, 2021). The authors conclude that, similar to contextual demands (as in the Siqi-Liu and Egner (2020) study), reward may serve as a parameter for meta-control. 
The same high reward prospect can bias the system either towards greater cognitive flexibility or towards stability, depending on the immediate reward history: increasing reward prospect promotes cognitive flexibility, whereas remaining high reward prospect promotes cognitive stability. Thus, the stability-flexibility dilemma also applies to the integration of information over time: Higher learning rates allow for the efficient integration of recent experiences at the expense of forgetting (overlearning) older information. Dey and Bugg (2021) propose that different strategies of control—in this case, reactive versus proactive control—rely on different time scales for integrating past information, resulting in different learning rates (Dey & Bugg, 2021). They deploy a statistical model (Aben et al., 2017) to analyze published data of three Stroop experiments in which the probability of conflict was manipulated. Participants are assumed to engage in reactive or proactive control if the overall likelihood of response conflict is low or high, respectively. Dey and Bugg (2021) show that participants integrate more recent experience into their behavior (i.e., adopt a higher learning rate) when they are expected to engage reactive control. Conversely, the authors show that participants consider a longer history of trials (i.e., adopt a smaller learning rate) in task environments that promote proactive control. What is shared across the three studies is that they all focus on the trade-off between cognitive stability and flexibility, either with respect to the flexible switching between tasks or with respect to integrating past experiences over time. What changes across studies is the target of meta-control: a threshold for updating task sets in Siqi-Liu and Egner (2020) and Fröber and Dreisbach (2021), and a learning rate for integrating past experiences of conflict in Dey and Bugg (2021). 
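
The learning-rate trade-off described above can be illustrated with a minimal sketch. It assumes a generic delta-rule update of an estimated conflict probability; this is not the statistical model of Aben et al. (2017) used by Dey and Bugg (2021), and the function name `track_conflict`, the learning-rate values, and the trial sequence are hypothetical choices made only for illustration.

```python
def track_conflict(trials, learning_rate, initial_estimate=0.5):
    """Return the running estimate of conflict probability after each trial.

    trials: sequence of 0 (no conflict) or 1 (conflict).
    learning_rate: weight given to the most recent trial (assumed, 0..1).
    """
    estimate = initial_estimate
    history = []
    for conflict in trials:
        # Delta rule: move the estimate toward the latest observation.
        estimate += learning_rate * (conflict - estimate)
        history.append(estimate)
    return history


# Hypothetical sequence: 20 trials without conflict, then 5 conflict trials.
trials = [0] * 20 + [1] * 5

fast = track_conflict(trials, learning_rate=0.5)   # short integration window
slow = track_conflict(trials, learning_rate=0.05)  # long integration window
```

In this sketch, the high learning rate tracks the sudden run of conflict trials almost immediately (flexible but dominated by recent trials), whereas the low learning rate changes its estimate only gradually (stable but slow to adapt).
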
However, the stability-flexibility trade-off is not the only control dilemma requiring meta-control. Foraging scenarios often are characterized by trade-offs in which participants need to decide between actions that yield known rewards (exploitation) and actions that are associated with unknown, potentially greater future rewards (exploration). The objective function of meta-control in such tasks could be to optimize the reward rate, by regulating the balance between exploratory versus exploitative behavior. The study by van Dooren et al. (2021) investigates the effects of two types of mood states (excited versus sad) that differ in arousal and valence on exploration-exploitation trade-offs (van Dooren et al., 2021). The results suggest that higher mood-related arousal is associated with more exploratory behavior, whereas a positive valence is associated with exploitation.
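
One way to picture the regulation of exploratory versus exploitative behavior is as a single weight on the information value of an option. The sketch below is an assumption-laden illustration and not a model from any of the cited studies: the function `choose_option`, the uncertainty bonus `1 / sqrt(visits + 1)`, and all numerical values are hypothetical.

```python
import math


def choose_option(mean_rewards, visit_counts, epistemic_weight):
    """Return the index of the option with the highest combined value.

    Instrumental value: the observed mean reward of an option.
    Epistemic value: an assumed uncertainty bonus that shrinks as an
    option is sampled more often.
    epistemic_weight: the parameter a meta-control process could adjust.
    """
    def combined_value(index):
        epistemic = 1.0 / math.sqrt(visit_counts[index] + 1)
        return mean_rewards[index] + epistemic_weight * epistemic

    return max(range(len(mean_rewards)), key=combined_value)


# Hypothetical state: option 0 is well known, option 1 is barely sampled.
mean_rewards = [0.6, 0.4]
visit_counts = [50, 2]

exploit_choice = choose_option(mean_rewards, visit_counts, epistemic_weight=0.0)
explore_choice = choose_option(mean_rewards, visit_counts, epistemic_weight=1.0)
```

With a weight of zero the sketch selects the familiar, higher-paying option; raising the weight shifts the choice to the rarely sampled option. Under this reading, factors such as mood-related arousal would act by shifting the weight rather than the reward estimates themselves, which is an interpretation offered here only for illustration.
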

Another question addressed in this Special Issue pertains to the development of meta-control across the lifespan: Does the degree to which individuals engage in meta-control change across development and aging (Bolenz et al., 2019; Bolenz & Eppinger, 2020; Ruel, Devine, & Eppinger, 2021)? Niebaum et al. (2021) used a demand selection task to study developmental differences in proactive versus reactive engagement of control. The results suggest that greater task performance with age does not just result from an improved ability to engage in proactive control, but also from greater awareness of cognitive demands associated with the task (Niebaum et al., 2021). The latter pertains to the monitoring function of meta-control. Age-comparative approaches to meta-control, as explored by Niebaum et al. (2021), provide valuable insights into developmental changes in the regulation of cognitive control. However, the work also highlights an asymmetry in the meta-control processes under study: Whereas most of the research in the Special Issue focuses on the regulation of control, few studies investigate the mechanisms underlying the monitoring of control processes, which is an important direction for future research.

Computational perspective

What are the computational mechanisms that underlie the optimization of an objective function in the service of meta-control? The work presented in this Special Issue considers different computational mechanisms for different objective functions. Marković et al. (2021) propose a computational model in which meta-control can be cast as a mechanism that arbitrates between exploratory and exploitative behavior depending on the current task context. According to the model, the objective function of the agent is to optimize the balance between instrumental (average reward) value and epistemic (information) value. The targets of meta-control are the behavioral policies, that is, the associations between task contexts and appropriate modes of behavior. The model describes an inference process over meta-control states. Each meta-control state determines the control policy for a given context, e.g., whether to seek exploitation or exploration. The use of contextual information can also inform the engagement in proactive versus reactive control, as suggested by Lieder and Iwama (2021). They introduce a meta-control mechanism that engages in a cost-benefit computation to decide (a) whether to set a goal, (b) whether to boost or inhibit an existing goal, or (c) whether to engage in reactive control when no task goal is present. The engagement in these control strategies (the subject of meta-control) depends on their expected utility as well as the computational costs associated with proactive control. The objective of meta-control is to maximize the expected utility of control while minimizing its computational costs. The authors show that such a model is capable of replicating behavior across several instantiations of the continuous performance task (AX-CPT) (XX). However, Lieder and Iwama (2021) also point out that the optimal behavior of their rational model can be approximated with efficient learning mechanisms. Bustamante et al. 
(2021) introduce such a mechanism for learning the value of cognitive control. Their Learned Value of Control (LEVC) model learns to predict the optimal amount of control allocated in a Stroop task based on monetary feedback. Similar to Lieder and Iwama (2021), the objective function of meta-control is to maximize the expected value of allocating control while minimizing the cost associated with exerting control. Finally, Nassar and Troiani (2021) show that when applied to predictive inference, the concept of meta-control becomes closely related to the concept of meta-learning. In an age-comparative approach, they show that attention to detail—a prominent feature of autism—is associated with a bias to update beliefs based on more recent information (showing high flexibility in learning), at the expense of integrating noisy information (low stability) (Nassar & Troiani, 2021). The assumed objective function applied by subjects in this study is to minimize stimulus prediction errors and can be conceptualized in terms of the learning rate as a control parameter.
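
The shared objective described above, maximizing the expected value of control while minimizing its cost, can be sketched as a simple optimization over control intensity. The sketch is not the LEVC model of Bustamante et al. (2021) or the rational model of Lieder and Iwama (2021): the logistic link between control intensity and accuracy, the quadratic cost, the grid search, and all parameter names and values are assumptions made for illustration.

```python
import math


def expected_value_of_control(intensity, reward, difficulty=1.0, cost_weight=1.0):
    """Expected payoff of a control intensity minus its assumed cost."""
    # Assumed: the probability of a correct response rises with intensity.
    p_correct = 1.0 / (1.0 + math.exp(-(4.0 * intensity - difficulty)))
    # Assumed: the cost of control grows quadratically with intensity.
    cost = cost_weight * intensity ** 2
    return reward * p_correct - cost


def best_intensity(reward, grid_size=101, **kwargs):
    """Grid search over intensities in [0, 1] for the highest net value."""
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return max(
        candidates,
        key=lambda c: expected_value_of_control(c, reward, **kwargs),
    )


low_stakes = best_intensity(reward=1.0)
high_stakes = best_intensity(reward=5.0)
costly_control = best_intensity(reward=5.0, cost_weight=5.0)
```

Under these assumptions, a larger reward favors a higher control intensity, and a larger cost weight lowers it again. A learning account would estimate these quantities from feedback instead of computing them from known functions.
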

Cognitive neuroscience perspective

So far, only a few studies have explicitly addressed the neural mechanisms underlying meta-control (Lee et al., 2014; Ruel, Bolenz, et al., 2021). Two studies in this Special Issue take a psychophysiological approach (pupillometry and electroencephalography (EEG)) to study the neurobiological processes of meta-control. Kirschner et al. (2021) used event-related potentials (ERPs) to investigate the role of conscious error perception for an optimal engagement in proactive versus reactive control (Kirschner et al., 2021). They show that error awareness is reflected in error-related components of the ERP such as the error positivity (Pe) and the error-related negativity (ERN). Moreover, error awareness seems to mediate the relationship between error-related components and behavioral control adjustments in response to errors. The authors interpret their findings as evidence for the idea that conscious error perception might trigger the recruitment of proactive control. Thus, conscious error perception may underlie the monitoring of control processes, to ensure efficient meta-control of error-related behavioral adaptations. The workings of meta-control processes may also be reflected in measures of pupil dilation. In their study, da Silva Castanheira et al. (2021) examine pupil dilation as an indicator of cognitive effort during incentivized task switching (da Silva Castanheira et al., 2021). The psychophysiological results suggest that more cognitively demanding task switches are associated with larger task-evoked pupillary responses. Furthermore, their findings indicate that pupil dilations are predictive of individual differences in task switch costs, suggesting that task-evoked pupil diameter can provide a unique index of effort investment. According to this interpretation, pupillary responses may reflect evaluations of the objective function of meta-control, and more specifically, factors that guide the optimal investment of proactive versus reactive control.

Conclusions

The monitoring and regulation of control processes pervade all forms of control-dependent behavior and are a key ingredient of human cognition. The articles in this Special Issue seek to advance our understanding of the behavioral phenomena associated with meta-control, its computational mechanisms, and its neural correlates. A common theme across all these studies is the role of meta-control in regulating a balance between antagonistic control states, whether it be the balance between cognitive stability and flexibility, between proactive and reactive control, or between exploration and exploitation. This is a new field of study in cognitive psychology and neuroscience, and many aspects of meta-control are currently underexplored. Most studies so far have focused on the stability-flexibility dilemma, as well as the trade-off between exploration and exploitation. Other control dilemmas have received less attention and should be investigated in future studies (Del Giudice & Crespi, 2018; Goschke, 2013; Goschke & Bolte, 2014). These include the trade-off between complementary attentional systems serving the goal-directed focusing of attention versus the background-monitoring of potentially relevant information, the trade-off between the future-directed pursuit of long-term goals versus the present-directed satisfaction of current desires, and the trade-off between deliberation (i.e., sampling and processing information to optimize a decision) and implementation (i.e., initiating an action based on incomplete information or under uncertainty). Another question that needs to be addressed in the future pertains to the relationship between meta-control and other metacognitive processes such as meta-learning (Griffiths et al., 2019; Schweighofer & Doya, 2009). One promising approach to study the conceptual overlap between these processes would be to focus more on the underlying computational mechanisms. 
For example, one computational process that emerges across several of the studies in this Special Issue is the appropriate setting of learning rates, which determine the degree of behavioral adaptation across different types of tasks. The tuning of learning rates is also a topic of interest for meta-learning, particularly in the domain of reinforcement learning (Cook et al., 2019; Schweighofer & Doya, 2009). A further important research question concerns modulators of meta-control, for instance, the effects of psychosocial stress (Möschl et al., 2017; Plessow et al., 2011) and neuromodulatory systems (Cook et al., 2019; Cools, 2016) on the balance between cognitive flexibility and stability, or individual differences in the adjustment of meta-control parameters (Mekern et al., 2019). Finally, the ontogenetic development of meta-control (Ruel, Devine, & Eppinger, 2021) and the neurobiological mechanisms underlying meta-control (Zhang et al., 2020) are underspecified. The studies in this Special Issue that use psychophysiological approaches as a window into the neural dynamics underlying meta-control offer a promising way to address these questions. However, which neural systems are involved in the regulation of cognitive control, and whether there is a hierarchy of control and meta-control processes implemented in the prefrontal cortex (Koechlin et al., 2003), remain intriguing issues for future research.