Biologically plausible models of cognitive flexibility: merging recurrent neural networks with full-brain dynamics

Cognitive flexibility, a cornerstone of human cognition, enables us to adapt to shifting environmental demands. This brain function has been widely explored using computational modeling, although these models often focus on the operational dimension of cognitive flexibility and do not retain a sufficient level of neurobiological detail to yield electrophysiological or neuroimaging insights. In this review, we explore recent advances and future directions in neurobiologically plausible computational models of cognitive flexibility. We first cover progress in recurrent neural network models trained to perform flexible cognitive tasks, followed by a discussion of how whole-brain or large-scale brain network models have approached the distributed nature of flexible cognitive functions. Ultimately, we propose a hybrid framework in which both modeling philosophies converge, advocating for a balanced approach that merges computational power with realistic spatiotemporal dynamics of brain activity, and explore early examples in this direction.


Introduction
The concept of cognitive flexibility stands as a central pillar of human cognition, shaping our ability to navigate the intricacies of a complex world. Cognitive flexibility is linked to an individual's capacity to effectively adjust their thoughts and actions in response to shifting or unfamiliar environmental requirements [1,2], allowing our brains to perform context-dependent computations, quickly adapt our behavior, and make correct decisions in uncertain environments. However, despite promising advances in understanding the neural basis of cognitive flexibility [3-5] and other aspects discussed in this review, many important questions remain far from fully understood.
To gain a deeper understanding of the neural computations underlying cognitive flexibility, researchers have developed different hypotheses via computational models. These models carry the power of clarifying covert processes and are capable of isolating latent variables, rather than only those directly observable from behavior. For instance, different types of models have been developed to better understand adaptive behavior and its link to the prefrontal cortex (PFC) [6], or to uncover the mental computations behind neuropsychological assessment tests (e.g. the Wisconsin Card Sorting Task or the Stroop Task) [7-9]. Connectionist models [4,10-13] alongside Bayesian models [14,15] have been explored in detail for these purposes. However, these classes of models often lack enough biological detail to allow a proper comparison with neural data. Neurobiologically plausible computational modeling stands as an effective approach to probe critical questions and gain a mechanistic understanding of how neural circuits operate. This review presents a state-of-the-art overview of recent computational models of cognitive flexibility that gravitate toward more biologically realistic approaches. We first describe the so-called recurrent neural network (RNN) models: simplified neural networks typically describing local computations associated with cognitive tasks (learned using different training protocols). We then focus on large-scale brain networks (LSBNs), which capture phenomena at larger scales and are constrained by anatomical data. Both approaches offer complementary tools to understand cognitive flexibility, the first reaching higher computational complexity and the second aligning better with multiscale electrophysiological observations. Ultimately, we advocate for a 'hybrid modeling' approach that combines these two modeling strategies as the direction in which the field should progress.

Recurrent neural networks
In recent years, RNNs have emerged as a pragmatic example of computational neuroscience models capable of simulating flexible cognitive functions. Flexibility must be understood here as the capacity of the network to easily adapt its behavior by switching between different functional rules. A simple but convenient way to accomplish this is to inform such rule switching via environmental or contextual cues. While primarily a class of artificial neural networks (ANNs) designed for processing sequential data, RNNs retain a good level of neurobiological plausibility compared with other ANNs such as deep feedforward networks, due to their recurrent connection structure (Figure 1a) [16] and additional realistic features introduced in recent works [17-19]. These networks sequentially process time-varying input and update their hidden state accordingly; this hidden state effectively serves as a dynamic internal memory to store information, making RNNs particularly effective at capturing temporal dependencies.
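To make this basic computation concrete, the sketch below shows the continuous-time rate-RNN update that underlies many of the models discussed in this review: the hidden state relaxes toward a nonlinear function of recurrent and external input, and a linear readout produces the behavioral output. This is a minimal, hypothetical example; the network size, time constants, and function names are illustrative and not taken from any specific study.

```python
import numpy as np

# Minimal rate-RNN sketch; all hyperparameters are placeholders.
N, n_in, n_out = 100, 3, 1      # network size, input and output dimensions
dt, tau = 0.01, 0.1             # integration step and neural time constant (s)
rng = np.random.default_rng(0)

W_in  = rng.normal(0, 1.0, (N, n_in))             # input weights
W_rec = rng.normal(0, 1.0 / np.sqrt(N), (N, N))   # recurrent weights
W_out = rng.normal(0, 1.0 / np.sqrt(N), (n_out, N))

def run_trial(inputs):
    """Integrate the network over one trial; inputs has shape (T, n_in)."""
    h = np.zeros(N)                                # hidden state (internal memory)
    outputs = []
    for x_t in inputs:
        r = np.tanh(h)                             # firing rates
        # Euler update: leak toward recurrent plus external drive
        h = h + (dt / tau) * (-h + W_rec @ r + W_in @ x_t)
        outputs.append(W_out @ np.tanh(h))
    return np.array(outputs)

out = run_trial(rng.normal(0, 0.1, (200, n_in)))   # noisy dummy input stream
print(out.shape)                                   # (200, 1)
```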
RNNs play a crucial role in cognitive flexibility research, particularly centered on context-dependent behavior in the PFC and related areas [4,20-22]. In a seminal example, Mante and colleagues [20] trained an RNN on a context-dependent decision-making task, mirroring key electrophysiological observations in macaques and challenging early-selection theories by suggesting a dual role of the PFC as both input selector and integrator. This line of work triggered interest in the use of RNNs as models of cognitive function, further fueled by theoretical advances that improved the interpretability of RNNs. A significant example is the work of Ostojic and collaborators, which laid a theoretical foundation for understanding how low-rank RNNs can perform cognitive tasks [23-25].
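As an illustration of the low-rank idea, the hypothetical sketch below constructs a rank-one recurrent matrix from two N-dimensional vectors, so that the dynamics live on a low-dimensional subspace, and delivers context simply as an additional cue input, loosely mirroring the task structure of Mante et al. All vectors and parameter values are random placeholders rather than trained solutions.

```python
import numpy as np

N = 200
rng = np.random.default_rng(1)

# Rank-one connectivity: W_rec = m n^T / N
m = rng.normal(0, 1, N)
n = rng.normal(0, 1, N)
W_rec = np.outer(m, n) / N

# Inputs: two sensory streams plus two context cues (one cue active per trial).
I_motion, I_color = rng.normal(0, 1, N), rng.normal(0, 1, N)
I_ctx_motion, I_ctx_color = rng.normal(0, 1, N), rng.normal(0, 1, N)

def trial(motion, color, context, T=300, dt=0.01, tau=0.1):
    h = np.zeros(N)
    ctx = I_ctx_motion if context == "motion" else I_ctx_color
    for _ in range(T):
        drive = motion * I_motion + color * I_color + ctx
        h = h + (dt / tau) * (-h + W_rec @ np.tanh(h) + drive)
    return np.mean(n * np.tanh(h))   # projection used as a decision variable

# With trained vectors, the same sensory evidence would yield context-dependent
# readouts; here the two values are illustrative only.
print(trial(0.5, -0.5, "motion"), trial(0.5, -0.5, "color"))
```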
Subsequent efforts have extended the use of RNN models in novel directions. For example, Soldado-Magraner and colleagues [5] recently used linear dynamical system models fitted to PFC data to explore different mechanisms of input integration: one similar to Mante et al., and another relying on context-dependent inputs. Both mechanisms captured the trajectories of neural activity in state space (Figure 1b), implying that linear dynamics can replicate complex PFC responses across different contexts. Other efforts have extended RNNs from rate-based to spiking neurons, increasing their biological plausibility [26] and yielding spiking RNNs able to perform context-dependent decision-making tasks similar to Mante et al. [27,28], as well as other complex behaviors [29]. These works illustrate that high task performance can be achieved while incorporating a more realistic neuron model with basic biophysical limitations.
Besides context-dependent computations, another important aspect of cognitive flexibility is that it provides the capacity for multitasking. This raises the question of how a single network represents and supports distinct tasks. Various studies have investigated this by training an RNN on multiple interrelated cognitive tasks with an interleaved training regime, implemented with both rate-based [30,31] and spiking neurons [28]. After training, these networks could execute all tasks with high behavioral performance (Figure 1c). Modularity, in the sense of functionally specific clustering, emerged in the networks [30]: self-organized, task-specific clusters of units fired preferentially during particular tasks, allowing for compositional task representations and specializing, for instance, for sensory inputs or motor outputs. Additionally, a group of neurons with high mixed selectivity emerged: neurons that may respond to multiple, often diverse, input patterns or features. Mixed selectivity is indeed found in higher-order brain areas critical for flexible behavior, such as the PFC and hippocampus [32], whereas specialized neurons are evident in early sensory processing areas.
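The multitask setting described above can be sketched as follows: on every training step a task is drawn at random, a one-hot rule input tells the shared network which task is active, and a single RNN is optimized across all tasks. The PyTorch sketch below is a simplified, hypothetical version of such interleaved training with dummy task generators, not the exact protocol of [28,30,31].

```python
import torch
import torch.nn as nn

n_tasks, n_stim, n_hidden, n_out, T = 4, 4, 128, 2, 50
rnn = nn.RNN(input_size=n_stim + n_tasks, hidden_size=n_hidden, batch_first=True)
readout = nn.Linear(n_hidden, n_out)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

def make_batch(task_id, batch=32):
    """Dummy task generator: targets are a placeholder function of the stimulus."""
    stim = torch.randn(batch, T, n_stim)
    rule = torch.zeros(batch, T, n_tasks)
    rule[:, :, task_id] = 1.0                      # one-hot rule/context input
    target = stim[:, :, :n_out].cumsum(dim=1)      # arbitrary stand-in target
    return torch.cat([stim, rule], dim=-1), target

for step in range(1000):
    task_id = torch.randint(n_tasks, (1,)).item()  # interleaved task sampling
    x, y = make_batch(task_id)
    h, _ = rnn(x)
    loss = ((readout(h) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```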
Although these networks achieved high levels of performance when training samples were randomly interleaved, they were not trained in the way that humans and animals learn, namely through continual learning [33]. When trained sequentially, ANNs suffer from a well-known problem called catastrophic forgetting, whereby learning new tasks leads to a significant degradation or loss of previously acquired knowledge, impeding the network's ability to retain and adapt to multiple tasks [34]. To overcome this, research has focused on solutions such as approaches based on regularizing weight changes [35] and other continual learning methods [36]. Applying these learning methods enables successful sequential training, for both interrelated [30] and unrelated tasks [37], by explicitly avoiding specialization to the tasks being learned. This approach aligns with the idea that flexible solutions require mixed selectivity: when learning a new task, mixed selectivity can be preserved as long as learning does not strongly alter the random connectivity. Such continual learning rules may not only be valid for simple tasks but may also apply to more complex ones, such as tasks involving temporal patterns and event sequences [38].
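A common way to implement the weight-change regularization mentioned above, in the spirit of elastic weight consolidation [35], is to penalize changes to parameters that were important for previously learned tasks. The following is a schematic PyTorch-style loss; the importance estimates, anchor parameters, and the regularization strength are assumptions of this sketch rather than a specific published implementation.

```python
import torch

def continual_loss(task_loss, params, anchors, importances, lam=1.0):
    """task_loss: loss on the current task.
    params/anchors: current and previously consolidated parameter tensors.
    importances: per-parameter importance estimates (e.g. Fisher information).
    """
    penalty = sum(((p - a) ** 2 * f).sum()
                  for p, a, f in zip(params, anchors, importances))
    return task_loss + 0.5 * lam * penalty
```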
Finally, an important dimension of cognitive flexibility is set switching: switching rules or strategies within a single task. Wierda and colleagues [18] trained a large number of randomly initialized RNNs on the same multisensory integration task and observed large variability in behavioral strategies for the same task, with a clear correlation between strategy and neural composition: fast networks had more choice-selective, mixed-selective, and silent neurons, whereas precise networks had fewer choice-selective units. Interestingly, external modulatory currents applied to different subsets of neurons could shift the behavioral strategies adopted by the networks (Figure 1d), moving them across speed-accuracy trade-offs. This could explain the alternations in behavioral strategies within single sessions observed experimentally [39].
In summary, RNN models have so far provided insight into cognitive flexibility via context-dependent decision-making, input selection, multitasking, and behavioral rule switching, expanding from abstract to more biologically realistic, and sometimes spike-based, networks.

Large-scale brain networks
An important limitation of many RNNs flexibly performing cognitive tasks is that these models often describe only one brain area (such as the PFC), or describe interconnected local circuits unaffected by factors relevant to interregional brain communication (such as variability across regions or synaptic delays). Cognitive flexibility is, however, not limited to any individual brain region: neuroimaging and behavioral approaches in humans, and pharmacological and lesion studies in animals, suggest that cognitive flexibility is supported by large-scale, functionally distributed brain networks (Figure 2a). These proposed networks encompass lateral and orbital frontoparietal, midcingulo-insular, and frontostriatal brain networks [40-42]. Cognitive flexibility likely emerges from the interplay of specific areas in these networks, while each provides a relatively specific functional contribution [40]. For instance, the midcingulo-insular network includes the mediodorsal (MD) thalamus [42], which supports the PFC in cognitive flexibility by enhancing the signal-to-noise ratio in prefrontal decision-making under uncertainty [43-45]. Note that these areas may not be specialized for cognitive flexibility, but are activated across a range of other executive functions such as working memory, inhibition, and attention [46]. Therefore, cognitive flexibility is hard to isolate, as it requires the confluence of several aspects of executive function at both the behavioral and anatomical levels.
This evidence emphasizes the need for a broader, less localized approach, something that computational models of LSBNs can provide. LSBNs, sometimes referred to as whole-brain networks, combine simple models of the activity of small neural circuits with connectivity data across brain regions, simulating neural dynamics at the full-brain level [47-50]. LSBNs help address common problems of more abstract network models, such as reducing the number of free parameters, providing a more realistic network structure, and allowing abstract nodes to be converted into concrete brain areas with specific properties and predictions. Although LSBNs tend to focus on resting-state dynamics rather than cognitive functions, recent efforts aim to extend them toward flexible cognitive functions.
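In their simplest form, LSBNs couple a local dynamical model (here reduced to a single rate variable per region) through an empirical structural connectivity matrix. The sketch below is a generic, hypothetical example with a random placeholder connectome and arbitrary parameter values; in practice the matrix C would come from tractography or tract-tracing data, and richer local models are typically used.

```python
import numpy as np

n_regions = 68                      # e.g. one node per cortical parcel
rng = np.random.default_rng(2)
C = rng.uniform(0, 1, (n_regions, n_regions))   # placeholder structural connectome
np.fill_diagonal(C, 0.0)
C /= C.sum(axis=1, keepdims=True)               # row-normalize coupling strengths

def simulate(G=0.5, T=2000, dt=0.001, tau=0.02, noise=0.01):
    """Each region: tau dr/dt = -r + tanh(G * sum_j C_ij r_j + noise)."""
    r = np.zeros(n_regions)
    trace = np.empty((T, n_regions))
    for t in range(T):
        drive = G * C @ r + rng.normal(0, noise, n_regions)
        r = r + (dt / tau) * (-r + np.tanh(drive))
        trace[t] = r
    return trace

activity = simulate()
fc = np.corrcoef(activity.T)        # simulated functional connectivity
print(fc.shape)                     # (68, 68)
```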
As a first step, local dynamical models of cognitive flexibility have been expanded to include multiple interconnected brain areas. For example, Zhang and colleagues [51] modeled an MD-PFC network with biological constraints such as genetically identified thalamocortical pathway specificity and cortical interneuron subtypes. This model outperformed a single-PFC model in context-dependent decision tasks with higher sensory uncertainty (Figure 2b). Likewise, Jaramillo and colleagues [52] incorporated different anatomical loops in cortico-pulvinar networks to investigate the role of thalamic nuclei in flexibly modulating cognitive functions, including working memory, decision-making, and attention (Figure 2c). Although the models above were not directly constrained by connectome data, recent connectome-constrained models have investigated the global neural dynamics of flexible cognition or related functions [53-56]. Schirner and colleagues [53] algorithmically developed a set of personalized LSBN models (Figure 2d) able to perform flexible decision-making tasks, which were then used to explore links between decision speed, functional connectivity properties, and intelligence scores. Regarding memory-related functions, Mejias and Wang [54] recently built a connectome-constrained LSBN model of the macaque neocortex, which revealed the mechanisms underlying the distributed nature of working memory [57]. This model, the first to consistently incorporate a cognitive function in an LSBN formalism, attributed the emergence of distributed working-memory activity to the combination of brain-wide interactions, variability across local circuit properties, and feedback inhibition across the cortical hierarchy. Interestingly, the exploration of the model's parameter space revealed the existence of multiple subnetwork configurations able to sustain sensory-related activity (Figure 2e), indicating that task-relevant information can be maintained and flexibly manipulated across wide brain functional networks under different contexts. For example, a given visual item could be maintained in memory using one of two different subnetworks depending on the particular context (i.e. color vs. motion), subsequently triggering a different response. Feng and colleagues [58] recently extended the principles underlying distributed working memory to human LSBN models, combining multiple anatomical datasets and reproducing brain-wide activation patterns as observed in humans performing working memory tasks.

Hybrid modeling approach
The computational strategies reviewed above display interesting cognitive flexibility features, but they also present important limitations that prevent them, by themselves, from capturing the richness of flexible cognitive computations. In this section, we advocate for the combination of these approaches into a 'hybrid modeling' framework. Our proposal is that future modeling efforts of cognitive flexibility should combine multiple local circuits (modeled as RNNs) into a large-scale network constrained by anatomical data, in the style of LSBNs (Figure 3a). Such an approach would bring together the computational power and adaptability of RNNs with the interareal modulatory capacity and structural grounding of LSBNs. Note that, while there are several ways to align RNN models with structural data, the integration with LSBNs is particularly important for cognitive flexibility, given the distributed nature of flexible cognition discussed in the previous section.
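As a concrete illustration of this proposal, the sketch below builds a block-structured recurrent network in which each block is a local RNN and the long-range weights between regions are scaled by structural connectivity entries. The connectome is a random placeholder, and the region sizes, gains, and coupling scheme are illustrative assumptions of this sketch rather than a definitive architecture.

```python
import numpy as np

n_regions, n_per_region = 4, 50
N = n_regions * n_per_region
rng = np.random.default_rng(3)

C = rng.uniform(0, 1, (n_regions, n_regions))    # placeholder structural connectome
np.fill_diagonal(C, 0.0)

# Block-structured recurrence: dense local blocks, connectome-scaled long-range blocks.
W = np.zeros((N, N))
for i in range(n_regions):
    for j in range(n_regions):
        block = rng.normal(0, 1.0 / np.sqrt(n_per_region), (n_per_region, n_per_region))
        scale = 1.0 if i == j else C[i, j]        # interareal coupling follows the connectome
        W[i*n_per_region:(i+1)*n_per_region, j*n_per_region:(j+1)*n_per_region] = scale * block

def step(h, x, dt=0.01, tau=0.1):
    """One Euler step of the coupled multi-area rate network."""
    return h + (dt / tau) * (-h + W @ np.tanh(h) + x)

h = np.zeros(N)
for _ in range(100):
    h = step(h, rng.normal(0, 0.1, N))
print(h.reshape(n_regions, n_per_region).mean(axis=1))   # mean activity per region
```

In a full hybrid model, the local blocks would remain trainable on cognitive tasks while the interareal scaling stays anchored to the anatomical data.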
There are several indications that this hybrid approach is within reach. For example, modularity or functional clustering already emerges in RNNs after proper training [28,30]. However, it is not straightforward to map these submodules onto actual brain areas. Developing multi-area RNNs is arguably a better choice, as it allows prior identification of each brain region and facilitates the emergence of modularity. Although not specific to cognitive flexibility, attempts in this direction have recently been made (Figure 3a) [59-61]. Two approaches have been proposed, based respectively on area-specific neural activity (i.e. localized and specific activation of neurons in a particular brain region) and area-specific connectivity (i.e. patterns of communication and connections between brain regions) [62]. These approaches still face unresolved challenges, however, such as how to deal with nonhierarchical parallel processing or how to represent subcortical areas [62] (Figure 3b).
Using neuroanatomical data is an efficient way to meaningfully replicate the observed distribution of brain computations. Although the architecture of ANNs is handcrafted (e.g. they are built as recurrent or feedforward networks), it is usually very different from that of actual brain networks as described by connectomics. Actual brain networks tend to be much more complex (in the topological and algorithmic sense) than ANNs, displaying nonrandom connectivity patterns and nontrivial topology.
A first approach has been to shape the topology of RNNs to match that of biological brain networks. Goulas and colleagues [63] employed human, macaque, and marmoset brain connectivity data to generate so-called bio-instantiated RNNs, which possess the basic topological structure of brain networks. The performance of these networks on working memory tasks was comparable to that of randomly wired RNNs. In a follow-up study, Damicelli and colleagues [64] incorporated real brain connectomes of the same primate species but used echo state networks (ESNs) instead of classical RNNs, leading to a reservoir computing approach to simulating distributed brain computations (Figure 3c). They found that a certain level of randomization is necessary to reach better performance in cognitive tasks, with biological wiring diagrams achieving performance similar to purely random networks when an appropriate degree of randomness is maintained. Additional work has suggested that pretraining RNNs on naturalistic tasks before the specific training on a laboratory cognitive task may endow RNNs (including bio-instantiated RNNs) with neural mechanisms similar to those in actual brain networks [62].
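Reservoir computing makes this kind of study especially simple to prototype: the recurrent weights are fixed (here masked by a binary placeholder 'connectome' with partial random rewiring, loosely following the logic of [64]) and only the linear readout is trained, for instance by ridge regression. All specifics below, including the toy memory task, are illustrative assumptions rather than the published implementation.

```python
import numpy as np

N, n_in = 200, 1
rng = np.random.default_rng(4)

mask = (rng.uniform(size=(N, N)) < 0.1).astype(float)   # placeholder binary connectome
rewire = rng.uniform(size=(N, N)) < 0.05                 # partial random rewiring
mask[rewire] = 1.0 - mask[rewire]

W = mask * rng.normal(0, 1, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))          # set spectral radius below 1
W_in = rng.normal(0, 1, (N, n_in))

def reservoir_states(u):
    """Run the fixed (untrained) recurrent reservoir over an input sequence."""
    h, states = np.zeros(N), []
    for u_t in u:
        h = np.tanh(W @ h + W_in @ u_t)
        states.append(h.copy())
    return np.array(states)

# Toy memory task: reproduce the input from 5 steps ago; train readout by ridge regression.
u = rng.normal(0, 1, (1000, n_in))
X, y = reservoir_states(u)[5:], u[:-5, 0]
W_out = np.linalg.solve(X.T @ X + 1e-3 * np.eye(N), X.T @ y)
print(np.corrcoef(X @ W_out, y)[0, 1])                   # readout performance
```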
Another potential solution for meaningfully mapping various brain areas that are not necessarily hierarchically organized is to use the predictive coding (PC) framework. Importantly, PC employs local information to update synapses and can be implemented under biologically plausible constraints and learning rules [65,66], overcoming the disadvantages of models based on artificial learning rules. In a recent development, Salvatori and colleagues [67] provided a PC framework for generating and exploring architectures with diverse topologies, including small-world networks inspired by brain regions, making it well suited for constructing large-scale hybrid network models of cognitive flexibility that align closely with biological systems (Figure 3d). After training, the assemblies of neurons in these PC networks were successful in image reconstruction and denoising, delivering performance comparable to traditional neural networks.
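To illustrate the local nature of PC updates, here is a schematic two-layer example under simple Gaussian assumptions: inference relaxes the latent activities to reduce prediction errors, and the weight update depends only on locally available pre- and postsynaptic quantities. This is a generic, textbook-style sketch with toy data, not the specific architecture of [67].

```python
import numpy as np

rng = np.random.default_rng(5)
n_x, n_z = 20, 10                         # observed and latent dimensions
W = rng.normal(0, 0.1, (n_x, n_z))        # generative (top-down) weights: x_hat = W z

A = rng.normal(0, 1, (n_x, 3))            # hidden low-rank structure of the toy data
data = (A @ rng.normal(0, 1, (3, 500))).T

def pc_step(x, W, n_infer=50, lr_z=0.05, lr_w=0.005):
    """One sample of predictive-coding inference and learning with local updates."""
    z = np.zeros(n_z)
    for _ in range(n_infer):
        eps = x - W @ z                   # prediction-error units (local signal)
        z = z + lr_z * (W.T @ eps - z)    # relax latents (Gaussian prior on z)
    eps = x - W @ z
    W = W + lr_w * np.outer(eps, z)       # local, Hebbian-like weight update
    return W, float(np.mean(eps ** 2))

for epoch in range(5):
    for x in data:
        W, err = pc_step(x, W)
print("final reconstruction error:", err)
```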

Conclusions
In this review, we have provided an up-to-date overview of biologically plausible computational models of cognitive flexibility and ventured into promising new directions. RNNs have been crucial in understanding context-dependent decision-making. Recent developments, including the introduction of spiking RNNs and the adoption of continual learning paradigms, have significantly improved their biological realism. On the other hand, LSBNs offer a brain-wide view of cognitive flexibility by encompassing various brain regions, mirroring the distributed nature of cognitive flexibility across the cortex and subcortical structures. Nonetheless, it is crucial to acknowledge that a compact, data-constrained solution for cognitive flexibility is still lacking.
The emergence of hybrid models combining RNNs and LSBNs with empirical connectome data represents a promising avenue for future research in the field of cognitive flexibility. Current approaches merging RNNs with connectomics for simple cognitive tasks [63,64,67] constitute a first step in the creation of hybrid models of cognitive flexibility. A hybrid approach could sometimes converge to a solution similar to those revealed by classical neural dynamics models, although in more general cases it will allow the exploration of other (low- or high-dimensional) solutions. Future work, however, should specifically address whether the advantages of RNNs in flexible cognitive function still emerge in hybrid models, and whether their specialization and modularity align with those of real brains. Likewise, potential caveats of the hybrid approach should be addressed, including the need to develop dedicated theoretical tools for improving interpretability, as has been done for local networks [23,24], and the challenges of large computational models, such as parameter constraints and high computational cost (which could be alleviated with specific learning rules). Additionally, it will be important to expand the scale of these models so that each brain region can be described by a sufficiently large RNN, which will facilitate comparison with electrophysiological recordings from selected brain regions.

Figure 1

Figure 2
(a) Adapted from [42]. (b) Adapted from [51]. (c) Adapted from [52]. (d) Adapted from [53]. (e) Adapted from [54].

Figure 3