Goal-directed top-down control of perceptual representations: A computational model of the Wisconsin Card Sorting Test

In mammals, goal-directed behaviour, relying on an integrated network of fronto-striatal and fronto-parietal systems, supports flexible behaviour. Here we focus on the contribution to such flexibility of top-down selection processes involving internal representations of percepts. We study these processes through a computational model able to solve the Wisconsin Card Sorting Test (WCST), an important neuropsychological test used for measuring cognitive flexibility. The behaviour of the model solving the WCST, and the errors it produces under different lesions, are comparable with those of healthy participants and patients with frontal impairments performing the test. The results represent a first validation of our hypothesis on the importance of internal representation selection for cognitive flexibility.


Introduction
Goal-directed behaviour (GDB; Daw, Gershman, Seymour, Dayan, and Dolan, 2011) is supported by mechanisms able to flexibly create associations between perceptions and actions. This function relies on an active internal exploration of the possible alternative courses of action based on task-independent representations of the world dynamics. Here we hypothesise that GDB supports cognitive flexibility (the ability to change strategy depending on external feedback; Diamond, 2013) through the selection of internal perceptual representations. This selection relies on the active top-down control of perception, affecting both bottom-up and top-down perceptual processes (Findlay & Gilchrist, 2001; Vitay & Hamker, 2007).
The Wisconsin Card Sorting Test (WCST; Heaton et al., 2000) is an important neuropsychological test used to measure cognitive flexibility. Here we propose a computational model able to solve the WCST by pivoting on a novel mechanism operating a top-down selection of internal perceptual representations. We validate the model by showing how its behaviour, and the behaviour of some lesioned versions of it, resemble the behaviour of healthy and pathological human participants performing the test.
Previous computational models of the WCST focused on theoretical analysis (Dehaene & Changeux, 1991), psychiatric patients (Berdia & Metz, 1998), deficit varieties (Kaplan, Şengör, Gürvit, Genç, & Güzeliş, 2006), neuronal dynamics (Rigotti, Ben Dayan Rubin, Wang, & Fusi, 2010), sequential learning processes (Bishara et al., 2010), and the role of a specific brain component such as the basal ganglia (Caso & Cooper, 2017). However, none of them showed the important role that the selection of internal perceptual representations, and its effect on bottom-up and top-down processes, might have for the solution of the task.

Task
In the WCST (Figure 1) the participant has to draw a card from a deck and match it to one of four sample (target) cards following a visual criterion. Each card contains items with a unique combination of features. These features are grouped in three categories, each involving four attributes: colour (red, green, blue, or yellow); shape (stars, triangles, circles, or crosses); number (one, two, three, or four elements). To solve the task, the participant should move each deck card close to one of the target cards, trying to match it in terms of either colour, shape, or number. After each matching attempt, an external operator gives feedback ('correct' or 'not correct') based on the current matching rule. The participant is not told this rule, which should therefore be inferred from the feedback. Critically, the correct rule changes after a certain number of uninterrupted correct actions (usually 10), and when this happens the participant should infer the new rule and flexibly switch to it.
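The task logic described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the task implementation used with the model: the class name `WCST`, the helper `match_by`, the random card drawing, and the specific four target cards (the layout commonly used for the test) are assumptions introduced here; only the three categories, their attributes, the binary feedback, and the rule switch after 10 consecutive correct matches come from the text.

```python
import random

COLOURS = ["red", "green", "blue", "yellow"]
SHAPES = ["star", "triangle", "circle", "cross"]
NUMBERS = [1, 2, 3, 4]
RULES = ["colour", "shape", "number"]  # order of rules follows the text

# Assumed target cards: each attribute value appears on exactly one target.
TARGETS = [
    {"colour": "red", "shape": "triangle", "number": 1},
    {"colour": "green", "shape": "star", "number": 2},
    {"colour": "yellow", "shape": "cross", "number": 3},
    {"colour": "blue", "shape": "circle", "number": 4},
]


class WCST:
    """Hypothetical minimal WCST environment with a hidden sorting rule."""

    def __init__(self, switch_after=10, seed=0):
        self.rng = random.Random(seed)
        self.switch_after = switch_after  # consecutive correct matches per category
        self.rule_index = 0
        self.streak = 0
        self.completed = 0

    @property
    def rule(self):
        # The participant never sees this; it is exposed here for inspection only.
        return RULES[self.rule_index % len(RULES)]

    def draw(self):
        """Draw a deck card (random sampling is an assumption of this sketch)."""
        return {
            "colour": self.rng.choice(COLOURS),
            "shape": self.rng.choice(SHAPES),
            "number": self.rng.choice(NUMBERS),
        }

    def respond(self, card, target_index):
        """Return True ('correct') or False ('not correct') for a placement."""
        correct = TARGETS[target_index][self.rule] == card[self.rule]
        if correct:
            self.streak += 1
            if self.streak == self.switch_after:
                # Category completed: the hidden rule changes without notice.
                self.completed += 1
                self.rule_index += 1
                self.streak = 0
        else:
            self.streak = 0
        return correct


def match_by(card, attribute):
    """Index of the target card sharing the given attribute with the deck card."""
    for index, target in enumerate(TARGETS):
        if target[attribute] == card[attribute]:
            return index
    raise ValueError("no target shares this attribute")
```

An agent that always sorts by colour receives ten 'correct' feedbacks in this sketch, after which the hidden rule silently becomes shape and the same strategy starts to fail on cards whose colour and shape point to different targets.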

Neuro-functional components of the model
The model is formed by components implementing functions that resemble analogous functions of brain areas relevant for a GDB-based interaction with the environment: (a) a visual sensor that mimics the fovea/perifovea parts of the retina; (b) a hierarchical perceptual system that processes visual information and extracts visual features, simulating the human visual system (Konen & Kastner, 2008); (c) a working-memory that stores the goals linked to the possible matching rules, supported by the brain prefrontal cortex (Barraclough, Conroy, & Lee, 2004) and frontal-striatal loops (Baldassarre et al., 2013); (d) a motivational system, processing the external feedback, that activates the goals within the working-memory, a function mostly supported by brain basal ganglia and ventral systems (Gläscher, Daw, Dayan, & O'Doherty, 2010); (e) a selector that chooses the matching rule and hence the specific features of the cards that the system focuses on, simulating the contribution of basal ganglia (Redgrave, Prescott, & Gurney, 1999) and frontal-parietal cortex (Gazzaley & Nobre, 2012) to top-down control of internal perceptual representations; (f) a comparator that supports the visual matching (comparison of the deck card and a target card), executed by brain frontal and temporal-occipital cortices (Perani et al., 1999); (g) a motor component performing the movements of the visual sensor and the displacement of the deck card close to the target card.

Computational implementation of the model
The model components are mainly implemented with neural networks. The working-memory component is formed by recurrent units that encode the tendency to choose specific matching rules. Each unit has a self-synapse and decays to a baseline (0.5) with a decay rate φ. The motivational component, which changes the working-memory values on the basis of external feedback, is supported by a reinforcement learning algorithm driven by the difference between reward and expectation (reward − expectation), with a learning rate µ. The hierarchical perceptual system is supported by a modified version of a Deep Belief Network (DBN; Hinton, Osindero, and Teh, 2006), a bidirectional network composed of two stacked generative Restricted Boltzmann Machines (RBM; Hinton, 2012). The DBN is composed of a traditional RBM, trained by the contrastive divergence algorithm, and a supervised RBM, trained with a modified version of the contrastive divergence algorithm in which specific units of the external layer are clamped to the different attributes of colour, shape, and number (this learning putatively takes place before the WCST session, based on interaction with the world). The top-down selector is composed of (a) a softmax function (with temperature τ) that chooses the behavioural strategy based on the working-memory content, and (b) a disinhibition mechanism, mimicking the basal ganglia, which inhibits all the units of the last DBN layer with the exception of the one encoding the desired rule. This allows the system to have an internal representation of each card (RBM) that is focused on one specific attribute (colour, shape, or number). This process is applied sequentially to the deck card and each target card until a match between them is found and the deck-card displacement is triggered. The matching (based on Euclidean distance) is performed on the DBN reconstructed image focused on the selected attribute.
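The interplay of the parameters φ, µ, and τ can be illustrated with a small Python sketch. It is a simplified illustration under explicit assumptions, not the implementation of the model: the exact update equations, the order of decay and learning, and all function names are hypothetical, and a 12-unit one-hot code (three groups of four units) stands in for the DBN reconstruction, which is not reproduced here. Only the baseline of 0.5, the reward-minus-expectation rule, the softmax with temperature, the inhibition of non-selected units, and the Euclidean matching come from the text.

```python
import math

BASELINE = 0.5  # resting activation of the working-memory units (from the text)


def update_working_memory(wm, chosen, reward, mu, phi):
    """Return new rule activations after one feedback (assumed update form)."""
    # Each unit leaks towards the baseline at rate phi.
    new = [w + phi * (BASELINE - w) for w in wm]
    # The chosen rule moves towards the reward by mu * (reward - expectation).
    new[chosen] += mu * (reward - new[chosen])
    return new


def softmax(values, tau):
    """Choice probabilities over rules; a higher tau gives a flatter choice."""
    top = max(values)  # subtracted for numerical stability
    exps = [math.exp((v - top) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def encode(card):
    """Stand-in for the top DBN layer: one-hot groups for colour, shape, number.

    `card` is a tuple of three indices in 0..3 (an assumption of this sketch).
    """
    vector = [0.0] * 12
    for group, index in enumerate(card):
        vector[4 * group + index] = 1.0
    return vector


def focus(vector, rule):
    """Inhibit every unit outside the group of the selected rule (0, 1, or 2)."""
    return [x if 4 * rule <= i < 4 * rule + 4 else 0.0
            for i, x in enumerate(vector)]


def best_match(deck_card, targets, rule):
    """Index of the target closest (Euclidean) to the attribute-focused deck card."""
    probe = focus(encode(deck_card), rule)
    distances = [math.dist(probe, focus(encode(t), rule)) for t in targets]
    return distances.index(min(distances))
```

In this sketch a very small µ leaves the activation of a failing rule almost unchanged after negative feedback, whereas a large τ makes the softmax nearly uniform regardless of the working-memory content; these are the two directions in which the lesioned versions of the model are pushed.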

Results and conclusion
We used five behavioural measures to score the model and compare it with human data: (a) completed categories (CC): the number of successfully completed categories (maximum six, i.e. colour, shape, number, colour, shape, number), which informs about the level of global performance; (b) total errors (TE): the total number of incorrect responses, which informs about the level of global deficit; (c) perseverative errors (PE): these mark a perseverative tendency to sort the cards with the same incorrect rule after negative feedback; (d) non-perseverative errors (NPE): these occur in different situations and suggest an attentional failure or incorrect inferential reasoning; (e) Failure-to-Maintain-Set (FMS) errors: these occur after five consecutive correct matches and suggest a distraction.
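One possible way to compute the five measures from a sequence of trials is sketched below in Python. It is a simplified reading of the definitions above and not the scoring procedure used in this work: the function name `score`, the trial format (the rule selected by the agent plus the feedback received), and the operational definitions are assumptions. In particular, an error is counted here as perseverative only when it repeats the rule that received negative feedback on the immediately preceding trial, and FMS errors are counted separately from the PE/NPE split.

```python
def score(trials, switch_after=10, fms_after=5, max_categories=6):
    """Score a list of (chosen_rule, correct) pairs given in trial order."""
    scores = {"CC": 0, "TE": 0, "PE": 0, "NPE": 0, "FMS": 0}
    streak = 0       # consecutive correct matches in the current category
    previous = None  # (chosen_rule, correct) of the preceding trial
    for chosen, correct in trials:
        if correct:
            streak += 1
            if streak == switch_after:
                # A category is completed; the rule changes on the next trial.
                scores["CC"] = min(scores["CC"] + 1, max_categories)
                streak = 0
        else:
            scores["TE"] += 1
            repeated = (previous is not None
                        and not previous[1]
                        and previous[0] == chosen)
            if repeated:
                scores["PE"] += 1   # same rule again right after negative feedback
            else:
                scores["NPE"] += 1  # any other error
            if streak >= fms_after:
                scores["FMS"] += 1  # set lost after a run of correct matches
            streak = 0
        previous = (chosen, correct)
    return scores
```

With these assumptions every error is either a PE or an NPE, so PE + NPE = TE, while an FMS error is an additional label on an error that interrupts a run of at least five correct matches.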
We searched different parameter settings to obtain two different groups of models ('conditions') exhibiting different relevant behaviours: (a) healthy model (HM): this baseline model reproduces the behaviour of normotypical WCST participants; (b) pathological model (PM): this reproduces the behaviour of pathological WCST participants. Moreover, we applied two types of 'lesions' to the model, thus obtaining two 'extreme pathological' versions of it. The first lesion (an extremely low µ value, which reduces the efficacy of reinforcement processing) produces a highly perseverative model (PM) with a low sensitivity to rule changes (high number of PE).
The second lesion (an extremely high value of τ, the softmax temperature, which makes the model insensitive to value differences, together with an extremely high φ, the decay rate of the recurrent units' self-synapses, which causes forgetting of the previously chosen rule) produces a distracted model (DM) that shows the opposite tendency with respect to the previous model, i.e. an extremely distracted and erratic behaviour (high number of NPE and FMS errors). Figure 3 and Figure 4 show the statistical comparisons between human data and artificial data in the healthy and pathological conditions, highlighting the similarity between the model behaviour and the target empirical data. Figure 5 shows the errors of the extreme pathological models, highlighting the effects of specific neuro-inspired lesions.
The results support the hypothesis, incorporated in the neuro-inspired model, that the selection of internal perceptual representations might represent a key mechanism supporting the goal-directed cognitive flexibility measured by the WCST. In particular, the manipulation of the model parameters regulating its selection and memory processes, mimicking different possible lesions, reproduces the tendency of human patients to exhibit perseverative or distracted WCST errors.