GABAergic Competition Boosts the Irrationality of Protracted Decisions

Humans often violate the principles of decision rationality. For instance, they may choose fish over steak, steak over salad, but salad over fish, thus disclosing inconsistent preferences. Such choice inconsistencies are well-established, but their mechanistic basis remains elusive. We have previously attributed choice inconsistencies to a selective integration process: In a protracted decision-making task (trials of 5-10 s) requiring the accumulation of two simultaneously presented streams of payoff samples, momentarily higher payoffs were accumulated with stronger weight. Here, we hypothesized that this selective integration process may be realized via competitive interactions between incoming payoffs, mediated via GABAergic inhibition in cortical circuits. We tested this hypothesis in humans by combining the task above with magnetoencephalography (MEG) and a pharmacological boost of GABAergic transmission (lorazepam). The drug amplified MEG markers of cortical inhibition as well as behavioral signatures of selective integration. Critically, the drug did not change the time-constant of payoff accumulation. We conclude that GABAergic cortical inhibition mediates selective integration and decision irrationality. In the protracted decisions we examined, GABAergic inhibition exerts its effect primarily on the input stage rather than the accumulation stage, in contrast to current neural circuit models of perceptual evidence accumulation.

Background
Humans often violate the principles of rational choice theory. For instance, they may prefer A over B, B over C but C over A, thus disclosing inconsistent preferences. Such violations of decision rationality imply that the value assigned to an alternative is context-sensitive, being influenced by the properties of the other alternatives on offer. This context sensitivity has recently been attributed to a selective integration mechanism: In protracted choices requiring the accumulation of temporally discrete psychophysical or numerical samples, momentarily high-valued samples are accumulated with a higher gain (Tsetsos, Chater, & Usher, 2012; Tsetsos et al., 2016).

Selective Integration
To account for this effect, we have developed a computational model, henceforth called the selective integration model (Tsetsos et al., 2016). The model applies to decisions based on two sequences of inputs, presented simultaneously (Figure 1a). Two accumulators ($Y_{A,B}$) integrate the two sequences ($X_{A,B}$, with $X_A(t)$ denoting the value of sequence A at the discrete sample t) across time:

$$Y_A(t) = (1 - \lambda)\, Y_A(t-1) + I_A(t) + \sigma\, \xi_A(t) \quad (1)$$
$$Y_B(t) = (1 - \lambda)\, Y_B(t-1) + I_B(t) + \sigma\, \xi_B(t) \quad (2)$$

In equations 1-2, t is the current discrete time-step (or sample), $\lambda$ is the accumulation leakage, $I_{A,B}$ is the effective input to the accumulators, $\sigma$ is the standard deviation of the noise, and $\xi_{A,B}$ are standard Gaussian samples. The accumulators are initialised at 0 and at the end of the stimulus presentation (after the last sample at $t = T$) a decision is made in favour of the accumulator with the higher accumulated value. The inputs to the two accumulators reflect the modified sequence values after the selective integration filter is applied:

$$I_A(t) = \theta\big(X_A(t), X_B(t)\big)\, X_A(t) \quad (3)$$
$$I_B(t) = \theta\big(X_B(t), X_A(t)\big)\, X_B(t) \quad (4)$$
$$\theta(x, y) = \begin{cases} 1 & \text{if } x \geq y \\ w & \text{if } x < y \end{cases} \quad (5)$$

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0
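The model described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function and argument names are hypothetical, the leak is assumed to enter as a factor (1 - leak) on the previous accumulator value, and ties are arbitrarily resolved in favour of sequence B.

```python
import random


def theta(x, y, w):
    """Selective integration filter: weight 1 for the momentary winner, w for the loser."""
    return 1.0 if x >= y else w


def selective_integration(seq_a, seq_b, w=0.5, leak=0.0, sigma=0.0, rng=None):
    """Return 'A' or 'B' after accumulating two equally long sequences.

    Assumed update rule (a sketch): Y(t) = (1 - leak) * Y(t-1) + I(t) + sigma * xi(t).
    """
    rng = rng or random.Random(0)
    y_a = y_b = 0.0  # accumulators are initialised at 0
    for x_a, x_b in zip(seq_a, seq_b):
        # Effective inputs after the selective integration filter.
        i_a = theta(x_a, x_b, w) * x_a
        i_b = theta(x_b, x_a, w) * x_b
        y_a = (1 - leak) * y_a + i_a + sigma * rng.gauss(0, 1)
        y_b = (1 - leak) * y_b + i_b + sigma * rng.gauss(0, 1)
    # Decision after the last sample; ties go to B (an arbitrary choice here).
    return "A" if y_a > y_b else "B"
```

With w = 1 the filter is inactive and the model reduces to a plain leaky accumulator; with w < 1 the momentarily lower sample is down-weighted, which can reverse the choice between sequences such as (60, 40) and (50, 52).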

Behavioral signatures of selective integration
Between two Gaussian sequences with different variances and equal means, the selective integration model predicts a choice bias towards the more variable sequence (hereafter, the pro-variance bias).
To illustrate, we assume for simplicity that the more variable sequence HV takes high (H) and low (L) values while the less variable sequence LV takes mid-range values (M), for example HV = (H, L) and LV = (M, M). Assuming no leak or noise, the accumulated values for the two streams after the selective integration filter will be $Y_{HV} = H + wL$ and $Y_{LV} = M + wM$, where w is the weight given to the momentarily lower sample. The more variable stream wins, $Y_{HV} > Y_{LV}$, whenever $w < (H - M)/(M - L)$. Given that the two sequences are generated from symmetric (Gaussian) distributions, M is equidistant from H and L, and the pro-variance bias occurs for $w < 1$, or in other words as long as the selective integration filter is active.
Secondly, between two sequences of equal mean value the selective integration model predicts a choice bias towards the sequence that wins in more samples (hereafter, the frequent-winner bias).
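Both signatures can be checked numerically with a noise-free, leak-free version of the filter. The sketch below uses hypothetical values (H = 70, M = 50, L = 30, and a weight of 0.5 on the momentarily lower sample); the name `accumulate` and the example sequences are illustrative assumptions, not taken from the original study.

```python
def accumulate(seq, other, w):
    """Sum a sequence after the selective integration filter (no leak, no noise).

    A sample keeps weight 1 if it is at least as large as the concurrent
    sample of the other sequence, and weight w otherwise.
    """
    return sum(x if x >= y else w * x for x, y in zip(seq, other))


w = 0.5  # hypothetical weight on the momentarily lower sample

# Pro-variance example: HV alternates high/low, LV stays mid-range.
H, M, L = 70, 50, 30
hv, lv = [H, L], [M, M]
y_hv, y_lv = accumulate(hv, lv, w), accumulate(lv, hv, w)
w_critical = (H - M) / (M - L)  # HV is favoured whenever w < w_critical

# Frequent-winner example: equal sums, but `often` wins 3 of 4 samples.
often, rarely = [52, 52, 52, 44], [50, 50, 50, 50]
y_often, y_rarely = accumulate(often, rarely, w), accumulate(rarely, often, w)

print(y_hv, y_lv, w_critical)
print(y_often, y_rarely)
```

In both examples the two sequences have identical sums, so an unbiased accumulator (w = 1) would be indifferent; with w < 1 the more variable sequence and the more frequently winning sequence accumulate the larger totals.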

Limitations of Selective Integration Model
The pro-variance and frequent-winner biases are parsimoniously explained by the selective integration model. Both behavioral signatures are found in tasks requiring the accumulation of payoff or magnitude samples (Tsetsos et al., 2012; Tsetsos et al., 2016). However, the selective integration model predicts a tight positive correlation between the two signatures, whereas this correlation is negligible in behavioral data. Additionally, the selective integration model is cast in algebraic terms and lacks biological realism.

Extended Selective Integration Model
We develop a new model that overcomes the above two limitations of the selective integration model. In our new model, called extended selective integration, selective integration is mediated by a lateral inhibition mechanism at the input level, which replaces equations 3-5. The effective inputs are derived from the activity of two input units, one per sequence. Each pair of samples is presented for a duration P (in units of time), and the unit activities are updated in small time intervals. The input units are initialised at 0 and their dynamics are governed by a pair of coupled differential equations (equations 10-11). The input units are non-linear, being subject to a reflecting boundary at 0. In equations 10-11, each unit is subject to a leak (parameter set to 1) and is inhibited by the other unit with strength β; the inhibitory interactions are gated by a sigmoid function (Brown & Holmes, 2001), where g is the slope and b the inflection point of the sigmoid, representing the threshold above which the inhibiting unit becomes effective.

While the pro-variance and frequent-winner biases are both affected in the same way by the strength of inhibition (parameter β), they are effectively decorrelated by the "threshold" parameter b. If b is set above the middle of the value range, the selective integration filter will be inactive for mid-range vs. low values (i.e. cases in which the less variable sequence dominates over the more variable sequence), further exaggerating the bias for the more variable sequence. By contrast, the threshold parameter does not have a major influence on the frequent-winner effect.
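One way to picture the lateral inhibition stage is the Euler-integrated sketch below. The exact equations are not reproduced here, so the form of the dynamics is an assumption: each unit is driven by its current sample, decays with leak k, and is inhibited by β times a sigmoid of the other unit's activity. All names (`input_units`, `sigmoid`, `k`, `duration`, `dt`) and the parameter values are hypothetical.

```python
import math


def sigmoid(x, g, b):
    """Gate for the inhibition: slope g, inflection point (threshold) b."""
    return 1.0 / (1.0 + math.exp(-g * (x - b)))


def input_units(value_a, value_b, beta, g=1.0, b=50.0, k=1.0,
                duration=10.0, dt=0.01):
    """Integrate two mutually inhibiting input units for one pair of samples.

    Assumed dynamics (a sketch, not the published equations):
        dx_A/dt = -k * x_A - beta * sigmoid(x_B) + value_A
        dx_B/dt = -k * x_B - beta * sigmoid(x_A) + value_B
    with a reflecting boundary at 0.
    """
    x_a = x_b = 0.0  # units start at 0
    for _ in range(int(round(duration / dt))):
        da = (-k * x_a - beta * sigmoid(x_b, g, b) + value_a) * dt
        db = (-k * x_b - beta * sigmoid(x_a, g, b) + value_b) * dt
        x_a = max(0.0, x_a + da)  # reflecting boundary at 0
        x_b = max(0.0, x_b + db)
    return x_a, x_b


# High vs. low sample: the higher unit crosses the threshold and suppresses the other.
high_vs_low = input_units(60.0, 40.0, beta=20.0)
# Mid vs. low sample, both below the threshold b: almost no suppression.
mid_vs_low = input_units(45.0, 30.0, beta=20.0)
```

Under these assumptions the pair (60, 40) settles near (60, 20), whereas the pair (45, 30) stays close to its raw values because neither unit exceeds the threshold b = 50. This is the sense in which a high threshold switches the filter off for mid-range vs. low comparisons.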

Behavioral Task
Participants
Forty healthy individuals (N = 40) took part in 4 experimental sessions. The first session was performed outside the MEG scanner and involved training in the task. The second session did not involve a pharmacological manipulation (nocebo) and was followed by two sessions that involved intake of a low dose of lorazepam (1 mg) or a placebo pill (order counterbalanced).

Task & Procedure
On each trial, participants observed 5-8 pairs of black 2-digit numerical values (payoffs) presented sequentially (at a rate of 800 ms per pair), to the left and right of a central fixation point (0.34° diameter) against a grey background. The viewing distance was 65 cm and each numerical character was 0.66° wide and 0.95° high. Participants were instructed to monitor the two sequences and to report, after the offset of both sequences, which one had the higher average value. Feedback was provided on each trial. At the end of each session, participants received a monetary bonus based on their performance. Each session involved 6 blocks of 60 trials each. At the end of each block, participants were informed about their cumulative accuracy.

Design
The average difference between the correct and incorrect sequence ranged from 2 to 12 units. Five types of trials were presented in random order. In type 1 trials, the two sequences were sampled from two Gaussians with equal variances and different means. In type 2 and type 3 trials, the sequence with the higher mean had higher (type 2) or lower (type 3) variance. The difference in accuracy between type 3 and type 2 trials quantified the pro-variance bias. In type 4 (type 5) trials, the sequence with the higher mean value dominated the sequence with the lower mean value more (less) often. The accuracy difference between type 5 and type 4 trials quantified the frequent-winner bias.
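The way an accuracy difference quantifies the pro-variance bias can be illustrated with a small Monte Carlo sketch. Everything below is an assumption made for illustration: the means, standard deviations, number of samples and weight w are hypothetical, the chooser is a noise-free, leak-free selective integrator, and the bias is signed as type 2 minus type 3 accuracy so that a preference for the more variable sequence comes out positive.

```python
import random


def choose_first(seq_a, seq_b, w):
    """Noise-free, leak-free selective integration; True if the first sequence wins."""
    y_a = sum(a if a >= b else w * a for a, b in zip(seq_a, seq_b))
    y_b = sum(b if b >= a else w * b for a, b in zip(seq_a, seq_b))
    return y_a > y_b


def accuracy(correct_sd, incorrect_sd, w, n_trials=5000, n_samples=8,
             mean=50.0, mean_diff=4.0, seed=1):
    """Fraction of trials on which the higher-mean ('correct') sequence is chosen."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        correct = [rng.gauss(mean + mean_diff, correct_sd) for _ in range(n_samples)]
        incorrect = [rng.gauss(mean, incorrect_sd) for _ in range(n_samples)]
        hits += choose_first(correct, incorrect, w)
    return hits / n_trials


def pro_variance_bias(w, high_sd=20.0, low_sd=10.0):
    """Accuracy on type 2 trials minus accuracy on type 3 trials (assumed sign)."""
    type2 = accuracy(high_sd, low_sd, w)  # higher-mean sequence is more variable
    type3 = accuracy(low_sd, high_sd, w)  # higher-mean sequence is less variable
    return type2 - type3


bias_selective = pro_variance_bias(w=0.5)  # filter active
bias_unbiased = pro_variance_bias(w=1.0)   # filter inactive
```

With the filter inactive the two trial types are equally difficult and the difference hovers around zero; with w < 1 accuracy rises when the correct sequence is the more variable one and falls when it is the less variable one.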

MEG
We recorded 275-sensor MEG while participants performed the task described above. We used independent component analysis (ICA) to remove head-muscle and eye-movement artefacts from the MEG data. The continuous MEG signal of each run was submitted to spectral analysis to compute (i) overall spectral power and (ii) continuous amplitude envelope time series in different frequency bands. The power analysis aimed to identify changes in different frequency bands in the drug condition relative to placebo. The amplitude envelope time series were used for detrended fluctuation analysis (DFA). The resulting scaling exponent quantifies the long-range temporal correlations in amplitude envelope fluctuations and is sensitive to the ratio between cortical excitation and inhibition (Pfeffer et al., 2018).
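For readers unfamiliar with DFA, the sketch below shows the generic first-order procedure on an arbitrary one-dimensional series: integrate the demeaned signal, remove a linear trend within non-overlapping windows, and take the slope of log fluctuation against log window size. It is a minimal illustration, not the analysis pipeline used here; the function name, window sizes and the white-noise test signal are assumptions, and in the study the input would be a band-limited amplitude envelope rather than raw noise.

```python
import math
import random


def dfa_exponent(signal, window_sizes):
    """Estimate the first-order DFA scaling exponent of a 1-D sequence."""
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for value in signal:  # cumulative sum of the demeaned signal
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        t_mean = (n - 1) / 2.0
        t_var = sum((t - t_mean) ** 2 for t in range(n))
        sq_resid, count = 0.0, 0
        # Non-overlapping windows; a trailing partial window is ignored.
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            seg_mean = sum(seg) / n
            # Least-squares linear trend within the window.
            slope = sum((t - t_mean) * (y - seg_mean)
                        for t, y in enumerate(seg)) / t_var
            sq_resid += sum((y - seg_mean - slope * (t - t_mean)) ** 2
                            for t, y in enumerate(seg))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_resid / count))  # log of RMS fluctuation

    # The slope of log F(n) against log n is the scaling exponent.
    mx, my = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    return (sum((x - mx) * (y - my) for x, y in zip(log_n, log_f))
            / sum((x - mx) ** 2 for x in log_n))


rng = random.Random(0)
white_noise = [rng.gauss(0, 1) for _ in range(4096)]
alpha = dfa_exponent(white_noise, [16, 32, 64, 128, 256])
```

Uncorrelated noise yields an exponent close to 0.5, while signals with long-range temporal correlations yield larger values; a drug-induced reduction of the exponent therefore indicates weaker long-range temporal correlations.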

Behavioural drug effects
Participants had lower accuracy and longer response latencies under lorazepam compared to placebo, in line with reduced arousal under lorazepam (Figure 1b). Critically, the frequent-winner and pro-variance biases were both enhanced under lorazepam. By contrast, the temporal dynamics of evidence accumulation, characterised by the time-constant (leak) of the psychophysical kernels, were unaffected (Figure 1b, "Leak" panel). We note that if lateral inhibition between choice accumulators had increased under the drug, a decrease of the net leak would have been observed (Usher & McClelland, 2001).

MEG drug effects
Compared to placebo, lorazepam decreased theta (4-6 Hz) and alpha (7-12 Hz) power across all MEG sensors and increased frontal beta power (15-30 Hz), which is a well-documented effect of benzodiazepines (Figure 2a). Lorazepam also reduced the scaling exponent of theta-band amplitude envelopes (Figure 2a). This is in line with a decrease in the excitation/inhibition ratio in cortical circuits, due to the local, lorazepam-induced enhancement of GABAergic neurotransmission.

Link between MEG and behavioural effects
The individual lorazepam-induced increase in beta power predicted the individual, non-specific increase in response latencies. By contrast, the individual increases in pro-variance and frequent-winner biases were predicted by changes in the scaling exponent (biomarker of cortical inhibition) (Figure 2b). All results are in line with the idea that selective integration is realised via inhibitory interactions within the cortical processing stage that provides input to the decision computation (i.e. evidence accumulation). Computational modelling further corroborated the above conclusion. Fitting the extended selective integration model to the placebo and drug sessions revealed a focal increase in the noise and inhibition parameters in the latter (Figure 3). Further, partial correlation analyses (with all model parameters and all drug-induced MEG changes as covariates) showed that changes in scaling exponents (cortical inhibition biomarker) selectively predicted changes in the inhibition parameter (r = -0.42, p = .014). By contrast, beta power changes were related to changes in the model's decision noise (r = 0.41, p = .018).

Conclusions
Our results illuminate the cortical mechanisms underlying decision irrationality, linking those to GABA-A-mediated intra-cortical competition. An analogous mechanism has been inferred for multi-stable perception, another domain of intra-cortical competition (van Loon et al., 2013). Our insights challenge the notion, put forward by influential biophysical circuit models of perceptual evidence accumulation (Wang, 2008), that GABAergic inhibition shapes decision dynamics by mediating the competition between choice accumulators. Instead, our findings indicate that, during our task, inhibitory interactions predominate at the input stage encoding the evidence and are negligible at the evidence accumulation stage. We propose that during protracted decisions the neural circuits that typically carry out decisions on short timescales perform "microdecisions" (identifying the larger input) and act as filters for accumulators with longer timescales further downstream in the cortical hierarchy.