Identifying the Neurophysiological Correlates of Learning in Human Perceptual Decision-Making

Despite the well-established benefits of training on perceptual decision-making, there is still considerable uncertainty regarding the precise stages of information processing that are altered by learning. Here, we sought to characterize the neural adjustments that take place along the sensorimotor hierarchy following training on a perceptual task. To this end, we isolated distinct electrophysiological signatures of perceptual decision-making at the three key stages of information processing necessary for simple sensorimotor transformations (sensory evidence encoding, decision formation and motor preparation) as participants trained on a contrast discrimination task over five days. Steady-state visual evoked potentials (SSVEPs) reliably traced changes in stimulus contrast, thereby providing a read-out of sensory evidence encoding, while the centroparietal positivity (CPP) and lateralized beta-band activity provided domain-general and effector-selective indices of decision formation, respectively. Over the course of training, subjects learned to make quicker and more accurate perceptual decisions. These improvements were accompanied by a progressive boosting of sensory evidence representation, which in turn led to an increase in the build-up rate and peak amplitude of the CPP. A diffusion model analysis attributed the learning effects to increases in the rate of evidence accumulation, but no changes in the decision bound were observed.


Background
An important component of adaptive human behaviour is the ability to refine and enhance our perceptual capabilities through learning. Indeed, it is well established that our ability to make perceptual decisions improves with practice, a phenomenon better known as perceptual learning (for a recent review, see Dosher & Lu, 2017). However, there is considerable debate as to the precise neural adaptations underpinning these behavioural improvements. One particular source of debate pertains to whether these improvements reflect changes in early sensory representations or changes in later stages of the decision process involved in the read-out of sensory evidence from representational units (e.g. Petrov, Dosher & Lu, 2005; Bejjanki et al., 2011). In the primate neurophysiological literature, under the guidance of the sequential sampling framework, significant advances have been made in addressing this issue (e.g. Law & Gold, 2008). However, in human neurophysiology the neural mechanisms of perceptual learning have yet to be thoroughly investigated within the context of sequential sampling. Thus, the aim of the present study was to examine the impact of training across the hierarchy of information processing in the human brain by isolating distinct signatures of sensory evidence representation and sensory evidence accumulation at domain-general and effector-selective levels using temporally-precise electrophysiological recordings. The results from this neurophysiological analysis will also be compared with the results of a diffusion model fit to the behavioural data (Ratcliff, 1978) to assess whether the effects of learning on the behavioural and neural signatures of perceptual decision-making are consistent with one another.

Psychophysical task and perceptual learning protocol
Participants performed a difficult two-alternative contrast discrimination task in which they were required to report the direction of tilt (left or right) of a target grating based on a change in the relative contrast between two overlaid grating stimuli (see Figure 1). The gratings were 'frequency tagged' to allow independent measurement of sensory evidence in favour of both possible choices via separate steady-state visual evoked potentials (SSVEPs), with the left- and right-tilted gratings flickering at 20 Hz and 25 Hz, respectively. The stimuli were held at 50% contrast for an initial foreperiod, after which they underwent antithetical changes in contrast whereby the target stepped up in contrast while the non-target stepped down by a corresponding amount. The size of this contrast change was determined separately for each participant via a staircase procedure conducted at the beginning of the study.
At the end of each trial, feedback was presented onscreen and indicated whether the subject had responded correctly, responded incorrectly or failed to respond within the deadline. Points were awarded on a trial-by-trial basis according to the accuracy and speed with which the participants responded. Every correct response was awarded 40 points plus a speed bonus, while incorrect responses and missed targets were awarded no points. The maximum speed bonus was 40 points and this amount diminished linearly from 40 to 0 points across a 2000 ms period. Participants trained on the contrast discrimination task for five sessions and were encouraged to improve their performance via monetary incentives.
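The scoring rule above can be sketched as a small function. This is an illustrative reading only: the function name and arguments are hypothetical, and it assumes the 2000 ms bonus period is timed from the contrast change, which the text does not state.

```python
from typing import Optional


def trial_points(correct: bool, rt_ms: Optional[float],
                 base: float = 40.0, max_bonus: float = 40.0,
                 bonus_window_ms: float = 2000.0) -> float:
    """Points for one trial: base points plus a linearly decaying speed bonus.

    rt_ms is the response time in ms (None for a missed target).
    ASSUMPTION: rt_ms is measured from the contrast change.
    """
    # Incorrect responses and missed targets earn nothing.
    if not correct or rt_ms is None:
        return 0.0
    # Fraction of the bonus window already elapsed, clipped to [0, 1].
    elapsed = min(max(rt_ms / bonus_window_ms, 0.0), 1.0)
    # Bonus falls linearly from max_bonus (at 0 ms) to 0 (at 2000 ms).
    return base + max_bonus * (1.0 - elapsed)
```

Under these assumptions, a correct response at 1000 ms would earn the 40 base points plus half of the maximum bonus.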

EEG signal analysis
To investigate the effects of training on the neurophysiological correlates of perceptual decision-making, EEG data were collected while participants trained on the contrast discrimination task. Following previous work from our lab (e.g. O'Connell, Dockree & Kelly, 2012; Steinemann, O'Connell & Kelly, 2018), distinct neural signatures of perceptual decision-making were isolated in the EEG signal. These included the steady-state visual evoked potential (SSVEP), which tracked the representation of stimulus contrast and thereby provided a direct measure of sensory evidence; a domain-general decision formation signal in the event-related potential (ERP), termed the centroparietal positivity (CPP); and oscillatory activity in the mu/beta frequency bands (10-30 Hz), which indexes evidence accumulation in an effector-selective manner (e.g. de Lange et al., 2013). The effects of training on the rate and peak amplitude of sensory evidence accumulation were measured by examining the decision-related activity in the response-locked CPP and mu/beta waveforms. The build-up rate of the CPP was measured as the slope of a straight line fitted to the unfiltered ERP over a time window of -500 to -200 ms relative to response. The slope of the response-locked mu/beta lateralization waveform was measured over the time window -450 to -150 ms relative to response. The peak magnitude of the response-locked CPP was calculated as the average amplitude within the -100 to 100 ms window centred on the individual response time, while the trough of the mu/beta lateralization was calculated by averaging the FFT values within the window -150 to -50 ms relative to response.
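As a rough illustration of the windowed measures described above, the following sketch fits a straight line to a response-locked waveform and averages it within a window. The function names, the synthetic ramp and the 500 Hz sampling rate are assumptions for the example, not details of the actual analysis pipeline.

```python
import numpy as np


def window_mask(times_ms, start_ms, end_ms):
    """Boolean mask selecting samples inside [start_ms, end_ms]."""
    return (times_ms >= start_ms) & (times_ms <= end_ms)


def buildup_rate(times_ms, waveform, start_ms=-500.0, end_ms=-200.0):
    """Slope of a least-squares straight line fitted within the window."""
    m = window_mask(times_ms, start_ms, end_ms)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, _intercept = np.polyfit(times_ms[m], waveform[m], deg=1)
    return slope  # amplitude units per ms


def mean_amplitude(times_ms, waveform, start_ms=-100.0, end_ms=100.0):
    """Average amplitude within the window (used here as the peak measure)."""
    m = window_mask(times_ms, start_ms, end_ms)
    return waveform[m].mean()


# Synthetic response-locked waveform: a linear ramp of 0.02 units/ms.
# ASSUMPTION: 500 Hz sampling (2 ms steps), time 0 = response.
times_ms = np.arange(-800.0, 201.0, 2.0)
cpp = 0.02 * (times_ms + 800.0)

cpp_slope = buildup_rate(times_ms, cpp)    # CPP window: -500 to -200 ms
cpp_peak = mean_amplitude(times_ms, cpp)   # CPP window: -100 to 100 ms
```

The same two helpers could be applied to the mu/beta lateralization waveform by passing the -450 to -150 ms and -150 to -50 ms windows instead.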

Drift diffusion modelling
Figure 1: Schematic of the two-alternative forced-choice contrast discrimination task. At the beginning of each trial, overlaid left- and right-tilted gratings were presented at 50% contrast. After an initial foreperiod that varied unpredictably from trial to trial, one grating stepped up in contrast while the other stepped down by a corresponding amount. Participants reported the orientation of the target via a click of a mouse button.

Preliminary modelling was conducted by fitting the behavioural data with a two-choice drift diffusion model with six free parameters (non-decision time, drift rate for left targets, drift rate for right targets, decision bound, starting point, urgency). Urgency was modelled as a linear collapse of the decision bound as a function of time. The diffusion model was fit to the pooled-subject behavioural data for each training day by minimizing the G² statistic with a SIMPLEX minimization routine.
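To make the model structure concrete, here is a minimal simulation sketch of a diffusion process with a linearly collapsing bound. It is not the fitting code used in the study: the function name, the Euler time step, unit noise, the symmetric bounds around zero, the 3 s cut-off and the single drift rate per call (the fitted model has separate drift rates for left and right targets) are all assumptions made for illustration.

```python
import numpy as np


def simulate_ddm(drift, bound, start, ndt, urgency,
                 n_trials=2000, dt=0.001, noise_sd=1.0, max_t=3.0, seed=0):
    """Simulate a two-choice diffusion model with a linearly collapsing bound.

    Bounds sit at +b(t) and -b(t), with b(t) = max(bound - urgency * t, 0).
    Returns response times in seconds (NaN if no bound was reached by max_t)
    and choices (+1 upper bound, -1 lower bound, 0 no response).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(max_t / dt))
    x = np.full(n_trials, start, dtype=float)   # accumulated evidence
    rt = np.full(n_trials, np.nan)
    choice = np.zeros(n_trials, dtype=int)
    active = np.ones(n_trials, dtype=bool)      # trials still accumulating
    for step in range(1, n_steps + 1):
        n_active = int(active.sum())
        if n_active == 0:
            break
        t = step * dt
        b = max(bound - urgency * t, 0.0)       # urgency: linear bound collapse
        # Euler-Maruyama update: drift term plus Gaussian diffusion noise.
        x[active] += drift * dt + noise_sd * np.sqrt(dt) * rng.standard_normal(n_active)
        upper = active & (x >= b)
        lower = active & (x <= -b)
        rt[upper | lower] = t + ndt             # add non-decision time
        choice[upper] = 1
        choice[lower] = -1
        active &= ~(upper | lower)
    return rt, choice
```

In a sketch like this, raising `drift` while holding `bound` fixed produces faster and more accurate simulated responses, which is the pattern a drift-rate account of learning predicts.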

Electrophysiological findings
To investigate the effect of training on sensory evidence encoding, a difference SSVEP was calculated by subtracting the non-target SSVEP (diminishing contrast) from the target SSVEP (increasing contrast) on each trial. While training did not affect the overall signal-to-noise ratio of the SSVEP, it did enhance the representation of the difference in contrast between the two grating stimuli (F(4, 72) = 2.92, p<0.05; Figure 3a). In line with this improvement in the representation of sensory evidence, there was also a corresponding improvement in the quality of the cumulative sensory evidence, reflected in the increased build-up rate of the response-aligned CPP with training (F(4, 72) = 2.82, p<0.05; Figure 3f). Furthermore, the peak amplitude of the CPP at response was larger for later training sessions (F(2.83, 51.02) = 3.46, p<0.05), suggesting that participants based their decisions on a greater quantity of cumulative sensory evidence later in training. However, these enhancements in cumulative sensory evidence did not translate into corresponding improvements at the effector-selective level of decision formation, as there was no corresponding increase in either the magnitude (F(4, 72) = 0.14, p=0.97) or slope (F(2.39, 43.09) = 0.74, p=0.57) of pre-response mu/beta lateralization (Figure 3I).
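The difference SSVEP can be illustrated with a simple spectral read-out at the two tagging frequencies (20 Hz for the left-tilted grating, 25 Hz for the right-tilted grating). The sketch below is a simplified stand-in for the actual measurement: the function names, the single-channel synthetic signal, the 500 Hz sampling rate and the 1 s analysis window are assumptions.

```python
import numpy as np


def ssvep_amplitude(signal, fs, freq_hz):
    """Single-sided FFT amplitude at the bin closest to freq_hz."""
    n = signal.size
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


def difference_ssvep(signal, fs, target_hz, nontarget_hz):
    """Target SSVEP amplitude minus non-target SSVEP amplitude."""
    return (ssvep_amplitude(signal, fs, target_hz)
            - ssvep_amplitude(signal, fs, nontarget_hz))


# Synthetic single trial with a right-tilted (25 Hz) target.
# ASSUMPTION: 500 Hz sampling and a 1 s window, giving 1 Hz resolution so
# that both tagging frequencies fall exactly on FFT bins.
fs = 500.0
t = np.arange(500) / fs
eeg = 3.0 * np.sin(2 * np.pi * 25.0 * t) + 1.0 * np.sin(2 * np.pi * 20.0 * t)

d_ssvep = difference_ssvep(eeg, fs, target_hz=25.0, nontarget_hz=20.0)
```

A larger positive value indicates a stronger neural representation of the target grating relative to the non-target grating on that trial.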

Preliminary modelling findings
A standard drift diffusion model was fit to the pooled correct and error response time distributions for each of the five training sessions. Consistent with the increased build-up rate of the CPP as a function of training session, drift rate increased as a result of perceptual learning, indicating that the quality of the evidence being integrated during decision formation improved with training. However, the diffusion analysis revealed no change in the decision boundary parameter following training, which is at odds with the increase in the peak amplitude of the CPP for later training sessions. Future research will attempt to reconcile our neurophysiological findings with the modelling results using a neurally-informed modelling approach (e.g. McGovern et al., 2018).
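For readers unfamiliar with the G² statistic minimized during fitting, the sketch below shows the standard likelihood-ratio form, G² = 2 Σ O ln(O / E), computed over response-time bins. The binning of correct and error responses into counts, and the example numbers, are assumptions for illustration; the text does not specify how the distributions were binned.

```python
import numpy as np


def g_squared(observed_counts, predicted_probs):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)).

    observed_counts: trials observed in each bin (e.g. RT bins for correct
        and error responses; the binning scheme is an assumption here).
    predicted_probs: model-predicted probability of each bin (sums to 1).
    """
    obs = np.asarray(observed_counts, dtype=float)
    pred = np.asarray(predicted_probs, dtype=float)
    expected = obs.sum() * pred        # expected counts under the model
    nonzero = obs > 0                  # empty bins contribute 0 (0 * ln 0 := 0)
    return 2.0 * np.sum(obs[nonzero] * np.log(obs[nonzero] / expected[nonzero]))


# Hypothetical counts: the model predicts the observed proportions exactly.
perfect_fit = g_squared([10, 20, 40, 20, 10], [0.1, 0.2, 0.4, 0.2, 0.1])
# Hypothetical counts: the model misallocates probability across two bins.
poor_fit = g_squared([50, 50], [0.25, 0.75])
```

G² is zero when predicted and observed proportions match and grows as they diverge, so a minimization routine such as SIMPLEX searches for the parameter values that make it smallest.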

Discussion
Our results shed light on the multifaceted nature of the mechanisms underlying learning in perceptual decision-making. Two key mechanisms of learning were identified. First, training boosted the representation of the sensory evidence relevant to the decision that observers were trained to make. This is reflected in the increased amplitude of the difference SSVEP following training. As a result of this increase in the quality of the sensory evidence representation, the rate of sensory evidence accumulation correspondingly increased, as reflected in the increasing slope of the CPP with training. Furthermore, participants became faster in their responses and missed fewer targets, leading to an overall increase in their performance. Second, training also increased the peak amplitude of the CPP prior to response, suggesting that participants sought to accumulate more evidence before committing to a decision in later training sessions. This change in decision policy led to an increase in task accuracy, but came at the expense of response speed. This may explain the observation that, despite the increasing quality of sensory evidence accumulation throughout training, response times plateaued after session three rather than becoming progressively faster. Together, these results suggest that learning is mediated both by improvements in the efficiency of perceptual processing and by strategic adjustments in decision policy.