A signature of neural coding at human perceptual limits

Simple visual features, such as orientation, are thought to be represented in the spiking of visual neurons using population codes. I show that optimal decoding of such activity predicts characteristic deviations from the normal distribution of errors at low gains. Examining human perception of orientation stimuli, I show that these predicted deviations are present at near-threshold levels of contrast. The findings may provide a neural-level explanation for the appearance of a threshold in perceptual awareness whereby stimuli are categorized as seen or unseen. As well as varying in error magnitude, perceptual judgments differ in certainty about what was observed. I demonstrate that variations in the total spiking activity of a neural population can account for the empirical relationship between subjective confidence and precision. These results establish population coding and decoding as the neural basis of perception and perceptual confidence.


Introduction
Population coding describes a method by which information can be encoded in, and recovered from, the combined activity of a pool of neurons (Georgopoulos et al., 1982; Pouget, Dayan, & Zemel, 2000; Salinas & Abbott, 1994; Seung & Sompolinsky, 1993; Vogels, 1990). For example, in area V1, simple cells' spiking activity contains information about the orientation of visual stimuli. Each neuron's mean firing rate is described by an approximately bell-shaped tuning curve with a maximum at the cell's "preferred" orientation. This orientation varies from neuron to neuron, and the population as a whole encodes information about every possible orientation. Simple neural mechanisms have been proposed that can decode population spiking activity and recover the information about the stimulus (Deneve, Latham, & Pouget, 1999; Jazayeri & Movshon, 2006), although these theoretical mechanisms have not yet been validated by neurophysiology. Irrespective of mechanism, the decoded values are necessarily noisy approximations to the stimulus due to the stochastic nature of spiking events.
The principle that internal noise is responsible for errors in detection or discrimination of visual patterns has a long history in vision science (e.g., Pelli, 1985), and many models have been proposed to account for behavioral performance on such tasks, incorporating varying degrees of biological detail from simple linear filters to spiking neurons (e.g., Bradley, Abrams, & Geisler, 2014;Foley et al., 2007;Goris et al., 2013;Itti, Koch, & Braun, 2000;Watson & Ahumada, 2005). The present study diverges from previous work by examining predictions of a population coding model for the distribution of errors in an estimation task. Variability in perception of visual stimuli is typically assumed to follow a normal distribution (Green & Swets, 1966;Swets, Tanner, & Birdsall, 1961); the normal is a central limit distribution, a distribution to which values converge when many small influences are summed together, and for this reason, it is ubiquitous in biology. However, here I show that the mathematics of population coding puts it in conflict with the assumption of normality. Specifically, characteristic deviations from the normal distribution are predicted at low gains, i.e., when spiking activity is reduced. I confirm the presence of these deviations in human estimation of low-contrast stimuli, demonstrating a causal connection between population coding and perception.
As well as explaining errors, the neural model predicts variation in the certainty associated with each judgment, i.e., some estimates are more reliable than others. I show that observers have access to reliability information and use it to assign confidence to their perceptions. Previous attempts to explain the accuracy of confidence judgments have proposed a relationship to response time (Audley, 1960) or to the balance of accumulated evidence favoring one response over another (Smith & Vickers, 1988;Vickers & Packer, 1982). Here, I show that the sum of spiking activity in the population encoding a stimulus could provide a plausible neural basis for confidence judgments.

Experimental procedures

Experiment
Eight participants (one male, seven females, aged 22-41 years) participated in the study after giving informed consent in accordance with the Declaration of Helsinki. All participants reported normal color vision and had normal or corrected-to-normal visual acuity. Stimuli were presented on a 21-in. linearized CRT monitor with a refresh rate of 130 Hz. The monitor was fitted with a neutral density filter to decrease the luminance range to the level of human detection thresholds. Participants sat with their head supported by a forehead and chin rest and viewed the monitor at a distance of 60 cm.
Stimuli consisted of Gabor patches of varying contrast and orientation (wavelength of sinusoid, 0.75° of visual angle; SD of Gaussian envelope, 0.75°) presented at display center on a gray background. Stimuli were presented within an annulus (white, radius 4°), which was always present on the display.
Detection thresholds were obtained prior to the main experiment using an adaptive estimation method. In each trial (160 in total), a Gabor was presented for 100 ms randomly at one of two time points, 1 s apart, identified by auditory cues; participants reported at which of the two time points the Gabor was present. Detection threshold was defined as the Gabor contrast at which participants performed at 75% correct, estimated by fitting a sigmoid function to the contrast-response data. Gabor contrast was selected in each trial to maximize the information available for this estimation (Psi method; Kontsevich & Tyler, 1999).
In the main experiment, each trial began with presentation of a randomly oriented Gabor patch for 100 ms and a simultaneous auditory tone. The contrast of the Gabor was chosen at random from 50%, 100%, 200%, or 400% of the previously obtained detection threshold. After 1 s, a randomly oriented bar stimulus (white, radius 5°, width 0.1°, central 6° omitted) was overlaid on the annulus; participants adjusted the bar orientation to match the orientation of the Gabor patch, using a computer mouse. They then indicated their confidence in their judgment by clicking on one of a set of buttons labeled 0%, 25%, 50%, 75%, or 100%. Participants completed between 280 and 480 trials.

Analysis
Orientations were analyzed and are reported with respect to the circular parameter space of possible values, i.e., the space of possible orientations (−90°, 90°) was mapped onto the circular space (−π, π) radians. Error for each trial was calculated as the angular deviation between the orientation reported by the participant and the true orientation. Central tendency was assessed using the V statistic for nonuniformity of circular data. Recall precision was defined as $1/\sigma^2$, where $\sigma = \sqrt{-2 \log R}$ is the circular standard deviation as defined by Fisher (1995), and $R$ is the resultant length. Hypotheses regarding the effects of experimental parameters (contrast, subjective confidence rating) were tested with t tests.
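As a minimal sketch of the precision measure defined above, the following Python code computes the circular standard deviation and precision from a set of errors. The function names, the example responses, and the doubling of orientation differences (one way to map the 180° orientation space onto the full circle) are illustrative assumptions, not the analysis code used in the study; $R$ is taken to be the mean resultant length.

```python
import numpy as np

def orientation_error(reported_deg, true_deg):
    """Signed error in radians on (-pi, pi].

    Orientations repeat every 180 degrees, so the difference is doubled
    to map the orientation space (-90, 90) degrees onto the full circle.
    """
    diff = 2.0 * np.deg2rad(np.asarray(reported_deg, dtype=float)
                            - np.asarray(true_deg, dtype=float))
    return np.angle(np.exp(1j * diff))

def resultant_length(errors):
    """Mean resultant length R of a sample of angles (radians)."""
    return np.abs(np.mean(np.exp(1j * np.asarray(errors, dtype=float))))

def circular_sd(errors):
    """Circular standard deviation sqrt(-2 log R)."""
    return np.sqrt(-2.0 * np.log(resultant_length(errors)))

def precision(errors):
    """Precision defined as 1 / (circular SD)^2."""
    return 1.0 / circular_sd(errors) ** 2

# Made-up responses and targets (degrees), for illustration only.
reported = [12.0, -80.0, 47.0, 88.0]
true = [10.0, -85.0, 45.0, -87.0]
errors = orientation_error(reported, true)
print(circular_sd(errors), precision(errors))
```

Note that a response of 88° to a target of −87° counts as a 5° orientation error, because the two orientations lie on either side of the vertical wrap-around.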

Population coding model
I studied encoding and decoding in a population of $M$ idealized neurons with orientation tuning and contrast sensitivity. The average response of the $i$th neuron to visual input was defined as (Albrecht & Hamilton, 1982; Carandini & Heeger, 2012; Heeger, 1992)

$$r_i(\theta, c) = \frac{c^{\alpha}}{\sigma^{\alpha} + c^{\alpha}}\, f_i(\theta), \tag{1}$$

where $\theta$ is the stimulus orientation, $c$ is the stimulus contrast, and $f_i(\theta)$ is a Von Mises tuning function, centered on $\varphi_i$, the neuron's preferred orientation,

$$f_i(\theta) = \frac{\gamma}{M}\, \frac{\exp\left(\kappa \cos(\theta - \varphi_i)\right)}{I_0(\kappa)}, \tag{2}$$

where $\gamma$ is the population gain and $I_0$ is the modified Bessel function of the first kind of order zero. Preferred orientations were evenly distributed throughout the range of possible orientations. Spiking activity was modeled as a homogeneous Poisson process such that the probability of a neuron generating $n$ spikes in time $T$ was

$$\Pr(n_i = n) = \frac{(r_i T)^{n}}{n!} \exp(-r_i T). \tag{3}$$

Decoding of orientation information from the population's spiking activity, $\mathbf{n}$, was based on maximum a posteriori (MAP) decoding. Assuming a uniform prior, this is equivalent to maximizing the likelihood

$$\theta_{\mathrm{MAP}} = \arg\max_{\theta'} \Pr(\mathbf{n} \mid \theta'). \tag{4}$$

If two or more orientations tied for the maximum, the decoded orientation was sampled at random from the tied values. The output of the model was given by $\hat{\theta} = \theta_{\mathrm{MAP}} \oplus \beta$, where $\beta$ is a response bias term, and $\oplus$ indicates addition on the circle. Decoding time $T$ was fixed at 100 ms. I considered the limit $M \to \infty$. The model therefore has five free parameters: $\sigma$ and $\alpha$, constants of the contrast response function; $\gamma$, the population gain; $\kappa$, the tuning curve width; and $\beta$, the response bias.
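The encoding and decoding steps described above can be sketched as a small simulation. This is a hedged illustration rather than the study's implementation: the function names and all parameter values are arbitrary assumptions, the tuning function is normalized so that the summed population rate equals the gain, and the decoder relies on the fact that, with densely and evenly spaced preferred orientations, the likelihood maximum lies approximately in the direction of the vector sum of preferred orientations weighted by spike count.

```python
import numpy as np

def mean_rates(theta, contrast, phi, gamma, kappa, sigma, alpha):
    """Mean firing rate (Hz) of each neuron: contrast response x Von Mises tuning."""
    contrast_response = contrast**alpha / (sigma**alpha + contrast**alpha)
    tuning = (gamma / phi.size) * np.exp(kappa * np.cos(theta - phi)) / np.i0(kappa)
    return contrast_response * tuning

def decode(counts, phi, rng):
    """Approximate maximum-likelihood orientation from spike counts (uniform prior)."""
    if counts.sum() == 0:
        # No spikes: all orientations tie, so the decoder picks one at random.
        return rng.uniform(-np.pi, np.pi)
    # Direction of the spike-weighted vector sum of preferred orientations.
    # (Exact cancellation of the sum, another tie, is ignored in this sketch.)
    return np.angle(np.sum(counts * np.exp(1j * phi)))

def simulate_errors(contrast, n_trials=2000, M=100, T=0.1, gamma=100.0,
                    kappa=2.0, sigma=0.1, alpha=2.0, beta=0.0, seed=0):
    """Response errors (radians) for randomly oriented stimuli at one contrast."""
    rng = np.random.default_rng(seed)
    phi = np.linspace(-np.pi, np.pi, M, endpoint=False)  # preferred orientations
    errors = np.empty(n_trials)
    for k in range(n_trials):
        theta = rng.uniform(-np.pi, np.pi)
        rates = mean_rates(theta, contrast, phi, gamma, kappa, sigma, alpha)
        counts = rng.poisson(T * rates)                   # Poisson spike counts
        estimate = decode(counts, phi, rng) + beta        # add response bias
        errors[k] = np.angle(np.exp(1j * (estimate - theta)))  # wrap to (-pi, pi]
    return errors

def resultant_length(errors):
    return np.abs(np.mean(np.exp(1j * errors)))

high = simulate_errors(contrast=1.0)   # well above the semi-saturation contrast
low = simulate_errors(contrast=0.02)   # well below it
print(resultant_length(high), resultant_length(low))
```

With these illustrative values the high-contrast errors cluster tightly around zero, whereas the low-contrast errors are widely dispersed because many trials produce few or no spikes.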

Fitting the model
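Although the equations above provide a complete description of the model, further analysis is needed to obtain predictions of the model and fit them to data. From Equation 4,

$$\theta_{\mathrm{MAP}} = \arg\max_{\theta'} \left[ \sum_{i=1}^{M} n_i \log f_i(\theta') - T \frac{c^{\alpha}}{\sigma^{\alpha} + c^{\alpha}} \sum_{i=1}^{M} f_i(\theta') \right].$$

Assuming dense uniform coverage, the second term is constant, so

$$\theta_{\mathrm{MAP}} = \arg\max_{\theta'} \sum_{i=1}^{M} n_i \log f_i(\theta') = \arg\max_{\theta'} \sum_{i=1}^{M} n_i \cos(\theta' - \varphi_i).$$

Consider the combined activity of the population in terms of the preferred stimulus corresponding to each spike: $\varphi_{(1)}, \varphi_{(2)}, \ldots, \varphi_{(m)}$, where the notation $\varphi_{(i)}$ indicates the preferred orientation of the neuron that generated the $i$th of $m$ spikes. The error in the decoded orientation, $\Delta\theta_{\mathrm{MAP}} = \theta_{\mathrm{MAP}} \ominus \theta$, can then be written as

$$\Delta\theta_{\mathrm{MAP}} = \arg\max_{\theta'} \sum_{i=1}^{m} \cos(\theta' - \epsilon_i),$$

where $\epsilon_i = \varphi_{(i)} \ominus \theta$. Setting the derivative of the term to be maximized to zero, we obtain

$$\Delta\theta_{\mathrm{MAP}} = \operatorname{atan2}\left( \sum_{i=1}^{m} \sin \epsilon_i,\; \sum_{i=1}^{m} \cos \epsilon_i \right). \tag{11}$$

Because spikes are generated by independent Poisson processes, every spike event is conditionally independent of every other given the true stimulus orientation. Approximating the uniformly spaced discrete distribution of preferred orientations of $M$ neurons by a continuous uniform distribution, the probability density of each $\epsilon_i$ is given by

$$p(\epsilon_i) = \frac{\exp(\kappa \cos \epsilon_i)}{2\pi I_0(\kappa)}. \tag{12}$$

So the error in decoded orientation $\Delta\theta_{\mathrm{MAP}}$ is the resultant angle (Equation 11) of a Von Mises (circular normal) random walk (Equation 12) of $m$ steps. It follows that the error for a given resultant length $r$ is Von Mises distributed (Mardia & Jupp, 2009):

$$p(\Delta\theta_{\mathrm{MAP}} \mid r) = \frac{\exp(\kappa r \cos \Delta\theta_{\mathrm{MAP}})}{2\pi I_0(\kappa r)}, \tag{13}$$

where the distribution of $r$ for $m$ steps is given by

$$p(r \mid m) = \frac{I_0(\kappa r)}{I_0(\kappa)^{m}}\, r\,\psi_m(r), \tag{14}$$

where $r\,\psi_m(r)$ is the probability density function for resultant length $r$ of a uniform random walk of $m$ steps. The distribution of $m$, the total spike count during the decoding interval $T$, being a sum of $M$ independent Poisson distributions, is itself Poisson:

$$\Pr(m) = \frac{\xi^{m}}{m!} \exp(-\xi), \tag{15}$$

where $\xi$ is the expected total spike count

$$\xi = \gamma T \frac{c^{\alpha}}{\sigma^{\alpha} + c^{\alpha}}. \tag{16}$$

Equations 13, 14, and 15 together provide a means of obtaining the distribution of $\Delta\theta_{\mathrm{MAP}}$ and hence of the response error $\Delta\theta = \hat{\theta} \ominus \theta = \Delta\theta_{\mathrm{MAP}} \oplus \beta$, for any values of the free parameters, $\sigma$, $\alpha$, $\gamma$, $\kappa$, and $\beta$. For $m \le 100$, the density $\psi_m(r)$ was approximated by Monte Carlo simulation, discretizing over $10^3$ bins.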
For larger $m$, a Gaussian approximation to Equation 14 was used (Mardia & Jupp, 2009). These equations were fit to empirical response data using the Nelder-Mead simplex method (fminsearch in MATLAB). Note that, as a mixture of normal distributions of different widths, the distribution of error is, in general, not normally distributed.
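The random-walk description of decoding error (Equations 11, 12, and 15) also suggests a direct way to draw samples from the predicted error distribution without simulating individual neurons. The sketch below does this under the stated large-population assumptions; the function names, the tail-fraction summary, and the parameter values are illustrative choices, and the code is a sampling shortcut rather than the fitting procedure used in the study.

```python
import numpy as np

def sample_decoding_errors(xi, kappa, n_samples, seed=0):
    """Sample decoding errors as the resultant angle of a Von Mises random walk.

    xi    : expected total spike count in the decoding window
    kappa : tuning width parameter (concentration of each step)
    """
    rng = np.random.default_rng(seed)
    m = rng.poisson(xi, n_samples)                   # spike count on each trial
    # Trials with no spikes keep a uniformly random estimate (a guess).
    errors = rng.uniform(-np.pi, np.pi, n_samples)
    for count in np.unique(m[m > 0]):
        idx = np.flatnonzero(m == count)
        steps = rng.vonmises(0.0, kappa, size=(idx.size, count))
        errors[idx] = np.angle(np.exp(1j * steps).sum(axis=1))
    return errors

def resultant_length(errors):
    return np.abs(np.mean(np.exp(1j * errors)))

def tail_fraction(errors, cutoff=np.pi / 2):
    """Proportion of errors larger in magnitude than the cutoff."""
    return np.mean(np.abs(errors) > cutoff)

for xi in (1.0, 5.0, 50.0):
    e = sample_decoding_errors(xi, kappa=2.4, n_samples=20000)
    print(xi, resultant_length(e), tail_fraction(e))
```

Because each trial's error is Von Mises with a concentration that depends on that trial's resultant length, pooling across trials yields the mixture of widths that produces long tails at low expected spike counts.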

Simulations
To examine predictions of the population coding model in more detail, I performed Monte Carlo simulations ($M = 100$ neurons, $10^5$ repetitions per subject and contrast) using parameters obtained by fitting the model to the experimental data. Note that previous work (Bays, 2014) has shown 100 neurons to be sufficient to approximate the large population limit $M \to \infty$; simulating larger numbers of neurons would not have changed the results. Simulated trials were split into two equal bins, according to either the precision of the posterior distribution $p(\theta \mid \mathbf{n})$ or the total spike count $\sum_{i=1}^{M} n_i$, and precision of simulated responses was estimated separately for each bin.
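The median-split analysis can be illustrated with a short simulation. This is a sketch under assumptions: it uses the large-population random-walk formulation instead of simulating 100 neurons, it takes the posterior to be Von Mises with concentration equal to the tuning parameter multiplied by the resultant length of the spikes' preferred orientations (which follows from the likelihood under dense uniform coverage and a uniform prior), and the function names and parameter values are hypothetical. Since a median split depends only on rank order, any monotonic measure of posterior precision gives the same bins.

```python
import numpy as np

def simulate_trials(xi, kappa, n_trials, seed=0):
    """Return decoding errors, total spike counts, and posterior concentrations."""
    rng = np.random.default_rng(seed)
    m = rng.poisson(xi, n_trials)
    errors = rng.uniform(-np.pi, np.pi, n_trials)   # guesses on zero-spike trials
    concentration = np.zeros(n_trials)              # flat posterior when m == 0
    for count in np.unique(m[m > 0]):
        idx = np.flatnonzero(m == count)
        steps = rng.vonmises(0.0, kappa, size=(idx.size, count))
        resultant = np.exp(1j * steps).sum(axis=1)
        errors[idx] = np.angle(resultant)
        concentration[idx] = kappa * np.abs(resultant)
    return errors, m, concentration

def precision(errors):
    """1 / (circular SD)^2, with circular SD = sqrt(-2 log R)."""
    R = np.abs(np.mean(np.exp(1j * errors)))
    return 1.0 / (-2.0 * np.log(R))

def median_split_precision(errors, key):
    """Precision of the lower and upper halves of trials ranked by `key`.

    Ties (common for integer spike counts) are broken by trial order.
    """
    order = np.argsort(key, kind="stable")
    half = order.size // 2
    return precision(errors[order[:half]]), precision(errors[order[half:]])

errors, spikes, concentration = simulate_trials(xi=5.0, kappa=2.4, n_trials=20000)
low_count, high_count = median_split_precision(errors, spikes)
low_post, high_post = median_split_precision(errors, concentration)
print(low_count, high_count, low_post, high_post)
```

In this sketch both splitting variables separate more precise from less precise trials, consistent with the idea that total spike count can serve as a proxy for posterior precision.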

Modeling detection threshold
I modeled the detection task as follows: In each trial, there were two decoding intervals of length 100 ms, corresponding to the two time points at which a stimulus could be presented. A response was generated according to which interval contained the most spikes. There is no baseline activity in the model neurons, so the no-stimulus epoch always contained zero spikes; therefore errors occurred only when the stimulus epoch also had no activity, and then at the guessing rate of 50%. From Equation 15, the probability of generating zero spikes in interval $T$ in response to a stimulus of contrast $c$ is

$$\Pr(m = 0) = \exp(-\xi) = \exp\left( -\gamma T \frac{c^{\alpha}}{\sigma^{\alpha} + c^{\alpha}} \right). \tag{18}$$

So the threshold contrast at which responses are 75% correct is given by

$$c_{75\%} = \sigma \left( \frac{\log 2}{\gamma T - \log 2} \right)^{1/\alpha}. \tag{19}$$

Threshold model
A threshold model of perceptual judgments would suggest that the stimulus in each trial is either seen, with probability $p$, or not seen, with probability $(1 - p)$, where $p$ depends on stimulus contrast. Seen stimuli are reported with circular normal (Von Mises) distributed error with SD $\sigma_{\mathrm{seen}}$ and bias $\beta$. When the stimulus is not seen, the response is random (i.e., drawn from a uniform distribution). The result is a mixture distribution with density

$$P(\Delta\theta) = p\, \mathrm{VM}(\Delta\theta, \beta, \sigma_{\mathrm{seen}}) + (1 - p) \frac{1}{2\pi}, \tag{20}$$

where $\mathrm{VM}(\theta, \mu, \sigma)$ is the Von Mises distribution evaluated at $\theta$ with mean $\mu$ and SD $\sigma$. This resulted in a model with six free parameters: $\sigma_{\mathrm{seen}}$, $\beta$, $p_{50\%}$, $p_{100\%}$, $p_{200\%}$, and $p_{400\%}$. Models were compared using the Akaike information criterion with finite data correction (AICc) and Bayesian information criterion (BIC).
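The detection model described above can be worked through numerically. Because errors arise only on zero-spike trials, and are then guesses, the proportion correct is $1 - \tfrac{1}{2}\exp(-\xi)$, where $\xi = \gamma T c^{\alpha} / (\sigma^{\alpha} + c^{\alpha})$ is the expected spike count; setting this to 0.75 gives $\xi = \log 2$, which can be solved for contrast. The sketch below implements this; the function names and parameter values are illustrative assumptions, not fitted values.

```python
import math

def expected_spikes(c, gamma, T, sigma, alpha):
    """Expected total spike count in the decoding window at contrast c."""
    return gamma * T * c**alpha / (sigma**alpha + c**alpha)

def prob_correct(c, gamma, T, sigma, alpha):
    """Two-interval detection with no baseline activity.

    The observer is wrong only when the stimulus interval has zero spikes,
    and then only on half of those trials (guessing).
    """
    return 1.0 - 0.5 * math.exp(-expected_spikes(c, gamma, T, sigma, alpha))

def threshold_contrast(gamma, T, sigma, alpha):
    """Contrast giving 75% correct, from solving expected_spikes = log 2."""
    ln2 = math.log(2.0)
    if gamma * T <= ln2:
        raise ValueError("gain too low: 75% correct is never reached")
    return sigma * (ln2 / (gamma * T - ln2)) ** (1.0 / alpha)

# Illustrative parameter values (not taken from the fits).
params = dict(gamma=100.0, T=0.1, sigma=0.1, alpha=3.0)
c_thresh = threshold_contrast(**params)
print(c_thresh, prob_correct(c_thresh, **params))
```

The guard on the gain reflects that the expected spike count saturates at $\gamma T$, so a population whose maximum expected count is below $\log 2$ can never reach 75% correct under this observer model.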

Two-stage model
I considered a two-stage model in which the stimulus is first represented with circular normal error before being encoded in the neural population. This could correspond to the case in which the non-normality arises subsequent to initial perceptual representation, for example, in working memory. The resulting decoded stimulus estimates are distributed as the convolution of a circular normal with the population coding error distribution obtained above (Equations 13-15):

$$P(\Delta\theta) = \int_{-\pi}^{\pi} \mathrm{VM}(x, 0, \sigma)\, P_{\mathrm{PC}}(\Delta\theta \ominus x)\, dx, \tag{21}$$

where $P_{\mathrm{PC}}$ denotes the population coding error distribution. The effect of changing contrast was reflected in the width $\sigma$ of the initial normal representation. This model therefore had seven free parameters: $\sigma_{50\%}$, $\sigma_{100\%}$, $\sigma_{200\%}$, $\sigma_{400\%}$, $\beta$, $\kappa$, and $\xi$.
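The convolution of two circular densities described above can be evaluated numerically on a grid using the fast Fourier transform. In the sketch below, the function names are hypothetical, the Von Mises density is parameterized by its concentration rather than its SD, and the long-tailed "population" density is only a placeholder that would in practice be replaced by the error density obtained from Equations 13-15.

```python
import numpy as np

def von_mises_pdf(x, mu, kappa):
    """Von Mises density with mean mu and concentration kappa."""
    return np.exp(kappa * np.cos(x - mu)) / (2.0 * np.pi * np.i0(kappa))

def circular_convolve(p, q, dx):
    """Density of the circular sum of two independent angles.

    p and q must be sampled on the same uniform grid x_k = k * dx covering
    [0, 2*pi), so that index addition modulo N matches angle addition.
    """
    return np.real(np.fft.ifft(np.fft.fft(p) * np.fft.fft(q))) * dx

N = 720
x = np.arange(N) * (2.0 * np.pi / N)   # grid on [0, 2*pi)
dx = x[1] - x[0]

# Stage 1: circular normal perceptual error (concentration chosen arbitrarily).
perceptual = von_mises_pdf(x, 0.0, 8.0)
# Stage 2: placeholder long-tailed density standing in for the population code.
population = 0.7 * von_mises_pdf(x, 0.0, 5.0) + 0.3 / (2.0 * np.pi)

two_stage = circular_convolve(perceptual, population, dx)

def first_moment(density):
    """First trigonometric moment of a gridded density."""
    return np.sum(density * np.exp(1j * x)) * dx

print(np.sum(two_stage) * dx, abs(first_moment(two_stage)))
```

A useful check is that trigonometric moments multiply under circular convolution, so the resultant length of the two-stage density equals the product of the resultant lengths of its two components.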

Background activity
I considered a variant of the population coding model in which all neurons have background (baseline) activity, $\eta$. In this case, the response of the $i$th neuron is given by

$$r_i(\theta, c) = \eta + \frac{c^{\alpha}}{\sigma^{\alpha} + c^{\alpha}}\, f_i(\theta), \tag{22}$$

and Equations 2-4 hold as before. The model of detection is the same as above except that the no-stimulus epoch contained spikes generated at the baseline rate $\eta$, and activity in the stimulus epoch was given by Equation 22.

Results
I examined a model of population coding based on responses of visual cortical neurons to simple oriented stimuli of varying contrast (Figure 1). Mean firing rate of each neuron was determined by the product of its contrast response (Figure 1a), described by a sigmoid relationship between log contrast and firing rate (Albrecht & Hamilton, 1982; Carandini & Heeger, 2012; Heeger, 1992), and its orientation tuning (Figure 1b), described by a bell-shaped tuning function (Pouget et al., 2000). Spikes were generated probabilistically according to a Poisson process (Figure 1c). Estimation of orientation was modeled as MAP decoding over a fixed temporal window. Because of the noise in spiking activity, the decoded orientation was imprecise with respect to the true stimulus value.
Modeling results showed that the distribution of errors in the decoded orientation estimate varied with gain and hence with input contrast (Figure 1d). For high-contrast stimuli, the decoded value was distributed approximately as a circular normal (Von Mises) centered on the true orientation (e.g., blue curve). As contrast decreased, the distribution became broader and also deviated substantially from the circular normal distribution (long tails, e.g., magenta curve). As the contrast fell to zero, the distribution of errors became flatter, approaching the uniform distribution (red line).

Experimental confirmation
To examine whether non-normality of response errors is a feature of human perceptual judgments, observers were presented with randomly oriented Gabor patches of varying contrast at and around each observer's detection threshold (defined as the contrast at which two-alternative forced choice judgments were 75% correct). They were asked to reproduce the orientation they had seen by rotating a bar stimulus. Figure 2a (black symbols) plots the distribution of response errors for different stimulus contrasts (labeled as percentage of detection threshold). Response precision declined with decreasing contrast, but performance was significantly above chance at every contrast level tested, V > 6.9; t(7) > 2.6, p < 0.032. Significant deviations from circular normality were evident as long tails in the error distribution at detection threshold (100%): circular kurtosis of 2.7 greater than circular normal with matched variance; t(7) = 2.8, p = 0.026; also in eight out of eight subjects considered individually. Figure 2b plots the discrepancy between the error distributions generated by observers and a circular normal distribution with the same variance.
Red curves in Figure 2 show the fit of the population coding model (maximum likelihood [ML] parameters: response bias $\beta$ = −0.050 rad ± 0.028 rad, tuning width $\kappa$ = 2.40 ± 0.58, population gain $\gamma$ = 145 Hz ± 92 Hz, contrast response parameters $\alpha$ = 48.2 ± 16.6, $\sigma$ = 0.096 ± 0.0081; goodness of fit: $r^2$ = 0.64 ± 0.14 SD). The model reproduced both the changes in distribution width with contrast and, importantly, the non-normality of errors around detection threshold. Figure 3 plots response precision as a function of contrast for experimental data (black symbols) and the fitted model (black line).
In addition to perceptual error, the population coding model also makes predictions about stimulus detection. In a two-alternative forced choice task, as used here to estimate detection threshold, a simple observer model selects whichever epoch contained the most spikes. I estimated the threshold contrast that would result in 75% correct responses under this model based on the ML parameters obtained above. The resulting predictions were statistically indistinguishable from the empirical threshold values: 97% ± 16% of empirical threshold contrast, t(7) = 0.19, p = 0.86.

Other models
I compared the population coding model to a threshold model of perceptual responses (Luce, 1963; Sergent & Dehaene, 2004; Supèr, Spekreijse, & Lamme, 2001), which describes trials as falling into one of two categories: seen and unseen. When the stimulus is seen, responses are distributed normally; when the stimulus is unseen, responses are random. This model generated qualitatively similar predictions to the population coding model although with a tendency to underestimate non-normality at higher contrasts (Figure 4, blue curves; ML parameters: response bias $\beta$ = −0.044 rad ± 0.027 rad, variability $\sigma_{\mathrm{seen}}$ = 0.43 ± 0.044, probability seen $p_{50\%}$ = 0.048 ± 0.016, $p_{100\%}$ = 0.59 ± 0.11, $p_{200\%}$ = 0.89 ± 0.08, $p_{400\%}$ = 0.98 ± 0.013). The threshold model was a poorer fit to the experimental data according to model selection criteria (ΔAICc = 12.6; ΔBIC = 43.5).

[Figure 1 caption, continued] (c) Spikes were generated according to a Poisson process. Estimation of orientation was modeled as MAP decoding of this spiking activity over a fixed time window. (d) Simulations revealed that the distribution of error in the estimated orientation depended on stimulus contrast. At high contrast, errors had an approximately circular normal distribution (e.g., blue curve). As contrast decreased, variability increased, and error distributions deviated from circular normality (long tails, e.g., magenta curve). At the lowest contrasts, errors approximated a uniform distribution (e.g., red curve). Error distributions are normalized by peak probability to best illustrate distribution shape.
Although a standard perceptual task, the orientation reproduction task also has a working memory component as the target stimulus must be held in mind as the participant adjusts the probe bar. One possibility is that the non-normality arises in working memory storage, subsequent to the perceptual representation. To test this, I considered a two-stage model in which the error arising initially in perception is normally distributed with a width determined by the stimulus contrast, and the perceived value is then encoded and decoded according to the population model, introducing non-normality. This model failed to reproduce the non-normality in response distributions, particularly at contrasts around detection threshold (Figure 4, green curves; ML parameters: response bias $\beta$ = −0.062 rad ± 0.023, tuning width $\kappa$ = 17.2 ± 6.1, population activity $\xi$ = 13.5 ± 9.4, normal SD $\sigma_{50\%}$ = 4.0 ± 0.94, $\sigma_{100\%}$ = 1.7 ± 0.74, $\sigma_{200\%}$ = 0.52 ± 0.14, $\sigma_{400\%}$ = 0.48 ± 0.11). The two-stage model was a substantially poorer fit to the experimental data than the population coding model (ΔAICc = 327; ΔBIC = 390).
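For readers unfamiliar with the model selection criteria reported here, the sketch below computes AICc and BIC from a maximized log-likelihood using their standard definitions. The parameter counts (five, six, and seven) are those stated for the three models; the log-likelihood values and trial count are hypothetical, and how the criteria were aggregated across observers is not specified in the text, so the example treats a single observer.

```python
import math

def aicc(log_likelihood, k, n):
    """Akaike information criterion with finite-sample correction.

    k : number of free parameters, n : number of observations (trials).
    """
    aic = 2.0 * k - 2.0 * log_likelihood
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def bic(log_likelihood, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2.0 * log_likelihood

# Hypothetical maximized log-likelihoods for one observer with 400 trials.
n_trials = 400
models = {
    "population coding": (-520.0, 5),
    "threshold": (-523.0, 6),
    "two-stage": (-560.0, 7),
}

scores = {name: (aicc(ll, k, n_trials), bic(ll, k, n_trials))
          for name, (ll, k) in models.items()}
ref_aicc, ref_bic = scores["population coding"]
for name, (a, b) in scores.items():
    # Positive differences favor the population coding model.
    print(name, round(a - ref_aicc, 2), round(b - ref_bic, 2))
```

Lower values indicate a better trade-off between fit and complexity, so a positive difference relative to the population coding model counts as evidence in its favor; BIC penalizes additional parameters more heavily than AICc at this sample size.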
A final possibility is that non-normality arises from anisotropy in orientation perception. It is well established that orientation judgments display small biases away from the cardinal angles (e.g., de Gardelle, Kouider, & Sackur, 2010), an "anti-Bayesian" effect possibly due to efficient coding by the underlying neural populations (Wei & Stocker, 2015). As shown in Figure 5a, some evidence for such biases was obtained in the present study, specifically as response shifts away from the horizontal. Because error distributions are calculated by averaging over different stimulus orientations, such biases could potentially result in non-normal distributions of error overall even if the distribution of error for any given stimulus orientation is normal. To test this, I simulated responses by drawing samples from normal (Von Mises) distributions with the biases and dispersions observed in the data at different stimulus values (15 evenly spaced bins); Figure 5b (blue curve) plots the resulting deviations from normality in the simulated error distribution. The deviations from normality are an order of magnitude smaller than those observed in the data (black data points), demonstrating that anisotropy in orientation perception cannot account for the non-normality of errors that is the focus of this study.

Subjective confidence
As well as varying in the magnitude of error, responses also varied in the subjective confidence, or reliability, observers assigned to them. Green and red symbols in Figure 3 indicate the precision of high and low confidence responses, respectively, based on a median split. Subjective ratings of confidence were significantly correlated with error magnitude for all but the lowest contrast stimuli, indicating that observers had some awareness of the uncertainty in their perception: 50% contrast, $r^2$ = 0.03, t(7) = 1.0, p = 0.34; 100% contrast, $r^2$ = 0.17, t(7) = 4.6, p = 0.003; 200% contrast, $r^2$ = 0.07, t(7) = 3.0, p = 0.020; 400% contrast, $r^2$ = 0.05, t(7) = 3.6, p = 0.009.
In the population coding model, the parameter that most directly corresponds to response reliability is the precision of the posterior distribution. To assess whether knowledge of this feature of neural decoding could underlie confidence judgments, I performed a median split on the posterior precision of simulated data, generated using the ML parameters obtained above. Green and red solid lines in Figure 3 plot the precision of high and low posterior precision trials, respectively. Despite not being fit to the high or low confidence data, this model closely replicated the behavioral results (MSE 0.073 ± 0.023).
A more directly computable parameter of spiking activity correlated with reliability is the total spike count during the decoding window (Bays, 2014; Ma et al., 2006; Pouget, Dayan, & Zemel, 2003). This parameter was strongly correlated with posterior precision ($r^2$ = 0.42). A median split based on total spike count (dashed lines in Figure 3) produced a replication of behavioral results that was indistinguishable from posterior precision, MSE 0.062 ± 0.015, t(7) = 1.3, p = 0.23.

Background activity
The model of population coding presented above assumes that each neuron's spiking activity falls to zero at zero contrast. Here, I consider the case in which all neurons have background (baseline) activity, $\eta$. This model is considerably less analytically tractable than the no-baseline ($\eta = 0$) model, and numerically fitting it to the experimental data is impractical. However, the predictions of the model share all the main characteristics of the no-baseline case. To illustrate the similarity, I considered the case $\eta$ = 1 Hz. Taking as a starting point the ML parameters of the no-baseline model for a representative observer, I used a grid search (10 × 10 parameter space, $10^5$ repetitions, $M = 100$) to seek new values of $\kappa$ and $\gamma$ for which the baseline model approximated the predictions of the no-baseline model. As shown in Figure 6 and consistent with previous results (Bays, 2014), the baseline model generated predictions that were almost indistinguishable from those of the no-baseline model but at higher gain ($\gamma$ = 41.7 Hz, compared to 28.8 Hz in the no-baseline case) and based on broader tuning curves ($\kappa$ = 1.30, compared to 2.12).
A notable feature of the no-baseline case is the presence of simulated trials in which no spikes occur during the decoding window, and the decoder must "guess" a random value. In the no-baseline model, these trials are prevalent at detection threshold and contribute to the non-normality of the error distribution. In contrast, at threshold, the occurrence of such trials in the baseline case $\eta$ = 1 Hz was negligible (p < 0.0001). This demonstrates that guessing is not critical to generating the non-normal distributions of error observed here but is rather an artifact of the simplified neuronal model lacking baseline activity.
The model of detection is the same as above except that now the no-stimulus epoch in general contained spikes, generated at the baseline rate $\eta$. I used Monte Carlo simulation (discretizing contrast into 100 bins; $10^5$ repetitions, $M = 100$) to estimate the threshold contrast, which again closely approximated the empirical threshold (101% of empirical value for the representative observer). Although in the no-baseline case all errors were due to guesses when no spikes occurred during the stimulus epoch, in the baseline model these trials occurred with negligible frequency (p < 0.0001), providing further evidence that guessing is not a critical element of the population coding model.
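A Monte Carlo estimate of detection threshold with baseline activity can be sketched as follows. This is an illustration under assumptions rather than the simulation used in the study: the baseline is taken to add the same rate to every neuron, so that the total population count is Poisson with rate equal to the summed baseline plus the stimulus-driven rate; ties between intervals are resolved by guessing; and the function names, contrast grid, and parameter values are hypothetical.

```python
import numpy as np

def prob_correct(contrast, rng, gamma=100.0, T=0.1, sigma=0.1, alpha=3.0,
                 eta=1.0, M=100, n_rep=20000):
    """Proportion correct when the observer picks the interval with more spikes."""
    gain = contrast**alpha / (sigma**alpha + contrast**alpha)
    # Total population spike counts in the stimulus and no-stimulus intervals.
    stimulus = rng.poisson(T * (M * eta + gamma * gain), n_rep)
    blank = rng.poisson(T * M * eta, n_rep)
    # Equal counts are treated as a guess (50% correct).
    return np.mean(stimulus > blank) + 0.5 * np.mean(stimulus == blank)

def threshold_contrast(contrasts, eta, seed=0, **kwargs):
    """First contrast on the grid at which simulated performance reaches 75%."""
    rng = np.random.default_rng(seed)
    pc = np.array([prob_correct(c, rng, eta=eta, **kwargs) for c in contrasts])
    above = np.flatnonzero(pc >= 0.75)
    if above.size == 0:
        raise ValueError("75% correct is not reached on this contrast grid")
    return contrasts[above[0]]

contrasts = np.linspace(0.005, 0.2, 100)   # contrast discretized into 100 bins
no_baseline = threshold_contrast(contrasts, eta=0.0)
with_baseline = threshold_contrast(contrasts, eta=1.0)
print(no_baseline, with_baseline)
```

With all other parameters held fixed, adding baseline activity raises the simulated threshold, because stimulus-driven spikes must now be distinguished from background spikes. This is in keeping with the observation above that the baseline model matches the no-baseline predictions only at a higher gain.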

Discussion
The present results demonstrate a signature of population coding in the errors made by human observers in perception of near-threshold stimuli. The predictions of the population coding model reproduce the variability and shape of error distributions in the perceived orientation of a stimulus as well as capturing the relationship between subjective confidence and perceptual precision. The model also accurately predicted detection threshold based on responses in the reproduction task.
In a recent study (Bays, 2014), I demonstrated deviations from normality, similar to those observed here, in recall errors on a working memory task when memory load was manipulated. The effect of memory set size on precision was explained by a normalization model, in which total population gain was held constant across changes in the number of items represented. Although, as is typical for perceptual tasks, there was a working memory component to the present study, memory limits do not provide an explanation for the present results as memory load was constant (at one item) across changes in stimulus contrast. Nonetheless, an important alternative hypothesis is that the non-normality observed here arose subsequent to the initial perceptual representation, which itself had normally distributed error (generated by some unknown mechanism). To test this hypothesis, I examined a two-stage model, in which an initial stimulus estimate with normal error was subsequently represented in a population code with attendant non-normal error. This model failed to reproduce the non-normality in the data, presumably because changes in contrast necessarily had their effect at the initial perceptual stage when they could not influence the strength of non-normality. This result strongly supports the view that non-normality is present in the initial perceptual representation and maintained in working memory.
Although the population coding model used in the present study incorporates a number of simplifications of the behavior of real neural populations (homogeneity of tuning curves, no baseline activity, no interneuronal correlations), modeling in the working memory study showed that the signature deviations from normality arise independently of these factors. In particular, the model's behavior was qualitatively unaffected by introducing across-neuron variation in the sharpness of orientation tuning or changing the shape of the tuning function from Von Mises to cosine. These analyses also identified two factors that serve to increase the population gain corresponding to a given level of variability: the presence of spontaneous (baseline) activity and short-range noise correlations. These factors would prove critical to attaining realistic levels of activity in neural populations on the scale of primary visual cortex. However, analysis of this scenario is hampered by the computational impracticality of simulating activity of hundreds of thousands of correlated neurons.
There is the possibility, in the simplified model of population activity presented here, that no spikes are generated during the decoding interval, resulting in a random response; however, real neurons typically have baseline levels of activity that make this situation unlikely even at very low contrasts. Additional analysis (Figure 6) confirmed previous modeling work (Bays, 2014) in showing that identical deviations from normality are observed for populations with baseline activity even though the chance of observing zero spikes is negligible.
The population coding model provided a more parsimonious description of empirical data than a threshold model in which stimuli are categorically either perceived or not perceived (Luce, 1963; Sergent & Dehaene, 2004; Supèr et al., 2001). However, error distributions predicted by the two models were notable mostly for their similarity. Rather than being mutually exclusive models, I suggest that population coding provides a neural-level explanation for the appearance of a threshold in human perception because the long-tailed error distribution observed at low contrasts resembles a mixture of guessing and accurate judgments.
An interesting outcome of the mathematical analysis presented in the Experimental procedures is that the error distributions predicted by the population coding model can be precisely described by an infinite mixture of circular normal distributions. This may provide an explanation for the success of "variable precision" models of working memory (Fougnie, Suchow, & Alvarez, 2012; van den Berg et al., 2012), which attempt to capture recall errors in just such a way, although the proposed distributions over precision in these models do not exactly match that predicted by population coding.
It has long been recognized that our observations are associated with different degrees of certainty, even when the external stimulation that gives rise to the perception is fixed, and further that this certainty is correlated with the magnitude of error in the observation. Clearly, we do not have access to the actual error in our observations, or we could correct for it, but exactly what aspect of the perceptual process our sense of confidence is based on is debated (Insabato et al., 2010;Kepecs et al., 2008;Kiani & Shadlen, 2009;Smith & Vickers, 1988). For a population code, an ideal observer of the neural data would base his or her confidence judgment on the width of the posterior distribution, that is, the probability distribution of the stimulus value conditional on the observed spiking activity. I found that this parameter provided an excellent fit to the empirical relationship between subjective confidence and precision of a judgment.
Although the posterior width is the best theoretical basis for judging certainty, it is not obvious how it could be computed neurally. The sum of all spiking activity during the decoding window (Bays, 2014;Ma et al., 2006;Pouget et al., 2003) was found to be strongly related both to the width of the posterior and to the precision of the judgment: The more spikes available for decoding, the more precise the estimate. The fit to empirical data was indistinguishable from that using posterior precision, indicating that total spiking activity is a viable and more readily computable proxy for the true uncertainty in the judgment.
An important limitation of the present study is that both the modeling and experimental work presented here pertain to perception of simple oriented stimuli on a uniform background. Situations in which stimulus energy is present in more than a single orientation, for example, encoding an orientation embedded in noise, are not currently represented by the model. Such situations would likely alter the relationship between population activity and precision, potentially making total spike count a less viable basis for subjective confidence. In order to address these issues, future work could expand the encoding model to take arbitrary images as input, perhaps by modeling neural responses as a linear image filtering process followed by a nonlinear response transformation (e.g., Goris, Simoncelli, & Movshon, 2015).

Conclusions
In summary, these results provide behavioral evidence that perception of elementary visual stimuli is an outcome of population coding and decoding at the neural level. Most theoretical work on population codes focuses on the limit of large numbers of spikes, in particular making use of the asymptotic approach to the optimal Cramér-Rao bound (Seung & Sompolinsky, 1993). Although some previous studies have analyzed low-spiking regimes (Berens et al., 2011; Brunel & Nadal, 1998; Xie, 2002), they have typically not sought to generate behaviorally testable predictions. The present results open up the possibility of using analysis of human perceptual reports of near-threshold stimuli to probe the finer details of neural coding that are typically accessible only to nonhuman electrophysiology. They also have profound implications for signal-detection theory and Bayesian models of perception, which almost universally assume a normal distribution of internal errors.
Keywords: population coding, visual perception, perceptual confidence, Poisson noise, neural gain