A Functional Model of Neuronal Response Variability in Primary Visual Cortex

Responses of sensory neurons to repeated presentations of identical stimuli are variable. Despite extensive studies on the structure and mechanisms of this variability, its functional role remains debated. Here we propose and test a functional account of both response selectivity and variability, based on two recent hypotheses about neural coding. First, probabilistic inference about localized visual features explains how primary visual cortex (V1) neurons integrate information inside and outside their receptive fields (RFs). Second, the inferred probability distribution is reflected in the across-trial distribution of neuronal responses (the sampling hypothesis), so that higher uncertainty in the inference implies higher variability. The resulting model predicts that stimuli surrounding the RF should reduce response variability, reflecting that surround information reduces uncertainty about stimuli inside the RF. We test the predictions on macaque V1 responses to compound gratings and natural images. We find that variability is generally suppressed by stimuli extending beyond the RF; that the suppression is weaker for uninformative surrounds (i.e. with mismatched orientation); and that the modulation of variability and average firing rate can be dissociated. Our results offer strong evidence for a functional role of cortical variability in probabilistic inference.


Introduction
Cortical response variability has been studied extensively, because it can limit the information encoded by neuronal populations and therefore reduce behavioral accuracy (Zohary et al., 1994; Averbeck et al., 2006). However, it has also been proposed that variability could play functional roles (Stein et al., 2005), in particular in the representation of probabilities. Theory suggests that perception can be modeled as probabilistic inference (Knill & Pouget, 2004), and that these inferences rely on internal models adapted to the statistics of the natural environment (Barlow et al., 1961; Berkes et al., 2011). Understanding how neurons maintain internal representations of probability distributions is therefore key to understanding sensory processing, and for this reason different schemes have been proposed (Pouget et al., 2013). A recent influential proposal, termed the sampling hypothesis, is that the instantaneous neuronal activity represents samples from the target probability distribution (Hoyer & Hyvärinen, 2003; Fiser et al., 2010). In this framework, response variability may therefore indicate the uncertainty associated with the variables represented by the firing activity.
Here we combine modeling and electrophysiology in macaques to test this functional hypothesis in primary visual cortex (V1), with a focus on how variability is modulated by contextual stimuli. We focus on spatial context, i.e. stimuli in the surround of the neuron's receptive field (RF), because it is known to strongly modulate firing rate (Cavanaugh et al., 2002), it has been linked to image statistics and probabilistic inference (Schwartz & Simoncelli, 2001; Coen-Cagli et al., 2012), and it has recently been shown to affect variability (Snyder et al., 2014). Importantly, spatial context provides additional information that could reduce uncertainty about the stimulus inside the RF, without modifying the RF stimulus itself. We relate neuronal responses to probabilistic inference in Gaussian scale mixture (GSM) models. GSM models capture well the statistics of natural images (Wainwright et al., 2001) and successfully predict V1 responses, including how spatial contextual stimuli modulate firing rate (Coen-Cagli et al., 2015) and how variability is affected by stimulus onset and contrast (Orbán et al., 2016). We use a simple formulation of the GSM model with spatially separate input regions representing the RF of a neuron and its surround, where the features encoded by the neuron (oriented edges) are subject to additive observation noise and to the global influence of a single scalar multiplier, the mixer (representing e.g. the image contrast level). We assume that the goal of the neuron is, given a visual input, to invert the generative model and represent the posterior distribution of the feature inside the RF, while marginalizing out the nuisance variables (i.e. the mixer and the observation noise). These unknowns are therefore a source of uncertainty that depends on the stimulus and is reflected in the width of the posterior distribution. Following the sampling hypothesis, we then assume that neuronal responses are samples from this posterior distribution.
We show that in this framework, model neurons exhibit supra-Poisson variability consistent with existing data (Goris et al., 2014). The model predicts that surround stimulation should reduce response variability beyond the known reduction in firing rate, reflecting that spatial context information reduces uncertainty, and that this effect on variability should be stronger when surround stimuli are more informative. We tested these predictions with recordings from V1 of anesthetized and awake fixating macaques, viewing compound gratings and natural images. Our results offer strong support for these predictions, providing new evidence for the theory that cortical variability has a precise functional role in probabilistic inference.

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0

Generative model
To generate predictions about contextual modulation of response variability, we extended a Gaussian Scale Mixture (GSM) model. This model and related extensions capture well the spatial statistics of natural images (Wainwright et al., 2001) and predict the average firing rate of V1 neurons to compound gratings (Schwartz & Simoncelli, 2001) and natural images (Coen-Cagli et al., 2015).
We denote the observable quantity as the vector $x \in \mathbb{R}^n$, which in our case corresponds to the output of a bank of $n$ linear, Gabor-like filters (Simoncelli & Freeman, 1995) applied to an image patch. The filters are arranged into a center and a surround region (Fig. 1). Every position and orientation has two filters with complementary phases (a quadrature pair); for convenience we denote the center vertical filters as $x_c^+$ and $x_c^-$. We now consider a generative process of the form:
$$x = \nu\, g + \eta \qquad (1)$$
Here the vector $g \in \mathbb{R}^n$ represents the image features, i.e. oriented edges corresponding to the filter position and orientation. The variable $\nu \in \mathbb{R}^+$ is the mixer, a positive global multiplier that can be interpreted as a global contrast level, and $\eta \in \mathbb{R}^n$ is an additional source of noise.
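The generative process can be illustrated with a short simulation. This is a minimal sketch, not the authors' implementation: it assumes the multiplicative-plus-noise form x = ν g + η suggested by the text, zero-mean Gaussian g and η, and a Rayleigh prior on the mixer ν (the prior is not specified in this section, so that choice and the names `sample_gsm` and `mixer_scale` are hypothetical).

```python
import numpy as np

def sample_gsm(cov_g, cov_noise, n_samples, mixer_scale=1.0, seed=0):
    """Draw observations x = nu * g + eta from a noisy Gaussian scale mixture.

    cov_g, cov_noise : (n, n) covariances of the features g and the noise eta.
    mixer_scale      : scale of the Rayleigh prior on nu (an assumed prior).
    """
    rng = np.random.default_rng(seed)
    n = cov_g.shape[0]
    # Latent features (e.g. oriented edges) and additive observation noise.
    g = rng.multivariate_normal(np.zeros(n), cov_g, size=n_samples)
    eta = rng.multivariate_normal(np.zeros(n), cov_noise, size=n_samples)
    # One positive scalar mixer per sample, shared by all filters.
    nu = rng.rayleigh(scale=mixer_scale, size=n_samples)
    x = nu[:, None] * g + eta
    return x, g, nu
```

Because the mixer multiplies every element of g, center and surround filter outputs share a common scale, which is what makes the surround informative about the center.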
The task of the model neuron is to invert the generative process described above: given a visual stimulus and a corresponding observation $x$, infer the latent image features inside the receptive field, denoted as $g_c^+$ and $g_c^-$. This requires marginalizing over two sources of uncertainty: the mixer value and the observation noise. We then model the phase-invariant neuronal response at each trial as a function of a sample of $g_c^+$ and $g_c^-$ drawn from $P(g|x)$ (Eq. 3), where $P(g|x)$ is the posterior probability of the features of interest, obtained by inference over the generative model. The parameters $r_0$ and $\alpha$ are not part of the GSM, and could be used for quantitative fits at the single-neuron level. Here, for a qualitative comparison, we set them heuristically to achieve realistic Fano factors.
This results in a direct dependency between the trial-to-trial variability of the spike responses and the shape of the posterior belief P(g|x). We derived analytical expressions for this relationship in a reduced model (not shown) with no observation noise (i.e. η = 0 in Eq. 1). In the model, contextual stimuli (that is, the surround elements of x) have a divisive scaling effect on both the mean and the variance of the inferred g. This analytic result therefore predicts a precise link between divisive normalization (Carandini & Heeger, 2012) and contextual modulation of response variability, as has been recently suggested (Verhoef & Maunsell, 2017; Coen-Cagli & Solomon, 2018).
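The noiseless case can be illustrated with a short worked derivation. This is a sketch rather than the authors' analytical result, which is not shown in the source. It assumes the multiplicative form $x = \nu g$ with $g \sim \mathcal{N}(0, \Sigma_g)$, and the scale-free mixer prior in the last step is a hypothetical choice made only because it yields a closed form.

```latex
% Noiseless GSM (assumed form): x = \nu g, g ~ N(0, \Sigma_g), so g = x / \nu given \nu.
\begin{align}
P(\nu \mid x) &\propto P(\nu)\, \nu^{-n} \exp\!\left(-\frac{\lambda^2}{2\nu^2}\right),
  \qquad \lambda^2 = x^\top \Sigma_g^{-1} x, \\
\mathbb{E}[g_c \mid x] &= x_c\, \mathbb{E}[\nu^{-1} \mid x], \qquad
\mathrm{Var}[g_c \mid x] = x_c^2\, \mathrm{Var}[\nu^{-1} \mid x].
\end{align}
% Illustrative special case (assumption): P(\nu) \propto \nu^{-a} with n + a > 1.
% Substituting \nu = \lambda u, the distribution of u no longer depends on x:
\begin{align}
\mathbb{E}[g_c \mid x] = \frac{x_c}{\lambda}\, \mathbb{E}[u^{-1}], \qquad
\mathrm{Var}[g_c \mid x] = \frac{x_c^2}{\lambda^2}\, \mathrm{Var}[u^{-1}].
\end{align}
```

Because $\lambda$ pools the center and surround elements of $x$, a stronger surround drive typically increases $\lambda$ and so divides both moments, which is the qualitative link to divisive normalization described above.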

Model fitting and implementation
The covariance of the features, $\Sigma_g$, is computed by moment matching, based on an empirical estimate of the covariance of x obtained by applying the filter bank to a large sample of natural image patches ($N \approx 10^4$). The covariance of the observation noise, $\Sigma_{noise}$, is found by applying the filters to white noise and extracting the resulting correlation structure. The global scaling of the noise is chosen heuristically. To test the model, we apply the same filter bank to patches of interest (gratings equivalent to those used experimentally), and sample from the posterior distribution P(g|x) numerically by a Hamiltonian Monte Carlo procedure (Stan Development Team, 2018). We compute spike counts using Eq. 3.
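Posterior sampling can be sketched without Stan for small problems. The code below is not the Hamiltonian Monte Carlo procedure used in the paper; it swaps in a simpler scheme that exploits the fact that, conditional on the mixer, the posterior over g is Gaussian. It assumes the form x = ν g + η with zero-mean Gaussian g and η, and a unit-scale Rayleigh prior on ν (an assumption; the prior is not given in this section). The function name and the grid over ν are hypothetical.

```python
import numpy as np

def sample_posterior_g(x, cov_g, cov_noise, n_samples=1000, seed=0, nu_grid=None):
    """Sample g ~ P(g | x) by gridding the mixer nu, then sampling g | x, nu exactly."""
    rng = np.random.default_rng(seed)
    if nu_grid is None:
        nu_grid = np.linspace(0.05, 6.0, 400)  # uniform grid over the mixer
    n = x.shape[0]

    # log P(nu | x) up to a constant: prior times marginal N(x; 0, nu^2 cov_g + cov_noise).
    log_post = np.empty(nu_grid.size)
    for i, nu in enumerate(nu_grid):
        cov_x = nu**2 * cov_g + cov_noise
        _, logdet = np.linalg.slogdet(cov_x)
        quad = x @ np.linalg.solve(cov_x, x)
        log_prior = np.log(nu) - 0.5 * nu**2  # unit-scale Rayleigh (assumed prior)
        log_post[i] = log_prior - 0.5 * (logdet + quad)
    weights = np.exp(log_post - log_post.max())
    weights /= weights.sum()
    nus = rng.choice(nu_grid, size=n_samples, p=weights)

    # Given nu, g | x is Gaussian with precision cov_g^-1 + nu^2 cov_noise^-1.
    prec_g = np.linalg.inv(cov_g)
    prec_noise = np.linalg.inv(cov_noise)
    samples = np.empty((n_samples, n))
    for k, nu in enumerate(nus):
        cov_post = np.linalg.inv(prec_g + nu**2 * prec_noise)
        cov_post = 0.5 * (cov_post + cov_post.T)  # enforce symmetry numerically
        mean_post = cov_post @ (nu * (prec_noise @ x))
        samples[k] = rng.multivariate_normal(mean_post, cov_post)
    return samples, nus
```

The spread of the returned samples for the center features is the model quantity that maps onto trial-to-trial variability; the grid approach only scales to a single scalar mixer, which is why a general-purpose sampler is the more natural choice for richer models.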

Electrophysiology
Data were collected with microelectrode arrays implanted in V1 of anesthetized and awake macaques. In the anesthetized experiments, stimuli were static natural images and gratings of varying size, phase, and orientation (details below), presented for 100 msec followed by a 200 msec blank screen, 20 times in pseudo-random order (see Coen-Cagli et al., 2015, for details). In the awake experiments, animals performed a fixation task and stimuli were similar, except that they were presented for 200 msec with a blank of 150 or 100 msec, and repeated 60 to 120 times.
We analyzed only neurons whose RFs were well-centered on the images, and for each neuron only stimuli that evoked robust responses above spontaneous activity.

Results
A general prediction of this framework is that the presence of contextual information should reduce response variability, as a consequence of the expected reduction of uncertainty about the features within the receptive field (RF). We therefore compared the Fano factors (FF) of V1 neurons in the presence of a large image (size > RF) or a small image (size ≈ RF).
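The Fano factor is the across-trial variance of the spike count divided by its mean. A minimal sketch of the computation follows; the function name and the array layout (trials along the first axis) are assumptions made for illustration.

```python
import numpy as np

def fano_factor(spike_counts):
    """Fano factor (variance / mean) of spike counts across trials.

    spike_counts : array of shape (n_trials,) or (n_trials, n_conditions).
    Returns NaN where the mean count is zero.
    """
    counts = np.asarray(spike_counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)  # unbiased across-trial variance
    safe_mean = np.where(mean > 0, mean, 1.0)  # avoid division by zero
    return np.where(mean > 0, var / safe_mean, np.nan)
```

A Poisson process gives a Fano factor of 1, so values above 1 correspond to the supra-Poisson variability mentioned in the Introduction.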
The results consistently show that variability is reduced when a surround stimulus is present, for gratings (Fig. 2 a) and for natural image patches (Fig. 2 b).
This effect could reflect a generalized network mechanism, whereby providing a stronger input (i.e. a larger stimulus) to the population induces a more deterministic dynamical regime (Rajan et al., 2010). However, our model also predicts that contextual stimuli have a stronger impact on variability when they are more informative about the stimulus inside the RF. Specifically, the model predicts that both mean response and FF decrease when center and surround are parallel, but less so when they have different orientations (Fig. 3 a).
We therefore presented compound gratings to both awake and anesthetized monkeys. The center region was kept fixed at the neuron's preferred orientation, whereas the surround had a fixed size and contrast, but varying orientations. Consistent with the prediction, we found that the suppression of average response and of FF were similarly tuned for the surround orientation (Fig. 4 a). To exclude the possibility that the reduction in FF was simply a byproduct of the reduction in mean response, we performed a mean-matching analysis. We grouped neurons across conditions by mean response, using bins 0.33 spike counts wide. We then considered mean-matched pairs, where one of the neurons is in the parallel surround condition and the other in the orthogonal surround condition, and computed a Fano factor reduction score for each pair. Despite having very similar spike counts, the matched pairs show a consistently higher FF in the orthogonal condition compared to the parallel one (Fig. 4 b). To further decouple mean response modulation from effects on variability, we performed a second series of experiments.
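The pairing step of the mean-matching analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: neurons are binned by mean spike count with a bin width of 0.33, and within each bin neurons from the two conditions are paired at random (the random pairing rule, the function name, and the argument names are hypothetical). The sketch returns the paired Fano factors and does not compute the reduction score, whose definition is not given in this section.

```python
import numpy as np

def mean_matched_pairs(mean_par, ff_par, mean_orth, ff_orth, bin_width=0.33, seed=0):
    """Pair neurons from the parallel and orthogonal conditions with matched mean counts."""
    rng = np.random.default_rng(seed)
    mean_par = np.asarray(mean_par, dtype=float)
    mean_orth = np.asarray(mean_orth, dtype=float)
    ff_par = np.asarray(ff_par, dtype=float)
    ff_orth = np.asarray(ff_orth, dtype=float)

    # Assign every neuron to a mean-response bin.
    bins_par = np.floor(mean_par / bin_width).astype(int)
    bins_orth = np.floor(mean_orth / bin_width).astype(int)

    pairs = []
    for b in np.intersect1d(bins_par, bins_orth):
        idx_par = rng.permutation(np.flatnonzero(bins_par == b))
        idx_orth = rng.permutation(np.flatnonzero(bins_orth == b))
        k = min(len(idx_par), len(idx_orth))  # use each neuron at most once
        pairs.extend(zip(idx_par[:k], idx_orth[:k]))

    pairs = np.array(pairs, dtype=int).reshape(-1, 2)
    return pairs, ff_par[pairs[:, 0]], ff_orth[pairs[:, 1]]
```

Any pair returned this way differs in mean count by less than one bin width, so a systematic difference between the two returned Fano factor arrays cannot be attributed to a difference in mean response.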
We considered circular patches of gratings and natural images, of varying size. We expect that, as size increases, the average response changes non-monotonically (Cavanaugh et al., 2002). However, the GSM model makes a strong distinction between low firing due to a weak stimulus and low firing due to surround suppression. In the former case the response is small due to a small input (the x in Eq. 1), and there is still high uncertainty in the model latent variables. In the latter case, the reduction is due to a higher estimate of the global mixer (ν in Eq. 1 has a divisive effect on g), and there is more overall information about the hidden variables and the nature of the stimulus. The net result is that variability decreases monotonically with stimulus size (Fig. 3 b). We tested this model prediction in V1 of both anesthetized and awake macaque monkeys, finding a good qualitative correspondence between data and model (compare Fig. 3 b to Fig. 5).

Discussion
In this work, we combined computational modeling and experimental electrophysiology to investigate the functional role of response variability in visual coding. We extended the normative model of Coen-Cagli et al. (2012), including explicit predictions on contextual modulation of variability, based on the hypothesis of neural sampling (Orbán et al., 2016).
Our results indicate that models of neural coding based on the GSM statistics can explain the nonlinear effects of contextual modulation not only on average neuronal responses (Coen-Cagli et al., 2015), but also on modulation of variability.
Figure 5: V1 responses in awake and anesthetized monkey to circular patches of varying size. The patch is either an oriented grating, at best orientation (as represented in Fig. 3 b), or a natural image. Spike counts of each neuron are normalized by their peak response. Same conventions as in Fig. 4 a. (a,d) Anesthetized, gratings; (b,e) awake, gratings; (c,f) awake, natural images.
Our data support the model prediction that surround modulation of variability is stronger for more informative stimuli, and can be dissociated from the modulation of average responses. Furthermore, we observed similar effects across natural images; future work will address image-specific model predictions.
This work offers new evidence for a functional role of cortical variability in probabilistic inference, and underlines the importance of unified models of neural coding that account for both mean response and variability.