The place of human psychophysics in modern neuroscience

Human psychophysics is the quantitative measurement of our own perceptions. In essence, it is simply a more sophisticated version of what humans have done since time immemorial: noticed and reflected upon what we can see, hear, and feel. In the 21st century, when hugely powerful techniques are available that enable us to probe the innermost structure and function of nervous systems, is human psychophysics still relevant? I argue that it is, and that in combination with other techniques, it will continue to be a key part of neuroscience for the foreseeable future. I discuss these points in detail using the example of binocular stereopsis, where human psychophysics, in combination with physiology and computational vision, has made a substantial contribution.


INTRODUCTION
From ancient times, observing our own sensations and perceptions has been the most important way of learning about our body and mind. At its most basic, this is how we observe that our eyes are essential for seeing, the ears for hearing and so on. More subtly, Aristotle (350BC) described several perceptual illusions, including retinal after-images and the motion after-effect, now a staple of psychology and neuroscience (Sekuler, 1965). But it was in the nineteenth century that this folk psychology became formalized into detailed measurements of human perception. Galileo, Kepler and Newton had demonstrated with stunning success that the physical world was subject to laws that explained the observed regularities in the cosmos. Scientists now began to search for similar laws governing human perception; in Fechner's bold phrase, ''an exact science of the relations between body and soul'' 1 (Fechner, 1860). Many, such as Ernst Mach, Hermann von Helmholtz or Fechner himself, were distinguished physicists as well as psychologists or (what we would now call) neuroscientists. Whereas Aristotle had simply noted the motion after-effect as a quaint phenomenon, these scientists now began to construct theories of what it might imply about the inner workings of the brain.
They were remarkably successful in their endeavor. Weber's observation that the just-noticeable difference between two weights is proportional to the weight itself (Weber, 1846) encapsulates a profound truth about how the nervous system encodes information; although there are deviations, the basic observation applies to a vast range of phenomena in areas including timing and number as well as touch, vision and hearing (Stevens, 1957; Whittle, 1986; Killeen and Weiss, 1987; Dehaene, 2003). Wheatstone (1838) discovered binocular stereopsis, the sensation of depth produced by small disparities between the images seen by the two eyes. Surprisingly, this phenomenon had been missed by earlier researchers, such as Leonardo da Vinci (1835 (1651)), who had studied why it is that pictures appear flat even when the perspective is correct. Wheatstone's discovery implied the existence of structures within the brain sensitive to binocular disparity, 130 years before such neurons were identified (Barlow et al., 1967; Nikara et al., 1968). Young (1802) famously deduced the trichromatic nature of human vision, despite having no knowledge of the three cone types, and more than 160 years before human physiological cone spectra were finally measured, also using psychophysics (Wald, 1964). Fig. 1 compares the absorption spectra reported by Bowmaker and Dartnall (1980) with the sensitivities sketched by Helmholtz in 1867. The agreement is impressive considering how little physiology was known at the time.

WHAT IS PSYCHOPHYSICS?
Psychophysics has been defined as ''the analysis of perceptual processes by studying the effect on a subject's experience or behavior of systematically varying the properties of a stimulus along one or more physical dimensions'' (Bruce et al., 1996). While the techniques of psychophysics can be applied in a variety of domains, ''classic'' psychophysics has concentrated on the early sensory system. This is the area I shall concentrate on in this review. Furthermore, reflecting my own limited knowledge and experience, I shall draw most of my examples from vision, and specifically my own area of binocular depth perception or stereopsis.
The nineteenth-century psychophysicists still often used introspection rather than reporting quantitative measurements. Helmholtz' (1867) magnum opus contains no psychometric functions or similar data that would pass muster in a modern paper. Rather, the book is peppered with informal observations by the great man, including some charming anecdotes such as this on size perception: ''I still remember once, as a boy, passing by a church tower (the garrison church in Potsdam) and seeing people on its gallery who I thought were dolls. I asked my mother to fetch them down for me, which at the time I believed she would be able to do if she stretched out her arm.'' 2 Helmholtz describes his and others' experiments, not presenting the data, but inviting the reader to check them against his own experience. This illustrates another key assumption of much psychophysics: that it examines the most basic, fundamental aspects of human perception, common to all normally-functioning humans, rather than more subtle aspects of human experience that might fluctuate within or between individuals. To this day, this assumption underpins the very small number of subjects often used in psychophysical studies.
Fig. 1. An early success of psychophysics. Although Helmholtz had no knowledge of the different cone types, and the different roles played by rods and cones were unclear, the sensitivities he sketched for the putative three color sensors (colored lines) agree rather well with subsequent measurements, given that he assigns the green color sensors the absorption spectra of rods. The underlying figure, showing black curves with symbols, is reproduced from Bowmaker and Dartnall (1980), Fig. 2. The colored curves superimposed are redrawn from Fig. 119 of Helmholtz (1867), p. 292. The vertical lines mark colors that Helmholtz labeled violet, blue, green, yellow, orange and red. On p. 269, Helmholtz gives the wavelengths for the boundaries separating these colors, in nm. I have used these to align his curves with the axes.

However, modern psychophysics generally requires objective, quantitative judgments rather than verbal report or introspection. At the heart of all modern psychophysics is the psychometric function, where a quantitative aspect of the stimulus is related to the probability of a particular judgment. This is often used to extract a threshold, at which the probability of a correct judgment exceeds some particular level. Psychophysics is almost always combined with a mathematical framework such as signal detection theory. A classic example is the Weber/Fechner law mentioned above as one of the earliest successes of the field. Weber (1846) observed that the just-noticeable difference between two physical stimuli, say the minimum difference in luminance required for one light to be perceived as brighter than the other, tends to be constant when expressed as a percentage of the reference stimulus. Fechner (1860) explained this as follows. We postulate that the neural signal representing brightness depends on the logarithm of luminance, and is furthermore subject to internal noise, which we assume is Gaussian and independent of the signal. The perceived brightness of the dimmer light is therefore a random variable with mean $\log(L)$ and standard deviation $\sigma$; the perceived brightness of the other light has mean $\log(L + dL)$ and the same standard deviation. The difference in perceived brightness is thus a random variable with mean $\log(L + dL) - \log(L)$, or approximately $dL/L$, and standard deviation $\sigma\sqrt{2}$. The probability that the brighter light is correctly identified is simply the probability that this difference exceeds zero, which is $0.5\,(1 + \mathrm{erf}(dL/(2L\sigma)))$, where erf is the error function, $\mathrm{erf}(x) = (2/\sqrt{\pi}) \int_0^x \exp(-t^2)\,dt$. The luminance increment required for 75% correct performance is then $dL_{\mathrm{thresh}} = 0.95\,\sigma L$. This postulate both accounts for the observation that luminance threshold $dL_{\mathrm{thresh}}$ increases with test luminance $L$, and enables us to estimate the level of internal noise. Fechner traces his idea back to Bernoulli (1954 (1738)) and to Laplace (1812), who postulated a logarithmic relationship between a physical good (fortune physique) and its psychological benefit or utility to the observer (fortune morale).
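The Weber/Fechner calculation above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names, the bisection search and the example values of luminance and internal noise are all illustrative assumptions. It evaluates the psychometric function 0.5(1 + erf(dL/(2Lσ))), which already uses the approximation log(L + dL) - log(L) ≈ dL/L, and solves for the increment giving 75% correct.

```python
import math

def p_correct(dL, L, sigma):
    """Probability of correctly picking the brighter light under the
    log-encoding model with Gaussian internal noise of s.d. sigma."""
    return 0.5 * (1.0 + math.erf(dL / (2.0 * L * sigma)))

def threshold(L, sigma, target=0.75, tol=1e-10):
    """Increment dL at which p_correct reaches `target`, found by bisection
    (p_correct is monotonically increasing in dL)."""
    lo, hi = 0.0, 10.0 * sigma * L  # p_correct(hi) is essentially 1
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if p_correct(mid, L, sigma) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    sigma = 0.05  # hypothetical internal noise, chosen only for illustration
    for L in (10.0, 100.0, 1000.0):
        dL = threshold(L, sigma)
        # The last column is dL / (sigma * L), which should not depend on L.
        print(L, dL, dL / (sigma * L))
```

Under these assumptions the ratio dL/(σL) comes out the same at every luminance, which is Weber's law, and measuring the Weber fraction dL/L therefore gives an estimate of σ.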
As this example illustrates, right from its inception psychophysics has made postulates about the underlying neuronal mechanisms relating physical stimuli to perception. These include how sensory information is encoded (for example, the logarithmic relation in the above example), how this is affected by various sources of noise, how the activity of sensory neurons is converted into a perceptual judgment (e.g. via a decision criterion), and so on. Concepts such as decision variable (the difference in log luminance in the example above) and utility, originally developed in human psychophysics, have provided a language for describing the internal workings of the brain (Gold and Shadlen, 2007). As will emerge throughout this review, our increasing physiological knowledge is enabling modern psychophysics to make ever more detailed postulates about neuronal mechanisms.
In order to make these inferences, psychophysics uses a toolbox of techniques for measuring human perceptions (Gescheider, 1997; Ehrenstein and Ehrenstein, 1999), many developed by the pioneers of the field but given new power by digital computers. In the Method of Adjustment, the subject adjusts one stimulus until it appears the same as another. In the Method of Constant Stimuli, a fixed set of parameter values is chosen (for example, a fixed set of luminance increments $\{dL_i\}$) and repeatedly presented in a random order. A function, such as $0.5\,(1 + \mathrm{erf}(dL/(2L\sigma)))$, is then fitted to the set of data, and used to deduce quantities of interest, in this example the internal noise $\sigma$. With the advent of digital computers, it is easy to interleave different experimental conditions at random in order to minimize the effects of expectation, fatigue or out-and-out cheating by the subject.
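As an illustration of the Method of Constant Stimuli, the sketch below simulates an observer who obeys the Weber/Fechner model and then recovers the internal noise by fitting the psychometric function. Everything here is an assumption made for the example: the simulated observer, the increment values, the trial counts, and the use of a simple maximum-likelihood grid search in place of whatever fitting routine a given laboratory would actually use.

```python
import math
import random

def p_correct(dL, L, sigma):
    """Model psychometric function: 0.5 * (1 + erf(dL / (2 L sigma)))."""
    return 0.5 * (1.0 + math.erf(dL / (2.0 * L * sigma)))

def run_constant_stimuli(increments, L, true_sigma, trials_per_level, rng):
    """Present each increment `trials_per_level` times in random order to a
    simulated observer; return {increment: number of correct responses}."""
    order = [dL for dL in increments for _ in range(trials_per_level)]
    rng.shuffle(order)  # random interleaving of stimulus levels
    n_correct = {dL: 0 for dL in increments}
    for dL in order:
        if rng.random() < p_correct(dL, L, true_sigma):
            n_correct[dL] += 1
    return n_correct

def fit_sigma(n_correct, L, trials_per_level, candidates):
    """Return the candidate sigma with the highest binomial log-likelihood."""
    def log_lik(sigma):
        total = 0.0
        for dL, k in n_correct.items():
            p = min(max(p_correct(dL, L, sigma), 1e-9), 1.0 - 1e-9)
            total += k * math.log(p) + (trials_per_level - k) * math.log(1.0 - p)
        return total
    return max(candidates, key=log_lik)

rng = random.Random(1)                           # fixed seed for repeatability
L, true_sigma = 100.0, 0.05                      # hypothetical values
increments = [1.0, 2.0, 4.0, 6.0, 8.0, 12.0]     # the fixed set {dL_i}
data = run_constant_stimuli(increments, L, true_sigma, 200, rng)
grid = [0.005 * i for i in range(1, 61)]         # candidate sigmas 0.005 .. 0.3
sigma_hat = fit_sigma(data, L, 200, grid)
print(data, sigma_hat)
```

With a few hundred trials per level the fitted value should land close to the noise level used to generate the data, though the exact estimate depends on the random seed.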
Computers also enable automated staircase procedures, which offer a particularly quick and convenient way of extracting thresholds and other parameters where there is a monotonic relationship between the experimental parameter and task difficulty (Dixon and Mood, 1948). Staircase procedures typically start with a large value of the parameter, designed to make the task easy. The parameter is reduced until the person makes an error, at which point the parameter is increased again. In this way, by stepping up and down an imaginary staircase, the procedure gradually homes in on the threshold level of performance. There is a large body of work examining different mathematical recipes for adjusting the staircase (Watson and Pelli, 1983; Bernstein and Gravel, 1990; Johnson et al., 1992; King-Smith et al., 1994; Treutwein, 1995; Snoeren and Puts, 1997; Treutwein and Strasburger, 1999; Shen, 2013). Staircases work well in tasks like contrast detection or luminance discrimination. However, they can fail catastrophically if task difficulty is a non-monotonic function of the parameter of interest. For example, judgments of relative depth from binocular disparity are hard if the disparity is near-zero, become easier as the disparity is increased up to around half a degree, and subsequently become hard or impossible as excessive disparities cause double vision and a loss of the depth percept.
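The staircase idea can be sketched in a few lines. The example below uses a 2-down/1-up rule with a fixed step size, which is one common variant and is not claimed to be the recipe of any particular paper cited above; the simulated observer, the starting value, the step size and the practice of averaging late reversals are all assumptions made for illustration. A 2-down/1-up rule converges on the level giving about 70.7% correct, which under the Weber/Fechner model above is roughly 0.77σL.

```python
import math
import random

def p_correct(dL, L, sigma):
    """Simulated observer: 0.5 * (1 + erf(dL / (2 L sigma)))."""
    return 0.5 * (1.0 + math.erf(dL / (2.0 * L * sigma)))

def two_down_one_up(L, sigma, start, step, n_trials, rng):
    """Run a fixed-step 2-down/1-up staircase on the simulated observer and
    return the mean increment at reversals (first four reversals dropped)."""
    dL = start
    streak = 0       # consecutive correct responses at the current level
    direction = 0    # -1 while stepping down, +1 while stepping up
    reversals = []
    for _ in range(n_trials):
        correct = rng.random() < p_correct(dL, L, sigma)
        if correct:
            streak += 1
            move = -1 if streak == 2 else 0   # harder after two correct
        else:
            move = +1                         # easier after any error
        if move != 0:
            streak = 0
            if direction != 0 and move != direction:
                reversals.append(dL)          # the staircase changed direction
            direction = move
            dL = max(step, dL + move * step)  # keep the increment positive
    usable = reversals[4:]
    return sum(usable) / len(usable)

rng = random.Random(2)   # fixed seed for repeatability
estimate = two_down_one_up(L=100.0, sigma=0.05, start=20.0, step=0.5,
                           n_trials=600, rng=rng)
print(estimate)
```

Because the rule only ever asks whether to make the task harder or easier, it relies on difficulty changing monotonically with the parameter, which is exactly the condition that fails for large disparities.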
As well as examining the precision of human perception, psychophysics can also reveal its accuracy. Psychophysicists are fascinated by illusions, where human perception does not veridically represent the world. A famous example is the Ebbinghaus illusion, where a circle surrounded by larger (smaller) circles appears smaller (larger) than it really is. Illusions are informative because a veridical perception simply tells us that our perceptual systems are well adapted to their job of representing the world, whereas a system's failures can reveal how it is constructed. However, illusions often take the form of ''biases'', such as the size bias in the Ebbinghaus illusion, and measuring these can be tricky. Morgan et al. (2013) have recently argued that many experimental approaches confound response biases (e.g. a tendency to press the left button when in doubt), decisional biases (e.g. a tendency to respond ''bigger'' when in doubt), and genuine perceptual biases (e.g. the tendency to perceive a circle as bigger when it is surrounded by small circles). They argue that by designing experiments appropriately, it is possible to dissect out these different forms of bias. In terms of signal detection theory, this enables the psychophysicist to distinguish between a shift in the signal function and a shift in the decision criterion. In terms of neuronal mechanisms, these correspond to a change in how sensory neurons encode the physical stimulus, and a change in how higher brain areas decode the response of a population of sensory neurons.
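A toy signal-detection model shows why perceptual and decisional biases are easily confounded in a simple single-stimulus design. This is a sketch under stated assumptions rather than the analysis of Morgan et al. (2013): the Gaussian internal signal, the function name and all numerical values are hypothetical. The observer reports ''bigger'' whenever a noisy internal size signal exceeds a criterion, so shifting the signal (a perceptual bias) and shifting the criterion (a decisional bias) move the psychometric function in exactly the same way.

```python
from statistics import NormalDist

def p_respond_bigger(size_diff, perceptual_bias, criterion, sigma):
    """Probability of a 'bigger' response when the internal signal is
    Gaussian with mean (size_diff + perceptual_bias) and s.d. sigma, and
    the observer responds 'bigger' if the signal exceeds `criterion`."""
    signal = NormalDist(mu=size_diff + perceptual_bias, sigma=sigma)
    return 1.0 - signal.cdf(criterion)

diffs = [-0.4, -0.2, 0.0, 0.2, 0.4]   # physical size differences (arbitrary units)

# Observer 1: the test circle genuinely looks 0.1 units bigger than it is.
perceptual = [p_respond_bigger(d, 0.1, 0.0, 0.2) for d in diffs]

# Observer 2: veridical percept, but a criterion shifted toward 'bigger'.
decisional = [p_respond_bigger(d, 0.0, -0.1, 0.2) for d in diffs]

for d, p1, p2 in zip(diffs, perceptual, decisional):
    print(d, round(p1, 4), round(p2, 4))
```

The two observers produce identical response probabilities at every stimulus level, so this design cannot tell them apart; separating the two kinds of bias requires a different experimental design, as the text argues.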
Deductions about neuronal mechanisms can also be made by comparing how performance varies across individuals. If thresholds on tasks A and B are correlated between individuals whereas those on tasks C and D are not, this suggests that the brain areas subserving A and B may overlap more than those subserving C and D. Perhaps surprisingly, these techniques have been little exploited within pure psychophysics. Several individual-differences studies have related a psychophysical measurement, e.g. threshold, to a physiological measurement e.g. cerebral blood flow (Kosslyn et al., 2002). Nefs et al. (2010) is a rare example of correlating thresholds on different psychophysical tasks, used in their case to deduce that humans possess two independent mechanisms for detecting motion in depth.
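The logic of the individual-differences approach can be illustrated with simulated data. In the sketch below, every number is synthetic: thresholds on tasks A and B are generated from a shared per-subject factor plus independent noise, while tasks C and D are generated independently. The variable names, the sample size and the noise levels are assumptions chosen only to make the contrast visible, and the values can be thought of as standardized log thresholds.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)   # fixed seed for repeatability
n_subjects = 200         # hypothetical sample size

# Tasks A and B draw on a common mechanism (shared factor) plus task noise.
shared = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
task_a = [s + rng.gauss(0.0, 0.5) for s in shared]
task_b = [s + rng.gauss(0.0, 0.5) for s in shared]

# Tasks C and D depend on unrelated mechanisms.
task_c = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
task_d = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]

r_ab = pearson_r(task_a, task_b)
r_cd = pearson_r(task_c, task_d)
print(r_ab, r_cd)
```

In this simulation the A–B correlation is high and the C–D correlation is near zero, which is the pattern that would suggest overlapping mechanisms for A and B. With the handful of subjects typical of psychophysical studies, the sampling error on a correlation is far larger, which may be one reason the approach is rarely used.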
As noted above, much psychophysics has been directed at uncovering fundamental mechanisms shared by all humans. Given this assumption, and the fact that experiments may require hours of painstaking observation, human psychophysics papers often use very small numbers of subjects, sometimes as small as 2. This is often surprising to scientists from other fields, and seems at odds with the generally rigorous approach laid out above. Can a paper reporting data from 4 subjects really tell us anything general about humanity? My own research area of binocular stereopsis is one where there seems to be a particularly large amount of individual variation, so small studies can be misleading. For example, a paper examining sensitivity to vertical disparity, using 3 subjects, concluded that ''sensations of depth are not elicited by modulations of vertical-size disparity of any amplitude at spatial frequencies higher than about 0.04 c/deg'' and that the sensitivity function was low-pass, suggesting that the brain does not contain mechanisms tuned to modulations in vertical-size disparity (Kaneko and Howard, 1997). A subsequent paper with 9 subjects found similar results for 3 subjects, but the other 6 subjects showed bandpass sensitivity and a weak sensation of depth up to frequencies four times higher than the previous study. This suggests that some people possess mechanisms tuned to modulations in vertical disparity while others do not. There are also conflicting results that do not appear to be due to under-sampling. For example, the ''anti-correlated random-dot stereogram'', which presents opposite contrast to the two eyes, has been influential in developing theories of cortical depth encoding (reviewed by Read (2005)). In order to understand how information in primary visual cortex relates to perception, it is important to understand what percept is caused by this stimulus, but the results are conflicting.
Several labs have found that such images cause no perception of depth (Julesz, 1960; Cogan et al., 1993; Cumming et al., 1998), even when dozens of subjects are tested (Hibbard et al., 2014), whereas others have reported that under some circumstances, some observers see reversed depth (Read and Eagle, 2000; Tanabe et al., 2008; Doi et al., 2011; Doi et al., 2013). The reason for these discrepancies is not clear. It is probably not coincidence, however, that both these examples relate to highly unnatural and difficult stimuli, which create only a weak depth percept in the most sensitive observers. In general, my impression is that the techniques that characterize perceptual psychophysics (objective reports, randomly interleaved presentations controlled by computer, rigorous fitting based on well-understood mathematics) do generally ensure good reproducibility. The Open Science Framework (https://osf.io/ezcuj/) has recently launched the Reproducibility Project: Psychology, which aims to systematically replicate selected psychology publications (Carpenter, 2012; Yong, 2012). Over time, this project should reveal how well psychophysics is living up to its ideals.
A further advantage of the move away from introspection and toward rigorous techniques using quantitative reports is that it has made psychophysics possible in animals as well as humans. Animal psychophysics may exploit a spontaneous behavior such as the optokinetic/optomotor response (McCann and MacGinitie, 1965), or require extensive training (Pavlov, 1927; Skinner, 1933). The use of animals enables the neuronal activity underlying perception to be probed in detail. Modern neuroscience has a plethora of techniques at its disposal. Current flow or voltage change in an individual neuron can be recorded; spikes fired by scores of neurons can be recorded simultaneously; optogenetic techniques allow specific classes of neurons to be activated or inactivated at will. Concepts originally derived from behavioral or psychophysical studies, such as the decision variable or utility discussed above, are now probed at the level of single neurons (Barlow, 1972; Parker and Newsome, 1998; Gold and Shadlen, 2007; Shadlen and Kiani, 2013).
Animal studies are particularly valuable because they enable physiology and psychophysics to be used simultaneously in the same organism. However, nowadays neuroscientists also have access to a wide range of non-invasive techniques that allow coarser access to neural anatomy and physiology in living humans. To electrical and magnetic encephalography have been added functional near infra-red spectroscopy, structural and functional magnetic resonance imaging, diffusion tensor imaging to track white matter tracts, and transcranial magnetic stimulation to briefly alter the functioning of specific cortical areas.
Given these developments, even a reader who accepts the huge contribution made by human psychophysics in the past might reasonably wonder if it has a place in the future. One can query both the ''human'' and the ''psychophysics'': in humans, will psychophysics remain valuable, as opposed to other techniques such as neuro-imaging? And if psychophysics remains an important technique, will it continue to be done in humans as opposed to experimental animals where results can be directly compared with invasive physiology? I argue that there are several reasons why human psychophysics will remain a fundamental tool of neuroscience.

THE CONTINUING ROLE OF HUMAN PSYCHOPHYSICS
As noted above, animal psychophysics has particular value because we can directly relate neuronal activity to perceptual judgments. Despite this, human psychophysics has several advantages over the animal variety which assure its continued importance. Perhaps most fundamentally, human psychophysics tells us directly about the species we are most interested in. Some human abilities (language, abstract reasoning) may not even exist in other species, or not to the same degree. Even where the abilities exist in other species, human psychophysics experiments can exploit complex tasks that would be difficult or impossible without verbal instruction. For example, one recent paper examined ''electrophysiological correlates of anxious rumination'' by comparing electroencephalography (EEG) signals measured while participants performed a neutral counting task versus while they ruminated on a personal conflict in their own life (Andersen et al., 2009). It is hard to see how such an experiment could be carried out in a lab animal, even if the species was capable of anxious rumination. These sorts of more complex tasks are likely to become more important in the future, as the field moves beyond basic sensory encoding to processing in higher brain areas. A second point worth highlighting is that human subjects can give verbally more complex responses than are possible in animals, for example reporting their qualitative sensations. Admittedly, this ability is little exploited in the sort of classic sensory psychophysics I am discussing in this review.
Even where an animal can apparently be trained to perform a task, it is difficult to be sure that the animal is in fact reporting what the experimenter hopes it is. It may be attending to a different aspect of the stimulus, perhaps even an artifact the experimenter is not aware of. Perceptual thresholds in animals may reflect the effect of motivation, for example trading off a low but acceptable reward rate in return for lower attentional load, rather than true sensory limits. These are valid concerns in humans too, but human participants will generally communicate such problems.
Furthermore, the extensive training necessary to teach lab animals what is required of them may in itself alter the neuronal substrate under study (Chowdhury and DeAngelis, 2008; Hua et al., 2010). That is, it may change the low-level neuronal circuits representing the sensory information as well as the high-level circuits representing the animal's understanding of and motivation to do the task. The brain areas involved when a highly trained animal carries out a task on which it has performed hundreds of trials may be very different from those subserving such tasks before training. For a similar reason, animal studies of perceptual learning can be hard to interpret, because of the difficulty of distinguishing perceptual learning from simple task learning.
Last but not least, the ''3Rs'', the principles of Replacement, Reduction and Refinement (Russell et al., 1992), mandate that animal experiments should be carried out only when necessary. Experiments should therefore be done in humans whenever possible.
These are all reasons, then, why we need to study humans as well as animals. But one might wonder whether the powerful new techniques mentioned above supersede traditional psychophysics. Perhaps nowadays we should confine ourselves to measuring human brain activity with functional magnetic resonance imaging (fMRI) or magnetoencephalography (MEG), rather than inferring it via psychophysics. Does psychophysics in any species still have value for understanding the brain? I would argue that it does. The ultimate goal of neuroscience is to understand the biological basis of our thoughts and behavior. Within this, a major subgoal is understanding our own perceptions: how our brains represent and interpret the world around us. Psychophysics asks an individual to make quantitative reports about their perception of a stimulus, and examines how these reports change as a function of the physical properties of the stimulus. In other words, it probes the input/output relations of the system under study. It is hard to imagine a more basic approach, or how one could claim to understand any system without first measuring these relations.
Of course, we have amassed a large body of knowledge about how humans perceive stimuli. But this does not mean that psychophysics is now over. Rather, our growing knowledge about brain mechanisms is prompting new psychophysical experiments designed to probe more subtle questions. New technologies such as fMRI or transcranial magnetic stimulation (TMS) have supplemented rather than replaced psychophysics. Studies using these new techniques in humans routinely pair them with psychophysical measurements that greatly increase their power. Below, I give specific examples of such interactions between physiology and psychophysics.
On a less exalted level but of considerable practical importance, human psychophysics is generally much quicker, easier and cheaper than either non-human psychophysics or other techniques in humans. So human psychophysics can be used to map out the nature of the phenomena to be explained, providing valuable guidance for subsequent work using other techniques. For example, human fMRI generally investigates phenomena that have been previously established using psychophysics alone.
Human psychophysics is continuing to make major contributions to one of my own particular areas of interest: stereoscopic vision, and in particular the constraints placed upon our stereoscopic vision by the initial encoding of binocular disparity in primary visual cortex (V1). By definition, the properties of V1 are a matter for neurophysiology, so by its nature this has required close collaboration between human psychophysics and physiology. These techniques are sometimes combined within a single study, sometimes applied separately, and many different groups have contributed to this ongoing project. In the next section, I review the progress made in this area. Along the way, I hope to highlight the distinctive contribution made by human psychophysics, illustrating the general points made in this section.

LINKING NEURONS TO HUMAN PERCEPTION IN STEREOSCOPIC VISION
Binocular stereopsis refers to the perception of depth based on small disparities between the images seen by the two eyes. As noted above, its discovery was itself an early triumph of the new discipline of psychophysics. Stereopsis was studied by many nineteenth-century luminaries, including Hering and Helmholtz. Notable advances included Helmholtz's work on the horopter (points in space that appear at the same location when viewed monocularly in either eye) and his demonstration that vertical disparities are used to calibrate the depth percept due to horizontal disparity. A century later, human psychophysics provided a second major breakthrough which revitalized the field and prompted new avenues of research in psychophysics, neurophysiology and computational neuroscience. This was Julesz's (1960) demonstration that stereopsis does not require a monocularly-visible object, but can work on ''cyclopean'' stimuli in which structure is defined purely by the offsets between the two eyes. Julesz (1978) dubbed this ability ''global stereopsis'', on the grounds that local features are ambiguous, so a successful match requires the visual system to take account of stimulus structure over relatively large scales. This ability proves that at least one form of stereopsis precedes object recognition.
This demonstration immediately made stereopsis an attractive model system to neuroscientists seeking to understand the relationship between cortical computations and perception. Neurons in the lateral geniculate nucleus of the thalamus receive their primary innervation from only one eye, and although there are binocular interactions (Marrocco and McClurkin, 1979; Schroeder et al., 1990), thalamic neurons appear not to be tuned for disparity (Xue et al., 1987). Therefore, it seems likely that the neuronal mechanisms subserving stereo vision must begin in primary visual cortex, the first place in the visual pathway where neurons tuned to disparity are found. This makes stereopsis an interesting candidate for studying specifically cortical algorithms. Scientists since Isaac Newton had already used degree of interocular transfer as a way of assessing whether a particular phenomenon was supported by cerebral structures (if not always with impeccable logic; Day (1958)). The advent of cyclopean stimuli facilitated this by enabling the presentation of stimuli that were only visible to the cortex. Cyclopean stimuli also enable depth from binocular stereopsis to be examined in isolation, without the other depth cues that normally accompany it, such as texture, shading, occlusion, motion parallax and perspective cues. Stimuli such as dynamic random-dot stereograms therefore became a staple of stereo psychophysics.
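To make the idea of a cyclopean stimulus concrete, the following sketch generates a single frame of a random-dot stereogram. It is an illustrative construction, not Julesz's original procedure: the function name, the image size, the square target and the pixel disparity are all assumptions. Each eye's image is pure random noise when viewed alone; a central square exists only as a horizontal offset between the two images. Setting `anticorrelated=True` inverts the contrast of every dot in one eye, giving the anti-correlated variant mentioned earlier.

```python
import random

def make_rds(size=64, square=24, disparity=4, anticorrelated=False, seed=0):
    """Return (left, right) images as lists of rows of +1/-1 dots.
    A central square of side `square` is shifted `disparity` pixels
    leftward in the right eye's image. Requires 0 < disparity <= (size - square) // 2."""
    rng = random.Random(seed)
    left = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # start as an identical copy
    lo = (size - square) // 2
    hi = lo + square
    for y in range(lo, hi):
        # Copy the square's dots into shifted positions in the right image.
        for x in range(lo, hi):
            right[y][x - disparity] = left[y][x]
        # Fill the strip uncovered by the shift with fresh random dots.
        for x in range(hi - disparity, hi):
            right[y][x] = rng.choice((-1, 1))
    if anticorrelated:
        # Opposite contrast in the two eyes: every right-eye dot is inverted.
        right = [[-v for v in row] for row in right]
    return left, right

left, right = make_rds()
_, right_anti = make_rds(anticorrelated=True)
print(sum(row.count(1) for row in left), sum(row.count(1) for row in right))
```

Regenerating the dots with a new seed on every frame would give a dynamic stereogram; in either case neither monocular image contains any contour marking the square.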
Furthermore, Julesz's demonstration that global stereopsis does not require recognizable objects suggested that the algorithm used by the cortex to detect such stimuli must be simple and low-level, the sort of algorithm that could potentially be understood and implemented in a machine. Computational neuroscientists were quick to come up with candidates (Dev, 1975; Marr and Poggio, 1976; Marr et al., 1978; Marr and Poggio, 1979). As Julesz pointed out, his use of cyclopean stimuli shifted the direction of the whole field: away from trying to understand the relationship between binocular disparity and perceived depth, and toward understanding how binocular disparity is extracted in the first place (Julesz, 1964). This piece of human psychophysics therefore set the agenda in this area of neuroscience for decades to come.
In the years following Julesz's demonstration of global stereopsis, neurons tuned to binocular disparity were identified in a range of species: cat (Barlow et al., 1967; Nikara et al., 1968; Pettigrew et al., 1968; Nelson et al., 1977; Fischer and Krueger, 1979), monkey (Zeki, 1974; Poggio and Fischer, 1977), sheep (Clarke et al., 1976) and owl (Pettigrew, 1979). These early studies followed in the tradition set by Hubel & Wiesel of using bar stimuli, which are of course monocularly visible objects. However, Julesz's work was rapidly followed up in monkey psychophysics, and within ten years of his original report, it had been shown that monkeys too possess global stereopsis (Bough, 1970). It is perhaps surprising that it took another fifteen years for a published demonstration that neurons in monkey V1 were sensitive to disparity in cyclopean stimuli (Poggio et al., 1985; Poggio et al., 1988) as well as in traditional stimuli like bars. This strongly implicated these neurons as playing a role in the brain's algorithm for global stereopsis (Poggio and Poggio, 1984; Poggio, 1990), raising the possibility that they could be regarded as analogous to photoreceptors, with V1 as the ''cyclopean retina'' for global stereopsis, a term introduced by Julesz (1971) to refer to the putative processing site in the cortex that extracts disparity from such stimuli.
In turn, this physiology was soon being used to develop new computational models, notably the stereo energy model (Ohzawa, 1998). This model was developed in cat (Ohzawa et al., 1990), but its predictions were soon being tested and confirmed in monkey (Cumming and Parker, 1997). This test exploited another tool developed in human psychophysics: the anti-correlated stereogram introduced above (Anstis and Rogers, 1975; Rogers and Anstis, 1975; Cogan et al., 1993).
More recently, the neuronal basis of stereopsis in humans has been examined using functional magnetic resonance imaging (Backus et al., 2001; Gilaie-Dotan et al., 2002; Negawa et al., 2002; Tsao et al., 2003; Neri et al., 2004; Bridge and Parker, 2007; Likova and Tyler, 2007; Preston et al., 2008; Spang and Morgan, 2008). These kinds of studies are guided and informed by the established human psychophysics, and very often they combine cortical imaging with human psychophysics in their experiments. For example, Backus et al. (2001) used this approach to demonstrate that ''measured cortical activity covaried with psychophysical measures of stereoscopic depth perception''. This exemplifies a point made above: by combining their fMRI recording with psychophysics, Backus et al. strengthened their power to draw conclusions about the significance of the cortical activity they measured.
As discussed above, psychophysics has always been concerned to relate human perception to underlying neuronal mechanisms, via mathematical models or linking hypotheses (Morgan et al., 2013). This relationship has been particularly close in the area of stereoscopic vision, perhaps because the detection of binocular disparity occurs later in the visual pathway, and thus closer to perceptual experience, than the detection of light. In the following paragraphs, I will briefly review several aspects of human stereoscopic vision and discuss our current understanding of the underlying neuronal mechanisms.

Stereoacuity
Psychophysics has always been much occupied with the study of thresholds: the dimmest light or smallest tilt perceivable. In the context of stereopsis, this corresponds to stereoacuity: the smallest depth step detectable from binocular disparities. This is much smaller than the spacing of photoreceptors in the retina. What feature of neural circuitry sets this limit? Poggio and Poggio (1984) initially noted that the coarse stereoacuity implied by the tuning curves of monkey V1 neurons did not accord with the fine stereoacuity of human or monkey observers: ''The threshold of stereoacuity is more than one order of magnitude smaller than the width of tuning of disparity sensitive cells''. Of course, it may be naïve to compare the sensitivity of an individual neuron to that of the whole organism, which contains many thousands of such neurons. Yet there are some tasks on which the sensitivity of individual neurons does closely match that of the organism (Britten et al., 1992). There are several reasons for the discrepancy noted by Poggio & Poggio. First, we now know that cells in V1 encode absolute disparity (Cumming and Parker, 1999), whereas the exquisitely low stereo thresholds achieved by human observers require relative disparity (Westheimer, 1979). Cells selective for relative disparity are not observed until V2 (von der Heydt et al., 2000;Thomas et al., 2002). This is an example of how psychophysics enables us to interpret physiological measures of neuronal function, in this case implying that we should compare the tuning of V1 neurons with human thresholds for absolute, not relative, disparity.
However, the tuning width of V1 neurons is still wide even compared with the sensitivity of human observers to absolute disparity. For example, the absolute disparity thresholds we measured in one recent paper were about 0.04° for long-duration stimuli and 0.08° for stimuli presented for just 160 ms (see the Supplementary Material of that paper), whereas the width of typical V1 disparity-tuning curves is around 0.5° (Poggio et al., 1985;Poggio et al., 1988;Prince et al., 2002b). A further complication is the fact that V1 neurons are not usually recorded at the fovea itself, but may be at several degrees eccentricity; stereoacuity declines rapidly as stimuli move out from the fovea (Rawlings and Shipley, 1969). Furthermore, monkey stereoacuity may not be as good as human. A study comparing psychometric and neurometric functions in the same animal, for relative-disparity judgments at the appropriate eccentricity, addressed many of these issues. Its authors concluded that the best V1 neurons were as sensitive as, or more sensitive than, the animals themselves, although the average neuronal threshold was four times poorer than the average psychophysical threshold. That is, psychophysical stereoacuity does seem to be accounted for by the properties of neurons in V1, when comparisons are made for the same species and eccentricity. In agreement with this picture, disparity-tuning curves in ventral areas like IT are not substantially sharper than in V1, even though these areas seem to be more directly related to depth perception (Janssen et al., 1999;Uka et al., 2000;Janssen et al., 2003;Uka et al., 2005).

Disparity range
Stereoscopic vision is unusual in that there is not only a threshold but a ceiling: both a minimum and a maximum detectable disparity. Disparities beyond about 0.5° lie outside the fusible range (Panum, 1858), and do not result in a depth percept in cyclopean stimuli. This psychophysical limit agrees very well with the observed range of disparity tuning in monkey V1. The preferred disparities of monkey V1 neurons are generally less than 0.5°, with very few neurons selective for disparities over 1°, even at an eccentricity of 5° (Prince et al., 2002a).

Size-disparity correlation
Several psychophysical studies have found evidence for a ''size-disparity correlation'', meaning that larger disparities are encoded by sensors with larger receptive fields (Felton et al., 1972;Tyler, 1973, 1974, 1975;Smallman and MacLeod, 1994;Tsirlin et al., 2008). Computational neuroscientists have also proposed a similar relationship on theoretical grounds: if sensors tuned to larger disparities are also tuned to larger spatial scales, they are less likely to respond to false matches between the left and right-eye images (Marr and Poggio, 1979). This size-disparity correlation emerges naturally from the class of model known as phase-based (Sanger, 1988;Ohzawa et al., 1990;Qian, 1994). Although stereo vision is not limited to purely phase-based encoding (Prince and Eagle, 1999;Prince and Eagle, 2000;Prince et al., 2002a), there is some physiological evidence for such a relationship. V1 neurons tuned to small disparities are found at all spatial scales, but cells tuned to the largest disparities tend to be those with the largest scales (Prince et al., 2002a). This is an interesting example, because the size-disparity correlation is widely accepted based on computational modeling of psychophysical data, despite the relatively weak physiological evidence supporting it.

Temporal stereoresolution
Temporal resolution for disparity is very low: human observers can perceive variations in disparity only up to around 5 Hz (Norcia and Tyler, 1984;Lankheet and Lennie, 1996;Kane et al., 2014), an order of magnitude lower than the threshold for flicker fusion (Kelly, 1971). This agrees reasonably well with the properties of V1 neurons. Macaque V1 neurons modulate their firing rates to track temporal modulations in disparity up to around 10 Hz (Nienborg et al., 2005), even though they track variations in contrast up to much higher frequencies. Nienborg et al. point out that this loss of resolution is an interesting mathematical consequence of comparing inputs from the two eyes. Unfortunately, there are as yet no psychophysical data on temporal stereoresolution in macaques. We do not know, therefore, whether macaques can perceive temporal modulation in disparity up to the frequencies suggested by their V1 neurons, which, at around 10 Hz, are somewhat higher than the limit for most humans.

Spatial resolution
Stereo vision also has much coarser spatial resolution than luminance vision. Humans are able to detect variation in luminance on a scale of 50 cycles per degree or higher (Campbell and Green, 1965), yet we can detect variation in disparity only up to around 4 cycles per degree (Tyler, 1974;Bradshaw and Rogers, 1999). The low spatial stereo resolution appears to reflect the size of receptive fields in V1. This is not the case for luminance, because V1 receptive fields have ON and OFF subregions which make them sensitive to variations in luminance across the receptive field. In contrast, V1 neurons respond best to uniform disparity (Nienborg et al., 2004). The minimum response fields of primate V1 neurons near the fovea are roughly Gaussian; the smallest receptive fields have a standard deviation around 0.1°. The Fourier transform of such a Gaussian is a low-pass function falling to 5% of its peak value at 4 cycles per degree (Nienborg et al., 2004). Thus, the size of monkey V1 receptive fields accords well with the observed stereoresolution of human observers (Lankheet and Lennie, 1996;Banks et al., 2004;Allenmark and Read, 2011;Kane et al., 2014). Of course, this relies on a cross-species comparison. It would be preferable to relate the sensitivity of V1 neurons to disparity corrugations directly to an observer's ability to detect disparity gratings at the same eccentricity. In terms of stereoacuity, macaque thresholds are very similar to human thresholds, including when the stimuli have an interocular delay (Read and Cumming, 2005). Again, no study has directly compared neurometric and psychophysical thresholds for spatial modulation of disparity.
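The Gaussian argument above can be checked with a few lines of arithmetic. The sketch below assumes an idealized receptive-field envelope exp(-x²/(2σ²)), whose Fourier amplitude, normalized to 1 at zero frequency, is exp(-2π²σ²f²). The function name is hypothetical, and the test frequencies other than 4 cycles per degree are arbitrary choices.

```python
import math


def gaussian_amplitude_spectrum(freq_cpd, sigma_deg):
    """Fourier amplitude of exp(-x^2 / (2 sigma^2)), normalized to 1 at f = 0.

    freq_cpd is spatial frequency in cycles per degree; sigma_deg is the
    standard deviation of the Gaussian envelope in degrees.
    """
    return math.exp(-2.0 * (math.pi * sigma_deg * freq_cpd) ** 2)


sigma = 0.1  # degrees: the smallest receptive-field SD quoted in the text
for f in (1.0, 2.0, 4.0, 8.0):
    attenuation = gaussian_amplitude_spectrum(f, sigma)
    print(f"{f:4.1f} cycles/deg -> {100 * attenuation:.3g}% of peak")
```

With σ = 0.1°, the exponent at 4 cycles per degree is about -3.16, giving an attenuation of between 4% and 5% of the peak, in line with the figure quoted above.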
Our ability to detect spatial depth corrugations is subject to a disparity gradient limit: we cannot see changes more rapid than about 1° disparity per degree visual angle (Tyler, 1975;Burt and Julesz, 1980;McKee and Verghese, 2002;Banks et al., 2004;Filippini and Banks, 2009;Kane et al., 2014). Banks and colleagues (Banks et al., 2004;Filippini and Banks, 2009) have developed a computational model which shows how the disparity gradient limit arises naturally from the abovementioned properties of V1 neurons. The original model predicts that the disparity gradient limit should not apply to square-wave disparity corrugations, since their disparity is locally constant. If a square-wave corrugation is visible at a particular frequency at low amplitude, the model predicts it will remain visible as the disparity amplitude increases up to the fusional limit. This prediction is not borne out by human psychophysics (Allenmark and Read, 2010). However, the original model took no account of the size-disparity correlation discussed above. If the model is adjusted so as to include this, then square-wave corrugations with larger disparity amplitudes are detected by sensors with larger receptive fields and thus coarser spatial resolution. The modified model now agrees well with human psychophysics (Allenmark and Read, 2011). This is a good example of how human psychophysics, animal physiology and computational neuroscience can all contribute to a cycle of progressively refined understanding.
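To give a feel for what a gradient limit implies for sinusoidal corrugations, the following sketch works through the geometry. It rests on two assumptions that are not taken from the cited studies: that the limit applies to the peak local gradient of the corrugation, and that amplitude means half the peak-to-trough disparity. For a profile d(x) = A sin(2πfx), the peak gradient is 2πfA. The function names are hypothetical.

```python
import math


def peak_gradient(amplitude_deg, freq_cpd):
    """Peak disparity gradient (deg disparity per deg visual angle) of
    d(x) = A * sin(2 * pi * f * x)."""
    return 2.0 * math.pi * freq_cpd * amplitude_deg


def max_amplitude(freq_cpd, gradient_limit=1.0):
    """Largest amplitude (deg) whose peak gradient stays within the limit."""
    return gradient_limit / (2.0 * math.pi * freq_cpd)


# Example frequencies (cycles per degree) chosen only for illustration.
for f in (0.25, 0.5, 1.0, 2.0):
    print(f"{f:4.2f} cycles/deg -> amplitude up to {max_amplitude(f):.3f} deg")
```

Under these assumptions the permitted amplitude falls in inverse proportion to corrugation frequency. A square-wave corrugation has zero gradient everywhere except at its edges, which is why the original model exempts it from the limit.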
As we have seen, then, the properties of macaque V1 neurons closely explain the limits of human observers in several aspects of stereoscopic vision. It is worth pointing out that there are other aspects where this has not yet been demonstrated. For example, Prince et al. examined two other well-known results from human psychophysics: the decline in stereoacuity with eccentricity (Rawlings and Shipley, 1969) and with pedestal disparity (Blakemore, 1970); that is, the disparity separation required to discriminate two objects increases with their distance from fixation both in the visual field and in depth. Prince et al. were not able to account for this in their neuronal data; for example, neurometric thresholds were not correlated with receptive field eccentricity. As they point out, this could simply be because the relationship was swamped by other sources of variation in their data. It remains to be seen whether a relationship between neurometric threshold and eccentricity will be demonstrated in the future, or whether there is a different neuronal basis for the psychophysical effect. For example, we know that the distribution of preferred disparity in V1 is centered on near-zero disparities (Prince et al., 2002a). If neurons sensitive to disparity edges in higher visual areas are constructed by combining outputs of suitable V1 neurons (Bredfeldt et al., 2009), we would expect this to result in more neurons tuned to near-zero pedestal disparities. Analogously, perhaps greater sensitivity is achieved near the fovea simply because there are more disparity-tuned neurons near the fovea and this reduces the effective noise. Such population-based effects could not be reflected in neurometric thresholds derived from single neurons, and remain to be demonstrated. New physiological techniques, such as recording from many neurons simultaneously using Utah arrays, should enable psychophysical data to be related more directly to population, as well as single-unit, activity.
If V1 is the first place where binocular information is combined, then all stereoscopic information available to the observer must be available at least implicitly in V1, just as all monocular information must be available in the retina. Because disparity is not detected until V1, there are no extrastriate pathways for stereoscopic information and presumably, no stereo version of the blindsight observed in other domains (Weiskrantz, 1986). However, the converse is not true: not all stereoscopic information available in V1 is available to the observer. Stereo vision offers two striking examples. The first concerns the ''anti-correlated random-dot stereograms'' discussed above in the context of reproducibility. These produce almost no sensation of depth (Julesz, 1960;Cogan et al., 1993;Cumming et al., 1998;Hibbard et al., 2014); even the most sensitive observers can discriminate depth only about 75% of the time (Read and Eagle, 2000;Tanabe et al., 2008;Doi et al., 2011;Doi et al., 2013). In contrast, consider the cell shown in Fig. 2. A homunculus, or electrophysiologist, using this single cell to discriminate surfaces of ±0.1° would be 100% correct for correlated stereograms and 100% wrong for anti-correlated. Although the disparity of anti-correlated stereograms is reliably represented in V1, little if any of this information reaches consciousness. This makes sense, since the stereo system is looking for ''matches'' between the eyes that represent different views of the same object. Real objects do not usually appear black in one eye and white in the other (and when they do, for example due to specular reflection, there are different means of judging their depth (Muryy et al., 2013)). The stereo system has to detect the correct matches, while suppressing the response to false matches, for example by combining information across spatial scales.
As a side-effect, information contained in anti-correlated disparities is lost when the population activity is ''read out'' to form a depth percept. Thus, studying these highly unnatural anti-correlated stimuli can constrain computational models of stereopsis (Read and Eagle, 2000;Read, 2002a,b;Read and Cumming, 2007;Doi et al., 2011;Doi et al., 2013).
The second example is perhaps even more intriguing, since it is less obviously adaptive. As noted above, human stereo vision is more precise for depth judgments around fixation than for those about a pedestal disparity (Blakemore, 1970). McKee et al. (2005) showed that this is also true when the stimulus being discriminated is a sinusoidal grating. This is surprising because a sine-grating is a periodic stimulus. When the grating is given a pedestal disparity that is an integer multiple of its period, nothing changes in the stimulus except the location of its edges (Fig. 3). The observer perceives the grating at the depth signaled by the edges (McKee et al., 2004), and shows the reduction in stereoacuity normally associated with that depth (McKee et al., 2005). Experiments using this same stimulus in macaques had already shown that V1 neurons are not sensitive to the disparity signaled by the edges, but simply respond to the portion of the grating falling within their receptive field, which is the same in both cases. Assuming that humans and macaques are alike in this, a neurophysiologist recording from an appropriate V1 cell would show the same high stereoacuity for gratings, independent of the absolute depth signaled by their edges, whereas the organism itself would become less and less sensitive as the absolute depth increased. In a subsequent paper, McKee et al. (2007) further showed that over the course of a few seconds, the signal from the edges adapts and the observer then perceives the grating in the fixation plane rather than in the plane consistent with its edges (the wallpaper illusion, Brewster (1844)). The observer then displays the usual stereoacuity. McKee et al. concluded that second-order mechanisms which detect the edge disparities control whether and how visual awareness is able to access information contained in V1.
These two examples both provide insight into when and how the activity of sensory neurons results in conscious perception. They gain their power from the skillful combination of human psychophysics along with physiology. If visual psychophysics had been abandoned twenty years ago, such insights would be impossible.
Fig. 2. A far-type cell from Cumming and Parker (1997). Data show mean firing rate to dynamic random-dot stereograms. Filled symbols/solid curve are for correlated stimuli; empty symbols/dotted curve for anti-correlated.
Fig. 3. The task is to discriminate a small disparity in the sinusoidal grating (compare circles in the two images). When the edges of the grating are at the same position in the two eyes (A), observers are very sensitive to this disparity. When the edges have a large disparity that is an integer multiple of grating periods (B), observers are much less sensitive, even though many disparity-sensitive V1 neurons see identical images, and give the same response, in both cases.
Many unanswered questions remain concerning the neuronal mechanisms of stereoscopic depth perception. Nevertheless, huge progress has been made in the 40 years or so since the discovery of neurons tuned to binocular disparity. One result is that we can now identify several areas where perception is fundamentally constrained by the properties of primary visual cortex. This is particularly interesting given that in other areas, e.g. visual acuity or luminance detection, human abilities are constrained at the periphery, by the retina. Binocular stereopsis offers a window into constraints imposed specifically by the cerebral cortex. I hope that I have demonstrated the key role played by human psychophysics throughout this process.

CONCLUSION
Marr famously introduced three levels of analysis for neuronal systems: the computational (what problem does the system solve), the representational (what algorithms does it use to solve it) and the physical level (what neuronal structures implement it). Roughly speaking, psychophysics aims to study the first level, physiology the third, and computational modeling the intermediate, algorithmic level which links the two. All three levels are essential for a complete understanding of the system, and certainly a detailed knowledge of the physical level is essential if we wish to intervene in the system, e.g. through drug therapy. Yet arguably the computational level is the most fundamental, the one which people really mean when they ask ''how does the brain work?'' By addressing this level, psychophysics goes to the heart of understanding ourselves. Physiology is fascinating in its own right, but acquires its full meaning and significance when related to perceptual experience by psychophysics.
In the first part of this article, I spent some time on the distinguished history of psychophysics. Subsequently in my review of binocular stereopsis, in every case, the psychophysical observation came long before evidence of the neuronal properties which might account for it. Human psychophysics has set the agenda: in the discovery first of stereopsis and then of ''global stereopsis'' and cyclopean stimuli; in the introduction of complex stimuli such as anti-correlated stereograms; and in the formation of theories to be tested. While establishing the primacy of psychophysics, this may have risked giving the impression that psychophysics was an early technique that has since been supplanted. As stated above, I do not believe this is the case: the new techniques simply give us better tools for building a psychophysical understanding. In fact, physiology has constantly stimulated new psychophysics, and vice versa. As an example of the former, the discovery that neurons in cortical area MT are tuned both to direction of motion and to binocular disparity (Maunsell and Van Essen, 1983a,b) led us to examine human spatial resolution for gratings defined by direction/disparity conjunctions (Allenmark and Read, 2012). As an example of the latter, the Pulfrich illusion and related observations in human psychophysics (Pulfrich, 1922;Ross, 1974;Morgan and Thompson, 1975) led us to record from single neurons while monkeys viewed random-dot stereograms with interocular delay, in an attempt to elucidate the neuronal basis of these perceptual phenomena (Read and Cumming, 2005). These are just two examples of the continuing flow of ideas and stimuli between the different levels of Marr's hierarchy.
In this review, I have concentrated on ''classic'' psychophysics of the early sensory system. I stated that this probes the input/output relations of the system under study, but I did not stress the limitations of this ''black box'' approach. For example, psychophysics has little to say about how the motor system achieves the behavioral outputs it measures. Even within the sensory domain, it can be difficult to say with confidence how well psychophysical results really constrain the internal properties of the system, as opposed to, for example, merely reflecting limitations imposed by the choice of stimulus. Furthermore, classical psychophysics boils down the complexity of human experience to highly limited, quantitative judgments. For example, ''forced choice'' designs explicitly ignore subject motivation; requiring binary ''yes/no''-type judgments deliberately excludes confidence in the response or the qualitative nature of the perception. There have been attempts to bring psychophysical techniques to bear on more complex aspects of human experience than judging the relative brightness of lights, for example changes of mind (Resulaj et al., 2009), social exclusion (DeWall and Baumeister, 2006) or emotional sensitivity (Martin et al., 1996). Yet it is true that by excluding the more complex, qualitative aspects of our conscious experience, psychophysics often ignores what many consider the most important aspects of being human. The merit of this approach is that it simplifies the system enough to make it amenable to mathematical modeling and hypothesis testing. Similar idealizations in physics, though satirized in a hundred ''spherical cow'' jokes, have been hugely productive. As Sir Peter Medawar noted (1981), science is the art of the soluble. We hope that what we learn by studying simplified, abstracted basic perceptual abilities will ultimately help us in understanding more complex abilities and system properties.
For example, the uniform structure of the cortex all over the brain has long been cited as evidence that the brain may use a few canonical computations (Douglas et al., 1989;Stevens, 1994;Douglas and Martin, 2007). Concepts such as normalization (Carandini and Heeger, 2012), Bayesian networks (Knill and Richards, 1996;Ripley, 1996), inference by probabilistic population codes (Ma et al., 2006), correlated variability between neurons (Cohen and Kohn, 2011;Haefner et al., 2013) and evidence accumulation (Gold and Shadlen, 2007;Drugowitsch et al., 2012) may be of very broad applicability, and yet most easily approached through the study of low-level sensory inputs. Many of these concepts have been developed, influenced or tested by human psychophysics. Of course, to make progress, human psychophysics and computational modeling have to be combined with many other techniques, including those yet to be invented.
This point may also be worth emphasizing given continuing controversy about animal research. Without invasive physiology, we could still draw some broad conclusions about the workings of the nervous system by combining psychophysics and computational analysis alone, as Young and Helmholtz did so brilliantly in deducing trichromacy. However, the value of such study would be far more limited than when it is informed by animal physiology. Perhaps one day, non-invasive neuro-imaging techniques will progress to a point where they can replace invasive animal experiments. However, that day is far off. I am arguing the value of human psychophysics as a complement, certainly not a replacement, for other approaches.
Perhaps I should give the last word to Fechner, who as described by Stevens (1957) ''concluded his polemic of 1877 with a defiant five-line Nachwort'': ''The tower of Babel was never finished because the workers could not agree on how they should build it; my psychophysical edifice will stand because the workers will never agree on how to tear it down.''³ Some 160 years after Fechner's foundation of the field, his edifice is in fine shape; surrounded by many other fine buildings, but not remotely under threat of being torn down.