Review
Dorsal–ventral integration in object recognition
Introduction
In 1982, an idea was presented that dramatically influenced thinking about the primate visual system. Ungerleider and Mishkin (1982), based on the pattern of behaviour following lesions to dorsal (occipito-parietal) and ventral (occipito-temporal) regions of the monkey cortex, suggested that the visual cortex can be decomposed into two pathways: a dorsal pathway concerned with spatial properties of vision (answering the question “where?”) and a ventral pathway concerned with identification of visual objects (answering the question “what?”). However, after 25 years, many challenges have been raised to that original elegant and simple view (Merigan and Maunsell, 1993; Hegde and Felleman, 2007), and an alternative description of the two pathways exists in terms of vision for perception (ventral stream) and vision for action (dorsal stream; Goodale and Milner, 1992). While the original model and its variant still serve as useful paradigms for interpreting results from psychophysics, neurophysiology, neuroanatomy, neuropsychology, and functional imaging, they are still evolving to incorporate newer findings. The objective of this article is to highlight a number of studies that together suggest the two pathways are functionally integrated in normal object recognition to enhance cue-invariant and viewpoint-invariant recognition by use of 3-D information. This may at first appear to contradict the original ideas of Ungerleider and Mishkin (1982) or those of Goodale and Milner (1992), but on closer inspection it will be evident that normal object recognition, and all the variable viewing conditions that may challenge it, necessitate the integrative action of these two streams.
First, the discussion will focus on the nature of object recognition in the ventral stream. It will be suggested that ventral object representations are largely viewpoint-invariant, although this invariance may not be represented at the single-cell level. Additionally, it will be suggested that familiarity with objects drives the development of representations that are more viewpoint-invariant. Finally, given that expertise with an object class requires extensive knowledge of, and familiarity with, many members of the class, it is suggested that viewpoint-invariant representations of individual members are achieved more readily for object categories in which one has developed expertise.
Second, the case is presented for dorsal–ventral integration in object recognition. Considering the primary discussion of viewpoint-invariant representations in the ventral pathway, it is suggested that shapes defined by 3-D cues that are dorsally extracted (particularly structure-from-motion) are ultimately processed by ventral stream mechanisms for recognition.
Through these two syntheses, it is proposed that normal object recognition likely requires the integrative action of the dorsal and ventral streams. This leads to several conjectures as to the properties of ventral stream representations, such as their invariance with respect to 3-D depth cues.
Section snippets
Distinctions in object recognition models
Models of visual object recognition can be divided along multiple, orthogonal dichotomies. The grandest dichotomy is between models that assume viewpoint-invariance in the neural representation of objects, and those that assume that viewpoint-invariant effects can be explained by the use of multiple individual viewpoints in an image-based manner. In the latter case, the brain interpolates intermediate views and thus allows us to recognize known objects from novel angles (Riesenhuber and Poggio,
Dorsal–ventral integration in object recognition
While plenty of data now exist to suggest that objects and shapes are indeed represented dorsally and that certain 3-D cues of shapes are uniquely computed in dorsal-stream mechanisms, it seems clear that what we normally consider object recognition takes place in the ventral cortex (see Peissig and Tarr, 2007; Reddy and Kanwisher, 2006, for reviews). A number of issues then require clarification. First, how does the shape selectivity of neurons in the dorsal stream relate to object recognition in
Conclusion
Taken together, it would appear that while neuroanatomical dissociations do exist between the dorsal and ventral visual pathways, interpretations of the function of these streams are less certain. Specific tasks such as object and face recognition may not be subserved exclusively by ventral stream mechanisms, and there is some emerging evidence to suggest that certain aspects of object recognition, such as recognition of an object's orientation in space, may be processed by dorsal-stream mechanisms
Acknowledgments
I am grateful to Profs Rufin Vogels and Guy Orban for their encouragement and comments on earlier drafts of this paper. This work was funded by a Doctoral Fellowship from the Fonds de la recherche en santé du Québec.
References (81)
- et al. Perception of three-dimensional structure from motion. Trends Cogn. Sci. (1998)
- et al. Are face representations viewpoint dependent? A stereo advantage for generalizing across different views of faces. Vision Res. (2007)
- et al. The role of the corpus callosum and extra striate visual areas in stereoacuity in macaque monkeys. Neuropsychologia (1991)
- et al. Untangling invariant object recognition. Trends Cogn. Sci. (2007)
- et al. Anterior regions of monkey parietal cortex process visual 3D shape. Neuron (2007)
- et al. Viewer-centered object representation in the human visual system revealed by viewpoint aftereffects. Neuron (2005)
- et al. Separate visual pathways for perception and action. Trends Neurosci. (1992)
- et al. Cue-invariant activation in object-related areas of the human occipital lobe. Neuron (1998)
- et al. fMR-adaptation: a tool for studying the functional properties of human cortical neurons. Acta Psychol. (Amst) (2001)
- et al. Information and viewpoint dependence in face recognition. Cognition (1997)
- Let's face it: it's a cortical network. Neuroimage
- Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron
- The role of familiarity in three-dimensional view-transferability of face identity adaptation. Vision Res.
- The use of 3D information in face recognition. Vision Res.
- Shape representation in the inferior temporal cortex of monkeys. Curr. Biol.
- Recognizing moving faces: a psychological and neural synthesis. Trends Cogn. Sci.
- Extracting 3D structure from disparity. Trends Neurosci.
- TMS evidence for the involvement of the right occipital face area in early face processing. Curr. Biol.
- Coding of visual objects in the ventral stream. Curr. Opin. Neurobiol.
- Representations of familiar and unfamiliar faces as revealed by viewpoint-aftereffects. Vision Res.
- Selectivity of neuronal adaptation does not match response selectivity: a single-cell study of the fMRI adaptation paradigm. Neuron
- Three-dimensional shape representation in monkey cortex. Neuron
- The fusiform face area is not sufficient for face recognition: evidence from a patient with dense prosopagnosia and no occipital face area. Neuropsychologia
- Learning to see faces and objects. Trends Cogn. Sci.
- A double dissociation between sensitivity to changes in object identity and object orientation in the ventral and dorsal visual streams: a human fMRI study. Neuropsychologia
- Microstimulation of inferotemporal cortex influences face categorization. Nature
- Three-dimensional structure-from-motion selectivity in the anterior superior temporal polysensory area, STPa, of the behaving monkey. Cereb. Cortex
- Recognition-by-components: a theory of human image understanding. Psychol. Rev.
- View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex
- ‘Breaking’ position-invariant object recognition. Nat. Neurosci.
- Dorsal–ventral integration in the recognition of motion-defined unfamiliar faces. J. Neurosci.
- Recognition of faces in photographic negative. Psychon. Sci.
- Expertise for cars and birds recruits brain areas involved in face recognition. Nat. Neurosci.
- Shape-selective stereo processing in human object-related visual areas. Hum. Brain Mapp.
- The fusiform face area subserves face perception, not generic within-category identification. Nat. Neurosci.
- Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science
- Neural correlates of structure-from-motion perception in macaque V1 and MT. J. Neurosci.
- Location of human face-selective cortex with respect to retinotopic areas. Hum. Brain Mapp.
- Reappraising the functional implications of the primate visual anatomical hierarchy. Neuroscientist