Elsevier

Brain Research Reviews

Volume 61, Issue 2, October 2009, Pages 144-153

Review
Dorsal–ventral integration in object recognition

https://doi.org/10.1016/j.brainresrev.2009.05.006

Abstract

The idea of two parallel hierarchical pathways in vision has fueled a great deal of research and enhanced our understanding of visual processing in the brain. However, after 25 years, it has become clear that the earlier distinctions in terms of neuroanatomy and functional dissociation are less pure than originally considered. Dorsal visual areas may exhibit object-selective responses, and many 3-D cues of shape, particularly structure-from-motion, appear to be computed exclusively by dorsal areas. These findings imply a more important role for dorsal visual areas in object recognition than previously considered and also place restrictions on the nature of ventral object representations. These representations will need to include information about the objects in 3-D, making them more viewpoint-invariant. They will also need to be invariant to the 3-D cue used to describe them. Through the discussion of relevant findings in psychophysics, single-unit electrophysiology, neuroanatomy and functional imaging, I suggest that these qualities are indeed present in ventral stream representations. Thus, dorsal visual areas that extract the 3-D structure of shapes from certain cues can relate these representations to cue-invariant and view-invariant representations in the ventral stream.

Introduction

In 1982, an idea was presented that dramatically influenced thinking about the primate visual system. Ungerleider and Mishkin (1982), based on the pattern of behaviour following lesions to dorsal (occipito-parietal) and ventral (occipito-temporal) regions of the monkey cortex, suggested that the visual cortex can be decomposed into two pathways: a dorsal pathway concerned with spatial properties of vision (answering the question “where?”) and a ventral pathway concerned with identification of visual objects (answering the question “what?”). However, after 25 years, many challenges have been raised to that original elegant and simple view (Merigan and Maunsell, 1993; Hegde and Felleman, 2007), and an alternative description of the two pathways exists in terms of vision for perception (ventral stream) and vision for action (dorsal stream; Goodale and Milner, 1992). While the original model and its variant still serve as useful paradigms for interpreting results from psychophysics, neurophysiology, neuroanatomy, neuropsychology, and functional imaging, they are still evolving to incorporate newer findings. The objective of this article is to highlight a number of studies that together suggest the two pathways are functionally integrated in normal object recognition to enhance cue-invariant and viewpoint-invariant recognition by use of 3-D information. This may at first appear to contradict the original ideas of Ungerleider and Mishkin (1982) or those of Goodale and Milner (1992), but on closer inspection it will be evident that normal object recognition, and all the variable viewing conditions that may challenge it, necessitate the integrative action of these two streams.

First, the discussion will focus on the nature of object recognition in the ventral stream. It will be suggested that ventral object representations are largely viewpoint-invariant, although this invariance may not be represented at the single-cell level. Additionally, it will be suggested that familiarity with objects drives the development of representations that are more viewpoint-invariant. Finally, given that expertise with an object class requires extensive knowledge of and familiarity with many members of the class, it is suggested that viewpoint-invariant representations of individual members are achieved more readily for object categories in which one has developed expertise.

Second, the case is presented for dorsal–ventral integration in object recognition. Considering the primary discussion of viewpoint-invariant representations in the ventral pathway, it is suggested that shapes defined by 3-D cues that are dorsally extracted (particularly structure-from-motion) are ultimately processed by ventral stream mechanisms for recognition.

Through these two syntheses, it is proposed that normal object recognition likely requires the integrative action of the dorsal and ventral streams. This leads to several conjectures as to the properties of ventral stream representations, such as their invariance with respect to 3-D depth cues.

Section snippets

Distinctions in object recognition models

Models of visual object recognition can be divided along multiple, orthogonal dichotomies. The grandest dichotomy is between models that assume viewpoint-invariance in the neural representation of objects, and those that assume that viewpoint-invariant effects can be explained by the use of multiple individual viewpoints in an image-based manner. In the latter case, the brain interpolates intermediate views and thus allows us to recognize known objects from novel angles (Riesenhuber and Poggio,

Dorsal–ventral integration in object recognition

While plenty of data now exist to suggest that objects and shapes are indeed represented dorsally and that certain 3-D cues of shapes are uniquely computed in dorsal-stream mechanisms, it seems clear that what we normally consider object recognition takes place in the ventral cortex (see Peissig and Tarr, 2007; Reddy and Kanwisher, 2006, for reviews). A number of issues then require clarification. First, how does the shape selectivity of neurons in the dorsal stream relate to object recognition in

Conclusion

Taken together, it would appear that while neuroanatomical dissociations do exist between a dorsal and a ventral visual pathway, interpretations of the function of these streams are less certain. Specific tasks such as object and face recognition may not be subserved exclusively by ventral stream mechanisms, and there is some emerging evidence to suggest that certain aspects of object recognition, such as recognition of an object's orientation in space, may be processed by dorsal-stream mechanisms

Acknowledgments

I am grateful to Profs Ruffin Vogels and Guy Orban for their encouragement and comments on earlier drafts of this paper. This work was funded by a Doctoral Fellowship from the Fonds de la recherche en santé du Québec.

References (81)

  • Ishai, A. (2008). Let's face it: it's a cortical network. Neuroimage.
  • James, T.W., et al. (2002). Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron.
  • Jiang, F., et al. (2007). The role of familiarity in three-dimensional view-transferability of face identity adaptation. Vision Res.
  • Liu, C.H., et al. (2006). The use of 3D information in face recognition. Vision Res.
  • Logothetis, N.K., et al. (1995). Shape representation in the inferior temporal cortex of monkeys. Curr. Biol.
  • O'Toole, A.J., et al. (2002). Recognizing moving faces: a psychological and neural synthesis. Trends Cogn. Sci.
  • Orban, G.A., et al. (2006). Extracting 3D structure from disparity. Trends Neurosci.
  • Pitcher, D., et al. (2007). TMS evidence for the involvement of the right occipital face area in early face processing. Curr. Biol.
  • Reddy, L., et al. (2006). Coding of visual objects in the ventral stream. Curr. Opin. Neurobiol.
  • Ryu, J.J., et al. (2006). Representations of familiar and unfamiliar faces as revealed by viewpoint-aftereffects. Vision Res.
  • Sawamura, H., et al. (2006). Selectivity of neuronal adaptation does not match response selectivity: a single-cell study of the fMRI adaptation paradigm. Neuron.
  • Sereno, M.E., et al. (2002). Three-dimensional shape representation in monkey cortex. Neuron.
  • Steeves, J.K., et al. (2006). The fusiform face area is not sufficient for face recognition: evidence from a patient with dense prosopagnosia and no occipital face area. Neuropsychologia.
  • Tarr, M.J., et al. (2003). Learning to see faces and objects. Trends Cogn. Sci.
  • Valyear, K.F., et al. (2006). A double dissociation between sensitivity to changes in object identity and object orientation in the ventral and dorsal visual streams: a human fMRI study. Neuropsychologia.
  • Afraz, S.R., et al. (2006). Microstimulation of inferotemporal cortex influences face categorization. Nature.
  • Anderson, K.C., et al. (2005). Three-dimensional structure-from-motion selectivity in the anterior superior temporal polysensory area, STPa, of the behaving monkey. Cereb. Cortex.
  • Biederman, I. (1987). Recognition-by-components: a theory of human image understanding. Psychol. Rev.
  • Booth, M.C., et al. (1998). View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex.
  • Cox, D.D., et al. (2005). ‘Breaking’ position-invariant object recognition. Nat. Neurosci.
  • Farivar, R., et al. (2009). Dorsal–ventral integration in the recognition of motion-defined unfamiliar faces. J. Neurosci.
  • Farivar, R., Germann, J., Petrides, M., Blanke, O., & Chaudhuri, A. (2006). Dorsoventral integration for recognizing...
  • Galper, R.E. (1970). Recognition of faces in photographic negative. Psychon. Sci.
  • Gauthier, I., et al. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nat. Neurosci.
  • Gilaie-Dotan, S., et al. (2002). Shape-selective stereo processing in human object-related visual areas. Hum. Brain Mapp.
  • Grill-Spector, K., et al. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nat. Neurosci.
  • Gross, C.G., et al. (1969). Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science.
  • Grunewald, A., et al. (2002). Neural correlates of structure-from-motion perception in macaque V1 and MT. J. Neurosci.
  • Halgren, E., et al. (1999). Location of human face-selective cortex with respect to retinotopic areas. Hum. Brain Mapp.
  • Hegde, J., et al. (2007). Reappraising the functional implications of the primate visual anatomical hierarchy. Neuroscientist.