Review
A model of the temporal dynamics of multisensory enhancement

https://doi.org/10.1016/j.neubiorev.2013.12.003

Highlights

  • Multisensory responses begin, rise, and peak earlier than unisensory responses.

  • Different computations are evident in different portions of the response.

  • Essential empirical findings are captured by a minimalist neural network model.

  • This model is easily inserted into broader artificial implementations.

Abstract

The senses transduce different forms of environmental energy, and the brain synthesizes information across them to enhance responses to salient biological events. We hypothesize that the potency of multisensory integration is attributable to the convergence of independent and temporally aligned signals derived from cross-modal stimulus configurations onto multisensory neurons. The temporal profile of multisensory integration in neurons of the deep superior colliculus (SC) is consistent with this hypothesis. The responses of these neurons to visual, auditory, and combinations of visual–auditory stimuli reveal that multisensory integration takes place in real time; that is, the input signals are integrated as soon as they arrive at the target neuron. Interactions between cross-modal signals may appear to reflect linear or nonlinear computations on a moment-by-moment basis, the aggregate of which determines the net product of multisensory integration. Modeling observations presented here suggest that the early nonlinear components of the temporal profile of multisensory integration can be explained with a simple spiking neuron model, and do not require more sophisticated assumptions about the underlying biology. A transition from nonlinear “super-additive” computation to linear, additive computation can be accomplished via scaled inhibition. The findings provide a set of design constraints for artificial implementations seeking to exploit the basic principles and potency of biological multisensory integration in contexts of sensory substitution or augmentation.

Introduction

The evolution of multiple sensory systems has enhanced the likelihood of survival for organisms living in a wide variety of environments. This is not only because the senses substitute for one another when necessary, but also because they can interact synergistically, thereby providing far more information about external events than would otherwise be possible. The synergy arises because the different senses are not corrupted by the same sources of noise, so combining their conditionally independent estimates of the same event yields a better analysis of its features (Ernst and Banks, 2002). This advantage manifests physiologically as enhancements in the speed and robustness of reactions to concordant cross-modal stimuli (Rowland et al., 2007a, Rowland and Stein, 2008), which in turn lead to faster and more accurate behavioral responses to the originating event (Meredith and Stein, 1983, Gielen et al., 1983, Perrott et al., 1990, Hughes et al., 1994, Frens et al., 1995, Wilkinson et al., 1996, Goldring et al., 1996, Jiang et al., 2002). Such enhancements are particularly beneficial when the information provided by the inputs is otherwise impoverished and/or unreliable; that is, circumstances in which their individual utilities are minimized (Stein and Meredith, 1993).
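The benefit of combining conditionally independent estimates can be illustrated with the standard inverse-variance (reliability-weighted) rule for two Gaussian estimates. The sketch below is only an illustration of that general principle, not a computation taken from this paper; the function name `combine_estimates` and all numerical values are hypothetical.

```python
def combine_estimates(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates by inverse-variance weighting.

    Assumes the two estimates have independent, zero-mean Gaussian noise.
    Returns the fused mean and the fused variance.
    """
    rel_a = 1.0 / var_a  # reliability of estimate a
    rel_b = 1.0 / var_b  # reliability of estimate b
    w_a = rel_a / (rel_a + rel_b)
    w_b = 1.0 - w_a
    fused_mean = w_a * mean_a + w_b * mean_b
    fused_var = 1.0 / (rel_a + rel_b)
    return fused_mean, fused_var


# Hypothetical example: a visual location estimate of 10 deg (variance 4)
# and an auditory estimate of 14 deg (variance 12).
fused_mean, fused_var = combine_estimates(10.0, 4.0, 14.0, 12.0)
print(fused_mean, fused_var)
```

Under these assumptions the fused variance is never larger than the smaller of the two input variances, which is the sense in which the combined estimate is "better" than either estimate alone.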

The best-studied system in which this occurs is the mammalian superior colliculus (SC), which mediates the detection, localization, and orientation toward environmental targets (Meredith et al., 1987, Stein and Meredith, 1993). Individual neurons within the SC are sensitive to cues derived from different sensory modalities (e.g., vision, audition, and somatosensation) within circumscribed and overlapping regions of space (Stein and Arigbede, 1972). When stimulated by cross-modal cues within their respective receptive fields (RFs), their net evoked response magnitude (i.e., total number of impulses) is elevated above the response magnitude evoked by only one of the cues individually (“multisensory enhancement”). For robust stimuli, the enhanced response typically reflects the sum of the net unisensory response magnitudes, but can be greater than this sum when the unisensory responses are less robust.
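These two comparisons (against the best unisensory response, and against the unisensory sum) can be made concrete as follows. This is a minimal sketch: expressing enhancement as a percentage of the most effective unisensory response is a common convention in this literature and is assumed here rather than quoted from the text above, and the function names and impulse counts are hypothetical.

```python
def enhancement_index(cm, v, a):
    """Percent change of the cross-modal response relative to the larger
    of the two unisensory responses (all in mean impulses per trial)."""
    best = max(v, a)
    if best <= 0:
        raise ValueError("best unisensory response must be positive")
    return 100.0 * (cm - best) / best


def classify_additivity(cm, v, a, tol=0.0):
    """Compare the cross-modal response with the sum of unisensory responses.

    tol is an assumed tolerance (in impulses) for calling a response additive.
    """
    total = v + a
    if cm > total + tol:
        return "superadditive"
    if cm < total - tol:
        return "subadditive"
    return "additive"


# Hypothetical weak stimuli: small unisensory responses.
weak_index = enhancement_index(cm=6.0, v=2.0, a=1.0)
weak_class = classify_additivity(cm=6.0, v=2.0, a=1.0)

# Hypothetical robust stimuli: large unisensory responses.
robust_index = enhancement_index(cm=18.0, v=10.0, a=8.0)
robust_class = classify_additivity(cm=18.0, v=10.0, a=8.0)

print(weak_index, weak_class)
print(robust_index, robust_class)
```

In practice a statistical test across trials would replace the fixed tolerance; the fixed value is used here only to keep the sketch short.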

However, recent analyses examining the temporal profile of multisensory enhancement suggest that this enhancement is not uniform over the duration of the response (i.e., the entire discharge train). As the multisensory response rises and falls, its instantaneous firing rate (IFR) rarely reflects a simple addition of the component unisensory firing rates, even when the overall enhancement in the net response magnitude is consistent with an additive model (Rowland et al., 2007a). Rather, response enhancements are proportionally largest at the beginning of the response, which leads to earlier-than-expected response onsets (Rowland et al., 2007a, Rowland and Stein, 2008). The timing and magnitude of these multisensory enhancements, especially when occurring early in the discharge train, have the potential to greatly influence downstream circuits responsible for overt behavioral responses, as well as other targets involved in higher-order perceptual processes. The operational principles of these neurons are a subject of great interest to basic scientists and to researchers in applied domains seeking to engineer devices for sensory augmentation and substitution. However, most computational approaches to understanding multisensory integration in the SC have been restricted to describing its net products (e.g., Anastasio et al., 2000, Rowland et al., 2007b, Cuppini et al., 2010), not its moment-to-moment operations.
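One simple way to look at the response on a finer time scale is to compare the multisensory firing rate with the additive prediction bin by bin. The sketch below uses fixed-width bins as a coarse stand-in for the instantaneous firing rate; it is not the analysis method of the cited studies, and the spike times are invented purely for illustration.

```python
def binned_rate(spike_times_ms, n_trials, t_end_ms, bin_ms):
    """Mean firing rate (impulses/s) in consecutive bins from stimulus onset."""
    n_bins = int(t_end_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int(t // bin_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [1000.0 * c / (n_trials * bin_ms) for c in counts]


def deviation_from_additive(multi, vis, aud):
    """Per-bin difference between the multisensory rate and the unisensory sum."""
    return [m - (v + a) for m, v, a in zip(multi, vis, aud)]


# Hypothetical spike times (ms after stimulus onset), one trial per condition.
vis_spikes = [62, 75, 88, 104]
aud_spikes = [58, 90]
multi_spikes = [45, 52, 57, 66, 78, 91, 105]

vis_rate = binned_rate(vis_spikes, n_trials=1, t_end_ms=120, bin_ms=20)
aud_rate = binned_rate(aud_spikes, n_trials=1, t_end_ms=120, bin_ms=20)
multi_rate = binned_rate(multi_spikes, n_trials=1, t_end_ms=120, bin_ms=20)

deviation = deviation_from_additive(multi_rate, vis_rate, aud_rate)
print(deviation)
```

With these made-up numbers the net counts are close to additive (7 impulses versus 4 + 2), yet the deviation is concentrated in an early bin, which is the kind of pattern a net-magnitude analysis cannot reveal.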

The purpose of this paper is to describe how the nonlinearities evident at the beginning of the multisensory response can be explained by a simple spiking model of SC multisensory integration, and do not require more complex assumptions about the biological substrate. At a coarse temporal resolution, the behavior of this model is similar to those described previously. However, at the level of resolution addressed here, the timing and “shape” of the inputs are revealed as key determinants of the integrated multisensory response. The model thereby makes the neurobiological computations underlying the multisensory response more explicit.
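To give a sense of how a threshold nonlinearity alone can produce early, super-additive interactions, the following sketch simulates a generic leaky integrate-and-fire unit driven by step currents. It is not the authors' model: the noise current and scaled inhibition described in the paper are omitted, and the function `simulate_lif`, the time constant, the threshold, and the input amplitudes are all illustrative assumptions.

```python
def simulate_lif(inputs, t_end=120.0, dt=0.1, tau=10.0, threshold=1.0):
    """Euler-integrate a leaky integrate-and-fire unit; return spike times (ms).

    inputs: list of (onset_ms, offset_ms, amplitude) step currents that are
    summed linearly before driving the membrane variable.
    """
    v = 0.0
    spikes = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        t = step * dt
        drive = sum(amp for on, off, amp in inputs if on <= t < off)
        v += dt * (-v + drive) / tau  # leaky integration toward the drive
        if v >= threshold:
            spikes.append(t)
            v = 0.0  # reset after an impulse
    return spikes


# Hypothetical weak inputs: each stays below threshold when presented alone.
weak_v = (10.0, 110.0, 0.7)
weak_a = (10.0, 110.0, 0.6)
n_v = len(simulate_lif([weak_v]))
n_a = len(simulate_lif([weak_a]))
n_va = len(simulate_lif([weak_v, weak_a]))

# Hypothetical strong inputs: each evokes impulses when presented alone.
strong_v = (10.0, 110.0, 1.5)
strong_a = (10.0, 110.0, 1.4)
lat_v = simulate_lif([strong_v])[0]
lat_a = simulate_lif([strong_a])[0]
lat_va = simulate_lif([strong_v, strong_a])[0]

print(n_v, n_a, n_va)
print(lat_v, lat_a, lat_va)
```

In this toy setting the weak inputs evoke no impulses individually but do so together, and the combined strong inputs reach threshold sooner than either alone. Reproducing the later transition to additive summation would require further mechanisms, such as the scaled inhibition the paper proposes.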

Empirical observations

In multisensory SC neurons, concordant cross-modal signals typically evoke responses containing more impulses (i.e., enhanced net response magnitude), higher firing rates, longer durations, and shorter latencies than do their individual component stimuli (Stein and Meredith, 1993). The magnitude of the total multisensory response is generally related to the efficacy of the component stimuli: typically greater than the sum of these constituent unisensory response magnitudes when they are

Discussion

Below we summarize the relationship between the properties of the model, its underlying assumptions, and the key results.

A critical feature of the model is the amplification of the neuron's responsiveness to inputs that would not otherwise evoke impulses; that is, “stochastic resonance” (Benzi et al., 1981). This is provided by the noise current source as well as any input modalities whose values are (instantaneously) insufficient to generate impulses on their own. Because each input signal

Acknowledgements

This research was supported by NIH grants EY016716 and NS036916, and a grant from the Tab Williams Foundation.

References (37)

  • C. Cuppini et al. An emergent model of multisensory integration in superior colliculus neurons. Front. Integr. Neurosci. (2010)
  • S.B. Edwards et al. Sources of subcortical projections to the superior colliculus in the cat. J. Comp. Neurol. (1979)
  • M.O. Ernst et al. Humans integrate visual and haptic information in a statistically optimal fashion. Nature (2002)
  • C.R. Fetsch et al. Neural correlates of reliability-based cue weighting during multisensory integration. Nat. Neurosci. (2012)
  • M.A. Frens et al. Spatial and temporal factors determine auditory–visual interactions in human saccadic eye movements. Percept. Psychophys. (1995)
  • F. Gabbiani et al. Principles of spike train analysis
  • S.C. Gielen et al. On the nature of intersensory facilitation of reaction time. Percept. Psychophys. (1983)
  • J.E. Goldring et al. Combined eye–head gaze shifts to visual and auditory targets in humans. Exp. Brain Res. (1996)