Cortical regions underlying successful encoding of semantically congruent and incongruent associations between common auditory and visual objects
Highlights
► We used fMRI to examine creation of audio–visual memories of common objects.
► Successful encoding of congruent audio–visual memories activated right LOC.
► Successful encoding of incongruent audio–visual memories activated bilateral STG/STS.
► Creation of both congruent and incongruent audio–visual memories activated left IFG.
Introduction
Functional neuroimaging studies suggest that a distributed network of brain areas contributes to the processing of complex audio–visual (AV) stimuli such as familiar objects (reviewed in [8]). Audio–visual convergence effects have most consistently been found in (1) the inferior frontal gyrus (IFG) [3], [10], [16], [17], [23], particularly in the left hemisphere, (2) the superior temporal sulcus (STS) [2], [6], [10], [16], [17], [23], [24], [25], and (3) higher-level visual and auditory areas [3], [10], [24], [25], including the lateral occipital cortex (LOC) and superior temporal gyrus (STG). So far, however, very few studies (e.g., [22]) have investigated the neural substrates and mechanisms involved in the creation of multimodal memory traces.
Here, we focused on a ubiquitous but rarely investigated form of multisensory memory, namely, AV associative memories of common objects. In everyday life, we often simultaneously see and hear meaningful objects that provide us with concurrent streams of visual and auditory information. These situations can lead to the (incidental) creation of AV memory traces, which can be classified into two categories: congruent (e.g., when one hears a meowing sound while viewing a cat) and incongruent (e.g., when one is viewing a cat but hears a car). Whereas generation of a congruent memory trace requires establishing linkages between different perceptual properties of a single object, creation of an incongruent memory trace involves generating new conceptual associations between separate objects. Given this fundamental difference between the two types of AV memories, it is very likely that congruent and incongruent memories rely on different mechanisms mediated by distinct neural networks.
In the present study we used event-related functional magnetic resonance imaging (fMRI) in a subsequent memory paradigm [5], [26], to examine whether distinct neural systems mediate successful creation of associative memories for semantically congruent and incongruent pairs of object images and sounds.
Section snippets
Subjects
A total of 33 healthy, right-handed subjects with normal hearing and normal or corrected-to-normal vision participated in the experiment and were paid for participation. The mean age of the participants was 38.8 years. Three subjects were excluded after fMRI scanning because they had fewer than 10 remembered or 10 forgotten trials, and the data from the remaining 30 subjects were used for the first step of analyses, i.e., subsequent memory analysis regardless of semantic congruency. In the
Behavioral data
Regardless of semantic congruency, participants on average correctly recognized 0.55 (SD 0.13) of all AV pairs. The hit rate for congruent pairs (mean 0.69, SD 0.17) was significantly higher than the hit rate for incongruent pairs (mean 0.41, SD 0.15) [F(1,28) = 37.1, P < 0.001]. The mean false alarm rate (proportion of rearranged pairs to which subjects responded “yes”) was 0.23 (SD 0.15), yielding d-prime values of 1.85 and 0.93 when compared with the hit rates for intact congruent and incongruent pairs
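The d-prime values above come from the standard signal-detection formula d′ = z(H) − z(FA), where z is the inverse of the standard normal CDF. A minimal sketch of that computation (illustrative input values; the exact per-condition false-alarm rates behind the reported d′ figures are not given in this snippet):

```python
# Signal-detection sensitivity: d' = z(hit rate) - z(false-alarm rate),
# where z is the inverse standard normal CDF (quantile function).
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float) -> float:
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Illustrative call with the congruent hit rate (0.69) and the overall
# false-alarm rate (0.23) reported above; note the paper's d' values
# may rest on condition-specific false-alarm rates not shown here.
print(round(d_prime(0.69, 0.23), 2))
```

Rates of exactly 0 or 1 make the inverse CDF diverge, so applied work typically corrects extreme proportions (e.g., the log-linear or 1/(2N) adjustment) before computing d′.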
Discussion
The results of this study indicate that a distributed neural network encompassing regions in the frontal lobe and higher-order temporal and occipital cortices is involved in successful encoding of crossmodal associations between common auditory and visual objects. Critically, our findings reveal that the various regions of this network differentially contribute to the formation of semantically congruent and semantically incongruent AV memories. In the following, we will discuss the possible
References (27)
- et al., Prefrontal and hippocampal contributions to the generation and binding of semantic associations during successful encoding, Neuroimage (2006)
- et al., Integration of auditory and visual information about objects in superior temporal sulcus, Neuron (2004)
- et al., Detection of audio–visual integration sites in humans by application of electrophysiological criteria to the BOLD effect, Neuroimage (2001)
- et al., Semantics and the multisensory brain: how meaning modulates processes of audio–visual integration, Brain Res. (2008)
- et al., “Sculpting the response space”: an account of left prefrontal activation at encoding, Neuroimage (2000)
- et al., The role of multisensory memories in unisensory object discrimination, Brain Res. Cogn. Brain Res. (2005)
- et al., The brain uses single-trial multisensory memories to discriminate without awareness, Neuroimage (2005)
- et al., Rapid discrimination of visual and multisensory memories revealed by electrical neuroimaging, Neuroimage (2004)
- et al., Integration of letters and speech sounds in the human brain, Neuron (2004)
- et al., Audio–visual crossmodal interactions in environmental perception: an fMRI investigation, Cogn. Process. (2004)
- Making memories: brain activity that predicts how well visual experience will be remembered, Science
- Neuroanatomical correlates of episodic encoding and retrieval in young and elderly subjects, Brain