Brain Prediction of Auditory Emphasis by Facial Expressions During Audiovisual Continuous Speech

  • Original Paper
  • Published in: Brain Topography

Abstract

The visual cues involved in auditory speech processing are not restricted to lip movements; they also include head and chin gestures and facial expressions such as eyebrow movements. Because these visual gestures precede the auditory signal, visual information may influence auditory activity. For audiovisual syllables, the visual stimuli occur so close in time to the auditory information that the cortical response to them usually overlaps with the response to the auditory stimulation; the neural dynamics underlying visual facilitation of continuous speech therefore remain unclear. In this study, we used a three-word phrase to study continuous speech processing. We presented video clips of even (unemphasized) phrases as the frequent stimuli and clips in which the speaker visually emphasized one word as the non-frequent stimuli. A negativity in the resulting ERPs was detected after the onset of the emphasizing articulatory movements but before the auditory stimulus, a finding confirmed by statistical comparison of the audiovisual and visual stimulation. No such negativity was present in the visual-only control condition. This negativity was observed to propagate between the visual and fronto-temporal electrodes. Thus, in continuous speech, the visual modality evokes predictive coding for the auditory speech, which the cerebral cortex analyses in the context of the phrase even before the corresponding auditory signal arrives.
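The abstract describes a classic oddball ERP analysis: average the single-trial responses to frequent (standard) and non-frequent (deviant) stimuli, form the deviant-minus-standard difference wave, and test it statistically. A minimal sketch of that logic on synthetic data is shown below; the arrays, the injected effect window, and the `permutation_p` helper are all invented for illustration and are not the authors' actual pipeline or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-trial epochs for one channel: (n_trials, n_times).
# A "deviant" negativity is injected between samples 100 and 150,
# standing in for the pre-auditory negativity the study reports.
n_times = 300
standard = rng.normal(0.0, 1.0, size=(200, n_times))
deviant = rng.normal(0.0, 1.0, size=(60, n_times))
deviant[:, 100:150] -= 1.5  # simulated effect, arbitrary amplitude

# ERPs are trial averages; the effect of interest is the
# deviant-minus-standard difference wave.
erp_standard = standard.mean(axis=0)
erp_deviant = deviant.mean(axis=0)
difference = erp_deviant - erp_standard

def permutation_p(a, b, n_perm=1000, rng=rng):
    """Two-sample permutation test on the mean difference at one sample."""
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = perm[: a.size].mean() - perm[a.size:].mean()
        if abs(stat) >= abs(observed):
            count += 1
    # Add-one correction keeps the p-value away from exactly zero.
    return (count + 1) / (n_perm + 1)

# Test one sample inside the injected effect window.
p_effect = permutation_p(deviant[:, 125], standard[:, 125])
```

A real EEG analysis would run such tests across all channels and timepoints with a correction for multiple comparisons (e.g. cluster-based permutation), but the per-sample logic is the same.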



Acknowledgments

We thank E. Barbeau for help in the first pilot study and C. Marlot for the bibliography. This study was supported by the Human Frontier Science Program (to JMF), the DRCI Toulouse (Direction de la Recherche Clinique et de l’Innovation; to KS and MM), the ANR (Plasmody grant ANR-11-BSHS2-0008; to BP), and recurrent funding from the CNRS (to BP).

Author information

Corresponding author

Correspondence to Kuzma Strelnikov.

Additional information

Kuzma Strelnikov and Jessica Foxton are joint first authors

This is one of several papers published together in Brain Topography in the “Special Issue: Auditory Cortex 2012”.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (TIFF 1739 kb)

Supplementary material 2 (DOCX 10 kb)

About this article

Cite this article

Strelnikov, K., Foxton, J., Marx, M. et al. Brain Prediction of Auditory Emphasis by Facial Expressions During Audiovisual Continuous Speech. Brain Topogr 28, 494–505 (2015). https://doi.org/10.1007/s10548-013-0338-2

