
Auditory-Stream Formation

The Perceptual Structure of Sound

Part of the book series: Current Research in Systematic Musicology (CRSM, volume 11)


Abstract

Chapter 4 ended with the remark that the formation of an auditory unit can be strongly affected by sound components that precede and follow that auditory unit. This chapter describes the process by which successive auditory units are linked to each other to form auditory streams. An auditory stream is a sequence of auditory units that are perceived as coming from one and the same sound source. Examples are the successive syllables of a speech utterance and the sequence of tones that together form a melody. The result of this complex process of auditory-stream formation is an auditory scene consisting of more or less well-defined auditory streams, only one of which can be attended to effortlessly. Moreover, when there are more than three or four sound sources, listeners generally underestimate the number of sound sources in an auditory scene. The most important characteristic of an auditory stream is that the successive auditory units are temporally coherent, i.e., they are well ordered in time and their beats form a well-defined rhythm. Establishing temporal coherence between successive auditory units is a complex process that depends on many factors. These factors can be relatively simple, such as the pitch and the timbre of successive auditory units, or more complex, such as the familiarity of the sounds. The result appears to be a very flexible and adaptive system that also operates well in very noisy circumstances such as bustling restaurants or cocktail parties. When segments of an auditory stream are masked by other sounds, the auditory system is highly capable of restoring this information in such a way that the listener is not aware of the restoration. Besides having a well-defined rhythm, auditory streams have well-defined loudness contours and, at least if the constituent auditory units have pitch, well-defined pitch contours, but these contours are not perceived independently of each other. In addition, the consonant and dissonant relations between the parallel streams of musical scenes are described. The chapter ends with a description of three different approaches to computational modelling of the auditory-stream-formation process.


References

  1. ’t Hart J, (1991) \(F_{0}\) stylization in speech: straight lines versus parabolas. J Acoust Soc Am 90(6):3368–3370. https://doi.org/10.1121/1.401396

  2. ’t Hart J, Collier R, Cohen A, (1990) A perceptual study of intonation: an experimental-phonetic approach to speech melody. Cambridge University Press, Cambridge, UK

    Google Scholar 

  3. Abercrombie D (1967) Elements of general phonetics. Edinburgh University Press, Edinburgh, UK. https://doi.org/10.1515/9781474463775

    Article  Google Scholar 

  4. Aggelopoulos NC et al (2020) Predictive cues for auditory stream formation in humans and monkeys. Eur J Neurosci 51:1254–1264. https://doi.org/10.1121/10.0001349

    Article  Google Scholar 

  5. Agres KR, Krumhansl CL (2008) Musical change deafness: the inability to detect change in a non-speech auditory domain. In: Proceedings of the 30th annual meeting of the cognitive science society Washington, DC, vol 30, pp 969–974. https://escholarship.org/uc/item/84z5g0j7

  6. Agus TR, Pressnitzer D (2021) Repetition detection and rapid auditory learning for stochastic tone clouds. J Acoust Soc Am 150(3):1735–1749. https://doi.org/10.1121/10.0005935

    Article  Google Scholar 

  7. Agus TR, Thorpe SJ, Pressnitzer D (2010) Rapid formation of robust auditory memories: Insights from noise. Neuron 66(4):610–618. https://doi.org/10.1016/j.neuron.2010.04.014

    Article  Google Scholar 

  8. Aitchison L, Lengyel M (2017) With or without you: predictive coding and Bayesian inference in the brain. Curr Opin Neurobiol 46:219–227. https://doi.org/10.1016/j.conb.2017.08.010

    Article  Google Scholar 

  9. Akre KL et al (2014) Harmonic calls and indifferent females: no preference for human consonance in an anuran. Proc Roy Soc B: Biol Sci 281(1789):20140986, 5 p. https://doi.org/10.1098/rspb.2014.0986

  10. Alain C, Bernstein LJ (2015) Auditory scene analysis: tales from cognitive neurosciences. Music Percept: Interdiscipl J 33(1):70–82. https://doi.org/10.1525/mp.2015.33.1.70

    Article  Google Scholar 

  11. Alain C, Bernstein LJ (2008) From sounds to meaning: the role of attention during auditory scene analysis. Curr Opin Otolaryngol Head Neck Surg 16(5):485–489. https://doi.org/10.1097/MOO.0b013e32830e2096

    Article  Google Scholar 

  12. Alain C et al (2001) ‘What’ and ‘where’ in the human auditory system. Proc Natl Acad Sci 98(21):12301–12306. https://doi.org/10.1073/pnas.211209098

    Article  Google Scholar 

  13. Albouy P et al (2019) Specialized neural dynamics for verbal and tonal memory: fMRI evidence in congenital amusia. Hum Brain Mapp 40(3):855–867. https://doi.org/10.1002/hbm.24416

    Article  Google Scholar 

  14. Allman MJ et al (2014) Properties of the internal clock: first-and second-order principles of subjective time. Ann Rev Psychol 65:743–771. https://doi.org/10.1146/annurev-psych-010213-115117

    Article  Google Scholar 

  15. Alluri V et al (2012) Large-scale brain networks emerge from dynamic processing of musical timbre, key and rhythm. Neuroimage 59(4):3677–3689. https://doi.org/10.1016/j.neuroimage.2011.11.019

    Article  Google Scholar 

  16. Andreou L-V, Griffiths TD, Chait M (2011) The role of temporal regularity in auditory segregation. Hear Res 280(1):228–235. https://doi.org/10.1016/j.heares.2011.06.001

    Article  Google Scholar 

  17. Angulo-Perkins A, Concha L (2019) Discerning the functional networks behind processing of music and speech through human vocalizations. PLoS ONE 14(10):e0222796, 19 p. https://doi.org/10.1371/journal.pone.0222796

  18. ANSI (1994) ANSI S1.1-1994. American National Standard acoustical terminology. New York, NY

    Google Scholar 

  19. Anstis SM, Saida S (1985) Adaptation to auditory streaming of frequency-modulated tones. J Exp Psychol Hum Percept Perform 11(3):257–271. https://doi.org/10.1037/0096-1523.11.3.257

    Article  Google Scholar 

  20. Araya-Salas M (2012) Is birdsong music? Evaluating harmonic intervals in songs of a Neotropical songbird. Anim Behav 84(2):309–313. https://doi.org/10.1016/j.anbehav.2012.04.038

    Article  Google Scholar 

  21. Arnal LH, Giraud A-L (2012) Cortical oscillations and sensory predictions. Trends Cognit Sci 16(7):390–398. https://doi.org/10.1016/j.tics.2012.05.003

    Article  Google Scholar 

  22. Arvaniti A (2012) Rhythm classes and speech perception. In: Niebuhr O (ed) Understanding prosody: the role of context, function and communication. Walter de Gruyter GmbH, Germany, pp 75–92

    Chapter  Google Scholar 

  23. Arvaniti A (2012) The usefulness of metrics in the quantification of speech rhythm. J Phonet 40(3):351–373. https://doi.org/10.1016/j.wocn.2012.02.003

    Article  Google Scholar 

  24. Asano R, Boeckx C (2015) Syntax in language and music: what is the right level of comparison? Front Psychol 6 Article 942, 16 p. https://doi.org/10.3389/fpsyg.2015.00942

  25. Attneave F, Olson RK (1971) Pitch as a medium: a new approach to psychophysical scaling. Am J Psychol 84(2):147–166. https://doi.org/10.2307/1421351

    Article  Google Scholar 

  26. Aubanel V, Davis C, Kim J (2016) Exploring the role of brain oscillations in speech perception in noise: intelligibility of isochronously retimed speech. Front Hum Neurosci 10, Article 430, 11 p. https://doi.org/10.3389/fnhum.2016.00430

  27. Auksztulewicz R et al (2018) Not all predictions are equal: ‘What’ and ‘When’ predictions modulate activity in auditory cortex through different mechanisms. J Neurosci 38(40):8680–8693. https://doi.org/10.1523/JNEUROSCI.0369-18.2018

    Article  Google Scholar 

  28. Aures W (1985) Ein berechnungsverfahren der Rauhigkeit. Acustica 58(5):268–281

    Google Scholar 

  29. Awh E, Belopolsky AV, Theeuwes J (2012) Top-down versus bottom-up attentional control: a failed theoretical dichotomy. Trends Cognit Sci 16(8):437–443. https://doi.org/10.1016/j.tics.2012.06.010

    Article  Google Scholar 

  30. Bååth R, Madison G (2012) The subjective difficulty of tapping to a slow beat. In: Proceedings of the 12th international conference on music perception and cognition, Thessaloniki, Greece, pp 82–85. Accessed from 23–28 July 2012

    Google Scholar 

  31. Bååth R, Tjøstheim TA, Lingonblad M (2016) The role of executive control in rhythmic timing at different tempi. Psychonomic Bull Rev 23(6):1954–1960. https://doi.org/10.3758/s13423-016-1070-1

    Article  Google Scholar 

  32. Bachem A (1955) Absolute pitch. J Acoust Soc Am 27(6):1180–1185. https://doi.org/10.1121/1.1908155

    Article  Google Scholar 

  33. Baldeweg T (2007) ERP repetition effects and mismatch negativity generation: a predictive coding perspective. J Psychophysiol 21(3–4):204–213. https://doi.org/10.1027/0269-8803.21.34.204

    Article  Google Scholar 

  34. Barbosa PA (2007) From syntax to acoustic duration: a dynamical model of speech rhythm production. Speech Commun 49(9):725–742. https://doi.org/10.1016/j.specom.2007.04.013

    Article  Google Scholar 

  35. Barnes R, Johnston H (2010) The role of timing deviations and target position uncertainty on temporal attending in a serial auditory pitch discrimination task. Quart J Exp Psychol 63(2):341–355. https://doi.org/10.1080/17470210902925312

    Article  Google Scholar 

  36. Barnes R, Jones MR (2000) Expectancy, attention, and time. Cognit Psychol 41(3):254–311. https://doi.org/10.1006/cogp.2000.0738

    Article  Google Scholar 

  37. Barniv D, Nelken I (2015) Auditory streaming as an online classification process with evidence accumulation. PLoS ONE 10(12):e0144788, 20 p. https://doi.org/10.1371/journal.pone.0144788

  38. Bashford JA Jr, Riener KR, Warren RM (1992) Increasing the intelligibility of speech through multiple phonemic restorations. Percept Psychophys 51(3):211–217. https://doi.org/10.3758/BF03212247

    Article  Google Scholar 

  39. Bashford JA Jr, Warren RM, Brown CA (1996) Use of speech-modulated noise adds strong ‘bottom-up’ cues for phonemic restoration. Percept Psychophys 58(3):342–350. https://doi.org/10.3758/BF03206810

    Article  Google Scholar 

  40. Basirat A, Schwartz J-L, Sato M (2012) Perceptuo-motor interactions in the perceptual organization of speech: evidence from the verbal transformation effect. Philos Trans Roy Soc B Biol Sci 367(1591):965–976. https://doi.org/10.1098/rstb.2011.0374

    Article  Google Scholar 

  41. Bauer A-KR et al (2015) The auditory dynamic attending theory revisited: a closer look at the pitch comparison task. Brain Res 1626:198–210. https://doi.org/10.1016/j.brainres.2015.04.032

  42. Beauvois MW (1998) The effect of tone duration on auditory stream formation. Percept Psychophys 60(5):852–861. https://doi.org/10.3758/BF03206068

    Article  Google Scholar 

  43. Beauvois MW, Meddis R (1991) A computer model of auditory stream segregation. Quart J Exp Psychol Sect A: Hum Exp Psychol 43(3):517–541. https://doi.org/10.1080/14640749108400985

    Article  Google Scholar 

  44. Beauvois MW, Meddis R (1996) Computer simulation of auditory stream segregation in alternating-tone sequences. J Acoust Soc Am 99(4):2270–2280. https://doi.org/10.1121/1.415414

    Article  Google Scholar 

  45. Beauvois MW, Meddis R (1997) Time decay of auditory stream biasing. Percept Psychophys 59(1):81–86. https://doi.org/10.3758/BF03206850

    Article  Google Scholar 

  46. Beier EJ, Ferreira F (2018) The temporal prediction of stress in speech and its relation to musical beat perception. Front Psychol 9, Article 431, 6 p. https://doi.org/10.3389/fpsyg.2018.00431

  47. Beim JA, Oxenham AJ, Wojtczak M (2019) No effects of attention or visual perceptual load on cochlear function, as measured with stimulus-frequency otoacoustic emissions. J Acoust Soc Am 146(2):1475–1491. https://doi.org/10.1121/1.5123391

    Article  Google Scholar 

  48. Benard MR, Mensink JS, Başkent D (2014) Individual differences in top-down restoration of interrupted speech: Links to linguistic and cognitive abilities. J Acoust Soc Am 135(2):3072–3084. https://doi.org/10.1121/1.4862879

    Article  Google Scholar 

  49. Bendixen A (2014) Predictability effects in auditory scene analysis: a review. Front Neurosci 8, Article 60, 16 p. https://doi.org/10.3389/fnins.2014.00060

  50. Bendixen A, Denham SL, Winkler I (2014) Feature predictability flexibly supports auditory stream segregation or integration. Acta Acust Acust 1000(5):888–899. https://doi.org/10.3813/AAA.918768

    Article  Google Scholar 

  51. Bendixen A, SanMiguel I, Schröger E (2012) Early electrophysiological indicators for predictive processing in audition: a review. Int J Psychophysiol 83(2):120–131. https://doi.org/10.1016/j.ijpsycho.2011.08.003

    Article  Google Scholar 

  52. Bendixen A et al (2010) Regular patterns stabilize auditory streams. J Acoust Soc Am 128(6):3658–3666. https://doi.org/10.1121/1.3500695

    Article  Google Scholar 

  53. Besson M, Schön D (2001) Comparison between language and music. Ann NY Acad Sci 930(1):232–258. https://doi.org/10.1111/j.1749-6632.2001.tb05736.x

    Article  Google Scholar 

  54. Bey C, McAdams S (2003) Postrecognition of interleaved melodies as an indirect measure of auditory stream formation. J Exp Psychol Hum Percept Perform 29(2):267–279. https://doi.org/10.1037/0096-1523.29.2.267

    Article  Google Scholar 

  55. Bey C, McAdams S (2002) Schema-based processing in auditory scene analysis. Percept Psychophys 64(5):844–854. https://doi.org/10.3758/BF03194750

    Article  Google Scholar 

  56. Bidelman GM, Krishnan A (2011) Brainstem correlates of behavioral and compositional preferences of musical harmony. NeuroReport 22(5):212–219. https://doi.org/10.1097/WNR.0b013e328344a689

    Article  Google Scholar 

  57. Billig AJ, Carlyon RP (2016) Automaticity and primacy of auditory streaming: concurrent subjective and objective measures. J Exp Psychol Hum Percept Perform 42(3):339–353. https://doi.org/10.1037/xhp0000146

    Article  Google Scholar 

  58. Billig AJ, Davis MH, Carlyon RP (2018) Neural decoding of bistable sounds reveals an effect of intention on perceptual organization. J Neurosci 38(11):2844–2853. https://doi.org/10.1523/JNEUROSCI.3022-17.2018

    Article  Google Scholar 

  59. Billig AJ et al (2013) Lexical influences on auditory streaming. Curr Biol 23(16):1585–1589. https://doi.org/10.1016/j.cub.2013.06.042

    Article  Google Scholar 

  60. Bizley JK, Cohen YE (2013) The what, where and how of auditory-object perception. Nat Rev Neurosci 14(10):693–707. https://doi.org/10.1038/nrn3565

    Article  Google Scholar 

  61. Bizley JK et al (2009) Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. J Neurosci 29(7):2064–2075. https://doi.org/10.1523/JNEUROSCI.4755-08.2009

    Article  Google Scholar 

  62. Blauert J, Braasch J (2020) The technology of binaural understanding. Springer Nature Switzerland AG, Cham, Switzerland. https://doi.org/10.1007/978-3-030-00386-9

  63. Bogacz R (2017) A tutorial on the free-energy framework for modelling perception and learning. J Math Psychol 76(B):198–211. https://doi.org/10.1016/j.jmp.2015.11.003

  64. Bolton TL (1894) Rhythm. Am J Psychol 6(2):145–238. https://doi.org/10.2307/1410948

    Article  Google Scholar 

  65. Botte M-C et al (1997) Perceptual attenuation of nonfocused auditory streams. Percept Psychophys 59(3):419–425. https://doi.org/10.3758/BF03211908

    Article  Google Scholar 

  66. Bouwer FL, Honing H, Slagter HA (2020) Beat-based and memory-based temporal expectations in rhythm: similar perceptual effects, different underlying mechanisms. J Cognit Neurosc 32(7):1221–1241. https://doi.org/10.1162/jocn_a_01529

    Article  Google Scholar 

  67. Bratzke D, Ulrich R (2019) Temporal sequence discrimination within and across senses: do we really hear what we see? Exp Brain Res 237(12):3089–3098. https://doi.org/10.1007/s00221-019-05654-4

    Article  Google Scholar 

  68. Bregman AS (1990) Auditory scene analysis: the perceptual organization of sound. MIT Press, Cambridge, MA

    Book  Google Scholar 

  69. Bregman AS (1978) Auditory streaming is cumulative. J Exp Psychol Hum Percept Perform 4(3):380–387. https://doi.org/10.1037/0096-1523.4.3.380

    Article  Google Scholar 

  70. Bregman AS (1978) Auditory streaming: competition among alternative organizations. Percept Psychophys 23(5):391–398. https://doi.org/10.3758/BF03204141

    Article  Google Scholar 

  71. Bregman AS (2008) Rhythms emerge from the perceptual grouping of acoustic components. Proc Fechner Day 24(1):13–16. http://proceedings.fechnerday.com/index.php/proceedings/article/view/163

  72. Bregman AS, Ahad PA (1996) Demonstrations of scene analysis: the perceptual organization of sound. Montreal, Canada. http://webpages.mcgill.ca/staff/Group2/abregm1/web/downloadsdl.htm

  73. Bregman AS, Campbell J (1971) Primary auditory stream segregation and perception of order in rapid sequences of tones. J Exp Psychol 89(2):244–249. https://doi.org/10.1037/h0031163

    Article  Google Scholar 

  74. Bregman AS, Dannenbring GL (1977) Auditory continuity and amplitude edges. Can J Psychol/Revue canadienne de psychologie 31(3):151–159. https://doi.org/10.1037/h0081658

    Article  Google Scholar 

  75. Bregman AS, Dannenbring GL (1973) The effect of continuity on auditory stream segregation. Percept Psychophys 13(2):308–312. https://doi.org/10.3758/BF03214144

    Article  Google Scholar 

  76. Bregman AS, Pinker S (1978) Auditory streaming and the building of timbre. Can J Psychol/Revue canadienne de psychologie 32(1):19–31. https://doi.org/10.1037/h0081664

    Article  Google Scholar 

  77. Bregman AS, Woszczyk W (2004) Controlling the perceptual organization of sound: guidelines derived from principles of auditory scene analysis (ASA). In: Greenebaum (ed) Audio Anecdotes: tools, tips and techniques for digital audio, vol 1. AK Peters, Natick, MA, pp 33–61

    Google Scholar 

  78. Bregman AS et al (2000) Effects of time intervals and tone durations on auditory stream segregation. Percept Psychophys 63(3):626–636. https://doi.org/10.3758/BF03212114

    Article  Google Scholar 

  79. Bregman MR, Patel AD, Gentner TQ (2016) Songbirds use spectral shape, not pitch, for sound pattern recognition. Proc Natl Acad Sci 113(6):946–959. https://doi.org/10.1073/pnas.1515380113

    Article  Google Scholar 

  80. Breska A, Ivry RB (2018) Double dissociation of single-interval and rhythmic temporal prediction in cerebellar degeneration and Parkinson’s disease. Proc Natl Acad Sci 115(48):12283–12288. https://doi.org/10.1073/pnas.1810596115

    Article  Google Scholar 

  81. Broadbent DE, Ladefoged P (1959) Auditory perception of temporal order. J Acoust Soc Am 31(11):1539–1539. https://doi.org/10.1121/1.1907662

    Article  Google Scholar 

  82. Brochard R et al (1999) Perceptual organization of complex auditory sequences: effect of number of simultaneous subsequences and frequency separation. J Expl Psychol: Hum Percept Perform 25(6):1742–1759. https://doi.org/10.1037/0096-1523.25.6.1742

    Article  Google Scholar 

  83. Brochard R et al (2003) The ‘ticktock’ of our internal clock: direct brain evidence of subjective accents in isochronous sequences. Psychol Sci 14(4):362–366. https://doi.org/10.1111/1467-9280.24441

    Article  Google Scholar 

  84. Brodbeck C et al (2020) Neural speech restoration at the cocktail party: auditory cortex recovers masked speech of both attended and ignored speakers. PLoS Biol 18(10):e3000883, 22 p. https://doi.org/10.1371/journal.pbio.3000883

  85. Brokx JPL, Nooteboom SG (1982) Intonation and the perceptual separation of simultaneous voices. J Phonet 10:23–36. https://doi.org/10.1016/S0095-4470(19)30909-X

    Article  Google Scholar 

  86. Brokx JPL (1979) Waargenomen continuiteit in spraak: Het belang van toonhoogte. Eindhoven, pp 1–124. https://doi.org/10.6100/IR171313

  87. Bronkhorst AW (2000) The cocktail party phenomenon: a review of research on speech intelligibility in multipletalker conditions. Acustica United with Acta Acustica 86(1):117–128

    Google Scholar 

  88. Bronkhorst AW (2015) The cocktail-party problem revisited: early processing and selection of multi-talker speech. Atten Percept Psychophys 77(5):1465–1487. https://doi.org/10.3758/s13414-015-0882-9

    Article  Google Scholar 

  89. Brown GJ (1992) Computational auditory scene analysis: a representational approach. Sheffield, UK, pp i-iv, 1-196. https://etheses.whiterose.ac.uk/2982/1/DX202847.pdf

  90. Brungart DS (2001) Informational and energetic masking effects in the perception of two simultaneous talkers. J Acoust Soc Am 109(3):1101–1109. https://doi.org/10.1121/1.1345696

    Article  Google Scholar 

  91. Buckley CL et al (2017) The free energy principle for action and perception: a mathematical review. J Math Psychol 81:55–79. https://doi.org/10.1016/j.jmp.2017.09.004

    Article  MathSciNet  MATH  Google Scholar 

  92. Burger B et al (2018) Synchronization to metrical levels in music depends on low-frequency spectral components and tempo. Psychol Res 82(6):1195–1211. https://doi.org/10.1007/s00426-017-0894-2

    Article  Google Scholar 

  93. Burns EM (1999) Intervals, scales, and tuning. In: Deutsch D (ed) The psychology of music, 2nd edn, Chap 7. Academic, New York, NY 1999, pp 215–264. https://doi.org/10.1016/B978-012213564-4/50008-1. http://cachescan.bcub.ro/e-book/Adriana%20C_3_e-book_12000-13000/580710/215-264.pdf

  94. Burns EM, Campbell SL (1994) Frequency and frequency-ratio resolution by possessors of absolute and relative pitch: examples of categorical perception? J Acoust Soc Am 96(5):2704–2719. https://doi.org/10.1121/1.411447

    Article  Google Scholar 

  95. Burns EM, Houtsma AJ (1999) The influence of musical training on the perception of sequentially presented mistuned harmonics. J Acoust Soc Am 106(6):3564–3570. https://doi.org/10.1121/1.428151

    Article  Google Scholar 

  96. Burns EM, Ward WD (1978) Categorical perception - phenomenon or epiphenomenon: evidence from experiments in the perception of melodic musical intervals. J Acoust Soc Am 63(2):456–468. https://doi.org/10.1121/1.381737

    Article  Google Scholar 

  97. Burr D, Banks MS, Morrone MC (2009) Auditory dominance over vision in the perception of interval duration. Exp Brain Res 198(1):49–57. https://doi.org/10.1007/s00221-009-1933-z

    Article  Google Scholar 

  98. Butler JW, Daston PG (1968) Musical consonance as musical preference: a cross-cultural study. J Gen Psychol 79(1):129–142. https://doi.org/10.1080/00221309.1968.9710460

    Article  Google Scholar 

  99. Byrne Á, Rinzel J, Rankin J (2019) Auditory streaming and bistability paradigm extended to a dynamic environment. Hear Res 383:107807, 12 p. https://doi.org/10.1016/j.heares.2019.107807

  100. Caclin A et al (2008) Interactive processing of timbre dimensions: an exploration with event-related potentials. J Cognit Neurosci 20(1):49–64. https://doi.org/10.1162/jocn.2008.20001

    Article  Google Scholar 

  101. Caclin A et al (2006) Separate neural processing of timbre dimensions in auditory sensory memory. J Cogn Neurosci 18(12):1959–1972. https://doi.org/10.1162/jocn.2006.18.12.1959

    Article  Google Scholar 

  102. Cantrell L, Smith LB (2013) Open questions and a proposal: a critical review of the evidence on infant numerical abilities. Cognition 128(3):331–352. https://doi.org/10.1016/j.cognition.2013.04.008

    Article  Google Scholar 

  103. Carbajal GV, Malmierca MS (2018) The neuronal basis of predictive coding along the auditory pathway: From the subcortical roots to cortical deviance detection. Trends Hear 22:2331216518784822, 33 p. https://doi.org/10.1177/2331216518784822

  104. Carcagno S, Semal C, Demany L (2011) Frequency-shift detectors bind binaural as well as monaural frequency representations. J Exp Psychol: Hum Percept Perform 37(6):1976–1987. https://doi.org/10.1037/a0024321

  105. Carden J, Cline T (2019) Absolute pitch: myths, evidence and relevance to music education and performance. Psychol Music 47(6):890–901. https://doi.org/10.1177/0305735619856098

    Article  Google Scholar 

  106. Carlyon RP (2004) How the brain separates sounds. Trends Cognit Sci 8(10):465–471. https://doi.org/10.1016/j.tics.2004.08.008

  107. Carlyon RP et al (2004) Auditory processing of real and illusory changes in frequency modulation (FM) phase. J Acoust Soc Am 116(6):3629–3639. https://doi.org/10.1121/1.1811474

    Article  Google Scholar 

  108. Carlyon RP et al (2003) Cross-modal and non-sensory influences on auditory streaming. Perception 32(11):1393–1402. https://doi.org/10.1068/p5035

    Article  Google Scholar 

  109. Carlyon RP et al (2001) Effects of attention and unilateral neglect on auditory stream segregation. J Exp Psychol Hum Percept Perform 27(1):115–127. https://doi.org/10.1037/0096-1523.27.1.115

    Article  Google Scholar 

  110. Cermeño-Aínsa S (2020) The cognitive penetrability of perception: a blocked debate and a tentative solution. Consciousness Cognition 77:102838, 23 p. https://doi.org/10.1016/j.concog.2019.102838

  111. Cervantes Constantino F et al (2012) Detection of appearing and disappearing objects in complex acoustic scenes. PLoS ONE 7(9):e46167, 13 p. https://doi.org/10.1371/journal.pone.0046167

  112. Chakrabarty D, Elhilali M (2019) A Gestalt inference model for auditory scene segregation. PLoS Comput Biol 15(1):e1006711, 33 p. https://doi.org/10.1371/journal.pcbi.1006711

  113. Chang A, Bosnyak DJ, Trainor LJ (2019) Rhythmicity facilitates pitch discrimination: differential roles of low and high frequency neural oscillations. Neuroimage 198:31–43. https://doi.org/10.1016/j.neuroimage.2019.05.007

    Article  Google Scholar 

  114. Chao ZC et al (2018) Large-scale cortical networks for hierarchical prediction and prediction error in the primate brain. Neuron 100:1252–1266. https://doi.org/10.1016/j.neuron.2018.10.004

    Article  Google Scholar 

  115. Cheng T-HZ, Creel SC (2020) The interplay of interval models and entrainment models in duration perception. J Exp Psychol: Hum Percept Perform 46(10):1088–1104. https://doi.org/10.1037/xhp0000798

  116. Cherry EC (1953) Some experiments on the recognition of speech, with one and with two ears. J Acoust Soc Am 25(5):975–979. https://doi.org/10.1121/1.1907229

  117. Chi T, Ru P, Shamma SA (2005) Multiresolution spectrotemporal analysis of complex sounds. J Acoust Soc Am 118(2):887–906. https://doi.org/10.1121/1.1945807

  118. Choi J, Cutler A, Broersma M (2017) Early development of abstract language knowledge: evidence from perception-production transfer of birth-language memory. Royal Society Open Science 4(1):160660, 14 p. https://doi.org/10.1098/rsos.160660

  119. Ciocca V (2008) The auditory organization of complex sounds. Front Biosci 13:148–169. https://doi.org/10.2741/2666

    Article  Google Scholar 

  120. Ciocca V, Bergman AS (1987) Perceived continuity of gliding and steady-state tones through interrupting noise. Percept Psychophys 42(5):476–484. https://doi.org/10.3758/BF03209755

    Article  Google Scholar 

  121. Clark A (2013) Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav Brain Sci 36(3):181–204. https://doi.org/10.1017/S0140525X12000477

  122. Clarke EF (1987) Categorical rhythm perception: an ecological perspective. In: Gabrielsson A (ed) Action and perception in rhythm and music: papers given at a symposium in the third international conference on event perception and action. Royal Swedish Academy of Music, Stockholm, Sweden, pp 19–33

    Google Scholar 

  123. Clarke EF (1989) The perception of expressive timing in music. Psychol Res 51:2–9. https://doi.org/10.1007/BF00309269

    Article  Google Scholar 

  124. Cole RA, Scott B (1973) Perception of temporal order in speech: the role of vowel transitions. Can J Exp Psychol 27(4):441–449. https://doi.org/10.1037/h0082495

    Article  Google Scholar 

  125. Comstock DC, Hove MJ, Balasubramaniam R (2018) Sensorimotor synchronization with auditory and visual modalities: Behavioral and neural differences. Front Comput Neurosci 12, Article 53, 8 p. https://doi.org/10.3389/fncom.2018.00053

  126. Cook P et al (2013) A California sea lion (Zalophus californianus) can keep the beat: motor entrainment to rhythmic auditory stimuli in a non vocal mimic. J Comp Psychol 127(4):412–427. https://doi.org/10.1037/a0032345

    Article  Google Scholar 

  127. Cooke M, Ellis DPW (2001) The auditory organization of speech and other sources in listeners and computational models. Speech Commun 35(3):141–177. https://doi.org/10.1016/S0167-6393(00)00078-9

    Article  MATH  Google Scholar 

  128. Costa-Faidella J, Sussman E, Escera C (2017) Selective entrainment of brain oscillations drives auditory perceptual organization. Neuroimage 159:195–206. https://doi.org/10.1016/j.neuroimage.2017.07.056

    Article  Google Scholar 

  129. Cousineau M et al (2014) What is a melody? On the relationship between pitch and brightness of timbre. Front Syst Neurosci 7, Article 127, 7 p. https://doi.org/10.3389/fnsys.2013.00127

  130. Crystal TH, House AS (1990) Articulation rate and the duration of syllables and stress groups in connected speech. J Acoust Soc Am 88(1):101–112. https://doi.org/10.1121/1.399955

    Article  Google Scholar 

  131. Culling JF, Darwin CJ (1993) The role of timbre in the segregation of simultaneous voices with intersecting F0 contours. Percept Psychophys 54(3):303–309. https://doi.org/10.3758/BF03205265

    Article  Google Scholar 

  132. Culling JF, Summerfield Q (1995) Perceptual separation of concurrent speech sounds: absence of acrossfrequency grouping by common interaural delay. J Acoust Soc Ame 98(2):785–797. https://doi.org/10.1121/1.413571

    Article  Google Scholar 

  133. Cummins F (2012) Looking for rhythm in speech. Empir Musicol Rev 7:28–35. https://doi.org/10.18061/1811/52976

  134. Cusack R, Carlyon RP (2003) Perceptual asymmetries in audition. J Exp Psychol Hum Percept Perform 29(3):713–725. https://doi.org/10.1037/0096-1523.29.3.713

    Article  Google Scholar 

  135. Cusack R, Roberts B (2004) Effects of differences in the pattern of amplitude envelopes across harmonics on auditory stream segregation. Hear Res 193(1–2):95–104. https://doi.org/10.1016/j.heares.2004.03.009

  136. Cusack R, Roberts B (2000) Effects of differences in timbre on sequential grouping. Percept Psychophys 62(5):1112–1120. https://doi.org/10.3758/BF03212092

    Article  Google Scholar 

  137. Cusack R, Roberts B (1999) Effects of similarity in bandwidth on the auditory sequential streaming of twotone complexes. Perception 28(10):1281–1289. https://doi.org/10.1068/p2804

    Article  Google Scholar 

  138. Cusack R et al (2004) Effects of location, frequency region, and time course of selective attention on auditory scene analysis. J Exp Psychol Hum Percept Perform 30(4):643–656. https://doi.org/10.1037/0096-1523.30.4.643

  139. Cutler A, Norris D (2016) Bottoms up! How top-down pitfalls ensnare speech perception researchers, too. Behav Brain Sci 39(e236):25–26. https://doi.org/10.1017/S0140525X15002745

  140. d’Alessandro C, Mertens P (1995) Automatic pitch contour stylization using a model of tonal perception. Comput Speech Lang 9(3):257–288. https://doi.org/10.1006/csla.1995.0013

  141. Dai J, Dixon S (2019) Intonation trajectories within tones in unaccompanied soprano, alto, tenor, bass quartet singing. J Acoust Soc Am 146(2):1005–1014. https://doi.org/10.1121/1.5120483

  142. Dalton P, Fraenkel N (2012) Gorillas we have missed: sustained inattentional deafness for dynamic events. Cognition 124(3):367–372. https://doi.org/10.1016/j.cognition.2012.05.012

  143. Daniel P, Weber R (1997) Psychoacoustical roughness: implementation of an optimized model. Acustica 83:113–123

  144. Dannenbring GL (1976) Perceived auditory continuity with alternately rising and falling frequency transitions. Can J Psychol/Revue canadienne de psychologie 30(2):99–114. https://doi.org/10.1037/h0082053

  145. Dannenbring GL, Bregman AS (1976) Stream segregation and the illusion of overlap. J Exp Psychol Hum Percept Perform 2(4):544–555. https://doi.org/10.1037/0096-1523.2.4.544

  146. Darwin CJ (2008) Listening to speech in the presence of other sounds. Philos Trans Roy Soc Lond B: Biol Sci 363(1493):1011–1021. https://doi.org/10.1098/rstb.2007.2156

  147. Darwin CJ, Bethell-Fox CE (1977) Pitch continuity and speech source attribution. J Exp Psychol Hum Percept Perform 3(4):665–672. https://doi.org/10.1037/0096-1523.3.4.665

  148. Darwin CJ, Ciocca V (1992) Grouping in pitch perception: effects of onset asynchrony and ear of presentation of a mistuned component. J Acoust Soc Am 91(6):3381–3390. https://doi.org/10.1121/1.402828

  149. Darwin CJ (1997) Auditory grouping. Trends Cognit Sci 1(9):327–333. https://doi.org/10.1016/S1364-6613(97)01097-8

  150. Dauer RM (1983) Stress-timing and syllable-timing reanalyzed. J Phon 11(1):51–62. https://doi.org/10.1016/S0095-4470(19)30776-4

  151. David M et al (2017) Discrimination and streaming of speech sounds based on differences in interaural and spectral cues. J Acoust Soc Am 142(3):1674–1685. https://doi.org/10.1121/1.5003809

  152. De Lange FP, Heilbron M, Kok P (2018) How do expectations shape perception? Trends Cognit Sci 22(9):764–779. https://doi.org/10.1016/j.tics.2018.06.002

  153. Dehaene S (1997) The number sense: how the mind creates mathematics. Oxford University Press, New York, NY

  154. Deike S et al (2012) The build-up of auditory stream segregation: a different perspective. Front Psychol 3, Article 416, 7 p. https://doi.org/10.3389/fpsyg.2012.00461

  155. Demany L (1982) Auditory stream segregation in infancy. Infant Behav Dev 5:261–276. https://doi.org/10.1016/S0163-6383(82)80036-2

  156. Demany L, Erviti M, Semal C (2015) Auditory attention is divisible: segregated tone streams can be tracked simultaneously. J Exp Psychol: Hum Percept Perform 41(2):356–363. https://doi.org/10.1037/a0038932

  157. Demany L, McKenzie B, Vurpillot E (1977) Rhythm perception in early infancy. Nature 266(5604):718–719. https://doi.org/10.1038/266718a0

  158. Demany L, Semal C (2002) Limits of rhythm perception. Quart J Exp Psychol Sect A 55(2):643–657. https://doi.org/10.1080/02724980143000406

  159. Denham SL, Winkler I (2006) The role of predictive models in the formation of auditory streams. J Physiol Paris 100(1–3):154–170. https://doi.org/10.1016/j.jphysparis.2006.09.012

  160. Denham SL, Winkler I (2020) Predictive coding in auditory perception: challenges and unresolved questions. Eur J Neurosci 51:1151–1160. https://doi.org/10.1111/ejn.13802

  161. Denham SL et al (2012) Characterising switching behaviour in perceptual multi-stability. J Neurosci Methods 210(1):79–92. https://doi.org/10.1016/j.jneumeth.2012.04.004

  162. Denham SL et al (2013) Perceptual bistability in auditory streaming: How much do stimulus features matter? Learn Percept 5(Supplement 2):73–100. https://doi.org/10.1556/LP.5.2013.Suppl2.6

  163. Denham SL et al (2018) Similar but separate systems underlie perceptual bistability in vision and audition. Sci Rep 8:7106, 10 p. https://doi.org/10.1038/s41598-018-25587-2

  164. Denham SL et al (2014) Stable individual characteristics in the perception of multiple embedded patterns in multistable auditory stimuli. Front Neurosci 8, Article 25, 25 p. https://doi.org/10.3389/fnins.2014.00025

  165. Desain P, Honing H (2003) The formation of rhythmic categories and metric priming. Perception 32(3):341–365. https://doi.org/10.1068/p3370

  166. Deutsch D (2013) Absolute pitch. In: Deutsch D (ed) The psychology of music, 3rd edn, Chap 5. Elsevier, Amsterdam, pp 141–182. https://doi.org/10.1016/B978-0-12-381460-9.00005-5

  167. Deutsch D (1974) An auditory illusion. Nature 251(5473):307–309. https://doi.org/10.1038/251307a0

  168. Deutsch D (2013) Grouping mechanisms in music. In: Deutsch D (ed) The psychology of music, 3rd edn, Chap 6. Academic, New York, NY, pp 183–246. https://doi.org/10.1016/B978-0-12-381460-9.00006-7

  169. Deutsch D (2019) Musical illusions and phantom words: how music and speech unlock mysteries of the brain. Oxford University Press, New York, NY

  170. Deutsch D, Henthorn T, Lapidis R (2011) Illusory transformation from speech to song. J Acoust Soc Am 129(4):2245–2252. https://doi.org/10.1121/1.3562174

  171. Devergie A et al (2010) Effect of rhythmic attention on the segregation of interleaved melodies. J Acoust Soc Am 128(1):EL1–EL7. https://doi.org/10.1121/1.3436498

  172. DeWitt LA, Samuel AG (1990) The role of knowledge-based expectations in music perception: evidence from musical restoration. J Exp Psychol Gen 119(2):123–144. https://doi.org/10.1037/0096-3445.119.2.123

  173. Ding N et al (2018) Attention is required for knowledge-based sequential grouping: insights from the integration of syllables into words. J Neurosci 38(5):1178–1188. https://doi.org/10.1523/JNEUROSCI.2606-17.2017

  174. Ding N et al (2017) Temporal modulations in speech and music. Neurosci Biobehav Rev 81:181–187. https://doi.org/10.1016/j.neubiorev.2017.02.011

  175. Divenyi P (ed) (2005) Speech separation by humans and machines. Kluwer Academic Publishers, Boston, MA

  176. Dolležal L-V, Beutelmann R, Klump GM (2012) Stream segregation in the perception of sinusoidally amplitude-modulated tones. PLoS ONE 7(9):e43615, 12 p. https://doi.org/10.1371/journal.pone.0043615

  177. Dolležal L-V et al (2014) Evaluating auditory stream segregation of SAM tone sequences by subjective and objective psychoacoustical tasks, and brain activity. Front Neurosci 8, Article 119, 15 p. https://doi.org/10.3389/fnins.2014.00119

  178. Dowling WJ (1968) Rhythmic fission and perceptual organization. J Acoust Soc Am 44(1):369. https://doi.org/10.1121/1.1970461

  179. Dowling WJ (1973) The perception of interleaved melodies. Cognit Psychol 5(3):322–337. https://doi.org/10.1016/0010-0285(73)90040-6

  180. Dowling WJ, Lung KM-T, Herrbold S (1987) Aiming attention in pitch and time in the perception of interleaved melodies. Percept Psychophys 41(6):642–656. https://doi.org/10.3758/BF03210496

  181. Drennan WR, Gatehouse S, Lever C (2003) Perceptual segregation of competing speech sounds: The role of spatial location. J Acoust Soc Am 114(4):2178–2189. https://doi.org/10.1121/1.1609994

  182. Dunlap K (1910) Reaction to rhythmic stimuli with attempt to synchronize. Psychol Rev 17(6):399–416. https://doi.org/10.1037/h0074736

  183. Edwards E, Chang EF (2013) Syllabic (~2–5 Hz) and fluctuation (~1–10 Hz) ranges in speech and auditory processing. Hear Res 305:113–134. https://doi.org/10.1016/j.heares.2013.08.017

  184. Egan JP, Carterette EC, Thwing EJ (1954) Some factors affecting multi-channel listening. J Acoust Soc Am 26(5):774–782. https://doi.org/10.1121/1.1907416

  185. Elfner LF, Homick JL (1967) Continuity effects with alternately sounding tones under dichotic presentation. Percept Psychophys 2(1):34–36. https://doi.org/10.3758/BF03210062

  186. Elhilali M (2017) Modeling the cocktail party problem. In: Middlebrooks JC et al (ed) The auditory system at the cocktail party, Chap 5. Springer International Publishing, Cham, Switzerland, pp 111–135. https://doi.org/10.1007/978-3-319-51662-2_5

  187. Elhilali M et al (2009) Temporal coherence in the perceptual organization and cortical representation of auditory scenes. Neuron 61(2):317–329. https://doi.org/10.1016/j.neuron.2008.12.005

  188. Ellis DPW (1996) Prediction-driven computational auditory scene analysis. Massachusetts Institute of Technology, Cambridge, MA. https://doi.org/10.7916/D84J0N13

  189. Ellis RJ, Jones MR (2009) The role of accent salience and joint accent structure in meter perception. J Exp Psychol Hum Percept Perform 35(1):264–280. https://doi.org/10.1037/a0013482

  190. Eramudugolla R et al (2005) Directed attention eliminates ‘change deafness’ in complex auditory scenes. Curr Biol 15(12):1108–1113. https://doi.org/10.1016/j.cub.2005.05.051

  191. Erle TM, Topolinski S (2018) Disillusionment: how expectations shape the enjoyment of early perceptual processes. Exp Psychol 65(6):332–344. https://doi.org/10.1027/1618-3169/a000419

  192. Falk S, Rathcke T, Dalla Bella S (2014) When speech sounds like music. J Exp Psychol: Hum Percept Perform 40(4):1491–1506. https://doi.org/10.1037/a0036858

  193. Farkas D et al (2016) Assessing the validity of subjective reports in the auditory streaming paradigm. J Acoust Soc Am 139(4):1762–1772. https://doi.org/10.1121/1.4945720

  194. Farkas D et al (2016) Auditory multi-stability: idiosyncratic perceptual switching patterns, executive functions and personality traits. PLoS ONE 11(5):e0154810, 20 p. https://doi.org/10.1371/journal.pone.0154810

  195. Feeney MP (1997) Dichotic beats of mistuned consonances. J Acoust Soc Am 102(4):2333–2342. https://doi.org/10.1121/1.419602

  196. Filippi P et al (2019) Temporal modulation in speech, music, and animal vocal communication: evidence of conserved function. Ann N Y Acad Sci 1453(1):99–113. https://doi.org/10.1111/nyas.14228

  197. Firestone C, Scholl BJ (2016) Cognition does not affect perception: Evaluating the evidence for ‘top-down’ effects. Behav Brain Sci 39:e229, 77 p. https://doi.org/10.1017/S0140525X15000965

  198. Fishbach A, Nelken I, Yeshurun Y (2001) Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients. J Neurophysiol 85(6):2303–2323. https://doi.org/10.1152/jn.2001.85.6.2303

  199. Fitch WT, Rosenfeld AJ (2007) Perception and production of syncopated rhythms. Music Percept: Interdiscip J 25(1):43–58. https://doi.org/10.1525/mp.2007.25.1.43

  200. Fraisse P (1982) Rhythm and tempo. In: Deutsch D (ed) The psychology of music, Chap 6. Academic, London, UK, pp 149–180

  201. Fraisse P (1946) Contribution à l’étude du rythme en tant que forme temporelle. J Psychol Norm Pathol 39:283–304

  202. Fraisse P (1948) Rythmes auditifs et rythmes visuels. Année Psychologique 49:21–42. https://doi.org/10.3406/psy.1948.8352

  203. French-St George M, Bregman AS (1989) Role of predictability of sequence in auditory stream segregation. Percept Psychophys 46(4):384–386. https://doi.org/10.3758/BF03204992

  204. Friberg A, Sundberg J (1995) Time discrimination in a monotonic, isochronous sequence. J Acoust Soc Am 98(5):2524–2531. https://doi.org/10.1121/1.413218

  205. Friston K (2003) Learning and inference in the brain. Neural Netw 16(9):1325–1352. https://doi.org/10.1016/j.neunet.2003.06.005

  206. Friston K (2009) The free-energy principle: A rough guide to the brain? Trends Cognit Sci 13(7):293–301. https://doi.org/10.1016/j.tics.2009.04.005

  207. Friston K (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 11(2):127–138. https://doi.org/10.1038/nrn2787

  208. Fritz JB et al (2007) Auditory attention: focusing the searchlight on sound. Curr Opin Neurobiol 17(4):437–455. https://doi.org/10.1016/j.conb.2007.07.011

  209. Füllgrabe C, Moore BC (2012) Objective and subjective measures of pure-tone stream segregation based on interaural time differences. Hear Res 291:24–33. https://doi.org/10.1016/j.heares.2012.06.006

  210. Gallun FJ, Mason CR, Kidd G (2007) Task-dependent costs in processing two simultaneous auditory stimuli. Percept Psychophys 69(5):757–771. https://doi.org/10.3758/BF03193777

  211. Gámez J et al (2018) Predictive rhythmic tapping to isochronous and tempo changing metronomes in the nonhuman primate. Ann N Y Acad Sci 1423(1):396–414. https://doi.org/10.1111/nyas.13671

  212. Gan L et al (2015) Synchronization to a bouncing ball with a realistic motion trajectory. Sci Rep 5:11974, 9 p. https://doi.org/10.1038/srep11974

  213. Garcia Lecumberri ML, Cooke M, Cutler A (2010) Non-native speech perception in adverse conditions: a review. Speech Commun 52(11–12):864–886. https://doi.org/10.1016/j.specom.2010.08.014

  214. Garner WR (1951) The accuracy of counting repeated short tones. J Exp Psychol 41(4):310–316. https://doi.org/10.1037/h0059567

  215. Garrido MI et al (2009) The mismatch negativity: a review of underlying mechanisms. Clin Neurophysiol 120(3):453–463. https://doi.org/10.1016/j.clinph.2008.11.029

  216. Ghitza O (2011) Linking speech perception and neurophysiology: speech decoding guided by cascaded oscillators locked to the input rhythm. Front Psychol 2, Article 130, 13 p. https://doi.org/10.3389/fpsyg.2011.00130

  217. Ghitza O, Greenberg S (2009) On the possible role of brain rhythms in speech perception: intelligibility of time-compressed speech with periodic and aperiodic insertions of silence. Phonetica 66:113–126. https://doi.org/10.1159/000208934

  218. Giraud A-L, Poeppel D (2012) Cortical oscillations and speech processing: emerging computational principles and operations. Nat Neurosci 15(4):511–517. https://doi.org/10.1038/nn.3063

  219. Gjorgjieva J, Sompolinsky H, Meister M (2014) Benefits of pathway splitting in sensory coding. J Neurosci 34(36):12127–12144. https://doi.org/10.1523/JNEUROSCI.1032-14.2014

  220. Godsmark D, Brown GJ (1999) A blackboard architecture for computational auditory scene analysis. Speech Commun 27(3–4):351–366. https://doi.org/10.1016/S0167-6393(98)00082-X

  221. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, MA. http://www.deeplearningbook.org

  222. Gordon MS (2017) Change deafness across voices in music and language. J Cognit Psychol 29(1):53–64. https://doi.org/10.1080/20445911.2016.1223244

  223. Gordon MS, Ataucusi A (2021) Continuous sliding frequency shifts produce an illusory tempo drift. J Acoust Soc Am Express Lett 1(5):053202, 8 p. https://doi.org/10.1121/10.0005001

  224. Graddol D (1986) Discourse specific pitch behavior. In: Johns-Lewis C (ed) Intonation in discourse. Croom Helm, London, UK, pp 221–237

  225. Grahn JA (2012) See what I hear? Beat perception in auditory and visual rhythms. Exp Brain Res 220(1):51–61. https://doi.org/10.1007/s00221-012-3114-8

  226. Grahn JA, McAuley JD (2009) Neural bases of individual differences in beat perception. Neuroimage 47(4):1894–1903. https://doi.org/10.1016/j.neuroimage.2009.04.039

  227. Greenwood DD (1997) The Mel Scale’s disqualifying bias and a consistency of pitch-difference equisections in 1956 with equal cochlear distances and equal frequency ratios. Hear Res 103:199–224. https://doi.org/10.1016/S0378-5955(96)00175-X

  228. Gregg MK, Samuel AG (2012) Feature assignment in perception of auditory figure. J Exp Psychol Hum Percept Perform 38(4):998–1013. https://doi.org/10.1037/a0026789

  229. Gregg MK, Samuel AG (2009) The importance of semantics in auditory representations. Attent Percept Psychophys 71(3):607–619. https://doi.org/10.3758/APP.71.3.607

  230. Gregory AH (1994) Timbre and auditory streaming. Music Percept: Interdiscip J 12(2):161–174. https://doi.org/10.2307/40285649

  231. Gregory RL (1980) Perceptions as hypotheses. Philos Trans Roy Soc B Biol Sci 290(1038):181–197. https://doi.org/10.1098/rstb.1980.0090

  232. Grimault N, Bacon SP, Micheyl C (2002) Auditory stream segregation on the basis of amplitude-modulation rate. J Acoust Soc Am 111(3):1340–1348. https://doi.org/10.1121/1.1452740

  233. Grimault N, McAdams S, Allen JB (2007) Auditory scene analysis: a prerequisite for loudness perception. In: Kollmeier B et al (ed) Hearing: from sensory processing to perception, Chap 32. Springer, Berlin, Heidelberg, pp 295–302. https://doi.org/10.1007/978-3-540-73009-5_32

  234. Grimm S, Escera C (2012) Auditory deviance detection revisited: evidence for a hierarchical novelty system. Int J Psychophysiol 85(1):88–92. https://doi.org/10.1016/j.ijpsycho.2011.05.012

  235. Grimm S, Escera C, Nelken I (2016) Early indices of deviance detection in humans and animal models. Biol Psychol 116:23–27. https://doi.org/10.1016/j.biopsycho.2015.11.017

  236. Groenveld G, Burgoyne JA, Sadakata M (2020) I still hear a melody: investigating temporal dynamics of the Speech-to-Song Illusion. Psychol Res 84(5):1451–1459. https://doi.org/10.1007/s00426-018-1135-z

  237. Grondin S (2020) The perception of time: your questions answered. Routledge, New York, NY

  238. Grondin S (2012) Violation of the scalar property for time perception between 1 and 2 seconds: evidence from interval discrimination, reproduction, and categorization. J Exp Psychol: Hum Percept Perform 38(4):880–890. https://doi.org/10.1037/a0027188

  239. Grondin S, Meilleur-Wells G, Lachance R (1999) When to start explicit counting in a time-intervals discrimination task: a critical point in the timing process of humans. J Exp Psychol: Hum Percept Perform 25(4):993–1004. https://doi.org/10.1037/0096-1523.25.4.993

  240. Grondin S et al (2018) Auditory time perception. In: Bader R (ed) Springer handbook of systematic musicology, Chap 21. Springer GmbH Germany, Cham, Switzerland, pp 423–440. https://doi.org/10.1007/978-3-662-55004-5_21

  241. Grossberg S et al (2004) ARTSTREAM: a neural network model of auditory scene analysis and source segregation. Neural Netw 17(4):511–536. https://doi.org/10.1016/j.neunet.2003.10.002

  242. Grube M et al (2010) Dissociation of duration-based and beat-based auditory timing in cerebellar degeneration. Proc Natl Acad Sci 107(26):11597–11601. https://doi.org/10.1073/pnas.0910473107

  243. Grube M et al (2010) Transcranial magnetic theta-burst stimulation of the human cerebellum distinguishes absolute, duration-based from relative, beat-based perception of subsecond time intervals. Front Psychol 1, Article 171, 8 p. https://doi.org/10.3389/fpsyg.2010.00171

  244. Gu L, Huang Y, Wu X (2020) Advantage of audition over vision in a perceptual timing task but not in a sensorimotor timing task. Psychol Res 84:2046–2056. https://doi.org/10.1007/s00426-019-01204-3

  245. Guernsey M (1928) The role of consonance and dissonance in music. Am J Psychol 40(2):173–204. https://doi.org/10.2307/1414484

  246. Guttman N, Julesz B (1963) Lower limits of auditory periodicity analysis. J Acoust Soc Am 35(4):610. https://doi.org/10.1121/1.1918551

  247. Guttman S, Gilroy LA, Blake R (2005) Hearing what the eyes see: Auditory encoding of visual temporal sequences. Psychol Sci 16(3):228–235. https://doi.org/10.1111/j.0956-7976.2005.00808.x

  248. Haegens S, Zion Golumbic E (2018) Rhythmic facilitation of sensory processing: a critical review. Neurosci Biobehav Rev 86:150–165. https://doi.org/10.1016/j.neubiorev.2017.12.002

  249. Hannon EE, Johnson SP (2005) Infants use meter to categorize rhythms and melodies: implications for musical structure learning. Cognit Psychol 50(4):354–377. https://doi.org/10.1016/j.cogpsych.2004.09.003

  250. Hannon EE et al (2004) The role of melodic and temporal cues in perceiving musical meter. J Exp Psychol Hum Percept Perform 30(5):956–974. https://doi.org/10.1037/0096-1523.30.5.956

  251. Hänsler E, Schmidt G (eds) Speech and audio processing in adverse environments. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70602-1

  252. Harrison PMC, Pearce MT (2020) Simultaneous consonance in music perception and composition. Psychol Rev 127(2):216–244. https://doi.org/10.1037/rev0000169

  253. Hartmann WM, Johnson D (1991) Stream segregation and peripheral channeling. Music Percept: Interdiscip J 9(2):155–183. https://doi.org/10.2307/40285527

  254. Hartmann WM, McAdams S, Smith BK (1990) Hearing a mistuned harmonic in an otherwise periodic complex tone. J Acoust Soc Am 88(4):1712–1724. https://doi.org/10.1121/1.400246

  255. Hass J, Durstewitz D (2016) Time at the center, or time at the side? Assessing current models of time perception. Curr Opin Behav Sci 8:238–244. https://doi.org/10.1016/j.cobeha.2016.02.030

  256. Hasuo E et al (2015) Effects of sound marker durations on the perception of inter-onset time intervals: a study with instrumental sounds. Jpn J Psychon Sci 34(1):2–16. https://doi.org/10.14947/psychono.34.2

  257. Hasuo E et al (2012) Effects of temporal shapes of sound markers on the perception of interonset time intervals. Attent Percept Psychophys 74(2):430–445. https://doi.org/10.3758/s13414-011-0236-1

  258. Hausfeld L et al (2018) Cortical tracking of multiple streams outside the focus of attention in naturalistic auditory scenes. Neuroimage 181:617–626. https://doi.org/10.1016/j.neuroimage.2018.07.052

  259. Hawkins S (2014) Situational influences on rhythmicity in speech, music, and their interaction. Philos Trans Roy Soc B: Biolog Sci 369(1658):20130398, 11 p. https://doi.org/10.1098/rstb.2013.0398

  260. Haykin S, Chen Z (2005) The cocktail party problem. Neural Comput 17(9):1875–1902. https://doi.org/10.1162/0899766054322964

  261. Haywood NR, Chang I-CJ, Ciocca V (2011) Perceived tonal continuity through two noise bursts separated by silence. J Acoust Soc Am 130(3):1503–1514. https://doi.org/10.1121/1.3609124

  262. Haywood NR (2010) Build-up of the tendency to segregate auditory streams: resetting effects evoked by a single deviant tone. J Acoust Soc Am 128(5):3019–3031. https://doi.org/10.1121/1.3488675

  263. Haywood NR, Roberts B (2011) Sequential grouping of pure-tone percepts evoked by the segregation of components from a complex tone. J Exp Psychol Hum Percept Perform 37(4):1263–1274. https://doi.org/10.1037/a0023416

  264. Heilbron M (2018) Great expectations: is there evidence for predictive coding in auditory cortex? Neuroscience 389:54–73. https://doi.org/10.1016/j.neuroscience.2017.07.061

  265. Hellstrom LI, Young ED (1989) Physiological responses to the pulsation threshold paradigm. II: Representations of high-pass noise in average rate measures of auditory-nerve fiber discharge. J Acoust Soc Am 85(1):243–253. https://doi.org/10.1121/1.397730

  266. Helmholtz HLF (1895) On the sensations of tone as a physiological basis for the theory of music. Trans. by Ellis AJ, 2nd edn. Longmans, Green, and Co., London, UK, pp i–xix, 1–576. https://archive.org/stream/onsensationsofto00helmrich/onsensationsofto00helmrich_djvu.txt

  267. Henton CG (1989) Fact and fiction in the description of female and male pitch. Lang Commun 9(4):299–311. https://doi.org/10.1016/0271-5309(89)90026-8

  268. Hermes DJ (2006) Stylization of pitch contours. In: Sudhoff S et al (ed) Methods in empirical prosody research. Walter De Gruyter, Berlin, pp 29–62. https://doi.org/10.1515/9783110914641.29

  269. Hermes DJ, Van Gestel JC (1991) The frequency scale of speech intonation. J Acoust Soc Am 90(1):97–102. https://doi.org/10.1121/1.402397

  270. Hinton G et al (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process Mag 29(6):82–97

  271. Hirsch A (2013) What is the domain for weight computation: the syllable or the interval? Proc Ann Meet Phonol 1(1), 12 p. https://doi.org/10.3765/amp.v1i1.21

  272. Hoeschele M et al (2013) Chickadees fail standardized operant tests for octave equivalence. Anim Cognit 16(4):599–609. https://doi.org/10.1007/s10071-013-0597-z

  273. Hofman PM, Van Opstal AJ (1998) Spectro-temporal factors in two-dimensional human sound localization. J Acoust Soc Am 103(5):2634–2648. https://doi.org/10.1121/1.422784

  274. Hofmann-Shen C et al (2020) Mapping adaptation, deviance detection, and prediction error in auditory processing. NeuroImage 207:116432, 9 p. https://doi.org/10.1016/j.neuroimage.2019.116432

  275. Hohwy J (2013) The predictive mind. Oxford University Press, Oxford, UK

  276. Holmes SD, Roberts B (2012) Pitch shifts on mistuned harmonics in the presence and absence of corresponding in-tune components. J Acoust Soc Am 132(3):1548–1560. https://doi.org/10.1121/1.4740487

  277. Hommel B et al (2019) No one knows what attention is. Attent Percept Psychophys 81(7):2288–2303. https://doi.org/10.3758/s13414-019-01846-w

  278. Honing H (2013) Structure and interpretation of rhythm in music. In: Deutsch D (ed) The psychology of music, 3rd edn, Chap 2. Elsevier, Amsterdam, pp 369–404. https://doi.org/10.1016/B978-0-12-381460-9.00009-2

  279. Honing H (2012) Without it no music: beat induction as a fundamental musical trait. Ann NY Acad Sci 1252(1):85–91. https://doi.org/10.1111/j.1749-6632.2011.06402.x

  280. Honing H et al (2009) Is beat induction innate or learned? Probing emergent meter perception in adults and newborns using event-related brain potentials. Ann NY Acad Sci 1169(1):93–96. https://doi.org/10.1111/j.1749-6632.2009.04761.x

  281. Honing H et al (2018) Rhesus monkeys (Macaca mulatta) sense isochrony in rhythm, but not the beat: Additional support for the gradual audiomotor evolution hypothesis. Front Neurosci 12, Article 475, 15 p. https://doi.org/10.3389/fnins.2018.00475

  282. Houtgast T (1972) Psychophysical evidence for lateral inhibition in hearing. J Acoust Soc Am 51(6B):1885–1894. https://doi.org/10.1121/1.1913048

  283. Hove MJ, Spivey MJ, Krumhansl CL (2010) Compatibility of motion facilitates visuomotor synchronization. J Exp Psychol: Hum Percept Perform 36(6):1525–1534. https://doi.org/10.1037/a0019059

  284. Hove MJ et al (2014) Superior time perception for lower musical pitch explains why bass-ranged instruments lay down musical rhythms. Proc Natl Acad Sci 111(28):10383–10388. https://doi.org/10.1073/pnas.1402039111

  285. Huang N, Elhilali M (2017) Auditory salience using natural soundscapes. J Acoust Soc Am 141(3):2163–2176. https://doi.org/10.1121/1.4979055

  286. Huang Y, Rao RPN (2011) Predictive coding. Wiley Interdiscip Rev: Cognit Sci 2(5):580–593. https://doi.org/10.1002/wcs.142

  287. Huang Y et al (2018) Relative contributions of the speed characteristic and other possible ecological factors in synchronization to a visual beat consisting of periodically moving stimuli. Front Psychol 9, Article 1226, 16 p. https://doi.org/10.3389/fpsyg.2018.01226

  288. Hukin RW, Darwin CJ (1995) Comparison of the effect of onset asynchrony on auditory grouping in pitch matching and vowel identification. Percept Psychophys 57(2):191–196. https://doi.org/10.3758/BF03206505

  289. Hukin RW, Darwin CJ (1995) Effects of contralateral presentation and of interaural time differences in segregating a harmonic from a vowel. J Acoust Soc Am 98(3):1380–1387. https://doi.org/10.1121/1.414348

  290. Huron D (1989) Voice denumerability in polyphonic music of homogeneous timbres. Music Percept: Interdiscip J 6(4):361–382. https://doi.org/10.2307/40285438

  291. Ihlefeld A, Shinn-Cunningham BG (2008) Disentangling the effects of spatial cues on selection and formation of auditory objects. J Acoust Soc Am 124(4):2224–2235. https://doi.org/10.1121/1.2973185

  292. Itti L, Koch C (2001) Computational modelling of visual attention. Nat Rev Neurosci 2(3):194–203. https://doi.org/10.1038/35058500

  293. Iversen JR et al (2015) Synchronization to auditory and visual rhythms in hearing and deaf individuals. Cognition 134:232–244. https://doi.org/10.1016/j.cognition.2014.10.018

  294. Iverson P (1995) Auditory stream segregation by musical timbre: Effects of static and dynamic acoustic attributes. J Exp Psychol: Hum Percept Perform 21(4):751–763. https://doi.org/10.1037/0096-1523.21.4.751

  295. Iverson P, Krumhansl CL (1993) Isolating the dynamic attributes of musical timbre. J Acoust Soc Am 94(5):2595–2603. https://doi.org/10.1121/1.407371

  296. Jackendoff R (2009) Parallels and nonparallels between language and music. Music Percept: Interdiscip J 26(3):195–204. https://doi.org/10.1525/mp.2009.26.3.195

  297. Johnsrude IS et al (2013) Swinging at a cocktail party: voice familiarity aids speech perception in the presence of a competing voice. Psychol Sci 24(10):1995–2004. https://doi.org/10.1177/0956797613482467

  298. Jones MR (1976) Time, our lost dimension: toward a new theory of perception, attention, and memory. Psychol Rev 83(5):323–335. https://doi.org/10.1037/0033-295X.83.5.323

  299. Jones MR, Boltz M (1989) Dynamic attending and responses to time. Psychol Rev 96(3):459–491. https://doi.org/10.1037/0033-295X.96.3.459

  300. Jones MR, Moynihan Johnston H, Puente J (2006) Effects of auditory pattern structure on anticipatory and reactive attending. Cognit Psychol 53(1):59–96. https://doi.org/10.1016/j.cogpsych.2006.01.003

  301. Jones MR, Moynihan Johnston H, Puente J (2002) Temporal aspects of stimulus-driven attending in dynamic arrays. Psychol Sci 13(4):313–319. https://doi.org/10.1111/1467-9280.00458

  302. Jones M, Love BC (2011) Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition. Behav Brain Sci 34(4):169–231. https://doi.org/10.1017/S0140525X10003134

  303. Kaernbach C (1992) On the consistency of tapping to repeated noise. J Acoust Soc Am 92(2):788–793. https://doi.org/10.1121/1.403948

  304. Kaernbach C (1993) Temporal and spectral basis of the features perceived in repeated noise. J Acoust Soc Am 94(1):91–96. https://doi.org/10.1121/1.406946

  305. Kalinli O, Narayanan S (2009) Prominence detection using auditory attention cues and task-dependent high level information. IEEE Trans Audio Speech Lang Process 17(5):1009–1024. https://doi.org/10.1109/TASL.2009.2014795

  306. Kameoka A, Kuriyagawa M (1969) Consonance theory part I: consonance of dyads. J Acoust Soc Am 45(6):1451–1459. https://doi.org/10.1121/1.1911623

  307. Kameoka A (1969) Consonance theory part II: consonance of complex tones and its calculation method. J Acoust Soc Am 45(6):1460–1469. https://doi.org/10.1121/1.1911624

  308. Kang H, Lancelin D, Pressnitzer D (2018) Memory for random time patterns in audition, touch, and vision. Neuroscience 389:118–132. https://doi.org/10.1016/j.neuroscience.2018.03.017

  309. Kanizsa G (1976) Subjective contours. Sci Am 234(4):48–53. https://www.jstor.org/stable/24950327

  310. Katzin N, Cohen ZZ, Henik A (2019) If it looks, sounds, or feels like subitizing, is it subitizing? A modulated definition of subitizing. Psychon Bull Rev 26:790–797. https://doi.org/10.3758/s13423-018-1556-0

  311. Kaufman EL et al (1949) The discrimination of visual number. Am J Psychol 62(4):498–525

  312. Kawashima T, Sato T (2015) Perceptual limits in a simulated ‘Cocktail party’. Attent Percept Psychophys 77(6):2108–2120. https://doi.org/10.3758/s13414-015-0910-9

  313. Kaya EM, Elhilali M (2014) Investigating bottom-up auditory attention. Front Hum Neurosci 8, Article 327, 12 p. https://doi.org/10.3389/fnhum.2014.00327

  314. Kaya EM, Elhilali M (2017) Modelling auditory attention. Philos Trans Roy Soc B Biol Sci 372(1714):20160101, 10 p. https://doi.org/10.1098/rstb.2016.0101

  315. Kayser C et al (2005) Mechanisms for allocating auditory attention: an auditory saliency map. Curr Biol 15(21):1943–1947. https://doi.org/10.1016/j.cub.2005.09.040

  316. Keele SW et al (1989) Mechanisms of perceptual timing: beat-based or interval-based judgments? Psychol Res 50(4):251–256. https://doi.org/10.1007/BF00309261

  317. Kell AJE, McDermott JH (2019) Deep neural network models of sensory systems: windows onto the role of task constraints. Curr Opin Neurobiol 55:121–132. https://doi.org/10.1016/j.conb.2019.02.003

  318. Kell AJE et al (2018) A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron 98(3):630–644. https://doi.org/10.1016/j.neuron.2018.03.044

  319. Kelso JAS (2012) Multistability and metastability: understanding dynamic coordination in the brain. Philos Trans Roy Soc B Biol Sci 367(1591):906–918. https://doi.org/10.1098/rstb.2011.0351

  320. Kershenbaum A et al (2016) Acoustic sequences in non-human animals: a tutorial review and prospectus. Biol Rev 91(1):13–52. https://doi.org/10.1111/brv.12160

  321. Kidd G Jr, Mason CR, Best V (2014) The role of syntax in maintaining the integrity of streams of speech. J Acoust Soc Am 135(2):766–777. https://doi.org/10.1121/1.4861354

  322. Kidd G Jr et al (2008) Informational masking. In: Yost WA, Fay RR (eds) Auditory perception of sound sources, Chap 6. Springer Science+Business Media Inc, New York, NY, pp 143–189. https://doi.org/10.1007/978-0-387-71305-2_6

  323. Kidd G Jr et al (2005) The advantage of knowing where to listen. J Acoust Soc Am 118(6):3804–3815. https://doi.org/10.1121/1.2109187

  324. Kim K et al (2014) Automatic detection of auditory salience with optimized linear filters derived from human annotation. Pattern Recognit Lett 38:78–85. https://doi.org/10.1016/j.patrec.2013.11.010

  325. Koch I et al (2011) Switching in the cocktail party: exploring intentional control of auditory selective attention. J Exp Psychol Hum Percept Perform 37(4):1140–1147. https://doi.org/10.1037/a0022189

  326. Koelsch S, Vuust P, Friston K (2019) Predictive processes and the peculiar case of music. Trends Cognit Sci 23(1):63–77. https://doi.org/10.1016/j.tics.2018.10.006

  327. Koffka K (1955) Principles of gestalt psychology, 5th edn. Routledge, London, UK

  328. Kogo N, Trengove C (2015) Is predictive coding theory articulated enough to be testable? Front Comput Neurosci 9, Article 111, 4 p. https://doi.org/10.3389/fncom.2015.00111

  329. Kohler KJ (2009) Rhythm in speech and language. Phonetica 66(1–2):29–45. https://doi.org/10.1159/000208929

  330. Kohlrausch A, Sander A (1995) Phase effects in masking related to dispersion in the inner ear. II. Masking period patterns of short targets. J Acoust Soc Am 97(3):1817–1829. https://doi.org/10.1121/1.413097

  331. Kolers PA, Brewster JM (1985) Rhythms and responses. J Exp Psychol Hum Percept Perform 11(2):150–167. https://doi.org/10.1037/0096-1523.11.2.150

  332. Kondo HM et al (2017) Auditory and visual scene analysis: an overview. Philos Trans Roy Soc B Biol Sci 372(1714):20160099, 6 p. https://doi.org/10.1098/rstb.2016.0099

  333. Kondo HM et al (2012) Effects of self-motion on auditory scene analysis. Proc Natl Acad Sci 109(17):6775–6780. https://doi.org/10.1073/pnas.1112852109

  334. Kondo HM et al (2018) Inhibition-excitation balance in the parietal cortex modulates volitional control for auditory and visual multistability. Sci Rep 8:14548, 13 p. https://doi.org/10.1038/s41598-018-32892-3

  335. Kopp-Scheinpflug C, Sinclair JL, Linden JF (2018) When sound stops: offset responses in the auditory system. Trends Neurosci 41(10):712–728. https://doi.org/10.1016/j.tins.2018.08.009

  336. Koreimann S, Gula B, Vitouch O (2014) Inattentional deafness in music. Psychol Res 78(3):304–312. https://doi.org/10.1007/s00426-014-0552-x

  337. Kösem A et al (2018) Neural entrainment determines the words we hear. Curr Biol 28(18):2867–2875. https://doi.org/10.1016/j.cub.2018.07.023

  338. Kraus N, Chandrasekaran B (2010) Music training for the development of auditory skills. Nat Rev Neurosci 11(8):599–605. https://doi.org/10.1038/nrn2882

  339. Krishnan L, Elhilali M, Shamma SA (2014) Segregating complex sound sources through temporal coherence. PLoS Comput Biol 10(12):e1003985, 10 p. https://doi.org/10.1371/journal.pcbi.1003985

  340. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th international conference on neural information processing systems 3–6 December 2012, Lake Tahoe, NV, pp 1097–1105. https://doi.org/10.1145/3065386

  341. Krumhansl CL, Iverson P (1992) Perceptual interaction between musical pitch and timbre. J Exp Psychol: Hum Percept Perform 18(3):739–751. https://doi.org/10.1037/0096-1523.18.3.739

  342. Kunert R, Jongman SR (2017) Entrainment to an auditory signal: is attention involved? J Exp Psychol Gen 146(1):77–88. https://doi.org/10.1037/xge0000246

  343. Kuroda T, Nakajima Y, Eguchi S (2012) Illusory continuity without sufficient sound energy to fill a temporal gap: Examples of crossing glide tones. J Exp Psychol: Hum Percept Perform 38(5):1254–1267. https://doi.org/10.1037/a0026629

  344. Kuroyanagi J et al (2019) Automatic comparison of human music, speech, and bird song suggests uniqueness of human scales. In: Proceedings of the 9th international workshop on folk music analysis (FMA 2019), Birmingham, UK, pp 35–40. https://biblio.ugent.be/publication/8621733

  345. Kwak C, Han W (2020) Towards size of scene in auditory scene analysis: a systematic review. J Audiol Otol 24(1):1–9. https://doi.org/10.7874/jao.2019.00248

  346. Landauer TK (1962) Rate of implicit speech. Percept Motor Skills 15(3):646. https://doi.org/10.2466/pms.1962.15.3.646

  347. Large EW (2008) Resonating to musical rhythm: theory and experiment. In: Grondin S (ed) Psychology of time, Chap 6. Emerald Group Publishing Limited, Bingley, UK, pp 189–231

  348. Large EW (2015) Rhythm perception: pulse and meter. In: Jaeger D, Jung R (eds) Encyclopedia of computational neuroscience. Springer Science+Business Media Inc, New York, NY, pp 2650–2654

  349. Large EW, Gray PM (2015) Spontaneous tempo and rhythmic entrainment in a bonobo (Pan paniscus). J Comp Psychol 129(4):317–328. https://doi.org/10.1037/com0000011

  350. Large EW, Herrera JA, Velasco MJ (2015) Neural networks for beat perception in musical rhythm. Front Syst Neurosci 9, Article 159, 14 p. https://doi.org/10.3389/fnsys.2015.00159

  351. Large EW, Jones MR (1999) The dynamics of attending: how people track time-varying events. Psychol Rev 106(1):119–159. https://doi.org/10.1037/0033-295X.106.1.119

  352. Large EW, Kolen JF (1994) Resonance and the perception of musical meter. Connect Sci 6(1):177–208. https://doi.org/10.1080/09540099408915723

  353. Large EW, Palmer C (2002) Perceiving temporal regularity in music. Cognit Sci 26(1):1–37. https://doi.org/10.1016/S0364-0213(01)00057-X

  354. Large EW, Snyder JS (2009) Pulse and meter as neural resonance. Ann N Y Acad Sci 1169(1):46–57. https://doi.org/10.1111/j.1749-6632.2009.04550.x

  355. Larrouy-Maestri P, Pfordresher PQ (2018) Pitch perception in music: do scoops matter? J Exp Psychol Hum Percept Perform 44(10):1523–1541. https://doi.org/10.1037/xhp0000550

  356. Larson E, Lee AK (2013) Influence of preparation time and pitch separation in switching of auditory attention between streams. J Acoust Soc Am 134(2):EL165–EL171. https://doi.org/10.1121/1.4812439

  357. Lawrance ELA et al (2014) Temporal predictability enhances auditory detection. J Acoust Soc Am 135(6):EL357–EL363. https://doi.org/10.1121/1.4879667

  358. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444. https://doi.org/10.1038/nature14539

  359. Lee AK, Maddox RK, Bizley JK (2019) An object-based interpretation of audiovisual processing. In: Lee AK et al (eds) Multisensory processes: the auditory perspective, Chap 4. Springer Nature Switzerland AG, Cham, Switzerland, pp 59–83. https://doi.org/10.1007/978-3-030-10461-0_4

  360. Lee AK, Shinn-Cunningham BG (2008) Effects of frequency disparities on trading of an ambiguous tone between two competing auditory objects. J Acoust Soc Am 123(6):4340–4351. https://doi.org/10.1121/1.2908282

  361. Leibovich T et al (2017) From ‘sense of number’ to ‘sense of magnitude’: the role of continuous magnitudes in numerical cognition. Behav Brain Sci 40:e164, 62 p. https://doi.org/10.1017/S0140525X16000960

  362. Levitin DJ, Grahn JA, London J (2018) The psychology of music: rhythm and movement. Annu Rev Psychol 69:51–75. https://doi.org/10.1146/annurev-psych-122216-011740

  363. Levitin DJ, Rogers SE (2005) Absolute pitch: perception, coding, and controversies. Trends Cognit Sci 9(1):26–33. https://doi.org/10.1016/j.tics.2004.11.007

  364. Liao H-I et al (2016) Human pupillary dilation response to deviant auditory stimuli: Effects of stimulus properties and voluntary attention. Front Neurosci 10, Article 43, 14 p. https://doi.org/10.3389/fnins.2016.00043

  365. Liberman AM, Isenberg D, Rakerd B (1981) Duplex perception of cues for stop consonants: evidence for a phonetic mode. Percept Psychophys 30(2):133–143. https://doi.org/10.3758/BF03204471

  366. Liberman M, Prince A (1977) On stress and linguistic rhythm. Linguist Inquiry 8(2):249–336

  367. Little DF, Snyder JS, Elhilali M (2020) Ensemble modeling of auditory streaming reveals potential sources of bistability across the perceptual hierarchy. PLoS Comput Biol 16(4):e1007746, 31 p. https://doi.org/10.1371/journal.pcbi.1007746

  368. Lomber SG, Malhotra S (2008) Double dissociation of ‘what’ and ‘where’ processing in auditory cortex. Nat Neurosci 11(5):609–616. https://doi.org/10.1038/nn.2108

  369. London J (2002) Cognitive constraints on metric systems: some observations and hypotheses. Music Percept: Interdiscip J 19(4):529–550. https://doi.org/10.1525/mp.2002.19.4.529

  370. London J (2012) Three things linguists need to know about rhythm and time in music. Empir Musicol Rev 7(1–2):5–11. https://doi.org/10.18061/1811/52973

  371. Luck G, Sloboda JA (2009) Spatio-temporal cues for visually mediated synchronization. Music Percept: Interdiscip J 26(5):465–473. https://doi.org/10.1525/mp.2009.26.5.465

  372. Luo H, Poeppel D (2007) Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron 54(6):1001–1010. https://doi.org/10.1016/j.neuron.2007.06.004

  373. Luo X, Masterson ME, Wu C-C (2014) Melodic interval perception by normal-hearing listeners and cochlear implant users. J Acoust Soc Am 136(4):1831–1844. https://doi.org/10.1121/1.4894738

  374. Lyzenga J, Carlyon RP, Moore BC (2005) Dynamic aspects of the continuity illusion: perception of level and of the depth, rate, and phase of modulation. Hear Res 210:30–41. https://doi.org/10.1016/j.heares.2005.07.002

  375. MacDorman CF (1962) Synchronization with auditory models of varying complexity. Percept Motor Skills 15(3):595–602

  376. MacDougall R (1902) Rhythm, time and number. Am J Psychol 13(1):88–97. https://doi.org/10.2307/1412206

  377. MacDougall R (1903) The structure of simple rhythm forms. Psychol Rev Monogr Suppl 4(1):309–412. http://www.gutenberg.org/files/16266/16266-h/16266-h.htm#AES1

  378. MacLeod CM (1991) Half a century of research on the Stroop effect: an integrative review. Psychol Bull 109(2):163–203. https://doi.org/10.1037/0033-2909.109.2.163

  379. Madison G (2006) Experiencing groove induced by music: consistency and phenomenology. Music Percept: Interdiscip J 24(2):201–208. https://doi.org/10.1525/mp.2006.24.2.201

  380. Madison G, Merker B (2002) On the limits of anisochrony in pulse attribution. Psychol Res 66(3):201–207. https://doi.org/10.1007/s00426-001-0085-y

  381. Madsen S, Dau T, Moore BC (2018) Effect of harmonic rank on sequential sound segregation. Hear Res 367:161–168. https://doi.org/10.1016/j.heares.2018.06.002

  382. Makov S et al (2017) Sleep disrupts high-level speech parsing despite significant basic auditory processing. J Neurosci 37(32):7772–7781. https://doi.org/10.1523/JNEUROSCI.0168-17.2017

  383. Malmberg CF (1918) The perception of consonance and dissonance. Psychol Monogr 25(2):93–133. https://doi.org/10.1037/h0093119

  384. Malmierca MS et al (2019) Pattern-sensitive neurons reveal encoding of complex auditory regularities in the rat inferior colliculus. Neuroimage 184:889–900. https://doi.org/10.1016/j.neuroimage.2018.10.012

  385. Mandler G, Shebo BJ (1982) Subitizing: an analysis of its component processes. J Exp Psychol Gen 111(1):1–22. https://doi.org/10.1037/0096-3445.111.1.1

  386. Marin CMH, McAdams S (1991) Segregation of concurrent sounds. II: effects of spectral envelope tracing, frequency modulation coherence, and frequency modulation width. J Acoust Soc Am 89(1):341–351. https://doi.org/10.1121/1.400469

  387. Marozeau J, De Cheveigné A (2007) The effect of fundamental frequency on the brightness dimension of timbre. J Acoust Soc Am 121(1):383–387. https://doi.org/10.1121/1.2384910

  388. Marozeau J, Innes-Brown H, Blamey PJ (2013) The effect of timbre and loudness on melody segregation. Music Percept: Interdiscip J 30(3):259–274. https://doi.org/10.1525/mp.2012.30.3.259

  389. Marozeau J et al (2003) The dependency of timbre on fundamental frequency. J Acoust Soc Am 114(5):2946–2957. https://doi.org/10.1121/1.1618239

  390. Marozeau J et al (2010) The effect of visual cues on auditory stream segregation in musicians and non-musicians. PLoS ONE 5(6):e11297, 10 p. https://doi.org/10.1371/journal.pone.0011297

  391. Martin JG (1972) Rhythmic (hierarchical) versus serial structure in speech and other behavior. Psychol Rev 79(6):487–509. https://doi.org/10.1037/h0033467

  392. Massaro DW (1976) Perceiving counting sounds. J Exp Psychol Hum Percept Perform 2(3):337–346. https://doi.org/10.1037/0096-1523.2.3.337

  393. Masutomi K et al (2016) Sound segregation via embedded repetition is robust to inattention. J Exp Psychol Hum Percept Perform 42(3):386–400. https://doi.org/10.1037/xhp0000147

  394. McAdams S (2013) Musical timbre perception. In: Deutsch D (ed) The psychology of music, Chap 2. Elsevier, Amsterdam, pp 35–67. https://doi.org/10.1016/B978-0-12-381460-9.00002-X

  395. McAdams S (1989) Segregation of concurrent sounds. I: effects of frequency modulation coherence. J Acoust Soc Am 86(6):2148–2159. https://doi.org/10.1121/1.398475

  396. McAdams S, Botte M-C, Drake C (1998) Auditory continuity and loudness computation. J Acoust Soc Am 103(3):1580–1591. https://doi.org/10.1121/1.421293

  397. McAdams S, Bregman AS (1979) Hearing musical streams. Comput Music J 3(4):26–60. http://www.jstor.org/stable/4617866

  398. McAdams S, Giordano BL (2009) The perception of musical timbre. In: Hallam S, Cross I, Thaut M (eds) The Oxford handbook of music psychology. Oxford University Press, Oxford, UK, pp 72–80

  399. McAuley JD (2010) Tempo and rhythm. In: Jones MR, Fay R, Popper AN (eds) Music perception, Chap 6. Springer Science+Business Media, New York, NY, pp 165–199. https://doi.org/10.1007/978-1-4419-6114-3_6

  400. McAuley JD, Jones MR (2003) Modeling effects of rhythmic context on perceived duration: a comparison of interval and entrainment approaches to short-interval timing. J Exp Psychol Hum Percept Perform 29(6):1102–1125. https://doi.org/10.1037/0096-1523.29.6.1102

  401. McCabe SL, Denham MJ (1997) A model of auditory streaming. J Acoust Soc Am 101(3):1611–1621. https://doi.org/10.1121/1.418176

  402. McClaskey CM (2016) Factors affecting relative pitch perception. Irvine, CA, 2016, pp i–xii, 1–91. https://escholarship.org/uc/item/32k8f2k9

  403. McCloy DR et al (2017) Pupillometry shows the effort of auditory attention switching. J Acoust Soc Am 141(4):2440–2451. https://doi.org/10.1121/1.4979340

  404. McDermott JH, Wrobleski D, Oxenham AJ (2011) Recovering sound sources from embedded repetition. Proc Natl Acad Sci USA 108(3):1188–1193. https://doi.org/10.1073/pnas.1004765108

  405. McDermott JH (2009) The cocktail party problem. Curr Biol 19(22):R1024–R1027. https://doi.org/10.1016/j.cub.2009.09.005

  406. McDermott JH, Lehr AJ, Oxenham AJ (2010) Individual differences reveal the basis of consonance. Curr Biol 20(11):1035–1041. https://doi.org/10.1016/j.cub.2010.04.019

  407. McDermott JH, Lehr AJ, Oxenham AJ (2008) Is relative pitch specific to pitch? Psychol Sci 19(12):1263–1271. https://doi.org/10.1111/j.1467-9280.2008.02235.x

  408. McDermott JH, Oxenham AJ (2008) Spectral completion of partially masked sounds. Proc Natl Acad Sci 105(15):5939–5944. https://doi.org/10.1073/pnas.0711291105

  409. McDermott JH, Oxenham AJ, Simoncelli EP (2009) Sound texture synthesis via filter statistics. In: Proceedings of the IEEE workshop on applications of signal processing to audio and acoustics (WASPAA’09) 18–21 October 2009, New Paltz, NY, pp 297–300. https://doi.org/10.1109/ASPAA.2009.5346467

  410. McDermott JH et al (2016) Indifference to dissonance in native Amazonians reveals cultural variation in music perception. Nature 535(7613):547–550. https://doi.org/10.1038/nature18635

  411. McDermott JH et al (2010) Musical intervals and relative pitch: frequency resolution, not interval resolution, is special. J Acoust Soc Am 128(4):1943–1951. https://doi.org/10.1121/1.3478785

  412. McLachlan NM, Marco DJT, Wilson SJ (2012) Pitch enumeration: failure to subitize in audition. PLoS ONE 7(4):e33661, 5 p. https://doi.org/10.1371/journal.pone.0033661

  413. McLachlan NM et al (2013) Consonance and pitch. J Exp Psychol Gen 142(4):1142–1158. https://doi.org/10.1037/a0030830

  414. McPherson MJ, Grace RC, McDermott JH (2022) Harmonicity aids hearing in noise. Attent Percept Psychophys 84:1016–1042. https://doi.org/10.3758/s13414-021-02376-0

  415. McPherson MJ, McDermott JH (2017) Diversity in pitch perception revealed by task dependence. Nat Hum Behav 2(1):52–66. https://doi.org/10.1038/s41562-017-0261-8

  416. McWalter R, McDermott JH (2018) Adaptive and selective time averaging of auditory scenes. Curr Biol 28(9):1405–1418. https://doi.org/10.1016/j.cub.2018.03.049

  417. Mehta AH et al (2017) An auditory illusion reveals the role of streaming in the temporal misallocation of perceptual objects. Philos Trans Roy Soc B: Biol Sci 372(1714):20160114, 10 p. https://doi.org/10.1098/rstb.2016.0114

  418. Merchant H, Honing H (2014) Are non-human primates capable of rhythmic entrainment? Evidence for the gradual audiomotor evolution hypothesis. Front Neurosci 7, Article 274, 8 p. https://doi.org/10.3389/fnins.2013.00274

  419. Merchant H et al (2015) Finding the beat: a neural perspective across humans and non-human primates. Philos Trans Roy Soc B: Biol Sci 370(1664):20140093, 16 p. https://doi.org/10.1098/rstb.2014.0093

  420. Merker B, Morley I, Zuidema W (2015) Five fundamental constraints on theories of the origins of music. Philos Trans Roy Soc B: Biol Sci 370(1664):20140095, 11 p. https://doi.org/10.1098/rstb.2014.0095

  421. Merker BH, Madison GS, Eckerdal P (2009) On the role and origin of isochrony in human rhythmic entrainment. Cortex 45(1):4–17. https://doi.org/10.1016/j.cortex.2008.06.011

  422. Mertens P (2004) The Prosogram: Semi-automatic transcription of prosody based on a tonal perception model. In: Proceedings of the international conference on speech prosody 23–26 March 2004, Nara, Japan, 4 p. https://www.isca-speech.org/archive_open/sp2004/sp04_549.pdf

  423. Meyer L (2018) The neural oscillations of speech processing and language comprehension: state of the art and emerging mechanisms. Eur J Neurosci 48(7):2609–2621. https://doi.org/10.1111/ejn.13748

  424. Micheyl C, Hunter C, Oxenham AJ (2010) Auditory stream segregation and the perception of across frequency synchrony. J Exp Psychol: Hum Percept Perform 36(4):1029–1039. https://doi.org/10.1037/a0017601

  425. Micheyl C, Oxenham AJ (2010) Objective and subjective psychophysical measures of auditory stream integration and segregation. J Assoc Res Otolaryngol 11(4):709–724. https://doi.org/10.1007/s10162-010-0227-2

  426. Michon JA (1964) Studies on subjective duration: I. Differential sensitivity in the perception of repeated temporal intervals. Acta Psychol 22:441–450. https://doi.org/10.1016/0001-6918(64)90032-0

  427. Middlebrooks JC (2017) Spatial stream segregation. In: Middlebrooks JC et al (eds) The auditory system at the cocktail party, Chap 6. Springer International Publishing, Cham, Switzerland, pp 137–168. https://doi.org/10.1007/978-3-319-51662-2_6

  428. Middlebrooks JC et al (eds) (2017) The auditory system at the cocktail party. Springer International Publishing, Cham, Switzerland, pp i–xiv, 1–291. https://doi.org/10.1007/978-3-319-51662-2

  429. Mill RW et al (2013) Modelling the emergence and dynamics of perceptual organisation in auditory streaming. PLoS Comput Biol 9(3):e1002925, 21 p. https://doi.org/10.1371/journal.pcbi.1002925

  430. Miller GA, Heise GA (1950) The trill threshold. J Acoust Soc Am 22(5):637–638. https://doi.org/10.1121/1.1906663

  431. Miller GA, Licklider J (1950) The intelligibility of interrupted speech. J Acoust Soc Am 22(2):167–173. https://doi.org/10.1121/1.1906584

  432. Miśkiewicz A, Rakowski A, Rościszewska T (2006) Perceived roughness of two simultaneous pure tones. Acta Acustica united with Acustica 92(2):331–336

  433. Miśkiewicz A, Rogala T, Szczepańska-Antosik J (2007) Perceived roughness of two simultaneous harmonic complex tones. Arch Acoust 32(3):737–748. http://acoustics.ippt.pan.pl/index.php/aa/article/viewFile/726/639

  434. Miyake I (1902) Researches on rhythmic activity. Stud Yale Psychol Lab 10:1–48

  435. Młynarski W, McDermott JH (2019) Ecological origins of perceptual grouping principles in the auditory system. Proc Natl Acad Sci 116(50):25355–25364. https://doi.org/10.1073/pnas.1903887116

  436. Młynarski W, McDermott JH (2018) Learning midlevel auditory codes from natural sound statistics. Neural Comput 30(3):631–669. https://doi.org/10.1162/neco_a_01048

  437. Molloy K, Lavie N, Chait M (2019) Auditory figure-ground segregation is impaired by high visual load. J Neurosci 39(9):1699–1708. https://doi.org/10.1523/JNEUROSCI.2518-18.2018

  438. Moore BC (2012) An introduction to the psychology of hearing, 6th edn. Emerald Group Publishing Limited, Bingley, UK

  439. Moore BC, Gockel HE (2002) Factors influencing sequential stream segregation. Acta Acust Acust 88(3):320–333

  440. Moore BC, Gockel HE (2012) Properties of auditory stream formation. Philos Trans Roy Soc Lond B: Biol Sci 367(1591):919–931. https://doi.org/10.1098/rstb.2011.0355

  441. Moore DR (2003) Cortical neurons signal sound novelty. Nat Neurosci 6(4):330–332. https://doi.org/10.1038/nn0403-330

  442. Moray N (1959) Attention in dichotic listening: affective cues and the influence of instructions. Quart J Exp Psychol 11(1):56–60. https://doi.org/10.1080/17470215908416289

  443. Musso M et al (2020) Musicians use speech-specific areas when processing tones: the key to their superior linguistic competence? Behav Brain Res 390:112662, 13 p. https://doi.org/10.1016/j.bbr.2020.112662

  444. Näätänen R, Kujala T, Light G (2019) Mismatch negativity: a window to the brain. Oxford University Press, Oxford, UK

  445. Näätänen R et al (2007) The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clin Neurophysiol 118(12):2544–2590. https://doi.org/10.1016/j.clinph.2007.04.026

  446. Nager W et al (2003) Preattentive evaluation of multiple perceptual streams in human audition. NeuroReport 14(6):871–874. https://doi.org/10.1097/00001756-200305060-00019

  447. Naik GR, Wang W (eds) (2014) Blind source separation: advances in theory, algorithms and applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55016-4

  448. Nakajima Y, Hoopen G ten, Van der Wilk R (1991) A new illusion of time perception. Music Percept: Interdiscip J 8(4):431–448. https://doi.org/10.2307/40285521

  449. Nakajima Y et al (2014) Auditory grammar. Acoust Aust 42(2):97–101

  450. Nakajima Y et al (2000) Illusory recouplings of onsets and terminations of glide tone components. Percept Psychophys 62(7):1413–1425. https://doi.org/10.3758/BF03212143

  451. Nakajima Y et al (1992) Time-shrinking: a discontinuity in the perception of auditory temporal patterns. Percept Psychophys 51(5):504–507. https://doi.org/10.3758/BF03211646

  452. Nakajima Y et al (2004) Time-shrinking: the process of unilateral temporal assimilation. Perception 33(9):1061–1079. https://doi.org/10.1068/p5061

  453. Neisser U, Becklen R (1975) Selective looking: attending to visually specified events. Cognit Psychol 7(4):480–494. https://doi.org/10.1016/0010-0285(75)90019-5

  454. Nelken I (2014) Stimulus-specific adaptation and deviance detection in the auditory system: experiments and models. Biol Cybernet 108(5):655–663. https://doi.org/10.1007/s00422-014-0585-7

  455. Newman RS, Evers S (2007) The effect of talker familiarity on stream segregation. J Phon 35(1):85–103. https://doi.org/10.1016/j.wocn.2005.10.004

  456. Nguyen T, Gibbings A, Grahn J (2018) Rhythm and beat perception. In: Springer handbook of systematic musicology, Chap 27. Springer GmbH Germany, Cham, Switzerland, pp 507–521. https://doi.org/10.1007/978-3-662-55004-5_27

  457. Niebuhr O (2009) F0-based rhythm effects on the perception of local syllable prominence. Phonetica 66(1–2):95–112. https://doi.org/10.1159/000208933

  458. Ning R et al (2019) Perceptual-learning evidence for inter-onset-interval- and frequency-specific processing of fast rhythms. Attent Percept Psychophys 81(2):533–542. https://doi.org/10.3758/s13414-018-1631-7

  459. Nobre AC, Van Ede F (2018) Anticipated moments: temporal structure in attention. Nat Rev Neurosci 19(1):34–48. https://doi.org/10.1038/nrn.2017.141

  460. Nobre AC (2001) Orienting attention to instants in time. Neuropsychologia 39(12):1317–1328. https://doi.org/10.1016/S0028-3932(01)00120-8

  461. Nobre AC, Correa A, Coull JT (2007) The hazards of time. Curr Opin Neurobiol 17(4):465–470. https://doi.org/10.1016/j.conb.2007.07.006

  462. Nolan F (2003) Intonational equivalence: an experimental evaluation of pitch scales. In: Proceedings of the 15th international congress of phonetic sciences (Barcelona), pp 771–774. https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2003/papers/p15_0771.pdf

  463. Nolan F, Jeon H-S (2014) Speech rhythm: a metaphor? Philos Trans Roy Soc B: Biol Sci 369(1658):20130396, 11 p. https://doi.org/10.1098/rstb.2013.0396

  464. Norman-Haignere S, Kanwisher NG, McDermott JH (2015) Distinct cortical pathways for music and speech revealed by hypothesis-free voxel decomposition. Neuron 88(6):1281–1296. https://doi.org/10.1016/j.neuron.2015.11.035

  465. Norris D, McQueen JM, Cutler A (2016) Prediction, Bayesian inference and feedback in speech recognition. Lang Cognit Neurosci 31(1):4–18. https://doi.org/10.1080/23273798.2015.1081703

  466. O’Sullivan JA, Shamma SA, Lalor EC (2015) Evidence for neural computations of temporal coherence in an auditory scene and their enhancement during active listening. J Neurosci 35(18):7256–7263. https://doi.org/10.1523/JNEUROSCI.4973-14.2015

  467. Oberfeld D (2014) An objective measure of auditory stream segregation based on molecular psychophysics. Attent Percept Psychophys 76(3):829–851. https://doi.org/10.3758/s13414-013-0613-z

  468. Oesch N (2019) Music and language in social interaction: synchrony, antiphony and functional origins. Front Psychol 10, Article 1514, 13 p. https://doi.org/10.3389/fpsyg.2019.01514

  469. Ogg M et al (2019) Separable neural representations of sound sources: speaker identity and musical timbre. Neuroimage 191:116–126. https://doi.org/10.1016/j.neuroimage.2019.01.075

  470. Ono K (2018) Modality-dependent effect of motion information in sensory-motor synchronised tapping. Neurosci Lett 675:31–35. https://doi.org/10.1016/j.neulet.2018.03.055

  471. Ortega L et al (2014) Audition dominates vision in duration perception irrespective of salience, attention, and temporal discriminability. Attent Percept Psychophys 76(5):1485–1502. https://doi.org/10.3758/s13414-014-0663-x

  472. Ortmann O (1926) On the melodic relativity of tones. Psychol Monogr 35(1): i–ii, 1–47. https://doi.org/10.1037/h0093210

  473. Oxenham AJ (2018) How we hear: the perception and neural coding of sound. Annu Rev Psychol 69:27–50. https://doi.org/10.1146/annurev-psych-122216-011635

  474. Oxenham AJ, Dau T (2001) Towards a measure of auditory-filter phase response. J Acoust Soc Am 110(6):3169–3178. https://doi.org/10.1121/1.1414706

  475. Paavilainen P (2013) The mismatch-negativity (MMN) component of the auditory event-related potential to violations of abstract regularities: a review. Int J Psychophysiol 88(2):109–123. https://doi.org/10.1016/j.ijpsycho.2013.03.015

  476. Park H-J, Friston K (2013) Structural and functional brain networks: from connections to cognition. Science 342(6158), Article 1238411, 8 p. https://doi.org/10.1126/science.1238411

  477. Parncutt R (1994) A perceptual model of pulse salience and metrical accent in musical rhythms. Music Percept: Interdiscip J 11(4):409–464. https://doi.org/10.2307/40285633

  478. Parncutt R, Hair G (2018) A psychocultural theory of musical interval: bye bye Pythagoras. Music Percept: Interdiscip J 35(4):475–501. https://doi.org/10.1525/mp.2018.35.4.475

  479. Parncutt R, Hair G (2011) Consonance and dissonance in music theory and psychology: disentangling dissonant dichotomies. J Interdiscip Music Stud 5(2):119–166. http://musicstudies.org/wp-content/uploads/2017/01/Parncutt_JIMS_11050202.pdf

  480. Parras GG et al (2017) Neurons along the auditory pathway exhibit a hierarchical organization of prediction error. Nat Commun 8:2148, 17 p. https://doi.org/10.1038/s41467-017-02038-6

  481. Pashler H (2001) Perception and production of brief durations: Beat-based versus interval-based timing. J Exp Psychol: Hum Percept Perform 27(2):485–493. https://doi.org/10.1037/0096-1523.27.2.485

  482. Pastore RE et al (1983) Duplex perception with musical stimuli. Percept Psychophys 33(5):469–474. https://doi.org/10.3758/BF03202898

  483. Patel AD (2008) Music, language, and the brain. Oxford University Press, Oxford, UK

  484. Patel AD (2006) Musical rhythm, linguistic rhythm, and human evolution. Music Percept: Interdiscip J 24(1):99–104. https://doi.org/10.1525/mp.2006.24.1.99

  485. Patel AD (2003) Rhythm in language and music: parallels and differences. Ann N Y Acad Sci 999(1):140–143. https://doi.org/10.1196/annals.1284.015

  486. Patel AD et al (2009) Studying synchronization to a musical beat in nonhuman animals. Ann N Y Acad Sci 1169(1):459–469. https://doi.org/10.1111/j.1749-6632.2009.04581.x

  487. Patel AD et al (2005) The influence of metricality and modality on synchronization with a beat. Exp Brain Res 163(2):226–238. https://doi.org/10.1007/s00221-004-2159-8

  488. Paton JJ, Buonomano DV (2018) The neural basis of timing: distributed mechanisms for diverse functions. Neuron 98(4):687–705. https://doi.org/10.1016/j.neuron.2018.03.045

  489. Peelle JE, Davis MH (2012) Neural oscillations carry speech rhythm through to comprehension. Front Psychol 3, Article 320, 17 p. https://doi.org/10.3389/fpsyg.2012.00320

  490. Pérez-González D, Malmierca MS, Covey E (2005) Novelty detector neurons in the mammalian auditory midbrain. Eur J Neurosci 22(11):2879–2885. https://doi.org/10.1111/j.1460-9568.2005.04472.x

  491. Peter B et al (2015) Direct and octave-shifted pitch matching during nonword imitations in men, women, and children. J Voice 29(2):260.e21–260.e30. https://doi.org/10.1016/j.jvoice.2014.06.011

  492. Petkov CI, O’Connor KN, Sutter ML (2007) Encoding of illusory continuity in primary auditory cortex. Neuron 54(1):153–165. https://doi.org/10.1016/j.neuron.2007.02.031

  493. Phillips DP et al (2012) Dual mechanisms in the perceptual processing of click train temporal regularity. J Acoust Soc Am 132(1):EL22–EL28. https://doi.org/10.1121/1.4728193

  494. Pike KL (1945) The intonation of American English. University of Michigan Press, Ann Arbor, MI

  495. Plomp R, Levelt W (1965) Tonal consonance and critical bandwidth. J Acoust Soc Am 38(4):548–560. https://doi.org/10.1121/1.1909741

  496. Plomp R, Wagenaar WA, Mimpen AM (1973) Musical interval recognition with simultaneous tones. Acta Acustica united with Acustica 29(2):101–109. https://www.ingentaconnect.com/content/dav/aaua/1973/00000029/00000002/art00007

  497. Plomp R (1982) Continuity effects in the perception of sounds. Psychoacoust Music (Jablonna, Poland). as cited by Bregman (1990), pp 351–352. https://acoustics.ippt.gov.pl/index.php/aa/article/view/3076/1996

  498. Popescu T et al (2019) The pleasantness of sensory dissonance is mediated by musical style and expertise. Sci Rep 9:1070, 11 p. https://doi.org/10.1038/s41598-018-35873-8

  499. Popham S et al (2018) Inharmonic speech reveals the role of harmonicity in the cocktail party problem. Nat Commun 9(1):2122, 13 p. https://doi.org/10.1038/s41467-018-04551-8

  500. Port RF (2007) The problem of speech patterns in time. In: Gaskell GM (ed) The Oxford handbook of psycholinguistics, Chap 30. Oxford University Press, Oxford, UK, pp 503–514

  501. Poudrier È, Repp BH (2013) Can musicians track two different beats simultaneously? Music Percept: Interdiscip J 30(4):369–390. https://doi.org/10.1525/mp.2013.30.4.369

  502. Povel D-J (1981) The internal representation of simple temporal patterns. J Exp Psychol: Hum Percept Perform 7(1):3–18. https://doi.org/10.1037/0096-1523.7.1.3

  503. Povel D-J, Essens P (1985) Perception of temporal patterns. Music Percept: Interdiscip J 2(4):411–440. https://doi.org/10.2307/40285311

  504. Powers GL, Wilcox JC (1977) Intelligibility of temporally interrupted speech with and without intervening noise. J Acoust Soc Am 61(1):195–199. https://doi.org/10.1121/1.381255

  505. Pressnitzer D, Hupé J-M (2006) Temporal dynamics of auditory and visual bistability reveal common principles of perceptual organization. Curr Biol 16(13):1351–1357. https://doi.org/10.1016/j.cub.2006.05.054

  506. Pressnitzer D et al (2008) Perceptual organization of sound begins in the auditory periphery. Curr Biol 18(15):1124–1128. https://doi.org/10.1016/j.cub.2008.06.053

  507. Price C, Thierry G, Griffiths T (2005) Speech-specific auditory processing: Where is it? Trends Cognit Sci 9(6):271–276. https://doi.org/10.1016/j.tics.2005.03.009

  508. Price CJ (2012) A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. NeuroImage 62(2):816–847. https://doi.org/10.1016/j.neuroimage.2012.04.062

  509. Prince JB, Rice T (2018) Regularity and dimensional salience in temporal grouping. J Exp Psychol Hum Percept Perform 44(9):1356–1367. https://doi.org/10.1037/xhp0000542

  510. Prince JB, Sopp M (2019) Temporal expectancies affect accuracy in standard-comparison judgments of duration, but neither pitch height, nor timbre, nor loudness. J Exp Psychol Hum Percept Perform 45(5):585–600. https://doi.org/10.1037/xhp0000629

  511. Puschmann S et al (2013) Electrophysiological correlates of auditory change detection and change deafness in complex auditory scenes. Neuroimage 75:155–164. https://doi.org/10.1016/j.neuroimage.2013.02.037

  512. Puvvada KC, Simon JZ (2017) Cortical representations of speech in a multitalker auditory scene. J Neurosci 37(38):9189–9196. https://doi.org/10.1523/JNEUROSCI.0938-17.2017

  513. Pylyshyn Z (1999) Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behav Brain Sci 22(3):341–365. https://doi.org/10.1017/S0140525X99002022

  514. Quené H (2007) On the just noticeable difference for tempo in speech. J Phon 35(3):353–362. https://doi.org/10.1016/j.wocn.2006.09.001

  515. Quené H, Port RF (2005) Effects of timing regularity and metrical expectancy on spoken-word perception. Phonetica 62(1):1–13. https://doi.org/10.1159/000087222

  516. Rahne T, Böckmann-Barthel M (2009) Visual cues release the temporal coherence of auditory objects in auditory scene analysis. Brain Res 1300:125–134. https://doi.org/10.1016/j.brainres.2009.08.086

  517. Rahne T et al (2008) A multilevel and cross-modal approach towards neuronal mechanisms of auditory streaming. Brain Res 1220:118–131. https://doi.org/10.1016/j.brainres.2007.08.011

  518. Rahne T et al (2007) Visual cues can modulate integration and segregation of objects in auditory scene analysis. Brain Res 1144:127–135. https://doi.org/10.1016/j.brainres.2007.01.074

  519. Rajasingam SL, Summers RJ, Roberts B (2018) Stream biasing by different induction sequences: evaluating stream capture as an account of the segregation-promoting effects of constant-frequency inducers. J Acoust Soc Am 144(6):3409–3420. https://doi.org/10.1121/1.5082300

  520. Rajendran VG, Harper NS, Schnupp JWH (2020) Auditory cortical representation of music favours the perceived beat. Roy Soc Open Sci 7(3):191194, 13 p. https://doi.org/10.1098/rsos.191194

  521. Rajendran VG, Teki S, Schnupp JWH (2018) Temporal processing in audition: insights from music. Neuroscience 389:4–18. https://doi.org/10.1016/j.neuroscience.2017.10.041

  522. Rajendran VG et al (2016) Rhythm facilitates the detection of repeating sound patterns. Front Neurosci 10, Article 9, 7 p. https://doi.org/10.3389/fnins.2016.00009

  523. Rajendran VG et al (2013) Temporal predictability as a grouping cue in the perception of auditory streams. J Acoust Soc Am 134(1):EL96–EL104. https://doi.org/10.1121/1.4811161

  524. Ramus F, Nespor M, Mehler J (1999) Correlates of linguistic rhythm in the speech signal. Cognition 73(3):265–292. https://doi.org/10.1016/S0010-0277(99)00058-

  525. Rand TC (1974) Dichotic release from masking for speech. J Acoust Soc Am 55(3):678–680. https://doi.org/10.1121/1.1914584

  526. Rankin J, Osborn Popp PJ, Rinzel J (2017) Stimulus pauses and perturbations differentially delay or promote the segregation of auditory objects: psychoacoustics and modeling. Front Neurosci 11, Article 198, 12 p. https://doi.org/10.3389/fnins.2017.00198

  527. Rankin J, Rinzel J (2019) Computational models of auditory perception from feature extraction to stream segregation and behavior. Curr Opin Neurobiol 58:46–53. https://doi.org/10.1016/j.conb.2019.06.009

  528. Rankin J, Sussman E, Rinzel J (2015) Neuromechanistic model of auditory bistability. PLoS Comput Biol 11(11):e1004555, 34 p. https://doi.org/10.1371/journal.pcbi.1004555

  529. Rao RPN, Ballard DH (1999) Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci 2(1):79–87. https://doi.org/10.1038/4580

  530. Räsänen O, Doyle G, Frank MC (2018) Pre-linguistic segmentation of speech into syllable-like units. Cognition 171:130–150. https://doi.org/10.1016/j.cognition.2017.11.003

  531. Rauschecker JP, Tian B (2000) Mechanisms and streams for processing of ‘what’ and ‘where’ in auditory cortex. Proc Natl Acad Sci 97(22):11800–11806. https://doi.org/10.1073/pnas.97.22.11800

  532. Ravignani A, Bowling DL, Fitch W (2014) Chorusing, synchrony, and the evolutionary functions of rhythm. Front Psychol 5, Article 1118, 15 p. https://doi.org/10.3389/fpsyg.2014.01118

  533. Ravignani A, Verga L, Greenfield MD (2019) Interactive rhythms across species: the evolutionary biology of animal chorusing and turn-taking. Ann N Y Acad Sci 1453(1):12–21. https://doi.org/10.1111/nyas.14230

  534. Ravignani A et al (2019) Rhythm in speech and animal vocalizations: a cross-species perspective. Ann N Y Acad Sci 1453(1):79–98. https://doi.org/10.1111/nyas.14166

  535. Regev TI, Nelken I, Deouell LY (2019) Evidence for linear but not helical automatic representation of pitch in the human auditory system. J Cognit Neurosci 31(5):669–685. https://doi.org/10.1162/jocn_a_01374

  536. Remijn GB, Nakajima Y, Tanaka S (2007) Perceptual completion of a sound with a short silent gap. Perception 36(6). https://doi.org/10.1068/p5574

  537. Remijn GB et al (2008) Frequency modulation facilitates (modal) auditory restoration of a gap. Hear Res 243(1–2):113–120. https://doi.org/10.1016/j.heares.2008.06.007

  538. Remijn GB et al (1999) On the robustness of time-shrinking. J Acoust Soc Jpn (E) 20(5):365–373. https://doi.org/10.1250/ast.20.365

  539. Repp BH (1984) Categorical perception: Issues, methods, findings. In: Lass NJ (ed) Speech and language: advances in basic research and practice. Academic, Orlando, FL, pp 243–335. https://doi.org/10.1016/B978-0-12-608610-2.50012-1

  540. Repp BH (2007) Hearing a melody in different ways: multistability of metrical interpretation, reflected in rate limits of sensorimotor synchronization. Cognition 102(3):434–454. https://doi.org/10.1016/j.cognition.2006.02.003

  541. Repp BH (1990) Patterns of expressive timing in performances of a Beethoven minuet by nineteen famous pianists. J Acoust Soc Am 88(2):622–641. https://doi.org/10.1121/1.399766

  542. Repp BH (2007) Perceiving the numerosity of rapidly occurring auditory events in metrical and nonmetrical contexts. Percept Psychophys 69(4):529–543. https://doi.org/10.3758/BF03193910

  543. Repp BH (1992) Perceptual restoration of a ‘missing’ speech sound: auditory induction or illusion? Percept Psychophys 51(1):14–32. https://doi.org/10.3758/BF03205070

  544. Repp BH (2006) Rate limits of sensorimotor synchronization. Adv Cognit Psychol 2(2–3):163–181

  545. Repp BH (2005) Sensorimotor synchronization: a review of the tapping literature. Psychon Bull Rev 12(6):969–992. https://doi.org/10.3758/BF03206433

  546. Repp BH, Doggett R (2007) Tapping to a very slow beat: a comparison of musicians and nonmusicians. Music Percept: Interdiscip J 24(4):367–376. https://doi.org/10.1525/mp.2007.24.4.367

  547. Repp BH, Penel A (2002) Auditory dominance in temporal processing: new evidence from synchronization with simultaneous visual and auditory sequences. J Exp Psychol Hum Percept Perform 28(5):1085–1099. https://doi.org/10.1037/0096-1523.28.5.1085

  548. Repp BH, Su Y-H (2013) Sensorimotor synchronization: a review of recent research (2006–2012). Psychon Bull Rev 20(3):403–452. https://doi.org/10.3758/s13423-012-0371-2

  549. Richards DG, Wolz JP, Herman LM (1984) Vocal mimicry of computer-generated sounds and vocal labeling of objects by a bottlenosed dolphin, Tursiops truncatus. J Comparat Psychol 98(1):10–28. https://doi.org/10.1037/0735-7036.98.1.10

  550. Riecke L, Micheyl C, Oxenham AJ (2012) Global not local masker features govern the auditory continuity illusion. J Neurosci 32(13):4660–4664. https://doi.org/10.1523/JNEUROSCI.6261-11.2012

  551. Riecke L, Van Opstal AJ, Formisano E (2008) The auditory continuity illusion: a parametric investigation and filter model. Percept Psychophys 70(1):1–12. https://doi.org/10.3758/PP.70.1.1

  552. Rimmele JM et al (2018) Proactive sensing of periodic and aperiodic auditory patterns. Trends Cognit Sci 22(10):870–882. https://doi.org/10.1016/j.tics.2018.08.003

  553. Roberts B, Glasberg BR, Moore BC (2008) Effects of the build-up and resetting of auditory stream segregation on temporal discrimination. J Exp Psychol: Hum Percept Perform 34(4):992–1006. https://doi.org/10.1037/0096-1523.34.4.992

  554. Roberts B, Glasberg BR, Moore BC (2002) Primitive stream segregation of tone sequences without differences in fundamental frequency or passband. J Acoust Soc Am 112(5):2074–2085. https://doi.org/10.1121/1.1508784

  555. Roberts B, Summers RJ (2019) Dichotic integration of acoustic-phonetic information: competition from extraneous formants increases the effect of second-formant attenuation on intelligibility. J Acoust Soc Am 145(3):1230–1240. https://doi.org/10.1121/1.5091443

  556. Roberts KL et al (2019) Can auditory objects be subitized? J Exp Psychol Hum Percept Perform 45(1):1–15. https://doi.org/10.1037/xhp0000578

  557. Roberts LA (1986) Consonance judgements of musical chords by musicians and untrained listeners. Acta Acustica united with Acustica 62(2):163–171

  558. Rogers WL, Bregman AS (1993) An experimental evaluation of three theories of auditory stream segregation. Percept Psychophys 53(2):179–189. https://doi.org/10.3758/BF03211728

  559. Rogers WL (1998) Cumulation of the tendency to segregate auditory streams: resetting by changes in location and loudness. Percept Psychophys 60(7):1216–1227. https://doi.org/10.3758/BF03206171

  560. Romanski LM et al (1999) Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci 2(12):1131–1136. https://doi.org/10.1038/16056

  561. Rosburg T (2003) Left hemispheric dipole locations of the neuromagnetic mismatch negativity to frequency, intensity and duration deviants. Cognit Brain Res 16(1):83–90. https://doi.org/10.1016/S0926-6410(02)00222-7

  562. Rose MM, Moore BC (2000) Effects of frequency and level on auditory stream segregation. J Acoust Soc Am 108(3):1209–1213. https://doi.org/10.1121/1.1287708

  563. Rose MM, Moore BC (1997) Perceptual grouping of tone sequences by normally hearing and hearing-impaired listeners. J Acoust Soc Am 102(3):1768–1778. https://doi.org/10.1121/1.420108

  564. Rose MM, Moore BC (2005) The relationship between stream segregation and frequency discrimination in normally hearing and hearing-impaired subjects. Hear Res 204(1–2):16–28. https://doi.org/10.1016/j.heares.2004.12.004

  565. Rosenthal DF, Okuno HG (eds) (1998) Computational auditory scene analysis. Lawrence Erlbaum Associates Publishers, Mahwah, NJ, pp i–xiii, 1–399

  566. Ross JM, Iversen JR, Balasubramaniam R (2016) Motor simulation theories of musical beat perception. Neurocase 22(6):558–565. https://doi.org/10.1080/13554794.2016.1242756

  567. Rossi S et al (2020) How the brain understands spoken and sung sentences. Brain Sci 10(1):36, 18 p. https://doi.org/10.3390/brainsci10010036

  568. Russo FA, Thompson WF (2005) An interval size illusion: the influence of timbre on the perceived size of melodic intervals. Percept Psychophys 67(4):559–568. https://doi.org/10.3758/BF03193514

  569. Russo FA, Thompson WF (2005) The subjective size of melodic intervals over a two-octave range. Psychon Bull Rev 12(6):1068–1075. https://doi.org/10.3758/BF03206445

  570. Russo FA, Vuvan DT, Thompson WF (2019) Vowel content influences relative pitch perception in vocal melodies. Music Percept: Interdiscip J 37(1):57–65. https://doi.org/10.1525/mp.2019.37.1.57

  571. Ryan KM (2014) Onsets contribute to syllable weight: statistical evidence from stress and meter. Language 90(2):309–341. https://doi.org/10.1353/lan.2014.0029

  572. Saint-Arnaud N, Popat K (1995) Analysis and synthesis of sound textures. In: Readings in computational auditory scene analysis. In: Proceedings of the IJCAI-95 workshop on readings in computational auditory scene analysis. Taylor & Francis Inc., London, UK, pp 293–308. http://citeseerx.ist.psu.edu/viewdoc/citations?doi=10.1.1.111.586

  573. Salminen NH et al (2015) Neural realignment of spatially separated sound components. J Acoust Soc Am 137(6):3356–3365. https://doi.org/10.1121/1.4921605

  574. Samuel AG (1981) The role of bottom-up confirmation in the phonemic restoration illusion. J Exp Psychol: Hum Percept Perform 7(5):1124–1131. https://doi.org/10.1037/0096-1523.7.5.1124

  575. Sasaki T (1980) Sound restoration and temporal localization of noise in speech and music sounds. Tohoku Psychol Folia 39(1–4):79–88

  576. Schachner A et al (2009) Spontaneous motor entrainment to music in multiple vocal mimicking species. Curr Biol 19(10):831–836. https://doi.org/10.1016/j.cub.2009.03.061

  577. Schaefer RS, Vlek RJ, Desain P (2011) Decomposing rhythm processing: electroencephalography of perceived and self-imposed rhythmic patterns. Psychol Res 75(2):95–106. https://doi.org/10.1007/s00426-010-0293-4

  578. Scharine AA, McBeath MK (2018) Natural regularity of correlated acoustic frequency and intensity in music and speech: auditory scene analysis mechanisms account for integrality of pitch and loudness. Audit Percept Cognit 1(3–4):205–228. https://doi.org/10.1080/25742442.2019.1600935

  579. Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003

  580. Scholl B, Gao X, Wehr M (2010) Nonoverlapping sets of synapses drive on responses and off responses in auditory cortex. Neuron 65(3):412–421. https://doi.org/10.1016/j.neuron.2010.01.020

  581. Schröger E, Marzecová A, SanMiguel I (2015) Attention and prediction in human audition: a lesson from cognitive psychophysiology. Eur J Neurosci 41(5):641–664. https://doi.org/10.1111/ejn.12816

  582. Schröger E et al (2014) Predictive regularity representations in violation detection and auditory stream segregation: from conceptual to computational models. Brain Topogr 27(4):565–577. https://doi.org/10.1007/s10548-013-0334-6

  583. Schulze H-H (1989) Categorical perception of rhythmic patterns. Psychol Res 51(1):10–15. https://doi.org/10.1007/BF00309270

  584. Schulze H-H (1978) The detectability of local and global displacements in regular rhythmic patterns. Psychol Res 40(2):173–181. https://doi.org/10.1007/BF00308412

  585. Schwartz A, McDermott JH, Shinn-Cunningham BG (2012) Spatial cues alone produce inaccurate sound segregation: The effect of interaural time differences. J Acoust Soc Am 132(1):357–368. https://doi.org/10.1121/1.4718637

  586. Schwartz AH, Shinn-Cunningham BG (2010) Dissociation of perceptual judgments of ‘what’ and ‘where’ in an ambiguous auditory scene. J Acoust Soc Am 128(4):3041–3051. https://doi.org/10.1121/1.3495942

  587. Schwartz J-L et al (2012) Multistability in perception: Binding sensory modalities, an overview. Philos Trans Roy Soc B: Biol Sci 367(1591):896–905. https://doi.org/10.1098/rstb.2011.0254

  588. Sek A, Moore BC (1995) Frequency discrimination as a function of frequency, measured in several ways. J Acoust Soc Am 97(4):2479–2486. https://doi.org/10.1121/1.411968

  589. Sethares WA (1993) Local consonance and the relationship between timbre and scale. J Acoust Soc Am 94(3):1218–1228. https://doi.org/10.1121/1.408175

  590. Sethares WA (2007) Rhythm and transforms. Springer London Limited, London, UK, pp i–xiii, 1–336. https://link-springer-com.dianus.libr.tue.nl/book/10.1007%2F978-1-84628-640-7

  591. Sethares WA (2005) Tuning, timbre, spectrum, scale, 2nd edn. Springer, London, UK, pp i–xviii, 1–426. https://doi.org/10.1007/b138848

  592. Shahin AJ, Bishop CW, Miller LM (2009) Neural mechanisms for illusory filling-in of degraded speech. Neuroimage 44(3):1133–1143. https://doi.org/10.1016/j.neuroimage.2008.09.045

  593. Shamma SA (2008) On the emergence and awareness of auditory objects. PLoS Biol 6(6):e155, 1141–1143. https://doi.org/10.1371/journal.pbio.0060155

  594. Shamma SA, Elhilali M, Micheyl C (2011) Temporal coherence and attention in auditory scene analysis. Trends Neurosci 34(3):114–123. https://doi.org/10.1016/j.tins.2010.11.002

  595. Shamma SA, Klein D (2000) The case of the missing pitch templates: how harmonic templates emerge in the early auditory system. J Acoust Soc Am 107(5):2631–2644. https://doi.org/10.1121/1.428649

  596. Shamma SA, Micheyl C (2010) Behind the scenes of auditory perception. Curr Opin Neurobiol 20(3):361–366. https://doi.org/10.1016/j.conb.2010.03.009

  597. Shamma SA et al (2013) Temporal coherence and the streaming of complex sounds. In: Moore BC et al (ed) Basic aspects of hearing: physiology and perception, Chap 59. Springer Science+Business Media, New York, NY, pp 535–543. https://doi.org/10.1007/978-1-4614-1590-9_59

  598. Shams L, Kamitani Y, Shimojo S (2000) What you see is what you hear. Nature 408(6814):788. https://doi.org/10.1038/35048669

  599. Shestopalova LB et al (2014) Do audio-visual motion cues promote segregation of auditory streams? Front Neurosci 8, Article 64, 11 p. https://doi.org/10.3389/fnins.2014.00064

  600. Shinn-Cunningham BG (2008) Object-based auditory and visual attention. Trends Cognit Sci 12(5):182–186. https://doi.org/10.1016/j.tics.2008.02.003

  601. Shinn-Cunningham BG, Best V (2008) Selective attention in normal and impaired hearing. Trends Amplif 12(4):283–299. https://doi.org/10.1177/1084713808325306

  602. Shinn-Cunningham BG, Best V, Lee AK (2017) Auditory object formation and selection. In: Middlebrooks JC et al (ed) The auditory system at the cocktail party, Chap 2. Springer International Publishing, Cham, Switzerland, pp 7–40. https://doi.org/10.1007/978-3-319-51662-2_2

  603. Shinn-Cunningham BG, Lee AK, Oxenham AJ (2007) A sound element gets lost in perceptual competition. Proc Natl Acad Sci 104(29):12223–12227. https://doi.org/10.1073/pnas.0704641104

  604. Shinn-Cunningham BG, Wang D (2008) Influences of auditory object formation on phonemic restoration. J Acoust Soc Am 123(1):295–301. https://doi.org/10.1121/1.2804701

  605. Shonle JI, Horan KE (1976) Trill threshold revisited. J Acoust Soc Am 59(2):469–471. https://doi.org/10.1121/1.380858

  606. Shriberg EE (1992) Perceptual restoration of filtered vowels with added noise. Lang Speech 35(1–2):127–136. https://doi.org/10.1177/002383099203500211

  607. Sidiras C et al (2017) Spoken word recognition enhancement due to preceding synchronized beats compared to unsynchronized or unrhythmic beats. Front Neurosci 11, Article 415, 11 p. https://doi.org/10.3389/fnins.2017.00415

  608. Siegel JA, Siegel W (1977) Categorical perception of tonal intervals: Musicians can’t tell sharp from flat. Percept Psychophys 21(5):399–407. https://doi.org/10.3758/BF03199493

  609. Siman-Tov T et al (2019) Is there a prediction network? Meta-analytic evidence for a cortical-subcortical network likely subserving prediction. Neurosci Biobehav Rev 105:262–275. https://doi.org/10.1016/j.neubiorev.2019.08.012

  610. Simons DJ, Chabris CF (1999) Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception 28(9):1059–1074. https://doi.org/10.1068/p281059

  611. Singh L, Seet SK (2019) The impact of foreign language caregiving on native language acquisition. J Exp Child Psychol 185:51–70. https://doi.org/10.1016/j.jecp.2019.04.010

  612. Singh PG (1987) Perceptual organization of complex-tone sequences: a tradeoff between pitch and timbre? J Acoust Soc Am 82(3):886–899. https://doi.org/10.1121/1.395287

  613. Singh PG, Bregman AS (1997) The influence of different timbre attributes on the perceptual segregation of complex-tone sequences. J Acoust Soc Am 102(4):1943–1952. https://doi.org/10.1121/1.419688

  614. Sivonen P et al (2006) Phonemic restoration in a sentence context: evidence from early and late ERP effects. Brain Res 1121(1):177–189. https://doi.org/10.1016/j.brainres.2006.08.123

  615. Skerritt-Davis B, Elhilali M (2018) Detecting change in stochastic sound sequences. PLoS Comput Biol 14(5):e1006162, 24 p. https://doi.org/10.1371/journal.pcbi.1006162

  616. Skinner BF (1936) The verbal summator and a method for the study of latent speech. J Psychol 2(1):71–107. https://doi.org/10.1080/00223980.1936.9917445

  617. Slawson AW (1968) Vowel quality and musical timbre as functions of spectrum envelope and fundamental frequency. J Acoust Soc Am 43(1):87–101. https://doi.org/10.1121/1.1910769

  618. Sloboda JA (1983) The communication of musical metre in piano performance. Quart J Exp Psychol Sect A 35(2):377–396. https://doi.org/10.1080/14640748308402140

  619. Smith BK et al (1986) Phase effects in masking related to dispersion in the inner ear. J Acoust Soc Am 80(6):1631–1637. https://doi.org/10.1121/1.394327

  620. Snyder JS, Alain C (2007) Toward a neurophysiological theory of auditory stream segregation. Psychol Bull 133(5):780–799. https://doi.org/10.1037/0033-2909.133.5.780

  621. Snyder JS, Elhilali M (2017) Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 1396(1):39–55. https://doi.org/10.1111/nyas.13317

  622. Snyder JS et al (2012) Attention, awareness, and the perception of auditory scenes. Front Psychol 3, Article 15, 17 p. https://doi.org/10.3389/fpsyg.2012.00015

  623. Southwell R et al (2017) Is predictability salient? A study of attentional capture by auditory patterns. Philos Trans Roy Soc B Biol Sci 372(1714):20160105, 11 p. https://doi.org/10.1098/rstb.2016.0105

  624. Spielmann MI et al (2013) Using a staircase procedure for the objective measurement of auditory stream integration and segregation thresholds. Front Psychol 4, Article 534, 12 p. https://doi.org/10.3389/fpsyg.2013.00534

  625. Spratling MW (2016) A neural implementation of Bayesian inference based on predictive coding. Connect Sci 28(4):346–383. https://doi.org/10.1080/09540091.2016.1243655

  626. Spratling MW (2017) A review of predictive coding algorithms. Brain Cognit 112:92–97. https://doi.org/10.1016/j.bandc.2015.11.003

  627. Stachurski M, Summers RJ, Roberts B (2015) The verbal transformation effect and the perceptual organization of speech: influence of formant transitions and F0-contour continuity. Hear Res 323:22–31. https://doi.org/10.1016/j.heares.2015.01.007

  628. Stainsby TH et al (2011) Sequential streaming due to manipulation of interaural time differences. J Acoust Soc Am 130(2):904–914. https://doi.org/10.1121/1.3605540

  629. Stecker GC, Hafter ER (2000) An effect of temporal asymmetry on loudness. J Acoust Soc Am 107(6):3358–3368. https://doi.org/10.1121/1.429407

  630. Steele SA, Tranchina D, Rinzel J (2015) An alternating renewal process describes the buildup of perceptual segregation. Front Comput Neurosci 8, Article 166, 13 p. https://doi.org/10.3389/fncom.2014.00166

  631. Stefanics G et al (2007) Auditory temporal grouping in newborn infants. Psychophysiology 44(5):697–702. https://doi.org/10.1111/j.1469-8986.2007.00540.x

  632. Stevens SS, Volkmann J (1940) The relation between pitch and frequency: a revised scale. Am J Psychol 53(3):329–353. https://doi.org/10.2307/1417526

  633. Stevens SS, Volkmann J, Newman EB (1937) A scale for the measurement of the psychological magnitude pitch. J Acoust Soc Am 9(3):185–190. https://doi.org/10.1121/1.1915893

  634. Stroop JR (1935) Studies of interference in serial verbal reactions. J Exp Psychol 18(6):643–662. https://doi.org/10.1037/h0054651

  635. Stumpf C (1898) Konsonanz und Dissonanz. Beiträge zur Akustik und Musikwissenschaft 1, pp 1–108. https://archive.org/details/beitrgezurakust01stumgoog/page/n17

  636. Sussman E (2017) Auditory scene analysis: an attention perspective. J Speech Lang Hear Res 60(10):2989–3000. https://doi.org/10.1044/2017_JSLHR-H-17-0041

  637. Sussman E (2005) Auditory scene analysis: examining the role of nonlinguistic auditory processing in speech perception. In: Divenyi P (ed) Speech separation by humans and machines, Chap 2. Kluwer Academic Publishers, New York, NY, pp 5–12

  638. Sussman E (2005) Integration and segregation in auditory scene analysis. J Acoust Soc Am 117(3):1285–1298. https://doi.org/10.1121/1.1854312

  639. Sussman E, Bregman AS, Lee W-W (2014) Effects of task-switching on neural representations of ambiguous sound input. Neuropsychologia 64:218–229. https://doi.org/10.1016/j.neuropsychologia.2014.09.039

  640. Sussman E et al (2014) The five myths of MMN: redefining how to use MMN in basic and clinical research. Brain Topogr 27(4):553–564. https://doi.org/10.1007/s10548-013-0326-6

  641. Swallowe GM et al (1997) On consonance: pleasantness and interestingness of four component complex tones. Acta Acust Acust 83(5):897–902

  642. Symonds RM et al (2017) Distinguishing neural adaptation and predictive coding hypotheses in auditory change detection. Brain Topogr 30(1):136–148. https://doi.org/10.1007/s10548-016-0529-8

  643. Szabó BT, Denham SL, Winkler I (2016) Computational models of auditory scene analysis: a review. Front Neurosci 10, Article 524, 16 p. https://doi.org/10.3389/fnins.2016.00524

  644. Szalárdy O et al (2014) The effects of rhythm and melody on auditory stream segregation. J Acoust Soc Am 135(3):1392–1405. https://doi.org/10.1121/1.4865196

  645. Takeya R et al (2017) Predictive and tempo-flexible synchronization to a visual metronome in monkeys. Sci Rep 7:6127, 12 p. https://doi.org/10.1038/s41598-017-06417-3

  646. Tal I et al (2017) Neural entrainment to the beat: the ‘Missing Pulse’ phenomenon. J Neurosci 37(26):6331–6341. https://doi.org/10.1523/JNEUROSCI.2500-16.2017

  647. Tan S-L, Pfordresher P, Harré R (2017) Psychology of music: from sound to significance. Psychology Press, Sussex, UK

  648. Tanaka S, Nakajima Y, Sasaki T (1994) On the mechanism of the gap transfer illusion (in Japanese, abstract in English). In: Report of the Acoustical Society of Japan H-94-72, pp 1–6. Cited by Remijn et al. (2007)

  649. Taubman RE (1950) Studies in judged number: I. The judgment of auditory number. J Gen Psychol 43(2):167–194. https://doi.org/10.1080/00221309.1950.9710619

  650. Taubman RE (1950) Studies in judged number: II. The judgment of visual number. J Gen Psychol 43(2):195–219. https://doi.org/10.1080/00221309.1950.9710620

  651. Teki S et al (2011) Distinct neural substrates of duration-based and beat-based auditory timing. J Neurosci 31(10):3805–3812. https://doi.org/10.1523/JNEUROSCI.5561-10.2011

  652. Teki S et al (2016) Neural correlates of auditory figure-ground segregation based on temporal coherence. Cereb Cortex 26(9):3669–3680. https://doi.org/10.1093/cercor/bhw173

  653. Ten Hoopen G, Miyauchi R, Nakajima Y (2008) Time-based illusions in the auditory mode. In: Grondin S (ed) Psychology of time, Chap 5. Emerald Group Publishing Ltd., Bingley, UK, pp 139–187. https://www.researchgate.net/publication/285718257_Time-based_illusions_in_the_auditory_mode

  654. Ten Hoopen G, Vos J (1979) Attention-switching and grouping in counting interaurally presented clicks. Acta Psychol 43(4):283–297. https://doi.org/10.1016/0001-6918(79)90037-4

  655. Ten Hoopen G, Vos J (1979) Effect on numerosity judgment of grouping of tones by auditory channels. Percept Psychophys 26(5):374–380. https://doi.org/10.3758/BF03204162

  656. Ten Hoopen G et al (1993) A new illusion of time perception - II. Music Percept: Interdiscip J 11(1):15–38. https://doi.org/10.2307/40285597

  657. Ten Hoopen G et al (2006) Time-shrinking and categorical temporal ratio perception: evidence for a 1:1 temporal category. Music Percept: Interdiscip J 24(1):1–22. https://doi.org/10.1525/mp.2006.24.1.1

  658. Tenney J (1988) A history of ‘consonance’ and ‘dissonance’. Excelsior Music Publishing Company, New York, NY

  659. Terhardt E (1974) Pitch, consonance, and harmony. J Acoust Soc Am 55(5):1061–1069. https://doi.org/10.1121/1.1914648

  660. Terhardt E (1984) The concept of musical consonance: a link between music and psychoacoustics. Music Percept: Interdiscip J 1(3):276–295. https://doi.org/10.2307/40285261

  661. Terreros G, Delano PH (2015) Corticofugal modulation of peripheral auditory responses. Front Syst Neurosci 9, Article 134, 8 p. https://doi.org/10.3389/fnsys.2015.00134

  662. Theeuwes J (2018) Visual selection: Usually fast and automatic; seldom slow and volitional. J Cognit 1(1):29, 15 p. https://doi.org/10.5334/joc.13

  663. Thomassen S, Bendixen A (2018) Assessing the background decomposition of a complex auditory scene with event-related brain potentials. Hear Res 370:120–129. https://doi.org/10.1016/j.heares.2018.09.008

  664. Thomassen S, Bendixen A (2017) Subjective perceptual organization of a complex auditory scene. J Acoust Soc Am 141(1):265–276. https://doi.org/10.1121/1.4973806

  665. Thompson SK, Carlyon RP, Cusack R (2011) An objective measurement of the build-up of auditory streaming and of its modulation by attention. J Exp Psychol: Hum Percept Perform 37(4):1253–1262. https://doi.org/10.1037/a0021925

  666. Thompson WF et al (2012) The effect of intensity on relative pitch. Q J Exp Psychol 65(10):2054–2072. https://doi.org/10.1080/17470218.2012.678369

  667. Thurlow WR (1957) An auditory figure-ground effect. Am J Psychol 70(4):653–654. https://doi.org/10.2307/1419466

  668. Thurlow WR, Elfner LF (1959) Continuity effects with alternately sounding tones. J Acoust Soc Am 31(10):1337–1339. https://doi.org/10.1121/1.1907631

  669. Thurlow WR, Rawlings IL (1959) Discrimination of number of simultaneously sounding tones. J Acoust Soc Am 31(10):1332–1336. https://doi.org/10.1121/1.1907630

  670. Tierney A, Patel AD, Breen M (2018) Repetition enhances the musicality of speech and tone stimuli to similar degrees. Music Percept: Interdiscip J 35(5):573–578. https://doi.org/10.1525/mp.2018.35.5.573

  671. Todd NPM (1985) A model of expressive timing in tonal music. Music Percept: Interdiscip J 3(1):33–58. https://doi.org/10.2307/40285321

  672. Töpken S, Verhey JL, Weber R (2015) Perceptual space, pleasantness and periodicity of multi-tone sounds. J Acoust Soc Am 138(1):288–298. https://doi.org/10.1121/1.4922783

  673. Tordini F, Bregman AS, Cooperstock JR (2016) Prioritizing foreground selection of natural chirp sounds by tempo and spectral centroid. J Multimodal User Interfaces 10(3):221–234. https://doi.org/10.1007/s12193-016-0223-x

  674. Tordini F et al (2013) Toward an improved model of auditory saliency. In: Proceedings of the international conference on auditory displays (ICAD2013) 6-10 July 2013, Łódź, Poland, pp 189–196. http://hdl.handle.net/1853/51667

  675. Torres HM et al (2021) F0 perturbation due to articulatory movements: filtering, characterization and applications. IEEE/ACM Trans Audio, Speech, Lang Process 29:1977–1986. https://doi.org/10.1109/TASLP.2021.3082671

  676. Tougas Y, Bregman AS (1985) Crossing of auditory streams. J Exp Psychol Hum Percept Perform 11(6):788–798. https://doi.org/10.1037/0096-1523.11.6.788

  677. Trainor LJ et al (2014) Explaining the high voice superiority effect in polyphonic music: evidence from cortical evoked potentials and peripheral auditory models. Hear Res 308:60–70. https://doi.org/10.1016/j.heares.2013.07.014

  678. Traunmüller H, Eriksson A (1993) F0-excursions in speech and their perceptual evaluation as evidenced in liveliness estimations. Phonetic Experimental Research, Institute of Linguistics, University of Stockholm (PERILUS) 17, pp 1–34. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.211.1743&rep=rep1&type=pdf#page=17

  679. Traunmüller H, Eriksson A (1995) The perceptual evaluation of F0 excursions in speech as evidenced in liveliness estimations. J Acoust Soc Am 97(3):1905–1915. https://doi.org/10.1121/1.412942

  680. Trulla LL, Di Stefano N, Giuliani A (2018) Computational approach to musical consonance and dissonance. Front Psychol 9, Article 381, 11 p. https://doi.org/10.3389/fpsyg.2018.00381

  681. Turgeon M, Bregman AS, Ahad PA (2002) Rhythmic masking release: contribution of cues for perceptual organization to the cross-spectral fusion of concurrent narrow-band noises. J Acoust Soc Am 111(4):1819–1831. https://doi.org/10.1121/1.1453450

  682. Turgeon M, Bregman AS, Roberts B (2005) Rhythmic masking release: effects of asynchrony, temporal overlap, harmonic relations, and source separation on cross-spectral grouping. J Exp Psychol: Hum Percept Perform 31(5):939–953. https://doi.org/10.1037/0096-1523.31.5.939

  683. Turk A, Shattuck-Hufnagel S (2013) What is speech rhythm? A commentary on Arvaniti and Rodriquez, Krivokapic, and Goswami and Leong. Lab Phonol 4(1):93–118. https://doi.org/10.1515/lp-2013-0005

  684. Ulanovsky N et al (2004) Multiple time scales of adaptation in auditory cortex neurons. J Neurosci 24(46):10440–10453. https://doi.org/10.1523/JNEUROSCI.1905-04.2004

  685. Ungan P, Yagcioglu S (2014) Significant variations in Weber fraction for changes in inter-onset interval of a click train over the range of intervals between 5 and 300 ms. Front Psychol 5, Article 1453, 9 p. https://doi.org/10.3389/fpsyg.2014.01453

  686. Urban CJ, Gates KM (2021) Deep learning: a primer for psychologists. Psychol Methods 26(6):743–773. https://doi.org/10.1037/met0000374

  687. Van de Geer JP, Levelt W, Plomp R (1962) The connotation of musical consonance. Acta Psychol 20(4):308–319. http://hdl.handle.net/2066/15399

  688. Van Noorden LPAS (1971) Discrimination of time intervals bounded by tones of different frequencies. IPO Ann Prog Rep 6:12–15

  689. Van Noorden LPAS (1977) Minimum differences of level and frequency for perceptual fission of tone sequences ABAB. J Acoust Soc Am 61(4):1041–1045. https://doi.org/10.1121/1.381388

  690. Van Noorden LPAS (1971) Rhythmic fission as a function of tone rate. Institute for Perception Research, pp 9–12

  691. Van Noorden LPAS (1975) Temporal coherence and the perception of temporal position in tone sequences. IPO Ann Prog Rep 10:4–18

  692. Van Noorden LPAS (1975) Temporal coherence in the perception of tone sequences. Technische Hogeschool Eindhoven, Eindhoven

  693. Van Noorden LPAS (1982) Two channel pitch perception. In: Clynes M (ed) Music, mind, and brain: the neuropsychology of music, Chap 13. Plenum Press, London, UK, pp 251–269. https://doi.org/10.1007/978-1-4684-8917-0_13

  694. Van Noorden LPAS, Moelants D (1999) Resonance in the perception of musical pulse. J New Music Res 28(1):43–66. https://doi.org/10.1076/jnmr.28.1.43.3122

  695. Vanden Bosch der Nederlanden CM, Hannon EE, Snyder JS (2015) Finding the music of speech: musical knowledge influences pitch processing in speech. Cognition 143:135–140. https://doi.org/10.1016/j.cognition.2015.06.015

  696. Varlet M, Williams R, Keller PE (2020) Effects of pitch and tempo of auditory rhythms on spontaneous movement entrainment and stabilisation. Psychol Res 84:568–584. https://doi.org/10.1007/s00426-018-1074-8

  697. Vassilakis PN, Kendall RA (2010) Psychoacoustic and cognitive aspects of auditory roughness: definitions, models, and applications. In: Rogowitz BE, Pappas TN (eds) Human vision and electronic imaging XV, vol 7527. SPIE, Bellingham, WA, 7 p. https://doi.org/10.1117/12.845457

  698. Vencovský V, Rund F (2017) Roughness of two simultaneous harmonic complex tones on just-tempered and equal-tempered scales. Music Percept: Interdiscip J 35(2):127–143. https://doi.org/10.1525/mp.2017.35.2.127

  699. Verschuure J, Brocaar MP (1983) Intelligibility of interrupted meaningful and nonsense speech with and without intervening noise. Percept Psychophys 33(3):232–240. https://doi.org/10.3758/BF03202859

  700. Verschuure J (1978) Auditory excitation patterns: the significance of the pulsation threshold method for the measurement of auditory nonlinearity. Rotterdam, pp 1–176. http://hdl.handle.net/1765/25949

  701. Vincent E, Virtanen T, Gannot S (eds) (2018) Audio source separation and speech enhancement. Wiley, Hoboken, NJ

  702. Virtanen T, Plumbley MD, Ellis D (eds) (2018) Computational analysis of sound scenes and events. Springer International Publishing, Cham, Switzerland, pp i–x, 1–422. https://doi.org/10.1007/978-3-319-63450-0

  703. Vitevitch MS (2003) Change deafness: the inability to detect changes between two voices. J Exp Psychol Hum Percept Perform 29(2):333–342. https://doi.org/10.1037/0096-1523.29.2.333

  704. Vitevitch MS, Siew CSQ (2017) Estimating group size from human speech: three’s a conversation, but four’s a crowd. Q J Exp Psychol 70(1):62–74. https://doi.org/10.1080/17470218.2015.1122070

  705. Vliegen J, Moore BC, Oxenham AJ (1999) The role of spectral and periodicity cues in auditory stream segregation, measured using a temporal discrimination task. J Acoust Soc Am 106(2):938–945. https://doi.org/10.1121/1.427140

  706. Vliegen J, Oxenham AJ (1999) Sequential stream segregation in the absence of spectral cues. J Acoust Soc Am 105(1):339–346. https://doi.org/10.1121/1.424503

  707. Von Helmholtz H (1913) Die Lehre von den Tonempfindungen als Physiologische Grundlage für die Theorie der Musik, 6th edn. Druck und Verlag von Friedr. Vieweg & Sohn, Braunschweig

  708. Vuust P, Witek MAG (2014) Rhythmic complexity and predictive coding: a novel approach to modelling rhythm and meter perception in music. Front Psychol 5, Article 1111, 14 p. https://doi.org/10.3389/fpsyg.2014.01111

  709. Vuust P et al (2018) Now you hear it: a predictive coding model for understanding rhythmic incongruity. Ann N Y Acad Sci 1423(1):19–29. https://doi.org/10.1111/nyas.13622

  710. Wacongne C et al (2011) Evidence for a hierarchy of predictions and prediction errors in human cortex. Proc Natl Acad Sci 108(51):20754–20759. https://doi.org/10.1073/pnas.1117807108

  711. Wagemans J et al (2012) A century of Gestalt psychology in visual perception. Psychol Bull 138(6):1172–1217. https://doi.org/10.1037/a0029334

  712. Wagner B, Bowling DL, Hoeschele M (2020) Is consonance attractive to budgerigars? No evidence from a place preference study. Animal Cognit 23(5):973–987. https://doi.org/10.1007/s10071-020-01404-0

  713. Wagner B et al (2019) Octave equivalence perception is not linked to vocal mimicry: budgerigars fail standardized operant tests for octave equivalence. Behaviour 156(5–9):479–504. https://doi.org/10.1163/1568539X-00003538

  714. Wallin JEW (1911) Experimental studies of rhythm and time. II. The preferred length of interval (tempo). Psychol Rev 18(2):202–222. https://doi.org/10.1037/h0071786

  715. Walsh KS et al (2020) Evaluating the neurophysiological evidence for predictive processing as a model of perception. Ann N Y Acad Sci 1464(1), 27 p. https://doi.org/10.1111/nyas.14321

  716. Wang D, Brown GJ (2006) Computational auditory scene analysis: principles, algorithms, and applications. Wiley-IEEE Press, Hoboken, NJ. http://ieeexplore.ieee.org/xpl/bkabstractplus.jsp?bkn=5769523

  717. Ward WD (1954) Subjective musical pitch. J Acoust Soc Am 26(3):369–380. https://doi.org/10.1121/1.1907344

  718. Warren JD et al (2003) Separating pitch chroma and pitch height in the human brain. Proc Natl Acad Sci 100(17):10038–10042. https://doi.org/10.1073/pnas.1730682100

  719. Warren RM (1999) Auditory perception: a new synthesis. Cambridge University Press, Cambridge, UK

  720. Warren RM (1961) Illusory changes of distinct speech upon repetition - the verbal transformation effect. Br J Psychol 52(3):249–258. https://doi.org/10.1111/j.2044-8295.1961.tb00787.x

  721. Warren RM (1970) Perceptual restoration of missing speech sounds. Science 167(3917):392–393

  722. Warren RM (1984) Perceptual restoration of obliterated sounds. Psychol Bull 96(2):371–383. https://doi.org/10.1037/0033-2909.96.2.371

  723. Warren RM, Ackroff JM (1976) Two types of auditory sequence perception. Percept Psychophys 20(5):387–394. https://doi.org/10.3758/BF03199420

  724. Warren RM, Bashford JA (1981) Perception of acoustic iterance: pitch and infrapitch. Percept Psychophys 29(4):395–402. https://doi.org/10.3758/BF03207350

  725. Warren RM, Gregory RL (1958) An auditory analogue of the visual reversible figure. Am J Psychol 71(3):612–613. https://doi.org/10.2307/1420267

  726. Warren RM, Obusek CJ, Ackroff JM (1972) Auditory induction: perceptual synthesis of absent sounds. Science 176(4039):1149–1151. https://doi.org/10.1126/science.176.4039.1149

  727. Warren RM et al (1994) Auditory induction: reciprocal changes in alternating sounds. Percept Psychophys 55(3):313–322. https://doi.org/10.3758/BF03207602

  728. Warren RM et al (1969) Auditory sequence: confusion of patterns other than speech or music. Science 164(3879):586–587. https://doi.org/10.1126/science.164.3879.586

  729. Warren RM et al (1997) Spectral restoration of speech: intelligibility is increased by inserting noise in spectral gaps. Percept Psychophys 59(2):275–283. https://doi.org/10.3758/BF03211895

  730. Watson CS (2005) Some comments on informational masking. Acta Acust Acust 91(3):502–512

  731. Wenhart T, Hwang Y-Y, Altenmüller E (2019) Enhanced auditory disembedding in an interleaved melody recognition test is associated with absolute pitch ability. Sci Rep 9:7838, 14 p. https://doi.org/10.1038/s41598-019-44297-x

  732. Wertheimer M (1923) Untersuchungen zur Lehre von der Gestalt. II. Psychologische Forschung 4(1):301–350

  733. Wessel DL (1979) Timbre space as a musical control structure. Comput Music J 3(2):45–52. https://doi.org/10.2307/3680283

  734. Wever EG (1929) Beats and related phenomena resulting from the simultaneous sounding of two tones: I. Psychol Rev 36(5):402–418. https://doi.org/10.1037/h0072876

  735. Williams SM (1994) Perceptual principles in sound grouping. In: Auditory display: sonification, audification and auditory interfaces. Addison-Wesley Publishing Company, MA, pp 95–125

  736. Wilson M, Cook PF (2016) Rhythmic entrainment: why humans want to, fireflies can’t help it, pet birds try, and sea lions have to be bribed. Psychon Bull Rev 23(6):1647–1659. https://doi.org/10.3758/s13423-016-1013-x

  737. Winkler I, Czigler I (2012) Evidence from auditory and visual event-related potential (ERP) studies of deviance detection (MMN and vMMN) linking predictive coding theories and perceptual object representations. Int J Psychophysiol 83(2):132–143. https://doi.org/10.1016/j.ijpsycho.2011.10.001

  738. Winkler I, Denham SL, Nelken I (2009) Modeling the auditory scene: predictive regularity representations and perceptual objects. Trends Cognit Sci 13(12):532–540. https://doi.org/10.1016/j.tics.2009.09.003

  739. Winkler I et al (2012) Multistability in auditory stream segregation: a predictive coding view. Philos Trans Roy Soc Lond B: Biol Sci 367(1591):1001–1012. https://doi.org/10.1098/rstb.2011.0359

  740. Winkler I et al (2003) Newborn infants can organize the auditory world. Proc Natl Acad Sci 100(20):11812–11815. https://doi.org/10.1073/pnas.2031891100

  741. Winkler I et al (2009) Newborn infants detect the beat in music. Proc Natl Acad Sci 106(7):2468–2471. https://doi.org/10.1073/pnas.0809035106

  742. Winkler I et al (2006) Object representation in the human auditory system. Eur J Neurosci 24(2):625–634. https://doi.org/10.1111/j.1460-9568.2006.04925.x

  743. Witek MAG et al (2014) Syncopation, body-movement and pleasure in groove music. PLoS ONE 9(4):e94446, 12 p. https://doi.org/10.1371/journal.pone.0094446

  744. Wood N, Cowan N (1995) The cocktail party phenomenon revisited: how frequent are attention shifts to one’s name in an irrelevant auditory channel? J Exp Psychol Learn Mem Cogn 21(1):255–260. https://doi.org/10.1037/0278-7393.21.1.255

  745. Woodrow H (1932) The effect of rate of sequence upon the accuracy of synchronization. J Exp Psychol 15(4):357–379. https://doi.org/10.1037/h0071256

  746. Woodruff J, Wang D (2013) Binaural detection, localization, and segregation in reverberant environments based on joint pitch and azimuth cues. IEEE Trans Audio Speech Lang Process 21(4):806–815. https://doi.org/10.1109/TASL.2012.2236316

  747. Woodruff J, Wang D (2012) Binaural localization of multiple sources in reverberant and noisy environments. IEEE Trans Audio Speech Lang Process 20(5):1503–1512. https://doi.org/10.1109/TASL.2012.2183869

  748. Woods KJ, McDermott JH (2018) Schema learning for the cocktail party problem. Proc Natl Acad Sci 115(14):E3313–E3322. https://doi.org/10.1073/pnas.1801614115

  749. Wright AA et al (2000) Music perception and octave generalization in rhesus monkeys. J Exp Psychol Gen 129(3):291–307. https://doi.org/10.1037/0096-3445.129.3.291

  750. Xu F, Spelke ES (2000) Large number discrimination in 6-month-old infants. Cognition 74(1):B1–B11. https://doi.org/10.1016/S0010-0277(99)00066-9

  751. Yang J et al (2020) Tapping ahead of time: its association with timing variability. Psychol Res 84:343–351. https://doi.org/10.1007/s00426-018-1043-2

  752. Yost WA, Pastore MT, Pulling KR (2018) Loudness of an auditory scene composed of multiple talkers. J Acoust Soc Am 144(3):EL236–EL241. https://doi.org/10.1121/1.5055387

  753. Yost WA, Pastore MT, Pulling KR (2019) The relative size of auditory scenes of multiple talkers. J Acoust Soc Am 146(3):EL219–EL224. https://doi.org/10.1121/1.5125007

  754. Yost WA, Pastore MT, Zhou Y (2019) Discrimination of changes in spatial configuration for multiple, simultaneously presented sounds. J Acoust Soc Am 145(4):EL310–EL316. https://doi.org/10.1121/1.5098107

  755. Zalta A, Petkoski S, Morillon B (2020) Natural rhythms of periodic temporal attention. Nat Commun 11(1), Article 1051, 12 p. https://doi.org/10.1038/s41467-020-14888-8

  756. Zarate JM, Ritson CR, Poeppel D (2013) The effect of instrumental timbre on interval discrimination. PLoS ONE 8(9):e75410, 9 p. https://doi.org/10.1371/journal.pone.0075410

  757. Zatorre R (2016) Amazon music. Nature 535(7613):496–497. https://doi.org/10.1038/nature18913

  758. Zatorre RJ, Baum SR (2012) Musical melody and speech intonation: singing a different tune. PLoS Biol 10(7):e1001372, 6 p. https://doi.org/10.1371/journal.pbio.1001372

  759. Zhang H, Wiener S, Holt LL (2022) Adjustment of cue weighting in speech by speakers and listeners: evidence from amplitude and duration modifications of Mandarin Chinese tone. J Acoust Soc Am 151(2):992–1005. https://doi.org/10.1121/10.0009378

  760. Zhao S et al (2019) Rapid ocular responses are modulated by bottom-up-driven auditory salience. J Neurosci 39(39):7703–7714. https://doi.org/10.1523/JNEUROSCI.0776-19.2019

  761. Zhong X, Yost WA (2017) How many images are in an auditory scene? J Acoust Soc Am 141(4):2882–2892. https://doi.org/10.1121/1.4981118

  762. Zhou B et al (2014) Learning deep features for scene recognition using places database. In: Proceedings of the twenty-eighth conference on neural information processing systems (NIPS 2014) Montréal, Canada. 2014, pp 487–495. http://papers.nips.cc/paper/5349-learning-deep-features-for-scene-recognition-usingplaces-database.pdf

  763. Zion Golumbic E et al (2013) Mechanisms underlying selective neuronal tracking of attended speech at a ‘cocktail party’. Neuron 77(5):980–991. https://doi.org/10.1016/j.neuron.2012.12.037

  764. Zuk NJ, Teoh ES, Lalor EC (2020) EEG-based classification of natural sounds reveals specialized responses to speech and music. NeuroImage 210:116558, 11 p. https://doi.org/10.1016/j.neuroimage.2020.116558

Author information

Corresponding author

Correspondence to Dik J. Hermes.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hermes, D.J. (2023). Auditory-Stream Formation. In: The Perceptual Structure of Sound. Current Research in Systematic Musicology, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-031-25566-3_10
