Natural Music Evokes Correlated EEG Responses Reflecting Temporal Structure and Beat

The brain activity of multiple subjects has been shown to synchronize during salient moments of natural stimuli, suggesting that correlation of neural responses indexes a brain state operationally termed ‘engagement’. While past electroencephalography (EEG) studies have considered both auditory and visual stimuli, the extent to which these results generalize to music—a temporally structured stimulus for which the brain has evolved specialized circuitry—is less understood. Here we investigated neural correlation during natural music listening by recording dense-array EEG responses from N = 48 adult listeners as they heard real-world musical works, some of which were temporally disrupted through shuffling of short-term segments (measures), reversal, or randomization of phase spectra. We measured neural correlation across responses (inter-subject correlation) and between responses and stimulus envelope fluctuations (stimulus-response correlation) in the time and frequency domains. Stimuli retaining basic musical features evoked significantly correlated neural responses in all analyses. However, while unedited songs were self-reported as most pleasant, time-domain correlations were highest during measure-shuffled versions. Frequency-domain measures of correlation (coherence) peaked at frequencies related to the musical beat, although the magnitudes of these spectral peaks did not explain the observed temporal correlations. Our findings show that natural music evokes significant inter-subject and stimulus-response correlations, and suggest that the neural correlates of musical engagement may be distinct from those of enjoyment.

[...] being currently involved in musical activities. Reported music listening ranged from 3 to 52.5 hours (mean 15.03 hours) per week.

[...] and 12 participants were assigned to each stimulus, we collected a total of 24 trials for each of the 16 stimuli.

The experiment was programmed using Neurobehavioral Systems Presentation software.

[...] analyses were performed using in-house Matlab code unless otherwise specified.

Preprocessing was performed on a per-recording basis as follows. We first extracted the behavioral ratings delivered by the participant at the end of each trial. Next, the continuous EEG was epoched using the time stamps sent from the audio and associated with corresponding stimulus labels sent from Presentation. As we observed zero-frequency content in the recordings even after EGI filtering, we removed any linear trend and performed [...]

[...] If abrupt changes in envelope were driving ISC, we would expect these correlations to be significantly greater than zero.

We computed SRC between each RC1 time course and the magnitude fluctuations of the corresponding stimulus envelope. As the delay between stimulus events and corresponding evoked responses is unknown, we temporally filtered each envelope feature to maximize its correlation with the already spatially filtered RC1 activations. Temporal filtering was performed separately for each stimulus across the full time course of stimulus and response.

To prepare the data for this procedure, the time-by-trials RC1 EEG matrix $X \in \mathbb{R}^{T \times 24}$ was reshaped to a vector $x_{\text{cat}} \in \mathbb{R}^{24T}$ concatenating the trials in time. Next, the stimulus feature $z \in \mathbb{R}^{T}$ was expanded to a Toeplitz matrix $Z \in \mathbb{R}^{T \times K}$, whose columns comprised successively sample-wise delayed versions of the envelope up to one second, plus an intercept column.

This matrix was then repeated row-wise 24 times, forming $Z_{\text{cat}} \in \mathbb{R}^{24T \times K}$. [...] actual EEG responses on a per-trial basis. We report the mean correlation, and standard error of the mean, across the 24 stimulus-response correlations for each stimulus.
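The lagged-envelope construction described above can be sketched in a few lines of NumPy. This is an illustration only, not the authors' Matlab code: the function names, the `n_lags` parameter, and the zero-padding of delayed columns are hypothetical, and the use of ordinary least squares to estimate the temporal filter is an assumption, since the fitting step is not recoverable from the text here.

```python
import numpy as np

def lagged_design(z, n_lags):
    """Build a Toeplitz-style matrix Z of shape (T, n_lags + 1).

    Column k holds the envelope z delayed by k samples (zero-padded at the
    start); the last column is an intercept. n_lags is a hypothetical
    parameter, e.g. the number of samples spanning one second.
    """
    T = len(z)
    Z = np.zeros((T, n_lags + 1))
    for k in range(n_lags):
        Z[k:, k] = z[:T - k]
    Z[:, -1] = 1.0
    return Z

def src_per_trial(X, z, n_lags):
    """X: (T, n_trials) RC1 time courses; z: (T,) stimulus envelope feature.

    Returns one stimulus-response correlation per trial and the filter h.
    """
    T, n_trials = X.shape
    Z = lagged_design(z, n_lags)
    x_cat = X.reshape(-1, order="F")        # trials concatenated in time
    Z_cat = np.tile(Z, (n_trials, 1))       # Z repeated row-wise per trial
    # ASSUMPTION: the temporal filter is fit by ordinary least squares.
    h, *_ = np.linalg.lstsq(Z_cat, x_cat, rcond=None)
    z_filt = Z @ h                          # temporally filtered envelope
    r = np.array([np.corrcoef(z_filt, X[:, i])[0, 1] for i in range(n_trials)])
    return r, h

def mean_and_sem(r):
    """Mean and standard error of the mean across trials."""
    return r.mean(), r.std(ddof=1) / np.sqrt(len(r))
```

Calling `src_per_trial(X, z, n_lags)` with a `(T, 24)` response matrix yields 24 correlations, which `mean_and_sem` then summarizes in the form reported for each stimulus.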

The inter-subject and stimulus-response analyses described above produced one correlation per trial. In order to assess the relation between the two measures, we correlated [...] stimuli; Phase responses were excluded because those stimuli did not have a steady beat.

We computed inter-subject and stimulus-response magnitude-squared coherence on a per- [...] interest' for that song. We report the mean, and standard error of the mean, of magnitude-squared coherence at the frequency of interest for each stimulus.
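For readers unfamiliar with the measure, magnitude-squared coherence between two signals can be estimated by averaging windowed spectra over segments. The sketch below is a generic Welch-style estimator, not the procedure used in this study: the Hann window, 50% overlap, segment length, and the helper names `msc` and `value_at` are all assumptions made for illustration.

```python
import numpy as np

def msc(x, y, fs, nperseg):
    """Welch-style magnitude-squared coherence between 1-D signals x and y.

    ASSUMPTIONS: Hann window, 50% overlap, no detrending. Returns the
    frequency axis (Hz) and coherence values in [0, 1].
    """
    win = np.hanning(nperseg)
    step = nperseg // 2
    n_bins = nperseg // 2 + 1
    Sxx = np.zeros(n_bins)
    Syy = np.zeros(n_bins)
    Sxy = np.zeros(n_bins, dtype=complex)
    for start in range(0, len(x) - nperseg + 1, step):
        fx = np.fft.rfft(win * x[start:start + nperseg])
        fy = np.fft.rfft(win * y[start:start + nperseg])
        Sxx += np.abs(fx) ** 2              # auto-spectrum of x
        Syy += np.abs(fy) ** 2              # auto-spectrum of y
        Sxy += np.conj(fx) * fy             # cross-spectrum
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.abs(Sxy) ** 2 / (Sxx * Syy)

def value_at(freqs, coh, f0):
    """Coherence at the bin closest to a chosen frequency of interest f0."""
    return coh[np.argmin(np.abs(freqs - f0))]
```

With a response and an envelope sampled at `fs`, `value_at(*msc(x, y, fs, nperseg), f0)` returns the coherence at a chosen frequency `f0`, such as a beat-related frequency for a given song.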

The relationship between inter-subject and stimulus-response coherence was assessed by aggregating RC1 coherence measures across stimuli at each song's frequency of interest. We then correlated the trial-wise inter-subject and stimulus-response measures. We report the correlation coefficient and its statistical significance.

We computed cross power spectral density (CPSD) phase angles using the same DFT and [...]

The significance of each EEG analysis was assessed using the permutation testing approach described by Theiler et al. (1992). Surrogate EEG data were generated by phase-scrambling each trial matrix prior to input to RCA, in fact the same approach used to create the phase-scrambled stimuli. Phases were randomized independently for each trial, and all electrodes in a given trial were assigned the same randomized conjugate-symmetric distribution of phases.
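A minimal sketch of this kind of phase-scrambled surrogate is given below. It is an illustration under stated assumptions rather than the authors' implementation: the function name `phase_scramble` is hypothetical, and the sketch adds the same random phase offsets to every electrode, whereas the text does not say whether the phases were added to or substituted for the original ones. Working with the one-sided spectrum (`rfft`/`irfft`) enforces conjugate symmetry, so the surrogate is real-valued.

```python
import numpy as np

def phase_scramble(trial, rng):
    """Phase-randomized surrogate of one trial.

    trial: (T, n_electrodes) array of EEG samples.
    rng:   a numpy.random.Generator, drawn from once per trial.

    ASSUMPTION: the same random phase offsets are added to all electrodes.
    """
    T = trial.shape[0]
    spec = np.fft.rfft(trial, axis=0)       # one-sided spectrum per electrode
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape[0])
    phases[0] = 0.0                         # keep the DC bin real
    if T % 2 == 0:
        phases[-1] = 0.0                    # keep the Nyquist bin real
    rotated = spec * np.exp(1j * phases)[:, None]
    return np.fft.irfft(rotated, n=T, axis=0)
```

Each electrode's amplitude spectrum is unchanged by the rotation, so power spectra and autocorrelation are preserved, while the timing of events relative to the stimulus is not.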

The resulting data preserved aggregate power spectra and autocorrelation characteristics inherent to EEG, but stimulus-driven temporal characteristics were lost (Sturm et al., 2014). We computed RCA over 1,000 independent instantiations of surrogate data, and the resulting [...]

[...] non-musical stimuli. In addition, as shown in Figure 2B, the RC1 correlation coefficient was well above permutation test significance thresholds for responses to Intact, Measure, [...]

[...] Figure S4). All correlations were weak (|r| ≤ 0.15). Therefore, it is unlikely that the SRC results (Figure 3A) [...]

[...] (inter-subject) as well as between responses and envelope fluctuations (stimulus-response). We observed prominent peaks in the low-frequency coherence spectrum (0-12 Hz). As shown in Figure 4A, peaks occurred at frequencies corresponding to metrically relevant groupings [...]

[...] Inter-subject coherence at each song's frequency of interest was always statistically significant (permutation test p < 0.001, FDR corrected) and varied according to stimulus condition (repeated-measures ANOVA, p < 0.001). Intact and Measure coherence were higher than Reversed coherence (repeated-measures ANOVA, p < 0.001, FDR corrected) but did not differ significantly from one another. (B) Stimulus-response coherence at the same frequencies was also significant (permutation test p < 0.001, FDR corrected) and varied according to stimulus condition (repeated-measures ANOVA, p < 0.001), with higher coherence for Intact and Measure compared to Reversed (repeated-measures ANOVA, p < 0.001, FDR corrected). (C) Correlation of stimulus-response and inter-subject coherence for individual trials was significant at the peak frequency for the respective stimulus (r = 0.77), with stimulus-response coherence explaining 59% of the variance of inter-subject coherence.
[...] to converge at frequencies corresponding to maximal coherence peaks (though this was not [...]

We presented participants with popular yet novel musical works in original form as well as in states of temporal disruption. We found that stimuli retaining basic musical features produced statistically significant ISC and SRC, while Phase excerpts did not (Figures 2 and 3).

These results confirm findings of Abrams et al. (2013), whose fMRI study also involved a phase-scrambled control, in that statistically significant ISC was not elicited by low-level auditory cues alone. Our results also extend their findings, as we find that natural music excerpts need not be in their original, intact form in order to elicit significant ISC. However, we note that among the musical features removed by phase scrambling are amplitude envelope fluctuations characteristic of music (Figure 1A). Therefore, subsequent investigation should aim to disentangle the contributions of amplitude envelope fluctuations and higher-level acoustical and musical features in driving temporally correlated responses.

Our results also indicated that brain responses to synthetic music constructed in a way that produced a consistently high level of surprise (Measure condition) were most correlated in both ISC and SRC contexts. This finding was in contrast to our expectation, which was that Intact stimuli created for public consumption would produce the most correlated brain activity. These findings could not be explained simply by temporal discontinuities in [...]

[...] (B) For each stimulus, ISC was computed separately for RCs 1-3 and then summed. A repeated-measures ANOVA indicated that summed ISC differed significantly according to stimulus condition ($\chi^2(3) = 305.83$, $p < 2.2 \times 10^{-16}$). Follow-up pairwise comparisons showed that Measure stimuli elicited highest summed ISC across RC1-3 ($\chi^2(1) \geq 50.69$, $p_{\text{FDR}} \leq 1.3 \times 10^{-12}$, FDR corrected, 6 comparisons), and Phase stimuli elicited lowest RC1-3 ISC ($\chi^2(1) \geq 129.05$, $p_{\text{FDR}} \leq 4.4 \times 10^{-16}$). In fact, ISC for Phase RCs 2 and 3 was sometimes negative.

Figure S4: ISC is inversely related to envelope dynamics in musical stimuli. We computed both the ISC and the absolute mean of the envelope derivative along 5-sec time windows of all presented stimuli. Results were pooled across songs. For two of the three stimulus conditions retaining musical features, we found a significant inverse relationship between ISC and the amount of fluctuation in the envelope. For phase-scrambled stimuli, a small but significant positive correlation between ISC and envelope fluctuation was observed. Overall, all correlations were weak (|r| ≤ 0.15). These findings indicate that conditional differences in envelope dynamics do not explain those in the ISC (Figure 2C).