Breaking (musical) boundaries by investigating brain dynamics of event segmentation during real-life music-listening

Significance Understanding how we perceive musical boundaries is crucial for unraveling the intricate processes behind our musical experiences, such as our ability to find enjoyment and meaning in music. Imagine listening to music without discernible segments—its continuous stream would lack structure and become overwhelming or rather dull. By analyzing how the brain reacts to musical phrase transitions in musicians and nonmusicians, we fill a critical gap in music perception and cognition. We noted brain activity changes during boundary transitions with discernible modulations based on musicianship. This emphasizes the impact of expertise on refining our neural processing and underscores a fundamental language-like system for temporal segmentation in the brain, with broader implications for auditory scene analysis beyond the domain of music.


FMRI experiment
Thirty-six healthy participants with no history of neurological or psychological disorders participated in the fMRI experiment.The participants were screened for inclusion criteria before admission to the experiment (no ferromagnetic material in their body; no tattoo or recent permanent coloring; no pregnancy or breastfeeding; no chronic pharmacological medication; no claustrophobia).The participant pool was selected to include an equal number of professional musicians (n = 18, age = 28.2±7.8,females = 9) and nonmusicians (n = 18, age = 29.2±10.7,females = 10, left handers = 1).The criteria for musicianship were having more than 5 years of music training, having finished a music degree in a music academy, reporting themselves as musicians, and working professionally as a performer.As for the type of musicians, there were classical (n = 12), jazz (n = 4), and pop (n = 2) musicians.The instruments played were strings (violin = 4; cello = 2; double bass = 1), piano (n = 8), winds (trombone = 1; bassoon = 1), and mixed (n = 1).The musicians' group was homogeneous in terms of the duration of their musical training, onset age of instrument practice, and amount of years of active instrument playing.These details were obtained and crosschecked via questionnaires and Helsinki Inventory for Music and Affect Behavior (HIMAB; Gold et al., 2013).All participants classified as musicians had to meet all four criteria without exception, while nonmusicians were disqualified from meeting any of the four criteria.Both groups were comparable with respect to gender, age distribution, cognitive performance, socioeconomic status, and personality and mood questionnaire.The present dataset was part of a broad project ("Tunteet") investigating different hypotheses related to auditory processing and its dependence on person related factors by means of a multidimensional set of paradigms and tests, involving several experimental sessions, brain and behavioral measures as well as questionnaires.The findings related to the various hypotheses investigated appear in separate papers (cf.2-4).

Perceptual experiment
Thirty-six participants took part in the perceptual experiment (18 nonmusicians [females = 8] and 18 musicians [females = 10]).The mean age of the participants was 27.45 years (SD = 4.54).They were all students or graduates from different faculties of the University of Jyväskylä and of the JAMK University of Applied Sciences.Participants were rewarded with a movie ticket as a token for their participation.Both musicians and nonmusicians met the relevant group criteria outlined previously.Musicians had an average of 14.39 years (SD = 7.49) of musical training.The musical style played by 12 of the musicians was classical music, whereas the other 6 musicians played non-classical musical styles.The main instruments played by participants were piano (5), guitar (4), flute (2), bass guitar, clarinet, saxophone, cello, violin, viola and voice.All the nonmusicians reported having had no musical training, whereas all of the selected musicians considered themselves either as semiprofessional (12) or professional (6 participants) musicians at the time of the data collection.None of the participants reported experience in dance, ballet or sound engineering.Six participants were very familiar with at least one stimulus but nobody reported having performed any of the examples.As a general rule, we referred to a participant as musician when they had reported more than 8 years of musical training and had also considered themselves as semiprofessional musician or professional musician.We discarded, for example, participants who, in a multiple-choice questionnaire, reported to be amateur musicians.In contrast, we considered participants to be nonmusicians if they considered themselves as nonmusicians and if they did not report any musical training.

FMRI scanning and preprocessing
Participants' brain responses were acquired while they listened to each of the musical stimuli in a counterbalanced order.For each participant the stimuli loudness was adjusted to a comfortable but audible level inside the scanner room (around 75 dB).In the scanner, participants' only task was to attentively listen to the music delivered via high-quality MR compatible insert earphones while keeping their eyes open.
Scanning was performed using a 3T MAGNETOM Skyra whole-body scanner (Siemens Healthcare, Erlangen, Germany) and a standard 20-channel head-neck coil, at the Advanced Magnetic Imaging (AMI) Centre (Aalto University, Espoo, Finland).Using a single shot gradient echo planar imaging (EPI) sequence, 33 oblique slices (field of view = 192x192 mm; 64x64 matrix; slice thickness = 4 mm, interslice skip = 0 mm; echo time = 32 ms; flip angle = 75°) were acquired every two seconds, providing whole-brain coverage.T1-weighted structural images (176 slices; field of view = 256x256 mm; matrix = 256×256; slice thickness = 1 mm; interslice skip = 0 mm; pulse sequence = MPRAGE) were also collected for individual coregistration.Functional MRI scans were preprocessed on a Matlab platform using SPM8 (Statistical Parametric Mapping), VBM5 for SPM (Voxel Based Morphometry; Ashburner & Friston, 2000; Wellcome Department of Imaging Neuroscience, London, UK), and customized scripts developed by the present authors.For each participant, low resolution images were realigned on six dimensions using rigid body transformations (translation and rotation corrections did not exceed 2 mm and 2° respectively), segmented into grey matter, white matter, and cerebrospinal fluid, and registered to the corresponding segmented high resolution T1 weighted structural images.These were in turn normalized to the MNI (Montreal Neurological Institute) (6) segmented standard a priori tissue templates using a 12-parameter affine transformation.Spatial smoothing was then applied to the functional images using an 8 mm full width at half maximum Gaussian filter as to enhance the signal-to-noise ratio.Movement related variance components in fMRI time series resulting from residual motion artifacts, assessed by the six parameters of the rigid body transformation in the realignment stage, were regressed out from each voxel time series.Next, spline interpolation was used to detrend the fMRI data, followed by temporal filtering (Gaussian smoothing with kernel width = 4 sec).
Brain responses to the three stimuli were concatenated making a total of ~24 minutes worth of data.The rationale behind this was to combine stimuli representing a wide range of musical genres and styles in order to cancel out effects that the specific kinds of music may have on the phenomenon under investigation.The final time series had 702 samples after the 4 first samples of each of the three runs were removed to avoid artifacts due to magnetization effects.

Perceptual experiment to obtain boundary data
To determine the precise locations of the boundaries in the music (i.e., transition points), a perceptual experiment was carried out using a computer in a sound-attenuated room.This perceptual task was conducted with a separate participant pool (see Perceptual experiment).The rationale for using a different participant pool for fMRI and perceptual experiments is that it allows to minimize familiarity effects with the music, which could affect the listening task during the fMRI scanning (or vice versa), leading to participants reacting differently to cadential closure, repetition, and other features that could contribute to boundary detection.However, to minimize differences, groups were matched in terms of their demographic variables.
Participants were instructed to mark 'instants of significant change' as they listened to the music by pressing the space bar of the computer keyboard in real time via a Max/MSP computer patch (7).During the experiment, the boundary data for each participant and stimulus was collected in a single pass, without any opportunity for them to preview the stimuli prior to segmentation, nor to make any changes to their boundary indications after completing the task.Subsequently, they were free to playback from different parts of the stimulus and make their segmentations more precise by adjusting the position of boundaries or remove them if these were added by mistake.However, participants could not add any new boundaries at this stage.
The interface included a play bar that offered basic visual-spatial cues regarding the beginning, current time position and end of the examples.
The KDE time series for each group was convolved with the canonical double gamma hemodynamic response function (HRF) in order to match the hemodynamic response delay typical of blood oxygen level dependent (BOLD) brain responses, and downsampled to 0.5 Hz to match the sampling rate of the fMRI scanner.The peaks in this time series indicate the temporal locations where salient changes in the music materials occurred, as measured by high between-listeners consensus.We will refer to this variable as the "boundary regressor".

Region of Interest (ROI) selection
To restrict the brain area to be examined, an initial GLM analysis was performed to identify brain regions significantly related to the boundary regressor (alpha = 0.001, cluster wise corrected, FWE = 0.05).The rationale for restricting the brain volume subjected to further analysis is twofold: (1) to reduce the issue of multiple comparisons by decreasing the number of statistical tests and only including informative brain volume, and (2) to improve the separation and anatomical precision of the identified spatial components within the context of the ICA approach (10)(11)(12).Accordingly, the resulting ROI contained a total of 29980 voxels, and was used in the subsequent GLM and ICA analyses, to determine, respectively, both the region-and network-related brain activation driven by the processing of boundaries in the music.

GLM analyses
Following the approach used by Sridharan et al. ( 2007), we analyzed the moments before, during, and after the boundary transitions.That is, we examined regional activation related to the boundary regressor at different lags (0 lag ± 1 lags; each lag represents 2 seconds).This allowed us to investigate how brain activity changed leading up to, during, and following the transitions, and how this activity was temporally related to the onset of the boundary events.
To do this, we performed three GLM analyses on the selected ROI by shifting the boundary regressor in time by one scan (2 seconds) before and after the boundary location to identify regions predicted significantly by it (p < 0.001, cluster wise corrected, FWE = 0.05; see Figure 7 for an overall visualization of the statistical pipeline of the study).
GLM Analyses were performed separately for musicians and nonmusicians using the group-level boundary regressor specific to each group.The significance of the correlation coefficients was assessed by considering the intrinsic serial correlation between adjacent fMRI samples via estimation of the effective degrees of freedom (13).In line with this method, degrees of freedom were computed for each participant by randomly selecting 10,000 voxels iteratively and subsequently averaging the estimates across participants.GLM results were pooled across all participants (Fisher's combined probability test, N = 36, p < 0.001, cluster-wise corrected, FWE = 0.05) to examine the common brain pattern of activation for both musicians and nonmusicians during boundary transitions.Between-group comparisons to determine group differences in regional activation between musicians and nonmusicians were assessed by means of two-sample Wilcoxon signed rank tests (two-tailed, alpha = 0.05).Similar tests were performed to compare BOLD signal changes between the different shifts.

Nonparametric analyses
The normality assumptions may not always be fully met in fMRI data because the distribution of fMRI data is often complex and can be influenced by various factors (i.e., noise, artifacts, and the underlying neural activity being measured).It is important to note that the departure from normality does not necessarily invalidate the use of GLM in fMRI.Many statistical methods are robust to departures from normality, especially for large sample sizes.Additionally, the Central Limit Theorem (CLT) and the use of large sample sizes in fMRI studies can help mitigate the impact of departures from normality at the individual voxel level when pooling results across participants.This means that even if the individual fMRI data within each group may not be normally distributed, the distribution of group means or group-level statistics tends to approach a normal distribution as the sample size increases.
However, the GLM framework does assume a linear association between the predicted and predictor variables.If the relationship between variables is not linear, GLMs may not accurately capture the true association.Non-linear relationships can be assessed using alternative measures, such as rank-based correlation coefficients (e.g., Spearman's rank correlation) that do not assume linearity.Therefore, it was deemed appropriate to explore this a non-parametric technique in addition to GLM analyses.
We conducted an identical approach using Spearman's rank correlation analyses along with permutation tests for significance estimation.Spearman's rank correlation calculates the correlation between the ranks of the variables rather than their raw values, making it robust to non-linear relationships and outliers and providing a more flexible alternative to GLMs.This nonparametric approach provided an alternative method for assessing the association between the boundary regressor and fMRI responses, without making any assumptions about specific distributions or linearity between variables.
To quantify the significance of the final Z-transformed Fisher's combined probability scores, a null distribution of z-scores under the null hypothesis of no fMRI vs regressor correlation was estimated using 'random' versions of the boundary regressor.Because merely randomizing the samples in the time series does not preserve its temporal dependencies, leading to biased estimates, a phasescrambling procedure was employed (14).This involved scrambling the phases of the boundary regressor time series in the frequency domain and subsequently inverse-transforming them to the time domain.The significance of the z-scores was subsequently derived from this distribution.
The primary disparity identified in the spatial maps derived from the GLM and nonparametric methodology pertained only to the statistical significance.Specifically, the nonparametric analyses revealed a higher level of significance compared to the GLM method.For instance, voxels that attained a significance level of 0.001 (one-tailed) in the GLM approach exhibited a significance level of 0.0005 (one-tailed) in the nonparametric analyses.This was expected given the increased statistical power associated with the non-parametric test.
It is noteworthy however that the continuous spatial maps generated by both approaches showed a high correlation (r > 0.96, p < 0.001).These findings indicate a high degree of similarity in the spatial patterns of hemodynamic activity, with consistent spatial relationships observed between the maps.Consequently, both GLM and nonparametric maps exhibited identical clusters but expanded in size in the nonparametric approach.This discrepancy may be attributed to the enhanced sensitivity of the nonparametric approach in detecting subtle activation patterns, as well as its ability to relax the assumption of linearity imposed by the GLM.

Agglomeratice Hierarchical Clustering of GLM maps
Following one reviewer's suggestion, prompted by the absence of significant differences across lags or between groups in the GLM analysis, we delved deeper into the GLM results by employing aglglomerative hierarchical clustering analyses.This technique helped cluster the GLM maps, providing insights into their similarity.We used the GLM continuous z-score spatial maps corresponding to each lag in vectorized form, averaged across lags, and calculated the cosine distance between these vectors, generating a pairwise distance matrix.Subsequently, we applied the Ward linkage method to this matrix to construct a hierarchical clustering tree.The resulting dendrogram was then generated (see Figure S1).This method facilitated the elucidation of spatial relationships among lag-specific z-score maps, aiding in the interpretation of our findings.Subsequently, we conducted an separate identical analysis using group averages per lag (musicians and nonmusician).Notably, both groups exhibited consistent clustering of lag 0 and lag +1 maps, distinct from lag-1, aligning with prior results (see Figure S2).These findings confirm earlier observations while also incorporating musician-nonmusician distinctions.We observed identical results when employing the Euclidean distance metric in conjunction with the average linkage method.

ICA analyses Spatial ICA
Spatial ICA (sICA) (15) was applied to the ROI selection, following an identical pipeline as described in (23).This variety of ICA is commonly performed in fMRI, where it decomposes the fMRI signal into spatially independent components (ICs).The basic assumption in ICA is that each voxel time series (observation) represents a linear combination of a number of unknown underlying source signals, which are assumed to be maximally statistically independent from one another.In the case of sICA, the assumption of independence means that the identified IC spatial maps do not overlap in space (16).Each IC may thus represent a functionally coherent brain network (e.g., visual, auditory, or motor), a physiological process (e.g., breathing) or an artefact (e.g., head motion) (17).
Prior to performing ICA, participants' data were reduced both at the individual and group levels via Principal Component Analysis (PCA).The dimensionally reduced data were next decomposed into ICs by means of ICA: Where  denotes the T-by-V (time by voxel) group dimensionally reduced matrix of fMRI brain responses,  represents an unknown N-by-N mixing matrix, and  is an unknown N-by-V matrix containing the  ICs that are maximally independent.The rows and columns of  and  are the spatially independent images and the spatially independent time courses associated with those images.ICASSO, a robust analysis tool, was used to perform ICA by running the ICA algorithm iteratively and clustering similar IC estimates (18).The ICA algorithm used was FastICA (19), which has yielded consistent results for fMRI data analyses in the literature (20).We used a maximum of 100 different randomly initialized unmixing matrices up to convergence.
One caveat of the ICA approach is that the number of ICs into which the fMRI data is decomposed must be set a priori.This value, also known as ICA model order, represents the latent dimensionality of the data, and can thus have a significant impact on the spatial characteristics of the estimated functional networks and their parcellation into sub-networks (21).However, model order selection, i.e., inferring the appropriate number of ICs is a non-trivial problem still under investigation (22).To tackle this problem, we followed the procedure described in Burunat et al. (2017), which involved performing decomposition assuming ten different model orders.Next, the GICA3 back-reconstruction algorithm was used to reconstruct participant-specific IC spatial maps and associated temporal courses, whereby the mixing matrix  was back-projected to the PCA subject space (24).These participant-specific IC maps along with their respective temporal courses enabled statistical inferences to be drawn from group (musicians vs nonmusicians) comparisons.
The utilization of back projection for subject-level comparisons offers a robust and reliable method for estimating individual spatial maps and time courses.By deriving group-level ICA spatial maps from all subject time courses, we achieve improved parcellation compared to alternative approaches such as individual subject ICA followed by alignment (24).Moreover, the process mirrors that of dual regression, ensuring consistency and reliability in estimating subject-specific components.Despite potential information loss during PCA steps, which typically retains over 99.99% of variability, individual spatial maps and time courses remain effectively independent of each other, conditional on the shared spatial map.This methodology underscores the strength and validity of back projection for subject-level comparisons in our analysis.

GICA3: back-reconstruction of subject-specific IC spatial maps and temporal courses
Once the unmixing matrix has been estimated, subject-specific IC spatial maps and associated temporal courses can be reconstructed, enabling statistical inferences between groups, for which various multi-subject ICA approaches have been proposed.Among the existing methods, recent evidence suggests that GICA3 provides both the most robust results with the most intuitive interpretation derived from its mathematical properties, such as the aggregate IC spatial map being the sum of the back-reconstructed subject-specific spatial maps (24).
We applied GICA3 on the ICA results to obtain subject-level IC spatial maps through backreconstruction.This allowed us to conduct a one-sample Wilcoxon signed rank test (p < 0.001) across the subject-level back-reconstructed spatial maps to assess the significance of the spatial maps associated with the time-shifted boundary regressor.Similarly, to test for group differences, non-parametric group comparisons (two-sample Wilcoxon signed rank tests, two-tailed, alpha = 0.05) were performed between groups.

Identification of IC temporal courses associated with boundary processing
In order to obtain the IC spatial maps associated with boundary processing, we determined the IC component that was best predicted by the time-shifted regressor for each phase of the transition (before, during, and after the boundary) across all model orders.We used Spearman's rank correlation coefficient as a suitable non-parametric measure of statistical dependence.This choice was made because the potential relationship between the time-shifted boundary regressor and the IC-associated temporal courses was not necessarily expected to be linear.
The significance of the correlation coefficients had to be estimated due to the intrinsic serial correlation between adjacent fMRI samples, which reduces the effective degrees of freedom in the data.These were estimated by computing the cross correlation between IC time course and time-shifted boundary regressor (13).The estimate of effective degrees of freedom was averaged across model orders and subsequently used to compute the significance of the correlations by dividing the Fisher Z transformed correlation coefficients ( 27) by the standard error 1/√( − 3), where  represents the effective degrees of freedom.This approach has been applied in prior studies (refer to 25 and 26).
For each model order, the significance threshold was adjusted using the false discovery rate (FDR) criterion (q < 0.05).underwent correction for multiple comparisons using the false discovery rate (FDR) criterion (q < 0.05).The most significant IC (p < 0.001) was determined for each of the lags (before, during, and after the boundary transitions) across all model orders.
Once the IC time courses best predicted by the boundary regressor were identified (based on their highest correlation coefficient across all model orders), their associated IC spatial maps were identified.Subsequently, to assess the significance of these spatial maps, a statistical test was employed to generate maps indicating connectivity significantly different from zero.Specifically, a one-sample Wilcoxon signed rank test (alpha = 0.001) was employed for this purpose.This nonparametric test allowed for the comparison of spatial distributions of brain activity against a null hypothesis of zero activity for each lag (pre-, during, and post-boundary transitions).These IC maps defined relevant networks at the whole participant pool level.

Cluster correction for multiple comparisons
To account for multiple comparisons, a non-parametric cluster-wise correction approach was used (FWE = 0.05), whereby participant-level IC spatial maps were bootstrap resampled with replacement from the pool of back-projected IC maps within a given model order (i.e., for model order N, 36 IC spatial maps were randomly drawn from a total of 36 * N IC maps).The sample was then t-tested and thresholded (one-sample Wilcoxon signed rank test; p < 0.001).By running 5000 iterations, an empirical distribution of cluster sizes was generated per model order.Bootstrap resampling within a given model order ensures not only that the spatial maps are uncorrelated, but also that the spatial autocorrelation structure is consistent among them.

Granger causality analyses
We aimed to test the hypothesis that the two ICA networks involved in event boundary processing have directional or causal influence.Specifically, we predicted that the Early Auditory Network, active before the boundary transition, drove the engagement of the Boundary Transition Network, responsible for processing the boundary transition onwards.
GCA is used to measure the predictive value of one time series in forecasting the values of another.
If incorporating past values of variable x improves the prediction of variable y's current value, we say that x Granger causes y.This technique relies on the concept of temporal precedence to identify the direction of causality based on the information present in the data.
The GCA was performed separately for each subject using back-reconstructed subject-level ICA temporal courses associated with both networks.

Model order estimation
In the context of GC analysis, model order estimation refers to the process of selecting the appropriate order (or lag length) of the autoregressive (AR) model used to estimate GC between two or more time series variables.The AR model is a common tool used to model time series data, and it assumes that each observation in a time series is a linear combination of its past observations.The order of the AR model determines how many past observations are included in the model.
Determining the optimal model order for Granger causality analysis in fMRI data depends on the specific characteristics of the data and the research question of interest.In general, the optimal model order for fMRI data may vary depending on the specific dataset and analysis approach.
We determined the optimal model order for the Granger causality (GC) analysis using both the Akaike information criterion (AIC) and the Bayesian Information Criterion (BIC).These methods balance model fit and complexity to select the best trade-off between the two.Both the AIC and BIC yielded a model order of TR = 3 (6 seconds) as the optimal model order for GCA.This model order struck the right balance by effectively capturing the dynamic interactions among variables while avoiding overfitting.Therefore, Granger causality took into account a lag of 6 seconds into the past.

Difference of influence term
Spurious causal influences can arise due to the low temporal resolution and hemodynamic blurring in the fMRI signal, where an unidirectional influence can turn into a bidirectional interaction due to the loss of dynamic stochastic information in both channels of the system (29).To overcome this issue, a difference of influence term ( !→# −  #→! ) was introduced into the GCA.This term allows to compare the forward influence of the first network on the second network and the reverse influence of the second network on the first.If the difference between these influences is statistically significant, it suggests a unidirectional causal influence from one network to the other.By considering this difference, redundant bidirectional connections in the network can be eliminated, enhancing the specificity of the inferred causal relationships.
The statistical significance of the difference was evaluated by generating a null distribution of difference of influence terms  !→# −  #→! = 0 on simulated data (5000 surrogate time series) with comparable temporal properties (by means of using Fourier resampling).
If the difference between the two influences is significant, it suggests a unidirectional causal influence less likely to be spurious than a bidirectional influence.Note that positive values of the influence difference term suggest a directional influence from X to Y, while negative values indicate influence in the reverse direction.for both musician and non-musician groups using an identical approach as in Figure S1.Notably, consistent clustering patterns emerge for lag 0 and lag +1 maps across both groups, distinct from lag -1, aligning with previous findings (refer to Figure S1; M: musicians; N: nonmusicians).

Table S1. Table related to Figure 2 (GLM results)
Activations and deactivations at 0 lag are displayed at p < 0.001, cluster-wise corrected, FWE = 0.05.clusters were obtained via the 18-connectivity scheme employed in SPM.Table reports within-cluster region size (k; i.e., number of voxels), peak Z-statistic value per region within the cluster with its respective MNI coordinates, and Brodmann area (BA).Voxels identified as white matter or voxels encroaching very small regions within the cluster (k< 5 voxels) were discarded from the resulting table.

Table S2 A. Table related to Figure 3 (ICA results; Early Auditory Network)
IC spatial map corresponding to the Early Auditory Network (one-sample Wilcoxon signed rank test, p < 0.001, cluster-wise corrected, FWE = 0.05).Table reports within-cluster region size (k; i.e., number of voxels), peak Z-statistic value per region within the cluster with its respective MNI coordinates, and Brodmann area (BA).Voxels identified as white matter or voxels encroaching very small regions within the cluster (k< 5 voxels) were discarded from the resulting table.

Table S2 B. Table related to Figure 3 (ICA results; Boundary Transition Network)
IC spatial map corresponding to the Boundary Transition Network (one-sample Wilcoxon signed rank test, p < 0.001, cluster-wise corrected, FWE = 0.05).Table reports within-cluster region size (k; i.e., number of voxels), peak Z-statistic value per region within the cluster with its respective MNI coordinates, and Brodmann area (BA).Voxels identified as white matter or voxels encroaching very small regions within the cluster (k< 5 voxels) were discarded from the resulting table.

Table S3 A. Table related to Figure 5 (group comparison; Early Auditory Network)
Group comparison map resulting from the Early Auditory Network (two-sample Wilcoxon signed rank tests, p < 0.05, cluster-wise corrected, FWE = 0.05).Table report within-cluster region size (k; i.e., number of voxels), peak Z-statistic value per region within the cluster with its respective MNI coordinates, and Brodmann area (BA).Voxels identified as white matter or voxels encroaching very small regions within the cluster (k< 5 voxels) were discarded from the resulting table.

Table S3 A. Table related to Figure 5 (group comparison; Boundary Transition Network)
Group comparison map resulting from the Boundary Transition Network (two-sample Wilcoxon signed rank tests, p < 0.05, cluster-wise corrected, FWE = 0.05).Table report within-cluster region size (k; i.e., number of voxels), peak Z-statistic value per region within the cluster with its respective MNI coordinates, and Brodmann area (BA).Voxels identified as white matter or voxels encroaching very small regions within the cluster (k< 5 voxels) were discarded from the resulting table.

Legends for Movie S1 (separate file). Dynamic brain changes during boundary transitions.
Movie S1 exhibits the engagement of the Early Auditory Network and Boundary Transition Network, highlighting the dynamic shifts in functional connectivity and adaptive responses of the two networks during the processing of musical boundaries (t = -1, 0, and +1, representing moments before, during, and after the boundary transition (t = TR = 2s).The smooth transition between these two distinct functional networks was achieved through the utilization of 3D linear interpolation for visualizing purposes.Three musical excerpts representing boundary transition peaks were selected per musical stimuli to highlight the dynamic shifts in functional brain networks.
Three musical pieces were used in the experiment: (a) Stream of Consciousness by Dream Theater; (b) Adios Nonino by Astor Piazzolla; and (c) Rite of Spring (comprising the first three