The impact of ROI extraction method for MEG connectivity estimation: Practical recommendations for the study of resting state data.

Magnetoencephalography and electroencephalography (M/EEG) seed-based connectivity analysis typically requires regions of interest (ROI)-based extraction of measures. M/EEG ROI-derived source activity can be treated in different ways. For instance, it is possible to average each ROI ’ s time series prior to calculating connectivity measures. Alternatively one can compute connectivity maps for each element of the ROI, prior to dimensionality reduction to obtain a single map. The impact of these different strategies on connectivity estimation is still unclear. Here, we address this question within a large MEG resting state cohort (N = 113) and simulated data. We consider 68 ROIs (Desikan-Kiliany atlas), two measures of connectivity (phase locking value-PLV, and its imaginary counterpart-ciPLV), and three frequency bands (theta 4-8 Hz, alpha 9-12 Hz, beta 15-30 Hz). We consider four extraction methods: (i) mean, or (ii) PCA of the activity within the ROI before computing connectivity, (iii) average, or (iv) maximum connectivity after computing connectivity for each element of the seed. Connectivity outputs from these extraction strategies are then compared with hierarchical clustering, followed by direct contrasts across extraction methods. Finally, the results are validated by using a set of realistic simulations. We show that ROI-based connectivity maps vary remarkably across strategies in both connectivity magnitude and spatial distribution. Dimensionality reduction procedures conducted after computing connectivity are more similar to each-other, while PCA before approach is the most dissimilar to other approaches. Although differences across methods are consistent across frequency bands, they are influenced by the connectivity metric and ROI size. Greater differences were observed for ciPLV than PLV, and in larger ROIs. Realistic simulations confirmed that after aggregation procedures are generally more accurate but have lower specificity (higher rate of false positive connections). Although computationally demanding, after dimensionality reduction strategies should be preferred when higher sensitivity is desired. Given the remarkable differences across aggregation procedures, caution is warranted in comparing results across studies applying different extraction methods.


Introduction
Technological advances in neuroimaging over the past three decades have allowed the study of brain connectivity, which has helped to understand the neural basis of healthy cognition and clinical disorders (Schnitzler & Gross, 2005).Today it is well-established that task-based or resting state functional connectivity patterns are markers of efficienct cognitive processes, while disrupted connectivity patterns may suggest impaired functional brain circuits (Aydin et al., 2020;Baldassarre et al., 2016;Carter et al., 2009;Englot et al., 2015;Hawkins et al., 2020;Pellegrino et al., 2012Pellegrino et al., , 2019;;Pellegrino, Mecarelli, et al., 2018;Schuler et al., 2018Schuler et al., , 2019;;Schuler & Pellegrino, 2021).Among many neuroimaging modalities used to estimate brain connectivity, magnetoencephalography and electroencephalography (M/EEG) are particularly effective because they provide sub-millisecond temporal resolution, direct monitoring of neural activity oscillating across multiple frequencies (Baillet, 2017), acquisitions in a noise-free environment, either while seated, laying down, or walking (Colenbier et al., 2022;Pellegrino et al., 2022;Schuler & Pellegrino, 2021).
While brain functions are spatially distributed, in order to measure connectivity the cerebral cortex is typically divided into parcels, regions of interest (ROI), or seeds, often standardized using atlases (e.g., Desikan et al., 2006;Eickhoff et al., 2018;Schaefer et al., 2018).Typically connectivity is computed considering these regions in a seed-based connectivity fashion.Within this approach measures of functional relationship are usually estimated between one seed and other seeds, or between one seed and the rest of the brain (Betti et al., 2013;Brookes et al., 2011;Siems et al., 2016).When computing seed-based connectivity, the seed can be a single voxel for volumes, or a vertex for analyses restricted to cortical surfaces, (Brookes et al., 2011;Hipp et al., 2012;Siems et al., 2016).More often seeds include larger regions consisting of multiple elements (voxels or vertices), containing rich and spatially varying information (Kriegeskorte & Bandettini, 2007;Meier et al., 2008).As each voxel or vertice has connectivity pattern, aggregation procedures are necessary in order to estimate the connectivity of a given ROI.One of the simplest dimensionality reduction strategies is to consider the average time-course of all elements of the ROI, which is standard procedure in fMRI seed-based connectivity (Basti et al., 2020;Brookes et al., 2011).This is reasonable given the spatial properties of the BOLD signal: large brain regions show some degree of homogeneity in signal and connectivity, allowing thus to extract brain parcellations (Fan et al., 2016;Schaefer et al., 2018;Thirion et al., 2014;Yeo et al., 2011).
In M/EEG the process of extracting seed-based connectivity is more challenging (Basti et al., 2019(Basti et al., , 2020;;Capilla et al., 2022;Farahibozorg et al., 2018;Hillebrand et al., 2012;Tait et al., 2021).M/EEG brain signals are not measured directly, as the sensors are placed on the scalp (EEG) or several centimeters above (MEG).M/EEG activity is reconstructed via source imaging, which requires resolving forward and inverse problems (Baillet, 2017;He et al., 2018;Pellegrino, Hedrich, et al., 2016, 2018, 2020).This means that brain connectivity is typically measured in the computed source space, as this reduces the bias due to volume conduction and signal leakage, providing more accurate topology of brain connectivity (Brunner et al., 2016;Haufe et al., 2013;Hincapié et al., 2017;Lai et al., 2018;Palva et al., 2018;Schaworonkow & Nikulin, 2022;Van de Steen et al., 2019).When the source space is restricted to the cortical surface and the inverse solution is a distributed technique, time-courses are vectors of vertex based magnitude and direction , that take into account cortical folding (Dale & Sereno, 1993;Hedrich et al., 2017).Or simply, time-courses of neighboring vertices have different directions due to the curvature of the cortical folds.Here, time-course averaging within the parcel leads to a certain degree of signal cancellation (Ahlfors et al., 2010;Ahlfors et at., 2010;Chowdhury et al., 2018;Hillebrand et al., 2012;Irimia et al., 2012).There are alternative approaches to extract single connectivity patterns from an extended seed, but the effects of these multiple strategies on connectivity outputs remain to be explored (Colclough et al., 2015(Colclough et al., , 2016;;Hillebrand et al., 2012).For instance, it is possible to consider time-course of the first PCA component rather than the average, as a representative of the ROI's activity (Basti et al., 2020;Bruña & Pereda, 2021;Colclough et al., 2015Colclough et al., , 2016;;Dimitriadis et al., 2018).Additional approaches involve computing connectivity using time-courses of all seed's elements (voxel or vertex), and then extract the average, PCA, or apply multivariate methods (Aydin et al., 2020;Bruña et al., 2023Bruña et al., , 2023;;Colclough et al., 2016;Dimitriadis et al., 2018;Hillebrand et al., 2012;Palva et al., 2010).A 'multivariate' strategy acknowledges that all elements of the source space and their time-courses carry some useful information, while the aggregation approach such as mean or PCA is the easiest computational choice.Beyond these, a number of more complex procedures (identification of the center of mass, time-series with maximal power, weighted average, etc.) have been proposed (Basti et al., 2019(Basti et al., , 2020;;Bruña et al., 2023;Chalas et al., 2022;Garcés et al., 2016;Korhonen et al., 2014;Luckhoo et al., 2012).
The purpose of this study is to systematically compare the most common dimensionality reduction strategies to estimate connectivity patterns.Our analyses were performed using: (i) a large resting state MEG cohort (N=113) with four extraction methods and two connectivity measures, across three frequency bands, by applying one canonical cortical atlas (Desikan et al., 2006); and (ii) realistic simulations in order to compare real data connectivity results with ground truth (simulated data).
In short, this work addresses the following questions: (i) How does the choice of ROI extraction method affect the estimation of resting state functional connectivity?(ii) Are differences between extraction strategies consistent across different connectivity measures (ciPLV, PLV) and frequency bands (theta, alpha, beta)?(iii) Does the reliability of extraction methods vary depending on the size of the ROI? (iv) What are the recommendations for future studies?

Participants and MEG acquisition
The study was approved by the local Ethics Committee, the Research Ethics Board of the Province of Venice (Italy), and complied with the 1964 Declaration of Helsinki and its later amendments.Every participant provided written and informed consent.All participants were healthy (self-reported) adults, with normal or corrected-to-normal vision, had no history of neuropsychiatric disorders, or brain injuries.Demographic details of the sample of participants included in this analysis are illustrated in Table S1 of the supplementary material.Overall, 84 subjects were included (66 Female, age range 21-58, mean 29 years old).Of these, 23 had two or more resting state sessions, in separate instances, that were treated as independent acquisitions.We analyzed 113 five-minute resting state recordings, acquired using a whole head 275-channel CTF system (VSM MedTech Systems Inc., Coquitlam, BC, Canada).Participants sat with their eyes closed in a magnetically shielded room.Eye movements (EOG) and cardiac rhythm (ECG) were recorded with bipolar electrodes.The sampling rate was set to 1200 Hz.Prior to the MEG data acquisition, the position of scalp points and anatomical landmarks (nasion, left and right pre-auricular points) were digitized with a 3D Fastrack Digitizer (Polhemus, Colchester, Vermont, USA).The head position within the dewar was tracked using the Continuous Head Localization system.

MRI data acquisition and analysis
MR images of the head were obtained from each participant, after the MEG session.Specifically, T1-weighted anatomical images were acquired at 1.5 T with the Achieva Philips scanner (Philips Medical Systems Best, The Netherlands) using the following parameters: TR = 8.3 ms, echo time TE = 4.1 ms, flip angle = 8 • , isotropic spatial resolution = 0.87 mm.

MEG preprocessing and source imaging
MEG data was analyzed with the Brainstorm MATLAB-based toolbox (Tadel et al., 2011) (Fig. 1).First, MRIs were imported and processed with CAT12 (Gaser et al., 2022) for tissue segmentation, cortical reconstruction and cortical labelling.Source space was defined as the cortical mesh extracted with CAT12, and downsampled to 4032 vertices.ROIs were the 68 cortical regions of the Desikan-Killiany (DK) surface-based atlas (Desikan et al., 2006).Then, MRI-MEG co-registration was performed by fitting a surface between the T1 head shape and the digitized head and fiducial points, acquired prior to the MEG recording, as described in our previous studies (Pellegrino, Machado, et al., 2016).MEG data was preprocessed using: spatial gradient noise cancellation of third order; band-pass filtering [0.3 -256 Hz] and notch filtering (50, 100, 150 Hz); Signal Space Projection (SSP) to remove cardiac and eye movement artefacts (Taulu & Simola, 2006;Tesche et al., 1995); downsampling to 128 Hz; data segmentation into 2.5-second epochs; inspection and rejection of epochs affected by residual artefacts or head movement.SSP was chosen for data preprocessing as it is more operator independent, computationally efficient and, based on our experience, leads to better results (Pellegrino, Xu, et al., 2020).Similarly, we chose 2.5 the epoch length according to previous applications by Basti et al. (2020), acknowledging that this could be too short according to other authors (Fraschini et al., 2016).Nevertheless, paired comparisons were applied across aggregation procedures, reducing therefore the impact of this parameter on the outcome of the study.The forward model was computed using overlapping spheres (Pellegrino et al., 2018).Noise covariance was estimated from a 3-minute empty room MEG recording acquired prior to each data acquisition.The inverse problem was solved with the weighted minimum norm estimate (wMNE), with dipoles at the source space constrained to be perpendicular to the cortical surface mesh (Brancaccio et al., 2020;Cona et al., The source space corresponded to the cortical surface tasseled into 4000 vertices/nodes.(B) MEG data was acquired with a 275 CTF system.Data underwent standard preprocessing.(C) MRI-MEG co-registration was performed with a surface fitting procedure taking into account fiducial points acquired with a Polhemus system (D) Source imaging was based on an overlapping sphere head model and wMNE inverse solution.This allowed reconstruction of time-series for each of the 4000 nodes of the source space.(E1) The dimensionality reduction procedure was applied to the time-courses of the vertices/nodes belonging to the ROI (cyan box).Two procedures were considered: (i) the average of the time courses and (ii) the PCA of the time-courses explaining the largest variance.Then the resulting timecourse was considered as the seed and pairwise connectivity was estimated with each node/vertex of the source space (purple box).This resulted in a single map representing the connectivity between the ROI and all nodes of the cortex.E2) The dimensionality reduction procedure was applied to the connectivity maps computed for each vertex/node of the ROI and the source space (cyan box).The resulting connectivity maps correspond to the number of nodes/vertices of the ROI (purple box).The final output is a single map obtained as an average or maximum of all the maps previously computed.

Connectivity measures
Connectivity was computed with two measures of phase consistency: the Phase Locking Value (PLV) (Lachaux et al., 1999) and its corrected imaginary counterpart, or corrected imaginary Phase Locking Value (ciPLV), (Bruña & Pereda, 2021).PLV is a measure of phase consistency between two oscillating time series, where higher values represent higher connectivity.More specifically, PLV measures the instantaneous phase difference between two signals based on the assumption that the phase of connected signals are aligned and evolve together (Lachaux et al., 1999;Nolte et al., 2020;Varela et al., 2001).It has been shown how the application of the imaginary part of the complex definition of PLV, ciPLV, is less sensitive than PLV to volume conduction and signal leakage, since it does not consider zero-lag phase alignment (Bruña et al., 2018;Colclough et al., 2016;Palva et al., 2018;Schuler et al., 2022;Tabarelli et al., 2022).The mathematical explanation of differences between PLV and ciPLV are discussed in detail in Nolte et al. (2020).Here PLV and ciPLV were applied in three canonical frequency bands of interest: theta (5-7Hz), alpha (8-12Hz), and beta (15-29Hz) (Bruña et al., 2018;Nolte et al., 2020).PLV and ciPLV connectivities were estimated for each frequency band, and between each ROI and each element of the source space.This resulted in a a connectivity map.Each of these maps was represented by an array of 4032 elements, where each value is the connectivity estimate between the ROI and each element (vertex) of the source space.In order to perform group analyses, individual connectivity maps were projected onto a common template (MRI-ICBM152) (Mazziotta et al., 2001) and smoothed with a full width at half maximum of 5mm (Bernal-Rusiel et al., 2010;Brodoehl et al., 2020;Coalson et al., 2018;Hagler et al., 2006;Worsley et al., 2002).Lastly, we considered coherence as an additional connectivity measure.The results are consistent with ciPLV and PLV outcome, and are described in the supplementary material.

Dimensionality reduction methods or ROI extraction functions
We compared four methods of ROI extraction, illustrated in Fig. 1: (a) Mean before: mean of the signal within ROI.This function averages all time-courses within the ROI before computing connectivity between the resulting average time-course and the timecourse of each element of the source space.This is very simple and computationally efficient approach.As the time-course of adjacent elements of the source space may display a different sign (s) due to the folding of the cortical surface, a flip-sign function is applied before averaging in order to reduce cancellation.(b) PCA before: PCA of the signal within the ROI.This method takes the time course of the first component of the PCA decomposition of all the signals within a ROI, before computing the connectivity between the time course of that component and the time course of all elements of the source space.(c) Mean after: mean of the connectivity maps computed from all elements of a ROI.Here, dimensionality reduction is applied after the computation of connectivity.A connectivity map is computed for each element of the ROI, taking into account the time course of that element and the time-course of all elements of the source space.The final connectivity map is the average of all the maps across the same ROI.(d) Max after: maximum estimate of the connectivity computed for all elements of the ROI.Similar to mean after, only the maximum value is retained for each element of the resulting connectivity map/vector.

Clustering analysis
As outlined before, in this study we considered four dimensionality reduction methods and three connectivity measures (ciPLV, PLV, and coherence described in the supplementary material).To assess similarity across reduction strategies, we conducted a hierarchical clustering analysis.This analysis was performed separately for the three bands of interest: theta (5-7Hz), alpha (8-12Hz), and beta (15-29Hz).Specifically for each MEG dataset we initially computed the 8×8 similarity matrix (R), where each element resulted from Pearson correlation coefficient computed between connectivity patterns of each joint combination of dimensionality reduction method and connectivity measure.Subsequently, we derived a distance matrix as 1 -<R>, where <R> is the similarity matrix averaged across all datasets.Finally, we applied hierarchical clustering to this distance matrix using the linkage function (MATLAB), utilizing the average linkage criterion.We adopted two approaches to this clustering analysis.First, patterns were obtained by concatenating the connectivity maps across all 68 seed regions according to the DK atlas.Second, we considered each ROI-specific connectivity map separately.The entire clustering analysis pipeline is openly accessible on Github, (jrasero, 2022/2023) Fig. 2. Simulation Pipeline (A) For each ROI we randomly drew a point, v1, within the ROI and four other vertices, v2, v3, v4, v5, outside the selected ROI, with the only constraint that the distance between each pair of points was at least 5 cm.Around each point vi, i=1,…,5, we defined a patch Pi by considering vi's neighbouring source-space elements that were less than 2.5 cm apart.We set three possible sizes of the patch, namely (i) all points with a distance from v1 lower than 2.5cm, (ii) 50% of those points by selecting the closest to v1, and (iii) only v1.(B) For each group of patch configurations, we simulated three time-courses, s1(t), s2(t), s3(t) so that s1(t) leads the activity of s2(t), while s3(t) is uncorrelated.s1(t) was assigned to v1, s2(t) to v2 and v4, and s3(t) to v3 and v5.For each patch Pi the activity of the remaining points was defined by randomly perturbing the Fourier transform of the time series of the corresponding centre vi so as to reach a certain amount of intrapatch coherence.(C) for each set of patch activities, we computed the magnetic field at sensors and added simulated additive noise according to the random dipole brain noise model PoMAM.(D) A total of N=204 MEG data points were simulated.Source space signals were reconstructed using a similar procedure as was used for real data.

Pairwise comparison between scout functions
While the clustering analysis provided an estimate of (dis)similarities across different strategies, and how connectivity maps cluster together, it did not quantify differences across options and their spatial distribution.Hence to address the extent to which the use of the four scout functions differed, we performed parametric testing.Specifically, we applied paired t-testing to compare across all possible combinations of scout functions, for each ROI, frequency band, and connectivity measures (PLV and ciPLV).This analysis was carried out as a post-hoc exploration of the magnitude and spatial distribution of differences across the reduction approaches.We found that the differences across procedures were strong and widespread.To make sure that these differences were specific and not caused by a simple offset (i.e. a constant difference), we repeated the same analysis after normalizing all connectivity maps.Here the maximal connectivity value of each map was set to 100% and all other values were expressed as percentages of the maximum.For these analyses we only report some examples, whereas all comparisons are available as GIfTI at www.hsancamillo.it.

Simulation of MEG data with known connectivity structure
Applying different ROI-extraction methods to realistic simulations allowed to depict the properties of each approach in comparison with the 'ground truth'.The MEG forward model was based on individual anatomical data of one of the participants.In line with the real data analysis, source space was confined to the cortical surface, and the lead field matrix was computed by applying the overlapping sphere model.We simulated neural activity by mimicking five interacting patches (for a schematic representation of the implemented pipeline, refer to Fig. 2).For each ROI we randomly drew a point, v1, within the ROI and four other vertices, v2, v3, v4, v5, outside the selected ROI, with the only constraint being that the distance between each pair of points was at least 5 cm.Around each point vi, i=1,…,5, we defined a patch Pi by considering vi's neighboring source-space elements that were less than 2.5 cm apart.Additionally, as for P1, we restricted it to entirely belong to the ROI containing v1 and we set three possible sizes of the patch, namely (i) all points with a distance from v1 lower than 2.5 cm, (ii) 50% of those points by selecting the closest to v1, and (iii) only v1 (Fig. 2, Panel A).For each group of patch configurations, we then simulated three time-courses, s1(t), s2(t), s3(t), t=1, …., T, where T=10000 mimicking about 78 s of neural activity sampled at 128 Hz.To this end and in line with previous work (Haufe & Ewald, 2019;Sommariva et al., 2019;Vallarino et al., 2021), only alpha band was considered for simulation using three signals following a multivariate autoregressive (MVAR) model of order 5, so that s1(t) leads the activity of s2(t), while s3(t) is uncorrelated.We only retained stable MVAR models such that (i) for each of the resulting signals the average power spectrum in the alpha band represented at least 40% of the overall average power spectrum, and (ii) the average coherence in the alpha band between s1(t) and s2(t) was greater than 0.5.Following this, s1(t) was assigned to v1, s2(t) to v2 and v4, and s3(t) to v3 and v5.For each patch Pi the activity of the remaining points was defined by randomly perturbing the Fourier transform of the time series of the corresponding centre vi so as to reach a certain amount of intra-patch coherence (Hincapié et al., 2017).Additionally, a Gaussian window was used to modulate the resulting time-series so that source intensity decreased for increasing distance from vi (Fig. 2, Panel B).Finally, for each set of patch activities, we computed the magnetic field at sensors and added simulated additive noise according to the random dipole brain noise model PoMAM (Calvetti et al., 2019;de Munck et al., 1992) (Fig. 2, Panel C).Hence a total of N=68×3=204 MEG data points were simulated, 68 being the number of ROIs within the DK atlas and 3 being the considered sizes for P1.

Connectivity estimate and evaluation criteria
As with the experimental MEG data, neural activity was estimated from the simulated MEG data by using the wMNE inverse solution, while connectivity was quantified from the source space with PLV and ciPLV.Specifically, for each of the 204 simulated signals and for the two connectivity measures, we estimated cortical connectivity maps by considering as seed the ROI containing v1.The four ROI extraction methods, i.e.Mean before, PCA before, Mean after, and Max after, were used to quantify the connectivity between this ROI and all 4002 other elements of the source space (Fig. 2, Panel D).
To evaluate the results, we exploited the fact that, when P1 only includes v1, a ground truth can be defined by computing the values of PLV and ciPLV between s1 and the activity in all the other points of the source space.Hence for these simulated MEG data, connectivity metrics, and scout function, we computed Pearson correlation coefficient between the estimated cortical connectivity maps and the corresponding ground truth.On the other hand, when P1 includes more than one element, defining a ground truth is not straightforward.For this reason, we also measured the accuracy of the estimated connectivity maps by computing true and false positive rates (TPR and FPR, respectively).To this end, we fixed one hundred thresholds uniformly distributed between 0 and 1 (alpha).Then for each of the simulated MEG data points, for each of the four scout functions and for both connectivity measures, we applied a normalization procedure by rescaling each map with its maximum value.Thereafter, for each map, for every alpha value of the threshold, we defined: where: -P and N are the number of positive and negative, respectively.P counts the source-space points truly connected with v1, i.e. the nodes within P2 and P4 in our simulations, while N is the number of the remaining source-space points without considering the nodes of the patch P1 centered in v1.-TP(alpha) is the number of true positives, i.e. points of the source space truly connected to v1 where the normalized reconstructed connectivity value (either PLV or ciPLV) exceeded the threshold alpha.-FP(alpha) is the number of false positives, which are points not truly correlated with v1 but where the normalized reconstructed connectivity still exceeded alpha.
Finally, results in terms of True/False positive rate were summarized by computing the corresponding Receiver Operating Characteristic (ROC) curve and the associated Area Under the Curve (AUC).

Results
This section describes results that focus on PLV and ciPLV, as connectivity measures.The results for coherence can be found in the supplementary material.

Hierarchical clustering
The results of the hierarchical clustering are summarized in Fig. 3.This analysis showed similar patterns of similarity-dissimilarity across frequency bands (Fig. 3, rows), and the two connectivity measures of interest (PLV and ciPLV).
Similarity map was overall higher and distances overall lower for PLV than ciPLV (Fig. 3, Left and right columns).Across all frequency bands, PLV and ciPLV maps clustered together within each connectivity measure (Fig. 3, middle column).The relative distances across extraction methods also clustered consistently together, for ciPLV and PLV (Fig. 3, middle column).We observed that Max after and Mean after were the two most similar extraction procedures, and as such clustered together under all circumstances, all frequency bands, and for both connectivity measures (Fig. 3, middle column).Max after and Mean after were relatively close to Mean before, and clustered together, whereas PCA before had higher distance, or was the furtherest, from all other approaches (Fig. 3, middle and right columns).
The right column in Fig. 3 illustrates that Mean after and Max after had the lowest distance, followed by Before-Mean after.
The topographical distribution of the distances across extraction methods is highlighted in Fig. 4, left panel.The lowest distances were found in the deep regions especially of the midline, for all frequency bands.The higher distances were found in the convexity, over the fronto-central regions, and parietal regions (Fig. 4).
There was a significant and positive relationship between the size of the region and the average distance across extraction methods (Fig. 4, right), with Pearson correlation coefficient ranging between r>0.7 and p<0.001, across frequency bands and connectivity measures.

Paired comparisons
The cluster analysis provides a pattern of similarities but does not measure the magnitude and spatial distribution of differences across strategies.Therefore, we followed up on clustering analysis by estimating how connectivity changed across paired comparisons between the four strategies under investigation.Specifically, for each connectivity measure and within each frequency band, we compared: i) Mean before vs PCA before; ii) Mean before vs Mean after; iii) Mean before vs Max after; iv) PCA before vs Mean after; v) PCA Before vs Max after; vi) and Mean after vs Max after.For each of these comparisons, we computed tmaps as a representation of normalized differences between each pair of strategies (illustrated in supplementary Fig. S3).ciPLV connectivity maps computed showed differences as high as 40 t-values, with magnitude and direction (positive/negative) consistent across frequency bands (alpha, beta, theta).Similar patterns were observed for PLV.

Correlation between true (simulated) and estimated maps
The results of the correlation between estimated and true (simulated) connectivity when the patch (P1) contains only its center v1 are The most similar extraction procedures were Max after and Mean after, which were closely clustered, followed by Before, which clustered with both Max after and Mean after.PCA before was most distant from all other approaches, as seen in both middle and right columns.The right column shows that Mean after and Max after had the lowest distance, followed by Before-Mean after.Fig. 4. Topographical distribution of average distances by frequency band (rows) and relationship between the size of the parcel and average distances.Left: The lowest distances were found in deep regions, especially of the midline, across connectivity measures and frequency bands.The higher distances were found especially in the convexity, over the fronto-central regions, and parietal regions.Right: There was a significant positive relationship between the size of the region and the average distance across extraction methods.summarized in Fig. 5.For both PLV and ciPLV and for all aggregation procedures the correlation coefficients are rather low (< 0.15, Fig. 5, upper row).This result is expected, as the simulated map does not contain any activity for a large portion of the cortical surface, whereas the estimated map may contain some activity over the entire surface as a result of signal leakage associated with the inverse solution.Similarly PLV did provide lower values for this correlation, being more sensitive to source leakage than ciPLV.Finally, negative values of the correlation coefficient correspond to data where the value of connectivity estimated in the patches that were uncorrelated with v1, namely P3 and P5, was on average higher than the value of connectivity estimated in the correlated patches, (P2 and P4).In this scenario, the aggregation approaches applied after computing connectivity generally perform better.When applying PLV, the correlation between estimated and simulated connectivity is significantly higher (better) for Mean after as compared to Mean before and PCA before (Wilcoxon signed-rank, p<0.05 consistently).When applying ciPLV Max after performs slightly but significantly better than all other approaches (Max after vs PCA before W=1761, p=0.003;Max after vs Mean before W=1897, p=0.0001;Max after vs Mean after W=1690, p=0.02) and Mean after performs slightly better than Mean before (W =1793, p=0.002).
We repeated the correlation analysis by restricting true and estimated connectivity maps to the points pertaining to the four active patches, namely P2-P5 (Fig. 5, bottom row).In this scenario, correlation coefficients are remarkably higher, especially for ciPLV.The aggregation approaches applied after computing connectivity remained slightly better, especially for ciPLV.The correlation coefficient of both ciPLV Mean after and ciPLV Max after, were significantly higher than ciPLV Mean before and ciPLV PCA before (Mean after vs Mean before W=1944, p=3*10^-5; Mean after vs PCA before W=2023, p=2.5*10^-6;Max after vs Mean before W=2043, p=1.3*10^-6;Max after vs PCA before W=2018, p=2.9*10^-6).Note that no significant difference was observed comparing ciPLV Max after and Mean after.Interestingly, in this scenario the correlation coefficient was less stable for PLV, with high levels of variance depending on the aggregation procedure.Here the only significant difference was in the comparison between Mean after which was higher than Max after (W=1835 p=6*10^-4).

Accuracy of estimated maps (true and false positive rates)
The analysis of False (FPR) and True (TPR) Positive rates provides more insight in interpreting results from the clustering analysis performed on the experimental resting-state data.FPR(alpha) and TPR (alpha) are reported in Fig. 6.This analysis is focused on the case of P1 only containing the center v1, which is a scenario analogous to the one described in the previous paragraph .
For both PLV and ciPLV the two before procedures (Mean before and PCA before) show higher specificity, while the two after procedures (Mean after and Max after) show higher sensitivity.Indeed, PCA before and Mean before show a lower false positive with the tradeoff of identifying fewer true connections.In other words, these aggregation procedures seem to be more conservative.This means that the reconstructed connections most likely identify truly connected sources, at the expense of weak connections which may be lost when these aggregation procedures are used.Vice versa, Mean after and Max after show a higher value of both true and false positive rates.This seems to suggest that weak connections may be retrieved at the expense of retaining spurious connectivity.Also note that for both ciPLV and PLV, the FPR and TPR curves are very similar and overlap for most of the alpha range.
Similar results are obtained when varying the size of P1 (supplementary material).Although the differences across aggregation procedures can vary, Max after and Mean after typically show a higher rate of FPR and TPR.
The comparison of the AUC for all conditions under investigation (Fig. 7) shows a significantly higher value for Max after and Mean after for ciPLV when only the centre of P1 is considered (Mean after vs Mean before W=1792, p=0.002;Mean after vs PCA before W=1830, p=7*10^-4; Max after vs Mean before W=1914, p=7.15*10^5;Max after vs PCA before W=1907, p=8.75*10^-5).The distribution of the AUC values is similar for the four ROI-extraction methods when all points of the patch P1 are considered.With respect to the PLV metric, the distribution of AUC values is significantly higher for Mean after regardless of the dimension of the P1 patch (Mean after vs PCA before W=1656, p=0.037 when P1 containing only v1; Mean after vs Max after W=2068, p=5.4*10^-7 when P1 contains half points; Mean after vs Max after W=1957, p=2*10^-5 when P1 contains all voxels).

Discussion
The outcome of this study indicate that the choice of dimensionality reduction method has a significant impact on the resulting connectivity outcomes.These results were observed in both resting state data and realistic simulations.Interestingly, the choice of extraction method led to considerable differences in both the magnitude and distribution of connectivity.More specifically, via cluster analysis, we observed that the aggregation procedures applied after computing connectivity yielded more consistent results.This was observed for both in-vivo and simulated data.
The after approaches are computationally demanding, but allow for the integration of information from all elements of a given parcel, providing a more comprehensive connectivity estimation (Aydin et al., 2020;Bruña et al., 2023Bruña et al., , 2023;;Colclough et al., 2016;Hillebrand et al., 2012;Palva et al., 2010).
The results of the clustering analysis also revealed that the Mean before method produced connectivity maps that were similar to those generated by the after strategies.On the other hand, PCA before method showed results that were markedly dissimilar from all other approaches.While clustering analysis alone may suggest that the Max after, Mean after, Mean before and Mean after methods produce similar results, it is important to note that this approach is based on correlations across maps and does not provide insight into the topographical distances.Therefore to gain a better understanding, we conducted pairwise analysis of connectivity maps generated by different extraction strategies.Our results showed how most similar maps differed by a magnitude of 40+ t-values, indicating that even minor variations in method choice lead to significant differences in connectivity outcomes (supplementary Fig. S3).We also ruled out the possibility that these differences were due to an offset of connectivity values by directly contrasting normalized maps, where each connectivity value was rescaled against the maximum.Here we observed important and widespread differences in connectivity outcomes.Hence, caution is warranted when comparing connectivity patterns between studies that employ different extraction strategies, as even small variations in method can lead to significant discrepancies.
Our analysis also shed light on some of the factors that may influence the effect of extraction method(s) on connectivity computation.Specifically, we observed that similarity across extraction methods was higher for PLV as compared to ciPLV (Fig. 3).Both connectivity measures employed in this study are based on phase synchrony across time series.However, ciPLV is insensitive to Lag=0 connectivity and, as such, is less susceptible to volume conduction and spatial leakage (Bruña et al., 2018;Bruña & Pereda, 2021;Colclough et al., 2016;Lachaux et al., 1999;Nolte et al., 2020;Palva et al., 2018;Schuler et al., 2022;Tabarelli et al., 2022;Varela et al., 2001).The resulting connectivity maps are therefore less smooth, possibly explaining why similarities across extraction methods is lower than for PLV (Fig. 3).On the other hand, our results also highlight that other connectivity measures may exhibit varying levels of robustness to the extraction methods.We observed no Fig.6.False Positive Rates (upper row) and True Positive Rates (lower row) models of the two connectivity measures (PLV -left-and ciPLV -right-) extracted via the four different aggregation procedures.Plots show mean and standard error of the mean across 68 simulated neural activity so that P1 only contains the centre v1.Results concerning different sizes of P1 can be found in the supplementary material.relevant differences across frequency bands, although the spatial distribution of the distances shows a trend towards greater differences for lower frequencies (Figs. 3 and 4).Low frequency activity recorded from the scalp typically originates from large populations of neurons, or from large distribution of high frequency activity (Pellegrino et al., 2017;Tao et al., 2007).It is then possible that differences in aggregation methods are more pronounced when the underlying generators span across widespread brain areas.This idea is supported by the significant positive linear relationship we observed between the size of the parcel (expressed in number of vertices) and the average difference across extraction methods, and frequency bands (Fig. 3).Given the effect of the parcel size, one could opt for cortical parcellations with smaller parcels than the ones used here, and/or of similar size if the objective is to minimize the bias of aggregation procedures across parcels.Fig. 4 provides some insight into the topographical distribution of the differences.Smaller differences are more often observed in deep regions especially of the midline.It has been shown in fact that MEG spatial sampling is not homogeneous across the cortical surface, and the reconstruction of source signal in deep, basal, and midline regions is not always accurate (Ahlfors et al., 2010;Gavaret et al., 2014;Huiskamp et al., 2010).It is plausible that the estimation of cortical connectivity was less refined in those areas, and the differences across methods were smaller.
Alternatively, the distribution of rhythms across the cortical surface with more alpha activity in the posterior quadrant, more theta activity in the temporal lobes, and faster activity in the range of beta band in the central regions may have played a role in determining this topography of the effects.Overall, these findings suggest that the choice of extraction method may have a greater impact on connectivity measures in lower frequency bands and in regions that are more challenging to capture with MEG.
Furthermore, group comparisons were defined based on the aggregation procedure.The magnitude of the differences between aggregation procedures is large enough that effects could be appreciated at single subject level.Although the contribution of individual differences was not systematically assessed in this instance, we have provided open access to individual connectivity maps, for future studies to explore.
While MEG resting-state data allowed us to identify relative differences across extraction methods, it was challenging to determine which method was more accurate.This prompted us to conduct realistic simulations to gain a more comprehensive understanding of the performance of different methods (Grova et al., 2016;Hincapié et al., 2017;Machado et al., 2018;Sommariva et al., 2019).The simulations confirmed that aggregation procedures performed after the computation of connectivity may provide more accurate results.The correlation coefficient between simulated and estimated maps of connectivity was significantly higher for Mean after for PLV, and for Mean after and Max after for ciPLV.To better understand the performance of different methods, we estimated true and false positive connections captured by applying different aggregation methods.This analysis revealed that Mean after and Max after captured a larger number of true connections, but also a larger number of false positives.In other words, Mean after and Max after showed higher sensitivity for connectivity, but lower specificity.
As for the AUC values, we found that there were differences between PLV and ciPLV.For ciPLV, Mean after and Max after performed better (with higher AUC) when only the center of the patch was considered.While for PLV, Mean after had the highest AUC regardless of the parcel size.Different strategies may be more or less appropriate depending on the specific scenario, and it may be difficult to determine which aggregation method performs best in real-world situations.Our results suggest that Mean after and Max after may be preferable, however at the expense of longer computation time and higher rates of false positives.While PCA provided maps that were very different from all other extraction methods in real data, simulated data showed similar behaviour to Mean before.Overall, we believe it is reasonable to choose Mean before when higher specificity and shorter computation time are desired, and Mean after when greater sensitivity is prioritised.
The effects of extraction methods on connectivity analysis have been extensively studied in the fMRI field.Previous research has shown that dimensionality reduction at the ROI level can result in the loss of important information (Basti et al., 2019).As an alternative it has been proposed that multi-dimensional connectivity methods could provide more accurate results [Geerligs et al., (2016); for a recent review see Basti et al., (2020)].The application of dimensionality reduction strategies at the seed level, for a single time-course, is recommended only in the case of homogeneous ROIs.This is rare in M/EEG signal due to cortical folding that often results in opposite directions for reconstructed source space activity.Many advanced multivariate approaches that integrate the computation of connectivity and the extraction of unique patterns of relationship among brain regions have been developed (Basti et al., 2018;Ewald et al., 2012).A recent MEG simulation study compared standard procedures based on the identification of a representative time-course or ROI (average, centroid, largest power, first PCA component, and first Kosambi-Hilbert torsion component) versus the multivariate approach, and concluded that the latter is remarkably better (Bruña & Pereda, 2021).Another EEG-MEG study compared three approaches to extract ROI-based time courses (centroid, first PCA component, average), and two multivariate approaches, namely the average of the pairwise connectivities across all elements of the ROIs (corresponding to the Mean after in our study) and the root-mean-square (RMS) of all pairwise connectivity (Bruña et al., 2023).The results showed that multivariate procedures (Mean after and RMS post) performed better based on the concordance between EEG and MEG connectivity.Given that no ground truth was available in this study, it cannot be ruled out that some of the results are driven by inherent differences between EEG and MEG.Despite the development of new multivariate approaches, the majority of M/EEG researchers continue to rely on default built-in approaches among most commonly used toolboxes for connectivity analyses (i.e., Brainstorm, Fieldtrip, MNE python), (Gramfort et al., 2014;Oostenveld et al., 2011;Tadel et al., 2011).Within these toolboxes, the most used extraction strategy is the ROI average time course.The use of PCA for dimensionality reduction has received conflicting evaluations.Some studies suggest that PCA may perform better than averaging the seed signal in cases where the signal is better captured by some voxels than others (Basti et al., 2020).Other studies suggest that PCA may be biased towards weighting more low frequencies at the expense of high frequencies (Chalas et al., 2022).A recent simulation study demonstrated that PCA appears to be the best technique, particularly when a fixed number of components is chosen across areas.However, this is only true when PCA is coupled with other optimal pieces of the pipeline, including specific inverse solutions and connectivity measures (Pellegrini et al., 2022).Furthermore another MEG study in clinical and control populations revealed that PCA is outperformed by the centroid of the ROI (Dimitriadis et al., 2018).This means that the choice of the aggregation procedure matters in healthy subjects as well as in clinical populations.Lastly, it is important to underline that the application of PCA entails (arbitrary) choices for several parameters.For instance, several rules regarding the number and properties of the components to retain can be applied.Or whether to apply PCA to individual/single segments of data or to all data at once.The investigation of all possible combinations was beyond the scope of this study.Here we used the implementation available in Brainstorm (version year 2022) and warn the reader that, albeit computationally efficient, this approach has its own limitations.

Limitations
There are several limitations to the current study.First, the results might not be generalizable to all possible strategies, which was beyond the scope of this study.The focus of the current study was on a set of common and practical scenarios.As such, the effect of flip sign was not investigated for the procedures of aggregation prior to computing connectivity, even though this is a popular approach to limit cancellation effects (Ahlfors, Han, Belliveau, et al., 2010;Lai et al., 2018).The influence of flip sign on the estimation of connectivity remains unclear and requires further systematic investigation.
In the analysis of both experimental and simulated data, for each ROI we generated a connectivity map of size 1 x N representing connection strength between that ROI and all the points of the source space.This representation was chosen because it allowed a more intuitive visualization and interpretation of the results.It also made possible to investigate the impact of ROI features such as size.Moreover, in the simulations this representation allowed a straightforward definition of the ground truth connectivity pattern without involving aggregation procedures that could have biased the final results.Additional care is required in studies where further reduction strategies are performed.For instance when applying the aggregation procedures at both seed and target levels, to obtain connectivity matrices of size NxN (being N the number of ROIs).
Another limitation is that the study only focused on resting-state data.While many studies have shown that some patterns of connectivity remain consistent between rest and tasks, recent literature suggests that individual variations should be considered (Colenbier et al., 2023).Therefore, the generalizability of the results to task-based data may be limited.
Here we obtained consistent results across different connectivity measures (ciPLV, PLV, and Coherence).We believe that this study is a strong proof of principle showing the relevance of the aggregation procedures in studying resting state connectivity.It remains to be determined the extent to which these results depend on the choice of connectivity measures, that go beyond phase or amplitude based metrics.Some of the most popular estimation methods, such as Temporal Granger, Frequency Granger (Cekic et al., 2018), Transfer Entropy (Vicente et al., 2011), Partial Directed Coherence (Astolfi et al., 2006), etc. are not considered here, and should be investigated in future studies.
Lastly the simulation presented in this work rely on MVAR model, a linear model often used to simulate neural signals with realistic and possibly time-varying spectra (Chella et al., 2019;Haufe et al., 2013;Haufe & Ewald, 2019;Pagnotta & Plomp, 2018;Sommariva et al., 2019;Vallarino et al., 2021;West et al., 2020).Future work will be devoted to extending our results to more sophisticated models, such as spiking neurons or neural masses (Garofalo et al., 2009;Orlandi et al., 2014;Ricci et al., 2021;Ursino et al., 2020;Wang et al., 2014;Wendling et al., D. Brkić et al. 2002, 2009), where excitatory and inhibitory populations produce oscillations in feedback with different synaptic dynamics, and the corresponding connectivity involves delays and glutamatergic/GABAergic dynamics.Ideally future analyses will include intracranial EEG data too.This is planned future direction, in order to provide a more comprehensive evaluation of the different aggregation procedures.

Conclusions
This work highlights the critical importance of the aggregation procedures in determining accurate connectivity estimations, in both real and simulated data.The choice of the extraction method has a great impact on the connectivity output.Differences are higher for ciPLV than PLV, with a similar pattern across frequency bands.Further, larger is the ROI the higher is the difference across connectivity outputs obtained with different strategies.Overall to obtain higher accuracy, or higher sensitivity at the expense of lower specificity, after aggregation procedures (Mean after) are recommended in connectivity analyses, even though they are computationally demanding.Given the significant differences across aggregation methods, it is essential to exercise caution when comparing studies that employ different methods.

Fig. 1 .
Fig. 1.Analysis pipeline.(A) VolumetricT1w MRI was acquired for each participant and processed (segmentation, cortical reconstruction, cortical labeling) with CAT12.The source space corresponded to the cortical surface tasseled into 4000 vertices/nodes.(B) MEG data was acquired with a 275 CTF system.Data underwent standard preprocessing.(C) MRI-MEG co-registration was performed with a surface fitting procedure taking into account fiducial points acquired with a Polhemus system (D) Source imaging was based on an overlapping sphere head model and wMNE inverse solution.This allowed reconstruction of time-series for each of the 4000 nodes of the source space.(E1) The dimensionality reduction procedure was applied to the time-courses of the vertices/nodes belonging to the ROI (cyan box).Two procedures were considered: (i) the average of the time courses and (ii) the PCA of the time-courses explaining the largest variance.Then the resulting timecourse was considered as the seed and pairwise connectivity was estimated with each node/vertex of the source space (purple box).This resulted in a single map representing the connectivity between the ROI and all nodes of the cortex.E2) The dimensionality reduction procedure was applied to the connectivity maps computed for each vertex/node of the ROI and the source space (cyan box).The resulting connectivity maps correspond to the number of nodes/vertices of the ROI (purple box).The final output is a single map obtained as an average or maximum of all the maps previously computed.

Fig. 3 .
Fig. 3. Hierarchical Clustering.The figure shows the clustering results by frequency band (rows).Left: similarity matrices.The color scale indicates the similarity of connectivity maps across connectivity measures (PLV and ciPLV) and extraction methods (Max after, PCA before, Mean after, Before).Middle: Dendrograms with distances across aggregation procedures.Right: Distances across extraction procedures by connectivity measure (PLV and ciPLV).Map similarity was overall higher and distances overall lower for PLV than ciPLV (Left and right columns).PLV and ciPLV maps cluster together within connectivity measures for all frequency bands (middle column).The relative distances across extraction methods also clustered consistently together within connectivity measure (PLV and ciPLV, middle column).The most similar extraction procedures were Max after and Mean after, which were closely clustered, followed by Before, which clustered with both Max after and Mean after.PCA before was most distant from all other approaches, as seen in both middle and right columns.The right column shows that Mean after and Max after had the lowest distance, followed by Before-Mean after.

Fig. 5 .
Fig. 5. Pearson correlation coefficient between true (simulated) and estimated connectivity maps across 68 simulated neural activity sources such that P1 contains only the center v1.Upper row, left.PLV correlation coefficients of maps defined in the entire source-space.Mean after performed better than Mean before and PCA before.Upper row, right ciPLV map correlation coefficients defined for the entire source-space.Max after performed significantly better than all other aggregation approaches.Mean after performed significantly better than Mean before.Lower row, left PLV correlation coefficients of maps containing only P2-P5.The distribution of correlation coefficients was remarkably variable across aggregation procedures.Mean after performed significantly better than Max after.Lower row, right ciPLV correlation coefficients of maps containing only P2-P5.Both Mean after and Max after performed significantly better than Mean before and PCA before.Note, in the violin plots, white dots depict median values.Darker colors correspond to the interquartile range.* p < 0.05, ** p < 0.005, *** p < 0.0005.

Fig. 7 .
Fig. 7. Distribution of the AUC values across the simulated MEG data when the patch P1 contains all the points that are less than 2.5cm apart from v1 (third row), half of such points (second row), only its centre v1 (first row).Left column shows results concerning PLV while right column refers to ciPLV.