Hyperedge bundling: Data, source code, and precautions to modeling-accuracy bias to synchrony estimates

It has not been well documented that MEG/EEG functional connectivity graphs estimated with zero-lag-free interaction metrics are severely confounded by a multitude of spurious interactions (SI), i.e., the false-positive “ghosts” of true interactions [1], [2]. These SI are caused by the multivariate linear mixing between sources, and thus they pose a severe challenge to the validity of connectivity analysis. Due to the complex nature of signal mixing and the SI problem, there is a need to intuitively demonstrate how the SI are discovered and how they can be attenuated using a novel approach that we termed hyperedge bundling. Here we provide a dataset with software with which the readers can perform simulations in order to better understand the theory and the solution to SI. We include the supplementary material of [1] that is not directly relevant to the hyperedge bundling per se but reflects important properties of the MEG source model and the functional connectivity graphs. For example, the gyri of dorsal-lateral cortices are the most accurately modeled areas; the sulci of inferior temporal, frontal and the insula have the least modeling accuracy. Importantly, we found the interaction estimates are heavily biased by the modeling accuracy between regions, which means the estimates cannot be straightforwardly interpreted as the coupling between brain regions. This raise a red flag that the conventional method of thresholding graphs by estimate values is rather suboptimal: because the measured topology of the graph reflects the geometric property of source-model instead of the cortical interactions under investigation.

Source time-series were generated to reflect the functional connectivity of simulated ground-truth graphs. These ground-truth time-series were next forward-and inverse-modeled to represent a virtual MEG measurement thus introducing linear mixing between sources. Next, functional connectivity was estimated from these source-modeled time-series using a zero-lag-free interaction metric. Finally, hyperedge bundling were performed and hyperedge sensitivity, specificity and separatablity were compared against that of raw metric edges.

Data source location
Helsinki, Finland Data accessibility The sample dataset, software executable and source code is publically available from below link to repository: https://figshare.com/projects/Hyperedge_Bundling/26503 These are original DATA that have not been published elsewhere

Value of data
Allows the readers to replicate how one true interaction is "ghosted" into dozens of false-positive spurious interactions.
Provides a platform to intuitively demonstrate how bundling of observed (spurious & true) edges into hyperedges decreases the false positive rate.
Illustrates how mixing properties of a MEG/EEG source space are computed using simulations.
Exposes an important fact that the source modeling accuracy biases functional connectivity estimates.

Data
The dataset includes the MEG forward, inverse and sparse-source-to-parcel-collapsing operators for 12 subject that were experimentally derived from the visual working memory experiment described below. We also provide 10 5 randomly generated ground-truth graph templates for simulation purpose. Other data and group level graphs (described in the methods) can be created from aforementioned data with the software. The software can perform hyperedge bundling using connectivity graphs generated in this pipeline or using external data imported with compatible format (*. csv files).

Toy model: A truncated Gaussian mixing function
For the toy model, we used a mixing function to simulate the mixing between sources. This mixing function assesses cross-talk by describing how activity at one source location is affected by leakage from other sources. Here, we used a 2D grid system of n×n point-sources to demonstrate how spurious interactions (SIs) arise when signals from interacting sources mix with their surrounding uncorrelated sources. The mixing function f mix (v i ,v j ) between sources v i and v j decreases with increasing spatial distance d ij between them, so that: where f 0,dg 2 (d ij ) is a truncated normal distribution (see Fig. 1A of [1]) with μ¼0 and σ¼d g normalized so that the maximum mixing is a point-source's mixing with itself: max[f 0,dg Here, d g is the unit distance between two neighbouring sources on the same row or column and d ij is the spatial distance between v i and v j . Thus, the mixing function is identical for each point-source on the grid. Because the grid contains n 2 sources (n rows×n columns), the mixing function f mix (v i , v j ) is a n 2 ×n 2 matrix.

Simulation of a single interaction
To demonstrate how multiple SIs can arise from a single true interaction in reconstructed source space, we simulated one phase-lagged interaction (between V 1 and V 2 , Fig. 1A) on a grid of 13×13 ¼169 sources. On the grid, each source has unit distance from its immediate neighbours. The time-series of all sources, except for V 2 , consisted of 1000 samples of Morlet-filtered white noise (f c ¼10 Hz, Morlet m ¼5, sampling frequency f s ¼100 Hz). The time series of V 2 was simulated to have a correlation coefficient of 0.9 with that of V 1 with a 3 samples delay. The distance between V 1 and V 2 on the grid was set to 8.5d g , to ensure there would be no mixing between them. We then modeled linear signal mixing so that the signal measured at any vertex v was obtained as: where X v (t) is the original time series of a source v, the mixing function f mix measures the signal mixing between two sources and is defined in Eq. (2) of [1], X vi (t) is the original time series of v's neighbor v i , and M is the number of sources that have non-zero f mix (v,v i ). We used these linearmixing-contaminated time series to compute phase interactions with iPLV between all pairs of pointsources. For illustration, we retrieved the strongest 0.1% of iPLV edges Fig. 1B of [1].

Multiple true interactions with various distances in mixing
To illustrate concurrent bundling of multiple hyperedges, we simulated six interacting sources (true edges) on a grid system of 20×20¼ 400 sources ( Fig. 1 of [1]). Using these six edges' spatial adjacency on the grid, we paired them as "kin", "nearby", and "far". Note that the "far" pair, edge E(V 9 , V 10 ) and E(V 11 ,V 12 ), were not only far from each other but also far from all other edges. After applying the mixing function to the simulated time-series and computing the pairwise iPLV, we thresholded the resulting 400×400 source interaction matrix at 99.7th percentile, obtaining 239 significant edges that are visualized in Fig. 1D of [1].

Simulation on cortical source-space
We simulated the parcel time-series X of 1,000 randomly chosen ground-truth graphs on a 400parcel cortical source-space obtained by iterative splitting [3] of the Destrieux atlas [4][5][6]. Virtual MEG measurements and source reconstruction were performed by forward-and inverse-modeling X into the reconstructed time seriesX . The forward-and inverse-operators used here were derived from the visual working memory experiment described below. Finally, all-to-all FC ofX was estimated with the imaginary part of the complex phase-locking value (iPLV, Wang et al., in press).
Each ground-truth graph contained 200 randomly connected nodes so that each node was connected only to a single other node. The narrow band time series X(t, f 0 )∈ℝ m×n (m ¼1000 independent samples, n ¼400 parcels) of each ground-truth graph was simulated as follows: 1) 400 parcels were first randomly divided into two groups V 1 and V 2 as source and target nodes.
2) The nodes from V 1 and V 2 were next randomly paired to create 200 edges E, thus for each edge 3) The time series of 200 source nodes in V 1 were obtained by convoluting uncorrelated white noise time series with Morlet wavelets wðt; f 0 Þ, f 0 ¼10 Hz, Morlet m ¼5, f s ¼100 Hz. 4) For each edge e k (v 1 (i) ,v 2 (j) ), the time series of the target node v 2 (j) was simulated to have a correlation coefficient of 0.9 with the time series of source node v 1 (j) delayed by 3 samples.
5) X(t, f 0 ) were finally decimated into 1000 independent samples before forward and inverse-modeling (see Theory, Wang et al., in press).
To identify significant iPLV edges in the ground-truth graphs, we also estimated null hypothesis graphs G rawH0 by simulating uncorrelated time series X H0 ∈ℝ m×n (m ¼1000, n ¼400 parcels) and estimating their iPLV as done for the ground-truth graphs. A range of five thresholds was obtained from G rawH0 that corresponded to significance levels ranging from relaxed to strict, i.e., T iPLV ¼ −log(pvalue)∈{1.3, 2, 3, 4, 5} (equivalent to p ofrom 0.05 to o0.00001).

The unweighted pair group method with arithmetic mean clustering
Here we illustrate the partitioning of the similarity matrix S E with the unweighted pair group method with arithmetic mean (UPGMA). The UPGMA algorithm is an agglomerative hierarchical clustering method that builds a hierarchical tree through an iterative procedure to reflect the distance between pairs of objects in an adjacency matrix A [7]. In each iteration, two objects p and q with nearest distance d(p, q) were linked into a cluster. Here, p and q can be either an element from A or a cluster of elements from A, and the distance between p and q is defined as where n p and n q are the number of elements in p and q respectively, d(x, y) is the distance between x, y, and x, y are elements from A. Suppose that we have a matrix A 1 derived from the edge-to-edge adjacency matrix (Fig. 1A), where each element is the distance between edges (a~e), and we intend to build a hierarchical tree to reflect the tied structure of similarity in signal mixing between all these edges.
First, we find the closest two edges in A 1 , i.e., d(a, b) ¼ 0.17, and combine them to form cluster u. Next, we compute the distance between u and the rest of edges c, d and e using Eq. (S3), and nearest objects in the updated matrix (A 2 ) is d(u, e) ¼0.22. Hence we cluster u and e to v. By repeating this, eventually all the objects will be connected into a hierarchical tree (Fig. 1B), in which the height δ of a cluster on the tree is the distance between the two members of that cluster, e.g., δ(v), the height of v, the distance of u and e, is 0.22.
The information about tree structure is used to partition the data into well separated clusters. If a cluster's height is close to its members' height, then the members of the joined cluster at this level of hierarchy are very similar or "inseparable". Otherwise, the members should be separated into two distinct clusters. This can be quantified using the inconsistency coefficient where δ is the height of a cluster, μ δ and σ δ are the mean and standard deviation of the clusters height included at a given search depth θ below (and including) this cluster. For example, in cluster h (Fig. 1B), with θ of 2, The κ θ of bottom tie clusters is zero. After the tree is built, the inconsistency coefficient κ θ can be computed for all clusters, and the cutoff limit (CL) defines the criterion by which the tree is partitioned into clusters, e.g., CL¼0.15 means those clusters whose κ θ exceed the 85 th percentile will be partitioned into independent clusters.
In the example of the random graph in Fig. 6C, when performing bundling with a CL of 0.15 and a depth θ of 2, those clusters whose κ 2 is greater than 1.04 will be partitioned, e.g., the cluster indicated by the black box (right panel, Fig. 1C). On the other hand, those clusters (and all of its branches) whose κ 2 is less than 1.04 will remain as a distinct cluster, e.g., the cluster indicated by the red triangle. For more details about the UPGMA algorithm, please visit Matlab® Statistics and Machine Learning Toolbox™ (http://se.mathworks.com/discovery/cluster-analysis.html).

Workflow for testing the stability of hyperedge
To ensure that the hyperedges are not random outcomes of partitioning the edge similarity matrix, we tested, at any given resolution, if the differences between the partitioning solutions of n randomly perturbed versions of a edge similarity matrix is statistically smaller than their surrogate counterparts. The workflow is described below (Fig 2): 2.5. Using the software and sample data to replicate the main results First, download the sample dataset and the software package from the public repository given above and unzip the package to local disk. The package includes following contents, and the Labview code runs on the MS Windows operating system and requires Labview 2015 or newer and Matlab R2013 or newer.
The Labview GUI provides functions for "simulations, virtual MEG measurement" and "FC measurement and hyperedge bundling" (workflow, Fig. 3) to replicate the main results of hyperedge bundling paper. The GUI provides detailed instructions at each step. We offer both source code and complied executable files of the GUI. Source code: Hyperedge(LabviewSource).zip. Labview executable: Hyperedge_GUI_EXE.zip and Hyperedge_GUI_Setup.zip. Note: if you do not have Labview (or Labview runtime engine) installed on your PC, you will need to use the Setup program. Detailed instructions for simulations and tests can be found on the GUI.
The sample data: Hyperedge_sample_data.zip (for Labview GUI only), includes 10,000 truth graph templates, individual subject forward, inverse and fidelity optimized sparse collapsing operator [8]. These data were derived from the working memory experiment introduced below.
Python source code: https://github.com/palvalab/hyperedges. This includes the hyperedge bundling core functions as well as a Python notebook demonstration. The software was written in Python from the Python Software Foundation (http://www.python.org/), version 2.7.

Visual working memory (VWM) experiment
We illustrated the application of hyperedge bundling to real MEG data using iPLV raw graphs obtained from a VWM experiment. In the VWM task, subjects memorized one feature in the Sample stimulus and reported whether the memorized features were the same in the Test stimulus presented 2 seconds later. Data preprocessing and group statistics were carried out in the same manner as in our earlier studies and are described in these studies and their supplementary material [3,9,10]. The raw graphs contained the connections with significantly stronger phase correlations during the VWM retention period than during pre-stimulus baseline.

Experimental paradigm
To demonstrate the hyperedge bundling approach can be applied to real MEG/EEG data, we estimated raw FC graphs from a visual working memory (VWM) MEG experiment. The paradigm of the VWM experiment is descrbied below (Fig. 4): In each trial, subjects were first shown a sample stimulus (S1) that contained 2 or 4 objects (Load2 and Load4) in either left or right visual field. The objects were 3-to 6-sided polygons with distinct colors. After a 2.05 s retention period (black screen with a center fixation), a test stimulus of one object (S2) was displayed in the same visual field where S1 was shown. Subjects made a forced-choice response to indicate whether the shape of S2 matched any one of the objects in S1. The response was a left or right thumb twitch, to indicate whether S2's shape matched any of S1's shapes. Object features were generated randomly so that each S1 and objects therein were unique, and the shape of S2 object always had a 50% probability of being identical with any of the object features in S1. The field of view was 7.3°×7.3°and the objects on average spanned an area of 0.65°×0.65°. The minimum center-to-center distance between the objects was 1.29°and maximum 3.87°. Prior to M/EEG recording, subjects were given clear instructions and performed at least one practice session. M/EEG recordings were carried out in 5 blocks. Each block comprised 160 trials. We estimated network phase synchrony for a baseline period (0.7-0.1 s prior to S1) and two retention periods (0.6-1.2 s and 1.2-1.8 s after S1) as indicated. For simplicity, we selected a subset of trials from the whole experiment where subjects made a correct response (indicating they remembered the shape the sample stimuli) in the Load4 condition.

Subjects and recording
12 healthy subjects (29 7 6 years of age, mean 7SD, 7 females) participated in the experiment. We acquired T1-weighted anatomical MRI scans for each subject at a resolution of r1×1×1 mm with a 1.5-T MRI scanner (Siemens, Germany). During the VWM experiment, concurrent M/EEG data were recorded with 204 planar gradiometers, 102 magnetometers, and 60 EEG electrodes (Elekta Neuromag Ltd, Finland) with a sampling rate of 600 Hz. The preprocessing of the M/EEG data was described in [10]. The projects were approved by the Ethical Committee of HUS and conducted in accordance with the Declaration of Helsinki.

Cortical surface reconstruction and parcellation
We used FreeSurfer software (http://surfer.nmr.mgh.harvard.edu/) for volumetric segmentation of MRI images, reconstruction of the pial and the white and gray matter surfaces, and neuroanatomical labelling with the Destrieux atlas [4,5,11]. This atlas is comprised of 148 parcels covering the entire neocortex. To obtain a parcellation with a finer resolution and to decrease variability among parcel  surface areas, those parcels that had largest mean area in the subject cohort were iteratively selected and split in two [3] until a total of 400 parcels was obtained [3,12].

Forward operator, inverse operator and source reconstruction
We used MNE software (www.martinos.org/mne/) for MEG-MRI co-localization, creating individual source models with dipole orientation fixed to surface normal at 7-mm spacing, and preparing forward and inverse operators [13][14][15]. M/EEG sensor data were filtered using Morlet wavelets into 31 narrow-band frequency time-series covering frequency bands from 3 to 120 Hz with equal spacing on the log scale. Wavelet analysis was employed because it does not require the data to be stationary and therefore is more suitable for non-stationary neurophysiological signals than Fourier-based methods [16]. Inverse-operators were prepared for each frequency band with the noise covariance matrix χ estimated from pre-sample-stimuli baseline. For each subject, the sensor time series of these trials were projected into sources time series (6-8 k sources/subject) using Eq. (5) of [1] and λ 2 ¼ 0.05. These time series were collapsed into 400-parcel space using a sparsely-weighted-collapse-operator that was optimized to maximize source-reconstruction accuracy [8].

The mixing properties of the source space
We first estimated the mixing properties (i.e., f mix , PLV 0 , f p ) for individual cortical source-space and next computed the group average (Fig. 5, left). The individual mixing properties were obtained by averaging the results over 10 iterations in each iteration, the virtual MEG measurement of one null hypothesis time series X H0 (i) would yield one set of raw mixing property estimates (in cyan box, Fig. 5).
We picked three regions of interest (ROIs) to illustrate the computation of mixing properties in one subject (Fig. 6A, B). Among these three ROIs, the medial frontal gyrus (mFG) is spatially distal from the other two ROIs, and therefore, the mixing between mFG and two posterior parcels can be ignored.   The inferior parietal gyrus (iPG) and the medial temporal gyrus (mTG) are relatively close to each other, and therefore, there should be small but non-zero mixing between them. To demonstrate this, we aligned discrete phase time series of original and modeled null hypothesis time series from these parcel pairs, we subsequently subtracted sample-by-sample to obtain phase difference time series. The phase difference distributions were next computed for 50 bins between -π and π (Fig. 6).
First, phase differences between the original null-hypothesis time series of these three ROIs were uniformly distributed, and therefore PLV estimates were zero (Fig. 6C, left). However, there was a preferred phase difference at zero degree between original null hypothesis and modeled phase time series between (θ iPG ,θ mTG ) as well as between (θ mTG ,θ iPG ) indicating a bi-directional, non-zero mixing effect (f mix ), between mTG and iPG (Fig. 6C, middle). Last, the phase difference between modeled time series of (θ mTG ,θ iPG ) was even larger than that observed in Fig. 6C middle, indicating a large residual spread (PLV 0 ) between them (Fig. 6C, right).
We observed that parcel fidelity f p was spatially inhomogeneous across the cortical surface, and that low fidelity parcels were clustered in the medial and deep structures (Fig. 7A). The shape of the distributions of f mix and PLV 0 were similar (heavy-tailed) and PLV 0 was greater than f mix (Fig. 7B). Moreover, f mix and PLV 0 were both negatively correlated with parcel distance (Fig. 7C).
We further inspected the spatial distribution of the group-level mixing properties. The high fidelity parcels were mostly situated on the dorsal and lateral sides of the brain and predominantly on gyri (Fig. 8A). In contrast, high residual spread parcels were located in deeper sources such as the cingulate, inferior occipital, temporal, fontal and insula (Fig. 8B). We next obtained the intractableedge-mask (Methods of [1]) and deleted the 40% most poorly reconstructed interactions. Parcels with the most deleted edges are concentrated in the sulci and deep structures, including the cingulate, inferior occipital, temporal and insula. In addition, parcel fidelity and PLV 0 were negatively correlated (Spearman's rank correlation coefficient ρ ¼−0.82, Fig. 8D), meaning that high fidelity parcels tend to have low residual spread. The large dark area represents the deleted edges from poorly reconstructed loci (f e o 0.1), the fragmented dark areas within the heat map indicates the "short-range" edges that were deleted due to high local mixing (quantified by null hypothesis PLV 0 ).

The intractable edge mask (IEM)
We employed an IEM to delete edges connecting poorly reconstructed sources (Methods of [1]). Here we use an example to demonstrate applying IEM increases the quality of connectivity graphs. For example, in graphs of uniformly distributed edge coupling of 0.9, although all true interactions were simulated uniformly, the iPLV estimates of true positive edges were highly biased by edge fidelity f e (Spearman's rank correlation coefficient ρ¼ 0.91, Fig. 9A). This is because false negative (FN) edges (true edges that were rejected as non-significant) appeared to arise from loci of low edge fidelity (f e o0.2) (Fig. 9B).
If we define the edge fidelity mask M fe using f e o0.1 (orange), 35% edges from the raw-graph are removed (Fig. 9C). This corresponds to a true-positive rate (TPR) of 0.5 in raw-graphs with T iPLV ¼5 (thinnest black curve), representing a small loss of TP compared to not applying M fe (black vs red). The true-negative rate (TNR) of raw graphs with T iPLV ¼5 was 97% when applying the above mentioned M fe .

Mixing effects on measured individual FC graphs
After the virtual MEG experiment, the overall descriptive statistics of the FC graphs estimated in individual subjects were distorted due to mixing in the respective individual source spaces. For each truth graph, we simulated one set of null hypothesis time series H 0 and ten sets of truth time series, which included two patterns of coupling strength (coupling of the true edges were uniform or gamma distributed with each pattern having 5 levels of strengths, see Fig. 3). Here, "other" edges refer to all uncorrelated signal pairs, and there are 8×10 4 such "other than true" edges for a FC graph with dimension of 400×400.
In FC graphs estimated directly from the truth time series, the mean iPLV and PLV of true edges are much higher than other uncorrelated edges for all coupling patterns and strengths except for H 0 time series where the true edges had uniform zero couplings (Fig. 10A.i). In these truth time series, oiPLV 4 and oPLV4 are equivalent because coupling was simulated with a 90 degree phase-lag ( Fig. 10A.ii) where iPLV is at its maximum and equal to PLV. On the other hand, mean PLV of other edges was slightly greater than iPLV because in uncoupled signal pairs, the mean phase difference is uniformly distributed (Fig. 10A.iii), and therefore, the projections of complex-valued cPLV to the imaginary axis are always smaller than the vector length of the cPLV, i.e., PLV.
The mean iPLV and PLV of true edges estimated after the virtual MEG experiment, i.e., after linear mixing was introduced, were lower than those of the truth graphs yet higher than other uncorrelated edges (Fig. 10A.i vs. B.i). Moreover, the mean PLV of other edges after the virtual MEG experiment was nearly 3 times larger than in the truth graph (i.e., an increase from 0.02 to 0.06) whereas there was a smaller increase in the mean iPLV of "other" edges.
In addition, we found a change in the phase-lag in the true edges after the virtual MEG experiment. All the true edges were simulated with π/2 phase-lag (Fig. 10A.ii), but we found a decrease in the number of true edges with π/2 phase-lag and an increase in the number of edges with in zero-or 7π phase-lag in all graphs (Fig. 10 B.ii). This change was found most prominently in graphs with weak couplingas if the true interactions and mixing effects were two orthogonal forces competing to dominate the cPLV vector's angle (π/2 lag in true correlations vs. zero-and 7π phase-lag of mixing effect).
Finally, the phase-lag distribution of other uncorrelated edges in all 10 coupled graphs and in the H 0 graphs deviated from the uniform distribution ( Fig. 10B.iii). The most pronounced increase was observed in the zero-or 7π phase-lag edges due to mixing. This deviation from a uniform distribution explains 1) the systematical increase of PLV in all graphs that was seen earlier (implying the presence of a large amount of artificial interactions), and 2) the fact that the zero-and or 7π phaselag (i.e., artificial interaction) insensitive iPLV was less inflated than PLV. Moreover, there was also a slight increase in the number of edges having π/2 degree lag edges, which is a sign of spurious interactions that are "ghosting" the π/2 degree lag true interactions.
Note that, even in the most strongly coupled graphs (C c ¼ 0.9), the increase in the number of edges with π/2 degree lag was small (0.015 in both 790). However, considering the large population of "other" edges (8×10 4 ), a mere 0.03 increase in FPR means 2400 false positive spurious edges, overwhelming the true edges by a ratio of 12.
Funding sources