Comparison of beamformer implementations for MEG source localization

Beamformers are applied for estimating spatiotemporal characteristics of neuronal sources underlying measured MEG/EEG signals. Several MEG analysis toolboxes include an implementation of a linearly constrained minimum-variance (LCMV) beamformer. However, differences in implementations and in their results complicate the selection and application of beamformers and may hinder their wider adoption in research and clinical use. Additionally, combinations of different MEG sensor types (such as magnetometers and planar gradiometers) and application of preprocessing methods for interference suppression, such as signal space separation (SSS), can affect the results in different ways for different implementations. So far, a systematic evaluation of the different implementations has not been performed. Here, we compared the localization performance of the LCMV beamformer pipelines in four widely used open-source toolboxes (FieldTrip, SPM12, Brainstorm, and MNE-Python) using datasets both with and without SSS interference suppression. We analyzed MEG data that were i) simulated, ii) recorded from a static and moving phantom, and iii) recorded from a healthy volunteer receiving auditory, visual, and somatosensory stimulation. We also investigated the effects of SSS and the combination of the magnetometer and gradiometer signals. We quantified how localization error and point-spread volume vary with SNR in all four toolboxes. When applied carefully to MEG data with a typical SNR (3–15 dB), all four toolboxes localized the sources reliably; however, they differed in their sensitivity to preprocessing parameters. As expected, localizations were highly unreliable at very low SNR, but we found high localization error also at very high SNRs. We also found that the SNR improvement offered by SSS led to more accurate localization.


SSS interference suppression. 28
We analyzed MEG data that were i) simulated, ii) recorded from a static and moving phantom, and 29 iii) recorded from a healthy volunteer receiving auditory, visual, and somatosensory stimulation. We Imaging of Coherent Sources) (Gross et al., 2001) approach is popular. 65 The LCMV beamformer estimates the activity for a source at a given location (typically a point 66 source) while simultaneously suppressing the contributions from all other sources and noise 67 captured in the data covariance matrix. For evaluation of the spatial distribution of the estimated 68 source activity, an image is formed by scanning a set of predefined possible source locations and 69 computing the beamformer output (often power) at each location in the scanning space. When the 70 scanning is done in a volume grid, the beamformer output is typically presented by superimposing it 71 onto an anatomical MRI. open-source toolboxes for MEG data analysis are FieldTrip (Oostenveld et al., 2011), Brainstorm inside the MEG helmet and continuous MEG data were recorded with 1-kHz sampling rate for three 165 dipole amplitudes (20, 200 and 1000 nAm); one dipole at a time was excited with a 20-Hz sinusoidal 166 current for 500 ms, followed by 500 ms of inactivity. The recordings were repeated with the 200-nAm 167 dipole strength while moving the phantom continuously to mimic head movements inside the MEG 168 helmet; see the movements in Fig. 3e and Suppl. Fig. 2 173 We recorded MEG evoked responses from the same volunteer whose MRI and spontaneous MEG 174 data were utilized in the simulations. These human data were recorded using a 306-channel Elekta 175 Neuromag® system (at BioMag Laboratory, Helsinki, Finland). During the MEG acquisition, the 176 subject was receiving a random sequence of visual (a checkerboard pattern in one of the four 177 quadrants of the visual field), somatosensory (electric stimulation of the median nerve at the left/right 178 wrist at the motor threshold) and auditory (1-kHz 50-ms tone pips to the left/right ear) stimuli with an 179 interstimulus interval of ~500 ms. The Presentation software (Neurobehavioral Systems, Inc., 180 Albany, CA, USA) was used to produce the stimuli. 181

182
The datasets were analyzed in two ways: 1) omitting bad channels from the analysis, without 183 applying SSS preprocessing, and 2) applying SSS-based preprocessing methods (SSS/tSSS) to 184 reduce magnetic interference and perform movement compensation for moving phantom data. The 185 SSS-based preprocessing and movement compensation were performed in MaxFilter TM software 186 (version 2.2; Megin Oy, Helsinki, Finland). After that, the continuous data were bandpass filtered 187 (passband indicated for each dataset later in the text) followed by the removing the dc. Then the 188 data were epoched to trials around each stimulus. We applied an automatic trial rejection technique 189 based on the maximum variance across all channels, rejecting trials that had variance higher than 190 the 98 th percentile of the maximum or lower than the 2 nd percentile (see Suppl. Fig. 4). This method 191 is available as an optional preprocessing step in FieldTrip, and the same implementation was applied 192 in the other toolboxes. Below we describe the detailed preprocessing steps for all datasets. 193

Simulated data
In each toolbox, the raw data with just bad channels removed or SSS-preprocessed continuous data 195 were filtered using a zero-phase filter with a passband of 2-40 Hz. The filtered data were epoched 196 Then the noise and data covariance matrices were estimated from these epochs for the time 199 continuous data were filtered to 2-40 Hz using a zero-phase bandpass filter, and the filtered data 208 were epoched from -500 to +500 ms with respect to stimulus triggers. Bad epochs were removed 209 using the automated method based on maximum variance, yielding ~100 epochs for each dataset. 210 The noise and data covariance matrices were estimated in each toolbox for the time windows of -211 500 to -50 ms and 50 to 500 ms, respectively. 212

213
Both the unprocessed raw data and the data preprocessed with tSSS were filtered to 1-95 Hz using 214 a zero-phase bandpass filter in each toolbox. The trials with somatosensory stimuli (SEF) were 215 epoched between -100 to -10 and 10 to 100 ms for estimating the noise and data covariances, 216 respectively. The corresponding time windows for the auditory-stimulus trials (AEF) were -150 to -217 20 and 20 to 150 ms, and for the visual stimulus trials (VEF) -200 to -50 and 50 to 200 ms, 218 respectively. Trials contaminated by excessive eye blinks (EOG > 250 μV) or by excessive magnetic 219 signals (MEG > 5000 fT or 3000 fT/cm) were removed with the variance-based automated trial 220 removal technique. Before covariance computation, baseline correction by the time window before 221 the stimulus was applied on each trial. The covariance matrices were estimated independently in 222 each toolbox. 223 Since the actual source locations associated with the evoked fields are not precisely known, we 224 defined reference locations using conventional dipole fitting in the Source Modeling Software of 225 Megin Oy (Helsinki, Finland). A single equivalent dipole was used to represent SEF and VEF 226 sources, and one dipole per hemisphere was used for AEF (see Suppl. human MEG data, the head models and the source space were defined in the same way as for the 251 beamformer scanning of the simulated data. 252

LCMV beamformer 253
The linearly constrained minimum-variance (LCMV) beamformer is a spatial filter that relates the 254 magnetic field measured outside the head to the underlying neural activities using the covariance of 255 where the × 3 matrix ( ) is known as spatial filter for a source at location . This type of spatial 263 filter provides a vector type beamformer by separately estimating the activity for three orthogonal 264 source orientations, corresponding to the three columns of the matrix. According to Eqs 16-23 in 265 van Veen et al. (1997), the spatial filter ( ) for vector beamformer is defined as 266 Here ( ) is the × 3 local leadfield matrix that defines the contribution of a dipole source at location 268 in the measured data x, and is the covariance matrix computed from the measured data samples. 269 To perform source localization using LCMV, the output variance (or output source power) Var( (r j )) 270 is estimated at each point in the source space (see Eq (24) in van Veen et al., 1997), resulting in 271 Usually, the measured signal is contaminated by non-uniformly distributed noise and therefore the 273 estimated signal variance is often normalized with projected noise variance n calculated over some 274 baseline data (noise). Such normalized estimate is called Neural Activity Index (NAI; van Veen et 275 al., 1997) and can be expressed as 276 Scanning over all the locations in the region of interest in source space transforms the MEG data 278 from a given measurement into an NAI map. 279 In contrast to a vector beamformer, a scalar beamformer (Sekihara and Scholz, 1996; Robinson and 280 Vrba, 1998) uses constant source orientation that is either pre-fixed or optimized from the input data 281 by finding the orientation that maximizes the output source power at each target location. Besides 282 simplifying the output, the optimal-orientation scalar beamformer enhances the output SNR 283  Denoting η opt ( ) = ( ) opt ( ) instead of ( ), the weight matrix in Eq (2) becomes × 1 weight 291 vector ( ), 292 When the data covariance matrix is estimated from a sufficiently large number of samples and it has 296 full rank, Eq (7) provides the maximum spatial resolution (Lin et al., 2008;Sekihara and Nagarajan, 297 2008). According to van Veen and colleagues (1997), the number of samples for covariance 298 estimation should be at least three times the number of sensors. Thus, sometimes, the amount of 299 available data may be insufficient to obtain a good estimate of the covariance matrices. In addition, 300 pre-processing methods such as signal-space projection (SSP) or signal-space separation (SSS) 301 reduce the rank of the data, which impacts the matrix inversions in Eq (7). These problems can be 302 mitigated using Tikhonov regularization (Tikhonov, 1963) by replacing matrix −1 by its regularized 303 version ( + λ ) −1 in Eqs (2-7) where λ is called the regularization parameter. 304 All tested toolboxes set the λ with respect to the mean data variance, using ratio 0.05 as default: 305 If the data are not full rank, also the noise covariance matrix n needs to be regularized. 307

Differences between the beamformer pipelines 308
Though all the four toolboxes evaluated here use the same theoretical framework of the LCMV 309 beamformer, there are several implementation differences which might affect the exact outcome of 310 a beamformer analysis pipeline. Many of these differences pertain to specific handling of the data 311 prior to the estimation of the spatial filters, or to specific ways of (post)processing the beamformer 312 output. Some of the toolbox-specific features reflect the characteristics of the MEG system around 313 which the toolbox has evolved. Importantly, some of these differences are sensitive to input SNR, 314 and they can lead to differences in the results. Table 1 lists the main characteristics and settings of 315 the four toolboxes used in this study. We used the default settings of each toolbox (general practice) 316 for steps before beamforming but set the actual beamforming steps as similar as possible across 317 the toolboxes to be able to meaningfully compare the results. 318 Insert Table 1 about here 319 All toolboxes import data using either Matlab or Python import functions of the MNE software 320 For MEG-MRI co-registration, there are several approaches available across these toolboxes such 327 as an interactive method using fiducial or/and digitization points defining the head surface, using 328 automated point cloud registration methods e.g., the iterative closest point (ICP) algorithm. Despite 329 using the same source-space specifications (rectangular grid with 5-mm resolution), differences in 330 head models and/or co-registration methods change the forward model across toolboxes; see Fig. 4. 331 Though there are several approaches to compute data and noise covariances across the four 332 beamformer implementations, by default they all use the empirical/sample covariance. In contrast to 333 other toolboxes, Brainstorm eliminates the cross-modality terms from the data and noise covariance 334 matrices. Also, the regularization parameter is calculated and applied separately for gradiometers 335 and magnetometers channel sets in Brainstorm. FieldTrip and SPM12 assume a single sensor type for all the MEG channels. This approach makes 344 SPM12 to favor magnetometer data (with higher numeric values of magnetometer channels) and 345 FieldTrip to favor gradiometer data (with higher numeric values of gradiometer channels). However, 346 users of FieldTrip and SPM12 usually employ only one channel type of the triple-sensor array for 347 beamforming (most commonly, the gradiometers). Due to the presence of two different sensor types 348 in the MEGIN systems and the potential use of SSS methods, the eigenspectra of data from these 349 systems can be idiosyncratic (see Suppl. Fig. 7) and differ from the single-sensor type MEG systems.  The focality of the estimated source, also called focal width, depends on several factors such as the 378 source strength, orientation, and distance from the sensors. PSV measures the focality of an 379 estimate and is defined as the total volume occupied by the source activity above a threshold value; 380 thus, a smaller PSV value indicates a more focal source estimate. We fixed the threshold to 50% of 381 the highest NAI in all comparisons. In this study, the volume represented by a single source in any 382 of the four source spaces (5-mm grid spacing) was 125 mm 3 . 383

Signal-to-Noise ratio (SNR): Beamformer localization error depends on the input SNR, which varies 384
among other factorsas a function of source strength and distance of the source from the sensor 385 array. Therefore, we evaluated beamformer localization errors and PSV as a function of the input 386 SNR of the evoked field data. 387 We estimated the SNR for each evoked field MEG dataset in MNE-Python using the estimated noise 388 covariance as follows: The data were whitened using the noise covariance and the effective number 389 of sensors was then calculated as 390 where are the eigenvalues of noise covariance matrix n . 392 Then, the input SNR was calculated as: 393

432
In the case of phantom data, the background noise is very low and there is a single source 433 underneath a measurement. Also, the phantom analysis uses a homogeneous sphere model that 434 does not introduce any forward model inaccuracy, except the possible co-registration error. All four 435 toolboxes show high localization accuracy and high resolution for phantom data, if the input SNR is 436 not very low. Corresponding results for the static phantom data are presented in Fig. 6a-b. Fig. 6a

Human subject MEG data 461
Since the correct source locations for the human evoked field datasets are unknown, we plotted the 462 localization difference across the four LCMV implementations for each data. These localization 463 differences were the Cartesian distance between an LCMV-estimated location and the 464 corresponding reference dipole location as explained in Section 2.1.4. Fig. 8a shows the plots for 465 the localization differences against the input SNRs computed using Eq (9) for four visual, two 466 auditory and two somatosensory evoked-field datasets. The localization differences for both 467 unprocessed raw and SSS preprocessed data are mostly under 20 mm in each toolbox. The higher 468 differences compared to the phantom and simulated dataset could be because of two reasons. First, 469 the recording might have been comprised by some head movement, which could not be corrected 470 because of the lack of continuous HPI. Second, the reference dipole location may not represent the 471 very same source as estimated by the LCMV beamformer. In contrast to dipole fitting, beamforming 472 utilizes data from the full covariance window, so some difference between the estimated localizations 473 is to be expected. For all SSS-preprocessed evoked field datasets, Fig. 8b shows the estimated 474 locations across the four LCMV implementation and the corresponding reference dipole locations. 475 For simplifying the visualization, all estimated locations in a stimulus category are projected onto a 476 single axial slice. All localizations seem to be in the correct anatomical regions, except the estimated 477 location from right-ear auditory responses by MNE-Python after SSS-preprocessing ( Fig. 8b; red  478 circle). After de-selecting the channels close to the right auditory cortex, the MNE-Python-estimated 479 source location was correctly in the left cortex ( Fig. 8b; green circle). The regression plots fitted over 480 the maxima of the localization differences across the LCMV implementations show the improvement 481 in input SNR and also localization improvement in some cases. Fig. 9 in Supplementary material 482 shows the PSV values as a function of the input SNR for the evoked-field datasets, demonstrating 483 the spatial resolution of beamforming. 484 Insert Fig. 8 about here  486

489
The localization accuracy and beamformer resolution as a function of the input SNR were 490 investigated and compared across the LCMV implementations in the four tested toolboxes. In the 491 absence of background noise, the phantom data showed high localization accuracy and high spatial 492 resolution if the input SNR >~5 dB. All implementations also showed high localization accuracy for 493 data recording from a moving phantom after compensating the movement and applying tSSS. Our results indicate that with the default parameter settings, none of the four implementations works 503 universally reliable for all datasets and input SNR values. In the case of low SNR (typically less than 504 3 dB), the lower contrast between data and noise covariance may cause the beamformer scan to 505 provide a flat peak in the output and so the localization error goes high. Unexpectedly, we found high 506 localization error for high SNR signal and significant differences between the toolboxes. The 507 regression curves fitted over averaged maximum PSV across all toolboxes showed higher values 508 for low-and high-SNR simulated data. As expected, reliable localization provides higher spatial 509 resolution across the implementations and vice-versa ( Fig. 5 and 6). The lower spatial resolution 510 (higher PSV) for the signal with low SNR also agrees with previous studies (Lin et al., 2008;511 Hillebrand and Barnes, 2003). We further discuss here the significant steps of the beamformer 512 pipelines, which affect the localization accuracy and introduce discrepancies among the 513 implementations. 514

Preprocessing with SSS 515
Due to the spatial-filter nature of the beamformer, it can reject external interference and therefore 516 SSS-based pre-processing may have little effect on the results. Thus, although the SNR increases 517 as a result of applying SSS, the localization accuracy does not necessarily improve, which is evident 518 in the localization of the evoked responses (Fig. 8). 519 However, undetected artifacts, such as a large-amplitude signal jump in a single sensor, may in SSS 520 processing spread to neighboring channels and subsequently reduce data quality. Therefore, 521 channels with distinct artifacts should be noted and marked as bad prior to beamforming of 522 unprocessed data or before applying SSS operations. In addition, trials with large artifacts should be 523 removed based on an amplitude thresholding or by other means. Furthermore, SSS processing of 524 extremely weak signals (SNR < ~2 dB) may not improve the SNR for producing smaller localization 525 errors and PSV values. Hence the data quality should be carefully inspected before and after 526 applying preprocessing methods such as SSS, and channels or trials with low-quality data (or lower 527 contrast) should be omitted from the covariance estimation. data at the very first stage (see Suppl. Fig. 6). The data import either keeps the data in SI-units (T 532 for magnetometers and T/m for gradiometers) or rescales the data (fT and fT/mm) before further 533 processing. The actual pre-processing steps in the pipeline may contribute to differences in the 534 results. The filtering step is performed to remove frequency components of no interest, such as slow 535 drifts, from the data. By default, FieldTrip and SPM use an IIR (Butterworth) filter, and MNE-Python 536 uses FIR filters. The power spectra of these filters' output signals show notable differences and the 537 output data from these two filters are not identical. Significant variations can be found between MNE-538 Python-filtered and FieldTrip/SPM-filtered data. Although SPM and FieldTrip use the same filter 539 implementation, the filtering results are not identical because of numeric differences caused by 540 different channel scaling (Suppl. Fig 6). These differences affect the estimated covariance matrices, 541 which are a crucial ingredient for the spatial-filter computation and finally may contribute to 542 differences in beamforming results. 543

Effect of SNR on localization accuracy 544
We reduced the impact of the unknown source depth and strength to a well-defined metrics in terms 545 of the SNR. We observed that the localization accuracy is poor for very low SNR values, i.e. below 546 3 dB. The weaker, as well as the deeper sources, project less power on to the sensor array and thus 547 show lower SNR; see Eq (9). On the other hand, the LCMV beamformer may also fail to localize 548 accurately sources that produce very high SNR values, likely because the data covariance matrix is 549 over-fitted, or the scanning grid is too sparse with respect to the point spread of the beamformer 550 output. In this case the output is too focal and a small error in forward solution, introduced for 551 example by inaccurate coregistration, may lead to missing the true focal source and obtaining nearly 552 equal power estimates at many source locations, increasing the chance of mislocalization. Usually, 553 such high levels of SNR do not occur in typical human MEG experiments, however, the strength of 554 equivalent current dipoles (ECD) for interictal epileptiform discharges (IIEDs) typically ranges 555 between 50 and 500 nAm (Bagic et al., 2011a). 556 All four beamformer pipelines provided very similar results when the SNR is in the "suitable range" 557 of about ~3-15 dB. Unsatisfactory performance is typically due to the data; either the SNR is 558 extremely low, or there are some uncorrected artifacts in the data. The results of the phantom data 559 showed that all toolboxes provide equally good results if there are no uncorrected large artifacts in 560 the data and if the SNR is not extremely small or large. 561

Effect of the head model 562
Forward modelling requires MEG-MRI co-registration, segmentation of the head MRI and leadfield 563 computation for the source space. The four beamformer implementations use different approaches, 564 or similar approaches but with different parameters, which yields slightly different forward models. 565 From Eqs (2-7), it is evident that beamformers are quite sensitive to the forward model. Hillebrand 566 and Barnes (2003) showed that the spatial resolution and the localization accuracy of a beamformer 567 improve with accuracy of the forward model. Dalal and colleagues (2014) reported that co-568 registration errors contribute greatly to EEG localization inaccuracy, likely due to their ultimate impact 569 on head-model quality. Chella and colleagues (2019) presented the dependency of beamformer-570 based functional connectivity estimates on MEG-MRI co-registration accuracy. 571 The increasing inter-toolbox localization differences towards very low and very high input SNR is 572 also subject to the differences between the head models. Fig. 4 shows the three overlapped head 573 models prepared from the same MRI where a slight misalignment among head models can be easily 574 seen. This misalignment also affects source space. These differences in head models and thus in 575 forward solutions will contribute to differences in beamforming results across the toolboxes. 576

577
The data covariance matrix is a key component of the adaptive spatial filter in LCMV beamforming, 578 and any error in covariance estimation can cause an error in source estimation. We used 5% of the 579 mean variance of all sensors to regularize data covariance for making its inversion stable in FieldTrip, 580 SPM12 and MNE-Python. Brainstorm uses a slightly different approach and applies regularization 581 with 5% of mean variance of gradiometer and magnetometer channel sets separately and eliminate 582 cross-sensor-type entries from the covariance matrices. As SSS preprocessing reduces the rank of 583 the data, usually retaining at most 75 non-zero eigenvalues, the trace of the covariance matrix 584 decreases strongly. At very high SNRs (> 15 dB), overfitting of the covariance matrix becomes more 585 prominent; the condition number (ratio of the largest and the smallest eigenvalues) of the covariance 586 matrix becomes very high even after the default regularization, which can deteriorate the quality of 587 source estimates unless the covariance is appropriately regularized. Therefore, the seemingly same 588 5% regularization can have very different effects before and after SSS; see Suppl. Fig. 7. Thus, the 589 commonly used way of specifying the regularization level might not be appropriate to produce a good 590 and stable covariance model at high SNR, and this could be one of the explanations for the 591 anecdotally reported detrimental effects of SSS on beamforming results. 592