MEG current source reconstruction using a meta-analysis fMRI prior

Magnetoencephalography (MEG) offers a unique way to noninvasively investigate millisecond-order cortical activities by mapping sensor signals (magnetic fields outside the head) to cortical current sources using current source reconstruction methods. Current source reconstruction is defined as an ill-posed inverse problem, since the number of sensors is less than the number of current sources. One powerful approach to solving this problem is to use functional MRI (fMRI) data as a spatial constraint, although it boosts the cost of measurement and the burden on subjects. Here, we show how to use the meta-analysis fMRI data synthesized from thousands of papers instead of the individually recorded fMRI data. To mitigate the differences between the meta-analysis and individual data, the former are imported as prior information of the hierarchical Bayesian estimation. Using realistic simulations, we found out the performance of current source reconstruction using meta-analysis fMRI data to be better than that using low-quality individual fMRI data and conventional methods. By applying experimental data of a face recognition task, we qualitatively confirmed that group analysis results using the meta-analysis fMRI data showed a tendency similar to the results using the individual fMRI data. Our results indicate that the use of meta-analysis fMRI data improves current source reconstruction without additional measurement costs. We assume the proposed method would have greater effect for modalities with lower measurement costs, such as optically pumped magnetometers.


Introduction
Magnetoencephalography (MEG) is valued as a major modality for non-invasive neuroscience studies thanks to its millisecond-order temporal resolution ( Baillet, 2017;Hämäläinen et al., 1993 ). Combined with current source reconstruction methods, which map sensor signals (magnetic fields outside the head) to cortical current sources, it offers a unique way to noninvasively investigate millisecond-order cortical activities ( Baillet et al., 2001;Gross et al., 2013 ).
The current source reconstruction methods using the distributed source model rely on solutions of the inverse problem, which is an illposed problem requiring constraints in addition to data, since the number of observed signals is only a few hundred, whereas the number of considered current sources is in the thousands. This problem is similarly defined for electroencephalogram (EEG), which observes changes in the electric field caused by neural activity. Many studies have proposed various types of additional constraints and ways to incorporate them in the mathematical model. The most straightforward but powerful one is minimum norm estimation (MNE) ( Hämäläinen et al., 1993;Hämäläinen and Ilmoniemi, 1994;Wang et al., 1992 ). The MNE solution is obtained by minimizing a cost function consisting of a data-fitting probabilistic distributions having parameters according to the fMRI statistical map (i.e., they incorporate the fMRI map into the hierarchical prior of the current variance). Therefore, even if the intrinsic difference in the measurement principles of fMRI and MEG causes a discrepancy between the spatial patterns of the fMRI activation map and the source current map ( Kaneoke, 2006 ), the hVB approach is able to resolve such discrepancies.
Although the fMRI prior information leads to significant improvement in terms of spatial accuracy, there are practical problems. First, it requires additional cost to measure the fMRI data. It increases not only the measurement cost but also the burden on the subject. This burden also causes a decrease in measurement quality and leads to a second issue. It is also inapplicable to a published MEG dataset without fMRI records. Furthermore, when targeting modalities with lower measurement costs, such as optically pumped magnetometers (OPM) ( Boto et al., 2018 ), the measurement cost of fMRI will become relatively more serious. As a second problem, the reliability of individual-level statistical maps is not always sufficient, especially for cognitive or emotional tasks that induce smaller signal changes compared with visual and motor tasks ( Elliott et al., 2020 ). The quality of fMRI data is strongly influenced by the subject's condition and noise caused by a system and physiology ( Geissler et al., 2007;Welvaert and Rosseel, 2013 ).
In this study, to solve the issues mentioned above, we propose using meta-analysis fMRI data as a hierarchical prior of current source reconstruction instead of an individual prior. The meta-analysis fMRI data form a statistical map synthesized from thousands of published fMRI studies. We used the meta-analysis results available from the Neurosynth open-source project ( Yarkoni et al., 2011 ). A key idea here is that we took meta-analysis results as the hierarchical prior distribution of the current variance in hVB estimation ( Sato et al., 2004 ) rather than the prior current variance as in the dSPM approach, since the former provides an adaptive way to incorporate fMRI information based on MEG measurements.
The feasibility of meta-analysis fMRI data as prior information was assessed by comparing the results of source reconstruction with those estimated using an individual fMRI prior. The results were quantitatively and qualitatively evaluated using simulated and experimental data, respectively. For the simulation, we conducted multi-modal data generation. Both fMRI and MEG data were generated based on the same ground truth to mimic a situation in which these are obtained from an individual. In order to bring the statistical properties of simulated data closer to those of real data, this simulation adopted real data to the extent possible. Using simulated data, we studied the effectiveness of a meta-analysis fMRI prior compared with a contaminated individual fMRI prior. We also investigated the appropriate range of an important hyper-parameter we call prior weight. Finally, we conducted group analysis using the meta-analysis fMRI prior on the experimental data. This paper is organized as follows. The Materials and Methods section explains the key methods used to accomplish current source reconstruction with the meta-analysis fMRI prior, the multi-modal data generation framework, and a detailed analysis of the experimental data. The Results section presents the results of data analysis from simulations and experiments. The simulations show that the performance of the metaanalysis fMRI prior was comparable to that using an individual fMRI prior with a low contrast-to-noise ratio (CNR). The experiments show that group analysis results of the meta-analysis fMRI prior were similar to the results of the individual fMRI prior. Finally, we summarize the significance of the present study and discuss the advantages and limitations of meta-analysis fMRI as prior information for source reconstruction.

Materials and methods
In this section, we introduce the key methods used to achieve current source reconstruction with a meta-analysis fMRI prior. Next, we explain the details of the simulated and experimental data.

Brain model and lead field matrix
We constructed a brain model and a lead field matrix using the MAT-LAB open-source toolbox VBMEG ( https://vbmeg.atr.jp ). The individual brain model is constructed based on a subject's T1-MRI image. First, we construct a polygon model of the cortical surface using FreeSurfer ( http://surfer.nmr.mgh.harvard.edu/ ). From the polygon model, we select 10,004 vertices as the current sources based on the predefined coordinate in the standard brain (MNI-ICBM152). As a result, source locations of different subjects are aligned to those of the standard brain while the sulcus-gyrus structure of each subject was maintained. This allows a simple comparison of the estimated source currents across subjects for each source. Therefore, we can proceed to group analyses on the source currents without any transformation. Next, we construct a 1-shell (cerebrospinal fluid) head conductivity model. Based on the model, we make a lead field matrix by solving the Maxwell equations with a boundary element method. For more information on preparing the brain model and lead field matrix using VBMEG, see Takeda et al. (2019) .

Importing fMRI information
The statistical results of fMRI data were imported into a subject's brain model using VBMEG. Original statistical maps (t-value for individual fMRI data and z-value for meta-analysis fMRI) are defined on voxels, and thus they were transformed to the cortical surface using an inverse-distance weighted interpolation method. Imported fMRI information were used to calculate parameters in the probability distribution of the prior current variances as described below.

Bayesian formulation of current source reconstruction
MEG measures the magnetic field generated by neural activities on the cortex. For the distributed source model, the forward model, which is the mathematical model of this measurement process, can be described by the linear equation ( Hämäläinen et al., 1993 ): where = ( 1 , … , ) ⊤ represents neural activities (current dipole moment) on the cortex, = ( 1 , … , ) ⊤ is the magnetic field of MEG sensors, the × matrix is the lead field matrix, and = ( 1 , … , ) ⊤ is observation noise. When we consider time series data, Eq. (1) is described in matrix form as where = ( 1 , ..., , ..., ) and = ( 1 , ..., , ..., ) are current and MEG time series data consisting of time points ( is an index of time samples), respectively. As the number of observations is typically far fewer than the number of current dipoles, solving Eq. (1) becomes an ill-posed problem requiring constraints to avoid non-uniqueness and instability of solutions. Assuming the observation noise follows an identically independent Gaussian distribution  ( , ( ) −1 ) , where is a scaling parameter and is a noise precision matrix scaled by ( is normalized to satisfy tr ( ) = and is typically determined from pre-stimulus rest period measurements), the likelihood function, which is the probabilistic model of the forward model, is given by and for a single time point observation and multiple time points observations, respectively. From the Bayesian perspective, constraints are imposed in the form of the prior distribution, with hyperparameter . Differences in source reconstruction methods are mainly explained as differences in the assumptions on this prior distribution 0 ( | ) . Given the likelihood function and the prior distribution, the current source is estimated using the posterior distribution ( | ) derived from Bayesian theorem ∶

Hierarchical variational bayesian model
In this study we incorporate fMRI priors in the hierarchical variational Bayesian (hVB) model ( Sato et al., 2004 ), here we followed the parametrization in Yoshioka et al. (2008) . The hVB model assumes the prior distribution in a hierarchical way: where = diag ( ) = diag ( 1 , 2 , … , ) is the diagonal precision matrix of the currents and is the same as in Eq. (3) . Each is a random variable representing an inverse variance of the nth current , and Eq. (8) is the hierarchical prior of . Γ( | 0 , 0 ) represents a Gamma distribution with mean 0 and degree of freedom 0 , where Γ( 0 ) ≡ ∫ ∞ 0 0 −1 − is the Gamma function. Here, 0 is determined based on fMRI information as −1 0 ∝ fMRI 2 , where fMRI is an imported statistical value of the n th dipole. Eq. (10) is the prior distribution of the scaling parameter , defined as a non-informative prior distribution.
Using the likelihood function Eq. (4) and the prior distributions Eq. (7) , (8), (10) , we compute the joint posterior distribution of the current , prior precision , and scaling parameter given the magnetic field as follows: However, it cannot be solved analytically because the marginal likelihood ( ) involves integration over . Therefore, the approximated posterior distribution is calculated using the variational Bayesian method. The resulting algorithm consists of iterative updating of the posterior distribution ( , ) and ( ) ( Appendix A ).

Prior weight
We used the VBMEG toolbox ( Takeda et al., 2019 ) to conduct the hVB estimation. In the VBMEG implementation, the hyperparameter called the prior weight controls the relative confidence between the fMRI information and the observed MEG. The prior weight is defined as where 0 is the degree of freedom of the hierarchical prior distribution. The prior weight ranges from 0 to less than 1. This confidence parameter appears in updating the current variance α −1 : where <> indicates taking the expectation with respect to the posterior distributions of and . The left-hand side denotes the updated nth current's variance α −1 . The first term of the right-hand side means fMRI information, and the second term is current sources calculated from observed MEG data with time points. When we set the prior weight close to 1, the resulting prior distribution tends to have peaks at the places where the fMRI information has a high value. This indicates strong confidence in the fMRI information in the estimation. On the other hand, if it is set to a small value, the result tends to emerge according to the MEG data. To represent the strongest confidence in the fMRI information, we set = 0 . 99999999 . Note that if the prior weight is 0, then we use only the fMRI information as the initial value of α −1 . Even if the prior information includes false positive activities, hVB estimation is able to suppress them by setting the proper value on the prior weight ( Yoshioka et al., 2008 ).

Meta-analysis fMRI prior
A statistical map synthesized by Neurosynth ( Yarkoni et al., 2011 ) was imported to calculate the hierarchical prior using VB-MEG. All of the results in this paper were based on Neurosynth v0.4 ( https://neurosynth.org ). Neurosynth infers the regions associated with a term by mining thousands of papers and conducting a type of multilevel kernel density analysis ( Wager et al., 2009 ). Therefore, the term must be selected by the experimenter when obtaining the meta-analysis results. In order to generate the effective meta-analysis data as prior information, the term must be related to the task of a MEG experiment. However, there are many choices in the term for a task. We selected the term from the highest layer of the ontology of each task domain ( Thompson and Fransson, 2017 ). Namely, the term "visual " was chosen for the visual domain task and "motor " was chosen for the motor domain task (see discussion on our choice of terms). After generating the meta-analysis statistical maps, they were imported to the subject's brain model in the same way as the individual fMRI prior. The synthesized meta-analysis maps ( Fig. 1 ) were employed as the hierarchical fMRI prior.

Data simulation
To evaluate the feasibility of meta-analysis fMRI data as the prior information of current source reconstruction, we generated individual fMRI and MEG data with two task conditions and several difficulty settings. Both fMRI and MEG data were generated based on the same ground truth, which we call a common truth. This procedure aggregated various datasets of several modalities (structural MRI, fMRI, restingstate fMRI, diffusion MRI, and MEG sensor locations).
We generated two sets of data focusing on visual and motor experiments. The simulated dataset of the visual domain was made using multi-modal datasets of a face recognition task, which is publicly available from https://openneuro.org/datasets/ds000117/versions/1.0.3/ ( Wakeman and Henson, 2015 ). The simulated dataset of the motor domain was made using the left-foot movement task from the Human Connectome Project ( Van Essen et al., 2012 ).
For all simulated data, we used the same brain model constructed from an individual's data (subject No. 2 of Wakeman and Henson (2015) using VBMEG.

Common truth
First, we randomly selected five subjects from both datasets (subject No. 2, 6, 9, 15, 16 for the visual domain, and No. 115724, 144933, Fig. 1. Meta-analysis priors synthesized by the terms "visual " and "motor. " These are synthesized using Neurosynth and imported into VB-MEG's coordinate, and they are used for the prior information of current source reconstruction. The visual prior is used for the visual domain task, and the motor prior is used for the motor domain task. 481951, 896879, 902242 for the motor domain). FMRI preprocessing, including realignment, slice timing correction, normalization, and smoothing, were conducted for each subject using the same parameters as those in the original study (we referred to Wakeman and Henson (2015) for visual data and Van Essen et al. (2012) for motor data). Then, task conditions against a baseline were contrasted. For the visual domain dataset, we employed the face image recognition task (famous condition). For the motor domain dataset, we employed the left-foot movement task (LFoot condition). Then, we acquired one contrasted statistical map for each subject of both tasks.
Next, group analysis was conducted using contrasted statistical maps of five subjects for each task. Finally, the resulted statistical maps of group analysis were thresholded with three levels of significance: p < 0.001, 0.01, or 0.05. These thresholded maps were used as the common truth for both fMRI and MEG data generation. Thresholding controlled the density of the common truth map and modulated the cancellation index of the MEG data ( Ahlfors et al., 2010 ). These procedures are presented in the middle part of Fig. 2 . To prevent the simulated data from being highly dependent on the results of an individual, the common truth was built based on the results of group analysis. All of the preprocesses and statistical tests were performed using spm8 ( https://www.fil.ion.ucl.ac.uk/spm/software/spm8/ ).

FMRI data generation
We generated fMRI data with various signal qualities based on the common truth by adding the resting-state brain activity as background noise (upper part of Fig. 2 ). We designed a virtual fMRI task design. It was an event-related paradigm. A stimulus was designed to be presented for 2 s with 20 s intervals. We set the total number of fMRI scans to 240, with 2 s of repetition time. Then the designed time series were convolved with the canonical hemodynamic response function (HRF) to obtain the time series of task-related brain activities. The amplitude of the time series on all voxels was determined by weighting with the common truth map, resulting in simulated fMRI data during the designed task. The fMRI time series was further contaminated by adding the experimental resting-state fMRI activity (see Ogawa et al., 2018 , for data acquisition) as noise to take temporal modulations of each brain region into account ( Valente et al., 2009 ). We varied the amount of noise so that the contrast-to-noise ratio (CNR) ranged from 0.5 to 10. CNR is defined as the ratio between the task-related signal amplitude and a standard deviation of the task-irrelevant noise signal ( Baumgartner et al., 2000;Welvaert and Rosseel, 2013 ): Finally, general linear model (GLM) analysis was conducted on the generated task fMRI time series data using SPM8 ( Friston et al., 1994 ), resulting in statistical maps of various CNR levels, which served as individual fMRI priors for current estimation. Examples of the common truth and simulated individual fMRI priors for CNR = 0.5, 1, 3, 5, and 10 are shown in Fig. 3 . The statistical maps in the right panel gradually approach the common truth map as CNR increases. Using these CNRcontrolled priors, we investigated the effects of the quality of the individual fMRI data on the current reconstruction that were comparable to the meta-analysis fMRI data. In Supplementary Figs. 1 and 2, we report the distribution of CNR for all simulation settings. These were calculated using coefficients and residuals of GLM, as CNR = coeff.(condition) / mean square of residuals × scaling factor. The scaling factor is the maximum value of the task design time series (after convolution with HRF). As these figures indicate, the maximum CNRs of each setting were configured to the desired value. To be compatible with more research, we also show the percent signal change (PSC), which is a popularly used quality measure ( Pernet 2014 ;Welvaert and Rosseel 2013 ). It was also calculated using parameters of GLM as PSC = coeff.(condition) / coeff.(constant) × scaling factor × 100.

MEG data generation
The locations and amplitudes of dipoles were defined based on the common truth spatial map imported into VBMEG's coordinate. Because it is difficult to design the temporal evolution among distributed dipoles by hand, we designed it using the ROI-level whole-brain current simulation. The simulation was conducted using the connectome obtained from diffusion MRI data (it was averaged over 13 subjects, see Endo et al. (2020) , for data acquisition), and the Larter-Breakspear neural mass model ( Breakspear et al. 2003 ;Larter et al. 1999 ) with the AAL parcellation (details of the ROI-level whole-brain current simulation are following ( Endo et al., 2020 ). To obtain vertex-level whole-brain current time series, we assumed the same waveforms for vertices belonging to a Fig. 2. FMRI (upper part) and MEG (lower part) data-generation process from the common truth (middle part). Simulated individual fMRI priors are taken as the hierarchical prior of hVB estimation for simulated MEG data. Both modalities are generated based on the common truth map. Density of the common truth is controlled by thresholding the p -value ( p < 0.001, 0.01, or 0.05). It affects the density of the generated fMRI statistical map as well as the cancellation of MEG data. The SNR of single-trial MEG data is fixed at 0 [dB], and total SNR is controlled by the number of trials (only a single trial is shown in the figure).

Fig. 3.
Examples of the common truth map and simulated individual fMRI priors. Here, the common truth is computed for the face image recognition task with thresholding p < 0.01. Individual fMRI data are generated from the common truth with controlled contamination. As the individual fMRI images show, statistical maps gradually clarify the common truth map with increasing CNR. They are imaged from the bottom view to observe the change in statistical maps in the fusiform face area. single ROI. Then by weighting the whole-brain waveforms with the common truth of the task (visual or motor), we generated the current source time series of the desired task. Observed MEG time series were calculated by multiplying the lead field matrix and adding Gaussian noises. The length of a single trial was set to 250 ms. These procedures are presented in the lower part of Fig. 2 .
The quality of MEG data was defined as the signal-to-noise ratio (SNR) of trial-averaged data, and it was controlled by varying the num-ber of trials. Here, SNR of MEG data is defined as SNR = 10 log 10 where 2 is a variance of signals during the task and 2 is a variance of noises. We configured the SNR of single-trial MEG data to be 0 [dB]. Consequently, the SNR of trial-averaged data became 0, 5, and 10 [dB] when the number of trials was 1, 3, and 10, respectively.
It is well known that current sources at nearby vertices with opposite dipole orientations cancel each other, resulting in nearly zero MEG signals. Consequently, it is assumed that the more vertices that are activated coherently, the more difficult source reconstruction is. To characterize such a difficulty of the source reconstruction, we computed the weighted cancellation index (wCI), which is an extended version of a previously proposed cancellation index ( Ahlfors et al., 2010 ) : where is an element of a lead field matrix and is a current at source . For the simulated current time series, we calculated the wCI at each time point and averaged them over time. The wCI of the simulated data obtained from the thresholding of common truth with p < 0.001, 0.01, and 0.05 was 0.77, 0.88, and 0.90 for the visual task and 0.40, 0.71, and 0.83 for the motor task, respectively.
To summarize the above, we generated MEG data by varying three SNR levels and three wCI levels; thus, in total, there were nine simulation settings for each task.

Performance evaluation
Inspired by Owen et al. (2012) , we evaluated the performance of source reconstruction using the aggregate score of source-level spatial and temporal correlations, defined as Aggr egate Scor e = Spat ial Correlat ion + Tempor al Corr elation 2 .
Since it is the average of Pearson's correlation coefficients, it takes a value from − 1 to 1. Here, the correlation coefficient of spatial maps was calculated between the simulated current map and the reconstructed one for each time point, and they were averaged along the time axis: where is the number of time points (250 for all simulations) and | true | and | recon | are the simulated current map and the reconstructed current map at time , respectively. On the other hand, to calculate temporal correlation, we matched each true source to the reconstructed source that has maximal Pearson's correlation within 10 mm. Here, we consider only the true sources that have a non-zero activity. Then we take the mean over these maximal correlations. Details of computing temporal correlation are described in Appendix B .

Current source reconstruction methods for comparison
In the simulation experiments, we compared the source currents estimated by hVB with different priors and also other source imaging methods. Estimation using hVB was conducted with the meta-analysis priors (synthesized with the terms "visual " and "motor "), the simulated individual fMRI priors (with CNR = 0.5, 1, 3, 5, 10), the common truth prior, and without a prior.
When we used the meta-analysis prior, we basically selected the relevant prior for the task ( "visual " prior for visual domain task and "motor " prior for motor domain task). However, we also used irrelevant priors for the tasks, namely "visual " prior for motor domain task and "motor " prior for visual domain task.
The individual fMRI priors obtained from different CNRs were considered. We also considered the common truth as the prior information of the ideal case. When there is no prior information, we can only use the uniform spatial pattern. This means we set the same values on all initial values of the hierarchical prior −1 0 in Eq. (8) . In this paper, we call this uniform spatial pattern as "uniform prior " of hVB estimation. If the uniform prior was used with a small prior weight, the estimated current variances tend to be sparse due to the effect of automatic relevance determination ( Neal, 1996 ).
We applied spatial smoothness filtering for all the above methods (see Appendix C ).

Experimental data
We analyzed the multi-subject, multi-modal neuroimaging dataset for face processing recorded by Wakeman and Henson (2015). It was the same dataset that was used to generate simulated data of the visual domain. This dataset contains the evoked responses of 16 subjects to three types of face stimuli: famous, unfamiliar, and scrambled. MEG, EEG, electro-oculograms, and electro-cardiograms were simultaneously recorded at 1100 Hz with an Elekta Neuromag Vectorview 306 system (Helsinki). T1 images and fMRIs were also collected with a Siemens 3T TIM TRIO (Siemens, Erlangen, Germany). These data are stored in the Brain Imaging Data Structure (BIDS) format ( http://bids.neuroimaging.io/ ). We preprocessed MEG data in the same way as done previously ( Takeda et al., 2019 ). Schematic descriptions were visualized in Supplementary Fig. 3. As a result, the face (famous and unfamiliar) and the scrambled conditions were compiled.
We constructed a t-map by contrasting all stimulus conditions (face, scrambled) against the baseline using SPM8. The t-map was computed for each subject and imported as individual fMRI priors ( Fig. 4 ). On the other hand, the meta-analysis prior was synthesized using Neurosynth with the term "visual, " and it was common for all subjects ( Fig. 1 ). We carried out source reconstruction using two types of priors, an individual fMRI prior and the meta-analysis prior, for each subject. The prior weight parameter was set to 0.3 for both priors.
Using all the subjects' source currents, we conducted a group analysis and examined the statistical differences of the current amplitudes between the face and scrambled conditions. The method of group analysis followed our previous work ( Takeda et al., 2019 ). Briefly, we calculated the stimulus-triggered average of the estimated source currents for each subject and condition. Then for each source and time, we compared the 16 subjects' current amplitudes between the face and scrambled conditions using a paired t -test. Finally, the multiple comparison problem was solved using Storey and Tibshirani (2003) 's method (false discovery rates were controlled at 0.05). All of the procedures were computed separately for results with an individual fMRI prior or the meta-analysis prior. The graphical description of whole group analysis procedure was visualized in Supplementary Fig. 3.

Results for simulated data
We first studied the effects of the prior weight parameter for both task domains (visual and motor). We then evaluated how effective the meta-analysis prior was compared to the individual fMRI priors using the appropriate prior weight.

Effects of prior weight parameter
We first studied the effects of the prior weight parameter on reconstructed sources. We repeatedly conducted current source reconstruction using the hVB method by varying the prior weight from 0 to 0.99999999 (denoted as 1). The full results with all simulation settings are reported in Fig. 5 for the visual task and Fig. 6 for the motor task. For comparison, we also plot the results with the worst (CNR = 0.5) and   5. Scores for the visual task with various prior weight parameters. Results of the hVB estimation with the worst and the best individual fMRI priors (CNR = 0.5 and 10), the common truth prior, and the meta-analysis priors (synthesized using terms "visual " and "motor ") are shown. Vertical axes indicate the aggregate score (upper is better), and horizontal axes indicate the value of the prior weight parameter. The error bar represents standard error of the mean over ten different Monte Carlo repetitions for the data generation. We display results of all simulation settings (3 SNR by 3 wCI (denoted as thresholds of the common truth)). the best (CNR = 10) individual fMRI priors. In addition, the results using the common truth prior are reported.
For both tasks, the scores of the meta-analysis fMRI prior were very consistently located between the simulated individual fMRI priors when the prior weight and the term ( "visual " or "motor ") were properly selected. For example, for the results of p < 0.05 in the visual task (top 3 panels of Fig. 5 ), when the meta-analysis prior "visual " was used with a prior weight 0.3, the aggregate scores were consistently higher than the results of CNR0.5 and lower or comparable to CNR10. We can find similar results for all conditions when using the relevant prior ( "visual " for the visual and "motor " for the motor task) and the optimal prior weight settings. The appropriate range of the prior weight tends to be high when the cancellation is high ( p < 0.05, with wCI of 0.90 for the visual task and 0.83 for the motor task). Conversely, when the cancellation is low ( p < 0.001, wCI of 0.77 for the visual task and 0.40 for the motor task), the appropriate range of the prior weight is low as well.
These results indicate that estimations require strong prior information when the cancellation is severe. This is because strong cancellation induces very few observation signals.

Comparison between meta-analysis prior and other priors
We conducted a close comparison between the meta-analysis prior and the individual fMRI prior with the properly fixed prior weight parameters. We focused on one simulation setting (characterized by SNR and wCI), which had a level of difficulty similar to the experimental data (face recognition task data ( Wakeman and Henson, 2015 )). Since we do Fig. 6. Scores for the motor task with various prior weight parameters. Results of the hVB estimation with the worst and the best individual fMRI priors (CNR = 0.5 and 10), the common truth prior, and the meta-analysis priors (synthesized using the terms "visual " and "motor ") are shown. Vertical axes indicate the aggregate score (upper is better) and horizontal axes indicate the value of the prior weight parameter. The error bar represents standard error of the mean over ten different Monte Carlo repetitions for the data generation. We display results of all simulation settings (3 SNR by 3 wCI (denoted as thresholds of the common truth)). not know the actual values of the SNR ( Eq. (15) ) and the wCI ( Eq. (16) ) for the experimental data, we approximated those values as follows. A variance of signals during the task 2 and a variance of noises 2 were calculated as the variance of trial-averaged signals and the variance of signals before task onset, respectively. The wCI was calculated using the t-map of task fMRI data of face -baseline contrast. Hence, was substituted with the t-value. The calculated SNR was 4.88 [dB], and wCI was 0.95. Therefore, we selected the simulation setting with SNR = 5 [dB] and the threshold of the common truth as p < 0.05, corresponding to wCI = 0.90 (face image recognition task) and 0.83 (left-foot movement task).
The results are presented in Fig. 7 . As a control analysis, we also compared the results of the meta-analysis prior with those of the uniform prior and the benchmark methods, wMNE and Champagne. The MNE and Champagne methods used their default hyperparameters, although the hVB methods used the best prior weight parameter found through investigation using a grid search. In this search, hVB was conducted using all prior weight values as Figs. 5 and 6 , and then assessed using the aggregate score to select the best prior weight. Reconstructed current maps of the visual domain task are also shown in Fig. 8 . We can see that the sparser the estimated source, the higher the amplitude is recovered. For the visual domain task, the result of the relevant meta-analysis prior ( "visual ") shows a better score than that for the individual fMRI prior with CNR = 1. For the motor domain task, the result of the relevant meta-analysis prior ( "motor ") shows a better score than that for the individual fMRI prior with CNR = 0.5. On the other hand, such an individual fMRI prior with low-CNR was better than the benchmark meth-ods. However, if we use irrelevant meta-analysis data, the performance is degraded. Therefore, meta-analysis fMRI data synthesized using the relevant term has comparable efficiency with the individual fMRI prior with CNR 0.5-1.
We also reported results assessed using spatial correlation (Supplementary Fig. 4 ) and temporal correlation ( Supplementary Fig. 5 ) for the same simulation setting as that in Fig. 7 . Although the results assessed using temporal correlation show little difference between priors, spatial correlation shows clearer differences. Hence, the differences in prior data mainly affect the estimated spatial maps rather than the time series.
To show the results under various conditions, the aggregate scores for all nine simulation settings for the visual and motor tasks are shown in Supplementary Figs. 6 and 7, respectively. For both visual and motor tasks, the hVB results using the appropriate meta-analysis fMRI prior were comparable or better than the scores of low-CNR fMRI priors. Counterintuitively, when p < 0.001 for the motor task, the results of the meta-analysis fMRI data of "visual " are comparable to those of "motor. " This is due to the fact that the results of the fMRI group analysis used to generate the simulated data had activation on vertices that can be covered by both "visual " and "motor " meta-analysis data when p < 0.001.
Additionally, the results of the Champagne method show a comparable score to hVB using individual fMRI data when the simulated current map has a sparse setting ( p < 0.001). In particular, this method shows consistent scores regardless of SNR. These characteristics of the Champagne method seem to be caused by the effect of sparse estima- Fig. 7. Comparison of aggregate scores with various methods for the visual task (left) and the motor task (right). The results of the hVB estimation with the uniform prior, the individual fMRI priors (CNR = 0.5, 1, 3, 5, 10), the common truth prior, and the meta-analysis priors (synthesized using the terms "visual " and "motor ") are reported. These priors are used with the best prior weight parameter denoted in parentheses. Vertical axes indicate the aggregate score (upper is better). The error bar represents standard error of the mean over ten different Monte Carlo repetitions for the data generation. We display results of a realistic simulation setting (p < 0.05, SNR5) determined using experimental data. The results of wMNE and Champagne are also shown as a control.
tion ( Wipf et al., 2010 ). These results indicate that the fMRI data are particularly helpful when sources are densely activated.
Moreover, we also conducted current source reconstruction with the meta-analysis fMRI prior synthesized using more detailed terms such as "face " and "object recognition " for the visual domain task and "hand " and "foot " for the motor domain task. Results are shown in Supplementary Figs. 8 and 9. For comparison, the scores of "visual " and "motor " were plotted (the same as in Fig. 7 ). In these cases, "face " and "foot " seem to be suitable for the visual (face image recognition) and the motor (left-foot movement) tasks, respectively. Actually, only the results with the "foot " term improved the performance, whereas the result of the "face " term was comparable to "visual. "

Results for experimental data
Since we do not know the ground-truth of current sources for real experimental data, we evaluated the reproducibility of the statistical maps obtained from group analyses. We compared the individual fMRI prior with the meta-analysis fMRI prior ( "visual "). The prior weight parameters for the individual fMRI prior and the meta-analysis fMRI prior were each set to 0.3, following our previous work ( Takeda et al., 2019 ) and the above simulation results (according to Fig. 7 , which is the simulated face image recognition task with the realistic setting, 0.3 was the best prior weight parameter).
First, we checked the reconstructed currents using the individual fMRI prior or the meta-analysis prior of one example subject (Supplementary Figs. 10 and 11). Next, we conducted the group analysis of reconstructed current sources by contrasting the face condition to the scrambled condition. Fig. 9 shows the number of sources exhibiting significant differences between the face and scrambled conditions along the time axis. Although the largest difference was observed at 0.17 s for both priors, the detected source of the meta-analysis prior was slightly smaller than the individual fMRI prior. The time series of detected sources were quite similar, although we found a slight difference at the second peak.
In Fig. 10 , we compared the significant t-value maps of both priors at 0.17 s, and they looked very similar to each other. They show significance at the right Fusiform Face Area (FFA). The significant differences at the right FFA are consistent with previous studies that reported that this area exhibits face-selective responses ( Grill-Spector et al., 2004;2017;Jas et al., 2018;Rossion et al., 2018;Wakeman and Henson, 2015 ). The statistical map obtained using the meta-analysis prior shows a significant difference on the left fusiform area and the left insula, which are not observed in that obtained from the individual fMRI prior.

Discussion
In the present study, we proposed the use of meta-analysis fMRI data for MEG current source reconstruction. The meta-analysis fMRI data consist of a statistical map synthesized from thousands of published fMRI studies. We used the meta-analysis results available from the Neurosynth open-source project ( Yarkoni et al., 2011 ) as the hierarchical prior distribution of the current variance in the hierarchical variational Bayesian estimation method ( Sato et al., 2004 ) because this approach offers an adaptive way to incorporate fMRI information based on MEG measurements. The proposed method was quantitatively evaluated using simulations of a visual task and a motor task. As a result, we discovered that source reconstruction performance using the meta-analysis prior was comparable to that using an individual fMRI prior with CNR = 1 and CNR = 0.5 for the visual and motor tasks, respectively. We also confirmed the fMRI prior outperformed the MNE, Champagne, and hVB estimations with a uniform prior (no fMRI information). In particular, when the simulated currents are spatially dense ( p < 0.05), or the SNR of MEG data is high, hVB using the low-CNR fMRI prior was better than the sparse estimations of the Champagne method. Using the experimental data of a real face recognition task ( Wakeman and Henson, 2015 ), we qualitatively confirmed that group analysis results obtained from the meta-analysis fMRI prior were similar to those obtained from individual fMRI priors.
Our simulations were based on multi-modal data generation, where both whole-brain fMRI and MEG time series data were generated from the common truth activation map. We defined the common truth activation map for a specific task, which was the group analysis of results obtained from real fMRI experimental data of five subjects during the task, representing the spatial pattern of task-specific neuronal activation. From the common truth map, whole-brain fMRI time series data were generated using the virtually designed event-related paradigm and resting-state fMRI time series as noise. We modified the amount of resting-state fMRI noise to obtain the individual fMRI prior maps of various signal quality, defined as CNR. MEG time series was generated by applying the lead field matrix to simulated current time series data with added noise. The current time series data for a specific task were generated by weighting the whole-brain current time series (simulated using the connectome dynamics model) with the spatial map of the task (common truth map). This procedure allows us to remove any biases of the cancellation effect among dipoles caused by the hand design. However, the simulated current time series seems to be more difficult than the empirical one when we do not adjust the wCI. This is because the connectome dynamics model was simulated with coarse parcellation (AAL Fig. 8. Reconstructed current maps of the visual domain task with various methods. The ground truth current map (top) and reconstructed maps are shown. To visualize the maps, amplitudes are averaged along the time axis. The prior weights are set to the same values as those in Fig. 7 . For all figures, activities over 10 % of their maximum value are displayed. atlas), and the time series were common within each ROI. Moreover, the coherence of simulated resting-state data tends to be higher than the stimulus evoked response. To make the difficulty realistic under such parcellation, we modulated wCI of the generated data by thresholding the common truth with p < 0.001, 0.01, or 0.05. We also modulated observation noises of MEG data with SNR = 0, 5, or 10 [dB] by varying the number of trials. Then, simulations were conducted with nine combinations of settings (3 SNR by 3 wCI).
Inspired by Owen et al. (2012) , we assessed the results of source reconstruction using the aggregate score ( Eq. (17) ). This metric averaged the sum of spatial and temporal correlation coefficients between simulated and reconstructed currents. By comparing the evaluation using the aggregate score ( Fig. 7 ), spatial correlation ( Supplementary  Fig. 4 ), and temporal correlation ( Supplementary Fig. 5 ), we confirmed that the differences in prior data mainly affect the estimated spatial maps rather than the temporal time series. This is natural because we Fig. 9. Differences in current amplitudes between face and scrambled conditions along the time axis. The number of sources exhibiting significant differences (q < 0.05) are shown for both individual fMRI and meta-analysis fMRI priors. Both of them exhibit the maximum number of significant sources at 0.17 sec. incorporated the spatial prior data in the covariance of the current map.
From the simulations of the visual and motor tasks, we confirmed that the current source reconstruction performance of hVB using the meta-analysis fMRI prior was better than that using the low-CNR individual fMRI prior. For the visual domain task (left of Fig. 7 ), the result of the relevant meta-analysis prior shows a better score than the individual fMRI prior with CNR = 1. For the motor domain task (right of Fig. 7 ), the result of the relevant meta-analysis prior shows a better score than for the individual fMRI prior with CNR = 0.5. Referring to Supplementary Figs. 1 and 2, CNR1 and CNR0.5 of the p < 0.05 condition corresponds to PSC0.7 and PSC0.6 at the maximum activated voxel, respectively. Note that these low PSC values could be observed as the worst-case in a real-task fMRI study (e.g., Drobyshevsky et al., 2006 , reported similar PSCs for the cognitive and emotional tasks). We also confirmed that the hVB method using the meta-analysis fMRI prior was superior to the control methods without fMRI priors, such as the hVB method using the uniform prior, wMNE, and Champagne. Especially, by comparing with the Champagne method, we confirmed that the meta-analysis prior for the hVB method was helpful when the source spatial density was high ( p < 0.05). Moreover, the meta-analysis fMRI data synthesized using an irrelevant term, such as "motor " for the visual task or "visual " for the motor task, resulted in degraded performance. Therefore, we concluded that it is worth using the meta-analysis fMRI prior synthesized using relevant terms when the individual fMRI data are missing or of low quality. This also means that the results equivalent to the use of a low-CNR fMRI prior can be obtained without fMRI measurement, and in that case, the reduced cost could be used to enhance MEG measurement. It should be noted that we used the default hyperparameters for the control methods (MNE and Champagne) while the optimized parameters for the hVB methods. Thus, the performance comparison may not be fair. Although their performance may be improved using parameter fine-tuning ( Bertrand et al., 2019;Cai et al., 2021 ), it is beyond the scope of this study.
The prior weight parameter is the most important parameter for hVB estimation, which affects the quality of current source reconstruction. Therefore, we studied the proper range of the prior weight parameter using simulated data by the grid-search strategy. For both visual and motor tasks ( Fig. 5 and Fig. 6 ), the optimal prior weight tends to be high when the cancellation is high ( p < 0.05) regardless of the type of fMRI priors, while the optimal prior weight is low when the cancellation is low ( p < 0.001). These results indicate that the current source with a higher cancellation setting requires stronger prior information because strong cancellation induces fewer MEG observations.
Using the experimental data of a real face-recognition task, we qualitatively confirmed that group analysis results of the meta-analysis fMRI prior have a similar tendency with the results of an individual fMRI prior. We set the prior weight parameter for the meta-analysis fMRI prior to 0.3 based on the simulation experiment of the visual task and set the parameter for the individual fMRI prior to 0.3 according to the previous work ( Takeda et al., 2019 ). For the significance time series of both priors, the largest difference was observed at the same time point, although there were small differences ( Fig. 9 ). In Fig. 10 , we compare the significant t-value maps of both priors at that time. Both of them show significance at the right FFA, although they were derived using different types of priors. The significance at the right FFA is consistent with previous studies reporting that this area exhibits face-selective responses ( Grill-Spector et al., 2004;2017;Jas et al., 2018;Rossion et al., 2018;Wakeman and Henson, 2015 ). In these figures, we can observe activities on the insula as well. We hypothesized that they occurred due to signal leakage ( Brookes et al., 2012;Colclough et al., 2015;Palva et al., 2018;. To verify this, we calculated the resolution kernel for both priors ( Sekihara et al., 2005 ). As a result of the analysis, the activities on the insula could be observed due to the signal leakage from the ventral occipitotemporal cortex ( Supplementary Fig. 12). These results corresponded to our previous study ( Takeda et al., 2019 ). Such similarities in the results between individual fMRI and meta-analysis fMRI suggest the possibility of substituting a meta-analysis prior for the individual prior in a neuroscience study.
It is difficult to subjectively select the term used to synthesize the meta-analysis fMRI data. This came from the results with detailed terms such as "face " and "object recognition " for the visual domain task and "hand " and "foot " for the motor domain task (Supplementary Figs. 8  and 9). The meta-analysis prior synthesized using the term "foot " significantly improved the score (comparable to the high-CNR fMRI priors) for the left-foot movement task. However, against our intuition, the result with the term "face " was not improved, and "object recognition " was lower than the result with the term "visual. " These results might be due to the following reasons. In most visual tasks, the stimulus-induced activation occurs from the primary visual cortex, and it is then transferred to the higher visual cortex. Hence, the entire visual cortex is activated. This situation is favorable for the meta-analysis prior synthesized using the term from the highest layer of the ontology. Therefore, the term "face " was not a better selection than the term "visual. " On the other hand, the left-foot movement task with the p > 0.05 setting mainly activated left-foot-related areas. Therefore, the term from the lower layer of the ontology was suitable for this situation. In the results, only "foot " was a good prior, whereas both "hand " and "foot " were selected from the lower layer. This seems natural because the meta-analysis with the term "hand " is not directly related to the left-foot-related areas. Although these obvious results indicate that the meta-analysis data synthesized using detailed terms might improve the reconstruction, they also imply the difficulty of making the appropriate data selection for the metaanalysis.
In the neuroscience literature, free energy is one of the popularly used statistical criteria for model selection Fukushima et al., 2015;Wipf and Nagarajan, 2009 ). In particular, free energy is used to select prior information of source reconstruction in the multiple sparse priors method ( Friston et al., 2008;Henson et al., 2010 ). Inspired by these studies, we attempted to select hyperparameters (prior weight and meta-analysis fMRI data) for hVB estimation using free energy. First, we studied whether the prior weight parameter could be selected using free energy. The effects of the prior weight on free energy and the aggregate score are reported in Supplementary Fig. 13. As the results indicate, free energies were negatively correlated with prior weights, but they were not related to aggregated scores. This tendency was shown for other prior data as well. We speculated that this is because when the prior weight is low, the estimated currents can be fitted to the observation data independent of the prior data and thus free energy also has a high value. Thus, we concluded that prior weight cannot be selected based on free energy. Next, we considered the prior data selection when the prior weight was fixed to 0.3 for all priors. For both visual and motor tasks, we showed the results with SNR0 and 10, respectively ( Supplementary Figs. 14 and 15). These figures indicate strong relationships between free energies and aggregated scores regardless of SNR. These results support the potential of selecting the prior data using free energy. Therefore, we also plotted free energies with the prior weight of 0.3 for the experimental data ( Supplementary Fig. 16). The figure shows that the free energy of the meta-analysis prior of the "motor " term is higher than the subject's own fMRI data and the meta-analysis data of "visual, " although this is a face recognition task. However, when "motor " was used as a prior in the source reconstruction, little activity was observed in the visual processing area. In addition, the results of the group analysis showed almost no significant difference between conditions. Therefore, the results of the "motor " prior supported by free energy were inappropriate. Compared to the results of simulated data, this does not support the prior data selection using free energy. Note that we applied the hVB algorithm to all single-trial data rather than trialaveraged data because our previous study showed higher intra-subject reproducibility in the former setting. However, when we applied the trial-averaged data and checked the applicability of free energy for the prior data selection, we obtained plausible results similar to the results of simulated data ( Supplementary Fig. 17). This is probably because the SNR of the single-trial data of real data was much worse than the SNR0 of the simulation. To confirm the applicability of prior data selection based on free energy, further investigation into the impact of a singletrial setting on free energy computation is required. Furthermore, since we need to repeatedly run the hVB algorithm for all candidate prior data to compute free energy, the computational cost will be significantly increased in proportion to the number of candidates.
Care must be taken when using the meta-analysis fMRI prior for scientific findings. Use of prior information biases the results of current source reconstruction. Since there are numerous combinations of the meta-analysis prior and the prior weight value, the choice depends on the data analyst, and results may be chosen after trials. To mitigate such results selection, we need a guideline on how to use the method and report the results. For now, we recommend that meta-analysis fMRI data be synthesized using relevant and general (conservative) terms such as "visual " for visual tasks or "motor " for motor tasks. This comes from the above discussion on term selection. However, there is no solution to decide the prior weight value systematically, although it has been shown to have a significant impact on the results as well ( Figs. 5 and 6 ). Therefore, trial-and-error efforts are indispensable to tune the prior weight. However, this recommendation might be updated by the progress made in free-energy-based prior data selection. For example, it may be possible to select the term based on the free energy using pre-fixed prior weight value. On the other hand, this method is ready for applications such as brain-machine interface (BMI), where the direct goal is to im-prove decoding accuracy. In this case, we believe there is no problem in selecting a term that will improve the decoding accuracy as long as the generalization performance is evaluated.
In this study, we used the results of meta-analysis from Neurosynth. However, many useful and practical platforms for fMRI meta-analysis have also been proposed and are promising as the prior information of source reconstruction. The BrainMap group supplies the ICA maps of large-scale meta-analysis fMRI data ( https://www.brainmap.org/icns/ ). These are well-suited to our present recommendation of selecting a conservative prior because ICA maps are already classified to a few intrinsic connectivities ( Laird et al., 2011;Smith et al., 2009 ). Another option is NeuroQuery ( https://neuroquery.org ). Because of its ability to predict brain activity from a short sentence ( Dockès et al., 2020 ), it seems to be suitable for applications such as decoding newly designed tasks. While this contradicts our recommendation to generate a conservative prior, it is still attractive for applications such as BMI.
In terms of reproducibility, meta-analysis fMRI is promising. In recent years, the study of task fMRI has been a problem because of its poor reproducibility ( Elliott et al., 2020 ). When individual fMRI data are used as the prior for source reconstruction, the reconstructed maps are accumulated with uncertainty included in the fMRI data. On the other hand, although the meta-analysis prior does not take into account individual variability, it is not affected by the reproducibility issue of individual data. Therefore, it is not possible to conclude which is better, but in terms of reproducibility, there is potential in using meta-analysis data as prior information. This is supported by the fact that the results of the two group analyses were very similar ( Fig. 10 ).
Our approach is also easily applicable to the source reconstruction problems of EEG. As demonstrated previously ( Takeda et al., 2019 ), hVB estimation for EEG data is supported in VBMEG software. We expect the low measurement cost of EEG to make it suitable for combination with meta-analysis fMRI data.
The use of the meta-analysis fMRI prior would be a significant step in the development of MEG studies. This is because the advent of OPM has shown remarkable results in reducing the cost of MEG measurements and expanding the measurement targets ( Boto et al., 2018;Hill et al., 2019;Lin et al., 2019;Tierney et al., 2020 ). However, the number of sensors in the current OPM devices is limited to dozens due to physical constraints ( Hill et al., 2020 ). Therefore, it would be important to gain the ability to add prior information that is comparable to a low-CNR individual fMRI prior without the cost of measurement.

Declaration of Competing Interest
None.

Acknowledgments
This work was supported by contracts with the National Institute of Information and Communications Technology (NICT) entitled "Development of network dynamics modeling methods for human brain data simulation systems " (173) and "Analysis of multi-modal brain measurement data and development of its application for BMI open innovation " (209). This work was also supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Number JP20H00600. K.S. was supported by JSPS KAKENHI Grant Number 19J14715, a Grant-in-Aid for JSPS Fellows. We thank Mitsuo Kawato, Ph.D., Kazushi Ikeda, Ph.D., Yusuke Takeda, Ph.D., Ayumu Yamashita, Ph.D., and Yu Takagi, Ph.D. for helpful suggestions that contributed to improve the manuscript.