Inter-Subject Variability of Skull Conductivity and Thickness in Calibrated Realistic Head Models

Skull conductivity has a substantial influence on EEG and combined EEG and MEG source analysis as well as on optimized transcranial electric stimulation. To overcome the use of standard literature values, we propose a non-invasive two-level calibration procedure to estimate skull conductivity individually in a group study with twenty healthy adults. Our procedure requires only an additional run of combined somatosensory evoked potential and field data, which can be easily integrated in EEG/MEG experiments. The calibration procedure uses the P20/N20 topographies and subject-specific realistic head models from MRI. We investigate the inter-subject variability of skull conductivity and relate it to skull thickness, age and gender of the subjects, to the individual scalp P20/N20 surface distance between the P20 potential peak and the N20 potential trough as well as to the individual source depth of the P20/N20 source. We found a considerable inter-subject variability for (calibrated) skull conductivity (8.44 ± 4.84 mS/m) and skull thickness (5.97 ± 1.19 mm) with a statistically significant correlation between them (rho = 0.52). Age showed a statistically significant negative correlation with skull conductivity (rho = -0.5). Furthermore, P20/N20 surface distance and source depth showed large inter-subject variability of 12.08 ± 3.21 cm and 15.45 ± 4.54 mm, respectively, but there was no significant correlation between them. We also found no significant differences among gender subgroups for the investigated measures. It is thus important to take the inter-subject variability of skull conductivity and thickness into account by means of using subject-specific calibrated realistic head modeling.

Various measurement approaches such as electrical impedance tomography (EIT) ( Abascal et al., 2008 ;Fernández-Corazza et al., 2018 ;Gonçalves, Munck, Verbunt, Heethaar, & Lopes da Silva, 2003 ;Vauhkonen, Kaipio, Somersalo, & Karjalainen, 1997 ), magnetic resonance EIT (MREIT) ( Gao, Zhu, & He, 2006 ), magneto acoustic tomography (MAT) ( Li, Yu, & He, 2016 ), and directly applied current (DAC) ( Akhtari et al., 2002 ;Hoekema et al., 2003 ) have already been studied to determine skull conductivity. These approaches require further specialized equipment and/or expertise (EIT, MREIT and MAT) and/or are invasive (DAC). Here, we propose to use the combined modalities EEG and MEG with magnetic resonance imaging (MRI) for the individual estimation of skull conductivity in healthy human subjects with the aim of investigating its inter-subject variability. The used modalities combined EEG/MEG and MRI are available in an MEG-laboratory and the proposed procedure can easily be applied in neuroscientific research on healthy human subjects and/or patients due to its non-invasiveness. Furthermore, MEG is an emerging technology and broader use of it might be pushed by the newest generation of optically pumped magnetometer (OPM) MEG devices ( Boto et al., 2018 ;Labyt et al., 2019 ).
Estimating skull conductivity with combined EEG/MEG and MRI has already been proposed in studies with three-compartment head models ( Baysal & Haueisen, 2004 ;Fuchs et al., 1998 ;Gonçalves et al., 2003 ;Huang et al., 2007 ) and in first case studies with more realistic head models ( Aydin et al., 2014 ;Wolters, Lew, MacLeod, & Hämäläinen, 2010 ). In these studies, a so-called 'bulk' skull conductivity parameter ( Akhtari et al., 2002 ;Papageorgakis, 2017 ) was estimated in a calibration procedure that included source analysis of somatosensory evoked potential (SEP) and field (SEF) data at 20 ms post-stimulus, the P20/N20 component. It was shown that this component has a focal and dipolar origin with mainly tangential orientation with respect to the inner skull surface ( Allison, McCarthy, Wood, & Jones, 1991 ;Hari et al., 1993 ;Nakamura et al., 1998 ). Although other studies proposed to estimate skull conductivity based on only SEP data ( Lew, Wolters, Anwander, Makeig, & MacLeod, 2009 ;Vallaghé & Clerc, 2009 ), additional SEF data stabilize the estimation ( Wolters et al., 2010 ). This is due to the complementarity of EEG and MEG data ( Dassios, Fokas, & Hadjiloizi, 2007 ;Huang et al., 2007 ) and the insensitivity of MEG localizations to skull conductivity ( Fig. 8 in Brette & Destexhe, 2012 ;Haueisen, Ramon, Eiselt, Brauer, & Nowak, 1997 ;Lew, Sliva, & Choe, 2013 ). Gonçalves et al. (2003) found a strong agreement between the results of an EIT procedure and the SEP/SEF method, even though they are quite different in both theoretical and technical terms, which indicates the stability of the SEP/SEF based calibration. EIT uses Ohm's law between the measured voltages and the currents injected via scalp electrodes ( Fernández-Corazza et al., 2018 ). Both methods function under in vivo conditions and in low frequency ranges ( McCann et al., 2019 ).
A common feature of the aforementioned methods is that they are based on an accurate and realistic head model with individual compartments ( McCann et al., 2019 ;Vorwerk et al., 2014 ). In this regard, most of the above SEP/SEF calibration studies were based on the modeling of homogenized compartments such as single-layer skull or brain. More realistic subject-specific head volume conductor modeling based on MRI is increasingly becoming a standard procedure due to the now available automatized and open source segmentation ap-proaches ( Antonakakis et al., 2019 ;Huang, Datta, Bikson, & Parra, 2019 ;Nielsen et al., 2018 ). In this work, we investigate the inter-subject variability of skull conductivity in a group of twenty subjects using an SEP/SEF calibration procedure for the first time. This procedure is based on six-compartment anisotropic (6CA: skin, skull compacta, skull spongiosa, cerebrospinal fluid (CSF), gray matter and anisotropic white matter) head modeling.
As shown in ( Tang et al., 2008 ) using excised skull samples from patients undergoing surgery, skull conductivity is strongly and positively correlated to skull thickness. It is therefore important that the head model also represents skull thickness accordingly. While accurate skull thickness can best be determined using CT ( Lillie, Urban, Lynch, Weaver, & Stitzel, 2016 ), the non-invasive procedure proposed here is based on MRI. Using T1-weighted (T1w) MRI, Gorbenko, Miko ł ajczyk, Jasionowska, Narloch, and Ka ł u ż y ń ski (2020) accurately segmented soft tissues, but with only 67% specificity and 83% sensitivity, the results concerning the skull were rather limited. This is because the low contrast between CSF and cranial tissue makes it difficult to estimate the inner skull boundary from T1w MRI alone. Therefore, in the present study, we use T1w and T2w scans for an improved MRI-based estimation of skull thickness , which is our second measure used throughout this study.
The use of T1w and T2w scans additionally allows the segmentation of skull compacta and spongiosa to model the skull's three-layeredness (spongiosa located between inner and outer compacta) ( Akhtari et al., 2002 ;Montes-Restrepo et al., 2014 ). Thus, our procedure distinguishes skull compacta and spongiosa tissues and uses a fixed conductivity ratio for the compacta:spongiosa conductivity values as suggested by ( Dannhauer, Lanfer, Wolters, & Knösche, 2011 ) based on the measurements of ( Akhtari et al., 2002 ). The use of a fixed conductivity ratio reduces the degrees of freedom and helps to avoid overfitting in our skull conductivity calibration procedure ( Lew et al., 2009 ;Wolters et al., 2010 ). We therefore refine the definition of the term 'bulk skull conductivity' introduced above so that from now on it is the calibrated value for skull compacta in combination with the fixed conductivity ratio to the spongiosa.
It has also been assumed that skull conductivity may vary due to demographic factors such as age ( Hoekema et al., 2003 ;Horesh, 2006 ;McCann et al., 2019 ;Wendel, Väisänen, Seemann, Hyttinen, & Malmivuo, 2010 ), especially over infancy ( Azizollahi, Darbas, M. Diallo, El Badia, & Lohrengel, 2018 ). Using direct measurements of the (homogeneous) skull conductivity of skull pieces, temporarily removed during epilepsy surgery, Hoekema et al. (2003) observed a weak negative correlation over an adult age group of five patients. Such a negative correlation was furthermore supported by ( Arumugam et al., 2017 ) based on EIT measurements in ten participants. On the other hand, in CT studies, it was reported that skull thickness in adults varies substantially among individuals and is independent of age ( De Boer, van der Merwe, & Soerdjbalie-Maikoe, 2016 ; Lillie et al., 2016 ) but dependent on subject's gender ( Pellegrini, Zoghi, & Jaberzadeh, 2018 ). However, it is not yet clear whether gender can also influence skull conductivity and its inter-individual variation in an adult age group. In this study, our non-invasive approach is used to investigate and evaluate these aspects.
The influence of inter-and intra-individual skull conductivity variations has already been highlighted in earlier studies ( Aydin et al., 2014 ;Dabek et al., 2016 ;Dannhauer et al., 2011 ;Fernández-Corazza et al., 2018 ;Haueisen et al., 1997 ;Montes-Restrepo et al., 2014 ;Oostendorp, Delbeke, & Stegeman, 2000 ). It is clear from these studies that, for a fixed dipolar source in the brain, differences in skull conductivity result in differences in the surface distance between the two poles, the potential peak and trough, of the corresponding EEG forward solution potential topography. The latter means that lower skull conductivity leads to a higher pole distance ( Nunez & Srinivasan, 2006 ;Pernier, Perrin, & Bertrand, 1988 ). However, in practical P20/N20 measurements, the measure of the scalp surface distance between the peak and trough well depends on the EEG recording quality, the number of available electrodes and the accuracy in the assumption that the underlying source is a single focal dipolar source. Therefore, we propose here to use the P20/N20 surface distance as our third measure investigated in this study. This measure is the distance between the surface potential poles of the simulated EEG forward solution of the best fitting single dipolar source, reconstructed in a calibrated 6CA head model from the combined SEP and SEF measurement data 20 ms post-stimulus (P20/N20 peak). We will show that this forward simulated potential topography leads to an accurate and stable measure of the P20/N20 surface distance, and that this distance is an important additional measure in the evaluation of inter-subject variability of skull conductivity, being in the focus of this study.
Finally, our fourth investigated measure refers to the source depth . As in the EEG source analysis studies by ( Haueisen et al., 1997 ;Vorwerk et al., 2019 ), the source depth is defined here as the distance between source location and the nearest point on the inner skull surface; the larger this distance, the deeper the source is considered to be. Since they are closely related to each other ( Vorwerk et al., 2019 ), we will evaluate this additional measure to gain further insights into the inter-subject variability of skull conductivity.
In summary, we investigate non-invasively the inter-subject variability of the calibrated bulk skull conductivity using 6CA head models, i.e., distinguishing also between spongiosa and compacta compartments, in a group-study of twenty adult subjects. First, we define the four proposed measures individually, i.e., skull conductivity and thickness, P20/N20 surface distance, as well as source depth. In a second step, we compare these measures with the participants' age and gender to statistically evaluate the most relevant factors of variability between the subjects. To the best of our knowledge, this is the first study that uses combined EEG/MEG source analysis with subject-specific realistic head volume conductor models, with the aim to investigate the inter-subject variability of skull conductivity in a large number of participants.

Participants and Ethics Statement
Twenty right-handed adult subjects, ten males and ten females, in the age range of 18 to 53 years (mean and standard deviation 34.1 ± 10.88) participated in this study. All subjects gave their written informed consent forms and all measurements have been approved by the ethics committee of the University of Erlangen, Faculty of Medicine on 20. 02.2018 (Ref No 4453 B).

Experiment and EEG/MEG/MRI Acquisition
Somatosensory evoked responses (SEP and SEF) were simultaneously acquired in a magnetically shielded room. 80 AgCl sintered EEG ring electrodes (EASYCAP GmbH, Herrsching, Germany, 74 channel EEG plus additional 6 channels EOG to detect eye movements), one additional Electrocardiography (ECG) electrode, and a whole head MEG with 275 axial gradiometers and 29 reference coils (OMEGA2005, VSM MedTech Ltd., Canada). The MEG reference coils were used to calculate third order synthetic gradiometers for the reduction of interference caused by magnetic fields originating from distant locations. Before the measurements, the electrode positions of the EEG cap were digitized using a Polhemus device (FASTRAK, Polhemus Incorporated, Colchester, Vermont, U.S.A.). Moreover, during the acquisition, the head position inside the MEG was tracked via three head localization coils placed on nasion and left/right preauricular points.
SEP and SEF activity was elicited by stimulating the median nerve at the wrist of the right arm with monophasic square-wave electrical pulses of 0.5 ms width. The stimulus strength was increased until a clear movement of the right thumb was visible. The stimulus duration was 200 ms and a random stimulus onset asynchrony between 350 and 450 ms was applied to avoid habituation and to obtain a clear pre-stimulus interval.
The duration of the experiment was 10 minutes for a measurement of 1200 trials and data was acquired with a sampling rate of 1200 Hz and online low pass filtered at 300 Hz.
A 3T scanner (MAGNETOM Prisma 3.0 T, Release D13, Siemens Medical Solutions, Erlangen, Germany) was used for MRI acquisition. A 3D-T1-weighted (T1w) fast gradient-echo pulse sequence using water selective excitation to avoid fat shift (TR/TE/TI/FA = 2300/3.51/ 1100 ms/8°, cubic voxels of 1 mm edge length) and a 3D-T2w turbo spin echo pulse sequence (TR/TE/FA = 3200/408 ms/90°, cubic voxels, 1 mm edge length) were measured. Diffusion tensor MRI (DTI) was acquired using an echo planar imaging sequence (TR/TE/FA = 9500/79 ms/90°, cubic voxels, 1.89 mm edge length). DTI included one scan with diffusion sensitivity b = 0 s/mm 2 (b 0 , flat diffusion gradient), an additional scan with flat diffusion gradient with reversed spatial encoding gradients for susceptibility artifact correction ( Ruthotto et al., 2012 ) and twenty scans with b = 1000s/mm 2 in different directions, equally distributed on a sphere. During MRI scanning, we placed gadolinium markers at the same positions as in combined EEG/MEG, i.e., nasion, left and right distal outer ear canal, for landmark-based registration of combined EEG/MEG to MRI. We perform all measurements in supine position to reduce head movements and to prevent CSF effects due to a brain shift when combining EEG/MEG and MRI ( Rice, Rorden, Little, & Parra, 2013 ).

Head Model Preparation
We created a realistic and individual six-compartment head volume conductor model for each subject. The head modeling procedure combines T1w and T2w MRIs for improved modeling of the skin, skull compacta (SC), skull spongiosa (SS), CSF, gray matter (GM) and anisotropic white matter (WM). The use of T2w MRI was important for the segmentation of the cancellous bone because there is no water-fat-shift that could otherwise cause mispositioning of the skull compacta. All algorithmic steps were developed in MATLAB (The MathWorks, Inc., Natick, Massachusetts, U.S.A.) using our own custom code and open-source toolboxes for integration into a quasi-automated pipeline.
The procedure starts with the rigid registration of the T1w and T2w MRIs using the FSL 1 software that is used through subroutines in the FieldTrip toolbox ( Oostenveld, Fries, Maris, & Schoffelen, 2011 ). The flirt routine of FSL is used with mutual information as a cost function for the registration procedure and spline interpolation for the reslicing (i.e., transformation of MRI slices) of the registered T2w MRI to the reference T1w MRI . Then, the segmentation of the skin, GM and WM compartments is performed using the T1w MRI, while the T2w MRI is used to segment SC, SS, and CSF. All segmentations are performed with SPM12 2 which is embedded in the FieldTrip toolbox. The spongiosa is calculated using a 2 mm eroded full skull compartment image to define a region-of-interest (ROI) and then gray-value thresholding is used within this ROI in the T2w MRI. After visual inspection in some cases, 1 mm dilation is performed in the eroded full skull image, if the resulting SS mask is not matching the corresponding cancellous bone represented in the registered T2w MRI. In addition, the brain mask, including CSF, GM and WM, is segmented from the combined T1w and T2w MRIs. These masks are important for the subsequent merging of the segmented tissues to construct the six-compartment model. During this merging procedure, further enhancements based on MATLAB image processing routines such as regionprops and imfill are applied to the segmented tissues to avoid overlap and reduce segmentation inaccuracies. In this procedure, regionprops is used to remove unrealistic areas created during the merging of the segmented tissues. We also apply imfill for the correction of artificial tiny holes still present in the initially segmented compartments skull and skin. Following the recommendation of (Lanfer Figure 1. The main ingredients for the skull conductivity calibration procedure. a) The individual high-resolution head model with six-compartments (skin (human light skin color); skull compacta (blue) and spongiosa (gray); CSF (red); gray matter (orange) and white matter (yellow)). b and c) The iso-potential and field lines of the bipolar scalp topographies for the somatosensory evoked potential and fields (SEP and SEF) at 20 ms post stimulus (the P20/N20 component) for one of the subjects. et al. , 2012), a cut 40 mm below the lower skull compartment is applied in order to avoid unnecessary FEM calculations (see Section 2.4.1).
DTI data was preprocessed to reduce eddy current artifacts and nonlinear susceptibility artifacts following ( Aydin et al., 2014 ), using FSL and the subroutine HySCO from the SPM12 toolbox ( Ruthotto et al., 2012 ). Diffusion tensors were calculated and transformed into WM conductivity tensors by an effective medium approach ( Aydin et al., 2014 ;Tuch, Wedeen, Dale, George, & Belliveau, 2001 ). These tensors were then included in the head model to account for WM tissue conductivity anisotropy.
For the labeled volumes, geometry-adapted hexahedral finite element (FE) head models were constructed using SimBio-VGRID 3 ( Wolters, Anwander, Berti, & Hartmann, 2007 ). The adaptation was performed with a node-shift of 0.33, ensuring that interior angles at element vertices were convex and the Jacobian determinant in the FE computations remained positive. This approach increases the conformance to the real geometry and mitigates the effects of the staircase-like segmented voxel meshes. As a result, subject-specific six-compartment anisotropic head models were built.
In the next step, a 2 mm resolution source space was constructed in the middle of the GM compartment without restriction to source orientations (no normal-constraint). This positioning of the source space ensured that all sources were located inside GM and sufficiently far away from the neighboring tissue compartments to fulfill the so-called Venant condition, i.e., for each source node, the closest FE node should only belong to elements labeled as GM. The fulfillment of this condition is important to avoid numerical problems and unrealistic source modeling for the chosen Venant dipole modeling approach . Fig. 1 a depicts one of the realistic head volume conductor models of this study exemplarily.

Preprocessing of EEG/MEG
The raw combined EEG and MEG recordings were first filtered from 20 to 250 Hz ( Buchner et al., 1994 ) using digital bandpass filtering in CURRY8 4 . A notch filter was used to reduce the effect of power line noise at 50 Hz and its harmonics. Then, the preprocessed recordings were separated into epochs with 100 ms pre-stimulus and 200 ms post-stimulus. After deselecting the bad channels visually, artifact reduction was performed in CURRY8 using a threshold-based determination of candidate bad trials in each modality, followed by visual inspection and exclusion of bad trials from the rest of the analysis. The SEP/SEF evoked responses were then determined by averaging across all the remaining 3 http://vgrid.simbio.de/ 4 https://compumedicsneuroscan.com/products/by-name/curry/ trials. Fig. 1 b and c depict exemplarily the artifact-corrected SEP and SEF scalp topographies of the P20/N20 component for one of the subjects.

Definition of Measures
In this section, we define the four measures that will be investigated with regard to their inter-subject variability, age and gender dependence, as follows: (i) skull conductivity (as well as the related measures of the calibration process) (ii) skull thickness (iii) P20/N20 surface distance and (iv) source depth.

Measures for Skull Conductivity Calibration
Skull conductivity is individually modeled by adapting a calibration procedure that benefits from the different sensitivity profiles of EEG and MEG (algorithm 2 in Aydin et al., 2014 ). While the procedure in ( Aydin et al., 2014 ) only uses a single resolution level, we refine it here by proposing two resolution levels, resulting into the following three steps procedure.
Step 1 (source localization): While individual skull conductivity has a considerable influence on the P20/N20 SEP source reconstruction, it hardly influences source analysis of the SEF component at 20 ms poststimulus ( Brette & Destexhe, 2012 ;Lew et al., 2013 ). Therefore, using a dipole scan approach ( Fuchs et al., 1998 ;Knösche, 1997 ) throughout the whole source space and a head model with the standard skull conductivity parameters 1.6 mS/m for SC and 5.76 mS/m for SS ( Akhtari et al., 2002 ), we exploit the SEF data to accurately localize the peak at 20 ms post-stimulus. The single dipole scan assumes that its generator is focal and single-dipolar ( Allison et al., 1991 ;Buchner et al., 1994 ;Hari et al., 1993 ;Nakamura et al., 1998 ). The main goal of the dipole scan is the determination of the source for which the residual variance (RV) between the measured and the simulated SEF data at 20 ms post-stimulus is minimal. Furthermore, the dipole scan is regularized accordingly to suppress the amplification of high-frequency spatial noise into erroneously high radial dipole orientation components within the inversion procedure ( Fuchs et al., 1998 ;Wolters, Beckmann, Rienäcker, & Buchner, 1999 ). This source location is then fixed as the outcome of step 1 and will no longer be modified by the next two steps of our calibration procedure.
Step 2 (coarse resolution calibration): Our coarse resolution level uses the predefined set of skull compacta conductivity values SC = [0.8, 1.6, 2.4, 2.8, 3.3, 4.1, 5.5, 7, 8.3, 16.5, 33] mS/m. These values were selected based on ( Aydin et al., 2014 ) including the additional value of 0.8 mS/m ( Altakroury, 2017 ;McCann et al., 2019 ). The ratio of SC:SS is fixed to 1:3.6 ( Akhtari et al., 2002 ;Dannhauer et al., 2011 ). Therefore, our skull conductivity calibration includes only one degree of freedom, namely the SC conductivity, to avoid overfitting due to a too high number of degrees of freedom ( Lew et al., 2009 ;Wolters et al., 2010 ). The following steps compensate for the insensitivity of MEG to quasi-radial source components: For the fixed source location of step 1 and an SC value out of the above predefined coarse resolution set, we then determine the dipole orientation and strength that achieves lowest RV to the measured SEP and SEF components at 20 ms post-stimulus and store the RV to the SEP data as output value. This results in one point of the calibration curve of the corresponding subject in Figure 4 . These steps are repeated for all values of the coarse resolution level, resulting in a coarse resolution calibration curve, for which minimum is then finally selected as the coarse level calibration optimum.
Step 3 (fine resolution calibration): A finer resolution level for SC calibration conductivity is now produced around the coarse level calibration optimum of step 2 and the new RV minimum is determined as in step 2. The outcome is a calibration curve with refinement around the minimum, the skull conductivity calibration value, as shown in Figure 4 . Thereby, our two-level procedure helps to reach an even lower residual variance to the simultaneously measured SEF and SEP P20/N20 peaks.
For the above two-level skull conductivity calibration, EEG and MEG forward methods are employed, as described in the following: All EEG and MEG forward solutions in this study are calculated using the finite element method (FEM), as implemented in the SimBio 5 software. We use the Venant direct FEM approach Wolters et al., 2007 ) due to its high numerical accuracy ( Bauer, Pursiainen, Vorwerk, Kostler, & Wolters, 2015 ) and high computational efficiency when used in combination with EEG and MEG transfer matrices and an algebraic multigrid preconditioned conjugate gradient (AMG-CG) solver ( Wolters, Grasedyck, & Hackbusch, 2004 ). Standard piecewise trilinear basis functions are furthermore used in an isoparametric FEM framework.
In order to adapt to the different units of the EEG and MEG measurements, a signal to noise ratio (SNR) based transformation is applied, whitening the data by means of each channel's individual noise level, and resulting in unitless measures for both EEG and MEG ( Fuchs et al., 1998 ).
Besides (i) the calibrated skull conductivity, we also investigate further measures that are taken into account when accessing the overall quality of the source reconstruction in the calibration procedure ( Fuchs et al., 1998 ). These further measures are the following: (ii) The individual P20/N20 latency. (iii) The individual SNR SEP and SNR SEF , quantifying the quality of our SEP and SEF data, respectively, at the specific P20/N20 signal peak. In this study, the SNR is estimated based on ( Fuchs et al., 1998 ) and it is considered as the maximum value across all sensors, separately for SEP and SEF data. (iv) The Residual Variance of the SEP data (RV SEP ) indicating the remaining distance of the forward simulated to the measured P20/N20 component. (v) The source strength of the dipole scan result of the calibration procedure.

P20/N20 Surface Distance
The scalp surface distance is defined between the potential peak (point P in Fig. 2 a) and the potential trough (point N in Fig. 2 a), based on the following methodological steps. (1) We estimate the subject-specific Cartesian coordinates of P and N as follows. With the result from the calibration in Section 2.4.1 ( Fig. 2 a, green dipole), we produce a forward simulated dipolar scalp topography. The following aspects characterize this topography: (a) it maximally resembles the measured SEP/SEF topography at 20 ms post-stimulus, (b) it reduces the influence of data noise on the peak-to-peak detection procedure and (c) it allows to identify accurately the positive and negative potential peaks P and N, especially since they are most often between electrodes. Thus, this procedure can be seen as a subject-specific inter-and extrapolation method for the scalp potentials. An example of the dipole scalp topography is presented in Fig. 2 a for one of the subjects. (2) We connect P and N through a line with a length corresponding to the Euclidean distance between P and N and discretize this line into equidistant line points, where the distance is chosen according to the discretization size of the scalp surface triangles.
(3) We then use the MATLAB function point2trimesh 6 to determine for each line point the corresponding closest point on the triangulated scalp surface mesh ( Fig. 2 a, a subset of these points is shown by black dots). Thereby, a distinct curved line results which consists of linear elements over the surface. (4) We then approximate the final surface distance between P and N, by summing up the Euclidean length of all linear elements of this curved surface line.

Skull Thickness
The ROI, in which skull thickness is determined, includes an important area of the left hemisphere (due to right-hand stimulation) under the main SEP topography. It does not necessarily include the potential peak P and the potential trough N. For example, including P would mean to include mid-sagittal areas, where the pronounced dura compartment might influence skull segmentation accuracy. Fig. 2 b shows an exemplary skull ROI (in dark yellow) for one subject used in this study.
The investigated skull thickness in this ROI is measured for four different compartments: (i) full skull (ii) outer skull compacta (iii) skull spongiosa and (iv) inner skull compacta. For this purpose, the segmented masks of the full skull (including both compacta compartments and the spongiosa) and of the skull spongiosa ( Section 2.2 ) are utilized. For each one of these masks, a surface-based geometry (or surface), i.e., a set of triangles and nodes, is constructed through the MATLAB function isosurface . Then, the thicknesses are estimated following a recent approach of ( Lillie et al., 2016 ). In short, the thickness is measured using the compartments outer surface and its inner surface for each one of the four skull compartments. To determine the outer and inner surface of the given skull compartment, the normal vectors ( Fig. 2 b, green box, arrows in black), and the center of gravity (CG) of the skull ROI is used ( Fig. 2 b, red point). The determination of the normal vectors is established at each node of the skull compartment surfaces. If the scalar product of CG and a surface node normal is positive, the corresponding node is labeled as outer skull surface point, otherwise inner. By applying this procedure independently both to the full skull and the skull spongiosa surfaces, we extract the surfaces F OUT and S OUT ( Fig. 2 b, blue box, surfaces in green) and F IN and S IN ( Fig. 2 b, blue box, surfaces in red) where 'F' and 'S' denote full skull and spongiosa, respectively. We then measure the four different skull compartment thicknesses. Each thickness value is the average value across the minimum Euclidean distance between each node of the corresponding outer surface (F OUT for full skull and outer skull compacta; S OUT for skull spongiosa; S IN for inner skull compacta) to all nodes of the corresponding inner surface (F IN for full skull and inner skull compacta; S OUT for outer skull compacta; S IN for skull spongiosa).

Source Depth
For each participant, we define the source depth as the minimum Euclidean distance between the P20/N20 reconstructed dipole source location, resulting from the procedure in Section 2.4.1 and the inner surface of the skull. In Fig. 2 c, we present a sketch for the determined Fig. 2. Visualization of the measures of Section 2.4 for one subject. a) Dipole scan result (in green) for the measured P20/N20 component. The black points represent a subset of the surface points for the determination of the distance between the interpolated most positive (P) and most negative (N) potentials of the forward simulated topography. b) Skull ROI (in dark yellow) under the P20/N20 topography (P and N points) and dipole scan result (in green). The ROI includes an important area of the left hemisphere (due to right-hand stimulation) under the main SEP topography. It does not contain the potential peak P in order to avoid inclusion of mid-sagittal skull areas where the segmentation quality might be influenced by the pronounced dura compartment. Color boxes show the main steps for the calculation of skull thickness. The black box shows the skull ROI (sagittal view). The green box shows the normal vectors ( ̂ in black) for the determination of outer and inner skull surfaces. The red point represents the center of gravity (CG) of the ROI. The blue box shows the two outer surfaces F OUT (outer surface of full skull) and S OUT (outer surface of skull spongiosa) (in green) and the two inner surfaces F IN (inner surface of full skull) and S IN (inner surface of skull spongiosa) (in red). c) Determination of the source depth: Minimum distance (D) between the reconstructed P20/N20 dipole scan result (in green) and the inner skull surface. Visualizations were performed with custom MATLAB code and CURRY 8. source depth, given a reconstructed P20/N20 source (green dipole) for one of the participants.

Statistical Analysis
The statistical analysis includes a correlation procedure for testing whether (a) the measures on the head tissues defined in Section 2.4 are age-related, (b) skull thickness is related to the calibrated skull conductivity and (c) the P20/N20 surface distance is related to the source depth. We use the Robust Correlation Toolbox 7 , allowing automatic detection of outliers and determination of statistical significance through percentile bootstrap confidence intervals (CIs). We select the skipped Pearson correlation (rho) as a non-parametric method that takes into account the heteroscedasticity effects compared to the standard Pearson correlation ( Pernet, Wilcox, & Rousselet, 2013 ). The rejection of the null hypothesis is based on the bootstrapped CIs at the 95 % percentile level (95 % CI). We further derive the corresponding p-value ( P ) of each 95 % CI and apply false discovery rate control (FDR) due to the multiple correlation estimations. The applied FDR method follows ( Benjamin & Hochberg, 1995 ) and the adjustment level is set to 0.05. The data resampling within the bootstrap procedure is performed 1000 times while the outlier detection is based on the rule of the interquartile range from the same toolbox.
In a subsequent analysis of variance (ANOVA) 8 , the effect of the gender is taken into account by adding it as a between-subject factor in a linear regression analysis 9 with each of the above-mentioned pairs. In a last step, we conduct pairwise gender comparisons, including two-tailed tests separately for each measure of Section 2.4 and P20/N20 source analysis parameters. The examined null hypothesis H 0 is that females and males have the same mean value. For each test, we first apply a data normality test based on a Kolmogorov-Smirnov test ( Massey et al., 1951 ). A parametric (paired sample t-test), or non-parametric (Mann-Whitney u-test, Mann & Whitney, 1947 ) pairwise test is then applied depending on the result of the normality test. A threshold is defined at 95 % level of confidence for both ANOVA and pairwise tests for the significance level of the p-value. FDR adjustment is also applied to the p-values for the multiple comparison correction.

Results
The result section is divided into two parts: The first part presents the results for the four defined measures of Section 2.4 in male and female participants: (i) calibrated skull conductivity (ii) skull thickness (iii) P20/N20 surface distance and (iv) source depth. To improve readability, unless otherwise stated, the term skull thickness will be the full skull thickness (including outer compacta, spongiosa and inner compacta). In the second part, we outline results from the correlation analysis, as defined in Section 2.5.

Variability in the Measures for Skull Conductivity Calibration
In Figure 3 , the P20/N20 reconstructed dipole source (in red) is presented on the individual MRI for one of the subjects. This source reconstruction is the result of the combined EEG and MEG source analysis within the skull conductivity calibration procedure (Section 2.4.1). The calibrated conductivity is 12.5 mS/m for skull compacta and, due to the fixed conductivity ratio of 1:3.6, 45 mS/m for the spongiosa. The dipole source is located on the postcentral wall of the central sulcus in Brodmann area 3b in primary somatosensory cortex (SI) and has a mainly tangential orientation with regard to the inner skull surface.
For each participant, the skull conductivity calibration procedure (Section 2.4.1) was applied in the corresponding subject-specific realistic 6CA head model, resulting in a 6CA calibrated (6CA_cal) head model. The Residual Variance (RV) of the simulated to the measured data, collected for each conductivity within the calibration procedure, resulted in subject-specific calibration curves that are depicted in Fig. 4 . Finally, determining the minimum in the RV curve allowed us to set up the individual 6CA_cal head model for each subject.
As Figure 4 and Table 1 show, the resulting residual variance for the SEP skull conductivity calibration (minimum of each curve) has a mean of 8.57 % with a standard deviation of 3.44 %, is below 20 % for all of the subjects and the best fit goes down to only 4 %. Furthermore, large inter-subject variability of the skull conductivity can be observed across all subjects with the lowest skull compacta (spongiosa) conductivity being at 2.6 mS/m (9.36 mS/m) and the highest at 16.9 mS/m (60.84 mS/m), respectively. In Figure 4 , the age-related color coding of the curves shows that the calibration skull conductivity values of the younger participants are overall at higher skull conductivities than those of the older participants. Only a more detailed inspection ex-  Residual Variance (in %) to the P20/N20 SEP data on the vertical axis, for the dipole scan result as determined by the skull conductivity calibration procedure (Section 2.4.1). Each subject is represented by one of the curves, color-coded by age. Y-axis is logarithmically-scaled for better readability.

Table 1
Gender-based mean and standard deviation across all the participants of the P20/N20 source analysis with regard to latency (second column), signal-to-noise ratio (SNR) for SEP and SEF (third and fourth columns, resp.), source strength Q (fifth column) and residual variance to the SEP data RV SEP (sixth column) resulting from the calibration procedure as described in Section 2.4.1. The symbol asterisk ' * ' indicates a significant statistical difference (p-value < 0.05) between genders. presses a rather complex relationship between cranial conductivity and age, especially due to two older subjects of age 40 and 43 for whom the calibrated skull conductivities are at 16.1 and 16.9 mS/m, respectively, i.e., as high as for most of the young participants.
In the following, we present the P20/N20 source analysis parameters monitored during the skull conductivity calibration procedure as additional measures introduced in Section 2.4.1. The resulting average value across all subjects is shown in Table 1 . Between genders ( Table 1 , first row: males, second row: females), the P20/N20 latency is significantly shorter ( P < 0.05) for females (22.67 ± 0.84 ms) than males (23.92 ± 1.3 ms). Otherwise, we do not observe any significant gender differences for the remaining P20/N20 source analysis parameters.

Variability in Measures and Gender Differences
In the present section, we investigate the inter-subject variability of the four measures defined in Section 2.4 and examine if gender differences can be found in those measures. The variability and the median of the measures is provided in Fig. 5 across all subjects (gray boxplot) and split into subgroups of males (blue) and females (pink).
The most important result is the wide inter-subject variability with large ranges (from minimum to maximum) for all four measures for the total group as well as for the male and female subgroups.
The calibrated skull compacta conductivity ranges from 3.1 up to 16.9 mS/m for males and from 2.6 up to 16.7 mS/m for females ( Fig. 5 a). The mean and standard deviation across all subjects is 8.44 ± 4.84 mS/m.

Figure 5. Descriptive statistics and inter-subject variabilities.
Boxplots depict the inter-subject variability for a) skull compacta conductivity (in mS/m), b) the averaged full skull thickness in the ROI as indicated in Fig. 2 b (in mm) c) the P20/N20 surface distance (distance of P to N; in cm) and d) source depth (in mm). Color-coding is used to distinguish male (blue; 10 subjects), female (pink; 10 subjects) and total (gray; 20 subjects) groups. The filled circles represent individual results per subject. Note that there are overlapping values within some of the boxplots. Per boxplot, the central horizontal black line is the median and the edges of the box are the 25th and 75th percentiles.

Table 2
Gender-wise mean and standard deviation of the thicknesses for outer and inner compacta and spongiosa skull compartments in the ROI as indicated in Fig. 2 Table 2 , we additionally present group-wise (male, female, total) mean and standard deviation of the thicknesses for the three different cranial compartments. For outer-and inner-compacta, we find the male subgroup having a higher mean thickness value than the female one, while it is the other way around for the spongiosa.
The P20/N20 surface distance was found to be in a range of 9.5 to 16.4 cm for males and 7.8 to 18.1 cm for females ( Fig. 5 c). Mean and standard deviation across all subjects are 12.08 ± 3.21 cm.
In Fig. 5 d, we present the inter-subject variability in source depth, where values range from 11.57 up to 24.05 mm for males and from 5.1 up to 18.56 mm for females. Additionally to the results presented in Fig. 5 d, we determined for source depth a mean and standard deviation of 15.45 ± 4.54 mm across all participants.
Finally, no statistically significant gender difference was observed when applying pairwise statistical analysis on the mean value of each of these measures.

Table 3
Interaction of age with the measures from Section 2.4 , i.e., cranial compartment thickness in the ROI, surface distance and source depth with age. The first and second columns indicate the number and the name of the correlation pair, respectively. The third column shows the correlation value (rho) and the fourth column presents the bootstrapped confidence interval (CI). The fifth column shows the p-value ( P ) as derived from each bootstrapped CI and adjusted with FDR. 95 % CI were computed using bootstrapping with 1000 permutations.

Statistical Results
The robust pairwise correlation was applied independently between the investigated adult age group and each of the four measures. We also assessed the relationship i) between the thickness of the skull (for all three cranial compartments) and the calibrated skull conductivity and ii) between the P20/N20 surface distance and the source depth.
In Figure 6 , we show the statistically significant correlation pairs, i.e., calibrated skull conductivity with age (left subfigure) and calibrated skull conductivity with skull thickness (right subfigure). The remaining correlation pairs are outlined in Table 3 and 4 .
When including gender as a between-subject factor in these pairs through linear regression modeling, no statistically significant effect ( P > 0.05) could be observed. Therefore, the corresponding ANOVA results are not presented.
In our first investigation, we examined if the defined measures in Section 2.4 are correlated with age. In Fig. 6 a, we show that a statis-Figure 6. Interaction of skull conductivity with age and skull thickness. The figure contains the robust correlations between the (calibrated) skull conductivity (in mS/m) and a) age (years), b) (full) skull thickness (in mm). The skipped Pearson correlation value (rho) and the confidence interval (CI) are presented on top of both images. 95 % CI were computed using bootstrapping with 1000 permutations. The corresponding FDR adjusted p-value was 0.017 and 0.01 for a) and b) , respectively. Notice that the data from the participants are overlapping in case that less than twenty points are depicted.

Table 4
Interaction of the calibrated skull conductivity (skull conductivity) with all the cranial compartment thicknesses and the surface distance with the source depth. The first column indicates the row number, the second the correlation pair, the third the correlation value (rho), the fourth the bootstrapped confidence interval (CI) and the fifth the p-value ( P ) derived from each bootstrapped CI and adjusted with FDR. 95 % CI were computed using bootstrapping with 1000 permutations.  ( Fig. 6 a, two circled black crosses on yellow background). The P20/N20 surface distance has a weak positive interaction with age ( Table 3 , fifth row: rho = 0.29, 95 % CI = [-0.12 0.64]), while the thicknesses of the full skull, outer compacta, spongiosa and inner skull compacta in the ROI are also weakly, but negatively, correlated with age ( Table 3 , rows 1-4). For the correlation pair of source depth and age, a positive interaction is observed but not statistically significant ( Table 3 , row 6: rho = 0.35, 95 % CI = [-0.14 0.65]). No outliers were detected for these correlations. In our second study, we investigated whether there is a dependence between skull thickness and conductivity, also with the aim of observing whether our non-invasive approach can achieve a similar result as an invasive approach (e.g. Tang et al., 2008 ). With regard to the correlations between the thickness of all of the cranial compartments in the ROI and the calibrated skull conductivity, the results varied depending on the combination. In Fig. 6 b, a statistically significant positive correlation was revealed between skull thickness and calibrated skull conductivity (rho = 0.52, 95 % CI = [0.19 0.75], P = 0.01). The remaining correlation pairs, shown in Table 4 , are not statistically significant according to their CIs and p-values. In particular, the thickness of the cranial compartments spongiosa and inner compacta has a low positive interaction with the calibrated skull conductivity ( Table 4 , rows 2-3), while an opposite low interaction occurs with the outer compacta thickness and the calibrated skull conductivity ( Table 4 , row 1, rho = -0.25).
For the last correlation pair, i.e., the surface distance and the source depth, we found a marginally negative but non-significant value ( Table 4 , row 4, rho = -0.11).
No outliers occurred during the assessment of the correlations shown in Table 4 .
Taking into account the significant correlations ( Fig. 6 ), a linear mixed-effect analysis 10 was also used to assess the effect of age and full skull thickness on the calibrated skull conductivity based on gender. The predicted variable was the calibrated skull conductivity which age, full skull thickness and gender were the predictor variables ( b ). From this analysis, we get that age and (full) skull thickness are two significant predictors ( b age = -0.01 and b skull_thickness = 0.18, P < 0.05) while gender showed a weak effect ( b gender = -0.39, P = 0.06). In this analysis, similar outliers were observed as presented in Fig. 6 .

Discussion
In this study in a group of twenty adult subjects in the age of 18 to 53, we estimated and evaluated the inter-subject variability of bulk (calibrated) skull conductivity using the non-invasive modalities EEG and MEG in fusion with MRI, modalities that are available in MEG laboratories and for which ethical clearance is nowadays standard. We proposed a two-level calibration procedure to estimate individual skull conductivity using source analysis based on detailed FEM head modeling of the P20/N20 component of combined SEP and SEF data from electric wrist stimulation. Our most important result is the high inter-subject variability over the investigated age range and in each age subgroup, as the high variances in Figures 4 , 5 and 6 clearly illustrate. This means that approaches like the proposed calibration procedure are needed to individually estimate skull conductivity, one of the most important forward modeling parameters in EEG and combined EEG/MEG source analysis ( Aydin et al., 2014 ;Vorwerk et al., 2014Vorwerk et al., , 2019 as well as in transcranial electric stimulation (TES) ( Saturnino et al., 2019 ;Schmidt et al., 2015 ). Besides the high inter-subject variability, pointing to the need for individualization of experimental procedures, we also found two significant relationships ( Fig. 6 ). Our results therefore motivate the following experimental setup: In a first run of 10 min, SEP and SEF data are collected, serving for head model calibration, followed by the main acquisition runs of combined EEG/MEG data of interest. Then, these main data are being analyzed using the individually calibrated realistic head model.
The application of the presented calibration procedure in a group of twenty adult subjects yielded large inter-subject variability among the estimated skull conductivities ( Figs. 4 , 5 a and 6 ). This was also reported in a recent review ( McCann et al., 2019 ). Furthermore, other studies using DAC on skull pieces temporarily removed during surgery showed a high inter-subject variability ( Hoekema et al., 2003 ;Tang et al., 2008 ). For 1 kHz DAC, Tang et al. (2008) indicated a variation of skull conductivity between 3.77 and 17.29 mS/m, which is close to our range of 2.6 to 16.9 mS/m. Arumugam et al. (2017) used EIT in ten subjects and found a skull conductivity variability of 1.8 to 5.6 mS/m. Compared to the above studies, our results ( Fig. 4 and Fig. 5 a) were found to be in a similar variability range. Those results are measured under in vivo conditions and in the low frequency range of interest, when considering the frequency-dependence of conductivity measurements ( Akhtari et al., 2002 ;Gabriel, Gabriel, & Corthout, 1996 ;Stinstra et al., 1998 ;Tang et al., 2008 ).
In Table 1 , we presented the further defined measures from Section 2.4.1 for the source analysis within our skull conductivity calibration procedure. With SNR values of 14.88 ± 4.20 for SEP and 22.72 ± 7.29 for SEF data, a single run of only 10 min for 1200 trials gave us enough data quality for accurate source analysis. The higher SNR value for MEG compared to EEG for the same number of trials shows the higher sensitivity of MEG than EEG to the rather lateral ( Fig. 5 , maximally 24 mm deep) and mainly tangentially-oriented (on average 25.5°± 18.6°deviation from the tangential plane, being parallel to the inner skull surface) P20/N20 dipole source in Brodmann area 3b. It has been shown in various studies that such sources are better detectable by MEG than by EEG ( Goldenholz et al., 2009 ;Hillebrand & Barnes, 2002 ;Huang et al., 2007 ). This higher detectability together with the insensitivity of MEG to skull and skin conductivity ( Brette & Destexhe, 2012 ) supports the idea of relying on MEG dipole scans for an accurate localization within our SEP/SEF calibration procedure. The low residual variance ( Fig. 4 and Table 1 ) shows that the collection of only a single run with 1200 trials together with the model of a focal dipolar P20/N20 source ( Allison et al., 1991 ;Hari et al., 1993 ;Nakamura et al., 1998 ) in a calibrated and highly realistic head model seems acceptable for our calibration needs. A simultaneously activated deep thalamic source at the P20/N20 peak as proposed by ( Buchner et al., 1994 ;Fuchs et al., 1998 ) is hardly detectable in the MEG and therefore also hardly influences our MEG driven localization process. Furthermore, Götz, Huonker, Witte, and Haueisen (2014) showed that in 10 out of 12 subjects, the single dipole model performed accordingly at the P20/N20 peak and in some first test simulations, an additional thalamic source also did not significantly influence our calibration results. The short acquisition time of 10 min for SEP/SEF data is an important advantage when compared to e.g. MREIT, which takes longer ( McCann et al., 2019 ). The computation time for the skull conductivity calibration, including all EEG/MEG forward calculations for six-compartment anisotropic (6CA) head modeling, was an overnight job for each subject, using a conventional laptop (Dell, XPS 15, 2016). We can thus summarize that our proposed calibration procedure is feasible in a standard MEG laboratory with an additional EEG/MEG measurement time of only 10 min per subject.
A particularly strong influence of skull conductivity on EEG forward simulations and EEG source analysis has been reported in many studies using realistic head models of different detail ( Akalin Acar et al., 2016 ;Azizollahi et al., 2016Azizollahi et al., , 2018Cuartas Morales, Acosta-Medina, Castellanos-Dominguez, & Mantini, 2019 ;Gençer & Acar, 2004 ;Montes-Restrepo et al., 2014 ;Vallaghé & Clerc, 2009 ;Vorwerk et al., 2019 ). Previous studies on EEG ( Cuartas Montes-Restrepo et al., 2014 ) and combined EEG/MEG ( Aydin et al., 2014 ) source analysis also showed that skull conductivity inaccuracies can easily lead to localization errors in the centimeter range. Furthermore, skull conductivity was also found to be the most influential parameter for optimized TES, as shown in recent uncertainty analyses ( Saturnino et al., 2019 ;Schmidt et al., 2015 ). Therefore, the use of subject-specific calibrated realistic head volume conductor modeling, as presented in our work, is suggested to take into account the inter-subject variability of skull conductivity in EEG and combined EEG/MEG source analysis as well as in optimized TES.
Based on the results of our correlation analysis, (calibrated) skull conductivity and age showed a significant negative correlation ( Fig. 6 a). This inverse relationship is in line with previous studies on this topic ( Arumugam et al., 2017 ;Hoekema et al., 2003 ). In agreement with our result, Hoekema et al. (2003) , who worked on a group study with five patients aged 11 to 50 years, found that skull conductivity is higher in younger patients than in older patients. Using EIT, Arumugam and colleagues (2017) found a negative correlation trend in a group study with ten subjects, aged 23 to 49 years. Other studies with mammals such as for example rats ( Peyman, Rezazadeh, & Gabriel, 2001 ;, which however measured skull conductivity at microwave frequency, have also reported such a negative relationship. In the present study, the skull conductivity was estimated by means of a non-invasive procedure based on SEP/SEF recordings in the low frequency range of interest and in an age range from 18 to 53 years. Since EIT and SEP/SEF methods have shown agreement on their estimated skull conductivities ( Gonçalves et al., 2003 ), our resulting correlation of the calibrated skull conductivity with age could have been expected, considering the distribution of the age range used in this study.
Our correlation analysis also yielded a statistically significant positive correlation between skull conductivity and (full) skull thickness ( Fig. 6 b). This finding is supported by ( Tang et al., 2008 ) who measured resistivities of 388 skull samples, excised from 48 skull flaps of patients undergoing surgery using DAC. Furthermore, we observed a non-significant negative correlation between skull thickness and age ( Table 3 ). As shown by ( Delye, Clijmans, Mommaerts, Sloten, & Goffin, 2015 ;see Fig. 5 ), skull thickness increases exponentially in the age range from 0 to 18 years, while in the age range from 18 to 20 years, a high inter-subject variability starts dominating an only small remaining linear increase of skull thickness over time. It can be assumed that this variability continues for older subjects, as shown here ( Fig. 5 b) and similarly supported by ( De Boer et al., 2016 ;Lillie et al., 2016 ;Lynnerup, Astrup, & Sejrsen, 2005 ), which makes it difficult to extract a robust correlation of skull thickness with age.
Our significant finding on the relationship of skull conductivity and age could depend on the chosen age range. When excluding the older participants around 50 (49 -53), we do no longer observe a significant negative correlation (rho = 0.11, CI = [-0.55 0.60]) between calibrated skull conductivity and age. When excluding the same subgroup of participants, we still get a non-significant correlation of spongiosa thickness over age (rho = 0.17, CI = [-0.30 0.63]) compared to ( Table 3 , row 3). This irregularity in the age subgroups over our larger age range should be further studied, also due to the results of ( Lynnerup et al., 2005 ;Figs. 3 -5), where it was shown that the spongiosa thickness varies nonmonotonically over the large age range from 16 to 90 years. Therefore, a future study should include a larger number of participants, particularly in the age range from 40 to 53, to further investigate the relationship between skull conductivity, age and thickness, with a possible further focus on age subgroups. Finally, osteoporosis ( Aspray & Hill, 2019 ) or osteopetrosis ( Boskey & Coleman, 2010 ) could potentially influence these relationships.
Our evaluated correlation pairs were determined for a group of subjects in the adult age range (age of 18 to 53). The results could differ for groups of subjects in childhood and also in older age. Particularly in newborns, also due to the presence of fontanelles, as well as in the first years of life, cranial development, including skull thickness and skull conductivity, can be considered highly nonlinear ( Azizollahi et al., 2016( Azizollahi et al., , 2018Delye et al., 2015 ;Gibson, Bayford, & Holder, 2000 ;Li et al., 2015 ;Odabaee et al., 2014 ).
Regarding the lack of further significant correlations ( Table 3 , 4 ), the limited sample size and the relatively non-uniform age range could be main factors, and remaining modeling simplifications might play a role. Sowell et al. (2006) determined that cortical thickness in the postcentral gyrus could decrease in the age from 20 to 87, replaced by CSF ( Fjell et al., 2010 ). The latter could be a reason for the positive, however, non-significant, correlation of P20/N20 source depth over age ( Table 3 ). In addition, modeling simplifications, such as the use of standard nonindividualized skin and gray matter conductivity values could have influenced our results for the calibrated skull conductivity, source strength and residual variance to the SEP data RV SEP in Table 1 ( Vorwerk et al., 2019 ;Fig. 7).
With regard to gender, the only significant difference we found was the P20/N20 latency ( Table 1 ). The shorter P20/N20 latency we measured in males is in line with previous studies ( Allison, Wood, & Goff, 1983 ;Huttunen et al., 1999 ) and can easily be attributed to the longer arms of males ( Huttunen et al., 1999 ). Furthermore, even if gender is often considered as an additional factor in the relationship between skull thickness and age ( Pellegrini et al., 2018 ), in our data inter-subject variability limits the possibility of an observation of such a gender effect. Since skull thickness and conductivity are related ( Fig. 6 b and Tang et al., 2008 ), we assumed that through a possible influence of gender on skull thickness, an indirect influence of gender on skull conductivity could also exist. However, as ANOVA analysis showed, no gender effect was observed ( P > 0.1). Considering also gender in a linear mixedeffects analysis ( Section 3.2 ), age and full skull thickness remained significant predictors of the calibrated skull conductivity while gender was weak ( P = 0.06). Considering the variability of cranial thickness in both subgroups ( Fig. 5 b, Table 2 ) which is supported by ( De Boer et al., 2016 ;Lillie et al., 2016 ), the absence of a gender effect could be expected. These two studies used a large number of CT datasets and also showed no significant gender differences for thickness of the skull regions in the left hemisphere. Our and their results mainly only emphasize the large inter-subject variability. In summary, due to the large inter-subject variability, the evaluation of gender effects and differences in the measures studied here might remain a challenging task.
Two subjects in the age of 40 and 43 were detected as outliers in the correlation pair presented in ( Fig. 6 a) due to their exceptionally high calibrated skull conductivity. We found that their average skull thicknesses of 6.5 mm and 8 mm in the defined skull ROI ( Fig. 2 b) was also relatively large, with large variation over the ROI (3.4 -8 mm). However, while in our group study, these values were on the higher side, according to ( Fig. 6 in Tang et al., 2008, Fig. 1 in De Boer et al., 2016, even higher thicknesses can be found. We expect that a larger number of participants in the age range of 40 would smoothen the skull thickness range presented in this study.
The selected age group in this study reflects the age range of the subject pool at a MEG center, with fewer data points in the range under 22 and over 40 and many participants of student age. In particular, however, this study is an important part of an epilepsy project to investigate whether combined EEG/MEG analysis in individualized head-volume conductor models with calibrated skull conductivity can provide a better estimate of the epileptogenic zone. Of particular interest is the comparison to EEG or MEG single-modality analyses and analyses using simplified and non-calibrated head models. In first proofof-principle studies, a superiority of combined EEG/MEG analysis using head models calibrated for skull conductivity has already been shown ( Aydin et al., 2015( Aydin et al., , 2017. Most epilepsy patients in presurgical epilepsy diagnosis -a main clinical application of EEG/MEG source analysis -in whom surgery is considered are also in the age group as investigated here . Therefore, in especially this age range, individually calibrated skull conductivity can provide useful information for epilepsy diagnostics. Thus, it was our specific interest to use a noninvasive method using hardware available in a MEG center to investigate how age and gender can influence skull conductivity and thickness in middle-aged adults. We therefore did not collect the same number of participants in all age subgroups, but we only paid attention to an equal number of men and women for the gender investigations. The main result of our study, namely to show the need for individually calibrated head models for a combined EEG/MEG source analysis due to the large variance in skull conductivity for this important age range, could therefore be achieved.
Correlations between skull conductivity, thickness and age in childhood have not been examined here due to the limitation of our ethics vote to adult studies. Future studies using our non-invasive procedure could thus investigate not only larger sample sizes, but also include the childhood age range, and thereby stabilize the statistics for an analysis in a complete age-range. It would also be interesting to investigate how other factors, e.g., nutrition or health, might influence the defined measures.
Within the construction of the realistic head models in Section 2.2 , modifications in the erosion operator would have influenced the determined ratio between cancellous and cranial bone. An erosion of only 1 mm could have resulted in too thin inner and outer compacta and could have thereby led to 'skull leakage' as described by ( Engwer, Vorwerk, Ludewig, & Wolters, 2017 ;Piastra et al., 2018 ). A higher erosion value ( > 2 mm) could have artificially reduced the skull spongiosa and increased the inner and outer compacta thicknesses, which would in turn have increased our value for calibrated skull conductivity. In the future, investigations will be carried out for the use of level set tissue segmentation approaches in combination with unfitted finite element methods that better take into account the partial volume effects (Nüßing et al. , 2016), and its influence on skull conductivity calibration.
In order to avoid overfitting ( Wolters et al., 2010 ;Lew et al., 2009 ), we only allowed one degree of freedom in our calibration procedure for the most influential parameter as detected by uncertainty analysis, namely skull conductivity ( Vorwerk et al., 2019 ;Saturnino et al., 2019 ). We cannot exclude that possible inter-subject variabilities in skin or GM conductivity could have influenced our calibrated skull conductivity values ( Vorwerk et al., 2019 ;Saturnino et al., 2019 ). However, for the influence of the second most important parameter for the EEG, namely skin conductivity ( Vorwerk et al., 2019 , Figs. 7 and 9), it was also found that for lower skull conductivities, variability of skin conductivity has a smaller influence on source depth ( Vorwerk et al., 2019 ;Fig. 9). Furthermore, since MEG is insensitive to skin conductivity, at least our source localizations and source depths should be mainly not influenced by individual variations in skin conductivity. An overlayed thalamic activity might also simultaneously be present in the P20/N20 component in a small percentage of subjects ( Götz et al., 2014 ). Still, future studies are needed to determine the potential of these effects on the calibration procedure.
Our head models ignored the volume conduction effects of the dura ( Ramon, Garguilo, Fridgeirsson, & Haueisen, 2014 ), blood vessels ( Fiederer et al., 2016 ) as well as local skull inhomogeneities such as sutures, which could provide a path of higher conductance ( Tang et al., 2008 ;Ollikainen, Vauhkonen, Karjalainen, & Kaipio, 1999 ;Pohlmeier et al., 1997 ). In addition, following ( Baumann et al., 1997 ) for CSF conductivity, we assumed a fixed value of 1.79 S/m at body temperature, which is nearly identical to the recommended weighted mean value of 1.71 S/m of a recent meta-analysis ( McCann et al., 2019 ). In the latter study, however, a larger variability of CSF conductivity was reported when using MREIT instead of DAC for its determination (Fig. 8 in McCann et al., 2019 ). These are the main reasons why we have consistently used the terms 'estimation' or 'calibration' of skull conductivity in this study, since the term 'determination' would have feigned too much precision. Despite these limitations, we believe that our proposed procedure is a considerable step forward when compared to the current standard, i.e., the use of non-individual literature-based or only age-dependent skull conductivity values.
We fixed the conductivity ratio between skull compacta and spongiosa, using the measurements of ( Akhtari et al., 2002 ), again with the main argument to avoid overfitting ( Wolters et al., 2010 ;Lew et al., 2009 ). First of all, skull conductivity calibration with such a fixed conductivity ratio for compacta:spongiosa has been successfully used in a proof-of-principle study for combined EEG/MEG source analysis in presurgical epilepsy diagnosis ( Aydin et al., 2017 ). Secondly, also the simulation studies of ( Montes- Restrepo et al., 2014 ;Dannhauer et al., 2011 ;Vorwerk et al., 2014 ) support the use of skull modeling approaches that distinguish between skull spongiosa and compacta. However, it was also shown that for the somatosensory cortex this distinction causes only a weak effect in both EEG and MEG when using an optimized conductivity value for the homogenized full skull compartment (Fig. 12 in Vorwerk et al., 2014 ). Therefore, we do not expect that a calibration similar to our approach would be able to additionally estimate an individual ratio as a second degree of freedom. For the MEG, the observed effect on forward solutions was even much smaller compared to EEG (Fig. 12 in Vorwerk et al., 2014 ), and since our calibration uses the MEG for the localization part, we do not expect a high sensitivity of our calibration procedure on the chosen ratio. Skull conductivity calibration could also be performed using a homogenized full skull compartment, which would lead to a calibration value reflecting the combined effect of both compartments. Because of the overall weak effect of the spongiosa compartment on EEG and especially MEG forward solutions for somatosensory sources (Fig. 12 in Vorwerk et al., 2014 ), we expect only moderate changes in the calibration value.
Finally, comparison with EIT and/or combination with EIT procedures are also interesting as future goals ( Gonçalves et al., 2003 ). Calibration procedures might be studied that exploit other SEP/SEF (left median nerve, tibial nerve, trigeminal nerve) or auditory or visually evoked potential (AEP, VEP) and field (AEF, VEF) data in order to calibrate other skull ROI's. Combinations of such calibration datasets might even allow the use of more than one degree of freedom in the calibration process, which, if presented alternately, need not even extend the measurement time.

Conclusion
In this group study with twenty participants, we evaluated the intersubject variability of skull conductivity in the context of combined EEG/MEG source analysis using a non-invasive calibration procedure. Our method is based on the reconstruction of the SEP and SEF P20/N20 component with subject-specific realistic head modeling. We found large inter-subject variability for the calibrated skull conductivity, as well as for the examined related measures of skull thickness, P20/N20 surface distance and source depth. Our statistical analysis shows that the calibrated skull conductivity is significantly related to the skull thickness and age of the participants with no clear gender effects. We did not find gender differences besides a significantly shorter P20/N20 latency in females than males. In the context of source analysis of EEG or combined EEG/MEG data and also for optimized TES, our study emphasized the critical importance of taking the inter-subject variabilities of skull conductivity and thickness into account. We therefore propose the additional measurement of the individual SEP/SEF P20/N20 component and its use for subject-specific calibrated realistic head modeling. Our procedure is non-invasive and easily applicable in a standard MEG laboratory.

Funding
This work was supported by the German Research Foundation (DFG), projects WO1425/7-1 and RA2062/1-1, by EU project Child-Brain (Marie Curie innovative training network, grant no. 641,652), by the DFG priority program SPP1665, project WO1425/5-2 and by the "Alexander S. Onassis " Public Benefit Foundation.

Data and Code Availability
The data that support the results of this group study are available in anonymized form on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
The developed code in the present study is part of an ongoing research project and only parts of it can be made available on request by the corresponding author. The third-party code used in the present study is either cited accordingly or the corresponding openly available code repository is provided via a link in the footnote that a description of the code exists.

Declaration of Competing Interest
None.