ComBat models for harmonization of resting-state EEG features in multisite studies

(cid:1) Spectral features exhibited consistent relationships with age after harmonization with ComBat variants. (cid:1) ComBat harmonization reduced batch effects in resting-state EEG spectral features at the sensor level. (cid:1) Optimized Nested ComBat with Gaussian Mixture Model and HarmonizR achieved better batch effects correction than other ComBat variants.


Introduction
Multiple resting-state Electroencephalogram (rsEEG) features have been proposed as potential neurophysiological correlates of development, aging, and diseases (Babiloni et al., 2020).However, the sample size of rsEEG studies has been typically small, often limiting the statistical power, generalizability, and external validity of the findings from some single-site studies (Newson and Thiagarajan, 2019).In recent decades, the corpus of clinical research on rsEEG has significantly grown (Colom-Cadena et al., 2020;Yao et al., 2022), and the open-science philosophy adopted by several researchers and institutions granted full accessibility to analysis software and datasets (Niso et al., 2022).Although pooling data from multiple sites could be considered a practical solution to the stated problems, it might introduce bias in further analyses due to site-related variability of rsEEG features (''batch" effects).
Batch effects can be caused by headset/hardware differences, non-identical experimental setup, and particular parameters for acquisition and processing (Bigdely-Shamlo et al., 2020;Li et al., 2022).Similarly, cross-site differences in rsEEG features could be attributed to the biological variability of the samples (e.g., different ages, diagnoses, or sex proportions across site samples).Prior works in genetics, proteomics, and neuroimaging have formulated potential solutions to harmonize batch effects while preserving the effect of biological covariates that might be of interest for analysis (Fortin et al., 2018;Voß et al., 2022).
The combining batches (ComBat) method, initially developed for genomics (Johnson et al., 2007), is a popular approach for harmonizing batch effects.This method and its variants have shown effectiveness in harmonizing a broad range of features in radiomics (Beer et al., 2020;Bell et al., 2022;Pomponio et al., 2020;Shiri et al., 2022;Voß et al., 2022;Xu et al., 2023), and transcriptomics (Ryan et al., 2022).In the imaging community, it has been acknowledged that standardized processing workflows and pooling multisite datasets while controlling for batch effects might contribute to more generalizable findings and enhance the potential discovery of robust biomarkers for neurological and psychiatric conditions (Jovicich et al., 2019).Conversely, there are scarce reports evaluating batch effects harmonization of rsEEG features with ComBat models (Kurbatskaya et al., 2023;Li et al., 2022).With all the above, we aim to evaluate four ComBat harmonization pipelines in a pooled sample from five independent rsEEG datasets of young and old adults.As a practical tool for clinical researchers running multisite studies, the following paper presents an exemplary case of batch effects harmonization when assessing potential relationships between age and rsEEG spectral features.This openly available workflow is conducted on relevant and robust neurophysiological characteristics of the rsEEG signals in healthy and diseased populations (Babiloni et al., 2020), namely, oscillatory and aperiodic features (Donoghue et al., 2020).Beyond, adapting this harmonization workflow favors the exploitation of existing rsEEG open repositories and the combination of in-house data with multisite datasets across clinics, leveraging the capabilities of machine learning and artificial intelligence applications for diagnosis, prognosis, or treatment monitoring (Al-Qazzaz et al., 2014;Bailey and Hoy, 2021).

Study design and participants
This secondary analysis utilized open datasets from five crosssectional studies conducted in the USA, Finland, Norway, and Germany involving healthy young and old adults.Three studies com-pared healthy controls and Parkinson's disease patients (Anjum et al., 2020;Railo et al., 2020:;Rockhill et al., 2021), while the other two focused on aging (Babayan et al., 2019;Hatlestad-Hall et al., 2022a).Most primary studies excluded participants with neurological and psychiatric conditions.

California dataset
Data was acquired in 2013.Cognitively normal older adults (n = 16) were recruited at Scripps Clinic in La Jolla, California, USA.Participants were recruited from the community (or were spouses of the Parkinson's Disease patients) and signed a written informed consent approved by the University of California, San Diego.Further details are provided elsewhere (George et al., 2013;Jackson et al., 2019).

Finland dataset
Data was acquired in 2018.Cognitively normal older adults (n = 20) were recruited as a control group at the University of Turku and Turku University Hospital, Turku, Finland.Participants signed written informed consent with ethical approval from the local committee.Further details are provided elsewhere (Railo et al., 2020:).

Iowa dataset
Data was acquired from 2017 to 2019.Cognitively normal older adults (n = 14) were recruited as a control group at the University of Iowa, Narayanan Lab, Iowa, USA.All participants provided written informed consent, approved by the Office of Institutional Review Board.Further details are provided elsewhere (Anjum et al., 2020;Singh et al., 2020).

Oslo dataset
Data was acquired in 2017.Cognitively normal young and old adults (n = 111) were recruited from the local community at the University of Oslo, Oslo, Norway.The primary study investigated visual and auditory stimulus-specific response modulation and aging effects, acquiring event-related and rsEEG data.All participants provided written informed consent, approved by the regional ethics committee.Further details are provided elsewhere (Hatlestad-Hall et al., 2022b;Rygvold et al., 2021).

LEMON dataset
Data was acquired between 2013 and 2015.Participants (n = 216) were included in the ''Leipzig Study for Mind-Body-Emotion Interactions" (LEMON) to explore mind-brain-bodyemotion interactions in aging.The sample of healthy young (20-35 years) and old adults (59-77 years) was recruited at the Day Clinic for Cognitive Neurology of the University Clinic Leipzig and the Max Planck Institute for Human and Cognitive and Brain Sciences, Leipzig, Germany.All participants provided written informed consent, approved by the ethics committee at the University of Leipzig À Medical Faculty.Further details are provided elsewhere (Babayan et al., 2019).

RsEEG recordings and signal preprocessing
All raw rsEEG data was obtained from open repositories: California (Rockhill et al., 2021), Finland (Railo, 2021), Iowa (Narayanan Lab, 2020), Oslo (Hatlestad-Hall, 2022), and LEMON (Babayan et al., 2019).RsEEG signals from Iowa were acquired under the eyes open condition; the remaining datasets were recorded under eyes closed.Signals were acquired with electrode caps (monopolar montages); amplifier details are presented in Supplementary Table 1.Twenty-nine common channels across all recordings were considered for this analysis; Supplementary Figure 1.In order to automatize the preprocessing and feature extrac-tion steps, all datasets were standardized following the Brain Image Data Structure (BIDS) (Pernet et al., 2019) with the sovaBIDS package, available at https://github.com/yjmantilla/sovabids.
For preprocessing, we used the sovaharmony pipeline (Isaza et al., 2023;Jaramillo-Jimenez et al., 2023;Suárez-Revelo et al., 2018), available at https://github.com/GRUNECO/eeg_harmonization.Sovaharmony wraps multiple preprocessing tools.First, robust average re-referencing, adaptative line-noise correction, and bad channel interpolation were performed using the PyPREP library (Appelhoff et al., 2022), a Python implementation of the MATLAB PREP pipeline (Bigdely-Shamlo et al., 2015).PyPREP estimates a robust average reference by excluding noisy channels, ensuring a comparable reference scheme across datasets.A 1 Hz high-pass Finite Impulse Response (FIR) filter was applied to remove low-frequency drifts.Then, wavelet-enhanced Independent Component Analysis (ICA) was performed to smooth strong artifacts in the data (Castellanos and Makarov, 2006).Thus, the MNE library's FastICA algorithm extracted artifactual and brain components from the signal (Gramfort et al., 2013).These components were decomposed into wavelets, and artifacts were smoothed through thresholding.Signals were then low-pass filtered at 30 Hz, and epochs of 5 s length were obtained.Finally, artifactual epochs were automatically rejected based on extreme amplitude or power values, linear trend, joint probability, and kurtosis.The test-retest reliability of this preprocessing scheme is validated elsewhere (Isaza et al., 2023).

Features extraction: spectral parameterization
Preprocessed signals were down-sampled to a uniform sampling rate of 500 Hz.Power Spectral Density (PSD) vectors were computed at the sensor level using the psd_array_multitaper function implemented in MNE, with default parameters (Gramfort et al., 2013).Next, the median of the PSD vectors was computed across epochs.Given the potential confounder effect of nonoscillatory (aperiodic) activity in the median PSD vectors, spectral parameterization was conducted using the Fitting Oscillations & One Over Frequency algorithm (FOOOF) (Donoghue et al., 2020).

Separating oscillatory peaks from non-oscillatory (aperiodic) trend
The PSD reveals neuronal oscillations as peaks superimposed on non-oscillatory, aperiodic activity, typically following a 1/frequency distribution (i.e., greater power values at slow frequencies that decrease as frequency increases).Although spectral analysis often focuses on oscillations within frequency bands, simulation studies have proven potential conflation of band power and frequency peak estimations when the PSD is not separated into oscillatory and non-oscillatory components.Spectral parameterization models the oscillatory and non-oscillatory activity, avoiding confounder effects of the aperiodic trend on peak descriptors (Donoghue et al., 2020).Spectral parameterization methods have been recently applied in clinical research to study age-related rsEEG aperiodic changes and conditions like neurodegenerative diseases (Kopc ˇanová et al., 2024;McKeown et al., 2023;Merkin et al., 2023Merkin et al., , 2021;;Rosenblum et al., 2023aRosenblum et al., , 2023b;;Wang et al., 2024).The FOOOF algorithm takes an input PSD (median PSD per channel in this study) to perform a linear fit to the aperiodic trend in the log-log scale using a Lorentzian function.This aperiodic fit is subtracted from the input PSD to ''expose" detrended oscillatory peaks, which are then quantified through iterative Gaussian fits based on peak count, height, and width.A comprehensive review is available elsewhere (Donoghue et al., 2020;Gerster et al., 2022).The FOOOF parameters in this study were: frequency range [1, 30 Hz], peak width [1,8], minimum peak height 0.05, and a maximum of 6 peaks.

Descriptors of oscillatory and aperiodic activity
After separating oscillatory and aperiodic activity, FOOOF returns the following descriptors for each channel: Oscillatory parameters (Power -PW, Bandwidth -BW, Center Frequency -CF), Aperiodic parameters (Exponent, Offset), Fitting parameters (error and R-squared).For the current analysis, we included oscillatory PWs in fixed frequency bands: delta (1 -4 Hz), theta (4 -8 Hz), alpha (8 -13 Hz), and beta (13 -30 Hz) (Babiloni et al., 2020), and the individual alpha frequency quantified as the oscillatory CF (and its BW) in the extended alpha band (5 -14 Hz) (Moretti et al., 2013).On the other hand, we included the aperiodic exponent (representing the slope estimated from the fitted 1/frequency activity) and the aperiodic offset (representing the y-axis intercept of the 1/frequency activity).A graphical representation of the FOOOF fitting and the resulting spectral parameters is available in Supplementary Figure 3.
These oscillatory and aperiodic spectral features were combined into a pooled dataset containing all subjects (n = 374) and common channels (n = 29).Channels with a good FOOOF fitting, defined here as an R-squared equal to or greater than 0.8, were included for analysis (83.51% of all channel fittings).A complete description of FOOOF fitting estimations is presented in Supplementary Figure 4 and Supplementary Table 2. Despite the reported good test-retest reliability of FOOOF and its consistency for eyesclosed data (McKeown et al., 2024), oscillatory parameters might produce missing values in some cases.The latter was most evident for frequency bands with a null area under the oscillatory spectrum curve (i.e., flat due to the absence of peaks), so the PW, BW, and CF parameters could not be computed, as well as in channels with non-prominent peaks detected within the extended alpha band.Thus, missing values represented 19.83% of all spectral parameters in the pooled dataset.The percentage of missing values for each feature was delta PW (74.98%), theta PW (69.93%), alpha PW (18.49%), beta PW (4.85%), extended alpha CF and BW (15.03%).A high amount of undetected peaks on the delta and theta band is consistent with prior reports (McKeown et al., 2024) and is expected due to the physiological prominence of alpha oscillations during resting state with the eyes closed.Following this rationale, we replaced the missing values with zeros as they represented null oscillatory power or undetected peaks.

Batch harmonization of rsEEG spectral parameters with ComBat and ComBat variants
Using linear models with ''site" as a covariate is a standard approximation to remove batch effects from the features.However, this assumes the absence of cross-site sample differences in biological variables (such as age and sex) that might explain differences in the distribution of the features.Biological variables whose variability is desired to be retained could be included as covariates of the mentioned linear model (adjusted residuals harmonization).Beyond, ComBat was formulated to mitigate batch effects by modeling site-specific scaling factors with empirical Bayes in the context of small sample sizes (Johnson et al., 2007), outperforming linear models (Fortin et al., 2018).ComBat assumes that the harmonized features can be modeled as a linear combination of the site-related batch effects, biological covariates, and a site-specific error term.Adaptations and variants developed to address poten-tial limitations of ComBat have proven consistent results across different data modalities.A comprehensive review of ComBat harmonization variants for neuroimaging can be found elsewhere (Hu et al., 2023).
Batch effects in the pooled dataset were harmonized using four ComBat variants.NeuroCombat (Fortin et al., 2018), neuroHarmonize (Pomponio et al., 2020), and Optimized Nested ComBat -Gaussian Mixture Model (OPNestedComBat-GMM) (Horng et al., 2022a) were implemented in Python, while HarmonizR (Voß et al., 2022) was implemented in R. We used the default parameters for all ComBat variants.Considering that HarmonizR can handle missing values, the replacement with zeros was not conducted for this harmonization method.Replacement of missing values with zeros in the unharmonized pooled dataset (see Section 2.3.2) was necessary for the Python-based ComBat variants.NeuroCombat, neuroHarmonize, and OPNestedComBat-GMM allow the introduction of biological covariates to retain their variability, whereas HarmonizR does not.Thus, age and sex variability were controlled only in the Python-based harmonization variants.

NeuroCombat
NeuroComBat reimplemented the original ComBat to adjust batch effects in diffusion tensor imaging and cortical thickness from structural magnetic resonance images (Fortin et al., 2018(Fortin et al., , 2017)).Since its formulation, neuroCombat has been validated in various structural and functional imaging modalities (Amini et al., 2023;Da-ano et al., 2020;Du et al., 2023;Leithner et al., 2022;Xu et al., 2023).The model incorporates batch-specific scaling factors (i.e., location and scale, accounting for mean and variance batch-related differences in each feature) estimated via empirical Bayes.Thus, neuroCombat re-scales the features' values to make them comparable across batches.In addition, neuroCombat retains the linear effect of biological covariates on the harmonized features (Fortin et al., 2018).Despite this, nonlinear relations between biological covariates and features might not be appropriately captured with neuroCombat (Hu et al., 2023).NeuroCombat is available at https://github.com/Jfortin1/neuroCombat.

NeuroHarmonize
The neuroHarmonize variant was proposed for scenarios where the biological covariates exhibit a nonlinear relationship with the analysis features (Cetin-Karayumak et al., 2020:).NeuroHarmonize preserves the core steps of neuroCombat but retains the nonlinear relationships between features and covariates via Generalized Additive Models.NeuroHarmonize was evaluated in the context of harmonizing structural imaging volumetry features and validated in a large cohort, proving the preservation of nonlinear relations between brain volumes and age throughout the lifespan (Pomponio et al., 2020).NeuroHarmonize is available at https:// github.com/rpomponio/neuroHarmonize.

Optimized nested ComBat with Gaussian Mixture Model
Earlier ComBat variants assume that unharmonized data features are normally distributed and require the definition of a single batch variable (such as site or scanner).Optimized Nested ComBat (OPNComBat) was developed for scenarios where multiple siterelated technical or biological factors could explain batch effects (e.g., two sites with the same rsEEG amplifier but different headsets/caps).Thus, OPNComBat implements an iterative nesting process by creating subgroups from all the combinations resulting from the included batch variables.OPNComBat has been extended through Gaussian Mixture Models (OPNComBat-GMM) to handle features within a single batch with a particular subgrouping (i.e., bimodal or multimodal distributions) due to unobserved batch variables that should be mitigated (or biological variability to be preserved).OPNestedComBat-GMM has been validated on large cohorts with features extracted from lung computerized tomography images (Horng et al., 2022a, Horng et al., 2022b).OPNestedComBat-GMM is available at https://github.com/hannah-horng/opnested-combat.

HarmonizR
HarmonizR was developed as a variant of the original ComBat (Johnson et al., 2007) designed to handle missing values using an iterative splitting of the unharmonized dataset.This iterative resampling of each input feature results in a new dataset without missing data, which is subsequently harmonized using the original ComBat.Finally, this process is conducted for all the unharmonized features, preserving the maximum amount of original data.Further details about HarmonizR can be found elsewhere (Voß et al., 2022).

Statistical analysis
The evaluation of batch effects and harmonization comprised A) Qualitative visualizations, B) Statistical tests across batches, and C) Downstream analyses (age-rsEEG relations), as recommended in prior reports (Hu et al., 2023).Analyses were performed in Python version 3.9.13 and R version 4.3.

Qualitative visualization of batch and harmonization effects
Based on prior rsEEG publications (Bigdely-Shamlo et al., 2020; Kottlarz et al., 2020;Li et al., 2022), t-distributed Stochastic Neighbor Embedding (tSNE) was used to obtain a visual representation of batch effects.Following this, tSNE models were independently fitted to the unharmonized pooled dataset (to identify potential batch effects) and the harmonized versions (generated by each ComBat variant).As a dimensionality reduction algorithm, tSNE maps high-dimensional data to a lower-dimensional space while preserving local similarities.By visualizing the data in this reduced space, tSNE helps identify patterns, clusters, and trends that may not be apparent in the original feature space.Perplexity was set to 30 in all tSNE models based on prior publications on rsEEG spectral features (Kottlarz et al., 2020).The perplexity represents the number of nearest neighbors in the embedded space.Larger perplexities lead to more nearest neighbors and less sensitivity to local structure.
Further, the mean points in tSNE components 1 and 2 were calculated for each site to represent the central tendency of potential batches.Pairwise differences in the Euclidian distance between the mean points of each site were calculated, as well as the average cross-site distance.This procedure was conducted in the unharmonized dataset (missing values replaced with zero) and each of the harmonized datasets.A reduction of observed site-related clusters and centering of the points in tSNE maps, with smaller site-related distances after harmonization, supports the hypothesis of mitigated batch effects.
Feature-wise qualitative visualization of batch effects is recommended to complement tSNE plots (Hu et al., 2023).Probability plots were used to contrast the distribution of each rsEEG feature across sites.Probability plots depict the cumulative probability distribution of each feature (i.e., the likelihood of observing a specific value among all potential values of a given feature) for each site.Probability plots were computed feature-wise for the unharmonized and harmonized datasets.Cross-site convergence of the cumulative probability distributions after harmonization (i.e., lines from different sites tend to overlap) supports the hypothesis of reduced batch effects.

Quantification of batch and harmonization effects
Statistical testing across batches quantifies the harmonization performance (Hu et al., 2023); therefore, we evaluated cross-site differences in rsEEG features in unharmonized and harmonized datasets.Given the distribution of some rsEEG features and considering that some harmonization methods could not handle biological covariates, batch effects were quantified feature-wise with multiple Kruskal-Wallis models (not adjusting for covariates).For each Kruskal-Wallis model, ''site" was included as the independent variable, and each of the rsEEG features was the dependent variable.The effect size of batch-related differences was reported using the bias-robust epsilon squared statistic.Epsilon squared ranges from 0 to 1, representing weak [0.00 -0.04), moderate [0.04 -0.36), and strong effects [0.36 -1] (Lakens, 2022;Okada, 2013).Reduced epsilon squared values in the harmonized datasets support the hypothesis of appropriate mitigation of batch effects in the rsEEG features.

Assessing rsEEG and age relationships before and after harmonization
Earlier reports assessed age-related relationships in some spectral parameters estimated using the FOOOF method.Overall, older adults exhibit higher PW in delta, low-alpha, beta, and gamma bands with reduced aperiodic parameters, as well as decreased alpha CF and BW (He et al., 2019;Karekal et al., 2023;Merkin et al., 2021).As ComBat variants aim to preserve the existing relationship between the harmonized features and biological covariates of interest, we evaluated the associations between rsEEG features and age before and after harmonization.Ordinary least squares were used to assess potential relations between rsEEG features and age, examining the effect of harmonization methods on the direction, magnitude, and statistical significance.Therefore, for each subject, we computed the median feature value in the channels of the ''Posterior" Region Of Interest (ROI), namely, P7, P8, P3, P4, O1, Oz, and O2.Pearson R coefficients represent the direction and effect size of the relationship between age and the feature median value in the posterior ROI.Given the exploratory nature of this analysis and the need to explore the effect of harmonization on the estimated uncorrected p-values, multiple testing correction was not performed.Preserved direction of rsEEG-age associations (with comparable Pearson R magnitudes and significance) supports the hypothesis of appropriate mitigation of batch effects while retaining biological variability.
This manuscript adheres to the STROBE statement guidelines to provide clear reporting in observational studies (Vandenbroucke et al., 2007).The codes used for analysis and the datasets are available at https://github.com/alberto-jj/combat_eeg.

RsEEG spectral parameterization and batch harmonization
Visual inspection of the cross-site alignment of the unharmonized posterior PSD, oscillatory, and aperiodic fit vectors showed notable differences; see Fig. 2. As an unexpected finding in the Iowa dataset (with eyes open) and the pooled sample, older adults showed higher values in the averaged PSD, aperiodic, and oscilla-  2. On qualitative visualization of batch effects, all ComBat harmonization models achieved a reduced batch-related dispersion and centered the points in the tSNE maps; see Fig. 3. Compared to the unharmonized data, the average cross-site Euclidean distance was consistently reduced with all ComBat variants.OPNComBat-GMM, followed by HarmonizR, achieved the lowest average cross-site distance.Pairwise site distances in the LEMON dataset were better reduced with OPNComBat-GMM.The tSNE-derived Euclidean distances exhibited comparable performance of neuroCombat and neuroHarmonize.
The results of batch harmonization methods over summary statistics (mean, median, variance, and interquartile range) are displayed feature-wise in Supplementary Figure 5.Each feature was explored before and after harmonization via distribution (Supplementary Figures 6-8) and probability plots by site (Supplementary Figures 9-10).Raincloud plots showed the feature-wise distributions in the pooled sample (Supplementary Figure 11).Overall, the distribution of rsEEG features harmonized with ComBat models tended to cross-site centering, with reduced dispersion in most cases.By contrast, distribution tails of most rsEEG features were not correctly aligned; see Supplementary Figures 9 and 10.Replacing missing values with zeros introduced bimodal distributions and skewness in most rsEEG features; Supplementary Figure 11.The latter was predominantly observed in the extended alpha CF, resulting in negative values after harmonizing with most ComBat variants (except HarmonizR); Supplementary Figure 10.
The statistical quantification of batch effects using Kruskal-Wallis and epsilon squared tests is presented in Supplementary Figure 12.Before harmonization, the more prominent batch effects were identified in the unharmonized oscillatory alpha PW and aperiodic offset.After harmonization, most features (except by extended alpha CF) exhibited notably lower epsilon squared values.In line with tSNE qualitative visualizations, the lowest average cross-site differences (assessed through epsilon squared) were observed in HarmonizR, followed by OPNComBat-GMM.

Relation between rsEEG spectral features and age
The median delta PW and beta PW in the posterior ROI exhibited statistically significant positive relationships with age in the unharmonized data (either with zeros or missing values).Comparable direction, magnitude, and significance of the relationship were observed across all ComBat variants; Fig. 4. Conversely, correlations between age and posterior theta PW or alpha PW were not found.
On the other hand, the median values of posterior extended alpha CF, aperiodic offset, and aperiodic exponent exhibited statistically significant negative relationships with age in the unharmonized pooled dataset (either with zeros or missing values).Across ComBat variants, most correlations' magnitude and significance slightly increased (with equal direction) compared to the unharmonized data; Fig. 5.

Discussion
To our knowledge, this is the first study evaluating the harmonization of batch effects in rsEEG data with different ComBat variants.Our results indicate that batch effects were evident in rsEEG spectral features after data pooling.All the examined ComBat variants successfully mitigated batch effects in rsEEG spectral features, with OPNComBat and HarmonizR outperforming the others (Section 4.1).On the other hand, the rsEEG-age relation was preserved after harmonization, supporting the hypothesis that ComBat variants preserve biological variability.The age-related pattern of increased posterior beta PW with decreased posterior extended alpha CF, aperiodic offset, and aperiodic exponent is consistent with preliminary works on different neurophysiological modalities comparing old and young adults (Section 4.2).Although our findings are promising, we discuss the potential limitations of ComBat harmonization on rsEEG data, which should be addressed in future research for a broader application in clinical multisite studies (Section 4.3).

ComBat harmonization of rsEEG features: applications in clinical research
As supported by expert panels, spectral features are robust neurophysiological rsEEG descriptors with potential applications in healthy and diseased populations (Babiloni et al., 2020).Systematic reviews on age-related conditions such as Alzheimer's Disease (Modir et al., 2023) conclude that spectral features should be included as baseline features for the classification of patients.Similarly, the posterior extended alpha CF has been included as a sup-  et al., 2023b).Similarly, recent evidence shed light on potential neurophysiological mechanisms of rsEEG aperiodic activity, including excitatory/inhibitory balance and synapse timescale (Brake et al., 2024).Despite using a standardized preprocessing workflow and extracting robust neurophysiological features, we highlight that pooling multisite samples introduced batch effects that could affect the downstream analysis (Hu et al., 2023).
ComBat models effectively reduced batch effects in the pooled sample.As previously reported (Horng et al., 2022a), OPNComBat-GMM outperformed ComBat variants which cannot handle non-normal distributions (e.g., neuroCombat, neuroHarmonize).However, the superior performance of HarmonizR over OPNComBat-GMM might be explained by HarmonizR's capability to accept unharmonized features with missing values as input (characterized by unimodal distributions in our study).Of note, bimodal distributions were introduced in our results when the missing values in the oscillatory spectral features were replaced with zeros.Although replacing missing values with zeros might seem intuitive (i.e., missing values corresponding to a null area under the PW curve or undetected peaks), this practice could impact ComBat models that assume features normality, such as neuroCombat, neuroHarmonize, and HarmonizR.The latter has been suggested in microbiome count data, often zero-inflated and over-dispersed (Ling et al., 2022), and represents a potential explanation for our findings on delta PW -age relations (not previously reported).
On the other hand, we identified inappropriate outliers (i.e., negative CF values) introduced after harmonization in the distribution tails of features with larger variance, such as extended alpha CF (ranging from 0 -for models with missing extended alpha peaks À to 15 Hz in the unharmonized data).The re-scaling to inappropriate values was mainly observed in the Python-based ComBat variants that cannot handle missing values.By contrast, Harmo-nizR did not introduce outliers (zero values) in the extended alpha CF but maintained potential batch-related differences in the extreme values of most features, as shown in Supplementary Figures 9 and 10.These inconsistencies in re-scaling were also evident in oscillatory PW features, generating negative estimations of power that should be positive by definition.In line with this, simulation studies have revealed that outliers can affect ComBat performance and even introduce stronger batch effects if the outliers are not balanced and significantly differ in proportion across batches (Han et al., 2023).
Despite the remarkable performance of HarmonizR, modeling of biological covariates is not allowed with this method.The latter could eliminate the variance attributed to the biological characteristics of the sample (as if biological variability were batch effects that should be removed).Notwithstanding, all ComBat variants (including HarmonizR) showed consistent correlations of age and most rsEEG spectral features before and after batch harmonization, supporting that ComBat methods maintain the performance of downstream analyses while reducing batch effects (see Section 4.2).
Open science and the replicability crisis of neuroscience have grown the number of openly available datasets, facilitating multisite data pooling (increasing statistical power, sample size, and representativeness).Similarly, pooling multisite data while controlling for nuisance batch effects could be considered in multicentric studies on rare diseases where sample sizes at each site are small and potentially not representative of the population (Hu et al., 2023;Marzi et al., 2024).Moreover, clinical researchers adapting these harmonization pipelines could scale rsEEG datasets and evaluate more complex modeling strategies, such as artificial intelligence applications.However, further developments in harmonization methods may address the limitations of the evaluated ComBat variants (see Section 4.3).As a case in point, caveats in handling missing values, including biological covariates (in the case of HarmonizR), and introducing constraints to the re-scaling parameters (avoiding biologically unplausible features like negative CF or PW) could simplify the interpretability of results in clinical applications.

Associations between age, oscillatory, and aperiodic rsEEG feature
Our results of increased beta PW, with decreased extended alpha CF, as an age-related phenomenon extend a growing body of evidence from smaller studies.Thus, increased rsEEG beta PW (modeled with FOOOF at sensor space) has been reported in healthy older adults when compared to younger individuals (Karekal et al., 2023;Merkin et al., 2021).Similarly, the agerelated increase in beta PW has been documented in magnetoencephalography (MEG) when assessing neurodevelopment from childhood to adulthood in source space (He et al., 2019).These findings collectively underscore the critical role of beta band maturation as an age-dependent process.Despite the above, agerelated changes in beta oscillations have not been reproduced in early to middle childhood (Hill et al., 2022).This discrepancy likely stems from the fact that cognitive control (which relies on beta and higher frequency oscillations) does not fully develop until adolescence (Crone and Steinbeis, 2017;He et al., 2019).
Furthermore, alpha peak frequency (extended alpha CF in our study) has been consistently reported as a robust surrogate marker of age-related maturation of brain rhythms in neurodevelopment and aging research (Klimesch, 1999;Li et al., 2022;Moretti, 2015;Moretti et al., 2013).Large normative datasets documented an inverted U-shaped relationship between age and alpha peak frequency, with alpha peak frequency increasing in childhood and decreasing in aging (Li et al., 2022).Comparable observations have also been reported when the power spectrum has been parameterized using FOOOF, supporting that the shifting in frequency observed in aging is present even after removing the potential bias induced by aperiodic activity (McSweeney et al., 2023;Merkin et al., 2023).
In addition to our findings in oscillatory features, age-related changes in aperiodic parameters have been extensively detected across different neurophysiological modalities, including rsEEG, visual and auditory event-related potentials, MEG, and electrocorticography (He et al., 2019;Hill et al., 2022;Merkin et al., 2021).Healthy older adults typically exhibit a flattened aperiodic activity (i.e., lower slope) compared to young adults or children.Similarly, the offset of the aperiodic component tends to decline as an agedependent process.The flattened aperiodic spectrum in restingstate modalities has been considered a correlate of a large number of asynchronous neuronal spiking, which reflects a decoupling in the oscillatory behavior of cortical neuronal ensembles due to an increased excitation-to-inhibition ratio (mainly associated with the activity of GABAergic interneurons and pyramidal neurons, respectively).Besides, the observed age-related decrease in the aperiodic offset has been interpreted as a surrogate of a reduced broadband power; computational insights point out the loss of grey matter and slower spiking rate due to pruning and aging as a potential explanation (He et al., 2019).The impact of synaptic kinetics, excitatory/inhibitory balance, and network dynamics on aperiodic neural activity measured with rsEEG has been recently reviewed in a comprehensive publication (Brake et al., 2024).

Potential limitations of ComBat harmonization in multisite rsEEG studies
We highlight the importance of standardized experimental conditions to enhance comparability when pooling independent datasets for clinical research.The latter was not possible in our study as we were not involved in the design of the primary studies.For instance, the Iowa dataset (n = 14) only acquired eyes open rsEEG, representing a source of variability in the PSD.Similarly, despite including non-diseased individuals, the current study lacks a control group used as a reference.Synthetic data simulations could provide ground truth and facilitate the evaluation of batch harmonization (Marzi et al., 2024).
Besides, the reader must consider that our scope was focused on the performance of batch harmonization of rsEEG features, although the observed age-related patterns are consistent with prior reports.Dedicated studies and modeling methods may shed light on the biological effect of aging over rsEEG spectral features (Engemann et al., 2022a).Also, the scarcity of open rsEEG longitudinal datasets impedes assessing the test-retest reliability of our cross-sectional findings.Although longitudinal rsEEG studies evaluating the test-retest reliability of the posterior alpha, beta, and aperiodic FOOOF-derived features in sensor space are limited, their results support fair-to-excellent reliability (Pathania et al., 2021;Popov et al., 2023;Tröndle et al., 2022).Another challenge involves the artifact-free epoch imbalance across datasets, which could introduce differences in signal-to-noise ratio in PSD estimations and further FOOOF fitting.However, earlier reports support that PSD estimations tend to be robust (good test-retest reliability) with more than 40 s of artifact-free preprocessed signals (Gudmundsson et al., 2007).Future research on rsEEG data augmentation might provide reliable methods to handle the imbalance of artifact-free epochs (Lashgari et al., 2020).Moreover, the robustness of automated preprocessing workflows has recently been questioned in event-related potentials and is suggested to be comparable to using a high-pass filter only (Delorme, 2023).Despite this, an increased signal-to-noise ratio has been proven in multiple rsEEG datasets preprocessed using our multistep pipeline, with high test-retest reliability (Isaza et al., 2023;Suárez-Revelo et al., 2018).
Our approach did not incorporate source-space reconstruction due to its computational demands, which complicates the local analysis of extensive data collections.Thus, sensor space features might be insufficient to examine the contribution of adjacent sources of a given brain rhythm due to the volume conduction mixing across sensors.Nevertheless, occipital generators are the most prominent sources of PSD estimated in parietal and occipital sensors (Schaworonkow and Nikulin, 2022).As a computationally efficient alternative to source space reconstruction, spatiospectral analysis and Riemannian Geometry analysis of the rsEEG channel covariance matrix have proven good performance even in sparse low-density montages (Banville et al., 2023:;Bomatter et al., 2024:;Engemann et al., 2022b).Batch correction strategies for electrophysiological covariance matrixes based on Procrustes analysis and Geodesics have been proposed (Mellot et al., 2024(Mellot et al., , 2023)).CovBat, a ComBat-based algorithm that can handle covariance matrixes, has been tested on functional imaging, outperforming other ComBat variants (Chen et al., 2022;Hu et al., 2023).Novel methods handling multiple biological covariates for batch harmonization of covariance matrixes will contribute to applying stateof-the-art machine learning models in rsEEG clinical research.
Finally, missing data and its replacement could also reduce harmonization performance.Synthetic data supports that outliers' amount, location, and magnitude could lead to abnormal performance of ComBat (Han et al., 2023).Setting an appropriate strategy for handling outliers and missing data in multisite clinical research is mandatory when deriving biomarkers' cut-off and normative values from features harmonized with ComBat.Therefore, visualization and quantification of batch effects, as well as sensitivity analyses replicating the harmonization in features with and without outliers, are recommended to enhance the validity of the results (Han et al., 2023;Hu et al., 2023).Moreover, researchers working with multisite datasets must consider other potential factors affecting the harmonization performance, such as cross-site sample size differences (Parekh et al., 2022), variations on default parameters across ComBat methods, the singularity of the biological covariate matrix (Adamer et al., 2022), and data leakage in machine learning applications (Marzi et al., 2024).

Conclusions
In conclusion, batch effects were found in rsEEG sensor-level spectral features.All ComBat variants reduced batch effects and overall dispersion of rsEEG features.HarmonizR and OPNested-GMM ComBat exhibited the greatest reduction of batch effects.Batch harmonized Beta power, individual Alpha peak frequency, Aperiodic exponent, and offset in posterior electrodes showed significant relations with age before batch harmonization.ComBat variants preserved the observed relationships and increased the effect size.This openly available workflow can help harmonize sensor-space rsEEG spectral features in multisite studies, enhancing statistical power while addressing batch effects.By re-using or adapting our analysis workflow in clinical research, clinical neuroscientists have a method to exploit existing data collections and combine data across clinics/research centers, paving the way toward Big Data analyses using machine and deep learning models for diagnosis, prognosis, or treatment monitoring.
support is also received from the National Institute for Health Research (NIHR) Biomedical Research Centre in South London, the Maudsley NHS Foundation Trust, King's College London in the UK, and the Center for Innovative Medicine (CIMED) in Sweden.The views expressed are those of the author(s) and not necessarily those of the abovementioned institutions.

Fig. 1 .
Fig. 1.Demographic characteristics of the sample.On the top row (A), raincloud plots of age distributions across sites, in the total pooled sample, and by sex.On the bottom row (B), bar plots show the absolute frequency of females and males across centers and the relative frequency of sex for each dataset.

Fig. 3 .
Fig. 3. Harmonization effects across all rsEEG spectral features.A) T-distributed stochastic neighbor embedding (tSNE) components, perplexity = 30.Each plot shows the tSNE fit for the unharmonized (missing values zero-filled) and harmonized datasets with the four ComBat variants.Each dot depicts each subject (all rsEEG features reduced) and is color-coded by batch/site.Bigger dots represent the average of the tSNE components for each batch/site.B) Distance matrices show the Euclidean distance between batch/site-average values (big dots in tSNE maps).The average cross-site distance in unharmonized and harmonized datasets is computed from each matrix.

Fig. 4 .
Fig. 4. Correlations of age and oscillatory band powers (Before and after harmonization).From left to right columns: A) Unharmonized dataset (with zeros), B) neuroCombat, C) neuroHarmonize, D) OPNComBat-GMM, and E) HarmonizR.Dots indicate each subject's median value in the posterior Region Of Interest (ROI), color-coded by site/research center.Ordinary Least Squares fit was conducted on the pooled sample.Each frequency band power is represented in one different row.PW: Oscillatory band power.

Fig. 5 .
Fig. 5. Correlations of age and extended alpha center frequency, aperiodic exponent, and offset (Before and after harmonization).From left to right columns: A) Unharmonized dataset (with zeros), B) neuroCombat, C) neuroHarmonize, D) OPNComBat-GMM, and E) HarmonizR.Dots indicate each subject's median value in the posterior Region Of Interest (ROI), color-coded by site/research center.Ordinary Least Squares fit was conducted on the pooled sample.Each spectral feature is represented in one different row.CF: Center frequency.