Spectral features of resting-state EEG in Parkinson’s Disease: A multicenter study using functional data analysis

(cid:1) Increased generalised theta with posterior pre-alpha relative power are reproducible ﬁndings in Parkinson’s Disease. (cid:1) Functional Data Analysis is a potential tool to overcome potential limitations of epochs averaging in rs-EEG spectral analysis. (cid:1) Functional Data Analysis constitutes a reliable method to analyse epoch-to-epoch variability of the rs-EEG.


Introduction
Parkinson's Disease (PD) is one of the most frequent neurodegenerative disorders, with an estimated prevalence in 2019 of 8.5 million people worldwide, an increase of up to 155% compared to 1990 estimations. Thus, a remarkably growing PD-related burden on the whole society, including patients, family caregivers, and the healthcare system, has been pointed out (Ou et al., 2021). Strategies for managing neurodegenerative disorders, including PD, may benefit from non-expensive, non-invasive biomarkers that can complement clinical-based diagnosis, providing valuable insights for timely detection, study inclusion, and therapeutic success in clinical trials, as well as prognosis (Law et al., 2020).
In addition to protein aggregation and accumulation, a shared phenomenon in most types of neurodegenerative disorders is a continuum of heterogeneous clinical phenotypes that progress to a dementia syndrome, with either prodromal or concomitant neuropsychiatric symptoms. From a structural and functional perspective, progressive cholinergic dysfunction is a common feature in those neurodegenerative diseases that usually lead to dementia, such as Alzheimer's Disease (AD), Dementia with Lewy Bodies (DLB), and PD dementia (PDD) (Amalric et al., 2021;Kehagia et al., 2012;Schumacher et al., 2020b). Interestingly, a recent report on neuroimaging showed structural and functional abnormalities of cholinergic systems in de-novo PD patients and those diagnosed with PD mild cognitive impairment (PD-MCI), supporting this continuum of neurodegeneration-related changes (van der Zee et al., 2022).
Considering the lack of availability of expensive neuroimaging devices, and given the potential portability, non-expensiveness, non-invasiveness, and high acceptability of electroencephalogram (EEG) devices, many researchers have explored cholinergicrelated functional abnormalities in neurodegenerative disorders using EEG recordings during task-free conditions such as keeping the eyes closed or open in resting-state EEG (rs-EEG). Thus, spectral analysis of rs-EEG signals represents one of the relatively standard and well-operationalised approaches in sensor-level analysis.
Recently, rs-EEG correlates of cholinergic dysfunction were explored in dementia patients. Schumacher et al. reported significant associations between spectral features of rs-EEG (i.e. mean reactivity of alpha power in occipital channels) and the volumes of the nucleus basalis of Meynert, mainly observed in the PDD group (Schumacher et al., 2020b). Similarly, in cognitively normal PD, and PD-MCI patients, the mean relative power in the fasttheta band (5.5 Hz to 8 Hz), deemed ''pre-alpha" (Bonanni et al., 2008), seems to be associated with volumes of basal forebrain cholinergic nuclei independently of cognitive diagnostic status (Rea et al., 2021). Thus, many authors have used this cholinergicdysfunction model to explain PD-related and DLB-related spectral findings of slowing down in the rhythms of posterior derivations (Jaramillo-Jimenez et al., 2021;Massa et al., 2020;van der Zande et al., 2018).
One of the advantages of EEG against functional neuroimaging techniques is the higher time resolution achieved with EEG. Also, most spectral features, such as power spectral density (PSD) band powers, are usually computed from many epoch PSD vectors and frequently averaged at the end of feature extraction to get a summary/mean estimation for each sensor across all epochs. The rationale behind this is to increase the signal-to-noise ratio by reducing the epoch-to-epoch variability of the PSD and derivative features (Rodriguez et al., 1999). However, this approach may obscure important information since the epoch-to-epoch spectral representations of the rs-EEG may better describe time stability or variability of spectral features in clinical and healthy populations. As a case in point, Compressed Spectral Arrays (CSA) are epoch-to-epoch graphical representations of the PSD which have been used in DLB to show patterns of frequency shifting and variability, as well as changes in the relative power of specific sub-bands (Bonanni et al., 2008).
Previous omnibus epochs and channel averaging approaches may have also been applied due to limitations in traditional multivariate statistics. However, several statistical alternatives can account for temporal-related information by modelling spectral features as a function of each epoch rather than averaging the PSD across all the rs-EEG segments. Such functional data analysis (FDA) has been recently implemented in functional neuroimages (Tian, 2010) as well as event-related EEG data (Zhang et al., 2020). Despite the extended use of relative band powers and peak frequency as often explored features of the rs-EEG spectral analysis, we are unaware of studies using the FDA to model and compare these features in neurodegenerative disorders such as PD.
With all the above, this study aims 1) To identify differences in the rs-EEG spectral features of PD and non-PD healthy subjects using FDA and 2) To explore, in four independent cohorts, the external validity and reproducibility of the above spectral findings using both FDA and conventional averaged-epochs methods.
Considering our preliminary findings in one of these cohorts, where PD subjects exhibited a more pronounced slowing down (lower alpha/theta ratio) than non-PD control subjects in averaged epochs, we hypothesise that: 1) FDA will identify a more pronounced slowing down in the PD group than in the controls, represented by spectral features such as posterior pre-alpha power, dominant frequency, and alpha/theta ratio; and 2) Specific features of rs-EEG that reflect PD-related spectral slowing down in averaged-epochs approach, may yield more generalisable effects using FDA, hence, more external validity across datasets from different centres.

Study design and settings
This manuscript presents a secondary analysis of four crosssectional studies conducted in three countries (United States of America, Finland, and Colombia) to examine PD vs non-PD differences.
Study settings, locations, and periods of data collection in the primary studies are listed below for each subset:

California dataset
This dataset was collected in 2013. PD subjects were recruited from Scripps Clinic in La Jolla, California, and non-PD healthy controls were recruited from the local community or the patients' spouses. Written informed consent was provided by all participants, with the approval of the University of California, San Diego (George et al., 2013).

Finland dataset
This dataset was collected in 2018. PD patients and 20 non-PD healthy controls were recruited at the University of Turku, and Turku University Hospital, Turku, Finland (Railo et al., 2020). The local committee provided ethical approval for this research project, following the Declaration of Helsinki principles.

Iowa dataset
The dataset was collected from 2017 to 2019. PD patients and non-PD healthy controls were recruited at the University of Iowa, Narayanan Lab (Anjum et al., 2020;. All individuals signed a written informed consent form approved by the Office of Institutional Review Board prior to inclusion in the primary study.

Medellin dataset
The dataset was collected in 2016 in Medellin, Colombia. PD subjects were recruited from the outpatient neurology service of the Group of Neurosciences of Antioquia and the Group of Neuropsychology and Behavior, School of Medicine -University of Antioquia. Non-PD healthy subjects were recruited from local community volunteers and spouses of the patients. The study had the approval of the Ethical Research Committee of the local University. All participants signed informed consent before enrolment in the study. Our group has previously studied this data; see details elsewhere (Carmona Arroyave et al., 2019;Jaramillo-Jimenez et al., 2021).

Participants
A total sample of 169 subjects (85 non-PD; 84 PD) was included. Other neuropsychiatric disorders than PD and conditions affecting the rs-EEG were exclusion criteria. Eligibility criteria and the number of patients in each dataset are stated below:

California dataset
Clinical diagnosis of PD was made by a specialist in movement disorders at the PD and movement disorders department of the institution mentioned above (operationalised diagnostic criteria were not reported) (George et al., 2013;Jackson et al., 2019). PD patients were described as clinically typical (with mild to moderate severity of the disease) and cognitively normal according to screening tests for global cognitive performance (see section 2.3. Clinical assessment) (George et al., 2013). Initially, 32 individuals (16 non-PD; 16 PD) were matched by age, sex, global cognitive performance, and handedness. However, one patient was excluded from the primary study due to facial dyskinesia. Thus, our analysis included available data for 31 individuals (16 non-PD; 15 PD). Both clinical and rs-EEG assessments were collected during the ON phase of levodopa treatment for the PD group.

Finland dataset
PD patients were diagnosed either using the United Kingdom Brain Bank criteria (Gibb and Lees, 1988) or the Movement Disorder's Society (MDS) criteria (Postuma et al., 2015) for the clinical diagnosis of PD (Railo et al., 2020). Forty subjects were recruited (20 non-PD; 20 PD), all with spared cognition according to screening tests for global cognitive performance (see section 2.3. Clinical assessment). Age-matching was controlled in the primary study, but not sex, global cognition, or handedness. 13 PD subjects were assessed in the OFF phase, and the remaining patients were in the ON phase. For one PD patient, rs-EEG was not saved due to technical issues (Railo et al., 2020) and was not included here. Also, the rs-EEG from one non-PD participant could not be processed due to the bad quality of the recording (>60% of bad channels for interpolation). Therefore, we included a sample of 38 subjects (19 non-PD; 19 PD) in our study.

Iowa dataset
Clinical diagnosis of PD was conducted following the United Kingdom Brain Bank criteria (Anjum et al., 2020;Gibb and Lees, 1988;Singh et al., 2020), but information about cognitive diagnosis status was not available. A total sample of 28 individuals was recruited (14 non-PD; 14 PD) and included in our analysis. All PD subjects were examined clinically and with rs-EEG during the ON phase. The non-PD and the PD group were matched for age and sex and had comparable education, as reported by the authors of the primary study (Anjum et al., 2020).

Medellin dataset
Two neurologists and a trained physician made the clinical diagnosis of PD following the MDS criteria (Jaramillo-Jimenez et al., 2021;Postuma et al., 2015). Non-PD participants were volunteers selected from an open call. Seventy-two subjects were recruited (36 non-PD; 36 PD) and included in our analysis. Among PD subjects, 14 PD-MCI individuals were included (see section 2.3. Clinical assessment). Individuals were age, sex, and education matched. Clinical and electrophysiological assessments for the PD group were conducted in the ON phase.

Clinical assessment
In each dataset, we characterised demographics and common clinical variables such as Levodopa Equivalent Daily Dose (LEDD) (Tomlinson et al., 2010), motor subscale Unified Parkinson's Disease Rating Scale -Part III (UPDRS-III) (Goetz, 2003), and the global cognitive performance score of either Mini-Mental State Examination (MMSE) (Folstein et al., 1975) or Montreal Cognitive Assessment (MoCA) (Nasreddine et al., 2005). Although the cognitive performance was assessed in all centres, most datasets did not provide information about the cognitive diagnosis of PD or non-PD individuals. In the Medellin dataset, the PD-MCI subjects were diagnosed using MDS taskforce level I criteria (Litvan et al., 2012).
All the above datasets were standardised following the Brain Image Data Structure (BIDS) specification (Pernet et al., 2019) to operationalise subsequent analysis steps.
For preprocessing of all rs-EEG signals, we used a Python implementation of an already validated workflow previously published by our group (Jaramillo-Jimenez et al., 2021;Suarez-Revelo et al., 2018, 2016. Briefly, we built our fully automated pipeline by wrapping multiple preprocessing tools. Initially, robust average re-referencing, adaptative line-noise correction, and bad channel interpolation were performed using a Python reimplementation of the MATLAB PREP pipeline (Bigdely-Shamlo et al., 2015) done by the authors of the PyPREP library (Appelhoff et al., 2022). The goal of average re-referencing is to get a comparable reference scheme across datasets. Nevertheless, the average reference can be affected by noisy channels. Thus, the main goal of the PyPREP pipeline is to estimate a robust average reference by excluding these noisy channels from it. After PyPREP, a wavelet-enhanced Independent component analysis (ICA) artefact smoothing stage was carried out (Castellanos and Makarov, 2006). Thus, a 1 Hz high-pass Finite Impulse Response (FIR) filter was conducted to remove low-frequency drifts that would affect the following ICA stage. Then the FastICA algorithm, available at the MNE library (Gramfort et al., 2013), was applied to obtain both artifactual and brain components from the EEG signal. These components were then decomposed into wavelets, and wavelet thresholding smoothed out strong artefacts in the data (such as those originating from muscular or eye-blink components). Later, the signal was low-pass filtered at 30 Hz. Afterwards, five seconds-length epochs (5 s epochs) were segmented from the rs-EEG recordings. Finally, the automatic rejection of artifactual epochs was conducted as the last step of our pipeline, based on signal parameters such as extreme amplitude and spectral power values and statistical features like linear trends, joint probability, and kurtosis.
The number of available epochs in each dataset variated depending on each centre's protocols. Thus, the number of available 5 s epochs for each dataset was defined considering the common number of epochs across all subjects. For the California dataset, we included 28 epochs; for Finland, 19 epochs were analysed. In the Iowa dataset, 17 epochs were available, while the Medellin dataset had 44 epochs.

Spectral features extraction
Preprocessed signals were down-sampled at a common sampling rate of 500 Hz. For each of the included 5 s epochs, the PSD vectors were obtained at the sensor level using the psd_multitaper function implemented in MNE with adaptive filters, full normalisation (length and sampling rate), and a low_bias parameter (Gramfort et al., 2013). Then, for the PSD vectors of each epoch, the relative PSD in each frequency band was obtained using the bandpower function from the Yet Another Spindle Algorithm (YASA) library. This function computes the relative power of a given frequency band (i.e. estimated band power/total power within the1 to 30 Hz bandwidth) by approximating its area under the PSD curve using the composite Simpson's rule (i.e. decomposing the band-indexed area with several parabolas and then sum the area of these parabolas) (Vallat and Walker, 2021). Relative PSD was expressed as 0 -1 values in the delta (1 -< 4 Hz), theta (4 -< 8 Hz), alpha (8 -< 13 Hz), and beta (13 -< 30 Hz) band, as recommended by the International Federation of Clinical Neurophysiology guidelines for rs-EEG spectral analysis in clinical research (Babiloni et al., 2020a). Similarly, we assessed other relative PSD features, such as the ratio between alpha and theta relative PSD (Jaramillo-Jimenez et al., 2021). Also, we explored the potential effects of the theta sub-bands. Thus, we divided the theta band into slow-theta (4 -< 5.5 Hz) and fast-theta sub-bands, also defined as the ''pre-alpha" sub-band (5.5 -< 8 Hz) (Bonanni et al., 2008), and estimated the relative PSD in each band.
Additionally, we quantified the dominant frequency (DF) as the frequency with the maximum peak in the vector of PSD Schumacher et al., 2020a). For DF estimation, we used the Welch method with 50% overlap and a Hamming window with a frequency resolution of 0. 5 Hz.
Often, authors average the PSD vectors of all the available epochs/windows to increase the signal-to-noise ratio and get spectral features representing the whole recording time. These resulting PSD vectors represent each channel's mean (or median). The averaged-epochs PSD vectors are then used to compute the relative PSD by frequency band and the DF and DFV (Bonanni et al., 2008;Rodriguez et al., 1999;Schumacher et al., 2020a). Following this conventional ''averaged-epochs approach", all the abovementioned spectral features were quantified in each channel's mean/averaged PSD vector. Consequently, for the traditional averaged-epochs approach, DFV was defined as the standard deviation of the DF (obtained in the averaged-epochs PSD vector of each channel) Schumacher et al., 2018). On the other side, we quantified all the above features in the PSD vectors of each available epoch for the FDA approach. Also, in the FDA, we defined DFV as the absolute value in Hertz of the epoch-to-epoch difference in the DF (obtained in the PSD vectors of each epoch).

Statistical analysis
All statistical analyses were performed in the RStudio environment, running R version 4.2.1 (R Core Team, 2021). For statistical significance, a multiple-testing corrected alpha below 0.05 was considered.
Clinical and demographic group-related differences in the non-PD and PD groups using independent samples t-tests, Mann-Whitney U tests, or Chi-square tests according to the variables' distribution. Differences in age and clinical variables like UPDRS-III, disease duration, and LEDD were also assessed between datasets.

Averaged-epochs analysis of spectral features:
The conventional procedures for spectral analysis in fixed frequency bands require a logarithmic transformation of the relative PSD to achieve normal distributions (Schumacher et al., 2018). Thus, all the relative PSD values x ð Þ were Log x ð Þ transformed for statistical analysis.
To compare the averaged-epochs spectral features between non-PD and PD, we conducted a two-tailed independent samples t-test. To correct for multiple hypotheses testing, we used the False Discovery Ratio (FDR) method with a corrected significance threshold of 0.05 (setting the total number of channels in each dataset as the total number of comparisons) (Benjamini and Hochberg, 1995).
Given the matched design of most of the primary studies, we did not conduct any additional subject matching in this secondary analysis.

Functional data analysis of spectral features
After feature extraction, the relative PSD in each band and the DF and DFV estimations were modelled epoch-to-epoch using FDA (Ramsay and Silverman, 2005).
The idea behind FDA is to express these spectral features as a time series of observations across epochs in the form of functional means by each channel and each diagnostic subgroup. Thus, in FDA, the data is a set of smoothed curves constructed from regres-sions, optimised by cross-validation, instead of a set of vectors, as in classical multivariate analysis (Ullah and Finch, 2013).
A B-spline smoothing process was used (a set of k-functions / k f g k2N ), which fits a polynomial of degree m to convert the data (i.e. spectral features) in a smoothed curve as a function of w (i.e. epochs). These smoothed curves are hereafter referred to as functional data.
The bases of B-Splines (Eubank, 1999) allow approximation of all the spectral features, thus facilitating their handling. Its characteristics are based on the following properties: Each element of the base / k w ð Þ will be a Spline function of order m and partition s.
Any linear combination of Splines functions is a Spline function.
Any Spline function of order m on the partition s can be expressed as a linear combination of the basis functions.
The functional data then is defined as a basis of k-functions / k f g k2N such that any function can be represented by a linear combination of it, The optimal number of bases (k) to compute functional data was defined based on generalised cross-validation. In this process, we estimated the root mean squared error for each number of bases (k), that is, the squared root of the quadratic mean error calculated as the difference between the observed and the estimated spectral metric per each point/epoch of the smoothed curve.
Once the functional data has been constructed, concepts from scalar statistics can be extended to the functional, preserving the coherence of the concept but adding elements of analysis such as slopes, and inflexion points, among others. Thus, to characterise spectral patterns of each subgroup, following preliminary methods (Ramsay and Silverman, 2005), we computed descriptive functions defined as: To analyse group-related differences in spectral features using FDA, we conducted a functional pointwise t-test based on the permutation method (Coffey and Hinde, 2011;Ramsay and Silverman, 2005). The differences in the functional mean of spectral features in PD and non-PD groups were compared for each channel. For hypothesis testing, the p-values for functional t-tests were estimated based on 200 permutations per timepoint (i.e. at each epoch) as suggested by Ramsay, J. O. et al. (Ramsay and Silverman, 2005). Supplementary materials include a detailed description of the functional t-test estimation method. Supplementary Figures 2 and 3 show exemplary cases of FDA methods and FDA estimations across different datasets.

Reproducibility and external validity of the results across datasets
Pooling signals from multiple primary studies (with different headsets or research centres) into a single total sample (so-called ''EEG mega-analysis") might potentially lead to increased Type I error rates in the group-level analysis due to ''batch" effects (Bigdely-Shamlo et al., 2020;Li et al., 2022). These effects have been reported even when using a standard preprocessing pipeline for all included recordings, as we did in the preprocessing.
To overcome the above limitation and in line with the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) statement recommendations for reporting clear results about the generalizability of the findings (Vandenbroucke et al., 2007), we examined the external validity and reproducibility of our results by conducting the PD vs Non-PD comparisons independently in each of the datasets (i.e., Iowa, California, Medellin, Finland).
As we believe in open science philosophy, we freely provide the codes used for feature extraction and FDA analysis; the codes can be found on the following GitHub link (https://github.com/alberto-jj/functional_data_analysis_rs_eeg/). Also, the raw and derivative features are available for all studies, except for the Medellin dataset, due to ethical restrictions that constrain us from publicly liberating this data.

Clinical and demographic features
Given the matched design of most primary studies, there were no differences in age or sex of the PD and non-PD subgroups. In the Finland dataset, where matching was not explicitly reported, we did not find any significant difference either. As previously reported in the Medellin dataset (Jaramillo-Jimenez et al., 2021), there were differences in MoCA scores of PD and non-PD subjects. See Table 1 for an overview of each dataset's participant demographics and clinical variables.
In PD patients from the different datasets, there were no significant differences in LEDD (p > 0.783) or duration of disease (p > 0.158). However, we observed significant differences in UPDRS -III scores between centres, with lower scores in the Iowa dataset when compared to each of the other centres (California vs Iowa, FDR < 0.001; Finland vs Iowa, FDR = 0.006; Medellin vs Iowa, FDR < 0.001), see Table 1 and Supplementary Figure 4.
In PD and non-PD subjects, age was comparable between centres after correcting p-values for multiple testing (FDR > 0.05). However, before the multiple testing correction, there were significant differences between California and Medellin datasets as these two samples exhibited the lower mean ages among the four included centres, see Supplementary Figure 4A and Table 1.
The sex proportion variated across datasets; see Table 1. Significant differences in the sex proportion were observed between centres after FDR correction, especially when comparing Finland and Medellin datasets (FDR = 0.036). Similarly, uncorrected pvalues showed significant sex differences, with a higher proportion of male subjects in Medellin (Medellin vs California, p = 0.041; Medellin vs Iowa, p = 0.029).

Rs-EEG spectral features
As expected, due to different headsets, amplifiers, and acquisition parameters, exploratory analyses showed potential centrerelated differences in the spectral features across some datasets (i.e. batch effects) despite our standard preprocessing and feature extraction methods, see Supplementary Figure 5. Thus, we decided against pooling all the datasets into a single set.
3.2.1. PD vs Non-PD: Averaged-epochs analysis in each dataset Fig. 1 and Supplementary Figures 6 and 7 summarise the most relevant findings concerning the sensor level of mean spectral features at each centre.
Detailed group-related differences in each dataset are reported below.
3.2.1.1. California dataset. After correcting for multiple testing, significantly greater relative PSD in the theta and pre-alpha frequency bands were evident in all the channels for the PD group, see Fig. 1. The differences in the theta band were mainly due to pre-alpha contributions since we did not observe significant findings when comparing PD and non-PD subjects in the slow-theta sub-band, see Supplementary Figure 8.
Besides a couple of electrodes significant at the uncorrected pvalue < 0.05 in the alpha/theta ratio, DF, and DFV (see Fig. 1), there were no other consistent significant findings in this dataset.
The mean relative PSD in the theta band was significantly higher in PD patients, longitudinally across frontal and posterior channels, excluding the midline, see Fig. 1. Similar significant differences were observed in the slow-theta sub-band; see Supplementary Figure 9. Pre-alpha relative PSD showed differences in posterior and right hemisphere channels but only at the uncorrected-p level, see Fig. 1.
For DF, three electrodes of the right frontal, centroparietal, and left frontocentral cap positions were significantly different at the uncorrected p-value < 0.05, see Fig. 1. There were no other major significant findings in the remaining features.
3.2.1.3. Iowa dataset. FDR corrected results showed a significantly higher pre-alpha relative PSD in PD patients at all analysed channels. Similarly, in the same sensors (except by F7), a significantly higher theta PSD was evidenced in PD; see Fig. 1. The analysis of the slow-theta sub-band did not demonstrate group-related differences, Supplementary Figure 10.
The mean relative PSD in the beta band was significantly lower in PD patients in one left frontal, one right centroparietal, and seven posterior leads. These results remained significant after correcting for multiple testing, Supplementary Figure 11.
Interestingly, in most of the channels of this eyes-open dataset, DFV was significantly higher in the non-PD group, see Fig. 1.
PD subjects exhibited lower alpha/theta ratio and lower DF, consistently seen in posterior and right parietal leads. Although these results were no longer significant after p-values correction, see Fig. 1. No other significant findings were observed in this dataset.
3.2.1.4. Medellin dataset. FDR corrected results pointed out a significantly higher relative PSD in the theta and pre-alpha frequency bands in all channels (except F7) of PD patients. The significant differences in the theta band had contributions from both slow-theta (25 channels) and pre-alpha sub-bands (31 channels), with higher values in the PD group for both sub-bands after the multiple testing correction, see Fig. 1 and Supplementary Figure 12.
Similarly, all channels' alpha/theta ratio was significantly higher in PD patients. At the uncorrected level, we found that DF was significantly lower in PD in two channels (one right frontocentral and one left occipital), see Fig. 1. There were no other significant findings in the remaining features.

PD vs Non-PD: FDA of spectral features
After the generalised cross-validation process, the optimal number of bases (k) for PSD by bands and the alpha/theta ratio was defined as follows: k California PSD ¼ 21; k Finland PSD ¼ 10; k Iowa PSD ¼ 12; k Medellin PSD ¼ 21.
Results of the FDA group differences are summarised in Figs. 2 and 3, as well as in Supplementary Figure 13.
Specific results of the FDA in each dataset are described below.

California dataset.
In the PD group, significantly higher values in the functional mean of the theta relative PSD were found for most epochs. Findings in the theta band were also observed in most channels. Consistent differences were evidenced from the first epochs in the posterior region and could be either consecutive across multiple epochs, or fluctuant. These differences appear subsequently in central leads and later in anterior channels, see Fig. 2. Comparable findings were seen in the pre-alpha relative PSD, see Fig. 3. Conversely, exploratory analysis of the slow-theta subband showed scarce significant differences between both subgroups with a non-congruent direction of the difference, see Supplementary Figure 13. Other significant findings were observed in the delta, alpha, beta, and alpha/theta relative PSD, but these differences were diffuse and not evident for most of the evaluated data. Similarly, DF and DFV showed significant diffuse findings not detected in most epochs, see Figs. 2 and 3.

Finland dataset.
In PD subjects, the most prominent significant findings included higher values in the functional mean of the theta relative PSD. These consistent differences were evident in the first epochs in the posterior and central leads, showing significance across multiple consecutive epochs, see Fig. 3. Exploratory analysis of the sub-bands within the theta range showed a greater number of significant differences in the slow-theta sub-band, with fluctuant differences across epochs evidenced in multiple channels. By contrast, in the pre-alpha sub-band, significant differences between PD and non-PD were predominant in consecutive epochs of posterior leads, see Supplementary Figure 13. Besides, the relative PSD in the alpha band was significantly lower in some central and anterior channels. Similarly, the alpha/theta ratio was lower in PD patients, showing a similar topographic distribution of the significantly different electrodes.
Among other significant findings, we observed a lower DF in PD patients, but these findings were not observed for most of the evaluated epochs, see Fig. 3.

Iowa dataset.
There were prominent differences across most epochs in the theta band for most channels. PD patients showed significantly higher values in the theta relative PSD, see Fig. 2. Exploratory analysis of the theta band showed negligible contributions from the slow-theta sub-band. Most of the differences observed in theta were reproduced when comparing the pre-alpha sub-band in PD and non-PD groups, see Supplementary  Figure 13. Besides, the central-anterior delta relative PSD and the central-posterior beta relative PSD were significantly higher in PD patients, see Fig. 2. The alpha/theta ratio was significantly lower in PD patients. Still, the proportion of significant values across epochs was smaller than the observed for relative PSD in theta or pre-alpha bands.
The DF also showed the most consistent significant differences in the posterior leads, where PD patients presented lower values. Of note, we found multiple channels and epochs with a significantly reduced DFV in the PD group, Fig. 3.

Medellin dataset.
In all channels of PD patients, we observed significantly lower values of the theta relative PSD, see Fig. 2. The greatest size of the difference was observed in the posterior leads. Exploratory analyses of the theta band showed similar results for the pre-alpha sub-band. Thus, pre-alpha relative PSD was lower across epochs and channels in PD subjects. By contrast, differences in the slow-theta sub-band had smaller difference sizes and were less consistent across channels and epochs, see Supplementary Figure 13. We also found a significantly lower alpha relative PSD in PD patients, particularly in posterior and frontal leads, showing the most sustained differences in posterior leads. Significant differ- Fig. 1. Differences between non-PD and PD in spectral features using averaged epoch analysis. Topomaps represent the differences between the control (non-PD) and the Parkinson's Disease (PD) groups. Each row includes the most consistent PD-related differences in the conventional averaged-epochs analysis. Each of the plotted spectral features is placed in a column. The colour bars represent t-test values, where blueish colours show a higher value of a given feature in the PD group, and yellowish colours indicate lower values of a spectral feature in the PD group. Channels showing significant differences are plotted. Channels in red dots were significant at the False Discovery Ratio-corrected level, while channels marked as a white dot were only significant at the uncorrected p-value. Rel. PSD: relative Power Spectral Density.
A. Jaramillo-Jimenez, D.A. Tovar-Rios, J.A. Ospina et al. Clinical Neurophysiology 151 (2023) 28-40 ences in the alpha/theta ratio were remarkably observed in all channels, see Fig. 3. Findings in the delta and beta bands were considered negligible as they were not observed in most of the analysed data, see Fig. 2.
The DF in most posterior and some central channels was significantly lower in PD subjects across most of the analysed epochs; there were other widespread fluctuant differences in DF observed across the analysed data, see Fig. 3.

Summary of findings using conventional averaged epochs and FDA
In PD subjects, multiple testing corrected spectral analysis with averaged epochs showed a significantly higher relative PSD in the theta band evident in all the datasets. Theta band differences were reproducible using the pre-alpha relative PSD in three out of four cases, except in the Finland dataset, where pre-alpha differences (evidenced in similar channels of theta differences) were no longer significant after FDR correction.
In line with the findings using averaged epochs, the FDA approach showed statistically significant higher theta relative PSD in PD patients, at most channels during most of the evaluated epochs. Conversely to conventional averaged-epochs methods, FDA showed pre-alpha differences (particularly in posterior channels) in four datasets, accounting partly for the observed PD-related differences in the full theta band. PD-related differences in the slow-theta sub-band were found in two of four datasets (Finland and Medellin).
Using conventional averaged epochs: The alpha/theta ratio differences were reproduced in two out of four datasets (Finland and Medellin), and some findings were significant at the uncorrected pvalue in a third dataset (Iowa). Conversely, FDA did not show significant differences in the alpha/theta FDA across many consecutive epochs. The latter happened in the Finland dataset (most leads significant in averaged-epochs analysis and scarcely significant across epochs in FDA), and Iowa (central and posterior region significant in conventional analysis at the uncorrected level), and not significant across multiple consecutive epochs nor many channels as in the averaged-epochs approach.
None of the datasets showed significant differences in the delta or alpha bands using averaging epochs. On the other hand, FDA showed lower alpha relative PSD in the posterior (Medellin) and anterior (Finland) channels, but this was not reproduced in other datasets.
In the Iowa dataset of rs-EEG with eyes open, we found a significantly lower DFV in the PD group. This was significantly consistent in both the averaged-epochs approach and FDA. Finally, in PD subjects from the same dataset, we found a significantly lower beta relative PSD in posterior and central leads (consistently in averaged epochs and FDA) and a lower delta in some epochs of central channels (only with FDA). Fig. 2. Differences between PD and non-PD in the relative PSD at each frequency band using FDA. Heatmaps show the differences between PD and non-PD subjects in relative PSD features. Each of the major grids shows the heatmaps for each studied dataset. Each column represents each spectral feature. The relative Power Spectral Density (PSD) in the delta, theta, alpha, and beta bands are depicted from the left to right columns. The colour bar represents the t-values of the differences. Yellowish colours indicate positive t-values (i.e. higher values of a given feature in the control group (non-PD) compared to Parkinson's Disease (PD subjects). In comparison, blueish colours represent negative t-values (i.e. higher values of a given feature in the PD group compared to the non-PD group). Greyish colours indicate lower t-values, closer to 0. Within each heatmap, channels are plotted on the Y-axis and are segmented into three regions (anterior, central, posterior) separated by a white line. In each heatmap, epochs are plotted on the X-axis and variate from dataset to dataset depending on the minimal length of recordings at each centre. The epochs significantly different in the functional permutation t-test are marked with an asterisk (*). Consistent findings across all datasets are represented inside a red square.
Additional exploratory analysis of the findings in relative theta and pre-alpha PSD showed that those patients in the Finland dataset with an MMSE score below the median of the PD group had a significantly higher theta and pre-alpha relative PSD in the O1, Oz, P7, P8, and T8 channels. Consistently, those in the Iowa dataset with a MoCA score below the median of the PD group showed a significantly higher Theta power in the O2, Oz, and P8 channels; for pre-alpha, we observed similar trends. For PD groups in the Medellin and California datasets, the hypothesis tests were nonstatistically significant but had a common direction of the effect in most of the posterior leads (i.e. the higher the theta and prealpha power, the lower the cognitive performance); see Supplementary Figures 14-17. There were no statistically significant differences in theta or pre-alpha values based on LEDD, UPDRS -III scores, or disease duration staging.

Discussion
In this study, we identified the differences in the rs-EEG spectral features of PD and non-PD subjects using FDA and conventional averaged-epochs approaches. Besides, we aimed to explore the external validity of the above findings in four independent PD cohorts. Overall, in PD patients, we found a greater theta relative PSD. This was the most reproducible finding across all four datasets using both FDA and averaged epochs. However, when subdividing the theta sub-bands (slow-theta and pre-alpha), FDA revealed significantly increased pre-alpha relative PSD in posterior leads of PD patients. Pre-alpha findings were replicated in all datasets with FDA. By contrast, conventional averaged epochs only showed significant pre-alpha differences in three out of four datasets.
The direction of the differences mentioned above was consistent across the analysed rs-EEG data from different research centres. Therefore, in rs-EEG at the sensor level, a higher generalised theta with prominent posterior pre-alpha relative PSD was the most reproducible PD-related spectral pattern.
Quantitative analysis of PSD at the sensor level has been a widely adopted procedure for rs-EEG clinical research (Babiloni et al., 2020a). Spectral features of the rs-EEG have also been recognised as promising markers for many neurodegenerative diseases that can lead to dementia, such as PD (Shaban and Amara, 2022). Thus, there is a shared rs-EEG pattern between many heterogeneous and overlapped entities like PD, AD, and DLB: an increase in the power of slow frequency bands like delta and theta, whilst faster frequencies like beta and gamma tend to reduce their power. In addition to these PSD variations, left-side frequency shifting might be present in certain cases (Babiloni et al., 2020b).
Most neurodegenerative disorders are prone to variations in the intrinsic oscillatory properties of the brain (Lopes da Silva, 1991). Some hypotheses suggest that this particular signature in rs-EEG rhythms could fit the so-called thalamocortical dysrhythmia Fig. 3. Differences between PD and non-PD in the other studied spectral features of rs-EEG using FDA. Heatmaps show the differences between PD and non-PD subjects in other studied features. Each of the major grids shows the heatmaps for each analysed dataset. Each column represents each spectral feature. The alpha/theta relative Power Spectral Density (PSD) ratio, the pre-alpha relative PSD, the dominant frequency (DF), and its variability (DFV) are depicted from left to right. The colour bar represents the t-values of the differences. Yellowish colours indicate positive t-values (i.e. higher values of a given feature in the control group (non-PD) compared to Parkinson's Disease (PD subjects). In comparison, blueish colours represent negative t-values (i.e. higher values of a given feature in the PD group compared to the non-PD group). Greyish colours indicate lower t-values, closer to 0. Within each heatmap, channels are plotted on the Y-axis and are segmented into three regions (anterior, central, posterior) separated by a white line. In each heatmap, epochs are plotted on the X-axis and variate from dataset to dataset depending on the minimal length of recordings at each centre. The epochs significantly different in the functional permutation t-test are marked with an asterisk (*). Consistent findings across all datasets are represented inside a red square.
A. Jaramillo-Jimenez, D.A. Tovar-Rios, J.A. Ospina et al. Clinical Neurophysiology 151 (2023) 28-40 model. From tinnitus and depression research, the thalamocortical dysrhythmia model has been proposed as an integrative framework that might encompass the main findings in neurodegenerative disorders. Slowing down in thalamocortical loops could result in a constant cross-frequency coupling between lower frequencies (like theta and alpha) and higher frequencies such as beta and gamma. This cross-frequency coupling could then impair information processing in specific cortical columns (mediated by high frequencies), perpetuating the cycle of alpha power reduction and alpha peak frequency shifting to the left (increasing power in the theta and, subsequently, delta bands) (Franciotti et al., 2020;Llinás et al., 1999;Vanneste et al., 2018). Nevertheless, many other mechanisms, such as the integrity of the cholinergic systems, might play a crucial role in the underlying process behind rs-EEG power changes in neurodegenerative disorders (Rea et al., 2021;Schumacher et al., 2020b). Given preliminary evidence supporting consistent rs-EEG findings in multicenter studies on other neurodegenerative diseases (Bonanni et al., 2016), as well as recent findings in the pre-alpha sub-band in a small cohort of non-demented PD subjects (Rea et al., 2021), we included the pre-alpha relative PSD in our analyses. In line with these findings, we observed an increased theta relative PSD in all datasets for most channels. However, in the fasttheta (i.e. pre-alpha) sub-band, the differences observed in posterior leads exhibited greater effect sizes, see Supplementary Fig ure 18. The posterior dominant rhythm (generally with peak highest power within the alpha band) is generated by the interaction of multiple cortical components, including occipitotemporal and occipitoparietal, as well as other cortico-cortical and thalamocortical sources integration (Barzegaran et al., 2017;Garcés et al., 2013). Slowing down of the posterior dominant rhythm is usually reported in chronic encephalopathy, sedation, delirium, and neurodegeneration (Carrarini et al., 2023;Kimchi et al., 2019;Smith and Smith, 2005). Increased clinical severity is generally associated with greater power in lower frequencies such as theta and delta. Thus, we hypothesise that our theta and pre-alpha findings in posterior leads might potentially reflect the initial indirect effects of mild encephalopathy due to neurodegeneration over brain dynamics that generate the posterior dominant rhythms. Also, pre-alpha differences were more prominent using FDA than conventional averaged epochs, which reflected constantly significant epoch-toepoch differences in posterior electrodes that might disappear in a traditional averaging process. The latter was observed in the Finland dataset, where conventional epochs-averaging only showed significant results in the pre-alpha sub-band at the uncorrected p-values. Of note, pre-alpha differences were not confined to the eyes closed rs-EEG condition, as higher pre-alpha was robustly observed in PD during the open eyes condition (Iowa dataset), emphasising the important contribution of theta rhythms rather than an exclusive effect of alpha slowing (as alpha activity is not prominent in the eyes-open condition) (Mari-Acevedo et al., 2019). Therefore, this might reflect the initial stages of crossfrequency coupling phenomena related to neurodegeneration (i.e. increase in power of lower frequency bands). The latter might be supported as we also found lower alpha relative PSD in FDA in the Medellin dataset (which included 14 PD-MCI subjects), potentially reflecting a more advanced stage of cross-frequency coupling (i.e. increase in theta and pre-alpha power, plus DF shifting to the left resulting in a reduction of alpha power) (Vanneste et al., 2018).
Exploratory analysis of our findings in theta and pre-alpha was not concluding about the capability of these features to stage patients based on clinical variables. In two datasets (Finland and Iowa), we found that those PD subjects with a global cognitive performance below the group's median exhibited significantly higher relative theta PSD values. Nevertheless, these results were not observed in all datasets and should be confirmed in subsequent analyses. Thus, more comprehensive statistical modelling methods that account for data pooling, control of sensor topography (and ideally for volume conduction), and include clinical outcomes and confounders may increase the applicability of these spectral features in the clinical praxis.
In addition to the power changes discussed above, evidence indicates that DF at posterior leads is present in DLB and PDD even at predementia stages such as MCI (Bonanni et al., 2015(Bonanni et al., , 2008Schumacher et al., 2020a). We did not observe major significant findings in DF, except for the FDA results in the Medellin dataset, which included a subsample of PD-MCI patients. Of note, when using the averaged-epochs approach, differences in DF were scarce in the Medellin dataset. Still, these differences were robustly significant across the majority of epochs when using FDA. DF in the Iowa dataset also showed some significant findings in posterior epochs using FDA; this dataset had two subjects with pronounced lower MoCA test scoring (19 and 22) but lacked information confirming a PD-MCI diagnosis in the primary study. Left-side shifting of the DF can be either fluctuant or stable across epochs (DFV), but differences in the DFV in the eyes closed condition are still not conclusory and might show different directions (Bonanni et al., 2008;Stylianou et al., 2018).
Interestingly, a recent publication showed that healthy controls had a greater DFV change in the eyes closed/eyes open ratio compared to subjects with multiple types of dementia (Jennings et al., 2022). Unfortunately, we did not have available closed-eyes data for Iowa, but we observed a greater DFV in non-PD subjects only in this eyes-open dataset. Indeed, studying the open eyes condition might contribute to the studies of rs-EEG as one can consider dynamic features such as reactivity to the eyes opening, closely related to cholinergic integrity in both cognitively spared and impaired patients with neurodegeneration (Rea et al., 2021;Schumacher et al., 2020b). Also, as opposed to cortico-cortical relative PSD values, the brain generators of DF and posterior alpha peak frequency are thought to be thalamic, facilitating the integration of this feature within the thalamocortical dysrhythmia model (Arnaldi et al., 2017;Law et al., 2020). In line with the above, recent systematic reviews found high consistency regarding DF reduction with increased pre-alpha power as prevalent findings in diseases with associated Lewy Body pathology (i.e. DLB and PD with or without dementia) compared to the marked alpha power reduction observed in AD patients (Law et al., 2020). Our results support the above observations, particularly when using the FDA to detect PDrelated differences in DF and pre-alpha power, as conventional averaging might cause a spectral masking effect in subtle changes of peak frequency and potentially less precise effect size estimations due to outlier contamination. Thus, for DF, FDA showed more significant differences than averaging across epochs, particularly in the Medellin dataset, and slightly significant results in Iowa on a couple of posterior channels across multiple epochs.
We have previously reported a generalised lower alpha/theta ratio in PD patients from the Medellin dataset (Jaramillo-Jimenez et al., 2021). Unfortunately, these prior findings were not generalisable across the four datasets with either conventional epochsaveraging or FDA methods. However, FDA resulted in a much more informative method as we realised that most of the potential significant findings we obtained in other datasets (i.e., multiple significant findings across all regions in the Finland dataset) were not evidenced in FDA across all the epochs and might be highly influenced by extreme values as showed in some high t-test values that were not constant in consecutive epochs. Nevertheless, further examination of our current results, including the modelling of confounding factors, should be conducted for better estimations regarding the alpha/theta ratio.
In addition, FDA in clinical neurophysiology might be a valuable complement to the analysis of Compressed Spectral Arrays (CSA), i.e., epoch-to-epoch representations of the power spectrum. The CSA has been a helpful tool for neurodegenerative diseases involving Lewy Bodies such as DLB, given their differences in the DF and the reported DFV in many studies. Therefore, FDA can model and compare epoch-wise the potential differences potentially related to the diagnosis. The latter can contribute to the interpretation of CSA and overcome the loss of epoch-to-epoch information when collapsing the time dimension into a single average. FDA can reflect the proportion of significant differences in a specific feature, and provide detailed estimations of the size of these differences, facilitating the examination of potential spurious results due to outliers by showing the magnitude and direction of the differences at each time point. This is also important for feature selection in ML models, particularly those used for event-related recordings. These models usually take each epoch (or trial) as an input value for training, validation, or testing sets (Shaban and Amara, 2022). Therefore, researchers might consider a priori selecting meaningful features from the EEG that show robust findings across all trials and potentially get a better classification performance.
By examining the results of this paper, the reader could consider FDA as a potential tool to overcome potential limitations of epochs averaging for spectral analysis. Also, the multicentric approach provides a broader panorama, allowing the reader to examine which findings can be reproduced across subsets of data from different populations. Also, as external validity is one of the flags of this paper, this study uses standard and automatic preprocessing and feature extraction workflows to control for differences in analytic procedures, subjective inclusion/exclusion of rs-EEG data, and other potential confounders to the interpretation of the results.
Despite our efforts, this study has some limitations that must be considered when examining our results. First, the study's retrospective nature did not allow us to control the experimental conditions of the recording and data collection nor determine causal relationships. Also, due to the heterogeneity of the rs-EEG data, it was not possible to pool all the datasets into a single set after conducting standardised automatic preprocessing and feature extraction pipelines. The latter could affect the significance of comparisons between PD and non-PD subjects, as the low sample size in some datasets reduces the statistical power. Still, the consistency of results from different centres using various recording devices indicates the reliability of the rs-EEG sensor-level spectral markers to differentiate subjects with PD from healthy controls. Despite a generalisable direction of the effect in multiple independent cohorts, external validity in clinical scenarios must include the differentiation among potential differential diagnoses. Unfortunately, we could not include another age-matched group with a different disease as all datasets were already collected in primary studies previously conducted. Future efforts by our group aim to include cohorts encompassing other neurodegenerative and neuropsychiatric disorders from international collaborators, as well as to implement potential harmonisation alternatives, such as ComBat, and longitudinal ComBat, based on Generalised Additive Linear Mixed-effects Models for statistical control of some of the confounders and determinants of the centre/headset-related variability (Beer et al., 2020). However, harmonisation methods for EEG signals are still underdeveloped and scarcely applied as a standard procedure in multicentric studies on rs-EEG and neurodegeneration . In addition, we find particularly important the need for parametrising aperiodic components obtained in the rs-EEG spectra (Donoghue et al., 2020) and assessing features with different conceptual natures (such as complexity/regularity or connectivity) to complement the information provided by spectral features (Al-Qazzaz et al., 2014). With all this in mind, we encourage the scientific community to continue with the joint effort toward more valid and reproducible research on rs-EEG.

Conclusions
In conclusion, with FDA, the most reproducible findings in the PD groups were an increased generalised theta and posterior prealpha. Besides, we found a lower alpha/theta ratio and DF in PD subjects, but this was not generalisable across all datasets. Also, DFV was significantly higher in non-PD only in the eyes-open dataset. Further studies should validate these findings in cohorts involving multiple neurodegenerative and neuropsychiatric diagnoses.
Finally, for spectral analysis of rs-EEG, FDA may constitute a reliable alternative to the conventional averaged-epochs methods. We propose that FDA can contribute to expanding the scope of rs-EEG research and feature engineering.