Two-channel EEG based diagnosis of panic disorder and major depressive disorder using machine learning and non-linear dynamical methods

The current study aimed to investigate the possibility of rapid and accurate diagnoses of Panic disorder (PD) and Major depressive disorder (MDD) using machine learning. The support vector machine method was applied to 2-channel EEG signals from the frontal lobes (Fp1 and Fp2) of 149 participants to classify PD and MDD patients from healthy individuals using non-linear measures as features. We found significantly lower correlation dimension and Lempel-Ziv complexity in PD patients and MDD patients in the left hemisphere compared to healthy subjects at rest. Most importantly, we obtained a 90% accuracy in classifying MDD patients vs. healthy individuals, a 68% accuracy in classifying PD patients vs. controls, and a 59% classification accuracy between PD and MDD patients. In addition to demonstrating classification performance in a simplified setting, the observed differences in EEG complexity between subject groups suggest altered cortical processing present in the frontal lobes of PD patients that can be captured through non-linear measures. Overall, this study suggests that machine learning and non-linear measures using only 2-channel frontal EEGs are useful for aiding the rapid diagnosis of panic disorder and major depressive disorder.


Introduction
Anxiety disorders are a group of psychiatric disorders characterized by excessive fear, anxiety, and related behavioral disturbances (DSM-V, 2013). Of the anxiety disorders, panic disorder (PD) is quite prevalent. It is characterized by the onset and recurrence of unexpected panic attacks and a developing fear of future attacks (DSM-V, 2013). Panic attacks are acute periods of intense fear and discomfort accompanied by palpitations and trembling (Hoppe et al., 2012), but they are not unique to PD (DSM-V, 2013): they also typify conditions such as agoraphobia, substance abuse, and major depressive disorder (MDD), which often occur together with PD (DSM-V, 2013; Hoppe et al., 2012; Skapinakis et al., 2011). PD is especially comorbid with MDD, with up to 50% of patients who have PD experiencing an episode of MDD (Baldwin, 1998; Hirschfeld, 2001; Kaufman and Charney, 2000), which further complicates accurate diagnosis of PD and MDD (DSM-V, 2013; Goodwin et al., 2005; Locke et al., 2015). Consequently, PD has been under-diagnosed or misdiagnosed, particularly in primary care settings (Goodwin et al., 2005; Herr et al., 2014; Huffman and Pollack, 2003; Locke et al., 2015). Diagnostic methods that accurately distinguish PD from other psychiatric disorders are therefore needed.
Previous diagnostic studies on PD have used electroencephalography (EEG) signals, as EEG is a relatively cheap and flexible way to monitor neural activity. EEG signals are electrophysiological representations of brain activity that are recorded non-invasively through electrodes placed on the scalp (Sha'Abani et al., 2020). However, what EEG offers in flexibility comes at the cost of a lower signal-to-noise ratio (SNR) in the recorded data (Sha'Abani et al., 2020; Teplan, 2002). Additionally, EEG signals are generated by combinations of electrical activity from large neuronal populations roughly underlying each electrode, which means that spatial resolution is relatively limited in EEG applications (Teplan, 2002). In spite of this, EEG possesses good temporal resolution and remains a valuable tool for observing responses to stimuli as well as aberrant electrical activity in the human brain. As a result, its use is common in clinical neurophysiology (Teplan, 2002) and in studies exploring the etiology of PD. One such study reported a significant decrease in right frontal alpha-band power relative to the left in PD patients, both at rest and in response to anxiety-relevant stimuli (Wiedemann et al., 1999), suggesting that frontal brain asymmetry can be a biomarker for PD. This asymmetry purportedly arises from dysfunction in generating positive or negative (avoidance) responses to stimuli, which are controlled by frontal activity (Sutton and Davidson, 1997; Wiedemann et al., 1999). Studies also connect frontal EEG asymmetry to emotional response and processing (Allen et al., 2001; Coan and Allen, 2004). Building on these findings, it was demonstrated that frontal EEG alpha asymmetries helped differentiate between psychiatric disorders such as PD, post-traumatic stress disorder, schizophrenia, and depression (Gordon et al., 2010). EEG activity in temporal regions also exhibited differences between patients with PD and healthy individuals (Carvalho et al., 2013; Gordeev, 2008; Hanaoka et al., 2005; Wise et al., 2011). Notably, these studies have focused mainly on linear analysis of the EEG signals.
Thus, the goal of this study was to investigate the changes in non-linear dynamics of EEG signals in patients with PD in comparison with healthy individuals and with patients with MDD, which is highly comorbid with PD. Correlation dimension (D2), largest Lyapunov exponent (L1), Lempel-Ziv complexity (LZC), and approximate entropy (ApEn) were used as non-linear measures in this study; these have been successfully used in non-linear analyses of EEG signals in patients with mental disorders (Abásolo et al., 2005, 2006, 2007; Acharya et al., 2012; Bachmann et al., 2015; Carlino et al., 2014; Chae et al., 2004; Geng and Zhou, 2010; Hu et al., 2006; Jeong et al., 1998b). We demonstrate the use of these features in a machine learning classifier through the wavelet-chaos approach (Adeli et al., 2007) of obtaining non-linear features from specific EEG bands. Non-linear features and alpha asymmetry were inputs into a LASSO regression model for feature selection to create an optimal feature space for classification. Notably, we used 2-channel EEGs recorded from frontal regions of both hemispheres (i.e., Fp1 and Fp2) in both patient groups and controls to examine the possibility of rapid diagnosis in clinical practice. Thus, a key strength of this study is that it presents the possibility of diagnosing PD and MDD using a parsimonious electrophysiological recording setup and paradigm that has practical potential in clinical settings owing to its relative ease of setup and use. This work is an extension of an earlier study involving preliminary pilot analyses (Aderinwale et al., 2019).

Participants
Data were obtained from 149 participants consisting of 60 healthy individuals, 40 PD patients, and 49 MDD patients (see Table 1). The experiment protocol was carried out at Samsung Medical Center in Seoul, South Korea. All patients were diagnosed by senior psychiatrists according to DSM-IV criteria. Screening for MDD and PD was additionally based on the administration of the Mini-International Neuropsychiatric Interview (M.I.N.I), the Hamilton Depression Rating Scale (HAM-D), the Hamilton Anxiety Rating Scale (HAM-A), the Panic Disorder Severity Scale (PDSS), and the stress response inventory (SRI). Healthy participants reported no medical history of psychiatric disorders. The purpose of the experiment was explained to all participants, and written consent was received. All subjects were compensated with $50 for their participation in the study. This study was approved by the institutional review board of Samsung Medical Center, Seoul, South Korea.

Experiment paradigm
EEG signals were recorded during five consecutive phases separated by ten seconds of instruction and preparation. Participants were instructed to make themselves as comfortable as possible before beginning the experiment and were asked to avoid movements during the experiment. The recording used two channels (Fp1 and Fp2) at the prefrontal region of both hemispheres at a sampling frequency of 256 Hz, with eyes open in all phases.
The goal of this design was to observe how EEG response profiles changed during subsequent periods of rest, stimulation, and recovery. The experiment started with a resting phase, followed by a mental task, a recovery phase, a relaxation task, and a final recovery phase, as shown in Fig. 1. Each recording phase lasted for 5 min, and 1-minute EEG segments were taken from the middle of each phase for all participants. The EEG recordings were obtained through a ProComp Infiniti (SA7500, Computerized Biofeedback system, Thought Technology, Canada) device along with other physiological signals as part of a larger study.
The resting phase was used as a baseline for comparison with other phases. During the mental task, participants counted downwards serially from 500 in steps of seven. The aim here was to examine potential differences due to cognitive workload and consequent stress. The fourth phase was a relaxation task in which participants were presented with images of natural scenery in order to observe responses to stimuli that did not entail any cognitive workload or emotional valence. This task was included because PD patients reportedly process stimuli differently from healthy individuals, based on hypothesized lower regulatory activity of frontal areas leading to a hypersensitivity of the amygdala and other components of the "fear network" proposed by Gorman et al. (Gorman et al., 1989, 2000).

Preprocessing of the EEG
The purpose of EEG preprocessing was to minimize possible artifacts in the EEG recordings, including ocular and motion artifacts (Fig. 2). EEG signals were filtered using a band-pass finite impulse response (FIR) filter with a passband of 0-50 Hz. Since the EEGs were recorded from only 2 channels in this study, blind source separation techniques like independent component analysis (ICA) were deemed unsuitable for filtering out possible artifacts from the EEG data. Alternatively, wavelet-based artifact removal methods have been used in analyzing non-stationary signals, including EEG (Chavez et al., 2018; Chen et al., 2015; Khatun et al., 2016), and following suit, wavelet thresholding using the maximal overlap discrete wavelet transform (MODWT) (Chavez et al., 2018; Percival and Walden, 2000) was applied in this study. We used a MATLAB implementation of the wavelet thresholding method adopted by Chavez et al. (2018) for single-channel EEG filtering.

Fig. 1. Behavioral experiment paradigm. Participants went through the outlined experiment paradigm. They started with a resting period (baseline), followed by a mental arithmetic task, in which participants were asked to count downwards in steps of 7, starting from 500. This was followed by a period of rest and then a period in which images of natural scenery (such as waterfalls or forests) were presented to participants. The experiment ended with a final recovery period. Each experiment stage lasted 5 min, and EEG signals were recorded during all these experiment stages.

Fig. 2. Flowchart of non-linear analysis. The raw EEG signals were filtered to limit the signal to EEG data of 0-50 Hz, then preprocessed with wavelet thresholding to suppress artifacts. Using wavelet transformation techniques, the EEG was split into the frequency bands of interest (delta, theta, alpha, and beta). Then, non-linear features were extracted from the preprocessed EEG data and used in ML classification.

We adopted a level-dependent threshold (Chavez et al., 2018), which was used to filter the wavelet-transformed data by suppressing wavelet coefficients with absolute values higher than the threshold determined for their decomposition level, such that:

$$\tilde{w}_{j,k} = \begin{cases} w_{j,k}, & |w_{j,k}| \le T_j \\ 0, & |w_{j,k}| > T_j \end{cases} \qquad (1)$$

where $w_{j,k}$ is the $k$-th wavelet coefficient at level $j$, $T_j = \sigma_j \sqrt{2 \ln N}$ for a signal of $N$ samples, and $\sigma_j = \mathrm{median}(|w_{j,k}|)/0.6745$.
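The thresholding step for a single decomposition level can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation used in the study: it assumes the wavelet coefficients of one level are already available as a list, and it assumes the universal threshold with a median-based scale estimate. The function name `suppress_large_coefficients` is hypothetical.

```python
import math
import statistics


def suppress_large_coefficients(coeffs):
    """Zero the coefficients of one wavelet level that exceed the level threshold.

    Assumptions (not confirmed by the paper): the scale is estimated as
    median(|w|) / 0.6745 and the threshold is sigma * sqrt(2 * ln(N)).
    """
    n = len(coeffs)
    sigma = statistics.median(abs(w) for w in coeffs) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(n))
    # Large coefficients are treated as artifact and suppressed (set to zero);
    # small coefficients are kept as EEG activity.
    cleaned = [w if abs(w) <= threshold else 0.0 for w in coeffs]
    return cleaned, threshold


# Toy level with one artifact-like coefficient (8.0).
level = [0.1, -0.2, 0.15, -0.1, 0.05, 8.0, -0.12, 0.2]
cleaned, threshold = suppress_large_coefficients(level)
```

In a full pipeline this function would be applied to every level of the MODWT decomposition before inverting the transform.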

Non-linear analysis of the EEG
We applied Takens' embedding theorem to reconstruct an attractor of the underlying system that mimics the attributes of the original system (i.e., the brain in this analysis) using time-delay coordinates (Stam, 2005; Takens, 1981). To perform time-delay embedding, it is crucial to choose the time lag "t" and the embedding dimension "m" suitably; otherwise, the reconstructed attractor is not equivalent to the original (Stam, 2005). If the time lag "t" is too small, the points chosen from the original time series to reconstruct the attractor are too close together, and the attractor geometry can be lost (Rodriguez-Bermudez and Garcia-Laencina, 2015). By contrast, if the time lag is too large, the points that comprise each embedding vector become more and more unrelated (Rodriguez-Bermudez and Garcia-Laencina, 2015). To address this, methods for estimating the time lag have been developed based on non-linear dynamics and information theory (Rodriguez-Bermudez and Garcia-Laencina, 2015; Stam, 2005).
Here we adopted the first minimum of the mutual information of an EEG time series as the time lag for the time-delay coordinates (Rodriguez-Bermudez and Garcia-Laencina, 2015; Stam, 2005). To estimate the optimal embedding dimension "m", the method of false nearest neighbors (Jeong et al., 1998b; Kennel et al., 1992) was adopted. Both parameters were estimated using the predictive maintenance toolbox in MATLAB (Inc., 2020). After estimating suitable values for the phase space representation of the time series, the next step was to obtain non-linear measures that characterize the underlying system's dynamics.
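To make the time-delay coordinates concrete, the sketch below builds the embedding vectors from a scalar series for a given lag and embedding dimension. It is a minimal Python illustration, not the toolbox routine used in the study; the function name `delay_embed` and the toy series are hypothetical, and the lag and dimension are passed in rather than estimated.

```python
def delay_embed(series, m, lag):
    """Return the time-delay vectors [x(i), x(i+lag), ..., x(i+(m-1)*lag)]."""
    n_vectors = len(series) - (m - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series is too short for this m and lag")
    return [[series[i + k * lag] for k in range(m)] for i in range(n_vectors)]


# Toy series: with m = 3 and lag = 2, six samples yield two delay vectors.
vectors = delay_embed([0, 1, 2, 3, 4, 5], m=3, lag=2)
```

Each row is one point of the reconstructed attractor; the non-linear measures in the following sections operate on these points.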

Correlation dimension
The correlation dimension (D2) can be taken as a measure of attractor dimensionality that reflects the number of independent variables required to describe the system's dynamics (Ahmadi and Amirfattahi, 2010; Jeong et al., 1998a). The larger the D2 value, the greater the behavioral complexity of the underlying EEG system (Jeong et al., 1998b). Estimation of the correlation dimension is based on the Grassberger-Procaccia algorithm (GPA) (Grassberger and Procaccia, 1983). The adaptation used in this study was the "correlationDimension" function in the MATLAB predictive maintenance toolbox (Inc., 2020), which estimates the correlation dimension as the slope of the correlation integral $C(R)$ against $R$ on a log-log scale, with:

$$C(R) = \frac{2}{N(N-1)} \sum_{i=1}^{N} N_i(R) \qquad (2)$$

given that the number of within-range points is defined as:

$$N_i(R) = \sum_{k=1,\, k \neq i}^{N} \mathbf{1}\left( \lVert Y_i - Y_k \rVert_\infty < R \right) \qquad (3)$$

where $\mathbf{1}$ represents the indicator function, meaning that all points at a distance from $Y_i$ less than $R$, the radius of similarity, are counted as 1. $N_i(R)$ is the number of points within the range defined by the radius of similarity.
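The idea behind the Grassberger-Procaccia estimate can be sketched in a few lines of Python. This is an illustrative simplification, not the MATLAB routine used in the study: it counts each pair of points once (the usual pair-count normalization, which differs from other conventions only by a constant factor and so leaves the slope unchanged), uses the maximum norm, and takes the slope between just two radii instead of fitting a line over a range. All function names are hypothetical.

```python
import math


def chebyshev(a, b):
    """Maximum-norm distance between two embedded points."""
    return max(abs(p - q) for p, q in zip(a, b))


def correlation_sum(vectors, radius):
    """Fraction of distinct point pairs closer than `radius`."""
    n = len(vectors)
    close_pairs = sum(
        1
        for i in range(n)
        for k in range(i + 1, n)
        if chebyshev(vectors[i], vectors[k]) < radius
    )
    return 2.0 * close_pairs / (n * (n - 1))


def correlation_dimension_slope(vectors, r_small, r_large):
    """Two-point estimate of the slope of log C(R) versus log R."""
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )


# Sanity check on points lying on a straight line, whose dimension is 1.
line = [[float(i)] for i in range(200)]
slope = correlation_dimension_slope(line, 5.5, 20.5)
```

For points on a line the estimated slope is close to 1, as expected for a one-dimensional set; EEG attractors would be embedded first and evaluated over a range of radii.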

Lyapunov exponent
In terms of the system's dynamics, the Lyapunov exponent (L1) describes the divergence or convergence of trajectories that start at nearby initial states or nearby initial conditions (Rodriguez-Bermudez and Garcia-Laencina, 2015). L1 measures sensitive dependence on the system's initial conditions, and the L1 values of the EEG can be interpreted as a measure of how flexibly the brain processes information (Jeong, 2004). This measure is commonly estimated based on the algorithm proposed by Wolf et al. (1985). Further adjustments that improved computational speed led to Rosenstein's algorithm (Rodriguez-Bermudez and Garcia-Laencina, 2015; Rosenstein et al., 1993; Stam, 2005), which was the basis of our computations in this study, as implemented in the "lyapunovExponent" function in the MATLAB predictive maintenance toolbox (Inc., 2020).
The calculation of L1 begins with finding, for each point "i" in the time series, the nearest point "i*" satisfying $\min_{i^*} \lVert Y_i - Y_{i^*} \rVert$, such that $|i - i^*|$ is greater than some defined separation threshold. Then, using a defined expansion range $[K_{min}, K_{max}]$ to govern the evaluation of how the trajectories of these points evolve, the largest Lyapunov exponent is calculated by fitting to the expansion range, with $\lambda$ defined such that:

$$\lambda(i) = \frac{1}{(K_{max} - K_{min} + 1)\, dt} \sum_{K=K_{min}}^{K_{max}} \frac{1}{K} \ln \frac{\lVert Y_{i+K} - Y_{i^*+K} \rVert}{\lVert Y_i - Y_{i^*} \rVert} \qquad (4)$$

where $dt$ is the sampling interval.

Lempel-Ziv complexity
Lempel-Ziv complexities (LZC) were calculated based on the methodology proposed by Abraham Lempel and Jacob Ziv (Lempel and Ziv, 1976). According to their work, the complexity of a finite sequence is evaluated by scanning an N-digit sequence in search of unique substrings of consecutive digits. In essence, the method counts the number of unique sub-sequences present in a binarized version of the input sequence. The higher the LZC value, the more complex the sequence is. The first step involves converting the time series data into a binary format, which is commonly done by thresholding the sequence around its mean or median value, such that values below the threshold become 0 and values above it become 1. We used an implementation of this algorithm obtained from MathWorks File Exchange (Thai, 2020), which is based on the method outlined in Lempel and Ziv's work (Lempel and Ziv, 1976).
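The two steps described above, binarization and counting of new sub-sequences, can be sketched as follows. This is a minimal Python illustration and not the File Exchange implementation used in the study; it assumes median thresholding and a straightforward (not speed-optimized) parsing in which each new component is the shortest substring that cannot be copied from the preceding text. The function names are hypothetical.

```python
import statistics


def binarize(series):
    """Map samples above the median to '1' and all others to '0'."""
    med = statistics.median(series)
    return "".join("1" if v > med else "0" for v in series)


def lz76_complexity(bits):
    """Count the components of the Lempel-Ziv (1976) parsing of a string."""
    n = len(bits)
    start = 0
    components = 0
    while start < n:
        length = 1
        # Extend the candidate while it can still be copied from the text
        # that precedes its last symbol.
        while start + length <= n and bits[start:start + length] in bits[:start + length - 1]:
            length += 1
        components += 1
        start += length
    return components


# Example sequence from the LZ76 literature, parsed as 0|001|10|100|1000|101.
example = lz76_complexity("0001101001000101")
```

The raw count is often normalized by sequence length before comparing recordings; whether and how the study normalized its LZC values is not stated in the text.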

Approximate entropy
Approximate entropy (ApEn) was introduced by Pincus et al. (1991) as a statistic that quantifies the amount of regularity in given input data. While other non-linear measures such as fractal and correlation dimensions reflect static geometrical properties of the reconstructed state space, the approximate entropy can be considered a more dynamic measure of complexity in a non-linear system (Rodriguez-Bermudez and Garcia-Laencina, 2015; Stam, 2005), with higher values indicating higher complexity. The adaptation used in this study was the "approximateEntropy" function in the MATLAB predictive maintenance toolbox (Inc., 2020), which defines approximate entropy as:

$$\mathrm{ApEn} = \Phi_m - \Phi_{m+1} \qquad (5)$$

where

$$\Phi_m = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln N_i$$

and $N_i$ is the number of points within the radius of similarity $R$ of the $i$-th delay vector in the $m$-dimensional embedding.
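A compact way to see what approximate entropy measures is to compute it directly on short toy series. The Python sketch below follows the classical definition by Pincus (template matches include the self-match and are expressed as fractions), which is an assumption here and may differ in detail from the MATLAB function used in the study; the defaults `m=2` and `r=0.2` are illustrative only.

```python
import math


def approximate_entropy(x, m=2, r=0.2):
    """Approximate entropy of a sequence for template length m and tolerance r."""
    n = len(x)

    def phi(dim):
        count = n - dim + 1
        templates = [x[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Number of templates within tolerance r (maximum norm),
            # counting the template itself.
            matches = sum(
                1
                for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


constant_value = approximate_entropy([1.0] * 20)
alternating_value = approximate_entropy([0.0, 1.0] * 10)
```

Both toy series are perfectly regular, so their approximate entropy is zero or very close to zero; irregular signals give larger values.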

Machine learning methods for classification
As outlined in Fig. 2, we employed the 'wavelet-chaos' analysis method proposed by Adeli et al. (2007, 2008) to create a feature space for classification. The wavelet-chaos analysis is an extension of non-linear dynamical analysis methods to individual EEG frequency sub-bands (Adeli et al., 2007, 2008). By dividing the EEG into its constituent frequency sub-bands, we obtain information about specific frequency bands, which might otherwise be lost when looking at the whole unseparated EEG signal (Adeli et al., 2007, 2008). We divided EEGs into the delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), and beta (13-20 Hz) frequency bands. Thus, each EEG signal was divided into four frequency sub-bands, and we extracted non-linear EEG features from each sub-band and from the original EEG signal. For each phase of the experiment, 20 features were therefore produced per hemisphere (5 EEG bands, counting the original signal, multiplied by four non-linear measures). Since there were five experiment phases, we had 100 non-linear features per hemisphere in total. In addition, considering previous research on alpha-band asymmetry, we included alpha-band power asymmetry in the feature space, which was one feature per experiment phase. Finally, our initial feature space consisted of a total of 205 features (100 multiplied by two hemispheres, plus five asymmetry values) for machine learning classification.
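The feature count can be checked by enumerating the combinations explicitly. The labels in this Python sketch are hypothetical placeholders chosen for illustration; only the counts come from the text.

```python
from itertools import product

# Hypothetical labels; the text specifies only how many of each there are.
CHANNELS = ["Fp1", "Fp2"]
PHASES = ["rest", "mental_task", "recovery_1", "relaxation", "recovery_2"]
BANDS = ["full", "delta", "theta", "alpha", "beta"]
MEASURES = ["D2", "L1", "LZC", "ApEn"]

# 2 channels x 5 phases x 5 bands x 4 measures = 200 non-linear features.
nonlinear = [
    f"{channel}_{phase}_{band}_{measure}"
    for channel, phase, band, measure in product(CHANNELS, PHASES, BANDS, MEASURES)
]
# One alpha-band power asymmetry value per phase = 5 features.
asymmetry = [f"{phase}_alpha_asymmetry" for phase in PHASES]

feature_names = nonlinear + asymmetry
```

With 149 participants and 205 candidate features, the feature space is larger than the sample, which is why a feature selection step precedes classification.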
As we employed a supervised machine learning classifier, we split the data into a training and testing set with an 80:20 ratio. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) regression on the extracted feature space. LASSO is a regression technique (Tibshirani, 1996) that can perform feature selection on a given set of predictors by searching for optimal feature combinations that carry information that increases prediction accuracy (Chaturvedi et al., 2017; Tibshirani, 1996). This technique has been successfully employed in recent research involving EEG-based feature selection (Chaturvedi et al., 2017).
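For reference, the way LASSO selects features can be read off its objective. The penalized form below is the standard textbook formulation (equivalent to the constrained form in Tibshirani, 1996) and is given as background; the symbols $y_i$, $x_i$, $\beta$, $\lambda$, $n$, and $p$ are introduced here for illustration, and the regularization strength used in the study is not reported in this section.

```latex
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{p}}
  \left\{ \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^2
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Here $y_i$ is the class label of participant $i$, $x_i$ the feature vector ($p = 205$ in this setting), and $\lambda \ge 0$ the regularization strength. The absolute-value penalty drives some coefficients exactly to zero, and the features with non-zero coefficients form the selected feature space.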
Support vector machines (SVMs) are a family of supervised classification algorithms that are based on statistical learning theory and operate on the general principle of creating decision boundaries that maximize the margin of separation between two classes as represented by their training samples (Quitadamo et al., 2017). SVMs have the advantage of offering generalizability in spite of high-dimensional feature spaces and limited training samples (Bennett and Campbell, 2000; Jain et al., 2000; Lotte et al., 2007; Quitadamo et al., 2017). These properties make SVM classifiers robust for applications such as ours. Accordingly, after feature selection, an SVM model was fit to the training data set using 5-fold cross-validation. In addition, the receiver operating characteristic (ROC) curve was calculated to estimate the binary classification performance of the models developed in this study. All machine learning simulations, feature extraction, LASSO regression, and EEG filtering steps applied in this study were performed using MATLAB R2018b. This involved using the following MATLAB toolboxes: signal processing toolbox, neural network toolbox, predictive maintenance toolbox, and statistics and machine learning toolbox.

Non-linear analysis of the EEG
For all analyses and statistical comparisons, 1-minute EEG segments were taken from the middle of each experiment phase. We first compared the mean values of the D2, L1, LZC, and ApEn features for the resting-state whole-band EEG segments across the three participant groups (PD, MDD, and control) through a one-way analysis of variance (ANOVA) test. In this analysis, we found significant differences in the D2 values at rest in the left hemisphere (one-way ANOVA, F = 4.54, p = 0.0122) (Fig. 3), and in the LZC values, also at rest and in the left hemisphere (one-way ANOVA, F = 4.15, p = 0.0177) (Fig. 4). Further t-tests (conducted at a Bonferroni-corrected threshold level of 0.025, for two EEG channels) revealed significant differences in D2 values (p = 0.0088, t-stat = -2.6744, two-tailed Student's t-test) and LZC values (p = 0.0138, t-stat = -2.5066, two-tailed Student's t-test) between the MDD group and the control group in the left hemisphere at the resting state (Table 2). We also found a significant difference between the PD and control groups in LZC (p = 0.0153, t-stat = -2.4650, two-tailed Student's t-test) in the left hemisphere at the resting state. A difference in D2 values between the PD group and controls in the left hemisphere at rest was considerable but did not reach the Bonferroni-corrected significance threshold (p = 0.0293, t-stat = -2.2094, two-tailed Student's t-test). All other features exhibited no significant differences across the participant groups in other experiment phases.
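The Bonferroni decision for the four reported resting-state comparisons can be reproduced with simple arithmetic. In this Python sketch the family-wise level of 0.05 is an assumption (the text states only the corrected threshold of 0.025 for two channels), and the dictionary keys are hypothetical labels; the p-values are those reported above.

```python
FAMILYWISE_ALPHA = 0.05  # assumed; the text reports only the corrected value
N_CHANNELS = 2           # Fp1 and Fp2
threshold = FAMILYWISE_ALPHA / N_CHANNELS

# Reported left-hemisphere, resting-state p-values versus controls.
p_values = {
    "MDD_D2": 0.0088,
    "MDD_LZC": 0.0138,
    "PD_LZC": 0.0153,
    "PD_D2": 0.0293,
}

significant = {name: p < threshold for name, p in p_values.items()}
```

Three of the four comparisons fall below the corrected threshold; the PD versus control difference in D2 would pass an uncorrected 0.05 level but not the corrected one.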
As previous studies reported significant differences in the alpha-band power asymmetry in PD patients (Gordon et al., 2010), we calculated the alpha-band power asymmetry from PD and MDD patients and healthy individuals. However, we did not find any significant differences in alpha-band power asymmetry across the experiment groups at rest. We also performed t-tests to determine if the mean complexity values varied across immediately subsequent experiment phases (e.g., between the rest phase and the mental arithmetic phase) within groups. We found similar trends in each of the three experiment groups across the experiment phases. Significant differences in D2 and LZC measures were observed between rest and mental arithmetic and between mental arithmetic and recovery (see Figures S1-S4 in Supplemental material). In all three groups, we found that the values of D2 and LZC were reduced when going to the mental arithmetic phase and that the decreases were reverted in the recovery phase.

Machine learning classification
The classifiers distinguished the PD and MDD patient groups from the control group at above-chance levels, as shown in Table 3. Notably, the SVM classifier performed much worse when discriminating between the PD and MDD groups. This indicates increased difficulty in differentiating between PD and MDD patient groups with the given feature space.
Tables 4-6 summarize the LASSO-derived feature spaces for the classification tasks, showing contributions from multiple frequency bands to create optimal feature spaces on each task.We found that features from each of the experiment phases contributed to the eventual feature space generated by LASSO regression, which was used in training the classifiers involved in each classification task.

Discussion
This study investigated non-linear dynamics of 2-channel (Fp1 and Fp2) EEG signals in patients with PD and MDD and applied an SVM classifier to identify PD and MDD, obtaining classification accuracies of 90% for MDD patients vs. healthy individuals, 68% for PD patients vs. controls, and 59% for PD vs. MDD patients (Table 3). These findings suggest that machine learning methods based on non-linear measures of 2-channel frontal EEGs are potentially helpful in the rapid diagnosis of PD and MDD. A significant benefit of the paradigm used in this study is its ease of setup and low cost. By using two prefrontal channels, the time involved in setting up subjects and the device is drastically reduced in comparison to large multichannel EEG recordings (typically 32 or 64 channels). The use of prefrontal channels also means that there is no need to deal with hair during setup and after the tests, making recordings faster and more convenient for subjects. Additionally, the tasks used do not require the collection of response time or active behavioral data from patients. These points demonstrate a benefit of this study in exploring the use of a setup that is well suited to clinical or primary care settings, as it allows for non-invasive and further simplified (2-channel) electrophysiological data collection and analysis with minimal instruction. Although the overall time taken for the experiment was 30 min, only 5 min of EEG recording (1 min from each task) was taken for analysis, so the overall experiment duration can be further shortened, which further demonstrates the practicality of the proposed experiment setup for real-time clinical use.
Since there is an apparent lack of investigation of the non-linear dynamical properties of the EEG in patients with PD, the current study provides important results showing disturbed non-linear dynamics of the brain in PD. EEG studies on PD patients using linear methods generally revealed altered alpha-band power in temporal and frontal regions (Carvalho et al., 2013), and EEG coherence studies have reported reduced functional connectivity in frontal brain regions (Carvalho et al., 2013). Our findings build on these observations, as the non-linear dynamics of the EEG in the left frontal regions suggest an etiology of PD residing in the prefrontal region of the left hemisphere. The decreased complexity in PD patients observed in our findings is also observed in patients with other psychiatric disorders (Abásolo et al., 2005; Carlino et al., 2014; Chae et al., 2004; Jeong et al., 1998b). Maladaptive reductions in complexity are generally interpreted as a result of cell death of a large body of neurons, reduced neuronal activity within the circuits, or diminished network connections (Stam, 2005). Thus, our observed decreased non-linear measures (D2 and LZC) in PD patients potentially suggest reduced network connections or activity in prefrontal cortex neurons. Hypo-activation or decreased functionality in the prefrontal regions has been hypothesized to underlie PD in emotion regulation (Ball et al., 2013) and attention (Bishop, 2009). As a result, altered prefrontal cortex activity in PD patients (Ball et al., 2013; Bishop, 2009; Campbell-Sills et al., 2011; Long et al., 2013) is thought to be a feature of PD. Our findings of diminished frontal EEG complexity in PD patients potentially support hypotheses of a reduction in positive affect in the left frontal regions (Sutton and Davidson, 1997; Wiedemann et al., 1999) or of defective control of the left prefrontal cortex over amygdala regulation in the fear network proposed by Gorman et al. (1989, 2000) (Kircher et al., 2013; Kunas et al., 2019).

Table 2
Observed differences in EEG complexities of participants across experiment phases for each experiment group. Significant differences were only found across phases 1 (rest), 2 (mental arithmetic), and 3 (recovery). Participants across the three experiment groups showed a similar trend of decreased EEG complexity between rest and mental arithmetic conditions and a relative restoration of this EEG complexity as they entered the recovery phase.

Table 3
The SVM classifier results revealed high performance at discriminating MDD patients from healthy individuals on the training and testing dataset. Differentiating PD patients from healthy individuals and from MDD patients was more difficult, as shown by the decreased levels of classification performance.

It is worth noting that we found significant differences in EEG complexity between the experiment groups only at rest and not in the other experiment phases. This is possibly because the mental arithmetic tasks and the appraisal of emotionally neutral images of natural scenery are not affected by the presence of mental disorders such as PD or MDD, causing their EEG signals to appear similar to controls during these tasks (Li et al., 2007). An interesting follow-up experiment could observe complexity measures when patients view emotionally salient images so that each image evokes a negative or positive emotional response.
The non-linear dynamics of EEG signals in MDD have been investigated to some extent. Prior studies have reported increased EEG complexity in MDD patients at rest with eyes closed, as measured by Higuchi's fractal dimension (Bachmann et al., 2018) and LZC (Bachmann et al., 2015; Li et al., 2008), but decreased complexity with eyes open or during tasks, as measured by entropy and other measures (Nandrino et al., 1994; Puthankattil and Joseph, 2014). These seemingly contradictory findings are possibly due to differences between data collected in eyes-closed conditions and data collected in eyes-open or task conditions. EEG recordings in the eyes-open state likely contain brain activity relating to the cortical processing of visual input, which affects EEG signal profiles (Barry et al., 2007). Stam et al. (1996) demonstrated that non-linear features showed significant alterations between a rest condition with eyes open, a rest condition with eyes closed, and mental arithmetic with eyes open, indicating the effect of task conditions on non-linear measures. Here we have also reported significantly lowered complexity values in the mental arithmetic task, which has been reported in another study on schizophrenia and depression using LZC (Li et al., 2008).
The current study has investigated the possibility of rapid diagnosis of PD and MDD using a simplified two-channel EEG setup and machine learning. To the best of our knowledge, there are no studies reporting the machine learning classification of PD based on non-linear measures through EEG, although some previous studies have focused on the presence or absence of comorbid depression with PD (Lueken et al., 2015) or on identifying neural markers through MRI data (Lai, 2019). We recognize that there is still room for improvement in terms of classification performance, and speculate that a combination of physiological measures such as EDA (Kim et al., 2018) and the possible inclusion of select frontal or temporal regions related to PD and MDD (Carvalho et al., 2013; Gordeev, 2008; Gordon et al., 2010; Hanaoka et al., 2005; Kikuchi et al., 2011; Locatelli et al., 1993; Wise et al., 2011) can establish a more robust multimodal approach towards the classification of PD and MDD. A limitation of the current study is that a larger and more diverse sample would be needed to draw stronger conclusions. Furthermore, although the SVM classifier performance is lower in the classification of PD vs. MDD in particular, the robustness of SVM as a classification algorithm and its better performance in the MDD vs. control case might imply that adjusting the feature space is a more fruitful approach than focusing on algorithmic or model variations. Such adjustments could include additional non-linear dynamic features capturing new dimensions of information, a multimodal approach, or an expanded set of channels as mentioned earlier. Nevertheless, alternative classification algorithms remain an interesting point of further consideration in order to create a better-performing diagnostic system. While deep-learning approaches or neural networks might not be ideal in this case, due to the relatively small total number of samples and the resultant risks of overfitting and poor generalization, other classification algorithms such as gradient boosting and potentially ensemble models could be an interesting point of inquiry in future attempts to improve classification performance. They could also serve as a point of comparison for the SVM models deployed in this study. Overall, the present study suggests that simple machine learning methods based on non-linear measures of two-channel frontal EEGs are worth considering as potential diagnostic aids for practical use in clinical and primary-care settings.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships influencing this work.

Fig. 3. Mean values of correlation dimension. The mean D2 values were calculated across the experiment groups. The PD group (p = 0.0293, t-stat = -2.2094, two-tailed Student's t-test) showed a considerable reduction in D2 values that did not reach Bonferroni-corrected significance. The MDD group (p = 0.0088, t-stat = -2.6744, two-tailed Student's t-test) had significantly lower mean D2 values than the healthy participant group.

Fig. 4. Mean values of Lempel-Ziv complexity. The mean LZC values were calculated across the experiment groups. The PD group (p = 0.0153, t-stat = -2.4650, two-tailed Student's t-test) and the MDD group (p = 0.0138, t-stat = -2.5066, two-tailed Student's t-test) had significantly lower mean LZC values than the healthy individual group, indicating a reduction in complexity of the EEG signals in both disorders.

Table 1
Descriptive statistics of subjects.Information about the mean age, education years, HAM-D, HAM-A, PDSS, and SRI scores from participants involved in this study.

Table 4
LASSO derived features for PD-MDD classifier.A list of the features obtained after LASSO regression on the feature space for classifying PD and MDD patients.

Table 5
LASSO derived features for MDD-Control classifier.A list of the features obtained after LASSO regression on the feature space for classifying MDD patients and healthy participants.