Prediction of dispositional dialectical thinking from resting‐state electroencephalography

Abstract This study aims to explore the possibility of predicting the dispositional level of dialectical thinking using resting‐state electroencephalography signals. Thirty‐four participants completed a self‐reported measure of dialectical thinking, and their resting‐state electroencephalography was recorded. After wave filtration and eye movement removal, time‐frequency electroencephalography signals were converted into four frequency domains: delta (1–4 Hz), theta (4–7 Hz), alpha (7–13 Hz), and beta (13–30 Hz). Functional principal component analysis with B‐spline approximation was then applied for feature reduction. Five machine learning methods (support vector regression, least absolute shrinkage and selection operator, K‐nearest neighbors, random forest, and gradient boosting decision tree) were applied to the reduced features for prediction. The model ensemble technique was used to create the best performing final model. The results showed that the alpha wave of the electroencephalography signal in the early period (12–15 s) contributed most to the prediction of dialectical thinking. With data‐driven electrode selection (FC1, FCz, Fz, FC3, Cz, AFz), the prediction model achieved an average coefficient of determination of 0.45 on 200 random test sets. Furthermore, a significant positive correlation was found between the alpha value of standardized low‐resolution electromagnetic tomography activity in the right dorsal anterior cingulate cortex and dialectical self‐scale score. The prefrontal and midline alpha oscillations of resting electroencephalography are good predictors of the dispositional level of dialectical thinking, possibly reflecting these brain structures’ involvement in dialectical thinking.

Later cross-cultural research demonstrated that dialectical thinking is especially prevalent in East Asian cultures and rests on three central principles: (i) the principle of change, which states that everything is constantly changing; (ii) the principle of contradiction, which states that contradiction exists everywhere and even coexists within the same thing; and (iii) the principle of relationships or holism, which states that everything is connected (Peng & Nisbett, 1999).
However, some individuals from Western cultures may also use it regularly.
Dialectical thinking has an overarching impact on cognitive processes at various levels (Spencer-Rodgers & Peng, 2018; Spencer-Rodgers et al., 2010), and an inquiry into the underlying brain mechanisms may deepen our understanding of how dialectical thinking exerts its effects on human cognition. Among the three principles of dialecticism, the principle of contradiction has received considerable attention. Functional magnetic resonance imaging (fMRI) studies have revealed that the dorsal anterior cingulate cortex (dACC) plays a key role in the monitoring and resolution of conflicting information (Botvinick et al., 2004; Cachia et al., 2017; Carter & van Veen, 2007; Wang et al., 2016).
Furthermore, within a broader framework of executive functions and cognitive control (Cohen, 2017; Diamond, 2013), dealing with contradiction requires a heightened level of executive control. Accordingly, neuroimaging studies have found that the processing of contradiction involves attentional control network regions, such as the dorsolateral prefrontal cortex (DLPFC) (Botvinick et al., 2004; Carter & van Veen, 2007; Egner, 2007). These fMRI findings were further supported by brain lesion studies, which have shown that a common consequence of dACC injuries is the inability to reliably eliminate conflict-driven behaviors (Mansouri et al., 2017) and that the DLPFC is also associated with conflict processing (Botvinick et al., 2004; Egner, 2007).
For example, in the occupation choice task, the high-contradiction group had greater delta and theta power in the N2 amplitude in the frontocentral region than the low-conflict group (Nakao et al., 2013).
Additionally, in the stop-signal task, activity in the frequency band from 1 to 7 Hz (i.e., the delta and theta range) is induced at 800 ms (Andersen et al., 2009; Moore et al., 2006; Savostyanov et al., 2009). In contrast, the amount of conflict was associated with alpha and beta frequencies in the left occipitotemporal regions (Nakao et al., 2013).
Even though these studies provide valuable insight into the question of how the brain processes conflicting information, direct investigations into the neural bases of dialectical thinking have been scarce. To the best of our knowledge, only one recent fMRI study directly examined the effect of dispositional dialectical thinking on the brain. Wang et al. (2016) used a modified self-reference paradigm to present participants with contradictory or noncontradictory personality adjective pairs and recorded their brain activities when making self or other judgments. They found that the level of dialectical thinking positively correlated with the dACC's involvement in the processing of self-relevant contradictions. Based on this finding, they suggest that the critical difference between dialectical and nondialectic thinkers is how likely they are to utilize the dACC to modulate other regions' activities.
While Wang et al. (2016) provided initial evidence regarding the neural basis of dialectical thinking, there are still issues to be clarified. First, their study was exclusive to the domain of the self, and it is still not clear whether dispositional dialectical thinking may also manifest in the brain's stable and task-free activity patterns, such as in the resting state. Intriguingly, the dACC is a part of the salience network (SN), which governs the allocation of attention to stimuli based on their subjective salience (Seeley et al., 2007; Sridharan et al., 2008).
The SN has a key role in switching between the default mode network (Buckner et al., 2008) and the executive control network (Osaka et al., 2004), and these networks interact with each other even in the resting state. Therefore, it is worthwhile to examine the link between dialectical thinking and resting-state brain activity. Second, the fMRI technique they used, while advantageous in localizing the involved brain regions, cannot portray the finer temporal features of the neural mechanisms.
Finally, their study used a correlational approach by associating certain brain features with a behavioral index, which can be supplemented by a predictive approach that combines neural data and machine learning (ML) algorithms to achieve individualized predictions and uses cross-validation techniques to ensure out-of-sample generalizability (Dubois & Adolphs, 2016).
In the current study, we aim to explore the possibility of predicting the level of dispositional dialectical thinking via resting-state EEG features. To achieve this goal, we need to deal with the "curse of dimensionality," that is, the brain voltage captured by EEG is usually measured thousands of times while the number of experimental subjects is small, posing huge challenges to traditional data analysis methods. Traditional EEG data analysis methods usually include manually extracting physiological features (such as frequency, spectral power, etc.) from EEG signals. A common problem of this manual feature selection strategy is choosing the type of features. EEG data contain a complex structure that makes it difficult to filter useful information simply via predefined features. Therefore, a data-driven prediction method capable of auto feature selection from the data while keeping as much information as possible is preferable.
In such situations, functional data analysis (FDA) provides a useful statistical approach for dealing with this problem. Through smoothing and decomposing, FDA eliminates data noise and extracts principal components representing most of the information from data. Recently, Zhang et al. (2020) applied the FDA method to predict working memory ability based on EEG and achieved great accuracy, demonstrating the feasibility of this approach.
Here, we first applied FDA to extract EEG features using R (Ripley, 2001) software to provide a feature representation of individual subjects. We then applied a set of ML methods to predict participants' scores on the Dialectical Self Scale (DSS; Spencer-Rodgers et al., 2004), a widely used self-reported measure of dialectical thinking.
Third, we performed a randomness test on our result to distinguish it from random noise. Finally, using eLORETA for source analyses, our specific aim was to test for the relevance of EEG-based resting-state activity in the dACC and DLPFC for dialectical thinking. Based on previous findings (Botvinick et al., 2004; Carter & van Veen, 2007; Egner, 2007; Wang et al., 2016), we hypothesized that the degree of EEG-based resting-state activity in the dACC (as measured using eLORETA values) is related to the degree of dialectical thinking.

Measure of dispositional dialectical thinking
Dispositional dialectical thinking was assessed using the Dialectical Self Scale (DSS; Spencer-Rodgers et al., 2004), with its 32 items rated on a scale from 1 (strongly disagree) to 7 (strongly agree). Sample items include "I often find that things will contradict each other," "My world is full of contradictions that cannot be resolved," and "When two sides disagree, the truth is always somewhere in the middle." In the cross-cultural psychological literature, the DSS has been widely used, and its reliability and validity have been confirmed (Hamamura et al., 2008; Hui et al., 2009; Spencer-Rodgers et al., 2009). In the current study, Cronbach's alpha was .74, which was comparable to previous studies (e.g., .74 for Chinese participants in Spencer-Rodgers et al., 2009).

The EEG series at the beginning of the recording is considered noisy because the participants might not yet have been in the required state. Because it is hard to determine a subject-specific noisy period, the first 2.5 s, which is considered long enough to cover the noisy period for all subjects, was excluded for every subject. Another reason for excluding the same length of EEG for all subjects is that the data are then analyzed under the same conditions, which is a requirement of FDA theory. Similarly, the final 45.5 s of the signal were excluded (too noisy because the participants might have failed to stay still after sitting for too long). An EEG series with 252 s of data was thus obtained for each electrode of each subject.
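The trimming described above, together with the 3 s segmentation used later in the analysis, can be sketched as follows. This is only an illustration on synthetic data: the 500 Hz sampling rate is an assumption inferred from "3 s (or 1500 measurements)", the 300 s raw length is an assumption obtained by adding 2.5 s, 252 s, and 45.5 s, and the function name `trim_and_segment` is hypothetical.

```python
import numpy as np

FS = 500            # Hz; assumption inferred from "3 s (or 1500 measurements)"
RAW_SECONDS = 300   # assumption: 2.5 s + 252 s + 45.5 s


def trim_and_segment(signal, fs=FS, head_s=2.5, tail_s=45.5, seg_s=3.0):
    """Drop the noisy head and tail, then cut the rest into disjoint segments."""
    start = int(round(head_s * fs))
    stop = len(signal) - int(round(tail_s * fs))
    trimmed = signal[start:stop]
    seg_len = int(round(seg_s * fs))
    n_seg = len(trimmed) // seg_len
    # Each row is one consecutive, non-overlapping period of seg_len samples.
    segments = trimmed[: n_seg * seg_len].reshape(n_seg, seg_len)
    return trimmed, segments


# Synthetic stand-in for the recording of one electrode of one subject.
raw = np.random.default_rng(0).normal(size=FS * RAW_SECONDS)
trimmed, segments = trim_and_segment(raw)
print(len(trimmed) / FS, segments.shape)
```

Under these assumptions the trimmed series spans 252 s and splits into 84 periods of 1500 samples each, matching the counts reported in the text.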

EEG recording
The collected EEG signals were analyzed in two different ways. First, the whole signal series was considered as a predictor for the DSS score.
In addition, the whole EEG series was segmented into consecutive disjointed pieces of 3 s (or 1500 measurements). This segmentation leads to 84 periods in total in chronological order with each period treated as a separate predictor, the pth period after segmentation, 0 ≤

Functional data analysis
FDA provides a useful statistical approach for processing high-frequency signal data (Ramsay & Silverman, 1997). FDA uses a set of basis functions to approximate the underlying continuous process from discrete observations. The basis functions can be predetermined (e.g., Fourier basis, B-splines) or data-driven. The introduction of the basis functions is a key step in dimension reduction, where an infinite-dimensional function space is reduced to a finite-dimensional vector space. The number of reduced dimensions is a hyperparameter and can be determined according to the signal characteristics. Once the basis functions are well estimated, the signal can be approximated by a linear combination of these basis functions, with the linear coefficients representing the underlying characteristics of the signal data.
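As an illustration of approximating a discretely observed signal by a linear combination of predetermined basis functions, the following minimal sketch fits a cubic B-spline basis by least squares. It is not the authors' implementation (the study reports using R); the knot layout, the number of interior knots, the synthetic signal, and the function names `bspline_basis` and `fit_bspline` are all assumptions made for the example.

```python
import numpy as np


def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions at points t (Cox-de Boor recursion)."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.zeros((t.size, len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Assign the right endpoint to the last interval of positive length.
    last = np.max(np.nonzero(np.diff(knots) > 0)[0])
    B[t == knots[-1], last] = 1.0
    for d in range(1, degree + 1):
        B_next = np.zeros((t.size, len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left_den = knots[j + d] - knots[j]
            right_den = knots[j + d + 1] - knots[j + 1]
            if left_den > 0:
                B_next[:, j] += (t - knots[j]) / left_den * B[:, j]
            if right_den > 0:
                B_next[:, j] += (knots[j + d + 1] - t) / right_den * B[:, j + 1]
        B = B_next
    return B


def fit_bspline(t, y, n_interior=8, degree=3):
    """Least-squares spline fit; returns the coefficients and the basis matrix."""
    interior = np.linspace(t.min(), t.max(), n_interior + 2)[1:-1]
    knots = np.concatenate([np.full(degree + 1, t.min()), interior,
                            np.full(degree + 1, t.max())])
    B = bspline_basis(t, knots, degree)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef, B


# Synthetic noisy signal (not EEG data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
truth = np.sin(2 * np.pi * t)
y = truth + rng.normal(scale=0.3, size=t.size)
coef, B = fit_bspline(t, y)
smooth = B @ coef   # 200 noisy observations summarized by 12 coefficients
```

Here the number of basis functions (12) plays the role of the dimension hyperparameter mentioned above: fewer functions give a smoother but coarser approximation.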
Another advantage of the FDA approach is its theoretical support.
The collected EEG data are usually contaminated with artifacts (e.g., muscle artifacts, electrocardiogram), which are generally difficult to handle. Fortunately, FDA theory shows that, under certain conditions, the noise contaminating the EEG signals can be removed in an asymptotic sense with the help of a B-spline estimator. Therefore, in this paper, we used the FDA approach as a tool to remove artifacts and obtain useful information from the noisy EEG data.
For every possible wave $w$, period $p$, and electrode $l$, the EEG series of the $i$th subject is represented as

$$\eta_{w,p,l,i}(t) = m_{w,p,l}(t) + \sum_{k \ge 1} \xi_{w,p,l,i,k}\,\phi_{w,p,l,k}(t),$$

where $m_{w,p,l}(\cdot)$ is the common mean function of all subjects, $\phi_{w,p,l,k}(\cdot)$ is the $k$th eigenfunction, and $\xi_{w,p,l,i,k}$ is the functional principal component score for the $i$th subject, which accounts for intersubject variation in the signal. w,p,l,i (

Machine learning approach
After the FDA step, the data-driven features $\{\xi_{w,p,l,i,k}\}_{1 \le i \le 34,\ 1 \le k \le \kappa}$, which are considered to represent most of the information in the EEG signal but have only implicit physiological meaning, are obtained for each combination of wave, period, and electrode $\{w, p, l\}$, with $1 \le w \le 4$, $0 \le p \le 84$, and $1 \le l \le 63$. The integer $\kappa$ here is the smallest number by which the data-driven features

The coefficient of determination for machine learning method $m$ and sampling $s$ is

$$R^2_{w,p,l,m,s} = 1 - \frac{\sum_{i=1}^{n_t} \left(S_{w,p,l,m,s,i} - \hat{S}_{w,p,l,m,s,i}\right)^2}{\sum_{i=1}^{n_t} \left(S_{w,p,l,m,s,i} - \bar{S}_{w,p,l,m,s}\right)^2},$$

where $S_{w,p,l,m,s,i}$ is the DSS score of subject $i$ from the testing set, $\hat{S}_{w,p,l,m,s,i}$ is the score predicted by the model, $n_t$ is the sample size of the testing set, and $\bar{S}_{w,p,l,m,s} = n_t^{-1} \sum_{i=1}^{n_t} S_{w,p,l,m,s,i}$. By the definition of $R^2_{w,p,l,m,s}$, the numerator is the sum of squared errors of model $m$, while the denominator is the sum of squared errors of the baseline model, in which every subject's score is predicted by the mean. When $R^2_{w,p,l,m,s}$ is negative, the model is considered worse than the baseline model; the model is useful if $R^2_{w,p,l,m,s} > 0$. For clarity, each step of the data analysis is shown in Figure 1.
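The out-of-sample coefficient of determination described above can be written in a few lines. The sketch below is illustrative only: the function name `r_squared` and the example scores are hypothetical and are not taken from the study's data.

```python
def r_squared(y_true, y_pred):
    """1 - SSE(model) / SSE(baseline), where the baseline predicts the test-set mean."""
    mean_true = sum(y_true) / len(y_true)
    sse_model = sum((s - p) ** 2 for s, p in zip(y_true, y_pred))
    sse_baseline = sum((s - mean_true) ** 2 for s in y_true)
    return 1.0 - sse_model / sse_baseline


# Hypothetical DSS scores for a small test set.
scores = [128.0, 135.0, 141.0, 150.0, 132.0]
mean_score = sum(scores) / len(scores)

perfect = r_squared(scores, scores)                       # every prediction exact
baseline = r_squared(scores, [mean_score] * len(scores))  # predicting the mean
worse = r_squared(scores, [150.0, 128.0, 132.0, 135.0, 141.0])  # badly misordered
```

A perfect model reaches 1, the mean-only baseline gives 0, and a model whose squared error exceeds that of the baseline gives a negative value, which is why negative values are read as "worse than the baseline".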
FIGURE 1 Flow chart of data analysis

The five machine learning methods are support vector regression (SVR; Drucker et al., 1997), least absolute shrinkage and selection operator (LASSO; Tibshirani, 1996), K-nearest neighbors (KNN; Altman, 1992), random forest (RF; Ho, 1995), and gradient boosting decision tree (GBDT; Friedman, 2001). SVR is the regression counterpart of the support vector machine (SVM), a popular classification method that attempts to maximize the margin around the separating hyperplane. Al Zoubi et al. (2018) used an SVR model to predict age from EEG signals, which makes it suitable for our problem. A radial basis kernel was used in the SVR model. LASSO is a linear regression model with an absolute-value (L1) penalty on the coefficients and is popular for feature selection because of its sparse solutions. Linear regression is the simplest approach for prediction or inference. Since the number of subjects is rather small, regularization is important when fitting the model, which is why LASSO was adopted. The regularization coefficient of the LASSO model was set to 1. To be more flexible and not restricted to a linear relationship, KNN regression was applied because of its simple assumptions and comprehensibility. K was set to 3 in the KNN model. Besides these simple learning algorithms, ensemble learning approaches, including bagging and boosting algorithms, were included to compare their performance in predicting DSS scores. RF is a bagging algorithm and is flexible in dealing with high-dimensional data. The randomness of feature selection in RF can compensate for the noisy EEG data and make the prediction result more stable. The number of trees in the RF was set to 15. GBDT is a boosting algorithm designed to iteratively reduce prediction bias. Wu et al. (2017) used GBDT to evaluate emotion from EEG signals and achieved good performance, so it was expected to perform well in our problem.
The number of boosting stages was 20 and the learning rate was 1. Finally, for every wave ($w$), period ($p$), electrode ($l$), and machine learning method ($m$), $\bar{R}^2$, the coefficient of determination averaged over 200 samplings, is computed. We use subscripts to distinguish the $\bar{R}^2$ values from different settings and denote them as $\bar{R}^2_{w,p,l,m}$. Because $\bar{R}^2_{w,p,l,m}$ is itself a random variable indicating whether the model is useful, a threshold of 0.1 was chosen, above which the prediction model was considered helpful (because the estimated standard error of $\bar{R}^2_{w,p,l,m}$ is much less than 0.1, this threshold is considered sufficient for detecting useful models).
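The resampling protocol (200 random training/testing splits, averaged $R^2$, threshold of 0.1) can be sketched for the LASSO case as follows. This is a minimal illustration under several assumptions: the text does not name the software or the exact parameterization of the regularization coefficient, so the sketch hand-rolls coordinate descent for the objective $(1/(2n))\lVert y - b_0 - Xw \rVert^2 + \alpha \lVert w \rVert_1$; the training size of 23 follows the Discussion; the data are synthetic; and the names `lasso_fit` and `mean_test_r2` are hypothetical.

```python
import numpy as np


def soft_threshold(a, b):
    """Shrink a toward zero by b (the proximal operator of the L1 penalty)."""
    return np.sign(a) * max(abs(a) - b, 0.0)


def lasso_fit(X, y, alpha=1.0, n_iter=50):
    """Coordinate descent for (1/(2n))*||y - b0 - Xw||^2 + alpha*||w||_1."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n, d = Xc.shape
    w = np.zeros(d)
    col_sq = (Xc ** 2).sum(axis=0) / n
    r = yc.copy()                      # residual yc - Xc @ w (w starts at zero)
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0:
                continue
            rho = Xc[:, j] @ r / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, alpha) / col_sq[j]
            r -= Xc[:, j] * (w_new - w[j])
            w[j] = w_new
    return w, y_mean - x_mean @ w      # coefficients and intercept


def mean_test_r2(X, y, n_train=23, n_splits=200, alpha=1.0, seed=0):
    """Average out-of-sample R^2 over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_splits):
        perm = rng.permutation(len(y))
        train, test = perm[:n_train], perm[n_train:]
        w, b0 = lasso_fit(X[train], y[train], alpha=alpha)
        pred = X[test] @ w + b0
        sse = np.sum((y[test] - pred) ** 2)
        sst = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(1.0 - sse / sst)
    return float(np.mean(scores))


# Synthetic stand-in: 34 subjects, 10 FPC-score-like features, one informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(34, 10))
y = 137.65 + 8.0 * X[:, 0] + rng.normal(scale=3.0, size=34)
avg_r2 = mean_test_r2(X, y)
is_helpful = avg_r2 > 0.1              # threshold used in the text
```

The same loop could wrap any of the other four regressors in place of `lasso_fit`; only the fitting and prediction steps would change.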
To obtain a more powerful prediction model, we used the model ensemble technique to develop a useful model. The results are shown in the next section.

Randomness hypothesis test
In statistics, performing multiple hypothesis tests simultaneously leads to a multiple testing problem (Rupert, 2012).

The reason waves rather than periods were chosen to perform the test is that there are a total of 84 periods and values of $\bar{R}^2_{w,p,l,m} > 0.1$ may be sparsely distributed over so many periods, which would cause an infinite result in the maximum likelihood estimation. The same logic applies to the electrodes. A particular method was fixed before the test because different methods applied to the same wave, period, and electrode produce correlated results, which violates the independence assumption.

Exact low-resolution brain electromagnetic tomography analysis
Low-resolution brain electromagnetic tomography (LORETA) is a source-analysis technique designed to estimate the location and activity of neural generators that cause EEG activity on the scalp. It was developed by the KEY Institute of Brain-Mind Research at the University of Zurich (Pascual-Marqui et al., 1994) to calculate the three-dimensional distribution of neural current density sources in the brain. Two improvements to this method have been published: standardized low-resolution electromagnetic tomography (sLORETA), which uses standardized current density to calculate intracerebral generators (Pascual-Marqui, 2002), and exact low-resolution electromagnetic tomography (eLORETA), which does not require standardization to achieve correct localization (Pascual-Marqui, 2007) and is a more precise locator of possible current density sources.
The current eLORETA approach uses a real head model (Fuchs et al., 2002) and electrode coordinates (Tsuzuki et al., 2007). The steps to calculate eLORETA values are as follows: (1) electrode names to coordinates, (2) electrode coordinates to transformation matrix, (3) EEGs to cross spectrum, (4) cross spectra to sLORETA, (5) ROI creation (the dACC and DLPFC ROIs were defined using all voxels within 5 mm of the following seeds [

Behavioral data result
The mean DSS score was 137.65, with a standard deviation of 10.71.

Whole EEG results (p = 84)
The results show that the whole EEG series ($p = 84$) is not a good predictor of the DSS score. None of the models achieved an $\bar{R}^2$ higher than 0.1. The results with $\bar{R}^2$ higher than 0 are displayed in Table 2. Since the $\bar{R}^2$ calculated here is a random variable, there is little confidence that these settings are really helpful in predicting the DSS score.
The results posted here serve as a reference for future studies in this field.

Segmental EEG results (0 ≤ p ≤ 83)
Among the segmented periods ($0 \le p \le 83$), a total of 35 settings achieved an $\bar{R}^2 > 0.1$. Table 3 lists these settings.
Two kinds of measurements were used to assess the predictive value of different methods, waves, and periods. First, the frequency of each setting was calculated and is shown in Table 3. A higher frequency indicates a higher predictive ability. Second, all $\bar{R}^2$ values calculated under each setting were summed, and the sum was used to represent their predictive value. Figure 3 shows these two measurements for different settings. Different colors denote different methods.
There is no SVR method in Figure

Best $\bar{R}^2$
As shown in Table 3, the highest $\bar{R}^2$ was 0.376, obtained with the alpha wave, period 4, and electrode FC1, which suggests a strong relationship between these predictors and the DSS score. Figure 4 shows a histogram of the values of $\bar{R}^2 > 0.1$. While there are a few values of $\bar{R}^2 \ge 0.3$, most $\bar{R}^2$ values are less than 0.2.

Different machine learning method results
Among these 35 helpful models, the performance of different machine learning methods was examined. As shown in Figure 3, most dots are red, indicating that the LASSO method plays an important role in predicting the DSS score. Figure 5 shows the histogram and sum of $\bar{R}^2$ grouped by machine learning method. Because there was no SVR model with a value of $\bar{R}^2 > 0.1$, the SVR method is not shown in Figure 5. With more than half of the results belonging to the LASSO model, the LASSO method clearly outperformed the other four methods. The $\bar{R}^2$ values obtained using the other four methods were quite small, less than 0.2 in almost all cases, while the sum of $\bar{R}^2$ values obtained using the LASSO method was much larger than that for the other methods. LASSO is thus considered the most suitable method for the DSS prediction problem with the help of FPCA in this study.

Different period results
Since there were 84 periods, only periods with a value of $\bar{R}^2 > 0.1$ are displayed for simplicity of visualization. Figure 6 shows the count and $\bar{R}^2$ sum for each period. It can be seen from Figure 6

Different waves results
Among the results with $\bar{R}^2 > 0.1$, the ability of different waves to predict the DSS score was compared. As shown in Figure 3, the alpha wave appears most often among the results with $\bar{R}^2 > 0.1$ and also has the highest sum of $\bar{R}^2$, which suggests that the alpha wave has the best predictive accuracy. Figure 7 shows the distribution of $\bar{R}^2$ grouped by wave. The alpha wave is much more important in predicting the DSS score than the other three waves. The delta wave's summed $\bar{R}^2$ was rather low, while the beta wave produced the least valuable models.
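The model ensemble averages the outputs of the six single-electrode models,

$$M_{\text{ensemble}} = \frac{1}{6} \sum_{\text{lead} \in \{\text{FC1},\, \text{FCz},\, \text{Fz},\, \text{FC3},\, \text{Cz},\, \text{AFz}\}} M_{\text{lead}},$$

where $M_{\text{lead}}$ denotes the model predicting the DSS score using that particular electrode together with the alpha wave, the 4th period, and the LASSO method.
The model ensemble's average $R^2$ was 0.45 over 200 repetitions of random sampling into training and testing sets. The model ensemble thus outperformed every single model.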
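A minimal sketch of output averaging across single-electrode models is given below. Equal weights are an assumption, consistent with the Discussion's description of "averaging each single model's output". The per-electrode predictions are simulated as the true score plus independent noise, so the numbers are purely illustrative, and the helper `r_squared` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)
electrodes = ["FC1", "FCz", "Fz", "FC3", "Cz", "AFz"]

# Hypothetical test-set DSS scores (mean and SD borrowed from the behavioral data).
true_scores = rng.normal(137.65, 10.71, size=11)

# Hypothetical single-electrode predictions: truth plus independent errors.
predictions = {e: true_scores + rng.normal(0.0, 8.0, size=11) for e in electrodes}


def r_squared(y, y_hat):
    """Out-of-sample R^2 against the mean-only baseline."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


# Equal-weight ensemble: average the six models' outputs for each subject.
ensemble_prediction = np.mean([predictions[e] for e in electrodes], axis=0)

single_r2 = {e: r_squared(true_scores, predictions[e]) for e in electrodes}
ensemble_r2 = r_squared(true_scores, ensemble_prediction)
mean_single_r2 = float(np.mean(list(single_r2.values())))
```

By convexity of squared error, the averaged prediction can never do worse than the average of the individual models; whether it also beats the best single model depends on how correlated the individual errors are.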

sLORETA results
We selected regions of interest based on the existing literature (Damasio, 1996; De Ridder et al., 2011; Song et al., 2014). The two regions were the right and left dACC (Figure 8a and 8b). The scatterplot of dACC alpha values and DSS scores is displayed in Figure 9. As Figure 9 shows, there are

FIGURE 9 Scatterplot of dACC alpha value and DSS scores

A homogeneity test and the Anderson-Darling test of normality (Anderson & Darling, 1954) were performed. The p value of the homogeneity test was 0.095, indicating no significant difference in dACC variance between the two areas. The p values of the normality test for the left-dACC alpha value, right-dACC alpha value, and DSS score were <0.001, <0.001, and 0.889, respectively. The strong evidence against normality of the dACC values may be due to their extreme values (as shown in Figure 9). Therefore, Spearman's correlation test was considered more appropriate in this study.
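To illustrate why a rank-based coefficient is less sensitive to extreme values than Pearson's, the sketch below computes Spearman's correlation as the Pearson correlation of average ranks. The data are invented for the example (one extreme "alpha value"), and the function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: a monotone relationship with one extreme x value.
alpha_values = [0.1, 0.2, 0.3, 0.4, 50.0]
dss_scores = [130.0, 135.0, 138.0, 141.0, 142.0]

rho = spearman(alpha_values, dss_scores)
r = pearson(alpha_values, dss_scores)
```

In this toy case the relationship is perfectly monotone, so the rank-based coefficient is at its maximum, while Pearson's coefficient is pulled down by the single extreme value.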
We also analyzed the precuneus, one of the areas with the lowest spontaneous activity (Coito et al., 2019) as a control. The precuneus is rarely activated in the resting state of EEG, so one could expect that it is not related to dialectical thinking in the resting-EEG.

DISCUSSION
Our results show that the whole EEG series is not a good predictor of DSS, suggesting that a more precise analysis of data segments is required. Among the segmented periods, periods 4 to 6 at the beginning of data recording demonstrated good predictive value. Our interpretation of this result is that during these periods, the participants were still new to the experimental environment and were gradually adapting to it, and dispositional dialectical thinking was especially exerting its effect during the disengagement from external stimuli. However, the current resting-state design limits our ability to finely delineate the exact events that happened during the beginning phase. Future studies might directly examine the link between dispositional dialectical thinking and EEG signals when there is an overt task.
Among the machine learning methods, our results show that LASSO was the best in this study. LASSO is a linear model with shrinkage. It reduces the prediction variance at the cost of a little bias so that the total prediction error is smaller (Tibshirani, 1996). The linear relationship is a simple assumption about the response and predictors, which makes it widely used in regression analysis. Because our sample size was rather small, we were more inclined to use a simple model for our problem. On the other hand, the number of feature dimensions after the FPCA procedure was approximately 10 for most $\{w, p, l\}$ settings ($1 \le w \le 4$, $0 \le p \le 84$, $1 \le l \le 63$), which is slightly high relative to our sample size of 23 in the training set. Because LASSO also serves as a method of feature selection owing to its sparse estimates (Tibshirani, 1996), it is reasonable to expect it to behave better after this further dimension reduction in our small-sample problem. In contrast, the prediction accuracy of the other four machine learning methods (SVR, GBDT, KNN, and RF), which lack feature subset selection, may be affected by FPCA scores that are irrelevant to the DSS score.
Because 4 × 84 × 63 = 21,168 $\bar{R}^2$ values were calculated for each machine learning method in the segmental EEG analysis, one needs to be careful regarding results of $\bar{R}^2 > 0$, since false positives occur when a large number of trials are performed. Thus, whether these results are simply caused by random chance needs to be determined. The randomness test rejects the null hypothesis that the $\bar{R}^2_{w,p,l,m}$ values are independent and identically distributed for the LASSO method, which gives us more confidence in the relationship between the useful predictors and the DSS score. The test results show that there is a significant difference between the alpha wave and the other three waves, while the differences among the other three waves were not significant enough to distinguish them. However, this conclusion should be drawn cautiously.
The test statistic's distribution relies on the assumption that signals from different periods and electrodes under each wave are independent and identically distributed, which may not be true in real-life situations. Thus, rejecting this hypothesis does not make much sense in some cases. On the other hand, the true relationships among different periods and electrodes are too complex to characterize, let alone to account for in a hypothesis test. Nevertheless, this test result does provide some evidence regarding the predictive ability of the alpha wave despite these limitations. Furthermore, a model ensemble was used to strengthen the model's predictive ability. The model ensemble is a technique that combines different models to achieve better accuracy than any of its constituent models (Opitz & Maclin, 1999). In our case, the model ensemble achieved an average $\bar{R}^2$ of 0.45, while the highest $\bar{R}^2$ of a single model was 0.376. This is due to a reduction in the error variance obtained by averaging the single models' outputs. The performance of the model ensemble is affected by the correlation among its constituent models. Generally speaking, the more independent the constituent models are, the better the model ensemble will perform (Goodfellow et al., 2016).
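The variance-reduction argument can be made explicit with a standard result on model averaging (see Goodfellow et al., 2016). As a simplifying assumption, suppose the $k$ constituent models make zero-mean errors $\epsilon_i$ with common variance $v$ and common pairwise covariance $c$; these symbols are introduced here for illustration only and are not estimated in the present analysis.

```latex
\mathbb{E}\left[\left(\frac{1}{k}\sum_{i=1}^{k}\epsilon_i\right)^{2}\right]
  = \frac{1}{k^{2}}\,\mathbb{E}\left[\sum_{i}\epsilon_i^{2} + \sum_{i \neq j}\epsilon_i\epsilon_j\right]
  = \frac{1}{k}\,v + \frac{k-1}{k}\,c
```

With $k = 6$ electrodes, perfectly correlated errors ($c = v$) give no improvement over a single model, whereas uncorrelated errors ($c = 0$) reduce the expected squared error to $v/6$.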
Because our constituent models were based on electrodes FC1, FCz, Fz, FC3, Cz, and AFz separately, they are not expected to share much dependence, which accounts for the increased prediction accuracy of our model ensemble. For comparison with other relevant studies, Al Zoubi et al. (2018) built a model to predict age from EEG signals and achieved a best $R^2$ of 0.37 (500 subjects), and Zhang et al. (2020) used EEG to predict working memory with a model $R^2$ of 0.72 (145 subjects). Considering that only 34 subjects were used in this study, we consider our model's performance to be encouraging, with an average $\bar{R}^2$ of 0.45.

Finally, in the model, the electrodes that best predicted DSS scores were broadly consistent with the literature review, primarily measuring signals from the dACC and DLPFC.
Consequently, an sLORETA source-analysis approach was used, which was designed to estimate the location and activity of the neural generators that cause EEG activity on the scalp. We explored the correlation between DSS and cortical sources of resting cortical EEG rhythms. In the present study, we observed a positive correlation between right dACC resting alpha sources and DSS scores. There are several interpretations for this result (Sadaghiani & Kleinschmidt, 2016). First, alpha oscillations are associated with the inhibition of neural activity, a process that corresponds cognitively to the internal maintenance of tonic alertness and usually occurs during brain processes not directly related to tasks; similarly, resting-state EEG was used in this study. Second, alpha oscillations are associated with the cognitive function of selective attention; that is, the associated feature selection process takes precedence over other processes in a top-down manner. Easterners with a higher degree of dialectical thinking pay more attention to relational situations than Westerners (English & Chen, 2007). Third, alpha oscillations can also achieve rapidly changing long-distance cortical coordination, which can be thought of as phase-adaptive control, including the regulation of working memory. Given the dominant cultural norms in East Asia (i.e., ingroup harmony and collective agency), these strategies play a functionally adaptive role in everyday control exertion (Park et al., 2018). Moreover, alpha waves can be well identified using a data-driven approach (Tenke & Kayser, 2015; Tenke et al., 2017). This showed that only the peculiar topography and frequency of cortical resting EEG sources were able to roughly discriminate between dialectical and nondialectical thinkers. These results are in line with previous findings suggesting that dACC alpha rhythms are one of the physiological mechanisms by which the associative dACC modulates conflict processing (Nakao et al., 2013; Strauss et al., 2012).
The current study has several limitations. First, the disadvantage of EEG is its spatial resolution. The 64 electrodes can only map a limited area of activity, and 256-electrode set-ups have a significant spatial resolution improvement over their 64-electrode equivalents (Luu et al., 2001; Wu et al., 2014). Second, on this basis, there is controversy regarding whether spatial localization truly reflects changes in specific brain regions, which is worth investigating. In the future, using MRI and magnetoencephalography to investigate spatial changes in the brain will be a worthy research direction. Third, there are some hyperparameters in these five machine learning methods (e.g., the regularization coefficient in the LASSO model, the number of nearest neighbors in KNN) that can be tuned to improve model accuracy. Because we included many $\{w, p, l\}$ settings, it was impractical to tune these parameters individually.
We decided upon these hyperparameters based on our empirical experience and successfully obtained approximate results. Since there has been no prior research on the prediction of DSS scores based on EEG data, our results can serve as a reference for a more precise study in the future.

CONCLUSION
We investigated the brain's spontaneous activity over time using resting EEG and linked it to dialectical thinking. There was a significant positive correlation between the alpha wave of sLORETA activity and DSS score in the right dACC brain region. Together with the sLORETA analysis, our machine learning results show that LASSO is the best machine learning method and the alpha wave is the best predictor of DSS score in this study. With data-driven selected electrodes (FC1, FCz, Fz, FC3, Cz, and AFz), the prediction model achieved an average coefficient of determination of 0.45 on 200 random test sets.

The thoughtful comments of two reviewers are gratefully acknowledged.

CONFLICT OF INTEREST
All other authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data and code of this study will be available on request to the corresponding author.

PEER REVIEW
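The peer review history for this article is available at https://publons. .

We build our functional model

$$Y_i(t) = \eta_i(t) + \varepsilon_i(t), \qquad (1)$$

where $\eta_i(t)$ is the infeasible underlying time-varying process of subject $i$, which is a realization of the stochastic process $\eta(\cdot)$, and $\varepsilon_i(t)$ is the error term of the $i$th subject. The observed data are the discretized realization of this model. Denote the covariance function of $\eta(\cdot)$ as $G(t, t') = \mathrm{Cov}(\eta(t), \eta(t'))$. Classical functional analysis theory assures that there exist a series of eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ and corresponding orthonormal eigenfunctions $\{\psi_k\}_{k \ge 1}$ such that $G(t, t') = \sum_{k \ge 1} \lambda_k \psi_k(t) \psi_k(t')$; the functions $\phi_k = \sqrt{\lambda_k}\,\psi_k$ are called the rescaled eigenfunctions of the covariance function $G(t, t')$. Furthermore, we represent model (1) in the well-known Karhunen-Loève $L^2$ form:

$$Y_i(t) = m(t) + \sum_{k \ge 1} \xi_{ik}\,\phi_k(t) + \varepsilon_i(t), \qquad (2)$$

where $m(\cdot)$ is the mean function among all the subjects and $\eta_i(t) = m(t) + \sum_{k \ge 1} \xi_{ik}\,\phi_k(t)$. The FPC (functional principal component) scores $\{\xi_{ik}\}_{1 \le i \le n,\ k \ge 1}$ are a series of uncorrelated random variables with mean zero and variance 1, representing the complex structure behind the data. Since the observed data are discrete, the practical model is

$$Y_{ij} = m(t_j) + \sum_{k \ge 1} \xi_{ik}\,\phi_k(t_j) + \varepsilon_{ij}. \qquad (3)$$

The procedure of estimating the FPC scores is decomposed into the following steps.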

A.1 B-spline estimation

A.3 Estimation of FPC score, eigenvector, and eigenvalues
For any positive integer $k$, the $k$th eigenvalue $\lambda_k$ and its corresponding eigenfunction $\psi_k(\cdot)$ satisfy the equation

$$\int G(t, t')\,\psi_k(t')\,dt' = \lambda_k\,\psi_k(t).$$

We use $\hat{G}(t, t')$ to replace $G(t, t')$ and approximate $\psi_k(\cdot)$ by spline functions.
Then it can be shown by simple algebra that the equation Finally, the number of FPC scores is chosen by the rule-of-thumb criterion: The first FPC scores are selected, by which one is assumed to have obtained enough information from the original signal.
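The estimation steps can be illustrated with a simplified sketch that discretizes the eigen-equation on a common time grid instead of approximating the eigenfunctions by splines, so it is a stand-in for, not a reproduction of, the procedure described above. The 95% variance-explained cutoff for choosing the number of FPC scores is an assumption made for the example (the criterion's actual threshold is not given here), the curves are synthetic, and the function name `fpca` is hypothetical.

```python
import numpy as np


def fpca(curves, var_explained=0.95):
    """FPCA on curves of shape (n_subjects, n_timepoints) sampled on a common grid."""
    n, N = curves.shape
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    G_hat = centered.T @ centered / n            # sample covariance on the grid
    # Discretized eigen-equation: (1/N) * G_hat @ psi = lambda * psi
    eigvals, eigvecs = np.linalg.eigh(G_hat / N)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # Assumed rule: keep the fewest components reaching the variance cutoff.
    frac = np.cumsum(eigvals) / eigvals.sum()
    kappa = int(np.searchsorted(frac, var_explained) + 1)
    psi = eigvecs[:, :kappa] * np.sqrt(N)        # normalized so mean(psi_k^2) = 1
    lam = eigvals[:kappa]
    # Scores rescaled to unit variance: <eta_i - m, psi_k> / sqrt(lambda_k)
    xi = (centered @ psi / N) / np.sqrt(lam)
    return mean_curve, lam, psi, xi


# Synthetic curves for 34 subjects driven by two random components plus noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)
a = rng.normal(scale=1.5, size=(34, 1))
b = rng.normal(scale=1.0, size=(34, 1))
curves = (5.0 + a * np.sin(2 * np.pi * grid) + b * np.cos(2 * np.pi * grid)
          + rng.normal(scale=0.05, size=(34, 200)))

mean_curve, lam, psi, xi = fpca(curves)
```

Each row of `xi` is the low-dimensional feature vector of one subject; in the study's pipeline, vectors of this kind are what the machine learning models receive as predictors.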