Decoding the neural responses to experiencing disgust and sadness

Being able to classify experienced emotions by identifying distinct neural responses has tremendous value in both fundamental research (e.g. positive psychology, emotion regulation theory) and in applied settings (e.g. clinical, healthcare, commercial). We aimed to decode the neural representation of the experience of two discrete emotions, sadness and disgust, free of differences in valence and arousal. In a passive viewing paradigm, we showed emotion-evoking images from the International Affective Picture System to participants while recording their EEG. We then selected a subset of those images that were distinct in evoking either sadness or disgust (20 for each), yet were indistinguishable on normative valence and arousal. Event-related potential analysis of 69 participants showed differential responses in the N1 and EPN components, and a support-vector machine classifier was able to classify whole-brain EEG patterns of sadness and disgust experiences with 58 % accuracy. These results support and expand on earlier findings that discrete emotions have differential neural responses that are not caused by differences in valence or arousal.


Introduction
Emotions have an impact on many aspects of daily life. It is therefore not surprising that the nature of emotions has been the topic of active scientific debate for more than a century (Scherer, 2005). One of the major unresolved issues is whether emotions are discrete categories (e.g. disgust or sadness) or dimensional in nature (e.g. valence and arousal). Researchers studying emotion usually adopt either the categorical or the dimensional framework and introduce experimental manipulations within that framework whilst ignoring the other. This then results in a comparison of, for instance, experiencing disgust versus sadness regardless of valence; or a comparison of positive versus negative valence regardless of emotion category. Consequently, the experience of valence and emotion categories are conflated and so are their neural representations. Disentangling those neural representations would help to advance our knowledge of emotion experiences. Recent papers have called for a more integrated approach (Mikels et al., 2005; Harmon-Jones et al., 2017; Harmon-Jones, 2019). In the current study, we compare the neural responses of experiencing disgust versus sadness, carefully matched for levels of valence and arousal.

Emotion theories
Researchers strive to construct models that are as sparse as possible, yet still able to encompass all emotion experiences. The currently leading emotion theories can be divided into dimensional, discrete, appraisal, and constructivist theories.
The discrete models state that a small set of emotions exist, each with their own behavioral, neural, and physiological (most importantly facial) response pattern (Hamann, 2012). The best known discrete model is Ekman's theory of basic emotions, consisting of happiness, sadness, disgust, fear, anger, and surprise (Ekman, 1992). Importantly, basic emotions are assumed to be universal and innate (Ekman and Cordaro, 2011).
Dimensional models state that all emotion experiences are the product of two (or a few) independent bipolar dimensions. The most widely adopted of these is the Circumplex model of affect (Russell, 1980) which states that all emotion experiences can be placed on a circular map consisting of two independent bipolar dimensions called valence and arousal.
Appraisal theories propose that emotions are the result of physiological responses to an event and some form of cognitive appraisal of those responses. The multi-level sequential check model proposes that this cognitive appraisal is a fixed order sequence (Scherer, 2001) in which novelty processing is an essential component that occurs before the processing of pleasantness (van Peer et al., 2014).
In the constructivist view of emotions, the theory of constructed emotion (Barrett, 2017) is currently dominant. It proposes that emotions are concepts that are constructed by the brain. These concepts are neural representations that are shaped by past experiences and that predict sensory input, the best course of action, and the consequences for allostasis (the balance of resources for all physiological systems that are needed for growth, survival and reproduction). The predicted consequences for allostasis can be consciously experienced as affect.

Neural representations of emotions
Regardless of which emotion theory they provide evidence for, being able to identify distinct neural responses for the experiencing of different emotion categories would have tremendous value in both fundamental research (e.g. positive psychology, emotion regulation theory) and in applied settings (e.g. clinical, healthcare, commercial). Fine-grained measurement of which emotions someone experiences may help understand the relationship between affect and subjective well-being. So, the disentanglement of the neural responses for the experiencing of different emotion categories would serve a scientific purpose, regardless of what is innate and what is constructed.
Even though fMRI research found that certain brain areas (e.g. the amygdala, insula) show more activation for some emotion categories than for others (Fusar-Poli et al., 2009; Vytal and Hamann, 2010), this localization is not specific and consistent (Lindquist et al., 2012), and simple one-to-one mappings between brain areas and discrete emotions have now been ruled out (Hamann, 2012). Similarly, there is little support that a single brain area regulates valence, nor that two independent systems regulate positive and negative affect (Lindquist et al., 2016).
However, the findings of distinct activation patterns provide some evidence that it might be possible to identify which emotion is experienced based on neuroimaging data. Further evidence is provided by EEG studies.
Since emotional responses are believed to start within the subsecond range of an event, the temporal dynamics of emotion experiences have frequently been explored (Esslen et al., 2004). Most of this research has focused on event-related potentials (ERPs) of the EEG signals using stimuli from the International Affective Picture System (IAPS; Lang et al., 2008) since its introduction in 1988 (Lang et al., 1988). With its normative valence and arousal ratings for more than a thousand images, the IAPS has proved to be a convenient tool for such studies. As a consequence, however, most studies have focused on valence and arousal. The lack of a discrete-emotions equivalent has resulted in considerably fewer studies into ERP responses to discrete emotions. Hajcak et al. (2011) concluded that "no ERP component has been found that reflects a specific emotion, and variation in the timing and amplitude of stimulus elicited ERPs appears to relate to broad dimensions of emotion and motivation" (p. 517). In order to facilitate research into neural responses of discrete emotions, Mikels et al. (2005) and Libkuman et al. (2007) have provided normative ratings of discrete emotions for IAPS images.
Research into the temporal dynamics of valence and arousal experiences has found consistent effects on the amplitude of early (P1, N1), middle (P2, N2, early posterior negativity; EPN) and late ERP components (the late positive potential; LPP) (Olofsson et al., 2008; Hajcak et al., 2011). Valence effects in P1 have been mixed and contradictory (Hajcak et al., 2011). N1, P2, N2, and EPN components have all been found to be larger for both pleasant and unpleasant images compared to neutral ones (Foti et al., 2009; Hajcak et al., 2011; Ibanez et al., 2012), though a greater EPN for pleasant compared to neutral and unpleasant images has also been found (Hajcak et al., 2011). Finally, the LPP is usually larger for arousing stimuli (Hajcak et al., 2011). Generally, early components are associated with valence effects and late components with arousal effects (Olofsson et al., 2008). However, many studies did not control for arousal when studying valence, which might be a confounding factor since very pleasant and very unpleasant images are usually higher in arousal than neutral images (Hajcak et al., 2011). Studies that explicitly contrasted both valence and arousal have found arousal effects for early components as well (Feng et al., 2012; Feng et al., 2014). Overall, valence and arousal do seem to differentially impact ERPs.
The few EEG studies that have sought to contrast the neural responses to discrete emotion experiences found differences in ERP amplitudes between fear and disgust (Carretié et al., 2011; Wheaton et al., 2013), and between disgust, happiness, and sadness (Hot and Sequeira, 2013). Moreover, ERP patterns of valence and arousal, and of fear, anger, happiness, and disgust, have been found to be consistent within category (Grootswagers et al., 2020). Additionally, Zhao et al. (2018) were able to successfully classify tenderness versus amusement and anger versus fear experiences based on power in frequency bands. These studies provide initial evidence of distinct temporal patterns for discrete emotions, but further research is needed to establish this more clearly.
Taken together, we can conclude that consistent neural patterns have been found for both valence and arousal, and for several discrete emotions. At the same time, the literature shows much variance and noise in these neural patterns. Meta-analyses may yield statistically significant results, yet the practical implications are limited if individual empirical findings are diffuse. One possible source of noise is a lack of systematic control of the emotion aspects that are not the focus of the study. For instance, many studies exploring the neural responses to discrete emotions have not accounted for differences in valence and/or arousal levels between conditions, or vice versa.

Systematic study of multifaceted emotion experiences
We are not the first to advocate proper experimental control of all aspects of emotion experiences. Harmon-Jones et al. (2017) propose an integrated perspective, considering both discrete and dimensional views to explain the complex pattern of neural responses to emotionally salient stimuli. Mikels et al. (2005) and Libkuman et al. (2007) made efforts to categorize IAPS images with discrete emotion labels to facilitate such integrated studies.
Some studies have already adopted such an approach, separating discrete emotions from valence and arousal (Carretié et al., 2011; Hot and Sequeira, 2013; Zhao et al., 2018). Lu et al. (2016) have demonstrated that it is important to do so, i.e. that variation in valence within a discrete emotion does affect ERP responses. Carretié et al. (2011) explored whether disgusting and fearful images would differ in attracting attention in a number classification task where the images served as distractors. P2 amplitudes were larger in response to the disgusting distractors compared to the fearful ones. Importantly, differences in valence and arousal were statistically controlled for, making it unlikely that the observed P2 differences were related to differences in valence and arousal. However, the number classification task makes it difficult to relate the observed effects to the experiencing of emotions. Also, each image was shown twice, risking a repetition effect that would have its own impact on ERP amplitudes (Curran and Dien, 2003; Rugg et al., 1988; Ferrari, 2017). Zhao et al. (2018) showed short videos that elicited either amusement, tenderness, anger or fear. They were able to successfully classify the EEG patterns that were recorded during the two positively valenced (tenderness and amusement) videos, and during the two negatively valenced (anger and fear) videos, which were similar in valence and arousal. Classification was based on the power in different EEG frequency bands (theta, alpha, and beta). However, they used only a single film clip per category, which poses a serious problem for the interpretation and validity of the results: neural differences could have been caused by the experienced emotion, but just as likely by another aspect of the movie (sensory differences, attentional differences, soundtrack differences, etc.). It is therefore unclear whether the observed differences can be related to differences in the experiencing of discrete emotions.
Hot and Sequeira (2013) showed IAPS pictures that elicited happiness, sadness, disgust, or were neutral. Differential ERP patterns between all categories were found. Most noteworthy here was a distinct pattern for sadness compared to disgust during the 160-200 ms interval at parietal sites. These two conditions were similar in their normative valence and arousal scores. Therefore, Hot and Sequeira (2013) provide the clearest evidence thus far that discrete emotion experiences evoke differential ERP responses that are unrelated to differences in valence and arousal. However, their analysis was performed through a two-step PCA (first spatial, then temporal), which makes it difficult to compare the observed effects to other studies that typically explore ERP components. Further, the study used only 15 images per category, and showed each image twice, risking repetition related ERP responses.
In sum, there appears to be some initial evidence that EEG patterns may differ in response to the experiencing of different discrete emotions, even when these are controlled for differences in valence and arousal. The aim of the present study is to establish this more firmly. If established, this would imply that a more systematic approach to studying neural responses to emotion experiences is necessary in order to gain a fine-grained view that is usable in further applied and fundamental research of emotion.

Current study
In the current study, we showed IAPS images that we independently validated to be either disgust or sadness evoking. Sadness and disgust experiences partially overlap in valence and arousal. Rather than statistically controlling for differences in valence and arousal, we carefully selected the images so that the two sets (sadness and disgust) were indistinguishable on their normative valence and arousal ratings.
In order to have sufficient statistical power we tested a higher number of participants (69 were used in the analyses) than previous studies. To prevent contamination of the EEG data with repetition effects (Rugg et al., 1988; Rugg et al., 1997; Curran and Dien, 2003), we showed each image only once. Further, we adopted a passive viewing paradigm.
To assess whether different emotions elicit different patterns of EEG responses, we analyzed the ERP components that are commonly associated with emotion experiences. Furthermore, we used state-of-the-art whole-brain SVM classification to see whether the neural responses to disgust and sadness were distinguishable, even when their valence and arousal ratings were not.

Results
We first performed a conventional ERP analysis comparing ERP amplitudes between the disgust and sadness conditions. For each ERP, the EEG data was aggregated over the relevant time interval and subset of channels. Through a hierarchical regression analysis we then explored whether emotion category predicted the observed differences in ERP components after controlling for differences in the low- and mid-level features of the images and the normative valence and arousal ratings of the images. Finally, we performed a machine learning classification on the entire EEG waveform ([0-1 s] post stimulus onset) of all available channels to make full use of all the detail that is embedded in the EEG signal.

ERP analysis
Nonparametric cluster-based permutation tests were performed in order to solve the family-wise error rate problem that would occur in parametric testing of EEG data (Maris and Oostenveld, 2007). Statistical significance of differences between evoked disgust and sadness responses was determined through Monte-Carlo estimates of the nonparametric cluster analyses with dependent-sample t-tests for each of five emotion related ERP components: N1, P2, N2, EPN, and LPP. P1 was not included as this ERP is not commonly present in a passive viewing paradigm (Hajcak et al., 2011). In all comparisons we used the weighted cluster mass (WCM; Hayasaka and Nichols, 2004) cluster statistic with a weight parameter (θ = 1), resulting in the use of Fisher's combining function (Fisher, 1992), which is sensitive to peak intensity (Hayasaka and Nichols, 2004). EEG amplitudes were averaged over appropriate time intervals as determined by visual inspection of the grand averages over participants, guided by the intervals and topography provided by Hajcak et al. (2011). Multiple comparison correction was applied at the cluster level per ERP component. All tests were two-sided and an alpha threshold of 0.05 was used to determine significance. For the sake of readability, only significant p-values are provided.
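The logic of the cluster-based permutation procedure can be sketched in a deliberately simplified form. The sketch below is hypothetical and not the analysis pipeline used here: it clusters over adjacent time points only (ignoring channel adjacency), uses a plain cluster-mass statistic rather than the weighted cluster mass, and randomizes condition labels via per-participant sign flips of the paired difference waves.

```python
import numpy as np

def cluster_masses(tvals, threshold):
    """Sum of |t| within each run of supra-threshold time points (1-D clustering)."""
    masses, current = [], 0.0
    for t in tvals:
        if abs(t) > threshold:
            current += abs(t)
        elif current:
            masses.append(current)
            current = 0.0
    if current:
        masses.append(current)
    return masses

def cluster_permutation_test(diff, threshold=2.0, n_perm=1000, seed=0):
    """diff: (n_subjects, n_times) paired condition differences (e.g. disgust - sadness).
    Returns observed cluster masses and their Monte-Carlo p-values."""
    rng = np.random.default_rng(seed)
    n = diff.shape[0]

    def tvals(d):
        # One-sample t-value per time point against zero.
        return d.mean(0) / (d.std(0, ddof=1) / np.sqrt(n))

    observed = cluster_masses(tvals(diff), threshold)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        # Under the null, the sign of each participant's difference is exchangeable.
        flips = rng.choice([-1.0, 1.0], size=n)[:, None]
        masses = cluster_masses(tvals(diff * flips), threshold)
        null_max[i] = max(masses, default=0.0)
    # Compare each observed cluster against the null distribution of the maximum mass.
    pvals = [(np.sum(null_max >= m) + 1) / (n_perm + 1) for m in observed]
    return observed, pvals
```

Because each observed cluster is compared against the permutation distribution of the maximum cluster mass, the family-wise error rate is controlled without a per-time-point correction.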

N1
Cluster-based permutation tests over the pre-determined 70-110 ms interval surrounding the N1 peak revealed a significantly larger average negative amplitude for disgust compared to sadness evoked responses at frontal-central sites (p = .01). See Fig. 2A for the topography of the N1 cluster.

P2
After selecting a latency range of 180-230 ms as the interval where the P2 was most prominent, cluster-based permutation tests showed a significantly larger average positive amplitude for the sadness compared to the disgust evoked EEG responses (p = .008). In this interval the difference was most clear at central sites, mostly in the right hemisphere (see Fig. 2A).

N2
In the N2 latency range of 220-290 ms after stimulus onset, EEG amplitudes to sadness evoking images were significantly larger (p = .018) than those to disgust evoking images. However, this difference can best be labelled as part of the EPN (covered next) due to its topography and overlap in latency. None of the clusters fitting the N2 topography indicated a significant difference between conditions.

EPN
The EPN was visible in the latency range of 230-350 ms after stimulus onset. Note that the EPN is a relative negative peak, meaning that the absolute amplitudes stay within the positive range, but the peak is negative going compared to the surrounding interval. Cluster-based permutation tests showed that the average amplitude of the disgust evoked response was significantly larger than that of the sadness evoked response (p = .015) at occipital channels (see Fig. 2A).

Fig. 2. A. A positive value reflects that the average EEG amplitude over the latency interval was more positive for the disgust than for the sadness experience. Consequently, for negative going peaks, negative differences indicate larger amplitudes for disgust compared to sadness. Dots represent the placement of the electrodes. Asterisks (*) indicate electrodes that constitute the cluster for which a significant difference was found. B. The topographies of the beta coefficients of emotion category in the final regression model, which includes the participant, the low-level features of the images, the mid-level features of the images, valence and arousal, and emotion category (disgust or sadness) per electrode, explaining the variation in the single-trial ERP components quantified as the average amplitude in the respective latency intervals. Asterisks (*) indicate that the emotion category contributes significantly to the prediction of the EEG amplitude.

LPP
Cluster-based permutation tests using the LPP latency range (300-1000 ms) did not reveal significant differences between disgust and sadness responses. Visual inspection of the grand averages shows that the LPP amplitude is larger for disgust compared to sadness in the early LPP interval (470-550 ms), and that this difference is reversed in the later interval (880-1000 ms). A quickly rising LPP, possibly related to a quick rise in arousal, seems like a plausible response from an evolutionary perspective: a disgusting event calls for swift action to increase survival chances, while a slower but longer lasting response seems plausible for a sadness evoking event.
Because these opposite LPP differences would have cancelled each other out when averaging the amplitude over time, we performed separate post hoc cluster-based permutation tests over the early and late LPP intervals. Several emotion related studies have used a similar split into an early and a late LPP interval (e.g. Bublatzky and Schupp, 2012; DeCicco et al., 2014; Dennis and Hajcak, 2009; Schindler and Kissler, 2016; Schindler et al., 2015; Schupp et al., 2004).
To complete our analysis, we performed post hoc cluster-based permutation tests over the early LPP window (400-600 ms) and the late LPP window (800-1000 ms). The resulting clusters indicated there were no significant differences between the average disgust and sadness amplitudes, though in the late window a greater response for sadness than for disgust approached significance (p = .055).

Hierarchical regressions analyses
We performed a hierarchical regression analysis to explore whether emotion category explains a significant amount of variance in the EEG response above and beyond the low- and mid-level features, and the valence and arousal ratings of the images. Predictors were entered into the model through forced entry in five steps: 1) the participant, 2) the low-level features of the images, 3) the mid-level features of the images, 4) the normative valence and arousal ratings, and finally 5) the emotion category. The low-level features consisted of luminance, contrast, skewness and kurtosis of the luminance, and the position on the green-red and on the blue-yellow spectra that were extracted from the perceptually realistic CIELAB (or L*a*b*) color space (ISO/CIE, 2019). The mid-level features of the images were extracted using the GIST descriptor by Oliva and Torralba (2001), yielding 512 features. These features were then reduced to three principal components in a principal component analysis with a varimax rotation. The scree plot indicated a clear inflection point after the third component. The three components thus selected explained a total of 63 % of the original variance.
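The key statistical step in such a hierarchical analysis, testing whether the block of predictors entered in the final step explains variance above and beyond the earlier blocks, is an R²-change F test. The following minimal sketch illustrates it on simulated data; the variable names, effect sizes, and two-predictor reduced model are illustrative stand-ins, not the study's actual predictor sets.

```python
import numpy as np
from scipy import stats

def r_squared(X, y):
    """OLS R^2 with an intercept column added."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def r2_change_test(X_reduced, X_full, y):
    """F test for the variance explained by the predictors added in X_full
    beyond those already in X_reduced (one forced-entry hierarchical step)."""
    r2_red, r2_full = r_squared(X_reduced, y), r_squared(X_full, y)
    n, k_full = len(y), X_full.shape[1]
    df_num = X_full.shape[1] - X_reduced.shape[1]
    df_den = n - k_full - 1  # minus 1 for the intercept
    F = ((r2_full - r2_red) / df_num) / ((1 - r2_full) / df_den)
    return r2_full - r2_red, F, stats.f.sf(F, df_num, df_den)

# Hypothetical single-trial amplitudes in which emotion category truly matters.
rng = np.random.default_rng(0)
n = 500
valence = rng.normal(size=n)
arousal = rng.normal(size=n)
category = rng.integers(0, 2, size=n).astype(float)  # 0 = sadness, 1 = disgust
y = 0.3 * valence + 0.2 * arousal + 0.8 * category + rng.normal(size=n)

X_reduced = np.column_stack([valence, arousal])
X_full = np.column_stack([valence, arousal, category])
delta_r2, F, p = r2_change_test(X_reduced, X_full, y)
```

A significant F for the change indicates that emotion category uniquely explains variance (ΔR²) that the earlier steps cannot account for.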
The dependent variables for these regression analyses were the single-trial ERPs for which we found clusters of significant differences: the N1, P2, and EPN components, quantified as the average EEG amplitude over the respective latency intervals. The analysis was performed per individual electrode to allow for an inspection of the topographic maps of the resulting beta coefficients (Fig. 2B). For a full listing of beta coefficients, R² change, Fchange, and p-values per electrode, see Table 1. Adding the emotion category to the regression model explained a significantly larger amount of variance than the model without emotion category for the EPN latency in electrodes P3, PO3, O1, Oz, O2, and F8 (Table 1). The topography of the beta coefficients of emotion category (Fig. 2B) closely resembled the topography of the cluster analysis for the EPN (Fig. 2A), providing further support for the EPN being sensitive to emotion category.

Table 1. Note. Hierarchical linear regression models were used to explain the variance in the N1, P2, and EPN components, quantified as the average EEG amplitude over the respective latency intervals per individual electrode. Predictors were entered into the model through forced entry in five steps: 1) the participant, 2) the low-level features of the images, 3) the mid-level features of the images, 4) the normative valence and arousal ratings, and finally 5) the emotion category. Listed here are the standardized beta coefficients (β) of the emotion category in the final linear regression model, the variance that emotion category uniquely explains (ΔR²), the Fchange(1, 2377) (FΔ), and the p-values of that change. Bold markings highlight electrodes for which emotion category explains a significant amount of variance.
In the N1 latency interval, emotion category explained a significant amount of variance in electrodes Fp1 and F3 (Table 1), and the topography of the beta-coefficients (Fig. 2B) is very similar to that of the cluster analysis ( Fig. 2A).
In contrast, the topography of the beta coefficients in the model predicting the P2 component (Fig. 2B) bears no resemblance to the topography of the previously found P2 cluster of significant differences (Fig. 2A). The coefficient of only one electrode (P8; Table 1) was significant; however, this electrode was not part of the P2 cluster.

SVM classification
The cluster-based analysis is guided by previous literature and consequently makes use of the data in a condensed form to explore how the current data compares to previous reports. In addition to this, we adopted a data-driven approach in comparing the full EEG signals to explore whether sadness and disgust experiences evoked differential neural responses. We used a single-trial classification procedure on the pre-processed (see above) whole-scalp EEG of the 1000 ms interval after stimulus onset. All trials were combined over participants in order to obtain a single classification model for all participants instead of a separate model for each. We used a linear SVM classifier as implemented in the FieldTrip toolbox (Oostenveld et al., 2011). SVM is a typical classifier for decoding neuroimaging data as it is generally better at dealing with a large number of features than other classifiers (Grootswagers et al., 2017). The model made use of the time-domain data: the amplitudes of 1025 time points of 32 channels, resulting in 32,800 features. A 5-fold cross-validation procedure was adopted to prevent overfitting. In short, the data was randomly divided into 5 parts. Four parts were then used to train the classifier, and the fifth was used to test the resulting model. This procedure was repeated five times, so that each of the five parts functioned as test data once. Inequality of class sizes was compensated for through upsampling during training and downsampling during testing, as implemented in the FieldTrip toolbox (Oostenveld et al., 2011). Significance was determined by a chi-square analysis of the confusion matrix, using an alpha threshold of 0.05.
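The shape of this pipeline can be illustrated with a small scikit-learn sketch on synthetic data (the study itself used the MATLAB-based FieldTrip implementation). The data dimensions, the injected class-dependent pattern, and the parameter choices are all illustrative; stratified folds stand in here for the up/downsampling used in the actual analysis to balance class sizes.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real data: trials x channels x time points
# (far smaller than the 32 channels x 1025 time points used in the study).
rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 400, 32, 128
y = rng.integers(0, 2, size=n_trials)   # 0 = sadness, 1 = disgust (labels are illustrative)
X = rng.normal(size=(n_trials, n_channels, n_times))
X[y == 1, :4, 20:40] += 0.5             # weak class-dependent pattern in a few channels

# One feature per channel-time sample, as in whole-scalp time-domain decoding.
X_flat = X.reshape(n_trials, -1)

# 5-fold cross-validation: train on four parts, test on the held-out fifth.
accuracies = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X_flat, y):
    clf = LinearSVC(C=1.0, max_iter=10000)
    clf.fit(X_flat[train_idx], y[train_idx])
    accuracies.append(clf.score(X_flat[test_idx], y[test_idx]))

mean_acc = float(np.mean(accuracies))
```

Reporting the mean accuracy over held-out folds, rather than training accuracy, is what guards against the overfitting risk that a 32,800-feature model would otherwise pose.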
The trials of all participants were combined per emotion condition, resulting in 1170 disgust trials and 1221 sadness trials. Classes were resampled to ensure equal size by adding 40/41 additional disgust samples during training, and by removing 10/11 sadness samples during testing in each of the 5 folds. Classification was 58 % accurate, which was significantly better (χ²(1) = 65.01, p < .001) than chance (50 %). The contingency table revealed that disgust and sadness were classified approximately equally well, with 675 and 690 trials classified correctly, and 495 and 480 trials classified incorrectly, respectively.
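The reported test statistic follows directly from these contingency counts via a chi-square goodness-of-fit against the 50 % chance level; a Python sketch of that arithmetic:

```python
import numpy as np
from scipy.stats import chisquare

# Counts from the contingency table: correctly / incorrectly classified trials.
correct_disgust, incorrect_disgust = 675, 495
correct_sadness, incorrect_sadness = 690, 480

n_correct = correct_disgust + correct_sadness        # 1365
n_incorrect = incorrect_disgust + incorrect_sadness  # 975
n_total = n_correct + n_incorrect                    # 2340 test trials after downsampling

accuracy = n_correct / n_total                       # ~0.583, i.e. 58 %

# Goodness-of-fit of the correct/incorrect split against the 50 % chance level.
chi2, p = chisquare([n_correct, n_incorrect], f_exp=[n_total / 2, n_total / 2])
```

With these counts the statistic evaluates to 65.0 with one degree of freedom, matching the reported χ²(1) = 65.01 up to rounding.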
To gain a better understanding of the classification model, we explored the weights of the features, averaged over the five folds. These weights are depicted in Fig. 3A in a channel × time grid. The topographical distribution over the latency intervals with the largest classification weights (Fig. 3D; 108-135, 205-221, 252-263, 285-288, and 385-387 ms after stimulus onset) shows that the strongest features all lie in the early range, before 400 ms, in the occipital (Oz and O2) and frontal (Fp2) channels. Averaging the absolute weights over time shows that channels Oz, O2, and Fp2 contributed most to the classification, as did channel P8 (Fig. 3B). Averaging the absolute weights over channels shows that early time intervals contributed most to the classification process (Fig. 3C).
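The channel and time summaries in Fig. 3B and 3C correspond to simple reductions of the fold-averaged weight array. A sketch with a hypothetical random weight array of the study's dimensions (the real weights come from the trained SVM):

```python
import numpy as np

# Hypothetical stand-in for the learned SVM weights: folds x channels x time points.
rng = np.random.default_rng(0)
weights = rng.normal(size=(5, 32, 1025))

fold_mean = weights.mean(axis=0)               # average the weights over the five folds
per_channel = np.abs(fold_mean).mean(axis=1)   # mean |weight| per channel (cf. Fig. 3B)
per_time = np.abs(fold_mean).mean(axis=0)      # mean |weight| per time point (cf. Fig. 3C)
```

Taking absolute values before averaging matters: positive and negative weights both indicate discriminative features and would otherwise cancel out.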

Discussion
In this study we explored the difference in electrophysiological responses to stimuli that induce the experience of sadness and disgust which were matched for valence and arousal. Further, we explored whether a whole-scalp SVM classification could successfully classify the EEG responses to those stimuli into distinct emotion categories.
ERPs to sadness-eliciting images differed from those to disgust-eliciting images in the N1 and EPN time intervals. Specifically, both ERPs were larger for disgust compared to sadness. Our classification procedure was able to correctly label disgust and sadness, significantly better than chance, with an accuracy of 58 %.

ERPs for sadness are different from those for disgust
All emotion related ERPs that we expected to observe (N1, P2, N2, EPN, and LPP) were clearly present in the average waveform of both the sadness and the disgust experiences. ERP amplitudes for the N1 and EPN components were significantly larger for the disgust compared to the sadness experiences. Even after controlling for differences in perceptual features of the images and the normative valence and arousal ratings, emotion category was still a good predictor of the EPN and N1 responses. Moreover, the topography of the prediction by emotion category closely matched the topography of the ERP differences between disgust and sadness for the EPN and N1 components. Taken together, these results provide solid evidence that the EPN and N1 are sensitive to emotion categories.
Larger N1 and EPN components have also been found in response to valenced stimuli compared to neutral ones in previous studies (Hajcak et al., 2011). However, the current findings cannot be attributed to differences in valence or arousal, since our stimulus sets were carefully selected to be indistinguishable in normative valence and arousal, and we controlled for the remaining variance in valence and arousal. The same holds for the comparable studies discussed here, which either statistically controlled for such differences (Carretié et al., 2011; Wheaton et al., 2013) or used stimuli that were similar in valence between emotion categories (Hot and Sequeira, 2013).
Disgusting images have also been found to trigger larger EPN responses compared to threatening images (Wheaton et al., 2013). Together with the current finding of larger EPN responses to the experience of disgust compared to sadness, this suggests that the EPN component is sensitive to discrete emotions.
Initial results from our cluster analysis suggested that besides the EPN and N1, the P2 component was also larger for disgust experiences compared to sadness experiences. However, after controlling for perceptual features, and the normative valence and arousal ratings of the images, emotion category no longer predicted the P2 component. The topography of the prediction coefficients of the P2 component did not resemble the topography of the cluster of significant differences. Emotion category predicted the P2 amplitude in only one electrode, and this was not part of the initial P2 cluster. Taken together we conclude that the P2 does not appear to be differentially sensitive to experiences of disgust and sadness.
However, others have found differential P2 responses to emotion categories. For instance, Carretié et al. (2011) found larger P2 responses to disgust-evoking compared to fear-evoking or neutral distractors.
Since the P2 is considered to modulate with attention (Luck and Hillyard, 1994;Luck et al., 2000), and Carretié et al. (2011) employed an attention related task, it is possible that their result reflects an attentional component that is not present in a passive viewing paradigm.
Hot and Sequeira (2013) also found differential EEG patterns of sadness versus disgust experiences in the 160-200 ms interval, which partially overlaps with the P2 time interval of the current study (180-230 ms). However, the ERP plot included in their paper (Fig. 1, lower panel in Hot and Sequeira, 2013) suggests a stronger negative going response for sadness compared to disgust at parietal sites, indicating that this is not a P2 component. A difference in experimental design may have caused the different results. Hot and Sequeira (2013) showed each image twice to boost the number of trials. Since repetition of stimuli has been shown to impact P2 amplitudes (Curran and Dien, 2003; Ferrari, 2017), this might have obscured the affect-related responses. We showed each image only once to prevent this issue.
Not all ERP components that we explored showed significant differences between disgust and sadness. No clusters containing significant differences between conditions were found in the N2 time range and topography.
The effect of emotion category on the LPP component was also not significant. Since the difference in EEG amplitude changed direction from the early LPP interval (disgust > sadness) to the late LPP interval (sadness > disgust), we explored the two intervals separately. This revealed that the difference in the early interval was not significant, while the difference in the later interval bordered on significance (p = .055). Altogether, taking our large sample size into account, there is no evidence that the LPP differentiates between sadness and disgust. Future studies into neural responses to discrete emotions should explore early and late LPP effects separately.
Other studies into the electrophysiological responses to discrete emotions found that N2 and LPP did not differentiate between sadness, happiness, or disgust (Hot and Sequeira, 2013) or between threatening or disgusting pictures (Wheaton et al., 2013). Based on these previous and our current findings, it cannot be concluded that the N2 and LPP components are sensitive to discrete emotion categories.

Classification of ERP responses to sadness and disgust are accurate above chance level
In addition to the conventional ERP analyses, we also performed a state-of-the-art single-trial linear SVM classification on the whole-scalp EEG of the full second interval after stimulus onset. This method moves beyond comparing amplitudes of a single ERP between conditions; instead, it compares patterns of EEG responses. Further, we moved beyond the typical classification procedure, in which classification is based on a few features of the EEG pattern (e.g. the amplitudes of a few ERP components). While the typical approach allows for comparison with the existing literature, it also limits findings to what is already known. We therefore classified based on the EEG signal of the full 1-second time interval across all recorded electrodes, in order to utilize the full potential of the classification procedure and to allow for discriminative aspects that previous, more conventional analyses may not yet have uncovered.
We classified sadness and disgust experiences with an accuracy of 58 %, which is well above chance level. Importantly, classification was based on all trials from all participants combined. Consequently, the resulting classification model is not specific to any individual participant but constitutes a single model that applies across all participants, enhancing the generalizability of our findings.
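As a schematic illustration of this kind of procedure (not the authors' actual pipeline), a linear SVM can be trained on each trial's flattened channels × time waveform and evaluated with cross-validation. The data below are synthetic stand-ins, and all dimensions and labels are assumptions for the sketch:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for single-trial EEG: 200 trials,
# 32 channels x 128 time samples per trial, flattened per trial
rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 200, 32, 128
X = rng.standard_normal((n_trials, n_channels * n_samples))
y = rng.integers(0, 2, n_trials)  # 0 = sadness, 1 = disgust (arbitrary labels)

# Linear SVM on the whole flattened waveform; accuracy estimated
# with 5-fold cross-validation
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, cv=5)
print(round(scores.mean(), 2))
```

With purely random data, cross-validated accuracy hovers around chance (0.5); on real, structured trials it can rise above chance, as in the result reported here.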
In judging this level of accuracy, we should consider the following. First, the averaged ERPs show a modest difference between emotions, and emotion category uniquely explained only a modest amount of the variance in the EPN and N1 components. Additionally, we classified single-trial EEG segments, which are driven by many factors beyond emotion category. Taken together, a classification accuracy that is more than modest should not be expected.
Inspection of the classifier model showed that the features (EEG amplitudes at a given channel and time point) that contributed most strongly to the classification occurred at early latencies (<350 ms), indicating that disgust and sadness elicit rapidly diverging neural responses. The topography of the feature weights showed that right-frontal and right-occipital/parietal sites contributed most to the classification model.
The determining features of the classification model do not always align with well-known ERP latencies and topographies. This underlines the importance of exploring neural responses beyond the typical ERPs and instead examining the response as a whole: latencies and topographies that would be excluded in a typical ERP analysis appear to contain differentiating features that would otherwise go unnoticed.
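Such inspection of a linear classifier is possible because its weight vector has one entry per feature. A minimal sketch on synthetic data (dimensions and variable names are assumptions, not the authors' code):

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in: 200 trials of flattened 32-channel x 128-sample EEG
rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 200, 32, 128
X = rng.standard_normal((n_trials, n_channels * n_samples))
y = rng.integers(0, 2, n_trials)

clf = SVC(kernel="linear").fit(X, y)
# A linear SVM assigns one weight per feature, so the weight vector can be
# folded back into a channels x time map and read like an ERP topography
weights = clf.coef_.reshape(n_channels, n_samples)
peak_channel, peak_time = np.unravel_index(np.abs(weights).argmax(), weights.shape)
```

The channel and latency of the largest absolute weights indicate where and when the classifier finds the conditions most separable.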
Previous classification studies had already shown that discrete emotions can be successfully classified based on fMRI/PET activation patterns (Kragel and LaBar, 2016), yet such studies typically did not control for differences in valence and arousal. An EEG study that successfully classified discrete emotions based on the power of frequency bands (Zhao et al., 2018) used only a single video per condition; it therefore cannot be determined whether the differences resulted from the evoked emotions or from some other property of the videos.

Importance of a systematic approach in exploring neural representations of emotions
Our results show that disgust and sadness have distinct EEG responses that are not related to differences in valence and arousal, further consolidating the notion that discrete emotions evoke distinct neural responses (Carretié et al., 2011; Hot and Sequeira, 2013; Zhao et al., 2018; Wheaton et al., 2013). These findings imply that a more systematic approach in affective neuroscience is needed to gain a better understanding of the neural responses to experiences of emotions. Studies that contrast valence or arousal levels should take care to separate those effects from effects of distinct emotion categories. Likewise, studies that contrast discrete emotions should either match or control for valence and arousal levels in order to prevent conflated effects. The relevance of this was recently shown by Lu et al. (2016), who concluded that neural responses to fear and disgust are affected by differences in valence within those categories.
Most studies of neural responses to affective experiences have focused either on the discrete or the dimensional framework of emotions, manipulating either discrete emotions while ignoring valence levels, or vice versa. Differences in the ignored framework would have affected the measured neural responses. Previous findings should therefore be replicated in studies that systematically separate neural responses to valence or arousal from those to discrete emotions. This should not only lead to more robust response patterns with less variation, but also provide a more fine-grained picture of neural responses to affective experiences.
In this study we showed that the experiences of disgust and sadness trigger differential neural responses. Though initial attempts at contrasting other discrete emotions while controlling for valence and arousal have been made by others (Carretié et al., 2011; Hot and Sequeira, 2013; Wheaton et al., 2013), more systematic studies are needed to robustly determine which discrete emotions differ in their neural patterns, and how. The current study only contrasted sadness with disgust, and used only IAPS images to evoke these emotions. Further research is needed to determine whether other emotions also have distinguishable neural response patterns, and whether the response patterns found here hold for different elicitation paradigms.
The current study used normative valence and arousal ratings of IAPS images. We chose not to have participants rate the stimuli during the experiment, as this secondary task would, in our opinion, strongly affect the natural experience of emotions. We also did not use a 'second viewing' approach, in which participants view all stimuli again after the EEG experiment in order to provide ratings. Such second viewings lack validity (the emotion experienced while viewing a picture a second time, shortly after the first, may be quite different from the first viewing), and in any case we do not consider them better or more valid than the normative IAPS ratings, which are based on thousands of participants from all over the world. When ratings are reported for different age groups or cultures, these are comparable to the normative ratings either in absolute terms or in pattern (e.g. Libkuman et al., 2007). We have no reason to believe that our sample would deviate substantially from the norm. Finally, the use of normative IAPS ratings is not uncommon (e.g. Hajcak and Nieuwenhuis, 2006).
As noted, the current empirical work provides neither support for nor arguments against particular emotion theories; differences in the neural responses to different categories of emotions do not prioritize one theory over another. Nor do we claim the existence of neural fingerprints for each emotion category. We do, however, suggest that there seem to be detectable differences in the neural responses to different emotion categories that cannot be attributed to differences in valence or arousal, warranting further exploration. A fine-grained mapping of neural responses to discrete emotions would be of great value to both fundamental research and applied settings.
In summary, our results demonstrate that discrete emotions elicit distinguishable neural response patterns. We argue that studies exploring neural responses to affective experiences should adopt a systematic approach in carefully separating effects of valence, arousal, and emotion categories.

Participants
Eighty healthy participants (22 males; mean age = 40.8, SD = 7.3, range = 23-56 years) took part in the study after giving written informed consent. Participants were recruited via a recruitment agency and convenience sampling, and received a monetary reward of 50 euro. Two participants did not complete the experiment. Datasets were excluded from analysis if they failed to meet the predetermined criterion of having at least 70 % of all 120 initial trials remaining after ocular and artefact rejection; all available trials were used for this criterion to determine overall signal quality. Nine participants were excluded in this step, mostly due to blinks coinciding with stimulus onset. The remaining 69 participants (20 males; mean age = 40.5, SD = 6.6, range = 27-52 years) were included in the analysis.
All experimental procedures were approved by the Ethics Review Board of the School of Social and Behavioral Sciences of Tilburg University (EC-2016.48).

Stimuli
In total, 120 IAPS images (Lang et al., 2008) were presented full screen and in color. Half of these were negative valence images and half were positive valence images; positive valence images were presented to avoid habituation or saturation effects. Only the responses to the negative valence images were analyzed in this paper. Of these, we selected the images that were most distinct in evoking disgust versus sadness, resulting in 23 disgust- and 29 sadness-evoking images. We then selected a subset of 20 images in each category such that these subsets were minimally distinct in their normative valence and arousal ratings. Both selection procedures are described next.

Distinct disgust versus sadness ratings
To select a subset of images that were distinctly evoking either disgust or sadness, we collected ratings of how much each image evoked each of the four negative emotions: anger, disgust, fear, and sadness.
Ratings of 93 participants (17 male, 75 female, 1 undefined; mean age = 19.48, SD = 2.43, range = 16-29 years) were obtained through an online Qualtrics questionnaire (https://www.qualtrics.com). Sixty images were presented in random order. Each was presented for 2000 ms, preceded by a 1000 ms fixation cross, and followed by a screen with the question "How much did that image make you feel …" and 4 horizontal sliders labeled from top to bottom with: anger, disgust, fear, and sadness. Each slider was initially set at the leftmost position indicated with a rating of 0. Sliders could be moved independently on a continuous scale to a maximum of 100 on the rightmost position.
These images typically did not evoke a single, pure emotion, but rather a mix of several. We selected as most distinctly disgust (sadness) evoking those images that received on average the highest disgust (sadness) rating, with this rating being at least 1.5 times as high as that for each of the other emotions. This procedure resulted in 23 disgust images with a mean rating (SD) of 60.10 (4.18) compared to 7.99 (1.16), 7.94 (1.64), 5.15 (1.35) for anger, fear and sadness, respectively, and 29 sadness images with a mean rating (SD) of 49.18 (2.95) compared to 8.70 (1.35), 3.00 (0.50), 13.49 (1.82) for anger, disgust, and fear, respectively (Fig. 4).
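The 1.5-times selection rule can be expressed compactly. A minimal sketch (the function name and example ratings are hypothetical; the second example is deliberately non-distinct):

```python
def distinct_label(mean_ratings, factor=1.5):
    """Return the emotion whose mean rating is at least `factor` times as
    high as each of the others, or None if no emotion is that distinct."""
    top = max(mean_ratings, key=mean_ratings.get)
    others = [v for k, v in mean_ratings.items() if k != top]
    return top if all(mean_ratings[top] >= factor * v for v in others) else None

print(distinct_label({"anger": 7.99, "disgust": 60.10, "fear": 7.94, "sadness": 5.15}))  # disgust
print(distinct_label({"anger": 30.0, "disgust": 35.0, "fear": 10.0, "sadness": 20.0}))   # None
```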

Non-distinct valence and arousal
From these 23 most distinctly disgust- and 29 most distinctly sadness-evoking images, we selected 20 images of each through an iterative elimination process, so that the two conditions were no longer distinguishable by a linear support vector machine (SVM) classification procedure based on their normative valence and arousal ratings (Fig. 5; Appendix A, Table A1). Classification was performed in R (version 4.0.3; R Core Team, 2017) using the e1071 package (Dimitriadou et al., 2010) in RStudio (RStudio Team, 2019), following the procedure outlined in James et al. (2013).
In each iteration, the linear model with the best fit to the current data was determined through SVM. The datapoint(s) with the greatest distance from the decision boundary were removed, and the procedure was repeated on the smaller dataset. Because we started with unequal class sizes, only one sadness datapoint was removed per iteration until class sizes were equal; in subsequent iterations, one datapoint of each class was removed. This process was repeated until group sizes were equal, accuracy fell below 60 %, and the Chi-square test of the prediction's contingency table yielded a p-value of 0.5, resulting in 20 disgust and 20 sadness images that were not distinguishable by linear SVM based on their normative valence and arousal ratings (see Appendix A, Table A1 for an overview of the IAPS numbers shown in the experiment, the selection used to validate discrete emotion experiences, and the subset used for classification).
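The elimination loop can be sketched as follows. This is an illustrative reimplementation in Python (the authors used R/e1071), with a simplified stopping rule (no Chi-square check, one removal per iteration) and hypothetical, clearly separated (valence, arousal) points:

```python
import numpy as np
from sklearn.svm import SVC

def match_by_elimination(X, y, target=20, stop_acc=0.6):
    """Fit a linear SVM on the (valence, arousal) points, drop the point
    farthest from the decision boundary (from the larger class while class
    sizes differ), and stop once either class is down to `target` items or
    training accuracy falls below `stop_acc`."""
    X, y = np.asarray(X, float), np.asarray(y)
    keep = np.arange(len(y))
    while True:
        clf = SVC(kernel="linear").fit(X[keep], y[keep])
        sizes = np.bincount(y[keep], minlength=2)
        if sizes.min() <= target or clf.score(X[keep], y[keep]) < stop_acc:
            break
        d = np.abs(clf.decision_function(X[keep]))
        if sizes[0] != sizes[1]:  # restrict removal to the larger class
            d = np.where(y[keep] == int(np.argmax(sizes)), d, -np.inf)
        keep = np.delete(keep, int(np.argmax(d)))
    return keep

# Hypothetical normative ratings for 23 "disgust" and 29 "sadness" images
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2.0, 5.8], 0.5, (23, 2)),
               rng.normal([3.2, 4.8], 0.5, (29, 2))])
y = np.array([0] * 23 + [1] * 29)
kept = match_by_elimination(X, y, target=20)
```

Removing the most separable datapoints each round forces the remaining sets toward the overlap region of the valence/arousal space, which is exactly what makes them indistinguishable to the classifier.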

Fig. 5.
Distribution of the normative valence and arousal ratings for the selected disgust and sadness evoking images. Note: Violin plots depict the distribution of the normative valence (left) and arousal (right) ratings of the selection of disgust eliciting (blue) and sadness eliciting (orange) IAPS images as reported by Lang et al. (2008). Within the violin plots, boxplots are shown with the box indicating the Q1 to Q3 range, the solid line within the box represents Q2 (the median) and the dashed line represents the mean. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Design
In a within-subjects design, each participant passively viewed all 120 images in random order. Only the 20 disgust and 20 sadness images selected through the above-described process were analyzed in this paper. Emotion category ("disgust" vs. "sadness") served as the independent factor. Images were presented for 1000 ms each, preceded by a fixation cross (1000 ms) and followed by a blank screen (2000 ms).
For the ERP analysis, the dependent variables were the mean EEG amplitudes of emotion-relevant ERPs, averaged over their respective time intervals and channels. For the classification procedure, the entire EEG waveform from stimulus onset until 1000 ms after stimulus onset was used.

Procedure
The current study was one of three unrelated EEG/fEMG experiments in a single session. Before EEG preparations, the participant read information regarding all three experiments and gave informed consent in accordance with the Declaration of Helsinki. Preparation of the participant with the EEG and fEMG equipment took approximately 15 min, after which the participant was seated in a dimly lit, sound-attenuating cabin, approximately 60 cm from the computer screen. The entire session lasted approximately one hour. Halfway through the experiment, a self-paced short break was offered. Viewing all 120 images took approximately 9 min.

Physiological measurements
EEG was recorded with a BioSemi setup using 32 Ag-AgCl ActiveTwo electrodes in an extended 10/20 layout (Chatrian et al., 1985) at a sampling rate of 1024 Hz. Impedance was maintained below 5 kΩ during recording. SignaGel (Parker Laboratories, Inc) was used to facilitate conduction. For offline re-referencing, we applied two electrodes behind the ears, on the mastoids. For detection of eye blinks and eye movements, we applied one electrode above, and one below the right eye, and an electrode next to each of the outer canthi of the eyes. Additionally, we measured electrical activity from two facial muscles (fEMG): the Zygomaticus Major and the Corrugator Supercilii. fEMG data was not analyzed for this paper.

Preprocessing
The raw EEG data was preprocessed in Brain Vision Analyzer 2.1.2 (Brain Products GmbH). First, the data was re-referenced to the mean of the mastoid channels. Slow drift and high frequency noise were filtered out using Butterworth filters (0.1 Hz high-pass and 100 Hz low-pass, 24 dB/octave roll-off). Channels with overall poor signal quality were reconstructed by fourth order spherical splines interpolation, with a limit of 3 reconstructed channels (10 %) per participant; on average 0.2 channels were reconstructed per participant. Next, artefacts caused by eye blinks and eye movements were corrected using Independent Component Analysis (ICA). Prior to ICA, major artifacts were manually marked as bad segments on individual channels. The remaining data (M = 474 s, range = 354-536 s) was used for the ICA procedure. Ocular components were manually identified and excluded in the inverse ICA process. The data was then segmented from 700 ms before until 1700 ms after stimulus onset based on stimulus markers. These segments were baseline corrected using the 200 ms interval before stimulus onset.
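The filtering, segmentation, and baseline steps were performed in Brain Vision Analyzer. Purely as an illustration of those steps, a rough SciPy/NumPy sketch on synthetic data (all onsets and dimensions are assumptions) might look like:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

sfreq = 1024.0
rng = np.random.default_rng(0)
eeg = rng.standard_normal((32, int(sfreq * 30)))  # channels x samples, 30 s

# Band-pass roughly matching the paper's 0.1 Hz high-pass / 100 Hz low-pass
# Butterworth filters (note: sosfiltfilt is zero-phase and applies the
# filter forward and backward, so the effective roll-off is steeper than
# a single pass)
sos = butter(4, [0.1, 100.0], btype="bandpass", fs=sfreq, output="sos")
eeg = sosfiltfilt(sos, eeg, axis=1)

# Segment from -700 ms to +1700 ms around (hypothetical) stimulus onsets,
# then baseline-correct on the 200 ms before onset
onsets = (np.array([2, 6, 10, 14, 18, 22]) * sfreq).astype(int)
pre, post, base = int(0.7 * sfreq), int(1.7 * sfreq), int(0.2 * sfreq)
segments = np.stack([eeg[:, o - pre:o + post] for o in onsets])
segments -= segments[:, :, pre - base:pre].mean(axis=2, keepdims=True)
```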
Trials in which the participant likely missed stimulus onset due to blinks or eye movements were rejected through semi-automatic inspection. First, trials in which the amplitude of the uncorrected bipolar horizontal and vertical eye channels exceeded − 100 µV or 100 µV in the interval from 200 ms before until 200 ms after stimulus onset were automatically detected and marked. These segments were then visually inspected and rejected if blinks or eye movements were apparent in the interval from 100 ms before until 100 ms after stimulus onset.
Finally, trials with artifacts were rejected through semi-automatic inspection. Artifacts were automatically detected and marked on each of the 32 EEG channels where a voltage step exceeded 50 µV/ms, the voltage difference within a 1000 ms interval exceeded 200 µV, the voltage fell below −150 µV or rose above 150 µV, or a 100 ms interval contained less than 0.5 µV of activity. We then visually inspected these marked segments and rejected trials if the impact of the artifact required exclusion. Participants who had more than 30 % of their trials rejected were excluded from further analysis. The remaining 69 participants on average had 16.96 disgust trials (minimum 12) and 17.70 sadness trials (minimum 12) left of the initial 20, amounting to a total of 1170 disgust trials and 1221 sadness trials. Upon loading the data in FieldTrip (Oostenveld et al., 2011) for statistical analysis, the EEG was low-pass filtered at 30 Hz.

Fig. 6. Low-level features of the selected disgust and sadness evoking images. Note: Violin plots depict the distribution of the average luminance, contrast, skewness and kurtosis of the luminance, and the average position on the green-red and blue-yellow spectrum for the selected disgust eliciting (blue) and sadness eliciting (orange) IAPS images. Ranges of these values are as used in the CIELAB color space: [0 100] for luminance and contrast (completely dark/no contrast to completely bright/max contrast), and [-110 110] for the green-red and blue-yellow spectra (from fully green/blue to fully red/yellow). The ranges for skewness and kurtosis are arbitrary, with 0 indicating a normal distribution of luminance. Within the violin plots, boxplots are shown with the box indicating the Q1 to Q3 range; the solid line within the box represents Q2 (the median) and the dashed line represents the mean. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
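The four automatic artifact-detection criteria described in the preprocessing section can be expressed programmatically. A minimal sketch, assuming microvolt-scaled segments and hypothetical window strides (the actual detection ran in Brain Vision Analyzer):

```python
import numpy as np

def flag_artifacts(seg, sfreq=1024.0):
    """Return True if any channel of one trial (channels x samples, in
    microvolts) trips any of the four automatic criteria."""
    # 1) voltage step exceeding 50 uV/ms between consecutive samples
    if (np.abs(np.diff(seg, axis=1)) * (sfreq / 1000.0) > 50).any():
        return True
    # 2) absolute voltage beyond +/-150 uV
    if (np.abs(seg) > 150).any():
        return True
    # 3) voltage range within a 1000 ms window exceeding 200 uV
    win = int(sfreq)
    for start in range(0, seg.shape[1] - win + 1, win // 4):
        w = seg[:, start:start + win]
        if (w.max(axis=1) - w.min(axis=1) > 200).any():
            return True
    # 4) less than 0.5 uV activity within a 100 ms window (flat channel)
    win = int(0.1 * sfreq)
    for start in range(0, seg.shape[1] - win + 1, win // 2):
        w = seg[:, start:start + win]
        if (w.max(axis=1) - w.min(axis=1) < 0.5).any():
            return True
    return False
```

Ordinary EEG-scale noise passes all four checks, while a flat channel or a large single-sample spike is flagged.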

Table A1
Overview of all 120 images that were shown in the experiment, with indications of the selection process to obtain the 20 sadness and 20 disgust images that were used for the ERP analyses and classification.  Note. IAPS image numbers, normative valence and arousal rating as reported in the technical manual of the International Affective Picture System (IAPS; Lang et al., 1999). Crosses in the column "Negative" indicate which images were used in the questionnaire to assess the negative emotion labels (anger, disgust, fear, sadness) of each of these images. Column "Emotion" lists which images were distinctly rated as either disgust or sadness evoking. Column "Selection" indicates which of these disgust and sadness images were selected by the SVM classification procedure to form sets that were indistinguishable based on their normative valence and arousal scores.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.