On the use of pairwise distance learning for brain signal classification with limited observations

https://doi.org/10.1016/j.artmed.2020.101852

Highlights

  • Dedicated Siamese neural networks improve learnability from limited EEG recordings.

  • Cosine-loss measuring distance of spectral content is more robust to noisy EEG data.

  • Pairing schema and spectral processing handle spatiotemporal nature of EEG data.

  • Comprehensive models of discriminative brain patterning found from resting state EEG.

Abstract

The increasing access to brain signal data using electroencephalography creates new opportunities to study electrophysiological brain activity and perform ambulatory diagnoses of neurological disorders. This work proposes a pairwise distance learning approach for schizophrenia classification relying on the spectral properties of the signal. To handle clinical trials with a limited number of observations (i.e. case and/or control individuals), we propose a Siamese neural network architecture to learn a discriminative feature space from pairwise combinations of observations per channel. In this way, the multivariate order of the signal is used as a form of data augmentation, further supporting the network's generalization ability. Convolutional layers with parameters learned under a cosine contrastive loss are proposed to adequately explore spectral images derived from the brain signal. The proposed approach for schizophrenia diagnosis was tested on reference clinical trial data under a resting-state protocol, achieving 0.95 ± 0.05 accuracy, 0.98 ± 0.02 sensitivity and 0.92 ± 0.07 specificity. Results show that the features extracted using the proposed neural network are remarkably superior to baselines for diagnosing schizophrenia (+20pp in accuracy and sensitivity), suggesting the existence of non-trivial electrophysiological brain patterns able to capture discriminative neuroplasticity profiles among individuals. The code is available on GitHub: https://github.com/DCalhas/siamese_schizophrenia_eeg.

Introduction

The recording of increasingly affordable and precise electroencephalography (EEG) data is creating unprecedented opportunities to understand brain activity, aid personalized prognostics, and promote health through wearable biofeedback systems [1]. Electroencephalography is non-invasive, safe, inexpensive, and rich in temporal content, in contrast with other brain imaging modalities, such as magnetic resonance imaging, which entail higher costs and restrictions on the longitudinal periodicity of recordings [2]. EEG monitoring is widely used to assess psychiatric disorders, and has been shown to be a valuable source to study schizophrenia, a disorder that affects about 1% of the world population and is largely susceptible to misdiagnosis [3]. Since 2017, cases of individuals with schizophrenia able to regulate their brain activity using real-time EEG neurofeedback in therapeutic settings have been reported [4]. Comprehensive reviews of EEG-based studies of schizophrenia from case-control populations reveal general spectral deviations, including a predisposition for decreased alpha power and an increase of activity in the lower spectrum [5]. Slow wave abnormality (mainly delta activity) can be primarily localized in frontal lobe regions, and is suggested to be a relevant neurophysiological marker of schizophrenia [5]. Connectionist and information theoretic features to discriminate brain electrophysiology have been additionally proposed [6], [7]. Despite the inherent advantages of the spectral markers and proposed scores, their use for schizophrenia diagnosis still results in high false positive and false negative rates due to the extent of individual differences in the electrophysiological activity of the brain, irrespective of clinical condition. In particular, when considering resting-state protocols for EEG recordings (clinically deemed preferable in psychiatric settings to task-oriented and stimuli-induced settings [8]), state-of-the-art classifiers based on the aforementioned features generally show diagnostic accuracy rates below 70%.

The difficulty of EEG-based diagnostics of neuronal diseases is mainly driven by two major factors: the limited size of case-control populations [9], and the intrinsic difficulties of mining brain signals. Brain signal data is high-dimensional, multivariate, susceptible to noise/artifacts, rich in temporal-spatial-spectral content, and highly variable between individuals [10].

This work proposes a dedicated class of neural networks to extract discriminative features of schizophrenia from electrophysiological brain data prior to the classification step. The proposed approach combines principles from pairwise distance learning and spectral imaging in order to address the aforementioned challenges, enabling superior diagnostics. Accordingly, the proposed approach offers six major contributions:

  • 1. Ability to learn from small datasets by taking advantage of Siamese network layering, inherently prepared to work in augmented data spaces mapped from a limited number of observations. Specifically, our approach is suggested for databases with dozens to hundreds of EEG recordings [11]. The features produced by Siamese networks have been shown to be useful for classification as they rely on either the homologous or discriminative properties of observation-pairs in a pairwise distance domain [12];

  • 2. Ability to deal with the rich and complex spectral and temporal content of EEG data by processing the signal into spectral images with a fine frequency and temporal resolution per electrode [13], [14], and by subsequently reshaping the Siamese network architecture with adequate convolutional operations;

  • 3. Robustness to noise and wave-instability by assessing distances on the spectral content (frequency domain) under a cosine-loss. Gathered evidence shows less susceptibility to artifacts and to the inherent variability of electrophysiological potentials associated with continuously changing overlapping electrical fields produced by localized neurons [10];

  • 4. Ability to deal with the multivariate nature of the signal (rich spatial content) by capturing interdependencies between channels as their content is simultaneously used to shape the learned classifiers;

  • 5. Ability to handle the extremely high dimensional nature of the gathered spectral content from brain signals (high-resolution spectral image per electrode) under L1 regularization [15], [16];

  • 6. Applicability of the proposed EEG-based diagnostics to alternative populations or diseases, motivated by: (i) the Bayesian optimization step [17] used for hyperparameter tuning and for fixing the feature space dimension; (ii) the fully automated nature of the approach once signals are recorded; and (iii) the generalization ability of the learning process on validation data.
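The cosine-based loss named in the third contribution can be illustrated with a minimal sketch. The exact loss used by the authors is not given in this excerpt, so the code below assumes the standard contrastive-loss form with a cosine distance substituted for the Euclidean one; the function names and the `margin` value are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v): 0 for parallel vectors, 1 for orthogonal, 2 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumed convention: a zero vector is treated as orthogonal
    return 1.0 - dot / (norm_u * norm_v)


def cosine_contrastive_loss(u, v, same_class, margin=1.0):
    """Contrastive loss on a cosine distance (assumed form, hypothetical margin).

    Pairs of the same class are pulled together (loss grows with distance);
    pairs of different classes are pushed apart until they are `margin` away.
    """
    d = cosine_distance(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Because the cosine distance depends only on the angle between two embeddings, rescaling one of them leaves the loss unchanged, which is one plausible reading of the robustness to amplitude variability described above.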

In contrast with the currently established views on neural information processing systems, this manuscript explores whether we can go deep on highly dimensional spatiotemporal data in the presence of a very limited number of data observations. This stance is much-needed in healthcare given the limited size of trials (cohort studies), often driven by disease rarity, capped size of control population, trial eligibility requirements, or the facultative nature of EEG assessments. Experimental results confirm this possibility.

This work is validated on the clinical trial conducted by Gorbachevskaya and Borisov [11], a reference database for the resting-state analysis of schizophrenia. Further details can be found in Section 3.1. The proposed learning approach achieves 0.95 ± 0.05 accuracy and 0.98 ± 0.02 sensitivity on schizophrenia diagnostics, remarkably attaining an improvement of over 20pp against peer approaches.

The features extracted from the proposed spectral and pairwise distance space further suggest the presence of discriminative electrophysiological patterns linked to neuroplasticity aspects of the individuals. This observation is in accordance with findings from previous studies that established statistically significant relationships between variations in the frequency band spectrum and neuroplasticity conditions [18], [19].

The manuscript is organized as follows. After formalizing the problem, Section 2 surveys existing contributions to the diagnosis of individuals from brain signal data. Section 3 describes the proposed solution. Section 4 shows extended evidence of its relevance for diagnosing schizophrenia. Finally, concluding remarks are drawn in Section 5.

An EEG recording or brain signal observation is a multivariate time series $X = \{x_t^j \mid j \in \{1,\dots,M\},\ t \in \{1,\dots,T\}\}$, where $x_t^j$ is a measure of the electrophysiological activity in scalp channel $j$ at instant $t$, $T$ is the number of time points, and $M$ is the multivariate order (number of channels). Given a brain signal dataset $\{(X_i, c_i) \mid i = 1, \dots, N\}$, where $N$ is the number of EEG recordings and each recording $X_i$ is annotated with a label $c_i \in \Sigma$, our task is to identify a discriminative feature space to classify (unlabeled) observations. Specifically, we are interested in classifying schizophrenia given case-control populations.

The electrophysiological signal produced by a specific channel in the cerebral cortex is a univariate time series that can be decomposed into its frequency components using a discrete Fourier transform. The analysis of the frequency domain of a signal, generally referred to as spectral analysis, determines the predominant waves monitored at a certain location. A short-time discrete Fourier transform can alternatively be applied along a sliding window of the raw signal to capture potentially relevant changes in the spectral activity of the brain throughout the EEG recording. The spectral content produced by this time-varying form of spectral analysis is here informally referred to as a spectral image, since it measures brain activity along two contiguous axes: frequency and time.
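The construction of a spectral image for one channel can be sketched as follows. This is a dependency-free illustration rather than the authors' pipeline: it assumes a rectangular window, a naive discrete Fourier transform, and hypothetical names (`dft_magnitudes`, `spectral_image`) and parameters (a 128 Hz sampling rate, a one-second window, a half-second hop).

```python
import cmath
import math


def dft_magnitudes(window):
    """Magnitudes of the non-negative-frequency DFT bins of a real-valued window."""
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(window))
        mags.append(abs(acc))
    return mags


def spectral_image(signal, window_size, hop):
    """Short-time DFT of one channel: one row per window position.

    Returns a list indexed as image[time_step][frequency_bin].
    """
    rows = []
    for start in range(0, len(signal) - window_size + 1, hop):
        rows.append(dft_magnitudes(signal[start:start + window_size]))
    return rows


# Toy signal (assumed values): a 10 Hz sine sampled at 128 Hz for 2 seconds.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]

# With a one-second window, frequency bin k corresponds to k Hz.
image = spectral_image(signal, window_size=fs, hop=fs // 2)
peak_bins = [row.index(max(row)) for row in image]
```

In this toy setting every row of `image` peaks at the bin matching the 10 Hz component. In practice a tapered window and an FFT routine from a numerical library would normally replace the naive transform used here.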

Related work

Recent works on deep learning provide principles for learning from small datasets [20], [21], a critical requirement if we want to guarantee their applicability to most cohort studies available worldwide. The use of surrogate data analysis in the context of regression tasks [20], or data augmentation procedures for image recognition [22], are paradigmatic cases. Despite their relevance, they either tackle different tasks or assume a substantially higher amount of data observations than

Our approach

The proposed architecture is inspired by the one introduced by Koch et al. [12]. An advantage of this type of architecture is the ability to augment the original dataset from an instance-based data space to a pair-based one. Our approach has two main steps: (1) feature extraction; and (2) classification. In step 1, the internal representations obtained from the Siamese neural network (SNN) model are extracted after training. In step 2, a classification task is performed using these

Results

Given the recording setting introduced in Section 3.1, consider the following two sets of paired individuals:

  • hc_vs_scz – set of all pairs of non-neighbor individuals (healthy controls paired with schizophrenic);

  • hc_and_scz – set of all pairs of neighbor individuals (healthy controls paired with healthy controls and schizophrenic paired with schizophrenic).
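The two sets can be enumerated with a short sketch. The individual identifiers, labels and channel count below are made up for illustration, and pairing the same channel across two individuals is an assumption about how the per-channel augmentation works, not a detail confirmed in this excerpt.

```python
from itertools import combinations


def build_pairs(labels):
    """Split all unordered pairs of individuals into concordant and discordant sets.

    labels maps an individual id to "hc" (healthy control) or "scz".
    Returns (hc_and_scz, hc_vs_scz).
    """
    hc_and_scz, hc_vs_scz = [], []
    for a, b in combinations(sorted(labels), 2):
        if labels[a] == labels[b]:
            hc_and_scz.append((a, b))  # neighbor pair: same class
        else:
            hc_vs_scz.append((a, b))   # non-neighbor pair: different classes
    return hc_and_scz, hc_vs_scz


def per_channel_pairs(pairs, n_channels):
    """Replicate each pair once per channel (assumed augmentation scheme)."""
    return [(a, b, ch) for a, b in pairs for ch in range(n_channels)]


# Hypothetical cohort: 2 healthy controls and 3 individuals with schizophrenia.
labels = {"s1": "hc", "s2": "hc", "s3": "scz", "s4": "scz", "s5": "scz"}
hc_and_scz, hc_vs_scz = build_pairs(labels)
augmented = per_channel_pairs(hc_and_scz + hc_vs_scz, n_channels=4)
```

With N individuals there are N(N-1)/2 unordered pairs, so the five toy individuals yield 10 pairs (4 concordant, 6 discordant) and 40 training examples once each pair is replicated across 4 channels. This quadratic growth is what makes the pair-based space attractive for small cohorts.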

Fig. 5 shows the spectral differences using FFT between concordant pairs of individuals (hc_and_scz) and discordant pairs of

Conclusion

The rich nature of the electrophysiological data measured at the cerebral cortex makes deep learning a natural candidate to study disorders disrupting the normal brain activity. Nevertheless, the limited size of case-control populations, together with the inherent variability of the spectral content within and among individuals, has left the value of neural network approaches largely unexplored. This manuscript stresses the relevance of revisiting this problem, showing that adequately reshaped

Conflict of interest

The authors declare that there is no conflict of interest.

References (44)

  • O.S. Lih et al. Comprehensive electrocardiographic diagnosis based on deep learning. Artif Intell Med (2020)

  • B. Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta (1975)

  • S.H. Na et al. EEG in schizophrenic patients: mutual information analysis. Clin Neurophysiol (2002)

  • K. Bakhshi et al. The neuropathology of schizophrenia: a selective review of past studies and emerging themes in brain structure and cytoarchitecture. Neuroscience (2015)

  • J.K. Wynn et al. Evaluating visual neuroplasticity with EEG in schizophrenia outpatients. Schizophr Res (2019)

  • R. Zomorrodi et al. The association between cross-frequency coupling and neuroplasticity via paired associative stimulation: TMS-EEG study. Brain Stimul (2019)

  • A. Ataei et al. Brain activity estimation using EEG-only recordings calibrated with joint EEG-FMRI recordings using compressive sensing. 2019 13th international conference on sampling theory and applications (SampTA) (2019)

  • W. Nan et al. An exploratory study of intensive neurofeedback training for schizophrenia. Behav Neurol (2017)

  • Y. Zhang et al. Sleep spindle and slow wave abnormalities in schizophrenia and other psychotic disorders: recent findings and future directions. Schizophr Res (2019)

  • Z. Dvey-Aharon et al. Connectivity maps based analysis of EEG for the advanced diagnosis of schizophrenia attributes. PLoS One (2017)

  • F.M. Howells et al. Electroencephalographic delta/alpha frequency activity differentiates psychotic disorders: a study of schizophrenia, bipolar disorder and methamphetamine-induced psychotic disorder. Transl Psychiatry (2018)

  • K. Gorbachevskaya et al. EEG of healthy adolescents and adolescents with symptoms of schizophrenia (2002)