1 Introduction

Human brain signals have been investigated within the medical field since the beginning of the twentieth century, with the aim of diagnosing conditions such as epilepsy, spinal cord injuries, Alzheimer's, Parkinson's, schizophrenia, and stroke. They have also been employed for rehabilitative or assistive applications, thanks to the possibility of relating the occurrence of some of their known behaviors to the intentions of the subjects being recorded, thus allowing the design of brain–computer and brain–machine interfaces.

Despite such broad interest in medical applications, the use of brain signals for automatic people recognition has attracted the interest of the scientific community only recently. With respect to traditional biometrics such as fingerprint, face, or iris, traits like electrocardiography (ECG), electrodermal response (EDR), or blood volume pulse (BVP) offer the notable advantage of being significantly harder to covertly acquire or forge, and are thus suitable for applications requiring a high level of security. Actually, most of the research so far carried out on cognitive biometrics has focused on the analysis of brain signals, more specifically those acquired through electroencephalography (EEG) [4]. Such a methodology in fact guarantees the characterization of brain activity with a high temporal resolution, achievable also when using portable and relatively inexpensive recording devices.

In more detail, EEG signals are obtained as a measure of the differences in electrical voltage between specific positions on the scalp surface, sensed by metal electrodes. Such activity is the result of the electrical field generated in the brain by the synchronous firing of specific spatially aligned neurons residing in the cortex, namely the pyramidal neurons. Typically, different EEG behaviors can be observed depending on the adopted acquisition protocol, such as resting state in either eyes-closed (EC) or eyes-open (EO) conditions, exposure to visual or auditory stimuli, or the execution of real or imagined tasks such as body movements or speech.

Most of the aforementioned recording scenarios have already been exploited to design EEG-based biometric recognition systems. In order to do so, discriminative characteristics are typically extracted from the considered recordings, trying to identify properties that remain stable over time for the same individual, while being significantly different across distinct persons. However, a major issue for EEG biometrics is the presence of a significant amount of noise in the acquired data [9]. In fact, even though signal amplifiers with high sensitivity and high noise rejection are commonly used to measure the voltage fluctuations on the scalp surface, the EEG acquisition process is inevitably sensitive to both endogenous and exogenous noise, generated respectively by other physiological processes, such as electromyography (EMG) activity, and by external sources, such as electrode impedance or interference from electric devices [27]. Such undesired contributions, referred to as artifacts, are not related to the brain activity under analysis and therefore significantly hinder the extraction of stable characteristics from the EEG data of a given subject, thus affecting the recognition performance achievable by systems adopting EEG as a biometric trait.

While artifact removal is commonly performed through visual inspection by an expert when treating EEG signals for biomedical [21] and clinical [11] purposes, such an approach is unrealistic when considering EEG data for biometric recognition, where the capability of automatically matching two samples of the considered trait is typically required.

The present paper therefore addresses the need for tools able to automatically discard undesired artifacts from recorded EEG data in a biometric recognition scenario, with the aim of improving the achievable recognition rates. Two distinct acquisition protocols are analyzed here, considering brain signals captured in both EC and EO resting states. The experimental tests evaluating the effectiveness of the proposed approaches have been conducted on a large database comprising EEG recordings taken from 50 healthy subjects during three distinct sessions spanning a period of about one month.

More specifically, the paper is organized as follows: Section 2 describes the current state of the art on EEG preprocessing, with specific attention to the approaches so far adopted when employing brain signals for biometric purposes. Section 3 then introduces the strategies considered here for treating EEG data before extracting discriminative features from them. The templates adopted for representing the considered EEG biometrics, and the methods exploited for matching them, are detailed in Sect. 4. Section 5 presents the experimental tests performed to verify the effectiveness of the proposed approaches, while conclusions are finally drawn in Sect. 6.

2 EEG signal preprocessing

Keeping the number of artifacts occurring in an EEG recording low is critical for obtaining data from which valuable information can be extracted. Therefore, several approaches have so far been adopted for denoising EEG data, especially in clinical scenarios, where limiting the influence of artifacts on diagnoses is of paramount importance. A commonly adopted procedure consists in completely discarding signal portions, named epochs, trials, or frames, which contain notable deviations from the expected EEG behavior in terms of observed frequencies or amplitude values. However, such an approach is feasible only when enough data are available. Furthermore, the selection of noisy trials is commonly performed by visual inspection, making it a subjective and time-consuming task [7], although some automatic objective criteria [33] have been proposed in the medical literature.

If the aforementioned approach is not adequate or admissible, proper signal processing techniques have to be adopted to remove artifacts. Filtering is, for instance, one of the simplest and most commonly used methods to improve the quality of the recorded signals: band-stop filters are normally adopted to attenuate the 50 Hz/60 Hz spurious power supply component, while band-pass filters allow focusing the performed analysis on specific frequency bands relevant for the considered application [24] (a minimal filtering sketch is given after the list below). In this regard, it is worth specifying that EEG signals are commonly interpreted as composed of five main brain rhythms with different peculiarities [1]:

  • Delta (\(\delta \), [0.5, 4] Hz) waves are predominant during the so-called deep or slow wave sleep (SWS), show relatively large amplitudes, and are related to subjects’ attention to internal processing;

  • Theta (\(\theta \), [4, 8] Hz) band power increases in response to memory demands, selectively reflecting the successful encoding of new information;

  • Alpha (\(\alpha \), [8, 14] Hz) is the most relevant activity in normal subjects during rest and attenuates with eyes opening or mental exertion;

  • Beta (\(\beta \), [14, 30] Hz) activity is characteristic for the states of increased alertness and focused attention;

  • Gamma (\(\gamma \), over 30 Hz) components are difficult to record through electrodes due to the low-pass filter nature of the scalp, with frequencies usually not exceeding 45 Hz when dealing with resting-state acquisition protocols.
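
As an illustration of the filtering operations mentioned above, the following minimal sketch (in Python, using SciPy) applies a notch filter at the mains frequency and a band-pass filter isolating a single rhythm. The sampling rate, filter orders, and the choice of the \(\alpha \) band are illustrative assumptions, not prescriptions from this work.

```python
import numpy as np
from scipy import signal

fs = 256.0  # assumed sampling rate in Hz (device dependent)

def remove_powerline(x, fs, mains_freq=50.0, quality=30.0):
    """Attenuate the 50 Hz (or 60 Hz) power-supply interference with a band-stop (notch) filter."""
    b, a = signal.iirnotch(mains_freq, quality, fs)
    return signal.filtfilt(b, a, x)

def bandpass_rhythm(x, fs, low, high, order=4):
    """Focus the analysis on a single EEG rhythm (e.g., alpha: 8-14 Hz) with a Butterworth band-pass."""
    sos = signal.butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)

# Example: clean one channel and retain only the alpha band
raw = np.random.randn(int(10 * fs))    # placeholder for a 10 s single-channel recording
clean = remove_powerline(raw, fs)      # band-stop at the mains frequency
alpha = bandpass_rhythm(clean, fs, 8.0, 14.0)
```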

Adaptive filtering [12] is also adopted when a reference sample or model of the artifacts to be removed is available, for instance in terms of autoregressive–moving-average (ARMA) models or Kalman filters [29]. Furthermore, signal decomposition techniques, such as those based on blind source separation (BSS) [32], including independent component analysis (ICA) [30], or on empirical mode decomposition (EMD) [10], are among the most effective approaches so far employed for EEG denoising. However, even when adopting these latter methods, a human expert is usually responsible, in a clinical scenario, for determining which components of the decomposed EEG signal have to be discarded, with the possibility that two persons may take different decisions when analyzing the same data [8].

Focusing on EEG analysis for biometric recognition, it is worth remarking that very little attention has so far been dedicated to the relevance of preprocessing when discriminating people based on their brain signals. Restricting our survey to the papers which have evaluated the stability of EEG traits across time [20], raw EEG potentials from nine subjects have been spatially filtered by means of a surface Laplacian (SL) operator in [22], trying to better represent the cortical activity due only to local sources below the electrodes, and to increase the signal-to-noise ratio of the considered signals. EEG data from 20 users have been divided into 5 s epochs and manually inspected in [23] to remove large muscle or eye movement contributions, identified when the underlying EEG rhythms were not clearly visible. ICA decomposition has also been employed to manually identify components which could be associated with the same kind of artifacts. Frequencies outside the [4, 25] Hz range have been discarded from the EEG signals of six persons in [2] since they may contain EMG artifacts. A detrend operator has then been applied to remove baseline drift, and any trial containing signal amplitudes exceeding \(\pm 100\,{\upmu }\hbox {V}\) has been discarded due to the possibility of being contaminated by eye blink artifacts. A Laplacian spatial filter has been applied to the EEG acquisitions of nine users in [15] as the sole employed preprocessing, while a common average referencing (CAR) filter has been used in [17] to reduce artifacts related to inappropriate reference choices in monopolar recordings taken from nine subjects. Features based on the total spectral power, the maximum power value, and the frequency at which this maximum occurs, all evaluated over the \(\alpha \) band, have been extracted from single-electrode EEG signals captured in EC conditions from four users in [18]. Outlier data have been reduced in [25] through winsorizing as suggested in [13], using percentile values of 10 and 90 to cap the desired minimum and maximum amplitude values for each channel, after a sixth-order Butterworth infinite impulse response (IIR) filter has been used to band-pass filter, from 1 to 8 Hz, the P300 event-related potentials (ERPs) collected from three subjects. Finally, no artifact rejection at all has been performed on the EEG data of 15 people in [28], where it is argued that none of the so far proposed artifact removal techniques can positively impact classifier accuracy, while they only add computational time and complexity to the overall recognition procedure.

Besides noticing that all the aforementioned analyses have been carried out on small sets of users, it must also be remarked that none of these works provides a comparison between the performance achievable with raw and with preprocessed EEG signals, making it therefore impossible to evaluate the effectiveness of the employed strategies in improving the achievable recognition rates. The present paper therefore represents the first attempt at evaluating the effectiveness of preprocessing techniques on the ability of recognizing people through their EEG signals, while also performing such analysis on a large database of EEG recordings acquired over different sessions for each user.

3 Proposed artifact removal strategies

The strategies here adopted for preprocessing EEG signals before feature extraction are described in this section.

Assuming that an EEG acquisition is represented through M signals \(\varvec{g}^{\left( m \right) }\), with \(m=1,\ldots ,M\), each giving the voltage difference between the reference electrode and the mth channel, the first performed step consists in applying the CAR filter to the considered recordings, thus generating \({\varvec{s}}^{\left( m \right) }={\varvec{g}}^{\left( m \right) }-\frac{1}{M}\mathop {\sum }\limits _{m'=1}^{M} {\varvec{g}}^{\left( m' \right) }\).
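
A minimal sketch of this CAR step, assuming the M recorded signals are stored as the rows of an \(M\times N\) array, could read as follows.

```python
import numpy as np

def common_average_reference(g):
    """Common average referencing (CAR): subtract, at every time instant, the mean
    over the M channels from each channel. g is an M x N array whose m-th row is g^(m);
    the m-th row of the returned array corresponds to s^(m)."""
    return g - g.mean(axis=0, keepdims=True)
```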

The available data are then band-pass-filtered in order to isolate the EEG subbands whose discriminative properties have to be analyzed. Specifically, since the \(\theta -\beta = [4, 30]\,\hbox {Hz}\) range has been shown in [20] to contain most of the discriminative information of EEG signals, only such frequencies are retained in the considered recognition systems. The so-obtained signals are then segmented into E consecutive epochs, each lasting 5 s with a 40 % overlap with the previous one, and indicated as \({\varvec{s}}_e^{\left( m \right) }\), \(e=1,\ldots ,E\). The aforementioned processing is always applied to the EEG data considered in the tests detailed in Sect. 5. The methods described in the following sections are instead applied to each segmented epoch when it is required to discard as much noise as possible from the available EEG data.
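
The band-pass filtering to the [4, 30] Hz range and the segmentation into 5 s epochs with 40 % overlap described above can be sketched as follows; the Butterworth design and the zero-phase filtering are assumptions, as the text does not specify the filter used.

```python
import numpy as np
from scipy import signal

def preprocess_and_epoch(s, fs, band=(4.0, 30.0), epoch_dur=5.0, overlap=0.4):
    """Band-pass the CAR-referenced signals s (M x N array) to the theta-beta range
    and split them into consecutive 5 s epochs with 40% overlap."""
    sos = signal.butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = signal.sosfiltfilt(sos, s, axis=1)

    epoch_len = int(epoch_dur * fs)           # samples per epoch
    step = int(epoch_len * (1.0 - overlap))   # hop size yielding the 40% overlap
    starts = range(0, filtered.shape[1] - epoch_len + 1, step)
    # epochs[e] is the M x epoch_len matrix whose m-th row is s_e^(m)
    return np.stack([filtered[:, t:t + epoch_len] for t in starts])
```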

3.1 Blind source separation (BSS)

A brain recording can be modeled as a linear mixture of a finite number of sources, possibly containing additive noise [5]. We can therefore represent a generic eth EEG epoch \({\varvec{s}}_e^{\left( m \right) } \), \(m=1,\ldots ,M\), as a linear mixture of independent brain sources as:

$$\begin{aligned} {\varvec{S}}_{e} =\mathbf{A}\cdot {\varvec{B}}_{e} , \end{aligned}$$
(1)

where \({\varvec{S}}_{e} \) is a matrix whose mth row is given by the channel contribution \({\varvec{s}}_{e}^{\left( m \right) }\), \({\varvec{B}}_{e} \) is the matrix describing the brain sources, and \(\mathbf{A}\) is the \(M\times M\) mixing matrix relating the estimated brain sources to the observed measurements. By leveraging such a model, BSS/ICA techniques can therefore be advantageously used to decompose raw EEG data into relevant brain signal and noise subspaces.

Several distinct algorithms have been proposed in the literature to perform the required task. The approach followed here is based on the algorithm for multiple unknown signals extraction (AMUSE) [31], a BSS algorithm exploiting the second-order statistics (SOS) of the analyzed EEG data, in terms of covariances and autocovariances of the observed signals, to separate colored sources by estimating the unknown brain sources and the mixing matrix \(\mathbf{A}\) in Eq. (1). Specifically, the employed algorithm performs eigenvalue decomposition (EVD) twice: the first time to remove the second-order dependence among the observations (whitening step), and the second time to estimate the global separating matrix. The latter is obtained by solving the simultaneous diagonalization of two covariance matrices, \({\varvec{R}}_{\varvec{S}} \left( 0 \right) \) and \({\varvec{R}}_{\varvec{S}} \left( {\tau } \right) \). Usually, the best results are obtained with small time delays [6], and \(\tau =1\) is used in our experiments.
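
A minimal sketch of an AMUSE-style separation for a single epoch, following the description above (whitening through the EVD of \({\varvec{R}}_{\varvec{S}}(0)\), then the EVD of a symmetrized \({\varvec{R}}_{\varvec{S}}(\tau)\) with \(\tau =1\)), is reported below; regularization of small eigenvalues and other implementation details of [31] are omitted.

```python
import numpy as np

def amuse(S_e, tau=1):
    """AMUSE-style blind source separation of one epoch S_e (M x N array).
    Returns the estimated sources B_e, ordered by decreasing eigenvalue of the
    time-delayed covariance (i.e., by decreasing predictability), and the mixing matrix A."""
    X = S_e - S_e.mean(axis=1, keepdims=True)
    M, N = X.shape

    # First EVD: whitening based on the zero-lag covariance R_S(0)
    R0 = X @ X.T / N
    d, V = np.linalg.eigh(R0)
    W_white = np.diag(1.0 / np.sqrt(d)) @ V.T
    Z = W_white @ X

    # Second EVD: symmetrized time-delayed covariance R_S(tau)
    R_tau = Z[:, :-tau] @ Z[:, tau:].T / (N - tau)
    R_tau = 0.5 * (R_tau + R_tau.T)
    d_tau, U = np.linalg.eigh(R_tau)
    U = U[:, np.argsort(d_tau)[::-1]]   # most predictable sources first

    W = U.T @ W_white                   # global separating matrix
    B_e = W @ X                         # estimated brain sources
    A = np.linalg.pinv(W)               # mixing matrix of Eq. (1)
    return B_e, A
```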

Once the set of uncorrelated brain sources \({\varvec{B}}_{e}\) is obtained through AMUSE, a processed epoch \({\varvec{p}}_{e}^{\left( m \right) } \) can then be obtained by computing the corresponding matrix \({\varvec{P}}_{e} =\mathbf{A}_0 \cdot {\varvec{B}}_e \), where \(\mathbf{A}_0 \) is generated from \(\mathbf{A}\) by setting to zero all the coefficients multiplying the unwanted components. The selection of the latter is often performed by ordering the estimated brain components on the basis of their kurtosis [30]. In fact, such a measure quantifies the distance of the obtained sources from a Gaussian behavior, which is assumed typical for EEG data: the lower the absolute value of the kurtosis, the more Gaussian the signal, and vice versa. Strong negative kurtosis values usually reflect AC (alternating current) or DC (direct current) artifacts, while highly peaked activities, such as those associated with eye blink artifacts, are characterized by highly positive kurtosis values [9]. Discarding the components with the lowest and the highest kurtosis may therefore seem a plausible preprocessing procedure, yet such a strategy cannot guarantee an improvement in recognition rates, according to the performed experimental tests.

On the other hand, the preprocessing approach adopted here exploits the characteristic of the AMUSE algorithm of ordering the estimated components on the basis of their predictability, which depends on the singular values of the time-delayed covariance matrix \({\varvec{R}}_{\varvec{S}} \left( {\tau } \right) \). Specifically, the first ordered components are characterized by a high predictability, which is often associated with artifacts such as eye blinking. The last sources instead possess low predictability values and essentially contain EMG activity. Such a property is exploited to design an automatic preprocessing procedure, tested in Sect. 5, in which the first or last components, ordered in terms of predictability by the AMUSE algorithm, are removed to generate denoised EEG data.
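
Using the predictability ordering returned by the sketch above, a denoised epoch can be rebuilt by zeroing the columns of \(\mathbf{A}\) associated with the discarded components, as in \({\varvec{P}}_{e} =\mathbf{A}_0 \cdot {\varvec{B}}_e\); the number of components to drop at each end is left as a parameter.

```python
import numpy as np

def reconstruct_without(B_e, A, drop_first=0, drop_last=0):
    """Rebuild a denoised epoch P_e = A_0 . B_e, where A_0 is obtained from the mixing
    matrix A by zeroing the columns of the discarded components. Components are assumed
    ordered by predictability: the first ones typically capture eye blinks, the last ones
    mainly contain EMG activity."""
    M = A.shape[1]
    keep = np.ones(M, dtype=bool)
    if drop_first > 0:
        keep[:drop_first] = False
    if drop_last > 0:
        keep[M - drop_last:] = False
    A0 = A * keep[np.newaxis, :]   # zero the columns multiplying the unwanted sources
    return A0 @ B_e

# e.g., as suggested by the tests in Sect. 5: drop the last two components for EC data,
# P_e = reconstruct_without(B_e, A, drop_last=2), and only the last one for EO data.
```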

3.2 Sample entropy

The BSS approach in Sect. 3.1 decomposes the acquired EEG signals into uncorrelated brain components, in order to discard some of them, based on their predictability, before regenerating the signals to be processed. Nonetheless, it may still be necessary to further analyze the reconstructed epoch \({\varvec{p}}_e^{\left( m \right) } \) before extracting features from it. In fact, a specific investigation can be carried out on each individual channel in order to evaluate the presence of residual artifacts. Sample entropy (SE) is adopted here for this purpose, being a robust quantifier of complexity in EEG signals [26], and being often used as a marker for the presence of artifacts in EEG recordings [19].

The SE of a generic signal \({\varvec{x}}\) of length N is defined as the approximation of the statistic:

$$\begin{aligned} \hbox {SE}\left( {n,r} \right) =-\hbox {ln}\left[ {\frac{Pr\left( {d\left[ {{\varvec{x}}_{n+1}^{\left( i \right) } ,{\varvec{x}}_{n+1}^{\left( j \right) } } \right] \le r} \right) }{Pr\left( {d\left[ {{\varvec{x}}_n^{\left( i \right) } ,{\varvec{x}}_n^{\left( j \right) } } \right] \le r} \right) }} \right] , \end{aligned}$$
(2)

where \({\varvec{x}}_n^{\left( i \right) } \in \mathbb {R}^{n}\), with \(i\in \left[ {1,N-n+1} \right] \), is the ith vector extracted from \({\varvec{x}}\) as

$$\begin{aligned} {\varvec{x}}_n^{\left( i \right) } =\left[ {{\varvec{x}}\left[ i \right] ,{\varvec{x}}\left[ {i+1} \right] ,\ldots ,{\varvec{x}}\left[ {i+n-1} \right] } \right] , \end{aligned}$$
(3)

and the distance \(d\left[ {{\varvec{x}}_n^{\left( i \right) } ,{\varvec{x}}_n^{\left( j \right) } } \right] \) between two vectors is defined through the maximum difference of their corresponding scalar coefficients as:

$$\begin{aligned} d\left[ {{\varvec{x}}_n^{\left( i \right) } ,{\varvec{x}}_n^{\left( j \right) } } \right] =\mathop {\max }\limits _{k\in \left[ {0;n-1} \right] } \left( {\left| {{\varvec{x}}\left[ {i+k} \right] -{\varvec{x}}\left[ {j+k} \right] } \right| } \right) . \end{aligned}$$
(4)

The probability values in Eq. (2) are approximated by averaging the reported terms over the entire signal \(\varvec{x}\). As can be seen, the estimated SE depends on the selected parameters n and r, which are set to \(n=2\) and \(r=0.2\), respectively, as recommended in [34].
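
A direct transcription of Eqs. (2)–(4) into code could read as follows; the tolerance r is scaled here by the standard deviation of the analyzed signal, a common convention that is an assumption with respect to the text.

```python
import numpy as np

def sample_entropy(x, n=2, r=0.2):
    """Sample entropy of a 1-D signal following Eqs. (2)-(4); r is scaled by the
    standard deviation of x (assumed convention)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    tol = r * x.std()

    def match_count(m):
        # All overlapping vectors x_m^(i) of length m, as in Eq. (3)
        templates = np.array([x[i:i + m] for i in range(N - m + 1)])
        count = 0
        for i in range(templates.shape[0]):
            # Chebyshev distance of Eq. (4), excluding the self-match
            d = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.count_nonzero(d <= tol) - 1
        return count

    A = match_count(n + 1)   # matches between vectors of length n+1
    B = match_count(n)       # matches between vectors of length n
    return np.inf if A == 0 or B == 0 else -np.log(A / B)
```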

Following the aforementioned processing, it is therefore possible to evaluate M different sample entropy estimates \(\hbox {SE}_e^{\left( m \right) } \), one for each channel in the epoch \({\varvec{p}}_e^{\left( m \right) } \), with \(m=1,\ldots ,M\). Low values of SE are assumed to be associated with predictable signals, which could contain artifacts with a certain degree of regularity, such as ECG interference, eye blinks, or repetitive EMG movements [19]. Standard neural activity should instead result in high SE values. As described in Sect. 4, a threshold T can therefore be employed to decide whether a specific mth channel of the eth EEG epoch \({\varvec{p}}_e^{\left( m \right) } \) should be considered when comparing two EEG samples for recognition purposes.

4 Considered EEG recognition systems: signals representations and matching

Once the acquired EEG signals have been preprocessed, discriminative features have to be extracted and then matched to evaluate the similarity between two distinct EEG recordings. The template representations here considered are therefore presented in Sect. 4.1, while the adopted matching strategy is detailed in Sect. 4.2, where it is also described how the computed SE estimations can be exploited to improve the achievable recognition rates.

4.1 Feature representations

Two distinct EEG feature representations are considered here in order to draw conclusions with a good level of generality regarding the effects of the proposed preprocessing steps on the achievable recognition rates. Specifically, given a generic eth EEG epoch \({\varvec{z}}_e^{\left( m \right) } \), \(m=1,\ldots ,M\), which may consist of either the original signal \(\varvec{s}_e^{\left( m\right) }\) or the preprocessed frame \({\varvec{p}}_e^{\left( m \right) }\) in case the BSS method in Sect. 3.1 has been employed, a biometric template can be generated from it by exploiting either autoregressive (AR) reflection coefficients or power spectral density (PSD) values, two of the representations most commonly employed when dealing with EEG-based biometric recognition [20].

4.1.1 AR modeling

According to this representation, each channel of the considered epoch \({\varvec{z}}_e^{\left( m \right) }\) is modeled as a realization of an AR process of order Q, obeying the linear difference equation

$$\begin{aligned} {\varvec{z}}_e^{\left( m \right) } \left[ i \right] =-\mathop \sum \limits _{q=1}^Q a_e^{\left( m \right) Q,q} \cdot {\varvec{z}}_e^{\left( m \right) } \left[ {i-q} \right] +{\varvec{w}}\left[ i \right] , \end{aligned}$$
(5)

where \(\varvec{w}\left[ i \right] \) is a realization of a white noise process with standard deviation \(\sigma _e^{\left( m \right) Q} \), and \(a_e^{\left( m \right) Q,q} \), \(q=1,\ldots ,Q\), are the Q AR coefficients representative of the model. Such coefficients can be estimated through the well-known Yule-Walker equations [14], which can be solved iteratively through the Levinson algorithm, introducing the concept of reflection coefficients \(K_e^{\left( m \right) q} \), \(q=1,\ldots ,Q\). The latter can be computed directly from the observed data \({\varvec{z}}_e^{\left( m \right) }\) exploiting the Burg method [14], which provides a more stable estimation than the one associated with the AR coefficients \(a_e^{\left( m \right) Q,q} \). Using reflection coefficients for biometric recognition purposes has in fact been shown to be preferable [3], and this choice is therefore taken here to derive the AR-based EEG representation with length \(V=M\cdot Q\) as

$$\begin{aligned} { }^{( \mathrm{AR} )} {\varvec{v}}_e= & {} [K_e^{\left( 1 \right) 1} ,K_e^{\left( 1 \right) 2} ,\ldots ,K_e^{\left( 1 \right) Q} ,\ldots ,\nonumber \\&,\ldots ,K_e^{\left( M \right) 1} ,K_e^{\left( M \right) 2} ,\ldots ,K_e^{\left( M \right) Q} ] \end{aligned}$$
(6)

Together with \({ }^{( \mathrm{AR} )} {\varvec{v}}_e\), an additional binary vector \({\varvec{o}}_e \), having the same length as \({ }^{( \mathrm{AR} )} {\varvec{v}}_e \), is also extracted during the analysis of each epoch \(\varvec{z}_e^{\left( m \right) } \). The vector \({\varvec{o}}_e \) has all its \(V=M\cdot Q\) coefficients set to 1 in case the SE of the considered EEG signals is not exploited. Conversely, in case the SE is taken into account during the analysis of the available frames, the values in \({\varvec{o}}_e \) are set to 1 only in correspondence to the elements in \({ }^{( \mathrm{AR})} {\varvec{v}}_e \) associated with the channels for which \(\hbox {SE}_e^{\left( m \right) } >T\), and to 0 otherwise. Such an approach allows considering only the channels without notable artifacts when comparing EEG samples, as described in Sect. 4.2. As in [20], the AR model order Q is set to 12 according to the Akaike information criterion (AIC), in order to minimize the information loss in fitting the considered data [16].
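
A minimal sketch of the Burg recursion yielding the Q reflection coefficients of each channel, concatenated into the template of Eq. (6) together with the SE-based binary mask \({\varvec{o}}_e\), is given below; the sign convention of the reflection coefficients may differ from that of the implementation used in [3, 14].

```python
import numpy as np

def burg_reflection(x, Q=12):
    """Reflection coefficients K^1,...,K^Q of one channel via the Burg recursion."""
    x = np.asarray(x, dtype=float)
    f, b = x[1:].copy(), x[:-1].copy()   # forward and backward prediction errors
    k = np.zeros(Q)
    for q in range(Q):
        k[q] = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # update the prediction errors and shrink the data window by one sample
        f, b = (f + k[q] * b)[1:], (b + k[q] * f)[:-1]
    return k

def ar_template(epoch, se_values=None, T=None, Q=12):
    """AR-based template of Eq. (6) plus the binary mask o_e of Sect. 4.1.1.
    epoch is an M x N array; se_values optionally holds SE_e^(m) for each channel."""
    M = epoch.shape[0]
    v = np.concatenate([burg_reflection(epoch[m], Q) for m in range(M)])
    o = np.ones_like(v)
    if se_values is not None and T is not None:
        valid = np.asarray(se_values) > T          # channels deemed free of artifacts
        o = np.repeat(valid.astype(float), Q)
    return v, o
```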

4.1.2 PSD modeling

The Welch averaged modified periodogram approach [14], with a sliding Hanning window lasting 1 s and an overlap of 0.5 s between consecutive segments, is employed here to estimate the PSD \({\varvec{Z}}_e^{\left( m \right) } \) of each mth channel of the eth EEG epoch \({\varvec{z}}_e^{\left( m \right) } \). Since the considered EEG signals are filtered to the [4, 30] Hz range as described in Sect. 3, when a frequency resolution of 1 Hz is employed for the considered representation, a total of \(F=27\) nonzero frequency values are available for each mth channel in \({\varvec{Z}}_e^{\left( m \right) } \). Using such coefficients, it is possible to generate an EEG biometric template with length \(V=M\cdot F\) from \({\varvec{z}}_e^{\left( m \right) } \) through a PSD characterization as

$$\begin{aligned} { }^{( \mathrm{PSD} )} {\varvec{v}}_e= & {} [{\varvec{Z}}_e^{\left( 1 \right) } \left[ 4 \right] ,{\varvec{Z}}_e^{\left( 1 \right) } \left[ 5 \right] ,\ldots ,{\varvec{Z}}_e^{\left( 1 \right) } \left[ {30} \right] ,\ldots ,\nonumber \\&,\ldots ,{\varvec{Z}}_e^{\left( M \right) } \left[ 4 \right] ,{\varvec{Z}}_e^{\left( M \right) } \left[ 5 \right] ,\ldots ,{\varvec{Z}}_e^{\left( M \right) } \left[ {30}\right] ]. \end{aligned}$$
(7)

An additional binary vector \({\varvec{o}}_e \), with the same length as \({ }^{( \mathrm{PSD})} {\varvec{v}}_e \), is also generated when adopting the PSD representation. As for the AR representation, the coefficients of \(\varvec{o}_e \) are all equal to 1 in case the SE of the considered EEG signals is not taken into account, while their values depend on the computed \(\hbox {SE}_e^{\left( m \right) } \) when the entropy of the analyzed data is also evaluated, setting them to 1 only in correspondence to the elements in \({ }^{( \mathrm{PSD})} \varvec{v}_e\) derived from channels for which \(\hbox {SE}_e^{\left( m \right) } >T\).
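
A corresponding sketch for the PSD representation, based on scipy.signal.welch with a 1 s Hann window, 0.5 s overlap, and 1 Hz resolution, is reported below; the exact windowing and scaling options of the original implementation are not specified in the text and are assumptions here.

```python
import numpy as np
from scipy import signal

def psd_template(epoch, fs, se_values=None, T=None, band=(4.0, 30.0)):
    """PSD-based template of Eq. (7) plus the binary mask o_e of Sect. 4.1.2.
    Welch's method with a 1 s Hann window and 0.5 s overlap gives a 1 Hz resolution."""
    nperseg = int(fs)                              # 1 s window -> 1 Hz frequency bins
    freqs, Z = signal.welch(epoch, fs=fs, window="hann",
                            nperseg=nperseg, noverlap=nperseg // 2, axis=1)
    keep = (freqs >= band[0]) & (freqs <= band[1]) # F = 27 bins for the [4, 30] Hz range
    v = Z[:, keep].reshape(-1)                     # concatenation over the M channels
    o = np.ones_like(v)
    if se_values is not None and T is not None:
        valid = np.asarray(se_values) > T          # channels deemed free of artifacts
        o = np.repeat(valid.astype(float), int(keep.sum()))
    return v, o
```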

4.2 Matching strategy

It is assumed that, in the enrollment phase of the proposed systems, EEG recordings from U subjects are captured and processed as described in Sects. 3 and 4.1, in order to derive the biometric templates \({\varvec{v}}_e^u \left[ b \right] \) and \({\varvec{o}}_e^u \left[ b \right] \), with \(u=1,\ldots ,U\), \(e=1,\ldots ,E\), and \(b=1,\ldots ,B\). As already observed, the vector length B depends on the chosen representation, being based on either AR or PSD modeling. During the recognition phase, in case an identification scenario is taken into account, an EEG probe is presented to the system and processed in order to derive the vectors \({\varvec{v}}_a \left[ b \right] \) and \({\varvec{o}}_a \left[ b \right] \), with \(a=1,\ldots ,A\), where A is the number of epochs extracted from the available EEG sample. A distance

$$\begin{aligned}&d^{u}\left( {\left\{ {{\varvec{v}}_a ,{\varvec{o}}_a } \right\} ,\varPhi ^{u}} \right) \nonumber \\&\quad =\frac{V}{M\cdot \left\{ {\mathop \sum \nolimits _{b=1}^B \varvec{o}_e^u \left[ b \right] \cdot \varvec{o}_a \left[ b \right] } \right\} }\cdot \nonumber \\&\quad \quad \cdot \left\{ {\mathop {\min }\limits _e \left\{ {\mathop \sum \limits _{b=1}^B {\varvec{o}}_e^u \left[ b \right] \cdot {\varvec{o}}_a \left[ b \right] \cdot \left| {{\varvec{v}}_a \left[ b \right] -{\varvec{v}}_e^u \left[ b \right] } \right| } \right\} } \right\} , \end{aligned}$$
(8)

between the query \(\left\{ {\varvec{v}_a ,\varvec{o}_a } \right\} \) and the set \(\varPhi ^{u}\), comprising the E epochs made available during the enrollment of user u, is then evaluated as the minimum of E Manhattan (L1) distances. As can be seen from Eq. (8), in case the SE of the EEG signals is exploited, the binary vectors \({\varvec{o}}_e^u \) and \({\varvec{o}}_a \) allow performing a comparison between the corresponding EEG epochs by taking into account only the channels assumed free of relevant artifacts in both the matched epochs. The normalization term \(\frac{M}{V}\cdot \left\{ {\mathop \sum \limits _{b=1}^B {\varvec{o}}_e^u \left[ b \right] \cdot {\varvec{o}}_a \left[ b \right] } \right\} \), giving the actual number of channels considered for the distance evaluation, is included in order to make the distances of the recognition probe from the templates of different enrolled users comparable with each other. A nearest neighbor approach is employed to estimate the identity of the presented subject as \(\bar{u}_a =\mathop {\hbox {argmin}}\limits _u \left\{ {d^{u}\left( {\left\{ {{\varvec{v}}_a ,{\varvec{o}}_a } \right\} ,\varPhi ^{u}} \right) } \right\} \), on the basis of the considered ath frame available during recognition. A majority voting rule is then adopted to take the final decision on the identity \(\bar{u}\) of the presented user, according to the identity with the highest number of occurrences among the votes \(\bar{u}_a \), \(a=1,\ldots ,A\), each taken on the basis of an individual frame.
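
A sketch of the matching stage, implementing the masked and channel-normalized L1 distance of Eq. (8) (normalized here per enrolled epoch, which is one possible reading of the equation), the nearest-neighbor decision, and the final majority vote, could read as follows; the enrollment database is assumed stored as a dictionary mapping each user to the list of its E template pairs.

```python
import numpy as np
from collections import Counter

def masked_distance(v_a, o_a, user_templates, M):
    """Distance d^u of Eq. (8) between a probe epoch {v_a, o_a} and one enrolled user:
    L1 distance restricted to the coefficients valid in both epochs, divided by the
    number of channels actually compared, minimized over the E enrolled epochs."""
    V = v_a.size
    best = np.inf
    for v_e, o_e in user_templates:                # user_templates: list of (v, o) pairs
        common = o_e * o_a
        n_channels = (M / V) * common.sum()        # channels considered in this comparison
        if n_channels == 0:
            continue                               # no channel survives the SE-based masking
        d = np.sum(common * np.abs(v_a - v_e)) / n_channels
        best = min(best, d)
    return best

def identify(probe_epochs, database, M):
    """Nearest-neighbor vote for each of the A probe epochs, then majority voting."""
    votes = []
    for v_a, o_a in probe_epochs:
        dists = {u: masked_distance(v_a, o_a, templates, M)
                 for u, templates in database.items()}
        votes.append(min(dists, key=dists.get))    # argmin_u d^u for this frame
    return Counter(votes).most_common(1)[0][0]     # identity with the most votes
```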

Fig. 1 The 10–20 international system seen from the left (a) and from above the head (b) (Jaakko Malmivuo and Robert Plonsey, Bioelectromagnetism, Oxford University Press, 1995, WEB version)

5 Experimental tests

The longitudinal dataset introduced in [20] is employed here to test the effectiveness of the proposed automatic preprocessing steps in improving the performance achievable with EEG biometrics. Specifically, the considered data comprise acquisitions from \(U=50\) healthy subjects, whose age ranges from 20 to 35 years with an average of 25, taken with a GALILEO BE Light amplifier with \(M=19\) electrodes placed on the scalp according to the 10–20 international system (see Fig. 1). EEG signals are collected for each considered subject during three distinct acquisition sessions, indicated in the following as S1, S2, and S3. The average temporal distance between the first two sessions is seven days, while the third session is performed on average 34 days after the first one. During each EEG acquisition, subjects are comfortably seated on a chair in a dimly lit room, with four minutes of recording first taken in the EO resting state, during which the subject is asked to fixate a light point on the screen, followed by another four minutes during which the subject remains in the EC resting state.

The available sessions are employed to evaluate and compare the recognition performance achievable with or without the proposed preprocessing, when using disjoint sessions for the enrollment and the identification stage. As in [20], we here consider four different matching scenarios:

Table 1 Comparison of recognition rates achieved with EEG signals captured with the EC protocol, when either using or not using the proposed BSS preprocessing approach, for all the considered EEG feature representations and matching scenarios
Table 2 Comparison of recognition rates achieved with EEG signals captured with the EO protocol, when either using or not using the proposed BSS preprocessing approach, for all the considered EEG feature representations and matching scenarios
  • S1 vs S2, with enrollment data captured during S1 and identification data taken from S2, for a temporal distance between sessions of about one week;

  • S2 vs S3, with enrollment data captured during S2 and identification data taken from S3, for a temporal distance between sessions of about three weeks;

  • S1 vs S3, with enrollment data captured during S1 and identification data taken from S3, for a temporal distance between sessions of about four weeks;

  • (S1, S2) vs S3, where enrollment data are derived from both S1 and S2, while identification data are taken from S3.

With the aim of providing reliable statistics regarding the effectiveness of the proposed preprocessing approaches, cross-validation procedures are carried out to estimate the achievable recognition performance. Specifically, for the analysis of each considered scenario, 10 different runs are performed by randomly selecting each time, for all the 50 available subjects, an EEG signal lasting 150 s from the enrollment dataset to estimate the users' templates \({\varvec{v}}_e^u \) and \({\varvec{o}}_e^u \), with \(u=1,\ldots ,U\), \(e=1,\ldots ,E\). It is worth specifying that such a temporal extension is kept for all the considered matching scenarios, even when exploiting multiple enrollment sessions for this purpose. Moreover, at each run 10 different recognition probes lasting 90 s are randomly selected from the identification session to derive the instances \({\varvec{v}}_a \) and \({\varvec{o}}_a \), with \(a=1,\ldots ,A\), employed for performance estimation. This way, \(50 \times 10 = 500\) identification tests are carried out at each iteration to estimate the rank-1 identification rates (IRs). On the basis of the performed tests, the recognition performance reported in the following is therefore expressed through the mean \(\mu _\mathrm{IR} \) of the 10 computed rank-1 IRs.

Table 3 Comparison of recognition rates achieved with EEG signals captured with the EC protocol, when using no preprocessing, the BSS approach, or the BSS+SE preprocessing, for all the considered EEG feature representations and matching scenarios
Table 4 Comparison of recognition rates achieved with EEG signals captured with the EO protocol, when using no preprocessing, the BSS approach, or the BSS+SE preprocessing, for all the considered EEG feature representations and matching scenarios

The first performed analysis regards the effectiveness of the proposed automatic BSS preprocessing in improving the achievable EEG recognition rates. Tables 1 and 2 outline the performance achieved when considering the EC and the EO acquisition protocols, respectively. Specifically, the reported results compare the behaviors obtained when not using the proposed BSS decomposition approach with those obtained when exploiting it by discarding some of the estimated EEG sources, considering different combinations of the first two and the last two components in order of predictability. It can be seen that, in both EC and EO conditions, an improvement in recognition rates can actually be obtained by exploiting BSS decomposition. Specifically, in EC scenarios it is advisable to discard the last two components, that is, the Mth and the \((M-1)\)th ones, in order to achieve recognition rates better than those obtained without resorting to the proposed preprocessing, and also better than those achieved by discarding other subsets of estimated components. Such a result is in fact confirmed, for both the considered feature representations, in all the considered matching scenarios, with the sole exception of the S1 vs S3 case, where the highest identification rates are obtained by discarding only the last component. A paired-sample t-test is also carried out, on the basis of the obtained identification performance, to provide a statistical validation of the usefulness of performing the described EEG preprocessing for the aim of people recognition. Specifically, the results achieved without preprocessing and with BSS preprocessing discarding the last two components are compared by evaluating the differences among the 80 identification rates estimated in both cases, considering altogether the 10 iterations performed for each of the four considered matching scenarios, with EEG signals represented through both AR and PSD features. The obtained p-value, equal to \(1.08\times 10^{-14}\), allows rejecting the null hypothesis with a statistical significance level far lower than \(\alpha = 0.01\). As for EEG signals acquired according to the EO protocol, a preferable choice consists in discarding only the last (Mth) component. In fact, besides often improving the recognition rates obtained without any preprocessing, just as when discarding the first one as well, the suggested approach results in particularly high performance in the (S1, S2) vs S3 matching conditions, where the best possible recognition rates are typically reached, as shown in [20].

The paired-sample t-test performed in EO conditions provides a p-value equal to \(1.85 \times 10^{-9}\), largely justifying the reliability of the proposed BSS preprocessing, performed in this case by discarding only the last component. From the reported results, it can also be seen that, when exploiting a PSD representation for the considered EEG signals, more notable improvements in terms of recognition performance can be achieved through the proposed BSS approach in EO conditions than in EC scenarios. This testifies to the method's effectiveness in removing artifacts, which typically occur more often in EO than in EC conditions.

Further performance improvements can also be achieved when additionally exploiting the proposed SE-based preprocessing. Specifically, Tables 3 and 4 detail the recognition rates obtained in EC and EO acquisition conditions, respectively, when jointly using BSS decomposition and SE estimation for EEG preprocessing. In more detail, the reported results are obtained when following the aforementioned suggestions for the application of BSS decomposition, that is, discarding the last two components in EC conditions and the last one in EO scenarios. For an easier visual inspection of the achieved performance, the rates obtained without any preprocessing and with the sole BSS preprocessing are also reported in Tables 3 and 4. The threshold T introduced in Sect. 3.2, used to determine which channels of an epoch are deemed to contain artifacts and should therefore be discarded in the matching process, is expressed as a percentile (PCTL) of the overall distribution of SE values estimated over the available database. From the obtained results, it is possible to observe that exploiting SE can guarantee a further slight improvement with respect to the sole adoption of BSS decomposition, in almost all the considered scenarios. The paired-sample t-test performed to evaluate the effectiveness of using SE information (with \(T=1\) PCTL) over the simple BSS method results in a p-value equal to \(3.5\times 10^{-2}\) in EC conditions, making it therefore possible to reject the null hypothesis with a statistical significance level lower than \(\alpha = 0.05\). A much lower p-value, equal to \(1.85\times 10^{-9}\), is instead obtained in EO scenarios, testifying to the larger performance gains that can be observed when dealing with an EO acquisition protocol, regardless of the employed EEG feature representation.

The highest recognition rates are achieved in the case of the EC acquisition protocol, the AR feature representation, and the use of data from multiple sessions for enrollment purposes, with a mean \(\mu _\mathrm{IR} \) of about 95 %.

6 Conclusions

A preliminary study on the usefulness of the preprocessing phase for the recognition performance achievable with EEG biometrics has been presented in this paper. Specifically, automatic strategies for artifact removal based on BSS and SE have been proposed, and their effectiveness has been evaluated over a large longitudinal database comprising EEG acquisitions taken from 50 users over three distinct recording sessions, in both EC and EO conditions. It has therefore been shown that investigating how to discard unwanted noise from the acquired EEG data can be of paramount importance for guaranteeing high recognition rates when exploiting brain signals as a biometric identifier.