Systematic analysis of wavelet denoising methods for neural signal processing

Objective. Among the different approaches for denoising neural signals, wavelet-based methods are widely used due to their ability to reduce in-band noise. All wavelet denoising algorithms have a common structure, but their effectiveness strongly depends on several implementation choices, including the mother wavelet, the decomposition level, the threshold definition, and the way it is applied (i


Introduction
The recording of high-quality neural signals, both at the central and peripheral levels, is important for diagnostic and decoding purposes.In the latter case, neural signaling can be exploited for human-machine interfaces, particularly for function restoration [1,2].For instance, advanced motor neuroprosthetics [3][4][5][6][7] aim to control mechatronic limbs by decoding motor commands via the analysis of the action potentials recorded through invasive interfaces.Neural signals are characterized by trains of action potentials (spikes) buried in physiological interference.Their amplitude is usually lower than other electrophysiological signals [4,8,9], making it difficult to isolate them from the background noise picked up at the neural interface.Beyond the biological sources, such as muscles and other distant neuronal populations, instrumental noise and powerline interference also affect the signal.However, signal decoding based on spike sorting methods requires good signal-to-noise ratios (SNRs) to correctly cluster the action potentials based on the waveform morphology [9].Moreover, action-potential morphology carries important information about the neural source [10][11][12], the cellular morphological structure, and the localization with respect to the recording electrode [13][14][15].
Unfortunately, the SNRs of signals recorded from the peripheral nervous system (PNS) are significantly lower than those recorded from the central nervous system (CNS) [4] due to increased interference levels and tissue resistance, which hampers spike sorting and morphological analyses [8,9].In this case, the design of an effective denoising stage that preserves the signal morphology is of paramount importance, also for real-time implementations [16,17].
To date, several denoising approaches based on linear filtering have been adopted for processing neural signals.These include bandpass filtering with a finite impulse response (FIR) [18][19][20][21] and filters with an infinite impulse response (IIR) [20,[22][23][24][25].However, bandpass filtering must be carefully designed to avoid the introduction of distortions in the actionpotential waveform [26,27].The pitfall of these approaches is the impossibility of removing background noise from the neural signal band.Hence, the filtering is either not completely effective or too aggressive regarding signals of interest [28].
Wavelet denoising techniques are based on the representation of the signal in the time-scale domain, which allows for a fine analysis of the signal's frequency components without loss of their temporal information, in contrast to the traditional Fourier transform [40].Moreover, wavelet transform has been widely adopted in neural signal processing for extracting features for spike sorting and decoding algorithms [22,41,42].Basically, all wavelet denoising techniques have a common framework comprising three steps: analysis (i.e.signal decomposition to obtain the detail and approximation coefficients), thresholding of the details, and synthesis (i.e.time-domain reconstruction of the signal using the thresholded coefficients and the approximation).However, the efficacy of wavelet denoising depends on the implementation choices adopted at the different stages, including the mother wavelet, the decomposition level, the threshold definition, and how it is applied to the detail coefficients (i.e. the thresholding).
Despite the rich literature that exploits wavelet denoising as a powerful signal enhancement tool in the hands of neural engineers, a systematic analysis and comparison of different approaches are lacking.Typically, research works focus on a specific wavelet denoising approach and exploit it, establishing all the parameters a priori and ignoring their possible effects on the neural signal, especially in terms of spikemorphology preservation.Moreover, wavelet denoising is generally considered as a pre-processing step, so that a quantitative appraisal of the performance of this powerful tool compared to different denoising methods in the context of neural signal processing is impossible from a simple literature review.Even though some works demonstrated the superiority of wavelet denoising with respect to other methods [28,30,34], such results are aimed at the presentation of a specific approach rather than at a systematic analysis and, most important, the comparison between different studies is hampered by the adoption of different and not publicly available datasets.The limited generalizability of these results motivated the study reported in this work, aimed at supporting neural engineers in a reasoned selection of the wavelet denoising algorithms and parameterisations that can better lead to the expected results on neural signals, regardless the downstream decoding or sorting algorithm to be applied.
In this work, we investigated the importance of different implementation choices in wavelet denoising algorithms to quantitatively assess their effects on neural signals in terms of noise reduction and morphology preservation.To this end, we implemented and compared the performances of different thresholds and thresholding methods reported in the scientific literature.We used a synthetic dataset of CNS signals and a real dataset of PNS signals to perform a complete statistical analysis of denoising performance in terms of quantitative metrics such as the root-mean-square error (RMSE), the Pearson's correlation coefficient, and the SNR.By systematically analysing the pros and cons of the different implementation choices, this work contributes to the design of effective denoising algorithms for neural signal processing.

Wavelet denoising
Among the different approaches used for denoising neural signals, wavelet-based methods are widely used due to their ability to reduce in-band noise, i.e. noise whose spectral content overlaps the signal spectrum.Wavelet denoising is used in signal processing to reduce background noise that can be approximated by a Gaussian-distributed random source.This approach relies on the basic assumption that the input signal y is given by the signal of interest x, which is corrupted by additive white Gaussian noise (AWGN) η [40,43,44]: (1) The wavelet transform represents the signal in the time-frequency domain through a set of wavelet coefficients, also called approximation and details.The details can be considered to be the result of a matched filter between the signal and a scaled and time-shifted version of a function called the 'mother wavelet' .In this sense, the mother wavelet is typically chosen to be as similar as possible to the waveform of interest, e.g. to the neuronal action potentials, to emphasize the signal characteristics.Mathematically [40], where ψ a,b (t) indicates dilated and shifted versions of the mother wavelet ψ, respectively, which are obtained by varying the scale parameter a ∈ R + and the shift parameter b ∈ R. Based on the discretization of these parameters, we can obtain either a continuous wavelet transform (CWT) or a discrete wavelet transform (DWT).In a CWT, the shift and scale parameters can assume any value in R, whereas in a DWT these parameters can take only discrete values.For instance, in the dyadic decomposition [40]: where j, k ∈ Z and j identifies the decomposition level.In this case, the wavelet transform can be implemented by means of a filtering tree that includes, for each level, a pair of high-pass and low-pass filters, whose definition depends on the mother wavelet.Each pair of filters splits the incoming signal band in half and this process is iteratively applied to the lowest sub-band.Once the desired decomposition level is achieved, the lowest sub-band, which defines the approximation of the signal, covers the range between 0 and f n /2 l , where f n denotes the Nyquist frequency and l the decomposition level.In this way, the approximation along with the other sub-bands (i.e. the details) extend from 0 to the Nyquist frequency of the input signal.
Among the possible DWTs, in this work we chose the stationary wavelet transform (SWT) because of its translation-invariant property [40,[45][46][47].SWT has been widely used in neural signal processing [32-36, 38, 39] because the denoised signal morphology is independent of the occurrence time of the action potentials.Previous studies [30,33,34] have demonstrated the superiority of SWT in neural processing with respect to DWT.Specifically, while SWT produces denoising results that are independent from the spike occurrence times, DWT decomposition may miss or deform the spikes depending on their temporal occurrence [30] because of the decimation introduced at each level to prevent information redundancy.Rather than decimating the output of each filter, the SWT implementation adopted in this work obtains translation-invariance by upsampling at each level the two decomposition filters of the previous level, thus preserving the time resolution of the input signal.
Conventional wavelet denoising processing preserves the approximation but thresholds the details using different algorithms.Typically, the value of the threshold is defined on the basis of the noise intensity, and can be level-dependent or not.Assuming a normal noise distribution, its intensity can be estimated based on its standard deviation.As the noise that affects real neural signals is not necessarily white, but could be colored or unstructured, e.g.low-level activity from distant neural sources [26], its standard deviation may differ at different levels [48].For these reasons, in this work we estimated the noise intensity at each decomposition level j based on its standard deviation, which was approximated as described in [31,34,49]: According to equation ( 5), the standard deviation at level j is computed as the 75th percentile of the median absolute deviation calculated for the j-level detail coefficients cD j .To obtain an accurate estimate of the noise intensity, equation ( 5) is computed during a quiescent period of neural activity [31,34].In this work, we set the duration of the quiescent period to 2 s.
Different thresholds can be defined on the σ j value (see section 2.1.1).Moreover, the thresholding method used, i.e. the means by which detail coefficients above and below the threshold are manipulated, differs in different methods.The methods most often used are hard thresholding and soft thresholding [43,44], which as defined respectively as: where cD j,k denotes the kth detail coefficient at level j and θ j is the threshold defined at that level.These thresholding methods have different pros and cons.
In general, hard thresholding introduces discontinuities into the detail coefficients that can generate oscillations in the time domain, which are also known as pseudo-Gibbs phenomena.Soft thresholding reduces the above-threshold coefficients, which decreases the signal amplitude (shrinkage effect).These issues motivated the introduction of other methods in an attempt to mitigate their main drawbacks.

The investigated wavelet denoising algorithms
In this paper, we compared different wavelet denoising algorithms parameterized for the specific application domain of neural signal processing.As discussed in the section above, both the threshold definitions and thresholding methods were studied, from which eight different algorithms were identified.
As regards the threshold definition, we considered different approaches described in the scientific literature.The threshold value influences the number of detail coefficients that will be forced to zero, thus determining the aggressiveness of the denoising.All the thresholds were considered to be level-dependent, either because of the formulation of the threshold itself or solely by means of σ j .
The most common approaches to threshold definition are the universal and minimax ones [43,44], which are respectively defined as follows: In both, N represents the total number of samples.In this case, level dependency is due solely to σ j , which is computed for each level.By using different threshold definitions, level dependency can be determined also by the scaling factor, beyond the σ j value.One example is the Han et al threshold [50], which was initially introduced in a totally different application field.We included it in this study because it was conceived to be more aggressive at higher frequencies and more conservative at lower frequencies.In this case, the threshold is defined as: where the value of j is between 1 and the decomposition level L.
We tested these three methods with both hard and soft thresholding.
We also assessed two other wavelet denoising algorithms, including a custom threshold definition and thresholding approach.The first one is the Golroudbari's algorithm described in [51], for which the threshold is defined as follows: Thresholding approaches try to smooth the discontinuities introduced by the hard thresholding formulation while reducing the shrinkage effect of soft thresholding: where 0 ≤ α ≤ 1 and m ≥ 0. In this work, according to previous assessments [52], we set α = 0.75 and m = 10.
The last wavelet denoising algorithm we compared was developed by Cannas et al [53], in which both the threshold definition and the way it is applied to the detail coefficients are specific to each decomposition level.This method is based on the identification of a critical decomposition level, which is characterized by a sudden increase in the standard deviation of the detail coefficients (σ l ).Then, for levels below the critical level, soft thresholding is applied and the threshold value must be found in the range σ l , max(cD j,k ) by minimizing a specific cost function.This cost function corresponds to the maximum absolute value of the cross-correlation function computed between the denoised signal and the noise identified by the algorithm.For the critical level, this cost function must be minimized, but the solution lies within the interval [0, σ l ] From this level on, hard thresholding is applied.Lastly, for levels above the critical level, a fixed low threshold is applied and the signal contribution is assumed to be significantly higher than that of the noise.

Implementation choices explored
In addition to the threshold definition and thresholding methods, other wavelet denoising implementation choices influence the quality of the processing result.Among them, we studied the mother wavelet and the decomposition level.Moreover, due to the spectral content of the neural signals, we investigated the choice of preserving the approximation coefficients or clearing them, thereby globally high-pass filtering the signal.
Using synthetic signals, we first identified the best-performing mother wavelet in terms of preservation of the signal morphology.To this end, we fixed the wavelet denoising algorithm by selecting the minimax threshold with hard thresholding, which is a method that has been successfully applied in the field [34][35][36].We compared the following five orthogonal and biorthogonal mother wavelets with compact support [40]: Haar, Coiflet2 (Coif2), Daubech-ies4 (Db4), Biorthogonal6.8 (Bior6.8),and Symlet7 (Sym7) [28,[31][32][33][34][35][36][37][38][39].Two decomposition levels were also explored, based on the spectral characteristics of the neural signals and the spectral ranges typically considered in the neural processing scientific literature.Specifically, for input signals sampled at 16 kHz, we set the upper boundary of the SWT lowest subband to 500 Hz or 250 Hz, by respectively adopting four or five decomposition levels.In this way, when adopting five levels of decomposition, wavelet denoising operates above 250 Hz, which resembles the lower cut-off frequency of one of the best-performing bandpass filters in the neural processing domain, i.e. noncausal fourth-order elliptic bandpass filter with cutoff frequencies of 300 Hz and 3000 Hz [27].In this regard, the same lower cut-off frequency has been widely adopted in the field, such as in [16, 21-24, 37, 41].Moreover, the spectral range between 0 and 244 Hz was found to be the most effective for the approximation in [28], despite based on an unconventional wavelet denoising approach.On the other hand, with four decomposition levels, wavelet denoising spreads from 500 Hz to the Nyquist frequency, which recalls more aggressive processing approaches of the scientific literature, which focused on frequencies above 500 or even 750 Hz [4, 7, 18, 19, 25, 30, 31, 34-36, 38, 39].Furthermore, all these considerations are especially relevant if approximation coefficients are zeroed to introduce a high-pass effect, since the spectral contributions below 250 Hz or 500 Hz are completely cut out.
Then, we tested and compared the performances of the wavelet denoising algorithms introduced in section 2.1.1.By considering all the performance indexes, we identified the best-performing algorithm, along with the decomposition level that obtained the best results.
Finally, as the low-frequency contributions of the approximation coefficients can be reasonably considered to be outside the band of interest for neural signals from the PNS [4,7,18,19,21,[31][32][33][34], we evaluated the impact of the approximation on the denoised signal.Overall, approximation clearing would determine the effect of high-pass filtering on the time-domain reconstructed signal.This aspect was investigated by first comparing the bestperforming algorithm with and without clearing the approximation coefficients and then comparing the resulting best solution to that of the non-causal fourth-order elliptic bandpass filter with cut-off frequencies of 300 Hz and 3000 Hz, which is effective for neural signal denoising [27].
The algorithms were implemented and tested in MATLAB (MathWorks Inc.), and the statistical analyses were performed using Stata 16 (StataCorp LP, College Station, TX, USA).

Datasets
To assess the performances of the different algorithms and parameterizations, we used a synthetic neural dataset and a real PNS murine dataset.Figure 1 shows examples of synthetic and real signals.

Synthetic dataset
To assess the wavelet denoising performance on different noise types, we used two synthetic datasets, which contained different numbers of artificial neural signals 60 s in length, sampled at 16 kHz.
The first dataset is freely available and was introduced by the authors of [22], downsampled from 24 kHz.It consists of eight signals, each containing approximately 3400 spikes with three different morphologies and affected by additive noise, whose standard deviation ranges from 0.05 to 0.4.The creators of the dataset generated physiologically plausible background noise by placing different action potentials of variable amplitude at random instants, which resemble the typical biological interference of neural recordings.For these signals, 95% confidence intervals for SNR were identified as [4.85 ÷ 14.4] dB, with a mean value of 9.6 dB.Hereinafter, we refer to this dataset as 'Physio' .
The second synthetic dataset was derived from the first by using synchronized averaging to extract three different action-potential waveforms from a high SNR signal.The signals of this dataset were then created by assembling 2700 spikes per signal, randomly placed over time.To this basic structure, we introduced different levels of AWGN, giving rise to 30 noisy signals with standard deviations ranging linearly from 0.01 to 0.435.In terms of SNR, the mean value was estimated as 13.1 dB, whereas 95% confidence intervals matched the range [10.3 ÷ 16.0] dB.Hereinafter, we refer to this dataset as 'AWGN' .
As noted above, wavelet denoising is known to be effective on AWGN, so we developed an AWGN dataset to represent the easiest type of noise that could affect neural signals.Unfortunately, real neural signals are typically buried in less structured physiological and instrumental noises [26].Therefore, we adopted the Physio dataset to simulate a more realistic condition.
To characterize these two datasets, excluding the similarities in their noise characteristics, we performed some preliminary tests on their signal distributions.Use of the Lilliefors normality test on the Physio dataset noise enabled the exclusion of a Gaussian distribution for its signals (p < 0.001), which is consistent with the characteristics of real noise that affects PNS recordings [26].Moreover, the power spectral density estimated using the Welch's method [54] revealed the coloured nature of the pseudophysiological noise affecting this dataset, which, in accord with [26], suggests a pink noise spectrum.

Peripheral nerve dataset
The real dataset consists of 11 signals acquired from a murine PNS.Specifically, a TIME-3 H electrode was implanted in the sciatic nerve of a Sprague-Dawley rat following a procedure similar to that reported in [55].Briefly, the rat was anesthetized with a mixture of ketamine/xylazine (90/10 mg kg −1 i.p.) and the sciatic nerve was exposed at the midthigh.Then, using a straight needle attached to a surgical loop (STC-6 needle, EH7900G, Ethicon), the electrode was implanted within both the tibial and peroneal fascicles of the sciatic nerve.The experiment was performed with the approval of the Ethics Committee of the Universitat Autònoma de Barcelona following the European Communities Council Directive 2010/63/EU.
In-vivo electroneurographic signals were elicited after applying trains of ten brushes with a blunt probe and ten contacts with a von Frey filament to the skin of the implanted hind paw (as in [18]).The signals were recorded with respect to an external ground placed 1 cm proximally from the intraneural electrode.Data was acquired using a custom neural signal acquisition module featuring eight channels, each one implementing a low-noise front-end and an analogto-digital converter (ADC) [56,57].The input stage provides a gain of 43 dB with an input-referred noise of 2.97 µV rms in the bandwidth between 1.67 mHz and 8 kHz, taking into account the whole signal path including the ADC.With the use of a precision pseudo-resistor, the high-pass cut-off frequency is digitally programmable in the range between 10 Hz and 1.3 kHz [58].The ADC is a third-order deltasigma modulator, which, after decimation, enables a 10 bit resolution and a sampling frequency of 15 625 Hz, which is slightly above the limit suggested in [31].
For this dataset, 95% confidence intervals for SNR were [5.8 ÷ 6.8] dB, with a mean value of 6.3 dB.

Performance evaluation
To quantify the morphological distortions introduced by wavelet denoising, we used only the synthetic datasets.To this end, we selected the Pearson's correlation coefficient (ρ) and the RMSE.
For each denoised spike, the Pearson's correlation coefficient was computed as follows: where x d is the ith denoised spike and x orig is the corresponding spike in the noiseless signal.However, as ρ only quantifies the linear dependency between the two spikes and the amplitude difference is removed by coefficient normalization, the RMSE index could emphasize this distortion.For each denoised spike, the RMSE was computed as follows: Specifically, in the subsequent analyses, the mean value of each shape-preserving index was considered for each signal.Note that, in this study, the RMSE is dimensionless because the original synthetic dataset [22] is normalized.
To quantify the noise reduction realized by the selection of different wavelet denoising implementation parameters, we estimated the SNRs of both the synthetic and real signals by the following formula: For the synthetic data, A pp represents the median value of the peak-to-peak amplitudes of the nonoverlapping action potentials, and σ noise is the standard deviation of the samples representing the noise, which are obtained by concatenating the inter-spike segments and then excluding any action potential.To determine the improvement introduced during the denoising stage, we estimated the SNR value before and after denoising.
For the real dataset, in equation ( 15), A pp denotes the peak-to-peak amplitude of the largest spike cluster obtained by a modified version of the WaveClus spike sorting algorithm [22], and σ noise represents the standard deviation of the noise computed over the 2 s quiescent epoch.The modification of the WaveClus spike sorting algorithm involved the removal of all the input filtering stages and the computation of the detection threshold on a noisy quiescent portion of the signal by means of the sample standard deviation.Removal of the filtering stage was necessary to enable an unbiased comparison of the SNRs of the raw signal, the band-passed signal, and the best-performing wavelet denoising algorithm.We selected the Wave-Clus option, which enables the use of principal component analysis for feature extraction.

Statistical analysis
To perform the parametric statistical test, we conducted Shapiro-Wilk tests to verify the normality of the distribution of each parameter.
First, we performed an analysis of variance to examine the effects of the mother wavelets and the noise levels on the neural signals, and then to evaluate the influences of the algorithms and noise levels.
We used the Student's paired t-test in all subsequent analyses except for the choice of the decomposition level, for which a Student's unpaired t-test was performed.In the assessments of the bestperforming mother wavelet and the wavelet denoising algorithms, we adjusted the p-values by a Bonferroni correction based on an original alpha level of 0.05.

Choice of mother wavelet
First, we compared the five selected mother wavelets to identify which one to use in subsequent analyses.Figure 2 shows a comparison of their performance for the four-level and five-level decompositions and both synthetic datasets in terms of mean values and 95% confidence intervals.The highest ρ value was always obtained by the Haar mother wavelet, with statistical significance (p < 0.05).Conversely, on both levels, the lowest RMSE value was obtained again by the Haar on the Physio dataset (with statistical significance p < 0.05), and by the Coif2 on the AWGN dataset.However, the better performance of the Coif2 as compared with that of the Haar wavelet was not statistically significant (p = 0.055 for fourlevel and p = 0.065 for five-level decomposition), and both these mother wavelets performed significantly better than the other three (p < 0.05 and p < 0.01 for four-level and five-level decompositions, respectively).In the light of these results, the Haar mother wavelet appeared to be the most satisfactory choice for obtaining the best performance in terms of morphological distortion, so it was used in all subsequent analyses.

Comparison of wavelet denoising algorithms
In this section, we compare the eight wavelet denoising algorithms described in section 2.1.1 in terms of their performance indexes, and explore both the fourlevel and five-level decompositions.Figure 3 shows the results in terms of their mean values and 95%  confidence intervals for each synthetic dataset and for both the decomposition levels, and figure 4 shows the SNR values obtained for the real dataset.
With respect to the morphological indexes (figures 3(a) and (b)), the highest ρ values and smallest RMSE values were those obtained by the Han et al and minimax thresholds, both with hard thresholding.However, in the point comparisons, these differences were not always statistically significant (see tables IS and IIS in the supplementary materials, available online at https://stacks.iop.org/JNE/17/066016/mmedia),particularly in comparison with the Golroudbari's algorithm.
To carefully assess the performance of the two most efficient algorithms, we also studied the impact of noise level on the quality of the denoised signals.This study revealed that with increases in the noise magnitude, the performance decay of the Han et al threshold was lower than that of minimax in terms of ρ and RMSE (see figures 1S and 2S, supplementary materials).By comparing the two algorithms with respect to their morphological indexes, a significant difference is evident, which accounts for the higher performance of the Han et al threshold, exclusively in terms of the RMSE values on the Physio dataset (p < 0.05 for the five-level decomposition, p < 0.01 for the four-level case).
To quantify the denoising performance, we also computed and statistically compared the SNR values for noisy and denoised data (see figures 3(c) and 4).Again, the algorithms using the Han et al and minimax thresholds, both with hard thresholding, outperformed the others on both the real and synthetic datasets.On the Physio dataset, statistically significant differences were achieved for both levels in the comparisons of the soft thresholding approaches (p < 0.01).On the same dataset, a significant difference was also achieved between the two bestperforming algorithms (p < 0.001) and in their comparison with the Cannas et al algorithm (p < 0.05), for the five-level decomposition.Conversely, on the AWGN dataset, minimax with hard thresholding significantly higher SNR values than all the other algorithms (p < 0.001), followed by the Han et al threshold with hard thresholding (p < 0.001).In this regard, figures 1S and 2S, supplementary materials provide a point comparison of these two algorithms on the synthetic datasets with respect to noise level.
Finally, the Han et al and minimax thresholds with hard thresholding also outperformed all the other algorithms on the real dataset (p < 0.01), except for Golroudbari's algorithm (see table IIIS, supplementary materials).Moreover, with five-level decomposition, minimax showed significantly higher SNR values than Han et al (p < 0.05).Nevertheless, as also confirmed by the results obtained on synthetic signals, this improvement in noise removal by minimax should be carefully considered with respect to the better morphological preservation offered by the Han et al threshold, which can be observed not only on the real spike waveforms (figure 5) but also on synthetic ones, especially on smoother spikes (figures 6 and 7, panel (C)).
For the sake of completeness, figures 6 and 7 show the wavelet denoising effects obtained by the eight algorithms on two signals from the Physio and AWGN datasets, respectively, in terms of noise removal (Panels (A) and (B)) and morphological distortions on the three spike waveforms (Panel (C)), considering in all cases a five-level decomposition.
With regard to the decomposition level, tables 1-3 report the statistical indices for the mean differences and p-values for both datasets.On the synthetic datasets, the adoption of five-level decomposition led to an overall significantly better performance than the use of four-level decomposition (p < 0.05) in terms of ρ.On the other hand, despite several results with statistical significance, the RMSE findings did not permit the drawing of unequivocal conclusions.Furthermore, the SNR analysis confirmed the superiority of five-level decomposition but only on synthetic signals, at least for hard thresholding approaches and Golroudbari's algorithm (p < 0.05).These quantitative analysis results, which were expected from the wavelet denoising intervention up to about 250 Hz, were further confirmed by visual inspection, as can be seen in figure 8.
Moreover, in order to provide an overview of the morphological and denoising effects induced by the assessed thresholding methods and the decomposition levels at a glance, tables 4 and 5 graphically report the mean differences for each performance index, dataset and threshold.
Based on the obtained results, we chose a five-level decomposition for the subsequent investigations.We also selected the algorithm based on the Han et al threshold and hard thresholding because of its superior morphology preservation, which is important for spike sorting algorithms, even though the minimax threshold with the same thresholding method achieved slightly higher SNR values.

Dealing with the approximation band
On the selected best-performing algorithm and decomposition level, we studied the impact of approximation coefficients on the denoising quality   in terms of ρ, RMSE, and SNR.In figures 9 and 10, we can clearly see that clearing the approximation coefficients led to better results in terms of ρ (p < 0.05 for synthetic dataset) and the SNR (p < 0.001 for both the real and synthetic datasets), but not in terms of the RMSE.In the latter case, wavelet denoising without clearing the approximation coefficients resulted in less morphological distortion, but only for the AWGN dataset (p < 0.001).However, as the difference in terms of the RMSE was not significant in the case of the Physio dataset, whose noise is much more realistic than that in the AWGN dataset, the results obtained in terms of ρ and SNR led to a comparison between bandpass filtering and wavelet denoising with approximation clearing.

Wavelet denoising versus bandpass filtering
To assess linear and non-linear denoising approaches for neural signal processing, we compared the bestperforming wavelet algorithm, i.e. the Han et al threshold with hard thresholding, five-level decomposition, and clearing the approximation coefficients with a non-causal fourth-order elliptic bandpass filter with cut-off frequencies of 300 Hz and 3000 Hz.
As shown in figures 9 and 10, wavelet denoising significantly outperformed linear filtering in all the performance indexes on both the real and synthetic datasets (p < 0.001).
For the sake of completeness, figure 11 shows a real PNS signal segment before and after the different processing stages.

Discussion
The performance analyses of different mother wavelets, as presented in section 3.1, revealed the superiority of the Haar and Coif2 wavelets as compared to the other compact-support wavelets studied in this work.As noted in section 2.1, the selection of the mother wavelet is typically accomplished by choosing that exhibiting a waveform similar to the signal to be denoised.In neural signal processing, this has led to the widespread adoption of certain wavelets, in particular the Sym7 [31-36, 38, 39] and Db4 [28,31,37].In [59], the authors proposed the use of the RMSSA criterion for the optimal selection of the mother wavelet in PNS signal denoising.However, this criterion focuses only on the noise removal, enhancing detection performance without considering morphology preservation.In accordance with [38], our findings reveal that an 'agnostic' waveform such as the Haar, which is clearly different from an action potential, is able to emphasize the neural spiking activity while reducing amplitude distortions in the denoising process.This choice was found to reduce the impact on the filter length caused by the adoption of the SWT, as can be seen in table 6.In fact, the total number of multiply-accumulate (MAC) operations  per second needed for the full SWT decomposition and reconstruction with the Haar mother wavelet results one order of magnitude lower than the other mother wavelets, thus involving much less computational complexity and, as such, providing its highest suitability for embedded and real-time processing scenarios.Moreover, from our comparison of the eight wavelet denoising algorithms identified in the scientific literature, we can identify some further considerations.
First, despite the capability of soft thresholding to produce smoother signals than hard thresholding, in our study, hard thresholding outperformed soft thresholding in all respects (see figures 3 and 4 and table 5).Hybrid approaches such as Golroudbari's algorithm [51], when parameterized to emphasize hard thresholding behavior, also performed better than soft thresholding.A possible explanation for these findings is that discontinuities in the detail coefficients introduced by hard thresholding have less negative impact on the signal morphology than the shrinkage effect introduced by soft thresholding.This aspect, which is further confirmed by the SNR values, is consistent with the widespread adoption of hard thresholding in the scientific literature [31, 33-36, 38, 39].Conversely, the mixed use of both thresholding types at different scales in the Cannas et al algorithm [53] did not produce goodquality results on neural signals, as can be seen in figures 6-8, despite the high SNR values obtained on real signals (figure 4).This could be due to the SNR estimate, in which the peak-to-peak amplitude was considered on the spike template which, in turn, might become well shaped after synchronized averaging, even in noisy signals.Furthermore, looking at the different denoised AWGN traces (see figure 7), it was evident that spikes with smoother morphologies were generally well preserved by the Cannas et al algorithm, whereas those with sharper trends were highly distorted, even when the original noise entity was low.This behaviour, which was observable regardless of the noise level in the raw signal and the decomposition level, led to low variability of the ρ and RMSE indexes across the whole AWGN dataset (see figures 3(a) and (b)).However, being the noise reduction by the Cannas et al algorithm quite limited, the standard deviation of the noise computed on the denoised signals was variable, thus introducing higher confidence intervals for the SNR metric (see figure 3(c)).This attitude could be also responsible for the small dispersion of the Cannas et al.SNR values in the real dataset (see figure 4), since the SNR values achieved by this algorithm traced the variability of the raw data SNRs (6.3 ± 0.5 dB), even approaching the same mean values.
Moreover, the minimax and Han et al thresholds with hard thresholding achieved the best performance in terms of RMSE, ρ, and SNR.Between them, the Han et al threshold better preserved the signal morphology, even though its SNR was slightly lower than that of the minimax on the AWGN and real datasets.However, the Han et al threshold outperformed the minimax threshold at high noise levels (see figures 1S and 2S, supplementary materials).Remarkably, the higher SNR achieved by minimax was associated with a significantly higher number of spikes being completely cleared out by denoising, as compared to the Han et al threshold (36.2% vs. 8.8% and 15% vs. 3.8% considering the highest noise levels in the AWGN and Physio datasets, respectively, both with five-level decomposition).
Finally, as regards the decomposition level, by looking at figures 3 and 4, it is evident that five-level decomposition outperformed four-level decomposition when considering all the performance indexes.This superiority, which is statistically significant, suggests that denoising must be performed up to the lowest frequencies around 250 Hz.These results are consistent with those reported in the scientific literature, where the same frequency band decomposition was identified as optimal in terms of both spike-shape preservation and SNR [28].
Moreover, our analysis revealed that the denoising performance when the approximation band was cleared to zero was significantly better than that achieved when preserving these coefficients.This approach has been adopted in previous works [28,[34][35][36][37][38][39].It should be noted, however, that approximation clearing is more aggressive than applying a linear filter to cut-off the lowest frequencies.Nonetheless, from our statistical analysis, the superior performance of high-pass filtering in wavelet denoising as compared to bandpass filtering was evident.This result is consistent with that reported in the scientific literature on neural signal processing, where approximation clearing has outperformed the bandpass-filtering pre-processing stage in combination with details thresholding [34] or as a stand-alone denoising procedure [28], for both FIR and IIR filters.Specifically, improvement in wavelet denoising has been proven in terms of both classification performance [28,34] and noise reduction and spike-shape preservation [28].
Finally, based on these findings, considering the signals sampled at 16 kHz, five-level decomposition with a Haar mother wavelet combined with the Han et al threshold [50] with hard thresholding and clearing the approximation band can be suggested for the denoising of neural signals.As confirmed by statistical analysis, this approach introduces significant improvement from that obtained by a non-causal fourth-order elliptic bandpass filter with cut-off frequencies of 300 Hz and 3000 Hz, which was previously suggested for neural signal denoising due to its capability of preserving the spike waveform [27].We confirmed this result on both synthetic and real signals, in the latter case with respect to the SNR value only.

Conclusions
In this work, we performed a systematic analysis of different algorithms to achieve stationary wavelet denoising of neural signals.Conventional methods and approaches were challenged by introducing algorithms developed in different signal processing fields.The adoption of synthetic datasets enabled assessment of the ability to preserve the actionpotential morphology, from which the Han et al threshold with hard thresholding, coupled with the clearing of the approximation band in the range of 0 Hz to 250 Hz, was found to realize the best denoising of neural signals.Remarkably, the Haar mother wavelet, which is the smallest-support wavelet across different families, also exhibited the best performance when compared to other compact-support wavelets.This aspect is important for embedded-processing scenarios, where a low memory footprint and fast computations are required to cope with real-time processing needs, e.g. for the control of prosthetic limbs.

Figure 1 .
Figure 1.Example of signals from the adopted datasets: a murine PNS signal (a) with evident neural activity from around 5 s to 105 s, a simulated trace with pseudo-physiological noise (b, σ = 0.1) and a synthetic signal with additive white Gaussian noise (c, σ = 0.1).On the right, a 0.3 s-zoom on the two synthetic signals to emphasize the three different spike waveforms adopted in both the Physio (d) and AWGN (e) datasets, with the noiseless corresponding traces in gray.Real signal amplitudes are in µV and synthetic ones are dimensionless.

Figure 2 .
Figure 2. Mean and 95% confidence intervals for the morphological distortion indexes ((a): Pearson's correlation coefficient; (b): RMSE) computed for the Physio (dash-dotted line) and AWGN (solid line) synthetic datasets while changing the decomposition level and the mother wavelet.Circles and squares represent the mean values for the four-level and five-level decompositions, respectively.

Figure 3 .
Figure 3. Mean and 95% confidence intervals for the morphological distortion indexes ((a): Pearson's correlation coefficient, (b): RMSE) and the SNR (c), computed over the Physio (dash-dotted line) and AWGN (solid line) synthetic datasets while changing the decomposition level and wavelet denoising algorithm: (A) Han et al threshold with hard thresholding, (B) minimax with hard thresholding, (C) universal with hard thresholding, (D) Golroudbari's algorithm, (E) Cannas et al algorithm, (F) Han et al threshold with soft thresholding, (G) minimax with soft thresholding, and (H) universal with soft thresholding.Circles and squares represent the mean values for four-level and five-level decompositions, respectively.SNR confidence intervals for noisy signals are [4.85 ÷ 14.4] dB and [10.3 ÷ 16.0] dB for the Physio and AWGN datasets, respectively, with mean values of 9.6 dB and 13.1 dB.

Figure 4 .
Figure 4. Mean and 95% confidence intervals for the SNR results computed over the real PNS dataset while changing the decomposition level and wavelet denoising algorithm: (A) Han et al threshold with hard thresholding, (B) minimax with hard thresholding, (C) universal with hard thresholding, (D) Golroudbari's algorithm, (E) Cannas et al algorithm, (F) Han et al threshold with soft thresholding, (G) minimax with soft thresholding, and (H) universal with soft thresholding.Circles and squares represent the mean values for the four-level and five-level decompositions, respectively.SNR confidence intervals for noisy signals are [5.8 ÷ 6.8] dB, with a mean value of 6.3 dB.

Figure 5 .
Figure 5. Examples of neural-spike templates from the real PNS dataset after wavelet denoising.On the left, templates obtained with the Han et al threshold (a),(b), and on the right, those obtained with minimax (c),(d), both with hard thresholding and five-level decomposition.As can be seen, minimax slightly distorted the spike waveforms.

Figure 6 .
Figure 6.Panel (A): A 10 s window of a Physio dataset signal before (top, 'Raw') and after wavelet denoising ((a)-(h), bottom) is represented in order to visually appreciate the noise removal obtained by the different denoising algorithms.Panel (B): A 1 s window of the same signal is reported before (top, 'Raw') and after wavelet denoising (a-h, bottom), in order to provide a more focused view of the denoising effects.Panel (C): A 40 ms zoom on the noisy (top) and denoised (a-h, bottom) spikes (in black) with their original noiseless shapes (in gray), which shows the performances obtained with the different wavelet denoising algorithms in terms of morphology preservation on the three spike morphologies constituting the Physio dataset.Specifically, a five-level decomposition with the Haar mother wavelet were used, along with the Han et al threshold with hard (a) and soft (f) thresholding, minimax with hard (b) and soft (g) thresholding, universal with hard (c) and soft (h) thresholding, and the Golroudbari's (d) and Cannas et al (e) algorithms.Amplitudes are dimensionless.

Figure 7 .
Figure 7. Panel (A): A 10 s window of an AWGN dataset signal before (top, 'Raw') and after (a-h, bottom) wavelet denoising is presented, in order to show the noise removal effects obtained by the different adopted algorithms.Panel (B): A 1 s window of the same signal is represented before (top, 'Raw') and after wavelet denoising (a-h, bottom), in order to provide a more focused view on the denoising effects.Panel (C): A 40 ms zoom shows the denoised (a)-(h) spike morphologies in comparison to the original noiseless morphologies (in gray) to give an overview of the performances obtained with the different wavelet algorithms in terms of morphology preservation.For the sake of completeness, a window including all the three spike morphologies of the synthetic dataset is reported.Specifically, results obtained by Han et al threshold with hard (a) and soft (f) thresholding, minimax with hard (b) and soft (g) thresholding, universal with hard (c) and soft (h) thresholding, Golroudbari's (d) and Cannas et al (e) algorithms were reported using a five-level decomposition with the Haar mother wavelet.Amplitudes are dimensionless.

Figure 8 .
Figure 8. Real PNS signals before (top, right and left) and after wavelet denoising (a-h) with the Haar mother wavelet and five-level (Panel A) and four-level (Panel B) decompositions.Specifically, the Han et al threshold with hard (a) and soft (f) thresholding, minimax with hard (b) and soft (g) thresholding, universal with hard (c) and soft (h) thresholding, and the Golroudbari's (d) and Cannas et al (e) algorithms.Amplitudes are in µV.

Figure 9 .Figure 10 .
Figure 9. Means and 95% confidence intervals for the morphological distortion indexes ((a): Pearson's correlation coefficient, (b): RMSE) and SNR (c), computed for the Physio (dash-dotted line) and AWGN (solid line) synthetic datasets processed with both the IIR band-pass filter (right side) and the best wavelet denoising algorithm, while preserving approximation coefficients (middle) and clearing them (left side).The morphological distortion indexes are dimensionless.

Figure 11 .
Figure 11.From top to bottom: (a) a real murine PNS signal before denoising (6.0 dB), (b) after band-pass filtering (7.8 dB), (c) conventional wavelet denoising (12.4 dB), and (d) wavelet denoising clearing the approximation (22.4 dB).In the represented cases, wavelet denoising was performed by selecting the Haar mother wavelet, Han et al threshold with hard thresholding, and a five-level decomposition.

Table
Mean differences with p-values (in brackets) for comparison of the two decomposition levels, calculated for the three different performance indexes on the Physio dataset for eight wavelet denoising algorithms: (A) Han et al threshold with hard thresholding, (B) minimax threshold with hard thresholding, (C) universal threshold with hard thresholding, (D) Golroudbari's algorithm, (E) Cannas et al algorithm, (F) Han et al threshold with soft thresholding, (G) minimax threshold with soft thresholding, and (H) universal threshold with soft thresholding.Significant values are marked by asterisks (p < 0.05).

Table 2 .
Mean differences with p-values (in brackets) for comparison of the two decomposition levels, calculated for the three different performance indexes on the AWGN dataset for the eight wavelet denoising algorithms: (A) Han et al threshold with hard thresholding, (B) minimax threshold with hard thresholding, (C) universal threshold with hard thresholding, (D) Golroudbari's algorithm, (E) Cannas et al algorithm, (F) Han et al threshold with soft thresholding, (G) minimax threshold with soft thresholding, and (H) universal threshold with soft thresholding.Significant values are marked by asterisks (p < 0.05).

Table 3 .
Mean differences with p-values (in brackets) for comparison of the two decomposition levels, calculated for the SNR values on the real dataset for the eight wavelet denoising algorithms: (A) Han et al threshold with hard thresholding, (B) minimax threshold with hard thresholding, (C) universal threshold with hard thresholding, (D) Golroudbari's algorithm, (E) Cannas et al algorithm, (F) Han et al threshold with soft thresholding, (G) minimax threshold with soft thresholding, and (H) universal threshold with soft thresholding.Significant values are marked by asterisks (p < 0.05).

Table 4 .
Schematic representation of the morphological and noise-removal effects introduced by the choice of the decomposition level.Specifically, given the same wavelet denoising algorithm and dataset, mean changes are represented for each performance index.For ρ and SNR, mean differences above a threshold of 0.01 were considered as improvement (↑, if positive) or worsening (↓, if negative), whereas those below this threshold were considered as no change (↔).On the other hand, for RMSE metric differences were considered without any threshold on the values.Significant trends are marked by asterisks (p < 0.05).

Table 5 .
Pair-wise comparison on the morphological and noise-removal effects introduced by hard versus soft thresholding approach.Specifically, for each threshold, decomposition level and dataset, the mean change for each performance index is represented as improvement (↑), worsening (↓) or no change (↔).Significant trends are marked by asterisks (p < 0.05).

Table 6 .
Comparison on the computational complexity for each mother wavelet in terms of wavelet filter length, number of filter taps for the full decomposition and reconstruction and the corresponding total number of multiply-accumulate (MAC) operations per second.In this estimate, SWT with a five-level decomposition and a sampling frequency of 16 kHz were considered.