ECGdeli - An open source ECG delineation toolbox for MATLAB

The electrocardiogram (ECG) is a standard cost-efficient and non-invasive tool for the early detection of various cardiac diseases. Quantifying different timing and amplitude features of and in between the single ECG waveforms can reveal important information about the underlying (dys-)function of the heart. Determining these features requires the detection of fiducial points that mark the on-and offset as well as the peak of each ECG waveform (P wave, QRS complex, T wave). Manually setting these points is time-consuming and requires a physician’s expert knowledge. Therefore, the highly modular ECGdeli toolbox for MATLAB was developed, which is capable of filtering clinically recorded 12-lead ECG signals and detecting the fiducial points, also called delineation. It is one of the few open toolboxes offering ECG delineation for P waves, T Waves and QRS complexes. The algorithms provided were evaluated with the QT database, an ECG database comprising 105 signals with fiducial points annotated by clinicians. The median difference between the fiducial points set by the boundary detection algorithm and the clinical annotations serving as a ground truth is less than 4 samples (16ms) for the P wave and the QRS complex markers. The T wave onset, peak and offset were detected with a median difference of 5, 2 and 7 samples, respectively. Results were compared to two free algorithms available on PhysioNet. Our results show that ECGdeli can reliably detect P waves, QRS complexes and T waves. Thus, it can contribute to diagnose specific cardiac diseases by analyzing the ECG signal. As ECGdeli is published under GNU GPLv3 and thanks to its modularity, it can be used to extend existing algorithms or as a benchmark for new algorithms.


Motivation and significance
As the amount of recorded data increases in all fields, automatic processing is inevitably needed. This is the case for health * Corresponding author.
E-mail address: publications@ibt.kit.edu (N. Pilia). data, too. Especially the electrocardiogram (ECG) as a cheap and available standard heart activity monitoring device [2] is used in medicine around the world. Automatic ECG processing can be applied to batch process many ECGs in a short amount of time. In this way, manual annotation work of physicians can be avoided.  heart, optimize diagnosis and treatment strategies of cardiac diseases [3][4][5][6][7][8]. The automatic analysis of the ECG usually starts with determining wave types (P waves, T waves and QRS complexes), followed by finding the peaks and boundaries, i.e., wave onsets and offsets. Starting from that, further temporal parameters, like QT intervals, RR intervals, etc. can be derived. As well, these annotations are needed to calculate further features like amplitudes, slopes, and parameters like heart rate variability [9].
In this work, we present a toolbox for MATLAB comprising algorithms for ECG pre-processing and ECG wave delineation for research use, freely available on github. We call this open source toolbox ECGdeli. The aim of this work is not to present new approaches for wave delineation as the algorithms or slightly modified versions were already used in several studies [11][12][13][14][15]. Rather, this work should be seen as accompanying documentation of the open software resource. We will describe shortly the main ideas behind the algorithms and compare them to two existing open implementations on PhysioNet: ecg-kit [16] and ecgpuwave [17]. The algorithms for comparison, especially ecgkit, offer more options for evaluation and data import/export, T wave (A) and P wave (B) detection for signal sel100 from [10]. Extrema of WT in line 3 correspond to the T and P wave boundaries and peaks shown in line 4 in A and B, respectively. however, at the cost of a more complicated non-modular framework or perform worse. The advantages of ECGdeli in comparison with the existing implementations are a simple and modular design concentrating on the essential algorithms needed for ECG evaluation yielding an easy to use software; ECGdeli focuses on one concrete purpose: ECG wave detection and delineation. File input/output, post processing like feature calculation and a graphical user interface are intentionally not included. Thus, users can easily integrate the single algorithms or the whole toolbox in their evaluation frameworks, replace single algorithms, all by using few clear and standardized interfaces.
In the following, we present a short description of the single algorithms provided with the toolbox and discuss the main advantages of ECGdeli.

ECG preprocessing
The ECG preprocessing functionalities are shown in Fig. 5, lines one to four and include a baseline removal technique [15] ECG_Baseline_Removal.m, a bandpass filter ECG_High_Low_Filter.m and a notch filter Notch_Filter.m, as well as a method for correcting the electrical isoline Iso-line_Correction.m. All preprocessing steps are performed lead by lead.

QRS detection
In the QRS detection algorithm QRS_detection.m (an overview is shown in Fig. 1), the input signal is first bandpass filtered (5 Hz to 100 Hz) in order to reduce high frequency noise, baseline wander, to attenuate P and T waves and to obtain QRS complexes with pronounced Q and S waves. Then, a phase-free stationary wavelet transformation (SWT) using the Haar wavelet is applied [12]. The detail coefficients SWD N of the SWT are calculated at the level including frequencies around 45 Hz, which is the frequency range where spectral components of the QRS complex being relevant for delineation can be found [18].
To detect intervals containing the QRS complexes, a threshold based approach is chosen. This threshold is adapted dependent on a changing time windows which is applied to SWD N and a signal dependent statistical measure. Among all sets of solutions calculated with different thresholds, a voting algorithm yields the most consistent solution.
The R peak is annotated at the position with the highest amplitude within the bounds of the QRS interval. The Q-and S peak are subsequently marked at the minimum amplitude in the regions from QRS onset to R peak and from R peak to QRS offset, respectively. The onset and the offset of the QRS complex are afterwards marked 20 ms prior the Q-peak and 20 ms after the Speak, respectively. In the end, all detected peaks and boundaries are stored in a fiducial point table (FPT) in columns five to eight.
The structure is shown in Table 1.

T wave detection
The implemented T wave detection method T_Detection.m relies on QRS detection and is performed for each lead separately. Fig. 3A shows an overview of the algorithm. To detect the T waves, the following pre-processing steps are needed: QRS replacement, filtering and SWT. First, a fixed part of each RR interval is replaced by a sigmoid function, such that QRS complexes and P waves are blanked (see Fig. 2A, second line). Subsequently, phase-free Butterworth band-pass filtering (passband from 0.3 Hz to 20 Hz) of order 4 is performed. This QRS-free signal is fed into a phase-free SWT [12] with the rbio3.3 wavelet yielding a wavelet signal hardly showing activity in the QRS segment (see Fig. 2A, third line). The level N of the detail coefficients to be analyzed is determined based on the central frequency of the T wave (7 Hz) [18].
After preparing the signal, the polarity of the T wave is determined considering all positive and negative extrema in the SWT signal. In a search window dependent on the RR interval and QRS offset, we determine the peak of the T wave (T peak ) in each RR interval by choosing the time instant of maximum amplitude (for positive polarity) or of the minimum amplitude (for negative polarity). According to the polarity, a correction of this peak is made in the ECG signal (time domain). Afterwards, T wave boundaries are determined. Around every T peak in each RR interval, we calculate the area under the curve of the SWT at level N − 1 within a search window dependent on the respective RR interval. With two preset thresholds for the area distribution and the positions of the peaks in the interval, we estimate the position of T wave onset (T on ) and T wave offset (T off ). Finally, all detected peaks and boundaries are stored in the FPT (columns ten to twelve as shown in Table 1).

P wave detection
P wave detection P_Detection.m also relies on the phasefree SWT as well as on QRS detection and T wave detection. Fig. 3B provides an overview of the algorithm which was presented in [12]. The single lead input signal is band-pass filtered with a Gaussian filter (passband 1 Hz to 15 Hz). QRS complexes and T waves are replaced by sigmoid functions (see Fig. 2B, second line). The SWT is performed with the quadratic spline wavelet at the detail coefficients of level N corresponding to the P wave center frequency of 7 Hz [18] (see Fig. 2B, third line).
The peaks of the SWT in each RR interval are determined by annotating the absolute maximum of the SWT as P wave peak in a search window dependent on each RR interval and the Q peak location. The user can also choose to set a nearby extremum in the filtered ECG signal as peak.
After peak detection, P wave boundaries are determined. Since P waves are low in amplitude, a template of the SWT in a window of 50 waves around the current P wave peak is built to estimate the boundaries. Hotelling's T-squared method is applied to obtain a clean template. P wave onset and offset are estimated by initial guesses from the template considering the nearest wavelet extremum as the respective wave boundary.
The found peaks and wave boundaries are stored in the FPT, columns one to three as shown in Table 1.

Multilead processing
As already mentioned, P wave, QRS complex, and T wave detection are generally performed lead by lead. To achieve a multilead delineation, a voting algorithm (Sync_Beats.m) discards possible beats in the FPT if they are not visible in the majority of all leads and averages over the found positions to deliver a overall delineation result. Innately, a reasonable result can only be achieved with more than two leads.  4. A possible pipeline for processing an ECG signal with ECGdeli. After preprocessing, ECG wave peak and boundary detection are performed. The arrows pointing into the box determine input parameters, arrows pointing away output parameters, respectively.

Software architecture and functionality
A high-level view on the pipeline for the ECG preprocessing and annotation is shown in Fig. 4. First, baseline removal, filtering and isoline correction are performed (see Fig. 5, lines one to four). Following that, the signal can be annotated by launching the script Annotate_ECG_Multi.m. Here, QRS detection, T and P wave detection are run (see Fig. 5, line five). Afterwards, in the case of a multilead signal, the voting algorithm (Sync_Beats.m) would discard possible beats in the FPT if they are not visible in the majority of all leads.  5. Application of the filtering and wave detection to an ECG signal for signal sel100 from [10]. Line 1: Estimated baseline wander and ECG. Line 2: ECG without baseline wander. Line 3: Bandpass filtered signal and estimated isoline. Line 4: Bandpass filtered signal with isoline correction. Line 5: The annotated on-and offsets as well as the peaks for the P and T wave and the QRS complex.

Table 1
Structure of the FPT. Lines in the FPT represent the number of the detected beat. Column 9 is reserved (res.) for the J point, column 13 for a beat classification.

Performance evaluation
The algorithms provided in the toolbox were evaluated with the QT database, an ECG database comprising 105 signals that were either recorded during normal sinus rhythm or represent one of six selected cardiac diseases with fiducial points annotated by clinicians [10]. We extracted annotations from the *.q1c and *.q2c files of the database. To obtain the detection errors, we analyzed both of the two leads separately. For each annotation, we subsequently took the best result among the two leads meaning choosing the annotation being closest to the annotation. To get the clostest annotation, every annotation available in one beat was considered finding the respective annotation beat. We carried out the exact same evaluation for ECGdeli, ecg-kit [16] available on github (version 1.0, Commit c8e3de4) and the ecgpuwave [17] algorithm as provided in PhysioNet. With ecg-kit, we used the wavedet delineation algorithm which is the default selection for ECG delineation. Detection errors for all types of annotations for all algorithms were calculated, including median errors and the interquartile ranges, mean errors and standard deviations of the signed and unsigned (absolute) errors. Fig. 6 shows the differences between the calculated fiducial points with the ECGdeli and the manually annotated points by clinicians for all annotated beats in the QT database. Especially with the detection of QRS and T wave annotations, outliers were Table 2 Detection errors of ECGdeli, ecg-kit and ecgpuwave compared to manual expert annotations in samples as well as number of detected points/available annotations. med: median, iqr: interquartile range, m: mean, std: standard deviation, signed: signed errors, abs: absolute errors.  visible. Detection errors for all three algorithms are given in Table 2. ECGdeli and ecg-kit performed comparably, dependent on the performance measure and the annotation type, one was outperforming the other. Regarding only median and interquartile ranges, ecgpuwave was always outperformed by ecg-kit. A particular difference was visible in the number of detected waveforms. This number was highest in the case of ECGdeli (see last lines in Table 2). ecg-kit and ecgpuwave discarded waves even though they should ideally be detected due to the fact that clinicians clearly found and annotated the respective wave in the signal. ECGdeli did not so.

Discussion
In this section, three main points connected with the results from Section 3.1 will be discussed.
First, we decided to evaluate the two leads separately and then take the lowest error per annotation as the annotator had also both leads at hands during annotation and we did not know which lead was chosen. We therefore did not evaluate a possible improvement by using a multilead approach which would have been hardly possible for ECGdeli since a voting between two different annotations does not deliver an advantage.
Second, we want to reference to further publications comparing more closed source implementations [19,20]. We intentionally did not repeat the results of the closed source implementations here for two main reasons: first, we wanted to compare our results to the two most visible algorithms for ECG delineation offering P wave, T wave and QRS detection. Second, as there is no standard procedure to generate the final evaluation results, it is hardly possible to guarantee the comparability.
Third, we want to highlight the possibility of a postprocessing step for ECGdeli and a possible improvement of the average detection errors. With the current implementation, and as already stated, the algorithm is forced to detect a P and T wave in each RR interval. On the one hand, this implies that every wave is detected. Nevertheless, an adequate detection of the outliers (visible e.g. for the T wave in Fig. 6) and a subsequent correction or dropping of those, could have lowered the average errors of ECGdeli.

Impact
ECG delineation algorithms are important in clinical and research practice. As stated in Section 1, many parameters depend on the result of the wave detection. With ECGdeli we offer one of the few open toolboxes to solve this relevant problem. By comparing the toolbox with two alternative implementations for wave delineation, we showed that ECGdeli is already at its current state delivering results en par with existing open approaches. However, several partly unique features of ECGdeli should be highlighted: The simple input/output and modular structure make the toolbox functions easy and intuitive to use. In this way, the annotation functions can also be executed separately if for example only a P wave detection is necessary. This goes hand in hand with the fact that by making ECGdeli freely available under the GPLv3, single algorithms can be extracted and incorporated into existing projects to extend them.
Furthermore, others can easily apply ECGdeli as a benchmark for new algorithms as we did with ecg-kit and ecgpuwave. Usually, the evaluation of the algorithms with standard databases, like the QT database, allows to compare new work to already existing. However, manual annotations in these databases can be prone to error and different ways of calculating performance parameters impair comparability (as discussed in Section 3). Moreover, there might be pathologies not represented in a freely available database that can however be relevant for the intended application of a new algorithm. The best way to compare the own work to existing algorithms is to have benchmark algorithms available.
At last, ECGdeli is capable of detecting all P and T waves in a provided input ECG signal if the respective QRS complex was detected. Thus, for example a P wave gets detected in any case between two subsequent R peaks. In this way, no false negative detected waves are possible and false positive waves can still be eliminated in a postprocessing step that is designed for the specific problem.

Conclusions
In this work, we present the ECGdeli MATLAB toolbox for ECG wave delineation. The modularity is the major advantage as single algorithms can be used and adapted independently from the others (e.g., use another QRS detector as long as the standardized FPT is used). By publishing ECGdeli under GPLv3, the toolbox can be freely used for wave detection, serve as a benchmark method for future works on non-standard datasets or as a basis for extended feature extraction algorithms.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.