Suppressing the Influence of Ectopic Beats by Applying a Physical Threshold-Based Sample Entropy

Sample entropy (SampEn) is widely used for electrocardiogram (ECG) signal analysis to quantify the inherent complexity or regularity of RR interval time series (i.e., heart rate variability (HRV)), with the hypothesis that RR interval time series in pathological conditions output lower SampEn values. However, ectopic beats can significantly influence the entropy values, making it difficult to distinguish pathological from normal conditions. Although, in theory, the ectopic intervals can be excluded during HRV analysis, it is not easy to identify all of them in practice, especially for dynamic ECG signals. Thus, it is important to suppress the influence of ectopic beats on entropy results, i.e., to improve the robustness and stability of entropy measurement for RR interval time series containing ectopic beats. In this study, we introduced a physical threshold-based SampEn method and tested its ability to suppress the influence of ectopic beats in HRV analysis. An experiment on the PhysioNet/MIT RR Interval Databases showed that SampEn using a physically meaningful threshold performs better not only for different data types (normal sinus rhythm (NSR) or congestive heart failure (CHF) recordings), but also for different types of ectopic beat (atrial beats, ventricular beats or both), indicating that a physically meaningful threshold makes SampEn more consistent and stable.


Introduction
Entropy is a valuable tool for quantifying the complexity or regularity of cardiovascular time series and provides important insights for understanding the underlying mechanisms of the cardiovascular system. Since the concept of 'information entropy' was first proposed by Shannon in 1948 [1], entropy has been used as a tool to quantify the amount of information. Approximate entropy (ApEn) [2], proposed by Pincus et al., was one of the first entropy algorithms used in physiological signal analysis because it is well suited to short time series. However, ApEn includes self-matches in its calculation, resulting in estimation bias and poor relative consistency [3]. To solve this problem, Richman and Moorman developed an improved algorithm, sample entropy (SampEn) [3], which is based on the conditional probability that any two segments of m beats that are similar remain similar when their length increases by one beat. Compared with ApEn, SampEn has a lower estimation bias, better relative consistency and less dependence on data length, which makes it more appropriate for physiological signal processing. SampEn is now the most widely used entropy algorithm in physiological signal analysis.
For entropy calculation, three intrinsic parameters, i.e., the embedding dimension m, the tolerance threshold r and the time series length N need to be initialized. SampEn was reported to not be lasting for too much or too little time. The existence of either the ectopic beats or the falsely detected QRS locations can significantly contaminate the entropy outputs.
Thus, the effectiveness of entropy measures, typically SampEn, should be re-checked for the analysis of dynamic ECG signals. SampEn can be expected to change considerably when the analysis window moves from an ectopic-free RR interval time series to an ectopic one. It is therefore necessary to develop an entropy method that remains relatively stable whether it is applied to ectopic or ectopic-free RR interval time series from a given subject/patient. Because it is difficult to identify abnormal RR intervals caused by noise or true ectopic beats in the automatic analysis of dynamic ECGs, this need is both urgent and practical for real signal processing. In this study, we aimed to test the performance of a new physical threshold-based SampEn when applied to RR interval time series with ectopic beats, to explore whether it can efficiently suppress the sudden change in entropy results caused by the appearance of ectopic beats, i.e., to verify its ability to suppress the influence of ectopic beats in HRV analysis.

Data
All data used were from the PhysioNet/MIT RR Interval Databases from http://www.physionet.org [38], a free-access, online archive of physiological signals. The NSR RR Interval database includes 54 long-term RR interval recordings of subjects with normal sinus rhythms aged from 29 to 76. The CHF RR Interval database includes 29 long-term RR interval recordings of subjects aged from 34 to 79, with CHF diagnoses (NYHA classes I, II and III). Each of the long-term RR interval recordings is a 24-h recording, including both day-time and night-time. Both the NSR and CHF subjects took the Holter ECG measurement under a similar level of physical activity. The original ECG signals were digitized at 128 Hz, and the beat annotations were obtained by automated analysis with manual review and correction.

Physical Threshold-Based SampEn
The calculation process for the physical threshold-based SampEn is summarized as follows [26]. For the RR segment $x(i)$ ($1 \le i \le N$), given the parameters $m$ and $r$, the vector sequence $X_i^m$ is first formed:
$$X_i^m = \{x(i), x(i+1), \ldots, x(i+m-1)\}, \quad 1 \le i \le N-m.$$
The vector $X_i^m$ represents $m$ consecutive $x(i)$ values. Then, the distance between $X_i^m$ and $X_j^m$, based on the maximum absolute difference, is defined as:
$$d_{i,j}^m = d\left[X_i^m, X_j^m\right] = \max_{0 \le k \le m-1} \left| x(i+k) - x(j+k) \right|.$$
For each $X_i^m$, denote $B_i^m(r)$ as $(N-m)^{-1}$ times the number of $X_j^m$ ($j \ne i$) that meet $d_{i,j}^m \le r$ for all $1 \le j \le N-m$. Similarly, denote $A_i^m(r)$ as $(N-m)^{-1}$ times the number of $X_j^{m+1}$ ($j \ne i$) that meet $d_{i,j}^{m+1} \le r$ for all $1 \le j \le N-m$. Instead of using the traditional threshold, which is between 0.10 and 0.25 times the SD of the data, herein, a physical threshold $r$ is used to form a unified comparison baseline for determining the vector similarity. As the raw ECG signals were digitized at 128 Hz, the difference between any two RR intervals, and hence the distance between any two vectors, is approximately an integer multiple of 8 ms; here we used $r$ = 12 ms as the physical threshold according to the previous suggestion [10].
Then, SampEn is defined by:
$$\mathrm{SampEn}(m, r, N) = -\ln\left( \sum_{i=1}^{N-m} A_i^m(r) \Big/ \sum_{i=1}^{N-m} B_i^m(r) \right).$$
In addition, previous studies suggested that using an embedding dimension of m = 1 or 2 can obtain better results for classifying NSR and CHF groups when setting the RR time series length as N = 300 [4]. In this study, we kept this suggestion of m = 1 and 2.
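The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `sample_entropy` and its argument names are hypothetical, the tolerance is passed directly in milliseconds (the physical threshold), and template pairs are counted once each with self-matches excluded, which leaves the ratio of matches unchanged.

```python
import math


def sample_entropy(rr_ms, m=1, r_ms=12.0):
    """Sketch of SampEn(m, r, N) with an absolute tolerance r in ms.

    rr_ms : sequence of RR intervals in milliseconds (assumed input format)
    m     : embedding dimension
    r_ms  : fixed (physical) tolerance threshold in milliseconds
    """
    n = len(rr_ms)
    if n <= m + 1:
        raise ValueError("RR segment is too short for this embedding dimension")

    def count_matches(length):
        # Count template pairs (i < j) whose maximum absolute difference
        # is within r. The same N - m start indices are used for both
        # template lengths, and self-matches (i == j) are never counted.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(rr_ms[i + k] - rr_ms[j + k]) <= r_ms
                       for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches that persist at length m + 1
    if a == 0 or b == 0:
        return float("inf")    # entropy is undefined when no matches exist
    return -math.log(a / b)


# Toy segment (illustrative values only, not taken from the databases).
segment = [800, 810, 800, 850, 800, 810]
print(sample_entropy(segment, m=1, r_ms=12.0))
```

For a traditional threshold, the same function could be called with `r_ms` set to a fraction (0.10 to 0.25) of the SD of the segment instead of a fixed value.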
To test the performance of the physical threshold-based SampEn, traditional SampEn was used as the comparative method. Entropy values were first calculated from the raw ectopic 5-min RR segments. Then, the ectopic RR intervals in these segments were removed to form the ectopic-free RR segments. Finally, entropy values were re-calculated from these constructed ectopic-free RR segments. The change in entropy before and after ectopic beat removal was calculated and used as an index of how well each entropy measure suppresses the influence of ectopic beats.
Figure 2 shows the entropy results from an NSR subject (NSR002). As shown in Table 1, NSR002 has a total of 146 5-min ectopic RR segments. The left panels in Figure 2 show the entropy values for these 146 ectopic RR segments before ectopic RR interval removal (red dotted line) and after ectopic RR interval removal (blue line). The traditional SampEn varies widely before and after ectopic RR interval removal, while the new physical threshold-based SampEn changes very little when the ectopic-free segments are analyzed. The right panels show the corresponding variance ratios, i.e., the entropy value of the ectopic-free segment minus the entropy value of the ectopic segment, divided by the entropy value of the ectopic segment. The entropy variance ratios in SampEn varied from −65.24% to 2.25%, with an average of −16.32% and an SD of 21.93%. The corresponding variance ratios for the physical threshold-based SampEn varied from 0% to 3.34% (m = 1, r = 12 ms), with an average of 0.81% and an SD of 0.66%; and from −0.51% to 3.21% (m = 2, r = 12 ms), with an average of 0.57% and an SD of 0.72%. Compared with the traditional SampEn, the physical threshold-based SampEn showed significantly lower variance ratios, demonstrating the better robustness of the new SampEn method.
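The variance ratio used above (entropy of the ectopic-free segment minus entropy of the ectopic segment, divided by the entropy of the ectopic segment) can be sketched as follows. The helper names, the per-interval label list and the label `"N"` for normal intervals are assumptions made for illustration; the source does not specify how the annotations are stored.

```python
def remove_ectopic(rr_ms, labels, normal_label="N"):
    """Drop RR intervals that are not labeled as normal.

    Assumes one label per RR interval; the labeling scheme is hypothetical.
    """
    return [rr for rr, label in zip(rr_ms, labels) if label == normal_label]


def variance_ratio(entropy_ectopic, entropy_free):
    """Relative entropy change (in %) after ectopic interval removal."""
    return 100.0 * (entropy_free - entropy_ectopic) / entropy_ectopic


# Illustrative values only: an entropy of 2.0 before removal and 1.5 after
# removal corresponds to a variance ratio of -25%.
print(variance_ratio(2.0, 1.5))
print(remove_ectopic([800, 500, 1100, 810], ["N", "V", "N", "N"]))
```

A ratio near zero means the entropy estimate is barely affected by removing the ectopic intervals, which is the behavior the study looks for.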
Similarly, Figure 3 shows the results from a CHF patient (CHF202), which has a total of 150 ectopic RR segments, as shown in Table 2. The entropy variance ratios in SampEn varied from −62.50% to 3.53%, with an average of −3.18% and an SD of 11.36%. The corresponding variance ratios for physical threshold-based SampEn varied from −0.35% to 2.01% (m = 1, r = 12 ms), with an average of 0.55% and an SD of 0.49%; and from −0.98% to 1.39% (m = 2, r = 12 ms), with an average of 0.20% and

Demonstration of the Influence of Atrial Beats on Entropy Values
There are two types of ectopic beat in the PhysioNet/MIT RR Interval Databases used here: atrial and ventricular beats (shown in Figure 1). To further test the robustness of the physical threshold-based SampEn method, we analyzed the ectopic segments containing only atrial or only ventricular beats. For NSR002, there are 17 segments containing atrial beats and 137 segments containing ventricular beats among all 146 ectopic RR segments. For CHF202, there are 41 segments containing atrial beats and 123 segments containing ventricular beats among all 150 ectopic RR segments. Figure 4 shows the results of 17 atrial ectopic RR segments from NSR002. Entropy variance ratios in SampEn varied from −53.40% to 1.77%, with an average of −8.48% and an SD of 19.54%. The corresponding variance ratios for physical threshold-based SampEn varied from 0% to 1.38% (m = 1, r = 12 ms), with an average of 0.42% and an SD of 0.45%; and from −0.51% to 1.77% (m = 2, r = 12 ms), with an average of 0.32% and an SD of 0.56%. Compared with the traditional SampEn, the physical threshold-based SampEn showed significantly lower variance ratios for the analysis of atrial ectopic RR segments. Figure 5 shows similar results from CHF202, which includes 41 atrial ectopic RR segments. The entropy variance ratios in the SampEn varied from −43.10% to 3.53%, with an average of −2.34% and an SD of 8.51%. The corresponding variance ratios for physical threshold-based SampEn varied from −0.19% to 0.97% (m = 1, r = 12 ms), with an average of 0.24% and an SD of 0.33%; and from −0.39% to 1.09% (m = 2, r = 12 ms), with an average of 0.10% and an SD of 0.30%. The results for CHF also support that the physical threshold-based SampEn had significantly lower variance ratios in the analysis of atrial ectopic RR segments.

Figure 6 shows the results of 137 ventricular ectopic RR segments from NSR002. Entropy variance ratios in SampEn varied from −65.24% to 2.46%, with an average of −16.15% and an SD of 21.57%. The corresponding variance ratios for physical threshold-based SampEn varied from 0% to 3.34% (m = 1, r = 12 ms), with an average of 0.82% and an SD of 0.66%; and from −0.89% to 3.22% (m = 2, r = 12 ms), with an average of 0.57% and an SD of 0.73%. Compared with the traditional SampEn, the physical threshold-based SampEn also showed significantly lower variance ratios in the analysis of ventricular ectopic RR segments. Figure 7 shows the similar results from CHF202, which includes 123 ventricular ectopic RR segments. The entropy variance ratios in SampEn varied from −48.55% to 1.56%, with an average of −2.97% and an SD of 10.89%.
The corresponding variance ratios for the physical threshold-based SampEn varied from −0.35% to 2.01% (m = 1, r = 12 ms), with an average of 0.59% and an SD of 0.49%; and from −0.98% to 1.63% (m = 2, r = 12 ms), with an average of 0.22% and an SD of 0.43%. The results for CHF also support the idea that the physical threshold-based SampEn had lower variance ratios in the analysis of ventricular ectopic RR segments.
Table 3 and Figure 8 show the entropy variance ratios and standard deviations for each subject in the NSR group (in total, 45 recordings with the required numbers of ectopic segments, as indicated in Table 1) when comparing the entropy values before and after ectopic beat removal. The absolute variance ratio and standard deviation of SampEn for each subject were clearly larger than those from the two physical threshold-based SampEn methods, and the mean variance ratios were −6.91%, 0.63% and 0.43% for SampEn and the two physical threshold-based SampEn methods (m = 1 and m = 2 respectively, and, for both, r = 12 ms). In addition, SampEn showed significantly larger standard deviations of entropy variance ratios within subjects than the two physical threshold-based SampEn methods. The average standard deviations were 13.93%, 0.62% and 0.68% for SampEn and the two physical threshold-based SampEn methods (m = 1 and m = 2 respectively, and, for both, r = 12 ms).
Table 3. Entropy variance ratios and standard deviations for each subject in the NSR group.
Similarly, Table 4 and Figure 9 show the entropy variance ratios and standard deviations for each patient in the CHF group (24 recordings). The absolute variance ratio and standard deviation of SampEn for each subject were clearly larger than those from the two physical threshold-based SampEn methods, and the mean variance ratios were −5.01%, 1.54% and 1.41% for SampEn and the two physical threshold-based SampEn methods (m = 1 and m = 2 respectively, and, for both, r = 12 ms).
Meanwhile, SampEn showed significantly larger standard deviations of entropy variance ratios within patients than the two physical threshold-based SampEn methods. The average standard deviations were 11.69%, 1.28% and 1.46% for SampEn and the two physical threshold-based SampEn methods (m = 1 and m = 2 respectively, and, for both, r = 12 ms). These results further confirmed the better stability of SampEn using the physical threshold.

When comparing the variance ratios between the NSR and CHF groups, the traditional SampEn showed no significant difference (P = 0.3), while the physical threshold-based SampEn showed significant differences for both settings of m (both P < 0.01), with $P = 4 \times 10^{-7}$ for m = 1 and $P = 2 \times 10^{-6}$ for m = 2.

Discussion and Conclusions
Of the three intrinsic parameters of SampEn, the parameter r is the most difficult to determine. Different choices of the threshold r lead to different entropy outputs. In previous studies, researchers developed different methods for the selection of the threshold r [8,39], and tried to make the selection method more rigorous and standardized [4,40]. However, there is still no unified standard for r value selection. Specific selection methods only perform well under specific circumstances, and the influential factors may include data type, data length, disease type, etc. Therefore, the argument has always been whether to use a fixed tolerance r or a varying tolerance r. Researchers first explored this issue in the multiscale entropy (MSE) method, which performs SampEn analysis on several different scales and thus raises the question of whether a fixed or a varying tolerance r at different scales is better. Angelini et al. reported that using a fixed and a varying tolerance r in MSE generated similar changes in CHF analysis [41]. Silva et al. also confirmed this finding in a rat model of hypertension and CHF [42], suggesting that the selection of the tolerance r in the MSE method is not critical. However, the fixed tolerance r at different scales only stays the same for a specific subject. Across subjects, there is also inter-subject variability of the tolerance r, since different subjects have different signal variabilities of time series.
In a previous study, we found that SampEn reported lower values in CHF patients when using a small threshold r value (r = 0.10), but higher values when using large threshold r values (r = 0.20 or 0.25). These opposite trends in entropy make clinical interpretation difficult. To solve this problem, we proposed a physical threshold-based SampEn method to resolve the contradictory entropy trends when classifying CHF and NSR subjects. This previous study was performed only on RR segments without any ectopic beats. The raw ECG signal had a sampling rate of 128 Hz, so differences between RR intervals are roughly 8 ms or multiples thereof. Thus, we tested the effects of different r values of r = 12 ms, r = 20 ms, r = 28 ms etc., and found that r = 12 ms provided the best discrimination between the CHF and NSR groups. In this study, we used the previously proposed fixed, physically meaningful tolerance r = 12 ms [26] to analyze the RR interval time series with ectopic beats, to explore whether the new r method performs better for ectopic time series. Forty-five NSR and 24 CHF recordings were enrolled in this study, all of which had an appreciable number of ectopic beats, including atrial and ventricular beats. SampEn results from both the traditional varying threshold (a fraction of the SD of the time series) and the new fixed physical threshold were compared before and after ectopic beat removal. For both the NSR and CHF groups, the entropy variance of SampEn with the traditional threshold is clearly larger than that with the physical threshold, which verifies the better consistency of the new physical threshold method.
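The relation between the 128 Hz sampling rate and the tested thresholds can be checked with a short calculation. The sketch below assumes that RR intervals are exact multiples of the sampling period, so that differences between intervals are integer numbers of steps; the function name `max_matching_steps` is hypothetical.

```python
FS_HZ = 128                 # sampling rate of the raw ECG (from the text)
STEP_MS = 1000.0 / FS_HZ    # sampling period: 7.8125 ms, i.e., roughly 8 ms


def max_matching_steps(r_ms, step_ms=STEP_MS):
    """Largest RR difference, in sampling steps, that still satisfies <= r."""
    return int(r_ms // step_ms)


for r_ms in (12, 20, 28):
    k = max_matching_steps(r_ms)
    print(f"r = {r_ms} ms accepts differences of up to {k} step(s), "
          f"i.e., up to {k * STEP_MS} ms")
```

Under this assumption, r = 12 ms, 20 ms and 28 ms fall between consecutive multiples of the sampling period, so they accept differences of at most one, two and three steps, respectively, and the match decision is not sensitive to small changes in r around these values.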
Ectopic beats are routinely removed or edited from the RR interval time series prior to HRV analysis. Salo et al. found that both time- and frequency-domain indices were sensitive to the editing of RR intervals [28]. This finding is consistent with our current study, where we showed that the SampEn calculated by the traditional method was sensitive to the removal of ectopic beats (one to five beats). The reason is that ectopic beats usually result in sudden changes in the RR interval time series. This effect is significant on the transient change of HRV reflected by both the time- and frequency-domain indices, as well as nonlinear indices like SampEn [29,43]. However, for each subject, after ectopic beats were removed, the entropy value only changed significantly in specific segments. The entropy variance ratio for all segments in subject NSR002 was between −65.24% and 2.25% for the traditional threshold; and between 0% and 3.34% (m = 1), and −0.51% and 3.21% (m = 2) for the physical threshold. The results in subject CHF202 were similar, i.e., between −62.50% and 3.53% for the traditional threshold; and −0.35% and 2.01% (m = 1), and −0.98% and 1.39% (m = 2) for the physical threshold. The absolute change in SampEn with the traditional threshold was much larger than that in SampEn with the physical threshold.
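One plausible contributor to this sensitivity can be illustrated numerically: the traditional threshold is a fraction of the SD of the segment, and a single ectopic beat with a sudden short and long interval inflates that SD, whereas the physical threshold stays at 12 ms. The RR values below are invented for illustration, and the fraction 0.2 is one choice within the 0.10 to 0.25 range mentioned earlier.

```python
from statistics import stdev


def traditional_r(rr_ms, fraction=0.2):
    """Traditional tolerance: a fraction of the segment's sample SD (ms)."""
    return fraction * stdev(rr_ms)


# Hypothetical ectopic-free segment (ms).
clean = [800, 808, 792, 800, 816, 800, 792, 808, 800, 800]
# Same segment with one premature beat followed by a long pause (assumed).
ectopic = clean[:5] + [500, 1100] + clean[5:]

PHYSICAL_R_MS = 12.0  # fixed threshold, identical for both segments

print("traditional r, ectopic-free:", traditional_r(clean))
print("traditional r, with ectopic:", traditional_r(ectopic))
print("physical r, both segments:  ", PHYSICAL_R_MS)
```

In this toy case the traditional tolerance changes by more than an order of magnitude between the two segments, so the similarity criterion itself differs before and after removal, while the fixed threshold applies the same criterion to both.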
In addition, we also analyzed the effect of different ectopic beats (atrial or ventricular) on the tested SampEn output. Results from the segments only containing atrial or ventricular beats showed that SampEn using the physical meaning threshold still performed better than SampEn using the traditional threshold. When atrial beats or ventricular beats were removed, the absolute entropy value variation in the former SampEn was significantly smaller than that in the latter.
In conclusion, SampEn using the physical threshold performs better, not only for different data types (NSR or CHF recordings), but also for different types of ectopic beat (atrial beats, ventricular beats, or both), and using the physical threshold makes SampEn more consistent and stable.