Isolation of multiple electrocardiogram artifacts using independent vector analysis

Electrocardiogram (ECG) signals are normally contaminated by various physiological and nonphysiological artifacts. Among these artifacts, baseline wandering, electrode movement and muscle artifacts are particularly difficult to remove. Independent component analysis (ICA) is a well-known blind source separation (BSS) technique and is extensively used in the literature for ECG artifact elimination. In this article, independent vector analysis (IVA) is used for artifact removal from ECG data. This technique combines the advantages of canonical correlation analysis (CCA) and ICA because it uses both second-order and higher-order statistics to un-mix the recorded mixed data. Processing the recorded signals together with their delayed versions makes the IVA-based technique more practical. The proposed technique is evaluated on real and simulated ECG signals, and the results show that it outperforms CCA and ICA because it removes the artifacts while altering the ECG signals minimally.


INTRODUCTION
An electrocardiogram (ECG) is an important tool to measure the electrical activity generated by the SA (sinoatrial) node that causes the upper heart chambers (atria) to contract. The ECG is an effective tool for investigating heart-related problems such as arrhythmia and is widely adopted in a number of practical applications. ECG signals have been utilized for the automatic detection of myocardial infarction. Kumar, Pachori & Acharya (2017) investigated ECG signals for the detection and characterization of coronary artery disease. Similarly, a heart failure detection technique based on ECG signals has been presented, and the extraction of fetal ECG from maternal ECG is achieved in Su & Wu (2017). Qingxue & Zhou (2018) developed a person identification technique based on ECG signal processing. Moreover, ECG-based diagnosis of silent myocardial infarction and of the long-term risk of heart failure is reported in Qureshi et al. (2018). Meanwhile, modern and efficient ECG data recording and analysis systems have also been designed, even for wireless scenarios (Tao et al., 2018; Elgendi et al., 2018; Han et al., 2018; Tanguay et al., 2018; Orphanidou, 2018). However, the recorded ECG signals are normally affected by different types of electro-physiological and non-electro-physiological artifacts. An artifact-affected ECG cannot be used in sensitive applications. Hence, efficient removal of artifacts is necessary for ECG signal analysis in various applications. Removing these artifacts before further processing makes the design of ECG instruments simpler and produces more accurate results.
In the literature it is well known that, among ECG artifacts, baseline wandering (BW), electrode movement (EM), and muscle artifacts (MAs) are the more challenging to separate from the recorded ECG signals (Hesar & Mohebbi, 2017; Varanini et al., 2016; Zarzoso & Nandi, 2001). BW is normally generated through body movements, breathing, and loose sensor contacts. EM is the result of variations in electrode position over the body surface, while MA is caused by contraction of the muscles near the electrode (Limaye & Deshmukh, 2016). The main challenges associated with the removal of these artifacts are their unpredictable amplitudes and variable frequency range (Hegde, Deekshit & Satyanarayana, 2012).

Related work
Numerous researchers have contributed to artifact removal from ECG signals, using algorithms such as the extended Kalman filter (Hesar & Mohebbi, 2017), least mean square (LMS) (Rahman, Shaik & Reddy, 2009) and the Wiener filter (Chang & Liu, 2011). ECG signal denoising and classification schemes based on projected and dynamic features are presented in Chen et al. (2017). High-density muscle noise removal from the recorded ECG signal is performed in Wang et al. (2020) using the independent vector analysis (IVA) technique. Separation of the fetal and maternal ECG signals is carried out in Sugumar & Vanathi (2016) through the IVA technique. Denoising based on successive local filtering is discussed in Mourad (2022). Deep-learning-based ECG de-noising techniques are proposed in Rahhal et al. (2016) and Rasti-Meymandi & Ghaffari (2022). The segmented beat classification and de-noising method discussed in Agostinelli et al. (2016) applies a filtering technique to suppress the noise, followed by detection of the QRS complex from the ECG signals, using the MIT-BIH Noise Stress Test Database. The time-series clustering techniques used for ECG classification and artifact removal in Rodrigues, Belo & Gamboa (2017) extract the features that best characterize the signal over time and group its samples into individual clusters through an agglomerative clustering approach. Moreover, the blind source separation (BSS) technique called independent component analysis (ICA) is also used for fetal ECG extraction and artifact removal in Varanini et al. (2016), Sameni et al. (2007), and Jafari & Chambers (2005). ECG signal classification and de-noising are also performed using ICA in Uddin & Alam (2009), Sameni, Jutten & Shamsollahi (2008), Vayá et al. (2007), and Rieta et al. (2004). In all these applications, mixed data is first recorded through electrodes and then processed using ICA algorithms for un-mixing and further classification.
The IVA technique is already used for gradient noise removal from electroencephalogram signals in Acharjee et al. (2015).
Adaptive filtering techniques and ICA are both used for ECG artifact removal; however, Zarzoso & Nandi (2001) show that ICA outperforms adaptive filtering for this task. ICA is recommended by various researchers for artifact removal, but some inefficiency of ICA is also reported in Urrestarazu et al. (2004) and Shackman et al. (2009). In the literature, canonical correlation analysis (CCA) is used as an alternative to ICA (De Clercq et al., 2006), but it has not yet been used for ECG artifact removal. CCA utilizes the original signals as well as delayed versions of the signals. It is based on second-order statistics (SOS) and extracts maximally auto-correlated and mutually un-correlated signals (De Clercq et al., 2006). From Mowla et al. (2015), it is known that CCA is an efficient and practically usable technique as compared to ICA. Moreover, ICA utilizes higher-order statistics (HOS) to explore statistical independence, while CCA relies on SOS to recover statistically un-correlated sources. It is clear from statistical theory that un-correlatedness is a weaker condition than independence.
A recently developed BSS technique called independent vector analysis (IVA) combines the advantages of both ICA and CCA in a single framework (Anderson et al., 2014). IVA processes the original and time-delayed versions of the signals (just like CCA) while utilizing the HOS (like ICA). IVA assumes that the source signals in one data set are independent of each other and that each source is dependent on at least one source of another data set. Moreover, from Anderson et al. (2014) it is known that IVA performs well as compared to ICA and CCA. Mohammed, Hassan & Ferikoglu (2021) in particular mention that the ICA algorithm gives more accurate results than the extended Kalman filter in reducing baseline wandering and electrode movement artifacts. It is also important to mention that in low-frequency applications ICA gives more accurate results (Mohammed, Hassan & Ferikoglu, 2021; Villena et al., 2018).

Contribution
Based on this discussion, an IVA-based technique is proposed in this article for ECG artifact removal. This is the first article that proposes an IVA-based technique for ECG artifact removal. The IVA-based technique produces clearer ECG signals that might help medical specialists to observe some very low-amplitude electrophysiological effects of the heart. In this article, the performance of three IVA algorithms, called IVA-L, IVA-G, and IVA-GGD, is investigated for ECG artifact removal. The IVA-L algorithm utilizes the HOS and assumes a Laplacian distribution for the source component vectors (Kim et al., 2007). The IVA-G algorithm exploits linear dependencies without taking the HOS into account and assumes a Gaussian distribution for the underlying sources (Anderson, Adali & Li, 2012). The IVA-GGD algorithm utilizes both the SOS and HOS while assuming a multivariate generalized Gaussian distribution for the underlying sources (Anderson et al., 2014). It is also important to mention that all the ECG and artifact data is taken from the MIT-BIH Noise Stress Test Database (Moody, Muldrow & Mark, 1984), which is freely available for further research on ECG signal processing. In addition, the main contributions of this research are as follows:
- A recently developed BSS technique, IVA, is used to separate artifacts from the ECG signal.
- The three most challenging ECG artifacts, BW, MA and EM, are removed from the recorded ECG signals.
- The performance of ICA, CCA and IVA is analyzed for artifact removal using real and simulated ECG signals.
- Three variants of IVA, namely IVA-L, IVA-G, and IVA-GGD, are investigated to study their performance for ECG artifact removal.
The rest of the article is organized such that Section 2 presents details of the ECG data. Both the realistic simulated and real ECG signals along with ECG artifacts are discussed. The system model is given in Section 3, while the proposed algorithm using IVA algorithms is discussed in Section 4. A simulation study of the simulated and real signals is carried out in Section 5 with the concluding remarks in Section 6.

ECG DATA AND ARTIFACTS
Realistic simulated and real ECG data are considered for simulations. Real data is taken from the MIT-BIH database (Moody, Muldrow & Mark, 1984). The acquired signals in the MIT-BIH Noise Stress Test Database are digitized using uni-polar ADCs with 11-bit resolution. This database is open source for further research. The MIT-BIH database contains the ECG signals and their artifacts. The artifacts considered in this work are as follows:
Muscle artifact: Muscle artifacts are the result of muscle contraction and have low amplitudes and a large frequency range from 0-10 kHz.
Baseline wandering: Baseline wandering originates from body movements, breathing, and loose sensor contact. Body movements cause unpredictable large-amplitude, low-frequency artifacts. Breathing also causes low-frequency drift between 0.15 and 0.3 Hz.
Electrode movement: Electrode movement artifacts arise when an electrode shifts away from its skin contact, which changes the electrode-skin impedance and causes potential variations in the recorded ECG signal.
Other artifacts: Other ECG artifacts include power line interference, device noise, electro-surgical noise, quantization noise, aliasing, etc.
The time-domain real ECG, BW, EM, and MA signals are demonstrated in Fig. 1 with 2,000 data samples of each signal as a first data set.
The frequency-domain representation is shown in Fig. 2. It shows that most of the frequency content of the ECG and the artifacts lies within about 50 Hz. From Fig. 2 it is observed that the frequencies of the ECG signal and the artifacts overlap with each other. Hence, efficient BSS techniques are required to cleanly extract the ECG signals. As already discussed, IVA is a more efficient BSS technique than ICA and CCA (Anderson et al., 2014). Based on this discussion, IVA is recommended for ECG signal de-noising. The recorded mixed ECG data is shown in Fig. 3 for a single data set with L = 2,000 samples. The measurement is taken in the presence of additive white Gaussian noise (AWGN) with a signal-to-noise ratio (SNR) of 20 dB. Figure 3 contains mixtures of all the individual source signals, which are the ECG, BW, EM and MA. The mixing process is performed in MATLAB such that the source signal matrix S of size 4 × 2,000 is multiplied by a randomly generated mixing matrix A of size 4 × 4. The mixed data is recorded in matrix X, which is of size 4 × 2,000. It must be noted that ICA uses a single data set as shown above, while IVA records and processes multiple copies of the source signals, i.e., multiple mixing matrices are observed and all data sets are un-mixed at the same time. The main advantage of the IVA algorithm is that it un-mixes the recorded signals and their delayed versions jointly. The realistic simulated ECG signals are generated in MATLAB version R2016a (MathWorks, Inc., Natick, MA, USA).
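The mixing step described above can be illustrated with a short sketch. It is written in Python rather than MATLAB and uses only the standard library. The sinusoidal sources are placeholders standing in for the ECG, BW, EM and MA records, and the helper names `mix_sources` and `add_awgn` are illustrative assumptions, not functions from the original study.

```python
import math
import random

def mix_sources(S, A):
    """Return X = A S, with S given as K rows of L samples and A as a K x K matrix."""
    K, L = len(S), len(S[0])
    return [[sum(A[i][k] * S[k][t] for k in range(K)) for t in range(L)]
            for i in range(len(A))]

def add_awgn(X, snr_db, rng):
    """Add white Gaussian noise to each row at a nominal per-row SNR of snr_db."""
    noisy = []
    for row in X:
        power = sum(v * v for v in row) / len(row)
        sigma = math.sqrt(power / (10 ** (snr_db / 10)))
        noisy.append([v + rng.gauss(0.0, sigma) for v in row])
    return noisy

rng = random.Random(0)
K, L = 4, 2000
# Placeholder sources (assumption): four sinusoids instead of real ECG/BW/EM/MA data.
S = [[math.sin(2 * math.pi * (k + 1) * t / 200) for t in range(L)] for k in range(K)]
# Randomly generated 4 x 4 mixing matrix, as in the text.
A = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(K)]
X_clean = mix_sources(S, A)
X = add_awgn(X_clean, snr_db=20.0, rng=rng)
```

Here the noise level is set per channel from the measured power of each mixture row; the source does not say whether the SNR is defined per channel or over the whole recording, so this is one possible reading.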

SYSTEM MODEL
This section presents the ECG signals and artifacts in the IVA data model. K independent sources, i.e., the ECG and the artifacts, are considered, and every source contains L samples in each of the D data sets. The data acquired using ECG electrodes is expressed as:
$$X^{d} = A^{d} S^{d}, \quad d = 1, 2, \dots, D.$$
The matrix $S^{d}$ contains the source data vectors $s^{d}_{1}, s^{d}_{2}, \dots, s^{d}_{K}$, where every vector has length L. All vectors are real-valued random vectors with zero mean. The mixing matrices $A^{d}$ are also real, with random values, for the D data sets. Hence, the IVA algorithm is responsible for estimating these unknown matrices from the mixed data. The source data is represented by $(S^{1})^{T}, (S^{2})^{T}, \dots, (S^{D})^{T}$ in the D data sets. After the estimation of $A^{d}$ through the IVA algorithm, the resultant source signals, as given in Qamar et al. (2022), are expressed as:
$$Y^{d} = W^{d} X^{d}, \quad d = 1, 2, \dots, D.$$
Here $W^{d}$ is the inverse of $A^{d}$ and is called the un-mixing matrix, estimated for each of the D data sets. The estimated source data vectors are $y^{d}_{1}, y^{d}_{2}, \dots, y^{d}_{K}$.
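As a toy illustration of this data model, the following sketch mixes two sources in each of two data sets and recovers them with the ideal un-mixing matrices. All numbers are made up for the example, the noise term is omitted, and the matrices are inverted directly because they are known here; an IVA algorithm would have to estimate them blindly. The helpers `matmul` and `inv2` are illustrative names.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inv2(M):
    """Inverse of a 2 x 2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Toy sources: K = 2 sources, L = 3 samples, D = 2 data sets (values are arbitrary).
S_sets = [
    [[1.0, -1.0, 0.5], [0.0, 2.0, -2.0]],
    [[0.5, 1.0, -1.0], [-2.0, 0.0, 2.0]],
]
# One mixing matrix per data set.
A_sets = [
    [[2.0, 1.0], [1.0, 3.0]],
    [[1.0, -1.0], [0.5, 2.0]],
]

X_sets = [matmul(A, S) for A, S in zip(A_sets, S_sets)]   # mixed data per data set
W_sets = [inv2(A) for A in A_sets]                        # ideal un-mixing matrices
Y_sets = [matmul(W, X) for W, X in zip(W_sets, X_sets)]   # recovered sources
```

With exact inverses the recovered `Y_sets` equal `S_sets` up to floating-point rounding; with blindly estimated matrices the recovery holds only up to scaling and ordering.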

PROPOSED IVA-BASED ECG ARTIFACTS SEPARATION
Multi-channel ECG signals are recorded in the presence of various artifacts, i.e., BW, EM, and MA, as well as noise. The number of ECG and artifact signals is denoted by K, and each signal has data block length L in each of the D data sets. The recorded mixed data contains D data sets $(X^{1})^{T}, (X^{2})^{T}, \dots, (X^{D})^{T}$, as shown in Fig. 3 for an SNR of 20 dB. Since the artifact signals overlap in frequency with the original ECG signal, as illustrated in Fig. 2, the role of the BSS algorithms is to estimate the source signals from the recorded mixed signals. The BSS algorithms assume nothing except independence and non-Gaussianity of the source signals. The estimated sources of each BSS algorithm have scaling and order ambiguities. The scaling issue can be easily resolved by assuming source signals with unit variance and scaling the un-mixing vectors so that the extracted sources have unit variance. The arbitrary order of the estimated signals in each data set can be corrected using a permutation matrix, which is common to all data sets (Anderson, Adali & Li, 2012).
The IVA algorithms separate the mixed recorded signals, taken as the first data set, together with their delayed versions, taken as the other data sets. This separation is performed by minimizing the cost function
$$I_{\mathrm{IVA}} = \sum_{k=1}^{K} \left( \sum_{d=1}^{D} H[y^{d}_{k}] - I[\mathbf{y}_{k}] \right) - \sum_{d=1}^{D} \log \left| \det W^{d} \right| - C.$$
Here $I[\mathbf{y}_{k}]$ represents the mutual information within the k-th source component vector (SCV), H is the entropy, $W^{d}$ is the un-mixing matrix of the d-th data set, and C is a constant equal to $H[X^{1}, X^{2}, \dots, X^{D}]$, which depends only on the recorded mixed data. Minimizing this cost function maximizes the mutual information within each SCV. ICA is a well-known blind source separation technique for linearly mixed signals that utilizes the statistical independence of the source signals (Uddin et al., 2015). CCA considers the mixed recorded signals as well as their delayed versions by exploiting the SOS. IVA combines the advantages of CCA and ICA by exploiting both the SOS and HOS. Moreover, numerous variants of the IVA algorithm, such as IVA-GGD (Anderson et al., 2014), IVA-L (Kim et al., 2007) and IVA-G (Anderson, Adali & Li, 2012), exist in the literature, and their strong performance has already been demonstrated. Motivated by this, this research implemented several versions of the IVA algorithm to verify their validity for ECG artifact removal. All these algorithms use the IVA cost function above to estimate the un-mixing matrices. In the case of complex-valued data, the IVA-G algorithm includes the pseudo-covariance matrix in the cost function. This algorithm also ignores the HOS and sample-to-sample dependency. The IVA-L utilizes the HOS for un-mixing while ignoring the sample-to-sample dependency and the SOS. The matrix gradient approach is used in the implementation of the IVA-L algorithm. The IVA-GGD algorithm utilizes the HOS and SOS for source signal estimation under a multivariate generalized Gaussian prior. This algorithm also ignores sample-to-sample dependency. Moreover, processing the original as well as the delayed versions makes the IVA algorithms more practical than the ICA technique.
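The scaling and ordering ambiguities can be illustrated with a small sketch. It assumes that a reference set of sources is available, which holds in simulation but not in a truly blind setting, and it uses a simple greedy correlation match rather than the permutation-matrix procedure of the cited work. The function names are illustrative.

```python
import math

def standardize(row):
    """Return a zero-mean, unit-variance copy of row (assumes row is not constant)."""
    n = len(row)
    mean = sum(row) / n
    centered = [v - mean for v in row]
    std = math.sqrt(sum(v * v for v in centered) / n)
    return [v / std for v in centered]

def correlation(a, b):
    """Sample correlation of two already standardized rows of equal length."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def align_to_reference(Y, S_ref):
    """Greedily reorder and sign-correct the rows of Y to match the rows of S_ref."""
    Yn = [standardize(r) for r in Y]
    Sn = [standardize(r) for r in S_ref]
    unused = list(range(len(Yn)))
    aligned = []
    for s in Sn:
        # Pick the remaining estimate with the largest absolute correlation.
        best = max(unused, key=lambda j: abs(correlation(s, Yn[j])))
        unused.remove(best)
        sign = 1.0 if correlation(s, Yn[best]) >= 0 else -1.0
        aligned.append([sign * v for v in Yn[best]])
    return aligned

# Toy example: the estimates are swapped, rescaled and sign-flipped copies of the sources.
S_ref = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
Y = [[-3.0, 3.0, -3.0, 3.0], [-2.0, -4.0, -6.0, -8.0]]
Y_aligned = align_to_reference(Y, S_ref)
```

After alignment each row of `Y_aligned` matches the standardized version of the corresponding reference source, so error measures can be computed row by row.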
Based on these advantages, various variants of IVA algorithms are implemented in this article and their performance is tested for the ECG artifacts removal.
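To make the idea of processing the recorded signals together with their delayed versions concrete, the sketch below builds D data sets from one recorded mixture. The delay of one sample per data set and the truncation to a common length are assumptions made for illustration; the source does not state the delay used or how block edges are handled.

```python
def delayed_data_sets(X, D, delay=1):
    """Build D data sets from mixture X (rows = channels).

    Data set d (0-based) holds X delayed by d * delay samples relative to
    data set 0; all sets are truncated to the same length.
    """
    L = len(X[0])
    max_shift = (D - 1) * delay
    if max_shift >= L:
        raise ValueError("block too short for the requested delays")
    keep = L - max_shift
    sets = []
    for d in range(D):
        start = max_shift - d * delay
        sets.append([row[start:start + keep] for row in X])
    return sets

# Toy mixture with two channels and six samples.
X_toy = [[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]
data_sets = delayed_data_sets(X_toy, D=3, delay=1)
```

Each entry of `data_sets` has the same shape, so the D sets can be passed jointly to an IVA routine.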

SIMULATION RESULTS
In this section, simulation results of the proposed IVA-based technique for ECG artifact removal from the recorded mixed signals are presented. The IVA algorithms considered for simulations are IVA-GGD, IVA-L and IVA-G. The performance of these algorithms is evaluated for SNRs ranging from 0 to 20 dB. Results are compiled using Monte Carlo simulation. The ECG artifacts considered for simulation are baseline wandering (BW), electrode movement (EM), and muscle artifacts (MA). Real and simulated ECG signals are utilized in the simulations. The real ECG signals are downloaded from the MIT-BIH database and the simulated ECG signals are generated in MATLAB. The number of source signals is K = 4, the number of data sets is D = 4, and the length L of the processing data blocks in each data set ranges from 50 to 2,000 samples.
To evaluate the effectiveness of the proposed IVA technique for ECG artifact removal, three performance evaluation criteria are used. The first is the corresponding root mean square error (CRMSE) used in Chen et al. (2017), computed between $s^{d}_{ECG}$ and $y^{d}_{ECG}$, which represent the original simulated ECG and the reconstructed ECG signals, respectively, at data set d. The second is the common inter-symbol interference ($ISI_{com}$) (Anderson, Adali & Li, 2012). The $ISI_{com}$ is normalized so that its maximum value is one and its minimum value is zero, where zero corresponds to ideal separation performance. The third criterion is $U(W^{d} A^{d})$, for which ideal separation likewise corresponds to a value of zero.

Table 1 The $ISI_{com}$ performance of the real ECG for all three IVA algorithms, i.e., IVA-GGD, IVA-L, and IVA-G, at an SNR of 20 dB. The performance of the algorithms is evaluated for input data block lengths ranging from 50 to 2,000 samples in each data set.

Table 2 The $ISI_{com}$ results of the real ECG for all three IVA algorithms at an input data block length of 2,000 samples in each data set and SNRs ranging from 0 to 20 dB.

First, the effectiveness of the IVA-based technique in comparison with the ICA and CCA techniques is demonstrated. The three techniques are represented by the Fast-ICA algorithm (Uddin, Ahmad & Iqbal, 2017) for ICA, the GMCA algorithm (Li et al., 2009) for CCA, and the IVA-G algorithm for IVA. Simulations are performed at an SNR of 20 dB. The performance evaluation criterion used is the CRMSE. In the case of the ICA algorithm, the number of data sets is one. Performance evaluation is carried out for values of L ranging from 100 to 2,000 samples in a single data set. The results of the ICA, CCA and IVA algorithms are shown in Fig. 4. The simulation results clearly show that IVA outperforms the ICA and CCA algorithms. These results also verify that the IVA algorithm is less sensitive to the processing data block length. At a block length of L = 100, the performance improvement relative to the ICA technique is around 85% for the IVA technique and 15% for the CCA technique. Similarly, the $ISI_{com}$ performance of all these algorithms is evaluated under the same conditions as in the above simulations. The results are shown in Fig. 5, which also confirms the effective performance of the IVA algorithm. The extracted ECG signals for these three algorithms are shown in Fig. 6. It shows that the IVA algorithm outperforms the other algorithms and is also less sensitive to AWGN. Second, the quality of the ECG signals separated from the various artifacts by the IVA algorithms is evaluated using the $ISI_{com}$. Here, the simulated ECG signal corrupted by various artifacts, i.e., BW, MA, and EM, is considered.
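A normalized ISI measure of this kind can be computed as sketched below. The sketch assumes that the common ISI follows the joint-ISI definition associated with the cited Anderson, Adali & Li (2012), i.e., the ordinary normalized ISI applied to the element-wise mean of the absolute global matrices $G^{d} = W^{d} A^{d}$; the exact expression used in this article is not recoverable from the text, so this should be read as an illustration rather than the authors' formula.

```python
def isi(G):
    """Normalized ISI of a square global matrix G (0 = ideal, 1 = worst case).

    Assumes every row and column of G has at least one non-zero entry.
    """
    K = len(G)
    absG = [[abs(v) for v in row] for row in G]
    row_term = sum(sum(row) / max(row) - 1.0 for row in absG)
    cols = [[absG[i][j] for i in range(K)] for j in range(K)]
    col_term = sum(sum(col) / max(col) - 1.0 for col in cols)
    return (row_term + col_term) / (2.0 * K * (K - 1))

def joint_isi(G_sets):
    """ISI of the element-wise mean of |G^d| over all data sets."""
    D, K = len(G_sets), len(G_sets[0])
    mean_abs = [[sum(abs(G[i][j]) for G in G_sets) / D for j in range(K)]
                for i in range(K)]
    return isi(mean_abs)

# A scaled permutation matrix is an ideal global matrix.
ideal = isi([[0.0, 2.0], [-3.0, 0.0]])
# Two data sets with different permutations do not share a common ordering.
mismatched = joint_isi([[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]])
```

Averaging the absolute global matrices before applying the ISI penalizes solutions whose source ordering differs between data sets, which is why a joint measure is used for IVA instead of a per-data-set ISI.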
Linearly mixed instantaneous signals are generated using randomly generated mixing matrices in MATLAB. The mixed recorded signals are shown in Fig. 7 for a single data set. The mixing process of Figs. 3 and 7 is the same; the difference is that Fig. 3 contains the simulated ECG signals while Fig. 7 shows the realistic ECG signal. Three IVA algorithms are applied to the simulated ECG signals for artifact removal. The reliability of the ECG signals for all three algorithms is evaluated for different SNR values. The simulations are performed over four recorded data sets independently. In each run, the pure ECG signal is extracted and the artifacts are separated from the recorded mixed signals.
The $ISI_{com}$ performance of the IVA algorithms is evaluated for different numbers of iterations. Results are shown in Fig. 8 for 20 dB SNR and a block length of 1,000 samples. All the algorithms show similar performance at steady state. Furthermore, the performance of the IVA algorithms is also evaluated for different input data block lengths in different data sets. Simulation results are shown in Fig. 9 at 20 dB SNR. These results show that the IVA-L algorithm is more sensitive to the length of the processing data blocks. At a block length of 100 samples in each data set, the performance improvements of the IVA-G and IVA-GGD over the IVA-L are 18% and 19%, respectively. To further investigate the IVA algorithms, their $U(W^{d} A^{d})$ performance is evaluated at an SNR of 20 dB for different numbers of iterations. Results are given in Fig. 10. It shows that the IVA-L converges faster than the IVA-G and IVA-GGD algorithms. The IVA-L converges at approximately 10 iterations, while the other two converge at approximately 25 iterations. Although the IVA-L converges faster, it reaches the same steady-state results as the other algorithms.
In the third part of the simulations, we demonstrate the practical performance of the IVA algorithms for real ECG artifact removal. The ECG artifacts considered in this part are BW, EM and MA. Removal of these artifacts is a challenging task due to their variable amplitudes and frequencies. The IVA algorithms considered in this section are IVA-L, IVA-G and IVA-GGD. The separated signals of the IVA algorithms are shown in Fig. 11 for 20 dB SNR. The results show that all three algorithms perform well for ECG artifact removal. Moreover, the error signals are shown in Fig. 12, where the error signal is the difference between the real and separated ECG signals. The very low amplitudes of the error signals show the effectiveness of the IVA algorithms. The performance of all three algorithms is evaluated at 20 dB SNR. Results for all five ECG signals, i.e., ECG1, ECG2, ECG3, ECG4 and ECG5, are given in Table 3. This table shows that each algorithm performs approximately the same across all five ECG signals. The $ISI_{com}$ performance for ECG2 is also shown in Fig. 14 to illustrate the performance improvement for increased lengths of the processing data blocks.
In addition to Table 3, the results for the other ECG signals could also be included as figures, but the figures are restricted to ECG2 to avoid unnecessary length. Furthermore, the reconstructed ECG signal for ECG2 is shown in Fig. 15 for all three algorithms.
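The error signal described above, and a root mean square summary of it, can be computed as in the following sketch. This is a plain RMSE between a reference ECG and a reconstructed ECG that have already been aligned in sign and scale; the exact normalization of the CRMSE used in the article is not recoverable from the text, so the function below is an assumption made for illustration.

```python
import math

def error_signal(reference, estimate):
    """Sample-wise difference between the reference and the reconstructed signal."""
    return [r - e for r, e in zip(reference, estimate)]

def rmse(reference, estimate):
    """Root mean square of the error signal."""
    err = error_signal(reference, estimate)
    return math.sqrt(sum(v * v for v in err) / len(err))

# Toy values only; real use would pass the true and separated ECG blocks.
perfect = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
offset = rmse([0.0, 0.0], [3.0, 4.0])
```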

DISCUSSION AND CONCLUSION
The ECG artifact removal problem is investigated in this article. Both realistic simulated and real ECG signals are utilized for simulation. The artifacts considered are baseline wandering, electrode movement and muscle artifacts. Removal of these artifacts is difficult due to their variable amplitudes and frequencies. The comparison carried out in this article shows that the IVA technique outperforms the CCA and ICA techniques. We further investigated the IVA technique for ECG artifact removal. For comparison purposes, we considered three IVA algorithms to obtain clearer ECG signals in the presence of various artifacts. In addition, we utilized different evaluation criteria to confirm the performance of the proposed technique. The $ISI_{com}$ performance of the IVA algorithms was evaluated for different input data block lengths in different data sets, with simulation results shown in Fig. 9 at 20 dB SNR. These results show that the IVA-L algorithm is more sensitive to the length of the processing data blocks. At a block length of 100 samples in each data set, the performance improvements of the IVA-G and IVA-GGD over the IVA-L are 18% and 19%, respectively. As a concluding remark, the IVA algorithms are less sensitive to input data block lengths and input SNRs than the ICA technique. Thus, IVA is shown to be an efficient and more practical technique for ECG de-noising.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
The authors received no funding for this work.