A Software for Preprocessing Experimental BSPM Signals for a CRT Study

In this text, we describe a system developed for the preprocessing and basic processing of Body Surface Potential Mapping (BSPM) signals. BSPM is based on multichannel ECG measurement with up to hundreds of electrodes arranged in a specific grid on the body surface. The project focuses on the signals of patients after cardiac resynchronization therapy (CRT). These patients are indicated for CRT because of heart failure, and it is necessary to take into account the differences in the electrical and mechanical heart activity of such patients. The presented software is designed for the specific conditions of this problem, with the aim of minimizing morphology distortion during filtering and maintaining specificity during signal delineation (finding the characteristic points of the ECG).


Introduction
The results presented in this article continue the previous research presented in [1]. The project is carried out in cooperation with the Department of Cardiology of the University Hospital Motol, Prague, and its primary domain is cardiac resynchronization therapy (CRT), which is a treatment for heart failure. Many studies, e.g., [2][3][4], have validated the influence of CRT on decreasing morbidity and mortality. Body Surface Potential Mapping (BSPM) is a method of non-invasive measurement and analysis of surface electrical heart potentials based on a grid containing many electrodes (24-300 electrodes). Using BSPM to evaluate CRT is a relatively unexplored subject. In our project, we use the Biosemi BSPM system [5] with 120 electrodes to analyze electrical dyssynchrony. An example of the Biosemi electrode grid is shown in Figure 1. Electrical dyssynchrony allows the progress of CRT to be quantified. BSPM provides a better spatial sampling of the heart field on the body surface. Thus, this approach can (in some cases) be utilized for diagnostics based on spatial and temporal information about the electrical heart activity [2,6].
The advantage of BSPM is that it provides better information about the cardiac electric field available on the body surface. However, multichannel measurement has several disadvantages. The first is the need to place many electrodes on the chest and back. This process is time-consuming, and there is a high probability of poor electrode-skin contact at some electrodes [7]. Furthermore, the large number of electrodes is not very comfortable for the patient. Hence, the record contains a large number of movement artefacts, which grows with the length of the examination. Therefore, it is necessary to use an appropriate form of filtering. Moreover, some leads of the record may be damaged so much that it is better to replace them by a combination of other signals, or to exclude them from further analysis. The preprocessing of the signal is performed automatically, but it is always better for an expert, in our case a cardiologist, to make the final decision. In our previous research [1], we described our approach to BSPM signal filtering. In this paper, we describe the final software for the preprocessing of a BSPM record and its delineation. The procedure is generally similar to that for standard (12-lead, Holter) ECG, but in our case it is affected by the severity of the disease. We have focused mainly on the QRS complex onset and offset, the R peak, and specifically on min (dR/dt) (see Section 2.3).

Signal Acquisition
In our case, the multichannel measurement is represented by unipolar leads referenced against a virtual Wilson's central terminal (WCT). The central terminal is calculated from the bipolar limb signals after the signals are loaded (in the original file, the signals are referenced to a virtual floating reference). The WCT is given by the following formula:

$$\Phi_{\mathrm{WCT}} = \frac{\Phi_{\mathrm{RA}} + \Phi_{\mathrm{LA}} + \Phi_{\mathrm{LL}}}{3} \qquad (1)$$
Here, $\Phi_{\mathrm{RA}}$, $\Phi_{\mathrm{LA}}$, and $\Phi_{\mathrm{LL}}$ denote the potentials of the right arm, left arm, and left leg. The final signals for preprocessing are obtained by subtracting the WCT from each lead. At the same time, the direct current (DC) offset (from amplification) is removed by subtracting the mean of the signal itself. The sampling frequency of the signals is 1024 Hz. An example of raw signals from the Biosemi software [5] is depicted in Figure 2. The signals shown are from strip 7, electrodes 43-49 (see Figure 1a), which is close to the heart. The WCT reference is created from the first three leads (see the second box with the scrollbar on the left in Figure 2). The Biosemi software is used for data acquisition and can be used as a signal viewer (there are also several smaller applications for reducing the number of electrodes, decimating the sampling frequency, and other minor adjustments). However, for our processing purposes, it was necessary to create a new preprocessing tool with a specific design.
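The re-referencing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Matlab implementation used in the software: the function name `rereference_to_wct` and the arguments `ra`, `la`, and `ll` (the three limb signals) are hypothetical, and all signals are assumed to be plain lists of equal length.

```python
def rereference_to_wct(leads, ra, la, ll):
    """Subtract the WCT from every lead, then remove each lead's DC offset.

    leads      -- list of signals (each a list of samples), hypothetical layout
    ra, la, ll -- the three limb signals used to build the WCT (assumed names)
    """
    n = len(ra)
    # Equation (1): the WCT is the sample-wise mean of the three limb signals.
    wct = [(ra[i] + la[i] + ll[i]) / 3.0 for i in range(n)]

    referenced = []
    for lead in leads:
        # Reference the lead against the WCT.
        shifted = [lead[i] - wct[i] for i in range(n)]
        # Remove the DC offset by subtracting the mean of the signal itself.
        offset = sum(shifted) / n
        referenced.append([v - offset for v in shifted])
    return wct, referenced
```

After this step every returned lead has zero mean, so later stages do not have to deal with the amplifier offset.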

Signal Filtering and Signal Quality Report
The filtering process is described in detail in [1]; hence, we only briefly summarize it here. The most problematic part of filtering is balancing noise reduction against the corruption of ECG morphology caused by the nonlinear phase of the filters. This problem is commonly linked with Infinite Impulse Response (IIR) filters, but Finite Impulse Response (FIR) filters have exhibited the same problem [8]. Moreover, an FIR filter cannot be used against every kind of noise, for example powerline hum. The final filtering deployed in the described software is composed of several steps. The first is a low-pass FIR filter with a cut-off frequency of 100 Hz and a filter length equal to half the sampling rate (1024 Hz). The second is a high-pass FIR filter with a cut-off frequency of 0.67 Hz, based on the AHA guidelines [9]. In the next step, a moving average filter is used. Before this filter is deployed, it is necessary to determine a reference point in the signal (commonly the R peaks, but the situation is a bit more complicated in our case; see Section 2.3); this can be done after the previous two filtering steps. The averaging filter has a window with a length of 1.2 times the sampling frequency. Finally, the powerline hum is reduced by an IIR notch filter with a center frequency of 50 Hz and a 3 dB bandwidth of 6 Hz. As mentioned above, this final filtering is a compromise among noise reduction, time efficiency, and phase distortion of the ECG waves to be determined.
The Signal Quality Index (SQI) is generally a metric for evaluating signal quality based on specific criteria. There are many approaches to determining the SQI, depending on the application area; one that is partly related to our research is described in [10]. We use the SQI to evaluate the characteristics of the signal in the spectrum and to estimate the noise probability. The SQI expresses the quality of the signal as a percentage, where 100% is a clean signal and 0% represents white noise. The developed system automatically marks signals with SQI < 20% as excluded. Nonetheless, the user can decide to return such a signal to the analysis.
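As an illustration of the last filtering stage described in this section, the following Python sketch builds a second-order IIR notch at 50 Hz with a bandwidth of about 6 Hz for a sampling frequency of 1024 Hz. The text does not state how the notch is designed in the software. The biquad formulas used here (the widely used "audio EQ cookbook" form with Q = f0 / bandwidth) are therefore an assumption, as are the function names `notch_coefficients` and `apply_biquad`.

```python
import math


def notch_coefficients(f0, bandwidth, fs):
    """Return normalized biquad coefficients (b, a) of a notch at f0 Hz."""
    q = f0 / bandwidth                 # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs       # notch frequency in rad/sample
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha                   # used to normalize so that a[0] == 1
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(b, a, x):
    """Filter the sequence x with a single forward pass (direct form I)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


# Example: suppress 50 Hz hum in a signal sampled at 1024 Hz.
fs = 1024
b, a = notch_coefficients(50.0, 6.0, fs)
hum = [math.sin(2.0 * math.pi * 50.0 * n / fs) for n in range(2 * fs)]
filtered = apply_biquad(b, a, hum)
```

A single forward pass has a nonlinear phase response near the notch frequency. Whether the software compensates for this (for example by filtering forward and backward) is not stated in the text, so the sketch leaves it out.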

Signal Delineation and Beat Template Determination
As mentioned in the previous section, the R peak is commonly the first ECG point to be found in the case of standard ECG. However, we do not deal with standard ECG, because patients indicated for CRT have heart failure. Unlike standard ECG, the ECG of these patients has a materially different morphology. Moreover, there is also dyssynchrony between the electrical and mechanical activity of the heart. Therefore, such an ECG record requires a specific approach, especially for the delineation of the signal. First, we must consider the fact that the R peak does not need to be present in its classic form (this is also related to the position of the electrode). We use the label "R peak" for the highest absolute amplitude (positive or negative), which is not in accordance with the definition of the R peak (positive amplitude only). We are aware that, for example, a purely negative peak in the ECG is correctly labeled as QS. However, what is most important for us is the time (time interval) of electrical activation in the heart. This issue is described in more detail in [11].
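The "R peak" convention described above (the sample with the highest absolute amplitude, regardless of sign) can be written down directly. The following Python sketch is only an illustration: the function names are hypothetical, and the search is assumed to run on a single, already isolated beat, which the text does not specify.

```python
def find_r_peak(beat):
    """Return the index of the sample with the largest absolute amplitude."""
    return max(range(len(beat)), key=lambda i: abs(beat[i]))


def sample_to_ms(index, fs=1024):
    """Convert a sample index to milliseconds for a sampling frequency fs."""
    return 1000.0 * index / fs


# A beat whose dominant deflection is negative still yields an "R peak".
beat = [0.1, -0.2, -1.5, 0.3, 0.9]
peak_index = find_r_peak(beat)
```

Because only the position in time is used afterwards, the sign of the deflection does not matter for this step.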
The system uses the "R peak" of each beat to generate the beat template. Two conditions are used to check signal stability (in the sense of similarity of consecutive beats). The first condition is a similar RR interval between "R peaks". The beat is delimited by half of the RR interval on the left and half of the RR interval on the right. The second condition is the similarity of the beats themselves. This is checked by calculating Euclidean distances in six consecutive windows per beat. If any distance, or the sum of the distances, exceeds the given threshold, the beat is excluded from the template calculation. If the number of included beats before the excluded beat is high enough, the template is created from them. If the number of beats is not high enough, the algorithm searches the following sequences of beats. In addition to this simple approach, we tried nearest neighbor clustering with dynamic time warping, but the result was similar, and the clustering was more time-consuming than the more straightforward approach. Therefore, we decided to use the first approach.
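The second stability condition and the search for a usable run of beats can be sketched as follows. This Python sketch rests on several assumptions that the text does not confirm. Beats are assumed to be cut to a common length, each beat is compared with the previous beat of the run, the thresholds are free parameters, and the template is formed as a sample-wise mean. All function names are hypothetical, and the RR interval condition is omitted for brevity.

```python
import math


def window_distances(beat, reference, n_windows=6):
    """Euclidean distance between two beats in consecutive windows."""
    n = min(len(beat), len(reference))
    edges = [k * n // n_windows for k in range(n_windows + 1)]
    return [
        math.sqrt(sum((beat[i] - reference[i]) ** 2 for i in range(lo, hi)))
        for lo, hi in zip(edges, edges[1:])
    ]


def is_similar(beat, reference, window_threshold, total_threshold):
    """True if no window distance and no sum of distances is over threshold."""
    d = window_distances(beat, reference)
    return max(d) <= window_threshold and sum(d) <= total_threshold


def select_template_beats(beats, window_threshold, total_threshold, min_beats):
    """Return indices of the first long-enough run of similar beats."""
    run = []
    for idx, beat in enumerate(beats):
        if run and not is_similar(beat, beats[run[-1]],
                                  window_threshold, total_threshold):
            if len(run) >= min_beats:
                return run      # enough beats before the excluded one
            run = []            # too few beats: search the following sequence
        run.append(idx)
    return run if len(run) >= min_beats else []


def average_template(beats, indices):
    """Sample-wise mean of the selected beats (assumed template definition)."""
    length = len(beats[indices[0]])
    return [sum(beats[i][s] for i in indices) / len(indices)
            for s in range(length)]
```

Returning indices rather than the beats themselves keeps the selection easy to override, which matches the option to change the selected beats manually.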

Description of the Developed System
The developed software provides the above-mentioned automated preprocessing (filtering, delineation) and also allows particular parts of the record processing to be adjusted. The main window of the software, with a description of its parts, is shown in Figure 3. The most important parts are described in more detail in the following text.
Before the main window opens, the software converts each newly opened record from Biosemi's proprietary file format (bdf) into the Matlab data file format (mat), which is much better suited to the subsequent processing and to saving additional information. From then on, the user works with the created mat file. The first mode of the software is a window for template determination. The system automatically selects a stable part of the record over all leads as a basis for the beat templates. However, this process is a compromise, because finding a part of at least 20 beats in which the signals are stable in all leads, without any movement artefacts, is almost impossible. Therefore, the user has the option to change the selection of beats for template creation in each lead. It is necessary to realize that it is primarily the time intervals (their length) that are evaluated, which requires high precision in the beat representation. The window for the selection of template beats is depicted in Figure 4. The next step is a decision about the leads that are labeled as excluded because of a low Signal Quality Index (see Section 2.2). The user has the option to mark the signal as included or, if the signal's quality is inferior, to calculate a new signal as a superposition of selected signals in its neighborhood. The new signal then replaces the original signal. This process depends on the user's careful selection of which signals can be replaced (i.e., based on the quality of the signals in the neighborhood) and which signals really have to be excluded from the subsequent processing and evaluation steps. An example of a signal with an appropriate neighborhood before replacement is shown in Figure 5.
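The replacement of a low-quality lead by a superposition of its neighbors can be illustrated with a short Python sketch. The text does not state how the neighboring signals are weighted, so equal weights are assumed by default. The function name `substitute_lead`, the dictionary layout, and the lead numbers in the example are hypothetical.

```python
def substitute_lead(signals, neighbours, weights=None):
    """Build a substitute signal as a weighted sum of neighbouring leads.

    signals    -- dict mapping a lead number to its list of samples
    neighbours -- lead numbers selected by the user
    weights    -- optional weights; equal weights are assumed if omitted
    """
    if weights is None:
        weights = [1.0 / len(neighbours)] * len(neighbours)
    length = len(signals[neighbours[0]])
    return [
        sum(w * signals[lead][i] for w, lead in zip(weights, neighbours))
        for i in range(length)
    ]


# Example: lead 44 is corrupted and is replaced using leads 43 and 45.
signals = {43: [1.0, 2.0, 3.0], 44: [100.0, -100.0, 0.0], 45: [3.0, 4.0, 5.0]}
signals[44] = substitute_lead(signals, [43, 45])
```

The choice of neighbors stays with the user, as described above; the sketch only performs the combination once that choice has been made.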
The previous two steps are the basis of the preprocessing and analysis, and the user can adjust them. As is obvious from Section 2.3, the delineation of the ECG signal is not a trivial issue in our case, because of the many differences that are present in seriously ill patients. Moreover, these cases are not only different from standard ECG, but they are also very variable in their manifestations. That is why the software must offer an expert the option to adjust the delineation. The system has to have a very robust technique for delineation precisely because of the variability in the data. Naturally, high robustness can come at the cost of precision in particular cases. Therefore, the software contains manual delineation for each lead in a special window (Figure 6), as well as the possibility to delineate a point over all leads with one click.
In addition to the signal modifications, the software includes a few auxiliary tools, such as a caliper for the precise measurement of time, a slider for setting the zero offset, a view of the V1-V6 unipolar leads, which are generated from selected BSPM grid leads, and a reset of the view to the default state. The software is programmed in Matlab and is built into an executable file (exe) for the end user, who needs only the Matlab runtime, which is freely available. The software is currently provided only in the Czech language because it is used in Czech hospitals.
Figure 6. An example of the window for manual delineation with the caliper. The button "Uložit" is the save button and the button "Zpět" is the undo button.

Conclusions
Multichannel measurement can be a useful technique for obtaining a better view of the temporal and spatial behavior of the heart's electrical field on the surface of the body. However, it is necessary to realize that the larger extent of the measurement also brings a higher occurrence of common problems (electrode-skin contact) and several specific problems (preparation time, patient comfort, different approaches to preprocessing and processing). Furthermore, CRT is indicated for patients with heart failure. Thus, the ECG of such patients does not have a standard morphology. This research aims to create a sufficiently robust and precise tool for preprocessing the measured BSPM signals, while considering the variability and complexity of such records and of the problem being solved. The developed system allows basic preprocessing and processing of the multichannel ECG measurement with a focus on the reduction of false results (the ECG of a CRT patient can be very different from standard ECG) and with an emphasis on the cooperation between the system and the expert who uses it. The software is a prototype, which we will extend with further signal processing features and which represents a gateway for subsequent systems that process BSPM data.
The acquisition of the study data also revealed one important issue that has to be considered in data science, namely the contrast between the complexity of the data acquired from one patient and the impossibility of generalizing the results, or even the algorithms, and applying them blindly to data acquired from other patients with the same diagnosis. The patients who are selected as candidates for CRT exhibit such high variability in almost all parameters that any statistical evaluation would not bring satisfactory and clinically valuable results. The designed algorithms and their implementation in the described software tool are intended to be part of a decision support tool for the clinician.