Real time reconstruction of quasiperiodic multi parameter physiological signals

A modern intensive care unit (ICU) has automated analysis systems that depend on continuous uninterrupted real time monitoring of physiological signals such as electrocardiogram (ECG), arterial blood pressure (ABP), and photo-plethysmogram (PPG). These signals are often corrupted by noise, artifacts, and missing data. We present an automated learning framework for real time reconstruction of corrupted multi-parameter nonstationary quasiperiodic physiological signals. The key idea is to learn a patient-speciﬁc model of the relationships between signals, and then reconstruct corrupted segments using the information available in correlated signals. We evaluated our method on MIT-BIH arrhythmia data, a two-channel ECG dataset with many clinically signiﬁcant arrhythmias, and on the CinC challenge 2010 data, a multi-parameter dataset containing ECG, ABP, and PPG. For each, we evaluated both the residual distance between the original signals and the reconstructed signals, and the performance of a heartbeat classiﬁer on a reconstructed ECG signal. At an SNR of 0 dB, the average residual distance on the CinC data was roughly 3% of the energy in the signal, and on the arrhythmia database it was roughly 16%. The diﬀerence is attributable to the large amount of diversity in the arrhythmia database. Remarkably, despite the relatively high residual diﬀerence, the classiﬁcation accuracy on the arrhythmia database was still 98%, indicating that our method restored the physiologically important aspects of the signal.


Introduction
A modern intensive care unit (ICU) employs multiple bedside monitors to track the state of patients. Automated analysis systems are used to analyze the signals collected by these monitors. Many of these systems run algorithms that are dependent on the morphologies of a signal or set of signals. For instance QRS detectors are used to identify beat boundaries [1], beat classifiers are used to identify ectopic beats [2,3], and morphological variability computes the morphological variations between successive heart beats [4]. These systems critically depend on continuous uninterrupted real time monitoring of the nonstationary physiological signals such as electrocardiogram (ECG), arterial blood pressure (ABP), and photoplethysmogram (PPG). Unfortunately, the utility of these systems is compromised by the fact that the signals are often severely corrupted by noise, artifacts, and missing data.
In this article, we address the problem of identifying and reconstructing corrupted regions in a multi parameter quasiperiodic signal in a clinically useful way ( Figure 1). Here, a multi parameter signal is a set of multiple timealigned signals obtained from multiple sensor sources. They could be of the same type, e.g., two channels of an ECG, or of different signal types, e.g., ECG channels, ABP, and PPG signals. Our technique is based upon learning a patient-specific model of the relationships between time-aligned signals and then reconstructing corrupted segments using the information available in correlated signals.
This study provides a layer between the signal acquisition system and the automated analysis system. As such it is complementary to any signal enhancement (e.g., removal of baseline wander for ECG signals [1]) done during signal acquisition and any techniques for coping with low SNR in the analysis algorithms.
The examples in this article are drawn largely from signals related to cardiovascular activity. These signals are generated by the same underlying system (the heart). Therefore, the periodicity of these signals is related to http://asp.eurasipjournals.com/content/2012/1/173 Figure 1 Reconstruction on ECG signal. An excerpt from record 200 in MIT-BIH Arrhythmia database (a). The first channel is added with white Gaussian noise at SNR 0 dB (b), and reconstructed using our method (c). Premature ventricular contraction (PVC) beats as identified by an ECG beat type classifier on these signals are also shown in the pictures. heart rate. Further, these signals also share the same repetitive structure, and hence, they are jointly quasiperiodic.
The types of noise artifacts and corruptions that these signals suffer include, missing data, physical activities, muscle artifacts (MA), electro-magnetic interference (EM), and baseline wander (BW) [1,5]. One or more signal can be corrupted at any time, but we consider the situation where at least one signal is uncorrupted. This is plausible because, unlike the signals, which are generally correlated as they share the same source (heart), corruptions have different sources and characteristics. For instance, electromagnetic interference affects ECG signals but not the PPG signal.
In experiments carried out on the MIT-BIH Arrhythmia Database, a two-channel database with many clinically significant arrhythmias, our method improved a standard algorithm [6] for detecting a kind of ectopic beat by more than 7× on a signal corrupted with white Gaussian noise, and increased the similarity to the original signal, as measured by the normalized residual distance, by more than 2.5×.
As discussed in Section 'Offline learning' , the problem of reconstructing a corrupted multi-parameter physiological signal was formally posed in the 11th annual PhysioNet/CinC challenge [7]. On this CinC dataset, our method performed well relative to the winning entry. The methods used by the respondents to the challenge are substantially different from those described in this article. In particular, none of them attempted to segment the quasi-periodic signals before reconstructing them. This impaired their performance on correlated quasi-periodic signals (such as ECG and ABP), but allowed them to process independent signals (e.g., respiration).
Recent attempts to handle the corruptions by noise, artifact, and missing data in physiological signals focus on using redundant measurements, and fusing data from multiple sensors. Researchers have developed methods to robustly estimate heart rate (HR) by fusing information from multiple signals [8]. In addition to various ECG channels, researchers also make use of ABP and PPG signals for HR estimation [9]. In this context, researchers have proposed segmentation methods for ABP and PPG that involve detecting the parts of the waveforms corresponding to the onset of a pulse [10,11]. These methods independently segment the physiological signals. Hence, they don't preserve the alignment between the segments across different signals.
The organization of this article is as follows. In Section 'Method' , we present our method and provide the mathematical framework of our study. In Section 'Results' , we discuss the measures of performance used to evaluate our method, and present the results of a series of tests in which comparisons are made using each of the performance measures. Finally, in Section 'Conclusion' , we summarize our study.

Method
We use a two-phase process designed to reconstruct corrupted signals in a way that improves the reliability of the automated systems that analyze these signals ( Figure 2).
We use anonymized public medical data in our study.

Segmentation
The segmentation stage decomposes a continuous physiological signal into intervals with clinically relevant morphologies. We consider a multi parameter signal represented by a matrix S, where each column represents an individual channel of the signal (e.g., an ECG channel, ABP or PPG) and each row represents a point in time. For simplicity, we assume that all the channels are sampled at the same rate so that the matrix S has a repetitive structure that is shared by all the channels in the structure. Physiological signals are typically segmented according to some well-defined notion. For example, QRS detectors that identify the R-peaks are typically used to segment ECG signals [12]. However, when there is significant corruption and noise in the signal, these methods fail to perform adequately. Use of correlated signals has been explored by researchers in the context of heart rate estimation, where a coarse estimation of segment lengths would suffice [8].
We jointly segment the multi parameter signal using template matching. The goal of the template-matchingbased joint segmentation is to divide the multi parameter http://asp.eurasipjournals.com/content/2012/1/173 Figure 2 Block diagram of our method. We use a two-phase process. In the first phase, we simultaneously segment the quasiperiodic multi parameter signal into basic segments (e.g., ECG beats) and estimate the signal qualities of all the signals in each segment. The second phase takes as inputs the segments and the signal quality estimates produced in the first phase. The method builds a database of templates from those segments in which all signals are of high estimated signal quality. When the method encounters a segment with a low estimated signal quality, it tries to reconstruct that segment using the best match from the database. The fundamental idea is to learn a set of morphological templates, and reconstruct the corrupted segments using them.
signal into quasiperiodic units (segments). The approach is different from typical ECG segmentation methods in two ways. First, it doesn't use any specialized knowledge of the characteristics of the signals. This allows the extension of the method to other types of signals (ABP and PPG). Second, a multi parameter signal is jointly segmented. Therefore, when one or more signals in the multi parameter signal are corrupted, uncorrupted signals are weighted more, and the joint segmentation is highly influenced by the uncorrupted signals.
The method uses an evolving template ( Figure 3), a short multi parameter signal, and matches it with a sliding window of the multi parameter signal. The method requires an initial template that can be provided manually or generated from the signal itself provided that the signal is uncorrupted at the beginning. In our experiments, we use the first two segments of the signal as identified by a widely used QRS detector [1]. Then the template is regularly updated to reflect the time evolution of the signal.
The algorithm continuously extracts a non-overlapping window from the signal, and identifies the segment boundary in the window by finding the prefix of the window that most closely matches the template. . The template is little longer than two segments. It contains two full segments of length 2 − 1 and − 2 ; the length of the template is . http://asp.eurasipjournals.com/content/2012/1/173 The matching is done using weighted time warping (WTW) [13], which minimizes the weighted morphological dissimilarity across all the signals. The warped distance between two one-dimensional time series gives the morphological dissimilarity. Time warping is necessary, because the template and current window can be of different lengths. The weights represent the estimated quality of each signal, which is again computed by the morphological similarity between the signal in the window and its counterpart in the template.
To follow the gradual changes that are common in physiological signals, the template is updated regularly.

Weighted time warping based template matching
Let S ∈ n×m be a multi parameter time series consisting of m physiological signals, and Z ∈ ×m be the initial template. The goal is to segment S into a set of , and segment corresponds to a single heart beat. Here, S [p i ,p j ) denotes the window in the target sequence S from time t = p i to t = p j − 1, where p i and p j are the row indices of the matrix S.
As addressed later, we require the template ( Figure 3) to be comprised of at least two segments to improve the accuracy of the matching. The template is used to find the segment Y i in Y. We also assume that we know the locations of the segment boundaries 1 , 2 , and in the template.
We start the process at some arbitrary point in time p start on the signal that is to be segmented. This need not be an actual segment boundary. We run the algorithm starting at p start , continuously segment S, and add the segments to Y. We also update Z to reflect the evolution of the time series.
An iteration of our method on a single channel ECG signal is illustrated in Figure 4. We start each iteration with the extraction of a window W = S [p i ,p i +v) from S at p i . Here, the window length is given by v = + e, where is the length of the template and e is cushion length. We add a cushion length because the signal being processed may contain segments longer than those of the template, and we wish to ensure that the window obtained from the signal can be matched to the two segments that comprise the template (see Section 'Window length'). In the following discussion, we use j to index individual signals in the multi parameter signal S. The window is then detrended and normalized so that the corresponding signals have the same peak-peak distances, and the morphological quality estimates {q j } m (Section 'Signal quality estimation') are computed, where q j represents the morphological similarity between the channel W j from W and the corresponding channel Z j from the template. For each channel j, a pairwise Euclidean distance matrix pD j is calculated between Z j and W j (Equations 2 and 1). The final distance matrix D is obtained by weighting pairwise distance matrix pD j with q j (Equation 3).  Picture shows an iteration of the segmentation process on an example containing one ECG channel (a). Starting from p i , a window W is extracted, and matched with template Z. The prefix of the windowŴ that best matches the template is found, and using the alignment between Z andŴ (b), the next segment boundary p i+1 is determined. http://asp.eurasipjournals.com/content/2012/1/173 In Equation 3, we extend dynamic time warping [14] to higher dimensions by introducing a weighted norm over individual signals to vary the influence exerted by each signal.
For one-dimensional sequences A =[ a 1 , a 2 , . . . , a x , . . . , a n ] and B =[ b 1 , b 2 , . . . , b y , . . . , b n ], we use a recursion (Equation 4) with local continuity and global path constraints (proposed as Type III and Type IV local continuity constraints in [14]) to calculate the accumulated distance between subsequences . These constraints ensure that there are no long horizontal or vertical paths in the matrix aD along the alignment. This prevents physiologically implausible alignments such as a long subsequence of a sequence being matched with a significantly shorter subsequence of another sequence.
Next, we trim W to obtain the prefixŴ = W [1, w ] so that it contains only the portion of the signal that matches the template, where the prefix length w is given by Equation 5.
If the prefixŴ doesn't contain two segments, the cushion length e is increased and the process is repeated. More detailed explanation is provided in Section 'Window length' .
Next, we find the optimal path alignment (φ(k) = (φ w (k), φ z (k)); 1 ≤ k ≤ K) between Z andŴ in the accumulated distance matrix aD. Here, K is the length of the alignment [15]. From the alignment, we obtain the point f inŴ that is matched with the segment boundary 2 in Z.
This corresponds to the segment boundary we are interested in, because 2 marks the end of the first segment in the template.
Using a two segments long template, we align the ends of the second segment, backtrack to the end of the first segment in that alignment, and find the matching points for these segments. Using both sides of the segment boundaries increases the accuracy of the segmentation process ( Figure 4(b)).
Then, using the corresponding length f , we update p i+1 Following the template update, the process is repeated to find the next segment boundary.

Template updating
To follow the gradual changes that are common in the physiological signals, we update the template regularly ( Figure 5). We do this when the variation of the segment lengths in the neighborhood is small, and the quality estimates of all the channels are above a threshold. We require the difference between the maximum segment length and the minimum segment length to be less than 25 % of the mean segment length over a moving time period of 60 seconds. We also require the minimum signal quality of any channel in the segment to be greater than a threshold.
We update the template by averaging the excerpt of the last two segments with the current template time-warped to match the excerpt length (Equation 8). The time warping is necessary, since the current template Z and the excerpt of the last two segments Z can be of different lengths. New template Z is given by, Here, ensures that the template consists of at least two segments. We vary the influence of the recent segments on the template through the constant 0 ≤ η ≤ 1, where η = 0 implies a static template. We chose η = 0.05, but the results are not sensitive on the choice of η.

Window length
The template Z contains two segments. We expect the window W to contain at least two segments so that the prefix of the window would match the template. When the cushion length e is not long enough, the window W = S [p i ,p i +v) ; v = +e would not contain two full segments. In such cases, we repeat the matching with a longer window.

Signal quality estimation
Our framework requires the assessment of the signal quality at multiple stages. In joint segmentation (Section 'Segmentation'), we weigh each signal's influence on the joint segmentation by the estimated quality of the signal. In reconstruction, we use signal quality estimates http://asp.eurasipjournals.com/content/2012/1/173 Figure 5 Template update. Picture shows the evolution of the template over different regions in the record 106 (a) from MIT-BIH Arrhythmia database. In Region A (d), the template (c) reflects the morphology of the signal in that region. When the process moves to Region B (f), the template (e) evolves to follow the morphology in that region. Particularly, S-wave in the QRS complex becomes less noticeable. The effect of the template update is prominent in Region D (h), where the signal contains bigeminy, and the template (g) successfully captures the change. Further, in Region C (b), since the signal quality was deemed to be poor, hence the template was not updated. http://asp.eurasipjournals.com/content/2012/1/173 to identify the corrupted segments. Further, in Section 'Ensembles' , we use the signal quality of the matches to fuse the matches for reconstruction.
If we had a model of the source, the signal quality assessment would be straightforward, the signal to noise ratio (SNR) gives an estimate of the signal quality. However, for physiological signals, the source characteristics vary significantly over the time. Therefore, researchers use indirect measures to estimate the signal quality of the physiological signals. These methods assume that some characteristics of the signals are known a priori, and the noise artifacts cause any deviation present in the signal.
There is a lot of literature on signal quality estimation of physiological signals [8,[16][17][18]. ECGSQI estimates the signal quality of a multi-channel ECG signal [8]. ABP-SQI [8] combines wSQI and jSQI to obtain the ABP signal quality. Deshmane estimates PPG signal quality using Hjorth [19] parameters [10]. Each of these methods generates independent quality estimations on different kinds of signals using signal specific characteristics, and hence the indices are not comparable with each other.
To be useful in our framework, we need the signal quality estimate to have the following properties.
• Independent of the beat type and the location of the corruption within the segment, • Comparable across different type of signals, such as ECG, ABP, and PPG signals, and • Estimated at the granularity of a heartbeat.
In our method, we hypothesize that the corrupted regions of the signals will be morphologically dissimilar from clean signals, as represented by our evolving template.
We use longest common subsequences (LCSS) to estimate the morphological similarity. LCSS is a variation of edit distance that is extended to real sequences [20]. Given two time series, A and B, LCSS maximizes the similarity between A and B by finding the longest subsequences from both sequences whose elements fall within δ x and δ y of each other in X-axis and Y-axis. The following dynamic programming function on sub-sequences A x = a 1 a 2 . . . a x and B y = b 1 b 2 . . . b y is used to achieve this [20].
Threshold δ x controls the time warping permitted and δ y bounds the pair wise differences of two similar elements. LCSS uses a bandwidth to estimate the similarity. This makes LCSS resilient against noise [20].
For the multi parameter signal S and the template Z, the signal quality of the kth signal in segment i is calculated from the morphological similarity between the window p i+1 ) , and the corresponding signal in the template B = Z k . The similarity L = LCSS (A, B)/min(n, m) gives the signal quality estimate q k of the kth signal in segment i. Here n and m are the lengths of the sequences A and B respectively. Figure 6 shows the signal qualities estimated by our method on an ECG signal.
To estimate the signal quality, we compare a window of the signal against the template. This would be valid if any difference between the template and the window were due to the noise or corruption present in the signal. However, the signal morphology evolves along the time. For instance, heart rate may increase, the amplitude of the ABP signal might drop, etc. We regularly update the template to accommodate the time evolution of the signal.

Comparison
Related methods such as ECGSQI and ABPSQI estimate the quality of the signal over a 10 s long region [8]. In contrast, the morphological similarity metric estimates the quality of the signal beat by beat. The ability to obtain a more fine-grained quality estimate helps us handle the bordering regions (start and end of the corrupted region) better.
Unlike the related methods, morphological similarity is applicable to each of, ECG, ABP and PPG signals. Therefore, we are able to use a single metric across different types of signals (Figure 7). This allows us to use the signal quality estimates obtained from the morphological similarity to weigh the individual signal's influence on the joint segmentation of a multi parameter signal.

Feature representation
In reconstruction, we search the database for the closest match to the current segment. Since we want to do it in real time with a growing database, we need a fast method of retrieval.
Since the segments are usually of different lengths, a direct comparison function, such as Euclidean distance is not suitable. On the other hand, variable length metrics such as DTW [14] and LCSS [20] are quadratic in the length of the sequence. Using one of these techniques would result in a complexity of O(n 2 ), where is the length of the sequence and n is the number of templates in the data base.
We use an intermediate feature-based representation of the segment to reduce the dimensionality of search. This represents a segment with a fixed length vector, hence two http://asp.eurasipjournals.com/content/2012/1/173 Estimated signal qualities of ECG. Signal quality estimated by the morphological similarity on ECG signal from MIT-BIH Arrhythmia database at Physionet.org. The signal qualities are bounded between 0 and 1. Segments C and D contain uncorrupted normal beats, and they get the highest signal quality estimates. Segment B is uncorrupted, but contains an abnormal beat. Yet, it receives a high signal quality estimate. Segments E and F are corrupted, and aptly get low signal quality estimates.
sequences can be compared in linear time in feature space. Just as importantly, it provides a level of abstraction that preserves relevant physiological information. Table 1 lists the set of features in the feature representation F i = {f j |j = 1, 2, . . . , 30}. The first four of them are related to the segment intervals, which capture the contextual details of the beat, followed by the features that are both physiologically and morphologically representative of the signal in the segment.
Once we generate the features from the segments, we normalize them so that each feature has zero mean and unit variance. Next, using the standardized features, we perform a nearest neighbor search in the database to select the best K = 20 matches. The search is done using locality sensitive hashing as described in [21]. From the top K matches, we find the candidate for replacementŶ i using both DTW distance and feature distance.
Here, DTW distance (f dtw ) is computed between the clean channels (correlated signals) of the segments (Y a k and Y a i ). The regularizer λ = 25 was empirically chosen. The summation is performed over all the features in the feature vector. The combination helps avoid over-fitting and balances the preservation of clinically relevant events and morphological similarity.

Reconstruction
Assume that we want to reconstruct the corrupted channel Y 1 i of the current segment Y i with the corresponding channelŶ 1 i from the replacement candidateŶ i . Further, Y a i is the correlated channel. We first verify that the purportedly uncorrupted channel of the current segment is sufficiently similar to the corresponding channel in the replacement candidate. We perform reconstruction only if the cost of the match c i is less than a threshold. If the cost c i is greater, we flag the segment Y i so that automated systems could avoid producing false alarms in those regions.
Since the length of the current segment Y i , and the length of the candidate found (match)Ŷ i are typically unequal, we next time warp the match with the current segment. Time warping is done by finding the optimal alignment φ(k) between the corruption-free channel of the current segment Y a i and the corresponding channel of the matchŶ a i (Equations 12 and 13).
We then replace each sample of the corrupted chan- , which is obtained from the median of the samples with which it is aligned.

Ensembles
Thus far, we have described applying our reconstruction framework in the context of a pair of correlated signals.
In this section, we describe how to extend it to a situation in which there is more than one uncorrupted correlated signal. For each uncorrupted correlated signal, we find the entry from the corresponding database that most closely matches the current segment. We weight each entry by the estimated quality of the reconstruction, and combine the reconstructions.
For the segment Y i , let Y 1 i be the corrupted channel, {Ŷ j i |j = 2, 3, . . . , m} be the matches found using correlated signals j = 2, 3, . . . , m, and {c j i |j = 2, 3, . . . , m} be the costs of the matches. The reconstructionŶ 1 i is obtained by fusing the matches, Here, q j i is the estimated signal quality of jth channel in ith segment, and r j i is the correlation coefficient between the 1st channel and jth correlated channel estimated around the segmentŶ j i . The choice of p influences the performance of our method. For instance, when we have a diverse set of signals (e.g., ECG channels, ABP, and PPG), lowering p to one seems to improve the performance. However, when we have a set of highly correlated signals (e.g., several ECG channels), making p → ∞ tends to perform better; this is similar to selecting the most useful signalŶ 1 Our approach is based on a "bucket of models" concept, and hence is an ensemble method. Typically, modelselection is performed by cross-validation in ensemble methods. However, because of the autoregressive nature of the data, we use the hypothesized quality of the reconstruction (w j ) to perform the model-selection (or modelfusion depending on the choice of p), and determine the model weights, which get updated in real time to follow the signal evolution.

Results
The organization of this section is as follows. In Section 'Segmentation' , we present the evaluation of the segmentation algorithm. In Section 'Reconstruction' , we discuss the methodology that is used to evaluate the reconstruction, and present the results of a series of tests in which comparisons are made using this evaluation methodology.
In our experiments, we artificially corrupt one or more channels of each record by adding different kinds of synthetic corruptions. Table 2 lists these corruptions.

Segmentation
We compare our method's effectiveness to that of a widely used QRS detection based segmentation method [1,25] that is known to be resilient against signal noise [12,26]. It is publicly available as open source software [6].
We use clean excerpts from the MIMIC data at Physionet.org [27] to evaluate our segmentation method. The database has 72 waveform records with several annotation sets including ECG beat labels. It includes recordings from multiple ECG channels, ABP, PPG, and other signal types ( Figure 8). The signals are sampled at 125 Hz.
We randomly extracted 1000 high quality 5 min long excerpts from the MIMIC data, for which both our method and the QRS detector were 100 % accurate in detecting the segment boundaries. First, we added additive white Gaussian noise (AWGN) at different Signal/Noise (SNR) levels to these excerpts and tested both the methods on them. We carried out three sets of tests. In the first set, we added white noise to all m channels, and in the second set to m − 1 channels. In the third set, we added transient corruption on five randomly selected non-overlapping 1 min long regions. We applied one of the following types of transient corruptions: signal interruption, exponential damping, overshooting and clipping, or superimposition of artificial low frequency signals and high frequency signals. Table 3 presents the summary of the experimental results on the artificially corrupted data. Average segment length is 521 ms. The reported error is the mean distance over the 1000 excerpts between an actual segment boundary and the closest segment boundary found by either method. Our method was able to identify the segment boundaries with reasonable accuracy even in the presence of significant additive noise on all channels. If any one of the channels is free of noise, the performance is comparable to the performance on the uncorrupted data. Under transient corruption, the average error was http://asp.eurasipjournals.com/content/2012/1/173

Additive white Guassian noise (AWGN): we use a standard AWGN algorithm [22]
Transient corruption: We use the following types of transient corruptions.
• Signal interruption: We make the amplitude of the signal zero.
• Overshooting and clipping: We amplify the signal by a randomly chosen factor (uniformly distributed) between 1 and 5, and clip at the maximum and minimum values of the original signal.
• Super imposition of high frequency signal: We add white noise of the same power as the signal, high pass filtered at 150 Hz.
• Super imposition of low frequency signal: We add white noise of the same power as the signal, low pass filtered at 0.05 Hz.
Non-Gaussian corruption: We use the following types of structured (non-Guassian) interferences in the evaluation of reconstruction. We use MIT-BIH noise stress test database [23] and nstdbgen [24] to generate these interferences.
• Electromagnetic interference (EM) • Muscle artifact (MA) • Baseline wander (BW) 2.89 ms for our method. This is comparable to that of the data with AWGN at SNR 10 dB on all m channels. Under severe transient corruption the QRS detector becomes totally unusable, whereas our method finds the segment boundary with reasonable accuracy.

Reconstruction
Since our method is based on patient-specific learning, the method of training is an important part of the evaluation.
We used two approaches to the training, called offline and online. We use the following evaluation methodology on all subsequent experiments.

Evaluation methodology
In our experiments, we artificially corrupt one channel, for instance the first channel S 1 of the signal S, and try to reconstruct it. We use two criteria to evaluate the effectiveness of the reconstruction.  1. Residual distance (r) : We measure the similarity between two sequences, for instance the reconstructed data (Ŝ 1 ) and the original uncorrupted data (S 1 ), by measuring the normalized Euclidean residual distance.
Here, n is the length of the signal S, and σ 2 s is signal variance of the corrupted channel. Similarly, r(S) can also be computed between the corrupted dataS 1 and the original data S 1 . 2. Classification accuracy ( ) : Our ultimate goal is to enable automated analysis systems to produce results that are more reliable. Hence, we test our method's ability to improve the classification accuracy of a clinically relevant task. We run a widely used premature ventricular contraction (PVC) detector [6], on the original data (S 1 ), the artificially corrupted data (S 1 ) and the reconstructed data (Ŝ 1 ), and record their agreements. If the PVCs are detected within 150 ms on two signals, we consider it an agreement. The disagreement is finally expressed in terms of the fraction between the total number of disagreements, and the total number of beats in the region.
We evaluated our study on two sets of data. Most of the experiments use the multi parameter physiological signal data from CinC Challenge 2010 Test Set at Physionet.org [27]. The diversity of heart beat types in the multi parameter dataset is limited, which makes it less than ideal for testing classification accuracy. Hence, we also evaluate our study on two-channel ECG data from the MIT-BIH arrhythmia database at Physionet.org [27], a widely used bench-mark database.
The CinC database has 100 waveform records of ECG, ABP, and PPG signals, each 10 min long, sampled at 125 Hz. We use only the 70 records that contain at least 3 channels that are relatively free of transient corruption and misalignment.
The arrhythmia database contains 48 half-hour ECG recordings, sampled at 360 Hz. The recordings were selected to include a variety of clinically significant arrhythmias. The database has 48 ECG waveform records; each contains two channels and is 30 min long. We use the 39 records from this set that are relatively free of significant corruption on both channels. An additional file lists the records from MIT-BIH Arrhythmia Database that were used in our experiments [see Additional file 1].
In our experiments, we artificially corrupt one channel of each record, and reconstruct it using our method. The original uncorrupted signal serves as a gold standard in our experiments. The method is evaluated by comparing both the corrupted channel and the reconstructed channel to the original channel. We use both additive white Gaussian noise (AWGN) and various kinds of non-Gaussian (structured) interference to generate synthetic corruption. We use MIT-BIH Noise Stress Test Database [28] and nstdbgen [23] to generate non-Gaussian interferences.

Offline learning
We build the template database using the first 80 % of each record, add AWGN at signal to noise ratio (SNR) 0 dB to last 20 % of the last channel of each record, and evaluate the effectiveness of reconstruction on this artificially corrupted data.
As Table 4 shows, our method improves the signal quality of the artificially corrupted multi parameter data by reducing the average residual distance by 33× and the average disagreement by 2.8×. Residual distance of the reconstructed data in the CinC multi parameter dataset is smaller than that in the two-channel arrhythmia dataset. We attribute this to the availability of multiple correlated signals in the multi parameter dataset and the utility of the ensemble approach that fuses the information from multiple correlated signals. On the other hand, the improvement in classification accuracy is higher for CinC dataset.
We attribute this to the abundance of PVC beats in that dataset, thus making it easier to find morphologically similar segments for rare beat types.

CinC 2011 challenge
In the 11th annual PhysioNet/CinC [7] contestants were asked to reconstruct, using any combination of available prior and concurrent information, 30 s segments of ECG, continuous blood pressure waveforms, respiration, and other signals that had been removed from recordings of patients in ICU. The submissions to the contest used several methods such as neural networks, Kalman and adaptive filters, hidden Markov models, and principal component analysis [24,[29][30][31][32].
The highest scored method [24] uses a multilayer perceptron neural network with a fixed length window to perform the regression. The method's reported normalized residual distance for the contest is 0.17, compared to our method's 0.02. To be fair, about 20 % of the contest involved signals that we did not test, a respiratory signal and two other pressure waveforms. These signals lack the quasiperiodic morphological structures, and have little correlation with the other signals, and therefore they are not suitable targets for our method.

Characteristics of interference
Although our method doesn't make any assumptions about the type or level of corruption, it's performance can depend upon the type and level of corruption. We use MIT-BIH Arrhythmia dataset for to test this. We corrupt the first 20 % of the first channel with AWGN at SNR levels of 10, 0, and −10 dB. Table 5 summarizes the average disagreement ( ), and the residual distance (r) for the reconstructed signal. At low SNR levels the performance doesn't deteriorate with decreasing signal quality. But, somewhat surprisingly, we get the worst performance at the highest signal to noise ratio. The relatively poor performance at SNR 10 dB can be attributed to the fact that our algorithm makes a binary decision to reconstruct the signal or leave it as it is. This results in relatively poor performance, because when the  signal is only mildly corrupted our algorithm chooses not to reconstruct it. Next, we test our method against different structured interferences. Again, we build our database from the first 80 % of each record. We corrupt the last 20 % of the first channel with the following types of corruptions at SNR 10 dB: additive white Gaussian noise (AWGN), electromagnetic interference (EM), muscle artifact (MA), and baseline wander (BW). We chose 10 dB because that was the noise level at which our method exhibited the worst performance in the test reported above. We use MIT-BIH Noise Stress Test Database [28] and nstdbgen [23] to generate non-Gaussian interferences. Table 6 summarizes the average disagreement ( ), and the residual distance (r) for the reconstructed signal. The worst performance was observed for AWGN, and EM noise.

Online learning
In this experiment, we add artificial corruption to the last minute of the data and to 1 min long regions starting at the 2nd, 4th, or 6th minute. The idea is to keep the span of the corrupted region to 20 % of the data, and evaluate the performance of our method. The algorithm builds an initial database from the first 2 min of the data, and adds new entries to the database whenever it encounters a segment that it judges to be relatively clean. Table 7 presents the results of the experiment. Our framework's performance seems to be independent of the location of corruption. Though, we start with a database built from only 20 % of the signal, we are able to achieve performance similar to the offline approach, where we built the database from the first 80 %. The only difference is a slight decline in classification accuracy. We attribute this to an increased probability that the training data doesn't include PVC's with morphologies similar to those in the test data. This is also consistent with the fact that our algorithm performs better when the training data is more proximate in time to the test data.

Conclusion
The goal of this study was to find a way to recover clinically useful information in corrupted quasiperiodic physiological signals.
The fundamental premise of this study is that given a multi parameter signal, information in some signals can be used to infer the value of other signals. Based on this premise, we approached the problem of reconstruction as a learning problem. The key technical idea is to use templates to capture the relationships between the morphologies of correlated signals. These templates are learned on a patient-specific basis, and they continue to evolve as data is processed. A level of abstraction is provided by converting variable length segments of a physiological signal into fixed length feature vectors that capture relevant properties of the segment and the context in which it occurs.
The use of a feature vector also plays a vital role in speeding up lookups so that reconstruction can be done in real time. Our method has a running time that is linear on the number of samples n, and hence is suitable for real time applications. Using two levels of indirection (the fixed length feature vector and locality sensitive hashing based nearest neighbor lookup) we achieve a running time that is independent of the size of the database.
We evaluated our method on MIT-BIH arrhythmia data, a two-channel ECG dataset with many clinically significant arrhythmias, and on the CinC challenge 2010 data, a multi-parameter dataset containing ECG, ABP, and PPG. For each, we evaluated both the residual distance between the original signals and the reconstructed signals and the performance of a heartbeat classifier on a reconstructed ECG signal. At an SNR of 0 dB, the average residual distance on the CinC data was only roughly 3 % of the energy in the signal, but on the arrhythmia database it was roughly 16 %. The difference is attributable to the large amount of diversity in the arrhythmia database. Despite the relatively high residual difference, the classification accuracy on the arrhythmia database was still 98 %, indicating that our method did restore the physiologically important aspects of the signal. The results strongly suggest that our initial premise is valid and that templates are indeed a good way to capture the relationships among correlated signals.
While we have tested our method only on ECG, ABP, and PPG signals, we believe that it could be useful in other multi-signal settings in which one or more signals are corrupted and at least one correlated signals is transiently uncorrupted [29,31]. Going forward, we plan to test our algorithm on a database containing simultaneous recordings of other signals such as CVP, PAP, and respiratory signals.
Because of inter-patient variation the use case for the method involves learning the patient's baseline before starting reconstruction. It continues to learn as the patient's baseline changes over time. If there is an abrupt change in the baseline, however, the method will not work. This might not be a practical problem, since it is probably appropriate that an abrupt change in the baseline triggers an alarm.
In other future study, we plan to explore using transfer learning to bootstrap the process. For instance, we would like to test the effectiveness of beginning the reconstruction process with a generic template database, and then adapt it to the current record by learning new morphological relationships in the current record. This is useful at the beginning of the reconstruction process when the database would be otherwise empty.

Additional file
Additional file 1: MIT-BIH Arrhythmia Database Records.