Temporal Compounding: A Novel Implementation and Its Impact on Quality and Diagnostic Value in Echocardiography

Temporal compounding can be used to suppress acoustic noise in transthoracic cardiac ultrasound by spatially averaging partially decorrelated images acquired over consecutive cardiac cycles. However, the reliable spatial and temporal alignment of the corresponding frames in consecutive cardiac cycles is vital for effective implementation of temporal compounding. This study introduces a novel, efficient, accurate and robust technique for the spatiotemporal alignment of consecutive cardiac cycles with variable temporal characteristics. Furthermore, optimal acquisition parameters, such as the number of consecutive cardiac cycles used, are derived. The effect of the proposed implementation of temporal compounding on cardiac ultrasound images is quantitatively assessed (32 clinical data sets providing a representative range of image qualities and diagnostic values) using measures such as tissue signal-to-noise ratio, chamber signal-to-noise ratio, tissue/chamber contrast and detectability index, as well as a range of clinical measurements, such as chamber diameter and wall thickness, performed during routine echocardiographic examinations. Temporal compounding (as implemented) consistently improved the image quality and diagnostic value of the processed images, when compared with the original data, by (i) increasing tissue and cavity signal-to-noise ratios as well as tissue/cavity detectability index, (ii) improving the corresponding clinical measurement repeatability and inter-operator measurement agreement, while (iii) reducing the number of omitted measurements caused by data corruption. (E-mail: A.Perperidis


INTRODUCTION
Transthoracic echocardiography, although a valuable tool for the assessment of cardiac morphology and function, suffers from a range of artifacts caused by the interaction of the transmitted ultrasound with anatomic structures, such as bone, lung and fat. Acoustic noise combined with speckle (Burckhardt 1978; Goodman 1976; Wagner et al. 1983) can limit the delineation of cardiac structures, obscuring fine anatomic detail. Furthermore, reverberations and shadowing, which obscure portions of the imaged structure (Feigenbaum et al. 2005; Sutton and Rutherford 2004), may appear momentarily or alter their position and orientation throughout a scan because of the patient's respiratory motion. Such artifacts may limit the diagnostic value of the acquired cardiac ultrasound images. Moreover, they limit the effectiveness of image processing techniques, such as image registration and segmentation, that enable the development of tools that have been found to enhance the accuracy, robustness and repeatability of the diagnostic process for computed tomography (CT) and magnetic resonance imaging (Maintz and Viergever 1998; Makela et al. 2002; Pham et al. 2000). Although advances in data acquisition technologies have substantially improved cardiac ultrasound data, a considerable portion of cardiac scans provide low-quality images of limited diagnostic value. Consequently, there is research interest in the development of effective post-processing methods that address these limitations, enhancing the image quality and diagnostic value of cardiac ultrasound.
Over the years, a number of approaches for enhancing cardiac ultrasound images have been suggested. Spatial compounding is a popular approach that suppresses noise by combining partially decorrelated images of an anatomic structure whose speckle patterns have been modified by imaging the target region of interest from varying viewing angles. During spatial compounding, no potentially valuable clinical information is filtered out. Instead, tissue structures present in all the partially decorrelated views of the scanned structure are enhanced, whereas artifacts not present in all views are suppressed. Consequently, spatial compounding approaches appear to be inherently more suitable for the enhancement of cardiac ultrasound images than filtering. A number of studies have successfully employed spatial compounding through transducer repositioning for the enhancement of 3-D cardiac ultrasound data (Gooding et al. 2010; Mulder et al. 2014; Rajpoot et al. 2009, 2011; Szmigielski et al. 2010; Yao and Penney 2008; Yao et al. 2010). However, the acquisition of independent cardiac views through different acoustic windows using 2-D ultrasound is very challenging because (i) all images need to be acquired over the same or a very similar scan plane, and (ii) a substantial overlap between the individual heart views is required.
Some studies have attempted to enhance 2-D cardiac ultrasound images by averaging the intensity levels from temporally consecutive frames (Achmad et al. 2009; dos Reis et al. 2008, 2009; Li et al. 1994; Petrovic et al. 1986). Because of the constant motion of the heart, the consecutive frames are partially decorrelated, and consequently, spatial compounding can reduce noise and speckle in the processed images. On the other hand, the deformation of the heart among the consecutive frames may result in blurring of the compounded structures. Lin et al. (2010) attempted to suppress the introduced blurring by using a hierarchical, motion-compensating technique to spatially align (warp) up to nine frames. Qualitative and quantitative assessment revealed considerable noise reduction and enhancement of anatomic structures. However, the technique relied heavily on the accurate non-linear registration of consecutive cardiac ultrasound frames. Currently, the applicability of non-linear image registration methods is limited for a large proportion of cardiac ultrasound scans because of high levels of noise and low contrast. Consequently, the applicability of this noise reduction method is limited to cardiac ultrasound images with low levels of noise.
Other studies have used the repeated rhythmic contractions of the heart to acquire multiple 2-D images of the same cardiac phase over consecutive cardiac cycles through a single acoustic window. Minor random movements during a multicycle image acquisition alter the scan plane, resulting in partially decorrelated views of the imaged cardiac structure. Spatially compounding such partially decorrelated frames corresponding to the same cardiac phase can therefore produce enhanced cardiac images. The process has been referred to as temporal compounding (Abiko et al. 1997; Perperidis et al. 2009).
Accurate and robust spatiotemporal alignment of corresponding frames acquired over multiple cardiac cycles is essential for effective temporal compounding (Rohling et al. 1997). Insufficient alignment may result in severe blurring of the imaged cardiac structure, substantially reducing the diagnostic value of the processed images. Spatial alignment is required to compensate for larger cardiac movements during the multicycle image acquisition. Such displacements occur mostly because of probe slippage and changes in heart orientation during the periodic respiratory motion of the patient's chest. Furthermore, the temporal behavior of a heart may vary during a multicycle cardiac ultrasound examination. Variations in the cardiac temporal dynamics range from small, for healthy hearts, to large, for hearts suffering from arrhythmia or other cardiac diseases. Moreover, temporal variations can be global, such as differences in the length of cardiac cycles, or local, such as differences in the length of each of the seven independent phases within a cardiac cycle (Berne et al. 2004; Bray et al. 1999; Guyton 1991; Guyton and Hall 1997). In general, these variations tend to be non-linear, with greater effect in the relaxation phase of the cardiac cycle. Consequently, the temporal relationship between any two image sequences is required to compound frames at corresponding stages within the cardiac cycle (Fig. 1).
Van Ocken et al. (1981) first identified the potential of fusing information acquired over consecutive cardiac cycles to enhance the quality of ultrasound data sets (Sinclair et al. 1983). Unser et al. (1989) suppressed noise in M-mode ultrasound scans by averaging data acquired over a number of consecutive cardiac cycles. Rigney and Wei (1988) and Vitale et al. (1993) described the earliest attempts to use compounding of partially decorrelated B-mode images acquired over consecutive cardiac cycles. Rigney and Wei (1988) employed an exhaustive search along with template matching to identify and compound all frames in the multiframe sequence that corresponded to a specific reference frame. This approach, although capable of generating good temporal alignment, remains (even today) a very computationally intensive choice. Vitale et al. (1993) identified the end-diastole (ED) frames from each cardiac cycle by analyzing the recorded electrocardiographic signal. Corresponding frames from consecutive cardiac cycles, extracted at regular temporal intervals to the ED frames, were then spatially compounded by intensity averaging. Similar approaches have been adopted as a pre-processing step for more effective image segmentation of cardiac structures (Amorim et al. 2009; Melo et al. 2010). A limitation in these studies was that no spatial alignment was performed on the temporally aligned frames prior to intensity averaging. Klingler et al. (1989) recognized the need for spatial alignment prior to compounding and described an empirical method to reject (but not compensate for) temporally aligned frames that demonstrated large spatial displacement with respect to the reference frame. Olstad (2002) extended the approach by Vitale et al. (1993) by introducing a rigid spatial alignment to compensate for larger cardiac movements during image acquisition. Similar to the aforementioned attempts (Amorim et al. 2009; Melo et al. 2010; Vitale et al. 1993), Olstad's (2002) approach suffered from two major limitations. First, although electrocardiography (ECG) enables accurate identification of the ED frames, it is much more challenging to extract accurate information related to any of the other phases of the cardiac cycle (Friesen et al. 1990). Second, the study assumed that cardiac cycles occurred in regular time intervals. Such an assumption can limit the effectiveness of the technique by introducing tissue/chamber blurring in the resultant images.
Abiko et al. (1997), in an early implementation of temporal compounding, used normalized cross-correlation and exhaustive search to identify all frames in a multicycle data set corresponding to each frame in a reference cardiac cycle. To reduce its computational requirements, the spatiotemporal alignment used 1-D intensity information extracted from the central scan line of each frame. The study demonstrated the potential for an accurate temporal alignment (i) without making any assumptions on the characteristics of the cardiac cycle and (ii) without use of ECG information. However, the study used very limited information (a single line) from potentially noisy images to perform the critical task of spatiotemporal image alignment. Moreover, the method did not allow for image rotation or translation along the long axis occurring in a multicycle scan.
The literature discussed above recognizes that compounding partially decorrelated images from consecutive cardiac cycles can result in considerable noise suppression and enhance visually weak cardiac structures (Abiko et al. 1997; Amorim et al. 2009; Melo et al. 2010; Olstad 2002; Rigney and Wei 1988; Sinclair et al. 1983; Unser et al. 1989; van Ocken et al. 1981). However, although accurate and robust spatiotemporal image alignment has been found to be a key process for effective temporal compounding, many implementations attempted to date fail to provide a reliable and effective registration method. The aim of this article is to introduce a novel alignment approach that addresses the variable and sometimes non-linear spatiotemporal characteristics of consecutive cardiac cycles. Extending the early implementations presented in Abiko et al. (1997) and Perperidis et al. (2009), it uses a more versatile (up to seven control points) and robust non-linear temporal interpolation, as well as spatial alignment of frames prior to compounding. Moreover, none of the previous studies have attempted to assess the effect of temporal compounding on clinical cardiac ultrasound images in a comprehensive manner. Consequently, this article also aims to (i) investigate how closely the proposed approach can correct for the non-linear temporal characteristics of consecutive cardiac cycles in comparison to past temporal alignment approaches, and (ii) assess the effect of temporal compounding on patient cardiac ultrasound scans using a range of quantitative image quality measurements as well as routine clinical measurements. Optimal acquisition parameters, such as the ideal number of cardiac cycles to use in the proposed implementation of temporal compounding, are also derived.

Data acquisition and classification
Multicycle cardiac data from 32 patients were acquired by two experienced echocardiographers in the Echocardiography Department of Western General Hospital, Edinburgh. All data sets used in this study were gathered from fully anonymized cine loops recorded during routine clinical examinations in the course of normal care. No prior intention for their use in research existed at the time of collection. Consequently, no National Health Service (NHS) ethics approval was required under the terms of the Governance Arrangements for Research Ethics Committees: A Harmonised Edition (DH Research and Development Directorate (UK) 2011, available at: www.dh.gov.uk). The data set range was considered to be representative of patients examined in the department, as no patient- or condition-specific selection process was employed. No data sets were discarded for reasons of image quality or irregular temporal characteristics. The only selection criteria used were that the desired acquisition view and imaging mode had been used and that the duration of the data set was sufficiently long for our purposes. More precisely, the data were acquired using a GE Vivid 7 Dimension ultrasound scanner (GE Healthcare, Little Chalfont, UK) operating in tissue harmonic imaging mode using a 3-MHz phased array probe. Only data consisting of at least 25 cardiac cycles of the parasternal long-axis (PLAX) view were used. All patient data had been acquired according to the standards adopted by the British Society of Echocardiography (BSE) (Feigenbaum et al. 2005; Henry et al. 1980; Wharton et al. 2012). The PLAX view was used as it is commonly employed in clinical practice, enabling (i) the visualization of a range of cardiac structures revealing variable spatiotemporal characteristics over consecutive cardiac cycles, and (ii) the acquisition of a range of clinical measurements, such as the interventricular septum (IVS) thickness, left ventricle internal dimension (LVID) and left ventricle posterior wall (LVPW) thickness, which are essential during the diagnostic process. Images were captured at 25 frames/s (fps). Other acquisition parameters such as acquisition depth, focus depth, sector width, gain and time gain compensation (TGC) were optimally set by the echocardiographer for each subject. Finally, B-mode image sequences of 434 × 636 pixels were exported in DICOM format with no compression applied. Prior to any processing, each data set was manually labeled by the echocardiographer as high (12), average (12) or low (8) image quality and diagnostic value.

Data analysis
Temporal compounding was implemented using a three-step process (Fig. 2): (i) temporal alignment of the multicycle data to a reference cardiac cycle, (ii) spatial alignment of the temporally aligned frames and (iii) spatial compounding of the spatiotemporally aligned data. Decoupling the temporal and spatial registration steps reduces the computational complexity of the overall process.
Step 1: Temporal alignment
Temporal alignment was divided into four substeps:
Identification of end-diastole and end-systole frames. A semi-automatic approach was proposed for identifying all end-diastole (ED) and end-systole (ES) frames within a multicycle data set. The method used exclusively intensity information from the B-mode image sequence and required the manual identification of one ED (ED1) and one ES (ES1) frame. During systole, because of left ventricular contraction, each consecutive frame appears less similar to ED1 and more similar to ES1. Likewise, during diastole, because of left ventricular relaxation, each consecutive frame becomes more similar to ED1 and less similar to ES1. The similarity between each subsequent frame and the ED1 and ES1 frames can therefore be estimated using normalized cross-correlation (NXC) (Lewis 1995)

$$\mathrm{NXC} = \frac{\sum_{x,y}\left[S_0(x,y)-\bar{S}_0\right]\left[S_i(x,y)-\bar{S}_i\right]}{\sqrt{\sum_{x,y}\left[S_0(x,y)-\bar{S}_0\right]^2\,\sum_{x,y}\left[S_i(x,y)-\bar{S}_i\right]^2}} \qquad (1)$$

where $S_0$ corresponds to ED1 or ES1, $S_i$ is the ith frame in the sequence and $\bar{S}_i$ is its mean intensity ($\bar{S}_0$ is defined analogously for $S_0$).
Each end-diastolic frame should exhibit maximum similarity to ED1 and minimum similarity to ES1. Similarly, each end-systolic frame should exhibit maximum similarity to ES1 and minimum similarity to ED1. To compensate for the high noise levels contained in cardiac ultrasound data, the combined correlation coefficient (CCC), combining information on the similarity of each frame with respect to both ED1 and ES1, is defined as

$$\mathrm{CCC} = C_{ED} - C_{ES} \qquad (2)$$

where $C_{ED}$ is the correlation coefficient of a frame with respect to ED1, and $C_{ES}$ is the correlation coefficient of a frame with respect to ES1. CCC is expected to exhibit stronger local maxima and minima on all ED and ES frames, respectively, when compared with the individual NXC profiles.
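The ED/ES identification described above can be illustrated with a minimal Python sketch. It assumes the zero-mean form of NXC, represents each frame as a flat list of pixel intensities, and treats strict local maxima and minima of the CCC profile as ED and ES candidates. The function names (`nxc`, `ccc_profile`, `local_extrema`) and the synthetic four-pixel frames are hypothetical and serve only to make the example self-contained; they are not taken from the study.

```python
from math import cos, pi, sqrt

def nxc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized frames
    (assumed form), each given as a flat list of pixel intensities."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def ccc_profile(frames, ed_seed, es_seed):
    """CCC = C_ED - C_ES for every frame, given the indices of the two seed frames."""
    return [nxc(frames[ed_seed], f) - nxc(frames[es_seed], f) for f in frames]

def local_extrema(profile):
    """Indices of strict local maxima (ED candidates) and minima (ES candidates)."""
    maxima, minima = [], []
    for i in range(1, len(profile) - 1):
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]:
            maxima.append(i)
        elif profile[i] < profile[i - 1] and profile[i] < profile[i + 1]:
            minima.append(i)
    return maxima, minima

# Synthetic data (hypothetical): frames blend an "ED" and an "ES" pattern
# with a period of 8 frames, so ED falls on frames 0, 8, 16 and ES on 4, 12.
ED_PATTERN = [9.0, 1.0, 5.0, 1.0]
ES_PATTERN = [1.0, 7.0, 5.0, 9.0]
frames = []
for k in range(17):
    w = (1 + cos(2 * pi * k / 8)) / 2
    frames.append([w * e + (1 - w) * s for e, s in zip(ED_PATTERN, ES_PATTERN)])

profile = ccc_profile(frames, ed_seed=0, es_seed=4)
ed_frames, es_frames = local_extrema(profile)
```

The first and last frames are never reported because a local extremum needs a neighbor on both sides; a practical implementation would also need some smoothing or a minimum peak separation on noisy clinical data.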
Selection of a representative reference cardiac cycle. A reference cardiac cycle, representative of the temporal characteristics within a multicycle data set, was automatically selected for all remaining cardiac cycles to be spatiotemporally registered to. For the extraction of a representative reference cardiac cycle, a weighting factor $W_i$ was defined for each cardiac cycle, where $DL_i$ and $SL_i$ represent the current cardiac cycle's diastole and systole lengths (in number of frames), respectively, and $\overline{DL}$ and $\overline{SL}$ represent the mean diastole and systole lengths, respectively, over the whole multicycle data set. The cardiac cycle with the lowest weighting factor was considered the most representative within the data set and was therefore selected as the reference cycle.
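As a rough illustration of the reference cycle selection, the sketch below picks the cycle whose diastole and systole lengths are closest to the data set means. The exact expression for the weighting factor is not reproduced here, so the form used, the sum of the relative absolute deviations of $DL_i$ and $SL_i$ from their means, is purely an assumption, as are the function name `select_reference_cycle` and the example lengths.

```python
def select_reference_cycle(diastole_lengths, systole_lengths):
    """Return (index of reference cycle, list of weighting factors).

    ASSUMED weighting: W_i = |DL_i - mean(DL)| / mean(DL)
                           + |SL_i - mean(SL)| / mean(SL).
    The study's own definition of W_i may differ.
    """
    n = len(diastole_lengths)
    mean_dl = sum(diastole_lengths) / n
    mean_sl = sum(systole_lengths) / n
    weights = [abs(dl - mean_dl) / mean_dl + abs(sl - mean_sl) / mean_sl
               for dl, sl in zip(diastole_lengths, systole_lengths)]
    # The cycle with the lowest weighting factor is the most representative.
    reference = min(range(n), key=weights.__getitem__)
    return reference, weights

# Hypothetical diastole/systole lengths (in frames) for five cardiac cycles.
dl = [14, 16, 15, 20, 15]
sl = [9, 10, 10, 8, 11]
reference, weights = select_reference_cycle(dl, sl)
```

With these example lengths the second cycle (index 1) is selected, since its diastole length equals the mean and its systole length is the closest to the mean.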
Identification of additional control points. For a more representative and accurate temporal alignment, additional frames corresponding to different stages of a cardiac cycle were identified to act as extra control points (CPs) during the interpolation process. Such frames were initially introduced at regular temporal intervals in the systole and diastole phases of the reference cardiac cycle, one frame in each phase for a five-stage representation (ED1-CP1-ES-CP2-ED2), and two frames in each phase for a seven-stage representation (ED1-CP1-CP2-ES-CP3-CP4-ED2). Then, for every cardiac cycle individually, the NXC between each of the additional CPs in the reference cardiac cycle and each frame in the corresponding systole or diastole phases was derived. Frames corresponding to these additional cardiac stages would induce a global maximum in NXC. Appropriate constraints were used to avoid the temporal interchange between CPs. In cases where a clear local maximum could not be derived, the corresponding cardiac cycle was omitted during the temporal mapping and spatial compounding stages.
Interpolation process. The final stage of the temporal alignment between two cardiac cycles aims to generate a transformation function $T_{temporal}: (t) \rightarrow (t')$ establishing a correspondence between time, $t$, in the aligned frame sequence and time, $t'$, in the reference frame sequence. $T_{temporal}$ was decoupled into independent global, $T^{global}_{temporal}$, and local, $T^{local}_{temporal}$, components:

$$T^{local}_{temporal}(t) = \begin{cases} g_1(t), & \text{if } CP_1 < t < CP_2 \\ \;\;\vdots \\ g_i(t), & \text{if } CP_i < t < CP_{i+1} \\ \;\;\vdots \end{cases}$$

where $n \in \{2, 3, 5, 7\}$ and $g_i(t) = at + b$, with the coefficients $a$ and $b$ determined by the control points $CP_i$ and $CP_{i+1}$ bounding each segment. For cardiac cycles defined by five or seven individual stages ($n \in \{5, 7\}$), $T^{local}_{temporal}$ was also modeled using a 1-D relaxed uniform interpolating cubic B-spline curve (Barsky 1982; Caglar et al. 2006). Given a set of control point pairs $P_1$ to $P_n$, where $P_i = [CP'_i, CP_i]$, a set of control points, $S_i$, was derived, defining a B-spline curve interpolating through $P_1$ to $P_n$ (Baker 2002; Caglar et al. 2006),

$$\begin{bmatrix} 4 & 1 & 0 & \cdots & 0 \\ 1 & 4 & 1 & \cdots & 0 \\ 0 & 1 & 4 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 4 \end{bmatrix} \begin{bmatrix} S_2 \\ S_3 \\ S_4 \\ \vdots \\ S_{n-1} \end{bmatrix} = \begin{bmatrix} 6P_2 - S_1 \\ 6P_3 \\ 6P_4 \\ \vdots \\ 6P_{n-1} - S_n \end{bmatrix}$$

with $S_1 = P_1$ and $S_n = P_n$ representing the start and end points. $T^{local}_{temporal}$ was then modeled as a series of uniform cubic B-spline segments (Barsky 1982), where $i = \{1, 1, 1, 2, 3, \ldots, n-1, n, n, n\}$, $u \in [0, 1]$, $S_i$ represents the ith control point and $B_l$ represents the lth basis function of the B-spline curve. Multiple instances of the first and last control points were used so the curve interpolated through control points $P_1$ and $P_n$.
Temporal interpolation, $T_{temporal}(t)$, was applied between the reference cardiac cycle and all the remaining cardiac cycles within a multicycle B-mode frame sequence (Fig. 3). Nearest-neighbor interpolation on the transformation curve was employed to allocate corresponding frames to each frame within the reference cardiac cycle. Appropriate constraints were used to ensure monotonically increasing temporal mapping.
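The linear variant of the interpolation can be sketched as follows. For simplicity, this example maps each reference frame index directly to a frame index in the cycle being aligned (the inverse direction of the $t \rightarrow t'$ notation used above), applies nearest-neighbor rounding and clamps the result so the mapping never decreases. The function names and the control point values are hypothetical; the B-spline variant is not shown.

```python
def piecewise_linear_map(t, ref_cps, cyc_cps):
    """Map reference-cycle time t to the aligned cycle using one linear
    segment between each pair of consecutive control points."""
    for i in range(len(ref_cps) - 1):
        t0, t1 = ref_cps[i], ref_cps[i + 1]
        if t0 <= t <= t1:
            slope = (cyc_cps[i + 1] - cyc_cps[i]) / (t1 - t0)
            return cyc_cps[i] + slope * (t - t0)
    raise ValueError("t lies outside the reference cardiac cycle")

def align_cycle(ref_cps, cyc_cps):
    """Allocate one frame of the aligned cycle to every reference frame."""
    allocated = []
    for t in range(ref_cps[0], ref_cps[-1] + 1):
        # Nearest-neighbor interpolation on the transformation curve.
        frame = int(piecewise_linear_map(t, ref_cps, cyc_cps) + 0.5)
        # Simple constraint (assumed): never step backwards in time.
        if allocated and frame < allocated[-1]:
            frame = allocated[-1]
        allocated.append(frame)
    return allocated

# Hypothetical three-stage example (ED1, ES, ED2), frame indices per cycle.
ref_cps = [0, 10, 24]    # reference cycle: 10 systolic, 14 diastolic frames
cyc_cps = [30, 38, 58]   # cycle to align: 8 systolic, 20 diastolic frames
allocated = align_cycle(ref_cps, cyc_cps)
```

Adding control points (five or seven stages) only lengthens the two lists; each additional pair lets the mapping follow local variations within systole or diastole more closely.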
Step 2: Spatial alignment
The last step prior to spatial compounding was spatial registration of the temporally aligned frames to relate each point of an image, $I$, to the corresponding anatomic point in the reference image, $I'$ (Brown 1992). A rigid body transformation was applied for the spatial registration and consisted of rotation and translation components combined in a single transformation

$$T = \begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ t_x & t_y & 1 \end{bmatrix}$$

where $t_x$ and $t_y$ are the translation parameters along the x and y axes, respectively, and $\theta$ represents the rotation angle. Although non-rigid spatial registration can provide a more accurate alignment, it may also result in undesired deformation of the cardiac anatomy and is therefore not advisable. Bilinear interpolation was applied during the image transformation process because it has been found to provide the best trade-off between accuracy and computational complexity (Zitova and Flusser 2003).
Nelder and Mead's (1965) intrinsic optimization method, using image similarity information, was employed to derive the optimal transformation, $T$, which maximized the NXC between the registered images:

$$\arg\max_{t_x,\,t_y,\,\theta} \mathrm{NXC}\left(I', T(I)\right) \qquad (10)$$

Nelder and Mead's (1965) simplex approach is known to provide a good trade-off between robustness and convergence time (Krucker et al. 2000; Meyer et al. 1999; Shekhar and Zagrodsky 2002; Zagrodsky et al. 2001). Lagarias et al. (1998) provide a thorough description of the algorithm.
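To make the rigid body transformation concrete, the sketch below builds a 3 × 3 homogeneous matrix from a rotation angle and two translations and applies it to a single point. It assumes a row-vector convention, $[x, y, 1]\,T$, with the translation parameters in the bottom row; this convention, the function names and the example values are assumptions made for illustration. The simplex optimization of the three parameters and the bilinear resampling of the image are not shown.

```python
from math import cos, sin, radians

def rigid_matrix(theta, tx, ty):
    """Homogeneous rigid transform (rotation theta in radians, translation
    tx, ty), written for row vectors: [x, y, 1] @ T (assumed convention)."""
    c, s = cos(theta), sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [tx, ty, 1.0]]

def transform_point(point, matrix):
    """Apply the transform to a single (x, y) point."""
    row = (point[0], point[1], 1.0)
    out = [sum(row[k] * matrix[k][j] for k in range(3)) for j in range(3)]
    return out[0], out[1]

# Rotate the point (1, 0) by 90 degrees, then shift it by (2, 3).
T = rigid_matrix(radians(90), tx=2.0, ty=3.0)
x_new, y_new = transform_point((1.0, 0.0), T)
```

In a registration loop, the three parameters would be proposed by the optimizer, the moving frame resampled under the resulting transform and the NXC with the reference frame used as the score to maximize.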
Step 3: Spatial compounding
In this final step, each frame, $I$, within the reference cardiac cycle was replaced by a compound frame, $I'$, generated from the spatiotemporally aligned images, one from each cardiac cycle. The mean (M) and standard deviation (SD) of the similarity (NXC) between the reference frame and all the corresponding spatiotemporally aligned frames were derived. Frames with NXC outside the M ± SD region were discarded to avoid the compounding of dissimilar frames that would result in tissue/chamber boundary blurring. Intensity averaging was used as a well-established and effective spatial compounding method for noise suppression in ultrasound data sets. The intensity of each pixel within a compound frame was therefore set to the average intensity of the corresponding pixels from all the spatiotemporally aligned data

$$I'(x, y, t) = \frac{1}{N}\sum_{i=1}^{N} I(x, y, t_i)$$

where $N$ is the number of cardiac cycles used during compounding, and $I(x, y, t_i)$ represents the corresponding spatiotemporally aligned frame on the ith cardiac cycle of the data set.
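The rejection and averaging steps can be sketched in a few lines of Python. The example assumes that frames are flat lists of pixel intensities, that the list of aligned frames includes the reference frame itself, that NXC takes the zero-mean form and that M and SD are the population mean and standard deviation of the NXC scores; the function names and pixel values are hypothetical.

```python
from math import sqrt

def nxc(a, b):
    """Zero-mean normalized cross-correlation of two flat frames (assumed form)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def compound_frame(reference, aligned, eps=1e-12):
    """Average the aligned frames whose NXC with the reference lies in M +/- SD.

    Returns the compound frame and the number of frames actually averaged."""
    scores = [nxc(reference, f) for f in aligned]
    m = sum(scores) / len(scores)
    sd = sqrt(sum((s - m) ** 2 for s in scores) / len(scores))
    kept = [f for f, s in zip(aligned, scores)
            if m - sd - eps <= s <= m + sd + eps]
    compound = [sum(pixels) / len(kept) for pixels in zip(*kept)]
    return compound, len(kept)

# Hypothetical 4-pixel frames: three similar views and one dissimilar frame.
reference = [10.0, 0.0, 10.0, 0.0]
aligned = [
    [10.0, 0.0, 10.0, 0.0],   # the reference frame itself
    [12.0, 2.0, 8.0, 0.0],
    [9.0, 1.0, 11.0, 1.0],
    [0.0, 10.0, 0.0, 10.0],   # badly aligned frame, expected to be rejected
]
compound, n_used = compound_frame(reference, aligned)
```

In this example the fourth frame is anticorrelated with the reference, falls outside the M ± SD band and is excluded, so the compound frame is the pixel-wise mean of the remaining three.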

Clinical assessment
Two experienced echocardiographers independently assessed the effect of temporal compounding on the diagnostic value of cardiac ultrasound images. Quantitative assessment was achieved by performing routine clinical measurements on ED and ES frames from both the original unprocessed and the temporally compound data. The quantitative analysis of such measurements provided valuable information on whether any boundary blurring introduced has a clinically limiting effect on the compound data sets. More precisely, a sequence of ED frames was presented and the interventricular septum thickness (IVSd), left ventricle internal dimension (LVIDd) and left ventricle posterior wall thickness (LVPWd) measurements were made on each frame (Fig. 4a). Similarly, a sequence of ES frames was presented, and LVIDs and left atrium dimension (LADs) measurements were made on each frame (Fig. 4b). Each frame sequence contained one original and one compound frame for each of the data sets (64 frames in total). The selected measurements are widely used during routine clinical cardiac ultrasound examinations and provide valuable information on the state and function of the examined heart. More information on the clinical measurements performed in cardiac ultrasound examinations can be found in Feigenbaum et al. (2005). The order of the frames was randomized to ensure no bias in the results. The echocardiographers had the option to abstain from a clinical measurement if they considered there were insufficient visual cues for accurate measurement of the structure in the displayed image. All clinical measurements were performed twice, according to the standards adopted by the BSE (Feigenbaum et al. 2005; Fuster et al. 2008; Henry et al. 1980; Wharton et al. 2012), to enable evaluation of measurement repeatability and agreement between the two techniques (Bland and Altman 1986).

Temporal alignment
The principal steps during the temporal alignment process were (i) identification of ED and ES frames and (ii) temporal interpolation of each individual cardiac cycle in the multicycle data set to a representative reference cardiac cycle. The accuracy and robustness with which these two steps are executed have a direct effect on the effectiveness of the temporal alignment and, as a result, the temporal compounding process.
Identification of ED and ES frames. All 25 ED and ES frames in each multicycle data set were manually identified using multiple periodic visual cues. To assess intra-operator variability, the manual ED and ES detection was repeated three times for each data set by an experienced operator. The value with the most common occurrence was selected as the representative reference ED or ES frame for each cardiac cycle. In cases of three distinct manual identifications, their mean value was selected as the corresponding reference ED or ES frame. Because of the lack of a gold standard method for deriving accurate ED and ES frames, the reference ED and ES frame sequences were considered as the benchmark for the subsequent algorithm evaluation. Table 1 illustrates the percentage (among the 32 patient data sets) of the identified frames that lie within 0 or 1 frame with respect to the corresponding reference frames. The linear regression between the reference and the corresponding manually and semi-automatically identified ED and ES frames was also derived with (i) the correlation coefficient $R^2$ (for all linear regressions) being approximately equal to 1 ($R^2 \geq 0.9997$), and (ii) the mean root mean square error (RMSE) being consistently less than one frame. Figure 5 provides an example illustrating the effect of using CCC as opposed to NXC as a similarity measure between the reference ED and ES frame pair and the remaining frames in the multicycle data set for the semi-automatic identification of all ED and ES frames.
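The derivation of the reference frames from three manual identifications, and the percentage of frames lying within a given tolerance of them, can be sketched as follows. Rounding the mean of three distinct identifications to the nearest frame is an assumption, and the function names and frame indices are hypothetical.

```python
from collections import Counter

def consensus_frame(picks):
    """Reference frame from repeated manual identifications: the most common
    value, or the mean (rounded to the nearest frame, an assumption) when all
    identifications are distinct."""
    value, count = Counter(picks).most_common(1)[0]
    if count > 1:
        return value
    return round(sum(picks) / len(picks))

def fraction_within(identified, reference, tolerance):
    """Fraction of identified frames within `tolerance` frames of the reference."""
    hits = sum(1 for a, b in zip(identified, reference) if abs(a - b) <= tolerance)
    return hits / len(reference)

# Hypothetical ED frame indices for four cardiac cycles.
manual_picks = [[12, 12, 13], [36, 37, 38], [61, 61, 61], [86, 85, 86]]
reference_frames = [consensus_frame(p) for p in manual_picks]
semi_automatic = [12, 38, 61, 84]
exact = fraction_within(semi_automatic, reference_frames, tolerance=0)
within_one = fraction_within(semi_automatic, reference_frames, tolerance=1)
```

Here the reference frames are 12, 37, 61 and 86, so half of the semi-automatic identifications match exactly and three quarters lie within one frame.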
Temporal interpolation. Figure 6 illustrates an example of the temporal alignment of four individual cardiac cycles to the reference cardiac cycle. Each plot contains curves for each of the six proposed temporal interpolation methods. Because of the absence of a gold standard temporal interpolation method, the 7-point B-spline interpolation was considered the benchmark for the subsequent algorithm evaluation. Table 2 outlines the differences between the 5-point linear interpolation and the corresponding 7-point interpolation as well as between the 7-point linear interpolation and the corresponding B-spline interpolation. Finally, Figure 7 illustrates the variations in temporal alignment between each cardiac cycle and the reference cardiac cycle within four example patient data sets.

Temporal compounding
Effect on tissue SNR, chamber SNR, tissue/chamber contrast and SDNR. Two 11 × 11-pixel square regions of interest (ROIs) corresponding to the IVS and right ventricle (RV) chamber were manually defined on each

Table 1. Percentages (ranges over the 32 patient data sets) of the manually and semi-automatically identified end-diastole and end-systole frames that lie within 0 or 1 frame with respect to the corresponding reference frames

Here, M and SD refer to the mean and standard deviation of the corresponding ROI intensity values (Burckhardt 1978; Krucker et al. 2000). In a similar manner, the tissue/chamber contrast (C) (Krucker et al. 2000; Peli 1990) and signal difference-to-noise ratio (SDNR), also referred to as detectability index (Krucker et al. 2000; Mohamed and Kadah 2008), were derived, where $M_T$ and $M_C$ correspond to the mean intensity level within the tissue and chamber ROIs, respectively, and $SD_C$ corresponds to the chamber standard deviation. Figure 8 displays the mean profiles, averaging the curves of all 32 data sets, illustrating the effect of temporal compounding on SNR and tissue/chamber detectability index (SDNR), for an increasing number of cardiac cycles. The mean curves provide representative profiles for all 32 data sets. The individual cavity SNR and SDNR curves were omitted because they demonstrate a trend very similar to that of the corresponding tissue SNR curves. Finally, Table 3 outlines the mean percentage change on each of the four quantitative measures between the original unprocessed data and the temporal compound data generated using an increasing number of cardiac cycles.
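A minimal sketch of ROI-based quality measures is given below. It assumes that SNR is the ratio of the ROI mean to the ROI standard deviation and that SDNR is the tissue/chamber mean difference divided by the chamber standard deviation; both definitions are assumptions based on the symbols described above, and the population standard deviation is used. The tissue/chamber contrast is left out because its definition cannot be inferred from the text. The ROI values are hypothetical.

```python
from statistics import mean, pstdev

def roi_snr(roi):
    """Assumed definition: SNR = M / SD of the ROI intensities."""
    return mean(roi) / pstdev(roi)

def roi_sdnr(tissue_roi, chamber_roi):
    """Assumed definition: SDNR = (M_T - M_C) / SD_C."""
    return (mean(tissue_roi) - mean(chamber_roi)) / pstdev(chamber_roi)

# Hypothetical ROI intensities (flattened); real ROIs would hold 11 x 11 pixels.
tissue = [100.0, 110.0, 90.0, 100.0]
chamber = [20.0, 30.0, 10.0, 20.0]
tissue_snr = roi_snr(tissue)
chamber_snr = roi_snr(chamber)
sdnr = roi_sdnr(tissue, chamber)
```

Averaging N partially decorrelated frames lowers the standard deviation inside each ROI while leaving the means largely unchanged, which is why both measures are expected to rise as more cardiac cycles are compounded.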
Visual effect on cardiac ultrasound data. Figures 9-15 illustrate three example ED frames before and after temporal compounding is applied on their
Table 2. Differences (in number of frames) between the 5- and 7-point linear interpolations as well as between the 7-point linear and 7-point B-spline interpolations*

                      5-point linear vs. 7-point linear    7-point linear vs. 7-point B-spline
Mean difference ≥1    37.1%                                10.6%
Maximum difference    9                                    2

* The percentage of aligned frames that differ by ≥1 frame from the corresponding 7-point temporally aligned frames was derived. The mean of the derived percentages along with the maximum frame difference observed is listed.

Effect on the diagnostic value of clinical data. Bland-Altman analysis (Bland and Altman 1986) was employed for quantitative assessment of the effect of temporal compounding on routine clinical measurements. Bland-Altman analysis derives the coefficient of repeatability (CR), denoting (i) the level of repeatability of clinical measurements performed on either the original or the compound data, (ii) the level of agreement between corresponding measurements performed on original and compound data sets and (iii) the level of inter-operator agreement on measurements performed on either the original or the compound data. In all cases, the lower the CR, the higher are the measurement repeatability and measurement agreement. Moreover, the mean difference indicates the presence of any bias, whereas the −2SD and +2SD intervals provide the lower and upper limits of agreement between the compared measurements. Table 4 summarizes the bias, repeatability levels and agreement coefficients derived from the individual plots for each clinical measurement. Similarly, Table 5 summarizes the corresponding inter-operator agreement coefficients derived for each individual measurement. Finally, Table 6 provides an overview of the clinical measurements omitted by each echocardiographer for the original and compound data sets.
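The Bland-Altman quantities referred to above can be computed as in the sketch below. It assumes that the coefficient of repeatability is taken as twice the sample standard deviation of the paired differences, in line with the ±2SD limits of agreement; the function name and the measurement values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(first, second):
    """Bias, limits of agreement and coefficient of repeatability for paired
    measurements. ASSUMPTION: CR = 2 * SD of the paired differences."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower_limit": bias - 2 * sd,
        "upper_limit": bias + 2 * sd,
        "cr": 2 * sd,
    }

# Hypothetical repeated wall-thickness measurements (mm) on four frames.
first_pass = [10.0, 12.0, 11.0, 9.0]
second_pass = [9.0, 12.0, 12.0, 9.0]
result = bland_altman(first_pass, second_pass)
```

The same function applies to all three comparisons: repeated measurements by one operator, original versus compound data, and one operator versus the other.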

DISCUSSION
A range of quantitative and qualitative results were presented to assess the effect of temporal compounding on cardiac ultrasound data. The main objective was to ensure the clinical feasibility of temporal compounding by enhancing the image quality and diagnostic value of the processed data while keeping data acquisition requirements to a minimum.

Identification of ED and ES frames
Currently there is no gold standard for the accurate and robust identification of ED and ES frames. Even manual identification encounters a number of challenges because of high levels of noise and shadowing. Table 1 indicates that (i) the manual identification is subjective and cannot be considered a gold standard, and (ii) the ES identification is more challenging than the corresponding ED identification. This is mostly because the ECG cannot provide reliable information on the ES state, while the aortic valve, whose closing signifies the end of the systolic phase, is hard to identify and track in most data sets. Alternatives such as the periodic motion of the left ventricle and right ventricle and the opening of the mitral valve were therefore employed, making the process very laborious and highly subjective.
Figure 5 illustrates that CCC can generate a smooth curve with few, clearly distinguishable local maxima (ED frames) and local minima (ES frames) when compared with NXC curves. Consequently, for data sets of lower image quality, CCC was found to generate more robust ED and ES identification. Table 1 illustrates that the semi-automatic ED and ES identification using CCC exhibits behavior very similar to that of the corresponding manual identification. The mean percentage of semi-automatically identified ED and ES frames that were consistent with the corresponding reference frames was about 80% of the equivalent manual level. This percentage rose to about 91% for ED and ES frames that lay within one frame of the corresponding reference frames. It should be borne in mind that the reference ED and ES frames are just representative frames derived from a series of manual identifications. They do not provide a definitive representation of all ED and ES states in the data set. On the other hand, the proposed approach guarantees the identification of all frames in the B-mode sequence demonstrating maximum similarity to the seed ED and ES frames, hence providing the best options for temporal compounding. Acquiring data sets with up to 25% more cardiac cycles will further ensure that the most accurate of the semi-automatically identified ED and ES frames are used.
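The idea of locating ED and ES frames as local maxima and minima of a combined correlation curve can be sketched as follows. This is only an illustration: the combination used here (correlation with the ED seed minus correlation with the ES seed) is an assumption, since the exact CCC definition is given in the Methods and is not reproduced in this section. The helper names and the synthetic six-pixel "frames" are hypothetical.

```python
from math import cos, pi, sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equally sized pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def combined_correlation(frames, seed_ed, seed_es):
    """Assumed CCC: correlation with the ED seed minus correlation with the ES seed."""
    return [pearson(f, seed_ed) - pearson(f, seed_es) for f in frames]


def local_extrema(curve):
    """Indices of strict local maxima (ED candidates) and minima (ES candidates)."""
    maxima, minima = [], []
    for i in range(1, len(curve) - 1):
        if curve[i] > curve[i - 1] and curve[i] > curve[i + 1]:
            maxima.append(i)
        elif curve[i] < curve[i - 1] and curve[i] < curve[i + 1]:
            minima.append(i)
    return maxima, minima


# Synthetic sequence: each frame blends an ED-like and an ES-like pattern
# with a weight that oscillates once every 10 frames (three cardiac cycles).
seed_ed = [10, 40, 80, 40, 10, 5]
seed_es = [30, 60, 30, 20, 15, 10]
period = 10
frames = []
for k in range(30):
    w = 0.5 * (1 + cos(2 * pi * k / period))
    frames.append([w * e + (1 - w) * s for e, s in zip(seed_ed, seed_es)])

ccc = combined_correlation(frames, seed_ed, seed_es)
ed_candidates, es_candidates = local_extrema(ccc)
print(ed_candidates, es_candidates)
```

On real B-mode data the curve is noisier, so some smoothing or a minimum peak spacing would likely be needed before the extrema are accepted as ED and ES frames.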
Temporal interpolation. Figure 7 provides four representative examples demonstrating the wide range of variations in temporal cardiac behavior (with respect to the reference cardiac cycle) over consecutive cardiac cycles. Such variations can be large (Fig. 7a) or small (Fig. 7b) and follow similar patterns (Fig. 7c) or variable, independent patterns (Fig. 7d). Correcting these variations through temporal alignment ensures that the resulting compound images are largely unaffected by such temporal characteristics. Selecting a representative reference cardiac cycle minimizes the overall temporal deformations required throughout a data set.
Figure 6 provides examples of the temporal interpolation (to the reference cardiac cycle) curves for four cardiac cycles within a multicycle data set. In the first example (Fig. 6a), where small temporal variations were observed, 2 or 3 control points can produce temporal alignment of sufficient accuracy. In the second and third examples (Fig. 6b, c), where large variations were observed during at least one of the cardiac systole or diastole phases, 5 control points are necessary. Finally, in the fourth example (Fig. 6d), where large variations were observed during a subsection of the diastole phase, 7 control points were required to generate temporal alignment of sufficient accuracy. Introducing additional control points increased the range of temporal variations that could be corrected during temporal alignment. Table 2 supports these observations: on average, for 37.1% of the frames within each data set, the 5- and 7-control-point linear interpolations differ by at least one frame. The disparity can reach up to nine frames, introducing severe blurring in the compound data and limiting their clinical usability. These results indicate that using a global affine transformation to map between the temporal characteristics of two cardiac cycles will not suffice in most clinical cases. Given that (i) there is a wide range of temporal cardiac characteristics, demonstrating both global and local variations; (ii) a cardiac cycle consists of seven continuous, yet independent stages (Berne et al. 2004; Bray et al. 1999; Guyton 1991; Guyton and Hall 1997); and (iii) the introduction of additional CPs is not computationally intensive, it is recommended that a 7-CP B-spline interpolation be employed to generate a smooth and accurate alignment independent of the nature of the temporal variation.
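The difference between a global affine mapping and a control-point-based mapping can be illustrated with a short sketch. It uses piecewise linear interpolation between matched control points rather than the recommended 7-CP B-spline, purely to keep the example dependency-free; the function name `piecewise_linear_map` and the frame indices are hypothetical.

```python
from bisect import bisect_right


def piecewise_linear_map(t, cps, ref_cps):
    """Map time t in the aligned cycle onto the reference cycle via matched control points."""
    if not (cps[0] <= t <= cps[-1]):
        raise ValueError("t lies outside the aligned cardiac cycle")
    # Index of the segment [cps[i], cps[i + 1]] that contains t.
    i = min(bisect_right(cps, t) - 1, len(cps) - 2)
    frac = (t - cps[i]) / (cps[i + 1] - cps[i])
    return ref_cps[i] + frac * (ref_cps[i + 1] - ref_cps[i])


# Hypothetical frame indices of ED, ES and the following ED in each cycle.
aligned_cps = [0, 11, 27]    # cycle being aligned (28 frames, long systole)
reference_cps = [0, 8, 24]   # reference cycle (25 frames)

mapping = [piecewise_linear_map(t, aligned_cps, reference_cps) for t in range(28)]

# Global affine scaling alone (a = 24/27, b = 0) ignores where ES actually occurs.
global_only = [t * 24 / 27 for t in range(28)]

print(mapping[11], round(global_only[11], 2))
```

In this toy case the affine mapping places the ES frame of the aligned cycle almost two frames away from the reference ES frame, whereas the control-point mapping matches it exactly. Additional control points extend the same idea to local variations within systole or diastole.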

Temporal compounding
Effect on tissue SNR, cavity SNR, tissue/chamber contrast and SDNR. Table 3 illustrates the mean percentage change in tissue SNR, chamber SNR, SDNR and contrast achievable through temporal compounding with an increasing number of cardiac cycles. The compounding effect varied significantly between different data sets because of different levels of noise, shadowing, non-optimal acquisition setup, variations in ROI location and differences in the range of movement during data acquisition. Consequently, the mean profiles provide a fairer representation of the effect of temporal compounding on cardiac tissue and chambers. More precisely, temporal compounding resulted in a modest reduction in contrast (4%) between cardiac tissue and chamber. This contrast change was relatively unaffected by the number of cardiac cycles used. On the other hand, considerable increases in SNR and SDNR were achieved. By investigating the profile curvatures in Figure 8, as well as the percentage increases in Table 3, the mean SNR and SDNR profiles can be divided into three segments demonstrating (i) a large increase during compounding with 4-5 cardiac cycles, (ii) a moderate yet considerable increase with up to 12-13 cardiac cycles and (iii) a very modest increase with more than 13 cardiac cycles. This behavior is due to the decrease in new (decorrelated) information introduced by increasing the number of compounded cardiac cycles. To maintain the high SNR and SDNR increases, a higher level of decorrelation, achieved through larger movements during acquisition, is required. However, such large movements can result in severe blurring of the tissue/chamber boundary and should therefore be avoided.
After examination of the findings for all 32 data sets, it is believed that compounding 12 cardiac cycles can introduce sufficient tissue and chamber noise suppression (average SNR increases of 87.1% and 143.1%, respectively), as well as an increase in tissue/chamber detectability (128.6%). Twelve cardiac cycles represent a considerable reduction relative to the 20 cardiac cycles suggested by Abiko et al. (1997), hence reducing (i) the acquisition and computational requirements, and (ii) the potential tissue/chamber boundary blurring introduced. Nevertheless, more cardiac cycles may be required when temporal alignment is challenging, whereas fewer cardiac cycles (4-5 cardiac cycles) will suffice when 12 cardiac cycles cannot be obtained.
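The diminishing return from adding cardiac cycles can be made concrete with a simple statistical model. The sketch below assumes that SNR is the ratio of mean to standard deviation, that every frame carries the same signal, and that the noise in any two frames has equal variance and a uniform pairwise correlation `rho`. Under those assumptions the variance of the average is σ²(1 + (N - 1)ρ)/N, so the SNR gain is sqrt(N / (1 + (N - 1)ρ)). The value of `rho` is hypothetical and chosen only for illustration; it is not estimated from the study data.

```python
from math import sqrt


def snr_gain(n_cycles, rho):
    """SNR gain from averaging n_cycles frames with pairwise noise correlation rho."""
    return sqrt(n_cycles / (1 + (n_cycles - 1) * rho))


rho = 0.2  # hypothetical inter-cycle noise correlation (illustrative assumption)
for n in (1, 4, 8, 12, 16, 20):
    print(n, f"{100 * (snr_gain(n, rho) - 1):.1f}% increase")
```

With fully decorrelated noise (`rho = 0`) the gain grows as sqrt(N) without limit, whereas any residual correlation caps it at 1/sqrt(rho). This is consistent with the qualitative pattern reported above, in which most of the improvement arrives within the first few cycles.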
Visual effect on cardiac ultrasound data. Figures 9-15 (and Supplementary Videos 1-7) provide characteristic examples of the effects of temporal compounding (over 12 consecutive cardiac cycles) on clinical data over a range of image quality and diagnostic value (from low to high). A thorough visual examination of the 32 compound data sets suggests that temporal compounding can significantly reduce tissue speckle as well as noise in cardiac tissue and chambers. Furthermore, by averaging over multiple decorrelated cardiac cycles, it can also enhance structures whose boundaries are hard to delineate because of high levels of noise or shadowing. This is clearly illustrated in the visual enhancement of the RV and IVS in Figures 11-14 (Supplementary Videos 3-6, respectively), the LVPW in Figures 10 and 11 (Supplementary Videos 2 and 3) and the aortic valve in Figures 12 and 13 (Supplementary Videos 4 and 5). Similarly, in Figure 9 (Supplementary Video 1), an example of very limited diagnostic value, while the quality of the data set remained low, structures such as the RV cavity and the IVS were marginally enhanced, making them easier to detect and outline. On the other hand, temporal compounding has no noteworthy effect on tissue/chamber contrast and may introduce modest tissue/chamber boundary blurring. Blurring was mostly observed around rapidly moving structures, such as valves (e.g., mitral valve [Figs. 13 and 14]), with some cases of blurring identified along chamber walls. The blurring effect is due partially to image misalignment prior to spatial compounding and partially to quantification errors as a result of the limited acquisition temporal resolution (frame rate). The presented alignment and compounding approach ensures that blurring caused by misalignments is kept to a minimum, not degrading data sets of high image quality and diagnostic value (Figs. 14 and 15, Supplementary Videos 6 and 7). Moreover, current ultrasound scanners can acquire 2-D cardiac ultrasound image sequences at frame rates ≥50 Hz, depending on the acquisition depth and sector width. Such an increase in temporal resolution (from 25 Hz using the available acquisition method) is expected to further decrease the level of the tissue/chamber boundary blurring introduced by temporal compounding.
The spatial transformations of the example in Figure 16 contradict the assumption made by Abiko et al. (1997) that no rotation or translation along the x-axis is required during temporal compounding. On the contrary, substantial spatial misalignments exhibiting a periodic behavior, with repetition periods around two to four cardiac cycles, were observed in most data sets. Such periodic behavior is attributed to the respiratory movements of the patient's chest during the multicycle data acquisition. Addressing all possible misalignments can considerably reduce the blurring between cardiac tissue and chamber boundaries and, therefore, have a direct effect on the quality of the compound data.
Effect on the diagnostic value of clinical data. Visual examination of the repeatability and agreement plots derived for each of the five clinical measurements revealed (i) no major variations between the original and compounded limits of agreement (±2SD), (ii) no major outliers and (iii) no significant or systematic bias (mean) within each measurement method or between the two methods. Consequently, the results indicate a strong potential for the original and compound data to be interchangeable when performing clinical cardiac ultrasound measurements. Tables 4-6 enable a thorough investigation of the effect of temporal compounding on clinical measurements performed in routine cardiac ultrasound examinations. During the measurement process, echocardiographer 1 displayed a confident but sometimes adventurous approach in making clinical measurements. This was illustrated by the choice to attempt the majority of clinical measurements, even for challenging data sets (Table 6). On the other hand, echocardiographer 2 adopted a more thorough and conservative approach to clinical measurements. This was illustrated by the choice to omit more clinical measurements (Table 6), taking fewer risks on challenging data sets (corrupted by high levels of either noise or shadowing). A consequence of this conservative approach was echocardiographer 2's higher repeatability and agreement levels on clinical measurements when compared with those of echocardiographer 1 (Table 4). Such higher repeatability levels in the original data reduced the scope for improvement by using the processed data. Nevertheless, temporal compounding enhanced the diagnostic information in the processed data, introducing a considerable improvement in overall clinical measurement repeatability for both echocardiographers. More precisely, the enhancement of structures such as the IVS and LVPW that were originally obscured by heavy artifacts (as illustrated in Figs. 11-14) (i) improved the repeatability coefficients of IVS thickness, LVPW thickness, LVIDd and LVIDs measurements by up to 48%, and (ii) induced a noticeable drop (approximately 47%) in the number of measurements omitted by echocardiographer 2 (Table 6).
The results in Table 5 indicate that in addition to the substantial improvement in measurement repeatability of each individual operator, temporal compounding reduced the measurement disparity between the two echocardiographers. More precisely, measurements on the original data by echocardiographer 1 indicated a tendency to give a larger estimate of the size of the measured cardiac structure when compared with the corresponding measurements by echocardiographer 2. As a result, the measurements on four of the five cardiac structures reveal positive inter-operator bias with an overall bias of +0.87 mm. Temporal compounding reduced this tendency, with the overall bias dropping to a very modest +0.26 mm (70% improvement). Moreover, compounding improved the inter-operator measurement agreement in three of the five cardiac structures by up to 36% (12% overall improvement). A very modest reduction in inter-operator agreement was introduced in wall thickness measurements such as IVSd and LVPWd. Enabling echocardiographer 2 to perform previously omitted measurements in challenging data sets (Table 6) is identified as the source of this modest reduction in inter-operator agreement. Furthermore, the very low CR on the original data had left very limited room for improvement. Nevertheless, the overall reduction in measurement disparity, as illustrated in Table 5, indicates the potential for temporal compounding to reduce the operator dependence of clinical measurements.

FUTURE WORK
The current implementation of temporal compounding is aimed at the off-line processing and enhancement of cardiac ultrasound images. However, the real-time aspect of cardiac ultrasound constitutes a major advantage over other imaging modalities. Although beyond the scope of this article, implementing temporal compounding for the real-time enhancement of cardiac ultrasound images could be of great benefit to the diagnostic process.

CONCLUSIONS
In this article, a novel and effective implementation of temporal compounding, a method for enhancing the image quality of cardiac ultrasound data, was introduced and quantitatively evaluated. The accurate and robust spatiotemporal alignment of multicycle data is an essential process for effective temporal compounding. Insufficient alignment results in the intensity averaging of frames corresponding to different cardiac phases, introducing severe tissue/chamber boundary blurring to the processed data sets. A 7-control-point interpolating B-spline was found to provide an accurate representation of the temporal variations between consecutive cardiac cycles. Furthermore, a rigid spatial registration was found to provide sufficient correction for spatial misalignments between temporally aligned frames caused mostly by the patient's respiratory motion during data acquisition. Compounding data from 12 cardiac cycles was also found to provide the best trade-off between data enhancement and acquisition time. Data enhancement introduced by temporal compounding includes suppressing tissue speckle and chamber noise substantially, increasing the corresponding tissue/chamber detectability and enhancing tissue structures that are masked out by high levels of noise and shadowing. Compounding data sets acquired using a higher frame rate is expected to further improve the resulting data enhancement. Furthermore, temporal compounding can increase the level of measurement repeatability and reduce the degree of operator dependence, as well as the number of omitted clinical measurements, when compared with the original unprocessed data. Clinical measurement repeatability is expected to improve further as the familiarity of echocardiographers with the compound images increases. This article has described the potential of temporal compounding to improve the diagnostic value of echocardiographic images over a wide image quality range without degrading the very best of images. The method, therefore, has the potential to replace or act as an adjunct to existing image processing and display methods in ultrasonic scanners.

Fig. 1. Example highlighting the necessity for temporal alignment prior to spatial compounding. Dashed lines and solid lines correspond to the reference and aligned cardiac cycles, respectively. (a) No temporal alignment would result in compounding frames corresponding to different cardiac stages. (b) Temporal alignment provides mapping between corresponding temporal positions. LV = left ventricle.
\[ T_{\mathrm{temporal}}(t) = T^{\mathrm{global}}_{\mathrm{temporal}}(t) + T^{\mathrm{local}}_{\mathrm{temporal}}(t) \tag{4} \]

The global component of the transformation, $T^{\mathrm{global}}_{\mathrm{temporal}}$, was represented by the affine transformation

\[ T^{\mathrm{global}}_{\mathrm{temporal}}(t) = at + b \tag{5} \]

where $a$ compensates for scaling differences between the two frame sequences (different cardiac cycle lengths) and $b$ compensates for translation differences, aligning the start of the two frame sequences. The representation of the local component of the transformation, $T^{\mathrm{local}}_{\mathrm{temporal}}$, depended on the number of frames (CPs) representing different cardiac cycle stages. Using $[CP_1 : CP_n]$ to define the aligned cardiac cycle and $[CP'_1 : CP'_n]$ the reference cardiac cycle, $T^{\mathrm{local}}_{\mathrm{temporal}}$ was initially modeled by a piecewise linear transformation

Fig. 3. Temporal mapping between two frame sequences using (i) global linear interpolation (dotted line), (ii) piecewise linear interpolation based on three cardiac cycle stages (dashed line) and (iii) B-spline interpolation based on five cardiac cycle stages (solid black line). ED = end diastole, ES = end systole.

Fig. 4. (a) Example measurements of the interventricular septum thickness (IVSd), left ventricular internal dimension (LVIDd) and left ventricular posterior wall (LVPWd) during end diastole (ED). (b) Example measurements of left ventricular internal dimension (LVIDs) and left atrial dimension (LADs) during end systole. All measurements were made across the parasternal long-axis view of the heart.

Fig. 5. Example profiles of the correlation coefficient between each frame and the manually identified reference: (a) end-diastole frame (CED); (b) end-systole frame (CES); (c) combined correlation coefficient (CCC) between each frame and both reference end-diastole and end-systole frames. CCC provides stronger local maxima and minima, indicating end-diastole and end-systole frames within an image sequence.

Fig. 6. Temporal alignment of four individual cardiac cycles to the corresponding reference cardiac cycle for a single example patient data set. The four plots provide examples of cardiac cycles with variable temporal characteristics. Each curve illustrates a different temporal interpolation approach.
Supplementary Videos 1-7 (supplementary videos accompanying this article can be found in the online version at http://dx.doi.org/10.1016/j.ultrasmedbio.2015.02.008) provide the corresponding single-cardiac-cycle cine loops. Data sets covering a range of image and diagnostic quality (2 = low, 3 = average and 3 = high) were selected to best illustrate the effect of temporal compounding on cardiac ultrasound data. Figure 16 provides a representative example of the variations of the rigid spatial transformations applied to temporally aligned ED frames from consecutive cardiac cycles, along with the mean and maximum transformation observed across the 32 data sets.

Fig. 7. Variations in temporal alignment between each cardiac cycle and the reference cardiac cycle in four example patient data sets.Seven-control-point B-spline interpolation was used for each curve.

Fig. 8. Effect of temporal compounding on: (a) mean tissue and mean chamber signal-to-noise ratio (SNR) and (b) tissue/chamber signal difference-to-noise ratio (SDNR) for increasing number of compound cardiac cycles.

Fig. 9. Original (left) and compound (right) end-diastole frames of very low diagnostic value.Data remain of low diagnostic value, but some structures such as the interventricular septum and right ventricle are marginally enhanced, enabling their delineation.

Fig. 10. Original (left) and compound (right) end-diastole frames of low diagnostic value. Data remain of low diagnostic value, but some structures such as the interventricular septum, right ventricle, left ventricular posterior wall and aortic valve are marginally enhanced, enabling their delineation.

Fig. 12. Original (left) and compound (right) end-diastole frames of average diagnostic value.Structures such as the interventricular septum, right ventricle and aortic valve are enhanced.

Fig. 14. Original (left) and compound (right) end-diastole frames of high diagnostic value. Speckle is suppressed and structures such as the right ventricle and aortic valve are enhanced without any noticeable blurring across cardiac tissue and cavities.

Fig. 16. Representative example of the rigid spatial transformations applied on temporally aligned end-diastole frames from consecutive cardiac cycles along with the mean and maximum transformation values observed across the 32 data sets.

Table 3. Mean overall effect of temporal compounding (percentage change between original and processed data) on four quantitative measures for an increasing number of compound cardiac cycles
SDNR = signal difference-to-noise ratio; SNR = signal-to-noise ratio.

Table 5. Inter-operator agreement for measurements performed on the original and processed images
IVS = interventricular septum thickness; LAD = left atrium dimension; LVID = left ventricle internal dimension; LVPW = left ventricle posterior wall thickness; subscript d = end diastole; subscript s = end systole.

Table 6. Measurements omitted by each echocardiographer