Fast Tracking of the Left Ventricle Using Global Anatomical Affine Optical Flow and Local Recursive Block Matching

Abstract. We present a novel method for segmentation and tracking of the left ventricle (LV) in 4D ultrasound sequences that combines automatic segmentation at the end-diastolic frame with tracking using both a global optical flow-based tracker and local block matching. The core novelty of the proposed algorithm lies in the recursive formulation of the block-matching problem, which introduces temporal consistency in the patterns being tracked. The proposed method offers a competitive solution, with average segmentation errors of 2.29 and 2.26 mm in the training (n = 15) and testing (n = 15) datasets, respectively.


Introduction
While magnetic resonance imaging remains the gold standard for cardiac morphology and function assessment, several studies have shown that real-time 3D echocardiography (RT3DE) is a competitive modality for this clinical task [1]. Indeed, RT3DE outperforms conventional 2D ultrasound imaging in visualizing the entire left ventricle (LV), thus avoiding several pitfalls of 2D echocardiography such as foreshortening, out-of-plane motion and the need for geometric assumptions in volume estimation [2]. On the other hand, due to the intrinsic physical limits of acoustic wave propagation, 3D ultrasound imaging requires advanced beam-forming strategies to sweep the entire scan volume with a number of ultrasound pulses compatible with real-time imaging. As a consequence, image quality may be impaired compared to conventional 2D echo. At the same time, the increased dimensionality of the data poses challenges for the data analysis pipeline, which has triggered a significant effort from both industrial and academic research teams on the development of automated software packages for LV volumetric assessment [3,4]. However, even state-of-the-art commercial solutions still require some degree of user interaction, both at the initialization step and for correction of the segmentation/tracking results [5]. Thus, tools allowing fast, automatic 3D LV segmentation are still needed [6].
Our prior approaches to LV tracking addressed the problem from a global perspective: we developed an algorithm that models the LV motion during the cardiac cycle as an affine transformation. Although it performs well for the quantification of global functional indices [7], there is still room to improve tracking through local refinement of the globally deformed LV surface. We initially used a hybrid method that combines the global tracking algorithm with a local refinement based on segmentation-oriented clues. We showed that this improves the tracking performance [8], but it remains limited by the assumption that the boundary position matches the optimal value of the associated data attachment term. Since physicians often do not delineate the LV boundaries at positions of maximum local contrast, the development of accurate segmentation energies is challenging. Thus, in the present paper we follow the global tracking plus local refinement strategy, but we drive the local tracking using a block matching approach rather than a segmentation-based term.

Automatic LV Segmentation at End-Diastole
B-Spline Explicit Active Surfaces (BEAS) is a recently introduced real-time segmentation framework [9]. The fundamental concept of the BEAS framework is to regard the boundary of an object as an explicit function, where one of the coordinates of the points on the surface, $x_1$, is given explicitly as a function of the remaining coordinates, i.e. $x_1 = \psi(x_2, \ldots, x_n) = \psi(\mathbf{x}^*)$, with $\mathbf{x}^* = (x_2, \ldots, x_n)$. Following [9], $\psi$ is defined as a linear combination of B-spline basis functions, so that the segmented surface is explicitly controlled through the B-spline coefficients $c[\mathbf{k}]$, where $\mathbf{k}$ defines the position of each B-spline kernel on a regular grid.
In the present work, we use a modified version of the localized means separation energy, which takes advantage of the darker appearance of the blood with respect to the myocardial tissue [10]. While this approach evolves the contours towards the positions of maximum local contrast, expert physicians usually prefer to delineate the LV surface closer to the blood-tissue interface. We have therefore previously introduced a hyper-parameter that controls the balance between the forces exerted by the inner and the outer regions [11]. This makes it possible to globally steer the LV surface position inwards or outwards, in order to better match the manual contouring protocol. Thus, in the present work the B-spline coefficients $c[\mathbf{k}]$ are updated using an expression in which $u_x$ and $v_x$ are the local means of the regions inside and outside the surface, respectively, estimated within a local neighborhood $B$ around each point on the LV surface. For the sake of clarity, $\bar{I}(\mathbf{x}^*)$ corresponds to the image value at the surface position associated with $\mathbf{x}^*$, and $\alpha_{in}$ controls the balance between the forces exerted by the inner and outer regions. A value of $\alpha_{in} > 1$ increases the influence of the inner region, thus attracting the segmented LV towards the blood-tissue interface. For the present work, $\psi$ is defined in the spherical space, i.e. $\rho = \psi(\theta, \phi)$. The segmentation algorithm was initialized with the method introduced in [10], which provides an ellipsoid approximation to the LV endocardial surface.
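To make the explicit surface representation concrete, the following is a minimal sketch of how a radius ρ = ψ(θ, ϕ) could be evaluated from a grid of B-spline coefficients. It assumes a tensor-product cubic B-spline on a uniform grid; the function names, the grid spacings `h_theta` and `h_phi`, and the omission of the angular periodicity are illustrative assumptions, not details taken from [9].

```python
import numpy as np

def cubic_bspline(t):
    """Centered cubic B-spline kernel, non-zero only for |t| < 2."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(
        t < 1.0,
        2.0 / 3.0 - t ** 2 + 0.5 * t ** 3,
        np.where(t < 2.0, (2.0 - t) ** 3 / 6.0, 0.0),
    )

def explicit_radius(theta, phi, coeffs, h_theta, h_phi):
    """Evaluate rho = psi(theta, phi) as a linear combination of B-splines.

    coeffs[k1, k2] plays the role of c[k]; h_theta and h_phi are the
    (assumed) knot spacings along each angular coordinate.
    """
    k1 = np.arange(coeffs.shape[0])
    k2 = np.arange(coeffs.shape[1])
    w_theta = cubic_bspline(theta / h_theta - k1)  # weights along theta
    w_phi = cubic_bspline(phi / h_phi - k2)        # weights along phi
    return float(w_theta @ coeffs @ w_phi)

# Usage: constant coefficients describe a sphere of constant radius
# (away from the borders of the coefficient grid).
coeffs = np.full((8, 8), 30.0)
rho = explicit_radius(3.5, 4.2, coeffs, 1.0, 1.0)
```

Because each surface point depends only on the few coefficients whose kernels overlap it, updating the coefficients deforms the surface locally, which is what makes this kind of representation cheap to evolve.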

Combination of Global Affine Optical Flow and Recursive Block Matching For Efficient LV Tracking
We propose a two-step tracking strategy. First, we estimate the global deformation between two subsequent frames using the anatomical affine optical flow introduced in [7]. This allows robust tracking over the cardiac cycle while keeping a consistent shape, since only affine transformations are allowed. The novelty of the proposed solution lies in the subsequent local refinement based on a block matching approach. Since the larger displacements have already been accounted for in the global tracking stage, the block matching can be restricted to a smaller search region.
Traditional block-matching approaches perform an exhaustive search for a given image kernel in the subsequent frame. The position optimizing the chosen similarity criterion (such as the sum of absolute differences or the normalized cross correlation, among others) is taken as the most likely new position, and the inter-frame displacement is computed accordingly. However, errors can accumulate over the entire cardiac cycle, and it can be shown that there is an intrinsic optimal relationship between the frame rate and the displacement to be recovered [12]. One of the key drivers of error accumulation is that the tracking problem is posed as a sequence of independent pairwise block matching problems, which does not enforce temporal continuity across multiple frames. Indeed, the kernel being tracked at frame t might be completely distinct from the kernel found at frame t − 2. In the present paper, we introduce the concept of temporally recursive block matching by considering a dynamic version of the block to be searched in the next image. Instead of considering that the kernel is simply a block from frame t, we use a dynamic block which is recursively built not only from the image patterns of frame t but also from those of the previous frames.
Traditional block matching approaches estimate motion by tracking a 3-D kernel between subsequent frames using a sliding window technique and a given similarity measure to estimate the optimal match. Considering the sum of squared differences (SSD), the motion for a given point would be estimated as the minimum over $(u, v, w)$ of
$$\mathrm{SSD}(u, v, w) = \sum_{\mathbf{x} \in \Omega} \left( I_t(x, y, z) - I_{t+1}(x + u, y + v, z + w) \right)^2,$$
where $\mathbf{x} = (x, y, z) \in \Omega$ corresponds to the local region of interest (ROI) being tracked. Typically, the search region is limited by constraining the values of the displacement vector $(u, v, w)$ for which the SSD is evaluated. The optimal displacement $(u_t, v_t, w_t)$ is then taken as the displacement that minimizes $\mathrm{SSD}(u, v, w)$.
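The classical pairwise search described above can be sketched as follows. This is an illustrative NumPy version, not the authors' MATLAB code; the function names are hypothetical, and the defaults assume that a 7x7x7 ROI searched inside an 11x11x11 region corresponds to integer displacements of at most 2 voxels along each axis.

```python
import numpy as np

def extract_patch(volume, center, r):
    """Cubic patch of side 2*r + 1 centered on an integer voxel position."""
    x, y, z = center
    return volume[x - r:x + r + 1, y - r:y + r + 1, z - r:z + r + 1].astype(float)

def ssd_block_match(frame_t, frame_next, center, half_roi=3, max_disp=2):
    """Exhaustive search for the integer displacement minimizing the SSD.

    The caller must keep `center` at least half_roi + max_disp voxels away
    from the volume borders; no boundary handling is attempted here.
    """
    template = extract_patch(frame_t, center, half_roi)
    best_ssd, best_disp = np.inf, (0, 0, 0)
    for u in range(-max_disp, max_disp + 1):
        for v in range(-max_disp, max_disp + 1):
            for w in range(-max_disp, max_disp + 1):
                shifted = (center[0] + u, center[1] + v, center[2] + w)
                candidate = extract_patch(frame_next, shifted, half_roi)
                ssd = float(np.sum((template - candidate) ** 2))
                if ssd < best_ssd:
                    best_ssd, best_disp = ssd, (u, v, w)
    return best_disp, best_ssd

# Usage on a synthetic pair of frames related by a known rigid shift.
rng = np.random.default_rng(0)
frame_t = rng.random((20, 20, 20))
frame_next = np.roll(frame_t, shift=(1, -2, 0), axis=(0, 1, 2))
disp, ssd = ssd_block_match(frame_t, frame_next, center=(10, 10, 10))
```

With these defaults the search evaluates 5 x 5 x 5 = 125 candidate positions per tracked point, which is why restricting the search region after the global tracking stage keeps the local refinement cheap.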
In the proposed approach, instead of taking the local region of interest from the previous frame, we recursively combine the kernel of the current frame being tracked with the previously tracked ROIs through the recursion of Eq. (4). With this approach, if the tracking fails at a given frame, it can still recover from the error, since the temporal consistency introduced through the recursive estimation of the ROI steers the block matching in the next frame towards the patterns previously being tracked.
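The idea of a recursively built template can be illustrated with the sketch below. The exact combination rule used by the method is not recoverable from the text, so the exponential blending and its weight `alpha` are assumptions made purely for illustration, as are all function names.

```python
import numpy as np

def extract_patch(volume, center, r):
    """Cubic patch of side 2*r + 1 centered on an integer voxel position."""
    x, y, z = center
    return volume[x - r:x + r + 1, y - r:y + r + 1, z - r:z + r + 1].astype(float)

def match_template(template, frame, center, max_disp=2):
    """Integer displacement around `center` minimizing the SSD to `template`."""
    r = template.shape[0] // 2
    best_ssd, best_disp = np.inf, (0, 0, 0)
    for u in range(-max_disp, max_disp + 1):
        for v in range(-max_disp, max_disp + 1):
            for w in range(-max_disp, max_disp + 1):
                shifted = (center[0] + u, center[1] + v, center[2] + w)
                ssd = float(np.sum((template - extract_patch(frame, shifted, r)) ** 2))
                if ssd < best_ssd:
                    best_ssd, best_disp = ssd, (u, v, w)
    return best_disp

def track_recursive(frames, start, half_roi=3, max_disp=2, alpha=0.5):
    """Track one point through `frames` with a recursively updated template.

    ASSUMPTION: the template is an exponential blend of the newly matched
    patch and the previous template; alpha is a hypothetical weight.
    """
    template = extract_patch(frames[0], start, half_roi)
    position = tuple(start)
    path = [position]
    for frame in frames[1:]:
        disp = match_template(template, frame, position, max_disp)
        position = tuple(p + d for p, d in zip(position, disp))
        matched = extract_patch(frame, position, half_roi)
        template = (1.0 - alpha) * template + alpha * matched
        path.append(position)
    return path
```

In this sketch, setting `alpha = 1` discards the history and reduces to classical pairwise block matching, whereas smaller values give the template a longer memory of the patterns tracked in earlier frames.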

Implementation Details
Prior to processing, each 4D ultrasound sequence was re-sampled to guarantee an isotropic voxel size, which was set to the smallest voxel dimension of the original dataset. Regarding the underlying segmentation framework, all the parameters were set as originally reported in [9] and [10], with two notable exceptions. First, the size of the sliding plane used to detect the LV base position was set to $r_{min}$ (i.e. 15 mm) rather than $r_{max}$ (i.e. 35 mm), contrary to what was proposed in [10]. Secondly, since there were significant differences in voxel size across datasets, the radius of the local neighborhood was set to 16 mm rather than 16 voxels as previously reported in [10]. The parameter controlling the balance between the forces of the inner and outer regions, i.e. $\alpha_{in}$, was empirically set to 1.25. The anatomical optical flow parameters were set as in [7], whereas the sizes of the ROI and search kernels in the block matching stage were set to [7x7x7] and [11x11x11], as proposed in [13]. No sub-pixel accuracy strategies were used for the block-matching stage.

Experiments and Results
The proposed pipeline was tested on the database from the CETUS challenge, which is composed of 4D ultrasound sequences from healthy volunteers and from patients with myocardial infarction and dilative cardiomyopathy. These data were acquired with multiple imaging platforms from different vendors, namely a Siemens SC 2000, a Philips iE33 and a GE E9. The database is divided into two subsets of 15 exams each, for training and testing. While the meshes of the reference segmentation are provided for the training dataset, this information is not available for the testing dataset. The performance of the algorithm was assessed using distance metrics to the reference segmentation result, namely the mean absolute distance (MAD), the Hausdorff distance (HD) and the modified Dice coefficient (Dice*, estimated as 1 - Dice). These metrics were computed directly through the challenge MIDAS platform, thus guaranteeing a common evaluation platform for all the participants in the challenge. The accuracy of the estimation of the LV volumetric indices used in clinical routine, namely end-diastolic volume (EDV), end-systolic volume (ESV), stroke volume (SV) and ejection fraction (EF), was also assessed. The reported CPU timings refer to a MATLAB-based implementation running on a standard Windows 7 laptop equipped with a dual-core i7-640m processor and 4 GB of RAM.
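As a rough illustration of the evaluation metrics named above, the sketch below computes point-set versions of MAD and HD, the modified Dice coefficient on binary volumes, and SV and EF from the two volumes. It is an approximation under stated assumptions: surfaces are represented by sampled points rather than meshes, the MAD is taken as the symmetric average of the two directed mean distances, and the challenge platform's own implementation may differ.

```python
import numpy as np

def directed_distances(a, b):
    """Distance from every point of a (N, 3) to its closest point in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def mean_absolute_distance(a, b):
    """Symmetric mean absolute distance between two point sets (assumed form)."""
    return 0.5 * (directed_distances(a, b).mean() + directed_distances(b, a).mean())

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_distances(a, b).max(), directed_distances(b, a).max())

def modified_dice(mask_a, mask_b):
    """Dice* = 1 - Dice for two boolean volumes of equal shape."""
    overlap = np.logical_and(mask_a, mask_b).sum()
    return 1.0 - 2.0 * overlap / (mask_a.sum() + mask_b.sum())

def stroke_volume(edv, esv):
    """SV = EDV - ESV, in the same units as the input volumes."""
    return edv - esv

def ejection_fraction(edv, esv):
    """EF in percent: 100 * (EDV - ESV) / EDV."""
    return 100.0 * (edv - esv) / edv
```

For example, an EDV of 120 ml and an ESV of 48 ml give a stroke volume of 72 ml and an ejection fraction of 60%.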
We carried out an initial experiment using the training dataset in order to evaluate the added value of the proposed dynamic ROI introduced in (3) versus the classical pairwise block matching for local refinement (GAOF+BM), as well as against the global affine anatomical optical flow baseline tracker (GAOF). The results of this initial experiment are presented in Table 1.
The results for the distance segmentation metrics are summarized in Table 2, while the accuracy of the estimation of the LV volumetric indices can be found in Tables 3 and 4. Fig. 1 illustrates the local surface errors on a representative subset of 3D ultrasound frames. We have also compared the performance of the proposed method with the baseline global affine optical flow tracker and with the hybrid framework (HT) introduced in [8], where the local refinement is done using segmentation-based clues, as opposed to the local tracking-based hints used in the proposed algorithm. An example of a 4D RT3DE sequence segmented with the proposed approach is illustrated in Fig. 2.
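The agreement statistics reported for the volumetric indices (Pearson correlation coefficient R, bias µ and limits of agreement µ ± 1.96σ) can be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def agreement_stats(estimated, reference):
    """Pearson R, bias and limits of agreement between paired measurements."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = estimated - reference
    bias = diff.mean()             # mu: mean difference
    sigma = diff.std(ddof=1)       # assumed: sample standard deviation
    r = np.corrcoef(estimated, reference)[0, 1]
    return {
        "R": float(r),
        "bias": float(bias),
        "loa": (float(bias - 1.96 * sigma), float(bias + 1.96 * sigma)),
    }

# Usage with made-up numbers (not data from the paper).
reference_edv = [100.0, 120.0, 140.0, 160.0]
estimated_edv = [105.0, 125.0, 145.0, 165.0]
stats = agreement_stats(estimated_edv, reference_edv)
```

A narrow interval between the two limits of agreement indicates that the automatic estimates deviate consistently from the reference, while the bias captures any systematic over- or under-estimation.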

Discussion
The results of the initial experiment highlight the trade-off between local accuracy and global tracking performance. Indeed, when starting from the reference LV surfaces, the global affine optical flow algorithm introduced in [7] presents the lowest MAD, whereas the local refinement using block matching improves the local tracking, as supported by the lower HD values. Regarding the added value of the proposed algorithm, the memory effect of the ROI being locally tracked improves the estimation accuracy of the functional LV volumetric indices (SV and EF), reducing the bias and the limits of agreement of the estimated values when compared to the traditional block matching refinement. For the complete pipeline (ED segmentation + LV tracking over the entire cycle), the method in [8] presents lower tracking performance than the one introduced in the present manuscript, whose local refinement strategy is primarily driven by tracking-based clues. While this observation does not imply that hybrid strategies are unsuited to LV tracking problems, it stresses the need to develop more accurate segmentation energies adapted to the complexity of the local appearance of RT3DE images around the endocardial surface, since positions of higher contrast do not always correspond to the true endocardium.
The global tracker [7] remains very competitive when compared with the proposed approach, so identifying the best tracking strategy is not trivial. Indeed, while the LV motion obtained with the proposed approach appears more realistic, since it presents more complex deformation patterns with local variations, the global tracker remains less sensitive to errors. Since the analyzed dataset only assesses the LV position at the ED and ES frames, a more complete 4D analysis would likely be necessary to reveal the differences between the global tracking approach and the locally refined LV surface obtained with the proposed method.

Conclusion
The proposed LV segmentation and tracking framework offers competitive performance for the fully automatic detection of LV endocardial borders in 4D ultrasound sequences. This is supported by the low segmentation errors and accurate volume estimates found in a cohort of 30 exams with a wide range of image quality and cardiac functional status. Furthermore, the computational burden of the proposed method remains low, allowing global affine tracking plus local refinement to be done in approximately 2 s for two consecutive frames. From our prior experience, this indicates that near-real-time performance could be achieved with a more efficient implementation of the algorithm in, for instance, C++.

Fig. 1. Best (left) and worst (right) segmentation results for the training (top) and testing (bottom) datasets. The local segmentation error is color encoded, while the reference LV surface is shown in white for the training dataset cases.

Table 1. Segmentation performance of the proposed algorithm in the CETUS training dataset, starting from the ground truth surface at ED and evaluating the LV surfaces at end-systole (MAD: mean absolute distance; HD: Hausdorff distance; Dice*: modified Dice coefficient (i.e. 1 - Dice); R: Pearson correlation coefficient; LOA: limits of agreement, estimated as µ ± 1.96σ). MAD and HD values given in mm.

Table 2. Segmentation performance of the proposed algorithm in the CETUS database (MAD: mean absolute distance; HD: Hausdorff distance; Dice*: modified Dice coefficient (i.e. 1 - Dice)). All values given as µ ± σ; MAD and HD values given in mm.

Table 3. Assessment of LV volumetric indices in the training database (R: Pearson correlation coefficient; LOA: limits of agreement, estimated as µ ± 1.96σ).

Table 4. Assessment of LV volumetric indices in the testing database (R: Pearson correlation coefficient; LOA: limits of agreement, estimated as µ ± 1.96σ).