Optical flow estimation for motion-compensated compression

https://doi.org/10.1016/j.imavis.2013.01.002

Abstract

The computation of optical flow within an image sequence is one of the most widely used techniques in computer vision. In this paper, we present a new approach to estimating the velocity field for motion-compensated compression. It is derived from a nonlinear system based on the direct temporal integral of the brightness conservation constraint, also known as the Displaced Frame Difference (DFD) equation. To solve this nonlinear system of equations, we use an adaptive framework that employs velocity field modeling, a nonlinear least-squares model, Gauss–Newton and Levenberg–Marquardt techniques, and an algorithm for the progressive relaxation of the over-constraint. Successful motion-compensated compression is judged by three criteria: (1) the fidelity with which the estimated optical flow matches the ground truth motion, (2) the relative absence of artifacts and “dirty window” effects in frame interpolation, and (3) the cost of coding the motion vector field. We base our estimated flow field on a single minimized target function, which leads to motion-compensated predictions without incurring penalties in any of these three criteria. In particular, we compare our results with those from Block-Matching Algorithms (BMA) and show that, with nearly the same number of displacement vectors per fixed block size, our algorithm exceeds BMA on all three criteria. We also test the algorithm on synthetic and natural image sequences and use it to demonstrate applications for motion-compensated compression.

Highlights

► A nonlinear constrained system for larger-scale displacement motion is proposed.
► An adaptive framework for solving the nonlinear system of equations is developed.
► The estimated flow field is derived from a single minimized target function.
► The motion-compensated prediction is optimized for video coding.
► Fewer motion parameters are required for transmission and storage.

Introduction

The determination of object motion and its representation in sequential images has been studied in several disciplines, e.g., digital image processing, digital video coding, computer vision, and remote sensing data interpretation. Motion analysis, determination, and representation are crucial for the removal of temporal redundancy. Achieving a large compression ratio with high-fidelity picture quality requires accurate estimation of large-scale displacements over long temporal ranges for efficient transmission of compressed image sequences. The creation of more efficient and effective algorithms for estimating optical flow is therefore very important.

For these reasons, significant effort has been devoted to solving the optical flow estimation problem. To place this body of literature in a meaningful context, we present most of these works in a block diagram (Fig. 1).

Almost all existing optical flow estimation models and algorithms assume that the image intensity obeys a brightness constancy constraint [3], [4], [5]. The inverse problem of estimating a velocity or displacement map (i.e., the optical flow) is under-constrained, because two unknown velocity components must be derived from this single conservation equation at each pixel. To resolve the under-constrained problem, several constraints on the displacement field, such as smoothness and other assumptions, have been proposed. Typical smoothness assumptions include Horn and Schunck's regularization constraint [3], a uniform velocity assumption within a block (or template) (Lucas and Kanade [4], [5]; Shi and Tomasi [16]), an intensity-gradient conservation constraint (Nagel et al. [12], [13], [14]; Nesi [15]), modeling the motion field as a Markov random field (Konrad and Dubois [20]), and velocity field modeling with bilinear or B-spline functions (Chen et al. [28], [29]). Much subsequent effort has focused on extending these and other methods of estimating the optical flow [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35]. Some of the algorithms have been implemented in hardware [36], [37], [38], [39].
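The uniform-velocity (Lucas–Kanade) assumption mentioned above can be made concrete with a minimal NumPy sketch: over a small patch, the single brightness-constancy equation per pixel is stacked into an over-determined linear system and solved in the least-squares sense. The function name, patch size, and finite-difference choices here are our own illustrative assumptions, not the cited implementations.

```python
import numpy as np

def lucas_kanade_patch(I1, I2, cx, cy, half=2):
    """Estimate one (u, v) for the patch centered at (cx, cy).

    Solves the over-determined system [Ix Iy] [u v]^T = -It
    in the least-squares sense, assuming uniform velocity in the patch.
    """
    # Central differences for spatial gradients, forward difference in time
    Ix = (np.roll(I1, -1, axis=1) - np.roll(I1, 1, axis=1)) / 2.0
    Iy = (np.roll(I1, -1, axis=0) - np.roll(I1, 1, axis=0)) / 2.0
    It = I2 - I1
    ys = slice(cy - half, cy + half + 1)
    xs = slice(cx - half, cx + half + 1)
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)
    b = -It[ys, xs].ravel()
    uv, *_ = np.linalg.lstsq(A, b, rcond=None)
    return uv  # (u, v)
```

On a smooth synthetic frame pair translated by one pixel in each direction, the recovered (u, v) is close to (1, 1); in featureless patches the system becomes rank-deficient, which is exactly the under-constrained problem discussed above.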

Most realistic image sequences in computer vision applications consist of multiple objects moving against a static background and with respect to one another; the velocity field can therefore be discontinuous over the image. To handle the discontinuities at the transition boundaries between a static background and moving objects, Chen et al. [28] proposed bilinear modeling of the motion field. This numerical model solves the under-constrained problem successfully.
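The idea behind such a bilinear motion-field model can be sketched as follows: velocities are stored only at coarse control nodes and interpolated bilinearly to every pixel, so the number of unknowns drops sharply. This is our own simplified illustration (uniform node spacing, node values given), not the exact parameterization of [28].

```python
import numpy as np

def bilinear_velocity_field(ctrl_u, ctrl_v, shape, spacing):
    """Expand control-node velocities to a dense field by bilinear interpolation.

    ctrl_u, ctrl_v : velocity components on a coarse node grid
    shape          : (H, W) of the dense field
    spacing        : pixel distance between adjacent nodes
    """
    H, W = shape
    ys = np.arange(H) / spacing
    xs = np.arange(W) / spacing
    i0 = np.clip(np.floor(ys).astype(int), 0, ctrl_u.shape[0] - 2)
    j0 = np.clip(np.floor(xs).astype(int), 0, ctrl_u.shape[1] - 2)
    fy = (ys - i0)[:, None]
    fx = (xs - j0)[None, :]
    i0 = i0[:, None]
    j0 = j0[None, :]

    def interp(c):
        # Weighted sum of the four surrounding node values
        return ((1 - fy) * (1 - fx) * c[i0, j0] + (1 - fy) * fx * c[i0, j0 + 1]
                + fy * (1 - fx) * c[i0 + 1, j0] + fy * fx * c[i0 + 1, j0 + 1])

    return interp(ctrl_u), interp(ctrl_v)
```

With node values varying linearly, the dense field reproduces the linear ramp exactly, while discontinuities can still be represented across node boundaries.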

However, the optical flow equation is derived from a differential form of the conservation constraint (i.e., a first-order Taylor expansion) and is valid only for infinitesimal motion. A motion field based on the optical flow equation can therefore be estimated successfully only for small displacements.

Three criteria are used to evaluate the derived optical flow: (1) comparison between the retrieved and ground truth optical flow fields, (2) frame interpolation, and (3) performance of motion-compensated compression. An evaluation using only the first criterion on a few special datasets may not test a high-performance estimator stringently enough, since the comparison with ground truth optical flow depends on the type of motion, texture morphology, and scale of displacement. What matters most for motion-compensated compression is not only how well an estimated optical flow matches the physical motion; equally important are how well the motion pictures are synthesized, with minimal distortion and without artifacts or the “dirty window” effect, and how low the coding cost of the motion vector field is. Although the last performance test is the most important for a variety of applications in computer vision, a successful motion estimator should demonstrate excellent performance on all three tests across displacement scales from small to large.

A motion estimator and a video compression scheme may each perform far from optimally in isolation; the two techniques must also interface compatibly. The estimator adopted in the international standards for digital video compression is the block-matching algorithm (BMA) [16], [17], [18], [19] (or overlapped BMA). Compared against ground truth optical flow, the BMA method is less accurate for flow field estimation, but it performs better for motion-compensated prediction (MCP) and interpolation (MCI) in realistic video coding applications. Current popular approaches based on the optical flow equation may outperform BMA in the optical flow comparison test on some specific datasets, but they cannot pass the overall tests. For this reason, the BMA method is still adopted as an estimator today.
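For reference, the full-search BMA mentioned above can be sketched in a few lines of NumPy. The block size, search range, and sum-of-absolute-differences (SAD) criterion below are typical choices for illustration, not values mandated by any standard.

```python
import numpy as np

def block_matching(ref, cur, block=8, search=4):
    """Full-search block matching: for each block of `cur`, find the
    integer displacement into `ref` that minimizes the SAD."""
    H, W = cur.shape
    mv = np.zeros((H // block, W // block, 2), dtype=int)
    for bi in range(H // block):
        for bj in range(W // block):
            y, x = bi * block, bj * block
            blk = cur[y:y + block, x:x + block]
            best, best_d = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = y + dy, x + dx
                    if ry < 0 or rx < 0 or ry + block > H or rx + block > W:
                        continue  # candidate block falls outside the frame
                    sad = np.abs(ref[ry:ry + block, rx:rx + block] - blk).sum()
                    if sad < best:
                        best, best_d = sad, (dy, dx)
            mv[bi, bj] = best_d
    return mv
```

One motion vector per block keeps the coding cost low, but the piecewise-constant field is exactly what produces blocking artifacts in MCP/MCI, which motivates the smoother field pursued in this paper.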

Global (or energy-based) approaches usually combine the brightness constancy constraint with a prior constraint on the motion field through a weighting (penalty) parameter [1], [2], [3]. A major issue with this weighting parameter is the choice of its optimal value: several different values have been suggested [1], [3], [25], [26], because the correct optimum depends on the specific ground truth flow field.

In the present paper, we depart from the established weighting parameter approach. Instead, we employ a quantity derived from the nonlinear model and minimize it by varying a huge number of unknown parameters (the average velocity or displacement field). Since there exist numerous local minima in image data applications—especially those having large featureless regions—we have found it necessary to develop new algorithms for solving the problem.

To improve the performance of velocity estimation, especially for large displacement motion, we replace the standard differential form with a direct temporal integral of the optical flow conservation constraint, the Displaced Frame Difference (DFD) equation, creating a nonlinear system. To solve the inverse problem of flow field estimation, we propose an adaptive framework and employ more stringent performance criteria for motion-compensated compression. The numerical approach for flow field estimation proposed in this paper is highlighted in Fig. 1.
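To make this concrete, the sketch below applies a damped Gauss–Newton (Levenberg–Marquardt-style) iteration to the DFD residual for the simplest possible motion model: a single global translation. The paper's actual estimator models a full velocity field and adapts the damping progressively; the fixed damping factor and helper names here are our own simplifications.

```python
import numpy as np

def sample(img, ys, xs):
    """Bilinear sampling of img at (possibly fractional) coordinates."""
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    fy, fx = ys - y0, xs - x0
    return ((1 - fy) * (1 - fx) * img[y0, x0] + (1 - fy) * fx * img[y0, x0 + 1]
            + fy * (1 - fx) * img[y0 + 1, x0] + fy * fx * img[y0 + 1, x0 + 1])

def lm_translation(I1, I2, iters=20, lam=1e-2):
    """Damped Gauss-Newton on the DFD residual r(u, v) = I2(x+u, y+v) - I1(x, y)
    for a single global displacement (u, v)."""
    H, W = I1.shape
    ys, xs = np.mgrid[1:H - 1, 1:W - 1].astype(float)
    u = v = 0.0
    for _ in range(iters):
        r = sample(I2, ys + v, xs + u) - I1[1:H - 1, 1:W - 1]
        # Jacobian of the residual w.r.t. (u, v): gradients of the warped frame
        gx = sample(I2, ys + v, xs + u + 0.5) - sample(I2, ys + v, xs + u - 0.5)
        gy = sample(I2, ys + v + 0.5, xs + u) - sample(I2, ys + v - 0.5, xs + u)
        J = np.stack([gx.ravel(), gy.ravel()], axis=1)
        JTJ = J.T @ J + lam * np.eye(2)          # damped normal equations
        step = np.linalg.solve(JTJ, -J.T @ r.ravel())
        u += step[0]
        v += step[1]
    return u, v
```

Because the DFD is evaluated at displaced positions rather than linearized at the origin, the iteration can recover sub-pixel and larger shifts that a single first-order Taylor step would miss.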

This paper develops a generic approach that can deliver high performance for both flow field estimation and motion-compensated compression. A difficulty we face is that a moving image scene necessarily contains both featureless and texture-rich regions. Our goal is to develop a single motion estimation technique that uses the same formalism to treat both types of regions together.

This paper is organized as follows: In Section 2, a set of nonlinear system equations with the velocity field model is derived. Section 3 introduces algorithms for this estimator. In Section 4, we validate the new algorithms by deriving velocity from synthetic tracer motion within a numerical ocean model, and apply the new technique to video image sequences. Finally, conclusions are drawn in the last section.

Section snippets

Brightness constancy constraint

If we designate I(x, y, t) as the intensity specified in (x, y) coordinates and time t, and let v(x, y, t) = (u(x, y, t), v(x, y, t))^T be the velocity vector of the optical flow, we may write a differential form of the brightness constancy constraint (or optical flow) equation as

dI(r(t), t)/dt = ∂I(r(t), t)/∂t + v · ∇I(r(t), t) = 0.  (1)

In order to constrain the image scenes at times t = t1 and t = t2, we integrate Eq. (1) from time t1 to t2:

∫_{t1}^{t2} [dI(r(t), t)/dt] dt = I(r(t2), t2) − I(r(t1), t1) ≈ 0,  (2)

where r(t1) and r(t2) are the position vectors at times t1 and t2.
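The integral constraint above is exactly the displaced frame difference. As a sketch, the DFD image for a given displacement field can be evaluated by resampling the second frame at the displaced positions; the helper below (our own, with bilinear resampling and edge clipping as illustrative choices) accepts either a dense field or scalar (u, v).

```python
import numpy as np

def dfd(I1, I2, u, v):
    """Displaced frame difference I(r(t2), t2) - I(r(t1), t1) per pixel,
    sampling frame 2 bilinearly at the displaced positions."""
    H, W = I1.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    yd = np.clip(ys + v, 0, H - 1.001)   # clip so the 2x2 neighborhood exists
    xd = np.clip(xs + u, 0, W - 1.001)
    y0 = np.floor(yd).astype(int)
    x0 = np.floor(xd).astype(int)
    fy, fx = yd - y0, xd - x0
    I2w = ((1 - fy) * (1 - fx) * I2[y0, x0] + (1 - fy) * fx * I2[y0, x0 + 1]
           + fy * (1 - fx) * I2[y0 + 1, x0] + fy * fx * I2[y0 + 1, x0 + 1])
    return I2w - I1
```

When the displacement field matches the true motion, the DFD vanishes away from the image borders; the estimator in this paper minimizes exactly this quantity.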

Numerical algorithms

Detailed implementations for this motion estimator include computation of the MCP function, partial derivative calculations, the progressive relaxation of the over-constraint algorithm, and the iteration procedures. These are described in this section.

Experiments

The performance test of the estimators includes a benchmark test for optical flow and an error evaluation of frame interpolation. The optical flow field estimated from an image sequence is compared with a ground truth flow field using average angular and magnitude errors as the benchmark test. The error evaluation of frame interpolation is an indirect test that compares the ground truth image with the image produced by motion-compensated interpolation.
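The two benchmark error measures can be computed as in the sketch below, following the common convention of embedding each flow vector as a space-time direction (u, v, 1) for the angular error; the function names are ours.

```python
import numpy as np

def average_angular_error(u_est, v_est, u_gt, v_gt):
    """Average angular error (degrees) between estimated and ground-truth
    flow fields, with vectors embedded as 3-D directions (u, v, 1)."""
    num = u_est * u_gt + v_est * v_gt + 1.0
    den = np.sqrt((u_est**2 + v_est**2 + 1.0) * (u_gt**2 + v_gt**2 + 1.0))
    ang = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return ang.mean()

def average_magnitude_error(u_est, v_est, u_gt, v_gt):
    """Mean magnitude (endpoint) error between the two flow fields."""
    return np.sqrt((u_est - u_gt)**2 + (v_est - v_gt)**2).mean()
```

For example, a uniform estimate of (1, 0) against a ground truth of (0, 1) yields an angular error of 60 degrees and a magnitude error of √2 pixels.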

Conclusion

In this paper, we have presented an adaptive framework for solving the optical flow for motion-compensated compression. Using the nonlinear DFD equations, a model of the velocity field, and a least-squares formulation, we derive iterative equations based on the Gauss–Newton and Levenberg–Marquardt algorithms. We also propose an algorithm for the progressive relaxation of the over-constraint on the flow field, in which each vector is made consistent with its neighbors.

The overarching goal of optical flow

Acknowledgements

This research work was supported by the Office of Naval Research through the project WU-4279-02 at the Naval Research Laboratory.

References (41)

  • G. Aubert et al., Computing optical flow via variational techniques, SIAM J. Appl. Math. (1999)
  • F. Heitz et al., Multimodal estimation of discontinuous optical flow using Markov random fields, IEEE Trans. Pattern Anal. Mach. Intell. (1993)
  • A. Kumar et al., Optic flow: a curve evolution approach, IEEE Trans. Image Process. (1996)
  • H.H. Nagel, Constraints for the estimation of displacement vector fields from image sequences
  • H.H. Nagel et al., An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences, IEEE Trans. Pattern Anal. Mach. Intell. (1986)
  • H.H. Nagel, Extending the 'oriented smoothness constraint' into the temporal domain and the estimation of derivatives of optical flow
  • F. Glazer, Scene matching by hierarchical correlation
  • H. Ghanbari et al., Block matching motion estimations: new results, IEEE Trans. Circuits Syst. (1990)
  • V. Seferidis et al., General approach to block-matching motion estimation, J. Opt. Eng. (1993)
  • J. Shi et al., Good features to track

    This paper has been recommended for acceptance by Yiannis Andreopoulos.
