Effective computation-aware algorithm by inter-layer motion analysis for scalable video coding

https://doi.org/10.1016/j.jvcir.2015.07.015

Highlights

  • We propose a strategy to control the computational complexity in the EL for SVC.

  • The MVDs are analyzed to observe the relationship between their values in the BL and the EL.

  • The predicted search points in the EL and the RD-costs of the BL are considered.

  • Computations can be efficiently allocated to the MBs that need more of them.

  • The proposed algorithm outperforms other computation-aware methods in terms of RD performance.

Abstract

Scalable video coding combined with computation awareness provides scalability in both quality and computation. This paper presents a computation-aware algorithm for scalable video coding with spatial/quality scalability, aiming for the best trade-off between rate distortion performance and computational consumption. We first observe and analyze the motion vector difference relationship between the scalable base and enhancement layers, and then establish a model for it. Using the modeling results, a linear computation distribution algorithm is proposed to allocate the computation for each macroblock in the enhancement layer. In addition, the rate distortion costs of the base layer are taken into account in the computation allocation process to further improve the coding performance. The simulation results demonstrate that the proposed computation-aware algorithm not only achieves better rate distortion performance than other works under the same computational constraints, but also requires less computation.

Introduction

To effectively transmit an enormous amount of data through a variety of transmission environments and to support a diversity of hardware devices, such as mobile phones, notebooks, televisions, or high quality HDTV, an extension of the H.264 video coding standard [1] called scalable video coding (SVC) [2] has been developed. SVC supports spatial, temporal and quality scalabilities so that the SVC encoder can generate one or more subsets of bitstreams to meet the preferences of different user terminals or network restrictions. Fig. 1 shows the system block diagram of SVC with two spatial scalable layers. First, the full-resolution input video is downsampled to a smaller resolution and fed into the H.264/AVC compatible encoder to generate a scalable base layer (BL) bitstream. Afterwards, the full-resolution image is encoded by the H.264/AVC video encoder to generate a scalable enhancement layer (EL) bitstream, which additionally considers the inter-layer information to further improve the coding performance. Moreover, quality scalability can be achieved by reusing the quantization module through the coefficient accumulation approach. In SVC, the inter-layer prediction includes inter-layer motion prediction, inter-layer residual prediction and inter-layer intra prediction, all of which utilize the base layer information for the prediction in the enhancement layer at the expense of increased computational complexity.

Since SVC is an extension of H.264/AVC, it also inherits the computational complexity of H.264/AVC. To efficiently decrease the number of modes to be tested, [3] proposed a fast mode decision that accelerates the SVC encoding process through coded block pattern (CBP) analysis. The CBP contains the residual information of each macroblock and provides cues for mode prediction in the encoding process. This work reduces encoding time significantly in both the quality and spatial scalabilities of SVC with only minor degradation in PSNR. Many algorithms [4], [5], [6], [7], [8], [9], [10] have been proposed to reduce the computational complexity of the enhancement layer in SVC by utilizing the correlation between BL and EL. Although these studies can reduce the coding burden, they do not consider computational variation, which could noticeably affect visual quality. [11] proposed a rate-distortion model describing the motion prediction efficiency in inter-frame wavelet video coding to improve the mode decision. By adopting the proposed mode decision procedure, that work improved both the PSNR performance and the visual quality for scalability cases with less extraction-bitrate dependence. The work in [12] analyzed the rate distortion costs between the base and enhancement layers and sorted the rate distortion costs of the sub-MBs in the base layer to indicate the prediction order of the macroblocks in the enhancement layer. Then, based on the computations required by each prediction mode, a scalable approach was proposed to achieve computational scalability. In [13], the computation of each prediction mode was normalized first. Then, a linear distribution was proposed to allocate the computational complexity for each macroblock in the enhancement layer by considering the rate distortion costs of the sub-MBs in the base layer.
In [14], Tai et al. introduced the computation-awareness (CA) scheme. First, the MBs were sorted by their mean square error (MSE) values, since the authors assumed that the MBs with larger MSE values would have a greater opportunity to reduce the MSE. Afterwards, the computation was allocated by considering the MSE information. Chen et al. [15] proposed an adaptive search strategy using the CA concept. This work adaptively switched among diamond search, three-step search, and full search for motion estimation by considering the motion vector prediction or the variation of neighboring motion vectors. In [16], the MBs were classified by analyzing the MB complexity using rate distortion costs and an MV threshold, and the classified MBs were then searched using different methods. Our previous work [17] also proposed a linear model to distribute the computational complexity for each macroblock by considering the relationship of the motion vector differences between the base and enhancement layers. Although that approach achieves an acceptable trade-off between computational complexity and rate distortion performance, there remains room to find more accurate parameters for fitting the model and to further improve the coding performance.

In this paper, we propose a strategy to control the computational complexity in the enhancement layer for scalable video coding. The motion vector differences (MVDs) are first analyzed to observe the relationship between their values in the base and enhancement layers. Using the analytical results, a computation-aware algorithm is proposed to allocate computations in the scalable spatial and quality enhancement layers by considering the predicted search points in the enhancement layer and the rate distortion costs of the base layer.

The rest of this paper is organized as follows. In Section 2, we analyze the motion vector differences and rate distortion costs of BL and EL and investigate their correlations. In Section 3, our proposed computation-aware algorithm is described in detail. In Section 4, the experimental results are presented to show the efficiency and performance of our proposed scheme. Finally, the conclusion is provided in Section 5.

Motion vector difference analysis

Motion estimation takes the motion vector prediction as the center of the search window and then finds the best motion vector within that window. The MVD is the difference between the best motion vector and the motion vector prediction, and it is the quantity encoded into the video bitstream. Therefore, the MVD can provide some information about the motion behavior of the video content. In SVC, the enhancement layer obtains information from the base layer to increase the coding efficiency. For spatial scalability, the
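The MVD definition above can be illustrated with a minimal sketch. It assumes the component-wise median predictor over the left, top, and top-right neighbors that H.264/AVC commonly uses; the function names and the sample vectors are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_predictor(left, top, top_right):
    """Component-wise median of three neighboring motion vectors.

    Assumption: this mirrors the usual H.264/AVC median prediction;
    the paper only refers to "the motion vector prediction".
    """
    return (
        median([left[0], top[0], top_right[0]]),
        median([left[1], top[1], top_right[1]]),
    )


def motion_vector_difference(mv, mvp):
    """MVD = best motion vector minus motion vector prediction."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


# Hypothetical neighbor vectors and best motion vector for one MB.
mvp = median_predictor(left=(2, 0), top=(4, -2), top_right=(3, 5))
mvd = motion_vector_difference(mv=(5, 1), mvp=mvp)
```

A small MVD magnitude suggests that the predictor was already close to the best match, which is the kind of motion information the analysis relies on.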

Proposed computation allocation algorithm

From the previous section, we can observe that the motion vector differences as well as the rate distortion costs between BL and EL have a high correlation. Therefore, we employ the motion vector difference and rate distortion cost to allocate the computation for macroblocks within a frame in EL. The flowchart is shown in Fig. 11. First, we use the motion vector difference calculated in (2) to estimate how many search points for a full search are required for the i-th MB in the enhancement
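As a rough illustration of allocating a frame-level budget across MBs, the sketch below distributes search points in proportion to a blend of two normalized inputs: the predicted search points per MB and the BL rate distortion costs. This is not the paper's model: the blending weight `alpha`, the normalization, and the rounding rule are all assumptions made for the example, and the function name is hypothetical.

```python
def allocate_search_points(predicted_points, bl_rd_costs, budget, alpha=0.5):
    """Split an integer budget of search points across macroblocks.

    Assumptions (not from the paper): each MB's weight is
    alpha * normalized predicted points + (1 - alpha) * normalized BL RD cost,
    and leftover points go to the MBs with the largest fractional shares.
    """
    if len(predicted_points) != len(bl_rd_costs):
        raise ValueError("one predicted-point value and one RD cost per MB")
    n = len(predicted_points)
    if n == 0:
        return []

    def normalise(values):
        total = sum(values)
        # Fall back to a uniform split when all values are zero.
        return [v / total for v in values] if total > 0 else [1.0 / n] * n

    p = normalise(predicted_points)
    c = normalise(bl_rd_costs)
    shares = [(alpha * pi + (1 - alpha) * ci) * budget for pi, ci in zip(p, c)]

    # Floor each share, then hand out the remaining points one by one.
    alloc = [int(s) for s in shares]
    leftover = budget - sum(alloc)
    order = sorted(range(n), key=lambda i: shares[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


# Hypothetical frame with three MBs and a budget of 100 search points.
allocation = allocate_search_points([10, 30, 60], [100, 100, 200], budget=100)
```

In this sketch the budget would be set from the available computation, for example a fixed fraction of the points a full search would visit; MBs with larger predicted motion or higher BL cost receive more of it.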

Simulation results

Table 8 lists the simulation settings for the experimental results in this section. In our simulation, two prior works [12], [13] are adopted for comparison, and the performance measurements of PSNR, Bit-rate, BDPSNR, BDBit-rate [19], [20] and SSIM (Structural Similarity Index Metric) [21] are evaluated.
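Of the metrics listed, PSNR is the simplest to state. The following sketch uses the standard definition, 10·log10(peak² / MSE), for 8-bit samples; the function name and the flat sample lists are illustrative assumptions, not part of the paper's evaluation code.

```python
import math


def psnr(reference, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Two hypothetical pixels, each off by one level, so MSE = 1.
value = psnr([100, 100], [101, 99])
```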

Fig. 13 shows the computation allocation results for different test video sequences with 10% computation availability. From these figures, we can observe that the computations can be

Conclusion

In this paper, the motion vector difference and rate distortion cost are observed, analyzed, and used to derive our computation allocation algorithm for SVC. With the proposed computation allocation algorithm, the computations can be efficiently allocated to MBs belonging to the regions which need more computations to find the best prediction results. Our simulation results show that our proposed algorithm outperforms the other computation-aware literature in

Acknowledgment

This work was supported by the Ministry of Science and Technology, Taiwan, R.O.C. under Grant NSC 99-2221-E-259-019-MY3 and Grant NSC 102-2221-E-259-022-MY3.

References (21)

  • C.H. Yeh et al., Mode decision acceleration for scalable video coding through coded block pattern, J. Vis. Commun. Image Represent. (2012)
  • C.Y. Tsai et al., A rate-distortion analysis on motion prediction efficiency and mode decision for scalable wavelet video coding, J. Vis. Commun. Image Represent. (2010)
  • T. Wiegand et al., Overview of the H.264/AVC video coding standard, IEEE Trans. Circuits Syst. Video Technol. (2003)
  • H. Schwarz et al., Overview of the scalable video coding extension of the H.264/AVC standard, IEEE Trans. Circuits Syst. Video Technol. (2007)
  • T.J. Kim et al., Fast mode decision for combined scalable video coding based on the block complexity function, IEEE Trans. Consum. Electron. (2011)
  • X. Sun, S. Xiao, J. Du, M. Hu, Fast mode decision algorithm for H.264/SVC based on motion vector relation analysis, in:...
  • C.S. Park et al., Selective inter-layer residual prediction for SVC-based video streaming, IEEE Trans. Consum. Electron. (2009)
  • S.V. Leuvun et al., Generic techniques to reduce SVC enhancement layer encoding complexity, IEEE Trans. Consum. Electron. (2011)
  • B. Lee et al., A low complexity mode decision method for spatial scalability coding, IEEE Trans. Circuits Syst. Video Technol. (2011)
  • C.H. Yeh et al., Fast mode decision algorithm for scalable video coding using Bayesian theorem detection and Markov process, IEEE Trans. Circuits Syst. Video Technol. (2010)
There are more references available in the full text version of this article.

This paper has been recommended for acceptance by M.T. Sun.