An enhanced second order prediction using the statistics of image difference signal for HEVC

This paper presents an enhanced second order prediction (ESOP) algorithm that uses the statistics of the image difference signal. The algorithm selectively controls which residuals of the first inter prediction are passed to the second intra prediction. Unlike other second order prediction (SOP) algorithms, the proposed algorithm employs a check strategy to adaptively select the residual of the first inter prediction, so that the second intra prediction is applied only where the correlation between residual pixels supports an accurate prediction. Experimental results show that, under the same conditions and on average, the PSNR is increased by 0.40 dB and the bit rate is reduced by 6.15% compared with traditional second order prediction for QCIF and CIF sequences. This selective algorithm is particularly suitable for compressing video content that contains complex motion.


Introduction
Although motion-compensated prediction (MCP) can reduce temporal redundancies between pixels, some redundancy still remains in the residue. Moreover, when a video contains complex movements, such as shape transformation, rotation or fading, MCP results in low coding efficiency because inaccurate motion estimation produces large residues. To address these problems, a weighted prediction (WP) method [1] has been presented for High Efficiency Video Coding (HEVC) [2]. In this method, a weighting factor a and an additive offset b compensate for the lighting difference between the current picture and a reference picture, which handles fading sequences with global illumination changes between frames. However, lighting conditions may vary not only between frames but also within a frame. To handle local lighting variation, a localized weighted prediction method [3] was proposed. In this method, the offset b was estimated from the reconstructed neighboring samples of the current block and their associated motion-compensated samples in the reference picture. With the estimated localized offset b, the current block was compensated for lighting changes by applying a DC offset to the inter prediction residue. In reference [4], a coefficient reordering mechanism was proposed, in which the coefficients were ordered according to the distribution of the different characteristics of images. This method enhanced coding efficiency to some extent for video containing fading but had high computational complexity. In reference [5], a skipping method of HEVC weighted prediction for various illumination effects was proposed with low computational complexity. It adopted localized weighted prediction to enhance the performance of weighted prediction. In reference [6], a scheme with multiple weighted prediction models for a single reference frame was proposed.
The scheme allowed different macroblocks of the current frame to use different weighted prediction models even when they were predicted from the same reference frame. It could handle videos containing global brightness variations. Although these methods handled fading scenes effectively, they utilized only temporal correlation and not spatial correlation. As a result, they still could not handle motion such as shape transformation and rotation. In reference [7], a context-adaptive pixel-based prediction algorithm was proposed. The algorithm assumed that pixels having the same coordinates within a block share the same prediction weight; the corresponding weight for each pixel within the target block was then calculated by the least-squares method. Reference [8] proposed a method that utilized the pixels surrounding the current block and those surrounding the reference blocks to estimate local illumination changes. The methods proposed in [7] and [8] utilized spatial correlation to deal with complex motion, but experimental results showed that the algorithms had some limitations.
The rest of this paper is organized as follows. Section II is an analysis of SOP. Section III describes ESOP using statistical properties of image difference signals. Section IV gives some experimental results of ESOP and compares the results with those of SOP and HEVC P-picture coding. Finally, the paper is concluded with Section V.

Proposed Algorithm
To achieve higher coding efficiency, we assume that most of the content in the current block is similar to some displaced blocks from other available frames, so that only small differences between the predicted block and the original block need to be coded into the bitstream before the second intra prediction.
Here, the statistics of the image difference signal are used to monitor the MCP process in the first inter prediction and then to determine which residual pixels can be used for the second intra prediction. The proposed statistics of the image difference signal and the control mechanism of ESOP are described in the following two sub-sections.
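To make the two prediction stages concrete, the following minimal Python sketch computes a first-order residual from an inter prediction and then a second-order residual by predicting that residual from neighboring residual samples. The function names, the horizontal prediction rule and the sample values are illustrative assumptions, not the authors' implementation.

```python
def first_order_residual(original, inter_prediction):
    """Residual left after the first (inter) prediction."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, inter_prediction)]


def horizontal_residual_prediction(left_residuals, width):
    """Assumed second prediction: repeat the residual found to the left of each row."""
    return [[left] * width for left in left_residuals]


def second_order_residual(residual, residual_prediction):
    """Residual left after the second (intra) prediction of the first residual."""
    return [[r - p for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, residual_prediction)]


# Toy 2x2 block; all values are made up for illustration.
original = [[52, 53], [60, 61]]
inter_prediction = [[50, 50], [55, 55]]
left_residuals = [2, 5]  # hypothetical reconstructed residuals left of the block

residual = first_order_residual(original, inter_prediction)
prediction = horizontal_residual_prediction(left_residuals, width=2)
print(second_order_residual(residual, prediction))
```

In this toy case the second-order residual has smaller magnitudes than the first-order one, which is the situation in which a second prediction is worth signalling.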

Statistical properties of image difference signal
An image difference signal can be either an intra-frame difference signal or an inter-frame difference signal. Here, we define the image difference signal as the intra-frame difference signal, that is, the difference between adjacent pixels within an image. The intra-frame difference signal can be divided into horizontal and vertical differences between adjacent pixels. The horizontal difference is derived as follows:

$$e_H(x, y) = f(x, y) - f(x, y-1)$$

where $e_H(x, y)$ represents the horizontal difference, and $f(x, y)$ and $f(x, y-1)$ represent the values of the pixels at row x, column y and at row x, column y-1 of the quantization matrix, respectively. Similarly, the vertical difference is derived as:

$$e_V(x, y) = f(x, y) - f(x-1, y)$$

The statistics of the intra-frame difference signal can be described by its spatial distribution. Fig. 1 shows the statistical distribution curve of the horizontal intra-frame difference signal.
Here, E and Me represent the mathematical expectation and the average of the horizontal intra-frame difference signal, respectively. Fig. 1 shows that the smaller the absolute value of the intra-frame difference signal, the greater its probability, with the maximum probability at zero difference, and vice versa. It follows that for flat areas, where the image brightness changes little, the value of σe is small and difference signals with large absolute values occur with low probability, so the residual of these areas can be chosen for the second intra prediction. In contrast, for images with complex motion, the values of σe are larger and difference signals with large absolute values are more probable, so these areas are not suitable for the second intra prediction.
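The difference signals and their spread can be sketched in a few lines of Python. The source does not define σe explicitly or state how it is normalized, so this sketch assumes it is the population standard deviation of the difference signal; the function names and the two sample blocks are hypothetical.

```python
import statistics


def horizontal_difference(block):
    """f(x, y) - f(x, y-1) for every pixel that has a left neighbour."""
    return [row[y] - row[y - 1] for row in block for y in range(1, len(row))]


def vertical_difference(block):
    """f(x, y) - f(x-1, y) for every pixel that has an upper neighbour."""
    return [block[x][y] - block[x - 1][y]
            for x in range(1, len(block))
            for y in range(len(block[0]))]


def difference_sigma(differences):
    """Assumed meaning of sigma_e: population standard deviation of the differences."""
    return statistics.pstdev(differences)


# Made-up 4x4 blocks: one nearly flat, one with strong local variation.
flat = [[10, 10, 11, 10],
        [10, 11, 10, 10],
        [11, 10, 10, 11],
        [10, 10, 11, 10]]
busy = [[10, 60, 5, 90],
        [80, 3, 70, 20],
        [15, 95, 30, 60],
        [70, 10, 85, 5]]

print(difference_sigma(horizontal_difference(flat)))
print(difference_sigma(horizontal_difference(busy)))
```

Under this assumption the nearly flat block gives a much smaller spread than the busy block, which matches the qualitative behaviour described for Fig. 1.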

Intra prediction using statistical properties of image difference signal
The ESOP encoding procedure is shown in Fig. 2, and the detailed steps of the second prediction are as follows:
Step 1: For each macroblock partition, the residual of the inter prediction is obtained through motion estimation with the traditional inter-prediction modes, and the residual of the intra prediction is obtained through the traditional intra-prediction modes.

Fig. 2. Encoding procedure of ESOP
Step 2: For the inter-prediction residual obtained in Step 1, we use the statistics of the intra-frame difference signal to check the correlation between residual pixels. In the proposed ESOP algorithm, the value of σe is the key parameter for the second intra prediction. On the one hand, if σe is too large, the correlation between the pixels of the inter-prediction residual is small, which reduces the accuracy of the second intra prediction. On the other hand, if σe is too small, the loss of image information increases and leads to large errors in the reconstructed pixels. We therefore set the range of σe to 0.2 to 0.4. If the σe of the intra-frame difference signal lies in [0.2, 0.4], we use ESOP as the second intra prediction. Otherwise, we do not perform second-order prediction and go directly to Step 3.
Step 3: Similar to 4x4 intra prediction in HEVC, different ESOP modes are supported by adjusting the prediction direction, as shown in Fig. 3. The best second-order prediction mode for each block is decided using the rate-distortion framework.

Fig. 3. Second-order prediction modes
Step 4: The final bitstream encodes the original intra residual for I blocks, the original inter residual, and the residual of the second-order prediction.
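The control flow of Steps 2 and 3 can be sketched as follows. This is a minimal illustration, assuming the usual Lagrangian rate-distortion cost J = D + λR for the mode decision; the mode names, the distortion and rate figures, and the function names are hypothetical and are not taken from the HM software.

```python
SIGMA_LOW, SIGMA_HIGH = 0.2, 0.4  # range of sigma_e quoted in Step 2


def use_second_order_prediction(sigma_e, low=SIGMA_LOW, high=SIGMA_HIGH):
    """Step 2: apply the second intra prediction only when sigma_e lies in [low, high]."""
    return low <= sigma_e <= high


def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def choose_mode(sigma_e, candidates, lagrange_multiplier):
    """Step 3: return the cheapest mode name, or None when Step 2 rejects the block.

    candidates maps a (hypothetical) mode name to a (distortion, rate_bits) pair.
    """
    if not use_second_order_prediction(sigma_e):
        return None
    return min(candidates,
               key=lambda mode: rd_cost(*candidates[mode], lagrange_multiplier))


# Made-up candidate costs for three illustrative prediction directions.
candidates = {"vertical": (120.0, 40), "horizontal": (100.0, 60), "dc": (150.0, 30)}

print(choose_mode(0.3, candidates, lagrange_multiplier=2.0))
print(choose_mode(0.6, candidates, lagrange_multiplier=2.0))
```

Keeping the σe check separate from the cost comparison means that a rejected block never incurs the mode search, which is one plausible reading of why the check adds little encoding time.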

Experimental Results
We have implemented the proposed algorithm using the HEVC verification models HM10.1 and HM14.0 [13] for coding various 176x144 (QCIF) and 352x288 (CIF) video sequences. The experiments for SOP and RP are conducted under the common test conditions (High Profile) recommended for coding efficiency experiments. Table 1 lists the encoder setup used in testing these video sequences [14]. The proposed ESOP is simulated under the same conditions. Table 2 and Table 3 compare the BD-bitrate savings of the proposed scheme, SOP and RP for QCIF and CIF video sequences, respectively. The average BD-bitrate saving of the proposed ESOP is 6.58% for QCIF sequences and 3.07% for CIF sequences, larger than those of SOP and RP. Table 4 presents the detailed PSNR gain and bit-rate saving on three CIF sequences, comparing ESOP with HEVC P-picture coding according to the BD-PSNR tool [15]. It demonstrates that the approach outperforms HEVC P-picture coding by 0.40 dB of BD-PSNR, which corresponds to a 5.77% bit-rate saving at the same PSNR [15] on average. In addition, we measure the coding time for each sequence. Although the time cost increases by 1.23% on average, this has no substantive effect on coding efficiency. Fig. 4 compares the total bit counts of the HM 14.0 reference software [13], RP [11] and the proposed ESOP. The total bit cost of the proposed ESOP is reduced by about 6.12% on average.

Conclusion
In this paper, we have proposed an enhanced second order prediction algorithm for HEVC video coding. By checking the correlation between residual pixels of the first inter prediction according to the statistics of the intra-frame difference signal, the proposed algorithm controls the second intra prediction so that it is carried out only where an accurate prediction is possible, even when the video contains complex movements such as shape transformation, rotation or fading. We have tested the proposed algorithm on various QCIF and CIF video sequences under different reference software versions (HM10.1 and HM14.0). Experimental results on 7 CIF video sequences and 12 QCIF video sequences show that the proposed ESOP provides, on average, 6.35% BD-bitrate saving for QCIF sequences and 4.42% BD-bitrate saving for CIF sequences. Moreover, the proposed algorithm implemented in the HEVC reference software achieves up to 0.40 dB of BD-PSNR gain, which corresponds to a 5.77% bit-rate saving, compared with HEVC P-picture coding when the video has relatively slow movement for CIF sequences.
In the SOP and RP algorithms proposed in [9]-[12], the adequacy of the prediction scheme and of the reference data cannot be exactly guaranteed. In contrast, by checking the correlation before applying the second prediction, our proposed algorithm achieves better coding efficiency with only a slight increase in coding time.
Our future work will focus on improving the image quality at lower bit rates for video with large-range complex movements and on combining the proposed algorithm with B frames.