DAMAGED WATERMARKS DETECTION IN FREQUENCY DOMAIN AS A PRIMARY METHOD FOR VIDEO CONCEALMENT

This paper deals with video transmission over lossy communication networks. The main goal is to develop a video concealment method that corrects information losses and errors. First, the three main groups of video concealment methods, divided according to the degree of encoder/decoder collaboration, are briefly described. A modified algorithm based on the detection and filtering of damaged watermark blocks embedded in the transmitted video is then developed. Finally, the efficiency of the developed algorithm is presented in the experimental part of the paper.


Introduction
The main problems arising in video transmission over IP communication networks are caused by network congestion. With the growing popularity of multimedia services, video concealment techniques are becoming more important. Many video concealment methods with different efficiency have been proposed. They can be divided into three primary groups, namely forward error concealment, interactive video concealment and post-processing video concealment. These methods were developed to work separately. In this paper, a new combined concealment algorithm is developed in order to increase the efficiency of error cancellation [1].
The outline of the paper is as follows. In the next two sections, an overview of the basic principles of video concealment techniques, especially the post-processing methods, is given. Based on the state of the art in this area, the new method, which utilizes a damaged watermark detection algorithm, is presented in section 4. Finally, the experiments, a brief summary and future tasks are discussed in sections 5 and 6.

Methods of Video Concealment
In recent years, three basic types of video concealment have been developed: forward, interactive and post-processing video concealment. In the first type, the encoder plays the leading role. If the transmission channel is not error-free, two kinds of distortion can be observed at the decoder: quantization noise caused by the waveform coder, and distortion caused by transmission errors. An optimal pair of source and channel encoders should minimize both types of distortion and thus achieve better video transmission performance. Forward video concealment includes FEC (Forward Error Correction), joint source and channel coding, and layered coding transmitted over channels with different priorities. Some of these methods require the cooperation of the network and the coders in order to achieve different levels of quality of service for different parts of the video stream. Because the base layer contains the principal video information, it is transferred with higher reliability. Higher video quality can be achieved by adding enhancement layers with additional video information. Layered coding can be designed in several ways, for example in the temporal domain with different frame rates or in the spatial domain with different resolutions. The most important task is the partition of the video data into the given layers. A transmission error in an enhancement layer can decrease the video quality. This type of video concealment adds redundancy in the source or channel coder and also utilizes transport prioritization. Its main advantage is good error resistance, which is paid for by increased overhead.
Forward and post-processing video concealment involve very little interaction between encoder and decoder. If a backward channel from the decoder to the encoder is available, the two devices can cooperate and achieve better results in processing the damaged video. This cooperation can be realized in the source or the channel coder. The source coder can adapt its coding parameters based on the backward channel information from the decoder. The backward channel information can also be used to reserve bandwidth for FEC or for retransmission. Other interactive video concealment techniques include selective coding in the source encoder, adaptive coding at the channel level, resending lost data or damaged parts without waiting, and prioritization and multiple copies of the resent data. The decoder sends information about correctly received data, from which the encoder can determine the damaged parts. These parts are then replaced with the correct parts from the encoder buffer and used for prediction, which helps to reach higher error resilience. Automatic repeat request can be used together with a conventional decoder. In general, one retransmission introduces a delay of about 70 ms. This value is acceptable for most non-real-time applications, but the resulting decoding delay may be unacceptable for real-time applications. To suppress this delay, a method has been proposed that avoids the decoding delay by remembering the path of the damaged data at the decoder side. Another method decreases the decoding delay by sending multiple copies of the lost data in every retransmission, which reduces the number of requested retransmissions.
The last type, post-processing video concealment, uses the spatial and temporal redundancy present in the video signal. The dimension of the filter used for interpolating the lost blocks depends strictly on the amount of motion in the concealed area of the processed frame. There are several combinations of temporal and spatial causal interpolation masks. If a frame memory is used, the interpolation masks can also have a non-causal form. Temporal causal concealment techniques repair lost blocks with the corresponding blocks from previous frames, spatial concealment techniques calculate lost pixels from neighboring pixels of the same frame, and spatio-temporal concealment techniques merge both principles. Moreover, spatial interpolation techniques are preferred for video with a low amount of motion, whereas temporal interpolation techniques are preferred for video with a high amount of motion [1], [2].

Post-Processing Video Concealment
In this section, some post-processing video concealment methods are introduced. They are characterized by minimal interaction between encoder and decoder and use the correlation between temporally and spatially adjacent pixels or blocks. Others use the estimation of optical flow at pixel or block level and the extrapolation of motion vectors. The methods described last repair whole damaged frames by multiframe video concealment in order to effectively conceal errors generated during transmission over the network.

Hybrid Method of Video Concealment
The hybrid method utilizes the spatial and temporal redundancy in the video signal. It masks lost blocks using information from the two previous frames and from the spatially adjacent blocks of the current frame. Spatial concealment uses the directly adjacent vectors, which are also used for a smooth estimation of the lost blocks. Temporal error concealment replaces a lost block with a block from the previous frame. In addition, the method uses the PDC (Pixel Difference Classification) function, which compares the pixels of the processed frame with each other and with the pixels at the same positions in the previous frame, and thereby detects motion in the video. When the value of the PDC function is higher than a threshold, the algorithm uses spatial concealment only, because with too much motion temporal concealment might be inaccurate. This algorithm can be used for applications that do not require very high video quality, such as video conferencing or broadcasting for mobile phones [2], [3].
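The switching rule between temporal and spatial concealment can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from [2], [3]: the function names, the per-pixel difference threshold pixel_thr and the decision threshold pdc_thr are hypothetical, and the PDC value is simplified here to the fraction of pixels whose absolute difference from the co-located pixel of the previous frame exceeds pixel_thr.

```python
def pdc(curr_pixels, prev_pixels, pixel_thr=10):
    """Simplified PDC (assumption): fraction of co-located pixel pairs
    whose absolute difference exceeds pixel_thr."""
    changed = sum(1 for c, p in zip(curr_pixels, prev_pixels)
                  if abs(c - p) > pixel_thr)
    return changed / len(curr_pixels)


def choose_concealment(curr_neighbours, prev_neighbours, pdc_thr=0.5):
    """Return 'spatial' when the PDC value signals strong motion,
    otherwise 'temporal'. pdc_thr is a hypothetical threshold."""
    value = pdc(curr_neighbours, prev_neighbours)
    return "spatial" if value > pdc_thr else "temporal"


# Pixels around a lost block in the current frame and in the previous frame.
print(choose_concealment([10, 20, 30, 40], [10, 21, 29, 40]))
print(choose_concealment([10, 20, 30, 40], [100, 120, 130, 140]))
```

In the first call the neighbourhood is almost static, so the lost block would be copied from the previous frame; in the second call every pixel changed strongly, so only spatial concealment would be used.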

Video Concealment Algorithm on Pixel Level (CAp)
Video concealment usually uses information from the blocks adjacent to the lost block. If a whole frame is lost, conventional concealment methods fail because the information from adjacent blocks is unavailable. In such cases, other methods have to be applied that estimate the motion from previously decoded frames in order to conceal the lost frame. A pixel-based method built on optical flow estimation has been proposed for the situation in which a whole frame is lost. Optical flow estimation rests on the hypothesis that the motion between two consecutive frames does not differ much. The lost frame is concealed by projecting the adjacent correctly decoded frame onto the lost frame in the pixel domain. The motion in the video can be assessed by estimating the motion vector field. Proper and effective concealment of a lost frame requires several steps: estimation of the motion vector field, its spatial adjustment, projection onto the lost frame and, finally, interpolation and filtering [5].
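The projection step can be illustrated with a small sketch. It is a simplification under stated assumptions and not the algorithm of [5]: the motion field is taken as given (its estimation and spatial adjustment are omitted), the vectors are assumed to persist unchanged into the lost frame, and holes left by the projection are simply filled with the co-located pixel of the previous frame instead of being interpolated and filtered. The function name conceal_lost_frame is hypothetical.

```python
def conceal_lost_frame(prev_frame, motion):
    """Project prev_frame onto the lost frame along per-pixel motion vectors.

    prev_frame: 2-D list of pixel values (last correctly decoded frame).
    motion:     2-D list of (dy, dx) integer vectors, one per pixel,
                assumed to stay valid for the lost frame.
    """
    h, w = len(prev_frame), len(prev_frame[0])
    lost = [[None] * w for _ in range(h)]

    # Projection: move every pixel along its motion vector.
    for y in range(h):
        for x in range(w):
            dy, dx = motion[y][x]
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                lost[ty][tx] = prev_frame[y][x]

    # Hole filling (simplified): reuse the co-located previous pixel.
    for y in range(h):
        for x in range(w):
            if lost[y][x] is None:
                lost[y][x] = prev_frame[y][x]
    return lost


prev = [[1, 2, 3],
        [4, 5, 6]]
shift_right = [[(0, 1)] * 3 for _ in range(2)]
print(conceal_lost_frame(prev, shift_right))
```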

Video Concealment Algorithm on Block Level (CAb)
The block-level video concealment method tries to provide concealment that can be implemented in real-time applications. The CAb method uses the concept of optical flow to reconstruct the motion vector field of the lost frame [5]. Reconstruction at block level offers two main advantages. First, the algorithm is used when the decoder detects a random loss in the picture; the solution consists of a single filling of the data structure used in normal macroblock decoding and data filtering, so this form can be used in a parallel architecture in which filtering and interpolation are processed together. Second, all filtering operations needed for the interpolation of lost information, for example motion vectors from intra-coded areas, are performed at block level with a resolution of 4x4 pixels. This resolution enables more accurate processing than the CAp method.
This concealment algorithm for a lost frame in the block domain proceeds in several steps. First, a suitable reference frame is searched for in order to increase coding effectiveness; that is, the algorithm looks for the temporally closest frame in the reference buffer. The motion vectors of all pixels of the reference frame are then projected onto the lost frame. In the next step, statistics of the motion vectors in macroblocks and in blocks of different sizes are computed. To assess the motion more precisely, estimation is performed at both macroblock and block level. The algorithm tries to find the more uniform motion in the image, such as the background, at macroblock level, and uses block-level estimation for the remaining part of the image.

Video Concealment with HMVE
Motion vectors of lost pixels that are extrapolated from the motion vectors of the previous frame may not be accurate. Some of them can be badly extrapolated, especially in video with a very high amount of motion. These limitations decrease the accuracy of the motion vector of a pixel as well as the overall performance. For this reason, techniques have been developed that use extended extrapolation of motion vectors at pixel level. To remove the limitations, a hybrid motion vector extrapolation method based on PMVE (Pixel Motion Vector Extrapolation) has been proposed that extrapolates not only the motion vectors of pixels but also the motion vectors of blocks. This method is able to eliminate badly extrapolated motion vectors and thus obtain accurate ones. Moreover, the HMVE (Hybrid Motion Vector Extrapolation) video concealment method works with a block resolution of 4x4 pixels and with the H.264/AVC video standard, which also uses compensation blocks of 4x4 pixels [6], [7].

Multiframe Concealment Method
This method uses not only previously decoded frames but also information from partly decoded following frames in order to increase the quality of the lost frame and of the following frames. The macroblocks of the lost frame are concealed in a suitable order that is determined from the information in the partly decoded following frames. The algorithm can be adjusted to a different number of reference frames used for the concealment of damaged parts of frames. A high number of frames can lead to more accurate concealment but may not be suitable for real-time applications. The concealment operations are performed on two levels: in the macroblock domain and on blocks of 4x4 pixels. In the first step, the algorithm decodes the frames following the damaged frame, then determines the priority values of the macroblocks in the damaged frame and sets their order accordingly. The macroblocks in the damaged frame are concealed using suitable motion vectors. After concealment, the remaining parts of the following frames can be decoded correctly.
Errors in a received frame not only decrease the quality of the video sequence but also cause error propagation into the following frames. Video concealment can minimize both the immediate impact of a lost packet on the current frame and the error propagation. Multiframe concealment methods conceal errors better than the previous methods in terms of the objective and subjective quality of the concealed lost frames, and they minimize error propagation [8]. A special kind of multiframe concealment has been applied to multi-view 3D video sequences, in which the frames captured by multiple cameras contain the same scene from different viewpoints [13].

Watermark Based Concealment Method in Frequency Domain
In this section, a new algorithm for detecting lost video blocks during video transmission is developed. It is based on a well-known method for measuring video quality, namely the watermark embedding method. The algorithm is divided into three main parts: the watermark embedding process at the transmitter side, and the lost block detector and the interpolation of lost blocks of pixels at the receiver side, as shown in Fig. 1. The receiver has to know the watermark embedded in the transmitted video [12].

Watermark Embedding
First, a watermark embedding technique with minimal effect on video quality has to be chosen. The watermark can be inserted into the video in several ways, namely in the spatial or the frequency domain and into the luminance or the chrominance components. Based on previous research, embedding in the frequency domain into the luminance component was selected. The watermarks serve as a feature for measuring transmission errors over the network.
A binary watermark of size 8x8 is embedded in the frequency domain into each DCT block of the luminance component (YCbCr color space) of every transmitted video frame. The higher sensitivity of the human eye to luminance and the simplicity of the whole algorithm are the main reasons for this choice (Fig. 2).
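Embedding a bipolar watermark into one 8x8 luminance block can be sketched as below. The sketch rests on several assumptions that the text does not fix: the watermark is added to the DCT coefficients with a hypothetical gain strength, the mid-frequency band is taken as the coefficients with lo <= i + j <= hi, and an orthonormal DCT-II is used. All function names are illustrative only.

```python
import math

N = 8  # DCT block size


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that D = C * B * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


C = dct_matrix()
CT = transpose(C)


def dct2(block):
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    return matmul(matmul(CT, coeffs), C)


def chess_watermark(lo=3, hi=8):
    """Bipolar <1;-1> chessboard limited to an assumed mid-frequency band."""
    return [[(1 if (i + j) % 2 == 0 else -1) if lo <= i + j <= hi else 0
             for j in range(N)] for i in range(N)]


def embed(block, watermark, strength=2.0):
    """Additive embedding (assumption): D' = D + strength * w."""
    coeffs = dct2(block)
    marked = [[coeffs[i][j] + strength * watermark[i][j] for j in range(N)]
              for i in range(N)]
    return idct2(marked)


luma_block = [[128.0] * N for _ in range(N)]
marked_block = embed(luma_block, chess_watermark())
```

A larger strength makes the watermark easier to detect but lowers the quality of the watermarked frame, which is the compromise discussed in the experimental part.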

Detector of Damaged Watermark in DCT Block
Let the receiver transform the received video frame from RGB or YUV to YCbCr, or receive the Y component of the YCbCr color space directly. The main role of the error detector is then to correctly detect, by means of the known watermark, the damaged DCT blocks transmitted over the lossy channel.
Let the inserted watermark w have the same dimension as the DCT block and be composed of bipolar values belonging to the high (HF), middle (MF) or low (LF) image frequencies (Fig. 3a). The best choice for watermark embedding into DCT blocks is the MF area (Fig. 3b): high values of LF coefficients can cause the periodic appearance of a white pixel in the upper-left corner in the pixel domain, and high values of HF coefficients cause a visible level of Gaussian noise. In the first step, the masks of the positive and the negative values of the watermark's DCT coefficients are computed using the logical function LOG1{.}, which returns the true value (logical 1) when the expression in the brackets holds; i, j are the coordinates of the DCT coefficient. Likewise, the overall numbers of positive and negative values are computed.
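The mask computation can be illustrated by the following sketch. Since the original formulas are not reproduced here, the comparisons w(i, j) > 0 and w(i, j) < 0 and all names (sign_masks, n_pos, n_neg) are assumptions that merely follow the verbal description of LOG1{.}.

```python
def sign_masks(watermark):
    """Return (positive mask, negative mask, n_pos, n_neg) of a watermark.

    A mask element is 1 where the corresponding watermark value is
    positive (or negative, respectively) and 0 elsewhere.
    """
    pos = [[1 if v > 0 else 0 for v in row] for row in watermark]
    neg = [[1 if v < 0 else 0 for v in row] for row in watermark]
    n_pos = sum(sum(row) for row in pos)
    n_neg = sum(sum(row) for row in neg)
    return pos, neg, n_pos, n_neg


# Small 3x3 example; zeros mark coefficients that carry no watermark.
example = [[0, 1, -1],
           [1, -1, 1],
           [-1, 1, 0]]
pos_mask, neg_mask, n_pos, n_neg = sign_masks(example)
print(n_pos, n_neg)
```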

Lost DCT Block Interpolation
Once a lost DCT block has been correctly identified by detecting the damaged watermark in the luminance channel, many interpolation techniques can be used. In this paper, a simple temporal VMF (Vector Median Filter), which belongs to the family of order-statistics filters, is used [11].
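A temporal vector median filter for a single pixel position can be sketched as follows. The sketch is illustrative and not the authors' implementation: the function names and the frame layout (a list of frames, each a 2-D list of RGB tuples) are assumptions. The vector median is the sample of the window with the smallest sum of Euclidean (L2) distances to all other samples; the default window of two previous and two following frames matches the window length 5 used in the experiments.

```python
import math


def vector_median(vectors):
    """Return the vector with the smallest summed L2 distance to all others."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(vectors, key=lambda v: sum(dist(v, u) for u in vectors))


def temporal_vmf(frames, t, y, x, half=2):
    """Filter pixel (y, x) of frame t over frames t-half .. t+half.

    The window is non-causal and is clipped at the sequence borders.
    """
    lo = max(0, t - half)
    hi = min(len(frames) - 1, t + half)
    window = [frames[k][y][x] for k in range(lo, hi + 1)]
    return vector_median(window)


# Five 1x1 frames; the middle frame carries a corrupted pixel.
good = (120, 64, 30)
frames = [[[good]], [[good]], [[(255, 0, 255)]], [[good]], [[good]]]
print(temporal_vmf(frames, t=2, y=0, x=0))
```

For a lost 8x8 block the same call would be repeated for all 64 pixel positions of the block.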

Experimental Results
In the experimental part, test video sequences were used to assess the performance of the algorithm.

Test Conditions
First, two static color images, Lena (representing a simple image) and Mandrill (representing a detailed image), were used as test image data. Both have a resolution of 256x256 pixels and contain 1024 DCT blocks of 8x8 coefficients; they are shown in Fig. 4a. These reference images were used to test binary watermark patterns before applying them to video sequences. Fig. 4c shows the test images with the embedded chess <1;-1> watermark pattern. The watermarked image Lena achieved an SSIM of 0.9876 and Mandrill 0.9957.
Many types of watermarks were tested. At first, the watermarks were inserted into all DCT coefficients. In these experiments, bipolar binary watermarks called chessboard, horizontal lines, vertical lines and text were tested with the value combinations <0;1> and <1;-1> and their multiples. They are shown in Fig. 4b.
Tab. 1 presents the impact of the watermark pattern on the number of falsely detected blocks for ThresholdSCORE = 0.4. The experiment shows that the pattern "Chess" with the binary range <1;-1> produced only 10 falsely detected DCT blocks. A further positive finding is that better results are obtained for the detailed image Mandrill. Watermarks embedded into all coefficients produced the image artifacts mentioned in the previous section. The choice of watermark depends strongly on two opposing goals, namely maximizing video quality and being able to detect damage in the video. This compromise leads to the chessboard watermark in the MF DCT coefficients with <1;-1> values. With a suitable threshold score adjusted on the test color images, the most suitable watermark was embedded into the video sequences.
Tab. 1: Impact of the watermark pattern on the number of falsely detected DCT blocks, ThresholdSCORE = 0.4.
After the experiments with static color images, two test video sequences were used as well: "Taxi" as a representative of real video (classic movie) and "IceAge" as a representative of artificial video (animated cartoon). The Taxi video shows a conversation between two people at a car, and the IceAge video shows a conversation between animated animals. Both videos have a size of 320x120 pixels at 25 fps, with 250 frames and a total duration of 10 seconds, and are coded with the Indeo5 codec (IV50) supported by MATLAB.

Watermark pattern
Fig. 5 shows the number of falsely detected DCT blocks (8x8 pixels) with embedded watermarks versus the threshold score. Several chessboard watermarks with various parameters were tested as well. The experiment shows that the most suitable threshold values lie below 0.4.

Lossy Channel Modelling
The algorithm was tested as an offline application in which the errors caused by video transmission over a lossy communication channel were modelled. Various types of errors were considered, modelled with uniform and Gaussian PDFs. The results were measured with the well-known SSIM (Structural SIMilarity) parameter. The test videos of size 320x120 pixels contain 600 DCT blocks of 8x8 pixels in one frame (40 per row by 15 per column).
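The block counts quoted for the test material follow directly from the frame size, as the short sketch below shows. The helper random_loss_mask is purely hypothetical: the text does not specify how the PDF parameters map to damaged blocks, so the sketch only marks each block as lost independently with an assumed probability loss_prob, which is one simple way to generate test damage offline.

```python
import random


def block_grid(width, height, block=8):
    """Number of DCT blocks per row and per column."""
    return width // block, height // block


def random_loss_mask(cols, rows, loss_prob, seed=0):
    """Hypothetical loss model: each block is lost with probability loss_prob."""
    rng = random.Random(seed)
    return [[rng.random() < loss_prob for _ in range(cols)]
            for _ in range(rows)]


cols, rows = block_grid(320, 120)
print(cols, rows, cols * rows)        # test videos: 40 15 600

img_cols, img_rows = block_grid(256, 256)
print(img_cols * img_rows)            # test images: 1024

mask = random_loss_mask(cols, rows, loss_prob=0.05)
print(sum(sum(row) for row in mask), "blocks marked as lost")
```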

Developed Algorithm Behaviour and Results
It is very important that after the watermark is inserted into the video, the average video quality in terms of the objective SSIM value does not drop below 0.98. In line with this condition, the average SSIM value for the embedded watermark was not lower than 0.99 [9][10]. After embedding the watermarks into the test videos, the achieved SSIM values were 0.9905 for the Taxi video sequence and 0.9910 for the IceAge video sequence [12]. For correct detection of corrupted watermarks, the thresholds have to be adjusted, since they determine the correctness of the detection. If the calculated value is lower than the threshold, the algorithm evaluates the processed DCT block as damaged and starts the interpolation process.
The VMF (Vector Median Filter) with a temporal filter mask was used as the interpolation method. It is a very powerful method when fast-changing video data have to be restored. Moreover, its vector approach yields interpolation with a low CD (Color Difference). When DCT blocks are lost, temporal filtering is preferred over spatial filtering, mainly because of the low interpolation error in the centre of the DCT blocks. Because the probability that DCT blocks at the same position are damaged in several consecutive video frames is low, the VMF does not need a long filter window. In the experiments, a filter window of length 5 was used; the short window also makes the interpolation fast. The VMF was non-causal, using the two previous and the two following frames relative to the current frame, and was based on the Euclidean distance (L2). The DCT blocks marked as lost were filtered after the pixels of the non-corrupted DCT blocks had been converted from the YCbCr to the RGB color space. The filtering was then carried out on all 64 pixels of each lost DCT block. The frames interpolated by the temporal VMF are presented in Fig. 4b and 4d (normal PDF, V=0.03). The dependence of the SSIM parameter on ThresholdSCORE was tested for the test video Taxi and for two parameters of the Gaussian PDF. The optimal value was found experimentally to lie at up to 0.4 for both V=0.05 and V=0.15, and a value of 0.2 was used in the following experiments (shown in Fig. 7). The values of ThresholdUP and ThresholdDOWN, which adjust the acceptance level for an undamaged value of the watermarked image, were identical in all experiments and equal to 0. Because the embedded watermark is bipolar <1,-1>, setting both to zero is statistically appropriate. The achieved curves are presented in Fig. 7.
The next experiments investigated the behaviour of the algorithm for real and cartoon videos. In addition, the dependence of the quality parameter SSIM on the PDF parameters was examined. The PDF parameter of both distributions, normal and uniform, was varied in the range from 0.01 to 0.2, as shown in Tab. 2. Fig. 7 and 8 present the achieved curves for the damaged and the concealed videos. The experiments show that the average value of the video quality parameter SSIM increases, which confirms that the video quality improved after application of the developed concealment algorithm.
Not all damaged blocks are concealed, because the detector is not able to detect all damaged watermarks. Nevertheless, the developed algorithm achieves similar results for the classic movie and for the animated cartoon. The level of damage for the test video IceAge is higher than for Taxi for both PDFs. After application of the error concealment technique, both test videos reached a similar SSIM level at the same PDF parameter: around 0.9 for the uniform PDF and around 0.85 for the normal PDF. These values were measured at the PDF parameter D or V equal to 0.2.

Conclusion
In this paper, a novel method for video concealment has been proposed and tested. The developed algorithm is based on the detection of damaged watermarks embedded in the transmitted video. The watermarks are embedded into the video frames in the frequency domain. For the simulations, transmission errors in communication networks were modelled by normal and uniform distributions. The algorithm detects damaged blocks by identifying lost watermarks in the DCT blocks. Finally, a temporal vector median filter is used for the interpolation of lost blocks. The experiments confirm the efficiency of the developed algorithm, as reported in the experimental part of this paper. Future improvements of the algorithm could adopt temporal-spatial concealment filter masks, enhancement filters, an adaptive detector of lost watermarks and a shot boundary detector. Likewise, an enhanced method for embedding and detecting watermarks in video frames could improve the results. To test the developed algorithm in real operation, test video data have to be transmitted over a real channel.

Fig. 3: DCT block, a) division into image frequencies, b) an example of the bipolar watermark "Chess" concentrated in the MF area.
Let ThresholdUP and ThresholdDOWN set the acceptance level for an undamaged value of the watermarked image. Each element of the detection result is computed against these thresholds, and the DCT block is then classified as corrupted or not, where Y is the luminance channel of the watermarked image received by the receiver. When the algorithm evaluates a block as damaged, the block Y and the corresponding Cb and Cr color blocks are marked with the value 1 (True) in the mask of lost DCT blocks, LostDCT(Y). Creating the mask of lost DCT blocks for one video frame makes it easier to apply error concealment mechanisms at the damaged positions [9].
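One possible reading of this classification is sketched below. Because the original formulas are not reproduced here, the score definition is an assumption: it is taken as the fraction of watermarked coefficients whose received value lies on the expected side of the corresponding threshold (above thr_up for positive watermark values, below thr_down for negative ones). The names block_score, is_damaged and lost_dct_mask are hypothetical; the default thresholds of 0 and the score threshold of 0.2 are the values reported for the video experiments.

```python
def block_score(dct_block, watermark, thr_up=0.0, thr_down=0.0):
    """Assumed score: share of watermarked coefficients with the expected sign."""
    hits = total = 0
    for coeff_row, wm_row in zip(dct_block, watermark):
        for c, w in zip(coeff_row, wm_row):
            if w > 0:
                total += 1
                hits += c > thr_up
            elif w < 0:
                total += 1
                hits += c < thr_down
    return hits / total if total else 1.0


def is_damaged(dct_block, watermark, threshold_score=0.2):
    """A block is classified as damaged when its score is below the threshold."""
    return block_score(dct_block, watermark) < threshold_score


def lost_dct_mask(dct_blocks, watermark, threshold_score=0.2):
    """Mask of lost DCT blocks for one frame (1 = lost, 0 = intact)."""
    return [[1 if is_damaged(b, watermark, threshold_score) else 0 for b in row]
            for row in dct_blocks]


wm = [[1, -1], [-1, 1]]                 # toy 2x2 watermark
intact = [[2.0, -2.0], [-2.0, 2.0]]     # signs agree with the watermark
erased = [[0.0, 0.0], [0.0, 0.0]]       # watermark wiped out
print(lost_dct_mask([[intact, erased]], wm))
```

The same mask position would then be applied to the Cb and Cr blocks, as described above.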

Fig. 8: SSIM dependency on the PDF parameter for the test video "Taxi", a) uniform PDF, b) normal PDF.