Frame interpolation in a video stream using optical flow methods

This article deals with the interpolation and extrapolation of video stream frames in real time. Results of a study of a method for generating intermediate frames of a video stream are given. The method is based on optical flow computation. Implementations of the method using the Farneback, Brox and Duality-based TV-L1 approaches for optical flow computation with CUDA technology are compared. It is shown that the Farneback method gives the best result. The Duality-based TV-L1 method produces a smooth, high-quality optical flow, but its calculation speed is very slow. The Brox method is not suitable at all, since it shows the worst result in both calculation speed and optical flow quality. Implementations based on CUDA and OpenCL technologies are also compared. The OpenCL implementations are almost two times slower than the CUDA implementation, which makes it impossible to generate frames at 60 FPS in real time even for low-resolution images. However, CUDA works only on NVidia GPU accelerators. In conclusion, directions for further research on this topic based on deep neural networks are outlined.


Introduction
Motion interpolation is a task in video processing whose goal is to create one or more intermediate frames between existing frames. There are numerous works addressing this problem [1][2][3], but most of them apply motion interpolation to video streams where all frames are known in advance. Such methods are currently used in TV devices and in some software to smooth video by increasing the frame rate to 60 or more frames per second (FPS).
In contrast to these approaches, there are methods in which only the current and previous frames are known. Variations of this kind of method were presented in VR by Oculus, by Microsoft for WMR headsets, and in the SteamVR platform for HTC Vive headsets. The main problem is that these methods are tightly coupled to VR and to specific hardware.
In this paper we present a method that performs hardware-independent frame interpolation when only the current and previous frames are known. By applying interpolation and extrapolation techniques in combination with optical flow, intermediate frames can be generated.
In this paper the authors make the following contributions:
- a simple yet effective method for frame interpolation;
- generation of additional frames in video streams where only the current and previous frames are known.

Optical Flow
Optical flow is a method utilized in various fields of computer vision to determine shifts and to support segmentation, feature extraction, and video compression.
Optical flow is a picture of apparent motion, representing the shift of each pixel between two images. In effect, it represents a velocity field.
To establish pixel correspondence one should choose some pixel function that does not change as a result of the displacement. It is generally assumed that pixel intensity (brightness or color) remains unchanged. Obviously, intensity changes when the lighting or the lighting angle changes. However, differences between consecutive frames of a video stream are usually not significant, because the time delta between them is relatively small, so intensity is commonly used as the pixel function.
In the time perspective, a pixel I(x, y, t) is shifted by (Δx, Δy) in the next frame after time Δt. Since the pixel should not change between the two frames, we can assume that its intensity stays the same:

I(x, y, t) = I(x + Δx, y + Δy, t + Δt). (1)

Approximating the right-hand side by its Taylor series gives:

I(x + Δx, y + Δy, t + Δt) ≈ I(x, y, t) + (∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt. (2)

Removing the higher-order terms and dividing by Δt yields:

(∂I/∂x)(Δx/Δt) + (∂I/∂y)(Δy/Δt) + ∂I/∂t = 0, (3)

which results in:

Ix·u + Iy·v + It = 0, (4)

where u = Δx/Δt and v = Δy/Δt are the components of the optical flow. Several methods exist to solve this equation. In this paper the following are used: the Farneback method [4], the Brox method [5,6] and the Duality-based TV-L1 method [7].
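The brightness constancy constraint can be checked numerically. The following is a minimal sketch (not from the paper, and assuming NumPy is available): it builds a synthetic pair of frames with a known one-pixel horizontal shift, estimates the derivatives Ix, Iy, It by finite differences, and verifies that the residual of the constraint is close to zero for the true flow.

```python
import numpy as np

# Synthetic pair: frame 2 is frame 1 shifted right by exactly one pixel,
# so the ground-truth flow is u = 1, v = 0 everywhere.
h, w = 64, 64
x = np.arange(w)
I1 = np.tile(np.sin(2 * np.pi * x / 32), (h, 1))  # smooth horizontal pattern
I2 = np.roll(I1, shift=1, axis=1)                 # shifted by (u, v) = (1, 0)

# Derivatives by finite differences (central in space, forward in time).
Ix = (np.roll(I1, -1, axis=1) - np.roll(I1, 1, axis=1)) / 2.0
Iy = (np.roll(I1, -1, axis=0) - np.roll(I1, 1, axis=0)) / 2.0
It = I2 - I1

u, v = 1.0, 0.0                       # the known ground-truth flow
residual = Ix * u + Iy * v + It       # should be close to zero everywhere

print(np.abs(residual).max())
```

The residual is not exactly zero because the Taylor expansion drops higher-order terms; for a smooth image and a small shift it is small, which is exactly the assumption stated above about small time deltas between frames.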

Motion Interpolation
Interpolation is a method that allows predicting data points inside the given range of the data. Using interpolation, one can approximate a function for some dataset or find additional points; one can also approximate complex functions. From the perspective of this work, the more interesting aspect is finding additional data points within a given dataset. Motion interpolation is a variant of interpolation, mostly used in video processing to smooth animation by interpolating intermediate frames between existing ones. It also finds application in video compression: instead of storing every frame of a video stream, some frames contain only the motion difference with respect to the previous frame and can be reconstructed from it.
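As a minimal illustration of finding additional points inside a given range (a hypothetical scalar example, not from the paper; frame interpolation applies the same idea to per-pixel motion instead of scalar samples):

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([0.0, 2.0, 4.0, 6.0])   # samples of y = 2x

x_mid = 1.5                           # a point between existing samples
y_mid = np.interp(x_mid, xs, ys)      # linear interpolation inside [x0, xn]
print(y_mid)  # 3.0
```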
Modern displays offer this method as a feature to increase the displayed frame rate. For example, some TVs can show 120 FPS, but the content does not have that many frames, so the TV itself may interpolate additional frames and show 120 FPS instead of 60 FPS. Some media players, such as the SVP video player, implement the same feature.
Extrapolation is another type of approximation, in which the function is approximated outside the interval of the given data points rather than inside it, as in interpolation. As in interpolation, the approximating function should pass through the given data points.
In other words, extrapolation is the approximate determination of the values of a function f(x) at points x lying outside the range [x0, xn], from its values at the points x0, x1, …, xn.
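A minimal sketch of this definition (a hypothetical example, assuming NumPy): fit a line through the known points x0, …, xn and evaluate it at a point outside [x0, xn].

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 5.0, 7.0])        # samples of y = 2x + 1

# Least-squares line through the given points.
slope, intercept = np.polyfit(xs, ys, deg=1)

x_out = 4.0                                # lies outside [x0, xn] = [0, 3]
y_out = slope * x_out + intercept          # extrapolated value
print(y_out)
```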

Frame extrapolation using optical flow
In this work we present a method that can generate a frame not between two consecutive frames, but after these two frames in time. This task can simply be described as frame extrapolation, but the method can also be used for frame interpolation.
After solving equation (4) we obtain a motion vector (u, v) for each pixel. By multiplying the motion vector by a coefficient k, each pixel can be shifted to its extrapolated position:

(x', y') = (x + k·u, y + k·v).

For k = 1 this predicts the next frame, while 0 < k < 1 produces an intermediate frame.
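The warping step can be sketched as follows. This is a minimal NumPy version under simplifying assumptions (nearest-neighbour backward mapping, clamped borders); the function name and the toy constant-flow check are illustrative, and a production implementation would use subpixel interpolation such as cv2.remap.

```python
import numpy as np

def extrapolate_frame(frame, u, v, k=1.0):
    """Warp `frame` by k times the per-pixel flow (u, v)."""
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Backward mapping: the extrapolated pixel at (x, y) is fetched from
    # (x - k*u, y - k*v) in the current frame (nearest neighbour, clamped).
    src_x = np.clip(np.rint(xs - k * u).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys - k * v).astype(int), 0, h - 1)
    return frame[src_y, src_x]

# Toy check: a constant flow of one pixel to the right shifts the frame.
frame = np.zeros((4, 4))
frame[2, 1] = 1.0
u = np.ones_like(frame)    # every pixel moves right by 1 per time step
v = np.zeros_like(frame)
next_frame = extrapolate_frame(frame, u, v, k=1.0)
print(next_frame[2, 2])    # the bright pixel has moved from x=1 to x=2
```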

Experiments
The main criterion for evaluating the method is speed. As mentioned before, the comparison is performed between different optical flow calculation methods: the Farneback, Brox and Duality-based TV-L1 methods, as well as between the APIs used to compute them: CUDA and OpenCL.
The evaluation processes image sequences of different resolutions and measures speed both with and without read/write operations. The main evaluation parameter is processing speed. Quality is considered as well, but from a more subjective viewpoint. Examples of generated frames are shown in figures 2-4.
The evaluation was performed on three image sets with different resolutions: 2560×1600 pixels, 1182×664 pixels and 578×425 pixels, on a single test computer.

The Farneback method for optical flow calculation shows the best result among those mentioned above and is best suited for this task. The Duality-based TV-L1 method shows a smooth, good-quality optical flow, but its calculation speed is very slow. The Brox method is not suitable at all, as it shows the worst result in both calculation speed and optical flow quality. Results obtained using the CUDA API are presented in figures 5-7 and tables 1-2.

In terms of API, as expected, CUDA shows much better results than OpenCL. CUDA is more suitable for the task, but has a small disadvantage: it works only on NVidia GPU accelerators (according to the Steam hardware survey, about 75% of users have NVidia GPUs). On the other hand, OpenCL is a cross-platform API supported by all GPUs and CPUs, but its performance level is not good enough to consider it an option. The OpenCL implementations are almost two times slower than the CUDA implementation. This makes it impossible to generate frames at 60 FPS in real time even for low-resolution images, so the OpenCL implementation is insufficient for the task.

Conclusion
In this work a new approach to frame generation based on optical flow is presented. The proposed approach achieves frame interpolation using optical flow information obtained from the current and previous frames. Different optical flow methods were analysed in terms of speed and quality.
It was shown that the method can maintain acceptable frame generation times. However, this performance is only achievable for low-resolution frames.
Further work may build on the results presented here. There are several directions that may improve this work, in terms of both calculation speed and quality.
Deep neural networks are one way to continue research on this topic. As shown in other works, neural networks perform well in image super-resolution [8][9][10] as well as in image restoration [11]. By combining them, a new approach to frame interpolation may be developed that delivers good performance in both speed and quality.