Video reconstruction based on Intrinsic Tensor Sparsity model
Introduction
Compressive Sensing (CS) allows signals to be sampled below the Nyquist rate and has substantially changed data acquisition technology [1], [2], notably in compressive imaging systems and cameras. A representative imaging system is the Single Pixel Camera (SPC) developed at Rice University [3]. The advent of the SPC markedly changed imaging system design. For video sampling, CS is used to trade off the spatial and temporal resolution of cameras, and many video sampling methods have been designed [4], [5], [6] to capture videos with both high spatial and high temporal resolution. One effective sampling method is Coded Aperture Compressive Temporal Imaging (CACTI) [6], which reduces the bandwidth requirement with low implementation complexity.
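To make the spatial/temporal trade-off concrete, the following is a minimal sketch of a CACTI-style forward model as it is commonly described: several consecutive frames are each modulated by a shifted copy of one binary code and then summed into a single 2D measurement. The function name `cacti_measure`, the circular shift via `np.roll`, and the frame and mask sizes are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def cacti_measure(video, mask, shift=1):
    """Collapse T frames into one coded 2D measurement.

    video: array of shape (T, H, W); mask: binary array of shape (H, W).
    Assumption: the code for frame t is the mask shifted by t*shift rows
    (circularly here, for simplicity; a physical mask translates instead).
    """
    T = video.shape[0]
    codes = np.stack([np.roll(mask, t * shift, axis=0) for t in range(T)])
    measurement = np.sum(codes * video, axis=0)  # one snapshot for T frames
    return measurement, codes

rng = np.random.default_rng(0)
video = rng.random((8, 32, 32))                      # hypothetical 8-frame clip
mask = (rng.random((32, 32)) > 0.5).astype(float)    # hypothetical binary code
y, codes = cacti_measure(video, mask)
```

In this sketch one 32×32 snapshot stands in for eight frames, which is where the reduction in bandwidth comes from; recovering the frames from `y` and `codes` is the reconstruction problem the rest of the paper addresses.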
According to CS theory, the original signal can be reconstructed from its CS measurement. For a sparse signal $x \in \mathbb{R}^{N}$ observed through the measurement matrix $\Phi \in \mathbb{R}^{M \times N}$, $y = \Phi x$, we obtain an estimate of the original signal by solving the following optimization problem, which can be written as

$$\hat{x} = \arg\min_{x} \|y - \Phi x\|_2^2 + \lambda \|x\|_0,$$

where $\|\cdot\|_2$ is the $\ell_2$-norm, $\lambda$ is a regularization parameter, and $\|\cdot\|_0$ is the $\ell_0$-norm, i.e. the total number of non-zero elements in a vector.
The term $\|x\|_0$, a 'prior term', imposes a sparsity constraint on the signal. Various methods have been proposed to solve the $\ell_0$-norm optimization problem, such as evolutionary algorithms [7] and greedy algorithms [8], [9], [10], [11], [12]. However, the $\ell_0$-norm optimization problem is NP-hard. To address this, the $\ell_1$-norm minimization problem was proposed, since it gives the same result as the $\ell_0$-norm problem under certain conditions [13] and can be solved efficiently, for instance by iterative shrinkage algorithms [14], [15] and Basis Pursuit (BP) [16].
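As an illustration of the iterative shrinkage family mentioned above, here is a minimal sketch of the basic iterative shrinkage-thresholding algorithm (ISTA) applied to the $\ell_1$-relaxed problem $\min_x \tfrac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|x\|_1$. The problem sizes, the value of `lam`, and the random Gaussian `Phi` are assumptions chosen only for the demonstration; they are not the settings used in the paper or in [14], [15].

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(Phi, y, x, lam):
    return 0.5 * np.sum((y - Phi @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(Phi, y, lam=0.05, n_iter=500):
    """Basic ISTA with a fixed step 1/L, L = squared spectral norm of Phi."""
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)          # gradient of the data term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Hypothetical test problem: a 5-sparse signal of length 128, 64 measurements.
rng = np.random.default_rng(0)
N, M, K = 128, 64, 5
x_true = np.zeros(N)
support = rng.choice(N, K, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], K) * (1.0 + rng.random(K))
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x_true

x_hat = ista(Phi, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The soft-thresholding step is what replaces the combinatorial search implied by the $\ell_0$ term; the price is a small shrinkage bias in the recovered coefficients that grows with `lam`.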
In addition to the sparse models mentioned above, Statistical Compressed Sensing (SCS) with Gaussian Mixture Models (GMMs), which works with general Bayesian models, has been proposed in recent years [17]. GMMs describe most real signals well and have been applied to various image processing problems, such as classification [18], denoising [19], and CS reconstruction [20], [21]. Based on GMMs, we propose the Gaussian Joint Sparsity model to capture temporal similarity.
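The appeal of Gaussian priors in CS is that, for a linear measurement $y = \Phi x + n$ with Gaussian noise, the MAP estimate under a single Gaussian $\mathcal{N}(\mu, \Sigma)$ has the closed form $\hat{x} = \mu + \Sigma\Phi^{T}(\Phi\Sigma\Phi^{T} + \sigma^{2}I)^{-1}(y - \Phi\mu)$. The sketch below applies this per mixture component and keeps the component with the largest posterior probability. This is one common simplification, not necessarily the procedure of [17], [21] or of the GJS model; the function names, the two-component mixture, and all sizes are hypothetical.

```python
import numpy as np

def gaussian_map(y, Phi, mu, Sigma, sigma):
    """Closed-form MAP estimate under x ~ N(mu, Sigma), y = Phi x + noise."""
    C = Phi @ Sigma @ Phi.T + sigma ** 2 * np.eye(Phi.shape[0])
    r = y - Phi @ mu
    z = np.linalg.solve(C, r)
    x_hat = mu + Sigma @ Phi.T @ z
    _, logdet = np.linalg.slogdet(C)
    log_evidence = -0.5 * (logdet + r @ z)   # log p(y | component), up to a constant
    return x_hat, log_evidence

def gmm_estimate(y, Phi, weights, mus, Sigmas, sigma):
    """Estimate x with each component and keep the most probable one."""
    best = None
    for k, (w, mu, S) in enumerate(zip(weights, mus, Sigmas)):
        x_hat, log_ev = gaussian_map(y, Phi, mu, S, sigma)
        score = np.log(w) + log_ev
        if best is None or score > best[0]:
            best = (score, k, x_hat)
    return best[2], best[1]

# Hypothetical example: two low-rank Gaussian components in R^16, 8 measurements.
rng = np.random.default_rng(1)
N, M, R = 16, 8, 3
U = [rng.standard_normal((N, R)) for _ in range(2)]
Sigmas = [u @ u.T for u in U]
mus = [np.zeros(N), np.zeros(N)]
Phi = rng.standard_normal((M, N))
x_true = U[1] @ rng.standard_normal(R)       # drawn from component 1
y = Phi @ x_true

x_hat, k_hat = gmm_estimate(y, Phi, [0.5, 0.5], mus, Sigmas, sigma=1e-3)
```

Because each component's covariance here has rank 3, eight measurements are enough to pin down the signal once the right component is selected, even though the signal has 16 entries.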
As video reconstruction has attracted more attention, a series of algorithms have been proposed to reconstruct video sequences, such as Generalized Alternating Projection (GAP) [22] and the GMM-based algorithm [21]. The GMM-based algorithm, proposed by Yang et al., successfully applies a GMM to model spatial–temporal 3D video patches, yet it ignores the spatial and temporal similarity within the video.
It is well known that still images have geometric self-similarities, and this is especially true for video sequences. Many still-image restoration methods improve reconstruction by exploiting spatial similarity [23], [24], [25], [26]. In this paper, a reconstruction model based on Intrinsic Tensor Sparsity (ITS) and Gaussian Joint Sparsity (GJS) is proposed to exploit both the spatial and the temporal similarity of the video sequence. The innovations are briefly described as follows:
(1) We propose a tensor-sparsity-based reconstruction framework for video CS recovery that exploits nonlocal structured sparsity via sparse tensor approximation. In this model, similar 2D patches are searched for in the spatial–temporal domain, and ITS is used as the tensor sparsity measure, taking full advantage of the redundancy.
(2) We propose the Gaussian Joint Sparsity (GJS) model to reconstruct the initial video sequence by exploiting frame-to-frame similarity. In this model, 2D image blocks at the same position in adjacent frames are assumed to have the same structure and to obey the same Gaussian distribution.
(3) An efficient ADMM algorithm is designed to solve the ITS-based reconstruction problem. Moreover, the large matrix inversion that arises when solving for the video signal with the sparse tensor fixed is simplified by block CS.
When reconstructing the video, a reliable initial video sequence is first obtained by GJS, and the ITS-based video reconstruction model is then applied to improve the result. The two models work together and effectively improve the reconstruction.
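The patch search described in item (1) can be pictured as block matching over a small spatial–temporal window, with the best matches stacked into a third-order tensor. The following sketch assumes a squared Euclidean patch distance, an exhaustive search, and hypothetical parameter names (`patch`, `search`, `t_radius`, `k`); the paper's actual search strategy and settings are not given in this excerpt.

```python
import numpy as np

def group_similar_patches(video, ref, patch=8, search=6, t_radius=1, k=10):
    """Stack the k patches most similar to the reference patch.

    video: (T, H, W) array; ref: (t, i, j) top-left corner of the reference.
    Returns a (patch, patch, k') tensor and the matched coordinates,
    where k' <= k if the search window holds fewer candidates.
    """
    T, H, W = video.shape
    t0, i0, j0 = ref
    ref_patch = video[t0, i0:i0 + patch, j0:j0 + patch]
    candidates = []
    for t in range(max(0, t0 - t_radius), min(T, t0 + t_radius + 1)):
        for i in range(max(0, i0 - search), min(H - patch, i0 + search) + 1):
            for j in range(max(0, j0 - search), min(W - patch, j0 + search) + 1):
                cand = video[t, i:i + patch, j:j + patch]
                dist = float(np.sum((cand - ref_patch) ** 2))
                candidates.append((dist, t, i, j))
    candidates.sort(key=lambda c: c[0])      # stable sort, nearest first
    top = candidates[:k]
    tensor = np.stack(
        [video[t, i:i + patch, j:j + patch] for _, t, i, j in top], axis=2
    )
    coords = [(t, i, j) for _, t, i, j in top]
    return tensor, coords

# Hypothetical static scene: the same random frame repeated three times.
rng = np.random.default_rng(0)
frame = rng.random((16, 16))
video = np.stack([frame, frame, frame])
tensor, coords = group_similar_patches(video, ref=(1, 4, 4))
```

For a static scene the closest matches are the co-located patches in the neighbouring frames, so the stacked tensor is highly redundant along its third mode; that redundancy is what a tensor sparsity measure is meant to capture.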
The rest of the paper is organized as follows. Section 2 introduces some notions and related work. Section 3 describes the video reconstruction model based on ITS and the initialization method based on GJS. Section 4 reports the experimental results. Finally, conclusions are given in Section 5.
Notions and related work
This work involves tensors, the ITS measure, and CACTI. We next review related work in these three categories.
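For orientation, the ITS measure in the tensor sparsity literature is usually stated as $S(\mathcal{X}) = t\,\|\mathcal{S}\|_0 + (1-t)\prod_{n}\operatorname{rank}(X_{(n)})$, where $\mathcal{S}$ is the Tucker core of $\mathcal{X}$ and $X_{(n)}$ is its mode-$n$ unfolding. That formula is recalled from the related literature rather than from this excerpt, so treat it as an assumption. The sketch below evaluates the unrelaxed measure numerically using an HOSVD core and a relative tolerance `tol` for counting non-zeros; practical solvers typically optimize a smooth surrogate instead, which this sketch does not attempt.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: the chosen mode becomes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd_core(X):
    """Core tensor of the higher-order SVD (one Tucker decomposition)."""
    core = X
    for mode in range(X.ndim):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        # Mode-n product of the running core with U^T.
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core

def its_measure(X, t=0.5, tol=1e-8):
    """t * (#non-zeros of the core) + (1 - t) * product of unfolding ranks."""
    core = hosvd_core(X)
    nnz = np.count_nonzero(np.abs(core) > tol * np.abs(core).max())
    ranks = [np.linalg.matrix_rank(unfold(X, n)) for n in range(X.ndim)]
    return t * nnz + (1 - t) * int(np.prod(ranks))

# Hypothetical tensors: a rank-1 tensor versus a dense random one.
rng = np.random.default_rng(0)
a, b, c = rng.random(4), rng.random(5), rng.random(6)
low = np.einsum("i,j,k->ijk", a, b, c)
dense = rng.standard_normal((4, 5, 6))
s_low, s_dense = its_measure(low), its_measure(dense)
```

A rank-1 tensor scores far lower than a dense random tensor of the same size, which is the behaviour one wants from a sparsity measure applied to stacks of similar patches.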
Video reconstruction based on ITS and GJS
Natural images have self-similarities that can be used to improve image reconstruction results. There are three types of similarity in video sequences:
1. Within one frame of the video, an image block can find similar blocks in the same frame (see the upper-right boxes of Fig. 3). That is, there is spatial redundancy in each frame of the video.
2. For the static scenes in the video, if a scene occurs at one frame, the same scene will occur in the same position in the adjacent frames (see the
Experimental results
In this section, four methods are compared: the GMM proposed in [21] and three methods that incorporate the proposed models.
GMM: The algorithm proposed in [21].
GJS_PLE: The proposed initialization method of Section 3.2. Details are given in Algorithm 2.
GMM_ITS: GMM is used to obtain an initial video sequence. The proposed ITS-based video reconstruction model in Algorithm 1 is then used to improve the video sequence.
GJS_PLE_ITS:
Conclusion
This paper proposes a video reconstruction algorithm based on ITS and GJS for CACTI measurements. By transforming the video reconstruction problem into a tensor sparsity approximation problem, the proposed algorithm enjoys the following advantages: (i) The tensor sparsity model adequately captures the self-similarity of video by using a spatial–temporal tensor sparsity penalty. The ITS measure which finely encodes the correlation insights under the known Tucker and CP decomposition for tensors
Acknowledgments
This work was supported in part by the State Key Program of National Natural Science of China (No. 61836009), the National Natural Science Foundation of China (No. 61871310, No. 61573267, No. 61771376, No. 61801350, No. 61876220), in part by the Equipment Pre Research Field Foundation of China (No. 61403120101), in part by the Program for Cheung Kong Scholars and Innovative Research Team in University, China (No. IRT_15R53), in part by The Fund for Foreign Scholars in University Research and
References (33)
- et al. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmon. Anal. (2009)
- et al. A study of Gaussian mixture models of color and texture features for image classification and segmentation. Pattern Recognit. (2006)
- et al. Image reconstruction with locally adaptive sparsity and nonlocal robust regularization. Signal Process., Image Commun. (2012)
- Compressed sensing. IEEE Trans. Inform. Theory (2006)
- et al. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory (2006)
- et al. Single-pixel imaging via compressive sampling. IEEE Signal Process. Mag. (2008)
- et al. Maximum frame rate video acquisition using adaptive compressed sensing. IEEE Trans. Circuits Syst. Video Technol. (2011)
- Y. Hitomi, J.W. Gu, M. Gupta, T. Mitsunaga, S.K. Nayar, Video from a Single Coded Exposure Photograph using a Learned...
- et al. Coded aperture compressive temporal imaging. Opt. Express (2013)
- et al. Nonconvex compressed sensing by nature-inspired optimization algorithms. IEEE Trans. Cybern. (2015)
- Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process.
- Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inform. Theory
- Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit. IEEE Trans. Inform. Theory
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag.
- For most large underdetermined systems of linear equations the minimal l(1)-norm solution is also the sparsest solution. Comm. Pure Appl. Math.
- Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal.