Tensor Conjugate-Gradient methods for tensor linear discrete ill-posed problems

Abstract: This paper presents three types of tensor Conjugate-Gradient (tCG) methods for solving large-scale tensor linear discrete ill-posed problems based on the t-product between third-order tensors. An automatic strategy for determining a suitable regularization parameter is proposed for the tCG method in the Fourier domain (A-tCG-FFT). An improved version and a preconditioned version of the tCG method are also presented. The discrepancy principle is employed to determine a suitable regularization parameter. Several numerical examples in image and video restoration show the effectiveness of the proposed tCG methods.


Introduction
In this paper, we consider the solution of large minimization problems of the form

min_{X ∈ R^{m×p×n}} ‖A * X − B‖_F,  A = [a_{ijk}] ∈ R^{l×m×n},  B ∈ R^{l×p×n},  (1.1)

where the Frobenius norms of the singular tubes of A decay rapidly to zero as the index increases. In particular, A has ill-determined tubal rank: many of its singular tubes are nonvanishing but have tiny Frobenius norms of different orders of magnitude. Problems of the form (1.1) with such a tensor are called tensor linear discrete ill-posed problems. They arise, for instance, in the restoration of color images and videos; see, e.g., [1-5]. Throughout this paper, the operation * denotes the tensor t-product introduced in [1], and ‖·‖_F denotes the tensor Frobenius norm,

‖A‖_F = ( Σ_{i=1}^{l} Σ_{j=1}^{m} Σ_{k=1}^{n} a_{ijk}^2 )^{1/2}.  (1.2)

We assume that the observed tensor B ∈ R^{l×p×n} is contaminated by an error tensor E ∈ R^{l×p×n}, i.e.,

B = B* + E,  (1.3)

where B* ∈ R^{l×p×n} is the unknown and unavailable error-free tensor associated with B. It satisfies A * X* = B*, where X* denotes the exact solution of problem (1.1) that we wish to recover. We assume that an upper bound for the Frobenius norm of E is known, i.e.,

‖E‖_F ≤ δ.  (1.4)

A straightforward solution of (1.1) is generally not meaningful, because of the ill-posedness of A = [a_{ijk}] and the propagation and severe amplification of the error E into the computed solution. In this paper we use Tikhonov regularization to reduce this effect; it replaces (1.1) by the penalized least-squares problem

min_{X ∈ R^{m×p×n}} { ‖A * X − B‖_F^2 + µ ‖X‖_F^2 },  (1.5)

where µ > 0 is a regularization parameter. We assume that

N(A) ∩ N(I) = {O},  (1.6)

where N(A), the null space of the tensor A under *, is the set of all solutions X of the equation A * X = O, I is the identity tensor, and O ∈ R^{m×p×n} is the tensor whose elements are all zero. The normal equation of the minimization problem (1.5) is

(A^T * A + µ I) * X = A^T * B,  (1.7)

and

X_µ = (A^T * A + µ I)^{-1} * A^T * B  (1.8)

is the unique solution of the Tikhonov minimization problem (1.5) under the assumption (1.6).
There are many methods for solving large-scale tensor linear discrete ill-posed problems (1.1). Recently, a tensor Golub-Kahan bidiagonalization method [4] and a tensor GMRES method [5] were introduced for solving large-scale linear ill-posed problems (1.1) by iteratively solving (1.5). The randomized tensor singular value decomposition (rt-SVD) method in [6] was presented for computing with very large data sets, and has prospects in image data compression and analysis. Ugwu and Reichel [7] proposed a new randomized tensor singular value decomposition (R-tSVD), which improves on the truncated tensor singular value decomposition (T-tSVD) in [1]. Kilmer et al. [2] presented a tensor Conjugate-Gradient (tCG) method for tensor linear systems A * X = B corresponding to the least-squares problem (1.1), where the regularization parameter of the tCG method is user-specified.
This paper extends CG methods from matrix problems to tensor problems. Matrix-based approaches to image restoration and denoising flatten the three-dimensional data of a color image into matrix form for processing. Based on the t-product, we instead extend the matrix method to tensors, which preserves the correlations between the slices of the three-dimensional data. The tensor problem is projected into the Fourier domain, which exploits the particular structure of the t-product. We further discuss the tCG method for the approximate solution of (1.1) in the Fourier domain. The discrepancy principle is used to determine a suitable regularization parameter for the tCG method. The resulting automatic strategy is called the tCG method with automatic determination of the regularization parameter (A-tCG-FFT). A least-squares variant of the tCG method is provided for (1.1), which is called A-tCGLS-FFT, and a preconditioned version of the A-tCG-FFT method is presented, abbreviated as A-tPCG-FFT. The Conjugate-Gradient method only needs to store the current and previous gradient values, and therefore requires little memory. Moreover, since each iteration only needs previously computed gradient information, the computation can be parallelized to improve efficiency. Finally, projecting the tensor problem into the Fourier domain largely avoids the time and space complexity required to flatten a tensor problem into a matrix problem.
The rest of this paper is organized as follows. Section 2 introduces notation and preliminary results that will be used in what follows. Section 3 presents the A-tCG-FFT, A-tCGLS-FFT and A-tPCG-FFT methods for solving the minimization problem (1.5). Section 4 gives several examples on image and video restoration, and Section 5 draws some conclusions.

Preliminaries
This section gives some notation and definitions, and briefly summarizes some results that will be used later. For a third-order tensor A ∈ R^{l×m×n}, Figure 1 shows the frontal slices A(:, :, k), the lateral slices A(:, j, :) and the tube fibers A(i, j, :); we abbreviate A_k = A(:, :, k) for simplicity. The operator unfold(A) stacks the frontal slices of A into an ln × m matrix, and the operator fold folds this matrix back into the tensor A, i.e.,

unfold(A) = [A_1; A_2; ...; A_n],  fold(unfold(A)) = A.
Definition 2.1. [1] Given two tensors A ∈ R^{l×m×n} and B ∈ R^{m×p×n}, the t-product A * B is defined as

A * B = fold(bcirc(A) unfold(B)) = C,

where C ∈ R^{l×p×n} and bcirc(A) is the block-circulant matrix generated by the frontal slices of A.
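For concreteness, the definition above can be sketched numerically. The following Python/NumPy code is an illustrative sketch (not the authors' implementation; all function names are our own) of unfold, fold, bcirc and the t-product exactly as defined:

```python
import numpy as np

def unfold(A):
    """Stack the frontal slices of an l x m x n tensor into an (l*n) x m matrix."""
    l, m, n = A.shape
    return A.transpose(2, 0, 1).reshape(n * l, m)

def fold(M, l, m, n):
    """Inverse of unfold: reshape an (l*n) x m matrix back into an l x m x n tensor."""
    return M.reshape(n, l, m).transpose(1, 2, 0)

def bcirc(A):
    """Block-circulant matrix of an l x m x n tensor: block (k, j) is A_{(k-j) mod n}."""
    l, m, n = A.shape
    C = np.zeros((l * n, m * n))
    for k in range(n):
        for j in range(n):
            C[k*l:(k+1)*l, j*m:(j+1)*m] = A[:, :, (k - j) % n]
    return C

def t_product(A, B):
    """t-product A * B = fold(bcirc(A) @ unfold(B))."""
    l, m, n = A.shape
    m2, p, n2 = B.shape
    assert m == m2 and n == n2, "inner tensor dimensions must agree"
    return fold(bcirc(A) @ unfold(B), l, p, n)
```

The block-circulant structure is what makes the FFT-based computation of Section 3 possible: the same product can be evaluated slice-wise in the Fourier domain.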
The following remarks will be used in Section 3.
Remark 2.1. [8] For tensors A and B of suitable sizes, it holds that
(1) bcirc(A * B) = bcirc(A) bcirc(B);
(2) let F_n be the n-by-n unitary discrete Fourier transform matrix. Then

(F_n ⊗ I_l) bcirc(A) (F_n^H ⊗ I_m) = blockdiag(Â_1, Â_2, ..., Â_n),

where ⊗ is the Kronecker product, F_n^H is the conjugate transpose of F_n, Â is the tensor generated by applying the FFT along each tube of A, i.e., Â = fft(A, [ ], 3), and Â_k denotes the k-th frontal slice of Â.
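The block-diagonalization in Remark 2.1 can be verified numerically. The sketch below (our own illustration; the unitary DFT matrix is taken as F_n = fft(I_n)/√n) checks that conjugating bcirc(A) by the Kronecker factors turns it into a block-diagonal matrix whose blocks are the frontal slices of Â = fft(A, [ ], 3):

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix generated by the frontal slices of A."""
    l, m, n = A.shape
    C = np.zeros((l * n, m * n))
    for k in range(n):
        for j in range(n):
            C[k*l:(k+1)*l, j*m:(j+1)*m] = A[:, :, (k - j) % n]
    return C

rng = np.random.default_rng(1)
l, m, n = 2, 3, 4
A = rng.standard_normal((l, m, n))

A_hat = np.fft.fft(A, axis=2)               # FFT along each tube (third mode)
F = np.fft.fft(np.eye(n)) / np.sqrt(n)      # unitary DFT matrix F_n

# (F_n ⊗ I_l) bcirc(A) (F_n^H ⊗ I_m): block diagonal with blocks \hat{A}_k
D = np.kron(F, np.eye(l)) @ bcirc(A) @ np.kron(F.conj().T, np.eye(m))
```

This is exactly why the tensor problem decouples into n independent matrix problems, one per frontal slice of Â, in the Fourier domain.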
We also need the following remark.
Table 1. Description of notations.

Notation      Interpretation
A^{-1}        inverse of the tensor A
A(:, :, k)    the k-th frontal slice of the tensor A
A(i, j, :)    the tube fibers of the tensor A
A_k           A(:, :, k)
Â             FFT of A along the third mode
Â^H           transpose of the complex tensor Â
unfold(A)     the block column matrix of A
bcirc(A)      the block-circulant matrix of A
I             identity tensor

Tensor Conjugate-Gradient methods

The A-tCG-FFT method
Based on the tensor Conjugate-Gradient (tCG) method proposed by Kilmer et al. [10], this section presents a method for solving the regularized problem (1.5) by applying the CG process in the Fourier domain, where a suitable regularization parameter is determined automatically by the discrepancy principle. This method is abbreviated as A-tCG-FFT. It improves the tCG method presented in [2], where the regularization parameter was user-specified. The following result gives the equivalent form of (1.5) in the Fourier domain.
The normal equation of (3.1) can be represented slice-wise as

(Â_k^H Â_k + µ I) X̂_k = Â_k^H B̂_k,  k = 1, 2, ..., n.  (3.2)

Once X̂ has been computed, the approximate solution of (1.1) is obtained as X = ifft(X̂, [ ], 3).
Proof. Following Remarks 2.1 and 2.2, (1.8) can be represented in the Fourier domain; applying (2.4) slice-wise then yields X̂_k, k = 1, 2, ..., n.

The discrepancy principle selects the regularization parameter µ so that

‖Â_k X̂_{k,µ} − B̂_k‖_F ≤ η δ_k,  (3.4)

where η > 1 is usually a user-specified constant independent of δ_k. For more details on the discrepancy principle, see, e.g., [11].
In [12, 13], an a-priori strategy and an a-posteriori parameter choice rule for selecting the regularization parameter are discussed, together with convergence estimates between the exact solution and the regularized solution. Here, we choose the regularization parameter by automatically updating

µ_j = ρ µ_{j-1} = ρ^j µ_0,  j = 1, 2, ...,  (3.5)

where 0 < ρ < 1 and µ_0 = ‖Â‖. We then obtain a suitable regularization parameter by iteratively applying (3.5) until (3.4) is satisfied. Algorithm 1 summarizes the A-tCG-FFT method for solving (1.5). Algorithm 1 contains two nested iterations. The outer iteration updates µ by applying (3.5) until the discrepancy principle is satisfied. The inner iteration uses the CG method (CG-iteration) to solve the k-th normal equation (3.2) with the value µ_j determined by the outer iteration; here ⟨M, N⟩ denotes the inner product between matrices M and N. The inner iteration is stopped when the norm of the residual of the i-th iterate X̂_{k,µ_j,i} is less than a tolerance tol.
In the following Corollary 3.1, the rationale of the two choices of the initial iterate X̂_0 is given.
Corollary 3.1. Let X̂_0 = O be the initial solution of the inner iteration of Algorithm 1. Then the residuals satisfy

R̂_i = Â_k^H B̂_k − (Â_k^H Â_k + µ_j I) X̂_{k,µ_j,i},  (3.6)

and in particular the initial residual is R̂_0 = Â_k^H B̂_k.

Theorem 3.2. If X̂*_k is the exact solution of the symmetric positive definite equations (1.7), and X̂_{k,µ_j,i} and X̂_{k,µ_j,i+1} are generated by the inner CG-iteration of Algorithm 1, then

‖X̂*_k − X̂_{k,µ_j,i+1}‖_A ≤ ((√κ(A) − 1) / (√κ(A) + 1)) ‖X̂*_k − X̂_{k,µ_j,i}‖_A,

where A = Â_k^H Â_k + µ_j I and κ(A) = λ_max(A) / λ_min(A); here λ_max(A) and λ_min(A) are the maximum and minimum eigenvalues of A, respectively. Theorem 3.2 establishes the convergence of Algorithm 1; see [14] for a detailed proof.

The A-tCGLS-FFT method
In actual numerical computation, if Â_k^H Â_k is singular and µ is very small, then Â_k^H Â_k + µ_j I is ill-conditioned. Inspired by the idea in [15] for the numerical stability of linear systems in matrix form, in this section we use the CGLS method instead of the CG method in Algorithm 1.
The main difference between Algorithms 2 and 1 is that Algorithm 2 updates z_i = B̂_k − Â_k X̂_{k,µ_j,i} rather than the residuals (3.6) of Algorithm 1. We refer to [16] for more details on the corresponding linear system in matrix form.
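The CGLS idea can be sketched as follows (an illustrative Python/NumPy version of the inner iteration only; names are ours). It solves min_x ‖Â_k x − b‖² + µ‖x‖² by applying CG implicitly to the augmented system [Â_k; √µ I] x ≈ [b; 0], so that the possibly ill-conditioned product Â_k^H Â_k + µI is never formed explicitly:

```python
import numpy as np

def cgls(A, b, mu, x0, tol=1e-6, maxit=500):
    """CGLS for min ||A x - b||^2 + mu ||x||^2, avoiding explicit formation
    of A^H A + mu I (better numerical stability when A^H A is nearly
    singular and mu is small)."""
    x = x0.copy()
    z = b - A @ x                      # residual of the least-squares system
    r = A.conj().T @ z - mu * x        # residual of the normal equations
    p = r.copy()
    rs = np.vdot(r, r).real
    for _ in range(maxit):
        if np.sqrt(rs) < tol:
            break
        q = A @ p
        # alpha = ||r||^2 / ||[A; sqrt(mu) I] p||^2
        alpha = rs / (np.vdot(q, q).real + mu * np.vdot(p, p).real)
        x = x + alpha * p
        z = z - alpha * q
        r = A.conj().T @ z - mu * x
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

The iterate is mathematically the same as CG applied to the normal equations, but the computation squares neither the condition number of Â_k nor any rounding errors in forming Â_k^H Â_k.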

A preconditioned tensor Conjugate-Gradient method
In this subsection, we consider accelerating Algorithm 1 by preconditioning. In Algorithm 1, the coefficient matrix Â_k^H Â_k + µI of the k-th normal equation (3.2) is symmetric positive definite. We set M = Â_k^H Â_k + µI and apply an approximate Cholesky decomposition to M; this can be realized directly by the MATLAB function chol, giving M_chol = chol(M). Let M_L = M_chol^H and M_R = M_chol. Then M = M_L M_R, where M_L is a lower triangular nonsingular sparse matrix and M_R = M_L^H. We then solve the preconditioned normal equations

Ã_k X̃_k = B̃_k,  (3.8)

instead of (3.2) in Algorithm 1, where Ã_k = M_L^{-1}(Â_k^H Â_k + µI)M_R^{-1}, X̃_k = M_R X̂_k and B̃_k = M_L^{-1} Â_k^H B̂_k. Denote P̃_{k,i} = M_R P̂_{k,i} and t̂_{k,i} = M_L^{-1} R̂_{k,i}, and recall the matrix inner product ⟨A, B⟩ = tr(A^T B), where tr denotes the trace of a square matrix. Rewriting the CG recurrences for (3.8) in terms of the original variables yields the relations (3.10)-(3.16), and based on these we obtain the preconditioned version of Algorithm 1. Algorithm 3 lists the A-tPCG-FFT method.
The following result gives the convergence of Algorithm 3.
Theorem 3.3. If X̂*_k is the true solution of the symmetric positive definite equation (1.7), X̂_{k,µ_j,0} is the initial solution of the inner CG iteration of Algorithm 3, and X̂_{k,µ_j,i} is the i-th iterate, then

‖X̂*_k − X̂_{k,µ_j,i}‖_A ≤ 2 ((√κ(Ã_k) − 1) / (√κ(Ã_k) + 1))^i ‖X̂*_k − X̂_{k,µ_j,0}‖_A,  (3.17)

where A = Â_k^H Â_k + µ_j I, Ã_k = M_L^{-1} A M_R^{-1}, and κ(·) is defined as in Theorem 3.2.

Proof. Taking A as M_L^{-1} A M_R^{-1} in the analogous result for Algorithm 1 in [17] yields (3.17).
We can expect from Theorem 3.3 that Algorithm 3 generally converges faster than Algorithm 1, since cond(M_L^{-1} A M_R^{-1}) < cond(A). We illustrate this in the numerical experiments in Section 4.
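For illustration, a split-preconditioned CG step with a Cholesky factor can be sketched as follows (our own minimal Python/NumPy version; in Algorithm 3, M_L would be the (approximate) Cholesky factor of Â_k^H Â_k + µ_j I, while here any nonsingular factor with M = M_L M_L^H can be passed):

```python
import numpy as np

def pcg(A, b, x0, M_L, tol=1e-6, maxit=500):
    """Preconditioned CG for a Hermitian positive definite system A x = b
    with split preconditioner M = M_L M_L^H (e.g., a Cholesky factor)."""
    x = x0.copy()
    r = b - A @ x
    # t = M^{-1} r via two triangular solves: M_L y = r, then M_L^H t = y
    t = np.linalg.solve(M_L.conj().T, np.linalg.solve(M_L, r))
    p = t.copy()
    rt = np.vdot(r, t).real
    for _ in range(maxit):
        if np.linalg.norm(r) < tol:
            break
        Ap = A @ p
        alpha = rt / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        t = np.linalg.solve(M_L.conj().T, np.linalg.solve(M_L, r))
        rt_new = np.vdot(r, t).real
        p = t + (rt_new / rt) * p
        rt = rt_new
    return x
```

In the extreme case where M_L is the exact Cholesky factor of A, the preconditioned operator is the identity and PCG converges in a single step; an approximate factor trades some of that speed for a cheaper setup, which is the intent of Algorithm 3.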

Numerical examples
This section presents three examples showing the application of Algorithms 1-3 to the restoration of images and video. All computations were performed in MATLAB R2018a on a computer with an Intel Core i7 CPU and 16 GB RAM.
Let X_iter denote the iterative solution of (1.5). The quality of the approximate solution X_iter is measured by the relative error

‖X_iter − X_true‖_F / ‖X_true‖_F

and the signal-to-noise ratio

SNR(X_iter) = 10 log_10 ( ‖X_true − E(X_true)‖_F^2 / ‖X_iter − X_true‖_F^2 ),

where X_true denotes the uncontaminated data tensor and E(X_true) is the average gray-level of X_true. We simulate the error in the data tensor as B = B_true + E, where E is a noise tensor whose entries are normally distributed with zero mean and whose variance is chosen to correspond to a specific noise level ν = ‖E‖_F / ‖B_true‖_F. We summarize the cross-channel and within-channel blurring of the original blurring process in [3] that will be used in the following examples. The complete blurring model has the form

(A_blur ⊗ A^(1) ⊗ A^(2)) x_true = b_true,  (4.1)

where

A_blur = [a_rr a_rg a_rb; a_gr a_gg a_gb; a_br a_bg a_bb]

is a 3 × 3 matrix with each row summing to 1, which describes the cross-channel blurring as in [18].
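The two quality measures can be computed as follows (a small Python/NumPy sketch of our own; the SNR formula follows the definition above, with E(X_true) taken as the mean of X_true):

```python
import numpy as np

def relative_error(X_iter, X_true):
    """Relative error ||X_iter - X_true||_F / ||X_true||_F."""
    return np.linalg.norm(X_iter - X_true) / np.linalg.norm(X_true)

def snr(X_iter, X_true):
    """Signal-to-noise ratio in dB; E(X_true) is the average gray-level."""
    signal = np.linalg.norm(X_true - X_true.mean())**2
    noise = np.linalg.norm(X_iter - X_true)**2
    return 10 * np.log10(signal / noise)
```

A better restoration gives a smaller relative error and a larger SNR, which is how the algorithms are compared in Tables 2 and 3.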
Here x_true = vec(X_true), and vec(·) is an operator that transforms a matrix into a vector by stacking the columns of the matrix from left to right. A^(1) ∈ R^{n×n} and A^(2) ∈ R^{n×n} define the within-channel blurring; they are the horizontal and vertical inner blurring matrices, respectively; see [18] for more details. A special case to consider is a_rr = a_gg = a_bb = µ, a_gr = a_rg = a_gb = a_bg = β and a_br = a_rb = γ. If formula (4.1) is evaluated directly in MATLAB, the data are stored in matrix form and cannot be processed efficiently. Therefore the t-product structure is used, and (4.1) is written in the tensor form

A_υ * X_true * A_ω = B_true,  (4.2)

where A_υ ∈ R^{n×n×3} and A_ω ∈ R^{n×n×3}. Each blurring matrix A^(i) is a banded Toeplitz matrix with entries

a_jk = (1 / (σ √(2π))) exp(−(j − k)^2 / (2σ^2)) if |j − k| ≤ r, and a_jk = 0 otherwise,

where σ is a parameter that controls the amount of smoothing: the larger σ is, the more ill-posed the problem becomes.
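The banded Gaussian Toeplitz blurring matrix described above can be generated as follows (an illustrative Python sketch of our own, with σ and the half-bandwidth r as in the text):

```python
import numpy as np

def gaussian_blur_matrix(n, sigma, r):
    """Banded Toeplitz blurring matrix:
    a_{jk} = exp(-(j-k)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)) for |j-k| <= r,
    and a_{jk} = 0 otherwise."""
    A = np.zeros((n, n))
    for j in range(n):
        for k in range(max(0, j - r), min(n, j + r + 1)):
            A[j, k] = np.exp(-(j - k)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return A
```

Increasing σ flattens the Gaussian kernel within the band, which drives the smallest singular values of A^(i) toward zero and makes the deblurring problem more ill-posed.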
We specify η = 1.05 and ρ = 1/2. The initial solution of the inner iteration is X̂_0 = X̂_{k,µ_{j−1}}, and the inner iteration is terminated when the residual norm is less than tol = 10^{−6}.
Table 2 lists the CPU time, SNR and relative error of the three algorithms for the restoration of the penguin image. We can see from Table 2 that the A-tPCG-FFT algorithm needs the least CPU time among all the methods, while the A-tCG-FFT algorithm is the most time-consuming. The SNR values and relative errors of the methods differ only slightly. Figure 3 shows the change of the relative error with CPU time for the algorithms processing the third slice of the penguin image with ν = 10^{−3} in Table 2; it displays the iterates of the inner and outer iterations of the three algorithms. Once a regularization parameter is selected, the relative error norms of the iterates produced by the inner iterations of the three algorithms decrease steadily. Similarly, after the outer iteration automatically updates the regularization parameter, the iterates corresponding to the updated parameter all have smaller relative error norms than before the update. Figure 3 also shows that the A-tPCG-FFT algorithm converges fastest among all methods.

Example 4.2. (Video) This example considers the restoration of 10 frames of the vipbarcode video. These frames are blurred by (4.2), where A_υ is a third-order tensor such that A_υ(:, :, 1) = A^(1) and A_υ(:, :, i) = O for i = 2, ..., 30, and A_ω = I. We use σ = 3 and r = 5 to build the blurring matrices with periodic boundary conditions. The condition number of the first frontal slice of A is 1.6191e+03, and the condition numbers of the remaining frontal slices of A are infinite. Noise is added to the blurred video by B = B_true + E, and the noise levels considered are ν = 10^{−2} and ν = 10^{−3}. We set the initial solution of the inner iteration to X̂_0 = X̂_{k,µ_{j−1}}, with the convergence condition that the residual norm is less than tol = 10^{−6}. For the discrepancy principle we specify η = 1.1, α_0 = 1/2 and ρ = 1/2. Table 3 records the results of recovering these 10 frames of video data, including the relative error, SNR and CPU time. Because the same criteria for selecting parameters and stopping the iterations are used, the three algorithms differ only slightly in relative error and SNR. However, the A-tPCG-FFT algorithm has a clear advantage in CPU time. Below we analyze the recovery of the 50th frame. Figure 5 shows the 50th original frame of this video file and the blurred and noisy video frame with ν = 10^{−3}. Figure 6 shows the variation of the relative error of the third slice of the 50th video frame with CPU time when the three algorithms process ν = 10^{−3} in Table 3.
As can be seen from Figure 6, A-tPCG-FFT converges fastest among the three algorithms. We can also see from Figure 6 that, as the outer iteration automatically updates the regularization parameter and the inner iteration proceeds, the relative error norms of the iterates produced by the three algorithms decrease steadily. According to the results of Example 4.2, A-tCGLS-FFT and A-tPCG-FFT require less CPU time than A-tCG-FFT for recovering each frame of video, and when processing multi-frame video data this time advantage accumulates frame by frame. That is, the more video frames there are, the more pronounced the time advantage of A-tCGLS-FFT and A-tPCG-FFT, especially A-tPCG-FFT.

Example 4.1. (Color image) This example shows the restoration of a blurred penguin color image by Algorithms 1-3. It compares the effects of A-tCG-FFT, A-tCGLS-FFT and A-tPCG-FFT in color image deblurring; the image is contaminated by cross-channel blur and additive noise. The cross-channel blurring is determined by the matrix A_blur in (4.1).

Figure 2. (a) The original image of penguin; (b) the blurred and noisy image of penguin with ν = 10^{−3}.

Figure 3. Comparison of relative errors versus CPU time for different methods.

Figure 4. The recovered penguin image for Algorithms 1-3.

Figure 6. Comparison of relative errors versus CPU time for different methods.

Figure 7. The restoration of the blurred and noisy image of the 50th frame by A-tCG-FFT, A-tCGLS-FFT, and A-tPCG-FFT.

Table 2. CPU time, SNR and relative error of different algorithms for the restoration of the penguin image.

Table 3. CPU time, SNR and relative error of different algorithms for the restoration of 10 frames of vipbarcode.