A parallel multi-block alternating direction method of multipliers for tensor completion

This paper proposes an algorithm for the tensor completion problem, that is, estimating multi-linear data from a limited number of observations. Many tensor completion methods are based on nuclear norm minimization, but they may fail to reach the global solution when the missing ratio is high. To tackle this issue, an adaptive tensor completion method based on a parallel multi-block alternating direction method of multipliers (ADMM) is proposed; it derives the model from an initial estimate and computes the next estimate from the current solution. The parallel multi-block ADMM, which enjoys global convergence, is adopted to solve the dual problem, greatly improving the processing power and reliability of the algorithm.


INTRODUCTION
Sparse recovery is widely studied in modern science and engineering applications. A tensor is a generalization of the multidimensional array: it extends vectors and matrices to higher orders and represents the inner structure of more complex higher-order data. Tensor notation has been widely used in data representation and storage. How to effectively mine the essential low-dimensional structure of a tensor and recover the original data from noise pollution or partial loss has become a basic problem in machine learning [1,2], data mining [3,4], pattern recognition [5] and computer vision [6].
In the past few years, low-rank matrix completion has developed rapidly in the field of data analysis. Intuitively, the low rank of the original image is damaged by the sparse distribution of noise. Thus, the low-rank matrix completion problem always drives out the noise of a degraded image as a group of low-dimensional data. As the rank function of a matrix is non-convex and its minimization has a strong global constraint, [7] transformed matrix completion into a convex optimization problem by nuclear norm approximation. In order to solve the rank minimization problem for large-scale matrices, [8] proposed a fixed-point iterative method and a Bregman iterative method, and Toh [9] relaxed the standard matrix completion problem into a matrix LASSO model and solved it with a nearest-neighbour gradient algorithm. In addition, [10] extended the orthogonal matching pursuit technique for vectors and constructed a scalable matrix completion model based on rank-one matrix pursuit, which greatly reduced the running time. [11] highlighted the use of a smooth hyperbolic tangent function instead of the rank function to establish a smooth non-convex optimization model; by taking advantage of the differentiability of the Gaussian function, they used a gradient projection algorithm to solve the problem and achieved better completion accuracy.

(This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.)
In general, high-dimensional data analysis is restricted by built-in complex multi-linear structure. The generality of the tensor makes it a versatile tool for expressing the essential characteristics of complex data. The goal of tensor completion is to recover the missing entries of a tensor from its partially observed entries [12,13]. Since the rank of a tensor is more complicated than that of a matrix, the CP rank [14] and the Tucker rank [15] of tensors are commonly used approximations. As the CP rank of a tensor is usually NP-hard to compute, many algorithms solve tensor completion based on the Tucker rank [10,16,17]. In [18], a Tucker decomposition algorithm with missing data is proposed to approximate a tensor. [19] introduced an efficient algorithm for low-Tucker-rank tensor restoration via convex optimization, which incorporates multiple rank components in the geometric analysis. In addition, [20] gave the definition of the tubal rank of a tensor, and the proposed multilinear algebra framework has been widely used for tensor computation in computer vision tasks [21,22].
However, [23] found that the sample complexity increases and the prediction quality is significantly reduced when the entries of the matrix are sampled at non-uniform ratios. In order to reduce the computational cost, a high-accuracy low-rank tensor completion algorithm [24] was introduced, which adds new auxiliary variables to derive an equivalent form of the optimization problem and applies the alternating direction method of multipliers (ADMM) to find the extremum of the function. A new tensor framework was also introduced in [25], which is based on the higher-order SVD of the image arrangement in the database and does not require a least-squares solution for the coefficients.
Although matricization is a common way to attack the tensor completion problem, the dependency between multiple constraints makes it necessary to define an appropriate trace norm for tensors. [26] defined a new tensor nuclear norm for the n-rank of tensors and proposed an alternating direction method of multipliers to solve the resulting nuclear norm minimization problem. [27] proved the suboptimality of the sum of ranks of the different modal unfolding matrices of a tensor and introduced the Square Deal method to solve the completion problem. In order to achieve high-precision recovery, [28] proposed an adaptive tensor completion method, which extends the modified model for constrained matrix completion to tensor completion with bounded constraints.
Since minimization of the rank function is usually NP-hard, a widely used convex relaxation is to replace the rank function with the nuclear norm. The nuclear norm technique has long been considered a successful approach to low-rank tensor completion. However, its efficiency is challenged when certain rows and columns are sampled with high probability, because in most tensor completion settings the number of observations is far smaller than the number required for recovery, and nuclear norm minimization then fails. This means that, under a general sampling scheme, the efficiency of the nuclear norm technique may be greatly reduced. Moreover, in many computer vision problems the training datasets are so large that parallel and distributed computing algorithms become necessary. Here, we propose a tensor completion method based on parallel multi-block ADMM; the main contributions of this manuscript are as follows: (1) A new adaptive low-rank tensor completion method based on parallel multi-block ADMM is proposed; the algorithm works even with a small number of samples and ensures the accuracy of the results. (2) In order to ensure the strict convexity of our optimization problem, we add a proximal term to each subproblem so that every iteration converges. (3) Experiments show that, compared with other advanced algorithms, the proposed method obtains accurate results in an effective way.

PROBLEM STATEMENT
Tensor completion aims to recover the lost entries with a low-rank model, which is extended from matrix completion. We start with a brief overview of matrix completion. In image processing, matrix completion is usually carried out as a bounded-rank optimization. Recovering a matrix M ∈ R^(m×n) with missing components is usually accomplished by solving

min_A rank(A)  s.t.  f_Ω(A) = f_Ω(M),   (1)

in which A ∈ R^(m×n) is the recovery matrix, Ω is the observed index subset of M, and f_Ω is the orthogonal projection operator that extracts the entries in Ω and fills the unobserved positions with zeros:

[f_Ω(A)]_ij = A_ij if (i, j) ∈ Ω, and 0 otherwise.   (2)

Model (1) provides a matrix completion formulation based on rank sparsity, which is proved to be NP-hard. Since the nuclear norm ‖⋅‖_* is the most efficient convex approximation of the rank function, the following convex optimization problem is solved to obtain the new matrix:

min_A ‖A‖_*  s.t.  f_Ω(A) = f_Ω(M).   (3)

This idea inspired the study of tensor completion. As tensors are higher-order extensions of matrices, tensor completion can be expressed as the following optimization problem by utilizing the Tucker rank:

min_A Σ_{i=1}^N rank(A_(i))  s.t.  f_Ω(A) = f_Ω(T),   (4)

where T is an incomplete Nth-order tensor and Ω is the index set of the entries observed from T. By using nuclear norms, the ranks in problem (4) can be relaxed, yielding

min_A Σ_{i=1}^N α_i ‖A_(i)‖_*  s.t.  f_Ω(A) = f_Ω(T),   (5)

where the α_i are weight coefficients obeying Σ_{i=1}^N α_i = 1, and A_(i) is the mode-i unfolding of the tensor A, so that each component is a matrix unfolding of the tensor over a single mode. However, since the single-mode unfoldings are unbalanced, the relaxation in (5) is not tight, which may lead to low efficiency in solving the tensor completion problem.
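The projection operator of Equation (2) can be sketched with a boolean mask. This is an illustrative representation only; the function and variable names below are our own, not the paper's implementation:

```python
import numpy as np

def f_omega(A, mask):
    """Keep the entries indexed by Omega (mask == True), zero elsewhere."""
    return np.where(mask, A, 0.0)

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 5))
mask = rng.random((4, 5)) < 0.4      # ~40% observation rate
M_obs = f_omega(M, mask)

# No information leaks outside Omega:
print(np.count_nonzero(~mask & (M_obs != 0)))   # 0
```

The completion constraint f_Ω(A) = f_Ω(M) only ties the candidate A to M on the observed set Ω; everything off Ω is left to the low-rank prior.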
New optimization algorithms have been proposed as substitutes for the above problem; a provably efficient and easily implemented one, the Square Deal model, was proposed in [27]:

min_A ‖A_[j]‖_*  s.t.  f_Ω(A) = f_Ω(T).   (6)

This model makes the unfolding more balanced through the reshaping operator

A_[j] ∈ R^((n_1⋯n_j)×(n_{j+1}⋯n_N)),   (7)

which rearranges the elements of T into matrices of different shapes; the detailed construction is described in [27]. Inspired by this method, [28] proposed an adaptive correction approach for tensor completion. In image applications, such as colour video, the tensors are generally bounded and satisfy ‖T‖_∞ ≤ c, where c is a constant. In order to obtain a more reliable low-rank solution, the tensor completion problem can be solved by a correction model, referring to the modified matrix completion model in [29], built around an initial estimate Ã of the tensor to be completed:

min_A ‖A_[j]‖_* − ⟨F(Ã_[j]), A_[j]⟩  s.t.  f_Ω(A) = f_Ω(T), ‖A‖_∞ ≤ c,   (8)
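The mode-i unfolding used in Equation (5) and a Square-Deal-style balanced reshaping can be sketched as follows. The helper names are ours, not those of [27], and the code is an illustrative sketch of the index regrouping only:

```python
import numpy as np

def unfold(T, mode):
    """Mode-i unfolding A_(i): the chosen mode becomes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def square_reshape(T, j):
    """Square-Deal-style reshaping A_[j]: group modes 0..j-1 into rows."""
    rows = int(np.prod(T.shape[:j]))
    return T.reshape(rows, -1)

T = np.arange(24).reshape(2, 3, 4)
print(unfold(T, 1).shape)          # (3, 8)
print(square_reshape(T, 2).shape)  # (6, 4)
```

For strongly unbalanced shapes (e.g. 2 × 3 × 4), the single-mode unfoldings are thin matrices, while grouping several modes into the rows yields a squarer and hence better-conditioned matrix for nuclear norm minimization.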
in which F: R^(n_1×n_2) → R^(n_1×n_2) is a spectral operator defined through the singular value decomposition A_[j] = U Diag(σ(A)) V^T, with σ(A) = (σ_1(A), σ_2(A), …, σ_{n_1}(A))^T the vector of singular values, as

F(A) = U Diag(φ(σ(A))) V^T,

where φ is a symmetric scalar function from R_+ to R_+ applied entrywise to σ(A); R_+ and R_+^n denote the non-negative real numbers and vectors, respectively. The convergence criterion of the algorithm is ‖A^(k+1) − A^k‖_F / ‖A^k‖_F ≤ ε, where ε is the accuracy threshold.
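A spectral operator of this form applies a scalar function to the singular values and recombines the factors. The sketch below illustrates the mechanism; the particular choice of φ (capping the singular values) is hypothetical and only for demonstration:

```python
import numpy as np

def spectral_operator(A, phi):
    """F(A) = U diag(phi(sigma(A))) V^T, phi applied entrywise to sigma(A)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(phi(s)) @ Vt

A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
phi = lambda s: np.minimum(s, 2.0)   # illustrative phi: cap singular values at 2
print(spectral_operator(A, phi))     # [[2. 0.] [0. 1.]]
```

Because only the singular values are modified, F preserves the singular subspaces of A, which is what allows the correction term in (8) to bias the solution toward the spectral structure of the initial estimate Ã.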

TENSOR COMPLETION BASED ON PARALLEL MULTI-BLOCK ADMM
Here, we introduce the ADMM algorithm and discuss its parallelization. Notably, the proposed method can realize distributed computing and establishes a distributed solution of the adaptive low-rank tensor completion model.

Alternating direction method of multipliers
ADMM is a common method for solving separable convex optimization problems. It is based on the augmented Lagrangian algorithm: through a decomposition-coordination process, a large global problem is decomposed into several smaller local subproblems, and the solution of the global problem is obtained by coordinating the solutions of the subproblems.
In order to combine the decomposability of dual ascent with the convergence of the method of multipliers, new variables are introduced and the two directions are optimized alternately. The specific form is as follows:

min_{x,z} f(x) + g(z)  s.t.  Ax + Bz = c.

The idea is to split the original variable and the objective function. Unlike the dual ascent method, which treats the split parts as pieces of one variable x and reassembles them after the stepwise optimization, ADMM treats the split variables as distinct variables x and z and splits the constraint accordingly. This provides a loose coupling for recombining x and z at a later stage and preserves the decomposability of the optimization process. Thus, the iterations of the ADMM algorithm can be expressed directly as follows:

x^(k+1) = argmin_x L_ρ(x, z^k, λ^k),
z^(k+1) = argmin_z L_ρ(x^(k+1), z, λ^k),
λ^(k+1) = λ^k + ρ(Ax^(k+1) + Bz^(k+1) − c),

in which L_ρ is the augmented Lagrangian and ρ is a step factor. ADMM is typically used to solve problems that contain two blocks of variables (N = 2). By extending its algorithmic framework and adding suitable proximal terms to the subproblems, ADMM can be expanded from two blocks to N blocks while keeping its convergence, so that the subproblems can be solved in a flexible and effective way.
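As an illustration of the two-block iteration, here is a minimal ADMM sketch for a toy problem with a known closed-form answer. The problem, step sizes and function names are our own choices for illustration, not the paper's model:

```python
import numpy as np

# Two-block ADMM for min_x (1/2)||x - a||^2 + lam*||x||_1, split as
# f(x) = (1/2)||x - a||^2 and g(z) = lam*||z||_1 with constraint x - z = 0.
# The exact solution is soft-thresholding of a, so the result is easy to check.

def soft(v, t):
    """Soft-thresholding: the proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_l1(a, lam, rho=1.0, iters=500):
    x = np.zeros_like(a)
    z = np.zeros_like(a)
    u = np.zeros_like(a)                        # scaled dual variable u = lambda/rho
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)   # x-update: quadratic prox
        z = soft(x + u, lam / rho)              # z-update: soft-thresholding
        u = u + x - z                           # dual (multiplier) update
    return z

a = np.array([3.0, -0.5, 1.2])
print(admm_l1(a, lam=1.0))   # ≈ soft(a, 1) = [2.0, 0.0, 0.2]
```

Each update is a cheap proximal step on one block only; this separability is exactly what the multi-block extension exploits.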

Parallel multi-block ADMM
A typical ADMM algorithm consists of two variable blocks, that is, the case N = 2 of the following optimization problem:

min_{x_1,…,x_N} Σ_{i=1}^N f_i(x_i)  s.t.  Σ_{i=1}^N A_i x_i = c.

To extend to the case N ≥ 3, we use the Jacobi scheme and update all blocks in parallel:

x_i^(k+1) = argmin_{x_i} L_ρ(x_1^k, …, x_{i−1}^k, x_i, x_{i+1}^k, …, x_N^k, λ^k), i = 1, …, N,
λ^(k+1) = λ^k + ρ(Σ_{i=1}^N A_i x_i^(k+1) − c).

The Jacobi-type parallel implementation has a downside: it diverges more easily, and may diverge even in the two-block case. In order to ensure its convergence, additional assumptions or modifications must be made to the algorithm.
For the sake of generality, an additional correction factor is added to the iterative process, and the resulting proximal Jacobian ADMM (PJ-ADMM) was proposed in [30]:

x_i^(k+1) = argmin_{x_i} f_i(x_i) + (ρ/2)‖A_i x_i + Σ_{j≠i} A_j x_j^k − c + λ^k/ρ‖² + (1/2)‖x_i − x_i^k‖²_{P_i},   (17)
λ^(k+1) = λ^k + γρ(Σ_{i=1}^N A_i x_i^(k+1) − c),   (18)

in which P_i ⪰ 0 is a proximal matrix, ‖x‖²_{P_i} = x^T P_i x, and γ > 0 is a damping parameter. In order to realize parallelization and overcome the disadvantage of updating blocks one by one in the classical algorithm, PJ-ADMM is a natural extension of the Jacobi-type scheme: it adds a proximal term to each subproblem to ensure the stability of the algorithm. The solving mechanism of PJ-ADMM is that, in any round of iteration, the subproblems are solved in parallel and the shared variables are exchanged once the computation is finished. Then the subproblems update the multiplier variable according to Equation (18) and move to the next round of calculation.
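The update structure above can be sketched on a toy problem whose solution is known in closed form: min Σ_i (1/2)(x_i − a_i)² subject to Σ_i x_i = c, whose minimizer is x_i = a_i + (c − Σ_j a_j)/N. All blocks are updated in parallel from the previous iterate; the proximal term (with P_i = τI) and the damping γ keep the Jacobi scheme from diverging. Parameter values are illustrative, not tuned:

```python
import numpy as np

def pj_admm(a, c=0.0, rho=1.0, tau=3.0, gamma=0.5, iters=500):
    """Proximal Jacobian ADMM sketch for min sum (1/2)(x_i - a_i)^2, sum x_i = c."""
    N = len(a)
    x = np.zeros(N)
    lam = 0.0
    for _ in range(iters):
        s = x.sum()
        # Jacobi update: every block uses only the previous iterate x^k.
        # Closed-form minimizer of the proximal subproblem with A_i = I, P_i = tau*I.
        x = (a + tau * x + rho * (c - lam / rho - (s - x))) / (1.0 + rho + tau)
        lam += gamma * rho * (x.sum() - c)    # damped multiplier update (18)
    return x

a = np.array([1.0, 2.0, 3.0])
print(pj_admm(a))   # ≈ [-1, 0, 1]
```

With tau = 0 and gamma = 1 this reduces to the plain Jacobi scheme, which is exactly the variant that can diverge; the proximal and damping terms are what buy back convergence.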
This algorithm has been proved to enjoy global convergence at an o(1/k) rate. The added proximal term makes each subproblem strictly or strongly convex, which improves robustness. The PJ-ADMM model can realize distributed parallel optimization over multiple subproblems, and its computational efficiency increases greatly when multiple machines are available.

Tensor completion based on parallel multi-block ADMM
We introduce parallel multi-block ADMM to address problem (8); the procedure is described in detail in this section.
In order to solve problem (8), we need its dual problem. Problem (8) can be converted to

min_{A,Y} ‖Y‖_* − ⟨F(Ã_[j]), A_[j]⟩ + δ_B(A)  s.t.  A_[j] = Y, f_Ω(A) = f_Ω(T),

in which Y is a slack variable, B := {A | ‖A‖_∞ ≤ c}, Ã is an initial estimator, and δ_B(⋅) is the indicator function of B:

δ_B(A) = 0 if A ∈ B, and +∞ otherwise.

The Lagrangian dual problem can then be formed; according to the derivation in [31] and by strong duality, it is equivalent to a problem whose objective contains the constant C̃ = ⟨F(Ã_[j]), Ã_[j]⟩. Since A and Y are not directly coupled in the objective, each of them can be minimized separately. Using the conjugate function δ*_B(⋅) of the indicator, for which δ*_B(−Z) = c‖Z‖_1 follows from the definition of B, problem (8) can be reformulated; by introducing a new variable G ∈ R^(n_1×⋯×n_N) with R(A_[j]) := A, the above optimization problem becomes a linearly constrained separable problem. Its augmented Lagrangian is

L_ρ(A, Y; Λ) = ‖Y‖_* − ⟨F(Ã_[j]), A_[j]⟩ + δ_B(A) + ⟨Λ, A_[j] − Y⟩ + (ρ/2)‖A_[j] − Y‖²_F,

in which ρ is the penalty parameter. We deal with this dual problem by parallel multi-block ADMM in PJ-ADMM mode: in any round of iteration, the subproblems in A and Y are solved in parallel; when the calculation is completed, the shared variables are passed to the other subproblem, completing one round of the solution. Then the subproblems update the multiplier variable and move to the next round of calculation. Modifying the above updates according to the proximal Jacobi scheme yields the approximate Jacobi ADMM iteration formulas, and [30] proves that, with the common choices of P_i and γ mentioned in this paper, the PJ-ADMM algorithm guarantees global convergence at a rate of o(1/k).

(Algorithm 1: Tensor completion based on parallel multi-block ADMM — repeat the parallel updates until the stopping criterion is met.)

According to the distributed iteration mode of PJ-ADMM, the parallel algorithm proceeds as in Algorithm 1. The proximal term is added to ensure the strong convexity of the subproblems, the matrices P_i make the subproblems easier to solve, and the algorithm guarantees strict global convergence.
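To make the overall loop concrete, the following is a much-simplified serial sketch of a completion iteration in the spirit of Algorithm 1: singular value thresholding (SVT) on a balanced unfolding promotes low rank, clipping enforces the bound ‖A‖_∞ ≤ c, and the observed entries are re-imposed each sweep. The correction term of model (8), the dual formulation and the parallel PJ-ADMM updates are all omitted here; function names and the threshold are our own illustrative choices:

```python
import numpy as np

def svt(M, t):
    """Singular value thresholding: the prox of t*||.||_* for matrices."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

def complete(T_obs, mask, shape, j=2, c=1.0, t=0.5, iters=100):
    """Alternate a low-rank step, the box constraint and data consistency."""
    A = T_obs.copy()
    rows = int(np.prod(shape[:j]))                       # balanced reshaping A_[j]
    for _ in range(iters):
        A = svt(A.reshape(rows, -1), t).reshape(shape)   # low-rank step on A_[j]
        A = np.clip(A, -c, c)                            # enforce ||A||_inf <= c
        A[mask] = T_obs[mask]                            # keep observed entries
    return A

rng = np.random.default_rng(1)
u = rng.random((4, 1))
T = (u @ u.T).reshape(4, 4, 1).repeat(3, axis=2) / 2     # bounded low-rank tensor
mask = rng.random(T.shape) < 0.7                         # ~70% observation rate
A_hat = complete(T * mask, mask, T.shape)
print(float(np.linalg.norm(A_hat - T) / np.linalg.norm(T)))  # relative error
```

In the full algorithm, the SVT step (the Y-subproblem) and the projection step (the A-subproblem) are the two blocks solved in parallel under the PJ-ADMM scheme, with a proximal term added to each.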

EXPERIMENTS
Here, our method is tested on standard benchmark images and compared with advanced tensor recovery methods. For convenience of expression, the proposed method is called PATC in the following text. The purpose of the experiments is to verify the effectiveness of restoring the original image from an incomplete observation image, and the results of the algorithms are evaluated by the relative standard error (RSE). The platform is Matlab R2014a under Windows 7 on a PC with a 3.4 GHz CPU and 8 GB of memory. The RSE is defined as

RSE = ‖A_rec − T‖_F / ‖T‖_F,

where A_rec is the recovered tensor and T is the ground truth. It can easily be converted to a signal-to-noise ratio by SNR = −20 log_10 RSE.
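The two metrics can be sketched directly, assuming the standard Frobenius-norm form of the RSE given above (variable names are ours):

```python
import numpy as np

def rse(A_rec, T):
    """Relative standard error: ||A_rec - T||_F / ||T||_F."""
    return np.linalg.norm(A_rec - T) / np.linalg.norm(T)

def snr_db(A_rec, T):
    """Signal-to-noise ratio in dB: -20*log10(RSE)."""
    return -20.0 * np.log10(rse(A_rec, T))

T = np.ones((2, 2, 3))
A_rec = T + 0.01           # uniform 1% perturbation
print(round(rse(A_rec, T), 4))    # 0.01
print(round(snr_db(A_rec, T)))    # 40
```

Lower RSE and higher SNR both indicate better recovery; a tenfold reduction in RSE corresponds to a 20 dB gain in SNR.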
To ensure the accuracy of the statistical results, we repeat the experiment five times for each case. For Algorithm 1, the stopping criterion is set as ‖A^(k+1) − A^k‖_F / ‖A^k‖_F ≤ ε with ε = 10^(−4). The parameters P_i and γ are crucial to the convergence of the parallel ADMM algorithm; we set P_i = τ_i I and γ = 1, respectively. As the setting of the correction step depends on the sampling rate, the results show that further correction is needed at lower sampling rates. In our algorithm, we set the maximum number of correction steps to 5; in each step of the correction, the relative error is greatly improved.
We validate the effectiveness of the proposed model via experiments on benchmark images, which are often used to evaluate and compare the performance of different methods. These colour images can be represented by third-order tensors of size length × width × channels. We designed four scenarios and compare the results with the following advanced methods: FBCP [32], FBCP-MP [32], FaLRTC [26], HaLRTC [26] and TFTC [33].

Synthetic dataset
Here, we test on the popular benchmark images House and Butterfly, which are reliable for evaluating image algorithms. The original image is processed into observation images with pixels missing uniformly at random at sampling rates of various sizes, and we evaluate the performance using RSE values. In Figures 1-4, we show the original image, the observation images with missing pixels and some examples of restored images of House. The benchmark image is shown in the first column and can be constructed into a tensor of size 256 × 256 × 3. We generated observation images with sampling rates of 5%, 10%, 15%, 20%, 25% and 30%, respectively. Experiments are then conducted on each image to evaluate the robustness of these methods. Figure 5 shows the average results of our model and the comparison algorithms. We show the original and observed Butterfly in Figure 6; the relative errors recovered by the different methods for the Butterfly image are shown in Table 1.
We can see from the test experiments that the accuracy of the proposed method is obviously improved compared with the other methods. In terms of tensor completion, our model performs well even at low observation ratios, which confirms that the correction term can reduce the recovery error.

Natural dataset
Here, we experiment on a natural image dataset to obtain the results of the tensor completion methods. Some typical recovery results are shown in Figures 7-10. For the image Baboon, the sampling rates of the observed images are set to 5%, 10%, 15%, 20%, 25% and 30%, respectively, with pixels missing uniformly at random. We tested all algorithms; the averaged results are shown in Figure 11. These experimental results show that the overall performance of PATC is better than the other algorithms in terms of correction. We also tested the image River Otter, on which the algorithm again achieves excellent results, as shown in Figures 12-14; the relative errors recovered by the different methods are given in Table 2. Compared with the other algorithms, our restored images preserve relatively clear texture and details. This suggests that our method ranks among the best in robustness and performs well in precision for natural images at low observation rates.

Non-random missing pixels image
Our comparison consists of two parts, shown in Figure 15: (A) We use the image Barbara (256 × 256 × 3) superimposed with text pixels as the observation image. In practice, it is difficult to identify the exact locations of the text pixels, so here we simply mark pixels with greyscale values greater than 254 as missing, ensuring that the blocks covered by the text pixels are completely missing. (B) In this part, the image Barbara superimposed with Chinese character pixels is regarded as the observation image, and pixels with grey values greater than 254 are likewise marked as missing. All algorithms are tested and the results are shown in Figures 16-21.
The experiment shows that the text shadows can be smoothed well, but the global restoration of the Chinese character overlay image is not as good, which leads to unsatisfactory precision in that case.

Target removal image
Here, image restoration must be completed when the pixel area of an object is entirely lost. We introduce the image Sailboat to test the accuracy of the algorithms. Figure 22 shows the benchmark image; there is a white boat at the centre of the image, and in the subsequent experiments the pixel area where the boat is located is completely removed. In this case, the results in Figures 23 and 24 show that PATC deletes the target object more cleanly, while the other methods still leave a large shaded area.

CONCLUSION
In this paper, PATC, a tensor completion method based on a parallel multi-block ADMM algorithm, is proposed. We extend the modified tensor completion model to tensor completion with bounded constraints and design a parallel ADMM algorithm to solve it. Experimental results show that PATC obtains better results than other methods. Our modified model performs well in tensor completion, especially in the case of low sample ratios. We have confirmed that the correction term can reduce the recovery error, and we will continue to study and improve the reliability of the algorithm.