LOW-RANK AND SPARSE MATRIX RECOVERY FROM NOISY OBSERVATIONS VIA 3-BLOCK ADMM ALGORITHM

Recovering a low-rank and a sparse matrix from a given matrix arises in many applications, such as image processing and video background subtraction. The 3-block alternating direction method of multipliers (ADMM) has been applied successfully to solve convex problems with three blocks of variables. However, the existing sufficient conditions that guarantee the convergence of the 3-block ADMM usually require the penalty parameter γ to satisfy a certain bound, which may affect the performance on large-scale problems in practice. In this paper, we propose a 3-block ADMM to recover the low-rank and sparse matrices from noisy observations. In theory, we prove that the 3-block ADMM converges when the penalty parameters satisfy a certain condition, and that the objective function value sequence generated by the 3-block ADMM converges to the optimal value. Numerical experiments verify that the proposed method achieves higher performance than existing methods in terms of both efficiency and accuracy.


Introduction
The low-rank and sparse matrix recovery problem arises widely in statistical model selection, system identification, and machine learning, for example in image recovery [11], face recognition [20,21] and background modeling [3,12]. A central goal of the problem is to decompose a given high-dimensional matrix C into the sum of a low-rank matrix A and a sparse matrix B. This can be mathematically formulated as

min_{A,B} rank(A) + λ∥B∥₀   s.t.   A + B = C,   (1.1)

where ∥·∥₀ is the ℓ₀ norm counting the number of nonzero entries, λ > 0 is a hyperparameter, and C ∈ R^{m×n} is the given matrix.
However, problem (1.1) is generally NP-hard due to the non-convexity and non-smoothness of rank(·) and ∥·∥₀. To overcome this difficulty, based on the fact that the ℓ₁ norm and the nuclear norm have been shown to be effective surrogates for ∥·∥₀ and rank(·), respectively [8], problem (1.1) is often relaxed to the convex formulation [4,5,18]:

min_{A,B} ∥A∥⋆ + λ∥B∥₁   s.t.   A + B = C,   (1.2)

where ∥A∥⋆ = Σᵢ σᵢ(A) denotes the nuclear norm of the matrix A, σᵢ(A) is the ith singular value of A, and ∥B∥₁ = Σ_{ij} |B_{ij}| denotes the ℓ₁ norm of B viewed as a vector in R^{mn}. It has been shown that this convex model can exactly recover a low-rank matrix A and a sparse matrix B under some conditions [3].
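As a quick illustration of the two surrogates (our own example, not from the paper), the following NumPy snippet evaluates the nuclear norm as the sum of singular values and the ℓ₁ norm as the sum of absolute entries for a small diagonal matrix:

```python
import numpy as np

# Toy matrix whose singular values can be read off the diagonal.
Y = np.diag([3.0, 1.0, 0.0])

nuclear_norm = np.linalg.svd(Y, compute_uv=False).sum()  # sum of singular values
l1_norm = np.abs(Y).sum()                                # sum of absolute entries
```

For this diagonal example both norms equal 4, but in general they penalize very different structures: the nuclear norm promotes low rank, the ℓ₁ norm promotes entrywise sparsity.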
When the observation matrix C is corrupted by Gaussian noise, problem (1.2) is formulated as the following nonsmooth convex optimization problem [13,23]:

min_{A,B} ∥A∥⋆ + λ∥B∥₁   s.t.   ∥C − A − B∥_F ≤ δ,   (1.3)

where δ > 0 is a constant describing the noise intensity. For the convenience of optimization, problem (1.3) is usually transformed into its penalized form

min_{A,B} ∥A∥⋆ + λ∥B∥₁ + (ρ/2)∥C − A − B∥²_F,   (1.4)

where ρ > 0 is a penalty parameter.
In practice, a more common formulation for the corrupted observations is

C = A + B + D,

where D denotes the random noise and ∥D∥_F ≤ δ for some δ > 0. Therefore, the original low-rank and sparse matrix decomposition can be formulated as

min_{A,B,D} ∥A∥⋆ + λ∥B∥₁ + (ρ/2)∥D∥²_F   s.t.   A + B + D = C.   (1.5)

In recent years, the two problems (1.2) and (1.3) have been proposed to extract low-dimensional structure from a possibly noisy data matrix [3,23]. The work [13] has shown that the accelerated proximal gradient (APG) method has impressive performance for solving (1.4) as well as (1.2). A twisted version of the proximal alternating direction method of multipliers (TADMM) [19] was presented for recovering low-rank and sparse matrices from noisy observations, and its convergence and effectiveness were also proved in that work. The alternating direction method of multipliers (ADMM) [1,5,15,18,22] is another widely used approach to solve such problems efficiently.
Recovering the low-rank and sparse matrices from noisy observations via (1.5) is a challenging task. In this work, we employ the 3-block ADMM algorithm to solve (1.5). More importantly, we prove that the iterative sequence generated by the proposed algorithm converges to an optimal solution of (1.5) under some mild conditions. Numerical experiments are executed to demonstrate the superiority of the proposed method over some existing methods.
The rest of the paper is organized as follows. In Section 2, we introduce some notations and preliminaries which are necessary for our main results. Section 3 describes the details of recovering the low-rank and sparse matrices from noisy observations via the 3-block ADMM. Section 4 analyzes the convergence properties of the 3-block ADMM for (1.5) and shows that the objective function value sequence generated by the algorithm converges to the optimal value. In Section 5, we compare the 3-block ADMM with some existing ADMM-based algorithms to demonstrate its performance. Finally, Section 6 concludes the paper.

Preliminaries and notations
In this section, we briefly introduce some notations and results on the shrinkage operator, which are key to the forthcoming sections.
For matrices X and Y of the same size, the inner product in R^{m×n} is defined as ⟨X, Y⟩ = tr(XᵀY). The norm associated with this inner product is the Frobenius norm ∥·∥_F. For a symmetric positive semidefinite matrix P ⪰ 0, let ∥x∥²_P = xᵀPx, where x ∈ R^n. When P is the n × n identity matrix, ∥·∥_P reduces to the ℓ₂-norm ∥·∥₂.

Lemma 2.1 ([18]). For µ > 0 and Y ∈ R^{m×n}, the solution of the problem

min_X  µ∥X∥₁ + ½∥X − Y∥²_F

is given by the shrinkage operator

S_µ(Y) = sign(Y) ⊙ max{abs(Y) − µ, 0},

where abs(·) and sign(·) are the entrywise absolute value and sign functions, respectively, and ⊙ denotes the entrywise product.
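The shrinkage operator of Lemma 2.1 takes only a few lines of NumPy; this is our own sketch, with `shrink` as our name for it:

```python
import numpy as np

def shrink(Y, mu):
    """Entrywise soft-thresholding: sign(Y) * max(|Y| - mu, 0)."""
    return np.sign(Y) * np.maximum(np.abs(Y) - mu, 0.0)
```

For example, shrinking the row [3, −0.5, −2] with µ = 1 gives [2, 0, −1]: entries smaller than µ in magnitude are zeroed, the rest are pulled toward zero by µ.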
Lemma 2.2. Let Y ∈ R^{m×n} have the singular value decomposition Y = UΣVᵀ and let τ > 0 be a scalar. Then

U{Σ − τI}_+Vᵀ

is an optimal solution of the following problem

min_X  τ∥X∥⋆ + ½∥X − Y∥²_F,

where the plus function {·}_+ is defined by {a}_+ = max{a, 0} for a ∈ R.
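Likewise, the singular value thresholding operator of Lemma 2.2 can be sketched in NumPy (our illustration; `svt` is our name for it):

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: U {Sigma - tau I}_+ V^T."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

Applied to diag(3, 1) with τ = 2, the result is diag(1, 0): the small singular value is removed entirely, which is how the operator lowers rank.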

The 3-block ADMM algorithm
The classical alternating direction method of multipliers (ADMM), also called the 2-block ADMM with two variables, has been studied extensively in the literature (see [6,7,10]). The 3-block ADMM extends the classical ADMM to convex problems of the form

min f₁(x₁) + f₂(x₂) + f₃(x₃)   s.t.   A₁x₁ + A₂x₂ + A₃x₃ = b,  xᵢ ∈ χᵢ,   (3.1)

where f₁, f₂, f₃ are closed proper convex functions, and χᵢ are closed convex sets for i = 1, 2, 3. For given (x₂^k, x₃^k; λ^k), the 3-block ADMM for (3.1) can be summarized as

x₁^{k+1} = argmin_{x₁∈χ₁} L_β(x₁, x₂^k, x₃^k; λ^k),
x₂^{k+1} = argmin_{x₂∈χ₂} L_β(x₁^{k+1}, x₂, x₃^k; λ^k),
x₃^{k+1} = argmin_{x₃∈χ₃} L_β(x₁^{k+1}, x₂^{k+1}, x₃; λ^k),
λ^{k+1} = λ^k − β(A₁x₁^{k+1} + A₂x₂^{k+1} + A₃x₃^{k+1} − b),   (3.2)

where L_β(x₁, x₂, x₃; λ) = Σᵢ fᵢ(xᵢ) − ⟨λ, Σᵢ Aᵢxᵢ − b⟩ + (β/2)∥Σᵢ Aᵢxᵢ − b∥² denotes the augmented Lagrangian function of (3.1), λ is the Lagrange multiplier and β > 0 represents a penalty parameter.
Though the convergence properties of the 2-block ADMM have been well established [10,16,17], the convergence of the 3-block ADMM (3.2) remained unclear for a long time. In recent work [6], a counterexample was given showing that, without further conditions, the 3-block ADMM may fail to converge. If all the functions f₁, f₂ and f₃ are strongly convex, Han et al. [9] proved the global convergence of the 3-block ADMM (3.2). When the second block of the objective function is strongly convex and constrained by one coupled linear equation, the work [14] presented a semi-proximal alternating direction method of multipliers (sP-ADMM) for problem (3.1). Specifically, the iterative scheme reads

x₁^{k+1} = argmin_{x₁∈χ₁} L_β(x₁, x₂^k, x₃^k; λ^k) + ½∥x₁ − x₁^k∥²_{P₁},
x₂^{k+1} = argmin_{x₂∈χ₂} L_β(x₁^{k+1}, x₂, x₃^k; λ^k) + ½∥x₂ − x₂^k∥²_{P₂},
x₃^{k+1} = argmin_{x₃∈χ₃} L_β(x₁^{k+1}, x₂^{k+1}, x₃; λ^k) + ½∥x₃ − x₃^k∥²_{P₃},
λ^{k+1} = λ^k − γβ(A₁x₁^{k+1} + A₂x₂^{k+1} + A₃x₃^{k+1} − b),

where P₁, P₂, and P₃ are positive semidefinite matrices, and γ > 0 is a step size for the dual update. Moreover, for γ ∈ (0, (1+√5)/2) and β ∈ (0, +∞), the global convergence of sP-ADMM was proved [14]. However, the above results usually require some restrictive conditions on Pᵢ, γ and β, which may affect the performance on large-scale problems in practice. The 3-block ADMM for (3.1) was studied in [15] under the conditions Pᵢ = 0, β > 0 and γ = 1; there, the algorithm was shown to be convergent and effective for any penalty parameter β > 0 without additional assumptions.

This paper mainly considers the 3-block ADMM for recovering low-rank and sparse matrices from a given corrupted observation matrix, as expressed in (1.5). To solve (1.5) by the 3-block ADMM, we first consider its augmented Lagrangian function

L_β(A, B, D, Λ) = ∥A∥⋆ + λ∥B∥₁ + (ρ/2)∥D∥²_F − ⟨Λ, A + B + D − C⟩ + (β/2)∥A + B + D − C∥²_F,

where Λ ∈ R^{m×n} is a Lagrange multiplier matrix, and β > 0 is the penalty parameter. The 3-block ADMM algorithm for recovering the low-rank and sparse matrices is then formulated as

A^{k+1} = argmin_A L_β(A, B^k, D^k, Λ^k),
B^{k+1} = argmin_B L_β(A^{k+1}, B, D^k, Λ^k),
D^{k+1} = argmin_D L_β(A^{k+1}, B^{k+1}, D, Λ^k),
Λ^{k+1} = Λ^k − β(A^{k+1} + B^{k+1} + D^{k+1} − C).   (3.4)

This can be equivalently transformed into the first-order optimality conditions

0 ∈ ∂∥A^{k+1}∥⋆ − Λ^k + β(A^{k+1} + B^k + D^k − C),
0 ∈ ∂(λ∥B^{k+1}∥₁) − Λ^k + β(A^{k+1} + B^{k+1} + D^k − C),
0 = ρD^{k+1} − Λ^k + β(A^{k+1} + B^{k+1} + D^{k+1} − C),   (3.5)

where ∂(·) denotes the subdifferential of a convex function. Combining the last equation of (3.5) with the multiplier update in (3.4) implies that Λ^{k+1} = ρD^{k+1} holds for any k ≥ 0.
By Lemmas 2.1 and 2.2, the subproblems in A and B admit the closed-form solutions

A^{k+1} = U^k{Σ^k − τI}_+(V^k)ᵀ, where U^kΣ^k(V^k)ᵀ is the SVD of C − B^k − D^k + (1/β)Λ^k,   (3.6)
B^{k+1} = sign(N^k) ⊙ {abs(N^k) − λτ}_+, where N^k = C − A^{k+1} − D^k + (1/β)Λ^k,   (3.7)

where τ = 1/β; the D-subproblem yields D^{k+1} = (β/(ρ + β))(C − A^{k+1} − B^{k+1} + (1/β)Λ^k), and the multiplier update

Λ^{k+1} = Λ^k − β(A^{k+1} + B^{k+1} + D^{k+1} − C)   (3.8)

completes one iteration. Consequently, for problem (1.5), there exists a saddle point (A⋆, B⋆, D⋆, Λ⋆) of the augmented Lagrangian. Based on the above analysis, we propose the algorithm for solving (1.5), which is described in Algorithm 1.
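Under our reading of the updates above, one iteration of the method can be sketched in NumPy as follows. This is a minimal illustration, not the paper's Algorithm 1 verbatim: the function names (`shrink`, `svt`, `admm3`) and the relative-residual stopping test are our own choices.

```python
import numpy as np

def shrink(Y, mu):
    """Entrywise soft-thresholding (Lemma 2.1)."""
    return np.sign(Y) * np.maximum(np.abs(Y) - mu, 0.0)

def svt(Y, tau):
    """Singular value thresholding (Lemma 2.2)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def admm3(C, lam, rho, beta, max_iter=500, eps=1e-6):
    """3-block ADMM sketch for min ||A||_* + lam ||B||_1 + (rho/2)||D||_F^2
    subject to A + B + D = C."""
    A = np.zeros_like(C); B = np.zeros_like(C)
    D = np.zeros_like(C); Lam = np.zeros_like(C)
    for _ in range(max_iter):
        A = svt(C - B - D + Lam / beta, 1.0 / beta)          # A-step via SVT
        B = shrink(C - A - D + Lam / beta, lam / beta)       # B-step via shrinkage
        D = beta / (rho + beta) * (C - A - B + Lam / beta)   # closed-form D-step
        Lam = Lam - beta * (A + B + D - C)                   # dual update
        if np.linalg.norm(A + B + D - C) <= eps * max(1.0, np.linalg.norm(C)):
            break
    return A, B, D
```

Note that after the dual update, Λ = ρD holds exactly, matching the identity derived from the optimality condition of the D-subproblem.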

Convergence of the proposed 3-block ADMM
In this section, we will show the convergence of Algorithm 1 for the problem (1.5).
The following identity will be used frequently:

2⟨a − b, a − c⟩ = ∥a − b∥² + ∥a − c∥² − ∥b − c∥².   (4.1)

As a main result of this work, we first present the following theorem.
where G₁ ∈ ∂∥A^{k+1}∥⋆; the first inequality is based on the convexity of the nuclear norm and identity (4.1), and the second inequality is obtained by setting Â = A^k in (3.6). Similarly, where S₁ ∈ ∂(λ∥B^{k+1}∥₁); the first inequality follows from convexity and identity (4.1), and the second inequality is obtained by setting B̂ = B^k in (3.7). By (3.5) and (4.1), we obtain
Based on Theorem 4.1, the convergence of the sequence {(A^k, B^k, D^k, Λ^k)} generated by Algorithm 1 is established in the following theorem. Proof. First, we show that the augmented Lagrangian function has a lower bound.
where the second equality holds from the fact that Λ^{k+1} = ρD^{k+1}, and the inequality is obtained from β > ρ. By Theorem 4.1 and (4.8), we conclude the convergence of the sequence {L_β(A^k, B^k, D^k, Λ^k)}. Combining (4.7) and (4.8), for any k > 0, it follows that where (β + ρ)/2 − ρ²/β > 0. Letting k → +∞ in (4.9), we have The above inequality implies that (4.10) From (4.8), we obtain Since the sequence {L_β(A^k, B^k, D^k, Λ^k)} is convergent and bounded, the sequence {(A^{k+1}, B^{k+1})} is also bounded. By (3.8) and Λ^{k+1} = ρD^{k+1}, we can conclude that By (4.10), we have Similarly, Consequently, we conclude that From (3.5), (4.10), (4.11), and (4.12), we obtain and Taking the limit of (4.15) as j → ∞, we conclude that the limit point (Ā, B̄, D̄) is an optimal solution of (1.5). Denoting the optimal value of the objective function of problem (1.5) by f⋆, we have f⋆ = lim_{k→∞} (∥A^k∥⋆ + λ∥B^k∥₁ + (ρ/2)∥D^k∥²_F). Therefore, the sequence of objective function values converges to the optimal value.

Proof. From (3.5) and Theorem 4.2, it follows that
as k → ∞. By (4.13) and (4.18), the desired result follows. Thus the proof is concluded.

Experiments
In this section, we present several numerical experiments to demonstrate the effectiveness of Algorithm 1. All algorithms are implemented in MATLAB R2015a and tested on a PC with 4 GB RAM and an Intel Core i5-3550 CPU.

Background subtraction from video
Background subtraction, which aims to detect moving objects against the background of a video stream, is an important research field in computer vision. Background subtraction is usually modeled as a matrix decomposition problem: the background of each frame is relatively invariant, while the foreground moving objects are sparse. Therefore, the background subtraction problem can be modeled with the low-rank and sparse structure of problem (1.5), and the 3-block ADMM can be employed to solve (1.5) and thereby separate the background from the foreground objects.
In this experiment, a surveillance video consisting of a static background and a number of people moving in the foreground is used to test the algorithm. The video, captured at an airport, is composed of 200 frames with resolution 144 × 176. Each frame is stacked into a vector, and the video is converted into a data matrix C of size 25344 × 200.
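The stacking step can be sketched as follows; this is a NumPy illustration with random placeholder frames standing in for the real airport video:

```python
import numpy as np

# Placeholder video: 200 grayscale frames of size 144 x 176
# (random stand-ins for the real surveillance frames).
frames = np.random.rand(200, 144, 176)

# Vectorize each frame into a column; the video becomes a 25344 x 200 matrix.
C = frames.reshape(200, -1).T
```

Each column of the resulting matrix is one frame, so a low-rank component captures the (nearly constant) background across columns, while a sparse component captures the moving foreground.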
To execute the experiment, the settings are as follows. The parameters λ and ρ are selected by trial and error and set to λ = 8 and ρ = 100. C (Sample) is the original frame or observation, and D (Noise) = C − A − B, where A (Low-rank) and B (Sparse) are, respectively, the background and the foreground to be separated. We take four frames of the video as examples and show the decomposition results in Figure 1, which illustrates that the proposed method successfully separates the background and the moving objects in all four scenes.

Matrix recovery
In this section, we provide numerical tests for the matrix recovery task and compare the 3-block ADMM (3b-ADMM) with some other algorithms, including the 3-block semi-proximal ADMM (3b-sPADMM) [14] and the twisted version of the proximal ADMM (TADMM) [19], for solving problem (1.5). In this experiment, we generate the data in the same way as [22]. Let A⋆ and B⋆ be, respectively, the ground-truth low-rank and sparse matrices to be modeled and recovered by (1.5), and let r be the rank of A⋆. More specifically, the low-rank matrix A⋆ is generated by A⋆ = UV, where U = rand(m, r) and V = rand(r, n). B⋆ is generated as B⋆ = zeros(m, n), p = randperm(m × n), L = round(spr × m × n) and B⋆(p(1 : L)) = randn(L, 1), where spr is the sparsity ratio and rand, randn, zeros, randperm and round are the corresponding MATLAB functions. Let the noise be D = 0.001 · rand(m, n) and C = A⋆ + B⋆ + D; then we obtain the noisy observation C.
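A NumPy transcription of this MATLAB generation recipe might look as follows; the function name `generate_data` and the seeding mechanism are our own additions:

```python
import numpy as np

def generate_data(m, n, r, spr, seed=None):
    """Synthetic low-rank + sparse data, mirroring the MATLAB recipe in the text."""
    rng = np.random.default_rng(seed)
    A_star = rng.random((m, r)) @ rng.random((r, n))  # low-rank: U V, entries ~ U(0,1)
    B_star = np.zeros((m, n))
    L = round(spr * m * n)                            # number of nonzero entries
    p = rng.permutation(m * n)[:L]                    # random support (randperm analogue)
    B_star.flat[p] = rng.standard_normal(L)           # Gaussian values on the support
    D = 0.001 * rng.random((m, n))                    # small uniform noise
    C = A_star + B_star + D
    return A_star, B_star, C
```

By construction, A⋆ has rank r (with probability one) and B⋆ has exactly round(spr·mn) nonzero entries.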
In the following, we will recover the true matrices A ⋆ and B ⋆ from the observation C.
Let ErrA = ∥A^k − A⋆∥_F / ∥A⋆∥_F and ErrB = ∥B^k − B⋆∥_F / ∥B⋆∥_F denote the relative errors of the low-rank and the sparse matrix, respectively. In our experiments, we set the parameters ρ = 1e+3 and β = 1e+2, and the initial iterate (A⁰, B⁰, D⁰, Λ⁰) = (0, 0, 0, 0). The algorithms are stopped once

∥C − A^k − B^k − D^k∥_F / ∥C∥_F ≤ ϵ,

where ϵ is a given tolerance. Figure 2 shows the convergence curves of the three algorithms with the parameters m = n = 100, rank r = 10, a given noise D = 0.001 · rand(100, 100), and sparsity ratio 5%. It illustrates that the performance of 3b-sPADMM and TADMM is roughly equal, while 3b-ADMM converges much faster than both; 3b-ADMM achieves a satisfying result within at most 50 iterations. According to the data in Table 1, we also observe that 3b-ADMM performs the best among the tested algorithms in terms of both accuracy and speed. Therefore, the proposed 3b-ADMM for solving the low-rank and sparse matrix recovery problem has favorable numerical performance and converges faster than some state-of-the-art algorithms.
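The error measures and the stopping test, as we read them from the text, can be sketched as small helpers (the names `rel_err` and `stop_crit` are ours):

```python
import numpy as np

def rel_err(X, X_ref):
    """Relative error ||X - X_ref||_F / ||X_ref||_F used to report accuracy."""
    return np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref)

def stop_crit(A, B, D, C):
    """Relative feasibility residual ||C - A - B - D||_F / ||C||_F,
    our reading of the stopping measure compared against the tolerance."""
    return np.linalg.norm(C - A - B - D) / np.linalg.norm(C)
```

An iteration loop would terminate once `stop_crit(A, B, D, C) <= eps` for the chosen tolerance ϵ.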

Conclusions
In this paper, we mainly emphasized the applicability of the 3-block alternating direction method of multipliers (ADMM) for solving the low-rank and sparse matrix recovery problem from noisy observations, and showed the convergence of the 3-block ADMM with a strongly convex block when the penalty parameters of problem (1.5) satisfy β > ρ. The algorithm successfully separates background and foreground in video and recovers the low-rank and sparse matrices. Moreover, numerical experiments demonstrate the efficiency of the proposed method in comparison with several existing algorithms: it requires fewer iterations and achieves higher precision.