
Applied Mathematics and Computation

Volume 314, 1 December 2017, Pages 133-141

The two-stage iteration algorithms based on the shortest distance for low-rank matrix completion

https://doi.org/10.1016/j.amc.2017.07.024

Abstract

Although matrix completion requires the global solution of a non-convex objective, there are many computationally efficient algorithms that are effective for a broad class of matrices. Building on such algorithms for the matrix completion problem with given rank, we propose a class of two-stage iteration algorithms for general matrix completion in this paper. The inner iteration is the scaled alternating steepest descent algorithm for the fixed-rank matrix completion problem presented by Tanner and Wei (2016); the outer iteration uses two stopping criteria: the gradient norm and the distance between the feasible part and the corresponding part of the reconstructed low-rank matrix. The feasibility of the two-stage algorithms is proved. Finally, numerical experiments show that the two-stage algorithms based on shortening the distance are more effective than other algorithms.

Introduction

Since the pioneering work on low-rank approximation by Fazel [10] and on matrix completion by Candès and Recht [7], there has been a great deal of study (see [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26] and references therein), both theoretical and algorithmic, of the problem of recovering a low-rank matrix from partial entries, also known as matrix completion. The problem occurs in many areas of engineering and applied science, such as model reduction [18], machine learning [1], [2], control [20], pattern recognition [9], image inpainting [3] and computer vision [23], and there is rapidly growing interest in this issue. Explicitly seeking the lowest-rank matrix consistent with the known entries is mathematically expressed as
$$\min_{Z \in \mathbb{R}^{m\times n}} \operatorname{rank}(Z) \quad \text{subject to} \quad P_\Omega(Z) = P_\Omega(Z_0), \qquad (1)$$
where the matrix $Z_0 \in \mathbb{R}^{m\times n}$ is the underlying matrix to be reconstructed, $\Omega \subseteq \{1,2,\dots,m\}\times\{1,2,\dots,n\}$ is a random subset of indices for the known entries, and $P_\Omega$ is the associated sampling orthogonal projection operator which acquires only the entries indexed by $\Omega$.
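As an illustration of the sampling operator, the following Python sketch realizes $P_\Omega$ as entrywise masking; the names `sample_operator` and the boolean mask `omega` are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_operator(Z, omega):
    """P_Omega: keep the entries of Z indexed by omega and zero out the rest.
    Here omega is a boolean mask with the same shape as Z (an implementation choice)."""
    return np.where(omega, Z, 0.0)

# Small example: observe roughly 30% of the entries of a 5 x 5 matrix.
rng = np.random.default_rng(0)
Z0 = rng.standard_normal((5, 5))
omega = rng.random((5, 5)) < 0.3
observed = sample_operator(Z0, omega)
```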

Since a rank-$r$ matrix can be factorized into the bilinear form $Z = XY$ with $X \in \mathbb{R}^{m\times r}$ and $Y \in \mathbb{R}^{r\times n}$, a few algorithms have been presented to solve (1) when the rank $r$ is known or can be estimated [18], [19]. The general problem (1), however, is non-convex and NP-hard [12] due to the rank objective. Vandereycken [24] applied Riemannian optimization to the problem by minimizing the least-squares distance on the sampling set over the Riemannian manifold of fixed-rank matrices. A Riemannian geometry method and a Riemannian trust-region method were then given by Mishra et al. [21] and Boumal et al. [5], respectively. The gradient computation in these methods is expensive, however, and several methods based on alternating optimization over $X$ and $Y$ were subsequently proposed [8], [15], [22].

On the other hand, Candès and Recht [7] replaced the rank objective in (1) with its convex relaxation, the nuclear norm $\|Z\|_*$, which is the sum of all singular values of the matrix $Z$:
$$\min_{Z \in \mathbb{R}^{m\times n}} \|Z\|_* \quad \text{subject to} \quad P_\Omega(Z) = P_\Omega(Z_0). \qquad (2)$$
As an alternative to this convex optimization, many algorithms attempt to solve for the global minimum of (1) directly; many of them are adaptations of algorithms for compressed sensing, such as the hard thresholding algorithms [4], [14], [16] and the singular value thresholding (SVT) method as well as its variants [6], [13], [25]. However, the most direct implementations of these algorithms require the computation of a partial singular value decomposition (SVD) at each iteration. When the rank $r$ and the matrix size $n$ are proportional, computing the SVD has complexity $O(n^3)$, so the SVD becomes the dominant computational cost at each iteration and limits the applicability of these algorithms for large $n$. In addition, Lin et al. [17] proposed an augmented Lagrange multiplier (ALM) method which performs better than the others both in theory and in practice, with a global Q-linear convergence rate.
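For concreteness, the shrinkage step at the heart of SVT-type methods [6] can be sketched as below; this is a minimal illustration, not the algorithm proposed in this paper, and `svt_shrink` and `tau` are illustrative names. The full SVD call makes the per-iteration cost discussed above explicit.

```python
import numpy as np

def svt_shrink(Z, tau):
    """Singular value shrinkage D_tau(Z) = U diag(max(sigma - tau, 0)) V^T,
    the basic step of SVT-type methods.  Each call needs an SVD, which is the
    dominant per-iteration cost for large matrices."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```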

Recently, based on the simple factorization $Z = XY$ with $X \in \mathbb{R}^{m\times r}$ and $Y \in \mathbb{R}^{r\times n}$ mentioned above, algorithms have been designed to solve, rather than (1), the non-convex problem
$$\min_{X, Y} f(X, Y) \quad \text{with} \quad f(X, Y) := \frac{1}{2}\|P_\Omega(Z_0) - P_\Omega(XY)\|_F^2. \qquad (3)$$
In fact, the model (3) replaces the rank objective in (1) with the distance between a matrix and the $r$-dimensional manifold for the fixed-rank problem. Algorithms for the solution of (3) with this distance objective usually follow an alternating minimization scheme, with PowerFactorization [11] and LMaFit [26] as two representatives. Tanner and Wei [22] proposed an alternating steepest descent (ASD) method and a scaled variant (ScaledASD) in 2016; ASD and ScaledASD [22] are able to recover matrices of substantially higher rank than LMaFit [26] can. However, in most completion problems the rank is unknown, so it has to be estimated in advance or approached from a lower rank until $P_\Omega(Z_0) = P_\Omega(XY)$ is satisfied. In this study, we first define the distance between a matrix and the $r$-dimensional manifold and then propose a class of two-stage iteration algorithms for the case that the rank $r$ is unknown. The rank is increased either one by one, until the optimal rank $r$ is obtained, for a model whose estimated rank is low, or by combining jumps of length $l$ with one-by-one increases, until the optimal rank $r$ is obtained, for a model whose estimated rank is larger. The inner iteration finds the matrix that achieves the shortest, or approximately shortest, distance, and the outer iteration finds the optimal $r$-dimensional manifold using two kinds of criteria. The convergence theory of the new algorithms is studied.
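A minimal sketch of the objective in (3), assuming the sampling set is stored as a boolean mask `omega` (an implementation choice of ours, as in the earlier sketch), is:

```python
import numpy as np

def f_value(X, Y, Z0, omega):
    """f(X, Y) = 0.5 * ||P_Omega(Z0) - P_Omega(X Y)||_F^2 as in (3);
    omega is a boolean mask of the observed entries."""
    R = np.where(omega, Z0 - X @ Y, 0.0)
    return 0.5 * np.linalg.norm(R, 'fro') ** 2
```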

The rest of the paper is organized as follows. A class of two-stage iteration algorithms for the case that the rank $r$ is unknown is proposed in Section 2. The convergence of the new algorithms is discussed in Section 3. Numerical experiments and comparisons with other algorithms are presented in Section 4. Finally, we end the paper with concluding remarks in Section 5.

Here are some necessary notations and preliminaries. $\mathbb{R}^{m\times n}$ denotes the set of $m \times n$ real matrices, and $\mathbb{R}^n$ the set of $n$-dimensional real vectors. $X^T$ denotes the transpose of the matrix or vector $X$. The Frobenius norm is denoted by $\|X\|_F$. For a matrix $X = (x_1, x_2, \dots, x_n) \in \mathbb{R}^{m\times n}$, $\dim(X)$ is always used to represent the dimension of the manifold of fixed-rank matrices containing $X$, and $\operatorname{rank}(X)$ represents the rank of the matrix $X$. Let $\Omega \subseteq \{1,2,\dots,m\}\times\{1,2,\dots,n\}$ denote the indices of the observed entries of the matrix $X$, and $\bar{\Omega}$ the indices of the missing entries. Then $P_\Omega$ is the orthogonal projection operator onto the span of matrices vanishing outside of $\Omega$, so that the $(i, j)$th component of $P_\Omega(X)$ is equal to $X_{ij}$ when $(i, j) \in \Omega$, and zero otherwise. Also, $\mathcal{Z}_r = \{Z \in \mathbb{R}^{m\times n} : \operatorname{rank}(Z) = r\}$ stands for the $r$-dimensional manifold of fixed-rank matrices.

The one-to-one correspondence between a matrix and its projection enables us to define a notion of distance between a matrix and an $r$-dimensional manifold as follows.

Definition 1.1

For a matrix $Y \in \mathbb{R}^{m\times n}$,
$$d(Y_r) = \min_{Z \in \mathcal{Z}_r} \|Y - Z\|_F^2 \qquad (4)$$
is called the distance between the matrix $Y$ and the $r$-dimensional manifold $\mathcal{Z}_r$. It is essentially the distance between the matrix $Y$ and its projection onto the $r$-dimensional manifold $\mathcal{Z}_r$.

To introduce the new iteration algorithms, we also give the distance between a feasible matrix and its projection onto the $r$-dimensional manifold.

For $P_\Omega(Y) = P_\Omega(Z_0)$, we call $d(Y_r) = \min_{\dim(Z) = r} \|Y - Z\|_F^2$ the distance between a feasible matrix $Y$ and the $r$-dimensional manifold $\mathcal{Z}_r$.

Evidently, $\min d(Y_r) > 0$ if $r < \min \operatorname{rank}(Z)$ and $\min d(Y_r) = 0$ if $r = \min \operatorname{rank}(Z)$, where the minimum rank is taken over feasible matrices $Z$. To obtain $\min d(Y_r)$, some algorithms are presented by combining the models (1) and (4).
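By the Eckart–Young theorem, the distance in Definition 1.1 can be evaluated from the singular values of $Y$; the following sketch (our own illustration, assuming $\operatorname{rank}(Y) \ge r$) computes $d(Y_r)$ via a truncated SVD.

```python
import numpy as np

def distance_to_manifold(Y, r):
    """d(Y_r) = min over rank-r matrices Z of ||Y - Z||_F^2.  By the Eckart-Young
    theorem the best rank-r approximation keeps the r largest singular values, so
    (assuming rank(Y) >= r) the squared distance is the sum of the squared rest."""
    s = np.linalg.svd(Y, compute_uv=False)   # singular values in descending order
    return float(np.sum(s[r:] ** 2))
```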


Scaled alternating steepest descent (ScaledASD) method

In order to solve the non-convex problem (3), the alternating steepest descent (ASD) method [22] applies steepest gradient descent to $f(X, Y)$ in (3) alternately with respect to $X$ and $Y$. If $f(X, Y)$ is written as $f_Y(X)$ when $Y$ is held constant and as $f_X(Y)$ when $X$ is held constant, the gradient directions are
$$\nabla f_Y(X) = -\left(P_\Omega(Z_0) - P_\Omega(XY)\right) Y^T \quad \text{and} \quad \nabla f_X(Y) = -X^T\left(P_\Omega(Z_0) - P_\Omega(XY)\right).$$
The steepest descent stepsizes along the gradient descent directions $\nabla f_Y(X)$ and $\nabla f_X(Y)$ are denoted by
$$t_x = \frac{\|\nabla f_Y(X)\|_F^2}{\|P_\Omega(\nabla f_Y(X)\, Y)\|_F^2} \quad \text{and} \quad t_y = \frac{\|\nabla f_X(Y)\|_F^2}{\|P_\Omega(X\, \nabla f_X(Y))\|_F^2}.$$
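A minimal sketch of one ASD sweep built directly from the formulas above is given below; it is our own illustration with the sampling set stored as a boolean mask, and the ScaledASD inner iteration used in this paper additionally rescales the gradients by $(YY^T)^{-1}$ and $(X^TX)^{-1}$ (see [22]), which is omitted here.

```python
import numpy as np

def asd_sweep(X, Y, Z0, omega):
    """One alternating steepest descent sweep for f in (3), using the gradient and
    stepsize formulas quoted above; omega is a boolean mask of observed entries."""
    # Descent in X with Y fixed.
    R = np.where(omega, Z0 - X @ Y, 0.0)               # P_Omega(Z0) - P_Omega(XY)
    gX = -R @ Y.T                                       # gradient of f_Y(X)
    tx = np.linalg.norm(gX, 'fro') ** 2 / np.linalg.norm(np.where(omega, gX @ Y, 0.0), 'fro') ** 2
    X = X - tx * gX
    # Descent in Y with the updated X fixed.
    R = np.where(omega, Z0 - X @ Y, 0.0)
    gY = -X.T @ R                                       # gradient of f_X(Y)
    ty = np.linalg.norm(gY, 'fro') ** 2 / np.linalg.norm(np.where(omega, X @ gY, 0.0), 'fro') ** 2
    Y = Y - ty * gY
    return X, Y
```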

Convergence analysis

In this section, the convergence theory of Algorithms 3 and 4 is discussed in detail; that of Algorithm 2 follows from a simple modification of the analysis in [22].

The following lemmas will be needed in the subsequent analysis.

Lemma 3.1

(Lemma 4.2 of [22]) Assume that the matrix sequence $(X_i, Y_i)$ generated by Algorithm 1 is nonsingular (i.e., full rank) during all the iterations. Let $(X_{i_k}, Y_{i_k})$ be a subsequence of $(X_i, Y_i)$ converging to a stationary and nonsingular pair $(X^*, Y^*)$. Then $(X_{i_k}, Y_{i_k})$ is bounded and satisfies $\lim_{i_k \to \infty} \dots$

Numerical experiments

In this section, we conduct experiments on randomly drawn rank-$r$ matrices generated by the model $Z_0 = XY$, where $X \in \mathbb{R}^{m\times r}$ and $Y \in \mathbb{R}^{r\times n}$ have entries drawn from the normal distribution $N(0, 1)$. A random subset $\Omega$ is sampled uniformly at random, and $p$ is defined as the fraction of observed entries in $Z_0$. For conciseness, the tests presented consider square matrices, as is typical in such studies.
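Test problems of this kind can be generated as in the sketch below; the function name `make_test_problem` and the mask representation are our own illustrative choices, and the exact sampling procedure used in the experiments may differ in detail.

```python
import numpy as np

def make_test_problem(m, n, r, p, seed=0):
    """Random rank-r test matrix Z0 = X Y with N(0,1) factors and a sampling mask
    that observes (approximately) a fraction p of the entries."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((m, r))
    Y = rng.standard_normal((r, n))
    omega = rng.random((m, n)) < p
    return X @ Y, omega
```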

The comparison results of the four algorithms (Two-stage I, Two-stage II, Two-stage III, and ALM) are displayed in Tables 1–6. There, we denote

Concluding remarks

For general matrix completion problems, we have proposed a class of two-stage iteration methods, Two-stage I–III. The inner iteration is the scaled alternating steepest descent algorithm for the fixed-rank matrix completion problem; the outer iteration can use two stopping criteria: the gradient norm and the distance between the feasible part and the corresponding part of the completed low-rank matrix. Numerical experiments show that the Two-stage II–III methods outperform the well-known ALM method.

References (26)

  • C. Tomasi et al.

    Shape and motion from image streams under orthography: a factorization method

    Int. J. Comput. Vis.

    (1992)
  • Z. Wen et al.

    Solving a low-rank factorization model for matrix completion by a non-linear successive over-relaxation algorithm

    Math. Program. Comput.

    (2012)
  • Y. Amit et al.

    Uncovering shared structures in multiclass classification

    Proceedings of the Twenty-Fourth International Conference on Machine Learning

    (2007)
  • A. Argyriou et al.

    Multi-task feature learning

    Adv. Neural Inf. Process. Syst.

    (2007)
  • M. Bertalmio et al.

    Image inpainting

    Comput. Gr.

    (2000)
  • J. Blanchard et al.

    CGIHT: Conjugate Gradient Iterative Hard Thresholding for Compressed Sensing and Matrix Completion

    (2014)
  • N. Boumal et al.

    RTRMC: a Riemannian trust-region method for low-rank matrix completion

  • J.-F. Cai et al.

    A singular value thresholding algorithm for matrix completion

    SIAM J. Optim.

    (2010)
  • E.J. Candès et al.

    Exact matrix completion via convex optimization

    Found. Comput. Math.

    (2009)
  • C. Chen et al.

    Matrix completion via an alternating direction method

    IMA J. Numer. Anal.

    (2012)
  • L. Eldén

    Matrix Methods in Data Mining and Pattern Recognition

    (2007)
  • M. Fazel

    Matrix rank minimization with applications

    (2002)
  • J.P. Haldar et al.

    Rank-constrained solutions to linear matrix equations using PowerFactorization

    IEEE Signal Process. Lett.

    (2009)

This work is supported by NSF of China (11371275) and NSF of Shanxi Province (201601D011004).
