Compressed Sensing with coherent tight frames via $l_q$-minimization for $0<q\le 1$

The aim of this article is to reconstruct a signal from undersampled data in the situation that the signal is sparse in terms of a tight frame. We present a condition, independent of the coherence of the tight frame, that guarantees accurate recovery of signals which are sparse in the tight frame from undersampled data with minimal $l_1$-norm of the transform coefficients. This improves the result in [1]. We also introduce $l_q$-minimization $(0<q<1)$ approaches and show that, under a suitable condition, there exists a value $q_0\in(0,1]$ such that for any $q\in(0,q_0)$, each solution of the $l_q$-minimization is a good approximation to the true signal. In particular, when the tight frame is an identity matrix or an orthonormal basis, our results reduce to those obtained in [13] and [26].


Introduction
Compressed sensing is a new type of sampling theory which predicts that sparse signals can be reconstructed from what was previously believed to be incomplete information [2,3,4]. By now, applications of compressed sensing are abundant and range from medical imaging and error correction to radar and remote sensing; see [5,6] and the references therein.
In compressed sensing, one considers the following model:
$$y = Ax + z,$$
where $A$ is a known $m\times n$ measurement matrix (with $m \ll n$) and $z \in \mathbb{R}^m$ is a vector of measurement errors. The goal is to reconstruct the unknown signal $x$ based on $y$ and $A$. The key idea of compressed sensing relies on the signal being sparse or approximately sparse. A naive approach to this problem consists in searching for the sparsest vector that is consistent with the linear measurements, which leads to:
$$(L_{0,\varepsilon})\qquad \min_{x\in\mathbb{R}^n}\|x\|_0 \quad\text{subject to}\quad \|Ax-y\|_2\le\varepsilon,$$
where $\|x\|_0$ is the number of nonzero components of $x=(x_1,\dots,x_n)\in\mathbb{R}^n$, $\|\cdot\|_2$ denotes the standard Euclidean norm, and $\varepsilon\ge 0$ is a likely upper bound on the noise level $\|z\|_2$. The case $\varepsilon=0$ corresponds to noiseless measurements, and $\varepsilon>0$ to noisy ones. We say that a vector $x$ is $s$-sparse if $\|x\|_0\le s$. Unfortunately, solving $(L_{0,\varepsilon})$ directly is NP-hard in general and thus computationally infeasible [7,8]. One practical and tractable alternative to $(L_{0,\varepsilon})$ proposed in the literature is:
$$(L_{1,\varepsilon})\qquad \min_{x\in\mathbb{R}^n}\|x\|_1 \quad\text{subject to}\quad \|Ax-y\|_2\le\varepsilon,$$
which is a convex optimization problem and can be seen as a convex relaxation of $(L_{0,\varepsilon})$. The restricted isometry property (RIP), which first appeared in [9], is one of the most commonly used frameworks for sparse recovery via $(L_{1,\varepsilon})$. For an integer $s$ with $1\le s\le n$, the $s$-restricted isometry constant $\delta_s$ of a matrix $A$ is defined as the smallest constant satisfying
$$(1-\delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_s)\|x\|_2^2 \qquad (1.1)$$
for all $s$-sparse vectors $x$ in $\mathbb{R}^n$. A direct computation shows that
$$\delta_s = \max_{T\subset\{1,\dots,n\},\,|T|\le s}\|A_T^*A_T - I\|, \qquad (1.2)$$
where $\|\cdot\|$ denotes the spectral norm of a matrix and $A_T$ is the submatrix of $A$ consisting of the columns indexed by $T$. However, it would be computationally difficult to compute $\delta_s$ using (1.2). One piece of good news is that many types of random measurement matrices have small restricted isometry constants with very high probability provided that the number of measurements $m$ is large enough [3,23,24]. Since $\delta_{2s}<1$ is a necessary and sufficient condition to guarantee that any $s$-sparse vector is exactly recovered via $(L_{0,0})$ in the noiseless case, much attention has been focused on $\delta_{2s}$ in the literature [10,11,12,13].
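Since computing $\delta_s$ exactly via (1.2) requires a search over all supports, a simple Monte Carlo experiment can at least produce an empirical lower bound on $\delta_s$ for a concrete random matrix. The sketch below (illustrative only; the sizes $n$, $m$, $s$ and the number of trials are arbitrary choices, not taken from the text) samples random $s$-sparse vectors and records how far $\|Ax\|_2^2/\|x\|_2^2$ deviates from $1$:

```python
import numpy as np

# Empirical lower bound on the restricted isometry constant delta_s of a
# Gaussian matrix with variance 1/m, so that E||Ax||^2 = ||x||^2.
rng = np.random.default_rng(0)
n, m, s, trials = 64, 40, 3, 500
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))

ratios = []
for _ in range(trials):
    x = np.zeros(n)
    support = rng.choice(n, size=s, replace=False)  # random s-sparse vector
    x[support] = rng.normal(size=s)
    ratios.append(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)

# Any ratio outside [1 - delta, 1 + delta] witnesses delta_s >= delta, so
# sampling only ever yields a LOWER bound on delta_s, never the exact value.
delta_lower = max(max(ratios) - 1.0, 1.0 - min(ratios))
```

Because the maximum in (1.2) runs over $\binom{n}{s}$ supports, such sampling cannot certify a small $\delta_s$; this is precisely why probabilistic guarantees for random matrices are used instead.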
Candès [10] showed that under the condition $\delta_{2s}<0.414$, one can recover an (approximately) sparse signal with a small or zero error using $(L_{1,\varepsilon})$. Later, the sufficient condition on $\delta_{2s}$ was improved to $\delta_{2s}<0.453$ by Lai et al. [11] and to $\delta_{2s}<0.472$ by Cai et al. [12], respectively. Recently, Li and Mo [13] improved the sufficient condition to $\delta_{2s}<0.493$, and for some special cases to $\delta_{2s}<0.656$. To the best of our knowledge, this is the best known bound on $\delta_{2s}$ in the literature. On the other hand, Davies and Gribonval [20] constructed examples showing that if $\delta_{2s}\ge 1/\sqrt{2}$, exact recovery of certain $s$-sparse signals can fail in the noiseless case.
For signals which are sparse in the standard coordinate basis, or in terms of some other orthonormal basis, the mechanism above applies. However, in practical examples there are numerous signals of interest which are not sparse in an orthonormal basis. More often than not, sparsity is expressed not in terms of an orthonormal basis but in terms of an overcomplete dictionary [1].
In this paper, we consider the recovery of signals that are sparse in terms of a tight frame from undersampled data. Formally, let $D$ be an $n\times d$ matrix whose $d$ columns $D_1,\dots,D_d$ form a tight frame for $\mathbb{R}^n$, i.e.,
$$f=\sum_{k=1}^{d}\langle f,D_k\rangle D_k \quad\text{and}\quad \|f\|_2^2=\sum_{k=1}^{d}|\langle f,D_k\rangle|^2 \qquad\text{for all } f\in\mathbb{R}^n,$$
where $\langle\cdot,\cdot\rangle$ denotes the standard Euclidean inner product. Our object in this paper is to reconstruct the unknown signal $f\in\mathbb{R}^n$ from a collection of $m$ linear measurements $y=Af+z$ under the assumption that $D^*f$ is sparse or nearly sparse. This problem has been considered in [22,25,14,1]. The methods in [22,14,25] force incoherence on the dictionary $D$ so that the matrix $AD$ conforms to the standard compressed sensing results above. As a result, they may not be suitable for dictionaries whose columns are highly correlated. One new alternative, which imposes no such properties on the dictionary $D$, reconstructs the signal $f$ from $y=Af+z$ by solving the $l_1$-minimization problem
$$(P_{1,\varepsilon})\qquad \min_{\tilde f\in\mathbb{R}^n}\|D^*\tilde f\|_1 \quad\text{subject to}\quad \|A\tilde f-y\|_2\le\varepsilon,$$
where again $\varepsilon\ge 0$ is a likely upper bound on the noise level $\|z\|_2$. To discuss the performance of this method, we introduce the D-RIP of a measurement matrix, which first appeared in [1] and is a natural extension of the standard RIP.
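A concrete toy example of a tight frame (not taken from the text; the classical "Mercedes-Benz" frame of three equiangular vectors in $\mathbb{R}^2$) makes the reconstruction identity above tangible:

```python
import numpy as np

# Mercedes-Benz frame: three equiangular unit vectors in R^2, rescaled by
# sqrt(n/d) = sqrt(2/3) so that the frame is tight, i.e. D D* = I_2.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
D = np.sqrt(2 / 3) * np.vstack([np.cos(angles), np.sin(angles)])  # 2 x 3

gram = D @ D.T  # should equal the 2 x 2 identity for a tight frame

# Tight-frame identity: f = sum_k <f, D_k> D_k for every f in R^2.
f = np.array([1.7, -0.4])
f_rec = sum((f @ D[:, k]) * D[:, k] for k in range(3))
```

Note that the three columns of $D$ are linearly dependent (the frame is overcomplete), yet every signal is exactly reproduced from its frame coefficients; this is the setting in which $D^*f$ may be sparse even though no orthonormal basis sparsifies $f$.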
For the rest of this paper, $D$ is an $n\times d$ tight frame and $\delta_s$ denotes the D-RIP constant of order $s$ of the measurement matrix $A$, unless otherwise mentioned. Throughout this paper, $v_{[s]}$ denotes the vector consisting of the $s$ largest coefficients of $v\in\mathbb{R}^d$ in magnitude, with all other entries set to zero. Candès et al. [1] showed that if $A$ satisfies the D-RIP with $\delta_{2s}<0.08$ (in fact, the weaker condition $\delta_{7s}<0.6$ was stated), then the solution $\hat f$ to $(P_{1,\varepsilon})$ satisfies
$$\|\hat f-f\|_2 \le C_0\,\varepsilon + C_1\,\frac{\|D^*f-(D^*f)_{[s]}\|_1}{\sqrt{s}}, \qquad (1.3)$$
where the constants $C_0$ and $C_1$ may only depend on $\delta_{2s}$. It is computationally difficult to verify the D-RIP for a given deterministic matrix. But for matrices with Gaussian, subgaussian, or Bernoulli entries, the D-RIP condition will be satisfied with overwhelming probability provided that the number of measurements $m$ is on the order of $s\log(d/s)$. In fact, any $m\times n$ matrix $A$ obeying, for any fixed $\nu\in\mathbb{R}^n$,
$$\mathbb{P}\Bigl(\bigl|\|A\nu\|_2^2-\|\nu\|_2^2\bigr|\ge\delta\|\nu\|_2^2\Bigr)\le c\,e^{-\gamma m\delta^2}, \qquad \delta\in(0,1) \qquad (1.4)$$
($\gamma, c$ are positive numerical constants) will satisfy the D-RIP with overwhelming probability provided that $m\gtrsim s\log(d/s)$ [1]. Therefore, by using the D-RIP, this line of work is independent of the coherence of the dictionary. The result holds even when the coherence of the dictionary $D$ is maximal, meaning two columns are completely correlated. Although Candès et al. [1] gave a sufficient condition on $\delta_{2s}$ to guarantee approximate recovery of a signal via $(P_{1,\varepsilon})$, the bound on $\delta_{2s}$ is much more restrictive than in the case where $D$ is an orthonormal basis. We focus on improving it in this paper.
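The concentration inequality (1.4) is easy to observe numerically for Gaussian matrices. The sketch below (illustrative sizes and thresholds, chosen arbitrarily) fixes a vector $\nu$, redraws $A$ many times, and records how often $\|A\nu\|_2^2$ deviates from $\|\nu\|_2^2$ by more than $\delta\|\nu\|_2^2$:

```python
import numpy as np

# Monte Carlo check of the concentration phenomenon behind (1.4): for a
# fixed nu, ||A nu||^2 concentrates around ||nu||^2 over draws of a
# Gaussian A with entries N(0, 1/m).
rng = np.random.default_rng(1)
n, m, trials, delta = 50, 2000, 200, 0.2
nu = rng.normal(size=n)
nu_sq = np.linalg.norm(nu) ** 2

deviations = []
for _ in range(trials):
    A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    deviations.append(abs(np.linalg.norm(A @ nu) ** 2 - nu_sq) / nu_sq)

# Fraction of trials violating the delta-band; (1.4) predicts this decays
# exponentially in m * delta^2.
failure_rate = np.mean(np.array(deviations) >= delta)
```

With $m=2000$ the relative fluctuation of $\|A\nu\|_2^2/\|\nu\|_2^2$ is on the order of $\sqrt{2/m}\approx 0.03$, so deviations of size $\delta=0.2$ are essentially never observed, consistent with the exponential bound.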
The first goal of this paper is to show that the sufficient condition on $\delta_{2s}$ above can be improved to $\delta_{2s}<0.493$, and in some special cases to $\delta_{2s}<0.656$. These results are given and proved in Section 3. Weakening the D-RIP condition has several benefits. First, it allows more measurement matrices to be used in compressed sensing. Second, it gives better error estimates in the general problem of recovering a noisy compressible signal. For example, if $\delta_{2s}=1/14$, then by [27, Corollary 3.4], $\delta_{7s}\le 0.5$. Using the approach in [1] one would get an estimate in (1.3) with $C_0=30$, $C_1=62$, while by Theorem 3.4 in Section 3 of this paper, one would get an estimate in (1.3) with $C_0\simeq 5.06$ and $C_1\simeq 10.57$. Finally, for the same random measurement matrix $A$ satisfying (1.4), a standard argument as in [24,1] shows that it allows recovering sparse signals with more nonzero transform coefficients. In a nutshell, weakening the D-RIP condition for $(P_{1,\varepsilon})$ is as important as weakening the RIP condition for the classical $(L_{1,\varepsilon})$.
Note that the $l_0$-norm is the limit as $q\to 0$ of the $l_q$-quasi-norms in the following sense:
$$\|x\|_0=\lim_{q\to 0^+}\|x\|_q^q=\lim_{q\to 0^+}\sum_{i=1}^{n}|x_i|^q.$$
Thus the $l_q$-norm with $0<q<1$ can be used for measuring sparsity. Therefore, one alternative way of finding the solution of $(L_{0,\varepsilon})$ proposed in the literature is to solve:
$$(L_{q,\varepsilon})\qquad \min_{x\in\mathbb{R}^n}\|x\|_q^q \quad\text{subject to}\quad \|Ax-y\|_2\le\varepsilon.$$
This is a non-convex optimization problem, since the $l_q$-norm with $0<q<1$ is not a norm but a quasi-norm. For any fixed $0<q<1$, while finding the global minimum of $(L_{q,\varepsilon})$ is NP-hard, computing a local minimizer of the problem is polynomial-time doable [15]. Therefore, solving $(L_{q,\varepsilon})$ is still much faster than solving $(L_{0,\varepsilon})$, at least locally. Reconstruction of sparse signals via $(L_{q,\varepsilon})$ with $0<q<1$ has been considered in a series of papers [17,16,18,11,19,26], and some of its virtues have been highlighted recently. Lai and Liu [26] showed that as long as the classical restricted isometry constant satisfies $\delta_{2s}<1/2$, there exists a value $q_0\in(0,1]$ such that for any $q\in(0,q_0)$, each solution of $(L_{q,0})$ for the sparse solution of any underdetermined linear system is the sparsest solution. Thus, it is natural for us to consider the reconstruction of a signal $f$ from $y=Af+z$ by the method of $l_q$-minimization $(0<q<1)$:
$$(P_{q,\varepsilon})\qquad \min_{\tilde f\in\mathbb{R}^n}\|D^*\tilde f\|_q^q \quad\text{subject to}\quad \|A\tilde f-y\|_2\le\varepsilon.$$
The second goal of this paper is to estimate the approximation error between $\hat f$ and $f$ when using the $l_q$-minimization $(P_{q,\varepsilon})$. We show that if the measurement matrix satisfies the D-RIP condition with $\delta_{2s}<1/2$, then there exists a value $q_0=q_0(\delta_{2s})\in(0,1]$ such that for any $q\in(0,q_0)$, each solution of the $l_q$-minimization is a good approximation to the true signal $f$. This paper is organized as follows. Some lemmas and notations are introduced in Section 2. Section 3 is devoted to the recovery of a signal from noisy data via $l_1$-minimization. We begin by giving some lemmas in that section, and then discuss approximate recovery of a signal in the general case in Subsection 3.1 and in a special case in Subsection 3.2.
Our main results in that section are Theorem 3.4 and Theorem 3.8. In Section 4, we discuss the recovery of a signal from noisy data via $l_q$-minimization with $0<q\le 1$. Some lemmas and notations are introduced at the beginning; subsequently, we give the main result, Theorem 4.4, and prove it.
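As a quick numerical illustration of the quasi-norm limit $\|x\|_q^q\to\|x\|_0$ used above (the vector and the values of $q$ below are arbitrary illustrative choices):

```python
import numpy as np

# For each nonzero entry, |x_i|^q -> 1 as q -> 0+, while 0^q = 0 for q > 0,
# so sum_i |x_i|^q tends to the number of nonzero entries, ||x||_0.
x = np.array([3.0, -1.5, 0.0, 0.25, 0.0])
l0 = np.count_nonzero(x)

lq_q = {q: np.sum(np.abs(x) ** q) for q in (0.5, 0.1, 0.01)}
```

Here the values $\|x\|_q^q$ approach $\|x\|_0=3$ monotonically in the small-$q$ regime, which is why $(L_{q,\varepsilon})$ with small $q$ serves as a sparsity-promoting surrogate for $(L_{0,\varepsilon})$.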

Lemmas
We will give some lemmas and notations first. We begin by discussing some results on the recovery of a signal by $l_0$-minimization: Combining with $\delta_{2s}<1$, we get $0=DD^*h=h$. The proof is finished.
Proof. For $u,v\in\Sigma_s$, assume that $\|Du\|_2=\|Dv\|_2=1$. By the definition of the D-RIP, we have Thus, by a simple modification, we conclude the proof. For $T\subset\{1,\dots,d\}$, denote by $D_T$ the matrix $D$ restricted to the columns indexed by $T$, write $D_T^*$ for $(D_T)^*$, and let $T^c$ denote the complement of $T$ in $\{1,\dots,d\}$. Given a vector $h\in\mathbb{R}^n$, we write $D^*h=(x_1,\dots,x_s,\dots,x_{2s},\dots,x_d)$.
Proof. By the definition of $\delta_{2s}$ and Lemma 2.2, we have where we have used $\|DD_{T_j}^*h\|_2\le\|D_{T_j}^*h\|_2$, $j\in\{0,1,\dots,l\}$.
Proof. By the definition of δ 2s and (2.2), we have

Recovery via $l_1$-minimization
In this section, we are concerned with the reconstruction of a signal $f$ from $y=Af+z$ by the method of $l_1$-minimization $(P_{1,\varepsilon})$:
$$\min_{\tilde f\in\mathbb{R}^n}\|D^*\tilde f\|_1 \quad\text{subject to}\quad \|A\tilde f-y\|_2\le\varepsilon.$$
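Before the analysis, a minimal computational sketch of this recovery problem may be helpful. In the noiseless case $\varepsilon=0$ and with $D=I$ (so the problem reduces to classical basis pursuit), $(P_{1,0})$ is a linear program after the standard split $x=u-v$ with $u,v\ge 0$. All sizes, the seed, and the support below are arbitrary illustrative choices:

```python
import numpy as np
from scipy.optimize import linprog

# Basis pursuit: min ||x||_1 subject to Ax = y, solved as an LP.
rng = np.random.default_rng(3)
n, m = 20, 12
A = rng.normal(size=(m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[[4, 11]] = [1.0, -2.0]          # a 2-sparse ground-truth signal
y = A @ x0

c = np.ones(2 * n)                 # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])          # equality constraint: A(u - v) = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
```

With $m=12$ Gaussian measurements of a 2-sparse signal in $\mathbb{R}^{20}$, the minimizer of the LP coincides with $x_0$, which is exactly the phenomenon the RIP/D-RIP conditions in this section are designed to guarantee uniformly.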
Let $h=\hat f-f$, where $\hat f$ is the solution of $(P_{1,\varepsilon})$ and $f$ is the original signal. We use the same assumptions as in Section 2. Furthermore, rearranging the indices if necessary, we assume that the first $s$ coordinates of $D^*f$ are the largest in magnitude and that $|x_{s+1}|\ge|x_{s+2}|\ge\cdots\ge|x_d|$. For the rest of this section, we will always assume that $D^*$ Proof. By the simple inequality Proof. By [21, Proposition 1], we have Therefore, we have Combining the above inequality with Lemma 3.1, one can finish the proof.
Since $\hat f$ is a minimizer of $(P_{1,\varepsilon})$, one gets that That is Thus

This implies
According to the feasibility of $\hat f$, $Ah$ must be small:
Applying Lemmas 2.4 and 2.5 to the above inequality yields Using this in (3.5), one can get Notice that by Lemma 3.2, we have where we have used a fact valid for all $\delta_{2s}$. Thus, it follows from the above two inequalities and (3.7) that Combining with (3.1), one can finish the proof.
The main result of this subsection is the following theorem.
Proof. By (3.5), we have Hence, we have Then by Lemmas 3.1 and 3.2, we have It follows from the above two inequalities that Therefore, combining the above inequality with Lemma 3.3, we prove the result.
Since $n\le 4s$, we have $l\le 3$. For simplicity, we assume that $l=3$. Instead of Lemmas 2.4 and 2.5, we have the following results.
Lemma 3.5. We have The main result of this subsection is the following theorem.
Theorem 3.8. If $n\le 4s$ and $\delta_{2s}<0.656$, then where Proof. By a similar approach to that for (3.9), we have Then by Lemma 3.1, we have Therefore, combining the above inequality with Lemma 3.7, we prove the result.
Remark 3.9. When $D$ is an identity matrix, that is, for the classical RIP and $l_1$-minimization $(L_{1,\varepsilon})$, Theorems 3.4 and 3.8 were proved in [13].
Recovery via $l_q$-minimization with $0<q<1$
In this section, we discuss the recovery of a signal by $l_q$-minimization $(P_{q,\varepsilon})$ with $0<q<1$. For $q\in(0,1]$, let $h=\hat f-f$, where $\hat f$ is the solution of $(P_{q,\varepsilon})$ and $f$ is the original signal. We follow the same assumptions as in Section 2. Moreover, rearranging the indices if necessary, we assume that the first $s$ coordinates of $D^*f$ are the largest in magnitude and that $|x_{s+1}|\ge|x_{s+2}|\ge\cdots\ge|x_d|$. For the rest of this section, for $q\in(0,1]$, we will always assume that $D^*$ Proof. By the simple inequality Proof. By [26, Lemma 2], we have Therefore, we have where for the last inequality we have used $\|u\|_1\le\|u\|_q$ for $u\in\mathbb{R}^l$. Combining with Lemma 4.1, one can conclude the result. Analogously to (3.1) and (3.2), one can prove that and Denote For $\delta_{2s}<\frac{1}{2}$, one can prove that there exists a value $q_0=q_0(\delta_{2s})\in(0,1]$ such that for all $q\in(0,q_0)$, $\rho_s(q)<1$. Indeed, by an easy calculation, $\rho_s(q)<1$ is equivalent to Since the second term on the left-hand side goes to zero as $q\to 0^+$ (because $\delta_{2s}<1$ and $q\le 1$), one can finish the proof.