On Recovery of Sparse Signals Via ℓ1 Minimization

T. Tony Cai, Guangwu Xu, and Jun Zhang, Senior Member, IEEE

Abstract—This paper considers constrained ℓ1 minimization methods in a unified framework for the recovery of high-dimensional sparse signals in three settings: noiseless, bounded error, and Gaussian noise. Both ℓ1 minimization with an ℓ∞ constraint (Dantzig selector) and ℓ1 minimization under an ℓ2 constraint are considered. The results of this paper improve the existing results in the literature by weakening the conditions and tightening the error bounds. The improvement on the conditions shows that signals with larger support can be recovered accurately. In particular, our results illustrate the relationship between ℓ1 minimization with an ℓ2 constraint and ℓ1 minimization with an ℓ∞ constraint. This paper also establishes connections between the restricted isometry property and the mutual incoherence property. Some results of Candes, Romberg, and Tao (2006), Candes and Tao (2007), and Donoho, Elad, and Temlyakov (2006) are extended.

Index Terms—Dantzig selector, ℓ1 minimization, restricted isometry property, sparse recovery, sparsity.

Manuscript received May 01, 2008; revised November 13, 2008. Current version published June 24, 2009. The work of T. Cai was supported in part by the National Science Foundation (NSF) under Grant DMS-0604954; the work of G. Xu was supported in part by the National 973 Project of China (No. 2007CB807903). T. T. Cai is with the Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104 USA (e-mail: [email protected]). G. Xu and J. Zhang are with the Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211 USA (e-mail: [email protected]; [email protected]). Communicated by J. Romberg, Associate Editor for Signal Processing. Digital Object Identifier 10.1109/TIT.2009.2021377

I. INTRODUCTION

The problem of recovering a high-dimensional sparse signal based on a small number of measurements, possibly corrupted by noise, has attracted much recent attention. This problem arises in many different settings, including compressed sensing, constructive approximation, model selection in linear regression, and inverse problems. Suppose we have observations of the form

$$y = A\beta + z \tag{I.1}$$

where the matrix $A \in \mathbb{R}^{n \times p}$ is given and $z$ is a vector of measurement errors. The goal is to reconstruct the unknown vector $\beta \in \mathbb{R}^p$. Depending on the setting, the error vector $z$ can be zero (the noiseless case), bounded, or Gaussian, where $z \sim N(0, \sigma^2 I_n)$. It is now well understood that ℓ1 minimization provides an effective way of reconstructing a sparse signal in all three settings. See, for example, Fuchs [13], Candes and Tao [5], [6], Candes, Romberg, and Tao [4], Tropp [18], and Donoho, Elad, and Temlyakov [10].

A special case of particular interest is when no noise is present in (I.1), i.e., $z = 0$. This is then an underdetermined system of linear equations with more variables than equations. The problem is clearly ill-posed and there are generally infinitely many solutions. However, in many applications the vector $\beta$ is known to be sparse or nearly sparse in the sense that it contains only a small number of nonzero entries. This sparsity assumption fundamentally changes the problem: although there are infinitely many general solutions, under regularity conditions there is a unique sparse solution. Indeed, in many cases the unique sparse solution can be found exactly through ℓ1 minimization,

$$(\text{Exact}) \qquad \min_{\gamma \in \mathbb{R}^p} \|\gamma\|_1 \quad \text{subject to} \quad A\gamma = y. \tag{I.2}$$

This minimization problem has been studied, for example, in Fuchs [13], Candes and Tao [5], and Donoho [8]. Understanding the noiseless case is not only of significant interest in its own right; it also provides deep insight into the problem of reconstructing sparse signals in the noisy case. See, for example, Candes and Tao [5], [6] and Donoho [8], [9].

When noise is present, there are two well-known ℓ1 minimization methods. One is ℓ1 minimization under an ℓ2 constraint on the residuals,

$$(\ell_2\text{-Constraint}) \qquad \min_{\gamma \in \mathbb{R}^p} \|\gamma\|_1 \quad \text{subject to} \quad \|y - A\gamma\|_2 \le \epsilon. \tag{I.3}$$
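The noiseless program (Exact) can be handed to a generic linear-programming solver after introducing auxiliary variables that bound the absolute values of the coordinates. The sketch below is an editorial illustration only (it is not the authors' code, and the matrix and function names are assumptions); it uses SciPy's LP solver.

# Minimal sketch of solving (Exact): min ||gamma||_1 subject to A gamma = y,
# via the standard LP reformulation with u >= |gamma| and objective sum(u).
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    n, p = A.shape
    c = np.concatenate([np.zeros(p), np.ones(p)])   # minimize the sum of the u-block
    I = np.eye(p)
    A_ub = np.block([[I, -I], [-I, -I]])            # gamma - u <= 0 and -gamma - u <= 0
    b_ub = np.zeros(2 * p)
    A_eq = np.hstack([A, np.zeros((n, p))])         # equality constraint A gamma = y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * p + [(0, None)] * p, method="highs")
    return res.x[:p]

# Toy check: recover a 2-sparse vector from 10 Gaussian measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 30)) / np.sqrt(10)
beta = np.zeros(30); beta[3], beta[17] = 1.5, -2.0
print(np.max(np.abs(basis_pursuit(A, A @ beta) - beta)))   # small, up to solver tolerance

The same reformulation idea carries over to the constrained noisy programs discussed next.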

Written in terms of its Lagrangian function, the (ℓ2-Constraint) problem is closely related to solving the ℓ1 regularized least squares problem

$$\min_{\gamma \in \mathbb{R}^p} \ \tfrac{1}{2}\|y - A\gamma\|_2^2 + \lambda \|\gamma\|_1. \tag{I.4}$$

The latter is often called the Lasso in the statistics literature (Tibshirani [16]). Tropp [18] gave a detailed treatment of the ℓ1 regularized least squares problem. Another method, called the Dantzig selector, was recently proposed by Candes and Tao [6]. The Dantzig selector solves the sparse recovery problem through ℓ1 minimization with a constraint on the correlation between the residuals and the column vectors of $A$,

$$(\text{DS}) \qquad \min_{\gamma \in \mathbb{R}^p} \|\gamma\|_1 \quad \text{subject to} \quad \|A^T(y - A\gamma)\|_\infty \le \eta. \tag{I.5}$$

Candes and Tao [6] showed that the Dantzig selector can be computed by solving the linear program

$$\min \sum_{i=1}^{p} u_i \quad \text{subject to} \quad -u \le \gamma \le u \ \ \text{and} \ \ -\eta \mathbf{1} \le A^T(y - A\gamma) \le \eta \mathbf{1},$$

where the optimization variables are $(\gamma, u) \in \mathbb{R}^p \times \mathbb{R}^p$. Candes and Tao [6] also showed that the Dantzig selector mimics the performance of an oracle procedure up to a logarithmic factor $\log p$.
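For concreteness, the linear program above can be written out explicitly and solved with a generic LP solver. The following sketch is illustrative only (the function name and the choice of the bound eta are assumptions, not material from the paper).

# Sketch of the Dantzig selector LP (I.5): variables (gamma, u), objective sum(u),
# constraints |gamma| <= u and |A^T (y - A gamma)| <= eta componentwise.
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(A, y, eta):
    n, p = A.shape
    G = A.T @ A
    c = np.concatenate([np.zeros(p), np.ones(p)])
    I = np.eye(p)
    A_ub = np.block([
        [ I, -I],                    #  gamma - u <= 0
        [-I, -I],                    # -gamma - u <= 0
        [ G, np.zeros((p, p))],      #  A^T A gamma <= A^T y + eta
        [-G, np.zeros((p, p))],      # -A^T A gamma <= -A^T y + eta
    ])
    b_ub = np.concatenate([np.zeros(2 * p), A.T @ y + eta, -(A.T @ y) + eta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * p + [(0, None)] * p, method="highs")
    return res.x[:p]

For large p one would use a sparse LP formulation rather than the dense matrices built here; the dense version is only meant to make the constraint structure explicit.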


It is clear that some regularity conditions are needed in order for these problems to be well behaved. Over the last few years, many interesting results for recovering sparse signals have been obtained in the framework of the Restricted Isometry Property (RIP). In their seminal work [5], [6], Candes and Tao considered sparse recovery problems in the RIP framework. They provided beautiful solutions to the problem under some conditions on the so-called restricted isometry constant and restricted orthogonality constant (defined in Section II). These conditions essentially require that every set of columns of $A$ with certain cardinality approximately behaves like an orthonormal system. Several different conditions have been imposed in various settings. For example, the condition $\delta_k + \theta_{k,k} + \theta_{k,2k} < 1$ was used in Candes and Tao [5], $\delta_{3k} + 3\delta_{4k} < 2$ in Candes, Romberg, and Tao [4], $\delta_{2k} + \theta_{k,2k} < 1$ in Candes and Tao [6], and $\delta_{2k} < \sqrt{2} - 1$ in Candes [3], where $k$ is the sparsity index. A natural question is: Can these conditions be weakened in a unified way?

Another widely used condition for sparse recovery is the Mutual Incoherence Property (MIP), which requires the pairwise correlations among the column vectors of $A$ to be small. See [10], [11], [13], [14], [18].

In this paper, we consider ℓ1 minimization methods in a single unified framework for sparse recovery in three cases: noiseless, bounded error, and Gaussian noise. Both ℓ1 minimization with an ℓ∞ constraint (DS) and ℓ1 minimization under the ℓ2 constraint (ℓ2-Constraint) are considered. Our results improve on the existing results in [3]–[6] by weakening the conditions and tightening the error bounds. In particular, our results clearly illustrate the relationship between ℓ1 minimization with an ℓ2 constraint and ℓ1 minimization with an ℓ∞ constraint (the Dantzig selector). In addition, we also establish connections between the concepts of RIP and MIP. As an application, we present an improvement to a recent result of Donoho, Elad, and Temlyakov [10]. In all cases, we solve the problems under the weaker condition

$$\delta_{1.5k} + \theta_{k,1.5k} < 1. \tag{I.6}$$

The improvement on the condition shows that, for fixed $n$ and $p$, signals with larger support can be recovered. Although our main interest is in recovering sparse signals, we state the results in the general setting of reconstructing an arbitrary signal.

It is sometimes convenient to impose conditions that involve only the restricted isometry constant $\delta$. Efforts have been made in this direction in the literature. In [7], a recovery result was established under a condition involving only restricted isometry constants; in [3], the weaker condition $\delta_{2k} < \sqrt{2} - 1$ was used. Conditions of this type have also been used in the construction of (random) compressed sensing matrices, for example in [15] and [1]. We remark that our results imply that a still weaker condition involving only $\delta$ suffices for sparse signal reconstruction.

The paper is organized as follows. In Section II, after basic notation and definitions are reviewed, we introduce an elementary inequality which allows us to make a finer analysis of the sparse recovery problem. We begin the analysis of ℓ1 minimization methods for sparse recovery by considering exact recovery in the noiseless case in Section III. Our result improves the main result in Candes and Tao [5] by using weaker conditions and providing tighter error bounds. The analysis of the noiseless case provides insight into the case when the observations are contaminated by noise. We then consider the case of bounded error in Section IV. The connections between the RIP and MIP are also explored. Sparse recovery with Gaussian noise is treated in Section V. Appendices A–D contain the proofs of technical results.

II. PRELIMINARIES

In this section, we first introduce basic notation and definitions, and then present a technical inequality which will be used in proving our main results.

Let $v = (v_i) \in \mathbb{R}^p$ be a vector. The support of $v$ is the subset of $\{1, \ldots, p\}$ defined by

$$\mathrm{supp}(v) = \{\, i : v_i \neq 0 \,\}.$$

For an integer $k \ge 1$, a vector $v$ is said to be $k$-sparse if $|\mathrm{supp}(v)| \le k$. For a given vector $v$ we shall denote by $v_{\max(k)}$ the vector $v$ with all but the $k$ largest entries (in absolute value) set to zero, and define $v_{-\max(k)} = v - v_{\max(k)}$, the vector $v$ with the $k$ largest entries (in absolute value) set to zero. We shall use the standard notation $\|v\|_q$ to denote the $\ell_q$-norm of the vector $v$.

Let the matrix $A \in \mathbb{R}^{n \times p}$ and $1 \le k \le p$. The $k$-restricted isometry constant $\delta_k$ is defined to be the smallest constant such that

$$(1 - \delta_k)\,\|c\|_2^2 \le \|Ac\|_2^2 \le (1 + \delta_k)\,\|c\|_2^2 \tag{II.1}$$

for every $k$-sparse vector $c$. If $k + k' \le p$, we can define another quantity, the $(k,k')$-restricted orthogonality constant $\theta_{k,k'}$, as the smallest number that satisfies

$$|\langle Ac, Ac' \rangle| \le \theta_{k,k'}\,\|c\|_2\,\|c'\|_2 \tag{II.2}$$

for all $c$ and $c'$ such that $c$ and $c'$ are $k$-sparse and $k'$-sparse, respectively, and have disjoint supports. Roughly speaking, the restricted isometry constant $\delta_k$ and the restricted orthogonality constant $\theta_{k,k'}$ measure how close subsets of cardinality $k$ of columns of $A$ are to an orthonormal system. For notational simplicity, we shall often write $\delta$ and $\theta$ for the relevant restricted isometry and restricted orthogonality constants when the indices are clear from the context. It is easy to see that $\delta_k$ and $\theta_{k,k'}$ are monotone. That is,

$$\delta_k \le \delta_{k'} \quad \text{if } k \le k', \tag{II.3}$$

$$\theta_{k_1,k_2} \le \theta_{k_1',k_2'} \quad \text{if } k_1 \le k_1' \text{ and } k_2 \le k_2'. \tag{II.4}$$

Candes and Tao [5] showed that the constants $\delta$ and $\theta$ are related by the following inequalities:

$$\theta_{k,k'} \le \delta_{k+k'} \le \theta_{k,k'} + \max(\delta_k, \delta_{k'}). \tag{II.5}$$
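Computing $\delta_k$ exactly is combinatorial in $k$ and $p$, but for a tiny matrix the definition (II.1) can be checked directly by enumerating all column subsets of size $k$. The sketch below is an editorial illustration (the names and sizes are assumptions); it takes the worst eigenvalue deviation of each $k$-column Gram matrix.

# Brute-force computation of delta_k in (II.1), feasible only for very small p and k.
import itertools
import numpy as np

def delta_k(A, k):
    p = A.shape[1]
    worst = 0.0
    for S in itertools.combinations(range(p), k):
        G = A[:, S].T @ A[:, S]              # Gram matrix of the selected columns
        eig = np.linalg.eigvalsh(G)          # eigenvalues in ascending order
        worst = max(worst, eig[-1] - 1.0, 1.0 - eig[0])
    return worst

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 12)) / np.sqrt(20)   # columns of roughly unit norm
print(delta_k(A, 2))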

As mentioned in the Introduction, different conditions on $\delta$ and $\theta$ have been used in the literature. It is not always immediately transparent which condition is stronger and which is weaker. We shall present another important property on $\delta$ and $\theta$


which can be used to compare the conditions. In addition, it is especially useful in producing simplified recovery conditions.

Proposition 2.1: If $a$ and $b$ are positive integers, then

$$\theta_{ak,\,bk'} \le \sqrt{ab}\;\theta_{k,k'}. \tag{II.6}$$

In particular, $\theta_{k,2k} \le \sqrt{2}\,\theta_{k,k} \le \sqrt{2}\,\delta_{2k}$.

A proof of the proposition is provided in Appendix A.

Remark: Candes and Tao [6] imposes the condition $\delta_{2k} + \theta_{k,2k} < 1$, and in a more recent paper Candes [3] uses $\delta_{2k} < \sqrt{2} - 1$. A direct consequence of Proposition 2.1 is that $\delta_{2k} < \sqrt{2} - 1$ is in fact a strictly stronger condition than $\delta_{2k} + \theta_{k,2k} < 1$, since Proposition 2.1 yields $\theta_{k,2k} \le \sqrt{2}\,\delta_{2k}$, which implies that $\delta_{2k} + \theta_{k,2k} \le (1 + \sqrt{2})\,\delta_{2k} < 1$.

We now introduce a useful elementary inequality. This inequality allows us to perform finer estimation on ℓ1 and ℓ2 norms. It will be used in proving our main results.

Proposition 2.2: Let $m$ be a positive integer. Any descending chain of real numbers satisfies an elementary inequality comparing the ℓ1 norm of one block of its terms with the ℓ2 norm of the following block. The proof of Proposition 2.2 is given in Appendix B.

III. SIGNAL RECOVERY IN THE NOISELESS CASE

As mentioned in the Introduction, we give a unified treatment of ℓ1 minimization with an ℓ2 constraint and ℓ1 minimization with an ℓ∞ constraint for the recovery of sparse signals in three cases: noiseless, bounded error, and Gaussian noise. We begin in this section by considering the simplest setting: exact recovery of sparse signals when no noise is present. This is an interesting problem by itself and has been considered in a number of papers. See, for example, Fuchs [13], Donoho [8], and Candes and Tao [5]. More importantly, the solutions to this "clean" problem shed light on the noisy case. Our result improves the main result given in Candes and Tao [5]. The improvement is obtained by using the technical inequalities developed in the previous section. Although the focus is on recovering sparse signals, our results are stated in the general setting of reconstructing an arbitrary signal.

Let $y = A\beta$ with $A \in \mathbb{R}^{n \times p}$, and suppose we are given $A$ and $y$, where $\beta$ is an unknown sparse vector. The goal is to recover $\beta$ exactly when it is sparse. Candes and Tao [5] showed that a sparse solution can be obtained by ℓ1 minimization, which is then solved via linear programming.

Theorem 3.1 (Candes and Tao [5]): Suppose $A$ satisfies

$$\delta_k + \theta_{k,k} + \theta_{k,2k} < 1. \tag{III.1}$$

Let $\beta$ be a $k$-sparse vector and $y = A\beta$. Then $\beta$ is the unique minimizer to the problem (Exact).

As mentioned in the Introduction, other conditions on $\delta$ and $\theta$ have also been used in the literature. Candes, Romberg, and Tao [4] uses the condition $\delta_{3k} + 3\delta_{4k} < 2$. Candes and Tao [6] considers the Gaussian noise case; a special case of Theorem 1.1 in that paper, with noise level $0$, improves Theorem 3.1 by weakening the condition from $\delta_k + \theta_{k,k} + \theta_{k,2k} < 1$ to $\delta_{2k} + \theta_{k,2k} < 1$. Candes [3] imposes the condition $\delta_{2k} < \sqrt{2} - 1$. We shall show below that these conditions can be uniformly improved by a transparent argument. A direct application of Proposition 2.2 yields the following result, which weakens the above conditions to

$$\delta_{1.5k} + \theta_{k,1.5k} < 1.$$

Note that it follows from (II.3) and (II.4) that $\delta_{1.5k} \le \delta_{2k}$ and $\theta_{k,1.5k} \le \theta_{k,2k}$. So the condition $\delta_{1.5k} + \theta_{k,1.5k} < 1$ is weaker than $\delta_{2k} + \theta_{k,2k} < 1$. It is also easy to see from (II.5) and (II.6) that the condition is weaker than $\delta_k + \theta_{k,k} + \theta_{k,2k} < 1$ and the other conditions mentioned above; a short derivation is given below.
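To make the comparison concrete, the following short chain of inequalities (an editorial sketch based only on (II.3)–(II.5), not a display quoted from the paper) records why the present condition is the weakest of the three:

$$\delta_{1.5k} + \theta_{k,1.5k} \;\le\; \delta_{2k} + \theta_{k,2k} \;\le\; \delta_k + \theta_{k,k} + \theta_{k,2k},$$

where the first inequality uses the monotonicity properties (II.3) and (II.4), and the second uses (II.5) with $k' = k$, which gives $\delta_{2k} \le \theta_{k,k} + \delta_k$. Hence $\delta_k + \theta_{k,k} + \theta_{k,2k} < 1$ implies $\delta_{2k} + \theta_{k,2k} < 1$, which in turn implies $\delta_{1.5k} + \theta_{k,1.5k} < 1$.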

Theorem 3.2: Let $y = A\beta$ and suppose $A$ satisfies $\delta_{1.5k} + \theta_{k,1.5k} < 1$. Then the minimizer $\hat{\beta}$ to the problem (Exact) satisfies

$$\|\hat{\beta} - \beta\|_2 \le C\,k^{-1/2}\,\|\beta_{-\max(k)}\|_1,$$

where the constant $C$ depends only on $\delta_{1.5k}$ and $\theta_{k,1.5k}$. In particular, if $\beta$ is a $k$-sparse vector, then $\hat{\beta} = \beta$, i.e., the ℓ1 minimization recovers $\beta$ exactly.

This theorem improves the results in [5], [6]. The improvement on the condition shows that for fixed $n$ and $p$, signals with larger support can be recovered accurately.

Remark: It is sometimes more convenient to use conditions only involving the restricted isometry constant $\delta$. By Proposition 2.1, $\theta_{k,1.5k}$ is bounded by a multiple of $\delta_{2k}$, so a sufficiently small bound on $\delta_{2k}$ alone, condition (III.2), implies $\delta_{1.5k} + \theta_{k,1.5k} < 1$. Hence, Theorem 3.2 also holds under the condition (III.2); an analogous condition on $\delta_{1.5k}$ can also be used.

Proof of Theorem 3.2: The proof relies on Proposition 2.2 and makes use of ideas from [4]–[6]. In this proof, we shall also identify a vector $v \in \mathbb{R}^p$ with the function $v : \{1, \ldots, p\} \to \mathbb{R}$ defined by assigning $v(i) = v_i$.


Let $\hat{\beta}$ be a solution to the ℓ1 minimization problem (Exact), let $T$ denote the support of $\beta_{\max(k)}$, and write $h = \hat{\beta} - \beta$. For a subset $E \subset \{1, \ldots, p\}$, we use $\mathbf{1}_E$ to denote the characteristic function of $E$, i.e., $\mathbf{1}_E(i) = 1$ if $i \in E$ and $\mathbf{1}_E(i) = 0$ if $i \notin E$.

The coordinates outside $T$ are grouped, in decreasing order of $|h|$, into pairwise disjoint subsets, so that $h\mathbf{1}_{T^c}$ is decomposed into a sum of pieces with disjoint supports whose entries decrease from one piece to the next. We first consider the case where $k$ is divisible by $4$. Each piece is divided into two halves, the first half consisting of its larger entries, and each piece is further treated as a sum of four functions obtained by dividing it into four equal parts. Since $\|\hat{\beta}\|_1 \le \|\beta\|_1$, the ℓ1 norm of $h$ off the set $T$ is controlled by its ℓ1 norm on $T$ together with $\|\beta_{-\max(k)}\|_1$. Proposition 2.2, applied to the decreasing rearrangement within each piece, converts this ℓ1 control into ℓ2 bounds on the successive pieces; this yields (III.3), and the claim (III.4) also follows from Proposition 2.2. Combining these estimates with the restricted isometry and restricted orthogonality conditions, and using the fact that $Ah = 0$, we obtain an identity for $\|Ah\|_2^2$; it then follows from (III.4) that $\|h\|_2$ is bounded as claimed in the theorem.

We now turn to the case where $k$ is not divisible by $4$. In this case the first and last pieces of the decomposition are understood with the appropriate smaller sizes, and the proof for the previous case goes through with the pieces redefined accordingly. Here we need to use an inequality whose proof is essentially the same as that of Proposition 2.2, namely the analogous estimate for a descending chain of real numbers with the adjusted block sizes. This completes the proof.

IV. RECOVERY OF SPARSE SIGNALS IN BOUNDED ERROR

We now turn to the case of bounded error. The results obtained in this setting have direct implications for the case of Gaussian noise, which will be discussed in Section V. Let

$$y = A\beta + z,$$

where the noise $z$ is bounded, i.e., $z \in \mathcal{B}$ for some bounded set $\mathcal{B} \subset \mathbb{R}^n$. In this case the noise $z$ can be either stochastic or deterministic. The ℓ1 minimization approach is to estimate $\beta$ by the minimizer $\hat{\beta}$ of

$$\min_{\gamma} \|\gamma\|_1 \quad \text{subject to} \quad y - A\gamma \in \mathcal{B}.$$

We shall specifically consider two cases: $\mathcal{B} = \{z : \|A^T z\|_\infty \le \eta\}$ and $\mathcal{B} = \{z : \|z\|_2 \le \epsilon\}$. The first case is closely connected to the Dantzig selector in the Gaussian noise setting, which will be discussed in more detail in Section V. Our results improve the results in Candes, Romberg, and Tao [4], Candes and Tao [6], and Donoho, Elad, and Temlyakov [10].

We shall first consider the case $\mathcal{B} = \{z : \|A^T z\|_\infty \le \eta\}$. Let $\hat{\beta}^{DS}$ be the solution to the (DS) problem (I.5) with $y$ given in (I.1). The Dantzig selector $\hat{\beta}^{DS}$ has the following property.

Theorem 4.1: Suppose $\beta$ is a $k$-sparse vector and $y = A\beta + z$ with the noise $z$ satisfying $\|A^T z\|_\infty \le \eta$. If

$$\delta_{1.5k} + \theta_{k,1.5k} < 1, \tag{IV.1}$$

then the solution $\hat{\beta}^{DS}$ to (DS) obeys

$$\|\hat{\beta}^{DS} - \beta\|_2 \le C\,\sqrt{k}\,\eta, \tag{IV.2}$$

with $C$ depending only on $\delta_{1.5k}$ and $\theta_{k,1.5k}$. In particular, if $\eta = 0$ and $\beta$ is a $k$-sparse vector, then $\hat{\beta}^{DS} = \beta$.

Remark: Theorem 4.1 is comparable to Theorem 1.1 of Candes and Tao [6], but the result here is a deterministic one instead of a probabilistic one, as bounded errors are considered. The proof in Candes and Tao [6] can be adapted to yield a


similar result for bounded errors under the stronger condition $\delta_{2k} + \theta_{k,2k} < 1$.

Proof of Theorem 4.1: We shall use the same notation as in the proof of Theorem 3.2. Since $\|A^T z\|_\infty \le \eta$, letting $h = \hat{\beta}^{DS} - \beta$ and following essentially the same steps as in the first part of the proof of Theorem 3.2, we obtain a bound on $\|h\|_2$ in terms of its restriction to the support of $\beta$. If $h$ vanishes on that support the bound forces $h = 0$ and the theorem holds trivially; otherwise, to finish the proof we observe the following. 1) Let $A_T$ be the submatrix obtained by extracting the columns of $A$ according to the indices in $T$, as in [6]; then the restricted isometry property controls the relevant inner products involving $A_T$. 2) The constraint in (DS) together with $\|A^T z\|_\infty \le \eta$ bounds the correlation between the residual and the columns of $A$ by $2\eta$. We get the result by combining 1) and 2). This completes the proof.

We now turn to the second case, where the noise is bounded in ℓ2-norm. Let $y = A\beta + z$, where the noise satisfies $\|z\|_2 \le \epsilon$. The problem is to recover the sparse signal $\beta$ from $y$. Once again, this problem can be solved through ℓ1 constrained minimization,

$$\min_{\gamma} \|\gamma\|_1 \quad \text{subject to} \quad \|y - A\gamma\|_2 \le \epsilon. \tag{IV.3}$$

An alternative to the constrained ℓ1 minimization approach is the so-called Lasso given in (I.4). The Lasso recovers a sparse signal via ℓ1 regularized least squares. It is closely connected to the ℓ2-constrained ℓ1 minimization. The Lasso is a popular method in the statistics literature (Tibshirani [16]). See Tropp [18] for a detailed treatment of the ℓ1 regularized least squares problem. By using a similar argument, we have the following result on the solution of the minimization (IV.3).

Theorem 4.2: Let $y = A\beta + z$ with $\|z\|_2 \le \epsilon$, where $\beta$ is a $k$-sparse vector. If

$$\delta_{1.5k} + \theta_{k,1.5k} < 1, \tag{IV.4}$$

then for any $\epsilon \ge 0$, the minimizer $\hat{\beta}$ to the problem (IV.3) obeys

$$\|\hat{\beta} - \beta\|_2 \le C\,\epsilon, \tag{IV.5}$$

with $C$ depending only on $\delta_{1.5k}$ and $\theta_{k,1.5k}$.

Proof of Theorem 4.2: Notice that the condition $\|z\|_2 \le \epsilon$ implies that $\beta$ is feasible for (IV.3), and hence $\|\hat{\beta}\|_1 \le \|\beta\|_1$, so we can use the first part of the proof of Theorem 3.2. The notation used here is the same as that in the proof of Theorem 3.2. The constraint in (IV.3) together with the bound on $\|z\|_2$ controls the residual: $\|Ah\|_2 \le \|y - A\hat{\beta}\|_2 + \|y - A\beta\|_2 \le 2\epsilon$. Combining this with the estimates from the proof of Theorem 3.2 gives (IV.5).

Remark: Candes, Romberg, and Tao [4] showed that, if $\delta_{3k} + 3\delta_{4k} < 2$, then the minimizer of (IV.3) obeys an error bound of the same form as (IV.5). (The constraint level was set to $\epsilon$ in [4].) Now suppose the condition $\delta_{3k} + 3\delta_{4k} < 2$ holds. A comparison shows that the condition of Theorem 4.2 is then satisfied at the higher sparsity level $1.6k$,


and it then follows from Theorem 4.2 that its conclusion holds for all $k'$-sparse vectors $\beta$ with $k' = 1.6k$. Therefore, Theorem 4.2 improves the above result in Candes, Romberg, and Tao [4] by enlarging the support of $\beta$ by 60%.
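Theorem 4.2 concerns the ℓ2-constrained program (IV.3). The following sketch (editorial, with assumed names; it relies on the third-party cvxpy package and is not code from the paper) shows how (IV.3), and for comparison the Lasso (I.4), can be solved numerically for small problems.

# (IV.3): minimize ||gamma||_1 subject to ||y - A gamma||_2 <= eps,
# and, for comparison, the Lasso form (I.4).
import cvxpy as cp
import numpy as np

def l2_constrained_l1(A, y, eps):
    p = A.shape[1]
    gamma = cp.Variable(p)
    prob = cp.Problem(cp.Minimize(cp.norm(gamma, 1)),
                      [cp.norm(y - A @ gamma, 2) <= eps])
    prob.solve()
    return gamma.value

def lasso(A, y, lam):
    p = A.shape[1]
    gamma = cp.Variable(p)
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - A @ gamma)
                                  + lam * cp.norm(gamma, 1)))
    prob.solve()
    return gamma.value

For a suitable correspondence between eps and lam the two programs have closely related solutions, which is the Lagrangian connection noted in Section I.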

A. Connections Between RIP and MIP

In addition to the restricted isometry property (RIP), another commonly used condition in the sparse recovery literature is the mutual incoherence property (MIP). The mutual incoherence property of $A$ requires that the coherence bound

$$\mu = \max_{i \ne j} |\langle a_i, a_j \rangle| \tag{IV.6}$$

be small, where $a_1, \ldots, a_p$ are the columns of $A$ (the $a_i$'s are also assumed to be of length $1$ in ℓ2-norm). Many interesting results on sparse recovery have been obtained by imposing conditions on the coherence bound $\mu$ and the sparsity $k$; see [10], [11], [13], [14], [18]. For example, a recent paper of Donoho, Elad, and Temlyakov [10] proved that if $\beta$ is a $k$-sparse vector with $k < (1/\mu + 1)/4$, then for any $\epsilon \ge 0$, the minimizer $\hat{\beta}$ to the problem (ℓ2-Constraint) satisfies an error bound proportional to $\epsilon$, with a constant depending on $\mu$ and $k$.

We shall now establish some connections between the RIP and the MIP and show that the result of Donoho, Elad, and Temlyakov [10] can be improved under the RIP framework, by using Theorem 4.2. The following is a simple result that gives RIP constants from the MIP. The proof can be found in Appendix C. It is remarked that the first inequality in the next proposition can be found in [17].

Proposition 4.1: Let $\mu$ be the coherence bound for $A$. Then

$$\delta_k \le (k-1)\,\mu \qquad \text{and} \qquad \theta_{k,k'} \le \sqrt{k k'}\,\mu. \tag{IV.7}$$

Now we are able to show the following result.

Theorem 4.3: Suppose $\beta$ is a $k$-sparse vector and $y = A\beta + z$ with $\|z\|_2 \le \epsilon$. Let $\mu$ be the coherence bound of $A$. If $(1.5 + \sqrt{1.5})\,k\mu < 1 + \mu$ (or, equivalently, $k < (1/\mu + 1)/(1.5 + \sqrt{1.5})$), then for any $\epsilon \ge 0$, the minimizer $\hat{\beta}$ to the problem (ℓ2-Constraint) obeys

$$\|\hat{\beta} - \beta\|_2 \le C\,\epsilon, \tag{IV.8}$$

with $C$ depending only on $k\mu$.

Remarks: In this theorem, the result of Donoho, Elad, and Temlyakov [10] is improved in the following ways.
1) The sparsity bound is relaxed from $k < (1/\mu + 1)/4$ to $k < (1/\mu + 1)/(1.5 + \sqrt{1.5})$. So, roughly speaking, Theorem 4.3 improves the result in Donoho, Elad, and Temlyakov [10] by enlarging the support of $\beta$ by 47%.
2) The error bound is also tightened. It is clear that a larger permitted sparsity is preferred; since $\mu$ is usually very small, the tightening of the constant is most significant when $k\mu$ is close to its upper limit.

Remark: Similar to Theorems 3.2 and 4.1, we can have the estimation without assuming that $\beta$ is $k$-sparse. In the general case, we have a corresponding bound with an additional term involving $\|\beta_{-\max(k)}\|_1$.

Proof of Theorem 4.3: It follows from Proposition 4.1 that $\delta_{1.5k} + \theta_{k,1.5k} \le (1.5k - 1)\mu + \sqrt{1.5}\,k\mu$. Since $(1.5 + \sqrt{1.5})\,k\mu < 1 + \mu$, the condition $\delta_{1.5k} + \theta_{k,1.5k} < 1$ of Theorem 4.2 holds. By Theorem 4.2, the bound (IV.8) follows.
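The coherence bound $\mu$ in (IV.6) is trivial to compute, which makes MIP-based conditions easy to check in practice. A small sketch (illustrative names only, not tied to the paper):

# Mutual coherence (IV.6): largest absolute inner product between distinct unit-norm columns.
import numpy as np

def coherence(A):
    A = A / np.linalg.norm(A, axis=0)   # normalize columns to unit l2 norm
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(3)
A = rng.standard_normal((64, 128))
mu = coherence(A)
print(mu, (1 / mu + 1) / 4)             # coherence and the sparsity bound used in [10]

Comparing $(1/\mu + 1)/4$ with the larger bound in Theorem 4.3 quantifies, for a given matrix, the 47% enlargement of the admissible support.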

V. RECOVERY OF SPARSE SIGNALS IN GAUSSIAN NOISE

We now turn to the case where the noise is Gaussian. Suppose we observe

$$y = A\beta + z, \qquad z \sim N(0, \sigma^2 I_n), \tag{V.1}$$

and wish to recover $\beta$ from $A$ and $y$. We assume that $\sigma$ is known and that the columns of $A$ are standardized to have unit ℓ2 norm. This is a case of significant interest, in particular in statistics. Many methods, including the Lasso (Tibshirani [16]), LARS (Efron, Hastie, Johnstone, and Tibshirani [12]), and the Dantzig selector (Candes and Tao [6]), have been introduced and studied. The following results show that, with large probability, the Gaussian noise belongs to bounded sets.

Lemma 5.1: With high probability, the Gaussian error $z \sim N(0, \sigma^2 I_n)$ satisfies

$$\|A^T z\|_\infty \le \sigma\sqrt{2\log p} \tag{V.2}$$

and

$$\|z\|_2 \le \sigma\bigl(n + 2\sqrt{n\log n}\bigr)^{1/2}. \tag{V.3}$$

Inequality (V.2) follows from standard probability calculations and inequality (V.3) is proved in Appendix D. Lemma 5.1 suggests that one can apply the results obtained in the previous section for the bounded error case to solve the Gaussian noise problem.

Candes and Tao [6] introduced the Dantzig selector for sparse recovery in the Gaussian noise setting. Given the observations in (V.1), the Dantzig selector $\hat{\beta}^{DS}$ is the minimizer of

$$\min_{\gamma} \|\gamma\|_1 \quad \text{subject to} \quad \|A^T(y - A\gamma)\|_\infty \le \lambda_p\,\sigma, \tag{V.4}$$

where $\lambda_p = \sqrt{2\log p}$.
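A quick Monte Carlo sanity check of the threshold used in (V.2) and (V.4) is easy to run; the sketch below (editorial, with assumed sizes and names, not from the paper) estimates how often the event $\|A^T z\|_\infty \le \sigma\sqrt{2\log p}$ holds for unit-norm columns.

import numpy as np

rng = np.random.default_rng(2)
n, p, sigma, trials = 100, 500, 1.0, 2000
A = rng.standard_normal((n, p))
A /= np.linalg.norm(A, axis=0)            # standardize columns to unit norm
thresh = sigma * np.sqrt(2 * np.log(p))

hits = 0
for _ in range(trials):
    z = sigma * rng.standard_normal(n)
    if np.max(np.abs(A.T @ z)) <= thresh:
        hits += 1
print(hits / trials)                       # empirically close to 1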


In the classical linear regression problem, when $p \le n$, the least squares estimator is the solution to the normal equation

$$A^T A \gamma = A^T y. \tag{V.5}$$

The constraint $\|A^T(y - A\gamma)\|_\infty \le \lambda_p\sigma$ in the convex program (DS) can thus be viewed as a relaxation of the normal equation (V.5). And, similar to the noiseless case, ℓ1 minimization leads to the "sparsest" solution over the space of all feasible solutions. Candes and Tao [6] showed the following result.

Theorem 5.1 (Candes and Tao [6]): Suppose $\beta$ is a $k$-sparse vector. Let $A$ be such that $\delta_{2k} + \theta_{k,2k} < 1$. Choose $\lambda_p = \sqrt{2\log p}$ in (V.4). Then with large probability, the Dantzig selector $\hat{\beta}^{DS}$ obeys

$$\|\hat{\beta}^{DS} - \beta\|_2^2 \le C^2 \cdot 2\log p \cdot k\,\sigma^2, \tag{V.6}$$

with a constant $C$ depending only on $\delta_{2k}$ and $\theta_{k,2k}$. (In Candes and Tao [6], the constant was stated as $C = 4/(1 - \delta_{2k} - \theta_{k,2k})$; it appears that there was a typo there and the constant should be adjusted accordingly.)

As mentioned earlier, the Lasso is another commonly used method in statistics. The Lasso solves the ℓ1 regularized least squares problem (I.4) and is closely related to the ℓ2-constrained ℓ1 minimization problem (ℓ2-Constraint). In the Gaussian error case, we shall consider a particular setting. Let $\hat{\beta}^{\ell_2}$ be the minimizer of

$$\min_\gamma \|\gamma\|_1 \quad \text{subject to} \quad \|y - A\gamma\|_2 \le \sigma\bigl(n + 2\sqrt{n\log n}\bigr)^{1/2}, \tag{V.7}$$

where the constraint level matches the bound in (V.3). Combining our results from the last section together with Lemma 5.1, we have the following results on the Dantzig selector $\hat{\beta}^{DS}$ and the estimator $\hat{\beta}^{\ell_2}$ obtained from ℓ1 minimization under the ℓ2 constraint. Again, these results improve the previous results in the literature by weakening the conditions and providing more precise bounds.

Theorem 5.2: Suppose $\beta$ is a $k$-sparse vector and the matrix $A$ satisfies $\delta_{1.5k} + \theta_{k,1.5k} < 1$. Then with large probability, the Dantzig selector $\hat{\beta}^{DS}$ obeys

$$\|\hat{\beta}^{DS} - \beta\|_2 \le C\,\sigma\sqrt{2k\log p}, \tag{V.8}$$

and, with large probability, the estimator $\hat{\beta}^{\ell_2}$ obeys

$$\|\hat{\beta}^{\ell_2} - \beta\|_2 \le C'\,\sigma\bigl(n + 2\sqrt{n\log n}\bigr)^{1/2}, \tag{V.9}$$

with $C$ and $C'$ depending only on $\delta_{1.5k}$ and $\theta_{k,1.5k}$.

Remark: In comparison to Theorem 5.1, our result in Theorem 5.2 weakens the condition from $\delta_{2k} + \theta_{k,2k} < 1$ to $\delta_{1.5k} + \theta_{k,1.5k} < 1$ and improves the constant in the bound, since $\delta_{1.5k} + \theta_{k,1.5k} \le \delta_{2k} + \theta_{k,2k}$. The improvement on the error bound is minor. The improvement on the condition is more significant, as it shows that signals with larger support can be recovered accurately for fixed $n$ and $p$.

Remark: Similar to the results obtained in the previous sections, if $\beta$ is not necessarily $k$-sparse, then in general we have, with large probability, corresponding bounds with an additional term involving $\|\beta_{-\max(k)}\|_1$.

Remark: Candes and Tao [6] also proved an Oracle Inequality for the Dantzig selector $\hat{\beta}^{DS}$ in the Gaussian noise setting under the condition $\delta_{2k} + \theta_{k,2k} < 1$. With some additional work, our approach can be used to improve [6, Theorems 1.2 and 1.3] by weakening the condition to $\delta_{1.5k} + \theta_{k,1.5k} < 1$.

APPENDIX A
PROOF OF PROPOSITION 2.1

Let $c$ be $ak$-sparse and $c'$ be $bk'$-sparse, and suppose their supports are disjoint. Decompose $c$ as a sum of $a$ vectors with disjoint supports, each $k$-sparse, and $c'$ as a sum of $b$ vectors with disjoint supports, each $k'$-sparse. Using the Cauchy–Schwarz inequality, we have

$$|\langle Ac, Ac' \rangle| \le \sum_{i,j} |\langle Ac_i, Ac'_j \rangle| \le \theta_{k,k'} \Bigl(\sum_i \|c_i\|_2\Bigr)\Bigl(\sum_j \|c'_j\|_2\Bigr) \le \theta_{k,k'}\,\sqrt{ab}\,\|c\|_2\,\|c'\|_2.$$

This yields $\theta_{ak,bk'} \le \sqrt{ab}\,\theta_{k,k'}$. Since $\theta_{k,k} \le \delta_{2k}$ by (II.5), we also have $\theta_{k,2k} \le \sqrt{2}\,\theta_{k,k} \le \sqrt{2}\,\delta_{2k}$.

APPENDIX B
PROOF OF PROPOSITION 2.2


Without loss of generality, we assume that $m$ is even. Write the quantity to be bounded as a sum in which each term is given (and bounded) by an expression involving consecutive elements of the descending chain. Summing these bounds term by term, the inequality is proved.

APPENDIX C
PROOF OF PROPOSITION 4.1

Let $c$ be a $k$-sparse vector. Without loss of generality, we assume that $\mathrm{supp}(c) = \{1, \ldots, k\}$. A direct calculation shows that

$$\|Ac\|_2^2 = \sum_i c_i^2\,\|a_i\|_2^2 + \sum_{i \ne j} c_i c_j \langle a_i, a_j \rangle = \|c\|_2^2 + \sum_{i \ne j} c_i c_j \langle a_i, a_j \rangle.$$

Now let us bound the second term. Note that

$$\Bigl|\sum_{i \ne j} c_i c_j \langle a_i, a_j \rangle\Bigr| \le \mu \sum_{i \ne j} |c_i|\,|c_j| = \mu\bigl(\|c\|_1^2 - \|c\|_2^2\bigr) \le (k-1)\,\mu\,\|c\|_2^2.$$

These give us

$$\bigl(1 - (k-1)\mu\bigr)\|c\|_2^2 \le \|Ac\|_2^2 \le \bigl(1 + (k-1)\mu\bigr)\|c\|_2^2,$$

and hence $\delta_k \le (k-1)\mu$. For the second inequality, we notice that $\theta_{1,1} \le \mu$. It then follows from Proposition 2.1 that

$$\theta_{k,k'} \le \sqrt{k k'}\,\theta_{1,1} \le \sqrt{k k'}\,\mu.$$

APPENDIX D
PROOF OF LEMMA 5.1

The first inequality is standard. For completeness, we give a short proof here. Let $w = A^T z$. Then each $w_i = \langle a_i, z \rangle$ marginally has the Gaussian distribution $N(0, \sigma^2)$, since the columns of $A$ have unit ℓ2 norm. Hence, by a union bound over the $p$ coordinates,


the probability that $\|A^T z\|_\infty$ exceeds $\sigma\sqrt{2\log p}$ is small, where the last step follows from the Gaussian tail probability bound that, for a standard Gaussian variable $Z$ and any constant $t > 0$,

$$P(Z > t) \le \frac{1}{t\sqrt{2\pi}}\,e^{-t^2/2}.$$

We now prove inequality (V.3). Note that $\|z\|_2^2/\sigma^2$ is a $\chi^2_n$ random variable. It follows from Lemma 4 in Cai [2] that the upper tail of a $\chi^2_n$ variable is exponentially small beyond thresholds of the form $n + 2\sqrt{n\lambda} + 2\lambda$ for $\lambda > 0$. Hence, taking $\lambda$ proportional to $\log n$, the complement of the event in (V.3) has small probability, and inequality (V.3) now follows by verifying directly that the required domination holds for all $n$.

ACKNOWLEDGMENT

The authors wish to thank the referees for thorough and useful comments which have helped to improve the presentation of the paper.

REFERENCES

[1] W. Bajwa, J. Haupt, J. Raz, S. Wright, and R. Nowak, "Toeplitz-structured compressed sensing matrices," in Proc. IEEE SSP Workshop, Madison, WI, Aug. 2007, pp. 294–298.
[2] T. Cai, "On block thresholding in wavelet regression: Adaptivity, block size and threshold level," Statist. Sinica, vol. 12, pp. 1241–1273, 2002.
[3] E. J. Candes, "The restricted isometry property and its implications for compressed sensing," Comptes Rendus de l'Academie des Sciences, Paris, Ser. I, vol. 346, pp. 589–592, 2008.
[4] E. J. Candes, J. Romberg, and T. Tao, "Stable signal recovery from incomplete and inaccurate measurements," Commun. Pure Appl. Math., vol. 59, pp. 1207–1223, 2006.
[5] E. J. Candes and T. Tao, "Decoding by linear programming," IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.


[6] E. J. Candes and T. Tao, "The Dantzig selector: Statistical estimation when p is much larger than n (with discussion)," Ann. Statist., vol. 35, pp. 2313–2351, 2007.
[7] A. Cohen, W. Dahmen, and R. DeVore, "Compressed sensing and best k-term approximation," 2006, preprint.
[8] D. L. Donoho, "For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution," Commun. Pure Appl. Math., vol. 59, pp. 797–829, 2006.
[9] D. L. Donoho, "For most large underdetermined systems of equations, the minimal ℓ1-norm near-solution approximates the sparsest near-solution," Commun. Pure Appl. Math., vol. 59, pp. 907–934, 2006.
[10] D. L. Donoho, M. Elad, and V. N. Temlyakov, "Stable recovery of sparse overcomplete representations in the presence of noise," IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 6–18, Jan. 2006.
[11] D. L. Donoho and X. Huo, "Uncertainty principles and ideal atomic decomposition," IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2845–2862, Nov. 2001.
[12] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression (with discussion)," Ann. Statist., vol. 32, pp. 407–451, 2004.
[13] J.-J. Fuchs, "On sparse representations in arbitrary redundant bases," IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1341–1344, Jun. 2004.
[14] J.-J. Fuchs, "Recovery of exact sparse representations in the presence of bounded noise," IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3601–3608, Oct. 2005.
[15] M. Rudelson and R. Vershynin, "Sparse reconstruction by convex relaxation: Fourier and Gaussian measurements," in Proc. 40th Annu. Conf. Information Sciences and Systems, Princeton, NJ, Mar. 2006, pp. 207–212.
[16] R. Tibshirani, "Regression shrinkage and selection via the lasso," J. Roy. Statist. Soc. Ser. B, vol. 58, pp. 267–288, 1996.
[17] J. Tropp, "Greed is good: Algorithmic results for sparse approximation," IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, Oct. 2004.
[18] J. Tropp, "Just relax: Convex programming methods for identifying sparse signals in noise," IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1030–1051, Mar. 2006.

T. Tony Cai received the Ph.D. degree from Cornell University, Ithaca, NY, in 1996. He is currently the Dorothy Silberberg Professor of Statistics at the Wharton School of the University of Pennsylvania, Philadelphia. His research interests include high-dimensional inference, large-scale multiple testing, nonparametric function estimation, functional data analysis, and statistical decision theory. Prof. Cai is the recipient of the 2008 COPSS Presidents' Award and a fellow of the Institute of Mathematical Statistics.

Guangwu Xu received the Ph.D. degree in mathematics from the State University of New York (SUNY), Buffalo. He is now with the Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee. His research interests include cryptography and information security, computational number theory, algorithms, and functional analysis.

Jun Zhang (S'85–M'88–SM'01) received the B.S. degree in electrical and computer engineering from Harbin Shipbuilding Engineering Institute, Harbin, China, in 1982 and was admitted to the graduate program of the Radio Electronic Department of Tsinghua University. After a brief stay at Tsinghua, he came to the U.S. for graduate study on a scholarship from the Li Foundation, Glen Cove, NY. He received the M.S. and Ph.D. degrees, both in electrical engineering, from Rensselaer Polytechnic Institute, Troy, NY, in 1985 and 1988, respectively. He joined the faculty of the Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, and currently is a Professor. His research interests include image processing and computer vision, signal processing, and digital communications. Prof. Zhang has been an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING.
