Shearlet-Based Deconvolution

Report 2 Downloads 18 Views
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

2673

Shearlet-Based Deconvolution Vishal M. Patel, Member, IEEE, Glenn R. Easley, Member, IEEE, and Dennis M. Healy, Jr.

Abstract—In this paper, a new type of deconvolution algorithm is proposed that is based on estimating the image from a shearlet decomposition. Shearlets provide a multidirectional and multiscale decomposition that has been mathematically shown to represent distributed discontinuities such as edges better than traditional wavelets. Constructions such as curvelets and contourlets share similar properties, yet their implementations are significantly different from that of shearlets. Taking advantage of unique properties of a new M-channel implementation of the shearlet transform, we develop an algorithm that allows for the approximation inversion operator to be controlled on a multiscale and multidirectional basis. A key improvement over closely related approaches such as ForWaRD is the automatic determination of the threshold values for the noise shrinkage for each scale and direction without explicit knowledge of the noise variance using a generalized cross validation (GCV). Various tests show that this method can perform significantly better than many competitive deconvolution algorithms. Index Terms—Deconvolution, generalized cross validation, shearlets, wavelets.

I. INTRODUCTION N image restoration, the goal is to best estimate an image that has been degraded. Examples of image degradation include the blurring introduced by camera motion as well as the noise introduced from the electronics of the system. In the case when the degradations can be modelled as a convolution operation, the process of recovering the original image from the degraded blurred image is commonly called deconvolution. The process of deconvolution is known to be an ill-posed problem. Thus, to get a reasonable image estimate, a method of reducing/ controlling noise needs to be utilized. Wavelets are popular for image representation and are used in a wide variety of image processing applications such as compression, and image restoration [1], [47]. The main reason for wavelets’ success can be explained by their ability to sparsely represent 1-D signals which are smooth away from point discontinuities. By sparse representation, we mean that most of the signal’s energy can be captured by a few of the transform coefficients. This is quantified by the decay rate of the nonlinear

I

Manuscript received January 04, 2009; revised July 12, 2009. First published August 07, 2009; current version published November 13, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Luminita Vese. V. M. Patel is with the Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742 USA (e-mail: [email protected]). G. R. Easley and D. M. Healy, Jr. are with the Department of Mathematics, University of Maryland, College Park, MD 20742 USA (e-mail: geasley@math. umd.edu; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2009.2029594

approximation error. Indeed, it can be shown that the best nonlinear -term wavelet expansion (reconstruction from the largest coefficients) for this type of signal has the rate of decay that is the best achievable [3], [4]. It is understood that, the better the decay rate, the better the signal estimate from the noisy data will be from that representation. It is because of this optimality property of wavelet representations that wavelet-based deconvolution routines have been proposed. However, wavelet representations are actually not optimal for all types of images. Specifically, in dimension two, if we model images as piecewise smooth functions that are smooth edge,1 the standard 2-D wavelets do not reach away from a the best possible rate. In particular, the approximation error for as increases [4]. a wavelet representation decays as As a result, denoising estimates based on 2-D wavelets tend to have small unwanted artifacts and complex decision metrics or schemes need to be utilized to try to improve the quality of the estimate. Multidirectional representations such as shearlets [5], [6] provide nearly the optimal approximation rate for these types as increases [7]) of images (the optimal rate being and the corresponding denoising estimates do not suffer from the same types of artifacts [8]. Although related transforms such as contourlets [9], [10], and curvelets [11]–[13] share similar properties, in this work we utilize properties unique to an implementation of the shearlet transform that offer advantages for the purpose of deconvolution. The concept of using a sparse representation to achieve good estimates for deconvolution has been suggested before (see for example [14] and [15]). However, particular features concerning implementations of such representations that contribute to performance presented here have not been previously considered. Our shearlet-based deconvolution has the unique ability for a multiscale and anisotropic regularization inversion to be done before noise suppression. Furthermore, for a given regularization parameter its adaptive noise suppression surpasses similar schemes. This is an important consideration since in some case it may not be possible to find the optimal regularization parameter. In the implementation stage, to deal with boundary effects, some concepts in the literature have centered around the idea of noise shrinkage either before or after the application of the deconvolution procedure (see [14] and [16]). However, to carry out such schemes effectively, one needs a transform that can be implemented in a nonrecursive formulation as is done in this work with the shearlet transform. Otherwise, error estimates made by one set of coefficients will highly influence estimates made on a different but dependent set of coefficients. In addition, to be effective in regularizing the approximate deconvolution process, a nonsubsampled (redundant) transform should be utilized. This redundancy not only provides for more effective measurements 1C is the space of functions that are bounded and 2-times continuously differentiable.

1057-7149/$26.00 © 2009 IEEE

2674

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

based on the use of auxiliary functions such the GCV function, but it also greatly aids in estimation [10], [17]–[19]. As will become apparent, these desired features are obtained by using an M-channel shearlet transform implementation. A. Image Deconvolution Problem Since a digitally recorded image is a finite discrete data set, an image deconvolution problem is formulated as a matrix inversion problem. Without loss of generality, assume the recorded . Let denote an array of samarrays are of size ples from a zero mean additive white Gaussian noise (AWGN) arrays and , representing with variance . Given the the observed image and the image to be estimated, respectively, the matrix deconvolution problem can be described as (1) where , , and are column vectors representing the arrays , , and lexicographically ordered, and is the matrix that models the blur operator. In the case when is a block-circulant-circulant-block matrix [20], the problem can be described as (2) where , denotes circular convolution, and denotes the point spread function (PSF) of a linear spaceinvariant system. Equation (2) in the discrete Fourier transform (DFT) domain can be written as (3) where , , and are the 2-D DFTs of , , , and , respectively, for . The conditioning of this system is determined by the ratio of the largest to smallest magnitudes of the values. Typically, contains values at or near zero which makes the system ill-conditioned. In general, to regularize the inversion of the convolution operator, a representation that diagonalizes the convolution operator (matrix) is needed in order to appropriately control the is a block-circulant-circuapproximation. In particular, if lant block matrix, is diagonalizable by a Fourier basis. This means that an estimate of the image can be found by filtering the diagonal components of the Fourier diagonalization of in order to approximately invert . For instance, let be is large, is small when a filter that is nearly one when is nearly zero, and such that is defined everywhere. Then, a Fourier-based estimate can be . This in turn means given as that the image is estimated from a Fourier representation. However, if our image is considered as a piecewise smooth edge, then the decay rate function that is smooth away from a of the nonlinear approximation from a Fourier representation is as increases. Yet, for this type of image the decay rate of the nonlinear approximation from a wavelet representaas increases. This means that estimating the tion is image from the perspective of removing noise, a wavelet based estimate would perform better than the one from a Fourier basis.

In short, the ability to get a good estimate depends on balancing a representation that is effective for regularizing the inversion of the convolution operator and a representation that is effective for estimating the image from colored noise by means of the approximate inversion of the operator. (See [15] for further discussion). B. Historical Perspective Deconvolution methods can be separated into two major categories: direct and iterative. Direct Methods: Some of these methods are based on filtering the singular value decomposition (SVD) such as Tikhonov, truncated SVD (TSVD), and Wiener filtering [21]. Increased performance of such direct methods can be attributed to the inclusion of the wavelet-based estimators. One such technique called the Wavelet-Vaguelette deconvolution (WVD) was proposed in [14]. In this work, functionals called vaguelettes are used to simultineously deconvolve and compute the wavelet coefficients. However, the scheme does not provide good estimates for all convolution operators. To overcome this limitation, Kalifa et al. proposed a wavelet packet based method that matches the frequency behavior of certain convolution operators [22]. Additional wavelet-based techniques have been proposed in [16] and [23]–[25]. An improved hybrid wavelet based regularized deconvolution algorithm that works with any ill-conditioned convolution system was developed in [17]. This Fourier-Wavelet Regularized Deconvolution (ForWaRD) method employs Fourier-domain regularized inversion followed by wavelet-domain noise shrinkage to minimize the distortion of spatially localized features in the image. An extension in terms of curvelets, known as ForCuRD, was proposed in [26]. Iterative Methods: Some of the better-known basic iterative methods are the Conjugate Gradient algorithm [21], Richardson-Lucy [27], [28], and Landweber [29]. Many extensions and improvements over these methods have been made that include the use of wavelets, or other sparse representations such as curvelets. Some of these are [18], [30]–[40]. Additional techniques may be found within these references. Note some of the these promising techniques make use of one-norm transform-domain sparsity promotion. Such methods seem to retain edge information well. Among the direct methods, the local polynomial approximation (LPA) algorithm [41] outperformed some of the best existing deconvolution methods such as [17] and [42] in terms of improvement in signal-to-noise-ratio and was established to be state-of-the-art. In this work, we will only focus on direct methods for comparison because some applications may desire direct methods, and also since some of these iterative methods can use the estimates provided from such techniques as the initial starting point. C. Shearlet-Based Deconvolution In this paper, we propose a new approach to image deconvolution that balances a Fourier domain regularized inversion and a shearlet-domain-based noise reduction. An undecimated shearlet transform decomposes an image into various frequency

PATEL et al.: SHEARLET-BASED DECONVOLUTION

2675

regions whose supports are contained in a pair of trapezoidal regions symmetric with respect to the origin and directionally oriented. This aspect provides the unique ability to adaptively regularize the Fourier inversion in a directionally oriented fashion when the blurred image is first projected onto a shearlet domain. Following the regularized inversion, the colored noise is then suppressed using a shearlet domain based thresholding. To improve the estimating capability, we derive a generalized cross validation function to find the optimal shearlet domain shrinkage without explicitly calculating the noise variance. D. Paper Organization In Section II, we give a brief introduction to the shearlet transform. In Section III, we describe the use of generalized cross validation for shearlet thresholding for colored noise. In Section IV, we discuss details about the proposed deconvolution algorithm. In Section V, we show some of the simulation results, and present the concluding remarks in Section VI. II. SHEARLET TRANSFORM In this section, we briefly describe a recently developed multiscale and multidirectional representation called the shearlet transform [6]. Consider the 2-D affine system

Hence, each element has support on a pair of trapezoids, at various scales, symmetric with respect to the origin and oriented along a line of slope . The collection of discrete shearlets is described by

where

For the appropriate choices of , the discrete shearlets form a Parseval frame (tight frame with bounds equal to 1) for [43], i.e., they satisfy the property

The discrete shearlets described above provide a nonuniform angular covering of the frequency plane when restricted to the finite discrete setting for implementation. Thus, it is preferred to reformulate the shearlet transform with restrictions supported in the regions given by and . Specifically, define

where

, and

where

. In addition, we assume that for

is a product of a shearing and an anisotropic dilation matrix for . The generating functions are such that, for

and, for each for

(4) Let where

is a continuous wavelet for which with , and is chosen so that , , , with on ( 1, 1). Then any admits the representation

for

,

, and

. The operator

defined by

is known as the continuous shearlet transform of . The shearlet transform is a function of three variables: the scale , the shear and the translation . In the frequency domain, has support in the set

and choose

to satisfy

for , where denotes the indicator function of the set . With the functions and as above, we deduce the following result. Theorem 1 ([6]): Let and . Then the collection of shearlets together with

is a Parseval frame for

, where

.

2676

Based on and and

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

, filters and can be found so that can be computed as

done before and cannot be done by the current curvelet or contourlet transform implementations. III. GENERALIZED CROSS VALIDATION (GCV) FOR SHEARLET THRESHOLDING

where are the directionally-oriented filters. To simplify the notation, we suppress the superscript and and 1 by re-indexing the absorb the distinction between parameter so that it has double the cardinality. An M-channel can filterbank implementation whose filters correspond to be done by using the techniques given in [44]. As a consequence, for an its implementation has a complexity of image. Notice that, just as in the continuous version, each element is supported on a pair of trapezoids, and each trapezoid is satisfying a contained in a box of size approximately parabolic scaling property. Their supports become increasingly and the elements exhibit highly directional thin as sensitivity since they are oriented along lines with slope given . These properties contribute to being able to establish by the following theorem. away from piecewise Theorem 2 ([45]): Let be curves, and let be the approximate reconstruction of using largest coefficients in the shearlet expansion. Then the

The significance of this result is that a shearlet-based estimate as , where yields a MSE approximation rate of is the noise level of the noisy image [12]. (This is achieved by choosing a threshold so that one reconstructs from the largest noisy shearlet coefficients.) Similarly, one obtains for the MSE approximation rate of wavelet thresholding as . The shearlet transform has similarities to the curvelet transform and the contourlet transform. Shearlets and curvelets, in fact, are the only two systems which are known mathematically using the largest to provide the rate of away from piecewise coefficients for images described as curves. The spatial-frequency tilings of curvelet and shearlet representations are completely different theoretically, yet the implementations of the curvelet transform corresponds to essentially the same tiling as that of the shearlet or contourlet transform. Alternative discrete shearlet decompositions can be created by varying the support of the mother wavelet (which amounts and . to changes in ) and changing the dilation matrices Such changes produce different spatial-frequency tilings composed of regions of support that are restricted to pairs of various trapezoidal regions. When implemented in an undecimated form, the shearlet transform will produce a highly redundant decomposition consisting of the total number of paired trapezoidal regions considered. An important advantage in the use of this redundant shearlet transform implementation for deconvolution is that it allows one to independently estimate each directionally-oriented frequency band with different amounts of regularization. This has not been

In this section, we describe a shearlet-thresholding scheme based on a GCV function for the purpose of noise reduction [46]. One of the major advantages of this GCV method is that it obtains nearly the optimal thresholding without knowing the noise variance. It depends only on the data and automatically adjusts the shrinkage parameter according to the data. A similar GCV method for wavelet thresholding has been proposed in [48]–[50]. Note that, although we are suggesting the use of a GCV function, is also it possible to adapt the new SURE approach [51] for this task. Suppose (5) where the vectors , and represent respectively the observation, the original image and the colored noise that is assumed to be second order stationary (i.e., the mean is constant and the correlation between two points depends only on the distance between them). Corresponding to a threshold , define the to be equal to if soft-threshold function and zero otherwise. We will show that nearly optimal can be obtained by finding the values minthreshold values imizing a GCV function which is dependent on each scale and direction . Just as in the case of wavelets, to obtain results similar to those in [50], it is not necessary for the shearlet coefficients to be uncorrelated at any moment; however, it is necessary that the noise be second order stationary [49]. If the noise process is stationary, then using the multiscale and multidirectional structure of shearlets, we obtain the following lemma. represents a shearlet coefficient of a Lemma 1: If random vector at scale , direction , and location , then the , depends only on variance of this coefficient, the scale and direction . Proof: It follows from the fact that we are using a filter bank with appropriate directional filters and if is a discrete stationary random process which is an input to a shift-invariant , corresponding to scale and direction then the filter, output is a convolution of with which is also stationary [52]. By Lemma 1, the shearlet transform of stationary correlated noise is stationary within each scale and directional component. We can use this property to choose a different threshold for represents the each resolution and directional component. If vector of shearlet coefficients of at scale and direction , then we can write (6) is the number of shearlet coefficients on scale and where direction , is the total number of shearlet coefficients, and (7)

PATEL et al.: SHEARLET-BASED DECONVOLUTION

2677

Since all the components in (6) are positive, minimizing the is equivalent to minmean squared error or risk function for all and . An argument similar to that imizing used in [50] leads to the following GCV functions: (8)

is the total number of shearlet coefficients that were where replaced by zero. We now have the following result. Theorem 3: The minimizer of is asymptotically for scale and optimal for the minimum risk threshold directional component . that minimize for each Thus, by using the values and , a shearlet-based denoised estimate will likely be close to the ideal non-noise corrupted image. An important feature about the use of our nonsubsampled shearlet transform implementation is that it facilitates the use of asymptotic methods such as those based on GCV functions. A subsampled transform implementation would cause the number of coefficients to decrease as the levels progress, so that thresholds found by minimizing the individual GCV functions will be less likely to correspond to the actual threshold values that minimize the risk functions for each frequency band. IV. SHEARLET-BASED DECONVOLUTION Having established a method for obtaining a good image estimate when the image is corrupted by colored noise, let us now focus on how we are to use this method as part of a deconvolution routine. Since our blurring model is described by (2), a suitable pseudo-inverse estimate can be found by regularizing the convolution operator from a discrete Fourier basis. Using the regularized inverse operator (9) for some regularizing parameter the Fourier domain is given by

Fig. 1. Frequency support of the shearlets for different values of a and s.

the DFT of the shearlet filter for a given scale and direction . The shearlet coefficients of an estimate of the image for can be computed in the a given regularization parameter Fourier domain as

for . The remaining aspect of the deconvolution problem is transformed into a denoising problem in the presence of colored noise. This can be dealt with by thresholding the estimated shearlet coefficients using the GCV determined previously without having to know the noise variance explicitly. An important advantage in using the GCV is that after the Fourier regularized inversion (FRI), the method automatically adjusts the shrinkage parameter according to the data. We summarize the shearlet based deconvolution method in Fig. 2. denote the result of the inversion of the shearlet transLet have been threshform after the shearlet coefficients of olded. We want to choose an that minimizes the shearlet-based . However, since is unmean-squared-error (MSE) known, is chosen to be a minimizer of the cost function

, an image estimate in

(10) . This type of regularization apfor plied is often referred to as Tikhonov-regularization [53]. When an estimate of the power spectral density (PSD) can be accurately determined from a method such as that proposed in [54], a Wiener-based solution can be found by using (11) and is the estimated PSD of the image where . for Taking advantage of the shearlet decomposition, we can adaptively control the regularization parameter to be the best suited denote of each frequency supported trapezoidal region. Let

where , is the mean of , and the sum is taken over all values of and from to inclusive. In other words, we choose such that the estimate agrees with the observation based on weighing counter-balanced by an approximation to for each frequency index . This is just an extension (shearlet-based estimate versus wavelet-based estimate) of the cost function originally suggested in [17]. This optimization function assumes the noise to be known. This is not a problem since the noise variance variance can easily be estimated by using a median estimator on the finest scale of the wavelet coefficients of [4], [47].

2678

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

Fig. 2. Fourier-shearlet regularized deconvolution.

TABLE I ISNR FOR DIFFERENT EXPERIMENTS

thresholded shearlet coefficient can be found by miniequal to mizing the cost function

Fig. 3. Images used in this paper for different experiments. (a) Cameraman image, (b) Barbara image, and (c) Lena image, (d) Peppers image.

In this case, there is a great advantage in using the GCV for the shearlet thresholding as the variance of the colored noise at each location and scale dependent on does not have to be estimated. In addition, a GCV-based thresholding routine produces better results over schemes based on estimating the standard deviation of the noise throughout the decomposition. for If we define , then the optimal for each

where and is the inverse DFT . of The use of this optimization function to find has been shown to be satisfactory with many of the examples tested. The L-curve method could be adopted to estimate (see [21] for details on the method). This could prove to be more reliable and does not require any estimate of the noise variance. It is also possible to use the optimization function where denotes the regularized Tikhonov inverse of and denotes the identity matrix. Such an optimization function against the given data weighs the fidelity of the estimate and inversely weighs it against a measure of how far away the regularized inverse operator is from an idealized inversion operator. This generalized cross validation function can be derived by using similar arguments to that given in [46].

PATEL et al.: SHEARLET-BASED DECONVOLUTION

2679

Fig. 4. R and GCV as functions of the threshold  for different scales and directions from Experiment 1. Solid (blue) line corresponds to the GCV ;` , (b) j dotted (red) line corresponds to R . (a) j ; ` , (c) j ; ` .

= 2 = 13

=1 =4

We summarize the main steps of the shearlet-based deconvolution algorithm as follows.

=3 =6

For an image of size

and

, the BSNR is defined in decibels as

Shearlet-based Deconvolution Algorithm Given , for some and . • Use the shearlet filter and apply the regularized filter . (10) or (11) to to obtain to • Apply the GCV based shearlet shrinkage to . obtain Repeat process for a different value of minimized for each and .

until

is

After each shearlet coefficient that minimizes is found, form the final estimate by applying the inverse shearlet transform.

The values that minimize can be found by using either a sequence of possible values or by using a minimization routine. Although we have described the most general case of regularizing each shearlet coefficient separately, in some cases for efficiency it may be preferred to use a common regularization parameter . In such a case, the algorithm is implemented using the coast function instead of . V. EXPERIMENTAL RESULTS In this section, we present results of our proposed algorithm and compare them with some of the recent multiscale wavelet and wavelet-like deconvolution methods described in [17], [41] and [26]. The implementation of the ForWaRD algorithm is available at www.dsp.rice.edu/software and the implementation of the LPA-ICI algorithm is available at www.cs.tut.fi/~lasip/2D. In these experiments we will use the improvement in signal-to-noise-ratio (ISNR) to measure the performance of the routines tested using the images shown in Fig. 3. The ISNR is defined as

where denotes the mean of . For all shearlet transform implementations, we used 1, 8, 8, 16, and 16 directions in the scales from coarse to fine which corresponds to the same decomposition tested in [6]. We apply the GCV based shrinkage to the outputs from each of the 48 filters except the output corresponding to the coarsest scale. Experiments have shown that increasing the number of directions every scale usually results in better estimates. In the case when wavelets are used for image denoising, it was shown in [55] that a Wiener-based wavelet shrinkage filter typically improves upon the mean square error performance over that of hard/soft thresholding. By Wiener-based shrinkage, we mean to weigh the wavelet coefficients as , where are the wavelet coefficients from another denoised estimate, and are scale-dependent regularization parameters. The performance of the proposed method is improved by using a similar Wiener shrinkage filter. In this case, the shearlet coefficients with a slightly different decomposition (three decomposition levels) are filtered using the initial shearlet-based estimate. Several experiments have shown that the final estimate is mostly driven by how successful the initial estimate is, so that even the use of a wavelet decomposition instead of an alternate shearlet decomposition can provide just as an effective estimate. In the first set of tests, we consider the setup of [17], where a Cameraman image is blurred by a 9 9 uniform box-car blur. The AWGN variance, , is chosen with a BSNR of 40 dB. A comparison of different methods in terms of ISNR is shown in Table I under the Experiment 1 column. The shearlet-based method yields a value 7.89 dB which is better than the values obtained by any of the other methods. In Fig. 4, we display a few of the and curves for different scales and directions obtained from Experiment 1. In [17], after the Fourier shrinkage, the leaked colored noise variance was estimated at each scale and was used to shrink

2680

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

Fig. 5. Minimum values obtained by minimizing R (black), and GCV (dot-dash) values along with  (gray) values obtained by the method described in [17], for different scales and directions from Experiment 1. As can be seen from the figure, minimum values of GCV are approximately equal to the minimum values of R . (a) At scale j , there are 16 directions. (b) At scale j , there are 16 directions. (c) At scale j , there are eight directions. (d) At scale j , there are eight directions.

=4

=1

=2

=3

the wavelet coefficients. Similarly, we can estimate the colored noise variance, , at different scales and directions at the output of the shearlet filter bank as follows:

The threshold values are determined by , where is a scale and direction dependent threshold [4], [17]. For comparison, the estimated are plotted in Fig. 5 along with the actual minimum values obtained by minimizing the and functions for experiment 1. Figs. 4 and 5 indicate that both and have approximately the same minimum values. However, in some cases, the estimated values are very different than the values obtained by minimizing the and curves. In Fig. 6, we show the signal-to-noise-ratio (SNR) performance of the shearlet-based deconvolution (black-line) compared to ForWaRD (dotted-line) and ForCuRD (gray-line) as a function of blur SNR (BSNR). In this illustration, we used the 9 9 box-car blur on the Lena image shown in Fig. 3(c). As explained previously, an estimate based on a shearlet or curvelet decomposition decays faster than that of an estimate based on a wavelet decomposition as a function of noise level

Fig. 6. SNR performance of ForSURE (black-line) compared to ForCuRD (gray-line) and ForWaRD (dotted-line) as a function of BSNR.

for images that are smooth away from edges. Fig. 6 displays a similar correspondence in decay rates for the proposed shearlet-based deconvolution and the ForCuRD scheme over the wavelet-based deconvolution scheme (ForWaRD). Since the performance is measured in terms of SNR instead of MSE, it is expected that shearlet and curvelet-based estimates will decay slower as a function of noise level given in terms of BSNR. In the second set of experiments performed over the Cameraman image, we replicate the experimental setup of [42].

PATEL et al.: SHEARLET-BASED DECONVOLUTION

2681

=5

Fig. 7. Details of the image deconvolution experiment with a Barbara image. (a) Original image. (b) Noisy blurred image,  dB. (c) ForWaRD : dB. (d) ForCuRD estimate, : dB. (e) LPA-ICI algorithm, : dB. (f) Shearlet-based estimate, estimate, : dB.

1 74

ISNR = 1 12

ISNR = 1 03

The point spread function of the blur operator is given by: , for , and

ISNR = 1 0

ISNR =

and . The SNR imthe noise variances are provements are summarized in Table I under the Experiment

2682

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

Fig. 8. Details of the image deconvolution experiment with a Peppers image. (a) Original image. (b) Noisy blurred image, BSNR = 30 dB. (c) Regularized inversion estimate, ISNR = 11:03 dB. (d) ForWaRD estimate based on result shown in (c), ISNR = 3:91 dB. (e) LPA-ICI estimate based on result shown in (c), ISNR = 4:17 dB. (f) Shearlet-based estimate based on result shown in (c), ISNR = 5:23 dB.

0

2 and Experiment 3 columns, for and , respectively. Again, the shearlet-based deconvolution algorithm outperforms the other methods in terms of ISNR. In the third set of tests, the original image of Lena is blurred by a Gaussian PSF defined as

for , where is a normalizing constant ensuring that the blur is of unit mass, and is the variance that

determines the severity of the blur. In this experiment we chose and the noise variance, , with a BSNR of 40 dB. We report the simulation results under the Experiment 4 column of Table I. Again, our proposed method outperforms the best performing methods known for this problem setup. In the fifth experiment, we use the blur filter considered in [41]. The original image of Barbara is blurred by a 5 5 separable filter with weights [1, 4, 6, 4, 1]/16 in both the horizontal and vertical directions and then contaminated with AWGN with . The details of the images obtained by the different methods are shown in Fig. 7. Again, the shearlet-based algo-

PATEL et al.: SHEARLET-BASED DECONVOLUTION

H

= trace I H H

2683

C

x

x

Fig. 9. Plots of k x 0 yk ( ( 0 ~ )), ( ), and k 0 k for various values of are displayed as a black line, a gray line, and a dash-dotted line, respectively. The output values have been rescaled for illustration purposes. The locations of their minimal value are marked with a circled “X”.

rithm performs the best in terms of ISNR and captures the details better than any of the other methods. In light of the robustness to noise of the shearlet-based method showed in Fig. 6, we replicated the set-up similar to that given in [17]. In this case, the Peppers image is blurred by 9 uniform box-car blur and the AWGN added is such a9 dB. We tested the ForWaRD method that the dB , the ForCuRD method dB , the LPA-ICI method dB , and the shearlet-based method dB using the same regularization parameter (not necessarily optimal) for each routine (experiment 6). Close-ups of some of the results are shown in Fig. 8. The Fourier regularized inversion estimate used in all three algorithms is shown in Fig. 8(c). This experiment presents an important comparison in robustness to noise suppression and an indication of the shearlet-based algorithm’s high default tolerance level when the regularization parameter is not chosen optimally. Plots of the validation functions and are shown for various values of in Fig. 9. For comparison, a plot of is also given. In the experiment set up for this comparision, we used the Peppers image blurred by a 9 9 uniform box-car blur and the AWGN added with corresponding dB. The results give an indication that the validation function can be just as effective as for finding estimates of the optimal when the standard deviation of the noise is not estimated. VI. CONCLUSION In this work, we have proposed an effective shearlet-based deconvolution algorithm which utilizes the power of a Fourier representation to approximately invert the convolution operator and a redundant shearlet representation to provide an image estimate. The multiscale and multidirectional aspects of the shearlet transform provide a better estimation capability over that of the wavelet transform or wavelet-like transforms for images exhibiting piecewise smooth edges. In addition, we have

adapted a method of automatically determining the threshold values for the shearlet noise shrinkage without knowing the noise variance by using a generalized cross validation function. Demonstrations show this method to outperform many of the state-of-the-art methods that have been previously compared to the ForWaRD algorithm in the literature and points the direction of future research in this area to the utilization of such multiscale and multidirectional representations as shearlets. A new multichannel implementation of the shearlet transform can be utilized that commutes with periodic convolution so may be projected onto a shearlet domain before esthat timating the inversion of the convolution operator. This allows for the regularization of the approximate inversion operator to be controlled on a multiscale and multidirectional basis. In the cases where the convolution operator is well behaved (see [14] for details) so that inversion of the convolution operator does not need approximating, this technique can be adapted so as to provide a Vaguelette-Shearlet Decomposition (VSD) estimate similar to the Vaguelette-Wavelet Decomposition estimate proposed in [16]. Likewise, in the case of appropriately behaved convolution operators, this technique could be easily adapted to provide a Shearlet-Vaguelette Decomposition based estimate. Several future directions of inquiry are possible considering our new approach. Many complicated techniques that go beyond trimming wavelet coefficients by a threshold parameter have be proposed, such as methods based on the hidden Markov models [56]. It is very likely that utilization of such techniques adapted to the shearlet domain will provide an even greater performance when combined to a deconvolution routine. It would also be of interest to apply the methods developed here to blind deconvolution when the knowledge of the point spread function is not assumed. ACKNOWLEDGMENT The authors would like to thank R. Baraniuk, R. Neelamani, and the anonymous reviewers for their many valuable comments and suggestions which significantly improved this paper. REFERENCES [1] D. L. Donoho and I. M. Johnstone, “Adapting to unknown smoothness via wavelet shrinkage,” J. Amer. Statist. Assoc., vol. 90, no. 432, pp. 1200–1224, 1995. [2] D. L. Donoho and I. Johnstone, “Ideal spatial adaptation via wavelet shrinkage,” Biometrika, vol. 81, pp. 425–455, Dec. 1994. [3] D. L. Donoho, M. Vetterli, R. A. DeVore, and I. Daubechies, “Data compression and harmonic analysis,” IEEE Trans. Inf. Theory, vol. 44, pp. 2435–2476, 1998. [4] S. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1998. [5] D. Labate, W.-Q. Lim, G. Kutyniok, and G. Weiss, “Sparse multidimensional representation using shearlets,” in Proc. SPIE, Wavelet Applications in Signal and Image Processing XI, Sep. 2005, vol. 5914, pp. 254–262. [6] G. R. Easley, D. Labate, and W.-Q. Lim, “Sparse directional image representations using the discrete shearlet transform,” Appl. Comput. Harmon. Anal., vol. 25, no. 1, pp. 25–46, Jul. 2008. [7] D. L. Donoho, “Sparse components of images and optimal atomic decomposition,” Constr. Approx., vol. 17, pp. 353–382, 2001. [8] G. R. Easley, D. Labate, and F. Colonna, “Shearlet-based total variation diffusion for denoising,” IEEE Trans. Image Process., vol. 18, no. 2, pp. 260–268, Feb. 2009. [9] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,” IEEE Trans. Image Process., vol. 14, no. 12, pp. 2091–2106, Dec. 2005.

2684

[10] A. L. Cunha, J. Zhou, and M. N. Do, “The nonsubsampled contourlet transform: Theory, design, and applications,” IEEE Trans. Image Process., vol. 15, no. 10, pp. 3089–3101, Oct. 2006. [11] E. J. Candès and D. L. Donoho, “Curvelets—A surprisingly effective nonadaptive representation for objects with edges,” in Curves and Surface Fitting, A. Cohen, C. Rabut, and L. L. Schumaker, Eds. Nashville, TN: Vanderbilt Univ. Press, 1999. [12] J. L. Starck, E. J. Candès, and D. L. Donoho, “The curvelet transform for image denoising,” IEEE Trans. Image Process., vol. 11, no. 6, pp. 670–684, Jun. 2002. [13] E. J. Candès, L. Demanet, D. L. Donoho, and L. Ying, “Fast discrete curvelet transforms,” Multiscale Model. Simul., vol. 5, no. 3, pp. 861–899, 2006. [14] D. L. Donoho, “Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition,” Appl. Comput. Harmon. Anal., vol. 2, pp. 101–126, 1995. [15] E. J. Candés and D. L. Donoho, “Recovering edges in ill-posed inverse problems: Optimality of curvelet frames,” Ann. Statist., vol. 30, no. 3, pp. 784–842, 2002. [16] F. Abramovich and B. W. Silverman, “Wavelet decomposition approaches to statistical inverse problems,” Biometrika, vol. 85, no. 1, pp. 115–129, Oct. 1998. [17] R. Neelamani, H. Choi, and R. G. Baraniuk, “ForWaRD: Fourierwavelet regularized deconvolution for ill-conditioned systems,” IEEE Trans. Image Process., vol. 52, no. 2, pp. 418–433, Feb. 2004. [18] J. Starck, M. K. Nguyen, and F. Murtagh, “Wavelets and curvelets for image deconvolution: A combined approach,” Signal Process., vol. 83, no. 10, pp. 2279–2283, 2003. [19] R. R. Coifman and D. L. Donoho, “Translation invariant de-noising,” in Wavelets and Statistics, A. Antoniadis and G. Oppenheim, Eds. New York: Springer-Verlag, 1995, pp. 125–150. [20] M. K. Ng, R. H. Chan, and W. Tang, “A fast algorithm for deblurring models with Neumann boundary conditions,” SIAM J. Sci. Comput., vol. 21, no. 3, pp. 851–866, 1999, (electronic). [21] P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. Philadelphia, PA: SIAM, 1998. [22] J. Kalifa, S. Mallat, and B. Rouge, “Deconvolution by thresholding in mirror wavelet bases,” IEEE Trans. Image Process., vol. 12, no. 4, pp. 446–457, Apr. 2003. [23] I. M. Johnstone, G. Kerkyacharian, D. Picard, and M. Raimondo, “Wavelet deconvolution in a periodic setting,” J. Roy. Statist. Soc. B Stat. Methodol., vol. 66, no. 3, pp. 547–573, 2004. [24] M. Pensky and B. Vidakovic, “Adaptive wavelet estimator for nonparametric density deconvolution,” Ann. Statist., vol. 27, no. 6, pp. 2033–2053, 1999. [25] J. Fan and J.-Y. Koo, “Wavelet deconvolution,” IEEE Trans. Inf. Theory, vol. 48, no. 3, pp. 734–747, Mar. 2002. [26] R. N. Neelamani, M. Deffenbaugh, and R. G. Baraniuk, “Texas two-step: A framework for optimal multiinput single-output deconvolution,” IEEE Trans. Image Process., vol. 16, no. 11, pp. 2752–2765, Nov. 2007. [27] W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Amer., vol. 62, pp. 55–59, 1972. [28] L. B. Lucy, “An iterative technique for the rectification of observed distributions,” Astron. J., vol. 79, pp. 745–754, 1974. [29] L. Landweber, “An iterative formula for Fredholm integral equations of the first kind,” Amer. J. Math., vol. 73, no. 3, pp. 615–624, Jul. 1951. [30] J.-L. Starck, A. Bijaoui, and F. Murtagh, “Multiresolution support applied to image filtering and deconvolution,” CVIP: Graph. Models Image Process., vol. 57, pp. 420–431, 1995. [31] J. M. Bioucas-Dias, “Bayesian wavelet-based image deconvolution: A GEM algorithm exploiting a class of heavy-tailed priors,” IEEE Trans. Image Process., vol. 15, no. 4, pp. 937–951, Apr. 2006. [32] M. Elad, B. Matalon, and M. Zibulevsky, “Coordinate and subspace optimization methods for linear least squares with non-quadratic regularization,” Appl. Comput. Harmon. Anal., vol. 23, pp. 346–367, Nov. 2007. [33] G. Teschke, “Multi-frame representations in linear inverse problems with mixed multi-contraints,” Appl. Comput. Harmon. Anal., vol. 22, pp. 43–60, 2007. [34] I. Daubechies, G. Teschke, and L. Vese, “Iteratively solving linear inverse problems,” Inv. Probl. Imag., vol. 1, no. 1, pp. 29–46, 2007. [35] M. J. Fadili and J.-L. Starck, Sparse Representations-Based Image Deconvolution by Iterative Thresholding, preprint, 2007. [36] H. Takeda, S. Farsiu, and P. Milanfar, “Deblurring using regularized locally adaptive kernel regression,” IEEE Trans. Image Process., vol. 17, no. 4, pp. 550–563, Apr. 2008.

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 12, DECEMBER 2009

[37] I. Daubechies, M. Fornasier, and I. Loris, Accelerated Projected Gradient Method for Linear Inverse Problems With Sparsity Constraints, preprint, 2008. [38] C. Vonesch and M. Unser, “A fast thresholded Landweber algorithm for wavelet-regularized multidimensional deconvolution,” IEEE Trans. Image Process., vol. 17, no. 4, pp. 539–549, Apr. 2008. [39] I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Comm. Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, 2004. [40] P. de Rivaz and N. Kingsbury, “Bayesian image deconvolution and denoising using complex wavelets,” presented at the IEEE Int. Conf. Image Process., Thessaloniki, Greece, 2001. [41] V. Katkovnik, K. Egiazarian, and J. Astola, “A spatially adaptive nonparametric regression image deblurring,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1469–1478, Oct. 2005. [42] M. A. T. Figueiredo and R. D. Nowak, “An EM algorithm for wavelet based image restoration,” IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, Aug. 2003. [43] K. Guo, W. Lim, D. Labate, G. Weiss, and E. Wilson, “Wavelets with composite dilations and their MRA properties,” Appl. Comput. Harmon. Anal., vol. 20m, pp. 231–249, 2006. [44] G. R. Easley, V. Patel, and D. M. Healy, Jr., “A M-channel directional filter bank compatible with the contourlet and shearlet frequency tiling,” presented at the SPIE Wavelets XII, San Diego, CA, Aug. 26–30, 2007. [45] K. Guo and D. Labate, “Optimally sparse multidimensional representation using shearlets,” SIAM J. Math. Anal., vol. 39, no. 1, pp. 298–318, 2007. [46] G. H. Golub, M. Heath, and G. Wahba, “Generalized cross-validation as a method for choosing a good ridge parameter,” Technometrics, vol. 21, no. 2, pp. 215–223, 1979. [47] D. L. Donoho and I. Johnstone, “Ideal spatial adaptation via wavelet shrinkage,” Biometrika, vol. 81, pp. 425–455, Dec. 1994. [48] N. Weyrich and G. T. Warhola, “Wavelet shrinkage and generalized cross validation for image denoising,” IEEE Trans. Image Process., vol. 7, no. 1, pp. 82–90, Jan. 1998. [49] M. Jansen, M. Malfait, and A. Bultheel, “Generalized cross validation for wavelet thresholding,” Signal Process., vol. 56, no. 1, pp. 33–44, Jan. 1997. [50] M. Jansen and A. Bultheel, “Multiple wavelet threshold estimation by generalized cross validation for images with correlated noise,” IEEE Trans. Image Process., vol. 8, no. 7, pp. 947–953, Jul. 1999. [51] F. Luisier, T. Blu, and M. Unser, “A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding,” IEEE Trans. Image Process., vol. 16, no. 3, pp. 593–606, Mar. 2007. [52] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes, 3rd ed. New York: McGraw-Hill, 2002. [53] A. N. Tikhonov and V. Y. Arsenin, Solution of Ill-Posed Problems. New York: Wiley, 1977. [54] A. D. Hillery and R. T. Chin, “Iterative Wiener filters for image restoration,” IEEE Trans. Signal Process., vol. 39, no. 8, pp. 1892–1899, Aug. 1991. [55] S. Ghael, A. M. Sayeed, and R. G. Baraniuk, “Improved wavelet denoising via empirical Wiener filtering,” in Proc. SPIE, Wavelet Applications in Signal and Image Processing V, Oct. 1997, vol. 3169, pp. 389–399. [56] M. Crouse, R. Nowak, and R. Baraniuk, “Wavelet-based statistical signal processing using hidden Markov models,” IEEE Trans. Signal Process., vol. 46, no. 4, pp. 886–902, Apr. 1998.

Vishal M. Patel (M’01) received the B.S. degrees in electrical engineering and applied mathematics (with honors) and the M.S. degree in applied mathematics from North Carolina State University, Raleigh, in 2004 and 2005, respectively. He is currently pursuing the Ph.D. degree in electrical engineering at the University of Maryland, College Park. His research interests include compressed sensing, radar imaging, inverse problems, pattern recognition, and biometrics. Mr. Patel is a member of Eta Kappa Nu, Pi Mu Epsilon, Phi Beta Kappa, and a student member of SIAM.

PATEL et al.: SHEARLET-BASED DECONVOLUTION

Glenn R. Easley (M’08) received the B.S. (with honors) and M.A. degrees in mathematics from the University of Maryland, College Park, in 1993 and 1996, respectively, and the Ph.D. degree in computational science and informatics from George Mason University in 2000. Since 2000, he has been with System Planning Corporation working in signal and image processing. He has also been a Visiting Assistant Professor for the Norbert Wiener Center, University of Maryland, College Park, since 2005. His research interests include computational harmonic analysis, with special emphasis on wavelet theory and time-frequency analysis, synthetic aperture radar, deconvolution, and computer vision.

2685

Dennis M. Healy, Jr. received the B.S. degrees in physics and mathematics and the Ph.D. degree in mathematics from the University of California at San Diego (UCSD), La Jolla, in 1980 and 1986, respectively. He is a Professor of mathematics at the University of Maryland, College Park, as well as program manager for the Microsystems Technology Office at DARPA. Previously, he served as program manager for DARPA’s Applied and Computational Mathematics Program in the Defense Sciences Office and as Associate Processor at Dartmouth College with joint appointments in the Departments of Mathematics and Computer Science. His research concerns applied computational mathematics in real-world settings including medical imaging, optical fiber communications, design and control of integrated sensor/processor systems, control of quantum systems, statistical pattern recognition, and fast nonabelian algorithms for data analysis.