a primal-dual algorithm for joint demosaicking and deconvolution

Report 1 Downloads 66 Views
A PRIMAL-DUAL ALGORITHM FOR JOINT DEMOSAICKING AND DECONVOLUTION Hiˆep Quang Luong, Bart Goossens, Jan Aelterman, Aleksandra Piˇzurica and Wilfried Philips Ghent University, TELIN-IPI-IBBT, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium ABSTRACT

2. PROBLEM FORMULATION

In this paper, we present a first-order primal-dual algorithm for tackling the joint demosaicking and deconvolution problem. The proposed algorithm exploits the sparsity of both discrete gradient (TV) and shearlet coefficients as prior knowledge. In order to deal with this sparsity across the color channels, we first decorrelate the signals in color space before sparsifying them spatially, resulting in a separable transform. We demonstrate that this approach yields better results than employing group sparsity strategies. We propose to update the decorrelation operator during the image reconstruction, this approach will result in a significant improvement in PSNR. By relaxing the sparsity of the chrominance signals, we obtain both better objective and subjective image quality compared to other stateof-the-art demosaicking and deconvolution algorithms. Also, color artifacts due to demosaicking are suppressed very well. Index Terms— Demosaicking, deconvolution, shearlets, sparsity, primal-dual algorithm 1. INTRODUCTION Nowadays, most color images are acquired by cameras that have a single sensor in combination with a color filter array (CFA). Amongst all CFAs, the most commonly used pattern is the Bayer one, despite the fact that other patterns exist with more interesting properties, see e.g. [1]. The missing samples need to be estimated using interpolation techniques, also called demosaicking. We refer the reader to [2] for a broad overview of demosaicking techniques. In case of blurred images, we need additional deconvolution. Blur arise from the fact that the camera captures the scene that is out of focus or in the presence of fast motion (either camera shake or fast moving objects), etc. The main drawback of applying demosaicking and deconvolution sequently is that input noise for deconvolution becomes correlated, which inherently needs more complex denoising algorithms. Recently, joint demosaicking and deconvolution have been studied by various researchers [3, 4, 5, 6]. In [3], Har-Noy et al. combine a deconvolved luminance channel with a fast demosaicked blurred RGB to remove motion blur. Paliy et al. focused on removing Poisson noise using LPA-ICI [4], the same LPA-ICI approach is applied in [6] as a preprocessing step. Soulez and Thi´ebaut developed a Bayesian restoration method using edge-preserving spatial and spectral regularization [5]. In this paper, we will follow a variational approach as in [5] and improve it by applying recent developments in the sparsity and optimization community. In Section 2, we describe the primal-dual problem formulation. In Section 3, we propose a new algorithm to solve the joint demosaicking and deblurring problem. We compare the results of our proposed algorithm with those of other methods in Section 4. We end this paper with a conclusion in Section 5. B. Goossens is a postdoctoral researcher of the Fund for the Scientific Research in Flanders (FWO) Belgium.

978-1-4673-2533-2/12/$26.00 ©2012 IEEE

2801

The most common model for a linear degradation caused by the CFA, blur and additive noise is given by the following matrix-vector formulation: y = CBx + n, (1) where y ∈ Rn , x ∈ R3n and n ∈ Rn are the observed, ideal (or hypothetical desired) and noise images respectively in a columnstack ordering (i.e. the columns of all image channels are stacked into a single vector where each channel contains n pixels and the ith component of x is denoted as xi ), B ∈ R3n×3n represents the blur operator and C ∈ Rn×3n denotes the Bayer down-sampling operator. In this paper, we employ stationary (or shift-invariant) blur models in the acquisition model. The use of a space-varying degradation model is also possible, but leads to more complex solutions. When using the stationary model, the matrix B can be approximated by a circulant-block-circulant matrix, which can be implemented efficiently in the discrete Fourier domain. To solve the ill-posed and ill-conditioned inverse problem in presence of white Gaussian noise, we adopt the following primal variational formulation: ˆ = arg min x x

λ CBx − y22 + (D ⊗ S)xp , 2

(2)

where λ is a regularization parameter between the likelihood term and image prior (respectively the first and second term), ⊗ denotes the kronecker product (this also indicates that the transform is separable), S ∈ Rm×n is a spatially sparsifying transformation that operates on a channel per channel basis (with m being the number of transform coefficients per channel) and D ∈ R3×3 performs a color decorrelation operation. We will discuss both operators into more details in the next section. We define the finite p -norm for the P image prior as f p = ( i∈Ω fi )1/p (where Ω denotes the measure space). For 1 ≤ p < ∞, the cost function given by (2) will be convex and the optimization procedure guarantees a global optimum. In the case that 0 ≤ p < 1, we refer the reader to non-convex optimization algorithms, for example using iterative re-weighted leastsquares [7]. Applying the Legendre-Fenchel transform [8] on the image prior term results in the following primal-dual formulation: E λ D ˆ = arg min max x, (DH ⊗ SH )r + CBx − y22 − rq , x r 2 x (3) where r is the dual variable, ·H denotes the Hermitian transpose, ·, · is the inner product and the q -norm is the convex conjugate of the p -norm, where p1 + 1q = 1 (for p ≥ 1). We solve the saddlepoint problem (3) by the first-order primal-dual algorithm as proposed in [9], which converges to the solution with an optimal rate for this type of problems.

ICIP 2012

3. PROPOSED PRIMAL-DUAL APPROACH In this section, we will discuss more details on our choice of the color decorrelation matrix D, the spatially sparsifying operator S and the p -norm. After the discussions, the complete algorithm is summarized in Algorithm 1 (where σ and τ are step parameters [9]). Note first that the joint linear degradation operator CB cannot be implemented efficiently using FFTs. To overcome this problem, we can additionally dualize the likelihood term in equation (3), resulting in: D E ˆ = arg min max x, (DH ⊗ SH )r + CBx − y, s x r,s x (4) 1 − rq − s22 , 2λ where s is the additional dual variable and its resolvent operator is given on line 7 of Algorithm 1. It is well-known that there is a strong correlation between color channels of an image, which is also exploited by many demosaicking algorithms [2]. A simple way to decorrelate the RGB components is use a color transform such as YCrCb that splits the image into a luminance and 2 chrominance channels. Given the “ideal” image, the optimal color decorrelation matrix D can be obtained via the principal component analysis (PCA) transform. Unfortunately, in practice, we do not have this ideal image. In this paper, we propose to estimate and update the decorrelation matrix D during the image reconstruction (e.g. at every 20th iteration) using a more robust 1 -norm PCA transform, which is less sensitive to outliers. To construct D, we apply the 1 -norm PCA transform (as proposed in [10]) on Sx(j) . Applying D maximizes the 1 -dispersion of the sparse coefficients across the RGB channels. Note that D is orthonormal by construction, which means that DH = D−1 . In order to achieve a good image reconstruction quality, the choice of sparsifying transforms is important. The most popular ones are based on total-variation (TV) or wavelets. Recently, more and more interest has been raised in sparse signal approximation using learned dictionaries [11] or multiresolution representations with better directional selectivity such as curvelets, ridgelets or shearlets [12, 13, 14]. Shearlets are theoretically optimal in representing images with edges, which is also referred to as optimal sparsity [13]. In this paper, we combine both the p -norm of discrete gradient (TV) and the discrete shearlet coefficients as image prior. We have demonstrated this combination successfully for the reconstruction of MR data [15]. We denote the discrete gradient operator as STV = ∇ and the discrete shearlet transform as SS , and their H adjoint operators as SH TV and SS respectively. Because the discrete shearlet transform is a tight frame, SH S is also the backward discrete shearlet transform [14]. The spatially transform in equa– » sparsifying STV . tion (4) can than be written as S = SS For the shearlet coefficients and the luminance (w.r.t. the principal component, in analogy with the terminology of color transforms) channel of the discrete gradient coefficients, we apply the 1 -norm (p = 1) sparsity. Since the Legendre-Fenchel transform of the 1 norm is the indicator of a convex set (q = ∞), the resolvent operator reduces to a pointwise Euclidean projector onto 2 -balls [9], see e.g. line 10 of Algorithm 1. Because the chrominance signals are much more smooth, we propose to use the 1.5 -norm, which gives better reconstruction results as illustrated in Section 4. There are two reasons why we have chosen p = 1.5: the first reason is that the 1 -norm does not suppress coloring artifacts very well, these artifacts originate from bad high-frequency chrominance signals due to chro-

2802

matic aliasing. On the other hand, the 2 -norm smooths the chrominance information too much over the edges, therefore a balance is found in the 1.5 -norm. The second reason is that its LegendreFenchel transform is given by the 3 -norm (q = 3), which happens to have a quite simple closed-form resolvent proximal operator [16], see line 12 of Algorithm 1. Algorithm 1 Proposed joint demosaicking and deconvolution 1: Initialize iteration number j = 0 and reconstruct x(0) with bi-

linear demosaicking from y

2: Initialize the dual variables r(0) = Sx(0) and s(0) = CBx(0) −

y

˜ = x(0) 3: Initialize the prediction variable x 4: while stopping criterion is not met do 5: Initialize and update D every 20th iteration [10] 6: j ← j +1 7: 8: 9: 10: 11: 12: 13: 14: 15: 16:

x − y) s(j−1) + σ(CB˜ 1 + σ/λ ˜ x r = r(j−1) + σ(D ⊗ S)˜ For the shearlet and TV luminance components: r˜i (j) ∀i ∈ ΩS ∪ ΩTV,l : ri = max(1, |˜ ri |) For the TV chrominancepcomponents: 1 + 12|˜ ri | − 1 (j) ∀i ∈ ΩTV,c : ri = r˜i 6|˜ ri | x(j) = x(j−1) − τ (DH ⊗ SH )r(j) − τ BH CH s(j) ˜ = 2x(j) − x(j−1) x end while ˆ = x(j) x s(j) =

4. EXPERIMENTAL RESULTS In a controlled experiment, we perform a quantitative evaluation of the joint demosaicking and deconvolution algorithms. We simulate the noisy and blurred Bayer data by successively applying Gaussian blur (with σB = 1), adding white Gaussian noise (with σn = 1) and applying the Bayer CFA on the 24 images of the Kodak set as formulated in model (1). The parameters are fixed over all experiments √ and have been chosen based on trial-and-error: σ = τ = 1/ 8, λ = 100 and the number of iterations is fixed at 500, although the algorithm converges much faster, we want to have a fair quality comparison with all variations of the algorithm. We compare the proposed primal-dual algorithm as described in Section 3 to another variational approach using edge-preserving spatial and spectral regularization [5]1 and demosaicking followed by deconvolution using a state-of-the-art denoisaicking (i.e. joint demosaicking and denoising) algorithm [17]2 (the employed deconvolution algorithm is the same as the proposed algorithm but without the Bayer down-sampling operator). We also compare with variations on the proposed method: (i) 1 -norm on the RGB channels and without color decorrelation (referred to as PD-RGB), (ii) using a fixed YCrCB-decorrelation operator (referred to as PD-YCrCb), (iii) 1 -norm on both luminance and chrominance channels (referred to as PD-full-L1), (iv) structured 1 -norm on the RGB channels using the concept of group sparsity, i.e. treating the RGB-values as a single vector because of the correlation between the color channels [7] 1 We 2 We

aries.

have optimized the parameters to obtain the best PSNR results. have modified this algorithm to correctly deal with image bound-

(referred to as PD-groupsparsity). The average PSNR values over RGB channels is given in Table 1. Table 1. Average PSNR results for the different reconstruction methods. The best result is written in bold. Soulez and Thi´ebaut [5] Denoisaicking [17] + deconvolution Proposed method PD-RGB PD-groupsparsity PD-YCrCb PD-full-L1

32.42 dB 32.93 dB 33.09 dB 29.70 dB 30.50 dB 32.15 dB 32.83 dB

From the numerical results, we can observe that the somewhat naive approach (PD-RGB) that only takes the 1 -norm sparsity and no correlation into account, produces the worst PSNR result. By introducing the concept of group sparsity (PD-groupsparsity) as demonstrated in [7], the results only improve with 0.8 dB on average. However, by decorrelating the color channels before sparsifying them, we obtain much better results. We can conclude that color decorrelation is much more powerful than exploiting this correlation indirectly through arbitrary ‘group structures”. Using an adaptive decorrelation matrix as proposed is also better than using a fixed general color decorrelation transform such as YCrCb, the PSNR even improves with almost 1 dB. When we compare the 1 -norm with the 1.5 -norm for the chrominance sparsified coefficients (PDfull-L1), we note that there is only a slight improvement in PSNR. However, in Figure 1, we observe that color artifacts is suppressed very well with the proposed method using the 1.5 -norm, while this is not the case with the 1 -norm of the chrominance data. We can also see that PD-RGB and PD-groupsparsity suffer even more from these color artifacts (see also Figure 2). The proposed method produces also better PSNR results compared to the joint reconstruction method of Soulez and Thi´ebaut [5] and only slightly better PSNR results w.r.t. sequential denoisaicking [17] and deconvolution. The reason is that blurred image data makes the demosaicked result less prone to artifacts due to the fact that there are no strong edges, so denoising becomes more prominent, which is processed here by the state-of-the-art denoising algorithm BM3D [18]. Although objective PSNR results are comparable, denoisaicking can fail completely as illustrated in Figure 2: errors in the demosaicked image are even emphasized by the deconvolution.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

5. CONCLUSION In this paper, we proposed a new joint demosaicking and deconvolution algorithm based on first-order primal-dual minimization, which achieves better PSNR results compared to other state-of-theart methods. The proposed algorithm solves the inverse problem by promoting a combined sparse gradient and shearlet model, which is very effective for natural images. Rather than using a fixed decorrelation, we propose to update the decorrelation matrix during the reconstruction by applying the 1 –norm PCA transform, which results in an increase of image quality. Another improvement in both objective and subjective image quality is to use the 1.5 -norm for chrominance gradient data, i.e. relaxing the sparsity constraints, which suppresses color artifacts due to demosaicking very well. Further improvements can be achieved by employing non-convex priors (e.g. 0 ≤ p < 1), see also [7].

2803

Fig. 1. Detailed part of (a) original image, (b) bilinear demosaicking (without deblurring), (c) PD-RGB, (d) PD-groupsparsity, (e) PD-full-L1, (f) Denoisaicking [17] + deconvolution, (g) Soulez and Thi´ebaut [5] and (h) proposed method.

[6] M. Trimeche, D. Paliy, M. Vehvilainen, and V. Katkovnik, “Multichannel image deblurring of raw color components,” in Proc. of SPIE, 2005, vol. 5674, pp. 169–178. [7] A. Majumdar and R.K. Ward, “Compressed sensing of color images,” Signal Processing, vol. 90, pp. 3122–3127, 2010. [8] R.T. Rockafellar, Convex Analysis, Princeton university press, 1970. [9] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of mathematical imaging and vision, vol. 40, no. 1, pp. 120–145, 2011.

(a)

(b)

[10] N. Kwak, “Principal component analysis based on L1-norm maximization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, pp. 1672–1680, 2008.

(c)

[11] R. Rubinstein, M. Zibulevsky, and M. Elad, “Double sparsity: learning sparse dictionaries for sparse signal approximation,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1553–1564, 2010. [12] J.-L. Starck, E. Candes, and D. Donoho, “The curvelet transform for image denoising,” IEEE Transactions on Image Processing, vol. 11, no. 6, pp. 670–684, 2002. [13] K. Guo and D. Labate, “Optimal sparse multidimensional representation using shearlets,” SIAM Journal on mathematical analysis, vol. 39, pp. 298–318, 2007.

(d)

(e)

[14] B. Goossens, J. Aelterman, H.Q. Luong, A. Pizurica, and W. Philips, “Efficient design of a low redundant discrete shearlet transform,” in Proc. of International Workshop on Local and Non-Local Approximation in Image Processing, 2009, pp. 112–124.

(f)

Fig. 2. Detailed part of (a) original image, (b) bilinear demosaicking (without deblurring), (c) PD-RGB, (d) Denoisaicking [17] + deconvolution, (e) Soulez and Thi´ebaut [5] and (f) proposed method.

[15] J. Aelterman, H. Luong, B. Goossens, A. Pizurica, and W. Philips, “Augmented Lagrangian based reconstruction of non-uniformly sub-Nyquist sampled MRI data,” Signal Processing, vol. 91, no. 12, pp. 2731–2742, 2011. [16] P.L. Combettes and J.-C. Pesquet, “A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery,” IEEE Journal of selected topics in signal processing, vol. 1, no. 4, pp. 1–12, 2007.

6. REFERENCES [1] L. Condat, “A new color filter array with optimal properties for noiseless and noisy color image acquisition,” IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2200–2210, 2011.

[17] L. Condat, “A simple, fast and efficient approach to denoisaicking: joint demosaicking and denoising,” in Proc. of IEEE International Conference on Image Processing, 2010, pp. 905– 908.

[2] B.K. Gunturk, J. Glotzbach, Y. Altunbasak, R.W. Schaffer, and R.M. Mersereau, “Demosaicking: color filter array interpolation,” IEEE Signal Processing Magazine, vol. 22, no. 1, pp. 44–54, 2005.

[18] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3d transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2167–2178, 2007.

[3] S. Har-Noy, S.H. Chang, and T.Q. Nguyen, “Demosaicking images with motion blur,” in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2010, pp. 1006–1009. [4] D. Paliy, A. Foi, R. Bilcu, V. Katkovnik, and K. Egiazarian, “Joint deblurring and demosaicing of Poissonian Bayer-data based on local adaptivity,” in Proc. of European Signal Processing Conference (EUSIPCO), 2008. [5] F. Soulez and E. Thiebaut, “Joint deconvolution and demosaicing,” in Proc. of IEEE International Conference on Image Processing, 2009, pp. 145–148.

2804