Spatially adaptive denoising using mixture ... - Semantic Scholar

Report 1 Downloads 221 Views
Spatially Adaptive Denoising Using Mixture Modeling of Wavelet Coefficients Il Kyu Eom1 and Yoo Shin Kim2 1.

Dept. of Information & communication Eng., Miryang Natioanl University, Miryang, Korea 2. Dept. of Electronics Eng., Pusan National University, Pusan, Korea ABSTRACT

A wavelet coefficient is generally classified into two categories: significant (large) and insignificant (small). Therefore, each wavelet coefficient is efficiently modelled as a random variable of a Gaussian mixture distribution with unknown parameters. In this paper, we propose an image denoising method by using mixture modelling of wavelet coefficients. The coefficient is classified as either noisy or clean by using proper threshold [2]. Based on this classification, binary mask value that takes an important role to suppress noise is produced. The probability of being clean signal is estimated by a set of mask values. Then we apply this probability to design Wiener filter to reduce noise and also develop the method of selecting windows of different sizes around the coefficient. Despite the simplicity of our method, experimental results show that our method outperforms other critically sampled wavelet denoising schemes. Keywords: Adaptive denoising, mixture modeling, wavelet transform, Wiener filter, selecting windows of different sizes

1. INTRODUCTION A wide class of image processing algorithms is based on discrete wavelet transform. The transform coefficients within subbands can be modeled as independent identically distributed random variables with generalized Gaussian distribution (GGD). However, the statistics of images show striking non-Gaussian behaviors, and wavelet coefficients of images exhibit regular but non-Gaussian properties. The wavelet transform of a signal typically consists of a small number of large coefficients and a large number of small coefficients. Most wavelet coefficients have small values and contain little signal information. On the other hand, a few wavelet coefficients have large values that contain significant signal information. Therefore, each wavelet coefficient is probabilistically in one of two states: significant state or insignificant state. A number of recent approaches [4-8] to denoising take advantage of joint statistical relationships by adaptively estimating the variance of a coefficient from a local neighborhood consisting of coefficients within a subband. This kind of local variance estimation was originally developed in the context of adaptive Wiener denoising. It is important to construct statistical model in order to accurately estimate the signal variance from the noisy signal. In practice, Wiener filter is constructed by using the estimated variance of the original signal. It is then applied to the noisy signal. Hidden Markov tree (HMT) model [6-8] uses non-Gaussian marginal probability distribution function as a two component Gaussian mixture. The components are labeled by a hidden state signifying whether the coefficient is significant or insignificant. HMT model estimates mixture variances and state probabilities by using expectationmaximization (EM) algorithm. This model suffers from high computational complexity. Simple mixture modeling to suppress noise [4] using simple classification technique is based on zerotree coding method [10]. In this paper, we present a new and simple denoising scheme using a mixture modeling of wavelet coefficients. Our method is based on a wavelet domain Wiener filter. More specifically, we propose probability weighted Wiener filter by using the probability of wavelet coefficient according to its significance such as Gaussian mixture model. The wavelet coefficient is classified as either noisy or clean by using proper threshold [2]. Based on this classification, binary mask value is produced. In contrast to HMT model, we adopt averaged binary mask value to estimate the state probabilities of wavelet coefficient. And also we develop the method of selecting windows of different sizes around the coefficient.

1612

Visual Communications and Image Processing 2003, Touradj Ebrahimi, Thomas Sikora, Editors, Proceedings of SPIE Vol. 5150 (2003) © 2003 SPIE · 0277-786X/03/$15.00

The paper is organized as follows. In section 2, overview of wavelet transform is described. The proposed method is demonstrated and discussed in section 3. In section 4, the simulation results are presented and discussed. Finally, concluding remarks are given in section 5.

2. OVERVIEW OF WAVELET TRANSFORM A wavelet decomposition of a signal is decomposition in a particular basis functions, called wavelets. A twodimensional image f (s ) is decomposed as f ( s) =

∑u l

j 0 ,l

⋅ φ jLL,l + 0

∑∑ (w j ≥ j0 l

HL j ,l

LH LH HH HH ⋅ψ HL j ,l ( s ) +w j , l ⋅ψ j , l ( s ) + w j ,l ⋅ψ j ,l ( s )) )

{LH , HL, HH } . The

with φ jLL,l = 2 j φ LL ( 2 j s − l ) , ψ Bj ,l = 2 jψ B (2 j s − l ) and B is one of 0

0

0

0

(1)

denote the

LH , HL, HH

subbands of the wavelet decomposition. Equation (1) defines the transform of f (s) to its wavelet coefficients

{w

HL LH j ,l , w j ,l

}

{

LH HH , w HH . The image is presented in terms of shifted and dilated wavelet functions ψ HL j ,l j ,l ,ψ j ,l ,ψ j ,l

}

and scaling

−j

function φ . The basis wavelets differ by their position l and level j . The factor 2 is the characteristic scale of all wavelets at level j . The higher the level is, the smaller the characteristic scale and the higher the frequency. LL

The image will be discretized on and N × N grid. This imposes a maximal level of decomposition J = log 2 N > j ≥ j 0 , with 4 j−1 wavelet coefficients in each subband and 4 j −1 scaling coefficients at each scale. A wavelet coefficient w LH t a scale j represents information about the image in the spatial region around 2 − j l . At next finest scale j + 1 , j ,l information about this region is presented by four wavelet coefficients. This leads to a quad-tree structuring of each of the three subbands as shown in Fig. 1. The orthogonal wavelet transform is good approximation for some image processing areas such as image compression. However, the orthogonal wavelet transform is not shift-invariant. The wavelet coefficients of two different shifts of an image can be very different [11], with no simple relation between them. To overcome such a shift-variant characteristic, overcomplete wavelet transform such as redundant wavelet transform, undecimated wavelet transform [12-13], complex wavelet transform [14] in introduced. In this paper, we only focus on the orthogonal wavelet transform for simplicity.

HL

LH

LH

Fig. 1 Subbands of the wavelet transform. The arrows show the relationship from parent to child.

Proc. of SPIE Vol. 5150

1613

3. PROPOSED DENOISING METHOD 3.1 GENERATING BINARY MASK VALUE A noisy image is modeled as follows: o = v+n

(2)

with n a Gaussian random filed whose components are independent and identically distributed (iid) with zero mean and known variance σ n2 . The goal is to recover the underlying image v . Transformed into the wavelet domain, the problem is as follows: y = w + n'

(3)

where n ' is also Gaussian iid with zero mean and known variance σ n2 . The goal is now to estimate w . Most wavelet coefficients have small values and contain little signal information. On the other hand, a few wavelet coefficients have large values that contain significant signal information. Therefore, each wavelet coefficient is probabilistically in one of two states: significant state or insignificant state. Given particular wavelet coefficient wl (we omit scale and subband information), it is classified into two categories. The corresponding binary decision can be represented as a binary label xl with xl ∈ {0,1} . The binary label called mask xl is

0, wl is noisy xl =  1, wl is clean

(4)

It is important to define the measure of a criterion to obtain the mask value. The measure can be the absolute magnitude of a wavelet coefficient or Holder exponent [15-16]. In this paper, we use the absolute magnitude of wavelet coefficient because the orthogonal wavelet transform is used in our method. To gain the binary mask value, a proper threshold value is also defined. We adopt a spatially adaptive threshold [2], Tl shown as follows: Tl =

σ n2 σ

(5)

where σ is the standard deviation of a clean image. Unfortunately, σ is not known, so we estimate σ . The estimated variance σ l2 is 

σ l2 = max 

1 cl

∑y

yl ∈cl

2 l

where cl is the set of neighbor coefficients around the yl , and c l spatially adaptive threshold is re-defined by Tl =

σ n2 σl



− σ n2 ,0   

(6)

is the number of neighbor coefficients. Then, the

(7)

Above threshold is a good threshold itself for image denoising. When σ n / σ l > 1 , the noise is dominated and the

1614

Proc. of SPIE Vol. 5150

threshold is chosen to be large. In this paper, we use this threshold only to obtain binary mask value. If y l ≥ Tl , then xl = 1 , or not xl = 0 . Fig. 2 shows the binary mask value for noisy Lena image. As shown in Fig. 2, most of significant coefficients are detected but some mismatched values are also shown in background region.

Fig. 2 The binary mask for noisy Lena wavelet coefficients. The set of bright point shows significant class, and the set of dark points shows insignificant class.

3.2 MIXTURE MODELING We model each wavelet coefficient wl as a realization of a linear combination of Gaussian random variables. If we let g (a; µ , σ 2 ) =

 (a − µ ) 2  exp−  2σ 2  2πσ  1

(8)

denote the Gaussian probability density function (pfd). The binary mask value xl = 1 corresponds to a zero-mean, high-variance Gaussian, then we can write f ( wl | xl = 1) = g ( wi ;0, σ 12,l )

(9)

where σ 12,l is variance for significant state. The mask value xl = 0 , in turn, corresponds to a zero-mean, low-variance Gaussian f (wl | xl = 0) = g ( wi ;0, σ 02,l )

(10)

with low-variance σ 02,l < σ 12,l . The marginal pdf f ( wl ) is obtained by a convex combination of the conditional Gaussian densities f ( wl ) = p1l ⋅ g ( wl ;0, σ 12,l ) + pl0 ⋅ g ( wl ;0, σ 02,l )

(11)

Proc. of SPIE Vol. 5150

1615

where p1l and pl0 can be interpreted as the probability that wl is significant or insignificant, and p1l = 1 − pl0 . The Gaussian mixture model is parameterized by a

{p , σ 1 l

2 2 0, l , σ 1,l

}

for each wavelet coefficient wl .

3.2. ESTIMATION OF PARAMETERS The probability of a wavelet coefficient of being significant is calculated by the simple average of xl in the neighbor coefficients including wl . That is, p 1l =

∑x

l

(12)

cl

This is very simple but efficient estimation. The mismatched values in Fig. 2 are probabilistically reduced as shown in Fig. 3.

Fig. 3 pl1 values for Lena in wavelet domain. The probability values are scaled from 0 to 255.G

Let c 1l be the set of wavelet coefficients corresponding to xl = 1 and c l0 be the set of wavelet coefficients corresponding to xl = 0 . Two variances are estimated by using maximum likelihood estimator for Gaussian pdf. That is, 

σ 12,l = max 



σ 02,l = max 

1 c1l 1 c

∑y

2 l

yl ∈c1l

∑y

2 l 0 0 l yl ∈cl



− σ n2 ,0   

(13)



− σ n2 ,0   

(14)

Variances are calculated by the set of wavelet coefficients only in their regions. Our method is simple but shows good performance compared with other denoising algorithms based on critically sampled wavelet transform.

1616

Proc. of SPIE Vol. 5150

3.4 SELECTING WINDOWS OF DIFFERENT SIZES Additionally, square-shaped windows of different sizes around the coefficient are used to estimate variance more precisely. Small size windows are selected because of the fact that the variations in the image data are considerable in these regions. On the other hand, in smoother background regions, windows with larger sizes are more efficient [3]. To determine sizes of window, we also use the probability of significance. Higher probability shows edges or texture regions, while smaller probability presents smoother regions. We modify the binary mask value based on the significance of their parents as follows: xlm = xl , j × xl , j −1

(15)

And we also define the probability of being edge, ple by using the modified mask value in the same manner in Eq. (12). The larger the value of ple corresponds to the higher probability of being edge of wl . The modified binary mask values are shown in Fig. 4. As shown in Fig. 4, isolated mask values are removed. According to the value of p le , the size of neighboring window is varied in {3 × 3,5 × 5,7 × 7,9 × 9} .

Fig. 4 xlm values for Lena in wavelet domain using the modified mask values.G

4. EXPERIMENTAL RESULTS We use the images Lena and Barbara (512X512, 8bpp) as test images. The iid Gaussian noise at different levels are generated using randn in MATLAB. For the wavelet transform, five levels of critically sampled decomposition and symmetric QMF 9-tap filter [9] are used. Simulation results of test images for fixed window size and variable window size in each coefficient are shown in Table 1 and Table 2. Results are compared to those of critically sampled wavelet denoising methods except CHMT [7]. Our method using variable window size shows more efficient for Lena image, and almost the same results for Barbara image. Especially, our results show higher PSNR under the environment of higher noise variances. Fig. 5 shows simulation results for Lena of noise variance σ n = 20 . 5. CONCLUSIONS In this paper, we propose a statistical mixture modeling of wavelet coefficients for image denoising. Well-known threshold method is used to construct the binary mask values that capture significant property of wavelet coefficients.

Proc. of SPIE Vol. 5150

1617

Based on the significant property, a probability weighted Wiener filter is proposed. And variable window size around the data point is also proposed using the probabilities of significant or insignificant wavelet coefficients. Table 1. PSNR results for several denoising methods of Lena(left) and Barbara(right) image. PSNR/ σ n SAWT[2] MAP[1] AML[3] HMT[8] CHMT[7] SSM[4] Bivariate[15] Fixed window variable window

10 34.31 34.39 33.9 34.9 34.8 34.36 34.48 34.66

15 32.39 32.36 32.38 31.8 32.5 32.51 32.56 32.80

20 31.07 31.01 30.96 30.4 31.19 31.11 31.40

25 29.24 29.98 29.84 29.5 29.9 30.15 30.03 30.40

PSNR/ σ n SAWT[2] MAP[1] AML[3] HMT[8] CHMT[7] SSM[4] Bivariate[15] fixed window variable window

10 32.57 32.67 31.9 32.4 32.25 32.72 32.77

15 29.96 30.19 30.26 29.4 30.0 29.97 30.35 30.41

20 28.36 28.59 28.65 27.8 28.36 28.75 28.81

25 27.23 27.42 27.43 27.1 27.16 27.58 27.66

G

G

Fig 5. The result for Lena image. Top-left: original image, top-right: noisy image(21.11dB), bottom-left: scaled sizes of window around the coefficient(black: 9 × 9 size, white: 3× 3 size, gray: middle sizes), bottom-right: noise reduced image(31.51dB).

1618

Proc. of SPIE Vol. 5150

6. REFERENCES 1. 2. 3.

4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15.

M. K. Mihcak, I. Kozintsev, K. Ramchandran, and P. Moulin, "Low-complexity image denoising based on statistical modeling of wavelet coefficients," IEEE Signal Processing Letters, vol. 6, pp. 300-303, 1999. S. G. Chang, B. Yu, and M. Vetterli, "Spatially adaptive wavelet thresholding with context modeling for image denoising," IEEE Trans. Image Processing,”vol. 9, no.9, pp.1522-1531, 2000. M. K. Mihcak, I. Kozintsev, K. Ramchandran, "Spatially Adaptive statistical Modeling of Wavelet Image Coefficients and Its Application to Denosing," Proc. IEEE Int. Conf. Acous., Speech and Signal Processing, vol.6, pp. 3253-3256, 1999. J. Liu and P. Moulin, "Image denoising based on scale-space mixture modeling of wavelet coefficients," Proc. IEEE Int. Conf. on Image Processing, Kobe, Japan, 1999. M. S. Crouse, R. D. Nowak, and R.G. Baraniuk, "Wavelet-based statistical signal processing using hidden Markov models," IEEE. Transaction on Image Processing, vol.46, pp. 886-902, 1998. J. K. Romberg, H. Choi, and R. G. Baraniuk, "Bayesian tree-structured image modeling using wavelet-domain hidden Markov models," IEEE. Trans. Image Processing, vol.10, no.7, pp. 1056-1068, 2001. H. Choi, J. Romberg, R. Baraniuk, and N. Kingsbury, "Hidden Markov Tree Modeling of Complex Wavelet Transforms," Proc. IEEE Int. Conf. Acous., Speech and Signal Processing, Istanbul, Turkey, June, 2000. J. K. Romberg, H. Choi, and R. Baraniuk, "Bayesian tree structured image modeling using wavelet domain hidden Markov model, " Proc. SPIE, vol.3816, pp.31-44, 1999. E. H. Adelson, E. Simoncelli, and R. Hingorani, "Orthogonal pyramid transforms for image coding, " Proc. SPIE, vol.845, pp.50-58, 1987. J. M. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients," IEEE Transaction on Signal Processing, vol.41, no.12, pp.3445-3462, 1993. E. P. Simoncelli, W. T. Freeman, E. H. Edelson, and D. J. Heeger, “Shiftable multiscale transform,” , IEEE Transaction on Information Theory, vol.38, no.2, pp.587-607, 1992. M. Malfiat and D. Roose, “Wavelet-based image denoising using a Markov random field a priori model,” ," IEEE. Transaction on Image Processing, vo.6, no.4, pp.549-565, 1997. A. Pizurica, W. Ohulips, I. Lemahieu, and M. Acheroy, “A joint inter- and intra scale statistical model for Bayesian wavelet based image denoising,” ," IEEE. Transaction on Image Processing, vol.11, no.5, pp.545-557, 2002. N. G. Kingsbury, “Image processing with complex wavelets,” Phil. Trans. Royal Society London A, vol.357, pp.2543-2560, 1999. L. Sendur and I. W. Selesnick, “Bivariate shrinkage with local variance estimation,” IEEE Signal Processing Letters, vol.9, no.12, pp.438-441, 2002.

G

Proc. of SPIE Vol. 5150

1619