Inverse Halftoning Using Wavelets y Zixiang Xiong1 , Michael T. Orchard2, and Kannan Ramchandran3
Dept of Electrical Engineering, University of Hawaii, Honolulu, HI 96822 Dept of Electrical Engineering, Princeton University, Princeton, NJ 08544 3 University of Illinois at Urbana-Champaign, Urbana, IL 61801 Revised on Oct. 25, 1997
1 2
Abstract
This paper introduces a new approach to inverse halftoning using nonorthogonal wavelets. The distinct features of this wavelet-based approach are: a) edge information in the highpass wavelet images of a halftone image is extracted and used to assist inverse halftoning, b) crossscale correlations in the multiscale wavelet decomposition are used for removing background halftoning noise while preserving important edges in the wavelet lowpass image, c) experiments show that our simple wavelet-based approach outperforms the best results obtained from inverse halftoning methods published in the literature, which are iterative in nature [1, 2].
Contacting author: Zixiang Xiong Tel: (808) 956-9648, Fax: (808) 956-3507, email:
[email protected] 1 Introduction Inverse halftoning addresses the problem of recovering a continuous-tone image from a halftoned binary image [1, 2]. Inverse halftoning has applications to image scaling, enhancement, facsimile image processing, and image compression [3]. In the halftoning process [4], a continuous-tone image is rendered into a binary image for the purpose of display or printing. Although the inverse halftoning scheme presented here works with any halftoning process, this paper focuses on the most popular one: error diusion. In error diusion, the error between the continuous-tone input and binary output at each pixel is diused over a causal neighborhood (or kernel). The simplest form of error diusion kernel is the Floyd-Steinberg kernel [5]. Since the halftoning process is a many-to-one map, inverse halftoning is an ill-posed inverse problem with no unique solution. Iterative approaches (e.g. projection onto convex sets [6]) have been studied for addressing the inverse problem by incorporating both spatial and frequency constraints [2]. Assuming the error diusion kernel is known a priori, the halftoned image itself provides an obvious spatial constraint, namely, that the recovered continuous-tone image should produce the given halftone image when it is error diused. The frequency constraints, on the other hand, y This work was supported in part by a grant from ARO under grant DAAH04-96-1-0227.
1
do not re ect known properties of the solution, but rather reasonable a priori assumptions about properties of images in general. Clearly, reconstruction quality depends heavily on the accuracy of these assumptions. A popular frequency constraint is to require inverse-halftoned images to be bandlimited [1, 2]. Algorithms based on lowpass frequency constraints recognize that the selection of bandwidth parameters plays a critical role in algorithm performance. Too much band limitation oversmooths edges, whereas too little band limitation results in noisy inverse-halftoned images. In the past, various lowpass lters have been used (e.g. halfband lowpass [1], Gaussian lowpass and optimal lowpass-based on singular value decomposition (SVD) [2]). A fundamental limitation of inverse halftoning algorithms based on global lowpass frequency constraints is that they are not able to extract useful highpass information from the halftoned image. Image edges consist of substantial highpass energy, and edges are often rendered reasonably well by good halftoning processes. Thus, global lowpass constraints ignore important highpass information embedded in the edges of the halftoned image. This paper introduces a new approach to inverse halftoning using wavelets1. The wavelet representation oers a decomposition of the halftoned image that allows us to selectively choose appropriate information from all frequency bands (i.e. lowpass information in smooth regions augmented by highpass information around edges) for good inverse halftoning performance. Our approach is motivated by recent results in signal denoising using wavelets [7, 8], and we adopt a similar denoising perspective in developing our algorithm. A global frequency constraint assumes that wavelet coecients of the signal (in our case, the desired image) decay exponentially with scale at some rate . Note: this corresponds to a Sobolev space signal model. Algorithms based on this assumption set some threshold beyond which coecients are considered to be dominated by noise. In typical images, the vast majority of coecients exhibit very fast decay rate in scale, while a few edge coecients decay at a much slower rate (i.e. in any given band, the edge coecients have much larger magnitudes). Thus, the Sobolev model is forced to choose between setting a low threshold that treats the fast decay coecients properly at the cost of throwing away important edge coecients, or setting a high threshold that preserves edge coecients at the cost of including a large number of noisy coecients with no signi cant energy. To address this problem, DeVore et al. [9] have studied modeling signals as functions from Besov spaces, and have shown that wavelet bases have very natural and attractive properties for nonlinear approximation of functions in Besov spaces (they also allow the wavelet bases to be redundant, and give an example of wavelet approximation using overcomplete representations). Loosely speaking, a Besov space is the space of functions whose wavelet coecients, ordered in decreasing order of magnitude, decay exponentially at some rate . The signi cance of the Besov model is that it characterizes a function by its N largest wavelet coecients, in contrast with the Sobolev model which characterizes a function by its rst (ordered in frequency) N coecients. It can be shown that, for piecewise smooth functions 1 In this paper, we adopt the convention that we mean the use of discrete wavelet transforms whenever we mention the use of wavelets.
2
with most coecients decaying exponentially at some high rate , and a suciently small number of edge coecients with much slower decay, the accuracy of the Besov approximation tracks the fast rate while the Sobolev approximation tracks the slow rate. In denoising applications, if the signal is modeled as a function from a Besov space corrupted by Gaussian noise, an optimal reconstruction must both identify and estimate the N largest wavelet coecients. It is signi cant that, unlike the signal, the Gaussian noise process does not generate isolated large coecients among the highband coecients. Thus, simple nonlinear strategies such as Donoho's [7] thresholding scheme which identi es the N signal coecients by selecting the largest coecients give optimal reconstructions. Non-Gaussian noise (e.g., Cauchy) was also treated by Donoho [10] using nonlinear multiresolutional analysis within his thresholding-based denoising framework. In the case of halftoning noise, various local noise phenomena (e.g., false contouring, error diusion oscillations, etc.) may also be responsible for isolated large energy highband coecients. Thus, unlike the case of Gaussian noise, it may be wrong to conclude that isolated high energy highband coecients must re ect signal components. Nevertheless, in this work we assume that good halftoning methods are successful at eliminating localized energy concentrations in regions where the signal is smooth, thus allowing us to assume that the largest isolated highband coecients are due to signal components. While it is true that halftoning noise will also contribute localized energy concentration at image edges, these corrupt the signal components but should not cause us to miss or falsely identify the important coecients. We are thus motivated to conjecture that wavelet-based denoising methods should be eective for inverse halftoning, though it may not be possible to prove the same optimality properties as in standard denoising settings. Our proposed inverse halftoning algorithm is based on the overcomplete nonorthogonal wavelets introduced in [11]. It adopts a frequency adaptive approach and demonstrates some noteworthy dierences from previous algorithms. Another wavelet-based unscreening approach was taken by Luo et al. in [12], in which a critically sampled wavelet representation was used. The overcomplete image representation provided by the nonorthogonal wavelets in [11] has certain advantages over a critically sampled wavelet representation for our problem of inverse halftoning. It characterizes an image by the local maxima of its wavelet transform modulus, and nding these local maxima is equivalent to the Canny edge detector [13]. We will implicitly make use of this property when addressing noise removal of the lowpass wavelet image in Section 3.2. Our new inverse halftoning algorithm begins by applying a nonorthogonal, overcomplete wavelet transform to a given halftone image [14]. The highpass wavelet images will be dominated by halftoning blue noise, whose power increases with respect to frequency [4]. Any good inverse halftoning algorithm must eliminate this noise. However, conventional strategies of lowpass ltering [1, 2] (i.e., throwing away the highpass images) remove important edge information, and the soft thresholding scheme described in [7] will not work well in this case either because the halftoning noise is not Gaussian. We take a novel lowpass approach to extract useful edge information in the 3
highpass images while suppressing the blue noise, providing a signi cant performance boost for our inverse halftoning algorithm. For intermediate highpass bands of the wavelet decomposition, we propose an explicit edge extraction algorithm based on cross-scale correlations and we use the resulting edges to perform spatially varying ltering of these images. This approach adaptively removes background halftoning noise while preserving important edge information in the bandpass bands. Our proposed inverse halftoning approach is universal in the sense that no a priori knowledge about the halftoning process is assumed, yet the remarkable feature of this simple algorithm is that it outperforms the best results obtained from iterative methods in [1, 2]. When the a priori information of the error diusion kernel is available, our inverse halftoning algorithm can take advantage of it, and achieve better performance.
2 Discrete Dyadic Wavelet Transform We describe the discrete dyadic wavelet transform from a signal processing point of view, referring the reader to [11] for more mathematical details. Assume f (n) is a 1-D discrete sequence of length N . The rst stage of the dyadic wavelet transform decomposes f (n) into a lowpass sequence S1f (n) and a highpass sequence W1 f (n) by passing f (n) through a lowpass lter h(n) and a highpass lter g(n) as shown in Fig. 1 (a); the original sequence f (n) can be recovered from S1 f (n) and W1f (n) (both are of length N ) by using them as inputs to the system of Fig. 1 (b), where h(?n) and k(n) are the reconstructing lowpass and highpass lters, respectively. The lters h(n), g (n) and k(n) are designed so that the perfect reconstruction condition jH (! )j2 + G(! )K (! ) = 1 is satis ed. These lters are connected with wavelet bases via the following three frequency domain relationships: !
(1)
!
(2)
(2! ) = e?j 2 H (! )(! ); (2! ) = e?j 2 G(! )(! ); !
?(2! ) = ej 2 K (! )(! );
(3)
where (x) is the smoothing function, (x) the analysis wavelet, and (x) the synthesis wavelet. The lters we use are given in [11], the corresponding wavelet (x) is the derivative of a cubic spline which is close to a Gaussian function. To generate a multiscale discrete dyadic wavelet transform, we repeat the system of Fig. 1 on the lowpass sequences. All lters used at scale s (s > 0) are upsampled by a factor of 2s compared with those at scale zero. It is straightforward to generalize the discrete dyadic wavelet transform from 1-D sequences to 2D images. Assume f (m; n) is an image of size M N . At each scale s, with s > 0 and S0 f = f (m; n), the wavelet transform decomposes Ss?1 f into a lowpass image Ss f , a horizontal highpass image WsH f and a vertical highpass image WsV f , from which the Ss?1 f can be reconstructed by the
4
inverse wavelet transform2. Note that all three wavelet images Ss f , WsH f and WsV f at scale s are of size M N , which is the same as Ss?1 f .
3 Inverse Halftoning Using Wavelets Let the given halftone image be f (m; n). Conventional lowpass techniques in inverse halftoning only use lowpass information (i.e., S1f ) of f (m; n) [1, 2]. In our new wavelet approach, we use information of the wavelet representation of f (m; n) in all frequency bands. The block diagram of the proposed wavelet-based inverse halftoning scheme is shown in Fig. 2.
3.1 Edge extraction from highpass wavelet images In our wavelet approach, useful edge information can be extracted from highpass wavelet images W1H f and W1V f to assist inverse halftoning. We use a simple Gaussian lowpass lter to extract horizontal edges from W1V f and vertical edges from W1H f . Once this edge information is combined with the lowpass halftone image via the inverse wavelet transform, we get an inverse-halftoned m2 +n2 image of much higher quality. The Gaussian lter takes the form of h(m; n) = ke? 22 , for ?3 m; n 3, where k is a scaling factor so that the DC gain of the lter is 1. 2 controls the frequency response of the Gaussian lter, which in turn determines the amount of useful edge information to be extracted. In our experiments, we choose 2 = 2 for all images as it gives the best inverse halftoning performance.
3.2 Edge-preserving noise removal of the lowpass wavelet image After the rst wavelet lowpass operation, some halftoning patterns are still visible in smooth regions of S1 f . This is due to noise introduced in the lowpass band during the halftoning process. To remove the halftoning noise in S1 f and at the same time protect important edges in its non-smooth regions, we exploit cross-scale correlations among WsO f 's (s = 2; 3; O 2 fH; V g) (see Fig. 3) oered by the multiscale wavelet transform for edge extraction [15]. Locations with cross-scale correlation above a certain threshold are identi ed as edges, while low cross-scale correlation regions are treated as background, and noise in these regions are suppressed. Two edge maps (one vertical, and another horizontal) are generated by directly multiplying cross-scale highpass wavelet coecients as follows:
E H (m; n) = W2H f (m; n)W3H f (m; n);
(4)
(5) E V (m; n) = W2V f (m; n)W3V f (m; n); for 0 m M ? 1 and 0 n N ? 1. Finally, edge extraction is done by thresholding the sum 2 We omit the 2-D indices (m; n) in Ss f , WsH f , WsV f and Ss?1 f for the sake of notational simplicity.
5
of the two edge maps: E H and E V . That is: ( if E H (m; n) + E V (m; n) > T; E (m; n) = 01;; otherwise ;
(6)
where E (m; n) = 1 means S1f (m; n) is an edge pixel, while E (m; n) = 0 means S1 f (m; n) belongs to a background region. The purpose of merging two directional edge maps into one is for better extraction of diagonal edges [16]. The actual noise removal from S1 f is carried out by setting
W2H f (m; n) = W2V f (m; n) = 0; if E (m; n) = 0:
(7)
This noise removal process from S1 f is depicted in Fig. 4. It certainly depends on the threshold T in (6), but the overall inverse halftoning performance is not too sensitive to T . We experimentally determine the best value of T for inverse halftoning. For the results reported in the next section, T is set to be 60.
4 Experimental Results
We experimented on the standard 512 512 Lena and Peppers images. Fig. 6 (a) shows the halftoned Lena and Peppers images using the Floyd-Steinberg error diusion [5]. PSNR between the original continuous-tone image and the inverse-halftoned image is used as the performance measure. For the sake of judging the visual quality, all images (in GIF format) related to the paper are available at http://spectra.eng.hawaii.edu/~zx/paper/halftone.html.
4.1 Comparisons of lowpass lters
We rst compare the performance of the wavelet lowpass lter with other lters that were previously used in inverse halftoning, namely, the 9 9 Gaussian lter [2] and the halfband lowpass lter [1]. The error diused Lena and Peppers images are processed by using dierent lowpass lters, and PSNR numbers between the resulting images and the original continuous-tone Lena and Peppers are show in Table 1, which indicate that the wavelet lowpass lter is superior to others. Gaussian Halfband Wavelet Lena 28.64 dB 29.60 dB 30.38 dB Peppers 27.59 dB 27.76 dB 28.56 dB Table 1: Performance comparison of dierent lowpass lters for the Peppers image.
4.2 Blind inverse halftoning using wavelets
Our inverse halftoning algorithm uses edge extraction for noise removal from the the wavelet lowpass image and the useful information from the highpass images. The reconstructed Lena and Peppers images obtained from our inverse halftoning algorithm are shown in Fig. 6 (b). The PSNRs with respect to the original continuous-tone images are 31.50 dB for Lena and 30.43 dB for Peppers, respectively. These PSNRs are higher that the best results obtained by the iterative method using halfband lowpass ltering and statistical smoothing in [1] (see Table 2). 6
Results in [1] This paper Lena 31.00 dB 31.50 dB Peppers 29.30 dB 30.43 dB Table 2: Comparisons of inverse halftoning performances, assuming the error diusion kernel is unknown.
4.3 Inverse halftoning using wavelets assuming the error diusion kernel is known Note that the above inverse halftoning results are obtained without using any a priori knowledge of the error diusion kernel. If the error diusion kernel is known, we can use this extra information and achieve better inverse halftoning performance. We rst apply a MAP projection operator as used in [1] to the continuous-tone image after blind inverse halftoning. Using information about the error diusion kernel, the MAP projector makes minimum adjustment at the pixel level of the continuous-tone image so that the resulting image, once error diused, gives the same original halftone image. The MAP projector is followed by a similar statistical smoothing operator as used in [1] to give a nal smooth continuous-tone image, as shown in Fig. 5. The statistical smoothing operator is a nonlinear process that intends to \smooth out" the spiky noise introduced to the image background by the MAP projector. The inverse halftoning results after one MAP projection and smooth operation are given in Table 3, from which we see that using the kernel information gives about 0.2 dB gain over blind inverse halftoning. Finally, comparing Table 2 and 3, we see that, for the Peppers image, even without knowing the error diusion kernel, our wavelet-based inverse halftoning algorithm outperforms the best result achievably by the iterative method of [1] which has access to the kernel information. Results in [1] This paper Lena 32.00 dB 31.67 dB Peppers 30.30 dB 30.69 dB Table 3: Comparisons of inverse halftoning performances, assuming the error diusion kernel is known.
5 Conclusions In this paper we have introduced a new approach to inverse halftoning using nonorthogonal wavelets. It represents an improvement over the state-of-the-art of inverse halftoning. Although theoretical work in [9, 7] indicates that wavelets play a fundamental role in inverse halftoning, the the better performance of our wavelet-based approach is the real proof. Since we do not claim optimality in our proposed algorithm, future research directions include nding the best threshold T and new inverse halftoning approaches based on nonlinear multiresolutional analysis [10]. 7
6 Acknowledgement The rst author would like to thank Dr. Ricardo de Queiroz at Xerox Corp. for interesting discussions on inverse halftoning, and Dr. P. Wong from HP Labs for providing the original Lena and Peppers images.
References [1] P. W. Wong, \Inverse Halftoning and Kernel Estimation for Error Diusion", IEEE Trans. Image Processing, vol. 4, pp. 486-498, April 1995. [2] S. Hein and A. Zakhor, \Halftone to Continuous-Tone Conversion of Error-Diusion Coded Images", IEEE Trans. Image Processing, vol. 4, pp. 208-216, February 1995. [3] M. Y. Ting and E. A. Riskin, \Error-Diused Image Compression Using a Binary-to-GrayScale Decoder and Predictive Pruned Tree-Structured Vector Quantization", IEEE Trans. Image Processing, vol. 3, pp. 854-857, November 1994. [4] R. A. Ulichney, Digital Halftoning, Cambridge, MA: MIT Press, 1987. [5] R. Floyd and L. Steinberg, \An Adaptive Algorithm for Spatial Grey Scale", SID Int. Symp. Dig. Tech. Papers, pp. 36-37, 1975. [6] D. C. Youla and H. Webb, \Image Restoration by the Method of Convex Projections: Part I{Theory", IEEE Trans. Med. Imaging. vol. MI-1, pp. 81-94, October 1982. [7] D. Donoho, \Denoising by Soft-Thresholding", IEEE Trans. Inform. Theory vol. 41, pp. 613627, May 1995. [8] M. Lang et al., \Noise Reduction Using An Undecimated Discrete Wavelet Transform", Signal Processing Newsletters, vol. 3, pp. 10-13, January 1996. [9] R. DeVore, B. Jawarth, and B. Lucier, \Image Compression Through Wavelet Transform Coding", IEEE Trans. Inform. Theory vol. 38, pp. 719-746, March 1992. [10] D. Donoho, \Nonlinear Wavelet Methods for Recovering Signals, Images, and Densities from indirect and noisy data." Technical Report 437, Stanford University, July 1993. [11] S. Mallat and S. Zhong, \Characterization of Signals from Multiscale Edges", IEEE Trans. Patt. Anal. Machine Intell., vol. 14, pp. 710-732, July 1992. [12] J. Luo, R. de Queiroz, and Z. Fan, \A Wavelet-based Universal Unscreening Technique", Submitted to IEEE Signal Processing Letters, 1996. [13] J. Canny, \A Computational Approach to Edge Detection", IEEE Trans. Patt. Anal. Machine Intell., vol. 8, pp. 679-698, 1986. [14] Z. Xiong, M. T. Orchard, and K. Ramchandran, \Inverse Halftoning Using Wavelets", Proc. ICIP'96, Lausanne, Switzerland, September 1996. [15] Y. Xu, J. B. Weaver, D. M. Healy, Jr. and J. Lu, \Wavelet Transform Domain Filters: A Spatially Selective Noise Filtration Technique", IEEE Trans. Image Processing, vol. 3, pp. 747-758, November 1994. [16] I. Daubechies, Personal Communication, 1996. 8
f(n)
h(n)
S 1 f(n)
S 1 f(n)
h(-n)
g(n)
W1 f(n)
W1 f(n)
k(n)
(a)
^
f(n)
(b)
Figure 1: Discrete dyadic wavelet transform. (a) The rst stage of the dyadic wavelet transform decomposes f (n) into sequences S1 f (n) and W1f (n). (b) S1f (n) and W1 f (n) are ltered and summed to reconstruct the original sequence f (n).
S1 f f(m,n) one scale WT
H
W1 f V
W1 f
Edge-preserving noise removal
Edge extraction by Gaussian lowpass filtering
one scale y(m,n) IWT
Edge extraction by Gaussian lowpass filtering
Figure 2: Block diagram of the proposed wavelet-based inverse halftoning scheme.
S3 f S2 f S1 f
Second scale WT
H W2
f
Third scale WT
H
W3 f V
W3 f
V
W2 f
Figure 3: Multiscale discrete dyadic wavelet transform. 9
S2 f S1 f
one scale WT
H
W2 f V
H
W2 f [m,n]=0, if E(m,n)=0
one scale IWT
W2 f V
W2 f [m,n]=0, if E(m,n)=0
Figure 4: Edge-preserving noise removal of the lowpass wavelet image S1 f .
Wavelet-based blind inverse halftoning
MAP projection
Statistical smoothing
Figure 5: Inverse halftoning using information about the error diusion kernel.
10
(a)
(b) Figure 6: (a) Error-diused Lena and Peppers images using Floyd-Steinberg kernel. (b) Reconstructed images using our proposed blind inverse halftoning algorithm. The PSNRs with respect to the original continuous-tone images are 31.50 dB for Lena, and 30.43 dB for Peppers, respectively.
11
List of Tables 1 2 3
Performance comparison of dierent lowpass lters for the Peppers image. : : : : : : Comparisons of inverse halftoning performances, assuming the error diusion kernel is unknown. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Comparisons of inverse halftoning performances, assuming the error diusion kernel is known. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :
6 7 7
List of Figures 1 2 3 4 5 6
Discrete dyadic wavelet transform. (a) The rst stage of the dyadic wavelet transform decomposes f (n) into sequences S1 f (n) and W1 f (n). (b) S1f (n) and W1f (n) are ltered and summed to reconstruct the original sequence f (n). : : : : : : : : : : : : 9 Block diagram of the proposed wavelet-based inverse halftoning scheme. : : : : : : : 9 Multiscale discrete dyadic wavelet transform. : : : : : : : : : : : : : : : : : : : : : : 9 Edge-preserving noise removal of the lowpass wavelet image S1 f . : : : : : : : : : : : 10 Inverse halftoning using information about the error diusion kernel. : : : : : : : : : 10 (a) Error-diused Lena and Peppers images using Floyd-Steinberg kernel. (b) Reconstructed images using our proposed blind inverse halftoning algorithm. The PSNRs with respect to the original continuous-tone images are 31.50 dB for Lena, and 30.43 dB for Peppers, respectively. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 11
12