A Modified Vector Quantization Based Image Compression Technique Using Wavelet Transform

Jayanta Kumar Debnath, Newaz Muhammad Syfur Rahim, and Wai-keung Fung

Abstract—An image compression method combining the discrete wavelet transform (DWT) and vector quantization (VQ) is presented. First, a three-level DWT is performed on the original image, resulting in ten separate subbands. Ten codebooks, one per subband, are generated using the Self-Organizing Feature Map (SOFM) algorithm and are used to vector quantize the wavelet-transformed subband images. The VQ indices are then Huffman coded to increase the compression ratio. A novel iterative error correction scheme is proposed that repeatedly checks the image quality after sending the Huffman coded bit stream of the error codebook indices through the channel, so as to improve the peak signal-to-noise ratio (PSNR) of the reconstructed image. Ten error codebooks (one for each subband of the wavelet-transformed image) are also generated for the error correction scheme, using the difference between the original and the reconstructed images in the wavelet domain. The proposed method shows better image quality in terms of PSNR at the same compression ratio compared with other DWT and VQ based image compression techniques found in the literature. The proposed method is useful for applications in which high quality (i.e., high precision) is critical, such as criminal investigation and medical imaging.

Index Terms—Vector Quantization, Wavelet Transform, Compression Ratio.

J. K. Debnath and W. K. Fung are with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada (email: [email protected]). N. M. S. Rahim is with the Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh (email: [email protected]).
I. INTRODUCTION

Data compression is the process of converting data files into smaller files for efficient storage and transmission. There are two types of image compression techniques, namely i) lossless and ii) lossy compression. In this work a lossy compression technique is used. A wavelet representation provides access to a set of data at various levels of detail. However, wavelet analysis differs from Fourier analysis in that the individual wavelet basis functions describing the different signal frequencies are localized rather than global. Advantages of wavelet analysis include the following: very good image approximation with just a few coefficients [1]; and the ability to extract and encode edge information [1], which provides important visual cues for differentiating images. Moreover, the coefficients of a wavelet decomposition provide information that is independent of the original image resolution, so a wavelet based scheme allows us to easily compare images of different resolutions. Finally, wavelet decompositions are fast and easy to compute, requiring linear time in the size of the image [1].
The wavelet transform does not reduce the amount of data present in the image; it is simply a different representation of the image. Vector quantization, a form of lossy compression, on the other hand can reduce the amount of data in the image. In this work, a combined approach to image compression based on the wavelet transform [1] [2] and vector quantization [3] is presented. The proposed method gives superior results and is in general applicable to any image. It is particularly suited to those areas of digital imaging where a high-precision reconstructed image is required, such as criminal investigation and medical imaging. The proposed method also allows users to specify an arbitrary image quality (in terms of PSNR) at the expense of compression ratio (CR). The method is tested on gray-scale images, but it can easily be extended to color image compression by processing the three R, G, B color matrices individually.

The rest of the paper is organized as follows. Section II describes the introductory concepts of vector quantization. Section III describes the introductory concepts of the wavelet transform and the discrete wavelet transform. Section IV introduces the proposed image compression method; the compression steps (codebook generation, encoding, and decoding) are explained in detail in this section. Section V presents the experimental results obtained with the proposed technique, and Section VI concludes the paper.

II. PRINCIPLE OF VECTOR QUANTIZATION

Vector quantization is a powerful tool for digital image compression. A vector quantizer (VQ) is defined as a mapping Q from the K-dimensional Euclidean space R^K into a finite subset Y of R^K:

Q : R^K → Y, where Y = {ȳ_i ; i = 1, 2, 3, ..., N}

is the set of reproduction vectors, called the vector quantizer codebook, and N is the number of vectors in Y. Different algorithms exist for generating the codebook, such as the Linde, Buzo and Gray (LBG) algorithm [3] and the Self-Organizing Feature Map (SOFM) [4] [5]; the SOFM algorithm is used in the proposed method. SOFM is a neural network model consisting of one input layer and one output layer. Each input node is connected with the output nodes by adaptive weights, as shown in Fig. 1. SOFM produces the codebook for vector quantization by modifying the weights between the input nodes and the output nodes. The schematic diagram of a full-search vector quantizer is shown in Fig. 2. A full-search vector quantizer consists of a codebook, and all the pixels of the input image are assigned to one of the codebook entries.
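To make the SOFM-based codebook design and the full-search assignment concrete, the following is a minimal NumPy sketch. The training-vector dimension K, the codebook size, the learning-rate and neighbourhood schedules, and all function names are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def train_sofm_codebook(vectors, codebook_size=128, epochs=10,
                        lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D SOFM whose weight vectors serve as the VQ codebook.

    vectors: (num_vectors, K) training vectors, e.g. small blocks of a subband.
    Returns a (codebook_size, K) array of codewords.
    """
    rng = np.random.default_rng(seed)
    K = vectors.shape[1]
    codebook = rng.standard_normal((codebook_size, K)) * vectors.std() + vectors.mean()
    sigma0 = sigma0 if sigma0 is not None else codebook_size / 4.0
    n_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(vectors):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                    # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 1e-3)   # shrinking neighbourhood
            # Winner (best-matching) output node for this input vector.
            bmu = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
            # Gaussian neighbourhood around the winner on the 1-D output grid.
            grid_dist = np.abs(np.arange(codebook_size) - bmu)
            h = np.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))
            # Pull the winner and its neighbours towards the input vector.
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return codebook

def vq_encode(vectors, codebook):
    """Full-search VQ: index of the nearest codeword for every input vector."""
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def vq_decode(indices, codebook):
    """Reconstruction is a simple table lookup."""
    return codebook[indices]
```

In the proposed method one such codebook would be trained per wavelet subband, as described in Section IV.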
Fig. 1. Schematic Explanation of the SOFM algorithm.

Fig. 2. Schematic Explanation of Vector Quantization.

Compression is achieved by transmitting the codebook indices instead of the image pixels. So, if the codebook is of size 128, only 7 bits are required to transmit each codebook index. The vector quantizer requires the same codebook at the decoding end; the decoder simply receives the codebook indices and reconstructs the pixels of the image.

III. PRINCIPLE OF WAVELET TRANSFORM

Wavelet analysis is a technique to transform an array of N numbers from their actual numerical values to an array of N wavelet coefficients. Each wavelet coefficient represents the closeness of the fit (or correlation) between the wavelet function at a particular size and a particular location within the data array. By varying the size of the wavelet function (usually in powers of two) and shifting the wavelet so that it covers the entire array, one can build up a picture of the overall match between the wavelet function and the data array. The wavelet transform decomposes an image into a set of different-resolution sub-images corresponding to the various frequency bands. Wavelets are a class of functions used to localize a given signal in both the space and scaling domains. Wavelets automatically adapt to both the high-frequency and the low-frequency components of a signal through different window sizes [1] [2]. Wavelets are functions generated from one single function ψ, called the mother wavelet, by dilations (a) and translations (b):

ψ_{a,b}(x) = |a|^{-1/2} ψ((x − b)/a),    (1)

where ψ must satisfy the conditions

∫_{−∞}^{∞} ψ(x) dx = 0   and   ∫_{−∞}^{∞} |ψ(x)|^2 dx = 1.

The wavelet transform is the representation of an arbitrary signal x(t) as a decomposition over the wavelet basis, i.e., x(t) is written as an integral over a and b of ψ_{a,b}. In this work the discrete wavelet transform (DWT) is used; it is the discretized version of the continuous wavelet transform (as defined by equation (1)) for efficient computer implementation. The DWT of a signal x(t) is defined by

x(t) = Σ_{m,n} c_{m,n} ψ_{m,n}(t),   where   c_{m,n} = 2^{−m/2} ∫_{−∞}^{∞} x(t) ψ_{m,n}(t) dt.

The coefficients c_{m,n} characterize the projection of x(t) onto the basis formed by ψ_{m,n}(t).
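As a quick numerical illustration of this decomposition and synthesis (the Haar wavelet and a random test signal are assumptions made only for this sketch, using the PyWavelets library):

```python
import numpy as np
import pywt  # PyWavelets

rng = np.random.default_rng(0)
x = rng.standard_normal(256)               # stand-in for the signal x(t)

coeffs = pywt.wavedec(x, "haar", level=3)  # [cA3, cD3, cD2, cD1]: coefficients per scale
x_rec = pywt.waverec(coeffs, "haar")       # synthesis from the coefficients

print(np.allclose(x, x_rec))               # True: the coefficients fully describe the signal
```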
The DWT is implemented using the subband coding method. The subband coding process consists of a filter bank (a series of filters with different cut-off frequencies) used to analyze the signal at different scales. The procedure starts by passing the signal through a half-band high-pass filter and a half-band low-pass filter; the filtered signals are then down-sampled, and the resulting signal is processed in the same way again. This process produces the sets of wavelet transform coefficients from which the signal can be reconstructed.

IV. PROPOSED IMAGE COMPRESSION METHOD

In this work a lossy compression technique is used. A combined approach to image compression based on the wavelet transform [2] and vector quantization [3] is presented. The proposed method gives superior results over other wavelet transform and vector quantization based image compression methods (as demonstrated in Section V). It is applicable to those areas of digital imaging where a high-precision reconstructed image is required, such as criminal investigation and medical imaging. Moreover, if a user needs a better quality image, he simply has to ask the transmitting end to transmit an image of that quality.
Fig. 3. Ten subbands of the image Lenna after applying 3-level 2D DWT, in a single frame.
This method is tested on gray-scale images, but it can easily be extended to color images by processing the three color matrices separately. In this work a 3-level 2-D DWT is first applied to the test image (i.e., the image to be compressed) and then VQ is applied to the different subbands for compression. Ten subbands are created after applying the 3-level 2-D DWT; ten codebooks are generated using SOFM, one for each subband, and each subband is vector quantized with its own codebook. A 3-level 2-D DWT is applied because the low-frequency subband, which contains the maximum energy content of the original image, becomes small in size, so that for vector quantization this subband can be treated with a codebook of 7 bits (128 entries). The VQ indices are then subjected to Huffman coding [6] to improve the compression ratio of the transmitted data. The whole compression process of this work is divided into three steps: i) codebook generation, ii) encoding of the original image, and iii) decoding of the image. Each of these steps is discussed below.

A. Codebook generation step

The proposed method uses a total of twenty codebooks: ten codebooks for reconstructing the original image and another ten for reconstructing the error images. In the codebook generation step (i.e., the training stage), four different standard images (namely Lenna, Couple, Frog, and Baboon, as shown in Fig. 6) are used to generate the ten original codebooks and the ten error codebooks. For the ten original codebooks, a 3-level 2-D DWT is applied to each of the training images, generating ten wavelet subbands for each image. Similar subbands of the four images are then combined into a single frame, and this frame is treated as a new image. Fig. 3 depicts the wavelet subbands of the image Lenna after applying the 3-level 2-D DWT.
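As an illustration of this decomposition step, the ten subbands and the training vectors for the SOFM can be obtained as in the sketch below. The wavelet family ("haar") and the 2x2 block size used to form the VQ training vectors are assumptions, since the paper does not state them.

```python
import numpy as np
import pywt  # PyWavelets

def dwt_subbands(image, wavelet="haar", levels=3):
    """Ten subbands of a 3-level 2-D DWT as a flat list.

    wavedec2 returns [cA3, (cH3, cV3, cD3), (cH2, cV2, cD2), (cH1, cV1, cD1)]:
    one approximation subband plus three detail subbands per level.
    """
    coeffs = pywt.wavedec2(np.asarray(image, dtype=float), wavelet, level=levels)
    subbands = [coeffs[0]]                 # lowest-frequency (maximum energy) subband
    for details in coeffs[1:]:
        subbands.extend(details)           # horizontal, vertical and diagonal details
    return subbands

def blocks_from_subband(subband, block=2):
    """Split a subband into non-overlapping block x block vectors for SOFM/VQ training."""
    h = (subband.shape[0] // block) * block
    w = (subband.shape[1] // block) * block
    s = subband[:h, :w]
    return (s.reshape(h // block, block, w // block, block)
             .swapaxes(1, 2)
             .reshape(-1, block * block))
```

The returned list contains one approximation subband and nine detail subbands, matching the ten subbands used by the method.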
Fig. 4. Detailed schematic diagram of the encoder.
Therefore, ten separate images are available at this stage. Using these ten images, ten separate codebooks are generated using SOFM. Then, in the error codebook generation step, the ten subband images are vector quantized using these ten codebooks and reconstructed. The ten reconstructed subband images are compared with the original ten subband images in the wavelet domain, and the resulting errors are used to generate the error codebooks, again using SOFM.

B. Encoding process

Details of the encoding process are shown schematically in Fig. 4. In this step, a 3-level 2-D DWT is applied to the test image (i.e., the image to be compressed). Each of the resulting ten subbands is then vector quantized using the original codebooks, with a separate codebook for each subband. The codebook indices of this VQ process are Huffman coded and transmitted to the decoder. At the encoder end, the image is reconstructed from the transmitted indices, and the peak signal-to-noise ratio (PSNR) of this reconstructed image is calculated to test the image quality. If the calculated PSNR is higher than or equal to the desired PSNR, the process ends; otherwise the iterative error correction method is executed. In this iterative error correction method, the image error (I.E.) between the original image and the reconstructed image (R.I.) is calculated in the wavelet domain. Vector quantization using the error codebooks is then applied to these subband errors.
Fig. 5. Detailed schematic diagram of the decoder.
The error codebook indices are also Huffman coded and transmitted to the decoder. At the encoder (transmission) end, the error image is reconstructed from these error codebook indices. The reconstructed image errors (R.I.E.) are then added (algebraically) to the previously reconstructed image, and thus the R.I. is updated. This iterative error correction process continues until the PSNR of the updated reconstructed image is larger than or equal to the desired PSNR, or until the maximum number of iterations is reached (to avoid an infinite loop, the process is forced to stop at the third iteration).
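The encoder-side loop just described can be summarized by the following sketch. The PSNR definition (10·log10(255²/MSE)) is the standard one assumed here, the callables stand in for the DWT/VQ machinery sketched earlier, the default target PSNR is arbitrary, and the limit of two error loops follows the forced stop at the third iteration mentioned above; the function names are illustrative.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Standard PSNR in dB for 8-bit images (assumed definition)."""
    mse = np.mean((np.asarray(original, float) - np.asarray(reconstructed, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def encode_with_error_loops(image, forward_dwt, inverse_dwt,
                            vq_round_trip, error_vq_round_trip,
                            target_psnr=38.0, max_error_loops=2):
    """Encoder-side iterative error correction, as described in Section IV-B.

    forward_dwt(image)            -> list of subband arrays
    inverse_dwt(subbands)         -> reconstructed image array
    vq_round_trip(band, k)        -> subband k quantized/dequantized with its original codebook
    error_vq_round_trip(band, k)  -> error subband k quantized/dequantized with its error codebook
    """
    original_bands = forward_dwt(image)
    # First pass: every subband is vector quantized with its own codebook.
    recon_bands = [vq_round_trip(b, k) for k, b in enumerate(original_bands)]
    reconstructed = inverse_dwt(recon_bands)
    loops = 0
    # At most two error loops, i.e. a forced stop at the third iteration
    # when the original reconstruction is counted as the first.
    while psnr(image, reconstructed) < target_psnr and loops < max_error_loops:
        # Image error (I.E.) between original and reconstruction, in the wavelet domain.
        error_bands = [ob - rb for ob, rb in zip(original_bands, recon_bands)]
        corrections = [error_vq_round_trip(e, k) for k, e in enumerate(error_bands)]
        # Add the reconstructed image errors (R.I.E.) back to the reconstruction.
        recon_bands = [rb + c for rb, c in zip(recon_bands, corrections)]
        reconstructed = inverse_dwt(recon_bands)
        loops += 1
    return reconstructed, loops
```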
Fig. 6. Different images used for training the proposed algorithm (top left: Lenna, top right: Baboon, bottom left: Couple, bottom right: Frog).
C. Decoding process

Details of the decoding process are shown schematically in Fig. 5. The decoder first receives from the channel the Huffman coded bit-stream of the VQ indices corresponding to the original wavelet coefficients. It then reconstructs the codebook indices of the different wavelet subbands. In the initial stage the receiver obtains the reconstructed image, and in the subsequent steps it receives the image errors (more precisely, it receives the Huffman coded error indices and reconstructs the image error coefficients from them). The receiver adds (algebraically) the received errors of each subband. In the final step the image is reconstructed using the 3-level inverse 2-D DWT.

V. EXPERIMENTAL RESULTS

For generating the different codebooks, four images, namely Lenna, Couple, Frog, and Baboon, are used, as shown in Fig. 6. The different codebook sizes used in this work are listed in Table I.
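For completeness, the decoding procedure of Section IV-C can be sketched as follows. The helper layout (per-subband index lists, codeword arrays, block size) mirrors the earlier sketches and is an assumption, and Huffman decoding of the bit stream is omitted.

```python
import numpy as np
import pywt  # PyWavelets

def decode_image(index_streams, error_index_streams, codebooks, error_codebooks,
                 subband_shapes, wavelet="haar", block=2):
    """Decoder side of Section IV-C (Huffman decoding of the bit stream omitted).

    index_streams[k]          : first-pass VQ indices of subband k
    error_index_streams[i][k] : VQ indices of error loop i for subband k
    codebooks[k], error_codebooks[k] : (N, block*block) codeword arrays
    subband_shapes[k]         : (rows, cols) of subband k, assumed multiples of `block`
    """
    def subband_from_indices(indices, codebook, shape):
        # Look up codewords and tile them back into a subband array
        # (inverse of the block extraction used at the encoder).
        vecs = codebook[np.asarray(indices)]
        rows, cols = shape
        return (vecs.reshape(rows // block, cols // block, block, block)
                    .swapaxes(1, 2)
                    .reshape(rows, cols))

    # First-pass subbands, then algebraic addition of every received error correction.
    bands = [subband_from_indices(idx, cb, shp)
             for idx, cb, shp in zip(index_streams, codebooks, subband_shapes)]
    for loop in error_index_streams:
        bands = [b + subband_from_indices(idx, ecb, shp)
                 for b, idx, ecb, shp in zip(bands, loop, error_codebooks, subband_shapes)]

    # Repack the ten subbands into the nested layout expected by waverec2
    # and apply the 3-level inverse 2-D DWT.
    approx, details = bands[0], bands[1:]
    coeffs = [approx] + [tuple(details[i:i + 3]) for i in range(0, len(details), 3)]
    return pywt.waverec2(coeffs, wavelet)
```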
Fig. 7. Schematic of the wavelet subbands after applying 3-level 2-D DWT to an image
TABLE I
DETAILS OF THE DIFFERENT CODEBOOK SIZES USED IN THIS WORK

Wavelet Subband | Original Codebook Size | Error Codebook Size
cA2             | 128 = 2^7              | 32 = 2^5
cV2             | 32 = 2^5               | 32 = 2^5
cD2             | 32 = 2^5               | 32 = 2^5
cH2             | 32 = 2^5               | 32 = 2^5
cH1             | 16 = 2^4               | 16 = 2^4
cD1             | 16 = 2^4               | 16 = 2^4
cV1             | 16 = 2^4               | 16 = 2^4
cH              | 16 = 2^4               | 16 = 2^4
cD              | 16 = 2^4               | 16 = 2^4
cV              | 16 = 2^4               | 16 = 2^4
The different notations used in this table (cA2, etc.) correspond to the different subbands of the test image after applying the 3-level 2-D DWT, as shown in Fig. 7. As mentioned previously, the lowest wavelet subband (i.e., the subband with the maximum energy content), here cA2, is given higher priority in encoding: a 7-bit data stream is used for its indices. For the other subbands cV2, cD2, and cH2, 5-bit data streams are used (the remaining subbands use 4-bit streams, as listed in Table I). Similarly, for the error subbands cA2, cV2, cD2, and cH2, 5-bit data streams are used.

The images Peppers, Boat, Plane, and Woman, shown in Fig. 8, are used to test the proposed method. Examples of original and reconstructed images obtained with the proposed method are shown in Fig. 9. The PSNR values and compression ratios (CR) obtained in the testing phase for the different images are listed in Table II. The compression ratios are calculated after Huffman coding of the VQ indices. All of the experimental images are of size 512 × 512 pixels. As shown in Table II, the PSNR of the image Peppers is 30.70 dB at a compression ratio of 38.94 in the original reconstruction (i.e., the first iteration); in the first error iteration (i.e., the second iteration) the PSNR increases to 35.18 dB at a compression ratio of 22.66. Since the Huffman coded error codebook indices must also be transmitted to the decoder, the compression ratio decreases. In the second error iteration (i.e., the third iteration, counting the original reconstruction as the first) the PSNR increases to 38.62 dB with a further decreased compression ratio of 18.44. Simulation results for the other test images are listed in the same way in Table II.

Table III compares the experimental results of the proposed method (taken from Table II) with the relevant methods proposed by Rahim and Yahagi [7], by Wang et al. [8], and by Khalifa [9]. The image compression technique of Rahim and Yahagi [7] is a Feature Map Finite State Vector Quantization (FMFSVQ) based method. The techniques of Wang et al. [8] and Khalifa [9] are wavelet and vector quantization based, but without the error correction scheme. It is clear from Table III that our proposed method gives superior results. Note that in Table III the different methods are compared on different images of the proposed method, because those methods reported experiments only for those particular images.
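The compression ratios above are obtained after Huffman coding the VQ index streams. As a rough illustration of that bookkeeping, the sketch below computes Huffman code lengths from the index frequencies and estimates CR as original bits divided by total coded bits; the helper names and the omission of side information (codebooks, Huffman tables) are simplifying assumptions.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(symbols):
    """Huffman code length per symbol, built from the symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                     # degenerate stream with one distinct index
        return {next(iter(freq)): 1}
    ties = count()                         # keeps heap comparisons well defined
    heap = [(f, next(ties), {s: 0}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, next(ties), merged))
    return heap[0][2]

def compression_ratio(index_streams, image_shape=(512, 512), bits_per_pixel=8):
    """CR taken as (original bits) / (total Huffman-coded bits of all index streams)."""
    coded_bits = 0
    for stream in index_streams:           # one index stream per subband (and per error loop)
        lengths = huffman_code_lengths(stream)
        coded_bits += sum(lengths[s] for s in stream)
    # Side information (codebooks, Huffman tables) is ignored in this rough estimate.
    original_bits = image_shape[0] * image_shape[1] * bits_per_pixel
    return original_bits / coded_bits
```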
Fig. 8. Different images used for testing the proposed algorithm (top left: Plane, top right: Woman, bottom left: Boat, bottom right: Peppers).

TABLE II
SIMULATION RESULTS OF THE PROPOSED METHOD FOR DIFFERENT IMAGES

Stage                    | Pepper        | Boat          | Plane         | Woman
                         | PSNR    CR    | PSNR    CR    | PSNR    CR    | PSNR    CR
Original reconstruction  | 30.70  38.94  | 29.87  36.97  | 28.80  41.28  | 36.41  46.91
1st error loop           | 35.18  22.66  | 34.57  21.57  | 33.60  24.02  | 43.13  27.42
2nd error loop           | 38.62  18.44  | 38.18  17.52  | 37.34  19.31  | 48.75  21.95
For example, the PSNR of the image Pepper achieved by the FMFSVQ method [7] is 29.70 dB at a compression ratio of 21.74, whereas the proposed method achieves a PSNR of 35.18 dB at a compression ratio of 22.66, i.e., better image quality at a higher compression ratio. As demonstrated in Table III, the proposed method gives superior results, in terms of PSNR and compression ratio, compared with other wavelet transform and vector quantization based compression techniques.

TABLE III
COMPARISON OF THE PROPOSED METHOD WITH AVAILABLE METHODS

         | Proposed: Original reconstruction | Proposed: 1st error loop | Proposed: 2nd error loop | Available method
Image    | PSNR   CR                         | PSNR   CR                | PSNR   CR                | (reference)  PSNR   CR
Peppers  | 30.70  38.94                      | 35.18  22.66             | 38.62  18.43             | [7]          29.70  21.74
Woman    | 36.41  46.91                      | 43.13  27.42             | 48.75  21.95             | [7]          31.71  23.39
Plane    | 28.80  41.28                      | 33.60  24.02             | 37.34  19.31             | [8]          35.38  16
Boat     | 29.87  36.97                      | 34.57  21.57             | 38.18  17.52             | [9]          28.00  25.0

Fig. 9. Original and reconstructed images using the proposed method. (a) Original image Pepper (left) and reconstructed Pepper (right), with PSNR = 35.18 and CR = 22.66. (b) Original image Boat (left) and reconstructed Boat (right), with PSNR = 38.17 and CR = 17.52.
VI. CONCLUSION

In this paper a technique for digital image compression and decompression based on multiresolution analysis using the wavelet transform and vector quantization is proposed. It is clear from the above discussion that the inclusion of the error correction scheme results in a great improvement in the overall image compression process: it reduces the compression ratio a little, but increases the PSNR of the image drastically, which serves the purpose of the method for those areas of image compression that require high quality images. The proposed method provides satisfactory image quality at high compression ratios. Future work in this field may include combining Finite State Vector Quantization (FSVQ) with the wavelet transform, which may give further improvements in image quality at high compression ratios.

REFERENCES

[1] L. Prasad and S. Iyengar, Wavelet Analysis with Applications to Image Processing. CRC Press, 1997.
[2] M. K. M. X. Wang, E. Chan and S. Panchanathan, "Wavelet Based Image Coding Using Nonlinear Interpolative Vector Quantization," IEEE Transactions on Image Processing, vol. 5, no. 3, pp. 518–526, Mar. 1996.
[3] W.-T. C. Ruey-Feng Chang and J.-S. Wang, "A Fast Finite-State Algorithm for Vector Quantizer Design," IEEE Transactions on Signal Processing, vol. 40, no. 1, pp. 221–225, Jan. 1992.
[4] T. Kohonen, Self-Organizing Maps, 3rd ed. Springer, 2001.
[5] D.-S. Q. Hong Wang, Ling LU and X. Luo, "Image compression based on wavelet transform and vector quantization," in Proceedings of the First International Conference on Machine Learning and Cybernetics, Beijing, China.
[6] K. Sayood, Introduction to Data Compression. San Francisco, CA, USA: Morgan Kaufmann, 2000.
[7] N. M. S. Rahim and T. Yahagi, "Image coding using an improved feature map finite-state vector quantization," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E85-A, no. 11, pp. 2453–2458, Nov. 2002.
[8] S. J. L. Ching Yang Wang and L. W. Chang, "Wavelet Image Coding using Variable Blocksize Vector Quantization with Optimal Quadtree Segmentation," Signal Processing: Image Communication (Elsevier), vol. 15, pp. 879–890, Nov. 2000.
[9] O. O. Khalifa, "Fast Algorithm for VQ-based wavelet coding system," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 205–220, Apr. 1992.