JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 15, 1-9 (1999)

A Compression Algorithm Based on Classified Interpolative Block Truncation Coding and Vector Quantization

CHUN-HUNG KUO, CHANG-FUU CHEN* AND WEN-NAN HSIA
Department of Electrical Engineering, Tatung Institute of Technology, Taipei, Taiwan 104, R.O.C.
*Department of Computer and Communication Engineering, National Institute of Technology at Kaohsiung, Kaohsiung, Taiwan 807, R.O.C.

In this paper, the mean square error of block truncation coding (BTC) is analyzed and, based on this analysis, a classified interpolative BTC algorithm with vector quantization (VQ) is proposed to improve the interpolative BTC with VQ [7]. The simulation results indicate that the bit rate and PSNR of the classified interpolative BTC algorithm with VQ are better than those of the interpolative BTC algorithm with VQ [7].

Keywords: image coding, block truncation coding (BTC), vector quantization (VQ), interpolative BTC (IBTC)

Received May 15, 1996; accepted February 27, 1997. Communicated by Soo-Chang Pei.

1. INTRODUCTION

Block truncation coding (BTC) [1] is an efficient image coding method that preserves the statistical properties of each image block. Its low computational complexity and good resistance to channel errors make it attractive for real-time image compression. The BTC output data set includes a binary bit plane, which defines the quantization level of each pixel, and two reconstruction level values, determined by the mean and standard deviation of the block [1]. A modified BTC method [2, 3], denoted as AMBTC, achieves a smaller mean square error (MSE) and improves the quality of the reconstructed image. However, the threshold used to form the binary bit plane in these methods is set to the mean of the block, so they fail to achieve optimum quantization. In order to reduce the bit rate, vector quantization (VQ) [4] has been successfully applied to quantize the pair of codewords generated by BTC for a block [5]. In addition, Zeng proposed two interpolative BTC coding algorithms based on effective subsampling of the bit plane, which reduces the bit rate [6]. Furthermore, Zeng and Neuvo [7] combined the two methods mentioned above to develop a new method, denoted as VQ-IBTC in this paper, which greatly increases the compression ratio. However, the use of VQ to quantize the two codewords greatly increases the complexity of the algorithm. Moreover, VQ-IBTC quantizes every block using the same number of quantization levels, which is not an efficient way to improve quality. In general, encoding a very smooth block using only its mean results in good quality. In this case, both the bit rate and the computational complexity can be further reduced. Therefore, in this paper, we propose a classified algorithm to improve VQ-IBTC.

The mean square error is analyzed first and, based on the result, we propose an algorithm which classifies each image block. Depending on the MSE analysis, the block is processed by means of one-level or two-level BTC. If the block is processed by means of two-level BTC, a criterion is used to select either the quincunx or the every-other-row-and-every-other-column subsampling method [6] for its bit plane to reduce the bit rate.
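For orientation, the basic two-level BTC of [1] described above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the block is passed as a flat list of pixel values, and assigning pixels equal to the mean to the high level is an assumption made here.

```python
import math

def btc_encode(block):
    """Two-level BTC of one block (flat list), moment-preserving form."""
    n = len(block)
    mean = sum(block) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / n)
    # Bit plane: 1 where the pixel is at or above the block mean (assumed tie rule).
    bits = [1 if x >= mean else 0 for x in block]
    q = sum(bits)  # number of pixels assigned to the high level
    if q == n:
        # Flat block: every pixel equals the mean, so both levels collapse.
        return bits, mean, mean
    # Reconstruction levels chosen so the block mean and variance are preserved.
    low = mean - std * math.sqrt(q / (n - q))
    high = mean + std * math.sqrt((n - q) / q)
    return bits, low, high

def btc_decode(bits, low, high):
    """Rebuild the block from the bit plane and the two levels."""
    return [high if b else low for b in bits]
```

For a 4 × 4 block, the encoder output is a 16-bit plane plus the two levels; the methods discussed in this paper reduce both parts.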

2. MEAN SQUARE ANALYSIS FOR GENERALIZED BTC

For a BTC algorithm, an image is partitioned into nonoverlapping blocks of size m × m, where, generally, m = 4. Let the pixels of a block be sorted into a nondecreasing sequence X given by $x_k$, $k = 0, 1, \dots, m^2 - 1$, that is,

$$x_0 \le x_1 \le x_2 \le \dots \le x_k \le \dots \le x_{m^2-1}. \tag{1}$$

The BTC algorithm segments the above sequence X into n intervals, and each pixel in an interval is quantized to the mean of the pixels in that interval, which preserves the first moment of both the interval and the block. Let $N_i$ ($0 \le i \le n-1$) be the number of pixels in the ith interval, so that $\sum_{i=0}^{n-1} N_i = m^2$. The subscript of the sequence X is modified as $x_{ij}$ to denote the jth pixel ($0 \le j \le N_i - 1$) of the ith interval, and the quantization level for each interval is given as

$$Q_i = \frac{1}{N_i}\sum_{j=0}^{N_i-1} x_{ij}. \tag{2}$$

Therefore, the mean square error (MSE) for the block processed by BTC is obtained as

$$\mathrm{MSE} = \frac{1}{m^2}\sum_{i=0}^{n-1}\sum_{j=0}^{N_i-1}(Q_i - x_{ij})^2 = \frac{1}{m^2}\sum_{i=0}^{n-1}\Bigl[\sum_{j=0}^{N_i-1} x_{ij}^2 - N_i Q_i^2\Bigr]. \tag{3}$$
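Eqs. (2) and (3) can be illustrated with a short Python sketch. The function name and the representation of the segmentation as a list of interval sizes $N_i$ are assumptions made for this example only; every interval is assumed to be non-empty.

```python
def generalized_btc(pixels, sizes):
    """Quantize a block with n intervals; returns (levels Q_i, MSE).

    pixels: flat list of the m*m pixel values of one block.
    sizes:  interval sizes N_i over the sorted pixels; must sum to m*m.
    """
    xs = sorted(pixels)
    assert sum(sizes) == len(xs) and all(n_i > 0 for n_i in sizes)
    levels = []
    squared_error = 0.0
    start = 0
    for n_i in sizes:
        interval = xs[start:start + n_i]
        q_i = sum(interval) / n_i                 # Eq. (2): interval mean
        levels.append(q_i)
        squared_error += sum((q_i - x) ** 2 for x in interval)
        start += n_i
    return levels, squared_error / len(xs)        # Eq. (3), first form
```

Setting `sizes` to a single interval gives one-level BTC (the block mean), and two intervals give the usual two-level BTC with the split point acting as the threshold.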

Let $\Delta\mu_i = Q_i - \mu$ be the distance between the block mean $\mu$ and the ith quantization level $Q_i$. Substituting $Q_i = \mu + \Delta\mu_i$ into Eq. (3), the MSE becomes

$$\mathrm{MSE} = \frac{1}{m^2}\sum_{i=0}^{n-1}\Bigl[\sum_{j=0}^{N_i-1}\bigl(x_{ij}^2 - \mu^2\bigr) - 2N_i\,\mu\,\Delta\mu_i - N_i\,\Delta\mu_i^2\Bigr] = \sigma^2 - \frac{2\mu}{m^2}\sum_{i=0}^{n-1} N_i\,\Delta\mu_i - \frac{1}{m^2}\sum_{i=0}^{n-1} N_i\,\Delta\mu_i^2, \tag{4}$$

where $\sigma^2$ is the variance of the block (or of the sequence X). To preserve the first moment of the block, we have

$$\sum_{i=0}^{n-1} N_i Q_i - m^2\mu = 0. \tag{5.a}$$

That is,

$$\sum_{i=0}^{n-1} N_i (Q_i - \mu) = \sum_{i=0}^{n-1} N_i\,\Delta\mu_i = 0. \tag{5.b}$$

Substituting Eq. (5) into Eq. (4), the MSE is simplified as

$$\mathrm{MSE} = \sigma^2 - \frac{1}{m^2}\sum_{i=0}^{n-1} N_i\,\Delta\mu_i^2 = \sigma^2 - \sigma_r^2. \tag{6}$$
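As a quick numerical check of Eq. (6), consider a hypothetical four-pixel block ($m^2 = 4$, chosen here only to keep the arithmetic short) with sorted values 1, 2, 3, 4, split into n = 2 intervals of two pixels each:

```latex
\begin{align*}
\mu &= 2.5, \qquad
\sigma^2 = \tfrac{1}{4}\bigl(1.5^2 + 0.5^2 + 0.5^2 + 1.5^2\bigr) = 1.25, \\
Q_0 &= 1.5, \quad Q_1 = 3.5, \qquad
\Delta\mu_0 = -1, \quad \Delta\mu_1 = +1, \qquad
\textstyle\sum_i N_i\,\Delta\mu_i = 2(-1) + 2(+1) = 0, \\
\sigma_r^2 &= \tfrac{1}{4}\bigl(2 \cdot 1 + 2 \cdot 1\bigr) = 1, \\
\mathrm{MSE} &= \tfrac{1}{4}\bigl(0.5^2 + 0.5^2 + 0.5^2 + 0.5^2\bigr) = 0.25
             = \sigma^2 - \sigma_r^2 .
\end{align*}
```

The directly computed MSE of 0.25 agrees with $\sigma^2 - \sigma_r^2 = 1.25 - 1$.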

It should be noted that the second term, $\frac{1}{m^2}\sum_{i=0}^{n-1} N_i\,\Delta\mu_i^2$, is the variance of the reconstructed block quantized by BTC and is denoted by $\sigma_r^2$. From Eq. (6), once the segmentation of the sequence X is fixed, the variance of the reconstructed block, $\sigma_r^2$, is the reduction in the mean square error relative to $\sigma^2$. For example, the original BTC [1] and AMBTC [2] use the mean of the block as the threshold to directly segment the sequence into two regions. Using the mean of the block as the threshold is not the best approach because the reduction in the mean square error, $\sigma_r^2$, depends on the choice of threshold. Therefore, if all possible threshold candidates are considered, the threshold that yields the maximum $\sigma_r^2$ is the best threshold. In this case, the reconstructed block has the minimum mean square error, and the method can be called optimal BTC.

The variance of a high-detail block, including edges, is very large, so it would seem necessary to use a larger number of quantization levels to quantize the block. However, from the results of many experiments using the original BTC or AMBTC, we know that two-level quantization is sufficient to quantize a high-detail block. On the other hand, if the variance of the block, $\sigma^2$, is very small, then the variance of the reconstructed block, $\sigma_r^2$, must also be very small since the MSE in Eq. (6) cannot be negative. Therefore, multi-level quantization is not an efficient way to reduce the MSE of such a block. That is, in this case, a one-level quantizer is sufficient to quantize the block, so that only the mean of the block is transmitted, and the bit rate and the computational complexity are reduced.

Again, from Eq. (6), the variance of the original block, $\sigma^2$, is fixed, and the variance of the reconstructed block, $\sigma_r^2$, with two-level quantization is already known after performing the optimal BTC. Therefore, we can use $\sigma_r^2$ to classify the block in order to avoid calculating the variance of the original block.
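The exhaustive threshold search described above can be sketched as follows. This is an illustrative sketch only: the function name and return layout are assumptions, and the paper does not specify how the search is implemented. Each candidate threshold corresponds to a split position in the sorted sequence, and the split with the largest $\sigma_r^2$ is kept.

```python
def optimal_two_level_btc(pixels):
    """Try every split of the sorted block and keep the one maximizing sigma_r^2.

    Returns (sigma_r2, q_low, q_high, split), where the `split` smallest pixels
    map to q_low. A flat block returns sigma_r2 = 0 and split = len(pixels).
    """
    xs = sorted(pixels)
    m2 = len(xs)
    total = sum(xs)
    mu = total / m2
    best = (0.0, mu, mu, m2)
    low_sum = 0.0
    for split in range(1, m2):
        low_sum += xs[split - 1]
        if xs[split - 1] == xs[split]:
            continue  # a threshold cannot separate two equal pixel values
        q_low = low_sum / split
        q_high = (total - low_sum) / (m2 - split)
        # Variance of the reconstructed block, the second term of Eq. (6).
        sigma_r2 = (split * (q_low - mu) ** 2
                    + (m2 - split) * (q_high - mu) ** 2) / m2
        if sigma_r2 > best[0]:
            best = (sigma_r2, q_low, q_high, split)
    return best
```

Because the pixels are sorted once and the running sum is updated incrementally, the search costs one sort plus a linear pass per block.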

3. THE CLASSIFIED INTERPOLATIVE BTC ALGORITHM

In the proposed algorithm, shown in Fig. 1, $\sigma_r^2$ is determined by performing the optimal BTC using a two-level quantizer. It should be noted that if computational complexity is a concern, AMBTC [2] can be used in place of the optimal BTC in Fig. 1. In this case, calculating the variance of the reconstructed block, $\sigma_r^2$, is still easier than calculating the variance of the original block. If a block is classified for one-level BTC, only the mean of the block is transmitted. However, if the block is classified for two-level BTC, then the two codewords, Q1 and Q2, and the corresponding bit plane must be transmitted. In order to reduce the bit rate, the two codewords can be compressed using VQ [4] and, according to $\sigma_r^2$, the bit plane can be subsampled using either the quincunx or the every-other-row-and-every-other-column subsampling method [6, 7]. It should be noted that, in this paper, the codewords of blocks processed using different subsampling methods are encoded by means of different codebooks; that is, two different codebooks are established at the transmitter and receiver, one for each class. Since the every-other-row-and-every-other-column subsampling method seriously blurs the edges of the reconstructed block, it is not suitable for a block with large $\sigma_r^2$. Therefore, a block with small $\sigma_r^2$ can be subsampled using the every-other-row-and-every-other-column method, while a block with large $\sigma_r^2$ should be subsampled using the quincunx method.

Fig. 1. The flowchart of the proposed algorithm.

In short, the proposed algorithm identifies blocks as low-, middle- or high-detail blocks according to the value of $\sigma_r^2$. One-level BTC is suitable for low-detail blocks. Middle-detail blocks are encoded using two-level BTC, every-other-row-and-every-other-column subsampling, and VQ, a combination denoted VQ-IBTC-2 in [7]. High-detail blocks are encoded using two-level BTC, quincunx subsampling, and VQ, a combination denoted VQ-IBTC-1 in [7].
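The two bit-plane subsampling patterns can be sketched for a 4 × 4 block as below. The helper names and the phase of each lattice (which corner is retained) are assumptions made for illustration; the decoder-side interpolation of the dropped bits is defined in [6] and is not reproduced here.

```python
def quincunx_mask(m=4):
    """Checkerboard lattice: keep positions where row + column is even."""
    return [[(r + c) % 2 == 0 for c in range(m)] for r in range(m)]

def row_col_mask(m=4):
    """Keep every other row and every other column (even rows, even columns)."""
    return [[r % 2 == 0 and c % 2 == 0 for c in range(m)] for r in range(m)]

def subsample(bit_plane, mask):
    """Return the retained bits of an m x m bit plane in raster order."""
    return [bit_plane[r][c]
            for r in range(len(mask))
            for c in range(len(mask[r]))
            if mask[r][c]]
```

With these patterns, the quincunx lattice keeps 8 of the 16 bits and the every-other-row-and-every-other-column lattice keeps 4. Together with an 8-bit VQ index, that gives 16 and 12 bits per 16-pixel block, which is consistent with the 1.00 bpp and 0.75 bpp reported for VQ-IBTC-1 and VQ-IBTC-2.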


4. SIMULATIONS AND RESULTS

The images “LENA”, shown in Fig. 2(a), “PEPPER”, “JET”, and “BABOO”, each of size 512 × 512, were used to test the proposed algorithm. In the simulation, the block size m was set to 4, and the codebook size of the VQ used for the two codewords, Q1 and Q2, was 256; that is, an 8-bit index was transmitted to identify the quantized code vector. However, if a block was encoded using one-level BTC, the mean of the block was quantized to 6 bits and transmitted. Blocks were classified as

$$\text{block} = \begin{cases} \text{low-detail} & \text{for } \sigma_r^2 \le 10, \\ \text{middle-detail} & \text{for } 10 < \sigma_r^2 \le 50, \\ \text{high-detail} & \text{for } \sigma_r^2 > 50. \end{cases} \tag{7}$$

Fig. 2. (a) The original image and the reconstructed images obtained with (b) the proposed VQ-CIBTC (0.70 bpp; 32.9 dB), (c) VQ-IBTC-1 (1.00 bpp; 32.8 dB), and (d) VQ-IBTC-2 (0.75 bpp; 31.1 dB) [7].
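The classification rule of Eq. (7) and the resulting bit budget can be sketched as follows. The names are hypothetical, and mapping Thr1 and Thr2 to 10 and 50 is an assumption read from Eq. (7). The payload sizes combine figures stated in the text (6-bit mean, 8-bit VQ index) with subsampled bit-plane sizes of 4 and 8 bits, which are inferred from the 0.75 bpp and 1.00 bpp rates in Table 1 rather than stated explicitly.

```python
THR1, THR2 = 10.0, 50.0  # thresholds of Eq. (7); the Thr1/Thr2 mapping is assumed

# Payload bits per 4 x 4 block, excluding the class overhead.
PAYLOAD_BITS = {
    "low": 6,         # 6-bit quantized block mean
    "middle": 8 + 4,  # 8-bit VQ index + 4 retained bit-plane bits (inferred)
    "high": 8 + 8,    # 8-bit VQ index + 8 retained bit-plane bits (inferred)
}

def classify(sigma_r2, thr1=THR1, thr2=THR2):
    """Map the reconstructed-block variance to a detail class."""
    if sigma_r2 <= thr1:
        return "low"
    if sigma_r2 <= thr2:
        return "middle"
    return "high"

def bit_rate(classes, m=4, overhead_bits_per_block=2.0):
    """Average bits per pixel for a sequence of classified blocks."""
    total = sum(PAYLOAD_BITS[c] + overhead_bits_per_block for c in classes)
    return total / (len(classes) * m * m)
```

For example, an image whose blocks are all low-detail would cost 0.375 bpp before overhead, which shows why smooth images benefit most from the classification.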

It should be noted that the quality of the image can be increased by reducing the values of Thr1 or Thr2 shown in Fig. 1. In this study, the two thresholds were chosen so that the PSNR values of the various methods are close to one another, as listed in Table 1.

Table 1. Comparison of the bit rate (BR, in bpp) and PSNR (in dB) for the proposed VQ-CIBTC, VQ-IBTC-1 and VQ-IBTC-2 [7].

| Method | LENA BR | LENA PSNR | PEPPER BR | PEPPER PSNR | JET BR | JET PSNR | BABOO BR | BABOO PSNR |
|---|---|---|---|---|---|---|---|---|
| VQ-CIBTC | 0.70 | 32.9 | 0.75 | 31.3 | 0.67 | 31.0 | 0.98 | 23.5 |
| VQ-IBTC-1 | 1 | 32.8 | 1 | 32.8 | 1 | 30.8 | 1 | 23.5 |
| VQ-IBTC-2 | 0.75 | 31.1 | 0.75 | 30.0 | 0.75 | 28.8 | 0.75 | 22.7 |

The overhead for identifying the class of a block requires 2 bits per block, which increases the bit rate significantly. To reduce the number of overhead bits, the concept of grouping was used [8]. Let k overheads be transmitted at a time, denoted as $\{o_0, o_1, \dots, o_{k-1}\}$, and let the value of each overhead belong to $\{0, 1, \dots, r-1\}$; then, an index V can be expressed as

$$V = \sum_{i=0}^{k-1} o_i\, r^{k-1-i} \le r^k - 1. \tag{8}$$
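The grouping of Eq. (8) amounts to writing the k class labels as the digits of a base-r number. A minimal sketch, with hypothetical function names, is:

```python
def pack(overheads, r):
    """Combine k class labels (each in 0..r-1) into one index V, as in Eq. (8)."""
    v = 0
    for o in overheads:
        assert 0 <= o < r
        v = v * r + o  # Horner form of sum(o_i * r**(k-1-i))
    return v

def unpack(v, r, k):
    """Recover the k class labels from the index V."""
    labels = []
    for _ in range(k):
        v, o = divmod(v, r)
        labels.append(o)
    return labels[::-1]

def index_bits(r, k):
    """Fixed-length word size needed for V, i.e. ceil(log2(r**k))."""
    return (r ** k - 1).bit_length()
```

With r = 3 classes and groups of k = 4 blocks, the values used in this study, V ranges over 81 values and fits in 7 bits, compared with 4 × 2 = 8 bits when each block carries its own 2-bit label. Entropy coding of V can reduce this further.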

In this study, blocks were classified into three classes, that is, r = 3. From Eq. (8), the index V takes one of $r^k$ values, so its word length is $\lceil \log_2 r^k \rceil$ bits. In addition, because neighboring blocks have nearly the same characteristics, the correlation between neighboring indices V is very high. Therefore, the number of bits needed to encode the index V can be further reduced using an entropy coding method, for example, Huffman coding [9, 10]. In this simulation, k was chosen as 4 (2 × 2). It should be noted that the user can use a fixed codebook to avoid having to transmit the codebook each time. Table 1 shows the bit rate, including the overhead for the proposed method, and the PSNR of all the reconstructed images obtained using the proposed method, denoted as VQ-CIBTC, and using VQ-IBTC-1 and VQ-IBTC-2 proposed in [7], where the PSNR is defined as

$$\mathrm{PSNR} = 10\log_{10}\bigl(255^2/\mathrm{MSE}\bigr). \tag{9}$$
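Eq. (9) can be computed directly from an original and a reconstructed image. The sketch below assumes 8-bit images flattened into equal-length lists; the function name and the handling of a zero MSE are choices made for this example.

```python
import math

def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equal-length pixel sequences, as in Eq. (9)."""
    assert len(original) == len(reconstructed)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)
```

As a reference point, an MSE of 1 corresponds to roughly 48.13 dB, and each tenfold increase in MSE lowers the PSNR by 10 dB.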

In addition, the reconstructed images of “LENA” obtained using each method are also shown in Fig. 2 for comparison.

From Table 1, the bit rate obtained using the proposed method is less than those for VQ-IBTC-1 and VQ-IBTC-2, and the PSNR obtained using the proposed method is higher than those obtained using VQ-IBTC-1 and VQ-IBTC-2. The reason why the PSNR of the proposed method is better than those of VQ-IBTC-1 and VQ-IBTC-2 is that using one-level BTC or the every-other-row-and-every-other-column subsampling method to quantize low-detail or middle-detail blocks does not significantly increase the MSE. Moreover, because of the classification, the codebook used by the proposed method to quantize the two codewords of high-detail blocks is much better matched to those blocks than the codebook of VQ-IBTC-1 or VQ-IBTC-2, so the two codewords are quantized more accurately by the proposed method.

In addition, the main computational cost in all of the above methods is the search for the best code vector to quantize the two codewords of a block. For VQ-IBTC-1 and VQ-IBTC-2, the codewords of every block in the image are quantized using VQ, but, for the proposed method, only the codewords of blocks belonging to the middle-detail or high-detail classes are quantized using VQ. Thus, the computational complexity of the proposed method is much lower than that of VQ-IBTC-1 or VQ-IBTC-2, especially for an image with fewer edges. Considering the above advantages, the proposed method is more efficient than the interpolative BTC associated with VQ [7].

5. CONCLUSIONS

In this paper, a classified interpolative BTC algorithm with VQ has been proposed to improve the interpolative BTC algorithm with VQ. The proposed algorithm classifies blocks into low-, middle-, or high-detail blocks based on the variance of the reconstructed block. Each classified block is encoded using either one-level BTC or two-level VQ-IBTC (VQ-IBTC-1 or VQ-IBTC-2). The simulation results indicate that the bit rate and the PSNR obtained using the proposed algorithm are better than those obtained using VQ-IBTC proposed in [7].

ACKNOWLEDGMENT

The authors are grateful to the Tatung Institute of Technology, Taiwan, for financial support provided under contract B83058-B83-1207-01-E.

REFERENCES

1. E. J. Delp and O. R. Mitchell, “Image compression using block truncation coding,” IEEE Transactions on Communications, Vol. 27, No. 9, 1979, pp. 1335-1342.
2. M. D. Lema and O. R. Mitchell, “Absolute moment block truncation coding and its applications to color images,” IEEE Transactions on Communications, Vol. 32, No. 10, 1984, pp. 1148-1157.
3. V. R. Udpikar and J. P. Raina, “A modified algorithm for block truncation coding,” Electronics Letters, Vol. 21, No. 20, 1985, pp. 900-902.
4. Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer design,” IEEE Transactions on Communications, Vol. 28, No. 1, 1980, pp. 84-95.
5. V. R. Udpikar and J. P. Raina, “BTC image coding using vector quantization,” IEEE Transactions on Communications, Vol. 35, No. 3, 1987, pp. 352-355.
6. B. Zeng, “Two interpolative BTC image coding schemes,” Electronics Letters, Vol. 27, No. 13, 1991, pp. 1126-1128.
7. B. Zeng and Y. Neuvo, “Interpolative BTC image coding with vector quantization,” IEEE Transactions on Communications, Vol. 41, No. 10, 1993, pp. 1436-1438.
8. JTC 1/SC 29, “Coding of moving pictures and associated audio for digital storage media at up to about 1.5 MBit/s-Part 3: Audio,” ISO/IEC Standard, Vol. 11172-3, 1993.
9. M. I. Lu and C. F. Chen, “An encoding procedure and a decoding procedure for a new modified Huffman code,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 1, 1990, pp. 128-136.
10. M. I. Lu and C. F. Chen, “A Huffman-type code generator with order-N complexity,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 9, 1990, pp. 1619-1626.

Chun-Hung Kuo was born in Tainan, Taiwan, R.O.C., on September 28, 1968. He received his B.S. and Ph.D. degrees in Electrical Engineering from the Tatung Institute of Technology, Taipei, Taiwan, in 1991 and 1996, respectively. Currently, he is a researcher at the Computer and Communication Research Laboratory, ITRI. He received the Excellent University Youth Award, Taiwan, Republic of China, in 1991. His research interests include digital signal processing, digital image processing, speech coding, data compression, and digital communication.

Chang-Fuu Chen was born in Ping-Tung, Taiwan, on March 26, 1947. He received the B.S. degree (with great distinction) from the Tatung Institute of Technology, Taipei, Taiwan, in 1969 and the M.S., Engineer, and Ph.D. degrees from Stanford University, Stanford, CA, in 1974, 1975 and 1982, respectively, all in Electrical Engineering. After serving as an Electrical Engineering Officer in the Combined Service Forces, he joined the Tatung Institute of Technology in 1970. He became an Associate Professor in the Department of Electrical Engineering in 1977, and was a Professor from 1984 to 1996 (1973-1975, 1979-1983, on leave). He was also chairman of the department from 1983 to 1991. From February 1982 to August 1983, he was a member of the Senior Technology Staff of the Advanced Marketing and Technology Department of Granger Associates Co., Santa Clara, CA, where he was a major contributor to the digital signal processing functions of the company's digital echo canceller system. From December 1984 to June 1986, he was General Manager of the Information Products Division and the Computer Plant, Tatung Company, Taiwan, during which time he turned the plant around from heavy losses to profit and computerized the company. He was also a consultant of the President Room for the company from 1986 to 1996. He is now a Professor and Chairman of the Department of Computer and Communication Engineering, National Institute of Technology at Kaohsiung. He was a recipient of the Industrial Education Award sponsored by the Hsieh-Chih Industrial Foundation, Taiwan, in 1989, and the Distinguished Teaching Award sponsored by the Ministry of Education of the Republic of China in 1992. His current research interests include digital systems and digital signal processing.

Wen-Nan Hsia was born in Taichung, Taiwan, R.O.C., in 1970. He received his B.S. and M.S. degrees in Electrical Engineering from the Tatung Institute of Technology, Taipei, Taiwan, in 1993 and 1995, respectively. He joined ACER as an engineer in 1997. His research interests include digital signal processing, data compression, digital communication, and communication systems.