Zero-Error Codes for Correlated Information Sources

A. Kh. Al Jabri and S. Al-Issa
EE Dept., College of Engineering, King Saud University, P.O. Box 800, Riyadh 11421, Saudi Arabia
Email: f45eOII@ksu.edu.sa

Abstract. Slepian and Wolf [2] gave a characterization of the compression region for distributed correlated sources. Their result is for codes whose decoding probability of error approaches zero as the code length increases. In many applications it is of interest to find codes for which the probability of error is exactly zero. For this latter case, block codes using the zero-error information between the sources have been proposed by Witsenhausen [3]. Better codes, however, can be obtained by further exploiting the statistical dependency embedded in the correlation information. In this paper, variable-length zero-error codes are proposed that are generally more efficient than Witsenhausen's codes. A method for their construction is presented, and an example demonstrating the construction together with the achieved rate region is given.

1 Introduction

In most practical situations, data produced by sources have some form of redundancy that can be removed without affecting their information content. Such removal is necessary for efficient data transmission and/or storage. The amount of redundancy in the data from a source is determined by the Shannon entropy [1]. In multi-user environments, however, information may be collected at several places and brought to a common site for processing or other use. If the data from the sources are correlated, then a further reduction in the total number of transmitted bits required to represent the sources is possible, a fact that was shown by Slepian and Wolf (SW) [2]. A simple example of such a situation, together with the achievable rate region, is shown in Figure 1 and will be the setting investigated in this paper. Consider the two discrete memoryless correlated sources shown in Figure 1a with outputs represented by the random variables $X$ and $Y$ taking values on the alphabets $\mathcal{X} = \{x_1, x_2, \ldots, x_{n_x}\}$ and $\mathcal{Y} = \{y_1, y_2, \ldots, y_{n_y}\}$, respectively, where each pair $(x, y)$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$, of symbols is generated according to the joint probability mass function (pmf) $P_{XY}(x, y)$. Let $\|P_{XY}\| = [P_{XY}(x, y),\ x = 1, 2, \ldots, n_x,\ y = 1, 2, \ldots, n_y]$ denote the matrix representation of this pmf. An obvious encoding method is to encode each source independently. The achievable rates $R_X$ and $R_Y$ in this case satisfy

$$R_X \ge H(X) \quad \text{and} \quad R_Y \ge H(Y),$$

where $H(X)$ and $H(Y)$ are the entropies of $X$ and $Y$, respectively.
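As a side note (ours, not part of the original paper), the entropy quantities used throughout are straightforward to evaluate numerically; the following minimal Python sketch defines the Shannon-entropy helper assumed by the later examples.

```python
from math import log2

def shannon_entropy(pmf):
    """Shannon entropy in bits of a distribution given as an iterable of probabilities."""
    return -sum(p * log2(p) for p in pmf if p > 0)

# Example with a hypothetical pmf:
print(shannon_entropy([0.5, 0.25, 0.25]))  # 1.5 bits
```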


[Figure 1: (a) correlated sources emitting pairs $(x, y)$ feed Encoder I and Encoder II, whose outputs at rates $R_X$ and $R_Y$ are delivered to a common decoder; (b) the rate region, with axis marks $H(X/Y)$, $H(X)$, $H(X, Y)$ on the $R_X$ axis and $H(Y/X)$, $H(Y)$, $H(X, Y)$ on the $R_Y$ axis.]

Fig. 1. Correlated source coding configuration and achievable rate region.

These bounds determine the two-dimensional region $\mathcal{R}_I$ shown in Figure 1b. On the other hand, using the correlation information, Slepian and Wolf showed that the correlated sources $X$ and $Y$ can be separately described at rates $R_X$ and $R_Y$ and recovered with arbitrarily low probability of error, $P_e$, by a common decoder if and only if

$$R_X \ge H(X/Y), \qquad R_Y \ge H(Y/X), \qquad R_X + R_Y \ge H(X, Y),$$

where $H(X, Y)$ is the joint entropy of $X$ and $Y$ [2]. This paper is concerned with codes for which $P_e$ is exactly zero. For a single source, zero-error codes with rates as close as desired to the source entropy can be constructed using Huffman codes, with the implication that zero-error codes also exist for correlated sources that achieve $\mathcal{R}_I$. Beyond this region, the only known result reported in the literature, to the best of our knowledge, is that of Witsenhausen [3]. Determining the full region for which zero-error codes exist is still, however, an open problem. In Witsenhausen's work, the minimum number of transmission symbols needed to transmit information with zero error about, say, $Y$ in the presence of a full description of $X$ at the decoder is related to the chromatic number of the adjacency graph of the transition channel between $Y$ and $X$ [3]. If $M$ is the number of such symbols for blocks of length $n$, then the achieved rate $R_Y$ will be

$$R_Y = \frac{1}{n} \log_2 M \quad \text{bits/source symbol.}$$
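To make this concrete, the following Python sketch (our illustration, not the paper's construction; names are ours) builds the adjacency graph on the $Y$ alphabet, where two symbols are confusable when some $x$ has positive probability with both, and then colors it greedily. Greedy coloring only upper-bounds the chromatic number, so the resulting color count is an upper bound on $M$.

```python
from itertools import combinations

def y_adjacency(joint_p, xs, ys):
    """Edges between y symbols that share a positive-probability x,
    i.e., symbols a zero-error code must keep distinguishable."""
    supp = {y: {x for x in xs if joint_p[(x, y)] > 0} for y in ys}
    return {frozenset((a, b)) for a, b in combinations(ys, 2) if supp[a] & supp[b]}

def greedy_coloring(ys, edges):
    """Greedy coloring: an upper bound on the chromatic number, hence on M."""
    color = {}
    for y in ys:
        taken = {color[z] for z in color if frozenset((y, z)) in edges}
        color[y] = min(c for c in range(len(ys)) if c not in taken)
    return color
```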

In this paper we show the possibility of constructing better codes. The idea is based on the observation that the correlation information can provide, in certain cases, in addition to the zero-error information, statistical information that can be used for constructing a class of variable-length zero-error codes that are generally more efficient than block codes.

2 Code Construction

In the proposed coding method, the data from one of the sources is first encoded by one of the encoders (called here the primary encoder), and the result is delivered to the decoder. The other encoder (called here the secondary encoder) exploits its zero-error information and the statistical structure obtained through $\|P_{XY}\|$ to encode its own data and pass it to the decoder. From this information, the decoder should be able to recover the data from both sources. It is interesting to note that the selection of which encoder is the primary one and which is the secondary one matters for the construction and efficiency of the codes. This is due, in general, to the asymmetry of the conditional statistics viewed from one side relative to the other. This paper introduces preliminary results on a class of zero-error codes that, in general, achieve rates below those obtained by independent encoding of the sources and are also more efficient than other known codes. The correlation information, represented by $\|P_{XY}\|$, can be viewed as a communication channel connecting the alphabets of the two sources. Such a channel may pass zero-error information and its statistics from one site to the other. By allowing one source to encode its data in an optimal way, the other source, using the side information obtained through the correlation channel, needs only to pass to the decoder the information required to resolve the decoder's ambiguity about the data from the second source. The amount of side information depends on the structure of the correlation channel. In general, the correlation channel may not be symmetric. In fact, there are situations where one direction passes zero-error information while the other does not. The erasure channel is a simple example of such a situation. For correlated sources, the following algorithm can be used to construct zero-error variable-length codes. Without loss of generality, let $X$ be the primary encoder and $Y$ the secondary one. The proposed code construction method is as follows:

1. Choose the same block size, $m$, of input symbols for the encoders. This defines the extended sources $X^m$ and $Y^m$ with alphabets $\mathcal{X}^m$ and $\mathcal{Y}^m$, respectively.
2. Let $\{\underline{x}_1, \underline{x}_2, \ldots, \underline{x}_{n_x^m}\}$ be the set of symbols of $X^m$. Encode these blocks using a Huffman code and let $\{h_1, h_2, \ldots, h_{n_x^m}\}$ be the set of corresponding codewords.
3. Construct, for every $i$, $i = 1, 2, \ldots, n_y^m$, the set $\mathcal{X}_i$ with elements from $\mathcal{X}^m$ such that $\mathcal{X}_i = \{\underline{x} : \underline{x} \in \mathcal{X}^m \text{ and } p(\underline{x}/\underline{y}_i) > 0\}$.
4. Partition the set $\mathcal{Y}^m$ into a number of subsets, say $\mathcal{Y}_1, \mathcal{Y}_2, \ldots, \mathcal{Y}_{d(m)}$, for some number $d(m)$, such that the elements of every $\mathcal{Y}_i$, $i = 1, 2, \ldots, d(m)$, have disjoint corresponding subsets of $\mathcal{X}^m$ as defined in the previous step.


5. Let the symbol $s_i$ represent $\mathcal{Y}_i$, $i = 1, 2, \ldots, d(m)$, with corresponding probability $p(s_i)$. Encode the set of symbols $\{s_1, s_2, \ldots, s_{d(m)}\}$ using a Huffman code and let $\{g_1, g_2, \ldots, g_{d(m)}\}$ be the set of corresponding codewords. This Huffman code and the one given in step 2 are the required codebooks.

Notice that there may be more than one way to perform the partition in step 4. In such a case one needs to select the partition that yields the minimum entropy (a search sketch is given after the formula below). That is, if $\mathcal{P}$ is the set of all possible partitions, then one needs to find the $P \in \mathcal{P}$ that yields

$$\min_{P \in \mathcal{P}} \sum_{i=1}^{d(m)} p(s_i) \log_2 \frac{1}{p(s_i)}.$$
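The following Python sketch (our illustration; practical only for small alphabets, since the number of set partitions grows as the Bell numbers) enumerates all partitions of the $y$ symbols, keeps those whose blocks have pairwise disjoint sets $\mathcal{X}_i$, and returns the one of minimum entropy.

```python
from itertools import combinations
from math import log2

def set_partitions(items):
    """Enumerate all set partitions of a small list."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def min_entropy_partition(joint_p, xs, ys, p_y):
    """Step 4 search: admissible partitions have blocks whose X-supports
    are pairwise disjoint; pick the one minimizing the entropy of the s_i."""
    supp = {y: frozenset(x for x in xs if joint_p[(x, y)] > 0) for y in ys}
    def admissible(part):
        return all(supp[a].isdisjoint(supp[b])
                   for block in part for a, b in combinations(block, 2))
    def entropy(part):
        probs = [sum(p_y[y] for y in block) for block in part]
        return -sum(p * log2(p) for p in probs)
    return min((p for p in set_partitions(ys) if admissible(p)), key=entropy)
```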

This ends the construction phase. To decode, one needs to perform the following three steps:

1. Decode the symbols received from the primary encoder.
2. Decode the symbols received from the secondary encoder to obtain $g_i$, and hence $s_i$, for some $i$, $i = 1, 2, \ldots, d(m)$.
3. Since $\underline{x}_i$, $i = 1, 2, \ldots, n_x^m$, and $s_i$ are known to the decoder, it is able to resolve the ambiguity about which symbols were transmitted by the secondary source and, therefore, to recover the data from both encoders (a resolution sketch follows the rate expression below).

The rate achieved depends on the particular partition rule. The achievable rate region is then given by

$$R_X \ge H(X) \quad \text{and} \quad R_Y \ge \min_{P \in \mathcal{P}} H(Y'),$$

where $Y'$ is a random variable over the set $\{s_1, s_2, \ldots, s_{d(m)}\}$ with a pmf as defined above.
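The ambiguity resolution in the last decoding step can be made explicit; the sketch below (hypothetical names, ours) returns the unique $y$ in the signaled block that has positive probability with the decoded $x$. Uniqueness holds precisely because the blocks were built from disjoint sets $\mathcal{X}_i$.

```python
def resolve_y(x, signaled_block, joint_p):
    """Step 3 of decoding: among the y symbols grouped under the received s_i,
    exactly one can co-occur with the primary symbol x."""
    candidates = [y for y in signaled_block if joint_p[(x, y)] > 0]
    assert len(candidates) == 1, "disjoint X-supports guarantee uniqueness"
    return candidates[0]
```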

3 Numerical Results

In this part we give an example showing the method of code construction and the performance of the constructed code. Consider the case of two sources for which the joint pmf matrix is given by

$$\|P_{XY}\| = \begin{bmatrix} 3/20 & 3/20 & 0 & 0 & 0 \\ 1/15 & 0 & 1/15 & 1/15 & 0 \\ 1/20 & 0 & 1/20 & 1/20 & 0 \\ 0 & 1/30 & 0 & 1/30 & 1/30 \\ 0 & 1/12 & 0 & 1/12 & 1/12 \end{bmatrix},$$

where rows are indexed by $x_1, \ldots, x_5$ and columns by $y_1, \ldots, y_5$. The marginal pmfs of $X$ and $Y$ are given by

  i        1      2      3      4      5
  p(x_i)   3/10   1/5    3/20   1/10   1/4
  p(y_i)   4/15   4/15   7/60   7/30   7/60

The subsets $\mathcal{X}_i$ corresponding to the $y$ symbols for $m = 1$ are shown below.

  Symbol in Y   Corresponding subset of X
  y_1           {x_1, x_2, x_3}
  y_2           {x_1, x_4, x_5}
  y_3           {x_2, x_3}
  y_4           {x_2, x_3, x_4, x_5}
  y_5           {x_4, x_5}
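As a consistency check (ours), the marginals in the table follow from the row and column sums of the matrix above; exact arithmetic with fractions avoids rounding.

```python
from fractions import Fraction as F

# Joint pmf as reconstructed above: rows x1..x5, columns y1..y5.
P = [
    [F(3, 20), F(3, 20), F(0),     F(0),     F(0)],
    [F(1, 15), F(0),     F(1, 15), F(1, 15), F(0)],
    [F(1, 20), F(0),     F(1, 20), F(1, 20), F(0)],
    [F(0),     F(1, 30), F(0),     F(1, 30), F(1, 30)],
    [F(0),     F(1, 12), F(0),     F(1, 12), F(1, 12)],
]
p_x = [sum(row) for row in P]          # [3/10, 1/5, 3/20, 1/10, 1/4]
p_y = [sum(col) for col in zip(*P)]    # [4/15, 4/15, 7/60, 7/30, 7/60]
assert sum(p_x) == 1 and sum(p_y) == 1
```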

There are different ways to group the $y$ symbols into admissible partitions. These and the corresponding achievable rates are summarized below.

  Partition                              Achievable rate
  {y_1}, {y_2}, {y_3}, {y_4}, {y_5}      2.23012
  {y_1}, {y_2}, {y_3, y_5}, {y_4}        1.99679
  {y_1}, {y_2, y_3}, {y_4}, {y_5}        1.89028
  {y_1, y_5}, {y_2}, {y_3}, {y_4}        1.89028
  {y_1, y_5}, {y_2, y_3}, {y_4}          1.55044

The above shows the effect of the partition selection on the achieved rate, with the last row yielding the minimum entropy and, hence, the required partition. If, on the other hand, the primary encoder is taken to be on the $Y$ side, then there are no zero-error codes below $H(X)$. The achievable rate region $\mathcal{R}_0$ of the proposed codes and the SW region are shown in Figure 2. The rates achieved by the Witsenhausen code define the region ($R_X \ge H(X)$ and $R_Y \ge \log_2 3$), which is included in $\mathcal{R}_0$.
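The tabulated rates can be reproduced directly (our verification sketch): each rate is the entropy of the block probabilities obtained by summing $p(y_i)$ within each block of the partition.

```python
from math import log2
from fractions import Fraction as F

p_y = {1: F(4, 15), 2: F(4, 15), 3: F(7, 60), 4: F(7, 30), 5: F(7, 60)}

def partition_rate(partition):
    """Entropy (bits/symbol) of the block symbols s_i for a partition of y indices."""
    probs = [float(sum(p_y[i] for i in block)) for block in partition]
    return -sum(p * log2(p) for p in probs)

print(partition_rate([[1], [2], [3], [4], [5]]))  # 2.23012 bits = H(Y)
print(partition_rate([[1, 5], [2, 3], [4]]))      # 1.55044 bits, the minimum
```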

[Figure 2: rate region plot with the $R_X$ axis marked at 1.41, 2.23, and 3.64 and the $R_Y$ axis marked at 1.41, 1.55, 2.23, and 3.64, corresponding to $H(X/Y) \approx H(Y/X) \approx 1.41$, the minimum secondary rate 1.55, $H(X) \approx H(Y) \approx 2.23$, and $H(X, Y) \approx 3.64$.]

Fig. 2. Achieved rates for different codes ($m = 1$).

4 Conclusions

In this paper a method for generating a class of variable-length zero-error codes for correlated information sources has been proposed. The idea is based on the fact that the correlation between the sources can provide, in many situations, zero-error and statistical information that can be used for designing such codes. The proposed method of encoding yields codes that are generally more efficient than previously known block codes for such sources.

References

1. T. Cover and J. Thomas, Elements of Information Theory, John Wiley, 1991.
2. D. Slepian and J. Wolf, "Noiseless Coding of Correlated Information Sources", IEEE Trans. Inform. Theory, Vol. IT-19, pp. 471-480, July 1973.
3. H. S. Witsenhausen, "The zero-error side information problem and chromatic numbers", IEEE Trans. Inform. Theory, Vol. IT-22, pp. 592-593, 1976.