IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 7, NOVEMBER 1998
Bit-Error Probability for Maximum-Likelihood Decoding of Linear Block Codes and Related Soft-Decision Decoding Methods

Marc P. C. Fossorier, Member, IEEE, Shu Lin, Fellow, IEEE, and Dojun Rhee
Abstract—In this correspondence, the bit-error probability $P_b$ for maximum-likelihood decoding of binary linear block codes is investigated. The contribution $P_b(j)$ of each information bit $j$ to $P_b$ is considered and an upper bound on $P_b(j)$ is derived. For randomly generated codes, it is shown that the conventional approximation at high SNR, $P_b \approx (d_H/N) \cdot P_s$, where $P_s$ represents the block error probability, holds for systematic encoding only. Also, systematic encoding provides the minimum $P_b$ when the inverse mapping corresponding to the generator matrix of the code is used to retrieve the information sequence. The bit-error performances corresponding to other generator matrix forms are also evaluated. Although derived for codes with a randomly generated generator matrix, these results are shown to provide good approximations for codes used in practice. Finally, for soft-decision decoding methods which require a generator matrix with a particular structure, such as trellis decoding, multistage decoding, or algebraic-based soft-decision decoding, equivalent schemes that reduce the bit-error probability are discussed. Although the gains achieved at practical bit-error rates are only a fraction of a decibel, they remain meaningful, as they are of the same order as the error performance differences between optimum and suboptimum decodings. Most importantly, these gains are free, as they are achieved with little or no additional circuitry which is transparent to the conventional implementation.
Index Terms—Bit-error probability, block codes, maximum-likelihood decoding, soft-decision decoding, weight distribution.
Manuscript received June 8, 1996; revised February 5, 1998. This work was supported by NSF under Grant NCR-94-15374, NASA under Grant NAG 5-931, and the LSI Logic Corporation. The material in this correspondence was presented in part at the IEEE International Symposium on Information Theory and Its Applications, Victoria, BC, Canada, September 1996. M. P. C. Fossorier and S. Lin are with the Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, HI 96822 USA. D. Rhee is with the LSI Logic Corporation, Milpitas, CA 95035 USA. Publisher Item Identifier S 0018-9448(98)06742-X.

I. INTRODUCTION

For equally likely binary phase-shift keying (BPSK) transmission over the additive white Gaussian noise (AWGN) channel, maximum-likelihood decoding (MLD) of a linear $(N, K)$ block code provides the optimum decoding strategy to minimize the probability that a decoded block is in error [1, p. 26]. However, MLD does not guarantee that the probability of a bit being in error is minimized. Although this fact has long been recognized, the optimal decoding strategy that minimizes the bit-error probability associated with MLD for the AWGN channel is still unknown, due to the high dependency on the code structure, which makes the analysis very difficult. Most of the research on this subject is related to the binary-symmetric channel (BSC) and mostly consists of improving the traditional decoding rule using the standard array [2, p. 68]. Therefore, these works assume hard-decision decoding of the received sequence. A concise overview of these decoding schemes is available in [3, Part II]. For the AWGN channel, asymptotic bounds on the error performance have been derived in [4]. In [4], the probability of a bit error in the decoded sequence of length $N$ is minimized. However, in many applications, the main goal is the minimization of a bit error in the corresponding
sequence of $K$ information bits. The two corresponding bit-error probabilities are the same for a systematic code, but they differ for encoding methods not in systematic form, because of error propagation effects when retrieving the $K$ information bits from the decoded binary sequence.

In this correspondence, we consider the minimization of the bit-error probability for MLD of linear block codes and other soft-decision decoding methods. Although not optimum, this minimization remains important, as MLD has been widely used in practical applications. We assume that the information sequence of length $K$ is recovered from the decoded codeword based on the inverse mapping defined by the generator matrix of the code. For block codes, the large error coefficients can justify this strategy, which is explicitly or implicitly used in many decoding methods such as conventional trellis decoding, multistage decoding, or majority-logic decoding. Therefore, for a particular code and the same optimal block error probability, we determine the best encoding method for delivering as few erroneous information bits as possible whenever a block is in error at the decoder output. We first derive a general upper bound on the bit-error probability which applies to any generator matrix and is tight at medium to high signal-to-noise ratio (SNR). This bound considers the individual contribution of each information bit separately. For randomly generated codes, we then show that the systematic generator matrix (SGM) provides the minimum bit-error probability. To this end, a submatrix of the generator matrix defining an equivalent code for the bit considered is introduced. A similar general result holds for the optimum bit-error probability for the BSC [3]. We finally discuss how to achieve this performance whenever systematic encoding is not the natural choice, as for trellis decoding [5], multistage decoding [6]–[8], or MLD in conjunction with algebraic decoding [9]–[13]. Minimizing the bit-error probability associated with MLD also becomes important whenever the considered block code is used as the inner code of a concatenated coding system [14].

In Section II of this correspondence, general definitions and bounds for the bit-error probability associated with MLD are briefly reviewed. The individual contribution of each information bit to these bounds is then evaluated in Section III. Results for randomly generated codes are derived in Section IV. In Section V, we discuss how to improve the bit-error probability corresponding to a generator matrix not in systematic form, as for trellis decoding or algebraic-based soft-decision decoding. Finally, concluding remarks are given in Section VI.

II. REVIEW OF THE BIT-ERROR PROBABILITY FOR MLD
Suppose an $(N, K, d_H)$ binary linear code $C$ with generator matrix $G$ and minimum Hamming distance $d_H$ is used for error control over the AWGN channel. The matrix $G$ consists of the $K$ rows $g_i$, for $i = 1, \cdots, K$. Let $c = (c_1, c_2, \cdots, c_N)$ be a codeword in $C$. For BPSK transmission, the codeword $c$ is mapped into a normalized bipolar sequence $x = (x_1, x_2, \cdots, x_N)$ with $x_i = (-1)^{c_i} \in \{\pm 1\}$. After transmission, the received sequence at the output of the sampler in the demodulator is $r = (r_1, r_2, \cdots, r_N)$ with $r_i = x_i + n_i$, where for $1 \le i \le N$ the $n_i$'s are statistically independent Gaussian random variables with zero mean and variance $N_0/2$. Since $C$ is a linear code, we can assume that the all-zero codeword is transmitted. If $w_i$ represents the number of codewords of weight $i$ in $C$, the block error probability associated with MLD is upper-bounded, based on the union bound [1, p. 30], by
$$P_s \le \sum_{i=d_H}^{N} w_i\, \tilde{Q}(\sqrt{i}) \tag{1}$$

where

$$\tilde{Q}(x) = (\pi N_0)^{-1/2} \int_x^{\infty} e^{-n^2/N_0}\, dn.$$

At high SNR, the union bound is tight and the first term in the summation of (1) provides an accurate approximation, i.e.,

$$P_s \approx w_{d_H}\, \tilde{Q}(\sqrt{d_H}). \tag{2}$$

Similarly, the average bit-error probability is bounded by [1, p. 30]

$$P_b \le \sum_{i=d_H}^{N} \frac{\beta_i}{K}\, w_i\, \tilde{Q}(\sqrt{i}) \tag{3}$$

where $\beta_i$ represents the average number of nonzero information symbols associated with a codeword of weight $i$. In practice, a common approximation is

$$\frac{\beta_i}{K} \approx \frac{i}{N} \tag{4}$$

so that at high SNR, we obtain the following commonly used approximation for the bit-error probability:

$$P_b \approx \frac{d_H}{N}\, P_s. \tag{5}$$

It is important to mention that this approximation was derived in [1] and [15] for hard-decision decoding. For MLD, (5) seems unjustified for nonsystematic encoding, as decoding errors within a block can propagate when recovering the information bits. This issue is investigated in the following sections.
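For short codes, (1)–(5) can be evaluated directly. The following minimal Python sketch (an illustration added here, not part of the original analysis) does so for the (8, 4, 4) RM code, whose weight distribution is $w_4 = 14$, $w_8 = 1$; it assumes the normalized BPSK mapping of this section, under which $\tilde{Q}(x)$ reduces to $0.5\,\mathrm{erfc}(x/\sqrt{N_0})$.

```python
import math

def q_tilde(x, n0):
    """Q~(x) = (pi*N0)^(-1/2) * integral_x^inf exp(-n^2/N0) dn = 0.5*erfc(x/sqrt(N0))."""
    return 0.5 * math.erfc(x / math.sqrt(n0))

def ps_bound(weights, n0):
    """Union bound (1) on the block error probability."""
    return sum(w * q_tilde(math.sqrt(i), n0) for i, w in weights.items())

def pb_bound(weights, N, n0):
    """Bound (3) combined with the approximation (4): beta_i / K ~ i / N."""
    return sum((i / N) * w * q_tilde(math.sqrt(i), n0) for i, w in weights.items())

# (8, 4, 4) RM code: 14 codewords of weight 4 and one of weight 8.
N, K, dH = 8, 4, 4
weights = {4: 14, 8: 1}
for ebno_db in (3, 5, 7):
    # Normalized BPSK (Es = 1) gives N0 = 1 / (R * Eb/N0) with R = K/N.
    n0 = 1.0 / ((K / N) * 10 ** (ebno_db / 10))
    ps = ps_bound(weights, n0)
    pb = pb_bound(weights, N, n0)
    print(ebno_db, "dB:", ps, pb, (dH / N) * ps)   # (5) tracks (3) at high SNR
```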
III. EVALUATION OF EACH BIT CONTRIBUTION TO $P_b$

While (3) is related to the $K$ information bits delivered by the decoder, it does not show the contribution of each of these bits to $P_b$. To this end, we express the bit-error probability as follows:

$$P_b = \frac{1}{K} \sum_{j=1}^{K} P_b(j) \tag{6}$$

where $P_b(j)$ represents the error probability for the $j$th bit in a block of $K$ information bits delivered by the decoder. Consider a linear code $C$ with generator matrix $G$. For $1 \le j \le K$, let $G(j)$ denote the matrix obtained after deleting the $j$th row of $G$. The matrix $G(j)$ generates a subcode of $C$. Let $w_i(j)$ be the number of codewords of weight $i$ in the subcode generated by $G(j)$ and define

$$\tilde{w}_i(j) = w_i - w_i(j). \tag{7}$$

Note that $\tilde{w}_i(j)$ is simply the number of codewords of weight $i$ which correspond to information sequences $(b_1, b_2, \cdots, b_K)$ with $b_j = 1$. Hence, for systematic encoding, the set $\{\tilde{w}_i(j)\}$ is equivalent to the set $S_1$ introduced in [4] when restricting the associated position to $m \in \{1, \cdots, K\}$. Based on the union bound, we have

$$P_b(j) \le \sum_{i=d_H}^{N} \tilde{w}_i(j)\, \tilde{Q}(\sqrt{i}). \tag{8}$$

We call $\tilde{w}_i(j)$ the effective error coefficient associated with the $j$th information bit with respect to the generator matrix $G$. For large SNR, $P_b(j)$ is approximated as follows:

$$P_b(j) \approx \tilde{w}_{d_H}(j)\, \tilde{Q}(\sqrt{d_H}). \tag{9}$$

Define the effective average bit-error coefficient with respect to the generator matrix $G$ as

$$\bar{w}_i = \frac{1}{K} \sum_{j=1}^{K} \tilde{w}_i(j). \tag{10}$$

It follows from (3) and (6) that $\bar{w}_i = \beta_i w_i / K$ and, for large SNR,

$$P_b \approx \bar{w}_{d_H}\, \tilde{Q}(\sqrt{d_H}). \tag{11}$$

Remark: The value $\tilde{w}_i(j)$ depends on the mapping defined by the generator matrix $G$ of the code considered, as it implicitly assumes that the inverse mapping corresponding to $G$ is used to retrieve the information bits from the decoded codeword. Since, for a linear code, this mapping is one-to-one and thus invertible, the previous expressions are valid for any representation of $G$, systematic as well as nonsystematic.

TABLE I
DIFFERENT WEIGHT DISTRIBUTIONS FOR THE (64, 22, 16) RM CODE IN SYSTEMATIC (S) ($S$: $j \in [1, 22]$) AND BOOLEAN (B) FORMS ($B_1$: $j = 1$; $B_2$: $j \in [2, 6]$; $B_7$: $j \in [7, 22]$)

Example 3.1: Table I summarizes the values $\tilde{w}_i(j)$ for the (64, 22, 16) Reed–Muller (RM) code with $G$ in systematic and in Boolean forms [16, p. 370]. For the systematic form, the 22 bits have the same value $\tilde{w}_i(j)$. For the Boolean form, we obtain three different values of $\tilde{w}_i(j)$, for $j = 1$, $j \in [2, 6]$, and $j \in [7, 22]$. These three values correspond to the orders associated with the basis vectors defining the Boolean representation [16, p. 374]. Based on Table I and (7), we observe that $\bar{w}_{d_H}$ increases from 651 for $G$ in systematic form to 1000 for $G$ in Boolean form.

For a linear code, performing row operations on the generator matrix provides an equivalent code with respect to the block error probability, since $W(C) = \{w_i\}$ remains the same. This is no longer true when considering the bit-error probability, as suggested by the following theorem.
Theorem 3.1: If the $i$th row $g_i$ is added to the $j$th row $g_j$ of the generator matrix $G$ of the code $C$, then only the bit-error probability $P_b(i)$ associated with information bit $i$ is changed for the new generator matrix $\tilde{G}$.

Proof: The proof is immediate after noting that $\tilde{G}(j) = G(j)$ by construction, and that for $l \ne i$ and $l \ne j$, $\tilde{G}(l)$ and $G(l)$ define the same code, since the generators at positions $(i, j)$ are, respectively, $(g_i,\, g_j \oplus g_i)$ and $(g_i,\, g_j)$ in the corresponding respective bases. However, the generating basis for $\tilde{G}(i)$ is $\{g_l\}_{l \ne i, j} \cup \{g_j \oplus g_i\}$, which is not equivalent to $\{g_l\}_{l \ne i, j} \cup \{g_j\}$.
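For short codes, (7) and Theorem 3.1 can be verified by exhaustive enumeration. The sketch below (an added illustration; the Boolean-form matrix anticipates Example 5.2) computes the effective error coefficients $\tilde{w}_i(j)$ of the (8, 4, 4) RM code, then adds row 2 to row 4 and recomputes them; per Theorem 3.1, only the coefficients of bit 2 change.

```python
from itertools import product

def effective_coeffs(G):
    """w~_i(j) of (7): number of weight-i codewords whose information
    sequence (b_1, ..., b_K) has b_j = 1, for every j."""
    K, N = len(G), len(G[0])
    w = [dict() for _ in range(K)]
    for b in product((0, 1), repeat=K):
        c = [0] * N
        for r in range(K):
            if b[r]:
                c = [x ^ y for x, y in zip(c, G[r])]
        i = sum(c)
        for j in range(K):
            if b[j]:
                w[j][i] = w[j].get(i, 0) + 1
    return w

# Boolean-form generator matrix of the (8, 4, 4) RM code (cf. Example 5.2).
G = [[1, 1, 1, 1, 1, 1, 1, 1],
     [0, 0, 0, 0, 1, 1, 1, 1],
     [0, 0, 1, 1, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 1, 0, 1]]

before = effective_coeffs(G)
G[3] = [x ^ y for x, y in zip(G[3], G[1])]   # add row 2 to row 4 (g_2 into g_4)
after = effective_coeffs(G)
for j in range(4):
    print("bit", j + 1, before[j], "->", after[j])
# Per Theorem 3.1, only bit 2's coefficients differ between the two runs.
```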
TABLE II
VALUES OF $\bar{w}_{d_H} / ((d_H/N) \cdot w_{d_H})$ ASSOCIATED WITH DIFFERENT GENERATOR MATRICES OF SOME RM CODES: SYSTEMATIC GENERATOR MATRIX ($S_0$), SQUARING CONSTRUCTION WITH TWO-COMPONENT MATRICES IN SYSTEMATIC FORM ($S_1$), DOUBLE-SQUARING CONSTRUCTION WITH THREE-COMPONENT MATRICES IN SYSTEMATIC FORM ($S_2$), TRELLIS-ORIENTED GENERATOR MATRIX ($TO$), AND GENERATOR MATRIX IN BOOLEAN FORM ($B$)
IV. RANDOMLY GENERATED CODES

For a given code and a given generator matrix, the union bound on the corresponding bit-error probabilities can be evaluated based on Sections II and III. For codes of small to medium dimensions, the bound in (8) can be determined exactly with the aid of a computer. Also, for a given category of codes with a generator matrix in a specific form, $\tilde{w}_i(j)$ can be derived similarly to $w_i$ by using structural properties of the code. However, in many cases of practical interest, both sets $\{w_i\}$ and $\{\tilde{w}_i(j)\}$ remain unknown if the length $N$ and the dimension $K$ are too large. In many such cases, a good approximation of (1) can be obtained by substituting a binomial distribution for the weight distribution of the code [16]. Many Bose–Chaudhuri–Hocquenghem (BCH) codes of practical dimensions fall into this category of codes. The binomial distribution corresponds to randomly generated codes. In this section, we show how the results for randomly generated codes can be used to approximate (8).
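To illustrate the substitution (a minimal sketch; the (63, 45, 7) BCH code and the 6-dB operating point are arbitrary choices made only for this example), the binomial distribution, given as (16) below, can be plugged directly into the union bound (1):

```python
import math

def binomial_weights(N, K, d_min):
    """Binomial approximation of a weight distribution, w_i ~ 2^{-(N-K)} * C(N, i),
    cf. (16), truncated below the minimum distance."""
    return {i: 2.0 ** (-(N - K)) * math.comb(N, i) for i in range(d_min, N + 1)}

def ps_union_bound(weights, n0):
    # (1) with Q~(x) = 0.5 * erfc(x / sqrt(N0))
    return sum(w * 0.5 * math.erfc(math.sqrt(i) / math.sqrt(n0))
               for i, w in weights.items())

# e.g., the (63, 45) BCH code with d_H = 7, evaluated at Eb/N0 = 6 dB
N, K, dH = 63, 45, 7
n0 = 1.0 / ((K / N) * 10 ** (6 / 10))   # Es = 1 for the normalized BPSK mapping
print(ps_union_bound(binomial_weights(N, K, dH), n0))
```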
A. Dependency Matrix

We consider a $K \times N$ matrix $G = [g_{i,j}]$ generated randomly with $\mathrm{Prob}(g_{i,j} = 0) = \mathrm{Prob}(g_{i,j} = 1) = 1/2$. We evaluate $P_b(j)$ corresponding to row $j$ and, without loss of generality, permute row $j$ and row 1 to simplify the notation. Then, by performing row and column permutations as well as row additions, except permuting row 1 or adding row 1 to other rows of $G$, we can obtain the form

$$G = \left[\begin{matrix} \mathbf{0} \\ I_{K-1} \end{matrix} \;\middle|\; P \right] \tag{12}$$

where $\mathbf{0}$ is a row of $K-1$ zeros, $I_{K-1}$ represents the identity matrix of dimension $K-1$, and $P$ is a $K \times (N-K+1)$ matrix. Theorem 3.1 ensures that it is possible to put $G$ into the form of (12) without changing $P_b(j)$. The matrix $P$ can be chosen with a column starting with "1" and of Hamming weight $\delta$. The value $\delta$ represents the minimum Hamming weight of the columns starting with "1" over all the possible $P$'s constructed. Since $\dim(C) = K$, such a column always exists and, in general, the construction method of the code considered allows one to easily identify $\delta$. After permuting this column to position 1, $G$ can be rewritten as

$$G = \left[\begin{matrix} D_\delta(j) \\ [\mathbf{0}] \end{matrix} \;\middle|\; \tilde{P} \right] \tag{13}$$

where $\tilde{P}$ is a $K \times (N-\delta)$ matrix and $D_\delta(j)$ is the $\delta \times \delta$ square matrix

$$D_\delta(j) = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{1} & I_{\delta-1} \end{bmatrix} \tag{14}$$

associated with the row $j$ considered, with $\mathbf{1}$ representing the all-one vector. The matrix $D_\delta(j)$ is defined as the dependency matrix associated with dimension $j$ of the generator matrix $G$. This matrix allows us to derive the following theorem.
Theorem 4.1: Let $C$ be an $(N, K)$ linear block code with a randomly generated generator matrix. Then the value $\tilde{w}_i(j)$ corresponding to dimension $j$ with associated dependency matrix $D_\delta(j)$ is well approximated by

$$\tilde{w}_i(j) \approx 2^{-(N-K)} \sum_{l=0}^{\lfloor(\delta-1)/2\rfloor} \binom{\delta}{2l+1} \binom{N-\delta}{i-(2l+1)}. \tag{15}$$

Proof: The proof is given in Appendix A.
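Theorem 4.1 lends itself to a quick Monte Carlo check: fix the first $\delta$ columns of $G$ to $[D_\delta(j);\, \mathbf{0}]$ as in (13), draw the remaining columns at random, and average the exact $\tilde{w}_i(j)$ over many trials. The sketch below is such a check (the small parameters $N = 16$, $K = 8$, $\delta = 4$ are chosen only to keep the enumeration cheap):

```python
import math
import random
from itertools import product

def wtilde_exact(G):
    """Exact w~_i(1): weight-i codewords whose message has b_1 = 1."""
    K, N = len(G), len(G[0])
    w = {}
    for b in product((0, 1), repeat=K - 1):
        c = list(G[0])                       # b_1 = 1 is always included
        for r, bit in enumerate(b, start=1):
            if bit:
                c = [x ^ y for x, y in zip(c, G[r])]
        i = sum(c)
        w[i] = w.get(i, 0) + 1
    return w

def rhs_15(N, K, delta, i):
    """Right-hand side of (15)."""
    return 2.0 ** (-(N - K)) * sum(
        math.comb(delta, 2 * l + 1) * math.comb(N - delta, i - (2 * l + 1))
        for l in range((delta - 1) // 2 + 1)
        if 0 <= i - (2 * l + 1) <= N - delta)

def random_G(N, K, delta, rng):
    """G in the form (13): first delta columns equal [D_delta(j); 0]."""
    G = [[rng.randrange(2) for _ in range(N - delta)] for _ in range(K)]
    for r in range(K):
        prefix = [0] * delta
        if r == 0:
            prefix[0] = 1                    # first row of D_delta(j): (1, 0, ..., 0)
        elif r < delta:
            prefix[0], prefix[r] = 1, 1      # rows of [1 | I_{delta-1}]
        G[r] = prefix + G[r]
    return G

rng = random.Random(1)
N, K, delta, trials = 16, 8, 4, 300
avg = {}
for _ in range(trials):
    for i, cnt in wtilde_exact(random_G(N, K, delta, rng)).items():
        avg[i] = avg.get(i, 0) + cnt / trials
for i in sorted(avg):
    print(i, round(avg[i], 2), round(rhs_15(N, K, delta, i), 2))
```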
Theorem 4.1 indicates that the larger $\delta$, the larger the corresponding $P_b(j)$. Consequently, $\delta = 1$ gives the smallest bit-error probability based on (15). For this case, $D_1(j) = [1]$, which provides the first column of $I_K$ in (13). Therefore, based on Theorem 4.1, the optimum bit-error probability for MLD at medium to high SNR is achieved by systematic encoding if the inverse mapping defined by $G$ is used to retrieve the information bits. This strategy is intuitively correct, since whenever a code sequence estimated by the decoder is in error, the best strategy for recovering the information bits is simply to determine them independently; otherwise, errors propagate. By using the approximation [16]

$$w_i \approx 2^{-(N-K)} \binom{N}{i} \tag{16}$$

which tightly approximates the weight distribution of a code with a randomly generated generator matrix, (15) becomes, for $\delta = 1$,

$$\tilde{w}_i(j) \approx (i/N)\, w_i. \tag{17}$$

In this case, at high SNR, the bit-error probability for MLD follows

$$P_b \approx \frac{d_H}{N}\, w_{d_H}\, \tilde{Q}(\sqrt{d_H}). \tag{18}$$

In general, $w_{d_H}$ is the parameter commonly known for a considered code $C$. Hence, based on (9) and (11), the ratios $\tilde{w}_{d_H}(j)/w_{d_H}$ and $\bar{w}_{d_H}/w_{d_H}$ are of particular interest, since they allow one to closely approximate $P_b(j)$ and $P_b$ at high SNR. The normalized ratios $\bar{w}_{d_H}/w_{d_H} \cdot (d_H/N)^{-1}$ are listed in Table II for some RM codes of length $N \le 32$ with various structures of generator matrices used in practice. We consider the SGM, the generator matrices of the squaring and double-squaring constructions [5] with each component matrix in systematic form, the trellis-oriented generator matrix (TOGM) as described in [5], and the generator matrix in Boolean form. We see that the SGM provides the smallest error coefficient. For $N \ge 16$, we notice a significant difference between the average bit-error coefficients associated with the SGM and the TOGM. Also, for all these codes, the Boolean representation provides the worst value of $\bar{w}_{d_H}$. This is not surprising, as for this form it is well known that only one bit can be decoded independently of the others [17].
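The quantity listed in Table II can be reproduced by enumeration for any short code. In the following toy version of that computation, the systematic matrix is the standard $[I\,|\,P]$ generator of the (8, 4, 4) extended Hamming code, which is equivalent to the RM code; the run is expected to print 1.0 for the SGM, in agreement with (17), and roughly 1.11 for the Boolean form.

```python
from itertools import product

def normalized_ratio(G, dH):
    """w-bar_{d_H} / ((d_H / N) * w_{d_H}), the quantity listed in Table II."""
    K, N = len(G), len(G[0])
    w_d, acc = 0, 0
    for b in product((0, 1), repeat=K):
        c = [0] * N
        for r in range(K):
            if b[r]:
                c = [x ^ y for x, y in zip(c, G[r])]
        if sum(c) == dH:
            w_d += 1
            acc += sum(b)    # contributes 1 to w~_{d_H}(j) for each j with b_j = 1
    return (acc / K) / ((dH / N) * w_d)      # average per (10), then normalize

SGM = [[1, 0, 0, 0, 0, 1, 1, 1],    # [I | P] form, equivalent to the (8,4,4) RM code
       [0, 1, 0, 0, 1, 0, 1, 1],
       [0, 0, 1, 0, 1, 1, 0, 1],
       [0, 0, 0, 1, 1, 1, 1, 0]]
BOOL = [[1, 1, 1, 1, 1, 1, 1, 1],   # Boolean form (cf. Example 5.2)
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 1, 1, 0, 0, 1, 1],
        [0, 1, 0, 1, 0, 1, 0, 1]]
print(normalized_ratio(SGM, 4))     # 1.0: the SGM meets (17)
print(normalized_ratio(BOOL, 4))    # ~1.107: the Boolean form is worse
```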
B. Application to the Iterative Squaring Construction

Based on (15), we can easily approximate the ratio $\tilde{w}_i(j)/w_i$ for each value of $\delta$. We illustrate this fact by considering the iterative squaring construction [5], so that for level $L$, $\delta = 2^L$. We assume that the approximation by the distribution of a randomly generated code holds. Arguments based on the central limit theorem can be used to justify this assumption [16], but are beyond the scope of this correspondence. In Appendix B, it is shown that for the iterative squaring construction of level $L$, the ratio $\tilde{w}_i(j)/w_i$ is well approximated by

$$\tilde{w}_i(j)/w_i \approx 2^l\, \frac{i(N-i)}{N(N-1)} \prod_{x=1}^{l-1} \frac{N - 2^x i}{N - 2^x}\, \bigl(1 + O(i^2/N^2)\bigr) \tag{19}$$

for $1 \le l \le L$. The values $\delta = 2^l$ correspond to the possible forms of dependency matrices associated with the iterative squaring construction of level $L$. Equation (19) shows that for $l \ge 1$, the ratio $\tilde{w}_i(j)/w_i$ is multiplied by $2(N - 2^l i)/(N - 2^l)$ each time $l$ is incremented. Therefore, (19) provides a good approximation of the ratio $\tilde{w}_i(j)/w_i$ for the $L$-level iterative construction as long as $2(N - 2^{l-1} i)/(N - 2^{l-1}) > 1$. While the approximation associated with (19) provides a lower bound on this ratio, an upper bound is obtained by writing, after some elementary algebra,

$$O(i^2/N^2) = \bigl((2^{l-1} - 1)(2^l - 3) + 1\bigr)(i^2/N^2) - O(i/N^2). \tag{20}$$
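The quality of the leading term of (19) can be examined numerically against the exact ratio implied by (15) and (16). In the sketch below ($N = 64$ and $i = 16$ mirror the (64, 22, 16) RM code of Table I; the binomial model is assumed throughout), the gap widens as $l$ grows, consistent with (19) being a lower bound whose correction term (20) grows with $l$:

```python
import math

def ratio_15(N, delta, i):
    """w~_i(j)/w_i from (15) and (16); the 2^{-(N-K)} factors cancel."""
    s = sum(math.comb(delta, 2 * l + 1) * math.comb(N - delta, i - (2 * l + 1))
            for l in range((delta - 1) // 2 + 1)
            if 0 <= i - (2 * l + 1) <= N - delta)
    return s / math.comb(N, i)

def ratio_19(N, l, i):
    """Leading term of (19), without the O(i^2/N^2) correction."""
    r = 2 ** l * i * (N - i) / (N * (N - 1))
    for x in range(1, l):
        r *= (N - 2 ** x * i) / (N - 2 ** x)
    return r

N, i = 64, 16
for l in range(1, 4):                     # delta = 2^l
    print(l, round(ratio_15(N, 2 ** l, i), 4), round(ratio_19(N, l, i), 4))
```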
For all RM codes with $N \le 64$ and all studied practical constructions, we find that the approximation of (19) is in fact satisfied with equality for $i = d_H$. Also, a close upper bound on $\tilde{w}_i(j)$ is obtained for $i > d_H$. This fact can be verified for the Boolean representation of the (64, 22, 16) RM code given in Table I. Therefore, despite the fact that the weight distributions of RM codes are far from binomial distributions, their "bell-shaped" structures seem sufficient for (19) to provide a good approximation of the error coefficients associated with the dominant terms of (8).

V. APPLICATIONS

A. ML Trellis Decoding

Assuming the results derived in Section IV-A hold for the linear block codes considered, it follows that any MLD scheme based on a particular nonsystematic generator matrix becomes suboptimum if this matrix is used for encoding. Hence, if the TOGM of a code is used for encoding, then trellis decoding becomes suboptimum with respect to the bit-error probability. In this section, we show that for ML trellis decoding it is still possible to achieve the optimum bit-error probability corresponding to Section IV-A.

Let $G_t$ denote the TOGM of the code $C_t$. Then, by row additions only, it is possible to obtain the generator matrix $G$ of an equivalent code $C$ which contains the $K$ columns of the identity matrix. This matrix is known as the reduced echelon form (REF). These operations modify the mapping between information bits and codewords, but since no column permutation has been realized, each codeword of $C$ is still uniquely represented by a path in the trellis diagram of $C_t$. Therefore, ML trellis decoding of the received sequence is still possible if we use $G$ for encoding. The decoder estimates the code sequence $\hat{x}$ which is closest to the received sequence $r$; the information bits are then easily retrieved due to the systematic nature of $G$. Since no restriction on $G_t$ applies, the matrix $G$ can be obtained for any possible trellis decomposition. A simple example that shows how to construct and exploit $G$ is given next for the (8, 4, 4) RM code.
Fig. 1. (8, 4, 4) RM code trellis diagram.
Example 5.1: Consider the (8, 4, 4) RM code with

$$G_t(4) = \begin{bmatrix} 0&1&0&1&0&1&0&1\\ 1&1&1&1&0&0&0&0\\ 0&0&1&1&1&1&0&0\\ 0&0&0&0&1&1&1&1 \end{bmatrix}. \tag{21}$$

The corresponding four-section minimum trellis is represented in Fig. 1. After adding rows 1 and 3 to row 2, we obtain

$$G = \begin{bmatrix} 0&1&0&1&0&1&0&1\\ 1&0&0&1&1&0&0&1\\ 0&0&1&1&1&1&0&0\\ 0&0&0&0&1&1&1&1 \end{bmatrix}. \tag{22}$$

We easily verify that any codeword defined by $G$ still corresponds to a trellis path. The information sequence is easily retrieved from the trellis decoding solution $\hat{c} = (\hat{c}_1, \cdots, \hat{c}_8)$ by identifying $\hat{b} = (\hat{c}_2, \hat{c}_1, \hat{c}_3, \hat{c}_7)$.

In [18], a specific MLD algorithm for the (63, 57, 3) Hamming code is proposed. The decoding is realized based on a generator matrix in cyclic form. It is also shown that an equivalent systematic representation outperforms the cyclic form by 0.4 dB at the bit-error rate (BER) $10^{-5}$. However, the decoding of the systematic code requires an additional step. By processing the generator matrix in cyclic form as described in this section, this additional step can be removed, as the encoding matrix becomes $G = [I_{57}\; P_6]$. On the other hand, the cyclic structure no longer exists, but the encoder remains very simple.

Fig. 2 depicts the simulation results for the (32, 26, 4) RM code. We simulated MLD based on both the REF and the TOGM, and plotted the first term of the union bound given in (11). At the BER $10^{-6}$, the bounds are very tight and the gap between the two curves is about 0.2 dB.
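The construction used in Example 5.1 is ordinary Gaussian elimination restricted to row operations. A minimal sketch follows; note that it picks the leftmost available identity columns, so for $G_t(4)$ it returns an REF differing from (22) by a row permutation, and the retrieval rule is read off from the returned pivot positions rather than fixed to $(\hat{c}_2, \hat{c}_1, \hat{c}_3, \hat{c}_7)$.

```python
def ref_by_row_ops(Gt):
    """Row operations only (swaps and GF(2) additions), no column permutations,
    so every codeword of the result is still a path in the trellis of Gt.
    Returns (G, pivots): bit k of the message is read at position pivots[k]."""
    G = [row[:] for row in Gt]
    K, N = len(G), len(G[0])
    pivots, r = [], 0
    for col in range(N):
        if r == K:
            break
        piv = next((i for i in range(r, K) if G[i][col]), None)
        if piv is None:
            continue
        G[r], G[piv] = G[piv], G[r]
        for i in range(K):
            if i != r and G[i][col]:
                G[i] = [x ^ y for x, y in zip(G[i], G[r])]
        pivots.append(col)
        r += 1
    return G, pivots

# TOGM of the (8, 4, 4) RM code, rows of (21).
Gt = [[0, 1, 0, 1, 0, 1, 0, 1],
      [1, 1, 1, 1, 0, 0, 0, 0],
      [0, 0, 1, 1, 1, 1, 0, 0],
      [0, 0, 0, 0, 1, 1, 1, 1]]
G, pivots = ref_by_row_ops(Gt)
print(pivots)    # [0, 1, 2, 4]: after trellis decoding, read b^_k at these positions
```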
Fig. 2. Simulated and theoretical bit-error probabilities for the (32, 26, 4) RM code with encoding in TOGM and REF.
B. MLD in Conjunction with Algebraic Decoding

Several soft-decision decoding algorithms operating in conjunction with an algebraic decoder have been proposed [9]–[13]. In general, algebraic decoding is associated with a particular generator matrix form $G_a$. Therefore, if this form is used for encoding, the corresponding algorithm becomes suboptimum with respect to the bit-error probability. Algebraic decoding algorithms can be divided into two classes, depending on whether the decoder delivers an estimate of the transmitted codeword of length $N$ or of the information sequence of length $K$. In the first case, the method of Section V-A extends in a straightforward fashion; decoding of cyclic codes can be realized this way. However, a similar method is also possible for the second class of algebraic decoders. Again, this method is transparent with respect to algebraic decoding, so that the conventional algebraic decoder corresponding to the code considered can still be used. The method simply consists of recording the row operations processed to obtain $G$ in REF from $G_a$ and applying the inverse operations to the information sequence delivered by the algebraic decoder. We illustrate this method for majority-logic decoding of RM codes.

Example 5.2: Majority-logic decoding of the (8, 4, 4) RM code is based on the generator matrix in Boolean form

$$G_a = \begin{bmatrix} 1&1&1&1&1&1&1&1\\ 0&0&0&0&1&1&1&1\\ 0&0&1&1&0&0&1&1\\ 0&1&0&1&0&1&0&1 \end{bmatrix}. \tag{23}$$

By adding each of the other rows to row 1, we obtain

$$G = \begin{bmatrix} 1&0&0&1&0&1&1&0\\ 0&0&0&0&1&1&1&1\\ 0&0&1&1&0&0&1&1\\ 0&1&0&1&0&1&0&1 \end{bmatrix} \tag{24}$$

which contains the four columns of $I_4$. If $G$ is used for encoding the information sequence $b = (b_0, \cdots, b_3)$ and conventional majority-logic decoding based on $G_a$ is implemented, the estimated information sequence $\hat{b} = (\hat{b}_0, \cdots, \hat{b}_3)$ is retrieved from the information sequence $\hat{d} = (\hat{d}_0, \cdots, \hat{d}_3)$ delivered by the decoder after identifying

$$\hat{b}_0 = \hat{d}_0,\quad \hat{b}_1 = \hat{d}_0 \oplus \hat{d}_1,\quad \hat{b}_2 = \hat{d}_0 \oplus \hat{d}_2,\quad \hat{b}_3 = \hat{d}_0 \oplus \hat{d}_3.$$

These equations define the new mapping corresponding to the operations processed to obtain $G$ from $G_a$.

Example 5.2 can be generalized to majority-logic decoding of any RM code. If $r$ is the order of the RM code of length $2^m$ considered, the information sequence can be decomposed into $r+1$ subblocks of $\binom{m}{i}$ bits each, for $0 \le i \le r$ [16]. For each value of $i$, the number of decoded bits $\hat{d}_j$ contained in each of the $\binom{m}{i}$ equations to retrieve the corresponding bits $\hat{b}_i$ is $2^i$.[1] These numbers are determined from the elementary row additions processed to obtain $G$ in REF from $G_a$. Note that these equations simply undo the propagation of errors associated with majority-logic decoding. Furthermore, this method is transparent with respect to algebraic decoding, so that the conventional algebraic decoder corresponding to the code considered can still be used. A simple logic circuit is needed to retrieve the information sequence from the decoded sequence. Fig. 3 depicts the improvement achieved by this method for Chase algorithm-2 with majority-logic decoding for the (64, 42, 8) RM code [10]. The proposed method outperforms Chase algorithm-2 with conventional majority-logic decoding by 0.15 dB at the BER $10^{-5}$.

[1] For each row $i$, the first "1" of the row in $G_a$ is chosen to represent dimension $i$ in $G$.
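The bookkeeping described above can be written generically: row-reduce $G_a$ while logging every GF(2) row addition, then replay the log on the message delivered by the algebraic decoder. In the sketch below, the exhaustive search standing in for the decoder is only a placeholder used to check the mapping in the noiseless case; a real majority-logic decoder would supply $\hat{d}$ instead.

```python
from itertools import product

def ref_with_oplog(Ga):
    """Reduce Ga to an REF G = A * Ga using logged GF(2) row additions
    (a swap is logged as three additions). The log encodes A."""
    G = [row[:] for row in Ga]
    K, N = len(G), len(G[0])
    ops, r = [], 0
    for col in range(N):
        if r == K:
            break
        piv = next((i for i in range(r, K) if G[i][col]), None)
        if piv is None:
            continue
        if piv != r:                         # swap rows r and piv via three additions
            for a, b in ((r, piv), (piv, r), (r, piv)):
                G[a] = [x ^ y for x, y in zip(G[a], G[b])]
                ops.append((a, b))
        for i in range(K):
            if i != r and G[i][col]:
                G[i] = [x ^ y for x, y in zip(G[i], G[r])]
                ops.append((i, r))
        r += 1
    return G, ops

def undo(d_hat, ops):
    """Apply A^{-1} to the decoder's message estimate: each logged addition
    'row a += row b' acts on a message vector as v[b] ^= v[a], replayed in the
    original order (every GF(2) addition is its own inverse)."""
    v = list(d_hat)
    for a, b in ops:
        v[b] ^= v[a]
    return v

def encode(b, M):
    c = [0] * len(M[0])
    for j, bit in enumerate(b):
        if bit:
            c = [x ^ y for x, y in zip(c, M[j])]
    return c

# Boolean-form matrix (23) of the (8, 4, 4) RM code.
Ga = [[1, 1, 1, 1, 1, 1, 1, 1],
      [0, 0, 0, 0, 1, 1, 1, 1],
      [0, 0, 1, 1, 0, 0, 1, 1],
      [0, 1, 0, 1, 0, 1, 0, 1]]
G, ops = ref_with_oplog(Ga)
for b in product((0, 1), repeat=4):          # noiseless check of the remapping
    c = encode(b, G)
    d = next(d for d in product((0, 1), repeat=4) if encode(d, Ga) == c)
    assert undo(d, ops) == list(b)
print("mapping verified for all 16 messages")
```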
Fig. 3. Simulated bit-error probabilities for Chase algorithm-2 of the (64, 42, 8) RM code with majority-logic decoding, and encoding in Boolean and REF forms.
C. Multistage Decoding

Multistage decoding exploits the decomposable structure of the codes considered [6]–[8]. Consequently, multistage decoding does not achieve the best bit-error probability if the generator matrix $G_d$ corresponding to the decomposable structure of the code is used for encoding. The REF can be obtained from $G_d$ in two steps. First, the generator matrices of the component codes corresponding to each decoding stage in $G_d$ are put into REF independently to obtain $G_d'$. Then, the REF $G$ is obtained by additions of rows from different submatrices of $G_d'$. The conventional multistage decoding algorithm can be used in conjunction with encoding based on $G$. At each decoding stage, the same methods as previously described can be used to retrieve the correct sequence based on the processing of the corresponding submatrix when constructing $G_d'$ from $G_d$. Finally, the inverse operations performed to construct $G$ from $G_d'$ are applied to the decisions delivered by each stage.

Example 5.3: Trellis-based closest-coset decoding (CCD) of the (8, 4, 4) RM code is realized based on [6]

$$G_d = \begin{bmatrix} 1&1&0&0&1&1&0&0\\ 0&1&1&0&0&1&1&0\\ 0&0&1&1&0&0&1&1\\ 0&0&0&0&1&1&1&1 \end{bmatrix}. \tag{25}$$

The matrix $G_d'$ is obtained by adding row 2 to row 1, so that the matrices of the (4, 3, 2) and (4, 1, 4) component codes associated with each stage are both in REF, with the repetition structure of the first component code preserved. Finally, row 4 is added to row 1 to obtain

$$G = \begin{bmatrix} 1&0&1&0&0&1&0&1\\ 0&1&1&0&0&1&1&0\\ 0&0&1&1&0&0&1&1\\ 0&0&0&0&1&1&1&1 \end{bmatrix}. \tag{26}$$

Assume $G$ is used for encoding $b = (b_0, \cdots, b_3)$ and, for simplicity, consider the noiseless transmission case. If $b_0 = 0$, then conventional CCD directly delivers the correct $\hat{c}$. Else, if $b_0 = 1$, the codeword $\hat{c}_1$ decoded at the first stage is the complement of the correct codeword corresponding to this stage. However, once the contribution of $\hat{c}_1$ has been removed from the received sequence, the second-stage decoding is realized based on the three first rows of $G_d'$ with the method of Section V-A. Consequently, if $\hat{b}_{1,0}$ and $(\hat{b}_{2,0}, \hat{b}_{2,1}, \hat{b}_{2,2})$ represent the information sequences delivered by stage 1 and stage 2, respectively, the final decoded sequence becomes

$$\hat{b}_0 = \hat{b}_{2,0},\quad \hat{b}_1 = \hat{b}_{2,1},\quad \hat{b}_2 = \hat{b}_{2,2},\quad \hat{b}_3 = \hat{b}_{1,0} \oplus \hat{b}_{2,0}.$$

Equivalently, if $\hat{c}_1$ and $\hat{c}_2$ represent the codewords delivered at stages 1 and 2, respectively, the information sequence is retrieved from $\hat{c}_1 \oplus \hat{c}_2$ based on $G$ in REF. Gains of a few tenths of a decibel are achieved by encoding based on $G$ rather than $G_d$. However, such gains remain very meaningful: for the (32, 26, 4) RM code, CCD with encoding based on the REF slightly outperforms ML trellis decoding with encoding based on the TOGM at BERs larger than $10^{-5}$.

D. Concatenated Coding

The simplest concatenated coding scheme consists of the cascade of a nonbinary outer code and an inner code [14]. At the receiver, the inner-code decoder delivers hard-decision estimates to the outer-code decoder for most of the applications used in practice. Hence, minimizing the number of errors within each information block delivered by the inner-code decoder becomes particularly relevant to this scheme, although the ultimate goal would be to minimize the number of bytes in error.
Fig. 4. Simulated bit-error probabilities for concatenated schemes with the (255, 223) RS outer code and the (64, 40, 8) inner code, and encoding in REF and TOGM.
We consider the concatenated scheme presented in [19], where the inner code is a (64, 40) subcode of the (64, 42) RM code and the outer code is the NASA standard (255, 223) RS code over GF($2^8$). The outer code is interleaved to a depth of 5. For this scheme, Fig. 4 represents the simulated bit-error performance for encoding with the TOGM and with the REF. We observe that systematic encoding outperforms the TOGM by about 0.2 dB at the BER $10^{-5}$. More importantly, we also notice that while the error performance curves corresponding to the inner codes differ by a constant value due to different error coefficients, the difference in bit-error probability between the error performance curves corresponding to the concatenated system increases as the SNR increases.
VI. CONCLUSION

In this correspondence, we have shown that for many good block codes, the SGM provides the best bit-error probability for MLD when the inverse mapping of the generator matrix $G$ is used to retrieve the information sequence. Based on the presented results, we can conclude that a careful choice of the generator matrix becomes important when comparing different optimum, near-optimum, or suboptimum soft-decision decoding schemes. Generally, tenths of a decibel separate the bit-error performance of such schemes at practical BERs, so that a poor choice of the generator matrix for one of the schemes may result in an important relative degradation. By exploiting the fact that modifying the mapping between information bits and codewords is transparent to the decoder, we modified conventional trellis decoding, multistage decoding, and MLD in conjunction with an algebraic decoder so that these schemes achieve the same bit-error performance as for systematic encoding. Hence, the decoding becomes independent of the encoding and can simply be viewed as a process providing the most likely codeword of the codebook. As a result, the decoder structure remains the same as the conventional one, but in some cases the decoded sequence requires an additional simple reprocessing.

APPENDIX A
PROOF OF THEOREM 4.1

In (14), if the first row of $D_\delta(j)$ is deleted, the resulting matrix becomes equivalent to the generator matrix of a single parity-check (SPC) code. Also, the matrix obtained by deleting the first row of $\tilde{P}$ given in (13) can be viewed as a $(K-1) \times (N-\delta)$ randomly generated matrix, since random row additions of the last $K-\delta$ rows of $G$ as given in (13) modify neither the general structure of this equation nor $w_i(j)$. The length-$\delta$ prefixes of the codewords of the subcode generated by $G(j)$ therefore have even weight. Hence, for $\delta$ odd, we evaluate, based on (16),

$$w_i(j) \approx 2^{-(N-K)} \sum_{l=0}^{(\delta-1)/2} \binom{\delta}{2l} \binom{N-\delta}{i-2l}. \tag{27}$$

Similarly, for $\delta$ even, we obtain

$$w_i(j) \approx 2^{-(N-K)} \sum_{l=0}^{\delta/2} \binom{\delta}{2l} \binom{N-\delta}{i-2l}. \tag{28}$$

Regrouping (27) and (28) with (7), (16), and the identity $\binom{N}{i} = \sum_m \binom{\delta}{m} \binom{N-\delta}{i-m}$, we finally obtain

$$\tilde{w}_i(j) \approx 2^{-(N-K)} \sum_{l=0}^{\lfloor(\delta-1)/2\rfloor} \binom{\delta}{2l+1} \binom{N-\delta}{i-(2l+1)}. \tag{29}$$

APPENDIX B
APPROXIMATION OF $\tilde{w}_i(j)$ BASED ON THE ITERATIVE SQUARING CONSTRUCTION

For $1 \le l \le L$ and $\delta = 2^l$, we first expand (15) as

$$\tilde{w}_i(j) \approx 2^{-(N-K)} \sum_{x=0}^{2^{l-1}-1} \binom{2^l}{2x+1} \binom{N-2^l}{i-(2x+1)} = 2^{-(N-K-l)} \binom{N}{i} \frac{i(N-i)}{\prod_{x=0}^{2^l-1}(N-x)} \sum_{x=0}^{2^{l-1}-1} \frac{1}{2^l}\binom{2^l}{2x+1} \prod_{u=1}^{2x}(i-u) \prod_{y=1}^{2^l-2-2x}(N-i-y).$$

We then approximate

$$\sum_{x=0}^{2^{l-1}-1} \frac{1}{2^l}\binom{2^l}{2x+1} \prod_{u=1}^{2x}(i-u) \prod_{y=1}^{2^l-2-2x}(N-i-y) = N^{2^l-2} - N^{2^l-3}\Bigl((2^l-2)\,i + \sum_{x=0}^{2^l-2} x\Bigr) + O(i^2 N^{2^l-4}). \tag{30}$$

Finally, by noting that

$$\sum_{x=0}^{2^l-2} x = \sum_{x=2}^{2^l-1} x - \sum_{x=1}^{l-1} 2^x \tag{31}$$

we obtain

$$\tilde{w}_i(j) \approx 2^{-(N-K-l)} \binom{N}{i} \frac{i(N-i)}{N(N-1)} \prod_{x=1}^{l-1} \frac{N-2^x i}{N-2^x}\, \bigl(1 + O(i^2/N^2)\bigr) \approx 2^l\, \frac{i(N-i)}{N(N-1)} \prod_{x=1}^{l-1} \frac{N-2^x i}{N-2^x}\, \bigl(1 + O(i^2/N^2)\bigr)\, w_i \tag{32}$$

based on (16).

ACKNOWLEDGMENT

The authors wish to thank H. T. Moorthy for many interesting discussions on the subject and for providing the trellis decoding simulation results. They are also very grateful to Prof. T. Kasami for many valuable comments and to one reviewer for improving the presentation of the results.
REFERENCES

[1] G. C. Clark and J. B. Cain, Jr., Error-Correction Coding for Digital Communications. New York: Plenum, 1981.
[2] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[3] A. B. Kiely, J. T. Coffey, and M. R. Bell, "Optimal information bit decoding of linear block codes," IEEE Trans. Inform. Theory, vol. 41, pp. 130–140, Jan. 1995.
[4] C. R. P. Hartmann, L. D. Rudolph, and K. G. Mehrotra, "Asymptotic performance of optimum bit-by-bit decoding for the white Gaussian channel," IEEE Trans. Inform. Theory, vol. IT-23, pp. 520–522, July 1977.
[5] G. D. Forney, Jr., "Coset codes II: Binary lattices and related codes," IEEE Trans. Inform. Theory, vol. 34, pp. 1152–1187, Sept. 1988.
[6] F. Hemmati, "Closest coset decoding of |u|u+v| codes," IEEE J. Select. Areas Commun., vol. 7, pp. 982–988, Aug. 1989.
[7] G. Schnabl and M. Bossert, "Soft-decision decoding of Reed–Muller codes as generalized multiple concatenated codes," IEEE Trans. Inform. Theory, vol. 41, pp. 304–308, Jan. 1995.
[8] M. P. C. Fossorier and S. Lin, "Generalized coset decoding," IEEE Trans. Commun., vol. 45, pp. 393–395, Apr. 1997.
[9] G. D. Forney, Jr., "Generalized minimum distance decoding," IEEE Trans. Inform. Theory, vol. IT-12, pp. 125–131, Apr. 1966.
[10] D. Chase, "A class of algorithms for decoding block codes with channel measurement information," IEEE Trans. Inform. Theory, vol. IT-18, pp. 170–182, Jan. 1972.
[11] T. Kaneko, T. Nishijima, H. Inazumi, and S. Hirasawa, "An efficient maximum likelihood decoding of linear block codes with algebraic decoder," IEEE Trans. Inform. Theory, vol. 40, pp. 320–327, Mar. 1994.
[12] H. T. Moorthy, S. Lin, and T. Kasami, "Soft-decision decoding of binary linear block codes based on an iterative search algorithm," IEEE Trans. Inform. Theory, vol. 43, pp. 1030–1040, May 1997.
[13] M. P. C. Fossorier and S. Lin, "Complementary reliability-based decodings of binary linear block codes," IEEE Trans. Inform. Theory, vol. 43, pp. 1667–1672, Sept. 1997.
[14] G. D. Forney, Jr., Concatenated Codes. Cambridge, MA: MIT Press, 1966.
[15] D. J. Torrieri, "The information-bit error rate for block codes," IEEE Trans. Commun., vol. COM-32, pp. 474–476, Apr. 1984.
[16] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1977.
[17] W. W. Peterson, Error-Correcting Codes. Cambridge, MA: MIT Press, 1961.
[18] A. M. Michelson and D. F. Freeman, "Viterbi decoding of the (63, 57) Hamming codes—Implementation and performance results," IEEE Trans. Commun., vol. 43, pp. 2653–2656, Nov. 1995.
[19] T. Kasami, T. Takata, K. Yamashita, T. Fujiwara, and S. Lin, "On bit error probability of a concatenated coding scheme," IEEE Trans. Commun., vol. 45, pp. 536–543, May 1997.