ISIT 2008, Toronto, Canada, July 6 - 11, 2008

Separating Erasures from Errors for Decoding

Khaled A.S. Abdel-Ghaffar

Jos H. Weber

University of California, Dept. of ECE, Davis, CA 95616, USA. Email: [email protected]

Delft University of Technology, IRCTR/CWPC, Mekelweg 4, 2628 CD Delft, The Netherlands. Email: [email protected]

Abstract— Most decoding algorithms for linear codes are designed to correct or detect errors. However, many channels cause erasures in addition to errors. In principle, decoding over such channels can be accomplished by deleting the erased symbols and decoding the resulting vector with respect to a punctured code. For any given linear code and any given maximum number of correctable erasures, we introduce parity-check matrices yielding parity-check equations that do not check any of the erased symbols and that are sufficient to characterize all punctured codes corresponding to this maximum number of erasures. This allows for the separation of erasures from errors to facilitate decoding. The parity-check matrices typically have redundant rows. We give several constructions of such matrices and prove general bounds on their minimum sizes.

I. INTRODUCTION

Many channels cause errors, where the transmitted symbol is received erroneously, and erasures, where the transmitted symbol is received as an erasure, i.e., it is replaced by a special erasure symbol so that its position is known. Using coding techniques, the receiver attempts to correct the errors and retrieve the erased symbols or to detect the presence of errors.

Let C be an [n, k, d] linear block code over GF(q), where n, k, and d denote the code's length, dimension, and Hamming distance, respectively, and q is a prime power. Such a code is a k-dimensional subspace of the space of vectors of length n over GF(q), in which any two different vectors differ in at least d positions. The set of codewords of C can be defined as the null space of the row space of an r × n parity-check matrix H over GF(q) of rank n − k. The row space of H is the dual code of C. Since a q-ary vector v of length n is a codeword of C if and only if H v^T = 0, where the superscript T denotes transpose, the parity-check matrix H gives rise to r parity-check equations, denoted by

    h_j v^T = 0    for j = 1, 2, ..., r,

where h_j denotes the j-th row of H. An equation h_j v^T = 0 is said to check position i if and only if the i-th component of h_j is nonzero. In the most general scenario, if the number of erasures, e, does not exceed d − 1, then the decoder can choose two nonnegative integers t and s satisfying

    2t + s ≤ d − 1 − e    (1)

such that the following is true. If the number of errors does not exceed t, then the decoder can correct all errors and erasures. Otherwise, if the number of errors is greater than t but at most t + s, then the decoder can detect the occurrence of more than t errors and, in this case, may request the retransmission of the codeword, see, e.g., [7].
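As a concrete illustration of the trade-off expressed by (1), a minimal Python sketch follows; the function name and the printed example parameters are illustrative choices, not taken from the paper.

# Enumerate the error-correction/detection trade-offs allowed by (1) for a
# code of Hamming distance d when e symbols have been erased (0 <= e <= d-1).

def feasible_pairs(d: int, e: int):
    """Return all (t, s) with t, s >= 0 and 2*t + s <= d - 1 - e."""
    budget = d - 1 - e
    return [(t, s) for t in range(budget // 2 + 1)
                   for s in range(budget - 2 * t + 1)]

# Example values for a distance-4 code such as the one in Example 1 below:
# with e = 1 erasure, (t, s) = (1, 0) allows correcting one error; with
# e = 2 erasures, only (t, s) = (0, 1) remains, i.e., detect a single error.
print(feasible_pairs(4, 1))   # [(0, 0), (0, 1), (0, 2), (1, 0)]
print(feasible_pairs(4, 2))   # [(0, 0), (0, 1)]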

Notice that the above only states the existence of a decoder with such capabilities but does not show a specific algorithm to achieve these capabilities. Typically, decoders are devised for correcting or, alternatively, detecting errors, but not necessarily in combination with erasures. For BCH codes and Reed-Solomon codes, these decoders can be modified to correct erasures as well [9]. This modification is based on the specific structure of the codes. However, known algorithms that are applicable to linear codes in general use trials in which the erasures are replaced by symbols in GF(q) and the resulting word is decoded using a decoder capable of correcting or detecting errors only. Although two trials are sufficient for binary codes, the number of trials grows rapidly with q, rendering the applicability of this approach very limited for codes over large fields [9].

On the other hand, to prove the existence of a decoder with the prescribed capabilities, it suffices to show that if all erasures are deleted from the received word, then the errors in the resulting word can be corrected or detected based on the punctured code whose codewords are obtained by deleting all the symbols corresponding to the erased symbols in the received word, see, e.g., [7]. The crux of the proof is based on (1), which implies that the punctured code has Hamming distance at least d − e ≥ 2t + s + 1 and, therefore, t errors can be corrected, or up to t + s errors detected, in the word with all erasures deleted. In case of error correction, the erasures can then be recovered since their number is less than the Hamming distance of the code. Although this proof is actually based on an algorithm, the implementation of the decoder requires a characterization, e.g., a parity-check matrix, of the punctured code, which depends on the positions of the erasures in the received word. For the decoder to compute such a characterization after receiving the word may require an unacceptable time delay. On the other hand, storing precomputed characterizations of all punctured codes corresponding to all erasure patterns may not be feasible either.
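The "compute it when needed" alternative mentioned above can be made concrete with a small sketch: given any parity-check matrix over GF(2) and a set of erased positions, a parity-check matrix of the punctured code can be derived by Gaussian elimination at decoding time. The code below is an illustrative sketch only (binary case; the names are ours); it is exactly this per-erasure-pattern computation that the separating matrices proposed in this paper allow the decoder to avoid.

def rref_gf2(rows):
    """Row-reduce a list of GF(2) rows (lists of 0/1); return (rows, pivots)."""
    m = [r[:] for r in rows]
    pivots, r = [], 0
    for c in range(len(m[0]) if m else 0):
        pr = next((i for i in range(r, len(m)) if m[i][c]), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        for i in range(len(m)):
            if i != r and m[i][c]:
                m[i] = [(a + b) % 2 for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m[:r], pivots

def punctured_check_matrix(H, erased):
    """Parity-check matrix of the code punctured on `erased` (column indices).

    Reorder the columns so the erased ones come first, row-reduce, and keep
    the rows whose leading one lies outside the erased block; restricted to
    the unerased columns, these rows check the punctured code (valid when the
    number of erasures is less than the code's Hamming distance).
    """
    erased = sorted(erased)
    kept = [j for j in range(len(H[0])) if j not in erased]
    reordered = [[row[j] for j in erased + kept] for row in H]
    reduced, pivots = rref_gf2(reordered)
    return [row[len(erased):] for row, p in zip(reduced, pivots)
            if p >= len(erased)]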

In this paper, we propose to use a parity-check matrix, which typically has redundant rows, and which contains, as submatrices, parity-check matrices of all codes punctured in up to a fixed number of positions, denoted by l. This fixed number, ranging from zero to d − 1, is presumably the maximum number of erasures caused by the channel in a codeword. The parity-check matrix we are proposing yields enough parity-check equations that do not check any erased symbol and that are sufficient to characterize the punctured code. Having parity-check equations that do not check any of the erased symbols leads to the notion of "separating" the erasures from the errors. The fact that the same parity-check equations characterize the punctured code allows for error decoding of this code with all erasures masked. To reduce storage requirements, we are interested in finding a parity-check matrix with a minimal number of rows satisfying the above condition.

This work is motivated by the interest shown in the last decade in decoding techniques, such as belief propagation, especially as applied to LDPC codes, that are based on parity-check matrices with a large number of redundant rows. Decoding exploits the redundancy of these matrices to yield good performance: the computational complexity of decoding is reduced at the price of storing parity-check matrices with more rows than necessary to characterize the codes. Actually, decoding techniques based on such parity-check matrices have already been introduced to decode words suffering from erasures only [4],[1]. For this application, the decoder seeks a parity-check equation that checks exactly one erased symbol, whose value can then be determined directly from the equation. A set of positions is called a stopping set if there is no parity-check equation that checks exactly one symbol in these positions. Erasure decoding fails if and only if the erasures fill the positions of a nonempty stopping set. The parity-check matrices we are proposing do not have nonempty stopping sets of sizes less than or equal to the maximum number of erasures l, except in the case in which l = d − 1 and the code is MDS. Thus, except for this case, for any pattern of l or fewer erasures, not only are there enough parity-check equations not checking any of the erased symbols to characterize the punctured code, but there is also a parity-check equation that checks exactly one of the erased symbols. This greatly facilitates the retrieval of the erased symbols once the errors are corrected. It is not surprising that this work is related to work on stopping sets, especially to [2],[3],[8],[10]. However, work on stopping sets assumes that the channel does not cause errors, which limits its applicability. In contrast, our work deals with errors in addition to erasures.

The basic concept behind the proposed decoding technique is best illustrated by an example, given below.
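Before turning to the example, here is a minimal sketch of the separation step itself, assuming the binary case and using illustrative names (the matrix H and the received word are stand-ins, not those of Example 1): the decoder only selects the rows of a redundant parity-check matrix that have zeros in all erased positions, and afterwards recovers erasures from rows that check exactly one erased position.

ERASED = None  # marker for an erased symbol in the received word

def separate(H, received):
    """Split decoding of errors from erasures (GF(2) sketch).

    Returns (kept_positions, H_sep), where H_sep consists of the rows of H
    that do not check any erased position, restricted to the unerased
    positions.  These rows characterize the punctured code whenever H is
    separating for the erasure pattern.
    """
    erased = [j for j, y in enumerate(received) if y is ERASED]
    kept = [j for j in range(len(received)) if j not in erased]
    H_sep = [[row[j] for j in kept]
             for row in H if all(row[j] == 0 for j in erased)]
    return kept, H_sep

def recover_one_erasure(H, word):
    """Fill one erasure using a row of H that checks exactly one erased
    position (the iterative step of erasure decoding, which fails only when
    the erasures cover a nonempty stopping set)."""
    erased = [j for j, y in enumerate(word) if y is ERASED]
    for row in H:
        touched = [j for j in erased if row[j] == 1]
        if len(touched) == 1:
            j = touched[0]
            word[j] = sum(word[i] for i in range(len(word))
                          if i != j and row[i]) % 2
            return j
    return None  # no such row: the erasures fill a stopping set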





















Example 1: Let C be the [8, 4, 4] binary extended Hamming code with the parity-check matrix

    H = [a 6 × 8 binary matrix with redundant rows]    (2)

Notice that the first four rows of H form a parity-check matrix of the code. However, as we show, allowing redundant rows simplifies the decoding of errors and erasures. Assume that the channel causes at most one error. Suppose first that a word whose second symbol is erased is received. Since the first, second, and sixth parity-check equations of H do not check the erased symbol, we know that the length-7 word obtained by deleting the erased symbol from the received word is a received word corresponding to a transmitted codeword of the code whose parity-check matrix is

    [the 3 × 7 binary matrix formed by rows 1, 2, and 6 of H with the second column deleted]    (3)

















This parity-check matrix, which is obtained by deleting the second column and the third, fourth, and fifth rows of H, is a parity-check matrix of a [7, 4, 3] Hamming code. Decoding the length-7 word with respect to this Hamming code corrects the single error. After updating the received word accordingly, the erased symbol can be retrieved from the third parity-check equation of H, which does check the second position, and the transmitted codeword corresponding to the received word is recovered.

Finally, suppose that a word with two erased symbols is received. In this case, our goal is to detect, rather than to correct, a single error. From the second and sixth parity-check equations of H, which check neither of the erased positions, we know that, in the absence of errors among the unerased symbols, the vector obtained by deleting the erased symbols from the received word is a codeword of the code with parity-check matrix

    [the 2 × 6 binary matrix formed by rows 2 and 6 of H with the two erased columns deleted]    (4)



















In this particular case, the vector obtained by deleting the erased symbols is indeed in the null space of this matrix, i.e., it is a codeword in this code of length 6. This code has Hamming distance 2 and, therefore, if the channel caused at most one error, then it did not cause any errors in the nonerased symbols. The erased symbols can then be retrieved from the first and then the third parity-check equations of H, and the transmitted codeword is recovered.

The rest of this paper is organized as follows. Preliminaries and basic definitions are covered in Section II. Sections III and IV introduce the notions of separating matrices and separating redundancy, respectively, and contain the main results of the paper. Section V concludes the paper.
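As a brief aside before the formal development, the stopping-set property invoked in the introduction can be checked mechanically. The following sketch, again for the binary case and with illustrative names, lists the nonempty stopping sets of a given parity-check matrix up to a prescribed size.

from itertools import combinations

def is_stopping_set(H, S):
    """True if no row of the 0/1 matrix H checks exactly one position of S."""
    return all(sum(row[j] for j in S) != 1 for row in H)

def stopping_sets_up_to(H, max_size):
    """All nonempty stopping sets of H of size at most max_size."""
    n = len(H[0])
    return [set(S) for size in range(1, max_size + 1)
                   for S in combinations(range(n), size)
                   if is_stopping_set(H, S)]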































II. PRELIMINARIES

Let R be a subset of {1, 2, ..., r} and S be a subset of {1, 2, ..., n}. For any r × n matrix H over GF(q), let H(R, S) denote the |R| × |S| submatrix of H consisting of the entries in the rows indexed by R and the columns indexed by S. For simplicity, we write H(R, ·) and H(·, S) to denote H(R, S) in case S = {1, 2, ..., n} and R = {1, 2, ..., r}, respectively. We allow for empty matrices, i.e., matrices with no rows or no columns, e.g., H(R, S) in case either R or S or both are empty. The rank of an empty matrix is defined to be zero. If v is a vector of length n, then v(S) denotes the vector whose components are those of v indexed by S. Furthermore, for the code C of length n and a subset S of {1, 2, ..., n} with complement S^c = {1, 2, ..., n} \ S, define the punctured code

    C^S = { c(S^c) : c ∈ C },
i.e., C^S consists of all codewords of C in which the components in the positions belonging to S are deleted. Clearly, C^S is a linear code over GF(q) of length n − |S|, dimension at most k, and Hamming distance at least d − |S|. Furthermore, if |S| ≤ d − 1, then the dimension of C^S equals k, since the deletion of any number of components less than the Hamming distance from two distinct codewords
in a code of Hamming distance d results in distinct vectors. It follows that if |S| ≤ d − 1, then there is a one-to-one correspondence between the codes C and C^S: a vector c' is a codeword of C^S if and only if there is a unique codeword c of C such that c' = c(S^c).

Let H be an r × n matrix over GF(q) and S be a subset of {1, 2, ..., n}. We define

    H^(S) = H(R_S, S^c),   where R_S = { i ∈ {1, 2, ..., r} : h_i(S) = 0 },    (5)

i.e., H^(S) is formed by the rows of H that have zeros in all positions indexed by S, restricted to the positions outside S.
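For the binary case, the objects just defined, together with the rank condition used in the proofs that follow (separation of a set S is established there by showing that H^(S) has rank n − k − |S|), can be checked numerically with a small sketch; the helper names are illustrative.

def gf2_rank(rows):
    """Rank over GF(2) of a list of 0/1 rows (Gaussian elimination)."""
    m = [r[:] for r in rows]
    rank = 0
    for c in range(len(m[0]) if m else 0):
        p = next((i for i in range(rank, len(m)) if m[i][c]), None)
        if p is None:
            continue
        m[rank], m[p] = m[p], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][c]:
                m[i] = [(a + b) % 2 for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def H_S(H, S):
    """H^(S): rows of H vanishing on S, restricted to the remaining positions."""
    S = set(S)
    kept = [j for j in range(len(H[0])) if j not in S]
    return [[row[j] for j in kept] for row in H
            if all(row[j] == 0 for j in S)]

def separates(H, S, n, k):
    """True if H^(S) has the rank n - k - |S| required of a parity-check
    matrix of the code punctured on S (binary case)."""
    return gf2_rank(H_S(H, S)) == n - k - len(S)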

This proves that S is not a stopping set.

By the Singleton bound, see, e.g., [5], Lemma 3 and Theorem 1 are applicable to all codes and all values of l except if l = d − 1 and the code is MDS, i.e., d = n − k + 1. The following lemma addresses this case.

Lemma 4: Let C be an [n, k, d] MDS linear code over GF(q), i.e., d = n − k + 1. Then, any parity-check matrix, H, of C separates all sets of size d − 1. In particular, any (d − 2)-separating parity-check matrix of C is (d − 1)-separating.

Proof: Since the dual code of an MDS code is MDS [5], the dual of C is an [n, n − k, k + 1] code with Hamming distance k + 1. In particular, every nonzero codeword of the dual code, and hence every nonzero row of H, has weight at least k + 1. If S is a set of size d − 1 = n − k, then no nonzero row of H has zeros in all positions indexed by S, and the rank of H^(S) is zero. From Lemma 2, it follows that H separates S.

Let C be an [n, k, d] linear code over GF(q) with a full-rank parity-check matrix H. In the next two results, we construct parity-check matrices for C which are l-separating for any given l, 0 ≤ l ≤ d − 1. In both constructions, we assume that 0 ≤ l ≤ d − 2, since the excluded case can be treated based on Lemma 4. From Lemma 3, it suffices to show that the constructed parity-check matrices separate all sets of size l to conclude that they are l-separating.

In the first construction, let S_1, S_2, ..., S_P, where P = C(n, l) is the binomial coefficient, be the distinct subsets of {1, 2, ..., n} of size l. For each i = 1, 2, ..., P, the submatrix H(·, S_i) has rank l, as l ≤ d − 1. By elementary row operations on H, we can obtain an (n − k) × n matrix H_i, for each i, of rank n − k, such that its last n − k − l rows have zeros in the positions indexed by S_i. Let H_l be the matrix whose set of rows is the union of the sets of the last n − k − l rows of H_i for i = 1, 2, ..., P. H_l has at most (n − k − l) C(n, l) rows.

Theorem 2: The matrix H_l is an l-separating parity-check matrix for the code C. If l = d − 2, then H_l is also (d − 1)-separating.

Proof: In the proof, we use H' to denote H_l. First, we will show that H' is indeed a parity-check matrix of C. Since every row of H' is in the row space of the parity-check matrix H, the null space of H' contains C and its rank is at most n − k. We have to show that its rank is not less than n − k. Since H has rank n − k, there is a set T of size n − k such that H(·, T) has rank n − k. Then, for each j ∈ T, there is a vector in the row space of H that has a nonzero entry in position j and zeros in all other positions in T. These vectors are clearly linearly independent, and each one belongs to the row space spanned by the last n − k − l rows of H_i for some i such that S_i ⊆ T \ {j}; such an index exists since l ≤ d − 2 ≤ n − k − 1. We conclude that the row space of H' contains n − k linearly independent vectors and, therefore, its rank is at least n − k.

Next, we show that H' separates the set S_i for each i = 1, 2, ..., P. Notice that the rows of H' having zeros in all positions indexed by S_i include the last n − k − l rows of H_i, which are linearly independent. In particular, the rank of this set of rows and, consequently, the rank of H'^(S_i), are at least equal to n − k − l. The rank of H'^(S_i) cannot be larger because of Lemma 1. We conclude that H'^(S_i) has rank n − k − l and, from Lemma 2, H' separates S_i. This is true for all i = 1, 2, ..., P, which proves that H' separates all sets of size l. The statement regarding (d − 1)-separation follows from Lemma 4.
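A sketch of one way to realize the first construction over GF(2) follows; the names are illustrative, and no attempt is made to avoid the binomial factor in the number of subsets.

from itertools import combinations

def rows_vanishing_on(H, S):
    """Rows of the row space of H (over GF(2)) that vanish on S, obtained by
    eliminating the columns indexed by S first (n-k-|S| rows when |S| < d)."""
    S = sorted(S)
    rest = [j for j in range(len(H[0])) if j not in S]
    order = S + rest
    m = [[row[j] for j in order] for row in H]
    rank = 0
    for c in range(len(order)):
        p = next((i for i in range(rank, len(m)) if m[i][c]), None)
        if p is None:
            continue
        m[rank], m[p] = m[p], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][c]:
                m[i] = [(a + b) % 2 for a, b in zip(m[i], m[rank])]
        rank += 1
    # rows whose leading one lies beyond the S-block have zeros on S;
    # undo the column reordering before returning them
    undo = sorted(range(len(order)), key=lambda t: order[t])
    return [[row[t] for t in undo] for row in m[:rank]
            if all(row[c] == 0 for c in range(len(S)))]

def first_construction(H, l):
    """Union of the vanishing rows over all subsets S of size l
    (rows kept as tuples to remove duplicates)."""
    n = len(H[0])
    rows = set()
    for S in combinations(range(n), l):
        rows.update(tuple(r) for r in rows_vanishing_on(H, S))
    return [list(r) for r in sorted(rows)]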







In the second construction, let A be a matrix over GF(q) whose rows are all the nonzero normalized vectors of length n − k and weight at most l + 1, where 0 ≤ l ≤ d − 2. Define H* = A H. Clearly, H* has

    Σ_{i=1}^{l+1} C(n − k, i) (q − 1)^(i−1)

rows.

Theorem 3: The matrix H* is an l-separating parity-check matrix for the code C. If l = d − 2, then H* is also (d − 1)-separating.

Proof: In the proof, we use H* to denote A H. Since H is a submatrix of H* and all the rows of H* are linear combinations of the rows of H, H* is a parity-check matrix for C. The theorem clearly holds if l = 0. So, we assume in the following that l ≥ 1. First, we will show that H* separates any subset S of {1, 2, ..., n} of size l. Let B_0 be the set of all rows h of H such that h(S) = 0. For each nonzero normalized vector u of length l over GF(q), let B_u be the set of all rows h of H such that h(S) is a nonzero multiple of u. Let U be the set of nonzero normalized vectors u such that B_u is nonempty. Clearly,

    n − k = |B_0| + Σ_{u ∈ U} |B_u|.    (7)

For each u ∈ U, pick a row r_u in B_u. Since A contains all normalized vectors of weight 1 as rows, H* contains, as rows, all the vectors in B_0, and each of them has zeros in all positions indexed by S. Now let u ∈ U and suppose that h is a row in B_u other than r_u. Since A contains all normalized vectors of weight 2 as rows, H* contains, as a row, a nonzero multiple of the vector r_u + γ h for each nonzero element γ of GF(q). Clearly, for some γ, one of these vectors has a restriction to S equal to zero, and H* thus contains, among its rows having zeros in all positions indexed by S, a nonzero multiple of it. Hence, B_u contributes |B_u| − 1 further such rows of H*, each a nonzero multiple of r_u + γ h, where h ∈ B_u \ {r_u} and γ is a nonzero element of GF(q). So far we have counted |B_0| + Σ_{u ∈ U} (|B_u| − 1) rows of H* having zeros in all positions indexed by S.

Since C has minimum distance d > l, the submatrix H(·, S) has rank l. In particular, there are l rows of H whose restrictions to S form a basis for the l-dimensional vector space of vectors of length l over GF(q); let u_1, u_2, ..., u_l be the corresponding elements of U, and choose r_{u_1}, r_{u_2}, ..., r_{u_l} to be these rows. Hence, |U| ≥ l. If |U| > l, then every row r_u, where u ∈ U \ {u_1, u_2, ..., u_l}, can be added to a linear combination of r_{u_1}, r_{u_2}, ..., r_{u_l} such that the resulting vector has a restriction to S equal to zero. As A contains as rows all nonzero normalized vectors of weight up to l + 1, a nonzero multiple of this vector belongs to H*. There are |U| − l such vectors. Thus, from (7), we have already counted

    |B_0| + Σ_{u ∈ U} (|B_u| − 1) + |U| − l = n − k − l



rows of H* having zeros in all positions indexed by S. These rows are linearly independent, since they can be obtained by elementary row operations on the rows of H, which has full rank. Together with Lemma 1, this proves that the matrix formed by these rows and, consequently, H*^(S), have rank n − k − l. Therefore, from Lemma 2, H* separates S. Since S is an arbitrary set of size l, H* is an l-separating parity-check matrix of C. The statement regarding (d − 1)-separation follows from Lemma 4.
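The second construction is even simpler to sketch over GF(2), where every nonzero vector is already normalized; the names below are illustrative.

from itertools import combinations

def second_construction(H, l):
    """Rows a*H over GF(2) for all nonzero coefficient vectors a of
    length n - k and Hamming weight at most l + 1."""
    r, n = len(H), len(H[0])
    out = []
    for w in range(1, l + 2):
        for support in combinations(range(r), w):
            # a*H with a supported on `support` is the mod-2 sum of those rows
            out.append([sum(H[i][j] for i in support) % 2 for j in range(n)])
    return out

For the [8, 4, 4] extended Hamming code (n − k = 4) and l = 1, this gives C(4, 1) + C(4, 2) = 10 rows, compared with the upper bound of (4 − 1) C(8, 1) = 24 rows for the first construction.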

IV. SEPARATING REDUNDANCY

Let C be an [n, k, d] linear code over GF(q). Then, from Theorems 2 and 3, C has a parity-check matrix which is l-separating for each l, 0 ≤ l ≤ d − 1. We define the l-th separating redundancy, s_l(C), of C to be the minimum number of rows of an l-separating parity-check matrix of C. The following results are corollaries to Theorems 2 and 3, respectively.

Corollary 1: Let C be an [n, k, d] linear code over GF(q). Then, for each nonnegative integer l,

    s_l(C) ≤ (n − k − l) C(n, l)              if 0 ≤ l ≤ d − 2,
    s_l(C) ≤ (n − k − d + 2) C(n, d − 2)      if l = d − 1.
Corollary 2: Let C be an [n, k, d] linear code over GF(q). Then, for each nonnegative integer l,

    s_l(C) ≤ Σ_{i=1}^{l+1} C(n − k, i) (q − 1)^(i−1)      if 0 ≤ l ≤ d − 2,
    s_l(C) ≤ Σ_{i=1}^{d−1} C(n − k, i) (q − 1)^(i−1)      if l = d − 1.
Next, we present a lower bound on s_l(C).

Theorem 4: Let C be an [n, k, d] linear code over GF(q) and let d⊥ denote the Hamming distance of its dual code. Then, for each nonnegative integer l, 0 ≤ l ≤ d − 1,

    s_l(C) ≥ (n − k − l) C(n, l) / C(n − d⊥, l).
Proof: Suppose that H is an l-separating parity-check matrix of C. Then H separates all subsets S of {1, 2, ..., n} of size l. Consider the collection of submatrices H^(S) for all C(n, l) subsets S of size l. The number of distinct nonzero codewords of the dual code that appear as rows of H in these submatrices is a lower bound on the number of rows of H. Each submatrix H^(S) has rank n − k − l and, in particular, has at least n − k − l rows. Each row of H that gives rise to a row of H^(S) has zeros in the positions indexed by S. If such a row is a nonzero codeword of the dual code of weight w, then it appears in at most C(n − w, l) such submatrices. This number is at most C(n − d⊥, l), as w ≥ d⊥. There are C(n, l) submatrices, each with at least n − k − l rows. Therefore, the total number of distinct rows of H is at least (n − k − l) C(n, l) / C(n − d⊥, l), as given by the lower bound in the statement of the theorem.
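The bounds of Corollaries 1 and 2 and of Theorem 4 are easy to evaluate numerically; the following sketch (binary case, illustrative names, lower bound rounded up since the number of rows is an integer) does so for the parameters of the extended Hamming codes considered in Example 3 below.

from math import comb

def corollary1_upper(n, k, l, d):
    ll = min(l, d - 2)                    # the (d-2)-separating matrix also
    return (n - k - ll) * comb(n, ll)     # covers l = d - 1

def corollary2_upper(n, k, l, d, q=2):
    ll = min(l, d - 2)
    return sum(comb(n - k, i) * (q - 1) ** (i - 1) for i in range(1, ll + 2))

def theorem4_lower(n, k, l, d_perp):
    return -(-((n - k - l) * comb(n, l)) // comb(n - d_perp, l))  # ceiling

# Extended Hamming code of length 2^m: n = 2^m, k = 2^m - m - 1, d = 4,
# dual distance d_perp = 2^(m-1).
m = 4
n, k, d, d_perp = 2 ** m, 2 ** m - m - 1, 4, 2 ** (m - 1)
print(theorem4_lower(n, k, 1, d_perp),    # 2m = 8
      corollary2_upper(n, k, 1, d),       # C(m+1, 1) + C(m+1, 2) = 15
      corollary1_upper(n, k, 1, d))       # m * 2^m = 64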





Example 3: The [2^m, 2^m − m − 1, 4] binary extended Hamming code is a linear code over GF(2) with a parity-check matrix of the form

    H = ( M )
        ( 1 )    (8)
where M is an m × 2^m binary matrix whose columns are the 2^m distinct binary vectors of length m and 1 is the all-ones row vector of length 2^m. This code is also known as the Reed-Muller code RM(m − 2, m). The dual code of RM(m − 2, m) is the Reed-Muller code RM(1, m), which is a [2^m, m + 1, 2^(m−1)] binary linear code. We will show that, for the [2^m, 2^m − m − 1, 4] binary extended Hamming code, s_1(C) = 2m. Indeed, since d⊥ = 2^(m−1), we have s_1(C) ≥ m 2^m / 2^(m−1) = 2m from Theorem 4. To prove equality, we give an explicit construction, for the extended Hamming code, of a parity-check matrix which is 1-separating. Let M^c be the complement of the matrix M, i.e., the m × 2^m binary matrix whose (i, j) entry is 1 + M_{i,j}, i = 1, 2, ..., m, j = 1, 2, ..., 2^m. Define

    H' = ( M   )
         ( M^c )    (9)

First, we will show that H' is a parity-check matrix of the extended Hamming code. Since the i-th row of M added to the i-th row of M^c is the all-ones vector 1, it follows that the matrices in (8) and (9) have the same row spaces, and consequently the same null spaces. This proves that H' is a parity-check matrix of the extended Hamming code. It remains to show that H' is 1-separating. From Lemma 2, it suffices to prove that H'^({j}) has rank m for any given position j. Let R be the set of indices i for which M_{i,j} = 0. Then, the rows of H' having a zero in position j consist of all rows of M whose indices belong to R and all rows of M^c whose indices do not belong to R. These latter rows are precisely the rows of M whose indices do not belong to R, added to the vector 1. From the fact that the matrix in (8) has rank m + 1, it follows that these m rows, which are the rows of M with the all-ones vector perhaps added to some of them, are linearly independent. Hence, the matrix they form has rank m. Since H'^({j}) is obtained by deleting an all-zero column from this matrix, it also has rank m. As H' has 2m rows, we conclude that s_1(C) = 2m for the binary extended Hamming code.
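For a small case, the construction of Example 3 can be verified numerically. The sketch below, for m = 3, builds M, the matrix in (8), and the matrix in (9) as described above, and checks both that (8) and (9) have the same row space and that (9) satisfies the rank condition for 1-separation; the GF(2) rank helper is the same as in the earlier sketches.

m = 3
n = 2 ** m
M = [[(j >> i) & 1 for j in range(n)] for i in range(m)]
ones = [1] * n
H8 = M + [ones]                                   # matrix (8)
H9 = M + [[1 - x for x in row] for row in M]      # matrix (9)

def gf2_rank(rows):
    mat, rank = [r[:] for r in rows], 0
    for c in range(len(mat[0])):
        p = next((i for i in range(rank, len(mat)) if mat[i][c]), None)
        if p is None:
            continue
        mat[rank], mat[p] = mat[p], mat[rank]
        for i in range(len(mat)):
            if i != rank and mat[i][c]:
                mat[i] = [(a + b) % 2 for a, b in zip(mat[i], mat[rank])]
        rank += 1
    return rank

# same row space: ranks of (8), (9), and their union all equal m + 1
assert gf2_rank(H8) == gf2_rank(H9) == gf2_rank(H8 + H9) == m + 1
# 1-separating: for each position j, the rows of (9) with a zero at j,
# restricted to the other positions, have rank m = n - k - 1
for j in range(n):
    sub = [[row[t] for t in range(n) if t != j] for row in H9 if row[j] == 0]
    assert gf2_rank(sub) == m
print("checks passed for m =", m)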

V. CONCLUSION

In this paper, we introduced the concept of separating parity-check matrices for decoding over channels causing errors and erasures. We presented two constructions of such matrices and a lower bound on the minimum number of rows in separating matrices. Potential future work includes the determination of the minimum number of rows in such matrices for classes of codes of practical interest.

ACKNOWLEDGMENT

The authors were supported by the NSF through grant CCF-0727478 and by STW under McAT project DTC.6438.

REFERENCES

[1] C. Di, D. Proietti, I. E. Telatar, T. J. Richardson, and R. L. Urbanke, "Finite-length analysis of low-density parity-check codes on the binary erasure channel," IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1570-1579, June 2002.
[2] J. Han and P. H. Siegel, "Improved upper bounds on stopping redundancy," IEEE Trans. Inform. Theory, vol. 53, no. 1, pp. 90-104, January 2007.
[3] H. D. L. Hollmann and L. M. G. M. Tolhuizen, "On parity check collections for iterative erasure decoding that correct all correctable erasure patterns of a given size," IEEE Trans. Inform. Theory, vol. 53, no. 2, pp. 823-828, February 2007.
[4] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, "Efficient erasure correcting codes," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 569-584, February 2001.
[5] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1977.
[6] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes. Cambridge, MA: MIT Press, 1972.
[7] R. M. Roth, Introduction to Coding Theory. Cambridge, UK: Cambridge University Press, 2006.
[8] M. Schwartz and A. Vardy, "On the stopping distance and the stopping redundancy of codes," IEEE Trans. Inform. Theory, vol. 52, no. 3, pp. 922-932, March 2006.
[9] S. A. Vanstone and P. C. van Oorschot, An Introduction to Error Correcting Codes with Applications. Norwell, MA: Kluwer, 1989.
[10] J. H. Weber and K. A. S. Abdel-Ghaffar, "Results on parity-check matrices with optimal stopping and/or dead-end set enumerators," IEEE Trans. Inform. Theory, vol. 54, no. 3, pp. 1368-1374, March 2008.