Concatenated Permutation Block Codes for Correcting ... - IEEE Xplore

Concatenated Permutation Block Codes for Correcting Single Transposition Errors

Reolyn Heymann∗, Jos H. Weber†, Theo G. Swart∗ and Hendrik C. Ferreira∗

∗ University of Johannesburg, Dept. E&E Eng. Science, South Africa. Email: {rheymann, tgswart, hcferreira}@uj.ac.za
† TU Delft, The Netherlands. Email: [email protected]

Abstract—Permutation codes are advantageous due to their favourable symbol diversity properties and are applied in flash memories in combination with rank modulation. Codebooks traditionally consist of permutations with specific distance properties. A class of permutation codes was previously presented in which a codeword consists of a sequence, or concatenation, of permutations rather than a single permutation. These codebooks were constructed to correct substitution or deletion errors. In this paper, permutations are concatenated to form codewords with the goal of detecting and correcting adjacent transposition errors. An outer code, using additional parity permutations, is used to detect erroneous permutations in the codeword. The symbol diversity of permutation codes is preserved and codebooks with higher cardinalities are constructed, which results in better code rates.

I. INTRODUCTION

Permutation codes are used in combination with rank modulation in flash memories [1]. Flash memory consists of cells, a number of which are grouped together to form a block. A cell is programmed by injecting or removing charge. Every cell has a number of discrete levels representing specific symbols. Current flash memories allow a single cell to be programmed (injecting charge), but a single cell cannot be erased (removing charge). If the value of a cell needs to be erased, the entire block needs to be copied to a different location, all the original cells are erased and then the block is copied back except for the cell whose value needs to change. Writing to flash memory is thus more time consuming than reading from it: the injection procedure needs to be controlled very carefully, since an overshoot error (injecting more charge than needed) cannot be fixed by just erasing the value of the single cell and reprogramming it.

In [1], rank modulation was proposed. With rank modulation and permutation codes, it is no longer necessary to have distinct levels representing specific symbols; instead, symbol values are represented relative to each other. If one symbol has a higher value than another, then more charge will be stored in the cell representing the higher value. Overshoot will thus no longer be a problem. Most errors in flash memory are caused by charge leakage, which makes the charge of cells drift down. The number of errors due to leakage is decreased by the use of rank modulation. However, leakage does not necessarily occur at the same rate in every cell and may still cause errors.

978-1-4799-5999-0/14/$31.00 ©2014 IEEE

The use of permutation codes combined with M-ary FSK modulation has also been shown to combat different types of noise present in powerline communication (PLC), especially for narrowband PLC [2] in the CENELEC A band. Every symbol in the permutation code is mapped to a specific frequency. Examples of applications are automatic meter reading and demand side management.

Error-correcting codes for transposition errors have been presented in [3] and [4]. In [3], the codes use the Lee metric as distance measure, and a graph approach is used to construct codebooks capable of correcting adjacent transposition errors. In [4], the Kendall tau distance is used and several upper and lower bounds are presented. Error correction in flash memories using interleaved codes has been proposed in [5]. Symbols are divided into different sets and then interleaved to form codewords. Only even permutations are used. Thus, symbols from one set may only occur in certain positions. These codes use the Ulam metric as distance measure and focus on translocation errors.

In [6], permutations are concatenated to form codewords. A larger subset of the symmetric permutation group, $S_M$, is used than in traditional coding techniques where one permutation is one codeword, which results in better code rates. The focus of that work was on substitution errors and deletion errors. The work in this paper adapts the construction from [6] to correct adjacent transposition errors. $S_M$ will again be divided into subsets; however, the subsets will contain more permutations with larger distances than in [6]. Outer codes are used to detect erroneous subwords. The subsets are then used to correct the transposition errors.

This paper is organized as follows: in the next section, formal definitions and notations are given for the most important concepts used in this paper. This paper follows on the work done in [6], which is summarized in Section III. A code construction is presented in Section IV. This section illustrates how the construction can be used in combination with an outer code to correct adjacent transposition errors. The outer code is generalized in Section V. The construction is compared to previous work in Section VI and the paper is concluded in Section VII.


II. DEFINITIONS AND NOTATIONS

The Hamming distance, d, between two codewords is defined as the number of positions in which the codewords differ. The minimum Hamming distance of a codebook, $d_{\min}$, is defined as the minimum Hamming distance between any two different codewords in the codebook. Since only the Hamming distance will be used in this paper, any reference to distance will refer to Hamming distance. An (n, k) Hamming code will be used as an outer code, where n refers to the length of the codeword and k to the number of information symbols. Specifically, a (7, 4) Hamming code will be used. Let $B_v$ be the set consisting of all binary sequences of length v. Let $S_M$ denote the set of all M! permutations of the integers 1, 2, ..., M. The codebook, C, consists of codewords constructed out of multiple concatenated permutations. The maximum rate obtainable for any permutation code is

$$\frac{\log_2(M!)}{M} \text{ bits/symbol.} \tag{1}$$

If a larger integer precedes a smaller one in a permutation, it is defined as an inversion. For example, the total number of inversions in the permutation 613452 is equal to 8. A permutation can be either even or odd. A permutation is even if the total number of inversions is even and odd if the total number of inversions is odd. The set of all even permutations is denoted as $A_M$, where $|A_M| = M!/2$ and $d_{\min} = 3$ [7].

A substitution error occurs if one symbol is transmitted but another is received. If a transmitted symbol is not received, then a deletion error occurred. A transposition error occurs if the two symbols at positions i and j are swapped. If $|i - j| = 1$, then an adjacent transposition error occurred. For example, if an adjacent transposition error occurs at positions 1 and 2 of the codeword 1423, the resulting codeword will be 4123. A translocation error occurs if a symbol is moved to a different position, causing a shift in the other symbols. A translocation error is equivalent to a number of adjacent transposition errors.

A circular shift occurs if the first symbol of a permutation is moved to the last position and all the other symbols are moved one position to the left. A permutation of M symbols can be shifted M − 1 times before returning to the original permutation. Any two distinct circular shifts of a permutation are at distance M. For example, the circular shifts of the permutation 1234 are 2341, 3412 and 4123.

Let i and j be positive integers and let σ(j) represent the transformation of the decimal representation of j to the binary representation of j. The function σ(j) ∧ σ(i) is the bitwise AND operation of the two binary numbers. For example, let j = 1 and i = 3, then σ(j) ∧ σ(i) = 01.

III. CONCATENATED PERMUTATION BLOCK CODES

In [6], a new class of permutation codes was presented where, instead of considering one permutation as a codeword, codewords consist of a sequence of permutations. These codes are a subset of the more general class of multiset permutations. In a multiset permutation, every symbol may occur a specific number of times. Initially, the set $S_M$ is partitioned into subsets using set partitioning. Every subset contains 4 permutations. The Hamming distance between permutations in the same subset is 4 and between permutations from different subsets is 2. Binary data is mapped to the permutations in the subsets. These permutations will form subwords of the codeword. An additional permutation will be added to the subwords as a parity permutation. The subwords, together with the parity permutation, will form the codeword as illustrated in Figure 1.

[Figure 1: codeword 3421 1324 2413 1342, with a "Parity Permutation" label]
Fig. 1. Concatenation of permutations to form a codeword (M = 4)

Substitution errors are detected by using the permutation property that every symbol can only occur once in every permutation. Thus, subwords containing a specific symbol more than once are identified. If only one such subword occurs in the codeword, the parity permutation is used to correct the substitution error. This construction can also be adapted to detect and correct two substitution errors or a deletion error, as specified in [6]. This adaptation will not be discussed here since it is not needed to follow the construction presented in this paper.

IV. CODE CONSTRUCTION

In [6], the permutation property of every symbol occurring only once was used to detect substitution errors. The parity permutation was only used to correct the already identified erroneous subword. However, if a transposition error occurs in a subword, the resulting subword is still a permutation. Hence, we cannot use the permutation property to detect errors. More parity permutations will thus have to be added to the codeword to assist with the detection and correction of the error. To illustrate the concept, a [7, 4] codeword will be used based on Hamming codes. Thus, every codeword will consist of 4 permutations which have binary bits mapped to them, and 3 parity permutations. An example is shown in Figure 2.

[Figure 2: codeword 1234 4312 4132 1342 1234 1342 1234, with a "Parity Permutations" label]
Fig. 2. Concatenation of permutations to form a [7,4] codeword (M = 4)
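The observation that motivates the extra parity permutations (a substitution breaks the permutation property, while an adjacent transposition does not) can be checked with a minimal Python sketch. The helper names `is_permutation`, `inversions` and `adjacent_transposition` are illustrative and not taken from the paper; the received word 1123 is a made-up substitution example.

```python
def is_permutation(word, M):
    """True if `word` contains each of the symbols 1..M exactly once."""
    return sorted(word) == list(range(1, M + 1))


def inversions(perm):
    """Number of pairs in which a larger symbol precedes a smaller one."""
    return sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )


def adjacent_transposition(perm, j):
    """Swap the symbols at (1-based) positions j and j + 1."""
    out = list(perm)
    out[j - 1], out[j] = out[j], out[j - 1]
    return tuple(out)


sent = (1, 4, 2, 3)
substituted = (1, 1, 2, 3)                    # hypothetical: symbol 4 received as 1
transposed = adjacent_transposition(sent, 1)  # positions 1 and 2 swapped

print(is_permutation(substituted, 4))  # the repeated symbol exposes the error
print(is_permutation(transposed, 4))   # still a valid permutation
print(inversions(sent) % 2, inversions(transposed) % 2)  # parity before and after
```

The last line also illustrates that a single adjacent transposition changes the number of inversions by exactly one, so the parity of the permutation flips.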

First, the set $S_M$ is divided into subsets. For M symbols, $|S_M| = M!$, which is divided into (M − 1)! subsets, each containing M permutations with a Hamming distance of M. A permutation together with its M − 1 circular shifts forms a subset. The maximum length, v, of the binary sequences which will be mapped to the permutations is

$$v = \lfloor \log_2 M! \rfloor. \tag{2}$$

The number of permutations of $S_M$ which will be used is thus $2^v$. The minimum number of subsets which is needed, L, is

$$L = \left\lceil \frac{2^v}{M} \right\rceil. \tag{3}$$

If $2^v \bmod M = 0$, then all these subsets contain M permutations; otherwise the last subset is incomplete and contains only $2^v \bmod M$ permutations.

For example, let M = 5: binary sequences of length 6 will be mapped to the permutations. Only $2^6 = 64$ permutations are needed. Thus, of the 120/5 = 24 subsets, only $L = \lceil 64/5 \rceil = 13$ will be used. The thirteenth subset, $R_{12}$, will only contain 4 permutations. The possible subsets are:

$R_0$ = {12345, 23451, 34512, 45123, 51234},
$R_1$ = {12354, 23541, 35412, 54123, 41235},
$R_2$ = {12435, 24351, 43512, 35124, 51243},
$R_3$ = {12453, 24531, 45312, 53124, 31245},
$R_4$ = {12534, 25341, 53412, 34125, 41253},
$R_5$ = {12543, 25431, 54312, 43125, 31254},
$R_6$ = {13245, 32451, 24513, 45132, 51324},
$R_7$ = {13254, 32541, 25413, 54132, 41325},
$R_8$ = {13425, 34251, 42513, 25134, 51342},
$R_9$ = {13452, 34521, 45213, 52134, 21345},
$R_{10}$ = {13524, 35241, 52413, 24135, 41352},
$R_{11}$ = {13542, 35421, 54213, 42135, 21354},
$R_{12}$ = {14235, 42351, 23514, 35142}.

The fact that $R_{12}$ only contains 4 permutations will not affect the coding scheme. For all permutation sequences $r \in R_i$, define ψ(r) = i. Let $R = R_0 \cup R_1 \cup \ldots \cup R_{L-1}$. Let φ be a one-to-one mapping from the set $B_v$ to R.

Encoding: A source generates a sequence u of (4 × v) bits, which is partitioned into 4 sequences $u_1, u_2, u_3, u_4$, all from $B_v$. Let $x_l = \phi(u_l)$ for all l and let

$$c_1 \equiv -(\psi(x_1) + \psi(x_2) + \psi(x_3)) \bmod L, \tag{4}$$
$$c_2 \equiv -(\psi(x_1) + \psi(x_3) + \psi(x_4)) \bmod L, \tag{5}$$
$$c_3 \equiv -(\psi(x_1) + \psi(x_2) + \psi(x_4)) \bmod L. \tag{6}$$

Then the encoder output is the code sequence $x = (x_1, x_2, x_3, x_4, x_5, x_6, x_7)$, where the check sequence $x_5$ is taken from $R_{c_1}$, $x_6$ is taken from $R_{c_2}$, and $x_7$ is taken from $R_{c_3}$.

Decoding: Let the received sequence be $y = (y_1, y_2, \ldots, y_7)$, where each $y_i$ is a sequence of M symbols from {1, 2, ..., M}. The decoding procedure consists of the following steps:

1) If at least one $y_l$ is not in R, then go to Step 2. Else calculate the syndrome values:

$$s_1 \equiv (\psi(y_1) + \psi(y_2) + \psi(y_3) + \psi(y_5)) \bmod L, \tag{7}$$
$$s_2 \equiv (\psi(y_1) + \psi(y_3) + \psi(y_4) + \psi(y_6)) \bmod L, \tag{8}$$
$$s_3 \equiv (\psi(y_1) + \psi(y_2) + \psi(y_4) + \psi(y_7)) \bmod L. \tag{9}$$

If $s_1 = 0$, $s_2 = 0$ and $s_3 = 0$, then set $x' = y$ and go to Step 5, else go to Step 3.
2) If there exists an e such that $y_e$ is not in R and $y_l$ is in R for all $l \neq e$, then go to Step 4. Else, go to Step 6.
3) Identify the erroneous subword, $y_e$, by using Table I. If e = 5, 6, 7, the error occurred in the parity permutations; set $x' = y$ and go to Step 5. Else, go to the next step.
4) For j = 1, 2, ..., (M − 1), set $x^j$ as y, with the symbols at positions j and (j + 1) swapped in $y_e$. Calculate $s_1$, $s_2$ and $s_3$ (a candidate whose subword is not in R is skipped). If $s_1 = 0$, $s_2 = 0$ and $s_3 = 0$, then set $x' = x^j$ for this j and go to Step 5. If no j gives all-zero syndromes, go to Step 6.
5) Set the decoder output as $u' = (u'_1, u'_2, u'_3, u'_4)$, where $u'_l = \phi^{-1}(x'_l)$, and STOP.
6) Detect that more than one error occurred and STOP.

TABLE I
ERROR VALUES

s1     s2     s3     e
= 0    = 0    ≠ 0    7
= 0    ≠ 0    = 0    6
= 0    ≠ 0    ≠ 0    4
≠ 0    = 0    = 0    5
≠ 0    = 0    ≠ 0    2
≠ 0    ≠ 0    = 0    3
≠ 0    ≠ 0    ≠ 0    1
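The set partitioning and the (7, 4) encoding rule in (4) to (6) can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: subset leaders are taken to be the permutations starting with symbol 1 in lexicographic order (which reproduces the listing for M = 5), φ is a hypothetical table look-up by integer value rather than the algorithm of [6], and the first permutation of $R_c$ is used as the parity subword. The names `build_subsets` and `encode_7_4` are illustrative.

```python
from itertools import permutations
from math import factorial


def build_subsets(M):
    """Return (v, L, subsets); subsets[i] lists the permutations of R_i."""
    v = factorial(M).bit_length() - 1   # floor(log2(M!)), equation (2)
    L = -(-(2 ** v) // M)               # ceil(2^v / M), equation (3)
    subsets = []
    remaining = 2 ** v                  # only 2^v permutations are used
    # Assumption: leaders start with symbol 1, in lexicographic order.
    for tail in permutations(range(2, M + 1)):
        if remaining == 0:
            break
        leader = (1,) + tail
        shifts = [leader[s:] + leader[:s] for s in range(M)]
        subsets.append(shifts[:remaining])  # the last subset may be incomplete
        remaining -= len(subsets[-1])
    return v, L, subsets


def encode_7_4(bits, M):
    """Encode a string of 4*v bits into the seven subwords x1..x7."""
    v, L, subsets = build_subsets(M)
    if len(bits) != 4 * v:
        raise ValueError("expected 4*v information bits")
    flat = [p for subset in subsets for p in subset]
    psi = {p: i for i, subset in enumerate(subsets) for p in subset}
    # Hypothetical phi: look up each v-bit block by its integer value.
    x = [flat[int(bits[l * v:(l + 1) * v], 2)] for l in range(4)]
    i1, i2, i3, i4 = (psi[p] for p in x)
    c1 = -(i1 + i2 + i3) % L            # equation (4)
    c2 = -(i1 + i3 + i4) % L            # equation (5)
    c3 = -(i1 + i2 + i4) % L            # equation (6)
    # Assumption: the first permutation of R_c is used as parity subword.
    return x + [subsets[c][0] for c in (c1, c2, c3)]


v, L, subsets = build_subsets(5)
print(v, L, len(subsets[-1]))
codeword = encode_7_4("0000100101110001", 4)
print(["".join(map(str, p)) for p in codeword])
```

Because any permutation of $R_c$ is a valid parity subword, the choice of `subsets[c][0]` is only one possibility.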

Remarks: The code rate is given by:

$$R = \frac{4 \lfloor \log_2 M! \rfloor}{7M} \text{ bits/symbol.} \tag{10}$$

Additionally, $\lfloor \log_2 M \rfloor$ additional information bits can be encoded into the choice of each of $x_5$, $x_6$ and $x_7$. The code rate is then improved to:

$$R = \frac{4 \lfloor \log_2 M! \rfloor + 3 \lfloor \log_2 M \rfloor}{7M} \text{ bits/symbol.} \tag{11}$$

The distance between permutations in a subset is M. For higher values of M, the increased distance can be used to correct more errors in an erroneous subword, rather than just one transposition error. Step 4 in the decoding algorithm can be adjusted to also correct multiple substitution errors or transposition errors. Even translocation errors of limited length can be corrected. The number of errors (or length of transposition errors) that can be corrected will depend on M. However, the focus of this paper is on single adjacent transposition errors, and other errors will be considered in future work.

Mapping: The map φ used in the construction could be implemented through a table look-up. In [6], an algorithm is presented to map the binary data to the permutations.
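As a worked instance of the rate expressions (10) and (11), the case M = 4 can be evaluated by hand and compared with the maximum rate in (1). The subscripted labels on R below are added only to tell the three values apart.

```latex
\begin{align*}
M = 4:\quad & \lfloor \log_2 4! \rfloor = \lfloor \log_2 24 \rfloor = 4,
  \qquad \lfloor \log_2 4 \rfloor = 2, \\
R_{(10)} &= \frac{4 \cdot 4}{7 \cdot 4} = \frac{16}{28} \approx 0.571 \text{ bits/symbol}, \\
R_{(11)} &= \frac{4 \cdot 4 + 3 \cdot 2}{7 \cdot 4} = \frac{22}{28} \approx 0.786 \text{ bits/symbol}, \\
R_{(1)} &= \frac{\log_2 24}{4} \approx 1.146 \text{ bits/symbol}.
\end{align*}
```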


Example: Let M = 4 and the information sequence be u = (0000, 1001, 0111, 0001). The following subsets will be used:

$R_0$ = {1234, 2341, 3412, 4123},
$R_1$ = {1243, 2431, 4312, 3124},
$R_2$ = {1324, 3241, 2413, 4132},
$R_3$ = {1342, 3421, 4213, 2134}.

The algorithm in [6] is used to map the binary data to the permutations. Then the encoded sequence is x = (1234, 4132, 2431, 4123, 1243, 1342, 1324). Five cases are considered for the received sequence y and the corresponding decoding results are given.

• If y = x (no errors), then we go from Step 1 to Step 5 and the decoding result is $u' = u$.
• If y = (1234, 1432, 2431, 4123, 1243, 1342, 1324), i.e., the symbol at position 5 was swapped with the symbol at position 6, then we find in Step 1 that $y_2$ is not in R. Thus, in Step 2 we find that e = 2 and proceed to Step 4. For $x^1_2 = 4132$, we find $s_1 = 0$, $s_2 = 0$, $s_3 = 0$. For $x^2_2 = 1342$, we find $s_1 \neq 0$ and $s_3 \neq 0$, and $x^3_2 = 1423$ is not in R. In conclusion, $x' = x^1$ = (1234, 4132, 2431, 4123, 1243, 1342, 1324) and Step 5 gives $u' = u$. Hence, the error has been corrected.
• If y = (1234, 1432, 4231, 4123, 1243, 1342, 1324), i.e., the symbols at positions 5 and 6 were swapped as well as the symbols at positions 9 and 10, then we find that $y_2$ and $y_3$ are not in R, and thus we go from Step 1 via Step 2 to Step 6, and the decoding result is the detection of (at least) two errors.
• If y = (1234, 4123, 2431, 4213, 1243, 1342, 1324), i.e., the symbols at positions 7 and 8 are swapped as well as the symbols at positions 14 and 15, then we find in Step 1 that all subwords are in R, but that $s_1 \neq 0$, $s_2 \neq 0$, $s_3 \neq 0$, and thus in Step 3 we find that e = 1 and proceed to Step 4. For $x^1_1 = 2134$, we find $s_1 \neq 0$ and $s_2 \neq 0$. For $x^2_1 = 1324$, we find $s_2 \neq 0$ and $s_3 \neq 0$. For $x^3_1 = 1243$, we find $s_1 \neq 0$ and $s_3 \neq 0$. In conclusion, we end up in Step 6, and the decoding result is the detection of (at least) two errors.
• If y = (1234, 4132, 2431, 4132, 1243, 1342, 1324), i.e., the symbols at positions 15 and 16 are swapped, then we find in Step 1 that all subwords are in R, but that $s_2 \neq 0$ and $s_3 \neq 0$. In Step 3 we find that e = 4 and proceed to Step 4. $x^1_4 = 1432$, which is not in R. For $x^2_4 = 4312$, we find $s_2 \neq 0$ and $s_3 \neq 0$. For $x^3_4 = 4123$, we find $s_1 = 0$, $s_2 = 0$, $s_3 = 0$. Thus, $x' = x^3$ = (1234, 4132, 2431, 4123, 1243, 1342, 1324) and Step 5 gives $u' = u$.
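The five cases above can be replayed with a minimal Python sketch of decoding Steps 1 to 6 for M = 4. It is a sketch under assumptions: the subsets and the encoded sequence are copied from the example, the names `SUBSETS`, `CHECKS`, `TABLE_I`, `syndromes` and `decode` are illustrative, and the function stops at the corrected sequence x' (returning `None` for Step 6) because the inverse map of the algorithm in [6] is not reproduced here.

```python
SUBSETS = [
    ["1234", "2341", "3412", "4123"],  # R0
    ["1243", "2431", "4312", "3124"],  # R1
    ["1324", "3241", "2413", "4132"],  # R2
    ["1342", "3421", "4213", "2134"],  # R3
]
L = len(SUBSETS)
PSI = {p: i for i, subset in enumerate(SUBSETS) for p in subset}
# 0-based subword indices entering s1, s2, s3 in equations (7)-(9).
CHECKS = [(0, 1, 2, 4), (0, 2, 3, 5), (0, 1, 3, 6)]
# Table I: pattern of non-zero syndromes -> 0-based index of the bad subword.
TABLE_I = {(0, 0, 1): 6, (0, 1, 0): 5, (0, 1, 1): 3, (1, 0, 0): 4,
           (1, 0, 1): 1, (1, 1, 0): 2, (1, 1, 1): 0}


def syndromes(y):
    """Syndrome values (s1, s2, s3); every subword of y must be in R."""
    return tuple(sum(PSI[y[i]] for i in check) % L for check in CHECKS)


def decode(y):
    """Return the corrected sequence x', or None when Step 6 is reached."""
    y = list(y)
    outside = [i for i, w in enumerate(y) if w not in PSI]
    if not outside:                                   # Step 1
        s = syndromes(y)
        if s == (0, 0, 0):
            return y
        e = TABLE_I[tuple(int(val != 0) for val in s)]  # Step 3
        if e >= 4:                                    # error in a parity subword
            return y
    elif len(outside) == 1:                           # Step 2
        e = outside[0]
    else:
        return None                                   # Step 6
    w = y[e]
    for j in range(len(w) - 1):                       # Step 4
        swapped = w[:j] + w[j + 1] + w[j] + w[j + 2:]
        cand = y[:e] + [swapped] + y[e + 1:]
        if swapped in PSI and syndromes(cand) == (0, 0, 0):
            return cand
    return None                                       # Step 6


x = ["1234", "4132", "2431", "4123", "1243", "1342", "1324"]
cases = [
    x,
    ["1234", "1432", "2431", "4123", "1243", "1342", "1324"],
    ["1234", "1432", "4231", "4123", "1243", "1342", "1324"],
    ["1234", "4123", "2431", "4213", "1243", "1342", "1324"],
    ["1234", "4132", "2431", "4132", "1243", "1342", "1324"],
]
for y in cases:
    result = decode(y)
    print("detected more than one error" if result is None else result == x)
```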

V. GENERALIZING THE OUTER HAMMING CODE

In the previous section, a (7, 4) Hamming code was used to illustrate the code construction. However, any (n, k) Hamming code can be used. Using a Hamming code with a better code rate will also improve the code rate of the construction in the previous section. On the other hand, the outer code is still only a single-error-correcting code. So even if more subwords are used per codeword, only one erroneous subword per codeword can be identified and corrected.

When generalizing the Hamming code, the parity permutations are placed between the other permutations, and not at the end of the codeword as in the previous section. The set $S_M$ is partitioned as in the previous section. The mapping from the binary subwords to the permutation subwords is also identical. An (n, k) Hamming code is used as outer code, thus the codeword consists of n permutation subwords of which k contain information. Let m be the number of parity subwords, then $n = 2^m - 1$ and $k = n - m$.

Encoding: A source generates a sequence u of (k × v) bits, which is partitioned into k sequences $u_1, u_2, \ldots, u_k$, all from $B_v$. Let $z_j = \phi(u_j)$.

1) The parity permutations are inserted into the sequence x at the positions i where i is a power of 2. First, the permutations which are not parity permutations are inserted into x. Let j = 1 and i = 1. If i is a power of 2, then i = i + 1, else $x_i = z_j$, j = j + 1 and i = i + 1. Repeat while i ≤ n.
2) Next, the subset number of each parity permutation is calculated. Let $p = (p_1, p_2, \ldots, p_m)$ be the sequence of parity permutations, with $x_{2^{j-1}} = p_j$ for j = 1, 2, ..., m. For each $p_j$, start with $c_j = 0$ and, for i from 1 to n: if $2^{j-1} \neq i$ and $\sigma(2^{j-1}) \wedge \sigma(i) \neq 0$, then

$$c_j \equiv (c_j - \psi(x_i)) \bmod L. \tag{12}$$

The parity permutation $p_j$ is then taken from $R_{c_j}$.

The encoder output is the code sequence $x = (x_1, x_2, \ldots, x_n)$.

Decoding: Let the received sequence be $y = (y_1, y_2, \ldots, y_n)$, where each $y_i$ is a sequence of M symbols from {1, 2, ..., M}. The decoding procedure consists of the following steps:

1) If at least one $y_l$ is not in R, then go to Step 2. Else, for each j = 1, 2, ..., m, start with $s_j = 0$ and, for i from 1 to n: if $\sigma(2^{j-1}) \wedge \sigma(i) \neq 0$, then

$$s_j \equiv (s_j + \psi(y_i)) \bmod L. \tag{13}$$

If $s_j \neq 0$, then set $s_j = 1$. If $s_j = 0$ for all values of j, then set $x' = y$ and go to Step 5, else go to Step 3.
2) If there exists an e such that $y_e$ is not in R and $y_l$ is in R for all $l \neq e$, then go to Step 4. Else, go to Step 6.
3) A syndrome s can now be constructed where $s_1$ is the least significant bit and $s_m$ the most significant bit. The decimal value of s is the position e of the erroneous subword $y_e$. If e is a power of 2, the error occurred in a parity permutation; set $x' = y$ and go to Step 5. Else, go to the next step.
4) For j = 1, 2, ..., (M − 1), set $x^j$ as y, with the symbols at positions j and (j + 1) swapped in $y_e$. Calculate s. If s = 0, then set $x' = x^j$ for this j and go to Step 5. If no j gives s = 0, go to Step 6.
5) Remove the parity permutations from $x'$ and set the decoder output as $u' = (u'_1, u'_2, \ldots, u'_k)$, where $u'_j = \phi^{-1}(x'_j)$, and STOP.
6) Detect that more than one error occurred and STOP.

Remarks: The code rate is given by:

$$R = \frac{k \lfloor \log_2 M! \rfloor}{nM} \text{ bits/symbol.} \tag{14}$$

Additionally, $\lfloor \log_2 M \rfloor$ additional information bits can be encoded into the choice of each parity permutation. The code rate is then improved to:

$$R = \frac{k \lfloor \log_2 M! \rfloor + m \lfloor \log_2 M \rfloor}{nM} \text{ bits/symbol.} \tag{15}$$

Example: Let M = 4 and the information sequence be u = (0000, 1001, 0111, 0001, 0000, 1001, 0111, 0001, 1111, 1010, 1001). The same subsets will be used as in the previous example. Let m = 4, n = 15 and k = 11. The algorithm in [6] is used to map the binary data to the permutations. Calculating the sets for the parity permutations gives $c_1 = 0$, $c_2 = 0$, $c_3 = 2$ and $c_4 = 2$. Then the encoded sequence is x = (1234, 1234, 1234, 1324, 4132, 2431, 4123, 1324, 1234, 4132, 2431, 4123, 3421, 2413, 4132). Only one case is considered since it is similar to the scenarios in the previous example. If y = (1234, 1234, 1243, 1324, 4132, 2431, 4123, 1324, 1234, 4132, 2431, 4123, 3421, 2413, 4132), i.e., the symbols at positions 11 and 12 are swapped, then we find in Step 1 that all subwords are in R, but that $s_1 \neq 0$ and $s_2 \neq 0$, and thus in Step 3 we find that the syndrome is 0011 and e = 3. We proceed to Step 4. Both $x^1_3 = 2143$ and $x^2_3 = 1423$ are not in R. For $x^3_3 = 1234$, the syndrome value is s = 0 and this is thus the correct permutation.

VI. COMPARISON TO PREVIOUS WORK

More parity permutations are added compared to the construction in [6], to be able to correct adjacent transposition errors. Figure 3 shows the code rates for different constructions, different values of M and different outer codes (equation 14 is used). The line marked “Concatenated” is the construction presented in [6] with a value of K = 4, thus the same number of information permutations as in the (7, 4) Hamming code case. For every K = 4 information permutations, one parity permutation is added. However, it is important to note that the constructions in Figure 3 have different properties with regard to their error-correction capabilities. Codes with the optimal code rate, $\log_2(M!)/M$, do not have any error-correction capabilities. The concatenated code with K = 4 can correct one substitution error in every K + 1 = 5 permutations or detect two substitution errors.
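The rates of the Hamming-based constructions compared in Figure 3 follow from equation (14), with equation (15) as the variant that also uses the choice of parity permutations. A minimal Python sketch, with the illustrative names `rate_hamming` and `rate_unconstrained`, evaluates them for the (7, 4), (15, 11) and (63, 57) outer codes (m = 3, 4, 6). The rate of the “Concatenated” construction of [6] is not reproduced, since its expression is not given here.

```python
from math import factorial, log2


def floor_log2(n):
    """floor(log2(n)) for a positive integer n, computed exactly."""
    return n.bit_length() - 1


def rate_hamming(M, m, extra_bits=False):
    """Code rate in bits/symbol for an outer Hamming code with m parity subwords."""
    n = 2 ** m - 1
    k = n - m
    bits = k * floor_log2(factorial(M))      # numerator of equation (14)
    if extra_bits:
        bits += m * floor_log2(M)            # extra term of equation (15)
    return bits / (n * M)


def rate_unconstrained(M):
    """Maximum rate log2(M!)/M of any permutation code, equation (1)."""
    return log2(factorial(M)) / M


print("M   max    (7,4)  (15,11) (63,57)")
for M in (4, 8, 12, 16, 20):
    row = [rate_unconstrained(M)] + [rate_hamming(M, m) for m in (3, 4, 6)]
    print(f"{M:<3d} " + "  ".join(f"{r:.3f}" for r in row))
```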
The codes presented in this paper can correct a single adjacent transposition error per codeword. Other work in the literature can correct errors in every permutation. The concatenated coding scheme is thus suited for environments with low error probabilities which benefit from higher code rates.

[Figure 3: code rate (vertical axis) versus M (horizontal axis, M = 4 to 20) for log2(M!)/M, Concatenated, (7, 4) codes, (15, 11) codes and (63, 57) codes]
Fig. 3. Code rates for different codes

VII. CONCLUSION

The construction in [6] is adapted to correct single adjacent transposition errors. A number of permutations are concatenated and parity permutations are added to form a codeword. A construction is presented with an outer code based on (7, 4) Hamming codes. The outer code can identify which permutation is erroneous. The permutation property in combination with the set construction is then used to correct the error. This construction is then generalized for any (n, k) Hamming code. These constructions make use of a larger subset of $S_M$, thus leading to higher cardinalities and higher code rates. The code rates can be further increased by using Hamming codes with higher code rates as outer codes. For future work, different outer codes with higher error-correction capabilities or better code rates will be considered. The construction can then be adapted to correct translocation errors, as well as multiple substitution errors.


REFERENCES

[1] A. Jiang, R. Mateescu, M. Schwartz and J. Bruck, “Rank modulation for flash memories,” IEEE Trans. Inform. Theory, vol. 55, no. 6, pp. 2659–2673, May 2009.
[2] A. J. H. Vinck, “Coded modulation for powerline communications,” Proc. Int. J. Elec. Commun., vol. 54, no. 1, pp. 45–49, 2000.
[3] A. Jiang, M. Schwartz and J. Bruck, “Error-correcting codes for rank modulation,” Proc. ISIT, Toronto, Canada, pp. 1736–1740, July 2008.
[4] A. Barg and A. Mazumdar, “Codes in permutations and error correction for rank modulation,” Proc. ISIT, Austin, Texas, USA, pp. 854–858, June 2010.
[5] F. Farnoud, V. Skachek and O. Milenkovic, “Error-correction in flash memories via codes in the Ulam metric,” IEEE Trans. Inform. Theory, vol. 59, no. 5, pp. 3003–3020, May 2013.
[6] R. Heymann, J. H. Weber, T. G. Swart and H. C. Ferreira, “Concatenated permutation block codes based on set partitioning for substitution and deletion error-control,” Proc. ITW, Sevilla, Spain, pp. 1–5, Sep. 2013.
[7] M. Deza and S. A. Vanstone, “Bounds on permutation arrays,” Journal of Statistical Planning and Inference, vol. 2, no. 2, pp. 197–209, 1978.