Soft-Decision Decoding of Reed-Solomon Codes Using Successive Error-and-Erasure Decoding

Soo-Woong Lee and B. V. K. Vijaya Kumar
Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected], [email protected]

Abstract—We propose a soft-decision decoding algorithm for Reed-Solomon (RS) codes based on successive error-and-erasure decoding. Extensive simulations are conducted to show the performance gain of the proposed method. We derive a formula for performance estimation based on ordered statistics of symbol reliability, which matches the simulation results well. With almost the same average complexity as a conventional hard-decision decoder, the proposed method outperforms the Koetter-Vardy (KV) algorithm and the Chase2-GMD algorithm (CGA).
I. INTRODUCTION

Reed-Solomon (RS) codes are among the most popular codes in current digital communication and data storage systems. There is no doubt that RS codes are very good codes at the symbol level. RS codes are maximum distance separable (MDS) codes and have fast decoding algorithms that correct error symbols up to half the minimum distance. These two factors give RS codes practically strong error correction capability for burst errors. Since the advent of RS codes, there has been much research on extending the correction bound. The early approaches ran conventional hard-decision algebraic decoding iteratively on modified received sequences and selected the most probable codeword out of the list of candidate codewords [1]–[3]. Recently, entirely new algebraic soft-decision RS decoders based on the polynomial-evaluation definition of RS codes have been proposed [4][5]. However, due to high computational complexity and large memory requirements, these soft-decision decoding methods remain unattractive for practical applications. In this paper, refocusing on iterative (or successive) algebraic decoding, such as generalized minimum distance (GMD) decoding and Chase2 decoding, we propose a new soft-decision RS decoding algorithm. In Section II, we briefly review GMD decoding, Chase2 decoding, and the combined Chase2-GMD algorithm (CGA) for comparison with the proposed algorithm. In Section III, successive error-and-erasure decoding (SED) is first introduced as a hard-decision list decoding algorithm. In Section IV, the new soft-decision decoding algorithm is proposed to reduce the complexity of the hard-decision SED. In Section V, extensive simulations of the proposed algorithm are conducted and its performance is compared with other soft-decision decoding algorithms.
In order to predict the performance at the high signal-to-noise ratios (SNR) of practical interest, a formula for performance estimation is derived in Section VI. In Section VII, using a stopping criterion, it is shown that the average complexity of the proposed method can be almost the same as that of a conventional hard-decision decoder.

II. GMD AND CHASE2 DECODING

Generally, an (n, k, dmin) RS code is defined as a q-ary BCH code of length (q − 1) over GF(q), where n is the length of a codeword, k is the length of a message, and dmin is the minimum distance. In most applications dmin is an odd number, so dmin is assumed to be odd in this paper for simplicity. A conventional hard-decision RS decoder, such as the Berlekamp-Massey or Euclidean algorithm, can correct up to any t symbol errors, where t = (dmin − 1)/2. If Nera symbols are erased, the error-and-erasure decoder, with almost the same algorithm, can correct Nerr errors in unerased locations in addition to the Nera erasures on the condition that

2Nerr + Nera < dmin.    (1)
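As a quick illustration (a minimal sketch; the function name and parameters are ours, not the paper's), condition (1) can be checked directly:

```python
# Sketch: check the error-and-erasure decodability condition (1),
# 2*N_err + N_era < d_min, for an (n, k, d_min) RS code.
# Names are illustrative, not from the paper.

def decodable(n_err: int, n_era: int, d_min: int) -> bool:
    """True if n_err errors plus n_era erasures are jointly correctable."""
    return 2 * n_err + n_era < d_min

# (255, 239, 17) RS code, t = 8: eight errors with no erasures,
# or four errors plus eight erasures, are both correctable.
assert decodable(8, 0, 17)
assert decodable(4, 8, 17)
assert not decodable(8, 2, 17)
```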
The GMD decoder repeats error-and-erasure decoding while erasing the least reliable symbols (LRS) successively: at the ith stage, 2i LRSs are erased, where i = 0, 1, · · · , t [1]. The Chase2 decoder runs error-only decoding on modified received sequences whose t LRSs are replaced by all possible (i.e., q^t) combinations [2]. The GMD decoder is simple but provides only moderate performance gain. On the other hand, Chase2 decoding exhibits much better performance than GMD, but at much higher complexity. As a compromise between GMD and Chase2, the CGA was introduced to provide a performance-complexity tradeoff [3]. A CGA(η) decoder induces q^η error patterns by flipping η LRSs and executes GMD-like decoding for each error pattern. Erasure patterns include 0 to (2t − 2η) erasures in increments of 2, erasing the (η + 1)th to the (2t − η)th LRSs. The number of iterations of error-and-erasure decoding of CGA(η) is q^η(t + 1 − η). CGA(0) is GMD decoding and CGA(t) is Chase2 decoding. With increasing η, performance improves, but complexity grows exponentially.

III. SUCCESSIVE ERROR-AND-ERASURE DECODING AS A HARD-DECISION LIST DECODING: HSED(f)

Before we discuss soft-decision SED, hard-decision SED (HSED) is introduced first. An HSED(f) repeats error-and-erasure decoding with every possible combination of f erasures out of n symbols, where f is an even number less than
978-1-4244-2324-8/08/$25.00 © 2008 IEEE. This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE "GLOBECOM" 2008 proceedings.
[Fig. 1. Performance of SED as a hard-decision list decoding: CER versus Eb/N0 (dB) for f = 0 (conventional HDD) and f = 2, 4, ..., 16.]
[Fig. 2. Extension of correction bound: correction rate (τ/n) versus code rate (k/n) for the conventional HDD, the GS decoder with infinite multiplicity, HSED(t), and HSED(2t).]
or equal to 2t. HSED(f) can correct up to any (t + f/2) symbol errors, making it a hard-decision bounded-distance list decoder. The codeword error rate (CER) of HSED(f) can be easily calculated as the tail of the binomial distribution from (t + f/2 + 1) to n with symbol error probability ps, i.e.,

P_HSED(f) = Σ_{i=t+f/2+1}^{n} C(n, i) ps^i (1 − ps)^(n−i),    (2)

where C(n, i) denotes the binomial coefficient.
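Eq. (2) is a plain binomial tail and can be evaluated directly; a minimal sketch (function name ours):

```python
# Sketch: CER of HSED(f) from Eq. (2) -- the binomial tail from
# t + f/2 + 1 to n with symbol error probability ps.
from math import comb

def cer_hsed(n: int, t: int, f: int, ps: float) -> float:
    return sum(comb(n, i) * ps**i * (1 - ps)**(n - i)
               for i in range(t + f // 2 + 1, n + 1))

# (255, 239, 17) RS code, t = 8: increasing f lowers the CER.
ps = 0.01
assert cer_hsed(255, 8, 2, ps) < cer_hsed(255, 8, 0, ps)
```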
The complexity, defined as the number of iterations of error-and-erasure decoding, is given by C(n, f). Because the syndrome calculation is required only once, the complexity of one iteration is similar to that of an error-only decoder. Fig. 1 shows the CER of HSED compared to the conventional hard-decision decoder (HDD) for a (255,239,17) RS code over GF(2^8) on an AWGN channel with BPSK modulation. The CER decreases exponentially with increasing f. Fig. 2 compares the correction bounds of HSED(t) and HSED(2t) with those of the conventional hard-decision decoder and the Guruswami-Sudan (GS) list decoder. The asymptotic correction bound of the GS decoder with infinite multiplicity is known to be (n − √(nk)) [4]. While the GS algorithm does not extend the correction bound when the code rate is practically high, HSED extends the correction bound by f/2 regardless of code rate. For example, consider the (255,239) RS code with a correction bound of 8. The asymptotic correction bound of the GS decoder is still 8, but that of HSED(8) is 12, and HSED(16) doubles the conventional correction bound.

IV. SUCCESSIVE ERROR-AND-ERASURE DECODING USING SOFT INFORMATION: SED(l, f)

In the previous section, it was shown that HSED provides considerable performance gain, but its complexity increases almost exponentially with the extended correction bound. Now we try to reduce the complexity by using soft information. The basic idea is to constrain the erasure positions according to the soft information. The new soft-decision decoding algorithm is defined with two parameters, l and f. A SED(l, f) repeats error-and-erasure decoding with every possible combination of an even number of erasures, less than or equal to f, within the l LRSs. Accordingly, l
is a positive integer smaller than or equal to n, and f is zero or a positive even integer smaller than or equal to the minimum of l and 2t. The final codeword is selected out of the list of candidate codewords using soft information, as in other list decoding algorithms. The complexity of SED(l, f) is given by

C_SED(l,f) = Σ_{i=0, i even}^{f} C(l, i).

For example, consider a (7,3,5) RS code over GF(2^3). Let the reliability vector of a received sequence be (0.45 0.5 0.55 0.6 0.7 0.8 0.9), where a smaller value means lower reliability. The GMD decoder repeats error-and-erasure decoding with the erasure patterns (0000000), (xx00000), and (xxxx000). SED(3,2) repeats error-and-erasure decoding with the erasure patterns (0000000), (xx00000), (x0x0000), and (0xx0000). Consider some special cases of SED(l, f). SED(l, 0) is the conventional hard-decision decoding, with a complexity of 1 regardless of l. SED(n, f) corresponds to the HSED(f) introduced in Section III. For fixed f, the performance of SED(n, f) is the best among the SED(l, f)s, with the highest complexity. That is, HSED(f) provides the lower bound on CER and the upper bound on complexity for fixed f. Among the SED(n, f)s, SED(n, 2t), i.e., HSED(2t), has the best performance and the highest complexity of all possible SED(l, f)s.

V. SIMULATION

In order to investigate the performance of the proposed method, all possible SED(l, f)s are simulated for the (255,239) RS code over AWGN with BPSK. The parameter f is swept from 2 to 16 in increments of 2, and l runs from f to 255 for each f. The symbol reliability over GF(2^m) is given by

r = Π_{i=1}^{m} [1 + e^(−2|yi|/σb^2)]^(−1),    (3)
where yi is the ith bit in a received symbol and σb^2 = N0/2. It is assumed that if the list of candidate codewords includes the original codeword, the decoding is counted as a success without the final selection step. Because the performance of SED is still far from that of maximum likelihood decoding (MLD), this assumption is acceptable without considerable loss of accuracy. The result of a SED(l, f) for a corrupted codeword can be easily determined without actual iterations of error-and-erasure
decoding. Let i be the total number of error symbols in the received sequence. The number of error symbols among the l LRSs is denoted by Nl, and the number of error symbols among the (n − l) most reliable symbols (MRS) by Nn−l. Then, by (1), the list of candidate codewords produced by SED(l, f) includes the original codeword if the following condition is satisfied:

f + 2(i − f) < dmin,        if Nl > f,
Nl + 2Nn−l < dmin,          if Nl ≤ f and Nl is even,
Nl + 1 + 2Nn−l < dmin,      otherwise.    (4)
min(f, Nl) + 2(i − min(f, Nl)) < dmin.    (5)
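Condition (5) is easy to apply programmatically. The following sketch (names ours) decides decodability from the total number of symbol errors i and the number Nl of them among the l LRSs, assuming f is even and dmin is odd as in the text:

```python
# Sketch: decide, via condition (5), whether SED(l, f)'s candidate
# list contains the original codeword, given i total symbol errors
# of which n_l fall within the l LRSs. Names are illustrative.

def sed_list_contains_codeword(i: int, n_l: int, f: int, d_min: int) -> bool:
    e = min(f, n_l)                  # erasures that can cover errors
    return e + 2 * (i - e) < d_min   # condition (5)

# (255, 239, 17) RS code, t = 8, f = 16: sixteen errors are
# recoverable only if all of them fall inside the erasable window.
assert sed_list_contains_codeword(i=16, n_l=16, f=16, d_min=17)
assert not sed_list_contains_codeword(i=16, n_l=14, f=16, d_min=17)
```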
For fixed f, the CERs of the SED(l, f)s converge to that of SED(n, f) with increasing l, as shown in Fig. 3, and the convergence becomes faster with increasing SNR. It is also seen that when several SED(l, f)s have similar complexity, the SED with the largest f has almost the best performance. So the SED with the largest f can represent the SEDs of a given complexity. From now on, SED(l) means SED(l, f) where f is min(l, 2t) if l is even and min(l − 1, 2t) otherwise. For example, SED(1) is SED(1,0), SED(8) is SED(8,8), and SED(28) is SED(28,16). Fig. 4 presents the CER-versus-complexity graph of the SED(l)s together with other algorithms at Eb/N0 = 6.5 dB. SED(l) outperforms CGA at the same complexity and has much smaller complexity than CGA at the same performance. Furthermore, SED(l) achieves a very low minimum CER which the Koetter-Vardy (KV) algorithm and CGA cannot reach at the same SNR. The performance of several SED(l)s is compared with that of other algorithms in Fig. 5. SED(28) outperforms CGA(3) with similar complexity. SED(116) outperforms Chase2 with similar complexity and almost reaches the maximum performance of SED(255). Even SED(28) outperforms Chase2 and the KV algorithm with infinite multiplicity. With a much smaller complexity of 2^t, the bit-flipping Chase2 decoder gives almost the same performance as the symbol-flipping Chase2 decoder on the AWGN channel for high-rate RS codes, as shown in Fig. 5, because the number of
With the assumption that dmin is odd, these three separate conditions can be generalized as (5).
[Fig. 4. Performance versus complexity of SED(1∼255) at Eb/N0 = 6.5 dB, compared with the conventional HDD, KV(inf), CGA(0∼8), and HSED(0∼16).]
[Fig. 3. Simulation results of all possible SED(l, f)s for the (255, 239) RS code at Eb/N0 = 6.5 dB over the AWGN channel with BPSK modulation: CER versus complexity.]
Fig. 5. Simulation results for the (255, 239) RS code over the AWGN channel (conventional HDD, KV(inf), CGA(0) = GMD, CGA(3), CGA(8) = Chase2, bit-flipping Chase2, SED(28), SED(116), and SED(255)).
error bits per error symbol is mostly one. In real applications such as magnetic recording channels, however, errors are not independent of each other, so multi-bit error symbols are dominant [6]. This is one of the reasons why RS codes are preferred in real applications despite their relatively limited random-error correction capability. Multiple error bits per erroneous symbol help soft-decision decoding algorithms based on symbol reliability order, such as the proposed method and symbol-flipping Chase2, because the symbol reliabilities of erroneous and correct symbols become more distinguishable. On the other hand, increasing the number of error bits per error symbol degrades the performance of the bit-flipping Chase2 decoder significantly. Even worse, soft-decision algorithms based on bit reliability are difficult to concatenate with run-length-limited (RLL) codes. A simple multi-bit error channel is built by adding an extra error bit next to each spontaneous error bit caused by AWGN. In such a channel, SED(9) outperforms the bit-flipping Chase2 decoder at the same complexity of 256, as shown in Fig. 6.

VI. PERFORMANCE ESTIMATION

In the previous section, the simulation results showed that the proposed method provides considerable performance gain for CERs above 10^−8. However, most applications require very low CERs, below 10^−10, which would take a prohibitively long time to reach by simulation. Therefore, analytical performance estimation is required for real applications.
0
0
10
10 Conv.HDD Chase2 b−Chase2 SED(9) SED(16) HSED(16)
−1
10
−2
10
conv.−sim conv.−est Chase2−sim Chase2−est SED(16)−sim SED(16)−est SED(28)−sim SED(28)−est SED(116)−sim SED(116)−est SED(255)−sim SED(255)−est
−5
10
−3
CER
CER
10
−10
10
−4
10
−5
10
−15
10
−6
10
−7
10
−20
5
5.5
6
6.5 7 E /N (dB) b
7.5
8
10
8.5
5
6
7
8
9
10
E /N (dB)
0
b
Fig. 6. Simulation for the (255, 239) RS code over a double-bit error channel.
Fig. 7.
0
Performance estimation for the (255, 239) RS code over AWGN.
The performance of the proposed algorithm over the AWGN channel with BPSK modulation can be estimated using ordered statistics of symbol reliability. First, the decodability condition in (5) for SED(l, f) can be interpreted as

i ≤ t + f/2  and  Nn−l < dmin − i.    (6)

Then, the expression for the CER can be formulated as

P_SED(l,f) = Σ_{i=t+1}^{t+f/2} P(i errors) · P(at least (dmin − i) errors in the (n − l) MRSs) + P(more than t + f/2 errors).    (7)

The bit error probability pb and the symbol error probability ps are computed as Q(1/σb) and 1 − (1 − pb)^m, respectively, where Q(y) = ∫_y^∞ f(x)dx with f(x) the probability density function (PDF) of the standard normal distribution. Let βj(i) be the jth largest reliability out of i erroneous symbols and γh(n − i) the hth largest reliability out of (n − i) correct symbols. Then (7) can be reformulated as

P_SED(l,f) = Σ_{i=t+1}^{t+f/2} C(n, i) ps^i (1 − ps)^(n−i) · P(β_{dmin−i}(i) ≥ γ_{n−l−(dmin−i)+1}(n − i)) + Σ_{i=t+f/2+1}^{n} C(n, i) ps^i (1 − ps)^(n−i).    (8)

The solution of this kind of ordered-statistics problem is well described in [7] and [8] for binary block codes. However, it is not easy to obtain closed-form PDFs of the symbol reliability in (3) for non-binary codes. So we assume that a monotonically increasing function can transform the symbol reliabilities of an erroneous symbol and a correct symbol so that the resulting PDFs, fαe and fαc, follow normal distributions with means −1 and 1, respectively, and a common variance σs^2, conditioned on positive values:

fαe(x) = f((x + 1)/σs) / [σs Q(1/σs)] · u(x),    (9)
fαc(x) = f((x − 1)/σs) / [σs (1 − Q(1/σs))] · u(x),    (10)

where u(x) is the step function. As a fitting factor, the standard deviation is given by

σs = 1/Qinv(ps),    (11)

where Qinv is the inverse function of Q(x). The validity of this assumption will be supported only by the accuracy of the resulting estimates, without analytical proof, in this paper. Now, all derivations follow in the same way as in [7] and [8]. The corresponding cumulative distribution functions (CDF) are

Fαe(x) = [Q(1/σs) − Q((x + 1)/σs)] / Q(1/σs) · u(x),    (12)
Fαc(x) = [1 − Q(1/σs) − Q((x − 1)/σs)] / [1 − Q(1/σs)] · u(x).    (13)

From ordered-statistics theory, the PDFs of βj(i) and γh(n − i) are computed as

fβj(i)(x) = i!/[(j − 1)!(i − j)!] · [1 − Fαe(x)]^(j−1) fαe(x) [Fαe(x)]^(i−j),    (14)
fγh(n−i)(x) = (n − i)!/[(h − 1)!(n − i − h)!] · [1 − Fαc(x)]^(h−1) fαc(x) [Fαc(x)]^(n−i−h).    (15)

Finally, the probability that βj(i) is greater than or equal to γh(n − i) can be computed as the double integral of fβj(i) and fγh(n−i):

P(βj(i) ≥ γh(n − i)) = ∫_0^∞ fγh(n−i)(x) [ ∫_x^∞ fβj(i)(y) dy ] dx.    (16)
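Under the truncated-Gaussian reliability model of (9)–(13), Eq. (16) can be evaluated numerically. The sketch below (grid size, integration range, and function names are illustrative choices, not from the paper) approximates the double integral with simple rectangular sums:

```python
# Numerical sketch of Eqs. (9)-(16): evaluate P(beta_j(i) >= gamma_h(n-i))
# by crude numerical integration of the ordered-statistics densities.
from math import erfc, exp, sqrt, pi, factorial

def Q(x):                       # Gaussian tail function
    return 0.5 * erfc(x / sqrt(2.0))

def phi(x):                     # standard normal PDF
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def make_stats(sigma_s):
    qe = Q(1.0 / sigma_s)
    f_e = lambda x: phi((x + 1) / sigma_s) / (sigma_s * qe)          # (9)
    f_c = lambda x: phi((x - 1) / sigma_s) / (sigma_s * (1 - qe))    # (10)
    F_e = lambda x: (qe - Q((x + 1) / sigma_s)) / qe                 # (12)
    F_c = lambda x: (1 - qe - Q((x - 1) / sigma_s)) / (1 - qe)       # (13)
    return f_e, F_e, f_c, F_c

def p_beta_ge_gamma(j, i, h, n_i, sigma_s, xmax=6.0, steps=200):
    f_e, F_e, f_c, F_c = make_stats(sigma_s)
    ce = factorial(i) // (factorial(j - 1) * factorial(i - j))
    cc = factorial(n_i) // (factorial(h - 1) * factorial(n_i - h))
    fb = lambda x: ce * (1 - F_e(x))**(j - 1) * f_e(x) * F_e(x)**(i - j)    # (14)
    fg = lambda x: cc * (1 - F_c(x))**(h - 1) * f_c(x) * F_c(x)**(n_i - h)  # (15)
    dx = xmax / steps
    total = 0.0
    for k in range(steps + 1):                       # outer integral of (16)
        x = k * dx
        inner = sum(fb(x + m * dx) * dx for m in range(steps + 1))
        total += fg(x) * inner * dx
    return total
```

With a small σs the erroneous and correct reliability distributions barely overlap, so the probability is near zero; a larger σs (lower SNR) increases it.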
Fig. 7 shows that the results of the estimation match those of the simulation very precisely. It is remarkable that the CERs of the SEDs decrease faster than that of Chase2 and approach that of HSED(2t) with increasing SNR.

VII. AVERAGE COMPLEXITY

The average complexity can be further reduced by using a stopping criterion. The set of erasure patterns of SED(l + 1) completely includes that of SED(l). Accordingly, if SED(l) succeeds in finding the original codeword, so does SED(l + 1). Consider applying SED(i), from i = 1 to l, one after another to implement SED(l). If SED(i) fails, SED(i + 1)
Algorithm 1 Recursive implementation of SED(l)
  do error-only decoding
  check stopping criterion
  for i = 2 : l do
    f = min(l, 2t) if l is even, min(l − 1, 2t) otherwise
    {Ej} = all erasure patterns with an odd number of erasures, fewer than f, among the (i − 1) LRSs, plus an erasure at the ith LRS
    for j = 1 : |{Ej}| do
      do error-and-erasure decoding with erasure pattern Ej
      check stopping criterion
    end for
  end for

In order to compute an upper bound on the average complexity, it is assumed that if SED(i) fails to find the original codeword but SED(i + 1) succeeds, SED(i + 1) does so with its last erasure pattern, which is the worst case:

Cavg_SED(l) ≤ Σ_{i=1}^{l} C_SED(i) {P_SED(i−1) − P_SED(i)} + C_SED(l) P_SED(l),    (17)
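As an illustration of Algorithm 1's incremental schedule (a sketch with illustrative names, not the authors' implementation), the following generates each erasure pattern of SED(l) exactly once, indexing the LRSs from 0: every pattern added at stage i contains the ith LRS plus an odd number of the first (i − 1) LRSs, so the total pattern size stays even and at most f.

```python
# Sketch of Algorithm 1's pattern schedule: at stage i, the new erasure
# patterns erase the ith LRS plus an odd number (< f) of the first
# (i-1) LRSs, so every pattern of SED(l) is generated exactly once.
from itertools import combinations

def sed_patterns(l, t):
    f = min(l, 2 * t) if l % 2 == 0 else min(l - 1, 2 * t)
    yield frozenset()                        # stage 1: error-only decoding
    for i in range(2, l + 1):
        for odd in range(1, f, 2):           # odd counts below f
            for subset in combinations(range(i - 1), odd):
                yield frozenset(subset) | {i - 1}

# SED(3,2): patterns {}, {0,1}, {0,2}, {1,2} -- matching the Section IV
# example, with complexity C(3,0) + C(3,2) = 4.
pats = list(sed_patterns(3, 8))
assert len(pats) == 4 and len(set(pats)) == 4
```

The pattern count for SED(9) with t = 8 comes out to 256, matching the Cmax entry for SED(9) in Table I.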
where P_SED(0) = 1. The average complexity remains only slightly greater than one until C_SED reaches around 1/P_SED. For a given l, the average complexity decreases with increasing SNR, as shown in Fig. 8, because the CER becomes lower. Table I summarizes the performance-complexity tradeoff and the feasibility of the proposed method. Consider a target CER of 10^−15, and assume the throughput of a conventional hard-decision decoder is 1 Gbps. The proposed algorithm provides up to 1.5 dB gain over the conventional decoder, depending on the choice of l. With a complexity of 256, SED(9) offers 0.8 dB gain, which is the asymptotic gain of the KV algorithm with infinite multiplicity. All of the SEDs in the table have an average complexity only slightly greater than one. The worst-case complexity Cmax would prevent some SEDs with high gain from being practical in real applications. However, the worst-case latency Lmax can be shortened at the expense of power and area of the hardware. The throughput of the error-and-erasure decoder can be much higher than 1 Gbps with current CMOS technology, and a parallel structure of several conventional decoders can shorten the latency further.

TABLE I
PERFORMANCE GAIN AND AVERAGE COMPLEXITY AT CER = 10^−15

            SED(9)    SED(13)   SED(21)   SED(47)    Chase2
gain (dB)   0.8       1.0       1.25      1.5        1.1
Cavg        1.00···   1.00···   1.00···   1.07       -
Cmax        256       4096      ∼10^6     ∼10^12     ∼10^19
Lmax        0.5 ms    8.4 ms    2.1 s     45 days    10^6 years
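The bound (17) can be evaluated directly. In the sketch below, the CER sequence is a made-up illustration (not measured data) and the names are ours:

```python
# Sketch of the average-complexity bound (17): Cavg_SED(l) <=
# sum_i C_SED(i) * (P_SED(i-1) - P_SED(i)) + C_SED(l) * P_SED(l).
from math import comb

def c_sed(l, t):
    # complexity of SED(l): number of even-size erasure patterns up to f
    f = min(l, 2 * t) if l % 2 == 0 else min(l - 1, 2 * t)
    return sum(comb(l, i) for i in range(0, f + 1, 2))

def cavg_bound(cers, t):
    # cers[i] = P_SED(i) for i = 0..l, with cers[0] = P_SED(0) = 1
    l = len(cers) - 1
    return sum(c_sed(i, t) * (cers[i - 1] - cers[i])
               for i in range(1, l + 1)) + c_sed(l, t) * cers[l]

# With rapidly decreasing CERs, the average stays close to one even
# though the worst-case complexity c_sed(9, 8) is 256.
cers = [1.0] + [10.0**(-2 * i) for i in range(1, 10)]
assert c_sed(9, 8) == 256
assert 1.0 <= cavg_bound(cers, 8) < 2.0
```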
continues the iterations of error-and-erasure decoding with only the additional erasure patterns. The decoding stops when a resulting codeword meets the stopping criterion. In this paper, a specific stopping criterion is not discussed; an ideal stopping criterion is simply assumed to exist. Alg. 1 shows this recursive decoding process of SED(l).
Fig. 8. Average complexity for the (255,239) RS code over AWGN with BPSK: C_SED and Cavg_SED for SED(1∼255) at Eb/N0 = 6, 7, and 8 dB.
For example, if the throughput of a conventional decoder is 10 Gbps and ten conventional decoders are used in parallel, the latency becomes a hundred times smaller.

VIII. CONCLUSION

A soft-decision decoding algorithm for RS codes using iterative algebraic decoding was proposed. As a subset of the proposed algorithm, a hard-decision list decoding that extends the correction bound even for high-rate codes was also introduced. The simulation results show that the proposed algorithm exhibits considerable performance gain over conventional hard-decision decoding and outperforms other soft-decision decoding algorithms such as GMD, Chase2, and KV. The performance estimation using ordered statistics of symbol reliability coincides with the simulation results and makes it possible to predict the performance at practically high SNR. The proposed method achieves almost the same average throughput as the conventional decoder and provides a much better performance-complexity tradeoff than CGA. Being entirely symbol-based, the new method also provides robust error correction in burst-error channels.

REFERENCES
[1] G. D. Forney, "Generalized minimum distance decoding," IEEE Trans. on Information Theory, vol. 12, no. 2, pp. 125–131, 1966.
[2] D. Chase, "A class of algorithms for decoding block codes with channel measurement information," IEEE Trans. on Information Theory, vol. IT-18, pp. 170–182, 1972.
[3] H. Tang, Y. Liu, M. Fossorier, and S. Lin, "On combining Chase2 and GMD decoding algorithms for nonbinary block codes," IEEE Communications Letters, vol. 5, no. 5, pp. 209–211, 2001.
[4] V. Guruswami and M. Sudan, "Improved decoding of Reed-Solomon and algebraic-geometry codes," IEEE Trans. on Information Theory, vol. 45, pp. 1757–1767, 1999.
[5] R. Koetter and A. Vardy, "Algebraic soft-decision decoding of Reed-Solomon codes," IEEE Trans. on Information Theory, vol. 49, no. 11, pp. 2809–2825, 2003.
[6] S. Jeon and B. V. K. Vijaya Kumar, "Error event analysis of partial response targets for perpendicular magnetic recording," Proc. IEEE Globecom, Washington, DC, Nov. 2007, pp. 277–282.
[7] D. Agrawal and A. Vardy, "Generalized minimum distance decoding in Euclidean space: Performance analysis," IEEE Trans. on Information Theory, vol. 46, pp. 60–83, 2000.
[8] M. P. C. Fossorier and S. Lin, "Error performance analysis for reliability-based decoding algorithms," IEEE Trans. on Information Theory, vol. 48, pp. 287–293, 2002.