International Symposium on Information Theory and its Applications, ISITA2006, Seoul, Korea, October 29–November 1, 2006

Analysis of Complexity and Convergence Speed of Sequential Schedules for Decoding LDPC Codes

Sunghwan Kim†, Min-Ho Jang†, Jong-Seon No†, Song-Nam Hong‡, and Dong-Joon Shin‡

† School of Electrical Eng. & Com. Sci., Seoul National University, Seoul, Korea
E-mail: {nodoubt, mhjang}@ccl.snu.ac.kr, [email protected]

‡ Division of Electron. & Com. Eng., Hanyang University, Seoul, Korea
E-mail: [email protected], [email protected]

(This work is financially supported by the Ministry of Education and Human Resources Development (MOE), the Ministry of Commerce, Industry and Energy (MOCIE), and the Ministry of Labor (MOLAB) through the fostering project of the Lab of Excellency, by BK21, and by the Ministry of Information and Communication through the ITRC program.)

Abstract

In this paper, a sequential message-passing decoding algorithm for low-density parity-check (LDPC) codes, obtained by partitioning the check nodes, is analyzed. This decoding algorithm shows better bit error rate (BER) performance than the conventional message-passing decoding algorithm, especially for a small number of iterations. The analytical results show that the BER performance improves as the number of partitioned subsets of check nodes increases. We also derive recursive equations for the means of the messages at check and variable nodes by using density evolution with Gaussian approximation. Finally, the analytical results are confirmed by simulation.

1. INTRODUCTION

In 1996, low-density parity-check (LDPC) codes, originally invented by Gallager [1], were rediscovered by MacKay and Neal [2]. Since then, LDPC codes have been a main research topic in the error-control coding area because they show capacity-approaching performance with feasible decoding complexity [3]. Compared with turbo codes, they have lower decoding complexity due to the message-passing decoding based on the sum-product algorithm, but slower decoding convergence. An LDPC code is defined by a very sparse parity-check matrix, which contains mostly 0's and only a few 1's. The sparseness of the parity-check matrix gives low decoding complexity and good decoding performance. LDPC codes can be classified into two classes according to the degrees of the variable and check nodes: if every variable node has degree dv and every check node has degree dc, the code is called a (dv, dc) regular LDPC code; otherwise, it is called an irregular LDPC code.

Recently, there has been a great deal of effort on implementing LDPC decoders. In general, hardware implementations of LDPC decoders use parallel processing. However, if the decoder cannot be implemented in fully parallel mode, a sequential decoding approach has to be taken. An efficient sequential decoding algorithm and its hardware implementation are introduced in [4], where the messages between each variable node and its neighbors are updated sequentially. A shuffled iterative decoding scheme based on partitioning the variable nodes is introduced in [5]; it has the same computational complexity as the iterative decoding based on the flooding schedule and is shown by simulation to converge faster. Similarly, a message-passing decoding algorithm performed sequentially on each variable node is introduced in [6], and its fast convergence is also verified by simulation. In [7] and [8], various new schedules are proposed in which the messages are exchanged at each iteration according to parameters of the Tanner graph [9] such as the girths and the closed walks of the nodes; the schedules are categorized as node based versus edge based, unidirectional versus bidirectional, and deterministic versus probabilistic, and the performance/complexity tradeoff is studied through simulation. In [10] and [11], new serial LDPC decoding algorithms based on partitioning the check nodes are introduced, which are especially suitable for hardware implementation. By simulation, these decoding algorithms are shown to converge faster than the iterative decoding based on the flooding schedule.

The sequential message-passing decoding algorithm can be briefly explained as follows. First, the check nodes are partitioned into p subsets appropriately.
Then, the LDPC code can be described by p interconnected subgraphs, each consisting of one subset of check nodes and the variable nodes connected to it. The decoding is performed by applying the message-passing decoding algorithm to each subgraph sequentially, which constitutes one iteration. Within one iteration, the computational complexity of the sequential decoding algorithm is the same as that of the fully parallel decoding algorithm, yet the sequential decoding converges faster. If a more efficient message update at the variable nodes is used for the fully parallel message-passing decoding, the computational complexity of our decoding may become larger, but this additional complexity is not substantial. One drawback is that the decoding delay per iteration grows with p compared with the fully parallel decoding. However, the sequential algorithm is suitable when fully parallel decoding is not feasible. It is therefore interesting to analyze the fast convergence of the sequential message-passing decoding algorithm by applying density evolution with Gaussian approximation at each iteration.

2. A Sequential Message-Passing Decoding Algorithm by Partitioning Check Nodes

In this section, the conventional message-passing decoding algorithm [3] for (dv, dc) regular LDPC codes is reviewed and a sequential message-passing decoding algorithm by partitioning check nodes is introduced. Let v and u denote messages in log-likelihood ratio (LLR) form from a variable node to a check node and from a check node to a variable node, respectively. Then v is updated by summing all incoming messages as

v = \sum_{i=0}^{d_v - 1} u_i, \qquad (1)

where u0 is the received value from the channel and ui, i = 1, ..., dv − 1, are the incoming messages from the neighbors except for the check node that receives the updated message v. At a check node, the output message u is updated by the 'tanh rule' as

\tanh\frac{u}{2} = \prod_{j=1}^{d_c - 1} \tanh\frac{v_j}{2}, \qquad (2)

where vj, j = 1, ..., dc − 1, are the incoming messages from the neighbors except for the variable node that receives the updated message u.

In the sequential message-passing decoding algorithm, we assume that the check nodes are partitioned into p subsets. The messages from the variable nodes to the check nodes in the first subset are updated, and then the messages from the check nodes in the first subset to their neighbors are updated. This procedure is then applied sequentially to the remaining p − 1 subsets of check nodes. One iteration of the sequential decoding algorithm consists of all the above message updating and passing for all variable nodes and all subsets of check nodes. Thus, the amount of computation in one iteration of the sequential decoding algorithm is the same as that in one iteration of the conventional decoding algorithm if the update rule (1) is performed for each message at a variable node.
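To make the schedule concrete, the following Python sketch gives one possible realization of the sequential decoder described above. It is only an illustration under stated assumptions: the check nodes are split into consecutive blocks (the paper partitions them uniformly at random or bimodally), and the function names, clipping constant, and stopping rule are choices of the sketch, not part of the paper.

```python
import numpy as np

def sequential_decode(H, llr_channel, p, max_iter=50):
    """Sketch of the sequential message-passing decoder with check-node partitioning.

    H           : binary parity-check matrix (m x n numpy array of 0/1)
    llr_channel : channel LLRs u0, one per variable node (length n)
    p           : number of check-node subsets
    """
    m, n = H.shape
    u = np.zeros((m, n))                       # check-to-variable messages
    subsets = np.array_split(np.arange(m), p)  # consecutive blocks, for illustration only

    for _ in range(max_iter):
        for subset in subsets:                 # process subsets S_1, ..., S_p in order
            for c in subset:
                cols = np.flatnonzero(H[c])
                # variable-to-check messages, eq. (1): channel LLR plus all incoming
                # check messages except the one coming from check node c itself
                v = llr_channel[cols] + u[:, cols].sum(axis=0) - u[c, cols]
                # check-node update, eq. (2) ("tanh rule"), excluding each edge in turn
                t = np.tanh(v / 2.0)
                for k, col in enumerate(cols):
                    prod_excl = np.prod(np.delete(t, k))
                    prod_excl = np.clip(prod_excl, -0.999999, 0.999999)  # keep arctanh finite
                    u[c, col] = 2.0 * np.arctanh(prod_excl)
        # a-posteriori LLRs and hard decision after one full iteration
        x_hat = (llr_channel + u.sum(axis=0) < 0).astype(int)
        if not np.any((H @ x_hat) % 2):        # all parity checks satisfied
            break
    return x_hat
```

Because the check-to-variable messages u are updated in place, the variable-to-check messages computed for a later subset automatically use the newest messages from the earlier subsets, which is the behavior analyzed in Section 2.2.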
Figure 1: One iteration of the sequential decoding algorithm for a (2, 4) regular LDPC code with length 8 and p = 2. (a) Message passing for the first check-node subset. (b) Message passing for the second check-node subset.

Fig. 1 shows the sequential decoding procedure for a (2, 4) regular LDPC code of length 8 when p = 2. Circles and squares stand for variable nodes and check nodes, respectively. The messages received from the channel are represented by the arrows at the top of the circles. In Fig. 1(b), the messages from the variable nodes to the check nodes in the second subset are updated by using the messages from the check nodes in the first subset, which were already updated as in Fig. 1(a).

2.1. Analysis by Density Evolution with Gaussian Approximation

Density evolution with Gaussian approximation approximates the probability densities of the messages as Gaussians or Gaussian mixtures [13]. Since it is easier to analyze and computationally faster than full density evolution, it is a useful method for investigating the behavior of the message-passing decoding algorithm. In this section, we consider only (dv, dc) regular LDPC codes; a similar analysis can be carried out for irregular LDPC codes.
Density evolution with Gaussian approximation for the conventional message-passing decoding algorithm can be explained as follows. Let mu and mv be the means of u and v, respectively. By taking the expectation of both sides of (1), we get

m_v^{(l)} = m_{u_0} + (d_v - 1)\, m_u^{(l-1)}. \qquad (3)

The updated mean m_u^{(l)} at the l-th iteration can be calculated by taking the expectation of both sides of (2), i.e.,

E\!\left[\tanh\frac{u^{(l)}}{2}\right] = \left( E\!\left[\tanh\frac{v^{(l)}}{2}\right] \right)^{d_c - 1}, \qquad (4)

where E[\tanh\frac{u}{2}] = \frac{1}{\sqrt{4\pi m_u}} \int_{\mathbb{R}} \tanh\frac{u}{2}\, e^{-\frac{(u - m_u)^2}{4 m_u}}\, du. Let \phi(x) be the function defined as

\phi(x) =
\begin{cases}
1 - \dfrac{1}{\sqrt{4\pi x}} \displaystyle\int_{\mathbb{R}} \tanh\dfrac{u}{2}\, e^{-\frac{(u-x)^2}{4x}}\, du, & \text{if } x > 0,\\[2mm]
1, & \text{if } x = 0,\\[1mm]
0, & \text{if } x = \infty.
\end{cases} \qquad (5)

By combining (3), (4), and (5), a recursive equation for mu can be derived as

m_u^{(l)} = \phi^{-1}\!\left( 1 - \left[ 1 - \phi\!\left( m_{u_0} + (d_v - 1)\, m_u^{(l-1)} \right) \right]^{d_c - 1} \right). \qquad (6)

In the next subsections, density evolution analysis is performed for two partitioning schemes, uniformly at random partitioning and bimodal partitioning. The former distributes the check nodes among the subsets with equal probability, while the latter imposes constraints on how the check nodes are distributed among the subsets, which may not be feasible in some cases. By considering these two rather extreme schemes, we can get an idea of how to partition the check nodes.

2.2. Density Evolution with Gaussian Approximation for Uniformly at Random Partitioning

Suppose that all check nodes are partitioned into p subsets. We assume that each check node is placed in one of the p subsets with probability 1/p, which is called uniformly at random partitioning. Then the sequential message-passing decoding algorithm can be analyzed as follows. Let Sj, 1 ≤ j ≤ p, be the j-th subset of check nodes and uSj be a message from a check node in Sj to a variable node. Let (a1, a2, ..., ap) denote the distribution of edges from a variable node to the p subsets, where aj is the number of edges connected from the variable node to check nodes in Sj and \sum_{j=1}^{p} a_j = d_v. Then the number of distinct edge distributions (a1, a2, ..., ap) for connections from a variable node to the p subsets is the number of combinations with repetition {}_{p}H_{d_v} = (d_v + p - 1)!/(d_v!\,(p-1)!), where x! denotes x × (x − 1) × ⋯ × 2 × 1. When e edges from a variable node are connected to Sj, the number of distinct edge distributions for connections from this variable node to the p subsets is {}_{p-1}H_{d_v - e}. For a given distribution (a1, a2, ..., ap) with ai ≠ 0, the mean of the message from a variable node to the subset Si in the conventional message-passing decoding algorithm can be expressed as

m_{u_0} + (a_i - 1)\, m_{u_{S_i}}^{(l-1)} + \sum_{\substack{j=1 \\ j \neq i}}^{p} a_j\, m_{u_{S_j}}^{(l-1)}.

Since the sequential message-passing decoding procedure is performed sequentially from S1 to Sp, the above equation should be modified as

m_{u_0} + (a_i - 1)\, m_{u_{S_i}}^{(l-1)} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)}. \qquad (7)

To consider messages from a variable node to the subset Si, it is assumed that ai can vary from 1 to dv. Since uniformly at random partitioning is assumed, the probability that the message with the mean value in (7) is passed to the subset Si can be derived as

\frac{(d_v - 1)!}{a_1!\, a_2! \cdots a_{i-1}!\, (a_i - 1)!\, a_{i+1}! \cdots a_p!} \times \frac{1}{p^{d_v - 1}}. \qquad (8)

Using (6), (7), and (8), the recursive equation for the mean of the message from a check node in Si to a variable node can be expressed as

m_{u_{S_i}}^{(l)} = \phi^{-1}\!\Bigg( 1 - \bigg[ 1 - \sum_{\substack{(a_1, \ldots, a_p) \\ a_i \neq 0}} \frac{(d_v - 1)!}{a_1!\, a_2! \cdots (a_i - 1)! \cdots a_p!\, p^{d_v - 1}} \, \phi\Big( m_{u_0} + (a_i - 1)\, m_{u_{S_i}}^{(l-1)} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)} \Big) \bigg]^{d_c - 1} \Bigg). \qquad (9)
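To illustrate how recursion (9) can be evaluated numerically, the following Python sketch enumerates the edge distributions (a1, ..., ap) and applies the multinomial weight (8). It assumes the exponential approximation of φ from [13] that is also used in Section 2.4, and inverts φ by bisection; both are implementation choices of the sketch rather than parts of the derivation.

```python
import math
from itertools import product

def phi(x):
    # approximation of phi(x) from [13]; phi(0) = 1 by definition (5)
    return 1.0 if x == 0 else math.exp(-0.4527 * x**0.86 + 0.0218)

def phi_inv(y, lo=1e-9, hi=1e4):
    # numerical inverse of the decreasing function phi, by bisection
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def update_mean_uniform(m_u0, m_old, m_new, i, dv, dc, p):
    """One evaluation of recursion (9) for subset S_i (0-indexed here).

    m_old : means m_{u,S_j}^{(l-1)} for all p subsets
    m_new : means m_{u,S_j}^{(l)}; only entries j < i (already processed) are used
    """
    acc = 0.0
    for a in product(range(dv + 1), repeat=p):      # all candidate (a_1, ..., a_p)
        if sum(a) != dv or a[i] == 0:
            continue
        # probability (8) of this edge distribution under random partitioning
        denom = math.factorial(a[i] - 1)
        for j in range(p):
            if j != i:
                denom *= math.factorial(a[j])
        prob = math.factorial(dv - 1) / (denom * p**(dv - 1))
        # mean of the variable-to-check message, eq. (7)
        mean_v = (m_u0 + (a[i] - 1) * m_old[i]
                  + sum(a[j] * m_new[j] for j in range(i))
                  + sum(a[j] * m_old[j] for j in range(i + 1, p)))
        acc += prob * phi(mean_v)
    return phi_inv(1.0 - (1.0 - acc)**(dc - 1))
```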
2.3. Density Evolution with Gaussian Approximation for Bimodal Partitioning

We now consider the case where the check nodes connected to a variable node are distributed among all subsets S1, S2, ..., Sp as evenly as possible. Then, among the check nodes connected to a variable node, the number contained in each subset takes one of two consecutive values; therefore, this scheme is called bimodal partitioning. Note that if the number of check nodes connected to a variable node is divisible by the number of subsets, then each subset contains the same number of them. The bimodal partitioning is analyzed by considering the two cases p < dv and p ≥ dv.

2.3.1. p < dv

Suppose dv = bp + r, where b is a positive integer and 0 ≤ r ≤ p − 1. Then b edges from a variable node are connected to each subset, and the edge distributions for connections from a variable node to the p subsets are determined by the connections of the remaining r edges. Hence, the number of distinct edge distributions (a1, a2, ..., ap) for connections from a variable node to the p subsets is {}_{p}C_{r} = p!/(r!\,(p-r)!), and either b or b + 1 edges from a variable node are connected to each subset. When b edges from a variable node are connected to Si, the number of distinct edge distributions for connections between this variable node and the p subsets is {}_{p-1}C_{r}; when b + 1 edges are connected to Si, it is {}_{p-1}C_{r-1}. Then, for a given (a1, a2, ..., ap) where each aj is b or b + 1, the mean m_{v_{S_i}}^{(l)} of the message from a variable node to the subset Si can be expressed as

m_{v_{S_i}}^{(l)} = m_{u_0} + (a_i - 1)\, m_{u_{S_i}}^{(l-1)} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)}, \qquad (10)

and the probability that this message is passed to the subset Si can be derived as

\Pr\!\big( m_{v_{S_i}}^{(l)} \big) = \frac{a_i}{b\, {}_{p-1}C_{r} + (b + 1)\, {}_{p-1}C_{r-1}}. \qquad (11)

The recursive equation for the mean of the message from a check node in the subset Si to a variable node can then be expressed as

m_{u_{S_i}}^{(l)} = \phi^{-1}\!\Bigg( 1 - \bigg[ 1 - \sum_{(a_1, \ldots, a_p)} \frac{a_i}{b\, {}_{p-1}C_{r} + (b + 1)\, {}_{p-1}C_{r-1}} \, \phi\Big( m_{u_0} + (a_i - 1)\, m_{u_{S_i}}^{(l-1)} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)} \Big) \bigg]^{d_c - 1} \Bigg). \qquad (12)

2.3.2. p ≥ dv

In this case, the number of distinct edge distributions for connections from a variable node to the p subsets becomes {}_{p}C_{d_v}. Assume that one edge from a variable node is connected to the subset Si; then the number of distinct edge distributions is {}_{p-1}C_{d_v - 1}. For a given (a1, a2, ..., ap) where each aj is zero or one, the mean of the message from a variable node to the subset Si can be expressed as

m_{u_0} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)}. \qquad (13)

The probability that the message with the mean value in (13) is passed to the subset Si is 1/{}_{p-1}C_{d_v - 1}. The recursive equation for the mean of the message from a check node in Si to a variable node can be expressed as

m_{u_{S_i}}^{(l)} = \phi^{-1}\!\Bigg( 1 - \bigg[ 1 - \sum_{\substack{(a_1, \ldots, a_p) \\ a_i = 1}} \frac{1}{{}_{p-1}C_{d_v - 1}} \, \phi\Big( m_{u_0} + \sum_{j=1}^{i-1} a_j\, m_{u_{S_j}}^{(l)} + \sum_{j=i+1}^{p} a_j\, m_{u_{S_j}}^{(l-1)} \Big) \bigg]^{d_c - 1} \Bigg). \qquad (14)
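For bimodal partitioning, the admissible edge distributions and the probability that a message is passed to a given subset can be enumerated directly. The sketch below (with 0-indexed subsets, an assumption of the sketch) covers both the p < dv case of (11) and the p ≥ dv case; these pieces plug into the same kind of mean recursion as in the uniformly at random case.

```python
import math
from itertools import product

def bimodal_distributions(dv, p):
    # edge distributions (a_1, ..., a_p) that are "as even as possible":
    # every a_j is b or b+1 where dv = b*p + r (for p >= dv this means 0 or 1)
    b = dv // p
    return [a for a in product((b, b + 1), repeat=p) if sum(a) == dv]

def prob_message_to_subset(dv, p, a, i):
    # probability that a message with the mean determined by distribution `a`
    # is passed to subset S_i: eq. (11) for p < dv, and 1 / C(p-1, dv-1) for
    # p >= dv (nonzero only when a_i = 1)
    b, r = divmod(dv, p)
    if p < dv:
        denom = b * math.comb(p - 1, r)
        if r > 0:
            denom += (b + 1) * math.comb(p - 1, r - 1)
        return a[i] / denom
    return a[i] / math.comb(p - 1, dv - 1)
```

For example, with dv = 3 and p = 2 the distributions are (1, 2) and (2, 1), and the probabilities of passing a message to S_1 are 1/3 and 2/3, which sum to one as expected.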
2.4. Mean and BER Curves for p = 2, 3, and 4

By using the previous results, the mean evolutions and the BER curves for p = 2, 3, and 4 are obtained and compared. Equation (5) can be approximated as φ(x) ≈ e^{−0.4527 x^{0.86} + 0.0218} [13]. The threshold value of the (3, 6) regular LDPC code under the message-passing decoding algorithm is then 0.8747, which is also the threshold value for the sequential decoding algorithm. Under binary phase-shift keying (BPSK) modulation over the additive white Gaussian noise (AWGN) channel with standard deviations 0.83 and 0.87, both below 0.8747, the mean values at each iteration for the (3, 6) regular LDPC code are compared in Fig. 2, where R and B stand for uniformly at random partitioning and bimodal partitioning, respectively. The figure shows that partitioning the check nodes increases the convergence speed of the message-passing decoding algorithm when σ = 0.87, and that bimodal partitioning converges faster than uniformly at random partitioning. The bimodal partitioning appears to make the message updates within one iteration more effective because the edges of every variable node are spread over the subsets as evenly as possible, so the message update for each subset utilizes a similar number of messages from variable nodes. Fig. 2 also shows that the convergence speed increases with the number of subsets; however, the gain becomes negligible for p ≥ 4. This can be explained as follows: since the degree of the variable nodes is 3, for p beyond 3 the bimodal partitioning has essentially the same structure as for p = 3, and therefore the performance is similar.
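As a rough way to reproduce the p = 1 curves of this subsection, the sketch below iterates the conventional recursion (6) with the φ approximation above and converts the a-posteriori mean into a BER estimate. The consistent-Gaussian assumption (LLR variance equal to twice its mean) and the bisection inverse of φ are assumptions of the sketch, not statements from the paper.

```python
import math

def phi(x):
    return 1.0 if x == 0 else math.exp(-0.4527 * x**0.86 + 0.0218)

def phi_inv(y, lo=1e-9, hi=1e4):
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def mean_and_ber_evolution(sigma, dv=3, dc=6, iters=60):
    # BPSK over AWGN: the channel LLR has mean m_u0 = 2 / sigma^2
    m_u0 = 2.0 / sigma**2
    m_u, means, bers = 0.0, [], []
    for _ in range(iters):
        # recursion (6), conventional (p = 1) schedule
        m_u = phi_inv(1.0 - (1.0 - phi(m_u0 + (dv - 1) * m_u))**(dc - 1))
        m_v = m_u0 + dv * m_u            # mean of the a-posteriori LLR
        means.append(m_v)
        # BER estimate for a Gaussian LLR with variance 2 * m_v
        bers.append(0.5 * math.erfc(math.sqrt(m_v) / 2.0))
    return means, bers

# mean_and_ber_evolution(0.87) grows slowly before taking off, matching the
# p = 1 curve of Fig. 2 qualitatively; with sigma above the threshold 0.8747
# the mean stalls instead.
```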
Figure 2: Mean evolution of a (3, 6) regular LDPC code for various numbers of subsets when σ = 0.87 (mean of the messages versus the number of iterations for p = 1, p = 2 R, p = 2 B, p = 3 B, and p = 4 B).

Fig. 3 shows the BER of the (3, 6) regular LDPC code for uniformly at random and bimodal partitioning with p = 2 and for bimodal partitioning with p = 3 and 4. It shows that the BER performance of the sequential decoding algorithm with partitioning is better than that without partitioning, and that the sequential decoding algorithm with bimodal partitioning is superior to that with uniformly at random partitioning.

Figure 3: BER of the (3, 6) regular LDPC code versus the number of iterations when σ = 0.83 and 0.87.

3. Simulation Results

The sequential decoding algorithm with p = 1, 3, and 4 is simulated for BPSK modulation over the AWGN channel. As before, R and B denote uniformly at random partitioning and bimodal partitioning, respectively. Fig. 4 shows the BER performance of an irregular LDPC code with length 1000 and rate 1/2, constructed by optimizing the degree distribution [3] with the maximum variable-node degree restricted to 8. Fig. 5 shows the BER performance of a randomly constructed (3, 6) regular LDPC code with length 1000 and rate 1/2. In Figs. 4 and 5, the sequential decoding algorithm with uniformly at random partitioning is used.
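For reference (this conversion is standard for BPSK and is not stated in this form in the paper), the noise standard deviations used in the analysis relate to the Eb/N0 axis of Figs. 4–6 as follows for a rate-1/2 code.

```python
import math

def sigma_from_ebno_db(ebno_db, rate=0.5):
    # BPSK over AWGN with unit-energy symbols: sigma^2 = 1 / (2 * R * Eb/N0)
    ebno = 10.0 ** (ebno_db / 10.0)
    return 1.0 / math.sqrt(2.0 * rate * ebno)

# For rate 1/2, the threshold sigma* = 0.8747 of the (3, 6) ensemble corresponds
# to Eb/N0 = 10 * log10(1 / (2 * 0.5 * 0.8747**2)), roughly 1.16 dB on these axes.
```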
Figure 4: Performance comparison for an irregular LDPC code with length 1000 and rate 1/2 (BER versus Eb/N0 in dB for p = 1 and p = 4 R at 1, 5, 10, 20, and 50 iterations).

Figure 5: Performance comparison for a (3, 6) regular LDPC code with length 1000 and rate 1/2 (BER versus Eb/N0 in dB for p = 1 and p = 4 R at 1, 5, 10, 20, and 50 iterations).
Figs. 4 and 5 show that the BER performance of the sequential decoding algorithm is better than that of the conventional decoding algorithm, especially at 5 and 10 iterations. For 50 iterations, however, the BER improvement diminishes since both decoding methods have accumulated enough iteration gain. Note that both decoding algorithms have the same threshold values. Fig. 6 shows the performance improvement of the sequential decoding algorithm with bimodal partitioning for a (3, 6) quasi-cyclic (QC) LDPC code [14] with length 4092 and rate 1/2. By computer search, we found shift values for the circulant permutation matrices of the QC LDPC code such that its girth becomes 8.
Figure 6: Performance comparison of bimodal and uniformly at random partitioning for a (3, 6) regular QC LDPC code with length 4092 and rate 1/2 (BER versus Eb/N0 in dB for p = 1 and p = 3 at 1, 5, 10, and 50 iterations).

4. CONCLUSIONS

The sequential message-passing decoding algorithm with check-node partitioning outperforms the conventional decoding algorithm, especially for a small number of iterations. This implies that the sequential algorithm improves the convergence speed without increasing the decoding complexity. By using density evolution with Gaussian approximation, we investigated why the proposed decoding algorithm converges faster. Moreover, the sequential algorithm can be applied to any code represented by a Tanner graph, and it is useful for implementing practical decoders when computing power is limited.

References

[1] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA: MIT Press, 1963.

[2] D. J. C. MacKay and R. M. Neal, "Near Shannon limit performance of low density parity check codes," Electron. Lett., vol. 32, no. 18, pp. 1645–1646, Aug. 1996.

[3] T. Richardson and R. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.

[4] M. Cocco, J. Dielissen, M. Heijligers, A. Hekstra, and J. Huisken, "A scalable architecture for LDPC decoding," in Proc. DATE'04, pp. 88–93, Feb. 2004.

[5] J. Zhang and M. Fossorier, "Shuffled iterative decoding," IEEE Trans. Commun., vol. 53, no. 2, pp. 209–213, Feb. 2005.

[6] H. Kfir and I. Kanter, "Parallel versus sequential updating for belief propagation decoding," Physica A, vol. 330, pp. 259–270, 2003.

[7] Y. Mao and A. H. Banihashemi, "Decoding low-density parity-check codes with probabilistic schedule," IEEE Commun. Lett., vol. 5, no. 10, pp. 414–416, Oct. 2001.

[8] H. Xiao and A. H. Banihashemi, "Graph-based message-passing schedules for decoding LDPC codes," IEEE Trans. Commun., vol. 52, no. 12, pp. 2098–2105, Dec. 2004.

[9] R. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Inform. Theory, vol. IT-27, pp. 533–547, Sept. 1981.

[10] E. Yeo, P. Pakzad, B. Nikolic, and V. Anantharam, "High throughput low-density parity-check decoder architectures," in Proc. GLOBECOM'01, Nov. 2001, pp. 3019–3024.

[11] M. M. Mansour and N. R. Shanbhag, "Turbo decoder architecture for low-density parity-check codes," in Proc. IEEE Global Telecommun. Conf., Nov. 2002, pp. 1383–1388.

[12] E. Sharon, S. Litsyn, and J. Goldberger, "An efficient message-passing schedule for LDPC decoding," in Proc. IEEE Convention of Electrical and Electronics Engineers in Israel, pp. 223–226, Sept. 2004.

[13] S.-Y. Chung, On the Construction of Some Capacity-Approaching Coding Schemes, Ph.D. dissertation, MIT, Cambridge, MA, Sept. 2000.

[14] M. Fossorier, "Quasi-cyclic low-density parity-check codes from circulant permutation matrices," IEEE Trans. Inform. Theory, vol. 50, no. 8, pp. 1788–1793, Aug. 2004.