A 0.8V CMOS ANALOG DECODER FOR AN (8,4,4) EXTENDED HAMMING CODE

Nhan Nguyen, Chris Winstead, Vincent C. Gaudet, and Christian Schlegel

Department of Electrical and Computer Engineering
2nd Floor, ECERF Bldg., University of Alberta
Edmonton, Alberta, Canada, T6G 2V4
{nguyen, winstead, vgaudet, schlegel}@ece.ualberta.ca

ABSTRACT


A novel way to decode error control codes is through the use of analog circuits. Such decoders exploit the non-ideal behaviour of transistors operating in subthreshold mode to process probability information. This paper describes the design of an (8,4,4) extended Hamming decoder operating at a supply voltage of 0.8 V in 0.18 µm CMOS technology. When biased at 1 µA, a decoding rate of 444 kbps and an energy per decoded bit of 0.64 nJ/b are achieved.

Fig. 1. (8,4,4) extended Hamming factor graph

1. INTRODUCTION

The most powerful codes are turbo [1] and LDPC [2, 3] codes. Decoders for these codes process soft information iteratively to calculate final decision values. This can be done naturally with analog networks, where soft information is represented by voltages and currents. These values propagate through the network and settle to a steady state that yields the decision. Decoders based on subthreshold mode networks and their variants were first realized in [4] and subsequently in [5, 6, 7, 8] as an alternative to increasingly complex digital implementations.

As feature size shrinks, the allowable operating voltage decreases. To make analog decoders viable for future processes, low voltage decoders are being studied. In this paper, we present a maximum likelihood decoder realization which operates below 1 V. We describe the basic blocks used for its construction and their calculated errors. A decoder architecture is discussed along with the operation of its interfaces. The transient characteristics of an error correction are shown. The sensitivity of the decoder outputs to reset is discussed briefly. We end the paper by showing the timing and operation of the overall decoder.

Thanks to Alberta iCORE, Canadian Microelectronics Corporation, NSERC, Micronet R & D, and the University of Alberta Faculty of Engineering for funding.

2. CODES AND GRAPHS

Codes, when described as factor graphs [9, 10], can easily be mapped to analog networks. Factor graphs break down global functions into smaller local functions which are highly connected to each other. Decoding operations on such a graph follow the rules of the sum-product algorithm [10]. An (8,4,4) extended Hamming code factor graph is shown in Fig. 1. Here, extra redundancy is added to further increase the performance [5]. The graph contains three types of nodes: equality, check, and variable nodes. Connections between nodes are bidirectional. Following the convention of [5], the check node performs the function

$$\begin{pmatrix} p_z(0) \\ p_z(1) \end{pmatrix} = \begin{pmatrix} p_x(0)\,p_y(0) + p_x(1)\,p_y(1) \\ p_x(0)\,p_y(1) + p_x(1)\,p_y(0) \end{pmatrix} \tag{1}$$

and the equality node,

$$\begin{pmatrix} p_z(0) \\ p_z(1) \end{pmatrix} = \gamma \begin{pmatrix} p_x(0)\,p_y(0) \\ p_x(1)\,p_y(1) \end{pmatrix} \tag{2}$$

where a constant factor γ is used to make pz(0) + pz(1) = 1. A single unidirectional node operates on two input variables x and y to produce an output z. In the above equations, these variables are probability distributions with two possible values. In implementation, equality and check nodes are also termed ‘probability gates’ or ‘soft gates’ since their design methodology and roles are similar to those of digital gates. Variable nodes serve as input-output (I/O) ports.

3. PROBABILITY GATES

Equality and check gates can be implemented using traditional Gilbert [11] multipliers where probability distributions are represented as currents. In [12] we showed that a modified version of the multiplier in [13], the ‘low voltage Gilbert multiplier’ shown in Fig. 2, can be used to perform similar computations. There are three differences: (i) the sources of the diode-connected current mirrors do not need to be biased and can be connected to ground; (ii) the bias current transistor M1 has to operate in non-saturation; (iii) additional variables need to be included to balance the denominator term if the circuit is to operate on probability distributions. While the circuit of Fig. 2 can be described as a vector-by-scalar multiplication, the nth output current branch is given simply as

$$O_n = \frac{I_s I_n}{I_s + \sum_n I_n} \tag{3}$$

where Is is the local bias current, In is the nth input current, and On is the nth output current branch. By exploiting this relationship, an equality gate and a check gate can be constructed as shown in Figs. 3 and 4 respectively [12]. The output currents Iz0 and Iz1 generated by the circuits of Figs. 3 and 4 will be half of their ideal values, because two probability distributions are summed in the denominator. Amplification can be achieved using normalizing current mirrors [12], where a differential pair is used to boost the output currents in units of Iu. The unit current Iu is typically used to represent probability 1. However, since the output currents do not always sum to Iu, it is more accurately described as the global bias current. The output current pair is a log likelihood ratio (LLR) value which can be passed on to the next gate for further processing. A unidirectional gate is constructed by capping the circuits shown in Figs. 3 and 4 with normalizers. By doing this, we arrive at the following equations for the equality node

$$\begin{pmatrix} I_{z0} \\ I_{z1} \end{pmatrix} = k_c \begin{pmatrix} I_{x0} I_{y0} \\ I_{x1} I_{y1} \end{pmatrix} \tag{4}$$

and for the check node

$$\begin{pmatrix} I_{z0} \\ I_{z1} \end{pmatrix} = k_e \begin{pmatrix} I_{x0} I_{y0} + I_{x1} I_{y1} \\ I_{x0} I_{y1} + I_{x1} I_{y0} \end{pmatrix} \tag{5}$$

where ke and kc are constants which can be controlled by normalizer transistor sizing. As discussed in [10], these constants will have no effect on the final output of the decoder.

These gates have been simulated using HSPICE to understand errors in the computed output values. Fig. 5 shows a typical surface plot with x and y input probabilities in the range of 0.01 to 0.99. The error is a difference in LLR

$$\Delta \mathrm{LLR} = \log\left(\frac{I_{z0}}{I_{z1}}\right) - \log\left(\frac{p_z(0)}{p_z(1)}\right) \tag{6}$$

where pz(0), pz(1) are the expected probability outputs and Iz0, Iz1 are their simulated current counterparts. We observed more errors as transistor M1 (refer to Fig. 2) is pushed into the saturation region. This means that for good performance, the global bias current and the supply voltage should be low. In addition, we found that check gates perform better than equality gates. In both types of gates, the error is relatively low in the mid probability region (where both inputs are close to 0.5). On the decoder level, we do not expect these errors to contribute much to the final output since, ultimately, it is the overall bit decisions that matter.

Fig. 2. Low voltage Gilbert multiplier (inputs I1 ... IN, outputs O1 ... ON, bias current IS, bias transistor M1)
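The gate functions of equations (1) and (2) and the error measure of equation (6) can be checked numerically. The following is a minimal sketch in Python, not the authors' simulation flow: the function names, the input probabilities, and the scale factor `k` are illustrative assumptions. It also shows why a common scale factor on both output currents, such as the normalizer constants in (4) and (5), leaves the LLR unchanged.

```python
import math

def check_node(px, py):
    """Eq. (1): px and py are (p(0), p(1)) pairs; returns (pz(0), pz(1))."""
    return (px[0] * py[0] + px[1] * py[1],
            px[0] * py[1] + px[1] * py[0])

def equality_node(px, py):
    """Eq. (2): elementwise product, rescaled by gamma so the result sums to 1."""
    z0 = px[0] * py[0]
    z1 = px[1] * py[1]
    gamma = 1.0 / (z0 + z1)
    return (gamma * z0, gamma * z1)

def delta_llr(iz, pz):
    """Eq. (6): LLR of the output currents minus LLR of the expected probabilities."""
    return math.log(iz[0] / iz[1]) - math.log(pz[0] / pz[1])

# Illustrative inputs (assumed values, not taken from the paper's simulations).
px, py = (0.7, 0.3), (0.6, 0.4)
pz_check = check_node(px, py)
pz_equal = equality_node(px, py)
print("check node   :", pz_check)
print("equality node:", pz_equal)

# Hypothetical ideal gate currents: the products of Eq. (4) times an
# arbitrary constant k. The common factor cancels in the ratio Iz0/Iz1.
k = 2.5e-9
iz_ideal = (k * px[0] * py[0], k * px[1] * py[1])
print("delta LLR, ideal currents :", delta_llr(iz_ideal, pz_equal))

# Hypothetical non-ideal currents (made-up numbers) give a nonzero error.
iz_off = (0.75e-6, 0.25e-6)
print("delta LLR, skewed currents:", delta_llr(iz_off, pz_equal))
```

Because only the ratio of the two output currents enters the LLR, any error in absolute current level that is common to both branches does not appear in equation (6).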

4. DECODER AND SIMULATION RESULTS

The decoder architecture is shown in Fig. 6. At the input interface, VLLR and VREF are sampled serially before being passed into the decoder in parallel. The decoder converts these voltages into probability currents and processes the information. The outputs from the decoder are compared to arrive at final digital bit decisions. Clocking is needed only for the I/O interfaces.

The decoder was designed in a 0.18 µm 6M1P CMOS process; the layout of the core is shown in Fig. 7. The core dimensions are 158 x 276 µm. An IC is currently being fabricated. Simulation results shown in this paper are from extracted views without pads.

The chosen decoder can correct at most one error. The transient characteristics of an error correction (on bit 3) are shown in Fig. 8. In this case, currents were injected into the decoder core and output currents were measured. We used a supply of 0.5 V and a global bias of 10 nA. This moderate error was corrected in less than 10 µs.

The decoder is sensitive to reset since old probability values are fed back into the circuit. The swing on one out of every two output currents is severely limited. Since two current branches represent a probability distribution, pass transistors are used to equalize that distribution. This equalizing of the distribution, or reset, takes place before a new code word is accepted.

The timing of the decoder and its operation can be seen in Fig. 9. An initial framing signal FRAME is needed to initialize the input interface. Eight clock cycles (falling edge triggered) are used to shift in the serial voltages VREF and VLLR. Another 8 clock cycles are used by the decoder to process information. In total, 16 clock cycles are required until the first decoded bits appear. Thereafter, because of decoder reset and pipelining, successive decoded bits will appear every 9 clock cycles. As an example, three received words are injected into the decoder. To simplify simulation setup, all input probabilities are 0.8. The first, second, and third words are 10110110, 00111110, and 11111000 respectively. The error bits are shown in bold. The corrected information bits are seen on DOUT.

Fig. 3. Equality node

Fig. 4. Check node

Fig. 5. Equality node error ∆LLR (VDD = 0.5 V, Iu = 1 nA)

Fig. 6. Decoder architecture (blocks: input, decoder, output; signals: VLLR, VREF, RST, CLK, DOUT; decoder ports u0 to u3 and x4 to x7)

Fig. 7. Decoder core layout

Fig. 8. Transient characteristics of an error correction (error correction on bit 3). Input probabilities: p1(0) = 0.7, p2(0) = 0.6, p3(0) = 0.3, p4(0) = 0.2, p5(0) = 0.8, p6(0) = 0.3, p7(0) = 0.2, p8(0) = 0.1

Fig. 9. Decoder operating on 3 received words (output labels: INVALID, 1010, 0011, 1110)

5. CONCLUSIONS

With supplies of VDD = 0.8 V and Iu = 1 µA, the simulated power consumption is roughly 283 µW. The I/O circuits are clocked at 1 MHz, giving a decoding rate of 444 kbps. The energy per decoded bit is then 0.64 nJ/b. We have demonstrated the feasibility of low voltage analog decoding.

6. REFERENCES

[1] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: turbo codes,” in Proc. IEEE ICC, Geneva, Switzerland, May 1993, pp. 1064–1070.

[2] R. G. Gallager, Low-Density Parity-Check Codes, MIT Press, 1963.

[3] D. J. C. MacKay and R. M. Neal, “Good codes based on very sparse matrices,” in Proc. 5th IMA Conference, 1995, pp. 100–111, Springer.

[4] F. Lustenberger, M. Helfenstein, H.-A. Loeliger, F. Tarkoy, and G. S. Moschytz, “All-analog decoder for a binary (18,9,5) tail-biting trellis code,” in Proc. ESSIRC, Duisburg, Germany, Sept. 1999, pp. 362–365.

[5] F. Lustenberger, On the Design of Analog VLSI Iterative Decoders, Ph.D. thesis, ETH, Zurich, Nov. 2000.

[6] M. Moerz, T. Gabara, R. Yan, and J. Hagenauer, “An analog 0.25µm BiCMOS tailbiting MAP decoder,” in Proc. IEEE ISSCC, San Francisco, CA, Feb. 2000, pp. 356–357.

[7] C. Winstead, J. Dai, W. J. Kim, S. Little, Y.-B. Kim, C. Myers, and C. Schlegel, “Analog MAP decoder for (8,4) Hamming code in subthreshold CMOS,” in Proc. Advanced Research in VLSI Conf., Salt Lake City, UT, March 2001, pp. 132–147.

[8] V. Gaudet and G. Gulak, “A 13.3Mbps 0.35µm CMOS analog turbo decoder IC with a configurable interleaver,” in Proc. ISSCC, Feb. 2003, pp. 148–149, 484.

[9] G. D. Forney, Jr., “Codes on graphs: normal realizations,” IEEE Trans. Inform. Theory, vol. 47, pp. 520–548, Feb. 2001.

[10] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, pp. 498–519, Feb. 2001.

[11] B. Gilbert, “A precise four-quadrant multiplier with subnanosecond response,” IEEE J. of Solid-State Circuits, vol. 3, pp. 365–373, 1968.

[12] C. Winstead, N. Nguyen, V. Gaudet, and C. Schlegel, “Low-voltage CMOS translinear circuits for analog decoders,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, Sept. 2003.

[13] E. Seevinck, E. A. Vittoz, M. du Plessis, T-H. Joubert, and W. Beetge, “CMOS translinear circuits for minimum supply voltage,” IEEE Trans. on Circuits and Systems II, vol. 47, no. 12, pp. 1560–1564, Dec. 2000.
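As a check on the figures quoted in the conclusions, the decoding rate and energy per bit can be recomputed from the numbers given in the text. This is a hedged sketch: the paper does not spell out how 444 kbps is derived, and the reading assumed here is four information bits delivered every nine clock cycles. The variable and function names are illustrative.

```python
# Values quoted in the paper.
CLOCK_HZ = 1e6            # I/O clock frequency
POWER_W = 283e-6          # simulated power at VDD = 0.8 V, Iu = 1 uA
FIRST_WORD_CYCLES = 16    # cycles until the first decoded bits appear

# Assumed interpretation (not stated explicitly in the paper):
INFO_BITS_PER_WORD = 4    # information bits of one (8,4,4) codeword
CYCLES_PER_WORD = 9       # steady-state spacing between decoded words

def decoding_rate_bps(clock_hz, cycles_per_word, info_bits):
    """Information bits delivered per second in steady state."""
    return info_bits * clock_hz / cycles_per_word

def energy_per_bit_j(power_w, rate_bps):
    """Energy spent per decoded information bit."""
    return power_w / rate_bps

rate = decoding_rate_bps(CLOCK_HZ, CYCLES_PER_WORD, INFO_BITS_PER_WORD)
energy = energy_per_bit_j(POWER_W, rate)
latency = FIRST_WORD_CYCLES / CLOCK_HZ

print(f"decoding rate   : {rate / 1e3:.0f} kbps")
print(f"energy per bit  : {energy * 1e9:.2f} nJ/b")
print(f"initial latency : {latency * 1e6:.0f} us")
```

Under this assumption the sketch gives about 444 kbps and 0.64 nJ/b, matching the reported values.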