17.2 4x12 Gb/s 0.96 pJ/b/Lane Analog-IIR Crosstalk Cancellation and ...

4×12 Gb/s 0.96 pJ/b/lane Analog-IIR Crosstalk Cancellation and Signal Reutilization Receiver for Single-Ended I/Os in 65 nm CMOS

Taehyoun Oh and Ramesh Harjani
University of Minnesota, Minneapolis, USA, Email: [email protected]

Abstract A crosstalk cancellation and signal reutilization (XTCR) algorithm implemented with analog-IIR networks substantially improves signal integrity across 4 closely spaced single-ended PCB traces. The prototype XTCR design, implemented in 65 nm CMOS, improves the measured average horizontal and vertical eye openings of the 4 channels by 37.5%UI and 26.4%, respectively, at a BER of 10⁻⁸, while consuming only 0.96 pJ/b/lane.

Introduction The demand for higher throughput, combined with the finite number of I/Os, has increased the need for higher data rates per pin. Unfortunately, higher per-pin rates usually bring increased crosstalk noise and increased power. Crosstalk can be reduced by using differential I/Os, but at the cost of doubling the number of I/O pads and increasing power consumption. Crosstalk cancellation (XTC) is a harder problem in single-ended I/Os, which are often used in memory interfaces, because the crosstalk signal is larger and does not tail off as rapidly with line separation (~1/D) as it does in differential I/Os (~1/D³) [3]. In [1], crosstalk between two differential I/Os is reduced by using passive switched capacitors within a decision feedback equalizer. In [3], a passive analog-IIR based crosstalk cancellation and signal reutilization (XTCR) scheme was introduced. However, [1-3] demonstrate XTC between only 2 channels with just one aggressor, and a generalized extension to multiple channels (≥4) is not straightforward, which limits their applicability. In this work, we extend the two-channel XTCR scheme of [3] to support an arbitrary number of channels and verify the extension with a prototype 4-channel XTCR design operating at 12 Gb/s per pin (2× the speed of [3]) for single-ended I/Os.
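As a rough illustration of the spacing dependence quoted above, the sketch below compares how much the crosstalk amplitude drops when the line separation D is doubled. It assumes the ~1/D and ~1/D³ trends are ideal amplitude power laws, which is a simplification; the function name and the chosen spacing ratio are hypothetical and are not taken from [3].

```python
import math


def crosstalk_change_db(spacing_ratio, exponent):
    """Change in crosstalk amplitude (dB) when spacing grows by spacing_ratio.

    Assumes amplitude ~ 1/D**exponent (idealized power law).
    """
    return -20.0 * math.log10(spacing_ratio ** exponent)


# Doubling the spacing under each assumed trend.
single_ended_db = crosstalk_change_db(2.0, 1)   # ~1/D   -> about -6 dB
differential_db = crosstalk_change_db(2.0, 3)   # ~1/D^3 -> about -18 dB

print(f"single-ended: {single_ended_db:.1f} dB")
print(f"differential: {differential_db:.1f} dB")
```

Under these assumptions, extra spacing buys far less isolation for single-ended lines than for differential pairs, which is why active cancellation is attractive in the single-ended case.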

Proposed Algorithm for 4 channel XTCR Fig. 1(a) shows a block diagram of the XTCR receiver and Fig. 1(b) shows the channel spacing (1W, 2W, 1W) used in the measurements discussed later. We intentionally bundle channels 1&2 and channels 3&4 closely (1W spacing) to illustrate the throughput benefit of the XTCR technique, and we widen the bundle-to-bundle spacing between channels 2&3 (2W spacing) to reduce the residual error term within the XTCR receiver to a reasonable level. Four independent, single-ended NRZ signals (X1-X4) are applied to the 4 channels. The matrix representation in Fig. 1(c) shows the transmitted signals (X1-X4), the channel (H, diagonal elements), the received signals (Y1-Y4), and the crosstalk terms (-jωβH, the derivative of the channel [3]). Y1 and Y4 have 1 aggressor each; Y2 and Y3 have 2 aggressors each. We cancel the crosstalk by differentiating the received signals from adjacent channels and adding them with the appropriate gain (β, δ), as shown in Fig. 1(d). For example, the channel 2 output signal (Z2) is free of crosstalk from its 2 adjacent channels and gains beneficially reutilized crosstalk energy that boosts the high-frequency gain by (ω²β² + ω²δ²)HX2. A secondary error term (ω²βδHX4) decreases as the coupling coefficient between channels 2 and 3 (δ) decreases, i.e., as their spacing increases. To study this secondary error term, the spacing between channels 2 and 3 was varied across 1.5W, 2W and 2.5W; the measured amplitudes of the residual error signals were 87, 73 and 42 mVppd, respectively. This secondary error term is not addressed in [1-3], where crosstalk between only 2 channels is handled. Generalized XTC with multiple channels, however, invariably produces this error term, which is handled here. We are able to handle an arbitrary number of channels by separating the bundle pairs by 2W, such that the error term gain (~δ) becomes negligible.
Note that the channel spacing is similar to that of differential lines; however, our channels are single-ended, so the same traces carry twice the number of data channels.

978-1-4673-0849-6/12/$31.00 ©2012 IEEE
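The cancellation described for Fig. 1(c) and Fig. 1(d) can be checked numerically with a small sketch. It assumes a flat channel (H = 1), a normalized angular frequency, and hypothetical coupling values β and δ; the matrices are reconstructed from the description in the text (adjacent-channel crosstalk of -jωβH or -jωδH, cancelled by adding +jωβ or +jωδ times the neighbouring received signal), not copied from the figure.

```python
def xtcr_transfer(w, beta, delta, h=1.0 + 0.0j):
    """Return the 4x4 matrix T with Z = T X after crosstalk cancellation.

    Coupling per adjacent pair: ch1-2 = beta, ch2-3 = delta, ch3-4 = beta.
    """
    n = 4
    jw = 1j * w
    coupling = [beta, delta, beta]

    channel = [[0j] * n for _ in range(n)]    # Y = channel @ X
    canceller = [[0j] * n for _ in range(n)]  # Z = canceller @ Y
    for i in range(n):
        channel[i][i] = h
        canceller[i][i] = 1.0 + 0.0j
    for i, g in enumerate(coupling):
        channel[i][i + 1] = channel[i + 1][i] = -jw * g * h
        canceller[i][i + 1] = canceller[i + 1][i] = jw * g

    # Plain matrix product canceller @ channel.
    return [[sum(canceller[r][m] * channel[m][c] for m in range(n))
             for c in range(n)] for r in range(n)]


# Hypothetical example values (normalized, not measured).
w, beta, delta = 2.0, 0.3, 0.1
row_z2 = xtcr_transfer(w, beta, delta)[1]
print("X1 -> Z2:", abs(row_z2[0]))   # adjacent crosstalk cancels
print("X2 -> Z2:", row_z2[1].real)   # 1 + w^2*beta^2 + w^2*delta^2
print("X3 -> Z2:", abs(row_z2[2]))   # adjacent crosstalk cancels
print("X4 -> Z2:", row_z2[3].real)   # residual w^2*beta*delta
```

In this sketch the Z2 row reduces to (1 + ω²β² + ω²δ²)HX2 + ω²βδHX4, matching the reutilization gain and the secondary error term named in the text; shrinking δ shrinks the residual in proportion.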

Prototype Low Power Circuit Design Fig. 2 shows the circuit implementation of the proposed XTCR receiver. Single-ended to differential converters (SDC) are placed at the front of the receiver for improved PSRR. The phase delays of the crosstalk signal (path 1) and the XTC signal (path 2) are equalized by adding a low-pass filter compensator (1/(1+jωRC)) to path 1, matching the differentiators (jωRC/(1+jωRC)) in path 2. Both use the same pole frequency (10 GHz), so the phase difference between the two paths is 90° across all frequencies, as shown in Fig. 2 (left inset). 3-bit VGAs control the path gain. These VGAs allow the cancellation signal to be adjusted over a wide range, unlike [1], which can only attenuate the passive CR filter output. Current-mode signal adders combine the 2 to 3 large-current branches of high-speed signals. Current-bleeding PMOS transistors prevent excessive DC voltage drop at the output and provide additional gm. Standard differential linear equalizers (LE) remove any remaining ISI.
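The 90° relationship between the two paths follows directly from the two first-order transfer functions, since their ratio is jωRC. The sketch below checks this numerically using the 10 GHz pole frequency stated in the text; the function name and the test frequencies are hypothetical choices, and ideal first-order responses are assumed.

```python
import cmath
import math


def path_phase_difference_deg(f_hz, f_pole_hz=10e9):
    """Phase of the path 2 differentiator relative to the path 1 compensator.

    comp(jw) = 1 / (1 + jwRC), diff(jw) = jwRC / (1 + jwRC), with RC set so
    that the shared pole sits at f_pole_hz (ideal first-order models).
    """
    jwrc = 1j * f_hz / f_pole_hz
    comp = 1.0 / (1.0 + jwrc)
    diff = jwrc / (1.0 + jwrc)
    return math.degrees(cmath.phase(diff / comp))


for f in (0.1e9, 1e9, 6e9, 10e9, 30e9):
    print(f"{f / 1e9:5.1f} GHz: {path_phase_difference_deg(f):.3f} deg")
```

Because both paths share the same denominator, the pole cancels in the ratio and only the frequency-independent 90° of the jωRC term remains.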

Verifying Low Crosstalk Multi-Lanes Signal Transmission 6” PCB traces with the spacings shown in Fig. 1(b) and 24” coaxial connection cables were used to test the prototype. Fig. 3(a) shows the insertion losses (11 dB at 6 GHz) of the 4 channels as well as the crosstalk between them. Note that the crosstalk between non-adjacent channels (ch1&3, ch2&4) is 25 dB lower and can be ignored.

XTC gain calibration: To find the optimal XTC gain between, for example, channels 1 and 2, we transmit a single 500 mVp-p 10 Gb/s NRZ signal on channel 1 only and calibrate the differentiation path gain to the setting that gives the minimum residual crosstalk. Fig. 3(b) shows the residual error amplitude after XTC versus the VGA gain setting. Fig. 4 shows that the crosstalk coupled from channel 1 to channel 2, with an initial amplitude of 233 mVppd, is optimally reduced to 59 mVppd at the receiver output for a [b2b1b0]=[011] gain setting. In the final result after XTC, only 20 mV of the residual noise is deterministic, showing effective suppression (>10 times) of the crosstalk.

Measurement verifications for practical application: Four independent PRBS7 NRZ data streams at 8 and 12 Gb/s were applied to the 4 closely spaced channels, and a BERT scope was used to monitor the eyes and BER contours of the channel 1-4 receiver outputs with XTCR switched off and on, as shown in Fig. 5. The measured BER bathtub curves of channels 1-4 (XTCR on) at 8 and 12 Gb/s are shown in Fig. 6. At 12 Gb/s, all channel eyes are completely closed without XTCR. After turning on XTCR, the 4 channel eye openings show an average improvement of 37.5%UI horizontally and 26.4% vertically at a BER of 10⁻⁸. Error-free zones with a BER of 10⁻¹² have been achieved for all channels. The improved eye openings and other performance numbers are summarized and compared with prior work in Fig. 7. The SDC, XTCR (VGAs, adder) and LE consume 5.9, 11.5 and 3.9 mW/lane, respectively, from a 1.1 V supply.
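The headline energy figure and the suppression ratio can be reproduced from the numbers reported above. The sketch below is only an arithmetic check; the function names are hypothetical, and it assumes the 20 mV deterministic residual can be compared directly against the 233 mVppd initial crosstalk amplitude, as the ">10 times" statement implies.

```python
import math


def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s gives pJ/b."""
    return power_mw / rate_gbps


def suppression_db(before_mv, after_mv):
    """Amplitude suppression expressed in dB."""
    return 20.0 * math.log10(before_mv / after_mv)


# Reported per-lane power breakdown at 12 Gb/s (mW/lane).
sdc_mw, xtcr_mw, le_mw = 5.9, 11.5, 3.9

xtcr_pj = energy_per_bit_pj(xtcr_mw, 12.0)          # about 0.96 pJ/b/lane
total_pj = energy_per_bit_pj(sdc_mw + xtcr_mw + le_mw, 12.0)

print(f"XTCR block only : {xtcr_pj:.2f} pJ/b/lane")
print(f"SDC + XTCR + LE : {total_pj:.2f} pJ/b/lane")
print(f"233 -> 59 mV    : {suppression_db(233.0, 59.0):.1f} dB")
print(f"233 -> 20 mV    : {suppression_db(233.0, 20.0):.1f} dB")
```

The check shows that the 0.96 pJ/b/lane figure corresponds to the XTCR block (VGAs and adder) alone at 12 Gb/s, not to the full receive path including the SDC and LE.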

Conclusions The XTCR occupies 0.036 mm²/lane of chip area. The issue of the residual crosstalk error signal for multiple lanes was identified for the first time and handled efficiently. The implementation can be extended to an arbitrary number of single-ended I/Os as long as the spacing between bundles is slightly larger. While prior XTC techniques [1-3] have presented only single-channel outputs, this work shows the signal integrity improvement for all 4 channels. To the best of our knowledge, this work shows the largest eye improvement at 12 Gb/s.

References
[1] M. H. Nazari et al., IEEE ISSCC Dig., pp. 446-447, Feb. 2011.
[2] S. Bae et al., IEEE ISSCC Dig., pp. 498-499, Feb. 2011.
[3] T. Oh and R. Harjani, IEEE JSSC, pp. 1843-1856, Aug. 2011.

2012 Symposium on VLSI Circuits Digest of Technical Papers


Figure 1. Top block diagram of proposed XTCR receiver and FEXT channel description (channel spacing: W, 2W, W)

Figure 2. Proposed XTCR receiver circuit diagram (blocks per channel: 50 Ω termination, SDC, differentiator and compensator with ωo = 1/RC and 90° phase difference, VGA with bits b0-b2 and 1X/2X/4X weights, adder, LE; inputs Vi1-Vi4, outputs Vo1-Vo4; path 2 is the crosstalk cancellation signal)

Figure 3. Measured loss and FEXT, XTC calibration and multi-lane XTCR Rx die-photo

Figure 4. Crosstalk cancellation performance at ch.2 output (path 1 and path 2 in Figure 2)

Figure 5. Channel 1−4 eye-performance measurement results for XTCR off/on at 8 Gb/s (top) and 12 Gb/s (bottom) and performance improvements in Figure 7 are based on the BER contours at 12 Gb/s and XTCR off/on shown here on the 3rd and 4th row.

Figure 6. Measured bathtub curves (channel 1−4, XTCR on) at 8 Gb/s (left) and 12 Gb/s (right) from the eye-diagrams in the 4th row in Figure 4.

Reference                     [1]*                  [2]*          [3]*           This work
Technology (nm)               45                    45            130            65
XTC type                      Rx passive switching  Tx FIR        Rx analog-IIR  Rx analog-IIR
I/O type                      Differential          Single-ended  Single         Single
Multi-channel #               2                     2             2              4
Data rate (Gb/s)              12.5                  7             6              12
XTC power (pJ/bit/lane)       0.033                 N/A           2.4            Ch. 1-4: 0.85 / 1.07 / 1.07 / 0.85; Avg. 0.96
Horizontal eye improvement    23.5                  4.2           28             Ch. 1-4: 41.4 / 35.8 / 35.7 / 37; Avg. 37.5
  (%UI↑)**, BER < 10⁻⁸
Vertical eye improvement      N/A                   N/A           12.4           Ch. 1-4: 34.3 / 28 / 26.1 / 17.2; Avg. 26.4
  (%↑)**, BER < 10⁻⁸
XTC block area (mm²/lane)     N/A                   N/A           0.03           0.036

(*) Performances of only one channel output are presented
(**) Eye improvements using only XTC circuits are compared; pre-emphasis is not considered

Figure 7. Performance comparison with prior work
