
Interactive Schemes for the AWGN Channel with Noisy Feedback

Assaf Ben-Yishai and Ofer Shayevitz

arXiv:1509.03085v1 [cs.IT] 10 Sep 2015

Abstract

We study the problem of communication over an additive white Gaussian noise (AWGN) channel with an AWGN feedback channel. When the feedback channel is noiseless, the classic Schalkwijk-Kailath (S-K) scheme is known to achieve capacity in a simple sequential fashion, while attaining reliability superior to non-feedback schemes. In this work, we show how simplicity and reliability can be attained when the feedback is noisy. We introduce a low-complexity, low-delay scheme that operates close to capacity for a fixed bit error probability (e.g. 10^{-6}). We further provide an asymptotic construction admitting an error exponent that can significantly exceed the best possible non-feedback exponent. Both results hold when the feedback channel is sufficiently better than the feedforward channel. Our approach is based on the interpretation of feedback transmission as a side-information problem, and employs an interactive modulo-lattice solution.

I. INTRODUCTION

While feedback cannot increase the capacity of point-to-point memoryless channels [1], there exist noiseless feedback communication schemes that can provide a significant improvement in terms of simplicity and reliability, see e.g. [2]–[5]. However, these elegant feedback schemes completely fail in the presence of arbitrarily low feedback noise, rendering them grossly impractical. This naturally raises the question of whether simplicity and reliability can still be achieved to some degree in a practical setup of noisy feedback. In this paper, we address this question in a Gaussian setting and answer it in the affirmative.

The setup we consider is the following. Two Terminals, A and B, are connected by a pair of independent AWGN channels, and are limited by individual power constraints. The channel from Terminal A (resp. B) to Terminal B (resp. A) is referred to in the literature as the feedforward (resp. feedback) channel. Terminal A is in possession of a message to be reliably transmitted to Terminal B. To that end, an interactive communication model is adopted where both terminals are allowed to employ coding and exchange signals on the fly. This model is sometimes referred to as active feedback, and should be distinguished from the passive feedback setting where no coding is allowed over the feedback channel.

The AWGN channel with noiseless feedback was studied in the classic works of Schalkwijk and Kailath [2], [3], who introduced a capacity-achieving communication scheme referred to herein as the S-K scheme. This linear-feedback coding scheme employs a first-order recursion at both terminals, and is markedly simpler than its non-feedback counterparts, which typically employ long block codes and complex encoding/decoding techniques. In terms of reliability, the error probability attained by the S-K scheme decays super-exponentially with the delay, in contrast to the weaker exponential decay achieved by non-feedback codes [6].
However, this scheme is not robust to any amount of feedback noise, as was initially observed in [3] and further strengthened in [7].

The main contribution of this work is in showing that, to some extent, the merits of noiseless feedback can be carried over to the practical regime of noisy feedback. Contrary to the noiseless feedback case, these improvements in simplicity and reliability are not simultaneously achieved. In terms of reliability, we construct an interactive protocol that is of comparable complexity to non-feedback schemes, but is superior in the asymptotic error exponent sense. In terms of simplicity, we depart from the standard asymptotic regime and show how a fixed (but low) error probability can be attained at a small capacity gap, where the latter term refers to the amount of excess SNR required by the scheme above the minimum predicted by the Shannon limit. Both these constructions are useful when the signal-to-noise ratio of the feedback channel sufficiently exceeds that of the feedforward channel.

As a case in point, consider the high-SNR regime and assume that the SNR of the feedback channel exceeds the SNR of the feedforward channel by 20dB. Then our simplicity-oriented scheme operates at a capacity gap of merely 0.8dB with only 19 rounds of interaction, and attains a bit error rate of 10^{-6}.

The authors are with the Department of EE–Systems, Tel Aviv University, Tel Aviv, Israel {[email protected], [email protected]}. The work of A. Ben-Yishai was partially supported by an ISF grant no. 1367/14. The work of O. Shayevitz was supported by an ERC grant no. 639573, a CIG grant no. 631983, and an ISF grant no. 1367/14. This paper was presented in part at Allerton 2014 and ISIT 2015.

This should be juxtaposed against two reference systems,


operating at the same bit error rate: on the one hand, state-of-the-art non-feedback codes that attain the same capacity gap require roughly a two orders-of-magnitude increase in delay and complexity; on the other hand, the capacity gap attained by a minimal-delay uncoded system is at least 9dB. Finally, under the same setup, our reliability-oriented scheme attains an error exponent exceeding the sphere-packing bound of the feedforward channel for a wide range of rates below capacity.

The construction we introduce is based on endowing the S-K scheme with modulo-lattice operations. As observed in [8], the feedforward S-K scheme can be interpreted as a solution to a joint source-channel coding (JSCC) problem via analog transmission. Here, we further observe that the feedback transmission problem can be cast as a similar problem, but with side information (i.e., the message) at the receiver (i.e., Terminal A). This observation is crucial for our construction, and is leveraged by means of modulo-lattice analog transmission in the spirit of Kochman and Zamir [9].

Let us briefly describe the simplicity-oriented version of our scheme. Terminal A encodes its message into a scalar Θ using pulse amplitude modulation (PAM). In subsequent rounds, Terminal B computes a linear estimate of Θ, and feeds back an exponentially amplified version of this estimate, modulo a fixed interval. The modulo operation facilitates the essential "zoom-in" amplification without exceeding the power limit, at the cost of a possible modulo-aliasing error. In turn, Terminal A employs a suitable modulo computation and obtains (if no modulo-aliasing occurs) the estimation error, corrupted by excess additive noise. This quantity is then properly scaled and sent over the feedforward channel to Terminal B. After a fixed number of rounds, Terminal B decodes the message using a minimum distance rule.
Loosely speaking, the scheme's error probability is dictated by the event of a modulo-aliasing error in any of the rounds, as well as the event where the remaining estimation noise is larger than the minimum distance of the PAM.

The reliability-oriented version of our scheme is based on an asymptotic generalization of the same idea. We employ a block code instead of PAM, and have the S-K scheme operate over blocks while replacing the scalar modulo with a multi-dimensional modulo-lattice operation. We then provide an asymptotic error analysis using the Poltyrev exponent to account for modulo-aliasing errors, and channel coding exponents to account for the error of the block code. We note that a conceptually simpler construction would have been concatenated coding, with the scalar simplicity-oriented scheme as an inner code and a block outer code. However, the induced channel viewed by the outer code is non-Gaussian, and obtaining a closed-form expression for the resulting error exponent appears to be involved.

Related work. In [7], [10], the authors analyzed the reliability function of the AWGN channel at zero rate for noisy passive feedback, i.e., where the channel outputs are fed back without any processing. In [11], the authors considered a concatenated coding scheme with a passive linear-feedback inner code and a block outer code, and provided some error exponent results. In Section VI-F, we compare our reliability-oriented scheme to [11], and show that the exponent obtained in [11] is better for low rates whereas our exponent is better for high rates. In [12], which is closer to our interactive setting, the reliability function associated with the transmission of a single bit over an AWGN channel with noisy active feedback was considered. Specifically, it was shown that active feedback roughly quadruples the error exponent relative to passive feedback. The achievability result of [12] is better than ours at zero rate, but does not extend to positive rates.

Organization.
Notations and definitions are given in Section II. The problem setup is introduced in Section III. Necessary background is given in Section IV. Simple interaction is addressed in Section V, and improving reliability is addressed in Section VI.

II. NOTATIONS AND DEFINITIONS

In the sequel, we use the following notations. For any number x > 0, we write x_dB def= 10·log10(x) to denote the value of x in decibels. The Gaussian Q-function is

Q(x) def= (1/√(2π)) ∫_x^∞ exp(−u²/2) du,    (1)

and Q^{−1}(·) is its functional inverse. We write f(x) = O(g(x)) for lim sup_{x→∞} |f(x)/g(x)| < ∞. We write log for the base-2 logarithm, and ln for the natural logarithm. We use the vector notation x^n def= (x_1, ..., x_n), and boldface letters such as x to indicate vectors of size N_Λ. We write a_n ≥̇ b_n to mean lim inf_{n→∞} (1/n)·ln(a_n/b_n) ≥ 0, and similarly define ≤̇ and =̇.

The Capacity Gap. Recall that the Shannon capacity of the AWGN channel with signal-to-noise ratio SNR is given by

C = (1/2)·log(1 + SNR).    (2)

This is the maximal rate achievable by any scheme (of unbounded complexity/delay, with or without feedback) under vanishing error probability. Conversely, the minimal SNR required to reliably attain a rate R is 2^{2R} − 1. The capacity gap Γ attained by


Fig. 1. Block diagram of interactive coding over an AWGN channel with noisy feedback

a coding scheme that operates at rate R over an AWGN channel is the excess SNR required by the scheme over the minimum predicted by the Shannon limit, i.e.,

Γ def= SNR/(2^{2R} − 1).    (3)

Note that if a nonzero bit/symbol error probability is allowed, then one can achieve rates exceeding the Shannon capacity (2), and this effect should in principle be accounted for, to make the definition of the capacity gap fair. However, for small error probabilities the associated correction factor (given by the inverse of the corresponding rate-distortion function) becomes negligible, and we therefore ignore it in the sequel.
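To make these definitions concrete, here is a minimal numerical sketch (Python, standard library only; the function names are ours for illustration, not from the paper) of the capacity (2) and the capacity gap (3):

```python
import math

def awgn_capacity(snr):
    """Shannon capacity (2) of the AWGN channel: C = (1/2)*log2(1 + SNR)."""
    return 0.5 * math.log2(1 + snr)

def capacity_gap_db(snr, rate):
    """Capacity gap (3) in dB: Gamma = SNR / (2^(2R) - 1)."""
    return 10 * math.log10(snr / (2 ** (2 * rate) - 1))

# At SNR = 3 the Shannon limit allows exactly R = 1 bit/channel use;
# a scheme that needs SNR = 6 to support R = 1 thus has a ~3.01 dB gap.
print(awgn_capacity(3.0))         # 1.0
print(capacity_gap_db(6.0, 1.0))  # ~3.01
```

In words: a rate-1 scheme requiring SNR = 6 instead of the Shannon-minimal SNR = 3 operates at a capacity gap of about 3dB.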

III. SETUP

Our problem setup is depicted in Fig. 1. The feedforward and feedback channels, connecting Terminal A to Terminal B and vice versa, are AWGN channels given by

Y_n = X_n + Z_n,    (4)
Ỹ_n = X̃_n + Z̃_n,    (5)

where X_n, Y_n (resp. X̃_n, Ỹ_n) are the input and output of the feedforward (resp. feedback) channel at time n, respectively. The feedforward (resp. feedback) channel noise Z_n ∼ N(0, σ²) (resp. Z̃_n ∼ N(0, σ̃²)) is independent of the input X_n (resp. X̃_n), and constitutes an i.i.d. sequence. The feedforward and feedback noise processes are mutually independent. Terminal A is in possession of a message W ∼ Uniform([M]), to be described to Terminal B over N rounds of communication. To that end, the terminals can employ an interactive scheme defined by a pair of functions (ϕ, ϕ̃) as follows. At time n, Terminal A sends a function of its message W and possibly of past feedback channel outputs over the feedforward channel, i.e.,

X_n = ϕ(W, Ỹ^{n−1}).    (6)

Similarly, Terminal B sends a function of its past observations to Terminal A over the feedback channel, i.e.,

X̃_n = ϕ̃(Y^n).    (7)

Remark 1: The dependence of ϕ and ϕ̃ on n is suppressed. In general, we allow these functions to further depend on common randomness shared by the terminals. We note in passing that our definition of the feedback transmission scheme is sometimes referred to as active feedback; the term passive feedback is usually reserved for the special case where ϕ̃(Y^n) = Y_n.

The number of rounds N is fixed. While feedback protocols with variable transmission length exist and can improve reliability relative to non-feedback transmission [15], [16], they are beyond the scope of this work. We assume that Terminal A (resp.


Terminal B) is subject to an average power constraint P (resp. P̃), namely

∑_{n=1}^N E(X_n²) ≤ N·P,    ∑_{n=1}^N E(X̃_n²) ≤ N·P̃.    (8)

We denote the feedforward (resp. feedback) signal-to-noise ratio by SNR def= P/σ² (resp. S̃NR def= P̃/σ̃²). The excess signal-to-noise ratio of the feedback over the feedforward is denoted by ∆SNR def= S̃NR/SNR. Throughout this work, we assume that ∆SNR > 1. An interactive scheme (ϕ, ϕ̃) is associated with a rate R def= (log M)/N and an error probability pe, which is the probability that Terminal B errs in decoding the message W at time N, under the optimal decision rule.

IV. PRELIMINARIES

In this section, we describe the building blocks underlying our interactive scheme. First, we review the use of uncoded PAM signaling, and discuss its associated capacity gap. Then, we describe the basic problem of joint source-channel coding (JSCC) via analog transmission, and show how to build the S-K scheme from uncoded PAM and iterative JSCC. Lastly, we discuss the problem of JSCC with side information using modulo arithmetic, and present a simple scalar solution that is later implemented as part of our simplicity-oriented scheme.

A. Uncoded PAM

PAM is a simple and commonly used modulation scheme, where 2^R symbols are mapped (one-to-one) to the set

{±η, ±3η, ..., ±(2^R − 1)η}.    (9)

Canonically, the normalization factor η is set so that the overall mean square of the constellation (assuming equiprobable symbols) is unity. A straightforward calculation yields η = √(3/(2^{2R} − 1)). In the general case where the mean square of the constellation is constrained to be P, η is replaced with η√P. It is easy to show that for an AWGN channel with zero-mean noise of variance σ² and average input power constraint P, the probability of error incurred by the optimal detector is bounded by the probability that the noise exceeds half the minimum distance of the PAM constellation, i.e.,

pe < 2Q(√P·η/σ) = 2Q(√(3·SNR/(2^{2R} − 1))).    (10)

Fixing the error probability pe and solving the inequality (10) for R yields

R > (1/2)·log(1 + SNR/Γ0(pe)),    (11)

where

Γ0(pe) def= (1/3)·[Q^{−1}(pe/2)]².    (12)

Comparing (11) and (2), we see that PAM signaling with error probability pe admits a capacity gap of Γ0(pe). For a typical value of pe = 10^{-6}, the capacity gap of uncoded PAM is Γ0,dB ≈ 9dB. Finally, we assume as usual that bits are mapped to PAM constellation symbols via Gray labeling. The associated bit error probability can thus be bounded by

pb < (2/R)·Q(√P·η/σ) + 2Q(3√P·η/σ) ≈ pe/R.    (13)

The bound assumes an error in one bit out of R for a nearest-neighbor decision error, and an error in all bits for all other decision errors; the approximation becomes tight for small pe due to the strong decay of the Q-function.
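These PAM expressions are straightforward to evaluate. The sketch below (Python, standard library; the bisection inverse `Qinv` is our own helper, not part of the paper) reproduces the roughly 9dB uncoded-PAM gap quoted above for pe = 10^{-6}:

```python
import math

def Q(x):
    """Gaussian tail function, eq. (1)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def Qinv(p, lo=0.0, hi=40.0):
    """Numerical inverse of the (decreasing) Q-function via bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Q(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pam_symbol_error_bound(snr, rate):
    """Minimum-distance bound of eq. (10): pe < 2*Q(sqrt(3*SNR/(2^(2R)-1)))."""
    return 2 * Q(math.sqrt(3 * snr / (2 ** (2 * rate) - 1)))

def pam_capacity_gap_db(pe):
    """Gamma_0(pe) of eq. (12), in dB: (1/3)*[Qinv(pe/2)]^2."""
    return 10 * math.log10((Qinv(pe / 2) ** 2) / 3)

print(round(pam_capacity_gap_db(1e-6), 1))  # ~9.0 dB, matching the text
```

The same helpers can be reused to trade SNR against rate at any fixed target error probability via (11).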


B. Joint Source-Channel Coding (JSCC) via Analog Transmission

It is well known [17] that when a Gaussian source is to be transmitted over an AWGN channel under a quadratic distortion measure, analog transmission attains the optimal distortion with minimal delay. This solution is a simple instance of joint source-channel coding (JSCC). More explicitly, we wish to convey a Gaussian r.v. ε ∼ N(0, σ_ε²) over an AWGN channel Y = X + Z, Z ∼ N(0, σ²), with expected input power constraint EX² ≤ P (i.e. SNR = P/σ²). The optimal transmission and estimation boil down to X = αε and ε̂ = βY. The optimal choice of α is a power scaling factor, i.e. α = √P/σ_ε, and the optimal choice of β is the Wiener coefficient β = (σ_ε/σ)·√SNR/(SNR + 1). Plugging in α and β yields the minimal attainable MSE in this setup:

E(ε̂ − ε)² = σ_ε²/(SNR + 1).    (14)

Namely, this JSCC scheme improves the estimation error of ε by a factor of SNR + 1 relative to a trivial guess. In the sequel we shall use this simple notion as a building block for both the classic S-K scheme and the newly proposed noisy feedback schemes.
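A quick Monte Carlo check of (14) (our own sketch; the noise variance is normalized to one, so that P = SNR):

```python
import math, random

def analog_jscc_mse(snr, sigma_eps=1.0, trials=200_000, seed=0):
    """Monte Carlo check of eq. (14): analog transmission of a Gaussian source
    over an AWGN channel attains MSE = sigma_eps^2 / (SNR + 1).
    Noise variance is normalized to 1, hence P = SNR."""
    rng = random.Random(seed)
    P, sigma = snr, 1.0
    alpha = math.sqrt(P) / sigma_eps                           # power scaling
    beta = (sigma_eps / sigma) * math.sqrt(snr) / (snr + 1)    # Wiener coefficient
    se = 0.0
    for _ in range(trials):
        eps = rng.gauss(0.0, sigma_eps)
        y = alpha * eps + rng.gauss(0.0, sigma)
        se += (beta * y - eps) ** 2
    return se / trials

print(analog_jscc_mse(10.0))   # ~1/11 ~ 0.0909, as predicted by (14)
```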

C. The S-K Scheme

Consider the setting of communication over the AWGN channel with noiseless feedback, i.e., where σ̃² = 0. The S-K scheme can be described as follows. First, Terminal A maps the message W to a PAM constellation point Θ. In the first round, it sends a scaled version of Θ satisfying the power constraint P. In subsequent rounds, Terminal B maintains an estimate Θ̂_n of Θ given all the observations it has, and feeds it back to Terminal A. Terminal A then computes the estimation error ε_n def= Θ̂_n − Θ, and sends it to Terminal B using analog transmission.

(A) Initialization:

Terminal A: Map the message W to a PAM point Θ.
Terminal A ⇒ Terminal B:
• Send X_1 = √P·Θ
• Receive Y_1 = X_1 + Z_1
Terminal B: Initialize the Θ estimate¹ to Θ̂_1 = Y_1/√P.

(B) Iteration:

Terminal B ⇒ Terminal A:
• Send the current Θ estimate: X̃_n = Θ̂_n
• Receive Ỹ_n = X̃_n
Terminal A: Compute the estimation error ε_n = Ỹ_n − Θ.
Terminal A ⇒ Terminal B:
• Send ε_n via analog transmission, i.e. X_{n+1} = α_n·ε_n, where α_n = √P/σ_n and σ_n² def= Eε_n².
• Receive Y_{n+1} = X_{n+1} + Z_{n+1}
Terminal B: Update the Θ estimate to Θ̂_{n+1} = Θ̂_n − ε̂_n, where

ε̂_n = β_{n+1}·Y_{n+1}    (15)

is the Minimum Mean Square Error (MMSE) estimate of ε_n; thus, β_{n+1} is the appropriate Wiener coefficient:

β_{n+1} = √(P·σ_n²)/(P + σ²) = (σ_n/σ)·√SNR/(1 + SNR).    (16)

(C) Decoding:
At time N, Terminal B decodes the message using a minimum distance decoder for Θ̂_N w.r.t. the PAM constellation.
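The recursion above is easy to exercise numerically. The following Monte Carlo sketch (ours; unit noise variance and 2-PAM for simplicity) runs the S-K iterations under noiseless feedback and compares the empirical variance of Θ̂_N − Θ with the closed-form value 1/(SNR·(1+SNR)^{N−1}) implied by (17):

```python
import math, random

def sk_noiseless(snr, n_rounds, trials=20_000, seed=1):
    """Monte Carlo sketch (ours) of the S-K recursion with noiseless feedback.
    Noise variance normalized to 1 (so P = SNR), Theta is unit-power 2-PAM.
    Returns the empirical variance of Theta_hat_N - Theta."""
    rng = random.Random(seed)
    P = snr
    acc = 0.0
    for _ in range(trials):
        theta = rng.choice([-1.0, 1.0])
        # initialization: send sqrt(P)*Theta, set Theta_hat_1 = Y_1/sqrt(P)
        y = math.sqrt(P) * theta + rng.gauss(0.0, 1.0)
        theta_hat = y / math.sqrt(P)
        var = 1.0 / snr                              # sigma_1^2
        for _ in range(n_rounds - 1):
            eps = theta_hat - theta                  # fed back noiselessly
            alpha = math.sqrt(P / var)               # alpha_n = sqrt(P)/sigma_n
            y = alpha * eps + rng.gauss(0.0, 1.0)
            beta = math.sqrt(var * snr) / (snr + 1)  # Wiener coefficient, eq. (16)
            theta_hat -= beta * y
            var /= (1.0 + snr)                       # MSE contraction, eq. (17)
        acc += (theta_hat - theta) ** 2
    return acc / trials

snr, N = 3.0, 4
print(sk_noiseless(snr, N), 1.0 / (snr * (1 + snr) ** (N - 1)))  # should agree
```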

To calculate the error probability and rate attained by the S-K scheme, we note that ε_{n+1} = ε_n − ε̂_n. Using the property (14) of analog transmission yields:

σ_{n+1}² = σ_n²/(1 + SNR) = 1/(SNR·(1 + SNR)^n).    (17)

¹Note that this is the minimum variance unbiased estimate of Θ.


An important observation is that, using (17) and the fact that the power of Θ is normalized to unity, one can regard the channel from Θ to Θ̂_N as an AWGN channel with signal-to-noise ratio SNR_N = σ_N^{−2}, namely:

SNR_N = SNR·(1 + SNR)^{N−1}.    (18)

Plugging SNR_N into (10) and bounding the Q-function by Q(x) < (1/2)·exp(−x²/2) gives:

pe < (1/2)·exp(−(3/2)·SNR·(1 + SNR)^{N−1}/(2^{2NR} − 1)).    (19)

Plugging in the AWGN channel capacity (2) and removing the "−1" term, we obtain:

pe < (1/2)·exp(−(3/2)·(SNR/(1 + SNR))·2^{2N(C−R)}),    (20)

which is the well-known doubly exponential decay of the error probability of the S-K scheme.

Let us now provide an alternative interpretation of the S-K scheme performance, in terms of the capacity gap attained after a finite number of rounds. Plugging SNR_N into (11) yields:

R > (1/(2N))·log(1 + SNR·(1 + SNR)^{N−1}/Γ).    (21)

Substituting the resulting R in the definition of the capacity gap (3) and assuming SNR ≫ 1 yields the following approximation for high SNR:

Γ^{S-K}_dB(pe, N) ≈ Γ_{0,dB}(pe)/N.    (22)

This behavior is depicted by the dashed curve in Fig. 4.

D. Joint Source-Channel Coding with Side Information

A key observation made in this work is that while the transmission of ε_n from Terminal A to Terminal B can be regarded as a JSCC problem, the transmission of ε_n from Terminal B to Terminal A can be regarded as a JSCC problem with side information. More explicitly, at round n, Terminal B holds its current estimate Θ̂_n = Θ + ε_n, and wishes to convey the current estimation error ε_n to Terminal A. Terminal A, at its end, knows Θ and can use it as side information.

The above problem can clearly be solved optimally using separation into two disjoint coding problems: a Gaussian source coding problem with arbitrary side information at the receiver [18], and a regular channel code. In this work we choose not to separate the problem, but rather give a JSCC lattice-based solution in the spirit of Kochman and Zamir [9]. The motivation for choosing this solution stems from its simplicity in the scalar case discussed in Section V, and the ease of analysis in the asymptotic case discussed in Section VI. It should be noted that for clarity of exposition and ease of analysis, we use the high-SNR version of [9], which can be slightly suboptimal in the low-SNR regime.

Let us now describe this solution. For simplicity, we start with the scalar case. First, we need some definitions and properties of modulo arithmetic. For a given d > 0, the scalar modulo-d function is

M_d[x] def= x − d·round(x/d),    (23)

where the round(·) operator returns the nearest integer to its argument (rounding up at half). The following properties are easily verified.

Proposition 1:
(i) M_d[x] ∈ [−d/2, d/2).
(ii) If d_1 + d_2 ∈ [−d/2, d/2), then

M_d[M_d[x + d_1] + d_2 − x] = d_1 + d_2;    (24)

otherwise, a modulo-aliasing error term kd ≠ 0 is added to the right-hand side of (24), for some integer k.
(iii) Let V ∼ Uniform([−d/2, d/2)). Then M_d[x + V] is uniformly distributed over [−d/2, d/2) for any x ∈ R.
(iv) E(M_d[x + V])² = d²/12.

Using the above, we can provide a solution to the aforementioned JSCC problem with side information. Assume that Terminal B is in possession of Θ̂ = Θ + ε, and wants to convey ε ∼ N(0, σ_ε²) to Terminal A (which is in possession of Θ), over an


AWGN channel Ỹ = X̃ + Z̃. The channel is characterized by Z̃ ∼ N(0, σ̃²) and EX̃² ≤ P̃. Let V ∼ Uniform([−d/2, d/2)) be a dither signal known at both terminals. Then, Terminal B transmits

X̃ = M_d[γΘ̂ + V],    (25)

where we set d = √(12P̃) to guarantee that the power constraint is satisfied. Terminal A computes the estimate

ε̂ = (1/γ)·M_d[Ỹ − γΘ − V].    (26)

Hence, by Proposition 1 property (ii),

ε̂ = (1/γ)·M_d[γε + Z̃].    (27)

In the case where γε + Z̃ ∈ [−d/2, d/2), we obtain

ε̂ = ε + (1/γ)·Z̃,    (28)

which is similar to analog transmission when Θ is known also at Terminal B. The question that arises at this point is how to set γ. Clearly, a large γ would increase the probability of a modulo-aliasing error, but at the same time reduce the additive estimation error in ε̂. Defining the probability of modulo-aliasing error p_mod as

p_mod def= Pr(γε + Z̃ ∉ [−d/2, d/2)),    (29)

and recalling that d = √(12P̃), we obtain the trade-off

p_mod = 2Q(√(3P̃/(γ²σ_ε² + σ̃²))) = 2Q(√(3L)),    (30)

where we have implicitly introduced the looseness parameter L, given by

L = P̃/(γ²σ_ε² + σ̃²).    (31)
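The aliasing trade-off (30) can be verified empirically. In this sketch (ours; the parameter values are illustrative), Terminal B folds its dithered estimate as in (25), Terminal A unfolds it using its side information as in (26), and we count how often the result differs from ε + Z̃/γ:

```python
import math, random

def Q(x):
    """Gaussian tail function of eq. (1)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def md(x, d):
    """Scalar modulo of eq. (23); round() rounds up at half, so we use
    floor(x/d + 0.5). Output lies in [-d/2, d/2)."""
    return x - d * math.floor(x / d + 0.5)

def modulo_jscc_alias_rate(P_fb=1.0, sigma_eps=1.0, sigma_fb=0.1, L=2.0,
                           trials=100_000, seed=2):
    """Monte Carlo sketch (ours) of the dithered modulo JSCC scheme with side
    information, eqs. (25)-(28). Counts how often the recovered estimate
    differs from eps + Z_fb/gamma, i.e. how often modulo-aliasing occurs;
    eq. (30) predicts a rate of 2*Q(sqrt(3L))."""
    rng = random.Random(seed)
    d = math.sqrt(12 * P_fb)                                 # power-matched interval
    gamma = math.sqrt(P_fb / L - sigma_fb ** 2) / sigma_eps  # inverted from eq. (31)
    alias = 0
    for _ in range(trials):
        theta = rng.uniform(-1.0, 1.0)      # side information at Terminal A
        eps = rng.gauss(0.0, sigma_eps)     # quantity to convey
        v = rng.uniform(-d / 2, d / 2)      # shared dither
        z = rng.gauss(0.0, sigma_fb)        # feedback channel noise
        x = md(gamma * (theta + eps) + v, d)                # eq. (25)
        eps_hat = md(x + z - gamma * theta - v, d) / gamma  # eq. (26)
        alias += abs(eps_hat - (eps + z / gamma)) > 1e-9
    return alias / trials

print(modulo_jscc_alias_rate(), 2 * Q(math.sqrt(3 * 2.0)))  # empirical vs. eq. (30)
```

Increasing L in this sketch visibly suppresses the aliasing rate at the price of a smaller γ, and hence a noisier estimate, which is exactly the trade-off discussed next.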

Observe that a larger L implies a smaller variance of the modulo argument in (27), and hence a smaller modulo-aliasing error probability. On the other hand, a larger L implies a smaller γ, and hence a larger estimation error by virtue of (28). We note in passing that in the asymptotic regime, L > 1 is both necessary and sufficient for p_mod → 0 with the lattice dimension. In the sequel, it will be convenient to express our results in terms of L instead of γ, as the former is a more natural parameter of the problem.

V. SIMPLE INTERACTION

In this section we present our simplicity-oriented interactive scheme, using the S-K scheme and the scalar modulo JSCC scheme with side information as building blocks. We analyze the associated capacity gap and discuss implementation issues. The scheme is presented in Subsection V-A. An upper bound on the capacity gap attained by the scheme is given in Subsection V-B, and proved in Subsection V-C. Numerical results are presented in Subsection V-D. Practical implementation issues are addressed in Subsection V-E, and a concluding discussion appears in Subsection V-F.

A. The Proposed Scheme

In what follows, we assume that the terminals share a common i.i.d. random sequence {V_n}_{n=1}^N, mutually independent of the noise sequences and the message, where V_n ∼ Uniform([−d/2, d/2)). As before, we set d = √(12P̃). Recall the definitions of Θ̂_n and ε_n in Subsection IV-C, as the estimate of Θ and the corresponding estimation error at Terminal B at time n. A block diagram of the scheme is depicted in Fig. 2. Let us describe our scheme in detail. The scheme is given in terms of the parameters α, β_n, γ_n, which dictate its performance. The specific choice of these parameters is given in the next subsection.

(A) Initialization:
Terminal A: Map the message W to a PAM point Θ.
Terminal A ⇒ Terminal B:


Fig. 2. Block diagram of the proposed scheme

• Send X_1 = √P·Θ
• Receive Y_1 = X_1 + Z_1
Terminal B: Initialize the Θ estimate to Θ̂_1 = Y_1/√P.

(B) Iteration:

Terminal B ⇒ Terminal A:
• Given the Θ estimate Θ̂_n, compute and send

X̃_n = M_d[γ_n·Θ̂_n + V_n]    (32)

• Receive Ỹ_n = X̃_n + Z̃_n
Terminal A: Extract a noisy scaled version of the estimation error ε_n:

ε̃_n = M_d[Ỹ_n − γ_n·Θ − V_n]    (33)

Note that ε̃_n = γ_n·ε_n + Z̃_n, unless a modulo-aliasing error occurs.
Terminal A ⇒ Terminal B:
• Send a scaled version of ε̃_n: X_{n+1} = α·ε̃_n, where α is set to meet the input power constraint P (computed later).
• Receive Y_{n+1} = X_{n+1} + Z_{n+1}
Terminal B: Update the Θ estimate to Θ̂_{n+1} = Θ̂_n − ε̂_n, where

ε̂_n = β_{n+1}·Y_{n+1}.    (34)

The choice of β_n is described in the sequel.

(C) Decoding:
At time N, Terminal B decodes the message using a minimum distance decoder for Θ̂_N w.r.t. the PAM constellation.
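For concreteness, the whole protocol can be simulated. The sketch below (ours; unit noise variances, 16-PAM over N = 5 rounds, SNR = 10dB, ∆SNR = 20dB, and an illustrative looseness L = 7; the parameters α, β_n, γ_n follow the choices given in the next subsection) estimates the symbol error rate of the minimum distance decoder:

```python
import math, random

def simple_interaction_sim(snr, snr_fb, L, n_rounds, pam_size,
                           trials=20_000, seed=3):
    """Monte Carlo sketch (ours) of the simplicity-oriented scheme of
    Subsection V-A. Both noise variances are normalized to 1, so P = SNR
    and P~ = SNR_fb. Returns the empirical symbol error rate."""
    rng = random.Random(seed)
    P, P_fb = snr, snr_fb
    d = math.sqrt(12 * P_fb)                       # power-matched modulo interval

    def md(x):
        # scalar modulo of eq. (23), output in [-d/2, d/2)
        return x - d * math.floor(x / d + 0.5)

    alpha = math.sqrt(L * P / P_fb)                # eq. (36)
    eta = math.sqrt(3.0 / (pam_size ** 2 - 1))     # unit-power PAM
    points = [(2 * i - (pam_size - 1)) * eta for i in range(pam_size)]
    errors = 0
    for _ in range(trials):
        theta = rng.choice(points)
        # round 1: plain scaled transmission of Theta
        theta_hat = (math.sqrt(P) * theta + rng.gauss(0, 1)) / math.sqrt(P)
        var = 1.0 / snr                            # sigma_1^2
        for _ in range(n_rounds - 1):
            gamma = math.sqrt(P_fb / L - 1.0) / math.sqrt(var)      # eq. (38)
            v = rng.uniform(-d / 2, d / 2)                          # shared dither
            x_fb = md(gamma * theta_hat + v)                        # eq. (32)
            eps_t = md(x_fb + rng.gauss(0, 1) - gamma * theta - v)  # eq. (33)
            y = alpha * eps_t + rng.gauss(0, 1)                     # feedforward
            beta = math.sqrt(var * snr * (1 - L / snr_fb)) / (1 + snr)  # eq. (37)
            theta_hat -= beta * y                                   # eq. (34)
            var *= (1 + L * snr / snr_fb) / (1 + snr)               # per eq. (39)
        decoded = min(points, key=lambda p: abs(p - theta_hat))
        errors += (decoded != theta)
    return errors / trials

# 16-PAM over 5 rounds: 4 bits decoded essentially error-free at these settings.
print(simple_interaction_sim(snr=10.0, snr_fb=1000.0, L=7.0,
                             n_rounds=5, pam_size=16))
```

In rounds without aliasing this reduces exactly to an S-K iteration with a slightly inflated effective noise, which is why the tracked error variance `var` contracts geometrically.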

B. Main Result: The Capacity Gap

Recall the capacity gap function Γ0(·) of uncoded PAM given in (12). Fix some target error probability pe and a desired number of rounds N. Set the looseness parameter to

L = (1/3)·[Q^{−1}(pe/(4N))]²,    (35)


and set the scheme parameters α, β_n, γ_n to

α = √(L·P/P̃),    (36)

β_n = (σ_{n−1}/σ)·√(SNR·(1 − L·S̃NR^{−1}))/(1 + SNR),    (37)

γ_n = (1/σ_n)·√(P̃/L − σ̃²),    (38)

where

σ_n² = SNR^{−1}·(1 + SNR·(1 − L·S̃NR^{−1})/(1 + L·∆SNR^{−1}))^{1−n}.    (39)

Define:

Ψ1 def= (1 + L·∆SNR^{−1})^{−1},    Ψ2 def= 1 − L·S̃NR^{−1},    (40)

and let Ψ3 denote the associated remainder term, specified in the proof of Theorem 1 in Subsection V-C.
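As reconstructed here, the parameter chain can be evaluated end-to-end. This numeric sketch (ours; it combines the looseness choice (35), the aliasing probability (30), the SNR evolution (65), and the error bound (67) derived in Subsection V-C, at illustrative operating points) shows the bound improving with more rounds and with a larger feedback advantage:

```python
import math

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def Qinv(p):
    """Bisection inverse of the (decreasing) Q-function."""
    lo, hi = 0.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if Q(mid) > p else (lo, mid)
    return 0.5 * (lo + hi)

def error_bound(snr, dsnr_db, n_rounds, rate, pe_target=1e-6):
    """Evaluate pe <= (N-1)*pm + 2Q(sqrt(3*SNR_N/(2^(2NR)-1))), eq. (67),
    with L set as in eq. (35) and SNR_N evolved as in eq. (65)."""
    snr_fb = snr * 10 ** (dsnr_db / 10)
    L = (Qinv(pe_target / (4 * n_rounds)) ** 2) / 3      # eq. (35)
    pm = 2 * Q(math.sqrt(3 * L))                         # eq. (30)
    psi12 = (1 - L / snr_fb) / (1 + L * snr / snr_fb)    # Psi1 * Psi2
    snr_N = snr * (1 + snr * psi12) ** (n_rounds - 1)    # eq. (65)
    pam_term = 2 * Q(math.sqrt(3 * snr_N / (2 ** (2 * n_rounds * rate) - 1)))
    return (n_rounds - 1) * pm + pam_term                # eq. (67)

# SNR ~ 306 corresponds to R = 4 at a small gap; more rounds and a larger
# feedback advantage both drive the bound down.
for N in (10, 19):
    print(N, error_bound(306.5, 20, N, 4), error_bound(306.5, 30, N, 4))
```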

Theorem 1: For the choice of parameters above, the interactive communication scheme described in Subsection V-A achieves in N rounds an error probability pe and a capacity gap Γ*_dB(pe, N) satisfying the bound (41).

Lemma 1:

Pr(∪_{n=1}^N E_n) = Pr(∪_{n=1}^N E_n′).    (48)

Proof: Let Ē_i denote the event complementary to E_i. Define the event

J_n def= ∩_{i=1}^n Ē_i.    (49)


Let us show by induction that J_n = J_n′. For n = 1, we have

J_1 = {γ_1·ε_1 + Z̃_1 ∈ [−d/2, d/2)}
    = {γ_1·ε_1′ + Z̃_1 ∈ [−d/2, d/2)}    (50)
    = J_1′,

where (50) follows from the sample path identity. Assuming J_{k−1} = J_{k−1}′ and using the sample path identity again, we have

J_k = {γ_k·ε_k + Z̃_k ∈ [−d/2, d/2)} ∩ J_{k−1}    (51)
    = {γ_k·ε_k′ + Z̃_k ∈ [−d/2, d/2)} ∩ J_{k−1}′    (52)
    = J_k′.    (53)

By the exact same argument (replacing ∈ with ∉) we clearly have that J_{n−1} ∩ E_n = J_{n−1}′ ∩ E_n′. Thus we can write

Pr(∪_{n=1}^N E_n) = Pr(E_1) + ∑_{n=2}^N Pr((∩_{i=1}^{n−1} Ē_i) ∩ E_n)    (54)
                  = Pr(J̄_1) + ∑_{n=2}^N Pr(J_{n−1} ∩ E_n)    (55)
                  = Pr(J̄_1′) + ∑_{n=2}^N Pr(J_{n−1}′ ∩ E_n′)    (56)
                  = Pr(∪_{n=1}^N E_n′).    (57)

Combining the above with (47) and applying the union bound in the coupled system, we obtain

pe ≤ ∑_{n=1}^N Pr(E_n′).    (58)

Thus, we can now upper bound the error probability by calculating probabilities in the coupled system, which involves only scalar Gaussian densities and significantly simplifies the analysis. From this point on, for the sake of brevity, we completely drop the prime notations that distinguish the original system and the coupled system (specifically, this applies to the variables X, Y, X̃, Ỹ, ε, ε̃).

Let us set γ_n such that Pr(E_1) = ··· = Pr(E_{N−1}) = p_m, for some p_m small enough to be determined later. Specifically, recalling the definition (45) and that d = √(12P̃), and since ε̃_n = γ_n·ε_n + Z̃_n is Gaussian in the coupled system, we have that

p_m = 2Q(√(3P̃/Eε̃_n²)).    (59)

Recall the definition of L in (31), where here σ_ε² is in fact σ_n², namely the variance of ε_n in the coupled system. We have that

Eε̃_n² = γ_n²·σ_n² + σ̃² = P̃/L.    (60)

Combining the above two equations, we obtain

L = (1/3)·[Q^{−1}(p_m/2)]².    (61)

Remark 7: L defined in (35) is a special case of the above with p_m = pe/(2N). Note that L < S̃NR must hold, which lower bounds the attainable error probability.

Now, set α so that the input power constraint at Terminal A is met, namely P ≥ EX_n² = α²·Eε̃_n². From (60) it stems that:

α = √(L·P/P̃).    (62)


Remark 8: It should be emphasized that this calculation is accurate for the coupled system only. In the original system, a modulo-aliasing error may cause the power constraint to be violated. However, since εe2n ≤ 3Pe and since the probabilities of modulo-aliasing errors are set to be very low (lower then the target error probability) the power constraint violation is negligible, and can be practically ignored; e.g., for pe = 10−6 and N = 20, the increase in average power due to this effect is lower than 10−4 dB. We also note in passing that the power constraint at Terminal B is always satisfied (regardless of parameter choice), due to dithering. The parameter βn determines the evolution of the estimation error. The linear estimate of εn : εbn = βn+1 Yn+1 , is the optimal 2 estimate in the coupled system, in which εn and Yn+1 are jointly Gaussian. We would thus like to minimize E (εn − εbn ) . Recalling the input-output relation of the feedforward channel and using (62) we obtain q   en + Zn+1 (63) γ ε + Z Yn+1 = LP n n e P and solving the optimization for βn yields:

σn , (64) σ 1 + SNR = εn − εbn and computing the MMSE for the optimal choice of βn+1 above, we obtain a recursive formula βn+1 =

Recalling that εn+1 for σn2 and SNRn :

r   e −1 SNR · 1 − LSNR

1 SNRn = 2 = SNR · σn

e −1 1 − L · SNR 1 + SNR · 1 + L · ∆SNR−1

!n−1

= SNR · (1 + SNR · Ψ1 Ψ2 )n−1

It is possible to give a different and modular interpretation for (65). Let us rewrite (63): q q LP e γ ε + Yn+1 = LP n n e e Zn + Zn+1 P P

(65)

(66)

This equation can be regarded as aqJSCC problem designated for the transmission of εn over an AWGN channel. The effective LP e 2 noise of this AWGN channel is e Zn + Zn+1 . Some algebra shows that the variance of this noise is Ψ1 σ where Ψ1 P is defined in (40). We call this phenomenon noise insertion, as previously mentioned in Remark 2. A consequence of this en . Subtracting this phenomenon is that part of the transmission power is now consumed by the noise element related to Z penalty from P shows that the part of transmission power used for the description of εn is Ψ2 P . We call this phenomena power loss as previously mentioned in Remark 3. So, all in all, the SNR of the channel describe by (66) is Ψ1 Ψ2 · SNR. Using this fact together with the SNR evolution of the S-K scheme (18), and noting that the noise insertion and power loss effects only occur after the second round, we obtain (65). Finally, using (58) and the derivations above, the error probability is bounded by ! r 3SNRN . (67) pe ≤ (N − 1)pm + 2Q 22N R − 1 To derive a lower bound on the capacity gap for our scheme, we can rearrange (67) above to obtain a lower bound on R, use the expression (65) for SNRN , and plug this into the definition of the capacity gap (3). This yields (41), where Ψ3 is a x remainder term obtained by pedestrian manipulations using the inequality − ln(1 − x) ≤ 1−x for x ∈ (0, 1). This completes pe the proof of Theorem 1. Note that the result was obtained for the specific choice pm = 2N of the modulo-aliasing error. In general, reducing pm increases L which in turn decreases SNRN , and hence increases the second addend on the right-hand-side of (67), resulting in a trade-off that could potentially be further optimized. D. Numerical Results The behavior of the capacity gap for our scheme as a function of the number of interaction rounds and ∆SNR is depicted in Fig. 3 and Fig. 4, for high SNR and low SNR setups. 
In both figures we plotted the capacity gap, for a target rate R and a target error probability pe = 10−6, where the SNR corresponding to R was found by a numeric search on (21), and the capacity gap was calculated by definition (3). It can be seen that the higher the ∆SNR, the smaller the capacity gap, and at ∆SNR = 30dB we virtually obtain the noiseless feedback performance. The points marked nopt are those for which the

[Fig. 3. The capacity gap as a function of the number of iterations and ∆SNR, for a target rate R = 1 (low SNR) and target error probability pe = 10−6. Curves are shown for ∆SNR = 6, 10, 20, 30dB and for clean feedback; the marked optima are nopt = 6, 12, 22, 23, respectively.]

[Fig. 4. The capacity gap as a function of the number of iterations and ∆SNR, for a target rate R = 4 (high SNR) and target error probability pe = 10−6. Curves are shown for ∆SNR = 3, 6, 10, 20, 30dB and for clean feedback; the marked optima are nopt = 4, 5, 11, 19, 20, respectively.]

capacity gap is less than 0.2dB above the minimal value attained. In Fig. 3 the rate was set to R = 1, and it can be seen that ∆SNR = 10dB reduces the capacity gap to 4.2dB in 12 iterations, while ∆SNR = 20dB reduces it to 1.1dB in 22 iterations. In Fig. 4 the rate was set to R = 4, and it can be seen that ∆SNR = 10dB reduces the capacity gap to 3.5dB in 11 iterations, while ∆SNR = 20dB reduces it to 0.8dB in 19 iterations. Observing (42), we can see that for high SNR the result is only a function of ∆SNR, and thus does not depend on the target rate or the base SNR.

E. Notes on Implementation

The scheme described in this section is both simple and practical, as opposed to its noiseless feedback counterparts that break down in the presence of feedback noise. This provides impetus for further discussing implementation related aspects. The following conditions should be met for our results to carry merit:
1) Information asymmetry: Terminal A has substantially more information to convey than Terminal B.
2) SNR asymmetry: The SNR of the feedforward channel is lower than the SNR of the feedback channel. This can happen due to differences in power constraints (e.g. when Terminal A is battery operated and Terminal B is connected to the power grid), path losses, or noise/interference asymmetry.
3) Complexity/delay constraints: There are severe complexity or delay constraints at Terminal A.
4) Two-way signaling: Our scheme assumes sample-wise feedback. The communication system should therefore be full duplex, where both terminals have virtually the same signaling rate; hence, the terminals split the bandwidth between them even


though only Terminal A is transmitting information. This situation can sometimes be inherent to the system, but should otherwise be tested against the (non-interactive) solution where the entire bandwidth is allocated to Terminal A. This choice of forward vs. feedback bandwidth allocation yields a system trade-off that is SNR dependent: Terminal A can use our scheme and achieve a rate of C(SNRdB − Γ∗dB), or alternatively employ non-interactive codes over the full forward–feedback bandwidth. The latter option doubles the forward signaling rate but also incurs a 3dB loss in SNR and a potentially larger capacity gap Γ†dB, resulting in a rate of 2C(SNRdB − 3dB − Γ†dB). It can therefore be seen that our solution is generally better for low enough SNR. For instance, for pe = 10−6 and ∆SNR > 30dB, our scheme outperforms (with comparable complexity and delay) full bandwidth uncoded PAM for any SNR < 23dB, and outperforms (with significantly smaller complexity and delay) full bandwidth non-feedback codes with Γ†dB = 3dB for any SNR < 9dB.

The use of very large PAM constellations, whose size is exponential in the product of rate and interaction rounds, seemingly requires extremely low noise and distortion at the digital and analog circuits in Terminal A. This may appear to impose a major implementation obstacle. Fortunately, this is not the case. The full resolution implied by the constellation size is confined only to the original message $\Theta$ and the final estimate $\widehat\Theta_N$; the transmitted and received signals in the course of interaction can be safely quantized at a resolution determined only by the channel noise (and not the final estimation noise), as in commonplace communication systems. Figuratively speaking, the source bits are revealed along the interaction process, where the number of bits revealed in every round is determined by the channel SNR.

Another important implementation issue is sensitivity to model assumptions.
We have successfully verified in simulation the robustness of the proposed scheme in several reasonable scenarios, including correlated noise, excess quantization noise, and multiplicative channel estimation noise. The universality of the scheme and its performance for a wider range of models remain to be further investigated.

F. Discussion

Note that so far we have limited our discussion to the PAM symbol error rate pe. The bit error rate is in fact lower, since by (13) an error in PAM decoding affects only a single bit with high probability, assuming Gray labeling. However, note that a modulo-aliasing error will typically result in many erroneous bits, and hence optimizing the bit error rate does not yield a major improvement over its upper bound pe. Further fine-tuning of the scheme can be obtained by non-uniform power allocation over the interaction rounds at both Terminal A and Terminal B; in particular, we note that Terminal B is silent in the last round, which can be trivially leveraged. We also note in passing that our scheme can be used in conjunction with (say) an outer block code, to achieve other power/delay/complexity/error probability tradeoffs.

We note again that for any choice of SNR and ∆SNR, the error probability attained by our scheme cannot be made to vanish with the number of interaction rounds while maintaining a non-zero rate, as in the noiseless feedback S-K scheme. The reason is that (31) implies that $L < \widetilde{\mathrm{SNR}}$, which in turn by (30) imposes a lower bound on the attainable error probability, dictated by the probability of modulo-aliasing of the feedback noise. Equivalently, one cannot get arbitrarily close to capacity for a given target error probability, since increasing the number of iterations improves SNRN and reduces the PAM decoding error term, but at the same time increases the modulo-aliasing error term in (67). Hence, our scheme is not capacity achieving in the usual sense.
However, it can get close to capacity in the sense of reducing the capacity gap using a very short block length, typically N ≈ 20 in the examples presented. To the best of our knowledge, state-of-the-art (non-interactive) block codes require a block length larger by roughly two orders of magnitude to reach the same gap at the same error probability. Consequently, the encoding delay of our scheme is markedly lower than that of these competing schemes. Alternatively, compared to a minimal delay uncoded system under the same error probability, our scheme operates at a much lower capacity gap for a wide regime of settings, and hence can be significantly more power efficient.

Another important issue is that of encoding and decoding complexity. Our proposed scheme applies only two multiplications and one modulo operation at each terminal in each interaction round. This is significantly lower than the encoding/decoding complexity of good block codes, even if non-optimal methods such as iterative decoding are employed.

VI. IMPROVING RELIABILITY

In this section we describe an asymptotic version of our simple scheme, aimed at improving reliability at the cost of increased complexity and delay. This scheme is shown to outperform its non-feedback counterparts in the error exponent sense, when ∆SNR is large enough. The scheme is based on replacing the scalar PAM signaling with a random block code, and replacing the scalar modulo operation with a modulo-lattice operation. The error analysis of both the modulo error and


decoder decision error is performed in the coupled system as before, but is now concerned with error exponents instead of the scalar Q-function. Classical error exponent results are given in Subsection VI-A. JSCC with lattices is discussed in Subsection VI-B. The proposed scheme is introduced in Subsection VI-C. A lower bound on its error exponent is given in Subsection VI-D, and proved in Subsection VI-E. A concluding discussion appears in Subsection VI-F.

A. Block Codes and Error Exponents

In the sequel, we replace the scalar PAM mapping of the message point $W \to \Theta$ with an AWGN channel block code mapping $W \to \mathbf{\Theta}$ of length $N$. In this subsection we cite the classic results on the performance of block codes for the AWGN channel. For channel coding over the AWGN channel with a signal-to-noise ratio SNR and rate $R$, there exist block codes of length $N$ whose average error probability (averaged over the messages) under maximum likelihood decoding is exponentially upper bounded by $\Pr(\widehat{W}(Y) \neq W) \mathrel{\dot\le} e^{-N E_r(\mathrm{SNR},R)}$, where $E_r(\mathrm{SNR},R)$ is given by [6]:
$$E_r(\mathrm{SNR},R) = \begin{cases} E_{sp}(\mathrm{SNR},R) & \text{if } R_{cr} < R \le C \\ E_{rc}(\mathrm{SNR},R) & \text{if } R_{ex} < R \le R_{cr} \\ E_{ex}(\mathrm{SNR},R) & \text{if } 0 < R \le R_{ex} \end{cases} \qquad (68)$$
The boundaries between the regions are as follows. The Shannon capacity is $C \stackrel{\mathrm{def}}{=} \frac12\log(1+\mathrm{SNR})$. The critical rate is $R_{cr} \stackrel{\mathrm{def}}{=} \frac12\log\left(\frac12 + \frac{\mathrm{SNR}}{4} + \frac12\sqrt{1 + \frac{\mathrm{SNR}^2}{4}}\right)$. The expurgation rate is $R_{ex} \stackrel{\mathrm{def}}{=} \frac12\log\left(\frac12 + \frac12\sqrt{1 + \frac{\mathrm{SNR}^2}{4}}\right)$. The exponents in the above three regions are given by:
$$E_{sp}(\mathrm{SNR},R) = \frac{\mathrm{SNR}}{4\beta}\left(\beta + 1 - (\beta-1)\sqrt{1 + \frac{4\beta}{\mathrm{SNR}(\beta-1)}}\right) + \frac12\ln\left(\beta - \frac{\mathrm{SNR}(\beta-1)}{2}\left(\sqrt{1 + \frac{4\beta}{\mathrm{SNR}(\beta-1)}} - 1\right)\right) \qquad (69)$$
where $\beta = 2^{2R}$,
$$E_{rc}(\mathrm{SNR},R) = 1 - \beta + \frac{\mathrm{SNR}}{2} + \frac12\log\left(\beta - \frac{\mathrm{SNR}}{2}\right) + \frac{1 - \log(\beta)}{2} - \log(2)\,R \qquad (70),(71)$$
where here $\beta = 2e^{2R_{cr}}$, and
$$E_{ex}(\mathrm{SNR},R) = \frac{\mathrm{SNR}}{4}\left[1 - \sqrt{1 - 2^{-2R}}\right]. \qquad (72)$$
It is also well known and readily verified that for $0 < R < C$, the exponent $E_{sp}(\mathrm{SNR},R)$ coincides with the asymptotic expression of Shannon's sphere packing bound for the AWGN channel [19]. Hence, $E_{sp}(\mathrm{SNR},R)$ is also an upper bound on the reliability function.
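As a quick sanity check on (69), the sphere packing exponent should vanish at capacity, where $\beta = 2^{2C} = 1+\mathrm{SNR}$. A numeric sketch (the function name is ours):

```python
import math

def E_sp(snr, R):
    """Sphere packing exponent (69), with beta = 2^(2R) and the
    second term in nats, as in the text."""
    beta = 2 ** (2 * R)
    s = math.sqrt(1 + 4 * beta / (snr * (beta - 1)))
    first = snr / (4 * beta) * (beta + 1 - (beta - 1) * s)
    second = 0.5 * math.log(beta - snr * (beta - 1) / 2 * (s - 1))
    return first + second
```

At SNR = 10, for instance, both terms of (69) vanish identically at $R = C$, while the exponent is strictly positive for rates below capacity.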

B. JSCC with Side Information Using High Dimensional Lattices

From this point on, boldface letters such as $\mathbf X, \mathbf\Theta$ denote vectors of size $N_\Lambda$. As shown in Subsection IV-D, the probability of modulo-aliasing error in the scalar version of the JSCC problem with side information, while sometimes small, is bounded away from zero. In order to make this probability arbitrarily small, the dimension of the scheme should be increased. This can be achieved by introducing large dimensional lattices and replacing the scalar modulo with the corresponding modulo-lattice operations. Let us start by quickly surveying a few basic lattice notations and properties [20]:
(i) A lattice of dimension $N_\Lambda$ is denoted $\Lambda = G\cdot\mathbb{Z}^{N_\Lambda}$, where $G$ is the generating matrix.
(ii) $\mathrm{Vol}(\Lambda) = |\det(G)|$ is the lattice cell volume.
(iii) The nearest neighbor quantization of $x$ w.r.t. $\Lambda$ is denoted $Q_\Lambda[x]$.
(iv) $\mathcal V_0 = \{x : Q_\Lambda[x] = 0\}$ is the fundamental Voronoi cell pertaining to $\Lambda$.
(v) The modulo-$\Lambda$ operation is $M_\Lambda[x] \stackrel{\mathrm{def}}{=} x - Q_\Lambda[x]$.
(vi) $M_\Lambda[\cdot]$ satisfies the distributive law
$$M_\Lambda\left[M_\Lambda[x] + y\right] = M_\Lambda[x + y] \qquad (73)$$
(vii) The volume to noise ratio (VNR) of a lattice in the presence of AWGN with variance $\sigma^2$ is $\mu(\Lambda) \stackrel{\mathrm{def}}{=} \left[\mathrm{Vol}(\Lambda)\right]^{2/N_\Lambda}/\sigma^2$.
(viii) The normalized second moment of a lattice $\Lambda$ is $G(\Lambda) \stackrel{\mathrm{def}}{=} \sigma^2(\Lambda)/\left[\mathrm{Vol}(\Lambda)\right]^{2/N_\Lambda}$, where $\sigma^2(\Lambda) = \frac{1}{N_\Lambda}E(\|V\|^2)$ and $V$ is uniformly distributed on $\mathcal V_0$.
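For intuition, properties (v)–(vi) are easy to verify numerically for the simple cubic lattice $\Lambda = d\cdot\mathbb Z^n$, an illustrative special case (the code and its names are ours):

```python
import random

def mod_lattice(x, d=1.0):
    """Componentwise M_Lambda[x] = x - Q_Lambda[x] for the cubic
    lattice Lambda = d*Z^n, where Q_Lambda is nearest-neighbor
    (componentwise) quantization."""
    return [xi - d * round(xi / d) for xi in x]
```

The distributive law (73) then holds componentwise; it is exactly this property that lets a terminal strip known side information before reducing modulo the lattice.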

Consider again the JSCC problem introduced in Subsection IV-D, where now Terminal B is in possession of a vector $\widehat{\mathbf\Theta} = \mathbf\Theta + \boldsymbol\varepsilon$, and wants to convey the i.i.d. $\sim\mathcal N(0,\sigma_\varepsilon^2)$ error vector $\boldsymbol\varepsilon$ to Terminal A, which is in possession of $\mathbf\Theta$, over an AWGN channel $\widetilde{\mathbf Y} = \widetilde{\mathbf X} + \widetilde{\mathbf Z}$. The channel is again characterized by $\widetilde Z \sim \mathcal N(0,\tilde\sigma^2)$ and $E\widetilde X^2 \le \tilde P$. We assume that a dither signal $\mathbf V \sim \mathrm{Uniform}(\mathcal V_0)$, mutually independent of the message and the channel noises, is known at both terminals. Let us revise the JSCC with side information scheme presented in Subsection IV-D, replacing the scalar modulo operation $M_d[\cdot]$ with the lattice modulo $M_\Lambda[\cdot]$. Hence, Terminal B transmits
$$\widetilde{\mathbf X} = M_\Lambda\left[\gamma\widehat{\mathbf\Theta} + \mathbf V\right], \qquad (74)$$
where the dither $\mathbf V \sim \mathrm{Uniform}(\mathcal V_0)$. Terminal A estimates
$$\widehat{\boldsymbol\varepsilon} = \frac1\gamma M_\Lambda\left[\widetilde{\mathbf Y} - \gamma\mathbf\Theta - \mathbf V\right] = \frac1\gamma M_\Lambda\left[\gamma\boldsymbol\varepsilon + \widetilde{\mathbf Z}\right], \qquad (75)$$
and if $\gamma\boldsymbol\varepsilon + \widetilde{\mathbf Z} \in \mathcal V_0$ then
$$\widehat{\boldsymbol\varepsilon} = \boldsymbol\varepsilon + \frac1\gamma\widetilde{\mathbf Z}. \qquad (76)$$

We are now ready to set the parameters of the modulo lattice scheme. Let us set the lattice second moment to equal the feedback power constraint, $\sigma^2(\Lambda) = \tilde P$. This guarantees (due to dithering) that the feedback transmission power constraint is satisfied. The modulo-aliasing error event is the event where
$$\gamma\boldsymbol\varepsilon + \widetilde{\mathbf Z} \notin \mathcal V_0. \qquad (77)$$

Recall the definition of the looseness parameter $L$ of the lattice in (31), and note that $L = \mu(\Lambda)\cdot G(\Lambda)$. It was shown in [21, Theorem 5] that there exist lattices that asymptotically attain $G(\Lambda) = \frac{1}{2\pi e} + o(1)$, and a modulo-error that is exponentially bounded by $p_{mod} \mathrel{\dot\le} e^{-N_\Lambda E_p\left(\frac{\mu(\Lambda)}{2\pi e}\right)}$, where $E_p(x)$ is the Poltyrev exponent, given by [20]
$$E_p(x) = \begin{cases} \frac12\left(x - 1 - \ln(x)\right) & \text{if } 1 < x \le 2 \\ \frac12\left(\ln(x) + \ln\frac{e}{4}\right) & \text{if } 2 < x \le 4 \\ \frac{x}{8} & \text{if } x > 4 \end{cases} \qquad (78)$$
Hence, in our notations, such lattices satisfy
$$p_{mod} \mathrel{\dot\le} e^{-N_\Lambda E_p(L)}. \qquad (79)$$
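The Poltyrev exponent (78) is continuous and increasing across its region boundaries x = 2 and x = 4, as a short sketch confirms (the function name is ours):

```python
import math

def poltyrev_exponent(x):
    """Poltyrev exponent Ep(x) of (78), defined for x > 1."""
    if x <= 1:
        raise ValueError("Ep(x) is defined for x > 1")
    if x <= 2:
        return 0.5 * (x - 1 - math.log(x))
    if x <= 4:
        return 0.5 * (math.log(x) + math.log(math.e / 4))
    return x / 8
```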

Clearly, in a similar fashion to the scalar scheme, the setting of $L$ determines the trade-off between the modulo-error probability and the decision error probability. Setting $L$ close to 1 will maximize the effective signal-to-noise ratio (and hence the block code error exponent), but at the same time minimize the modulo-error exponent. Setting $L$ to be large will do the opposite, and will also reduce the maximal achievable rate. We note that the corresponding lattice-based JSCC scheme in [9] is better at low SNR, due to the addition of another Wiener coefficient multiplier at the receiver before the modulo operation, which results in non-Gaussian statistics of the error. For simplicity of analysis, this latter technique is not used here.

C. Description of the Scheme

In this subsection, we show how to combine the blockwise coding and blockwise modulo operations into one scheme. In a nutshell, the message $W$ is mapped into a codeword $\mathbf\Theta$ of length $N_\Lambda$, and sent in the first block ($N_\Lambda$ channel uses). This replaces the PAM transmission in the scalar scheme. In the sequel, vector analog transmission is used over the feedforward channel, and vector modulo-lattice transmission is used over the feedback channel. Ultimately, $W$ is decoded using a maximum likelihood decoding rule. Since under this protocol both terminals are idle half the time, we interlace two identical schemes, encoding and decoding two independent messages, as illustrated in Fig. 5. We denote the block index (or round index) by $k \in \{1, ..., K\}$. For brevity, and with a mild abuse of notation, we only describe the evolution of one of the interlaced schemes. The setting of the parameters $\alpha, \beta_k, \gamma_k$ will be discussed in the sequel. The dither variables $\mathbf V_k$ are i.i.d. and uniformly distributed on $\mathcal V_0$.

[Fig. 5. Blockwise transmission. The time instants are divided into blocks of size NΛ, spanning 2K blocks (2KNΛ channel uses) along the S-K rounds axis, after which the two block codes are decoded. Single headed arrows "→" and "←" denote transmission from Terminal A to Terminal B and from Terminal B to Terminal A, respectively. Double headed arrows "։" and "և" bear the same meaning for the second, interlaced scheme.]

(A) Initialization:

Terminal A: Map the message $W$ to a codeword $\mathbf\Theta$ using a codebook for the AWGN channel with average power $P$.

Terminal A ⇒ Terminal B:
• Send $\mathbf X_1 = \mathbf\Theta$
• Receive $\mathbf Y_1 = \mathbf X_1 + \mathbf Z_1$

Terminal B: Initialize the $\mathbf\Theta$ estimate to $\widehat{\mathbf\Theta}_1 = \mathbf Y_1$.

(B) Iteration:

Terminal B ⇒ Terminal A:
• Given the $\mathbf\Theta$ estimate $\widehat{\mathbf\Theta}_k$, compute and send in the following block
$$\widetilde{\mathbf X}_k = M_\Lambda\left[\gamma_k\widehat{\mathbf\Theta}_k + \mathbf V_k\right] \qquad (80)$$
• Receive $\widetilde{\mathbf Y}_k = \widetilde{\mathbf X}_k + \widetilde{\mathbf Z}_k$

Terminal A: Extract a noisy scaled version of the estimation error vector $\boldsymbol\varepsilon_k$:
$$\widetilde{\boldsymbol\varepsilon}_k = M_\Lambda\left[\widetilde{\mathbf Y}_k - \gamma_k\mathbf\Theta - \mathbf V_k\right] \qquad (81)$$
Note that $\widetilde{\boldsymbol\varepsilon}_k = \gamma_k\boldsymbol\varepsilon_k + \widetilde{\mathbf Z}_k$, unless a modulo-aliasing error occurs.

Terminal A ⇒ Terminal B:
• Send a scaled version of $\widetilde{\boldsymbol\varepsilon}_k$: $\mathbf X_{k+1} = \alpha\widetilde{\boldsymbol\varepsilon}_k$, where $\alpha$ is set so that the input power constraint $P$ is met.
• Receive $\mathbf Y_{k+1} = \mathbf X_{k+1} + \mathbf Z_{k+1}$

Terminal B: Update the $\mathbf\Theta$ estimate to $\widehat{\mathbf\Theta}_{k+1} = \widehat{\mathbf\Theta}_k - \widehat{\boldsymbol\varepsilon}_k$, where
$$\widehat{\boldsymbol\varepsilon}_k = \beta_{k+1}\mathbf Y_{k+1} \qquad (82)$$

(C) Decoding: After the reception of block $K$, the receiver decodes the message $\widehat W(\widehat{\mathbf\Theta}_K)$ using an ML decision rule w.r.t. the codebook.

D. Main Result: The Error Exponent

Set the scheme parameters $\alpha, \beta_k, \gamma_k$ to
$$\alpha = \sqrt{L\,\frac{P}{\tilde P}}, \qquad (83)$$
$$\beta_k = \frac{\sigma_{k-1}}{\sigma}\sqrt{\frac{\mathrm{SNR}\cdot\left(1 - L\,\widetilde{\mathrm{SNR}}^{-1}\right)}{1+\mathrm{SNR}}}, \qquad (84)$$
$$\gamma_k = \frac{1}{\sigma_k}\sqrt{\frac{\tilde P}{L} - \tilde\sigma^2}, \qquad (85)$$
where $\sigma_k = \sigma_k(L) = \frac{1}{\sqrt{\mathrm{SNR}_k(L)}}$, and
$$\mathrm{SNR}_K(L) \stackrel{\mathrm{def}}{=} \mathrm{SNR}\cdot\left(\frac{(1+\mathrm{SNR})\left(1 - L\,\widetilde{\mathrm{SNR}}^{-1}\right)}{1 + L\,\Delta\mathrm{SNR}^{-1}}\right)^{K-1}. \qquad (86)$$

The following theorem provides a lower bound on the error exponent obtained by our scheme.

Theorem 2: For the choice of parameters above, the interactive communication scheme described in Subsection VI-C attains an error probability $p_e \mathrel{\dot\le} e^{-N_\Lambda E_{FB}(R)}$, where
$$E_{FB}(R) \stackrel{\mathrm{def}}{=} \max_{K\in\mathbb N,\,L\ge1}\ \frac{\min\left\{E_r\left(\mathrm{SNR}_K(L),\,KR\right),\,E_p(L)\right\}}{2K}. \qquad (87)$$

E. Proof of Theorem 2

Define the error event
$$\mathcal E_K \stackrel{\mathrm{def}}{=} \left\{\widehat W(\widehat{\mathbf\Theta}_K) \neq W\right\}. \qquad (88)$$
The error probability of each of the interlaced schemes is $p_e = \Pr(\mathcal E_K)$, and hence the total error probability is upper bounded by $2p_e$. Therefore, below we analyze only a single scheme, since the factor of 2 does not change the exponential behavior.

As in the analysis of the scalar scheme, the channel $\mathbf\Theta\to\mathbf Y_K$ is not Gaussian due to the non-linear modulo operations, which complicates a direct analysis. In order to circumvent this, we will upper bound the error probability by also taking modulo-error events into account, as done before. These error events are defined by
$$E_k \stackrel{\mathrm{def}}{=} \left\{\gamma_k\boldsymbol\varepsilon_k + \widetilde{\mathbf Z}_k \notin \mathcal V_0\right\} \qquad (89)$$
for $k = 1,\ldots,K-1$; for convenience we also write $E_K \stackrel{\mathrm{def}}{=} \mathcal E_K$, and we have that
$$p_e \le \Pr\left(\bigcup_{k=1}^{K} E_k\right). \qquad (90)$$
Applying the coupling argument of Lemma 1, we can obtain
$$\Pr\left(\bigcup_{k=1}^{K} E_k\right) = \Pr\left(\bigcup_{k=1}^{K} E_k'\right). \qquad (91)$$
Using the union bound we obtain
$$p_e \le \sum_{k=1}^{K} \Pr\left(E_k'\right). \qquad (92)$$
Calculating the above probabilities now involves only Gaussian random vectors, which significantly simplifies the analysis.

From this point on, we perform an asymptotic exponential analysis. We set the parameters such that all modulo-aliasing error probabilities are equal. Hence
$$p_e \le \sum_{k=1}^{K} \Pr(E_k') \mathrel{\dot\le} \max(p_{mod}, p_{dec}) \qquad (93)$$
where, without loss of asymptotic optimality, we have set all the modulo-error probabilities to be equal, $p_{mod} \stackrel{\mathrm{def}}{=} \Pr(E_k')$ for $k < K$, and defined $p_{dec} \stackrel{\mathrm{def}}{=} \Pr(E_K')$. The modulo-aliasing error can be exponentially upper bounded by the Poltyrev exponent (79), i.e. $p_{mod} \mathrel{\dot\le} e^{-N_\Lambda E_p(L)}$.

We observe that the channel $\mathbf\Theta\to\widehat{\mathbf\Theta}_K$ is in fact equivalent (in the coupled system) to $N_\Lambda$ parallel independent AWGN channels, each with the same noise variance $\sigma_K^2$, and with a signal-to-noise ratio $\mathrm{SNR}_K(L)$ given in (86). We can now encode the message $W$ into $\mathbf\Theta$ using a Gaussian codebook of block length $N_\Lambda$ and rate $KR$ to obtain
$$p_{dec} \mathrel{\dot\le} e^{-N_\Lambda E_r(\mathrm{SNR}_K(L),\,KR)}. \qquad (94)$$

[Fig. 6. Error exponents with and without feedback, for SNR = 20dB with ∆SNR = 20dB and ∆SNR = 30dB. The curves show Er, Esp, and the achievable EFB for each ∆SNR, normalized by SNR, as a function of R/C.]

Note that the rate $KR$ is chosen such that the overall rate over $K$ rounds is $R$. Balancing the exponents for $p_{dec}$ and $p_{mod}$ yields the result. The division by $2K$ is due to the use of two interlaced schemes, which doubles the overall delay. The trade-off is now clear: setting the lattice looseness $L$ to be large reduces $p_{mod}$ but also reduces $\mathrm{SNR}_K(L)$, hence increasing $p_{dec}$, and vice versa. Due to the monotonicity of $E_r(\mathrm{SNR}_K(L), KR)$ and $E_p(L)$ in $L$, a numerical solution to (87) can be easily found.

F. Discussion

Numerical evaluations of $E_{sp}$, $E_r$ and $E_{FB}$ for SNR = 20dB are depicted in Fig. 6. Comparing our achievable exponent to the sphere packing bound (which upper bounds the best achievable error exponent without feedback), we see that for ∆SNR = 20dB, $E_{FB}$ is slightly better than the sphere packing bound. For ∆SNR = 30dB, $E_{FB}$ is significantly better than the sphere packing bound.

It is now instructive to give an approximation for high SNR. First, recall that the only constraint on $L$ is that $1 < L < \widetilde{\mathrm{SNR}}$. For any fixed ∆SNR, we can therefore always set $L \gg \Delta\mathrm{SNR}$, if SNR (and hence $\widetilde{\mathrm{SNR}}$) is large enough. Under this assumption, it can be verified that for SNR ≫ 1
$$\mathrm{SNR}_K(L) \ge \frac{\left(\widetilde{\mathrm{SNR}} - L\right)^K}{\Delta\mathrm{SNR}\cdot L^{K-1}}\,(1+o(1)). \qquad (95)$$
Let us now set $R = 0$, and solve $E_r(\mathrm{SNR}_K(L), KR) = E_p(L)$ for $L$. Since both $E_r$ and $E_p$ are in their expurgation regions, we have the following equation:
$$\frac14\cdot\frac{\left(\widetilde{\mathrm{SNR}} - L\right)^K}{\Delta\mathrm{SNR}\cdot L^{K-1}} = \frac{L}{8}. \qquad (96)$$
The solution yields $L^* = \widetilde{\mathrm{SNR}}\Big/\left(1 + \left(\tfrac12\Delta\mathrm{SNR}\right)^{\frac1K}\right)$. Plugging this into (87) yields, for any $K > 1$:
$$E_{FB}(0) \ge \frac{\mathrm{SNR}\cdot\Delta\mathrm{SNR}}{16K\left(1 + \left(\tfrac12\Delta\mathrm{SNR}\right)^{\frac1K}\right)}\,(1+o(1)). \qquad (97)$$
Optimizing for $K$ yields:
$$K^* = 0.78\cdot\ln\left(\tfrac12\Delta\mathrm{SNR}\right) \approx 0.18\cdot\Delta\mathrm{SNR}_{dB} - 0.54. \qquad (98)$$
Noting that $x^{1/(a\ln x)} = e^{1/a}$ and plugging in the above, we obtain
$$E_{FB}(0) \ge \frac{\Delta\mathrm{SNR}\cdot\mathrm{SNR}}{57.44\left(\ln(\Delta\mathrm{SNR}) - 0.693\right)}\,(1+o(1)). \qquad (99)$$
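The zero-rate calculation above is easy to reproduce numerically; the sketch below (names ours) normalizes (97) by SNR·∆SNR and scans over integer K:

```python
def zero_rate_exponent(dsnr, K):
    """E_FB(0) lower bound (97), normalized by SNR*dSNR; this is
    Ep(L*)/(2K) with L* = SNR~ / (1 + (dsnr/2)^(1/K))."""
    return 1.0 / (16 * K * (1 + (0.5 * dsnr) ** (1.0 / K)))

def best_K(dsnr, K_max=50):
    """Integer K maximizing the (normalized) zero-rate exponent."""
    return max(range(2, K_max), key=lambda K: zero_rate_exponent(dsnr, K))
```

For ∆SNR = 30dB (a factor of 1000) this gives K = 5, in line with $K^* \approx 0.78\cdot\ln(500) \approx 4.85$ from (98).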


We note that $K^*$ is linear in $\Delta\mathrm{SNR}_{dB}$, hence the number of rounds is effectively very small. Moreover, the ratio $E_{FB}(0)/E_{sp}(0)$ behaves like $\Omega\left(\frac{\Delta\mathrm{SNR}}{\ln(\Delta\mathrm{SNR})}\right)$, and grows unbounded with ∆SNR.

Chance and Love [11] suggested a concatenated coding scheme for the noisy feedback setup. Specifically, they provided an error exponent for a construction where the inner code consists of several iterations of a linear feedback coding scheme, and the outer code is a random block code. Their inner code is based on the idea of maximizing the output SNR via linear processing. Here, we compare their results to ours, and show that their exponent is superior at low rates, and ours is superior at high rates. In their Lemma 8, they provide the following achievable exponent at zero rate:
$$E_{FB}^{CL}(0) = \frac{\mathrm{SNR}}{4}\left(1 + \Delta\mathrm{SNR}\cdot\frac{\mathrm{SNR}}{1+\mathrm{SNR}}\right) \qquad (100)$$
which is clearly better than our exponent (99). However, in the same Lemma they also derive a "shut-off rate" for their scheme, namely a threshold rate $R_{th}^{CL}$ above which their exponent does not improve on the no-feedback one:
$$R_{th}^{CL} = \frac14\log\left(1 + 2\,\mathrm{SNR}\cdot\Delta\mathrm{SNR}\cdot\frac{\mathrm{SNR}}{1+\mathrm{SNR}}\,(1-\gamma_0)\right) \qquad (101)$$
where $\gamma_0\in[0,1]$ is a root of a quadratic equation given in their Lemma 6 (note that a factor $\frac12$ is missing in the original expression for $R_{th}^{CL}$). Dividing the above by the capacity, it is easy to show that
$$\frac{R_{th}^{CL}}{C} \le \frac12\left(1 + \frac{\log(1+\Delta\mathrm{SNR})}{\log\mathrm{SNR}}\right) + \frac{(1+\mathrm{SNR})\log e}{2\,\Delta\mathrm{SNR}\cdot\mathrm{SNR}^2\,\log\mathrm{SNR}} \qquad (102)$$
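The right-hand side of (102) is straightforward to evaluate numerically; a small sketch (the function name is ours, logs base 2):

```python
import math

def rth_ratio_bound(snr, dsnr):
    """Upper bound (102) on R_th^CL / C."""
    first = 0.5 * (1 + math.log2(1 + dsnr) / math.log2(snr))
    second = (1 + snr) * math.log2(math.e) / (2 * dsnr * snr ** 2 * math.log2(snr))
    return first + second
```

With ∆SNR held fixed, the bound decreases toward 1/2 as SNR grows, albeit only logarithmically.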

For SNR ≫ 1 and a fixed ∆SNR, the above upper bound clearly converges to 1/2; hence, in this regime the error exponent in [11] does not yield any improvement over the non-feedback exponent for rates above half the capacity.

Fig. 7 contains a comparison of our error exponent $E_{FB}$ with the error exponent $E_{FB}^{CL}$. In this setting SNR = 10dB and ∆SNR = 23dB. The number of iterations for $E_{FB}$ was optimized per SNR. We can see that $E_{FB}^{CL}$ dramatically improves over the non-feedback achievable error exponent at rates close to zero, but then falls below it at rates above 0.46C. Our error exponent is smaller than $E_{FB}^{CL}$ at rates below 0.3C, but outperforms it and consistently beats the sphere packing bound for rates up to 0.9C.

The error exponent analysis of the concatenated scheme in [11] enjoys the fact that the inner feedback code is linear, and hence the resulting end-to-end channel remains Gaussian. This facilitates the use of the well studied Gaussian error exponents. Of course, one can also construct a concatenated coding scheme using our non-linear simplicity-oriented scheme from Section V, in lieu of the linear one. Such a construction should intuitively obtain superior performance, since our inner code is significantly better than linear processing, and constitutes a better building block. Specifically, for uncoded PAM it achieves an error probability that corresponds to an AWGN channel with a signal-to-noise ratio that increases exponentially in the first few rounds. Alas, the same non-linearity induces an end-to-end channel that is non-Gaussian, and hence forbids the direct use of Gaussian error exponents, rendering the analysis difficult. The different construction described in this section was introduced in order to circumvent this technical difficulty, possibly at the cost of a weaker result, especially at low rates.

REFERENCES

[1] C. E. Shannon, "The zero-error capacity of a noisy channel," IEEE Trans. Inf. Theory, vol. IT-2, pp. 8–19, Sep 1956.
[2] J. P.
M. Schalkwijk and T. Kailath, "A coding scheme for additive noise channels with feedback part I: No bandwidth constraint," IEEE Trans. Inf. Theory, vol. IT-12, pp. 172–182, Apr 1966.
[3] J. P. M. Schalkwijk, "A coding scheme for additive noise channels with feedback part II: Band-limited signals," IEEE Trans. Inf. Theory, vol. IT-12, pp. 183–189, Apr 1966.
[4] M. Horstein, "Sequential transmission using noiseless feedback," IEEE Trans. Inf. Theory, vol. IT-9, pp. 136–143, Jul 1963.
[5] O. Shayevitz and M. Feder, "Optimal feedback communication via posterior matching," IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1186–1222, Mar 2011.
[6] R. G. Gallager, Information Theory and Reliable Communication, New York: John Wiley & Sons, 1968.
[7] Y.-H. Kim, A. Lapidoth, and T. Weissman, "On the reliability of Gaussian channels with noisy feedback," in Proc. 44th Allerton Conf. Communication, Control and Computing, Sep. 2006, pp. 364–371.
[8] R. G. Gallager and B. Nakiboğlu, "Variations on a theme by Schalkwijk and Kailath," IEEE Trans. Inf. Theory, vol. 56, no. 1, pp. 6–17, 2010.
[9] Y. Kochman and R. Zamir, "Joint Wyner-Ziv/dirty-paper coding by analog modulo-lattice modulation," IEEE Trans. Inf. Theory, vol. 55, pp. 4878–4889, 2009.
[10] M. V. Burnashev and H. Yamamoto, "Noisy feedback improves the Gaussian channel reliability function," in Proc. ISIT, 2014, pp. 2554–2558.
[11] Z. Chance and D. J. Love, "Concatenated coding for the AWGN channel with noisy feedback," IEEE Trans. Inf. Theory, vol. 57, pp. 6633–6649, Oct. 2011.

21

Er Esp EF B EFCLB

Error Exponents/SNR

0.8

0.6

0.4

0.2

0

0

0.2

0.4

0.6

0.8

1

R/C CL ). In this setting SNR = 10dB Fig. 7. A comparison of non-feedback error exponents, our error exponent (EF B ) and Chance and Love’s error exponent (EF B and ∆SNR = 23dB

[12] Y.-H. Kim, A. Lapidoth, and T. Weissman, "Error exponents for the Gaussian channel with active noisy feedback," IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1223–1236, Mar 2011.
[13] A. Ben-Yishai and O. Shayevitz, "The Gaussian channel with noisy feedback: Improving reliability via interaction," in Proc. ISIT, 2015, pp. 2500–2504. Also available at http://arxiv.org/abs/1501.06671.
[14] A. Ben-Yishai and O. Shayevitz, "The Gaussian channel with noisy feedback: Near-capacity performance via simple interaction," in Proc. 52nd Allerton Conf. Communication, Control and Computing, Oct. 2014, pp. 152–159. Also available at http://arxiv.org/abs/1407.8022.
[15] M. V. Burnashev, "Data transmission over a discrete channel with feedback," Probl. Pered. Inf., vol. 12, no. 4, pp. 10–30, 1976. English translation in Probl. Inf. Transm., pp. 250–265, 1976.
[16] A. Sato and H. Yamamoto, "Error exponents of discrete memoryless channels and AWGN channels with noisy feedback," in Proc. ISITA, 2010, pp. 452–457.
[17] T. Goblick, "Theoretical limitations on the transmission of data from analog sources," IEEE Trans. Inf. Theory, vol. IT-11, no. 4, pp. 558–567, Oct 1965.
[18] R. Zamir, S. Shamai, and U. Erez, "Nested linear/lattice codes for structured multiterminal binning," IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 1250–1276, 2002.
[19] C. E. Shannon, "Probability of error for optimal codes in a Gaussian channel," Bell Syst. Tech. J., vol. 38, pp. 611–656, 1959.
[20] R. Zamir, Lattice Coding for Signals and Networks, Cambridge University Press, 2014.
[21] U. Erez, S. Litsyn, and R. Zamir, "Lattices which are good for (almost) everything," IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3401–3416, Oct 2005.