Lecture: Channels with State (Reading: NIT ., ., .–., .)

∙ DMC with state
∙ DMC with DM state
∙ Causal state information available at the encoder
∙ Noncausal state information available at the encoder
∙ Writing on dirty paper

© Copyright – Abbas El Gamal and Young-Han Kim
DMC with state

[Block diagram: M → Encoder → X^n → channel p(y|x, s) driven by the state sequence s^n → Y^n → Decoder → M̂]
∙ DMC with state (X × S, p(y|x, s), Y)
∙ State: channel uncertainty, jamming, fading, memory faults, host image
∙ Three general classes:
  Compound channel: the state is fixed throughout the transmission
  Arbitrarily varying channel: s^n is an arbitrary sequence
  Random state
DMC with DM state

[Block diagram: state S_i drawn i.i.d. ∼ p(s); M → Encoder → X_i → p(y|x, s) → Y_i → Decoder → M̂; the state (S_i causally, or S^n noncausally) may be available at the encoder and/or the decoder]
∙ DMC with DM state (X × S, p(y|x, s)p(s), Y)
∙ State information availability:
  At the encoder, the decoder, neither, or both
  Noiseless, noisy, or coded
  Causal (S^i known before transmission i) or noncausal (S^n known before transmission)
∙ For each setup, the (2^{nR}, n) code, achievability, and capacity are defined in the usual way
Simple special cases
∙ No state information available at either the encoder or the decoder:
  C = max_{p(x)} I(X; Y),
  where p(y|x) = ∑_s p(s) p(y|x, s)
∙ State information available (causally or noncausally) at the decoder (m̂(y^n, s^n)):
  C_SI-D = max_{p(x)} I(X; Y, S) = max_{p(x)} I(X; Y|S),
  achieved by treating (Y, S) as the channel output (see the worked step below)
∙ State information available at both the encoder and the decoder (x^n(m, s^i), m̂(y^n, s^n)):
  C_SI-ED = max_{p(x|s)} I(X; Y|S)
  for both the causal and the noncausal case
  Key achievability idea: treat S^n as a time-sharing sequence
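The second equality in the expression for C_SI-D is the chain rule plus the fact that, with no state information at the encoder, the input X is independent of S:

  I(X; Y, S) = I(X; S) + I(X; Y|S) = 0 + I(X; Y|S) = I(X; Y|S)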
Proof of achievability (Goldsmith–Varaiya 1997)
∙ Split M into independent messages with rates R_s, s ∈ S; hence ∑_s R_s = R
∙ Codebook generation:
  For each s, generate 2^{nR_s} sequences x^n(m_s, s) ∼ ∏_{i=1}^n p_{X|S}(x_i|s), m_s ∈ [1 : 2^{nR_s}]
∙ Encoding:
  To send message m = (m_s : s ∈ S), store each x^n(m_s, s) in a FIFO buffer for s
  At time i, transmit the first untransmitted symbol from the FIFO buffer for s_i (see the sketch below)
∙ Decoding and the analysis of the probability of error:
  Demultiplex the received sequence into subsequences (y^{n_s}(s), s ∈ S), where ∑_s n_s = n
  If s^n ∈ T_є^(n), then n_s ≥ n(1 − є)p(s) for every s ∈ S
  Find a unique m̂_s for each s such that (x^{n(1−є)p(s)}(m̂_s, s), y^{n(1−є)p(s)}(s)) ∈ T_є^(n)
  By the LLN and the packing lemma, P_e^(n)(s) → 0 if R_s < (1 − є)p(s) I(X; Y|S = s) − δ(є)
  Hence, P_e^(n) → 0 if R < (1 − є) I(X; Y|S) − δ(є)
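A minimal sketch of the multiplexing mechanics (not from the lecture; the per-state codewords and the noiseless channel are stand-ins, and no message decoding is attempted): the encoder serves the next unsent symbol from the FIFO buffer of the current state, and the decoder demultiplexes y^n by state.

    # Multiplexed encoding over per-state FIFO buffers, then demultiplexing at the decoder.
    import numpy as np

    rng = np.random.default_rng(0)
    n, states = 12, [0, 1]
    s_seq = rng.integers(0, 2, size=n)               # i.i.d. state sequence s^n

    # One stand-in codeword x^n(m_s, s) per state, here just Bern(1/2) symbols.
    codewords = {s: rng.integers(0, 2, size=n) for s in states}
    pointers = {s: 0 for s in states}                # FIFO read positions

    x_seq = np.empty(n, dtype=int)
    for i, s in enumerate(s_seq):                    # encoding: multiplex by state
        x_seq[i] = codewords[s][pointers[s]]
        pointers[s] += 1

    y_seq = x_seq.copy()                             # noiseless stand-in channel
    # Decoder side: demultiplex y^n into per-state subsequences y^{n_s}(s).
    subseqs = {s: y_seq[s_seq == s] for s in states}
    for s in states:
        print(f"state {s}: n_s = {len(subseqs[s])}, subsequence = {subseqs[s]}")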
Causal state information available at the encoder

[Block diagram: S_i ∼ p(s) available causally at the encoder; M → Encoder → X_i → p(y|x, s) → Y_i → Decoder → M̂]

Theorem (Shannon 1958)
  C_CSI-E = max_{p(u), x(u,s)} I(U; Y),
  where U is independent of S with |U| ≤ min{(|X| − 1)|S| + 1, |Y|}

∙ Proof of the converse: Read NIT .
Proof of achievability

[Block diagram: M → Encoder → U_i → device x(u, s) → X_i → p(y|x, s) → Y_i → Decoder → M̂, with S_i ∼ p(s) feeding the device and the channel]

∙ Fix p(u) and x(u, s) that achieve C_CSI-E
∙ Shannon strategy: attach a "physical device" x(u, s) in front of the actual channel
∙ This induces a DMC p(y|u) = ∑_s p(y|x(u, s), s) p(s) with input U and output Y
∙ Now code for the induced DMC p(y|u) to achieve I(U; Y)
  Encoding: to send m, transmit x_i = x(u_i(m), s_i), i ∈ [1 : n]
∙ Can be viewed as coding over all functions {x_u(s) : S → X} (u is the function index); a numerical sketch follows
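To make the Shannon strategy concrete, the following toy sketch (not from the lecture; the state-dependent BSC is an assumed example) enumerates all strategy letters u : S → X, forms the induced DMC p(y|u), and maximizes I(U; Y) over p(u) with the Blahut–Arimoto algorithm.

    # Shannon strategy for a DMC with DM state and causal encoder state information.
    import itertools
    import numpy as np

    p_s = np.array([0.5, 0.5])                      # state distribution p(s)
    # p_y_xs[s, x, y]: BSC(0.1) when s = 0 and BSC(0.4) when s = 1 (assumed example).
    p_y_xs = np.array([[[0.9, 0.1], [0.1, 0.9]],
                       [[0.6, 0.4], [0.4, 0.6]]])
    S, X, Y = 2, 2, 2

    # Each strategy letter u is a function S -> X; induced channel p(y|u) = sum_s p(s) p(y|u(s), s).
    strategies = list(itertools.product(range(X), repeat=S))
    p_y_u = np.array([[sum(p_s[s] * p_y_xs[s, u[s], y] for s in range(S))
                       for y in range(Y)] for u in strategies])

    def blahut_arimoto(W, iters=500):
        """max_{p(u)} I(U; Y) in bits for a DMC W[u, y] with strictly positive entries."""
        p = np.full(W.shape[0], 1.0 / W.shape[0])
        for _ in range(iters):
            q = p[:, None] * W                      # joint p(u) W(y|u)
            q /= q.sum(axis=0, keepdims=True)       # posterior q(u|y)
            p = np.exp((W * np.log(q)).sum(axis=1))
            p /= p.sum()
        py = (p[:, None] * W).sum(axis=0)
        return (p[:, None] * W * np.log2(W / py[None, :])).sum()

    print("C_CSI-E (approx, bits):", blahut_arimoto(p_y_u))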
Noncausal state information available at the encoder

[Block diagram: S^n (S_i ∼ p(s)) known noncausally at the encoder; M → Encoder → X_i → p(y|x, s) → Y_i → Decoder → M̂]

∙ Motivation for noncausal state information:
  Memory with defects
  Write-once memory
  Digital watermarking
  General broadcast channel
Memory with stuck-at faults

[Figure: a memory cell with input X and output Y; with probability p/2 it is stuck at 0, with probability p/2 it is stuck at 1, and with probability 1 − p it is fault-free, i.e., Y = X]

∙ If the reader knows the fault locations:
  C_SI-D = C_SI-ED = 1 − p
∙ If neither the writer nor the reader knows:
  C = 1 − H(p/2)
∙ If the writer knows the fault locations:
  C_SI-E = ?
∙ Kuznetsov–Tsybakov (1974) showed:
  C_SI-E = 1 − p (see the numerical comparison below)
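A quick numerical comparison of the capacities quoted above (binary entropy in bits; the values of p are arbitrary):

    import numpy as np

    def h2(q):
        """Binary entropy in bits."""
        q = np.clip(q, 1e-12, 1 - 1e-12)
        return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

    for p in (0.1, 0.3, 0.5):
        print(f"p = {p}:  C_SI-D = C_SI-ED = C_SI-E = {1 - p:.3f},  "
              f"C (no side information) = {1 - h2(p / 2):.3f}")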
Multicoding
∙ Codebook generation: Randomly partition {0, 1}^n into 2^{nR} subcodebooks C(1), . . . , C(2^{nR})

[Figure: the binary sequences of length n arranged into subcodebooks C(1), . . . , C(m), . . . , C(2^{nR})]
∙ Writing: To store m given the fault pattern s^n, write a sequence x^n ∈ C(m) that matches the fault pattern

[Figure: the stored sequence x^n marked inside subcodebook C(m)]
∙ Reading: Given the read sequence y^n, declare the index m̂ of the subcodebook containing y^n

[Figure: the read sequence y^n marked inside subcodebook C(m̂)]
Analysis of the probability of error
∙ An error occurs iff there is no x^n ∈ C(m) that matches the fault pattern
∙ For n large, there are ≈ np faults
∙ Hence, there are ≐ 2^{n(1−p)} sequences that match any given fault pattern
∙ If R < 1 − p and n is large, C(m) contains a matching sequence w.h.p.
∙ Hence the capacity is C_SI-E = 1 − p (a toy simulation follows)
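The following toy simulation (not from the lecture; the block length, rate, and fault probability are chosen only so it runs instantly) randomly bins {0, 1}^n into subcodebooks and estimates how often C(m) contains a sequence matching a random fault pattern, illustrating why R < 1 − p suffices:

    import itertools
    import numpy as np

    rng = np.random.default_rng(1)
    n, R, p, trials = 12, 0.4, 0.3, 200
    num_bins = 2 ** int(np.ceil(n * R))

    sequences = np.array(list(itertools.product([0, 1], repeat=n)))
    successes = 0
    for _ in range(trials):
        bins = rng.integers(0, num_bins, size=len(sequences))  # random partition of {0,1}^n
        m = rng.integers(0, num_bins)                           # message to store
        stuck = rng.random(n) < p                               # fault locations
        stuck_vals = rng.integers(0, 2, size=n)                 # stuck-at values
        # A codeword "matches" the fault pattern if it agrees with every stuck cell.
        in_bin = sequences[bins == m]
        matches = np.all(in_bin[:, stuck] == stuck_vals[stuck], axis=1)
        successes += bool(matches.any())

    print(f"empirical probability that C(m) contains a matching sequence: {successes / trials:.2f}")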
Gelfand–Pinsker theorem
∙ Gelfand–Pinsker (1980) generalized this result to an arbitrary DMC with DM state

Theorem (Gelfand–Pinsker)
  C_SI-E = max_{p(u|s), x(u,s)} (I(U; Y) − I(U; S)),
  where |U| ≤ min{|X| ⋅ |S|, |Y| + |S| − 1}

∙ Example: Memory with defects
  If S = 2 (no fault), set U = X ∼ Bern(1/2)
  If S = 0 or 1 (stuck cell), set U = X = S
  Then I(U; Y) − I(U; S) = H(U|S) − H(U|Y) = 1 − p (see the numerical check below)
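A numerical check of the example (assuming the state labels s = 0/1 for "stuck at 0/1" and s = 2 for "no fault", as above): evaluate I(U; Y) − I(U; S) for the stated choice of U and confirm that it equals 1 − p.

    import numpy as np

    def mutual_information(joint):
        """I(A; B) in bits from a joint pmf array joint[a, b]."""
        pa, pb = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
        mask = joint > 0
        return float((joint[mask] * np.log2(joint[mask] / (pa @ pb)[mask])).sum())

    p = 0.3
    p_s = {0: p / 2, 1: p / 2, 2: 1 - p}             # stuck at 0, stuck at 1, no fault
    joint_us, joint_uy = np.zeros((2, 3)), np.zeros((2, 2))
    for s, ps in p_s.items():
        for u in (0, 1):
            pu_s = 0.5 if s == 2 else float(u == s)  # p(u|s) from the chosen strategy
            y = u if s == 2 else s                   # x = u, and y = x unless the cell is stuck
            joint_us[u, s] += ps * pu_s
            joint_uy[u, y] += ps * pu_s

    print("I(U;Y) - I(U;S) =", mutual_information(joint_uy) - mutual_information(joint_us))
    print("1 - p           =", 1 - p)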
Proof of achievability
∙ Codebook generation: Fix p(u|s) and x(u, s) that achieve C_SI-E and let R̃ > R
  For each m ∈ [1 : 2^{nR}], generate a subcodebook C(m) consisting of 2^{n(R̃−R)} sequences u^n(l) ∼ ∏_{i=1}^n p_U(u_i), l ∈ [(m − 1)2^{n(R̃−R)} + 1 : m 2^{n(R̃−R)}]

[Figure: the sequences u^n(1), . . . , u^n(2^{nR̃}) arranged into subcodebooks C(1), . . . , C(m), . . . , C(2^{nR})]
∙ Encoding: To send m given s^n, find u^n(l) ∈ C(m) such that (u^n(l), s^n) ∈ T_є′^(n)(U, S)
  If no such u^n(l) exists, set l = 1. Then transmit x_i = x(u_i(l), s_i) for i ∈ [1 : n]
∙ Decoding: Find the unique m̂ such that (u^n(l), y^n) ∈ T_є^(n) for some u^n(l) ∈ C(m̂)
Analysis of the probability of error
∙ Consider P(E) conditioned on M = 1
∙ Let L denote the index of the chosen U^n for S^n and M = 1
∙ Error events:
  E_1 = {(U^n(l), S^n) ∉ T_є′^(n) for all U^n(l) ∈ C(1)},
  E_2 = {(U^n(L), Y^n) ∉ T_є^(n)},
  E_3 = {(U^n(l), Y^n) ∈ T_є^(n) for some l ∉ [1 : 2^{n(R̃−R)}]}
∙ Thus, by the union of events bound,
  P(E) ≤ P(E_1) + P(E_1^c ∩ E_2) + P(E_3)
Conditional and joint typicality lemmas

Conditional typicality lemma
Let (X, Y) ∼ p(x, y) and є > є′. If x^n ∈ T_є′^(n)(X) and Y^n ∼ ∏_{i=1}^n p_{Y|X}(y_i|x_i), then
  lim_{n→∞} P{(x^n, Y^n) ∈ T_є^(n)(X, Y)} = 1
∙ If x^n ∈ T_є′^(n)(X), є > є′, then for n sufficiently large,
  |T_є^(n)(Y|x^n)| ≥ 2^{n(H(Y|X)−δ(є))}

Joint typicality lemma
Let (X, Y) ∼ p(x, y) and є > є′. If x^n ∈ T_є′^(n)(X) and Ỹ^n ∼ ∏_{i=1}^n p_Y(ỹ_i), then for some δ(є) → 0 as є → 0 and n sufficiently large,
  P{(x^n, Ỹ^n) ∈ T_є^(n)(X, Y)} ≥ 2^{−n(I(X;Y)+δ(є))}
Covering lemma (U = ∅)
∙ Let (X, X̂) ∼ p(x, x̂) and є′ < є
∙ Let X^n ∼ p(x^n) be arbitrarily distributed such that lim_{n→∞} P{X^n ∈ T_є′^(n)(X)} = 1
∙ Let X̂^n(m) ∼ ∏_{i=1}^n p_X̂(x̂_i), m ∈ A, |A| ≥ 2^{nR}, be independent of each other and of X^n

Lemma (Covering lemma)
There exists δ(є) → 0 as є → 0 such that
  lim_{n→∞} P{(X^n, X̂^n(m)) ∉ T_є^(n) for all m ∈ A} = 0,
if R > I(X; X̂) + δ(є)
Analysis of the probability of error
∙ Error events:
  E_1 = {(U^n(l), S^n) ∉ T_є′^(n) for all U^n(l) ∈ C(1)},
  E_2 = {(U^n(L), Y^n) ∉ T_є^(n)},
  E_3 = {(U^n(l), Y^n) ∈ T_є^(n) for some l ∉ [1 : 2^{n(R̃−R)}]}
∙ By the covering lemma (|A| = 2^{n(R̃−R)}, X ← S, X̂ ← U),
  P(E_1) → 0 if R̃ − R > I(U; S) + δ(є′)
∙ Since E_1^c = {(U^n(L), X^n, S^n) ∈ T_є′^(n)} and
  Y^n | {U^n(L) = u^n, X^n = x^n, S^n = s^n} ∼ ∏_{i=1}^n p_{Y|U,X,S}(y_i|u_i, x_i, s_i) = ∏_{i=1}^n p_{Y|X,S}(y_i|x_i, s_i),
  by the conditional typicality lemma, P(E_1^c ∩ E_2) → 0
∙ Since U^n(l) ∼ ∏_{i=1}^n p_U(u_i) for l ∉ [1 : 2^{n(R̃−R)}], and U^n(l) and Y^n are independent,
  by the packing lemma, P(E_3) → 0 if R̃ < I(U; Y) − δ(є)
∙ Combining the bounds, P(E) → 0 if R < I(U; Y) − I(U; S) − δ(є′) − δ(є)
Proof of the converse (Heegard–El Gamal 1983)
∙ We will need the Csiszár sum identity: if (U, X^n, Y^n) ∼ F(u, x^n, y^n), then
  ∑_{i=1}^n I(X_{i+1}^n; Y_i | Y^{i−1}, U) = ∑_{i=1}^n I(Y^{i−1}; X_i | X_{i+1}^n, U)
∙ By Fano's inequality,
  nR ≤ I(M; Y^n) + nє_n
     ≤ ∑_{i=1}^n I(M, Y^{i−1}; Y_i) + nє_n
     = ∑_{i=1}^n [I(M, Y^{i−1}, S_{i+1}^n; Y_i) − I(S_{i+1}^n; Y_i | M, Y^{i−1})] + nє_n
     = ∑_{i=1}^n [I(M, Y^{i−1}, S_{i+1}^n; Y_i) − I(Y^{i−1}; S_i | M, S_{i+1}^n)] + nє_n   (Csiszár sum identity)
     = ∑_{i=1}^n [I(M, Y^{i−1}, S_{i+1}^n; Y_i) − I(M, S_{i+1}^n, Y^{i−1}; S_i)] + nє_n    (S_i is independent of (M, S_{i+1}^n))
∙ Now, identify U_i = (M, S_{i+1}^n, Y^{i−1}) (so that U_i → (X_i, S_i) → Y_i), . . . (a sketch of the remaining step follows)
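A sketch of the remaining single-letterization step (standard, but not spelled out on the slide): with U_i = (M, S_{i+1}^n, Y^{i−1}), the chain above gives

  nR ≤ ∑_{i=1}^n [I(U_i; Y_i) − I(U_i; S_i)] + nє_n ≤ n max_{p(u|s), p(x|u,s)} [I(U; Y) − I(U; S)] + nє_n

For fixed p(u|s) the objective is convex in p(x|u, s) (I(U; S) does not depend on it, and I(U; Y) is convex in the induced p(y|u)), so the inner maximum is attained by a deterministic map x(u, s). Dividing by n and letting n → ∞ (so that є_n → 0) gives R ≤ max_{p(u|s), x(u,s)} [I(U; Y) − I(U; S)] = C_SI-E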
Gaussian channel with additive Gaussian state

[Block diagram: M and S^n → Encoder → X^n; Y^n = X^n + S^n + Z^n → Decoder → M̂]

∙ S ∼ N(0, Q) and Z ∼ N(0, 1) are independent
∙ Average power constraint P on X
∙ State information not available at the encoder or the decoder: C = C(P/(1 + Q)), where C(x) = (1/2) log(1 + x)
∙ State information available at the decoder: C_SI-D = C_SI-ED = C(P)
∙ State information available noncausally at the encoder (Costa 1983):

Theorem (Writing on dirty paper)
  C_SI-E = C(P)
Application: Digital watermarking

[Block diagram: host image S^n and message M → Encoder → watermark X^n; Y^n = X^n + S^n + Z^n → Decoder → M̂]

∙ The publisher embeds a watermark X in a host image S
∙ Given S^n, the authentication message M is encoded into the watermark X^n(M, S^n)
∙ The watermark is added to the image to generate the watermarked image X^n + S^n
∙ An authenticator wishes to retrieve M from Y^n = X^n + S^n + Z^n, where Z ∼ N(0, 1)
∙ What is the optimal tradeoff between
  the capacity C (the amount of watermark information) and
  the power of the watermark X (which determines the fidelity of the watermarked image)?
∙ By the writing on dirty paper result, the capacity for watermark power D is C(D)
Proof of achievability
∙ Gelfand–Pinsker theorem for the DMC with DM state and input cost:
  C_SI-E = max_{p(u|s), x(u,s): E(b(X)) ≤ B} (I(U; Y) − I(U; S))
∙ For the Gaussian channel with additive Gaussian state, find the optimal F(u|s) and x(u, s)
∙ Let U = X + αS, where X ∼ N(0, P) is independent of S
∙ With this choice,
  I(U; Y) = (1/2) log [(P + Q + 1)(P + α²Q) / (PQ(1 − α)² + P + α²Q)],
  I(U; S) = (1/2) log [(P + α²Q) / P]
  Thus R(α) = I(U; Y) − I(U; S) = (1/2) log [P(P + Q + 1) / (PQ(1 − α)² + P + α²Q)]
∙ Maximizing with respect to α, we find that α* = P/(P + 1) and R(α*) = C(P) (a numerical check follows)
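A quick numerical sanity check (the values of P and Q are arbitrary) that R(α) is maximized at α* = P/(P + 1) with maximum value C(P):

    import numpy as np

    P, Q = 2.0, 5.0
    alphas = np.linspace(-1.0, 2.0, 30001)
    R = 0.5 * np.log(P * (P + Q + 1) / (P * Q * (1 - alphas) ** 2 + P + alphas ** 2 * Q))
    best = alphas[np.argmax(R)]

    print("numerical argmax over alpha:", best)
    print("P / (P + 1)                :", P / (P + 1))
    print("max R(alpha)               :", R.max())
    print("C(P) = 0.5 * log(1 + P)    :", 0.5 * np.log(1 + P))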
Extensions
∙ Non-Gaussian state (Cohen–Lapidoth 2002): C = C(P)
∙ Vector writing on dirty paper: Read NIT ., .

[Block diagram: M and S^n → Encoder → X^n; Y^n = G X^n + S^n + Z^n → Decoder → M̂]

  Average power constraint: ∑_{i=1}^n E(x_i^T(m, S^n) x_i(m, S^n)) ≤ nP
  S ∼ F(s) and Z ∼ N(0, I_r) are independent
  As in the scalar case, the capacity is the same as if S were not present:
  C = max_{F(x): E(X^T X) ≤ P} I(X; GX + Z) = max_{K_X: tr(K_X) ≤ P} (1/2) log |G K_X G^T + I_r|
  (a water-filling evaluation sketch follows)
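A minimal sketch (the 2 × 2 gain matrix G below is a hypothetical example, not from the lecture) of evaluating the last expression by water-filling over the singular values of G:

    import numpy as np

    def waterfill_capacity(G, P):
        """Return max_{K_X: tr(K_X) <= P} (1/2) log|G K_X G^T + I| (in nats) and the power allocation."""
        sigma2 = np.linalg.svd(G, compute_uv=False) ** 2      # squared singular values of G
        # Find the water level mu with sum_i max(0, mu - 1/sigma2_i) = P by bisection.
        lo, hi = 0.0, P + (1.0 / sigma2).max()
        for _ in range(100):
            mu = (lo + hi) / 2
            if np.maximum(0.0, mu - 1.0 / sigma2).sum() > P:
                hi = mu
            else:
                lo = mu
        power = np.maximum(0.0, mu - 1.0 / sigma2)            # power on each right-singular direction
        return 0.5 * np.log(1.0 + sigma2 * power).sum(), power

    G = np.array([[1.0, 0.5], [0.2, 2.0]])                    # assumed channel gain matrix
    C, power = waterfill_capacity(G, P=3.0)
    print("capacity (nats):", C, " power allocation:", power)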
Summary
∙ DMC with DM state
∙ Channel coding with side information
∙ Shannon strategy
∙ Gelfand–Pinsker coding:
  Multicoding (subcodebook generation)
  Joint typicality encoding
∙ Covering lemma
∙ Writing on dirty paper
∙ Vector writing on dirty paper
References
Cohen, A. S. and Lapidoth, A. (2002). The Gaussian watermarking game. IEEE Trans. Inf. Theory.
Costa, M. H. M. (1983). Writing on dirty paper. IEEE Trans. Inf. Theory.
Gelfand, S. I. and Pinsker, M. S. (1980). Coding for channel with random parameters. Probl. Control Inf. Theory.
Goldsmith, A. J. and Varaiya, P. P. (1997). Capacity of fading channels with channel side information. IEEE Trans. Inf. Theory.
Heegard, C. and El Gamal, A. (1983). On the capacity of computer memories with defects. IEEE Trans. Inf. Theory.
Kuznetsov, A. V. and Tsybakov, B. S. (1974). Coding in a memory with defective cells. Probl. Inf. Transm.
Shannon, C. E. (1958). Channels with side information at the transmitter. IBM J. Res. Develop.