The Likelihood Encoder with Applications to Lossy Compression and Secrecy Eva C. Song Paul Cuff H. Vincent Poor Dept. of Electrical Eng., Princeton University, NJ 08544 {csong, cuff, poor}@princeton.edu
Abstract—A likelihood encoder is studied in the context of lossy source compression. The analysis of the likelihood encoder is based on a soft-covering lemma. It is demonstrated that the use of a likelihood encoder together with the soft-covering lemma recovers the point-to-point rate-distortion function, the rate-distortion function with side information at the decoder, and several other important inner bounds for multi-user lossy compression. The likelihood encoder also provides a way to analyze rate-distortion based secrecy systems. Coupled with hybrid coding, new achievability results are obtained for a joint source-channel secrecy model that outperform an operationally separate source-channel coding scheme.
I. INTRODUCTION

Rate-distortion theory, founded by Shannon in [1] and [2], provides the fundamental limits of lossy source compression: the minimum rate required to represent an independent and identically distributed (i.i.d.) source sequence under a given tolerance of distortion is given by the rate-distortion function. Inspired by Yamamoto [3], recent works [4]–[8] have taken the rate-distortion approach to secrecy problems. Unlike the equivocation notion of secrecy, which measures the statistical independence between the source and the eavesdropper's observation, the rate-distortion approach to secrecy directly measures how much distortion can be forced on the eavesdropper if it must reconstruct the source. This distortion-based notion of secrecy was used in [9], [10] and [11] in the presence of a secret key shared between the encoder and the legitimate receiver. It was shown in [10] that a secret key with any strictly positive rate can force the eavesdropper's reconstruction of the source to be as bad as if she knew only the source distribution, i.e. the distortion under perfect secrecy. This result suggests that, if the decoders have access to different side information instead of a shared secret key, we should be able to force the eavesdropper's reconstruction of the source to the distortion under perfect secrecy, as long as the legitimate receiver's side information is somewhat stronger than the eavesdropper's side information with respect to the source. This is indeed the case, as will be stated formally herein. However, in the more general case, the legitimate receiver may not have the stronger side information. Can a positive distortion still be forced upon the eavesdropper? We will show in this paper that we can encode the source in favor of the legitimate receiver's side information, so that the eavesdropper can make only limited use of the encoded message even with the help of her side information.

Recent work [12] has bridged the gap between rate-distortion based secrecy and equivocation: equivocation turns out to be a special case of the distortion metric when the source is causally disclosed during decoding at the eavesdropper and the distortion function is chosen as the log-loss function. In this paper, we first review the likelihood encoder as a tool for proving achievability results in lossy source compression, and then apply this tool to secrecy problems in a source coding setting and a joint source-channel coding setting.

(This research was supported in part by the Air Force Office of Scientific Research under Grant FA9550-12-1-0196 and MURI Grant FA9550-09-05086, in part by the Army Research Office under MURI Grant W911NF-11-1-0036, and in part by the National Science Foundation under Grants CCF-1116013, CNS-09-05086 and ECCS-1343210. This paper was presented in part at the 2013 IEEE Information Theory Workshop (ITW) and in part at the 2014 Allerton Conference on Communication, Control, and Computing (Allerton).)

II. PRELIMINARIES

A. Notation

A sequence $X_1, \ldots, X_n$ is denoted by $X^n$. Limits taken with respect to "$n \to \infty$" are abbreviated as "$\to_n$". Inequalities with $\limsup_{n\to\infty} h_n \le h$ and $\liminf_{n\to\infty} h_n \ge h$ are abbreviated as $h_n \le_n h$ and $h_n \ge_n h$, respectively. When $X$ denotes a random variable, $x$ is used to denote a realization, $\mathcal{X}$ is used to denote the support of that random variable, and $\Delta_{\mathcal{X}}$ is used to denote the probability simplex of distributions with alphabet $\mathcal{X}$. A Markov relation is denoted by the symbol $-$, e.g. $X - Y - Z$. We use $E_P$, $\mathbb{P}_P$, and $I_P(X;Y)$ to indicate expectation, probability, and mutual information taken with respect to a distribution $P$; when the distribution is clear from the context, the subscript is omitted. To keep the notation uncluttered, the arguments of a distribution are sometimes omitted when the arguments' symbols match the subscripts of the distribution, e.g. $P_{X|Y}(x|y) = P_{X|Y}$. We use a bold capital letter $\mathbf{P}$ to denote that a distribution $P$ is random. We use $\mathbb{R}$ to denote the set of real numbers and $\mathbb{R}_+$ to denote its nonnegative subset.

For a distortion measure $d : \mathcal{X} \times \mathcal{Y} \mapsto \mathbb{R}_+$, we use $E[d(X,Y)]$ to measure the distortion of $X$ incurred by representing it as $Y$. The maximum distortion is defined as

$d_{\max} \triangleq \max_{(x,y) \in \mathcal{X} \times \mathcal{Y}} d(x,y).$

The distortion between two sequences is defined to be the per-letter average distortion

$d(x^n, y^n) = \frac{1}{n} \sum_{t=1}^{n} d(x_t, y_t).$
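For concreteness, these two definitions translate directly into code; a minimal Python sketch (the Hamming measure here is just an illustrative choice of $d$):

```python
def hamming(x, y):
    """Per-symbol Hamming distortion: 0 if x == y, else 1."""
    return 0.0 if x == y else 1.0

def sequence_distortion(xs, ys, d=hamming):
    """Per-letter average distortion d(x^n, y^n) = (1/n) * sum_t d(x_t, y_t)."""
    assert len(xs) == len(ys) and len(xs) > 0
    return sum(d(x, y) for x, y in zip(xs, ys)) / len(xs)

def d_max(alphabet_x, alphabet_y, d=hamming):
    """Maximum per-symbol distortion over the product alphabet."""
    return max(d(x, y) for x in alphabet_x for y in alphabet_y)

print(sequence_distortion([0, 1, 1, 0], [0, 0, 1, 1]))  # 0.5
print(d_max([0, 1], [0, 1]))                            # 1.0
```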
B. Total Variation Distance
Fig. 1: Point-to-point lossy compression setup ($X^n \to$ encoder $f_n \to M \to$ decoder $g_n \to Y^n$).

The total variation distance between two probability measures $P$ and $Q$ on the same $\sigma$-algebra $\mathcal{F}$ of subsets of the sample space $\mathcal{X}$ is defined as

$\|P - Q\|_{TV} \triangleq \sup_{A \in \mathcal{F}} |P(A) - Q(A)|.$
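For a finite alphabet, the supremum is attained and total variation reduces to the half-$\ell_1$ form given in Property 1(a) below; a minimal Python sketch:

```python
def total_variation(p, q):
    """||P - Q||_TV for pmfs p, q on a common finite alphabet,
    via the half-L1 identity of Property 1(a)."""
    assert abs(sum(p) - 1.0) < 1e-9 and abs(sum(q) - 1.0) < 1e-9
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

print(total_variation([0.5, 0.5], [0.7, 0.3]))  # 0.2
```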
Property 1 (cf. Property 2 of [13]). Total variation distance satisfies the following properties:

(a) If $\mathcal{X}$ is countable, then total variation can be rewritten as

$\|P - Q\|_{TV} = \frac{1}{2} \sum_{x \in \mathcal{X}} |p(x) - q(x)|,$  (1)

where $p(\cdot)$ and $q(\cdot)$ are the probability mass functions of $X$ under $P$ and $Q$, respectively.

(b) Let $\varepsilon > 0$ and let $f(x)$ be a function with bounded range of width $b \in \mathbb{R}_+$. Then

$\|P - Q\|_{TV} < \varepsilon \implies |E_P[f(X)] - E_Q[f(X)]| < \varepsilon b.$  (2)

(c) Total variation satisfies the triangle inequality: for any $S \in \Delta_{\mathcal{X}}$,

$\|P - Q\|_{TV} \le \|P - S\|_{TV} + \|S - Q\|_{TV}.$  (3)

(d) Let $P_X P_{Y|X}$ and $Q_X P_{Y|X}$ be two joint distributions on $\Delta_{\mathcal{X} \times \mathcal{Y}}$. Then

$\|P_X P_{Y|X} - Q_X P_{Y|X}\|_{TV} = \|P_X - Q_X\|_{TV}.$  (4)

(e) For any $P, Q \in \Delta_{\mathcal{X} \times \mathcal{Y}}$,

$\|P_X - Q_X\|_{TV} \le \|P_{XY} - Q_{XY}\|_{TV}.$  (5)

C. The Likelihood Encoder

We now define the likelihood encoder, operating at rate $R$, which receives a sequence $x_1, \ldots, x_n$ and maps it to a message $M \in [1 : 2^{nR}]$. In normal usage, a decoder will then use $M$ to form an approximate reconstruction of the $x_1, \ldots, x_n$ sequence. The encoder is specified by a codebook of $y^n(m)$ sequences and a joint distribution $P_{XY}$. Consider the likelihood function for each codeword, with respect to a memoryless channel from $Y$ to $X$, defined as

$L(m | x^n) \triangleq P_{X^n|Y^n}(x^n | y^n(m)).$

A likelihood encoder is a stochastic encoder that determines the message index with probability proportional to $L(m|x^n)$, i.e.

$P_{M|X^n}(m | x^n) = \frac{L(m|x^n)}{\sum_{m' \in [1:2^{nR}]} L(m'|x^n)}.$

D. Soft-Covering Lemma

Here we introduce only the basic version of the soft-covering lemma, which is strong enough for the analysis of source coding problems. For secrecy, a stronger version of the soft-covering lemma is needed. The role of the soft-covering lemma in analyzing the likelihood encoder is analogous to that of the joint asymptotic equipartition property (J-AEP) in the analysis of joint-typicality encoders. The general idea of the soft-covering lemma is that the distribution induced by selecting a codeword uniformly at random from a random codebook and passing it through a memoryless channel is close to an i.i.d. distribution, as long as the codebook is large enough.

Lemma 1 (Lemma IV.1 of [14]). Given a joint distribution $P_{XY}$, let $\mathcal{C}^{(n)}$ be a random collection of sequences $Y^n(m)$, $m = 1, \ldots, 2^{nR}$, each drawn independently according to $\prod_{t=1}^n P_Y(y_t)$. Denote by $P_{X^n}$ the output distribution induced by selecting an index $m$ uniformly at random and applying $Y^n(m)$ to the memoryless channel specified by $P_{X|Y}$. Then if $R > I(X;Y)$,

$E_{\mathcal{C}^{(n)}} \left[ \left\| P_{X^n} - \prod_{t=1}^{n} P_X \right\|_{TV} \right] \to_n 0.$
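To make the last two subsections concrete, the following Python sketch draws a random codebook, samples from the induced (soft-covering) distribution, and runs the likelihood encoder. The binary alphabets, the joint distribution, and the rate are illustrative assumptions, chosen so that $R > I(X;Y) \approx 0.4$ here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint distribution P_XY on binary alphabets (an assumption).
P_Y = np.array([0.5, 0.5])
P_X_given_Y = np.array([[0.9, 0.1],    # P(x | y = 0)
                        [0.2, 0.8]])   # P(x | y = 1)

n, R = 12, 0.8                          # blocklength and rate (illustrative)
M = int(2 ** (n * R))                   # codebook size

# Codebook: 2^{nR} codewords drawn i.i.d. from P_Y.
codebook = rng.choice(2, size=(M, n), p=P_Y)

def likelihood_encoder(x):
    """Sample m with probability proportional to L(m|x^n) = P_{X^n|Y^n}(x^n | y^n(m))."""
    # log L(m|x^n) = sum_t log P_{X|Y}(x_t | y_t(m)); work in the log domain for stability.
    logL = np.log(P_X_given_Y[codebook, x[None, :]]).sum(axis=1)
    w = np.exp(logL - logL.max())
    return rng.choice(M, p=w / w.sum())

# Soft-covering direction: pick m uniformly, pass y^n(m) through P_{X|Y}.
m = rng.integers(M)
x = np.array([rng.choice(2, p=P_X_given_Y[y]) for y in codebook[m]])
print("sampled x^n:", x, " re-encoded index:", likelihood_encoder(x))

# Crude sanity check of the first-order marginal only (Lemma 1 itself concerns
# the full n-letter distribution): it should be close to P_X = P_Y @ P_X_given_Y.
samples = []
for _ in range(2000):
    m = rng.integers(M)
    samples.extend(rng.choice(2, p=P_X_given_Y[y]) for y in codebook[m])
print("empirical P_X(1):", np.mean(samples), " target:", (P_Y @ P_X_given_Y)[1])
```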
III. LOSSY SOURCE COMPRESSION

We demonstrate the application of the likelihood encoder to lossy source compression problems using the point-to-point communication setting. The work in this section was originally presented in [15] and [16].

A. Problem Setup and Result Review

Rate-distortion theory determines the optimal compression rate $R$ for an i.i.d. source sequence $X^n$ distributed according to $X_t \sim \bar{P}_X$ with the following constraints:
• Encoder $f_n : \mathcal{X}^n \mapsto \mathcal{M}$ (possibly stochastic);
• Decoder $g_n : \mathcal{M} \mapsto \mathcal{Y}^n$ (possibly stochastic);
• Compression rate $R$, i.e. $|\mathcal{M}| = 2^{nR}$.
The system performance is measured according to the time-averaged distortion (as defined in the notation section):
• Average distortion: $d(X^n, Y^n) = \frac{1}{n} \sum_{t=1}^n d(X_t, Y_t)$.

Definition 1. A rate-distortion pair $(R, D)$ is achievable if there exists a sequence of rate $R$ encoders and decoders $(f_n, g_n)$ such that $E[d(X^n, Y^n)] \le_n D$.

Definition 2. The rate-distortion function is $R(D) \triangleq \inf \{R : (R, D) \text{ is achievable}\}$.

The above mathematical formulation is illustrated in Fig. 1. The characterization of this fundamental quantity in information theory is given in [17] as

$R(D) = \min_{\bar{P}_{Y|X} : E[d(X,Y)] \le D} I_P(X;Y),$  (6)

where the mutual information is taken with respect to $\bar{P}_{XY} = \bar{P}_X \bar{P}_{Y|X}$. In other words, distortion level $D$ can be achieved with any rate greater than the $R(D)$ given in (6).
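The minimization in (6) is a convex problem and is classically computed with the Blahut-Arimoto algorithm. The sketch below is one standard variant of that algorithm, not a method from this paper; the source, distortion matrix, and slope parameter `s` are illustrative assumptions (`s` trades rate against distortion along the $R(D)$ curve):

```python
import numpy as np

def blahut_arimoto(p_x, d, s, iters=200):
    """One point on the R(D) curve for source pmf p_x and distortion matrix
    d[x, y], at Lagrange slope s >= 0.  Returns (R, D), with R in bits."""
    nx, ny = d.shape
    q_y = np.full(ny, 1.0 / ny)                    # output marginal, init uniform
    for _ in range(iters):
        w = q_y[None, :] * np.exp(-s * d)          # unnormalized test channel P(y|x)
        p_y_given_x = w / w.sum(axis=1, keepdims=True)
        q_y = p_x @ p_y_given_x                    # update output marginal
    joint = p_x[:, None] * p_y_given_x
    D = float((joint * d).sum())
    ratio = np.where(joint > 0, joint / (p_x[:, None] * q_y[None, :]), 1.0)
    R = float((joint * np.log2(ratio)).sum())      # I(X;Y) in bits
    return R, D

# Bernoulli(0.2) source with Hamming distortion: R(D) = h(0.2) - h(D).
p_x = np.array([0.8, 0.2])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
print(blahut_arimoto(p_x, d, s=3.0))
```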
B. Achievability Proof Using the Likelihood Encoder

To prove achievability, we use the likelihood encoder and approximate the overall behavior of the system by a well-behaved distribution; the soft-covering lemma allows us to claim that the approximating distribution matches the system. Let $R > R(D)$, where $R(D)$ is the right-hand side of (6). We prove that $R$ is achievable for distortion $D$. By the rate-distortion formula in (6), we can fix $\bar{P}_{Y|X}$ such that $R > I_P(X;Y)$ and $E_P[d(X,Y)] < D$. We will use the likelihood encoder derived from $\bar{P}_{XY}$ and a random codebook $\{y^n(m)\}$ generated according to $\bar{P}_Y$ to prove the result. The decoder will simply reproduce $y^n(M)$ upon receiving the message $M$. The distribution induced by the encoder and decoder is

$P_{X^n M Y^n}(x^n, m, y^n) = \bar{P}_{X^n}(x^n) P_{M|X^n}(m|x^n) P_{Y^n|M}(y^n|m)$  (7)
$\triangleq \bar{P}_{X^n}(x^n) P_{LE}(m|x^n) P_D(y^n|m),$  (8)
where $P_{LE}$ is the likelihood encoder and $P_D$ is a codeword-lookup decoder. We now concisely restate the behavior of the encoder and decoder as components of the induced distribution.

Codebook generation: We independently generate $2^{nR}$ sequences in $\mathcal{Y}^n$ according to $\prod_{t=1}^n \bar{P}_Y(y_t)$ and index them by $m \in [1 : 2^{nR}]$. We use $\mathcal{C}^{(n)}$ to denote the random codebook.

Encoder: The encoder $P_{LE}(m|x^n)$ is the likelihood encoder that chooses $M$ stochastically with probability proportional to the likelihood function given by

$L(m|x^n) = \bar{P}_{X^n|Y^n}(x^n | Y^n(m)).$

Decoder: The decoder $P_D(y^n|m)$ is a codeword-lookup decoder that simply reproduces $Y^n(m)$.

Analysis: We will consider two distributions for the analysis, the induced distribution $P$ and an approximating distribution $Q$, which is much easier to analyze. We will show that $P$ and $Q$ are close in total variation (on average over the random codebook); hence, $P$ achieves the performance of $Q$. Design the approximating distribution $Q$ via a uniform distribution over the random codebook and a test channel $\bar{P}_{X|Y}$, as shown in Fig. 2. We will refer to a distribution of this structure as an idealized distribution. The joint distribution under the idealized distribution $Q$ can be written as

$Q_{X^n M Y^n}(x^n, m, y^n) = Q_M(m) Q_{Y^n|M}(y^n|m) Q_{X^n|M}(x^n|m)$  (9)
$= \frac{1}{2^{nR}} \mathbf{1}\{y^n = Y^n(m)\} \prod_{t=1}^n \bar{P}_{X|Y}(x_t | Y_t(m))$  (10)
$= \frac{1}{2^{nR}} \mathbf{1}\{y^n = Y^n(m)\} \prod_{t=1}^n \bar{P}_{X|Y}(x_t | y_t).$  (11)

Fig. 2: Idealized distribution with test channel $\bar{P}_{X|Y}$ ($M \to \mathcal{C}^{(n)} \to Y^n(M) \to \bar{P}_{X|Y} \to X^n$).

The idealized distribution $Q$ satisfies the following property: for any $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$,

$E_{\mathcal{C}^{(n)}}[Q_{X^n Y^n}(x^n, y^n)] = \bar{P}_{X^n Y^n}(x^n, y^n),$  (12)

where $\bar{P}_{X^n Y^n}$ denotes the i.i.d. distribution $\prod_{t=1}^n \bar{P}_{XY}$. This implies, in particular, that the distortion under the idealized distribution $Q$, averaged over the random codebook, conveniently simplifies to $E_P[d(X,Y)]$. That is,

$E_{\mathcal{C}^{(n)}}[E_Q[d(X^n, Y^n)]] = E_P[d(X,Y)].$  (13)

It is worth emphasizing that although $Q$ is very different from the i.i.d. distribution on $(X^n, Y^n)$, it is exactly the i.i.d. distribution when averaged over codebooks and thus achieves the same expected distortion. Our motivation for using the likelihood encoder comes from this construction of $Q$. Notice that

$Q_{M|X^n}(m|x^n) = P_{LE}(m|x^n)$  (14)

and

$Q_{Y^n|M}(y^n|m) = P_D(y^n|m).$  (15)
Now invoking the soft-covering lemma, since $R > I_P(X;Y)$, we have

$E_{\mathcal{C}^{(n)}} \left[ \| \bar{P}_{X^n} - Q_{X^n} \|_{TV} \right] \le \epsilon_n \to_n 0.$

This gives us

$E_{\mathcal{C}^{(n)}}[\| P_{X^n Y^n} - Q_{X^n Y^n} \|_{TV}]$
$\le E_{\mathcal{C}^{(n)}}[\| P_{X^n Y^n M} - Q_{X^n Y^n M} \|_{TV}]$  (16)
$\le \epsilon_n,$  (17)

where (16) follows from Property 1(e), and (17) follows from (14), (15) and Property 1(d). By Property 1(b),

$|E_P[d(X^n, Y^n)] - E_Q[d(X^n, Y^n)]| \le d_{\max} \|P - Q\|_{TV}.$  (18)

Finally, we apply the random coding argument:

$E_{\mathcal{C}^{(n)}}[E_P[d(X^n, Y^n)]]$
$\le E_{\mathcal{C}^{(n)}}[E_Q[d(X^n, Y^n)]] + E_{\mathcal{C}^{(n)}}[|E_P[d(X^n, Y^n)] - E_Q[d(X^n, Y^n)]|]$  (19)
$\le E_P[d(X,Y)] + d_{\max} E_{\mathcal{C}^{(n)}}[\|P - Q\|_{TV}]$  (20)
$\le E_P[d(X,Y)] + d_{\max} \epsilon_n$  (21)
$\le_n D,$  (22)

where (20) follows from (13) and (18), and (21) follows from (17). Therefore, there exists a codebook satisfying the requirement.

Remark 1. As the proof emphasizes, the distribution $Q$ serves as an accurate approximation to the true system behavior, and this is not unique to the likelihood encoder. In [18] a converse statement is shown: any efficient source encoding satisfying a distortion constraint behaves like $Q$ as measured by normalized divergence. However, a stochastic encoder is generally required for the approximation to hold in total variation. Furthermore, for the likelihood encoder, the accuracy of this approximation is easily verified using the soft-covering lemma; for other encoders, the proof requires more effort to establish. This analysis also carries over to more complicated source coding problems, such as the Wyner-Ziv setting and the Berger-Tung setting [19], [16].

IV. RATE-DISTORTION BASED SECRECY

Having understood the fundamental tool, we can apply this type of analysis to secrecy problems. We consider a source coding setup over a noiseless wiretap channel in which the legitimate receiver and the eavesdropper have side information correlated with the source at their decoders. The result of this section was originally presented in [8].

A. Problem Setup

We want to determine the rate-distortion region for a secrecy system with an i.i.d. source and two side information sequences $(X^n, B^n, W^n)$ distributed according to $\prod_{t=1}^n \bar{P}_{XBW}(x_t, b_t, w_t)$, satisfying the following constraints:
• Encoder $f_n : \mathcal{X}^n \mapsto \mathcal{M}$ (possibly stochastic);
• Legitimate receiver decoder $g_n : \mathcal{M} \times \mathcal{B}^n \mapsto \mathcal{Y}^n$ (possibly stochastic);
• Eavesdropper decoder $P_{Z^n | M W^n}$;
• Compression rate $R$, i.e. $|\mathcal{M}| = 2^{nR}$.
The system performance is measured according to the following distortion metrics:
• Average distortion for the legitimate receiver: $E[d_b(X^n, Y^n)] \le_n D_b$;
• Minimum average distortion for the eavesdropper: $\min_{P_{Z^n|MW^n}} E[d_w(X^n, Z^n)] \ge_n D_w$.

Note that $d_b$ and $d_w$ may be the same or different distortion measures.

Definition 3. The rate-distortion triple $(R, D_b, D_w)$ is achievable if there exists a sequence of rate $R$ encoders and decoders $(f_n, g_n)$ such that $E[d_b(X^n, Y^n)] \le_n D_b$ and

$\min_{P_{Z^n|MW^n}} E[d_w(X^n, Z^n)] \ge_n D_w.$

The above mathematical formulation is illustrated in Fig. 3.

Fig. 3: Secrecy system setup with side information at the decoders ($X^n \to f_n \to M$; the legitimate decoder $g_n$ forms $Y^n$ from $(M, B^n)$; the eavesdropper $P_{Z^n|MW^n}$ forms $Z^n$ from $(M, W^n)$).
B. Main Result

Here we only recap the main result for this setting and outline the encoding and decoding scheme.

Theorem 1. A rate-distortion triple $(R, D_b, D_w)$ is achievable if

$R > I(V; X | B)$  (23)
$D_b \ge E[d_b(X, Y)]$  (24)
$D_w \le \min_{z(u,w)} E[d_w(X, z(U, W))]$  (25)
$I(V; B | U) > I(V; W | U)$  (26)

for some $\bar{P}_{UVXBW} = \bar{P}_{XBW} \bar{P}_{V|X} \bar{P}_{U|V}$, where $Y = \phi(V, B)$ for some function $\phi(\cdot, \cdot)$.
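Checking the conditions of Theorem 1 for a candidate distribution is a finite computation over the joint pmf. A hedged sketch follows; the axis ordering of the pmf tensor and the random test pmf are assumptions for illustration only (a valid candidate must factor as $\bar{P}_{XBW} \bar{P}_{V|X} \bar{P}_{U|V}$):

```python
import numpy as np

def entropy_bits(p):
    """Entropy in bits of a pmf array of any shape (0 log 0 := 0)."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def cond_mi(P, a, b, c=()):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C) for a joint pmf array P;
    a, b, c are tuples of axis indices."""
    axes = set(range(P.ndim))
    marg = lambda keep: P.sum(axis=tuple(axes - set(keep)))
    return (entropy_bits(marg(a + c)) + entropy_bits(marg(b + c))
            - entropy_bits(marg(a + b + c)) - entropy_bits(marg(c)))

# Axes 0..4 stand for U, V, X, B, W (an assumed ordering).  The random pmf
# below only exercises the helper.  Conditions (23) and (26) then read:
#   R > cond_mi(P, (1,), (2,), (3,))                            # I(V;X|B)
#   cond_mi(P, (1,), (3,), (0,)) > cond_mi(P, (1,), (4,), (0,)) # I(V;B|U) > I(V;W|U)
P = np.random.default_rng(1).random((2,) * 5)
P /= P.sum()
print(cond_mi(P, (1,), (3,), (0,)), cond_mi(P, (1,), (4,), (0,)))
```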
Theorem 1 involves two auxiliary variables $U$ and $V$ that are correlated with the source $X$ in a Markov chain relationship. The variable $V$ can be understood as the lossy representation of $X$ that is communicated efficiently to the intended receiver using random binning and then combined with the side information $B$ to estimate $X$, just as in the setting without an eavesdropper pioneered by [20]. The purpose of the auxiliary variable $U$ is to provide secrecy, similar to the way secrecy is achieved in [21]. The side information at the intended receiver must be better than that of the eavesdropper (as measured by mutual information with $V$) in order to prevent the eavesdropper from decoding $V$. The variable $U$ (if needed) is given away to all parties as the first layer of a superposition code in order to create this condition for $V$.

C. Achievability Scheme Using the Likelihood Encoder

The source is encoded into four messages $M_p$, $M_p'$, $M_s$ and $M_s'$, where $M_p$ and $M_s$ are transmitted, and $M_p'$ and $M_s'$ are virtual messages that are not physically transmitted but will be recovered with small error at the legitimate receiver with the help of the side information. $M_p$ and $M_p'$ play the role of public messages, which both the legitimate receiver and the eavesdropper will decode; $M_s$ and $M_s'$ index a codeword that is kept secret from the eavesdropper and that only the legitimate receiver can make sense of with its own side information.

Fix a distribution $\bar{P}_{UVXBW} = \bar{P}_U \bar{P}_{V|U} \bar{P}_{X|V} \bar{P}_{BW|X}$ satisfying

$I_P(V; B | U) > I_P(V; W | U),$
$E_P[d_b(X, \phi(V, B))] \le D_b,$
$\min_{z(u,w)} E_P[d_w(X, z(U, W))] \ge D_w,$

and fix rates $R_p, R_p', R_s, R_s'$ such that

$R_p + R_p' > I_P(U; X),$
$R_p' < I_P(U; B),$
$R_s + R_s' > I_P(X; V | U),$
$I_P(V; W | U) < R_s' < I_P(V; B | U).$

The distribution induced by the encoder and decoder is

$P(x^n, b^n, w^n, m_p, m_p', m_s, m_s', \hat{m}_p', \hat{m}_s', y^n)$
$= \bar{P}_{X^n B^n W^n}(x^n, b^n, w^n) P_E(m_p, m_p', m_s, m_s' | x^n) P_D(\hat{m}_p', \hat{m}_s' | m_p, m_s, b^n) P_\Phi(y^n | m_p, \hat{m}_p', m_s, \hat{m}_s', b^n),$  (27)

where $P_E(m_p, m_p', m_s, m_s' | x^n)$ is the source encoder; $P_D(\hat{m}_p', \hat{m}_s' | m_p, m_s, b^n)$ is the first part of the decoder, which estimates $m_p'$ and $m_s'$ as $\hat{m}_p'$ and $\hat{m}_s'$; and $P_\Phi(y^n | m_p, \hat{m}_p', m_s, \hat{m}_s', b^n)$ is the second part of the decoder, which reconstructs the source sequence.

Codebook generation: We independently generate $2^{n(R_p + R_p')}$ sequences in $\mathcal{U}^n$ according to $\prod_{t=1}^n \bar{P}_U(u_t)$ and index them by $(m_p, m_p') \in [1 : 2^{nR_p}] \times [1 : 2^{nR_p'}]$. We use $\mathcal{C}_U^{(n)}$ to denote this random codebook. For each $(m_p, m_p') \in [1 : 2^{nR_p}] \times [1 : 2^{nR_p'}]$, we independently generate $2^{n(R_s + R_s')}$ sequences in $\mathcal{V}^n$ according to $\prod_{t=1}^n \bar{P}_{V|U}(v_t | u_t(m_p, m_p'))$ and index them by $(m_p, m_p', m_s, m_s')$, $(m_s, m_s') \in [1 : 2^{nR_s}] \times [1 : 2^{nR_s'}]$. We use $\mathcal{C}_V^{(n)}(m_p, m_p')$ to denote this random codebook.

Encoder: The encoder $P_E(m_p, m_p', m_s, m_s' | x^n)$ is a likelihood encoder [19] that chooses $M_p, M_p', M_s, M_s'$ stochastically according to the probability

$P_E(m | x^n) = \frac{L(m | x^n)}{\sum_{\bar{m} \in \mathcal{M}} L(\bar{m} | x^n)},$

where $m = (m_p, m_p', m_s, m_s')$, $\mathcal{M} = [1 : 2^{nR_p}] \times [1 : 2^{nR_p'}] \times [1 : 2^{nR_s}] \times [1 : 2^{nR_s'}]$, and

$L(m | x^n) = \bar{P}_{X^n | V^n}(x^n | v^n(m)).$

Decoder: The decoder has two parts. Let $P_D(\hat{m}_p', \hat{m}_s' | m_p, m_s, b^n)$ be a good channel decoder with respect to the superposition sub-codebook $\{v^n(m_p, a_p, m_s, a_s)\}_{a_p, a_s}$ and the memoryless channel $\bar{P}_{B|V}$. For the second part of the decoder, fix a function $\phi(\cdot, \cdot)$. Define $\phi^n(v^n, b^n)$ as the concatenation $\{\phi(v_t, b_t)\}_{t=1}^n$ and set the decoder $P_\Phi$ to be the deterministic function

$P_\Phi(y^n | m_p, \hat{m}_p', m_s, \hat{m}_s', b^n) \triangleq \mathbf{1}\{y^n = \phi^n(v^n(m_p, \hat{m}_p', m_s, \hat{m}_s'), b^n)\}.$

The performance analysis of this scheme can be found in [8].

Fig. 4: Side information $B$ and $W$ correlated with source $X$ ($B$ is obtained from $X$ through a BEC with erasure probability $\alpha$; $W$ through a BSC with crossover probability $\beta$).

D. Example

We give an example for the lossless compression case with the Hamming distortion measure for the eavesdropper. The Hamming distortion measure is defined as $d(x, y) = 0$ if $x = y$ and $d(x, y) = 1$ otherwise. Let $X^n$ be an i.i.d. Bern$(p)$ source, and let $B^n$ and $W^n$ be side information obtained through a binary erasure channel (BEC) and a binary symmetric channel (BSC), respectively, i.e. $\bar{P}_X(0) = 1 - \bar{P}_X(1) = 1 - p$, $\bar{P}_{B|X}(e|x) = \alpha$, $\bar{P}_{W|X}(1-x|x) = \beta$. This is illustrated in Fig. 4. This type of side information was also considered in [22], but only with a Bern$(\frac{1}{2})$ source. We plot the distortion at the eavesdropper as a function of the source distribution $p$ for fixed $\alpha$ and $\beta$ in Figs. 5 and 6.
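For this example the relevant mutual informations have closed forms: $I(X;B) = (1-\alpha)h(p)$ for the BEC and $I(X;W) = h(p \star \beta) - h(\beta)$ for the BSC, where $h$ is the binary entropy function and $p \star \beta = p(1-\beta) + (1-p)\beta$. A small sketch locating the crossover visible in Fig. 5 (the parameter values mirror the figure; the grid is an arbitrary choice):

```python
import numpy as np

def h(p):
    """Binary entropy in bits."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def mi_gap(p, alpha, beta):
    """I(X;B) - I(X;W) for X ~ Bern(p), B from a BEC(alpha), W from a BSC(beta)."""
    i_xb = (1 - alpha) * h(p)                              # BEC side information
    i_xw = h(p * (1 - beta) + (1 - p) * beta) - h(beta)    # BSC side information
    return i_xb - i_xw

# Parameters mirroring Fig. 5 (alpha = 0.4, beta = 0.04): locate the point
# beyond which the eavesdropper's side information becomes more capable.
for p in np.linspace(0.005, 0.5, 100):
    if mi_gap(p, 0.4, 0.04) < 0:
        print(f"I(X;B) < I(X;W) from about p = {p:.3f}")
        break
```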
Fig. 5: Distortion at the eavesdropper as a function of source distribution $p$ with $\alpha = 0.4$, $\beta = 0.04$ (curves: inner bound, outer bound, and $I(X;B) - I(X;W)$).
In Fig. 5, when the legitimate receiver's side information is more capable than the eavesdropper's side information with respect to the source, perfect secrecy at the eavesdropper is achieved; when the eavesdropper's side information is more capable than the legitimate receiver's, our encoding scheme still achieves a positive distortion at the eavesdropper, with no additional cost on the compression rate needed to ensure lossless decoding at the legitimate receiver. It is worth noting that our scheme encodes the source so that it favors the side information of the legitimate receiver even when the legitimate receiver's side information is less capable, as opposed to the regular Wyner-Ziv (Slepian-Wolf) encoding scheme, which gives the same compression rate but no distortion at the eavesdropper. In Fig. 6, since the legitimate receiver's side information is always more capable than the eavesdropper's side information, perfect secrecy is ensured.

Fig. 6: Distortion at the eavesdropper as a function of source distribution $p$ with $\alpha = 0.4$, $\beta = 0.1$ (curves: inner bound, outer bound, and $I(X;B) - I(X;W)$).

V. SECURE SOURCE-CHANNEL CODING

In this section, we review how the likelihood encoder can be used in a source-channel coding setting to achieve secrecy through a joint source-channel coding technique, hybrid coding. This work was originally presented in [23].

A. Problem Setup

We want to determine the conditions for a joint source-channel secrecy system under which we can guarantee reliable communication to the legitimate receiver while a certain level of distortion is forced upon the eavesdropper. The input of the system is an i.i.d. source sequence $S^n$ distributed according to $\prod_{t=1}^n \bar{P}_S(s_t)$, and the channel is a memoryless broadcast channel $\prod_{t=1}^n \bar{P}_{YZ|X}(y_t, z_t | x_t)$. The source realization is causally disclosed to the eavesdropper during decoding. The source-channel coding model satisfies the following constraints:
• Encoder $f_n : \mathcal{S}^n \mapsto \mathcal{X}^n$ (possibly stochastic);
• Legitimate receiver decoder $g_n : \mathcal{Y}^n \mapsto \hat{\mathcal{S}}^n$ (possibly stochastic);
• Eavesdropper decoders $\{P_{\check{S}_t | Z^n S^{t-1}}\}_{t=1}^n$.
The system performance is measured by a distortion metric $d(\cdot, \cdot)$ as follows:
• Average distortion for the legitimate receiver: $E[d(S^n, \hat{S}^n)] \le_n D_b$;
• Minimum average distortion for the eavesdropper: $\min_{\{P_{\check{S}_t|Z^n S^{t-1}}\}_{t=1}^n} E[d(S^n, \check{S}^n)] \ge_n D_e$.

Definition 4. A distortion pair $(D_b, D_e)$ is achievable if there exists a sequence of source-channel encoders and decoders $(f_n, g_n)$ such that $E[d(S^n, \hat{S}^n)] \le_n D_b$ and

$\min_{\{P_{\check{S}_t|Z^n S^{t-1}}\}_{t=1}^n} E[d(S^n, \check{S}^n)] \ge_n D_e.$

Fig. 7: Joint source-channel secrecy system setup with causal source disclosure at the eavesdropper ($S^n \to f_n \to X^n \to \bar{P}_{YZ|X}$; the legitimate decoder $g_n$ forms $\hat{S}_t$ from $Y^n$; the eavesdropper (Eve) forms $\check{S}_t$ from $(Z^n, S^{t-1})$, $t = 1, \ldots, n$).
The above mathematical formulation is illustrated in Fig. 7.

B. Main Result

Here we recap the main achievability result, obtained using hybrid coding and analyzed with the likelihood encoder. An achievability region using basic secure hybrid coding is given in the following theorem.

Theorem 2. A distortion pair $(D_b, D_e)$ is achievable if

$I(U; S) < I(U; Y)$  (28)
$D_b \ge E[d(S, \phi(U, Y))]$  (29)
$D_e \le \beta \min_{\psi_0(z)} E[d(S, \psi_0(Z))] + (1 - \beta) \min_{\psi_1(u,z)} E[d(S, \psi_1(U, Z))]$  (30)

where

$\beta = \min \left\{ \frac{[I(U; Y) - I(U; Z)]^+}{I(S; U | Z)}, 1 \right\}$  (31)

for some distribution $\bar{P}_S \bar{P}_{U|S} \bar{P}_{X|SU} \bar{P}_{YZ|X}$ and function $\phi(\cdot, \cdot)$.
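Once the three mutual informations are computed for a candidate distribution, the weight $\beta$ in (31) is simple arithmetic; a tiny sketch with placeholder values (the numbers are illustrative assumptions, not taken from the paper's example):

```python
def secrecy_weight(i_uy, i_uz, i_su_given_z):
    """beta = min{ [I(U;Y) - I(U;Z)]^+ / I(S;U|Z), 1 } as in (31)."""
    if i_su_given_z <= 0:
        return 1.0  # degenerate case: the eavesdropper learns nothing about S from U
    return min(max(i_uy - i_uz, 0.0) / i_su_given_z, 1.0)

# Placeholder mutual informations (illustrative only).
print(secrecy_weight(i_uy=0.6, i_uz=0.35, i_su_given_z=0.5))  # 0.5
```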
C. Scheme Outline

The source and channel distributions $\bar{P}_S$ and $\bar{P}_{YZ|X}$ are given by the problem statement. Fix a joint distribution $\bar{P}_S \bar{P}_{U|S} \bar{P}_{X|SU} \bar{P}_{YZ|X}$. We will use $\bar{P}_{S^n}$ to denote $\prod_{t=1}^n \bar{P}_S$.

Codebook generation: We independently generate $2^{nR}$ sequences in $\mathcal{U}^n$ according to $\prod_{t=1}^n \bar{P}_U(u_t)$ and index them by $m \in [1 : 2^{nR}]$. We use $\mathcal{C}^{(n)}$ to denote this random codebook.

Encoder: Encoding has two steps. In the first step, a likelihood encoder $P_{LE}(m|s^n)$ is used. It chooses $M$ stochastically according to the probability

$P_{LE}(m | s^n) = \frac{L(m | s^n)}{\sum_{\bar{m} \in \mathcal{M}} L(\bar{m} | s^n)},$  (32)

where $\mathcal{M} = [1 : 2^{nR}]$ and

$L(m | s^n) = \bar{P}_{S^n | U^n}(s^n | u^n(m)).$  (33)

In the second step, the encoder produces the channel input through the random transformation

$\prod_{t=1}^n \bar{P}_{X|SU}(x_t | s_t, U_t(m)).$

Decoder: Decoding also has two steps. In the first step, let $P_{D1}(\hat{m} | y^n)$ be a good channel decoder with respect to the codebook $\{u^n(a)\}_a$ and the memoryless channel $\bar{P}_{Y|X}$. In the second step, fix a function $\phi(\cdot, \cdot)$. Define $\phi^n(u^n, y^n)$ as the concatenation $\{\phi(u_t, y_t)\}_{t=1}^n$ and set the decoder $P_{D2}$ to be the deterministic function

$P_{D2}(\hat{s}^n | \hat{m}, y^n) \triangleq \mathbf{1}\{\hat{s}^n = \phi^n(u^n(\hat{m}), y^n)\}.$  (34)
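A minimal sketch of the two-step hybrid encoding just described: a likelihood encoder selects the index $m$ from a $U^n$ codebook, and the channel input is then generated symbol-by-symbol through $\bar{P}_{X|SU}$. All pmfs, alphabets, and the rate below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative binary model (assumptions): S ~ Bern(0.3), U correlated with S,
# and channel input X drawn from P(x | s, u).
P_S = np.array([0.7, 0.3])
P_U_given_S = np.array([[0.8, 0.2], [0.3, 0.7]])           # rows indexed by s
joint_su = P_S[:, None] * P_U_given_S                      # P(s, u)
P_S_given_U = (joint_su / joint_su.sum(axis=0)).T          # [u, s] -> P(s | u)
P_X_given_SU = rng.dirichlet(np.ones(2), size=(2, 2))      # [s, u] -> pmf over x

n, R = 10, 0.5
M = int(2 ** (n * R))
U_codebook = rng.choice(2, size=(M, n), p=(P_S @ P_U_given_S))  # i.i.d. P_U

def hybrid_encode(s):
    # Step 1: likelihood encoder over the U^n codebook,
    # L(m|s^n) = P_{S^n|U^n}(s^n | u^n(m)), computed in the log domain.
    logL = np.log(P_S_given_U[U_codebook, s[None, :]]).sum(axis=1)
    w = np.exp(logL - logL.max())
    m = rng.choice(M, p=w / w.sum())
    # Step 2: symbol-by-symbol channel input x_t ~ P(x | s_t, U_t(m)).
    x = np.array([rng.choice(2, p=P_X_given_SU[st, ut])
                  for st, ut in zip(s, U_codebook[m])])
    return m, x

s = rng.choice(2, size=n, p=P_S)
print(hybrid_encode(s))
```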
The analysis of this scheme can be found in [23].

D. Example

The source is distributed i.i.d. according to Bern$(p)$, and the channels to the legitimate receiver and the eavesdropper are binary symmetric channels with crossover probabilities $p_1 = 0$ and $p_2 = 0.3$, respectively. For simplicity, we require lossless decoding at the legitimate receiver. Hamming distance is considered for the distortion at the eavesdropper. A numerical comparison of the joint source-channel coding scheme (Scheme I), which uses hybrid coding, with an operationally separate source-channel coding scheme (Scheme O) is shown in Fig. 8, together with the no-encoding and perfect-secrecy baselines and an outer bound. The choice of the auxiliary random variable $U$ in Scheme I is $SX$, which may not be the optimal choice but is good enough to outperform Scheme O.
Fig. 8: Distortion at the eavesdropper as a function of source distribution p with p1 = 0, p2 = 0.3.
REFERENCES

[1] C. E. Shannon, "A mathematical theory of communication," Bell Sys. Tech. Journal, vol. 27, pp. 379–423, 623–656, 1948.
[2] C. E. Shannon, "Coding theorems for a discrete source with a fidelity criterion," IRE National Convention Record, Part 4, pp. 142–163, 1959.
[3] H. Yamamoto, "Rate-distortion theory for the Shannon cipher system," IEEE Transactions on Information Theory, vol. 43, pp. 827–835, 1997.
[4] P. Cuff, "Using a secret key to foil an eavesdropper," in Proc. 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1405–1411, Sept. 2010.
[5] P. Cuff, "A framework for partial secrecy," in Proc. IEEE Global Telecommunications Conference (GLOBECOM), pp. 1–5, Dec. 2010.
[6] C. Schieler and P. Cuff, "Secrecy is cheap if the adversary must reconstruct," in Proc. IEEE International Symposium on Information Theory (ISIT), pp. 66–70, July 2012.
[7] E. C. Song, E. Soljanin, P. Cuff, H. V. Poor, and K. Guan, "Rate-distortion-based physical layer secrecy with applications to multimode fiber," IEEE Transactions on Communications, vol. 62, pp. 1080–1090, Mar. 2014.
[8] E. C. Song, P. Cuff, and H. V. Poor, "A rate-distortion based secrecy system with side information at the decoders," in Proc. 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct. 2014.
[9] P. Cuff, "Using a secret key to foil an eavesdropper," in Proc. 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1405–1411, Sept. 2010.
[10] C. Schieler and P. Cuff, "Secrecy is cheap if the adversary must reconstruct," in Proc. IEEE International Symposium on Information Theory (ISIT), pp. 66–70, July 2012.
[11] E. C. Song, P. Cuff, and H. V. Poor, "A bit of secrecy for Gaussian source compression," in Proc. IEEE International Symposium on Information Theory (ISIT), pp. 2567–2571, July 2013.
[12] C. Schieler and P. Cuff, "Rate-distortion theory for secrecy systems," IEEE Transactions on Information Theory, vol. 60, pp. 7584–7605, Dec. 2014.
[13] C. Schieler and P. Cuff, "Rate-distortion theory for secrecy systems," CoRR, vol. abs/1305.3905, 2013.
[14] P. Cuff, "Distributed channel synthesis," IEEE Transactions on Information Theory, vol. 59, no. 11, pp. 7071–7096, 2013.
[15] P. Cuff and E. C. Song, "The likelihood encoder for source coding," in Proc. IEEE Information Theory Workshop (ITW), 2013.
[16] E. C. Song, P. Cuff, and H. V. Poor, "The likelihood encoder for lossy compression," arXiv preprint arXiv:1408.4522, 2014.
[17] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2012.
[18] C. Schieler and P. Cuff, "A connection between good rate-distortion codes and backward DMCs," in Proc. IEEE Information Theory Workshop (ITW), pp. 1–5, 2013.
[19] E. C. Song, P. Cuff, and H. V. Poor, "The likelihood encoder for lossy source compression," in Proc. IEEE International Symposium on Information Theory (ISIT), Sept. 2014.
[20] A. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Transactions on Information Theory, vol. 22, pp. 1–10, Jan. 1976.
[21] J. Villard and P. Piantanida, "Secure lossy source coding with side information at the decoders," in Proc. 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 733–739, Sept. 2010.
[22] J. Villard, P. Piantanida, and S. Shamai, "Secure transmission of sources over noisy channels with side information at the receivers," IEEE Transactions on Information Theory, vol. 60, pp. 713–739, Jan. 2014.
[23] E. C. Song, P. Cuff, and H. V. Poor, "Joint source-channel secrecy using hybrid coding," submitted to ISIT, 2015.