Distributed Source Coding for Correlated Memoryless Gaussian Sources
arXiv:0908.3982v2 [cs.IT] 16 Sep 2009
Yasutada Oohama
Abstract—We consider a distributed source coding problem of $L$ correlated Gaussian observations $Y_i$, $i = 1, 2, \cdots, L$. We assume that the random vector $Y^L = {}^t(Y_1, Y_2, \cdots, Y_L)$ is an observation of the Gaussian random vector $X^K = {}^t(X_1, X_2, \cdots, X_K)$, having the form $Y^L = AX^K + N^L$, where $A$ is an $L \times K$ matrix and $N^L = {}^t(N_1, N_2, \cdots, N_L)$ is a vector of $L$ independent Gaussian random variables, also independent of $X^K$. The estimation error on $X^K$ is measured by the distortion covariance matrix. The rate distortion region is defined as the set of all rate vectors for which the estimation error is upper bounded by an arbitrarily prescribed covariance matrix in the sense of the positive semidefinite order. In this paper we derive explicit outer and inner bounds of the rate distortion region. This result provides a useful tool for studying the direct and indirect source coding problems for this Gaussian distributed source coding system, both of which remain open in general. Index Terms—Multiterminal source coding, rate-distortion region, CEO problem.
I. INTRODUCTION

Distributed source coding of correlated information sources is a form of communication system which is significant from both theoretical and practical points of view in multi-user source networks. The first fundamental theory for such coding systems was established by Slepian and Wolf [1]. They considered a distributed source coding system of two correlated information sources, which are separately encoded and sent to a single destination, where the decoder reconstructs the original sources. In the above distributed source coding systems we can consider the case where the source outputs should be reconstructed with average distortions smaller than prescribed levels. This situation leads to the multiterminal rate distortion theory. The rate distortion theory for the distributed source coding system formulated by Slepian and Wolf has been studied in [2]-[9]. Recently, Wagner et al. [10] gave a complete solution in the case of Gaussian information sources and quadratic distortion. As a practical variant of distributed source coding systems, we can consider the case where the distributed encoders cannot directly access the source outputs but only their noisy observations. This situation was first studied by Yamamoto and Ito [11], who call the investigated coding system the communication system with a remote source. Subsequently, a similar distributed source coding system was studied by Flynn and Gray [12].

Manuscript received xxx, 20XX; revised xxx, 20XX. Y. Oohama is with the Department of Information Science and Intelligent Systems, University of Tokushima, 2-1 Minami Josanjima-Cho, Tokushima 770-8506, Japan.
In this paper we consider a distributed source coding problem of $L$ correlated Gaussian sources $Y_i$, $i = 1, 2, \cdots, L$, which are noisy observations of $X_i$, $i = 1, 2, \cdots, K$. We assume that $Y^L = {}^t(Y_1, Y_2, \cdots, Y_L)$ is an observation of the source vector $X^K = {}^t(X_1, X_2, \cdots, X_K)$, having the form $Y^L = AX^K + N^L$, where $A$ is an $L \times K$ matrix and $N^L = {}^t(N_1, N_2, \cdots, N_L)$ is a vector of $L$ independent Gaussian random variables, also independent of $X^K$. When $K = 1$, the source coding system becomes that of the quadratic Gaussian CEO problem investigated in [13]-[15]. The system in the case of $K = L$ was studied by Pandya et al. [16], who derived lower and upper bounds on the minimum sum rate in the rate distortion region. Several partial solutions in the case of $K = L$ and $A = I_L$ were obtained by Oohama [17]-[20]. In those previous works the estimation error was measured by the quadratic distortion. We measure the estimation error by the distortion covariance matrix instead. The rate distortion region is defined as the set of all rate vectors for which the estimation error is upper bounded by an arbitrarily prescribed covariance matrix in the sense of the positive semidefinite order. In this paper we derive explicit outer and inner bounds of the rate distortion region. Preliminary versions of the results of this paper appear in Oohama [21]. The remote source coding problem treated in this paper is also referred to as the indirect distributed source coding problem. On the other hand, the multiterminal rate distortion problem in the framework of distributed source coding is called the direct distributed source coding problem. As shown in the paper of Wagner et al. [10] and in the recent work by Wang et al. [22], there is a strong connection between the direct and indirect distributed source coding problems. In this paper we also consider the multiterminal rate distortion problem, i.e.,
the direct distributed source coding problem for the Gaussian information source specified by $Y^L = X^L + N^L$, which corresponds to the case of $K = L$ and $A = I_L$. We derive a result which establishes a strong connection between the remote source coding problem and the multiterminal rate distortion problem: every result on the rate distortion region of the remote source coding problem can be converted into one on the rate distortion region of the multiterminal source coding problem. Using this result, we derive several new partial solutions to the Gaussian multiterminal rate distortion problem.
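The observation model $Y^L = AX^K + N^L$ is easy to simulate. The following sketch uses small, hypothetical parameters ($\Sigma_{X^K}$, $A$, and the noise variances are illustrative choices, not from the paper) and verifies that the empirical covariance of $Y^L$ matches $A\Sigma_{X^K}\,{}^tA + \Sigma_{N^L}$:

```python
import numpy as np

rng = np.random.default_rng(0)
K, L, n = 2, 3, 100_000

# Hypothetical example parameters (not from the paper).
Sigma_X = np.array([[1.0, 0.4],
                    [0.4, 1.0]])          # covariance of X^K
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])                # L x K observation matrix
sigma_N2 = np.array([0.2, 0.3, 0.25])     # noise variances of N_1, ..., N_L

X = rng.multivariate_normal(np.zeros(K), Sigma_X, size=n)   # n x K samples of X^K
N = rng.normal(0.0, np.sqrt(sigma_N2), size=(n, L))         # n x L independent noise
Y = X @ A.T + N                                             # Y^L = A X^K + N^L

# The empirical covariance of Y should match A Sigma_X tA + Sigma_N.
Sigma_Y_theory = A @ Sigma_X @ A.T + np.diag(sigma_N2)
Sigma_Y_emp = np.cov(Y, rowvar=False)
print(np.max(np.abs(Sigma_Y_emp - Sigma_Y_theory)))  # small, O(1/sqrt(n))
```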
Fig. 1. Distributed source coding system for L correlated Gaussian observations
II. PROBLEM STATEMENT AND PREVIOUS RESULTS

A. Formal Statement of the Problem

In this subsection we present a formal statement of the problem. Throughout this paper all logarithms are taken to the natural base. Let $X_i$, $i = 1, 2, \cdots, K$ be correlated zero mean Gaussian random variables. For each $i = 1, 2, \cdots, K$, $X_i$ takes values in the real line $\mathcal{X}_i$. We write the $K$ dimensional random vector as $X^K = {}^t(X_1, X_2, \cdots, X_K)$ and denote the covariance matrix of $X^K$ by $\Sigma_{X^K}$. Let $Y^L = {}^t(Y_1, Y_2, \cdots, Y_L)$ be an observation of the source vector $X^K$, having the form $Y^L = AX^K + N^L$, where $A$ is an $L \times K$ matrix and $N^L = {}^t(N_1, N_2, \cdots, N_L)$ is a vector of $L$ independent zero mean Gaussian random variables, also independent of $X^K$. For $i = 1, 2, \cdots, L$, $\sigma_{N_i}^2$ stands for the variance of $N_i$. Let $\{(X_1(t), X_2(t), \cdots, X_K(t))\}_{t=1}^{\infty}$ be a stationary memoryless multiple Gaussian source. For each $t = 1, 2, \cdots$, $X^K(t) \triangleq {}^t(X_1(t), X_2(t), \cdots, X_K(t))$ has the same distribution as $X^K$. A random vector consisting of $n$ independent copies of the random variable $X_i$ is denoted by

$$\boldsymbol{X}_i \triangleq (X_i(1), X_i(2), \cdots, X_i(n)).$$

For each $t = 1, 2, \cdots$, $Y^L(t) = {}^t(Y_1(t), \cdots, Y_L(t))$ is a vector of $L$ correlated observations of $X^K(t)$, having the form $Y^L(t) = AX^K(t) + N^L(t)$, where $N^L(t)$, $t = 1, 2, \cdots$, are independent identically distributed (i.i.d.) Gaussian random vectors having the same distribution as $N^L$. We place no assumption on the number of observations $L$; we may have $L \geq K$ or $L < K$. The distributed source coding system for $L$ correlated Gaussian observations treated in this paper is shown in Fig. 1. In this coding system the distributed encoder functions $\varphi_i^{(n)}$, $i = 1, 2, \cdots, L$ are defined by

$$\varphi_i^{(n)} : \mathcal{Y}_i^n \to \mathcal{M}_i \triangleq \{1, 2, \cdots, M_i\},$$

where $\mathcal{Y}_i$ is the real line in which $Y_i$ takes values.
For each $i = 1, 2, \cdots, L$, set $R_i^{(n)} \triangleq \frac{1}{n}\log M_i$, which stands for the transmission rate of the encoder function $\varphi_i^{(n)}$. The joint decoder function $\psi^{(n)} = (\psi_1^{(n)}, \psi_2^{(n)}, \cdots, \psi_K^{(n)})$ is defined by

$$\psi_i^{(n)} : \mathcal{M}_1 \times \cdots \times \mathcal{M}_L \to \hat{\mathcal{X}}_i^n, \quad i = 1, 2, \cdots, K,$$

where $\hat{\mathcal{X}}_i$ is the real line in which a reconstructed random variable of $X_i$ takes values. For $\boldsymbol{X}^K = (\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_K)$, set

$$\varphi^{(n)}(\boldsymbol{Y}^L) \triangleq (\varphi_1^{(n)}(\boldsymbol{Y}_1), \varphi_2^{(n)}(\boldsymbol{Y}_2), \cdots, \varphi_L^{(n)}(\boldsymbol{Y}_L)),$$

$$\hat{\boldsymbol{X}}^K \triangleq \begin{bmatrix} \hat{\boldsymbol{X}}_1 \\ \hat{\boldsymbol{X}}_2 \\ \vdots \\ \hat{\boldsymbol{X}}_K \end{bmatrix} = \begin{bmatrix} \psi_1^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \\ \psi_2^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \\ \vdots \\ \psi_K^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \end{bmatrix}.$$
Set

$$d_{ii} \triangleq E\|\boldsymbol{X}_i - \hat{\boldsymbol{X}}_i\|^2, \qquad d_{ij} \triangleq E\langle \boldsymbol{X}_i - \hat{\boldsymbol{X}}_i,\, \boldsymbol{X}_j - \hat{\boldsymbol{X}}_j \rangle, \quad 1 \leq i \neq j \leq K,$$

where $\|a\|$ stands for the Euclidean norm of the $n$ dimensional vector $a$ and $\langle a, b \rangle$ stands for the inner product between $a$ and $b$. Let $\Sigma_{\boldsymbol{X}^K - \hat{\boldsymbol{X}}^K}$ be the covariance matrix with $d_{ij}$ in its $(i, j)$ entry. Let $\Sigma_d$ be a given $K \times K$ covariance matrix which serves as a distortion criterion. We call this matrix a distortion matrix. For a given distortion matrix $\Sigma_d$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $\Sigma_d$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \tfrac{1}{n}\Sigma_{\boldsymbol{X}^K - \hat{\boldsymbol{X}}^K} \preceq \Sigma_d,$$

where $A_1 \preceq A_2$ means that $A_2 - A_1$ is a positive semidefinite matrix. Let $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$ denote the set of all $\Sigma_d$-admissible rate vectors. We often have a particular interest in the minimum sum rate part of the rate distortion region. To examine this quantity, we set

$$R_{\mathrm{sum},L}(\Sigma_d|\Sigma_{X^K Y^L}) \triangleq \min_{(R_1, R_2, \cdots, R_L) \in \mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})} \left\{ \sum_{i=1}^{L} R_i \right\}.$$
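The positive semidefinite ordering $A_1 \preceq A_2$ used in the admissibility condition can be tested numerically by checking that the smallest eigenvalue of $A_2 - A_1$ is nonnegative. A minimal sketch with hypothetical matrices (not taken from the paper):

```python
import numpy as np

def psd_leq(A1, A2, tol=1e-10):
    """Check A1 <= A2 in the positive semidefinite order,
    i.e. that A2 - A1 is positive semidefinite."""
    diff = A2 - A1
    # Symmetrize to guard against round-off asymmetry.
    eigs = np.linalg.eigvalsh((diff + diff.T) / 2)
    return bool(eigs.min() >= -tol)

# Hypothetical distortion matrix and error covariance matrix.
Sigma_d = np.array([[0.5, 0.1],
                    [0.1, 0.5]])
Sigma_err = np.array([[0.3, 0.05],
                      [0.05, 0.4]])
print(psd_leq(Sigma_err, Sigma_d))  # True: the error meets the distortion criterion
```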
We consider two types of distortion criteria. For each distortion criterion we define the determination problem of the rate distortion region.

Problem 1. Vector Distortion Criterion: Fix a $K \times K$ invertible matrix $\Gamma$ and a positive vector $D^K = (D_1, D_2, \cdots, D_K)$. For given $\Gamma$ and $D^K$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $(\Gamma, D^K)$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \left[\Gamma\, \tfrac{1}{n}\Sigma_{\boldsymbol{X}^K - \hat{\boldsymbol{X}}^K}\, {}^t\Gamma\right]_{ii} \leq D_i, \quad \text{for } i = 1, 2, \cdots, K,$$

where $[C]_{ij}$ stands for the $(i, j)$ entry of the matrix $C$. Let $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$ denote the set of all $(\Gamma, D^K)$-admissible rate vectors. When $\Gamma$ is equal to the $K \times K$ identity matrix $I_K$, we omit $\Gamma$ in $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$ and simply write $\mathcal{R}_L(D^K|\Sigma_{X^K Y^L})$. Similar notations are used for other sets and quantities. To examine the sum rate part of $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$, define

$$R_{\mathrm{sum},L}(\Gamma, D^K|\Sigma_{X^K Y^L}) \triangleq \min_{(R_1, R_2, \cdots, R_L) \in \mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})} \left\{ \sum_{i=1}^{L} R_i \right\}.$$
Problem 2. Sum Distortion Criterion: Fix a $K \times K$ positive definite invertible matrix $\Gamma$ and a positive $D$. For given $\Gamma$ and $D$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $(\Gamma, D)$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \mathrm{tr}\left[\Gamma\, \tfrac{1}{n}\Sigma_{\boldsymbol{X}^K - \hat{\boldsymbol{X}}^K}\, {}^t\Gamma\right] \leq D.$$

Let $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$ denote the set of all $(\Gamma, D)$-admissible rate vectors. To examine the sum rate part of $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$, define

$$R_{\mathrm{sum},L}(\Gamma, D|\Sigma_{X^K Y^L}) \triangleq \min_{(R_1, R_2, \cdots, R_L) \in \mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})} \left\{ \sum_{i=1}^{L} R_i \right\}.$$

Let $\mathcal{S}_K(D^K)$ be the set of all $K \times K$ covariance matrices whose $(i, i)$ entries do not exceed $D_i$ for $i = 1, 2, \cdots, K$. Then we have

$$\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L}) = \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_K(D^K)} \mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L}), \qquad (1)$$
$$\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}) = \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L}). \qquad (2)$$

Furthermore, we have

$$\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}) = \bigcup_{\sum_{i=1}^{K} D_i \leq D} \mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L}). \qquad (3)$$

In this paper we establish explicit inner and outer bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$. Using these bounds and equations (1) and (2), we give new outer bounds of $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$.

B. Inner Bounds and Previous Results

In this subsection we present inner bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$, $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$. These inner bounds can be obtained by a standard technique developed in the field of multiterminal source coding.

Set $\Lambda \triangleq \{1, 2, \cdots, L\}$. For $i \in \Lambda$, let $U_i$ be a random variable taking values in the real line $\mathcal{U}_i$. For any subset $S \subseteq \Lambda$, we introduce the notation $U_S = (U_i)_{i \in S}$; in particular, $U_\Lambda = U^L = (U_1, U_2, \cdots, U_L)$. Define

$$\Sigma_{X^K|Y^L} = \left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L}^{-1} A\right)^{-1},$$

$$\mathcal{G}(\Sigma_d) \triangleq \left\{ U^L : \begin{array}{l} U^L \text{ is a Gaussian random vector that satisfies} \\ U_S \to Y_S \to X^K \to Y_{S^c} \to U_{S^c} \text{ and } U^L \to Y^L \to X^K \\ \text{for any } S \subseteq \Lambda, \text{ and } \Sigma_{X^K - \psi(U^L)} \preceq \Sigma_d \\ \text{for some linear mapping } \psi: \mathcal{U}^L \to \hat{\mathcal{X}}^K \end{array} \right\},$$

and set

$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ R^L : \text{there exists a random vector } U^L \in \mathcal{G}(\Sigma_d) \text{ such that } \sum_{i \in S} R_i \geq I(U_S; Y_S|U_{S^c}) \text{ for any } S \subseteq \Lambda \right\},$$

where $\mathrm{conv}\{\mathcal{A}\}$ stands for the convex hull of the set $\mathcal{A}$. Set

$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_K(D^K)} \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \right\},$$
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \right\}.$$

Define

$$d^K(\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma) \triangleq \left( [\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]_{11}, [\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]_{22}, \cdots, [\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]_{KK} \right).$$

We can show that $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L})$, $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L})$, and $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$ satisfy the following property.

Property 1:
a) The set $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L})$ is not void if and only if $\Sigma_d \succ \Sigma_{X^K|Y^L}$.
b) The set $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L})$ is not void if and only if $D^K > d^K(\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma)$.
c) The set $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$ is not void if and only if $D > \mathrm{tr}[\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]$.

On inner bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$, $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$, we have the following result.

Theorem 1 (Berger [4] and Tung [5]): For any $\Sigma_d \succ \Sigma_{X^K|Y^L}$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L}).$$
For any $\Gamma$ and any $D^K > d^K(\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma)$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L}).$$
For any $\Gamma$ and any $D > \mathrm{tr}[\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}).$$

The above three inner bounds can be regarded as variants of the well-known inner bound of Berger [4] and Tung [5]. When $K = 1$ and the $L \times 1$ column vector $A$ has the form $A = {}^t[1\,1\,\cdots\,1]$, the system considered here becomes the quadratic Gaussian CEO problem. This problem was first posed and
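The conditional covariance $\Sigma_{X^K|Y^L} = (\Sigma_{X^K}^{-1} + {}^tA\,\Sigma_{N^L}^{-1}A)^{-1}$ appearing in Property 1 coincides with the classical MMSE error covariance; the two forms are equal by the matrix inversion lemma. A numerical sketch with hypothetical parameters:

```python
import numpy as np

# Hypothetical example parameters (not from the paper).
Sigma_X = np.array([[1.0, 0.4],
                    [0.4, 1.0]])
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
Sigma_N = np.diag([0.2, 0.3, 0.25])

# Information form used in the paper.
Sigma_cond = np.linalg.inv(np.linalg.inv(Sigma_X) + A.T @ np.linalg.inv(Sigma_N) @ A)

# Classical MMSE form; equal to the above by the matrix inversion lemma.
Sigma_Y = A @ Sigma_X @ A.T + Sigma_N
Sigma_mmse = Sigma_X - Sigma_X @ A.T @ np.linalg.inv(Sigma_Y) @ A @ Sigma_X

print(np.allclose(Sigma_cond, Sigma_mmse))  # True
```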
investigated by Viswanathan and Berger [13]. They dealt with the case of $\Sigma_{N^L} = \sigma^2 I_L$ and studied an asymptotic form of

$$R_{\mathrm{sum}}(D|\sigma^2) \triangleq \liminf_{L\to\infty} R_{\mathrm{sum},L}(D|\Sigma_{XY^L})$$

for small $D$. Subsequently, Oohama [14] determined an exact form of $R_{\mathrm{sum}}(D|\sigma^2)$. The region $\mathcal{R}_L(D|\Sigma_{XY^L})$ was determined by Oohama [15]. In the case where $K = L$ and $\Gamma = A = I_L$, Oohama [17]-[21] derived inner and outer bounds of $\mathcal{R}_L(D|\Sigma_{X^L Y^L})$. Oohama [18], [19] also derived explicit sufficient conditions for the inner and outer bounds to match, and found examples of information sources for which the rate distortion region is explicitly determined. In [21], Oohama derived explicit outer bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^L Y^L})$, $\mathcal{R}_L(D^L|\Sigma_{X^L Y^L})$, and $\mathcal{R}_L(D|\Sigma_{X^L Y^L})$. Recently, Wagner et al. [10] determined $\mathcal{R}_2(D^2|\Sigma_{X^2 Y^2})$. Their result is as follows.

Theorem 2 (Wagner et al. [10]): For any $D^2 > d^2(\Sigma_{X^2|Y^2})$, we have
$$\mathcal{R}_2(D^2|\Sigma_{X^2 Y^2}) = \hat{\mathcal{R}}_2^{(\mathrm{in})}(D^2|\Sigma_{X^2 Y^2}).$$

Their method of proof depends heavily on the specific properties of the case $L = 2$; it is hard to generalize it to the case of $L \geq 3$.

III. MAIN RESULTS

A. Inner and Outer Bounds of the Rate Distortion Region

In this subsection we state our results on the characterizations of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$, $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$. To describe those results we define several functions and sets. For $r_i \geq 0$, $i \in \Lambda$, let $N_i(r_i)$, $i \in \Lambda$ be $L$ independent Gaussian random variables with mean $0$ and variance $\sigma_{N_i}^2/(1 - e^{-2r_i})$. Let $\Sigma_{N^L(r^L)}$ be the covariance matrix of the random vector $N^L(r^L)$. Fix a nonnegative vector $r^L$. For $\theta > 0$ and for $S \subseteq \Lambda$, define

$$\Sigma_{N_{S^c}(r_{S^c})}^{-1} \triangleq \left.\Sigma_{N^L(r^L)}^{-1}\right|_{r_S = 0},$$

$$\bar{J}_S(\theta, r_S|r_{S^c}) \triangleq \frac{1}{2} \log^{+}\left[ \frac{\prod_{i \in S} e^{2r_i}}{\theta \left|\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N_{S^c}(r_{S^c})}^{-1} A\right|} \right],$$

$$J_S(r_S|r_{S^c}) \triangleq \frac{1}{2} \log\left[ \frac{\left|\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right| \prod_{i \in S} e^{2r_i}}{\left|\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N_{S^c}(r_{S^c})}^{-1} A\right|} \right],$$

where $S^c = \Lambda - S$ and $\log^{+} x \triangleq \max\{\log x, 0\}$. Set

$$\mathcal{A}_L(\Sigma_d) \triangleq \left\{ r^L \geq 0 : \left[\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right]^{-1} \preceq \Sigma_d \right\}.$$

We can show that for $S \subseteq \Lambda$, $\bar{J}_S(|\Sigma_d|, r_S|r_{S^c})$ and $J_S(r_S|r_{S^c})$ satisfy the following two properties.

Property 2:
a) If $r^L \in \mathcal{A}_L(\Sigma_d)$, then for any $S \subseteq \Lambda$,
$$\bar{J}_S(|\Sigma_d|, r_S|r_{S^c}) \leq J_S(r_S|r_{S^c}).$$
b) Suppose that $r^L \in \mathcal{A}_L(\Sigma_d)$. If $r^L|_{r_S = 0}$ still belongs to $\mathcal{A}_L(\Sigma_d)$, then
$$\left.\bar{J}_S(|\Sigma_d|, r_S|r_{S^c})\right|_{r_S = 0} = \left.J_S(r_S|r_{S^c})\right|_{r_S = 0} = 0.$$

Property 3: Fix $r^L \in \mathcal{A}_L(\Sigma_d)$. For $S \subseteq \Lambda$, set
$$f_S = f_S(r_S|r_{S^c}) \triangleq \bar{J}_S(|\Sigma_d|, r_S|r_{S^c}).$$
By definition, it is obvious that $f_S$, $S \subseteq \Lambda$ are nonnegative. We can show that $f \triangleq \{f_S\}_{S \subseteq \Lambda}$ satisfies the following:
a) $f_\emptyset = 0$.
b) $f_A \leq f_B$ for $A \subseteq B \subseteq \Lambda$.
c) $f_A + f_B \leq f_{A \cap B} + f_{A \cup B}$.
In general, $(\Lambda, \rho)$ is called a co-polymatroid if the nonnegative function $\rho$ on $2^\Lambda$ satisfies the above three properties. Similarly, we set
$$\tilde{f}_S = \tilde{f}_S(r_S|r_{S^c}) \triangleq J_S(r_S|r_{S^c}), \qquad \tilde{f} = \{\tilde{f}_S\}_{S \subseteq \Lambda}.$$
Then $(\Lambda, \tilde{f})$ also has the same three properties as $(\Lambda, f)$ and becomes a co-polymatroid.

To describe our result on $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$, set

$$\mathcal{R}_L^{(\mathrm{out})}(\theta, r^L|\Sigma_{X^K Y^L}) \triangleq \left\{ R^L : \sum_{i \in S} R_i \geq \bar{J}_S(\theta, r_S|r_{S^c}) \text{ for any } S \subseteq \Lambda \right\},$$

$$\mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L}) \triangleq \bigcup_{r^L \in \mathcal{A}_L(\Sigma_d)} \mathcal{R}_L^{(\mathrm{out})}(|\Sigma_d|, r^L|\Sigma_{X^K Y^L}),$$

$$\mathcal{R}_L^{(\mathrm{in})}(r^L|\Sigma_{X^K Y^L}) \triangleq \left\{ R^L : \sum_{i \in S} R_i \geq J_S(r_S|r_{S^c}) \text{ for any } S \subseteq \Lambda \right\},$$

$$\mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{r^L \in \mathcal{A}_L(\Sigma_d)} \mathcal{R}_L^{(\mathrm{in})}(r^L|\Sigma_{X^K Y^L}) \right\}.$$

We can show that $\mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L})$ satisfy the following property.

Property 4: The sets $\mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L})$ are not void if and only if $\Sigma_d \succ \Sigma_{X^K|Y^L}$.

Our result on inner and outer bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L})$ is as follows.

Theorem 3: For any $\Sigma_d \succ \Sigma_{X^K|Y^L}$, we have
$$\mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \subseteq \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Sigma_d|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L}).$$

The proof of this theorem is given in Section V. This result includes the result of Oohama [21] as a special case by letting $K = L$ and $\Gamma = A = I_L$. From this theorem we can
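The functions $\bar{J}_S$ and $J_S$ are straightforward to evaluate numerically. The sketch below (with hypothetical source parameters) implements both, using the fact that $\Sigma_{N^L(r^L)}^{-1}$ is diagonal with entries $(1 - e^{-2r_i})/\sigma_{N_i}^2$, and checks Property 2 a) for a $\Sigma_d$ chosen on the boundary of $\mathcal{A}_L(\Sigma_d)$:

```python
import numpy as np

def inv_noise_cov(r, sigma_N2):
    # Diagonal entries of Sigma_{N^L(r^L)}^{-1}: (1 - e^{-2 r_i}) / sigma_{N_i}^2.
    return np.diag((1.0 - np.exp(-2.0 * r)) / sigma_N2)

def J_bar(theta, S, r, Sigma_X, A, sigma_N2):
    r_masked = r.copy(); r_masked[list(S)] = 0.0   # r_S = 0 keeps only S^c terms
    M_Sc = np.linalg.inv(Sigma_X) + A.T @ inv_noise_cov(r_masked, sigma_N2) @ A
    val = np.sum(2.0 * r[list(S)]) - np.log(theta * np.linalg.det(M_Sc))
    return max(val, 0.0) / 2.0                     # (1/2) log^+

def J(S, r, Sigma_X, A, sigma_N2):
    r_masked = r.copy(); r_masked[list(S)] = 0.0
    M_full = np.linalg.inv(Sigma_X) + A.T @ inv_noise_cov(r, sigma_N2) @ A
    M_Sc = np.linalg.inv(Sigma_X) + A.T @ inv_noise_cov(r_masked, sigma_N2) @ A
    return 0.5 * (np.log(np.linalg.det(M_full)) + np.sum(2.0 * r[list(S)])
                  - np.log(np.linalg.det(M_Sc)))

# Hypothetical example parameters (not from the paper).
Sigma_X = np.array([[1.0, 0.4], [0.4, 1.0]])
A = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
sigma_N2 = np.array([0.2, 0.3, 0.25])
r = np.array([0.3, 0.2, 0.4])

# A Sigma_d on the boundary of A_L(Sigma_d): the error covariance at rate vector r.
Sigma_d = np.linalg.inv(np.linalg.inv(Sigma_X) + A.T @ inv_noise_cov(r, sigma_N2) @ A)
S = (0, 1)
print(J_bar(np.linalg.det(Sigma_d), S, r, Sigma_X, A, sigma_N2)
      <= J(S, r, Sigma_X, A, sigma_N2) + 1e-12)   # Property 2 a): True
```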
derive outer and inner bounds of $\mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L})$. To describe those bounds, set

$$\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \triangleq \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_K(D^K)} \mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L}),$$
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_K(D^K)} \mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \right\},$$
$$\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}) \triangleq \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \mathcal{R}_L^{(\mathrm{out})}(\Sigma_d|\Sigma_{X^K Y^L}),$$
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \mathcal{R}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{X^K Y^L}) \right\}.$$

Set

$$\mathcal{B}_L(\Gamma, D^K) \triangleq \left\{ r^L \geq 0 : \Gamma\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)^{-1} {}^t\Gamma \in \mathcal{S}_K(D^K) \right\},$$
$$\mathcal{B}_L(\Gamma, D) \triangleq \left\{ r^L \geq 0 : \mathrm{tr}\left[\Gamma\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)^{-1} {}^t\Gamma\right] \leq D \right\},$$
$$\mathcal{A}_L(r^L) \triangleq \left\{ \Sigma_d : \Sigma_d \succeq \left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)^{-1} \right\},$$
$$\theta(\Gamma, D^K, r^L) \triangleq \max_{\substack{\Sigma_d \in \mathcal{A}_L(r^L), \\ \Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_K(D^K)}} |\Sigma_d|, \qquad \theta(\Gamma, D, r^L) \triangleq \max_{\substack{\Sigma_d \in \mathcal{A}_L(r^L), \\ \mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D}} |\Sigma_d|.$$

It can easily be verified that $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D^K|\Sigma_{X^K Y^L})$, $\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L})$, $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$, and $\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$ satisfy the following property.

Property 5:
a) The sets $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D^K|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L})$ are not void if and only if $D^K > d^K(\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma)$.
b) The sets $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$ are not void if and only if $D > \mathrm{tr}[\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]$.
c)
$$\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D^K|\Sigma_{X^K Y^L}) = \bigcup_{r^L \in \mathcal{B}_L(\Gamma, D^K)} \mathcal{R}_L^{(\mathrm{out})}(\theta(\Gamma, D^K, r^L), r^L|\Sigma_{X^K Y^L}),$$
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) = \mathrm{conv}\left\{ \bigcup_{r^L \in \mathcal{B}_L(\Gamma, D^K)} \mathcal{R}_L^{(\mathrm{in})}(r^L|\Sigma_{X^K Y^L}) \right\},$$
$$\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}) = \bigcup_{r^L \in \mathcal{B}_L(\Gamma, D)} \mathcal{R}_L^{(\mathrm{out})}(\theta(\Gamma, D, r^L), r^L|\Sigma_{X^K Y^L}),$$
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) = \mathrm{conv}\left\{ \bigcup_{r^L \in \mathcal{B}_L(\Gamma, D)} \mathcal{R}_L^{(\mathrm{in})}(r^L|\Sigma_{X^K Y^L}) \right\}.$$

The following result is obtained as a simple corollary of Theorem 3.

Corollary 1: For any $\Gamma$ and any $D^K > d^K(\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma)$, we have
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \subseteq \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^K|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Gamma, D^K|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L^{(\mathrm{out})}(\Gamma, D^K|\Sigma_{X^K Y^L}).$$
For any $\Gamma$ and any $D > \mathrm{tr}[\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma]$, we have
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) \subseteq \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}).$$

These results include the result of Oohama [21] as a special case by letting $K = L$ and $\Gamma = A = I_L$. Next we compute $\theta(\Gamma, D, r^L)$ to derive a more explicit expression of $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$. This expression will be quite useful for finding a sufficient condition for the outer bound $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$ to be tight. Let $\alpha_i = \alpha_i(r^L)$, $i = 1, 2, \cdots, K$ be the $K$ eigenvalues of the matrix

$${}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)\Gamma^{-1}.$$

Let $\xi$ be a nonnegative number that satisfies

$$\sum_{i=1}^{K} \left( [\xi - \alpha_i^{-1}]^{+} + \alpha_i^{-1} \right) = D.$$

Define

$$\omega(\Gamma, D, r^L) \triangleq |\Gamma|^{-2} \prod_{i=1}^{K} \left( [\xi - \alpha_i^{-1}]^{+} + \alpha_i^{-1} \right).$$

The function $\omega(\Gamma, D, r^L)$ is the so-called water-filling solution to the following optimization problem:

$$\omega(\Gamma, D, r^L) = |\Gamma|^{-2} \max_{\substack{\xi_i \alpha_i \geq 1,\; i = 1, 2, \cdots, K, \\ \sum_{i=1}^{K} \xi_i \leq D}} \prod_{i=1}^{K} \xi_i. \qquad (4)$$

Then we have the following theorem.

Theorem 4: For any $\Gamma$ and any positive $D$, we have $\theta(\Gamma, D, r^L) = \omega(\Gamma, D, r^L)$. A more explicit expression of $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$ using $\omega(\Gamma, D, r^L)$ is given by

$$\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}) = \bigcup_{r^L \in \mathcal{B}_L(\Gamma, D)} \mathcal{R}_L^{(\mathrm{out})}(\omega(\Gamma, D, r^L), r^L|\Sigma_{X^K Y^L}).$$

The proof of this theorem will be given in Section V. The above expression of the outer bound includes the result of Oohama [21] as a special case by letting $K = L$ and $\Gamma = A = I_L$.
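The water-filling characterization (4) can be computed by bisection on the water level $\xi$, since $[\xi - \alpha_i^{-1}]^{+} + \alpha_i^{-1} = \max\{\xi, \alpha_i^{-1}\}$ and the constraint sum is monotone in $\xi$. A minimal sketch (the bisection scheme and the example values are illustrative, not from the paper):

```python
import numpy as np

def omega(Gamma, D, alphas):
    """Water-filling solution: maximize prod(xi_i) subject to
    xi_i >= 1/alpha_i and sum(xi_i) <= D; returns |Gamma|^{-2} prod(xi_i)."""
    floors = 1.0 / np.asarray(alphas, dtype=float)
    assert D >= floors.sum(), "D must be at least the sum of the floors 1/alpha_i"
    # Find the water level xi by bisection on sum(max(xi, floor_i)) = D.
    lo, hi = 0.0, D
    for _ in range(200):
        mid = (lo + hi) / 2
        if np.maximum(mid, floors).sum() > D:
            hi = mid
        else:
            lo = mid
    xi = np.maximum(lo, floors)   # xi_i = [xi - 1/alpha_i]^+ + 1/alpha_i
    return np.prod(xi) / np.linalg.det(Gamma) ** 2

# Example: Gamma = I_2, eigenvalues alpha = (2, 5), budget D = 1.5.
# Both components rise to the common water level 0.75, so omega = 0.75^2.
print(omega(np.eye(2), 1.5, [2.0, 5.0]))  # ≈ 0.5625
```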
B. Matching Condition Analysis

For $L \geq 3$, we present a sufficient condition for $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}) \subseteq \mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$. We consider the following condition on $\theta(\Gamma, D, r^L)$.

Condition: For any $i \in \Lambda$, $e^{-2r_i}\theta(\Gamma, D, r^L)$ is a monotone decreasing function of $r_i \geq 0$.

We call this condition the MD condition. The following key lemma, due to Oohama [19], [20], is used to derive the matching condition.

Lemma 1 (Oohama [19], [20]): If $\theta(\Gamma, D, r^L)$ satisfies the MD condition on $\mathcal{B}_L(\Gamma, D)$, then
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) = \mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}) = \mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}).$$

Based on Lemma 1, we derive a sufficient condition for $\theta(\Gamma, D, r^L)$ to satisfy the MD condition. This sufficient condition is closely related to the distribution of the eigenvalues of

$${}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)\Gamma^{-1}.$$

Define

$$u_i \triangleq \frac{1}{\sigma_{N_i}^2}\left(1 - e^{-2r_i}\right), \quad \text{for } i = 1, 2, \cdots, L. \qquad (5)$$

From (5), we have

$$2r_i = \log\left[ \frac{\frac{1}{\sigma_{N_i}^2}}{\frac{1}{\sigma_{N_i}^2} - u_i} \right].$$

By this transformation we regard ${}^t\Gamma^{-1}(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A)\Gamma^{-1}$ and $\theta(\Gamma, D, r^L)$ as functions of $u^L$, that is,

$${}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right)\Gamma^{-1} = {}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A\right)\Gamma^{-1}$$

and $\theta(\Gamma, D, r^L) = \theta(\Gamma, D, u^L)$. Let $\hat{a}_{ij}$ be the $(i, j)$ entry of $A\Gamma^{-1}$. Set $\hat{a}_i \triangleq [\hat{a}_{i1}\, \hat{a}_{i2}\, \cdots\, \hat{a}_{iK}]$. Let $Q$ be a $K \times K$ unitary matrix. We consider the following matrix:

$${}^tQ\, {}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A\right)\Gamma^{-1} Q = {}^tQ\, {}^t\Gamma^{-1} \Sigma_{X^K}^{-1} \Gamma^{-1} Q + \sum_{j=1}^{L} u_j\, {}^t(\hat{a}_j Q)(\hat{a}_j Q).$$

For each $i = 1, 2, \cdots, L$, choose a $K \times K$ unitary matrix $Q = Q_i$ so that $\hat{a}_i Q_i = [\|\hat{a}_i\|\; 0 \cdots 0]$. For this choice of $Q = Q_i$, set

$$\eta_i = \eta_i(u^L_{[i]}) \triangleq \left[{}^tQ_i\, {}^t\Gamma^{-1} \Sigma_{X^K}^{-1} \Gamma^{-1} Q_i\right]_{11} + \sum_{j \neq i} u_j \left[{}^t(\hat{a}_j Q_i)(\hat{a}_j Q_i)\right]_{11},$$

where $u^L_{[i]} \triangleq (u_1, \cdots, u_{i-1}, u_{i+1}, \cdots, u_L)$. Similar notations are used for other variables or random variables. Then we have

$$\left[{}^tQ_i\, {}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A\right)\Gamma^{-1} Q_i\right]_{11} = \|\hat{a}_i\|^2 u_i + \eta_i.$$

If $(i', i'') \neq (1, 1)$, then the value of

$$\left[{}^tQ_i\, {}^t\Gamma^{-1}\left(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A\right)\Gamma^{-1} Q_i\right]_{i'i''} = \left[{}^tQ_i\, {}^t\Gamma^{-1} \Sigma_{X^K}^{-1} \Gamma^{-1} Q_i\right]_{i'i''} + \sum_{j=1}^{L} u_j \left[{}^t(\hat{a}_j Q_i)(\hat{a}_j Q_i)\right]_{i'i''}$$

does not depend on $u_i$. Note that the matrix ${}^tQ_i\, {}^t\Gamma^{-1}(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A)\Gamma^{-1} Q_i$ has the same eigenvalues as ${}^t\Gamma^{-1}(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(u^L)}^{-1} A)\Gamma^{-1}$. We recall here that $\alpha_j = \alpha_j(u^L)$, $j = 1, 2, \cdots, K$ are the $K$ eigenvalues of the above two matrices. Let $\alpha_{\min} = \alpha_{\min}(u^L)$ and $\alpha_{\max} = \alpha_{\max}(u^L)$ be the minimum and maximum eigenvalues among $\alpha_j$, $j = 1, 2, \cdots, K$. According to Oohama [19], [20], we have the following lemma on those eigenvalues.

Lemma 2 (Oohama [19], [20]): For each $i = 1, 2, \cdots, L$, we have
$$\alpha_{\min}(u^L) \leq \|\hat{a}_i\|^2 u_i + \eta_i(u^L_{[i]}) \leq \alpha_{\max}(u^L),$$
$$\frac{\partial \alpha_j}{\partial u_i} \geq 0, \text{ for } j = 1, 2, \cdots, K, \qquad \sum_{j=1}^{K} \frac{\partial \alpha_j}{\partial u_i} = \|\hat{a}_i\|^2.$$

The following is a key lemma for deriving a sufficient condition for the MD condition to hold.

Lemma 3: If $\alpha_{\min}(u^L)$ and $\alpha_{\max}(u^L)$ satisfy

$$\frac{1}{\alpha_{\min}(u^L)} - \frac{1}{\alpha_{\max}(u^L)} \leq \frac{1}{\|\hat{a}_i\|^2 \frac{1}{\sigma_{N_i}^2} + \eta_i(u^L_{[i]})}$$

for $i = 1, 2, \cdots, L$ on $\mathcal{B}_L(\Gamma, D)$, then $\theta(\Gamma, D, u^L)$ satisfies the MD condition on $\mathcal{B}_L(\Gamma, D)$.

Let $\alpha^*_{\max}$ be the maximum eigenvalue of ${}^t\Gamma^{-1}(\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L}^{-1} A)\Gamma^{-1}$. From Lemmas 1-3 and an elementary computation we obtain the following.

Theorem 5: If we have

$$\mathrm{tr}[\Gamma \Sigma_{X^K|Y^L}\, {}^t\Gamma] < D \leq \frac{K + 1}{\alpha^*_{\max}},$$

then
$$\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) = \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L}) = \mathcal{R}_L(\Gamma, D|\Sigma_{X^K Y^L}) = \mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L}).$$
In particular,

$$R_{\mathrm{sum},L}(\Gamma, D|\Sigma_{X^K Y^L}) = \min_{r^L \in \mathcal{B}_L(\Gamma, D)} \left\{ \sum_{i=1}^{L} r_i + \frac{1}{2} \log\left[ \frac{\left|\Sigma_{X^K}^{-1} + {}^tA\, \Sigma_{N^L(r^L)}^{-1} A\right|}{\left|\Sigma_{X^K}^{-1}\right|} \right] \right\}.$$

Proofs of Lemma 3 and Theorem 5 will be stated in Section V. From Theorem 5, we can see that there are several nontrivial cases where $\mathcal{R}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{X^K Y^L})$ and $\mathcal{R}_L^{(\mathrm{out})}(\Gamma, D|\Sigma_{X^K Y^L})$ match.
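The sum rate expression in Theorem 5 can be evaluated numerically, e.g. by a crude grid search over $r^L \in \mathcal{B}_L(\Gamma, D)$. The sketch below uses hypothetical source parameters and a coarse grid; a proper implementation would use a convex solver instead:

```python
import itertools
import numpy as np

# Hypothetical example parameters (not from the paper).
Sigma_X = np.array([[1.0, 0.4], [0.4, 1.0]])
A = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
sigma_N2 = np.array([0.2, 0.3, 0.25])
Gamma = np.eye(2)
D = 0.4

inv_SX = np.linalg.inv(Sigma_X)

def M(r):  # Sigma_X^{-1} + tA Sigma_{N^L(r^L)}^{-1} A
    return inv_SX + A.T @ np.diag((1 - np.exp(-2 * r)) / sigma_N2) @ A

best = np.inf
grid = np.linspace(0.0, 3.0, 16)
for r in itertools.product(grid, repeat=3):
    r = np.array(r)
    Mr = M(r)
    if np.trace(Gamma @ np.linalg.inv(Mr) @ Gamma.T) <= D:  # r in B_L(Gamma, D)
        val = r.sum() + 0.5 * (np.log(np.linalg.det(Mr))
                               - np.log(np.linalg.det(inv_SX)))
        best = min(best, val)
print(best)  # grid approximation (from above) of the sum rate expression
```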
Fig. 2. Distributed source coding system for L correlated Gaussian sources

IV. APPLICATION TO THE MULTITERMINAL RATE DISTORTION PROBLEM

In this section we consider the multiterminal rate distortion problem for the Gaussian information source specified by $Y^L$. We consider the case where $K = L$ and $A = I_L$, so that $Y^L = X^L + N^L$. The Gaussian random variables $Y_i$, $i = 1, 2, \cdots, L$ are $L$ noisy components of the random vector $X^L$, and the Gaussian random vector $X^L$ can be regarded as a "hidden" information source of $Y^L$. Note that $(X^L, Y^L)$ satisfies $Y_S \to X^L \to Y_{S^c}$ for any $S \subseteq \Lambda$.

A. Problem Formulation and Previous Results

The distributed source coding system for $L$ correlated Gaussian sources treated here is shown in Fig. 2. The definitions of the encoder functions $\varphi_i^{(n)}$, $i = 1, 2, \cdots, L$ are the same as before. The decoder function $\phi^{(n)} = (\phi_1^{(n)}, \phi_2^{(n)}, \cdots, \phi_L^{(n)})$ is defined by

$$\phi_i^{(n)} : \mathcal{M}_1 \times \cdots \times \mathcal{M}_L \to \hat{\mathcal{Y}}_i^n, \quad i = 1, 2, \cdots, L,$$

where $\hat{\mathcal{Y}}_i$ is the real line in which estimations of $Y_i$ take values. For $\boldsymbol{Y}^L = (\boldsymbol{Y}_1, \boldsymbol{Y}_2, \cdots, \boldsymbol{Y}_L)$, set

$$\hat{\boldsymbol{Y}}^L \triangleq \begin{bmatrix} \hat{\boldsymbol{Y}}_1 \\ \hat{\boldsymbol{Y}}_2 \\ \vdots \\ \hat{\boldsymbol{Y}}_L \end{bmatrix} = \begin{bmatrix} \phi_1^{(n)}(\varphi_1^{(n)}(\boldsymbol{Y}_1), \varphi_2^{(n)}(\boldsymbol{Y}_2), \cdots, \varphi_L^{(n)}(\boldsymbol{Y}_L)) \\ \phi_2^{(n)}(\varphi_1^{(n)}(\boldsymbol{Y}_1), \varphi_2^{(n)}(\boldsymbol{Y}_2), \cdots, \varphi_L^{(n)}(\boldsymbol{Y}_L)) \\ \vdots \\ \phi_L^{(n)}(\varphi_1^{(n)}(\boldsymbol{Y}_1), \varphi_2^{(n)}(\boldsymbol{Y}_2), \cdots, \varphi_L^{(n)}(\boldsymbol{Y}_L)) \end{bmatrix},$$

$$\tilde{d}_{ii} \triangleq E\|\boldsymbol{Y}_i - \hat{\boldsymbol{Y}}_i\|^2, \qquad \tilde{d}_{ij} \triangleq E\langle \boldsymbol{Y}_i - \hat{\boldsymbol{Y}}_i,\, \boldsymbol{Y}_j - \hat{\boldsymbol{Y}}_j \rangle, \quad 1 \leq i \neq j \leq L.$$

Let $\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L}$ be the covariance matrix with $\tilde{d}_{ij}$ in its $(i, j)$ entry. For a given $\Sigma_d$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $\Sigma_d$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \phi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \tfrac{1}{n}\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L} \preceq \Sigma_d.$$
Let $\mathcal{R}_L(\Sigma_d|\Sigma_{Y^L})$ denote the set of all $\Sigma_d$-admissible rate vectors. We consider two types of distortion criteria. For each distortion criterion we define the determination problem of the rate distortion region.

Problem 3. Vector Distortion Criterion: For a given $L \times L$ invertible matrix $\Gamma$ and $D^L > 0$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $(\Gamma, D^L)$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \phi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \left[\Gamma\, \tfrac{1}{n}\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L}\, {}^t\Gamma\right]_{ii} \leq D_i, \quad \text{for } i = 1, 2, \cdots, L.$$

Let $\mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L})$ denote the set of all $(\Gamma, D^L)$-admissible rate vectors. The sum rate part of the rate distortion region is defined by

$$R_{\mathrm{sum},L}(\Gamma, D^L|\Sigma_{Y^L}) \triangleq \min_{(R_1, R_2, \cdots, R_L) \in \mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L})} \left\{ \sum_{i=1}^{L} R_i \right\}.$$

Problem 4. Sum Distortion Criterion: For a given $L \times L$ invertible matrix $\Gamma$ and $D > 0$, the rate vector $(R_1, R_2, \cdots, R_L)$ is $(\Gamma, D)$-admissible if there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \phi^{(n)})\}_{n=1}^{\infty}$ such that

$$\limsup_{n\to\infty} R_i^{(n)} \leq R_i, \quad \text{for } i = 1, 2, \cdots, L,$$
$$\limsup_{n\to\infty} \mathrm{tr}\left[\Gamma\, \tfrac{1}{n}\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L}\, {}^t\Gamma\right] \leq D.$$

Let $\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L})$ denote the set of all $(\Gamma, D)$-admissible rate vectors. The sum rate part of the rate distortion region is defined by

$$R_{\mathrm{sum},L}(\Gamma, D|\Sigma_{Y^L}) \triangleq \min_{(R_1, R_2, \cdots, R_L) \in \mathcal{R}_L(\Gamma, D|\Sigma_{Y^L})} \left\{ \sum_{i=1}^{L} R_i \right\}.$$

Relations between $\mathcal{R}_L(\Sigma_d|\Sigma_{Y^L})$, $\mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L})$ are as follows:

$$\mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L}) = \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_L(D^L)} \mathcal{R}_L(\Sigma_d|\Sigma_{Y^L}), \qquad (6)$$
$$\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L}) = \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \mathcal{R}_L(\Sigma_d|\Sigma_{Y^L}). \qquad (7)$$

Furthermore, we have

$$\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L}) = \bigcup_{\sum_{i=1}^{L} D_i \leq D} \mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L}). \qquad (8)$$

We first present inner bounds of $\mathcal{R}_L(\Sigma_d|\Sigma_{Y^L})$, $\mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L})$. These inner bounds can be
obtained by a standard technique of multiterminal source coding. Define

$$\tilde{\mathcal{G}}(\Sigma_d) \triangleq \left\{ U^L : \begin{array}{l} U^L \text{ is a Gaussian random vector that satisfies} \\ U_S \to Y_S \to X^L \to Y_{S^c} \to U_{S^c} \text{ and } U^L \to Y^L \to X^L \\ \text{for any } S \subseteq \Lambda, \text{ and } \Sigma_{Y^L - \phi(U^L)} \preceq \Sigma_d \\ \text{for some linear mapping } \phi: \mathcal{U}^L \to \hat{\mathcal{Y}}^L \end{array} \right\}$$

and set

$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{Y^L}) \triangleq \mathrm{conv}\left\{ R^L : \text{there exists a random vector } U^L \in \tilde{\mathcal{G}}(\Sigma_d) \text{ such that } \sum_{i \in S} R_i \geq I(U_S; Y_S|U_{S^c}) \text{ for any } S \subseteq \Lambda \right\},$$

$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^L|\Sigma_{Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\Gamma\Sigma_d {}^t\Gamma \in \mathcal{S}_L(D^L)} \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{Y^L}) \right\},$$
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{Y^L}) \triangleq \mathrm{conv}\left\{ \bigcup_{\mathrm{tr}[\Gamma\Sigma_d {}^t\Gamma] \leq D} \hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{Y^L}) \right\}.$$

Then we have the following result.

Theorem 6 (Berger [4] and Tung [5]): For any positive definite $\Sigma_d$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Sigma_d|\Sigma_{Y^L}) \subseteq \mathcal{R}_L(\Sigma_d|\Sigma_{Y^L}).$$
For any invertible $\Gamma$ and any $D^L > 0$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^L|\Sigma_{Y^L}) \subseteq \mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L}).$$
For any invertible $\Gamma$ and any $D > 0$, we have
$$\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D|\Sigma_{Y^L}) \subseteq \mathcal{R}_L(\Gamma, D|\Sigma_{Y^L}).$$

The inner bound $\hat{\mathcal{R}}_L^{(\mathrm{in})}(\Gamma, D^L|\Sigma_{Y^L})$ for $\Gamma = I_L$ is well known as the inner bound of Berger [4] and Tung [5]; the above three inner bounds are variants of it. Recently, Wagner et al. [10] determined the sum rate part $R_{\mathrm{sum},2}(D^2|\Sigma_{Y^2})$. Their result is as follows.

Theorem 7 (Wagner et al. [10]): For any positive vector $D^2$, we have
$$R_{\mathrm{sum},2}(D^2|\Sigma_{Y^2}) = \min_{(R_1, R_2) \in \hat{\mathcal{R}}_2^{(\mathrm{in})}(D^2|\Sigma_{Y^2})} (R_1 + R_2).$$

According to Wagner et al. [10], the results of Oohama [9], [15], and [14] play an essential role in deriving their result. Their method of proof depends heavily on the specific properties of the case $L = 2$; it is hard to generalize it to the case of $L \geq 3$.

B. New Partial Solutions

In this subsection we state our results on the characterizations of $\mathcal{R}_L(\Sigma_d|\Sigma_{Y^L})$, $\mathcal{R}_L(\Gamma, D^L|\Sigma_{Y^L})$, and $\mathcal{R}_L(\Gamma, D|\Sigma_{Y^L})$. Before describing those results we derive an important relation between the remote source coding problem and the multiterminal rate distortion problem. We first observe that by an elementary computation we have

$$X^L = \tilde{A}Y^L + \tilde{N}^L, \qquad (9)$$

where $\tilde{A} = (\Sigma_{X^L}^{-1} + \Sigma_{N^L}^{-1})^{-1} \Sigma_{N^L}^{-1}$ and $\tilde{N}^L$ is a zero mean Gaussian random vector with covariance matrix $\Sigma_{\tilde{N}^L} = (\Sigma_{X^L}^{-1} + \Sigma_{N^L}^{-1})^{-1}$. The random vector $\tilde{N}^L$ is independent of $Y^L$. Set

$$B \triangleq \tilde{A}^{-1} \Sigma_{\tilde{N}^L}\, {}^t\tilde{A}^{-1} = \Sigma_{N^L} + \Sigma_{N^L} \Sigma_{X^L}^{-1} \Sigma_{N^L},$$
$$b^L \triangleq {}^t([B]_{11}, [B]_{22}, \cdots, [B]_{LL}), \qquad \tilde{B} \triangleq \Gamma B\, {}^t\Gamma, \qquad \tilde{b}^L \triangleq {}^t([\tilde{B}]_{11}, [\tilde{B}]_{22}, \cdots, [\tilde{B}]_{LL}).$$

From (9), we have the following relation between $\boldsymbol{X}^L$ and $\boldsymbol{Y}^L$:

$$\boldsymbol{X}^L = \tilde{A}\boldsymbol{Y}^L + \tilde{\boldsymbol{N}}^L, \qquad (10)$$

where $\tilde{\boldsymbol{N}}^L$ is a sequence of $n$ independent copies of $\tilde{N}^L$ and is independent of $\boldsymbol{Y}^L$. Now, fix $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ arbitrarily. For each $n = 1, 2, \cdots$, the estimation $\hat{\boldsymbol{X}}^L$ of $\boldsymbol{X}^L$ is given by

$$\hat{\boldsymbol{X}}^L = \begin{bmatrix} \psi_1^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \\ \psi_2^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \\ \vdots \\ \psi_L^{(n)}(\varphi^{(n)}(\boldsymbol{Y}^L)) \end{bmatrix}.$$

Using this estimation, we construct an estimation $\hat{\boldsymbol{Y}}^L$ of $\boldsymbol{Y}^L$ by

$$\hat{\boldsymbol{Y}}^L = \tilde{A}^{-1} \hat{\boldsymbol{X}}^L, \qquad (11)$$

which is equivalent to

$$\hat{\boldsymbol{X}}^L = \tilde{A} \hat{\boldsymbol{Y}}^L. \qquad (12)$$

From (10) and (12), we have

$$\boldsymbol{X}^L - \hat{\boldsymbol{X}}^L = \tilde{A}(\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L) + \tilde{\boldsymbol{N}}^L. \qquad (13)$$

Since $\hat{\boldsymbol{Y}}^L$ is a function of $\boldsymbol{Y}^L$, $\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L$ is independent of $\tilde{\boldsymbol{N}}^L$. Computing $\frac{1}{n}\Sigma_{\boldsymbol{X}^L - \hat{\boldsymbol{X}}^L}$ based on (13), we obtain

$$\frac{1}{n}\Sigma_{\boldsymbol{X}^L - \hat{\boldsymbol{X}}^L} = \tilde{A}\, \frac{1}{n}\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L}\, {}^t\tilde{A} + \Sigma_{\tilde{N}^L}. \qquad (14)$$

From (14), we have

$$\frac{1}{n}\Sigma_{\boldsymbol{Y}^L - \hat{\boldsymbol{Y}}^L} = \tilde{A}^{-1}\, \frac{1}{n}\Sigma_{\boldsymbol{X}^L - \hat{\boldsymbol{X}}^L}\, {}^t\tilde{A}^{-1} - \tilde{A}^{-1} \Sigma_{\tilde{N}^L}\, {}^t\tilde{A}^{-1} = \tilde{A}^{-1}\, \frac{1}{n}\Sigma_{\boldsymbol{X}^L - \hat{\boldsymbol{X}}^L}\, {}^t\tilde{A}^{-1} - B. \qquad (15)$$
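The backward channel $X^L = \tilde{A}Y^L + \tilde{N}^L$ and the matrix $B$ can be verified numerically. The sketch below (with hypothetical $\Sigma_{X^L}$ and $\Sigma_{N^L}$) checks that $\tilde{A}$ is the MMSE regression matrix of $X^L$ on $Y^L$, that $\Sigma_{\tilde{N}^L}$ is the resulting error covariance, and that $B = \Sigma_{N^L} + \Sigma_{N^L}\Sigma_{X^L}^{-1}\Sigma_{N^L}$:

```python
import numpy as np

# Hypothetical example parameters (not from the paper); K = L, A = I_L.
Sigma_X = np.array([[1.0, 0.4],
                    [0.4, 1.0]])
Sigma_N = np.diag([0.2, 0.3])            # Y^L = X^L + N^L

inv_X, inv_N = np.linalg.inv(Sigma_X), np.linalg.inv(Sigma_N)
A_tilde = np.linalg.inv(inv_X + inv_N) @ inv_N
Sigma_Ntilde = np.linalg.inv(inv_X + inv_N)

# A_tilde is the MMSE regression matrix of X on Y = X + N.
Sigma_Y = Sigma_X + Sigma_N
print(np.allclose(A_tilde, Sigma_X @ np.linalg.inv(Sigma_Y)))      # True

# Sigma_Ntilde is the resulting estimation error covariance.
print(np.allclose(Sigma_Ntilde, Sigma_X - A_tilde @ Sigma_X))      # True

# B = A_tilde^{-1} Sigma_Ntilde tA_tilde^{-1} equals Sigma_N + Sigma_N Sigma_X^{-1} Sigma_N.
inv_At = np.linalg.inv(A_tilde)
B = inv_At @ Sigma_Ntilde @ inv_At.T
print(np.allclose(B, Sigma_N + Sigma_N @ inv_X @ Sigma_N))         # True
```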
(n)
(n)
(n)
Conversely, we fix {(ϕ1 , ϕ2 , · · · , ϕL , φ(n) )}∞ n=1 , arbiL trary. For each n = 1, 2, · · ·, using the estimation Yˆ of Y L given by (n) φ1 (ϕ(n) (Y L )) (n) (n) L φ2 (ϕ (Y )) L , ˆ Y = .. . (n)
φL (ϕ(n) (Y L ))
we construct an estimation of $X^L$ by (12). Then using (10) and (12), we obtain (13). Hence we have the relation (14).

The following proposition provides an important strong connection between the remote source coding problem and the multiterminal rate distortion problem.

Proposition 1: For any positive definite $\Sigma_d$, we have
$$\mathcal R_L(\Sigma_d|\Sigma_{Y^L}) = \mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}\,|\,\Sigma_{X^LY^L}).$$
For any invertible $\Gamma$ and any $D^L > 0$, we have
$$\mathcal R_L(\Gamma, D^L|\Sigma_{Y^L}) = \mathcal R_L(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L\,|\,\Sigma_{X^LY^L}).$$
For any invertible $\Gamma$ and any $D > 0$, we have
$$\mathcal R_L(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}]\,|\,\Sigma_{X^LY^L}),$$
where $\tilde{B} \triangleq \Gamma B\,{}^t\Gamma$.

Proof: Suppose that $R^L \in \mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$. Then there exists $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ such that
$$\limsup_{n\to\infty} R_i^{(n)} \le R_i,\ i = 1, 2, \cdots, L, \qquad \limsup_{n\to\infty}\frac{1}{n}\Sigma_{X^L-\hat{X}^L} \preceq \tilde{A}(\Sigma_d+B){}^t\tilde{A}.$$
Using $\hat{X}^L$, we construct an estimation $\hat{Y}^L$ of $Y^L$ by $\hat{Y}^L = \tilde{A}^{-1}\hat{X}^L$. Then from (15), we have
$$\limsup_{n\to\infty}\frac{1}{n}\Sigma_{Y^L-\hat{Y}^L} = \limsup_{n\to\infty}\tilde{A}^{-1}\left[\frac{1}{n}\Sigma_{X^L-\hat{X}^L}\right]{}^t\tilde{A}^{-1} - B \preceq \tilde{A}^{-1}\tilde{A}(\Sigma_d+B){}^t\tilde{A}\,{}^t\tilde{A}^{-1} - B = \Sigma_d,$$
which implies that $R^L \in \mathcal R_L(\Sigma_d|\Sigma_{Y^L})$. Thus
$$\mathcal R_L(\Sigma_d|\Sigma_{Y^L}) \supseteq \mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
is proved. Next we prove the reverse inclusion. Suppose that $R^L \in \mathcal R_L(\Sigma_d|\Sigma_{Y^L})$. Then there exists $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \phi^{(n)})\}_{n=1}^{\infty}$ such that
$$\limsup_{n\to\infty} R_i^{(n)} \le R_i,\ i = 1, 2, \cdots, L, \qquad \limsup_{n\to\infty}\frac{1}{n}\Sigma_{Y^L-\hat{Y}^L} \preceq \Sigma_d.$$
Using $\hat{Y}^L$, we construct an estimation $\hat{X}^L$ of $X^L$ by $\hat{X}^L = \tilde{A}\hat{Y}^L$. Then from (14), we have
$$\limsup_{n\to\infty}\frac{1}{n}\Sigma_{X^L-\hat{X}^L} = \limsup_{n\to\infty}\tilde{A}\left[\frac{1}{n}\Sigma_{Y^L-\hat{Y}^L}\right]{}^t\tilde{A} + \Sigma_{\tilde{N}^L} \preceq \tilde{A}\Sigma_d{}^t\tilde{A} + \Sigma_{\tilde{N}^L} = \tilde{A}(\Sigma_d+B){}^t\tilde{A},$$
which implies that $R^L \in \mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$. Thus
$$\mathcal R_L(\Sigma_d|\Sigma_{Y^L}) \subseteq \mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
is proved, establishing the first equality. Next we prove the second equality. We have the following chain of equalities:
$$\mathcal R_L(\Gamma, D^L|\Sigma_{Y^L}) = \bigcup_{\Gamma\Sigma_d{}^t\Gamma\in\mathcal S_L(D^L)}\mathcal R_L(\Sigma_d|\Sigma_{Y^L}) = \bigcup_{\Gamma\Sigma_d{}^t\Gamma\in\mathcal S_L(D^L)}\mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
$$= \bigcup_{\Gamma\tilde{A}^{-1}\tilde{A}(\Sigma_d+B){}^t\tilde{A}\,{}^t(\Gamma\tilde{A}^{-1})\in\mathcal S_L(D^L+\tilde{b}^L)}\mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
$$= \bigcup_{\substack{\hat\Sigma_d = \tilde{A}(\Sigma_d+B){}^t\tilde{A} \succ \Sigma_{X^L|Y^L},\\ \Gamma\tilde{A}^{-1}\hat\Sigma_d{}^t(\Gamma\tilde{A}^{-1})\in\mathcal S_L(D^L+\tilde{b}^L)}}\mathcal R_L(\hat\Sigma_d|\Sigma_{X^LY^L}) = \mathcal R_L(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L|\Sigma_{X^LY^L}).$$
Thus the second equality is proved. Finally we prove the third equality. We have the following chain of equalities:
$$\mathcal R_L(\Gamma, D|\Sigma_{Y^L}) = \bigcup_{{\rm tr}[\Gamma\Sigma_d{}^t\Gamma]\le D}\mathcal R_L(\Sigma_d|\Sigma_{Y^L}) = \bigcup_{{\rm tr}[\Gamma\Sigma_d{}^t\Gamma]\le D}\mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
$$= \bigcup_{{\rm tr}[\Gamma\tilde{A}^{-1}\tilde{A}(\Sigma_d+B){}^t\tilde{A}\,{}^t(\Gamma\tilde{A}^{-1})] - {\rm tr}[\Gamma B{}^t\Gamma]\le D}\mathcal R_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$$
$$= \bigcup_{\substack{\hat\Sigma_d = \tilde{A}(\Sigma_d+B){}^t\tilde{A} \succ \Sigma_{X^L|Y^L},\\ {\rm tr}[\Gamma\tilde{A}^{-1}\hat\Sigma_d{}^t(\Gamma\tilde{A}^{-1})]\le D+{\rm tr}[\tilde{B}]}}\mathcal R_L(\hat\Sigma_d|\Sigma_{X^LY^L}) = \mathcal R_L(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}]|\Sigma_{X^LY^L}).$$
Thus the third equality is proved.

Proposition 1 implies that all results on the rate distortion regions for the remote source coding problem can be converted into results on the multiterminal source coding problem. In the following we derive inner and outer bounds of $\mathcal R_L(\Sigma_d|\Sigma_{Y^L})$, $\mathcal R_L(\Gamma, D^L|\Sigma_{Y^L})$, and $\mathcal R_L(\Gamma, D|\Sigma_{Y^L})$ using Proposition 1.

We first derive inner and outer bounds of $\mathcal R_L(\Sigma_d|\Sigma_{Y^L})$. For $r_i \ge 0$, $i\in\Lambda$, let $V_i(r_i)$, $i\in\Lambda$, be $L$ independent Gaussian random variables with mean 0 and variance $\sigma_{V_i}^2/(e^{2r_i}-1)$. Let $\Sigma_{V^L(r^L)}$ be the covariance matrix of the random vector $V^L(r^L)$. Fix a nonnegative vector $r^L$. For $\theta > 0$ and for $S \subseteq \Lambda$,
define
$$\Sigma^{-1}_{V_{S^c}(r_{S^c})} \triangleq \Sigma^{-1}_{V^L(r^L)}\Big|_{r_S = 0},$$
$$\tilde{J}_S(\theta, r_S|r_{S^c}) \triangleq \frac{1}{2}\log^{+}\left[\frac{\left(\prod_{i=1}^{L}e^{2r_i}\right)|\Sigma_{Y^L}+B|}{\theta\,|\Sigma_{Y^L}|\left|\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V_{S^c}(r_{S^c})}\right|}\right], \qquad \tilde{J}_S(r_S|r_{S^c}) \triangleq \frac{1}{2}\log\frac{\left|\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V^L(r^L)}\right|}{\left|\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V_{S^c}(r_{S^c})}\right|}.$$
Set
$$\tilde{\mathcal A}_L(\Sigma_d) \triangleq \left\{r^L \ge 0 : \left[\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V^L(r^L)}\right]^{-1} \preceq \Sigma_d\right\}, \qquad \tilde{\mathcal A}_L(r^L) \triangleq \left\{\Sigma_d : \Sigma_d \succeq \left(\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V^L(r^L)}\right)^{-1}\right\}.$$
Define four regions by
$$\mathcal R_L^{(out)}(\theta, r^L|\Sigma_{Y^L}) \triangleq \left\{R^L : \sum_{i\in S}R_i \ge \tilde{J}_S(\theta, r_S|r_{S^c}) \ \mbox{for any}\ S\subseteq\Lambda\right\},$$
$$\mathcal R_L^{(in)}(r^L|\Sigma_{Y^L}) \triangleq \left\{R^L : \sum_{i\in S}R_i \ge \tilde{J}_S(r_S|r_{S^c}) \ \mbox{for any}\ S\subseteq\Lambda\right\},$$
$$\mathcal R_L^{(out)}(\Sigma_d|\Sigma_{Y^L}) \triangleq \bigcup_{r^L\in\tilde{\mathcal A}_L(\Sigma_d)}\mathcal R_L^{(out)}(|\Sigma_d+B|, r^L|\Sigma_{Y^L}), \qquad \mathcal R_L^{(in)}(\Sigma_d|\Sigma_{Y^L}) \triangleq {\rm conv}\left\{\bigcup_{r^L\in\tilde{\mathcal A}_L(\Sigma_d)}\mathcal R_L^{(in)}(r^L|\Sigma_{Y^L})\right\}.$$
The functions and sets defined above have the properties shown in the following.

Property 6:
a) For any positive definite $\Sigma_d$, $\tilde{\mathcal G}(\Sigma_d) = \mathcal G(\tilde{A}(\Sigma_d+B){}^t\tilde{A})$.
b) For any positive definite $\Sigma_d$, we have $\hat{\mathcal R}_L^{(in)}(\Sigma_d|\Sigma_{Y^L}) = \hat{\mathcal R}_L^{(in)}(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L})$.
c) For any positive definite $\Sigma_d$ and any $S\subseteq\Lambda$, we have
$$J_S(|\tilde{A}(\Sigma_d+B){}^t\tilde{A}|, r_S|r_{S^c}) = \tilde{J}_S(|\Sigma_d+B|, r_S|r_{S^c}), \qquad J_S(r_S|r_{S^c}) = \tilde{J}_S(r_S|r_{S^c}).$$
d) For any positive definite $\Sigma_d$, $\mathcal A_L(\tilde{A}(\Sigma_d+B){}^t\tilde{A}) = \tilde{\mathcal A}_L(\Sigma_d)$.
e) For any positive definite $\Sigma_d$, we have
$$\mathcal R_L^{(out)}(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L}) = \mathcal R_L^{(out)}(\Sigma_d|\Sigma_{Y^L}), \qquad \mathcal R_L^{(in)}(\tilde{A}(\Sigma_d+B){}^t\tilde{A}|\Sigma_{X^LY^L}) = \mathcal R_L^{(in)}(\Sigma_d|\Sigma_{Y^L}).$$

From Theorem 3, Proposition 1, and Property 6, we have the following.

Theorem 8: For any positive definite $\Sigma_d$, we have
$$\mathcal R_L^{(in)}(\Sigma_d|\Sigma_{Y^L}) \subseteq \hat{\mathcal R}_L^{(in)}(\Sigma_d|\Sigma_{Y^L}) \subseteq \mathcal R_L(\Sigma_d|\Sigma_{Y^L}) \subseteq \mathcal R_L^{(out)}(\Sigma_d|\Sigma_{Y^L}).$$

Next, we derive inner and outer bounds of $\mathcal R_L(\Gamma, D^L|\Sigma_{Y^L})$ and $\mathcal R_L(\Gamma, D|\Sigma_{Y^L})$. Set
$$\tilde\theta(\Gamma, D^L, r^L) \triangleq \max_{\substack{\Sigma_d\in\tilde{\mathcal A}_L(r^L),\\ \Gamma\Sigma_d{}^t\Gamma\in\mathcal S_L(D^L)}}|\Sigma_d+B|, \qquad \tilde\theta(\Gamma, D, r^L) \triangleq \max_{\substack{\Sigma_d\in\tilde{\mathcal A}_L(r^L),\\ {\rm tr}[\Gamma\Sigma_d{}^t\Gamma]\le D}}|\Sigma_d+B|.$$
Furthermore, set
$$\tilde{\mathcal B}_L(\Gamma, D^L) \triangleq \left\{r^L\ge0 : \Gamma\left(\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V^L(r^L)}\right)^{-1}{}^t\Gamma \in \mathcal S_L(D^L)\right\},$$
$$\tilde{\mathcal B}_L(\Gamma, D) \triangleq \left\{r^L\ge0 : {\rm tr}\left[\Gamma\left(\Sigma^{-1}_{Y^L}+\Sigma^{-1}_{V^L(r^L)}\right)^{-1}{}^t\Gamma\right]\le D\right\}.$$
Define four regions by
$$\mathcal R_L^{(out)}(\Gamma, D^L|\Sigma_{Y^L}) \triangleq \bigcup_{r^L\in\tilde{\mathcal B}_L(\Gamma, D^L)}\mathcal R_L^{(out)}(\tilde\theta(\Gamma, D^L, r^L), r^L|\Sigma_{Y^L}), \qquad \mathcal R_L^{(in)}(\Gamma, D^L|\Sigma_{Y^L}) \triangleq {\rm conv}\left\{\bigcup_{r^L\in\tilde{\mathcal B}_L(\Gamma, D^L)}\mathcal R_L^{(in)}(r^L|\Sigma_{Y^L})\right\},$$
$$\mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L}) \triangleq \bigcup_{r^L\in\tilde{\mathcal B}_L(\Gamma, D)}\mathcal R_L^{(out)}(\tilde\theta(\Gamma, D, r^L), r^L|\Sigma_{Y^L}), \qquad \mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) \triangleq {\rm conv}\left\{\bigcup_{r^L\in\tilde{\mathcal B}_L(\Gamma, D)}\mathcal R_L^{(in)}(r^L|\Sigma_{Y^L})\right\}.$$
It can easily be verified that the functions and sets defined above have the following properties.

Property 7:
a) For any invertible $\Gamma$ and any $D^L > 0$, we have
$$\hat{\mathcal R}_L^{(in)}(\Gamma, D^L|\Sigma_{Y^L}) = \hat{\mathcal R}_L^{(in)}(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L|\Sigma_{X^LY^L}).$$
For any invertible $\Gamma$ and any $D > 0$, we have
$$\hat{\mathcal R}_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \hat{\mathcal R}_L^{(in)}(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}]|\Sigma_{X^LY^L}).$$
b) For any $r^L \ge 0$, we have
$$\Sigma_d \in \tilde{\mathcal A}(r^L) \Leftrightarrow \tilde{A}(\Sigma_d+B){}^t\tilde{A} \in \mathcal A(r^L),$$
$$\tilde\theta(\Gamma, D^L, r^L) = |\tilde{A}|^{-2}\,\theta(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L, r^L), \qquad \tilde\theta(\Gamma, D, r^L) = |\tilde{A}|^{-2}\,\theta(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}], r^L).$$
c) For any invertible $\Gamma$ and any $D^L > 0$, we have
$$\mathcal R_L^{(out)}(\Gamma, D^L|\Sigma_{Y^L}) = \mathcal R_L^{(out)}(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L|\Sigma_{X^LY^L}), \qquad \mathcal R_L^{(in)}(\Gamma, D^L|\Sigma_{Y^L}) = \mathcal R_L^{(in)}(\Gamma\tilde{A}^{-1}, D^L+\tilde{b}^L|\Sigma_{X^LY^L}).$$
For any invertible $\Gamma$ and any $D > 0$, we have
$$\mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L^{(out)}(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}]|\Sigma_{X^LY^L}), \qquad \mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L^{(in)}(\Gamma\tilde{A}^{-1}, D+{\rm tr}[\tilde{B}]|\Sigma_{X^LY^L}).$$

From Corollary 1, Proposition 1, and Property 7, we have the following theorem.

Theorem 9: For any invertible $\Gamma$ and any $D^L > 0$, we have
$$\mathcal R_L^{(in)}(\Gamma, D^L|\Sigma_{Y^L}) \subseteq \hat{\mathcal R}_L^{(in)}(\Gamma, D^L|\Sigma_{Y^L}) \subseteq \mathcal R_L(\Gamma, D^L|\Sigma_{Y^L}) \subseteq \mathcal R_L^{(out)}(\Gamma, D^L|\Sigma_{Y^L}).$$
For any invertible $\Gamma$ and any $D > 0$, we have
$$\mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) \subseteq \hat{\mathcal R}_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) \subseteq \mathcal R_L(\Gamma, D|\Sigma_{Y^L}) \subseteq \mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L}).$$

Next, we derive a matching condition for $\mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L})$ to coincide with $\mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L})$. By Theorems 5 and 9, Proposition 1, and Property 7, we establish the following.

Theorem 10: Let $\mu^*_{\min}$ be the minimum eigenvalue of
$$\tilde{B} = \Gamma\left(\Sigma_{N^L}+\Sigma_{N^L}\Sigma^{-1}_{X^L}\Sigma_{N^L}\right){}^t\Gamma.$$
If we have
$$0 < D \le (L+1)\mu^*_{\min} - {\rm tr}\left[\Gamma\left(\Sigma_{N^L}+\Sigma_{N^L}\Sigma^{-1}_{X^L}\Sigma_{N^L}\right){}^t\Gamma\right],$$
then
$$\mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \hat{\mathcal R}_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L}).$$

We are particularly interested in the case where $\Gamma$ is the following diagonal matrix:
$$\Gamma = {\rm diag}(\gamma_1, \gamma_2, \cdots, \gamma_L), \qquad \sum_{i=1}^{L}\gamma_i^{-2} = 1. \qquad (16)$$
Let $\delta > 0$ be an arbitrary positive constant specified later. We choose $\Sigma_{N^L}$ so that $\Sigma_{N^L} = \delta\Gamma^{-2}$. Set $\tilde\Sigma_{X^L} \triangleq \Gamma\Sigma_{X^L}\Gamma$. Then we have $\tilde{B} = \delta I_L + \delta^2\tilde\Sigma^{-1}_{X^L}$. Hence we have
$$\mu^*_{\min} \ge \delta. \qquad (17)$$
Let $\lambda_{\min}$ be the minimum eigenvalue of $\Sigma_{X^L}$. Since $\Sigma_{X^L} \succ \lambda_{\min}I_L$, we have $\tilde\Sigma^{-1}_{X^L} \prec \lambda^{-1}_{\min}\Gamma^{-2}$. Hence we have
$${\rm tr}\left[\tilde\Sigma^{-1}_{X^L}\right] \le \lambda^{-1}_{\min}{\rm tr}\left[\Gamma^{-2}\right] = \lambda^{-1}_{\min}, \qquad (18)$$
where the last equality follows from the choice of $\Gamma$ specified by (16). From (17) and (18), we have
$$(L+1)\mu^*_{\min} - {\rm tr}[\tilde{B}] \ge (L+1)\delta - {\rm tr}\left[\delta I_L + \delta^2\tilde\Sigma^{-1}_{X^L}\right] \ge (L+1)\delta - L\delta - \delta^2\lambda^{-1}_{\min} = \delta - \delta^2\lambda^{-1}_{\min}. \qquad (19)$$
Hence if
$$0 < D \le \delta - \delta^2\lambda^{-1}_{\min}, \qquad (20)$$
then the matching condition holds. The right member of (20) takes its maximum value $\frac{1}{4}\lambda_{\min}$ at $\delta = \frac{1}{2}\lambda_{\min}$. Summarizing the above argument, we establish the following corollary of Theorem 10.

Corollary 2: If the minimum eigenvalue $\lambda_{\min}$ of $\Sigma_{X^L}$ satisfies
$$0 < D \le \frac{1}{4}\lambda_{\min},$$
then for any diagonal matrix $\Gamma$ specified by (16) we have
$$\mathcal R_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \hat{\mathcal R}_L^{(in)}(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L(\Gamma, D|\Sigma_{Y^L}) = \mathcal R_L^{(out)}(\Gamma, D|\Sigma_{Y^L}).$$

C. Sum Rate Characterization for the Cyclic Shift Invariant Source

In this subsection we further examine an explicit characterization of $R_{{\rm sum},L}(D|\Sigma_{Y^L})$ when the source has a certain symmetry. Let
$$\tau = \begin{pmatrix}1 & 2 & \cdots & i & \cdots & L\\ \tau(1) & \tau(2) & \cdots & \tau(i) & \cdots & \tau(L)\end{pmatrix}$$
be a cyclic shift on $\Lambda$, that is,
$$\tau(1) = 2,\ \tau(2) = 3,\ \cdots,\ \tau(L-1) = L,\ \tau(L) = 1.$$
Let $p_{X_\Lambda}(x_\Lambda) = p_{X_1X_2\cdots X_L}(x_1, x_2, \cdots, x_L)$ be the probability density function of $X^L$. The source $X^L$ is said to be cyclic shift invariant if we have
$$p_{X_\Lambda}(x_{\tau(\Lambda)}) = p_{X_1X_2\cdots X_L}(x_2, x_3, \cdots, x_L, x_1) = p_{X_1X_2\cdots X_L}(x_1, x_2, \cdots, x_{L-1}, x_L)$$
for any $(x_1, x_2, \cdots, x_L)\in\mathcal X^L$.

In the following argument we assume that $X^L$ satisfies the cyclic shift invariant property. We further assume that $N_i$, $i\in\Lambda$, are i.i.d. Gaussian random variables with mean 0 and variance $\epsilon$; that is, the covariance matrix $\Sigma_{N^L}$ of $N^L$ is given by $\epsilon I_L$. Then the observation $Y^L = X^L + N^L$ also satisfies the cyclic shift invariant property. In this case $\tilde{A}$ and $B$ are given by
$$\tilde{A} = \left(\epsilon\Sigma^{-1}_{X^L}+I_L\right)^{-1}, \qquad B = \epsilon\left(I_L+\epsilon\Sigma^{-1}_{X^L}\right).$$
Fix $r > 0$ and let $N_i(r)$, $i\in\Lambda$, be $L$ i.i.d. Gaussian random variables with mean 0 and variance $\epsilon/(1-e^{-2r})$. The covariance matrix $\Sigma_{N^L(r)}$ of the random vector $N^L(r)$ is given by
$$\Sigma_{N^L(r)} = \frac{\epsilon}{1-e^{-2r}}\,I_L.$$
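The cyclic shift invariance used here can be checked concretely: a circulant covariance matrix models a cyclic shift invariant Gaussian source, and adding i.i.d. noise with covariance $\epsilon I_L$ preserves the invariance. The following is a minimal numerical sketch; the size $L$, the correlation profile, and $\epsilon$ are illustrative choices of mine, not values from the paper.

```python
import numpy as np

# A circulant covariance matrix is invariant under the cyclic shift tau.
# We check tau(Sigma_X) = Sigma_X and that Y = X + N with Sigma_N = eps*I
# inherits the same invariance.
L, eps = 4, 0.1
c = np.array([1.0, 0.5, 0.2, 0.5])          # first row of a circulant matrix
Sigma_X = np.array([[c[(j - i) % L] for j in range(L)] for i in range(L)])

P = np.roll(np.eye(L), 1, axis=0)           # permutation matrix of tau
Sigma_Y = Sigma_X + eps * np.eye(L)         # covariance of Y = X + N

# tau(Sigma) = P Sigma P^T ; invariance means the source statistics are
# unchanged under a cyclic relabeling of the L terminals.
assert np.allclose(P @ Sigma_X @ P.T, Sigma_X)
assert np.allclose(P @ Sigma_Y @ P.T, Sigma_Y)
```

The same permutation-matrix test applies to any candidate $\Sigma_{X^L}$; a covariance is cyclic shift invariant exactly when it is circulant.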
Let $\lambda_i$, $i\in\Lambda$, be the $L$ eigenvalues of the matrix $\Sigma_{X^L}$ and let $\beta_i = \beta_i(r)$, $i\in\Lambda$, be the $L$ eigenvalues of the matrix
$${}^t\tilde{A}\left(\Sigma^{-1}_{X^L}+\frac{1-e^{-2r}}{\epsilon}I_L\right)\tilde{A}.$$
Using the eigenvalues of $\Sigma_{X^L}$, the $\beta_i(r)$, $i\in\Lambda$, can be written as
$$\beta_i(r) = \frac{1}{\epsilon}\left[\frac{\lambda_i}{\lambda_i+\epsilon}-\left(\frac{\lambda_i}{\lambda_i+\epsilon}\right)^2 e^{-2r}\right].$$
Let $\xi$ be a nonnegative number that satisfies
$$\sum_{i=1}^{L}\left\{[\xi-\beta_i^{-1}]^{+}+\beta_i^{-1}\right\} = D+{\rm tr}[B].$$
Set
$$\tilde\omega(D, r) \triangleq \prod_{i=1}^{L}\left\{[\xi-\beta_i^{-1}]^{+}+\beta_i^{-1}\right\}.$$
The function $\tilde\omega(D, r)$ has an expression as the so-called water filling solution to the following optimization problem:
$$\tilde\omega(D, r) = \max_{\substack{\xi_i\beta_i\ge1,\,i\in\Lambda,\\ \sum_{i=1}^{L}\xi_i\le D+{\rm tr}[B]}}\ \prod_{i=1}^{L}\xi_i. \qquad (21)$$
Set
$$\tilde{J}(D, r) \triangleq \frac{1}{2}\log\left[\frac{e^{2Lr}|\Sigma_{Y^L}+B|}{\tilde\omega(D, r)}\right], \qquad \zeta(r) \triangleq {\rm tr}\left[\tilde{A}^{-1}\left(\Sigma^{-1}_{X^L}+\frac{1-e^{-2r}}{\epsilon}I_L\right)^{-1}{}^t\tilde{A}^{-1}\right].$$
By definition we have
$$\zeta(r) = \sum_{i=1}^{L}\frac{1}{\beta_i(r)}. \qquad (22)$$
Since $\zeta(r)$ is a monotone decreasing function of $r$, there exists a unique $r$ such that $\zeta(r) = D+{\rm tr}[B]$; we denote it by $r^*(D+{\rm tr}[B])$. Note that
$$(\underbrace{r, r, \cdots, r}_{L}) \in \mathcal B_L(\tilde{A}^{-1}, D+{\rm tr}[B]) \Leftrightarrow \zeta(r) \le D+{\rm tr}[B] \Leftrightarrow r \ge r^*(D+{\rm tr}[B]),$$
and that
$$\tilde\omega(D, r^*) = |\tilde{A}|^{-2}\left|\Sigma^{-1}_{X^L}+\frac{1-e^{-2r^*}}{\epsilon}I_L\right|^{-1}.$$
Set
$$R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L}) \triangleq \min_{r\ge r^*(D+{\rm tr}[B])}\tilde{J}(D, r).$$
Then, we have the following.

Theorem 11: Assume that the source $X^L$ and its noisy version $Y^L = X^L+N^L$ are cyclic shift invariant. Then we have
$$R_{{\rm sum},L}(D|\Sigma_{Y^L}) \ge R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L}).$$
The proof of this theorem will be stated in Section V.

Next, we examine a sufficient condition for $R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L})$ to coincide with $R_{{\rm sum},L}(D|\Sigma_{Y^L})$. It is obvious from the definition of $\tilde{J}(D, r)$ that when $e^{-2Lr}\tilde\omega(D, r)$ is a monotone decreasing function of $r \in [r^*(D+{\rm tr}[B]), +\infty)$, we have $R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L}) = R_{{\rm sum},L}(D|\Sigma_{Y^L})$. Let $\lambda_{\max}$ be the maximum eigenvalue of $\Sigma_{X^L}$. Set
$$\beta_{i_0}(r) \triangleq \min_{1\le i\le L}\beta_i(r), \qquad \beta_{i_1}(r) \triangleq \max_{1\le i\le L}\beta_i(r).$$
Then we have the following two lemmas.

Lemma 4: If
$$\beta_{i_1}(r)-\beta_{i_0}(r) \le \epsilon e^{2r}\cdot\frac{L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2(\beta_{i_0}(r))^2,$$
or equivalently
$$e^{2r}\left[\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right]-\left[\left(\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}\right)^2-\left(\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right)^2\right] \le \frac{L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2\left[\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}e^{2r}-\left(\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right)^2\right]^2 e^{-2r} \qquad (23)$$
holds for $r \ge r^*(D+{\rm tr}[B])$, then $e^{-2Lr}\tilde\omega(D, r)$ is a monotone decreasing function of $r\in[r^*(D+{\rm tr}[B]), \infty)$.

Lemma 5: If we have
$$\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon} \le \frac{4L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2\left(\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right)^2\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}, \qquad (24)$$
then the sufficient condition (23) in Lemma 4 holds for any nonnegative $r$.

Proofs of Lemmas 4 and 5 will be given in Section V. If we take $\epsilon$ sufficiently small in (24), the left hand side of this inequality becomes close to zero while the right hand side becomes close to $\frac{4L}{L-1}$. Hence if we choose $\epsilon$ sufficiently small, the inequality (23) in Lemma 4 always holds.

Next we suppose that the Gaussian source $Y^L$ satisfies the cyclic shift invariant property. Then for an arbitrarily prescribed small positive $\epsilon$ we can always choose a Gaussian random vector $N^L$ so that $\Sigma_{N^L} = \epsilon I_L$ and $Y^L = X^L+N^L$. For this choice of $N^L$, the Gaussian remote source $X^L$ also satisfies the cyclic shift invariant property. Summarizing those arguments, we obtain the following theorem.

Theorem 12: If $Y^L$ is cyclic shift invariant, then $R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L}) = R_{{\rm sum},L}(D|\Sigma_{Y^L})$. Furthermore, the curve $R = R_{{\rm sum},L}(D|\Sigma_{Y^L})$ has the following parametric form:
$$R = \frac{1}{2}\log\left[|\Sigma_{Y^L}+B|\,e^{2Lr}\prod_{i=1}^{L}\beta_i(r)\right], \qquad D = \sum_{i=1}^{L}\frac{1}{\beta_i(r)}-{\rm tr}[B].$$
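The water filling solution (21) can be computed by bisection on the water level: the optimizer is $\xi_i = \max(\xi, 1/\beta_i)$ with the budget met with equality. The following sketch uses illustrative values of $1/\beta_i$ and of the budget $D+{\rm tr}[B]$ (my choices, not values from the paper) and spot-checks optimality against random feasible points.

```python
import numpy as np

# Water filling for (21): find xi with sum_i max(xi, 1/beta_i) = budget,
# then omega = prod_i max(xi, 1/beta_i).
def water_fill(inv_beta, budget):
    lo, hi = 0.0, budget
    for _ in range(200):                      # bisection on the water level
        xi = 0.5 * (lo + hi)
        if np.maximum(xi, inv_beta).sum() > budget:
            hi = xi
        else:
            lo = xi
    levels = np.maximum(lo, inv_beta)
    return levels.prod(), levels

inv_beta = np.array([0.2, 0.5, 1.0])          # illustrative 1/beta_i
budget = 3.0                                  # illustrative D + tr[B]
omega, levels = water_fill(inv_beta, budget)

# Any feasible point (xi_i >= 1/beta_i, sum xi_i <= budget) has a product
# no larger than omega.
rng = np.random.default_rng(0)
for _ in range(1000):
    xi = inv_beta + rng.random(3)
    if xi.sum() <= budget:
        assert xi.prod() <= omega + 1e-9
```

With these numbers the water level settles at $\xi = 1$, so all three coordinates are lifted to the common level and $\tilde\omega = 1$.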
V. PROOFS OF THE RESULTS

A. Derivation of the Outer Bounds

In this subsection we prove the results on the outer bounds of the rate distortion region. We first state two important lemmas which are the mathematical core of the converse coding theorem. For $i = 1, 2, \cdots, L$, set
$$W_i = \varphi_i^{(n)}(\boldsymbol{Y}_i), \qquad r_i^{(n)} = \frac{1}{n}I(\boldsymbol{Y}_i; W_i|\boldsymbol{X}^K). \qquad (25)$$
Let $Q$ be a unitary matrix which transforms $X^K$ into $Z^K = QX^K$. For
$$\boldsymbol{X}^K = (X^K(1), X^K(2), \cdots, X^K(n))$$
we set
$$\boldsymbol{Z}^K = Q\boldsymbol{X}^K = (QX^K(1), QX^K(2), \cdots, QX^K(n)).$$
Furthermore, for $\hat{\boldsymbol{X}}^K = (\hat{X}^K(1), \hat{X}^K(2), \cdots, \hat{X}^K(n))$, we set
$$\hat{\boldsymbol{Z}}^K = Q\hat{\boldsymbol{X}}^K = (Q\hat{X}^K(1), Q\hat{X}^K(2), \cdots, Q\hat{X}^K(n)).$$
We have the following two lemmas.

Lemma 6: For any $i = 1, 2, \cdots, K$, we have
$$h(\boldsymbol{Z}_i|\boldsymbol{Z}^K_{[i]}W^L) \le h(\boldsymbol{Z}_i-\hat{\boldsymbol{Z}}_i|\boldsymbol{Z}^K_{[i]}-\hat{\boldsymbol{Z}}^K_{[i]}) \le \frac{n}{2}\log\left[(2\pi e)\left\{\left[Q\left(\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right)^{-1}{}^tQ\right]_{ii}\right\}^{-1}\right],$$
where $h(\cdot)$ stands for the differential entropy.

Lemma 7: For any $i = 1, 2, \cdots, K$, we have
$$h(\boldsymbol{Z}_i|\boldsymbol{Z}^K_{[i]}W^L) \ge \frac{n}{2}\log\left[(2\pi e)\left\{\left[Q\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_\Lambda(r^{(n)}_\Lambda)}A\right){}^tQ\right]_{ii}\right\}^{-1}\right].$$
Proofs of those two lemmas will be stated in the Appendix. The following corollary immediately follows from Lemmas 6 and 7.

Corollary 3: For any $\Sigma_{X^KY^L}$ and for any $(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})$, we have
$$\left[\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right]^{-1} \preceq \Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_\Lambda(r^{(n)}_\Lambda)}A. \qquad (26)$$

From Lemma 7, we obtain the following corollary.

Corollary 4: For any $S \subseteq \Lambda$, we have
$$I(\boldsymbol{X}^K; W_S) \le \frac{n}{2}\log\left|I+\Sigma_{X^K}{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right|.$$
Proof: For each $i\in\Lambda-S$, we choose $W_i$ so that it takes a constant value. In this case we have $r_i^{(n)} = 0$ for $i\in\Lambda-S$. Then by Lemma 7, for any $i = 1, 2, \cdots, K$, we have
$$h(\boldsymbol{Z}_i|\boldsymbol{Z}^K_{[i]}W_S) \ge \frac{n}{2}\log\left[(2\pi e)\left\{\left[Q\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right){}^tQ\right]_{ii}\right\}^{-1}\right]. \qquad (27)$$
We choose the unitary matrix $Q$ so that
$$Q\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right){}^tQ = \begin{pmatrix}\lambda_1 & & 0\\ & \ddots & \\ 0 & & \lambda_K\end{pmatrix} \qquad (28)$$
becomes a diagonal matrix. Then we have the following chain of inequalities:
$$I(\boldsymbol{X}^K; W_S) = h(\boldsymbol{X}^K)-h(\boldsymbol{X}^K|W_S) \overset{(a)}{=} h(\boldsymbol{X}^K)-h(\boldsymbol{Z}^K|W_S) \le h(\boldsymbol{X}^K)-\sum_{i=1}^{K}h(\boldsymbol{Z}_i|\boldsymbol{Z}^K_{[i]}W_S)$$
$$\overset{(b)}{\le} \frac{n}{2}\log\left[(2\pi e)^K|\Sigma_{X^K}|\right]+\sum_{i=1}^{K}\frac{n}{2}\log\left[\frac{1}{2\pi e}\left[Q\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right){}^tQ\right]_{ii}\right]$$
$$\overset{(c)}{=} \frac{n}{2}\log|\Sigma_{X^K}|+\frac{n}{2}\sum_{i=1}^{K}\log\lambda_i = \frac{n}{2}\log|\Sigma_{X^K}|+\frac{n}{2}\log\left|\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right| = \frac{n}{2}\log\left|I+\Sigma_{X^K}{}^tA\Sigma^{-1}_{N_S(r^{(n)}_S)}A\right|.$$
Step (a) follows from the rotation invariant property of the (conditional) differential entropy. Step (b) follows from (27). Step (c) follows from (28).

We first prove the inclusion $\mathcal R_L(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L^{(out)}(\Sigma_d|\Sigma_{X^KY^L})$ stated in Theorem 3. Using Lemmas 6 and 7, Corollary 4, and a standard argument for the proof of converse coding theorems, we can prove this inclusion.

Proof of $\mathcal R_L(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L^{(out)}(\Sigma_d|\Sigma_{X^KY^L})$: We first observe that the Markov chain
$$W_S \to \boldsymbol{Y}_S \to \boldsymbol{X}^K \to \boldsymbol{Y}_{S^c} \to W_{S^c} \qquad (29)$$
holds for any subset $S$ of $\Lambda$. Assume $(R_1, R_2, \cdots, R_L) \in \mathcal R_L(\Sigma_d|\Sigma_{X^KY^L})$. Then there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \psi^{(n)})\}_{n=1}^{\infty}$ such that
$$\limsup_{n\to\infty}R_i^{(n)} \le R_i,\ i\in\Lambda, \qquad \limsup_{n\to\infty}\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K} \preceq \Sigma_d. \qquad (30)$$
We set
$$r_i \triangleq \limsup_{n\to\infty}r_i^{(n)} = \limsup_{n\to\infty}\frac{1}{n}I(\boldsymbol{Y}_i; W_i|\boldsymbol{X}^K). \qquad (31)$$
For any subset $S\subseteq\Lambda$, we have the following chain of inequalities:
$$n\sum_{i\in S}R_i^{(n)} \ge \sum_{i\in S}\log M_i \ge \sum_{i\in S}H(W_i) \ge H(W_S|W_{S^c}) = I(\boldsymbol{X}^K; W_S|W_{S^c})+H(W_S|W_{S^c}\boldsymbol{X}^K)$$
$$\overset{(a)}{=} I(\boldsymbol{X}^K; W_S|W_{S^c})+\sum_{i\in S}H(W_i|\boldsymbol{X}^K) \overset{(b)}{=} I(\boldsymbol{X}^K; W_S|W_{S^c})+\sum_{i\in S}I(\boldsymbol{Y}_i; W_i|\boldsymbol{X}^K) \overset{(c)}{=} I(\boldsymbol{X}^K; W_S|W_{S^c})+n\sum_{i\in S}r_i^{(n)}, \qquad (32)$$
where steps (a), (b), and (c) follow from (29). We next estimate a lower bound of $I(\boldsymbol{X}^K; W_S|W_{S^c})$. Observe that
$$I(\boldsymbol{X}^K; W_S|W_{S^c}) = I(\boldsymbol{X}^K; W^L)-I(\boldsymbol{X}^K; W_{S^c}). \qquad (33)$$
Since an upper bound of $I(\boldsymbol{X}^K; W_{S^c})$ is derived by Corollary 4, it suffices to estimate a lower bound of $I(\boldsymbol{X}^K; W^L)$. We have the following chain of inequalities:
$$I(\boldsymbol{X}^K; W^L) = h(\boldsymbol{X}^K)-h(\boldsymbol{X}^K|W^L) \ge h(\boldsymbol{X}^K)-h(\boldsymbol{X}^K|\hat{\boldsymbol{X}}^K) \ge h(\boldsymbol{X}^K)-h(\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K)$$
$$\ge \frac{n}{2}\log\left[(2\pi e)^K|\Sigma_{X^K}|\right]-\frac{n}{2}\log\left[(2\pi e)^K\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|\right] = \frac{n}{2}\log\frac{|\Sigma_{X^K}|}{\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|}. \qquad (34)$$
Combining (33), (34), and Corollary 4, we have
$$I(\boldsymbol{X}^K; W_S|W_{S^c})+n\sum_{i\in S}r_i^{(n)} \ge \frac{n}{2}\log\frac{\prod_{i\in S}e^{2r_i^{(n)}}|\Sigma_{X^K}|}{\left|I+\Sigma_{X^K}{}^tA\Sigma^{-1}_{N_{S^c}(r^{(n)}_{S^c})}A\right|\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|} = \frac{n}{2}\log\frac{\prod_{i\in S}e^{2r_i^{(n)}}}{\left|\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_{S^c}(r^{(n)}_{S^c})}A\right|\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|}. \qquad (35)$$
Note here that $I(\boldsymbol{X}^K; W_S|W_{S^c})+n\sum_{i\in S}r_i^{(n)}$ is nonnegative. Hence, we have
$$I(\boldsymbol{X}^K; W_S|W_{S^c})+n\sum_{i\in S}r_i^{(n)} \ge nJ_S\left(\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|, r^{(n)}_S\Big|r^{(n)}_{S^c}\right).$$
Combining (32) and (35), we obtain
$$\sum_{i\in S}R_i^{(n)} \ge J_S\left(\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right|, r^{(n)}_S\Big|r^{(n)}_{S^c}\right). \qquad (36)$$
On the other hand, by Corollary 3, we have
$$\left[\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right]^{-1} \preceq \Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N_\Lambda(r^{(n)}_\Lambda)}A. \qquad (37)$$
By letting $n\to\infty$ in (36) and (37) and taking (30) into account, we have, for any $S\subseteq\Lambda$,
$$\sum_{i\in S}R_i \ge J_S(|\Sigma_d|, r_S|r_{S^c}) \qquad (38)$$
and
$$\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A \succeq \Sigma_d^{-1}. \qquad (39)$$
From (38) and (39), $\mathcal R_L(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L^{(out)}(\Sigma_d|\Sigma_{X^KY^L})$ is concluded.

Proof of Theorem 4: We choose a unitary matrix $Q$ so that
$$Q\,{}^t\Gamma^{-1}\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)\Gamma^{-1}\,{}^tQ = \begin{pmatrix}\alpha_1 & & 0\\ & \ddots & \\ 0 & & \alpha_K\end{pmatrix}.$$
Then we have
$$Q\Gamma\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1}{}^t\Gamma\,{}^tQ = \begin{pmatrix}\alpha_1^{-1} & & 0\\ & \ddots & \\ 0 & & \alpha_K^{-1}\end{pmatrix}.$$
For $\Sigma_d \in \mathcal A_L(r^L)$, set
$$\tilde\Sigma_d \triangleq Q\Gamma\Sigma_d{}^t\Gamma\,{}^tQ, \qquad \xi_i \triangleq \left[\tilde\Sigma_d\right]_{ii}. \qquad (40)$$
Since
$$\Gamma\Sigma_d{}^t\Gamma \succeq \Gamma\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1}{}^t\Gamma,$$
(40), and ${\rm tr}[\Gamma\Sigma_d{}^t\Gamma] \le D$, we have
$$\xi_i \ge \alpha_i^{-1},\ i = 1, 2, \cdots, K, \qquad \sum_{i=1}^{K}\xi_i = {\rm tr}\left[\tilde\Sigma_d\right] = {\rm tr}\left[\Gamma\Sigma_d{}^t\Gamma\right] \le D. \qquad (41)$$
Furthermore, by Hadamard's inequality we have
$$|\Sigma_d| = |\Gamma|^{-2}\left|\tilde\Sigma_d\right| \le |\Gamma|^{-2}\prod_{i=1}^{K}\xi_i. \qquad (42)$$
Combining (41) and (42), we obtain
$$\theta(\Gamma, D, r^L) = \max_{\substack{\Sigma_d\in\mathcal A_L(r^L),\\ {\rm tr}[\Gamma\Sigma_d{}^t\Gamma]\le D}}|\Sigma_d| \le |\Gamma|^{-2}\max_{\substack{\xi_i\alpha_i\ge1,\,i=1,2,\cdots,K,\\ \sum_{i=1}^{K}\xi_i\le D}}\prod_{i=1}^{K}\xi_i = |\Gamma|^{-2}\,\omega(\Gamma, D, r^L).$$
The equality holds when $\tilde\Sigma_d$ is a diagonal matrix.

Proof of Theorem 11: Assume that $(R_1, R_2, \cdots, R_L) \in \mathcal R_L(D|\Sigma_{Y^L})$. Then there exists a sequence $\{(\varphi_1^{(n)}, \varphi_2^{(n)}, \cdots, \varphi_L^{(n)}, \phi^{(n)})\}_{n=1}^{\infty}$ such that
$$\limsup_{n\to\infty}R_i^{(n)} \le R_i,\ i\in\Lambda, \qquad \limsup_{n\to\infty}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}^\Lambda} \preceq \Sigma_d,\quad {\rm tr}[\Sigma_d] \le D \qquad (43)$$
for some $\Sigma_d$.
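Steps (42) above and (57) below both invoke Hadamard's inequality: for a positive semidefinite matrix the determinant is at most the product of the diagonal entries, with equality for diagonal matrices. A minimal numerical sanity check with random test matrices (illustrative, not data from the paper):

```python
import numpy as np

# Hadamard's inequality: det(S) <= prod(diag(S)) for PSD S,
# with equality when S is diagonal.
rng = np.random.default_rng(1)
for _ in range(100):
    G = rng.standard_normal((5, 5))
    S = G @ G.T                              # random PSD matrix
    assert np.linalg.det(S) <= np.prod(np.diag(S)) + 1e-9
    D = np.diag(np.diag(S))                  # diagonal case: equality
    assert np.isclose(np.linalg.det(D), np.prod(np.diag(D)))
```

This is exactly why the maximizing $\tilde\Sigma_d$ in the bound on $\theta(\Gamma, D, r^L)$ can be taken diagonal.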
For each $l = 0, 1, \cdots, L-1$, we use $(\varphi^{(n)}_{\tau^l(1)}, \varphi^{(n)}_{\tau^l(2)}, \cdots, \varphi^{(n)}_{\tau^l(L)})$ for the encoding of $(\boldsymbol{Y}_1, \boldsymbol{Y}_2, \cdots, \boldsymbol{Y}_L)$. For $i\in\Lambda$ and for $l = 0, 1, \cdots, L-1$, set
$$W_{l,i} \triangleq \varphi^{(n)}_{\tau^l(i)}(\boldsymbol{Y}_i), \qquad \hat{\boldsymbol{Y}}_{l,i} \triangleq \phi^{(n)}_{\tau^l(i)}(\varphi^{(n)}_{\tau^l(i)}(\boldsymbol{Y}_i)), \qquad r^{(n)}_{l,i} \triangleq \frac{1}{n}I(\boldsymbol{Y}_i; W_{l,i}|\boldsymbol{X}^L).$$
In particular, $r^{(n)}_{0,i} = r^{(n)}_i$ for $i\in\Lambda$. Furthermore, set
$$r^{(n)}_{\tau^l(\Lambda)} = (r_{l,1}, r_{l,2}, \cdots, r_{l,L}),\ l = 0, 1, \cdots, L-1, \qquad r^{(n)} = \frac{1}{L}\sum_{i=1}^{L}r^{(n)}_i.$$
By the cyclic shift invariant property of $X^\Lambda$ and $Y^\Lambda$, we have, for $l = 0, 1, \cdots, L-1$,
$$\frac{1}{L}\sum_{i=1}^{L}r^{(n)}_{l,i} = \frac{1}{L}\sum_{i=1}^{L}r^{(n)}_{0,i} = r^{(n)}. \qquad (44)$$
For $\Sigma_d = [d_{ij}]$, set
$$\tau^l(\Sigma_d) \triangleq [d_{\tau^l(i)\tau^l(j)}], \qquad \overline{\Sigma}_d \triangleq \frac{1}{L}\sum_{l=0}^{L-1}\tau^l(\Sigma_d).$$
Then we have
$$\limsup_{n\to\infty}\frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}} \overset{(a)}{=} \limsup_{n\to\infty}\frac{1}{L}\sum_{l=0}^{L-1}\tau^l\left(\frac{1}{n}\Sigma_{\boldsymbol{Y}_{\tau^l(\Lambda)}-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}\right) \overset{(b)}{\preceq} \frac{1}{L}\sum_{l=0}^{L-1}\tau^l(\Sigma_d) \overset{(c)}{=} \overline{\Sigma}_d, \qquad \tau^l(\overline{\Sigma}_d) = \overline{\Sigma}_d. \qquad (45)$$
Step (a) follows from the cyclic shift invariant property of $Y^\Lambda$. Step (b) follows from (43). Step (c) follows from the definition of $\overline{\Sigma}_d$.

From $\hat{\boldsymbol{Y}}^\Lambda$, we construct an estimation $\hat{\boldsymbol{X}}^\Lambda$ of $\boldsymbol{X}^\Lambda$ by $\hat{\boldsymbol{X}}^\Lambda = \tilde{A}\hat{\boldsymbol{Y}}^\Lambda$. Then for $l = 0, 1, \cdots, L-1$, we have
$$\tilde{A}\left[\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}\right]{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda} \overset{(a)}{=} \frac{1}{n}\Sigma_{\boldsymbol{X}^\Lambda-\hat{\boldsymbol{X}}_{\tau^l(\Lambda)}} \overset{(b)}{\succeq} \left[\Sigma^{-1}_{X_{\tau^l(\Lambda)}}+\Sigma^{-1}_{N_{\tau^l(\Lambda)}(r^{(n)}_{\tau^l(\Lambda)})}\right]^{-1} \overset{(c)}{=} \tau^l\left(\left[\Sigma^{-1}_{X_\Lambda}+\Sigma^{-1}_{N_\Lambda(r^{(n)}_{\tau^l(\Lambda)})}\right]^{-1}\right). \qquad (46)$$
Steps (a) and (c) follow from the cyclic shift invariant property of $X_\Lambda$ and $X^\Lambda$, respectively. Step (b) follows from Corollary 3. From (46), we have
$$\tilde{A}\left[\frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}\right]{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda} \overset{(a)}{\succeq} \left[\frac{1}{L}\sum_{l=0}^{L-1}\left(\Sigma^{-1}_{X_\Lambda}+\Sigma^{-1}_{N_{\tau^l(\Lambda)}(r^{(n)}_{\tau^l(\Lambda)})}\right)\right]^{-1}. \qquad (47)$$
Step (a) follows from the fact that $(\tilde{A}\Sigma{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda})^{-1}$ is convex with respect to $\Sigma$. On the other hand, we have
$$\frac{1}{L}\sum_{l=0}^{L-1}\left(\Sigma^{-1}_{X_\Lambda}+\Sigma^{-1}_{N_{\tau^l(\Lambda)}(r^{(n)}_{\tau^l(\Lambda)})}\right) = \Sigma^{-1}_{X_\Lambda}+\frac{1}{L}\sum_{i=1}^{L}\left(\frac{1-e^{-2r_i^{(n)}}}{\epsilon}\right)I_L \overset{(a)}{\preceq} \Sigma^{-1}_{X_\Lambda}+\frac{1-e^{-2r^{(n)}}}{\epsilon}I_L. \qquad (48)$$
Step (a) follows from the fact that $1-e^{-2a}$ is a concave function of $a$. Combining (47) and (48), we obtain
$$\tilde{A}\left[\frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}\right]{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda} \succeq \left[\Sigma^{-1}_{X_\Lambda}+\frac{1-e^{-2r^{(n)}}}{\epsilon}I_L\right]^{-1},$$
from which we obtain
$$\frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}+B \succeq \tilde{A}^{-1}\left[\Sigma^{-1}_{X_\Lambda}+\frac{1-e^{-2r^{(n)}}}{\epsilon}I_L\right]^{-1}{}^t\tilde{A}^{-1}. \qquad (49)$$
Next we derive a lower bound on the sum rate part. For each $l = 0, 1, \cdots, L-1$, we have the following chain of inequalities:
$$n\sum_{i\in\Lambda}R_i^{(n)} \ge \sum_{i\in\Lambda}\log M_i \ge \sum_{i\in\Lambda}H(W_{l,i}) \ge H(W_{\tau^l(\Lambda)}) = I(\boldsymbol{X}^\Lambda; W_{\tau^l(\Lambda)})+H(W_{\tau^l(\Lambda)}|\boldsymbol{X}^\Lambda)$$
$$\overset{(a)}{=} I(\boldsymbol{X}^\Lambda; W_{\tau^l(\Lambda)})+\sum_{i\in\Lambda}H(W_{l,i}|\boldsymbol{X}^\Lambda) \overset{(b)}{=} I(\boldsymbol{X}^\Lambda; W_{\tau^l(\Lambda)})+nLr^{(n)} \overset{(c)}{\ge} \frac{n}{2}\log\frac{|\Sigma_{X^\Lambda}|}{\left|\frac{1}{n}\Sigma_{\boldsymbol{X}^\Lambda-\hat{\boldsymbol{X}}_{\tau^l(\Lambda)}}\right|}+nLr^{(n)}$$
$$= \frac{n}{2}\log\frac{\left|\tilde{A}\Sigma_{Y_\Lambda}{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda}\right|}{\left|\tilde{A}\left[\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}\right]{}^t\tilde{A}+\Sigma_{X_\Lambda|Y_\Lambda}\right|}+nLr^{(n)} = \frac{n}{2}\log\frac{|\Sigma_{Y_\Lambda}+B|}{\left|\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}+B\right|}+nLr^{(n)}. \qquad (50)$$
Step (a) follows from (29). Step (b) follows from (45). Step (c) follows from (34). From (50), we have
$$\sum_{i\in\Lambda}R_i^{(n)} \ge \frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{2}\log\frac{|\Sigma_{Y_\Lambda}+B|}{\left|\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}+B\right|}+Lr^{(n)} \overset{(a)}{\ge} \frac{1}{2}\log\frac{|\Sigma_{Y_\Lambda}+B|}{\left|\frac{1}{L}\sum_{l=0}^{L-1}\frac{1}{n}\Sigma_{\boldsymbol{Y}^\Lambda-\hat{\boldsymbol{Y}}_{\tau^l(\Lambda)}}+B\right|}+Lr^{(n)}. \qquad (51)$$
Step (a) follows from the fact that $-\log|\Sigma+B|$ is convex with respect to $\Sigma$. Letting $n\to\infty$ in (49) and (51) and taking (45) into account, we have
$$\sum_{i\in\Lambda}R_i \ge \frac{1}{2}\log\frac{|\Sigma_{Y_\Lambda}+B|}{\left|\overline{\Sigma}_d+B\right|}+Lr, \qquad (52)$$
$$\overline{\Sigma}_d+B \succeq \tilde{A}^{-1}\left[\Sigma^{-1}_{X_\Lambda}+\frac{1-e^{-2r}}{\epsilon}I_L\right]^{-1}{}^t\tilde{A}^{-1}, \qquad (53)$$
$${\rm tr}\left[\overline{\Sigma}_d+B\right] = {\rm tr}\left[\overline{\Sigma}_d\right]+{\rm tr}[B] \le D+{\rm tr}[B], \qquad (54)$$
where $r = \limsup_{n\to\infty}r^{(n)}$. Now we choose a unitary matrix $Q$ so that
$$Q\,{}^t\tilde{A}\left(\Sigma^{-1}_{X_\Lambda}+\frac{1-e^{-2r}}{\epsilon}I_L\right)\tilde{A}\,{}^tQ = \begin{pmatrix}\beta_1 & & 0\\ & \ddots & \\ 0 & & \beta_L\end{pmatrix}.$$
Set
$$\hat\Sigma_d \triangleq Q\overline{\Sigma}_d{}^tQ, \qquad \hat{B} \triangleq QB{}^tQ, \qquad \xi_i \triangleq \left[\hat\Sigma_d+\hat{B}\right]_{ii}.$$
From (53) and (54) we have
$$\xi_i \ge \beta_i^{-1}(r),\ i\in\Lambda, \qquad \sum_{i=1}^{L}\xi_i = {\rm tr}\left[\hat\Sigma_d+\hat{B}\right] = {\rm tr}\left[\overline{\Sigma}_d+B\right] \le D+{\rm tr}[B]. \qquad (55)$$
From (55), we have
$$\sum_{i=1}^{L}\frac{1}{\beta_i(r)} \le \sum_{i=1}^{L}\xi_i \le D+{\rm tr}[B] \Leftrightarrow \zeta(r) \le D+{\rm tr}[B] \Leftrightarrow r \ge r^*(D+{\rm tr}[B]). \qquad (56)$$
Furthermore, by Hadamard's inequality we have
$$\left|\overline{\Sigma}_d+B\right| = \left|\hat\Sigma_d+\hat{B}\right| \le \prod_{i=1}^{L}\xi_i. \qquad (57)$$
Combining (55) and (57), we obtain
$$\left|\overline{\Sigma}_d+B\right| \le \max_{\substack{\xi_i\beta_i\ge1,\,i\in\Lambda,\\ \sum_{i=1}^{L}\xi_i\le D+{\rm tr}[B]}}\prod_{i=1}^{L}\xi_i = \tilde\omega(D, r). \qquad (58)$$
Hence, from (52), (56), and (58) we have
$$\sum_{i=1}^{L}R_i \ge \min_{r\ge r^*(D+{\rm tr}[B])}\frac{1}{2}\log\left[\frac{e^{2Lr}|\Sigma_{Y_\Lambda}+B|}{\tilde\omega(D, r)}\right] = \min_{r\ge r^*(D+{\rm tr}[B])}\tilde{J}(D, r) = R^{(l)}_{{\rm sum},L}(D|\Sigma_{Y^L}), \qquad (59)$$
completing the proof.

B. Derivation of the Inner Bounds

In this subsection we prove $\mathcal R_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L(\Sigma_d|\Sigma_{X^KY^L})$ stated in Theorem 3.

Proof of $\mathcal R_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L(\Sigma_d|\Sigma_{X^KY^L})$: Since $\hat{\mathcal R}_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \mathcal R_L(\Sigma_d|\Sigma_{X^KY^L})$ is proved by Theorem 1, it suffices to show $\mathcal R_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \hat{\mathcal R}_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L})$. We assume that $R^L \in \mathcal R_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L})$. Then there exists a nonnegative vector $r^L$ such that
$$\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1} \preceq \Sigma_d$$
and
$$\sum_{i\in S}R_i \ge J_S(r_S|r_{S^c}) \quad \mbox{for any } S\subseteq\Lambda.$$
Let $V_i$, $i\in\Lambda$, be $L$ independent zero mean Gaussian random variables with variance $\sigma^2_{V_i}$. Define Gaussian random variables $U_i$, $i\in\Lambda$, by $U_i = X_i+N_i+V_i$. By definition it is obvious that
$$U^L \to Y^L \to X^K, \qquad U_S \to Y_S \to X^K \to Y_{S^c} \to U_{S^c} \quad \mbox{for any } S\subseteq\Lambda. \qquad (60)$$
For given $r_i \ge 0$, $i\in\Lambda$, choose $\sigma^2_{V_i}$ so that $\sigma^2_{V_i} = \sigma^2_{N_i}/(e^{2r_i}-1)$ when $r_i > 0$; when $r_i = 0$, we choose $U_i$ so that it takes the constant value zero. For the above choice, the covariance matrix of $N^L+V^L$ becomes $\Sigma_{N^L(r^L)}$. Define the linear function $\psi$ of $U^L$ by
$$\psi(U^L) = \left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1}{}^tA\Sigma^{-1}_{N^L(r^L)}\,U^L.$$
Set $\hat{X}^K = \psi(U^L)$ and
$$d_{ii} \triangleq E\left[\|X_i-\hat{X}_i\|^2\right], \qquad d_{ij} \triangleq E\left[(X_i-\hat{X}_i)(X_j-\hat{X}_j)\right],\ 1\le i\ne j\le K.$$
Let $\Sigma_{X^K-\hat{X}^K}$ be the covariance matrix with $d_{ij}$ in its $(i,j)$ entry. By a simple computation we can show that
$$\Sigma_{X^K-\hat{X}^K} = \left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1} \preceq \Sigma_d \qquad (61)$$
and that for any $S\subseteq\Lambda$,
$$J_S(r_S|r_{S^c}) = I(Y_S; U_S|U_{S^c}). \qquad (62)$$
From (60) and (61), we have $U^L \in \mathcal G(\Sigma_d)$. Thus, from (62), $\mathcal R_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L}) \subseteq \hat{\mathcal R}_L^{(in)}(\Sigma_d|\Sigma_{X^KY^L})$ is concluded.
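Equation (61) is the standard Gaussian MMSE error covariance in information form: for $X \sim N(0, \Sigma_X)$ observed through $U = AX+W$ with noise covariance $\Sigma_W$, the linear MMSE estimator has error covariance $(\Sigma_X^{-1}+{}^tA\Sigma_W^{-1}A)^{-1}$. The sketch below checks this against the equivalent covariance form obtained from the matrix inversion lemma; the dimensions and random matrices are illustrative assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(2)
K, L = 3, 4
G = rng.standard_normal((K, K)); Sigma_X = G @ G.T + np.eye(K)
A = rng.standard_normal((L, K))
Sigma_W = np.diag(rng.random(L) + 0.5)       # stand-in for Sigma_{N^L(r^L)}

# Information form of the MMSE error covariance, as in (61).
info = np.linalg.inv(Sigma_X) + A.T @ np.linalg.inv(Sigma_W) @ A
err_info = np.linalg.inv(info)

# Equivalent covariance form via the matrix inversion lemma.
S_U = A @ Sigma_X @ A.T + Sigma_W
err_cov = Sigma_X - Sigma_X @ A.T @ np.linalg.inv(S_U) @ A @ Sigma_X

assert np.allclose(err_info, err_cov)
```

The information form is the one that composes cleanly with the rate parameters $r^L$, since $\Sigma^{-1}_{N^L(r^L)}$ enters additively.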
C. Proofs of the Results on Matching Conditions

We first observe that the condition
$${\rm tr}\left[\Gamma\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L(r^L)}A\right)^{-1}{}^t\Gamma\right] \le D$$
is equivalent to
$$\sum_{j=1}^{K}\frac{1}{\alpha_j(r^L)} \le D. \qquad (63)$$

Proof of Lemma 3: Let $\tilde\Lambda = \{1, 2, \cdots, K\}$ and let $S\subseteq\tilde\Lambda$ be the set of integers that satisfy $\alpha_i^{-1}\ge\xi$ in the definition of $\theta(\Gamma, D, u^L)$. Then $\theta(\Gamma, D, u^L)$ is computed as
$$\theta(\Gamma, D, u^L) = \left(\prod_{i\in S}\frac{1}{\alpha_i}\right)\frac{1}{(K-|S|)^{K-|S|}}\left(D-\sum_{i\in S}\frac{1}{\alpha_i}\right)^{K-|S|}.$$
Fix $i\in\Lambda$ arbitrarily. For simplicity of notation we set
$$\chi_i \triangleq \frac{\|\hat{a}_i\|^2}{\sigma^2_{N_i}}+\eta_i(u^L_{[i]}), \qquad \Psi \triangleq \log\left(\frac{1}{\sigma^2_{N_i}}-u_i\right)-\log\theta(\Gamma, D, u^L).$$
Computing the partial derivative of $\Psi$ with respect to $u_i$ and applying Lemma 2 together with (64), the sign of $\partial\Psi/\partial u_i$ is controlled by the signs of the quantities
$$\Phi_j \triangleq (\chi_i+\alpha_j-\alpha_{\min})\left(D-\sum_{k\in S}\frac{1}{\alpha_k}\right)-\frac{K-|S|}{\alpha_j}(\chi_i-\alpha_{\min}), \quad j\in S.$$
If $|S| = K$, then $\Phi_j \ge 0$, $j\in\tilde\Lambda$, is obvious. We hereafter assume $|S| \le K-1$. Computing $\Phi_j$, we obtain
$$\Phi_j = \chi_i\left(D-\sum_{k\in S}\frac{1}{\alpha_k}\right)-\frac{K-|S|}{\alpha_j}(\chi_i-\alpha_{\min})+(\alpha_j-\alpha_{\min})\left(D-\sum_{k\in S}\frac{1}{\alpha_k}\right)$$
$$\ge \chi_i\left(D-\sum_{k\in S}\frac{1}{\alpha_k}\right)-\frac{K-|S|}{\alpha_j}(\chi_i-\alpha_{\min}) \overset{(a)}{\ge} \chi_i\,\frac{K-|S|}{\alpha_{\max}}-\frac{K-|S|}{\alpha_{\min}}(\chi_i-\alpha_{\min}) = \chi_i(K-|S|)\left[\frac{1}{\alpha_{\max}}-\frac{1}{\alpha_{\min}}+\frac{1}{\chi_i}\right]. \qquad (65)$$
Step (a) follows from the inequality (63). From (65), we can see that if
$$\frac{1}{\alpha_{\min}(u^L)}-\frac{1}{\alpha_{\max}(u^L)} \le \frac{1}{\chi_i} \quad\mbox{for } i\in\Lambda,$$
then $\Phi_j \ge 0$ for $j\in S$.

Proof of Theorem 5: By (63), we have
$$\frac{1}{\alpha_{\min}(r^L)} \le D-\frac{K-1}{\alpha_{\max}(r^L)} = D-\frac{K}{\alpha_{\max}(r^L)}+\frac{1}{\alpha_{\max}(r^L)}.$$
Hence, if
$$D-\frac{K}{\alpha_{\max}(r^L)} \le \frac{1}{\chi_i}, \quad\mbox{or equivalently}\quad D-\frac{1}{\chi_i} \le \frac{K}{\alpha_{\max}(r^L)}, \qquad (66)$$
holds for $r^L\in\mathcal B_L(\Gamma, D)$ and $i\in\Lambda$, then the condition on $\alpha_{\min}$ and $\alpha_{\max}$ in Lemma 3 holds. By Lemma 2, we have
$$\alpha_{\max}(r^L) \le \alpha^*_{\max} \quad\mbox{for } r^L\in\mathcal B_L(\Gamma, D). \qquad (67)$$
It can be seen from (66) and (67) that
$$D-\frac{1}{\chi_i} \le \frac{K}{\alpha^*_{\max}} \quad\mbox{for } i\in\Lambda \qquad (68)$$
is a sufficient condition for (66) to hold. By Lemma 2, we have
$$\chi_i = \frac{\|\hat{a}_i\|^2}{\sigma^2_{N_i}}+\eta_i(u^L_{[i]}) = \lim_{u_i\to\frac{1}{\sigma^2_{N_i}}}\alpha_{\max}(u^L) \le \alpha^*_{\max} \quad\mbox{for } i\in\Lambda,$$
from which we have
$$\left(D-\frac{1}{\chi_i}\right)\alpha^*_{\max} \le D\alpha^*_{\max}-1.$$
Thus, if we have $D\alpha^*_{\max}-1 \le K$, or equivalently $D \le (K+1)/\alpha^*_{\max}$, then (68) holds.
Proof of Lemma 4: We first derive an expression of $\tilde\omega(D, r)$ using $\beta_i = \beta_i(r)$, $i\in\Lambda$. Let $S$ be the set of integers that satisfy $\beta_i^{-1}\ge\xi$ in the definition of $\tilde\omega(D, r)$. Then $\tilde\omega(D, r)$ is computed as
$$\tilde\omega(D, r) = \left(\prod_{k\in S}\frac{1}{\beta_k}\right)\frac{1}{(L-|S|)^{L-|S|}}\left(D+{\rm tr}[B]-\sum_{k\in S}\frac{1}{\beta_k}\right)^{L-|S|}.$$
Fix $i\in\Lambda$ arbitrarily and set
$$\Psi \triangleq Lr-\log\tilde\omega(D, r).$$
Computing the derivative of $\Psi$ with respect to $r$ and rearranging, the sign of $d\Psi/dr$ is controlled by the signs of the quantities
$$\Phi_k \triangleq \left(D+{\rm tr}[B]-\sum_{k'\in S}\frac{1}{\beta_{k'}}\right)\left[\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2+\beta_k\right]-\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2\frac{L-|S|}{\beta_k}, \quad k\in S;$$
if $\Phi_k \ge 0$ for all $k\in S$, then $d\Psi/dr \ge 0$. If $|S| = L$, then $\Phi_k \ge 0$, $k\in\Lambda$, is obvious. We hereafter assume $|S| \le L-1$. Computing $\Phi_k$, we obtain
$$\Phi_k \overset{(a)}{\ge} \left(\sum_{k'\in\Lambda-S}\frac{1}{\beta_{k'}}\right)\left[\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2+\beta_k\right]-\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2\frac{L-|S|}{\beta_k}$$
$$\ge \frac{L-|S|}{\beta_{i_1}}\left[\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2+\beta_{i_0}\right]-\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2\frac{L-|S|}{\beta_{i_0}}. \qquad (69)$$
Step (a) follows from
$$D+{\rm tr}[B]-\sum_{k=1}^{L}\frac{1}{\beta_k} \ge 0 \Leftrightarrow D+{\rm tr}[B]-\sum_{k\in S}\frac{1}{\beta_k} \ge \sum_{k\in\Lambda-S}\frac{1}{\beta_k}.$$
From (69), we can see that if
$$\frac{e^{-2r}|S|}{\epsilon L}\left(\frac{\lambda_k}{\lambda_k+\epsilon}\right)^2\left(\frac{1}{\beta_{i_0}}-\frac{1}{\beta_{i_1}}\right) \le \frac{\beta_{i_0}}{\beta_{i_1}}, \qquad (70)$$
then $\Phi_k \ge 0$ for $k\in\Lambda$. The inequality (70) is equivalent to
$$\frac{1}{\beta_{i_0}}-\frac{1}{\beta_{i_1}} \le \frac{\epsilon e^{2r}L}{|S|}\left(\frac{\lambda_k+\epsilon}{\lambda_k}\right)^2\frac{\beta_{i_0}}{\beta_{i_1}}.$$
Hence
$$\frac{1}{\beta_{i_0}}-\frac{1}{\beta_{i_1}} \le \frac{\epsilon e^{2r}L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2\frac{\beta_{i_0}}{\beta_{i_1}} \qquad (71)$$
is a sufficient condition for $\Phi_k \ge 0$, $k\in\Lambda$. The condition (71) is equivalent to
$$\beta_{i_1}(r)-\beta_{i_0}(r) \le \epsilon e^{2r}\cdot\frac{L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2(\beta_{i_0}(r))^2,$$
completing the proof.

Proof of Lemma 5: Set
$$F(r) \triangleq \left[e^{2r}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}-\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}\right]^{-1}\left[e^{2r}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right]^2.$$
Then the sufficient condition stated in Lemma 4 is equivalent to
$$\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon} \le \frac{L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2\left(\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right)^2 F(r). \qquad (72)$$
To derive an explicit sufficient condition for (72) to hold, we estimate a lower bound of $F(r)$. Set
$$T(r) \triangleq e^{2r}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}-\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}, \qquad P \triangleq \frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}.$$
Then
$$F(r) = [T(r)]^{-1}[T(r)+P]^2 = T(r)+\frac{P^2}{T(r)}+2P \ge 4P = \frac{4\lambda_{i_1}}{\lambda_{i_1}+\epsilon}.$$
Hence
$$\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}-\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon} \le \frac{4L}{L-1}\left(\frac{\lambda_{\max}+\epsilon}{\lambda_{\max}}\right)^2\left(\frac{\lambda_{i_0}}{\lambda_{i_0}+\epsilon}\right)^2\frac{\lambda_{i_1}}{\lambda_{i_1}+\epsilon}$$
is a sufficient condition for (72) to hold.
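The key step in the proof of Lemma 5 is the bound $F(r) = T+P^2/T+2P \ge 4P$ for $T > 0$, which is the AM-GM inequality $T+P^2/T \ge 2P$ written as $(T+P)^2/T \ge 4P$, with equality at $T = P$. A quick numerical check over an illustrative grid of positive $T$ and nonnegative $P$:

```python
import numpy as np

# AM-GM step of Lemma 5: for T > 0 and P >= 0,
#   F = (T + P)**2 / T = T + P**2 / T + 2*P >= 4*P,
# with equality exactly at T = P.
for T in np.linspace(0.01, 5.0, 50):
    for P in np.linspace(0.0, 5.0, 50):
        F = (T + P) ** 2 / T
        assert F >= 4 * P - 1e-12
```

This is why the explicit sufficient condition (24) is uniform in $r$: the $r$-dependence of $F(r)$ is absorbed by the worst case $T(r) = P$.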
APPENDIX

A. Proof of Lemma 6

In this appendix we prove Lemma 6. To prove this lemma we need some preparations. For $i\in\Lambda$, set
$$F_i(\Sigma|Q) \triangleq \sup_{p_{\hat{X}^K|X^K}:\ \Sigma_{X^K-\hat{X}^K}\preceq\Sigma} h(Z_i-\hat{Z}_i|Z^K_{[i]}-\hat{Z}^K_{[i]}).$$
To compute $F_i(\Sigma|Q)$, define two random variables by
$$\tilde{X}^K \triangleq X^K-\hat{X}^K, \qquad \tilde{Z}^K \triangleq Z^K-\hat{Z}^K.$$
Note that by definition we have $\tilde{Z}^K = Q\tilde{X}^K$. Let $p_{X^K\tilde{X}^K}(x^K, \tilde{x}^K)$ be a density function of $(X^K, \tilde{X}^K)$. Let $q_{Z^K\tilde{Z}^K}(z^K, \tilde{z}^K)$ be the density function of $(Z^K, \tilde{Z}^K)$ induced by the unitary matrix $Q$, that is,
$$q_{Z^K\tilde{Z}^K}(z^K, \tilde{z}^K) = p_{{}^tQZ^K\,{}^tQ\tilde{Z}^K}({}^tQz^K, {}^tQ\tilde{z}^K).$$
An expression of $F_i(\Sigma|Q)$ using the above density functions is the following:
$$F_i(\Sigma|Q) = \sup_{p_{\tilde{X}^K|X^K}:\ \Sigma_{\tilde{X}^K}\preceq\Sigma} h(\tilde{Z}_i|\tilde{Z}^K_{[i]}) = \sup_{p_{\tilde{X}^K|X^K}:\ \Sigma_{\tilde{X}^K}\preceq\Sigma}\left\{-\int q_{\tilde{Z}^K}(z^K)\log q_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})\,{\rm d}z^K\right\}.$$
The following two properties of $F_i(\Sigma|Q)$ are useful for the proof of Lemma 6.

Lemma 8: $F_i(\Sigma|Q)$ is concave with respect to $\Sigma$.

Lemma 9:
$$F_i(\Sigma|Q) = \frac{1}{2}\log\left[(2\pi e)\left\{\left[Q\Sigma^{-1}\,{}^tQ\right]_{ii}\right\}^{-1}\right].$$
We first prove Lemma 6 using these two lemmas and then prove Lemmas 8 and 9.

Proof of Lemma 6: We have the following chain of inequalities:
$$h(\boldsymbol{Z}_i|\boldsymbol{Z}^K_{[i]}W^L) \le h(\boldsymbol{Z}_i-\hat{\boldsymbol{Z}}_i|\boldsymbol{Z}^K_{[i]}-\hat{\boldsymbol{Z}}^K_{[i]}) \le \sum_{t=1}^{n}h(Z_i(t)-\hat{Z}_i(t)|Z^K_{[i]}(t)-\hat{Z}^K_{[i]}(t))$$
$$\overset{(a)}{\le} \sum_{t=1}^{n}F_i\left(\Sigma_{X^K(t)-\hat{X}^K(t)}\Big|Q\right) \overset{(b)}{\le} nF_i\left(\frac{1}{n}\sum_{t=1}^{n}\Sigma_{X^K(t)-\hat{X}^K(t)}\Big|Q\right) = nF_i\left(\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\Big|Q\right)$$
$$\overset{(c)}{=} \frac{n}{2}\log\left[(2\pi e)\left\{\left[Q\left(\frac{1}{n}\Sigma_{\boldsymbol{X}^K-\hat{\boldsymbol{X}}^K}\right)^{-1}{}^tQ\right]_{ii}\right\}^{-1}\right].$$
Step (a) follows from the definition of $F_i(\Sigma|Q)$. Step (b) follows from Lemma 8. Step (c) follows from Lemma 9.

Proof of Lemma 8: For given covariance matrices $\Sigma^{(0)}$ and $\Sigma^{(1)}$, let $p^{(0)}_{\tilde{X}^K|X^K}$ and $p^{(1)}_{\tilde{X}^K|X^K}$ be conditional densities achieving $F_i(\Sigma^{(0)}|Q)$ and $F_i(\Sigma^{(1)}|Q)$, respectively. For $0\le\alpha\le1$, define a conditional density parameterized by $\alpha$ by
$$p^{(\alpha)}_{\tilde{X}^K|X^K} = (1-\alpha)p^{(0)}_{\tilde{X}^K|X^K}+\alpha p^{(1)}_{\tilde{X}^K|X^K}.$$
Let $p^{(\alpha)}_{X^K\tilde{X}^K} = (p^{(\alpha)}_{\tilde{X}^K|X^K}, p_{X^K})$ and let $\Sigma^{(\alpha)}_{\tilde{X}}$ be the covariance matrix computed from the density $p^{(\alpha)}_{\tilde{X}^K}$. Since
$$p^{(\alpha)}_{\tilde{X}^K} = (1-\alpha)p^{(0)}_{\tilde{X}^K}+\alpha p^{(1)}_{\tilde{X}^K},$$
we have
$$\Sigma^{(\alpha)}_{\tilde{X}} = (1-\alpha)\Sigma^{(0)}_{\tilde{X}}+\alpha\Sigma^{(1)}_{\tilde{X}} \preceq (1-\alpha)\Sigma^{(0)}+\alpha\Sigma^{(1)}. \qquad (73)$$
Let $q^{(\alpha)}_{Z^K\tilde{Z}^K}$ be the density function of $(Z^K, \tilde{Z}^K)$ induced by the unitary matrix $Q$, that is,
$$q^{(\alpha)}_{Z^K\tilde{Z}^K}(z^K, \tilde{z}^K) = p^{(\alpha)}_{{}^tQZ^K\,{}^tQ\tilde{Z}^K}({}^tQz^K, {}^tQ\tilde{z}^K).$$
By definition it is obvious that
$$q^{(\alpha)}_{\tilde{Z}^K} = (1-\alpha)q^{(0)}_{\tilde{Z}^K}+\alpha q^{(1)}_{\tilde{Z}^K}.$$
Then we have the following chain of inequalities:
$$(1-\alpha)F_i(\Sigma^{(0)}|Q)+\alpha F_i(\Sigma^{(1)}|Q)$$
$$= -(1-\alpha)\int q^{(0)}_{\tilde{Z}^K}(z^K)\log\frac{q^{(0)}_{\tilde{Z}^K}(z^K)}{q^{(0)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}{\rm d}z^K-\alpha\int q^{(1)}_{\tilde{Z}^K}(z^K)\log\frac{q^{(1)}_{\tilde{Z}^K}(z^K)}{q^{(1)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}{\rm d}z^K$$
$$\overset{(a)}{\le} -\int q^{(\alpha)}_{\tilde{Z}^K}(z^K)\log\frac{q^{(\alpha)}_{\tilde{Z}^K}(z^K)}{q^{(\alpha)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}{\rm d}z^K = -\int q^{(\alpha)}_{\tilde{Z}^K}(z^K)\log q^{(\alpha)}_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})\,{\rm d}z^K \overset{(b)}{\le} F_i\left((1-\alpha)\Sigma^{(0)}+\alpha\Sigma^{(1)}\Big|Q\right).$$
Step (a) follows from the log sum inequality. Step (b) follows from the definition of $F_i(\Sigma|Q)$ and (73).

Proof of Lemma 9: Let
$$q^{(G)}_{\tilde{Z}^K}(z^K) \triangleq \frac{1}{(2\pi)^{\frac{K}{2}}|\Sigma_{\tilde{Z}^K}|^{\frac{1}{2}}}\,e^{-\frac{1}{2}{}^t[z^K]\Sigma^{-1}_{\tilde{Z}^K}[z^K]}$$
and let
$$q^{(G)}_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]}) = \frac{q^{(G)}_{\tilde{Z}^K}(z^K)}{q^{(G)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}$$
be the conditional density function induced by $q^{(G)}_{\tilde{Z}^K}(\cdot)$. We first observe that
$$\int q_{\tilde{Z}^K}(z^K)\log\frac{q_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})}{q^{(G)}_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})}{\rm d}z^K \ge 0. \qquad (74)$$
From (74), we have the following chain of inequalities:
$$h(\tilde{Z}_i|\tilde{Z}^K_{[i]}) = -\int q_{\tilde{Z}^K}(z^K)\log q_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})\,{\rm d}z^K \le -\int q_{\tilde{Z}^K}(z^K)\log q^{(G)}_{\tilde{Z}_i|\tilde{Z}^K_{[i]}}(z_i|z^K_{[i]})\,{\rm d}z^K$$
$$= -\int q_{\tilde{Z}^K}(z^K)\log\frac{q^{(G)}_{\tilde{Z}^K}(z^K)}{q^{(G)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}{\rm d}z^K \overset{(a)}{=} -\int q^{(G)}_{\tilde{Z}^K}(z^K)\log\frac{q^{(G)}_{\tilde{Z}^K}(z^K)}{q^{(G)}_{\tilde{Z}^K_{[i]}}(z^K_{[i]})}{\rm d}z^K$$
$$\overset{(b)}{=} \frac{1}{2}\log\left[(2\pi e)\left\{\left[Q\Sigma^{-1}_{\tilde{X}^K}\,{}^tQ\right]_{ii}\right\}^{-1}\right] \overset{(c)}{\le} \frac{1}{2}\log\left[(2\pi e)\left\{\left[Q\Sigma^{-1}\,{}^tQ\right]_{ii}\right\}^{-1}\right].$$
Step (a) follows from the fact that $q_{\tilde{Z}^K}$ and $q^{(G)}_{\tilde{Z}^K}$ yield the same moments of the quadratic form $\log q^{(G)}_{\tilde{Z}^K}$. Step (b) is a well known formula: the conditional variance of the Gaussian coordinate $\tilde{Z}_i$ given $\tilde{Z}^K_{[i]}$ equals $\{[\Sigma^{-1}_{\tilde{Z}^K}]_{ii}\}^{-1}$, and $\Sigma^{-1}_{\tilde{Z}^K} = Q\Sigma^{-1}_{\tilde{X}^K}\,{}^tQ$. Step (c) follows from $\Sigma_{\tilde{X}^K} \preceq \Sigma$. Thus
$$F_i(\Sigma|Q) \le \frac{1}{2}\log\left[(2\pi e)\left\{\left[Q\Sigma^{-1}\,{}^tQ\right]_{ii}\right\}^{-1}\right]$$
is concluded. The reverse inequality holds by letting $p_{\tilde{X}^K|X^K}$ be Gaussian with covariance matrix $\Sigma$.

B. Proof of Lemma 7

In this appendix we prove Lemma 7. We write the unitary matrix $Q$ as $Q = [q_{ij}]$, where $q_{ij}$ stands for the $(i,j)$ entry of $Q$. The unitary matrix $Q$ transforms $X^K$ into $Z^K = QX^K$. Set $\tilde{Q} = Q\,{}^tA$ and let $\tilde{q}_{ij}$ be the $(i,j)$ entry of $\tilde{Q}$. The following lemma states an important property of the distribution of the Gaussian random vector $Z^K$. This lemma is the basis of the proof of Lemma 7.

Lemma 10: For any $i = 1, 2, \cdots, K$, we have
$$Z_i = -\frac{1}{g_{ii}}\sum_{j\ne i}\nu_{ij}Z_j+\frac{1}{g_{ii}}\sum_{j=1}^{L}\frac{\tilde{q}_{ij}}{\sigma^2_{N_j}}Y_j+\hat{N}_i, \qquad g_{ii} = \left[Q\Sigma^{-1}_{X^K}\,{}^tQ\right]_{ii}+\sum_{j=1}^{L}\frac{\tilde{q}^2_{ij}}{\sigma^2_{N_j}},$$
where $\nu_{ij}$, $j\in\{1, 2, \cdots, K\}-\{i\}$, are suitable constants and $\hat{N}_i$ is a zero mean Gaussian random variable with variance $\frac{1}{g_{ii}}$. For each $i$, $\hat{N}_i$ is independent of $Z_j$, $j\in\{1, 2, \cdots, K\}-\{i\}$, and of $Y_j$, $j\in\Lambda$.

Proof: Without loss of generality we may assume $i = 1$. Since $Y^L = AX^K+N^L$, we have
$$\Sigma_{X^KY^L} = \begin{pmatrix}\Sigma_{X^K} & \Sigma_{X^K}\,{}^tA\\ A\Sigma_{X^K} & A\Sigma_{X^K}\,{}^tA+\Sigma_{N^L}\end{pmatrix}.$$
The density function $p_{Z^KY^L}(z^K, y^L)$ of $(Z^K, Y^L)$ is given by
$$p_{Z^KY^L}(z^K, y^L) = \frac{1}{(2\pi)^{\frac{K+L}{2}}|\Sigma_{Z^KY^L}|^{\frac{1}{2}}}\,e^{-\frac{1}{2}{}^t\left[\begin{smallmatrix}z^K\\ y^L\end{smallmatrix}\right]\Sigma^{-1}_{Z^KY^L}\left[\begin{smallmatrix}z^K\\ y^L\end{smallmatrix}\right]},$$
where $\Sigma^{-1}_{Z^KY^L}$ has the following form:
$$\Sigma^{-1}_{Z^KY^L} = \begin{pmatrix}Q\left(\Sigma^{-1}_{X^K}+{}^tA\Sigma^{-1}_{N^L}A\right){}^tQ & -Q\,{}^tA\Sigma^{-1}_{N^L}\\ -\Sigma^{-1}_{N^L}A\,{}^tQ & \Sigma^{-1}_{N^L}\end{pmatrix}.$$
Since Z K = QX K , we have QΣX K t Q QΣX K ΣZ K Y L = . ΣX K t Q AΣX K t A + ΣN L
[i]
K
K )dz K qZ˜ K (z K ) log qZ˜ K (z[i] [i] ) ( |ΣZ˜ K | 1 log (2πe) 2 |ΣZ˜ K | [i] i−1 h 1 −1 log (2πe) ΣZ˜K 2 ii h i−1 1 −1 t log (2πe) QΣX Q ˜K 2 ii o n 1 −1 . log (2πe) QΣ−1t Q ii 2
+ =
Z
qZ˜ K (z K ) log
where
(76)
△ + t AΣ−1 A)t Q ij νij = Q(Σ−1 XK NL
L X q˜ik q˜jk t + = QΣ−1 Q 2 XK ij σN k k=1 t q˜ij △ −1 βij = − Q AΣN L ij = − 2 . σNj
,
(78)
Now, we consider the following partition of Σ−1 ZK Y L : Q(Σ−1 + t AΣ−1 A)t Q −Qt AΣ−1 XK NL NL Σ−1 = ZK Y L −Σ−1 At Q Σ−1 NL NL t g g = 11 12 , g12 G22
where g11 , g12 , and G22 are scalar, K + L − 1 dimensional vector, and (K + L − 1) ×(K + L − 1) matrix, respectively. It is obvious from the above partition of Σ−1 that we have ZK Y L L 2 X q˜1k −1 t g11 = ν11 = QΣX K Q 11 + 2 , σN (79) k k=1 t g12 = [ν12 · · · ν1K β11 β12 · · · β1L ] . It is well known that Σ−1 has the following expression: ZK Y L g tg Σ−1 = 11 12 ZK Y L g12 G22 t t g11 1 012 012 = 1 012 G22 − g111 t g12 g12 g11 g12 IL−1 1 g111 t g12 . × 012 IL−1
21
Furthermore, for k ∈ Λ, define
Set i h △ K L y n ˆ 1 = z1 |z[1]
Then, we have t
1 1 h K Li = z1 + z y g12 . (80) 1 g11 [1] g11 g12
[z K y L ]ΣZ K Y L
K L = t [z1 |z[1] y ]
K L y ] = [ˆ n1 |z[1]
zK yL
= z1 +
n ˆ1 zK . [1] yL
L h(Ψ1 |Z K [i] , W )
(81)
j=2
1 g11
L X j=1
K L L = I(Ψ1 ; Z K |Z K [i] , W ) + h(Ψ1 |X , W ) K L L = I(Ψ1 ; Z i |Z K [i] , W ) + h(Ψ1 |X , W )
K L L = h(Z i |Z K [i] , W ) − h(Z i |Ψ1 , Z [i] , W )
+h(Ψ1 |X K , W L )
q˜1j 2 yj . σN j
(b)
(82)
ˆ1 It can be seen from (81) and (82) that the random variable N defined by L L 1 X q˜1j 1 X △ ˆ1 = ν1j Zj − Z1 + N 2 Yj g11 j=2 g11 j=1 σN j
is a zero mean Gaussian random variable with variance g111 K and is independent of Z[1] and Y L . This completes the proof of Lemma 10. The followings are two variants of the entropy power inequality. Lemma 11: Let U i , i = 1, 2, 3 be n dimensional random vectors with densities and let T be a random variable taking values in a finite set. We assume that U 3 is independent of U 1 , U 2 , and T . Then, we have 2 1 n h(U 2 +U 3 |U 1 T ) 2πe e
≥
2 1 n h(U 2 |U 1 T ) 2πe e
+
2 1 n h(U 3 ) . 2πe e
Lemma 12: Let U i , i = 1, 2, 3 be n random vectors with densities. Let T1 , T2 be random variables taking values in finite sets. We assume that those five random variables form a Markov chain (T1 , U 1 ) → U 3 → (T2 , U 2 ) in this order. Then, we have ≥
2 1 n h(U 1 +U 2 |U 3 T1 T2 ) 2πe e 2 2 1 1 n h(U 1 |U 3 T1 ) + n h(U 2 |U 3 T2 ) 2πe e 2πe e
.
L 1 X q˜ij 1 X ˆ νij Z j + 2 Y j + Ni , gii gii j=1 σN j
(83)
j6=i
ˆ i is a vector of n independent copies of zero mean where N Gaussian random variables with variance g1ii . For each i ∈ ˆ i is independent of Z j , j ∈ {1, 2, · · · , K} − {i} and Λ, N Y j , j ∈ Λ. Set △
h(n) =
1 L h(Z i |Z K [i] , W ) . n
K L = nh(n) − h(Z i |Ψ1 , Z K [i] ) + h(Ψ1 |X , W ) n = nh(n) − log 2πe(gii )−1 + h(Ψ1 |X K , W L ) . (85) 2 Step (a) follows from that Z K can be obtained from X K by the invertible matrix Q. Step (b) follows from the Markov chain L Z i → (Ψ1 , Z K → W L. [i] ) → Y
From (85), we have (n)
L 1 n2 h(Ψ1 |Z K e2h 1 2 h(Ψ1 |X K ,W L ) [i] ,W ) = . (86) e gii · en 2πe 2πe 2πe Substituting (86) into (84), we obtain (n)
(n)
e2h 1 1 1 2 h(Ψ1 |X K ,W L ) e2h + · . ≥ en 2πe 2πe gii 2πe gii
(87)
2h(n)
Solving (87) with respect to e 2πe , we obtain −1 (n) e2h 1 2 h(Ψ1 |X K ,W L ) n . ≥ gii − e 2πe 2πe K
2
(88) L
Next, we evaluate a lower bound of e n h(Ψ1 |X ,W ) . Note that for j = 1, 2, · · · , s − 1 we have the following Markov chain: q˜ij K WSj+1 , Ψj+1 (Y Sj+1 ) → X → Wj , σ2 Y j . (89) Nj 2
Proof of Lemma 7: By Lemma 10, we have Zi = −
K K L L = I(Ψ1 ; X K |Z K [i] , W ) + h(Ψ1 |X , Z [i] , W ) (a)
L L 1 X 1 X ν1j zj + β1j yj g11 j=2 g11 j=1
ν1j zj −
(84)
L On the quantity h(Ψ1 |Z K [i] , W ) in the right member of (84), we have the following chain of equalities:
012 g11 012 G22 − g111 g12 t g12
1 g11
j=k
(n)
t
L X
L X q˜ij 2 Yj. σN j
L e2h 1 n2 h(Ψ1 |Z K 1 1 [i] ,W ) + . ≥ e 2πe (gii )2 2πe gii
From (78)-(80), we have n ˆ 1 = z1 +
△
Applying Lemma 11 to (83), we have
z1 g11 t g12 K z[1] g12 G22 yL
△
Sk = {k, k + 1, · · · , L} , Ψk = Ψk (Y Sk ) =
K
L
1 Based on (89), we apply Lemma 12 to 2πe e n h(Ψj |X ,W ) for j = 1, 2, · · · , s − 1. Then, for j = 1, 2, · · · , s − 1, we have the following chains of inequalities : 1 2 h(Ψj |X K ,W L ) en 2πe K q ˜ij 2 2 Y j X ,WSj+1 ,Wj 1 n h Ψj+1 + σN 1 = e 2πe K q ˜ij 2 Y h X ,W j j σ2 1 n 1 n2 h( Ψj+1 |X K ,WSj+1 ) Nj e e + ≥ 2πe 2πe (n) −2rj K 2 1 n h( Ψj+1 |X ,WSj+1 ) 2 e . (90) e = + q˜ij 2 2πe σNj
22
Using (90) iteratively for j = 1, 2, · · · , s − 1, we have s
(n) −2rj
1 2 h(Ψ1 |X K ,W L ) X 2 e q˜ij 2 ≥ en 2πe σNj j=1
.
(91)
Combining (77), (88), and (91), we have −1 (n) (n) s −2rj X 1 − e e2h 2 t q˜ij ≥ QΣ−1 2 X K Q ii + 2πe σN j j=1 −1 −1 t t = Q Σ−1 + AΣ A Q , (92) (n) XK NΛ (rΛ )
ii
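The Gaussian matrix identities invoked repeatedly above — the block form (75) of the inverse covariance matrix, the independence and variance claims of Lemma 10, and the conditional-variance formula underlying Lemma 9 — can be checked numerically. The following sketch is not part of the original proof; the dimensions, random seed, and variable names are illustrative only. It verifies the identities with numpy on a randomly generated instance:

```python
import numpy as np

rng = np.random.default_rng(7)
K, L = 3, 4
A = rng.standard_normal((L, K))                   # L x K observation matrix
B = rng.standard_normal((K, K))
Sigma_X = B @ B.T + K * np.eye(K)                 # covariance of X^K (positive definite)
Sigma_N = np.diag(rng.uniform(0.5, 2.0, size=L))  # independent noise, variances sigma_{N_j}^2
Q, _ = np.linalg.qr(rng.standard_normal((K, K)))  # orthogonal ("unitary") K x K matrix

# Joint covariance of (Z^K, Y^L) with Z^K = Q X^K and Y^L = A X^K + N^L
Sigma_ZY = np.block([
    [Q @ Sigma_X @ Q.T, Q @ Sigma_X @ A.T],
    [A @ Sigma_X @ Q.T, A @ Sigma_X @ A.T + Sigma_N],
])
M = np.linalg.inv(Sigma_ZY)                       # precision matrix

# Block form (75) of the precision matrix
iX, iN = np.linalg.inv(Sigma_X), np.linalg.inv(Sigma_N)
M_expected = np.block([
    [Q @ (iX + A.T @ iN @ A) @ Q.T, -Q @ A.T @ iN],
    [-iN @ A @ Q.T, iN],
])
assert np.allclose(M, M_expected)

# Lemma 10: hat{N}_i = (1/g_ii) * (i-th row of M) . (Z^K, Y^L) is uncorrelated
# with (hence, by joint Gaussianity, independent of) Z_j, j != i, and Y^L,
# and has variance 1/g_ii.
i = 1
g_ii = M[i, i]
c = M[i, :] / g_ii                                # coefficients of hat{N}_i
cov = c @ Sigma_ZY                                # Cov(hat{N}_i, (Z^K, Y^L))
assert np.allclose(np.delete(cov, i), 0.0, atol=1e-10)
assert np.isclose(c @ Sigma_ZY @ c, 1.0 / g_ii)

# Lemma 9 (Gaussian case): Var(Z_i | Z_{[i]}) = 1 / (Q Sigma_X^{-1} tQ)_{ii},
# computed independently via the Schur complement of Sigma_Z
Sigma_Z = Q @ Sigma_X @ Q.T
rest = [j for j in range(K) if j != i]
schur = Sigma_Z[i, i] - Sigma_Z[i, rest] @ np.linalg.inv(
    Sigma_Z[np.ix_(rest, rest)]) @ Sigma_Z[rest, i]
assert np.isclose(schur, 1.0 / (Q @ iX @ Q.T)[i, i])
```

The vanishing covariance of $\hat{N}_i$ with every other component of $(Z^K,Y^L)$ is exactly the independence statement of Lemma 10, since all variables involved are jointly Gaussian.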
REFERENCES

[1] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inform. Theory, vol. IT-19, pp. 471-480, July 1973.
[2] A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inform. Theory, vol. IT-22, pp. 1-10, Jan. 1976.
[3] A. D. Wyner, "The rate-distortion function for source coding with side information at the decoder-II: General sources," Inform. Contr., vol. 38, pp. 60-80, July 1978.
[4] T. Berger, "Multiterminal source coding," in The Information Theory Approach to Communications (CISM Courses and Lectures, no. 229), G. Longo, Ed. Vienna and New York: Springer-Verlag, 1978, pp. 171-231.
[5] S. Y. Tung, "Multiterminal source coding," Ph.D. dissertation, School of Electrical Engineering, Cornell University, Ithaca, NY, May 1978.
[6] T. Berger, K. B. Houswright, J. K. Omura, S. Tung, and J. Wolfowitz, "An upper bound on the rate distortion function for source coding with partial side information at the decoder," IEEE Trans. Inform. Theory, vol. IT-25, pp. 664-666, Nov. 1979.
[7] A. H. Kaspi and T. Berger, "Rate-distortion for correlated sources with partially separated encoders," IEEE Trans. Inform. Theory, vol. IT-28, pp. 828-840, Nov. 1982.
[8] T. Berger and R. W. Yeung, "Multiterminal source encoding with one distortion criterion," IEEE Trans. Inform. Theory, vol. IT-35, pp. 228-236, Mar. 1989.
[9] Y. Oohama, "Gaussian multiterminal source coding," IEEE Trans. Inform. Theory, vol. 43, pp. 1912-1923, Nov. 1997.
[10] A. B. Wagner, S. Tavildar, and P. Viswanath, "Rate region of the quadratic Gaussian two-encoder source-coding problem," IEEE Trans. Inform. Theory, vol. 54, pp. 1938-1961, May 2008.
[11] H. Yamamoto and K. Itoh, "Source coding theory for multiterminal communication systems with a remote source," Trans. of the IECE of Japan, vol. E63, no. 10, pp. 700-706, Oct. 1980.
[12] T. J. Flynn and R. M. Gray, "Encoding of correlated observations," IEEE Trans. Inform. Theory, vol. IT-33, pp. 773-787, Nov. 1987.
[13] H. Viswanathan and T. Berger, "The quadratic Gaussian CEO problem," IEEE Trans. Inform. Theory, vol. 43, pp. 1549-1559, Sept. 1997.
[14] Y. Oohama, "The rate-distortion function for the quadratic Gaussian CEO problem," IEEE Trans. Inform. Theory, vol. 44, pp. 1057-1070, May 1998.
[15] Y. Oohama, "Rate-distortion theory for Gaussian multiterminal source coding systems with several side informations at the decoder," IEEE Trans. Inform. Theory, vol. 51, pp. 2577-2593, July 2005.
[16] A. Pandya, A. Kansal, G. Pottie, and M. Srivastava, "Fidelity and resource sensitive data gathering," Proceedings of the 42nd Allerton Conference, Allerton, IL, June 2004.
[17] Y. Oohama, "Rate distortion region for separate coding of correlated Gaussian remote observations," Proceedings of the 43rd Allerton Conference, Allerton, IL, pp. 2237-2246, Sept. 2005.
[18] Y. Oohama, "Separate source coding of correlated Gaussian remote sources," Proceedings of Information Theory & Applications Inaugural Workshop, UCSD, CA, Feb. 6-10, 2006.
[19] Y. Oohama, "Rate distortion region for distributed source coding of correlated Gaussian remote sources," Proceedings of the IEEE International Symposium on Information Theory, Toronto, Canada, July 6-11, pp. 41-45, 2008.
[20] Y. Oohama, "Distributed source coding of correlated Gaussian remote sources," preprint; available at http://arxiv.org/PS_cache/arxiv/pdf/0904/0904.0751v3.pdf.
[21] Y. Oohama, "Distributed source coding of correlated Gaussian observations," Proceedings of the 2008 International Symposium on Information Theory and its Applications, Auckland, New Zealand, December 7-10, pp. 1441-1446, 2008.
[22] J. Wang, J. Chen, and X. Wu, "On the minimum sum rate of Gaussian multiterminal source coding: New proofs," Proceedings of the IEEE International Symposium on Information Theory, Seoul, Korea, June 28-July 3, pp. 1463-1467, 2009.